Comparison of AI Architectures and Hardware: Literature Review and Practical Model Deployment

  • Nan Wu

Student thesis: Master's Thesis

Abstract

This thesis examines the main types of artificial intelligence (AI) networks, including deep neural networks, recurrent neural networks, and transformers, and analyzes their applicability in various scenarios. It then provides a detailed discussion of the current state of AI hardware development, from chip design to popular inference chips on the market, and, building on this discussion, outlines the prospects of such hardware for model training and inference. Current discussions often focus on how to design a model and improve its accuracy, while deployment issues from an engineering perspective are frequently overlooked. With AI development now centered on big data and large models, deploying these models onto edge devices without sacrificing accuracy remains a challenge. To gain a deeper understanding of model deployment, ShuffleNetV2 is selected as the experimental model. It is trained on multiple datasets, and its performance across these datasets is compared. The model is then deployed on Jetson and ARM platforms using INT8 quantization, and its performance is compared with the single-precision floating-point (FP32) baseline. Finally, the thesis provides a detailed comparison of the operational efficiency of the deployed models on the different platforms and examines the experimental results in detail.
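The INT8 quantization mentioned in the abstract generally works by mapping a float tensor's observed range onto the 256 levels of a signed 8-bit integer via a scale and zero-point, then recovering approximate floats at inference time. A minimal sketch of that affine mapping, with illustrative weight values and helper names that are assumptions rather than details from the thesis:

```python
# Sketch of post-training INT8 affine quantization (asymmetric range
# mapping). Values and function names are illustrative only; the thesis
# itself uses toolchain-provided INT8 quantization on Jetson/ARM.

def int8_params(values):
    """Derive a scale and zero-point mapping the observed float range
    (extended to include 0.0) onto the signed INT8 range [-128, 127]."""
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # guard against an all-zero tensor
    zero_point = round(-128 - lo / scale)
    return scale, zero_point

def quantize(x, scale, zero_point):
    """Map a float to INT8, clamping to the representable range."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))

def dequantize(q, scale, zero_point):
    """Recover an approximate float from its INT8 code."""
    return (q - zero_point) * scale

# Illustrative weights; reconstruction error stays within about one
# quantization step of the original values.
weights = [-0.42, 0.07, 0.31, -0.15, 0.5]
scale, zp = int8_params(weights)
recovered = [dequantize(quantize(w, scale, zp), scale, zp) for w in weights]
```

Per-tensor mappings like this trade a small, bounded reconstruction error for a 4x reduction in weight storage versus FP32 and access to integer arithmetic units, which is what makes the FP32-vs-INT8 efficiency comparison on edge platforms meaningful.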
Date of Award: 2024
Original language: English (American)
Supervisor: Thomas Müller-Wipperfürth
