Comparison of AI Architectures and Hardware: Literature Review and Practical Model Deployment

  • Nan Wu

    Student thesis: Master's Thesis

    Abstract

    This thesis examines the main types of artificial intelligence (AI) networks,
    including deep neural networks, recurrent neural networks, and transformers, and
    analyzes their applicability in various scenarios. It then provides a detailed
    discussion of the current state of AI hardware development, from chip design to the
    popular inference chips on the market, and outlines the prospects of this hardware
    for model training and inference. Current discussions often focus on how to design a
    model and improve its accuracy, while deployment issues from an engineering
    perspective are frequently overlooked. The prevailing trend in AI development centers
    on big data and large models, so deploying these models on edge devices without
    sacrificing accuracy remains a challenge. To gain a deeper understanding of model
    deployment, ShuffleNetV2 is selected as the experimental model.
    It is trained on multiple datasets, and the model performance across these datasets
    is compared. After that, the model is deployed on Jetson and ARM platforms using
    INT8 quantization, and its performance is compared with that of the
    single-precision floating-point (FP32) baseline. Finally, this thesis compares the
    operational efficiency of the deployed models across the different platforms and
    examines the experimental results in detail.
    Date of Award: 2024
    Original language: English (American)
    Supervisor: Thomas Müller-Wipperfürth (Supervisor)
