Abstract
Machine vision drives efficiency and automation in many industries. Since 2012, convolutional neural networks have taken on the leading role in this field of research. Key drivers of the rapid advances of neural networks in computer vision are
Moore’s law and the fast-paced development of algorithms for artificial intelligence (AI).
Quality control tasks at assembly lines or applications for security and surveillance
are often built on resource-limited embedded devices. Requirements like data privacy
or real-time demands necessitate local data processing. However, neural networks place
high demands on a system’s hardware. Thus, algorithm-specific and hardware-specific
optimizations are mandatory to enable AI on resource-limited devices. Use cases in which the AI must be retrained, whether for fine-tuning or for adaptation to environmental changes, place especially high demands on the hardware. Therefore, optimizations for
the back-propagation algorithm, as well as optimal hardware occupancy, are essential
to enable the training of neural networks on embedded devices.
This thesis compares three state-of-the-art machine learning (ML) frameworks, namely PyTorch, ONNX Runtime, and TensorRT, with regard to their computational performance and memory footprint. All three were used to retrain the
VGG16 vision model on the CIFAR-10 dataset. Since TensorRT is a highly optimized
inference framework, its Network Definition API was used to implement a training
model. Measurements were carried out on the CPU and GPU of an NVIDIA Jetson Orin, applying performance optimization techniques such as layer freezing and reduced floating-point precision.
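To make this approach concrete, the following is a minimal sketch of how a layer can be expressed with TensorRT's Python Network Definition API so that its weights are regular network inputs rather than baked-in constants, which is the precondition for updating them during training. It is not the thesis's implementation; the tensor names and shapes are illustrative.

```python
import tensorrt as trt

# Minimal sketch: one fully connected layer built with TensorRT's Network
# Definition API. The weight matrix "w" is declared as a network *input*
# so it can change between executions, as training requires, instead of
# being embedded as a constant. Names and shapes are illustrative only.
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))

x = network.add_input("x", trt.float32, (1, 512))   # activations
w = network.add_input("w", trt.float32, (512, 10))  # trainable weights

# y = x @ w -- since w is a runtime input, TensorRT cannot fold it into
# pre-packed constant weights, which limits its usual optimizations.
mm = network.add_matrix_multiply(
    x, trt.MatrixOperation.NONE, w, trt.MatrixOperation.NONE)
network.mark_output(mm.get_output(0))

config = builder.create_builder_config()
engine = builder.build_serialized_network(network, config)
```

Because the builder sees the weights only as opaque input tensors, constant folding and weight pre-packing are unavailable, which anticipates the optimization limits reported below.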
The experiments conducted show that PyTorch is highly efficient at training convolutional neural networks on NVIDIA GPUs. Furthermore, they show that TensorRT cannot effectively optimize networks in which the model parameters are inputs to the network.
Optimizations like layer freezing and reduced floating-point precision can speed up the
training process by nearly a factor of three.
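For concreteness, here is a minimal PyTorch sketch of the two optimizations named above, layer freezing and reduced floating-point precision, as they would typically be applied when retraining VGG16 on CIFAR-10. It is illustrative, not the measurement code used in the thesis.

```python
import torch
import torchvision

# Illustrative sketch: retrain VGG16 on CIFAR-10 with a frozen feature
# extractor and float16 (mixed-precision) training on a CUDA device.
model = torchvision.models.vgg16(weights="IMAGENET1K_V1")
model.classifier[6] = torch.nn.Linear(4096, 10)  # CIFAR-10 has 10 classes
model = model.cuda()

# Layer freezing: the convolutional layers take no part in
# back-propagation, so only the classifier head is updated.
for p in model.features.parameters():
    p.requires_grad = False

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
scaler = torch.cuda.amp.GradScaler()  # loss scaling for float16 stability

def train_step(images, labels):
    optimizer.zero_grad()
    # Reduced floating-point precision: the forward pass runs in float16.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = torch.nn.functional.cross_entropy(model(images), labels)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```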
| Date of Award | 2024 |
| --- | --- |
| Original language | English (American) |
| Supervisor | Josef Langer (Supervisor) & Philipp Knaack (Supervisor) |