The automotive supplier industry currently faces profound structural challenges. Declining profit margins, rising overhead costs, competition from China, and political uncertainty, combined with stagnating sales volumes, are significantly reducing profitability. The reference architecture addressed in this thesis is intended to help increase the profitability of products of ZKW Group GmbH (hereafter ZKW). The reference models aim to increase the standardization of individualized products. Evaluating and analyzing products at a higher level of aggregation (architecture blocks instead of individual components) makes different products easier to compare. Currently, components are assigned to architecture blocks manually; automating this assignment would save significant time, and that is the objective of this thesis.

As a basis for this automation, the thesis explores and compares various text-based classification methods to determine the most suitable one for ZKW. The central focus is a comparison of the performance of classical machine learning algorithms and transformer models in classifying components based on their names. A literature review examines text-based classification models suited to categorizing component names, the necessary text preprocessing steps, and suitable metrics for evaluating classification performance. The models identified in the literature review are then compared in an experimental case study based on real bill-of-materials data from ZKW. Specifically, the study evaluates several combinations, each pairing one of three classical models (Naive Bayes, Support Vector Machine, Logistic Regression) with one of two feature extraction methods (Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF)), as well as four variants of the BERT model (BERT-base, BERT-large, mBERT, and DistilBERT).
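To make the two feature extraction methods concrete, the following is a minimal sketch of Bag-of-Words and TF-IDF applied to component names. The example names, the tokenizer, and the smoothed IDF formula are assumptions chosen for illustration; the abstract does not state which preprocessing or weighting variant the thesis uses.

```python
import math
import re
from collections import Counter

# Hypothetical component names; real BOM entries are not given in the abstract.
NAMES = [
    "LED module low beam",
    "LED module high beam",
    "housing cover screw M4",
]

def tokenize(name):
    """Lowercase the name and split on any non-alphanumeric character."""
    return [tok for tok in re.split(r"[\W_]+", name.lower()) if tok]

def bag_of_words(names):
    """Bag-of-Words: raw token counts per component name."""
    return [Counter(tokenize(name)) for name in names]

def tf_idf(names):
    """TF-IDF: token counts reweighted by a smoothed inverse document frequency."""
    counts = bag_of_words(names)
    n_docs = len(counts)
    # Number of names in which each token occurs at least once.
    doc_freq = Counter(tok for c in counts for tok in c)
    # Assumed smoothed variant: idf = ln((1 + N) / (1 + df)) + 1.
    idf = {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
           for tok, df in doc_freq.items()}
    return [{tok: tf * idf[tok] for tok, tf in c.items()} for c in counts]

bow = bag_of_words(NAMES)
weighted = tf_idf(NAMES)
```

In this toy data, tokens shared by several names (such as `led`) receive a lower TF-IDF weight than tokens unique to one name (such as `low`), whereas Bag-of-Words counts both equally. Either representation can then be passed to a classifier such as Logistic Regression.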
The BERT-base model and the combination of Bag-of-Words + Logistic Regression perform best, with accuracies of 95.29 % and F1-scores of 0.950 and 0.952, respectively. A significance test (bootstrap test) shows no statistically significant difference between the transformer model BERT-base and the classical Bag-of-Words + Logistic Regression model. Based on these results, the thesis recommends implementing the Bag-of-Words + Logistic Regression model. In the long term, BERT may offer greater robustness and should therefore remain under consideration.
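The abstract does not specify how the bootstrap test was carried out. As an illustration only, one common paired variant resamples the per-item correctness differences between two classifiers and checks how often a difference at least as large as the observed one arises under the null hypothesis. The function name, the number of resamples, and the toy labels below are assumptions, not details from the thesis.

```python
import random

def paired_bootstrap(y_true, pred_a, pred_b, n_resamples=2000, seed=0):
    """Return the observed accuracy difference (A minus B) and a two-sided
    bootstrap p-value for the null hypothesis of no difference."""
    rng = random.Random(seed)
    n = len(y_true)
    # Per-item difference in correctness: +1, 0 or -1.
    diffs = [(a == y) - (b == y) for y, a, b in zip(y_true, pred_a, pred_b)]
    observed = sum(diffs) / n
    # Centre the differences so the resampled distribution reflects the null.
    centred = [d - observed for d in diffs]
    extreme = 0
    for _ in range(n_resamples):
        sample_mean = sum(rng.choice(centred) for _ in range(n)) / n
        if abs(sample_mean) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (extreme + 1) / (n_resamples + 1)

# Hypothetical labels for two models evaluated on the same test items.
y_true = ["lens", "housing", "led", "screw", "lens", "led"]
model_a = ["lens", "housing", "led", "screw", "lens", "housing"]
model_b = ["lens", "housing", "led", "lens", "lens", "led"]
diff, p_value = paired_bootstrap(y_true, model_a, model_b)
```

A large p-value means the data give no evidence that one model is more accurate than the other, which is the kind of outcome the abstract reports for BERT-base and Bag-of-Words + Logistic Regression.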
Date of Award | 2025
---|---
Original language | German (Austria)
Supervisor | Christina Feilmayr (Supervisor)
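Einsatz moderner Machine-Learning-Methoden zur Optimierung der Textklassifikation in Produkt-Stücklisten (English: Use of Modern Machine Learning Methods to Optimize Text Classification in Product Bills of Materials)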
Guttmann, B. (Author). 2025
Student thesis: Bachelor's Thesis