Blood group typing is an essential component of transfusion medicine: it forms the basis for the safe supply of blood products and ensures that they are correctly allocated to patients. The process is currently based on a semi-automated set of rules that reliably covers standard cases. In rare cases, however, further tests must be ordered through manual decisions. The aim of this thesis was to investigate to what extent statistical methods and a multi-label machine learning approach can support decisions about follow-up examinations based on previous blood group typing results. The data basis consisted of examination records from the period 2017–2025, with a total of over 19 million entries. The data were processed, pseudonymized, and prepared for training machine learning models. Decision trees, random forests, Light Gradient Boosting Machine (LightGBM), and logistic regression were used for modelling and integrated into classification models using multi-label wrappers. In addition to the machine learning approach, a descriptive analysis was performed to evaluate data quality and frequency distributions, and a process model was used to examine whether an extended rule set could be derived. The results show that the random forest provides the best and most robust predictions and thus has the greatest potential for supporting blood group typing (Micro-F1 = 0.80, Macro-F1 = 0.53, Weighted-F1 = 0.76, Accuracy = 0.66). LightGBM also achieved good results, with an accuracy of 0.58 and a Micro-F1 of 0.76, while decision trees scored slightly lower, with an accuracy of 0.56 and a Micro-F1 of 0.75. Logistic regression performed worst overall, although in a second variant with differently processed data it achieved results comparable to the tree-based methods (accuracy: logistic regression = 90%, random forest = 95%, LightGBM = 89%, decision tree = 84%). However, all models exhibited considerable weaknesses in predicting rare examinations, which limits their practical applicability in such cases. The comparison of statistical and machine learning methods showed that machine learning provides a clear added value, especially in capturing complex patterns. This thesis therefore makes a significant contribution to the further development of decision support for the selection of serological tests and could help make the workflows at the Blood Center Linz more efficient and reliable in the future.
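The gap between Micro-F1 and Macro-F1 reported above is what one would expect when frequent examinations are predicted well and rare ones are missed. The following is a minimal sketch of how the four reported metrics are commonly computed for multi-label predictions. The toy label matrix (two frequent examinations and one rare one) and all function names are hypothetical illustrations, not the thesis data or code.

```python
# Illustrative only: toy data and helper names are assumptions, not from the thesis.

def per_label_counts(y_true, y_pred):
    """Return a (tp, fp, fn) tuple for each label column."""
    n_labels = len(y_true[0])
    counts = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))
    return counts


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_scores(y_true, y_pred):
    counts = per_label_counts(y_true, y_pred)
    per_label = [f1(*c) for c in counts]
    support = [tp + fn for tp, _, fn in counts]  # true positives per label

    # Micro: pool the counts of all labels, then compute one F1.
    micro = f1(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))
    # Macro: unweighted mean of per-label F1, so rare labels count fully.
    macro = sum(per_label) / len(per_label)
    # Weighted: mean of per-label F1, weighted by label support.
    total = sum(support)
    weighted = sum(f * s for f, s in zip(per_label, support)) / total if total else 0.0
    # Subset accuracy: a sample counts only if every label is correct.
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

    return {"micro_f1": micro, "macro_f1": macro,
            "weighted_f1": weighted, "accuracy": accuracy}


# Columns: two frequent follow-up examinations and one rare one (hypothetical).
y_true = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 0, 1], [1, 1, 1]]
# A model that predicts the frequent labels perfectly but never the rare one.
y_pred = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 0, 0], [1, 1, 0]]

scores = multilabel_scores(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

In this toy case the never-predicted rare label pulls Macro-F1 well below Micro-F1, while Weighted-F1 sits in between, which mirrors the ordering of the scores reported for the random forest.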
| Date of Award | 2025 |
|---|---|
| Original language | German (Austria) |
| Supervisor | Julia Vetter (Supervisor) |
- Data Science and Engineering
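Development and Evaluation of a Multi-Label Machine Learning Approach for Optimizing the Analysis Process in Blood Group Determination (original title: Entwicklung und Evaluierung eines Multi-Label Machine Learning-Ansatzes zur Optimierung des Analyseprozesses in der Blutgruppenbestimmung)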
Huber, S. V. (Author). 2025
Student thesis: Master's Thesis