Abstract
In this paper, we present a heterogeneous ensemble modeling approach to learn predictors for yeast contamination in freshly
harvested peppermint batches. Our research is based on data about numerous parameters of the harvesting process, such as planting,
tillage, fertilization, harvesting, and drying, as well as information about microbial contamination. We use several different machine
learning methods, namely random forests, gradient boosting trees, symbolic regression by genetic programming, and support
vector machines, to learn models that predict contamination on the basis of the available harvesting parameters. Using those models,
we form model ensembles in order to improve accuracy and to reduce the false negative rate, i.e., to overlook as few
contaminations as possible. As we summarize in this paper, ensemble modeling indeed helps to increase the prediction accuracy
for our application, especially when using only the best models. The final prediction accuracy, as well as other statistical indicators
such as the false negative rate and the false positive rate, depends on the choice of the discrimination threshold; in the optimal case, model
ensembles predict yeast contamination with 65.91% accuracy, and only 19.15% of the samples are false negatives, i.e.,
overlooked contaminations.
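The dependence of accuracy, false negative rate, and false positive rate on the discrimination threshold can be illustrated with a minimal sketch. It is not the authors' implementation: the averaging rule, the function names `ensemble_score` and `evaluate`, and the conventional rate definitions FN/(FN+TP) and FP/(FP+TN) are assumptions made here for illustration only.

```python
from typing import Dict, List, Sequence


def ensemble_score(model_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-sample scores over all models (one inner sequence per model).

    Assumption: the ensemble combines members by simple averaging.
    """
    n_models = len(model_scores)
    return [sum(per_sample) / n_models for per_sample in zip(*model_scores)]


def evaluate(scores: Sequence[float], labels: Sequence[int],
             threshold: float) -> Dict[str, float]:
    """Classify as contaminated (1) if score >= threshold, then compute metrics."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1  # an overlooked contamination
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "accuracy": (tp + tn) / len(labels),
        # Assumed conventional definitions of the two error rates.
        "fnr": fn / (fn + tp) if (fn + tp) else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


if __name__ == "__main__":
    # Made-up scores from three hypothetical models for five samples.
    scores = ensemble_score([
        [0.9, 0.5, 0.2, 0.7, 0.3],
        [0.8, 0.7, 0.1, 0.3, 0.5],
        [0.7, 0.6, 0.3, 0.2, 0.4],
    ])
    labels = [1, 1, 0, 0, 1]  # 1 = contaminated
    for threshold in (0.5, 0.3):
        print(threshold, evaluate(scores, labels, threshold))
```

In this toy data, lowering the threshold removes the overlooked contamination at the cost of one false alarm, which is the kind of trade-off the choice of threshold controls.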
Translated title of the contribution | Verwendung heterogener Modell-Ensembles zur Verbesserung der Vorhersage von Hefekontaminationen in Pfefferminz |
---|---|
Original language | English |
Pages | 1194-1200 |
Number of pages | 7 |
DOIs | |
Publication status | Published - 2022 |
Event | International Conference on Industry 4.0 and Smart Manufacturing - Hagenberg, Austria; Duration: 17 Nov 2021 → 19 Nov 2021 |
Conference
Conference | International Conference on Industry 4.0 and Smart Manufacturing |
---|---|
Abbreviated title | ISM 2021 |
Country/Territory | Austria |
City | Hagenberg |
Period | 17.11.2021 → 19.11.2021 |
Keywords
- herbs
- heterogeneous model ensembles
- machine learning
- yeast contamination