Machine Learning Update Strategies for Real-time Production Environments

Research output: Chapter in Book/Report/Conference proceedings › Conference contribution › peer-review

Abstract

Modern application scenarios in the dynamic industrial, financial, and economic sectors increasingly require quick and agile machine learning solutions. Instead of waiting hours for batch processing systems to deliver results, these systems should ideally adapt and make decisions as soon as new data comes in. As demand for real-time machine learning on streaming data steadily increases, this paper explores a software architecture that combines the Apache Kafka ecosystem with Microsoft's machine learning framework ML.NET for reliable data processing and model adaptation. The research addresses the difficulty of deciding when to retrain these models in an unbounded data stream context. Various update strategies, including online, periodic, and performance-based model training, are evaluated for effectiveness under different conditions. The goal of this paper is to propose a fully autonomous machine learning pipeline that keeps models up to date while minimizing the computational cost of retraining and preserving prediction accuracy.
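The three update strategies named in the abstract can be pictured as interchangeable retraining triggers. The following is a minimal sketch only, not the paper's implementation: it is written in Python rather than ML.NET, it involves no Kafka, and every class name, parameter, and threshold (`OnlineTrigger`, `PeriodicTrigger`, `PerformanceTrigger`, `interval`, `window`, `threshold`) is a hypothetical chosen for illustration. Each trigger receives the prediction error of the latest record and decides whether the model should be retrained.

```python
from collections import deque
from dataclasses import dataclass, field


class OnlineTrigger:
    """Online strategy (illustrative): update the model on every record."""

    def should_retrain(self, error: float) -> bool:
        return True


@dataclass
class PeriodicTrigger:
    """Periodic strategy (illustrative): retrain after every `interval` records."""

    interval: int
    _seen: int = 0

    def should_retrain(self, error: float) -> bool:
        self._seen += 1
        if self._seen >= self.interval:
            self._seen = 0  # start counting toward the next period
            return True
        return False


@dataclass
class PerformanceTrigger:
    """Performance-based strategy (illustrative): retrain when the mean
    error over the last `window` records exceeds `threshold`."""

    window: int
    threshold: float
    _errors: deque = field(default_factory=deque)

    def should_retrain(self, error: float) -> bool:
        self._errors.append(error)
        if len(self._errors) > self.window:
            self._errors.popleft()  # keep only the most recent `window` errors
        window_full = len(self._errors) == self.window
        if window_full and sum(self._errors) / self.window > self.threshold:
            self._errors.clear()  # judge the retrained model on fresh errors
            return True
        return False


def count_retrains(trigger, errors) -> int:
    """Feed a sequence of per-record errors to a trigger and count retrains."""
    return sum(trigger.should_retrain(e) for e in errors)
```

Under these assumptions, passing the same error sequence to `count_retrains` with each trigger makes the trade-off described in the abstract visible: the online trigger fires on every record, the periodic trigger fires at a fixed rate regardless of model quality, and the performance-based trigger fires only once the windowed error degrades.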
Original language: English
Title of host publication: Eurocast 2024
Publisher: Springer
Publication status: Accepted/In press - 2024
Event: EUROCAST 2024: 19th International Conference on Computer Aided Systems Theory - Museo Elder de la Ciencia y la Tecnología, Las Palmas de Gran Canaria, Spain
Duration: 25 Feb 2024 to 1 Mar 2024
https://eurocast2024.fulp.ulpgc.es

Conference

Conference: EUROCAST 2024
Country/Territory: Spain
City: Las Palmas de Gran Canaria
Period: 25.02.2024 to 01.03.2024

Keywords

  • Data Streaming
  • Machine Learning
  • Update Strategies
