The ongoing digitalisation of industrial production in the context of Industry 4.0 has increased the availability of continuous sensor data. These data streams have great potential for the early detection of anomalies, which is crucial for preventing unplanned downtime, reducing costs and improving safety. However, high data rates, varying operating conditions and complex patterns pose significant challenges for traditional detection methods. This thesis systematically investigates how clustering as a preprocessing step affects the accuracy and efficiency of anomaly detection in industrial sensor data. Several clustering algorithms (K-means, DBSCAN and HDBSCAN) were combined with well-established anomaly detection methods (Isolation Forest, Local Outlier Factor (LOF), Principal Component Analysis, Autoencoders and Long Short-Term Memory networks) and evaluated on real-world industrial datasets. The results showed that LOF and the Autoencoder achieved the best performance without clustering. Adding clustering did not significantly improve performance and in some cases degraded it. In particular, the density-based methods DBSCAN and HDBSCAN implicitly classified outliers during clustering, which reduced the effectiveness of the subsequent anomaly detection. These findings improve our understanding of the role of clustering in detecting anomalies in continuous sensor data. They demonstrate that robust detection methods can achieve reliable and practical results without clustering, supporting the development of resource-efficient, real-time monitoring systems in industrial environments.
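The observation about density-based clustering can be illustrated with a small sketch. The code below is not the thesis's implementation: it is a minimal pure-Python DBSCAN written for illustration, and the sensor readings, `eps` and `min_samples` values are made-up assumptions. It shows that DBSCAN already assigns the noise label `-1` to an isolated reading, so one plausible reading of the result above is that a detector applied afterwards, per cluster, has little left to find.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: returns one label per point, -1 meaning noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_samples:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        # points[i] is a core point: start a new cluster and expand it.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_samples:
                queue.extend(j_neighbours)  # j is also a core point
    return labels


# Hypothetical 2-D sensor readings: two operating regimes plus one stray point.
regime_a = [(1.0 + 0.1 * x, 1.0 + 0.1 * y) for x in range(3) for y in range(3)]
regime_b = [(5.0 + 0.1 * x, 5.0 + 0.1 * y) for x in range(3) for y in range(3)]
stray = [(3.0, 9.0)]
readings = regime_a + regime_b + stray

labels = dbscan(readings, eps=0.3, min_samples=4)

# Readings that the clustering step itself has already set aside as noise.
flagged = [p for p, lab in zip(readings, labels) if lab == NOISE]
# Readings that would be handed, cluster by cluster, to a downstream detector.
remaining = [p for p, lab in zip(readings, labels) if lab != NOISE]

print("labels:", labels)
print("flagged during clustering:", flagged)
print("left for the detector:", len(remaining))
```

In this toy setting the stray reading never reaches the downstream detector as part of a cluster, whereas a detector such as LOF applied directly to all readings would see it alongside its normal neighbours. Library implementations such as scikit-learn's `DBSCAN` use the same `-1` noise label.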
| Date of Award | 2025 |
|---|---|
| Original language | German (Austria) |
| Supervisor | Philipp Fleck (Supervisor) |
- Information Engineering and -Management
Einfluss von Clustering auf die Anomalieerkennung in Sensordaten (Influence of Clustering on Anomaly Detection in Sensor Data)
Graf, M. (Author). 2025
Student thesis: Master's Thesis