Process pruner: A tool for sequence-based event log preprocessing

David Baumgartner, Andreas Haghofer, Martin Limberger, Emmanuel Helm

Publication: Contribution to journal, conference article, peer-reviewed


A major challenge in applying process mining to real event data is the presence of noisy or incomplete cases and unusual behavior. Applying process mining to raw event data leads to wrong conclusions during the discovery of process models, because the noise conceals the typical behavior. This paper presents an alternative for filtering event data that does not require extensive preprocessing. The method is based on footprint matrices generated from randomly pruned sub-logs and works in a semi-automated manner. By identifying the most similar matrices to validate the whole log, traces representing unusual behavior can be excluded or highlighted. The tool was implemented with Python 3, NumPy and Pandas and is publicly available on GitHub. We evaluated the tool using benchmark datasets and compared it to human filtering and discovery results.
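The core idea of comparing footprint matrices can be illustrated with a minimal sketch. It is not the tool's implementation: it uses plain Python rather than NumPy and Pandas, and the function names (`directly_follows`, `footprint`, `similarity`, `random_sublog`), the cell-by-cell similarity measure, the toy log and the `keep_ratio` parameter are all assumptions made for illustration.

```python
import random

# Footprint relation for an ordered pair (a, b), keyed by
# (a directly followed by b?, b directly followed by a?).
RELATIONS = {
    (True, False): "->",   # causality
    (False, True): "<-",   # reverse causality
    (True, True): "||",    # parallel
    (False, False): "#",   # no direct succession
}


def directly_follows(log):
    """Set of (a, b) pairs where b directly follows a in some trace."""
    return {(t[i], t[i + 1]) for t in log for i in range(len(t) - 1)}


def footprint(log, activities):
    """Footprint matrix as nested lists, rows and columns in `activities` order."""
    df = directly_follows(log)
    return [
        [RELATIONS[((a, b) in df, (b, a) in df)] for b in activities]
        for a in activities
    ]


def similarity(fp_a, fp_b):
    """Fraction of cells on which two equally sized footprint matrices agree."""
    cells = [x == y for row_a, row_b in zip(fp_a, fp_b) for x, y in zip(row_a, row_b)]
    return sum(cells) / len(cells)


def random_sublog(log, keep_ratio, rng):
    """Randomly pruned sub-log keeping roughly `keep_ratio` of the traces."""
    size = max(1, round(keep_ratio * len(log)))
    return rng.sample(log, size)


# Hypothetical toy log: two frequent variants plus one unusual trace.
log = [("a", "b", "c", "d")] * 5 + [("a", "c", "b", "d")] * 5 + [("a", "d", "b")]
activities = sorted({act for trace in log for act in trace})

full = footprint(log, activities)         # includes the unusual trace
clean = footprint(log[:-1], activities)   # unusual trace left out
sub = random_sublog(log, 0.7, random.Random(42))

print(similarity(full, clean))            # 0.75
```

In this toy log the single unusual trace changes 4 of the 16 footprint cells, so the two matrices agree on 75 % of their cells. Sub-logs that happen to exclude such a trace yield matrices that agree closely with one another, which is the kind of signal a footprint-based filter can build on.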

Pages (from-to): 1-4
Journal: CEUR Workshop Proceedings
Publication status: Published - 2019
Event: ICPM Demo Track 2019 - Aachen, Germany
Duration: 24 June 2019 to 26 June 2019

