Accurately Predicting User Registration in Highly Unbalanced Real-World Datasets from Online News Portals

Eva-Maria Spitzer, Oliver Krauss, Andreas Stöckl

Publication: Chapter in book/report/conference proceedings, conference contribution, peer-reviewed

1 citation (Scopus)


Getting visitors to register is a crucial factor in marketing for online news portals. Current approaches are rule-based, awarding points for specific actions [3]. Finding efficient rules can be challenging and depends on the specific task. Registrations are generally rare compared to regular visits, leading to highly imbalanced data. We analyze different supervised classification algorithms, taking the data imbalance into account. As a case study, we use anonymized real-world data from an Austrian newspaper outlet containing visitors' session behavior, with registrations occurring in around 0.1% of all visits. We identify an ensemble approach combining the Balanced Random Forest Classifier and the RUSBoost Classifier that correctly identifies 76% of registrations across five independent data sets.
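To illustrate why a registration rate of around 0.1% calls for imbalance-aware evaluation, the following minimal sketch compares accuracy with recall (the share of actual registrations that are found, which is how the 76% figure above reads). The session counts and both sets of predictions are hypothetical values chosen for illustration; they are not taken from the paper's data or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of all sessions labelled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of actual registrations (label 1) that were detected."""
    positives = sum(y_true)
    true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return true_positives / positives if positives else 0.0


# Hypothetical data: 10,000 sessions, of which 10 (0.1%) end in a registration.
y_true = [1] * 10 + [0] * 9990

# Baseline that never predicts a registration.
always_negative = [0] * len(y_true)

# Hypothetical detector: finds 8 of the 10 registrations, raises 200 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 200 + [0] * 9790

baseline_accuracy = accuracy(y_true, always_negative)  # 0.999
baseline_recall = recall(y_true, always_negative)      # 0.0
detector_accuracy = accuracy(y_true, detector)         # 0.9798
detector_recall = recall(y_true, detector)             # 0.8
```

The trivial baseline scores higher on accuracy than the detector while finding no registrations at all, which is why recall on the minority class is the more informative measure in this setting.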

Title: Database and Expert Systems Applications - 33rd International Conference, DEXA 2022, Proceedings
Editors: Christine Strauss, Alfredo Cuzzocrea, Gabriele Kotsis, Ismail Khalil, A Min Tjoa
Publisher: Springer
ISBN (Print): 9783031124228
Publication status: Published - 29 July 2022


Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13426 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

