Accurately Predicting User Registration in Highly Unbalanced Real-World Datasets from Online News Portals

Eva-Maria Spitzer, Oliver Krauss, Andreas Stöckl

Publication: Contribution to book/report/conference proceedings › Conference contribution › Peer-reviewed

Abstract

Getting visitors to register is a crucial factor in marketing for online news portals. Current approaches are rule-based, awarding points for specific actions [3]. Finding efficient rules can be challenging and depends on the specific task. Registrations are generally rare compared to regular visits, leading to highly imbalanced data. We analyze different supervised classification algorithms, taking the data imbalance into account. As a case study, we use anonymized real-world data from an Austrian newspaper outlet containing the visitors' session behavior, with around 0.1% registrations over all visits. We identify an ensemble approach, combining the Balanced Random Forest Classifier and the RUSBoost Classifier, that correctly identifies 76% of registrations across five independent data sets.
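To make the figures in the abstract concrete, the following is a minimal sketch of why plain accuracy is uninformative at a registration rate of about 0.1%, and of what "correctly identifying 76% of registrations" means when read as recall. The session count of 100,000 and the helper names `recall` and `accuracy` are assumptions chosen for illustration; they are not taken from the paper, and the abstract does not report false-positive counts, so none are assumed here.

```python
def recall(tp: int, fn: int) -> float:
    """Share of actual registrations that the classifier finds."""
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all sessions that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical data set: 100,000 sessions, of which 0.1% end in a registration.
n_sessions = 100_000
n_registrations = 100

# Trivial baseline that never predicts a registration.
baseline_accuracy = accuracy(
    tp=0, tn=n_sessions - n_registrations, fp=0, fn=n_registrations
)
baseline_recall = recall(tp=0, fn=n_registrations)

# A classifier that finds 76% of the registrations, as stated in the abstract.
found = 76
missed = n_registrations - found
ensemble_recall = recall(tp=found, fn=missed)

print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"baseline recall:   {baseline_recall:.2f}")
print(f"ensemble recall:   {ensemble_recall:.2f}")
```

Under these assumed counts, the trivial baseline is correct on almost every session while finding no registrations at all, which is why recall on the minority class is the more telling measure for this kind of data.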

Original language: English
Title: Database and Expert Systems Applications - 33rd International Conference, DEXA 2022, Proceedings
Editors: Christine Strauss, Alfredo Cuzzocrea, Gabriele Kotsis, Ismail Khalil, A Min Tjoa
Place of publication: Cham
Publisher: Springer
Pages: 302-315
Number of pages: 14
ISBN (Print): 9783031124228
Publication status: Published - 29 July 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13426 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349
