Effects of data grouping on calibration measures of classifiers

Stephan Dreiseitl, Melanie Osl

Publication: Contribution to book/report/conference proceedings, conference contribution, peer-reviewed

1 citation (Scopus)

Abstract

The calibration of a probabilistic classifier refers to the extent to which its probability estimates match the true class membership probabilities. Measuring the calibration of a classifier usually relies on performing chi-squared goodness-of-fit tests between grouped probabilities and the observations in these groups. We considered alternatives to the Hosmer-Lemeshow test, the standard chi-squared test with groups based on sorted model outputs. Since this grouping does not represent "natural" groupings in data space, we investigated a chi-squared test with grouping strategies in data space. Using a series of artificial data sets for which the correct models are known, and one real-world data set, we analyzed the performance of the Pigeon-Heyse test with groupings by self-organizing maps, k-means clustering, and random assignment of points to groups. We observed that the Pigeon-Heyse test offers slightly better performance than the Hosmer-Lemeshow test while being able to locate regions of poor calibration in data space.
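
To make the grouping idea concrete, the following is a minimal sketch of a Hosmer-Lemeshow-style chi-squared statistic that accepts an arbitrary assignment of points to groups. It is an illustration under stated assumptions, not the authors' implementation: the function names `grouped_chi_squared` and `sorted_output_groups` are hypothetical, the synthetic data are made up for the demonstration, and the statistic shown is the plain Hosmer-Lemeshow form rather than the Pigeon-Heyse statistic, which applies an additional correction that is not reproduced here.

```python
import random


def grouped_chi_squared(probs, labels, groups):
    """Hosmer-Lemeshow-style statistic for an arbitrary grouping.

    probs  : predicted probabilities of the positive class
    labels : observed 0/1 class labels
    groups : group identifier for each point (any hashable value)
    """
    # Per group: number of points, observed positives, expected positives.
    stats = {}
    for p, y, g in zip(probs, labels, groups):
        n, observed, expected = stats.get(g, (0, 0, 0.0))
        stats[g] = (n + 1, observed + y, expected + p)

    total = 0.0
    for n, observed, expected in stats.values():
        mean_p = expected / n
        # Skip degenerate groups whose mean prediction is exactly 0 or 1.
        if 0.0 < mean_p < 1.0:
            total += (observed - expected) ** 2 / (n * mean_p * (1.0 - mean_p))
    return total


def sorted_output_groups(probs, n_groups=10):
    """Assign points to equal-sized groups by rank of predicted probability."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    groups = [0] * len(probs)
    for rank, i in enumerate(order):
        groups[i] = rank * n_groups // len(probs)
    return groups


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [rng.random() for _ in range(2000)]
    # Labels are drawn from the predicted probabilities (a calibrated model).
    labels = [1 if rng.random() < p else 0 for p in probs]

    by_output = grouped_chi_squared(probs, labels, sorted_output_groups(probs))
    random_groups = [rng.randrange(10) for _ in probs]
    by_random = grouped_chi_squared(probs, labels, random_groups)
    print(f"sorted-output grouping: {by_output:.2f}")
    print(f"random grouping:        {by_random:.2f}")
```

Passing cluster labels from a method such as k-means as `groups` would give a grouping in data space instead of in output space. For the sorted-output grouping, the statistic is commonly referred to a chi-squared distribution with G - 2 degrees of freedom for G groups; the appropriate reference distribution for other groupings is not assumed here.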

Original language: English
Title: Computer Aided Systems Theory, EUROCAST 2011 - 13th International Conference, Revised Selected Papers
Pages: 359-366
Number of pages: 8
Edition: PART 1
Publication status: Published - 2012
Event: 13th International Conference on Computer Aided Systems Theory EUROCAST 2011 - Las Palmas, Spain
Duration: 6 Feb 2011 to 11 Feb 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6927 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Computer Aided Systems Theory EUROCAST 2011
Country/Territory: Spain
City: Las Palmas
Period: 06.02.2011 to 11.02.2011
