Testing noisy numerical data for monotonic association

Ulrich Bodenhofer, Martin Krone, Frank Klawonn

Research output: Contribution to journal › Article › peer-review

14 Citations (Scopus)

Abstract

Rank correlation measures are intended to measure the extent to which there is a monotonic association between two observables. They are designed mainly for ordinal data, however, and are not ideally suited for noisy numerical data. To better account for noisy data, a family of rank correlation measures has previously been introduced that replaces classical ordering relations with fuzzy relations that have smooth transitions, thereby ensuring that the correlation measure is continuous with respect to the data. This paper briefly reviews the basic concepts behind this family of rank correlation measures and investigates it from the viewpoint of robust statistics. On this basis, we then introduce a framework of novel rank correlation tests. An extensive experimental evaluation using a large number of simulated data sets demonstrates that the new tests outperform the classical variants in terms of type II error rates without sacrificing good performance in terms of type I error rates. This is mainly because the new tests are more robust to noise for small samples. The Gaussian rank correlation estimator turned out to be the best choice in situations where no prior knowledge about the data is available, whereas the new family of robust gamma tests provides an advantage in situations where information about the noise distribution is available. An implementation of all robust rank correlation tests used in this paper is available as an R package from the CRAN repository.
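
The idea of replacing a crisp ordering with a smooth one can be illustrated with a minimal sketch. It is not the paper's definition and not the API of the R package: the linear transition of width r, the use of the minimum as conjunction, and all function and parameter names (strict_fuzzy_less, fuzzy_gamma, r_x, r_y) are assumptions chosen for illustration. Setting both widths to 0 recovers the classical gamma coefficient.

```python
import numpy as np

def strict_fuzzy_less(values, r):
    """Degree in [0, 1] to which values[i] < values[j].

    Assumption: the degree rises linearly from 0 to 1 as the gap
    values[j] - values[i] grows from 0 to r. With r == 0 this is
    the classical crisp ordering.
    """
    diff = values[None, :] - values[:, None]  # diff[i, j] = values[j] - values[i]
    if r == 0:
        return (diff > 0).astype(float)
    return np.clip(diff / r, 0.0, 1.0)

def fuzzy_gamma(x, y, r_x, r_y):
    """Gamma-style rank correlation built on smooth orderings (illustrative only)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lx = strict_fuzzy_less(x, r_x)
    ly = strict_fuzzy_less(y, r_y)
    # Assumption: the minimum combines the two degrees ("x_i < x_j AND y_i < y_j").
    concordant = np.minimum(lx, ly).sum()
    # Discordant pairs: x_i < x_j while y_j < y_i.
    discordant = np.minimum(lx, ly.T).sum()
    total = concordant + discordant
    return (concordant - discordant) / total if total > 0 else 0.0

# A tiny perturbation that swaps two nearly tied y-values.
x = [1.0, 2.0, 3.0]
y_before = [1.0, 2.0, 2.01]
y_after = [1.0, 2.0, 1.99]

crisp_jump = fuzzy_gamma(x, y_before, 0, 0) - fuzzy_gamma(x, y_after, 0, 0)
smooth_jump = fuzzy_gamma(x, y_before, 0.5, 0.5) - fuzzy_gamma(x, y_after, 0.5, 0.5)
print(crisp_jump, smooth_jump)
```

In this toy example the crisp coefficient changes sharply when the two nearly tied values swap order, while the smooth variant changes only slightly, which is the continuity property the abstract refers to.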

Original language: English
Pages (from-to): 21-37
Number of pages: 17
Journal: Information Sciences
Volume: 245
Publication status: Published - 1 Oct 2013

Keywords

  • Fuzzy ordering
  • Gamma correlation coefficient
  • R package rococo
  • Rank correlation
  • Rank correlation test
  • Robust statistics
