Abstract
Feature selection is an important first step in building machine learning models. Feature selection algorithms can be grouped into wrappers and filters: the former use machine learning models to evaluate feature sets, while the latter use other criteria to evaluate features individually. We present a new approach to feature selection that combines the advantages of both wrapper and filter approaches by using logistic regression and the area under the ROC curve (AUC) to evaluate pairs of features. Starting with the feature that has the highest individual discriminatory power, we incrementally rank features by choosing as the next feature the one that achieves the highest AUC in combination with an already chosen feature. To evaluate our approach, we compared it to standard filter and wrapper algorithms. Using two data sets from the biomedical domain, we demonstrate that the performance of our approach exceeds that of filter methods and is comparable to that of wrapper methods at a smaller computational cost.
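The incremental ranking described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: individual discriminatory power is measured as the AUC of a single-feature logistic regression, a candidate's score is its best pairwise AUC over all already chosen features, AUC is computed on the training data rather than by cross-validation, and the model is fitted with plain gradient descent. The names `auc`, `fit_logistic`, `subset_auc`, and `rank_features` are hypothetical.

```python
import math
import random

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_logistic(X, y, epochs=200, lr=0.1):
    """Batch gradient descent on the logistic loss; returns weights with the bias last."""
    d, n = len(X[0]), len(X)
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j in range(d):
                grad[j] += err * xi[j]
            grad[-1] += err
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

def subset_auc(X, y, cols):
    """In-sample AUC of a logistic regression restricted to the columns in `cols`."""
    sub = [[row[c] for c in cols] for row in X]
    w = fit_logistic(sub, y)
    scores = [sum(wj * xj for wj, xj in zip(w, r)) + w[-1] for r in sub]
    return auc(scores, y)

def rank_features(X, y):
    """Rank all features: best single feature first, then greedy pairwise-AUC selection."""
    remaining = list(range(len(X[0])))
    cache = {}  # pairwise AUCs, so each pair is fitted only once

    def pair_auc(f, g):
        key = (min(f, g), max(f, g))
        if key not in cache:
            cache[key] = subset_auc(X, y, list(key))
        return cache[key]

    first = max(remaining, key=lambda f: subset_auc(X, y, [f]))
    ranked = [first]
    remaining.remove(first)
    while remaining:
        # Assumption: a candidate is scored by its best pairing with any chosen feature.
        best = max(remaining, key=lambda f: max(pair_auc(f, g) for g in ranked))
        ranked.append(best)
        remaining.remove(best)
    return ranked

# Toy data (made up for illustration): column 1 is informative, columns 0 and 2 are noise.
random.seed(0)
y = [i % 2 for i in range(60)]
X = [[random.gauss(0, 1), 2 * yi + random.gauss(0, 0.3), random.gauss(0, 1)] for yi in y]
print(rank_features(X, y))
```

Because only single features and pairs are ever fitted, the number of models grows at most quadratically with the number of features, which is one plausible reading of the lower computational cost compared with wrappers that search over larger feature subsets.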
Original language | English |
---|---|
Title | Computer Aided Systems Theory, EUROCAST 2009 - 12th International Conference, Revised Selected Papers |
Pages | 769-776 |
Number of pages | 8 |
DOIs | |
Publication status | Published - 2009 |
Event | Twelve International Conference on Computer Aided Systems Theory 2009 - Las Palmas, Spain Duration: 15 Feb 2009 → 20 Feb 2009 http://www.iuctc.ulpgc.es/spain/eurocast2009/index.html |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 5717 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | Twelve International Conference on Computer Aided Systems Theory 2009 |
---|---|
Country/Territory | Spain |
City | Las Palmas |
Period | 15 Feb 2009 → 20 Feb 2009 |
Internet address |