On the Performance of Evolutionary Algorithms in Biomedical Keyword Clustering

Viktoria Dorfer, Stephan Winkler, Thomas Kern, Sophie Anna Blank, Gerald Petz, Patrizia Faschang

Publication: Chapter in book/report/conference proceedings, conference contribution, peer-reviewed

3 citations (Scopus)

Abstract

In the life sciences, quickly finding the desired information is often a challenge because of the huge amount of available data. The research area of information retrieval (IR) addresses this problem. One approach used in IR is query extension based on keyword or document clusters. In this paper we present an in-depth analysis of a keyword clustering approach using four kinds of evolutionary algorithms: evolution strategy (ES), genetic algorithm (GA), genetic algorithm with strict offspring selection (OSGA), and the multi-objective elitist non-dominated sorting genetic algorithm (NSGA-II). We have identified features that characterize solution candidates for the keyword clustering problem, e.g., the number of documents covered and how well the identified keyword clusters match the occurrence of keywords in the given set of documents. We show how these features are used and how evolutionary algorithms can optimize keyword clusters. To test the approach we used a real-world data set provided within the TREC-9 conference; this collection includes information about approximately 36,000 documents from the PubMed database. In the results section we compare the performance of the tested evolutionary algorithms and find that ES and NSGA-II in particular produce meaningful results for this document collection. The approach is intended for further use in automated query extension for biomedical information retrieval in PubMed.
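
The two features named in the abstract (documents covered, and how well clusters match keyword occurrence) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the abstract does not give the paper's fitness function, encoding, or operators, so the function names, the "at least two shared keywords" coverage rule, the equal weighting of the two terms, and the simple (1+1) evolution strategy are all hypothetical choices, not the authors' method.

```python
import random


def clusters_from(assignment, k):
    """Turn a keyword -> cluster-index mapping into a list of non-empty keyword sets."""
    clusters = [set() for _ in range(k)]
    for keyword, idx in assignment.items():
        clusters[idx].add(keyword)
    return [c for c in clusters if c]


def fitness(assignment, docs, k):
    """Hypothetical fitness: coverage plus match quality, both in [0, 1].

    coverage: share of documents that contain at least two keywords of one cluster
    match:    per document, the best fraction of a cluster's keywords it contains
    """
    clusters = clusters_from(assignment, k)
    covered = 0
    match_total = 0.0
    for doc in docs:
        match_total += max(len(c & doc) / len(c) for c in clusters)
        if any(len(c & doc) >= 2 for c in clusters):
            covered += 1
    return covered / len(docs) + match_total / len(docs)


def evolve(docs, k=2, generations=200, seed=0):
    """(1+1) evolution strategy: mutate one keyword's cluster, keep the child if not worse."""
    rng = random.Random(seed)
    keywords = sorted(set().union(*docs))
    parent = {kw: rng.randrange(k) for kw in keywords}
    parent_fit = fitness(parent, docs, k)
    for _ in range(generations):
        child = dict(parent)
        child[rng.choice(keywords)] = rng.randrange(k)  # reassign one keyword
        child_fit = fitness(child, docs, k)
        if child_fit >= parent_fit:
            parent, parent_fit = child, child_fit
    return clusters_from(parent, k), parent_fit


# Toy corpus: each document is the set of keywords it is annotated with.
docs = [{"a", "b"}, {"a", "b"}, {"c", "d"}, {"c", "d"}]
good = {"a": 0, "b": 0, "c": 1, "d": 1}  # clusters follow co-occurrence
bad = {"a": 0, "c": 0, "b": 1, "d": 1}   # clusters cut across documents
print(fitness(good, docs, 2), fitness(bad, docs, 2))
print(evolve(docs))
```

On this toy corpus the assignment that follows keyword co-occurrence scores higher than the one that cuts across documents, which is the behaviour a fitness function of this kind is meant to reward. A multi-objective method such as NSGA-II would keep the two terms as separate objectives rather than summing them.
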
Original language: English
Title: Genetic and Evolutionary Computation Conference, GECCO'11 - Companion Publication
Publisher: ACM Sigevo
Pages: 511-518
Number of pages: 8
ISBN (Print): 9781450306904
DOIs
Publication status: Published - 2011
Event: Genetic and Evolutionary Computation Conference (GECCO) 2011 - Dublin, Ireland
Duration: 12 July 2011 to 16 July 2011
http://www.sigevo.org/gecco-2011/

Publication series

Name: Genetic and Evolutionary Computation Conference, GECCO'11 - Companion Publication

Conference

Conference: Genetic and Evolutionary Computation Conference (GECCO) 2011
Country/Territory: Ireland
City: Dublin
Period: 12.07.2011 to 16.07.2011
Internet address
