On the Performance of Evolutionary Algorithms in Biomedical Keyword Clustering

Viktoria Dorfer, Stephan Winkler, Thomas Kern, Sophie Anna Blank, Gerald Petz, Patrizia Faschang

Research output: Chapter in Book/Report/Conference proceedings › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

In the life sciences, quickly finding the desired information is often a challenge because of the huge amount of available data. The research area of information retrieval (IR) addresses this problem and tries to provide suitable solutions. One approach used in IR is query extension based on keyword or document clusters. In this paper we present an in-depth analysis of a keyword clustering approach using four different kinds of evolutionary algorithms: evolution strategy (ES), genetic algorithm (GA), genetic algorithm with strict offspring selection (OSGA), and the multi-objective elitist non-dominated sorting genetic algorithm (NSGA-II). We have identified features that characterize solution candidates for the keyword clustering problem, e.g., the number of documents covered and how well the identified clusters of keywords match the occurrence of keywords in the given set of documents. We show how these features are used and how evolutionary algorithms can be applied to optimize keyword clusters. To test the presented approach we used a real-world data set provided within the TREC-9 conference; this data collection includes information about approximately 36,000 documents collected from the PubMed database. In the results section we compare the performance of the tested evolutionary algorithms and see that ES and NSGA-II in particular produce meaningful results for this document collection. This approach based on evolutionary algorithms is intended to be used in automated query extension for biomedical information retrieval in PubMed.
Original language: English
Title of host publication: Genetic and Evolutionary Computation Conference, GECCO'11 - Companion Publication
Publisher: ACM Sigevo
Pages: 511-518
Number of pages: 8
ISBN (Print): 9781450306904
Publication status: Published - 2011
Event: Genetic and Evolutionary Computation Conference (GECCO) 2011 - Dublin, Ireland
Duration: 12 Jul 2011 to 16 Jul 2011
http://www.sigevo.org/gecco-2011/

Publication series

Name: Genetic and Evolutionary Computation Conference, GECCO'11 - Companion Publication

Conference

Conference: Genetic and Evolutionary Computation Conference (GECCO) 2011
Country/Territory: Ireland
City: Dublin
Period: 12.07.2011 to 16.07.2011

Keywords

  • bioinformatics
  • evolutionary algorithms
  • information retrieval
  • keyword clustering
  • query extension
