Keyword Clustering in Biomedical Information Retrieval Using Evolutionary Algorithms

Viktoria Dorfer, Stephan Winkler, Thomas Kern, Sophie Anna Blank, Gerald Petz, Patrizia Faschang

Publication: Contribution to book/report/conference proceedings, conference contribution

Abstract

As the amount of available data in the life sciences grows exponentially, intelligent search strategies are needed to support information retrieval. Here we describe a new keyword clustering method: given a set of documents (D), keyword clusters are optimized so that each cluster consists of keywords that frequently occur together in D. The keyword clusters generated in this way are intended to serve in the near future as a solid basis for a new PubMed search tool based on query extension, which will also use user feedback to optimize the search process. We have defined several important characteristics for clustering candidates, including the data set coverage, the cluster confidence (the ratio of clustered keywords that are found in the same documents), and the document confidence (the amount of keywords shared by the documents assigned to a cluster through their keywords). Evolutionary algorithms were applied to solve this optimization task, among them evolution strategies (ES) and a multi-objective genetic algorithm (NSGA-II, used because the optimization objectives are partially contradictory). To test this approach we used data published for the TREC-9 conference, containing 36,890 entries, from which we extracted the most significant keywords for clustering using tf-idf weighting. In the first optimization results, the best result obtained with a 10+1 ES provides 23.5% data set coverage, 45.2% cluster confidence, and 23.4% document confidence; with NSGA-II we obtained, for example, results with respective values of 71%, 56%, and 37%.
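
The abstract states that the most significant keywords were extracted using tf-idf weighting but gives no details. The following is only a minimal sketch of one common tf-idf variant (relative term frequency times the natural log of inverse document frequency), not the authors' actual procedure. The function name `top_keywords`, the choice to rank each term by its highest score in any document, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def top_keywords(documents, k):
    """Return the k terms with the highest tf-idf score in any document.

    `documents` is a list of token lists. This scoring scheme is an
    assumption for illustration; the abstract does not specify one.
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in documents for term in set(doc))

    best = {}
    for doc in documents:
        counts = Counter(doc)
        for term, count in counts.items():
            tf = count / len(doc)              # relative term frequency
            idf = math.log(n_docs / df[term])  # 0 if the term is in every document
            score = tf * idf
            # Keep only positive scores, so ubiquitous terms are dropped.
            if score > best.get(term, 0.0):
                best[term] = score

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus, not taken from the TREC-9 data.
docs = [
    ["insulin", "diabetes", "patient", "insulin"],
    ["diabetes", "patient", "glucose"],
    ["tumor", "patient", "biopsy"],
]
print(top_keywords(docs, 3))
```

In this toy corpus "patient" occurs in every document, so its inverse document frequency is zero and it never becomes a clustering candidate, while a term concentrated in a single document, such as "insulin", ranks first.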
Translated title: Keyword Clustering in Biomedical Information Retrieval Using Evolutionary Algorithms
Original language: German
Title of host publication: Proceedings of the 19th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and 10th European Conference on Computational Biology (ECCB)
Publisher: International Society for Computational Biology
Publication status: Published - 2011
Event: 19th Annual International Conference on Intelligent Systems for Molecular Biology and 10th European Conference on Computational Biology - Vienna, Austria
Duration: 17 July 2011 to 19 July 2011

Conference

Conference: 19th Annual International Conference on Intelligent Systems for Molecular Biology and 10th European Conference on Computational Biology
Country/Territory: Austria
City: Vienna
Period: 17.07.2011 to 19.07.2011
