Approximation properties of haplotype tagging

Staal Vinterbo, Stephan Dreiseitl, Lucila Ohno-Machado

Research output: Contribution to journalArticlepeer-review

2 Citations (Scopus)


Background: Single nucleoticle polymorphisms (SNPs) are locations at which the genomic sequences of population members differ. Since these differences are known to follow patterns, disease association studies are facilitated by identifying SNPs that allow the unique identification of such patterns. This process, known as haplotype tagging, is formulated as a combinatorial optimization problem and analyzed in terms of complexity and approximation properties. Results: It is shown that the tagging problem is NP-hard but approximable within 1 + ln((n2 - n)/2) for n haplotypes but not approximable within (1 - ε) ln(n/2) for any ε > 0 unless NP ⊂ Dtime(n log log n). A simple, very easily implementable algorith m that exhibits the above upper bound on solution quality is presented. This algorithm has running time O(np/2(2m-p+1)) ≤ O(m(n2-n)/2) where p ≤ min(n, m) for n haplotypes of size m. As we show that the approximation bound is asymptotically tight, the algorithm presented is optimal with respect to this asymptotic bound. Conclusions: The haplotype tagging problem is hard, but approachable with a fast, practical, and surprisingly simple algorithm that cannot be significantly improved upon on a single processor machine. Hence, significant improvement in computatational efforts expended can only be expected if the computational effort is distributed and done in parallel.

Original languageEnglish
Article number8
Pages (from-to)8
Number of pages5
JournalBMC Bioinformatics
Issue number8
Publication statusPublished - 9 Jan 2006


  • Algorithms
  • Animals
  • Chromosome Mapping
  • Computational Biology/methods
  • Genome, Human
  • Haplotypes
  • Humans
  • Models, Genetic
  • Models, Statistical
  • Models, Theoretical
  • Polymorphism, Single Nucleotide
  • Reproducibility of Results
  • Sequence Alignment
  • Software


Dive into the research topics of 'Approximation properties of haplotype tagging'. Together they form a unique fingerprint.

Cite this