Learning the high-dimensional immunogenomic features that predict public and private antibody repertoires

Victor Greiff, Cedric R. Weber, Johannes Palme, Ulrich Bodenhofer, Enkelejda Miho, Ulrike Menzel, Sai T. Reddy

Research output: Contribution to journal › Article › peer-review

76 Citations (Scopus)


Recent studies have revealed that immune repertoires contain a substantial fraction of public clones, which may be defined as Ab or TCR clonal sequences shared across individuals. It has remained unclear whether public clones possess predictable sequence features that differentiate them from private clones, which are believed to be generated largely stochastically. This knowledge gap represents a lack of insight into the shaping of immune repertoire diversity. Leveraging a machine learning approach capable of capturing the high-dimensional compositional information of each clonal sequence (defined by CDR3), we detected predictive public clone and private clone-specific immunogenomic differences concentrated in CDR3's N1-D-N2 region, which allowed the prediction of public and private status with 80% accuracy in humans and mice. Our results unexpectedly demonstrate that public, as well as private, clones possess predictable high-dimensional immunogenomic features. Our support vector machine model could be trained effectively on large published datasets (3 million clonal sequences) and was sufficiently robust for public clone prediction across individuals and studies prepared with different library preparation and high-throughput sequencing protocols. In summary, we have uncovered the existence of high-dimensional immunogenomic rules that shape immune repertoire diversity in a predictable fashion. Our approach may pave the way for the construction of a comprehensive atlas of public mouse and human immune repertoires with potential applications in rational vaccine design and immunotherapeutics.
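As a rough illustration of what "compositional information" of a clonal sequence can mean in practice, the sketch below counts overlapping k-mers in a CDR3 string and compares two sequences by the dot product of their k-mer counts (a spectrum-style string kernel). This is an assumption made for illustration only: the abstract does not state which feature encoding or kernel the support vector machine used, and the function names, the choice of k = 3, and the example sequences are all hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k=3):
    """Count overlapping k-mers in a sequence (hypothetical helper)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=3):
    """Similarity of two sequences: dot product of their k-mer count vectors."""
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    # Counter returns 0 for k-mers absent from counts_b, so only shared k-mers contribute.
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


# Made-up CDR3-like amino acid strings, not taken from the study.
seq_a = "CARDYYGSSYFDYW"
seq_b = "CARDYYGSNYFDYW"  # differs from seq_a at one position
seq_c = "CTRGGLRWYFDVW"   # mostly different composition

print(spectrum_kernel(seq_a, seq_b), spectrum_kernel(seq_a, seq_c))
```

A matrix of such pairwise similarities is the kind of input a kernel-based classifier such as a support vector machine can be trained on, with public or private status as the label. Whether the published model was built this way is not stated in the abstract.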

Original language: English
Pages (from-to): 2985-2997
Number of pages: 13
Journal: Journal of Immunology
Issue number: 8
Publication status: Published - 15 Oct 2017
Externally published: Yes


  • Animals
  • Antibody Diversity
  • B-Lymphocytes/physiology
  • Clonal Selection, Antigen-Mediated
  • Clone Cells
  • Complementarity Determining Regions/genetics
  • Datasets as Topic
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Immunotherapy/methods
  • Mice
  • Mice, Inbred BALB C
  • Mice, Inbred C57BL
  • Receptors, Antigen, B-Cell/genetics
  • Receptors, Antigen, T-Cell/genetics
  • T-Lymphocytes/physiology
  • Vaccines/immunology

