Abstract
Understanding the relationship between protein sequence and structure is one of the great challenges in biology. In the case of the ubiquitous coiled-coil motif, structure and occurrence have been described in extensive detail, but there is a lack of insight into the rules that govern oligomerization, i.e. how many α-helices form a given coiled coil. To shed new light on the formation of two- and three-stranded coiled coils, we developed a machine learning approach to identify rules in the form of weighted amino acid patterns. These rules form the basis of our classification tool, PrOCoil, which also visualizes the contribution of each individual amino acid to the overall oligomeric tendency of a given coiled-coil sequence. We discovered that sequence positions previously thought irrelevant to direct coiled-coil interaction have an undeniable impact on stoichiometry. Our rules also demystify the oligomerization behavior of the yeast transcription factor GCN4, which can now be described as a hybrid, part dimer and part trimer, with both theoretical and experimental justification.
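The idea of scoring a sequence with weighted amino acid patterns and attributing the score to individual residues can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the pattern encoding (first residue, its heptad position, number of intervening residues, second residue), the weight values, the sign convention, and the function name `score_coiled_coil` are hypothetical and are not taken from PrOCoil.

```python
# Hypothetical pattern weights, invented for this sketch (NOT PrOCoil's learned values).
# Key: (first residue, heptad position of first residue, gap, second residue),
# where gap is the number of residues between the two.
# Assumed sign convention: positive favors trimers, negative favors dimers.
PATTERN_WEIGHTS = {
    ("I", "a", 2, "L"): 0.5,
    ("L", "d", 3, "I"): -0.2,
}


def score_coiled_coil(sequence, register, weights=PATTERN_WEIGHTS, max_gap=6):
    """Sum the weights of all matching patterns and split each weight
    equally between the two residues that form the pattern."""
    if len(sequence) != len(register):
        raise ValueError("sequence and heptad register must have equal length")

    total = 0.0
    contributions = [0.0] * len(sequence)

    for i, (residue, position) in enumerate(zip(sequence, register)):
        for gap in range(max_gap + 1):
            j = i + gap + 1  # index of the second residue in the pair
            if j >= len(sequence):
                break
            weight = weights.get((residue, position, gap, sequence[j]), 0.0)
            if weight:
                total += weight
                contributions[i] += weight / 2
                contributions[j] += weight / 2

    return total, contributions


if __name__ == "__main__":
    # Toy sequence and register, not a real protein.
    total, contributions = score_coiled_coil("IAALAAAIAALAAA", "abcdefgabcdefg")
    print("overall score:", total)
    for residue, value in zip("IAALAAAIAALAAA", contributions):
        print(residue, round(value, 3))
```

Under the assumed convention, the sign of the overall score would indicate the predicted oligomeric tendency, and the per-residue values are the kind of quantity a contribution plot could display.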
Field | Value |
---|---|
Original language | English |
Journal | Molecular and Cellular Proteomics |
Volume | 10 |
Issue number | 5 |
DOIs | |
Publication status | Published - May 2011 |
Externally published | Yes |
Keywords
- Algorithms
- Amino Acid Motifs
- Area Under Curve
- Artificial Intelligence
- Basic-Leucine Zipper Transcription Factors/chemistry
- Computer Simulation
- Models, Molecular
- Molecular Sequence Annotation
- Mutant Proteins/chemistry
- Protein Multimerization
- ROC Curve
- Saccharomyces cerevisiae Proteins/chemistry