Abstract
Mass spectrometry is an experimental technique that allows the study of the proteome, the entirety of proteins in a biological system. Proteomics analysis requires dedicated algorithms for peptide identification from mass spectra. The size of the search space, i.e., the total number of peptide candidates considered during identification, is a major factor in the sensitivity and performance of the analysis.
This thesis summarizes the author's research on the development of bioinformatics algorithms that use optimized search spaces for peptide identification from mass spectrometry data. Major contributions include a novel algorithm for peptide identification using spectral library search (MS Ana). By using the optimized search space of a library created from previously identified spectra, MS Ana is able to identify more spectra than comparable algorithms, and it controls the false discovery rate better through an improved algorithm for decoy spectrum generation. Moreover, the author developed an approach to directly optimize the search space for the identification of spectra originating from phosphorylated peptides (PhoStar), which uses machine learning to detect phosphorylation before identification. An analysis workflow using PhoStar can decrease the search space without losing identifications. Furthermore, the author contributed to the MetaProteomeAnalyzer software, which uses different strategies to handle the large search spaces that are common in metaproteomics data analysis. All algorithms have been implemented in user-friendly software packages and made freely available. Additional contributions include research on the application of machine learning in the life sciences and on mathematical modeling of biological systems, such as stochastic modeling of viral protein production in host cells during influenza infection.
| Original language | English |
|---|---|
| Supervisors/Advisors | |
| Award date | 5 Mar 2025 |
| Publication status | Published - 2025 |
| Title | Identifying peptides and proteins in fragment ion mass spectrometry data using optimized search spaces |