Comparing true and estimated false discovery rates in spectral library search

Activity: Talk or presentation, Oral presentation


Introduction

Spectral library search uses spectrum-to-spectrum matching to identify peptides from fragment ion spectra. The approach is attracting growing interest in the mass spectrometry community thanks to the increasing number of readily available spectral libraries. Given a suitable library, spectrum-to-spectrum matching leads to higher sensitivity and faster processing times than database search. However, spectral library search lacks a consensus strategy for validating results and controlling the false discovery rate (FDR). The commonly accepted method for estimating the FDR in MS/MS experiments is to perform a database search against a concatenated target and decoy database, in which the decoy entries simulate the occurrence of false positive identifications. Applying this target-decoy approach (TDA) to spectral library search is complicated because generating decoy spectra is non-trivial. Suitable decoys need to be different from the real spectra but still similar enough to be mistaken for experimentally observed spectra.

Methods

Calculating the true FDR requires prior knowledge of the peptides present in an experiment, which is inherently difficult to obtain. We use HCD MS/MS fragment ion data of synthetic peptides from the ProteomeTools [1] project for searches in the NIST human HCD spectral library. With the list of synthesized peptide sequences, we establish a ground truth and calculate the true FDR for each spectral library search. We evaluate search performance based on the accuracy of the target-decoy FDR estimate and the distribution of search hits in the target and decoy libraries. We test several spectral library search engines as well as different methods for generating decoy spectral libraries.

Results and Discussion

To reach broader acceptance in proteomics research, spectral library search needs a unified standard approach for FDR control. Our comparisons demonstrate the problems in applying the target-decoy approach to spectrum-to-spectrum matching: first results indicate that the TDA for spectral libraries underestimates the FDR. In our experiments, at 1% estimated FDR the calculated true FDR was 7.35% on average. The process by which decoy spectra are generated is of particular importance and remains an open issue. Additionally, we set up cross-validation searches using redundant spectral library information to verify our results. Furthermore, we compare these results with results from database search under equivalent conditions.
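
The comparison between estimated and true FDR described in the Methods can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not the pipeline used in this work: the `Match` record, the function names, the toy scores and peptide strings are hypothetical, the estimate uses the simple decoys/targets ratio (one common estimator for concatenated target-decoy searches), and a target hit is counted as false whenever its peptide is absent from the list of synthesized sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """One spectrum match from a concatenated target-decoy library search (hypothetical record)."""
    score: float    # search engine score, higher is better (assumption)
    peptide: str    # peptide sequence of the matched library entry
    is_decoy: bool  # True if the matched entry comes from the decoy library


def estimated_fdr(matches, threshold):
    """Target-decoy estimate: accepted decoy hits divided by accepted target hits."""
    targets = sum(1 for m in matches if m.score >= threshold and not m.is_decoy)
    decoys = sum(1 for m in matches if m.score >= threshold and m.is_decoy)
    return decoys / targets if targets else 0.0


def true_fdr(matches, threshold, ground_truth):
    """Fraction of accepted target hits whose peptide is not in the ground-truth set."""
    accepted = [m for m in matches if m.score >= threshold and not m.is_decoy]
    if not accepted:
        return 0.0
    wrong = sum(1 for m in accepted if m.peptide not in ground_truth)
    return wrong / len(accepted)


# Toy data, invented for illustration only.
matches = [
    Match(0.95, "PEPTIDEK", False),
    Match(0.90, "SAMPLER", False),
    Match(0.85, "WRONGK", False),    # target hit, but peptide was never synthesized
    Match(0.82, "OTHERK", False),    # target hit, but peptide was never synthesized
    Match(0.80, "DECOYK", True),     # decoy hit
    Match(0.40, "LOWSCORER", False), # below the score threshold
]
ground_truth = {"PEPTIDEK", "SAMPLER", "LOWSCORER"}

est = estimated_fdr(matches, threshold=0.5)
true = true_fdr(matches, threshold=0.5, ground_truth=ground_truth)
print(f"estimated FDR: {est:.2%}, true FDR: {true:.2%}")
```

In this toy case the decoy-based estimate is lower than the ground-truth value, which is the kind of gap the comparison is meant to expose. A real evaluation would choose the score threshold so that the estimated FDR reaches a fixed level such as 1% and then report the true FDR at that threshold.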
Period: 7 Sept 2017
Event title: 15th Austrian Proteomic Research Symposium (APRS 2017)
Event type: Conference
Location: Graz, Austria