The generating function of CID, ETD, and CID/ETD pairs of tandem mass spectra: applications to database search.
Kim, Sangtae; Mischerikow, Nikolai; Bandeira, Nuno; Navarro, J Daniel; Wich, Louis; Mohammed, Shabaz; Heck, Albert J R; Pevzner, Pavel A.
Affiliation
  • Kim S; Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, USA.
Mol Cell Proteomics; 9(12): 2840-52, 2010 Dec.
Article in En | MEDLINE | ID: mdl-20829449
ABSTRACT
Recent emergence of new mass spectrometry techniques (e.g. electron transfer dissociation, ETD) and improved availability of additional proteases (e.g. Lys-N) for protein digestion in high-throughput experiments raised the challenge of designing new algorithms for interpreting the resulting new types of tandem mass (MS/MS) spectra. Traditional MS/MS database search algorithms such as SEQUEST and Mascot were originally designed for collision induced dissociation (CID) of tryptic peptides and are largely based on expert knowledge about fragmentation of tryptic peptides (rather than machine learning techniques) to design CID-specific scoring functions. As a result, the performance of these algorithms is suboptimal for new mass spectrometry technologies or nontryptic peptides. We recently proposed the generating function approach (MS-GF) for CID spectra of tryptic peptides. In this study, we extend MS-GF to automatically derive scoring parameters from a set of annotated MS/MS spectra of any type (e.g. CID, ETD, etc.), and present a new database search tool MS-GFDB based on MS-GF. We show that MS-GFDB outperforms Mascot for ETD spectra or peptides digested with Lys-N. For example, in the case of ETD spectra, the number of tryptic and Lys-N peptides identified by MS-GFDB increased by a factor of 2.7 and 2.6 as compared with Mascot. Moreover, even following a decade of Mascot developments for analyzing CID spectra of tryptic peptides, MS-GFDB (that is not particularly tailored for CID spectra or tryptic peptides) resulted in 28% increase over Mascot in the number of peptide identifications. Finally, we propose a statistical framework for analyzing multiple spectra from the same precursor (e.g. CID/ETD spectral pairs) and assigning p values to peptide-spectrum-spectrum matches.
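The generating function idea mentioned in the abstract can be illustrated with a toy sketch. The code below is not the MS-GF scoring model. It assumes integer residue masses, a hypothetical score attached to each prefix mass, and a uniform residue probability, and it uses dynamic programming to compute the score distribution over all residue sequences of a given total mass. The tail of that distribution at or above a threshold plays the role of a p value for a match with that score. The names `score_distribution`, `spectral_probability`, and `prefix_score` are illustrative only.

```python
from collections import defaultdict


def score_distribution(total_mass, residue_masses, prefix_score, residue_prob):
    """Return {score: probability} over all residue sequences whose masses
    sum to total_mass (toy model, integer masses, assumed scoring)."""
    # dist[m] maps a score to the total probability of all prefixes
    # that have mass m and reach exactly that score.
    dist = [defaultdict(float) for _ in range(total_mass + 1)]
    dist[0][0] = 1.0  # the empty prefix has mass 0 and score 0
    for m in range(1, total_mass + 1):
        gain = prefix_score.get(m, 0)  # score earned by a prefix of mass m
        for r in residue_masses:
            prev = m - r
            if prev < 0:
                continue
            # Extend every prefix of mass prev by one residue of mass r.
            for s, p in dist[prev].items():
                dist[m][s + gain] += p * residue_prob
    return dict(dist[total_mass])


def spectral_probability(distribution, threshold):
    """Total probability of sequences scoring at least threshold."""
    return sum(p for s, p in distribution.items() if s >= threshold)


# Toy alphabet with two residues of mass 1 and 2, each with probability 0.5.
toy = score_distribution(
    total_mass=3,
    residue_masses=[1, 2],
    prefix_score={1: 1, 2: 1},  # hypothetical "peak present" rewards
    residue_prob=0.5,
)
p_value = spectral_probability(toy, threshold=2)
```

In this toy setting only the sequence of three mass-1 residues reaches a score of 2, so the tail probability at that threshold is 0.125. The DP visits each mass once, so the cost grows with the total mass and the number of distinct scores rather than with the number of sequences.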
Subjects

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: Databases, Protein / Tandem Mass Spectrometry Study type: Prognostic_studies Limits: Humans Language: En Journal: Mol Cell Proteomics Journal subject: MOLECULAR BIOLOGY / BIOCHEMISTRY Publication year: 2010 Document type: Article Affiliation country: United States