Results 1 - 2 of 2
1.
Mol Cell Proteomics ; 7(6): 1135-45, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18303013

ABSTRACT

High-throughput identification of peptides in databases from tandem mass spectrometry data is a key technique in modern proteomics. Common approaches to interpreting large-scale peptide identification results are based on the statistical analysis of average score distributions, which are constructed from the set of best scores produced by large collections of MS/MS spectra using search engines such as SEQUEST. Other approaches calculate individual peptide identification probabilities on the basis of theoretical models or from single-spectrum score distributions constructed from the set of scores produced by each MS/MS spectrum. In this work, we study the mathematical properties of average SEQUEST score distributions by introducing the concept of spectrum quality and expressing these average distributions as compositions of single-spectrum distributions. We predict and demonstrate in practice that average score distributions are dominated by the quality distribution in the spectra collection, except in the low-probability region, where it is possible to predict the dependence of average probability on database size. Our analysis leads to a novel indicator, the probability ratio, which takes optimal account of the statistical information provided by the first and second best scores. The probability ratio is a non-parametric and robust indicator that makes classifying spectra by parameters such as charge state unnecessary and yields a peptide identification performance, measured by false discovery rates, that is better than that obtained by other empirical statistical approaches. The probability ratio also compares favorably with statistical probability indicators obtained by constructing single-spectrum SEQUEST score distributions. The robustness, conceptual simplicity, and ease of automation of the probability ratio algorithm make it a very attractive alternative for determining peptide identification confidences and error rates in high-throughput experiments.
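
The abstract compares indicators by their false discovery rates but does not say how those rates were estimated, and it does not give the formula for the probability ratio. As a rough illustration only, the sketch below shows one common way to estimate a false discovery rate at a score threshold, assuming a separate target/decoy search and an indicator where higher values are better. The function name `estimate_fdr`, the `(score, is_decoy)` input layout, and the decoy-based estimator are all assumptions made here, not the authors' procedure.

```python
from typing import List, Tuple


def estimate_fdr(matches: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate the false discovery rate among matches accepted at `threshold`.

    `matches` holds (score, is_decoy) pairs; higher scores are assumed better.
    Assumption: decoy hits model random target hits one-to-one, so the
    estimate is (accepted decoys) / (accepted targets).
    """
    accepted = [is_decoy for score, is_decoy in matches if score >= threshold]
    decoys = sum(accepted)
    targets = len(accepted) - decoys
    if targets == 0:
        # Nothing accepted from the target database: report zero by convention.
        return 0.0
    return decoys / targets


# Hypothetical scores, invented purely to exercise the function.
example_matches = [(5.0, False), (4.0, False), (3.5, True), (3.0, False), (2.0, True)]
fdr_loose = estimate_fdr(example_matches, 3.0)   # 1 decoy, 3 targets accepted
fdr_strict = estimate_fdr(example_matches, 4.5)  # 0 decoys, 1 target accepted
```

Sweeping the threshold and comparing the number of accepted targets at a fixed estimated rate is one plausible way to compare two indicators on the same data set.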


Subject(s)
Mass Spectrometry/methods , Proteomics/methods , Algorithms , Automation , Computational Biology , Protein Databases , Humans , Jurkat Cells , Mesenchymal Stem Cells/metabolism , Statistical Models , Theoretical Models , Peptides/chemistry , Probability , Reproducibility of Results , Tandem Mass Spectrometry/methods
2.
Anal Chem ; 76(23): 6853-60, 2004 Dec 01.
Article in English | MEDLINE | ID: mdl-15571333

ABSTRACT

Recent technological advances have made multidimensional peptide separation techniques coupled with tandem mass spectrometry the method of choice for high-throughput identification of proteins. As a result, software tools for large-scale, fully automated, unambiguous peptide identification are much needed. In this work, we use the nuclear proteome of Jurkat cells as a model and present a processing algorithm that allows accurate predictions of random matching distributions, based on the two SEQUEST scores Xcorr and DeltaCn. Our method permits a very simple and precise calculation of the probabilities associated with individual peptide assignments, as well as of the false discovery rate among the peptides identified in any experiment. A further mathematical analysis demonstrates that the score distributions are highly dependent on database size and precursor mass window and suggests that the probability associated with SEQUEST scores depends on the number of candidate peptide sequences available for the search. Our results highlight the importance of adjusting the filtering criteria used to discriminate between correct and incorrect peptide sequences according to the circumstances of each particular experiment.
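
The dependence on the number of candidate sequences can be illustrated with a standard order-statistics argument. This is not necessarily the model used in the paper. If a single random candidate reaches a given score with probability p, and the N candidates are assumed independent, then the best of them reaches that score with probability 1 - (1 - p)^N, which is approximately N·p when p is small. The function name `best_match_probability` and the independence assumption are introduced here for illustration.

```python
def best_match_probability(p_single: float, n_candidates: int) -> float:
    """Probability that at least one of `n_candidates` random matches
    reaches a score whose single-candidate tail probability is `p_single`.

    Assumption: candidates are scored independently, which real database
    searches satisfy only approximately.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must lie in [0, 1]")
    if n_candidates < 0:
        raise ValueError("n_candidates must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_candidates


# Same score, two hypothetical candidate counts (for example a narrow and
# a wide precursor mass window): the random-match probability grows with N.
p_narrow = best_match_probability(1e-6, 1_000)
p_wide = best_match_probability(1e-6, 100_000)
```

Under this sketch, a fixed score threshold corresponds to a higher random-match probability when the database or mass window is larger, which is consistent with the abstract's point that filtering criteria should be adjusted per experiment.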


Subject(s)
Protein Databases , Statistical Models , Peptide Fragments/chemistry , Software , Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry/methods , Liquid Chromatography/methods , Two-Dimensional Gel Electrophoresis/methods , Humans , Jurkat Cells , Proteome/analysis , Sensitivity and Specificity , Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry/instrumentation