Improved classification of mass spectrometry database search results using newer machine learning approaches.

Ulintz, Peter J; Zhu, Ji; Qin, Zhaohui S; Andrews, Philip C

Affiliation

Ulintz PJ; National Resource for Proteomics and Pathways, School of Public Health, University of Michigan, Ann Arbor, Michigan 48109, USA. pulintz@umich.edu

Mol Cell Proteomics ; 5(3): 497-509, 2006 Mar.

Article in English | MEDLINE | ID: mdl-16321970

ABSTRACT

Manual analysis of mass spectrometry data is a current bottleneck in high-throughput proteomics. In particular, the need to manually validate the results of mass spectrometry database searching algorithms can be prohibitively time-consuming. Development of software tools that attempt to quantify the confidence in the assignment of a protein or peptide identity to a mass spectrum is an area of active interest. We sought to extend work in this area by investigating the potential of recent machine learning algorithms both to improve the accuracy of these approaches and to serve as a flexible framework for accommodating new data features. Specifically, we demonstrated the ability of boosting and random forest approaches to improve the discrimination of true hits from false positive identifications in the results of mass spectrometry database search engines compared with thresholding and other machine learning approaches. We accommodated additional attributes obtainable from database search results, including a factor addressing proton mobility. Performance was evaluated using publicly available electrospray data and a new collection of MALDI data generated from purified human reference proteins.

Subject(s)

Artificial Intelligence; Computational Biology/methods; Databases as Topic; Peptides/analysis; Peptides/classification; Proteomics/methods; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods; Amino Acid Sequence; Humans; Molecular Sequence Data; Peptides/chemistry; Proteins/analysis; Proteins/chemistry; Software

Collection: 01-international | Database: MEDLINE | Main subject: Peptides / Artificial Intelligence / Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization / Computational Biology / Databases as Topic / Proteomics | Study type: Prognostic_studies | Limit: Humans | Language: English | Journal: Mol Cell Proteomics | Journal subject: MOLECULAR BIOLOGY / BIOCHEMISTRY | Year: 2006 | Document type: Article | Affiliation country: United States
