Search | BVS CLAP/SMR-OPS/OMS

Application of de Novo Sequencing to Large-Scale Complex Proteomics Data Sets.

Devabhaktuni, Arun; Elias, Joshua E.

J Proteome Res ; 15(3): 732-42, 2016 Mar 04.

Article in English | MEDLINE | ID: mdl-26743026

ABSTRACT

Dependent on concise, predefined protein sequence databases, traditional search algorithms perform poorly when analyzing mass spectra derived from wholly uncharacterized protein products. Conversely, de novo peptide sequencing algorithms can interpret mass spectra without relying on reference databases. However, such algorithms have been difficult to apply to complex protein mixtures, in part due to a lack of methods for automatically validating de novo sequencing results. Here, we present novel metrics for benchmarking de novo sequencing algorithm performance on large-scale proteomics data sets and present a method for accurately calibrating false discovery rates on de novo results. We also present a novel algorithm (LADS) that leverages experimentally disambiguated fragmentation spectra to boost sequencing accuracy and sensitivity. LADS improves sequencing accuracy on longer peptides relative to that of other algorithms and improves discriminability of correct and incorrect sequences. Using these advancements, we demonstrate accurate de novo identification of peptide sequences not identifiable using database search-based approaches.

Subject(s)

Algorithms; Proteomics/methods; Sequence Analysis, Protein/methods; Amino Acid Sequence; Databases, Protein; Peptides/chemistry; Sequence Analysis, Protein/standards; Software; Tandem Mass Spectrometry
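
The abstract above mentions benchmarking de novo results and calibrating false discovery rates but does not describe how. As a rough illustration only, the following minimal sketch computes an empirical false discovery rate on a labeled benchmark set, where each de novo sequence is compared with a trusted reference sequence. It is not the authors' method: the function names, the tuple layout `(score, predicted, reference)`, and the exact-match criterion are all assumptions made for this example. Isoleucine and leucine are treated as equivalent because they have the same mass.

```python
def normalize(seq):
    """Upper-case a peptide and map I to L, since the two residues are isobaric."""
    return seq.upper().replace("I", "L")


def is_correct(predicted, reference):
    """Hypothetical correctness criterion: exact match after I/L normalization."""
    return normalize(predicted) == normalize(reference)


def empirical_fdr(results, threshold):
    """Fraction of accepted results (score >= threshold) that are incorrect.

    `results` is an iterable of (score, predicted, reference) tuples.
    Returns 0.0 when nothing passes the threshold.
    """
    accepted = [(p, r) for s, p, r in results if s >= threshold]
    if not accepted:
        return 0.0
    wrong = sum(1 for p, r in accepted if not is_correct(p, r))
    return wrong / len(accepted)


# Illustrative, made-up benchmark entries (not data from the paper).
benchmark = [
    (0.9, "PEPTIDE", "PEPTLDE"),  # counted correct: differs only by I/L
    (0.8, "ACDK", "ACDK"),
    (0.5, "GGGR", "AAAR"),        # incorrect
    (0.4, "LLK", "IIK"),          # counted correct: differs only by I/L
]
```

Raising the threshold in `empirical_fdr(benchmark, threshold)` trades sensitivity for a lower error rate, which is the trade-off a calibrated score cutoff is meant to control.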

TagGraph reveals vast protein modification landscapes from large tandem mass spectrometry datasets.

Devabhaktuni, Arun; Lin, Sarah; Zhang, Lichao; Swaminathan, Kavya; Gonzalez, Carlos G; Olsson, Niclas; Pearlman, Samuel M; Rawson, Keith; Elias, Joshua E.

Nat Biotechnol ; 37(4): 469-479, 2019 Apr.

Article in English | MEDLINE | ID: mdl-30936560

ABSTRACT

Although mass spectrometry is well suited to identifying thousands of potential protein post-translational modifications (PTMs), it has historically been biased towards just a few. To measure the entire set of PTMs across diverse proteomes, software must overcome the dual challenges of covering enormous search spaces and distinguishing correct from incorrect spectrum interpretations. Here, we describe TagGraph, a computational tool that overcomes both challenges with an unrestricted string-based search method that is as much as 350-fold faster than existing approaches, and a probabilistic validation model that we optimized for PTM assignments. We applied TagGraph to a published human proteomic dataset of 25 million mass spectra and tripled confident spectrum identifications compared to its original analysis. We identified thousands of modification types on almost 1 million sites in the proteome. We show alternative contexts for highly abundant yet understudied PTMs such as proline hydroxylation, and its unexpected association with cancer mutations. By enabling broad characterization of PTMs, TagGraph informs as to how their functions and regulation intersect.

Subject(s)

Databases, Protein/statistics & numerical data; Protein Processing, Post-Translational; Software; Tandem Mass Spectrometry/statistics & numerical data; Algorithms; Amino Acid Sequence; Bayes Theorem; Biotechnology; Cell Line, Tumor; Humans; Hydroxylation; Models, Statistical; Peptides/chemistry; Peptides/genetics; Proteome; Proteomics/statistics & numerical data; Search Engine; Sequence Alignment/statistics & numerical data
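
The abstract above describes an unrestricted string-based search for modified peptides without giving implementation details. The sketch below is not TagGraph's algorithm; it only illustrates the general idea under simple assumptions: a short sequence tag is located in protein sequences by plain substring search, and the difference between an observed peptide mass and the unmodified theoretical mass is compared against a small table of common modification masses. The helper names, the tolerance, and the three-entry modification table are hypothetical choices for this example; the residue and modification masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

# Small illustrative subset of modification mass shifts (daltons).
KNOWN_SHIFTS = {
    "hydroxylation/oxidation": 15.994915,
    "acetylation": 42.010565,
    "phosphorylation": 79.966331,
}


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def find_tag(tag, proteins):
    """Return (protein_id, offset) for every occurrence of `tag`."""
    hits = []
    for pid, sequence in proteins.items():
        start = sequence.find(tag)
        while start != -1:
            hits.append((pid, start))
            start = sequence.find(tag, start + 1)
    return hits


def explain_delta(observed_mass, seq, tol=0.01):
    """Names of known shifts matching observed minus theoretical mass."""
    delta = observed_mass - peptide_mass(seq)
    return [name for name, shift in KNOWN_SHIFTS.items()
            if abs(delta - shift) <= tol]
```

For instance, a peptide observed about 15.995 Da heavier than its unmodified sequence would be flagged as a candidate hydroxylation or oxidation, consistent with the proline hydroxylation example the abstract highlights. A real tool would also have to localize the modified residue and validate the assignment statistically.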
