Search | BVS - Ministry of Health

1.

Sequencing-grade de novo analysis of MS/MS triplets (CID/HCD/ETD) from overlapping peptides.

Guthals, Adrian; Clauser, Karl R; Frank, Ari M; Bandeira, Nuno.

J Proteome Res ; 12(6): 2846-57, 2013 Jun 07.

Article in English | MEDLINE | ID: mdl-23679345

ABSTRACT

Full-length de novo sequencing of unknown proteins remains a challenging open problem. Traditional methods that sequence spectra individually are limited by short peptide length, incomplete peptide fragmentation, and ambiguous de novo interpretations. We address these issues by determining consensus sequences for assembled tandem mass (MS/MS) spectra from overlapping peptides (e.g., by using multiple enzymatic digests). We have combined electron-transfer dissociation (ETD) with collision-induced dissociation (CID) and higher-energy collision-induced dissociation (HCD) fragmentation methods to boost interpretation of long, highly charged peptides and take advantage of corroborating b/y/c/z ions in CID/HCD/ETD. Using these strategies, we show that triplet CID/HCD/ETD MS/MS spectra from overlapping peptides yield de novo sequences of average length 70 AA and as long as 200 AA at up to 99% sequencing accuracy.

Subjects

Algorithms; Peptide Fragments/isolation & purification; Proteins/chemistry; Sequence Analysis, Protein/methods; Tandem Mass Spectrometry/methods; Animals; Armoracia/chemistry; Escherichia coli/chemistry; Horses; Humans; Mice; Proteolysis; Sensitivity and Specificity

2.

Spectral archives: extending spectral libraries to analyze both identified and unidentified spectra.

Frank, Ari M; Monroe, Matthew E; Shah, Anuj R; Carver, Jeremy J; Bandeira, Nuno; Moore, Ronald J; Anderson, Gordon A; Smith, Richard D; Pevzner, Pavel A.

Nat Methods ; 8(7): 587-91, 2011 May 15.

Article in English | MEDLINE | ID: mdl-21572408

ABSTRACT

Tandem mass spectrometry (MS/MS) experiments yield multiple, nearly identical spectra of the same peptide in various laboratories, but proteomics researchers typically do not leverage the unidentified spectra produced in other labs to decode spectra they generate. We propose a spectral archives approach that clusters MS/MS datasets, representing similar spectra by a single consensus spectrum. Spectral archives extend spectral libraries by analyzing both identified and unidentified spectra in the same way and maintaining information about peptide spectra that are common across species and conditions. Thus archives offer both traditional library spectrum similarity-based search capabilities along with new ways to analyze the data. By developing a clustering tool, MS-Cluster, we generated a spectral archive from ~1.18 billion spectra that greatly exceeds the size of existing spectral repositories. We advocate that publicly available data should be organized into spectral archives rather than be analyzed as disparate datasets, as is mostly the case today.

Subjects

Databases, Factual; Peptides/analysis; Proteins/analysis; Tandem Mass Spectrometry/methods; Archives; Peptides/chemistry; Proteins/chemistry; Proteomics/methods

3.

Predicting intensity ranks of peptide fragment ions.

Frank, Ari M.

J Proteome Res ; 8(5): 2226-40, 2009 May.

Article in English | MEDLINE | ID: mdl-19256476

ABSTRACT

Accurate modeling of peptide fragmentation is necessary for the development of robust scoring functions for peptide-spectrum matches, which are the cornerstone of MS/MS-based identification algorithms. Unfortunately, peptide fragmentation is a complex process that can involve several competing chemical pathways, which makes it difficult to develop generative probabilistic models that describe it accurately. However, the vast amounts of MS/MS data being generated now make it possible to use data-driven machine learning methods to develop discriminative ranking-based models that predict the intensity ranks of a peptide's fragment ions. We use simple sequence-based features that get combined by a boosting algorithm into models that make peak rank predictions with high accuracy. In an accompanying manuscript, we demonstrate how these prediction models are used to significantly improve the performance of peptide identification algorithms. The models can also be useful in the design of optimal multiple reaction monitoring (MRM) transitions, in cases where there is insufficient experimental data to guide the peak selection process. The prediction algorithm can also be run independently through PepNovo+, which is available for download from http://bix.ucsd.edu/Software/PepNovo.html.

Subjects

Algorithms; Peptide Fragments/analysis; Tandem Mass Spectrometry/methods; Peptide Fragments/chemistry; Peptides/analysis; Peptides/chemistry; Proteomics/methods; Reproducibility of Results; Sequence Analysis, Protein/methods

4.

A ranking-based scoring function for peptide-spectrum matches.

Frank, Ari M.

J Proteome Res ; 8(5): 2241-52, 2009 May.

Article in English | MEDLINE | ID: mdl-19231891

ABSTRACT

The analysis of the large volume of tandem mass spectrometry (MS/MS) proteomics data that is generated these days relies on automated algorithms that identify peptides from their mass spectra. An essential component of these algorithms is the scoring function used to evaluate the quality of peptide-spectrum matches (PSMs). In this paper, we present a new approach to scoring PSMs. We argue that since this problem is at its core a ranking task (especially in the case of de novo sequencing), it can be solved effectively using machine learning ranking algorithms. We developed a new discriminative boosting-based approach to scoring. Our scoring models draw upon a large set of diverse feature functions that measure different qualities of PSMs. Our method improves the performance of our de novo sequencing algorithm beyond the current state-of-the-art, and also greatly enhances the performance of database search programs. Furthermore, by increasing the efficiency of tag filtration and improving the sensitivity of PSM scoring, we make it practical to perform large-scale MS/MS analysis, such as proteogenomic search of a six-frame translation of the human genome (in which we achieve a reduction of the running time by a factor of 15 and a 60% increase in the number of identified peptides, compared to the InsPecT database search tool). Our scoring function is incorporated into PepNovo+, which is available for download or can be run online at http://bix.ucsd.edu.

Subjects

Peptides/analysis; Sequence Analysis, Protein/methods; Tandem Mass Spectrometry/methods; Algorithms; Databases, Protein; Peptides/chemistry; Proteomics/methods; Reproducibility of Results

5.

Interpreting top-down mass spectra using spectral alignment.

Frank, Ari M; Pesavento, James J; Mizzen, Craig A; Kelleher, Neil L; Pevzner, Pavel A.

Anal Chem ; 80(7): 2499-505, 2008 Apr 01.

Article in English | MEDLINE | ID: mdl-18302345

ABSTRACT

Recent advances in mass spectrometry instrumentation, such as FTICR and OrbiTrap, have made it possible to generate high-resolution spectra of entire proteins. While these methods offer new opportunities for performing "top-down" studies of proteins, the computational tools for analyzing top-down data are still scarce. In this paper we investigate the application of spectral alignment to the problem of identifying protein forms in top-down mass spectra (i.e., identifying the modifications, mutations, insertions, and deletions). We demonstrate how spectral alignment efficiently discovers protein forms even in the presence of numerous modifications and how the algorithm can be extended to discover positional isomers from spectra of mixtures of isobaric protein forms.

Subjects

Mass Spectrometry/instrumentation; Mass Spectrometry/methods; Algorithms; Histones/chemistry; Humans; Reproducibility of Results

6.

Clustering millions of tandem mass spectra.

Frank, Ari M; Bandeira, Nuno; Shen, Zhouxin; Tanner, Stephen; Briggs, Steven P; Smith, Richard D; Pevzner, Pavel A.

J Proteome Res ; 7(1): 113-22, 2008 Jan.

Article in English | MEDLINE | ID: mdl-18067247

ABSTRACT

Tandem mass spectrometry (MS/MS) experiments often generate redundant data sets containing multiple spectra of the same peptides. Clustering of MS/MS spectra takes advantage of this redundancy by identifying multiple spectra of the same peptide and replacing them with a single representative spectrum. Analyzing only representative spectra results in significant speed-up of MS/MS database searches. We present an efficient clustering approach for analyzing large MS/MS data sets (over 10 million spectra) with a capability to reduce the number of spectra submitted to further analysis by an order of magnitude. The MS/MS database search of clustered spectra results in fewer spurious hits to the database and increases the number of peptide identifications as compared to regular nonclustered searches. Our open source software MS-Clustering is available for download at http://peptide.ucsd.edu or can be run online at http://proteomics.bioprojects.org/MassSpec.

Subjects

Cluster Analysis; Peptides/analysis; Proteomics/methods; Tandem Mass Spectrometry; Amino Acid Sequence; Computational Biology; Molecular Sequence Data
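
The clustering idea in the abstract above (grouping repeated spectra of the same peptide and keeping one representative per group) can be illustrated with a minimal sketch. This is not the MS-Clustering algorithm: the 1 Da binning, the cosine similarity measure, the 0.8 threshold, the greedy assignment, and all function names are assumptions chosen for illustration, and precursor-mass matching is omitted.

```python
import math
from collections import defaultdict

def normalize(vec):
    """Scale a sparse vector (dict) to unit length; empty if all-zero."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else {}

def bin_spectrum(peaks, bin_width=1.0):
    """Convert (m/z, intensity) pairs into a unit-length binned vector."""
    vec = defaultdict(float)
    for mz, intensity in peaks:
        vec[int(mz / bin_width)] += intensity
    return normalize(vec)

def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())

def cluster_spectra(spectra, threshold=0.8):
    """Greedily assign each spectrum to the most similar cluster consensus.

    Returns a list of clusters, each a list of indices into `spectra`.
    """
    clusters = []  # each entry: {"members": [...], "total": summed vectors}
    for idx, peaks in enumerate(spectra):
        vec = bin_spectrum(peaks)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = cosine(vec, normalize(cluster["total"]))
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(idx)
            for k, v in vec.items():  # fold the spectrum into the consensus
                best["total"][k] += v
        else:
            clusters.append({"members": [idx], "total": defaultdict(float, vec)})
    return [c["members"] for c in clusters]

# Hypothetical toy data: two near-identical spectra and one unrelated spectrum.
spectra = [
    [(100.2, 10.0), (200.1, 5.0)],
    [(100.3, 9.0), (200.4, 6.0)],
    [(350.0, 7.0), (420.5, 3.0)],
]
groups = cluster_spectra(spectra)
```

Here the first two spectra fall into one cluster and the third forms its own, so a downstream search would process two representatives instead of three spectra.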

7.

Sequence similarity-driven proteomics in organisms with unknown genomes by LC-MS/MS and automated de novo sequencing.

Waridel, Patrice; Frank, Ari; Thomas, Henrik; Surendranath, Vineeth; Sunyaev, Shamil; Pevzner, Pavel; Shevchenko, Andrej.

Proteomics ; 7(14): 2318-29, 2007 Jul.

Article in English | MEDLINE | ID: mdl-17623296

ABSTRACT

LC-MS/MS analysis on a linear ion trap LTQ mass spectrometer, combined with data processing and with stringent and sequence-similarity database searching tools, was employed in a layered manner to identify proteins in organisms with unsequenced genomes. Highly specific stringent searches (MASCOT) were applied as a first layer screen to identify either known (i.e. present in a database) proteins, or unknown proteins sharing identical peptides with related database sequences. Once the confidently matched spectra were removed, the remainder was filtered against a nonannotated library of background spectra, which removed spectra of common protein and chemical contaminants from the dataset. The rectified spectral dataset was further subjected to rapid batch de novo interpretation by PepNovo software, followed by the MS BLAST sequence-similarity search that used multiple redundant and partially accurate candidate peptide sequences. Importantly, a single dataset was acquired at uncompromised sensitivity with no need for manual selection of MS/MS spectra for subsequent de novo interpretation. This approach enabled a completely automated identification of novel proteins that were otherwise missed by conventional database searches.

Subjects

Chromatography, Liquid/methods; Genome; Proteomics/methods; Sequence Homology, Amino Acid; Tandem Mass Spectrometry/methods; Algal Proteins/chemistry; Algal Proteins/metabolism; Amino Acid Sequence; Chlorophyta; Chromatography, Liquid/instrumentation; Computer Simulation; Databases, Protein; Membrane Proteins/chemistry; Membrane Proteins/metabolism; Molecular Sequence Data; Proteomics/instrumentation; Software; Tandem Mass Spectrometry/instrumentation

8.

Protein identification by spectral networks analysis.

Bandeira, Nuno; Tsur, Dekel; Frank, Ari; Pevzner, Pavel A.

Proc Natl Acad Sci U S A ; 104(15): 6140-5, 2007 Apr 10.

Article in English | MEDLINE | ID: mdl-17404225

ABSTRACT

Advances in tandem mass spectrometry (MS/MS) steadily increase the rate of generation of MS/MS spectra. As a result, the existing approaches that compare spectra against databases are already facing a bottleneck, particularly when interpreting spectra of modified peptides. Here we explore a concept that allows one to perform an MS/MS database search without ever comparing a spectrum against a database. We propose to take advantage of spectral pairs, which are pairs of spectra obtained from overlapping (often nontryptic) peptides or from unmodified and modified versions of the same peptide. Having a spectrum of a modified peptide paired with a spectrum of an unmodified peptide allows one to separate the prefix and suffix ladders, to greatly reduce the number of noise peaks, and to generate a small number of peptide reconstructions that are likely to contain the correct one. The MS/MS database search is thus reduced to extremely fast pattern-matching (rather than time-consuming matching of spectra against databases). In addition to speed, our approach provides a unique paradigm for identifying posttranslational modifications by means of spectral networks analysis.

Subjects

Algorithms; Computational Biology/methods; Databases, Protein; Proteins/chemistry; Tandem Mass Spectrometry/methods; Protein Processing, Post-Translational

9.

De novo peptide sequencing and identification with precision mass spectrometry.

Frank, Ari M; Savitski, Mikhail M; Nielsen, Michael L; Zubarev, Roman A; Pevzner, Pavel A.

J Proteome Res ; 6(1): 114-23, 2007 Jan.

Article in English | MEDLINE | ID: mdl-17203955

ABSTRACT

The recent proliferation of novel mass spectrometers such as Fourier transform, QTOF, and OrbiTrap marks a transition into the era of precision mass spectrometry, providing a 2 orders of magnitude boost to the mass resolution, as compared to low-precision ion-trap detectors. We investigate peptide de novo sequencing by precision mass spectrometry and explore some of the differences when compared to analysis of low-precision data. We demonstrate how the dramatically improved performance of de novo sequencing with precision mass spectrometry paves the way for novel approaches to peptide identification that are based on direct sequence lookups, rather than comparisons of spectra to a database. With the direct sequence lookup, it is not only possible to search a database very efficiently, but also to use the database in novel ways, such as searching for products of alternative splicing or products of fusion proteins in cancer. Our de novo sequencing software is available for download at http://peptide.ucsd.edu/.

Subjects

Mass Spectrometry/methods; Peptides/chemistry; Algorithms; Alternative Splicing; Computational Biology/methods; Data Interpretation, Statistical; Databases, Protein; Fourier Analysis; Genome, Human; Genomics; Humans; Neoplasm Proteins/chemistry; Probability; Sequence Analysis, Protein; Spectroscopy, Fourier Transform Infrared

10.

Rapid validation of protein identifications with the borderline statistical confidence via de novo sequencing and MS BLAST searches.

Wielsch, Natalie; Thomas, Henrik; Surendranath, Vineeth; Waridel, Patrice; Frank, Ari; Pevzner, Pavel; Shevchenko, Andrej.

J Proteome Res ; 5(9): 2448-56, 2006 Sep.

Article in English | MEDLINE | ID: mdl-16944958

ABSTRACT

Protein identifications with borderline statistical confidence are typically produced by matching a few marginal quality MS/MS spectra to database peptide sequences and represent a significant bottleneck in the reliable and reproducible characterization of proteomes. Here, we present a method for rapid validation of borderline hits that circumvents the need for often-biased manual inspection of raw MS/MS spectra. The approach takes advantage of the independent interpretation of corresponding MS/MS spectra by PepNovo de novo sequencing software followed by mass spectrometry-driven BLAST (MS BLAST) sequence-similarity database searches that utilize all partially inaccurate, degenerate and redundant candidate peptide sequences. In a case study involving the identification of more than 180 Caenorhabditis elegans proteins by nanoLC-MS/MS analysis on a linear ion trap LTQ mass spectrometer, the approach enabled rapid assignment (confirmation or rejection) of more than 70% of Mascot hits of borderline statistical confidence.

Subjects

Caenorhabditis elegans Proteins/analysis; Computational Biology/methods; Proteomics/methods; Sequence Analysis, Protein/methods; Amino Acid Sequence; Animals; Caenorhabditis elegans Proteins/isolation & purification; Mass Spectrometry/methods; Molecular Sequence Data

11.

Peptide sequence tags for fast database search in mass-spectrometry.

Frank, Ari; Tanner, Stephen; Bafna, Vineet; Pevzner, Pavel.

J Proteome Res ; 4(4): 1287-95, 2005.

Article in English | MEDLINE | ID: mdl-16083278

ABSTRACT

Filtration techniques in the form of rapid elimination of candidate sequences while retaining the true one are key ingredients of database searches in genomics. Although SEQUEST and Mascot perform a conceptually similar task to the tool BLAST, the key algorithmic idea of BLAST (filtration) was never implemented in these tools. As a result MS/MS protein identification tools are becoming too time-consuming for many applications including search for post-translationally modified peptides. Moreover, matching millions of spectra against all known proteins will soon make these tools too slow in the same way that "genome vs genome" comparisons instantly made BLAST too slow. We describe the development of filters for MS/MS database searches that dramatically reduce the running time and effectively remove the bottlenecks in searching the huge space of protein modifications. Our approach, based on a probability model for determining the accuracy of sequence tags, achieves superior results compared to GutenTag, a popular tag generation algorithm. Our tag generating algorithm along with our de novo sequencing algorithm PepNovo can be accessed via the URL http://peptide.ucsd.edu/.

Subjects

Algorithms; Databases, Protein; Mass Spectrometry/methods; Sequence Analysis, Protein; Amino Acid Sequence; Molecular Sequence Data; Peptides; Reproducibility of Results; Statistics as Topic

12.

InsPecT: identification of posttranslationally modified peptides from tandem mass spectra.

Tanner, Stephen; Shu, Hongjun; Frank, Ari; Wang, Ling-Chi; Zandi, Ebrahim; Mumby, Marc; Pevzner, Pavel A; Bafna, Vineet.

Anal Chem ; 77(14): 4626-39, 2005 Jul 15.

Article in English | MEDLINE | ID: mdl-16013882

ABSTRACT

Reliable identification of posttranslational modifications is key to understanding various cellular regulatory processes. We describe a tool, InsPecT, to identify posttranslational modifications using tandem mass spectrometry data. InsPecT constructs database filters that proved to be very successful in genomics searches. Given an MS/MS spectrum S and a database D, a database filter selects a small fraction of database D that is guaranteed (with high probability) to contain a peptide that produced S. InsPecT uses peptide sequence tags as efficient filters that reduce the size of the database by a few orders of magnitude while retaining the correct peptide with very high probability. In addition to filtering, InsPecT also uses novel algorithms for scoring and validating in the presence of modifications, without explicit enumeration of all variants. InsPecT identifies modified peptides with better or equivalent accuracy than other database search tools while being 2 orders of magnitude faster than SEQUEST, and substantially faster than X!TANDEM on complex mixtures. The tool was used to identify a number of novel modifications in different data sets, including many phosphopeptides in data provided by Alliance for Cellular Signaling that were missed by other tools.

Subjects

Peptides/chemistry; Peptides/metabolism; Protein Processing, Post-Translational/physiology; Software; Tandem Mass Spectrometry/methods; Databases, Factual; I-kappa B Kinase/chemistry; Keratins/chemistry; Peptide Hydrolases/chemistry; Peptides/genetics; Sensitivity and Specificity
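
The database filter described in the abstract above (using peptide sequence tags to select a small fraction of database D likely to contain the peptide that produced spectrum S) might be sketched as follows. This is only an illustration under stated assumptions, not InsPecT's implementation: the function names and toy sequences are hypothetical, tags are treated as plain substrings, and the flanking prefix and suffix masses that real sequence tags carry are ignored. Isoleucine is mapped to leucine because the two residues have the same mass and cannot be told apart by a mass-based tag.

```python
def normalize_residues(sequence):
    """Map isobaric I to L so tags match either residue."""
    return sequence.replace("I", "L")

def find_all(text, pattern):
    """Return every start index at which `pattern` occurs in `text`."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions

def filter_database(database, tags):
    """Keep only database entries that contain at least one sequence tag.

    `database` maps entry names to protein sequences; `tags` is a list of
    short amino-acid strings. Returns {name: [(tag, position), ...]}.
    """
    norm_tags = [normalize_residues(tag) for tag in tags]
    hits = {}
    for name, protein in database.items():
        norm_protein = normalize_residues(protein)
        matches = [(tag, pos)
                   for tag in norm_tags
                   for pos in find_all(norm_protein, tag)]
        if matches:
            hits[name] = matches
    return hits

# Hypothetical toy database and tags, for illustration only.
database = {
    "P1": "MKTAYIAKQR",
    "P2": "GGSLVPRGSH",
    "P3": "MSDNELLE",
}
candidates = filter_database(database, ["AKQ", "VPR"])
```

Only the surviving candidates (here P1 and P2) would then be passed to the more expensive scoring step, which is where the speed-up of a filtration approach comes from.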

13.

PepNovo: de novo peptide sequencing via probabilistic network modeling.

Frank, Ari; Pevzner, Pavel.

Anal Chem ; 77(4): 964-73, 2005 Feb 15.

Article in English | MEDLINE | ID: mdl-15858974

ABSTRACT

We present a novel scoring method for de novo interpretation of peptides from tandem mass spectrometry data. Our scoring method uses a probabilistic network whose structure reflects the chemical and physical rules that govern the peptide fragmentation. We use a likelihood ratio hypothesis test to determine whether the peaks observed in the mass spectrum are more likely to have been produced under our fragmentation model than under a model that treats peaks as random events. We tested our de novo algorithm PepNovo on ion trap data and achieved results that are superior to popular de novo peptide sequencing algorithms. PepNovo can be accessed via the URL http://www-cse.ucsd.edu/groups/bioinformatics/software.html.

Subjects

Likelihood Functions; Sequence Analysis, Protein/methods; Algorithms; Probability; Tandem Mass Spectrometry
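
The likelihood ratio test mentioned in the abstract above can be written in a generic form. The notation is an assumption made here for illustration and is not taken from the paper: $S$ denotes the observed peaks, $H_{\text{frag}}$ the hypothesis that they arose under the fragmentation model, and $H_{\text{rand}}$ the hypothesis that peaks are random events.

```latex
\[
  \mathrm{score}(S) \;=\; \log \frac{P\left(S \mid H_{\text{frag}}\right)}{P\left(S \mid H_{\text{rand}}\right)}
\]
```

A positive score indicates that the peaks are better explained by the fragmentation model than by chance; a negative score indicates the opposite.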
