1.
J Proteome Res ; 23(6): 1907-1914, 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38687997

ABSTRACT

Traditional database search methods for the analysis of bottom-up proteomics tandem mass spectrometry (MS/MS) data are limited in their ability to detect peptides with post-translational modifications (PTMs). Recently, "open modification" database search strategies, in which the requirement that the mass of the database peptide closely match the observed precursor mass is relaxed, have become popular as a way to find a wider variety of PTMs. Indeed, in one study, Kong et al. reported that the open modification search tool MSFragger can achieve higher statistical power to detect peptides than a traditional "narrow window" database search. We investigated this claim empirically and, in the process, uncovered a potential general problem with false discovery rate (FDR) control in the machine learning postprocessors Percolator and PeptideProphet. This problem might have contributed to Kong et al.'s report that their empirical results suggest that FDR control in the narrow window setting might generally be compromised. Indeed, reanalyzing the same data while using a more standard form of target-decoy competition-based FDR control, we found that, after accounting for chimeric spectra as well as for the inherent difference in the number of candidates in open and narrow searches, the data do not provide sufficient evidence that FDR control in proteomics MS/MS database search is inherently problematic.
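The target-decoy competition (TDC) approach to FDR control mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes each spectrum has already been reduced to its single best-scoring match (target or decoy) and uses the commonly used estimate (decoys + 1) / targets. The function name `tdc_accepted` and the input layout are hypothetical, and this is not necessarily the exact procedure used in the paper.

```python
from typing import List, Tuple


def tdc_accepted(psms: List[Tuple[float, bool]], alpha: float) -> List[float]:
    """Return the target scores accepted at FDR level `alpha`.

    psms: (score, is_decoy) pairs, one per spectrum, where each pair is the
    winner of the target-vs-decoy competition for that spectrum.
    Assumption: higher scores are better and ties are broken arbitrarily.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    decoys = 0
    targets = 0
    cutoff = 0  # number of top-ranked matches to keep
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among the top-i matches, with the +1 correction.
        if (decoys + 1) / max(targets, 1) <= alpha:
            cutoff = i
    # Report only the target matches above the largest passing cutoff.
    return [score for score, is_decoy in ranked[:cutoff] if not is_decoy]
```

With ten high-scoring targets followed by one decoy, the estimate at the tenth match is 1/10, so all ten targets are accepted at a 10% threshold and none at 5%.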


Subjects
Databases, Protein; Protein Processing, Post-Translational; Proteomics; Tandem Mass Spectrometry; Tandem Mass Spectrometry/methods; Proteomics/methods; Peptides/analysis; Peptides/chemistry; Machine Learning; Humans; Algorithms; Software
2.
J Proteome Res ; 22(2): 647-655, 2023 Feb 03.
Article in English | MEDLINE | ID: mdl-36629399

ABSTRACT

Fragmentation ion spectral analysis of chemically cross-linked proteins is an established technology in the proteomics research repertoire for determining protein interactions, spatial orientation, and structure. Here we present Kojak version 2.0, a major update to the original Kojak algorithm, which was developed to identify cross-linked peptides from fragment ion spectra using a database search approach. A substantially improved algorithm with updated scoring metrics, support for cleavable cross-linkers, and identification of cross-links between 15N-labeled homomultimers are among the newest features of Kojak 2.0 presented here. Kojak 2.0 is now integrated into the Trans-Proteomic Pipeline, enabling access to dozens of additional tools within that suite. In particular, the PeptideProphet and iProphet tools for validation of cross-links improve the sensitivity and accuracy of correct cross-link identifications at user-defined thresholds. These new features improve the versatility of the algorithm, enabling its use in a wider range of experimental designs and analysis pipelines. Kojak 2.0 remains open-source and multiplatform.


Subjects
Proteomics; Tandem Mass Spectrometry; Proteomics/methods; Tandem Mass Spectrometry/methods; Peptides/analysis; Proteins/chemistry; Software; Cross-Linking Reagents/chemistry
3.
J Proteome Res ; 22(2): 615-624, 2023 Feb 03.
Article in English | MEDLINE | ID: mdl-36648445

ABSTRACT

The Trans-Proteomic Pipeline (TPP) mass spectrometry data analysis suite has been in continual development and refinement since its first tools, PeptideProphet and ProteinProphet, were published 20 years ago. The current release provides a large complement of tools for spectrum processing, spectrum searching, search validation, abundance computation, protein inference, and more. Many of the tools include machine-learning modeling to extract the most information from data sets and build robust statistical models to compute the probabilities that derived information is correct. Here we present the latest information on the many TPP tools, and how TPP can be deployed on various platforms from personal Windows laptops to Linux clusters and expansive cloud computing environments. We describe tutorials on how to use TPP in a variety of ways and describe synergistic projects that leverage TPP. We conclude with plans for continued development of TPP.


Subjects
Proteomics; Software; Proteomics/methods; Mass Spectrometry; Probability; Data Analysis
4.
J Proteome Res ; 17(2): 846-857, 2018 Feb 02.
Article in English | MEDLINE | ID: mdl-29281288

ABSTRACT

Spectral library searching (SLS) is an attractive alternative to sequence database searching (SDS) for peptide identification due to its speed, sensitivity, and ability to include any selected mass spectra. While decoy methods for SLS have been developed for low mass accuracy peptide spectral libraries, it is not clear that they are optimal or directly applicable to high mass accuracy spectra. Therefore, we report the development and validation of methods for high mass accuracy decoy libraries. Two types of decoy libraries were found to be suitable for this purpose. The first, referred to as Reverse, constructs spectra by reversing a library's peptide sequences except for the C-terminal residue. The second, termed Random, randomly replaces all non-C-terminal residues and either retains the original C-terminal residue or replaces it based on the amino-acid frequency of the library's C-terminus. In both cases the m/z values of fragment ions are shifted accordingly. Determination of FDR is performed in a manner equivalent to SDS, concatenating a library with its decoy prior to a search. The utility of Reverse and Random libraries for target-decoy SLS in estimating false positives and FDRs was demonstrated using spectra derived from a recently published synthetic human proteome project (Zolg, D. P.; et al. Nat. Methods 2017, 14, 259-262). For data sets from two large-scale label-free and iTRAQ experiments, these decoy building methods yielded highly similar score thresholds and spectral identifications at 1% FDR. The results were also found to be equivalent to those obtained with the decoy-free PeptideProphet algorithm. Using these new methods for FDR estimation, MSPepSearch, which is freely available search software, led to 18% more identifications at 1% FDR and 23% more at 0.1% FDR when compared with other widely used SDS engines coupled to postprocessing approaches such as Percolator. We also apply these FDR estimation methods to the recently reported "hybrid" library search method (Burke, M. C.; et al. J. Proteome Res. 2017, 16, 1924-1935). The application of decoy methods for high mass accuracy SLS permits the merging of these results with those of SDS, thereby increasing the number of peptides assigned and leading to deeper proteome coverage.
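The Reverse and Random decoy constructions described in this abstract can be sketched at the sequence level in Python. This is only an illustration under stated assumptions: the function names are hypothetical, non-C-terminal residues are drawn uniformly from the 20 standard amino acids (the abstract does not specify the distribution), peptides are assumed non-empty, and the corresponding shift of fragment ion m/z values is not shown.

```python
import random
from typing import List, Optional

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def reverse_decoy(peptide: str) -> str:
    """Reverse every residue except the C-terminal one."""
    return peptide[:-1][::-1] + peptide[-1]


def random_decoy(
    peptide: str,
    rng: random.Random,
    c_term_pool: Optional[List[str]] = None,
) -> str:
    """Randomly replace all non-C-terminal residues.

    If `c_term_pool` is given (for example, the C-terminal residue of every
    library peptide, with repeats), the C-terminus is resampled from it,
    which follows the library's C-terminal frequency. Otherwise the
    original C-terminal residue is kept.
    """
    body = "".join(rng.choice(AMINO_ACIDS) for _ in peptide[:-1])
    c_term = rng.choice(c_term_pool) if c_term_pool else peptide[-1]
    return body + c_term
```

For example, `reverse_decoy("PEPTIDEK")` gives `"EDITPEPK"`: the C-terminal K stays in place so the decoy retains a tryptic-like C-terminus.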


Subjects
Algorithms; Amino Acids/chemistry; Image Processing, Computer-Assisted/statistics & numerical data; Libraries, Special/methods; Peptide Library; Peptides/analysis; Amino Acid Sequence; Humans; Peptides/chemistry; Software; Tandem Mass Spectrometry
5.
J Proteome Res ; 14(11): 4662-73, 2015 Nov 06.
Article in English | MEDLINE | ID: mdl-26390080

ABSTRACT

The two key steps for analyzing proteomic data generated by high-resolution MS are database searching and postprocessing. While the two steps are interrelated, studies on their combinatory effects and the optimization of these procedures have not been adequately conducted. Here, we investigated the performance of three popular search engines (SEQUEST, Mascot, and MS Amanda) in conjunction with five filtering approaches, including respective score-based filtering, a group-based approach, local false discovery rate (LFDR), PeptideProphet, and Percolator. A total of eight data sets from various proteomes (e.g., E. coli, yeast, and human) produced by various instruments with high-accuracy survey scan (MS1) and high- or low-accuracy fragment ion scan (MS2) (LTQ-Orbitrap, Orbitrap-Velos, Orbitrap-Elite, Q-Exactive, Orbitrap-Fusion, and Q-TOF) were analyzed. We found that combinations involving Percolator achieved markedly more peptide and protein identifications at the same FDR level than the other 12 combinations for all data sets. Among these, combinations of SEQUEST-Percolator and MS Amanda-Percolator provided slightly better performance for data sets with low-accuracy MS2 (ion trap or IT) and high-accuracy MS2 (Orbitrap or TOF), respectively, than did other methods. For approaches without Percolator, SEQUEST-Group performed the best for data sets with MS2 produced by collision-induced dissociation (CID) and IT analysis; Mascot-LFDR gave more identifications for data sets generated by higher-energy collisional dissociation (HCD) and analyzed in Orbitrap (HCD-OT) and in Orbitrap Fusion (HCD-IT); MS Amanda-Group excelled for the Q-TOF data set and the Orbitrap Velos HCD-OT data set. Therefore, if Percolator is not used, a specific combination should be applied for each type of data set. Moreover, a higher percentage of multiple-peptide proteins and lower variation of protein spectral counts were observed when analyzing technical replicates using Percolator-associated combinations; therefore, Percolator enhanced the reliability of both identification and quantification. The analyses were performed using the specific programs embedded in Proteome Discoverer, Scaffold, and an in-house algorithm (BuildSummary). These results provide valuable guidelines for the optimal interpretation of proteomic results and the development of fit-for-purpose protocols under different situations.


Subjects
Algorithms; Peptides/analysis; Proteome/analysis; Proteomics/methods; Search Engine/methods; Software; Cell Line, Tumor; Databases, Protein; Escherichia coli/genetics; Escherichia coli/metabolism; Humans; Proteome/genetics; Proteome/metabolism; Proteomics/instrumentation; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Tandem Mass Spectrometry
6.
J Proteomics ; 91: 375-84, 2013 Oct 08.
Article in English | MEDLINE | ID: mdl-23933159

ABSTRACT

Mass measurement and precursor mass assignment are independent processes in proteomic data acquisition. Due to misassignment to the 13C isotope peak, or for other reasons, extensive precursor mass shifts (i.e., deviations of the measured from the calculated precursor neutral masses) in LC-MS/MS data obtained with high-accuracy LTQ-Orbitrap mass spectrometers have been reported in previous studies. Although computational methods for post-acquisition reassignment to the monoisotopic mass have been developed to curate the MS/MS spectra prior to database search, a simpler method for estimating the fraction of spectra with a precursor mass shift, so as to determine whether the data require curation, remains desirable. Here, we provide evidence that an easy approach, which applies a large precursor tolerance (2.1 Da or higher) in a SEQUEST search against a forward and decoy protein sequence database and then filters the data by PeptideProphet peptide identification probability (p ≥ 0.9), can detect most of the MS/MS spectra containing inaccurate precursor masses. Furthermore, by applying artificial mass shifts to 4000 randomly selected MS/MS spectra, which originally had accurate precursor masses assigned by the mass spectrometers, we demonstrated that the accuracy of the precursor mass has almost negligible influence on the efficacy and fidelity of peptide identification. BIOLOGICAL SIGNIFICANCE: Integral precursor mass shift is a known problem, and proteomic data should therefore be handled and analyzed properly to avoid losing important protein identification and/or quantification information. A quick and easy approach for estimating the number of MS/MS spectra with inaccurate precursor mass assignments would be helpful for evaluating the performance of the instrument and for determining whether the data require curation prior to database search or should be searched with specific search parameter(s). Here we demonstrated that most of the MS/MS spectra with inaccurate mass assignments (integral or non-integral changes) could be easily identified by database search with large precursor tolerance windows.
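The idea of estimating the fraction of spectra with a precursor mass shift can be illustrated with a small Python sketch. It assumes that the wide-tolerance search and probability filtering have already been done and that each accepted identification supplies a measured and a calculated neutral mass. The function names, the 10 ppm tolerance, and the limit of two isotope offsets are illustrative assumptions, not values taken from the paper.

```python
from typing import Iterable, Tuple

C13_C12_DELTA = 1.0033548  # Da, mass difference between 13C and 12C


def classify_shift(
    measured: float,
    calculated: float,
    ppm_tol: float = 10.0,   # assumed tolerance for an "accurate" mass
    max_isotopes: int = 2,   # assumed maximum isotope-peak offset checked
) -> str:
    """Label a precursor as 'accurate', 'integral', or 'other'.

    'integral' means the deviation matches a whole number of 13C-12C
    spacings, as expected when an isotope peak was picked instead of the
    monoisotopic peak.
    """
    delta = measured - calculated
    tol = calculated * ppm_tol * 1e-6
    if abs(delta) <= tol:
        return "accurate"
    for k in range(1, max_isotopes + 1):
        if abs(abs(delta) - k * C13_C12_DELTA) <= tol:
            return "integral"
    return "other"


def shifted_fraction(pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of (measured, calculated) pairs that are not 'accurate'."""
    items = list(pairs)
    if not items:
        return 0.0
    shifted = sum(classify_shift(m, c) != "accurate" for m, c in items)
    return shifted / len(items)
```

A data set whose `shifted_fraction` is small may not need monoisotopic reassignment before a narrow-tolerance search; where to draw that line is a judgment the sketch does not make.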


Subjects
Databases, Protein; Halobacterium salinarum/chemistry; Proteomics; Tandem Mass Spectrometry; Bacterial Proteins/chemistry; Carbon Isotopes/chemistry; Cell Line, Tumor; Expressed Sequence Tags; Humans; Peptides/chemistry; Probability; Proteome; Reproducibility of Results; Software