Results 1 - 20 of 141
1.
Mol Cell Proteomics ; 23(7): 100798, 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38871251

ABSTRACT

Rescoring of peptide spectrum matches originating from database search engines enabled by peptide property predictors is exceeding the performance of peptide identification from traditional database search engines. In contrast to the peptide spectrum match scores calculated by traditional database search engines, rescoring peptide spectrum matches generates scores based on comparing observed and predicted peptide properties, such as fragment ion intensities and retention times. These newly generated scores enable a more efficient discrimination between correct and incorrect peptide spectrum matches. This approach was shown to lead to substantial improvements in the number of confidently identified peptides, facilitating the analysis of challenging datasets in various fields such as immunopeptidomics, metaproteomics, proteogenomics, and single-cell proteomics. In this review, we summarize the key elements leading up to the recent introduction of multiple data-driven rescoring pipelines. We provide an overview of relevant post-processing rescoring tools, introduce prominent data-driven rescoring pipelines for various applications, and highlight limitations, opportunities, and future perspectives of this approach and its impact on mass spectrometry-based proteomics.
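
As a rough illustration of the comparison described in this abstract, the following Python sketch derives two rescoring features for a single PSM: a spectral angle between observed and predicted fragment ion intensities, and the absolute retention time error. The function names, the dictionary keys, and the choice of spectral angle as the similarity measure are assumptions made for illustration; they are not taken from any specific pipeline covered by the review.

```python
import math


def spectral_angle(observed, predicted):
    """Normalized spectral angle between two intensity vectors.

    Returns 1.0 for identical directions and 0.0 for orthogonal vectors.
    Both vectors are assumed to be aligned on the same fragment ions.
    """
    dot = sum(o * p for o, p in zip(observed, predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    if norm_o == 0 or norm_p == 0:
        return 0.0
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cosine = max(-1.0, min(1.0, dot / (norm_o * norm_p)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


def rescoring_features(psm):
    """Build a feature dictionary for one PSM (hypothetical key names)."""
    return {
        "search_engine_score": psm["score"],
        "spectral_angle": spectral_angle(
            psm["observed_intensities"], psm["predicted_intensities"]
        ),
        "abs_rt_error": abs(psm["observed_rt"] - psm["predicted_rt"]),
    }


example_psm = {
    "score": 42.0,
    "observed_intensities": [0.9, 0.1, 0.4],
    "predicted_intensities": [1.0, 0.2, 0.3],
    "observed_rt": 30.5,
    "predicted_rt": 28.0,
}
print(rescoring_features(example_psm))
```

In a rescoring workflow, features of this kind would typically be handed, together with the original search engine score, to a post-processor that learns to separate correct from incorrect matches.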

2.
Mol Cell Proteomics ; 22(4): 100518, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36828128

ABSTRACT

Single-cell proteomics is growing rapidly and has made several technological advancements. As most research has been focused on improving instrumentation and sample preparation methods, very little attention has been given to algorithms responsible for identifying and quantifying proteins. Given the inherent difference between bulk data and single-cell data, it is necessary to realize that current algorithms being employed on single-cell data were designed for bulk data and have underlying assumptions that may not hold true for single-cell data. In order to develop and optimize algorithms for single-cell data, we need to characterize the differences between single-cell data and bulk data and assess how current algorithms perform on single-cell data. Here, we present a review of algorithms responsible for identifying and quantifying peptides and proteins. We will give a review of how each type of algorithm works, assumptions it relies on, how it performs on single-cell data, and possible optimizations and solutions that could be used to address the differences in single-cell data.


Subjects
Proteins; Proteomics; Proteomics/methods; Peptides/chemistry; Algorithms
3.
Proteomics ; 24(6): e2300236, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37706597

ABSTRACT

Clinical biomarker discovery is often based on the analysis of human plasma samples. However, the high dynamic range and complexity of plasma pose significant challenges to mass spectrometry-based proteomics. Current methods for improving protein identifications require laborious pre-analytical sample preparation. In this study, we developed and evaluated a TMTpro-specific spectral library for improved protein identification in human plasma proteomics. The library was constructed by LC-MS/MS analysis of highly fractionated TMTpro-tagged human plasma, human cell lysates, and relevant arterial tissues. The library was curated using several quality filters to ensure reliable peptide identifications. Our results show that spectral library searching using the TMTpro spectral library improves the identification of proteins in plasma samples compared to conventional sequence database searching. Protein identifications made by the spectral library search engine demonstrated a high degree of complementarity with the sequence database search engine, indicating the feasibility of increasing the number of protein identifications without additional pre-analytical sample preparation. The TMTpro-specific spectral library provides a resource for future plasma proteomics research and optimization of search algorithms for greater accuracy and speed in protein identifications in human plasma proteomics, and is made publicly available to the research community via ProteomeXchange with identifier PXD042546.


Subjects
Proteomics; Software; Humans; Proteomics/methods; Chromatography, Liquid/methods; Tandem Mass Spectrometry/methods; Peptides/analysis; Proteins; Algorithms; Databases, Protein; Peptide Library
4.
J Proteome Res ; 2024 Mar 16.
Article in English | MEDLINE | ID: mdl-38491990

ABSTRACT

Rescoring of peptide-spectrum matches (PSMs) has emerged as a standard procedure for the analysis of tandem mass spectrometry data. This emphasizes the need for software maintenance and continuous improvement for such algorithms. We introduce MS2Rescore 3.0, a versatile, modular, and user-friendly platform designed to increase peptide identifications. Researchers can install MS2Rescore across various platforms with minimal effort and benefit from a graphical user interface, a modular Python API, and extensive documentation. To showcase this new version, we connected MS2Rescore 3.0 with MS Amanda 3.0, a new release of the well-established search engine, addressing previous limitations on automatic rescoring. Among new features, MS Amanda now contains additional output columns that can be used for rescoring. The full potential of rescoring is best revealed when applied to challenging data sets. We therefore evaluated the performance of these two tools on publicly available single-cell data sets, where the number of PSMs was substantially increased, thereby demonstrating that MS2Rescore offers a powerful solution to boost peptide identifications. MS2Rescore's modular design and user-friendly interface make data-driven rescoring easily accessible, even for inexperienced users. We therefore expect MS2Rescore to be a valuable tool for the wider proteomics community. MS2Rescore is available at https://github.com/compomics/ms2rescore.

5.
J Proteome Res ; 23(5): 1757-1767, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38644788

ABSTRACT

The American lobster, Homarus americanus, is not only of considerable economic importance but has also emerged as a premier model organism in neuroscience research. Neuropeptides, an important class of cell-to-cell signaling molecules, play crucial roles in a wide array of physiological and psychological processes. Leveraging the recently sequenced high-quality draft genome of the American lobster, our study sought to profile the neuropeptidome of this model organism. Employing advanced mass spectrometry techniques, we identified 24 neuropeptide precursors and 101 unique mature neuropeptides in Homarus americanus. Intriguingly, 67 of these neuropeptides were discovered for the first time. Our findings provide a comprehensive overview of the peptidomic attributes of the lobster's nervous system and highlight the tissue-specific distribution of these neuropeptides. Collectively, this research not only enriches our understanding of the neuronal complexities of the American lobster but also lays a foundation for future investigations into the functional roles that these peptides play in crustacean species. The mass spectrometry data have been deposited in the PRIDE repository with the identifier PXD047230.


Subjects
Amino Acid Sequence; Nephropidae; Neuropeptides; Proteomics; Animals; Nephropidae/metabolism; Neuropeptides/metabolism; Neuropeptides/genetics; Neuropeptides/analysis; Proteomics/methods; Mass Spectrometry; Molecular Sequence Data
6.
J Proteome Res ; 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38832920

ABSTRACT

The advancement of sophisticated instrumentation in mass spectrometry has catalyzed an in-depth exploration of complex proteomes. This exploration necessitates a nuanced balance in experimental design, particularly between quantitative precision and the enumeration of analytes detected. In bottom-up proteomics, a key challenge is that oversampling of abundant proteins can adversely affect the identification of a diverse array of unique proteins. This issue is especially pronounced in samples with limited analytes, such as small tissue biopsies or single-cell samples. Methods such as depletion and fractionation are suboptimal to reduce oversampling in single cell samples, and other improvements on LC and mass spectrometry technologies and methods have been developed to address the trade-off between precision and enumeration. We demonstrate that by using a monosubstrate protease for proteomic analysis of single-cell equivalent digest samples, an improvement in quantitative accuracy can be achieved, while maintaining high proteome coverage established by trypsin. This improvement is particularly vital for the field of single-cell proteomics, where single-cell samples with limited number of protein copies, especially in the context of low-abundance proteins, can benefit from considering analyte complexity. Considerations about analyte complexity, alongside chromatographic complexity, integration with data acquisition methods, and other factors such as those involving enzyme kinetics, will be crucial in the design of future single-cell workflows.

7.
J Proteome Res ; 23(2): 834-843, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38252705

ABSTRACT

In shotgun proteomics, the proteome search engine analyzes mass spectra obtained by experiments, and then a peptide-spectrum match (PSM) is reported for each spectrum. However, most of the PSMs identified are incorrect, and therefore various postprocessing tools have been developed for reranking the peptide identifications. Yet these methods suffer from issues such as dependency on distribution, reliance on shallow models, and limited effectiveness. In this work, we propose AttnPep, a deep learning model for rescoring PSM scores that utilizes the Self-Attention module. This module helps the neural network focus on features relevant to the classification of PSMs and ignore irrelevant features. This allows AttnPep to analyze the output of different search engines and improve PSM discrimination accuracy. We considered a PSM to be correct if it achieves a q-value <0.01 and compared AttnPep with the existing mainstream software PeptideProphet, Percolator, and proteoTorch. The results indicated that AttnPep found an average increase in correct PSMs of 9.29% relative to the other methods. Additionally, AttnPep was able to better distinguish between correct and incorrect PSMs and found more synthetic peptides in the complex SWATH data set.


Subjects
Algorithms; Deep Learning; Proteomics/methods; Tandem Mass Spectrometry/methods; Peptides; Software; Databases, Protein
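
The q-value threshold used in the abstract above is commonly derived from a target-decoy search. The sketch below is a generic illustration of that estimate and is not AttnPep's own code; the function name, the tuple layout, and the simple decoys/targets FDR estimator are assumptions.

```python
def q_values(psms):
    """Compute q-values for PSMs given as (score, is_decoy) tuples.

    Higher scores are assumed to be better. Returns a list of
    (score, is_decoy, q_value) tuples sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # FDR estimate at each score threshold: decoys / targets seen so far.
    fdrs = []
    targets = decoys = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the lowest FDR at this threshold or any more permissive one.
    qvals = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvals.append(running_min)
    qvals.reverse()

    return [(s, d, q) for (s, d), q in zip(ranked, qvals)]


example = [(10, False), (9, False), (8, True), (7, False), (6, False)]
accepted = [p for p in q_values(example) if not p[1] and p[2] < 0.01]
print(len(accepted), "target PSMs accepted at q < 0.01")
```

Taking the running minimum makes the q-value monotone in the score, so a single cutoff such as 0.01 yields a well-defined set of accepted PSMs.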
8.
J Proteome Res ; 23(2): 550-559, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38153036

ABSTRACT

In bottom-up proteomics, peptide-spectrum matching is critical for peptide and protein identification. Recently, deep learning models have been used to predict tandem mass spectra of peptides, enabling the calculation of similarity scores between the predicted and experimental spectra for peptide-spectrum matching. These models follow the supervised learning paradigm, which trains a general model using paired peptides and spectra from standard data sets and directly employs the model on experimental data. However, this approach can lead to inaccurate predictions due to differences between the training data and the experimental data, such as sample types, enzyme specificity, and instrument calibration. To tackle this problem, we developed a test-time training paradigm that adapts the pretrained model to generate experimental data-specific models, namely, PepT3. PepT3 yields a 10-40% increase in peptide identification depending on the variability in training and experimental data. Intriguingly, when applied to a patient-derived immunopeptidomic sample, PepT3 increases the identification of tumor-specific immunopeptide candidates by 60%. Two-thirds of the newly identified candidates are predicted to bind to the patient's human leukocyte antigen isoforms. To facilitate access of the model and all the results, we have archived all the intermediate files in Zenodo.org with identifier 8231084.


Subjects
Peptides; Tandem Mass Spectrometry; Humans; Tandem Mass Spectrometry/methods; Proteins; Models, Theoretical; Proteomics/methods; Algorithms
9.
J Proteome Res ; 23(2): 574-584, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38157563

ABSTRACT

Accurate and comprehensive peptide precursor ions are crucial to tandem mass-spectrometry-based peptide identification. An identification engine can derive great advantages from the search space reduction enabled by credible and detailed precursors. Furthermore, by considering multiple precursors per spectrum, both the number of identifications and the spectrum explainability can be substantially improved. Here, we introduce PepPre, which detects precursors by decomposing peaks into multiple isotope clusters using linear programming methods. The detected precursors are scored and ranked, and the high-scoring ones are used for subsequent peptide identification. PepPre is evaluated both on regular and cross-linked peptide data sets and compared with 11 methods. The experimental results show that PepPre achieves a remarkable increase of 203% in PSM and 68% in peptide identifications compared to instrument software for regular peptides and 99% in PSM and 27% in peptide pair identifications for cross-linked peptides, surpassing the performance of all other evaluated methods. In addition to the increased identification numbers, further credibility evaluations evidence the reliability of the identified results. Moreover, by widening the isolation window of data acquisition from 2 to 8 Th, with PepPre, an engine is able to identify at least 64% more PSMs, thereby demonstrating the potential advantages of wide-window data acquisition. PepPre is open-source and available at http://peppre.ctarn.io.


Subjects
Peptides; Proteomics; Reproducibility of Results; Proteomics/methods; Software; Tandem Mass Spectrometry/methods; Databases, Protein; Algorithms
10.
BMC Genomics ; 25(1): 619, 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38898442

ABSTRACT

Plant genomics plays a pivotal role in enhancing global food security and sustainability by offering innovative solutions for improving crop yield, disease resistance, and stress tolerance. As the number of sequenced genomes grows and the accuracy and contiguity of genome assemblies improve, structural annotation of plant genomes continues to be a significant challenge due to their large size, polyploidy, and rich repeat content. In this paper, we present an overview of the current landscape in crop genomics research, highlighting the diversity of genomic characteristics across various crop species. We also assessed the accuracy of popular gene prediction tools in identifying genes within crop genomes and examined the factors that impact their performance. Our findings highlight the strengths and limitations of BRAKER2 and Helixer as leading structural genome annotation tools and underscore the impact of genome complexity, fragmentation, and repeat content on their performance. Furthermore, we evaluated the suitability of the predicted proteins as a reliable search space in proteomics studies using mass spectrometry data. Our results provide valuable insights for future efforts to refine and advance the field of structural genome annotation.


Subjects
Crops, Agricultural; Genome, Plant; Molecular Sequence Annotation; Proteomics; Crops, Agricultural/genetics; Proteomics/methods; Genomics/methods; Plant Proteins/genetics; Plant Proteins/metabolism
11.
Mol Cell Proteomics ; 21(8): 100266, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35803561

ABSTRACT

Immunopeptidomics aims to identify major histocompatibility complex (MHC)-presented peptides on almost all cells that can be used in anti-cancer vaccine development. However, existing immunopeptidomics data analysis pipelines suffer from the nontryptic nature of immunopeptides, complicating their identification. Previously, peak intensity predictions by MS2PIP and retention time predictions by DeepLC have been shown to improve tryptic peptide identifications when rescoring peptide-spectrum matches with Percolator. However, as MS2PIP was tailored toward tryptic peptides, we have here retrained MS2PIP to include nontryptic peptides. Interestingly, the new models not only greatly improve predictions for immunopeptides but also yield further improvements for tryptic peptides. We show that the integration of new MS2PIP models, DeepLC, and Percolator in one software package, MS2Rescore, increases the spectrum identification rate and the number of unique identified peptides by 46% and 36%, respectively, compared to standard Percolator rescoring at 1% FDR. Moreover, MS2Rescore also outperforms the current state-of-the-art in immunopeptide-specific identification approaches. Altogether, MS2Rescore thus allows substantially improved identification of novel epitopes from existing immunopeptidomics workflows.


Subjects
Proteomics; Tandem Mass Spectrometry; Algorithms; Peptides; Proteins
12.
BMC Bioinformatics ; 24(1): 421, 2023 Nov 08.
Article in English | MEDLINE | ID: mdl-37940845

ABSTRACT

BACKGROUND: In proteomics, the interpretation of mass spectra representing peptides carrying multiple complex modifications remains challenging, as it is difficult to strike a balance between reasonable execution time, a limited number of false positives, and a huge search space allowing any number of modifications without a priori knowledge. The scientific community needs new developments in this area to aid in the discovery of novel post-translational modifications that may play important roles in disease. RESULTS: To make progress on this issue, we implemented SpecGlobX (SpecGlob eXTended to eXperimental spectra), a standalone Java application that quickly determines the best spectral alignments of a (possibly very large) list of Peptide-to-Spectrum Matches (PSMs) provided by any open modification search method, or generated by the user. As input, SpecGlobX reads a file containing spectra in MGF or mzML format and a semicolon-delimited spreadsheet describing the PSMs. SpecGlobX returns the best alignment for each PSM as output, splitting the mass difference between the spectrum and the peptide into one or more shifts while considering the possibility of non-aligned masses (a phenomenon resulting from many situations including neutral losses). SpecGlobX is fast, able to align one million PSMs in about 1.5 min on a standard desktop. First, we recall the foundations of the algorithm and detail how we adapted SpecGlob (the method we previously developed with the same aim, but limited to the interpretation of perfect simulated spectra) to the interpretation of imperfect experimental spectra. Then, we highlight the value of SpecGlobX as a complementary tool downstream of three open modification search methods on a large simulated spectra dataset. Finally, we ran SpecGlobX on a proteome-wide dataset downloaded from PRIDE to demonstrate that SpecGlobX functions just as well on simulated and experimental spectra. We then carefully analyzed a limited set of interpretations. CONCLUSIONS: SpecGlobX is helpful as a decision support tool, providing keys to interpret peptides carrying complex modifications still poorly considered by current open modification search software. Better alignment of PSMs enhances confidence in the identification of spectra provided by open modification search methods and should improve the interpretation rate of spectra.


Subjects
Peptides; Proteomics; Proteomics/methods; Databases, Protein; Mass Spectrometry/methods; Software; Algorithms
13.
J Proteome Res ; 22(12): 3692-3702, 2023 Dec 01.
Article in English | MEDLINE | ID: mdl-37910637

ABSTRACT

Spectral libraries are useful resources in proteomic data analysis. Recent advances in deep learning allow tandem mass spectra of peptides to be predicted from their amino acid sequences. This enables predicted spectral libraries to be compiled, and searching against such libraries has been shown to improve the sensitivity in peptide identification over conventional sequence database searching. However, current prediction models lack support for longer peptides, and thus far, predicted library searching has only been demonstrated for backbone ion-only spectrum prediction methods. Here, we propose a deep learning-based full-spectrum prediction method to generate predicted spectral libraries for peptide identification. We demonstrated the superiority of using full-spectrum libraries over backbone ion-only prediction approaches in spectral library searching. Furthermore, merging spectra from different prediction models, as a form of ensemble learning, can produce improved spectral libraries, in terms of identification sensitivity. We also show that a hybrid library combining predicted and experimental spectra can lead to 20% more confident identifications over experimental library searching or sequence database searching.


Subjects
Deep Learning; Peptide Library; Proteomics/methods; Software; Databases, Protein; Peptides/chemistry
14.
J Proteome Res ; 22(4): 1159-1171, 2023 Apr 07.
Article in English | MEDLINE | ID: mdl-36962508

ABSTRACT

One of the chief objectives in mass spectrometry-based peptide identification in proteomics is the statistical validation of top-scoring peptide-spectrum matches (PSMs) in the form of false discovery rate (FDR) estimation. Existing methods construct a null model that captures the characteristics of incorrect target PSMs to estimate the FDR, most often with the help of decoys. Decoy-based methods, however, increase the computational cost and rely on the difficult-to-verify assumption that decoy PSMs constitute a sufficient and representative sample of the population of possible incorrect target PSMs. On the other hand, the possibility of FDR estimation assisted by the plentiful non-top-scoring PSMs, which are almost always incorrect, has been scarcely explored. In this work, we propose a novel decoy-free procedure for developing null models for top-scoring PSMs using the transformed e-value (TEV) score and the distributions of non-top-scoring target PSMs. The method relies on a theoretically derivable relationship between the parameters of the distributions of lower-order statistics of the TEV score and a necessary empirical optimization to fit a single parameter to actual data. The framework was tested on multiple different data sets and two search engines. We present evidence that our method is comparable to and occasionally outperforms popular decoy-free and decoy-based methods in FDR estimation.


Subjects
Proteomics; Tandem Mass Spectrometry; Proteomics/methods; Tandem Mass Spectrometry/methods; Peptides; Search Engine; Databases, Protein; Algorithms
15.
J Proteome Res ; 22(2): 482-490, 2023 Feb 03.
Article in English | MEDLINE | ID: mdl-36695531

ABSTRACT

Spectrum library searching is a powerful alternative to database searching for data dependent acquisition experiments, but has been historically limited to identifying previously observed peptides in libraries. Here we present Scribe, a new library search engine designed to leverage deep learning fragmentation prediction software such as Prosit. Rather than relying on highly curated DDA libraries, this approach predicts fragmentation and retention times for every peptide in a FASTA database. Scribe embeds Percolator for false discovery rate correction and an interference tolerant, label-free quantification integrator for an end-to-end proteomics workflow. By leveraging expected relative fragmentation and retention time values, we find that library searching with Scribe can outperform traditional database searching tools both in terms of sensitivity and quantitative precision. Scribe and its graphical interface are easy to use, freely accessible, and fully open source.


Subjects
Peptides; Tandem Mass Spectrometry; Software; Proteomics; Search Engine; Peptide Library; Databases, Protein
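
To make "every peptide in a FASTA database" from the abstract above more concrete, the sketch below enumerates candidate peptides by in-silico tryptic digestion, producing the kind of list a fragmentation and retention time predictor would then be run on. This is not Scribe's code: the trypsin rule used here (cleave after K or R unless the next residue is P), the length limits, the missed-cleavage setting, and all function names are assumptions for illustration.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def cleavage_pieces(sequence):
    """Split a protein after K or R, except when followed by P."""
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])
    return pieces


def tryptic_peptides(sequence, missed_cleavages=1, min_len=6, max_len=30):
    """Return the set of peptides allowing up to `missed_cleavages`."""
    pieces = cleavage_pieces(sequence)
    peptides = set()
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if min_len <= len(peptide) <= max_len:
                peptides.add(peptide)
    return peptides


fasta_text = ">sp|P1|TEST\nGGGGGGK\nAAAAAARPLLLLKVVVVVV\n"
for name, protein in parse_fasta(fasta_text):
    print(name, sorted(tryptic_peptides(protein)))
```

Each peptide in the resulting set would be paired with predicted fragment intensities and a predicted retention time to form one library entry.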
16.
J Proteome Res ; 22(10): 3190-3199, 2023 Oct 06.
Article in English | MEDLINE | ID: mdl-37656829

ABSTRACT

Precision medicine focuses on adapting care to the individual profile of patients, for example, accounting for their unique genetic makeup. Being able to account for the effect of genetic variation on the proteome holds great promise toward this goal. However, identifying the protein products of genetic variation using mass spectrometry has proven very challenging. Here we show that the identification of variant peptides can be improved by the integration of retention time and fragmentation predictors into a unified proteogenomic pipeline. By combining these intrinsic peptide characteristics using the search-engine post-processor Percolator, we demonstrate improved discrimination power between correct and incorrect peptide-spectrum matches. Our results demonstrate that the drop in performance that is induced when expanding a protein sequence database can be compensated, hence enabling efficient identification of genetic variation products in proteomics data. We anticipate that this enhancement of proteogenomic pipelines can provide a more refined picture of the unique proteome of patients and thereby contribute to improving patient care.

17.
J Proteome Res ; 22(2): 557-560, 2023 Feb 03.
Article in English | MEDLINE | ID: mdl-36508242

ABSTRACT

A plethora of proteomics search engine output file formats are in circulation. This lack of standardized output files greatly complicates generic downstream processing of peptide-spectrum matches (PSMs) and PSM files. While standards exist to solve this problem, these are far from universally supported by search engines. Moreover, software libraries are available to read a selection of PSM file formats, but a package to parse PSM files into a unified data structure has been missing. Here, we present psm_utils, a Python package to read and write various PSM file formats and to handle peptidoforms, PSMs, and PSM lists in a unified and user-friendly Python-, command line-, and web-interface. psm_utils was developed with pragmatism and maintainability in mind, adhering to community standards and relying on existing packages where possible. The Python API and command line interface greatly facilitate handling various PSM file formats. Moreover, a user-friendly web application was built using psm_utils that allows anyone to interconvert PSM files and retrieve basic PSM statistics. psm_utils is freely available under the permissive Apache2 license at https://github.com/compomics/psm_utils.


Subjects
Proteomics; Software; Proteomics/methods; Peptides; Search Engine
18.
J Proteome Res ; 22(1): 101-113, 2023 Jan 06.
Article in English | MEDLINE | ID: mdl-36480279

ABSTRACT

Improving the sensitivity of protein-protein interaction detection and protein structure probing is a principal challenge in cross-linking mass spectrometry (XL-MS) data analysis. In this paper, we propose an exhaustive cross-linking search method with protein feedback (ECL-PF) for cleavable XL-MS data analysis. ECL-PF adopts an optimized α/β mass detection scheme and establishes protein-peptide association during the identification of cross-linked peptides. Existing major scoring functions can all benefit from the ECL-PF workflow to a great extent. In comparisons using synthetic data sets and hybrid simulated data sets, ECL-PF achieved 3-fold higher sensitivity over standard techniques. In experiments using real data sets, it also identified 65.6% more cross-link spectrum matches and 48.7% more unique cross-links.


Subjects
Peptides; Proteins; Feedback; Proteins/chemistry; Peptides/analysis; Mass Spectrometry/methods; Cross-Linking Reagents/chemistry
19.
J Proteome Res ; 22(6): 1639-1648, 2023 Jun 02.
Article in English | MEDLINE | ID: mdl-37166120

ABSTRACT

As current shotgun proteomics experiments can produce gigabytes of mass spectrometry data per hour, processing these massive data volumes has become progressively more challenging. Spectral clustering is an effective approach to speed up downstream data processing by merging highly similar spectra to minimize data redundancy. However, because state-of-the-art spectral clustering tools fail to achieve optimal runtimes, this simply moves the processing bottleneck. In this work, we present a fast spectral clustering tool, HyperSpec, based on hyperdimensional computing (HDC). HDC shows promising clustering capability while only requiring lightweight binary operations with high parallelism that can be optimized using low-level hardware architectures, making it possible to run HyperSpec on graphics processing units to achieve extremely efficient spectral clustering performance. Additionally, HyperSpec includes optimized data preprocessing modules to reduce the spectrum preprocessing time, which is a critical bottleneck during spectral clustering. Based on experiments using various mass spectrometry data sets, HyperSpec produces results with comparable clustering quality as state-of-the-art spectral clustering tools while achieving speedups by orders of magnitude, shortening the clustering runtime of over 21 million spectra from 4 h to only 24 min.


Subjects
Algorithms; Peptides; Peptides/analysis; Mass Spectrometry/methods; Proteomics/methods; Cluster Analysis
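
The binary operations that the abstract above attributes to hyperdimensional computing can be sketched in a few lines of Python. This is a generic HDC encoding (bind with XOR, bundle by majority vote, compare by Hamming distance) and not HyperSpec's implementation; the dimensionality, the m/z bin width, the number of intensity levels, the use of independent random level vectors, and all names are assumptions chosen for illustration.

```python
import random

D = 2048          # hypervector dimensionality (illustrative choice)
N_BINS = 2000     # number of m/z bins covered (1 m/z unit wide here)
N_LEVELS = 8      # number of quantized intensity levels


def make_item_memory(n_items, rng):
    """Create n_items random binary hypervectors stored as Python ints."""
    return [rng.getrandbits(D) for _ in range(n_items)]


def encode_spectrum(peaks, bin_ids, level_ids):
    """Encode (mz, relative_intensity) peaks into one binary hypervector."""
    counts = [0] * D
    n_peaks = 0
    for mz, intensity in peaks:
        if not 0 <= mz < N_BINS:
            continue  # ignore peaks outside the binned range
        level = min(int(intensity * N_LEVELS), N_LEVELS - 1)
        # Bind the m/z identity with the intensity level using XOR.
        bound = bin_ids[int(mz)] ^ level_ids[level]
        for bit in range(D):
            counts[bit] += (bound >> bit) & 1
        n_peaks += 1
    # Bundle all bound vectors by bitwise majority vote.
    hypervector = 0
    for bit in range(D):
        if 2 * counts[bit] > n_peaks:
            hypervector |= 1 << bit
    return hypervector


def hamming(a, b):
    """Normalized Hamming distance between two hypervectors."""
    return bin(a ^ b).count("1") / D


rng = random.Random(0)
bin_ids = make_item_memory(N_BINS, rng)
level_ids = make_item_memory(N_LEVELS, rng)

spectrum_a = [(100.1, 1.0), (200.2, 0.5), (300.3, 0.8)]
spectrum_b = [(100.1, 1.0), (200.2, 0.5), (400.4, 0.8)]  # shares two peaks
spectrum_c = [(150.5, 1.0), (250.7, 0.4), (350.2, 0.9)]  # shares none

hv_a = encode_spectrum(spectrum_a, bin_ids, level_ids)
hv_b = encode_spectrum(spectrum_b, bin_ids, level_ids)
hv_c = encode_spectrum(spectrum_c, bin_ids, level_ids)
print(hamming(hv_a, hv_b), hamming(hv_a, hv_c))
```

Because the distance computation reduces to XOR and a bit count, it maps naturally onto parallel hardware; a clustering step would then group spectra whose hypervectors fall within a chosen distance threshold.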
20.
J Proteome Res ; 22(2): 462-470, 2023 Feb 03.
Article in English | MEDLINE | ID: mdl-36688604

ABSTRACT

Spectral library search can enable more sensitive peptide identification in tandem mass spectrometry experiments. However, its drawbacks are the limited availability of high-quality libraries and the added difficulty of creating decoy spectra for result validation. We describe MS Ana, a new spectral library search engine that enables high sensitivity peptide identification using either curated or predicted spectral libraries as well as robust false discovery control through its own decoy library generation algorithm. MS Ana identifies on average 36% more spectrum matches and 4% more proteins than database search in a benchmark test on single-shot human cell-line data. Further, we demonstrate the quality of the result validation with tests on synthetic peptide pools and show the importance of library selection through a comparison of library search performance with different configurations of publicly available human spectral libraries.


Subjects
Peptide Library; Software; Humans; Peptides/analysis; Proteins/chemistry; Algorithms; Databases, Protein