Results 1 - 20 of 179
1.
Mol Cell Proteomics ; 23(2): 100712, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38182042

ABSTRACT

Data-independent acquisition (DIA) mass spectrometry (MS) has emerged as a powerful technology for high-throughput, accurate, and reproducible quantitative proteomics. This review provides a comprehensive overview of recent advances in both the experimental and computational methods for DIA proteomics, from data acquisition schemes to analysis strategies and software tools. DIA acquisition schemes are categorized based on the design of precursor isolation windows, highlighting wide-window, overlapping-window, narrow-window, scanning quadrupole-based, and parallel accumulation-serial fragmentation-enhanced DIA methods. For DIA data analysis, major strategies are classified into spectrum reconstruction, sequence-based search, library-based search, de novo sequencing, and sequencing-independent approaches. A wide array of software tools implementing these strategies are reviewed, with details on their overall workflows and scoring approaches at different steps. The generation and optimization of spectral libraries, which are critical resources for DIA analysis, are also discussed. Publicly available benchmark datasets covering global proteomics and phosphoproteomics are summarized to facilitate performance evaluation of various software tools and analysis workflows. Continued advances and synergistic developments of versatile components in DIA workflows are expected to further enhance the power of DIA-based proteomics.


Subjects
Proteomics , Software , Proteomics/methods , Mass Spectrometry/methods , Gene Library , Proteome/analysis
2.
Mol Cell Proteomics ; 23(6): 100777, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38670310

ABSTRACT

Transmembrane (TM) proteins constitute over 30% of the mammalian proteome and play essential roles in mediating cell-cell communication, synaptic transmission, and plasticity in the central nervous system. Many of these proteins, especially the G protein-coupled receptors (GPCRs), are validated or candidate drug targets for therapeutic development for mental diseases, yet their expression profiles are underrepresented in most global proteomic studies. Herein, we establish a brain TM protein-enriched spectral library based on 136 data-dependent acquisition runs acquired from various brain regions of both naïve mice and mental disease models. This spectral library comprises 3043 TM proteins including 171 GPCRs, 231 ion channels, and 598 transporters. Leveraging this library, we analyzed the data-independent acquisition data from different brain regions of two mouse models exhibiting depression- or anxiety-like behaviors. By integrating multiple informatics workflows and library sources, our study significantly expanded the mental stress-perturbed TM proteome landscape, from which a new GPCR regulator of depression was verified by in vivo pharmacological testing. In summary, we provide a high-quality mouse brain TM protein spectral library to largely increase the TM proteome coverage in specific brain regions, which would catalyze the discovery of new potential drug targets for the treatment of mental disorders.


Subjects
Brain , Disease Models, Animal , Mental Disorders , Mice, Inbred C57BL , Proteome , Proteomics , Animals , Proteome/metabolism , Brain/metabolism , Proteomics/methods , Mice , Mental Disorders/metabolism , Membrane Proteins/metabolism , Male , Receptors, G-Protein-Coupled/metabolism
3.
Mol Cell Proteomics ; 22(4): 100515, 2023 04.
Article in English | MEDLINE | ID: mdl-36796644

ABSTRACT

Immunopeptidomes are the peptide repertoires bound by the molecules encoded by the major histocompatibility complex [human leukocyte antigen (HLA) in humans]. These HLA-peptide complexes are presented on the cell surface for immune T-cell recognition. Immunopeptidomics denotes the utilization of tandem mass spectrometry to identify and quantify peptides bound to HLA molecules. Data-independent acquisition (DIA) has emerged as a powerful strategy for quantitative proteomics and deep proteome-wide identification; however, DIA application to immunopeptidomics analyses has so far seen limited use. Further, of the many DIA data processing tools currently available, there is no consensus in the immunopeptidomics community on the most appropriate pipeline(s) for in-depth and accurate HLA peptide identification. Herein, we benchmarked four commonly used spectral library-based DIA pipelines developed for proteomics applications (Skyline, Spectronaut, DIA-NN, and PEAKS) for their ability to perform immunopeptidome quantification. We validated and assessed the capability of each tool to identify and quantify HLA-bound peptides. Generally, DIA-NN and PEAKS provided higher immunopeptidome coverage with more reproducible results. Skyline and Spectronaut conferred more accurate peptide identification with lower experimental false-positive rates. All tools demonstrated reasonable correlations in quantifying precursors of HLA-bound peptides. Our benchmarking study suggests a combined strategy of applying at least two complementary DIA software tools to achieve the greatest degree of confidence and in-depth coverage of immunopeptidome data.


Subjects
Benchmarking , Peptides , Humans , Peptides/analysis , Histocompatibility Antigens Class I/metabolism , Proteomics/methods , Tandem Mass Spectrometry , Histocompatibility Antigens Class II
4.
Proteomics ; 24(6): e2300236, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37706597

ABSTRACT

Clinical biomarker discovery is often based on the analysis of human plasma samples. However, the high dynamic range and complexity of plasma pose significant challenges to mass spectrometry-based proteomics. Current methods for improving protein identifications require laborious pre-analytical sample preparation. In this study, we developed and evaluated a TMTpro-specific spectral library for improved protein identification in human plasma proteomics. The library was constructed by LC-MS/MS analysis of highly fractionated TMTpro-tagged human plasma, human cell lysates, and relevant arterial tissues. The library was curated using several quality filters to ensure reliable peptide identifications. Our results show that spectral library searching using the TMTpro spectral library improves the identification of proteins in plasma samples compared to conventional sequence database searching. Protein identifications made by the spectral library search engine demonstrated a high degree of complementarity with the sequence database search engine, indicating the feasibility of increasing the number of protein identifications without additional pre-analytical sample preparation. The TMTpro-specific spectral library provides a resource for future plasma proteomics research and optimization of search algorithms for greater accuracy and speed in protein identifications in human plasma proteomics, and is made publicly available to the research community via ProteomeXchange with identifier PXD042546.


Subjects
Proteomics , Software , Humans , Proteomics/methods , Chromatography, Liquid/methods , Tandem Mass Spectrometry/methods , Peptides/analysis , Proteins , Algorithms , Databases, Protein , Peptide Library
5.
Proteomics ; 24(14): e2300431, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38468111

ABSTRACT

SWATH is a data acquisition strategy acclaimed for generating quantitatively accurate and consistent measurements of proteins across multiple samples. Its utility for proteomics studies in nonlaboratory animals, however, is currently compromised by the lack of sufficiently comprehensive and reliable public libraries, either experimental or predicted, and relevant platforms that support their sharing and utilization in an intuitive manner. Here we describe the development of the Veterinary Proteome Browser, VPBrowse (http://browser.proteo.cloud/), an on-line platform for genome-based representation of the Bos taurus proteome, which is equipped with an interactive database and tools for searching, visualization, and building quantitative mass spectrometry assays. In its current version (VPBrowse 1.0), it contains high-quality fragmentation spectra acquired on a QToF instrument for over 36,000 proteotypic peptides, the experimental evidence for over 10,000 proteins. Data can be downloaded in different formats to enable analysis using popular software packages for SWATH data processing, whilst normalization to the iRT scale ensures compatibility with diverse chromatography systems. When applied to a published blood plasma dataset from a biomarker discovery study, the resource supported label-free quantification of additional proteins not previously reported by the authors, including PSMA4, a tissue leakage protein and a promising candidate biomarker of an animal's response to dehorning-related injury.


Subjects
Proteome , Proteomics , Software , Tandem Mass Spectrometry , Cattle , Animals , Tandem Mass Spectrometry/methods , Proteomics/methods , Proteome/analysis , Databases, Protein , Genome/genetics
6.
Proteomics ; 24(15): e2300285, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38171828

ABSTRACT

Neuropeptides have tremendous potential for application in modern medicine, including utility as biomarkers and therapeutics. To overcome the inherent challenges associated with neuropeptide identification and characterization, data-independent acquisition (DIA) is a fitting mass spectrometry (MS) method of choice to achieve sensitive and accurate analysis. It is advantageous for preliminary neuropeptidomic studies to occur in less complex organisms, with crustacean models serving as a popular choice due to their relatively simple nervous system. With spectral libraries serving as a means to interpret DIA-MS output spectra, and Cancer borealis as a model of choice for neuropeptide analysis, we performed the first spectral library mapping of crustacean neuropeptides. Leveraging pre-existing data-dependent acquisition (DDA) spectra, a spectral library was built using PEAKS Online. The library is comprised of 333 unique neuropeptides. The identification results obtained through the use of this spectral library were compared with those achieved through library-free analysis of crustacean brain, pericardial organs (PO), and thoracic ganglia (TG) tissues. A statistically significant increase (Student's t-test, P value < 0.05) in the number of identifications achieved from the TG data was observed in the spectral library results. Furthermore, in each of the tissues, a distinctly different set of identifications was found in the library search compared to the library-free search. This work highlights the necessity for the use of spectral libraries in neuropeptide analysis, illustrating the advantage of spectral libraries for interpreting DIA spectra in a reproducible manner with greater neuropeptidomic depth.


Subjects
Mass Spectrometry , Neuropeptides , Animals , Neuropeptides/analysis , Mass Spectrometry/methods , Brachyura/chemistry , Brachyura/metabolism , Peptide Library , Proteomics/methods , Crustacea/chemistry , Databases, Protein
7.
Proteomics ; 24(15): e2300628, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38400697

ABSTRACT

Botryllus schlosseri is a model marine invertebrate for studying immunity, regeneration, and stress-induced evolution. Conditions for validating its predicted proteome were optimized using nanoElute® 2 deep-coverage LCMS, revealing up to 4930 protein groups and 20,984 unique peptides per sample. Spectral libraries were generated and filtered to remove interferences and low-quality transitions and to retain only proteins with >3 unique peptides. The resulting DIA assay library enabled label-free quantitation of 3426 protein groups represented by 22,593 unique peptides. Quantitative comparisons of single systems from a laboratory-raised population with two field-collected populations revealed (1) a more unique proteome in the laboratory-raised population, and (2) proteins with high/low individual variabilities in each population. DNA repair/replication, ion transport, and intracellular signaling processes were distinct in laboratory-cultured colonies. Spliceosome and Wnt signaling proteins were the least variable (highly functionally constrained) in all populations. In conclusion, we present the first colonial tunicate's deep quantitative proteome analysis, identifying functional protein clusters associated with laboratory conditions, different habitats, and strong versus relaxed abundance constraints. These results empower research on B. schlosseri with proteomics resources and enable quantitative molecular phenotyping of changes associated with transfer from in situ to ex situ and from in vivo to in vitro culture conditions.


Subjects
Proteome , Proteomics , Urochordata , Animals , Proteomics/methods , Urochordata/metabolism , Proteome/analysis , Proteome/metabolism , Chromatography, Liquid/methods
8.
J Proteome Res ; 23(5): 1768-1778, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38580319

ABSTRACT

Biofluids contain molecules in circulation and from nearby organs that can be indicative of disease states. Characterizing the proteome of biofluids with DIA-MS is an emerging area of interest for biomarker discovery; yet, there is limited consensus on DIA-MS data analysis approaches for analyzing large numbers of biofluids. To evaluate various DIA-MS workflows, we collected urine from a clinically heterogeneous cohort of prostate cancer patients and acquired data in DDA and DIA scan modes. We then searched the DIA data against urine spectral libraries generated using common library generation approaches or a library-free method. We show that DIA-MS doubles the sample throughput compared to standard DDA-MS with minimal losses to peptide detection. We further demonstrate that using a sample-specific spectral library generated from individual urines maximizes peptide detection compared to a library-free approach, a pan-human library, or libraries generated from pooled, fractionated urines. Adding urine subproteomes, such as the urinary extracellular vesicular proteome, to the urine spectral library further improves the detection of prostate proteins in unfractionated urine. Altogether, we present an optimized DIA-MS workflow and provide several high-quality, comprehensive prostate cancer urine spectral libraries that can streamline future biomarker discovery studies of prostate cancer using DIA-MS.


Subjects
Prostatic Neoplasms , Proteome , Proteomics , Humans , Male , Prostatic Neoplasms/urine , Prostatic Neoplasms/diagnosis , Proteome/analysis , Proteomics/methods , Prostate/metabolism , Prostate/pathology , Peptide Library , Biomarkers, Tumor/urine , Tandem Mass Spectrometry/methods , Workflow
9.
J Proteome Res ; 23(3): 1102-1117, 2024 03 01.
Article in English | MEDLINE | ID: mdl-38358903

ABSTRACT

Nontuberculous mycobacteria (NTMs) are opportunistic bacteria that cause pulmonary and extra-pulmonary infections in humans and closely resemble Mycobacterium tuberculosis. Although genome sequencing strategies helped determine NTMs, a common assay for the detection of coinfection by multiple NTMs with M. tuberculosis in the primary attempt of diagnosis is still elusive. Such a lack of efficiency leads to delayed therapy, an inappropriate choice of drugs, drug resistance, disease complications, morbidity, and mortality. Although a high-resolution LC-MS/MS-based multiprotein panel assay can be developed due to its specificity and sensitivity, it needs a library of species-specific peptides as a platform. Toward this, we performed an analysis of proteomes of 9 NTM species with more than 20 million peptide spectrum matches gathered from 26 proteome data sets. Our metaproteomic analyses determined 48,172 species-specific proteotypic peptides across 9 NTMs. Notably, M. smegmatis (26,008), M. abscessus (12,442), M. vaccae (6487), M. fortuitum (1623), M. avium subsp. paratuberculosis (844), M. avium subsp. hominissuis (580), and M. marinum (112) displayed >100 species-specific proteotypic peptides. Finally, these peptides and corresponding spectra have been compiled into a spectral library, FASTA, and JSON formats for future reference and validation in clinical cohorts by the biomedical community for further translation.


Subjects
Mycobacterium tuberculosis , Proteomics , Animals , Humans , Chromatography, Liquid , Tandem Mass Spectrometry , Nontuberculous Mycobacteria/genetics , Mycobacterium tuberculosis/genetics , Peptides
10.
J Proteome Res ; 23(4): 1263-1271, 2024 04 05.
Article in English | MEDLINE | ID: mdl-38478054

ABSTRACT

Amino acid substitutions (AASs) alter proteins from their genome-expected sequences. Accumulation of substitutions in proteins underlies numerous diseases and antibiotic mechanisms. Accurate global detection of AASs and their frequencies is crucial for understanding these mechanisms. Shotgun proteomics provides an untargeted method for measuring AASs but introduces biases when extrapolating from the genome to identify AASs. To characterize these biases, we created a "ground-truth" approach using the similarities between Escherichia coli and Salmonella typhimurium to model the complexity of AAS detection. Shotgun proteomics on mixed lysates generated libraries representing ∼100,000 peptide-spectra and 4161 peptide sequences with a single AAS and defined stoichiometry. Identifying S. typhimurium peptide-spectra with only the E. coli genome resulted in 64.1% correctly identified library peptides. Specific AASs exhibit variable identification efficiencies. There was no inherent bias from the stoichiometry of the substitutions. Short peptides and AASs localized near peptide termini had poor identification efficiency. We identify a new class of "scissor substitutions" that gain or lose protease cleavage sites. Scissor substitutions also had poor identification efficiency. This ground-truth AAS library reveals various sources of bias, which will guide the application of shotgun proteomics to validate AAS hypotheses.


Subjects
Escherichia coli , Proteomics , Proteomics/methods , Amino Acid Substitution , Escherichia coli/genetics , Peptides/genetics , Peptides/chemistry , Proteins
11.
Metabolomics ; 20(6): 114, 2024 Oct 13.
Article in English | MEDLINE | ID: mdl-39397202

ABSTRACT

INTRODUCTION: Over the past two decades, liquid chromatography-mass spectrometry (LC-MS)-based metabolomics has experienced significant growth, playing a crucial role in various scientific disciplines. However, despite these advancements, metabolite identification (MetID) remains a significant challenge. To address this, stringent MetID requirements were established, emphasizing the necessity of aligning experimental data with authentic reference standards using multiple criteria. Establishing dependable methods and corresponding libraries is crucial for instilling confidence in MetID and driving further progress in metabolomics. OBJECTIVE: The EMBL-MCF 2.0 LC-MS/MS method and public library were designed to facilitate both targeted and untargeted metabolomics with exclusive focus on endogenous, polar metabolites, which are known to be challenging to analyze due to their hydrophilic nature. By accompanying spectral data with robust retention times obtained from authentic standards and low-adsorption chromatography, high-confidence MetID is achieved and made accessible to the metabolomics community. METHODS: The library is built on hydrophilic interaction liquid chromatography (HILIC) and state-of-the-art low-adsorption LC hardware. Both high-resolution tandem mass spectra and manually optimized multiple reaction monitoring (MRM) transitions were acquired on an Orbitrap Exploris 240 and a QTRAP 6500+, respectively. RESULTS: Implementation of biocompatible HILIC has facilitated the separation of isomeric metabolites with significant enhancements in both selectivity and sensitivity. The resulting library comprises a diverse collection of more than 250 biologically relevant metabolites. The methodology was successfully applied to investigate a variety of biological matrices, with exemplary findings showcased using murine plasma samples. CONCLUSIONS: Our work has resulted in the development of the EMBL-MCF 2.0 library, a powerful resource for sensitive metabolomics analyses and high-confidence MetID. The library is freely accessible and available in the universal .msp file format under the CC-BY 4.0 license: mona.fiehnlab.ucdavis.edu, https://mona.fiehnlab.ucdavis.edu/spectra/browse?query=exists(tags.text:%27EMBL-MCF_2.0_HRMS_Library%27) ; EMBL-MCF 2.0 HRMS, https://www.embl.org/groups/metabolomics/instrumentation-and-software/#MCF-library


Subjects
Hydrophobic and Hydrophilic Interactions , Metabolomics , Tandem Mass Spectrometry , Metabolomics/methods , Tandem Mass Spectrometry/methods , Chromatography, Liquid/methods , Animals , Mice , Humans , Liquid Chromatography-Mass Spectrometry
12.
Anal Bioanal Chem ; 2024 Sep 10.
Article in English | MEDLINE | ID: mdl-39251428

ABSTRACT

Pharmaceuticals released into the aquatic and soil environments can be absorbed by plants and soil organisms, potentially leading to the formation of unknown metabolites that may negatively affect these organisms or contaminate the food chain. The aim of this study was to identify pharmaceutical metabolites through a triplet approach for metabolite structure prediction (software-based predictions, literature review, and known common metabolic pathways), followed by generating in silico mass spectral libraries and applying various mass spectrometry modes for untargeted LC-qTOF analysis. To this end, Eisenia fetida and Lactuca sativa were exposed to a pharmaceutical mixture (atenolol, enrofloxacin, erythromycin, ketoprofen, sulfamethoxazole, tetracycline) under hydroponic and soil conditions at environmentally relevant concentrations. Samples collected at different time points were extracted using QuEChERS and analyzed with LC-qTOF in data-dependent (DDA) and data-independent (DIA) acquisition modes, applying both positive and negative electrospray ionization. The triplet approach for metabolite structure prediction yielded a total of 3762 pharmaceutical metabolites, and an in silico mass spectral library was created based on these predicted metabolites. This approach resulted in the identification of 26 statistically significant metabolites (p < 0.05), with DDA+ and DDA- outperforming DIA modes by successfully detecting 56/67 sample type:metabolite combinations. Lettuce roots had the highest metabolite count (26), followed by leaves (6) and earthworms (2). Despite the lower metabolite count, earthworms showed the highest peak intensities, closely followed by roots, with leaves displaying the lowest intensities. Common metabolic reactions observed included hydroxylation, decarboxylation, acetylation, and glucosidation, with ketoprofen-related metabolites being the most prevalent, totaling 12 distinct metabolites. In conclusion, we developed a high-throughput workflow combining open-source software with LC-HRMS for identifying unknown metabolites across various sample types.

13.
Mol Cell Proteomics ; 21(10): 100408, 2022 10.
Article in English | MEDLINE | ID: mdl-36058520

ABSTRACT

The mouse is a valuable model organism for biomedical research. Here, we established a comprehensive spectral library and the data-independent acquisition-based quantitative proteome maps for 41 mouse organs, including some rarely reported organs such as the cornea, retina, and nine paired organs. The mouse spectral library contained 178,304 peptides from 12,320 proteins, including 1678 proteins not reported in previous mouse spectral libraries. Our data suggested that organs from the nervous system and immune system expressed the most distinct proteomes compared with other organs. We also found characteristic protein expression of immune-privileged organs, which may help in understanding possible immune rejection after organ transplantation. Each tissue type expressed characteristic high-abundance proteins related to its physiological functions. We also uncovered some tissue-specific proteins that have not been reported previously. The testis expressed the highest number of tissue-specific proteins. By comparison of nine paired organs including kidneys, testes, and adrenal glands, we found that left organs exhibited higher levels of antioxidant enzymes. We also observed expression asymmetry for proteins related to the apoptotic process, tumor suppression, and organ functions between the left and right sides. This study provides a comprehensive spectral library and a quantitative proteome resource for mouse studies.


Subjects
Antioxidants , Proteome , Male , Mice , Animals , Proteomics , Peptides
14.
Proteomics ; 23(9): e2200179, 2023 05.
Article in English | MEDLINE | ID: mdl-36571325

ABSTRACT

Data-independent acquisition (DIA) of tandem mass spectrometry spectra has emerged as a promising technology to improve coverage and quantification of proteins in complex mixtures. The success of DIA experiments is dependent on the quality of spectral libraries used for database searching. Frequently, these libraries need to be generated by labor- and time-intensive data-dependent acquisition (DDA) experiments. Recently, several algorithms have been published that allow the generation of theoretical libraries by an efficient prediction of retention time and intensity of the fragment ions. Sequential windowed acquisition of all theoretical fragment ion spectra mass spectrometry (SWATH-MS) is a DIA method that can be applied at an unprecedented speed, but the fragmentation spectra suffer from a lower quality than data acquired on Orbitrap instruments. To reliably generate theoretical libraries that can be used in SWATH experiments, we developed deep-learning for SWATH analysis (dpSWATH), to improve the sensitivity and specificity of data generated by Q-TOF mass spectrometers. The theoretical library built by dpSWATH allowed us to increase the identification rate of proteins compared to traditional or library-free methods. Based on our analysis, we conclude that dpSWATH is a better prediction framework for SWATH-MS measurements than other algorithms based on Orbitrap data.


Subjects
Deep Learning , Tandem Mass Spectrometry/methods , Proteins , Algorithms , Databases, Factual
15.
Proteomics ; 23(7-8): e2200046, 2023 04.
Article in English | MEDLINE | ID: mdl-36036492

ABSTRACT

Protein post-translational modifications (PTMs) increase the functional diversity of the cellular proteome. Accurate and high-throughput identification and quantification of protein PTMs is a key task in proteomics research. Recent advancements in data-independent acquisition (DIA) mass spectrometry (MS) technology have achieved deep coverage and accurate quantification of proteins and PTMs. This review provides an overview of DIA data processing methods that cover three aspects of PTM analysis, that is, detection of PTMs, site localization, and characterization of complex modification moieties, such as glycosylation. In addition, a survey of deep learning methods that boost DIA-based PTM analysis is presented, including in silico spectral library generation, as well as feature scoring and error rate control. The limitations and future directions of DIA methods for PTM analysis are also discussed. Novel data analysis methods will take advantage of advanced MS instrumentation techniques to empower DIA MS for in-depth and accurate PTM measurements.


Subjects
Peptides , Proteomics , Proteomics/methods , Peptides/analysis , Protein Processing, Post-Translational , Mass Spectrometry/methods , Proteome/chemistry
16.
Proteomics ; 23(7-8): e2200041, 2023 04.
Article in English | MEDLINE | ID: mdl-36906835

ABSTRACT

Accurate retention time (RT) prediction is important for spectral library-based analysis in data-independent acquisition mass spectrometry-based proteomics. The deep learning approach has demonstrated superior performance over traditional machine learning methods for this purpose. The transformer architecture is a recent development in deep learning that delivers state-of-the-art performance in many fields such as natural language processing, computer vision, and biology. We assess the performance of the transformer architecture for RT prediction using datasets from five deep learning models: Prosit, DeepDIA, AutoRT, DeepPhospho, and AlphaPeptDeep. The experimental results on holdout datasets and independent datasets exhibit state-of-the-art performance of the transformer architecture. The software and evaluation datasets are publicly available for future development in the field.


Subjects
Peptide Library , Proteomics , Proteomics/methods , Mass Spectrometry/methods , Software , Chromatography, Liquid/methods
17.
J Proteome Res ; 22(12): 3692-3702, 2023 12 01.
Article in English | MEDLINE | ID: mdl-37910637

ABSTRACT

Spectral libraries are useful resources in proteomic data analysis. Recent advances in deep learning allow tandem mass spectra of peptides to be predicted from their amino acid sequences. This enables predicted spectral libraries to be compiled, and searching against such libraries has been shown to improve the sensitivity in peptide identification over conventional sequence database searching. However, current prediction models lack support for longer peptides, and thus far, predicted library searching has only been demonstrated for backbone ion-only spectrum prediction methods. Here, we propose a deep learning-based full-spectrum prediction method to generate predicted spectral libraries for peptide identification. We demonstrated the superiority of using full-spectrum libraries over backbone ion-only prediction approaches in spectral library searching. Furthermore, merging spectra from different prediction models, as a form of ensemble learning, can produce improved spectral libraries, in terms of identification sensitivity. We also show that a hybrid library combining predicted and experimental spectra can lead to 20% more confident identifications over experimental library searching or sequence database searching.


Subjects
Deep Learning , Peptide Library , Proteomics/methods , Software , Databases, Protein , Peptides/chemistry
18.
J Proteome Res ; 22(10): 3225-3241, 2023 Oct 06.
Article in English | MEDLINE | ID: mdl-37647588

ABSTRACT

Glycopeptide Abundance Distribution Spectra (GADS) were recently introduced as a means of representing, storing, and comparing glycan profiles of intact glycopeptides. Here, using that representation, an extensive analysis is made of multiple commercial sources of the recombinant SARS-CoV-2 spike protein, each containing 22 N-linked glycan sites (sequons). Multiple proteases are used along with variable energy fragmentation followed by ion trap confirmation. This enables a detailed examination of the reproducibility of the method across multiple types of variability. These results show that GADS are consistent between replicates and laboratories for sufficiently abundant glycopeptides. Derived GADS enable the examination and comparison of the glycan profiles between commercial sources of the spike protein. Multiple distinct glycopeptide distributions, generated by multiple proteases, confirm these profiles. Comparisons of GADS derived from 11 sources of recombinant spike protein reveal that sources for which protein expression methods were the same produced near-identical glycan profiles, thereby demonstrating the ability of this method to measure GADS of sufficient reliability to distinguish different glycoform distributions between commercial vendors and potentially to reliably determine and compare differences in glycosylation for any glycoprotein under different conditions of production. All mass spectrometry data files have been deposited in the MassIVE repository under the identifier MSV000091776.

19.
J Proteome Res ; 22(2): 482-490, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36695531

ABSTRACT

Spectrum library searching is a powerful alternative to database searching for data-dependent acquisition experiments, but has been historically limited to identifying previously observed peptides in libraries. Here we present Scribe, a new library search engine designed to leverage deep learning fragmentation prediction software such as Prosit. Rather than relying on highly curated DDA libraries, this approach predicts fragmentation and retention times for every peptide in a FASTA database. Scribe embeds Percolator for false discovery rate correction and an interference-tolerant, label-free quantification integrator for an end-to-end proteomics workflow. By leveraging expected relative fragmentation and retention time values, we find that library searching with Scribe can outperform traditional database searching tools both in terms of sensitivity and quantitative precision. Scribe and its graphical interface are easy to use, freely accessible, and fully open source.


Subjects
Peptides , Tandem Mass Spectrometry , Software , Proteomics , Search Engine , Peptide Library , Databases, Protein
20.
J Proteome Res ; 22(5): 1501-1509, 2023 05 05.
Article in English | MEDLINE | ID: mdl-36802412

ABSTRACT

Liquid chromatography coupled with tandem mass spectrometry is commonly adopted in large-scale glycoproteomic studies involving hundreds of disease and control samples. The software for glycopeptide identification in such data (e.g., the commercial software Byonic) analyzes the individual data set and does not exploit the redundant spectra of glycopeptides presented in the related data sets. Herein, we present a novel concurrent approach for glycopeptide identification in multiple related glycoproteomic data sets by using spectral clustering and spectral library searching. The evaluation on two large-scale glycoproteomic data sets showed that the concurrent approach can identify 105%-224% more spectra as glycopeptides compared to the glycopeptide identification on individual data sets using Byonic alone. The improvement of glycopeptide identification also enabled the discovery of several potential biomarkers of protein glycosylations in hepatocellular carcinoma patients.


Subjects
Liver Neoplasms , Tandem Mass Spectrometry , Humans , Tandem Mass Spectrometry/methods , Glycopeptides/analysis , Chromatography, Liquid , Software