1.
Mol Cell Proteomics ; 22(5): 100538, 2023 05.
Article in English | MEDLINE | ID: mdl-37004988

ABSTRACT

Posttranslational modifications of proteins play essential roles in defining and regulating the functions of the proteins they decorate, making identification of these modifications critical to understanding biology and disease. Methods for enriching and analyzing a wide variety of biological and chemical modifications of proteins have been developed using mass spectrometry-based proteomics, largely relying on traditional database search methods to identify the resulting mass spectra of modified peptides. These database search methods treat modifications as static attachments of a mass to a particular position in the peptide sequence, but many modifications undergo fragmentation in tandem mass spectrometry experiments alongside, or instead of, the peptide backbone. While this fragmentation can confound traditional search methods, it also offers unique opportunities for improved searches that incorporate modification-specific fragment ions. Here, we present a new labile mode in the MSFragger search engine that provides the flexibility to tailor modification-centric searches to the fragmentation observed. We show that labile mode can dramatically improve spectrum identification rates of phosphopeptides, RNA-crosslinked peptides, and ADP-ribosylated peptides. Each of these modifications presents distinct fragmentation characteristics, showcasing the flexibility of MSFragger labile mode to improve search for a wide variety of biological and chemical modifications.


Subjects
Protein Processing, Post-Translational; Proteomics; Proteomics/methods; Proteins/metabolism; Tandem Mass Spectrometry/methods; Phosphopeptides/metabolism; Databases, Protein
2.
Anal Bioanal Chem ; 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38877149

ABSTRACT

Identification of O-glycopeptides from tandem mass spectrometry data is complicated by the near complete dissociation of O-glycans from the peptide during collisional activation and by the combinatorial explosion of possible glycoforms when glycans are retained intact in electron-based activation. The recent O-Pair search method provides an elegant solution to these problems, using a collisional activation scan to identify the peptide sequence and total glycan mass, and a follow-up electron-based activation scan to localize the glycosite(s) using a graph-based algorithm in a reduced search space. Our previous O-glycoproteomics methods with MSFragger-Glyco allowed for extremely fast and sensitive identification of O-glycopeptides from collisional activation data but had limited support for site localization of glycans and quantification of glycopeptides. Here, we report an improved pipeline for O-glycoproteomics analysis that provides proteome-wide, site-specific, quantitative results by incorporating the O-Pair method as a module within FragPipe. In addition to improved search speed and sensitivity, we add flexible options for oxonium ion-based filtering of glycans and support for a variety of MS acquisition methods and provide a comparison between all software tools currently capable of O-glycosite localization in proteome-wide searches.

3.
Mol Cell Proteomics ; 21(3): 100205, 2022 03.
Article in English | MEDLINE | ID: mdl-35091091

ABSTRACT

Rapidly improving methods for glycoproteomics have enabled increasingly large-scale analyses of complex glycopeptide samples, but annotating the resulting mass spectrometry data with high confidence remains a major bottleneck. We recently introduced a fast and sensitive glycoproteomics search method in our MSFragger search engine, which reports glycopeptides as a combination of a peptide sequence and the mass of the attached glycan. In samples with complex glycosylation patterns, converting this mass to a specific glycan composition is not straightforward, however, as many glycans have similar or identical masses. Here, we have developed a new method for determining the glycan composition of N-linked glycopeptides fragmented by collisional or hybrid activation that uses multiple sources of information from the spectrum, including observed glycan B-type (oxonium) and Y-type ions and mass and precursor monoisotopic selection errors, to discriminate between possible glycan candidates. Combined with false discovery rate estimation for the glycan assignment, we show that this method is capable of specifically and sensitively identifying glycans in complex glycopeptide analyses and effectively controls the rate of false glycan assignments. The new method has been incorporated into the PTM-Shepherd modification analysis tool to work directly with the MSFragger glyco search in the FragPipe graphical user interface, providing a complete computational pipeline for annotation of N-glycopeptide spectra with false discovery rate control of both peptide and glycan components that is both sensitive and robust against false identifications.
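
The mass ambiguity mentioned in this abstract can be illustrated with a small calculation. The sketch below is not taken from MSFragger or PTM-Shepherd; the function name and the two example compositions are hypothetical, and only standard monoisotopic residue masses are assumed. It shows that swapping one NeuAc for two Fuc changes the glycan mass by almost exactly one isotope spacing, so a precursor monoisotopic selection error can make the two compositions hard to tell apart by mass alone.

```python
# Monoisotopic residue masses (Da) of common monosaccharides.
RESIDUE_MASS = {
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}
ISOTOPE_SPACING = 1.00335  # approximate 13C - 12C mass difference (Da)


def glycan_mass(composition):
    """Sum residue masses for a composition given as {residue: count}."""
    return sum(RESIDUE_MASS[res] * n for res, n in composition.items())


# Two hypothetical compositions that differ only by NeuAc(1) vs Fuc(2).
with_neuac = {"HexNAc": 4, "Hex": 5, "NeuAc": 1}
with_fuc = {"HexNAc": 4, "Hex": 5, "Fuc": 2}

# Mass difference between the two candidates (about 1.02 Da).
delta = glycan_mass(with_fuc) - glycan_mass(with_neuac)

# Residual difference once a one-isotope selection error is allowed for.
residual = abs(delta - ISOTOPE_SPACING)
print(f"delta = {delta:.4f} Da, residual after isotope shift = {residual:.4f} Da")
```

On a glycopeptide of several thousand Da, a residual of this size corresponds to only a few ppm, which is why additional evidence such as fragment ions is needed to separate such candidates.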


Subjects
Proteomics; Tandem Mass Spectrometry; Glycopeptides/chemistry; Glycosylation; Polysaccharides/analysis; Proteomics/methods
4.
Mol Cell Proteomics ; 21(4): 100218, 2022 04.
Article in English | MEDLINE | ID: mdl-35219905

ABSTRACT

Proteinaceous cysteine residues act as privileged sensors of oxidative stress. As reactive oxygen and nitrogen species have been implicated in numerous pathophysiological processes, deciphering which cysteines are sensitive to oxidative modification and the specific nature of these modifications is essential to understanding protein and cellular function in health and disease. While established mass spectrometry-based proteomic platforms have improved our understanding of the redox proteome, the widespread adoption of these methods is often hindered by complex sample preparation workflows, prohibitive cost of isotopic labeling reagents, and requirements for custom data analysis workflows. Here, we present the SP3-Rox redox proteomics method that combines tailored low cost isotopically labeled capture reagents with SP3 sample cleanup to achieve high throughput and high coverage proteome-wide identification of redox-sensitive cysteines. By implementing a customized workflow in the free FragPipe computational pipeline, we achieve accurate MS1-based quantitation, including for peptides containing multiple cysteine residues. Application of the SP3-Rox method to cellular proteomes identified cysteines sensitive to the oxidative stressor GSNO and cysteine oxidation state changes that occur during T cell activation.


Subjects
Cysteine; Proteomics; Cysteine/chemistry; Mass Spectrometry/methods; Oxidation-Reduction; Proteome/metabolism; Proteomics/methods
5.
J Proteome Res ; 22(2): 520-525, 2023 02 03.
Article in English | MEDLINE | ID: mdl-36475762

ABSTRACT

Here, we describe the implementation of the fast proteomics search engine MSFragger as a processing node in the widely used Proteome Discoverer (PD) software platform. PeptideProphet (via the Philosopher tool kit) is also implemented as an additional PD node to allow validation of MSFragger open (mass-tolerant) search results. These two nodes, along with the existing Percolator validation module, allow users to employ different search strategies and conveniently inspect search results through PD. Our results have demonstrated the improved numbers of PSMs, peptides, and proteins identified by MSFragger coupled with Percolator and significantly faster search speed compared to the conventional SEQUEST/Percolator PD workflows. The MSFragger-PD node is available at https://github.com/nesvilab/PD-Nodes/releases/.


Subjects
Proteome; Search Engine; Search Engine/methods; Proteome/metabolism; Algorithms; Tandem Mass Spectrometry/methods; Software; Databases, Protein
6.
Anal Chem ; 2023 Jan 13.
Article in English | MEDLINE | ID: mdl-36637389

ABSTRACT

There is a growing demand to develop high-throughput and high-sensitivity mass spectrometry methods for single-cell proteomics. The commonly used isobaric labeling-based multiplexed single-cell proteomics approach suffers from distorted protein quantification due to co-isolated interfering ions during MS/MS fragmentation, also known as ratio compression. We reasoned that the use of MS3-based quantification could mitigate ratio compression and provide better quantification. However, previous studies indicated reduced proteome coverages in the MS3 method, likely due to long duty cycle time and ion losses during multilevel ion selection and fragmentation. Herein, we described an improved MS acquisition method for MS3-based single-cell proteomics by employing a linear ion trap to measure reporter ions. We demonstrated that linear ion trap can increase the proteome coverages for single-cell-level peptides with even higher gain obtained via the MS3 method. The optimized real-time search MS3 method was further applied to study the immune activation of single macrophages. Among a total of 126 single cells studied, over 1200 and 1000 proteins were quantifiable when at least 50 and 75% nonmissing data were required, respectively. Our evaluation also revealed several limitations of the low-resolution ion trap detector for multiplexed single-cell proteomics and suggested experimental solutions to minimize their impacts on single-cell analysis.

7.
Nat Methods ; 17(11): 1125-1132, 2020 11.
Article in English | MEDLINE | ID: mdl-33020657

ABSTRACT

Recent advances in methods for enrichment and mass spectrometric analysis of intact glycopeptides have produced large-scale glycoproteomics datasets, but interpreting these data remains challenging. We present MSFragger-Glyco, a glycoproteomics mode of the MSFragger search engine, for fast and sensitive identification of N- and O-linked glycopeptides and open glycan searches. Reanalysis of recent N-glycoproteomics data resulted in annotation of 80% more glycopeptide spectrum matches (glycoPSMs) than previously reported. In published O-glycoproteomics data, our method more than doubled the number of glycoPSMs annotated when searching the same glycans as the original search, and yielded 4- to 6-fold increases when expanding searches to include additional glycan compositions and other modifications. Expanded searches also revealed many sulfated and complex glycans that remained hidden to the original search. With greatly improved spectral annotation, coupled with the speed of index-based scoring, MSFragger-Glyco makes it possible to comprehensively interrogate glycoproteomics data and illuminate the many roles of glycosylation.


Subjects
Glycopeptides; Proteomics/methods; Search Engine; Tandem Mass Spectrometry; Databases, Protein; Glycopeptides/analysis; Glycopeptides/chemistry; Glycosylation; Proteomics/instrumentation
8.
Mol Cell Proteomics ; 20: 100077, 2021.
Article in English | MEDLINE | ID: mdl-33813065

ABSTRACT

Missing values weaken the power of label-free quantitative proteomic experiments to uncover true quantitative differences between biological samples or experimental conditions. Match-between-runs (MBR) has become a common approach to mitigate the missing value problem, where peptides identified by tandem mass spectra in one run are transferred to another by inference based on m/z, charge state, retention time, and ion mobility when applicable. Though tolerances are used to ensure such transferred identifications are reasonably located and meet certain quality thresholds, little work has been done to evaluate the statistical confidence of MBR. Here, we present a mixture model-based approach to estimate the false discovery rate (FDR) of peptide and protein identification transfer, which we implement in the label-free quantification tool IonQuant. Using several benchmarking datasets generated on both Orbitrap and timsTOF mass spectrometers, we demonstrate superior performance of IonQuant with FDR-controlled MBR compared with MaxQuant (19-38 times faster; 6-18% more proteins quantified and with comparable or better accuracy). We further illustrate the performance of IonQuant and highlight the need for FDR-controlled MBR, in two single-cell proteomics experiments, including one acquired with the help of high-field asymmetric ion mobility spectrometry separation. Fully integrated in the FragPipe computational environment, IonQuant with FDR-controlled MBR enables fast and accurate peptide and protein quantification in label-free proteomics experiments.


Subjects
Proteomics/methods; Algorithms; Databases, Protein; Escherichia coli Proteins; HeLa Cells; Humans; Peptides; Proteins; Saccharomyces cerevisiae Proteins; Single-Cell Analysis; Software
9.
Mol Cell Proteomics ; 20: 100018, 2021.
Article in English | MEDLINE | ID: mdl-33568339

ABSTRACT

Open searching has proven to be an effective strategy for identifying both known and unknown modifications in shotgun proteomics experiments. Rather than being limited to a small set of user-specified modifications, open searches identify peptides with any mass shift that may correspond to a single modification or a combination of several modifications. Here we present PTM-Shepherd, a bioinformatics tool that automates characterization of post-translational modification profiles detected in open searches based on attributes, such as amino acid localization, fragmentation spectra similarity, retention time shifts, and relative modification rates. PTM-Shepherd can also perform multiexperiment comparisons for studying changes in modification profiles, e.g., in data generated in different laboratories or under different conditions. We demonstrate how PTM-Shepherd improves the analysis of data from formalin-fixed and paraffin-embedded samples, detects extreme underalkylation of cysteine in some data sets, discovers an artifactual modification introduced during peptide synthesis, and uncovers site-specific biases in sample preparation artifacts in a multicenter proteomics profiling study.
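
The notion of a mass shift in an open search can be sketched in a few lines. This is an illustrative example only, not PTM-Shepherd code: the function name, the bin width, and the example masses are assumptions. Each peptide-spectrum match contributes the difference between its observed precursor mass and the theoretical mass of the unmodified peptide, and the differences are binned so that recurring shifts stand out.

```python
from collections import Counter


def shift_histogram(psms, bin_width=0.01):
    """Count mass shifts (observed - theoretical) in bins of `bin_width` Da.

    `psms` is an iterable of (observed_mass, theoretical_mass) pairs.
    Keys of the returned Counter are integer bin indices; multiply a key
    by `bin_width` to recover the approximate shift in Da.
    """
    counts = Counter()
    for observed, theoretical in psms:
        counts[round((observed - theoretical) / bin_width)] += 1
    return counts


# Hypothetical matches: two carry a shift near +79.966 Da (consistent with
# phosphorylation), one near +42.011 Da (consistent with acetylation),
# and one is essentially unmodified.
psms = [
    (1080.4663, 1000.5000),
    (1580.7161, 1500.7500),
    (1242.6106, 1200.6000),
    (900.4001, 900.4000),
]
hist = shift_histogram(psms)
top_bin, top_count = hist.most_common(1)[0]
print(f"most frequent shift is about {top_bin * 0.01:.2f} Da ({top_count} PSMs)")
```

A tool that characterizes modification profiles would then attach further attributes to each recurring bin, such as the localization and retention-time information listed in the abstract.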


Subjects
Peptides/chemistry; Peptides/metabolism; Protein Processing, Post-Translational; Animals; Databases, Protein; Humans; Mice; Proteomics
10.
Mol Cell Proteomics ; 19(9): 1575-1585, 2020 09.
Article in English | MEDLINE | ID: mdl-32616513

ABSTRACT

Ion mobility brings an additional dimension of separation to LC-MS, improving identification of peptides and proteins in complex mixtures. A recently introduced timsTOF mass spectrometer (Bruker) couples trapped ion mobility separation to TOF mass analysis. With the parallel accumulation serial fragmentation (PASEF) method, the timsTOF platform achieves promising results, yet analysis of the data generated on this platform represents a major bottleneck. Currently, MaxQuant and PEAKS are most used to analyze these data. However, because of the high complexity of timsTOF PASEF data, both require substantial time to perform even standard tryptic searches. Advanced searches (e.g. with many variable modifications, semi- or non-enzymatic searches, or open searches for post-translational modification discovery) are practically impossible. We have extended our fast peptide identification tool MSFragger to support timsTOF PASEF data, and developed a label-free quantification tool, IonQuant, for fast and accurate 4-D feature extraction and quantification. Using a HeLa data set published by Meier et al. (2018), we demonstrate that MSFragger identifies significantly (∼30%) more unique peptides than MaxQuant (1.6.10.43), and performs comparably or better than PEAKS X+ (∼10% more peptides). IonQuant outperforms both in terms of number of quantified proteins while maintaining good quantification precision and accuracy. Runtime tests show that MSFragger and IonQuant can fully process a typical two-hour PASEF run in under 70 min on a typical desktop (6 CPU cores, 32 GB RAM), significantly faster than other tools. Finally, through semi-enzymatic searching, we significantly increase the number of identified peptides. Within these semi-tryptic identifications, we report evidence of gas-phase fragmentation before MS/MS analysis.


Subjects
Chromatography, Liquid/methods; Ion Mobility Spectrometry/methods; Peptides/analysis; Proteome/metabolism; Proteomics/methods; Tandem Mass Spectrometry/methods; Algorithms; Databases, Protein; Escherichia coli/metabolism; HeLa Cells; Humans; Peptides/metabolism; Phylogeny; Protein Processing, Post-Translational; Saccharomyces cerevisiae/chemistry; Saccharomyces cerevisiae/metabolism; Sensitivity and Specificity
11.
J Proteome Res ; 20(1): 498-505, 2021 01 01.
Article in English | MEDLINE | ID: mdl-33332123

ABSTRACT

Deisotoping, or the process of removing peaks in a mass spectrum resulting from the incorporation of naturally occurring heavy isotopes, has long been used to reduce complexity and improve the effectiveness of spectral annotation methods in proteomics. We have previously described MSFragger, an ultrafast search engine for proteomics, that did not utilize deisotoping in processing input spectra. Here, we present a new, high-speed parallelized deisotoping algorithm, based on elements of several existing methods, that we have incorporated into the MSFragger search engine. Applying deisotoping with MSFragger reveals substantial improvements to database search speed and performance, particularly for complex methods like open or nonspecific searches. Finally, we evaluate our deisotoping method on data from several instrument types and vendors, revealing a wide range in performance and offering an updated perspective on deisotoping in the modern proteomics environment.
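
As a rough illustration of what deisotoping does, the sketch below collapses isotope envelopes into single monoisotopic peaks. It is a simple greedy procedure written for this example and is not the parallelized algorithm described in the abstract; the function name, tolerance, and charge range are assumptions, and real implementations also check that intensities follow a plausible isotope pattern.

```python
ISOTOPE_SPACING = 1.00335  # approximate spacing between isotope peaks (Da)


def deisotope(peaks, max_charge=3, tol=0.01):
    """Greedy deisotoping of (mz, intensity) peaks sorted by m/z.

    Returns a list of (monoisotopic_mz, charge, summed_intensity).
    Charge 0 means no isotope partner was found for that peak.
    """
    used = [False] * len(peaks)
    result = []
    for i, (mz, intensity) in enumerate(peaks):
        if used[i]:
            continue
        used[i] = True
        best = (mz, 0, intensity)
        # Try higher charges first so a z=2 envelope is not read as z=1.
        for z in range(max_charge, 0, -1):
            members = [i]
            expected = mz + ISOTOPE_SPACING / z
            for j in range(i + 1, len(peaks)):
                if used[j]:
                    continue
                if peaks[j][0] > expected + tol:
                    break  # peaks are sorted, nothing further can match
                if abs(peaks[j][0] - expected) <= tol:
                    members.append(j)
                    expected = peaks[j][0] + ISOTOPE_SPACING / z
            if len(members) > 1:
                for m in members:
                    used[m] = True
                best = (mz, z, sum(peaks[m][1] for m in members))
                break
        result.append(best)
    return result


# Hypothetical spectrum: a doubly charged envelope at m/z 500, a singly
# charged envelope at m/z 600, and an isolated peak at m/z 700.
peaks = [
    (500.0, 100), (500.5017, 60), (501.0034, 20),
    (600.0, 50), (601.0034, 10),
    (700.0, 5),
]
print(deisotope(peaks))
```

Six input peaks are reduced to three, which is the kind of complexity reduction the abstract credits with improving search speed and performance.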


Subjects
Algorithms; Databases, Protein; Search Engine; Mass Spectrometry; Proteomics; Software
12.
J Proteome Res ; 20(5): 2266-2282, 2021 05 07.
Article in English | MEDLINE | ID: mdl-33900085

ABSTRACT

Proteinaceous aggregates containing α-synuclein protein, called Lewy bodies, in the substantia nigra are a hallmark of Parkinson's disease. The molecular mechanisms of Lewy body formation and associated neuronal loss remain largely unknown. To gain insights into proteins and pathways associated with Lewy body pathology, we performed quantitative profiling of the proteome. We analyzed substantia nigra tissue from 51 subjects arranged into three groups: cases with Lewy body pathology, Lewy body-negative controls with matching neuronal loss, and controls with no neuronal loss. Using a label-free liquid chromatography-tandem mass spectrometry (LC-MS/MS) approach, we characterized the proteome both in terms of protein abundances and peptide modifications. Statistical testing for differential abundance of the most abundant 2963 proteins, followed by pathway enrichment and Bayesian learning of the causal network structure, was performed to identify likely drivers of Lewy body formation and dopaminergic neuronal loss. The identified pathways include (1) Arp2/3 complex-mediated actin nucleation; (2) synaptic function; (3) poly(A) RNA binding; (4) basement membrane and endothelium; and (5) hydrogen peroxide metabolic process. According to the data, the endothelial/basement membrane pathway is tightly connected with both pathologies and likely to be one of the drivers of neuronal loss. The poly(A) RNA-binding proteins, including the ones relevant to other neurodegenerative disorders (e.g., TDP-43 and FUS), have a strong inverse correlation with Lewy bodies and may reflect an alternative mechanism of nigral neurodegeneration.


Subjects
Lewy Bodies; Proteomics; Bayes Theorem; Chromatography, Liquid; Humans; Neurons/metabolism; Substantia Nigra/metabolism; Tandem Mass Spectrometry; alpha-Synuclein/genetics; alpha-Synuclein/metabolism
13.
Bioinformatics ; 35(2): 251-257, 2019 01 15.
Article in English | MEDLINE | ID: mdl-30649350

ABSTRACT

Motivation: Cross-linking technique coupled with mass spectrometry (MS) is widely used in the analysis of protein structures and protein-protein interactions. In order to identify cross-linked peptides from MS data, we need to consider all pairwise combinations of peptides, which is computationally prohibitive when the sequence database is large. To alleviate this problem, some heuristic screening strategies are used to reduce the number of peptide pairs during the identification. However, heuristic screening strategies may miss some true cross-linked peptides. Results: We directly tackle the combination challenge without using any screening strategies. With the data structure of double-ended queue, the proposed algorithm reduces the quadratic time complexity of exhaustive searching down to the linear time complexity. We implement the algorithm in a tool named Xolik. The running time of Xolik is validated using databases with different numbers of proteins. Experiments using synthetic and empirical datasets show that Xolik outperforms existing tools in terms of running time and statistical power. Availability and implementation: Source code and binaries of Xolik are freely available at http://bioinformatics.ust.hk/Xolik.html. Supplementary information: Supplementary data are available at Bioinformatics online.
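
One way a double-ended queue can turn an all-pairs search into a linear scan is the classic sliding-window maximum. The sketch below is an assumption about how such a search could be organized, not a reproduction of Xolik: it supposes that each candidate chain already has a mass and an independent score for the spectrum, that a pair's score is the sum of its two chain scores, and that `target` is the precursor mass minus the linker mass. All names are hypothetical.

```python
from collections import deque


def best_pair_score(masses, scores, target, tol):
    """Best scores[a] + scores[b] over pairs with |masses[a] + masses[b] - target| <= tol.

    `masses` must be sorted in ascending order. Runs in O(n) after sorting,
    because every index enters and leaves the deque at most once.
    Returns None if no pair fits the mass window.
    """
    n = len(masses)
    window = deque()  # partner indices; their scores decrease front to back
    nxt = n - 1       # next partner index to admit (moves toward lighter chains)
    best = None
    for a in range(n):
        lo = target - masses[a] - tol
        hi = target - masses[a] + tol
        # Admit partners that are now heavy enough for the window.
        while nxt >= 0 and masses[nxt] >= lo:
            while window and scores[window[-1]] <= scores[nxt]:
                window.pop()  # dominated: lighter partner scores at least as well
            window.append(nxt)
            nxt -= 1
        # Drop partners that have become too heavy.
        while window and masses[window[0]] > hi:
            window.popleft()
        if window:
            total = scores[a] + scores[window[0]]
            if best is None or total > best:
                best = total
    return best


# Hypothetical chains: masses in Da with per-chain scores.
masses = [100.0, 200.0, 300.0, 400.0, 500.0]
scores = [1.0, 5.0, 2.0, 4.0, 3.0]
print(best_pair_score(masses, scores, target=600.0, tol=0.5))
```

As the first chain gets heavier, the admissible partner window slides toward lighter masses, so the deque front always holds the best-scoring partner without rescanning the database for each chain.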


Subjects
Databases, Protein; Peptides/chemistry; Protein Interaction Mapping/methods; Proteins/chemistry; Software; Algorithms; Computational Biology; Mass Spectrometry
14.
Mol Cell Proteomics ; 17(5): 1010-1027, 2018 05.
Article in English | MEDLINE | ID: mdl-29440448

ABSTRACT

Protein acetylation, one of many types of post-translational modifications (PTMs), is involved in a variety of biological and cellular processes. In the present study, we applied both CsCl density gradient (CDG) centrifugation-based protein fractionation and a dimethyl-labeling-based 4C quantitative PTM proteomics workflow in the study of dynamic acetylproteomic changes in Arabidopsis. This workflow integrates the dimethyl chemical labeling with chromatography-based acetylpeptide separation and enrichment followed by mass spectrometry (MS) analysis, the extracted ion chromatogram (XIC) quantitation-based computational analysis of mass spectrometry data to measure dynamic changes of acetylpeptide level using an in-house software program, named Stable isotope-based Quantitation-Dimethyl labeling (SQUA-D), and finally the confirmation of ethylene hormone-regulated acetylation using immunoblot analysis. Eventually, using this proteomic approach, 7456 unambiguous acetylation sites were found from 2638 different acetylproteins, and 5250 acetylation sites, including 5233 sites on lysine side chain and 17 sites on protein N termini, were identified repetitively. Out of these repetitively discovered acetylation sites, 4228 sites on lysine side chain (i.e. 80.5%) are novel. These acetylproteins are exemplified by the histone superfamily, ribosomal and heat shock proteins, and proteins related to stress/stimulus responses and energy metabolism. The novel acetylproteins enriched by the CDG centrifugation fractionation contain many cellular trafficking proteins, membrane-bound receptors, and receptor-like kinases, which are mostly involved in brassinosteroid, light, gravity, and development signaling. In addition, we identified 12 highly conserved acetylation site motifs within histones, P-glycoproteins, actin depolymerizing factors, ATPases, transcription factors, and receptor-like kinases. 
Using SQUA-D software, we have quantified 33 ethylene hormone-enhanced and 31 hormone-suppressed acetylpeptide groups, also called unique PTM peptide arrays (UPAs), that share the identical unique PTM site pattern (UPSP). This CDG centrifugation protein fractionation, in combination with dimethyl labeling-based quantitative PTM proteomics and SQUA-D, may be applied in the quantitation of any PTM proteins in any model eukaryotes and agricultural crops as well as tissue samples of animals and human beings.


Subjects
Arabidopsis Proteins/metabolism; Arabidopsis/metabolism; Proteomics/methods; Staining and Labeling; Acetylation; Amino Acid Sequence; Chromatography, Liquid; Computational Biology; Ethylenes/pharmacology; Histones/metabolism; Methylation; Reproducibility of Results; Tandem Mass Spectrometry
15.
J Proteome Res ; 17(9): 3195-3213, 2018 09 07.
Article in English | MEDLINE | ID: mdl-30084631

ABSTRACT

An in planta chemical cross-linking-based quantitative interactomics (IPQCX-MS) workflow has been developed to investigate in vivo protein-protein interactions and alteration in protein structures in a model organism, Arabidopsis thaliana. A chemical cross-linker, azide-tag-modified disuccinimidyl pimelate (AMDSP), was directly applied onto Arabidopsis tissues. Peptides produced from protein fractions of CsCl density gradient centrifugation were dimethyl-labeled, from which the AMDSP cross-linked peptides were fractionated on chromatography, enriched, and analyzed by mass spectrometry. ECL2 and SQUA-D software were used to identify and quantitate these cross-linked peptides, respectively. These computer programs integrate peptide identification with quantitation and statistical evaluation. This workflow eventually identified 354 unique cross-linked peptides, including 61 and 293 inter- and intraprotein cross-linked peptides, respectively, demonstrating that it is able to in vivo identify hundreds of cross-linked peptides at an organismal level by overcoming the difficulties caused by multiple cellular structures and complex secondary metabolites of plants. Coimmunoprecipitation and super-resolution microscopy studies have confirmed the PHB3-PHB6 protein interaction found by IPQCX-MS. The quantitative interactomics also found hormone-induced structural changes of SBPase and other proteins. This mass-spectrometry-based interactomics will be useful in the study of in vivo protein-protein interaction networks in agricultural crops and plant-microbe interactions.


Subjects
Arabidopsis/metabolism; Gene Expression Regulation, Plant; Protein Interaction Mapping/methods; Proteome/metabolism; Repressor Proteins/metabolism; Amino Acid Sequence; Arabidopsis/genetics; Arabidopsis Proteins; Chromatography, Liquid; Cross-Linking Reagents/chemistry; Models, Molecular; Peptides/analysis; Peptides/chemistry; Prohibitins; Protein Binding; Protein Isoforms/chemistry; Protein Isoforms/genetics; Protein Isoforms/metabolism; Protein Structure, Secondary; Proteolysis; Proteome/chemistry; Proteome/genetics; Repressor Proteins/chemistry; Repressor Proteins/genetics; Staining and Labeling/methods; Succinimides/chemistry; Tandem Mass Spectrometry
16.
J Proteome Res ; 16(10): 3942-3952, 2017 10 06.
Article in English | MEDLINE | ID: mdl-28825304

ABSTRACT

Chemical cross-linking coupled to mass spectrometry is a powerful tool to study protein-protein interactions and protein conformations. Two linked peptides are ionized and fragmented to produce a tandem mass spectrum. In such an experiment, a tandem mass spectrum contains ions from two peptides. The peptide identification problem becomes a peptide-peptide pair identification problem. Currently, most tools do not search all possible pairs due to the quadratic time complexity. Consequently, missed findings are unavoidable. In our previous work, we developed a tool named ECL to search all pairs of peptides exhaustively. Unfortunately, it is very slow due to the quadratic computational complexity, especially when the database is large. Furthermore, ECL uses a score function without statistical calibration, while researchers (refs 1-3) have proposed that it is inappropriate to directly compare uncalibrated scores because different spectra have different random score distributions. Here we propose an advanced version of ECL, named ECL2. It achieves a linear time and space complexity by taking advantage of the additive property of a score function. It can search a data set containing tens of thousands of spectra against a database containing thousands of proteins in a few hours. Comparison with five other state-of-the-art tools shows that ECL2 is much faster than pLink, StavroX, ProteinProspector, and ECL. Kojak is the only one that is faster than ECL2, but Kojak does not exhaustively search all possible peptide pairs. The comparison shows that ECL2 has the highest sensitivity among the state-of-the-art tools. The experiment using a large-scale in vivo cross-linking data set demonstrates that ECL2 is the only tool that can find the peptide-spectrum matches (PSMs) passing the false discovery rate/q-value threshold. The result illustrates that the exhaustive search and a well-calibrated score function are useful to find PSMs from a huge search space.


Subjects
Cross-Linking Reagents/chemistry; Peptides/chemistry; Proteins/chemistry; Proteomics; Algorithms; Databases, Protein; Humans; Protein Conformation; Protein Interaction Maps/genetics; Software; Tandem Mass Spectrometry
17.
Proteomics ; 16(13): 1915-27, 2016 07.
Article in English | MEDLINE | ID: mdl-27198063

ABSTRACT

Site-specific chemical cross-linking in combination with mass spectrometry analysis has emerged as a powerful proteomic approach for studying the three-dimensional structure of protein complexes and in mapping protein-protein interactions (PPIs). Building on the success of MS analysis of in vitro cross-linked proteins, which has been widely used to investigate specific interactions of bait proteins and their targets in various organisms, we report a workflow for in vivo chemical cross-linking and MS analysis in a multicellular eukaryote. This approach optimizes the in vivo protein cross-linking conditions in Arabidopsis thaliana, establishes a MudPIT procedure for the enrichment of cross-linked peptides, and develops an integrated software program, exhaustive cross-linked peptides identification tool (ECL), to identify the MS spectra of in planta chemical cross-linked peptides. In total, two pairs of in vivo cross-linked peptides of high confidence have been identified from two independent biological replicates. This work demarks the beginning of an alternative proteomic approach in the study of in vivo protein tertiary structure and PPIs in multicellular eukaryotes.


Subjects
Arabidopsis Proteins/chemistry; Arabidopsis/metabolism; Cross-Linking Reagents/chemistry; Protein Interaction Mapping/methods; Proteomics/methods; Tandem Mass Spectrometry/methods; Amino Acid Sequence; Arabidopsis/chemistry; Arabidopsis Proteins/metabolism; Cross-Linking Reagents/metabolism; Models, Molecular; Peptides/analysis; Peptides/metabolism; Protein Conformation; Software
18.
BMC Bioinformatics ; 17(1): 217, 2016 May 20.
Article in English | MEDLINE | ID: mdl-27206479

ABSTRACT

BACKGROUND: Chemical cross-linking combined with mass spectrometry (CX-MS) is a high-throughput approach to studying protein-protein interactions. The number of peptide-peptide combinations grows quadratically with the number of proteins, resulting in high computational complexity. Widely used methods, including xQuest (Rinner et al., Nat Methods 5(4):315-8, 2008; Walzthoeni et al., Nat Methods 9(9):901-3, 2012), pLink (Yang et al., Nat Methods 9(9):904-6, 2012), ProteinProspector (Chu et al., Mol Cell Proteomics 9:25-31, 2010; Trnka et al., 13(2):420-34, 2014) and Kojak (Hoopmann et al., J Proteome Res 14(5):2190-2198, 2015), avoid searching all peptide-peptide combinations by pre-selecting peptides with heuristic approaches. However, pre-selection may cause true matches to be missed. The most intuitive approach is to search all possible candidates. A tool that can exhaustively search a whole database without any heuristic pre-selection procedure is therefore desirable. RESULTS: We have developed a cross-linked peptide identification tool named ECL. It can exhaustively search a whole database in a reasonable period of time without any heuristic pre-selection procedure. Tests showed that searching a database containing 5200 proteins took 7 h. ECL identified more non-redundant cross-linked peptides than xQuest, pLink, and ProteinProspector. Experiments showed that about 30% of these additionally identified peptides were not pre-selected by Kojak. We used protein crystal structures from the Protein Data Bank to check the intra-protein cross-linked peptides. Most of the distances between cross-linking sites were smaller than 30 Å. CONCLUSIONS: To the best of our knowledge, ECL is the first tool that can exhaustively search all candidates in cross-linked peptide identification. The experiments showed that ECL could identify more peptides than xQuest, pLink, and ProteinProspector. A further analysis indicated that some of the additional identifications were attributable to the exhaustive search.
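
The quadratic growth mentioned in the abstract can be illustrated with a small counting sketch. This is not ECL's algorithm. It only counts candidate pairs, under the assumptions (ours, not the paper's) that the number of peptides scales with the number of proteins and that a peptide may also be cross-linked to a copy of itself. The function name `count_peptide_pairs` is hypothetical.

```python
from math import comb


def count_peptide_pairs(num_peptides: int) -> int:
    """Count unordered peptide-peptide pairs, including self-pairs.

    Assumption: every peptide may pair with every other peptide and
    with a second copy of itself (homodimer-style cross-link).
    """
    return comb(num_peptides, 2) + num_peptides


if __name__ == "__main__":
    # Doubling the number of peptides roughly quadruples the pair count.
    for n in (1_000, 2_000, 4_000):
        print(n, count_peptide_pairs(n))
```

Because the pair count is n(n+1)/2 under these assumptions, an exhaustive search has to score on the order of n² candidates per spectrum unless precursor-mass filtering or pre-selection narrows the set.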


Subjects
Cross-Linking Reagents/chemistry , Databases, Protein , Peptides/chemistry , Search Engine , Humans
19.
J Proteome Res ; 15(12): 4423-4435, 2016 12 02.
Article in English | MEDLINE | ID: mdl-27748123

ABSTRACT

In computational proteomics, the identification of peptides with an unlimited number of post-translational modification (PTM) types is a challenging task. The computational cost of database search increases exponentially with the number of modified amino acids and linearly with the number of potential PTM types at each amino acid. The problem quickly becomes intractable if all possible PTM patterns are enumerated. To address this issue, one group of methods, named restricted tools (including Mascot, Comet, and MS-GF+), allows only a small number of PTM types in the database search process. The other group of methods, named unrestricted tools (including MS-Alignment, ProteinProspector, and MODa), avoids enumerating PTM patterns by using an alignment-based approach to localize and characterize modified amino acids. However, because of the large search space and the PTM localization issue, the sensitivity of these unrestricted tools is low. This paper proposes a novel method named PIPI to achieve PTM-invariant peptide identification. PIPI belongs to the category of unrestricted tools. It first codes peptide sequences into Boolean vectors and codes experimental spectra into real-valued vectors. For each coded spectrum, it then searches the coded sequence database to find the top-scored peptide sequences as candidates. After that, PIPI uses dynamic programming to localize and characterize modified amino acids in each candidate. We used simulation experiments and real data experiments to evaluate its performance in comparison with restricted tools (i.e., Mascot, Comet, and MS-GF+) and unrestricted tools (i.e., Mascot with error tolerant search, MS-Alignment, ProteinProspector, and MODa). Comparison with restricted tools shows that PIPI has comparable sensitivity and running speed. Comparison with unrestricted tools shows that PIPI has the highest sensitivity except for Mascot with error tolerant search and ProteinProspector. These two tools simplify the task by considering at most one modified amino acid in each peptide, which results in higher sensitivity but makes it difficult to handle peptides with multiple modified amino acids. The simulation experiments also show that PIPI has the lowest false discovery proportion, the highest PTM characterization accuracy, and the shortest running time among the unrestricted tools.
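
The cost argument at the start of the abstract can be made concrete with a simplified counting model. This is a sketch of why enumeration becomes intractable, not PIPI's method. It assumes (our simplification) that every residue can carry any one of the same number of PTM types. The function name `count_ptm_patterns` and its parameters are hypothetical.

```python
from math import comb


def count_ptm_patterns(peptide_length: int,
                       ptm_types_per_residue: int,
                       max_modified: int) -> int:
    """Count PTM patterns with at most `max_modified` modified residues.

    Simplified model: choose k of the residues to modify, then assign
    one of `ptm_types_per_residue` types to each chosen residue.
    """
    return sum(
        comb(peptide_length, k) * ptm_types_per_residue ** k
        for k in range(max_modified + 1)
    )


if __name__ == "__main__":
    # A 10-residue peptide with 5 candidate PTM types per residue.
    for k in range(4):
        print(k, count_ptm_patterns(10, 5, k))
```

In this model, allowing a single modified residue adds a term that is linear in the number of PTM types, while each additional modified residue multiplies the per-pattern choices by that number again. This matches the linear and exponential dependence described in the abstract.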


Subjects
Computational Biology/methods , Protein Processing, Post-Translational , Proteomics/methods , Algorithms , Amino Acid Sequence , Animals , Computational Biology/standards , Computer Simulation , Databases, Protein , Humans , Software/standards
20.
Nat Protoc ; 2024 May 20.
Article in English | MEDLINE | ID: mdl-38769142

ABSTRACT

Technological advances in mass spectrometry and proteomics have made it possible to perform larger-scale and more-complex experiments. The volume and complexity of the resulting data create major challenges for downstream analysis. In particular, next-generation data-independent acquisition (DIA) experiments enable wider proteome coverage than more traditional targeted approaches but require computational workflows that can manage much larger datasets and identify peptide sequences from complex and overlapping spectral features. Data-processing tools such as FragPipe, DIA-NN and Spectronaut have undergone substantial improvements to process spectral features in a reasonable time. Statistical analysis tools are needed to draw meaningful comparisons between experimental samples, but these tools were also originally designed with smaller datasets in mind. This protocol describes an updated version of MSstats that has been adapted to be compatible with large-scale DIA experiments. A very large DIA experiment, processed with FragPipe, is used as an example to demonstrate different MSstats workflows. The choice of workflow depends on the user's computational resources. For datasets that are too large to fit into a standard computer's memory, we demonstrate the use of MSstatsBig, a companion R package to MSstats. The protocol also highlights key decisions that have a major effect on both the results and the processing time of the analysis. The MSstats processing can be expected to take 1-3 h depending on the usage of MSstatsBig. The protocol can be run in the point-and-click graphical user interface MSstatsShiny or implemented with minimal coding expertise in R.
