ABSTRACT
Salidroside has anti-inflammatory and antiatherosclerotic effects, and mitochondrial homeostasis imbalance is closely related to cardiovascular disease. The aim of this study was to investigate the effect of salidroside on mitochondrial homeostasis after macrophage polarization and elucidate its possible mechanism against atherosclerosis. RAW264.7 cells were stimulated with 1 µg·mL⁻¹ lipopolysaccharide and 50 ng·mL⁻¹ IFN-γ to establish M1 polarization and were also pretreated with 400 µM salidroside. The relative expression of proinflammatory genes was detected by RT-PCR, whereas that of mitochondrial homeostasis-related proteins and nuclear factor kappa-B (NF-κB) was detected by Western blot (WB). Levels of intracellular reactive oxygen species (ROS), mitochondrial membrane potential, and mitochondrial mass were measured by chemifluorescence, whereas NF-κB nuclear translocation was detected by immunofluorescence. Compared with the Mφ group, the M1 group demonstrated increased mRNA expression of interleukin-1β, inducible nitric oxide synthase (iNOS), and tumor necrosis factor-α; increased protein expression of iNOS, NOD-like receptor protein 3, putative kinase 1, and NF-κB p65 but decreased protein expression of MFN2, Tom20, and PGC-1α; decreased mitochondrial membrane potential and mass; and increased ROS levels and NF-κB p65 nuclear translocation. Salidroside intervention decreased mRNA expression of interleukin-1β and tumor necrosis factor-α compared with the M1 group but did not affect that of iNOS. Furthermore, salidroside intervention prevented the changes in protein expression, mitochondrial membrane potential and mass, ROS levels, and NF-κB p65 nuclear translocation observed in the M1 group.
In summary, salidroside ultimately inhibits M1 macrophage polarization and maintains mitochondrial homeostasis after macrophage polarization by increasing mitochondrial membrane potential, decreasing ROS levels, inhibiting NF-κB activation, and in turn regulating the expression of proinflammatory factors and mitochondrial homeostasis-associated proteins.
Subjects
NF-kappa B, Tumor Necrosis Factor-alpha, NF-kappa B/metabolism, Interleukin-1beta/metabolism, Tumor Necrosis Factor-alpha/metabolism, Reactive Oxygen Species/metabolism, Macrophages, Lipopolysaccharides/pharmacology, Nitric Oxide Synthase/metabolism, Homeostasis, Messenger RNA/metabolism
ABSTRACT
We developed a high-throughput mass spectrometry method, pLink-SS (http://pfind.ict.ac.cn/software/pLink/2014/pLink-SS.html), for precise identification of disulfide-linked peptides. Using pLink-SS, we mapped all native disulfide bonds of a monoclonal antibody and ten standard proteins. We performed disulfide proteome analyses and identified 199 disulfide bonds in Escherichia coli and 568 in proteins secreted by human endothelial cells. We discovered many regulatory disulfide bonds involving catalytic or metal-binding cysteine residues.
Subjects
Disulfides/chemistry, Mass Spectrometry, Proteome/chemistry, Proteomics/methods, Amino Acid Sequence, Escherichia coli/chemistry, Humans, Molecular Models, Molecular Sequence Data, Peptide Library, Ribonucleases/chemistry
ABSTRACT
De novo peptide sequencing has improved remarkably, but sequencing full-length peptides with unexpected modifications is still a challenging problem. Here we present an open de novo sequencing tool, Open-pNovo, for de novo sequencing of peptides with arbitrary types of modifications. Although the search space increases by ~300 times, Open-pNovo is close to or even ~10-times faster than the other three proposed algorithms. Furthermore, considering top-1 candidates on three MS/MS data sets, Open-pNovo can recall over 90% of the results obtained by any one traditional algorithm and report 5-87% more peptides, including 14-250% more modified peptides. On a high-quality simulated data set, ~85% peptides with arbitrary modifications can be recalled by Open-pNovo, while hardly any results can be recalled by others. In summary, Open-pNovo is an excellent tool for open de novo sequencing and has great potential for discovering unexpected modifications in the real biological applications.
Subjects
Amino Acid Sequence/genetics, Peptides/genetics, Post-Translational Protein Processing/genetics, Algorithms, Protein Databases, Protein Sequence Analysis, Software, Tandem Mass Spectrometry
ABSTRACT
There has been tremendous progress in top-down proteomics (TDP) in the past 5 years, particularly in intact protein separation and high-resolution mass spectrometry. However, bioinformatics to deal with large-scale mass spectra has lagged behind, in both algorithmic research and software development. In this study, we developed pTop 1.0, a novel software tool to significantly improve the accuracy and efficiency of mass spectral data analysis in TDP. The precursor mass offers crucial clues to infer the potential post-translational modifications co-occurring on the protein, the reliability of which relies heavily on its mass accuracy. Concentrating on detecting the precursors more accurately, a machine-learning model incorporating a variety of spectral features was trained online in pTop via a support vector machine (SVM). pTop employs the sequence tags extracted from the MS/MS spectra and a dynamic programming algorithm to accelerate the search speed, especially for those spectra with multiple post-translational modifications. We tested pTop on three publicly available data sets and compared it with ProSight and MS-Align+ in terms of its recall, precision, running time, and so on. The results showed that pTop can, in general, outperform ProSight and MS-Align+. pTop recalled 22% more correct precursors, although it exported 30% fewer precursors than Xtract (in ProSight) from a human histone data set. The running speed of pTop was about 1 to 2 orders of magnitude faster than that of MS-Align+. This algorithmic advancement in pTop, including both accuracy and speed, will inspire the development of other similar software to analyze the mass spectra from the entire proteins.
Subjects
Protein Databases, Information Storage and Retrieval, Proteins/analysis, Algorithms, Machine Learning, Software
ABSTRACT
The proteome informatics research group of the Association of Biomolecular Resource Facilities conducted a study to assess the community's ability to detect and characterize peptides bearing a range of biologically occurring post-translational modifications when present in a complex peptide background. A data set derived from a mixture of synthetic peptides with biologically occurring modifications, combined with a yeast whole cell lysate as background, was distributed to a large group of researchers, and their results were collectively analyzed. The results from the twenty-four participants, who represented a broad spectrum of experience levels with this type of data analysis, produced several important observations. First, there is significantly more variability in the ability to assess whether a result is significant than in the ability to determine the correct answer. Second, labile post-translational modifications, particularly tyrosine sulfation, present a challenge for most researchers. Finally, many tools are being employed for modification site localization, but researchers are currently unsure of the reliability of the results these programs produce.
Subjects
Peptides/isolation & purification, Post-Translational Protein Processing/genetics, Proteome, Amino Acid Sequence/genetics, Complex Mixtures/chemistry, Complex Mixtures/genetics, Computational Biology, Humans, Peptides/chemistry, Peptides/metabolism, Protein Sequence Analysis
ABSTRACT
In relative protein abundance determination from peptide intensities recorded in full mass scans, a major complication that affects quantitation accuracy is signal interference from coeluting ions of similar m/z values. Here, we present pQuant, a quantitation software tool that solves this problem. pQuant detects interference signals and identifies, for each peptide, a pair of least-interfered isotopic chromatograms: one for the light and one for the heavy isotope-labeled peptide. On the basis of these isotopic pairs, pQuant calculates the relative heavy/light peptide ratios along with their 99.75% confidence intervals (CIs). From the peptide ratios and their CIs, pQuant estimates the protein ratios and associated CIs by kernel density estimation. We tested pQuant, Census, and MaxQuant on data sets obtained from mixtures (at varying mixing ratios from 10:1 to 1:10) of light- and heavy-SILAC labeled HeLa cells or 14N- and 15N-labeled Escherichia coli cells. pQuant quantitated more peptides with better accuracy than Census and MaxQuant in all 14 data sets. On the SILAC data sets, the nonquantified "NaN" (not a number) ratios generated by Census, MaxQuant, and pQuant accounted for 2.5-10.7%, 1.8-2.7%, and 0.01-0.5% of all ratios, respectively. On the 14N/15N data sets, which cannot be quantified by MaxQuant, Census and pQuant produced 0.9-10.0% and 0.3-2.9% NaN ratios, respectively. Excluding these NaN results, the standard deviations of the numerical ratios calculated by Census or MaxQuant are 30-100% larger than those by pQuant. These results show that pQuant outperforms Census and MaxQuant in SILAC and 15N-based quantitation.
Subjects
Peptides/chemistry, Proteins/chemistry, Escherichia coli/chemistry, HeLa Cells/chemistry, Humans, Isotopes, Mass Spectrometry, Nitrogen Isotopes, Nitrogen Radioisotopes, Software
ABSTRACT
The increasing need for mass spectrometric analysis of RNA molecules calls for a better understanding of their gas-phase fragmentation behaviors. In this study, we investigate the effect of terminal phosphate groups on the fragmentation spectra of RNA oligonucleotides (oligos) using high-resolution mass spectrometry (MS). Negative-ion mode collision-induced dissociation (CID) and higher-energy collisional dissociation (HCD) were carried out on RNA oligos containing a terminal phosphate group on either end, both ends, or neither end. We find that terminal phosphate groups affect the fragmentation behavior of RNA oligos in a way that is dependent on the precursor charge state and the oligo length. Specifically, for precursor ions of RNA oligos of the same sequence, those with 5'- or 3'-phosphate, or both, have a higher charge state distribution and lose the phosphate group(s) in the form of a neutral (H3PO4 or HPO3) or an anion ([H2PO4]- or [PO3]-) upon CID or HCD. Such a neutral or charged loss is most conspicuous for precursor ions of an intermediate charge state, e.g., 3- for 4-nt oligos or 4- and 5- for 8-nt oligos. This decreases the intensity of sequencing ions (a-, a-B, b-, c-, d-, w-, x-, y-, z-ions) and hence is unfavorable for sequencing by CID or HCD. Removal of terminal phosphate groups by calf intestinal alkaline phosphatase improved MS analysis of RNA oligos. Additionally, the intensity of a fragment ion at m/z 158.925, which we identified as a dehydrated pyrophosphate anion ([HP2O6]-), is markedly increased by the presence of a terminal phosphate group. These findings expand the knowledge base necessary for software development for MS analysis of RNA.
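The assignment of the m/z 158.925 fragment to [HP2O6]- can be checked with simple arithmetic. The sketch below is illustrative only: it uses standard monoisotopic atomic masses, and the helper names `neutral_mass` and `anion_mz` are made up for this example rather than taken from the study.

```python
# Standard monoisotopic masses (u) of the most abundant isotopes.
MONO = {"H": 1.00782503, "O": 15.99491462, "P": 30.97376200}
ELECTRON = 0.00054858  # electron mass (u)


def neutral_mass(composition):
    """Monoisotopic mass of a neutral species, e.g. {"H": 3, "P": 1, "O": 4}."""
    return sum(MONO[element] * count for element, count in composition.items())


def anion_mz(composition, charge=1):
    """m/z of an anion whose formula is `composition` plus `charge` extra electrons."""
    return (neutral_mass(composition) + charge * ELECTRON) / charge


# Dehydrated pyrophosphate anion [HP2O6]-
hp2o6 = anion_mz({"H": 1, "P": 2, "O": 6})
print(f"[HP2O6]-  m/z = {hp2o6:.3f}")

# Neutral phosphate losses mentioned in the abstract
print(f"H3PO4 loss = {neutral_mass({'H': 3, 'P': 1, 'O': 4}):.4f} Da")
print(f"HPO3 loss  = {neutral_mass({'H': 1, 'P': 1, 'O': 3}):.4f} Da")
```

The computed value for [HP2O6]- rounds to 158.925, in agreement with the fragment reported above.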
Subjects
Anions, Phosphates, RNA, Anions/chemistry, Phosphates/chemistry, RNA/chemistry, RNA/analysis, Oligonucleotides/chemistry, Oligonucleotides/analysis, Mass Spectrometry/methods, Tandem Mass Spectrometry/methods
ABSTRACT
De novo peptide sequencing is the only tool for extracting peptide sequences directly from tandem mass spectrometry (MS) data without any protein database. However, neither the accuracy nor the efficiency of de novo sequencing has been satisfactory, mainly due to incomplete fragmentation information in experimental spectra. Recent advances in MS technology have enabled acquisition of higher-energy collisional dissociation (HCD) and electron transfer dissociation (ETD) spectra of the same precursor. These spectra contain complementary fragmentation information and can be collected with high resolution and high mass accuracy. Taking advantage of these properties, we have developed a new algorithm called pNovo+, which greatly improves the accuracy and speed of de novo sequencing. On tryptic peptides, 86% of the topmost candidate sequences deduced by pNovo+ from HCD + ETD spectral pairs matched the database search results, and the success rate reached 95% if the top three candidates were included, which was much higher than using only HCD (87%) or only ETD spectra (57%). On Asp-N, Glu-C, or Elastase digested peptides, 69-87% of the HCD + ETD spectral pairs were correctly identified by pNovo+ among the topmost candidates, or 84-95% among the top three. On average, it takes pNovo+ only 0.018 s to extract the sequence from a spectrum or spectral pair on a common personal computer. This is more than three times as fast as other de novo sequencing programs. The increase in speed is mainly due to pDAG, a component algorithm of pNovo+. pDAG finds the k longest paths in a directed acyclic graph without the antisymmetry restriction. We have verified that the antisymmetry restriction is unnecessary for high resolution, high mass accuracy data. The extensive use of HCD and ETD spectral information and the pDAG algorithm make pNovo+ an excellent de novo sequencing tool.
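The idea of finding the k longest paths in a directed acyclic graph can be illustrated with a generic dynamic program over a topological order. This is a textbook-style sketch, not the pDAG algorithm itself, whose details are not given in the abstract. The function name, the edge representation, and the toy graph are assumptions made for illustration.

```python
import heapq
from graphlib import TopologicalSorter  # Python 3.9+


def k_longest_paths(edges, source, sink, k):
    """Return up to k (score, path) tuples from source to sink, best first.

    edges: dict mapping node -> list of (successor, weight).
    Each node keeps only its k best partial paths, which is sufficient
    because every path into a node extends a path into a predecessor.
    """
    # TopologicalSorter expects a node -> predecessors mapping.
    preds = {}
    for u, outs in edges.items():
        preds.setdefault(u, set())
        for v, _ in outs:
            preds.setdefault(v, set()).add(u)

    best = {node: [] for node in preds}
    best[source] = [(0.0, (source,))]

    for u in TopologicalSorter(preds).static_order():
        for v, w in edges.get(u, []):
            extended = [(score + w, path + (v,)) for score, path in best[u]]
            best[v] = heapq.nlargest(k, best[v] + extended)
    return best.get(sink, [])


# Toy graph: nodes stand in for spectrum-graph vertices, weights for edge scores.
edges = {
    0: [(1, 2.0), (2, 1.0)],
    1: [(2, 2.0), (3, 1.0)],
    2: [(3, 3.0)],
    3: [],
}
top2 = k_longest_paths(edges, source=0, sink=3, k=2)
print(top2)  # [(7.0, (0, 1, 2, 3)), (4.0, (0, 2, 3))]
```

Keeping a bounded list per node makes the cost grow roughly linearly with k and the number of edges, which is one plausible reason a path-based formulation can be fast.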
Subjects
Algorithms, Peptides/isolation & purification, Protein Sequence Analysis/standards, Tandem Mass Spectrometry/standards, Amino Acid Sequence, Animals, Protein Databases, Humans, Metalloendopeptidases/chemistry, Molecular Sequence Data, Pancreatic Elastase/chemistry, Peptides/chemistry, Sensitivity and Specificity, Protein Sequence Analysis/methods, Serine Endopeptidases/chemistry, Trypsin/chemistry
ABSTRACT
Identification of proteins and their modifications via liquid chromatography-tandem mass spectrometry is an important task for the field of proteomics. However, because of the complexity of tandem mass spectra, the majority of the spectra cannot be identified. The presence of unanticipated protein modifications is among the major reasons for the low spectral identification rate. The conventional database search approach to protein identification has inherent difficulties in comprehensive detection of protein modifications. In recent years, increasing efforts have been devoted to developing unrestrictive approaches to modification identification, but they often suffer from their lack of speed. This paper presents a statistical algorithm named DeltAMT (Delta Accurate Mass and Time) for fast detection of abundant protein modifications from tandem mass spectra with high-accuracy precursor masses. The algorithm is based on the fact that the modified and unmodified versions of a peptide are usually present simultaneously in a sample and their spectra are correlated with each other in precursor masses and retention times. By representing each pair of spectra as a delta mass and time vector, bivariate Gaussian mixture models are used to detect modification-related spectral pairs. Unlike previous approaches to unrestrictive modification identification that mainly rely upon the fragment information and the mass dimension in liquid chromatography-tandem mass spectrometry, the proposed algorithm makes the most of precursor information. Thus, it is highly efficient while being accurate and sensitive. On two published data sets, the algorithm effectively detected various modifications and other interesting events, yielding deep insights into the data. Based on these discoveries, the spectral identification rates were significantly increased and many modified peptides were identified.
Subjects
Algorithms, Liquid Chromatography/methods, Post-Translational Protein Processing, Proteome/chemistry, Tandem Mass Spectrometry/methods, Liquid Chromatography/standards, Statistical Data Interpretation, Protein Databases, Fungal Proteins/chemistry, HeLa Cells, Humans, Molecular Weight, Reference Standards, Tandem Mass Spectrometry/standards
ABSTRACT
Mass spectrometry (MS)-based analysis of RNA oligonucleotides (oligos) plays an increasingly important role in the development of RNA therapeutics and epitranscriptomics research. However, MS fragmentation behaviors of RNA oligomers are understood insufficiently. Herein, we characterized the negative-ion-mode fragmentation behaviors of 26 synthetic RNA oligos containing four to eight nucleotides using collision-induced dissociation (CID) on a high-resolution, accurate-mass instrument. We found that in CID spectra acquired under the normalized collision energy (NCE) of 35%, approximately 70% of the total peak intensity was attributed to sequencing ions (a-B, a, b, c, d, w, x, y, z), around 25% of the peak intensity came from precursor ions that experienced complete or partial loss of a nucleobase in the form of either a neutral or an anion, and the remainder were internal ions and anionic nucleobases. The top five sequencing ions were the y, c, w, a-B, and a ions. Furthermore, we observed that CID fragmentation behaviors of RNA oligos were significantly impacted by their precursor charge. Specifically, when the precursors had a charge from 1- to 5-, the fractional intensity of sequencing ions decreased, while that of precursors that underwent either neutral or charged losses of a nucleobase increased. Additionally, we found that RNA oligos containing 3'-U tended to produce precursors with HNCO and/or NCO- losses, which presumably corresponded to isocyanic acid and cyanate anion, respectively. These findings provide valuable insights for better comprehending the mechanism behind RNA fragmentation by MS/MS, thereby facilitating the future automated identification of RNA oligos based on their CID spectra in a more efficient manner.
Subjects
Oligonucleotides, Tandem Mass Spectrometry, Oligonucleotides/chemistry, Tandem Mass Spectrometry/methods, RNA, Ions/chemistry, Anions, Electrospray Ionization Mass Spectrometry
ABSTRACT
Determining the monoisotopic peak of a precursor is a first step in interpreting mass spectra, which is basic but non-trivial. The reason is that in the isolation window of a precursor, other peaks interfere with the determination of the monoisotopic peak, leading to wrong mass-to-charge ratio or charge state. Here we propose a method, named pParse, to export the most probable monoisotopic peaks for precursors, including co-eluted precursors. We use the relationship between the position of the highest peak and the mass of the first peak to detect candidate clusters. Then, we extract three features to sort the candidate clusters: (i) the sum of the intensity, (ii) the similarity of the experimental and the theoretical isotopic distribution, and (iii) the similarity of elution profiles. We showed that the recall of pParse, MaxQuant, and BioWorks was 98-98.8%, 0.5-17%, and 1.8-36.5% at the same precision, respectively. About 50% of tandem mass spectra are triggered by multiple precursors which are difficult to identify. Then we design a new scoring function to identify the co-eluted precursors. About 26% of all identified peptides were exclusively from co-eluted peptides. Therefore, accurately determining monoisotopic peaks, including co-eluted precursors, can greatly increase peptide identification rate.
Subjects
Peptides/analysis, Proteomics/methods, Software, Tandem Mass Spectrometry/methods, Algorithms, HeLa Cells/chemistry, Humans, Peptides/chemistry, Protein Precursors/analysis, Protein Precursors/chemistry, Reproducibility of Results, Search Engine, Sensitivity and Specificity, Time Factors, Yeasts/chemistry
ABSTRACT
The remarkable advancement of top-down proteomics in the past decade has been driven by technological developments in separation, mass spectrometry (MS) instrumentation, novel fragmentation, and bioinformatics. However, the accurate identification and quantification of proteoforms, all clearly defined molecular forms of protein products from a single gene, remain a challenging computational task. This is in part due to the complicated mass spectra from intact proteoforms when compared to those from the digested peptides. Herein, pTop 2.0 is developed to fill the gap between the large-scale complex top-down MS data and the shortage of high-accuracy bioinformatic tools. Compared with pTop 1.0, the first version, pTop 2.0 concentrates mainly on the identification of proteoforms with unexpected modifications or a terminal truncation. Quantitation based on isotopic labeling is also a new function, which can be carried out by the convenient and user-friendly "one-key operation," integrated with the qualitative identifications. The accuracy and running speed of pTop 2.0 are significantly improved on the test data sets. This chapter will introduce the main features, step-by-step running operations, and algorithmic developments of pTop 2.0 in order to push the identification and quantitation of intact proteoforms to a higher-accuracy level in top-down proteomics.
Subjects
Proteome, Proteomics, Mass Spectrometry, Proteome/metabolism, Proteomics/methods
ABSTRACT
MOTIVATION: Identification of post-translationally modified proteins has become one of the central issues of current proteomics. Spectral library search is a new and promising computational approach to mass spectrometry-based protein identification. However, its potential in identification of unanticipated post-translational modifications has rarely been explored. The existing spectral library search tools are designed to match the query spectrum to the reference library spectra with the same peptide mass. Thus, spectra of peptides with unanticipated modifications cannot be identified. RESULTS: In this article, we present an open spectral library search tool, named pMatch. It extends the existing library search algorithms in at least three aspects to support the identification of unanticipated modifications. First, the spectra in library are optimized with the full peptide sequence information to better tolerate the peptide fragmentation pattern variations caused by some modification(s). Second, a new scoring system is devised, which uses charge-dependent mass shifts for peak matching and combines a probability-based model with the general spectral dot-product for scoring. Third, a target-decoy strategy is used for false discovery rate control. To demonstrate the effectiveness of pMatch, a library search experiment was conducted on a public dataset with over 40,000 spectra in comparison with SpectraST, the most popular library search engine. Additional validations were done on four published datasets including over 150,000 spectra. The results showed that pMatch can effectively identify unanticipated modifications and significantly increase spectral identification rate. AVAILABILITY: http://pfind.ict.ac.cn/pmatch/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
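The target-decoy strategy mentioned for false discovery rate control can be sketched in a few lines. This shows the general, widely used form of the technique and is not pMatch's actual implementation. The function name, the (score, is_decoy) input format, and the example scores are assumptions made for illustration.

```python
def accept_at_fdr(matches, max_fdr=0.01):
    """Return the scores of target matches accepted at the given FDR.

    matches: iterable of (score, is_decoy) pairs, one per spectrum match.
    FDR at a score cutoff is estimated as (#decoys) / (#targets) above it;
    the lowest cutoff whose estimate stays within max_fdr is used.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    keep = 0  # number of top-ranked matches that pass the threshold
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            keep = i
    return [score for score, is_decoy in ranked[:keep] if not is_decoy]


# Made-up scores purely for demonstration.
matches = [
    (9.1, False), (8.7, False), (8.0, False), (7.5, True),
    (7.2, False), (6.9, False), (6.0, True), (5.5, False),
]
accepted = accept_at_fdr(matches, max_fdr=0.25)
print(accepted)  # [9.1, 8.7, 8.0, 7.2, 6.9]
```

Real tools typically refine this estimate (for example with q-values or separate thresholds per match class), so the cutoff logic here should be read as the simplest variant.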
Subjects
Mass Spectrometry/methods, Post-Translational Protein Processing, Proteins/chemistry, Proteomics/methods, Protein Databases
ABSTRACT
Nuclear magnetic resonance/liquid chromatography-mass spectroscopy parallel dynamic spectroscopy (NMR/LC-MS PDS) is a method aimed at the simultaneous structural identification of natural products in complex mixtures. In this study, the method is illustrated with respect to 1H NMR and rapid resolution liquid chromatography-mass spectroscopy (RRLC-MS) data acquired from the crude extract of Anoectochilus roxburghii, which was separated by reversed-phase preparative chromatography into a series of fractions with dynamically varying constituent concentrations. Through fraction ranges and intensity changing profiles in the 1H NMR/RRLC-MS PDS spectrum, 1H NMR and extracted ion chromatogram (XIC) signals deriving from the same individual constituent were correlated, owing to the signal amplitude co-variation that results from the concentration variation of constituents across a series of incompletely separated fractions. 1H NMR/RRLC-MS PDS was then successfully used to identify three types of natural products, including eight flavonoids, four organic acids, and p-hydroxybenzaldehyde, five of which have not previously been reported in Anoectochilus roxburghii. In addition, two groups of co-eluted compounds were successfully identified. The results indicate that this approach should be of benefit in the unequivocal structural determination of a variety of classes of compounds from extremely complex mixtures, such as herbs and biological samples, which will lead to improved efficiency in the identification of new potential lead compounds.
Subjects
Biological Products/chemistry, Orchidaceae/chemistry, Plant Extracts/chemistry, Biological Products/isolation & purification, High Pressure Liquid Chromatography, Magnetic Resonance Spectroscopy, Mass Spectrometry, Orchidaceae/metabolism, Plant Extracts/isolation & purification
ABSTRACT
BACKGROUND: Tandem mass spectrometry-based database searching has become an important technology for peptide and protein identification. One of the key challenges in database searching is the remarkable increase in computational demand, brought about by the expansion of protein databases, semi- or non-specific enzymatic digestion, post-translational modifications, and other factors. Some software tools use peptide indexing to accelerate processing. However, peptide indexing requires a large amount of time and space for construction, especially for non-specific digestion. Additionally, it is inflexible to use. RESULTS: We developed an algorithm based on the longest common prefix (ABLCP) to efficiently organize a protein sequence database. The longest common prefix is a data structure that is always coupled to the suffix array. It eliminates redundant candidate peptides in databases and reduces the corresponding number of peptide-spectrum matches, thereby decreasing the identification time. This algorithm is based on the property of the longest common prefix. Although enzymatic digestion poses a challenge to this property, some adjustments can be made to the algorithm to ensure that no candidate peptides are omitted. Compared with peptide indexing, ABLCP requires much less time and space for construction and is subject to fewer restrictions. CONCLUSIONS: The ABLCP algorithm can help to improve data analysis efficiency. A software tool implementing this algorithm is available at http://pfind.ict.ac.cn/pfind2dot5/index.htm.
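How a longest-common-prefix (LCP) array paired with a suffix array can remove redundant candidate peptides is easiest to see on a toy example. The sketch below is a simplified illustration rather than the ABLCP implementation: it enumerates fixed-length substrings and ignores enzymatic digestion rules. The function names, the `$` separator, and the sample sequences are assumptions.

```python
def suffix_array(text):
    """Naive suffix array: start positions sorted by suffix (fine for small inputs)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """Kasai's algorithm: lcp[r] is the common-prefix length of the suffixes
    at ranks r-1 and r in the suffix array (lcp[0] is 0)."""
    n = len(text)
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1
    return lcp


def distinct_peptides(proteins, length, sep="$"):
    """List each distinct substring of the given length exactly once.

    Suffixes sharing a prefix are adjacent in the suffix array, so a
    substring is new only when the LCP with the previous suffix is shorter
    than `length`; otherwise it duplicates one already emitted.
    """
    text = sep.join(proteins) + sep
    sa = suffix_array(text)
    lcp = lcp_array(text, sa)
    result = []
    for r, start in enumerate(sa):
        peptide = text[start:start + length]
        if len(peptide) < length or sep in peptide:
            continue  # too short, or spans a protein boundary
        if lcp[r] < length:
            result.append(peptide)
    return result


peptides = distinct_peptides(["MKVLA", "KVLAM"], length=3)
print(peptides)  # ['KVL', 'LAM', 'MKV', 'VLA']
```

The two toy proteins share "KVL" and "VLA", and each is reported once, so each would be scored against a spectrum only once.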
Subjects
Protein Databases, Proteins/chemistry, Tandem Mass Spectrometry/methods, Algorithms, Peptide Mapping, Peptides/chemistry, Protein Sequence Analysis/methods
ABSTRACT
In recent years, electron transfer dissociation (ETD) has enjoyed widespread applications from sequencing of peptides with or without post-translational modifications to top-down analysis of intact proteins. However, peptide identification rates from ETD spectra compare poorly with those from collision induced dissociation (CID) spectra, especially for doubly charged precursors. This is in part due to an insufficient understanding of the characteristics of ETD and consequently a failure of database search engines to make use of the rich information contained in the ETD spectra. In this study, we statistically characterized ETD fragmentation patterns from a collection of 461 440 spectra and subsequently implemented our findings into pFind, a database search engine developed earlier for CID data. From ETD spectra of doubly charged precursors, pFind 2.1 identified 63-122% more unique peptides than Mascot 2.2 under the same 1% false discovery rate. For higher charged peptides as well as phosphopeptides, pFind 2.1 also consistently obtained more identifications. Of the features built into pFind 2.1, the following two greatly enhanced its performance: (1) refined automatic detection and removal of high-intensity peaks belonging to the precursor, charge-reduced precursor, or related neutral loss species, whose presence often set spectral matching askew; (2) a thorough consideration of hydrogen-rearranged fragment ions such as z + H and c - H for peptide precursors of different charge states. Our study has revealed that different charge states of precursors result in different hydrogen rearrangement patterns. For a fragment ion, its propensity of gaining or losing a hydrogen depends on (1) the ion type (c or z) and (2) the size of the fragment relative to the precursor, and both dependencies are affected by (3) the charge state of the precursor. 
In addition, we discovered ETD characteristics that are unique for certain types of amino acids (AAs), such as a prominent neutral loss of SCH2CONH2 (90.0014 Da) from z ions with a carbamidomethylated cysteine at the N-terminus and a neutral loss of histidine side chain C4N2H5 (81.0453 Da) from precursor ions containing histidine. The comprehensive list of ETD characteristics summarized in this paper should be valuable for automated database search, de novo peptide sequencing, and manual spectral validation.
Subjects
Mass Spectrometry/methods, Peptides/analysis, Proteomics/methods, Amino Acid Sequence, Electron Transport, Molecular Sequence Data, Peptides/chemistry, Phosphopeptides/analysis, Phosphopeptides/chemistry, Reproducibility of Results
ABSTRACT
De novo peptide sequencing has improved remarkably in the past decade as a result of better instruments and computational algorithms. However, de novo sequencing can correctly interpret only approximately 30% of high- and medium-quality spectra generated by collision-induced dissociation (CID), which is much less than database search. This is mainly due to incomplete fragmentation and overlap of different ion series in CID spectra. In this study, we show that higher-energy collisional dissociation (HCD) is of great help to de novo sequencing because it produces high mass accuracy tandem mass spectrometry (MS/MS) spectra without the low-mass cutoff associated with CID in ion trap instruments. Besides, abundant internal and immonium ions in the HCD spectra can help differentiate similar peptide sequences. Taking advantage of these characteristics, we developed an algorithm called pNovo for efficient de novo sequencing of peptides from HCD spectra. pNovo gave correct identifications to 80% or more of the HCD spectra identified by database search. The number of correct full-length peptides sequenced by pNovo is comparable with that obtained by database search. A distinct advantage of de novo sequencing is that deamidated peptides and peptides with amino acid mutations can be identified efficiently without extra cost in computation. In summary, implementation of the HCD characteristics makes pNovo an excellent tool for de novo peptide sequencing from HCD spectra.
Subjects
Algorithms, Peptide Fragments/chemistry, Protein Sequence Analysis/methods, Tandem Mass Spectrometry/methods, Amino Acid Sequence, Animals, Cattle, Chickens, Data Mining, Protein Databases, Escherichia coli Proteins, Molecular Sequence Data, Proteins/chemistry, Rabbits, Software, Glycine max
ABSTRACT
Database searching is the technique of choice for shotgun proteomics, and to date much research effort has been spent on improving its effectiveness. However, database searching faces a serious efficiency challenge, given the large numbers of mass spectra and the ever-faster growth of peptide databases resulting from genome translations, enzymatic digestions, and post-translational modifications. In this study, we conducted systematic research on speeding up database search engines for protein identification and illustrate the key points with the specific design of the pFind 2.1 search engine as a running example. First, by constructing peptide indexes, pFind achieves a two- to threefold speedup compared with searching without peptide indexes. Second, by constructing indexes for observed precursor and fragment ions, pFind achieves a further twofold speedup. As a result, pFind compares very favorably with predominant search engines such as Mascot, SEQUEST, and X!Tandem.
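One common form of peptide index is a list of peptides sorted by mass, so that the candidates for a precursor can be found by binary search instead of scanning the whole database. The sketch below illustrates that general idea only and is not pFind's index design. The class name, the ppm tolerance parameter, and the example peptides are assumptions, and the residue table uses standard monoisotopic masses.

```python
from bisect import bisect_left, bisect_right

# Standard monoisotopic residue masses (Da).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER


class PeptideIndex:
    """Peptides sorted by mass; lookups cost O(log n) plus the hits returned."""

    def __init__(self, peptides):
        entries = sorted((peptide_mass(p), p) for p in set(peptides))
        self.masses = [mass for mass, _ in entries]
        self.peptides = [pep for _, pep in entries]

    def candidates(self, precursor_mass, tol_ppm=20.0):
        """Peptides whose mass lies within tol_ppm of the precursor mass."""
        delta = precursor_mass * tol_ppm * 1e-6
        lo = bisect_left(self.masses, precursor_mass - delta)
        hi = bisect_right(self.masses, precursor_mass + delta)
        return self.peptides[lo:hi]


index = PeptideIndex(["GASP", "PSAG", "VTK", "LLK", "GASP"])
print(sorted(index.candidates(330.1539)))  # ['GASP', 'PSAG']
print(index.candidates(346.2216))          # ['VTK']
```

The same sorted-array-plus-binary-search pattern could in principle be applied to observed precursor or fragment ion masses, which is the second kind of index the abstract describes.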
Subjects
Data Mining/methods, Protein Databases, Peptide Fragments/chemistry, Proteins/chemistry, Tandem Mass Spectrometry/methods, Algorithms, Blood Proteins/chemistry, Computer Simulation, Database Management Systems, Fungal Proteins/chemistry, Humans, Proteomics/methods
ABSTRACT
We describe pLink 2, a search engine with higher speed and reliability for proteome-scale identification of cross-linked peptides. With a two-stage open search strategy facilitated by fragment indexing, pLink 2 is ~40 times faster than pLink 1 and 3 to 10 times faster than Kojak. Furthermore, using simulated datasets, synthetic datasets, 15N metabolically labeled datasets, and entrapment databases, four analysis methods were designed to evaluate the credibility of ten state-of-the-art search engines. This systematic evaluation shows that pLink 2 outperforms the other search engines in precision and sensitivity, especially at proteome scales. Lastly, re-analysis of four published proteome-scale cross-linking datasets with pLink 2 required only a fraction of the time used by pLink 1, with up to 27% more cross-linked residue pairs identified. pLink 2 is therefore an efficient and reliable tool for cross-linking mass spectrometry analysis, and the systematic evaluation methods described here will be useful for future software development.
Subjects
Peptides/chemistry, Proteome/chemistry, Search Engine/methods, Algorithms, Animals, Protein Databases, Humans, Proteomics, Software
ABSTRACT
Chemical cross-linking of proteins coupled with mass spectrometry analysis (CXMS) is widely used to study protein-protein interactions (PPI), protein structures, and even protein dynamics. However, structural information provided by CXMS is still limited, partly because most CXMS experiments use lysine-lysine (K-K) cross-linkers. Although superb in selectivity and reactivity, they are ineffective for lysine deficient regions. Herein, we develop aromatic glyoxal cross-linkers (ArGOs) for arginine-arginine (R-R) cross-linking and the lysine-arginine (K-R) cross-linker KArGO. The R-R or K-R cross-links generated by ArGO or KArGO fit well with protein crystal structures and provide information not attainable by K-K cross-links. KArGO, in particular, is highly valuable for CXMS, with robust performance on a variety of samples including a kinase and two multi-protein complexes. In the case of the CNGP complex, KArGO cross-links covered as much of the PPI interface as R-R and K-K cross-links combined and improved the accuracy of Rosetta docking substantially.