ABSTRACT
We describe pLink 2, a search engine with higher speed and reliability for proteome-scale identification of cross-linked peptides. With a two-stage open search strategy facilitated by fragment indexing, pLink 2 is ~40 times faster than pLink 1 and 3 to 10 times faster than Kojak. Furthermore, using simulated datasets, synthetic datasets, 15N metabolically labeled datasets, and entrapment databases, four analysis methods were designed to evaluate the credibility of ten state-of-the-art search engines. This systematic evaluation shows that pLink 2 outperforms these methods in precision and sensitivity, especially at proteome scales. Lastly, re-analysis of four published proteome-scale cross-linking datasets with pLink 2 required only a fraction of the time used by pLink 1, with up to 27% more cross-linked residue pairs identified. pLink 2 is therefore an efficient and reliable tool for cross-linking mass spectrometry analysis, and the systematic evaluation methods described here will be useful for future software development.
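The abstract mentions fragment indexing without giving details. As a rough illustration of the general idea only (not pLink 2's actual implementation), the sketch below bins theoretical b- and y-ion m/z values into an inverted index and then counts, for each candidate peptide, how many observed peaks hit its fragments. The bin width, the function names, and the three toy peptides are all assumptions made for this example.

```python
from collections import defaultdict

# Monoisotopic residue masses (Da) for the few amino acids used below.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
    "R": 156.10111,
}
PROTON = 1.007276
WATER = 18.010565
BIN_WIDTH = 0.02  # hypothetical bin width in Da, chosen for illustration


def fragment_mzs(peptide):
    """Singly charged b- and y-ion m/z values of an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    mzs = []
    prefix = 0.0
    for m in masses[:-1]:
        prefix += m
        mzs.append(prefix + PROTON)                  # b ion
        mzs.append(total - prefix + WATER + PROTON)  # y ion
    return mzs


def build_index(peptides):
    """Map each m/z bin to the set of peptide ids that have a fragment there."""
    index = defaultdict(set)
    for pid, pep in enumerate(peptides):
        for mz in fragment_mzs(pep):
            index[int(mz / BIN_WIDTH)].add(pid)
    return index


def coarse_score(index, peaks, top_n=2):
    """Count matched peaks per peptide; also check the two adjacent bins."""
    counts = defaultdict(int)
    for mz in peaks:
        b = int(mz / BIN_WIDTH)
        hit = set()
        for nb in (b - 1, b, b + 1):
            hit |= index.get(nb, set())
        for pid in hit:
            counts[pid] += 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


peptides = ["GASPVK", "LEVR", "AAGK"]      # toy database
index = build_index(peptides)
peaks = fragment_mzs("LEVR")               # idealized, noise-free spectrum
ranked = coarse_score(index, peaks)
```

Because the lookup cost depends on the number of peaks rather than on the number of candidate peptides, an index of this kind can shortlist candidates quickly before a slower fine-scoring step.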
Subjects
Peptides/chemistry , Proteome/chemistry , Search Engine/methods , Algorithms , Animals , Protein Databases , Humans , Proteomics , Software
ABSTRACT
Disulfide bonds are vital for protein functions, but locating the linkage sites has been a challenge in protein chemistry, especially when the quantity of a sample is small or the complexity is high. In 2015, our laboratory developed a sensitive and efficient method for mapping protein disulfide bonds from simple or complex samples (Lu et al. in Nat Methods 12:329, 2015). This method is based on liquid chromatography-mass spectrometry (LC-MS) and a powerful data analysis software tool named pLink. To facilitate application of this method, we present step-by-step disulfide mapping protocols for three types of samples: purified proteins in solution, proteins in SDS-PAGE gels, and complex protein mixtures in solution. The minimum amount of protein required for this method can be as low as several hundred nanograms for purified proteins, or tens of micrograms for a mixture of hundreds of proteins. The entire workflow, from sample preparation to LC-MS and data analysis, is described in great detail. We believe that this protocol can be easily implemented in any laboratory with access to a fast-scanning, high-resolution, and accurate-mass LC-MS system.
ABSTRACT
To improve chemical cross-linking of proteins coupled with mass spectrometry (CXMS), we developed a lysine-targeted enrichable cross-linker containing a biotin tag for affinity purification, a chemical cleavage site to separate cross-linked peptides away from biotin after enrichment, and a spacer arm that can be labeled with stable isotopes for quantitation. By locating the flexible proteins on the surface of 70S ribosome, we show that this trifunctional cross-linker is effective at attaining structural information not easily attainable by crystallography and electron microscopy. From a crude Rrp46 immunoprecipitate, it helped identify two direct binding partners of Rrp46 and 15 protein-protein interactions (PPIs) among the co-immunoprecipitated exosome subunits. Applying it to E. coli and C. elegans lysates, we identified 3130 and 893 inter-linked lysine pairs, representing 677 and 121 PPIs. Using a quantitative CXMS workflow we demonstrate that it can reveal changes in the reactivity of lysine residues due to protein-nucleic acid interaction.
Subjects
Cross-Linking Reagents/metabolism , Protein Interaction Mapping/methods , Protein Interaction Maps , Animals , Caenorhabditis elegans/chemistry , Caenorhabditis elegans/physiology , Caenorhabditis elegans Proteins/analysis , Caenorhabditis elegans Proteins/chemistry , Escherichia coli/chemistry , Escherichia coli/physiology , Escherichia coli Proteins/analysis , Escherichia coli Proteins/chemistry , Protein Conformation , Ribosomes/chemistry
ABSTRACT
Chemical cross-linking of proteins coupled with mass spectrometry (CXMS) is a powerful tool to study protein folding and to map the interfaces between interacting proteins. The most commonly used cross-linkers in CXMS are BS(3) and DSS, which have similar structures and generate the same linkages between pairs of lysine residues in spatial proximity. However, there are cases where no cross-linkable lysine pairs are present at certain regions of a protein or at the interface of two interacting proteins. In order to find the cross-linkers that can best complement the performance of BS(3) and DSS, we tested seven additional cross-linkers that either have different spacer arm structures or that target different amino acids (BS(2)G, EGS, AMAS, GMBS, Sulfo-GMBS, EDC, and TFCS). Using BSA, aldolase, the yeast H/ACA protein complex, and E. coli 70S ribosomes, we showed that, in terms of providing structural information not obtained through the use of BS(3) and DSS, EGS and Sulfo-GMBS worked better than the other cross-linkers that we tested. EGS generated a large number of cross-links not seen with the other amine-specific cross-linkers, possibly due to its hydrophilic spacer arm. We demonstrate that incorporating the cross-links contributed by the EGS and amine-sulfhydryl cross-linkers greatly increased the accuracy of Rosetta in docking the structure of the yeast H/ACA protein complex. Given the improved depth of useful information it can provide, we suggest that the multilinker CXMS approach should be used routinely when the amount of a sample permits.
Subjects
Cross-Linking Reagents/chemistry , Mass Spectrometry/methods , Proteins/analysis , Proteins/chemistry , Molecular Models , Protein Conformation , Protein Folding
ABSTRACT
Database search is the dominant approach in high-throughput proteomic analysis. However, the interpretation rate of MS/MS spectra is very low in such a restricted mode, which is mainly due to unexpected modifications and irregular digestion types. In this study, we developed a new algorithm called Alioth, to be integrated into the search engine of pFind, for fast and accurate unrestricted database search on high-resolution MS/MS data. An ion index is constructed for both peptide precursors and fragment ions, by which arbitrary digestions and a single site of any modifications and mutations can be searched efficiently. A new re-ranking algorithm is used to distinguish the correct peptide-spectrum matches from random ones. The algorithm is tested on several HCD datasets and the interpretation rate of MS/MS spectra using Alioth is as high as 60%-80%. Peptides from semi- and non-specific digestions, as well as those with unexpected modifications or mutations, can be effectively identified using Alioth and confidently validated using other search engines. The average processing speed of Alioth is 5-10 times faster than some other unrestricted search engines and is comparable to or even faster than the restricted search algorithms tested. This article is part of a Special Issue entitled: Computational Proteomics.
Subjects
Algorithms , Protein Databases , Mass Spectrometry , Protein Sequence Analysis/methods
ABSTRACT
pLink is a search engine for high-throughput identification of cross-linked peptides from their tandem mass spectra, which is the data-analysis step in chemical cross-linking of proteins coupled with mass spectrometry analysis. pLink has accumulated more than 200 registered users from all over the world since its first release in 2012. After 2 years of continual development, a new version of pLink has been released, which is at least 40 times faster, more versatile, and more user-friendly. Also, the function of the new pLink has been expanded to identifying endogenous protein cross-linking sites such as disulfide bonds and SUMO (Small Ubiquitin-like MOdifier) modification sites. Integrated into the new version are two accessory tools: pLabel, to annotate spectra of cross-linked peptides for visual inspection and publication, and pConfig, to assist users in setting up search parameters. Here, we provide detailed guidance on running a database search for identification of protein cross-links using the 2014 version of pLink.
Subjects
Cross-Linking Reagents/chemistry , Peptides/analysis , Search Engine , Guidelines as Topic , Internet , Mass Spectrometry , User-Computer Interface
ABSTRACT
We developed a high-throughput mass spectrometry method, pLink-SS (http://pfind.ict.ac.cn/software/pLink/2014/pLink-SS.html), for precise identification of disulfide-linked peptides. Using pLink-SS, we mapped all native disulfide bonds of a monoclonal antibody and ten standard proteins. We performed disulfide proteome analyses and identified 199 disulfide bonds in Escherichia coli and 568 in proteins secreted by human endothelial cells. We discovered many regulatory disulfide bonds involving catalytic or metal-binding cysteine residues.
Subjects
Disulfides/chemistry , Mass Spectrometry , Proteome/chemistry , Proteomics/methods , Amino Acid Sequence , Escherichia coli/chemistry , Humans , Molecular Models , Molecular Sequence Data , Peptide Library , Ribonucleases/chemistry
ABSTRACT
In relative protein abundance determination from peptide intensities recorded in full mass scans, a major complication that affects quantitation accuracy is signal interference from coeluting ions of similar m/z values. Here, we present pQuant, a quantitation software tool that solves this problem. pQuant detects interference signals and identifies for each peptide a pair of least interfered isotopic chromatograms: one for the light and one for the heavy isotope-labeled peptide. On the basis of these isotopic pairs, pQuant calculates the relative heavy/light peptide ratios along with their 99.75% confidence intervals (CIs). From the peptide ratios and their CIs, pQuant estimates the protein ratios and associated CIs by kernel density estimation. We tested pQuant, Census and MaxQuant on data sets obtained from mixtures (at varying mixing ratios from 10:1 to 1:10) of light- and heavy-SILAC labeled HeLa cells or (14)N- and (15)N-labeled Escherichia coli cells. pQuant quantitated more peptides with better accuracy than Census and MaxQuant in all 14 data sets. On the SILAC data sets, the nonquantified "NaN" (not a number) ratios generated by Census, MaxQuant, and pQuant accounted for 2.5-10.7%, 1.8-2.7%, and 0.01-0.5% of all ratios, respectively. On the (14)N/(15)N data sets, which cannot be quantified by MaxQuant, Census and pQuant produced 0.9-10.0% and 0.3-2.9% NaN ratios, respectively. Excluding these NaN results, the standard deviations of the numerical ratios calculated by Census or MaxQuant are 30-100% larger than those by pQuant. These results show that pQuant outperforms Census and MaxQuant in SILAC and (15)N-based quantitation.
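The abstract states that protein ratios are estimated from peptide ratios by kernel density estimation but gives no formula. The following is a minimal sketch of one way such an estimate could work, assuming a Gaussian kernel on log2 ratios, a normal-reference rule-of-thumb bandwidth, and the density mode as the protein ratio. These choices, the function name, and the example values are assumptions for illustration and are not taken from pQuant; confidence intervals are not handled.

```python
import math


def kde_mode_ratio(peptide_ratios, grid_points=2001):
    """Estimate a protein ratio as the mode of a Gaussian KDE on log2 ratios.

    Illustrative assumption: pQuant's actual kernel, bandwidth, and use of
    peptide confidence intervals are not described in the abstract.
    """
    logs = [math.log2(r) for r in peptide_ratios if r > 0]
    n = len(logs)
    if n == 0:
        raise ValueError("need at least one positive peptide ratio")
    if n == 1:
        return 2.0 ** logs[0]

    mean = sum(logs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in logs) / (n - 1))
    # Normal-reference rule of thumb; fall back to 0.1 if all ratios are equal.
    h = 1.06 * sd * n ** (-0.2) or 0.1

    lo, hi = min(logs) - 3 * h, max(logs) + 3 * h
    step = (hi - lo) / (grid_points - 1)
    best_x, best_density = lo, -1.0
    for i in range(grid_points):
        x = lo + i * step
        density = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in logs)
        if density > best_density:
            best_x, best_density = x, density
    return 2.0 ** best_x


# Four consistent peptide ratios near 2 plus one interfered outlier.
ratios = [2.0, 2.1, 1.9, 2.05, 8.0]
estimate = kde_mode_ratio(ratios)
arithmetic_mean = sum(ratios) / len(ratios)
```

In this toy case the mode stays close to 2 while the arithmetic mean is pulled above 3 by the single outlier, which is the kind of robustness a density-based estimate is meant to provide.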
Subjects
Peptides/chemistry , Proteins/chemistry , Escherichia coli/chemistry , HeLa Cells/chemistry , Humans , Isotopes , Mass Spectrometry , Nitrogen Isotopes , Nitrogen Radioisotopes , Software
ABSTRACT
We have developed pLink, software for data analysis of cross-linked proteins coupled with mass-spectrometry analysis. pLink reliably estimates false discovery rate in cross-link identification and is compatible with multiple homo- or hetero-bifunctional cross-linkers. We validated the program with proteins of known structures, and we further tested it on protein complexes, crude immunoprecipitates and whole-cell lysates. We show that it is a robust tool for protein-structure and protein-protein-interaction studies.