Results 1 - 20 of 82
1.
Proteomics ; 24(8): e2300234, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38487981

ABSTRACT

The identification of proteoforms by top-down proteomics requires both high-quality fragmentation spectra and the neutral mass of the proteoform from which the fragments derive. Intact proteoform spectra can be highly complex and may include multiple overlapping proteoforms, as well as many isotopic peaks and charge states. The resulting lower signal-to-noise ratios for intact proteins complicate downstream analyses such as deconvolution. Averaging multiple scans is a common way to improve signal-to-noise, but mass spectrometry data contain characteristic artifacts that can degrade the quality of an averaged spectrum. To overcome these limitations and increase signal-to-noise, we have implemented outlier rejection algorithms to remove outlier measurements efficiently and robustly in a set of MS1 scans prior to averaging. We have implemented averaging with rejection algorithms in the open-source, freely available, proteomics search engine MetaMorpheus. Herein, we report the application of the averaging with rejection algorithms to direct injection and online liquid chromatography mass spectrometry data. Averaging with rejection algorithms demonstrated a 45% increase in the number of proteoforms detected in Jurkat T cell lysate. We show that the increase is due to improved spectral quality, particularly in regions surrounding isotopic envelopes.


Subjects
Proteome; Proteomics; Proteome/analysis; Proteomics/methods; Protein Processing, Post-Translational; Algorithms; Mass Spectrometry
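The averaging-with-rejection idea described in the abstract above can be illustrated with a minimal sketch. It assumes the scans have already been binned onto a shared m/z axis, and it uses plain sigma clipping around the median as the rejection rule. The function name `average_with_rejection` and the `n_sigma` parameter are hypothetical and are not taken from MetaMorpheus, whose actual rejection algorithms may differ.

```python
from statistics import mean, median, pstdev


def average_with_rejection(scans, n_sigma=2.0):
    """Average intensities bin by bin, skipping values far from the median.

    scans: list of equal-length intensity lists, one per MS1 scan,
           assumed (for this sketch) to share the same m/z bins.
    n_sigma: hypothetical rejection threshold in standard deviations.
    """
    averaged = []
    for values in zip(*scans):  # all intensities observed in one m/z bin
        centre = median(values)
        spread = pstdev(values)
        if spread == 0:
            kept = list(values)  # identical values: nothing to reject
        else:
            kept = [v for v in values if abs(v - centre) <= n_sigma * spread]
        # Fall back to the median if every value was rejected.
        averaged.append(mean(kept) if kept else centre)
    return averaged
```

In this sketch a single spurious spike in one scan is dropped before the mean is taken, so it no longer inflates the averaged intensity of that bin.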
2.
Anal Bioanal Chem ; 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38877149

ABSTRACT

Identification of O-glycopeptides from tandem mass spectrometry data is complicated by the near complete dissociation of O-glycans from the peptide during collisional activation and by the combinatorial explosion of possible glycoforms when glycans are retained intact in electron-based activation. The recent O-Pair search method provides an elegant solution to these problems, using a collisional activation scan to identify the peptide sequence and total glycan mass, and a follow-up electron-based activation scan to localize the glycosite(s) using a graph-based algorithm in a reduced search space. Our previous O-glycoproteomics methods with MSFragger-Glyco allowed for extremely fast and sensitive identification of O-glycopeptides from collisional activation data but had limited support for site localization of glycans and quantification of glycopeptides. Here, we report an improved pipeline for O-glycoproteomics analysis that provides proteome-wide, site-specific, quantitative results by incorporating the O-Pair method as a module within FragPipe. In addition to improved search speed and sensitivity, we add flexible options for oxonium ion-based filtering of glycans and support for a variety of MS acquisition methods and provide a comparison between all software tools currently capable of O-glycosite localization in proteome-wide searches.

3.
Nat Methods ; 17(11): 1133-1138, 2020 Nov.
Article in English | MEDLINE | ID: mdl-33106676

ABSTRACT

We report O-Pair Search, an approach to identify O-glycopeptides and localize O-glycosites. Using paired collision- and electron-based dissociation spectra, O-Pair Search identifies O-glycopeptides via an ion-indexed open modification search and localizes O-glycosites using graph theory and probability-based localization. O-Pair Search reduces search times more than 2,000-fold compared to current O-glycopeptide processing software, while defining O-glycosite localization confidence levels and generating more O-glycopeptide identifications. Beyond the mucin-type O-glycopeptides discussed here, O-Pair Search also accepts user-defined glycan databases, making it compatible with many types of O-glycosylation. O-Pair Search is freely available within the open-source MetaMorpheus platform at https://github.com/smith-chem-wisc/MetaMorpheus .


Subjects
Glycopeptides; Proteomics/methods; Tandem Mass Spectrometry; Databases, Protein; Glycopeptides/analysis; Glycopeptides/chemistry; Glycosylation; Proteomics/instrumentation; Software; Workflow
4.
J Proteome Res ; 21(11): 2609-2618, 2022 Nov 04.
Article in English | MEDLINE | ID: mdl-36206157

ABSTRACT

Tandem mass spectrometry (MS/MS) is widely employed for the analysis of complex proteomic samples. While protein sequence database searching and spectral library searching are both well-established peptide identification methods, each has shortcomings. Protein sequence databases lack fragment peak intensity information, which can result in poor discrimination between correct and incorrect spectrum assignments. Spectral libraries usually contain fewer peptides than protein sequence databases, which limits the number of peptides that can be identified. Notably, few post-translationally modified peptides are represented in spectral libraries. This is because few search engines can both identify a broad spectrum of PTMs and create corresponding spectral libraries. Also, programs that generate spectral libraries using deep learning approaches are not yet able to accurately predict spectra for the vast majority of PTMs. Here, we address these limitations through use of a hybrid search strategy that combines protein sequence database and spectral library searches to improve identification success rates and sensitivity. This software uses Global PTM Discovery (G-PTM-D) to produce spectral libraries for a wide variety of different PTMs. These features, along with a new spectrum annotation and visualization tool, have been integrated into the freely available and open-source search engine MetaMorpheus.


Subjects
Proteomics; Tandem Mass Spectrometry; Databases, Protein; Proteomics/methods; Tandem Mass Spectrometry/methods; Data Analysis; Software; Peptides/analysis; Peptide Library; Algorithms
5.
J Proteome Res ; 21(10): 2443-2452, 2022 Oct 07.
Article in English | MEDLINE | ID: mdl-36108102

ABSTRACT

The SARS-CoV-2 omicron variant presented significant challenges to the global effort to counter the pandemic. SARS-CoV-2 is predicted to remain prevalent for the foreseeable future, making the ability to identify SARS-CoV-2 variants imperative in understanding and controlling the pandemic. The predominant variant discovery method, genome sequencing, is time-consuming, insensitive, and expensive. Ultraperformance liquid chromatography-mass spectrometry (UPLC-MS) offers an exciting alternative detection modality provided that variant-containing peptide markers are sufficiently detectable from their tandem mass spectra (MS/MS). We have synthesized model tryptic peptides of SARS-CoV-2 variants alpha, beta, gamma, delta, and omicron and evaluated their signal intensity, HCD spectra, and reverse phase retention time. Detection limits of 781, 781, 65, and 65 amol are obtained for the molecular ions of the proteotypic peptides, beta (QIAPGQTGNIADYNYK), gamma (TQLPSAYTNSFTR), delta (VGGNYNYR), and omicron (TLVKQLSSK), from neat solutions. These detection limits are on par with the detection limits of a previously reported proteotypic peptide from the SARS-CoV-2 spike protein, HTPINLVR. This study demonstrates the potential to differentiate SARS-CoV-2 variants through their proteotypic peptides with an approach that is broadly applicable across a wide range of pathogens.


Subjects
COVID-19; SARS-CoV-2; COVID-19/diagnosis; Chromatography, Liquid; Humans; Peptides/chemistry; Peptides/genetics; SARS-CoV-2/genetics; Spike Glycoprotein, Coronavirus; Tandem Mass Spectrometry
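As a small illustration of how the molecular ions of the proteotypic peptides named in the abstract above would be located in a spectrum, the sketch below computes a peptide's monoisotopic mass and its m/z at a given charge from standard amino acid residue masses. It assumes unmodified peptides; the function names `monoisotopic_mass` and `mz` are hypothetical, and the study itself does not describe its calculations at this level.

```python
# Monoisotopic residue masses in daltons (standard values, unmodified residues).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # one H2O is added for the free N- and C-termini
PROTON = 1.007276   # mass of the charge carrier


def monoisotopic_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide sequence."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def mz(peptide, charge):
    """m/z of the protonated peptide at the given positive charge state."""
    return (monoisotopic_mass(peptide) + charge * PROTON) / charge
```

For example, `mz("VGGNYNYR", 2)` gives the expected position of the doubly protonated delta-variant peptide under these assumptions.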
6.
J Proteome Res ; 21(4): 993-1001, 2022 Apr 01.
Article in English | MEDLINE | ID: mdl-35192358

ABSTRACT

Human immunodeficiency virus type 1 (HIV-1) remains a deadly infectious disease despite existing antiretroviral therapies. A comprehensive understanding of the specific mechanisms of viral infectivity remains elusive and currently limits the development of new and effective therapies. Through in-depth proteomic analysis of HIV-1 virions, we discovered the novel post-translational modification of highly conserved residues within the viral matrix and capsid proteins to the dehydroamino acids, dehydroalanine and dehydrobutyrine. We further confirmed their presence by labeling the reactive alkene, characteristic of dehydroamino acids, with glutathione via Michael addition. Dehydroamino acids are rare, understudied, and have been observed mainly in select bacterial and fungal species. Until now, they have not been observed in HIV proteins. We hypothesize that these residues are important in viral particle maturation and could provide valuable insight into HIV infectivity mechanisms.


Subjects
HIV-1; Capsid/chemistry; Capsid/metabolism; Capsid Proteins/analysis; Capsid Proteins/chemistry; Capsid Proteins/genetics; HIV-1/genetics; Humans; Proteomics; Virion
7.
J Proteome Res ; 21(2): 410-419, 2022 Feb 04.
Article in English | MEDLINE | ID: mdl-35073098

ABSTRACT

Interpreting proteomics data remains challenging due to the large number of proteins that are quantified by modern mass spectrometry methods. Weighted gene correlation network analysis (WGCNA) can identify groups of biologically related proteins using only protein intensity values by constructing protein correlation networks. However, WGCNA is not widespread in proteomic analyses due to challenges in implementing workflows. To facilitate the adoption of WGCNA by the proteomics field, we created MetaNetwork, an open-source, R-based application to perform sophisticated WGCNA workflows with no coding skill requirements for the end user. We demonstrate MetaNetwork's utility by employing it to identify groups of proteins associated with prostate cancer from a proteomic analysis of tumor and adjacent normal tissue samples. We found a decrease in cytoskeleton-related protein expression, a known hallmark of prostate tumors. We further identified changes in module eigenproteins indicative of dysregulation in protein translation and trafficking pathways. These results demonstrate the value of using MetaNetwork to improve the biological interpretation of quantitative proteomics experiments with 15 or more samples.


Subjects
Proteins; Proteomics; Cluster Analysis; Humans; Male; Mass Spectrometry; Workflow
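The correlation-network step that WGCNA builds on, as mentioned in the abstract above, can be sketched as follows. This shows only the unsigned soft-threshold adjacency, |r| raised to a power beta, computed from protein intensity vectors; it is a minimal illustration, not MetaNetwork's code. The function names and the default `beta=6` are assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(intensities, beta=6):
    """Unsigned adjacency |r|**beta for every pair of proteins.

    intensities: dict mapping a protein name to its intensities across samples.
    beta: soft-thresholding power (an assumed default for this sketch).
    """
    names = sorted(intensities)
    return {
        (a, b): abs(pearson(intensities[a], intensities[b])) ** beta
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones close to 1, which is what lets modules of co-varying proteins stand out.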
8.
J Proteome Res ; 20(4): 1997-2004, 2021 Apr 02.
Article in English | MEDLINE | ID: mdl-33683901

ABSTRACT

MetaMorpheus is a free, open-source software program for the identification of peptides and proteoforms from data-dependent acquisition tandem MS experiments. There is inherent uncertainty in these assignments for several reasons, including the limited overlap between experimental and theoretical peaks, the m/z uncertainty, and noise peaks or peaks from coisolated peptides that produce false matches. False discovery rates provide only a set-wise approximation for incorrect spectrum matches. Here we implemented a binary decision tree calculation within MetaMorpheus to compute a posterior error probability, which provides a measure of uncertainty for each peptide-spectrum match. We demonstrate its utility for increasing identifications and resolving ambiguities in bottom-up, top-down, proteogenomic, and nonspecific digestion searches.


Subjects
Proteomics; Tandem Mass Spectrometry; Algorithms; Databases, Protein; Peptides; Probability; Software
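To make concrete why a false discovery rate is only a set-wise measure, as the abstract above notes, here is a minimal sketch of a conventional target-decoy q-value calculation. It is not the decision-tree posterior error probability that the paper implements; the function name `target_decoy_q_values` and the input layout are hypothetical.

```python
def target_decoy_q_values(psms):
    """Compute q-values for peptide-spectrum matches by target-decoy counting.

    psms: list of (score, is_decoy) tuples, where a higher score is better.
    Returns q-values in the same order as the input.
    """
    # Rank matches from best to worst score.
    order = sorted(range(len(psms)), key=lambda i: psms[i][0], reverse=True)

    targets = decoys = 0
    fdrs = []
    for i in order:
        if psms[i][1]:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR of the set containing every match ranked this high.
        fdrs.append(decoys / max(targets, 1))

    # q-value: the lowest FDR of any set that still includes this match.
    q_values = [0.0] * len(psms)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q_values[order[rank]] = running_min
    return q_values
```

Every match above a given score cutoff is described by one set-level error estimate, so two matches with very different evidence can share the same q-value; a per-match posterior error probability is meant to separate them.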
9.
J Proteome Res ; 20(1): 317-325, 2021 Jan 01.
Article in English | MEDLINE | ID: mdl-33074679

ABSTRACT

Identification of proteoforms, the different forms of a protein, is important to understand biological processes. A proteoform family is the set of different proteoforms from the same gene. We previously developed the software program Proteoform Suite, which constructs proteoform families and identifies proteoforms by intact-mass analysis. Here, we have applied this approach to top-down proteomic data acquired at the National High Magnetic Field Laboratory 21 tesla Fourier transform ion cyclotron resonance mass spectrometer (data available on the MassIVE platform with identifier MSV000085978). We explored the ability to construct proteoform families and identify proteoforms from the high mass accuracy data that this instrument provides for a complex cell lysate sample from the MCF-7 human breast cancer cell line. There were 2830 observed experimental proteoforms, of which 932 were identified, 44 were ambiguous, and 1854 were unidentified. Of the 932 unique identified proteoforms, 766 were identified by top-down MS2 analysis at 1% false discovery rate (FDR) using TDPortal, and 166 were additional intact-mass identifications (∼4.7% calculated global FDR) made using Proteoform Suite. We recently published a proteoform level schema to represent ambiguity in proteoform identifications. We implemented this proteoform level classification in Proteoform Suite for intact-mass identifications, which enables users to determine the ambiguity levels and sources of ambiguity for each intact-mass proteoform identification.


Subjects
Cyclotrons; Proteomics; Fourier Analysis; Humans; Mass Spectrometry; Software
10.
J Proteome Res ; 20(4): 1826-1834, 2021 Apr 02.
Article in English | MEDLINE | ID: mdl-32967423

ABSTRACT

Proteoforms are the workhorses of the cell, and subtle differences between their amino acid sequences or post-translational modifications (PTMs) can change their biological function. To most effectively identify and quantify proteoforms in genetically diverse samples by mass spectrometry (MS), it is advantageous to search the MS data against a sample-specific protein database that is tailored to the sample being analyzed, in that it contains the correct amino acid sequences and relevant PTMs for that sample. To this end, we have developed Spritz (https://smith-chem-wisc.github.io/Spritz/), an open-source software tool for generating protein databases annotated with sequence variations and PTMs. We provide a simple graphical user interface for Windows and scripts that can be run on any operating system. Spritz automatically sets up and executes approximately 20 tools, which enable the construction of a proteogenomic database from only raw RNA sequencing data. Sequence variations that are discovered in RNA sequencing data upon comparison to the Ensembl reference genome are annotated on proteins in these databases, and PTM annotations are transferred from UniProt. Modifications can also be discovered and added to the database using bottom-up mass spectrometry data and global PTM discovery in MetaMorpheus. We demonstrate that such sample-specific databases allow the identification of variant peptides, modified variant peptides, and variant proteoforms by searching bottom-up and top-down proteomic data from the Jurkat human T lymphocyte cell line and demonstrate the identification of phosphorylated variant sites with phosphoproteomic data from the U2OS human osteosarcoma cell line.


Subjects
Proteogenomics; Databases, Protein; Humans; Mass Spectrometry; Protein Processing, Post-Translational; Proteomics; Software
11.
Anal Chem ; 93(26): 9119-9128, 2021 Jul 06.
Article in English | MEDLINE | ID: mdl-34165955

ABSTRACT

Proton-transfer reactions (PTRs) have emerged as a powerful tool for the study of intact proteins. When coupled with m/z-selective kinetic excitation, such as parallel ion parking (PIP), one can exert exquisite control over rates of reaction with a high degree of specificity. This allows one to "concentrate", in the gas phase, nearly all the signals from an intact protein charge state envelope into a single charge state, improving the signal-to-noise ratio (S/N) by 10× or more. While this approach has been previously reported, here we show that implementing these technologies on a 21 T FT-ICR MS provides a tremendous advantage for intact protein analysis. Advanced strategies for performing PTR with PIP were developed to complement this unique instrument, including subjecting all analyte ions entering the mass spectrometer to PTR and PIP. This experiment, which we call "PTR-MS1-PIP", generates a pseudo-MS1 spectrum derived from ions that are exposed to the PTR reagent and PIP waveforms but have not undergone any prior true mass filtering or ion isolation. The result is an extremely rapid and significant improvement in the spectral S/N of intact proteins. This permits the observation of many more proteoforms and reduces ion injection periods for subsequent tandem mass spectrometry characterization. Additionally, the product ion parking waveform has been optimized to enhance the PTR rate without compromise to the parking efficiency. We demonstrate that this process, called "rapid park", can improve reaction rates by 5-10× and explore critical factors discovered to influence this process. Finally, we demonstrate how coupling PTR-MS1 and rapid park provides a 10-fold reduction in ion injection time, improving the rate of tandem MS sequencing.


Subjects
Proteins; Protons; Indicators and Reagents; Ions; Tandem Mass Spectrometry
12.
RNA ; 25(10): 1337-1352, 2019 Oct.
Article in English | MEDLINE | ID: mdl-31296583

ABSTRACT

Proteins bind mRNA through their entire life cycle from transcription to degradation. We analyzed c-Myc mRNA protein interactors in vivo using the HyPR-MS method to capture the crosslinked mRNA by hybridization and then analyzed the bound proteins using mass spectrometry proteomics. Using HyPR-MS, 229 c-Myc mRNA-binding proteins were identified, confirming previously proposed interactors, suggesting new interactors, and providing information related to the roles and pathways known to involve c-Myc. We performed structural and functional analysis of these proteins and validated our findings with a combination of RIP-qPCR experiments, in vitro results released in past studies, publicly available RIP- and eCLIP-seq data, and results from software tools for predicting RNA-protein interactions.


Subjects
Mass Spectrometry/methods; Proto-Oncogene Proteins c-myc/metabolism; RNA, Messenger/metabolism; RNA-Binding Proteins/metabolism; Chromatin Immunoprecipitation; Humans; K562 Cells; Protein Interaction Domains and Motifs
13.
J Proteome Res ; 19(5): 1975-1981, 2020 May 01.
Article in English | MEDLINE | ID: mdl-32243168

ABSTRACT

Statistical significance tests are a common feature in quantitative proteomics workflows. The Student's t-test is widely used to compute the statistical significance of a protein's change between two groups of samples. However, the t-test's null hypothesis asserts that the difference in means between two groups is exactly zero, often marking small but uninteresting fold-changes as statistically significant. Compensations to address this issue are widely used in quantitative proteomics, but we suggest that a replacement of the t-test with a Bayesian approach offers a better path forward. In this article, we describe a Bayesian hypothesis test in which the null hypothesis is an interval rather than a single point at zero; the width of the interval is estimated from population statistics. The improved sensitivity of the method substantially increases the number of truly changing proteins detected in two benchmark data sets (ProteomeXchange identifiers PXD005590 and PXD016470). The method has been implemented within FlashLFQ, an open-source software program that quantifies bottom-up proteomics search results obtained from any search tool. FlashLFQ is rapid, sensitive, and accurate and is available both as an easy-to-use graphical user interface (Windows) and as a command-line tool (Windows/Linux/OSX).


Subjects
Proteomics; Software; Bayes Theorem; Humans; Proteins; Workflow
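The difference between a point null and an interval null, which the abstract above turns on, can be shown with a short sketch. It assumes, purely for illustration, that the posterior of a protein's log2 fold change is normal with a known mean and standard deviation, and it reports how much posterior probability falls inside a null interval of a chosen half-width. This is not FlashLFQ's implementation; `prob_in_null_interval` and its parameters are hypothetical.

```python
from statistics import NormalDist


def prob_in_null_interval(post_mean, post_sd, half_width):
    """Posterior probability that the log2 fold change lies in [-half_width, +half_width].

    Assumes a normal posterior, which is a simplification made for this sketch.
    """
    posterior = NormalDist(mu=post_mean, sigma=post_sd)
    return posterior.cdf(half_width) - posterior.cdf(-half_width)
```

A small but precisely measured change, for example a mean of 0.1 with a standard deviation of 0.02, would be significant under a point-null t-test, yet with a half-width of 0.5 nearly all of its posterior mass lies inside the null interval, so it is not flagged as changing.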
14.
J Proteome Res ; 19(8): 3510-3517, 2020 Aug 07.
Article in English | MEDLINE | ID: mdl-32584579

ABSTRACT

Cellular functions are performed by a vast and diverse set of proteoforms. Proteoforms are the specific forms of proteins produced as a result of genetic variations, RNA splicing, and post-translational modifications (PTMs). Top-down mass spectrometric analysis of intact proteins enables proteoform identification, including proteoforms derived from sequence cleavage events or harboring multiple PTMs. In contrast, bottom-up proteomics identifies peptides, which necessitates protein inference and does not yield proteoform identifications. We seek here to exploit the synergies between these two data types to improve the quality and depth of the overall proteomic analysis. To this end, we automated the large-scale integration of results from multiprotease bottom-up and top-down analyses in the software program Proteoform Suite and applied it to the analysis of proteoforms from the human Jurkat T lymphocyte cell line. We implemented the recently developed proteoform-level classification scheme for top-down tandem mass spectrometry (MS/MS) identifications in Proteoform Suite, which enables users to observe the level and type of ambiguity for each proteoform identification, including which of the ambiguous proteoform identifications are supported by bottom-up-level evidence. We used Proteoform Suite to find instances where top-down identifications aid in protein inference from bottom-up analysis and conversely where bottom-up peptide identifications aid in proteoform PTM localization. We also show the use of bottom-up data to infer proteoform candidates potentially present in the sample, allowing confirmation of such proteoform candidates by intact-mass analysis of MS1 spectra. The implementation of these capabilities in the freely available software program Proteoform Suite enables users to integrate large-scale top-down and bottom-up data sets and to utilize the synergies between them to improve and extend the proteomic analysis.


Subjects
Proteomics; Tandem Mass Spectrometry; Humans; Protein Processing, Post-Translational; Proteome/metabolism; Software
15.
J Proteome Res ; 19(4): 1635-1646, 2020 Apr 03.
Article in English | MEDLINE | ID: mdl-32058723

ABSTRACT

Identifying single amino acid variants (SAAVs) in cancer is critical for precision oncology. Several advanced algorithms are now available to identify SAAVs, but attempts to combine different algorithms and optimize them on large data sets to achieve a more comprehensive coverage of SAAVs have not been made. Herein, we report an expanded detection of SAAVs in the PANC-1 cell line using three different strategies, which results in the identification of 540 SAAVs in the mass spectrometry data. Among the set of 540 SAAVs, 79 are evaluated as deleterious SAAVs based on analysis using the novel AssVar software, and one of the driver mutations found in each of the proteins KRAS, TP53, and SLC37A4 is further validated using independent selected reaction monitoring (SRM) analysis. Our study represents the most comprehensive discovery of SAAVs to date and the first large-scale detection of deleterious SAAVs in the PANC-1 cell line. This work may serve as the basis for future research in pancreatic cancer and personal immunotherapy and treatment.


Subjects
Amino Acids; Pancreatic Neoplasms; Antiporters; Cell Line; Humans; Monosaccharide Transport Proteins; Pancreatic Neoplasms/genetics; Precision Medicine; Proteins
16.
Proteomics ; 19(10): e1800361, 2019 May.
Article in English | MEDLINE | ID: mdl-31050378

ABSTRACT

A proteoform is a defined form of a protein derived from a given gene with a specific amino acid sequence and localized post-translational modifications. In top-down proteomic analyses, proteoforms are identified and quantified through mass spectrometric analysis of intact proteins. Recent technological developments have enabled comprehensive proteoform analyses in complex samples, and an increasing number of laboratories are adopting top-down proteomic workflows. In this review, some recent advances are outlined and current challenges and future directions for the field are discussed.


Subjects
Amino Acids/analysis; Mass Spectrometry; Protein Processing, Post-Translational; Proteome/analysis; Proteomics/methods; Animals; Computational Biology; Electrophoresis, Capillary; Humans; Programming Languages; Reproducibility of Results; Software
17.
J Proteome Res ; 18(1): 349-358, 2019 Jan 04.
Article in English | MEDLINE | ID: mdl-30346791

ABSTRACT

Post-translationally spliced peptides have recently garnered significant interest as potential targets for cancer immunotherapy and as contributors to autoimmune diseases such as type 1 diabetes, but feasible identification methods for spliced peptides have not yet been developed. Here we present Neo-Fusion, a search program for discovering spliced peptides in tandem mass spectrometry data. Neo-Fusion utilizes two separate ion database searches to identify the two halves of each spliced peptide, and then it infers the full spliced sequence. This strategy allows for the identification of spliced peptides without peptide length constraints, providing a broadly applicable tool suitable for identification of spliced peptides in a variety of systems, such as the HLA-I and HLA-II immunopeptidomes and in vitro digested protein samples obtained from organelles, cells, or tissues of interest. Using simulated spliced peptides to benchmark Neo-Fusion, 25% of all simulated spliced peptides were identified at a measured false-discovery rate of 5% for HLA-I. Neo-Fusion provides the research community with a powerful new tool to aid in the study of the prevalence and biological significance of post-translationally spliced peptides.


Subjects
Peptides/analysis; Protein Processing, Post-Translational; Software; Tandem Mass Spectrometry/methods; Amino Acid Sequence; Histocompatibility Antigens Class I/analysis; Humans; Proteolysis
18.
J Proteome Res ; 18(9): 3429-3438, 2019 Sep 06.
Article in English | MEDLINE | ID: mdl-31378069

ABSTRACT

Peptides detected by tandem mass spectrometry (MS/MS) in bottom-up proteomics serve as proxies for the proteins expressed in the sample. Protein inference is a process routinely applied to these peptides to generate a plausible list of candidate protein identifications. The use of multiple proteases for parallel protein digestions expands sequence coverage, provides additional peptide identifications, and increases the probability of identifying peptides that are unique to a single protein, which are all valuable for protein inference. We have developed and implemented a multi-protease protein inference algorithm in MetaMorpheus, a bottom-up search software program, which incorporates the calculation of protease-specific q-values and preserves the association of peptide sequences and their protease of origin. This integrated multi-protease protein inference algorithm provides more accurate results than either the aggregation of results from the separate analysis of the peptide identifications produced by each protease (separate approach) in MetaMorpheus, or results that are obtained using Fido, ProteinProphet, or DTASelect2. MetaMorpheus' integrated multi-protease data analysis decreases the ambiguity of the protein group list, reduces the frequency of erroneous identifications, and increases the number of post-translational modifications identified, while combining multi-protease search and protein inference into a single software program.


Subjects
Proteins/isolation & purification; Proteomics; Software; Tandem Mass Spectrometry/methods; Algorithms; Amino Acid Sequence/genetics; Databases, Protein; Peptide Hydrolases/chemistry; Peptide Hydrolases/isolation & purification; Peptides/chemistry; Peptides/isolation & purification; Proteins/chemistry
19.
J Proteome Res ; 18(10): 3671-3680, 2019 Oct 04.
Article in English | MEDLINE | ID: mdl-31479276

ABSTRACT

Complex human biomolecular processes are made possible by the diversity of human proteoforms. Constructing proteoform families, groups of proteoforms derived from the same gene, is one way to represent this diversity. Comprehensive, high-confidence identification of human proteoforms remains a central challenge in mass spectrometry-based proteomics. We have previously reported a strategy for proteoform identification using intact-mass measurements, and we have since improved that strategy by mass calibration based on search results, the use of a global post-translational modification discovery database, and the integration of top-down proteomics results with intact-mass analysis. In the present study, we combine these strategies for enhanced proteoform identification in total cell lysate from the Jurkat human T lymphocyte cell line. We collected, processed, and integrated three types of proteomics data (NeuCode-labeled intact-mass, label-free top-down, and multi-protease bottom-up) to maximize the number of confident proteoform identifications. The integrated analysis revealed 5950 unique experimentally observed proteoforms, which were assembled into 848 proteoform families. Twenty percent of the observed proteoforms were confidently identified at a 3.9% false discovery rate, representing 1207 unique proteoforms derived from 484 genes.


Subjects
Databases, Protein; Proteome; Proteomics/methods; Humans; Jurkat Cells; Mass Spectrometry; Peptide Hydrolases/analysis; Protein Isoforms; Protein Processing, Post-Translational
20.
Anal Chem ; 91(17): 10937-10942, 2019 Sep 03.
Article in English | MEDLINE | ID: mdl-31393705

ABSTRACT

Proteoforms, the primary effectors of biological processes, are the different forms of proteins that arise from molecular processing events such as alternative splicing and post-translational modifications. Heart diseases exhibit changes in proteoform levels, motivating the development of a deeper understanding of the heart proteoform landscape. Our recently developed two-dimensional top-down proteomics platform coupling serial size exclusion chromatography (sSEC) to reversed-phase chromatography (RPC) expanded coverage of the human heart proteome and allowed observation of high-molecular weight proteoforms. However, most of these observed proteoforms were not identified due to the difficulty in obtaining quality tandem mass spectrometry (MS2) fragmentation data for large proteoforms from complex biological mixtures on a chromatographic time scale. Herein, we sought to identify human heart proteoforms in this data set using an enhanced version of Proteoform Suite, which identifies proteoforms by intact mass alone. Specifically, we added a new feature to Proteoform Suite to determine candidate identifications for isotopically unresolved proteoforms larger than 50 kDa, enabling subsequent MS2 identification of important high-molecular weight human heart proteoforms such as lamin A (72 kDa) and trifunctional enzyme subunit α (79 kDa). With this new workflow for large proteoform identification, endogenous human cardiac myosin binding protein C (140 kDa) was identified for the first time. This study demonstrates the integration of our sSEC-RPC-MS proteomics platform with intact-mass analysis through Proteoform Suite to create a catalog of human heart proteoforms and facilitate the identification of large proteoforms in complex systems.


Subjects
Carrier Proteins/isolation & purification; Lamin Type A/isolation & purification; Mitochondrial Trifunctional Protein, alpha Subunit/isolation & purification; Myocardium/chemistry; Protein Processing, Post-Translational; Proteome/isolation & purification; Software; Alternative Splicing; Amino Acid Sequence; Carrier Proteins/chemistry; Carrier Proteins/metabolism; Chromatography, Gel; Chromatography, Reverse-Phase; Humans; Lamin Type A/chemistry; Lamin Type A/metabolism; Mitochondrial Trifunctional Protein, alpha Subunit/chemistry; Mitochondrial Trifunctional Protein, alpha Subunit/metabolism; Myocardium/metabolism; Proteome/chemistry; Proteome/metabolism; Proteomics/methods; Tandem Mass Spectrometry