Results 1 - 16 of 16
1.
J Proteome Res ; 18(3): 878-889, 2019 03 01.
Article in English | MEDLINE | ID: mdl-30638379

ABSTRACT

Top-down mass spectrometry is capable of identifying whole proteoform sequences with multiple post-translational modifications because it generates tandem mass spectra directly from intact proteoforms. Many software tools, such as ProSightPC, MSPathFinder, and TopMG, have been proposed for identifying proteoforms with modifications. In these tools, various methods are employed to estimate the statistical significance of identifications. However, most existing methods are designed for proteoform identifications without modifications, and the challenge remains for accurately estimating the statistical significance of proteoform identifications with modifications. Here we propose TopMCMC, a method that combines a Markov chain random walk algorithm and a greedy algorithm for assigning statistical significance to matches between spectra and protein sequences with variable modifications. Experimental results showed that TopMCMC achieved high accuracy in estimating E-values and false discovery rates of identifications in top-down mass spectrometry. Coupled with TopMG, TopMCMC identified more spectra than the generating function method from an MCF-7 top-down mass spectrometry data set.


Subjects
Monte Carlo Method; Proteome/metabolism; Proteomics/methods; Algorithms; Datasets as Topic; Humans; MCF-7 Cells; Markov Chains; Protein Processing, Post-Translational; Proteins/analysis; Software; Tandem Mass Spectrometry/methods
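
Note on record 1: the abstract refers to E-values for matches between spectra and protein sequences. The sketch below is not TopMCMC (it uses neither a Markov chain random walk nor a greedy algorithm). It only illustrates, under stated assumptions, the plain Monte Carlo idea that such methods refine: sample scores under a null model, estimate a p-value as the fraction of null scores at least as large as the observed one, and scale by the number of candidates to get an E-value (a common convention, assumed here). The scoring model `toy_null_score` and all parameter values are hypothetical.

```python
import random


def monte_carlo_p_value(observed_score, sample_null_score, n_samples=10_000, seed=0):
    """Estimate P(null score >= observed_score) by plain Monte Carlo sampling."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if sample_null_score(rng) >= observed_score)
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_samples + 1)


def e_value(p_value, n_candidates):
    """Expected number of random matches scoring at least as well (assumed convention)."""
    return p_value * n_candidates


def toy_null_score(rng, n_peaks=20, match_prob=0.1):
    """Hypothetical null model: each of n_peaks matches by chance with match_prob."""
    return sum(1 for _ in range(n_peaks) if rng.random() < match_prob)


p = monte_carlo_p_value(8, toy_null_score)
print(p, e_value(p, n_candidates=5000))
```

With plain sampling, very small p-values need very many samples to resolve, which is one reason more elaborate sampling schemes are of interest.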
2.
Proteomics ; 18(3-4)2018 02.
Article in English | MEDLINE | ID: mdl-29327814

ABSTRACT

Complex proteoforms contain various primary structural alterations resulting from variations in genes, RNA, and proteins. Top-down mass spectrometry is commonly used for analyzing complex proteoforms because it provides whole sequence information of the proteoforms. Proteoform identification by top-down mass spectral database search is a challenging computational problem because the types and/or locations of some alterations in target proteoforms are in general unknown. Although spectral alignment and mass graph alignment algorithms have been proposed for identifying proteoforms with unknown alterations, they are extremely slow to align millions of spectra against tens of thousands of protein sequences in high throughput proteome level analyses. Many software tools in this area combine efficient protein sequence filtering algorithms and spectral alignment algorithms to speed up database search. As a result, the performance of these tools heavily relies on the sensitivity and efficiency of their filtering algorithms. Here, we propose two efficient approximate spectrum-based filtering algorithms for proteoform identification. We evaluated the performances of the proposed algorithms and four existing ones on simulated and real top-down mass spectrometry data sets. Experiments showed that the proposed algorithms outperformed the existing ones for complex proteoform identification. In addition, combining the proposed filtering algorithms and mass graph alignment algorithms identified many proteoforms missed by ProSightPC in proteome-level proteoform analyses.


Subjects
Algorithms; Escherichia coli Proteins/analysis; Protein Processing, Post-Translational; Proteome/analysis; Sequence Analysis, Protein/methods; Software; Tandem Mass Spectrometry/methods; Breast Neoplasms/metabolism; Cluster Analysis; Escherichia coli/metabolism; Female; Heterografts; Histones/metabolism; Humans; Phosphorylation; Protein Isoforms
3.
BMC Bioinformatics ; 19(Suppl 17): 494, 2018 Dec 28.
Article in English | MEDLINE | ID: mdl-30591035

ABSTRACT

BACKGROUND: Top-down mass spectrometry has unique advantages in identifying proteoforms with multiple post-translational modifications and/or unknown alterations. Most software tools in this area search top-down mass spectra against a protein sequence database for proteoform identification. When the species studied in a mass spectrometry experiment lacks its proteome sequence database, a homologous protein sequence database can be used for proteoform identification. The accuracy of homologous protein sequences affects the sensitivity of proteoform identification and the accuracy of mass shift localization. RESULTS: We tested TopPIC, a commonly used software tool for top-down mass spectral identification, on a top-down mass spectrometry data set of Escherichia coli K12 MG1655, and evaluated its performance using an Escherichia coli K12 MG1655 proteome database and a homologous protein database. The number of identified spectra with the homologous database was about half of that with the Escherichia coli K12 MG1655 database. We also tested TopPIC on a top-down mass spectrometry data set of human MCF-7 cells and obtained similar results. CONCLUSIONS: Experimental results demonstrated that TopPIC is capable of identifying many proteoform spectrum matches and localizing unknown alterations using homologous protein sequences containing no more than 2 mutations.


Subjects
Proteins/chemistry; Sequence Homology, Amino Acid; Tandem Mass Spectrometry/methods; Amino Acid Sequence; Databases, Protein; Escherichia coli/metabolism; Humans; MCF-7 Cells; Proteome/analysis; Software
4.
Anal Chem ; 90(17): 10095-10099, 2018 09 04.
Article in English | MEDLINE | ID: mdl-30085653

ABSTRACT

Native proteomics aims to characterize complex proteomes under native conditions and ultimately produces a full picture of endogenous protein complexes in cells. It requires novel analytical platforms for high-resolution and liquid-phase separation of protein complexes prior to native mass spectrometry (MS) and MS/MS. In this work, size-exclusion chromatography (SEC)-capillary zone electrophoresis (CZE)-MS/MS was developed for native proteomics in discovery mode, resulting in the identification of 144 proteins, 672 proteoforms, and 23 protein complexes from the Escherichia coli proteome. The protein complexes include four protein homodimers, 16 protein-metal complexes, two protein-[2Fe-2S] complexes, and one protein-glutamine complex. Half of them have not been reported in the literature. This work represents the first example of online liquid-phase separation-MS/MS for the characterization of a complex proteome under the native condition, offering the proteomics community an efficient and simple platform for native proteomics.


Subjects
Chromatography, Gel/methods; Electrophoresis, Capillary/methods; Proteomics; Tandem Mass Spectrometry/methods; Escherichia coli Proteins/genetics; Escherichia coli Proteins/isolation & purification
5.
Anal Chem ; 90(9): 5529-5533, 2018 05 01.
Article in English | MEDLINE | ID: mdl-29620868

ABSTRACT

Capillary zone electrophoresis (CZE)-tandem mass spectrometry (MS/MS) has been recognized as a useful tool for top-down proteomics. However, its performance for deep top-down proteomics is still dramatically lower than widely used reversed-phase liquid chromatography (RPLC)-MS/MS. We present an orthogonal multidimensional separation platform that couples size exclusion chromatography (SEC) and RPLC based protein prefractionation to CZE-MS/MS for deep top-down proteomics of Escherichia coli. The platform generated high peak capacity (∼4000) for separation of intact proteins, leading to the identification of 5700 proteoforms from the Escherichia coli proteome. The data represents a 10-fold improvement in the number of proteoform identifications compared with previous CZE-MS/MS studies and represents the largest bacterial top-down proteomics data set reported to date. The performance of the CZE-MS/MS based platform is comparable to the state-of-the-art RPLC-MS/MS based systems in terms of the number of proteoform identifications and the instrument time.


Subjects
Escherichia coli Proteins/analysis; Escherichia coli/chemistry; Proteome/analysis; Proteomics; Electrophoresis, Capillary; Tandem Mass Spectrometry
6.
Bioinformatics ; 33(9): 1309-1316, 2017 05 01.
Article in English | MEDLINE | ID: mdl-28453668

ABSTRACT

Motivation: Although proteomics has rapidly developed in the past decade, researchers are still in the early stage of exploring the world of complex proteoforms, which are protein products with various primary structure alterations resulting from gene mutations, alternative splicing, post-translational modifications, and other biological processes. Proteoform identification is essential to mapping proteoforms to their biological functions as well as discovering novel proteoforms and new protein functions. Top-down mass spectrometry is the method of choice for identifying complex proteoforms because it provides a 'bird's eye view' of intact proteoforms. The combinatorial explosion of various alterations on a protein may result in billions of possible proteoforms, making proteoform identification a challenging computational problem. Results: We propose a new data structure, called the mass graph, for efficient representation of proteoforms and design mass graph alignment algorithms. We developed TopMG, a mass graph-based software tool for proteoform identification by top-down mass spectrometry. Experiments on top-down mass spectrometry datasets showed that TopMG outperformed existing methods in identifying complex proteoforms. Availability and implementation: http://proteomics.informatics.iupui.edu/software/topmg/. Contact: xwliu@iupui.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Proteome/analysis; Proteomics/methods; Software; Tandem Mass Spectrometry/methods; Algorithms; Alternative Splicing; Molecular Weight; Mutation; Protein Processing, Post-Translational; Proteome/chemistry; Proteome/genetics; Proteome/metabolism
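
Note on record 6: the abstract states that the combinatorial explosion of alterations may yield billions of possible proteoforms. The arithmetic can be sketched as follows, assuming (for illustration only) that each modifiable site independently carries either no modification or one of a fixed set of candidates. The function name and the 30-site example are hypothetical and are not taken from the TopMG paper.

```python
from math import prod


def count_proteoforms(mods_per_site):
    """Count modification patterns when site i is either unmodified
    or carries one of mods_per_site[i] candidate modifications."""
    return prod(1 + m for m in mods_per_site)


# Hypothetical protein: 30 modifiable sites, one candidate modification each.
print(count_proteoforms([1] * 30))  # 1073741824, i.e. 2**30
```

Even this simplified model passes one billion patterns at 30 binary sites, which is why enumerating candidate proteoforms one by one is impractical.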
7.
Anal Chem ; 89(22): 12059-12067, 2017 11 21.
Article in English | MEDLINE | ID: mdl-29064224

ABSTRACT

Capillary zone electrophoresis-electrospray ionization-tandem mass spectrometry (CZE-ESI-MS/MS) has been recognized as an invaluable platform for top-down proteomics. However, the scale of top-down proteomics using CZE-MS/MS is still limited due to the low loading capacity and narrow separation window of CZE. In this work, for the first time we systematically evaluated the dynamic pH junction method for focusing of intact proteins during CZE-MS. The optimized dynamic pH junction-based CZE-MS/MS approached a 1 µL loading capacity, 90 min separation window, and high peak capacity (∼280) for characterization of an Escherichia coli proteome. The results represent the largest loading capacity and the highest peak capacity of CZE for top-down characterization of complex proteomes. Single-shot CZE-MS/MS identified about 2800 proteoform-spectrum matches, nearly 600 proteoforms, and 200 proteins from the Escherichia coli proteome with spectrum-level false discovery rate (FDR) less than 1%. The number of identified proteoforms in this work is over three times higher than that in previous single-shot CZE-MS/MS studies. Truncations, N-terminal methionine excision, signal peptide removal, and some post-translational modifications including oxidation and acetylation were detected.


Subjects
Escherichia coli Proteins/analysis; Escherichia coli/chemistry; Proteome/analysis; Proteomics; Electrophoresis, Capillary; Hydrogen-Ion Concentration; Spectrometry, Mass, Electrospray Ionization; Tandem Mass Spectrometry
8.
Bioinformatics ; 32(22): 3495-3497, 2016 11 15.
Article in English | MEDLINE | ID: mdl-27423895

ABSTRACT

Top-down mass spectrometry enables the observation of whole complex proteoforms in biological samples and provides crucial information complementary to bottom-up mass spectrometry. Because of the complexity of top-down mass spectra and proteoforms, it is a challenging problem to efficiently interpret top-down tandem mass spectra in high-throughput proteome-level proteomics studies. We present TopPIC, a tool that efficiently identifies and characterizes complex proteoforms with unknown primary structure alterations, such as amino acid mutations and post-translational modifications, by searching top-down tandem mass spectra against a protein database. AVAILABILITY AND IMPLEMENTATION: http://proteomics.informatics.iupui.edu/software/toppic/ CONTACT: xwliu@iupui.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Proteome; Proteomics; Animals; Humans; Protein Processing, Post-Translational; Software; Tandem Mass Spectrometry
9.
Zhongguo Zhong Yao Za Zhi ; 42(3): 505-509, 2017 Feb.
Article in Chinese | MEDLINE | ID: mdl-28952256

ABSTRACT

This paper proposes a physical fingerprint spectrum method for Reduning injection (RI) to improve its quality standards, based on the strong correlation between the physicochemical properties of drugs and their safety, effectiveness, and stability. The quality of RI was studied using the concepts and methods of physical chemistry. The physical fingerprint spectrum was displayed visually as a radar map and consisted of eight indexes (pH, conductivity, turbidity, refractive index, osmolarity, surface tension, relative density, and kinematic viscosity). Twelve batches of samples were then tested. The physical fingerprint spectra of the 3 batches of RI within their validity period met the standards, with similarity above 0.999, whereas those of the 9 expired batches did not. The results showed that the physical fingerprint spectrum can be used for quality control of RI and may serve as an example for the quality evaluation of traditional Chinese medicine injections.


Subjects
Drugs, Chinese Herbal/standards; Quality Control; Chromatography, High Pressure Liquid; Injections
10.
J Proteome Res ; 15(8): 2422-32, 2016 08 05.
Article in English | MEDLINE | ID: mdl-27291504

ABSTRACT

Various proteoforms may be generated from a single gene due to primary structure alterations (PSAs) such as genetic variations, alternative splicing, and post-translational modifications (PTMs). Top-down mass spectrometry is capable of analyzing intact proteins and identifying patterns of multiple PSAs, making it the method of choice for studying complex proteoforms. In top-down proteomics, proteoform identification is often performed by searching tandem mass spectra against a protein sequence database that contains only one reference protein sequence for each gene or transcript variant in a proteome. Because of the incompleteness of the protein database, an identified proteoform may contain unknown PSAs compared with the reference sequence. Proteoform characterization is to identify and localize PSAs in a proteoform. Although many software tools have been proposed for proteoform identification by top-down mass spectrometry, the characterization of proteoforms in identified proteoform-spectrum matches still relies mainly on manual annotation. We propose to use the Modification Identification Score (MIScore), which is based on Bayesian models, to automatically identify and localize PTMs in proteoforms. Experiments showed that the MIScore is accurate in identifying and localizing one or two modifications.


Subjects
Protein Processing, Post-Translational; Proteome/analysis; Proteomics/methods; Bayes Theorem; Escherichia coli/chemistry; Salmonella typhimurium/chemistry; Software; Tandem Mass Spectrometry
11.
J Proteome Res ; 13(7): 3241-8, 2014 Jul 03.
Article in English | MEDLINE | ID: mdl-24874765

ABSTRACT

There are two approaches for de novo protein sequencing: Edman degradation and mass spectrometry (MS). Existing MS-based methods characterize a novel protein by assembling tandem mass spectra of overlapping peptides generated from multiple proteolytic digestions of the protein. Because each tandem mass spectrum covers only a short peptide of the target protein, the key to high-coverage protein sequencing is to find spectral pairs from overlapping peptides so that tandem mass spectra can be assembled into longer ones. However, overlapping regions of peptides may be too short to be confidently identified. High-resolution mass spectrometers have become accessible to many laboratories. These mass spectrometers are capable of analyzing molecules of large mass values, boosting the development of top-down MS. Top-down tandem mass spectra cover whole proteins. However, top-down tandem mass spectra, even combined, rarely provide full ion fragmentation coverage of a protein. We propose an algorithm, TBNovo, for de novo protein sequencing by combining top-down and bottom-up MS. In TBNovo, a top-down tandem mass spectrum is utilized as a scaffold, and bottom-up tandem mass spectra are aligned to the scaffold to increase sequence coverage. Experiments on data sets of two proteins showed that TBNovo achieved high sequence coverage and high sequence accuracy.


Subjects
Peptide Mapping; Sequence Analysis, Protein; Alemtuzumab; Algorithms; Amino Acid Sequence; Animals; Antibodies, Monoclonal, Humanized/chemistry; Carbonic Anhydrase II/chemistry; Cattle; Molecular Sequence Data; Tandem Mass Spectrometry/methods
12.
BMC Genomics ; 15: 1140, 2014 Dec 18.
Article in English | MEDLINE | ID: mdl-25523396

ABSTRACT

BACKGROUND: Top-down mass spectrometry plays an important role in intact protein identification and characterization. Top-down mass spectra are more complex than bottom-up mass spectra because they often contain many isotopomer envelopes from highly charged ions, which may overlap with one another. As a result, spectral deconvolution, which converts a complex top-down mass spectrum into a monoisotopic mass list, is a key step in top-down spectral interpretation. RESULTS: In this paper, we propose a new scoring function, L-score, for evaluating isotopomer envelopes. By combining L-score with MS-Deconv, a new software tool, MS-Deconv+, was developed for top-down spectral deconvolution. Experimental results showed that MS-Deconv+ outperformed existing software tools in top-down spectral deconvolution. CONCLUSIONS: L-score shows high discriminative ability in identification of isotopomer envelopes. Using L-score, MS-Deconv+ reports many correct monoisotopic masses missed by other software tools, which are valuable for proteoform identification and characterization.


Subjects
Mass Spectrometry; Proteins/analysis; Algorithms; Bacterial Proteins/analysis; Bacterial Proteins/isolation & purification; Chromatography, High Pressure Liquid; Escherichia coli/genetics; Escherichia coli/metabolism; Logistic Models; Models, Theoretical; Molecular Weight; ROC Curve; Salmonella typhimurium/genetics; Salmonella typhimurium/metabolism; Software
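
Note on record 12: the abstract describes spectral deconvolution as converting a spectrum of highly charged ions into a monoisotopic mass list. The sketch below shows only the final arithmetic step, assuming positive-mode ions whose charge comes from attached protons and whose monoisotopic m/z and charge state have already been determined. It is not MS-Deconv+ or the L-score; the function name and example values are hypothetical.

```python
PROTON_MASS = 1.007276466  # mass of a proton in Da


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass of an ion observed at monoisotopic m/z `mz`
    carrying `charge` protons (positive mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


# A hypothetical 10000 Da proteoform seen at two charge states
# maps back to the same neutral mass.
for mz, z in [(1001.007276466, 10), (501.007276466, 20)]:
    print(round(neutral_mass(mz, z), 4))
```

The hard part of deconvolution, which this sketch skips, is deciding which peaks form an isotopomer envelope and which charge state it has, especially when envelopes overlap.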
13.
ScientificWorldJournal ; 2012: 753430, 2012.
Article in English | MEDLINE | ID: mdl-23365528

ABSTRACT

Gene expression network reconstruction from microarray data is widely studied as a way to investigate the joint behavior of a gene cluster. Under the Gaussian assumption, the conditional dependence between genes in the network is fully described by the partial correlation coefficient matrix. Because of the high dimensionality and sparsity of the problem, we use the LEP method to estimate this matrix. Compared with existing methods, LEP reaches the highest PPV with sensitivity controlled at a satisfactory level. A set of gene expression data from the HapMap project is analyzed for illustration.


Subjects
Computational Biology/methods; Gene Expression Profiling/statistics & numerical data; Gene Regulatory Networks; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Algorithms; Computer Simulation; Reproducibility of Results
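
Note on record 13: the abstract relies on the partial correlation coefficient matrix under a Gaussian assumption. The sketch below uses the standard identity that, for a precision matrix Ω (the inverse of the covariance matrix), the partial correlation between variables i and j given all others is -Ω_ij / sqrt(Ω_ii Ω_jj). It is not the LEP estimator from the paper; the function name and the 3x3 example matrix are hypothetical, and the precision matrix is taken as given rather than estimated.

```python
from math import sqrt


def partial_correlations(precision):
    """Partial correlation matrix derived from a precision (inverse covariance) matrix."""
    n = len(precision)
    result = [[1.0] * n for _ in range(n)]  # diagonal is 1 by convention
    for i in range(n):
        for j in range(n):
            if i != j:
                result[i][j] = -precision[i][j] / sqrt(precision[i][i] * precision[j][j])
    return result


# Hypothetical precision matrix for three genes arranged in a chain.
omega = [[2.0, -1.0, 0.0],
         [-1.0, 2.0, -1.0],
         [0.0, -1.0, 2.0]]
rho = partial_correlations(omega)
print(rho[0][1])       # 0.5
print(rho[0][2] == 0)  # True: genes 0 and 2 are conditionally independent
```

A zero entry in the precision matrix corresponds to a missing edge in the network, which is why sparse estimation of this matrix is the core of the reconstruction problem.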
14.
J Am Soc Mass Spectrom ; 32(6): 1336-1344, 2021 Jun 02.
Article in English | MEDLINE | ID: mdl-33725447

ABSTRACT

Labeling approaches using isobaric chemical tags (e.g., isobaric tagging for relative and absolute quantification, iTRAQ and tandem mass tag, TMT) have been widely applied for the quantification of peptides and proteins in bottom-up MS. However, until recently, successful applications of these approaches to top-down proteomics have been limited because proteins tend to precipitate and "crash" out of solution during TMT labeling of complex samples making the quantification of such samples difficult. In this study, we report a top-down TMT MS platform for confidently identifying and quantifying low molecular weight intact proteoforms in complex biological samples. To reduce the sample complexity and remove large proteins from complex samples, we developed a filter-SEC technique that combines a molecular weight cutoff filtration step with high-performance size exclusion chromatography (SEC) separation. No protein precipitation was observed in filtered samples under the intact protein-level TMT labeling conditions. The proposed top-down TMT MS platform enables high-throughput analysis of intact proteoforms, allowing for the identification and quantification of hundreds of intact proteoforms from Escherichia coli cell lysates. To our knowledge, this represents the first high-throughput TMT labeling-based, quantitative, top-down MS analysis suitable for complex biological samples.


Subjects
Escherichia coli Proteins/analysis; Escherichia coli Proteins/chemistry; Proteomics/methods; Tandem Mass Spectrometry/methods; Chromatography, Gel; Chromatography, Liquid/methods; Molecular Weight; Periplasmic Proteins/analysis; Periplasmic Proteins/chemistry; Peroxidases/analysis; Peroxidases/chemistry; Ribosomal Proteins/analysis; Ribosomal Proteins/chemistry
15.
J Vis Exp ; (140)2018 10 24.
Article in English | MEDLINE | ID: mdl-30417888

ABSTRACT

Capillary zone electrophoresis-electrospray ionization-tandem mass spectrometry (CZE-ESI-MS/MS) has been recognized as a useful tool for top-down proteomics that aims to characterize proteoforms in complex proteomes. However, the application of CZE-MS/MS for large-scale top-down proteomics has been impeded by the low sample-loading capacity and narrow separation window of CZE. Here, a protocol is described using CZE-MS/MS with a microliter-scale sample-loading volume and a 90-min separation window for large-scale top-down proteomics. The CZE-MS/MS platform is based on a linear polyacrylamide (LPA)-coated separation capillary with extremely low electroosmotic flow, a dynamic pH-junction-based online sample concentration method with a high efficiency for protein stacking, an electro-kinetically pumped sheath flow CE-MS interface with extremely high sensitivity, and an ion trap mass spectrometer with high mass resolution and scan speed. The platform can be used for the high-resolution characterization of simple intact protein samples and the large-scale characterization of proteoforms in various complex proteomes. As an example, a highly efficient separation of a standard protein mixture and a highly sensitive detection of many impurities using the platform is demonstrated. As another example, this platform can produce over 500 proteoform and 190 protein identifications from an Escherichia coli proteome in a single CZE-MS/MS run.


Subjects
Electrophoresis, Capillary/methods; Proteomics/methods; Tandem Mass Spectrometry/methods
16.
Article in English | MEDLINE | ID: mdl-29503761

ABSTRACT

Database search is the main approach for identifying proteoforms using top-down tandem mass spectra. However, it is extremely slow to align a query spectrum against all protein sequences in a large database when the target proteoform that produced the spectrum contains post-translational modifications and/or mutations. As a result, efficient and sensitive protein sequence filtering algorithms are essential for speeding up database search. In this paper, we propose a novel filtering algorithm, which generates spectrum graphs from subspectra of the query spectrum and searches them against the protein database to find good candidates. Compared with the sequence tag and gapped tag approaches, the proposed method circumvents the step of tag extraction, thus simplifying data processing. Experimental results on real data showed that the proposed method achieved both high speed and high sensitivity in protein sequence filtration.
