Results 1 - 20 of 29
1.
Nature ; 590(7847): 649-654, 2021 02.
Article in English | MEDLINE | ID: mdl-33627808

ABSTRACT

The cell cycle, over which cells grow and divide, is a fundamental process of life. Its dysregulation has devastating consequences, including cancer (refs 1-3). The cell cycle is driven by precise regulation of proteins in time and space, which creates variability between individual proliferating cells. To our knowledge, no systematic investigations of such cell-to-cell proteomic variability exist. Here we present a comprehensive, spatiotemporal map of human proteomic heterogeneity by integrating proteomics at subcellular resolution with single-cell transcriptomics and precise temporal measurements of individual cells in the cell cycle. We show that around one-fifth of the human proteome displays cell-to-cell variability, identify hundreds of proteins with previously unknown associations with mitosis and the cell cycle, and provide evidence that several of these proteins have oncogenic functions. Our results show that cell cycle progression explains less than half of all cell-to-cell variability, and that most cycling proteins are regulated post-translationally, rather than by transcriptomic cycling. These proteins are disproportionately phosphorylated by kinases that regulate cell fate, whereas non-cycling proteins that vary between cells are more likely to be modified by kinases that regulate metabolism. This spatially resolved proteomic map of the cell cycle is integrated into the Human Protein Atlas and will serve as a resource for accelerating molecular studies of the human cell cycle and cell proliferation.


Subject(s)
Cell Cycle , Proteogenomics/methods , Single-Cell Analysis/methods , Transcriptome , Cell Cycle Proteins/metabolism , Cell Line, Tumor , Cell Lineage , Cell Proliferation , Humans , Interphase , Mitosis , Oncogene Proteins/metabolism , Phosphorylation , Protein Kinases/metabolism , Proteome/metabolism , Time Factors
6.
J Proteome Res ; 21(2): 410-419, 2022 02 04.
Article in English | MEDLINE | ID: mdl-35073098

ABSTRACT

Interpreting proteomics data remains challenging due to the large number of proteins that are quantified by modern mass spectrometry methods. Weighted gene correlation network analysis (WGCNA) can identify groups of biologically related proteins using only protein intensity values by constructing protein correlation networks. However, WGCNA is not widespread in proteomic analyses due to challenges in implementing workflows. To facilitate the adoption of WGCNA by the proteomics field, we created MetaNetwork, an open-source, R-based application to perform sophisticated WGCNA workflows with no coding skill requirements for the end user. We demonstrate MetaNetwork's utility by employing it to identify groups of proteins associated with prostate cancer from a proteomic analysis of tumor and adjacent normal tissue samples. We found a decrease in cytoskeleton-related protein expression, a known hallmark of prostate tumors. We further identified changes in module eigenproteins indicative of dysregulation in protein translation and trafficking pathways. These results demonstrate the value of using MetaNetwork to improve the biological interpretation of quantitative proteomics experiments with 15 or more samples.


Subject(s)
Proteins , Proteomics , Cluster Analysis , Humans , Male , Mass Spectrometry , Workflow
7.
J Proteome Res ; 21(4): 1189-1195, 2022 04 01.
Article in English | MEDLINE | ID: mdl-35290070

ABSTRACT

It is important for the proteomics community to have a standardized manner to represent all possible variations of a protein or peptide primary sequence, including natural, chemically induced, and artifactual modifications. The Human Proteome Organization Proteomics Standards Initiative in collaboration with several members of the Consortium for Top-Down Proteomics (CTDP) has developed a standard notation called ProForma 2.0, which is a substantial extension of the original ProForma notation developed by the CTDP. ProForma 2.0 aims to unify the representation of proteoforms and peptidoforms. ProForma 2.0 supports use cases needed for bottom-up and middle-/top-down proteomics approaches and allows the encoding of highly modified proteins and peptides using a human- and machine-readable string. ProForma 2.0 can be used to represent protein modifications in a specified or ambiguous location, designated by mass shifts, chemical formulas, or controlled vocabulary terms, including cross-links (natural and chemical) and atomic isotopes. Notational conventions are based on public controlled vocabularies and ontologies. The most up-to-date full specification document and information about software implementations are available at http://psidev.info/proforma.


Subject(s)
Proteome , Proteomics , Humans , Protein Processing, Post-Translational , Proteome/genetics , Reference Standards , Software
8.
Nat Methods ; 16(12): 1254-1261, 2019 12.
Article in English | MEDLINE | ID: mdl-31780840

ABSTRACT

Pinpointing subcellular protein localizations from microscopy images is easy to the trained eye, but challenging to automate. Based on the Human Protein Atlas image collection, we held a competition to identify deep learning solutions to solve this task. Challenges included training on highly imbalanced classes and predicting multiple labels per image. Over 3 months, 2,172 teams participated. Despite convergence on popular networks and training techniques, there was considerable variety among the solutions. Participants applied strategies for modifying neural networks and loss functions, augmenting data and using pretrained networks. The winning models far outperformed our previous effort at multi-label classification of protein localization patterns by ~20%. These models can be used as classifiers to annotate new images, feature extractors to measure pattern similarity or pretrained networks for a wide range of biological applications.


Subject(s)
Deep Learning , Image Processing, Computer-Assisted/methods , Microscopy, Fluorescence/methods , Proteins/analysis , Humans
9.
J Proteome Res ; 20(4): 1826-1834, 2021 04 02.
Article in English | MEDLINE | ID: mdl-32967423

ABSTRACT

Proteoforms are the workhorses of the cell, and subtle differences between their amino acid sequences or post-translational modifications (PTMs) can change their biological function. To most effectively identify and quantify proteoforms in genetically diverse samples by mass spectrometry (MS), it is advantageous to search the MS data against a sample-specific protein database that is tailored to the sample being analyzed, in that it contains the correct amino acid sequences and relevant PTMs for that sample. To this end, we have developed Spritz (https://smith-chem-wisc.github.io/Spritz/), an open-source software tool for generating protein databases annotated with sequence variations and PTMs. We provide a simple graphical user interface for Windows and scripts that can be run on any operating system. Spritz automatically sets up and executes approximately 20 tools, which enable the construction of a proteogenomic database from only raw RNA sequencing data. Sequence variations that are discovered in RNA sequencing data upon comparison to the Ensembl reference genome are annotated on proteins in these databases, and PTM annotations are transferred from UniProt. Modifications can also be discovered and added to the database using bottom-up mass spectrometry data and global PTM discovery in MetaMorpheus. We demonstrate that such sample-specific databases allow the identification of variant peptides, modified variant peptides, and variant proteoforms by searching bottom-up and top-down proteomic data from the Jurkat human T lymphocyte cell line and demonstrate the identification of phosphorylated variant sites with phosphoproteomic data from the U2OS human osteosarcoma cell line.


Subject(s)
Proteogenomics , Databases, Protein , Humans , Mass Spectrometry , Protein Processing, Post-Translational , Proteomics , Software
10.
RNA ; 25(10): 1337-1352, 2019 10.
Article in English | MEDLINE | ID: mdl-31296583

ABSTRACT

Proteins bind mRNA through their entire life cycle from transcription to degradation. We analyzed c-Myc mRNA protein interactors in vivo using the HyPR-MS method to capture the crosslinked mRNA by hybridization and then analyzed the bound proteins using mass spectrometry proteomics. Using HyPR-MS, 229 c-Myc mRNA-binding proteins were identified, confirming previously proposed interactors, suggesting new interactors, and providing information related to the roles and pathways known to involve c-Myc. We performed structural and functional analysis of these proteins and validated our findings with a combination of RIP-qPCR experiments, in vitro results released in past studies, publicly available RIP- and eCLIP-seq data, and results from software tools for predicting RNA-protein interactions.


Subject(s)
Mass Spectrometry/methods , Proto-Oncogene Proteins c-myc/metabolism , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , Chromatin Immunoprecipitation , Humans , K562 Cells , Protein Interaction Domains and Motifs
11.
Mol Syst Biol ; 16(8): e9469, 2020 08.
Article in English | MEDLINE | ID: mdl-32744794

ABSTRACT

The nucleolus is essential for ribosome biogenesis and is involved in many other cellular functions. We performed a systematic spatiotemporal dissection of the human nucleolar proteome using confocal microscopy. In total, 1,318 nucleolar proteins were identified; 287 were localized to fibrillar components, and 157 were enriched along the nucleoplasmic border, indicating a potential fourth nucleolar subcompartment: the nucleoli rim. We found 65 nucleolar proteins (36 uncharacterized) to relocate to the chromosomal periphery during mitosis. Interestingly, we observed temporal partitioning into two recruitment phenotypes: early (prometaphase) and late (after metaphase), suggesting phase-specific functions. We further show that the expression of MKI67 is critical for this temporal partitioning. We provide the first proteome-wide analysis of intrinsic protein disorder for the human nucleolus and show that nucleolar proteins in general, and mitotic chromosome proteins in particular, have significantly higher intrinsic disorder level compared to cytosolic proteins. In summary, this study provides a comprehensive and essential resource of spatiotemporal expression data for the nucleolar proteome as part of the Human Protein Atlas.


Subject(s)
Cell Nucleolus/metabolism , Ki-67 Antigen/metabolism , Nuclear Proteins/metabolism , Proteomics/methods , Chromosomes, Human/metabolism , HEK293 Cells , Humans , Microscopy, Confocal , Mitosis , Phenotype , Single-Cell Analysis
12.
J Proteome Res ; 19(4): 1635-1646, 2020 04 03.
Article in English | MEDLINE | ID: mdl-32058723

ABSTRACT

Identifying single amino acid variants (SAAVs) in cancer is critical for precision oncology. Several advanced algorithms are now available to identify SAAVs, but attempts to combine different algorithms and optimize them on large data sets to achieve a more comprehensive coverage of SAAVs have not been implemented. Herein, we report an expanded detection of SAAVs in the PANC-1 cell line using three different strategies, which results in the identification of 540 SAAVs in the mass spectrometry data. Among the set of 540 SAAVs, 79 are evaluated as deleterious based on analysis using the novel AssVar software, and one of the driver mutations found in each of the proteins KRAS, TP53, and SLC37A4 is further validated using independent selected reaction monitoring (SRM) analysis. Our study represents the most comprehensive discovery of SAAVs to date and the first large-scale detection of deleterious SAAVs in the PANC-1 cell line. This work may serve as the basis for future research in pancreatic cancer and personal immunotherapy and treatment.


Subject(s)
Amino Acids , Pancreatic Neoplasms , Antiporters , Cell Line , Humans , Monosaccharide Transport Proteins , Pancreatic Neoplasms/genetics , Precision Medicine , Proteins
13.
J Proteome Res ; 17(9): 3022-3038, 2018 09 07.
Article in English | MEDLINE | ID: mdl-29972301

ABSTRACT

RNA-protein interactions are integral to the regulation of gene expression. RNAs have diverse functions, and the protein interactomes of individual RNAs vary temporally, spatially, and with physiological context. These factors make the global acquisition of individual RNA-protein interactomes an essential endeavor. Although techniques have been reported for discovery of the protein interactomes of specific RNAs, they are largely laborious, costly, and accomplished singly in individual experiments. We developed HyPR-MS for the discovery and analysis of the protein interactomes of multiple RNAs in a single experiment while also reducing design time and improving efficiencies. Presented here is the application of HyPR-MS to simultaneously and selectively isolate the interactomes of lncRNAs MALAT1, NEAT1, and NORAD. Our analysis features the proteins that potentially contribute to both known and previously undiscovered roles of each lncRNA. This platform provides a powerful new multiplexing tool for the efficient and cost-effective elucidation of specific RNA-protein interactomes.


Subject(s)
Proteomics/methods , RNA, Long Noncoding/metabolism , RNA-Binding Proteins/metabolism , Base Sequence , Cell Line, Tumor , Gene Expression Regulation , Gene Ontology , Humans , Mass Spectrometry/methods , Molecular Sequence Annotation , Protein Binding , RNA, Long Noncoding/genetics , RNA-Binding Proteins/classification , RNA-Binding Proteins/genetics
14.
J Proteome Res ; 17(10): 3526-3536, 2018 10 05.
Article in English | MEDLINE | ID: mdl-30180576

ABSTRACT

The development of effective strategies for the comprehensive identification and quantification of proteoforms in complex systems is a critical challenge in proteomics. Proteoforms, the specific molecular forms in which proteins are present in biological systems, are the key effectors of biological function. Thus, knowledge of proteoform identities and abundances is essential to unraveling the mechanisms that underlie protein function. We recently reported a strategy that integrates conventional top-down mass spectrometry with intact-mass determinations for enhanced proteoform identifications and the elucidation of proteoform families and applied it to the analysis of yeast cell lysate. In the present work, we extend this strategy to enable quantification of proteoforms, and we examine changes in the abundance of murine mitochondrial proteoforms upon differentiation of mouse myoblasts to myotubes. The integrated top-down and intact-mass strategy provided an increase of ∼37% in the number of identified proteoforms compared to top-down alone, which is in agreement with our previous work in yeast; 1779 unique proteoforms were identified using the integrated strategy compared to 1301 using top-down analysis alone. Quantitative comparison of proteoform differences between the myoblast and myotube cell types showed 129 observed proteoforms exhibiting statistically significant abundance changes (fold change >2 and false discovery rate <5%).
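
The figures in this abstract can be checked with a short sketch. The percent increase uses the two proteoform counts stated above; the filter applies the stated thresholds (fold change >2, false discovery rate <5%) to made-up records. Treating fold change symmetrically, so that a 4-fold decrease also passes, is an assumption, and the function names and record layout are hypothetical, not taken from the paper.

```python
def percent_increase(new_count, old_count):
    """Relative increase of new_count over old_count, in percent."""
    return (new_count - old_count) / old_count * 100.0


def significant_changes(records, min_fold=2.0, max_fdr=0.05):
    """Keep (name, fold_change, fdr) records passing both thresholds.

    fold_change must be positive. A ratio below 1 is inverted so that
    decreases are judged on the same scale (an assumption of this sketch).
    """
    kept = []
    for name, fold_change, fdr in records:
        magnitude = max(fold_change, 1.0 / fold_change)
        if magnitude > min_fold and fdr < max_fdr:
            kept.append(name)
    return kept


# Counts taken from the abstract: 1779 (integrated) versus 1301 (top-down alone).
increase = percent_increase(1779, 1301)

# Hypothetical records, for illustration only.
example = [("a", 3.0, 0.01), ("b", 1.5, 0.01), ("c", 0.25, 0.04), ("d", 4.0, 0.20)]
selected = significant_changes(example)
```

Here `increase` comes out near 36.7, consistent with the "∼37%" quoted in the abstract.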


Subject(s)
Mitochondria/metabolism , Mitochondrial Proteins/metabolism , Proteome/metabolism , Proteomics/methods , Tandem Mass Spectrometry/methods , Animals , Cell Differentiation , Cell Line , Mice , Muscle Fibers, Skeletal/cytology , Muscle Fibers, Skeletal/metabolism , Myoblasts/cytology , Myoblasts/metabolism , Reproducibility of Results , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism
15.
J Proteome Res ; 17(1): 568-578, 2018 01 05.
Article in English | MEDLINE | ID: mdl-29195273

ABSTRACT

We present an open-source, interactive program named Proteoform Suite that uses proteoform mass and intensity measurements from complex biological samples to identify and quantify proteoforms. It constructs families of proteoforms derived from the same gene, assesses proteoform function using gene ontology (GO) analysis, and enables visualization of quantified proteoform families and their changes. It is applied here to reveal systemic proteoform variations in the yeast response to salt stress.


Subject(s)
Proteomics/methods , Software , Fungal Proteins/analysis , Fungal Proteins/drug effects , Gene Ontology , Mass Spectrometry , Salts/pharmacology , Stress, Physiological/drug effects
16.
J Proteome Res ; 17(3): 1321-1325, 2018 03 02.
Article in English | MEDLINE | ID: mdl-29397739

ABSTRACT

The Consortium for Top-Down Proteomics (CTDP) proposes a standardized notation, ProForma, for writing the sequence of fully characterized proteoforms. ProForma provides a means to communicate any proteoform by writing the amino acid sequence using standard one-letter notation and specifying modifications or unidentified mass shifts within brackets following certain amino acids. The notation is unambiguous, human-readable, and can easily be parsed and written by bioinformatic tools. This system uses seven rules and supports a wide range of possible use cases, ensuring compatibility and reproducibility of proteoform annotations. Standardizing proteoform sequences will simplify storage, comparison, and reanalysis of proteomic studies, and the Consortium welcomes input and contributions from the research community on the continued design and maintenance of this standard.
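
As a rough illustration of the idea described above (one-letter residues, with a modification or mass shift in brackets after the residue it applies to), the following sketch parses a simplified ProForma-like string. It covers only that single pattern, not the seven rules of the standard, and the function name `parse_simple_proforma` and the example string are assumptions made for illustration.

```python
import re

# One uppercase residue letter, optionally followed by a single bracketed tag.
# This is a simplified subset assumed for this sketch, not the full notation.
_TOKEN = re.compile(r"([A-Z])(?:\[([^\[\]]+)\])?")


def parse_simple_proforma(text):
    """Return (bare_sequence, {1-based residue position: tag text})."""
    residues = []
    tags = {}
    cursor = 0
    for match in _TOKEN.finditer(text):
        if match.start() != cursor:
            # Something between tokens did not fit the simplified grammar.
            raise ValueError(f"unexpected character at index {cursor}")
        cursor = match.end()
        residues.append(match.group(1))
        if match.group(2) is not None:
            tags[len(residues)] = match.group(2)
    if cursor != len(text):
        raise ValueError(f"unexpected character at index {cursor}")
    return "".join(residues), tags


sequence, modifications = parse_simple_proforma("EM[Oxidation]EVEES[+79.966]PEK")
```

For the example string, `sequence` is the bare residue string and `modifications` maps positions 2 and 7 to their bracketed tags; a named modification and a numeric mass shift are handled the same way because the sketch stores the tag text uninterpreted.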


Subject(s)
Computational Biology/methods , Protein Processing, Post-Translational , Proteome/analysis , Proteomics/methods , Software , Tandem Mass Spectrometry/standards , Amino Acid Sequence , Computational Biology/statistics & numerical data , Databases, Protein/statistics & numerical data , Humans , Information Dissemination , International Cooperation , Molecular Sequence Annotation , Proteome/genetics , Proteome/metabolism , Proteomics/statistics & numerical data , Reproducibility of Results , Tandem Mass Spectrometry/methods
17.
Anal Chem ; 90(2): 1325-1333, 2018 01 16.
Article in English | MEDLINE | ID: mdl-29227670

ABSTRACT

In top-down proteomics, intact proteins are analyzed by tandem mass spectrometry and proteoforms, which are defined forms of a protein with specific sequences of amino acids and localized post-translational modifications, are identified using precursor mass and fragmentation data. Many proteoforms that are detected in the precursor scan (MS1) are not selected for fragmentation by the instrument and therefore remain unidentified in typical top-down proteomic workflows. Our laboratory has developed the open source software program Proteoform Suite to analyze MS1-only intact proteoform data. Here, we have adapted it to provide identifications of proteoform masses in precursor MS1 spectra of top-down data, supplementing the top-down identifications obtained using the MS2 fragmentation data. Proteoform Suite performs mass calibration using high-scoring top-down identifications and identifies additional proteoforms using calibrated, accurate intact masses. Proteoform families, the set of proteoforms from a given gene, are constructed and visualized from proteoforms identified by both top-down and intact-mass analyses. Using this strategy, we constructed proteoform families and identified 1861 proteoforms in yeast lysate, yielding an approximately 40% increase over the original 1291 proteoform identifications observed using traditional top-down analysis alone.


Subject(s)
Mass Spectrometry/methods , Proteome/analysis , Proteomics/methods , Saccharomyces cerevisiae Proteins/analysis , Saccharomyces cerevisiae/chemistry , Software
18.
J Proteome Res ; 16(11): 4156-4165, 2017 11 03.
Article in English | MEDLINE | ID: mdl-28968100

ABSTRACT

A proteoform family is a group of related molecular forms of a protein (proteoforms) derived from the same gene. We have previously described a strategy to identify proteoforms and elucidate proteoform families in complex mixtures of intact proteins. The strategy is based upon measurements of two properties for each proteoform: (i) the accurate proteoform intact-mass, measured by liquid chromatography/mass spectrometry (LC-MS), and (ii) the number of lysine residues in each proteoform, determined using an isotopic labeling approach. These measured properties are then compared with those extracted from a catalog of theoretical proteoforms containing protein sequences and localized post-translational modifications (PTMs) for the organism under study. A match between the measured properties and those in the catalog constitutes an identification of the proteoform. In the present study, this strategy is extended by utilizing a global PTM discovery database and is applied to the widely studied model organism Escherichia coli, providing the most comprehensive elucidation of E. coli proteoforms and proteoform families to date.
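
The matching step described above, comparing a measured intact mass and lysine count against a catalog of theoretical proteoforms, might be sketched as follows. The 10 ppm mass tolerance, the class and function names, and the catalog entries are all hypothetical; the abstract does not state the tolerance or data structures the authors used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TheoreticalProteoform:
    name: str
    mass_da: float       # theoretical intact mass in daltons
    lysine_count: int    # number of lysine residues in the sequence


def match_proteoforms(observed_mass_da, observed_lysines, catalog, tolerance_ppm=10.0):
    """Return catalog entries whose mass and lysine count both agree.

    tolerance_ppm is an assumed value chosen for this sketch.
    """
    hits = []
    for entry in catalog:
        ppm_error = (observed_mass_da - entry.mass_da) / entry.mass_da * 1e6
        if abs(ppm_error) <= tolerance_ppm and entry.lysine_count == observed_lysines:
            hits.append(entry)
    return hits


# Made-up catalog entries, for illustration only.
catalog = [
    TheoreticalProteoform("unmodified", 10000.00, 5),
    TheoreticalProteoform("near-isobaric, extra lysine", 10000.05, 6),
    TheoreticalProteoform("acetylated", 10042.01, 5),
]
matches = match_proteoforms(10000.02, 5, catalog)
```

In this example the second entry falls within the mass tolerance but is rejected by its lysine count, which shows why the second measured property helps disambiguate candidates of similar mass.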


Subject(s)
Escherichia coli/chemistry , Multigene Family , Protein Processing, Post-Translational , Proteomics/methods , Chromatography, Liquid , Databases, Protein , Lysine/analysis , Tandem Mass Spectrometry
19.
BMC Genomics ; 18(1): 877, 2017 Nov 13.
Article in English | MEDLINE | ID: mdl-29132314

ABSTRACT

BACKGROUND: Shotgun proteomics utilizes a database search strategy to compare detected mass spectra to a library of theoretical spectra derived from reference genome information. As such, the robustness of proteomics results is contingent upon the completeness and accuracy of the gene annotation in the reference genome. For animal models of disease where genomic annotation is incomplete, such as non-human primates, proteogenomic methods can improve the detection of proteins by incorporating transcriptional data from RNA-Seq to improve proteomics search databases used for peptide spectral matching. Customized search databases derived from RNA-Seq data are capable of identifying unannotated genetic and splice variants while simultaneously reducing the number of comparisons to only those transcripts actively expressed in the tissue.
RESULTS: We collected RNA-Seq and proteomic data from 10 vervet monkey liver samples and used the RNA-Seq data to curate sample-specific search databases which were analyzed in the program Morpheus. We compared these results against those from a search database generated from the reference vervet genome. A total of 284 previously unannotated splice junctions were predicted by the RNA-Seq data, 92 of which were confirmed by peptide spectral matches. More than half (53/92) of these unannotated splice variants had orthologs in other non-human primates, suggesting that failure to match these peptides in the reference analyses likely arose from incomplete gene model information. The sample-specific databases also identified 101 unique peptides containing single amino acid substitutions which were missed by the reference database. Because the sample-specific searches were restricted to actively expressed transcripts, the search databases were smaller, more computationally efficient, and identified more peptides at the empirically derived 1% false discovery rate.
CONCLUSION: Proteogenomic approaches are ideally suited to facilitate the discovery and annotation of proteins in less widely studied animal models such as non-human primates. We expect that these approaches will help to improve existing genome annotations of non-human primate species such as vervet.


Subject(s)
Mass Spectrometry , Proteomics/methods , Sequence Analysis, RNA , Animals , Chlorocebus aethiops , Databases, Genetic , Molecular Sequence Annotation , Proteomics/standards , Reference Standards
20.
Genomics ; 107(6): 267-73, 2016 06.
Article in English | MEDLINE | ID: mdl-27184763

ABSTRACT

Currently available methods for interrogating DNA-protein interactions at individual genomic loci have significant limitations, and make it difficult to work with unmodified cells or examine single-copy regions without specific antibodies. In this study, we describe a physiological application of the Hybridization Capture of Chromatin-Associated Proteins for Proteomics (HyCCAPP) methodology we have developed. We identified both novel and known locus-specific DNA-protein interactions at the ENO2 and GAL1 promoter regions of Saccharomyces cerevisiae, revealing subgroups of proteins present at significantly different levels at the loci in cells grown on glucose versus galactose as the carbon source. Results were validated using chromatin immunoprecipitation. Overall, our analysis demonstrates that HyCCAPP is an effective and flexible technology that requires neither specific antibodies nor prior knowledge of locally occurring DNA-protein interactions and can now be used to identify changes in protein interactions at target regions in the genome in response to physiological challenges.


Subject(s)
DNA-Binding Proteins/genetics , Galactokinase/genetics , Phosphopyruvate Hydratase/genetics , Proteomics/methods , Saccharomyces cerevisiae Proteins/genetics , Chromatin/genetics , Chromatin Immunoprecipitation/methods , Promoter Regions, Genetic , Protein Binding/genetics , Saccharomyces cerevisiae/genetics