Results 1 - 20 of 29
1.
Genome Biol ; 25(1): 82, 2024 04 02.
Article in English | MEDLINE | ID: mdl-38566187

ABSTRACT

The spatial organization of molecules in a cell is essential for their functions. While current methods focus on discerning tissue architecture, cell-cell interactions, and spatial expression patterns, they are limited to the multicellular scale. We present Bento, a Python toolkit that takes advantage of single-molecule information to enable spatial analysis at the subcellular scale. Bento ingests molecular coordinates and segmentation boundaries to perform three analyses: defining subcellular domains, annotating localization patterns, and quantifying gene-gene colocalization. We demonstrate Bento on MERFISH, seqFISH+, Molecular Cartography, and Xenium datasets. Bento is part of the open-source Scverse ecosystem, enabling integration with other single-cell analysis tools.
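
As a rough illustration of what combining molecular coordinates with segmentation boundaries can involve, the sketch below assigns each molecule to a compartment with a generic ray-casting point-in-polygon test. It does not use Bento's API. The polygons, gene names, and coordinates are hypothetical, and ray casting is a stand-in technique chosen for brevity, not a method taken from the paper.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count how many polygon edges a rightward ray from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical segmentation boundaries (vertex lists) for one cell and its nucleus.
cell = [(0, 0), (10, 0), (10, 10), (0, 10)]
nucleus = [(3, 3), (7, 3), (7, 7), (3, 7)]

# Hypothetical single-molecule detections: (gene, x, y).
molecules = [("geneA", 5, 5), ("geneA", 1, 1), ("geneB", 9, 2), ("geneB", 12, 5)]


def compartment(x, y):
    """Label a coordinate as nucleus, cytoplasm, or outside the cell."""
    if point_in_polygon(x, y, nucleus):
        return "nucleus"
    if point_in_polygon(x, y, cell):
        return "cytoplasm"
    return "outside"


labels = [(gene, compartment(x, y)) for gene, x, y in molecules]
print(labels)
```

Per-gene counts by compartment, built from labels like these, are one simple starting point for describing localization; real data would call for a geometry library and spatial indexing.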


Subject(s)
Ecosystem , Propanolamines , Gene Expression Profiling , Cell Communication , Single-Cell Analysis , Transcriptome
3.
J Proteome Res ; 21(4): 1189-1195, 2022 04 01.
Article in English | MEDLINE | ID: mdl-35290070

ABSTRACT

It is important for the proteomics community to have a standardized manner to represent all possible variations of a protein or peptide primary sequence, including natural, chemically induced, and artifactual modifications. The Human Proteome Organization Proteomics Standards Initiative in collaboration with several members of the Consortium for Top-Down Proteomics (CTDP) has developed a standard notation called ProForma 2.0, which is a substantial extension of the original ProForma notation developed by the CTDP. ProForma 2.0 aims to unify the representation of proteoforms and peptidoforms. ProForma 2.0 supports use cases needed for bottom-up and middle-/top-down proteomics approaches and allows the encoding of highly modified proteins and peptides using a human- and machine-readable string. ProForma 2.0 can be used to represent protein modifications in a specified or ambiguous location, designated by mass shifts, chemical formulas, or controlled vocabulary terms, including cross-links (natural and chemical) and atomic isotopes. Notational conventions are based on public controlled vocabularies and ontologies. The most up-to-date full specification document and information about software implementations are available at http://psidev.info/proforma.


Subject(s)
Proteome , Proteomics , Humans , Protein Processing, Post-Translational , Proteome/genetics , Reference Standards , Software
4.
J Proteome Res ; 21(2): 410-419, 2022 02 04.
Article in English | MEDLINE | ID: mdl-35073098

ABSTRACT

Interpreting proteomics data remains challenging due to the large number of proteins that are quantified by modern mass spectrometry methods. Weighted gene correlation network analysis (WGCNA) can identify groups of biologically related proteins using only protein intensity values by constructing protein correlation networks. However, WGCNA is not widespread in proteomic analyses due to challenges in implementing workflows. To facilitate the adoption of WGCNA by the proteomics field, we created MetaNetwork, an open-source, R-based application to perform sophisticated WGCNA workflows with no coding skill requirements for the end user. We demonstrate MetaNetwork's utility by employing it to identify groups of proteins associated with prostate cancer from a proteomic analysis of tumor and adjacent normal tissue samples. We found a decrease in cytoskeleton-related protein expression, a known hallmark of prostate tumors. We further identified changes in module eigenproteins indicative of dysregulation in protein translation and trafficking pathways. These results demonstrate the value of using MetaNetwork to improve the biological interpretation of quantitative proteomics experiments with 15 or more samples.
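
The core idea of building a correlation network from protein intensities alone can be sketched in a few lines. The example below computes pairwise Pearson correlations and raises their absolute values to a soft-threshold power, which is the usual unsigned WGCNA adjacency. It is a minimal sketch, not MetaNetwork's code: the protein names and intensities are hypothetical, and beta = 6 is only a commonly used value, not a setting reported in the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def soft_threshold_adjacency(intensities, beta=6):
    """Unsigned adjacency |cor|**beta for every ordered pair of distinct proteins."""
    names = list(intensities)
    return {
        (p, q): abs(pearson(intensities[p], intensities[q])) ** beta
        for p in names
        for q in names
        if p != q
    }


# Hypothetical intensities for three proteins across four samples.
intensities = {
    "protA": [1.0, 2.0, 3.0, 4.0],
    "protB": [2.0, 4.0, 6.0, 8.0],
    "protC": [4.0, 1.0, 3.0, 2.0],
}

adjacency = soft_threshold_adjacency(intensities)
for pair, weight in sorted(adjacency.items()):
    print(pair, round(weight, 4))
```

Raising correlations to a power keeps strong co-variation (protA and protB here) near 1 while pushing weak correlations toward 0. Modules are then found by clustering on this network, a step not shown here.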


Subject(s)
Proteins , Proteomics , Cluster Analysis , Humans , Male , Mass Spectrometry , Workflow
5.
Science ; 375(6579): 411-418, 2022 01 28.
Article in English | MEDLINE | ID: mdl-35084980

ABSTRACT

Human biology is tightly linked to proteins, yet most measurements do not precisely determine alternatively spliced sequences or posttranslational modifications. Here, we present the primary structures of ~30,000 unique proteoforms, nearly 10 times more than in previous studies, expressed from 1690 human genes across 21 cell types and plasma from human blood and bone marrow. The results, compiled in the Blood Proteoform Atlas (BPA), indicate that proteoforms better describe protein-level biology and are more specific indicators of differentiation than their corresponding proteins, which are more broadly expressed across cell types. We demonstrate the potential for clinical application, by interrogating the BPA in the context of liver transplantation and identifying cell and proteoform signatures that distinguish normal graft function from acute rejection and other causes of graft dysfunction.


Subject(s)
Blood Cells/chemistry , Blood Proteins/chemistry , Bone Marrow Cells/chemistry , Databases, Protein , Protein Isoforms/chemistry , Proteome/chemistry , Alternative Splicing , B-Lymphocytes/chemistry , Blood Proteins/genetics , Cell Lineage , Humans , Leukocytes, Mononuclear/chemistry , Liver Transplantation , Plasma/chemistry , Protein Isoforms/genetics , Protein Processing, Post-Translational , Proteomics , T-Lymphocytes/chemistry
6.
Nature ; 590(7847): 649-654, 2021 02.
Article in English | MEDLINE | ID: mdl-33627808

ABSTRACT

The cell cycle, over which cells grow and divide, is a fundamental process of life. Its dysregulation has devastating consequences, including cancer [1-3]. The cell cycle is driven by precise regulation of proteins in time and space, which creates variability between individual proliferating cells. To our knowledge, no systematic investigations of such cell-to-cell proteomic variability exist. Here we present a comprehensive, spatiotemporal map of human proteomic heterogeneity by integrating proteomics at subcellular resolution with single-cell transcriptomics and precise temporal measurements of individual cells in the cell cycle. We show that around one-fifth of the human proteome displays cell-to-cell variability, identify hundreds of proteins with previously unknown associations with mitosis and the cell cycle, and provide evidence that several of these proteins have oncogenic functions. Our results show that cell cycle progression explains less than half of all cell-to-cell variability, and that most cycling proteins are regulated post-translationally, rather than by transcriptomic cycling. These proteins are disproportionately phosphorylated by kinases that regulate cell fate, whereas non-cycling proteins that vary between cells are more likely to be modified by kinases that regulate metabolism. This spatially resolved proteomic map of the cell cycle is integrated into the Human Protein Atlas and will serve as a resource for accelerating molecular studies of the human cell cycle and cell proliferation.


Subject(s)
Cell Cycle , Proteogenomics/methods , Single-Cell Analysis/methods , Transcriptome , Cell Cycle Proteins/metabolism , Cell Line, Tumor , Cell Lineage , Cell Proliferation , Humans , Interphase , Mitosis , Oncogene Proteins/metabolism , Phosphorylation , Protein Kinases/metabolism , Proteome/metabolism , Time Factors
7.
Trends Cancer ; 7(4): 278-282, 2021 04.
Article in English | MEDLINE | ID: mdl-33436349

ABSTRACT

Cellular heterogeneity is an important biological phenomenon observed across space and time in human tissues. Imaging-based spatial proteomic technologies can provide fruitful new readouts of phenotypic states for individual cells at subcellular resolution, which may help unravel the roles of non-genetic cellular heterogeneity in tumorigenesis and drug resistance.


Subject(s)
Proteomics , Humans , Molecular Imaging , Neoplasms/metabolism , Proteins/metabolism , Single-Cell Analysis
8.
J Proteome Res ; 20(4): 1826-1834, 2021 04 02.
Article in English | MEDLINE | ID: mdl-32967423

ABSTRACT

Proteoforms are the workhorses of the cell, and subtle differences between their amino acid sequences or post-translational modifications (PTMs) can change their biological function. To most effectively identify and quantify proteoforms in genetically diverse samples by mass spectrometry (MS), it is advantageous to search the MS data against a sample-specific protein database that is tailored to the sample being analyzed, in that it contains the correct amino acid sequences and relevant PTMs for that sample. To this end, we have developed Spritz (https://smith-chem-wisc.github.io/Spritz/), an open-source software tool for generating protein databases annotated with sequence variations and PTMs. We provide a simple graphical user interface for Windows and scripts that can be run on any operating system. Spritz automatically sets up and executes approximately 20 tools, which enable the construction of a proteogenomic database from only raw RNA sequencing data. Sequence variations that are discovered in RNA sequencing data upon comparison to the Ensembl reference genome are annotated on proteins in these databases, and PTM annotations are transferred from UniProt. Modifications can also be discovered and added to the database using bottom-up mass spectrometry data and global PTM discovery in MetaMorpheus. We demonstrate that such sample-specific databases allow the identification of variant peptides, modified variant peptides, and variant proteoforms by searching bottom-up and top-down proteomic data from the Jurkat human T lymphocyte cell line and demonstrate the identification of phosphorylated variant sites with phosphoproteomic data from the U2OS human osteosarcoma cell line.


Subject(s)
Proteogenomics , Databases, Protein , Humans , Mass Spectrometry , Protein Processing, Post-Translational , Proteomics , Software
9.
Mol Syst Biol ; 16(8): e9469, 2020 08.
Article in English | MEDLINE | ID: mdl-32744794

ABSTRACT

The nucleolus is essential for ribosome biogenesis and is involved in many other cellular functions. We performed a systematic spatiotemporal dissection of the human nucleolar proteome using confocal microscopy. In total, 1,318 nucleolar proteins were identified; 287 were localized to fibrillar components, and 157 were enriched along the nucleoplasmic border, indicating a potential fourth nucleolar subcompartment: the nucleoli rim. We found 65 nucleolar proteins (36 uncharacterized) to relocate to the chromosomal periphery during mitosis. Interestingly, we observed temporal partitioning into two recruitment phenotypes: early (prometaphase) and late (after metaphase), suggesting phase-specific functions. We further show that the expression of MKI67 is critical for this temporal partitioning. We provide the first proteome-wide analysis of intrinsic protein disorder for the human nucleolus and show that nucleolar proteins in general, and mitotic chromosome proteins in particular, have significantly higher intrinsic disorder level compared to cytosolic proteins. In summary, this study provides a comprehensive and essential resource of spatiotemporal expression data for the nucleolar proteome as part of the Human Protein Atlas.


Subject(s)
Cell Nucleolus/metabolism , Ki-67 Antigen/metabolism , Nuclear Proteins/metabolism , Proteomics/methods , Chromosomes, Human/metabolism , HEK293 Cells , Humans , Microscopy, Confocal , Mitosis , Phenotype , Single-Cell Analysis
11.
J Proteome Res ; 19(4): 1635-1646, 2020 04 03.
Article in English | MEDLINE | ID: mdl-32058723

ABSTRACT

Identifying single amino acid variants (SAAVs) in cancer is critical for precision oncology. Several advanced algorithms are now available to identify SAAVs, but attempts to combine different algorithms and optimize them on large data sets to achieve a more comprehensive coverage of SAAVs have not been implemented. Herein, we report an expanded detection of SAAVs in the PANC-1 cell line using three different strategies, which results in the identification of 540 SAAVs in the mass spectrometry data. Among the set of 540 SAAVs, 79 are evaluated as deleterious SAAVs based on analysis using the novel AssVar software in which one of the driver mutations found in each protein of KRAS, TP53, and SLC37A4 is further validated using independent selected reaction monitoring (SRM) analysis. Our study represents the most comprehensive discovery of SAAVs to date and the first large-scale detection of deleterious SAAVs in the PANC-1 cell line. This work may serve as the basis for future research in pancreatic cancer and personal immunotherapy and treatment.


Subject(s)
Amino Acids , Pancreatic Neoplasms , Antiporters , Cell Line , Humans , Monosaccharide Transport Proteins , Pancreatic Neoplasms/genetics , Precision Medicine , Proteins
14.
Nat Methods ; 16(12): 1254-1261, 2019 12.
Article in English | MEDLINE | ID: mdl-31780840

ABSTRACT

Pinpointing subcellular protein localizations from microscopy images is easy to the trained eye, but challenging to automate. Based on the Human Protein Atlas image collection, we held a competition to identify deep learning solutions to solve this task. Challenges included training on highly imbalanced classes and predicting multiple labels per image. Over 3 months, 2,172 teams participated. Despite convergence on popular networks and training techniques, there was considerable variety among the solutions. Participants applied strategies for modifying neural networks and loss functions, augmenting data and using pretrained networks. The winning models far outperformed our previous effort at multi-label classification of protein localization patterns by ~20%. These models can be used as classifiers to annotate new images, feature extractors to measure pattern similarity or pretrained networks for a wide range of biological applications.


Subject(s)
Deep Learning , Image Processing, Computer-Assisted/methods , Microscopy, Fluorescence/methods , Proteins/analysis , Humans
15.
RNA ; 25(10): 1337-1352, 2019 10.
Article in English | MEDLINE | ID: mdl-31296583

ABSTRACT

Proteins bind mRNA through their entire life cycle from transcription to degradation. We analyzed c-Myc mRNA protein interactors in vivo using the HyPR-MS method to capture the crosslinked mRNA by hybridization and then analyzed the bound proteins using mass spectrometry proteomics. Using HyPR-MS, 229 c-Myc mRNA-binding proteins were identified, confirming previously proposed interactors, suggesting new interactors, and providing information related to the roles and pathways known to involve c-Myc. We performed structural and functional analysis of these proteins and validated our findings with a combination of RIP-qPCR experiments, in vitro results released in past studies, publicly available RIP- and eCLIP-seq data, and results from software tools for predicting RNA-protein interactions.


Subject(s)
Mass Spectrometry/methods , Proto-Oncogene Proteins c-myc/metabolism , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , Chromatin Immunoprecipitation , Humans , K562 Cells , Protein Interaction Domains and Motifs
16.
J Proteome Res ; 17(10): 3526-3536, 2018 10 05.
Article in English | MEDLINE | ID: mdl-30180576

ABSTRACT

The development of effective strategies for the comprehensive identification and quantification of proteoforms in complex systems is a critical challenge in proteomics. Proteoforms, the specific molecular forms in which proteins are present in biological systems, are the key effectors of biological function. Thus, knowledge of proteoform identities and abundances is essential to unraveling the mechanisms that underlie protein function. We recently reported a strategy that integrates conventional top-down mass spectrometry with intact-mass determinations for enhanced proteoform identifications and the elucidation of proteoform families and applied it to the analysis of yeast cell lysate. In the present work, we extend this strategy to enable quantification of proteoforms, and we examine changes in the abundance of murine mitochondrial proteoforms upon differentiation of mouse myoblasts to myotubes. The integrated top-down and intact-mass strategy provided an increase of ∼37% in the number of identified proteoforms compared to top-down alone, which is in agreement with our previous work in yeast; 1779 unique proteoforms were identified using the integrated strategy compared to 1301 using top-down analysis alone. Quantitative comparison of proteoform differences between the myoblast and myotube cell types showed 129 observed proteoforms exhibiting statistically significant abundance changes (fold change >2 and false discovery rate <5%).
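
The figures quoted above can be checked with a short calculation, and the stated significance criteria can be written as a simple filter. This is a sketch under stated assumptions: the fold-change cut-off is treated as symmetric (an increase or a decrease of more than 2-fold), which the abstract does not spell out, and the function name and example values are hypothetical.

```python
# Counts reported in the abstract.
identified_integrated = 1779  # integrated top-down plus intact-mass strategy
identified_top_down = 1301    # top-down analysis alone

percent_increase = 100 * (identified_integrated - identified_top_down) / identified_top_down
print(f"{percent_increase:.1f}%")  # 36.7%, consistent with the reported ~37%


def is_significant(fold_change, fdr, min_fold=2.0, max_fdr=0.05):
    """Apply the stated thresholds: fold change > 2 and false discovery rate < 5%.

    fold_change is a positive abundance ratio between the two cell types;
    ratios below 1 are inverted so that decreases are treated like increases
    (an assumption made for this sketch).
    """
    magnitude = max(fold_change, 1 / fold_change)
    return magnitude > min_fold and fdr < max_fdr


# Hypothetical proteoform measurements: (fold change, FDR).
print(is_significant(2.5, 0.01))   # True
print(is_significant(0.4, 0.01))   # True, a 2.5-fold decrease
print(is_significant(1.5, 0.001))  # False, fold change too small
print(is_significant(3.0, 0.20))   # False, FDR too high
```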


Subject(s)
Mitochondria/metabolism , Mitochondrial Proteins/metabolism , Proteome/metabolism , Proteomics/methods , Tandem Mass Spectrometry/methods , Animals , Cell Differentiation , Cell Line , Mice , Muscle Fibers, Skeletal/cytology , Muscle Fibers, Skeletal/metabolism , Myoblasts/cytology , Myoblasts/metabolism , Reproducibility of Results , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/metabolism
17.
J Proteome Res ; 17(9): 3022-3038, 2018 09 07.
Article in English | MEDLINE | ID: mdl-29972301

ABSTRACT

RNA-protein interactions are integral to the regulation of gene expression. RNAs have diverse functions and the protein interactomes of individual RNAs vary temporally, spatially, and with physiological context. These factors make the global acquisition of individual RNA-protein interactomes an essential endeavor. Although techniques have been reported for discovery of the protein interactomes of specific RNAs they are largely laborious, costly, and accomplished singly in individual experiments. We developed HyPR-MS for the discovery and analysis of the protein interactomes of multiple RNAs in a single experiment while also reducing design time and improving efficiencies. Presented here is the application of HyPR-MS to simultaneously and selectively isolate the interactomes of lncRNAs MALAT1, NEAT1, and NORAD. Our analysis features the proteins that potentially contribute to both known and previously undiscovered roles of each lncRNA. This platform provides a powerful new multiplexing tool for the efficient and cost-effective elucidation of specific RNA-protein interactomes.


Subject(s)
Proteomics/methods , RNA, Long Noncoding/metabolism , RNA-Binding Proteins/metabolism , Base Sequence , Cell Line, Tumor , Gene Expression Regulation , Gene Ontology , Humans , Mass Spectrometry/methods , Molecular Sequence Annotation , Protein Binding , RNA, Long Noncoding/genetics , RNA-Binding Proteins/classification , RNA-Binding Proteins/genetics
18.
Transl Oncol ; 11(3): 808-814, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29723810

ABSTRACT

INTRODUCTION: The molecular mechanisms underlying aggressive versus indolent disease are not fully understood. Recent research has implicated a class of molecules known as long noncoding RNAs (lncRNAs) in tumorigenesis and progression of cancer. Our objective was to discover lncRNAs that differentiate aggressive and indolent prostate cancers. METHODS: We analyzed paired tumor and normal tissues from six aggressive Gleason score (GS) 8-10 and six indolent GS 6 prostate cancers. Extracted RNA was split for poly(A)+ and ribosomal RNA depletion library preparations, followed by RNA sequencing (RNA-Seq) using an Illumina HiSeq 2000. We developed an RNA-Seq data analysis pipeline to discover and quantify these molecules. Candidate lncRNAs were validated using RT-qPCR on 87 tumor tissue samples: 28 (GS 6), 28 (GS 3+4), 6 (GS 4+3), and 25 (GS 8-10). Statistical correlations between lncRNAs and clinicopathologic variables were tested using ANOVA. RESULTS: The 43 differentially expressed (DE) lncRNAs between aggressive and indolent prostate cancers included 12 annotated and 31 novel lncRNAs. The top six DE lncRNAs were selected based on large, consistent fold-changes in the RNA-Seq results. Three of these candidates passed RT-qPCR validation, including AC009014.3 (P < .001 in tumor tissue) and a newly discovered X-linked lncRNA named XPLAID (P = .049 in tumor tissue and P = .048 in normal tissue). XPLAID and AC009014.3 show promise as prognostic biomarkers. CONCLUSIONS: We discovered several dozen lncRNAs that distinguish aggressive and indolent prostate cancers, of which four were validated using RT-qPCR. The investigation into their biology is ongoing.

19.
J Proteome Res ; 17(3): 1321-1325, 2018 03 02.
Article in English | MEDLINE | ID: mdl-29397739

ABSTRACT

The Consortium for Top-Down Proteomics (CTDP) proposes a standardized notation, ProForma, for writing the sequence of fully characterized proteoforms. ProForma provides a means to communicate any proteoform by writing the amino acid sequence using standard one-letter notation and specifying modifications or unidentified mass shifts within brackets following certain amino acids. The notation is unambiguous, human-readable, and can easily be parsed and written by bioinformatic tools. This system uses seven rules and supports a wide range of possible use cases, ensuring compatibility and reproducibility of proteoform annotations. Standardizing proteoform sequences will simplify storage, comparison, and reanalysis of proteomic studies, and the Consortium welcomes input and contributions from the research community on the continued design and maintenance of this standard.
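
To make the bracket convention concrete, the sketch below tokenizes a deliberately simplified ProForma-style string: one-letter residues, each optionally followed by a single bracketed tag holding a modification name or a mass shift. It is an illustration only and does not implement the notation's full rule set. The example strings and function names are assumptions made for this sketch, not taken from the abstract.

```python
import re

# One uppercase residue letter, optionally followed by one bracketed tag.
TOKEN = re.compile(r"([A-Z])(?:\[([^\[\]]+)\])?")


def parse_simple_proforma(text):
    """Return a list of (residue, tag_or_None) pairs for a simplified string."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}: {text[pos]!r}")
        tokens.append((match.group(1), match.group(2)))
        pos = match.end()
    return tokens


def bare_sequence(tokens):
    """Recover the unmodified amino acid sequence."""
    return "".join(residue for residue, _ in tokens)


# Hypothetical example: two named modifications on a ten-residue peptide.
example = "EM[Oxidation]EVEES[Phospho]PEK"
parsed = parse_simple_proforma(example)

# 1-based positions of the modified residues.
modified = [(i + 1, res, tag) for i, (res, tag) in enumerate(parsed) if tag is not None]

print(bare_sequence(parsed))  # EMEVEESPEK
print(modified)               # [(2, 'M', 'Oxidation'), (7, 'S', 'Phospho')]
```

The same function also accepts a mass shift inside the brackets, for example `parse_simple_proforma("AC[+57.021]K")`, because the tag is stored as uninterpreted text.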


Subject(s)
Computational Biology/methods , Protein Processing, Post-Translational , Proteome/analysis , Proteomics/methods , Software , Tandem Mass Spectrometry/standards , Amino Acid Sequence , Computational Biology/statistics & numerical data , Databases, Protein/statistics & numerical data , Humans , Information Dissemination , International Cooperation , Molecular Sequence Annotation , Proteome/genetics , Proteome/metabolism , Proteomics/statistics & numerical data , Reproducibility of Results , Tandem Mass Spectrometry/methods
20.
J Proteome Res ; 17(1): 568-578, 2018 01 05.
Article in English | MEDLINE | ID: mdl-29195273

ABSTRACT

We present an open-source, interactive program named Proteoform Suite that uses proteoform mass and intensity measurements from complex biological samples to identify and quantify proteoforms. It constructs families of proteoforms derived from the same gene, assesses proteoform function using gene ontology (GO) analysis, and enables visualization of quantified proteoform families and their changes. It is applied here to reveal systemic proteoform variations in the yeast response to salt stress.


Subject(s)
Proteomics/methods , Software , Fungal Proteins/analysis , Fungal Proteins/drug effects , Gene Ontology , Mass Spectrometry , Salts/pharmacology , Stress, Physiological/drug effects