ABSTRACT
The cell cycle, over which cells grow and divide, is a fundamental process of life. Its dysregulation has devastating consequences, including cancer [1-3]. The cell cycle is driven by precise regulation of proteins in time and space, which creates variability between individual proliferating cells. To our knowledge, no systematic investigations of such cell-to-cell proteomic variability exist. Here we present a comprehensive, spatiotemporal map of human proteomic heterogeneity by integrating proteomics at subcellular resolution with single-cell transcriptomics and precise temporal measurements of individual cells in the cell cycle. We show that around one-fifth of the human proteome displays cell-to-cell variability, identify hundreds of proteins with previously unknown associations with mitosis and the cell cycle, and provide evidence that several of these proteins have oncogenic functions. Our results show that cell cycle progression explains less than half of all cell-to-cell variability, and that most cycling proteins are regulated post-translationally, rather than by transcriptomic cycling. These proteins are disproportionately phosphorylated by kinases that regulate cell fate, whereas non-cycling proteins that vary between cells are more likely to be modified by kinases that regulate metabolism. This spatially resolved proteomic map of the cell cycle is integrated into the Human Protein Atlas and will serve as a resource for accelerating molecular studies of the human cell cycle and cell proliferation.
Subject(s)
Cell Cycle , Proteogenomics/methods , Single-Cell Analysis/methods , Transcriptome , Cell Cycle Proteins/metabolism , Cell Line, Tumor , Cell Lineage , Cell Proliferation , Humans , Interphase , Mitosis , Oncogene Proteins/metabolism , Phosphorylation , Protein Kinases/metabolism , Proteome/metabolism , Time Factors
ABSTRACT
Efforts to understand the complexities of human biology span many dimensions, with proteins emerging as crucial components. However, studying the human ovary introduces unique challenges due to its complex dynamics and changes over a lifetime, varied cellular composition, and limited sample access. Here, four new RNA-seq samples of ovarian cortex, spanning ages 7 to 32, were sequenced and added to the existing data in the Human Protein Atlas (HPA) database (www.proteinatlas.org), opening unique possibilities for exploring oocyte-specific proteins. Based on transcriptomics analysis of the four new tissue samples, representing both prepubertal girls and women of fertile age, we selected 20 protein candidates that lacked previous evidence at the protein level, so-called "missing proteins" (MPs). The proteins were validated using high-resolution antibody-based profiling and single-cell transcriptomics. Fourteen proteins exhibited consistent single-cell expression patterns in oocytes and granulosa cells, confirming their presence in the ovary and suggesting that they play important roles in ovarian function; we therefore propose that these 14 proteins should no longer be classified as MPs. This research advances the understanding of MPs and opens new avenues for exploration. By integrating innovative methodologies and leveraging the wealth of data in the HPA database, these insights help refine our understanding of protein roles within the human ovary and pave the way for further investigations into missing proteins and human reproduction.
Subject(s)
Ovary , Proteomics , Humans , Female , Prospective Studies , Oocytes , Proteins/metabolism , Gene Expression Profiling
ABSTRACT
Since 2010, the Human Proteome Project (HPP), the flagship initiative of the Human Proteome Organization (HUPO), has pursued two goals: (1) to credibly identify the protein parts list and (2) to make proteomics an integral part of multiomics studies of human health and disease. The HPP relies on international collaboration, data sharing, standardized reanalysis of MS data sets by PeptideAtlas and MassIVE-KB using HPP Guidelines for quality assurance, integration and curation of MS and non-MS protein data by neXtProt, plus extensive use of antibody profiling carried out by the Human Protein Atlas. According to the neXtProt release 2023-04-18, protein expression has now been credibly detected (PE1) for 18,397 of the 19,778 neXtProt predicted proteins coded in the human genome (93%). Of these PE1 proteins, 17,453 were detected with mass spectrometry (MS) in accordance with HPP Guidelines and 944 by a variety of non-MS methods. The number of neXtProt PE2, PE3, and PE4 missing proteins now stands at 1381. Achieving the unambiguous identification of 93% of predicted proteins encoded from across all chromosomes represents remarkable experimental progress on the Human Proteome parts list. Meanwhile, there are several categories of predicted proteins that have proved resistant to detection regardless of protein-based methods used. Additionally there are some PE1-4 proteins that probably should be reclassified to PE5, specifically 21 LINC entries and ~30 HERV entries; these are being addressed in the present year. Applying proteomics in a wide array of biological and clinical studies ensures integration with other omics platforms as reported by the Biology and Disease-driven HPP teams and the antibody and pathology resource pillars. Current progress has positioned the HPP to transition to its Grand Challenge Project focused on determining the primary function(s) of every protein itself and in networks and pathways within the context of human health and disease.
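The counts quoted in this abstract are mutually consistent, which a few lines of arithmetic can confirm. The following is only a minimal sketch: the variable names are illustrative and not taken from neXtProt or the HPP, and the only inputs are the three figures stated above (19,778 predicted proteins, 17,453 MS-based PE1 detections, 944 non-MS PE1 detections).

```python
# Figures as quoted in the abstract (neXtProt release 2023-04-18).
# Variable names are illustrative, not neXtProt terminology.
predicted = 19_778   # predicted proteins coded in the human genome
pe1_ms = 17_453      # PE1 proteins detected by mass spectrometry
pe1_non_ms = 944     # PE1 proteins detected by non-MS methods

pe1_total = pe1_ms + pe1_non_ms       # all credibly detected (PE1) proteins
missing = predicted - pe1_total       # PE2 + PE3 + PE4 "missing proteins"
pe1_fraction = pe1_total / predicted  # share of predicted proteins at PE1

print(pe1_total, missing, f"{pe1_fraction:.1%}")
```

The sum of the two PE1 subgroups reproduces the 18,397 total, the remainder reproduces the 1381 missing proteins, and the ratio rounds to the 93% reported.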
Subject(s)
Antibodies , Proteome , Humans , Proteome/genetics , Proteome/analysis , Databases, Protein , Mass Spectrometry/methods , Proteomics/methods
ABSTRACT
In the quest for "missing proteins" (MPs), the proteins encoded by the human genome still lacking evidence of existence at the protein level, novel approaches are needed to detect this challenging group of proteins. The current count stands at 1,343 MPs, and it is likely that many of these proteins are expressed at low levels, in rare cell or tissue types, or the cells in which they are expressed may only represent a small minority of the tissue. Here, we used an integrated omics approach to identify and explore MPs in human ovaries. By taking advantage of publicly available transcriptomics and antibody-based proteomics data in the Human Protein Atlas (HPA), we selected 18 candidates for further immunohistochemical analysis using an exclusive collection of ovarian tissues from women and patients of reproductive age. The results were compared with data from single-cell mRNA sequencing, and seven proteins (CTXN1, MRO, RERGL, TTLL3, TRIM61, TRIM73, and ZNF793) could be validated at the single-cell type level with both methods. We present for the first time the cell type-specific spatial localization of 18 MPs in human ovarian follicles, thereby showcasing the utility of the HPA database as an important resource for identification of MPs suitable for exploration in specialized tissue samples. The results constitute a starting point for further quantitative and qualitative analysis of the human ovaries, and the novel data for the seven proteins that were validated with both methods should be considered as evidence of existence of these proteins in human ovary.
Subject(s)
Ovary , Proteomics , Humans , Female , Ovary/chemistry , Proteomics/methods , Proteins/metabolism , Antibodies/metabolism , Gene Expression Profiling , Proteome/genetics , Proteome/analysis
ABSTRACT
The 2022 Metrics of the Human Proteome from the HUPO Human Proteome Project (HPP) show that protein expression has now been credibly detected (neXtProt PE1 level) for 18,407 (93.2%) of the 19,750 predicted proteins coded in the human genome, a net gain of 50 since 2021 from data sets generated around the world and reanalyzed by the HPP. Conversely, the number of neXtProt PE2, PE3, and PE4 missing proteins has been reduced by 78 from 1421 to 1343. This represents continuing experimental progress on the human proteome parts list across all the chromosomes, as well as significant reclassifications. Meanwhile, applying proteomics in a vast array of biological and clinical studies continues to yield significant findings and growing integration with other omics platforms. We present highlights from the Chromosome-Centric HPP, Biology and Disease-driven HPP, and HPP Resource Pillars, compare features of mass spectrometry and Olink and Somalogic platforms, note the emergence of translation products from ribosome profiling of small open reading frames, and discuss the launch of the initial HPP Grand Challenge Project, "A Function for Each Protein".
Subject(s)
Proteome , Proteomics , Humans , Proteome/genetics , Proteome/analysis , Databases, Protein , Mass Spectrometry/methods , Open Reading Frames , Proteomics/methods
ABSTRACT
A multitude of efforts worldwide aim to create a single-cell reference map of the human body, for fundamental understanding of human health, molecular medicine, and targeted treatment. Antibody-based proteomics using immunohistochemistry (IHC) has proven to be an excellent technology for integration with large-scale single-cell transcriptomics datasets. The gold standard for evaluation of IHC staining patterns is manual annotation, which is expensive and may lead to subjective errors. Artificial intelligence holds much promise for efficient and accurate pattern recognition, but confidence in prediction needs to be addressed. Here, the aim was to present a reliable and comprehensive framework for automated annotation of IHC images. We developed a multilabel classification of 7848 complex IHC images of human testis corresponding to 2794 unique proteins, generated as part of the Human Protein Atlas (HPA) project. Manual annotation data for eight different cell types was generated as a basis for training and testing a proposed Hybrid Bayesian Neural Network. By combining the deep learning model with a novel uncertainty metric, DeepHistoClass (DHC) Confidence Score, the average diagnostic performance improved from 86.9% to 96.3%. This metric not only reveals which images are reliably classified by the model, but can also be utilized for identification of manual annotation errors. The proposed streamlined workflow can be developed further for other tissue types in health and disease and has important implications for digital pathology initiatives or large-scale protein mapping efforts such as the HPA project.
Subject(s)
Deep Learning , Image Processing, Computer-Assisted/methods , Proteins/metabolism , Testis/metabolism , Bayes Theorem , Humans , Immunohistochemistry/classification , Male , Workflow
ABSTRACT
BACKGROUND: There is a need for functional genome-wide annotation of the protein-coding genes to get a deeper understanding of mammalian biology. Here, a new annotation strategy is introduced based on dimensionality reduction and density-based clustering of whole-body co-expression patterns. This strategy has been used to explore the gene expression landscape in pig, and we present a whole-body map of all protein-coding genes in all major pig tissues and organs. RESULTS: An open-access pig expression map ( www.rnaatlas.org ) is presented based on the expression of 350 samples across 98 well-defined pig tissues divided into 44 tissue groups. A new UMAP-based classification scheme is introduced, in which all protein-coding genes are stratified into tissue expression clusters based on body-wide expression profiles. The distribution and tissue specificity of all 22,342 protein-coding pig genes are presented. CONCLUSIONS: Here, we present a new genome-wide annotation strategy based on dimensionality reduction and density-based clustering. A genome-wide resource of the transcriptome map across all major tissues and organs in pig is presented, and the data is available as an open-access resource ( www.rnaatlas.org ), including a comparison to the expression of human orthologs.
Subject(s)
Genome , Genomics , Animals , Gene Expression Profiling , Mammals , Molecular Sequence Annotation , Organ Specificity , Swine/genetics , Transcriptome
ABSTRACT
SARS-coronavirus 2 (SARS-CoV-2), which caused the coronavirus disease 2019 (COVID-19) pandemic, has proved to be a global challenge. An increasing number of neurological symptoms have been linked to COVID-19, but the underlying mechanisms of such symptoms and which patients could be at risk are not yet established. The suggested key receptor for host cell entry is angiotensin I converting enzyme 2 (ACE2). Previous studies on limited tissue material have shown no or low protein expression of ACE2 in the normal brain. Here, we used stringently validated antibodies and immunohistochemistry to examine the protein expression of ACE2 in all major regions of the normal brain. The expression pattern was compared with the COVID-19-affected brain of patients with a varying degree of neurological symptoms. In the normal brain, the expression was restricted to the choroid plexus and ependymal cells with no expression in any other brain cell types. Interestingly, in the COVID-19-affected brain, an upregulation of ACE2 was observed in endothelial cells of certain patients, most prominently in the white matter and with the highest expression observed in the patient with the most severe neurological symptoms. The data show differential expression of ACE2 in the diseased brain and highlight the need to further study the role of endothelial cells in COVID-19 in relation to neurological symptoms.
Subject(s)
Angiotensin-Converting Enzyme 2 , COVID-19 , Angiotensin-Converting Enzyme 2/genetics , Brain/metabolism , Endothelial Cells/metabolism , Humans , Peptidyl-Dipeptidase A/genetics , Peptidyl-Dipeptidase A/metabolism , SARS-CoV-2
ABSTRACT
BACKGROUND & AIMS: Colorectal cancer (CRC) is thought to arise when the cumulative mutational burden within colonic crypts exceeds a certain threshold that leads to clonal expansion and ultimately neoplastic transformation. Therefore, quantification of the fixation and subsequent expansion of somatic mutations in normal epithelium is key to understanding colorectal cancer initiation. The aim of the present study was to determine how advantaged expansions can be accommodated in the human colon. METHODS: Immunohistochemistry was used to visualize loss of the cancer driver KDM6A in formalin-fixed paraffin-embedded (FFPE) normal human colonic epithelium. Combining microscopy with neural network-based image analysis, we determined the frequencies of KDM6A-mutant crypts and fission/fusion intermediates as well as the spatial distribution of clones. Mathematical modeling then defined the dynamics of their fixation and expansion. RESULTS: Interpretation of the age-related behavior of KDM6A-negative clones revealed significant competitive advantage in intracrypt dynamics as well as a 5-fold increase in crypt fission rate. This was not accompanied by an increase in crypt fusion. Mathematical modeling of crypt spacing identifies evidence for a crypt diffusion process. We define the threshold fission rate at which diffusion fails to accommodate new crypts, which can be exceeded by KRAS activating mutations. CONCLUSIONS: Advantaged gene mutations in KDM6A expand dramatically by crypt fission but not fusion. The crypt diffusion process enables accommodation of the additional crypts up to a threshold value, beyond which polyp growth may occur. The fission rate associated with KRAS mutations offers a potential explanation for KRAS-initiated polyps.
Subject(s)
Cell Proliferation , Cell Transformation, Neoplastic/genetics , Colonic Polyps/genetics , Colorectal Neoplasms/genetics , Epithelial Cells/pathology , Histone Demethylases/genetics , Intestinal Mucosa/pathology , Mutation , Neoplastic Stem Cells/pathology , Proto-Oncogene Proteins p21(ras)/genetics , Adolescent , Adult , Age Factors , Aged , Aged, 80 and over , Cell Transformation, Neoplastic/metabolism , Cell Transformation, Neoplastic/pathology , Colonic Polyps/metabolism , Colonic Polyps/pathology , Colorectal Neoplasms/metabolism , Colorectal Neoplasms/pathology , Diffusion , Epithelial Cells/metabolism , Female , Histone Demethylases/metabolism , Humans , Intestinal Mucosa/metabolism , Male , Middle Aged , Models, Biological , Neoplastic Stem Cells/metabolism , Proto-Oncogene Proteins p21(ras)/metabolism , Young Adult
ABSTRACT
Immune cells of the tumor microenvironment are central but erratic targets for immunotherapy. The aim of this study was to characterize novel patterns of immune cell infiltration in non-small cell lung cancer (NSCLC) in relation to its molecular and clinicopathologic characteristics. Lymphocytes (CD3+, CD4+, CD8+, CD20+, FOXP3+, CD45RO+), macrophages (CD163+), plasma cells (CD138+), NK cells (NKp46+), PD1+, and PD-L1+ were annotated on a tissue microarray including 357 NSCLC cases. Somatic mutations were analyzed by targeted sequencing for 82 genes and a tumor mutational load score was estimated. Transcriptomic immune patterns were established in 197 patients based on RNA sequencing data. The immune cell infiltration was variable and showed only poor association with specific mutations. The previously defined immune phenotypic patterns, desert, inflamed, and immune excluded, comprised 30, 13, and 57% of cases, respectively. Notably, mRNA immune activation and high estimated tumor mutational load were unique only for the inflamed pattern. However, in the unsupervised cluster analysis, including all immune cell markers, these conceptual patterns were only weakly reproduced. Instead, four immune classes were identified: (1) high immune cell infiltration, (2) high immune cell infiltration with abundance of CD20+ B cells, (3) low immune cell infiltration, and (4) a phenotype with an imprint of plasma cells and NK cells. This latter class was linked to better survival despite exhibiting low expression of immune response-related genes (e.g. CXCL9, GZMB, INFG, CTLA4). This compartment-specific immune cell analysis in the context of the molecular and clinical background of NSCLC reveals two previously unrecognized immune classes. A refined immune classification, including traits of the humoral and innate immune response, is important to define the immunogenic potency of NSCLC in the era of immunotherapy. © 2021 The Authors. 
The Journal of Pathology published by John Wiley & Sons, Ltd. on behalf of The Pathological Society of Great Britain and Ireland.
Subject(s)
Carcinoma, Non-Small-Cell Lung/immunology , Killer Cells, Natural/immunology , Lung Neoplasms/immunology , Plasma Cells , Tumor Microenvironment/immunology , Adult , Aged , Female , Humans , Lymphocytes, Tumor-Infiltrating/immunology , Male , Middle Aged
ABSTRACT
The 2021 Metrics of the HUPO Human Proteome Project (HPP) show that protein expression has now been credibly detected (neXtProt PE1 level) for 18,357 (92.8%) of the 19,778 predicted proteins coded in the human genome, a gain of 483 since 2020 from reports throughout the world reanalyzed by the HPP. Conversely, the number of neXtProt PE2, PE3, and PE4 missing proteins has been reduced by 478 to 1421. This represents remarkable progress on the proteome parts list. The utilization of proteomics in a broad array of biological and clinical studies likewise continues to expand with many important findings and effective integration with other omics platforms. We present highlights from the Immunopeptidomics, Glycoproteomics, Infectious Disease, Cardiovascular, Musculo-Skeletal, Liver, and Cancers B/D-HPP teams and from the Knowledgebase, Mass Spectrometry, Antibody Profiling, and Pathology resource pillars, as well as ethical considerations important to the clinical utilization of proteomics and protein biomarkers.
Subject(s)
Benchmarking , Proteome , Databases, Protein , Humans , Mass Spectrometry/methods , Proteome/analysis , Proteome/genetics , Proteomics/methods
ABSTRACT
The novel SARS-coronavirus 2 (SARS-CoV-2) poses a global challenge to healthcare and society. Understanding susceptibility to SARS-CoV-2 infection requires knowledge of the cell type-specific expression of the host cell surface receptor. The key protein suggested to be involved in host cell entry is angiotensin I converting enzyme 2 (ACE2). Here, we report the expression pattern of ACE2 across > 150 different cell types corresponding to all major human tissues and organs based on stringent immunohistochemical analysis. The results were compared with several datasets at both the mRNA and protein level. ACE2 expression was mainly observed in enterocytes, renal tubules, gallbladder, cardiomyocytes, male reproductive cells, placental trophoblasts, ductal cells, eye, and vasculature. In the respiratory system, the expression was limited, with no or only low expression in a subset of cells in a few individuals, observed by one antibody only. Our data constitute an important resource for further studies on SARS-CoV-2 host cell entry, in order to understand the biology of the disease and to aid in the development of effective treatments for the viral infection.
Subject(s)
Peptidyl-Dipeptidase A/metabolism , Respiratory System/metabolism , Angiotensin-Converting Enzyme 2 , Betacoronavirus , Blood Vessels/metabolism , Conjunctiva/metabolism , Enterocytes/metabolism , Female , Gallbladder/metabolism , Host Microbial Interactions , Humans , Immunohistochemistry , Kidney Tubules, Proximal/metabolism , Male , Mass Spectrometry , Myocytes, Cardiac/metabolism , Organ Specificity , Peptidyl-Dipeptidase A/genetics , Placenta/metabolism , Pregnancy , RNA-Seq , SARS-CoV-2 , Single-Cell Analysis , Testis/metabolism
ABSTRACT
BACKGROUND: Programmed cell death 1 (PD-1) and its ligands PD-L1 and PD-L2, as well as indoleamine 2,3-dioxygenase 1 (IDO1), can be expressed both by tumor and microenvironmental cells and are crucial for tumor immune escape. We aimed to evaluate the role of PD-1, its ligands and IDO1 in a cohort of patients with primary diffuse large B-cell lymphoma of the CNS (PCNSL). MATERIAL AND METHODS: Tissue microarrays (TMAs) were constructed in 45 PCNSL cases. RNA extraction from whole tissue sections and RNA sequencing were successfully performed in 33 cases. Immunohistochemical stainings for PD-1, PD-L1/paired box protein 5 (PAX-5), PD-L2/PAX-5 and IDO1, and Epstein-Barr virus encoding RNA (EBER) in situ hybridization were analyzed. RESULTS: High proportions of PD-L1 and PD-L2 positive tumor cells were observed in 11% and 9% of cases, respectively. High proportions of PD-L1 and PD-L2 positive leukocytes were observed in 55% and 51% of cases, respectively. RNA sequencing revealed that gene expression of IDO1 was high in patients with high proportion of PD-L1 positive leukocytes (p = .01). Protein expression of IDO1 in leukocytes was detected in 14/45 cases; in 79% of these cases a high proportion of PD-L1 positive leukocytes was observed. Gene expression of IDO1 was high in EBER-positive cases (p = .0009) and protein expression of IDO1 was detected in five of six EBER-positive cases. CONCLUSION: Our study shows a significant association between gene and protein expression of IDO1 and protein expression of PD-L1 in the tumor microenvironment of PCNSL, possibly of importance for prediction of response to immunotherapies.
Subject(s)
Epstein-Barr Virus Infections , Lymphoma, Large B-Cell, Diffuse , B7-H1 Antigen/genetics , Herpesvirus 4, Human , Humans , Lymphocytes, Tumor-Infiltrating , Lymphoma, Large B-Cell, Diffuse/drug therapy , Lymphoma, Large B-Cell, Diffuse/genetics , Tumor Microenvironment
ABSTRACT
Renal cell carcinoma (RCC) treatment has improved in the last decade with the introduction of drugs targeting tumor angiogenesis. However, the 5-year survival of metastatic disease is still only 10-15%. Here, we explored the prognostic significance of compartment-specific expression of Neuropilin 1 (NRP1), a co-receptor for vascular endothelial growth factor (VEGF). NRP1 expression was analyzed in RCC tumor vessels, in perivascular tumor cells, and generally in the tumor cell compartment. Moreover, complex formation between NRP1 and the main VEGF receptor, VEGFR2, was determined. Two RCC tissue microarrays were used; a discovery cohort consisting of 64 patients and a validation cohort of 314 patients. VEGFR2/NRP1 complex formation in cis (on the same cell) and trans (between cells) configurations was determined by in situ proximity ligation assay (PLA), and NRP1 protein expression in three compartments (endothelial cells, perivascular tumor cells, and general tumor cell expression) was determined by immunofluorescent staining. Expression of NRP1 in perivascular tumor cells was explored as a marker for RCC survival in the two RCC cohorts. Results were further validated using a publicly available gene expression dataset of clear cell RCC (ccRCC). We found that VEGFR2/NRP1 trans complexes were detected in 75% of the patient samples. The presence of trans VEGFR2/NRP1 complexes or perivascular NRP1 expression was associated with a reduced tumor vessel density and size. When exploring NRP1 as a biomarker for RCC prognosis, perivascular NRP1 and general tumor cell NRP1 protein expression correlated with improved survival in the two independent cohorts, and significant results were obtained also at the mRNA level using the publicly available ccRCC gene expression dataset. Only perivascular NRP1 expression remained significant in multivariable analysis. 
Our work shows that perivascular NRP1 expression is an independent marker of improved survival in RCC patients, and reduces tumor vascularization by forming complexes in trans with VEGFR2 in the tumor endothelium. © 2019 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of Pathological Society of Great Britain and Ireland.
Subject(s)
Carcinoma, Renal Cell/metabolism , Carcinoma, Renal Cell/mortality , Kidney Neoplasms/metabolism , Neuropilin-1/metabolism , Adult , Aged , Biomarkers/metabolism , Cohort Studies , Endothelial Cells/metabolism , Female , Humans , Kidney Neoplasms/diagnosis , Male , Middle Aged , Neovascularization, Pathologic/metabolism , Neuropilin-1/genetics , Prognosis
ABSTRACT
Women at high risk of HIV infection, including sex workers and those with active genital inflammation, have molecular signatures of immune activation and epithelial barrier remodeling in samples of their genital mucosa. These alterations in the local immunological milieu are likely to impact HIV susceptibility. We here analyze host genital protein signatures in HIV uninfected women, with high frequency of condom use, living in HIV-serodiscordant relationships. Cervicovaginal secretions from women living in HIV-serodiscordant relationships (n = 62) were collected at three time points over 12 months. Women living in HIV-negative seroconcordant relationships (controls, n = 25) were sampled at one time point. All study subjects were examined for demographic parameters associated with susceptibility to HIV infection. The cervicovaginal samples were analyzed using a high-throughput bead-based affinity assay. Proteins involved in epithelial barrier function and inflammation were increased in HIV-serodiscordant women. By combining several methods of analysis, a total of five proteins (CAPG, KLK10, SPRR3, elafin/PI3, CSTB) were consistently associated with this study group. Proteins analyzed using the affinity set-up were further validated by label-free tandem mass spectrometry in a partially overlapping cohort with concordant results. Women living in HIV-serodiscordant relationships thus had elevated levels of proteins involved in epithelial barrier function and inflammation despite low prevalence of sexually transmitted infections and a high frequency of safe sex practices. The identified proteins are important markers to follow during assessment of mucosal HIV susceptibility factors and a high-throughput bead-based affinity set-up could be a suitable method for such evaluation.
Subject(s)
Cervix Uteri/metabolism , HIV Infections/transmission , Proteomics/methods , Sexually Transmitted Diseases/metabolism , Vagina/metabolism , Adult , Cervix Uteri/virology , Cluster Analysis , Cornified Envelope Proline-Rich Proteins/metabolism , Cystatin B/metabolism , Early Diagnosis , Elafin/metabolism , Female , HIV Infections/metabolism , High-Throughput Screening Assays , Humans , Kallikreins/metabolism , Longitudinal Studies , Male , Microfilament Proteins/metabolism , Nuclear Proteins/metabolism , Sexual Partners , Tandem Mass Spectrometry , Vagina/virology , Young Adult
ABSTRACT
BACKGROUND: An increasing volume of prostate biopsies and a worldwide shortage of urological pathologists puts a strain on pathology departments. Additionally, the high intra-observer and inter-observer variability in grading can result in overtreatment and undertreatment of prostate cancer. To alleviate these problems, we aimed to develop an artificial intelligence (AI) system with clinically acceptable accuracy for prostate cancer detection, localisation, and Gleason grading. METHODS: We digitised 6682 slides from needle core biopsies from 976 randomly selected participants aged 50-69 in the Swedish prospective and population-based STHLM3 diagnostic study done between May 28, 2012, and Dec 30, 2014 (ISRCTN84445406), and another 271 from 93 men from outside the study. The resulting images were used to train deep neural networks for assessment of prostate biopsies. The networks were evaluated by predicting the presence, extent, and Gleason grade of malignant tissue for an independent test dataset comprising 1631 biopsies from 246 men from STHLM3 and an external validation dataset of 330 biopsies from 73 men. We also evaluated grading performance on 87 biopsies individually graded by 23 experienced urological pathologists from the International Society of Urological Pathology. We assessed discriminatory performance by receiver operating characteristics and tumour extent predictions by correlating predicted cancer length against measurements by the reporting pathologist. We quantified the concordance between grades assigned by the AI system and the expert urological pathologists using Cohen's kappa. FINDINGS: The AI achieved an area under the receiver operating characteristics curve of 0·997 (95% CI 0·994-0·999) for distinguishing between benign (n=910) and malignant (n=721) biopsy cores on the independent test dataset and 0·986 (0·972-0·996) on the external validation dataset (benign n=108, malignant n=222). 
The correlation between cancer length predicted by the AI and assigned by the reporting pathologist was 0·96 (95% CI 0·95-0·97) for the independent test dataset and 0·87 (0·84-0·90) for the external validation dataset. For assigning Gleason grades, the AI achieved a mean pairwise kappa of 0·62, which was within the range of the corresponding values for the expert pathologists (0·60-0·73). INTERPRETATION: An AI system can be trained to detect and grade cancer in prostate needle biopsy samples at a ranking comparable to that of international experts in prostate pathology. Clinical application could reduce pathology workload by reducing the assessment of benign biopsies and by automating the task of measuring cancer length in positive biopsy cores. An AI system with expert-level grading performance might contribute a second opinion, aid in standardising grading, and provide pathology expertise in parts of the world where it does not exist. FUNDING: Swedish Research Council, Swedish Cancer Society, Swedish eScience Research Center, EIT Health.
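The concordance measure named in the methods of this abstract, Cohen's kappa averaged over pairs of raters, can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function names are hypothetical, the kappa is unweighted (the abstract does not say whether weights were used), and the grade lists are invented toy data rather than study results.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Unweighted Cohen's kappa between two raters' labels for the same items.

    Undefined (division by zero) when chance agreement equals 1,
    i.e. when both raters give one and the same label to every item.
    """
    n = len(a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: from each rater's marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a.keys() | count_b.keys()) / (n * n)
    return (observed - expected) / (1 - expected)

def mean_pairwise_kappa(target, others):
    """Mean kappa of one rater against each of the other raters (hypothetical helper)."""
    return sum(cohen_kappa(target, o) for o in others) / len(others)

# Toy grades for six biopsies (invented for illustration only).
ai_grades = [1, 2, 2, 3, 5, 4]
pathologists = [
    [1, 2, 3, 3, 5, 4],
    [1, 2, 2, 3, 4, 4],
]
print(mean_pairwise_kappa(ai_grades, pathologists))
```

Comparing the value obtained for the AI system with the same statistic computed for each human rater against the remaining raters gives the kind of range comparison the abstract reports.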
Subject(s)
Artificial Intelligence , Diagnosis, Computer-Assisted , Image Interpretation, Computer-Assisted , Neoplasm Grading , Prostatic Neoplasms/pathology , Aged , Biopsy , Humans , Male , Middle Aged , Predictive Value of Tests , Prospective Studies , Reproducibility of Results , Sweden
ABSTRACT
The localization of proteins at a tissue- or cell-type-specific level is tightly linked to the protein function. To better understand each protein's role in cellular systems, spatial information constitutes an important complement to quantitative data. The standard methods for determining the spatial distribution of proteins in single cells of complex tissue samples make use of antibodies. For a stringent analysis of the human proteome, we used orthogonal methods and independent antibodies to validate 5981 antibodies that show the expression of 3775 human proteins across all major human tissues. This enhanced validation uncovered 56 proteins corresponding to the group of "missing proteins" and 171 proteins of unknown function. The presented strategy will facilitate further discussions around criteria for evidence of protein existence based on immunohistochemistry and serves as a useful guide to identify candidate proteins for integrative studies with quantitative proteomics methods.
Subject(s)
Proteome , Proteomics , Antibodies , Humans , Immunohistochemistry
ABSTRACT
According to the 2020 Metrics of the HUPO Human Proteome Project (HPP), expression has now been detected at the protein level for >90% of the 19 773 predicted proteins coded in the human genome. The HPP annually reports on progress made throughout the world toward credibly identifying and characterizing the complete human protein parts list and promoting proteomics as an integral part of multiomics studies in medicine and the life sciences. NeXtProt release 2020-01 classified 17 874 proteins as PE1, having strong protein-level evidence, up 180 from 17 694 one year earlier. These represent 90.4% of the 19 773 predicted coding genes (all PE1,2,3,4 proteins in neXtProt). Conversely, the number of neXtProt PE2,3,4 proteins, termed the "missing proteins" (MPs), was reduced by 230 from 2129 to 1899 since the neXtProt 2019-01 release. PeptideAtlas is the primary source of uniform reanalysis of raw mass spectrometry data for neXtProt, supplemented this year with extensive data from MassIVE. PeptideAtlas 2020-01 added 362 canonical proteins between 2019 and 2020 and MassIVE contributed 84 more, many of which converted PE1 entries based on non-MS evidence to the MS-based subgroup. The 19 Biology and Disease-driven B/D-HPP teams continue to pursue the identification of driver proteins that underlie disease states, the characterization of regulatory mechanisms controlling the functions of these proteins, their proteoforms, and their interactions, and the progression of transitions from correlation to coexpression to causal networks after system perturbations. In addition, the Human Protein Atlas published the Blood, Brain, and Metabolic Atlases.
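The counts quoted above are mutually consistent, which a short check makes explicit. This is only an arithmetic sketch using the figures stated in the abstract; the variable names are illustrative.

```python
# Figures as stated in the abstract (neXtProt releases 2019-01 and 2020-01).
predicted_proteins = 19773   # all PE1,2,3,4 entries in release 2020-01
pe1_2020 = 17874             # proteins with strong protein-level evidence
pe1_2019 = 17694
missing_2019 = 2129          # PE2,3,4 "missing proteins" in release 2019-01

# Missing proteins in 2020 are the predicted proteins not yet classified PE1.
missing_2020 = predicted_proteins - pe1_2020
pe1_share = 100 * pe1_2020 / predicted_proteins

print(f"PE1 gain: {pe1_2020 - pe1_2019}")
print(f"Missing proteins 2020: {missing_2020}")
print(f"Reduction in missing proteins: {missing_2019 - missing_2020}")
print(f"PE1 share of predicted proteins: {pe1_share:.1f}%")
```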
Subject(s)
Proteome , Proteomics , Databases, Protein , Genome, Human , Humans , Mass Spectrometry , Proteome/genetics
ABSTRACT
BACKGROUND: Deubiquitinating enzymes (DUBs) are linked to cancer progression and dissemination, yet less is known about their regulation and impact on epithelial-mesenchymal transition (EMT). METHODS: We used an integrative translational approach combining systematic computational analyses of The Cancer Genome Atlas cancer cohorts with CRISPR genetics, biochemistry and immunohistochemistry methodologies to identify and assess the role of human DUBs in EMT. RESULTS: We identify a previously undiscovered biological function of the deubiquitinase STAM-binding protein like 1 (STAMBPL1) in the EMT process in lung and breast carcinomas. We show that STAMBPL1 expression can be regulated by mutant p53 and that its catalytic activity is required to affect the transcription factor SNAI1. Accordingly, genetic depletion and CRISPR-mediated gene knockout of STAMBPL1 lead to marked recovery of epithelial markers, SNAI1 destabilisation and impaired migratory capacity of cancer cells. Conversely, STAMBPL1 expression reprogrammes cells towards a mesenchymal phenotype. A significant STAMBPL1-SNAI1 co-signature was observed across multiple tumour types. Importantly, STAMBPL1 is highly expressed in metastatic tissues compared to the matched primary tumour of the same lung cancer patient, and its expression predicts poor prognosis. CONCLUSIONS: Our study provides a novel concept of oncogenic regulation of a DUB and presents a new role and predictive value of STAMBPL1 in the EMT process across multiple carcinomas.