ABSTRACT
The diagnosis of pancreatic neuroendocrine tumours (PanNETs) is increasing owing to more sensitive detection methods, and this increase is creating challenges for clinical management. We performed whole-genome sequencing of 102 primary PanNETs and defined the genomic events that characterize their pathogenesis. Here we describe the mutational signatures they harbour, including a deficiency in G:C > T:A base excision repair due to inactivation of MUTYH, which encodes a DNA glycosylase. Clinically sporadic PanNETs contain a larger-than-expected proportion of germline mutations, including previously unreported mutations in the DNA repair genes MUTYH, CHEK2 and BRCA2. Together with mutations in MEN1 and VHL, these mutations occur in 17% of patients. Somatic mutations, including point mutations and gene fusions, were commonly found in genes involved in four main pathways: chromatin remodelling, DNA damage repair, activation of mTOR signalling (including previously undescribed EWSR1 gene fusions), and telomere maintenance. In addition, our gene expression analyses identified a subgroup of tumours associated with hypoxia and HIF signalling.
Subject(s)
Carcinoma, Neuroendocrine/genetics , Genome, Human/genetics , Genomics , Pancreatic Neoplasms/genetics , Base Sequence , Calmodulin-Binding Proteins/genetics , Chromatin Assembly and Disassembly/genetics , Chromosome Aberrations , DNA Copy Number Variations/genetics , DNA Glycosylases/genetics , DNA Mutational Analysis , DNA Repair/genetics , Female , Germ-Line Mutation/genetics , Humans , Male , RNA-Binding Protein EWS , RNA-Binding Proteins/genetics , TOR Serine-Threonine Kinases/metabolism , Telomere/genetics , Telomere/metabolism
ABSTRACT
This corrects the article DOI: 10.1038/nature21063.
ABSTRACT
Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (variation in chromosomal structure) classified PDACs into 4 subtypes with potential clinical utility: the subtypes were termed stable, locally rearranged, scattered and unstable. A significant proportion harboured focal amplifications, many of which contained druggable oncogenes (ERBB2, MET, FGFR1, CDK6, PIK3R3 and PIK3CA), but at low individual patient prevalence. Genomic instability co-segregated with inactivation of DNA maintenance genes (BRCA1, BRCA2 or PALB2) and a mutational signature of DNA damage repair deficiency. Of 8 patients who received platinum therapy, 4 of 5 individuals with these measures of defective DNA maintenance responded.
Subject(s)
DNA Mutational Analysis , Genome, Human/genetics , Genomics , Mutation/genetics , Pancreatic Neoplasms/genetics , Adenocarcinoma/drug therapy , Adenocarcinoma/genetics , Animals , Carcinoma, Pancreatic Ductal/drug therapy , Carcinoma, Pancreatic Ductal/genetics , DNA Repair/genetics , Female , Genes, BRCA1 , Genes, BRCA2 , Genetic Markers/genetics , Genomic Instability/genetics , Genotype , Humans , Mice , Pancreatic Neoplasms/classification , Pancreatic Neoplasms/drug therapy , Platinum/pharmacology , Point Mutation/genetics , Poly(ADP-ribose) Polymerase Inhibitors , Xenograft Model Antitumor Assays
ABSTRACT
Pancreatic cancer is molecularly diverse, with few effective therapies. Increased mutation burden and defective DNA repair are associated with response to immune checkpoint inhibitors in several other cancer types. We interrogated 385 pancreatic cancer genomes to define hypermutation and its causes. Mutational signatures inferring defects in DNA repair were enriched in those with the highest mutation burdens. Mismatch repair deficiency was identified in 1% of tumors harboring different mechanisms of somatic inactivation of MLH1 and MSH2. Defining mutation load in individual pancreatic cancers and the optimal assay for patient selection may inform clinical trial design for immunotherapy in pancreatic cancer.
Subject(s)
Carcinoma, Pancreatic Ductal/genetics , DNA Mismatch Repair/genetics , Mutation , Pancreatic Neoplasms/genetics , Transcriptome , Adult , Aged , Aged, 80 and over , DNA Mutational Analysis , Female , Genome , Humans , Male , Middle Aged , MutL Protein Homolog 1/genetics , MutS Homolog 2 Protein/genetics , Proto-Oncogene Proteins p21(ras)/genetics
ABSTRACT
Pancreatic cancer is a highly lethal malignancy with few effective therapies. We performed exome sequencing and copy number analysis to define genomic aberrations in a prospectively accrued clinical cohort (n = 142) of early (stage I and II) sporadic pancreatic ductal adenocarcinoma. Detailed analysis of 99 informative tumours identified substantial heterogeneity with 2,016 non-silent mutations and 1,628 copy-number variations. We define 16 significantly mutated genes, reaffirming known mutations (KRAS, TP53, CDKN2A, SMAD4, MLL3, TGFBR2, ARID1A and SF3B1), and uncover novel mutated genes including additional genes involved in chromatin modification (EPC1 and ARID2), DNA damage repair (ATM) and other mechanisms (ZIM2, MAP2K4, NALCN, SLC16A4 and MAGEA6). Integrative analysis with in vitro functional data and animal models provided supportive evidence for potential roles for these genetic aberrations in carcinogenesis. Pathway-based analysis of recurrently mutated genes recapitulated clustering in core signalling pathways in pancreatic ductal adenocarcinoma, and identified new mutated genes in each pathway. We also identified frequent and diverse somatic aberrations in genes described traditionally as embryonic regulators of axon guidance, particularly SLIT/ROBO signalling, which was also evident in murine Sleeping Beauty transposon-mediated somatic mutagenesis models of pancreatic cancer, providing further supportive evidence for the potential involvement of axon guidance genes in pancreatic carcinogenesis.
Subject(s)
Axons/metabolism , Carcinoma, Pancreatic Ductal/genetics , Carcinoma, Pancreatic Ductal/pathology , Genome/genetics , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/pathology , Animals , Gene Dosage , Gene Expression Regulation, Neoplastic , Humans , Kaplan-Meier Estimate , Mice , Mutation , Proteins/genetics , Signal Transduction
ABSTRACT
Multiple Myeloma (MM) is a haematological malignancy characterised by the clonal expansion of plasma cells (PCs) within the bone marrow. Despite advances in therapy, MM remains a largely incurable disease with a median survival of 6 years. In almost all cases, the development of MM is preceded by the benign PC condition Monoclonal Gammopathy of Undetermined Significance (MGUS). Recent studies show that the transformation of MGUS to MM is associated with complex genetic changes. Understanding how these changes contribute to evolution will present targets for clinical intervention. We discuss three models of MM evolution: the linear, the expansionist and the intraclonal heterogeneity models. Of particular interest is the intraclonal heterogeneity model. Here, distinct populations of MM PCs carry differing combinations of genetic mutations. Acquisition of additional mutations can contribute to subclonal lineages where "driver" mutations may influence selective pressure and dominance, and "passenger" mutations are neutral in their effects. Furthermore, studies show that clinical intervention introduces additional selective pressure on tumour cells and can influence subclone survival, leading to therapy resistance. This review discusses how Next Generation Sequencing approaches are revealing critical insights into the genetics of MM development, disease progression and treatment. A better understanding of MM disease progression will illuminate possible mechanisms underlying the tumour.
Subject(s)
Genomics/methods , Multiple Myeloma/genetics , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Disease Progression , Enzyme Inhibitors/therapeutic use , Epigenesis, Genetic/genetics , Forecasting , Genomics/trends , Humans , Immunologic Factors/therapeutic use , Multiple Myeloma/drug therapy , Mutation/genetics
ABSTRACT
The International Cancer Genome Consortium (ICGC) aims to catalog genomic abnormalities in tumors from 50 different cancer types. Genome sequencing reveals hundreds to thousands of somatic mutations in each tumor but only a minority of these drive tumor progression. We present the result of discussions within the ICGC on how to address the challenge of identifying mutations that contribute to oncogenesis, tumor maintenance or response to therapy, and recommend computational techniques to annotate somatic variants and predict their impact on cancer phenotype.
Subject(s)
Computational Biology/methods , Genome, Human , Neoplasms/genetics , Genetic Variation , Humans , Mutation
ABSTRACT
Treatment options for patients with brain metastases (BMs) have limited efficacy and the mortality rate is virtually 100%. Targeted therapy is critically under-utilized, and our understanding of mechanisms underpinning metastatic outgrowth in the brain is limited. To address these deficiencies, we investigated the genomic and transcriptomic landscapes of 36 BMs from breast, lung, melanoma and oesophageal cancers, using DNA copy-number analysis and exome- and RNA-sequencing. The key findings were as follows. (a) Identification of novel candidates with possible roles in BM development, including the significantly mutated genes DSC2, ST7, PIK3R1 and SMC5, and the DNA repair, ERBB-HER signalling, axon guidance and protein kinase-A signalling pathways. (b) Mutational signature analysis was applied to successfully identify the primary cancer type for two BMs with unknown origins. (c) Actionable genomic alterations were identified in 31/36 BMs (86%); in one case we retrospectively identified ERBB2 amplification representing apparent HER2 status conversion, then confirmed progressive enrichment for HER2-positivity across four consecutive metastatic deposits by IHC and SISH, resulting in the deployment of HER2-targeted therapy for the patient. (d) In the ERBB/HER pathway, ERBB2 expression correlated with ERBB3 (r² = 0.496; p < 0.0001) and HER3 and HER4 were frequently activated in an independent cohort of 167 archival BMs from seven primary cancer types: 57.6% and 52.6% of cases were phospho-HER3(Y1222) or phospho-HER4(Y1162) membrane-positive, respectively. The HER3 ligands NRG1/2 were barely detectable by RNAseq, with NRG1 (8p12) genomic loss in 63.6% of breast cancer BMs, suggesting a microenvironmental source of ligand. In summary, this is the first study to characterize the genomic landscapes of BM. The data revealed novel candidates, potential clinical applications for genomic profiling of resectable BMs, and highlighted the possibility of therapeutically targeting HER3, which is broadly over-expressed and activated in BMs, independent of primary site and systemic therapy.
Subject(s)
Biomarkers, Tumor/genetics , Brain Neoplasms/genetics , Brain Neoplasms/secondary , Gene Expression Profiling/methods , Genomics/methods , Biomarkers, Tumor/metabolism , Brain Neoplasms/drug therapy , Brain Neoplasms/enzymology , DNA Mutational Analysis , Enzyme Activation , Gene Amplification , Gene Dosage , Gene Expression Regulation, Neoplastic , Genetic Association Studies , Genetic Predisposition to Disease , Humans , Immunohistochemistry , Ligands , Molecular Targeted Therapy , Mutation , Phenotype , Phosphorylation , Precision Medicine , Predictive Value of Tests , Protein Kinase Inhibitors/therapeutic use , Receptor, ErbB-2/genetics , Receptor, ErbB-2/metabolism , Receptor, ErbB-3/genetics , Receptor, ErbB-3/metabolism , Receptor, ErbB-4/genetics , Receptor, ErbB-4/metabolism , Tumor Microenvironment
ABSTRACT
Genetic susceptibility to familial colorectal cancer (CRC), including for individuals classified as Familial Colorectal Cancer Type X (FCCTX), remains poorly understood. We describe a multi-generation CRC-affected family segregating pathogenic variants in both BRCA1, a gene associated with breast and ovarian cancer, and RNF43, a gene associated with Serrated Polyposis Syndrome (SPS). A single family was selected for further testing from 105 families that met the criteria for FCCTX (Amsterdam I family history criteria with mismatch repair (MMR)-proficient CRCs), were recruited to the Australasian Colorectal Cancer Family Registry (ACCFR; 1998-2008) and underwent whole exome sequencing (WES). CRC and polyp tissue from four carriers were molecularly characterized, including a single CRC that underwent WES to determine tumor mutational signatures and loss of heterozygosity (LOH) events. Ten carriers of a germline pathogenic variant BRCA1:c.2681_2682delAA p.Lys894ThrfsTer8 and eight carriers of a germline pathogenic variant RNF43:c.988 C > T p.Arg330Ter were identified in this family. Seven members carried both variants, four of whom developed CRC. A single carrier of the RNF43 variant met the 2019 World Health Organization (WHO2019) criteria for SPS, developing a BRAF p.V600 wildtype CRC. Loss of the wildtype allele for both BRCA1 and RNF43 variants was observed in three CRC tumors, while a LOH event across chromosome 17q encompassing both genes was observed in a CRC. Tumor mutational signature analysis identified the homologous recombination deficiency (HRD)-associated COSMIC signatures SBS3 and ID6 in a CRC for a carrier of both variants. Our findings show digenic inheritance of pathogenic variants in BRCA1 and RNF43 segregating with CRC in a FCCTX family. LOH and evidence of BRCA1-associated HRD support the importance of both these tumor suppressor genes in CRC tumorigenesis.
Subject(s)
Colorectal Neoplasms, Hereditary Nonpolyposis , Colorectal Neoplasms , Humans , Colorectal Neoplasms, Hereditary Nonpolyposis/genetics , Colorectal Neoplasms/genetics , Colorectal Neoplasms/pathology , Mutation , Germ-Line Mutation , Genetic Predisposition to Disease , BRCA1 Protein/genetics , Ubiquitin-Protein Ligases/genetics
ABSTRACT
Identification of somatic variants in cancer by high-throughput sequencing has become common clinical practice, largely because many of these variants may be predictive biomarkers for targeted therapies. However, there can be high sample quality control (QC) failure rates for some tests that prevent the return of results. Stem-loop inhibition mediated amplification (SLIMamp) is a patented technology that has been incorporated into commercially available cancer next-generation sequencing testing kits. The claimed advantage is that these kits can interrogate challenging formalin-fixed, paraffin-embedded tissue samples with low tumor purity, poor-quality DNA, and/or low-input DNA, resulting in a high sample QC pass rate. The study aimed to substantiate that claim using Pillar Biosciences oncoReveal Solid Tumor Panel. Forty-eight samples that had failed one or more preanalytical QC sample parameters for whole-exome sequencing from the Australian Translational Genomics Centre's ISO15189-accredited diagnostic genomics laboratory were acquired. XING Genomic Services performed an exploratory data analysis to characterize the samples and then tested the samples in their ISO15189-accredited laboratory. Clinical reports could be generated for 37 (77%) samples, of which 29 (60%) contained clinically actionable or significant variants that would not otherwise have been identified. Eleven samples were deemed unreportable, and the sequencing data were likely dominated by artifacts. A novel postsequencing QC metric was developed that can discriminate between clinically reportable and unreportable samples.
Subject(s)
Formaldehyde , Neoplasms , Humans , Tissue Fixation , Australia , Neoplasms/diagnosis , Neoplasms/genetics , DNA , High-Throughput Nucleotide Sequencing/methods , Biomarkers, Tumor/genetics , Mutation , Paraffin Embedding
ABSTRACT
Clonorchis sinensis is a carcinogenic liver fluke that causes clonorchiasis, a neglected tropical disease (NTD) affecting ~35 million people worldwide. No vaccine is available, and chemotherapy relies on one anthelmintic, praziquantel. This parasite has a complex life history and is known to infect a range of species of intermediate (freshwater snails and fish) and definitive (piscivorous) hosts. Despite this biological complexity and the impact of this biocarcinogenic pathogen, there has been no previous study of molecular variation in this parasite on a genome-wide scale. Here, we conducted the first extensive nuclear genomic exploration of C. sinensis individuals (n = 152) representing five distinct populations from mainland China, and one from Far East Russia, and revealed marked genetic variation within this species between "northern" and "southern" geographical regions. The discovery of this variation indicates the existence of biologically distinct variants within C. sinensis, which may have distinct epidemiology, pathogenicity and/or chemotherapeutic responsiveness. The detection of high heterozygosity within C. sinensis specimens suggests that this parasite has developed mechanisms to readily adapt to changing environments and/or host species during its life history/evolution. From an applied perspective, the identification of invariable genes could assist in finding new intervention targets in this parasite, given the major clinical relevance of clonorchiasis. From a technical perspective, the genomic-informatic workflow established herein will be readily applicable to a wide range of other parasites that cause NTDs.
Subject(s)
Clonorchiasis , Clonorchis sinensis , Animals , Clonorchis sinensis/genetics , Clonorchiasis/diagnosis , Clonorchiasis/epidemiology , Clonorchiasis/parasitology , Genetic Variation , Asia, Eastern , China/epidemiology
ABSTRACT
Little is known regarding the molecular differences between basal cell carcinoma (BCC) subtypes, despite clearly distinct phenotypes and clinical outcomes. In particular, infiltrative BCCs have poorer clinical outcomes in terms of response to therapy and propensity for dissemination. In this project, we aimed to use exome sequencing and RNA sequencing to identify somatic mutations and molecular pathways leading to infiltrative BCCs. Using whole-exome sequencing of 36 BCC samples (eight infiltrative) combined with previously reported exome data (58 samples), we determine that infiltrative BCCs do not contain a distinct somatic variant profile and carry classical UV-induced mutational signatures. RNA sequencing on both datasets revealed key differentially expressed genes, such as POSTN and WISP1, suggesting increased integrin and Wnt signaling. Immunostaining for periostin and WISP1 clearly distinguished infiltrative BCCs, and nuclear β-catenin staining patterns further validated the resulting increase in Wnt signaling in infiltrative BCCs. Of significant interest, in BCCs with mixed morphology, infiltrative areas expressed WISP1, whereas nodular areas did not, supporting a continuum between subtypes. In conclusion, infiltrative BCCs do not differ in their genomic alteration in terms of initiating mutations. They display a specific type of interaction with the extracellular matrix environment regulating Wnt signaling.
Subject(s)
Carcinoma, Basal Cell/genetics , Skin Neoplasms/genetics , Aged , CCN Intercellular Signaling Proteins/analysis , Carcinoma, Basal Cell/classification , Carcinoma, Basal Cell/pathology , Cell Adhesion Molecules/analysis , Female , Humans , Male , Mutation , Proto-Oncogene Proteins/analysis , Skin Neoplasms/classification , Skin Neoplasms/pathology
ABSTRACT
BACKGROUND: In the current era of scientific research, efficient communication of information is paramount. As such, the nature of scholarly and scientific communication is changing; cyberinfrastructure is now absolutely necessary and new media are allowing information and knowledge to be more interactive and immediate. One approach to making knowledge more accessible is the addition of machine-readable semantic data to scholarly articles. RESULTS: The Word add-in presented here will assist authors in this effort by automatically recognizing and highlighting words or phrases that are likely information-rich, allowing authors to associate semantic data with those words or phrases, and to embed that data in the document as XML. The add-in and source code are publicly available at http://www.codeplex.com/UCSDBioLit. CONCLUSIONS: The Word add-in for ontology term recognition makes it possible for an author to add semantic data to a document as it is being written and it encodes these data using XML tags that are effectively a standard in life sciences literature. Allowing authors to mark-up their own work will help increase the amount and quality of machine-readable literature metadata.
Subject(s)
Information Storage and Retrieval/methods , Publications , Semantics , Databases, Factual , Natural Language Processing , Programming Languages , Vocabulary, Controlled
ABSTRACT
BACKGROUND: Biological data have traditionally been stored and made publicly available through a variety of on-line databases, whereas biological knowledge has traditionally been found in the printed literature. With journals now on-line and providing an increasing amount of open access content, often free of copyright restriction, this distinction between database and literature is blurring. To exploit this opportunity we present the integration of open access literature with the RCSB Protein Data Bank (PDB). RESULTS: BioLit provides an enhanced view of articles with markup of semantic data and links to biological databases, based on the content of the article. For example, words matching to existing biological ontologies are highlighted and database identifiers are linked to their database of origin. Among other functions, it identifies PDB IDs that are mentioned in the open access literature, by parsing the full text for all research articles in PubMed Central (PMC) and exposing the results as simple XML Web Services. Here, we integrate BioLit results with the RCSB PDB website by using these services to find PDB IDs that are mentioned in research articles and subsequently retrieving abstract, figures, and text excerpts for those articles. A new RCSB PDB literature view permits browsing through the figures and abstracts of the articles that mention a given structure. The BioLit Web Services that are providing the underlying data are publicly accessible. A client library is provided that supports querying these services (Java). CONCLUSIONS: The integration between literature and websites, as demonstrated here with the RCSB PDB, provides a broader view for how a given structure has been analyzed and used. This approach detects the mention of a PDB structure even if it is not formally cited in the paper. Other structures related through the same literature references can also be identified, possibly providing new scientific insight. To our knowledge this is the first time that database and literature have been integrated in this way and it speaks to the opportunities afforded by open and free access to both database and literature content.
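The idea of detecting PDB IDs mentioned in article full text can be illustrated with a minimal sketch. The abstract does not describe how BioLit performs this step, so everything below is an assumption made for illustration: the function name `find_pdb_ids`, the 40-character context window, and the heuristic itself (a four-character token starting with a digit 1-9, containing at least one letter, and preceded nearby by the word "PDB").

```python
import re

# Classic PDB IDs are four characters: a digit 1-9 followed by three
# alphanumerics. This pattern alone also matches years and counts,
# so extra filtering is applied below (an illustrative heuristic only).
PDB_ID = re.compile(r"\b[1-9][A-Za-z0-9]{3}\b")
PDB_CONTEXT = re.compile(r"\bPDB\b", re.IGNORECASE)


def find_pdb_ids(text: str, window: int = 40) -> list[str]:
    """Return candidate PDB IDs in order of first mention (hypothetical helper)."""
    found: list[str] = []
    for match in PDB_ID.finditer(text):
        token = match.group()
        # Skip purely numeric tokens such as "2000" or "2014".
        if not any(ch.isalpha() for ch in token):
            continue
        # Require the word "PDB" shortly before the token.
        context = text[max(0, match.start() - window):match.start()]
        if PDB_CONTEXT.search(context):
            pdb_id = token.upper()
            if pdb_id not in found:
                found.append(pdb_id)
    return found


sample = (
    "The structure (PDB ID: 1abc) was compared with PDB entry 4HHB; "
    "2000 cells were counted in 2014."
)
print(find_pdb_ids(sample))
```

A context requirement of this kind trades recall for precision: it would miss an ID that appears without the word "PDB" nearby, which is one reason a production system would likely combine several signals.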
Subject(s)
Databases, Protein , Proteins/chemistry , Software , PubMed , Publications , Systems Integration , User-Computer Interface
ABSTRACT
BioLit is a web server which provides metadata describing the semantic content of all open access, peer-reviewed articles which describe research from the major life sciences literature archive, PubMed Central. Specifically, these metadata include database identifiers and ontology terms found within the full text of the article. BioLit delivers these metadata in the form of XML-based article files and as a custom web-based article viewer that provides context-specific functionality to the metadata. This resource aims to integrate the traditional scientific publication directly into existing biological databases, thus obviating the need for a user to search in multiple locations for information relating to a specific item of interest, for example published experimental results associated with a particular biological database entry. As an example of a possible use of BioLit, we also present an instance of the Protein Data Bank fully integrated with BioLit data. We expect that the community of life scientists in general will be the primary end-users of the web-based viewer, while biocurators will make use of the metadata-containing XML files and the BioLit database of article data. BioLit is available at http://biolit.ucsd.edu.
Subject(s)
Databases, Factual , PubMed , Software , Databases, Protein , Internet , Systems Integration
ABSTRACT
[This corrects the article DOI: 10.18632/oncotarget.27206.].
ABSTRACT
Clonorchiasis is a neglected tropical disease caused by the Chinese liver fluke, Clonorchis sinensis, and is often associated with a malignant form of bile duct cancer (cholangiocarcinoma). Although some aspects of the epidemiology of clonorchiasis are understood, little is known about the genetics of C. sinensis populations. Here, we conducted a comprehensive genetic exploration of C. sinensis from endemic geographic regions using complete mitochondrial protein gene sets. Genomic DNA samples from C. sinensis individuals (n = 183) collected from cats and dogs in China (provinces of Guangdong, Guangxi, Hunan, Heilongjiang and Jilin) as well as from rats infected with metacercariae from cyprinid fish from the Russian Far East (Primorsky Krai region) were deep sequenced using the BGISEQ-500 platform. Informatic analyses of mitochondrial protein gene data sets revealed marked genetic variation within C. sinensis; significant variation was identified within and among individual worms from distinct geographical locations. No clear affiliation with a particular location or host species was evident, suggesting a high rate of dispersal of the parasite across endemic regions. The present work provides a foundation for future biological, epidemiological and ecological studies using mitochondrial protein gene data sets, which could aid in elucidating associations between particular C. sinensis genotypes/haplotypes and the pathogenesis or severity of clonorchiasis and its complications (including cholangiocarcinoma) in humans.
Subject(s)
Clonorchiasis/parasitology , Clonorchis sinensis/genetics , DNA, Mitochondrial/genetics , Genetic Variation , Animals , China/epidemiology , Clonorchiasis/epidemiology , Haploidy , Host-Parasite Interactions , Humans , Phylogeny , Russia/epidemiology
ABSTRACT
To better understand the influence of ultraviolet (UV) irradiation on the initial steps of skin carcinogenesis, we examine patches of labeled keratinocytes as a proxy for clones in the interfollicular epidermis (IFE) and measure their size variation upon UVB irradiation. Multicolor lineage tracing reveals that in chronically irradiated skin, patches near hair follicles (HFs) increase in size, whereas those far from follicles do not change. This is explained by proliferation of basal epidermal cells within 60 µm of HF openings. Upon interruption of UVB, patch size near HFs regresses significantly. These anatomical differences in proliferative behavior have significant consequences for the cell of origin of basal cell carcinomas (BCCs). Indeed, a UV-inducible murine BCC model shows that BCC patches are more frequent, larger, and more invasive near HFs. These findings have major implications for the prevention of field cancerization in the epidermis.
Subject(s)
Epidermis/metabolism , Neoplasms, Radiation-Induced/pathology , Ultraviolet Rays , Animals , Carcinoma, Basal Cell/metabolism , Carcinoma, Basal Cell/pathology , Cell Proliferation , Cyclin D1/metabolism , Disease Models, Animal , Epidermis/radiation effects , Hair Follicle/pathology , Mice , Mice, Inbred C57BL , Mice, Transgenic , Neoplasms, Radiation-Induced/metabolism , Skin Neoplasms/metabolism , Skin Neoplasms/pathology , Stem Cells/cytology , Stem Cells/metabolism , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism
ABSTRACT
Chromosome arm aneuploidies (CAAs) are pervasive in cancers. However, how they affect cancer development, prognosis and treatment remains largely unknown. Here, we analyse CAA profiles of 23,427 tumours, identifying aspects of tumour evolution including probable orders in which CAAs occur and CAAs predicting tissue-specific metastasis. Both haematological and solid cancers initially gain chromosome arms, while only solid cancers subsequently preferentially lose multiple arms. 72 CAAs and 88 synergistically co-occurring CAA pairs multivariately predict good or poor survival for 58% of 6977 patients, with negligible impact of whole-genome doubling. Additionally, machine learning identifies 31 CAAs that robustly alter response to 56 chemotherapeutic drugs across cell lines representing 17 cancer types. We also uncover 1024 potential synthetic lethal pharmacogenomic interactions. Notably, in predicting drug response, CAAs substantially outperform mutations and focal deletions/amplifications combined. Thus, CAAs predict cancer prognosis, shape tumour evolution, metastasis and drug response, and may advance precision oncology.
Subject(s)
Aneuploidy , Chromosomes, Human , Drug Resistance, Neoplasm/genetics , Mutation Rate , Neoplasms/drug therapy , Neoplasms/genetics , Cell Line, Tumor , Humans , Kaplan-Meier Estimate , Machine Learning , Models, Biological , Neoplasms/mortality , Neoplasms/pathology , Prognosis , Stochastic Processes
ABSTRACT
Membrane organization describes the orientation of a protein with respect to the membrane and can be determined by the presence, or absence, and organization within the protein sequence of two features: endoplasmic reticulum signal peptides and alpha-helical transmembrane domains. These features allow protein sequences to be classified into one of five membrane organization categories: soluble intracellular proteins, soluble secreted proteins, type I membrane proteins, type II membrane proteins, and multi-spanning membrane proteins. Generation of protein isoforms with variable membrane organizations can change a protein's subcellular localization or association with the membrane. Application of MemO, a membrane organization annotation pipeline, to the FANTOM3 Isoform Protein Sequence mouse protein set revealed that within the 8,032 transcriptional units (TUs) with multiple protein isoforms, 573 had variation in their use of signal peptides, 1,527 had variation in their use of transmembrane domains, and 615 generated protein isoforms from distinct membrane organization classes. The mechanisms underlying these transcript variations were analyzed. While TUs were identified encoding all pairwise combinations of membrane organization categories, the most common was conversion of membrane proteins to soluble proteins. Observed within our high-confidence set were 156 TUs predicted to generate both extracellular soluble and membrane proteins, and 217 TUs generating both intracellular soluble and membrane proteins. The differential use of endoplasmic reticulum signal peptides and transmembrane domains is a common occurrence within the variable protein output of TUs. The generation of protein isoforms that are targeted to multiple subcellular locations represents a major functional consequence of transcript variation within the mouse transcriptome.
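The five-category scheme described above can be sketched as a small decision rule over the two features. This is a simplified illustration, not the MemO pipeline itself, whose exact rules are not given in the abstract. The mapping assumed here is a common textbook convention: a signal peptide with no transmembrane domain gives a secreted protein, a single transmembrane domain gives type I with a signal peptide and type II without, and two or more transmembrane domains give a multi-spanning protein. All names (`MembraneOrg`, `classify`, `spans_multiple_classes`) are hypothetical.

```python
from enum import Enum


class MembraneOrg(Enum):
    SOLUBLE_INTRACELLULAR = "soluble intracellular"
    SOLUBLE_SECRETED = "soluble secreted"
    TYPE_I = "type I membrane"
    TYPE_II = "type II membrane"
    MULTI_SPANNING = "multi-spanning membrane"


def classify(has_signal_peptide: bool, n_tm_domains: int) -> MembraneOrg:
    """Assign a membrane organization class from the two predicted features."""
    if n_tm_domains < 0:
        raise ValueError("n_tm_domains must be non-negative")
    if n_tm_domains >= 2:
        return MembraneOrg.MULTI_SPANNING
    if n_tm_domains == 1:
        # Assumed convention: a signal peptide plus one TM domain is type I;
        # one TM domain without a signal peptide is type II.
        return MembraneOrg.TYPE_I if has_signal_peptide else MembraneOrg.TYPE_II
    # No TM domains: a soluble protein, secreted if it has a signal peptide.
    if has_signal_peptide:
        return MembraneOrg.SOLUBLE_SECRETED
    return MembraneOrg.SOLUBLE_INTRACELLULAR


def spans_multiple_classes(isoforms: list[tuple[bool, int]]) -> bool:
    """True if the isoforms of one transcriptional unit fall into more than one class."""
    return len({classify(sp, tm) for sp, tm in isoforms}) > 1


# A hypothetical transcriptional unit with a membrane-bound and a secreted isoform.
example_tu = [(True, 1), (True, 0)]
print([classify(sp, tm).value for sp, tm in example_tu])
print(spans_multiple_classes(example_tu))
```

Under these assumptions, a transcriptional unit whose isoforms differ only in the loss of a transmembrane domain is flagged as generating both membrane and soluble proteins, which is the most common conversion the abstract reports.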