ABSTRACT
Pancreatic cancer, a highly aggressive tumour type with uniformly poor prognosis, exemplifies the classically held view of stepwise cancer development. The current model of tumorigenesis, based on analyses of precursor lesions, termed pancreatic intraepithelial neoplasia (PanIN) lesions, makes two predictions: first, that pancreatic cancer develops through a particular sequence of genetic alterations (KRAS, followed by CDKN2A, then TP53 and SMAD4); and second, that the evolutionary trajectory of pancreatic cancer progression is gradual because each alteration is acquired independently. A shortcoming of this model is that clonally expanded precursor lesions do not always belong to the tumour lineage, indicating that the evolutionary trajectory of the tumour lineage and precursor lesions can be divergent. This prevailing model of tumorigenesis has contributed to the clinical notion that pancreatic cancer evolves slowly and presents at a late stage. However, the propensity for this disease to rapidly metastasize and the inability to improve patient outcomes, despite efforts aimed at early detection, suggest that pancreatic cancer progression is not gradual. Here, using newly developed informatics tools, we tracked changes in DNA copy number and their associated rearrangements in tumour-enriched genomes and found that pancreatic cancer tumorigenesis is neither gradual nor follows the accepted mutation order. Two-thirds of tumours harbour complex rearrangement patterns associated with mitotic errors, consistent with punctuated equilibrium as the principal evolutionary trajectory. In a subset of cases, the consequence of such errors is the simultaneous, rather than sequential, knockout of canonical preneoplastic genetic drivers that are likely to set off invasive cancer growth. These findings challenge the current progression model of pancreatic cancer and provide insights into the mutational processes that give rise to these aggressive tumours.
Subject(s)
Carcinogenesis/genetics , Carcinogenesis/pathology , Gene Rearrangement/genetics , Genome, Human/genetics , Models, Biological , Mutagenesis/genetics , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/pathology , Carcinoma in Situ/genetics , Chromothripsis , DNA Copy Number Variations/genetics , Disease Progression , Evolution, Molecular , Female , Genes, Neoplasm/genetics , Humans , Male , Mitosis/genetics , Mutation/genetics , Neoplasm Invasiveness/genetics , Neoplasm Invasiveness/pathology , Neoplasm Metastasis/genetics , Neoplasm Metastasis/pathology , Polyploidy , Precancerous Conditions/genetics
ABSTRACT
Noonan syndrome (NS) is a relatively common genetic disorder, characterized by typical facies, short stature, developmental delay, and cardiac abnormalities. Known causative genes account for 70-80% of clinically diagnosed NS patients, but the genetic basis for the remaining 20-30% of cases is unknown. We performed next-generation sequencing on germ-line DNA from 27 NS patients lacking a mutation in the known NS genes. We identified gain-of-function alleles in Ras-like without CAAX 1 (RIT1) and mitogen-activated protein kinase kinase 1 (MAP2K1) and previously unseen loss-of-function variants in RAS p21 protein activator 2 (RASA2) that are likely to cause NS in these patients. Expression of the mutant RASA2, MAP2K1, or RIT1 alleles in heterologous cells increased RAS-ERK pathway activation, supporting a causative role in NS pathogenesis. Two patients had more than one disease-associated variant. Moreover, the diagnosis of an individual initially thought to have NS was revised to neurofibromatosis type 1 based on an NF1 nonsense mutation detected in this patient. Another patient harbored a missense mutation in NF1 that resulted in decreased protein stability and impaired ability to suppress RAS-ERK activation; however, this patient continues to exhibit a NS-like phenotype. In addition, a nonsense mutation in RPS6KA3 was found in one patient initially diagnosed with NS whose diagnosis was later revised to Coffin-Lowry syndrome. Finally, we identified other potential candidates for new NS genes, as well as potential carrier alleles for unrelated syndromes. Taken together, our data suggest that next-generation sequencing can provide a useful adjunct to RASopathy diagnosis and emphasize that the standard clinical categories for RASopathies might not be adequate to describe all patients.
Subject(s)
High-Throughput Nucleotide Sequencing/methods , Mutation/genetics , Noonan Syndrome/genetics , Alleles , Genetic Association Studies , Humans , MAP Kinase Kinase 1/genetics , MAP Kinase Signaling System/genetics , Neurofibromin 1/genetics , ras Proteins/genetics , ras Proteins/metabolism
ABSTRACT
Neural tube defects (NTDs) are common birth defects of complex etiology. Family and population-based studies have confirmed a genetic component to NTDs. However, despite more than three decades of research, the genes involved in human NTDs remain largely unknown. We tested the hypothesis that rare copy number variants (CNVs), especially de novo germline CNVs, are a significant risk factor for NTDs. We used array-based comparative genomic hybridization (aCGH) to identify rare CNVs in 128 Caucasian and 61 Hispanic patients with non-syndromic lumbar-sacral myelomeningocele. We also performed aCGH analysis on the parents of affected individuals with rare CNVs where parental DNA was available (42 sets). Among the eight de novo CNVs that we identified, three generated copy number changes of entire genes. One large heterozygous deletion removed 27 genes, including PAX3, a known spina bifida-associated gene. A second CNV altered genes (PGPD8, ZC3H6) for which little is known regarding function or expression. A third heterozygous deletion removed GPC5 and part of GPC6, genes encoding glypicans. Glypicans are proteoglycans that modulate the activity of morphogens such as Sonic Hedgehog (SHH) and bone morphogenetic proteins (BMPs), both of which have been implicated in NTDs. Additionally, glypicans function in the planar cell polarity (PCP) pathway, and several PCP genes have been associated with NTDs. Here, we show that GPC5 orthologs are expressed in the neural tube, and that inhibiting their expression in frog and fish embryos results in NTDs. These results implicate GPC5 as a gene required for normal neural tube development.
Subject(s)
Cell Polarity , DNA Copy Number Variations , Glypicans/genetics , Spinal Dysraphism/genetics , Animals , Cohort Studies , Female , Genetic Predisposition to Disease , Hispanic or Latino/genetics , Humans , Male , Neural Tube/embryology , Neural Tube/metabolism , Spinal Dysraphism/embryology , Spinal Dysraphism/physiopathology , White People/genetics , Zebrafish
ABSTRACT
The COVID-19 pandemic led to a large global effort to sequence SARS-CoV-2 genomes from patient samples to track viral evolution and inform the public health response. Millions of SARS-CoV-2 genome sequences have been deposited in global public repositories. The Canadian COVID-19 Genomics Network (CanCOGeN - VirusSeq), a consortium tasked with coordinating expanded sequencing of SARS-CoV-2 genomes across Canada early in the pandemic, created the Canadian VirusSeq Data Portal, with associated data pipelines and procedures, to support these efforts. The goal of VirusSeq was to allow open access to Canadian SARS-CoV-2 genomic sequences and enhanced, standardized contextual data that were unavailable in other repositories and that meet FAIR standards (Findable, Accessible, Interoperable and Reusable). In addition, the portal data submission pipeline contains data quality checking procedures and appropriate acknowledgement of data generators that encourages collaboration. From inception to execution, the portal was developed with a conscientious focus on strong data governance principles and practices. Extensive efforts ensured a commitment to Canadian privacy laws, data security standards, and organizational processes. This portal has been coupled with other resources, such as Viral AI, and was further leveraged by the Coronavirus Variants Rapid Response Network (CoVaRR-Net) to produce a suite of continually updated analytical tools and notebooks. Here we highlight this portal (https://virusseq-dataportal.ca/), including its contextual data not available elsewhere, and the Duotang (https://covarr-net.github.io/duotang/duotang.html), a web platform that presents key genomic epidemiology and modelling analyses on circulating and emerging SARS-CoV-2 variants in Canada. 
Duotang presents dynamic changes in variant composition of SARS-CoV-2 in Canada and by province, estimates variant growth, and displays complementary interactive visualizations, with a text overview of the current situation. The VirusSeq Data Portal and Duotang resources, alongside additional analyses and resources computed from the portal (COVID-MVP, CoVizu), are all open source and freely available. Together, they provide an updated picture of SARS-CoV-2 evolution to spur scientific discussions, inform public discourse, and support communication with and within public health authorities. They also serve as a framework for other jurisdictions interested in open, collaborative sequence data sharing and analyses.
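As a rough illustration of what estimating variant growth from sequence counts can look like, the sketch below fits a straight line to the log-odds of a variant versus a reference lineage over time; the slope is a relative growth rate per day. This is not Duotang's code or model, and the function name and the counts in the usage example are hypothetical.

```python
import math


def relative_growth_rate(days, variant_counts, reference_counts):
    """Least-squares slope of log(variant / reference) against time, per day.

    Assumes all counts are strictly positive and that at least two
    distinct time points are supplied.
    """
    if not (len(days) == len(variant_counts) == len(reference_counts)):
        raise ValueError("all inputs must have the same length")
    if len(set(days)) < 2:
        raise ValueError("need at least two distinct time points")

    log_odds = [math.log(v / r) for v, r in zip(variant_counts, reference_counts)]
    mean_t = sum(days) / len(days)
    mean_y = sum(log_odds) / len(log_odds)
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, log_odds))
    return sxy / sxx


# Hypothetical weekly sequence counts (not real surveillance data).
days = [0, 7, 14, 21]
variant = [5, 12, 30, 70]
reference = [95, 88, 70, 30]
rate = relative_growth_rate(days, variant, reference)
print(f"relative growth rate: {rate:.3f} per day")
```

A positive slope means the variant is gaining on the reference lineage; a production analysis would also need uncertainty estimates and a way to handle zero counts.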
Subject(s)
COVID-19 , Genome, Viral , SARS-CoV-2 , Canada/epidemiology , SARS-CoV-2/genetics , Humans , COVID-19/epidemiology , COVID-19/virology , Genomics/methods , Pandemics , Databases, Genetic
ABSTRACT
Metastatic relapse is the major cause of death in pediatric neuroblastoma, where there remains a lack of therapies to target this stage of disease. To understand the molecular mechanisms mediating neuroblastoma metastasis, we developed a mouse model using intracardiac injection and in vivo selection to isolate malignant cell subpopulations with a higher propensity for metastasis to bone and the central nervous system. Gene expression profiling revealed primary and metastatic cells as two distinct cell populations defined by differential expression of 412 genes and of multiple pathways, including CADM1, SPHK1, and YAP/TAZ, whose expression independently predicted survival. In the metastatic subpopulations, a gene signature was defined (MET-75) that predicted survival of neuroblastoma patients with metastatic disease. Mechanistic investigations demonstrated causal roles for CADM1, SPHK1, and YAP/TAZ in mediating metastatic phenotypes in vitro and in vivo. Notably, pharmacologic targeting of SPHK1 or YAP/TAZ was sufficient to inhibit neuroblastoma metastasis in vivo. Overall, we identify gene expression signatures and candidate therapeutics that could improve the treatment of metastatic neuroblastoma. Cancer Res; 77(3); 696-706. ©2017 AACR.
Subject(s)
Neoplasm Invasiveness/genetics , Neoplasm Invasiveness/pathology , Neuroblastoma/genetics , Neuroblastoma/pathology , Transcriptome , Animals , Cell Line, Tumor , Disease Models, Animal , Gene Expression Profiling , Heterografts , Immunoblotting , Magnetic Resonance Imaging , Male , Mice , Mice, Inbred NOD , Mice, SCID , Oligonucleotide Array Sequence Analysis , X-Ray Microtomography
ABSTRACT
Tumors often contain multiple subpopulations of cancerous cells defined by distinct somatic mutations. We describe a new method, PhyloWGS, which can be applied to whole-genome sequencing data from one or more tumor samples to reconstruct complete genotypes of these subpopulations based on variant allele frequencies (VAFs) of point mutations and population frequencies of structural variations. We introduce a principled phylogenic correction for VAFs in loci affected by copy number alterations and we show that this correction greatly improves subclonal reconstruction compared to existing methods. PhyloWGS is free, open-source software, available at https://github.com/morrislab/phylowgs.
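To make the link between a variant allele frequency and a subpopulation's frequency concrete, here is a minimal sketch under a deliberately simple assumption: every cell carrying the mutation has the same total copy number at the locus, and every other cell is diploid with no mutant copies. This is not the PhyloWGS model, which places copy number changes and point mutations on a phylogeny; the function names are hypothetical.

```python
def expected_vaf(population_freq, mutant_copies=1, total_copies=2):
    """Expected VAF when a fraction `population_freq` of cells carries the
    mutation on `mutant_copies` of `total_copies` alleles, and all other
    cells are diploid without the mutation (simplifying assumption)."""
    mutant_alleles = population_freq * mutant_copies
    all_alleles = population_freq * total_copies + (1.0 - population_freq) * 2
    return mutant_alleles / all_alleles


def population_frequency(vaf, mutant_copies=1, total_copies=2):
    """Invert expected_vaf under the same simplifying assumption."""
    denominator = mutant_copies - vaf * (total_copies - 2)
    if denominator <= 0:
        raise ValueError("VAF is not achievable with these copy numbers")
    return 2.0 * vaf / denominator


# Heterozygous mutation in a diploid region: frequency is twice the VAF.
print(population_frequency(0.25))
# Same VAF where mutated cells carry 2 mutant copies out of 3.
print(population_frequency(0.25, mutant_copies=2, total_copies=3))
```

The second call shows why ignoring copy number can overestimate a subpopulation's size: the same VAF implies fewer mutated cells when each of them carries extra mutant copies.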
Subject(s)
Genome, Human , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Phylogeny , Algorithms , Clone Cells , Cluster Analysis , Computer Simulation , DNA Copy Number Variations , Gene Frequency , Genetic Heterogeneity , Humans , Mutation , Reference Standards
ABSTRACT
BACKGROUND: Accurate detection of somatic single nucleotide variants and small insertions and deletions from DNA sequencing experiments of tumour-normal pairs is a challenging task. Tumour samples are often contaminated with normal cells confounding the available evidence for the somatic variants. Furthermore, tumours are heterogeneous so sub-clonal variants are observed at reduced allele frequencies. We present here a cell-line titration series dataset that can be used to evaluate somatic variant calling pipelines with the goal of reliably calling true somatic mutations at low allele frequencies. RESULTS: Cell-line DNA was mixed with matched normal DNA at 8 different ratios to generate samples with known tumour cellularities, and exome sequenced on Illumina HiSeq to depths of >300×. The data was processed with several different variant calling pipelines and verification experiments were performed to assay >1500 somatic variant candidates using Ion Torrent PGM as an orthogonal technology. By examining the variants called at varying cellularities and depths of coverage, we show that the best performing pipelines are able to maintain a high level of precision at any cellularity. In addition, we estimate the number of true somatic variants undetected as cellularity and coverage decrease. CONCLUSIONS: Our cell-line titration series dataset, along with the associated verification results, was effective for this evaluation and will serve as a valuable dataset for future somatic calling algorithm development. The data is available for further analysis at the European Genome-phenome Archive under accession number EGAS00001001016. Data access requires registration through the International Cancer Genome Consortium's Data Access Compliance Office (ICGC DACO).
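The effect of cellularity and depth on detectability can be sketched with a simple binomial model. The example below assumes a heterozygous somatic variant in a diploid region, no sequencing error, and a caller that needs a fixed minimum number of supporting reads; the cellularity values and the read threshold are hypothetical and are not taken from the dataset.

```python
import math


def expected_vaf(cellularity):
    """Expected allele frequency of a heterozygous, diploid somatic variant
    when a fraction `cellularity` of the cells are tumour cells."""
    return cellularity / 2.0


def detection_probability(depth, vaf, min_alt_reads):
    """P(at least `min_alt_reads` variant reads) under Binomial(depth, vaf)."""
    p_below = sum(
        math.comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


# Hypothetical cellularities and a hypothetical threshold of 5 variant reads.
for cellularity in (0.05, 0.10, 0.20, 0.50, 1.00):
    vaf = expected_vaf(cellularity)
    for depth in (50, 300):
        p = detection_probability(depth, vaf, min_alt_reads=5)
        print(f"cellularity={cellularity:.2f} depth={depth} P(detect)={p:.3f}")
```

Real pipelines also contend with sequencing error, mapping artefacts, and copy number changes, so these probabilities are best read as an upper bound on sensitivity under the stated assumptions.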
Subject(s)
DNA, Neoplasm/genetics , Genetic Variation , Neoplasms/genetics , Algorithms , Carcinoma, Pancreatic Ductal/genetics , Cell Line, Tumor , Computational Biology , DNA Mutational Analysis , Databases, Nucleic Acid , Exome/genetics , Gene Frequency , Gene Library , Humans , INDEL Mutation , Mutation , Pancreatic Neoplasms/genetics , Polymorphism, Single Nucleotide , Software
ABSTRACT
Reactome is an open source, expert-authored, manually curated and peer-reviewed database of reactions, pathways and biological processes. We provide an intuitive web-based user interface to pathway knowledge and a suite of data analysis tools. The Reactome BioMart provides biologists and bioinformaticians with a single web interface for performing simple or elaborate queries of the Reactome database, aggregating data from different sources and providing an opportunity to integrate experimental and computational results with information relating to biological pathways. Database URL: http://www.reactome.org.
Subject(s)
Databases, Factual , Internet , Metabolic Networks and Pathways , Computational Biology , Humans , Search Engine
ABSTRACT
Separation of basic proteins with 2-DE presents technical challenges involving protein precipitation, load limitations, and streaking. Cardiac mitochondria are enriched in basic proteins and difficult to resolve by 2-DE. We investigated two methods, cup and paper bridge, for sample loading of this subproteome into the basic range (pH 6-11) gels. Paper bridge loading consistently produced improved resolution of both analytical and preparative protein loads. A unique benefit of this technique is that proteins retained in the paper bridge after loading basic gels can be reloaded onto lower pH gradients (pH 4-7), allowing valued samples to be analyzed on multiple pH ranges.
Subject(s)
Electrophoresis, Gel, Two-Dimensional/methods , Mitochondria, Heart/chemistry , Proteome/analysis , Submitochondrial Particles/chemistry , Animals , Hydrogen-Ion Concentration , Mice , Peptide Mapping , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
ABSTRACT
Dilated cardiomyopathy is now the leading cause of cardiovascular morbidity and mortality. While the molecular basis of this disease remains uncertain, evidence is emerging that gene expression profiles of left ventricular myocardium isolated from failing versus nonfailing patients differ dramatically. In this study, we use high-density oligonucleotide microarrays with approximately 22000 probes to characterize differences in the expression profiles further. To facilitate interpretation of experimental data, we evaluate algorithms for normalization of hybridization data and for computation of gene expression indices using a control spike-in data set. We then use these methods to identify statistically significant changes in the expression levels of genes not previously implicated in the molecular phenotype of heart failure. These regulated genes take part in diverse cellular processes, including transcription, apoptosis, sarcomeric and cytoskeletal function, remodeling of the extracellular matrix, membrane transport, and metabolism.
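The abstract does not say which normalization algorithms were evaluated. As one widely used example of the kind of method involved, the sketch below implements quantile normalization, which forces every array to share the same intensity distribution. The function name and the toy intensities are hypothetical, and ties are broken by position rather than averaged.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length lists of probe intensities.

    Each value is replaced by the mean, across arrays, of the values
    that share its rank within their own array.
    """
    if not arrays or any(len(a) != len(arrays[0]) for a in arrays):
        raise ValueError("arrays must be non-empty and of equal length")

    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[rank] for s in sorted_arrays) / len(arrays)
        for rank in range(n_probes)
    ]

    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy intensities for two arrays and three probes (not real data).
print(quantile_normalize([[5, 2, 3], [4, 1, 6]]))
# → [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalization both arrays contain the same set of values, so remaining differences between arrays reflect changes in probe rank, not overall intensity scale.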