ABSTRACT
OBJECTIVE: Intraductal papillary mucinous neoplasms (IPMNs) are non-invasive precursor lesions that can progress to invasive pancreatic cancer and are classified as low-grade or high-grade based on the morphology of the neoplastic epithelium. We aimed to compare genetic alterations in low-grade and high-grade regions of the same IPMN in order to identify molecular alterations underlying neoplastic progression. DESIGN: We performed multiregion whole exome sequencing on tissue samples from 17 IPMNs with both low-grade and high-grade dysplasia (76 IPMN regions, including 49 from low-grade dysplasia and 27 from high-grade dysplasia). We reconstructed the phylogeny for each case, and we assessed mutations in a novel driver gene in an independent cohort of 63 IPMN cyst fluid samples. RESULTS: Our multiregion whole exome sequencing identified KLF4, a previously unreported genetic driver of IPMN tumorigenesis, with hotspot mutations in one of two codons identified in >50% of the analyzed IPMNs. Mutations in KLF4 were significantly more prevalent in low-grade regions in our sequenced cases. Phylogenetic analyses of whole exome sequencing data demonstrated diverse patterns of IPMN initiation and progression. Hotspot mutations in KLF4 were also identified in an independent cohort of IPMN cyst fluid samples, again with a significantly higher prevalence in low-grade IPMNs. CONCLUSION: Hotspot mutations in KLF4 occur at high prevalence in IPMNs. Unique among pancreatic driver genes, KLF4 mutations are enriched in low-grade IPMNs. These data highlight distinct molecular features of low-grade and high-grade dysplasia and suggest diverse pathways to high-grade dysplasia via the IPMN pathway.
Subject(s)
Mucinous Adenocarcinoma/genetics , Papillary Carcinoma/genetics , Exome Sequencing , Pancreatic Intraductal Neoplasms/genetics , Mucinous Adenocarcinoma/pathology , Tumor Biomarkers/genetics , Papillary Carcinoma/pathology , Humans , Kruppel-Like Factor 4/genetics , Mutation , Neoplasm Grading , Pancreatic Intraductal Neoplasms/pathology , Retrospective Studies
ABSTRACT
BACKGROUND & AIMS: Intraductal papillary mucinous neoplasms (IPMNs) are lesions that can progress to invasive pancreatic cancer and constitute an important system for studies of pancreatic tumorigenesis. We performed comprehensive genomic analyses of entire IPMNs to determine the diversity of somatic mutations in genes that promote tumorigenesis. METHODS: We microdissected neoplastic tissues from 6-24 regions each of 20 resected IPMNs, resulting in 227 neoplastic samples that were analyzed by capture-based targeted sequencing. Somatic mutations in genes associated with pancreatic tumorigenesis were assessed across entire IPMN lesions, and the resulting data were supported by evolutionary modeling, whole-exome sequencing, and in situ detection of mutations. RESULTS: We found a high prevalence of heterogeneity among mutations in IPMNs. Heterogeneity in mutations in KRAS and GNAS was significantly more prevalent in IPMNs with low-grade dysplasia than in IPMNs with high-grade dysplasia (P < .02). Whole-exome sequencing confirmed that IPMNs contained multiple independent clones, each with distinct mutations, as originally indicated by targeted sequencing and evolutionary modeling. We also found evidence for convergent evolution of mutations in RNF43 and TP53, which are acquired during later stages of tumorigenesis. CONCLUSIONS: In an analysis of the heterogeneity of mutations throughout IPMNs, we found that early-stage IPMNs contain multiple independent clones, each with distinct mutations, indicating their polyclonal origin. These findings challenge the model in which pancreatic neoplasms arise from a single clone. Increasing our understanding of the mechanisms of IPMN polyclonality could lead to strategies to identify patients at increased risk for pancreatic cancer.
Subject(s)
Tumor Biomarkers/genetics , Neoplastic Cell Transformation/genetics , Mutation , Pancreatic Intraductal Neoplasms/genetics , Pancreatic Neoplasms/genetics , Aged , Aged 80 and Over , Neoplastic Cell Transformation/pathology , Chromogranins/genetics , Clonal Evolution , DNA Mutational Analysis , DNA-Binding Proteins/genetics , Molecular Evolution , Female , Gs GTP-Binding Protein alpha Subunits/genetics , Genetic Predisposition to Disease , Humans , Male , Middle Aged , Mutation Rate , Neoplasm Staging , Oncogene Proteins/genetics , Pancreatic Intraductal Neoplasms/pathology , Pancreatic Neoplasms/pathology , Phenotype , Proto-Oncogene Proteins p21(ras)/genetics , Retrospective Studies , Ubiquitin-Protein Ligases
ABSTRACT
Intraductal papillary mucinous neoplasms (IPMNs) are precursors to pancreatic cancer; however, little is known about genetic heterogeneity in these lesions. The objective of this study was to characterize genetic heterogeneity in IPMNs at the single-cell level. We isolated single cells from fresh tissue from ten IPMNs, followed by whole genome amplification and targeted next-generation sequencing of pancreatic driver genes. We then determined single-cell genotypes using a novel multi-sample mutation calling algorithm. Our analyses revealed that different mutations in the same driver gene frequently occur in the same IPMN. Two IPMNs had multiple mutations in the initiating driver gene KRAS that occurred in unique tumor clones, suggesting the possibility of polyclonal origin or an unidentified initiating event preceding this critical mutation. Multiple mutations in later-occurring driver genes were also common and were frequently localized to unique tumor clones, raising the possibility of convergent evolution of these genetic events in pancreatic tumorigenesis. Single-cell sequencing of IPMNs demonstrated genetic heterogeneity with respect to early and late occurring driver gene mutations, suggesting a more complex pattern of tumor evolution than previously appreciated in these lesions. Copyright © 2018 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd.
Subject(s)
Genetic Heterogeneity , Pancreatic Intraductal Neoplasms/genetics , Aged , Aged 80 and Over , DNA Mutational Analysis/methods , Female , Neoplasm Genes/genetics , Genetic Predisposition to Disease , High-Throughput Nucleotide Sequencing , Humans , Male , Middle Aged , Mutation , Pancreatic Intraductal Neoplasms/pathology , Pancreatic Neoplasms/genetics , Pancreatic Neoplasms/pathology , Proto-Oncogene Proteins p21(ras)/genetics
ABSTRACT
The evolution of new biochemical activities frequently involves complex dependencies between mutations and rapid evolutionary radiation. Mutation co-occurrence and covariation have previously been used to identify compensating mutations that are the result of physical contacts and preserve protein function and fold. Here, we model pairwise functional dependencies and higher order interactions that enable evolution of new protein functions. We use a network model to find complex dependencies between mutations resulting from evolutionary trade-offs and pleiotropic effects. We present a method to construct these networks and to identify functionally interacting mutations in both extant and reconstructed ancestral sequences (Network Analysis of Protein Adaptation). The time ordering of mutations can be incorporated into the networks through phylogenetic reconstruction. We apply NAPA to three distantly homologous β-lactamase protein clusters (TEM, CTX-M-3, and OXA-51), each of which has experienced recent evolutionary radiation under substantially different selective pressures. By analyzing the network properties of each protein cluster, we identify key adaptive mutations, positive pairwise interactions, different adaptive solutions to the same selective pressure, and complex evolutionary trajectories likely to increase protein fitness. We also present evidence that incorporating information from phylogenetic reconstruction and ancestral sequence inference can reduce the number of spurious links in the network, while preserving overall network community structure. The analysis does not require structural or biochemical data. In contrast to function-preserving mutation dependencies, which are frequently from structural contacts, gain-of-function mutation dependencies are most commonly between residues distal in protein structure.
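As a rough illustration of the co-occurrence idea mentioned in this abstract, the sketch below counts how often pairs of mutations appear together in the same sequence and keeps the pairs seen at least a minimum number of times as network edges. It is not the NAPA method itself: the function name `cooccurrence_edges`, the `min_count` threshold, and the example mutation sets are all assumptions made for illustration, and no phylogenetic information is used.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(sequences, min_count=2):
    """Return {(mut_a, mut_b): count} for mutation pairs that co-occur
    in at least `min_count` sequences.

    `sequences` is an iterable of sets of mutation labels (hypothetical input).
    """
    pair_counts = Counter()
    for mutations in sequences:
        # Sort so each unordered pair has one canonical key.
        for pair in combinations(sorted(mutations), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


# Hypothetical variant sequences, each described by its set of mutations.
example = [
    {"E104K", "G238S"},
    {"E104K", "G238S", "M182T"},
    {"M182T"},
]
edges = cooccurrence_edges(example, min_count=2)
print(edges)
```

A raw count like this ignores shared ancestry, which is the kind of spurious link the abstract says phylogenetic reconstruction can help reduce.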
Subject(s)
Biological Adaptation , Molecular Evolution , Genetic Models , Mutation , beta-Lactamases/genetics , Phylogeny
ABSTRACT
The advent of next-generation sequencing has dramatically decreased the cost for whole-genome sequencing and increased the viability for its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features.
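For readers unfamiliar with the ROC AUC metric named above, a minimal sketch follows. It uses the standard pairwise interpretation of AUC: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name `roc_auc` and the example labels and scores are hypothetical; this is not the assessment code used in the challenge.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = individual has the trait)
    scores: predicted probabilities or rankings, same length as labels
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions for four individuals.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of individuals with and without the trait.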
Subject(s)
High-Throughput Nucleotide Sequencing/methods , Whole Genome Sequencing/methods , Area Under Curve , Genetic Predisposition to Disease , Human Genome Project , Humans , Phenotype , Quantitative Trait Loci
ABSTRACT
Recent improvements in next-generation sequencing of tumor samples and the ability to identify somatic mutations at low allelic fractions have opened the way for new approaches to model the evolution of individual cancers. The power and utility of these models is increased when tumor samples from multiple sites are sequenced. Temporal ordering of the samples may provide insight into the etiology of both primary and metastatic lesions and rationalizations for tumor recurrence and therapeutic failures. Additional insights may be provided by temporal ordering of evolving subclones--cellular subpopulations with unique mutational profiles. Current methods for subclone hierarchy inference tightly couple the problem of temporal ordering with that of estimating the fraction of cancer cells harboring each mutation. We present a new framework that includes a rigorous statistical hypothesis test and a collection of tools that make it possible to decouple these problems, which we believe will enable substantial progress in the field of subclone hierarchy inference. The methods presented here can be flexibly combined with methods developed by others addressing either of these problems. We provide tools to interpret hypothesis test results, which inform phylogenetic tree construction, and we introduce the first genetic algorithm designed for this purpose. The utility of our framework is systematically demonstrated in simulations. For most tested combinations of tumor purity, sequencing coverage, and tree complexity, good power (≥ 0.8) can be achieved and Type 1 error is well controlled when at least three tumor samples are available from a patient. 
Using data from three published multi-region tumor sequencing studies of (murine) small cell lung cancer, acute myeloid leukemia, and chronic lymphocytic leukemia, in which the authors reconstructed subclonal phylogenetic trees by manual expert curation, we show how different configurations of our tools can identify either a single tree in agreement with the authors, or a small set of trees, which include the authors' preferred tree. Our results have implications for improved modeling of tumor evolution and the importance of multi-region tumor sequencing.
Subject(s)
Clonal Evolution/genetics , DNA Mutational Analysis/methods , Neoplasm DNA/genetics , High-Throughput Nucleotide Sequencing/methods , Mutation/genetics , Neoplasms/genetics , Algorithms , Animals , Base Sequence , Molecular Evolution , Mice , Molecular Sequence Data , Automated Pattern Recognition/methods , Single-Cell Analysis/methods
ABSTRACT
Genetic screening is becoming possible on an unprecedented scale. However, its utility remains controversial. Although most variant genotypes cannot be easily interpreted, many individuals nevertheless attempt to interpret their genetic information. Initiatives such as the Personal Genome Project (PGP) and Illumina's Understand Your Genome are sequencing thousands of adults, collecting phenotypic information and developing computational pipelines to identify the most important variant genotypes harbored by each individual. These pipelines consider database and allele frequency annotations and bioinformatics classifications. We propose that the next step will be to integrate these different sources of information to estimate the probability that a given individual has specific phenotypes of clinical interest. To this end, we have designed a Bayesian probabilistic model to predict the probability of dichotomous phenotypes. When applied to a cohort from PGP, predictions of Gilbert syndrome, Graves' disease, non-Hodgkin lymphoma, and various blood groups were accurate, as individuals manifesting the phenotype in question exhibited the highest, or among the highest, predicted probabilities. Thirty-eight PGP phenotypes (26%) were predicted with area-under-the-ROC curve (AUC)>0.7, and 23 (15.8%) of these were statistically significant, based on permutation tests. Moreover, in a Critical Assessment of Genome Interpretation (CAGI) blinded prediction experiment, the models were used to match 77 PGP genomes to phenotypic profiles, generating the most accurate prediction of 16 submissions, according to an independent assessor. Although the models are currently insufficiently accurate for diagnostic utility, we expect their performance to improve with growth of publicly available genomics data and model refinement by domain experts.
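The abstract does not describe the structure of the Bayesian model, so the following is only a generic sketch of how a prior (population prevalence of a dichotomous phenotype) can be combined with evidence from observed variants. It assumes, purely for illustration, that each variant contributes an independent likelihood ratio; the function name `posterior_probability` and all numbers are hypothetical and are not taken from the paper.

```python
def posterior_probability(prevalence, likelihood_ratios):
    """Posterior P(phenotype | variants) from a prior prevalence and
    per-variant likelihood ratios, assuming the variants are independent.

    likelihood ratio = P(variant | phenotype) / P(variant | no phenotype)
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    odds = prevalence / (1.0 - prevalence)   # prior odds
    for lr in likelihood_ratios:
        odds *= lr                           # Bayes' rule in odds form
    return odds / (1.0 + odds)               # back to a probability


# Hypothetical example: 10% prevalence, one variant with likelihood ratio 9.
p = posterior_probability(0.10, [9.0])
print(round(p, 3))
```

Ranking individuals by such a posterior is what an AUC, as reported in the abstract, would then evaluate.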
Subject(s)
Genetic Predisposition to Disease/genetics , Genome/genetics , Genomics/methods , Statistical Models , DNA Sequence Analysis/methods , Bayes Theorem , Genome-Wide Association Study , Human Genome Project , Humans , Phenotype
ABSTRACT
Multiple mutations often have non-additive (epistatic) phenotypic effects. Epistasis is of fundamental biological relevance but is not well understood mechanistically. Adaptive evolution, i.e., the evolution of new biochemical activities, is rich in epistatic interactions. To better understand the principles underlying epistasis during genetic adaptation, we studied the evolution of TEM-1 β-lactamase variants exhibiting cefotaxime resistance. We report the collection of a library of 487 observed evolutionary trajectories for TEM-1 and determine the epistasis status based on cefotaxime resistance phenotype for 206 combinations of 2-3 TEM-1 mutations involving 17 positions under adaptive selective pressure. Gain-of-function (GOF) mutations are gatekeepers for adaptation. To see if GOF phenotypes can be inferred based solely on sequence data, we calculated the enrichment of GOF mutations in the different categories of epistatic pairs. Our results suggest that this is possible because GOF mutations are particularly enriched in sign and reciprocal sign epistasis, which leave a major imprint on the sequence space accessible to evolution. We also used FoldX to explore the relationship between thermodynamic stability and epistasis. We found that mutations in observed evolutionary trajectories tend to destabilize the folded structure of the protein, although their cumulative effects are consistently below the protein's free energy of folding. The destabilizing effect is stronger for epistatic pairs, suggesting that modest or local alterations in folding stability can modulate catalysis. Finally, we report a significant relationship between epistasis and the degree to which two protein positions are structurally and dynamically coupled, even in the absence of ligand.
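The categories of epistasis named in this abstract can be illustrated with a short sketch for a pair of mutations A and B. It uses the common textbook definitions on an additive scale (for example, log-transformed resistance): no epistasis when the double mutant equals the sum of single effects, magnitude epistasis when effects change size but not direction, sign epistasis when one mutation's effect reverses on the other's background, and reciprocal sign epistasis when both reverse. The exact criteria and tolerance used by the authors are not given in the abstract; `classify_epistasis`, `tol`, and the example values are assumptions.

```python
def classify_epistasis(w_wt, w_a, w_b, w_ab, tol=1e-9):
    """Classify pairwise epistasis from four phenotype values on an
    additive (e.g. log) scale: wild type, single mutants A and B,
    and the double mutant AB."""
    epsilon = w_ab - w_a - w_b + w_wt        # deviation from additivity
    if abs(epsilon) <= tol:
        return "none"
    # Effect of each mutation without and with the other mutation present.
    a_alone, a_on_b = w_a - w_wt, w_ab - w_b
    b_alone, b_on_a = w_b - w_wt, w_ab - w_a
    a_flips = a_alone * a_on_b < 0
    b_flips = b_alone * b_on_a < 0
    if a_flips and b_flips:
        return "reciprocal sign"
    if a_flips or b_flips:
        return "sign"
    return "magnitude"


# Hypothetical phenotype values (not data from the study).
print(classify_epistasis(0, 1, 1, 2))    # additive
print(classify_epistasis(0, 1, 1, 3))    # same direction, larger effect
print(classify_epistasis(0, 1, -1, 2))   # B is harmful alone, beneficial with A
print(classify_epistasis(0, -1, -1, 2))  # both harmful alone, beneficial together
```

Sign and reciprocal sign epistasis matter for accessibility because a trajectory that must pass through a single mutant with reduced resistance is unlikely to be followed under selection.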
Subject(s)
Bacteria , Bacterial Drug Resistance , Molecular Evolution , beta-Lactamases , beta-Lactamases/genetics , Cefotaxime/pharmacology , Gain of Function Mutation , Bacteria/drug effects , Bacteria/genetics , Genetic Epistasis , Protein Folding
ABSTRACT
Intraductal papillary mucinous neoplasms (IPMNs) and mucinous cystic neoplasms (MCNs) are non-invasive neoplasms that are often observed in association with invasive pancreatic cancers, but their origins and evolutionary relationships are poorly understood. In this study, we analyze 148 samples from IPMNs, MCNs, and small associated invasive carcinomas from 18 patients using whole exome or targeted sequencing. Using evolutionary analyses, we establish that both IPMNs and MCNs are direct precursors to pancreatic cancer. Mutations in SMAD4 and TGFBR2 are frequently restricted to invasive carcinoma, while RNF43 alterations are largely in non-invasive lesions. Genomic analyses suggest an average window of over three years between the development of high-grade dysplasia and pancreatic cancer. Taken together, these data establish non-invasive IPMNs and MCNs as origins of invasive pancreatic cancer, identifying potential drivers of invasion, highlighting the complex clonal dynamics prior to malignant transformation, and providing opportunities for early detection and intervention.