ABSTRACT
Intra-tumor heterogeneity (ITH) is a mechanism of therapeutic resistance and therefore an important clinical challenge. However, the extent, origin, and drivers of ITH across cancer types are poorly understood. To address this, we extensively characterize ITH across whole-genome sequences of 2,658 cancer samples spanning 38 cancer types. Nearly all informative samples (95.1%) contain evidence of distinct subclonal expansions with frequent branching relationships between subclones. We observe positive selection of subclonal driver mutations across most cancer types and identify cancer type-specific subclonal patterns of driver gene mutations, fusions, structural variants, and copy number alterations as well as dynamic changes in mutational processes between subclonal expansions. Our results underline the importance of ITH and its drivers in tumor evolution and provide a pan-cancer resource of comprehensively annotated subclonal events from whole-genome sequencing data.
Subject(s)
Genetic Heterogeneity , Neoplasms/genetics , DNA Copy Number Variations , Neoplasm DNA/chemistry , Neoplasm DNA/metabolism , Genetic Databases , Antineoplastic Drug Resistance/genetics , Humans , Neoplasms/pathology , Single Nucleotide Polymorphism , Whole Genome Sequencing
ABSTRACT
The majority of newly diagnosed prostate cancers are slow growing, with a long natural life history. Yet a subset can metastasize with lethal consequences. We reconstructed the phylogenies of 293 localized prostate tumors linked to clinical outcome data. Multiple subclones were detected in 59% of patients, and specific subclonal architectures associate with adverse clinicopathological features. Early tumor development is characterized by point mutations and deletions followed by later subclonal amplifications and changes in trinucleotide mutational signatures. Specific genes are selectively mutated prior to or following subclonal diversification, including MTOR, NKX3-1, and RB1. Patients with low-risk monoclonal tumors rarely relapse after primary therapy (7%), while those with high-risk polyclonal tumors frequently do (61%). The presence of multiple subclones in an index biopsy may be necessary, but not sufficient, for relapse of localized prostate cancer, suggesting that evolution-aware biomarkers should be studied in prospective studies of low-risk tumors suitable for active surveillance.
Subject(s)
Prostatic Neoplasms/pathology , Tumor Biomarkers/blood , High-Throughput Nucleotide Sequencing , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Humans , Male , Neoplasm Grading , Local Neoplasm Recurrence , Single Nucleotide Polymorphism , Proportional Hazards Models , Prospective Studies , Prostatic Neoplasms/classification , Prostatic Neoplasms/genetics , Retinoblastoma Binding Proteins/genetics , Retinoblastoma Binding Proteins/metabolism , TOR Serine-Threonine Kinases/genetics , TOR Serine-Threonine Kinases/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism , Ubiquitin-Protein Ligases/genetics , Ubiquitin-Protein Ligases/metabolism
ABSTRACT
Cancer develops through a process of somatic evolution [1,2]. Sequencing data from a single biopsy represent a snapshot of this process that can reveal the timing of specific genomic aberrations and the changing influence of mutational processes [3]. Here, by whole-genome sequencing analysis of 2,658 cancers as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) [4], we reconstruct the life history and evolution of mutational processes and driver mutation sequences of 38 types of cancer. Early oncogenesis is characterized by mutations in a constrained set of driver genes, and specific copy number gains, such as trisomy 7 in glioblastoma and isochromosome 17q in medulloblastoma. The mutational spectrum changes significantly throughout tumour evolution in 40% of samples. A nearly fourfold diversification of driver genes and increased genomic instability are features of later stages. Copy number alterations often occur in mitotic crises, and lead to simultaneous gains of chromosomal segments. Timing analyses suggest that driver mutations often precede diagnosis by many years, if not decades. Together, these results determine the evolutionary trajectories of cancer, and highlight opportunities for early cancer detection.
Subject(s)
Molecular Evolution , Human Genome/genetics , Neoplasms/genetics , DNA Repair/genetics , Gene Dosage , Tumor Suppressor Genes , Genetic Variation , Humans , Insertional Mutagenesis/genetics
ABSTRACT
Subclonal reconstruction from bulk tumor DNA sequencing has become a pillar of cancer evolution studies, providing insight into the clonality and relative ordering of mutations and mutational processes. We provide an outline of the complex computational approaches used for subclonal reconstruction from single and multiple tumor samples. We identify the underlying assumptions and uncertainties in each step and suggest best practices for analysis and quality assessment. This guide provides a pragmatic resource for the growing user community of subclonal reconstruction methods.
Subject(s)
Neoplasm DNA/genetics , Neoplasms/genetics , DNA Sequence Analysis/methods , Algorithms , Humans , Single Nucleotide Polymorphism
ABSTRACT
Tumors contain multiple subpopulations of genetically distinct cancer cells. Reconstructing their evolutionary history can improve our understanding of how cancers develop and respond to treatment. Subclonal reconstruction methods cluster mutations into groups that co-occur within the same subpopulations, estimate the frequency of cells belonging to each subpopulation, and infer the ancestral relationships among the subpopulations by constructing a clone tree. However, often multiple clone trees are consistent with the data and current methods do not efficiently capture this uncertainty; nor can these methods scale to clone trees with a large number of subclonal populations. Here, we formalize the notion of a partially-defined clone tree (partial clone tree for short) that defines a subset of the pairwise ancestral relationships in a clone tree, thereby implicitly representing the set of all clone trees that have these defined pairwise relationships. Also, we introduce a special partial clone tree, the Maximally-Constrained Ancestral Reconstruction (MAR), which summarizes all clone trees fitting the input data equally well. Finally, we extend commonly used clone tree validity conditions to apply to partial clone trees and describe SubMARine, a polynomial-time algorithm producing the subMAR, which approximates the MAR and guarantees that its defined relationships are a subset of those present in the MAR. We also extend SubMARine to work with subclonal copy number aberrations and define equivalence constraints for this purpose. Further, we extend SubMARine to permit noise in the estimates of the subclonal frequencies while retaining its validity conditions and guarantees. In contrast to other clone tree reconstruction methods, SubMARine runs in time and space that scale polynomially in the number of subclones. 
We show through extensive noise-free simulation, a large lung cancer dataset and a prostate cancer dataset that the subMAR equals the MAR in all cases where only a single clone tree exists and that it is a perfect match to the MAR in most of the other cases. Notably, SubMARine runs in less than 70 seconds on a single thread with less than one Gb of memory on all datasets presented in this paper, including ones with 50 nodes in a clone tree. On the real-world data, SubMARine almost perfectly recovers the previously reported trees and identifies minor errors made in the expert-driven reconstructions of those trees. The freely-available open-source code implementing SubMARine can be downloaded at https://github.com/morrislab/submarine.
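The abstract above refers to clone tree validity conditions without listing them. As a minimal sketch, one condition commonly used in this literature is the sum rule: in every sample, a clone's frequency must be at least the sum of its children's frequencies. Whether this matches SubMARine's exact conditions is an assumption here; the function name `satisfies_sum_rule`, the dictionary layout, and the toy frequencies are all hypothetical and are not taken from the SubMARine code base.

```python
from typing import Dict, List, Optional


def satisfies_sum_rule(
    parent: Dict[int, Optional[int]],
    freq: Dict[int, List[float]],
    tol: float = 1e-9,
) -> bool:
    """Check the sum rule for a candidate clone tree (illustrative only).

    parent maps each clone to its parent clone (None for the root).
    freq maps each clone to its subclonal frequency in each sample.
    """
    # Build a child list for every clone from the parent map.
    children: Dict[int, List[int]] = {node: [] for node in parent}
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    n_samples = len(next(iter(freq.values())))
    for node, kids in children.items():
        for s in range(n_samples):
            # Children may not jointly exceed their parent's frequency.
            if sum(freq[k][s] for k in kids) > freq[node][s] + tol:
                return False
    return True


# Toy frequencies for four clones across two samples (made-up values).
toy_freq = {0: [1.0, 1.0], 1: [0.8, 0.6], 2: [0.5, 0.2], 3: [0.2, 0.3]}

# Candidate A: clones 2 and 3 branch from clone 1.
branching_tree = {0: None, 1: 0, 2: 1, 3: 1}
# Candidate B: a linear chain 0 -> 1 -> 2 -> 3.
linear_tree = {0: None, 1: 0, 2: 1, 3: 2}

branching_ok = satisfies_sum_rule(branching_tree, toy_freq)
linear_ok = satisfies_sum_rule(linear_tree, toy_freq)
```

In this toy case the branching tree passes, while the linear chain fails because clone 3 is more frequent than clone 2 in the second sample. Checks of this kind are what allow some ancestral relationships to be ruled out while others remain undetermined.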
Subject(s)
Algorithms , Computational Biology/methods , Mutation/genetics , Neoplasms , Computer Simulation , Molecular Evolution , High-Throughput Nucleotide Sequencing , Humans , Neoplasms/classification , Neoplasms/genetics , Whole Genome Sequencing
ABSTRACT
MOTIVATION: Kablammo is a web-based application that produces interactive, vector-based visualizations of sequence alignments generated by BLAST. These visualizations can illustrate many features, including shared protein domains, chromosome structural modifications and genome misassembly. AVAILABILITY AND IMPLEMENTATION: Kablammo can be used at http://kablammo.wasmuthlab.org. For a local installation, the source code and instructions are available under the MIT license at http://github.com/jwintersinger/kablammo. CONTACT: jeff@wintersinger.org.
Subject(s)
Computer Graphics , Helminth Genes/genetics , Sequence Alignment/methods , Software , Animals , Helminth Genome , Haemonchus/genetics , Internet , Programming Languages , DNA Sequence Analysis
ABSTRACT
Subclonal reconstruction algorithms use bulk DNA sequencing data to quantify parameters of tumor evolution, allowing an assessment of how cancers initiate, progress and respond to selective pressures. We launched the ICGC-TCGA (International Cancer Genome Consortium-The Cancer Genome Atlas) DREAM Somatic Mutation Calling Tumor Heterogeneity and Evolution Challenge to benchmark existing subclonal reconstruction algorithms. This 7-year community effort used cloud computing to benchmark 31 subclonal reconstruction algorithms on 51 simulated tumors. Algorithms were scored on seven independent tasks, leading to 12,061 total runs. Algorithm choice influenced performance substantially more than tumor features but purity-adjusted read depth, copy-number state and read mappability were associated with the performance of most algorithms on most tasks. No single algorithm was a top performer for all seven tasks and existing ensemble strategies were unable to outperform the best individual methods, highlighting a key research need. All containerized methods, evaluation code and datasets are available to support further assessment of the determinants of subclonal reconstruction accuracy and development of improved methods to understand tumor evolution.
ABSTRACT
Pairtree is a clone tree reconstruction algorithm that uses somatic point mutations to build clone trees describing the evolutionary history of individual cancers. Using the Pairtree software package, we describe steps to preprocess somatic mutation data, cluster mutations into subclones, search for clone trees, and visualize clone trees. Pairtree builds clone trees using up to 100 samples from a single cancer with at least 30 subclonal populations. For complete details on the use and execution of this protocol, please refer to Wintersinger et al. (2022).
Subject(s)
Neoplasms , Trees , Phylogeny , Algorithms , Neoplasms/genetics , Clone Cells
ABSTRACT
Cancers are composed of genetically distinct subpopulations of malignant cells. DNA-sequencing data can be used to determine the somatic point mutations specific to each population and build clone trees describing the evolutionary relationships between them. These clone trees can reveal critical points in disease development and inform treatment. Pairtree is a new method that constructs more accurate and detailed clone trees than previously possible using variant allele frequency data from one or more bulk cancer samples. It does so by first building a Pairs Tensor that captures the evolutionary relationships between pairs of subpopulations, and then it uses these relations to constrain clone trees and infer violations of the infinite sites assumption. Pairtree can accurately build clone trees using up to 100 samples per cancer that contain 30 or more subclonal populations. On 14 B-progenitor acute lymphoblastic leukemias, Pairtree replicates or improves upon expert-derived clone tree reconstructions. SIGNIFICANCE: Clone trees illustrate the evolutionary history of a cancer and can provide insights into how the disease changed through time (e.g., between diagnosis and relapse). Pairtree uses DNA-sequencing data from many samples of the same cancer to build more detailed and accurate clone trees than previously possible. See related commentary by Miller, p. 176. This article is highlighted in the In This Issue feature, p. 171.
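To make the idea of pairwise evolutionary relationships concrete, the following minimal sketch reads those relationships off a known clone tree. This is the reverse of what the abstract describes (Pairtree estimates pairwise relations from data first and then uses them to constrain the tree), and it is not Pairtree's data structure or API. The function names and the labels "ancestor", "descendant", and "branched" are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple


def ancestors(parent: Dict[int, Optional[int]], node: int) -> List[int]:
    """Return the ancestors of node, walking up to the root."""
    out: List[int] = []
    cur = parent[node]
    while cur is not None:
        out.append(cur)
        cur = parent[cur]
    return out


def pairwise_relations(
    parent: Dict[int, Optional[int]],
) -> Dict[Tuple[int, int], str]:
    """Label every ordered pair (a, b) of distinct subpopulations.

    "ancestor":   a is an ancestor of b
    "descendant": a is a descendant of b
    "branched":   a and b lie on different branches
    """
    anc = {n: set(ancestors(parent, n)) for n in parent}
    rel: Dict[Tuple[int, int], str] = {}
    for a in parent:
        for b in parent:
            if a == b:
                continue
            if a in anc[b]:
                rel[(a, b)] = "ancestor"
            elif b in anc[a]:
                rel[(a, b)] = "descendant"
            else:
                rel[(a, b)] = "branched"
    return rel


# Toy tree: 1 is the root; 2 and 3 are its children; 4 is a child of 2.
toy_tree = {1: None, 2: 1, 3: 1, 4: 2}
relations = pairwise_relations(toy_tree)
```

Here `relations[(1, 4)]` is "ancestor" and `relations[(3, 4)]` is "branched". A method that works from pairwise relations reasons in the opposite direction: it scores which label is most plausible for each pair given the allele frequency data, then searches for trees consistent with those labels.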
Subject(s)
Algorithms , Neoplasms , DNA , Molecular Evolution , Humans , Neoplasms/diagnosis , DNA Sequence Analysis
ABSTRACT
Central nervous system (CNS) dissemination of B-precursor acute lymphoblastic leukemia (B-ALL) has poor prognosis and remains a therapeutic challenge. Here we performed targeted DNA sequencing as well as transcriptional and proteomic profiling of paired leukemia-infiltrating cells in the bone marrow (BM) and CNS of xenografts. Genes governing mRNA translation were upregulated in CNS leukemia, and subclonal genetic profiling confirmed this in both BM-concordant and BM-discordant CNS mutational populations. CNS leukemia cells were exquisitely sensitive to the translation inhibitor omacetaxine mepesuccinate, which reduced xenograft leptomeningeal disease burden. Proteomics demonstrated greater abundance of secreted proteins in CNS-infiltrating cells, including complement component 3 (C3), and drug targeting of C3 influenced CNS disease in xenografts. CNS-infiltrating cells also exhibited selection for stemness traits and metabolic reprogramming. Overall, our study identifies targeting of mRNA translation as a potential therapeutic approach for B-ALL leptomeningeal disease. SIGNIFICANCE: Cancer metastases are often driven by distinct subclones with unique biological properties. Here we show that in B-ALL CNS disease, the leptomeningeal environment selects for cells with unique functional dependencies. Pharmacologic inhibition of mRNA translation signaling treats CNS disease and offers a new therapeutic approach for this condition. This article is highlighted in the In This Issue feature, p. 1.
Subject(s)
Central Nervous System Diseases , Central Nervous System Neoplasms , Meningeal Neoplasms , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Central Nervous System/metabolism , Central Nervous System Diseases/pathology , Central Nervous System Neoplasms/drug therapy , Humans , Meningeal Neoplasms/pathology , Precursor Cell Lymphoblastic Leukemia-Lymphoma/drug therapy , Protein Biosynthesis/genetics , Proteomics
ABSTRACT
The type and genomic context of cancer mutations depend on their causes. These causes have been characterized using signatures that represent mutation types that co-occur in the same tumours. However, it remains unclear how mutation processes change during cancer evolution due to the lack of reliable methods to reconstruct evolutionary trajectories of mutational signature activity. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole-genome sequencing data from 2658 cancers across 38 tumour types, we present TrackSig, a new method that reconstructs these trajectories using optimal, joint segmentation and deconvolution of mutation type and allele frequencies from a single tumour sample. In simulations, we find TrackSig has a 3-5% activity reconstruction error, and 12% false detection rate. It outperforms an aggressive baseline in situations with branching evolution, CNA gain, and neutral mutations. Applied to data from 2658 tumours and 38 cancer types, TrackSig permits pan-cancer insight into evolutionary changes in mutational processes.
Subject(s)
Computational Biology/methods , Mutation , Neoplasms/genetics , Computer Simulation , Molecular Evolution , Gene Frequency , Human Genome , Humans , Neoplasms/pathology , Single Nucleotide Polymorphism , Whole Genome Sequencing
ABSTRACT
Tumor DNA sequencing data can be interpreted by computational methods that analyze genomic heterogeneity to infer evolutionary dynamics. A growing number of studies have used these approaches to link cancer evolution with clinical progression and response to therapy. Although the inference of tumor phylogenies is rapidly becoming standard practice in cancer genome analyses, standards for evaluating them are lacking. To address this need, we systematically assess methods for reconstructing tumor subclonality. First, we elucidate the main algorithmic problems in subclonal reconstruction and develop quantitative metrics for evaluating them. Then we simulate realistic tumor genomes that harbor all known clonal and subclonal mutation types and processes. Finally, we benchmark 580 tumor reconstructions, varying tumor read depth, tumor type and somatic variant detection. Our analysis provides a baseline for the establishment of gold-standard methods to analyze tumor heterogeneity.