ABSTRACT
Cancer is driven by somatically acquired point mutations and chromosomal rearrangements, conventionally thought to accumulate gradually over time. Using next-generation sequencing, we characterize a phenomenon, which we term chromothripsis, whereby tens to hundreds of genomic rearrangements occur in a one-off cellular crisis. Rearrangements involving one or a few chromosomes crisscross back and forth across involved regions, generating frequent oscillations between two copy number states. These genomic hallmarks are highly improbable if rearrangements accumulate over time and instead imply that nearly all occur during a single cellular catastrophe. The stamp of chromothripsis can be seen in at least 2%-3% of all cancers, across many subtypes, and is present in ~25% of bone cancers. We find that one, or indeed more than one, cancer-causing lesion can emerge out of the genomic crisis. This phenomenon has important implications for the origins of genomic remodeling and temporal emergence of cancer.
Subject(s)
Chromosome Aberrations; Neoplasms/genetics; Neoplasms/pathology; Bone Neoplasms/genetics; Cell Line, Tumor; Chromosome Painting; Female; Gene Rearrangement; Humans; Leukemia, Lymphocytic, Chronic, B-Cell/genetics; Middle Aged
ABSTRACT
Rearrangements are discrete processes whereby discrete segments of DNA are deleted, replicated and inserted into novel positions. A sequence of such configurations, termed a rearrangement evolution, results in jumbled DNA arrangements, frequently observed in cancer genomes. We introduce a method that allows us to precisely count these different evolutions for a range of processes including breakage-fusion-bridge-cycles, tandem-duplications, inverted-duplications, reversals, transpositions and deletions, showing that the space of rearrangement evolution is super-exponential in size. These counts assume the infinite sites model of unique breakpoint usage.
Subject(s)
DNA; Genome; Gene Rearrangement/genetics; Genome/genetics
ABSTRACT
Clinical responses to anticancer therapies are often restricted to a subset of patients. In some cases, mutated cancer genes are potent biomarkers for responses to targeted agents. Here, to uncover new biomarkers of sensitivity and resistance to cancer therapeutics, we screened a panel of several hundred cancer cell lines--which represent much of the tissue-type and genetic diversity of human cancers--with 130 drugs under clinical and preclinical investigation. In aggregate, we found that mutated cancer genes were associated with cellular response to most currently available cancer drugs. Classic oncogene addiction paradigms were modified by additional tissue-specific or expression biomarkers, and some frequently mutated genes were associated with sensitivity to a broad range of therapeutic agents. Unexpected relationships were revealed, including the marked sensitivity of Ewing's sarcoma cells harbouring the EWS (also known as EWSR1)-FLI1 gene translocation to poly(ADP-ribose) polymerase (PARP) inhibitors. By linking drug activity to the functional complexity of cancer genomes, systematic pharmacogenomic profiling in cancer cell lines provides a powerful biomarker discovery platform to guide rational cancer therapeutic strategies.
Subject(s)
Drug Resistance, Neoplasm/genetics; Drug Screening Assays, Antitumor; Genes, Neoplasm/genetics; Genetic Markers/genetics; Genome, Human/genetics; Neoplasms/drug therapy; Neoplasms/genetics; Cell Line, Tumor; Cell Survival/drug effects; Drug Resistance, Neoplasm/drug effects; Gene Expression Regulation, Neoplastic/genetics; Genomics; Humans; Indoles/pharmacology; Neoplasms/pathology; Oncogene Proteins, Fusion/genetics; Pharmacogenetics; Phthalazines/pharmacology; Piperazines/pharmacology; Poly(ADP-ribose) Polymerase Inhibitors; Proto-Oncogene Protein c-fli-1/genetics; RNA-Binding Protein EWS/genetics; Sarcoma, Ewing/drug therapy; Sarcoma, Ewing/genetics; Sarcoma, Ewing/pathology
ABSTRACT
RNA virus populations undergo mutation and selection, resulting in a mixed population of viral particles. High-throughput sequencing of a viral population therefore contains a mixed signal of the underlying clones. We would like to identify the underlying evolutionary structures. We use two sources of information to attempt this: within-segment linkage information and mutation prevalence. We demonstrate that clone haplotypes, their prevalence, and maximum parsimony reticulate evolutionary structures can be identified, although the solutions may not be unique, even for complete sets of information. This is applied to a chain of influenza infection, where we infer evolutionary structures, including reassortment, and demonstrate some of the difficulties of interpretation that arise from deep sequencing due to artifacts such as template switching during PCR amplification.
Subject(s)
Evolution, Molecular; RNA Viruses/classification; RNA Viruses/genetics; RNA, Viral/genetics; Sequence Analysis, RNA/methods; Algorithms; Computational Biology; High-Throughput Nucleotide Sequencing; Models, Genetic; Mutation/genetics; Phylogeny; Polymerase Chain Reaction
ABSTRACT
The cancer genome is moulded by the dual processes of somatic mutation and selection. Homozygous deletions in cancer genomes occur over recessive cancer genes, where they can confer selective growth advantage, and over fragile sites, where they are thought to reflect an increased local rate of DNA breakage. However, most homozygous deletions in cancer genomes are unexplained. Here we identified 2,428 somatic homozygous deletions in 746 cancer cell lines. These overlie 11% of protein-coding genes that, therefore, are not mandatory for survival of human cells. We derived structural signatures that distinguish between homozygous deletions over recessive cancer genes and fragile sites. Application to clusters of unexplained homozygous deletions suggests that many are in regions of inherent fragility, whereas a small subset overlies recessive cancer genes. The results illustrate how structural signatures can be used to distinguish between the influences of mutation and selection in cancer genomes. The extensive copy number, genotyping, sequence and expression data available for this large series of publicly available cancer cell lines renders them informative reagents for future studies of cancer biology and drug discovery.
Subject(s)
Chromosome Fragile Sites/genetics; Gene Deletion; Genes, Neoplasm/genetics; Genes, Recessive/genetics; Genome, Human/genetics; Homozygote; Neoplasms/genetics; Selection, Genetic/genetics; Cell Line, Tumor; Chromosomes, Human/genetics; DNA Copy Number Variations/genetics; DNA Mutational Analysis; Gene Dosage/genetics; Humans; Models, Genetic; Oligonucleotide Array Sequence Analysis; Physical Chromosome Mapping; Reproducibility of Results
ABSTRACT
All cancers carry somatic mutations. A subset of these somatic alterations, termed driver mutations, confer selective growth advantage and are implicated in cancer development, whereas the remainder are passengers. Here we have sequenced the genomes of a malignant melanoma and a lymphoblastoid cell line from the same person, providing the first comprehensive catalogue of somatic mutations from an individual cancer. The catalogue provides remarkable insights into the forces that have shaped this cancer genome. The dominant mutational signature reflects DNA damage due to ultraviolet light exposure, a known risk factor for malignant melanoma, whereas the uneven distribution of mutations across the genome, with a lower prevalence in gene footprints, indicates that DNA repair has been preferentially deployed towards transcribed regions. The results illustrate the power of a cancer genome sequence to reveal traces of the DNA damage, repair, mutation and selection processes that were operative years before the cancer became symptomatic.
Subject(s)
Genes, Neoplasm/genetics; Genome, Human/genetics; Mutation/genetics; Neoplasms/genetics; Adult; Cell Line, Tumor; DNA Damage/genetics; DNA Mutational Analysis; DNA Repair/genetics; Gene Dosage/genetics; Humans; Loss of Heterozygosity/genetics; Male; Melanoma/etiology; Melanoma/genetics; MicroRNAs/genetics; Mutagenesis, Insertional/genetics; Neoplasms/etiology; Polymorphism, Single Nucleotide/genetics; Precision Medicine; Sequence Deletion/genetics; Ultraviolet Rays
ABSTRACT
Cancer genomes are complex, carrying thousands of somatic mutations including base substitutions, insertions and deletions, rearrangements, and copy number changes that have been acquired over decades. Recently, technologies have been introduced that allow generation of high-resolution, comprehensive catalogs of somatic alterations in cancer genomes. However, analyses of these data sets generally do not indicate the order in which mutations have occurred, or the resulting karyotype. Here, we introduce a mathematical framework that begins to address this problem. By using samples with accurate data sets, we can reconstruct relatively complex temporal sequences of rearrangements and provide an assembly of genomic segments into digital karyotypes. For cancer genes mutated in rearranged regions, this information can provide a chronological examination of the selective events that have taken place.
Subject(s)
Genome, Human; Models, Genetic; Neoplasms/genetics; Phylogeny; Translocation, Genetic; Computational Biology/methods; DNA Copy Number Variations; Evolution, Molecular; Humans; Mutation
ABSTRACT
Multiple somatic rearrangements are often found in cancer genomes; however, the underlying processes of rearrangement and their contribution to cancer development are poorly characterized. Here we use a paired-end sequencing strategy to identify somatic rearrangements in breast cancer genomes. There are more rearrangements in some breast cancers than previously appreciated. Rearrangements are more frequent over gene footprints and most are intrachromosomal. Multiple rearrangement architectures are present, but tandem duplications are particularly common in some cancers, perhaps reflecting a specific defect in DNA maintenance. Short overlapping sequences at most rearrangement junctions indicate that these have been mediated by non-homologous end-joining DNA repair, although varying sequence patterns indicate that multiple processes of this type are operative. Several expressed in-frame fusion genes were identified but none was recurrent. The study provides a new perspective on cancer genomes, highlighting the diversity of somatic rearrangements and their potential contribution to cancer development.
Subject(s)
Breast Neoplasms/genetics; Chromosome Aberrations; Gene Rearrangement/genetics; Genome, Human/genetics; Cell Line, Tumor; Cells, Cultured; DNA Breaks; Female; Genomic Library; Humans; Sequence Analysis, DNA
ABSTRACT
High-throughput oligonucleotide microarrays are commonly employed to investigate genetic disease, including cancer. The algorithms employed to extract genotypes and copy number variation function optimally for diploid genomes usually associated with inherited disease. However, cancer genomes are aneuploid in nature, leading to systematic errors when using these techniques. We introduce a preprocessing transformation and hidden Markov model algorithm bespoke to cancer. This produces genotype classification, specification of regions of loss of heterozygosity, and absolute allelic copy number segmentation. Accurate prediction is demonstrated with a combination of independent experimental techniques. These methods are exemplified with Affymetrix genome-wide SNP6.0 data from 755 cancer cell lines, enabling inference upon a number of features of biological interest. These data and the coded algorithm are freely available for download.
Subject(s)
Algorithms; Alleles; DNA Copy Number Variations/genetics; Genetic Testing; Models, Statistical; Neoplasms/genetics; Oligonucleotide Array Sequence Analysis/methods; Aneuploidy; Bayes Theorem; Bias; Cell Line, Tumor; Genes, Tumor Suppressor; Genotype; Humans; Internet; Loss of Heterozygosity/genetics; Markov Chains; Neoplasms/diagnosis; Polymorphism, Single Nucleotide/genetics; Polyploidy; Reproducibility of Results; Sensitivity and Specificity; Software
ABSTRACT
To identify a novel amplified cancer gene a systematic screen of 975 human cancer DNA samples, 750 cell lines and 225 primary tumors, using the Affymetrix 10K SNP microarray was undertaken. The screen identified 193 amplicons. A previously uncharacterized amplicon located on 6p21.2 whose 1 Mb minimal common amplified region contained eight genes (GLO1, DNAH8, GLP1R, C6orf64, KCNK5, KCNK17, KCNK16, and C6orf102) was further investigated to determine which gene(s) are the biological targets of this amplicon. Real time quantitative PCR (qPCR) analysis of all amplicon 6p21.2 genes in 618 human cancer cell lines identified GLO1, encoding glyoxalase 1, to be the most frequently amplified gene [twofold or greater amplification in 8.4% (49/536) of cancers]. Also the association between amplification and overexpression was greatest for GLO1. RNAi knockdown of GLO1 had the greatest and most consistent impact on cell accumulation and apoptosis. Cell lines with GLO1 amplification were more sensitive to inhibition of GLO1 by bromobenzylglutathione cyclopentyl diester (BBGC). Subsequent qPCR of 520 primary tumor samples identified twofold and greater amplification of GLO1 in 8/37 (22%) of breast, 12/71 (17%) of sarcomas, 6/53 (11.3%) of nonsmall cell lung, 2/23 (8.7%) of bladder, 6/93 (6.5%) of renal and 5/83 (6%) of gastric cancers. Amplification of GLO1 was rare in colon cancer (1/35) and glioma (1/94). Collectively the results indicate that GLO1 is at least one of the targets of gene amplification on 6p21.2 and may represent a useful target for therapy in cancers with GLO1 amplification.
Subject(s)
Biomarkers, Tumor/genetics; Gene Amplification; Lactoylglutathione Lyase/genetics; Neoplasms/genetics; Polymorphism, Single Nucleotide/genetics; Apoptosis; Biomarkers, Tumor/metabolism; Cell Proliferation; Chromosomes, Human, Pair 6/genetics; Gene Expression Profiling; Humans; Neoplasms/enzymology; Neoplasms/pathology; Oligonucleotide Array Sequence Analysis; RNA, Messenger/genetics; Reverse Transcriptase Polymerase Chain Reaction; Tumor Cells, Cultured
ABSTRACT
Cell division is a process that involves many biochemical steps and complex biophysical mechanisms. To simplify the understanding of what triggers cell division, three basic models that subsume more microscopic cellular processes associated with cell division have been proposed. Cells can divide based on the time elapsed since their birth, their size, and/or the volume added since their birth: the timer, sizer, and adder models, respectively. Here, we propose unified adder-sizer models and investigate some of the properties of different adder processes arising in cellular proliferation. Although the adder-sizer model provides a direct way to model cell population structure, we illustrate how it is mathematically related to the well-known model in which cell division depends on age and size. Existence and uniqueness of weak solutions to our 2+1-dimensional PDE model are proved, leading to the convergence of the discretized numerical solutions and allowing us to numerically compute the dynamics of cell population densities. We then generalize our PDE model to incorporate recent experimental findings of a system exhibiting mother-daughter correlations in cellular growth rates. Numerical experiments illustrating possible average cell volume blowup and the dynamical behavior of cell populations with mother-daughter correlated growth rates are carried out. Finally, motivated by new experimental findings, we extend our adder model to cases where the controlling variable is the added size between DNA replication initiation points in the cell cycle.
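The three division rules named in this abstract can be compared with a small toy calculation. The sketch below is not the paper's PDE model: it assumes deterministic exponential growth and perfectly symmetric division, and the function names and default parameters (`delta`, `threshold`, `tau`, `growth_rate`) are hypothetical choices made only for illustration.

```python
import math


def next_birth_size(v_birth, rule, *, delta=1.0, threshold=2.0,
                    tau=1.0, growth_rate=math.log(2)):
    """Return a daughter's birth volume after one symmetric division.

    Toy assumptions (not from the paper): growth is deterministic and
    exponential, and each division splits the volume exactly in half.
    """
    if rule == "adder":
        # Divide once a fixed volume `delta` has been added since birth.
        v_division = v_birth + delta
    elif rule == "sizer":
        # Divide on reaching `threshold`; a cell already above the
        # threshold divides immediately.
        v_division = max(v_birth, threshold)
    elif rule == "timer":
        # Divide after a fixed age `tau`, whatever the size reached.
        v_division = v_birth * math.exp(growth_rate * tau)
    else:
        raise ValueError(f"unknown rule: {rule!r}")
    return v_division / 2.0


def lineage(v0, rule, generations, **params):
    """Birth volumes along one lineage, starting from birth volume v0."""
    sizes = [v0]
    for _ in range(generations):
        sizes.append(next_birth_size(sizes[-1], rule, **params))
    return sizes


if __name__ == "__main__":
    for rule in ("adder", "sizer", "timer"):
        print(rule, [round(v, 3) for v in lineage(3.0, rule, 6)])
```

Under these assumptions the adder recursion is v_next = (v + delta) / 2, so birth volumes relax geometrically toward `delta`, while the sizer corrects any deviation within a generation or two. The timer multiplies the birth volume by exp(growth_rate * tau) / 2 each generation, so it has no size homeostasis unless that factor is exactly 1.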
ABSTRACT
Cell differentiation is affected by complex networks of transcription factors that co-ordinate re-organisation of the chromatin landscape. The hierarchies of these relationships can be difficult to dissect. During in vitro differentiation of normal human uro-epithelial cells, formaldehyde-assisted isolation of regulatory elements (FAIRE-seq) and RNA-seq was used to identify alterations in chromatin accessibility and gene expression changes following activation of the nuclear receptor peroxisome proliferator-activated receptor gamma (PPARγ) as a differentiation-initiating event. Regions of chromatin identified by FAIRE-seq, as having altered accessibility during differentiation, were found to be enriched with sequence-specific binding motifs for transcription factors predicted to be involved in driving basal and differentiated urothelial cell phenotypes, including forkhead box A1 (FOXA1), P63, GRHL2, CTCF and GATA-binding protein 3 (GATA3). In addition, co-occurrence of GATA3 motifs was observed within subsets of differentiation-specific peaks containing P63 or FOXA1. Changes in abundance of GRHL2, GATA3 and P63 were observed in immunoblots of chromatin-enriched extracts. Transient siRNA knockdown of P63 revealed that P63 favoured a basal-like phenotype by inhibiting differentiation and promoting expression of basal marker genes. GATA3 siRNA prevented differentiation-associated downregulation of P63 protein and transcript, and demonstrated positive feedback of GATA3 on PPARG transcript, but showed no effect on FOXA1 transcript or protein expression. This approach indicates that as a transcriptionally regulated programme, urothelial differentiation operates as a heterarchy, wherein GATA3 is able to co-operate with FOXA1 to drive expression of luminal marker genes, but that P63 has potential to transrepress expression of the same genes.
Subject(s)
Cell Differentiation/genetics; Epithelial Cells/cytology; Epithelial Cells/metabolism; GATA3 Transcription Factor/genetics; Hepatocyte Nuclear Factor 3-alpha/genetics; Transcription Factors/genetics; Tumor Suppressor Proteins/genetics; CCCTC-Binding Factor/genetics; CCCTC-Binding Factor/metabolism; Cell Line; Chromatin/chemistry; Chromatin/metabolism; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Formaldehyde/chemistry; GATA3 Transcription Factor/antagonists & inhibitors; GATA3 Transcription Factor/metabolism; Gene Expression Regulation; Hepatocyte Nuclear Factor 3-alpha/antagonists & inhibitors; Hepatocyte Nuclear Factor 3-alpha/metabolism; Humans; PPAR gamma/genetics; PPAR gamma/metabolism; Phenotype; RNA, Messenger/genetics; RNA, Messenger/metabolism; RNA, Small Interfering/genetics; RNA, Small Interfering/metabolism; Regulatory Elements, Transcriptional; Sequence Analysis, RNA; Signal Transduction; Transcription Factors/antagonists & inhibitors; Transcription Factors/metabolism; Tumor Suppressor Proteins/antagonists & inhibitors; Tumor Suppressor Proteins/metabolism; Urothelium/cytology; Urothelium/metabolism
ABSTRACT
Classical age-structured mass-action models such as the McKendrick-von Foerster equation have been extensively studied but are unable to describe stochastic fluctuations or population-size-dependent birth and death rates. Stochastic theories that treat semi-Markov age-dependent processes using, e.g., the Bellman-Harris equation do not resolve a population's age structure and are unable to quantify population-size dependencies. Conversely, current theories that include size-dependent population dynamics (e.g., mathematical models that include carrying capacity such as the logistic equation) cannot be easily extended to take into account age-dependent birth and death rates. In this paper, we present a systematic derivation of a new, fully stochastic kinetic theory for interacting age-structured populations. By defining multiparticle probability density functions, we derive a hierarchy of kinetic equations for the stochastic evolution of an aging population undergoing birth and death. We show that the fully stochastic age-dependent birth-death process precludes factorization of the corresponding probability densities, which then must be solved by using a Bogoliubov-Born-Green-Kirkwood-Yvon-like hierarchy. Explicit solutions are derived in three limits: no birth, no death, and steady state. These are then compared with their corresponding mean-field results. Our results generalize both deterministic models and existing master equation approaches by providing an intuitive and efficient way to simultaneously model age- and population-dependent stochastic dynamics applicable to the study of demography, stem cell dynamics, and disease evolution.
Subject(s)
Aging; Death; Models, Biological; Parturition; Population Dynamics; Computer Simulation; Kinetics; Stochastic Processes
ABSTRACT
We develop mathematical models describing the evolution of stochastic age-structured populations. After reviewing existing approaches, we formulate a complete kinetic framework for age-structured interacting populations undergoing birth, death and fission processes in spatially dependent environments. We define the full probability density for the population-size age chart and find results under specific conditions. Connections with more classical models are also explicitly derived. In particular, we show that factorial moments for non-interacting processes are described by a natural generalization of the McKendrick-von Foerster equation, which describes mean-field deterministic behavior. Our approach utilizes mixed-type, multidimensional probability distributions similar to those employed in the study of gas kinetics and with terms that satisfy BBGKY-like equation hierarchies.
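For reference, the McKendrick-von Foerster equation mentioned in this abstract is usually written in the form below. The notation is an assumption made here rather than taken from the paper: n(a,t) is the mean density of individuals of age a at time t, \mu(a) and \beta(a) are age-dependent death and birth rates, and n_0(a) is the initial age distribution.

```latex
\begin{align}
\frac{\partial n(a,t)}{\partial t} + \frac{\partial n(a,t)}{\partial a}
  &= -\mu(a)\, n(a,t), \\
n(0,t) &= \int_0^\infty \beta(a)\, n(a,t)\, \mathrm{d}a, \\
n(a,0) &= n_0(a).
\end{align}
```

The first line describes aging at unit rate together with removal by death, the second is the boundary condition that feeds newborns in at age zero, and the third fixes the initial condition. Because it tracks only the mean density, this form carries no information about stochastic fluctuations.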
ABSTRACT
Many tumors have highly rearranged genomes, but a major unknown is the relative importance and timing of genome rearrangements compared to sequence-level mutation. Chromosome instability might arise early, be a late event contributing little to cancer development, or happen as a single catastrophic event. Another unknown is which of the point mutations and rearrangements are selected. To address these questions we show, using the breast cancer cell line HCC1187 as a model, that we can reconstruct the likely history of a breast cancer genome. We assembled probably the most complete map to date of a cancer genome, by combining molecular cytogenetic analysis with sequence data. In particular, we assigned most sequence-level mutations to individual chromosomes by sequencing of flow sorted chromosomes. The parent of origin of each chromosome was assigned from SNP arrays. We were then able to classify most of the mutations as earlier or later according to whether they occurred before or after a landmark event in the evolution of the genome, endoreduplication (duplication of its entire genome). Genome rearrangements and sequence-level mutations were fairly evenly divided earlier and later, suggesting that genetic instability was relatively constant throughout the life of this tumor, and chromosome instability was not a late event. Mutations that caused chromosome instability would be in the earlier set. Strikingly, the great majority of inactivating mutations and in-frame gene fusions happened earlier. The non-random timing of some of the mutations may be evidence that they were selected.
Subject(s)
Breast Neoplasms/genetics; Chromosomal Instability; Chromosomes, Human/genetics; Gene Rearrangement; Genome, Human/genetics; Mutation/genetics; Polymorphism, Single Nucleotide/genetics; Breast Neoplasms/pathology; Chromosome Mapping; Female; Humans; Time Factors; Tumor Cells, Cultured
ABSTRACT
SNP allelic copy number data provides intensity measurements for the two different alleles separately. We present a method that estimates the number of copies of each allele at each SNP position, using a continuous-index hidden Markov model. The method is especially suited for cancer data, since it includes the fraction of normal tissue contamination, often present when studying data from cancer tumors, into the model. The continuous-index structure takes into account the distances between the SNPs, and is thereby appropriate also when SNPs are unequally spaced. In a simulation study we show that the method performs favorably compared to previous methods even with as much as 70% normal contamination. We also provide results from applications to clinical data produced using the Affymetrix genome-wide SNP 6.0 platform.
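Two ideas in this abstract lend themselves to a short illustration: transition probabilities that depend on the distance between neighbouring SNPs, and an expected allele dosage that mixes tumor and normal cells. The sketch below is not the authors' model. It assumes a deliberately simple jump process in which change events arrive at a constant rate per base pair and the hidden state is then redrawn uniformly, and it assumes one copy of each allele in normal cells; all function and parameter names are hypothetical.

```python
import math


def transition_matrix(n_states, distance_bp, rate_per_bp):
    """Transition probabilities across a gap of `distance_bp` base pairs.

    Assumed toy process: events occur at `rate_per_bp`, and at each event
    the hidden state is redrawn uniformly from all `n_states` states.
    Nearby SNPs therefore almost always share a state, while distant
    SNPs become nearly independent.
    """
    no_event = math.exp(-rate_per_bp * distance_bp)
    redraw = (1.0 - no_event) / n_states
    return [
        [redraw + (no_event if i == j else 0.0) for j in range(n_states)]
        for i in range(n_states)
    ]


def expected_allele_copies(tumor_copies, normal_fraction, normal_copies=1.0):
    """Mean copies of one allele in a sample contaminated by normal cells.

    `normal_fraction` is the share of cells that are normal; the default
    `normal_copies=1.0` assumes a heterozygous SNP in normal tissue.
    """
    return (normal_fraction * normal_copies
            + (1.0 - normal_fraction) * tumor_copies)


def matmul(a, b):
    """Plain-Python matrix product, used to check consistency over gaps."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


if __name__ == "__main__":
    near = transition_matrix(3, 1_000, 1e-6)
    far = transition_matrix(3, 5_000_000, 1e-6)
    print("stay probability, 1 kb gap:", round(near[0][0], 4))
    print("stay probability, 5 Mb gap:", round(far[0][0], 4))
    print("expected copies:", expected_allele_copies(2, 0.7))
```

A useful property of any continuous-index formulation is that crossing two consecutive gaps gives the same result as crossing their sum, which is what makes unequally spaced SNPs unproblematic; the `matmul` helper lets that be checked numerically for this toy process.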