ABSTRACT
Expression of mRNA is often regulated by the binding of a small RNA (miRNA, snoRNA, siRNA). While the pairing contribution to the net free energy is well parameterized and can be computed in O(N) time, the cost of removing pre-existing mRNA secondary structure has not received sufficient attention. Conventional methods for computing the unfolding free energy of a target mRNA are costly, scaling as the cube of the number of target bases, O(N^3). Here we introduce a model to describe the unfolding costs of the binding site, which features surprisingly large differences in the free energy parameters for the four bases. The model is implemented in our O(N) algorithm, BindOligoNet. Donor splice site prediction is more accurate when using our calculation of spliceosomal U1-snRNA to mRNA net binding free energy. Our base-dependent free energies also correlate with efficient ribosome docking near the start codon.
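As a rough illustration of how a base-dependent unfolding cost could be evaluated in linear time, the sketch below sums a per-base cost over every candidate binding window with a running total. It assumes the cost is additive over bases, which the abstract does not state, and the values in `UNFOLD_COST` are hypothetical placeholders, not the fitted parameters of BindOligoNet.

```python
# Hypothetical per-base unfolding costs (arbitrary units); placeholder values,
# NOT the parameters reported for BindOligoNet.
UNFOLD_COST = {"A": 0.25, "C": 0.5, "G": 1.0, "U": 0.125}


def site_unfolding_costs(mrna, site_len):
    """Return the summed per-base cost of every window of length site_len.

    Uses a running sum, so the whole scan is O(N) in the mRNA length.
    """
    if site_len <= 0 or site_len > len(mrna):
        return []
    costs = [UNFOLD_COST[base] for base in mrna.upper()]
    window = sum(costs[:site_len])
    totals = [window]
    for i in range(site_len, len(costs)):
        # Slide the window one base: add the new base, drop the oldest one.
        window += costs[i] - costs[i - site_len]
        totals.append(window)
    return totals
```

Each entry of the returned list corresponds to one candidate site start position, so the lowest-cost site can be found with a single extra pass.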
Subject(s)
Translation initiation, RNA splicing, Messenger RNA, Algorithms, Binding sites, Nucleic acid conformation, Nucleotides, Messenger RNA/biosynthesis, Messenger RNA/chemistry, Small nuclear RNA/chemistry, Spliceosomes/chemistry, Thermodynamics
ABSTRACT
The discovery of drivers of cancer has traditionally focused on protein-coding genes [1-4]. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [5] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers [6,7], raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.
Subject(s)
Human genome/genetics, Mutation/genetics, Neoplasms/genetics, DNA breaks, Genetic databases, Neoplastic gene expression regulation, Genome-wide association study, Humans, INDEL mutation
ABSTRACT
Current statistical models for assessing hotspot significance do not properly account for variation in site-specific mutability, thereby yielding many false-positives. We thus (i) detail a Log-normal-Poisson (LNP) background model that accounts for this variability in a manner consistent with models of mutagenesis; (ii) use it to show that passenger hotspots arise from all common mutational processes; and (iii) apply it to a ~10,000-patient cohort to nominate driver hotspots with far fewer false-positives compared with conventional methods. Overall, we show that many cancer hotspot mutations recurring at the same genomic site across multiple tumors are actually passenger events, recurring at inherently mutable genomic sites under no positive selection.
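To make the contrast between a plain Poisson background and a log-normal-Poisson background concrete, the sketch below computes the probability of seeing at least k mutations at a site under each. It uses the generic textbook definition of a log-normal-Poisson mixture (the log of the per-site rate is normally distributed) and simple trapezoid integration. The function names and parameter values are illustrative assumptions, not the authors' implementation or fitted values.

```python
import math


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for j in range(k):
        cdf += term
        term *= lam / (j + 1)  # P(X = j + 1) from P(X = j)
    return max(0.0, 1.0 - cdf)


def lnp_tail(k, mu, sigma, steps=4000, zmax=8.0):
    """P(X >= k) when X ~ Poisson(lam) and log(lam) ~ Normal(mu, sigma^2).

    Integrates over the standard-normal variable z with the trapezoid rule.
    """
    h = 2.0 * zmax / steps
    total = 0.0
    for i in range(steps + 1):
        z = -zmax + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * density * poisson_tail(k, math.exp(mu + sigma * z))
    return total * h


# Illustrative (made-up) parameters: median per-site rate 0.01, sigma = 1.
mu, sigma = math.log(0.01), 1.0
mean_rate = math.exp(mu + 0.5 * sigma ** 2)  # mean of the log-normal rate
p_poisson = poisson_tail(3, mean_rate)       # uniform-mutability background
p_lnp = lnp_tail(3, mu, sigma)               # variable-mutability background
```

With the same mean rate, the mixture gives a heavier upper tail than the plain Poisson, so a site with three recurrent mutations looks less surprising once site-to-site variability is allowed for.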
Subject(s)
Carcinogenesis/genetics, Genomics/methods, Genetic models, Mutagenesis, Neoplasms/genetics, DNA mutational analysis, Datasets as topic, Tumor suppressor genes, Humans, Poisson distribution, ROC curve, Genetic selection, Exome sequencing
ABSTRACT
How somatic mutations accumulate in normal cells is poorly understood. A comprehensive analysis of RNA sequencing data from ~6700 samples across 29 normal tissues revealed multiple somatic variants, demonstrating that macroscopic clones can be found in many normal tissues. We found that sun-exposed skin, esophagus, and lung have a higher mutation burden than other tested tissues, which suggests that environmental factors can promote somatic mosaicism. Mutation burden was associated with both age and tissue-specific cell proliferation rate, highlighting that mutations accumulate over both time and number of cell divisions. Finally, normal tissues were found to harbor mutations in known cancer genes and hotspots. This study provides a broad view of macroscopic clonal expansion in human tissues, thus serving as a foundation for associating clonal expansion with environmental factors, aging, and risk of disease.
Subject(s)
DNA mutational analysis/methods, Neoplasms/genetics, RNA sequence analysis/methods, Clone cells, Female, Humans, Male, Organ specificity/genetics
ABSTRACT
Large panels of comprehensively characterized human cancer models, including the Cancer Cell Line Encyclopedia (CCLE), have provided a rigorous framework with which to study genetic variants, candidate targets, and small-molecule and biological therapeutics and to identify new marker-driven cancer dependencies. To improve our understanding of the molecular features that contribute to cancer phenotypes, including drug responses, here we have expanded the characterizations of cancer cell lines to include genetic, RNA splicing, DNA methylation, histone H3 modification, microRNA expression and reverse-phase protein array data for 1,072 cell lines from individuals of various lineages and ethnicities. Integration of these data with functional characterizations such as drug-sensitivity, short hairpin RNA knockdown and CRISPR-Cas9 knockout data reveals potential targets for cancer drugs and associated biomarkers. Together, this dataset and an accompanying public data portal provide a resource for the acceleration of cancer research using model cancer cell lines.
Subject(s)
Tumor cell line, Neoplasms/genetics, Neoplasms/pathology, Antineoplastic agents/pharmacology, Tumor biomarkers, DNA methylation, Antineoplastic drug resistance, Ethnic groups/genetics, Gene editing, Histones/metabolism, Humans, MicroRNAs/genetics, Molecular targeted therapy, Neoplasms/metabolism, Protein array analysis, RNA splicing
ABSTRACT
Hürthle cell carcinoma of the thyroid (HCC) is a form of thyroid cancer recalcitrant to radioiodine therapy that exhibits an accumulation of mitochondria. We performed whole-exome sequencing on a cohort of primary, recurrent, and metastatic tumors, and identified recurrent mutations in DAXX, TP53, NRAS, NF1, CDKN1A, ARHGAP35, and the TERT promoter. Parallel analysis of mtDNA revealed recurrent homoplasmic mutations in subunits of complex I of the electron transport chain. Analysis of DNA copy-number alterations uncovered widespread loss of chromosomes culminating in near-haploid chromosomal content in a large fraction of HCC, which was maintained during metastatic spread. This work uncovers a distinct molecular origin of HCC compared with other thyroid malignancies.
Subject(s)
Chromosome aberrations, Mitochondrial DNA/genetics, Mutation, Thyroid neoplasms/genetics, DNA copy number variations, Haploidy, Humans, Neoplasm metastasis, Telomerase/genetics, Thyroid neoplasms/pathology, Exome sequencing
ABSTRACT
In the version of this article originally published, an asterisk was omitted from Fig. 1a. The asterisk has been added to the figure. Additionally, a "NOTCH2" label was erroneously included in Fig. 4a. The label has been removed. The errors have been corrected in the PDF and HTML versions of this article.
ABSTRACT
In the version of this article originally published, some text above the "Tri-nucleotide sequence motifs" label in Fig. 2a appeared incorrectly. The text was garbled and should have appeared as nucleotide codes. Additionally, the labels on the bars in Fig. 2c were not italicized in the original publication. These are gene symbols, and they should have been italicized. The colored labels above the graphs in Fig. 4b were also erroneously not italicized. These labels represent gene names and loci, and they should have been italicized.
ABSTRACT
Diffuse large B cell lymphoma (DLBCL), the most common lymphoid malignancy in adults, is a clinically and genetically heterogeneous disease that is further classified into transcriptionally defined activated B cell (ABC) and germinal center B cell (GCB) subtypes. We carried out a comprehensive genetic analysis of 304 primary DLBCLs and identified low-frequency alterations, captured recurrent mutations, somatic copy number alterations, and structural variants, and defined coordinate signatures in patients with available outcome data. We integrated these genetic drivers using consensus clustering and identified five robust DLBCL subsets, including a previously unrecognized group of low-risk ABC-DLBCLs of extrafollicular/marginal zone origin; two distinct subsets of GCB-DLBCLs with different outcomes and targetable alterations; and an ABC/GCB-independent group with biallelic inactivation of TP53, CDKN2A loss, and associated genomic instability. The genetic features of the newly characterized subsets, their mutational signatures, and the temporal ordering of identified alterations provide new insights into DLBCL pathogenesis. The coordinate genetic signatures also predict outcome independent of the clinical International Prognostic Index and suggest new combination treatment strategies. More broadly, our results provide a roadmap for an actionable DLBCL classification.
Subject(s)
Diffuse large B-cell lymphoma/genetics, Diffuse large B-cell lymphoma/pathology, DNA copy number variations/genetics, Gene rearrangement/genetics, Neoplasm genes, Genetic heterogeneity, Humans, Mutation/genetics, Mutation rate, Treatment outcome
ABSTRACT
Identifying molecular cancer drivers is critical for precision oncology. Multiple advanced algorithms to identify drivers now exist, but systematic attempts to combine and optimize them on large datasets are few. We report a PanCancer and PanSoftware analysis spanning 9,423 tumor exomes (comprising all 33 of The Cancer Genome Atlas projects) and using 26 computational tools to catalog driver genes and mutations. We identify 299 driver genes with implications regarding their anatomical sites and cancer/cell types. Sequence- and structure-based analyses identified >3,400 putative missense driver mutations supported by multiple lines of evidence. Experimental validation confirmed 60%-85% of predicted mutations as likely drivers. We found that >300 MSI tumors are associated with high PD-1/PD-L1, and 57% of tumors analyzed harbor putative clinically actionable events. Our study represents the most comprehensive discovery of cancer genes and mutations to date and will serve as a blueprint for future biological and clinical endeavors.
Subject(s)
Neoplasms/pathology, Algorithms, CD274 antigen/genetics, Computational biology, Genetic databases, Entropy, Humans, Microsatellite instability, Mutation, Neoplasms/genetics, Neoplasms/immunology, Principal component analysis, Programmed cell death 1 receptor/genetics
ABSTRACT
Microsatellites (MSs) are tracts of variable-length repeats of short DNA motifs that exhibit high rates of mutation in the form of insertions or deletions (indels) of the repeated motif. Despite their prevalence, the contribution of somatic MS indels to cancer has been largely unexplored, owing to difficulties in detecting them in short-read sequencing data. Here we present two tools: MSMuTect, for accurate detection of somatic MS indels, and MSMutSig, for identification of genes containing MS indels at a higher frequency than expected by chance. Applying MSMuTect to whole-exome data from 6,747 human tumors representing 20 tumor types, we identified >1,000 previously undescribed MS indels in cancer genes. Additionally, we demonstrate that the number and pattern of MS indels can accurately distinguish microsatellite-stable tumors from tumors with microsatellite instability, thus potentially improving classification of clinically relevant subgroups. Finally, we identified seven MS indel driver hotspots: four in known cancer genes (ACVR2A, RNF43, JAK1, and MSH3) and three in genes not previously implicated as cancer drivers (ESRP1, PRDM2, and DOCK3).
Subject(s)
INDEL mutation/genetics, Microsatellite repeats/genetics, Neoplasms/genetics, Exome/genetics, Neoplasm genes, High-throughput nucleotide sequencing, Humans, Microsatellite instability, Mutation/genetics, Messenger RNA/genetics, Messenger RNA/metabolism, RNA-binding proteins/genetics, RNA-binding proteins/metabolism
ABSTRACT
Comprehensive multiplatform analysis of 80 uveal melanomas (UM) identifies four molecularly distinct, clinically relevant subtypes: two associated with poor-prognosis monosomy 3 (M3) and two with better-prognosis disomy 3 (D3). We show that BAP1 loss follows M3 occurrence and correlates with a global DNA methylation state that is distinct from D3-UM. Poor-prognosis M3-UM divide into subsets with divergent genomic aberrations, transcriptional features, and clinical outcomes. We report change-of-function SRSF2 mutations. Within D3-UM, EIF1AX- and SRSF2/SF3B1-mutant tumors have distinct somatic copy number alterations and DNA methylation profiles, providing insight into the biology of these low- versus intermediate-risk clinical mutation subtypes.
Subject(s)
Tumor biomarkers/genetics, DNA methylation, Neoplastic gene expression regulation, Melanoma/genetics, Mutation, Uveal neoplasms/genetics, DNA copy number variations, Eukaryotic initiation factor-1/genetics, Humans, Melanoma/classification, Monosomy, Phosphoproteins/genetics, Prognosis, RNA splicing factors/genetics, Serine-arginine splicing factors/genetics, Tumor suppressor proteins/genetics, Ubiquitin thiolesterase/genetics, Uveal neoplasms/classification
ABSTRACT
Cholangiocarcinoma (CCA) is an aggressive malignancy of the bile ducts, with poor prognosis and limited treatment options. Here, we describe the integrated analysis of somatic mutations, RNA expression, copy number, and DNA methylation by The Cancer Genome Atlas of a set of predominantly intrahepatic CCA cases and propose a molecular classification scheme. We identified an IDH mutant-enriched subtype with distinct molecular features including low expression of chromatin modifiers, elevated expression of mitochondrial genes, and increased mitochondrial DNA copy number. Leveraging the multi-platform data, we observed that ARID1A exhibited DNA hypermethylation and decreased expression in the IDH mutant subtype. More broadly, we found that IDH mutations are associated with an expanded histological spectrum of liver tumors with molecular features that stratify with CCA. Our studies reveal insights into the molecular pathogenesis and heterogeneity of cholangiocarcinoma and provide classification information of potential therapeutic significance.
Subject(s)
Bile duct neoplasms/genetics, Cholangiocarcinoma/genetics, Genomics/methods, Isocitrate dehydrogenase/genetics, Mutation/genetics, Adult, Aged, Aged 80 and over, Bile duct neoplasms/enzymology, Cholangiocarcinoma/enzymology, Chromatin/metabolism, DNA methylation/genetics, DNA-binding proteins, Female, Neoplastic gene expression regulation, Humans, Liver/pathology, Liver neoplasms/genetics, Liver neoplasms/pathology, Male, Middle aged, Mitochondria/metabolism, Nuclear proteins/genetics, Pancreatic neoplasms/genetics, Pancreatic neoplasms/pathology, Promoter regions (genetic)/genetics, Long noncoding RNA/genetics, Long noncoding RNA/metabolism, Messenger RNA/genetics, Messenger RNA/metabolism, Transcription factors/genetics
ABSTRACT
There is a striking and unexplained male predominance across many cancer types. A subset of X-chromosome genes can escape X-inactivation, which would protect females from complete functional loss by a single mutation. To identify putative 'escape from X-inactivation tumor-suppressor' (EXITS) genes, we examined somatic alterations from >4,100 cancers across 21 tumor types for sex bias. Six of 783 non-pseudoautosomal region (PAR) X-chromosome genes (ATRX, CNKSR2, DDX3X, KDM5C, KDM6A, and MAGEC3) harbored loss-of-function mutations more frequently in males (based on a false discovery rate < 0.1), in comparison to zero of 18,055 autosomal and PAR genes (Fisher's exact P < 0.0001). Male-biased mutations in genes that escape X-inactivation were observed in combined analysis across many cancers and in several individual tumor types, suggesting a generalized phenomenon. We conclude that biallelic expression of EXITS genes in females explains a portion of the reduced cancer incidence in females as compared to males across a variety of tumor types.
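The reported comparison (6 of 783 non-PAR X-chromosome genes versus 0 of 18,055 autosomal and PAR genes) can be checked with a standard two-sided Fisher's exact test. The sketch below is a minimal standard-library version. It assumes the 2x2 table is laid out as gene class (rows) by male-biased versus not male-biased (columns), which the abstract implies but does not spell out, and the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that x of the col1 "hits" fall in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Rows: non-PAR X genes, autosomal + PAR genes.
# Columns: male-biased loss-of-function genes, all other genes.
p_value = fisher_exact_two_sided(6, 783 - 6, 0, 18055)
```

Under this layout the computed P value falls well below the 0.0001 threshold quoted in the abstract.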
Subject(s)
Human X chromosomes/genetics, Tumor suppressor genes, X-linked genes/genetics, Mutation/genetics, Neoplasms/genetics, Sexism/statistics and numerical data, X chromosome inactivation/genetics, Female, Humans, Male
ABSTRACT
Mutational processes constantly shape the somatic genome, leading to immunity, aging, cancer, and other diseases. When cancer is the outcome, we are afforded a glimpse into these processes by the clonal expansion of the malignant cell. Here, we characterize a less explored layer of the mutational landscape of cancer: mutational asymmetries between the two DNA strands. Analyzing whole-genome sequences of 590 tumors from 14 different cancer types, we reveal widespread asymmetries across mutagenic processes, with transcriptional ("T-class") asymmetry dominating UV-, smoking-, and liver-cancer-associated mutations and replicative ("R-class") asymmetry dominating POLE-, APOBEC-, and MSI-associated mutations. We report a striking phenomenon of transcription-coupled damage (TCD) on the non-transcribed DNA strand and provide evidence that APOBEC mutagenesis occurs on the lagging-strand template during DNA replication. As more genomes are sequenced, studying and classifying their asymmetries will illuminate the underlying biological mechanisms of DNA damage and repair.
Subject(s)
DNA damage, DNA mutational analysis, DNA repair, Neoplasms/genetics, DNA replication, Human genome, Genome-wide association study, Humans, Mutation, Neoplasms/pathology, Genetic transcription
ABSTRACT
Which genetic alterations drive tumorigenesis and how they evolve over the course of disease and therapy are central questions in cancer biology. Here we identify 44 recurrently mutated genes and 11 recurrent somatic copy number variations through whole-exome sequencing of 538 chronic lymphocytic leukaemia (CLL) and matched germline DNA samples, 278 of which were collected in a prospective clinical trial. These include previously unrecognized putative cancer drivers (RPS15, IKZF3), and collectively identify RNA processing and export, MYC activity, and MAPK signalling as central pathways involved in CLL. Clonality analysis of this large data set further enabled reconstruction of temporal relationships between driver events. Direct comparison between matched pre-treatment and relapse samples from 59 patients demonstrated highly frequent clonal evolution. Thus, large sequencing data sets of clinically informative samples enable the discovery of novel genes associated with cancer, the network of relationships between the driver events, and their impact on disease relapse and clinical outcome.