Results 1 - 10 of 10
1.
Nature ; 578(7793): 102-111, 2020 02.
Article in English | MEDLINE | ID: mdl-32025015

ABSTRACT

The discovery of drivers of cancer has traditionally focused on protein-coding genes [1-4]. Here we present analyses of driver point mutations and structural variants in non-coding regions across 2,658 genomes from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium [5] of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). For point mutations, we developed a statistically rigorous strategy for combining significance levels from multiple methods of driver discovery that overcomes the limitations of individual methods. For structural variants, we present two methods of driver discovery, and identify regions that are significantly affected by recurrent breakpoints and recurrent somatic juxtapositions. Our analyses confirm previously reported drivers [6,7], raise doubts about others and identify novel candidates, including point mutations in the 5' region of TP53, in the 3' untranslated regions of NFKBIZ and TOB1, focal deletions in BRD4 and rearrangements in the loci of AKR1C genes. We show that although point mutations and structural variants that drive cancer are less frequent in non-coding genes and regulatory sequences than in protein-coding genes, additional examples of these drivers will be found as more cancer genomes become available.
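The abstract does not say which rule is used to combine significance levels. As a rough illustration of the general idea only, the sketch below implements Fisher's method, a classical rule that assumes the p-values are independent (an assumption that driver-discovery methods run on the same data would not satisfy). The function name `fisher_combine` is hypothetical and is not taken from the paper.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (illustrative only)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Fisher statistic X = -2 * sum(ln p_i); under the null it is
    # chi-square distributed with 2k degrees of freedom.
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    # Closed-form chi-square survival function for even df = 2k:
    # P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total
```

With a single input the function returns that p-value unchanged, and `fisher_combine([0.01, 0.02, 0.03])` gives a combined value of roughly 5e-4. When the inputs are correlated, Fisher's method is anti-conservative, which is one reason a dependence-aware rule would be needed in practice.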


Subject(s)
Human Genome/genetics , Mutation/genetics , Neoplasms/genetics , DNA Breaks , Genetic Databases , Neoplastic Gene Expression Regulation , Genome-Wide Association Study , Humans , INDEL Mutation
3.
Bioinformatics ; 35(2): 189-199, 2019 01 15.
Article in English | MEDLINE | ID: mdl-29945188

ABSTRACT

Motivation: Understanding the mutational processes that act during cancer development is a key topic of cancer biology. Nevertheless, much remains to be learned, as a complex interplay of processes with dependencies on a range of genomic features creates highly heterogeneous cancer genomes. Accurate driver detection relies on unbiased models of the mutation rate that also capture rate variation from uncharacterized sources. Results: Here, we analyse patterns of observed-to-expected mutation counts across 505 whole cancer genomes, and find that genomic features missing from our mutation-rate model likely operate on a megabase length scale. We extend our site-specific model of the mutation rate to include the additional variance from these sources, which leads to robust significance evaluation of candidate cancer drivers. We thus present ncdDetect v.2, with greatly improved cancer driver detection specificity. Finally, we show that ranking candidates by their posterior mean value of their effect sizes offers an equivalent and more computationally efficient alternative to ranking by their P-values. Availability and implementation: ncdDetect v.2 is implemented as an R-package and is freely available at http://github.com/TobiasMadsen/ncdDetect2. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Genetic Models , Mutation Rate , Neoplasms/genetics , Computational Biology , Genomics , Humans , Software
4.
Stat Appl Genet Mol Biol ; 18(6), 2019 11 16.
Article in English | MEDLINE | ID: mdl-31734658

ABSTRACT

DNA methylation and gene expression are interdependent and both implicated in cancer development and progression, with many individual biomarkers discovered. A joint analysis of the two data types can potentially lead to biological insights that are not discoverable with separate analyses. To optimally leverage the joint data for identifying perturbed genes and classifying clinical cancer samples, it is important to accurately model the interactions between the two data types. Here, we present EBADIMEX for jointly identifying differential expression and methylation and classifying samples. The moderated t-test widely used with empirical Bayes priors in current differential expression methods is generalised to a multivariate setting by developing: (1) a moderated Welch t-test for equality of means with unequal variances; (2) a moderated F-test for equality of variances; and (3) a multivariate test for equality of means with equal variances. This leads to parametric models with prior distributions for the parameters, which allow fast evaluation and robust analysis of small data sets. EBADIMEX is demonstrated on simulated data as well as a large breast cancer (BRCA) cohort from TCGA. We show that the use of empirical Bayes priors and moderated tests works particularly well on small data sets.
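For orientation, the classical (unmoderated) Welch t-test that the first of the three tests builds on can be sketched as follows. This is not the EBADIMEX implementation: the empirical Bayes moderation of the variances is not shown, and the function name `welch_t` is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Classical Welch t statistic and Welch-Satterthwaite degrees of freedom.

    No empirical Bayes moderation is applied here; the variances are the
    plain per-group sample variances.
    """
    nx, ny = len(x), len(y)
    if nx < 2 or ny < 2:
        raise ValueError("each group needs at least two observations")
    vx, vy = variance(x), variance(y)  # unbiased sample variances
    se2 = vx / nx + vy / ny            # squared standard error of the mean difference
    if se2 == 0.0:
        raise ValueError("both groups have zero variance")
    t = (mean(x) - mean(y)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df
```

With only a handful of samples per group the sample variances are noisy, which is the situation a moderated test is designed to stabilise.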


Subject(s)
Bayes Theorem , Computational Biology/methods , DNA Methylation , Epigenomics/methods , Gene Expression Profiling/methods , Algorithms , Genetic Databases , Gene Expression Regulation , Humans , Statistical Models , Reproducibility of Results , Transcriptome
5.
BMC Bioinformatics ; 19(1): 147, 2018 04 19.
Article in English | MEDLINE | ID: mdl-29673314

ABSTRACT

BACKGROUND: Detailed modelling of the neutral mutational process in cancer cells is crucial for identifying driver mutations and understanding the mutational mechanisms that act during cancer development. The neutral mutational process is very complex: whole-genome analyses have revealed that the mutation rate differs between cancer types, between patients and along the genome depending on the genetic and epigenetic context. Therefore, methods that predict the number of different types of mutations in regions or specific genomic elements must consider local genomic explanatory variables. A major drawback of most methods is the need to average the explanatory variables across the entire region or genomic element. This procedure is particularly problematic if the explanatory variable varies dramatically in the element under consideration. RESULTS: To take into account the fine scale of the explanatory variables, we model the probabilities of different types of mutations for each position in the genome by multinomial logistic regression. We analyse 505 cancer genomes from 14 different cancer types and compare the performance in predicting mutation rate for both regional based models and site-specific models. We show that for 1000 randomly selected genomic positions, the site-specific model predicts the mutation rate much better than regional based models. We use a forward selection procedure to identify the most important explanatory variables. The procedure identifies site-specific conservation (phyloP), replication timing, and expression level as the best predictors for the mutation rate. Finally, our model confirms and quantifies certain well-known mutational signatures. CONCLUSION: We find that our site-specific multinomial regression model outperforms the regional based models. The possibility of including genomic variables on different scales and patient specific variables makes it a versatile framework for studying different mutational mechanisms. Our model can serve as the neutral null model for the mutational process; regions that deviate from the null model are candidates for elements that drive cancer development.
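A minimal sketch of how a multinomial logistic regression turns per-site explanatory variables into mutation-type probabilities. The class labels, the reference class name `"none"`, the coefficient layout and the function name are all assumptions made for illustration; the fitted model in the paper is not reproduced here.

```python
import math


def mutation_probabilities(features, coefficients):
    """Per-site class probabilities from a multinomial logistic model.

    features:     list of explanatory variables for one genomic position
    coefficients: dict mapping each mutation class to (intercept, weights);
                  the reference class "none" (no mutation) has score 0.
    """
    if "none" in coefficients:
        raise ValueError('"none" is reserved for the reference class')
    scores = {"none": 0.0}
    for cls, (intercept, weights) in coefficients.items():
        if len(weights) != len(features):
            raise ValueError(f"weight/feature length mismatch for {cls!r}")
        scores[cls] = intercept + sum(w * f for w, f in zip(weights, features))
    # Softmax with the maximum subtracted for numerical stability.
    top = max(scores.values())
    exps = {cls: math.exp(s - top) for cls, s in scores.items()}
    norm = sum(exps.values())
    return {cls: e / norm for cls, e in exps.items()}
```

Summing the non-reference probabilities over all positions in an element gives the expected number of mutations under such a null model, which can then be compared with the observed count.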


Subject(s)
Human Genome , Genetic Models , Mutation Rate , Mutation/genetics , Neoplasms/genetics , Genetic Databases , Epigenomics , Humans , Single Nucleotide Polymorphism/genetics , Regression Analysis
6.
Bioinformatics ; 32(9): 1353-65, 2016 05 01.
Article in English | MEDLINE | ID: mdl-26740525

ABSTRACT

MOTIVATION: Cancer development and progression are driven by a complex pattern of genomic and epigenomic perturbations. Both types of perturbations can affect gene expression levels and disease outcome. Integrative analysis of cancer genomics data may therefore improve detection of perturbed genes and prediction of disease state. As different data types are usually dependent, analysis based on independence assumptions will make inefficient use of the data and potentially lead to false conclusions. MODEL: Here, we present PINCAGE (Probabilistic INtegration of CAncer GEnomics data), a method that uses probabilistic integration of cancer genomics data for combined evaluation of RNA-seq gene expression and 450k array DNA methylation measurements of promoters as well as gene bodies. It models the dependence between expression and methylation using modular graphical models, which also allows future inclusion of additional data types. RESULTS: We apply our approach to a Breast Invasive Carcinoma dataset from The Cancer Genome Atlas consortium, which includes 82 adjacent normal and 730 cancer samples. We identify new biomarker candidates of breast cancer development (PTF1A, RABIF, RAG1AP1, TIMM17A, LOC148145) and progression (SERPINE3, ZNF706). PINCAGE discriminates better between normal and tumour tissue and between progressing and non-progressing tumours in comparison with established methods that assume independence between tested data types, especially when using evidence from multiple genes. Our method can be applied to any type of cancer or, more generally, to any genomic disease for which a sufficient amount of molecular data is available. AVAILABILITY AND IMPLEMENTATION: R scripts available at http://moma.ki.au.dk/prj/pincage/ CONTACT: michal.switnicki@clin.au.dk or jakob.skou@clin.au.dk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Breast Neoplasms , Neoplastic Gene Expression Regulation , Genomics , DNA Methylation , Epigenomics , Genomics/methods , Humans
7.
NPJ Genom Med ; 6(1): 33, 2021 May 13.
Article in English | MEDLINE | ID: mdl-33986299

ABSTRACT

Large sets of whole cancer genomes make it possible to study mutation hotspots genome-wide. Here we detect, categorize, and characterize site-specific hotspots using 2279 whole cancer genomes from the Pan-Cancer Analysis of Whole Genomes project and provide a resource of annotated hotspots genome-wide. We investigate the excess of hotspots in both protein-coding and gene regulatory regions and develop measures of positive selection and functional impact for individual hotspots. Using cancer allele fractions, expression aberrations, mutational signatures, and a variety of genomic features, such as potential gain or loss of transcription factor binding sites, we annotate and prioritize all highly mutated hotspots. Genome-wide we find more high-frequency SNV and indel hotspots than expected given mutational background models. Protein-coding regions are generally enriched for SNV hotspots compared to other regions. Gene regulatory hotspots show enrichment of potential same-patient second-hit missense mutations, consistent with enrichment of hotspot driver mutations compared to singletons. For protein-coding regions, splice-sites, promoters, and enhancers, we see an excess of hotspots associated with cancer genes. Interestingly, missense hotspot mutations in tumor suppressors are associated with elevated expression, suggesting localized amino-acid changes with functional impact. For individual non-coding hotspots, only a small number show clear signs of positive selection, including known sites in the TERT promoter and the 5' UTR of TP53. Most of the new candidates have few mutations and limited driver evidence. However, a hotspot in an enhancer of the oncogene POU2AF1, which may create a transcription factor binding site, presents multiple lines of driver-consistent evidence.
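As a toy illustration of what detecting site-specific hotspots involves at its simplest, the sketch below counts how many distinct patients carry a mutation at each genomic position. The input layout, the function name `find_hotspots` and the `min_patients` threshold are assumptions for illustration; the study's background models, annotation and selection measures are not represented.

```python
from collections import defaultdict


def find_hotspots(mutations, min_patients=2):
    """Return {(chrom, pos): n_patients} for recurrently mutated positions.

    mutations:    iterable of (patient_id, chrom, pos) tuples
    min_patients: hypothetical recurrence threshold (distinct patients)
    """
    patients_by_site = defaultdict(set)
    for patient_id, chrom, pos in mutations:
        # A set ensures repeated calls in one patient are counted once.
        patients_by_site[(chrom, pos)].add(patient_id)
    return {
        site: len(patients)
        for site, patients in patients_by_site.items()
        if len(patients) >= min_patients
    }
```

A raw recurrence count like this says nothing about selection on its own; it only becomes informative once compared with the recurrence expected under a mutational background model.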

8.
NPJ Genom Med ; 3: 1, 2018.
Article in English | MEDLINE | ID: mdl-29354286

ABSTRACT

Cancer develops by accumulation of somatic driver mutations, which impact cellular function. Mutations in non-coding regulatory regions can now be studied genome-wide and further characterized by correlation with gene expression and clinical outcome to identify driver candidates. Using a new two-stage procedure, called ncDriver, we first screened 507 ICGC whole-genomes from 10 cancer types for non-coding elements, in which mutations are both recurrent and have elevated conservation or cancer specificity. This identified 160 significant non-coding elements, including the TERT promoter, a well-known non-coding driver element, as well as elements associated with known cancer genes and regulatory genes (e.g., PAX5, TOX3, PCF11, MAPRE3). However, in some significant elements, mutations appear to stem from localized mutational processes rather than recurrent positive selection. To further characterize the driver potential of the identified elements and shortlist candidates, we identified elements where presence of mutations correlated significantly with expression levels (e.g., TERT and CDH10) and survival (e.g., CDH9 and CDH10) in an independent set of 505 TCGA whole-genome samples. In a larger pan-cancer set of 4128 TCGA exomes with expression profiling, we identified mutational correlation with expression for additional elements (e.g., near GATA3, CDC6, ZNF217, and CTCF transcription factor binding sites). Survival analysis further pointed to MIR122, a known marker of poor prognosis in liver cancer. In conclusion, the screen for significant mutation patterns coupled with correlative mutational analysis identified new individual driver candidates and suggests that some non-coding mutations recurrently affect expression and play a role in cancer development.

9.
Elife ; 6, 2017 03 31.
Article in English | MEDLINE | ID: mdl-28362259

ABSTRACT

Non-coding mutations may drive cancer development. Statistical detection of non-coding driver regions is challenged by a varying mutation rate and uncertainty of functional impact. Here, we develop a statistically founded non-coding driver-detection method, ncdDetect, which includes sample-specific mutational signatures, long-range mutation rate variation, and position-specific impact measures. Using ncdDetect, we screened non-coding regulatory regions of protein-coding genes across a pan-cancer set of whole-genomes (n = 505), which top-ranked known drivers and identified new candidates. For individual candidates, presence of non-coding mutations associates with altered expression or decreased patient survival across an independent pan-cancer sample set (n = 5454). This includes an antigen-presenting gene (CD1A), where 5'UTR mutations correlate significantly with decreased survival in melanoma. Additionally, mutations in a base-excision-repair gene (SMUG1) correlate with a C-to-T mutational-signature. Overall, we find that a rich model of mutational heterogeneity facilitates non-coding driver identification and integrative analysis points to candidates of potential clinical relevance.


Subject(s)
Carcinogenesis , Mutation Rate , Mutation , Neoplasms/pathology , Neoplasms/physiopathology , Biostatistics/methods , Gene Expression Profiling , Humans , Survival Analysis
10.
Cell Rep ; 7(5): 1649-1663, 2014 Jun 12.
Article in English | MEDLINE | ID: mdl-24835989

ABSTRACT

Bladder cancer (or urothelial cell carcinoma [UCC]) is characterized by field disease (malignant alterations in surrounding mucosa) and frequent recurrences. Whole-genome, exome, and transcriptome sequencing of 38 tumors, including four metachronous tumor pairs and 20 superficial tumors, identified an APOBEC mutational signature in one-third. This was biased toward the sense strand, correlated with mean expression level, and clustered near breakpoints. A>G mutations were up to eight times more frequent on the sense strand (p<0.002) in [ACG]AT contexts. The patient-specific APOBEC signature was negatively correlated to repair-gene expression and was not related to clinicopathological parameters. Mutations in gene families and single genes were related to tumor stage, and expression of chromatin modifiers correlated with survival. Evolutionary and subclonal analyses of early/late tumor pairs showed a unitary origin, and discrete tumor clones contained mutated cancer genes. The ancestral clones contained Pik3ca/Kdm6a mutations and may reflect the field-disease mutations shared among later tumors.


Subject(s)
Carcinoma/genetics , Clonal Evolution , Point Mutation , Urinary Bladder Neoplasms/genetics , APOBEC-1 Deaminase , Carcinoma/pathology , Class I Phosphatidylinositol 3-Kinases , Cytidine Deaminase/genetics , Cytidine Deaminase/metabolism , DNA Repair , Antisense DNA/genetics , Humans , Phosphatidylinositol 3-Kinases/genetics , Transcriptome , Urinary Bladder Neoplasms/pathology