Abstract
Over time, the human DNA methylation landscape accrues substantial damage, which has been associated with a broad range of age-related diseases, including cardiovascular disease and cancer. Various age-related DNA methylation changes have been described, including at the level of individual CpGs, such as differential and variable methylation, and at the level of the whole methylome, including entropy and correlation networks. Here, we review these changes in the ageing methylome as well as the statistical tools that can be used to quantify them. We detail the evidence linking DNA methylation to ageing phenotypes and the longevity strategies aimed at altering both DNA methylation patterns and machinery to extend healthspan and lifespan. Lastly, we discuss theories on the mechanistic causes of epigenetic ageing.
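As a rough illustration of the methylome-wide entropy mentioned above, the sketch below computes a normalized Shannon entropy over a set of CpG beta values. The function name and the toy values are hypothetical, and this is one common formulation rather than necessarily the one used in the review.

```python
import math

def methylation_entropy(betas):
    """Normalized Shannon entropy of methylation beta values in [0, 1].

    Each CpG contributes -(b*log2(b) + (1-b)*log2(1-b)), which is 0 for a
    fully (un)methylated site and 1 for a site at 0.5. The result is the
    average over all CpGs, so it also lies in [0, 1].
    """
    total = 0.0
    for b in betas:
        for p in (b, 1.0 - b):
            # Skip p == 0 terms: their contribution to the entropy is zero.
            if p > 0.0:
                total += p * math.log2(p)
    return -total / len(betas)

# Hypothetical toy methylomes (assumed values, not real data).
young = [0.02, 0.98, 0.05, 0.95]  # near-binary methylation states
old = [0.30, 0.70, 0.45, 0.60]    # values drifted toward 0.5

print(methylation_entropy(young) < methylation_entropy(old))
```

Under this definition, a methylome whose CpGs drift away from 0 or 1 toward intermediate values scores a higher entropy.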
Subjects
Epigenesis, Genetic; Epigenome; Aging/genetics; DNA Methylation; Epigenomics; Humans
Abstract
Deciphering cell-type heterogeneity is crucial for systematically understanding tissue homeostasis and its dysregulation in diseases. Computational deconvolution is an efficient approach for estimating cell-type abundances from a variety of omics data. Despite substantial methodological progress in computational deconvolution in recent years, challenges are still outstanding. Here we outline four important challenges related to computational deconvolution: the quality of the reference data, generation of ground truth data, limitations of computational methodologies, and benchmarking design and implementation. Finally, we make recommendations on reference data generation, new directions of computational methodologies, and strategies to promote rigorous benchmarking.
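To make the idea of reference-based deconvolution concrete, here is a minimal sketch for the simplest possible case: a bulk profile assumed to be a mixture of exactly two cell types with known reference profiles. The function name and all values are hypothetical. Real methods handle many cell types, noise and imperfect references, which is where the challenges listed above arise.

```python
def estimate_fraction(bulk, ref_a, ref_b):
    """Least-squares estimate of the fraction of cell type A.

    Assumes bulk[i] ~= w * ref_a[i] + (1 - w) * ref_b[i] for every feature i
    and solves for the single weight w, then clips it to [0, 1].
    """
    num = sum((m - b) * (a - b) for m, a, b in zip(bulk, ref_a, ref_b))
    den = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    w = num / den
    return min(1.0, max(0.0, w))

# Hypothetical reference profiles for two cell types over three features.
ref_a = [0.9, 0.1, 0.8]
ref_b = [0.1, 0.9, 0.2]

# Simulated bulk sample: 30% cell type A, 70% cell type B.
bulk = [0.3 * a + 0.7 * b for a, b in zip(ref_a, ref_b)]

print(round(estimate_fraction(bulk, ref_a, ref_b), 3))
```

The estimate is only as good as the reference profiles, which is why the quality of reference data heads the list of challenges.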
Subjects
Computational Biology; Genomics; Computational Biology/methods; Benchmarking
Abstract
Single-cell omics is transforming our understanding of cell biology and disease, yet the systems-level analysis and interpretation of single-cell data faces many challenges. In this Perspective, we describe the impact that fundamental concepts from statistical mechanics, notably entropy, stochastic processes and critical phenomena, are having on single-cell data analysis. We further advocate the need for more bottom-up modelling of single-cell data and to embrace a statistical mechanics analysis paradigm to help attain a deeper understanding of single-cell systems biology.
Subjects
Cell Biology; Data Interpretation, Statistical; Single-Cell Analysis; Animals; Computational Biology; Entropy; Humans; Models, Statistical; RNA-Seq; Stochastic Processes
Abstract
Bulk-tissue DNA methylomes represent an average over many different cell types, hampering our understanding of cell-type-specific contributions to disease development. As single-cell methylomics is not scalable to large cohorts of individuals, cost-effective computational solutions are needed, yet current methods are limited to tissues such as blood. Here we leverage the high-resolution nature of tissue-specific single-cell RNA-sequencing datasets to construct a DNA methylation atlas defined for 13 solid tissue types and 40 cell types. We comprehensively validate this atlas in independent bulk and single-nucleus DNA methylation datasets. We demonstrate that it correctly predicts the cell of origin of diverse cancer types and discovers new prognostic associations in olfactory neuroblastoma and stage 2 melanoma. In brain, the atlas predicts a neuronal origin for schizophrenia, with neuron-specific differential DNA methylation enriched for corresponding genome-wide association study risk loci. In summary, the DNA methylation atlas enables the decomposition of 13 different human tissue types at a high cellular resolution, paving the way for an improved interpretation of epigenetic data.
Subjects
DNA Methylation; Epigenome; CpG Islands; Epigenesis, Genetic; Epigenomics; Genome-Wide Association Study; Humans; Neurons/metabolism
Abstract
Despite recent biotechnological breakthroughs, cancer risk prediction remains a formidable computational and experimental challenge. Addressing it is critical in order to improve prevention, early detection and survival rates. Here, I briefly summarize some key emerging theoretical and computational challenges as well as recent computational advances that promise to help realize the goals of cancer-risk prediction. The focus is on computational strategies based on single-cell data, in particular on bottom-up network modeling approaches that aim to estimate cancer stemness and dedifferentiation at single-cell resolution from a systems-biological perspective. I will describe two promising methods, a tissue and cell-lineage independent one based on the concept of diffusion network entropy, and a tissue and cell-lineage specific one that uses transcription factor regulons. Application of these tools to single-cell and single-nucleus RNA-seq data from stages prior to invasive cancer reveals that they can successfully delineate the heterogeneous inter-cellular cancer-risk landscape, identifying those cells that are more likely to turn cancerous. Bottom-up systems biological modeling of single-cell omic data is a novel computational analysis paradigm that promises to facilitate the development of preventive, early detection and cancer-risk prediction strategies.
Subjects
Computational Biology; Neoplasms; Single-Cell Analysis; Humans; Single-Cell Analysis/methods; Computational Biology/methods
Abstract
Epigenetics plays a key role in cellular development and function. Alterations to the epigenome are thought to capture and mediate the effects of genetic and environmental risk factors on complex disease. Currently, DNA methylation is the only epigenetic mark that can be measured reliably and genome-wide in large numbers of samples. This Review discusses some of the key statistical challenges and algorithms associated with drawing inferences from DNA methylation data, including cell-type heterogeneity, feature selection, reverse causation and system-level analyses that require integration with other data types such as gene expression, genotype, transcription factor binding and other epigenetic information.
Subjects
DNA Methylation; Databases, Genetic; Epigenesis, Genetic; Epigenomics/methods; Animals; Humans
Abstract
MOTIVATION: Estimating differentiation potency of single cells is a task of great biological and clinical significance, as it may allow identification of normal and cancer stem cell phenotypes. However, very few single-cell potency models have been proposed, and their robustness and reliability across independent studies have not yet been fully assessed. RESULTS: Using nine independent single-cell RNA-Seq experiments, we here compare four different single-cell potency models to each other, in their ability to discriminate cells that ought to differ in terms of differentiation potency. Two of the potency models approximate potency via network entropy measures that integrate the single-cell RNA-Seq profile of a cell with a protein interaction network. The comparison between the four models reveals that integration of RNA-Seq data with a protein interaction network dramatically improves the robustness and reliability of single-cell potency estimates. We demonstrate that underlying this robustness is a correlation relationship, according to which high differentiation potency is positively associated with overexpression of network hubs. We further show that overexpressed network hubs are strongly enriched for ribosomal mitochondrial proteins, suggesting that their mRNA levels may provide a universal marker of a cell's potency. Thus, this study provides novel systems-biological insight into cellular potency and may provide a foundation for improved models of differentiation potency with far-reaching implications for the discovery of novel stem cell or progenitor cell phenotypes.
Abstract
MOTIVATION: An important task in the analysis of single-cell RNA-Seq data is the estimation of differentiation potency, as this can help identify stem-or-multipotent cells in non-temporal studies or in tissues where differentiation hierarchies are not well established. A key challenge in the estimation of single-cell potency is the need for a fast and accurate algorithm, scalable to large scRNA-Seq studies profiling millions of cells. RESULTS: Here, we present a single-cell potency measure, called Correlation of Connectome and Transcriptome (CCAT), which can return accurate single-cell potency estimates of a million cells in minutes, a 100-fold improvement over current state-of-the-art methods. We benchmark CCAT against 8 other single-cell potency models and across 28 scRNA-Seq studies, encompassing over 2 million cells, demonstrating accuracy comparable to the current state-of-the-art, at a significantly reduced computational cost, and with increased robustness to dropouts. AVAILABILITY AND IMPLEMENTATION: CCAT is part of the SCENT R-package, freely available from https://github.com/aet21/SCENT. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
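Reading the name "Correlation of Connectome and Transcriptome" literally, a potency score of this kind can be sketched as the Pearson correlation between a cell's expression values and the connectivity (degree) of the corresponding genes in an interaction network. The sketch below assumes exactly that. The function names and toy values are hypothetical, and it is not the SCENT implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def potency_score(expression, degree):
    """Correlate one cell's expression with network degree over shared genes."""
    genes = sorted(set(expression) & set(degree))
    return pearson([expression[g] for g in genes], [degree[g] for g in genes])

# Hypothetical network degrees: gene A is a hub, gene D is peripheral.
degree = {"A": 50, "B": 20, "C": 5, "D": 1}

# Hypothetical expression profiles of two single cells.
stem_like = {"A": 9.0, "B": 6.0, "C": 3.0, "D": 1.0}       # hubs highly expressed
differentiated = {"A": 1.0, "B": 2.0, "C": 8.0, "D": 9.0}  # hubs lowly expressed

print(potency_score(stem_like, degree) > potency_score(differentiated, degree))
```

Because the score is a single correlation per cell, its cost grows linearly with the number of genes, which is consistent with the scalability the abstract emphasizes.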
Subjects
RNA, Small Cytoplasmic; Single-Cell Analysis; Cell Differentiation; Gene Expression Profiling; Sequence Analysis, RNA; Software
Abstract
An outstanding challenge of epigenome-wide association studies (EWASs) performed in complex tissues is the identification of the specific cell type(s) responsible for the observed differential DNA methylation. Here we present a statistical algorithm called CellDMC ( https://github.com/sjczheng/EpiDISH ), which can identify differentially methylated positions and the specific cell type(s) driving the differential methylation. We validated CellDMC on in silico mixtures of DNA methylation data generated with different technologies, as well as on real mixtures from epigenome-wide association and cancer epigenome studies. CellDMC achieved over 90% sensitivity and specificity in scenarios where current state-of-the-art methods did not identify differential methylation. By applying CellDMC to an EWAS performed in buccal swabs, we identified smoking-associated differentially methylated positions occurring in the epithelial compartment, which we validated in smoking-related lung cancer. CellDMC may be useful in the identification of causal DNA-methylation alterations in disease.
Subjects
DNA Methylation; DNA/analysis; Epigenesis, Genetic; Epigenomics/methods; Genetic Markers; Genome-Wide Association Study; Sequence Analysis, DNA/methods; Algorithms; Arthritis, Rheumatoid/genetics; Arthritis, Rheumatoid/pathology; Breast Neoplasms/genetics; Breast Neoplasms/pathology; CpG Islands; Endometrial Neoplasms/genetics; Endometrial Neoplasms/pathology; Female; Humans; Lung Neoplasms/genetics; Lung Neoplasms/pathology; Smoking/adverse effects; Smoking/genetics
Abstract
DNA methylation plays an essential role in cancer. Differential variability (DV), recently observed in cancer, contributes to cancer heterogeneity and has been shown to be crucial in detecting epigenetic field defects, that is, DNA methylation alterations happening early in carcinogenesis. As neighboring CpG sites are highly correlated, here we present a new method to detect differentially methylated regions (DMRs) that uses combined signals from differential methylation and DV between sample groups. We demonstrated in simulation studies the superior performance of the new method over existing methods that use only one type of signal when true DMRs have both. Applications to DNA methylation data of breast invasive carcinoma (BRCA) and kidney renal clear cell carcinoma (KIRC) from The Cancer Genome Atlas (TCGA) and BRCA from Gene Expression Omnibus (GEO) suggest that the new method identified additional cancer-related DMRs that were missed by methods using one type of signal. Replication analyses using two independent BRCA data sets suggest that DMRs detected based on DV are reproducible. Only the new method identified epigenetic field defects when comparing normal tissues adjacent to tumors and normal tissues from age-matched cancer-free women from the GEO BRCA data and confirmed their enrichment in the progression to breast cancer.
Subjects
DNA Methylation; Algorithms; Analysis of Variance; Breast Neoplasms/genetics; Carcinoma, Renal Cell/genetics; Case-Control Studies; Computational Biology; Computer Simulation; CpG Islands; DNA, Neoplasm/genetics; Databases, Genetic/statistics & numerical data; Epigenesis, Genetic; Female; Genetic Variation; Humans; Kidney Neoplasms/genetics; Sequence Analysis, DNA/statistics & numerical data
Abstract
Identifying epigenetic field defects, notably early DNA methylation alterations, is important for early cancer detection. Research has suggested these early methylation alterations are infrequent across samples and identifiable as outlier samples. Here we developed a weighted epigenetic distance-based method characterizing (dis)similarity in methylation measures at multiple CpGs in a gene or a genetic region between pairwise samples, with weights to up-weight signal CpGs and down-weight noise CpGs. Using distance-based approaches, weak signals that might be filtered out in a CpG site-level analysis could be accumulated and therefore boost the overall study power. In constructing epigenetic distances, we considered both differential methylation (DM) and differential variability (DV) signals. We demonstrated the superior performance of the proposed weighted epigenetic distance-based method over non-weighted versions and site-level EWAS (epigenome-wide association studies) methods in simulation studies. Application to breast cancer methylation data from Gene Expression Omnibus (GEO), comparing normal tissue adjacent to tumor in breast cancer patients with normal tissue from independent age-matched cancer-free women, identified novel epigenetic field defects that were missed by EWAS methods; the majority were previously reported to be associated with breast cancer and were confirmed in the progression to breast cancer. We further replicated some of the identified epigenetic field defects.
Subjects
Breast Neoplasms/genetics; DNA Methylation/genetics; Epigenomics/methods; Models, Theoretical; Breast Neoplasms/pathology; CpG Islands/genetics; Disease Progression; Early Detection of Cancer/methods; Female; Gene Expression Regulation, Neoplastic/genetics; Genome-Wide Association Study; Humans
Abstract
Psychosocial adversity in childhood (e.g. abuse) and low socioeconomic position (SEP) can have significant lasting effects on social and health outcomes. DNA methylation-based biomarkers are highly correlated with chronological age; departures of methylation-predicted age from chronological age can be used to define a measure of age acceleration, which may represent a potential biological mechanism linking environmental exposures to later health outcomes. Using data from two cohorts of women (Avon Longitudinal Study of Parents and Children, ALSPAC, N = 989; and MRC National Survey of Health and Development, NSHD, N = 773), we assessed associations of SEP, psychosocial adversity in childhood (parental physical or mental illness or death, parental separation, parental absence, sub-optimal maternal bonding, sexual, emotional and physical abuse and neglect) and a cumulative score of these psychosocial adversity measures, with DNA methylation age acceleration in adulthood (measured in peripheral blood at mean chronological ages 29 and 47 in ALSPAC and buccal cells at age 53 in NSHD). Sexual abuse was strongly associated with age acceleration in ALSPAC (sexual abuse data were not available in NSHD), e.g. at the 47-year time point sexual abuse was associated with a 3.41 years higher DNA methylation age (95% CI 1.53 to 5.29) after adjusting for childhood and adulthood SEP. No associations were observed between low SEP, any other psychosocial adversity measure or the cumulative psychosocial adversity score and age acceleration. DNA methylation age acceleration is associated with sexual abuse, suggesting a potential mechanism linking sexual abuse with adverse outcomes. Replication studies with larger sample sizes are warranted.
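The "departure of methylation-predicted age from chronological age" can be made concrete in more than one way. The sketch below shows one common definition, the residual from a linear regression of DNA methylation age on chronological age. It is an assumption that this matches the study's exact definition, and the function names and toy ages are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def age_acceleration(chron_age, dnam_age):
    """Residual of DNAm age after regressing it on chronological age.

    Positive values mean a sample looks epigenetically older than expected
    for its chronological age within this cohort.
    """
    slope, intercept = fit_line(chron_age, dnam_age)
    return [d - (intercept + slope * c) for c, d in zip(chron_age, dnam_age)]

# Hypothetical cohort of four individuals (ages in years).
chron_age = [30, 40, 50, 60]
dnam_age = [32, 41, 55, 58]

for c, acc in zip(chron_age, age_acceleration(chron_age, dnam_age)):
    print(c, round(acc, 2))
```

A simpler alternative is the raw difference between DNAm age and chronological age, but that difference can remain correlated with chronological age itself, whereas the regression residual is uncorrelated with it by construction.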
Subjects
Child Abuse, Sexual/psychology; DNA Methylation; Environmental Exposure/adverse effects; Epigenesis, Genetic; Mental Disorders; Adult; Child; Child, Preschool; Female; Humans; Male; Mental Disorders/genetics; Mental Disorders/metabolism; Mental Disorders/psychology; Middle Aged; Prospective Studies; Socioeconomic Factors
Abstract
MOTIVATION: The biological interpretation of differentially methylated sites derived from Epigenome-Wide-Association Studies (EWAS) remains a significant challenge. Gene Set Enrichment Analysis (GSEA) is a general tool to aid biological interpretation, yet its correct and unbiased implementation in the EWAS context is difficult due to the differential probe representation of Illumina Infinium DNA methylation beadchips. RESULTS: We present a novel GSEA method, called ebGSEA, which ranks genes, not CpGs, according to the overall level of differential methylation, as assessed using all the probes mapping to the given gene. Applied on simulated and real EWAS data, we show how ebGSEA may exhibit higher sensitivity and specificity than the current state-of-the-art, whilst also avoiding differential probe representation bias. Thus, ebGSEA will be a useful additional tool to aid the interpretation of EWAS data. AVAILABILITY AND IMPLEMENTATION: ebGSEA is available from https://github.com/aet21/ebGSEA, and has been incorporated into the ChAMP Bioconductor package (https://www.bioconductor.org). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
DNA Methylation; Epigenome; Probability
Abstract
SUMMARY: It is well recognized that cell-type heterogeneity hampers the interpretation of Epigenome-Wide Association Studies (EWAS). Many tools have emerged to address this issue, including several R/Bioconductor packages that infer cell-type composition. Here we present a web application for cell-type deconvolution, which offers the functionality of our EpiDISH Bioconductor/R package in a user-friendly GUI environment. Users can upload their data to infer cell-type composition and differentially methylated cytosines in individual cell-types (DMCTs) for a range of different tissues. AVAILABILITY AND IMPLEMENTATION: EpiDISH web server is implemented with Shiny in R, and is freely available at https://www.biosino.org/EpiDISH/.
Abstract
SUMMARY: The Illumina Infinium EPIC BeadChip is a new high-throughput array for DNA methylation analysis, extending the earlier 450k array by over 400 000 new sites. Previously, a method named eFORGE was developed to provide insights into cell type-specific and cell-composition effects for 450k data. Here, we present a significantly updated and improved version of eFORGE that can analyze both EPIC and 450k array data. New features include analysis of chromatin states, transcription factor motifs and DNase I footprints, providing tools for epigenome-wide association study interpretation and epigenome editing. AVAILABILITY AND IMPLEMENTATION: eFORGE v2.0 is implemented as a web tool available from https://eforge.altiusinstitute.org and https://eforge-tf.altiusinstitute.org/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
DNA Methylation; Epigenomics; Chromatin; CpG Islands; Deoxyribonuclease I; Oligonucleotide Array Sequence Analysis; Software
Abstract
Motivation: Clear identification of the primary tumor site is of great importance for subsequent targeted, site-specific treatment and could improve patients' overall survival. Even though many classifiers based on gene expression have been proposed to predict the primary tumor site, only a few studies have focused on using DNA methylation (DNAm) profiles to develop classifiers, and none of them compares the performance of classifiers based on different profiles. Results: We introduced novel selection strategies to identify highly tissue-specific CpG sites and then used the random forest approach to construct the classifiers to predict the origin of tumors. We also compared the prediction performance by applying a similar strategy to miRNA expression profiles. Our analysis indicated that these classifiers had an accuracy of 96.05% (Maximum-Relevance-Maximum-Distance: 90.02-99.99%) or 95.31% (principal component analysis: 79.82-99.91%) on independent DNAm datasets, and an overall accuracy of 91.30% (range 79.33-98.74%) on independent miRNA test sets for predicting tumor origin. This suggests that our feature selection methods are very effective at identifying tissue-specific biomarkers and that the classifiers we developed can efficiently predict the origin of tumors. We also developed a user-friendly webserver that helps users to predict the tumor origin by uploading miRNA expression or DNAm profiles of interest. Availability and implementation: The webserver and the related data and code are accessible at http://server.malab.cn/MMCOP/. Contact: zouquan@nclab.net or a.teschendorff@ucl.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
Subjects
Computational Biology/methods; DNA Methylation; Genes, Neoplasm; MicroRNAs/genetics; Neoplasms/diagnosis; CpG Islands; DNA, Neoplasm; Female; Gene Expression Profiling/methods; Gene Expression Regulation, Neoplastic; Humans; Male; Neoplasms/genetics; Sequence Analysis, DNA/methods; Sequence Analysis, RNA/methods
Abstract
Cancer is characterized by both genetic and epigenetic alterations. While cancer driver mutations and copy-number alterations have been studied at a systems-level, relatively little is known about the systems-level patterns exhibited by their epigenetic counterparts. Here we perform a pan-cancer wide systems-level analysis, mapping candidate cancer-driver DNA methylation (DNAm) alterations onto a human interactome. We demonstrate that functional DNAm alterations in cancer tend to map to nodes of lower connectivity and inter-connectivity, compared to the corresponding alterations at the genomic level. We find that epigenetic alterations are relatively over-represented in extracellular and transmembrane signaling domains, whereas cancer genes undergoing amplification or deletion tend to be enriched within the intracellular domain. A pan-cancer wide meta-analysis identifies WNT and chemokine signaling as two key pathways where epigenetic deregulation preferentially targets extracellular components. We further pinpoint specific chemokine ligands/receptors whose epigenetic deregulation associates with key epigenetic enzymes, representing potential targets for epigenetic therapy. Our results suggest that epigenetic deregulation in cancer not only targets tissue-specific transcription factors, but also modulates signaling within the extracellular domain, providing novel system-level insight into the potential distinctive role of genetic and epigenetic alterations in cancer.
Subjects
Epigenesis, Genetic; Gene Expression Regulation, Neoplastic; Neoplasms/genetics; Neoplasms/metabolism; Signal Transduction; Computational Biology/methods; DNA Copy Number Variations; DNA Methylation; Humans; Mutation; Organ Specificity/genetics; Protein Interaction Maps
Abstract
MicroRNAs (miRNAs) are often deregulated in cancer and are thought to play an important role in cancer development. A large number of differentially expressed miRNAs have been identified in various cancers by using high-throughput methods. It is therefore important to compile a comprehensive collection of these miRNAs and to decipher their roles in oncogenesis and tumor progression. In 2010, we presented the first release of dbDEMC, a database for the collection of differentially expressed miRNAs in human cancers obtained from microarray data. Here we describe an update of the database. dbDEMC 2.0 documents 209 expression profiling data sets across 36 cancer types and 73 subtypes, and a total of 2224 differentially expressed miRNAs were identified. An easy-to-use web interface was constructed that allows users to make a quick search of the differentially expressed miRNAs in certain cancer types. In addition, a new function of 'meta-profiling' was added to view differential expression events according to user-defined miRNAs and cancer types. We expect this database to continue to serve as a valuable source for cancer investigation and potential clinical application related to miRNAs. dbDEMC 2.0 is freely available at http://www.picb.ac.cn/dbDEMC.
Subjects
Computational Biology/methods; Databases, Nucleic Acid; Gene Expression Profiling/methods; Genomics/methods; MicroRNAs/genetics; Neoplasms/genetics; Humans; Search Engine; Software; Web Browser
Abstract
Although epigenetic processes have been linked to aging and disease in other systems, it is not yet known whether they relate to reproductive aging. Recently, we developed a highly accurate epigenetic biomarker of age (known as the "epigenetic clock"), which is based on DNA methylation levels. Here we carry out an epigenetic clock analysis of blood, saliva, and buccal epithelium using data from four large studies: the Women's Health Initiative (n = 1,864); Invecchiare nel Chianti (n = 200); Parkinson's disease, Environment, and Genes (n = 256); and the United Kingdom Medical Research Council National Survey of Health and Development (n = 790). We find that increased epigenetic age acceleration in blood is significantly associated with earlier menopause (P = 0.00091), bilateral oophorectomy (P = 0.0018), and a longer time since menopause (P = 0.017). Conversely, epigenetic age acceleration in buccal epithelium and saliva does not relate to age at menopause; however, a higher epigenetic age in saliva is exhibited in women who undergo bilateral oophorectomy (P = 0.0079), while a lower epigenetic age in buccal epithelium was found for women who underwent menopausal hormone therapy (P = 0.00078). Using genetic data, we find evidence of coheritability between age at menopause and epigenetic age acceleration in blood. Using Mendelian randomization analysis, we find that two SNPs that are highly associated with age at menopause exhibit a significant association with epigenetic age acceleration. Overall, our Mendelian randomization approach and other lines of evidence suggest that menopause accelerates epigenetic aging of blood, but mechanistic studies will be needed to dissect cause-and-effect relationships further.
Subjects
Aging/physiology; Menopause/physiology; Adult; Epigenesis, Genetic; Female; Humans; Mendelian Randomization Analysis; Middle Aged; Ovariectomy; Polymorphism, Single Nucleotide
Abstract
The interaction between the (epi)genetic makeup of an individual and his/her environmental exposure record (exposome) is accepted as a determinant factor for a significant proportion of human malignancies. Recent evidence has highlighted the key role of epigenetic mechanisms in mediating gene-environment interactions and translating exposures into tumorigenesis. There is also growing evidence that epigenetic changes may be risk factor-specific ("fingerprints") that should prove instrumental in the discovery of new biomarkers in cancer. Here, we review the state of the science of epigenetics associated with environmental stimuli and cancer risk, highlighting key developments in the field. Critical knowledge gaps and research needs are discussed, as are advances in epigenomics that may help in understanding the functional relevance of epigenetic alterations. Key elements required for causality inferences linking epigenetic changes to exposure and cancer are discussed, along with how these alterations can be incorporated in carcinogen evaluation and in understanding mechanisms underlying epigenome deregulation by the environment.