ABSTRACT
Cancer, a collection of more than two hundred different diseases, remains a leading cause of morbidity and mortality worldwide. Usually detected at advanced stages of disease, metastatic cancer accounts for 90% of cancer-associated deaths. Early detection of cancer, combined with current therapies, would therefore have a significant impact on survival and treatment across cancer types. Epigenetic changes such as DNA methylation are among the early events underlying carcinogenesis. Here, we report an interpretable machine learning model that classifies 13 cancer types, as well as non-cancer tissue samples, with 98.2% accuracy using only DNA methylome data. We use the features identified by this model to develop EMethylNET, a robust model in which an XGBoost model provides information to a deep neural network, and which generalizes to independent data sets. We also demonstrate that the methylation-associated genomic loci detected by the classifier are associated with genes, pathways and networks involved in cancer, providing insights into the epigenomic regulation of carcinogenesis.
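The two-stage idea described above (a first model identifies informative methylation features, and a second model classifies samples using them) can be illustrated with a minimal sketch. This is not EMethylNET and uses neither XGBoost nor a neural network: as clearly labeled stand-ins, a simple class-separation score replaces the boosted-tree feature importances and a nearest-centroid rule replaces the deep network. The data are synthetic, there are only two classes, and every function and variable name is hypothetical.

```python
import random
from statistics import mean, pstdev


def make_toy_data(n_per_class, n_features, informative, seed):
    """Synthetic methylation-like values in [0, 1] for two classes (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.random() for _ in range(n_features)]
            for j in informative:
                # Informative probes: low values for class 0, high for class 1.
                value = rng.gauss(0.2 + 0.6 * label, 0.05)
                row[j] = min(1.0, max(0.0, value))
            X.append(row)
            y.append(label)
    return X, y


def rank_features(X, y, k):
    """Stage 1 stand-in: keep the k features that best separate the classes."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        spread = pstdev(a) + pstdev(b) + 1e-9
        scores.append((abs(mean(a) - mean(b)) / spread, j))
    top = sorted(scores, reverse=True)[:k]
    return sorted(j for _, j in top)


def fit_centroids(X, y, features):
    """Stage 2 stand-in: one mean vector per class over the selected features."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean([r[j] for r in rows]) for j in features]
    return centroids


def predict(row, centroids, features):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(center):
        return sum((row[j] - c) ** 2 for j, c in zip(features, center))
    return min(centroids, key=lambda label: dist(centroids[label]))


# Train on one synthetic set, evaluate on an independently generated one.
X_train, y_train = make_toy_data(30, 20, (0, 1, 2), seed=0)
X_test, y_test = make_toy_data(30, 20, (0, 1, 2), seed=1)

selected = rank_features(X_train, y_train, k=3)
centroids = fit_centroids(X_train, y_train, selected)
hits = [predict(r, centroids, selected) == t for r, t in zip(X_test, y_test)]
accuracy = sum(hits) / len(hits)
print("selected features:", selected)
print("held-out accuracy:", accuracy)
```

Evaluating on a separately generated set loosely mirrors the emphasis on generalization to independent data sets. A real pipeline would swap in the actual models and methylation data.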
ABSTRACT
The Polycomb repressor complex 2 (PRC2) is composed of the core subunits Ezh1/2, Suz12, and Eed, and it mediates all di- and tri-methylation of histone H3 at lysine 27 in higher eukaryotes. However, little is known about how the catalytic activity of PRC2 is regulated to demarcate H3K27me2 and H3K27me3 domains across the genome. To address this, we mapped the endogenous interactomes of Ezh2 and Suz12 in embryonic stem cells (ESCs), and we combined this with a functional screen for H3K27 methylation marks. We found that Nsd1-mediated H3K36me2 co-locates with H3K27me2, and its loss leads to genome-wide expansion of H3K27me3. These increases in H3K27me3 occurred at PRC2/PRC1 target genes and as de novo accumulation within what were previously broad H3K27me2 domains. Our data support a model in which Nsd1 is a key modulator of PRC2 function required for regulating the demarcation of genome-wide H3K27me2 and H3K27me3 domains in ESCs.