ABSTRACT
The high sparsity of methylome profiles obtained by whole-genome bisulfite sequencing from small amounts of biological material limits the value of such profiles in the study of systems in which large samples are difficult to assemble, such as mammalian preimplantation embryonic development. Recently developed computational methods that address this sparsity by imputing missing values reach their limits when the required minimum data coverage or profiles of the same tissue in other modalities are not available. In this study, we explored the use of transfer learning together with Kullback-Leibler (KL) divergence to train predictive models for completing methylome profiles with very low coverage (below 2%). Transfer learning was used to leverage the less sparse profiles that are typically available for different tissues of the same species, while KL divergence was employed to maximize the use of information carried in the input data. A deep neural network was adopted to extract both DNA sequence and local methylation patterns for imputation. Our study of training models for completing methylome profiles of bovine oocytes and early embryos demonstrates the effectiveness of transfer learning and KL divergence, with individual increases of 29.98% and 29.43%, respectively, in prediction performance and a 38.70% increase when the two were used together. The drastically increased data coverage (43.80-73.6%) after imputation enables downstream analyses involving methylomes that cannot be done effectively with the very low coverage profiles (0.06-1.47%) before imputation.
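As an illustration of how a KL divergence term can use read-level information, the following is a minimal sketch and not the authors' implementation. It assumes that each CpG site is treated as a Bernoulli variable, that the observed methylation level is the fraction of methylated reads, and that sites without reads are skipped. The function names `bernoulli_kl` and `mean_kl_loss` and the clipping constant `eps` are hypothetical.

```python
import math


def bernoulli_kl(p_obs, p_pred, eps=1e-7):
    """KL divergence KL(p_obs || p_pred) between two Bernoulli distributions.

    Both probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    p = min(max(p_obs, eps), 1.0 - eps)
    q = min(max(p_pred, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def mean_kl_loss(meth_reads, total_reads, predictions):
    """Average Bernoulli KL divergence over covered CpG sites.

    meth_reads:  methylated read counts per site (assumed input layout)
    total_reads: total read counts per site
    predictions: predicted methylation probabilities per site
    Sites with zero coverage carry no observation and are skipped.
    """
    losses = []
    for m, n, q in zip(meth_reads, total_reads, predictions):
        if n == 0:
            continue
        losses.append(bernoulli_kl(m / n, q))
    return sum(losses) / len(losses) if losses else 0.0
```

Unlike a loss on binarized labels, this form keeps fractional methylation levels (for example, 3 of 4 reads methylated) as the target, which is one plausible reading of "maximizing the use of information carried in the input data".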
ABSTRACT
After myocardial infarction (MI), the massive death of cardiomyocytes leads to cardiac fibroblast proliferation and myofibroblast differentiation, which contribute to the extracellular matrix remodelling of the infarcted myocardium. We recently found that myofibroblasts further differentiate into matrifibrocytes, a newly identified cardiac fibroblast differentiation state. Cardiac fibroblasts in different states have distinct gene expression profiles closely related to their functions. However, the mechanism responsible for the gene expression changes during these activation and differentiation events remains unclear. In this study, we performed gene expression profiling and genome-wide mapping of accessible chromatin in mouse cardiac fibroblasts isolated from the uninjured myocardium and from the infarct at multiple time points corresponding to different differentiation states, using RNA sequencing (RNA-seq) and the assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq), respectively. ATAC-seq peaks were highly enriched in promoter regions and in distal regions where enhancers are located. Expression and promoter accessibility were positively correlated for many dynamically expressed genes, although evidence also indicated that mechanisms independent of chromatin accessibility may contribute to the gene expression changes in cardiac fibroblasts after MI. Moreover, motif enrichment analysis and gene regulatory network construction identified transcription factors that possibly contributed to the differential gene expression between cardiac fibroblasts of different states.
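The relationship between expression and promoter accessibility described above can be examined, for a single gene, by correlating its changes across time points. The sketch below is only one possible way to do this and is not taken from the study. It assumes that per-gene expression and promoter accessibility values are already normalized, and the helper names `log2_fold_change` and `pearson` and the pseudocount are hypothetical.

```python
import math


def log2_fold_change(after, before, pseudocount=1.0):
    """log2 ratio of two normalized signal values, with a pseudocount
    (an assumed choice) to keep the ratio defined at zero signal."""
    return math.log2((after + pseudocount) / (before + pseudocount))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

For one gene, `pearson` would be applied to two lists of `log2_fold_change` values (RNA-seq and promoter ATAC-seq signal, each relative to uninjured fibroblasts); a coefficient near 1 would be consistent with accessibility-linked regulation, while a weak coefficient would point to other mechanisms.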