1.
Nucleic Acids Res ; 52(D1): D154-D163, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37971293

ABSTRACT

We present a major update of the HOCOMOCO collection that provides DNA binding specificity patterns of 949 human transcription factors and 720 mouse orthologs. To make this release, we performed motif discovery in peak sets that originated from 14 183 ChIP-Seq experiments and reads from 2554 HT-SELEX experiments yielding more than 400 thousand candidate motifs. The candidate motifs were annotated according to their similarity to known motifs and the hierarchy of DNA-binding domains of the respective transcription factors. Next, the motifs underwent human expert curation to stratify distinct motif subtypes and remove non-informative patterns and common artifacts. Finally, the curated subset of 100 thousand motifs was supplied to the automated benchmarking to select the best-performing motifs for each transcription factor. The resulting HOCOMOCO v12 core collection contains 1443 verified position weight matrices, including distinct subtypes of DNA binding motifs for particular transcription factors. In addition to the core collection, HOCOMOCO v12 provides motif sets optimized for the recognition of binding sites in vivo and in vitro, and for annotation of regulatory sequence variants. HOCOMOCO is available at https://hocomoco12.autosome.org and https://hocomoco.autosome.org.


Subject(s)
Databases, Genetic , Gene Expression Regulation , Protein Interaction Domains and Motifs , Transcription Factors , Animals , Humans , Mice , Binding Sites/genetics , Nucleotide Motifs , Transcription Factors/genetics , Transcription Factors/metabolism , Internet , Protein Interaction Domains and Motifs/genetics
2.
Brief Bioinform ; 25(1)2023 11 22.
Article in English | MEDLINE | ID: mdl-38084919

ABSTRACT

Single-cell ATAC-seq (scATAC-seq) is a recently developed approach for investigating open chromatin at the single-cell level and for assessing epigenetic regulation and transcription factor binding landscapes. The sparsity of scATAC-seq data calls for imputation. Similarly, preprocessing (filtering) may be required to reduce the computational load caused by the large number of open regions. However, optimal strategies for imputation and preprocessing have not yet been evaluated together. We present SAPIEnS (scATAC-seq Preprocessing and Imputation Evaluation System), a benchmark for scATAC-seq imputation frameworks, i.e. combinations of state-of-the-art imputation methods with commonly used preprocessing techniques. We assess different types of scATAC-seq analysis, i.e. clustering, visualization and digital genomic footprinting, and identify optimal preprocessing-imputation strategies. We discuss the benefits of the imputation framework depending on the task and the number of dataset features (peaks). We conclude that preprocessing with the Boruta method is beneficial for the majority of tasks, while imputation is helpful mostly for small datasets. We also implement a SAPIEnS database with pre-computed transcription factor footprints based on imputed data with their activity scores in a specific cell type. SAPIEnS is published at: https://github.com/lab-medvedeva/SAPIEnS. SAPIEnS database is available at: https://sapiensdb.com.


Subject(s)
Epigenesis, Genetic , Genomics , Genomics/methods , Transcription Factors/genetics , Transcription Factors/metabolism , Gene Expression Regulation , Cluster Analysis
3.
Int J Mol Sci ; 24(10)2023 May 11.
Article in English | MEDLINE | ID: mdl-37239934

ABSTRACT

Differential methylation (DM) is actively used in different types of fundamental and translational studies. Currently, microarray- and NGS-based approaches for methylation analysis are the most widely used, with multiple statistical models designed to extract differential methylation signatures. The benchmarking of DM models is challenging due to the absence of gold standard data. In this study, we analyze a large number of publicly available NGS and microarray datasets with divergent and widely utilized statistical models and apply the recently suggested and validated rank-statistic-based approach Hobotnica to evaluate the quality of their results. Overall, microarray-based methods demonstrate more robust and convergent results, while NGS-based models are highly dissimilar. Tests on simulated NGS data tend to overestimate the quality of the DM methods and should therefore be used with caution. Evaluation of the top 10 DMC and top 100 DMC in addition to the not-subset signature also shows more stable results for microarray data. Summing up, given the observed heterogeneity in NGS methylation data, the evaluation of newly generated methylation signatures is a crucial step in DM analysis. The Hobotnica metric agrees with previously developed quality metrics and provides a robust, sensitive, and informative estimation of methods' performance and DM signatures' quality in the absence of gold standard data, solving a long-standing problem in DM analysis.


Subject(s)
DNA Methylation , Models, Statistical , Microarray Analysis
4.
Int J Mol Sci ; 24(7)2023 Mar 25.
Article in English | MEDLINE | ID: mdl-37047200

ABSTRACT

Single-cell RNA-seq data contain many dropouts, caused by the low number and inefficient capture of mRNAs in individual cells, which hamper downstream analyses. Here, we present Epi-Impute, a computational method for dropout imputation by reconciling expression and epigenomic data. Epi-Impute leverages single-cell ATAC-seq data as an additional source of information about gene activity to reduce the number of dropouts. We demonstrate that Epi-Impute outperforms existing methods, especially for very sparse single-cell RNA-seq data sets, significantly reducing imputation error. At the same time, Epi-Impute accurately captures the primary distribution of gene expression across cells while preserving the gene-gene and cell-cell relationships in the data. Moreover, Epi-Impute allows for the discovery of functionally relevant cell clusters as a result of the increased resolution of scRNA-seq data due to imputation.


Subject(s)
Chromatin Immunoprecipitation Sequencing , Software , Sequence Analysis, RNA/methods , Single-Cell Gene Expression Analysis , Single-Cell Analysis/methods , Gene Expression Profiling
5.
Nucleic Acids Res ; 51(D1): D564-D570, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350659

ABSTRACT

We present an update of EpiFactors, a manually curated database providing information about epigenetic regulators, their complexes, targets, and products, which is openly accessible at http://epifactors.autosome.org. The updated version of EpiFactors contains information on 902 proteins, including 101 histones and protamines, and, as a main update, a newly curated collection of 124 lncRNAs involved in epigenetic regulation. The number of publications concerning the role of lncRNAs in epigenetics is rapidly growing. Yet a resource that compiles, integrates, organizes, and presents curated information on lncRNAs in epigenetics is missing. EpiFactors fills this gap and provides data on epigenetic regulators in an accessible and user-friendly form. For 820 of the genes in EpiFactors, we include expression estimates across multiple cell types assessed by CAGE-Seq in the FANTOM5 project. In addition, the updated EpiFactors contains information on 73 protein complexes involved in epigenetic regulation. Our resource is practical for a wide range of users, including biologists, bioinformaticians and molecular/systems biologists.


Subject(s)
Databases, Genetic , Epigenesis, Genetic , Humans , Histones/genetics , Histones/metabolism , Protamines , RNA, Long Noncoding/genetics , RNA, Long Noncoding/metabolism
6.
Int J Mol Sci ; 23(19)2022 Sep 27.
Article in English | MEDLINE | ID: mdl-36232714

ABSTRACT

Acute myeloid leukemia (AML) is a rapidly progressing heterogeneous disease with a high mortality rate, which is characterized by hyperproliferation of atypical immature myeloid cells. The number of AML patients is expected to increase in the near future, due to the old-age-associated nature of AML and increased longevity in the human population. RUNX1 and CEBPA, key transcription factors (TFs) of hematopoiesis, are frequently and independently mutated in AML. RUNX1 and CEBPA can bind TET2 demethylase and attract it to their binding sites (TFBS) in cell lines, leading to DNA demethylation of the regions nearby. Since TET2 does not have a DNA-binding domain, TFs are crucial for its guidance to target genomic locations. In this paper, we show that RUNX1 and CEBPA mutations in AML patients affect the methylation of important regulatory sites, resulting in the silencing of several RUNX1 and CEBPA target genes, most likely in a TET2-dependent manner. We demonstrate that hypermethylation of TFBS in AML cells with RUNX1 mutations was associated with resistance to anticancer chemotherapy. Demethylation therapy restored expression of the RUNX1 target gene, BIK, and increased sensitivity of AML cells to chemotherapy. If our results are confirmed, mutations in RUNX1 could be an indication for prescribing the combination of cytotoxic and demethylation therapies.


Subject(s)
CCAAT-Enhancer-Binding Proteins , Core Binding Factor Alpha 2 Subunit , Leukemia, Myeloid, Acute , CCAAT-Enhancer-Binding Proteins/genetics , CCAAT-Enhancer-Binding Proteins/metabolism , Core Binding Factor Alpha 2 Subunit/genetics , Core Binding Factor Alpha 2 Subunit/metabolism , DNA/genetics , DNA/metabolism , DNA Methylation/genetics , Demethylation/drug effects , Humans , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/genetics , Leukemia, Myeloid, Acute/metabolism , Mutation
7.
Noncoding RNA ; 8(1)2022 Feb 08.
Article in English | MEDLINE | ID: mdl-35202091

ABSTRACT

Long non-coding RNAs (lncRNAs) play an important role in genome regulation. Specifically, many lncRNAs interact with chromatin, recruit epigenetic complexes and in this way affect large-scale gene expression programs. However, the experimental data about lncRNA-chromatin interactions are still limited. The majority of experimental protocols do not provide any insight into the mechanics of lncRNA-based genome-wide epigenetic regulation. Here we present the HiMoRNA (Histone-Modifying RNA) database, a resource containing correlated lncRNA-epigenetic changes in specific genomic locations genome-wide. HiMoRNA integrates a large amount of multi-omics data to characterize the effects of lncRNAs on epigenetic modifications and gene expression. The current release of HiMoRNA includes more than five million associations in humans for ten histone modifications in multiple genomic loci and 4145 lncRNAs. HiMoRNA provides a user-friendly interface to facilitate browsing, searching and retrieving of lncRNAs associated with epigenetic profiles of various chromatin loci. Analysis of the HiMoRNA data suggests that several lncRNAs, including JPX, might be involved not only in regulation of the XIST locus but also in direct establishment or maintenance of X-chromosome inactivation. We believe that HiMoRNA is a convenient resource that can provide valuable biological insights and greatly facilitate functional annotation of lncRNAs.

8.
F1000Res ; 10: 204, 2021.
Article in English | MEDLINE | ID: mdl-34557292

ABSTRACT

Background: Acute myeloid leukemia (AML) is a hematopoietic malignancy characterized by genetic and epigenetic aberrations that alter the differentiation capacity of myeloid progenitor cells. The transcription factor CEBPα is frequently mutated in AML patients leading to an increase in DNA methylation in many genomic locations. Previously, it has been shown that ecCEBPα (extra coding CEBP α) - a lncRNA transcribed in the same direction as CEBPα gene - regulates DNA methylation of CEBPα promoter in cis. Here, we hypothesize that ecCEBPα could participate in the regulation of DNA methylation in trans. Method: First, we retrieved the methylation profile of AML patients with mutated CEBPα locus from The Cancer Genome Atlas (TCGA). We then predicted the ecCEBPα secondary structure in order to check the potential of ecCEBPα to form triplexes around CpG loci and checked if triplex formation influenced CpG methylation, genome-wide. Results: Using DNA methylation profiles of AML patients with a mutated CEBPα locus, we show that ecCEBPα could interact with DNA by forming DNA:RNA triple helices and protect regions near its binding sites from global DNA methylation. Further analysis revealed that triplex-forming oligonucleotides in ecCEBPα are structurally unpaired supporting the DNA-binding potential of these regions. ecCEBPα triplexes supported with the RNA-chromatin co-localization data are located in the promoters of leukemia-linked transcriptional factors such as MLF2. Discussion: Overall, these results suggest a novel regulatory mechanism for ecCEBPα as a genome-wide epigenetic modulator through triple-helix formation which may provide a foundation for sequence-specific engineering of RNA for regulating methylation of specific genes.


Subject(s)
Leukemia, Myeloid, Acute , RNA, Long Noncoding , CpG Islands/genetics , DNA Methylation , Humans , Leukemia, Myeloid, Acute/genetics , Promoter Regions, Genetic
9.
F1000Res ; 10: 249, 2021.
Article in English | MEDLINE | ID: mdl-34527215

ABSTRACT

Emerging studies demonstrate the ability of microRNAs (miRNAs) to activate genes via different mechanisms. Specifically, miRNAs may trigger an enhancer promoting chromatin remodelling in the enhancer region, thus activating the enhancer and its target genes. Here we present MIREyA, a pipeline developed to predict such miRNA-gene-enhancer trios based on an expression dataset which obviates the need to write custom scripts. We applied our pipeline to primary murine macrophages infected by Mycobacterium tuberculosis (HN878 strain) and detected Mir22, Mir221, Mir222, Mir155 and Mir1956, which could up-regulate genes related to immune responses. We believe that MIREyA is a useful tool for detecting putative miRNA-directed gene activation cases. MIREyA is available from:  https://github.com/veania/MIREyA.


Subject(s)
MicroRNAs , Mycobacterium tuberculosis , Animals , Macrophages , Mice , MicroRNAs/genetics , Mycobacterium tuberculosis/genetics , Regulatory Sequences, Nucleic Acid , Transcriptional Activation
10.
NAR Genom Bioinform ; 3(3): lqab074, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34458728

ABSTRACT

Many human genes are transcribed from both strands and produce sense-antisense gene pairs. Sense-antisense (SAS) chimeric transcripts are produced upon the coalescing of exons/introns from both sense and antisense transcripts of the same gene. SAS chimeras were first reported in prostate cancer cells. Subsequently, numerous SAS chimeras have been reported in the ChiTaRS-2.1 database. However, the landscape of their expression in human cells and their functional aspects are still unknown. We found that longer palindromic sequences are a unique feature of SAS chimeras. Structural analysis indicates that a long hairpin-like structure formed by many consecutive Watson-Crick base pairs appears because of these long palindromic sequences, which possibly play a role similar to that of double-stranded RNA (dsRNA), interfering with gene expression. RNA-RNA interaction analysis suggested that SAS chimeras could significantly interact with their parental mRNAs, indicating their potential regulatory features. Here, 267 SAS chimeras were mapped in RNA-seq data from 16 healthy human tissues, revealing their expression in normal cells. Evolutionary analysis suggested positive selection favoring sense-antisense fusions that significantly impacted the evolution of their function and structure. Overall, our study provides detailed insight into the expression landscape of SAS chimeras in human cells and identifies potential regulatory features.

11.
Cell Death Dis ; 12(9): 798, 2021 08 17.
Article in English | MEDLINE | ID: mdl-34404761

ABSTRACT

Immunomodulation strategies are crucial for several biomedical applications. However, the immune system is highly heterogeneous and its functional responses to infections remain elusive. Indeed, the characterization of immune response particularities to different pathogens is needed to identify immunomodulatory candidates. To address this issue, we compiled a comprehensive map of functional immune cell states of mouse in response to 12 pathogens. To create this atlas, we developed a single-cell-based computational method that partitions heterogeneous cell types into functionally distinct states and simultaneously identifies modules of functionally relevant genes characterizing them. We identified 295 functional states using 114 datasets of six immune cell types, creating a Catalogus Immune Muris. As a result, we found common as well as pathogen-specific functional states and experimentally characterized the function of an unknown macrophage cell state that modulates the response to Salmonella Typhimurium infection. Thus, we expect our Catalogus Immune Muris to be an important resource for studies aiming at discovering new immunomodulatory candidates.


Subject(s)
Immunity , Salmonella typhimurium/pathogenicity , Animals , HEK293 Cells , Humans , Immunomodulation , Inflammation/immunology , Inflammation/pathology , Leukocytes/immunology , Macrophages/immunology , Mice, Inbred C57BL , Time Factors , Transcription Factors/metabolism
13.
Nat Commun ; 11(1): 1018, 2020 02 24.
Article in English | MEDLINE | ID: mdl-32094342

ABSTRACT

Mammalian genomes encode tens of thousands of noncoding RNAs. Most noncoding transcripts exhibit nuclear localization and several have been shown to play a role in the regulation of gene expression and chromatin remodeling. To investigate the function of such RNAs, methods to massively map the genomic interacting sites of multiple transcripts have been developed; however, these methods have some limitations. Here, we introduce RNA And DNA Interacting Complexes Ligated and sequenced (RADICL-seq), a technology that maps genome-wide RNA-chromatin interactions in intact nuclei. RADICL-seq is a proximity ligation-based methodology that reduces the bias for nascent transcription, while increasing genomic coverage and unique mapping rate efficiency compared with existing methods. RADICL-seq identifies distinct patterns of genome occupancy for different classes of transcripts as well as cell type-specific RNA-chromatin interactions, and highlights the role of transcription in the establishment of chromatin structure.


Subject(s)
Chromatin/metabolism , Chromosome Mapping/methods , High-Throughput Nucleotide Sequencing/methods , RNA, Untranslated/genetics , Sequence Analysis, RNA/methods , Animals , Cell Line , Cell Nucleus/genetics , Cell Nucleus/metabolism , Chromatin/genetics , Chromatin Assembly and Disassembly/genetics , Gene Library , Mice , Mouse Embryonic Stem Cells , RNA, Untranslated/metabolism , Transcription, Genetic
14.
Int J Mol Sci ; 21(3)2020 Jan 28.
Article in English | MEDLINE | ID: mdl-32012884

ABSTRACT

Long noncoding RNAs (lncRNAs) play a key role in many cellular processes including chromatin regulation. To modify chromatin, lncRNAs often interact with DNA in a sequence-specific manner forming RNA:DNA triple helices. Computational tools for triple helix search do not always provide genome-wide predictions of sufficient quality. Here, we used four human lncRNAs (MEG3, DACOR1, TERC and HOTAIR) and their experimentally determined binding regions for evaluating triplex parameters that provide the highest prediction accuracy. Additionally, we combined triplex prediction with the lncRNA secondary structure and demonstrated that considering only single-stranded fragments of lncRNA can further improve DNA-RNA triplexes prediction.


Subject(s)
Computational Biology/methods , DNA/metabolism , RNA, Long Noncoding/chemistry , RNA, Long Noncoding/metabolism , Binding Sites , Humans , Models, Molecular , Nucleic Acid Conformation , RNA/chemistry , RNA/metabolism , Telomerase/chemistry , Telomerase/metabolism
15.
Front Immunol ; 10: 421, 2019.
Article in English | MEDLINE | ID: mdl-30941122

ABSTRACT

Mycobacterium tuberculosis (Mtb) can subvert the host defense by skewing macrophage activation toward a less microbicidal alternative activated state to avoid classical effector killing functions. Investigating the molecular basis of this evasion mechanism could uncover potential candidates for host directed therapy against tuberculosis (TB). A limited number of miRNAs have recently been shown to regulate host-mycobacterial interactions. Here, we performed time course kinetics experiments on bone marrow-derived macrophages (BMDMs) and human monocyte-derived macrophages (MDMs) alternatively activated with IL-4, IL-13, or a combination of IL-4/IL-13, followed by infection with Mtb clinical Beijing strain HN878. MiR-143 and miR-365 were highly induced in Mtb-infected M(IL-4/IL-13) BMDMs and MDMs. Knockdown of miR-143 and miR-365 using antagomiRs decreased the intracellular growth of Mtb HN878, reduced the production of IL-6 and CCL5 and promoted the apoptotic death of Mtb HN878-infected M(IL-4/IL-13) BMDMs. Computational target prediction identified c-Maf, Bach-1 and Elmo-1 as potential targets for both miR-143 and miR-365. Functional validation using luciferase assay, RNA-pulldown assay and Western blotting revealed that c-Maf and Bach-1 are directly targeted by miR-143 while c-Maf, Bach-1, and Elmo-1 are direct targets of miR-365. Knockdown of c-Maf using GapmeRs promoted intracellular Mtb growth when compared to control treated M(IL-4/IL-13) macrophages. Meanwhile, the blocking of Bach-1 had no effect and blocking Elmo-1 resulted in decreased Mtb growth. Combination treatment of M(IL-4/IL-13) macrophages with miR-143 mimics or miR-365 mimics and c-Maf, Bach-1, or Elmo-1 gene-specific GapmeRs restored Mtb growth in miR-143 mimic-treated groups and enhanced Mtb growth in miR-365 mimics-treated groups, thus suggesting the Mtb growth-promoting activities of miR-143 and miR-365 are mediated at least partially through interaction with c-Maf, Bach-1, and Elmo-1. 
We further show that knockdown of miR-143 and miR-365 in M(IL-4/IL-13) BMDMs decreased the expression of HO-1 and IL-10 which are known targets of Bach-1 and c-Maf, respectively, with Mtb growth-promoting activities in macrophages. Altogether, our work reports a host detrimental role of miR-143 and miR-365 during Mtb infection and highlights for the first time the role and miRNA-mediated regulation of c-Maf, Bach-1, and Elmo-1 in Mtb-infected M(IL-4/IL-13) macrophages.


Subject(s)
Adaptor Proteins, Signal Transducing/immunology , Basic-Leucine Zipper Transcription Factors/immunology , Macrophages/microbiology , MicroRNAs/immunology , Mycobacterium tuberculosis/growth & development , Proto-Oncogene Proteins c-maf/immunology , Animals , Interleukin-13/pharmacology , Interleukin-4/pharmacology , Macrophages/drug effects , Macrophages/immunology , Male , Mice, Inbred BALB C , Tuberculosis/genetics , Tuberculosis/immunology , Tuberculosis/microbiology
16.
BMC Genomics ; 20(1): 102, 2019 Feb 01.
Article in English | MEDLINE | ID: mdl-30709331

ABSTRACT

BACKGROUND: DNA methylation is involved in the regulation of gene expression. Although bisulfite-sequencing based methods profile DNA methylation at a single CpG resolution, methylation levels are usually averaged over genomic regions in the downstream bioinformatic analysis. RESULTS: We demonstrate that on the genome level a single CpG methylation can serve as a more accurate predictor of gene expression than an average promoter / gene body methylation. We define CpG traffic lights (CpG TL) as CpG dinucleotides with a significant correlation between methylation and expression of a gene nearby. CpG TL are enriched in all regulatory regions. Among all promoters, CpG TL are especially enriched in poised ones, suggesting involvement of DNA methylation in their regulation. Yet, binding of only a handful of transcription factors, such as NRF1, ETS, STAT and IRF-family members, could be regulated by direct methylation of transcription factor binding sites (TFBS) or of their close proximity. For the majority of TFs, an alternative scenario is more likely: methylation and inactivation of the whole regulatory element indirectly represses functional TF binding, with a CpG TL being a reliable marker of such inactivation. CONCLUSIONS: CpG TL provide a promising insight into mechanisms of enhancer activity and gene regulation, linking methylation of single CpG to gene expression. CpG TL methylation can be used as a reliable marker of enhancer activity and gene expression in applications, e.g. in the clinic, where measuring DNA methylation is easier than directly measuring gene expression due to the more stable nature of DNA.


Subject(s)
CpG Islands , DNA Methylation , Gene Expression Regulation , Genome, Human , Regulatory Sequences, Nucleic Acid , Transcription Factors/metabolism , Humans , Promoter Regions, Genetic , Transcription Factors/genetics , Transcription, Genetic
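The CpG traffic light definition in the abstract above (a CpG whose methylation correlates with expression of a nearby gene) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration, not the authors' pipeline: the use of Spearman correlation, the fixed cutoff `min_abs_rho` standing in for a proper significance test, and the toy methylation and expression values.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def is_candidate_traffic_light(methylation, expression, min_abs_rho=0.8):
    """Flag a CpG when |rho| passes a cutoff (hypothetical stand-in for a significance test)."""
    return abs(spearman(methylation, expression)) >= min_abs_rho


# Toy data: one CpG and one nearby gene measured in five samples (made-up values).
methylation = [0.9, 0.8, 0.6, 0.3, 0.1]
expression = [1.0, 2.5, 4.0, 9.0, 20.0]
rho = spearman(methylation, expression)
flagged = is_candidate_traffic_light(methylation, expression)
```

In a real analysis the cutoff would be replaced by a p-value with multiple-testing correction across all CpG-gene pairs, which this sketch deliberately leaves out.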
17.
Brief Bioinform ; 20(2): 551-564, 2019 03 22.
Article in English | MEDLINE | ID: mdl-29697742

ABSTRACT

The genomes of mammalian species are pervasively transcribed, producing as many noncoding as protein-coding RNAs. There is a growing body of evidence supporting their functional role. Long noncoding RNA (lncRNA) can bind both nucleic acids and proteins through several mechanisms. A reliable computational prediction of the most probable mechanism of lncRNA interaction can facilitate experimental validation of its function. In this study, we benchmarked computational tools capable of discriminating lncRNA from mRNA and predicting lncRNA interactions with other nucleic acids. We assessed the performance of 9 tools for distinguishing protein-coding from noncoding RNAs, as well as 19 tools for prediction of RNA-RNA and RNA-DNA interactions. Our conclusions about the considered tools were based on their performance at the level of the entire genome/transcriptome, as it is the most common task nowadays. We found that FEELnc and CPAT distinguish between coding and noncoding mammalian transcripts most accurately. ASSA, RIBlast and LASTAL, as well as Triplexator, turned out to be the best predictors of RNA-RNA and RNA-DNA interactions, respectively. We showed that the normalization of the predicted interaction strength to the transcript length and GC content may improve the accuracy of inferring RNA interactions. Yet, all the current tools have difficulty making accurate predictions of short trans RNA-RNA interactions, i.e. stretches of sparse contacts. Overall, there is still room for improvement in each category, especially for predictions of RNA interactions.


Subject(s)
Benchmarking , Computational Biology/methods , RNA, Long Noncoding/metabolism , RNA, Messenger/metabolism , Humans , RNA, Long Noncoding/genetics , RNA, Messenger/genetics , Transcriptome
18.
F1000Res ; 7: 165, 2018.
Article in English | MEDLINE | ID: mdl-29904589

ABSTRACT

The presence of H3K27me3 has been demonstrated to correlate with the CpG content. In this work, we tested whether H3K27ac has similar sequence preferences. We performed a translocation of DNA sequences with various properties into a beta-globin locus to control for the local chromatin environment. Our results suggest that, in contrast to H3K27me3, H3K27ac gain is unlikely to be affected by the CpG content of the underlying DNA sequence, while extremely high GC-content might contribute to the gain of H3K27ac.


Subject(s)
Chromatin/chemistry , CpG Islands/genetics , DNA Methylation , Histones/chemistry , Protein Processing, Post-Translational , Sequence Analysis, DNA/methods , beta-Globins/genetics , Acetylation , Chromatin/genetics , Histones/genetics , Humans
19.
Nucleic Acids Res ; 46(12): e72, 2018 07 06.
Article in English | MEDLINE | ID: mdl-29617876

ABSTRACT

Identifying transcription factor (TF) binding sites (TFBSs) is important in the computational inference of gene regulation. Widely used computational methods of TFBS prediction based on position weight matrices (PWMs) usually have high false positive rates. Moreover, computational studies of transcription regulation in eukaryotes frequently require numerous PWM models of TFBSs due to the large number of TFs involved. To overcome these problems we developed DRAF, a novel method for TFBS prediction that requires only 14 prediction models for 232 human TFs, while significantly improving prediction accuracy. DRAF models use more features than PWM models, as they combine information from TFBS sequences and physicochemical properties of TF DNA-binding domains into machine learning models. Evaluation of DRAF on 98 human ChIP-seq datasets shows on average 1.54-, 1.96- and 5.19-fold reduction of false positives at the same sensitivities compared to models from HOCOMOCO, TRANSFAC and DeepBind, respectively. This observation suggests that one can efficiently replace the PWM models for TFBS prediction by a small number of DRAF models that significantly improve prediction accuracy. The DRAF method is implemented in a web tool and in stand-alone software freely available at http://cbrc.kaust.edu.sa/DRAF.


Subject(s)
Sequence Analysis, DNA/methods , Transcription Factors/metabolism , Binding Sites , Chromatin Immunoprecipitation , DNA/chemistry , DNA/metabolism , Humans , Machine Learning , Position-Specific Scoring Matrices
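The PWM-based TFBS prediction that the abstract above takes as its baseline can be sketched generically as follows. This is not the DRAF method and not code from HOCOMOCO or TRANSFAC; the three-column count matrix, the pseudocount, the uniform background of 0.25 and the score threshold are all assumptions chosen for illustration, and only the forward strand is scanned.

```python
import math

BASES = "ACGT"


def counts_to_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds weights against a uniform background."""
    pwm = []
    for column in counts:
        total = sum(column[base] for base in BASES) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in BASES
        })
    return pwm


def score_window(pwm, window):
    """Sum the position-specific weights of the bases in one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def scan(pwm, sequence, threshold):
    """Return (start, window, score) for every forward-strand window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = score_window(pwm, window)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


# Hypothetical count matrix for a toy 3-bp motif "TGA" built from 8 aligned sites.
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 8, "C": 0, "G": 0, "T": 0},
]
toy_pwm = counts_to_pwm(toy_counts)
hits = scan(toy_pwm, "ACTGACG", threshold=3.0)
```

The threshold controls the trade-off the abstract refers to: lowering it recovers more true sites but admits more false positives, which is why comparisons are made at matched sensitivity.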
20.
Reprod Toxicol ; 78: 40-49, 2018 06.
Article in English | MEDLINE | ID: mdl-29550351

ABSTRACT

BACKGROUND: The association of exposure to endocrine disrupting chemicals in the peripubertal period with subsequent sperm DNA methylation is unknown. OBJECTIVE: We examined the association of peripubertal serum 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) concentrations with whole-genome bisulfite sequencing (WGBS) of sperm collected in young adulthood. METHODS: The Russian Children's Study is a prospective cohort of 516 boys who were enrolled at 8-9 years of age and provided semen samples at 18-19 years of age. WGBS of sperm was conducted to identify differentially methylated regions (DMR) between highest (n = 4) and lowest (n = 4) peripubertal TCDD groups. RESULTS: We found 52 DMRs that distinguished lowest and highest peripubertal serum TCDD concentrations. One of the top scoring networks, "Cellular Assembly and Organization, Cellular Function and Maintenance, Carbohydrate Metabolism", identified estrogen receptor alpha as its central regulator. CONCLUSION: Findings from our limited sample size suggest that peripubertal environmental exposures are associated with sperm DNA methylation in young adults.


Subject(s)
DNA Methylation , Endocrine Disruptors/blood , Environmental Pollutants/blood , Polychlorinated Dibenzodioxins/blood , Spermatozoa/metabolism , Adolescent , Adult , Child , Environmental Monitoring , Humans , Male , Puberty , Russia , Whole Genome Sequencing , Young Adult