Results 1 - 6 of 6
1.
Brief Bioinform ; 25(2), 2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38385876

ABSTRACT

Enhancers play an important role in the regulation of gene expression. The abundance or absence of enhancers in a DNA sequence, and irregularities in enhancer strength, affect gene expression and can lead to the initiation and propagation of diverse genetic diseases such as hemophilia, bladder cancer, diabetes and congenital disorders. Identifying enhancers and predicting their strength through experimental approaches is expensive, time-consuming and error-prone. To accelerate research on enhancer identification and strength prediction, around 19 computational frameworks have been proposed. These frameworks use machine learning and deep learning methods that take raw DNA sequences and predict the presence and strength of enhancers. However, they still fall short in performance and are not useful for real-time analysis. This paper presents a novel deep learning framework that uses language modeling strategies to transform DNA sequences into a statistical feature space. It applies transfer learning by training a language model in an unsupervised fashion to predict groups of nucleotides, known as k-mers, from the context of the other k-mers in a sequence. At the classification stage, it presents a novel classifier that combines two different architectures: a convolutional neural network and an attention mechanism. The proposed framework is evaluated on the enhancer identification benchmark dataset, where it outperforms the existing best-performing framework by 5% and 9% in terms of accuracy and MCC, respectively. Similarly, when evaluated on the enhancer strength prediction benchmark dataset, it outperforms the existing best-performing framework by 4% and 7% in terms of accuracy and MCC, respectively.
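
The k-mer prediction task described in this abstract can be illustrated with a minimal sketch. It only shows how a sequence might be split into k-mers and turned into (context, target) training pairs. The mask token, the value of k, the stride and the function names are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple

MASK = "[MASK]"  # hypothetical mask token, not taken from the paper


def kmer_tokens(seq: str, k: int = 3, stride: int = 1) -> List[str]:
    """Split a DNA sequence into k-mers taken every `stride` bases."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def masked_examples(tokens: List[str]) -> List[Tuple[List[str], str]]:
    """Build one (masked context, target k-mer) pair per token position."""
    examples = []
    for i, target in enumerate(tokens):
        # Hide the k-mer at position i; a language model would be
        # trained to recover `target` from the remaining k-mers.
        context = tokens[:i] + [MASK] + tokens[i + 1:]
        examples.append((context, target))
    return examples


tokens = kmer_tokens("ACGTAC", k=3)
pairs = masked_examples(tokens)
```

With a stride of 1, neighbouring k-mers overlap, so part of a masked k-mer remains visible in its neighbours. Whether and how the paper handles this is not stated in the abstract.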


Subject(s)
Benchmarking , Medicine , Neural Networks, Computer , Nucleotides , Regulatory Sequences, Nucleic Acid
2.
BMC Genomics ; 20(1): 511, 2019 Jun 20.
Article in English | MEDLINE | ID: mdl-31221079

ABSTRACT

BACKGROUND: Non-coding gene regulatory enhancers are essential to transcription in mammalian cells. As a result, a large variety of experimental and computational strategies have been developed to identify cis-regulatory enhancer sequences. Given the differences in the biological signals assayed, some variation in the enhancers identified by different methods is expected; however, the concordance of enhancers identified by different methods has not been comprehensively evaluated. This is critically needed, since in practice, most studies consider enhancers identified by only a single method. Here, we compare enhancer sets from eleven representative strategies in four biological contexts. RESULTS: All sets we evaluated overlap significantly more than expected by chance; however, there is significant dissimilarity in their genomic, evolutionary, and functional characteristics, both at the element and base-pair level, within each context. The disagreement is sufficient to influence interpretation of candidate SNPs from GWAS studies, and to lead to disparate conclusions about enhancer and disease mechanisms. Most regions identified as enhancers are supported by only one method, and we find limited evidence that regions identified by multiple methods are better candidates than those identified by a single method. As a result, we cannot recommend the use of any single enhancer identification strategy in all settings. CONCLUSIONS: Our results highlight the inherent complexity of enhancer biology and identify an important challenge to mapping the genetic architecture of complex disease. Greater appreciation of how the diverse enhancer identification strategies in use today relate to the dynamic activity of gene regulatory regions is needed to enable robust and reproducible results.
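
A base-pair-level comparison of two enhancer sets, as mentioned in this abstract, could be sketched as follows. The Jaccard index used here is an assumed choice of measure; the abstract does not say which statistics the authors computed. Intervals are assumed to be half-open `(chromosome, start, end)` tuples, and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open
Merged = Dict[str, List[Tuple[int, int]]]


def merge(intervals: List[Interval]) -> Merged:
    """Merge overlapping or touching intervals per chromosome."""
    merged: Merged = {}
    for chrom, start, end in sorted(intervals):
        runs = merged.setdefault(chrom, [])
        if runs and start <= runs[-1][1]:
            runs[-1] = (runs[-1][0], max(runs[-1][1], end))
        else:
            runs.append((start, end))
    return merged


def covered_bp(merged: Merged) -> int:
    """Total number of base pairs covered by a merged set."""
    return sum(end - start for runs in merged.values() for start, end in runs)


def shared_bp(a: Merged, b: Merged) -> int:
    """Base pairs covered by both merged sets (two-pointer sweep)."""
    total = 0
    for chrom in a.keys() & b.keys():
        ra, rb = a[chrom], b[chrom]
        i = j = 0
        while i < len(ra) and j < len(rb):
            lo = max(ra[i][0], rb[j][0])
            hi = min(ra[i][1], rb[j][1])
            total += max(0, hi - lo)
            # Advance whichever interval ends first.
            if ra[i][1] < rb[j][1]:
                i += 1
            else:
                j += 1
    return total


def bp_jaccard(set_a: List[Interval], set_b: List[Interval]) -> float:
    """Shared base pairs divided by base pairs covered by either set."""
    a, b = merge(set_a), merge(set_b)
    inter = shared_bp(a, b)
    union = covered_bp(a) + covered_bp(b) - inter
    return inter / union if union else 0.0
```

A value near 1 means the two methods annotate almost the same bases, while a value near 0 means they barely agree. An element-level comparison would count overlapping regions instead of bases.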


Subject(s)
Enhancer Elements, Genetic , Cell Line , Databases, Genetic , Evolution, Molecular , Gene Expression Regulation , Genomics , Humans , Liver/metabolism , Molecular Sequence Annotation , Myocardium/metabolism
3.
BMC Genomics ; 18(1): 656, 2017 Aug 23.
Article in English | MEDLINE | ID: mdl-28836940

ABSTRACT

BACKGROUND: The molecular mechanisms of transcriptional regulation are poorly understood in Plasmodium falciparum. In addition, most of the genes in Plasmodium falciparum are transcriptionally poised and only a handful of cis-regulatory elements are known to operate in transcriptional regulation. Here, we employed an epigenetic-signature-based approach to assess the significance of previously uncharacterised intergenic regions enriched with histone modification marks, leading to the discovery of enhancer-like elements. RESULTS: We found that enhancer-like elements are significantly enriched with H3K4me1, generate unique non-coding bi-directional RNAs, and that the majority of them can function as cis-regulators. Furthermore, a functional enhancer reporter assay demonstrates that the enhancer-like elements regulate transcription of target genes in Plasmodium falciparum. Our study also suggests that the Plasmodium genome segregates functionally related genes into discrete housekeeping and pathogenicity/virulence clusters, presumably for robust transcriptional control of virulence/pathogenicity genes. CONCLUSIONS: This report contributes to the understanding of parasite regulatory genomics by identifying enhancer-like elements and defining their epigenetic and transcriptional features, and it provides a resource of functional cis-regulatory elements that may give insights into the virulence/pathogenicity of Plasmodium falciparum.


Subject(s)
Enhancer Elements, Genetic/genetics , Genomics , Plasmodium falciparum/genetics , Transcription, Genetic/genetics , Epigenesis, Genetic/genetics , Histones/metabolism , Plasmodium falciparum/pathogenicity , Virulence/genetics
4.
Brief Funct Genomics ; 22(3): 302-311, 2023 May 18.
Article in English | MEDLINE | ID: mdl-36715222

ABSTRACT

Enhancers, a class of distal cis-regulatory elements located in the non-coding region of DNA, play a key role in gene regulation. It is difficult to identify enhancers from DNA sequence data because they are freely distributed in the non-coding region, have no specific sequence features, and can lie far from their target promoters. Therefore, this study presents a stacking ensemble learning method to accurately identify enhancers and classify them into strong and weak enhancers. Firstly, we obtain the fusion feature matrix by fusing the four features of Kmer, PseDNC, PCPseDNC and Z-Curve9. Secondly, five K-Nearest Neighbor (KNN) models with different parameters are trained as the base models, and the Logistic Regression algorithm is utilized as the meta-model. Thirdly, the stacking ensemble learning strategy is utilized to construct a two-layer model based on the base models and meta-model to train the preprocessed feature sets. The proposed method, named iEnhancer-SKNN, is a two-layer prediction model, in which the function of the first layer is to predict whether the given DNA sequences are enhancers or non-enhancers, and the function of the second layer is to distinguish whether the predicted enhancers are strong enhancers or weak enhancers. The performance of iEnhancer-SKNN is evaluated on the independent testing dataset and the results show that the proposed method has better performance in predicting enhancers and their strength. In enhancer identification, iEnhancer-SKNN achieves an accuracy of 81.75%, an improvement of 1.35% to 8.75% compared with other predictors, and in enhancer classification, iEnhancer-SKNN achieves an accuracy of 80.50%, an improvement of 5.5% to 25.5% compared with other predictors. Moreover, we identify key transcription factor binding site motifs in the enhancer regions and further explore the biological functions of the enhancers and these key motifs.
Source code and data can be downloaded from https://github.com/HaoWuLab-Bioinformatics/iEnhancer-SKNN.
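
As a rough illustration of two ideas in this abstract, the sketch below computes a plain k-mer frequency vector and shows the two-layer decision flow (enhancer or not, then strong or weak). It is not the iEnhancer-SKNN implementation: only the Kmer feature is covered, the stacked KNN and Logistic Regression models are replaced by hypothetical callables, and all names and defaults are assumptions.

```python
from itertools import product
from typing import Callable, List

Classifier = Callable[[List[float]], bool]  # hypothetical stand-in for a trained model


def kmer_frequencies(seq: str, k: int = 2) -> List[float]:
    """Normalized counts of all 4**k k-mers, in lexicographic ACGT order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other symbols
            counts[word] += 1
            windows += 1
    return [counts[m] / windows if windows else 0.0 for m in kmers]


def two_layer_predict(features: List[float],
                      is_enhancer: Classifier,
                      is_strong: Classifier) -> str:
    """Layer 1 decides enhancer vs non-enhancer; layer 2 grades strength."""
    if not is_enhancer(features):
        return "non-enhancer"
    return "strong enhancer" if is_strong(features) else "weak enhancer"
```

The second layer is consulted only for sequences that the first layer accepts, which mirrors the description of the second layer operating on predicted enhancers.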


Subject(s)
Enhancer Elements, Genetic , Software , Enhancer Elements, Genetic/genetics , Promoter Regions, Genetic/genetics , Sequence Analysis, DNA/methods , DNA , Machine Learning
5.
Trends Cell Biol ; 28(8): 608-630, 2018 Aug.
Article in English | MEDLINE | ID: mdl-29759817

ABSTRACT

Enhancers are distally located genomic cis-regulatory elements that integrate spatiotemporal cues to coordinate gene expression in a tissue-specific manner during metazoan development. Enhancer function depends on a combination of bound transcription factors and cofactors that regulate local chromatin structure, as well as on the topological interactions that are necessary for their activity. Numerous genome-wide studies concur that the vast majority of disease-associated variations occur within non-coding genomic sequences, in other words the 'cis-regulome', and this underscores their relevance for human health. Advances in DNA sequencing and genome-editing technologies have dramatically expanded our ability to identify enhancers and investigate their properties in vivo, revealing an extraordinary level of interconnectivity underlying cis-regulatory networks. We discuss here these recently developed methodologies, as well as emerging trends and remaining questions in the field of enhancer biology, and how perturbation of enhancer activities/functions results in enhanceropathies.


Subject(s)
Disease/genetics , Enhancer Elements, Genetic/genetics , Transcription Factors/metabolism , Animals , Humans , Transcription, Genetic/genetics
6.
Genome Biol ; 17(1): 196, 2016 Sep 27.
Article in English | MEDLINE | ID: mdl-27678375

ABSTRACT

BACKGROUND: Drosophila dorso-ventral (DV) patterning is one of the best-understood regulatory networks to date, and illustrates the fundamental role of enhancers in controlling patterning, cell fate specification, and morphogenesis during development. Histone acetylation such as H3K27ac is an excellent marker for active enhancers, but it is challenging to obtain precise locations for enhancers as the highest levels of this modification flank the enhancer regions. How to best identify tissue-specific enhancers in a developmental system de novo with a minimal set of data is still unclear. RESULTS: Using DV patterning as a test system, we develop a simple and effective method to identify tissue-specific enhancers de novo. We sample a broad set of candidate enhancer regions using data on CREB-binding protein co-factor binding or ATAC-seq chromatin accessibility, and then identify those regions with significant differences in histone acetylation between tissues. This method identifies hundreds of novel DV enhancers and outperforms ChIP-seq data of relevant transcription factors when benchmarked with mRNA expression data and transgenic reporter assays. These DV enhancers allow the de novo discovery of the relevant transcription factor motifs involved in DV patterning and contain additional motifs that are evolutionarily conserved and for which the corresponding transcription factors are expressed in a DV-biased fashion. Finally, we identify novel target genes of the regulatory network, implicating morphogenesis genes as early targets of DV patterning. CONCLUSIONS: Taken together, our approach has expanded our knowledge of the DV patterning network even further and is a general method to identify enhancers in any developmental system, including mammalian development.
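
The selection step described in this abstract, keeping candidate regions whose histone acetylation differs between tissues, can be sketched in simplified form. The abstract speaks of significant differences, which implies a statistical test; the log2 fold-change threshold below is a swapped-in stand-in for that test. The region names, pseudocount and threshold are all assumptions.

```python
import math
from typing import Dict, List, Tuple


def log2_fold_change(signal_a: float, signal_b: float,
                     pseudocount: float = 1.0) -> float:
    """Log2 ratio of two signals, with a pseudocount to avoid dividing by zero."""
    return math.log2((signal_a + pseudocount) / (signal_b + pseudocount))


def tissue_biased_regions(signals: Dict[str, Tuple[float, float]],
                          min_abs_lfc: float = 1.0) -> List[Tuple[str, float]]:
    """Keep candidate regions whose acetylation differs between two tissues.

    `signals` maps a region name to (signal in tissue A, signal in tissue B).
    The result is sorted by decreasing absolute log2 fold change.
    """
    selected = []
    for region, (a, b) in signals.items():
        lfc = log2_fold_change(a, b)
        if abs(lfc) >= min_abs_lfc:
            selected.append((region, lfc))
    return sorted(selected, key=lambda item: -abs(item[1]))
```

A positive value marks a region with higher signal in tissue A and a negative value marks one with higher signal in tissue B. With replicate data, a proper differential test would replace the fixed threshold.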
