1.
BMC Genomics ; 23(1): 78, 2022 Jan 25.
Article in English | MEDLINE | ID: mdl-35078412

ABSTRACT

BACKGROUND: Transcriptional regulation is primarily mediated by the binding of factors to non-coding regions in DNA. Identification of these binding regions enhances understanding of tissue formation and potentially facilitates the development of gene therapies. However, successful identification of binding regions is made difficult by the lack of a universal biological code for their characterisation. RESULTS: We extend an alignment-based method, changept, and identify clusters of biological significance, through ontology and de novo motif analysis. Further, we apply a Bayesian method to estimate and combine binary classifiers on the clusters we identify to produce a better performing composite. CONCLUSIONS: The analysis we describe provides a computational method for identification of conserved binding sites in the human genome and facilitates an alternative interrogation of combinations of existing data sets with alignment data.


Subject(s)
Algorithms , Regulatory Sequences, Nucleic Acid , Bayes Theorem , Binding Sites , Genome, Human , Humans , Regulatory Sequences, Nucleic Acid/genetics
2.
BMC Genomics ; 18(1): 259, 2017 Mar 27.
Article in English | MEDLINE | ID: mdl-28347272

ABSTRACT

BACKGROUND: Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. RESULTS: We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focussed elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. CONCLUSIONS: This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.


Subject(s)
Genome , RNA, Untranslated/metabolism , Animals , Bayes Theorem , Binding Sites , Conserved Sequence , Humans , Introns , Mice , Muscle Development/genetics , Nucleic Acid Conformation , RNA, Untranslated/chemistry , RNA, Untranslated/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , User-Computer Interface , Zebrafish/genetics
3.
PLoS One ; 10(3): e0118595, 2015.
Article in English | MEDLINE | ID: mdl-25739023

ABSTRACT

Wolbachia pipientis is an endosymbiotic bacterium that induces a wide range of effects in its insect hosts, including manipulation of reproduction and protection against pathogens. Little is known of the molecular mechanisms underlying the insect-Wolbachia interaction, though it is likely to be mediated via the secretion of proteins or other factors. There is an increasing amount of evidence that bacteria regulate many cellular processes, including secretion of virulence factors, using small non-coding RNAs (sRNAs), but sRNAs have not previously been described from Wolbachia. We have used two independent approaches, one based on comparative genomics and the other using RNA-Seq data generated for gene expression studies, to identify candidate sRNAs in Wolbachia. We experimentally characterized the expression of one of these candidates in four Wolbachia strains, and showed that it is differentially regulated in different host tissues and sexes. Given the roles played by sRNAs in other host-associated bacteria, the conservation of the candidate sRNAs between different Wolbachia strains, and the sex- and tissue-specific differential regulation we have identified, we hypothesise that sRNAs may play a significant role in the biology of Wolbachia, and in particular in its interactions with its host.


Subject(s)
Intracellular Space/microbiology , RNA, Small Untranslated/genetics , Wolbachia/genetics , Wolbachia/physiology , Animals , Computational Biology , Conserved Sequence , Drosophila melanogaster/microbiology , Female , Host Specificity , Male , Organ Specificity , RNA, Messenger/genetics , RNA, Messenger/metabolism , Sequence Analysis, RNA , Transcription, Genetic
4.
Comput Struct Biotechnol J ; 10(17): 107-15, 2014 Jul.
Article in English | MEDLINE | ID: mdl-25349679

ABSTRACT

Genomes are composed of a wide variety of elements with distinct roles and characteristics. Some of these elements are well-characterised functional components such as protein-coding exons. Other elements play regulatory or structural roles, encode functional non-protein-coding RNAs, or perform some other function yet to be characterised. Still others may have no functional importance, though they may nevertheless be of interest to biologists. One technique for investigating the composition of genomes is to segment sequences into compositionally homogeneous blocks. This technique, known as 'sequence segmentation' or 'change-point analysis', is used to identify patterns of variation across genomes such as GC-rich and GC-poor regions, coding and non-coding regions, slowly evolving and rapidly evolving regions and many other types of variation. In this mini-review we outline many of the genome segmentation methods currently available and then focus on a Bayesian DNA segmentation algorithm, with examples of its various applications.
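
To make the idea of change-point analysis in the abstract above concrete, the following is a minimal sketch that finds a single boundary between a GC-poor and a GC-rich block. It is not the Bayesian segmentation algorithm the mini-review focuses on: it is a simple maximum-likelihood split under an assumed two-block Bernoulli model, and the function names `log_lik` and `best_change_point` are hypothetical.

```python
import math


def log_lik(k, n):
    """Maximised Bernoulli log-likelihood of a block with k GC bases out of n."""
    if n == 0 or k == 0 or k == n:
        return 0.0  # a perfectly homogeneous (or empty) block has likelihood 1
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1 - p)


def best_change_point(seq):
    """Return the index i that best splits seq into seq[:i] and seq[i:].

    Illustrative assumption: exactly one change point, scored by the sum of
    the two blocks' maximised log-likelihoods. Returns None if len(seq) < 2.
    """
    gc = [1 if base in "GC" else 0 for base in seq.upper()]
    n, total = len(gc), sum(gc)
    best_i, best_score = None, -math.inf
    left = 0  # number of GC bases in seq[:i]
    for i in range(1, n):
        left += gc[i - 1]
        score = log_lik(left, i) + log_lik(total - left, n - i)
        if score > best_score:
            best_i, best_score = i, score
    return best_i


# An AT-only block of length 10 followed by a GC-only block of length 10.
print(best_change_point("ATATTAATAT" + "GCGGCCGCGC"))
```

A Bayesian method would instead place a prior on the number and positions of change points and sample from the posterior; the sketch only shows what "compositionally homogeneous blocks" means in the simplest case.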

5.
PLoS One ; 9(5): e97336, 2014.
Article in English | MEDLINE | ID: mdl-24824035

ABSTRACT

The 3' UTRs of eukaryotic genes participate in a variety of post-transcriptional (and some transcriptional) regulatory interactions. Some of these interactions are well characterised, but an undetermined number remain to be discovered. While some regulatory sequences in 3' UTRs may be conserved over long evolutionary time scales, others may have only ephemeral functional significance as regulatory profiles respond to changing selective pressures. Here we propose a sensitive segmentation methodology for investigating patterns of composition and conservation in 3' UTRs based on comparison of closely related species. We describe encodings of pairwise and three-way alignments integrating information about conservation, GC content and transition/transversion ratios and apply the method to three closely related Drosophila species: D. melanogaster, D. simulans and D. yakuba. Incorporating multiple data types greatly increased the number of segment classes identified compared to similar methods based on conservation or GC content alone. We propose that the number of segments and number of types of segment identified by the method can be used as proxies for functional complexity. Our main finding is that the number of segments and segment classes identified in 3' UTRs is greater than in the same length of protein-coding sequence, suggesting greater functional complexity in 3' UTRs. There is thus a need for sustained and extensive efforts by bioinformaticians to delineate functional elements in this important genomic fraction. C code, data and results are available upon request.


Subject(s)
3' Untranslated Regions/genetics , Drosophila/genetics , Genetic Variation , Models, Genetic , Animals , Base Sequence , Computational Biology , Molecular Sequence Data , Open Reading Frames/genetics , Species Specificity
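
The abstract of record 5 mentions encodings of pairwise alignments that combine conservation, GC content and transition/transversion information. The authors' actual encoding is not given in this listing, so the sketch below uses a hypothetical five-symbol alphabet purely for illustration: `S` (match on G or C), `W` (match on A or T), `i` (transition), `v` (transversion) and `-` (gap or ambiguous base). The function names are assumptions as well.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def encode_column(a, b):
    """Encode one alignment column as a single symbol (hypothetical alphabet)."""
    a, b = a.upper(), b.upper()
    if a not in "ACGT" or b not in "ACGT":
        return "-"  # gap or ambiguous base in either sequence
    if a == b:
        return "S" if a in "GC" else "W"  # conserved, split by GC content
    # A mismatch within purines or within pyrimidines is a transition.
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "i" if same_class else "v"


def encode_alignment(seq1, seq2):
    """Encode a pairwise alignment given as two equal-length gapped strings."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    return "".join(encode_column(a, b) for a, b in zip(seq1, seq2))


print(encode_alignment("ACGT-A", "ACAGTA"))  # prints WSiv-W
```

A string encoded this way could then be passed to a segmentation method that looks for blocks with homogeneous symbol frequencies; a three-way alignment would need a larger alphabet.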