1.
Genome Biol ; 19(1): 32, 2018 Mar 15.
Article in English | MEDLINE | ID: mdl-29540241

ABSTRACT

BACKGROUND: The mammalian genome is transcribed into large numbers of long noncoding RNAs (lncRNAs), but the definition of functional lncRNA groups has proven difficult, partly due to their low sequence conservation and lack of identified shared properties. Here we consider promoter conservation and positional conservation as indicators of functional commonality. RESULTS: We identify 665 conserved lncRNA promoters in mouse and human that are preserved in genomic position relative to orthologous coding genes. These positionally conserved lncRNA genes are primarily associated with developmental transcription factor loci with which they are coexpressed in a tissue-specific manner. Over half of positionally conserved RNAs in this set are linked to chromatin organization structures, overlapping binding sites for the CTCF chromatin organiser and located at chromatin loop anchor points and borders of topologically associating domains (TADs). We define these RNAs as topological anchor point RNAs (tapRNAs). Characterization of these noncoding RNAs and their associated coding genes shows that they are functionally connected: they regulate each other's expression and influence the metastatic phenotype of cancer cells in vitro in a similar fashion. Furthermore, we find that tapRNAs contain conserved sequence domains that are enriched in motifs for zinc finger domain-containing RNA-binding proteins and transcription factors, whose binding sites are found mutated in cancers. CONCLUSIONS: This work leverages positional conservation to identify lncRNAs with potential importance in genome organization, development and disease. The evidence that many developmental transcription factors are physically and functionally connected to lncRNAs represents an exciting stepping-stone to further our understanding of genome regulation.


Subjects
Gene Expression Regulation, Developmental ; Genetic Loci ; RNA, Long Noncoding/genetics ; Animals ; Base Sequence ; Chromatin/chemistry ; Conserved Sequence ; Genome ; Humans ; Mice ; Neoplasms/genetics ; Nucleotide Motifs ; Promoter Regions, Genetic ; RNA, Long Noncoding/chemistry ; Transcription Factors/genetics
2.
Bioinformatics ; 28(23): 3042-50, 2012 Dec 01.
Article in English | MEDLINE | ID: mdl-23044541

ABSTRACT

MOTIVATION: Comparing transcriptomic data with proteomic data to identify protein-coding sequences is a long-standing challenge in molecular biology, one that is exacerbated by the increasing size of high-throughput datasets. To address this challenge, and thereby to improve the quality of genome annotation and understanding of genome biology, we have developed an integrated suite of programs, called Pinstripe. We demonstrate its application, utility and discovery power using transcriptomic and proteomic data from publicly available datasets. RESULTS: To demonstrate the efficacy of Pinstripe for large-scale analysis, we applied Pinstripe's reverse peptide mapping pipeline to a transcript library including de novo assembled transcriptomes from the human Illumina Body Atlas (IBA2) and GENCODE v10 gene annotations, and the EBI Proteomics Identifications Database (PRIDE) peptide database. This analysis identified 736 canonical open reading frames (ORFs) supported by three or more PRIDE peptide fragments that are positioned outside any known coding DNA sequence (CDS). Because of the unfiltered nature of the PRIDE database and high probability of false discovery, we further refined this list using independent evidence for translation, including the presence of a Kozak sequence or functional domains, synonymous/non-synonymous substitution ratios and ORF length. Using this integrative approach, we observed evidence of translation from a previously unknown let7e primary transcript, the archetypical lncRNA H19, and a homolog of RD3. Reciprocally, by exclusion of transcripts with mapped peptides or significant ORFs (>80 codon), we identify 32 187 loci with RNAs longer than 2000 nt that are unlikely to encode proteins. AVAILABILITY AND IMPLEMENTATION: Pinstripe (pinstripe.matticklab.com) is freely available as source code or a Mono binary. Pinstripe is written in C# and runs under the Mono framework on Linux or Mac OS X, and both under Mono and .Net under Windows. 
CONTACT: m.dinger@garvan.org.au or j.mattick@garvan.org.au SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
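The ORF-based filtering described in this abstract can be illustrated with a minimal sketch. It is not Pinstripe's implementation (Pinstripe is written in C#); the function names `find_orfs` and `peptide_support` are hypothetical, only the forward strand is scanned, and an ORF is assumed to run from the first in-frame ATG to the next in-frame stop codon, with the stop excluded from the codon count. Only the ">80 codon" and "three or more peptides" thresholds come from the abstract.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Return (start, end, protein) for ATG..stop ORFs in the three forward frames.

    Hypothetical helper: 'end' is the index just past the stop codon, and
    min_codons counts coding codons only (the stop codon is excluded).
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon after the previous stop
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    # Unknown codons (e.g. containing N) are translated as "X".
                    protein = "".join(
                        CODON_TABLE.get(seq[j:j + 3], "X")
                        for j in range(start, i, 3)
                    )
                    orfs.append((start, i + 3, protein))
                start = None
    return orfs


def peptide_support(protein, peptides):
    """Count distinct peptides that occur as exact substrings of the protein."""
    return sum(1 for p in set(peptides) if p and p in protein)


if __name__ == "__main__":
    toy = "CCATGGCTTGGTAAAC"  # toy transcript, not real data
    for start, end, protein in find_orfs(toy):
        print(start, end, protein, peptide_support(protein, ["MA", "AW", "W", "K"]))
```

Under these assumptions, calling `find_orfs(seq, min_codons=81)` would return only ORFs longer than 80 codons, and an ORF with `peptide_support(...) >= 3` would correspond to one "supported by three or more" peptides. Real peptide mapping would also need to handle both strands, splice junctions and inexact matches, which this sketch ignores.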


Subjects
Gene Expression Profiling/methods ; Genomics/methods ; Proteomics/methods ; Software ; Computational Biology/methods ; Databases, Protein ; Exons ; Gene Library ; Genome ; Humans ; Molecular Sequence Annotation ; Open Reading Frames ; Proteins/genetics ; RNA, Long Noncoding/genetics ; RNA, Messenger/genetics ; Sequence Analysis, RNA
3.
Biochimie ; 93(11): 2013-8, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21802485

ABSTRACT

Increasing numbers of transcripts have been reported to transmit both protein-coding and regulatory information. Apart from challenging our conception of the gene, this observation raises the question as to what extent this phenomenon occurs across the genome and how and why such dual encoding of function has evolved in the eukaryotic genome. To address this question, we consider the evolutionary path of genes in the earliest forms of life on Earth, where it is generally regarded that proteins evolved from a cellular machinery based entirely within RNA. This led to the domination of protein-coding genes in the genomes of microorganisms, although it is likely that RNA never lost its other capacities and functionalities, as evidenced by cis-acting riboswitches and UTRs. On the basis that the subsequent evolution of a more sophisticated regulatory architecture to provide higher levels of epigenetic control and accurate spatiotemporal expression in developmentally complex organisms is a complicated task, we hypothesize: (i) that mRNAs have been and remain subject to secondary selection to provide trans-acting regulatory capability in parallel with protein-coding functions; (ii) that some and perhaps many protein-coding loci, possibly as a consequence of gene duplication, have lost protein-coding functions en route to acquiring more sophisticated trans-regulatory functions; (iii) that many transcripts have become subject to secondary processing to release different products; and (iv) that novel proteins have emerged within loci that previously evolved functionality as regulatory RNAs. In support of the idea that there is a dynamic flux between different types of informational RNAs in both evolutionary and real time, we review recent observations that have arisen from transcriptomic surveys of complex eukaryotes and reconsider how these observations impact on the notion that apparently discrete loci may express transcripts with more than one function. 
In conclusion, we posit that many eukaryotic loci have evolved the capacity to transact a multitude of overlapping and potentially independent functions as both regulatory and protein-coding RNAs.


Subjects
Evolution, Molecular ; Open Reading Frames/genetics ; RNA, Untranslated/genetics ; RNA/genetics ; Alternative Splicing ; Bacteria/genetics ; Genome ; Humans ; RNA Splicing/genetics ; RNA, Messenger/genetics ; Riboswitch/genetics ; Untranslated Regions/genetics
4.
Nucleic Acids Res ; 39(Database issue): D146-51, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21112873

ABSTRACT

Large numbers of long RNAs with little or no protein-coding potential [long noncoding RNAs (lncRNAs)] are being identified in eukaryotes. In parallel, increasing data describing the expression profiles, molecular features and functions of individual lncRNAs in a variety of systems are accumulating. To enable the systematic compilation and updating of this information, we have developed a database (lncRNAdb) containing a comprehensive list of lncRNAs that have been shown to have, or to be associated with, biological functions in eukaryotes, as well as messenger RNAs that have regulatory roles. Each entry contains referenced information about the RNA, including sequences, structural information, genomic context, expression, subcellular localization, conservation, functional evidence and other relevant information. lncRNAdb can be searched by querying published RNA names and aliases, sequences, species and associated protein-coding genes, as well as terms contained in the annotations, such as the tissues in which the transcripts are expressed and associated diseases. In addition, lncRNAdb is linked to the UCSC Genome Browser for visualization and Noncoding RNA Expression Database (NRED) for expression information from a variety of sources. lncRNAdb provides a platform for the ongoing collation of the literature pertaining to lncRNAs and their association with other genomic elements. lncRNAdb can be accessed at: http://www.lncrnadb.org/.


Subjects
Databases, Nucleic Acid ; RNA, Untranslated/chemistry ; RNA, Untranslated/physiology ; Disease/genetics ; Host-Pathogen Interactions ; Humans ; RNA, Untranslated/metabolism