Results 1 - 3 of 3
1.
Nucleic Acids Res; 49(11): 6128-6143, 2021 Jun 21.
Article in English | MEDLINE | ID: mdl-34086938

ABSTRACT

Many non-coding RNAs with known functions are structurally conserved: their intramolecular secondary and tertiary interactions are maintained across evolutionary time. Consequently, the presence of conserved structure in multiple sequence alignments can be used to identify candidate functional non-coding RNAs. Here, we present a bioinformatics method that couples iterative homology search with covariation analysis to assess whether a genomic region has evidence of conserved RNA structure. We used this method to examine all unannotated regions of five well-studied fungal genomes (Saccharomyces cerevisiae, Candida albicans, Neurospora crassa, Aspergillus fumigatus, and Schizosaccharomyces pombe). We identified 17 novel structurally conserved non-coding RNA candidates, which include four H/ACA box small nucleolar RNAs, four intergenic RNAs and nine RNA structures located within the introns and untranslated regions (UTRs) of mRNAs. For the two structures in the 3' UTRs of the metabolic genes GLY1 and MET13, we performed experiments that provide evidence against them being eukaryotic riboswitches.
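
As a rough illustration of what "covariation" means in an alignment, the sketch below computes mutual information between two alignment columns. This is a simple stand-in statistic chosen for illustration; the abstract does not say which covariation measure or software the authors used, and the function name and toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    High values mean the residues in the two columns change together,
    which is the kind of signal compensatory base-pair substitutions leave.
    """
    n = len(alignment)
    pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
    i_counts = Counter(seq[i] for seq in alignment)
    j_counts = Counter(seq[j] for seq in alignment)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((i_counts[a] / n) * (j_counts[b] / n)))
    return mi


# Hypothetical toy alignment: columns 0 and 4 always form a Watson-Crick
# pair (G-C, C-G, A-U, U-A), while columns 1 to 3 are invariant.
toy_alignment = [
    "GAAAC",
    "CAAAG",
    "AAAAU",
    "UAAAA",
]

paired = mutual_information(toy_alignment, 0, 4)
invariant = mutual_information(toy_alignment, 1, 2)
```

In this toy case the paired columns carry the maximum possible signal and the invariant columns carry none. A real analysis would also need a significance test, because phylogenetic relatedness alone can inflate raw covariation scores.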


Subjects
RNA, Fungal/chemistry; RNA, Untranslated/chemistry; 3' Untranslated Regions; Computational Biology/methods; Genome, Fungal; Introns; Lysine-tRNA Ligase/genetics; Markov Chains; Nucleic Acid Conformation; RNA, Small Nucleolar/chemistry; Ribosomal Proteins/genetics; Riboswitch; Sequence Alignment; Thioredoxins/genetics
2.
Nucleic Acids Res; 44(D1): D81-9, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26612867

ABSTRACT

Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download.


Subjects
DNA Transposable Elements; DNA/chemistry; Databases, Nucleic Acid; Repetitive Sequences, Nucleic Acid; Animals; DNA/classification; Genome; Humans; Internet; Markov Chains; Mice; Molecular Sequence Annotation; Sequence Alignment
3.
Nucleic Acids Res; 41(Database issue): D70-82, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203985

ABSTRACT

We present a database of repetitive DNA elements, called Dfam (http://dfam.janelia.org). Many genomes contain a large fraction of repetitive DNA, much of which is made up of remnants of transposable elements (TEs). Accurate annotation of TEs enables research into their biology and can shed light on the evolutionary processes that shape genomes. Identification and masking of TEs can also greatly simplify many downstream genome annotation and sequence analysis tasks. The commonly used TE annotation tools RepeatMasker and Censor depend on sequence homology search tools such as cross_match and BLAST variants, as well as Repbase, a collection of known TE families each represented by a single consensus sequence. Dfam contains entries corresponding to all Repbase TE entries for which instances have been found in the human genome. Each Dfam entry is represented by a profile hidden Markov model, built from alignments generated using RepeatMasker and Repbase. When used in conjunction with the hidden Markov model search tool nhmmer, Dfam produces a 2.9% increase in coverage over consensus sequence search methods on a large human benchmark, while maintaining low false discovery rates, and coverage of the full human genome is 54.5%. The website provides a collection of tools and data views to support improved TE curation and annotation efforts. Dfam is also available for download in flat file format or in the form of MySQL table dumps.
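
The contrast between a single consensus sequence and a position-specific profile can be sketched as follows. This is a simplified, assumption-laden illustration: it builds only per-column frequencies from an ungapped toy alignment, whereas a real profile hidden Markov model also models insertions and deletions. All names, the pseudocount, and the uniform background are hypothetical choices and are not taken from Dfam or nhmmer.

```python
from collections import Counter
from math import log2

ALPHABET = "ACGT"


def consensus(alignment):
    """Most frequent base in each column (the single-sequence summary)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*alignment))


def profile(alignment, pseudocount=1.0):
    """Per-column base probabilities, smoothed with a pseudocount."""
    columns = []
    for col in zip(*alignment):
        counts = Counter(col)
        total = len(col) + pseudocount * len(ALPHABET)
        columns.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return columns


def log_odds(seq, prof, background=0.25):
    """Log-odds score (bits) of an ungapped sequence against the profile."""
    return sum(log2(prof[i][base] / background) for i, base in enumerate(seq))


# Hypothetical seed alignment of one repeat family.
seed = ["ACGT", "ACGA", "ACCT", "ATGT"]

cons = consensus(seed)
prof = profile(seed)
score_consensus = log_odds("ACGT", prof)
score_variant = log_odds("ACCT", prof)  # a variant seen in the seed
score_unrelated = log_odds("TGAC", prof)
```

The profile retains how much variation each column tolerates, so a diverged copy that matches a variant observed in the seed still scores well above an unrelated sequence. A consensus string alone discards that per-column information.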


Subjects
DNA Transposable Elements; Databases, Nucleic Acid; Genome, Human; Humans; Internet; Markov Chains; Models, Statistical; Molecular Sequence Annotation