1.
Nucleic Acids Res ; 49(16): 9077-9096, 2021 Sep 20.
Article in English | MEDLINE | ID: mdl-34417604

ABSTRACT

tRNAscan-SE, developed just as the first genomes were decoded, has been widely used for transfer RNA (tRNA) gene prediction for over twenty years. With the massive increase in quantity and phylogenetic diversity of genomes, the accurate detection and functional prediction of tRNAs have become more challenging. Utilizing a vastly larger training set, we created nearly one hundred specialized isotype- and clade-specific models, greatly improving tRNAscan-SE's ability to identify and classify both typical and atypical tRNAs. We employ a new comparative multi-model strategy where predicted tRNAs are scored against a full set of isotype-specific covariance models, allowing functional prediction based on both the anticodon and the highest-scoring isotype model. Comparative model scoring has also enhanced the program's ability to detect tRNA-derived SINEs and other likely pseudogenes. For the first time, tRNAscan-SE also includes fast and highly accurate detection of mitochondrial tRNAs using newly developed models. Overall, tRNA detection sensitivity and specificity are improved for all isotypes, particularly those utilizing specialized models for selenocysteine and the three subtypes of tRNA genes encoding a CAU anticodon. These enhancements will provide researchers with more accurate and detailed tRNA annotation for a wider variety of tRNAs, and may direct attention to tRNAs with novel traits.


Subjects
RNA, Transfer/genetics ; Sequence Analysis, DNA/methods ; Software ; Genes, Archaeal ; Genes, Bacterial ; Genes, Fungal
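
The comparative multi-model idea described in the abstract above (score a predicted tRNA against every isotype-specific model, then compare the best-scoring model with the isotype implied by the anticodon) can be illustrated with a minimal sketch. This is not tRNAscan-SE's implementation or API: the function name `classify`, the three-entry anticodon table, and the bit scores are all hypothetical and chosen only for illustration.

```python
# Illustrative only: a tiny anticodon-to-isotype table (standard genetic code).
# A real tool would cover every anticodon; this table is an assumption.
ANTICODON_ISOTYPE = {"GAA": "Phe", "UGC": "Ala", "GCC": "Gly"}


def classify(anticodon, model_scores):
    """Compare the anticodon-implied isotype with the best-scoring model.

    model_scores maps an isotype name to a (hypothetical) covariance-model
    bit score for one predicted tRNA.
    """
    anticodon_isotype = ANTICODON_ISOTYPE.get(anticodon)
    # The isotype whose model gives this sequence the highest score.
    top_model = max(model_scores, key=model_scores.get)
    return {
        "anticodon_isotype": anticodon_isotype,
        "top_model": top_model,
        # A disagreement flags the tRNA for closer inspection.
        "consistent": anticodon_isotype == top_model,
    }


# Hypothetical scores for a tRNA carrying a GAA anticodon.
result = classify("GAA", {"Phe": 71.2, "Ala": 30.5, "Gly": 12.0})
print(result)
```

A mismatch between the two calls (for example, a UGC anticodon whose best-scoring model is Phe) is the kind of case such a comparison would surface as atypical.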
2.
Nucleic Acids Res ; 47(W1): W542-W547, 2019 Jul 02.
Article in English | MEDLINE | ID: mdl-31127306

ABSTRACT

Transfer RNAs (tRNAs) are ubiquitous across the tree of life. Although tRNA structure is highly conserved, there is still significant variation in sequence features between clades, isotypes and even isodecoders. This variation not only impacts translation, but as shown by a variety of recent studies, nontranslation-associated functions are also sensitive to small changes in tRNA sequence. Despite the rapidly growing number of sequenced genomes, there is a lack of tools for both small- and large-scale comparative genomics analysis of tRNA sequence features. Here, we have integrated over 150 000 tRNAs spanning all domains of life into tRNAviz, a web application for exploring and visualizing tRNA sequence features. tRNAviz implements a framework for determining consensus sequence features and can generate sequence feature distributions by isotypes, clades and anticodons, among other tRNA properties such as score. All visualizations are interactive and exportable. The web server is publicly available at http://trna.ucsc.edu/tRNAviz/.


Subjects
RNA, Transfer/chemistry ; Software ; Base Sequence ; Computer Graphics ; Consensus Sequence ; RNA, Archaeal/chemistry ; RNA, Bacterial/chemistry ; RNA, Transfer/classification ; Sequence Analysis, RNA
3.
Genetics ; 196(3): 891-909, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24653211

ABSTRACT

The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (∼20-40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with 13 other plant species resulted in 20,646 gene families, of which 1,554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In-depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%.


Subjects
Genome, Plant ; Molecular Sequence Annotation/methods ; Pinus taeda/genetics ; DNA, Plant/analysis ; Evolution, Molecular ; Genes, Plant ; Multigene Family ; Phylogeny ; Sequence Alignment
4.
Genome Biol ; 15(3): R59, 2014 Mar 04.
Article in English | MEDLINE | ID: mdl-24647006

ABSTRACT

BACKGROUND: The size and complexity of conifer genomes have, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. RESULTS: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole-genome shotgun approach relying primarily on next-generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. CONCLUSIONS: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrate a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied.


Subjects
Contig Mapping/methods ; Genome, Plant ; Pinus taeda/genetics ; Sequence Analysis, DNA/methods ; DNA, Plant/genetics ; Haploidy
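
The abstract above reports an N50 scaffold size of 66.9 kbp. As a reminder of what that statistic measures, here is a minimal sketch using the common definition: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions and are not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that brings the running sum to half the total.
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), total = 300.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70, since 80 + 70 = 150 reaches half of 300
```

Because N50 is weighted by length, a few long scaffolds can raise it considerably even when most scaffolds are short, which is why it is usually reported alongside the total assembly size.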
5.
PLoS One ; 8(9): e72439, 2013.
Article in English | MEDLINE | ID: mdl-24023741

ABSTRACT

Despite their prevalence and importance, the genome sequences of loblolly pine, Norway spruce, and white spruce, three ecologically and economically important conifer species, are just becoming available to the research community. Following the completion of these large assemblies, annotation efforts will be undertaken to characterize the reference sequences. Accurate annotation of these ancient genomes would be aided by a comprehensive repeat library; however, few studies have generated enough sequence to fully evaluate and catalog their non-genic content. In this paper, two sets of loblolly pine genomic sequence, 103 previously assembled BACs and 90,954 newly sequenced and assembled fosmid scaffolds, were analyzed. Together, this sequence represents 280 Mbp (roughly 1% of the loblolly pine genome) and one of the most comprehensive studies of repetitive elements and genes in a gymnosperm species. A combination of homology and de novo methodologies was applied to identify both conserved and novel repeats. Similarity analysis estimated a repetitive content of 27% that included both full and partial elements. When combined with the de novo investigation, the estimate increased to almost 86%. Over 60% of the repetitive sequence consists of full or partial LTR (long terminal repeat) retrotransposons. Through de novo approaches, 6,270 novel, full-length transposable element families and 9,415 sub-families were identified. Among those 6,270 families, 82% were annotated as single-copy. Several of the novel, high-copy families are described here, with the largest, PtPiedmont, comprising 133 full-length copies. In addition to repeats, analysis of the coding region reported 23 full-length eukaryotic orthologous proteins (KOGs) and another 29 novel or orthologous genes. These discoveries, along with other genomic resources, will be used to annotate conifer genomes and address long-standing questions about gymnosperm evolution.


Subjects
Chromosomes, Artificial, Bacterial/genetics ; Genome, Plant/genetics ; Pinus taeda/genetics ; Retroelements/genetics
...