Results 1 - 18 of 18
1.
Proc Natl Acad Sci U S A ; 113(18): 5053-8, 2016 May 03.
Article in English | MEDLINE | ID: mdl-27035985

ABSTRACT

Tardigrades are meiofaunal ecdysozoans that are key to understanding the origins of Arthropoda. Many species of Tardigrada can survive extreme conditions through cryptobiosis. In a recent paper [Boothby TC, et al. (2015) Proc Natl Acad Sci USA 112(52):15976-15981], the authors concluded that the tardigrade Hypsibius dujardini had an unprecedented proportion (17%) of genes originating through functional horizontal gene transfer (fHGT) and speculated that fHGT was likely formative in the evolution of cryptobiosis. We independently sequenced the genome of H. dujardini. As expected from whole-organism DNA sampling, our raw data contained reads from nontarget genomes. Filtering using metagenomics approaches generated a draft H. dujardini genome assembly of 135 Mb with superior assembly metrics to the previously published assembly. Additional microbial contamination likely remains. We found no support for extensive fHGT. Among 23,021 gene predictions we identified 0.2% strong candidates for fHGT from bacteria and 0.2% strong candidates for fHGT from nonmetazoan eukaryotes. Cross-comparison of assemblies showed that the overwhelming majority of HGT candidates in the Boothby et al. genome derived from contaminants. We conclude that fHGT into H. dujardini accounts for at most 1-2% of genes and that the proposal that one-sixth of tardigrade genes originate from functional HGT events is an artifact of undetected contamination.


Subject(s)
Horizontal Gene Transfer , Tardigrada/genetics , Animals , Arthropods/genetics , Genome , Molecular Sequence Data , Phylogeny
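The percentages quoted in the abstract of entry 1 can be turned into approximate gene counts. The sketch below is only illustrative arithmetic: the total (23,021 gene predictions) and the percentages come from the abstract, while the helper name `share_to_count` is hypothetical, and because the published percentages are rounded, the resulting counts are estimates rather than the authors' exact figures.

```python
# Total gene predictions and percentages are taken from the abstract.
# The helper name `share_to_count` is hypothetical (not from the paper).
TOTAL_GENE_PREDICTIONS = 23_021


def share_to_count(percent: float, total: int = TOTAL_GENE_PREDICTIONS) -> int:
    """Convert a percentage of the gene set into a rounded gene count."""
    return round(total * percent / 100)


# Strong fHGT candidates from bacteria and from nonmetazoan eukaryotes (0.2% each).
bacterial = share_to_count(0.2)
non_metazoan = share_to_count(0.2)

# Upper bound stated in the conclusion: at most 1-2% of genes.
upper_low = share_to_count(1)
upper_high = share_to_count(2)

print(bacterial, non_metazoan, upper_low, upper_high)
```

Under these assumptions, each 0.2% class corresponds to a few dozen genes, and the stated 1-2% ceiling to a few hundred.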
2.
Nucleic Acids Res ; 43(Database issue): D130-7, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25392425

ABSTRACT

The Rfam database (available at http://rfam.xfam.org) is a collection of non-coding RNA families represented by manually curated sequence alignments, consensus secondary structures and annotation gathered from corresponding Wikipedia, taxonomy and ontology resources. In this article, we detail updates and improvements to the Rfam data and website for the Rfam 12.0 release. We describe the upgrade of our search pipeline to use Infernal 1.1 and demonstrate its improved homology detection ability by comparison with the previous version. The new pipeline is easier for users to apply to their own data sets, and we illustrate its ability to annotate RNAs in genomic and metagenomic data sets of various sizes. Rfam has been expanded to include 260 new families, including the well-studied large subunit ribosomal RNA family, and for the first time includes information on short sequence- and structure-based RNA motifs present within families.


Subject(s)
Nucleic Acid Databases , Untranslated RNA/chemistry , Genomics , Internet , Molecular Sequence Annotation , Nucleic Acid Conformation , Nucleotide Motifs , Long Noncoding RNA/chemistry , Untranslated RNA/classification , Software
3.
Nucleic Acids Res ; 41(Database issue): D226-32, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23125362

ABSTRACT

The Rfam database (available via the website at http://rfam.sanger.ac.uk and through our mirror at http://rfam.janelia.org) is a collection of non-coding RNA families, primarily RNAs with a conserved RNA secondary structure, including both RNA genes and mRNA cis-regulatory elements. Each family is represented by a multiple sequence alignment, predicted secondary structure and covariance model. Here we discuss updates to the database in the latest release, Rfam 11.0, including the introduction of genome-based alignments for large families, the introduction of the Rfam Biomart as well as other user interface improvements. Rfam is available under the Creative Commons Zero license.


Subject(s)
Nucleic Acid Databases , Untranslated RNA/chemistry , Untranslated RNA/classification , Base Sequence , Genomics , Internet , Molecular Sequence Annotation , Nucleic Acid Conformation , Untranslated RNA/genetics , Sequence Alignment , User-Computer Interface
4.
Nucleic Acids Res ; 39(Database issue): D141-5, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21062808

ABSTRACT

The Rfam database aims to catalogue non-coding RNAs through the use of sequence alignments and statistical profile models known as covariance models. In this contribution, we discuss the pros and cons of using the online encyclopedia, Wikipedia, as a source of community-derived annotation. We discuss the addition of groupings of related RNA families into clans and new developments to the website. Rfam is available on the Web at http://rfam.sanger.ac.uk.


Subject(s)
Nucleic Acid Databases , Untranslated RNA/chemistry , Encyclopedias as Topic , Statistical Models , Nucleic Acid Conformation , Untranslated RNA/classification , Sequence Alignment , RNA Sequence Analysis
5.
Nat Genet ; 36(12): 1259-67, 2004 Dec.
Article in English | MEDLINE | ID: mdl-15543149

ABSTRACT

The phylum Nematoda occupies a huge range of ecological niches, from free-living microbivores to human parasites. We analyzed the genomic biology of the phylum using 265,494 expressed-sequence tag sequences, corresponding to 93,645 putative genes, from 30 species, including 28 parasites. From 35% to 70% of each species' genes had significant similarity to proteins from the model nematode Caenorhabditis elegans. More than half of the putative genes were unique to the phylum, and 23% were unique to the species from which they were derived. We have not yet come close to exhausting the genomic diversity of the phylum. We identified more than 2,600 different known protein domains, some of which had differential abundances between major taxonomic groups of nematodes. We also defined 4,228 nematode-specific protein families from nematode-restricted genes: this class of genes probably underpins species- and higher-level taxonomic disparity. Nematode-specific families are particularly interesting as drug and vaccine targets.


Subject(s)
Molecular Evolution , Expressed Sequence Tags , Genetic Variation , Genome , Nematoda/genetics , Tertiary Protein Structure/genetics , Animals , Base Sequence , Chromosome Mapping , Computational Biology , Conserved Sequence/genetics , Genetic Databases , Molecular Sequence Data , Phylogeny , DNA Sequence Analysis , Species Specificity
6.
Nucleic Acids Res ; 37(Database issue): D136-40, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18953034

ABSTRACT

Rfam is a collection of RNA sequence families, represented by multiple sequence alignments and covariance models (CMs). The primary aim of Rfam is to annotate new members of known RNA families on nucleotide sequences, particularly complete genomes, using sensitive BLAST filters in combination with CMs. A minority of families with a very broad taxonomic range (e.g. tRNA and rRNA) provide the majority of the sequence annotations, whilst the majority of Rfam families (e.g. snoRNAs and miRNAs) have a limited taxonomic range and provide a limited number of annotations. Recent improvements to the website, methodologies and data used by Rfam are discussed. Rfam is freely available on the Web at http://rfam.sanger.ac.uk/ and http://rfam.janelia.org/.


Subject(s)
Nucleic Acid Databases , RNA/chemistry , RNA/classification , Computer Graphics , Internet , Sequence Alignment , RNA Sequence Analysis
7.
RNA ; 14(12): 2462-4, 2008 Dec.
Article in English | MEDLINE | ID: mdl-18945806

ABSTRACT

The online encyclopedia Wikipedia has become one of the most important online references in the world and has a substantial and growing scientific content. A search of Google with many RNA-related keywords identifies a Wikipedia article as the top hit. We believe that the RNA community has an important and timely opportunity to maximize the content and quality of RNA information in Wikipedia. To this end, we have formed the RNA WikiProject (http://en.wikipedia.org/wiki/Wikipedia:WikiProject_RNA) as part of the larger Molecular and Cellular Biology WikiProject. We have created over 600 new Wikipedia articles describing families of noncoding RNAs based on the Rfam database, and invite the community to update, edit, and correct these articles. The Rfam database now redistributes this Wikipedia content as the primary textual annotation of its RNA families. Users can, therefore, for the first time, directly edit the content of one of the major RNA databases. We believe that this Wikipedia/Rfam link acts as a functioning model for incorporating community annotation into molecular biology databases.


Subject(s)
Nucleic Acid Databases , RNA/genetics , Database Management Systems , RNA/chemistry
8.
Nat Genet ; 52(7): 750, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32541926

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

9.
Methods Mol Biol ; 1269: 349-63, 2015.
Article in English | MEDLINE | ID: mdl-25577390

ABSTRACT

The primary task of the Rfam database is to collate experimentally validated noncoding RNA (ncRNA) sequences from the published literature and facilitate the prediction and annotation of new homologues in novel nucleotide sequences. We group homologous ncRNA sequences into "families" and related families are further grouped into "clans." We collate and manually curate data cross-references for these families from other databases and external resources. Our Web site offers researchers a simple interface to Rfam and provides tools with which to annotate their own sequences using our covariance models (CMs), through our tools for searching, browsing, and downloading information on Rfam families. In this chapter, we will work through examples of annotating a query sequence, collating family information, and searching for data.


Subject(s)
Computational Biology/methods , Untranslated RNA/chemistry , Nucleic Acid Databases , RNA Sequence Analysis , Software
10.
Mol Biochem Parasitol ; 137(2): 215-27, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15383292

ABSTRACT

Comparative nematode genomics has thus far been largely constrained to the genus Caenorhabditis, but a huge diversity of other nematode species, and genomes, exist. The Brugia malayi genome is approximately 100 Mb in size, and distributed across five chromosome pairs. Previous genomic investigations have included definition of major repeat classes and sequencing of selected genes. We have generated over 18,000 sequences from the ends of large-insert clones from bacterial artificial chromosome libraries. These end sequences, totalling over 10 Mb of sequence, contain just under 8 Mb of unique sequence. We identified the known Mbo I and Hha I repeat families in the sequence data, and also identified several new repeats based on their abundance. Genomic copies of 17% of B. malayi genes defined by expressed sequence tags have been identified. Nearly one quarter of end sequences can encode peptides with significant similarity to protein sequences in the public databases, and we estimate that we have identified more than 2700 new B. malayi genes. Importantly, 459 end sequences had homologues in other organisms, but lacked a match in the completely sequenced genomes of Caenorhabditis briggsae and Caenorhabditis elegans, emphasising the role of gene loss in genome evolution. B. malayi is estimated to have over 18,500 protein-coding genes.


Subject(s)
Brugia malayi/genetics , Helminth Genes , Animals , Caenorhabditis/genetics , Caenorhabditis elegans/genetics , Chromosome Mapping , Bacterial Artificial Chromosomes/genetics , Helminth DNA/genetics , Gene Conversion , Genome , Genomics , Molecular Sequence Data , Helminth RNA/genetics , Ribosomal RNA/genetics , Repetitive Nucleic Acid Sequences , Retroelements/genetics , Species Specificity
11.
Proc Biol Sci ; 271 Suppl 4: S189-92, 2004 May 07.
Article in English | MEDLINE | ID: mdl-15252980

ABSTRACT

A molecular survey technique was used to investigate the diversity of terrestrial tardigrades from three sites within Scotland. Ribosomal small subunit sequence was used to classify specimens into molecular operational taxonomic units (MOTU). Most MOTU were identified to the generic level using digital voucher photography. Thirty-two MOTU were defined, a surprising abundance given that the documented British fauna is 68 species. Some tardigrade MOTU were shared between the two rural collection sites, but no MOTU were found in both urban and rural sites, which conflicts with models of ubiquity of meiofaunal taxa. The patterns of relatedness of MOTU were particularly intriguing, with some forming clades with low levels of divergence, suggestive of taxon flocks. Some morphological taxa contained well-separated MOTU, perhaps indicating the existence of cryptic taxa. DNA sequence-based MOTU proved to be a revealing method for meiofaunal diversity studies.


Subject(s)
Biodiversity , Invertebrates/classification , Invertebrates/genetics , Phenotype , Phylogeny , Animals , Base Sequence , Cluster Analysis , Geography , Molecular Sequence Data , Scotland , DNA Sequence Analysis , Species Specificity
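The abstract of entry 11 describes grouping specimens into molecular operational taxonomic units (MOTU) from ribosomal small subunit sequence, but it does not state the grouping rule. The following is a minimal sketch of one plausible approach, assuming pre-aligned sequences of equal length and single-linkage clustering under a mismatch cutoff. The function names, the `max_diff` parameter, and the toy sequences are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def differences(a: str, b: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def cluster_motus(seqs: dict, max_diff: int) -> list:
    """Single-linkage clustering: specimens within max_diff mismatches share a MOTU.

    Assumption: the cutoff and linkage rule are illustrative, not the authors'.
    """
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if differences(seqs[a], seqs[b]) <= max_diff:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    # Sort clusters by their member names for a stable result.
    return sorted(groups.values(), key=lambda g: sorted(g))


# Toy, made-up aligned fragments (not real tardigrade sequences).
specimens = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAT",
    "s3": "TTGTACCTAA",
    "s4": "TTGTACCTAA",
}
motus = cluster_motus(specimens, max_diff=2)
print(motus)
```

With this toy input and cutoff, `s1` and `s2` fall into one cluster and `s3` and `s4` into another. On real data, the choice of cutoff would strongly affect how many MOTU are recovered.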
12.
Int J Parasitol ; 34(6): 733-46, 2004 May.
Article in English | MEDLINE | ID: mdl-15111095

ABSTRACT

The parasitic nematode, Brugia malayi, causes lymphatic filariasis in humans, which in severe cases leads to the condition known as elephantiasis. The parasite contains an endosymbiotic alpha-proteobacterium of the genus Wolbachia that is required for normal worm development and fecundity and is also implicated in the pathology associated with infections by these filarial nematodes. Bacterial artificial chromosome libraries were constructed from B. malayi DNA and provide over 11-fold coverage of the nematode genome. Wolbachia genomic fragments were simultaneously cloned into the libraries giving over 5-fold coverage of the 1.1 Mb bacterial genome. A physical framework for the Wolbachia genome was developed by construction of a plasmid library enriched for Wolbachia DNA as a source of sequences to hybridise to high-density bacterial artificial chromosome colony filters. Bacterial artificial chromosome end sequencing provided additional Wolbachia probe sequences to facilitate assembly of a contig that spanned the entire genome. The Wolbachia sequences provided a marker approximately every 10 kb. Four rare-cutting restriction endonucleases were used to restriction map the genome to a resolution of approximately 60 kb and demonstrate concordance between the bacterial artificial chromosome clones and native Wolbachia genomic DNA. Comparison of Wolbachia sequences to public databases using BLAST algorithms under stringent conditions allowed confident prediction of 69 Wolbachia peptide functions and two rRNA genes. Comparison to closely related complete genomes revealed that while most sequences had orthologs in the genome of the Wolbachia endosymbiont from Drosophila melanogaster, there was no evidence for long-range synteny. Rather, there were a few cases of short-range conservation of gene order extending over regions of less than 10 kb. The molecular scaffold produced for the genome of the Wolbachia from B. malayi forms the basis of a genomic sequencing effort for this bacterium, circumventing the difficult challenge of purifying sufficient endosymbiont DNA from a tropical parasite for a whole genome shotgun sequencing strategy.


Subject(s)
Brugia malayi/genetics , Chromosome Mapping/methods , Bacterial Artificial Chromosomes/genetics , Wolbachia/genetics , Animals , Base Sequence , Contig Mapping/methods , Bacterial DNA/genetics , Protozoan DNA/genetics , Bacterial Genome , Protozoan Genome , Genomic Library , Molecular Weight , Plasmids , Bacterial RNA/genetics , Ribosomal RNA/genetics , Restriction Mapping/methods , DNA Sequence Analysis/methods , Symbiosis/genetics
13.
Trans R Soc Trop Med Hyg ; 96(1): 7-17, 2002.
Article in English | MEDLINE | ID: mdl-11925998

ABSTRACT

To advance and facilitate molecular studies of Brugia malayi, one of the causative agents of human lymphatic filariasis, an expressed sequence tag (EST)-based gene discovery programme has been carried out. Over 22,000 ESTs have been produced and deposited in the public databases by a consortium of laboratories from endemic and non-endemic countries. The ESTs have been analysed using custom informatic tools to reveal patterns of individual gene expression that may point to potential targets for future research on anti-filarial drugs and vaccines. Many genes first discovered as ESTs are now being analysed by researchers for immunodiagnostic, vaccine and drug target potential. Building on the success of the B. malayi EST programme, significant EST datasets are being generated for a number of other major parasites of humans and domesticated animals, and model parasitic species.


Subject(s)
Brugia malayi/genetics , Expressed Sequence Tags , Protozoan Genome , Animals , Brugia malayi/parasitology , Conserved Sequence , Bacterial Genome , Genomic Library , Symbiosis , Wolbachia/genetics
14.
Methods Mol Biol ; 270: 75-92, 2004.
Article in English | MEDLINE | ID: mdl-15153623

ABSTRACT

Generating expressed sequence tags is a simple, cheap, and efficient way to sample the genome of a target organism. An expressed sequence tag (EST) is a single-pass sequence derived from a single complementary DNA (cDNA) clone, and the sequence serves to identify the gene from which it derives. We present a set of tested laboratory protocols for setting up and performing an EST analysis of any chosen species. These medium-throughput protocols do not require dedicated genomics equipment, such as robots, and focus on the use of microtiter plates and multichannels. Using these protocols, a single competent research worker should be able to generate 2000 ESTs in 1 month. In a nonnormalized library, these 2000 ESTs should identify between 1000 and 1500 different genes, and thus possibly between 10 and 20% of the genes of any target parasite.


Subject(s)
Expressed Sequence Tags , Base Sequence , Molecular Cloning , Complementary DNA , Polymerase Chain Reaction
15.
Genome Biol Evol ; 2: 425-40, 2010 Jul 12.
Article in English | MEDLINE | ID: mdl-20624745

ABSTRACT

Ecdysozoa is the recently recognized clade of molting animals that comprises the vast majority of extant animal species and the most important invertebrate model organisms--the fruit fly and the nematode worm. Evolutionary relationships within the ecdysozoans remain, however, unresolved, impairing the correct interpretation of comparative genomic studies. In particular, the affinities of the three Panarthropoda phyla (Arthropoda, Onychophora, and Tardigrada) and the position of Myriapoda within Arthropoda (Mandibulata vs. Myriochelata hypothesis) are among the most contentious issues in animal phylogenetics. To elucidate these relationships, we have determined and analyzed complete or nearly complete mitochondrial genome sequences of two Tardigrada, Hypsibius dujardini and Thulinia sp. (the first genomes to date for this phylum); one Priapulida, Halicryptus spinulosus; and two Onychophora, Peripatoides sp. and Epiperipatus biolleyi; and a partial mitochondrial genome sequence of the Onychophora Euperipatoides kanagrensis. Tardigrada mitochondrial genomes resemble those of the arthropods in terms of the gene order and strand asymmetry, whereas Onychophora genomes are characterized by numerous gene order rearrangements and strand asymmetry variations. In addition, Onychophora genomes are extremely enriched in A and T nucleotides, whereas Priapulida and Tardigrada are more balanced. Phylogenetic analyses based on concatenated amino acid coding sequences support a monophyletic origin of the Ecdysozoa and the position of Priapulida as the sister group of a monophyletic Panarthropoda (Tardigrada plus Onychophora plus Arthropoda). The position of Tardigrada is more problematic, most likely because of long branch attraction (LBA). However, experiments designed to reduce LBA suggest that the most likely placement of Tardigrada is as a sister group of Onychophora. The same analyses also recover monophyly of traditionally recognized arthropod lineages such as Arachnida and of the highly debated clade Mandibulata.


Subject(s)
Arthropods/classification , Arthropods/genetics , Molecular Evolution , Mitochondrial Genome , Invertebrates/classification , Invertebrates/genetics , Animals , Arachnida/classification , Arachnida/genetics , Base Composition , Mitochondrial DNA/chemistry , Mitochondrial DNA/genetics , Gene Order , Gene Rearrangement , Genetic Models , Nematoda/classification , Nematoda/genetics , Phylogeny , Tardigrada/classification , Tardigrada/genetics
16.
Genome Res ; 19(7): 1202-13, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19363216

ABSTRACT

The Apicomplexa are a group of phylogenetically related parasitic protists that include Plasmodium, Cryptosporidium, and Toxoplasma. Together they are a major global burden on human health and economics. To meet this challenge, several international consortia have generated vast amounts of sequence data for many of these parasites. Here, we exploit these data to perform a systematic analysis of protein family and domain incidence across the phylum. A total of 87,736 protein sequences were collected from 15 apicomplexan species. These were compared with three protein databases, including the partial genome database, PartiGeneDB, which increases the breadth of taxonomic coverage. From these searches we constructed taxonomic profiles that reveal the extent of apicomplexan sequence diversity. Sequences without a significant match outside the phylum were denoted as apicomplexan specialized. These were collated into 9134 discrete protein families and placed in the context of the apicomplexan phylogeny, identifying the putative origin of each family. Most apicomplexan families were associated with an individual genus or species. Interestingly, many genera-specific innovations were associated with specialized host cell invasion and/or parasite survival processes. Contrastingly, those families reflecting more ancestral relationships were enriched in generalized housekeeping functions such as translation and transcription, which have diverged within the apicomplexan lineage. Protein domain searches revealed 192 domains not previously reported in apicomplexans together with a number of novel domain combinations. We highlight domains that may be important to parasite survival.


Subject(s)
Apicomplexa/genetics , Protozoan Proteins/genetics , Animals , Molecular Evolution , Humans , Phylogeny , Tertiary Protein Structure
17.
Science ; 317(5845): 1756-60, 2007 Sep 21.
Article in English | MEDLINE | ID: mdl-17885136

ABSTRACT

Parasitic nematodes that cause elephantiasis and river blindness threaten hundreds of millions of people in the developing world. We have sequenced the approximately 90 megabase (Mb) genome of the human filarial parasite Brugia malayi and predict approximately 11,500 protein coding genes in 71 Mb of robustly assembled sequence. Comparative analysis with the free-living, model nematode Caenorhabditis elegans revealed that, despite these genes having maintained little conservation of local synteny during approximately 350 million years of evolution, they largely remain in linkage on chromosomal units. More than 100 conserved operons were identified. Analysis of the predicted proteome provides evidence for adaptations of B. malayi to niches in its human and vector hosts and insights into the molecular basis of a mutualistic relationship with its Wolbachia endosymbiont. These findings offer a foundation for rational drug design.


Subject(s)
Brugia malayi/genetics , Helminth Genome , Animals , Brugia malayi/physiology , Caenorhabditis/genetics , Drosophila melanogaster/genetics , Drug Resistance/genetics , Filariasis/parasitology , Humans , Molecular Sequence Data
18.
Genome Biol ; 5(6): R39, 2004.
Article in English | MEDLINE | ID: mdl-15186490

ABSTRACT

BACKGROUND: Parasitism is a highly successful mode of life and one that requires suites of gene adaptations to permit survival within a potentially hostile host. Among such adaptations is the secretion of proteins capable of modifying or manipulating the host environment. Nippostrongylus brasiliensis is a well-studied model nematode parasite of rodents, which secretes products known to modulate host immunity. RESULTS: Taking a genomic approach to characterize potential secreted products, we analyzed expressed sequence tag (EST) sequences for putative amino-terminal secretory signals. We sequenced ESTs from a cDNA library constructed by oligo-capping to select full-length cDNAs, as well as from conventional cDNA libraries. SignalP analysis was applied to predicted open reading frames, to identify potential signal peptides and anchors. Among 1,234 ESTs, 197 (~16%) contain predicted 5' signal sequences, with 176 classified as conventional signal peptides and 21 as signal anchors. ESTs cluster into 742 distinct genes, of which 135 (18%) bear predicted signal-sequence coding regions. Comparisons of clusters with homologs from Caenorhabditis elegans and more distantly related organisms reveal that the majority (65% at P < e-10) of signal peptide-bearing sequences from N. brasiliensis show no similarity to previously reported genes, and less than 10% align to conserved genes recorded outside the phylum Nematoda. Of all novel sequences identified, 32% contained predicted signal peptides, whereas this was the case for only 3.4% of conserved genes with sequence homologies beyond the Nematoda. CONCLUSIONS: These results indicate that secreted proteins may be undergoing accelerated evolution, either because of relaxed functional constraints, or in response to stronger selective pressure from host immunity.


Subject(s)
Molecular Evolution , Expressed Sequence Tags , Helminth Proteins/metabolism , Nippostrongylus/genetics , Parasites/metabolism , Protein Sorting Signals/genetics , Protein Sequence Analysis/methods , Animals , Caenorhabditis elegans Proteins/genetics , Conserved Sequence/genetics , Helminth Proteins/genetics , Genetic Selection , Nucleic Acid Sequence Homology , Trans-Splicing/genetics
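The proportions reported in the RESULTS of entry 18 can be re-derived from the counts given in the same abstract. The sketch below is a simple consistency check. All numbers come from the abstract, while the variable names and the `percent` helper are hypothetical.

```python
# Counts taken from the abstract of entry 18; names are illustrative only.
ests_total = 1234
signal_peptides = 176
signal_anchors = 21
ests_with_signal = signal_peptides + signal_anchors  # reported as 197

clusters_total = 742
clusters_with_signal = 135


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


est_pct = percent(ests_with_signal, ests_total)              # abstract: ~16%
cluster_pct = percent(clusters_with_signal, clusters_total)  # abstract: 18%

print(ests_with_signal, est_pct, cluster_pct)
```

Both recomputed values agree with the rounded percentages stated in the abstract, and the two signal classes sum to the reported 197 ESTs.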