Results 1 - 20 of 24
1.
Nucleic Acids Res ; 51(D1): D445-D451, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36350662

ABSTRACT

OrthoDB provides evolutionary and functional annotations of genes in a diverse sampling of eukaryotes, prokaryotes, and viruses. Genomics continues to accelerate our exploration of gene diversity and orthology is the most precise way of bridging gene functional knowledge with the rapidly expanding universe of genomic sequences. OrthoDB samples the most diverse organisms with the best quality genomics data to provide the leading coverage of species diversity. This update of the underlying data to over 18 000 prokaryotes and almost 2000 eukaryotes with over 100 million genes propels the coverage to another level. This achievement also demonstrates the scalability of the underlying OrthoLoger software for delineation of orthologs, freely available from https://orthologer.ezlab.org. In addition to the ab-initio computations of gene orthology used for the OrthoDB release, the OrthoLoger software allows mapping of novel gene sets to precomputed orthologs and thereby links to their annotations. The LEMMI-style benchmarking of OrthoLoger ensures its state-of-the-art performance and is available from https://lemortho.ezlab.org. The OrthoDB web interface has been further developed to include a pairwise orthology view from any gene to any other sampled species. OrthoDB-computed evolutionary annotations as well as extensively collated functional annotations can be accessed via REST API or SPARQL/RDF, downloaded or browsed online from https://www.orthodb.org.


Subjects
Genetic Databases, Molecular Evolution, Eukaryota/genetics, Genomics, Biological Evolution, Software, Molecular Sequence Annotation
2.
Nucleic Acids Res ; 49(D1): D389-D393, 2021 01 08.
Article in English | MEDLINE | ID: mdl-33196836

ABSTRACT

OrthoDB provides evolutionary and functional annotations of orthologs, inferred for a vast number of available organisms. OrthoDB is leading in the coverage and genomic diversity sampling of Eukaryotes, Prokaryotes and Viruses, and the sampling of Bacteria is further set to increase three-fold. The user interface has been enhanced in response to the massive growth in data. OrthoDB provides three views on the data: (i) a list of orthologous groups related to a user query, which are now arranged to visualize their hierarchical relations, (ii) a detailed view of an orthologous group, now featuring a Sankey diagram to facilitate navigation between the levels of orthology, from more finely-resolved to more general groups of orthologs, as well as an arrangement of orthologs into an interactive organism taxonomy structure, and (iii) a newly added gene-centric view, showing the gene functional annotations and the pair-wise orthologs in example species. The OrthoDB standalone software for delineation of orthologs, Orthologer, is freely available. Online BUSCO assessments and mapping to OrthoDB of user-uploaded data enable interactive exploration of related annotations and generation of comparative charts. OrthoDB strives to predict orthologs from the broadest coverage of species, as well as to extensively collate available functional annotations, and to compute evolutionary annotations such as evolutionary rate and phyletic profile. OrthoDB data can be accessed via SPARQL RDF or REST API, downloaded, or browsed online from https://orthodb.org.


Subjects
Genetic Databases, Molecular Evolution, Molecular Sequence Annotation, Nucleic Acid Sequence Homology, Animals, Software, User-Computer Interface
3.
Nucleic Acids Res ; 47(D1): D807-D811, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30395283

ABSTRACT

OrthoDB (https://www.orthodb.org) provides evolutionary and functional annotations of orthologs. This update features a major scaling up of the resource coverage, sampling the genomic diversity of 1271 eukaryotes, 6013 prokaryotes and 6488 viruses. These include putative orthologs among 448 metazoan, 117 plant, 549 fungal, 148 protist, 5609 bacterial, and 404 archaeal genomes, picking up the best sequenced and annotated representatives for each species or operational taxonomic unit. OrthoDB relies on a concept of hierarchy of levels-of-orthology to enable more finely resolved gene orthologies for more closely related species. Since orthologs are the most likely candidates to retain functions of their ancestor gene, OrthoDB is aimed at narrowing down hypotheses about gene functions and enabling comparative evolutionary studies. Optional registered-user sessions allow on-line BUSCO assessments of gene set completeness and mapping of the uploaded data to OrthoDB to enable further interactive exploration of related annotations and generation of comparative charts. The accelerating expansion of genomics data continues to add valuable information, and OrthoDB strives to provide orthologs from the broadest coverage of species, as well as to extensively collate available functional annotations and to compute evolutionary annotations. The data can be browsed online, downloaded, or accessed via REST API or SPARQL RDF compatible with both UniProt and Ensembl.


Subjects
Genetic Databases, Molecular Evolution, Genomics/trends, Molecular Sequence Annotation, Animals, Eukaryota/genetics, Genetic Variation, Bacterial Genome/genetics, Fungal Genome/genetics, Plant Genome/genetics, Viral Genome/genetics, Phylogeny, Software
4.
Environ Microbiol ; 20(6): 2288-2300, 2018 06.
Article in English | MEDLINE | ID: mdl-30014616

ABSTRACT

Antibiotic resistance is increasing among pathogens, and the human microbiome contains a reservoir of antibiotic resistance genes. Acidaminococcus intestini is the first Negativicute bacterium (Gram-negative Firmicute) shown to be resistant to beta-lactam antibiotics. Resistance is conferred by the aci1 gene, but its evolutionary history and prevalence remain obscure. We discovered that ACI-1 proteins are phylogenetically distinct from beta-lactamases of Gram-positive Firmicutes and that aci1 occurs in bacteria scattered across the Negativicute clade, suggesting lateral gene transfer. In the reference A. intestini RyC-MR95 genome, we found that transposons residing within a tailed prophage context are likely vehicles for aci1's mobility. We found aci1 in 56 (4.4%) of 1,267 human gut metagenomes, mostly hosted within A. intestini, and, where it could be determined, mostly within a consistent mobile element constellation. These samples are from Europe, China and the USA, showing that aci1 is distributed globally. We found that for most Negativicute assemblies with aci1, the prophage observed in A. intestini is absent, but in all cases aci1 is flanked by varying transposons. The chimeric mobile elements we identify here likely have a complex evolutionary history and potentially provide multiple complementary mechanisms for antibiotic resistance gene transfer both within and between cells.


Subjects
Bacteria/metabolism, Bacterial Drug Resistance/genetics, Gastrointestinal Microbiome, Prophages/genetics, beta-Lactamases/metabolism, Anti-Bacterial Agents/pharmacology, Bacteria/classification, Bacteria/drug effects, Bacteria/genetics, China, Europe, Firmicutes/genetics, Horizontal Gene Transfer, Humans, Metagenome, Phylogeny, United States, beta-Lactamases/genetics
5.
Mol Biol Evol ; 35(3): 543-548, 2018 Mar 01.
Article in English | MEDLINE | ID: mdl-29220515

ABSTRACT

Genomics promises comprehensive surveying of genomes and metagenomes, but rapidly changing technologies and expanding data volumes make evaluation of completeness a challenging task. Technical sequencing quality metrics can be complemented by quantifying completeness of genomic data sets in terms of the expected gene content of Benchmarking Universal Single-Copy Orthologs (BUSCO, http://busco.ezlab.org). The latest software release implements a complete refactoring of the code to make it more flexible and extendable to facilitate high-throughput assessments. The original six lineage assessment data sets have been updated with improved species sampling, 34 new subsets have been built for vertebrates, arthropods, fungi, and prokaryotes that greatly enhance resolution, and data sets are now also available for nematodes, protists, and plants. Here, we present BUSCO v3 with example analyses that highlight the wide-ranging utility of BUSCO assessments, which extend beyond quality control of genomics data sets to applications in comparative genomics analyses, gene predictor training, metagenomics, and phylogenomics.
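
As an illustration of how a BUSCO-style completeness assessment is usually summarised, the following minimal sketch turns category counts into percentages. It assumes the four standard BUSCO categories (complete single-copy, complete duplicated, fragmented, missing) and mirrors the one-line notation commonly seen in BUSCO reports. The function name `busco_summary` and the example counts are hypothetical and are not taken from the abstract or from the BUSCO code base.

```python
def busco_summary(single, duplicated, fragmented, missing):
    """Format BUSCO-style completeness percentages from category counts.

    Hypothetical helper for illustration only; it is not part of BUSCO.
    """
    total = single + duplicated + fragmented + missing
    if total == 0:
        raise ValueError("at least one BUSCO group is required")

    def pct(count):
        # Percentage of all searched BUSCO groups, one decimal place.
        return f"{100 * count / total:.1f}%"

    complete = single + duplicated  # "complete" covers both sub-categories
    return (
        f"C:{pct(complete)}[S:{pct(single)},D:{pct(duplicated)}],"
        f"F:{pct(fragmented)},M:{pct(missing)},n:{total}"
    )


# Made-up counts for a lineage data set of 1000 groups.
print(busco_summary(single=930, duplicated=20, fragmented=30, missing=20))
```

A high duplicated fraction in such a summary can hint at redundancy in an assembly, while a high missing fraction points to incompleteness.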

6.
Nucleic Acids Res ; 45(D1): D744-D749, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899580

ABSTRACT

OrthoDB is a comprehensive catalog of orthologs, genes inherited by extant species from a single gene in their last common ancestor. In 2016 OrthoDB reached its 9th release, growing to over 22 million genes from over 5000 species, now adding plants, archaea and viruses. In this update we focused on usability of this fast-growing wealth of data: updating the user and programmatic interfaces to browse and query the data, and further enhancing the already extensive integration of available gene functional annotations. Collating functional annotations from over 100 resources enabled us to propose descriptive titles for 87% of ortholog groups. Additionally, OrthoDB continues to provide computed evolutionary annotations and to allow user queries by sequence homology. The OrthoDB resource now enables users to generate publication-quality comparative genomics charts, as well as to upload, analyze and interactively explore their own private data. OrthoDB is available from http://orthodb.org.


Subjects
Computational Biology/methods, Genetic Databases, Molecular Evolution, Genomics/methods, Algorithms, Animals, Archaea/genetics, Bacteria/genetics, Fungi/genetics, Molecular Sequence Annotation, Plants/genetics, Software, User-Computer Interface, Viruses/genetics, Web Browser
7.
Bioinformatics ; 31(19): 3210-2, 2015 Oct 01.
Article in English | MEDLINE | ID: mdl-26059717

ABSTRACT

MOTIVATION: Genomics has revolutionized biological research, but quality assessment of the resulting assembled sequences is complicated and remains mostly limited to technical measures like N50. RESULTS: We propose a measure for quantitative assessment of genome assembly and annotation completeness based on evolutionarily informed expectations of gene content. We implemented the assessment procedure in open-source software, with sets of Benchmarking Universal Single-Copy Orthologs, named BUSCO. AVAILABILITY AND IMPLEMENTATION: Software implemented in Python and datasets available for download from http://busco.ezlab.org. CONTACT: evgeny.zdobnov@unige.ch SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods, Gene Dosage/genetics, Genome, Genomics/methods, Molecular Sequence Annotation/methods, Software, Animals, Humans
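
The N50 statistic that the abstract above cites as a typical technical measure can be computed in a few lines. This is a minimal sketch of the standard definition: the largest contig length L such that contigs of length at least L together cover at least half of the assembly. The function name `n50` and the example lengths are illustrative assumptions.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("contig_lengths must be non-empty")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first contig that brings coverage to half the assembly.
        if running * 2 >= total:
            return length


# Made-up contig lengths; the two longest contigs already cover half of 32.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

N50 describes contiguity only. It says nothing about whether the expected genes are actually present, which is the gap a gene-content measure such as BUSCO is meant to fill.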
8.
Nucleic Acids Res ; 43(Database issue): D250-6, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25428351

ABSTRACT

Orthology, refining the concept of homology, is the cornerstone of evolutionary comparative studies. With the ever-increasing availability of genomic data, inference of orthology has become instrumental for generating hypotheses about gene functions crucial to many studies. This update of the OrthoDB hierarchical catalog of orthologs (http://www.orthodb.org) covers 3027 complete genomes, including the most comprehensive set of 87 arthropods, 61 vertebrates, 227 fungi and 2627 bacteria (sampling the most complete and representative genomes from over 11,000 available). In addition to the most extensive integration of functional annotations from UniProt, InterPro, GO, OMIM, model organism phenotypes and COG functional categories, OrthoDB uniquely provides evolutionary annotations including rates of ortholog sequence divergence, copy-number profiles, sibling groups and gene architectures. We re-designed the entirety of the OrthoDB website from the underlying technology to the user interface, enabling the user to specify species of interest and to select the relevant orthology level by the NCBI taxonomy. The text searches allow use of complex logic with various identifiers of genes, proteins, domains, ontologies or annotation keywords and phrases. Gene copy-number profiles can also be queried. This release comes with the freely available underlying ortholog clustering pipeline (http://www.orthodb.org/software).


Subjects
Genetic Databases, Sequence Homology, Algorithms, Animals, Data Curation, Eukaryota/genetics, Molecular Evolution, Microbial Genome, Humans, Software
9.
Genome Res ; 23(8): 1235-47, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23636946

ABSTRACT

Genomes of eusocial insects code for dramatic examples of phenotypic plasticity and social organization. We compared the genomes of seven ants, the honeybee, and various solitary insects to examine whether eusocial lineages share distinct features of genomic organization. Each ant lineage contains ∼4000 novel genes, but only 64 of these genes are conserved among all seven ants. Many gene families have been expanded in ants, notably those involved in chemical communication (e.g., desaturases and odorant receptors). Alignment of the ant genomes revealed reduced purifying selection compared with Drosophila without significantly reduced synteny. Correspondingly, ant genomes exhibit dramatic divergence of noncoding regulatory elements; however, extant conserved regions are enriched for novel noncoding RNAs and transcription factor-binding sites. Comparison of orthologous gene promoters between eusocial and solitary species revealed significant regulatory evolution in both cis (e.g., Creb) and trans (e.g., fork head) for nearly 2000 genes, many of which exhibit phenotypic plasticity. Our results emphasize that genomic changes can occur remarkably fast in ants, because two recently diverged leaf-cutter ant species exhibit faster accumulation of species-specific genes and greater divergence in regulatory elements compared with other ants or Drosophila. Thus, while the "socio-genomes" of ants and the honeybee are broadly characterized by a pervasive pattern of divergence in gene composition and regulation, they preserve lineage-specific regulatory features linked to eusociality. We propose that changes in gene regulation played a key role in the origins of insect eusociality, whereas changes in gene composition were more relevant for lineage-specific eusocial adaptations.


Subjects
Ants/genetics, Insect Genome, Animals, Animal Behavior, Binding Sites, Conserved Sequence, DNA Methylation, Molecular Evolution, Gene Expression Regulation, Hymenoptera/genetics, Insect Proteins/genetics, MicroRNAs/genetics, Genetic Models, Phylogeny, Nucleic Acid Regulatory Sequences, DNA Sequence Analysis, Social Behavior, Species Specificity, Synteny, Transcription Factors/genetics
10.
Nucleic Acids Res ; 41(Database issue): D358-65, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23180791

ABSTRACT

The concept of orthology provides a foundation for formulating hypotheses on gene and genome evolution, and thus forms the cornerstone of comparative genomics, phylogenomics and metagenomics. We present the update of OrthoDB, the hierarchical catalog of orthologs (http://www.orthodb.org). From its conception, OrthoDB promoted delineation of orthologs at varying resolution by explicitly referring to the hierarchy of species radiations, now also adopted by other resources. The current release provides comprehensive coverage of animals and fungi representing 252 eukaryotic species, and is now extended to prokaryotes with the inclusion of 1115 bacteria. Functional annotations of orthologous groups are provided through mapping to InterPro, GO, OMIM and model organism phenotypes, with cross-references to major resources including UniProt, NCBI and FlyBase. Uniquely, OrthoDB provides computed evolutionary traits of orthologs, such as gene duplicability and loss profiles, divergence rates and sibling groups, now extended with exon-intron architectures, syntenic orthologs and parent-child trees. The interactive web interface allows navigation along the species phylogenies, complex queries with various identifiers, annotation keywords and phrases, as well as with gene copy-number profiles and sequence homology searches. With the explosive growth of available data, OrthoDB also provides mapping of newly sequenced genomes and transcriptomes to the current orthologous groups.


Subjects
Genetic Databases, Bacterial Genes, Fungal Genes, Genes, Animals, Cluster Analysis, Molecular Evolution, Humans, Internet, Mice, Molecular Sequence Annotation, Phenotype, Phylogeny, Synteny
11.
Science ; 331(6017): 555-61, 2011 Feb 04.
Article in English | MEDLINE | ID: mdl-21292972

ABSTRACT

We describe the draft genome of the microcrustacean Daphnia pulex, which is only 200 megabases and contains at least 30,907 genes. The high gene count is a consequence of an elevated rate of gene duplication resulting in tandem gene clusters. More than a third of Daphnia's genes have no detectable homologs in any other available proteome, and the most amplified gene families are specific to the Daphnia lineage. The coexpansion of gene families interacting within metabolic pathways suggests that the maintenance of duplicated genes is not random, and the analysis of gene expression under different environmental conditions reveals that numerous paralogs acquire divergent expression patterns soon after duplication. Daphnia-specific genes, including many additional loci within sequenced regions that are otherwise devoid of annotations, are the most responsive genes to ecological challenges.


Subjects
Daphnia/genetics, Ecosystem, Genome, Physiological Adaptation, Amino Acid Sequence, Animals, Base Sequence, Chromosome Mapping, Daphnia/physiology, Environment, Molecular Evolution, Gene Conversion, Gene Duplication, Gene Expression, Gene Expression Profiling, Gene Expression Regulation, Genes, Duplicate Genes, Metabolic Networks and Pathways/genetics, Molecular Sequence Annotation, Molecular Sequence Data, Multigene Family, Phylogeny, DNA Sequence Analysis
12.
Nucleic Acids Res ; 39(Database issue): D283-8, 2011 Jan.
Article in English | MEDLINE | ID: mdl-20972218

ABSTRACT

The concept of homology drives speculation on a gene's function in any given species when its biological roles in other species are characterized. With reference to a specific species radiation, homologous relations define orthologs, i.e. descendants from a single gene of the ancestor. The large-scale delineation of gene genealogies is a challenging task, and the numerous approaches to the problem reflect the importance of the concept of orthology as a cornerstone for comparative studies. Here, we present the updated OrthoDB catalog of eukaryotic orthologs delineated at each radiation of the species phylogeny in an explicitly hierarchical manner of over 100 species of vertebrates, arthropods and fungi (including the metazoa level). New database features include functional annotations, and quantification of evolutionary divergence and relations among orthologous groups. The interface features extended phyletic profile querying and enhanced text-based searches. The ever-increasing sampling of sequenced eukaryotic genomes brings a clearer account of the majority of gene genealogies that will facilitate informed hypotheses of gene function in newly sequenced genomes. Furthermore, uniform analysis across lineages as different as vertebrates, arthropods and fungi with divergence levels varying from several to hundreds of millions of years will provide essential data for uncovering and quantifying long-term trends of gene evolution. OrthoDB is freely accessible from http://cegg.unige.ch/orthodb.


Subjects
Genetic Databases, Phylogeny, Amino Acid Sequence Homology, Animals, Arthropods/genetics, Drosophila melanogaster/genetics, Molecular Evolution, Fungi/genetics, Genes, Genomics, Mice, Molecular Sequence Annotation, Tertiary Protein Structure, Proteins/genetics, Saccharomyces cerevisiae/genetics, Vertebrates/genetics
13.
Genome Biol Evol ; 3: 75-86, 2011.
Article in English | MEDLINE | ID: mdl-21148284

ABSTRACT

Delineating ancestral gene relations among a large set of sequenced eukaryotic genomes allowed us to rigorously examine links between evolutionary and functional traits. We classified 86% of over 1.36 million protein-coding genes from 40 vertebrates, 23 arthropods, and 32 fungi into orthologous groups and linked over 90% of them to Gene Ontology or InterPro annotations. Quantifying properties of ortholog phyletic retention, copy-number variation, and sequence conservation, we examined correlations with gene essentiality and functional traits. More than half of vertebrate, arthropod, and fungal orthologs are universally present across each lineage. These universal orthologs are preferentially distributed in groups with almost all single-copy or all multicopy genes, and sequence evolution of the predominantly single-copy orthologous groups is markedly more constrained. Essential genes from representative model organisms, Mus musculus, Drosophila melanogaster, and Saccharomyces cerevisiae, are significantly enriched in universal orthologs within each lineage, and essential-gene-containing groups consistently exhibit greater sequence conservation than those without. This study of eukaryotic gene repertoire evolution identifies shared fundamental principles and highlights lineage-specific features; it also confirms that essential genes are highly retained and conclusively supports the "knockout-rate prediction" of stronger constraints on essential gene sequence evolution. However, the distinction between sequence conservation of single- versus multicopy orthologs is quantitatively more prominent than between orthologous groups with and without essential genes. The previously underappreciated difference in the tolerance of gene duplications and contrasting evolutionary modes of "single-copy control" versus "multicopy license" may reflect a major evolutionary mechanism that allows extended exploration of gene sequence space.


Subjects
Arthropods/genetics, Molecular Evolution, Fungi/genetics, Gene Duplication, Essential Genes, Vertebrates/genetics, Animals, Arthropods/classification, Computational Biology, Fungi/classification, Genome, Phylogeny, Proteome, Quantitative Trait Loci, Vertebrates/classification
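
The two properties this abstract contrasts, phyletic retention (is a group present in every species?) and copy number (single-copy versus multicopy), can be illustrated with a small classifier. This is only a sketch under strict assumptions: the study speaks of groups with "almost all" single-copy genes, whereas the code below requires every present species to agree. The function name `classify_group`, the input layout and the species labels are hypothetical.

```python
def classify_group(copy_counts, all_species):
    """Classify one orthologous group by retention and copy number.

    copy_counts maps species name -> number of genes in the group.
    Returns (is_universal, copy_class), where copy_class is one of
    "single-copy", "multicopy" or "mixed".
    """
    present = {sp for sp, n in copy_counts.items() if n > 0}
    if not present:
        raise ValueError("group has no genes in any species")

    # Universal: the group has at least one gene in every sampled species.
    is_universal = present == set(all_species)

    if all(copy_counts[sp] == 1 for sp in present):
        copy_class = "single-copy"
    elif all(copy_counts[sp] > 1 for sp in present):
        copy_class = "multicopy"
    else:
        copy_class = "mixed"
    return is_universal, copy_class


species = ["mouse", "fly", "yeast"]  # illustrative labels only
print(classify_group({"mouse": 1, "fly": 1, "yeast": 1}, species))
print(classify_group({"mouse": 2, "fly": 3}, species))
```

A relaxed variant could tolerate a small fraction of exceptions per group, which would be closer to the "almost all" wording of the abstract.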
14.
Proc Natl Acad Sci U S A ; 107(27): 12168-73, 2010 Jul 06.
Article in English | MEDLINE | ID: mdl-20566863

ABSTRACT

As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb. Despite its status as an obligate parasite, it retains a remarkably complete basal insect repertoire of 10,773 protein-coding genes and 57 microRNAs. Representing hemimetabolous insects, the genome of the body louse thus provides a reference for studies of holometabolous insects. Compared with other insect genomes, the body louse genome contains significantly fewer genes associated with environmental sensing and response, including odorant and gustatory receptors and detoxifying enzymes. The unique architecture of the 18 minicircular mitochondrial chromosomes of the body louse may be linked to the loss of the gene encoding the mitochondrial single-stranded DNA binding protein. The genome of the obligatory louse endosymbiont Candidatus Riesia pediculicola encodes fewer than 600 genes on a short, linear chromosome and a circular plasmid. The plasmid harbors a unique arrangement of genes required for the synthesis of pantothenate, an essential vitamin deficient in the louse diet. The human body louse, its primary endosymbiont, and the bacterial pathogens that it vectors all possess genomes reduced in size compared with their free-living close relatives. Thus, the body louse genome project offers unique information and tools to use in advancing understanding of coevolution among vectors, symbionts, and pathogens.


Subjects
Bacterial Genome/genetics, Insect Genome/genetics, Pediculus/genetics, Pediculus/microbiology, Animals, Enterobacteriaceae/genetics, Bacterial Genes/genetics, Insect Genes/genetics, Genomics/methods, Humans, Lice Infestations/parasitology, Molecular Sequence Data, DNA Sequence Analysis, Symbiosis
15.
Genome Biol ; 10(4): R43, 2009.
Article in English | MEDLINE | ID: mdl-19393040

ABSTRACT

BACKGROUND: The newly assembled Bos taurus genome sequence enables the linkage of bovine milk and lactation data with other mammalian genomes. RESULTS: Using publicly available milk proteome data and mammary expressed sequence tags, 197 milk protein genes and over 6,000 mammary genes were identified in the bovine genome. Intersection of these genes with 238 milk production quantitative trait loci curated from the literature decreased the search space for milk trait effectors by more than an order of magnitude. Genome location analysis revealed a tendency for milk protein genes to be clustered with other mammary genes. Using the genomes of a monotreme (platypus), a marsupial (opossum), and five placental mammals (bovine, human, dog, mouse, rat), gene loss and duplication, phylogeny, sequence conservation, and evolution were examined. Compared with other genes in the bovine genome, milk and mammary genes are: more likely to be present in all mammals; more likely to be duplicated in therians; more highly conserved across Mammalia; and evolving more slowly along the bovine lineage. The most divergent proteins in milk were associated with nutritional and immunological components of milk, whereas highly conserved proteins were associated with secretory processes. CONCLUSIONS: Although both copy number and sequence variation contribute to the diversity of milk protein composition across species, our results suggest that this diversity is primarily due to other mechanisms. Our findings support the essentiality of milk to the survival of mammalian neonates and the establishment of milk secretory mechanisms more than 160 million years ago.


Subjects
Cattle/genetics, Genome/genetics, Lactation/genetics, Milk Proteins/genetics, Animals, Chromosome Mapping, Mammalian Chromosomes/genetics, Computational Biology/methods, Genetic Databases, Molecular Evolution, Female, Humans, Mammals/classification, Mammals/genetics, Animal Mammary Glands/metabolism, Milk/chemistry, Milk Proteins/classification, Phylogeny, Quantitative Trait Loci/genetics
16.
Nucleic Acids Res ; 37(Database issue): D111-7, 2009 Jan.
Article in English | MEDLINE | ID: mdl-18927110

ABSTRACT

MicroRNAs (miRNAs) are short, non-protein coding RNAs that direct the widespread phenomenon of post-transcriptional regulation of metazoan genes. The mature approximately 22-nt long RNA molecules are processed from genome-encoded stem-loop structured precursor genes. Hundreds of such genes have been experimentally validated in vertebrate genomes, yet their discovery remains challenging, and substantially higher numbers have been estimated. The miROrtho database (http://cegg.unige.ch/mirortho) presents the results of a comprehensive computational survey of miRNA gene candidates across the majority of sequenced metazoan genomes. We designed and applied a three-tier analysis pipeline: (i) an SVM-based ab initio screen for potent hairpins, plus homologs of known miRNAs, (ii) an orthology delineation procedure and (iii) an SVM-based classifier of the ortholog multiple sequence alignments. The web interface provides direct access to putative miRNA annotations, ortholog multiple alignments, RNA secondary structure conservation, and sequence data. The miROrtho data are conceptually complementary to the miRBase catalog of experimentally verified miRNA sequences, providing a consistent comparative genomics perspective as well as identifying many novel miRNA genes with strong evolutionary support.


Subjects
Nucleic Acid Databases, MicroRNAs/chemistry, MicroRNAs/genetics, Genomics, Internet, Nucleic Acid Conformation, Sequence Alignment, User-Computer Interface
17.
Am J Hum Genet ; 82(4): 971-81, 2008 Apr.
Article in English | MEDLINE | ID: mdl-18394580

ABSTRACT

The elucidation of the largely unknown transcriptome of small RNAs is crucial for the understanding of genome and cellular function. We report here the results of the analysis of small RNAs (<50 nt) in the ENCODE regions of the human genome. Size-fractionated RNAs from four different cell lines (HepG2, HeLaS3, GM06990, SK-N-SH) were mapped with the forward and reverse ENCODE high-density resolution tiling arrays. The top 1% of hybridization signals are termed SmRfrags (Small RNA fragments). Eight percent of SmRfrags overlap the GENCODE genes (CDS), given that the majority map to intergenic regions (34%), intronic regions (53%), and untranslated regions (UTRs) (5%). In addition, 9.6% and 16.8% of SmRfrags in the 5' UTR regions overlap significantly with His/Pol II/TAF250 binding sites and DNase I Hypersensitive sites, respectively (compared to the 5.3% and 9% expected). Interestingly, 17%-24% (depending on the cell line) of SmRfrags are sense-antisense strand pairs that show evidence of overlapping transcription. Only 3.4% and 7.2% of SmRfrags in intergenic regions overlap transcribed fragments (Txfrags) in HeLa and GM06990 cell lines, respectively. We hypothesized that a fraction of the identified SmRfrags corresponded to microRNAs. We tested by Northern blot a set of 15 high-likelihood predictions of microRNA candidates that overlap with SmRfrags and validated three potential microRNAs (approximately 20 nt in length). Notably, most of the remaining candidates showed a larger hybridizing band (approximately 100 nt) that could be a microRNA precursor. The small RNA transcriptome is emerging as an important and abundant component of the genome function.


Subjects
Chromosome Mapping, Human Genome/genetics, MicroRNAs/genetics, Genetic Transcription, 5' Untranslated Regions/genetics, Base Sequence, Tumor Cell Line, Humans, Molecular Sequence Data, Oligonucleotide Array Sequence Analysis
18.
Nucleic Acids Res ; 36(Database issue): D271-5, 2008 Jan.
Article in English | MEDLINE | ID: mdl-17947323

ABSTRACT

The concept of orthology is widely used to relate genes across different species using comparative genomics, and it provides the basis for inferring gene function. Here we present the web accessible OrthoDB database that catalogs groups of orthologous genes in a hierarchical manner, at each radiation of the species phylogeny, from more general groups to more fine-grained delineations between closely related species. We used a COG-like and Inparanoid-like ortholog delineation procedure on the basis of all-against-all Smith-Waterman sequence comparisons to analyze 58 eukaryotic genomes, focusing on vertebrates, insects and fungi to facilitate further comparative studies. The database is freely available at http://cegg.unige.ch/orthodb.


Subjects
Genetic Databases, Genomics, Phylogeny, Animals, Fungi/genetics, Insects/genetics, Internet, Proteomics, User-Computer Interface, Vertebrates/genetics
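
The abstract above describes ortholog delineation built on all-against-all sequence comparisons. One core idea shared by COG-like and Inparanoid-like procedures is the reciprocal best hit between two genomes, sketched below. This is a deliberately simplified illustration and not the OrthoDB pipeline: it assumes alignment scores are already available, handles only two species, and ignores in-paralogs, score ties and clustering across species. The function name `reciprocal_best_hits` and the gene identifiers are hypothetical.

```python
def reciprocal_best_hits(scores):
    """Find reciprocal best hits between genes of two species.

    scores maps (gene_in_A, gene_in_B) -> alignment score, higher is better.
    Returns a sorted list of (gene_in_A, gene_in_B) pairs in which each gene
    is the other's best-scoring match.
    """
    best_for_a = {}  # gene in A -> (best gene in B, score)
    best_for_b = {}  # gene in B -> (best gene in A, score)
    for (a, b), score in scores.items():
        if a not in best_for_a or score > best_for_a[a][1]:
            best_for_a[a] = (b, score)
        if b not in best_for_b or score > best_for_b[b][1]:
            best_for_b[b] = (a, score)

    return sorted(
        (a, b) for a, (b, _) in best_for_a.items() if best_for_b[b][0] == a
    )


# Made-up scores: a3's best hit is b2, but b2 prefers a2, so a3 stays unpaired.
example = {
    ("a1", "b1"): 100, ("a1", "b2"): 40,
    ("a2", "b1"): 60, ("a2", "b2"): 80,
    ("a3", "b2"): 30,
}
print(reciprocal_best_hits(example))
```

In a hierarchical catalog, pairs like these would be one ingredient that is then clustered across many species at each level of the phylogeny.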
19.
Genome Biol ; 8(11): R242, 2007.
Article in English | MEDLINE | ID: mdl-18021399

ABSTRACT

BACKGROUND: The increasing number of sequenced insect and vertebrate genomes of variable divergence enables refined comparative analyses to quantify the major modes of animal genome evolution and allows tracing of gene genealogy (orthology) and pinpointing of gene extinctions (losses), which can reveal lineage-specific traits. RESULTS: To consistently quantify losses of orthologous groups of genes, we compared the gene repertoires of five vertebrates and five insects, including honeybee and Tribolium beetle, that represent insect orders outside the previously sequenced Diptera. We found hundreds of lost Urbilateria genes in each of the lineages and assessed their phylogenetic origin. The rate of losses correlates well with the species' rates of molecular evolution and radiation times, without distinction between insects and vertebrates, indicating their stochastic nature. Remarkably, this extends to the universal single-copy orthologs, losses of dozens of which have been tolerated in each species. Nevertheless, the propensity for loss differs substantially among genes, where roughly 20% of the orthologs have an 8-fold higher chance of becoming extinct. Extrapolation of our data also suggests that the Urbilateria genome contained more than 7,000 genes. CONCLUSION: Our results indicate that the seemingly higher number of observed gene losses in insects can be explained by their two- to three-fold higher evolutionary rate. Despite the profound effect of many losses on cellular machinery, overall, they seem to be guided by neutral evolution.


Subjects
Insects/genetics, Vertebrates/genetics, Animals, Molecular Evolution, Genetic Variation, Humans, Likelihood Functions, Genetic Models, Phylogeny
20.
Science ; 316(5832): 1738-43, 2007 Jun 22.
Article in English | MEDLINE | ID: mdl-17588928

ABSTRACT

Mosquitoes are vectors of parasitic and viral diseases of immense importance for public health. The acquisition of the genome sequence of the yellow fever and Dengue vector, Aedes aegypti (Aa), has enabled a comparative phylogenomic analysis of the insect immune repertoire: in Aa, the malaria vector Anopheles gambiae (Ag), and the fruit fly Drosophila melanogaster (Dm). Analysis of immune signaling pathways and response modules reveals both conservative and rapidly evolving features associated with different functional gene categories and particular aspects of immune reactions. These dynamics reflect in part continuous readjustment between accommodation and rejection of pathogens and suggest how innate immunity may have evolved.


Subjects
Aedes/genetics, Anopheles/genetics, Molecular Evolution, Innate Immunity/genetics, Insect Vectors/genetics, Aedes/immunology, Animals, Anopheles/immunology, Antimicrobial Cationic Peptides/physiology, Carrier Proteins/genetics, Carrier Proteins/physiology, Drosophila melanogaster/genetics, Drosophila melanogaster/immunology, Insect Genes, Insect Proteins/genetics, Insect Proteins/physiology, Insect Vectors/immunology, Malaria/transmission, Melanins/metabolism, Multigene Family, Signal Transduction, Species Specificity