Results 1 - 20 of 81
1.
Nature ; 557(7703): 43-49, 2018 05.
Article in English | MEDLINE | ID: mdl-29695866

ABSTRACT

Here we analyse genetic variation, population structure and diversity among 3,010 diverse Asian cultivated rice (Oryza sativa L.) genomes from the 3,000 Rice Genomes Project. Our results are consistent with the five major groups previously recognized, but also suggest several unreported subpopulations that correlate with geographic location. We identified 29 million single nucleotide polymorphisms, 2.4 million small indels and over 90,000 structural variations that contribute to within- and between-population variation. Using pan-genome analyses, we identified more than 10,000 novel full-length protein-coding genes and a high number of presence-absence variations. The complex patterns of introgression observed in domestication genes are consistent with multiple independent rice domestication events. The public availability of data from the 3,000 Rice Genomes Project provides a resource for rice genomics research and breeding.


Subjects
Crops, Agricultural/classification ; Crops, Agricultural/genetics ; Genetic Variation ; Genome, Plant/genetics ; Oryza/classification ; Oryza/genetics ; Asia ; Evolution, Molecular ; Genes, Plant/genetics ; Genetics, Population ; Genomics ; Haplotypes ; INDEL Mutation/genetics ; Phylogeny ; Plant Breeding ; Polymorphism, Single Nucleotide/genetics
2.
Cell ; 135(6): 1053-64, 2008 Dec 12.
Article in English | MEDLINE | ID: mdl-19070576

ABSTRACT

Vascular development begins when mesodermal cells differentiate into endothelial cells, which then form primitive vessels. It has been hypothesized that endothelial-specific gene expression may be regulated combinatorially, but the transcriptional mechanisms governing specificity in vascular gene expression remain incompletely understood. Here, we identify a 44 bp transcriptional enhancer that is sufficient to direct expression specifically and exclusively to the developing vascular endothelium. This enhancer is regulated by a composite cis-acting element, the FOX:ETS motif, which is bound and synergistically activated by Forkhead and Ets transcription factors. We demonstrate that coexpression of the Forkhead protein FoxC2 and the Ets protein Etv2 induces ectopic expression of vascular genes in Xenopus embryos, and that combinatorial knockdown of the orthologous genes in zebrafish embryos disrupts vascular development. Finally, we show that FOX:ETS motifs are present in many known endothelial-specific enhancers and that this motif is an efficient predictor of endothelial enhancers in the human genome.


Subjects
Enhancer Elements, Genetic ; Forkhead Transcription Factors/metabolism ; Gene Expression Regulation, Developmental ; Proto-Oncogene Proteins c-ets/metabolism ; Animals ; Blood Vessels/embryology ; Embryo, Mammalian/cytology ; Embryo, Nonmammalian/metabolism ; Endothelium/embryology ; Fibroblasts/metabolism ; Humans ; Mice ; Xenopus ; Zebrafish
3.
Nature ; 510(7505): 356-62, 2014 Jun 19.
Article in English | MEDLINE | ID: mdl-24919147

ABSTRACT

Eucalypts are the world's most widely planted hardwood trees. Their outstanding diversity, adaptability and growth have made them a global renewable resource of fibre and energy. We sequenced and assembled >94% of the 640-megabase genome of Eucalyptus grandis. Of 36,376 predicted protein-coding genes, 34% occur in tandem duplications, the largest proportion thus far in plant genomes. Eucalyptus also shows the highest diversity of genes for specialized metabolites such as terpenes that act as chemical defence and provide unique pharmaceutical oils. Genome sequencing of the E. grandis sister species E. globulus and a set of inbred E. grandis tree genomes reveals dynamic genome evolution and hotspots of inbreeding depression. The E. grandis genome is the first reference for the eudicot order Myrtales and is placed here sister to the eurosids. This resource expands our understanding of the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.


Subjects
Eucalyptus/genetics ; Genome, Plant ; Eucalyptus/classification ; Evolution, Molecular ; Genetic Variation ; Inbreeding ; Phylogeny
4.
Nucleic Acids Res ; 45(D1): D1075-D1081, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899667

ABSTRACT

We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org.


Subjects
Databases, Nucleic Acid ; Genome, Plant ; INDEL Mutation ; Oryza/genetics ; Polymorphism, Single Nucleotide ; Search Engine ; Software ; Alleles ; Computational Biology/methods ; Gene Frequency ; Genetic Loci ; Genomics/methods ; Genotype ; User-Computer Interface ; Web Browser
5.
Genome Res ; 24(12): 2077-89, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25273068

ABSTRACT

Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.


Subjects
Genome ; Genomics/methods ; Sequence Alignment/methods ; Software ; Animals ; Computational Biology/methods ; Computer Simulation ; Datasets as Topic ; Genome-Wide Association Study ; Humans ; Mammals/genetics ; Phylogeny ; Reproducibility of Results
6.
Mol Phylogenet Evol ; 117: 10-29, 2017 12.
Article in English | MEDLINE | ID: mdl-28860010

ABSTRACT

Synteny can be maintained for certain genomic regions across broad phylogenetic groups. In these homologous genomic regions, sites that are under relaxed purifying selection, such as intergenic regions, could be used broadly as markers for population genetic and phylogenetic studies on species complexes. To explore the potential of this approach, we found 125 Collinear Orthologous Regions (COR) ranging from 1 to >10 kb across nine genomes representing the Lecanoromycetes and Eurotiomycetes (Pezizomycotina, Ascomycota). Twenty-six of these COR were found in all 24 eurotiomycete genomes surveyed for this study. Given the high abundance and availability of fungal genomes, we believe this approach could be adopted for other large groups of fungi outside the Pezizomycotina. As a proof of concept, we selected three Collinear Orthologous Regions (COR1b, COR3, and COR16), based on synteny analyses of several genomes representing three classes of Ascomycota: Eurotiomycetes, Lecanoromycetes, and Lichinomycetes. COR16, for example, was found across these three classes of fungi. Here we compare the resolving power of these three new markers with five loci commonly used in phylogenetic studies of fungi, using section Polydactylon of the cyanolichen-forming genus Peltigera (Lecanoromycetes), a clade with several challenging species complexes. Sequence data were subjected to three species discovery and two validating methods. COR markers substantially increased phylogenetic resolution and confidence, and contributed strongly to species delimitation. The level of phylogenetic signal provided by each of the COR markers was higher than that of the commonly used fungal barcode ITS. High cryptic diversity was revealed by all methods. As redefined here, most species represent lineages that have relatively narrower and more homogeneous biogeographical ranges than previously understood. The scabrosoid clade consists of ten species, seven of which are new. For the dolichorhizoid clade, twenty-two new species were discovered for a total of twenty-nine species in this clade.


Subjects
Ascomycota/classification ; Ascomycota/genetics ; Genetic Markers/genetics ; Genome, Fungal/genetics ; Genomics ; Lichens/classification ; Lichens/genetics ; Phylogeny ; DNA, Intergenic ; Reproducibility of Results ; Species Specificity ; Synteny
7.
Nucleic Acids Res ; 42(Database issue): D26-31, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24225321

ABSTRACT

The U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a national user facility, serves the diverse scientific community by providing integrated high-throughput sequencing and computational analysis to enable system-based scientific approaches in support of DOE missions related to clean energy generation and environmental characterization. The JGI Genome Portal (http://genome.jgi.doe.gov) provides unified access to all JGI genomic databases and analytical tools. The JGI maintains extensive data management systems and specialized analytical capabilities to manage and interpret complex genomic data. A user can search, download and explore multiple data sets available for all DOE JGI sequencing projects including their status, assemblies and annotations of sequenced genomes. Here we describe major updates of the Genome Portal in the past 2 years with a specific emphasis on efficient handling of the rapidly growing amount of diverse genomic data accumulated in JGI.


Subjects
Databases, Nucleic Acid ; Genomics ; Genome ; High-Throughput Nucleotide Sequencing ; Internet ; Sequence Analysis, DNA ; Systems Integration
8.
Nucleic Acids Res ; 42(Database issue): D699-704, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24297253

ABSTRACT

MycoCosm is a fungal genomics portal (http://jgi.doe.gov/fungi), developed by the US Department of Energy Joint Genome Institute to support integration, analysis and dissemination of fungal genome sequences and other 'omics' data by providing interactive web-based tools. MycoCosm also promotes and facilitates user community participation through the nomination of new species of fungi for sequencing, and the annotation and analysis of resulting data. By efficiently filling gaps in the Fungal Tree of Life, MycoCosm will help address important problems associated with energy and the environment, taking advantage of growing fungal genomics resources.


Subjects
Databases, Genetic ; Genome, Fungal ; Fungi/classification ; Fungi/genetics ; Genomics ; Internet ; Molecular Sequence Annotation
9.
BMC Bioinformatics ; 16: 130, 2015 Apr 28.
Article in English | MEDLINE | ID: mdl-25928663

ABSTRACT

BACKGROUND: Metagenomics, the sequencing of DNA collected from an entire microbial community, enables the study of natural microbial consortia in their native habitats. Metagenomics studies produce huge volumes of data, including both the sequences themselves and metadata describing their abundance, assembly, predicted functional characteristics and environmental parameters. The ability to explore these data visually is critically important to meaningful biological interpretation. Current genomics applications cannot effectively integrate sequence data, assembly metadata, and annotation to support both genome and community-level inquiry. RESULTS: Elviz (Environmental Laboratory Visualization) is an interactive web-based tool for the visual exploration of assembled metagenomes and their complex metadata. Elviz allows scientists to navigate metagenome assemblies across multiple dimensions and scales, plotting parameters such as GC content, relative abundance, phylogenetic affiliation and assembled contig length. Furthermore Elviz enables interactive exploration using real-time plot navigation, search, filters, axis selection, and the ability to drill from a whole-community profile down to individual gene annotations. Thus scientists engage in a rapid feedback loop of visual pattern identification, hypothesis generation, and hypothesis testing. CONCLUSIONS: Compared to the current alternative of generating a succession of static figures, Elviz can greatly accelerate the speed of metagenome analysis. Elviz can be used to explore both user-submitted datasets and numerous metagenome studies publicly available at the Joint Genome Institute (JGI). Elviz is freely available at http://genome.jgi.doe.gov/viz and runs on most current web-browsers.
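
The abstract above names GC content and assembled contig length among the parameters Elviz plots. As an illustration only, the following minimal Python sketch shows how those two per-contig values could be computed before plotting. It is not Elviz code: the function names, the contig names and the sequences are hypothetical, and ambiguous bases such as N are assumed to be excluded from the GC denominator.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C fraction among unambiguous A/C/G/T bases (0.0 if none)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def contig_summary(contigs: dict) -> list:
    """Return (name, length, gc_fraction) tuples, longest contig first."""
    rows = [(name, len(seq), gc_content(seq)) for name, seq in contigs.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical toy contigs; real assemblies would be read from a FASTA file.
example_contigs = {
    "contig_1": "ATGCGC",
    "contig_2": "ATATATGC",
    "contig_3": "NNNN",
}
summary = contig_summary(example_contigs)
```

Each tuple in `summary` holds one contig's name, length and GC fraction, which is the kind of per-contig table a scatter plot of GC content against contig length would start from.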


Subjects
Computational Biology/methods ; Computer Graphics ; Databases, Genetic ; Genome, Bacterial ; Metagenome ; Metagenomics/methods ; Software ; Molecular Sequence Annotation ; Petroleum/microbiology ; Phylogeny
10.
BMC Genomics ; 16: 882, 2015 Oct 30.
Article in English | MEDLINE | ID: mdl-26519295

ABSTRACT

BACKGROUND: To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. Our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species. RESULTS: We searched for sequences that were conserved within groups of closely related species but not between groups of more distant species, and were associated with an epigenetic mark of enhancer activity. To facilitate inferring orthology between non-conserved sequences, we limited our search to introns whose orthology could be unambiguously established by mapping the bracketing exons. We show that a subset of these non-conserved but syntenic sequences from the mouse and zebrafish genomes have homologous functions in a zebrafish transgenic enhancer assay. The conserved expression patterns driven by these enhancers are probably associated with short transcription factor-binding motifs present in the divergent sequences. CONCLUSIONS: We have identified numerous potential enhancers with divergent sequences but a conserved function. These results indicate that selection on function, rather than sequence, may be a common mode of enhancer evolution; evidence for selection at the sequence level is not a necessary criterion to define a gene regulatory element.


Subjects
Conserved Sequence ; Enhancer Elements, Genetic ; Genetic Variation ; Vertebrates/genetics ; Animals ; Animals, Genetically Modified ; Binding Sites ; Computational Biology/methods ; Evolution, Molecular ; Gene Expression Profiling ; Genome-Wide Association Study ; Nucleotide Motifs ; Position-Specific Scoring Matrices ; Protein Binding ; Reproducibility of Results ; Transcription Factors/metabolism
11.
BMC Genomics ; 16: 919, 2015 Nov 10.
Article in English | MEDLINE | ID: mdl-26555820

ABSTRACT

BACKGROUND: The σ(54) subunit controls a unique class of promoters in bacteria. Such promoters, without exception, require enhancer binding proteins (EBPs) for transcription initiation. Desulfovibrio vulgaris Hildenborough, a model bacterium for sulfate reduction studies, has a high number of EBPs, more than most sequenced bacteria. The cellular processes regulated by many of these EBPs remain unknown. RESULTS: To characterize the σ(54)-dependent regulome of D. vulgaris Hildenborough, we identified EBP binding motifs and regulated genes by a combination of computational and experimental techniques. These predictions were supported by our reconstruction of σ(54)-dependent promoters by comparative genomics. We reassessed and refined the results of earlier studies on regulation in D. vulgaris Hildenborough and consolidated them with our new findings. This allowed us to reconstruct the σ(54) regulome in D. vulgaris Hildenborough. This regulome includes 36 regulons that consist of 201 coding genes and 4 non-coding RNAs, and is involved in nitrogen, carbon and energy metabolism, regulation, transmembrane transport and various extracellular functions. To the best of our knowledge, this is the first report of direct regulation of alanine dehydrogenase, pyruvate metabolism genes and the type III secretion system by σ(54)-dependent regulators. CONCLUSIONS: The σ(54)-dependent regulome is an important component of the transcriptional regulatory network in D. vulgaris Hildenborough and related free-living Deltaproteobacteria. Our study provides a representative collection of σ(54)-dependent regulons that can be used for regulation prediction in Deltaproteobacteria and other taxa.


Subjects
Desulfovibrio vulgaris/genetics ; Desulfovibrio vulgaris/metabolism ; Gene Expression Regulation, Bacterial ; Sigma Factor/metabolism ; Bacterial Proteins/genetics ; Bacterial Proteins/metabolism ; Binding Sites ; Cluster Analysis ; DNA-Binding Proteins/genetics ; DNA-Binding Proteins/metabolism ; Enhancer Elements, Genetic ; Nucleotide Motifs ; Phylogeny ; Position-Specific Scoring Matrices ; Promoter Regions, Genetic ; Protein Binding ; Sigma Factor/genetics ; Transcription Factors/metabolism ; Type III Secretion Systems/genetics
12.
Bioinformatics ; 30(18): 2654-5, 2014 Sep 15.
Article in English | MEDLINE | ID: mdl-24860159

ABSTRACT

With the ubiquitous generation of complete genome assemblies for a variety of species, efficient tools for whole-genome alignment along with user-friendly visualization are critically important. Our VISTA family of tools for comparative genomics, based on algorithms for pairwise and multiple alignments of genomic sequences and whole-genome assemblies, has become one of the standard techniques for comparative analysis. Most of the VISTA programs have been implemented as Web-accessible servers and are extensively used by the biomedical community. In this manuscript, we introduce GenomeVISTA: a novel implementation that incorporates most features of the VISTA family (fast and accurate alignment, visualization capabilities, GUI and analytical tools) within a stand-alone software package. GenomeVISTA thus provides flexibility and security for users who need to conduct whole-genome comparisons on their own computers. AVAILABILITY AND IMPLEMENTATION: Implemented in Perl, C/C++ and Java, the source code is freely available for download at the VISTA Web site: http://genome.lbl.gov/vista/.


Subjects
Genomics/methods ; Sequence Alignment/methods ; Software ; Algorithms ; Computer Graphics
13.
Nature ; 457(7229): 551-6, 2009 Jan 29.
Article in English | MEDLINE | ID: mdl-19189423

ABSTRACT

Sorghum, an African grass related to sugar cane and maize, is grown for food, feed, fibre and fuel. We present an initial analysis of the approximately 730-megabase Sorghum bicolor (L.) Moench genome, placing approximately 98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information. Genetic recombination is largely confined to about one-third of the sorghum genome with gene order and density similar to those of rice. Retrotransposon accumulation in recombinationally recalcitrant heterochromatin explains the approximately 75% larger genome size of sorghum compared with rice. Although gene and repetitive DNA distributions have been preserved since palaeopolyploidization approximately 70 million years ago, most duplicated gene sets lost one member before the sorghum-rice divergence. Concerted evolution makes one duplicated chromosomal segment appear to be only a few million years old. About 24% of genes are grass-specific and 7% are sorghum-specific. Recent gene and microRNA duplications may contribute to sorghum's drought tolerance.


Subjects
Evolution, Molecular ; Genome, Plant/genetics ; Poaceae/genetics ; Sorghum/genetics ; Arabidopsis/genetics ; Chromosomes, Plant/genetics ; Gene Duplication ; Genes, Plant ; Oryza/genetics ; Populus/genetics ; Recombination, Genetic/genetics ; Sequence Alignment ; Sequence Analysis, DNA ; Sequence Deletion/genetics ; Zea mays/genetics
14.
Bioinformatics ; 29(16): 2059-61, 2013 Aug 15.
Article in English | MEDLINE | ID: mdl-23736530

ABSTRACT

SUMMARY: We have developed a web-based query tool, Whole-Genome rVISTA (WGRV), that determines enrichment of transcription factors (TFs) and associated target genes in sets of co-regulated genes. WGRV enables users to query databases containing pre-computed genome coordinates of evolutionarily conserved transcription factor binding sites in the proximal promoters (from 100 bp to 5 kb upstream) of human, mouse and Drosophila genomes. TF binding sites are based on position-weight matrices from the TRANSFAC Professional database. For a given set of co-regulated genes, WGRV returns statistically enriched and evolutionarily conserved binding sites, mapped by the regulatory VISTA (rVISTA) algorithm. Users can then retrieve a list of genes from the query set containing the enriched TF binding sites and their location in the query set promoters. Results are exported in a BED format for rapid visualization in the UCSC genome browser. Flat files of mapped conserved sites and their genomic coordinates are also available for analysis with stand-alone software. AVAILABILITY: http://genome.lbl.gov/cgi-bin/WGRVistaInputCommon.pl.
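
The abstract above states that WGRV results are exported in BED format for display in the UCSC genome browser. As an illustration only, the following minimal Python sketch shows how binding-site records could be serialized as six-column BED lines (chromosome, 0-based start, exclusive end, name, score, strand). It is not WGRV code: the `BindingSite` class, the function names, the transcription factor names and the coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BindingSite:
    chrom: str
    start: int          # 0-based, inclusive (BED convention)
    end: int            # exclusive (BED convention)
    tf_name: str        # hypothetical label for the transcription factor
    strand: str = "+"


def to_bed_line(site: BindingSite, score: int = 0) -> str:
    """Format one site as a tab-separated BED6 line."""
    if site.start < 0 or site.end <= site.start:
        raise ValueError("BED intervals need 0 <= start < end")
    if site.strand not in ("+", "-", "."):
        raise ValueError("strand must be '+', '-' or '.'")
    if not 0 <= score <= 1000:
        raise ValueError("BED score must be between 0 and 1000")
    fields = [site.chrom, str(site.start), str(site.end),
              site.tf_name, str(score), site.strand]
    return "\t".join(fields)


def to_bed(sites: list) -> str:
    """Format sites as BED text, sorted by chromosome name and position."""
    ordered = sorted(sites, key=lambda s: (s.chrom, s.start, s.end))
    return "".join(to_bed_line(s) + "\n" for s in ordered)


# Hypothetical sites; names and coordinates are invented for illustration.
example_sites = [
    BindingSite("chr2", 5000, 5010, "TF_B", "-"),
    BindingSite("chr1", 100, 112, "TF_A"),
]
bed_text = to_bed(example_sites)
```

Keeping start 0-based and end exclusive matters here, because a site stored with 1-based coordinates would appear shifted by one base when loaded into a genome browser.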


Subjects
Gene Expression Profiling ; Promoter Regions, Genetic ; Software ; Transcription Factors/metabolism ; Algorithms ; Animals ; Genomics ; Humans ; Internet ; Mice
15.
Nature ; 453(7198): 1064-71, 2008 Jun 19.
Article in English | MEDLINE | ID: mdl-18563158

ABSTRACT

Lancelets ('amphioxus') are the modern survivors of an ancient chordate lineage, with a fossil record dating back to the Cambrian period. Here we describe the structure and gene content of the highly polymorphic approximately 520-megabase genome of the Florida lancelet Branchiostoma floridae, and analyse it in the context of chordate evolution. Whole-genome comparisons illuminate the murky relationships among the three chordate groups (tunicates, lancelets and vertebrates), and allow not only reconstruction of the gene complement of the last common chordate ancestor but also partial reconstruction of its genomic organization, as well as a description of two genome-wide duplications and subsequent reorganizations in the vertebrate lineage. These genome-scale events shaped the vertebrate genome and provided additional genetic variation for exploitation during vertebrate evolution.


Subjects
Chordata/genetics ; Evolution, Molecular ; Genome/genetics ; Animals ; Chordata/classification ; Conserved Sequence ; DNA Transposable Elements/genetics ; Gene Duplication ; Genes/genetics ; Genetic Linkage ; Humans ; Introns/genetics ; Karyotyping ; Multigene Family ; Phylogeny ; Polymorphism, Genetic/genetics ; Proteins/genetics ; Synteny ; Time Factors ; Vertebrates/classification ; Vertebrates/genetics
16.
Nucleic Acids Res ; 40(Web Server issue): W604-8, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22700702

ABSTRACT

A web services application programming interface (API) was developed to provide programmatic access to the regulatory interactions accumulated in the RegPrecise database (http://regprecise.lbl.gov), a core resource on transcriptional regulation for the microbial domain of the Department of Energy (DOE) Systems Biology Knowledgebase. RegPrecise captures and visualizes regulogs, sets of genes controlled by orthologous regulators in several closely related bacterial genomes, that were reconstructed by comparative genomics. The current release of RegPrecise 2.0 includes >1400 regulogs controlled either by protein transcription factors or by conserved ribonucleic acid regulatory motifs in >250 genomes from 24 taxonomic groups of bacteria. The reference regulons accumulated in RegPrecise can serve as a basis for automatic annotation of regulatory interactions in newly sequenced genomes. The developed API provides efficient access to the RegPrecise data by a comprehensive set of 14 web service resources. The RegPrecise web services API is freely accessible at http://regprecise.lbl.gov/RegPrecise/services.jsp with no login requirements.


Subjects
Gene Expression Regulation, Bacterial ; Regulon ; Software ; Transcription, Genetic ; Gene Regulatory Networks ; Genome, Bacterial ; Genomics/methods ; Internet ; Nucleotide Motifs ; Regulatory Sequences, Ribonucleic Acid ; Transcription Factors/metabolism ; User-Computer Interface
17.
Nucleic Acids Res ; 40(Database issue): D26-32, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22110030

ABSTRACT

The Department of Energy (DOE) Joint Genome Institute (JGI) is a national user facility with massive-scale DNA sequencing and analysis capabilities dedicated to advancing genomics for bioenergy and environmental applications. Beyond generating tens of trillions of DNA bases annually, the Institute develops and maintains data management systems and specialized analytical capabilities to manage and interpret complex genomic data sets, and to enable an expanding community of users around the world to analyze these data in different contexts over the web. The JGI Genome Portal (http://genome.jgi.doe.gov) provides a unified access point to all JGI genomic databases and analytical tools. A user can find all DOE JGI sequencing projects and their status, search for and download assemblies and annotations of sequenced genomes, and interactively explore those genomes and compare them with other sequenced microbes, fungi, plants or metagenomes using specialized systems tailored to each particular class of organisms. We describe here the general organization of the Genome Portal and the most recent addition, MycoCosm (http://jgi.doe.gov/fungi), a new integrated fungal genomics resource.


Subjects
Databases, Genetic ; Genomics ; Sequence Analysis, DNA ; Cluster Analysis ; Genome, Fungal ; Internet ; Molecular Sequence Annotation ; Software ; Systems Integration
18.
Adv Exp Med Biol ; 799: 39-67, 2014.
Article in English | MEDLINE | ID: mdl-24292961

ABSTRACT

Recent technological advances in genomics now allow producing biological data at unprecedented tera- and petabyte scales. Yet, the extraction of useful knowledge from this voluminous data presents a significant challenge to the scientific community. Efficient mining of vast and complex data sets for the needs of biomedical research critically depends on seamless integration of clinical, genomic, and experimental information with prior knowledge about genotype-phenotype relationships accumulated in a plethora of publicly available databases. Furthermore, such experimental data should be accessible to a variety of algorithms and analytical pipelines that drive computational analysis and data mining. Translational projects require sophisticated approaches that coordinate and perform various analytical steps involved in the extraction of useful knowledge from accumulated clinical and experimental data in an orderly semiautomated manner. This presents a number of challenges such as (1) high-throughput data management involving data transfer, data storage, and access control; (2) scalable computational infrastructure; and (3) analysis of large-scale multidimensional data for the extraction of actionable knowledge. We present a scalable computational platform based on crosscutting requirements from multiple scientific groups for data integration, management, and analysis. The goal of this integrated platform is to address the challenges and to support the end-to-end analytical needs of various translational projects.


Subjects
Translational Research, Biomedical/methods ; Translational Research, Biomedical/trends ; Data Mining/methods ; Data Mining/trends ; Databases, Genetic/trends ; Genomics/methods ; Genomics/trends ; Humans
19.
J Bacteriol ; 195(1): 29-38, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23086211

ABSTRACT

Accurate detection of transcriptional regulatory elements is essential for high-quality genome annotation, metabolic reconstruction, and modeling of regulatory networks. We developed a computational approach for reconstruction of regulons operated by transcription factors (TFs) from large protein families and applied this novel approach to three TF families in 10 Desulfovibrionales genomes. Phylogenetic analyses of 125 regulators from the ArsR, Crp/Fnr, and GntR families revealed that 65% of these regulators (termed reference TFs) are well conserved in Desulfovibrionales, while the remaining 35% of regulators (termed singleton TFs) are species specific and show a mosaic distribution. For regulon reconstruction in the group of singleton TFs, the standard orthology-based approach was inefficient, and thus, we developed a novel approach based on the simultaneous study of all homologous TFs from the same family in a group of genomes. As a result, we identified binding for 21 singleton TFs and for all reference TFs in all three analyzed families. Within each TF family we observed structural similarities between DNA-binding motifs of different reference and singleton TFs. The collection of reconstructed regulons is available at the RegPrecise database (http://regprecise.lbl.gov/RegPrecise/Desulfovibrionales.jsp).


Subjects
Bacterial Proteins/metabolism ; Gene Expression Regulation, Bacterial/physiology ; Genome, Bacterial ; Regulon/physiology ; Sulfur-Reducing Bacteria/metabolism ; Transcription Factors/metabolism ; Amino Acid Motifs ; Bacterial Proteins/genetics ; Base Sequence ; Conserved Sequence ; Multigene Family ; Phylogeny ; Protein Binding ; Sulfur-Reducing Bacteria/genetics
20.
J Bacteriol ; 195(19): 4466-75, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23913324

ABSTRACT

The trace elements molybdenum and tungsten are essential components of cofactors of many metalloenzymes. However, in sulfate-reducing bacteria, high concentrations of molybdate and tungstate oxyanions inhibit growth, thus requiring the tight regulation of their homeostasis. By a combination of bioinformatic and experimental techniques, we identified a novel regulator family, tungstate-responsive regulator (TunR), controlling the homeostasis of tungstate and molybdate in sulfate-reducing deltaproteobacteria. The effector-sensing domains of these regulators are similar to those of the known molybdate-responsive regulator ModE, while their DNA-binding domains are homologous to XerC/XerD site-specific recombinases. Using a comparative genomics approach, we identified DNA motifs and reconstructed regulons for 40 TunR family members. Positional analysis of TunR sites and putative promoters allowed us to classify most TunR proteins into two groups: (i) activators of modABC genes encoding a high-affinity molybdenum and tungsten transporting system and (ii) repressors of genes for toluene sulfonate uptake (TSUP) family transporters. The activation of modA and modBC genes by TunR in Desulfovibrio vulgaris Hildenborough was confirmed in vivo, and we discovered that the activation was diminished in the presence of tungstate. A predicted 30-bp TunR-binding motif was confirmed by in vitro binding assays. A novel TunR family of bacterial transcriptional factors controls tungstate and molybdate homeostasis in sulfate-reducing deltaproteobacteria. We proposed that TunR proteins participate in protection of the cells from the inhibition by these oxyanions. To our knowledge, this is a unique case of a family of bacterial transcriptional factors evolved from site-specific recombinases.


Subjects
Bacterial Proteins/metabolism ; Desulfovibrio/metabolism ; Gene Expression Regulation, Bacterial/drug effects ; Transcription Factors/metabolism ; Tungsten Compounds/pharmacology ; Bacterial Proteins/genetics ; Biological Evolution ; Cloning, Molecular ; Desulfovibrio/genetics ; Molybdenum ; Phylogeny ; Promoter Regions, Genetic ; Protein Transport ; Transcription Factors/genetics