1.
Microbiol Resour Announc ; 8(42)2019 Oct 17.
Article in English | MEDLINE | ID: mdl-31624164

ABSTRACT

Here, we report the complete genome sequence of an African swine fever (ASF) virus (ASFV/Kyiv/2016/131) isolated from the spleen of a domestic pig in Ukraine with a lethal case of African swine fever. Using only long-read Nanopore sequences, we assembled a full-length genome of 191,911 base pairs in a single contig.

3.
Nature ; 557(7703): 43-49, 2018 05.
Article in English | MEDLINE | ID: mdl-29695866

ABSTRACT

Here we analyse genetic variation, population structure and diversity among 3,010 diverse Asian cultivated rice (Oryza sativa L.) genomes from the 3,000 Rice Genomes Project. Our results are consistent with the five major groups previously recognized, but also suggest several unreported subpopulations that correlate with geographic location. We identified 29 million single nucleotide polymorphisms, 2.4 million small indels and over 90,000 structural variations that contribute to within- and between-population variation. Using pan-genome analyses, we identified more than 10,000 novel full-length protein-coding genes and a high number of presence-absence variations. The complex patterns of introgression observed in domestication genes are consistent with multiple independent rice domestication events. The public availability of data from the 3,000 Rice Genomes Project provides a resource for rice genomics research and breeding.


Subjects
Crops, Agricultural/classification; Crops, Agricultural/genetics; Genetic Variation; Genome, Plant/genetics; Oryza/classification; Oryza/genetics; Asia; Evolution, Molecular; Genes, Plant/genetics; Genetics, Population; Genomics; Haplotypes; INDEL Mutation/genetics; Phylogeny; Plant Breeding; Polymorphism, Single Nucleotide/genetics
4.
Mol Phylogenet Evol ; 117: 10-29, 2017 12.
Article in English | MEDLINE | ID: mdl-28860010

ABSTRACT

Synteny can be maintained for certain genomic regions across broad phylogenetic groups. In these homologous genomic regions, sites that are under relaxed purifying selection, such as intergenic regions, could be used broadly as markers for population genetic and phylogenetic studies on species complexes. To explore the potential of this approach, we found 125 Collinear Orthologous Regions (COR) ranging from 1 to >10 kb across nine genomes representing the Lecanoromycetes and Eurotiomycetes (Pezizomycotina, Ascomycota). Twenty-six of these COR were found in all 24 eurotiomycete genomes surveyed for this study. Given the high abundance and availability of fungal genomes we believe this approach could be adopted for other large groups of fungi outside the Pezizomycotina. As a proof of concept, we selected three Collinear Orthologous Regions (COR1b, COR3, and COR16), based on synteny analyses of several genomes representing three classes of Ascomycota: Eurotiomycetes, Lecanoromycetes, and Lichinomycetes. COR16, for example, was found across these three classes of fungi. Here we compare the resolving power of these three new markers with five loci commonly used in phylogenetic studies of fungi, using section Polydactylon of the cyanolichen-forming genus Peltigera (Lecanoromycetes) - a clade with several challenging species complexes. Sequence data were subjected to three species discovery and two validating methods. COR markers substantially increased phylogenetic resolution and confidence, and highly contributed to species delimitation. The level of phylogenetic signal provided by each of the COR markers was higher than the commonly used fungal barcode ITS. High cryptic diversity was revealed by all methods. As redefined here, most species represent lineages that have relatively narrower, and more homogeneous biogeographical ranges than previously understood. The scabrosoid clade consists of ten species, seven of which are new. For the dolichorhizoid clade, twenty-two new species were discovered for a total of twenty-nine species in this clade.


Subjects
Ascomycota/classification; Ascomycota/genetics; Genetic Markers/genetics; Genome, Fungal/genetics; Genomics; Lichens/classification; Lichens/genetics; Phylogeny; DNA, Intergenic; Reproducibility of Results; Species Specificity; Synteny
5.
Methods Mol Biol ; 1613: 85-99, 2017.
Article in English | MEDLINE | ID: mdl-28849559

ABSTRACT

Recent technological advances in genomics allow the production of biological data at unprecedented tera- and petabyte scales. Efficient mining of these vast and complex datasets for the needs of biomedical research critically depends on a seamless integration of the clinical, genomic, and experimental information with prior knowledge about genotype-phenotype relationships. Such experimental data accumulated in publicly available databases should be accessible to a variety of algorithms and analytical pipelines that drive computational analysis and data mining. We present an integrated computational platform Lynx (Sulakhe et al., Nucleic Acids Res 44:D882-D887, 2016) ( http://lynx.cri.uchicago.edu ), a web-based database and knowledge extraction engine. It provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization. It gives public access to the Lynx integrated knowledge base (LynxKB) and its analytical tools via user-friendly web services and interfaces. The Lynx service-oriented architecture supports annotation and analysis of high-throughput experimental data. Lynx tools assist the user in extracting meaningful knowledge from LynxKB and experimental data, and in the generation of weighted hypotheses regarding the genes and molecular mechanisms contributing to human phenotypes or conditions of interest. The goal of this integrated platform is to support the end-to-end analytical needs of various translational projects.


Subjects
Computational Biology/methods; Gene Regulatory Networks; Algorithms; Data Mining; Humans; Knowledge Bases; User-Computer Interface; Web Browser
6.
Nucleic Acids Res ; 45(D1): D1075-D1081, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27899667

ABSTRACT

We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org.


Subjects
Databases, Nucleic Acid; Genome, Plant; INDEL Mutation; Oryza/genetics; Polymorphism, Single Nucleotide; Search Engine; Software; Alleles; Computational Biology/methods; Gene Frequency; Genetic Loci; Genomics/methods; Genotype; User-Computer Interface; Web Browser
7.
BMC Genomics ; 16: 882, 2015 Oct 30.
Article in English | MEDLINE | ID: mdl-26519295

ABSTRACT

BACKGROUND: To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. Our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species. RESULTS: We searched for sequences that were conserved within groups of closely related species but not between groups of more distant species, and were associated with an epigenetic mark of enhancer activity. To facilitate inferring orthology between non-conserved sequences, we limited our search to introns whose orthology could be unambiguously established by mapping the bracketing exons. We show that a subset of these non-conserved but syntenic sequences from the mouse and zebrafish genomes have homologous functions in a zebrafish transgenic enhancer assay. The conserved expression patterns driven by these enhancers are probably associated with short transcription factor-binding motifs present in the divergent sequences. CONCLUSIONS: We have identified numerous potential enhancers with divergent sequences but a conserved function. These results indicate that selection on function, rather than sequence, may be a common mode of enhancer evolution; evidence for selection at the sequence level is not a necessary criterion to define a gene regulatory element.
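
The selection criterion described in the RESULTS above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Candidate` record, its field names and the 0.7 / 0.5 identity thresholds are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one intronic candidate sequence."""
    name: str
    within_group_identity: float   # conservation among closely related species (0..1)
    between_group_identity: float  # conservation between distant groups (0..1)
    has_enhancer_mark: bool        # overlaps an epigenetic mark of enhancer activity
    orthologous_intron: bool       # bracketing exons map unambiguously


def select_divergent_enhancers(candidates, within_min=0.7, between_max=0.5):
    """Keep candidates conserved within a group but not between groups.

    The thresholds are illustrative assumptions, not values from the study.
    """
    return [
        c for c in candidates
        if c.orthologous_intron
        and c.has_enhancer_mark
        and c.within_group_identity >= within_min
        and c.between_group_identity < between_max
    ]


candidates = [
    Candidate("intron_A", 0.85, 0.30, True, True),   # passes every filter
    Candidate("intron_B", 0.90, 0.80, True, True),   # also conserved between groups
    Candidate("intron_C", 0.80, 0.20, False, True),  # lacks the enhancer mark
]
selected = [c.name for c in select_divergent_enhancers(candidates)]
print(selected)
```

Only the first toy candidate meets all four conditions; the other two fail on between-group conservation and on the epigenetic mark, respectively.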


Subjects
Conserved Sequence; Enhancer Elements, Genetic; Genetic Variation; Vertebrates/genetics; Animals; Animals, Genetically Modified; Binding Sites; Computational Biology/methods; Evolution, Molecular; Gene Expression Profiling; Genome-Wide Association Study; Nucleotide Motifs; Position-Specific Scoring Matrices; Protein Binding; Reproducibility of Results; Transcription Factors/metabolism
8.
BMC Genomics ; 16: 919, 2015 Nov 10.
Article in English | MEDLINE | ID: mdl-26555820

ABSTRACT

BACKGROUND: The σ(54) subunit controls a unique class of promoters in bacteria. Such promoters, without exception, require enhancer binding proteins (EBPs) for transcription initiation. Desulfovibrio vulgaris Hildenborough, a model bacterium for sulfate reduction studies, has a high number of EBPs, more than most sequenced bacteria. The cellular processes regulated by many of these EBPs remain unknown. RESULTS: To characterize the σ(54)-dependent regulome of D. vulgaris Hildenborough, we identified EBP binding motifs and regulated genes by a combination of computational and experimental techniques. These predictions were supported by our reconstruction of σ(54)-dependent promoters by comparative genomics. We reassessed and refined the results of earlier studies on regulation in D. vulgaris Hildenborough and consolidated them with our new findings. It allowed us to reconstruct the σ(54) regulome in D. vulgaris Hildenborough. This regulome includes 36 regulons that consist of 201 coding genes and 4 non-coding RNAs, and is involved in nitrogen, carbon and energy metabolism, regulation, transmembrane transport and various extracellular functions. To the best of our knowledge, this is the first report of direct regulation of alanine dehydrogenase, pyruvate metabolism genes and type III secretion system by σ(54)-dependent regulators. CONCLUSIONS: The σ(54)-dependent regulome is an important component of transcriptional regulatory network in D. vulgaris Hildenborough and related free-living Deltaproteobacteria. Our study provides a representative collection of σ(54)-dependent regulons that can be used for regulation prediction in Deltaproteobacteria and other taxa.


Subjects
Desulfovibrio vulgaris/genetics; Desulfovibrio vulgaris/metabolism; Gene Expression Regulation, Bacterial; Sigma Factor/metabolism; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Binding Sites; Cluster Analysis; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Enhancer Elements, Genetic; Nucleotide Motifs; Phylogeny; Position-Specific Scoring Matrices; Promoter Regions, Genetic; Protein Binding; Sigma Factor/genetics; Transcription Factors/metabolism; Type III Secretion Systems/genetics
9.
BMC Bioinformatics ; 16: 130, 2015 Apr 28.
Article in English | MEDLINE | ID: mdl-25928663

ABSTRACT

BACKGROUND: Metagenomics, the sequencing of DNA collected from an entire microbial community, enables the study of natural microbial consortia in their native habitats. Metagenomics studies produce huge volumes of data, including both the sequences themselves and metadata describing their abundance, assembly, predicted functional characteristics and environmental parameters. The ability to explore these data visually is critically important to meaningful biological interpretation. Current genomics applications cannot effectively integrate sequence data, assembly metadata, and annotation to support both genome and community-level inquiry. RESULTS: Elviz (Environmental Laboratory Visualization) is an interactive web-based tool for the visual exploration of assembled metagenomes and their complex metadata. Elviz allows scientists to navigate metagenome assemblies across multiple dimensions and scales, plotting parameters such as GC content, relative abundance, phylogenetic affiliation and assembled contig length. Furthermore Elviz enables interactive exploration using real-time plot navigation, search, filters, axis selection, and the ability to drill from a whole-community profile down to individual gene annotations. Thus scientists engage in a rapid feedback loop of visual pattern identification, hypothesis generation, and hypothesis testing. CONCLUSIONS: Compared to the current alternative of generating a succession of static figures, Elviz can greatly accelerate the speed of metagenome analysis. Elviz can be used to explore both user-submitted datasets and numerous metagenome studies publicly available at the Joint Genome Institute (JGI). Elviz is freely available at http://genome.jgi.doe.gov/viz and runs on most current web-browsers.


Subjects
Computational Biology/methods; Computer Graphics; Databases, Genetic; Genome, Bacterial; Metagenome; Metagenomics/methods; Software; Molecular Sequence Annotation; Petroleum/microbiology; Phylogeny
10.
PLoS One ; 9(12): e114903, 2014.
Article in English | MEDLINE | ID: mdl-25506935

ABSTRACT

An essential step in the discovery of molecular mechanisms contributing to disease phenotypes and efficient experimental planning is the development of weighted hypotheses that estimate the functional effects of sequence variants discovered by high-throughput genomics. With the increasing specialization of the bioinformatics resources, creating analytical workflows that seamlessly integrate data and bioinformatics tools developed by multiple groups becomes inevitable. Here we present a case study of a use of the distributed analytical environment integrating four complementary specialized resources, namely the Lynx platform, VISTA RViewer, the Developmental Brain Disorders Database (DBDB), and the RaptorX server, for the identification of high-confidence candidate genes contributing to pathogenesis of spina bifida. The analysis resulted in prediction and validation of deleterious mutations in the SLC19A placental transporter in mothers of the affected children that causes narrowing of the outlet channel and therefore leads to the reduced folate permeation rate. The described approach also enabled correct identification of several genes, previously shown to contribute to pathogenesis of spina bifida, and suggestion of additional genes for experimental validations. The study demonstrates that the seamless integration of bioinformatics resources enables fast and efficient prioritization and characterization of genomic factors and molecular networks contributing to the phenotypes of interest.


Subjects
Mutation; Reduced Folate Carrier Protein/genetics; Spinal Dysraphism/genetics; Child; Female; Folic Acid/metabolism; Genomics/methods; Humans; Models, Molecular; Pregnancy; Protein Conformation; Reduced Folate Carrier Protein/chemistry; Reduced Folate Carrier Protein/metabolism; Software; Spinal Dysraphism/metabolism
11.
Genome Res ; 24(12): 2077-89, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25273068

ABSTRACT

Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.


Subjects
Genome; Genomics/methods; Sequence Alignment/methods; Software; Animals; Computational Biology/methods; Computer Simulation; Datasets as Topic; Genome-Wide Association Study; Humans; Mammals/genetics; Phylogeny; Reproducibility of Results
12.
Nature ; 510(7505): 356-62, 2014 Jun 19.
Article in English | MEDLINE | ID: mdl-24919147

ABSTRACT

Eucalypts are the world's most widely planted hardwood trees. Their outstanding diversity, adaptability and growth have made them a global renewable resource of fibre and energy. We sequenced and assembled >94% of the 640-megabase genome of Eucalyptus grandis. Of 36,376 predicted protein-coding genes, 34% occur in tandem duplications, the largest proportion thus far in plant genomes. Eucalyptus also shows the highest diversity of genes for specialized metabolites such as terpenes that act as chemical defence and provide unique pharmaceutical oils. Genome sequencing of the E. grandis sister species E. globulus and a set of inbred E. grandis tree genomes reveals dynamic genome evolution and hotspots of inbreeding depression. The E. grandis genome is the first reference for the eudicot order Myrtales and is placed here sister to the eurosids. This resource expands our understanding of the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.


Subjects
Eucalyptus/genetics; Genome, Plant; Eucalyptus/classification; Evolution, Molecular; Genetic Variation; Inbreeding; Phylogeny
13.
Bioinformatics ; 30(18): 2654-5, 2014 Sep 15.
Article in English | MEDLINE | ID: mdl-24860159

ABSTRACT

With the ubiquitous generation of complete genome assemblies for a variety of species, efficient tools for whole-genome alignment along with user-friendly visualization are critically important. Our VISTA family of tools for comparative genomics, based on algorithms for pairwise and multiple alignments of genomic sequences and whole-genome assemblies, has become one of the standard techniques for comparative analysis. Most of the VISTA programs have been implemented as Web-accessible servers and are extensively used by the biomedical community. In this manuscript, we introduce GenomeVISTA: a novel implementation that incorporates most features of the VISTA family (fast and accurate alignment, visualization capabilities, GUI and analytical tools) within a stand-alone software package. GenomeVISTA thus provides flexibility and security for users who need to conduct whole-genome comparisons on their own computers. AVAILABILITY AND IMPLEMENTATION: Implemented in Perl, C/C++ and Java, the source code is freely available for download at the VISTA Web site: http://genome.lbl.gov/vista/.


Subjects
Genomics/methods; Sequence Alignment/methods; Software; Algorithms; Computer Graphics
14.
Adv Exp Med Biol ; 799: 39-67, 2014.
Article in English | MEDLINE | ID: mdl-24292961

ABSTRACT

Recent technological advances in genomics now allow producing biological data at unprecedented tera- and petabyte scales. Yet, the extraction of useful knowledge from this voluminous data presents a significant challenge to a scientific community. Efficient mining of vast and complex data sets for the needs of biomedical research critically depends on seamless integration of clinical, genomic, and experimental information with prior knowledge about genotype-phenotype relationships accumulated in a plethora of publicly available databases. Furthermore, such experimental data should be accessible to a variety of algorithms and analytical pipelines that drive computational analysis and data mining. Translational projects require sophisticated approaches that coordinate and perform various analytical steps involved in the extraction of useful knowledge from accumulated clinical and experimental data in an orderly semiautomated manner. It presents a number of challenges such as (1) high-throughput data management involving data transfer, data storage, and access control; (2) scalable computational infrastructure; and (3) analysis of large-scale multidimensional data for the extraction of actionable knowledge. We present a scalable computational platform based on crosscutting requirements from multiple scientific groups for data integration, management, and analysis. The goal of this integrated platform is to address the challenges and to support the end-to-end analytical needs of various translational projects.


Subjects
Translational Research, Biomedical/methods; Translational Research, Biomedical/trends; Data Mining/methods; Data Mining/trends; Databases, Genetic/trends; Genomics/methods; Genomics/trends; Humans
15.
Nucleic Acids Res ; 42(Database issue): D699-704, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24297253

ABSTRACT

MycoCosm is a fungal genomics portal (http://jgi.doe.gov/fungi), developed by the US Department of Energy Joint Genome Institute to support integration, analysis and dissemination of fungal genome sequences and other 'omics' data by providing interactive web-based tools. MycoCosm also promotes and facilitates user community participation through the nomination of new species of fungi for sequencing, and the annotation and analysis of resulting data. By efficiently filling gaps in the Fungal Tree of Life, MycoCosm will help address important problems associated with energy and the environment, taking advantage of growing fungal genomics resources.


Subjects
Databases, Genetic; Genome, Fungal; Fungi/classification; Fungi/genetics; Genomics; Internet; Molecular Sequence Annotation
16.
Nucleic Acids Res ; 42(Database issue): D26-31, 2014 Jan.
Article in English | MEDLINE | ID: mdl-24225321

ABSTRACT

The U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a national user facility, serves the diverse scientific community by providing integrated high-throughput sequencing and computational analysis to enable system-based scientific approaches in support of DOE missions related to clean energy generation and environmental characterization. The JGI Genome Portal (http://genome.jgi.doe.gov) provides unified access to all JGI genomic databases and analytical tools. The JGI maintains extensive data management systems and specialized analytical capabilities to manage and interpret complex genomic data. A user can search, download and explore multiple data sets available for all DOE JGI sequencing projects including their status, assemblies and annotations of sequenced genomes. Here we describe major updates of the Genome Portal in the past 2 years with a specific emphasis on efficient handling of the rapidly growing amount of diverse genomic data accumulated in JGI.


Subjects
Databases, Nucleic Acid; Genomics; Genome; High-Throughput Nucleotide Sequencing; Internet; Sequence Analysis, DNA; Systems Integration
17.
BMC Genomics ; 14: 745, 2013 Nov 01.
Article in English | MEDLINE | ID: mdl-24175918

ABSTRACT

BACKGROUND: Genome-scale prediction of gene regulation and reconstruction of transcriptional regulatory networks in prokaryotes is one of the critical tasks of modern genomics. Bacteria from different taxonomic groups, whose lifestyles and natural environments are substantially different, possess highly diverged transcriptional regulatory networks. The comparative genomics approaches are useful for in silico reconstruction of bacterial regulons and networks operated by both transcription factors (TFs) and RNA regulatory elements (riboswitches). DESCRIPTION: RegPrecise (http://regprecise.lbl.gov) is a web resource for collection, visualization and analysis of transcriptional regulons reconstructed by comparative genomics. We significantly expanded a reference collection of manually curated regulons we introduced earlier. RegPrecise 3.0 provides access to inferred regulatory interactions organized by phylogenetic, structural and functional properties. Taxonomy-specific collections include 781 TF regulogs inferred in more than 160 genomes representing 14 taxonomic groups of Bacteria. TF-specific collections include regulogs for a selected subset of 40 TFs reconstructed across more than 30 taxonomic lineages. Novel collections of regulons operated by RNA regulatory elements (riboswitches) include nearly 400 regulogs inferred in 24 bacterial lineages. RegPrecise 3.0 provides four classifications of the reference regulons implemented as controlled vocabularies: 55 TF protein families; 43 RNA motif families; ~150 biological processes or metabolic pathways; and ~200 effectors or environmental signals. Genome-wide visualization of regulatory networks and metabolic pathways covered by the reference regulons are available for all studied genomes. A separate section of RegPrecise 3.0 contains draft regulatory networks in 640 genomes obtained by a conservative propagation of the reference regulons to closely related genomes. CONCLUSIONS: RegPrecise 3.0 gives access to the transcriptional regulons reconstructed in bacterial genomes. Analytical capabilities include exploration of: regulon content, structure and function; TF binding site motifs; conservation and variations in genome-wide regulatory networks across all taxonomic groups of Bacteria. RegPrecise 3.0 was selected as a core resource on transcriptional regulation of the Department of Energy Systems Biology Knowledgebase, an emerging software and data environment designed to enable researchers to collaboratively generate, test and share new hypotheses about gene and protein functions, perform large-scale analyses, and model interactions in microbes, plants, and their communities.


Subjects
Bacteria/genetics; Databases, Genetic; Genome, Bacterial; Bacteria/classification; Gene Regulatory Networks/genetics; Internet; Metabolic Networks and Pathways/genetics; Transcription Factors/genetics; User-Computer Interface
18.
J Bacteriol ; 195(19): 4466-75, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23913324

ABSTRACT

The trace elements molybdenum and tungsten are essential components of cofactors of many metalloenzymes. However, in sulfate-reducing bacteria, high concentrations of molybdate and tungstate oxyanions inhibit growth, thus requiring the tight regulation of their homeostasis. By a combination of bioinformatic and experimental techniques, we identified a novel regulator family, tungstate-responsive regulator (TunR), controlling the homeostasis of tungstate and molybdate in sulfate-reducing deltaproteobacteria. The effector-sensing domains of these regulators are similar to those of the known molybdate-responsive regulator ModE, while their DNA-binding domains are homologous to XerC/XerD site-specific recombinases. Using a comparative genomics approach, we identified DNA motifs and reconstructed regulons for 40 TunR family members. Positional analysis of TunR sites and putative promoters allowed us to classify most TunR proteins into two groups: (i) activators of modABC genes encoding a high-affinity molybdenum and tungsten transporting system and (ii) repressors of genes for toluene sulfonate uptake (TSUP) family transporters. The activation of modA and modBC genes by TunR in Desulfovibrio vulgaris Hildenborough was confirmed in vivo, and we discovered that the activation was diminished in the presence of tungstate. A predicted 30-bp TunR-binding motif was confirmed by in vitro binding assays. A novel TunR family of bacterial transcriptional factors controls tungstate and molybdate homeostasis in sulfate-reducing deltaproteobacteria. We proposed that TunR proteins participate in protection of the cells from the inhibition by these oxyanions. To our knowledge, this is a unique case of a family of bacterial transcriptional factors evolved from site-specific recombinases.


Subjects
Bacterial Proteins/metabolism; Desulfovibrio/metabolism; Gene Expression Regulation, Bacterial/drug effects; Transcription Factors/metabolism; Tungsten Compounds/pharmacology; Bacterial Proteins/genetics; Biological Evolution; Cloning, Molecular; Desulfovibrio/genetics; Molybdenum; Phylogeny; Promoter Regions, Genetic; Protein Transport; Transcription Factors/genetics
19.
Bioinformatics ; 29(16): 2059-61, 2013 Aug 15.
Article in English | MEDLINE | ID: mdl-23736530

ABSTRACT

SUMMARY: We have developed a web-based query tool, Whole-Genome rVISTA (WGRV), that determines enrichment of transcription factors (TFs) and associated target genes in sets of co-regulated genes. WGRV enables users to query databases containing pre-computed genome coordinates of evolutionarily conserved transcription factor binding sites in the proximal promoters (from 100 bp to 5 kb upstream) of human, mouse and Drosophila genomes. TF binding sites are based on position-weight matrices from the TRANSFAC Professional database. For a given set of co-regulated genes, WGRV returns statistically enriched and evolutionarily conserved binding sites, mapped by the regulatory VISTA (rVISTA) algorithm. Users can then retrieve a list of genes from the query set containing the enriched TF binding sites and their location in the query set promoters. Results are exported in a BED format for rapid visualization in the UCSC genome browser. Flat files of mapped conserved sites and their genomic coordinates are also available for analysis with stand-alone software. AVAILABILITY: http://genome.lbl.gov/cgi-bin/WGRVistaInputCommon.pl.
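
As a rough illustration of the BED export mentioned above, the sketch below writes binding-site records as six-column BED lines. It does not reproduce WGRV's actual output: the `sites_to_bed` helper, the site names and the assumption that input coordinates are 1-based and inclusive are all hypothetical. The only fixed point is the BED convention itself (tab-separated fields, 0-based start, exclusive end).

```python
def sites_to_bed(sites):
    """Convert (chrom, start, end, name, strand) tuples to BED lines.

    Input coordinates are assumed to be 1-based and inclusive; BED uses
    a 0-based start and an exclusive end, so only the start is shifted.
    """
    lines = []
    for chrom, start, end, name, strand in sites:
        # Columns: chrom, chromStart, chromEnd, name, score, strand
        fields = [chrom, str(start - 1), str(end), name, "0", strand]
        lines.append("\t".join(fields))
    return lines


# Made-up example sites, not data from the tool
example_sites = [
    ("chr1", 1001, 1012, "TF_A_site", "+"),
    ("chr2", 500, 509, "TF_B_site", "-"),
]
bed_lines = sites_to_bed(example_sites)
print("\n".join(bed_lines))
```

Lines in this shape can be loaded as a custom track in a genome browser that accepts BED input.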


Subjects
Gene Expression Profiling; Promoter Regions, Genetic; Software; Transcription Factors/metabolism; Algorithms; Animals; Genomics; Humans; Internet; Mice
20.
BMC Genomics ; 14: 213, 2013 Apr 02.
Article in English | MEDLINE | ID: mdl-23547897

ABSTRACT

BACKGROUND: Due to the constantly growing number of sequenced microbial genomes, comparative genomics has been playing a major role in the investigation of regulatory interactions in bacteria. Regulon inference mostly remains a field of semi-manual examination since absence of a knowledgebase and informatics platform for automated and systematic investigation restricts opportunities for computational prediction. Additionally, confirming computationally inferred regulons by experimental data is critically important. DESCRIPTION: RegTransBase is an open-access platform with a user-friendly web interface publicly available at http://regtransbase.lbl.gov. It consists of two databases: a manually collected hierarchical regulatory interactions database based on more than 7000 scientific papers, which can serve as a knowledgebase for verification of predictions, and a large set of expert-curated transcription factor binding sites used in regulon inference by a variety of tools. RegTransBase captures the knowledge from published scientific literature using controlled vocabularies and contains various types of experimental data, such as: the activation or repression of transcription by an identified direct regulator; determination of the transcriptional regulatory function of a protein (or RNA) directly binding to DNA or RNA; mapping of binding sites for a regulatory protein; characterization of regulatory mutations. Analysis of the data collected from literature resulted in the creation of Putative Regulons from Experimental Data that are also available in RegTransBase. CONCLUSIONS: RegTransBase is a powerful user-friendly platform for the investigation of regulation in prokaryotes. It uses a collection of validated regulatory sequences that can be easily extracted and used to infer regulatory interactions by comparative genomics techniques, thus assisting researchers in the interpretation of transcriptional regulation data.


Subjects
Databases, Nucleic Acid; Gene Expression Regulation; Prokaryotic Cells/metabolism; Regulatory Elements, Transcriptional/genetics; Transcription Factors/metabolism; Genome, Bacterial; Internet; Regulon/physiology; User-Computer Interface