1.
Nucleic Acids Res ; 52(D1): D255-D264, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37971353

ABSTRACT

RegulonDB is a database that contains the most comprehensive corpus of knowledge of the regulation of transcription initiation of Escherichia coli K-12, including data from both classical molecular biology and high-throughput methodologies. Here, we describe biological advances since our last NAR paper of 2019. We explain the changes to satisfy FAIR requirements. We also present a full reconstruction of the RegulonDB computational infrastructure, which has significantly improved data storage, retrieval and accessibility and thus supports a more intuitive and user-friendly experience. The integration of graphical tools provides clear visual representations of genetic regulation data, facilitating data interpretation and knowledge integration. RegulonDB version 12.0 can be accessed at https://regulondb.ccg.unam.mx.


Subjects
Databases, Genetic; Escherichia coli K12; Gene Expression Regulation, Bacterial; Computational Biology/methods; Escherichia coli K12/genetics; Internet; Transcription, Genetic
2.
Microb Genom ; 8(5), 2022 May.
Article in English | MEDLINE | ID: mdl-35584008

ABSTRACT

Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the Gene Expression Omnibus, or ArrayExpress. However, accessing and navigating such a wealth of data is not straightforward. No resource currently exists that offers all available high and low-throughput data on transcriptional regulation in Escherichia coli K-12 to easily use both as whole datasets, or as individual interactions and regulatory elements. RegulonDB (https://regulondb.ccg.unam.mx) began gathering high-throughput dataset collections in 2009, starting with transcription start sites, then adding ChIP-seq and gSELEX in 2012, with up to 99 different experimental high-throughput datasets available in 2019. In this paper we present a radical upgrade to more than 2000 high-throughput datasets, processed to facilitate their comparison, introducing up-to-date collections of transcription termination sites, transcription units, as well as transcription factor binding interactions derived from ChIP-seq, ChIP-exo, gSELEX and DAP-seq experiments, besides expression profiles derived from RNA-seq experiments. For ChIP-seq experiments we offer both the data as presented by the authors, as well as data uniformly processed in-house, enhancing their comparability, as well as the traceability of the methods and reproducibility of the results. Furthermore, we have expanded the tools available for browsing and visualization across and within datasets. We include comparisons against previously existing knowledge in RegulonDB from classic experiments, a nucleotide-resolution genome viewer, and an interface that enables users to browse datasets by querying their metadata. 
A particular effort was made to automatically extract detailed experimental growth conditions by implementing an assisted curation strategy that applies natural language processing and machine learning. We provide summaries with the total number of interactions found in each experiment, as well as tools to identify common results among different experiments. This is a long-awaited resource to make use of such a wealth of knowledge and advance our understanding of the biology of the model bacterium E. coli K-12.


Subjects
Escherichia coli K12; Escherichia coli; Escherichia coli/genetics; Escherichia coli K12/genetics; Escherichia coli K12/metabolism; Gene Expression Regulation, Bacterial; Operon/genetics; Reproducibility of Results
3.
Proc Natl Acad Sci U S A ; 118(14), 2021 Apr 06.
Article in English | MEDLINE | ID: mdl-33737447

ABSTRACT

When addressing a genomic question, having a reliable and adequate reference genome is of utmost importance. This drives the necessity to refine and customize reference genomes (RGs). Our laboratory has recently developed a strategy, the Perfect Match Genomic Landscape (PMGL), to detect variation between genomes [K. Palacios-Flores et al., Genetics 208, 1631-1641 (2018)]. The PMGL is precise and sensitive and, in contrast to most currently used algorithms, is nonstatistical in nature. Here we demonstrate the power of PMGL to refine and customize RGs. As a proof of concept, we refined different versions of the Saccharomyces cerevisiae RG. We applied the automatic PMGL pipeline to refine the genomes of microorganisms belonging to the three domains of life: the archaea Methanococcus maripaludis and Pyrococcus furiosus; the bacteria Escherichia coli, Staphylococcus aureus, and Bacillus subtilis; and the eukarya Schizosaccharomyces pombe, Aspergillus oryzae, and several strains of Saccharomyces paradoxus. We analyzed the reference genome of the virus SARS-CoV-2 and previously published viral genomes from samples of patients with COVID-19. We performed a mutation-accumulation experiment in E. coli and show that the PMGL strategy can detect specific mutations generated at any desired step of the whole procedure. We propose that PMGL can be used as a final step for the refinement and customization of any haploid genome, independently of the strategies and algorithms used in its assembly.


Subjects
Genetic Variation; Genome, Microbial; Genomics/methods; SARS-CoV-2/genetics; Algorithms; Mutation Accumulation; Proof of Concept Study; Saccharomyces cerevisiae/genetics
4.
Sci Rep ; 10(1): 514, 2020 Jan 16.
Article in English | MEDLINE | ID: mdl-31949184

ABSTRACT

Chronic Obstructive Pulmonary Disease (COPD) and Idiopathic Pulmonary Fibrosis (IPF) have contrasting clinical and pathological characteristics and interesting whole-genome transcriptomic profiles. However, data from public repositories are difficult to reprocess and reanalyze. Here, we present PulmonDB, a web-based database (http://pulmondb.liigh.unam.mx/) and R library that facilitates exploration of gene expression profiles for these diseases by integrating transcriptomic data and curated annotation from different sources. We demonstrated the value of this resource by presenting the expression of already well-known genes of COPD and IPF across multiple experiments and the results of two differential expression analyses in which we successfully identified differences and similarities. With this first version of PulmonDB, we create a new hypothesis and compare the two diseases from a transcriptomics perspective.


Subjects
Databases, Genetic; Gene Regulatory Networks; Idiopathic Pulmonary Fibrosis/genetics; Pulmonary Disease, Chronic Obstructive/genetics; Data Curation; Gene Expression Profiling; Gene Expression Regulation; Humans; Internet; Exome Sequencing
5.
Bioinformatics ; 35(22): 4803-4805, 2019 Nov 01.
Article in English | MEDLINE | ID: mdl-31161195

ABSTRACT

MOTIVATION: Identifying disease-causing variants from exome sequencing projects remains a challenging task that often requires bioinformatics expertise. Here we describe a user-friendly graphical application that allows medical professionals and bench biologists to prioritize and visualize genetic variants from human exome sequencing data. RESULTS: We have implemented VCF/Plotein, a graphical, fully interactive web application able to display exome sequencing data in VCF format. Gene and variant information is extracted from Ensembl. Cross-referencing with external databases and application-based gene and variant filtering have also been implemented. All data processing is done locally by the user's CPU to ensure the security of patient data. AVAILABILITY AND IMPLEMENTATION: Freely available on the web at https://vcfplotein.liigh.unam.mx. Website implemented in JavaScript using the Vue.js framework, with all major browsers supported. Source code freely available for download at https://github.com/raulossio/VCF-plotein. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Exome; Genomics; Humans; Software
6.
Proc Natl Acad Sci U S A ; 116(17): 8445-8450, 2019 Apr 23.
Article in English | MEDLINE | ID: mdl-30962378

ABSTRACT

Genomes are dynamic structures. Different mechanisms participate in the generation of genomic rearrangements. One of them is nonallelic homologous recombination (NAHR). This rearrangement is generated by recombination between pairs of repeated sequences with high identity. We analyzed rearrangements mediated by repeated sequences located in different chromosomes. Such rearrangements generate chimeric chromosomes. Potential rearrangements were predicted by localizing interchromosomal identical repeated sequences along the nuclear genome of the Saccharomyces cerevisiae S288C strain. Rearrangements were identified by a PCR-based experimental strategy. PCR primers are located in the unique regions bordering each repeated region of interest. When the PCR is performed using forward primers from one chromosome and reverse primers from another chromosome, the break point of the chimeric chromosome structure is revealed. In all cases analyzed, the corresponding chimeric structures were found. Furthermore, the nucleotide sequence of chimeric structures was obtained, and the origin of the unique regions bordering the repeated sequence was located in the expected chromosomes, using the perfect-match genomic landscape strategy (PMGL). Several chimeric structures were searched in colonies derived from single cells. All of the structures were found in DNA isolated from each of the colonies. Our findings indicate that interchromosomal rearrangements that generate chimeric chromosomes are recurrent and occur, at a relatively high frequency, in cell populations of S. cerevisiae.
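
The prediction step described above (localizing identical repeated sequences shared between chromosomes) can be illustrated with a toy k-mer index. This is a minimal sketch, not the authors' pipeline: the function name `shared_repeats`, the fixed repeat length `k`, and the example sequences are all hypothetical, and reverse-complement matches are ignored for brevity.

```python
from collections import defaultdict


def shared_repeats(chromosomes: dict[str, str], k: int) -> dict[str, dict[str, list[int]]]:
    """Map each k-mer found on two or more chromosomes to its start positions.

    Hypothetical helper: exact matches on the forward strand only.
    """
    index: dict[str, dict[str, list[int]]] = defaultdict(lambda: defaultdict(list))
    for name, seq in chromosomes.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]][name].append(i)
    # Keep only k-mers that occur on at least two different chromosomes.
    return {
        kmer: dict(by_chrom)
        for kmer, by_chrom in index.items()
        if len(by_chrom) >= 2
    }


if __name__ == "__main__":
    # Made-up sequences sharing the 6-mer "CCCGGG".
    toy = {"chrA": "AAACCCGGGTTT", "chrB": "TTTGCCCGGGAAT"}
    print(shared_repeats(toy, k=6))
```

In a setup like the one the abstract describes, primers would then be placed in the unique regions flanking each shared repeat; that experimental step is outside what this sketch covers.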


Subjects
Chromosomes, Fungal/genetics; Gene Rearrangement/genetics; Genome, Fungal/genetics; Saccharomyces cerevisiae/genetics; Genomics; Models, Genetic; Polymerase Chain Reaction; Repetitive Sequences, Nucleic Acid/genetics; Sequence Analysis, DNA
7.
Nucleic Acids Res ; 47(D1): D212-D220, 2019 Jan 08.
Article in English | MEDLINE | ID: mdl-30395280

ABSTRACT

RegulonDB, first published 20 years ago, is a comprehensive electronic resource about regulation of transcription initiation of Escherichia coli K-12 with decades of knowledge from classic molecular biology experiments, and recently also from high-throughput genomic methodologies. We curated the literature to keep RegulonDB up to date, and initiated curation of ChIP and gSELEX experiments. We estimate that current knowledge describes between 10% and 30% of the expected total number of transcription factor-gene regulatory interactions in E. coli. RegulonDB provides datasets for interactions for which there is no evidence that they affect expression, as well as expression datasets. We developed a proof-of-concept pipeline to merge binding and expression evidence to identify regulatory interactions. These datasets can be visualized in the RegulonDB JBrowse. We developed the Microbial Conditions Ontology with a controlled vocabulary for the minimal properties to reproduce an experiment, which helps integrate data from high-throughput experiments and the classic literature. At a higher level of integration, we report Genetic Sensory-Response Units for 200 transcription factors, including their regulation at the metabolic level, and include summaries for 70 of them. Finally, we summarize our research with natural language processing strategies to enhance our biocuration work.


Subjects
Computational Biology/methods; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Genomics; Gene Ontology; Gene Regulatory Networks; Genomics/methods; High-Throughput Nucleotide Sequencing
8.
Curr Protoc Bioinformatics ; 61(1): 1.32.1-1.32.30, 2018 Mar.
Article in English | MEDLINE | ID: mdl-30040192

ABSTRACT

In RegulonDB, for over 25 years, we have been gathering knowledge by manual curation from original scientific literature on the regulation of transcription initiation and genome organization in transcription units of the Escherichia coli K-12 genome. This unit describes six basic protocols that can serve as a guiding introduction to the main content of the current version (v9.4) of this electronic resource. These protocols include general navigation as well as searching for specific objects such as genes, gene products, transcription units, promoters, transcription factors, coexpression, and genetic sensory response units or GENSOR Units. In these protocols, the user will find an initial introduction to the concepts pertinent to the protocol, the content obtained when performing the given navigation, and the necessary resources for carrying out the protocol. This easy-to-follow presentation should help anyone interested in quickly seeing all that is currently offered in RegulonDB, including position weight matrices of transcription factors, coexpression values based on published microarrays, and the GENSOR Units unique to RegulonDB that offer regulatory mechanisms in the context of their signals and metabolic consequences. © 2018 by John Wiley & Sons, Inc.


Subjects
Databases, Genetic; Escherichia coli K12/genetics; Gene Regulatory Networks; Regulon/genetics; Transcription, Genetic; Gene Expression Regulation, Bacterial; Internet; Operon/genetics; Promoter Regions, Genetic; Transcription Factors/metabolism
9.
Genetics ; 208(4): 1631-1641, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29367403

ABSTRACT

We present a conceptually simple, sensitive, precise, and essentially nonstatistical solution for the analysis of genome variation in haploid organisms. The generation of a Perfect Match Genomic Landscape (PMGL), which computes intergenome identity with single nucleotide resolution, reveals signatures of variation wherever a query genome differs from a reference genome. Such signatures encode the precise location of different types of variants, including single nucleotide variants, deletions, insertions, and amplifications, effectively introducing the concept of a general signature of variation. The precise nature of variants is then resolved through the generation of targeted alignments between specific sets of sequence reads and known regions of the reference genome. Thus, the perfect match logic decouples the identification of the location of variants from the characterization of their nature, providing a unified framework for the detection of genome variation. We assessed the performance of the PMGL strategy via simulation experiments. We determined the variation profiles of natural genomes and of a synthetic chromosome, both in the context of haploid yeast strains. Our approach uncovered variants that have previously escaped detection. Moreover, our strategy is ideally suited for further refining high-quality reference genomes. The source codes for the automated PMGL pipeline have been deposited in a public repository.
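
The perfect-match idea can be illustrated with a toy example. The sketch below is an assumption-laden simplification, not the published PMGL pipeline: it counts, for every k-mer start position in the reference, how many read k-mers match that reference k-mer exactly, and then reports runs of zero counts as candidate signatures of variation. The function names, the choice of `k`, and the sequences are hypothetical.

```python
from collections import Counter


def perfect_match_landscape(reference: str, reads: list[str], k: int) -> list[int]:
    """For each reference k-mer start, count read k-mers that match it exactly."""
    read_kmers = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    return [read_kmers[reference[i:i + k]] for i in range(len(reference) - k + 1)]


def zero_coverage_runs(landscape: list[int]) -> list[tuple[int, int]]:
    """Return inclusive (start, end) index ranges where the landscape is zero."""
    runs = []
    start = None
    for i, count in enumerate(landscape):
        if count == 0 and start is None:
            start = i
        elif count != 0 and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(landscape) - 1))
    return runs


if __name__ == "__main__":
    reference = "ACGTTGCATGCCGATAGCTA"
    query = "ACGTTGCATGACGATAGCTA"  # made-up query with C->A at position 10
    reads = [query[:12], query[8:]]  # two overlapping toy "reads"
    landscape = perfect_match_landscape(reference, reads, k=4)
    print(zero_coverage_runs(landscape))  # [(7, 10)]
```

With k = 4, a single substitution at position 10 removes perfect matches for the four reference k-mers that overlap it (starts 7 to 10), so the last index of the zero run points at the variant. Characterizing what the variant actually is would require the targeted alignment step that the abstract mentions, which this sketch does not attempt.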


Subjects
Genetic Variation; Genome; Genomics; Haploidy; Chromosomes; Computational Biology; Computer Simulation; Genetic Testing; Genome, Fungal; Genome, Human; Genome-Wide Association Study; Genomics/methods; Humans; Polymorphism, Single Nucleotide; Whole Genome Sequencing; Yeasts/genetics
10.
Nucleic Acids Res ; 44(D1): D133-43, 2016 Jan 04.
Article in English | MEDLINE | ID: mdl-26527724

ABSTRACT

RegulonDB (http://regulondb.ccg.unam.mx) is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to keeping curation up to date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for 'neighborhood' genes to known operons and regulons, and computational developments.


Subjects
Databases, Genetic; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Regulon; Cluster Analysis; Escherichia coli K12/metabolism; Gene Regulatory Networks; Operon; Position-Specific Scoring Matrices; RNA, Small Untranslated/metabolism; Transcription Factors/classification
11.
Nucleic Acids Res ; 41(Database issue): D203-13, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23203884

ABSTRACT

This article summarizes our progress with RegulonDB (http://regulondb.ccg.unam.mx/) during the past 2 years. We have kept up-to-date the knowledge from the published literature regarding transcriptional regulation in Escherichia coli K-12. We have maintained and expanded our curation efforts to improve the breadth and quality of the encoded experimental knowledge, and we have implemented criteria for the quality of our computational predictions. Regulatory phrases now provide high-level descriptions of regulatory regions. We expanded the assignment of quality to various sources of evidence, particularly for knowledge generated through high-throughput (HT) technology. Based on our analysis of most relevant methods, we defined rules for determining the quality of evidence when multiple independent sources support an entry. With this latest release of RegulonDB, we present a new highly reliable larger collection of transcription start sites, a result of our experimental HT genome-wide efforts. These improvements, together with several novel enhancements (the tracks display, uploading format and curational guidelines), address the challenges of incorporating HT-generated knowledge into RegulonDB. Information on the evolutionary conservation of regulatory elements is also available now. Altogether, RegulonDB version 8.0 is a much better home for integrating knowledge on gene regulation from the sources of information currently available.


Subjects
Databases, Genetic; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Regulatory Elements, Transcriptional; Transcription, Genetic; Bacterial Proteins/metabolism; Databases, Genetic/standards; Evolution, Molecular; Genomics; Internet; Promoter Regions, Genetic; Regulon; Repressor Proteins/metabolism; Sequence Analysis, RNA; Transcription Factors/metabolism; Transcription Initiation Site
12.
Methods Mol Biol ; 804: 179-95, 2012.
Article in English | MEDLINE | ID: mdl-22144154

ABSTRACT

RegulonDB contains the largest and currently best-known data set on transcriptional regulation in a single free-living organism, that of Escherichia coli K-12 (Gama-Castro et al. Nucleic Acids Res 36:D120-D124, 2008). This organized knowledge has been the gold standard for the implementation of bioinformatic predictive methods on gene regulation in bacteria (Collado-Vides et al. J Bacteriol 191:23-31, 2009). Given the complexity of different types of interactions, the difficulty of visualizing the whole network in a single figure, and the different uses of this knowledge, we are making available different views of the genetic network. This chapter describes case studies about how to access these views via precomputed files, web services and SQL, including sigma-gene relationships corresponding to transcription of alternative RNA polymerase holoenzyme promoters, as well as transcription factor (TF)-gene, TF-operon, TF-TF, and TF-regulon interactions.


Subjects
Computational Biology/methods; Data Mining/methods; Databases, Genetic; Escherichia coli K12/genetics; Gene Regulatory Networks/genetics; Regulon/genetics; Internet; Operon/genetics; Transcription Factors/genetics
13.
Nucleic Acids Res ; 39(Database issue): D98-105, 2011 Jan.
Article in English | MEDLINE | ID: mdl-21051347

ABSTRACT

RegulonDB (http://regulondb.ccg.unam.mx/) is the primary reference database of the best-known regulatory network of any free-living organism, that of Escherichia coli K-12. The major conceptual change since 3 years ago is an expanded biological context so that transcriptional regulation is now part of a unit that initiates with the signal and continues with the signal transduction to the core of regulation, modifying expression of the affected target genes responsible for the response. We call these genetic sensory response units, or Gensor Units. We have initiated their high-level curation, with graphic maps and superreactions with links to other databases. Additional connectivity uses expandable submaps. RegulonDB has summaries for every transcription factor (TF) and TF-binding sites with internal symmetry. Several DNA-binding motifs and their sizes have been redefined and relocated. In addition to data from the literature, we have incorporated our own information on transcription start sites (TSSs) and transcriptional units (TUs), obtained by using high-throughput whole-genome sequencing technologies. A new portable drawing tool for genomic features is also now available, as well as new ways to download the data, including web services, files for several relational database manager systems and text files including BioPAX format.


Subjects
Databases, Genetic; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Gene Regulatory Networks; Transcription Factors/metabolism; Binding Sites; Escherichia coli K12/metabolism; Signal Transduction; Systems Integration; Transcription Initiation Site; Transcription, Genetic