1.
Cell Syst ; 15(3): 227-245.e7, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38417437

ABSTRACT

Many bacteria use operons to coregulate genes, but it remains unclear how operons benefit bacteria. We integrated E. coli's 788 polycistronic operons and 1,231 transcription units into an existing whole-cell model and found inconsistencies between the proposed operon structures and the RNA-seq read counts that the model was parameterized from. We resolved these inconsistencies through iterative, model-guided corrections to both datasets, including the correction of RNA-seq counts of short genes that were misreported as zero by existing alignment algorithms. The resulting model suggested two main modes by which operons benefit bacteria. For 86% of low-expression operons, adding operons increased the co-expression probabilities of their constituent proteins, whereas for 92% of high-expression operons, adding operons resulted in more stable expression ratios between the proteins. These simulations underscored the need for further experimental work on how operons reduce noise and synchronize both the expression timing and the quantity of constituent genes. A record of this paper's transparent peer review process is included in the supplemental information.


Subjects
Escherichia coli , Operon , Escherichia coli/genetics , Operon/genetics , Bacteria/genetics
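
The co-expression claim for low-expression operons in the abstract above can be illustrated with a toy simulation. This is only a sketch under strong assumptions and is not the whole-cell model: each gene is treated as "expressed" in a cell with a fixed probability, two separately transcribed genes are drawn independently, and two genes in one operon share a single draw. The function name `coexpression_fraction` and every number below are hypothetical.

```python
import random


def coexpression_fraction(p_transcribe, n_cells, in_operon, seed=0):
    """Fraction of simulated cells in which both genes are expressed.

    Toy assumption: a transcription event happens in a cell with
    probability p_transcribe. Genes in one operon share the event;
    separately transcribed genes each get an independent event.
    """
    rng = random.Random(seed)
    both = 0
    for _ in range(n_cells):
        if in_operon:
            # One shared transcript carries both genes.
            a = b = rng.random() < p_transcribe
        else:
            # Two independent transcripts, one per gene.
            a = rng.random() < p_transcribe
            b = rng.random() < p_transcribe
        both += a and b
    return both / n_cells


separate = coexpression_fraction(0.2, 10_000, in_operon=False)
operon = coexpression_fraction(0.2, 10_000, in_operon=True)
print(f"separate transcription units: {separate:.3f}")
print(f"single operon:                {operon:.3f}")
```

Under these assumptions the independent case approaches p squared while the operon case approaches p, so the gap is largest when p is small, which is the low-expression regime the abstract describes.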
2.
Front Microbiol ; 15: 1340413, 2024.
Article in English | MEDLINE | ID: mdl-38357349

ABSTRACT

CyanoCyc is a web portal that integrates an exceptionally rich database collection of information about cyanobacterial genomes with an extensive suite of bioinformatics tools. It was developed to address the needs of the cyanobacterial research and biotechnology communities. The 277 annotated cyanobacterial genomes currently in CyanoCyc are supplemented with computational inferences including predicted metabolic pathways, operons, protein complexes, and orthologs; and with data imported from external databases, such as protein features and Gene Ontology (GO) terms imported from UniProt. Five of the genome databases have undergone manual curation with input from more than a dozen cyanobacteria experts to correct errors and integrate information from more than 1,765 published articles. CyanoCyc has bioinformatics tools that encompass genome, metabolic pathway and regulatory informatics; omics data analysis; and comparative analyses, including visualizations of multiple genomes aligned at orthologous genes, and comparisons of metabolic networks for multiple organisms. CyanoCyc is a high-quality, reliable knowledgebase that accelerates scientists' work by enabling users to quickly find accurate information using its powerful set of search tools, to understand gene function through expert mini-reviews with citations, to acquire information quickly using its interactive visualization tools, and to inform better decision-making for fundamental and applied research.

3.
Metabolites ; 14(1), 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38276300

ABSTRACT

The Omics Dashboard is a software tool for interactive exploration and analysis of metabolomics, transcriptomics, proteomics, and multi-omics datasets. Organized as a hierarchy of cellular systems, the Dashboard at its highest level contains graphical panels for the full range of cellular systems, including biosynthesis, energy metabolism, and response to stimulus. Thus, the Dashboard top level surveys the state of the cell across a broad range of key systems in a single screen. Each Dashboard panel contains a series of X-Y plots depicting the aggregated omics data values relevant to different subsystems of that panel, e.g., subsystems within the biosynthesis panel include amino acid biosynthesis, carbohydrate biosynthesis and cofactor biosynthesis. Users can interactively drill down to focus in on successively lower-level subsystems of interest. In this article, we present for the first time the metabolomics analysis capabilities of the Omics Dashboard, along with significant new extensions to better accommodate metabolomics datasets, enable analysis and visualization of multi-omics datasets, and provide new data-filtering options.

4.
EcoSal Plus ; 11(1): eesp00022023, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-37220074

ABSTRACT

EcoCyc is a bioinformatics database available online at EcoCyc.org that describes the genome and the biochemical machinery of Escherichia coli K-12 MG1655. The long-term goal of the project is to describe the complete molecular catalog of the E. coli cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli. EcoCyc is an electronic reference source for E. coli biologists and for biologists who work with related microorganisms. The database includes information pages on each E. coli gene product, metabolite, reaction, operon, and metabolic pathway. The database also includes information on the regulation of gene expression, E. coli gene essentiality, and nutrient conditions that do or do not support the growth of E. coli. The website and downloadable software contain tools for the analysis of high-throughput data sets. In addition, a steady-state metabolic flux model is generated from each new version of EcoCyc and can be executed online. The model can predict metabolic flux rates, nutrient uptake rates, and growth rates for different gene knockouts and nutrient conditions. Data generated from a whole-cell model that is parameterized from the latest data on EcoCyc are also available. This review outlines the data content of EcoCyc and of the procedures by which this content is generated.


Subjects
Escherichia coli K12 , Escherichia coli Proteins , Escherichia coli/genetics , Escherichia coli/metabolism , Escherichia coli K12/genetics , Databases, Genetic , Software , Computational Biology , Escherichia coli Proteins/metabolism
6.
Database (Oxford) ; 2022, 2022 Dec 15.
Article in English | MEDLINE | ID: mdl-36520791

ABSTRACT

This article offers thoughts on reviewing grant proposals for biological knowledgebases and databases (KDs) in the hope of aiding grant reviewers and applicants in addressing the issue of innovation. Assessing such grant proposals involves a number of subtleties that are worthy of discussion, particularly for new reviewers and applicants. In part, this article is motivated by the release of two funding opportunity announcements by the US National Institutes of Health concerning KDs. We find that the amount of innovation required for different KD projects can vary significantly, particularly depending on where in its life cycle a given project is. Strong innovation is not necessarily required to have an impactful KD project. For example, PubMed has low innovation but high impact. The importance of innovation should be weighted differently for different KD projects depending on the challenges they face and their maturity. The score for the overall impact of a grant proposal might have little dependence on the innovation score, such as for a mature project that is already delivering strong impact.


Subjects
Biological Science Disciplines , Financing, Organized , United States , National Institutes of Health (U.S.) , Knowledge Bases
7.
Front Bioinform ; 2: 869150, 2022.
Article in English | MEDLINE | ID: mdl-36304298

ABSTRACT

The Pathway Tools (PTools) software provides a suite of capabilities for storing and analyzing integrated collections of genomic and metabolic information in the form of organism-specific Pathway/Genome Databases (PGDBs). A microbial community is represented in PTools by generating a PGDB from each metagenome-assembled genome (MAG). PTools computes a metabolic reconstruction for each organism, and predicts its operons. The properties of individual MAGs can be investigated using the many search and visualization operations within PTools. PTools also enables the user to investigate the properties of the microbial community by issuing searches across the full community, and by performing comparative operations across genome and pathway information. The software can generate a metabolic network diagram for the community, and it can overlay community omics datasets on that network diagram. PTools also provides a tool for searching for metabolic transformation routes across an organism community.

8.
mSystems ; 7(5): e0029322, 2022 Oct 26.
Article in English | MEDLINE | ID: mdl-35968975

ABSTRACT

Animals colonized with a defined microbiota represent useful experimental systems to investigate microbiome function. The altered Schaedler flora (ASF) represents a consortium of eight murine bacterial species that have been used for more than 4 decades where the study of mice with a reduced microbiota is desired. In contrast to germ-free mice, or mice colonized with only one or two species, ASF mice show the normal gut structure and immune system development. To further expand the utility of the ASF, we have developed technical and bioinformatic resources to enable a systems-based analysis of microbiome function using this model. Here, we highlighted four distinct applications of these resources that enable and improve (i) measurements of the abundance of each ASF member by quantitative PCR; (ii) exploration and comparative analysis of ASF genomes and the metabolic pathways they encode that comprise the entire gut microbiome; (iii) global transcriptional profiling to identify genes whose expression responds to environmental changes within the gut; and (iv) discovery of genetic changes resulting from the evolutionary adaptation of the microbiota. These resources were designed to be accessible to a broad community of researchers that, in combination with conventionally-reared mice (i.e., with complex microbiome), should contribute to our understanding of microbiome structure and function. IMPORTANCE: Improved experimental systems are needed to advance our understanding of how the gut microbiome influences processes of the mammalian host as well as microbial community structure and function. An approach that is receiving considerable attention is the use of animal models that harbor a stable microbiota of known composition, i.e., defined microbiota, which enables control over an otherwise highly complex and variable feature of mammalian biology. The altered Schaedler flora (ASF) consortium is a well-established defined microbiota model, where mice are stably colonized with 8 distinct murine bacterial species. To take better advantage of the ASF, we established new experimental and bioinformatics resources for researchers to make better use of this model as an experimental system to study microbiome function.


Subjects
Gastrointestinal Microbiome , Microbiota , Animals , Mice , Microbiota/genetics , Disease Models, Animal , Gastrointestinal Microbiome/genetics , Bacteria/genetics , Polymerase Chain Reaction , Mammals/genetics
9.
Database (Oxford) ; 2022, 2022 Aug 12.
Article in English | MEDLINE | ID: mdl-35961013

ABSTRACT

Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages. Such a large gap in knowledge hampers all aspects of biological enterprise and, thereby, is standing in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue funded by the National Science Foundation was held during 3-4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists within the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues that were obstructing the field were identified and discussed, which ultimately allowed for the proposal of solutions on how to move forward.


Subjects
Genomics , Proteins , Base Sequence , Computational Biology , Genome , Molecular Sequence Annotation
10.
Methods Mol Biol ; 2349: 259-289, 2022.
Article in English | MEDLINE | ID: mdl-34718999

ABSTRACT

The MetaFlux software supports creating, executing, and solving quantitative metabolic flux models using flux balance analysis (FBA). MetaFlux offers four modes of operation: (1) solving mode executes an FBA model for an individual organism or for an organism community, (2) gene knockout mode executes an FBA model with one or many gene knockouts, (3) development mode assists the user in creating and improving FBA models, and (4) flux variability analysis mode generates a report of the robustness of an FBA model. MetaFlux also solves dynamic FBA (dFBA) for both individual organisms and communities of organisms. MetaFlux can be used in two different environments: on your local computer, which requires the installation of the Pathway Tools software, or through the web, which does not require installation of Pathway Tools. On your local computer, MetaFlux offers all four modes of operation, whereas the web environment provides only the solving mode. Several visualization tools are available to analyze model solutions. The Cellular Overview tool graphically shows the reaction fluxes on an organism's metabolic map once a model is solved. The Omics Dashboard provides a hierarchical approach to visualizing reaction fluxes, organized by metabolic subsystems. For a community of organisms, plotting of accumulated biomasses and metabolites can be performed using the Gnuplot tool. In this chapter, we present eight methods using MetaFlux. Five solving mode methods illustrate execution of models for individual organisms and for organism communities. One method illustrates the gene knockout mode. Two methods for the development mode illustrate steps for developing new metabolic models.


Subjects
Metabolic Networks and Pathways , Models, Biological , Software , Algorithms , Biomass , Gene Knockout Techniques , Metabolic Flux Analysis
11.
Front Microbiol ; 12: 711077, 2021.
Article in English | MEDLINE | ID: mdl-34394059

ABSTRACT

The EcoCyc model-organism database collects and summarizes experimental data for Escherichia coli K-12. EcoCyc is regularly updated by the manual curation of individual database entries, such as genes, proteins, and metabolic pathways, and by the programmatic addition of results from select high-throughput analyses. Updates to the Pathway Tools software that supports EcoCyc and to the web interface that enables user access have continuously improved its usability and expanded its functionality. This article highlights recent improvements to the curated data in the areas of metabolism, transport, DNA repair, and regulation of gene expression. New and revised data analysis and visualization tools include an interactive metabolic network explorer, a circular genome viewer, and various improvements to the speed and usability of existing tools.

12.
J Integr Plant Biol ; 63(11): 1888-1905, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34403192

ABSTRACT

To understand and engineer plant metabolism, we need a comprehensive and accurate annotation of all metabolic information across plant species. As a step towards this goal, we generated genome-scale metabolic pathway databases of 126 algal and plant genomes, ranging from model organisms to crops to medicinal plants (https://plantcyc.org). Of these, 104 have not been reported before. We systematically evaluated the quality of the databases, which revealed that our semi-automated validation pipeline dramatically improves the quality. We then compared the metabolic content across the 126 organisms using multiple correspondence analysis and found that Brassicaceae, Poaceae, and Chlorophyta appeared as metabolically distinct groups. To demonstrate the utility of this resource, we used recently published sorghum transcriptomics data to discover previously unreported trends of metabolism underlying drought tolerance. We also used single-cell transcriptomics data from the Arabidopsis root to infer cell type-specific metabolic pathways. This work shows the quality and quantity of our resource and demonstrates its wide-ranging utility in integrating metabolism with other areas of plant biology.


Subjects
Databases, Factual , Metabolic Networks and Pathways , Plants/metabolism , Viridiplantae/metabolism , Genome, Plant , Plants/genetics
13.
BMC Bioinformatics ; 22(1): 208, 2021 Apr 21.
Article in English | MEDLINE | ID: mdl-33882841

ABSTRACT

BACKGROUND: The Metabolic Network Explorer is a new addition to the BioCyc.org website and the Pathway Tools software suite that supports the interactive exploration of metabolic networks. Any metabolic network visualization tool must by necessity show only a subset of all possible metabolite connections, or the results will be visually overwhelming. Existing tools, even those that purport to show an organism's full metabolic network, limit the set of displayed connections based on predefined pathways or other preselected criteria. We sought instead to provide a tool that would give the user dynamic control over which connections to follow. RESULTS: The Metabolic Network Explorer is an easy-to-use, web-based software tool that allows the user to specify a starting metabolite of interest and interactively explore its immediate metabolic neighborhood in either or both directions to any desired depth, letting the user select from the full set of connected reactions. Although, as for other tools, only a small portion of the metabolic network is visible at a time, that portion is selected by the user, based on the full reaction complement, and it is easy to switch among alternate paths of interest. The display is intuitive, customizable, and provides copious links to more detailed information pages. CONCLUSIONS: The Metabolic Network Explorer fills a gap in the set of metabolic network visualization tools and complements other modes of exploration. Its primary strengths are its ease of use, diagrams that are intuitive to biologists, and its integration with the broader corpus of data provided by a BioCyc Pathway/Genome Database.


Subjects
Metabolic Networks and Pathways , Software , Internet
14.
Front Microbiol ; 12: 614355, 2021.
Article in English | MEDLINE | ID: mdl-33763039

ABSTRACT

Updating genome databases to reflect newly published molecular findings for an organism was hard enough when only a single strain of a given organism had been sequenced. With multiple sequenced strains now available for many organisms, the challenge has grown significantly because of the still-limited resources available for the manual curation that corrects errors and captures new knowledge. We have developed a method to automatically propagate multiple types of curated knowledge from genes and proteins in one genome database to their orthologs in uncurated databases for related strains, imposing several quality-control filters to reduce the chances of introducing errors. We have applied this method to propagate information from the highly curated EcoCyc database for Escherichia coli K-12 to databases for 480 other Escherichia coli strains in the BioCyc database collection. The increase in value and utility of the target databases after propagation is considerable. Target databases received updates for an average of 2,535 proteins each. In addition to widespread addition and regularization of gene and protein names, 97% of the target databases were improved by the addition of at least 200 new protein complexes, at least 800 new or updated reaction assignments, and at least 2,400 sets of GO annotations.

15.
BMC Genomics ; 22(1): 191, 2021 Mar 16.
Article in English | MEDLINE | ID: mdl-33726670

ABSTRACT

BACKGROUND: Enrichment or over-representation analysis is a common method used in bioinformatics studies of transcriptomics, metabolomics, and microbiome datasets. The key idea behind enrichment analysis is: given a set of significantly expressed genes (or metabolites), use that set to infer a smaller set of perturbed biological pathways or processes, in which those genes (or metabolites) play a role. Enrichment computations rely on collections of defined biological pathways and/or processes, which are usually drawn from pathway databases. Although practitioners of enrichment analysis take great care to employ statistical corrections (e.g., for multiple testing), they appear unaware that enrichment results are quite sensitive to the pathway definitions that the calculation uses. RESULTS: We show that alternative pathway definitions can alter enrichment p-values by up to nine orders of magnitude, whereas statistical corrections typically alter enrichment p-values by only two orders of magnitude. We present multiple examples where the smaller pathway definitions used in the EcoCyc database produce stronger enrichment p-values than the much larger pathway definitions used in the KEGG database; we demonstrate that to attain a given enrichment p-value, KEGG-based enrichment analyses require 1.3-2.0 times as many significantly expressed genes as do EcoCyc-based enrichment analyses. The large pathways in KEGG are problematic for another reason: they blur together multiple (as many as 21) biological processes. When such a KEGG pathway receives a high enrichment p-value, which of its component processes is perturbed is unclear, and thus the biological conclusions drawn from enrichment of large pathways are also in question. CONCLUSIONS: The choice of pathway database used in enrichment analyses can have a much stronger effect on the enrichment results than the statistical corrections used in these analyses.


Subjects
Computational Biology , Metabolomics , Databases, Factual
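
The sensitivity of enrichment p-values to pathway size described in the abstract above can be sketched with a one-sided hypergeometric test, a common choice for over-representation analysis. The abstract does not say which test the authors used, so treat the test itself as an assumption. The function name `hypergeom_enrichment_p` and all gene counts below are hypothetical and are not taken from EcoCyc or KEGG.

```python
from math import comb


def hypergeom_enrichment_p(universe, pathway_size, n_significant, n_hits):
    """One-sided hypergeometric p-value, P(X >= n_hits).

    n_significant genes are drawn without replacement from `universe`
    genes, of which `pathway_size` belong to the pathway being tested.
    """
    total = comb(universe, n_significant)
    upper = min(pathway_size, n_significant)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, n_significant - k)
        for k in range(n_hits, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 4,000 genes, 100 significant, 8 of them in the pathway.
p_small = hypergeom_enrichment_p(4000, 20, 100, 8)   # compact pathway definition
p_large = hypergeom_enrichment_p(4000, 200, 100, 8)  # broad pathway definition
print(f"compact pathway (20 genes):  p = {p_small:.3e}")
print(f"broad pathway (200 genes):   p = {p_large:.3e}")
```

With the same eight significant genes, the compact definition gives a much smaller p-value than the broad one, because eight hits are far less likely by chance in a 20-gene pathway than in a 200-gene pathway.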
16.
Metabolites ; 11(2), 2021 Jan 22.
Article in English | MEDLINE | ID: mdl-33499002

ABSTRACT

Metabolomics, synthetic biology, and microbiome research demand information about organism-scale metabolic networks. The convergence of genome sequencing and computational inference of metabolic networks has enabled great progress toward satisfying that demand by generating metabolic reconstructions from the genomes of thousands of sequenced organisms. Visualization of whole metabolic networks is critical for aiding researchers in understanding, analyzing, and exploiting those reconstructions. We have developed bioinformatics software tools that automatically generate a full metabolic-network diagram for an organism, and that enable searching and analyses of the network. The software generates metabolic-network diagrams for unicellular organisms, for multi-cellular organisms, and for pan-genomes and organism communities. Search tools enable users to find genes, metabolites, enzymes, reactions, and pathways within a diagram. The diagrams are zoomable to enable researchers to study local neighborhoods in detail and to see the big picture. The diagrams also serve as tools for comparison of metabolic networks and for interpreting high-throughput datasets, including transcriptomics, metabolomics, and reaction fluxes computed by metabolic models. These data can be overlaid on the metabolic charts to produce animated zoomable displays of metabolic flux and metabolite abundance. The BioCyc.org website contains whole-network diagrams for more than 18,000 sequenced organisms. The ready availability of organism-specific metabolic network diagrams and associated tools for almost any sequenced organism is useful for researchers working to better understand the metabolism of their organism and to interpret high-throughput datasets in a metabolic context.

17.
Brief Bioinform ; 22(1): 109-126, 2021 Jan 18.
Article in English | MEDLINE | ID: mdl-31813964

ABSTRACT

MOTIVATION: Biological systems function through dynamic interactions among genes and their products, regulatory circuits and metabolic networks. Our development of the Pathway Tools software was motivated by the need to construct biological knowledge resources that combine these many types of data, and that enable users to find and comprehend data of interest as quickly as possible through query and visualization tools. Further, we sought to support the development of metabolic flux models from pathway databases, and to use pathway information to leverage the interpretation of high-throughput data sets. RESULTS: In the past 4 years we have enhanced the already extensive Pathway Tools software in several respects. It can now support metabolic-model execution through the Web; it provides a more accurate gap filler for metabolic models; it supports development of models for organism communities distributed across a spatial grid; and model results may be visualized graphically. Pathway Tools supports several new omics-data analysis tools including the Omics Dashboard, multi-pathway diagrams called pathway collages, a pathway-covering algorithm for metabolomics data analysis and an algorithm for generating mechanistic explanations of multi-omics data. We have also improved the core pathway/genome database management capabilities of the software, providing new multi-organism search tools for organism communities, improved graphics rendering, faster performance and re-designed gene and metabolite pages. AVAILABILITY: The software is free for academic use; a fee is required for commercial use. See http://pathwaytools.com. CONTACT: pkarp@ai.sri.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Briefings in Bioinformatics online.


Subjects
Genomics/methods , Metabolomics/methods , Software/standards , Systems Biology/methods , Animals , Humans
19.
mBio ; 11(5), 2020 Sep 29.
Article in English | MEDLINE | ID: mdl-32994326

ABSTRACT

Central metabolism is a topic that has been studied for decades, and yet, this process is still not fully understood in Escherichia coli, perhaps the most amenable and well-studied model organism in biology. To further our understanding, we used a high-throughput method to measure the growth kinetics of each of 3,796 E. coli single-gene deletion mutants in 30 different carbon sources. In total, there were 342 genes (9.01%) encompassing a breadth of biological functions that showed a growth phenotype on at least 1 carbon source, demonstrating that carbon metabolism is closely linked to a large number of processes in the cell. We identified 74 genes that showed low growth in 90% of conditions, defining a set of genes which are essential in nutrient-limited media, regardless of the carbon source. The data are compiled into a Web application, Carbon Phenotype Explorer (CarPE), to facilitate easy visualization of growth curves for each mutant strain in each carbon source. Our experimental data matched closely with the predictions from the EcoCyc metabolic model which uses flux balance analysis to predict growth phenotypes. From our comparisons to the model, we found that, unexpectedly, phosphoenolpyruvate carboxylase (ppc) was required for robust growth in most carbon sources other than most tricarboxylic acid (TCA) cycle intermediates. We also identified 51 poorly annotated genes that showed a low growth phenotype in at least 1 carbon source, which allowed us to form hypotheses about the functions of these genes. From this list, we further characterized the ydhC gene and demonstrated its role in adenosine efflux. IMPORTANCE: While there has been much study of bacterial gene dispensability, there is a lack of comprehensive genome-scale examinations of the impact of gene deletion on growth in different carbon sources. In this context, a lot can be learned from such experiments in the model microbe Escherichia coli where much is already understood and there are existing tools for the investigation of carbon metabolism and physiology (1). Gene deletion studies have practical potential in the field of antibiotic drug discovery where there is emerging interest in bacterial central metabolism as a target for new antibiotics (2). Furthermore, some carbon utilization pathways have been shown to be critical for initiating and maintaining infection for certain pathogens and sites of infection (3-5). Here, with the use of high-throughput solid medium phenotyping methods, we have generated kinetic growth measurements for 3,796 genes under 30 different carbon source conditions. This data set provides a foundation for research that will improve our understanding of genes with unknown function, aid in predicting potential antibiotic targets, validate and advance metabolic models, and help to develop our understanding of E. coli metabolism.


Subjects
Carbon/metabolism , Culture Media/chemistry , Escherichia coli Proteins/genetics , Escherichia coli/growth & development , Escherichia coli/genetics , Gene Deletion , Gene Expression Regulation, Bacterial , Kinetics , Mutation , Phenotype
20.
Science ; 369(6502), 2020 Jul 24.
Article in English | MEDLINE | ID: mdl-32703847

ABSTRACT

The extensive heterogeneity of biological data poses challenges to analysis and interpretation. Construction of a large-scale mechanistic model of Escherichia coli enabled us to integrate and cross-evaluate a massive, heterogeneous dataset based on measurements reported by various groups over decades. We identified inconsistencies with functional consequences across the data, including that the total output of the ribosomes and RNA polymerases described by data are not sufficient for a cell to reproduce measured doubling times, that measured metabolic parameters are neither fully compatible with each other nor with overall growth, and that essential proteins are absent during the cell cycle-and the cell is robust to this absence. Finally, considering these data as a whole leads to successful predictions of new experimental outcomes, in this case protein half-lives.


Subjects
Data Analysis , Datasets as Topic , Escherichia coli Proteins , Escherichia coli , Computer Simulation