Results 1 - 20 of 116
1.
mSystems ; e0026724, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38958457

ABSTRACT

Are two adjacent genes in the same operon? What are the order and spacing between several transcription factor binding sites? Genome browsers are software data visualization and exploration tools that enable biologists to answer questions such as these. In this paper, we report on a major update to our browser, Genome Explorer, that provides nearly instantaneous scaling and traversing of a genome, enabling users to quickly and easily zoom into an area of interest. The user can rapidly move between scales that depict the entire genome, individual genes, and the sequence; Genome Explorer presents the most relevant detail and context for each scale. By downloading the data for the entire genome to the user's web browser and dynamically generating visualizations locally, we enable fine control of zoom and pan functions and real-time redrawing of the visualization, resulting in smoother and more intuitive exploration of a genome than is possible with other browsers. Further, genome features are presented together, in-line, using familiar graphical depictions. In contrast, many other browsers depict genome features using data tracks, which have low information density and can visually obscure the relative positions of features. Genome Explorer diagrams have a high information density that allows larger amounts of genome context and sequence information to be presented on a given-sized monitor than tracks-based browsers do. Genome Explorer provides optional data tracks for the analysis of large-scale data sets and a unique comparative mode that aligns genomes at orthologous genes with synchronized zooming.
IMPORTANCE: Genome browsers provide graphical depictions of genome information to speed the uptake of complex genome data by scientists. They provide search operations to help scientists find information and zoom operations to enable scientists to view genome features at different resolutions. We introduce the Genome Explorer browser, which provides extremely fast zooming and panning of genome visualizations and displays with high information density.

2.
bioRxiv ; 2024 Jun 12.
Article in English | MEDLINE | ID: mdl-38915637

ABSTRACT

The Comparative Genome Dashboard is a web-based software tool for interactive exploration of the similarities and differences in gene functions between organisms. It provides a high-level graphical survey of cellular functions, and enables the user to drill down to examine subsystems of interest in greater detail. At its highest level the Comparative Dashboard contains panels for cellular systems such as biosynthesis, energy metabolism, transport, and response to stimulus. Each panel contains a set of bar graphs that plot the numbers of compounds or gene products for each organism across a set of subsystems of that panel. Users can interactively drill down to focus on subsystems of interest and see grids of compounds produced or consumed by each organism, specific GO term assignments, pathway diagrams, and links to more detailed comparison pages. For example, the dashboard enables users to compare the cofactors that a set of organisms can synthesize, the metal ions that they are able to transport, their DNA damage repair capabilities, their biofilm-formation genes, and their viral response proteins. The dashboard enables users to quickly perform comprehensive comparisons at varying levels of detail.

3.
Front Microbiol ; 15: 1340413, 2024.
Article in English | MEDLINE | ID: mdl-38357349

ABSTRACT

CyanoCyc is a web portal that integrates an exceptionally rich database collection of information about cyanobacterial genomes with an extensive suite of bioinformatics tools. It was developed to address the needs of the cyanobacterial research and biotechnology communities. The 277 annotated cyanobacterial genomes currently in CyanoCyc are supplemented with computational inferences including predicted metabolic pathways, operons, protein complexes, and orthologs; and with data imported from external databases, such as protein features and Gene Ontology (GO) terms imported from UniProt. Five of the genome databases have undergone manual curation with input from more than a dozen cyanobacteria experts to correct errors and integrate information from more than 1,765 published articles. CyanoCyc has bioinformatics tools that encompass genome, metabolic pathway and regulatory informatics; omics data analysis; and comparative analyses, including visualizations of multiple genomes aligned at orthologous genes, and comparisons of metabolic networks for multiple organisms. CyanoCyc is a high-quality, reliable knowledgebase that accelerates scientists' work by enabling users to quickly find accurate information using its powerful set of search tools, to understand gene function through expert mini-reviews with citations, to acquire information quickly using its interactive visualization tools, and to inform better decision-making for fundamental and applied research.

4.
Cell Syst ; 15(3): 227-245.e7, 2024 Mar 20.
Article in English | MEDLINE | ID: mdl-38417437

ABSTRACT

Many bacteria use operons to coregulate genes, but it remains unclear how operons benefit bacteria. We integrated E. coli's 788 polycistronic operons and 1,231 transcription units into an existing whole-cell model and found inconsistencies between the proposed operon structures and the RNA-seq read counts that the model was parameterized from. We resolved these inconsistencies through iterative, model-guided corrections to both datasets, including the correction of RNA-seq counts of short genes that were misreported as zero by existing alignment algorithms. The resulting model suggested two main modes by which operons benefit bacteria. For 86% of low-expression operons, adding operons increased the co-expression probabilities of their constituent proteins, whereas for 92% of high-expression operons, adding operons resulted in more stable expression ratios between the proteins. These simulations underscored the need for further experimental work on how operons reduce noise and synchronize both the expression timing and the quantity of constituent genes. A record of this paper's transparent peer review process is included in the supplemental information.


Subjects
Escherichia coli; Operon; Escherichia coli/genetics; Operon/genetics; Bacteria/genetics
5.
Metabolites ; 14(1), 2024 Jan 19.
Article in English | MEDLINE | ID: mdl-38276300

ABSTRACT

The Omics Dashboard is a software tool for interactive exploration and analysis of metabolomics, transcriptomics, proteomics, and multi-omics datasets. Organized as a hierarchy of cellular systems, the Dashboard at its highest level contains graphical panels for the full range of cellular systems, including biosynthesis, energy metabolism, and response to stimulus. Thus, the Dashboard top level surveys the state of the cell across a broad range of key systems in a single screen. Each Dashboard panel contains a series of X-Y plots depicting the aggregated omics data values relevant to different subsystems of that panel, e.g., subsystems within the biosynthesis panel include amino acid biosynthesis, carbohydrate biosynthesis and cofactor biosynthesis. Users can interactively drill down to focus in on successively lower-level subsystems of interest. In this article, we present for the first time the metabolomics analysis capabilities of the Omics Dashboard, along with significant new extensions to better accommodate metabolomics datasets, enable analysis and visualization of multi-omics datasets, and provide new data-filtering options.

6.
EcoSal Plus ; 11(1): eesp00022023, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-37220074

ABSTRACT

EcoCyc is a bioinformatics database available online at EcoCyc.org that describes the genome and the biochemical machinery of Escherichia coli K-12 MG1655. The long-term goal of the project is to describe the complete molecular catalog of the E. coli cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli. EcoCyc is an electronic reference source for E. coli biologists and for biologists who work with related microorganisms. The database includes information pages on each E. coli gene product, metabolite, reaction, operon, and metabolic pathway. The database also includes information on the regulation of gene expression, E. coli gene essentiality, and nutrient conditions that do or do not support the growth of E. coli. The website and downloadable software contain tools for the analysis of high-throughput data sets. In addition, a steady-state metabolic flux model is generated from each new version of EcoCyc and can be executed online. The model can predict metabolic flux rates, nutrient uptake rates, and growth rates for different gene knockouts and nutrient conditions. Data generated from a whole-cell model that is parameterized from the latest data on EcoCyc are also available. This review outlines the data content of EcoCyc and of the procedures by which this content is generated.


Subjects
Escherichia coli K12; Escherichia coli Proteins; Escherichia coli/genetics; Escherichia coli/metabolism; Escherichia coli K12/genetics; Databases, Genetic; Software; Computational Biology; Escherichia coli Proteins/metabolism
8.
Database (Oxford) ; 2022, 2022 Dec 15.
Article in English | MEDLINE | ID: mdl-36520791

ABSTRACT

This article offers thoughts on reviewing grant proposals for biological knowledgebases and databases (KDs) in the hope of aiding grant reviewers and applicants in addressing the issue of innovation. Assessing such grant proposals involves a number of subtleties that are worthy of discussion, particularly for new reviewers and applicants. In part, this article is motivated by the release of two funding opportunity announcements by the US National Institutes of Health concerning KDs. We find that the amount of innovation required for different KD projects can vary significantly, particularly depending on where in its life cycle a given project is. Strong innovation is not necessarily required to have an impactful KD project. For example, PubMed has low innovation but high impact. The importance of innovation should be weighted differently for different KD projects depending on the challenges they face and their maturity. The score for the overall impact of a grant proposal might have little dependence on the innovation score, such as for a mature project that is already delivering strong impact.


Subjects
Biological Science Disciplines; Financing, Organized; United States; National Institutes of Health (U.S.); Knowledge Bases
9.
Front Bioinform ; 2: 869150, 2022.
Article in English | MEDLINE | ID: mdl-36304298

ABSTRACT

The Pathway Tools (PTools) software provides a suite of capabilities for storing and analyzing integrated collections of genomic and metabolic information in the form of organism-specific Pathway/Genome Databases (PGDBs). A microbial community is represented in PTools by generating a PGDB from each metagenome-assembled genome (MAG). PTools computes a metabolic reconstruction for each organism, and predicts its operons. The properties of individual MAGs can be investigated using the many search and visualization operations within PTools. PTools also enables the user to investigate the properties of the microbial community by issuing searches across the full community, and by performing comparative operations across genome and pathway information. The software can generate a metabolic network diagram for the community, and it can overlay community omics datasets on that network diagram. PTools also provides a tool for searching for metabolic transformation routes across an organism community.

10.
Database (Oxford) ; 2022, 2022 Aug 12.
Article in English | MEDLINE | ID: mdl-35961013

ABSTRACT

Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages. Such a large gap in knowledge hampers all aspects of biological enterprise and, thereby, is standing in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue funded by the National Science Foundation was held during 3-4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists within the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues that were obstructing the field were identified and discussed, which ultimately allowed for the proposal of solutions on how to move forward.


Subjects
Genomics; Proteins; Base Sequence; Computational Biology; Genome; Molecular Sequence Annotation
11.
mSystems ; 7(5): e0029322, 2022 Oct 26.
Article in English | MEDLINE | ID: mdl-35968975

ABSTRACT

Animals colonized with a defined microbiota represent useful experimental systems to investigate microbiome function. The altered Schaedler flora (ASF) represents a consortium of eight murine bacterial species that have been used for more than 4 decades where the study of mice with a reduced microbiota is desired. In contrast to germ-free mice, or mice colonized with only one or two species, ASF mice show the normal gut structure and immune system development. To further expand the utility of the ASF, we have developed technical and bioinformatic resources to enable a systems-based analysis of microbiome function using this model. Here, we highlighted four distinct applications of these resources that enable and improve (i) measurements of the abundance of each ASF member by quantitative PCR; (ii) exploration and comparative analysis of ASF genomes and the metabolic pathways they encode that comprise the entire gut microbiome; (iii) global transcriptional profiling to identify genes whose expression responds to environmental changes within the gut; and (iv) discovery of genetic changes resulting from the evolutionary adaptation of the microbiota. These resources were designed to be accessible to a broad community of researchers that, in combination with conventionally-reared mice (i.e., with complex microbiome), should contribute to our understanding of microbiome structure and function.
IMPORTANCE: Improved experimental systems are needed to advance our understanding of how the gut microbiome influences processes of the mammalian host as well as microbial community structure and function. An approach that is receiving considerable attention is the use of animal models that harbor a stable microbiota of known composition, i.e., defined microbiota, which enables control over an otherwise highly complex and variable feature of mammalian biology. The altered Schaedler flora (ASF) consortium is a well-established defined microbiota model, where mice are stably colonized with 8 distinct murine bacterial species. To take better advantage of the ASF, we established new experimental and bioinformatics resources for researchers to make better use of this model as an experimental system to study microbiome function.


Subjects
Gastrointestinal Microbiome; Microbiota; Animals; Mice; Microbiota/genetics; Disease Models, Animal; Gastrointestinal Microbiome/genetics; Bacteria/genetics; Polymerase Chain Reaction; Mammals/genetics
12.
Methods Mol Biol ; 2349: 259-289, 2022.
Article in English | MEDLINE | ID: mdl-34718999

ABSTRACT

The MetaFlux software supports creating, executing, and solving quantitative metabolic flux models using flux balance analysis (FBA). MetaFlux offers four modes of operation: (1) solving mode executes an FBA model for an individual organism or for an organism community, (2) gene knockout mode executes an FBA model with one or many gene knockouts, (3) development mode assists the user in creating and improving FBA models, and (4) flux variability analysis mode generates a report of the robustness of an FBA model. MetaFlux also solves dynamic FBA (dFBA) for both individual organisms and communities of organisms. MetaFlux can be used in two different environments: on your local computer, which requires the installation of the Pathway Tools software, or through the web, which does not require installation of Pathway Tools. On your local computer, MetaFlux offers all four modes of operation, whereas the web environment provides only the solving mode. Several visualization tools are available to analyze model solutions. The Cellular Overview tool graphically shows the reaction fluxes on an organism's metabolic map once a model is solved. The Omics Dashboard provides a hierarchical approach to visualizing reaction fluxes, organized by metabolic subsystems. For a community of organisms, plotting of accumulated biomasses and metabolites can be performed using the Gnuplot tool. In this chapter, we present eight methods using MetaFlux. Five solving mode methods illustrate execution of models for individual organisms and for organism communities. One method illustrates the gene knockout mode. Two methods for the development mode illustrate steps for developing new metabolic models.


Subjects
Metabolic Networks and Pathways; Models, Biological; Software; Algorithms; Biomass; Gene Knockout Techniques; Metabolic Flux Analysis
13.
Front Microbiol ; 12: 711077, 2021.
Article in English | MEDLINE | ID: mdl-34394059

ABSTRACT

The EcoCyc model-organism database collects and summarizes experimental data for Escherichia coli K-12. EcoCyc is regularly updated by the manual curation of individual database entries, such as genes, proteins, and metabolic pathways, and by the programmatic addition of results from select high-throughput analyses. Updates to the Pathway Tools software that supports EcoCyc and to the web interface that enables user access have continuously improved its usability and expanded its functionality. This article highlights recent improvements to the curated data in the areas of metabolism, transport, DNA repair, and regulation of gene expression. New and revised data analysis and visualization tools include an interactive metabolic network explorer, a circular genome viewer, and various improvements to the speed and usability of existing tools.

14.
J Integr Plant Biol ; 63(11): 1888-1905, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34403192

ABSTRACT

To understand and engineer plant metabolism, we need a comprehensive and accurate annotation of all metabolic information across plant species. As a step towards this goal, we generated genome-scale metabolic pathway databases of 126 algal and plant genomes, ranging from model organisms to crops to medicinal plants (https://plantcyc.org). Of these, 104 have not been reported before. We systematically evaluated the quality of the databases, which revealed that our semi-automated validation pipeline dramatically improves the quality. We then compared the metabolic content across the 126 organisms using multiple correspondence analysis and found that Brassicaceae, Poaceae, and Chlorophyta appeared as metabolically distinct groups. To demonstrate the utility of this resource, we used recently published sorghum transcriptomics data to discover previously unreported trends of metabolism underlying drought tolerance. We also used single-cell transcriptomics data from the Arabidopsis root to infer cell type-specific metabolic pathways. This work shows the quality and quantity of our resource and demonstrates its wide-ranging utility in integrating metabolism with other areas of plant biology.


Subjects
Databases, Factual; Metabolic Networks and Pathways; Plants/metabolism; Viridiplantae/metabolism; Genome, Plant; Plants/genetics
15.
BMC Bioinformatics ; 22(1): 208, 2021 Apr 21.
Article in English | MEDLINE | ID: mdl-33882841

ABSTRACT

BACKGROUND: The Metabolic Network Explorer is a new addition to the BioCyc.org website and the Pathway Tools software suite that supports the interactive exploration of metabolic networks. Any metabolic network visualization tool must by necessity show only a subset of all possible metabolite connections, or the results will be visually overwhelming. Existing tools, even those that purport to show an organism's full metabolic network, limit the set of displayed connections based on predefined pathways or other preselected criteria. We sought instead to provide a tool that would give the user dynamic control over which connections to follow. RESULTS: The Metabolic Network Explorer is an easy-to-use, web-based software tool that allows the user to specify a starting metabolite of interest and interactively explore its immediate metabolic neighborhood in either or both directions to any desired depth, letting the user select from the full set of connected reactions. Although, as for other tools, only a small portion of the metabolic network is visible at a time, that portion is selected by the user, based on the full reaction complement, and it is easy to switch among alternate paths of interest. The display is intuitive, customizable, and provides copious links to more detailed information pages. CONCLUSIONS: The Metabolic Network Explorer fills a gap in the set of metabolic network visualization tools and complements other modes of exploration. Its primary strengths are its ease of use, diagrams that are intuitive to biologists, and its integration with the broader corpus of data provided by a BioCyc Pathway/Genome Database.


Subjects
Metabolic Networks and Pathways; Software; Internet
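
The kind of exploration this abstract describes (start from one metabolite and follow connected reactions in either direction to a chosen depth) can be illustrated with a breadth-first search. The following is only a minimal sketch: the toy reaction table, the function name `neighborhood`, and its parameters are assumptions made for illustration and are not taken from the Metabolic Network Explorer or Pathway Tools.

```python
from collections import deque

# Hypothetical toy network: reaction id -> (substrates, products).
REACTIONS = {
    "R1": (["glucose"], ["glucose-6-phosphate"]),
    "R2": (["glucose-6-phosphate"], ["fructose-6-phosphate"]),
    "R3": (["glucose-6-phosphate"], ["6-phosphogluconolactone"]),
    "R4": (["fructose-6-phosphate"], ["fructose-1,6-bisphosphate"]),
}


def neighborhood(start, depth, direction="downstream"):
    """Return {metabolite: steps} for metabolites within `depth` reactions of `start`.

    direction="downstream" follows substrate -> product;
    direction="upstream" follows product -> substrate.
    """
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == depth:
            continue  # do not expand past the requested depth
        for substrates, products in REACTIONS.values():
            if direction == "downstream":
                sources, targets = substrates, products
            else:
                sources, targets = products, substrates
            if current in sources:
                for metabolite in targets:
                    if metabolite not in seen:
                        seen[metabolite] = seen[current] + 1
                        queue.append(metabolite)
    return seen


if __name__ == "__main__":
    print(neighborhood("glucose", 2))
    print(neighborhood("fructose-6-phosphate", 1, direction="upstream"))
```

In an interactive tool the user would choose which branch to follow at each step; this sketch instead expands every branch, which shows why an unrestricted display quickly becomes overwhelming as depth grows.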
16.
BMC Genomics ; 22(1): 191, 2021 Mar 16.
Article in English | MEDLINE | ID: mdl-33726670

ABSTRACT

BACKGROUND: Enrichment or over-representation analysis is a common method used in bioinformatics studies of transcriptomics, metabolomics, and microbiome datasets. The key idea behind enrichment analysis is: given a set of significantly expressed genes (or metabolites), use that set to infer a smaller set of perturbed biological pathways or processes, in which those genes (or metabolites) play a role. Enrichment computations rely on collections of defined biological pathways and/or processes, which are usually drawn from pathway databases. Although practitioners of enrichment analysis take great care to employ statistical corrections (e.g., for multiple testing), they appear unaware that enrichment results are quite sensitive to the pathway definitions that the calculation uses. RESULTS: We show that alternative pathway definitions can alter enrichment p-values by up to nine orders of magnitude, whereas statistical corrections typically alter enrichment p-values by only two orders of magnitude. We present multiple examples where the smaller pathway definitions used in the EcoCyc database produce stronger enrichment p-values than the much larger pathway definitions used in the KEGG database; we demonstrate that to attain a given enrichment p-value, KEGG-based enrichment analyses require 1.3-2.0 times as many significantly expressed genes as do EcoCyc-based enrichment analyses. The large pathways in KEGG are problematic for another reason: they blur together multiple (as many as 21) biological processes. When such a KEGG pathway receives a high enrichment p-value, which of its component processes is perturbed is unclear, and thus the biological conclusions drawn from enrichment of large pathways are also in question. CONCLUSIONS: The choice of pathway database used in enrichment analyses can have a much stronger effect on the enrichment results than the statistical corrections used in these analyses.


Subjects
Computational Biology; Metabolomics; Databases, Factual
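
To see why pathway size matters so much, it may help to compute an over-representation p-value directly. The sketch below uses the one-sided hypergeometric test, a common choice for this kind of analysis; the abstract does not say which test the authors used, and the gene counts here are invented purely for illustration.

```python
from math import comb


def enrichment_p_value(genome_size, pathway_size, hits_total, hits_in_pathway):
    """One-sided hypergeometric p-value: P(X >= hits_in_pathway).

    genome_size     -- number of genes in the background set
    pathway_size    -- number of genes annotated to the pathway
    hits_total      -- number of significantly expressed genes
    hits_in_pathway -- how many of those fall in the pathway
    """
    upper = min(pathway_size, hits_total)
    total = comb(genome_size, hits_total)
    tail = sum(
        comb(pathway_size, k) * comb(genome_size - pathway_size, hits_total - k)
        for k in range(hits_in_pathway, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: the same 8 hit genes, scored against a small
    # 10-gene pathway definition and a large 100-gene pathway definition.
    small = enrichment_p_value(4000, 10, 100, 8)
    large = enrichment_p_value(4000, 100, 100, 8)
    print(f"small pathway definition: p = {small:.3g}")
    print(f"large pathway definition: p = {large:.3g}")
```

With these made-up counts the small definition yields a far smaller p-value than the large one for the same eight genes, which is the direction of the effect the abstract reports.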
17.
Front Microbiol ; 12: 614355, 2021.
Article in English | MEDLINE | ID: mdl-33763039

ABSTRACT

Updating genome databases to reflect newly published molecular findings for an organism was hard enough when only a single strain of a given organism had been sequenced. With multiple sequenced strains now available for many organisms, the challenge has grown significantly because of the still-limited resources available for the manual curation that corrects errors and captures new knowledge. We have developed a method to automatically propagate multiple types of curated knowledge from genes and proteins in one genome database to their orthologs in uncurated databases for related strains, imposing several quality-control filters to reduce the chances of introducing errors. We have applied this method to propagate information from the highly curated EcoCyc database for Escherichia coli K-12 to databases for 480 other Escherichia coli strains in the BioCyc database collection. The increase in value and utility of the target databases after propagation is considerable. Target databases received updates for an average of 2,535 proteins each. In addition to widespread addition and regularization of gene and protein names, 97% of the target databases were improved by the addition of at least 200 new protein complexes, at least 800 new or updated reaction assignments, and at least 2,400 sets of GO annotations.
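
The general shape of ortholog-based propagation can be sketched in a few lines. This is not the authors' method: the abstract does not describe its quality-control filters, so the two filters below (one-to-one orthologs only, and never overwriting an existing value) are hypothetical stand-ins, as are the function name, field names, and gene identifiers.

```python
def propagate(source_annotations, orthologs, target_annotations):
    """Copy curated fields from source genes to their orthologs in a target database.

    source_annotations -- {source_gene: {field: value}} from the curated database
    orthologs          -- {source_gene: [target_gene, ...]}
    target_annotations -- {target_gene: {field: value}} from the uncurated database
    Returns a new dict; the inputs are left unmodified.
    """
    updated = {gene: dict(fields) for gene, fields in target_annotations.items()}
    for source_gene, target_genes in orthologs.items():
        # Hypothetical filter 1: skip ambiguous (one-to-many) ortholog mappings.
        if len(target_genes) != 1:
            continue
        record = updated.setdefault(target_genes[0], {})
        for field, value in source_annotations.get(source_gene, {}).items():
            # Hypothetical filter 2: keep whatever the target already has.
            if field not in record:
                record[field] = value
    return updated


if __name__ == "__main__":
    curated = {
        "geneA": {"name": "exampleA", "go_terms": ["GO:0000001"]},
        "geneB": {"name": "exampleB"},
    }
    mapping = {"geneA": ["t1"], "geneB": ["t2", "t3"]}
    target = {"t1": {"name": "hypothetical protein"}, "t2": {}}
    print(propagate(curated, mapping, target))
```

A real pipeline would need richer rules, for example to regularize names that are merely placeholders, but the sketch shows where such filters sit in the loop.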

18.
Metabolites ; 11(2), 2021 Jan 22.
Article in English | MEDLINE | ID: mdl-33499002

ABSTRACT

Metabolomics, synthetic biology, and microbiome research demand information about organism-scale metabolic networks. The convergence of genome sequencing and computational inference of metabolic networks has enabled great progress toward satisfying that demand by generating metabolic reconstructions from the genomes of thousands of sequenced organisms. Visualization of whole metabolic networks is critical for aiding researchers in understanding, analyzing, and exploiting those reconstructions. We have developed bioinformatics software tools that automatically generate a full metabolic-network diagram for an organism, and that enable searching and analyses of the network. The software generates metabolic-network diagrams for unicellular organisms, for multi-cellular organisms, and for pan-genomes and organism communities. Search tools enable users to find genes, metabolites, enzymes, reactions, and pathways within a diagram. The diagrams are zoomable to enable researchers to study local neighborhoods in detail and to see the big picture. The diagrams also serve as tools for comparison of metabolic networks and for interpreting high-throughput datasets, including transcriptomics, metabolomics, and reaction fluxes computed by metabolic models. These data can be overlaid on the metabolic charts to produce animated zoomable displays of metabolic flux and metabolite abundance. The BioCyc.org website contains whole-network diagrams for more than 18,000 sequenced organisms. The ready availability of organism-specific metabolic network diagrams and associated tools for almost any sequenced organism is useful for researchers working to better understand the metabolism of their organism and to interpret high-throughput datasets in a metabolic context.

19.
Brief Bioinform ; 22(1): 109-126, 2021 Jan 18.
Article in English | MEDLINE | ID: mdl-31813964

ABSTRACT

MOTIVATION: Biological systems function through dynamic interactions among genes and their products, regulatory circuits and metabolic networks. Our development of the Pathway Tools software was motivated by the need to construct biological knowledge resources that combine these many types of data, and that enable users to find and comprehend data of interest as quickly as possible through query and visualization tools. Further, we sought to support the development of metabolic flux models from pathway databases, and to use pathway information to leverage the interpretation of high-throughput data sets. RESULTS: In the past 4 years we have enhanced the already extensive Pathway Tools software in several respects. It can now support metabolic-model execution through the Web, it provides a more accurate gap filler for metabolic models; it supports development of models for organism communities distributed across a spatial grid; and model results may be visualized graphically. Pathway Tools supports several new omics-data analysis tools including the Omics Dashboard, multi-pathway diagrams called pathway collages, a pathway-covering algorithm for metabolomics data analysis and an algorithm for generating mechanistic explanations of multi-omics data. We have also improved the core pathway/genome databases management capabilities of the software, providing new multi-organism search tools for organism communities, improved graphics rendering, faster performance and re-designed gene and metabolite pages. AVAILABILITY: The software is free for academic use; a fee is required for commercial use. See http://pathwaytools.com. CONTACT: pkarp@ai.sri.com. SUPPLEMENTARY INFORMATION: Supplementary data are available at Briefings in Bioinformatics online.


Subjects
Genomics/methods; Metabolomics/methods; Software/standards; Systems Biology/methods; Animals; Humans