Results 1 - 11 of 11
1.
bioRxiv ; 2024 Apr 28.
Article in English | MEDLINE | ID: mdl-38712217

ABSTRACT

A critical body of knowledge has developed through advances in protein microscopy, protein-fold modeling, structural biology software, availability of sequenced bacterial genomes, large-scale mutation databases, and genome-scale models. Based on these recent advances, we develop a computational framework that: i) identifies the oligomeric structural proteome encoded by an organism's genome from available structural resources; ii) maps multi-strain alleleomic variation, resulting in the structural proteome for a species; and iii) calculates the 3D orientation of proteins across subcellular compartments with residue-level precision. Using the platform, we: iv) compute the quaternary E. coli K-12 MG1655 structural proteome; v) use a dataset of 12,000 mutations to build Random Forest classifiers that can predict the severity of mutations; and, in combination with a genome-scale model that computes proteome allocation, vi) obtain the spatial allocation of the E. coli proteome. Thus, in conjunction with relevant datasets and increasingly accurate computational models, we can now annotate quaternary structural proteomes, at genome-scale, to obtain a molecular-level understanding of whole-cell functions. Significance: Advancements in experimental and computational methods have revealed the shapes of multi-subunit proteins. The absence of a unified platform that maps actionable datatypes onto these increasingly accurate structures creates a barrier to structural analyses, especially at the genome-scale. Here, we describe QSPACE, a computational annotation platform that evaluates existing resources to identify the best-available structure for each protein in a user's query, maps the 3D location of actionable datatypes (e.g., active sites, published mutations) onto the selected structures, and uses third-party APIs to determine the subcellular compartment of all amino acids of a protein. As a proof of concept, we deployed QSPACE to generate the quaternary structural proteome of E. coli MG1655 and demonstrate two use-cases involving large-scale mutant analysis and genome-scale modelling.

2.
Res Sq ; 2023 May 19.
Article in English | MEDLINE | ID: mdl-37292890

ABSTRACT

A critical body of knowledge has developed through advances in protein microscopy, protein-fold modeling, structural biology software, availability of sequenced bacterial genomes, large-scale mutation databases, and genome-scale models. Based on these recent advances, we develop a computational platform that: i) computes the oligomeric structural proteome encoded by an organism's genome; ii) maps multi-strain alleleomic variation, resulting in the structural proteome for a species; and iii) calculates the 3D orientation of proteins across subcellular compartments with angstrom-level precision. Using the platform, we: iv) compute the full quaternary E. coli K-12 MG1655 structural proteome; v) deploy structure-guided analyses to identify consequential mutations; and, in combination with a genome-scale model that computes proteome allocation, vi) obtain a draft 3D visualization of the proteome in a functioning cell. Thus, in conjunction with relevant datasets and computational models, we can now resolve genome-scale structural proteomes to obtain an angstrom-level understanding of whole-cell functions.

3.
Proc Natl Acad Sci U S A ; 120(15): e2218835120, 2023 04 11.
Article in English | MEDLINE | ID: mdl-37011218

ABSTRACT

The genomic diversity across strains of a species forms the genetic basis for differences in their behavior. A large-scale assessment of sequence variation has been made possible by the growing availability of strain-specific whole-genome sequences (WGS) and by the advent of large-scale databases of laboratory-acquired mutations. We define the Escherichia coli "alleleome" through a genome-scale assessment of amino acid (AA) sequence diversity in open reading frames across 2,661 WGS from wild-type strains. We observe a highly conserved alleleome enriched in mutations unlikely to affect protein function. In contrast, 33,000 mutations acquired in laboratory evolution experiments result in more severe AA substitutions that are rarely achieved by natural selection. Large-scale assessment of the alleleome establishes a method for the quantification of bacterial allelic diversity, reveals opportunities for synthetic biology to explore novel sequence space, and offers insights into the constraints governing evolution.


Subjects
Escherichia coli, Genetic Variation, Mutation, Escherichia coli/genetics, Bacterial Genome/genetics, Amino Acid Sequence
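
As a rough illustration of the per-position tally that an "alleleome" assessment implies, the sketch below counts amino acid variants across aligned sequences of a single open reading frame. The function name `allele_profile`, the toy sequences, and the choice of the most frequent residue as the "dominant" allele are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def allele_profile(aligned_seqs):
    """Return one (dominant residue, dominant frequency, variant counts) tuple per column.

    Assumes the sequences are already aligned to the same length.
    """
    if not aligned_seqs:
        return []
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    profile = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        # The most frequent residue is treated as the dominant allele (an assumption).
        dominant, n_dominant = counts.most_common(1)[0]
        variants = {aa: n for aa, n in counts.items() if aa != dominant}
        profile.append((dominant, n_dominant / len(column), variants))
    return profile


# Toy data (hypothetical): the same short ORF from four strains.
toy = ["MKTAY", "MKTAY", "MRTAY", "MKTSY"]
for position, (aa, freq, variants) in enumerate(allele_profile(toy), start=1):
    print(position, aa, f"{freq:.2f}", variants)
```

A real analysis would start from a multiple sequence alignment per gene and would also need a rule for gaps and ties, which this sketch leaves out.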
4.
BMC Bioinformatics ; 21(1): 162, 2020 Apr 29.
Article in English | MEDLINE | ID: mdl-32349661

ABSTRACT

BACKGROUND: The reconstruction of metabolic networks and the three-dimensional coverage of protein structures have reached the genome-scale in the widely studied Escherichia coli K-12 MG1655 strain. The combination of the two leads to the formation of a structural systems biology framework, which we have used to analyze differences between the reactive oxygen species (ROS) sensitivity of the proteomes of sequenced strains of E. coli. As proteins are one of the main targets of oxidative damage, understanding how the genetic changes of different strains of a species relate to its oxidative environment can reveal hypotheses as to why these variations arise and suggest directions of future experimental work. RESULTS: Creating a reference structural proteome for E. coli allows us to comprehensively map genetic changes in 1764 different strains to their locations on 4118 3D protein structures. We use metabolic modeling to predict basal ROS production levels (ROStype) for 695 of these strains, finding that strains with both higher and lower basal levels tend to enrich their proteomes with antioxidative properties, and speculate as to why that is. We computationally assess a strain's sensitivity to an oxidative environment, based on known chemical mechanisms of oxidative damage to protein groups, defined by their localization and functionality. Two general groups - metalloproteins and periplasmic proteins - show enrichment of their antioxidative properties between the 695 strains with a predicted ROStype as well as 116 strains with an assigned pathotype. Specifically, proteins that a) utilize a molybdenum ion as a cofactor and b) are involved in the biogenesis of fimbriae show intriguing protective properties to resist oxidative damage. Overall, these findings indicate that a strain's sensitivity to oxidative damage can be elucidated from the structural proteome, though future experimental work is needed to validate our model assumptions and findings. CONCLUSION: We thus demonstrate that structural systems biology enables a proteome-wide, computational assessment of changes to atomic-level physicochemical properties and of oxidative damage mechanisms for multiple strains in a species. This integrative approach opens new avenues to study adaptation to a particular environment based on physiological properties predicted from sequence alone.


Subjects
Physiological Adaptation, Escherichia coli K12/physiology, Oxidative Stress, Proteome/metabolism, Antioxidants/metabolism, Escherichia coli Proteins/metabolism, Bacterial Fimbriae/metabolism, Biological Models, Molybdenum/metabolism, Operon/genetics, Oxidation-Reduction, Periplasm/metabolism, Phenotype, Reactive Oxygen Species/metabolism
5.
Mol Biol Evol ; 37(3): 660-667, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31651953

ABSTRACT

Oxidative stress is concomitant with aerobic metabolism. Thus, bacterial genomes encode elaborate mechanisms to achieve redox homeostasis. Here we report that oxyR, the gene encoding the peroxide-sensing transcription factor OxyR, is a common mutational target in two bacterial species from different genera, Escherichia coli and Vibrio natriegens, under separate growth conditions implemented during laboratory evolution. The mutations clustered in the redox active site, dimer interface, and flexible redox loop of the protein. These mutations favor the oxidized conformation of OxyR, which results in constitutive expression of the genes it regulates. Independent component analysis of the transcriptome revealed that the constitutive activity of OxyR reduces DNA damage from reactive oxygen species, as inferred from the activity of the SOS response regulator LexA. This adaptation to peroxide stress came at a cost of lower growth, as revealed by calculations of proteome allocation using genome-scale models of metabolism and macromolecular expression. Further, identification of similar sequence changes in natural isolates of E. coli indicates that adaptation to oxidative stress through genetic changes in oxyR can be a common occurrence.


Subjects
Escherichia coli Proteins/genetics, Escherichia coli/growth & development, Repressor Proteins/genetics, Transcription Factors/genetics, Vibrio/growth & development, Physiological Adaptation, Bacterial Proteins/genetics, Catalytic Domain, Directed Molecular Evolution, Escherichia coli/genetics, Escherichia coli Proteins/chemistry, Bacterial Gene Expression Regulation, Molecular Models, Mutation, Oxidative Stress, Protein Conformation, Reactive Oxygen Species/metabolism, Repressor Proteins/chemistry, Transcription Factors/chemistry, Vibrio/genetics
6.
Nat Microbiol ; 4(3): 386-389, 2019 03.
Article in English | MEDLINE | ID: mdl-30692668

ABSTRACT

Pseudogenes represent open reading frames that have been damaged by mutations, rendering the gene product non-functional. Pseudogenes are found in many genomes and are not always eliminated, even if they are potentially 'wasteful'. This raises a fundamental question about their prevalence. Here we report pseudogene efeU repair that restores the iron uptake system of Escherichia coli under a designed selection pressure during adaptive laboratory evolution.


Subjects
DNA Mismatch Repair, Directed Molecular Evolution, Pseudogenes, Genetic Selection, Cation Transport Proteins/genetics, Escherichia coli/genetics, Escherichia coli Proteins/genetics, Molecular Evolution, Iron/metabolism, Open Reading Frames, Phylogeny
7.
BMC Syst Biol ; 12(1): 143, 2018 12 17.
Article in English | MEDLINE | ID: mdl-30558585

ABSTRACT

BACKGROUND: Essentiality assays are important tools commonly utilized for the discovery of gene functions. Growth/no growth screens of single gene knockout strain collections are also often utilized to test the predictive power of genome-scale models. False positive predictions occur when computational analysis predicts a gene to be non-essential but experimental screens deem the gene to be essential. One explanation for this inconsistency is that the model contains the wrong information, possibly an incorrectly annotated alternative pathway or isozyme reaction. Inconsistencies could also be attributed to experimental limitations, such as growth tests with arbitrary time cut-offs. The focus of this study was to resolve such inconsistencies to better understand isozyme activities and gene essentiality. RESULTS: In this study, we explored the definition of conditional essentiality from a phenotypic and genomic perspective. Gene-deletion strains associated with false positive predictions of gene essentiality on defined minimal medium for Escherichia coli were targeted for extended growth tests followed by population sequencing and transcriptome analysis. Of the 20 false positive strains available and confirmed from the Keio single gene knock-out collection, 11 strains were shown to grow with longer incubation periods, making these actual true positives. These strains grew reproducibly with a diverse range of growth phenotypes. The lag phase observed for these strains ranged from less than one day to more than 7 days. It was found that 9 out of 11 of the false positive strains that grew acquired mutations in at least one replicate experiment, and the types of mutations ranged from SNPs and small indels associated with regulatory or metabolic elements to large regions of genome duplication. Comparison of the detected adaptive mutations, modeling predictions of alternate pathways and isozymes, and transcriptome analysis of KO strains suggested agreement for the observed growth phenotype for 6 out of the 9 cases where mutations were observed. CONCLUSIONS: Longer-term growth experiments followed by whole genome sequencing and transcriptome analysis can provide a better understanding of conditional gene essentiality and mechanisms of adaptation to such perturbations. Compensatory mutations are largely reproducible mechanisms and are in agreement with genome-scale modeling predictions of the response to loss-of-function gene deletion events.


Subjects
Physiological Adaptation/genetics, Escherichia coli/genetics, Escherichia coli/physiology, Essential Genes/genetics, Escherichia coli/enzymology, Molecular Evolution, Gene Expression Profiling, Genomics, Isoenzymes/metabolism, Mutation, Whole Genome Sequencing
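
The labelling convention in this abstract (a "positive" is a knockout that grows, so a false positive is a predicted-to-grow strain that fails to grow in the screen) can be made concrete with a small sketch. The function name `classify_call`, the knockout names, and the toy outcomes are hypothetical and serve only to show how an extended incubation can turn a false positive into a true positive.

```python
def classify_call(model_grows, experiment_grows):
    """Label one knockout comparison; 'positive' means growth (gene non-essential)."""
    if model_grows and experiment_grows:
        return "true positive"
    if model_grows and not experiment_grows:
        return "false positive"
    if not model_grows and experiment_grows:
        return "false negative"
    return "true negative"


# Hypothetical knockouts:
# (model predicts growth, growth at standard cut-off, growth after extended incubation)
toy_knockouts = {
    "koA": (True, False, True),
    "koB": (True, False, False),
    "koC": (False, False, False),
}

for name, (model, standard, extended) in toy_knockouts.items():
    before = classify_call(model, standard)
    # A strain counts as growing if it grew at either time point.
    after = classify_call(model, standard or extended)
    print(name, before, "->", after)
```

Under these toy outcomes, only `koA` changes label once the longer incubation is taken into account.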
8.
Nat Commun ; 9(1): 4306, 2018 10 17.
Article in English | MEDLINE | ID: mdl-30333483

ABSTRACT

Mycobacterium tuberculosis is a serious human pathogen threat exhibiting complex evolution of antimicrobial resistance (AMR). Accordingly, the many publicly available datasets describing its AMR characteristics demand disparate data-type analyses. Here, we develop a reference strain-agnostic computational platform that uses machine learning approaches, complemented by both genetic interaction analysis and 3D structural mutation-mapping, to identify signatures of AMR evolution to 13 antibiotics. This platform is applied to 1595 sequenced strains to yield four key results. First, a pan-genome analysis shows that M. tuberculosis is highly conserved with sequenced variation concentrated in PE/PPE/PGRS genes. Second, the platform corroborates 33 genes known to confer resistance and identifies 24 new genetic signatures of AMR. Third, 97 epistatic interactions across 10 resistance classes are revealed. Fourth, detailed structural analysis of these genes yields mechanistic bases for their selection. The platform can be used to study other human pathogens.


Subjects
Bacterial Drug Resistance/genetics, Bacterial Genome, Machine Learning, Mycobacterium tuberculosis/genetics, Gene Frequency, Genetic Selection
9.
Nat Commun ; 9(1): 3771, 2018 09 14.
Article in English | MEDLINE | ID: mdl-30218022

ABSTRACT

Salmonella strains are traditionally classified into serovars based on their surface antigens. While increasing availability of whole-genome sequences has allowed for more detailed subtyping of strains, links between genotype, serovar, and host remain elusive. Here we reconstruct genome-scale metabolic models for 410 Salmonella strains spanning 64 serovars. Model-predicted growth capabilities in over 530 different environments demonstrate that: (1) the Salmonella accessory metabolic network includes alternative carbon metabolism, and cell wall biosynthesis; (2) metabolic capabilities correspond to each strain's serovar and isolation host; (3) growth predictions agree with 83.1% of experimental outcomes for 12 strains (690 out of 858); (4) 27 strains are auxotrophic for at least one compound, including L-tryptophan, niacin, L-histidine, L-cysteine, and p-aminobenzoate; and (5) the catabolic pathways that are important for fitness in the gastrointestinal environment are lost amongst extraintestinal serovars. Our results reveal growth differences that may reflect adaptation to particular colonization sites.


Subjects
Bacterial Genome/genetics, Metabolic Networks and Pathways/genetics, Salmonella/genetics, Serogroup, Cell Wall/metabolism, Genotype, Phenotype, Salmonella/metabolism
10.
PLoS Comput Biol ; 14(7): e1006302, 2018 07.
Article in English | MEDLINE | ID: mdl-29975681

ABSTRACT

Genome-scale models of metabolism and macromolecular expression (ME-models) explicitly compute the optimal proteome composition of a growing cell. ME-models expand upon the well-established genome-scale models of metabolism (M-models), and they enable a new fundamental understanding of cellular growth. ME-models have increased predictive capabilities and accuracy due to their inclusion of the biosynthetic costs for the machinery of life, but they come with a significant increase in model size and complexity. This challenge results in models which are both difficult to compute and challenging to understand conceptually. As a result, ME-models exist for only two organisms (Escherichia coli and Thermotoga maritima) and are still used by relatively few researchers. To address these challenges, we have developed a new software framework called COBRAme for building and simulating ME-models. It is coded in Python and built on COBRApy, a popular platform for using M-models. COBRAme streamlines computation and analysis of ME-models. It provides tools to simplify constructing and editing ME-models to enable ME-model reconstructions for new organisms. We used COBRAme to reconstruct a condensed E. coli ME-model called iJL1678b-ME. This reformulated model gives functionally identical solutions to previous E. coli ME-models while using 1/6 the number of free variables and solving in less than 10 minutes, a marked improvement over the 6-hour solve time of previous ME-model formulations. Errors in previous ME-models were also corrected, leading to 52 additional genes that must be expressed in iJL1678b-ME to grow aerobically in glucose minimal in silico media. This manuscript outlines the architecture of COBRAme and demonstrates how ME-models can be created, modified, and shared most efficiently using the new software framework.


Subjects
Computer Simulation, Gene Expression, Metabolism/genetics, Genetic Models, Software Design, Algorithms, Genome
11.
Bioinformatics ; 34(12): 2155-2157, 2018 06 15.
Article in English | MEDLINE | ID: mdl-29444205

ABSTRACT

Summary: Working with protein structures at the genome-scale has been challenging in a variety of ways. Here, we present ssbio, a Python package that provides a framework to easily work with structural information in the context of genome-scale network reconstructions, which can contain thousands of individual proteins. The ssbio package provides an automated pipeline to construct high quality genome-scale models with protein structures (GEM-PROs), wrappers to popular third-party programs to compute associated protein properties, and methods to visualize and annotate structures directly in Jupyter notebooks, thus lowering the barrier of linking 3D structural data with established systems workflows. Availability and implementation: ssbio is implemented in Python and available to download under the MIT license at http://github.com/SBRG/ssbio. Documentation and Jupyter notebook tutorials are available at http://ssbio.readthedocs.io/en/latest/. Interactive notebooks can be launched using Binder at https://mybinder.org/v2/gh/SBRG/ssbio/master?filepath=Binder.ipynb. Supplementary information: Supplementary data are available at Bioinformatics online.


Subjects
Computational Biology/methods, Biological Models, Protein Conformation, Software