ABSTRACT
Proteoforms, the different forms of a protein with sequence variations including post-translational modifications (PTMs), execute vital functions in biological systems, such as cell signaling and epigenetic regulation. Advances in top-down mass spectrometry (MS) technology have permitted the direct characterization of intact proteoforms and their exact number of modification sites, allowing for the relative quantification of positional isomers (PI). Protein positional isomers refer to a set of proteoforms with identical total mass and set of modifications, but varying PTM site combinations. The relative abundance of PI can be estimated by matching proteoform-specific fragment ions to top-down tandem MS (MS2) data to localize and quantify modifications. However, the current approaches heavily rely on manual annotation. Here, we present IsoForma, an open-source R package for the relative quantification of PI within a single tool. Benchmarking IsoForma's performance against two existing workflows produced comparable results and improvements in speed. Overall, IsoForma provides a streamlined process for quantifying PI, reduces the analysis time, and offers an essential framework for developing customized proteoform analysis workflows. The software is open source and available at https://github.com/EMSL-Computing/isoforma-lib.
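The definition of positional isomers above (same modifications, different site combinations) can be illustrated with a short enumeration. This is a minimal sketch, not IsoForma's implementation; the function name, the residue positions, and the modification count are hypothetical values chosen for illustration.

```python
from itertools import combinations


def positional_isomers(candidate_sites, n_mods):
    """List every way to place n_mods identical modifications on candidate_sites.

    Each returned tuple is one positional isomer: the total mass is the same
    for all of them, only the occupied sites differ.
    """
    return [tuple(combo) for combo in combinations(sorted(candidate_sites), n_mods)]


# Hypothetical example: two identical modifications spread over four
# candidate residue positions.
isomers = positional_isomers([3, 10, 28, 36], 2)
for sites in isomers:
    print(sites)
```

Fragment ions that distinguish one of these site combinations from the others are what allow relative abundances to be estimated from MS2 data.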
Subjects
Liquid Chromatography-Mass Spectrometry , Protein Isoforms , Protein Processing, Post-Translational , Software , Tandem Mass Spectrometry , Humans , Isomerism , Liquid Chromatography-Mass Spectrometry/methods , Protein Isoforms/analysis , Proteomics/methods , Tandem Mass Spectrometry/methods
ABSTRACT
PMart is a web-based tool for reproducible quality control, exploratory data analysis, statistical analysis, and interactive visualization of 'omics data, based on the functionality of the pmartR R package. The newly improved user interface supports more 'omics data types, additional statistical capabilities, and enhanced options for creating downloadable graphics. PMart supports the analysis of label-free and isobaric-labeled (e.g., TMT, iTRAQ) proteomics, nuclear magnetic resonance (NMR) and mass-spectrometry (MS)-based metabolomics, MS-based lipidomics, and ribonucleic acid sequencing (RNA-seq) transcriptomics data. At the end of a PMart session, a report is available that summarizes the processing steps performed and includes the pmartR R package functions used to execute the data processing. In addition, built-in safeguards in the backend code prevent users from utilizing methods that are inappropriate based on omics data type. PMart is a user-friendly interface for conducting exploratory data analysis and statistical comparisons of omics data without programming.
ABSTRACT
Visual examination of mass spectrometry data is necessary to assess data quality and to facilitate data exploration. Graphics provide the means to evaluate spectral properties, test alternative peptide/protein sequence matches, prepare annotated spectra for publication, and fine-tune parameters during wet lab procedures. Visual inspection of LC-MS data is constrained by proteomics visualization software designed for particular workflows or vendor-specific tools without open-source code. We built PSpecteR, an open-source and interactive R Shiny web application for visualization of LC-MS data, with support for several steps of proteomics data processing, including reading various mass spectrometry files, running open-source database search engines, labeling spectra with fragmentation patterns, testing post-translational modifications, plotting where identified fragments map to reference sequences, and visualizing algorithmic output and metadata. All figures, tables, and spectra are exportable within one easy-to-use graphical user interface. Our current software provides a flexible and modern R framework to support fast implementation of additional features. The open-source code is readily available (https://github.com/EMSL-Computing/PSpecteR), and a PSpecteR Docker container (https://hub.docker.com/r/emslcomputing/pspecter) is available for easy local installation.
Subjects
Proteomics , Tandem Mass Spectrometry , Chromatography, Liquid , Proteins , Software
ABSTRACT
The high resolution and mass accuracy of Fourier transform mass spectrometry (FT-MS) have made it an increasingly popular technique for discerning the composition of soil, plant and aquatic samples containing complex mixtures of proteins, carbohydrates, lipids, lignins, hydrocarbons, phytochemicals and other compounds. Thus, there is a growing demand for informatics tools to analyze FT-MS data that will aid investigators seeking to understand the availability of carbon compounds to biotic and abiotic oxidation and to compare fundamental chemical properties of complex samples across groups. We present ftmsRanalysis, an R package that provides an extensive collection of data formatting and processing, filtering, visualization, and sample and group comparison functionalities. The package provides a suite of plotting methods and enables expedient, flexible and interactive visualization of complex datasets through functions that link to a powerful and interactive visualization user interface, Trelliscope. An example analysis using FT-MS data from a soil microbiology study demonstrates the core functionality of the package and highlights the capabilities for producing interactive visualizations.
Subjects
Computational Biology/methods , Fourier Analysis , Mass Spectrometry , Software , Databases, Factual , Soil Microbiology
ABSTRACT
RATIONALE: Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) is a preferred technique for analyzing complex organic mixtures. Currently, there is no consensus normalization approach, nor an objective method for selecting one, for quantitative analyses of FT-ICR-MS data. We investigate a method to evaluate and score the amount of bias various normalization approaches introduce into the data. METHODS: We evaluate the ability of the Statistical Procedure for the Analysis of Normalization Strategies (SPANS) to guide the selection of appropriate normalization approaches for two different FT-ICR-MS data sets. Furthermore, we test the robustness of SPANS results to changes in SPANS parameter values and assess the impact of using various normalization approaches on downstream statistical analyses. RESULTS: The normalization approach identified by SPANS differed for the two data sets. Normalization methods impacted the statistical significance of peaks differently, underscoring the importance of carefully evaluating potential methods. More consistent SPANS scores resulted when at least 120 significant peaks are used, where larger sets of peaks were obtained by increasing the p-value threshold. Interestingly, we show that total sum scaling and highest peak normalization, used in previous studies, underperformed relative to SPANS-recommended normalization approaches. CONCLUSIONS: Although there is no single, best normalization method for all data sets, SPANS provides a mechanism to identify an appropriate normalization method for analyzing FT-ICR-MS data quantitatively. The number of peaks used in the background distributions of SPANS contributes more significantly to the reproducibility of results than the p-value thresholds used to obtain those peaks.
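The two baseline approaches named above, total sum scaling and highest peak normalization, can be sketched in a few lines. This is a minimal illustration under the usual definitions (divide each peak by the sample's summed intensity, or by its largest intensity); it does not implement SPANS, and the function names and intensities are hypothetical.

```python
def total_sum_scale(intensities):
    """Divide every peak intensity by the summed intensity of the sample."""
    total = sum(intensities)
    if total == 0:
        raise ValueError("sample has zero total intensity")
    return [value / total for value in intensities]


def highest_peak_scale(intensities):
    """Divide every peak intensity by the largest intensity in the sample."""
    top = max(intensities)
    if top == 0:
        raise ValueError("sample has no non-zero peak")
    return [value / top for value in intensities]


# Hypothetical peak intensities for one sample.
sample = [200.0, 50.0, 150.0, 100.0]
tss = total_sum_scale(sample)
hps = highest_peak_scale(sample)
print(tss)
print(hps)
```

Both methods rescale each sample independently, which is why a single dominant peak or a shift in total signal can bias the result.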
ABSTRACT
BACKGROUND: Metagenomics studies provide valuable insight into the composition and function of microbial populations from diverse environments; however, the data processing pipelines that rely on mapping reads to gene catalogs or genome databases for cultured strains yield results that underrepresent the genes and functional potential of uncultured microbes. Recent improvements in sequence assembly methods have eased the reliance on genome databases, thereby allowing the recovery of genomes from uncultured microbes. However, configuring these tools, linking them with advanced binning and annotation tools, and maintaining provenance of the processing continues to be challenging for researchers. RESULTS: Here we present ATLAS, a software package for customizable data processing from raw sequence reads to functional and taxonomic annotations using state-of-the-art tools to assemble, annotate, quantify, and bin metagenome data. Abundance estimates at genome resolution are provided for each sample in a dataset. ATLAS is written in Python and the workflow implemented in Snakemake; it operates in a Linux environment, and is compatible with Python 3.5+ and Anaconda 3+ versions. The source code for ATLAS is freely available, distributed under a BSD-3 license. CONCLUSIONS: ATLAS provides a user-friendly, modular and customizable Snakemake workflow for metagenome data processing; it is easily installable with conda and maintained as open-source on GitHub at https://github.com/metagenome-atlas/atlas.
Subjects
Metagenomics/methods , Software , Metagenome , Molecular Sequence Annotation , Workflow
ABSTRACT
Prior to statistical analysis of mass spectrometry (MS) data, quality control (QC) of the identified biomolecule peak intensities is imperative for reducing process-based sources of variation and extreme biological outliers. Without this step, statistical results can be biased. Additionally, liquid chromatography-MS proteomics data present inherent challenges due to large amounts of missing data that require special consideration during statistical analysis. While a number of R packages exist to address these challenges individually, there is no single R package that addresses all of them. We present pmartR, an open-source R package, for QC (filtering and normalization), exploratory data analysis (EDA), visualization, and statistical analysis robust to missing data. Example analysis using proteomics data from a mouse study comparing smoke exposure to control demonstrates the core functionality of the package and highlights the capabilities for handling missing data. In particular, using a combined quantitative and qualitative statistical test, 19 proteins whose statistical significance would have been missed by a quantitative test alone were identified. The pmartR package provides a single software tool for QC, EDA, and statistical comparisons of MS data that is robust to missing data and includes numerous visualization capabilities.
Subjects
Chromatography, Liquid/statistics & numerical data , Mass Spectrometry/statistics & numerical data , Proteins/isolation & purification , Proteomics/statistics & numerical data , Animals , Chromatography, Liquid/methods , Data Interpretation, Statistical , Mass Spectrometry/methods , Mice , Proteins/chemistry , Proteomics/methods , Quality Control
ABSTRACT
BACKGROUND: The dominant fungi in arid grasslands and shrublands are members of the Ascomycota phylum. Ascomycota fungi are important drivers in carbon and nitrogen cycling in arid ecosystems. These fungi play roles in soil stability, plant biomass decomposition, and endophytic interactions with plants. They may also form symbiotic associations with biocrust components or be latent saprotrophs or pathogens that live on plant tissues. However, their functional potential in arid soils, where organic matter, nutrients and water are very low or only periodically available, is poorly characterized. RESULTS: Five Ascomycota fungi were isolated from different soil crust microhabitats and rhizosphere soils around the native bunchgrass Pleuraphis jamesii in an arid grassland near Moab, UT, USA. Putative genera were Coniochaeta, isolated from lichen biocrust, Embellisia from cyanobacteria biocrust, Chaetomium from below lichen biocrust, Phoma from a moss microhabitat, and Aspergillus from the soil. The fungi were grown in replicate cultures on different carbon sources (chitin, native bunchgrass or pine wood) relevant to plant biomass and soil carbon sources. Secretomes produced by the fungi on each substrate were characterized. Results demonstrate that these fungi likely interact with primary producers (biocrust or plants) by secreting a wide range of proteins that facilitate symbiotic associations. Each of the fungal isolates secreted enzymes that degrade plant biomass, small secreted effector proteins, and proteins involved in either beneficial plant interactions or virulence. Aspergillus and Phoma expressed more plant biomass degrading enzymes when grown in grass- and pine-containing cultures than in chitin. Coniochaeta and Embellisia expressed similar numbers of these enzymes under all conditions, while Chaetomium secreted more of these enzymes in grass-containing cultures. 
CONCLUSIONS: This study of Ascomycota genomes and secretomes provides important insights into the lifestyles and the roles that Ascomycota fungi likely play in arid grassland ecosystems. However, the exact nature of those interactions, including whether any or all of the isolates are true endophytes, latent saprotrophs or opportunistic phytopathogens, will be the topic of future studies.
Subjects
Ascomycota/classification , Fungal Proteins/metabolism , Plant Physiological Phenomena , Plants/microbiology , Ascomycota/genetics , Ascomycota/isolation & purification , Ascomycota/physiology , Biomass , Endophytes , Fungal Proteins/genetics , Genome, Fungal , Phylogeny , Proteomics , Soil Microbiology , Whole Genome Sequencing
ABSTRACT
Summary: FQC is software that facilitates quality control of FASTQ files by carrying out a QC protocol using FastQC, parsing results, and aggregating quality metrics into an interactive dashboard designed to richly summarize individual sequencing runs. The dashboard groups samples in dropdowns for navigation among the data sets, utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data. Availability and Implementation: FQC is implemented in Python 3 and JavaScript, and is maintained under an MIT license. Documentation and source code are available at https://github.com/pnnl/fqc. Contact: joseph.brown@pnnl.gov.
ABSTRACT
Cyanobacterial regulation of gene expression must contend with a genome organization that lacks apparent functional context, as the majority of cellular processes and metabolic pathways are encoded by genes found at disparate locations across the genome and relatively few transcription factors exist. In this study, global transcript abundance data from the model cyanobacterium Synechococcus sp. PCC 7002 grown under 42 different conditions were analyzed using Context-Likelihood of Relatedness (CLR). The resulting network, organized into 11 modules, provided insight into transcriptional network topology as well as grouping genes by function and linking their response to specific environmental variables. When used in conjunction with genome sequences, the network allowed identification and expansion of novel potential targets of both DNA binding proteins and sRNA regulators. These results offer a new perspective into the multi-level regulation that governs cellular adaptations of the fast-growing, physiologically robust cyanobacterium Synechococcus sp. PCC 7002 to changing environmental variables. They also provide a methodological high-throughput approach to studying multi-scale regulatory mechanisms that operate in cyanobacteria. Finally, they provide valuable context for integrating systems-level data to enhance gene grouping based on annotated function, especially in organisms where traditional context analyses cannot be implemented due to lack of operon-based functional organization.
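The CLR scoring step can be sketched as follows, assuming the commonly described formulation: each pairwise mutual information (MI) value is converted to a z-score against the background of each of the two genes, negative z-scores are clipped to zero, and the two are combined as a Euclidean norm. This is an illustration only, not the analysis pipeline used in the study. The MI matrix is a toy input, and excluding the diagonal and using the population standard deviation are assumptions made here.

```python
import math
from statistics import mean, pstdev


def _positive_z(value, mu, sigma):
    """Z-score of value against a background, clipped at zero."""
    if sigma == 0:
        return 0.0
    return max(0.0, (value - mu) / sigma)


def clr_scores(mi):
    """Compute CLR scores from a symmetric mutual-information matrix."""
    n = len(mi)
    # Background distribution for gene i: its MI with every other gene.
    stats = []
    for i in range(n):
        background = [mi[i][j] for j in range(n) if j != i]
        stats.append((mean(background), pstdev(background)))
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            z_i = _positive_z(mi[i][j], *stats[i])
            z_j = _positive_z(mi[i][j], *stats[j])
            scores[i][j] = scores[j][i] = math.hypot(z_i, z_j)
    return scores


# Toy MI matrix for four hypothetical genes; genes 0 and 1 share high MI.
mi = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 0.3],
    [0.2, 0.1, 0.3, 0.0],
]
scores = clr_scores(mi)
for row in scores:
    print([round(value, 3) for value in row])
```

Scoring each pair against its own background is what lets CLR down-weight genes that have uniformly high MI with everything, so only pairs that stand out in context form network edges.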
Subjects
Gene Expression Regulation, Bacterial , Gene Regulatory Networks , Synechococcus/genetics , Transcriptome , Binding Sites , Cluster Analysis , Gene Expression Profiling , Genome, Bacterial , Nucleotide Motifs , Position-Specific Scoring Matrices , Protein Binding , RNA, Untranslated , Synechococcus/metabolism , Transcription Factors/metabolism
ABSTRACT
Members of the cyanobacterial genus Cyanothece exhibit considerable variation in physiological and biochemical characteristics. The comparative assessment of the genomes and the proteomes has the potential to provide insights on differences among Cyanothece strains. By applying Sequedex, an annotation-independent method for ascribing gene functions, we confirmed significant species-specific differences of functional genes in different Cyanothece strains, particularly in Cyanothece PCC7425. Using a shotgun proteomics approach based on prefractionation and tandem mass spectrometry, we detected ~28-48% of the theoretical Cyanothece proteome, depending on the strain. The expression of a total of 642 orthologous proteins was observed in all five Cyanothece strains. These shared orthologous proteins showed considerable correlations in their abundances across different Cyanothece strains. Functional classification indicated that the majority of proteins involved in central metabolic functions such as amino acid, carbohydrate, protein, and RNA metabolism, photosynthesis, respiration, and stress responses were observed to a greater extent in the core proteome, whereas proteins involved in membrane transport, iron acquisition, regulatory functions, flagellar motility, and chemotaxis were observed to a greater extent in the unique proteome. Considerable differences were evident across different Cyanothece strains. Notably, the analysis of Cyanothece PCC7425, which showed the highest number of unique proteins (682), provided direct evidence of evolutionary differences in this strain. We conclude that Cyanothece PCC7425 diverged significantly from the other Cyanothece strains or evolved from a different lineage.
Subjects
Bacterial Proteins/metabolism , Cyanothece/metabolism , Proteome/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/isolation & purification , Chromatography, Ion Exchange , Cyanothece/genetics , Gene Expression , Nitrogen Fixation , Photosynthesis , Phylogeny , Proteome/genetics , Proteome/isolation & purification , Tandem Mass Spectrometry
ABSTRACT
SUMMARY: At the rate that prokaryotic genomes can now be generated, comparative genomics studies require a flexible method for quickly and accurately predicting orthologs among the rapidly changing set of genomes available. SPOCS implements a graph-based ortholog prediction method to generate a simple tab-delimited table of orthologs and in addition, html files that provide a visualization of the predicted ortholog/paralog relationships to which gene/protein expression metadata may be overlaid. AVAILABILITY AND IMPLEMENTATION: A SPOCS web application is freely available at http://cbb.pnnl.gov/portal/tools/spocs.html. Source code for Linux systems is also freely available under an open source license at http://cbb.pnnl.gov/portal/software/spocs.html; the Boost C++ libraries and BLAST are required.
Subjects
Genomics/methods , Genome , Internet , Programming Languages , Proteins/genetics , Software Design
ABSTRACT
Comparative analysis of (meta)genomes necessitates aggregation, integration, and synthesis of well-annotated data using standards. The Genomic Standards Consortium (GSC) collaborates with the research community to develop and maintain the Minimum Information about any (x) Sequence (MIxS) reporting standard for genomic data. To facilitate the use of the GSC's MIxS reporting standard, we provide a description of the structure and terminology, how to navigate ontologies for required terms in MIxS, and demonstrate practical usage through a soil metagenome example.
Subjects
Genomics , Metagenome , Metagenomics , Metagenomics/methods , Metagenomics/standards , Genomics/methods , Genomics/standards , Metagenome/genetics , Databases, Genetic , Soil Microbiology
ABSTRACT
Due to its speed, accuracy, and adaptability to various sample types, matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS) has become a popular method to identify molecular isotope profiles from biological samples. Often MALDI-MS data do not include tandem MS fragmentation data, and thus the identification of compounds in samples requires external databases so that the accurate mass of detected signals can be matched to known molecular compounds. Most relevant MALDI-MS software tools developed to confirm compound identifications are focused on small molecules (e.g., metabolites, lipids) and cannot be easily adapted to protein data due to their more complex isotopic distributions. Here, we present an R package called IsoMatchMS for the automated annotation of MALDI-MS data for multiple datatypes: intact proteins, peptides, and glycans. This tool accepts already derived molecular formulas or, for proteomics applications, can derive molecular formulas from a list of input peptides or proteins including proteins with post-translational modifications. Visualization of all matched isotopic profiles is provided in a highly accessible HTML format called a trelliscope display, which allows users to filter and sort by several parameters such as match scores and the number of peaks matched. IsoMatchMS simplifies the annotation and visualization of MALDI-MS data for downstream analyses.
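Matching detected signals to known compounds by accurate mass, as described above, usually comes down to a parts-per-million (ppm) tolerance check. The sketch below shows that check only. It is not IsoMatchMS's algorithm or scoring, and the function names, m/z values, and the 10 ppm tolerance are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observed peak, in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_peaks(observed, theoretical, tolerance_ppm=10.0):
    """Pair each theoretical m/z with its nearest observed peak within tolerance."""
    matches = []
    for theo in theoretical:
        if not observed:
            break
        nearest = min(observed, key=lambda obs: abs(obs - theo))
        error = ppm_error(nearest, theo)
        if abs(error) <= tolerance_ppm:
            matches.append((theo, nearest, error))
    return matches


# Hypothetical isotopic peaks (theoretical) and detected signals (observed).
theoretical = [1000.0000, 1001.0030, 1002.0060]
observed = [1000.0050, 1001.0500, 1002.0070]
matches = match_peaks(observed, theoretical)
for theo, obs, error in matches:
    print(f"{theo:.4f} matched {obs:.4f} ({error:+.2f} ppm)")
```

A ppm tolerance scales with mass, which matters for intact proteins, where an absolute tolerance suited to small molecules would be too strict.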
Subjects
Proteins , Software , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization/methods , Proteins/chemistry , Peptides , Proteomics/methods
ABSTRACT
BACKGROUND: The procedural aspects of genome sequencing and assembly have become relatively inexpensive, yet the full, accurate structural annotation of these genomes remains a challenge. Next-generation sequencing transcriptomics (RNA-Seq), global microarrays, and tandem mass spectrometry (MS/MS)-based proteomics have demonstrated immense value to genome curators as individual sources of information; however, integrating these data types to validate and improve structural annotation remains a major challenge. Current visual and statistical analytic tools are focused on a single data type, or existing software tools are retrofitted to analyze new data forms. We present Visual Exploration and Statistics to Promote Annotation (VESPA), a new interactive visual analysis software tool focused on assisting scientists with the annotation of prokaryotic genomes through the integration of proteomics and transcriptomics data with current genome location coordinates. RESULTS: VESPA is a desktop Java™ application that integrates high-throughput proteomics data (peptide-centric) and transcriptomics (probe or RNA-Seq) data into a genomic context, all of which can be visualized at three levels of genomic resolution. Data is interrogated via searches linked to the genome visualizations to find regions with high likelihood of mis-annotation. Search results are linked to exports for further validation outside of VESPA, or potential coding regions can be analyzed concurrently with the software through interaction with BLAST. VESPA is applied to two use cases (Yersinia pestis Pestoides F and Synechococcus sp. PCC 7002) to demonstrate the rapid manner in which mis-annotations can be found and explored in VESPA using either proteomics data alone, or in combination with transcriptomic data.
CONCLUSIONS: VESPA is an interactive visual analytics tool that integrates high-throughput data into a genomic context to facilitate the discovery of structural mis-annotations in prokaryotic genomes. Data is evaluated via visual analysis across multiple levels of genomic resolution, linked searches and interaction with existing bioinformatics tools. We highlight the novel functionality of VESPA and core programming requirements for visualization of these large heterogeneous datasets for a client-side application. The software is freely available at https://www.biopilot.org/docs/Software/Vespa.php.
Subjects
Bacteria/genetics , Gene Expression Profiling/methods , Molecular Sequence Annotation/methods , Proteomics/methods , Software , Computer Graphics , Data Mining , Internet , Synechococcus/genetics , Yersinia pestis/genetics
ABSTRACT
We have developed a new kinetic model to study how microbial dynamics are affected by the heterogeneity in the physical structure of the environment and by different strategies for hydrolysis of polymeric carbon. The hybrid model represented the dynamics of substrates and enzymes using a continuum representation, and the dynamics of the cells were modeled individually. The individual-based biological model allowed us to explicitly simulate microbial diversity, and to model cell physiology as regulated via optimal allocation of cellular resources to enzyme synthesis, control of growth rate by protein synthesis capacity, and shifts to dormancy. This model was developed to study how microbial community functioning is influenced by local environmental conditions in heterogeneous media such as soil and by the functional attributes of individual microbes. Microbial community dynamics were simulated at two spatial scales: in micro-pores that resemble 6-20-µm portions of the soil physical structure and in 111-µm soil aggregates with a random pore structure. Different strategies for acquisition of carbon from polymeric cellulose were investigated. Bacteria that express membrane-associated hydrolase had different growth and survival dynamics in soil pores than bacteria that release extracellular hydrolases. The kinetic differences suggested different functional niches for these two microbe types in cellulose utilization. Our model predicted an emergent behavior in which co-existence of organisms expressing membrane-associated hydrolase and organisms releasing extracellular hydrolases led to higher cellulose utilization efficiency and reduced stochasticity.
Our analysis indicated that their co-existence mutually benefits these organisms, where basal cellulose degradation activity by membrane-associated hydrolase-expressing cells shortened the soluble hydrolase buildup time and, when enzyme buildup allowed for cellulose degradation to be fast enough to sustain exponential growth, all the organisms in the community shared the soluble carbon product and grew together. Although pore geometry affected the kinetics of cellulose degradation, the patterns observed for the bacterial community dynamics in the 6-20 µm-sized micro-pores were relevant to the dynamics in the more complex 111-µm-sized porous soil aggregates, implying that micro-scale studies can be useful approximations to aggregate scale studies when local effects on microbial dynamics are studied. As shown with examples in this study, various functional niches of the bacterial communities can be investigated using complex predictive mathematical models where the role of key environmental aspects such as the heterogeneous three-dimensional structure, functional niches of the community members, and environmental biochemical processes are directly connected to microbial metabolism and maintenance in an integrated model.
Subjects
Bacteria/growth & development , Bacteria/metabolism , Carbon/metabolism , Models, Biological , Soil Microbiology , Soil/chemistry , Carbon/chemistry , Cellulose/metabolism , Hydrolases/metabolism , Hydrolysis , Kinetics , Substrate Specificity
ABSTRACT
To what extent genotypic differences translate to phenotypic variation remains a poorly understood issue of paramount importance for several cornerstone concepts of microbiology including the species definition. Here, we take advantage of the completed genomic sequences, expressed proteomic profiles, and physiological studies of 10 closely related Shewanella strains and species to provide quantitative insights into this issue. Our analyses revealed that, despite extensive horizontal gene transfer within these genomes, the genotypic and phenotypic similarities among the organisms were generally predictable from their evolutionary relatedness. The power of the predictions depended on the degree of ecological specialization of the organisms evaluated. Using the gradient of evolutionary relatedness formed by these genomes, we were able to partly isolate the effect of ecology from that of evolutionary divergence and to rank the different cellular functions in terms of their rates of evolution. Our ranking also revealed that whole-cell protein expression differences among these organisms, when the organisms were grown under identical conditions, were relatively larger than differences at the genome level, suggesting that similarity in gene regulation and expression should constitute another important parameter for (new) species description. Collectively, our results provide important new information toward beginning a systems-level understanding of bacterial species and genera.
Subjects
Biological Evolution , Shewanella/classification , Shewanella/genetics , Conserved Sequence , Ecosystem , Evolution, Molecular , Gene Expression , Gene Transfer, Horizontal , Genome, Bacterial , Phenotype , Phylogeny , Protein Array Analysis , Proteome , RNA, Bacterial/genetics , RNA, Ribosomal, 16S/genetics , Shewanella/physiology , Systems Biology , Time Factors
ABSTRACT
We present observations from a laboratory-controlled study on the impacts of extreme wetting and drying on a wetland soil microbiome. Our approach was to experimentally challenge the soil microbiome to understand impacts on anaerobic carbon cycling processes as the system transitions from dryness to saturation and vice-versa. Specifically, we tested for impacts on stress responses related to shifts from wet to drought conditions. We used a combination of high-resolution data for small organic chemical compounds (metabolites) and biological (community structure based on 16S rRNA gene sequencing) features. Using a robust correlation-independent data approach, we further tested the predictive power of soil metabolites for the presence or absence of taxa. Here, we demonstrate that taking an untargeted, multidimensional data approach to the interpretation of metabolomics has the potential to indicate the causative pathways selecting for the observed bacterial community structure in soils.
ABSTRACT
BACKGROUND: EtrA in Shewanella oneidensis MR-1, a model organism for study of adaptation to varied redox niches, shares 73.6% and 50.8% amino acid sequence identity with the oxygen-sensing regulators Fnr in E. coli and Anr in Pseudomonas aeruginosa, respectively; however, its regulatory role in anaerobic metabolism in Shewanella spp. is complex and not well understood. RESULTS: The expression of the nap genes, nrfA, cymA and hcp was significantly reduced in etrA deletion mutant EtrA7-1; however, limited anaerobic growth and nitrate reduction occurred, suggesting that multiple regulators control nitrate reduction in this strain. Dimethyl sulfoxide (DMSO) and fumarate reductase gene expression was down-regulated at least 2-fold in the mutant, which showed lower or no reduction of these electron acceptors when compared to the wild type, suggesting both respiratory pathways are under EtrA control. Transcript analysis further suggested a role of EtrA in prophage activation and down-regulation of genes implicated in aerobic metabolism. CONCLUSION: In contrast to previous studies that attributed a minor regulatory role to EtrA in Shewanella spp., this study demonstrates that EtrA acts as a global transcriptional regulator and, in conjunction with other regulators, fine-tunes the expression of genes involved in anaerobic metabolism in S. oneidensis strain MR-1. Transcriptomic and sequence analyses of the genes differentially expressed showed that those mostly affected by the mutation belonged to the "Energy metabolism" category, while stress-related genes were indirectly regulated in the mutant, possibly as a result of a secondary perturbation (e.g., oxidative stress, starvation). We also conclude based on sequence, physiological and expression analyses that this regulator is more appropriately termed Fnr and recommend this descriptor be used in future publications.
Subjects
Bacterial Proteins/metabolism , Energy Metabolism , Gene Expression Regulation, Bacterial , Metabolic Networks and Pathways/genetics , Shewanella/physiology , Transcription Factors/metabolism , Anaerobiosis , Bacterial Proteins/genetics , Gene Deletion , Gene Expression Profiling , Nitrates/metabolism , Sequence Analysis, DNA , Shewanella/genetics , Shewanella/growth & development , Shewanella/metabolism , Transcription Factors/genetics
ABSTRACT
Thermodynamics plays a crucial role in regulating the metabolic processes in all living organisms. Accurate determination of biochemical and biophysical properties is important to understand, analyze, and synthetically design such metabolic processes for engineered systems. In this work, we performed extensive first-principles quantum mechanical calculations to assess their accuracy in estimating the free energy of biochemical reactions and developed an automated quantum-chemistry (QC) pipeline (https://appdev.kbase.us/narrative/45710) for the prediction of thermodynamic parameters of biochemical reactions. We benchmark the QC methods based on density functional theory (DFT) across different basis sets, solvation models, pH, and exchange-correlation functionals using the known thermodynamic properties from the NIST database. Our results show that QC calculations, when combined with simple calibration, yield a mean absolute error in the range of 1.60-2.27 kcal/mol for different exchange-correlation functionals, which is comparable to the error in the experimental measurements. This accuracy over a diverse set of metabolic reactions is unprecedented and near the benchmark chemical accuracy of 1 kcal/mol that is usually desired from DFT calculations.
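The idea of calibrating computed free energies against reference values and then reporting a mean absolute error (MAE) can be sketched as below. The abstract says only "simple calibration", so the linear least-squares form used here is an assumption. The function names are hypothetical, and the numbers are invented for illustration and are not results from the study.

```python
def fit_linear_calibration(predicted, reference):
    """Least-squares fit of reference ~ slope * predicted + intercept."""
    n = len(predicted)
    mean_x = sum(predicted) / n
    mean_y = sum(reference) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, reference))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(predicted, reference):
    """Average absolute difference between predictions and reference values."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


# Invented reaction free energies in kcal/mol (illustration only).
reference = [-5.0, -2.0, 1.0, 4.0]   # stand-in for experimental values
predicted = [-8.0, -4.5, -2.0, 1.5]  # stand-in for raw computed values

slope, intercept = fit_linear_calibration(predicted, reference)
calibrated = [slope * value + intercept for value in predicted]

raw_mae = mean_absolute_error(predicted, reference)
calibrated_mae = mean_absolute_error(calibrated, reference)
print(f"raw MAE: {raw_mae:.2f} kcal/mol")
print(f"calibrated MAE: {calibrated_mae:.2f} kcal/mol")
```

Fitting and scoring on the same reactions, as this toy example does, gives an optimistic error estimate; a held-out set of reactions would give a fairer one.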