ABSTRACT
The cyanobacterial phylum encompasses oxygenic photosynthetic prokaryotes of a great breadth of morphologies and ecologies; they play key roles in global carbon and nitrogen cycles. The chloroplasts of all photosynthetic eukaryotes can trace their ancestry to cyanobacteria. Cyanobacteria also attract considerable interest as platforms for "green" biotechnology and biofuels. To explore the molecular basis of their different phenotypes and biochemical capabilities, we sequenced the genomes of 54 phylogenetically and phenotypically diverse cyanobacterial strains. Comparison of cyanobacterial genomes reveals the molecular basis for many aspects of cyanobacterial ecophysiological diversity, as well as the convergence of complex morphologies without the acquisition of novel proteins. This phylum-wide study highlights the benefits of diversity-driven genome sequencing, identifying more than 21,000 cyanobacterial proteins with no detectable similarity to known proteins, and foregrounds the diversity of light-harvesting proteins and gene clusters for secondary metabolite biosynthesis. Additionally, our results provide insight into the distribution of genes of cyanobacterial origin in eukaryotic nuclear genomes. Moreover, this study doubles both the amount and the phylogenetic diversity of cyanobacterial genome sequence data. Given the exponentially growing number of sequenced genomes, this diversity-driven study demonstrates the perspective and insights gained by comparing disparate yet related genomes in a phylum-wide context.
Subject(s)
Cyanobacteria/classification , Cyanobacteria/genetics , Bacterial Genome , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Chlorophyll Binding Proteins/chemistry , Chlorophyll Binding Proteins/genetics , Chlorophyll Binding Proteins/metabolism , Cyanobacteria/metabolism , Molecular Evolution , Genetic Variation , Light-Harvesting Protein Complexes/chemistry , Light-Harvesting Protein Complexes/genetics , Light-Harvesting Protein Complexes/metabolism , Molecular Models , Molecular Sequence Data , Multigene Family , Photosystem I Protein Complex/chemistry , Photosystem I Protein Complex/genetics , Photosystem I Protein Complex/metabolism , Phylogeny , Plastids/genetics , Amino Acid Sequence Homology
ABSTRACT
Bacterial microcompartments (BMCs) are proteinaceous organelles involved in both autotrophic and heterotrophic metabolism. All BMCs share homologous shell proteins but differ in their complement of enzymes; these are typically encoded adjacent to shell protein genes in genetic loci, or operons. To enable the identification and prediction of functional (sub)types of BMCs, we developed LoClass, an algorithm that finds putative BMC loci and inventories, weights, and compares their constituent Pfam domains to construct a locus similarity network and predict locus (sub)types. In addition to using LoClass to analyze sequences in the Non-redundant Protein Database, we compared predicted BMC loci found in seven candidate bacterial phyla (six from single-cell genomic studies) to the LoClass taxonomy. Together, these analyses resulted in the identification of 23 different types of BMCs encoded in 30 distinct locus (sub)types found in 23 bacterial phyla. These include the two carboxysome types and a divergent set of metabolosomes, BMCs that share a common catalytic core and process distinct substrates via specific signature enzymes. Furthermore, many candidate BMCs were found that lack one or more core metabolosome components, including one that is predicted to represent an entirely new paradigm for BMC-associated metabolism, joining the carboxysome and metabolosome. By placing these results in a phylogenetic context, we provide a framework for understanding the horizontal transfer of these loci, a starting point for studies aimed at understanding the evolution of BMCs. This comprehensive taxonomy of BMC loci, based on their constituent protein domains, foregrounds the functional diversity of BMCs and provides a reference for interpreting the role of BMC gene clusters encoded in isolate, single-cell, and metagenomic data.
Many loci encode ancillary functions such as transporters or genes for cofactor assembly; this expanded vocabulary of BMC-related functions should be useful for design of genetic modules for introducing BMCs in bioengineering applications.
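The idea of inventorying, weighting, and comparing Pfam domains to build a locus similarity network can be illustrated with a minimal sketch. This is not the LoClass implementation: the weighted Jaccard score, the weights, the threshold, the locus names, and the placeholder domain labels (`ENZ_X`, `ENZ_Y`, `TRANSPORTER`) are all assumptions made for illustration. Only PF00936 and PF03319 are real shell-protein Pfam families.

```python
from itertools import combinations

def weighted_jaccard(a, b):
    """Weighted Jaccard similarity between two {domain: weight} inventories."""
    domains = set(a) | set(b)
    shared = sum(min(a.get(d, 0.0), b.get(d, 0.0)) for d in domains)
    total = sum(max(a.get(d, 0.0), b.get(d, 0.0)) for d in domains)
    return shared / total if total else 0.0

def similarity_network(loci, threshold):
    """Return (locus, locus, score) edges for pairs scoring at or above threshold."""
    edges = []
    for n1, n2 in combinations(sorted(loci), 2):
        score = weighted_jaccard(loci[n1], loci[n2])
        if score >= threshold:
            edges.append((n1, n2, score))
    return edges

# Hypothetical inventories: shell domains get weight 1.0, and the
# placeholder "signature enzyme" domains are up-weighted (an assumption).
loci = {
    "locus_A": {"PF00936": 1.0, "PF03319": 1.0, "ENZ_X": 2.0},
    "locus_B": {"PF00936": 1.0, "PF03319": 1.0, "ENZ_X": 2.0, "TRANSPORTER": 0.5},
    "locus_C": {"PF00936": 1.0, "PF03319": 1.0, "ENZ_Y": 2.0},
}
edges = similarity_network(loci, threshold=0.6)
```

In this toy setup, loci sharing the up-weighted enzyme domain are linked even when one carries an extra ancillary domain. Loci that share only shell domains fall below the threshold, which is one way connected groups could come to resemble locus (sub)types.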
Subject(s)
Bacteria , Bacterial Proteins , Computational Biology/methods , Gene Regulatory Networks , Aldehyde Dehydrogenase , Algorithms , Bacteria/genetics , Bacteria/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Bioengineering , Gene Regulatory Networks/genetics , Gene Regulatory Networks/physiology , Organelles , Phylogeny
ABSTRACT
Plants rely on the Calvin-Benson (CB) cycle for CO2 fixation. The key carboxylase of the CB cycle is ribulose-1,5-bisphosphate carboxylase/oxygenase (RubisCO). Efforts to enhance carbon fixation in plants have traditionally focused on RubisCO or on approaches that can help to remedy RubisCO's undesirable traits: its low catalytic efficiency and photorespiration. Cyanobacteria may be instrumental in reaching the goal of improving plant photosynthesis. Because of their evolutionary relationship to chloroplasts, they represent ideal model organisms for photosynthesis research. Furthermore, the molecular understanding of cyanobacterial carbon fixation provides a rich source of strategies that can be exploited for the bioengineering of chloroplasts. These strategies include the cyanobacterial carbon concentrating mechanism (CCM), which consists of active and passive transporter systems for inorganic carbon and a specialized organelle, the carboxysome. The carboxysome encapsulates RubisCO together with carbonic anhydrase in a protein shell, resulting in an elevated CO2 concentration around RubisCO. Moreover, cyanobacteria differ from plants in the isoenzymes involved in the CB cycle and the photorespiratory pathways as well as in mechanisms that can affect the activity of RubisCO. In addition, newly available cyanobacterial genome sequence data from the CyanoGEBA project, which has more than doubled the amount of genomic information available for cyanobacteria, increase our knowledge of the CCM and of the occurrence and distribution of genes of interest.
Subject(s)
Cyanobacteria/metabolism , Photosynthesis , Plants/metabolism , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Carbon Dioxide/metabolism , Cyanobacteria/genetics , Metabolic Engineering , Plants/microbiology
ABSTRACT
Members of the phylum Cyanobacteria inhabit ecologically diverse environments. However, the CRISPR-Cas system (clustered regularly interspaced short palindromic repeats and CRISPR-associated genes), an extremely adaptable defense system, has not been surveyed in this phylum. We analyzed 126 cyanobacterial genomes and, surprisingly, found CRISPR-Cas in the majority except the marine subclade (Synechococcus and Prochlorococcus), in which cyanophages are a known force shaping their evolution. Multiple observations of CRISPR loci in the absence of cas1/cas2 genes may represent an early stage of losing a CRISPR-Cas locus. Our findings reveal the widespread distribution of CRISPR-Cas systems in the phylum Cyanobacteria and provide a first step toward systematically understanding these systems in cyanobacteria.
Subject(s)
CRISPR-Cas Systems/genetics , Cyanobacteria/genetics , Cyanobacteria/immunology , Molecular Evolution , Archaeal Genes , Bacterial Genes , Phylogeny
ABSTRACT
Word frequency is a strong predictor in most lexical processing tasks. Thus, any model of word recognition needs to account for how word frequency effects arise. The Discriminative Lexicon Model (DLM) models lexical processing with mappings between words' forms and their meanings. Comprehension and production are modeled via linear mappings between the two domains. So far, the mappings within the model can either be obtained incrementally via error-driven learning, a computationally expensive process able to capture frequency effects, or in an efficient, but frequency-agnostic solution modeling the theoretical endstate of learning (EL) where all words are learned optimally. In the present study we show how an efficient, yet frequency-informed mapping between form and meaning can be obtained (Frequency-informed learning; FIL). We find that FIL well approximates an incremental solution while being computationally much cheaper. FIL shows a relatively low type- and high token-accuracy, demonstrating that the model is able to process most word tokens encountered by speakers in daily life correctly. We use FIL to model reaction times in the Dutch Lexicon Project by means of a Gaussian Location Scale Model and find that FIL predicts well the S-shaped relationship between frequency and the mean of reaction times but underestimates the variance of reaction times for low frequency words. FIL is also better able to account for priming effects in an auditory lexical decision task in Mandarin Chinese, compared to EL. Finally, we used ordered data from CHILDES to compare mappings obtained with FIL and incremental learning. We show that the mappings are highly correlated, but that with FIL some nuances based on word ordering effects are lost. Our results show how frequency effects in a learning model can be simulated efficiently, and raise questions about how to best account for low-frequency words in cognitive models.
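The contrast between a frequency-agnostic endstate mapping and a frequency-informed one can be sketched with ordinary versus weighted least squares. This is a hedged illustration only. It assumes the frequency-informed mapping can be approximated by weighting each word's row by its relative frequency, and the matrices `C` (form) and `S` (meaning), the frequencies, and the function names are toy assumptions, not the authors' implementation.

```python
import numpy as np

def endstate_mapping(C, S):
    """Frequency-agnostic least-squares mapping F such that C @ F approximates S."""
    return np.linalg.pinv(C) @ S

def frequency_weighted_mapping(C, S, freqs):
    """Weighted least squares: each word's error counts in proportion to its frequency."""
    w = np.sqrt(np.asarray(freqs, dtype=float) / np.sum(freqs))
    return np.linalg.pinv(w[:, None] * C) @ (w[:, None] * S)

# Toy data: three words, two form dimensions, one semantic dimension.
# The system is over-determined, so not every word can be mapped exactly.
C = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
S = np.array([[1.0], [1.0], [0.0]])
freqs = [100, 1, 1]  # the first word is far more frequent (hypothetical counts)

F_el = endstate_mapping(C, S)
F_fil = frequency_weighted_mapping(C, S, freqs)

# Absolute prediction error per word under each mapping.
err_el = np.abs(C @ F_el - S).ravel()
err_fil = np.abs(C @ F_fil - S).ravel()
```

Under these assumptions the weighted solution trades accuracy on rare words for accuracy on the frequent one, which is in the spirit of the low type accuracy and high token accuracy reported above.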
ABSTRACT
Carboxysomes are metabolic modules for CO2 fixation that are found in all cyanobacteria and some chemoautotrophic bacteria. They comprise a semi-permeable proteinaceous shell that encapsulates ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) and carbonic anhydrase. Structural studies are revealing the integral role of the shell protein paralogs in carboxysome form and function. The shell proteins are composed of two domain classes: those with the bacterial microcompartment (BMC; Pfam00936) domain, which oligomerize to form (pseudo)hexamers, and those with the CcmL/EutN (Pfam03319) domain, which form pentamers in carboxysomes. These two shell protein types are proposed to be the basis for the carboxysome's icosahedral geometry. The shell proteins are also thought to allow the flux of metabolites across the shell through the small pores formed at their hexameric/pentameric symmetry axes. In this review, we describe bioinformatic and structural analyses that highlight the important primary, tertiary, and quaternary structural features of these conserved shell subunits. In the future, further understanding of these molecular building blocks may provide the basis for enhancing CO2 fixation in other organisms or creating novel biological nanostructures.
Subject(s)
Bacterial Proteins/metabolism , Halothiobacillus/enzymology , Organelles/enzymology , Prochlorococcus/enzymology , Synechocystis/enzymology , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bicarbonates/metabolism , Carbon Cycle , Carbon Dioxide/metabolism , Carbonic Anhydrases/genetics , Carbonic Anhydrases/metabolism , Halothiobacillus/genetics , Halothiobacillus/ultrastructure , Molecular Models , Multigene Family , Organelles/genetics , Organelles/ultrastructure , Photosynthesis , Prochlorococcus/genetics , Prochlorococcus/ultrastructure , Protein Conformation , Ribulose-Bisphosphate Carboxylase/genetics , Ribulose-Bisphosphate Carboxylase/metabolism , Ribulosephosphates/metabolism , Synechocystis/genetics , Synechocystis/ultrastructure
ABSTRACT
Statistical and machine learning approaches predict drug-to-target relationships from 2D small-molecule topology patterns. One might expect 3D information to improve these calculations. Here we apply the logic of the extended connectivity fingerprint (ECFP) to develop a rapid, alignment-invariant 3D representation of molecular conformers, the extended three-dimensional fingerprint (E3FP). By integrating E3FP with the similarity ensemble approach (SEA), we achieve higher precision-recall performance relative to SEA with ECFP on ChEMBL20 and equivalent receiver operating characteristic performance. We identify classes of molecules for which E3FP is a better predictor of similarity in bioactivity than is ECFP. Finally, we report novel drug-to-target binding predictions inaccessible by 2D fingerprints and confirm three of them experimentally with ligand efficiencies from 0.442-0.637 kcal/mol/heavy atom.
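Fingerprint-based comparisons of this kind usually reduce to a set-overlap score between the "on" bits of two fingerprints. The sketch below uses the Tanimoto coefficient, a common choice for such comparisons. The bit indices are made up, and in practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Hypothetical on-bit indices for two molecules under a 2D and a 3D fingerprint.
mol1_2d, mol2_2d = {3, 17, 42, 99}, {3, 17, 42, 99}
mol1_3d, mol2_3d = {5, 8, 130, 512}, {5, 8, 77, 900}

sim_2d = tanimoto(mol1_2d, mol2_2d)  # identical bit sets
sim_3d = tanimoto(mol1_3d, mol2_3d)  # 2 shared bits out of 6 distinct bits
```

The toy values show how two molecules that look identical under one representation can be scored as dissimilar under another, which is the kind of disagreement that would let a 3D fingerprint rank pairs differently from a 2D one.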
Subject(s)
Molecular Conformation , Small Molecule Libraries/chemistry , Computer-Aided Design , Drug Design , Ligands , Molecular Models , Small Molecule Libraries/pharmacology
ABSTRACT
Ubiquitin is essential for eukaryotic life and varies in only 3 amino acid positions between yeast and humans. However, recent deep sequencing studies indicate that ubiquitin is highly tolerant to single mutations. We hypothesized that this tolerance would be reduced by chemically induced physiologic perturbations. To test this hypothesis, a class of first-year UCSF graduate students employed deep mutational scanning to determine the fitness landscape of all possible single-residue mutations in the presence of five different small-molecule perturbations. These perturbations uncover 'shared sensitized positions' localized to areas around the hydrophobic patch and the C-terminus. In addition, we identified perturbation-specific effects such as a sensitization of His68 in HU and a tolerance to mutation at Lys63 in DTT. Our data show how chemical stresses can reduce buffering effects in the ubiquitin proteasome system. Finally, this study demonstrates the potential of a lab-based interdisciplinary graduate curriculum.
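In deep mutational scanning, a fitness score for a variant is often derived from how its sequencing read counts change over a selection relative to wild type. The sketch below shows one common form, a log2 enrichment ratio. It is an assumed scoring scheme for illustration, not necessarily the one used in this study, and the function name, the counts, and the pseudocount are hypothetical.

```python
import math

def relative_fitness(mut_before, mut_after, wt_before, wt_after, pseudocount=0.5):
    """Log2 enrichment of a mutant relative to wild type across a selection.

    A pseudocount is added to every count to avoid division by zero and
    log(0) for variants that drop out of the population.
    """
    mut_ratio = (mut_after + pseudocount) / (mut_before + pseudocount)
    wt_ratio = (wt_after + pseudocount) / (wt_before + pseudocount)
    return math.log2(mut_ratio / wt_ratio)

# Hypothetical read counts for one variant under two conditions.
score_unperturbed = relative_fitness(100, 100, 1000, 1000)
score_perturbed = relative_fitness(100, 25, 1000, 1000, pseudocount=0)
```

A position would count as sensitized in this picture when its variants score near zero without a perturbation but clearly negative with it.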
Subject(s)
DNA Mutational Analysis , Mutant Proteins/genetics , Mutant Proteins/metabolism , Saccharomyces cerevisiae/enzymology , Physiological Stress , Ubiquitin/genetics , Ubiquitin/metabolism , Biology/education , Humans , Proteasome Endopeptidase Complex/genetics , Proteasome Endopeptidase Complex/metabolism , Saccharomyces cerevisiae/physiology , Students , Universities
ABSTRACT
Bacterial microcompartments (BMCs) sequester enzymes from the cytoplasmic environment by encapsulation inside a selectively permeable protein shell. Bioinformatic analyses indicate that many bacteria encode BMC clusters of unknown function and with diverse combinations of shell proteins. The genome of the halophilic myxobacterium Haliangium ochraceum encodes one of the most atypical sets of shell proteins in terms of composition and primary structure. We found that microcompartment shells could be purified in high yield when all seven H. ochraceum BMC shell genes were expressed from a synthetic operon in Escherichia coli. These shells differ substantially from previously isolated shell systems in that they are considerably smaller and more homogeneous, with measured diameters of 39 ± 2 nm. The size and nearly uniform geometry allowed the development of a structural model for the shells composed of 260 hexagonal units and 13 hexagons per icosahedral face. We found that new proteins could be recruited to the shells by fusion to a predicted targeting peptide sequence, setting the stage for the use of these remarkably homogeneous shells for applications such as three-dimensional scaffolding and the construction of synthetic BMCs. Our results demonstrate the value of selecting from the diversity of BMC shell building blocks found in genomic sequence data for the construction of novel compartments.
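The two figures quoted for the structural model agree with each other, since an icosahedron has 20 faces. As a hedged aside, if one further assumes Caspar-Klug quasi-equivalence (an assumption not stated in the abstract), in which a closed shell has 12 pentagonal vertices and 10(T - 1) hexagonal units, the same count would correspond to a triangulation number of 27:

```latex
\[
N_{\mathrm{hex}} = 20 \times 13 = 260,
\qquad
N_{\mathrm{hex}} = 10\,(T - 1) \;\Longrightarrow\; T = \frac{260}{10} + 1 = 27 .
\]
```

Here $N_{\mathrm{hex}}$ and $T$ are symbols introduced only for this check.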
Subject(s)
Bacterial Proteins/chemistry , Cell Compartmentation , Myxococcales/chemistry , Myxococcales/physiology , Myxococcales/ultrastructure , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Bacterial Gene Expression Regulation , Molecular Models , Operon , Organelles , Protein Multimerization , Quaternary Protein Structure
ABSTRACT
Biological soil crusts (BSCs) cover extensive portions of the earth's deserts. In order to survive desiccation cycles and utilize short periods of activity during infrequent precipitation, crust microorganisms must rely on the unique capabilities of vegetative cells to enter a dormant state and be poised for rapid resuscitation upon wetting. To elucidate the key events involved in the exit from dormancy, we performed a wetting experiment of a BSC and followed the response of the dominant cyanobacterium, Microcoleus vaginatus, in situ using a whole-genome transcriptional time course that included two diel cycles. Immediate, but transient, induction of DNA repair and regulatory genes signaled the hydration event. Recovery of photosynthesis occurred within 1 h, accompanied by upregulation of anabolic pathways. Onset of desiccation was characterized by the induction of genes for oxidative and photo-oxidative stress responses, osmotic stress response and the synthesis of C and N storage polymers. Early expression of genes for the production of exopolysaccharides, additional storage molecules and membrane unsaturation occurred before drying and hints at preparedness for desiccation. We also observed signatures of preparation for future precipitation, notably the expression of genes for anaplerotic reactions in drying crusts, and the stable maintenance of mRNA through dormancy. These data shed light on possible synchronization between this cyanobacterium and its environment, and provide key mechanistic insights into its metabolism in situ that may be used to predict its response to climate- and/or land-use-driven perturbations.