ABSTRACT
Microorganisms that degrade biomass produce diverse assortments of carbohydrate-active enzymes and binding modules. Despite tremendous advances in the genomic sequencing of these organisms, many genes do not have an ascribed function owing to low sequence identity to genes that have been annotated. Consequently, biochemical and structural characterization of genes with unknown function is required to complement the rapidly growing pool of genomic sequencing data. A protein with previously unknown function (Cthe_2159) was recently isolated in a genome-wide screen using phage display to identify cellulose-binding protein domains from the biomass-degrading bacterium Clostridium thermocellum. Here, the crystal structure of Cthe_2159 is presented and it is shown that it is a unique right-handed parallel β-helix protein. Despite very low sequence identity to known β-helix or carbohydrate-active proteins, Cthe_2159 displays structural features that are very similar to those of polysaccharide lyase (PL) families 1, 3, 6 and 9. Cthe_2159 is conserved across bacteria and some archaea and is a member of the domain of unknown function family DUF4353. This suggests that Cthe_2159 is the first representative of a previously unknown family of cellulose and/or acid-sugar binding β-helix proteins that share structural similarities with PLs. Importantly, these results demonstrate how functional annotation by biochemical and structural analysis remains a critical tool in the characterization of new gene products.
Subjects
Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Calcium/chemistry , Calcium/metabolism , Cellulose/metabolism , Molecular Cloning , Clostridium thermocellum/chemistry , Clostridium thermocellum/metabolism , X-Ray Crystallography , Gadolinium/chemistry , Molecular Models , Polysaccharide-Lyases/chemistry , Protein Conformation , Tertiary Protein Structure , Protein Structural Homology
ABSTRACT
Lignocellulosic biomass is an important feedstock for the pulp and paper industry as well as emerging biofuel and biomaterial industries. However, the recalcitrance of the secondary cell wall to chemical or enzymatic degradation remains a major hurdle for efficient extraction of economically important biopolymers such as cellulose. It has been estimated that approximately 10-15% of about 27,000 protein-coding genes in the Arabidopsis genome are dedicated to cell wall development; however, only about 130 Arabidopsis genes thus far have experimental evidence validating cell wall function. While many genes have been implicated through co-expression analysis with known genes, a large number are broadly classified as proteins of unknown function (PUFs). Recently the functionality of some of these unknown proteins in cell wall development has been revealed using reverse genetic approaches. Given the large number of cell wall-related PUFs, how do we approach and subsequently prioritize the investigation of such unknown genes that may be essential to or influence plant cell wall development and structure? Here, we address the aforementioned question in two parts; we first identify the different kinds of PUFs based on known and predicted features such as protein domains. Knowledge of inherent features of PUFs may allow for functional inference and a concomitant link to biological context. Secondly, we discuss omics-based technologies and approaches that are helping identify and prioritize cell wall-related PUFs by functional association. In this way, hypothesis-driven experiments can be designed for functional elucidation of many proteins that remain missing links in our understanding of plant cell wall biosynthesis.