ABSTRACT
We take advantage of a historic collection of 133 Staphylococcus aureus strains accessioned between 1924 and 2016, whose genomes have been long-read sequenced as part of a major National Collection of Type Cultures (NCTC) initiative, to conduct a gene family-wide computational analysis of enterotoxin genes. We identify two novel staphylococcal enterotoxin (pseudo)genes (sel29p and sel30), the former of which has not been observed in any contemporary strain to date. We provide further information on five additional enterotoxin genes or gene variants that either have recently entered the literature or for which the nomenclature or description is currently unclear (selz, sel26, sel27, sel28, and ses-2p). An examination of over 11,000 RefSeq genomes in search of wider support for these seven (pseudo)genes led to the identification of an additional three novel enterotoxin gene family members (sel31, sel32, and sel33) plus two new variants (seh-2p and ses-3p). We cast light on the genomic distribution of the enterotoxin genes, further defining their arrangement in gene clusters. Finally, we show that co-occurrence of enterotoxin genes is prevalent, with individual NCTC strains possessing as many as 18 enterotoxin genes and pseudogenes, and that clonal complex membership rather than time of isolation is the key factor in determining enterotoxin load.
IMPORTANCE: Staphylococcus aureus strains pose a significant health risk to both human and animal populations. Key among this species' virulence factors is the staphylococcal enterotoxin gene family. Certain enterotoxin forms can induce a potentially life-threatening immune response, while others are implicated in less fatal though often severe conditions such as food poisoning. Genetic characterization of staphylococcal enterotoxin gene family members has steadily accumulated over recent decades, with over 20 genes now established in the literature. Despite the current wealth of knowledge on this important gene family, questions remain about the presence of additional enterotoxin genes and the genomic composition of family members. This study further expands knowledge of the staphylococcal enterotoxins while shedding light on their evolution over the last century.
Subjects
Enterotoxins/genetics; Evolution, Molecular; Genome, Bacterial; Staphylococcus aureus/genetics; Virulence Factors/genetics; Animals; Databases, Genetic; Genes, Bacterial; High-Throughput Nucleotide Sequencing; Humans; Multigene Family; Phylogeny; Plasmids; Pseudogenes; Staphylococcal Infections/microbiology; Staphylococcus aureus/classification; Staphylococcus aureus/pathogenicity; Virulence/genetics; Whole Genome Sequencing
ABSTRACT
The wealth of phylogenetic information accumulated over many decades of biological research, coupled with recent technological advances in molecular sequence generation, presents significant opportunities for researchers to investigate relationships across and within the kingdoms of life. However, to make best use of this data wealth, several problems must first be overcome. One key problem is finding effective strategies to deal with missing data. Here, we introduce Lasso, a novel heuristic approach for reconstructing rooted phylogenetic trees from distance matrices with missing values, for data sets where a molecular clock may be assumed. Contrary to other phylogenetic methods on partial data sets, Lasso possesses desirable properties such as its reconstructed trees being both unique and edge-weighted. These properties are achieved by Lasso restricting its leaf set to a large subset of all possible taxa, which in many practical situations is the entire taxa set. Furthermore, the Lasso approach is distance-based, rendering it very fast to run and suitable for data sets of all sizes, including large data sets such as those generated by modern Next Generation Sequencing technologies. To better understand the performance of Lasso, we assessed it by means of artificial and real biological data sets, showing its effectiveness in the presence of missing data. Furthermore, by formulating the supermatrix problem as a particular case of the missing data problem, we assessed Lasso's ability to reconstruct supertrees. We demonstrate that, although not specifically designed for such a purpose, Lasso performs better than or comparably with five leading supertree algorithms on a challenging biological data set. Finally, we make freely available a software implementation of Lasso so that researchers may, for the first time, perform both rooted tree and supertree reconstruction with branch lengths on their own partial data sets.
Subjects
Databases, Genetic; Models, Genetic; Phylogeny; Algorithms; High-Throughput Nucleotide Sequencing/methods; Saccharomyces cerevisiae/classification; Saccharomyces cerevisiae/genetics; Software; Triticum/classification; Triticum/genetics
ABSTRACT
Five British ale yeast strains were subjected to flavour profiling under brewery fermentation conditions in which all other brewing parameters were kept constant. Significant variation was observed in the timing and quantity of flavour-related chemicals produced. Genetic tests showed no evidence of hybrid origins in any of the strains, including one strain previously reported as a possible hybrid of Saccharomyces cerevisiae and S. bayanus. Variation maintained in historical S. cerevisiae ale yeast collections is highlighted as a potential source of novelty in innovative strain improvement for bioflavour production.
Subjects
Beer/analysis; Beer/microbiology; Flavoring Agents/metabolism; Saccharomyces/metabolism; Fermentation; Flavoring Agents/analysis; Saccharomyces/genetics; Saccharomyces/isolation & purification
ABSTRACT
The ribosomal RNA encapsulates a wealth of evolutionary information, including genetic variation that can be used to discriminate between organisms at a wide range of taxonomic levels. For example, the prokaryotic 16S rDNA sequence is very widely used both in phylogenetic studies and as a marker in metagenomic surveys and the internal transcribed spacer region, frequently used in plant phylogenetics, is now recognized as a fungal DNA barcode. However, this widespread use does not escape criticism, principally due to issues such as difficulties in classification of paralogous versus orthologous rDNA units and intragenomic variation, both of which may be significant barriers to accurate phylogenetic inference. We recently analyzed data sets from the Saccharomyces Genome Resequencing Project, characterizing rDNA sequence variation within multiple strains of the baker's yeast Saccharomyces cerevisiae and its nearest wild relative Saccharomyces paradoxus in unprecedented detail. Notably, both species possess single locus rDNA systems. Here, we use these new variation datasets to assess whether a more detailed characterization of the rDNA locus can alleviate the second of these phylogenetic issues, sequence heterogeneity, while controlling for the first. We demonstrate that a strong phylogenetic signal exists within both datasets and illustrate how they can be used, with existing methodology, to estimate intraspecies phylogenies of yeast strains consistent with those derived from whole-genome approaches. We also describe the use of partial Single Nucleotide Polymorphisms, a type of sequence variation found only in repetitive genomic regions, in identifying key evolutionary features such as genome hybridization events and show their consistency with whole-genome Structure analyses. 
We conclude that our approach can transform rDNA sequence heterogeneity from a problem into a useful source of evolutionary information, enabling the estimation of highly accurate phylogenies of closely related organisms, and discuss how it could be extended to future studies of multilocus rDNA systems. [concerted evolution; genome hybridisation; phylogenetic analysis; ribosomal DNA; whole genome sequencing; yeast].
Subjects
DNA, Ribosomal/genetics; Genetic Heterogeneity; Genome, Fungal/genetics; Phylogeny; Saccharomyces/classification; Saccharomyces/genetics; DNA, Fungal/genetics; Genetic Variation; Polymorphism, Single Nucleotide/genetics
ABSTRACT
KEY MESSAGE: A high level of genetic diversity was found in the A. E. Watkins bread wheat landrace collection. Genotypic information was used to determine the population structure and to develop germplasm resources. In the 1930s A. E. Watkins acquired landrace cultivars of bread wheat (Triticum aestivum L.) through official channels of the Board of Trade in London, many of which originated from local markets in 32 countries. The geographic distribution of the 826 landrace cultivars of the current collection, here called the Watkins collection, covers many Asian and European countries and some African countries. The cultivars were genotyped with 41 microsatellite markers in order to investigate the genetic diversity and population structure of the collection. A high level of genetic diversity was found, higher than in a collection of modern European winter bread wheat varieties from 1945 to 2000. Furthermore, although weak, the population structure of the Watkins collection reveals nine ancestral geographical groupings. An exchange of genetic material between ancestral groups before commercial wheat-breeding started would be a possible explanation for this. The increased knowledge regarding the diversity of the Watkins collection was used to develop resources for wheat research and breeding, one of them a core set, which captures the majority of the genetic diversity detected. The understanding of genetic diversity and population structure together with the availability of breeding resources should help to accelerate the detection of new alleles in the Watkins collection.
Subjects
Bread; Ecotype; Genes, Plant; Genetic Association Studies; Triticum/genetics; Genetic Variation; Genotyping Techniques; Geography; Microsatellite Repeats; Phenotype; Population Dynamics
ABSTRACT
Here, we report on the one hundred and twenty-five bacterial strains made available by the National Collection of Type Cultures in 2022 alongside a commentary on the strains, their provenance and significance.
ABSTRACT
The National Collection of Type Cultures (NCTC) was founded on 1 January 1920 in order to fulfil a recognized need for a centralized repository for bacterial and fungal strains within the UK. It is among the longest-established collections of its kind anywhere in the world and today holds approximately 6000 type and reference bacterial strains - many of medical, scientific and veterinary importance - available to academic, health, food and veterinary institutions worldwide. Recently, a collaboration between NCTC, Pacific Biosciences and the Wellcome Sanger Institute established the NCTC3000 project to long-read sequence and assemble the genomes of up to 3000 NCTC strains. Here, at the beginning of the collection's second century, we introduce the resulting NCTC3000 sequence read datasets, genome assemblies and annotations as a unique, historically and scientifically relevant resource for the benefit of the international bacterial research community.
Subjects
Genome, Bacterial; Genomics; Sequence Analysis, DNA/methods; Genome, Bacterial/genetics; Bacteria/genetics
ABSTRACT
Triterpenes are one of the largest classes of plant metabolites and have important functions. A diverse array of triterpenoid skeletons are synthesized via the isoprenoid pathway by enzymatic cyclization of 2,3-oxidosqualene. The genomes of the lower plants Chlamydomonas reinhardtii and moss (Physcomitrella patens) contain just one oxidosqualene cyclase (OSC) gene (for sterol biosynthesis), whereas the genomes of higher plants contain nine to 16 OSC genes. Here we carry out functional analysis of rice OSCs and rigorous phylogenetic analysis of 96 OSCs from higher plants, including Arabidopsis thaliana, Oryza sativa, Sorghum bicolor and Brachypodium distachyon. The functional analysis identified an amino acid sequence for isoarborinol synthase (OsIAS) (encoded by Os11g35710/OsOSC11) in rice. Our phylogenetic analysis suggests that expansion of OSC members in higher plants has occurred mainly through tandem duplication followed by positive selection and diversifying evolution, and consolidated the previous suggestion that dicot triterpene synthases have been derived from an ancestral lanosterol synthase instead of directly from their cycloartenol synthases. The phylogenetic trees are consistent with the reaction mechanisms of the protosteryl and dammarenyl cations which parent a wide variety of triterpene skeletal types, allowing us to predict the functions of the uncharacterized OSCs.
Subjects
Intramolecular Transferases/genetics; Intramolecular Transferases/metabolism; Arabidopsis/enzymology; Brachypodium/enzymology; Cyclization; Evolution, Molecular; Gene Duplication; Gene Expression Regulation, Enzymologic; Gene Expression Regulation, Plant; Multigene Family; Oryza/enzymology; Oryza/genetics; Phylogeny; Sorghum/enzymology; Squalene/analogs & derivatives; Squalene/metabolism; Triterpenes/chemistry; Triterpenes/metabolism
ABSTRACT
The estimation of genetic linkage maps is a key component in plant and animal research, providing both an indication of the genetic structure of an organism and a mechanism for identifying candidate genes associated with traits of interest. Because of this importance, several computational solutions to genetic map estimation exist, mostly implemented as stand-alone software packages. However, the estimation process is often largely hidden from the user. Consequently, problems such as a program crashing may occur that leave a user baffled. THREaD Mapper Studio (http://cbr.jic.ac.uk/threadmapper) is a new web site that implements a novel, visual and interactive method for the estimation of genetic linkage maps from DNA markers. The rationale behind the web site is to make the estimation process as transparent and robust as possible, while also allowing users to use their expert knowledge during analysis. Indeed, the 3D visual nature of the tool allows users to spot features in a data set, such as outlying markers and potential structural rearrangements that could cause problems with the estimation procedure and to account for them in their analysis. Furthermore, THREaD Mapper Studio facilitates the visual comparison of genetic map solutions from third party software, aiding users in developing robust solutions for their data sets.
Subjects
Genetic Linkage; Software; Chromosome Mapping; Computational Biology; Computer Graphics; Internet
ABSTRACT
The type VII secretion system (T7SS) is found in many Gram-positive firmicutes and secretes protein toxins that mediate bacterial antagonism. Two T7SS toxins have been identified in Staphylococcus aureus: EsaD, a nuclease toxin that is counteracted by the EsaG immunity protein, and TspA, which has membrane depolarising activity and is neutralised by TsaI. Both toxins are polymorphic, and strings of non-identical esaG and tsaI immunity genes are encoded in all S. aureus strains. To investigate the evolution of esaG repertoires, we analysed the sequences of the tandem esaG genes and their encoded proteins. We identified three blocks of high sequence similarity shared by all esaG genes and identified evidence of extensive recombination events between esaG paralogues facilitated through these conserved sequence blocks. Recombination between these blocks accounts for loss and expansion of esaG genes in S. aureus genomes and we identified evidence of such events during evolution of strains in clonal complex 8. TipC, an immunity protein for the TelC lipid II phosphatase toxin secreted by the streptococcal T7SS, is also encoded by multiple gene paralogues. Two blocks of high sequence similarity locate to the 5' and 3' ends of tipC genes, and we found strong evidence for recombination between tipC paralogues encoded by Streptococcus mitis BCC08. By contrast, we found only a single homology block across tsaI genes, and little evidence for intergenic recombination within this gene family. We conclude that homologous recombination is one of the drivers for the evolution of T7SS immunity gene clusters.
Subjects
Staphylococcal Infections; Type VII Secretion Systems; Bacteria/metabolism; Homologous Recombination; Humans; Staphylococcus aureus/genetics; Staphylococcus aureus/metabolism; Type VII Secretion Systems/genetics; Type VII Secretion Systems/metabolism
ABSTRACT
Here, we report on the 47 bacterial strains made available by the National Collection of Type Cultures in 2021, alongside a commentary on these strains and their significance.
ABSTRACT
Genetic maps are an important component within the plant biologist's toolkit, underpinning crop plant improvement programs. The estimation of plant genetic maps is a conceptually simple yet computationally complex problem, growing ever more so with the development of inexpensive, high-throughput DNA markers. The challenge for bioinformaticians is to develop analytical methods and accompanying software tools that can cope with datasets of differing sizes, from tens to thousands of markers, that can incorporate the expert knowledge that plant biologists typically use when developing their maps, and that facilitate user-friendly approaches to achieving these goals. Here, we aim to give a flavour of computational approaches for genetic map estimation, discussing briefly many of the key concepts involved, and describing a selection of software tools that employ them. This review is intended both for plant geneticists as an introduction to software tools with which to estimate genetic maps, and for bioinformaticians as an introduction to the underlying computational approaches.
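As a small illustration of one of the key concepts behind genetic map estimation, the sketch below converts an observed recombination fraction between two markers into a map distance in centiMorgans using the Haldane and Kosambi mapping functions. The abstract does not say which mapping functions the review covers, so the choice of these two is an assumption, and the function names are hypothetical.

```python
import math


def recombination_fraction(recombinants, total):
    """Estimate the recombination fraction r as recombinants / total gametes scored."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return recombinants / total


def _check(r):
    # Both mapping functions are only defined for 0 <= r < 0.5.
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")


def haldane_cm(r):
    """Map distance in cM under Haldane's function (assumes no crossover interference)."""
    _check(r)
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Map distance in cM under Kosambi's function (allows for some interference)."""
    _check(r)
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


if __name__ == "__main__":
    # Hypothetical counts: 10 recombinants among 100 scored gametes.
    r = recombination_fraction(10, 100)
    print(round(haldane_cm(r), 2), round(kosambi_cm(r), 2))
```

For small recombination fractions the two functions give similar distances; they diverge as r approaches 0.5, which is one reason the choice of mapping function matters for sparse maps.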
Subjects
Algorithms; DNA, Plant/genetics; Genetic Linkage/genetics; Genome, Plant/genetics; Plants/genetics; Software; Chromosome Mapping/methods
ABSTRACT
UNLABELLED: TURNIP comprises a suite of Perl scripts and modules that facilitates the resolution of microheterogeneity within hard-to-assemble repetitive DNA sequences. TURNIP was originally developed for the Saccharomyces Genome Resequencing Project (SGRP), within which the ribosomal DNA (rDNA) of 36 strains of S. cerevisiae was analysed to investigate the occurrence of potential polymorphisms. Here, 'partially resolved SNPs', or pSNPs, as well as indels, were found to be far more prevalent than previously suspected. More generally, the TURNIP software ascertains degrees of variation between large tandem repeats within a single locus, offering insights into mechanisms of genome stability and gene conversion in any organism for which genome sequence data are available. AVAILABILITY: The TURNIP source code, results files and online help are available at http://www.ncyc.co.uk/software/turnip.html.
Subjects
DNA/chemistry; Genomics/methods; Polymorphism, Genetic; Repetitive Sequences, Nucleic Acid/genetics; Sequence Analysis, DNA/methods; Base Sequence; Genome; Genomic Instability; Molecular Sequence Data; Software
ABSTRACT
The first committed step in sterol biosynthesis in plants involves the cyclization of 2,3-oxidosqualene by the oxidosqualene cyclase (OSC) enzyme cycloartenol synthase. 2,3-Oxidosqualene is also a precursor for triterpene synthesis. Antimicrobial triterpenes are common in dicots, but seldom found in monocots, with the notable exception of oat. Here, through genome mining and metabolic engineering, we investigate the potential for triterpene synthesis in rice. The first two steps in the oat triterpene pathway are catalysed by a divergent OSC (AsbAS1) and a cytochrome P450 (CYP51). The genes for these enzymes form part of a metabolic gene cluster. To investigate the origins of triterpene synthesis in monocots, we analysed systematically the OSC and CYP51 gene families in rice. We also engineered rice for elevated triterpene content. We discovered a total of 12 OSC and 12 CYP51 genes in rice and uncovered key events in the evolution of triterpene synthesis. We further showed that the expression of AsbAS1 in rice leads to the accumulation of the simple triterpene, ß-amyrin. These findings provide new insights into the evolution of triterpene synthesis in monocots and open up opportunities for metabolic engineering for disease resistance in rice and other cereals.
Subjects
Oleanolic Acid/analogs & derivatives; Oryza/metabolism; Plant Proteins/metabolism; Squalene/analogs & derivatives; Triterpenes/metabolism; Amino Acid Sequence; Biological Evolution; Genome, Plant/genetics; Intramolecular Transferases/genetics; Intramolecular Transferases/metabolism; Molecular Sequence Annotation; Multigene Family; Oleanolic Acid/metabolism; Oryza/genetics; Phylogeny; Plant Proteins/genetics; Plants, Genetically Modified/genetics; Plants, Genetically Modified/metabolism; Poaceae/genetics; Sequence Alignment; Squalene/metabolism; Sterol 14-Demethylase/genetics; Sterol 14-Demethylase/metabolism
ABSTRACT
The grass species Brachypodium distachyon (hereafter, Brachypodium) has been adopted as a model system for grasses. Here, we describe the development of a genetic linkage map of Brachypodium. The genetic linkage map was developed with an F2 population from a cross between the diploid Brachypodium lines Bd3-1 and Bd21. The map was populated with polymorphic simple sequence repeat (SSR) markers from Brachypodium expressed sequence tag (EST) and bacterial artificial chromosome (BAC) end sequences and conserved orthologous sequence (COS) markers from other grass species. The map is 1386 cM in length and consists of 139 marker loci distributed across 20 linkage groups. Five of the linkage groups exceed 100 cM in length, with the largest being 231 cM long. Assessment of colinearity between the Brachypodium linkage map and the rice genome sequence revealed significant regions of macrosynteny between the two genomes, as well as rearrangements similar to those reported in other grass comparative structural genomics studies. The Brachypodium genetic linkage map described here will serve as a new tool to pursue a range of molecular genetic analyses and other applications in this new model plant system.
Subjects
Chromosome Mapping/methods; Microsatellite Repeats/genetics; Models, Theoretical; Poaceae/genetics; Base Sequence; Chromosomes, Plant; Cluster Analysis; Genes, Plant; Models, Biological; Sequence Alignment; Sequence Analysis, DNA
ABSTRACT
The human pathogen Candida albicans is considered an obligate commensal of animals, yet it is occasionally isolated from trees, shrubs, and grass. We generated genome sequence data for three strains of C. albicans that we isolated from oak trees in an ancient wood pasture, and compared these to the genomes of over 200 clinical strains. C. albicans strains from oak are similar to clinical C. albicans in that they are predominantly diploid and can become homozygous at the mating locus through whole-chromosome loss of heterozygosity. Oak strains differed from clinical strains in showing slightly higher levels of heterozygosity genome-wide. Using phylogenomic analyses and in silico chromosome painting, we show that each oak strain is more closely related to strains from humans and other animals than to strains from other oaks. The high genetic diversity of C. albicans from old oaks shows that they can live in this environment for extended periods of time.
Subjects
Candida albicans/genetics; Genome, Fungal; Phylogeny; Candida albicans/classification; Candida albicans/pathogenicity; Diploidy; Evolution, Molecular; Genes, Mating Type, Fungal; Quercus/microbiology
ABSTRACT
UNLABELLED: MPP is a Java application, encompassing both new and established algorithms, for the analysis of gene and marker content datasets arising from high-throughput microarray techniques. MPP analyses flat file output from microarray experiments to determine the probability of the presence or absence of genes or markers within a genome. MPP can construct gene or marker content datasets for a number of genomes and can use the data to estimate an evolutionary tree or network. Results from gene content analyses may be validated by comparing them to known gene contents. MPP was initially developed to analyse data derived from comparative genome hybridization (CGH) microarray experiments in fungi and bacteria. It has recently been adapted to analyse retrotransposon-based insertion polymorphism (RBIP) marker scores derived from tagged microarray marker (TAM) experiments in pea. New analytical procedures may be added easily to MPP as plugins in order to increase the scope of the software. AVAILABILITY: MPP source code, executables and online help are available at http://cbr.jic.ac.uk/dicks/software/
Subjects
Chromosome Mapping/methods; Databases, Genetic; Gene Dosage/genetics; Genetic Markers/genetics; Oligonucleotide Array Sequence Analysis/methods; Sequence Analysis, DNA/methods; Software; Algorithms; Information Storage and Retrieval/methods; Phylogeny; Programming Languages; Sequence Alignment/methods
ABSTRACT
A web-based tool, the Interspecies Transcription Factor Function Finder (IT3F), has been developed to display both evolutionary gene relationships and expression data for plant transcription factors, focussing primarily on the R2R3MYB gene subfamily for proof of concept. The graphical display of information allows users to make direct comparisons between structurally related genes and to identify those genes that are potentially orthologous, thereby assisting with their understanding of gene function. A key feature of the website is the provision of an interrogative phylogenetic tree that allows submission of new sequences corresponding to a transcription factor family or subfamily and maps their relative positions to the products of other genes on an 'existing' tree containing proteins encoded by Arabidopsis and rice genes, along with key proteins encoded by genes from other species that have been characterised functionally. In addition, a feature to select clusters of related sequences has been developed so that more detailed phylogenetic analysis can be performed to highlight potential orthologous and paralogous genes within related clusters. Arabidopsis genes that reside on duplicated regions of the genome are indicated on the tree, providing further information for interpreting gene function. An additional feature of the website allows a selected number of key Arabidopsis and rice microarray experiments to be visualised alongside the tree as a tabulated heat map of expression intensity values. Through this display, it is possible to observe relative expression levels across a whole gene family and the extent to which the expression of closely related genes within subgroups has altered since their ancestral divergence. The website is available at http://jicbio.nbi.ac.uk/IT3F/.
Subjects
Internet; Plant Proteins/genetics; Software; Transcription Factors/genetics; Algorithms; Amino Acid Sequence; Arabidopsis Proteins/classification; Arabidopsis Proteins/genetics; Arabidopsis Proteins/physiology; Computational Biology/methods; Molecular Sequence Data; Phylogeny; Plant Proteins/physiology; Sequence Homology, Amino Acid; Transcription Factors/classification; Transcription Factors/physiology
ABSTRACT
BACKGROUND: Rice husk and rice straw represent promising sources of biomass for production of renewable fuels and chemicals. For efficient utilisation, lignocellulosic components must first be pretreated to enable efficient enzymatic saccharification and subsequent fermentation. Existing pretreatments create breakdown products such as sugar-derived furans, and lignin-derived phenolics that inhibit enzymes and fermenting organisms. Alkali pretreatments have also been shown to release significant levels of simple, free phenolics such as ferulic acid that are normally esterified to cell wall polysaccharides in the intact plant. These phenolics have recently been found to have considerable inhibitory properties. The aim of this research has been to establish the extent to which such free phenolic acids are also released during hydrothermal pretreatment of rice straw (RS) and rice husk (RH). RESULTS: RS and RH were subjected to hydrothermal pretreatments over a wide range of severities (1.57-5.45). FTIR analysis showed that the pretreatments hydrolysed and solubilised hemicellulosic moieties, leading to an enrichment of lignin and crystalline cellulose in the insoluble residue. The residues also lost the capacity for UV autofluorescence at pH 7 or pH 10, indicating the breakdown or release of cell wall phenolics. Saponification of raw RS and RH enabled identification and quantification of substantial levels of simple phenolics including ferulic acid (tFA), coumaric acid (pCA) and several diferulic acids (DiFAs) including 8-O-4'-DiFA, 8,5'-DiFA and 5,5'-DiFA. RH had higher levels of pCA and lower levels of tFA and DiFAs compared with RS. Assessment of the pretreatment liquors revealed that pretreatment-liberated phenolics present were not free but remained as phenolic esters (at mM concentrations) that could be readily freed by saponification. Many were lost, presumably through degradation, at the higher severities. 
CONCLUSION: Differences in lignin, tFA, DiFAs and pCA between RS and RH reflect differences in cell wall physiology, and probably contribute to the higher recalcitrance of RH compared with RS. Hydrothermal pretreatments, unlike alkali pretreatments, release cinnamic acid components as esters. The potential for pretreatment-liberated phenolic esters to be inhibitory to fermenting microorganisms is not known. However, the present study shows that they are found at concentrations that could be significantly inhibitory if released as free forms by enzyme activity.
ABSTRACT
BACKGROUND: Rice straw and husk are globally significant sources of cellulose-rich biomass and there is great interest in converting them to bioethanol. However, rice husk is reportedly much more recalcitrant than rice straw and produces larger quantities of fermentation inhibitors. The aim of this study was to explore the underlying differences between rice straw and rice husk with reference to the composition of the pre-treatment liquors and their impacts on saccharification and fermentation. This has been carried out by developing quantitative NMR screening methods. RESULTS: Air-dried rice husk and rice straw from the same cultivar were used as substrates. Carbohydrate compositions were similar, whereas lignin contents differed significantly (husk: 35.3% w/w of raw material; straw 22.1% w/w of raw material). Substrates were hydrothermally pre-treated with high-pressure microwave processing across a wide range of severities. 25 compounds were identified from the liquors of both pre-treated rice husk and rice straw. However, the quantities of compounds differed between the two substrates. Fermentation inhibitors such as 5-HMF and 2-FA were highest in husk liquors, and formic acid was higher in straw liquors. At a pre-treatment severity of 3.65, twice as much ethanol was produced from rice straw (14.22% dry weight of substrate) compared with the yield from rice husk (7.55% dry weight of substrate). Above severities of 5, fermentation was inhibited in both straw and husk. In addition to inhibitors, high levels of cellulase-inhibiting xylo-oligomers and xylose were found and at much higher concentrations in rice husk liquor. At low severities, organic acids and related intracellular metabolites were released into the liquor. CONCLUSIONS: Rice husk recalcitrance to saccharification is probably due to the much higher levels of lignin and, from other studies, likely high levels of silica. 
Therefore, if highly polluting chemical pre-treatments and multi-step biorefining processes are to be avoided, rice husk may need to be improved through selective breeding strategies, although more careful control of pre-treatment may be sufficient to reduce the levels of fermentation inhibitors, e.g. through steam explosion-induced volatilisation. For rice straw, pre-treating at severities of between 3.65 and 4.25 would give a glucose yield of between 37.5 and 40% (w/DW, dry weight of the substrate) close to the theoretical yield of 44.1% w/DW, and an insignificant yield of total inhibitors.
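The figures quoted above can be put in context with a short calculation. The sketch below expresses the reported glucose yields as a fraction of the stated theoretical yield, and shows how a pre-treatment severity value is commonly computed from time and temperature. The abstract does not state which severity definition was used, so the widely used form log10(t * exp((T - 100) / 14.75)), with t in minutes and T in degrees Celsius, is an assumption here, as are the function names and the example time and temperature.

```python
import math


def severity_factor(minutes, temp_c, ref_temp_c=100.0, omega=14.75):
    """Commonly used severity factor: log10(t * exp((T - T_ref) / omega)).

    Assumed definition; the abstract does not give the formula it used.
    """
    if minutes <= 0:
        raise ValueError("treatment time must be positive")
    return math.log10(minutes * math.exp((temp_c - ref_temp_c) / omega))


def fraction_of_theoretical(observed_yield, theoretical_yield):
    """Observed yield as a fraction of the theoretical maximum (same units for both)."""
    if theoretical_yield <= 0:
        raise ValueError("theoretical yield must be positive")
    return observed_yield / theoretical_yield


if __name__ == "__main__":
    # Yields from the abstract: 37.5-40% w/DW against a theoretical 44.1% w/DW.
    low = fraction_of_theoretical(37.5, 44.1)
    high = fraction_of_theoretical(40.0, 44.1)
    print(f"{low:.1%} to {high:.1%} of theoretical glucose yield")

    # Hypothetical conditions (not from the abstract): 10 minutes at 180 degrees C.
    print(round(severity_factor(10, 180), 2))
```

On these numbers the reported rice straw yields correspond to roughly 85 to 91% of the theoretical glucose yield.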