ABSTRACT
BACKGROUND: Rhodosporidium toruloides has emerged as a promising host for the production of bioproducts from lignocellulose, in part due to its ability to grow on lignocellulosic feedstocks, tolerate growth inhibitors, and co-utilize sugars and lignin-derived monomers. Ent-kaurene derivatives have a diverse range of potential applications from therapeutics to novel resin-based materials. RESULTS: The Design, Build, Test, and Learn (DBTL) approach was employed to engineer production of the non-native diterpene ent-kaurene in R. toruloides. Following expression of kaurene synthase (KS) in R. toruloides in the first DBTL cycle, a key limitation appeared to be the availability of the diterpene precursor, geranylgeranyl diphosphate (GGPP). Further DBTL cycles were carried out to select an optimal GGPP synthase and to balance its expression with KS, requiring two of the strongest promoters in R. toruloides, ANT (adenine nucleotide translocase) and TEF1 (translational elongation factor 1) to drive expression of the KS from Gibberella fujikuroi and a mutant version of an FPP synthase from Gallus gallus that produces GGPP. Scale-up of cultivation in a 2 L bioreactor using a corn stover hydrolysate resulted in an ent-kaurene titer of 1.4 g/L. CONCLUSION: This study builds upon previous work demonstrating the potential of R. toruloides as a robust and versatile host for the production of both mono- and sesquiterpenes, and is the first demonstration of the production of a non-native diterpene in this organism.
Subjects
Diterpenes, Kaurane/metabolism; Lignin/metabolism; Metabolic Engineering; Ustilaginales/metabolism; Animals; Plant Proteins/metabolism
ABSTRACT
BACKGROUND: First-generation bioethanol production utilizes the starch fraction of maize, which accounts for approximately 60% of the ash-free dry weight of the grain. Scale-up of this technology for fuels applications has resulted in a massive supply of distillers' grains with solubles (DGS) coproduct, which is rich in cellulosic polysaccharides and protein. It was surmised that DGS would be rapidly adopted for animal feed applications; however, this has not been observed, owing to inconsistency of the product stream and other logistics-related risks, especially toxigenic contaminants. Therefore, efficient valorization of DGS for production of petroleum-displacing products will significantly improve the techno-economic feasibility and net energy return of the established starch bioethanol process. In this study, we demonstrate 'one-pot' bioconversion of the protein and carbohydrate fractions of a DGS hydrolysate into C4 and C5 fusel alcohols through development of a microbial consortium incorporating two engineered Escherichia coli biocatalyst strains. RESULTS: The carbohydrate conversion strain E. coli BLF2 was constructed from the wild type E. coli strain B and showed improved capability to produce fusel alcohols from hexose and pentose sugars. Up to 12 g/L of fusel alcohols were produced from glucose or xylose synthetic medium by E. coli BLF2. The second strain, E. coli AY3, was dedicated to the utilization of proteins in the hydrolysates to produce mixed C4 and C5 alcohols. To maximize conversion yield by the co-culture, the inoculation ratio between the two strains was optimized. The co-culture with an inoculation ratio of 1:1.5 of E. coli BLF2 and AY3 achieved the highest total fusel alcohol titer of up to 10.3 g/L from DGS hydrolysates. The engineered E. coli co-culture system was shown to be similarly applicable for biofuel production from other biomass sources, including algae hydrolysates.
Furthermore, the co-culture population dynamics revealed by quantitative PCR analysis indicated that despite the growth rate difference between the two strains, co-culturing did not compromise the growth of either strain. The q-PCR analysis also demonstrated that fermentation with an appropriate initial inoculation ratio of the two strains was important to achieve a balanced co-culture population, which resulted in a higher total fuel titer. CONCLUSIONS: The efficient conversion of DGS hydrolysates into fusel alcohols will significantly improve the feasibility of the first-generation bioethanol process. The integrated carbohydrate and protein conversion platform developed here is applicable for the bioconversion of a variety of biomass feedstocks rich in sugars and proteins.
Subjects
Biofuels; Escherichia coli/metabolism; Plant Proteins/metabolism; Zea mays/metabolism; Biocatalysis; Carbohydrate Metabolism; Coculture Techniques; Edible Grain; Ethanol/metabolism; Fermentation; Microbial Consortia; Starch/metabolism; Xylose/metabolism
ABSTRACT
Recently, several endophytic fungi have been demonstrated to produce volatile organic compounds (VOCs) with properties similar to fossil fuels, called "mycodiesel," while growing on lignocellulosic plant and agricultural residues. The fact that endophytes are plant symbionts suggests that some may be able to produce lignocellulolytic enzymes, making them capable of both deconstructing lignocellulose and converting it into mycodiesel, two properties that indicate that these strains may be useful consolidated bioprocessing (CBP) hosts for biofuel production. In this study, four endophytes, Hypoxylon sp. CI4A, Hypoxylon sp. EC38, Hypoxylon sp. CO27, and Daldinia eschscholzii EC12, were selected and evaluated for their CBP potential. Analysis of their genomes indicates that these endophytes have a rich reservoir of biomass-deconstructing carbohydrate-active enzymes (CAZymes), which includes enzymes active on both polysaccharides and lignin, as well as terpene synthases (TPSs), enzymes that may produce fuel-like molecules, suggesting that they do indeed have CBP potential. GC-MS analyses of their VOCs when grown on four representative lignocellulosic feedstocks revealed that these endophytes produce a wide spectrum of hydrocarbons, the majority of which are monoterpenes and sesquiterpenes, including some known biofuel candidates. Analysis of their cellulase activity when grown under the same conditions revealed that these endophytes actively produce endoglucanases, exoglucanases, and β-glucosidases. The richness of CAZymes as well as terpene synthases identified in these four endophytic fungi suggests that they are great candidates to pursue for development into platform CBP organisms.
Subjects
Endophytes/enzymology; Fungal Proteins/metabolism; Genome, Fungal; Lignin/metabolism; Xylariales/enzymology; Alkyl and Aryl Transferases/genetics; Alkyl and Aryl Transferases/metabolism; Biofuels; Cellulase/genetics; Cellulase/metabolism; Cellulases/genetics; Cellulases/metabolism; Endophytes/classification; Endophytes/genetics; Fungal Proteins/genetics; Gene Expression; Glycoside Hydrolases/genetics; Glycoside Hydrolases/metabolism; Monoterpenes/metabolism; Phylogeny; Polysaccharides/metabolism; Sesquiterpenes/metabolism; Volatile Organic Compounds/metabolism; Xylariales/classification; Xylariales/genetics
ABSTRACT
Large-scale open microalgae cultivation has tremendous potential to make a significant contribution to replacing petroleum-based fuels with biofuels. Open algal cultures are unavoidably inhabited by a diversity of microbes that live on, influence, and shape the fate of these ecosystems. However, there is little understanding of the resilience and stability of the microbial communities in engineered semicontinuous algal systems. To evaluate the dynamics and resilience of the microbial communities in microalgae biofuel cultures, we conducted a longitudinal study on open systems to compare the temporal profiles of the microbiota from two multigenerational algal cohorts: one seeded with the microbiota from an in-house culture and the other exogenously seeded with a naturally occurring consortium of bacterial species harvested from the Pacific Ocean. From these month-long, semicontinuous open cultures of the microalga Nannochloropsis salina, we sequenced a time series of 46 samples, yielding 8804 operational taxonomic units derived from 9,160,076 high-quality partial 16S rRNA sequences. We provide quantitative evidence that the development of the microbial community is associated with microbiota ancestry. In addition, N. salina growth phases were linked with distinct changes in microbial phylotypes. Alteromonadales dominated the community in the N. salina exponential phase, whereas Alphaproteobacteria and Flavobacteriia were more prevalent in the stationary phase. We also demonstrate that the N. salina-associated microbial community in open cultures is diverse, resilient, and dynamic in response to environmental perturbations. This knowledge has general implications for developing and testing design principles of cultivated algal systems.
Subjects
Bacteria/classification; Microalgae/microbiology; Microbiota; Bacteria/genetics; Bacteria/isolation & purification; Biofuels; Biomass; DNA, Bacterial/genetics; Gene Library; Pacific Ocean; RNA, Ribosomal, 16S/genetics; Sequence Analysis, DNA; Stramenopiles/microbiology; Water Microbiology
ABSTRACT
To fully understand the interactions of a pathogen with its host, it is necessary to analyze the RNA transcripts of both the host and pathogen throughout the course of an infection. Although this can be accomplished relatively easily on the host side, the analysis of pathogen transcripts is complicated by the overwhelming amount of host RNA isolated from an infected sample. Even with the read depth provided by second-generation sequencing, it is extremely difficult to get enough pathogen reads for an effective gene-level analysis. In this study, we describe a novel capture-based technique and device that considerably enriches for pathogen transcripts from infected samples. This versatile method can, in principle, enrich for any pathogen in any infected sample. To test the technique's efficacy, we performed time course tissue culture infections using Rift Valley fever virus and Francisella tularensis. At each time point, RNA sequencing (RNA-Seq) was performed and the results of the treated samples were compared with untreated controls. The capture of pathogen transcripts, in all cases, led to more than an order of magnitude enrichment of pathogen reads, greatly increasing the number of genes hit, the coverage of those genes, and the depth at which each transcript was sequenced.
Subjects
Francisella tularensis/genetics; Francisella tularensis/physiology; Host-Pathogen Interactions; Rift Valley fever virus/genetics; Rift Valley fever virus/physiology; Sequence Analysis, RNA/methods; Cell Line; Gene Expression Profiling; Humans; Macrophages/microbiology; Macrophages/virology; Nucleic Acid Hybridization; RNA, Bacterial/genetics; RNA, Messenger/genetics; RNA, Viral/genetics
ABSTRACT
Phlebia radiata is a widespread white-rot basidiomycete fungus with significance in diverse biotechnological applications due to its ability to degrade aromatic compounds, xenobiotics, and lignin using an assortment of oxidative enzymes including laccase. In this work, a chemical screen with 480 conditions was conducted to identify chemical inducers of laccase expression in P. radiata. Among the chemicals tested, phenothiazines were observed to induce laccase activity in P. radiata, with promethazine being the strongest laccase inducer of the phenothiazine-derived compounds examined. Secretomes produced by promethazine-treated P. radiata exhibited increased laccase protein abundance, increased enzymatic activity, and an enhanced ability to degrade phenolic model lignin compounds. Transcriptomics analyses revealed that promethazine rapidly induced the expression of genes encoding lignin-degrading enzymes, including laccase and various oxidoreductases, showing that the increased laccase activity was due to increased laccase gene expression. Finally, the generality of promethazine as an inducer of laccases in fungi was demonstrated by showing that promethazine treatment also increased laccase activity in other relevant fungal species with known lignin conversion capabilities including Trametes versicolor and Pleurotus ostreatus.
ABSTRACT
Outdoor cultivation of microalgae has promising potential for renewable bioenergy, but there is a knowledge gap on the structure and function of the algal microbiome that coinhabits these ecosystems. Here, we describe the assembly mechanisms, taxonomic structure, and metabolic potential of bacteria associated with Microchloropsis salina cultivated outdoors. Open mesocosms were inoculated with algal cultures that were either free of bacteria or coincubated with one of two different strains of alga-associated bacteria and were sampled across five time points taken over multiple harvesting rounds of a 40-day experiment. Using quantitative analyses of metagenome-assembled genomes (MAGs), we tracked bacterial community compositional abundance and taxon-specific functional capacity involved in algal-bacterial interactions. One of the inoculated bacteria (Alteromonas sp.) persisted and dispersed across mesocosms, whereas the other inoculated strain (Phaeobacter gallaeciensis) disappeared by day 17 while a taxonomically similar but functionally distinct Phaeobacter strain became established. The inoculated strains were less abundant than 6 numerically dominant newly recruited taxa with functional capacities for mutualistic or saprophytic lifestyles, suggesting a generalist approach to persistence. This includes a highly abundant unclassified Rhodobacteraceae species that fluctuated between 25% and 77% of the total community. Overall, we did not find evidence for priority effects exerted by the distinct inoculum conditions; all mesocosms converged with similar microbial community compositions by the end of the experiment. 
Instead, we infer that the 15 total populations were retained due to host selection, as they showed high metabolic potential for algal-bacterial interactions such as recycling alga-produced carbon and nitrogen and production of vitamins and secondary metabolites associated with algal growth and senescence, including B vitamins, tropodithietic acid, and roseobacticides. IMPORTANCE Bacteria proliferate in nutrient-rich aquatic environments, including engineered algal biofuel systems, where they remineralize photosynthates, exchange secondary metabolites with algae, and can influence system output of biomass or oil. Despite this, knowledge on the microbial ecology of algal cultivation systems is lacking, and the subject is worthy of investigation. Here, we used metagenomics to characterize the metabolic capacities of the predominant bacteria associated with the biofuel-relevant microalga Microchloropsis salina and to predict testable metabolic interactions between algae and manipulated communities of bacteria. We identified a previously undescribed and uncultivated organism that dominated the community. Collectively, the microbial community may interact with the alga in cultivation via exchange of secondary metabolites which could affect algal success, which we demonstrate as a possible outcome from controlled experiments with metabolically analogous isolates. These findings address the scalability of lab-based algal-bacterial interactions through to cultivation systems and more broadly provide a framework for empirical testing of genome-based metabolic predictions.
Subjects
Biofuels; Microbiota; Biomass; Metagenome; Symbiosis
ABSTRACT
BACKGROUND: Mitigation of climate change requires that new routes for the production of fuels and chemicals be as oil-independent as possible. The microbial conversion of lignocellulosic feedstocks into terpene-based biofuels and bioproducts represents one such route. This work builds upon previous demonstrations that the single-celled carotenogenic basidiomycete, Rhodosporidium toruloides, is a promising host for the production of terpenes from lignocellulosic hydrolysates. RESULTS: This study focuses on the optimization of production of the monoterpene 1,8-cineole and the sesquiterpene α-bisabolene in R. toruloides. The α-bisabolene titer attained in R. toruloides was found to be proportional to the copy number of the bisabolene synthase (BIS) expression cassette, which in turn influenced the expression level of several native mevalonate pathway genes. The addition of more copies of BIS under a stronger promoter resulted in production of α-bisabolene at 2.2 g/L from lignocellulosic hydrolysate in a 2-L fermenter. Production of 1,8-cineole was found to be limited by availability of the precursor geranyl pyrophosphate (GPP), and expression of an appropriate GPP synthase increased the monoterpene titer fourfold to 143 mg/L at bench scale. Targeted mevalonate pathway metabolite analysis suggested that 3-hydroxy-3-methyl-glutaryl-coenzyme A reductase (HMGR), mevalonate kinase (MK), and phosphomevalonate kinase (PMK) may be pathway bottlenecks, and they were therefore selected as targets for overexpression. Expression of HMGR, MK, and PMK orthologs and growth in an optimized lignocellulosic hydrolysate medium increased the 1,8-cineole titer an additional tenfold to 1.4 g/L. Expression of the same mevalonate pathway genes did not have as large an impact on α-bisabolene production, although the final titer was higher at 2.6 g/L. Furthermore, mevalonate pathway intermediates accumulated in the mevalonate-engineered strains, suggesting room for further improvement.
CONCLUSIONS: This work brings R. toruloides closer to being able to make industrially relevant quantities of terpene from lignocellulosic biomass.
ABSTRACT
Chromosome 5 is one of the largest human chromosomes and contains numerous intrachromosomal duplications, yet it has one of the lowest gene densities. This is partially explained by numerous gene-poor regions that display a remarkable degree of noncoding conservation with non-mammalian vertebrates, suggesting that they are functionally constrained. In total, we compiled 177.7 million base pairs of highly accurate finished sequence containing 923 manually curated protein-coding genes including the protocadherin and interleukin gene families. We also completely sequenced versions of the large chromosome-5-specific internal duplications. These duplications are very recent evolutionary events and probably have a mechanistic role in human physiological variation, as deletions in these regions are the cause of debilitating disorders including spinal muscular atrophy.
Subjects
Chromosomes, Human, Pair 5/genetics; Sequence Analysis, DNA; Animals; Base Composition; Cadherins/genetics; Conserved Sequence/genetics; Gene Duplication; Genes/genetics; Genetic Diseases, Inborn/genetics; Genomics; Humans; Interleukins/genetics; Molecular Sequence Data; Muscular Atrophy, Spinal/genetics; Pan troglodytes/genetics; Physical Chromosome Mapping; Pseudogenes/genetics; Synteny/genetics; Vertebrates/genetics
ABSTRACT
Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high G + C content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in mendelian disorders, including familial hypercholesterolaemia and insulin-resistant diabetes. Nearly one-quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.
Subjects
Chromosomes, Human, Pair 19/genetics; Genes/genetics; Physical Chromosome Mapping; Alternative Splicing/genetics; Animals; Base Composition; Conserved Sequence/genetics; CpG Islands/genetics; Evolution, Molecular; Gene Duplication; Genetics, Medical; Humans; Mice; Molecular Sequence Data; Multigene Family/genetics; Pseudogenes/genetics; Sequence Analysis, DNA
ABSTRACT
Improving economic feasibility is necessary for algae-based processes to achieve commercial scale in the production of biofuels and bioproducts. A closed-loop system for fusel alcohol production from microalgae biomass with integrated nutrient recycling was developed, which enables the reuse of nitrogen and phosphorus for downstream application and thus reduces the operational requirement for external major nutrients. Mixed fusel alcohols, primarily isobutanol and isopentanol, were produced from Microchloropsis salina hydrolysates by an engineered E. coli co-culture. During the process, cellular nitrogen from microalgae biomass was converted into ammonium, whereas cellular phosphorus was liberated by an osmotic shock treatment. The formation of struvite from the liberated ammonium and phosphate, and the subsequent utilization of struvite to support M. salina cultivation, were demonstrated. The closed-loop system established here should help overcome one of the identified economic barriers to scale-up of microalgae production and enhance the sustainability of microalgae-based chemical commodity production.
Subjects
Alcohols/metabolism; Biomass; Microalgae/metabolism; Nutrients/metabolism; Stramenopiles/metabolism; Escherichia coli/metabolism; Microalgae/growth & development; Nitrogen/metabolism; Phosphorus/metabolism; Recycling; Stramenopiles/growth & development; Struvite/metabolism
ABSTRACT
Open microalgae cultures host a myriad of bacteria, creating a complex system of interacting species that influence algal growth and health. Many algal microbiota studies have been conducted to determine the relative importance of bacterial taxa to algal culture health and physiological states, but these studies have not characterized the interspecies relationships in the microbial communities. We subjected Nannochloropsis salina cultures to multiple chemical treatments (antibiotics and quorum sensing compounds) and obtained dense time-series data on changes to the microbial community using 16S gene amplicon metagenomic sequencing (21,029,577 reads for 23 samples) to measure microbial taxa-taxa abundance correlations. Short-term treatment with antibiotics resulted in substantially larger shifts in the microbiota structure compared to changes observed following treatment with signaling compounds and glucose. We also calculated operational taxonomic unit (OTU) associations and generated OTU correlation networks to provide an overview of possible bacterial OTU interactions. This analysis identified five major cohesive modules of microbiota with similar co-abundance profiles across different chemical treatments. The eigengenes of OTU modules were examined for correlation with different external treatment factors. This correlation-based analysis revealed that culture age (time) and treatment types have primary effects on forming network modules and shaping the community structure. Additional network analysis detected Alteromonadales and Alphaproteobacteria as having the highest centrality, suggesting these species are "keystone" OTUs in the microbial community. Furthermore, we illustrated that the chemical tropodithietic acid, which is secreted by several species in the Alphaproteobacteria taxon, is able to drastically change the structure of the microbiota within 3 h.
Taken together, these results provide valuable insights into the structure of the microbiota associated with N. salina cultures and how these structures change in response to chemical perturbations.
ABSTRACT
Harnessing the biotechnological potential of the large number of proteins available in sequence databases requires scalable methods for functional characterization. Here we propose a workflow to address this challenge by combining phylogenomic-guided DNA synthesis with high-throughput mass spectrometry and apply it to the systematic characterization of GH1 β-glucosidases, a family of enzymes necessary for biomass hydrolysis, an important step in the conversion of lignocellulosic feedstocks to fuels and chemicals. We synthesized and expressed 175 GH1s, selected from over 2000 candidate sequences to cover maximum sequence diversity. These enzymes were functionally characterized over a range of temperatures and pHs using nanostructure-initiator mass spectrometry (NIMS), generating over 10,000 data points. When combined with HPLC-based sugar profiling, we observed GH1 enzymes active over a broad temperature range and toward many different β-linked disaccharides. For some GH1s we also observed activity toward laminarin, a more complex oligosaccharide present as a major component of macroalgae. An area of particular interest was the identification of GH1 enzymes compatible with the ionic liquid 1-ethyl-3-methylimidazolium acetate ([C2mim][OAc]), a next-generation biomass pretreatment technology. We thus searched for GH1 enzymes active at 70 °C and 20% (v/v) [C2mim][OAc] over the course of a 24-h saccharification reaction. Using our unbiased approach, we identified multiple enzymes of different phylogenetic origin with such activities. Our approach of characterizing sequence diversity through targeted gene synthesis coupled to high-throughput screening technologies is a broadly applicable paradigm for a wide range of biological problems.
Subjects
Biotechnology/methods; Cellulases/analysis; Cellulases/genetics; Cellulases/metabolism; DNA/biosynthesis; Mass Spectrometry/methods; Phylogeny; Biomass; Chromatography, High Pressure Liquid/methods; Glucans/metabolism; High-Throughput Screening Assays/methods; Hydrogen-Ion Concentration; Hydrolysis; Imidazoles/chemistry; Ionic Liquids/chemistry; Nanostructures; Substrate Specificity; Temperature; Workflow
ABSTRACT
Francisella tularensis is a zoonotic intracellular pathogen that is capable of causing potentially fatal human infections. Like all successful bacterial pathogens, F. tularensis rapidly responds to changes in its environment during infection of host cells, and upon encountering different microenvironments within those cells. This ability to appropriately respond to the challenges of infection requires rapid and global shifts in gene expression patterns. In this study, we use a novel pathogen transcript enrichment strategy and whole transcriptome sequencing (RNA-Seq) to perform a detailed characterization of the rapid and global shifts in F. tularensis LVS gene expression during infection of murine macrophages. We performed differential gene expression analysis on all bacterial genes at two key stages of infection: phagosomal escape, and cytosolic replication. By comparing the F. tularensis transcriptome at these two stages of infection to that of the bacteria grown in culture, we were able to identify sets of genes that are differentially expressed over the course of infection. This analysis revealed the temporally dynamic expression of a number of known and putative transcriptional regulators and virulence factors, providing insight into their role during infection. In addition, we identified several F. tularensis genes that are significantly up-regulated during infection but had not been previously identified as virulence factors. These unknown genes may make attractive therapeutic or vaccine targets.
Subjects
Francisella tularensis/genetics; Francisella tularensis/physiology; Macrophages/microbiology; Sequence Analysis, RNA/methods; Transcriptome/genetics; Tularemia/genetics; Tularemia/microbiology; Animals; Down-Regulation/genetics; Francisella tularensis/pathogenicity; Gene Expression Profiling; Gene Expression Regulation, Bacterial; Genes, Bacterial/genetics; Genomic Islands/genetics; Humans; Macrophages/pathology; Mice; RNA, Messenger/genetics; RNA, Messenger/metabolism; Transcription, Genetic; Up-Regulation/genetics; Virulence/genetics; Virulence Factors/genetics; Virulence Factors/metabolism
ABSTRACT
The chicken genome draft sequence has provided a valuable resource for studies of an important agricultural and experimental model species and an important data set for comparative analysis. However, some of the most gene-rich segments are missing from chicken genome draft assemblies, limiting the analysis of a substantial number of genes and preventing a closer look at regions that are especially prone to syntenic rearrangements. To facilitate the functional and evolutionary analysis of one especially gene-rich, rearrangement-prone genomic region, we analyzed sequence from BAC clones spanning chicken microchromosome GGA28; as a complement we also analyzed a gene-sparse, stable region from GGA11. In these two regions we documented the conservation and lineage-specific gain and loss of protein-coding genes and precisely mapped the locations of 31 major human-chicken syntenic breakpoints. Altogether, we identified 72 lineage-specific genes, many of which are found at or near syntenic breaks, implicating evolutionary breakpoint regions as major sites of genetic innovation and change. Twenty-two of the 31 breakpoint regions have been reused repeatedly as rearrangement breakpoints in vertebrate evolution. Compared with stable GC-matched regions, GGA28 is highly enriched in CpG islands, as are break-prone intervals identified elsewhere in the chicken genome; evolutionary breakpoints are further enriched in GC content and CpG islands, highlighting a potential role for these features in genome instability. These data support the hypothesis that chromosome rearrangements have not occurred randomly over the course of vertebrate evolution but are focused preferentially within "fragile" regions with unusual DNA sequence characteristics.
Subjects
Chickens/genetics; Chromosomes; Evolution, Molecular; Animals; Chromosome Breakage; Chromosome Mapping; Gene Duplication; Gene Rearrangement; Genome, Human; Humans; Mice; Models, Genetic; Species Specificity; Synteny
ABSTRACT
Most genes are conserved in mammals, but certain gene families have acquired large numbers of lineage-specific loci through repeated rounds of gene duplication, divergence, and loss that have continued in each mammalian group. One such family encodes KRAB-zinc finger (KRAB-ZNF) proteins, which function as transcriptional repressors. One particular subfamily of KRAB-ZNF genes, including ZNF91, has expanded specifically in primates to comprise more than 110 loci in the human genome. Genes of the ZNF91 subfamily reside in large gene clusters near centromeric regions of human chromosomes 19 and 7 with smaller clusters or isolated copies in other locations. Phylogenetic analysis indicates that many of these genes arose before the split between the New and Old World monkeys, but the ZNF91 subfamily has continued to expand and diversify throughout the evolution of apes and humans. Paralogous loci are distinguished by divergence within their zinc finger arrays, indicating selection for proteins with different regulatory targets. In addition, many loci produce multiple alternatively spliced transcripts encoding proteins that may serve separate and perhaps even opposing regulatory roles because of the modular motif structure of KRAB-ZNF genes. The tissue-specific expression patterns and rapid structural divergence of ZNF91 subfamily genes suggest a role in determining gene expression differences between species and the evolution of novel primate traits.
Subjects
Evolution, Molecular; Primates/genetics; Zinc Fingers/genetics; Animals; Chromosomes, Human, Pair 19; Chromosomes, Human, Pair 7; DNA, Intergenic; Databases, Factual; Gene Dosage; Gene Duplication; Genome, Human; Humans; Kruppel-Like Transcription Factors/genetics; Multigene Family; Phylogeny; Physical Chromosome Mapping; Repressor Proteins/genetics; Sequence Analysis, DNA
ABSTRACT
Krüppel-type zinc finger (ZNF) motifs are prevalent components of transcription factor proteins in all eukaryotes. KRAB-ZNF proteins, in which a potent repressor domain is attached to a tandem array of DNA-binding zinc-finger motifs, are specific to tetrapod vertebrates and represent the largest class of ZNF proteins in mammals. To define the full repertoire of human KRAB-ZNF proteins, we searched the genome sequence for key motifs and then constructed and manually curated gene models incorporating those sequences. The resulting gene catalog contains 423 KRAB-ZNF protein-coding loci, yielding alternative transcripts that altogether predict at least 742 structurally distinct proteins. Active rounds of segmental duplication, involving single genes or larger regions and including both tandem and distributed duplication events, have driven the expansion of this mammalian gene family. Comparisons between the human genes and ZNF loci mined from the draft mouse, dog, and chimpanzee genomes not only identified 103 KRAB-ZNF genes that are conserved in mammals but also highlighted a substantial level of lineage-specific change; at least 136 KRAB-ZNF coding genes are primate specific, including many recent duplicates. KRAB-ZNF genes are widely expressed and clustered genes are typically not coregulated, indicating that paralogs have evolved to fill roles in many different biological processes. To facilitate further study, we have developed a Web-based public resource with access to gene models, sequences, and other data, including visualization tools to provide genomic context and interaction with other public data sets.