Results 1 - 20 of 96
1.
Nat Protoc ; 18(1): 208-238, 2023 01.
Article in English | MEDLINE | ID: mdl-36376589

ABSTRACT

Uncultivated Bacteria and Archaea account for the vast majority of species on Earth, but obtaining their genomes directly from the environment, using shotgun sequencing, has only become possible recently. To realize the hope of capturing Earth's microbial genetic complement and to facilitate the investigation of the functional roles of specific lineages in a given ecosystem, technologies that accelerate the recovery of high-quality genomes are necessary. We present a series of analysis steps and data products for the extraction of high-quality metagenome-assembled genomes (MAGs) from microbiomes using the U.S. Department of Energy Systems Biology Knowledgebase (KBase) platform ( http://www.kbase.us/ ). Overall, these steps take about a day to obtain extracted genomes when starting from smaller environmental shotgun read libraries, or up to about a week from larger libraries. In KBase, the process is end-to-end, allowing a user to go from the initial sequencing reads all the way through to MAGs, which can then be analyzed with other KBase capabilities such as phylogenetic placement, functional assignment, metabolic modeling, pangenome functional profiling, RNA-Seq and others. While portions of such capabilities are available individually from other resources, the combination of the intuitive usability, data interoperability and integration of tools in a freely available computational resource makes KBase a powerful platform for obtaining MAGs from microbiomes. While this workflow offers tools for each of the key steps in the genome extraction process, it also provides a scaffold that can be easily extended with additional MAG recovery and analysis tools, via the KBase software development kit (SDK).


Subject(s)
Metagenome, Microbiota, Phylogeny, Bacterial Genome, Microbiota/genetics, Bacteria/genetics, Metagenomics
3.
ACS Chem Biol ; 14(12): 2867-2875, 2019 12 20.
Article in English | MEDLINE | ID: mdl-31693336

ABSTRACT

Elucidating the interaction networks associated with secondary metabolite production in microorganisms is an ongoing challenge made all the more daunting by the rate at which DNA sequencing technology reveals new genes and potential pathways. Developing the culturing methods, expression conditions, and genetic systems needed for validating pathways in newly discovered microorganisms is often not possible. Therefore, new tools and techniques are needed for defining complex metabolic pathways. Here, we describe an in vitro computationally assisted pathway description approach that employs bioinformatic searches of genome databases, protein structural modeling, and protein-ligand-docking simulations to predict the gene products most likely to be involved in a particular secondary metabolite production pathway. This information is then used to direct in vitro reconstructions of the pathway and subsequent confirmation of pathway activity using crude enzyme preparations. As a test system, we elucidated the pathway for biosynthesis of indole-3-acetic acid (IAA) in the plant-associated microbe Pantoea sp. YR343. This organism is capable of metabolizing tryptophan into the plant phytohormone IAA. BLAST analyses identified a likely three-step pathway involving an amino transferase, an indole pyruvate decarboxylase, and a dehydrogenase. However, multiple candidate enzymes were identified at each step, resulting in a large number of potential pathway reconstructions (32 different enzyme combinations). Our approach shows the effectiveness of crude extracts to rapidly elucidate enzymes leading to functional pathways. Results are compared to affinity purified enzymes for select combinations and found to yield similar relative activities. Further, in vitro testing of the pathway reconstructions revealed the "underground" nature of IAA metabolism in Pantoea sp. YR343 and the various mechanisms used to produce IAA. 
Importantly, our experiments illustrate the scalable integration of computational tools and cell-free enzymatic reactions to identify and validate metabolic pathways in a broadly applicable manner.


Subject(s)
Computational Biology, Indoleacetic Acids/metabolism, Plant Growth Regulators/metabolism, Biosynthetic Pathways, Ligands, Molecular Docking Simulation, Pantoea/metabolism, Reproducibility of Results
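The abstract above reports 32 enzyme combinations for a three-step pathway with several candidates per step. As a minimal sketch of how such a count arises, the Python below enumerates one candidate per step. The per-step candidate counts (4, 2 and 4) and the labels `AT1`, `DC1`, `DH1`, etc. are hypothetical: the abstract gives only the total, and these values were chosen solely because they multiply to 32.

```python
from itertools import product

# Hypothetical candidate labels per step; the abstract reports only the total (32).
candidates = {
    "aminotransferase": ["AT1", "AT2", "AT3", "AT4"],
    "indolepyruvate_decarboxylase": ["DC1", "DC2"],
    "dehydrogenase": ["DH1", "DH2", "DH3", "DH4"],
}


def pathway_reconstructions(candidates_by_step):
    """Return every way of picking one candidate enzyme per pathway step."""
    steps = list(candidates_by_step)
    return [
        dict(zip(steps, combo))
        for combo in product(*candidates_by_step.values())
    ]


reconstructions = pathway_reconstructions(candidates)
print(len(reconstructions))  # 4 * 2 * 4 = 32
```

Each dictionary in `reconstructions` names one enzyme per step, so the list could serve as a worklist of in vitro reconstructions to test.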
4.
mSystems ; 3(5)2018.
Article in English | MEDLINE | ID: mdl-30320216

ABSTRACT

Natural products (NPs) isolated from bacteria have dramatically advanced human society, especially in medicine and agriculture. The rapidity and ease of genome sequencing have enabled bioinformatics-guided NP discovery and characterization. As a result, NP potential and diversity within a complex community, such as the microbiome of a plant, are rapidly expanding areas of scientific exploration. Here, we assess biosynthetic diversity in the Populus microbiome by analyzing both bacterial isolate genomes and metagenome samples. We utilize the fully sequenced genomes of isolates from the Populus root microbiome to characterize a subset of organisms for NP potential. The more than 3,400 individual gene clusters identified in 339 bacterial isolates, including 173 newly sequenced organisms, were diverse across NP types and distinct from known NP clusters. The ribosomally synthesized and posttranslationally modified peptides were both widespread and divergent from previously characterized molecules. Lactones and siderophores were prevalent in the genomes, suggesting a high level of communication and pressure to compete for resources. We then consider the overall bacterial diversity and NP variety of metagenome samples compared to the sequenced isolate collection and other plant microbiomes. The sequenced collection, curated to reflect the phylogenetic diversity of the Populus microbiome, also reflects the overall NP diversity trends seen in the metagenomic samples. In our study, only about 1% of all clusters from sequenced isolates were positively matched to a previously characterized gene cluster, suggesting a great opportunity for the discovery of novel NPs involved in communication and control in the Populus root microbiome.
IMPORTANCE: The plant root microbiome is one of the most diverse and abundant biological communities known. Plant-associated bacteria can have a profound effect on plant growth and development, and especially on protection from disease and environmental stress. These organisms are also known to be a rich source of antibiotic and antifungal drugs. In order to better understand the ways bacterial communities influence plant health, we evaluated the diversity and uniqueness of the natural product gene clusters in bacteria isolated from poplar trees. The complex molecule clusters are abundant, and the majority are unique, suggesting a great potential to discover new molecules that could not only affect plant health but also have applications as antibiotic agents.

5.
Curr Microbiol ; 75(1): 57-70, 2018 Jan.
Article in English | MEDLINE | ID: mdl-28865010

ABSTRACT

The robust fungus Aspergillus oryzae strain BCC7051 is of interest for the biotechnological production of lipid-derived products because it can accumulate a high amount of intracellular lipids using various sugars and agro-industrial substrates. Here, we report the genome sequence of the oleaginous A. oryzae BCC7051. The obtained reads were de novo assembled into 25 scaffolds spanning 38,550,958 bp, with 11,456 predicted protein-coding genes. By synteny mapping, a large rearrangement was found in two scaffolds of A. oryzae BCC7051 compared with the reference strain RIB40. The genetic relationship between BCC7051 and other strains of A. oryzae in terms of aflatoxin production was investigated, indicating that A. oryzae BCC7051 belongs to the group 2 non-aflatoxin-producing strains. Moreover, a comparative analysis of structural genes involved in lipid metabolism among oleaginous yeast and fungi revealed multiple isoforms of the metabolic enzymes responsible for fatty acid synthesis in BCC7051. Alternative routes of acetyl-CoA generation, an oleaginous feature, and the malate/citrate/pyruvate shuttle were also identified in this A. oryzae strain. The genome sequence generated in this work is a dedicated resource for expanding genome-wide studies of microbial lipids at the systems level and for developing a fungal platform for the production of diversified lipids with commercial relevance.


Subject(s)
Aspergillus oryzae/genetics, Aspergillus oryzae/metabolism, Fungal Genome, Lipids/biosynthesis, Fungal Proteins/genetics, Fungal Proteins/metabolism, Malates/metabolism, Synteny
6.
Genome Announc ; 4(5)2016 Sep 29.
Article in English | MEDLINE | ID: mdl-27688341

ABSTRACT

We and others have shown the utility of long sequence reads to improve genome assembly quality. In this study, we generated PacBio DNA sequence data to improve the assemblies of draft genomes for Clostridium thermocellum AD2, Clostridium thermocellum LQRI, and Pelosinus fermentans R7.

7.
Appl Environ Microbiol ; 82(18): 5698-708, 2016 09 15.
Article in English | MEDLINE | ID: mdl-27422831

ABSTRACT

Bacterial endophytes that colonize Populus trees contribute to nutrient acquisition, prime immunity responses, and directly or indirectly increase both above- and below-ground biomasses. Endophytes are embedded within plant material, so physical separation and isolation are difficult tasks. Application of culture-independent methods, such as metagenome or bacterial transcriptome sequencing, has been limited due to the predominance of DNA from the plant biomass. Here, we describe a modified differential and density gradient centrifugation-based protocol for the separation of endophytic bacteria from Populus roots. This protocol achieved substantial reduction in contaminating plant DNA, allowed enrichment of endophytic bacteria away from the plant material, and enabled single-cell genomics analysis. Four single-cell genomes were selected for whole-genome amplification based on their rarity in the microbiome (potentially uncultured taxa) as well as their inferred abilities to form associations with plants. Bioinformatics analyses, including assembly, contamination removal, and completeness estimation, were performed to obtain single-amplified genomes (SAGs) of organisms from the phyla Armatimonadetes, Verrucomicrobia, and Planctomycetes, which were unrepresented in our previous cultivation efforts. Comparative genomic analysis revealed unique characteristics of each SAG that could facilitate future cultivation efforts for these bacteria.
IMPORTANCE: Plant roots harbor a diverse collection of microbes that live within host tissues. To gain a comprehensive understanding of microbial adaptations to this endophytic lifestyle from strains that cannot be cultivated, it is necessary to separate bacterial cells from the predominance of plant tissue. This study provides a valuable approach for the separation and isolation of endophytic bacteria from plant root tissue. Isolated live bacteria provide material for microbiome sequencing, single-cell genomics, and analyses of genomes of uncultured bacteria to provide genomics information that will facilitate future cultivation attempts.


Subject(s)
Bacteria/classification, Bacteria/isolation & purification, Endophytes/classification, Endophytes/isolation & purification, Plant Roots/microbiology, Populus/microbiology, Bacteria/genetics, Density Gradient Centrifugation/methods, Computational Biology, Endophytes/genetics, Metagenomics, DNA Sequence Analysis, Single-Cell Analysis/methods
8.
Stand Genomic Sci ; 11: 33, 2016.
Article in English | MEDLINE | ID: mdl-27123157

ABSTRACT

Geobacillus sp. WCH70 was one of several thermophilic organisms isolated from hot composts in the Middleton, WI area. Comparison of 16S rRNA sequences showed that the strain may be a new species and is most closely related to G. galactosidasius and G. toebii. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2009 (CP001638). The genome of Geobacillus sp. WCH70 consists of one circular chromosome of 3,893,306 bp with an average G + C content of 43 %, and two circular plasmids of 33,899 and 10,287 bp with an average G + C content of 40 %. Among sequenced organisms, Geobacillus sp. WCH70 shares the highest Average Nucleotide Identity (86 %) with G. thermoglucosidasius strains, as well as a similar genome organization. Geobacillus sp. WCH70 appears to be a highly adaptable organism, with an exceptionally high number (125) of annotated transposons in the genome. The organism also possesses four predicted restriction-modification systems not found in other Geobacillus species.
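The G + C content reported for the chromosome and plasmids above is the fraction of G and C bases among the unambiguous bases of a sequence. The Python below is a minimal sketch of that calculation, not the pipeline used by the authors; the function names `gc_content` and `pooled_gc` are illustrative, and ignoring ambiguous bases such as N is an assumption.

```python
def gc_content(seq):
    """Return the G + C fraction of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N, is ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / (gc + at)


def pooled_gc(sequences):
    """Length-weighted G + C fraction over several replicons."""
    return gc_content("".join(sequences))


print(gc_content("GGCCAATT"))  # 4 of 8 bases are G or C
```

`pooled_gc` gives a single average for several replicons, for example two plasmids, weighted by their lengths rather than averaging the two percentages.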

10.
Stand Genomic Sci ; 10: 81, 2015.
Article in English | MEDLINE | ID: mdl-26500717

ABSTRACT

Geobacillus sp. Y412MC52 was isolated from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2011 (CP002835). Based on 16S rRNA genes and average nucleotide identity, Geobacillus sp. Y412MC52 and the related Geobacillus sp. Y412MC61 appear to be members of a new species of Geobacillus. The genome of Geobacillus sp. Y412MC52 consists of one circular chromosome of 3,628,883 bp with an average G + C content of 52 % and one circular plasmid of 45,057 bp with an average G + C content of 45 %. Y412MC52 possesses arabinan, arabinoglucuronoxylan, and aromatic acid degradation clusters for degradation of hemicellulose from biomass. Transport and utilization clusters are also present for other carbohydrates including starch, cellobiose, and α- and β-galactooligosaccharides.

11.
Stand Genomic Sci ; 10: 73, 2015.
Article in English | MEDLINE | ID: mdl-26442136

ABSTRACT

Geobacillus thermoglucosidasius C56-YS93 was one of several thermophilic organisms isolated from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. Comparison of 16S rRNA sequences confirmed the classification of the strain as G. thermoglucosidasius. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2011 (CP002835). The genome of G. thermoglucosidasius C56-YS93 consists of one circular chromosome of 3,893,306 bp and two circular plasmids of 80,849 and 19,638 bp, with an average G + C content of 43.93 %. G. thermoglucosidasius C56-YS93 possesses a xylan degradation cluster not found in the other sequenced G. thermoglucosidasius strains. This cluster appears to be related to the xylan degradation cluster found in G. stearothermophilus. G. thermoglucosidasius C56-YS93 possesses two plasmids not found in the other two strains. One plasmid contains a novel gene cluster coding for proteins involved in proline degradation and metabolism; the other contains a collection of mostly hypothetical proteins.

12.
BMC Syst Biol ; 9: 30, 2015 Jun 26.
Article in English | MEDLINE | ID: mdl-26111937

ABSTRACT

BACKGROUND: Thermoanaerobacterium saccharolyticum is a hemicellulose-degrading thermophilic anaerobe that was previously engineered to produce ethanol at high yield. A major project was undertaken to develop this organism into an industrial biocatalyst, but the lack of genome information and resources were recognized early on as a key limitation. RESULTS: Here we present a set of genome-scale resources to enable the systems level investigation and development of this potentially important industrial organism. Resources include a complete genome sequence for strain JW/SL-YS485, a genome-scale reconstruction of metabolism, tiled microarray data showing transcription units, mRNA expression data from 71 different growth conditions or timepoints and GC/MS-based metabolite analysis data from 42 different conditions or timepoints. Growth conditions include hemicellulose hydrolysate, the inhibitors HMF, furfural, diamide, and ethanol, as well as high levels of cellulose, xylose, cellobiose or maltodextrin. The genome consists of a 2.7 Mbp chromosome and a 110 Kbp megaplasmid. An active prophage was also detected, and the expression levels of CRISPR genes were observed to increase in association with those of the phage. Hemicellulose hydrolysate elicited a response of carbohydrate transport and catabolism genes, as well as poorly characterized genes suggesting a redox challenge. In some conditions, a time series of combined transcription and metabolite measurements were made to allow careful study of microbial physiology under process conditions. As a demonstration of the potential utility of the metabolic reconstruction, the OptKnock algorithm was used to predict a set of gene knockouts that maximize growth-coupled ethanol production. The predictions validated intuitive strain designs and matched previous experimental results. CONCLUSION: These data will be a useful asset for efforts to develop T. saccharolyticum for efficient industrial production of biofuels. 
The resources presented herein may also be useful on a comparative basis for development of other lignocellulose degrading microbes, such as Clostridium thermocellum.


Subject(s)
Bacterial Genome/genetics, Genomics/methods, Thermoanaerobacterium/genetics, Base Sequence, Biofuels/microbiology, Furaldehyde/analogs & derivatives, Furaldehyde/pharmacology, Industry, Biological Models, Molecular Sequence Data, Oligonucleotide Array Sequence Analysis, Polysaccharides/pharmacology, Thermoanaerobacterium/drug effects, Thermoanaerobacterium/growth & development, Thermoanaerobacterium/metabolism
13.
Proc Natl Acad Sci U S A ; 112(14): 4251-6, 2015 Apr 07.
Article in English | MEDLINE | ID: mdl-25831533

ABSTRACT

Understanding the evolution of the free-living, cyanobacterial, diazotroph Trichodesmium is of great importance because of its critical role in oceanic biogeochemistry and primary production. Unlike the other >150 available genomes of free-living cyanobacteria, only 63.8% of the Trichodesmium erythraeum (strain IMS101) genome is predicted to encode protein, which is 20-25% less than the average for other cyanobacteria and nonpathogenic, free-living bacteria. We use distinctive isolates and metagenomic data to show that low coding density observed in IMS101 is a common feature of the Trichodesmium genus, both in culture and in situ. Transcriptome analysis indicates that 86% of the noncoding space is expressed, although the function of these transcripts is unclear. The density of noncoding, possible regulatory elements predicted in Trichodesmium, when normalized per intergenic kilobase, was comparable and twofold higher than that found in the gene-dense genomes of the sympatric cyanobacterial genera Synechococcus and Prochlorococcus, respectively. Conserved Trichodesmium noncoding RNA secondary structures were predicted between most culture and metagenomic sequences, lending support to the structural conservation. Conservation of these intergenic regions in spatiotemporally separated Trichodesmium populations suggests possible genus-wide selection for their maintenance. These large intergenic spacers may have developed during intervals of strong genetic drift caused by periodic blooms of a subset of genotypes, which may have reduced effective population size. Our data suggest that transposition of selfish DNA, low effective population size, and high-fidelity replication allowed the unusual "inflation" of noncoding sequence observed in Trichodesmium despite its oligotrophic lifestyle.


Subject(s)
Cyanobacteria/genetics, Cyanobacteria/physiology, Bacterial DNA/chemistry, Bacterial Proteins/chemistry, Carbon/chemistry, Computational Biology, Bacterial DNA/genetics, Intergenic DNA/genetics, Ecosystem, Bacterial Gene Expression Regulation, Bacterial Genes, Genome, Genomics, Molecular Sequence Data, Nitrogen/chemistry, Nitrogen Fixation/genetics, Nucleic Acid Conformation, Oceans and Seas, Prochlorococcus/genetics, RNA/chemistry, RNA/genetics, Signal Transduction, Synechococcus/genetics, Transposases/metabolism
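The coding density discussed in the Trichodesmium abstract above (63.8% of the genome predicted to encode protein) is the share of genome positions covered by at least one protein-coding gene. The Python below sketches one way to compute it from gene coordinates. The function name `coding_density`, the 1-based inclusive coordinate convention and the example coordinates are assumptions for illustration, not the authors' method.

```python
def coding_density(genes, genome_length):
    """Fraction of genome positions covered by at least one gene.

    genes: iterable of (start, end) pairs, 1-based and inclusive
    (an assumed convention). Overlapping genes are merged first so
    shared positions are counted once.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(genes):
        if current_end is None or start > current_end:
            # Close the previous merged interval before starting a new one.
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return covered / genome_length


# Hypothetical coordinates: two overlapping genes and one separate gene.
print(coding_density([(1, 100), (51, 150), (301, 400)], 1000))  # 250 / 1000
```

Merging overlapping intervals first matters in gene-dense genomes, where summing individual gene lengths would count shared positions twice.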
14.
Genome Announc ; 3(2)2015 Mar 05.
Article in English | MEDLINE | ID: mdl-25744998

ABSTRACT

The Opitutaceae bacterium strain TAV5, a member of the phylum Verrucomicrobia, was isolated from the wood-feeding termite hindgut. We report here its complete genome sequence, which contains a chromosome and a plasmid of 7,317,842 bp and 99,831 bp, respectively. The genomic analysis reveals genes for methylotrophy, lignocellulose degradation, and ammonia and sulfate assimilation.

15.
Genome Announc ; 3(2)2015 Mar 12.
Article in English | MEDLINE | ID: mdl-25767232

ABSTRACT

Desulfovibrio carbinoliphilus subsp. oakridgensis FW-101-2B is an anaerobic, organic acid/alcohol-oxidizing, sulfate-reducing δ-proteobacterium. FW-101-2B was isolated from contaminated groundwater at The Field Research Center at Oak Ridge National Lab after in situ stimulation for heavy metal-reducing conditions. The genome will help elucidate the metabolic potential of sulfate-reducing bacteria during uranium reduction.

16.
Stand Genomic Sci ; 9(3): 562-73, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-25197444

ABSTRACT

Anabaena variabilis ATCC 29413 is a filamentous, heterocyst-forming cyanobacterium that has served as a model organism, with an extensive literature extending over 40 years. The strain has three distinct nitrogenases that function under different environmental conditions and is capable of photoautotrophic growth in the light and true heterotrophic growth in the dark using fructose as both carbon and energy source. While this strain was first isolated in 1964 in Mississippi and named Anabaena flos-aquae MSU A-37, it clusters phylogenetically with cyanobacteria of the genus Nostoc. The strain is a moderate thermophile, growing well at approximately 40 °C. Here we provide some additional characteristics of the strain, and an analysis of the complete genome sequence.

17.
Bioinformatics ; 30(19): 2709-16, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24930142

ABSTRACT

MOTIVATION: To assess the potential of different types of sequence data combined with de novo and hybrid assembly approaches to improve existing draft genome sequences. RESULTS: Illumina, 454 and PacBio sequencing technologies were used to generate de novo and hybrid genome assemblies for four different bacteria, which were assessed for quality using summary statistics (e.g. number of contigs, N50) and in silico evaluation tools. Differences in predictions of multiple copies of rDNA operons for each respective bacterium were evaluated by PCR and Sanger sequencing, and then the validated results were applied as an additional criterion to rank assemblies. In general, assemblies using longer PacBio reads were better able to resolve repetitive regions. In this study, the combination of Illumina and PacBio sequence data assembled through the ALLPATHS-LG algorithm gave the best summary statistics and most accurate rDNA operon number predictions. This study will aid others looking to improve existing draft genome assemblies. AVAILABILITY AND IMPLEMENTATION: All assembly tools except CLC Genomics Workbench are freely available under GNU General Public License. CONTACT: brownsd@ornl.gov SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods, Genomics/methods, DNA Sequence Analysis/methods, Algorithms, Base Sequence, Contig Mapping, Bacterial DNA/analysis, Ribosomal DNA/chemistry, Reproducibility of Results
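The assembly comparison above ranks assemblies partly by summary statistics such as the number of contigs and N50. The Python below is a minimal sketch of the usual N50 definition: the length of the contig at which the running total of contig lengths, sorted from longest to shortest, first reaches half of the assembly size. The function name `n50` and the example lengths are illustrative and not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the cumulative length first reaches half of the total.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Illustrative lengths only: total is 32, and 8 + 8 reaches half of it.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # 8
```

A higher N50 with fewer contigs generally indicates a more contiguous assembly, though it says nothing about correctness, which is why the study also used PCR and Sanger validation of rDNA operon counts.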
18.
Biotechnol Biofuels ; 7: 40, 2014.
Article in English | MEDLINE | ID: mdl-24655715

ABSTRACT

BACKGROUND: Clostridium autoethanogenum strain JA1-1 (DSM 10061) is an acetogen capable of fermenting CO, CO2 and H2 (e.g. from syngas or waste gases) into biofuel ethanol and commodity chemicals such as 2,3-butanediol. A draft genome sequence consisting of 100 contigs has been published. RESULTS: A closed, high-quality genome sequence for C. autoethanogenum DSM10061 was generated using only the latest single-molecule DNA sequencing technology and without the need for manual finishing. It is assigned to the most complex genome classification based upon genome features such as repeats, prophage, and nine copies of the rRNA gene operons. It has a low G + C content of 31.1%. Illumina, 454, and Illumina/454 hybrid assemblies were generated and then compared to the draft and PacBio assemblies using summary statistics, CGAL, QUAST and REAPR bioinformatics tools and comparative genomic approaches. Assemblies based upon shorter read DNA technologies were confounded by the large number of repeats and their size, which in the case of the rRNA gene operons were ~5 kb. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) systems among biotechnologically relevant Clostridia were classified and related to plasmid content and prophages. Potential associations between plasmid content and CRISPR systems may have implications for historical industrial scale Acetone-Butanol-Ethanol (ABE) fermentation failures and future large scale bacterial fermentations. While C. autoethanogenum contains an active CRISPR system, no such system is present in the closely related Clostridium ljungdahlii DSM 13528. A common prophage inserted into the Arg-tRNA shared between the strains suggests a common ancestor. However, C. ljungdahlii contains several additional putative prophages and it has more than double the amount of prophage DNA compared to C. autoethanogenum. 
Other differences include important metabolic genes for central metabolism (such as an additional hydrogenase and the absence of a phosphoenolpyruvate synthase) and substrate utilization pathways (mannose and aromatics utilization) that might explain phenotypic differences between C. autoethanogenum and C. ljungdahlii. CONCLUSIONS: Single molecule sequencing will be increasingly used to produce finished microbial genomes. The complete genome will facilitate comparative genomics and functional genomics and support future comparisons between Clostridia and studies that examine the evolution of plasmids, bacteriophage and CRISPR systems.

19.
Stand Genomic Sci ; 9: 20, 2014.
Article in English | MEDLINE | ID: mdl-25780509

ABSTRACT

BACKGROUND: More than 80% of the microbial genomes in GenBank are of 'draft' quality (12,553 draft vs. 2,679 finished, as of October, 2013). We have examined all the microbial DNA sequences available for complete, draft, and Sequence Read Archive genomes in GenBank as well as three other major public databases, and assigned quality scores for more than 30,000 prokaryotic genome sequences. RESULTS: Scores were assigned using four categories: the completeness of the assembly, the presence of full-length rRNA genes, tRNA composition and the presence of a set of 102 conserved genes in prokaryotes. Most (~88%) of the genomes had quality scores of 0.8 or better and can be safely used for standard comparative genomics analysis. We compared genomes across factors that may influence the score. We found that although sequencing depth coverage of over 100x did not ensure a better score, sequencing read length was a better indicator of sequencing quality. With few exceptions, most of the 30,000 genomes have nearly all the 102 essential genes. CONCLUSIONS: The score can be used to set thresholds for screening data when analyzing "all published genomes" and reference data is either not available or not applicable. The scores highlighted organisms for which commonly used tools do not perform well. This information can be used to improve tools and to serve a broad group of users as more diverse organisms are sequenced. Unexpectedly, the comparison of predicted tRNAs across 15,000 high quality genomes showed that anticodons beginning with an 'A' (codons ending with a 'U') are almost non-existent, with the exception of one arginine codon (CGU); this has been noted previously in the literature for a few genomes, but not with the depth found here.

20.
Biotechnol Biofuels ; 6(1): 179, 2013 Dec 02.
Article in English | MEDLINE | ID: mdl-24295562

ABSTRACT

BACKGROUND: The thermophilic anaerobe Clostridium thermocellum is a candidate consolidated bioprocessing (CBP) biocatalyst for cellulosic ethanol production. The aim of this study was to investigate C. thermocellum genes required to ferment biomass substrates and to conduct a robust comparison of DNA microarray and RNA sequencing (RNA-seq) analytical platforms. RESULTS: C. thermocellum ATCC 27405 fermentations were conducted with a 5 g/L solid substrate loading of either pretreated switchgrass or Populus. Quantitative saccharification and inductively coupled plasma emission spectroscopy (ICP-ES) for elemental analysis revealed composition differences between biomass substrates, which may have influenced growth and transcriptomic profiles. High quality RNA was prepared for C. thermocellum grown on solid substrates and transcriptome profiles were obtained for two time points during active growth (12 hours and 37 hours postinoculation). A comparison of two transcriptomic analytical techniques, microarray and RNA-seq, was performed and the data analyzed for statistical significance. Large expression differences for cellulosomal genes were not observed. We updated gene predictions for the strain and a small novel gene, Cthe_3383, with a putative AgrD peptide quorum sensing function was among the most highly expressed genes. RNA-seq data also supported different small regulatory RNA predictions over others. The DNA microarray gave a greater number (2,351) of significant genes relative to RNA-seq (280 genes when normalized by the kernel density mean of M component (KDMM) method) in an analysis of variance (ANOVA) testing method with a 5% false discovery rate (FDR). When a 2-fold difference in expression threshold was applied, 73 genes were significantly differentially expressed in common between the two techniques. 
Sulfate and phosphate uptake/utilization genes, along with genes for a putative efflux pump system were some of the most differentially regulated transcripts when profiles for C. thermocellum grown on either pretreated switchgrass or Populus were compared. CONCLUSIONS: Our results suggest that a high degree of agreement in differential gene expression measurements between transcriptomic platforms is possible, but choosing an appropriate normalization regime is essential.
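The abstract above selects genes at a 5% false discovery rate and then applies a 2-fold expression threshold. The Python below sketches that kind of two-stage filter. The abstract does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, and the function names, gene names, p-values and fold changes are all hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(results, fdr=0.05, min_fold=2.0):
    """Select genes passing both the FDR cutoff and the fold-change threshold.

    results: dict mapping gene name -> (p_value, fold_change), where
    fold_change is a positive expression ratio between two conditions.
    """
    genes = list(results)
    adjusted = benjamini_hochberg([results[g][0] for g in genes])
    cutoff = math.log2(min_fold)
    return {
        gene
        for gene, q in zip(genes, adjusted)
        if q <= fdr and abs(math.log2(results[gene][1])) >= cutoff
    }


# Hypothetical values for illustration only.
example = {"geneA": (0.001, 4.0), "geneB": (0.001, 1.2), "geneC": (0.9, 8.0)}
print(significant_genes(example))  # only geneA passes both filters
```

Taking the absolute log2 ratio treats 2-fold increases and 2-fold decreases symmetrically, which is one reasonable reading of a "2-fold difference" threshold.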
