Results 1 - 20 of 233
1.
Front Bioinform ; 2: 871256, 2022.
Article in English | MEDLINE | ID: mdl-36304316

ABSTRACT

We present a novel approach for rapidly identifying sequences that leverages the representational power of Deep Learning techniques and is applied to the analysis of microbiome data. The method involves creating a latent sequence space, training a convolutional neural network to rapidly identify sequences by mapping them into that space, and using the encoded latent space for denoising to correct sequencing errors. Using mock bacterial communities of known composition, we show that this approach achieves single nucleotide resolution, generating results for sequence identification and abundance estimation that match the best available microbiome algorithms in terms of accuracy while vastly increasing the speed of accurate processing. We further show the ability of this approach to support phenotypic prediction at the sample level on an experimental data set for which the ground truth for sequence identities and abundances is unknown, but the expected phenotypes of the samples are definitive. Moreover, this approach offers a potential solution for the analysis of data from other types of experiments that currently rely on computationally intensive sequence identification.

2.
BMC Bioinformatics ; 18(1): 345, 2017 Jul 19.
Article in English | MEDLINE | ID: mdl-28724412

ABSTRACT

BACKGROUND: Functional annotation of bacterial genomes is an obligatory and crucially important step of information processing from the genome sequences into cellular mechanisms. However, there is a lack of computational methods to evaluate the quality of functional assignments. RESULTS: We developed a genome-scale model that assigns Bayesian probability to each gene utilizing a known property of functional similarity between neighboring genes in bacteria. CONCLUSIONS: Our model clearly distinguished true annotation from random annotation with Bayesian annotation probability >0.95. Our model will provide a useful guide to quantitatively evaluate functional annotation methods and to detect gene sets with reliable annotations.


Subject(s)
Genomics/methods, Algorithms, Bayes Theorem, Clostridium thermocellum/genetics, Genetic Databases, Escherichia coli/genetics, Bacterial Genome
3.
Stand Genomic Sci ; 11(1): 70, 2016.
Article in English | MEDLINE | ID: mdl-27617060

ABSTRACT

Halorubrum lacusprofundi is an extreme halophile within the archaeal phylum Euryarchaeota. The type strain ACAM 34 was isolated from Deep Lake, Antarctica. H. lacusprofundi is of phylogenetic interest because it is distantly related to the haloarchaea that have previously been sequenced. It is also of interest because of its psychrotolerance. We report here the complete genome sequence of H. lacusprofundi type strain ACAM 34 and its annotation. This genome is part of a 2006 Joint Genome Institute Community Sequencing Program project to sequence genomes of diverse Archaea.

4.
Front Microbiol ; 7: 794, 2016.
Article in English | MEDLINE | ID: mdl-27303383

ABSTRACT

RNA-seq is being used increasingly for gene expression studies and it is revolutionizing the fields of genomics and transcriptomics. However, the field of RNA-seq analysis is still evolving. Therefore, we specifically designed this study to contain large numbers of reads and four biological replicates per condition so we could alter these parameters and assess their impact on differential expression results. Bacillus thuringiensis strains ATCC10792 and CT43 were grown in two Luria broth medium lots on four dates and transcriptomics data were generated using one lane of sequence output from an Illumina HiSeq2000 instrument for each of the 32 samples, which were then analyzed using DESeq2. Genome coverages across samples ranged from 87 to 465X with medium lots and culture dates identified as major variation sources. Significantly differentially expressed genes (5% FDR, two-fold change) were detected for cultures grown using different medium lots and between different dates. The highly differentially expressed iron acquisition and metabolism genes were a likely consequence of differing amounts of iron in the two media lots. Indeed, in this study RNA-seq was a tool for predictive biology since we hypothesized and confirmed the two LB medium lots had different iron contents (~two-fold difference). This study shows that the noise in data can be controlled and minimized with appropriate experimental design and by having the appropriate number of replicates and reads for the system being studied. We outline parameters for an efficient and cost-effective microbial transcriptomics study.
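The "5% FDR, two-fold change" criterion in the abstract above can be illustrated with a minimal sketch. It assumes per-gene raw p-values and log2 fold changes are already available (for example, exported from a DESeq2 run) and applies a Benjamini-Hochberg adjustment followed by both thresholds. The gene names and numbers are invented for illustration, and the exact filtering rule used in the study is not stated in the abstract.

```python
import math

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def significant_genes(results, fdr=0.05, min_fold=2.0):
    """results: list of (gene, log2_fold_change, raw_pvalue) tuples."""
    padj = benjamini_hochberg([p for _, _, p in results])
    threshold = math.log2(min_fold)  # two-fold change equals |log2FC| >= 1
    return [
        gene
        for (gene, lfc, _), q in zip(results, padj)
        if q <= fdr and abs(lfc) >= threshold
    ]

# Hypothetical toy input, not data from the study.
toy = [
    ("geneA", 2.5, 0.0001),
    ("geneB", 0.4, 0.0002),
    ("geneC", -1.8, 0.01),
    ("geneD", 1.2, 0.6),
]
print(significant_genes(toy))
```

In this toy input, geneB passes the FDR threshold but not the fold-change threshold, and geneD passes the fold-change threshold but not the FDR threshold, so only geneA and geneC are kept.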

5.
Stand Genomic Sci ; 11: 38, 2016.
Article in English | MEDLINE | ID: mdl-27274784

ABSTRACT

Thioalkalimicrobium cyclicum Sorokin et al. 2002 is a member of the family Piscirickettsiaceae in the order Thiotrichales. The γ-proteobacterium belongs to the colourless sulfur-oxidizing bacteria isolated from saline soda lakes with stable alkaline pH, such as Lake Mono (California) and Soap Lake (Washington State). Strain ALM 1(T) is characterized by its adaptation to life in the oxic/anoxic interface towards the less saline aerobic waters (mixolimnion) of the stable stratified alkaline salt lakes. Strain ALM 1(T) is the first representative of the genus Thioalkalimicrobium whose genome sequence has been deciphered and the fourth genome sequence of a type strain of the Piscirickettsiaceae to be published. The 1,932,455 bp long chromosome with its 1,684 protein-coding and 50 RNA genes was sequenced as part of the DOE Joint Genome Institute Community Sequencing Program (CSP) 2008.

7.
Appl Environ Microbiol ; 82(1): 375-83, 2016 01 01.
Article in English | MEDLINE | ID: mdl-26519390

ABSTRACT

The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees, resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.


Subject(s)
Genetic Variation, Bacterial Genome, Populus/microbiology, Pseudomonas/classification, Pseudomonas/genetics, Comparative Genomic Hybridization, Phylogeny, Plant Roots/microbiology, Pseudomonas/isolation & purification, Pseudomonas aeruginosa/genetics, Pseudomonas aeruginosa/isolation & purification, Pseudomonas fluorescens/classification, Pseudomonas fluorescens/genetics, Pseudomonas fluorescens/isolation & purification, Pseudomonas putida/genetics, Pseudomonas putida/isolation & purification, Rhizosphere, DNA Sequence Analysis
8.
Stand Genomic Sci ; 10: 81, 2015.
Article in English | MEDLINE | ID: mdl-26500717

ABSTRACT

Geobacillus sp. Y412MC52 was isolated from Obsidian Hot Spring, Yellowstone National Park, Montana, USA under permit from the National Park Service. The genome was sequenced, assembled, and annotated by the DOE Joint Genome Institute and deposited at the NCBI in December 2011 (CP002835). Based on 16S rRNA genes and average nucleotide identity, Geobacillus sp. Y412MC52 and the related Geobacillus sp. Y412MC61 appear to be members of a new species of Geobacillus. The genome of Geobacillus sp. Y412MC52 consists of one circular chromosome of 3,628,883 bp with an average G + C content of 52 % and one circular plasmid of 45,057 bp with an average G + C content of 45 %. Y412MC52 possesses arabinan, arabinoglucuronoxylan, and aromatic acid degradation clusters for degradation of hemicellulose from biomass. Transport and utilization clusters are also present for other carbohydrates including starch, cellobiose, and α- and β-galactooligosaccharides.

9.
Stand Genomic Sci ; 10: 55, 2015.
Article in English | MEDLINE | ID: mdl-26380642

ABSTRACT

Polycyclic aromatic hydrocarbons (PAH) are ubiquitous environmental pollutants and microbial biodegradation is an important means of remediation of PAH-contaminated soil. Delftia acidovorans Cs1-4 (formerly Delftia sp. Cs1-4) was isolated by using phenanthrene as the sole carbon source from PAH contaminated soil in Wisconsin. Its full genome sequence was determined to gain insights into the mechanisms underlying biodegradation of PAH. Three genomic libraries were constructed and sequenced: an Illumina GAii shotgun library (916,416,493 reads), a 454 Titanium standard library (770,171 reads) and one paired-end 454 library (average insert size of 8 kb, 508,092 reads). The initial assembly contained 40 contigs in two scaffolds. The 454 Titanium standard data and the 454 paired end data were assembled together and the consensus sequences were computationally shredded into 2 kb overlapping shreds. Illumina sequencing data was assembled, and the consensus sequence was computationally shredded into 1.5 kb overlapping shreds. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks. A total of 182 additional reactions were needed to close gaps and to raise the quality of the finished sequence. The final assembly is based on 253.3 Mb of 454 draft data (averaging 38.4 X coverage) and 590.2 Mb of Illumina draft data (averaging 89.4 X coverage). The genome of strain Cs1-4 consists of a single circular chromosome of 6,685,842 bp (66.7 %G+C) containing 6,028 predicted genes; 5,931 of these genes were protein-encoding and 4,425 gene products were assigned to a putative function. Genes encoding phenanthrene degradation were localized to a 232 kb genomic island (termed the phn island), which contained near its 3' end a bacteriophage P4-like integrase, an enzyme often associated with chromosomal integration of mobile genetic elements. Other biodegradation pathways reconstructed from the genome sequence included: benzoate (by the acetyl-CoA pathway), styrene, nicotinic acid (by the maleamate pathway) and the pesticides Dicamba and Fenitrothion. Determination of the complete genome sequence of D. acidovorans Cs1-4 has provided new insights into the microbial mechanisms of PAH biodegradation that may shape the process in the environment.
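The coverage figures quoted in the abstract above follow the usual relation: mean coverage is roughly the total sequenced bases divided by the genome length. The sketch below applies that relation to the numbers in the abstract. Dividing by the finished chromosome length is an assumption made here. It gives values slightly below the reported 38.4 X and 89.4 X, possibly because the reported figures were computed against a different reference length, such as a draft assembly.

```python
def mean_coverage(total_bases, genome_length):
    """Approximate mean sequencing depth as total bases / genome length."""
    return total_bases / genome_length

# Finished chromosome length reported in the abstract.
GENOME_BP = 6_685_842

# Draft data volumes reported in the abstract (Mb taken as 10^6 bases).
cov_454 = mean_coverage(253.3e6, GENOME_BP)
cov_illumina = mean_coverage(590.2e6, GENOME_BP)

print(f"454 coverage:      {cov_454:.1f}X")
print(f"Illumina coverage: {cov_illumina:.1f}X")
```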

10.
BMC Res Notes ; 8: 479, 2015 Sep 26.
Article in English | MEDLINE | ID: mdl-26409790

ABSTRACT

BACKGROUND: For decades there has been increasing interest in understanding the relationships between microbial communities and ecosystem functions. Current DNA sequencing technologies allow for the exploration of microbial communities in two principal ways: targeted rRNA gene surveys and shotgun metagenomics. For large study designs, it is often still prohibitively expensive to sequence metagenomes at both the breadth and depth necessary to statistically capture the true functional diversity of a community. Although rRNA gene surveys provide no direct evidence of function, they do provide a reasonable estimation of microbial diversity, while being a very cost-effective way to screen samples of interest for later shotgun metagenomic analyses. However, there is a great deal of 16S rRNA gene survey data currently available from diverse environments, and thus a need for tools to infer functional composition of environmental samples based on 16S rRNA gene survey data. RESULTS: We present a computational method called pangenome-based functional profiles (PanFP), which infers functional profiles of microbial communities from 16S rRNA gene survey data for Bacteria and Archaea. PanFP is based on pangenome reconstruction of a 16S rRNA gene operational taxonomic unit (OTU) from known genes and genomes pooled from the OTU's taxonomic lineage. From this lineage, we derive an OTU functional profile by weighting a pangenome's functional profile with the OTU's abundance observed in a given sample. We validated our method by comparing PanFP to the functional profiles obtained from the direct shotgun metagenomic measurement of 65 diverse communities via Spearman correlation coefficients. These correlations improved with increasing sequencing depth, within the range of 0.8-0.9 for the most deeply sequenced Human Microbiome Project mock community samples. PanFP is very similar in performance to another recently released tool, PICRUSt, for almost all of the survey data analysed here. However, our method is unique in that any OTU building method can be used, as opposed to being limited to closed-reference OTU picking strategies against specific reference sequence databases. CONCLUSIONS: We developed an automated computational method, which derives an inferred functional profile based on the 16S rRNA gene surveys of microbial communities. The inferred functional profile provides a cost-effective way to study complex ecosystems through predicted comparative functional metagenomes and metadata analysis. All PanFP source code and additional documentation are freely available online at GitHub ( https://github.com/srjun/PanFP ).
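The weighting step described in the abstract above (an OTU's pangenome functional profile scaled by that OTU's abundance in a sample) and the Spearman comparison against a measured profile can be sketched as follows. This is an illustrative toy under stated assumptions, not PanFP's actual code: the function names, the OTU and function labels, and all numbers are hypothetical, and summing the weighted OTU profiles into one community profile is an assumption about how the per-OTU profiles are combined.

```python
from collections import defaultdict

def community_profile(otu_abundance, pangenome_profiles):
    """Sum each OTU's pangenome functional profile weighted by its abundance."""
    profile = defaultdict(float)
    for otu, abundance in otu_abundance.items():
        for function, weight in pangenome_profiles.get(otu, {}).items():
            profile[function] += abundance * weight
    return dict(profile)

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical toy data: OTU counts in one sample and per-OTU profiles.
otu_abundance = {"OTU1": 10, "OTU2": 4}
pangenome_profiles = {
    "OTU1": {"F1": 1.0, "F2": 0.5},
    "OTU2": {"F2": 1.0, "F3": 2.0},
}
inferred = community_profile(otu_abundance, pangenome_profiles)

# Hypothetical shotgun-metagenome measurement of the same functions.
measured = {"F1": 12.0, "F2": 9.0, "F3": 7.0}
functions = sorted(measured)
rho = spearman([inferred[f] for f in functions], [measured[f] for f in functions])
print(inferred, rho)
```

Because Spearman correlation compares only rank order, the inferred and measured profiles can differ in scale and still correlate strongly, which is why it suits comparisons between predicted and directly sequenced profiles.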


Subject(s)
Algorithms, Bacteria/genetics, Metagenome/genetics, Metagenomics/methods, DNA Sequence Analysis, Nonparametric Statistics
11.
PLoS One ; 10(6): e0118285, 2015.
Article in English | MEDLINE | ID: mdl-26035711

ABSTRACT

Clostridium phytofermentans was isolated from forest soil and is distinguished by its capacity to directly ferment plant cell wall polysaccharides into ethanol as the primary product, suggesting that it possesses unusual catabolic pathways. The objective of the present study was to understand the molecular mechanisms of biomass conversion to ethanol in a single organism, Clostridium phytofermentans, by analyzing its complete genome and transcriptome during growth on plant carbohydrates. The saccharolytic versatility of C. phytofermentans is reflected in a diversity of genes encoding ATP-binding cassette sugar transporters and glycoside hydrolases, many of which may have been acquired through horizontal gene transfer. These genes are frequently organized as operons that may be controlled individually by the many transcriptional regulators identified in the genome. Preferential ethanol production may be due to high levels of expression of multiple ethanol dehydrogenases and additional pathways maximizing ethanol yield. The genome also encodes three different proteinaceous bacterial microcompartments with the capacity to compartmentalize pathways that divert fermentation intermediates to various products. These characteristics make C. phytofermentans an attractive resource for improving the efficiency and speed of biomass conversion to biofuels.


Subject(s)
Carbohydrate Metabolism/genetics, Clostridium/genetics, Clostridium/metabolism, Enzymes/metabolism, Bacterial Genome, Plants/metabolism, Alcohol Dehydrogenase/genetics, Alcohol Dehydrogenase/metabolism, Biofuels, Biological Transport, Enzymes/genetics, Ethanol/metabolism, Fermentation, Bacterial Gene Expression Regulation, Phylogeny, 16S Ribosomal RNA, Transcriptome
12.
Genome Announc ; 3(2)2015 Mar 12.
Article in English | MEDLINE | ID: mdl-25767232

ABSTRACT

Desulfovibrio carbinoliphilus subsp. oakridgensis FW-101-2B is an anaerobic, organic acid/alcohol-oxidizing, sulfate-reducing δ-proteobacterium. FW-101-2B was isolated from contaminated groundwater at The Field Research Center at Oak Ridge National Lab after in situ stimulation for heavy metal-reducing conditions. The genome will help elucidate the metabolic potential of sulfate-reducing bacteria during uranium reduction.

13.
Funct Integr Genomics ; 15(2): 141-61, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25722247

ABSTRACT

Since the first two complete bacterial genome sequences were published in 1995, the science of bacteria has dramatically changed. Using third-generation DNA sequencing, it is possible to completely sequence a bacterial genome in a few hours and identify some types of methylation sites along the genome as well. Sequencing of bacterial genomes is now a standard procedure, and the information from tens of thousands of bacterial genomes has had a major impact on our views of the bacterial world. In this review, we explore a series of questions to highlight some insights that comparative genomics has produced. To date, there are genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. However, the distribution is quite skewed towards a few phyla that contain model organisms. But the breadth is continuing to improve, with projects dedicated to filling in less characterized taxonomic groups. The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system provides bacteria with immunity against viruses, which outnumber bacteria by tenfold. How fast can we go? Second-generation sequencing has produced a large number of draft genomes (close to 90 % of bacterial genomes in GenBank are currently not complete); third-generation sequencing can potentially produce a finished genome in a few hours, and at the same time provide methylation sites along the entire chromosome. The diversity of bacterial communities is extensive as is evident from the genome sequences available from 50 different bacterial phyla and 11 different archaeal phyla. Genome sequencing can help in classifying an organism, and in the case where multiple genomes of the same species are available, it is possible to calculate the pan- and core genomes; comparison of more than 2000 Escherichia coli genomes finds an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. Why do we care about bacterial genome sequencing? There are many practical applications, such as genome-scale metabolic modeling, biosurveillance, bioforensics, and infectious disease epidemiology. In the near future, high-throughput sequencing of patient metagenomic samples could revolutionize medicine in terms of speed and accuracy of finding pathogens and knowing how to treat them.
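The pan- and core-genome calculation mentioned in the abstract above reduces, in its simplest form, to set operations over gene families: the pan-genome is the union of families seen in any genome, and the core genome is the intersection of families present in every genome. The sketch below assumes gene families have already been assigned to each genome. The strain names and family labels are hypothetical, and real analyses typically tolerate some missing families when defining the core.

```python
def pan_and_core(genomes):
    """genomes: mapping of genome name -> iterable of gene family IDs.

    Returns (pan, core) as sets: the union and the strict intersection.
    """
    families = [set(f) for f in genomes.values()]
    if not families:
        return set(), set()
    pan = set().union(*families)
    core = set.intersection(*families)
    return pan, core

# Hypothetical toy input with three strains.
toy_genomes = {
    "strain_A": ["fam1", "fam2", "fam3"],
    "strain_B": ["fam1", "fam2", "fam4"],
    "strain_C": ["fam1", "fam2", "fam5", "fam6"],
}
pan, core = pan_and_core(toy_genomes)
print(len(pan), sorted(core))
```

As more genomes are added, the core can only shrink or stay the same while the pan-genome can only grow or stay the same, which is consistent with the abstract's contrast between about 3100 core families and about 89,000 total families for E. coli.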


Subject(s)
Bacterial Genome, Bacteria/classification, Bacterial Proteins/genetics, Codon, Genetic Variation, Genome Size, Genomics, Metagenomics, Molecular Sequence Annotation, Phylogeny, DNA Sequence Analysis
14.
Metab Eng Commun ; 2: 23-29, 2015 Dec.
Article in English | MEDLINE | ID: mdl-34150505

ABSTRACT

A key tool for metabolic engineering is the ability to express heterologous genes. One obstacle to gene expression in non-model organisms, and especially in relatively uncharacterized bacteria, is the lack of well-characterized promoters. Here we test 17 promoter regions for their ability to drive expression of the reporter genes β-galactosidase (lacZ) and NADPH-alcohol dehydrogenase (adhB) in Clostridium thermocellum, an important bacterium for the production of cellulosic biofuels. Only three promoters have been commonly used for gene expression in C. thermocellum: gapDH, cbp and eno. Of the new promoters tested, 2638, 2926, 966 and 815 showed reliable expression. The 2638 promoter showed relatively higher activity when driving adhB (compared to lacZ), and the 815 promoter showed relatively higher activity when driving lacZ (compared to adhB).

15.
Stand Genomic Sci ; 9(3): 449-61, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-25197431

ABSTRACT

Granulicella tundricola strain MP5ACTX9(T) is a novel species of the genus Granulicella in subdivision 1 Acidobacteria. G. tundricola is a predominant member of soil bacterial communities, active at low temperatures and nutrient limiting conditions in Arctic alpine tundra. The organism is a cold-adapted acidophile and a versatile heterotroph that hydrolyzes a suite of sugars and complex polysaccharides. Genome analysis revealed metabolic versatility with genes involved in metabolism and transport of carbohydrates, including gene modules encoding for the carbohydrate-active enzyme (CAZy) families for the breakdown, utilization and biosynthesis of diverse structural and storage polysaccharides such as plant based carbon polymers. The genome of G. tundricola strain MP5ACTX9(T) consists of 4,309,151 bp of a circular chromosome and five mega plasmids with a total genome content of 5,503,984 bp. The genome comprises 4,705 protein-coding genes and 52 RNA genes.

16.
Stand Genomic Sci ; 9(3): 763-74, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-25197461

ABSTRACT

Burkholderia phymatum is a soil bacterium able to develop a nitrogen-fixing symbiosis with species of the legume genus Mimosa, and is frequently found associated specifically with Mimosa pudica. The type strain of the species, STM 815(T), was isolated from a root nodule in French Guiana in 2000. The strain is an aerobic, motile, non-spore forming, Gram-negative rod, and is a highly competitive strain for nodulation compared to other Mimosa symbionts, as it also nodulates a broad range of other legume genera and species. The 8,676,562 bp genome is composed of two chromosomes (3,479,187 and 2,697,374 bp), a megaplasmid (1,904,893 bp) and a plasmid hosting the symbiotic functions (595,108 bp).

17.
Stand Genomic Sci ; 9(3): 1105-17, 2014 Jun 15.
Article in English | MEDLINE | ID: mdl-25197486

ABSTRACT

Thermotoga thermarum Windberger et al. 1989 is a member of the genomically well characterized genus Thermotoga in the phylum 'Thermotogae'. T. thermarum is of interest for its origin from a continental solfataric spring vs. predominantly marine oil reservoirs of other members of the genus. The genome of strain LA3(T) also provides fresh data for the phylogenomic positioning of the (hyper-)thermophilic bacteria. T. thermarum strain LA3(T) is the fourth sequenced genome of a type strain from the genus Thermotoga, and the sixth in the family Thermotogaceae to be formally described in a publication. Phylogenetic analyses do not reveal significant discrepancies between the current classification of the group, 16S rRNA gene data and whole-genome sequences. Nevertheless, T. thermarum significantly differs from other Thermotoga species regarding its iron-sulfur cluster synthesis, as it contains only a minimal set of the necessary proteins. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,039,943 bp long chromosome with its 2,015 protein-coding and 51 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

18.
Genome Announc ; 2(4)2014 Aug 14.
Article in English | MEDLINE | ID: mdl-25125642

ABSTRACT

The benefits of using transgenic switchgrass with decreased levels of caffeic acid 3-O-methyltransferase (COMT) as biomass feedstock have been clearly demonstrated. However, its effect on the soil microbial community has not been assessed. Here we report metagenomic and metatranscriptomic analyses of root-associated soil from COMT switchgrass compared with nontransgenic counterparts.

19.
Genome Announc ; 2(2)2014 Apr 03.
Article in English | MEDLINE | ID: mdl-24699952

ABSTRACT

Bacteria belonging to the phylum Gemmatimonadetes are found in a wide variety of environments and are particularly abundant in soils. Here, we present the complete genome sequence and methylation pattern of the newly described Gemmatirosa kalamazoonensis type strain.

20.
Stand Genomic Sci ; 9: 10, 2014.
Article in English | MEDLINE | ID: mdl-25780503

ABSTRACT

Planctomyces brasiliensis Schlesner 1990 belongs to the order Planctomycetales, which differs from other bacterial taxa by several distinctive features such as internal cell compartmentalization, multiplication by forming buds directly from the spherical, ovoid or pear-shaped mother cell and a cell wall consisting of a proteinaceous layer rather than a peptidoglycan layer. The first strains of P. brasiliensis, including the type strain IFAM 1448(T), were isolated from a water sample of Lagoa Vermelha, a salt pit near Rio de Janeiro, Brazil. This is the second completed genome sequence of a type strain of the genus Planctomyces to be published and the sixth type strain genome sequence from the family Planctomycetaceae. The 6,006,602 bp long genome with its 4,811 protein-coding and 54 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project. Phylogenomic analyses indicate that the classification within the Planctomycetaceae is partially in conflict with its evolutionary history, as the positioning of Schlesneria renders the genus Planctomyces paraphyletic. A re-analysis of published fatty-acid measurements also does not support the current arrangement of the two genera. A quantitative comparison of phylogenetic and phenotypic aspects indicates that the three Planctomyces species with type strains available in public culture collections should be placed in separate genera. Thus the genera Gimesia, Planctopirus and Rubinisphaera are proposed to accommodate P. maris, P. limnophilus and P. brasiliensis, respectively. Pronounced differences between the reported G + C content of Gemmata obscuriglobus, Singulisphaera acidiphila and Zavarzinella formosa and G + C content calculated from their genome sequences call for emendation of their species descriptions. In addition to other features, the range of G + C values reported for the genera within the Planctomycetaceae indicates that the descriptions of the family and the order should be emended.
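The "G + C content calculated from their genome sequences" in the abstract above is, in its basic form, the percentage of G and C bases among all unambiguous bases. A minimal sketch follows. The function name and example sequence are hypothetical, and ignoring ambiguous bases such as N is an assumption rather than a rule stated in the abstract.

```python
def gc_content(sequence):
    """Return G + C content as a percentage of unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / acgt

# Hypothetical short sequence; a real genome would be read from a FASTA file.
print(round(gc_content("ATGCGCNNAT"), 1))
```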
