1.
PLoS Genet ; 19(3): e1010683, 2023 03.
Article in English | MEDLINE | ID: mdl-36972309

ABSTRACT

Prokaryotic evolution is influenced by the exchange of genetic information between species through a process referred to as recombination. The rate of recombination is a useful measure for the adaptive capacity of a prokaryotic population. We introduce Rhometa (https://github.com/sid-krish/Rhometa), a new software package to determine recombination rates from shotgun sequencing reads of metagenomes. It extends the composite likelihood approach for population recombination rate estimation and enables the analysis of modern short-read datasets. We evaluated Rhometa over a broad range of sequencing depths and complexities, using simulated and real experimental short-read data aligned to external reference genomes. Rhometa offers a comprehensive solution for determining population recombination rates from contemporary metagenomic read datasets. Rhometa extends the capabilities of conventional sequence-based composite likelihood population recombination rate estimators to include modern aligned metagenomic read datasets with diverse sequencing depths, thereby enabling the effective application of these techniques and their high accuracy rates to the field of metagenomics. Using simulated datasets, we show that our method performs well, with its accuracy improving with increasing numbers of genomes. Rhometa was validated on a real S. pneumoniae transformation experiment, where we show that it obtains plausible estimates of the rate of recombination. Finally, the program was also run on ocean surface water metagenomic datasets, through which we demonstrate that the program works on uncultured metagenomic datasets.


Subjects
Metagenome , Metagenomics , Metagenomics/methods , Metagenome/genetics , Sequence Analysis, DNA/methods , Likelihood Functions , High-Throughput Nucleotide Sequencing/methods , Software , Recombination, Genetic/genetics , Algorithms
2.
PLoS Comput Biol ; 17(10): e1008839, 2021 10.
Article in English | MEDLINE | ID: mdl-34634030

ABSTRACT

Hi-C is a sample preparation method that enables high-throughput sequencing to capture genome-wide spatial interactions between DNA molecules. The technique has been successfully applied to solve challenging problems such as 3D structural analysis of chromatin, scaffolding of large genome assemblies and more recently the accurate resolution of metagenome-assembled genomes (MAGs). Despite continued refinements, however, preparing a Hi-C library remains a complex laboratory protocol. To avoid costly failures and maximise the odds of successful outcomes, diligent quality management is recommended. Current wet-lab methods provide only a crude assay of Hi-C library quality, while key post-sequencing quality indicators used have, thus far, relied upon reference-based read-mapping. When a reference is accessible, this reliance introduces a concern for quality, where an incomplete or inexact reference skews the resulting quality indicators. We propose a new, reference-free approach that infers the total fraction of read-pairs that are a product of proximity ligation. This quantification of Hi-C library quality requires only a modest amount of sequencing data and is independent of other application-specific criteria. The algorithm builds upon the observation that proximity ligation events are likely to create k-mers that would not naturally occur in the sample. Our software tool (qc3C) is to our knowledge the first to implement a reference-free Hi-C QC tool, and also provides reference-based QC, enabling Hi-C to be more easily applied to non-model organisms and environmental samples. We characterise the accuracy of the new algorithm on simulated and real datasets and compare it to reference-based methods.


Subjects
Chromosome Mapping , Genomics , High-Throughput Nucleotide Sequencing , Quality Control , Software , Algorithms , Animals , Chromosome Mapping/methods , Chromosome Mapping/standards , DNA/chemistry , DNA/genetics , Gene Library , Genomics/methods , Genomics/standards , High-Throughput Nucleotide Sequencing/methods , High-Throughput Nucleotide Sequencing/standards , Humans , Turtles
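To make the k-mer observation in the qc3C abstract above concrete, here is a minimal Python sketch. It is not the qc3C algorithm. It assumes a single restriction enzyme whose site, after fill-in and re-ligation, yields a known junction sequence (for the 4-cutter site GATC, the junction GATCGATC), and it simply reports the fraction of reads that contain that junction. The function name, the enzyme choice, and the toy reads are all hypothetical.

```python
def junction_fraction(reads, junction="GATCGATC"):
    """Return the fraction of reads that contain the ligation junction.

    Assumption: the junction is the duplicated cut site produced by
    end fill-in and blunt ligation (GATC -> GATCGATC).
    """
    reads = list(reads)
    if not reads:
        return 0.0
    # Case-insensitive substring search for the junction sequence.
    hits = sum(1 for read in reads if junction in read.upper())
    return hits / len(reads)


# Hypothetical toy reads, for illustration only.
toy_reads = [
    "ACGTGATCGATCTTAG",  # contains the junction
    "ACGTGATCTTAGCCA",   # a single cut site, no junction
    "TTTTGGGGCCCCAAAA",  # no cut site at all
    "ccgatcgatcaa",      # junction in lower case
]
print(junction_fraction(toy_reads))
```

A naive count like this would also pick up junction-like sequences that occur naturally in the genomes, which is one reason a real estimator has to be more careful than a substring search.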
3.
Nat Methods ; 14(11): 1063-1071, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28967888

ABSTRACT

Methods for assembly, taxonomic profiling and binning are key to interpreting metagenome data, but a lack of consensus about benchmarking complicates performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on highly complex and realistic data sets, generated from ∼700 newly sequenced microorganisms and ∼600 novel viruses and plasmids and representing common experimental setups. Assembly and genome binning programs performed well for species represented by individual genomes but were substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below family level. Parameter settings markedly affected performance, underscoring their importance for program reproducibility. The CAMI results highlight current challenges but also provide a roadmap for software selection to answer specific research questions.


Subjects
Metagenomics , Software , Algorithms , Benchmarking , Sequence Analysis, DNA
4.
Plasmid ; 102: 56-61, 2019 03.
Article in English | MEDLINE | ID: mdl-30885788

ABSTRACT

IncHI2-ST1 plasmids play an important role in co-mobilizing genes conferring resistance to critically important antibiotics and heavy metals. Here we present the identification and analysis of IncHI2-ST1 plasmid pSPRC-Echo1, isolated from an Enterobacter hormaechei strain from a Sydney hospital, which predates other multi-drug resistant IncHI2-ST1 plasmids reported from Australia. Our time-resolved phylogeny analysis indicates pSPRC-Echo1 represents a new lineage of IncHI2-ST1 plasmids and shows how their diversification relates to the era of antibiotics.


Subjects
Phylogeny , Plasmids/genetics , Chromosome Mapping , DNA Transposable Elements/genetics , Time Factors
5.
Proc Natl Acad Sci U S A ; 110(42): 16939-44, 2013 Oct 15.
Article in English | MEDLINE | ID: mdl-24082106

ABSTRACT

Deep Lake in Antarctica is a globally isolated, hypersaline system that remains liquid at temperatures down to -20 °C. By analyzing metagenome data and genomes of four isolates we assessed genome variation and patterns of gene exchange to learn how the lake community evolved. The lake is completely and uniformly dominated by haloarchaea, comprising a hierarchically structured, low-complexity community that differs greatly to temperate and tropical hypersaline environments. The four Deep Lake isolates represent distinct genera (∼85% 16S rRNA gene similarity and ∼73% genome average nucleotide identity) with genomic characteristics indicative of niche adaptation, and collectively account for ∼72% of the cellular community. Network analysis revealed a remarkable level of intergenera gene exchange, including the sharing of long contiguous regions (up to 35 kb) of high identity (∼100%). Although the genomes of closely related Halobacterium, Haloquadratum, and Haloarcula (>90% average nucleotide identity) shared regions of high identity between species or strains, the four Deep Lake isolates were the only distantly related haloarchaea to share long high-identity regions. Moreover, the Deep Lake high-identity regions did not match to any other hypersaline environment metagenome data. The most abundant species, tADL, appears to play a central role in the exchange of insertion sequences, but not the exchange of high-identity regions. The genomic characteristics of the four haloarchaea are consistent with a lake ecosystem that sustains a high level of intergenera gene exchange while selecting for ecotypes that maintain sympatric speciation. The peculiarities of this polar system restrict which species can grow and provide a tempo and mode for accentuating gene exchange.


Subjects
Evolution, Molecular , Gene Transfer, Horizontal , Genome, Archaeal/physiology , Halobacteriaceae/genetics , Lakes/microbiology , Water Microbiology , Antarctic Regions , Metagenome , RNA, Archaeal/genetics , RNA, Ribosomal, 16S/genetics
6.
Proc Natl Acad Sci U S A ; 108(15): 6163-8, 2011 Apr 12.
Article in English | MEDLINE | ID: mdl-21444812

ABSTRACT

Viruses are abundant ubiquitous members of microbial communities and in the marine environment affect population structure and nutrient cycling by infecting and lysing primary producers. Antarctic lakes are microbially dominated ecosystems supporting truncated food webs in which viruses exert a major influence on the microbial loop. Here we report the discovery of a virophage (relative of the recently described Sputnik virophage) that preys on phycodnaviruses that infect prasinophytes (phototrophic algae). By performing metaproteogenomic analysis on samples from Organic Lake, a hypersaline meromictic lake in Antarctica, complete virophage and near-complete phycodnavirus genomes were obtained. By introducing the virophage as an additional predator of a predator-prey dynamic model we determined that the virophage stimulates secondary production through the microbial loop by reducing overall mortality of the host and increasing the frequency of blooms during polar summer light periods. Virophages remained abundant in the lake 2 y later and were represented by populations with a high level of major capsid protein sequence variation (25-100% identity). Virophage signatures were also found in neighboring Ace Lake (in abundance) and in two tropical lakes (hypersaline and fresh), an estuary, and an ocean upwelling site. These findings indicate that virophages regulate host-virus interactions, influence overall carbon flux in Organic Lake, and play previously unrecognized roles in diverse aquatic ecosystems.


Subjects
Fresh Water/virology , Genome, Viral/genetics , Metagenome/genetics , Phycodnaviridae/genetics , Phycodnaviridae/physiology , Antarctic Regions , Base Sequence , Genetic Variation , Molecular Sequence Data , Phycodnaviridae/classification , Phylogeny , Stramenopiles
7.
Environ Microbiol ; 15(5): 1318-33, 2013 May.
Article in English | MEDLINE | ID: mdl-23199136

ABSTRACT

We performed a metagenomic survey (6.6 Gbp of 454 sequence data) of Southern Ocean (SO) microorganisms during the austral summer of 2007-2008, examining the genomic signatures of communities across a latitudinal transect from Hobart (44°S) to the Mertz Glacier, Antarctica (67°S). Operational taxonomic units (OTUs) of the SAR11 and SAR116 clades and the cyanobacterial genera Prochlorococcus and Synechococcus were strongly overrepresented north of the Polar Front (PF). Conversely, OTUs of the Gammaproteobacterial Sulfur Oxidizer-EOSA-1 (GSO-EOSA-1) complex, the phyla Bacteroidetes and Verrucomicrobia and order Rhodobacterales were characteristic of waters south of the PF. Functions enriched south of the PF included a range of transporters, sulfur reduction and histidine degradation to glutamate, while branched-chain amino acid transport, nucleic acid biosynthesis and methionine salvage were overrepresented north of the PF. The taxonomic and functional characteristics suggested a shift of primary production from cyanobacteria in the north to eukaryotic phytoplankton in the south, and reflected the different trophic statuses of the two regions. The study provides a new level of understanding about SO microbial communities, describing the contrasting taxonomic and functional characteristics of microbial assemblages either side of the PF.


Subjects
Bacteria/classification , Bacteria/genetics , Biodiversity , Metagenomics , Seawater/microbiology , Water Microbiology , Amino Acids, Branched-Chain/genetics , Bacteria/metabolism , Cyanobacteria/classification , Cyanobacteria/genetics , Eukaryota/genetics , Eukaryota/metabolism , Eukaryota/physiology , Oceans and Seas , Phylogeny , RNA, Ribosomal, 16S/genetics , Seawater/chemistry
8.
Mol Syst Biol ; 8: 595, 2012 Jul 17.
Article in English | MEDLINE | ID: mdl-22806143

ABSTRACT

The ubiquitous SAR11 bacterial clade is the most abundant type of organism in the world's oceans, but the reasons for its success are not fully elucidated. We analysed 128 surface marine metagenomes, including 37 new Antarctic metagenomes. The large size of the data set enabled internal transcribed spacer (ITS) regions to be obtained from the Southern polar region, enabling the first global characterization of the distribution of SAR11, from waters spanning temperatures -2 to 30°C. Our data show a stable co-occurrence of phylotypes within both 'tropical' (>20°C) and 'polar' (<10°C) biomes, highlighting ecological niche differentiation between major SAR11 subgroups. All phylotypes display transitions in abundance that are strongly correlated with temperature and latitude. By assembling SAR11 genomes from Antarctic metagenome data, we identified specific genes, biases in gene functions and signatures of positive selection in the genomes of the polar SAR11: genomic signatures of adaptive radiation. Our data demonstrate the importance of adaptive radiation in the organism's ability to proliferate throughout the world's oceans, and describe genomic traits characteristic of different phylotypes in specific marine biomes.


Subjects
Alphaproteobacteria/genetics , Genome, Bacterial/radiation effects , Metagenome/radiation effects , Models, Biological , Seawater/microbiology , Antarctic Regions , Climate , Genome, Bacterial/genetics , Marine Biology , Metagenome/genetics , Phylogeny , Phylogeography , Sequence Alignment , Temperature
9.
Bioinformatics ; 27(17): 2431-2, 2011 Sep 01.
Article in English | MEDLINE | ID: mdl-21775307

ABSTRACT

SUMMARY: SHAP (simple high-throughput annotation pipeline) is a lightweight and scalable sequence annotation pipeline capable of supporting research efforts that generate or utilize large volumes of DNA sequence data. The software provides Grid capable analysis, relational storage and Web-based full-text searching of annotation results. Implemented in Java, SHAP recognizes the limited resources of many smaller research groups. AVAILABILITY: Source code is freely available under GPLv3 at https://sourceforge.net/projects/shap. CONTACT: matt.demaere@unsw.edu.au; r.cavicchioli@unsw.edu.au.


Subjects
Molecular Sequence Annotation/methods , Sequence Analysis, DNA , Software , Humans , Information Storage and Retrieval , Internet
10.
Proc Natl Acad Sci U S A ; 106(37): 15527-33, 2009 Sep 15.
Article in English | MEDLINE | ID: mdl-19805210

ABSTRACT

Many marine bacteria have evolved to grow optimally at either high (copiotrophic) or low (oligotrophic) nutrient concentrations, enabling different species to colonize distinct trophic habitats in the oceans. Here, we compare the genome sequences of two bacteria, Photobacterium angustum S14 and Sphingopyxis alaskensis RB2256, that serve as useful model organisms for copiotrophic and oligotrophic modes of life and specifically relate the genomic features to trophic strategy for these organisms and define their molecular mechanisms of adaptation. We developed a model for predicting trophic lifestyle from genome sequence data and tested >400,000 proteins representing >500 million nucleotides of sequence data from 126 genome sequences with metagenome data of whole environmental samples. When applied to available oceanic metagenome data (e.g., the Global Ocean Survey data) the model demonstrated that oligotrophs, and not the more readily isolatable copiotrophs, dominate the ocean's free-living microbial populations. Using our model, it is now possible to define the types of bacteria that specific ocean niches are capable of sustaining.


Subjects
Bacteria/growth & development , Bacteria/genetics , Genome, Bacterial , Ecosystem , Marine Biology , Models, Biological , Molecular Sequence Data , Photobacterium/genetics , Photobacterium/growth & development , Sphingomonadaceae/genetics , Sphingomonadaceae/growth & development
11.
PLoS One ; 17(6): e0270372, 2022.
Article in English | MEDLINE | ID: mdl-35749534

ABSTRACT

Intensive farming practices can increase exposure of animals to infectious agents against which antibiotics are used. Orally administered antibiotics are well known to cause dysbiosis. To counteract dysbiotic effects, numerous studies in the past two decades sought to understand whether probiotics are a valid tool to help re-establish a healthy gut microbial community after antibiotic treatment. Although dysbiotic effects of antibiotics are well investigated, little is known about the effects of intramuscular antibiotic treatment on the gut microbiome and a few studies attempted to study treatment effects using phylogenetic diversity analysis techniques. In this study we sought to determine the effects of two probiotic- and one intramuscularly administered antibiotic treatment on the developing gut microbiome of post-weaning piglets between their 3rd and 9th week of life. Shotgun metagenomic sequences from over 800 faecal time-series samples derived from 126 post-weaning piglets and 42 sows were analysed in a phylogenetic framework. Differences between individual hosts such as breed, litter, and age, were found to be important contributors to variation in the community composition. Host age was the dominant factor in shaping the gut microbiota of piglets after weaning. The post-weaning pig gut microbiome appeared to follow a highly structured developmental program with characteristic post-weaning changes that can distinguish hosts that were born as little as two days apart in the second month of life. Treatment effects of the antibiotic and probiotic treatments were found but were subtle and included a higher representation of Mollicutes associated with intramuscular antibiotic treatment, and an increase of Lactobacillus associated with probiotic treatment. The discovery of correlations between experimental factors and microbial community composition is more commonly addressed with OTU-based methods and rarely analysed via phylogenetic diversity measures. The latter method, although less intuitive than the former, suffers less from library size normalization biases, and it proved to be instrumental in this study for the discovery of correlations between microbiome composition and host-, and treatment factors.


Subjects
Gastrointestinal Microbiome , Microbiota , Probiotics , Animals , Anti-Bacterial Agents/pharmacology , Dysbiosis , Female , Gastrointestinal Microbiome/genetics , Phylogeny , Swine , Weaning
12.
Microb Genom ; 7(8)2021 08.
Article in English | MEDLINE | ID: mdl-34370660

ABSTRACT

Using a previously described metagenomics dataset of 27 billion reads, we reconstructed over 50 000 metagenome-assembled genomes (MAGs) of organisms resident in the porcine gut, 46.5 % of which were classified as >70 % complete with a <10 % contamination rate, and 24.4 % were nearly complete genomes. Here, we describe the generation and analysis of those MAGs using time-series samples. The gut microbial communities of piglets appear to follow a highly structured developmental programme in the weeks following weaning, and this development is robust to treatments including an intramuscular antibiotic treatment and two probiotic treatments. The high resolution we obtained allowed us to identify specific taxonomic 'signatures' that characterize the gut microbial development immediately after weaning. Additionally, we characterized the carbohydrate repertoire of the organisms resident in the porcine gut. We tracked the abundance shifts of 294 carbohydrate active enzymes, and identified the species and higher-level taxonomic groups carrying each of these enzymes in their MAGs. This knowledge can contribute to the design of probiotics and prebiotic interventions as a means to modify the piglet gut microbiome.


Subjects
Gastrointestinal Microbiome/genetics , Metagenome , Metagenomics , Animals , Gastrointestinal Microbiome/physiology , Genome, Bacterial , Phylogeny , Proteome , Swine , Weaning
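The quality thresholds quoted in the abstract above (>70 % complete with <10 % contamination) lend themselves to a simple filter, sketched here in Python. The abstract does not say what counts as "nearly complete", so the 90 % completeness and 5 % contamination cutoffs below are an assumption, and the function name and tier labels are hypothetical.

```python
def mag_tier(completeness, contamination, near_complete=(90.0, 5.0)):
    """Assign a MAG to a quality tier from percent completeness/contamination.

    The >70 % / <10 % rule comes from the abstract; the `near_complete`
    cutoffs are an assumed convention, not taken from the abstract.
    """
    min_complete, max_contam = near_complete
    if completeness > min_complete and contamination < max_contam:
        return "nearly complete"
    if completeness > 70.0 and contamination < 10.0:
        return "medium quality"
    return "low quality"


# Hypothetical (completeness, contamination) values, for illustration only.
for values in [(95.0, 2.0), (75.0, 8.0), (60.0, 1.0), (95.0, 12.0)]:
    print(values, mag_tier(*values))
```

Keeping the cutoffs as a parameter makes it easy to substitute whichever definition a given study actually used.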
13.
Gigascience ; 10(6)2021 06 03.
Article in English | MEDLINE | ID: mdl-34080630

ABSTRACT

BACKGROUND: Early weaning and intensive farming practices predispose piglets to the development of infectious and often lethal diseases, against which antibiotics are used. Besides contributing to the build-up of antimicrobial resistance, antibiotics are known to modulate the gut microbial composition. As an alternative to antibiotic treatment, studies have previously investigated the potential of probiotics for the prevention of postweaning diarrhea. In order to describe the post-weaning gut microbiota, and to study the effects of two probiotics formulations and of intramuscular antibiotic treatment on the gut microbiota, we sampled and processed over 800 faecal time-series samples from 126 piglets and 42 sows. RESULTS: Here we report on the largest shotgun metagenomic dataset of the pig gut lumen microbiome to date, consisting of >8 Tbp of shotgun metagenomic sequencing data. The animal trial, the workflow from sample collection to sample processing, and the preparation of libraries for sequencing, are described in detail. We provide a preliminary analysis of the dataset, centered on a taxonomic profiling of the samples, and a 16S-based beta diversity analysis of the mothers and the piglets in the first 5 weeks after weaning. CONCLUSIONS: This study was conducted to generate a publicly available databank of the faecal metagenome of weaner piglets aged between 3 and 9 weeks old, treated with different probiotic formulations and intramuscular antibiotic treatment. Besides investigating the effects of the probiotic and intramuscular antibiotic treatment, the dataset can be explored to assess a wide range of ecological questions with regards to antimicrobial resistance, host-associated microbial and phage communities, and their dynamics during the aging of the host.


Subjects
Gastrointestinal Microbiome , Probiotics , Animals , Female , Metagenome , Metagenomics , Swine , Weaning
14.
Microbiol Resour Announc ; 9(6)2020 Feb 06.
Article in English | MEDLINE | ID: mdl-32029559

ABSTRACT

We report the availability of a high-quality metagenomic Hi-C data set generated from a fecal sample taken from a healthy fecal microbiome transplant donor subject. We report on basic features of the data to evaluate their quality.

15.
Genome Biol ; 20(1): 46, 2019 02 26.
Article in English | MEDLINE | ID: mdl-30808380

ABSTRACT

Most microbes cannot be easily cultured, and metagenomics provides a means to study them. Current techniques aim to resolve individual genomes from metagenomes, so-called metagenome-assembled genomes (MAGs). Leading approaches depend upon time series or transect studies, the efficacy of which is a function of community complexity, target abundance, and sequencing depth. We describe an unsupervised method that exploits the hierarchical nature of Hi-C interaction rates to resolve MAGs using a single time point. We validate the method and directly compare against a recently announced proprietary service, ProxiMeta. bin3C is an open-source pipeline and makes use of the Infomap clustering algorithm ( https://github.com/cerebis/bin3C ).


Subjects
Genome, Bacterial , Metagenomics/methods , Microbiota/genetics , Software , Computer Simulation , Feces/microbiology , Humans , Sequence Analysis, DNA
16.
Microb Genom ; 5(3)2019 03.
Article in English | MEDLINE | ID: mdl-30303480

ABSTRACT

We recently identified clonal complex 10 (CC10) Escherichia coli as the predominant clonal group in two populations of healthy Australian food-production pigs. CC10 are highly successful, colonizing humans, food-production animals, fresh produce and environmental niches. Furthermore, E. coli within CC10 are frequently drug resistant and increasingly reported as human and animal extra-intestinal pathogens. In order to develop a high-resolution global phylogeny and determine the repertoire of antimicrobial-resistance genes, virulence-associated genes and plasmid types within this clonal group, we downloaded 228 publicly available CC10 short-read genome sequences for comparison with 20 porcine CC10 we have previously described. Core genome single nucleotide polymorphism phylogeny revealed a highly diverse global phylogeny consisting of multiple lineages that did not cluster by geography or source of the isolates. Australian porcine strains belonged to several of these divergent lineages, indicative that CC10 is present in these animals due to multiple colonization events. Differences in resistance gene and plasmid carriage between porcine strains and the global collection highlighted the role of lateral gene transfer in the evolution of CC10 strains. Virulence profiles typical of extra-intestinal pathogenic E. coli were present in both Australian porcine strains and the broader collection. As both the core phylogeny and accessory gene characteristics appeared unrelated to the geography or source of the isolates, it is likely that the global expansion of CC10 is not a recent event and may be associated with faecal carriage in humans.


Subjects
Escherichia coli/classification , Escherichia coli/genetics , Phylogeny , Swine/microbiology , Animals , Australia , Bacterial Typing Techniques , DNA, Bacterial/genetics , Drug Resistance, Bacterial/genetics , Escherichia coli/isolation & purification , Escherichia coli Infections/microbiology , Escherichia coli Infections/veterinary , Escherichia coli Proteins/genetics , Feces , Food Microbiology , Gene Transfer, Horizontal , Genome, Bacterial , Humans , Molecular Epidemiology , Plasmids , Swine Diseases/microbiology , Virulence/genetics , Whole Genome Sequencing
17.
Gut Pathog ; 11: 3, 2019.
Article in English | MEDLINE | ID: mdl-30805030

ABSTRACT

BACKGROUND: Enterobacter hormaechei is an important emerging pathogen and a key member of the highly diverse Enterobacter cloacae complex. E. hormaechei strains can persist and spread in nosocomial environments, and often exhibit resistance to multiple clinically important antibiotics. However, the genomic regions that harbour resistance determinants are typically highly repetitive and impossible to resolve with standard short-read sequencing technologies. RESULTS: Here we used both short- and long-read methods to sequence the genome of a multidrug-resistant hospital isolate (C15117), which we identified as E. hormaechei. Hybrid assembly generated a complete circular chromosome of 4,739,272 bp and a fully resolved plasmid of 339,920 bp containing several antibiotic resistance genes. The strain also harboured a 34,857 bp repeat encoding copper resistance, which was present in both the chromosome and plasmid. Long reads that unambiguously spanned this repeat were required to resolve the chromosome and plasmid into separate replicons. CONCLUSION: This study provides important insights into the evolution and potential spread of antimicrobial resistance in a nosocomial E. hormaechei strain. More broadly, it further exemplifies the power of long-read sequencing technologies, particularly the Oxford Nanopore platform, for the characterisation of bacteria with complex resistance loci and large repeat elements.

18.
Microbiome ; 7(1): 17, 2019 02 08.
Article in English | MEDLINE | ID: mdl-30736849

ABSTRACT

BACKGROUND: Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets are required. RESULTS: We describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series, and differential abundance studies, includes real and simulated strain-level diversity, and generates second- and third-generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes, we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT, and metaSPAdes, on several thousand small data sets generated with CAMISIM. CONCLUSIONS: CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with standards of truth for method evaluation. All data sets and the software are freely available at https://github.com/CAMI-challenge/CAMISIM.


Subjects
Computer Simulation , Gastrointestinal Microbiome/genetics , Metagenome/genetics , Metagenomics/methods , Algorithms , Animals , Humans , Mice , Models, Biological , Sequence Analysis, DNA/methods , Software
19.
Gigascience ; 7(2)2018 02 01.
Article in English | MEDLINE | ID: mdl-29149264

ABSTRACT

Background: Chromosome conformation capture (3C) and Hi-C DNA sequencing methods have rapidly advanced our understanding of the spatial organization of genomes and metagenomes. Many variants of these protocols have been developed, each with their own strengths. Currently there is no systematic means for simulating sequence data from this family of sequencing protocols, potentially hindering the advancement of algorithms to exploit this new datatype. Findings: We describe a computational simulator that, given simple parameters and reference genome sequences, will simulate Hi-C sequencing on those sequences. The simulator models the basic spatial structure in genomes that is commonly observed in Hi-C and 3C datasets, including the distance-decay relationship in proximity ligation, differences in the frequency of interaction within and across chromosomes, and the structure imposed by cells. A means to model the 3D structure of randomly generated topologically associating domains is provided. The simulator considers several sources of error common to 3C and Hi-C library preparation and sequencing methods, including spurious proximity ligation events and sequencing error. Conclusions: We have introduced the first comprehensive simulator for 3C and Hi-C sequencing protocols. We expect the simulator to have use in testing of Hi-C data analysis algorithms, as well as more general value for experimental design, where questions such as the required depth of sequencing, enzyme choice, and other decisions can be made in advance in order to ensure adequate statistical power with respect to experimental hypothesis testing.


Subjects
Bacteria/genetics , Fungi/genetics , Genome , High-Throughput Nucleotide Sequencing/statistics & numerical data , Models, Statistical , Algorithms , Chromosome Mapping , Computer Simulation
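The distance-decay relationship mentioned in the abstract above can be illustrated with a short Python sketch. This is not the published simulator's model. It assumes a single circular chromosome and draws the separation between the two ends of each simulated pair from an exponential distribution, chosen purely for simplicity, so that short-range contacts are far more frequent than long-range ones. All names and parameter values are hypothetical.

```python
import random


def sample_pair(genome_len, mean_sep=1000.0, rng=random):
    """Return (pos_a, pos_b) for one simulated contact on a circular genome.

    Assumption: separation is exponentially distributed with mean `mean_sep`.
    """
    pos_a = rng.randrange(genome_len)
    # Short separations are the most likely outcome of this draw.
    sep = int(rng.expovariate(1.0 / mean_sep))
    # Pick a direction at random, then wrap around the circular chromosome.
    if rng.random() < 0.5:
        sep = -sep
    pos_b = (pos_a + sep) % genome_len
    return pos_a, pos_b


def circular_distance(a, b, genome_len):
    """Shortest distance between two positions on a circular genome."""
    d = abs(a - b)
    return min(d, genome_len - d)


GENOME_LEN = 100_000
rng = random.Random(42)  # fixed seed so the run is repeatable
pairs = [sample_pair(GENOME_LEN, rng=rng) for _ in range(10_000)]
dists = [circular_distance(a, b, GENOME_LEN) for a, b in pairs]
short = sum(d < 1000 for d in dists)
long_range = sum(d >= 5000 for d in dists)
print(short, long_range)
```

Swapping the exponential draw for another distribution changes the shape of the decay without touching the rest of the sketch.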
20.
Microbiome ; 6(1): 113, 2018 06 20.
Article in English | MEDLINE | ID: mdl-29925429

ABSTRACT

BACKGROUND: The genomes of halophilic archaea (haloarchaea) often comprise multiple replicons. Genomic variation in haloarchaea has been linked to viral infection pressure and, in the case of Antarctic communities, can be caused by intergenera gene exchange. To expand understanding of genome variation and biogeography of Antarctic haloarchaea, here we assessed genomic variation between two strains of Halorubrum lacusprofundi that were isolated from Antarctic hypersaline lakes from different regions (Vestfold Hills and Rauer Islands). To assess variation in haloarchaeal populations, including the presence of genomic islands, metagenomes from six hypersaline Antarctic lakes were characterised. RESULTS: The sequence of the largest replicon of each Hrr. lacusprofundi strain (primary replicon) was highly conserved, while each of the strains' two smaller replicons (secondary replicons) were highly variable. Intergenera gene exchange was identified, including the sharing of a type I-B CRISPR system. Evaluation of infectivity of an Antarctic halovirus provided experimental evidence for the differential susceptibility of the strains, bolstering inferences that strain variation is important for modulating interactions with viruses. A relationship was found between genomic structuring and the location of variation within replicons and genomic islands, demonstrating that the way in which haloarchaea accommodate genomic variability relates to replicon structuring. Metagenome read and contig mapping and clustering and scaling analyses demonstrated biogeographical patterning of variation consistent with environment and distance effects. The metagenome data also demonstrated that specific haloarchaeal species dominated the hypersaline systems indicating they are endemic to Antarctica. CONCLUSION: The study describes how genomic variation manifests in Antarctic-lake haloarchaeal communities and provides the basis for future assessments of Antarctic regional and global biogeography of haloarchaea.


Subjects
Archaeal Viruses/genetics , Genome, Archaeal/genetics , Halorubrum/genetics , Microbiota/genetics , Antarctic Regions , Archaeal Viruses/isolation & purification , Base Sequence , Genetic Variation/genetics , Genomic Islands/genetics , Geography , Halorubrum/classification , Halorubrum/isolation & purification , Lakes/microbiology , Metagenome/genetics , Sequence Analysis, DNA