1.
Brief Bioinform ; 24(1)2023 01 19.
Article in English | MEDLINE | ID: mdl-36575570

ABSTRACT

High-throughput profiling of microbial functional traits involved in various biogeochemical cycling pathways using shotgun metagenomic sequencing has been routinely applied in microbial ecology and environmental science. Multiple bioinformatics data processing approaches are available, including assembly-based (single-sample assembly and multi-sample assembly) and read-based (merged reads and raw data). However, it remains unclear how these different approaches may differ in data analyses and affect result interpretation. In this study, using two typical shotgun metagenome datasets recovered from geographically distant coastal sediments, the performance of different data processing approaches was comparatively investigated from both technical and biological/ecological perspectives. Microbially mediated biogeochemical cycling pathways, including nitrogen cycling, sulfur cycling and B12 biosynthesis, were analyzed. As a result, multi-sample assembly provided the largest amount of usable information for targeted functional traits, at a high cost of computational resources and running time. Single-sample assembly and read-based analysis were comparable in obtaining usable information, but the former was much more time- and resource-consuming. Critically, different approaches introduced much stronger variations in microbial profiles than biological differences. However, community-level differences between the two sampling sites could be consistently observed regardless of the approach used. In choosing an appropriate approach, researchers should balance the trade-offs between multiple factors, including the scientific question, the amount of usable information, computational resources and time cost. This study is expected to provide valuable technical insights and guidelines for the various approaches used for metagenomic data analysis.


Subjects
Metagenome , Metagenomics , High-Throughput Nucleotide Sequencing
2.
BMC Genomics ; 24(1): 440, 2023 Aug 05.
Article in English | MEDLINE | ID: mdl-37543591

ABSTRACT

BACKGROUND: Biocontrol is a key technology for the control of pest species. Microctonus parasitoid wasps (Hymenoptera: Braconidae) have been released in Aotearoa New Zealand as biocontrol agents, targeting three different pest weevil species. Despite their value as biocontrol agents, no genome assemblies are currently available for these Microctonus wasps, limiting investigations into key biological differences between the different species and strains. METHODS AND FINDINGS: Here we present high-quality genomes for Microctonus hyperodae and Microctonus aethiopoides, assembled with short read sequencing and Hi-C scaffolding. These assemblies have total lengths of 106.7 Mb for M. hyperodae and 129.2 Mb for M. aethiopoides, with scaffold N50 values of 9 Mb and 23 Mb respectively. With these assemblies we investigated differences in reproductive mechanisms, and association with viruses between Microctonus wasps. Meiosis-specific genes are conserved in asexual Microctonus, with in-situ hybridisation validating expression of one of these genes in the ovaries of asexual Microctonus aethiopoides. This implies asexual reproduction in these Microctonus wasps involves meiosis, with the potential for sexual reproduction maintained. Investigation of viral gene content revealed candidate genes that may be involved in virus-like particle production in M. aethiopoides, as well as a novel virus infecting M. hyperodae, for which a complete genome was assembled. CONCLUSION AND SIGNIFICANCE: These are the first published genomes for Microctonus wasps which have been deployed as biocontrol agents, in Aotearoa New Zealand. These assemblies will be valuable resources for continued investigation and monitoring of these biocontrol systems. Understanding the biology underpinning Microctonus biocontrol is crucial if we are to maintain its efficacy, or in the case of M. hyperodae to understand what may have influenced the significant decline of biocontrol efficacy. 
The potential for sexual reproduction in asexual Microctonus is significant given that empirical modelling suggests this asexual reproduction is likely to have contributed to biocontrol decline. Furthermore, the identification of a novel virus in M. hyperodae highlights a previously unknown aspect of this biocontrol system, which may contribute to premature mortality of the host pest. These findings have potential to be exploited in future attempts to increase the effectiveness of M. hyperodae biocontrol.


Subjects
Wasps , Weevils , Animals , Wasps/genetics , Weevils/genetics , Reproduction , Parthenogenesis , Chromosomes
3.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33758906

ABSTRACT

Recent advances in high-throughput sequencing technologies and computational methods have added a new dimension to metagenomic data analysis, i.e., genome-resolved metagenomics. In general terms, it refers to the recovery of draft or high-quality microbial genomes and their taxonomic classification and functional annotation. In recent years, several studies have utilized the genome-resolved metagenome analysis approach and identified previously unknown microbial species from human and environmental metagenomes. In this review, we describe genome-resolved metagenome analysis as a series of four necessary steps: (i) preprocessing of the sequencing reads, (ii) de novo metagenome assembly, (iii) genome binning and (iv) taxonomic and functional analysis of the recovered genomes. For each of these four steps, we discuss the most commonly used tools and the currently available pipelines to guide the scientific community in the recovery and subsequent analyses of genomes from any metagenome sample. Furthermore, we also discuss the tools required for validation of assembly quality as well as for improving quality of the recovered genomes. We also highlight the currently available pipelines that can be used to automate the whole analysis without requiring advanced bioinformatics knowledge. Finally, we highlight the most widely adopted and actively maintained tools and pipelines that can be helpful to the scientific community in decision making before they commence the analysis.


Subjects
Taxonomic DNA Barcoding/methods , Microbial Genome , Metagenome , Metagenomics/methods , Microbiota/genetics , Feces/microbiology , Genitalia/microbiology , High-Throughput Nucleotide Sequencing , Humans , Mouth/microbiology , DNA Sequence Analysis , Skin/microbiology , Soil Microbiology , Water Microbiology
4.
BMC Bioinformatics ; 23(1): 513, 2022 Nov 30.
Article in English | MEDLINE | ID: mdl-36451083

ABSTRACT

BACKGROUND: The assembly of metagenomes decomposes members of complex microbe communities and allows the characterization of these genomes without laborious cultivation or single-cell metagenomics. Metagenome assembly is a process that is memory intensive and time consuming. Multi-terabyte sequences can become too large to be assembled on a single computer node, and there is no reliable method to predict the memory requirement due to data-specific memory consumption pattern. Currently, out-of-memory (OOM) is one of the most prevalent factors that causes metagenome assembly failures. RESULTS: In this study, we explored the possibility of using Persistent Memory (PMem) as a less expensive substitute for dynamic random access memory (DRAM) to reduce OOM and increase the scalability of metagenome assemblers. We evaluated the execution time and memory usage of three popular metagenome assemblers (MetaSPAdes, MEGAHIT, and MetaHipMer2) in datasets up to one terabase. We found that PMem can enable metagenome assemblers on terabyte-sized datasets by partially or fully substituting DRAM. Depending on the configured DRAM/PMEM ratio, running metagenome assemblies with PMem can achieve a similar speed as DRAM, while in the worst case it showed a roughly two-fold slowdown. In addition, different assemblers displayed distinct memory/speed trade-offs in the same hardware/software environment. CONCLUSIONS: We demonstrated that PMem is capable of expanding the capacity of DRAM to allow larger metagenome assembly with a potential tradeoff in speed. Because PMem can be used directly without any application-specific code modification, these findings are likely to be generalized to other memory-intensive bioinformatics applications.


Subjects
Metagenome , Microbiota , Metagenomics , Software , Computational Biology
5.
J Proteome Res ; 19(4): 1351-1360, 2020 04 03.
Article in English | MEDLINE | ID: mdl-32200634

ABSTRACT

As the infection of 2019-nCoV coronavirus is quickly developing into a global pneumonia epidemic, the careful analysis of its transmission and cellular mechanisms is sorely needed. In this Communication, we first analyzed two recent studies that concluded that snakes are the intermediate hosts of 2019-nCoV and that the 2019-nCoV spike protein insertions share a unique similarity to HIV-1. However, the reimplementation of the analyses, built on larger scale data sets using state-of-the-art bioinformatics methods and databases, presents clear evidence that rebuts these conclusions. Next, using metagenomic samples from Manis javanica, we assembled a draft genome of the 2019-nCoV-like coronavirus, which shows 73% coverage and 91% sequence identity to the 2019-nCoV genome. In particular, the alignments of the spike surface glycoprotein receptor binding domain revealed four times more variations in the bat coronavirus RaTG13 than in the Manis coronavirus compared with 2019-nCoV, suggesting the pangolin as a missing link in the transmission of 2019-nCoV from bats to human.


Subjects
Betacoronavirus/genetics , Coronavirus Infections/virology , Viral Genome/genetics , Host-Pathogen Interactions , Molecular Models , Viral Pneumonia/virology , Coronavirus Spike Glycoprotein/chemistry , Coronavirus Spike Glycoprotein/genetics , Amino Acid Sequence , Animals , Betacoronavirus/classification , COVID-19 , Eutheria/virology , HIV-1/genetics , Humans , Metagenome , Pandemics , Tertiary Protein Structure , SARS-CoV-2 , Sequence Alignment , Protein Sequence Analysis , Snakes/virology
6.
Genomics ; 111(6): 1824-1830, 2019 12.
Article in English | MEDLINE | ID: mdl-30552976

ABSTRACT

A metagenome from a refinery wastewater treatment plant running under nitrogen stress was analyzed to mine novel aromatic hydrocarbon-degrading bacteria. The sequence data were assembled using metaSPAdes, followed by binning with MetaBAT to assemble genomes; coverage and depth were calculated using Bowtie and SAMtools. The analysis recovered a novel genome belonging to the family Bradyrhizobiaceae, identified based on the 16S rDNA gene and supported by CheckM and Kraken analysis. Annotation with RAST showed that the assembled genome has the capability for nitrogen fixation and for utilizing multiple hydrocarbon substrates, with 14 different types of oxygenases as mapped by MinPath. Additional genetic features, such as genes for stress response and resistance to heavy metals and antibiotics, suggested that the genome has gone through a rigorous process of adaptation. If such bacteria could be cultivated, they would open a broad window of bioremediation strategies for nitrogen-stressed environments.


Subjects
Bacterial Genome , Aromatic Hydrocarbons/metabolism , Nitrogen Fixation/genetics , Nitrogen-Fixing Bacteria/genetics , Environmental Biodegradation , Nitrogen-Fixing Bacteria/metabolism
7.
Yeast ; 35(1): 71-84, 2018 01.
Article in English | MEDLINE | ID: mdl-28892574

ABSTRACT

Interspecific hybridization is a common mechanism enabling genetic diversification and adaptation; however, the detection of hybrid species has been quite difficult. The identification of microbial hybrids is made even more complicated, as most environmental microbes are resistant to culturing and must be studied in their native mixed communities. We have previously adapted the chromosome conformation capture method Hi-C to the assembly of genomes from mixed populations. Here, we show the method's application in assembling genomes directly from an uncultured, mixed population from a spontaneously inoculated beer sample. Our assembly method has enabled us to de-convolute four bacterial and four yeast genomes from this sample, including a putative yeast hybrid. Downstream isolation and analysis of this hybrid confirmed its genome to consist of Pichia membranifaciens and that of another related, but undescribed, yeast. Our work shows that Hi-C-based metagenomic methods can overcome the limitation of traditional sequencing methods in studying complex mixtures of genomes. Copyright © 2017 John Wiley & Sons, Ltd.


Subjects
Beer/microbiology , Genetic Hybridization , Metagenomics/methods , Yeasts/genetics , Fungal Genome , Phylogeny
8.
BMC Genomics ; 18(1): 521, 2017 07 10.
Article in English | MEDLINE | ID: mdl-28693474

ABSTRACT

BACKGROUND: Metagenomics allows unprecedented access to uncultured environmental microorganisms. The analysis of metagenomic sequences facilitates gene prediction and annotation, and enables the assembly of draft genomes, including uncultured members of a community. However, while several platforms have been developed for this critical step, there is currently no clear framework for the assembly of metagenomic sequence data. RESULTS: To assist with selection of an appropriate metagenome assembler we evaluated the capabilities of nine prominent assembly tools on nine publicly-available environmental metagenomes, as well as three simulated datasets. Overall, we found that SPAdes provided the largest contigs and highest N50 values across 6 of the 9 environmental datasets, followed by MEGAHIT and metaSPAdes. MEGAHIT emerged as a computationally inexpensive alternative to SPAdes, assembling the most complex dataset using less than 500 GB of RAM and within 10 hours. CONCLUSIONS: We found that assembler choice ultimately depends on the scientific question, the available resources and the bioinformatic competence of the researcher. We provide a concise workflow for the selection of the best assembly tool.


Subjects
Metagenomics/methods , Benchmarking , Genetic Databases , Environment
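The benchmark in record 8 ranks assemblers partly by N50. As a minimal sketch of how that statistic is usually computed (the function name `n50` and the toy contig lengths are illustrative assumptions, not taken from the paper), N50 is the length of the contig at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size:

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly: total length 300, half is 150.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
example_n50 = n50(example_lengths)
```

Because N50 depends only on the length distribution, a higher value indicates a more contiguous assembly but says nothing about correctness, which is why benchmarks of this kind typically report it alongside other metrics.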
9.
Methods ; 102: 3-11, 2016 06 01.
Article in English | MEDLINE | ID: mdl-27012178

ABSTRACT

The study of metagenomics has benefited greatly from low-cost and high-throughput sequencing technologies, yet the tremendous amount of data generated makes analyses such as de novo assembly consume too many computational resources. In late 2014 we released MEGAHIT v0.1 (together with a brief note of Li et al. (2015) [1]), which is the first NGS metagenome assembler that can assemble genome sequences from metagenomic datasets of hundreds of Giga base-pairs (bp) in a time- and memory-efficient manner on a single server. The core of MEGAHIT is an efficient parallel algorithm for constructing succinct de Bruijn Graphs (SdBG), implemented on a graphics processing unit (GPU). The software has been well received by the assembly community, and there is interest in how to adapt the algorithms to integrate popular assembly practices so as to improve the assembly quality, as well as how to speed up the software using better CPU-based algorithms (instead of GPU). In this paper we first describe the details of the core algorithms in MEGAHIT v0.1, and then we show the new modules to upgrade MEGAHIT to version v1.0, which gives better assembly quality, runs faster and uses less memory. For the Iowa Prairie Soil dataset (252 Gbp after quality trimming), the assembly quality of MEGAHIT v1.0, when compared with v0.1, has a significant improvement, namely, 36% increase in assembly size and 23% in N50. More interestingly, MEGAHIT v1.0 is no slower than before (even running with the extra modules). This is primarily due to a new CPU-based algorithm for SdBG construction that is faster and requires less memory. Using CPU only, MEGAHIT v1.0 can assemble the Iowa Prairie Soil sample in about 43 h, reducing the running time of v0.1 by at least 25% and memory usage by up to 50%. MEGAHIT v1.0, exhibiting a smaller memory footprint, can process even larger datasets. 
The Kansas Prairie Soil sample (484 Gbp), the largest publicly available dataset, can now be assembled using no more than 500 GB of memory in 7.5 days. The assemblies of these datasets (and other large metagenomic datasets), as well as the software, are available at the website https://hku-bal.github.io/megabox.


Subjects
Metagenome , Sequence Analysis/methods , Software , Algorithms , Datasets as Topic , Metagenomics/methods , Soil
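Record 9 describes MEGAHIT as built around succinct de Bruijn graphs. The sketch below illustrates only the underlying idea, using a plain dictionary rather than the succinct representation; it is not MEGAHIT's algorithm, and the function name `build_de_bruijn` and the toy reads are assumptions made for illustration. Each k-mer in a read contributes one edge from its (k-1)-base prefix to its (k-1)-base suffix:

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge: prefix (first k-1 bases) -> suffix (last k-1 bases).
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Two hypothetical overlapping reads; with k=3 they share the k-mers CGT and GTC.
toy_reads = ["ACGTC", "CGTCA"]
toy_graph = build_de_bruijn(toy_reads, k=3)
```

In this toy case the graph is a single path AC -> CG -> GT -> TC -> CA, which spells the merged sequence ACGTCA. A dictionary of sets like this becomes impractical at hundreds of gigabases, which is the memory problem that succinct encodings are meant to address.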
10.
ArXiv ; 2024 Mar 03.
Article in English | MEDLINE | ID: mdl-38903742

ABSTRACT

Metagenomic studies have primarily relied on de novo assembly for reconstructing genes and genomes from microbial mixtures. While reference-guided approaches have been employed in the assembly of single organisms, they have not been used in a metagenomic context. Here we describe the first effective approach for reference-guided metagenomic assembly that can complement and improve upon de novo metagenomic assembly methods for certain organisms. Such approaches will be increasingly useful as more genomes are sequenced and made publicly available.

11.
Microbiol Spectr ; 12(6): e0011724, 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38687063

ABSTRACT

Oxford Nanopore sequencing is one of the high-throughput sequencing technologies that facilitates the reconstruction of metagenome-assembled genomes (MAGs). This study aimed to assess the potential of long-read assembly algorithms in Oxford Nanopore sequencing to enhance the MAG-based identification of bacterial pathogens using both simulated and mock communities. Simulated communities were generated to mimic those on fresh spinach and in surface water. Long reads were produced using R9.4.1+SQK-LSK109 and R10.4 + SQK-LSK112, with 0.5, 1, and 2 million reads. The simulated bacterial communities included multidrug-resistant Salmonella enterica serotypes Heidelberg, Montevideo, and Typhimurium in the fresh spinach community individually or in combination, as well as multidrug-resistant Pseudomonas aeruginosa in the surface water community. Real data sets of the ZymoBIOMICS HMW DNA Standard were also studied. A bioinformatic pipeline (MAGenie, freely available at https://github.com/jackchen129/MAGenie) that combines metagenome assembly, taxonomic classification, and sequence extraction was developed to reconstruct draft MAGs from metagenome assemblies. Five assemblers were evaluated based on a series of genomic analyses. Overall, Flye outperformed the other assemblers, followed by Shasta, Raven, and Unicycler, while Canu performed least effectively. In some instances, the extracted sequences resulted in draft MAGs and provided the locations and structures of antimicrobial resistance genes and mobile genetic elements. Our study showcases the viability of utilizing the extracted sequences for precise phylogenetic inference, as demonstrated by the consistent alignment of phylogenetic topology between the reference genome and the extracted sequences. 
R9.4.1+SQK-LSK109 was more effective in most cases than R10.4+SQK-LSK112, and greater sequencing depths generally led to more accurate results. IMPORTANCE: By examining diverse bacterial communities, particularly those housing multiple Salmonella enterica serotypes, this study holds significance in uncovering the potential of long-read assembly algorithms to improve metagenome-assembled genome (MAG)-based pathogen identification through Oxford Nanopore sequencing. Our research demonstrates that long-read assembly stands out as a promising avenue for boosting precision in MAG-based pathogen identification, thus advancing the development of more robust surveillance measures. The findings also support ongoing endeavors to fine-tune a bioinformatic pipeline for accurate pathogen identification within complex metagenomic samples.


Subjects
Algorithms , Bacterial Genome , High-Throughput Nucleotide Sequencing , Metagenome , Nanopore Sequencing , Nanopore Sequencing/methods , High-Throughput Nucleotide Sequencing/methods , Bacteria/genetics , Bacteria/classification , Bacteria/isolation & purification , Computational Biology/methods , Salmonella enterica/genetics , Salmonella enterica/classification , Salmonella enterica/isolation & purification , Metagenomics/methods , Pseudomonas aeruginosa/genetics , Pseudomonas aeruginosa/isolation & purification , Pseudomonas aeruginosa/classification
12.
Methods Mol Biol ; 2649: 235-259, 2023.
Article in English | MEDLINE | ID: mdl-37258866

ABSTRACT

The development of long-read nucleic acid sequencing is beginning to make very substantive impact on the conduct of metagenome analysis, particularly in relation to the problem of recovering the genomes of member species of complex microbial communities. Here we outline bioinformatics workflows for the recovery and characterization of complete genomes from long-read metagenome data and some complementary procedures for comparison of cognate draft genomes and gene quality obtained from short-read sequencing and long-read sequencing.


Subjects
Metagenome , Microbiota , Metagenomics/methods , Microbiota/genetics , DNA Sequence Analysis/methods , Base Sequence , High-Throughput Nucleotide Sequencing/methods
13.
mSystems ; 8(1): e0096522, 2023 02 23.
Article in English | MEDLINE | ID: mdl-36533929

ABSTRACT

The gut microbiome provides vital functions for mammalian hosts, yet research on its variability and function across adult life spans and multiple generations is limited in large mammalian carnivores. Here, we used 16S rRNA gene and metagenomic high-throughput sequencing to profile the bacterial taxonomic composition, genomic diversity, and metabolic function of fecal samples collected from 12 wild spotted hyenas (Crocuta crocuta) residing in the Masai Mara National Reserve, Kenya, over a 23-year period spanning three generations. The metagenomic data came from four of these hyenas and spanned two 2-year periods. With these data, we determined the extent to which host factors predicted variation in the gut microbiome and identified the core microbes present in the guts of hyenas. We also investigated novel genomic diversity in the mammalian gut by reporting the first metagenome-assembled genomes (MAGs) for hyenas. We found that gut microbiome taxonomic composition varied temporally, but despite this, a core set of 14 bacterial genera were identified. The strongest predictors of the microbiome were host identity and age, suggesting that hyenas possess individualized microbiomes and that these may change with age during adulthood. The gut microbiome functional profiles of the four adult hyenas were also individual specific and were associated with prey abundance, indicating that the functions of the gut microbiome vary with host diet. We recovered 149 high-quality MAGs from the hyenas' guts; some MAGs were classified as taxa previously reported for other carnivores, but many were novel and lacked species-level matches to genomes in existing reference databases. IMPORTANCE There is a gap in knowledge regarding the genomic diversity and variation of the gut microbiome across a host's life span and across multiple generations of hosts in wild mammals. 
Using two types of sequencing approaches, we found that although gut microbiomes were individualized and temporally variable among hyenas, they correlated similarly to large-scale changes in the ecological conditions experienced by their hosts. We also recovered 149 high-quality MAGs from the hyena gut, greatly expanding the microbial genome repertoire known for hyenas, carnivores, and wild mammals in general. Some MAGs came from genera abundant in the gastrointestinal tracts of canid species and other carnivores, but over 80% of MAGs were novel and from species not previously represented in genome databases. Collectively, our novel body of work illustrates the importance of surveying the gut microbiome of nonmodel wild hosts, using multiple sequencing methods and computational approaches and at distinct scales of analysis.


Subjects
Carnivora , Gastrointestinal Microbiome , Hyaenidae , Animals , Gastrointestinal Microbiome/genetics , Hyaenidae/genetics , 16S Ribosomal RNA/genetics , Carnivora/genetics , Metagenomics
14.
FEMS Microbiol Lett ; 369(1)2022 07 28.
Article in English | MEDLINE | ID: mdl-35687414

ABSTRACT

Biogenic coalbed methane is produced by biological processes mediated by synergistic interactions of microbial complexes in coal seams. However, the ecological role of functional bacteria in biogenic coalbed methane remains poorly understood. Here, we studied the metagenome-assembled genomes (MAGs) of Bacillales and Clostridiales from coal seams, revealing further expansion of hydrogen and acetogen producers involved in organic matter decomposition. In this study, Bacillales and Clostridiales were dominant orders (91.85 ± 0.94%) in cultured coal seams, and a total of 16 MAGs from six families, including Bacillus, Paenibacillus, Staphylococcus, Anaerosalibacter, Hungatella and Paeniclostridium, were reconstructed. These microbial groups possessed multiple metabolic pathways (glycolysis/gluconeogenesis, pentose phosphate, β-oxidation, TCA cycle, assimilatory sulfate reduction, nitrogen metabolism and encoding hydrogenase) that provided metabolic substrates (acetate and/or H2) for the methanogenic processes. Therein, the hydrogenase-encoding gene and hydrogenase maturation factors were merely found in all the Clostridiales MAGs. β-oxidation was the main metabolic pathway involved in short-chain fatty acid degradation and acetate production, and most of these pathways were detected and exhibited different operon structures in Bacillales MAGs. In addition, assimilatory sulfate reduction and nitrogen metabolism processes were also detected in some MAGs, and these processes were also closely related to acetate production and/or organic matter degradation according to their operon structures and metabolic pathways. In summary, this study enabled a better understanding of the ecological roles of Bacillales and Clostridiales in biogenic methane in coal seams based on a combination of bioinformatic techniques.


Subjects
Bacillales , Hydrogenase , Acetates , Bacillales/metabolism , Clostridiales/metabolism , Coal/microbiology , Humans , Methane/metabolism , Nitrogen , Sulfates
15.
Imeta ; 1(4): e46, 2022 Dec.
Article in English | MEDLINE | ID: mdl-38867906

ABSTRACT

Metagenomic evidence of great genetic diversity within the nonconserved regions of the human gut microbial genomes calls for new methods to elucidate the species-level variability at high resolution. However, current approaches cannot meet this methodological challenge. In this study, we proposed an efficient binning-first-and-assembly-later strategy, named MetaTrass, to recover high-quality species-resolved genomes based on public reference genomes and the single-tube long fragment read (stLFR) technology, which enables cobarcoding. MetaTrass can generate genomes with longer contiguity, higher completeness, and lower contamination than those produced by conventional assembly-first-and-binning-later strategies. From a simulation study on a mock microbial community, MetaTrass showed the potential to improve the contiguity of assembly from kb to Mb without accuracy loss, as compared to other methods based on the next-generation sequencing technology. From four human fecal samples, MetaTrass successfully retrieved 178 high-quality genomes, whereas only 58 were obtained with the best-performing conventional strategies. Most importantly, these high-quality genomes confirmed the high level of genetic diversity among different samples and unveiled much more. MetaTrass was designed to work with metagenomic reads sequenced by stLFR technology, but is also applicable to other types of cobarcoding libraries. With the high capability of assembling high-quality genomes of metagenomic data sets, MetaTrass seeks to facilitate the study of spatial characters and dynamics of complex microbial communities at enhanced resolution. The open-source code of MetaTrass is available at https://github.com/BGI-Qingdao/MetaTrass.

16.
Genomics Proteomics Bioinformatics ; 20(2): 246-259, 2022 04.
Article in English | MEDLINE | ID: mdl-34492339

ABSTRACT

The oral cavity of each person is home to hundreds of bacterial species. While taxa for oral diseases have been studied using culture-based characterization as well as amplicon sequencing, metagenomic and genomic information remains scarce compared to the fecal microbiome. Here, using metagenomic shotgun data for 3346 oral metagenomic samples together with 808 published samples, we obtain 56,213 metagenome-assembled genomes (MAGs), and more than 64% of the 3589 species-level genome bins (SGBs) contain no publicly available genomes. The resulting genome collection is representative of samples around the world and contains many genomes from candidate phyla radiation (CPR) that lack monoculture. Also, it enables the discovery of new taxa such as a genus Candidatus Bgiplasma within the family Acholeplasmataceae. Large-scale metagenomic data from massive samples also allow the assembly of strains from important oral taxa such as Porphyromonas and Neisseria. The oral microbes encode genes that could potentially metabolize drugs. Apart from these findings, a strongly male-enriched Campylobacter species was identified. Oral samples are easier to collect than fecal samples and have the potential for disease diagnosis. Thus, these data lay down a genomic framework for future inquiries of the human oral microbiome.


Subjects
Metagenome , Microbiota , Humans , Male , Microbiota/genetics , Metagenomics/methods , Bacteria/genetics , Feces
17.
Microorganisms ; 10(12)2022 Dec 06.
Article in English | MEDLINE | ID: mdl-36557669

ABSTRACT

Metagenomics offers the highest level of strain discrimination of bacterial pathogens from complex food and water microbiota. With the rapid evolution of assembly algorithms, defining an optimal assembler based on the performance in the metagenomic identification of foodborne and waterborne pathogens is warranted. We aimed to benchmark short-read assemblers for the metagenomic identification of foodborne and waterborne pathogens using simulated bacterial communities. Bacterial communities on fresh spinach and in surface water were simulated by generating paired-end short reads of Illumina HiSeq, MiSeq, and NovaSeq at different sequencing depths. Multidrug-resistant Salmonella Indiana SI43 and Pseudomonas aeruginosa PAO1 were included in the simulated communities on fresh spinach and in surface water, respectively. ABySS, IDBA-UD, MaSuRCA, MEGAHIT, metaSPAdes, and Ray Meta were benchmarked in terms of assembly quality, identifications of plasmids, virulence genes, Salmonella pathogenicity island, antimicrobial resistance genes, chromosomal point mutations, serotyping, multilocus sequence typing, and whole-genome phylogeny. Overall, MEGAHIT, metaSPAdes, and Ray Meta were more effective for metagenomic identification. We did not obtain an optimal assembler when using the extracted reads classified as Salmonella or P. aeruginosa for downstream genomic analyses, but the extracted reads showed consistent phylogenetic topology with the reference genome when they were aligned with Salmonella or P. aeruginosa strains. In most cases, HiSeq, MiSeq, and NovaSeq were comparable at the same sequencing depth, while higher sequencing depths generally led to more accurate results. As assembly algorithms advance and mature, the evaluation of assemblers should be a continuous process.

18.
Front Microbiol ; 13: 869135, 2022.
Article in English | MEDLINE | ID: mdl-35756038

ABSTRACT

The analysis of metagenome data based on the recovery of draft genomes (so-called metagenome-assembled genomes, or MAGs) has assumed an increasingly central role in microbiome research in recent years. Microbial communities underpinning the operation of wastewater treatment plants are particularly challenging targets for MAG analysis due to their high ecological complexity, and remain important, albeit understudied, microbial communities that play a key role in mediating interactions between human and natural ecosystems. Here we consider strategies for recovery of MAG sequences from time series metagenome surveys of full-scale activated sludge microbial communities. We generate MAG catalogs from this set of data using several different strategies, including the use of multiple individual sample assemblies, two variations on multi-sample co-assembly and a recently published MAG recovery workflow using deep learning. We obtain a total of just under 9,100 draft genomes, which collapse to around 3,100 non-redundant genomic clusters. We examine the strengths and weaknesses of these approaches in relation to MAG yield and quality, showing that co-assembly may offer advantages over single-sample assembly in the case of metagenome data obtained from closely sampled longitudinal study designs. Around 1,000 MAGs were candidates for being considered high quality, based on single-copy marker gene occurrence statistics; however, only 58 MAGs formally meet the MIMAG criteria for being high-quality draft genomes. These findings carry broader implications for performing genome-resolved metagenomics on highly complex communities, the design and implementation of genome recoverability strategies, MAG decontamination and the search for better binning methodology.

19.
Front Microbiol ; 12: 638561, 2021.
Article in English | MEDLINE | ID: mdl-33717033

ABSTRACT

High-throughput sequencing has revolutionized the field of microbiology; however, reconstructing complete genomes of organisms from whole metagenomic shotgun sequencing data remains a challenge. Recovered genomes are often highly fragmented, due to uneven abundances of organisms, repeats within and across genomes, sequencing errors, and strain-level variation. To address the fragmented nature of metagenomic assemblies, scientists rely on a process called binning, which clusters together contigs inferred to originate from the same organism. Existing binning algorithms use oligonucleotide frequencies and contig abundance (coverage) within and across samples to group together contigs from the same organism. However, these algorithms often miss short contigs and contigs from regions with unusual coverage or DNA composition characteristics, such as mobile elements. Here, we propose that information from assembly graphs can assist current strategies for metagenomic binning. We use MetaCarvel, a metagenomic scaffolding tool, to construct assembly graphs where contigs are nodes and edges are inferred based on paired-end reads. We developed a tool, Binnacle, that extracts information from the assembly graphs and clusters scaffolds into comprehensive bins. Binnacle also provides wrapper scripts to integrate with existing binning methods. The Binnacle pipeline can be found on GitHub (https://github.com/marbl/binnacle). We show that binning graph-based scaffolds, rather than contigs, improves the contiguity and quality of the resulting bins, and captures a broader set of the genes of the organisms being reconstructed.

20.
mSystems ; 5(6)2020 Nov 03.
Article in English | MEDLINE | ID: mdl-33144315

ABSTRACT

Large-scale metagenome assemblies of human microbiomes have produced a vast catalogue of previously unseen microbial genomes; however, comparatively few microbial genomes derive from other vertebrates. Here, we generated 5,596 metagenome-assembled genomes (MAGs) from the gut metagenomes of 180 predominantly wild animal species representing 5 classes, in addition to 14 existing animal gut metagenome data sets. The MAGs comprised 1,522 species-level genome bins (SGBs), most of which were novel at the species, genus, or family level, and the majority were enriched in host versus environment metagenomes. Many traits distinguished SGBs enriched in host or environmental biomes, including the number of antimicrobial resistance genes. We identified 1,986 diverse biosynthetic gene clusters; only 23 clustered with any MIBiG database references. Gene-based assembly revealed tremendous gene diversity, much of it host or environment specific. Our MAG and gene data sets greatly expand the microbial genome repertoire and provide a broad view of microbial adaptations to the vertebrate gut.

IMPORTANCE Microbiome studies on a select few mammalian species (e.g., humans, mice, and cattle) have revealed a great deal of novel genomic diversity in the gut microbiome. However, little is known of the microbial diversity in the gut of other vertebrates. We studied the gut microbiomes of a large set of mostly wild animal species consisting of mammals, birds, reptiles, amphibians, and fish. Unfortunately, we found that existing reference databases commonly used for metagenomic analyses failed to capture the microbiome diversity among vertebrates. To increase database representation, we applied advanced metagenome assembly methods to our animal gut data and to many public gut metagenome data sets that had not been used to obtain microbial genomes. Our resulting genome and gene cluster collections comprised a great deal of novel taxonomic and genomic diversity, which we extensively characterized. Our findings substantially expand what is known of microbial genomic diversity in the vertebrate gut.
