Results 1 - 14 of 14
1.
Environ Microbiome ; 19(1): 19, 2024 Mar 28.
Article in English | MEDLINE | ID: mdl-38549112

ABSTRACT

BACKGROUND: Recent endeavours in metagenomics, exemplified by projects such as the Human Microbiome Project and TARA Oceans, have illuminated the complexities of microbial biomes. A robust bioinformatic pipeline and meticulous evaluation of their methodology have contributed to the success of these projects. The soil environment, however, with its unique challenges, requires a specialized methodological exploration to maximize microbial insights. A notable limitation in soil microbiome studies is the dearth of soil-specific reference databases available to classifiers that emulate the complexity of soil communities. There is also a lack of in-vitro mock communities derived from soil strains that can be assessed for taxonomic classification accuracy. RESULTS: In this study, we generated a custom in-silico mock community containing microbial genomes commonly observed in the soil microbiome. Using this mock community, we simulated shotgun sequencing data to evaluate the performance of three leading metagenomic classifiers: Kraken2 (supplemented with Bracken, using a custom database derived from GTDB-TK genomes along with its own default database), Kaiju, and MetaPhlAn, utilizing their respective default databases for a robust analysis. Our results highlight the importance of optimizing taxonomic classification parameters and database selection, as well as of analysing both trimmed reads and contigs. Classifiers tailored to the specific taxa present in our samples produced fewer errors than broader databases that include microbial eukaryotes, protozoa, or human genomes, highlighting the effectiveness of targeted taxonomic classification. Notably, optimal classifier performance was achieved when applying a relative abundance threshold of 0.001% or 0.005%. Kraken2 supplemented with Bracken and a custom database demonstrated superior precision, sensitivity, F1 score, and overall sequence classification. With the custom database, this classifier classified 99% of in-silico reads and 58% of real-world soil shotgun reads, and in the latter it identified previously overlooked phyla. CONCLUSION: This study underscores the potential advantages of in-silico methodological optimization in metagenomic analyses, especially when deciphering the complexities of soil microbiomes. We demonstrate that the choice of classifier and database significantly impacts microbial taxonomic profiling. Our findings suggest that employing Kraken2 with Bracken, coupled with a custom database of GTDB-TK genomes and fungal genomes at a relative abundance threshold of 0.001%, provides optimal accuracy in soil shotgun metagenome analysis.
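
The abstract above reports a relative abundance threshold together with precision, sensitivity, and F1 score. The following is a minimal Python sketch of how such a threshold and such metrics could be computed against a mock community. The read counts, the taxon names, and both function names are hypothetical illustrations; they are not taken from the study and do not reflect the output format of Kraken2 or Bracken.

```python
def filter_by_relative_abundance(counts, threshold_percent):
    """Keep taxa whose share of all assigned reads is >= threshold_percent."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        taxon: n
        for taxon, n in counts.items()
        if 100.0 * n / total >= threshold_percent
    }


def precision_sensitivity_f1(predicted, expected):
    """Presence/absence metrics of predicted taxa against the mock community."""
    predicted, expected = set(predicted), set(expected)
    tp = len(predicted & expected)   # taxa correctly reported
    fp = len(predicted - expected)   # taxa reported but not in the mock
    fn = len(expected - predicted)   # mock taxa that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f1


# Hypothetical read counts per taxon (assumption, for illustration only).
counts = {
    "Bacillus subtilis": 600_000,
    "Pseudomonas fluorescens": 399_990,
    "Homo sapiens": 6,        # spurious low-abundance hit
    "Escherichia coli": 4,    # spurious low-abundance hit
}
truth = {"Bacillus subtilis", "Pseudomonas fluorescens"}

unfiltered = precision_sensitivity_f1(counts, truth)
kept = filter_by_relative_abundance(counts, threshold_percent=0.001)
filtered = precision_sensitivity_f1(kept, truth)
```

In this toy case the 0.001% threshold corresponds to 10 reads out of one million, so the two spurious taxa are dropped and precision rises without any loss of sensitivity. Real data would not separate this cleanly, which is presumably why the threshold needs tuning.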

2.
Genes (Basel) ; 13(12)2022 12 03.
Article in English | MEDLINE | ID: mdl-36553546

ABSTRACT

The study of microorganisms is a field of great interest due to their environmental (e.g., soil contamination) and biomedical (e.g., parasitic diseases, autism) importance. The advent of revolutionary next-generation sequencing techniques, and their application to the hypervariable regions of the 16S, 18S or 23S ribosomal subunits, have allowed more in-depth research on a wide variety of organisms, including bacteria, archaea, eukaryotes and fungi. Additionally, together with the development of analysis software, the creation of specific databases (e.g., SILVA or RDP) has boosted the enormous growth of these studies. As the cost of sequencing per sample has continuously decreased, new protocols have also emerged, such as shotgun sequencing, which allows the profiling of all taxonomic domains in a sample. The sequencing of hypervariable regions and shotgun sequencing are technologies that enable the taxonomic classification of microorganisms from the DNA present in microbial communities. However, they are not capable of measuring what is actively expressed. Conversely, we advocate that metatranscriptomics is a "new" technology that makes the identification of the mRNAs of a microbial community possible, quantifying gene expression levels and active biological pathways. Furthermore, it can also be used to characterise symbiotic interactions between the host and its microbiome. In this manuscript, we examine the three technologies above, and discuss the implementation of different software and databases, which greatly affect the reliability of the results. Finally, we have developed two easy-to-use pipelines leveraging Nextflow technology. These aim to provide everything required for an average user to perform a metagenomic analysis of marker genes with QIIME2 and a metatranscriptomic study using Kraken2/Bracken.


Subject(s)
Bacteria , Microbiota , Bacteria/genetics , Archaea/genetics , Software , Microbiota/genetics , Metagenome/genetics
3.
Med Princ Pract ; 31(5): 493-496, 2022.
Article in English | MEDLINE | ID: mdl-35944494

ABSTRACT

OBJECTIVE: A multiplex gyrB PCR assay has been used to diagnose Acinetobacter baumannii. However, this assay has not been validated against the gold standard DNA-DNA hybridization assay, which is a laborious method. DNA-DNA hybridization assay is now replaced by whole genome sequence (WGS)-based methods. Two such methods are a k-mer-based search of sequence reads using the Kraken 2 program and average nucleotide identity (ANI). The objective was to validate the gyrB PCR assay with WGS-based methods. SUBJECTS AND METHODS: We cultured 270 sequential A. baumannii isolates from the rectal swabs of 32 adult patients. The identity of the isolates was determined by gyrB PCR. The sequences of 269 isolates were determined by Illumina sequencing and the taxonomy was inferred by the Kraken 2 program and ANI. RESULTS: All the 269 isolates were confirmed as A. baumannii by Kraken 2 and ANI. CONCLUSION: The gyrB PCR assay is now validated for easy identification of A. baumannii in comparison with gold standard WGS-based assays.


Subject(s)
Acinetobacter Infections , Acinetobacter baumannii , Adult , Humans , Acinetobacter baumannii/genetics , Acinetobacter Infections/diagnosis , Multiplex Polymerase Chain Reaction/methods , DNA , Anti-Bacterial Agents
4.
Microorganisms ; 10(4)2022 Mar 25.
Article in English | MEDLINE | ID: mdl-35456762

ABSTRACT

Metagenomics analysis is now routinely used for clinical diagnosis in several diseases, and we need confidence in interpreting metagenomics analysis of microbiota. Particularly from the side of clinical microbiology, we consider that it would be a major milestone to further advance microbiota studies with an innovative and significant approach consisting of processing steps and quality assessment for interpreting metagenomics data used for diagnosis. Here, we propose a methodology for taxon identification and abundance assessment of shotgun sequencing data of microbes that are well fitted for clinical setup. Processing steps of quality controls have been developed in order (i) to avoid low-quality reads and sequences, (ii) to optimize abundance thresholds and profiles, (iii) to combine classifiers and reference databases for best classification of species and abundance profiles for both prokaryotic and eukaryotic sequences, and (iv) to introduce external positive control. We find that the best strategy is to use a pipeline composed of a combination of different but complementary classifiers such as Kraken2/Bracken and Kaiju. Such improved quality assessment will have a major impact on the robustness of biological and clinical conclusions drawn from metagenomic studies.

5.
Chin Med ; 17(1): 38, 2022 Mar 22.
Article in English | MEDLINE | ID: mdl-35317843

ABSTRACT

Molecular herbal authentication has gained worldwide popularity in the past decade. DNA-based methods, including DNA barcoding and species-specific amplification, have been adopted for herbal identification by various pharmacopoeias. The development of next-generation sequencing (NGS) has drastically increased sequencing throughput and has sped up sequence collection and assembly of organelle genomes, making more and more reference sequences/genomes available. NGS allows simultaneous sequencing of multiple reads, opening up the opportunity of identifying multiple species from one sample in one go. Two major experimental approaches have been applied in recent publications on the identification of herbal products by NGS: PCR-dependent DNA metabarcoding and PCR-free genome skimming/shotgun metagenomics. This review provides a brief introduction to the use of DNA metabarcoding and genome skimming/shotgun metagenomics in the authentication of herbal products and discusses some important considerations in experimental design for botanical identification by NGS, with a specific focus on quality control, reference sequence databases and different taxon assignment programs. The potential of quantification or abundance estimation by NGS is discussed, and new scientific findings that could potentially interfere with accurate taxon assignment and/or quantification are presented.

6.
China CDC Wkly ; 4(49): 1110-1116, 2022 Dec 09.
Article in English | MEDLINE | ID: mdl-36751662

ABSTRACT

Introduction: Salmonella is a key intestinal pathogen of foodborne disease, and the plasmids in Salmonella are related to many biological characteristics, including virulence and drug resistance. A large number of plasmid contigs have been sequenced in bacterial draft genomes; however, these are often difficult to distinguish from chromosomal contigs. Methods: In this study, three different customized Kraken databases were used to build three different Kraken classifiers. Complete genome benchmark datasets and simulated draft genome benchmark datasets were constructed. Five-fold cross-validation was used to evaluate the performance of the three Kraken classifiers on the two benchmark datasets. Results: The predictive performance of the classifier based on all National Center for Biotechnology Information plasmids and Salmonella complete genomes was optimal. This optimal Kraken classifier was then applied to Salmonella isolated in China. The plasmid carrying rate of Salmonella in China is 91.01%, and the Kraken classifier found more plasmid contigs and antibiotic resistance genes (ARGs) than a plasmid replicon-based method (PlasmidFinder). Moreover, in the strains carrying ARGs, plasmids carried more ARGs [three, 95% confidence interval (CI): 1-14] than chromosomes (one, 95% CI: 1-7). Discussion: We found that building a Kraken classifier on a high-quality customized database is ideal for the prediction of Salmonella plasmid sequences from bacterial draft genomes. In the future, the Kraken classifier established in this study will play a significant role in ARG monitoring.
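
The abstract above mentions five-fold cross-validation without describing how the folds were built. As a hedged illustration only, the sketch below shows generic k-fold partitioning in Python: the function name `k_fold_indices` is hypothetical, and the assumption that items are shuffled beforehand (if at all) is ours, not the study's.

```python
def k_fold_indices(n_items, k):
    """Yield (train, test) index lists so that every item is tested exactly once.

    Items are taken in their given order; shuffling beforehand is left to the
    caller (an assumption of this sketch).
    """
    if k < 2 or k > n_items:
        raise ValueError("k must be between 2 and n_items")
    # Distribute the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


# Example: 12 hypothetical benchmark genomes split into five folds.
folds = list(k_fold_indices(12, 5))
```

Each classifier would then be built from the training indices of a fold and scored on the held-out indices, with the five scores averaged.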

7.
Int J Mol Sci ; 22(16)2021 Aug 22.
Article in English | MEDLINE | ID: mdl-34445764

ABSTRACT

Recent studies show that breast tissue harbours various species of microorganisms and cannot be considered sterile, as previously thought. We analysed the microbial composition of primary tumour tissue and normal breast tissue and found differences between them and between multiple breast cancer phenotypes. We sequenced the transcriptome of breast tumours and normal tissues (from cancer-free women) of 23 individuals from Slovakia and used bioinformatics tools to uncover differences in the microbial composition of tissues. To analyse our RNA-seq data (rRNA depleted), we used and tested the Kraken2 and MetaPhlAn3 tools. Kraken2 showed higher reliability for our data. Additionally, we analysed 91 samples obtained from the SRA database, which originated in China and were submitted by Sichuan University. In breast tissue, the most enriched group was Proteobacteria, followed by Firmicutes and Actinobacteria in both datasets; Bacteroides were also frequent in the Slovak samples, while Cyanobacteria were more frequent in the Chinese samples. We observed changes in the microbiome between cancerous and healthy tissues and also between different disease phenotypes, based on the presence of circulating tumour cells and a few other markers.


Subject(s)
Breast Neoplasms/microbiology , Breast/microbiology , Microbiota , Case-Control Studies , Female , Humans , Neoplastic Cells, Circulating , Transcriptome
8.
mSystems ; 6(4): e0075021, 2021 Aug 31.
Article in English | MEDLINE | ID: mdl-34427527

ABSTRACT

The advent of high-throughput sequencing techniques has recently provided an astonishing insight into the composition and function of the human microbiome. Next-generation sequencing (NGS) has become the gold standard for advanced microbiome analysis; however, 3rd generation real-time sequencing, such as Oxford Nanopore Technologies (ONT), enables rapid sequencing from several kilobases to >2 Mb with high resolution. Despite the wide availability and the enormous potential for clinical and translational applications, ONT is poorly standardized in terms of sampling and storage conditions, DNA extraction, library creation, and bioinformatic classification. Here, we present a comprehensive analysis pipeline with sampling, storage, DNA extraction, library preparation, and bioinformatic evaluation for complex microbiomes sequenced with ONT. Our findings from buccal and rectal swabs and DNA extraction experiments indicate that methods that were approved for NGS microbiome analysis cannot be simply adapted to ONT. We recommend using swabs and DNA extraction protocols with extended washing steps. Both 16S rRNA and metagenomic sequencing achieved reliable and reproducible results. Our benchmarking experiments reveal thresholds for analysis parameters that achieved excellent precision, recall, and area under the precision-recall curve values, superior to those of existing classifiers (Kraken2, Kaiju, and MetaMaps). Hence, our workflow provides an experimental and bioinformatic pipeline to perform a highly accurate analysis of complex microbial structures from buccal and rectal swabs. IMPORTANCE Advanced microbiome analysis relies on sequencing of short DNA fragments from microorganisms like bacteria, fungi, and viruses. More recently, long fragment DNA sequencing of 3rd generation sequencing has gained increasing importance and can be rapidly conducted within a few hours due to its potential real-time sequencing. 
However, the analysis and correct identification of the microbiome relies on a multitude of factors, such as the method of sampling, DNA extraction, sequencing, and bioinformatic analysis. Scientists have used different protocols in the past that do not allow us to compare results across different studies and research fields. Here, we provide a comprehensive workflow from DNA extraction, sequencing, and bioinformatic workflow that allows rapid and accurate analysis of human buccal and rectal swabs with reproducible protocols. This workflow can be readily applied by many scientists from various research fields that aim to use long-fragment microbiome sequencing.

9.
Article in English | MEDLINE | ID: mdl-32083020

ABSTRACT

Differentiation between mitis group streptococci (MGS) bacteria in routine laboratory tests has become important for obtaining accurate epidemiological information on the characteristics of MGS and understanding their clinical significance. The most reliable method of MGS species identification is multilocus sequence analysis (MLSA) with seven house-keeping genes; however, because this method is time-consuming, it is deemed unsuitable for use in most clinical laboratories. In this study, we established a scheme for identifying 12 species of MGS (S. pneumoniae, S. pseudopneumoniae, S. mitis, S. oralis, S. peroris, S. infantis, S. australis, S. parasanguinis, S. sinensis, S. sanguinis, S. gordonii, and S. cristatus) using the MinION nanopore sequencer (Oxford Nanopore Technologies, Oxford, UK) with the taxonomic aligner "What's in My Pot?" (WIMP; Oxford Nanopore's cloud-based analysis platform) and Kraken2 pipeline with the custom database adjusted for MGS species identification. The identities of the species in reference genomes (n = 514), clinical isolates (n = 31), and reference strains (n = 4) were confirmed via MLSA. The nanopore simulation reads were generated from reference genomes, and the optimal cut-off values for MGS species identification were determined. For 31 clinical isolates (S. pneumoniae = 8, S. mitis = 17 and S. oralis = 6) and 4 reference strains (S. pneumoniae = 1, S. mitis = 1, S. oralis = 1, and S. pseudopneumoniae = 1), a sequence library was constructed via a Rapid Barcoding Sequencing Kit for multiplex and real-time MinION sequencing. The optimal cut-off values for the identification of MGS species for analysis by WIMP and Kraken2 pipeline were determined. The workflow using Kraken2 pipeline with a custom database identified all 12 species of MGS, and WIMP identified 8 MGS bacteria except S. infantis, S. australis, S. peroris, and S. sinensis. 
The results obtained by MinION with WIMP and Kraken2 pipeline were consistent with the MGS species identified by MLSA analysis. The practical advantage of whole genome analysis using the MinION nanopore sequencer is that it can aid in MGS surveillance. We concluded that MinION sequencing with the taxonomic aligner enables accurate MGS species identification and could contribute to further epidemiological surveys.


Subject(s)
Bacterial Typing Techniques , Nanopore Sequencing , Sequence Analysis, DNA , Streptococcus/classification , Genes, Bacterial , Genome, Bacterial , Humans , Mouth Mucosa/microbiology , Multilocus Sequence Typing , Phylogeny , Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization , Streptococcal Infections/microbiology , Streptococcus/genetics , Streptococcus/isolation & purification , Streptococcus mitis/classification , Streptococcus mitis/genetics , Streptococcus mitis/isolation & purification , Streptococcus oralis/classification , Streptococcus oralis/genetics , Streptococcus oralis/isolation & purification , Streptococcus pneumoniae/classification , Streptococcus pneumoniae/genetics , Streptococcus pneumoniae/isolation & purification , Streptococcus sanguis/classification , Streptococcus sanguis/genetics , Streptococcus sanguis/isolation & purification , Whole Genome Sequencing
10.
Mol Ecol Resour ; 20(3)2020 May.
Article in English | MEDLINE | ID: mdl-31943790

ABSTRACT

The ability to detect the identity of a sample obtained from its environment is a cornerstone of molecular ecological research. Thanks to the falling price of shotgun sequencing, genome skimming, the acquisition of short reads spread across the genome at low coverage, is emerging as an alternative to traditional barcoding. By obtaining far more data across the whole genome, skimming has the promise to increase the precision of sample identification beyond traditional barcoding while keeping the costs manageable. While methods for assembly-free sample identification based on genome skims are now available, little is known about how these methods react to the presence of DNA from organisms other than the target species. In this paper, we show that the accuracy of distances computed between a pair of genome skims based on k-mer similarity can degrade dramatically if the skims include contaminant reads; i.e., any reads originating from other organisms. We establish a theoretical model of the impact of contamination. We then suggest and evaluate a solution to the contamination problem: Query reads in a genome skim against an extensive database of possible contaminants (e.g., all microbial organisms) and filter out any read that matches. We evaluate the effectiveness of this strategy when implemented using Kraken-II, in detailed analyses. Our results show substantial improvements in accuracy as a result of filtering but also point to limitations, including a need for relatively close matches in the contaminant database.
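
The filtering strategy described in this abstract (query each read against a contaminant database and drop any read that matches) can be sketched in a few lines of Python. This is a deliberately simplified illustration that uses exact k-mer set membership and a Jaccard index; it is not how Kraken-II works internally, and the sequences, the value of k, the `min_hits` parameter, and all function names are hypothetical.

```python
def kmers(seq, k):
    """Set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def skim_kmers(reads, k):
    """Union of k-mers over all reads of a genome skim."""
    result = set()
    for read in reads:
        result |= kmers(read, k)
    return result


def filter_contaminants(reads, contaminant_kmers, k, min_hits=1):
    """Drop every read that shares at least min_hits k-mers with the contaminant set."""
    kept = []
    for read in reads:
        hits = sum(1 for km in kmers(read, k) if km in contaminant_kmers)
        if hits < min_hits:
            kept.append(read)
    return kept


def jaccard(a, b):
    """Jaccard index of two k-mer sets (1.0 means identical sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


K = 4  # toy value; real tools use much longer k-mers
contaminant_db = kmers("GGGGCCCCGGGG", K)          # hypothetical contaminant genome
skim_a = ["ACGTACGTAC", "TTTTAAAATT"]               # clean skim of the target
skim_b = ["ACGTACGTAC", "GGGCCCCGGG", "TTTTAAAATT"]  # same target plus one contaminant read

similarity_before = jaccard(skim_kmers(skim_a, K), skim_kmers(skim_b, K))
skim_b_filtered = filter_contaminants(skim_b, contaminant_db, K)
similarity_after = jaccard(skim_kmers(skim_a, K), skim_kmers(skim_b_filtered, K))
```

In this toy example a single contaminant read lowers the k-mer similarity between two skims of the same target, and filtering restores it. The abstract's caveat still applies: a read is only removed if the database contains a sufficiently close match.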


Subject(s)
Genome/genetics , DNA/genetics , Databases, Genetic , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Analysis, DNA/methods
11.
Genes (Basel) ; 9(8)2018 Aug 20.
Article in English | MEDLINE | ID: mdl-30127280

ABSTRACT

Accurate species identification from ancient DNA samples is a difficult task that would shed light on the evolutionary history of pathogenic microorganisms. The field of palaeomicrobiology has undoubtedly benefited from the advent of untargeted metagenomic approaches that use next-generation sequencing methodologies. Nevertheless, assigning ancient DNA at the species level is a challenging process. Recently, the gut microbiome analysis of three pre-Columbian Andean mummies (Santiago-Rodriguez et al., 2016) has called into question the identification of Leishmania in South America. The accurate assignment would be important because it will provide some key elements that are linked to the evolutionary scenario for visceral leishmaniasis agents in South America. Here, we recovered the metagenomic data filed in the metagenomics RAST server (MG-RAST) to identify the different members of the Trypanosomatidae family that have infected these ancient remains. For this purpose, we used the ultrafast metagenomic sequence classifier, based on an exact alignment of k-mers (Kraken) and Bowtie2, an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. The analyses, which have been conducted on the most exhaustive genomic database possible on Trypanosomatidae, show that species assignments could be biased by a lack of some genomic sequences of Trypanosomatidae species (strains). Nevertheless, our work raises the issue of possible co-infections by multiple members of the Trypanosomatidae family in these three pre-Columbian mummies. In the three mummies, we show the presence of DNA that is reminiscent of a probable co-infection with Leptomonas seymouri, a parasite of insect's gut, and Lotmaria.

12.
Front Microbiol ; 8: 2445, 2017.
Article in English | MEDLINE | ID: mdl-29270165

ABSTRACT

The advent of next generation sequencing and bioinformatics tools have greatly advanced our knowledge about the phylogenetic diversity and ecological role of microbes inhabiting the mammalian gut. However, there is a lack of information on the evaluation of these computational tools in the context of the rumen microbiome as these programs have mostly been benchmarked on real or simulated datasets generated from human studies. In this study, we compared the outcomes of two methods, Kraken (mRNA based) and a pipeline developed in-house based on Mothur (16S rRNA based), to assess the taxonomic profiles (bacteria and archaea) of rumen microbial communities using total RNA sequencing of rumen fluid collected from 12 cattle with differing feed conversion ratios (FCR). Both approaches revealed a similar phyla distribution of the most abundant taxa, with Bacteroidetes, Firmicutes, and Proteobacteria accounting for approximately 80% of total bacterial abundance. For bacterial taxa, although 69 genera were commonly detected by both methods, an additional 159 genera were exclusively identified by Kraken. Kraken detected 423 species, while Mothur was not able to assign bacterial sequences to the species level. For archaea, both methods generated similar results only for the abundance of Methanomassiliicoccaceae (previously referred as RCC), which comprised more than 65% of the total archaeal families. Taxon R4-41B was exclusively identified by Mothur in the rumen of feed efficient bulls, whereas Kraken uniquely identified Methanococcaceae in inefficient bulls. Although Kraken enhanced the microbial classification at the species level, identification of bacteria or archaea in the rumen is limited due to a lack of reference genomes for the rumen microbiome. The findings from this study suggest that the development of the combined pipelines using Mothur and Kraken is needed for a more inclusive and representative classification of microbiomes.

13.
Protist ; 167(3): 268-78, 2016 06.
Article in English | MEDLINE | ID: mdl-27236418

ABSTRACT

The term 'filose amoebae' describes a highly polyphyletic assemblage of protists whose phylogenetic placement can be unpredictable based on gross morphology alone. We isolated six filose amoebae from soils of two European countries and describe a new genus and species of naked filose amoebae, Kraken carinae gen. nov. sp. nov. We provide a morphological description based on light microscopy and small subunit rRNA gene sequences (SSU rDNA). In culture, Kraken carinae strains were very slow-moving and preyed on bacteria using a network of filopodia. Phylogenetic analyses of SSU sequences reveal that Kraken are core (filosan) Cercozoa, branching weakly at the base of the cercomonad radiation, most closely related to Paracercomonas, Metabolomonas, and Brevimastigomonas. Some Kraken sequences are >99% similar to an environmental sequence obtained from a freshwater lake in Antarctica, indicating that Kraken is not exclusively soil dwelling, but also inhabits freshwater habitats.


Subject(s)
Cercozoa/classification , Cercozoa/isolation & purification , Cercozoa/cytology , Cercozoa/genetics , Cluster Analysis , DNA, Protozoan/chemistry , DNA, Protozoan/genetics , DNA, Ribosomal/chemistry , DNA, Ribosomal/genetics , Microscopy , Phylogeny , RNA, Ribosomal, 18S/genetics , Sequence Analysis, DNA , Soil/parasitology
14.
J Microbiol Methods ; 122: 38-42, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26812576

ABSTRACT

Ultrafast-metagenomic sequence classification using exact alignments (Kraken) is a novel approach to classify 16S rDNA sequences. The classifier is based on mapping short sequences to the lowest ancestor and performing alignments to form subtrees with specific weights in each taxon node. This study aimed to evaluate the classification performance of Kraken with long, random environmental 16S rDNA sequences produced by cloning followed by Sanger sequencing. A total of 480 clones were isolated and expanded, and 264 of these clones formed contigs (1352 ± 153 bp). The same sequences were analyzed using the Ribosomal Database Project (RDP) classifier. Deeper classification performance was achieved by Kraken than by the RDP: 73% of the contigs were classified up to the species or variety levels, whereas 67% of these contigs were classified no further than the genus level by the RDP. The results also demonstrated that unassembled sequences analyzed by Kraken provide similar or even deeper information. Moreover, sequences that did not form contigs, which are usually discarded by other programs, provided meaningful information when analyzed by Kraken. Finally, it appears that the assembly step for Sanger sequences can be eliminated when using Kraken. Kraken accumulates the information from both sequence senses, providing additional elements for the classification. In conclusion, the results demonstrate that Kraken is an excellent choice for the taxonomic assignment of sequences obtained by Sanger sequencing or by third generation sequencing, whose main goal is to generate longer sequences.
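
The abstract describes the classifier as mapping short sequences to the lowest ancestor and weighting taxon nodes. The Python sketch below illustrates that general idea on a toy taxonomy: each k-mer points to a taxon, hits are counted per node, and the read is assigned to the hit taxon whose root-to-node path carries the most hits. The taxonomy, the k-mer table, the tie-breaking rule, and all names are hypothetical simplifications, not Kraken's actual implementation.

```python
from collections import Counter

# Hypothetical toy taxonomy: child -> parent (root has no parent).
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Escherichia": "Bacteria",
    "E. coli": "Escherichia",
    "Salmonella": "Bacteria",
}


def path_to_root(taxon, parent):
    """Taxa from the given node up to the root, starting with the node itself."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = parent[taxon]
    return path


def lowest_common_ancestor(a, b, parent):
    """Deepest taxon that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a, parent))
    for taxon in path_to_root(b, parent):
        if taxon in ancestors_a:
            return taxon
    raise ValueError("taxa do not share a root")


def classify_read(read, kmer_to_taxon, parent, k):
    """Assign a read to the hit taxon whose root-to-node path has the most k-mer hits."""
    hits = Counter()
    for i in range(len(read) - k + 1):
        taxon = kmer_to_taxon.get(read[i:i + k])
        if taxon is not None:
            hits[taxon] += 1
    if not hits:
        return None  # unclassified

    def path_score(taxon):
        return sum(hits[t] for t in path_to_root(taxon, parent))

    # Simplified tie-break (assumption): prefer the deeper taxon.
    return max(hits, key=lambda t: (path_score(t), len(path_to_root(t, parent))))


# Hypothetical k-mer table; a k-mer shared by several genomes would be stored
# under their lowest common ancestor.
KMER_TO_TAXON = {"ACG": "E. coli", "CGT": "Escherichia", "GTA": "Escherichia", "TAA": "Salmonella"}

shared = lowest_common_ancestor("E. coli", "Salmonella", PARENT)
assignment = classify_read("ACGTAA", KMER_TO_TAXON, PARENT, k=3)
```

Here the read has one species-level hit and two genus-level hits on the same lineage, which together outweigh the single hit for the other genus, so the read is assigned at species level.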


Subject(s)
Bacteria/genetics , DNA, Ribosomal/classification , Metagenomics/methods , RNA, Ribosomal, 16S/classification , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Algorithms , Bacteria/classification , Cloning, Organism/methods , Computational Biology/methods , DNA Barcoding, Taxonomic/methods , DNA, Ribosomal/analysis , DNA, Ribosomal/genetics , Databases, Genetic , Escherichia coli/genetics , RNA, Ribosomal, 16S/analysis , RNA, Ribosomal, 16S/genetics , Sensitivity and Specificity , Software