ABSTRACT
BACKGROUND: Triple modulator therapy with elexacaftor/tezacaftor/ivacaftor (ETI) improves lung function and alters the respiratory microbiome in people with cystic fibrosis (pwCF) with advanced lung disease. Adolescents with cystic fibrosis (CF) are less colonized with bacterial pathogens than adult pwCF, but their microbiota already differs from that of healthy individuals. The aim of this study was to longitudinally analyze the impact of ETI on the respiratory metagenome in adolescents with predominantly mild CF lung disease. METHODS: In this prospective observational study, we included pwCF aged 12-20 years with at least one F508del mutation, who collected oropharyngeal swabs before and after initiation of ETI therapy twice per week to biweekly over three months. We performed whole metagenome shotgun sequencing, followed by host DNA filtering and taxonomic profiling. We used linear and additive mixed effects models adjusted for known confounders and corrected for multiple testing to study longitudinal development of the microbiome. We analyzed bacterial diversity, abundance, and strain-level phylogeny. RESULTS: We analyzed the metagenomic data of 297 swabs from 20 pwCF. Microbiome composition changed after initiation of ETI therapy. We observed a slight diversification of the microbiome over time (inverse Simpson index, coefficient 0.085, 95% CI 0.003 to 0.17, p = 0.04). Strain-level analysis and clustering showed that strain retention of the most frequent bacterial species is predominant even during ETI therapy. CONCLUSIONS: During three months of ETI therapy, commensal bacteria increased, which may help to prevent overgrowth of bacterial pathogens.
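The diversity measure reported above, the inverse Simpson index, can be illustrated with a minimal sketch. The function name and the toy counts are assumptions made for illustration; this is not the study's analysis code, which fitted mixed effects models on top of such per-sample values.

```python
def inverse_simpson(counts):
    """Return the inverse Simpson index, 1 / sum(p_i^2), for taxon counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Relative abundance of each taxon that is actually present.
    proportions = [c / total for c in counts if c > 0]
    return 1.0 / sum(p * p for p in proportions)


# Hypothetical toy samples: four equally abundant taxa vs. one dominant taxon.
even_sample = inverse_simpson([25, 25, 25, 25])
skewed_sample = inverse_simpson([97, 1, 1, 1])
```

The index equals the number of taxa when all are equally abundant and approaches 1 as a single taxon dominates, so a positive model coefficient over time corresponds to a more even community.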
ABSTRACT
Wastewater treatment plants (WWTPs) serve as reservoirs for various pathogens and play a pivotal role in safeguarding environmental safety and public health by mitigating pathogen release. Pathogenic bacteria, known for their potential to cause fatal infections, present a significant and emerging threat to global health and remain poorly understood regarding their origins and transmission in the environment. Using metagenomic approaches, we identified a total of 299 pathogens from three full-scale WWTPs. We comprehensively elucidated the occurrence, dissemination, and source tracking of the pathogens across the WWTPs, addressing deficiencies in traditional detection strategies. While indicator pathogens in current wastewater treatment systems such as Escherichia coli are effectively removed, specific drug-resistant pathogens, including Pseudomonas aeruginosa, Pseudomonas putida, and Aeromonas caviae, persist throughout the treatment process, challenging complete eradication efforts. The anoxic section plays a predominant role in controlling abundance but significantly contributes to downstream pathogen diversity. Additionally, evolution throughout the treatment process enhances pathogen diversity, except for upstream transmission, such as A. caviae str. WP8-S18-ESBL-04 and P. aeruginosa PAO1. Our findings highlight the necessity of expanding current biomonitoring indicators for wastewater treatment to optimize treatment strategies and mitigate the potential health risks posed by emerging pathogens. By addressing these research priorities, we can effectively mitigate risks and safeguard environmental safety and public health.
ABSTRACT
With the development of sequencing technology and analytic tools, studying within-species variations enhances the understanding of microbial biological processes. Nevertheless, most existing methods designed for strain-level analysis lack the capability to concurrently assess both strain proportions and genome-wide single nucleotide variants (SNVs) across longitudinal metagenomic samples. In this study, we introduce LongStrain, an integrated pipeline for the analysis of large-scale metagenomic data from individuals with longitudinal or repeated samples. In LongStrain, we first utilize two efficient tools, Kraken2 and Bowtie2, for the taxonomic classification and alignment of sequencing reads, respectively. Subsequently, we propose to jointly model strain proportions and shared haplotypes across samples within individuals. This approach specifically targets tracking a primary strain and a secondary strain for each subject, providing their respective proportions and SNVs as output. With extensive simulation studies of a microbial community and single species, our results demonstrate that LongStrain is superior to two genotyping methods and two deconvolution methods across a majority of scenarios. Furthermore, we illustrate the potential applications of LongStrain in the real data analysis of The Environmental Determinants of Diabetes in the Young study and a gastric intestinal metaplasia microbiome study. In summary, the proposed analytic pipeline demonstrates marked statistical efficiency over the same type of methods and has great potential in understanding the genomic variants and dynamic changes at strain level. LongStrain and its tutorial are freely available online at https://github.com/BoyanZhou/LongStrain. IMPORTANCE: The advancement in DNA-sequencing technology has enabled the high-resolution identification of microorganisms in microbial communities. 
Since different microbial strains within species may show extreme phenotypic variability (e.g., nutrition metabolism, antibiotic resistance, and pathogen virulence), investigating within-species variations holds great scientific promise for understanding the underlying mechanisms of microbial biological processes. To fully utilize the shared genomic variants across longitudinal metagenomics samples collected in microbiome studies, we develop an integrated analytic pipeline (LongStrain) for longitudinal metagenomics data. It concurrently leverages the information on proportions of mapped reads for individual strains and genome-wide SNVs to enhance the efficiency and accuracy of strain identification. Our method helps to understand strains' dynamic changes and their association with genome-wide variants. Given the fast-growing number of longitudinal studies of microbial communities, LongStrain, which streamlines analyses of large-scale raw sequencing data, should be of great value to the microbiome research community.
ABSTRACT
Complex microbiomes are part of the food we eat and influence our own microbiome, but their diversity remains largely unexplored. Here, we generated the open access curatedFoodMetagenomicData (cFMD) resource by integrating 1,950 newly sequenced and 583 public food metagenomes. We produced 10,899 metagenome-assembled genomes spanning 1,036 prokaryotic and 108 eukaryotic species-level genome bins (SGBs), including 320 previously undescribed taxa. Food SGBs displayed significant microbial diversity within and between food categories. Extension to >20,000 human metagenomes revealed that food SGBs accounted on average for 3% of the adult gut microbiome. Strain-level analysis highlighted potential instances of food-to-gut transmission and intestinal colonization (e.g., Lacticaseibacillus paracasei) as well as SGBs with divergent genomic structures in food and humans (e.g., Streptococcus gallolyticus and Limosilactobacillus mucosae). The cFMD expands our knowledge of food microbiomes and their role in shaping the human microbiome, and supports future uses of metagenomics for food quality, safety, and authentication.
Subjects
Gastrointestinal Microbiome, Metagenome, Humans, Metagenome/genetics, Gastrointestinal Microbiome/genetics, Microbiota/genetics, Food Microbiology, Metagenomics/methods, Bacteria/genetics, Bacteria/classification
ABSTRACT
The current study aims to develop a new technique for the precise identification of Escherichia coli strains, utilizing matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) combined with a long short-term memory (LSTM) neural network. A total of 48 Escherichia coli strains were isolated and cultured on tryptic soy agar medium for 24 hours for the generation of MALDI-TOF MS spectra. Eight hundred MALDI-TOF MS spectra were obtained per strain, resulting in a database of 38,400 spectra. Fifty percent of the data was utilized for LSTM neural network training, with fine-tuned parameters for strain-level identification. The other half served as the test set to assess model performance. Traditional PCA dimension reduction of MALDI-TOF MS spectra indicated 47 out of 48 strains to be unclassifiable. In contrast, the LSTM neural network demonstrated remarkable efficacy. After 20 training epochs, the model achieved a loss value of 0.0524, an accuracy of 0.999, a precision of 0.985, and a recall of 0.982. When tested on the unseen data, the model attained an overall accuracy of 92.24%. The integration of MALDI-TOF MS and LSTM neural network markedly enhances the identification of Escherichia coli strains. This innovative approach offers an effective and accurate tool for MALDI-TOF MS-based strain-level identification, thus expanding the analytical capabilities of microbial diagnostics.
Subjects
Escherichia coli, Computer Neural Networks, Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry, Matrix-Assisted Laser Desorption-Ionization Mass Spectrometry/methods
ABSTRACT
The gut microbiome is closely associated with human health and the development of diseases. Isolating, characterizing, and identifying gut microbes are crucial for research on the gut microbiome and essential for advancing our understanding and utilization of it. Although culture-independent approaches have been developed, a pure culture is required for in-depth analysis of disease mechanisms and the development of biotherapy strategies. Currently, microbiome research faces the challenge of expanding the existing database of culturable gut microbiota and rapidly isolating target microorganisms. This review examines the advancements in gut microbe isolation and cultivation techniques, such as culturomics, droplet microfluidics, phenotypic and genomics selection, and membrane diffusion. Furthermore, we evaluate the progress made in technology for identifying gut microbes considering both non-targeted and targeted strategies. The focus of future research in gut microbial culturomics is expected to be on high-throughput, automation, and integration. Advancements in this field may facilitate strain-level investigation into the mechanisms underlying diseases related to gut microbiota.
Subjects
Gastrointestinal Microbiome, Gastrointestinal Microbiome/physiology, Humans
ABSTRACT
Detecting cyanobacteria in the environment is important because they play crucial roles in ecosystems and can form blooms with the potential to harm humans and other organisms. However, the most widely used methods for high-throughput detection of environmental cyanobacteria, such as 16S rRNA sequencing, typically provide above-species-level resolution, thereby disregarding intraspecific variation. To address this, we developed a novel DNA microarray tool, termed the CyanoStrainChip, that enables comprehensive strain-level profiling of environmental cyanobacteria. The CyanoStrainChip was designed to target 1277 strains covering nearly all major groups of cyanobacteria, using 43,666 genome-wide, strain-specific probes. It demonstrated strong specificity in in vitro mock community experiments. The high correlation (Pearson's R > 0.97) between probe fluorescence intensities and the corresponding DNA amounts (ranging from 1-100 ng) indicated excellent quantitative capability. Consistent cyanobacterial profiles of field samples were observed by both the CyanoStrainChip and next-generation sequencing methods. Furthermore, CyanoStrainChip analysis of surface water samples in Lake Chaohu uncovered high intraspecific variation in abundance change within the genus Microcystis between different severity levels of cyanobacterial blooms, highlighting two toxic Microcystis strains that are of critical concern for the suppression of harmful blooms in Lake Chaohu. Overall, these results suggest a potential for CyanoStrainChip as a valuable tool for cyanobacterial ecological research and harmful bloom monitoring to supplement existing techniques.
Subjects
Cyanobacteria, Microcystis, Humans, Oligonucleotide Array Sequence Analysis, 16S Ribosomal RNA/genetics, Ecosystem, Harmful Algal Bloom, Cyanobacteria/genetics, Lakes/microbiology, Microcystis/genetics
ABSTRACT
BACKGROUND: Undernutrition (UN) is a critical public health issue that threatens the lives of children under five in developing countries. While evidence indicates the crucial role of the gut microbiome (GM) in UN pathogenesis, strain-level inspection and bacterial co-occurrence network investigation of the GM in UN children are lacking. RESULTS: This study examines the strain compositions of the GM in 61 undernutrition patients (UN group) and 36 healthy children (HC group) and explores the topological features of GM co-occurrence networks using a complex network strategy. The strain-level annotation reveals that the species differentially enriched between the UN and HC groups differ in their strain compositions. For example, Prevotella copri is mainly composed of P. copri ASM1680343v1 and P. copri ASM345920v1 in the HC group, but it is composed of P. copri ASM346549v1 and P. copri ASM347465v1 in the UN group. In addition, the UN-risk model constructed at the strain level demonstrates higher accuracy (AUC = 0.810) than that at the species level (AUC = 0.743). With complex network analysis, we further discovered that the UN group had a more complex GM co-occurrence network, with more hub bacteria and a higher clustering coefficient but lower information transfer efficiencies. Moreover, the strain-level results suggested that species-level analysis can yield inaccurate and even false conclusions. CONCLUSIONS: Overall, this study highlights the importance of examining the GM at the strain level and investigating bacterial co-occurrence networks to advance our knowledge of UN pathogenesis.
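The clustering coefficient mentioned among the network features can be sketched as follows. This is a minimal illustration of the standard local clustering coefficient on an undirected graph; the adjacency structure, node labels, and function names are hypothetical and are not taken from the study.

```python
from itertools import combinations


def clustering_coefficient(adjacency, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to connect
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adjacency):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(clustering_coefficient(adjacency, n) for n in adjacency) / len(adjacency)


# Hypothetical co-occurrence network: an edge means two strains co-occur.
toy_network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
mean_cc = average_clustering(toy_network)
```

A higher average value means that co-occurring partners of a strain tend to co-occur with each other as well, which is one way a network can be described as more tightly clustered.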
Subjects
Gastrointestinal Microbiome, Malnutrition, Child, Humans, Cluster Analysis, Public Health
ABSTRACT
IMPORTANCE: Fusobacterium nucleatum is one of the predominant oral bacteria in humans. However, this bacterium is enriched in colorectal cancer (CRC) tissues and may be involved in CRC development. Our previous research, which used a traditional strain-typing method [arbitrarily primed polymerase chain reaction (AP-PCR)], suggested that F. nucleatum present in CRC tissues originates from the oral cavity. First, using whole-genome sequencing, this study confirmed an exemplary similarity between the oral and tumoral strains derived from each patient with CRC. Second, we successfully developed a method (defined as F. nucleatum-strain genotyping PCR) to genotype this bacterium at the strain level by targeting the hypervariable clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated system. This method can identify F. nucleatum strains in cryopreserved samples and is significantly superior to traditional AP-PCR, which can only be performed on isolates. The new methods have great potential for application in etiological studies of F. nucleatum in CRC.
Subjects
Colorectal Neoplasms, Fusobacterium nucleatum, Humans, Fusobacterium nucleatum/genetics, CRISPR-Cas Systems, Mouth/microbiology, Polymerase Chain Reaction/methods, Colorectal Neoplasms/diagnosis
ABSTRACT
The emergence of next-generation sequencing (NGS) technology has greatly influenced microbiome research and led to the development of novel bioinformatics tools to deeply analyze metagenomics datasets. Identifying strain-level variations in microbial communities is important to understanding the onset and progression of diseases, host-pathogen interrelationships, and drug resistance, in addition to designing new therapeutic regimens. In this study, we developed a novel tool called StrainIQ (strain identification and quantification) based on a new n-gram-based (series of n number of adjacent nucleotides in the DNA sequence) algorithm for predicting and quantifying strain-level taxa from whole-genome metagenomic sequencing data. We thoroughly evaluated our method using simulated and mock metagenomic datasets and compared its performance with existing methods. On average, it showed 85.8% sensitivity and 78.2% specificity on simulated datasets. It also showed higher specificity and sensitivity using n-gram models built from reduced reference genomes and on models with lower coverage sequencing data. It outperforms alternative approaches in genus- and strain-level prediction and strain abundance estimation. Overall, the results show that StrainIQ achieves high accuracy by implementing customized model-building and is an efficient tool for site-specific microbial community profiling.
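The n-gram idea described above (runs of n adjacent nucleotides) can be illustrated with a small sketch. This is not StrainIQ's algorithm or API; the function names, the toy sequences, and the simple shared-fraction score are assumptions chosen only to show how n-gram profiles can separate references.

```python
from collections import Counter


def ngram_counts(sequence, n):
    """Count every run of n adjacent nucleotides in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def shared_fraction(read, reference_counts, n):
    """Fraction of a read's n-grams that also occur in a reference profile."""
    grams = ngram_counts(read, n)
    total = sum(grams.values())
    if total == 0:
        return 0.0  # read shorter than n
    hits = sum(count for gram, count in grams.items() if gram in reference_counts)
    return hits / total


# Hypothetical toy references and read (real genomes are far longer).
profile_a = ngram_counts("ACGTACGTTT", 4)
profile_b = ngram_counts("GGGCCCAAAT", 4)
score_a = shared_fraction("ACGTAC", profile_a, 4)
score_b = shared_fraction("ACGTAC", profile_b, 4)
```

In this toy case the read shares all of its 4-grams with the first reference and none with the second, so it would be assigned to the first; a real tool would need weighting and abundance estimation on top of such matches.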
Subjects
Microbiota, Humans, Microbiota/genetics, Metagenome/genetics, Algorithms, Computational Biology, High-Throughput Nucleotide Sequencing
ABSTRACT
OBJECTIVES: The purpose of the present study was to characterize co-aggregation interactions between isolates of Fusobacterium nucleatum subsp. animalis and other colorectal cancer (CRC)-relevant species. METHODS: Co-aggregation interactions were assessed by comparing optical density values following 2-h stationary strain co-incubations to strain optical density values when incubated alone. Co-aggregation was characterized between strains from a previously isolated, CRC biopsy-derived community and F. nucleatum subsp. animalis, a species linked to CRC and known to be highly aggregative. Interactions were also investigated between the fusobacterial isolates and strains sourced from alternate human gastrointestinal samples whose closest species match aligned with species in the CRC biopsy-derived community. RESULTS: Co-aggregation interactions were observed to be strain-specific, varying between both F. nucleatum subsp. animalis strains and different strains of the same co-aggregation partner species. F. nucleatum subsp. animalis strains were observed to co-aggregate strongly with several taxa linked to CRC: Campylobacter concisus, Gemella spp., Hungatella hathewayi, and Parvimonas micra. CONCLUSIONS: Co-aggregation interactions suggest the ability to encourage the formation of biofilms, and colonic biofilms, in turn, have been linked to promotion and/or progression of CRC. Co-aggregation between F. nucleatum subsp. animalis and CRC-linked species such as C. concisus, Gemella spp., H. hathewayi, and P. micra may contribute to both biofilm formation along CRC lesions and to disease progression.
Subjects
Colorectal Neoplasms, Fusobacterium Infections, Humans, Fusobacterium nucleatum, Fusobacterium, Fusobacterium Infections/microbiology, Colorectal Neoplasms/microbiology
ABSTRACT
The vast population of bacterial phages or viruses (virome) plays pivotal roles in the ecology of human microbial flora and health conditions. Obstacles, including poor viral sequence inference, strain-sensitive virus-host relationship, and the high diversity among individuals, hinder the in-depth understanding of the human virome. We conducted longitudinal studies of the virome based on constructing a high-quality personal reference metagenome (PRM). By applying long-read sequencing for representative samples, we could build a PRM of high continuity that allows accurate annotation and abundance estimation of viruses and bacterial species in all samples of the same individual by aligning short sequencing reads to the PRM. We applied this approach to a series of fecal samples collected for 6 months from a 2-year-old boy who had experienced a 2-month flare-up of atopic eczema (dermatitis) in this period. We identified 31 viral strains in the patient's gut microbiota and deciphered their strain-level relationship to their bacterial hosts. Among them, a lytic crAssphage developed into a dozen substrains and coordinated downregulation in the catabolism of aromatic amino acids (AAAs) in their host bacteria, which govern the production of immune-active AAA derivatives. The metabolic alterations, confirmed by metabolomic assays, co-occurred with symptom remission. Our PRM-based analysis provides an easy approach for deciphering the dynamics of the strain-level human gut virome in the context of the entire microbiota. Close temporal correlations among virome alteration, microbial metabolism, and disease remission suggest a potential mechanism for how bacterial phages in microbiota are intimately related to human health. IMPORTANCE The vast populations of viruses or bacteriophages in human gut flora remain mysterious.
However, poor annotation and abundance estimation remain obstacles to strain-level analysis and clarification of their roles in microbiome ecology and metabolism associated with human health and diseases. We demonstrate that a personal reference metagenome (PRM)-based approach provides strain-level resolution for analyzing the gut microbiota-associated virome. When applying such an approach to longitudinal samples collected from a 2-year-old boy who has experienced a 2-month flare-up of atopic eczema, we observed thriving substrains of a lytic crAssphage, showing temporal correlation with downregulated catabolism of aromatic amino acids, lower production of immune-active metabolites, and remission of the disease. The PRM-based approach is practical and powerful for strain-centric analysis of the human gut virome, and the underlying mechanism of how strain-level virome dynamics affect disease deserves further investigation.
ABSTRACT
In fields such as the food industry, it is very important to identify target bacteria at the species level or lower for optimal product quality control. Bacteria identification at the subspecies or lower level requires time-consuming and high-cost analyses such as multi-locus sequence typing and amplified fragment length polymorphism analyses. Herein, we developed an easy-to-apply primer design algorithm for precisely identifying bacteria based on whole-genome DNA sequences. The algorithm designs primer sets that produce fragments from all input sequences and maximizes the differences in the amplicon size or amplicon sequence among input sequences. We demonstrate that the primer sets designed by the algorithm clearly classified six subspecies of Lactobacillus delbrueckii, and we observed that the resolution of the method is equal to that of a multi-locus sequence analysis. The algorithm allows the easy but precise identification of bacteria within a short time. (SHRS is freely available from PyPI under the MIT license.)
Subjects
Bacteria, Lactobacillus delbrueckii, Multilocus Sequence Typing/methods, Amplified Fragment Length Polymorphism Analysis, Bacteria/genetics, Lactobacillus delbrueckii/genetics, Algorithms
ABSTRACT
Some marine microbes are seemingly "ubiquitous," thriving across a wide range of environmental conditions. While the increased depth in metagenomic sequencing has led to a growing body of research on within-population heterogeneity in environmental microbial populations, there have been fewer systematic comparisons and characterizations of population-level genetic diversity over broader expanses of time and space. Here, we investigated the factors that govern the diversification of ubiquitous microbial taxa found within and between ocean basins. Specifically, we use mapped metagenomic paired reads to examine the genetic diversity of ammonia-oxidizing archaeal ("Candidatus Nitrosopelagicus brevis") populations in the Pacific (Hawaii Ocean Time-series [HOT]) and Atlantic (Bermuda Atlantic Time Series [BATS]) Oceans sampled over 2 years. We observed higher nucleotide diversity in "Ca. N. brevis" at HOT, driven by a higher rate of homologous recombination. In contrast, "Ca. N. brevis" at BATS featured a more open pangenome with a larger set of genes that were specific to BATS, suggesting a history of dynamic gene gain and loss events. Furthermore, we identified highly differentiated genes that were regulatory in function, some of which exhibited evidence of recent selective sweeps. These findings indicate that different modes of genetic diversification likely incur specific adaptive advantages depending on the selective pressures that they are under. Within-population diversity generated by the environment-specific strategies of genetic diversification is likely key to the ecological success of "Ca. N. brevis." IMPORTANCE Ammonia-oxidizing archaea (AOA) are one of the most abundant chemolithoautotrophic microbes in the marine water column and are major contributors to global carbon and nitrogen cycling. 
Despite their ecological importance and geographical pervasiveness, there have been limited systematic comparisons and characterizations of their population-level genetic diversity over time and space. Here, we use metagenomic time series from two ocean observatories to address the fundamental questions of how abiotic and biotic factors shape the population-level genetic diversity and how natural microbial populations adapt across diverse habitats. We show that the marine AOA "Candidatus Nitrosopelagicus brevis" in different ocean basins exhibits distinct modes of genetic diversification in response to their selective regimes shaped by nutrient availability and patterns of environmental fluctuations. Our findings specific to "Ca. N. brevis" have broader implications, particularly in understanding the population-level responses to the changing climate and predicting its impact on biogeochemical cycles.
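Nucleotide diversity from mapped reads, as compared between HOT and BATS above, can be sketched per site as the probability that two reads drawn without replacement carry different bases. This estimator and the toy pileup are assumptions for illustration; the abstract does not state which formula or software the authors used.

```python
def site_nucleotide_diversity(base_counts):
    """Probability that two reads sampled without replacement differ at a site."""
    depth = sum(base_counts.values())
    if depth < 2:
        return 0.0  # diversity is undefined with fewer than two reads
    same_pairs = sum(c * (c - 1) for c in base_counts.values())
    return 1.0 - same_pairs / (depth * (depth - 1))


def mean_nucleotide_diversity(pileup):
    """Average per-site diversity over a list of per-site base counts."""
    return sum(site_nucleotide_diversity(site) for site in pileup) / len(pileup)


# Hypothetical pileup: base counts of mapped reads at three genome positions.
toy_pileup = [
    {"A": 10},          # monomorphic site
    {"A": 2, "C": 2},   # evenly split site
    {"G": 9, "T": 1},   # rare variant
]
mean_pi = mean_nucleotide_diversity(toy_pileup)
```

Averaging over many sites gives a population-level value that can be compared between locations, provided coverage is similar enough that depth differences do not drive the result.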
Subjects
Ammonia, Ecosystem, Phylogeny, Oceans and Seas, Archaea/genetics
ABSTRACT
Intestinal bacterial strains play crucial roles in maintaining host health. Researchers have increasingly recognized the importance of strain-level analysis in metagenomic studies. Many analysis tools and several cutting-edge sequencing techniques, such as single-cell sequencing, have been proposed to decipher strains in metagenomes. However, strain-level complexity is still far from well characterized. As an indicator of strain-level complexity, metagenomic single-nucleotide polymorphisms (SNPs) have been used to disentangle conspecific strains, and many SNP-based tools have been developed to identify strains in metagenomes. However, the sequencing depth sufficient for SNP and strain-level analysis remains unclear. We conducted ultra-deep sequencing of the human gut microbiome and constructed an unbiased framework to perform reliable SNP analysis. SNP profiles of the human gut metagenome by ultra-deep sequencing were obtained. SNPs identified from conventional and ultra-deep sequencing data were thoroughly compared, and the relationship between SNP identification and sequencing depth was investigated. The results show that the commonly used shallow-depth sequencing cannot support systematic metagenomic SNP discovery. In contrast, ultra-deep sequencing could detect more functionally important SNPs, which leads to reliable downstream analyses and novel discoveries. We also constructed a machine learning model to provide guidance for researchers to determine the optimal sequencing depth for their projects (SNPsnp, https://github.com/labomics/SNPsnp). To conclude, the SNP profiles based on ultra-deep sequencing data extend current knowledge of metagenomics and highlight the importance of evaluating sequencing depth before starting SNP analysis. This study provides new ideas and references for future strain-level investigations.
ABSTRACT
In this viewpoint, by reviewing the recent findings on wild animals and their gut microbiomes, we found some potential new insights and challenges in the study of the evolution of wild animals and their gut microbiome. We suggested that wild animal gut microbiomes may come from microbiomes in the animals' living habitats along with animals' special behavior, and that the study of long-term changes in gut microbiomes should consider both habitat and special behaviors. Also, host behavior would facilitate the gut microbiome transmission between individuals. We suggested that research should integrate the evolutionary history and physiological systems of wild animals to understand the evolution of animals and their gut microbiomes. Finally, we proposed the Noncultured-Cultured-Fermentation-Model Animal pipeline to determine the function (diet digestion, physiology, and behavior) of these target strains in the wild animal gut.
ABSTRACT
Viruses change constantly during replication, leading to high intra-species diversity. Although many changes are neutral or deleterious, some can confer on the virus different biological properties such as better adaptability. In addition, viral genotypes often have associated metadata, such as host residence, which can help with inferring viral transmission during pandemics. Thus, subspecies analysis can provide important insights into virus characterization. Here, we present VirStrain, a tool taking short reads as input with viral strain composition as output. We rigorously test VirStrain on multiple simulated and real virus sequencing datasets. VirStrain outperforms the state-of-the-art tools in both sensitivity and accuracy.
Subjects
RNA Viruses, Viruses, Viral Genome, High-Throughput Nucleotide Sequencing, Metagenomics, RNA Viruses/genetics, Viruses/genetics
ABSTRACT
Metagenomic next-generation sequencing (mNGS) enables comprehensive pathogen detection and has become increasingly popular in clinical diagnosis. The distinct pathogenic traits between strains require mNGS to achieve a strain-level resolution, but an equivocal concept of 'strain' as well as the low pathogen loads in most clinical specimens hinder such strain awareness. Here we introduce a metagenomic intra-species typing (MIST) tool (https://github.com/pandafengye/MIST), which hierarchically organizes reference genomes based on average nucleotide identity (ANI) and performs maximum likelihood estimation to infer the strain-level compositional abundance. In silico analysis using synthetic datasets showed that MIST accurately predicted the strain composition at a 99.9% ANI resolution with a sequencing depth of merely 0.001×. When applying MIST to 359 culture-positive and 359 culture-negative real-world specimens of infected body fluids, we found that the presence of multiple strains reached considerable frequencies (30.39%-93.22%), which were otherwise underestimated by current diagnostic techniques due to their limited resolution. Several high-risk clones were identified to be prevalent across samples, including Acinetobacter baumannii sequence type (ST)208/ST195, Staphylococcus aureus ST22/ST398 and Klebsiella pneumoniae ST11/ST15, indicating potential outbreak events occurring in the clinical settings. Interestingly, contaminations caused by the engineered Escherichia coli strains K-12 and BL21 throughout the mNGS datasets were also identified by MIST instead of the statistical decontamination approach. Our study systematically characterized the infected body fluids at the strain level for the first time. Extension of mNGS testing to the strain level can greatly benefit clinical diagnosis of bacterial infections, including the identification of multi-strain infection, decontamination and infection control surveillance.
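Maximum likelihood estimation of strain abundances, as named above, is often carried out with an expectation-maximization (EM) loop over a mixture model. The sketch below shows that generic technique under stated assumptions: the per-read likelihood matrix, the function name, and the fixed iteration count are hypothetical, and this is not MIST's implementation or interface.

```python
def estimate_strain_abundance(read_likelihoods, iterations=100):
    """EM estimate of mixture proportions from per-read strain likelihoods.

    read_likelihoods[r][s] is the (assumed known) likelihood of read r
    given that it originates from strain s.
    """
    n_strains = len(read_likelihoods[0])
    abundance = [1.0 / n_strains] * n_strains  # start from a uniform guess
    for _ in range(iterations):
        totals = [0.0] * n_strains
        used = 0
        for row in read_likelihoods:
            weighted = [a * lik for a, lik in zip(abundance, row)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # read is incompatible with every strain
            used += 1
            # E-step: share the read among strains by posterior probability.
            for s, w in enumerate(weighted):
                totals[s] += w / norm
        if used == 0:
            raise ValueError("no read is compatible with any strain")
        # M-step: new abundance is the mean posterior share per strain.
        abundance = [t / used for t in totals]
    return abundance


# Hypothetical toy input: two strains, three unambiguous reads, one ambiguous.
toy_likelihoods = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
toy_abundance = estimate_strain_abundance(toy_likelihoods)
```

The ambiguous read is split according to the current abundance estimate, so the result settles at the proportions implied by the unambiguous reads (two thirds and one third in this toy input).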
Subjects
Bacterial Infections, Body Fluids, Bacterial Infections/diagnosis, High-Throughput Nucleotide Sequencing/methods, Humans, Metagenomics/methods, Nucleotides
ABSTRACT
The gut microbiome is of major interest due to its close relationship to health and disease. Bacteria usually vary in gene content, leading to functional variations within species, so methods with higher than species-level resolution are needed for ecological and clinical relevance. We design a protocol to identify strains in selected species with high discrimination and in high numbers by amplicon sequencing of the flagellin gene. We apply the protocol to fecal samples from a human diet trial, targeting Escherichia coli. Across the 119 samples from 16 individuals, there are 1,532 amplicon sequence variants (ASVs), but only 32 ASVs are dominant in one or more fecal samples, despite frequent dominant strain turnover. Major strains in an intestine are found to be commonly accompanied by a large number of satellite cells, and many are identified as potential extraintestinal pathogens. The protocol could be used to track epidemics or investigate the intra- or inter-host diversity of pathogens.
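The distinction above between all observed ASVs and the few that are dominant in at least one sample can be made concrete with a short sketch. The data layout, ASV labels, and function names are hypothetical; the abstract does not describe the authors' code or their exact definition of dominance, so the most abundant ASV per sample is assumed here.

```python
def dominant_asvs(samples):
    """Return the set of ASVs that are the most abundant in at least one sample.

    samples maps a sample name to a dict of ASV -> read count.
    """
    dominant = set()
    for counts in samples.values():
        if counts:
            dominant.add(max(counts, key=counts.get))
    return dominant


def dominant_turnovers(ordered_samples):
    """Count changes of the dominant ASV between consecutive samples."""
    tops = [max(counts, key=counts.get) for counts in ordered_samples if counts]
    return sum(1 for prev, cur in zip(tops, tops[1:]) if prev != cur)


# Hypothetical time series from one individual (read counts per ASV).
toy_series = {
    "day01": {"asv1": 900, "asv2": 50, "asv3": 5},
    "day08": {"asv1": 40, "asv2": 800, "asv4": 3},
    "day15": {"asv1": 700, "asv2": 100, "asv5": 2},
}
toy_dominant = dominant_asvs(toy_series)
toy_turnovers = dominant_turnovers(list(toy_series.values()))
```

In this toy series five ASVs are observed but only two are ever dominant, while the dominant strain changes twice, which mirrors the pattern of few dominant variants despite frequent turnover.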
Subjects
Escherichia coli/metabolism, Gastrointestinal Microbiome/genetics, Transcriptome/genetics, Adult, Bacterial DNA/genetics, Escherichia coli/genetics, Escherichia coli Proteins/metabolism, Feces/microbiology, Female, Flagellin/genetics, Flagellin/metabolism, Gastrointestinal Microbiome/physiology, Gene Expression/genetics, Genetic Variation/genetics, High-Throughput Nucleotide Sequencing/methods, Humans, Intestines, Male, Microbiota/genetics, Middle Aged, Phylogeny, 16S Ribosomal RNA/genetics, DNA Sequence Analysis/methods
ABSTRACT
Current methods to characterize microbial communities generally employ sequencing of the 16S rRNA gene (<500 bp) with high accuracy (~99%) but limited phylogenetic resolution. However, long-read sequencing now allows for the profiling of near-full-length ribosomal operons (16S-ITS-23S rRNA genes) on platforms such as the Oxford Nanopore MinION. Here, we describe an rRNA operon database with >300,000 entries, representing >10,000 prokaryotic species and ~150,000 strains. Additionally, BLAST parameters were identified for strain-level resolution using in silico mutated, mock rRNA operon sequences (70-95% identity) from four bacterial phyla and two members of the Euryarchaeota, mimicking MinION reads. MegaBLAST settings were determined that required <3 s per read on a Mac Mini with strain-level resolution for sequences with >84% identity. These settings were tested on rRNA operon libraries from the human respiratory tract, farm/forest soils and marine sponges (n = 1,322,818 reads for all sample sets). Most rRNA operon reads in this data set yielded best BLAST hits (95 ± 8%). However, only 38-82% of library reads were compatible with strain-level resolution, reflecting the dominance of human/biomedical-associated prokaryotic entries in the database. Since the MinION and the Mac Mini are both portable, this study demonstrates the possibility of rapid strain-level microbiome analysis in the field.