ABSTRACT
Bacteriophages typically have small genomes [1] and depend on their bacterial hosts for replication [2]. Here we sequenced DNA from diverse ecosystems and found hundreds of phage genomes with lengths of more than 200 kilobases (kb), including a genome of 735 kb, which is, to our knowledge, the largest phage genome to be described to date. Thirty-five genomes were manually curated to completion (circular and no gaps). Expanded genetic repertoires include diverse and previously undescribed CRISPR-Cas systems, transfer RNAs (tRNAs), tRNA synthetases, tRNA-modification enzymes, translation-initiation and elongation factors, and ribosomal proteins. The CRISPR-Cas systems of phages have the capacity to silence host transcription factors and translational genes, potentially as part of a larger interaction network that intercepts translation to redirect biosynthesis to phage-encoded functions. In addition, some phages may repurpose bacterial CRISPR-Cas systems to eliminate competing phages. We phylogenetically define the major clades of huge phages from human and other animal microbiomes, as well as from oceans, lakes, sediments, soils and the built environment. We conclude that the large gene inventories of huge phages reflect a conserved biological strategy, and that the phages are distributed across a broad bacterial host range and across Earth's ecosystems.
Subject(s)
Bacteria/virology , Bacteriophages/classification , Bacteriophages/genetics , Earth, Planet , Ecosystem , Genome, Viral/genetics , Phylogeny , Amino Acyl-tRNA Synthetases/genetics , Animals , Bacteria/genetics , Bacteriophages/isolation & purification , Bacteriophages/metabolism , Biodiversity , CRISPR-Cas Systems/genetics , Evolution, Molecular , Gene Expression Regulation, Bacterial , Gene Expression Regulation, Viral , Host Specificity , Humans , Lakes/virology , Molecular Sequence Annotation , Oceans and Seas , Prophages/genetics , Protein Biosynthesis , RNA, Transfer/genetics , Ribosomal Proteins/genetics , Seawater/virology , Soil Microbiology , Transcription, Genetic
ABSTRACT
MOTIVATION: Analyzing metagenomic data can be highly valuable for understanding the function and distribution of antimicrobial resistance genes (ARGs). However, there is a need for standardized and reproducible workflows to ensure the comparability of studies, as the current options involve various tools and reference databases, each designed with a specific purpose in mind. RESULTS: In this work, we have created the workflow ARGprofiler to process large amounts of raw sequencing reads for studying the composition, distribution, and function of ARGs. ARGprofiler tackles the challenge of deciding which reference database to use by providing the PanRes database of 14 078 unique ARGs that combines several existing collections into one. Our pipeline is designed not only to produce abundance tables of genes and microbes but also to reconstruct the flanking regions of ARGs with ARGextender. ARGextender is a bioinformatic approach combining KMA and SPAdes to recruit reads for a targeted de novo assembly. While our focus is on ARGs, the pipeline also creates Mash sketches for fast searching and comparisons of sequencing runs. AVAILABILITY AND IMPLEMENTATION: The ARGprofiler pipeline is a Snakemake workflow that supports the reuse of metagenomic sequencing data and is easily installable and maintained at https://github.com/genomicepidemiology/ARGprofiler.
Subject(s)
Anti-Bacterial Agents , Software , Drug Resistance, Bacterial/genetics , Metagenome , Metagenomics
ABSTRACT
The growing threat of antimicrobial resistance (AMR) calls for new epidemiological surveillance methods, as well as a deeper understanding of how antimicrobial resistance genes (ARGs) have been transmitted around the world. The large pool of sequencing data available in public repositories provides an excellent resource for monitoring the temporal and spatial dissemination of AMR in different ecological settings. However, only a limited number of research groups globally have the computational resources to analyze such data. We retrieved 442 Tbp of sequencing reads from 214,095 metagenomic samples from the European Nucleotide Archive (ENA) and aligned them using a uniform approach against ARGs and 16S/18S rRNA genes. Here, we present the results of this extensive computational analysis and share the counts of reads aligned. Over 6.76 × 10^8 read fragments were assigned to ARGs and 3.21 × 10^9 to rRNA genes, where we observed distinct differences in both the abundance of ARGs and the link between microbiome and resistome compositions across various sampling types. This collection is another step towards establishing global surveillance of AMR and can serve as a resource for further research into the environmental spread and dynamic changes of ARGs.
Subject(s)
Anti-Infective Agents , Metagenome , Anti-Bacterial Agents/pharmacology , Genes, Bacterial , Metagenome/genetics , Metagenomics/methods
ABSTRACT
BACKGROUND: The possibility of recovering metagenome-assembled genomes (MAGs) from sequence reads allows for further insights into microbial communities and their members, possibly even analyzing such sequences with tools designed for single-isolate genomes. As result quality depends on sequence quality, performance of tools for single-isolate genomes on MAGs should be tested beforehand. Bioinformatics can be leveraged to quickly create varied synthetic test sets with known composition for this purpose. RESULTS: We present MAGICIAN, a flexible, user-friendly pipeline for the simulation of MAGs. MAGICIAN combines a synthetic metagenome simulator with a metagenomic assembly and binning pipeline to simulate MAGs based on user-supplied input genomes, allowing users to test performance of tools on MAGs while having a ground truth to compare results to. Using MAGICIAN, we found that even very slight (1%) changes in depth of coverage can drastically affect whether a genome can be recovered. We also demonstrate the use of simulated MAGs by evaluating the suitability of such genomes obtained with MAGICIAN's current default pipeline for analysis with the antimicrobial resistance gene identification tool ResFinder. CONCLUSIONS: Using MAGICIAN, it is possible to simulate MAGs which, while generally high in quality, reflect issues encountered with real-world data, thus providing realistic best-case data. Evaluating the results of ResFinder analysis of these genomes revealed a risk for plausible-looking false positives, which underlines the need for pipeline validation so that researchers are aware of the potential issues when interpreting real-world data. Furthermore, the effects of fluctuations in depth of coverage on genome recovery in our simulated "random sequencing" warrant further investigation and indicate random subsampling of reads may affect discovery of more genomes.
Subject(s)
Metagenome , Microbiota , Computer Simulation , Microbiota/genetics , Metagenomics/methods , Computational Biology
ABSTRACT
We report the discovery of a persistent presence of Vibrio cholerae at very low abundance in the inlet of a single wastewater treatment plant in Copenhagen, Denmark, since at least 2015. Remarkably, no environmental or locally transmitted clinical case of V. cholerae has been reported in Denmark for more than 100 years. We have nevertheless recovered a near-complete genome from 115 metagenomic sewage samples taken over the past 8 years, despite the extremely low relative abundance of one V. cholerae read per 500,000 sequenced reads. Due to the very low relative abundance, routine screening of the individual samples did not reveal V. cholerae. The recovered genome lacks the gene responsible for cholera toxin production. Although this strain may not pose an immediate public health risk, our finding illustrates the importance, challenges, and effectiveness of wastewater-based pathogen surveillance.
Subject(s)
Sewage , Vibrio cholerae , Denmark , Sewage/microbiology , Vibrio cholerae/genetics , Vibrio cholerae/isolation & purification , Vibrio cholerae/classification , Genome, Bacterial , Wastewater/microbiology , Cholera/microbiology , Cholera/epidemiology
ABSTRACT
Wastewaters serve as important hot spots for antimicrobial resistance, and monitoring can be used to analyse the abundance and diversity of antimicrobial resistance genes at the level of large bacterial and human populations. In this study, whole genome sequencing of beta-lactamase-producing Escherichia coli and metagenomic analysis of whole-community DNA were used to characterize the occurrence of antimicrobial resistance in hospital, municipal and river waters in the city of Brno (Czech Republic). Cefotaxime-resistant E. coli were mainly extended-spectrum beta-lactamase (ESBL) producers (95.6%, n = 158), of which the majority carried blaCTX-M (98.7%; n = 151) and were detected in all water samples except the outflow from the hospital wastewater treatment plant. A wide phylogenetic diversity was observed among the sequenced E. coli (n = 78) based on the detection of 40 sequence types and single nucleotide polymorphisms (average number 34,666 ± 15,710) between strains. The metagenomic analysis revealed a high occurrence of bacterial genera with potentially pathogenic members, including Pseudomonas, Escherichia, Klebsiella, Aeromonas, Enterobacter and Arcobacter (relative abundance >50%) in untreated hospital and municipal wastewaters and a predominance of environmental bacteria in treated and river waters. Genes encoding resistance to aminoglycosides, beta-lactams, quinolones and macrolides were frequently detected; however, blaCTX-M was not found in this dataset, which may reflect insufficient sequencing depth of the samples. The study pointed to municipal treated wastewater as a possible source of multi-drug resistant E. coli and antimicrobial resistance genes for surface waters. Moreover, the combination of two different approaches provided a more holistic view of antimicrobial resistance in water environments. The culture-based approach facilitated insight into the dynamics of ESBL-producing E. coli, and the metagenomics showed that the abundance and diversity of bacteria and antimicrobial resistance genes vary across water sites.
Subject(s)
Escherichia coli , Wastewater , Anti-Bacterial Agents/pharmacology , Czech Republic , Drug Resistance, Bacterial/genetics , Escherichia coli/genetics , Hospitals , Humans , Metagenomics , Phylogeny , beta-Lactamases/genetics
ABSTRACT
OBJECTIVES: Previous studies in food-producing animals have shown associations between antimicrobial use (AMU) and resistance (AMR) in specifically isolated bacterial species. Multi-country data are scarce and only describe between-country differences. Here we investigate associations between the pig faecal mobile resistome and characteristics at the farm-level across Europe. METHODS: A cross-sectional study was conducted among 176 conventional pig farms from nine European countries. Twenty-five faecal samples from fattening pigs were pooled per farm and acquired resistomes were determined using shotgun metagenomics and the ResFinder reference database, i.e. the full collection of horizontally acquired AMR genes (ARGs). Resistance gene abundances were calculated as normalized fragments per kilobase reference per million bacterial fragments (FPKM). Specific farm-level data (AMU, biosecurity) were collected. Random-effects meta-analyses were performed by country, relating farm-level data to relative ARG abundances (FPKM). RESULTS: Total AMU during fattening was positively associated with total ARG (total FPKM). Positive associations were particularly observed between widely used macrolides and tetracyclines, and ARGs corresponding to the respective antimicrobial classes. Significant AMU-ARG associations were not found for β-lactams and only a few colistin ARGs were found, despite high use of these antimicrobial classes in younger pigs. Increased internal biosecurity was directly related to higher abundances of ARGs mainly encoding macrolide resistance. These effects of biosecurity were independent of AMU in mutually adjusted models. CONCLUSIONS: Using resistome data in association studies is unprecedented and adds accuracy and new insights to previously observed AMU-AMR associations. Major components of the pig resistome are positively and independently associated with on-farm AMU and biosecurity conditions.
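The FPKM normalization named in the METHODS can be illustrated with a minimal sketch. It follows only the definition given in the abstract (fragments per kilobase of reference per million bacterial fragments); the function name, argument names, and example numbers are hypothetical and are not taken from the study's pipeline.

```python
def fpkm(gene_fragments: int, reference_length_bp: int, bacterial_fragments: int) -> float:
    """Fragments per kilobase of reference per million bacterial fragments.

    gene_fragments      -- fragments aligned to one resistance gene (hypothetical input)
    reference_length_bp -- length of that gene's reference sequence in base pairs
    bacterial_fragments -- fragments assigned to bacteria in the same sample
    """
    if reference_length_bp <= 0 or bacterial_fragments <= 0:
        raise ValueError("reference length and bacterial fragment count must be positive")
    # Dividing by (length / 1e3) and by (bacterial fragments / 1e6)
    # is the same as multiplying the raw count by 1e9 / (length * total).
    return gene_fragments * 1e9 / (reference_length_bp * bacterial_fragments)


# Made-up example: 120 fragments on a 1,200 bp gene in a sample
# with 2,000,000 bacterial fragments.
example_value = fpkm(120, 1_200, 2_000_000)
```

Normalizing by reference length keeps long genes from appearing more abundant simply because they attract more fragments, and normalizing by the bacterial fragment count makes samples of different sequencing depth comparable.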
Subject(s)
Animal Husbandry/methods , Anti-Infective Agents/therapeutic use , Biota/drug effects , Drug Resistance, Bacterial , Drug Utilization/statistics & numerical data , Feces/microbiology , Genes, Bacterial , Animals , Computational Biology , Cross-Sectional Studies , Europe , Metagenomics , Swine
ABSTRACT
OBJECTIVES: To determine associations between farm- and flock-level antimicrobial usage (AMU), farm biosecurity status and the abundance of faecal antimicrobial resistance genes (ARGs) on broiler farms. METHODS: In the cross-sectional pan-European EFFORT study, conventional broiler farms were visited and faeces, AMU information and biosecurity records were collected. The resistomes of pooled faecal samples were determined by metagenomic analysis for 176 farms. A meta-analysis approach was used to relate total and class-specific ARGs (expressed as fragments per kb reference per million bacterial fragments, FPKM) to AMU (treatment incidence per DDD, TIDDDvet) per country and subsequently across all countries. In a similar way, the association between biosecurity status (Biocheck.UGent) and the resistome was explored. RESULTS: Sixty-six (38%) flocks did not report group treatments but showed a similar resistome composition and roughly similar ARG levels to antimicrobial-treated flocks. Nevertheless, we found significant positive associations between β-lactam, tetracycline, macrolide and lincosamide, trimethoprim and aminoglycoside antimicrobial flock treatments and ARG clusters conferring resistance to the same class. Similar associations were found with purchased products. In gene-level analysis for β-lactams and macrolides, lincosamides and streptogramins, a significant positive association was found with the most abundant gene clusters blaTEM and erm(B). Little evidence was found for associations with biosecurity. CONCLUSIONS: The faecal microbiome in European broilers contains a high diversity of ARGs, even in the absence of current antimicrobial selection pressure. Despite this, the relative abundance of genes and the composition of the resistome are positively related to AMU in European broiler farms for several antimicrobial classes.
Subject(s)
Anti-Infective Agents/therapeutic use , Bacteria/drug effects , Chickens/microbiology , Drug Resistance, Bacterial/drug effects , Metagenomics , Microbiota/drug effects , Animals , Anti-Infective Agents/pharmacology , Bacteria/genetics , Computational Biology , Cross-Sectional Studies , Europe , Farms , Feces/microbiology , Microbiota/genetics , Risk Factors
ABSTRACT
OBJECTIVES: Reliable methods for monitoring antimicrobial resistance (AMR) in livestock and other reservoirs are essential to understand the trends, transmission and importance of agricultural resistance. Quantification of AMR is mostly done using culture-based techniques, but metagenomic read mapping shows promise for quantitative resistance monitoring. METHODS: We evaluated the ability of: (i) MIC determination for Escherichia coli; (ii) cfu counting of E. coli; (iii) cfu counting of aerobic bacteria; and (iv) metagenomic shotgun sequencing to predict expected tetracycline resistance based on known antimicrobial consumption in 10 Danish integrated slaughter pig herds. In addition, we evaluated whether fresh or manure floor samples constitute suitable proxies for intestinal sampling, using cfu counting, qPCR and metagenomic shotgun sequencing. RESULTS: Metagenomic read-mapping outperformed cultivation-based techniques in terms of predicting expected tetracycline resistance based on antimicrobial consumption. Our metagenomic approach had sufficient resolution to detect antimicrobial-induced changes to individual resistance gene abundances. Pen floor manure samples were found to represent rectal samples well when analysed using metagenomics, as they contain the same DNA with the exception of a few contaminating taxa that proliferate in the extraintestinal environment. CONCLUSIONS: We present a workflow, from sampling to interpretation, showing how resistance monitoring can be carried out in swine herds using a metagenomic approach. We propose metagenomic sequencing should be part of routine livestock resistance monitoring programmes and potentially of integrated One Health monitoring in all reservoirs.
Subject(s)
Bacteria/drug effects , Bacteria/genetics , Feces/microbiology , Metagenomics/methods , Swine/microbiology , Tetracycline Resistance , Animals , Colony Count, Microbial , Denmark , Environmental Microbiology , Epidemiological Monitoring , Microbial Sensitivity Tests , Real-Time Polymerase Chain Reaction
ABSTRACT
IMPORTANCE: The Flankophile pipeline enables the analysis and visualization of flanking regions of prokaryotic sequences of interest on large data sets in one step and in a consistent manner. A specific tool for flanking region analysis with automated visualization has not been developed before, and Flankophile will make flanking region analysis easier and accessible to more people. Flankophile will be especially useful in the field of genomic epidemiology of acquired antimicrobial resistance genes. Here, information from flanking region sequences can be instrumental in rejecting or supporting the possibility of a recent common source of the same resistance gene found in different samples.
Subject(s)
Computational Biology , Genomics , Humans , Synteny , Genome , Prokaryotic Cells
ABSTRACT
The rapid spread of antimicrobial resistance (AMR) is a threat to global health, and the nature of co-occurring antimicrobial resistance genes (ARGs) may cause collateral AMR effects once antimicrobial agents are used. Therefore, it is essential to identify which pairs of ARGs co-occur. Given the wealth of next-generation sequencing data available in public repositories, we have investigated the correlation between ARG abundances in a collection of 214,095 metagenomic data sets. Using more than 6.76 × 10^8 read fragments aligned to acquired ARGs to infer pairwise correlation coefficients, we found that more ARGs correlated with each other in human and animal sampling origins than in soil and water environments. Furthermore, we argued that the correlations could serve as risk profiles of resistance co-occurring to critically important antimicrobials (CIAs). Using these profiles, we found evidence of several ARGs conferring resistance for CIAs being co-abundant, such as tetracycline ARGs correlating with most other forms of resistance. In conclusion, this study highlights the important ARG players indirectly involved in shaping the resistomes of various environments that can serve as monitoring targets in AMR surveillance programs. IMPORTANCE: Understanding the collateral effects happening in a resistome can reveal previously unknown links between antimicrobial resistance genes (ARGs). Through the analysis of pairwise ARG abundances in 214K metagenomic samples, we observed that the co-abundance is highly dependent on the environmental context and argue that these correlations can be used to show the risk of co-selection occurring in different settings.
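As a rough illustration of inferring pairwise correlation coefficients from ARG abundances, the sketch below computes a Spearman rank correlation for every pair of genes across samples. The abstract does not state which coefficient or normalization was used, so Spearman is an assumption made here for illustration only; the gene names and abundance values are made-up toy data.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input has zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def pairwise_correlations(abundances):
    """Map each gene pair (alphabetical order) to its Spearman coefficient."""
    genes = sorted(abundances)
    return {(a, b): spearman(abundances[a], abundances[b])
            for a, b in combinations(genes, 2)}


# Toy abundances for three genes across five hypothetical samples.
toy = {
    "tet(W)": [1.0, 2.0, 3.0, 4.0, 5.0],
    "erm(B)": [2.0, 4.0, 6.0, 8.0, 20.0],
    "sul1": [5.0, 4.0, 3.0, 2.0, 1.0],
}
correlations = pairwise_correlations(toy)
```

Rank-based correlation is used in this sketch because it is insensitive to the heavy-tailed abundance values typical of read-count data; metagenomic abundances are also compositional, which a real analysis would need to account for.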
Subject(s)
Anti-Bacterial Agents , Bacteria , Drug Resistance, Bacterial , Metagenomics , Humans , Anti-Bacterial Agents/pharmacology , Bacteria/genetics , Bacteria/drug effects , Bacteria/classification , Drug Resistance, Bacterial/genetics , Animals , Genes, Bacterial/genetics , Soil Microbiology , High-Throughput Nucleotide Sequencing , Metagenome/genetics
ABSTRACT
Our 24-month study used metagenomics to investigate antimicrobial resistance (AMR) abundance in raw sewage from wastewater treatment works (WWTWs) in two municipalities in Gauteng Province, South Africa. At the AMR class level, data showed similar trends at all WWTWs, with aminoglycoside, beta-lactam, sulfonamide and tetracycline resistance being most abundant. AMR abundance differed between municipalities, with Tshwane Metropolitan Municipality (TMM) WWTWs showing overall higher abundance of AMR than Ekurhuleni Metropolitan Municipality (EMM) WWTWs. Also, within each municipality, there were differing trends in AMR abundance. Notably, within TMM, certain AMR classes (macrolides and macrolides_streptogramin B) were in higher abundance at a WWTW serving an urban high-income area, while other AMR classes (aminoglycosides) were in higher abundance at a WWTW serving a semi-urban low-income area. At the AMR gene level, the sul1 gene (encoding sulfonamide resistance) was the most abundant in samples from all WWTWs. Following this, the next 14 most abundant genes encoded resistance to sulfonamides, aminoglycosides, macrolides, tetracyclines and beta-lactams. Notably, within TMM, some macrolide resistance genes (mefC, msrE, mphG and mphE) were in highest abundance at a WWTW serving an urban high-income area, while sul1, sul2 and tetC genes were in highest abundance at a WWTW serving a semi-urban low-income area. Differential abundance analysis of AMR genes at WWTWs, following stratification of data by season, showed some notable variance in six AMR genes, of which blaKPC-2 and blaKPC-34 genes showed the highest prevalence of seasonal abundance differences when comparing data within a WWTW. The general trend was towards higher abundances of AMR genes in colder seasons when comparing seasonal data within a WWTW. 
Our study investigated wastewater samples in only one province of South Africa, from WWTWs located within close proximity to one another. We would require a more widespread investigation at WWTWs distributed across all regions/provinces of South Africa, in order to describe a more comprehensive profile of AMR abundance across the country.
Subject(s)
Metagenomics , Sewage , South Africa , Sewage/microbiology , Metagenomics/methods , Anti-Bacterial Agents/pharmacology , Drug Resistance, Bacterial/genetics , Humans , Wastewater/microbiology
ABSTRACT
Metagenomic sequencing has proven to be a powerful tool in the monitoring of antimicrobial resistance (AMR). Here, we provide a comparative analysis of the resistome from pigs, poultry, veal calves, turkey, and rainbow trout, for a total of 538 herds across nine European countries. We calculated the effects of per-farm management practices and antimicrobial usage (AMU) on the resistome in pigs, broilers, and veal calves. We also provide an in-depth study of the associations between bacterial diversity, resistome diversity, and AMR abundances as well as co-occurrence analysis of bacterial taxa and antimicrobial resistance genes (ARGs) and the universality of the latter. The resistomes of veal calves and pigs clustered together, as did those of avian origin, while the rainbow trout resistome was different. Moreover, we identified clear core resistomes for each specific food-producing animal species. We identified positive associations between bacterial alpha diversity and both resistome alpha diversity and abundance. Network analyses revealed very few taxa-ARG associations in pigs but a large number for the avian species. Using updated reference databases and optimized bioinformatics, previously reported significant associations between AMU, biosecurity, and AMR in pig and poultry farms were validated. AMU is an important driver for AMR; however, our integrated analyses suggest that factors contributing to increased bacterial diversity might also be associated with higher AMR load. We also found that dispersal limitations of ARGs are shaping livestock resistomes, and future efforts to fight AMR should continue to emphasize biosecurity measures. IMPORTANCE: Understanding the occurrence, diversity, and drivers for antimicrobial resistance (AMR) is important to focus future control efforts. So far, almost all attempts to limit AMR in livestock have addressed antimicrobial consumption. 
We here performed an integrated analysis of the resistomes of five important farmed animal populations across Europe finding that the resistome and AMR levels are also shaped by factors related to bacterial diversity, as well as dispersal limitations. Thus, future studies and interventions aimed at reducing AMR should not only address antimicrobial usage but also consider other epidemiological and ecological factors.
Subject(s)
Anti-Infective Agents , Livestock , Swine , Animals , Cattle , Drug Resistance, Bacterial/genetics , Chickens/microbiology , Anti-Infective Agents/pharmacology , Bacteria/genetics
ABSTRACT
Sewage metagenomics has risen to prominence in urban population surveillance of pathogens and antimicrobial resistance (AMR). Unknown species with similarity to known genomes cause database bias in reference-based metagenomics. To improve surveillance, we seek to recover sewage genomes and develop a quantification and correlation workflow for these genomes and AMR over time. We use longitudinal sewage sampling in seven treatment plants from five major European cities to explore the utility of catch-all sequencing of these population-level samples. Using metagenomic assembly methods, we recover 2332 metagenome-assembled genomes (MAGs) from prokaryotic species, 1334 of which were previously undescribed. These genomes account for ~69% of sequenced DNA and provide insight into sewage microbial dynamics. Rotterdam (Netherlands) and Copenhagen (Denmark) show strong seasonal microbial community shifts, while Bologna and Rome (Italy) and Budapest (Hungary) have occasional blooms of Pseudomonas-dominated communities, accounting for up to ~95% of sample DNA. Seasonal shifts and blooms present challenges for effective sewage surveillance. We find that bacteria of known shared origin, like human gut microbiota, form communities, suggesting the potential for source-attributing novel species and their ARGs through network community analysis. This could significantly improve AMR tracking in urban environments.
Subject(s)
Bacteria , Metagenome , Metagenomics , Microbiota , Seasons , Sewage , Sewage/microbiology , Metagenomics/methods , Humans , Microbiota/genetics , Bacteria/genetics , Bacteria/classification , Bacteria/isolation & purification , Metagenome/genetics , Europe
ABSTRACT
Microbial communities have huge impacts on their ecosystems and local environments, spanning from marine and soil communities to the mammalian gut. Bacteriophages (phages) are important drivers of population control and diversity in the community, but our understanding of complex microbial communities is hampered by biased detection techniques. Metagenomics has provided a method of novel phage discovery independent of in vitro culturing techniques and has revealed a large proportion of understudied phages. Here, five jumbophage genomes that were previously assembled in silico from pig faecal metagenomes are detected and observed directly in their natural environment using a modified phageFISH approach combined with methods to decrease bias against large-sized phages (e.g., jumbophages). These phages are uncultured with unknown hosts. The specific phages were detected by PCR and fluorescent in situ hybridisation in their original faecal samples as well as across other faecal samples. Co-localisation of bacterial signals and phage signals allowed detection of the different stages of the phage life cycle. All phages displayed examples of early infection, advanced infection, burst, and free phages. To our knowledge, this is the first detection of jumbophages in faeces, which were investigated independently of culture, host identification, and size, and based solely on the genome sequence. This approach opens up opportunities for characterisation of novel in silico phages in vivo from a broad range of gut microbiomes.
Subject(s)
Bacteriophages , Microbiota , Animals , Bacteria/genetics , Bacteriophages/genetics , Fluorescence , Metagenome , Swine
ABSTRACT
BACKGROUND: Although antimicrobial use is a key selector for antimicrobial resistance, recent studies have suggested that the ecological context in which antimicrobials are used might provide important factors for the prediction of the emergence and spread of antimicrobial resistance. METHODS: We used 1547 variables from the World Bank dataset consisting of socioeconomic, developmental, health, and nutritional indicators; data from a global sewage-based study on antimicrobial resistance (abundance of antimicrobial resistance genes [ARGs]); and data on antimicrobial usage computed from the ECDC database and the IQVIA database. We characterised and built models predicting the global resistome at an antimicrobial class level. We used a generalised linear mixed-effects model to estimate the association between antimicrobial usage and ARG abundance in the sewage samples; a multivariate random forest model to build predictive models for each antimicrobial resistance class and to select the most important variables for ARG abundance; logistic regression models to test the association between the predicted country-level antimicrobial resistance abundance and the country-level proportion of clinical resistant bacterial isolates; finite mixture models to investigate geographical heterogeneities in the abundance of ARGs; and multivariate finite mixture models with covariates to investigate the effect of heterogeneity in the association between the most important variables and the observed ARG abundance across the different country subgroups. We compared our predictions with available clinical phenotypic data from the SENTRY Antimicrobial Surveillance Program from eight antimicrobial classes and 12 genera from 56 countries. 
FINDINGS: Using antimicrobial use data from between Jan 1, 2016, and Dec 31, 2019, we found that antimicrobial usage was not significantly associated with the global ARG abundance in sewage (p=0·72; incidence rate ratio 1·02 [95% CI 0·92-1·13]), whereas country-specific World Bank variables explained a large amount of variation. The importance of the World Bank variables differed between antimicrobial classes and countries. Generally, the estimated global ARG abundance was positively associated with the prevalence of clinical phenotypic resistance, with a strong association for bacterial groups in the human gut. The associations between bacterial groups and ARG abundance were positive and significantly different from zero for the aminoglycosides (three of the four taxa tested), β-lactams (all six microbial groups), fluoroquinolones (seven of nine microbial groups), glycopeptides (one microbial group tested), folate pathway antagonists (four of five microbial groups), and tetracyclines (two of nine microbial groups). INTERPRETATION: Metagenomic analysis of sewage is a robust approach for the surveillance of antimicrobial resistance in pathogens, especially for bacterial groups associated with the human gut. Additional studies on the associations between important socioeconomic, nutritional, and health factors and antimicrobial resistance should consider the variation in these associations between countries and antimicrobial classes. FUNDING: EU Horizon 2020 and Novo Nordisk Foundation.
Subject(s)
Anti-Bacterial Agents , Anti-Infective Agents , Humans , Anti-Bacterial Agents/pharmacology , Drug Resistance, Bacterial , Sewage/microbiology , Anti-Infective Agents/pharmacology , Bacteria/genetics , Socioeconomic Factors
ABSTRACT
Most investigations of geographical within-species differences are limited to a single species. Here, we investigate global differences for multiple bacterial species using a dataset of 757 metagenomic sewage samples from 101 countries worldwide. The within-species variations were determined by performing genome reconstructions, and the analyses were expanded by gene-focused approaches. Applying these methods, we recovered 3353 near-complete (NC) metagenome-assembled genomes (MAGs) encompassing 1439 different MAG species and found that within-species genomic variation was coherent with regional separation in 36% of the investigated species (12/33). Additionally, we found that variation of organelle genes correlated less with geography than that of metabolic and membrane genes, suggesting that the global differences of these species are caused by regional environmental selection rather than dissemination limitations. From the combination of the large and globally distributed dataset and in-depth analysis, we present a wide investigation of global within-species phylogeny of sewage bacteria. The global differences found here emphasize the need for worldwide data sets when making global conclusions.
Subject(s)
Bacteria , Sewage , Phylogeny , Sewage/microbiology , Bacteria/genetics , Cluster Analysis , Geography
ABSTRACT
Since the initial discovery of a mobilized colistin resistance gene (mcr-1), several other variants have been reported, some of which might have circulated for some time beforehand. Publicly available metagenomic data provide an opportunity to reanalyze samples to understand the evolutionary history of recently discovered antimicrobial resistance genes (ARGs). Here, we present a large-scale metagenomic study of 442 Tbp of sequencing reads from 214,095 samples to describe the dissemination and emergence of nine mcr gene variants (mcr-1 to mcr-9). Our results show that the dissemination of each variant is not uniform. Instead, the source and location play a role in the spread. However, the genomic context and the genes themselves remain primarily unchanged. We report evidence of new subvariants occurring in specific environments, such as a highly prevalent and new variant of mcr-9. This work emphasizes the importance of sharing genomic data for the surveillance of ARGs and for our understanding of antimicrobial resistance. IMPORTANCE The ever-growing collection of metagenomic samples available in public data repositories has the potential to reveal new details on the emergence and dissemination of mobilized colistin resistance genes. Our analysis of metagenomes deposited online in the last 10 years shows that the environmental distribution of mcr gene variants depends on sampling source and location, possibly leading to the emergence of new variants, although the contig on which the mcr genes were found remained consistent.
Subject(s)
Anti-Bacterial Agents , Colistin , Anti-Bacterial Agents/pharmacology , Metagenome , Drug Resistance, Bacterial , Genes, Bacterial
ABSTRACT
High-throughput genome sequencing technologies enable the investigation of complex genetic interactions, including the horizontal gene transfer of plasmids and bacteriophages. However, identifying these elements from assembled reads remains challenging due to genome sequence plasticity and the difficulty in assembling complete sequences. In this study, we developed a classifier, using random forest, to identify whether sequences originated from bacterial chromosomes, plasmids, or bacteriophages. The classifier was trained on a diverse collection of 23,211 chromosomal, plasmid, and bacteriophage sequences from hundreds of bacterial species. To adapt the classifier to incomplete sequences, each complete sequence was subsampled into 5,000-nucleotide fragments and further subdivided into k-mers. This three-class classifier succeeded in identifying chromosomes, plasmids, and bacteriophages using k-mer distributions of complete and partial genome sequences, including simulated metagenomic scaffolds, with a minimum area under the receiver operating characteristic curve (AUC) of 0.939. This classifier, implemented as SourceFinder, has been made available as an online web service to help the community with predicting the chromosomal, plasmid, and bacteriophage sources of assembled bacterial sequence data (https://cge.food.dtu.dk/services/SourceFinder/). IMPORTANCE Extra-chromosomal genes encoding antimicrobial resistance, metal resistance, and virulence provide selective advantages for bacterial survival under stress conditions and pose serious threats to human and animal health. These accessory genes can impact the composition of microbiomes by providing selective advantages to their hosts. Accurately identifying extra-chromosomal elements in genome sequence data is critical for understanding gene dissemination trajectories and taking preventative measures. Therefore, in this study, we developed a random forest classifier for identifying the source of bacterial chromosomal, plasmid, and bacteriophage sequences.
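The fragment-and-k-mer preprocessing described in this abstract can be illustrated with a minimal sketch. It covers only the feature-extraction step, not the random forest itself, and it is not SourceFinder's actual code. The k-mer size, the use of non-overlapping fragments, the dropping of a short trailing fragment, and all function names are assumptions made for illustration; only the 5,000-nucleotide fragment length comes from the abstract.

```python
from collections import Counter
from itertools import product

FRAGMENT_LENGTH = 5000  # fragment size stated in the abstract
K = 4                   # assumed k-mer size; the abstract does not give one

def fragments(sequence, length=FRAGMENT_LENGTH):
    """Split a sequence into consecutive, non-overlapping fragments of `length`.

    A trailing piece shorter than `length` is dropped (an assumption).
    """
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, length)]

def kmer_frequencies(fragment, k=K):
    """Return relative k-mer frequencies in a fixed order over the ACGT alphabet."""
    counts = Counter(fragment[i:i + k] for i in range(len(fragment) - k + 1))
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    # k-mers containing other symbols (for example N) are ignored.
    total = sum(counts[kmer] for kmer in vocabulary)
    return [counts[kmer] / total if total else 0.0 for kmer in vocabulary]

# Toy sequence for illustration only.
toy = "ACGT" * 3000
features = [kmer_frequencies(f) for f in fragments(toy)]
print(len(features), len(features[0]))
```

Each fragment yields one fixed-length feature vector, so complete genomes and partial scaffolds can be fed to the same classifier.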
Subject(s)
Bacteriophages , Genome, Bacterial , Humans , Bacteriophages/genetics , Plasmids/genetics , Chromosomes, Bacterial/genetics , Machine Learning
ABSTRACT
Plasmids play a major role in facilitating the spread of antimicrobial resistance between bacteria. Understanding the host range and dissemination trajectories of plasmids is critical for surveillance and prevention of antimicrobial resistance. Because plasmids are diverse and genetically plastic, identification of plasmid host ranges could be improved by using automated pattern detection methods rather than homology-based methods. In this study, we developed a method for predicting the host range of plasmids using machine learning, specifically random forests. We trained the models with 8,519 plasmids from 359 different bacterial species per taxonomic level; the models achieved Matthews correlation coefficients of 0.662 and 0.867 at the species and order levels, respectively. Our results suggest that despite the diverse nature and genetic plasticity of plasmids, our random forest model can accurately distinguish between plasmid hosts. This tool is available online through the Center for Genomic Epidemiology (https://cge.cbs.dtu.dk/services/PlasmidHostFinder/). IMPORTANCE Antimicrobial resistance is a global health threat to humans and animals, causing high mortality and morbidity while effectively ending decades of success in fighting against bacterial infections. Plasmids confer extra genetic capabilities to the host organisms through accessory genes that can encode antimicrobial resistance and virulence. In addition to vertical inheritance, plasmids can be transferred horizontally between bacterial taxa. Therefore, detection of the host range of plasmids is crucial for understanding and predicting the dissemination trajectories of extrachromosomal genes and bacterial evolution as well as taking effective countermeasures against antimicrobial resistance.
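The Matthews correlation coefficient (MCC) cited above can be computed directly from true and predicted labels. The sketch below uses the standard multiclass generalization, which reduces to the familiar binary MCC when there are two classes. The abstract does not say which variant or software the authors used, so this is an assumption, and the host labels in the usage lines are hypothetical.

```python
import math
from collections import Counter

def matthews_corrcoef(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from two label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    s = len(y_true)                                   # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))   # correctly predicted samples
    true_counts = Counter(y_true)                     # occurrences of each true class
    pred_counts = Counter(y_pred)                     # occurrences of each predicted class
    labels = set(true_counts) | set(pred_counts)
    numerator = c * s - sum(true_counts[k] * pred_counts[k] for k in labels)
    denominator = math.sqrt(
        (s * s - sum(v * v for v in pred_counts.values()))
        * (s * s - sum(v * v for v in true_counts.values()))
    )
    # By convention, return 0 when the coefficient is undefined.
    return numerator / denominator if denominator else 0.0

# Hypothetical host labels, for illustration only.
true_hosts = ["Escherichia", "Klebsiella", "Escherichia", "Salmonella"]
predicted = ["Escherichia", "Klebsiella", "Salmonella", "Salmonella"]
print(round(matthews_corrcoef(true_hosts, predicted), 3))
```

Unlike plain accuracy, the MCC stays informative when classes are imbalanced, which is likely relevant when some host taxa contribute far more plasmids than others.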