Results 1 - 20 of 64
1.
medRxiv ; 2024 Aug 28.
Article in English | MEDLINE | ID: mdl-39252917

ABSTRACT

Doxycycline post-exposure prophylaxis (doxy-PEP) for sexually transmitted bacterial infections reduces the risk of syphilis and chlamydia, but effectiveness against gonorrhea is variable, likely attributable to varying resistance rates. As doxy-PEP is incorporated into clinical practice, an urgent unanswered question is whether increased doxycycline use will drive tetracycline-class resistance in Neisseria gonorrhoeae. Here, we report an updated RT-PCR molecular diagnostic to detect the tetM gene that confers high-level tetracycline resistance in N. gonorrhoeae.

2.
Bioinformatics ; 40(9)2024 Sep 02.
Article in English | MEDLINE | ID: mdl-39298479

ABSTRACT

MOTIVATION: Metagenome-Assembled Genomes (MAGs) or Single-cell Amplified Genomes (SAGs) are often incomplete, with sequences missing due to errors in assembly or low coverage. This presents a particular challenge for the identification of true gene frequencies within a microbial population, as core genes missing in only a few assemblies will be mischaracterized by current pangenome approaches. RESULTS: Here, we present CELEBRIMBOR, a Snakemake pangenome analysis pipeline which uses a measure of genome completeness to automatically adjust the frequency threshold at which core genes are identified, enabling accurate core gene identification in MAGs and SAGs. AVAILABILITY AND IMPLEMENTATION: CELEBRIMBOR is published under open source Apache 2.0 licence at https://github.com/bacpop/CELEBRIMBOR and is available as a Docker container from this repository. Supplementary material is available in the online version of the article.
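As a rough illustration of how a completeness-aware core threshold could work, the sketch below lowers the required gene count when assemblies are incomplete. It is not CELEBRIMBOR's implementation: the binomial model, the function names, and the `core_freq` and `confidence` parameters are all assumptions made for this example.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def adjusted_core_count(n_genomes, completeness, core_freq=0.95, confidence=0.95):
    """Largest count k such that a true core gene is observed in at least
    k assemblies with probability >= confidence.

    Assumption (hypothetical model): each assembly independently retains
    a gene with probability equal to the mean genome completeness.
    """
    p = core_freq * completeness  # chance a core gene is seen in one assembly
    tail = 0.0
    for k in range(n_genomes, -1, -1):
        tail += binom_pmf(k, n_genomes, p)  # tail = P(X >= k)
        if tail >= confidence:
            return k
    return 0


complete = adjusted_core_count(100, completeness=1.0, core_freq=1.0)
incomplete = adjusted_core_count(100, completeness=0.9, core_freq=1.0)
print(complete, incomplete)
```

With fully complete genomes the threshold stays at every genome; at 90% completeness it drops below 90 of 100, so a core gene missing from a few assemblies is still counted as core.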


Subject(s)
Metagenome , Software , Metagenomics/methods
3.
Lancet Microbe ; 5(8): 100847, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38851206

ABSTRACT

BACKGROUND: The antibiotic bedaquiline is a key component of new WHO regimens for drug-resistant tuberculosis; however, predicting bedaquiline resistance from bacterial genotypes remains challenging. We aimed to understand the genetic mechanisms of bedaquiline resistance by analysing Mycobacterium tuberculosis isolates from South Africa. METHODS: For this genomic analysis, we conducted whole-genome sequencing of Mycobacterium tuberculosis samples collected at two referral laboratories in Cape Town and Johannesburg, covering regions of South Africa with a high prevalence of tuberculosis. We used the tool ARIBA to measure the status of predefined genes that are associated with bedaquiline resistance. To produce a broad genetic landscape of M tuberculosis in South Africa, we extended our analysis to include all publicly available isolates from the European Nucleotide Archive, including isolates obtained by the CRyPTIC consortium, for which minimum inhibitory concentrations of bedaquiline were available. FINDINGS: Between Jan 10, 2019, and July 22, 2020, we sequenced 505 M tuberculosis isolates from 461 patients. Of the 64 isolates with mutations within the mmpR5 regulatory gene, we found 53 (83%) had independent acquisition of 31 different mutations, with a particular enrichment of truncated MmpR5 in bedaquiline-resistant isolates resulting from either frameshift mutations or the introduction of an insertion element. Truncations occurred across three M tuberculosis lineages, and were present in 66% of bedaquiline-resistant isolates. Although the distributions overlapped, the median minimum inhibitory concentration of bedaquiline was 0·25 mg/L (IQR 0·12-0·25) in mmpR5-disrupted isolates, compared with 0·06 mg/L (0·03-0·06) in wild-type M tuberculosis. INTERPRETATION: Reduction in the susceptibility of M tuberculosis to bedaquiline has evolved repeatedly across the phylogeny.
In our data, we see no evidence that this reduction has led to the spread of a successful strain in South Africa. Binary phenotyping based on the bedaquiline breakpoint might be inappropriate to monitor resistance to this drug. We recommend the use of minimum inhibitory concentrations in addition to MmpR5 truncation screening to identify moderate increases in resistance to bedaquiline. FUNDING: US Centers for Disease Control and Prevention.


Subject(s)
Antitubercular Agents , Bacterial Proteins , Diarylquinolines , Microbial Sensitivity Tests , Mycobacterium tuberculosis , Tuberculosis, Multidrug-Resistant , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/drug effects , South Africa/epidemiology , Diarylquinolines/pharmacology , Humans , Antitubercular Agents/pharmacology , Tuberculosis, Multidrug-Resistant/microbiology , Tuberculosis, Multidrug-Resistant/genetics , Tuberculosis, Multidrug-Resistant/epidemiology , Bacterial Proteins/genetics , Whole Genome Sequencing , Mutation , Genomics , Drug Resistance, Bacterial/genetics
4.
Diagn Microbiol Infect Dis ; 109(2): 116249, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38537504

ABSTRACT

Targeted Next Generation Sequencing (tNGS) and Whole Genome Sequencing (WGS) are increasingly used for genotypic drug susceptibility testing (gDST) of Mycobacterium tuberculosis. Thirty-two multidrug-resistant and 40 drug-susceptible isolates from Madagascar were tested with Deeplex® Myc-TB and WGS using the Mykrobe analysis pipeline. Sixty-four of 72 (89 %) yielded concordant categorical gDST results for drugs tested by both assays. Mykrobe didn't detect pncA K96T, pncA Q141P, pncA H51P, pncA H82R, rrs C517T and rpsL K43R mutations, which were identified as minority variants in corresponding isolates by tNGS. One discrepancy (rrs C517T) was associated with insufficient sequencing depth on WGS. Deeplex® Myc-TB didn't detect inhA G-154A which isn't covered by the assay's amplification targets. Despite those targets being included in the Deeplex® Myc-TB assay, a pncA T47A mutation and a deletion in gid were each missed in one isolate. The evaluated WGS and tNGS gDST assays show high but imperfect concordance.


Subject(s)
Antitubercular Agents , Genotype , High-Throughput Nucleotide Sequencing , Microbial Sensitivity Tests , Mycobacterium tuberculosis , Tuberculosis, Multidrug-Resistant , Whole Genome Sequencing , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/drug effects , Antitubercular Agents/pharmacology , Microbial Sensitivity Tests/methods , Humans , High-Throughput Nucleotide Sequencing/methods , Tuberculosis, Multidrug-Resistant/microbiology , Drug Resistance, Multiple, Bacterial/genetics , Madagascar , Genome, Bacterial/genetics , Mutation , Bacterial Proteins/genetics , Genotyping Techniques/methods
5.
PLoS Biol ; 22(3): e3002507, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38451924

ABSTRACT

While the malaria parasite Plasmodium falciparum has low average genome-wide diversity levels, likely due to its recent introduction from a gorilla-infecting ancestor (approximately 10,000 to 50,000 years ago), some genes display extremely high diversity levels. In particular, certain proteins expressed on the surface of human red blood cell-infecting merozoites (merozoite surface proteins (MSPs)) possess exactly 2 deeply diverged lineages that have seemingly not recombined. While of considerable interest, the evolutionary origin of this phenomenon remains unknown. In this study, we analysed the genetic diversity of 2 of the most variable MSPs, DBLMSP and DBLMSP2, which are paralogs (descended from an ancestral duplication). Despite thousands of available Illumina WGS datasets from malaria-endemic countries, diversity in these genes has been hard to characterise as reads containing highly diverged alleles completely fail to align to the reference genome. To solve this, we developed a pipeline leveraging genome graphs, enabling us to genotype them at high accuracy and completeness. Using our newly resolved sequences, we found that both genes exhibit 2 deeply diverged lineages in a specific protein domain (DBL) and that one of the 2 lineages is shared across the genes. We identified clear evidence of nonallelic gene conversion between the 2 genes as the likely mechanism behind sharing, leading us to propose that gene conversion between diverged paralogs, and not recombination suppression, can generate this surprising genealogy; a model that is furthermore consistent with high diversity levels in these 2 genes despite the strong historical P. falciparum transmission bottleneck.


Subject(s)
Hominidae , Malaria, Falciparum , Malaria , Parasites , Animals , Humans , Plasmodium falciparum/metabolism , Parasites/metabolism , Gene Conversion , Antigens, Surface , Malaria/parasitology , Protozoan Proteins/genetics , Protozoan Proteins/metabolism , Genetic Variation
6.
Microb Genom ; 10(2)2024 Feb.
Article in English | MEDLINE | ID: mdl-38358325

ABSTRACT

The COVID-19 pandemic has seen large-scale pathogen genomic sequencing efforts, becoming part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and facilitated vaccine development and drug reuse studies and design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. In this paper we describe (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCiD), (3) the systematic analysis of these datasets, at scale via the SARS-CoV-2 Data Hubs as well as (4) lessons learnt. This paper describes a component of the Platform, the SARS-CoV-2 Data Hubs, which enable the extension and set up of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Pandemics , COVID-19/epidemiology , Genomics , Information Dissemination
7.
Microb Genom ; 9(8)2023 08.
Article in English | MEDLINE | ID: mdl-37552534

ABSTRACT

Tuberculosis is a global pandemic disease with a rising burden of antimicrobial resistance. As a result, the World Health Organization (WHO) has a goal of enabling universal access to drug susceptibility testing (DST). Given the slowness of and infrastructure requirements for phenotypic DST, whole-genome sequencing, followed by genotype-based prediction of DST, now provides a route to achieving this. Since a central component of genotypic DST is to detect the presence of any known resistance-causing mutations, a natural approach is to use a reference graph that allows encoding of known variation. We have developed DrPRG (Drug resistance Prediction with Reference Graphs) using the bacterial reference graph method Pandora. First, we outline the construction of a Mycobacterium tuberculosis drug resistance reference graph. The graph is built from a global dataset of isolates with varying drug susceptibility profiles, thus capturing common and rare resistance- and susceptible-associated haplotypes. We benchmark DrPRG against the existing graph-based tool Mykrobe and the haplotype-based approach of TBProfiler using 44 709 and 138 publicly available Illumina and Nanopore samples with associated phenotypes. We find that DrPRG has significantly improved sensitivity and specificity for some drugs compared to these tools, with no significant decreases. It uses significantly less computational memory than both tools, and provides significantly faster runtimes, except when runtime is compared to Mykrobe with Nanopore data. We discover and discuss novel insights into resistance-conferring variation for M. tuberculosis - including deletion of genes katG and pncA - and suggest mutations that may warrant reclassification as associated with resistance.
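The core idea of genotypic DST described here, detecting known resistance-causing mutations and then scoring predictions against phenotypes, can be sketched as a flat catalogue look-up. This is a simplification and not how DrPRG works internally (it genotypes against a reference graph); the two-entry catalogue, the sample data, and the function names are illustrative assumptions only.

```python
# Illustrative two-entry catalogue: (gene, amino-acid change) -> drug.
# A real catalogue holds far more entries; this one is a toy assumption.
CATALOGUE = {
    ("rpoB", "S450L"): "rifampicin",
    ("katG", "S315T"): "isoniazid",
}


def predict_resistance(variants):
    """Return the set of drugs with at least one catalogued variant present."""
    return {CATALOGUE[v] for v in variants if v in CATALOGUE}


def sensitivity_specificity(samples, drug):
    """samples: list of (variants, set of phenotypically resistant drugs)."""
    tp = fn = tn = fp = 0
    for variants, resistant_to in samples:
        predicted = drug in predict_resistance(variants)
        actual = drug in resistant_to
        if predicted and actual:
            tp += 1
        elif actual:
            fn += 1  # resistant phenotype with no catalogued variant
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical samples, made up for this example.
samples = [
    ({("rpoB", "S450L")}, {"rifampicin"}),  # true positive
    (set(), {"rifampicin"}),                # false negative
    (set(), set()),                         # true negative
    ({("katG", "S315T")}, {"isoniazid"}),   # rifampicin true negative
]
sens, spec = sensitivity_specificity(samples, "rifampicin")
print(sens, spec)
```

Sensitivity is limited by resistant samples that carry no catalogued variant, which is why catalogue completeness matters for every tool benchmarked in this way.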


Subject(s)
Mycobacterium tuberculosis , Tuberculosis, Multidrug-Resistant , Tuberculosis , Humans , Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Tuberculosis, Multidrug-Resistant/genetics , Microbial Sensitivity Tests , Drug Resistance, Multiple, Bacterial/genetics , Tuberculosis/microbiology
8.
Microb Genom ; 9(7)2023 07.
Article in English | MEDLINE | ID: mdl-37405394

ABSTRACT

Healthcare-associated infections (HCAIs) affect the most vulnerable people in society and are increasingly difficult to treat in the face of mounting antimicrobial resistance (AMR). Routine surveillance represents an effective way of understanding the circulation and burden of bacterial resistance and transmission in hospital settings. Here, we used whole-genome sequencing (WGS) to retrospectively analyse carbapenemase-producing Gram-negative bacteria from a single hospital in the UK over 6 years (n=165). We found that the vast majority of isolates were either hospital-onset (HAI) or HCAI. Most carbapenemase-producing organisms were carriage isolates, with 71 % isolated from screening (rectal) swabs. Using WGS, we identified 15 species, the most common being Escherichia coli and Klebsiella pneumoniae. Only one significant clonal outbreak occurred during the study period and involved a sequence type (ST)78 K. pneumoniae carrying bla NDM-1 on an IncFIB/IncHI1B plasmid. Contextualization with public data revealed little evidence of this ST outside of the study hospital, warranting ongoing surveillance. Carbapenemase genes were found on plasmids in 86 % of isolates, the most common types being bla NDM- and bla OXA-type alleles. Using long-read sequencing, we determined that approximately 30 % of isolates with carbapenemase genes on plasmids had acquired them via horizontal transmission. Overall, a national framework to collate more contextual genomic data, particularly for plasmids and resistant bacteria in the community, is needed to better understand how carbapenemase genes are transmitted in the UK.


Subject(s)
Hospitals , Klebsiella pneumoniae , Humans , Retrospective Studies , Plasmids/genetics , Klebsiella pneumoniae/genetics , Escherichia coli/genetics , Genomics , United Kingdom/epidemiology
9.
bioRxiv ; 2023 Apr 18.
Article in English | MEDLINE | ID: mdl-37131636

ABSTRACT

Comprehensive collections approaching millions of sequenced genomes have become central information sources in the life sciences. However, the rapid growth of these collections makes it effectively impossible to search these data using tools such as BLAST and its successors. Here, we present a technique called phylogenetic compression, which uses evolutionary history to guide compression and efficiently search large collections of microbial genomes using existing algorithms and data structures. We show that, when applied to modern diverse collections approaching millions of genomes, lossless phylogenetic compression improves the compression ratios of assemblies, de Bruijn graphs, and k-mer indexes by one to two orders of magnitude. Additionally, we develop a pipeline for a BLAST-like search over these phylogeny-compressed reference data, and demonstrate it can align genes, plasmids, or entire sequencing experiments against all sequenced bacteria until 2019 on ordinary desktop computers within a few hours. Phylogenetic compression has broad applications in computational biology and may provide a fundamental design principle for future genomics infrastructure.
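A toy experiment can show why ordering genomes by evolutionary relatedness helps a general-purpose compressor. The sketch below is not the authors' pipeline: it uses simulated sequences, made-up clade sizes, and zlib (which only looks back about 32 KiB) as a stand-in compressor, all of which are assumptions for illustration.

```python
import random
import zlib


def mutate(seq, n_mutations, rng):
    """Return a copy of seq with n_mutations random base substitutions."""
    bases = list(seq)
    for pos in rng.sample(range(len(bases)), n_mutations):
        bases[pos] = rng.choice([b for b in "ACGT" if b != bases[pos]])
    return "".join(bases)


def make_collection(n_clades=8, per_clade=4, length=20_000, seed=0):
    """Simulate clades of near-identical genomes: (clade, member, sequence)."""
    rng = random.Random(seed)
    genomes = []
    for clade in range(n_clades):
        ancestor = "".join(rng.choice("ACGT") for _ in range(length))
        for member in range(per_clade):
            genomes.append((clade, member, mutate(ancestor, 20, rng)))
    return genomes


def compressed_size(seqs):
    """Size in bytes of the concatenated sequences after zlib compression."""
    return len(zlib.compress("".join(seqs).encode(), 9))


genomes = make_collection()
# Relatives adjacent, as a phylogeny-guided ordering would place them.
ordered = [g[2] for g in sorted(genomes, key=lambda g: (g[0], g[1]))]
# Relatives far apart: round-robin across clades.
interleaved = [g[2] for g in sorted(genomes, key=lambda g: (g[1], g[0]))]

ordered_size = compressed_size(ordered)
interleaved_size = compressed_size(interleaved)
print(ordered_size, interleaved_size)
```

The same bytes compress much better when close relatives sit next to each other, because each genome can be encoded largely as matches against its neighbour.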

10.
Lancet Microbe ; 4(5): e358-e368, 2023 05.
Article in English | MEDLINE | ID: mdl-37003285

ABSTRACT

BACKGROUND: Bedaquiline is a core drug for the treatment of multidrug-resistant tuberculosis; however, the understanding of resistance mechanisms is poor, which is hampering rapid molecular diagnostics. Some bedaquiline-resistant mutants are also cross-resistant to clofazimine. To decipher bedaquiline and clofazimine resistance determinants, we combined experimental evolution, protein modelling, genome sequencing, and phenotypic data. METHODS: For this in-vitro and in-silico data analysis, we used a novel in-vitro evolutionary model using subinhibitory drug concentrations to select bedaquiline-resistant and clofazimine-resistant mutants. We determined bedaquiline and clofazimine minimum inhibitory concentrations and did Illumina and PacBio sequencing to characterise selected mutants and establish a mutation catalogue. This catalogue also includes phenotypic and genotypic data of a global collection of more than 14 000 clinical Mycobacterium tuberculosis complex isolates, and publicly available data. We investigated variants implicated in bedaquiline resistance by protein modelling and dynamic simulations. FINDINGS: We discerned 265 genomic variants implicated in bedaquiline resistance, with 250 (94%) variants affecting the transcriptional repressor (Rv0678) of the MmpS5-MmpL5 efflux system. We identified 40 new variants in vitro, and a new bedaquiline resistance mechanism caused by a large-scale genomic rearrangement. Additionally, we identified in vitro 15 (7%) of 208 mutations found in clinical bedaquiline-resistant isolates. From our in-vitro work, we detected 14 (16%) of 88 mutations so far identified as being associated with clofazimine resistance and also seen in clinically resistant strains, and catalogued 35 new mutations. Structural modelling of Rv0678 showed four major mechanisms of bedaquiline resistance: impaired DNA binding, reduction in protein stability, disruption of protein dimerisation, and alteration in affinity for its fatty acid ligand. 
INTERPRETATION: Our findings advance the understanding of drug resistance mechanisms in M tuberculosis complex strains. We have established an extended mutation catalogue, comprising variants implicated in resistance and susceptibility to bedaquiline and clofazimine. Our data emphasise that genotypic testing can delineate clinical isolates with borderline phenotypes, which is essential for the design of effective treatments. FUNDING: Leibniz ScienceCampus Evolutionary Medicine of the Lung, Deutsche Forschungsgemeinschaft, Research Training Group 2501 TransEvo, Rhodes Trust, Stanford University Medical Scientist Training Program, National Institute for Health and Care Research Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Bill & Melinda Gates Foundation, Wellcome Trust, and Marie Sklodowska-Curie Actions.


Subject(s)
Clofazimine , Mycobacterium tuberculosis , Clofazimine/pharmacology , Clofazimine/therapeutic use , Mycobacterium tuberculosis/genetics , Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Diarylquinolines/pharmacology , Diarylquinolines/therapeutic use
11.
J Clin Microbiol ; 61(3): e0157822, 2023 03 23.
Article in English | MEDLINE | ID: mdl-36815861

ABSTRACT

Universal access to drug susceptibility testing for newly diagnosed tuberculosis patients is recommended. Access to culture-based diagnostics remains limited, and targeted molecular assays are vulnerable to emerging resistance mutations. Improved protocols for direct-from-sputum Mycobacterium tuberculosis sequencing would accelerate access to comprehensive drug susceptibility testing and molecular typing. We assessed a thermo-protection buffer-based direct-from-sample M. tuberculosis whole-genome sequencing protocol. We prospectively analyzed 60 acid-fast bacilli smear-positive clinical sputum samples in India and Madagascar. Samples spanning a range of semiquantitative smear-positivity grades were included. Sequencing was performed using Illumina and MinION (monoplex and multiplex) technologies. We measured the impact of bacterial inoculum and sequencing platforms on genomic read depth, drug susceptibility prediction performance, and typing accuracy. M. tuberculosis was identified by direct sputum sequencing in 45/51 samples using Illumina, 34/38 were identified using MinION-monoplex sequencing, and 20/24 were identified using MinION-multiplex sequencing. The fraction of M. tuberculosis reads from MinION sequencing was lower than from Illumina, but monoplexing grade 3+ samples on MinION produced higher read depth than Illumina (P < 0.05) and MinION multiplexing (P < 0.01). No significant differences in sensitivity and specificity of drug susceptibility predictions were seen across sequencing modalities or within each technology when stratified by smear grade. Illumina sequencing from sputum accurately identified 1/8 (rifampin) and 6/12 (isoniazid) resistant samples, compared to 2/3 (rifampin) and 3/6 (isoniazid) accurately identified with Nanopore monoplex. Lineage agreement levels between direct and culture-based sequencing were 85% (MinION-monoplex), 88% (Illumina), and 100% (MinION-multiplex). M. tuberculosis direct-from-sample whole-genome sequencing remains challenging. Improved and affordable sample treatment protocols are needed prior to clinical deployment.


Subject(s)
Mycobacterium tuberculosis , Tuberculosis, Multidrug-Resistant , Tuberculosis , Humans , Mycobacterium tuberculosis/genetics , Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Isoniazid , Rifampin , Microbial Sensitivity Tests , Sputum/microbiology , Tuberculosis/diagnosis , Tuberculosis/drug therapy , Genomics , Tuberculosis, Multidrug-Resistant/microbiology
12.
Microbiol Spectr ; : e0282622, 2023 Feb 14.
Article in English | MEDLINE | ID: mdl-36786614

ABSTRACT

Outbreak strains of Mycobacterium tuberculosis are promising candidates as targets in the search for intrinsic determinants of transmissibility, as they are responsible for many cases with sustained transmission; however, the use of low-resolution typing methods and restricted geographical investigations represent flaws in assessing the success of long-lived outbreak strains. We can now address the nature of outbreak strains by combining large genomic data sets and phylodynamic approaches. We retrospectively sequenced the whole genome of representative samples assigned to an outbreak circulating in the Canary Islands (the GC strain) since 1993, which accounts for ~20% of local tuberculosis cases. We selected a panel of specific single nucleotide polymorphism (SNP) markers for an in-silico search for additional outbreak-related sequences within publicly available tuberculosis genomic data. Using this information, we inferred the origin, spread, and epidemiological parameters of the GC strain. Our approach allowed us to accurately trace the historical and more recent dispersion of the GC strain. We provide evidence that the strain has been highly successful within the Canarian archipelago but has expanded little abroad. Estimates of epidemiological parameters from genomic data are inconsistent with a distinctive biology of the GC strain. With the increasing availability of genomic data allowing for the accurate inference of strain spread and critical epidemiological parameters, we can now revisit the link between Mycobacterium tuberculosis genotypes and transmission, as is routinely carried out for SARS-CoV-2 variants of concern. We demonstrate that social determinants rather than intrinsically higher bacterial transmissibility better explain the success of the GC strain. Importantly, our approach can be used to trace and characterize strains of interest worldwide. IMPORTANCE Infectious disease outbreaks represent a significant problem for public health.
Tracing outbreak expansion and understanding the main factors behind emergence and persistence remain critical to effective disease control. Our study allows researchers and public health authorities to use Whole-Genome Sequencing-based methods to trace outbreaks, and shows how available epidemiological information helps to evaluate the factors underpinning outbreak persistence. Taking advantage of all the freely available information placed in public repositories, researchers can accurately establish the expansion of an outbreak beyond original boundaries, and determine the potential risk of a strain to inform health authorities which, in turn, can define target strategies to mitigate expansion and persistence. Finally, we show the need to evaluate strain transmissibility in different geographic contexts to unequivocally associate spread to local or pathogenic factors, an important lesson taken from genomic surveillance of SARS-CoV-2.

13.
Lancet Microbe ; 4(2): e84-e92, 2023 02.
Article in English | MEDLINE | ID: mdl-36549315

ABSTRACT

BACKGROUND: Mycobacterium tuberculosis whole-genome sequencing (WGS) has been widely used for genotypic drug susceptibility testing (DST) and outbreak investigation. For both applications, Illumina technology is used by most public health laboratories; however, Nanopore technology developed by Oxford Nanopore Technologies has not been thoroughly evaluated. The aim of this study was to determine whether Nanopore sequencing data can provide equivalent information to Illumina for transmission clustering and genotypic DST for M tuberculosis. METHODS: In this genomic analysis, we analysed 151 M tuberculosis isolates from Madagascar, South Africa, and England, which were collected between 2011 and 2018, using phenotypic DST and matched Illumina and Nanopore data. Illumina sequencing was done with the MiSeq, HiSeq 2500, or NextSeq500 platforms and Nanopore sequencing was done on the MinION or GridION platforms. Using highly reliable PacBio sequencing assemblies and pairwise distance correlation between Nanopore and Illumina data, we optimise Nanopore variant filters for detecting single-nucleotide polymorphisms (SNPs; using BCFtools software). We then used those SNPs to compare transmission clusters identified by Nanopore with the currently used UK Health Security Agency Illumina pipeline (COMPASS). We compared Illumina and Nanopore WGS-based DST predictions using the Mykrobe software and mutation catalogue. FINDINGS: The Nanopore BCFtools pipeline identified SNPs with a median precision of 99·3% (IQR 99·1-99·6) and recall of 90·2% (88·1-94·2) compared with a precision of 99·6% (99·4-99·7) and recall of 91·9% (87·6-98·6) using the Illumina COMPASS pipeline. Using a threshold of 12 SNPs for putative transmission clusters, Illumina identified 98 isolates as unrelated and 53 as belonging to 19 distinct clusters (size range 2-7). 
Nanopore reproduced 15 out of 19 clusters perfectly; two clusters were merged into one cluster, one cluster had a single sample missing, and one cluster had an additional sample adjoined. Illumina-based clusters were also closely replicated using a five SNP threshold and clustering accuracy was maintained using mixed Illumina and Nanopore datasets. Genotyping resistance variants with Nanopore was highly concordant with Illumina, having zero discordant SNPs across more than 3000 SNPs and four insertions or deletions (indels), across 60 000 indels. INTERPRETATION: Illumina and Nanopore technologies can be used independently or together by public health laboratories performing M tuberculosis genotypic DST and outbreak investigations. As a result, clinical and public health institutions making decisions on which sequencing technology to adopt for tuberculosis can base the choice on cost (which varies by country), batching, and turnaround time. FUNDING: Academy for Medical Sciences, Oxford Wellcome Institutional Strategic Support Fund, and the Swiss South Africa Joint Research Award (Swiss National Science Foundation and South African National Research Foundation).
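The SNP-threshold clustering used above for putative transmission clusters can be sketched as single-linkage grouping on pairwise SNP distances. This is a minimal illustration, not the COMPASS or BCFtools pipeline: the single-linkage rule, the inclusive threshold, the toy aligned sequences, and the function names are assumptions.

```python
from itertools import combinations


def snp_distance(a, b):
    """Number of differing positions between two aligned sequences."""
    return sum(1 for x, y in zip(a, b) if x != y)


def transmission_clusters(samples, threshold=12):
    """Group samples linked by chains of pairs within `threshold` SNPs.

    samples: dict mapping sample name to an aligned sequence string.
    Returns clusters of two or more samples; singletons are treated as unrelated.
    """
    parent = {name: name for name in samples}

    def find(x):
        # Union-find root look-up with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(samples, 2):
        if snp_distance(samples[a], samples[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in samples:
        clusters.setdefault(find(name), set()).add(name)
    return [members for members in clusters.values() if len(members) > 1]


# Toy 40-base alignment, made up for this example.
toy = {
    "iso1": "A" * 40,
    "iso2": "C" * 3 + "A" * 37,   # 3 SNPs from iso1
    "iso3": "C" * 13 + "A" * 27,  # 13 SNPs from iso1, 10 from iso2
    "iso4": "A" * 10 + "G" * 30,  # at least 30 SNPs from every other isolate
}
print(transmission_clusters(toy, threshold=12))
print(transmission_clusters(toy, threshold=5))
```

Under single linkage, iso3 joins the cluster through iso2 even though it is 13 SNPs from iso1, which is one reason a small change in SNP calls can merge or split clusters between platforms.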


Subject(s)
Mycobacterium tuberculosis , Nanopore Sequencing , Tuberculosis , Humans , Mycobacterium tuberculosis/genetics , Microbial Sensitivity Tests , Sequence Analysis, DNA , Genomics , Tuberculosis/diagnosis , Tuberculosis/drug therapy , Tuberculosis/epidemiology , Disease Outbreaks
14.
Lancet Microbe ; 3(11): e857-e866, 2022 11.
Article in English | MEDLINE | ID: mdl-36206776

ABSTRACT

BACKGROUND: Viet Nam has high rates of antimicrobial resistance (AMR) but little capacity for genomic surveillance. This study used whole genome sequencing to examine the prevalence and transmission of three key AMR pathogens in two intensive care units (ICUs) in Hanoi, Viet Nam. METHODS: A prospective surveillance study of all adults admitted to ICUs at the National Hospital for Tropical Diseases and Bach Mai Hospital was done between June 19, 2017, and Jan 16, 2018. Clinical and environmental samples were cultured on selective media, characterised with MALDI TOF mass spectrometry, and sequenced with Illumina. Phylogenies based on the de-novo assemblies (SPAdes) were constructed with MAFFT (PARsnp), Gubbins, and RAxML. Resistance genes were detected with Abricate against the US National Center for Biotechnology Information database. FINDINGS: 3153 Escherichia coli, Klebsiella pneumoniae, and Acinetobacter baumannii isolates from 369 patients were analysed. Phylogenetic analysis revealed predominant lineages within A baumannii (global clone 2, sequence types ST2 and ST571) and K pneumoniae (ST15, ST16, ST656, ST11, and ST147) isolates. Isolation from stool was most common with E coli (87·0%) followed by K pneumoniae (62·5%). Of the E coli, 85·0% carried a blaCTX-M variant, while 81·8% of K pneumoniae isolates carried blaNDM (54·4%), or blaKPC (45·1%), or both. Transmission analysis with single nucleotide polymorphisms identified 167 clusters involving 251 (68%) of 369 patients, in some cases involving patients from both ICUs. There were no clear differences between the lineages or AMR genes recovered between the two ICUs. INTERPRETATION: This study represents the largest prospective surveillance study of key AMR pathogens in Vietnamese ICUs. Clusters of closely related isolates in patients across both ICUs suggest recent transmission before ICU admission in other health-care settings or in the community.
FUNDING: UK Medical Research Council Newton Fund, Viet Nam Ministry of Science and Technology, Wellcome Trust, Academy of Medical Sciences, Health Foundation, and UK National Institute for Health and Care Research Cambridge Biomedical Research Centre.


Subject(s)
Acinetobacter baumannii , Cross Infection , Adult , Humans , Klebsiella pneumoniae/genetics , Acinetobacter baumannii/genetics , Escherichia coli/genetics , Phylogeny , Prospective Studies , Vietnam/epidemiology , Microbial Sensitivity Tests , Cross Infection/epidemiology , Intensive Care Units , Genomics
15.
Genome Biol ; 23(1): 147, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35791022

ABSTRACT

There are many short-read variant-calling tools, with different strengths and weaknesses. We present a tool, Minos, which combines outputs from arbitrary variant callers, increasing recall without loss of precision. We benchmark on 62 samples from three bacterial species and an outbreak of 385 Mycobacterium tuberculosis samples. Minos also enables joint genotyping; we demonstrate on a large (N=13k) M. tuberculosis cohort, building a map of non-synonymous SNPs and indels in a region where all such variants are assumed to cause rifampicin resistance. We quantify the correlation with phenotypic resistance and then replicate in a second cohort (N=10k).


Subject(s)
High-Throughput Nucleotide Sequencing , Mycobacterium tuberculosis , Genome, Bacterial , Genotype , Humans , INDEL Mutation , Mycobacterium tuberculosis/genetics , Polymorphism, Single Nucleotide
16.
Bioinformatics ; 38(12): 3291-3293, 2022 06 13.
Article in English | MEDLINE | ID: mdl-35551365

ABSTRACT

SUMMARY: Viral sequence data from clinical samples frequently contain contaminating human reads, which must be removed prior to sharing for legal and ethical reasons. To enable host read removal for SARS-CoV-2 sequencing data on low-specification laptops, we developed ReadItAndKeep, a fast lightweight tool for Illumina and nanopore data that only keeps reads matching the SARS-CoV-2 genome. Peak RAM usage is typically below 10 MB, and runtime less than 1 min. We show that by excluding the polyA tail from the viral reference, ReadItAndKeep prevents bleed-through of human reads, whereas mapping to the human genome lets some reads escape. We believe our test approach (including all possible reads from the human genome, human samples from each of the 26 populations in the 1000 genomes data and a diverse set of SARS-CoV-2 genomes) will also be useful for others. AVAILABILITY AND IMPLEMENTATION: ReadItAndKeep is implemented in C++, released under the MIT license, and available from https://github.com/GenomePathogenAnalysisService/read-it-and-keep. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
COVID-19 , Software , Humans , Sequence Analysis, DNA , SARS-CoV-2/genetics , Decontamination , High-Throughput Nucleotide Sequencing , Genome, Human
17.
Lancet Microbe ; 3(4): e265-e273, 2022 04.
Article in English | MEDLINE | ID: mdl-35373160

ABSTRACT

Background: Molecular diagnostics are considered the most promising route to achieving rapid, universal drug susceptibility testing for Mycobacterium tuberculosis complex (MTBC). We aimed to generate a WHO endorsed catalogue of mutations to serve as a global standard for interpreting molecular information for drug resistance prediction. Methods: A candidate gene approach was used to identify mutations as associated with resistance, or consistent with susceptibility, for 13 WHO endorsed anti-tuberculosis drugs. 38,215 MTBC isolates with paired whole-genome sequencing and phenotypic drug susceptibility testing data were amassed from 45 countries. For each mutation, a contingency table of binary phenotypes and presence or absence of the mutation computed positive predictive value, and Fisher's exact tests generated odds ratios and Benjamini-Hochberg corrected p-values. Mutations were graded as Associated with Resistance if present in at least 5 isolates, if the odds ratio was >1 with a statistically significant corrected p-value, and if the lower bound of the 95% confidence interval on the positive predictive value for phenotypic resistance was >25%. A series of expert rules were applied for final confidence grading of each mutation. Findings: 15,667 associations were computed for 13,211 unique mutations linked to one or more drugs. 1,149/15,667 (7·3%) mutations were classified as associated with phenotypic resistance and 107/15,667 (0·7%) were deemed consistent with susceptibility. For rifampicin, isoniazid, ethambutol, fluoroquinolones, and streptomycin, the mutations' pooled sensitivity was >80%. Specificity was over 95% for all drugs except ethionamide (91·4%), moxifloxacin (91·6%) and ethambutol (93·3%). Only two resistance mutations were classified for bedaquiline, delamanid, clofazimine, and linezolid as prevalence of phenotypic resistance was low for these drugs. 
Interpretation: This first WHO endorsed catalogue of molecular targets for MTBC drug susceptibility testing provides a global standard for resistance interpretation. Its existence should encourage the implementation of molecular diagnostics by National Tuberculosis Programmes. Funding: UNITAID, Wellcome, MRC, BMGF.
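
The grading rule described in the Methods above (at least 5 isolates, odds ratio >1 with a significant corrected p-value, lower 95% bound on positive predictive value >25%) can be illustrated with a minimal Python sketch. This is not the catalogue's own code. The function names, the Wilson score interval for the PPV bound, and the 0.05 significance level are assumptions made for illustration; the abstract does not say which interval method or threshold was used.

```python
from math import comb, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def wilson_lower(successes, n, z=1.96):
    """Lower bound of the Wilson score interval (assumed interval method)."""
    if n == 0:
        return 0.0
    p = successes / n
    centre = p + z * z / (2 * n)
    margin = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)


def associated_with_resistance(r_mut, s_mut, r_wt, s_wt, corrected_p, alpha=0.05):
    """Apply the three criteria to one mutation-drug contingency table.

    r_mut/s_mut: resistant/susceptible isolates carrying the mutation.
    r_wt/s_wt:   resistant/susceptible isolates without it.
    alpha is an assumed significance level.
    """
    n_mut = r_mut + s_mut
    if n_mut < 5:
        return False
    numerator, denominator = r_mut * s_wt, s_mut * r_wt
    if denominator:
        odds_ratio = numerator / denominator
    else:
        odds_ratio = float("inf") if numerator else 0.0
    ppv_lower = wilson_lower(r_mut, n_mut)
    return odds_ratio > 1 and corrected_p < alpha and ppv_lower > 0.25
```

In use, one would compute a Fisher p-value for every mutation-drug table, pass the whole list through `benjamini_hochberg`, and then call `associated_with_resistance` with each corrected value. The expert rules mentioned in the abstract are not modelled here.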


Subject(s)
Ethambutol , Mycobacterium tuberculosis , Antitubercular Agents/pharmacology , Drug Resistance , Microbial Sensitivity Tests , Mutation , Mycobacterium tuberculosis/genetics , World Health Organization
18.
Bioinformatics ; 38(7): 1781-1787, 2022 03 28.
Article in English | MEDLINE | ID: mdl-35020793

ABSTRACT

MOTIVATION: Short-read whole-genome sequencing (WGS) is a vital tool for clinical applications and basic research. Genetic divergence from the reference genome, repetitive sequences and sequencing bias reduce the performance of variant calling using short-read alignment, but the loss in recall and specificity has not been adequately characterized. To benchmark short-read variant calling, we used 36 diverse clinical Mycobacterium tuberculosis (Mtb) isolates dually sequenced with Illumina short-reads and PacBio long-reads. We systematically studied the short-read variant calling accuracy and the influence of sequence uniqueness, reference bias and GC content. RESULTS: Reference-based Illumina variant calling demonstrated a maximum recall of 89.0% and minimum precision of 98.5% across parameters evaluated. The approach that maximized variant recall while still maintaining high precision (>99%) was tuning the mapping quality filtering threshold, i.e. confidence of the read mapping (recall = 85.8%, precision = 99.1%, MQ ≥ 40). Additional masking of repetitive sequence content is an alternative conservative approach to variant calling that increases precision at cost to recall (recall = 70.2%, precision = 99.6%, MQ ≥ 40). Of the genomic positions typically excluded for Mtb, 68% are accurately called using Illumina WGS including 52/168 PE/PPE genes (34.5%). From these results, we present a refined list of low confidence regions across the Mtb genome, which we found to frequently overlap with regions with structural variation, low sequence uniqueness and low sequencing coverage. Our benchmarking results have broad implications for the use of WGS in the study of Mtb biology, inference of transmission in public health surveillance systems and more generally for WGS applications in other organisms. AVAILABILITY AND IMPLEMENTATION: All relevant code is available at https://github.com/farhat-lab/mtb-illumina-wgs-evaluation. 
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
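
To make the recall and precision figures above concrete, the following Python sketch scores a set of variant calls against a truth set before and after a mapping-quality (MQ) filter. It is a toy illustration only, not the authors' evaluation code. The data layout (a `site` tuple and an `mq` value per call), the function name `evaluate`, and the example values are all hypothetical.

```python
# Hypothetical truth set: variants confirmed by an independent method,
# each identified by (position, reference allele, alternate allele).
EXAMPLE_TRUTH = {
    (100, "A", "G"),
    (200, "C", "T"),
    (300, "G", "A"),
    (400, "T", "C"),
}

# Hypothetical short-read calls, each with the mapping quality supporting it.
EXAMPLE_CALLS = [
    {"site": (100, "A", "G"), "mq": 60},
    {"site": (200, "C", "T"), "mq": 45},
    {"site": (300, "G", "A"), "mq": 20},  # true variant in a poorly mapped region
    {"site": (999, "A", "T"), "mq": 10},  # false positive
]


def evaluate(calls, truth, min_mq=0):
    """Return (precision, recall) for calls with mapping quality >= min_mq."""
    kept = {call["site"] for call in calls if call["mq"] >= min_mq}
    true_positives = len(kept & truth)
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    print("unfiltered:", evaluate(EXAMPLE_CALLS, EXAMPLE_TRUTH))
    print("MQ >= 40:  ", evaluate(EXAMPLE_CALLS, EXAMPLE_TRUTH, min_mq=40))
```

In this toy data, raising the MQ threshold removes the false positive but also drops one true variant, which is the same direction of trade-off the abstract reports for stricter filtering and masking.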


Subject(s)
Mycobacterium tuberculosis , Tuberculosis , Humans , Benchmarking , Mycobacterium tuberculosis/genetics , Software , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods
19.
PLoS Biol ; 19(11): e3001421, 2021 11.
Article in English | MEDLINE | ID: mdl-34752446

ABSTRACT

The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function and even anthropogenic activities such as the widespread use of antimicrobials. However, these data consist of genomes assembled with different tools and levels of quality checking, and of large volumes of completely unprocessed raw sequence data. In both cases, considerable computational effort is required before biological questions can be addressed. Here, we assembled and characterised 661,405 bacterial genomes retrieved from the European Nucleotide Archive (ENA) in November of 2018 using a uniform standardised approach. Of these, 311,006 did not previously have an assembly. We produced a searchable COmpact Bit-sliced Signature (COBS) index, facilitating the easy interrogation of the entire dataset for a specific sequence (e.g., gene, mutation, or plasmid). Additional MinHash and pp-sketch indices support genome-wide comparisons and estimations of genomic distance. Combined, this resource will allow data to be easily subset and searched, phylogenetic relationships between genomes to be quickly elucidated, and hypotheses rapidly generated and tested. We believe that this combination of uniform processing and variety of search/filter functionalities will make this a resource of very wide utility. In terms of diversity within the data, a breakdown of the 639,981 high-quality genomes emphasised the uneven species composition of the ENA/public databases, with just 20 of the total 2,336 species making up 90% of the genomes. The overrepresented species tend to be acute/common human pathogens, aligning with research priorities at different levels from individual interests to funding bodies and national and global public health agencies.
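
The MinHash indices mentioned above estimate how similar two genomes are from small sketches of their k-mer content. The Python sketch below shows the general bottom-k MinHash idea and is not the implementation used to build this resource. The k-mer length, sketch size, hash function and all function names are assumptions chosen for illustration, and reverse-complement (canonical) k-mers are ignored for brevity.

```python
import hashlib


def kmers(seq, k):
    """Set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer):
    """Stable 64-bit hash of a k-mer (BLAKE2b is an arbitrary choice here)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(seq, k=5, size=64):
    """Bottom-k sketch: the `size` smallest distinct k-mer hashes, sorted."""
    return sorted({hash64(km) for km in kmers(seq, k)})[:size]


def jaccard_estimate(sketch_a, sketch_b, size=64):
    """Estimate the Jaccard similarity of two k-mer sets from their sketches."""
    # The smallest hashes of the merged sketches form a sketch of the union.
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


if __name__ == "__main__":
    a = "ACGTTGCAAGCTTAGGCTAACGTTAGC"
    b = "ACGTTGCAAGCTTAGGCTAACGATAGC"  # one substitution relative to a
    print(jaccard_estimate(sketch(a), sketch(b)))
```

Because only the sketches are compared, the cost of each comparison depends on the sketch size rather than on genome length, which is what makes this kind of index practical across hundreds of thousands of assemblies.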


Subject(s)
Bacteria/genetics , Biodiversity , DNA, Bacterial/genetics , Data Curation , Base Sequence , Drug Resistance, Bacterial/genetics , Species Specificity
20.
Genome Biol ; 22(1): 259, 2021 09 06.
Article in English | MEDLINE | ID: mdl-34488837

ABSTRACT

Genome graphs allow very general representations of genetic variation; depending on the model and implementation, variation at different length-scales (single nucleotide polymorphisms (SNPs), structural variants) and on different sequence backgrounds can be incorporated with different levels of transparency. We implement a model which handles this multiscale variation and develop a JSON extension of VCF (jVCF) allowing for variant calls on multiple references, both implemented in our software gramtools. We find gramtools outperforms existing methods for genotyping SNPs overlapping large deletions in M. tuberculosis and is able to genotype on multiple alternate backgrounds in P. falciparum, revealing previously hidden recombination.


Subject(s)
Algorithms , Genetic Variation , Genome, Human , Alleles , Antigens, Surface/metabolism , Computer Simulation , Genotyping Techniques , Haplotypes/genetics , Humans , Mycobacterium tuberculosis/genetics , Plasmodium falciparum/genetics , Polymorphism, Single Nucleotide/genetics , Reproducibility of Results , Sequence Deletion