ABSTRACT
How segmented RNA viruses correctly recruit their genome has yet to be clarified. Bluetongue virus (BTV) is a double-stranded RNA virus with 10 segments of different sizes, but it assembles its genome in single-stranded form through a series of specific RNA-RNA interactions prior to packaging. In this study, we determined the structure of each BTV transcript, individually and in different combinations, using selective 2'-hydroxyl acylation analysed by primer extension and mutational profiling (SHAPE-MaP). SHAPE-MaP identified RNA structural changes during complex formation and putative RNA-RNA interaction sites. Our data also revealed a core RNA complex of smaller segments which serves as the foundation ('anchor') for the assembly of a complete network composed of ten ssRNA segments. The same order of core RNA complex formation was identified in cells transfected with viral RNAs. No viral protein was required for these assembly reactions. Further, substitution mutations in the interacting bases within the core assemblies altered subsequent segment addition and affected virus replication. These data identify a wholly RNA-driven reaction that may offer novel opportunities for designed attenuation or antiviral therapeutics.
Subjects
Bluetongue virus, Viral Genome, Viral RNA, Virus Replication, Bluetongue virus/genetics, Viral RNA/metabolism, Viral RNA/genetics, Viral RNA/chemistry, Animals, Virus Replication/genetics, Nucleic Acid Conformation, Virus Assembly/genetics, Cell Line, Mutation
ABSTRACT
Analysis of genome sequencing data from >100,000 genomes of Mycobacterium tuberculosis complex using TB-Annotator software revealed a previously unknown lineage, proposed name L10, in central Africa. Phylogenetic reconstruction suggests L10 could represent a missing link in the evolutionary and geographic migration histories of M. africanum.
Subjects
Biological Evolution, Mycobacterium, Phylogeny, Mycobacterium/genetics, Software, Central Africa/epidemiology
ABSTRACT
With >1 million associated deaths in 2020, human tuberculosis (TB) caused by the bacteria Mycobacterium tuberculosis remains one of the deadliest infectious diseases. A plethora of genomic tools and bioinformatics pipelines have become available in recent years to assist the whole genome sequencing of M. tuberculosis. The Oxford Nanopore Technologies (ONT) portable sequencer is a promising platform for cost-effective application in clinics, including personalizing treatment through detection of drug resistance-associated mutations, or in the field, to assist epidemiological and transmission investigations. In this study, we performed a comparison of 10 clinical isolates with DNA sequenced on both long-read ONT and (gold standard) short-read Illumina HiSeq platforms. Our analysis demonstrates the robustness of the ONT variant calling for single nucleotide polymorphisms, despite the high error rate. Moreover, because of improved coverage in repetitive regions where short sequencing reads fail to align accurately, ONT data analysis can incorporate additional regions of the genome usually excluded (e.g. pe/ppe genes). The resulting extra resolution can improve the characterization of transmission clusters and dynamics based on inferring closely related isolates. High concordance in variants in loci associated with drug resistance supports its use for the rapid detection of resistant mutations. Overall, ONT sequencing is a promising tool for TB genomic investigations, particularly to inform clinical and surveillance decision-making to reduce the disease burden.
Subjects
Mycobacterium tuberculosis, Tuberculosis, Computational Biology, Genomics, High-Throughput Nucleotide Sequencing, Humans, Mycobacterium tuberculosis/genetics, DNA Sequence Analysis, Tuberculosis/drug therapy, Tuberculosis/epidemiology, Whole Genome Sequencing/methods
ABSTRACT
SUMMARY: Fastlin is a bioinformatics tool designed for rapid Mycobacterium tuberculosis complex (MTBC) lineage typing. It utilizes an ultra-fast alignment-free approach to detect previously identified barcode single nucleotide polymorphisms associated with specific MTBC lineages. In a comprehensive benchmarking against existing tools, fastlin demonstrated high accuracy and significantly faster running times. AVAILABILITY AND IMPLEMENTATION: fastlin is freely available at https://github.com/rderelle/fastlin and can easily be installed using Conda.
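The alignment-free idea summarised above can be illustrated with a minimal sketch: reads are scanned for exact k-mers that carry a lineage-defining barcode SNP, and the lineage with the most supporting k-mers is reported. This is not fastlin's actual implementation. The function names, the toy 5-mers, the lineage labels and the `min_count` threshold are all hypothetical, chosen only to show the technique.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def count_barcode_hits(reads, barcode_kmers, k):
    """Count exact matches between read k-mers and barcode k-mers.

    barcode_kmers maps a k-mer (containing the barcode allele) to a
    lineage label. Both strands of every read are scanned, so no
    alignment to a reference genome is needed.
    """
    hits = Counter()
    for read in reads:
        for strand in (read, reverse_complement(read)):
            for i in range(len(strand) - k + 1):
                lineage = barcode_kmers.get(strand[i:i + k])
                if lineage is not None:
                    hits[lineage] += 1
    return hits


def call_lineage(hits, min_count=2):
    """Return the best-supported lineage, or None if support is too low.

    min_count is a hypothetical noise threshold, not a fastlin default.
    """
    supported = {lin: n for lin, n in hits.items() if n >= min_count}
    return max(supported, key=supported.get) if supported else None


# Toy data (invented): two barcode 5-mers and three short "reads".
toy_barcodes = {"ACGGA": "L2", "GGCTT": "L4"}
toy_reads = ["TTACGGATT", "CCACGGACC", "AAGCCAA"]
toy_hits = count_barcode_hits(toy_reads, toy_barcodes, k=5)
toy_call = call_lineage(toy_hits)
```

Real barcode k-mers would be far longer than 5 bases so that they are unique in the genome; the short length here only keeps the toy data readable.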
Subjects
Mycobacterium tuberculosis, Mycobacterium tuberculosis/genetics, Computational Biology, Single Nucleotide Polymorphism, Software
ABSTRACT
MOTIVATION: Tuberculosis (TB) is caused by members of the Mycobacterium tuberculosis complex (MTBC), which has a strain- or lineage-based clonal population structure. The evolution of drug-resistance in the MTBC poses a threat to successful treatment and eradication of TB. Machine learning approaches are being increasingly adopted to predict drug-resistance and characterize underlying mutations from whole genome sequences. However, such approaches may not generalize well in clinical practice due to confounding from the population structure of the MTBC. RESULTS: To investigate how population structure affects machine learning prediction, we compared three different approaches to reduce lineage dependency in random forest (RF) models, including stratification, feature selection, and feature weighted models. All RF models achieved moderate-high performance (area under the ROC curve range: 0.60-0.98). First-line drugs had higher performance than second-line drugs, but it varied depending on the lineages in the training dataset. Lineage-specific models generally had higher sensitivity than global models which may be underpinned by strain-specific drug-resistance mutations or sampling effects. The application of feature weights and feature selection approaches reduced lineage dependency in the model and had comparable performance to unweighted RF models. AVAILABILITY AND IMPLEMENTATION: https://github.com/NinaMercedes/RF_lineages.
Subjects
Mycobacterium tuberculosis, Multidrug-Resistant Tuberculosis, Tuberculosis, Humans, Mycobacterium tuberculosis/genetics, Multidrug-Resistant Tuberculosis/drug therapy, Tuberculosis/drug therapy, Mutation, Whole Genome Sequencing, Antitubercular Agents/pharmacology, Antitubercular Agents/therapeutic use
ABSTRACT
The microbiome plays a key role in the health of the human body. Interest often lies in finding features of the microbiome, alongside other covariates, which are associated with a phenotype of interest. One important property of microbiome data, which is often overlooked, is its compositionality as it can only provide information about the relative abundance of its constituting components. Typically, these proportions vary by several orders of magnitude in datasets of high dimensions. To address these challenges we develop a Bayesian hierarchical linear log-contrast model which is estimated by mean field Monte-Carlo co-ordinate ascent variational inference (CAVI-MC) and easily scales to high dimensional data. We use novel priors which account for the large differences in scale and constrained parameter space associated with the compositional covariates. A reversible jump Monte Carlo Markov chain guided by the data through univariate approximations of the variational posterior probability of inclusion, with proposal parameters informed by approximating variational densities via auxiliary parameters, is used to estimate intractable marginal expectations. We demonstrate that our proposed Bayesian method performs favourably against existing frequentist state of the art compositional data analysis methods. We then apply the CAVI-MC to the analysis of real data exploring the relationship of the gut microbiome to body mass index.
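For reference, a linear log-contrast regression for a response and compositional covariates is conventionally written with a sum-to-zero constraint on the coefficients, which makes the fit invariant to rescaling of each composition. The abstract does not spell out its model, so the notation below is an assumption; the hierarchical priors, the non-compositional covariates and the variational updates are omitted.

```latex
% Assumed notation: y_i is the response for sample i, x_{ij} > 0 is the
% relative abundance of component j in sample i, and p is the number of
% compositional components.
\begin{align}
  y_i &= \beta_0 + \sum_{j=1}^{p} \beta_j \log x_{ij} + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2), \\
  \sum_{j=1}^{p} \beta_j &= 0 .
\end{align}
% Under the constraint, replacing x_{ij} by c_i x_{ij} for any c_i > 0
% adds \log c_i \sum_j \beta_j = 0 to the linear predictor, so only
% relative abundances matter.
```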
Subjects
Gastrointestinal Microbiome, Microbiota, Humans, Bayes Theorem, Linear Models, Markov Chains, Monte Carlo Method
ABSTRACT
BACKGROUND: Carbapenem-resistant Klebsiella pneumoniae (CRKP) strains are of particular concern, especially strains with mobilizable carbapenemase genes such as blaKPC, blaNDM or blaOXA-48, given that carbapenems are usually the last-line drugs in the β-lactam class and resistance to this sub-class is associated with increased mortality and frequently co-occurs with resistance to other antimicrobial classes. OBJECTIVES: To characterize the genomic diversity and international dissemination of CRKP strains from tertiary care hospitals in Lisbon, Portugal. METHODS: Twenty CRKP isolates obtained from different patients were subjected to WGS for species confirmation, typing, drug resistance gene detection and phylogenetic reconstruction. Two additional genomic datasets were included for comparative purposes: 26 isolates (ST13, ST17 and ST231) from our collection and 64 internationally available genomic assemblies (ST13). RESULTS: By imposing a 21 SNP cut-off on pairwise comparisons we identified two genomic clusters (GCs): ST13/GC1 (n = 11), all bearing blaKPC-3, and ST17/GC2 (n = 4) harbouring blaOXA-181 and blaCTX-M-15 genes. The inclusion of the additional datasets allowed the expansion of GC1/ST13/KPC-3 to 23 isolates, all exclusively from Portugal, France and the Netherlands. The phylogenetic tree reinforced the importance of the GC1/KPC-3-producing clones along with their rapid emergence and expansion across these countries. The data obtained suggest that the ST13 branch emerged over a decade ago and only more recently did it underpin a stronger pulse of transmission in the studied population. CONCLUSIONS: This study identifies an emerging OXA-181/ST17-producing strain in Portugal and highlights the ongoing international dissemination of a KPC-3/ST13-producing clone from Portugal.
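Applying a SNP cut-off to pairwise comparisons can be sketched as follows. The abstract does not state the linkage rule or whether the threshold is inclusive, so this example assumes single-linkage grouping (an isolate joins a cluster if it is within the cut-off of any member) and an inclusive threshold. The isolate names and distances are invented.

```python
def cluster_by_snp_cutoff(names, distances, cutoff=21):
    """Group isolates whose pairwise SNP distance is <= cutoff.

    names: iterable of isolate identifiers.
    distances: dict mapping (isolate_a, isolate_b) to a SNP distance;
        pairs that are absent are treated as unlinked.
    Returns a sorted list of clusters (each a sorted list) with at
    least two members; singletons are not reported.
    """
    parent = {name: name for name in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), snps in distances.items():
        if snps <= cutoff:  # assumption: the cut-off is inclusive
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Invented example: A-B-C chain together even though A and C differ by 25.
toy_names = ["A", "B", "C", "D", "E", "F"]
toy_distances = {
    ("A", "B"): 5,
    ("B", "C"): 20,
    ("A", "C"): 25,
    ("D", "E"): 21,
    ("A", "D"): 100,
    ("E", "F"): 40,
}
toy_clusters = cluster_by_snp_cutoff(toy_names, toy_distances)
```

Single linkage lets clusters grow by chaining, which suits transmission chains but can merge groups that a stricter (complete-linkage) rule would keep apart.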
Subjects
Carbapenem-Resistant Enterobacteriaceae, Klebsiella Infections, Humans, Klebsiella pneumoniae, Phylogeny, Portugal/epidemiology, beta-Lactamases/genetics, Bacterial Proteins/genetics, Carbapenems, Genomics, Microbial Sensitivity Tests, Klebsiella Infections/epidemiology, Anti-Bacterial Agents/pharmacology, Molecular Chaperones/genetics, Tumor Suppressor Proteins/genetics
ABSTRACT
Although Plasmodium vivax parasites are the predominant cause of malaria outside of sub-Saharan Africa, they are not always prioritised by elimination programmes. P. vivax is resilient and poses challenges through its ability to re-emerge from dormancy in the human liver. With observed growing drug resistance and the increasing reports of life-threatening infections, new tools to inform elimination efforts are needed. To halt transmission, we need to better understand the dynamics of transmission, the movement of parasites, and the reservoirs of infection, so that targeted interventions can be designed. The use of molecular genetics and epidemiology for tracking and studying malaria parasite populations has been applied successfully in P. falciparum and here we sought to develop a molecular genetic tool for P. vivax. By assembling the largest set of P. vivax whole genome sequences (n = 433) spanning 17 countries, and applying a machine learning approach, we created a 71 SNP barcode with high predictive ability to identify geographic origin (91.4%). Further, due to the inclusion of markers for within-population variability, the barcode may also distinguish local transmission networks. By using P. vivax data from a low-transmission setting in Malaysia, we demonstrate the potential ability to infer outbreak events. By characterising the barcoding SNP genotypes in P. vivax DNA sourced from UK travellers (n = 132) to ten malaria-endemic countries predominantly not used in the barcode construction, we correctly predicted the geographic region of infection origin. Overall, the 71 SNP barcode outperforms previously published genotyping methods and, when rolled out within new portable platforms, is likely to be an invaluable tool for informing targeted interventions towards elimination of this resilient human malaria.
Subjects
Disease Outbreaks/prevention & control, Protozoan Genome/genetics, Genotyping Techniques/methods, Vivax Malaria/transmission, Plasmodium vivax/genetics, Eastern Africa, Asia, Datasets as Topic, Disease Eradication/methods, Genetic Markers/genetics, Genotype, Geography, Humans, Vivax Malaria/epidemiology, Vivax Malaria/parasitology, Metadata, Microsatellite Repeats/genetics, Plasmodium vivax/isolation & purification, Single Nucleotide Polymorphism/genetics, Predictive Value of Tests, South America, Travel-Related Illness, United Kingdom, Whole Genome Sequencing
ABSTRACT
BACKGROUND: SARS-CoV-2 virus sequencing has been applied to track the COVID-19 pandemic spread and assist the development of PCR-based diagnostics, serological assays, and vaccines. With sequencing becoming routine globally, bioinformatic tools are needed to assist in the robust processing of resulting genomic data. RESULTS: We developed a web-based bioinformatic pipeline ("COVID-Profiler") that inputs raw or assembled sequencing data, displays raw alignments for quality control, annotates mutations found and performs phylogenetic analysis. The pipeline software can be applied to other (re-) emerging pathogens. CONCLUSIONS: The webserver is available at http://genomics.lshtm.ac.uk/ . The source code is available at https://github.com/jodyphelan/covid-profiler .
Subjects
COVID-19, SARS-CoV-2, Genomics, Humans, Pandemics, Phylogeny, SARS-CoV-2/genetics
ABSTRACT
BACKGROUND: Drug-resistant Mycobacterium tuberculosis is complicating the effective treatment and control of tuberculosis disease (TB). With the adoption of whole genome sequencing as a diagnostic tool, machine learning approaches are being employed to predict M. tuberculosis resistance and identify underlying genetic mutations. However, machine learning approaches can overfit and fail to identify causal mutations if they are applied out of the box and not adapted to the disease-specific context. We introduce a machine learning approach that is customized to the TB setting, which extracts a library of genomic variants re-occurring across individual studies to improve genotypic profiling. RESULTS: We developed a customized decision tree approach, called Treesist-TB, that performs TB drug resistance prediction by extracting and evaluating genomic variants across multiple studies. The application of Treesist-TB to rifampicin (RIF), isoniazid (INH) and ethambutol (EMB) drugs, for which resistance mutations are known, demonstrated a level of predictive accuracy similar to the widely used TB-Profiler tool (Treesist-TB vs. TB-Profiler tool: RIF 97.5% vs. 97.6%; INH 96.8% vs. 96.5%; EMB 96.8% vs. 95.8%). Application of Treesist-TB to less understood second-line drugs of interest, ethionamide (ETH), cycloserine (CYS) and para-aminosalicylic acid (PAS), led to the identification of new variants (52, 6 and 11, respectively), with a high number absent from the TB-Profiler library (45, 4, and 6, respectively). Thereby, Treesist-TB had improved predictive sensitivity (Treesist-TB vs. TB-Profiler tool: PAS 64.3% vs. 38.8%; CYS 45.3% vs. 30.7%; ETH 72.1% vs. 71.1%). CONCLUSION: Our work reinforces the utility of machine learning for drug resistance prediction, while highlighting the need to customize approaches to the disease-specific context.
Through applying a modified decision learning approach (Treesist-TB) across a range of anti-TB drugs, we identified plausible resistance-encoding genomic variants with high predictive ability, whilst potentially overcoming the overfitting challenges that can affect standard machine learning applications.
Subjects
Multiple Bacterial Drug Resistance/genetics, Mycobacterium tuberculosis, Antitubercular Agents/pharmacology, Decision Trees, Humans, Microbial Sensitivity Tests, Mutation, Mycobacterium tuberculosis/genetics, Multidrug-Resistant Tuberculosis/diagnosis, Multidrug-Resistant Tuberculosis/drug therapy
ABSTRACT
BACKGROUND: Second-line drug (SLD) resistance among tuberculosis (TB) patients is a serious emerging challenge to global control of the disease. We characterized SLD resistance-conferring mutations among TB patients with rifampicin and/or isoniazid (RIF and/or INH) drug resistance tested at the Uganda National TB Reference Laboratory (NTRL) between June 2017 and December 2019. METHODS: This was a descriptive cross-sectional secondary data analysis of 20,508 M. tuberculosis isolates of new and previously treated patients resistant to RIF and/or INH. DNA strips with valid results to characterise the SLD resistance using the commercial Line Probe Assay Genotype MTBDRsl Version 2.0 Assay (Hain Life Science, Nehren, Germany) were reviewed. Data were analysed with STATA v15 using cross-tabulation for frequency and proportions of known resistance-conferring mutations to injectable agents (IA) and fluoroquinolones (FQ). RESULTS: Among the eligible participants, 12,993/20,508 (63.4%) were male and the median (IQR) age was 32 (24-43) years. A total of 576/20,508 (2.8%) of the M. tuberculosis isolates from participants had resistance to RIF and/or INH. These included 102/576 (17.7%) single drug-resistant and 474/576 (82.3%) multidrug-resistant (MDR) strains. Only 102 patients had test results for FQ, of whom 70/102 (68.6%) and 1/102 (0.98%) had resistance-conferring mutations in the gyrA locus and gyrB locus, respectively. Among patients with FQ resistance, the gyrA D94G, 42.6% (30.0-55.9), and gyrA A90V, 41.1% (28.6-54.3), mutations were most observed. Only one mutation, E540D, was detected in the gyrB locus. A total of 26 patients had resistance-conferring mutations to IA, of whom 20/26, 77.0% (56.4-91.0), had the A1401G mutation in the rrs gene locus. CONCLUSIONS: Our study reveals a high proportion of mutations known to confer high-level fluoroquinolone drug resistance among patients with rifampicin and/or isoniazid drug resistance.
Utilizing routinely generated laboratory data from existing molecular diagnostic methods may aid real-time surveillance of emerging tuberculosis drug-resistance in resource-limited settings.
Subjects
Mycobacterium tuberculosis, Multidrug-Resistant Tuberculosis, Adult, Antitubercular Agents/pharmacology, Antitubercular Agents/therapeutic use, Cross-Sectional Studies, Multiple Bacterial Drug Resistance/genetics, Female, Fluoroquinolones/therapeutic use, Humans, Isoniazid/pharmacology, Isoniazid/therapeutic use, Male, Microbial Sensitivity Tests, Mutation, Rifampin/pharmacology, Rifampin/therapeutic use, Multidrug-Resistant Tuberculosis/drug therapy, Multidrug-Resistant Tuberculosis/epidemiology, Uganda/epidemiology, Young Adult
ABSTRACT
Multidrug-resistant tuberculosis (MDR TB), pre-extensively drug-resistant tuberculosis (pre-XDR TB), and extensively drug-resistant tuberculosis (XDR TB) complicate disease control. We analyzed whole-genome sequence data for 579 phenotypically drug-resistant M. tuberculosis isolates (28% of available MDR/pre-XDR and all culturable XDR TB isolates collected in Thailand during 2014-2017). Most isolates were from lineage 2 (n = 482; 83.2%). Cluster analysis revealed that 281/579 isolates (48.5%) formed 89 clusters, including 205 MDR TB, 46 pre-XDR TB, 19 XDR TB, and 11 poly-drug-resistant TB isolates based on genotypic drug resistance. Members of most clusters had the same subset of drug resistance-associated mutations, supporting potential primary resistance in MDR TB (n = 176/205; 85.9%), pre-XDR TB (n = 29/46; 63.0%), and XDR TB (n = 14/19; 73.7%). Thirteen major clades were significantly associated with geography (p<0.001). Clusters of clonal origin contribute greatly to the high prevalence of drug-resistant TB in Thailand.
Subjects
Mycobacterium tuberculosis, Pharmaceutical Preparations, Multidrug-Resistant Tuberculosis, Antitubercular Agents/therapeutic use, Multiple Bacterial Drug Resistance, Humans, Microbial Sensitivity Tests, Sequence Analysis, Thailand, Multidrug-Resistant Tuberculosis/drug therapy
ABSTRACT
Tuberculosis disease is a major global public health concern and the growing prevalence of drug-resistant Mycobacterium tuberculosis is making disease control more difficult. However, the increasing application of whole-genome sequencing as a diagnostic tool is leading to the profiling of drug resistance to inform clinical practice and treatment decision making. Computational approaches for identifying established and novel resistance-conferring mutations in genomic data include genome-wide association study (GWAS) methodologies, tests for convergent evolution and machine learning techniques. These methods may be confounded by extensive co-occurrent resistance, where statistical models for a drug include unrelated mutations known to be causing resistance to other drugs. Here, we introduce a novel 'cannibalistic' elimination algorithm ("Hungry, Hungry SNPos") that attempts to remove these co-occurrent resistant variants. Using an M. tuberculosis genomic dataset for the virulent Beijing strain-type (n = 3,574) with phenotypic resistance data across five drugs (isoniazid, rifampicin, ethambutol, pyrazinamide, and streptomycin), we demonstrate that this new approach is considerably more robust than traditional methods and detects resistance-associated variants too rare to be likely picked up by correlation-based techniques like GWAS.
Subjects
Multiple Bacterial Drug Resistance/genetics, Mycobacterium tuberculosis/genetics, Point Mutation, Multidrug-Resistant Tuberculosis/microbiology, Algorithms, Antitubercular Agents/pharmacology, Bacterial Genes, Genetic Markers, Genome-Wide Association Study, Machine Learning, Microbial Sensitivity Tests, Biological Models, Mycobacterium tuberculosis/drug effects, Phylogeny, Single Nucleotide Polymorphism
ABSTRACT
BACKGROUND: Malaria, caused by Plasmodium parasites, is a major global public health problem. To assist an understanding of malaria pathogenesis, including drug resistance, there is a need for the timely detection of underlying genetic mutations and their spread. With the increasing use of whole-genome sequencing (WGS) of Plasmodium DNA, the potential of deep learning models to detect loci under recent positive selection, historically signals of drug resistance, was evaluated. METHODS: A deep learning-based approach (called "DeepSweep") was developed, which can be trained on haplotypic images from genetic regions with known sweeps, to identify loci under positive selection. DeepSweep software is available from https://github.com/WDee/Deepsweep . RESULTS: Using simulated genomic data, DeepSweep could detect recent sweeps with high predictive accuracy (areas under ROC curve > 0.95). DeepSweep was applied to Plasmodium falciparum (n = 1125; genome size 23 Mbp) and Plasmodium vivax (n = 368; genome size 29 Mbp) WGS data, and the genes identified overlapped with two established extended haplotype homozygosity methods (within-population iHS, across-population Rsb) (~ 60-75% overlap of hits at P < 0.0001). DeepSweep hits included regions proximal to known drug resistance loci for both P. falciparum (e.g. pfcrt, pfdhps and pfmdr1) and P. vivax (e.g. pvmrp1). CONCLUSION: The deep learning approach can detect positive selection signatures in malaria parasite WGS data. Further, as the approach is generalizable, it may be trained to detect other types of selection. With the ability to rapidly generate WGS data at low cost, machine learning approaches (e.g. DeepSweep) have the potential to assist parasite genome-based surveillance and inform malaria control decision-making.
Subjects
Deep Learning/statistics & numerical data, Genome Size, Protozoan Genome, Plasmodium falciparum/genetics, Plasmodium vivax/genetics, Genetic Selection, DNA Sequence Analysis
ABSTRACT
BACKGROUND: Tuberculosis (TB), particularly multi- and/or extensively drug-resistant TB, is still a global medical emergency. Whole genome sequencing (WGS) is a current alternative to the WHO-approved probe-based methods for TB diagnosis and detection of drug resistance, genetic diversity and transmission dynamics of Mycobacterium tuberculosis complex (MTBC). This study compared WGS and clinical data in participants with TB. RESULTS: This cohort study performed WGS on 87 MTBC DNA isolates, from 57 (66%) and 30 (34%) patients with drug-resistant and drug-susceptible TB, respectively. Drug resistance was determined by Xpert® MTB/RIF assay and phenotypic culture-based drug-susceptibility testing (DST). WGS and bioinformatics data that predict phenotypic resistance to anti-TB drugs were compared with participants' clinical outcomes. There were 47 female participants (54%) and the median age was 35 years (IQR: 29-44). Twenty (23%) and 26 (30%) participants had TB/HIV co-infection and BMI < 18 kg/m², respectively. MDR-TB participants had MTBC with multiple mutant genes, compared to those with mono- or poly-resistant TB, and the majority belonged to lineage 3, Central Asian Strain (CAS). Also, MDR-TB was associated with delayed culture conversion (median (IQR): 83 (60-180) vs. 51 (30-66) days). WGS had high concordance with both culture-based DST and Xpert® MTB/RIF assay in detecting drug resistance (kappa = 1.00). CONCLUSION: This study offers a comparison of mutations detected by Xpert and WGS with phenotypic DST of M. tuberculosis isolates in Tanzania. The high concordance between the different methods and the further insights provided by WGS, such as PZA-DST, which is not routinely performed in most resource-limited settings, provide an avenue for inclusion of WGS into the diagnostic matrix of TB, including drug-resistant TB.
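The concordance statistic quoted above (kappa) is Cohen's kappa, which corrects the observed agreement between two methods for the agreement expected by chance. A minimal sketch of the calculation follows; the resistant/susceptible calls are invented toy data and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equally long sequences of categorical calls.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the chance agreement implied by each method's
    marginal call frequencies.
    """
    if len(calls_a) != len(calls_b) or len(calls_a) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both methods gave one identical call throughout: undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Invented calls: "R" = resistant, "S" = susceptible.
toy_wgs = ["R", "R", "S", "S", "R", "S"]
toy_dst_same = ["R", "R", "S", "S", "R", "S"]
toy_dst_one_off = ["R", "R", "S", "S", "S", "S"]

kappa_perfect = cohens_kappa(toy_wgs, toy_dst_same)
kappa_one_off = cohens_kappa(toy_wgs, toy_dst_one_off)
```

A kappa of 1.00 therefore means every isolate received the same call from both methods, not merely that agreement was high.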
Subjects
Antitubercular Agents/therapeutic use, Multiple Bacterial Drug Resistance/genetics, Mutation, Mycobacterium tuberculosis/genetics, Multidrug-Resistant Tuberculosis/drug therapy, Adult, Cohort Studies, Female, Humans, Male, Mycobacterium tuberculosis/physiology, Tanzania, Treatment Outcome, Multidrug-Resistant Tuberculosis/microbiology, Whole Genome Sequencing
ABSTRACT
SUMMARY: Recombinase polymerase amplification (RPA), an isothermal nucleic acid amplification method, is enhancing our ability to detect a diverse array of pathogens, thereby assisting the diagnosis of infectious diseases and the detection of microorganisms in food and water. However, new bioinformatics tools are needed to automate and improve the design of the primer and probe sets to be used in RPA, particularly to account for the high genetic diversity of circulating pathogens and cross-detection of genetically similar organisms. PrimedRPA is a Python-based package that automates the creation and filtering of RPA primer and probe sets. It aligns several sequences to identify conserved targets, and filters out regions that cross-react with possible background organisms. AVAILABILITY AND IMPLEMENTATION: PrimedRPA is implemented in Python 3, is supported on Linux and macOS, and is freely available from http://pathogenseq.lshtm.ac.uk/PrimedRPA.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
DNA Primers, Nucleic Acid Amplification Techniques, Recombinases, Software, Computational Biology
ABSTRACT
BACKGROUND: Continuing evolution of the Mycobacterium tuberculosis (Mtb) complex genomes associated with resistance to anti-tuberculosis drugs is threatening tuberculosis disease control efforts. Both multi- and extensively drug-resistant Mtb (MDR and XDR, respectively) are increasing in prevalence, but the full set of Mtb genes involved is not known. There is a need for increased sensitivity of genome-wide approaches in order to elucidate the genetic basis of antimicrobial drug resistance and gain a more detailed understanding of Mtb genome evolution in a context of widespread antimicrobial therapy. Population structure within the Mtb complex, due to clonal expansion, lack of lateral gene transfer and low levels of recombination between lineages, may be reducing statistical power to detect drug resistance-associated variants. RESULTS: To investigate the impact of lineage-specific effects on the identification of drug resistance associations, we applied genome-wide association study (GWAS) and convergence-based (PhyC) methods to multiple drug resistance phenotypes of a global dataset of Mtb lineages 2 and 4, using both lineage-wise and combined approaches. We identify both well-established drug resistance variants and novel associations, with some associations identified uniquely by the lineage-specific GWAS analyses and others uniquely by the combined analyses. We report 17 potential novel associations between antimicrobial resistance phenotypes and Mtb genomic variants. CONCLUSIONS: For GWAS, both lineage-specific and combined analyses are useful, whereas PhyC may perform better in contexts of greater diversity. Unique associations with XDR in lineage-specific analyses provide evidence of diverging evolutionary trajectories between lineages 2 and 4 in response to antimicrobial drug therapy.
Subjects
Genome-Wide Association Study/methods, Mycobacterium tuberculosis/genetics, Genetic Polymorphism, Multidrug-Resistant Tuberculosis, Bacterial Proteins/genetics, Multiple Bacterial Drug Resistance, Molecular Evolution, Horizontal Gene Transfer, Microbial Sensitivity Tests, Mycobacterium tuberculosis/drug effects, Whole Genome Sequencing
ABSTRACT
BACKGROUND: Mixed, polyclonal Mycobacterium tuberculosis infection occurs in natural populations. Developing an effective method for detecting such cases is important in measuring the success of treatment and reconstruction of transmission between patients. Using whole genome sequence (WGS) data, we assess two methods for detecting mixed infection: (i) a combination of the number of heterozygous sites and the proportion of heterozygous sites to total SNPs, and (ii) Bayesian model-based clustering of allele frequencies from sequencing reads at heterozygous sites. RESULTS: In silico and in vitro artificially mixed and known pure M. tuberculosis samples were analysed to determine the specificity and sensitivity of each method. We found that both approaches were effective in distinguishing between pure strains and mixed infection where there was a relatively high (> 10%) proportion of a minor strain in the mixture. A large dataset of clinical isolates (n = 1963) from the Karonga Prevention Study in Northern Malawi was tested to examine correlations of mixed infection with patient characteristics and outcomes. The frequency of mixed infection in the population was found to be around 10%, with an association with year of diagnosis, but no association with age, sex, HIV status or previous tuberculosis. CONCLUSIONS: Mixed Mycobacterium tuberculosis infection was identified in silico using whole genome sequence data. The methods presented here can be applied to population-wide analyses of tuberculosis to estimate the frequency of mixed infection, and to identify individual cases of mixed infections. These cases are important when considering the evolution and transmission of the disease, and in patient treatment.
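Method (i) above, which combines the number of heterozygous sites with their proportion among all SNPs, can be sketched as below. The abstract gives no thresholds, so every cut-off in this example (`min_depth`, `min_minor_fraction`, `min_het_sites`, `min_het_proportion`) is a hypothetical placeholder, as are the function names and the toy read counts.

```python
def summarise_heterozygosity(allele_depths, min_depth=10,
                             min_minor_fraction=0.1):
    """Count heterozygous SNP sites and their share of usable SNP sites.

    allele_depths: list of (ref_reads, alt_reads), one pair per SNP site.
    A site is called heterozygous when the minor allele makes up at least
    min_minor_fraction of the reads (hypothetical cut-off). Sites with
    fewer than min_depth reads are ignored.
    """
    usable = [(ref, alt) for ref, alt in allele_depths
              if ref + alt >= min_depth]
    het_sites = sum(1 for ref, alt in usable
                    if min(ref, alt) / (ref + alt) >= min_minor_fraction)
    proportion = het_sites / len(usable) if usable else 0.0
    return het_sites, proportion


def looks_mixed(allele_depths, min_het_sites=10, min_het_proportion=0.05):
    """Flag a sample as a possible mixed infection.

    Both criteria must hold; the two thresholds are illustrative only.
    """
    het_sites, proportion = summarise_heterozygosity(allele_depths)
    return het_sites >= min_het_sites and proportion >= min_het_proportion


# Invented samples: a clonal one, and one with a 30% minor strain that
# differs from the major strain at 40 of 100 SNP sites.
toy_pure = [(0, 50)] * 100
toy_mixed = [(0, 50)] * 60 + [(35, 15)] * 40

pure_summary = summarise_heterozygosity(toy_pure)
mixed_summary = summarise_heterozygosity(toy_mixed)
```

Requiring both a count and a proportion guards against flagging a clonal sample that merely has a handful of noisy sites among many SNPs.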
Subjects
Mycobacterium tuberculosis/classification, Mycobacterium tuberculosis/genetics, DNA Sequence Analysis/methods, Tuberculosis/diagnosis, Whole Genome Sequencing/methods, Adolescent, Adult, Bayes Theorem, Bacterial DNA, Female, Bacterial Genome, Humans, Male, Middle Aged, Mycobacterium tuberculosis/isolation & purification, Single Nucleotide Polymorphism, Tuberculosis/genetics, Tuberculosis/microbiology, Young Adult
ABSTRACT
BACKGROUND: The ongoing epidemic of multidrug-resistant tuberculosis (MDR-TB) in Georgia highlights the need for more effective control strategies. A new regimen to treat MDR-TB that includes pyrazinamide (PZA) is currently being evaluated, and PZA resistance status will largely influence the success of current and future treatment strategies. PZA susceptibility testing was not routinely performed at the National Reference Laboratory (NRL) in Tbilisi between 2010 and September 2015. Here we provide a first insight into the prevalence of PZA-resistant TB in this region. METHODS: Phenotypic susceptibility to PZA was determined in a convenience collection of well-characterised TB patient isolates collected at the NRL in Tbilisi between 2012 and 2013. In addition, the pncA gene was sequenced and whole genome sequencing was performed on two isolates. RESULTS: Of 57 isolates tested, 33 (57.9%) showed phenotypic drug resistance to PZA and had a single pncA mutation. All of these 33 isolates were MDR-TB strains. pncA mutations were absent in all but one of the 24 PZA-susceptible isolates. In total we found 18 polymorphisms in the pncA gene. In the two major MDR-TB clusters represented (94-32 and 100-32), 10 of 15 (67.0%) and 13 of 14 (93.0%) strains, respectively, were PZA resistant. We also identified a member of the potentially highly transmissive clade A strain carrying the characteristic I6L substitution in PncA. Another strain with the same MLVA type as the clade A strain had acquired a different mutation in pncA and was genetically more distantly related, suggesting that different branches of this particular lineage have been introduced into this region. CONCLUSION: In this high MDR-TB setting, more than half of the tested MDR-TB isolates were resistant to PZA. As PZA is part of current and planned MDR-TB treatment regimens, this is alarming and deserves the attention of health authorities. Based on our typing and sequence analysis results, we conclude that PZA resistance is the result of primary transmission as well as acquisition within the patient, and we recommend prospective genotyping and PZA resistance testing in high MDR-TB settings. This is of utmost importance in order to preserve bacterial susceptibility to PZA and to help protect (new) second-line drugs in PZA-containing regimens.
Subjects
Amidohydrolases/genetics , Mutation , Mycobacterium tuberculosis/drug effects , Pyrazinamide/pharmacology , Multidrug-Resistant Tuberculosis/microbiology , Antitubercular Agents/pharmacology , Antitubercular Agents/therapeutic use , Genotype , Georgia (Republic)/epidemiology , Humans , Microbial Sensitivity Tests , Mycobacterium tuberculosis/genetics , Mycobacterium tuberculosis/isolation & purification , Prevalence , Prospective Studies , Pyrazinamide/therapeutic use , Multidrug-Resistant Tuberculosis/epidemiology , Multidrug-Resistant Tuberculosis/transmission
Abstract
BACKGROUND: Approximately 10% of the Mycobacterium tuberculosis genome is made up of two families of genes that are poorly characterized due to their high GC content and highly repetitive nature. The PE and PPE families are typified by their highly conserved N-terminal domains that incorporate proline-glutamate (PE) and proline-proline-glutamate (PPE) signature motifs. They are hypothesised to be important virulence factors involved in host-pathogen interactions, but their high genetic variability and complexity of analysis mean they are typically disregarded in genome studies. RESULTS: To elucidate the structure of these genes, 518 genomes from a diverse international collection of clinical isolates were de novo assembled. A further 21 reference M. tuberculosis complex genomes and long read sequence data were used to validate the approach. SNP analysis revealed that variation in the majority of the 168 pe/ppe genes studied was consistent with lineage. Several recombination hotspots were identified, notably pe_pgrs3 and pe_pgrs17. Evidence of positive selection was revealed in 65 pe/ppe genes, including epitopes potentially binding to major histocompatibility complex molecules. CONCLUSIONS: This, the first comprehensive study of the pe and ppe genes, provides important insight into M. tuberculosis diversity and has significant implications for vaccine development.