1.
Nucleic Acids Res ; 51(D1): D690-D699, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36263822

ABSTRACT

The Comprehensive Antibiotic Resistance Database (CARD; card.mcmaster.ca) combines the Antibiotic Resistance Ontology (ARO) with curated AMR gene (ARG) sequences and resistance-conferring mutations to provide an informatics framework for annotation and interpretation of resistomes. As of version 3.2.4, CARD encompasses 6627 ontology terms, 5010 reference sequences, 1933 mutations, 3004 publications, and 5057 AMR detection models that can be used by the accompanying Resistance Gene Identifier (RGI) software to annotate genomic or metagenomic sequences. Focused curation enhancements since 2020 include expanded β-lactamase curation, incorporation of likelihood-based AMR mutations for Mycobacterium tuberculosis, addition of disinfectants and antiseptics plus their associated ARGs, and systematic curation of resistance-modifying agents. This expanded curation includes 180 new AMR gene families, 15 new drug classes, 1 new resistance mechanism, and two new ontological relationships: evolutionary_variant_of and is_small_molecule_inhibitor. In silico prediction of resistomes and prevalence statistics of ARGs have been expanded to 377 pathogens, 21,079 chromosomes, 2,662 genomic islands, 41,828 plasmids and 155,606 whole-genome shotgun assemblies, resulting in collation of 322,710 unique ARG allele sequences. New features include the CARD:Live collection of community-submitted isolate resistome data and the introduction of standardized 15-character CARD Short Names for ARGs to support machine learning efforts.


Subject(s)
Data Curation , Databases, Factual , Drug Resistance, Microbial , Machine Learning , Anti-Bacterial Agents/pharmacology , Genes, Bacterial , Likelihood Functions , Software , Molecular Sequence Annotation
2.
Clin Microbiol Rev ; 29(4): 881-913, 2016 10.
Article in English | MEDLINE | ID: mdl-28590251

ABSTRACT

The number of large-scale genomics projects is increasing due to the availability of affordable high-throughput sequencing (HTS) technologies. The use of HTS for bacterial infectious disease research is attractive because one whole-genome sequencing (WGS) run can replace multiple assays for bacterial typing, molecular epidemiology investigations, and more in-depth pathogenomic studies. The computational resources and bioinformatics expertise required to accommodate and analyze the large amounts of data pose new challenges for researchers embarking on genomics projects for the first time. Here, we present a comprehensive overview of a bacterial genomics project from beginning to end, with a particular focus on the planning and computational requirements for HTS data, and provide a general understanding of the analytical concepts to develop a workflow that will meet the objectives and goals of HTS projects.


Subject(s)
Communicable Diseases/microbiology , Genome, Bacterial/genetics , Genomics , High-Throughput Nucleotide Sequencing , Humans
3.
Bioinformatics ; 32(8): 1275-7, 2016 04 15.
Article in English | MEDLINE | ID: mdl-26656932

ABSTRACT

MOTIVATION: There are various reasons for rerunning bioinformatics tools and pipelines on sequencing data, including reproducing a past result, validation of a new tool or workflow using a known dataset, or tracking the impact of database changes. For identical results to be achieved, regularly updated reference sequence databases must be versioned and archived. Database administrators have tried to fill the requirements by supplying users with one-off versions of databases, but these are time consuming to set up and are inconsistent across resources. Disk storage and data backup performance have also discouraged maintaining multiple versions of databases since databases such as NCBI nr can consume 50 Gb or more disk space per version, with growth rates that parallel Moore's law. RESULTS: Our end-to-end solution combines our own Kipper software package (a simple key-value versioning system for large files) with BioMAJ (software for downloading sequence databases) and Galaxy (a web-based bioinformatics data processing platform). Available versions of databases can be recalled and used by command-line and Galaxy users. The Kipper data store format makes publishing curated FASTA databases convenient since in most cases it can store a range of versions into a file marginally larger than the size of the latest version. AVAILABILITY AND IMPLEMENTATION: Kipper v1.0.0 and the Galaxy Versioned Data tool are written in Python and released as free and open source software available at https://github.com/Public-Health-Bioinformatics/kipper and https://github.com/Public-Health-Bioinformatics/versioned_data, respectively; detailed setup instructions can be found at https://github.com/Public-Health-Bioinformatics/versioned_data/blob/master/doc/setup.md CONTACT: Damion.Dooley@Bccdc.Ca or William.Hsiao@Bccdc.Ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology , Databases, Nucleic Acid , Software , User-Computer Interface
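
The key-value versioning idea described in the Kipper abstract above can be illustrated with a minimal sketch. This is not Kipper's actual data store format or API, which the abstract does not specify; the `Record` and `VersionedStore` names and the interval scheme (each value tagged with the version where it appears and the version where it disappears) are assumptions chosen only to show why many versions can fit in a file not much larger than the latest one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Record:
    value: str
    created: int                   # first version in which this value is present
    deleted: Optional[int] = None  # first version in which it is absent (None = still live)


@dataclass
class VersionedStore:
    """Hypothetical interval-based key-value store; not Kipper's real format."""
    latest: int = 0
    records: Dict[str, List[Record]] = field(default_factory=dict)

    def commit(self, snapshot: Dict[str, str]) -> int:
        """Register a full snapshot as the next version, storing only differences."""
        self.latest += 1
        version = self.latest
        # Close live records whose key vanished or whose value changed.
        for key, history in self.records.items():
            live = history[-1]
            if live.deleted is None and snapshot.get(key) != live.value:
                live.deleted = version
        # Open records for new keys and for changed values.
        for key, value in snapshot.items():
            history = self.records.setdefault(key, [])
            if not history or history[-1].deleted is not None:
                history.append(Record(value, version))
        return version

    def checkout(self, version: int) -> Dict[str, str]:
        """Rebuild the key-value contents exactly as they were at `version`."""
        out: Dict[str, str] = {}
        for key, history in self.records.items():
            for rec in history:
                if rec.created <= version and (rec.deleted is None or version < rec.deleted):
                    out[key] = rec.value
        return out


# Illustrative FASTA-like data: identifier -> sequence (made-up values).
store = VersionedStore()
store.commit({"seqA": "ACGT", "seqB": "GGCC"})
store.commit({"seqA": "ACGA", "seqC": "TTAA"})
print(store.checkout(1))
print(store.checkout(2))
```

Unchanged entries are stored once no matter how many versions they survive, so storage grows with the amount of change between releases instead of with the number of releases.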
4.
BMC Genomics ; 17(1): 990, 2016 12 03.
Article in English | MEDLINE | ID: mdl-27912729

ABSTRACT

BACKGROUND: Whole genome sequencing (WGS) is useful for determining clusters of human cases, investigating outbreaks, and defining the population genetics of bacteria. It also provides information about other aspects of bacterial biology, including classical typing results, virulence, and adaptive strategies of the organism. Cell culture invasion and protein expression patterns of four related multilocus sequence type 21 (ST21) C. jejuni isolates from a significant Canadian water-borne outbreak were previously associated with the presence of a CJIE1 prophage. Whole genome sequencing was used to examine the genetic diversity among these isolates and confirm that previous observations could be attributed to differential prophage carriage. Moreover, we sought to determine the presence of genome sequences that could be used as surrogate markers to delineate outbreak-associated isolates. RESULTS: Differential carriage of the CJIE1 prophage was identified as the major genetic difference among the four outbreak isolates. High quality single-nucleotide variant (hqSNV) and core genome multilocus sequence typing (cgMLST) clustered these isolates within expanded datasets consisting of additional C. jejuni strains. The number and location of homopolymeric tract regions was identical in all four outbreak isolates but differed from all other C. jejuni examined. Comparative genomics and PCR amplification enabled the identification of large chromosomal inversions of approximately 93 kb and 388 kb within the outbreak isolates associated with transducer-like proteins containing long nucleotide repeat sequences. The 93-kb inversion was characteristic of the outbreak-associated isolates, and the gene content of this inverted region displayed high synteny with the reference strain. 
CONCLUSIONS: The four outbreak isolates were clonally derived and differed mainly in the presence of the CJIE1 prophage, validating earlier findings linking the prophage to phenotypic differences in virulence assays and protein expression. The identification of large, genetically syntenous chromosomal inversions in the genomes of outbreak-associated isolates provided a unique method for discriminating outbreak isolates from the background population. Transducer-like proteins appear to be associated with the chromosomal inversions. CgMLST and hqSNV analysis also effectively delineated the outbreak isolates within the larger C. jejuni population structure.


Subject(s)
Campylobacter Infections/microbiology , Campylobacter jejuni/genetics , Genome, Bacterial , Genomics , Campylobacter Infections/epidemiology , Campylobacter jejuni/classification , Campylobacter jejuni/isolation & purification , Campylobacter jejuni/virology , Canada/epidemiology , Chromosome Inversion , Chromosomes, Bacterial , Disease Outbreaks , Genetic Variation , Genomics/methods , High-Throughput Nucleotide Sequencing , Humans , Multilocus Sequence Typing , Phylogeny , Plasmids/genetics , Prophages/genetics , Sequence Analysis, DNA
5.
Microorganisms ; 10(2), 2022 Jan 26.
Article in English | MEDLINE | ID: mdl-35208747

ABSTRACT

Whole genome sequencing (WGS) of Salmonella supports both molecular typing and detection of antimicrobial resistance (AMR). Here, we evaluated the correlation between phenotypic antimicrobial susceptibility testing (AST) and in silico prediction of AMR from WGS in Salmonella enterica (n = 1321) isolated from human infections in Canada. Phenotypic AMR results from broth microdilution testing were used as the gold standard. To facilitate high-throughput prediction of AMR from genome assemblies, we created a tool called Staramr, which incorporates the ResFinder and PointFinder databases and a custom gene-drug key for antibiogram prediction. Overall, there was 99% concordance between phenotypic and genotypic detection of categorical resistance for 14 antimicrobials in 1321 isolates (18,305 of 18,494 results in agreement). We observed an average sensitivity of 91.2% (range 80.5-100%), a specificity of 99.7% (98.6-100%), a positive predictive value of 95.4% (68.2-100%), and a negative predictive value of 99.1% (95.6-100%). The positive predictive value of gentamicin was 68%, due to seven isolates that carried aac(3)-IVa, which conferred MICs just below the breakpoint of resistance. Genetic mechanisms of resistance in these 1321 isolates included 64 unique acquired alleles and mutations in three chromosomal genes. In general, in silico prediction of AMR in Salmonella was reliable compared to the gold standard of broth microdilution. WGS can provide higher-resolution data on the epidemiology of resistance mechanisms and the emergence of new resistance alleles.
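
The agreement statistics quoted in the abstract above follow from standard definitions, sketched here for reference. The concordance figures (18,305 of 18,494) come from the abstract; the `Confusion` class, its field names, and the counts passed to it are illustrative assumptions, not data from the study and not part of Staramr.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Genotype prediction versus phenotype (the reference standard)."""
    tp: int  # predicted resistant, phenotypically resistant
    fp: int  # predicted resistant, phenotypically susceptible
    tn: int  # predicted susceptible, phenotypically susceptible
    fn: int  # predicted susceptible, phenotypically resistant

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def concordance(agreeing: int, total: int) -> float:
    """Fraction of genotype/phenotype result pairs that agree."""
    return agreeing / total


# Figures from the abstract: 1321 isolates x 14 antimicrobials = 18,494 results.
print(f"concordance: {concordance(18305, 18494):.1%}")

# Hypothetical counts for a single drug, for illustration only.
example = Confusion(tp=90, fp=5, tn=900, fn=10)
print(f"sensitivity: {example.sensitivity():.1%}, PPV: {example.ppv():.1%}")
```

Because PPV divides by all predicted-resistant results, a handful of false positives can lower it sharply for a drug with few truly resistant isolates, which is consistent with the gentamicin observation in the abstract.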

6.
Emerg Infect Dis ; 17(11): 2113-21, 2011 Nov.
Article in English | MEDLINE | ID: mdl-22099115

ABSTRACT

Cholera was absent from the island of Hispaniola at least a century before an outbreak that began in Haiti in the fall of 2010. Pulsed-field gel electrophoresis (PFGE) analysis of clinical isolates from the Haiti outbreak and recent global travelers returning to the United States showed indistinguishable PFGE fingerprints. To better explore the genetic ancestry of the Haiti outbreak strain, we acquired 23 whole-genome Vibrio cholerae sequences: 9 isolates obtained in Haiti or the Dominican Republic, 12 PFGE pattern-matched isolates linked to Asia or Africa, and 2 nonmatched outliers from the Western Hemisphere. Phylogenies for whole-genome sequences and core genome single-nucleotide polymorphisms showed that the Haiti outbreak strain is genetically related to strains originating in India and Cameroon. However, because no identical genetic match was found among sequenced contemporary isolates, a definitive genetic origin for the outbreak in Haiti remains speculative.


Subject(s)
Genome, Bacterial , Vibrio cholerae/genetics , Africa/epidemiology , Alleles , Asia/epidemiology , Bacterial Proteins/genetics , Cholera/epidemiology , Cholera Toxin/genetics , Disease Outbreaks , Electrophoresis, Gel, Pulsed-Field , Gene Order , Haiti/epidemiology , Humans , Interspersed Repetitive Sequences/genetics , Phylogeny , Prophages , Sequence Homology, Amino Acid , Vibrio cholerae/classification , Vibrio cholerae/isolation & purification
7.
Bioinformatics ; 26(24): 3125-6, 2010 Dec 15.
Article in English | MEDLINE | ID: mdl-20956244

ABSTRACT

SUMMARY: GView is a Java application for viewing and examining prokaryotic genomes in a circular or linear context. It accepts standard sequence file formats and an optional style specification file to generate customizable, publication quality genome maps in bitmap and scalable vector graphics formats. GView features an interactive pan-and-zoom interface, a command-line interface for incorporation in genome analysis pipelines, and a public Application Programming Interface for incorporation in other Java applications. AVAILABILITY: GView is a freely available application licensed under the GNU Public License. The application, source code, documentation, file specifications, tutorials and image galleries are available at http://gview.ca.


Subject(s)
Computer Graphics , Genome, Bacterial , Software , Genomics , Roseobacter/genetics , User-Computer Interface
9.
Sci Rep ; 10(1): 3937, 2020 03 03.
Article in English | MEDLINE | ID: mdl-32127598

ABSTRACT

For a One-Health investigation of antimicrobial resistance (AMR) in Enterococcus spp., isolates from humans and beef cattle along with abattoirs, manured fields, natural streams, and wastewater from both urban and cattle feedlot sources were collected over two years. Species identification of Enterococcus revealed distinct associations across the continuum. Of the 8430 isolates collected, Enterococcus faecium and Enterococcus faecalis were the main species in urban wastewater (90%) and clinical human isolates (99%); Enterococcus hirae predominated in cattle (92%) and feedlot catch-basins (60%), whereas natural streams harbored environmental Enterococcus spp. Whole-genome sequencing of E. faecalis (n = 366 isolates) and E. faecium (n = 342 isolates) revealed source clustering of isolates, indicative of distinct adaptation to their respective environments. Phenotypic resistance to tetracyclines and macrolides, encoded by tet(M) and erm(B) respectively, was prevalent among Enterococcus spp. regardless of source. For E. faecium from cattle, resistance to β-lactams and quinolones was observed among 3% and 8% of isolates respectively, compared to 76% and 70% of human clinical isolates. Clinical vancomycin-resistant E. faecium exhibited high rates of multi-drug resistance, with resistance to all β-lactams, macrolides, and quinolones tested. Differences in the AMR profiles among isolates reflected antimicrobial use practices in each sector of the One-Health continuum.


Subject(s)
Anti-Bacterial Agents/pharmacology , Enterococcus/pathogenicity , Drug Resistance, Bacterial/genetics , Enterococcus/drug effects , Enterococcus/genetics , Enterococcus faecalis/drug effects , Enterococcus faecalis/genetics , Enterococcus faecalis/pathogenicity , Enterococcus faecium/drug effects , Enterococcus faecium/genetics , Enterococcus faecium/pathogenicity , Humans , Macrolides/pharmacology , Phylogeny , Quinolones/pharmacology , Tetracyclines/pharmacology , Virulence , Whole Genome Sequencing , beta-Lactam Resistance/genetics
10.
Int J Mycobacteriol ; 8(3): 273-280, 2019.
Article in English | MEDLINE | ID: mdl-31512604

ABSTRACT

Background: Mycobacterium abscessus is a rapidly growing nontuberculous mycobacterium (NTM) and a clinically significant pathogen capable of causing variable infections in humans that are difficult to treat and may require months of therapy or surgical interventions. Like other NTMs, M. abscessus can be associated with outbreaks leading to complex investigations and treatment of affected cases. Typing schemes for bacterial pathogens, which have numerous applications including identifying chains of transmission and tracking genomic evolution, are lacking or limited for many NTMs, including M. abscessus. Methods: We extended the existing scheme from PubMLST using whole-genome data for M. abscessus by extracting data for 15 genetic regions within the M. abscessus genome. A total of 168 whole genomes and 11 gene sequences were used to build this scheme (MAB-multilocus sequence typing [MLST]). Results: All seven genes from the PubMLST scheme, namely argH, cya, gnd, murC, pta, purH, and rpoB, were expanded by 10, 14, 13, 10, 13, 10, and 9 alleles, respectively. Another eight novel genes were added, including hsp65, erm(41), arr, rrs, rrl, gyrA, gyrB, and recA with 16, 16, 25, 7, 32, 35, 29, and 15 alleles, respectively, with 85 unique sequence types identified among all isolates. Conclusion: MAB-MLST can provide identification of M. abscessus complex to the subspecies level based on three genes and can provide prediction of antimicrobial resistance and susceptibility based on results from seven genes. MAB-MLST generated a total of 85 STs, resulting in subtyping of 90 additional isolates that could not be genotyped using PubMLST and yielding results comparable to whole-genome sequencing (WGS). Implementing MAB-MLST as a Galaxy-based data analysis tool that simplifies WGS data yet maintains a high discriminatory index, which can aid in deciphering an outbreak, has broad applicability for routine diagnostics.


Subject(s)
Multilocus Sequence Typing , Mycobacterium abscessus/classification , Whole Genome Sequencing , Bacterial Proteins/genetics , Bacterial Typing Techniques , DNA, Bacterial , Drug Resistance, Multiple, Bacterial/genetics , Genome, Bacterial , Genotype , Mycobacterium Infections, Nontuberculous/microbiology , Mycobacterium abscessus/genetics , Phylogeny , Sequence Analysis, DNA
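
For readers unfamiliar with MLST, the core lookup behind any such scheme can be sketched in a few lines: each isolate gets one allele number per locus, and the ordered tuple of allele numbers maps to a sequence type (ST). The locus names below are the seven PubMLST genes named in the abstract above; the allele numbers, the ST values, and the `assign_st` function are hypothetical and do not reproduce MAB-MLST or PubMLST data.

```python
from typing import Dict, Optional, Tuple

# The seven PubMLST loci listed in the abstract, in a fixed order.
LOCI = ("argH", "cya", "gnd", "murC", "pta", "purH", "rpoB")

Profile = Tuple[int, ...]


def assign_st(alleles: Dict[str, int], scheme: Dict[Profile, int]) -> Optional[int]:
    """Return the sequence type for an allele profile, or None if the profile is novel."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return scheme.get(profile)


# Hypothetical scheme table: allele profile -> ST (made-up numbers).
scheme = {
    (1, 2, 3, 4, 5, 6, 7): 10,
    (1, 2, 3, 4, 5, 6, 8): 11,
}

isolate = {"argH": 1, "cya": 2, "gnd": 3, "murC": 4, "pta": 5, "purH": 6, "rpoB": 8}
print(assign_st(isolate, scheme))
```

A profile missing from the table returns `None`, which is the situation in which a scheme would need a new ST assigned.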
11.
Microb Genom ; 5(1), 2019 01.
Article in English | MEDLINE | ID: mdl-30648944

ABSTRACT

The persuasiveness of genomic evidence has pressured scientific agencies to supplement or replace well-established methodologies to inform public health and food safety decision-making. This study of 52 epidemiologically defined Listeria monocytogenes isolates, collected between 1981 and 2011, including nine outbreaks, was undertaken (1) to characterize their phylogenetic relationship at finished genome-level resolution, (2) to elucidate the underlying genetic diversity within an endemic subtype, CC8, and (3) to re-evaluate the genetic relationship and epidemiology of a CC8-delimited outbreak in Canada in 2008. Genomes representing Canadian Listeria outbreaks between 1981 and 2010 were closed and manually annotated. Single nucleotide variants (SNVs) and horizontally acquired traits were used to generate phylogenomic models. Phylogenomic relationships were congruent with classical subtyping and epidemiology, except for CC8 outbreaks, wherein the distribution of SNV and prophages revealed multiple co-evolving lineages. Chronophyletic reconstruction of CC8 evolution indicates that prophage-related genetic changes among CC8 strains manifest as PFGE subtype reversions, obscuring the relationship between CC8 isolates, and complicating the public health interpretation of subtyping data, even at maximum genome resolution. The size of the shared genome interrogated did not change the genetic relationship measured between highly related isolates near the tips of the phylogenetic tree, illustrating the robustness of these approaches for routine public health applications where the focus is recent ancestry. The possibility exists for temporally and epidemiologically distinct events to appear related even at maximum genome resolution, highlighting the continued importance of epidemiological evidence.


Subject(s)
Databases, Nucleic Acid , Genome, Bacterial , Listeria monocytogenes/genetics , Listeriosis/genetics , Phylogeny , Prophages/genetics , Sequence Analysis, DNA , Canada , DNA, Bacterial/genetics , Disease Outbreaks , Humans , Listeriosis/epidemiology
12.
Front Microbiol ; 8: 375, 2017.
Article in English | MEDLINE | ID: mdl-28348549

ABSTRACT

Modern epidemiology of foodborne bacterial pathogens in industrialized countries relies increasingly on whole genome sequencing (WGS) techniques. As opposed to profiling techniques such as pulsed-field gel electrophoresis, WGS requires a variety of computational methods. Since 2013, United States agencies responsible for food safety including the CDC, FDA, and USDA, have been performing whole-genome sequencing (WGS) on all Listeria monocytogenes found in clinical, food, and environmental samples. Each year, more genomes of other foodborne pathogens such as Escherichia coli, Campylobacter jejuni, and Salmonella enterica are being sequenced. Comparing thousands of genomes across an entire species requires a fast method with coarse resolution; however, capturing the fine details of highly related isolates requires a computationally heavy and sophisticated algorithm. Most L. monocytogenes investigations employing WGS depend on being able to identify an outbreak clade whose inter-genomic distances are less than an empirically determined threshold. When the difference between a few single nucleotide polymorphisms (SNPs) can help distinguish between genomes that are likely outbreak-associated and those that are less likely to be associated, we require a fine-resolution method. To achieve this level of resolution, we have developed Lyve-SET, a high-quality SNP pipeline. We evaluated Lyve-SET by retrospectively investigating 12 outbreak data sets along with four other SNP pipelines that have been used in outbreak investigation or similar scenarios. To compare these pipelines, several distance and phylogeny-based comparison methods were applied, which collectively showed that multiple pipelines were able to identify most outbreak clusters and strains. 
Currently in the US PulseNet system, whole genome multi-locus sequence typing (wgMLST) is the preferred primary method for foodborne WGS cluster detection and outbreak investigation due to its ability to name standardized genomic profiles, its central database, and its ability to be run in a graphical user interface. However, creating a functional wgMLST scheme requires extended up-front development and subject-matter expertise. When a scheme does not exist or when the highest resolution is needed, SNP analysis is used. Using three Listeria outbreak data sets, we demonstrated the concordance between Lyve-SET SNP typing and wgMLST. Availability: Lyve-SET can be found at https://github.com/lskatz/Lyve-SET.

13.
Microb Genom ; 3(6): e000116, 2017 06 30.
Article in English | MEDLINE | ID: mdl-29026651

ABSTRACT

The recent widespread application of whole-genome sequencing (WGS) for microbial disease investigations has spurred the development of new bioinformatics tools, including a notable proliferation of phylogenomics pipelines designed for infectious disease surveillance and outbreak investigation. Transitioning the use of WGS data out of the research laboratory and into the front lines of surveillance and outbreak response requires user-friendly, reproducible and scalable pipelines that have been well validated. Single Nucleotide Variant Phylogenomics (SNVPhyl) is a bioinformatics pipeline for identifying high-quality single-nucleotide variants (SNVs) and constructing a whole-genome phylogeny from a collection of WGS reads and a reference genome. Individual pipeline components are integrated into the Galaxy bioinformatics framework, enabling data analysis in a user-friendly, reproducible and scalable environment. We show that SNVPhyl can detect SNVs with high sensitivity and specificity, and identify and remove regions of high SNV density (indicative of recombination). SNVPhyl is able to correctly distinguish outbreak from non-outbreak isolates across a range of variant-calling settings, sequencing-coverage thresholds or in the presence of contamination. SNVPhyl is available as a Galaxy workflow, Docker and virtual machine images, and a Unix-based command-line application. SNVPhyl is released under the Apache 2.0 license and available at http://snvphyl.readthedocs.io/ or at https://github.com/phac-nml/snvphyl-galaxy.


Subject(s)
Computational Biology , Disease Outbreaks , Genome, Microbial , Infections , Phylogeny , Whole Genome Sequencing , Workflow , Humans , Infections/epidemiology , Infections/genetics , Infections/microbiology
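
The SNVPhyl abstract above mentions identifying and removing regions of high SNV density as a signal of recombination. A minimal sliding-window sketch of that idea follows. It is an assumption-laden illustration, not SNVPhyl's implementation: the function name `filter_dense_snvs` and the `window` and `max_snvs` parameters and their defaults are hypothetical.

```python
from typing import List, Set


def filter_dense_snvs(positions: List[int], window: int = 500, max_snvs: int = 2) -> List[int]:
    """Drop every SNV that lies in a stretch shorter than `window` bases
    holding more than `max_snvs` SNVs; return the remaining positions, sorted."""
    ordered = sorted(set(positions))
    dense: Set[int] = set()
    start = 0
    for end, pos in enumerate(ordered):
        # Advance the left edge until the span covers fewer than `window` bases.
        while pos - ordered[start] >= window:
            start += 1
        # Too many SNVs in the current span: mark all of them as dense.
        if end - start + 1 > max_snvs:
            dense.update(ordered[start:end + 1])
    return [p for p in ordered if p not in dense]


# Made-up reference coordinates: four SNVs clustered within 30 bases.
snvs = [100, 5000, 5010, 5020, 5030, 12000]
print(filter_dense_snvs(snvs, window=100, max_snvs=2))
```

With these illustrative settings the cluster around position 5000 is discarded and the two isolated SNVs are kept for phylogeny building.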
14.
mBio ; 4(4), 2013 Jul 02.
Article in English | MEDLINE | ID: mdl-23820394

ABSTRACT

Prior to the epidemic that emerged in Haiti in October of 2010, cholera had not been documented in this country. After its introduction, a strain of Vibrio cholerae O1 spread rapidly throughout Haiti, where it caused over 600,000 cases of disease and >7,500 deaths in the first two years of the epidemic. We applied whole-genome sequencing to a temporal series of V. cholerae isolates from Haiti to gain insight into the mode and tempo of evolution in this isolated population of V. cholerae O1. Phylogenetic and Bayesian analyses supported the hypothesis that all isolates in the sample set diverged from a common ancestor within a time frame that is consistent with epidemiological observations. A pangenome analysis showed nearly homogeneous genomic content, with no evidence of gene acquisition among Haiti isolates. Nine nearly closed genomes assembled from continuous-long-read data showed evidence of genome rearrangements and supported the observation of no gene acquisition among isolates. Thus, intrinsic mutational processes can account for virtually all of the observed genetic polymorphism, with no demonstrable contribution from horizontal gene transfer (HGT). Consistent with this, the 12 Haiti isolates tested by laboratory HGT assays were severely impaired for transformation, although unlike previously characterized noncompetent V. cholerae isolates, each expressed hapR and possessed a functional quorum-sensing system. Continued monitoring of V. cholerae in Haiti will illuminate the processes influencing the origin and fate of genome variants, which will facilitate interpretation of genetic variation in future epidemics. IMPORTANCE: Vibrio cholerae is the cause of substantial morbidity and mortality worldwide, with over three million cases of disease each year. An understanding of the mode and rate of evolutionary change is critical for proper interpretation of genome sequence data and attribution of outbreak sources.
The Haiti epidemic provides an unprecedented opportunity to study an isolated, single-source outbreak of Vibrio cholerae O1 over an established time frame. By using multiple approaches to assay genetic variation, we found no evidence that the Haiti strain has acquired any genes by horizontal gene transfer, an observation that led us to discover that it is also poorly transformable. We have found no evidence that environmental strains have played a role in the evolution of the outbreak strain.


Subject(s)
Cholera/epidemiology , Cholera/microbiology , Epidemics , Evolution, Molecular , Genome, Bacterial , Vibrio cholerae O1/genetics , Vibrio cholerae O1/isolation & purification , DNA, Bacterial/chemistry , DNA, Bacterial/genetics , Gene Order , Haiti/epidemiology , Humans , Mutation , Sequence Analysis, DNA , Vibrio cholerae O1/classification