1.
ArXiv ; 2024 May 08.
Article in English | MEDLINE | ID: mdl-38764594

ABSTRACT

The COVID-19 pandemic led to a large global effort to sequence SARS-CoV-2 genomes from patient samples to track viral evolution and inform public health response. Millions of SARS-CoV-2 genome sequences have been deposited in global public repositories. The Canadian COVID-19 Genomics Network (CanCOGeN - VirusSeq), a consortium tasked with coordinating expanded sequencing of SARS-CoV-2 genomes across Canada early in the pandemic, created the Canadian VirusSeq Data Portal, with associated data pipelines and procedures, to support these efforts. The goal of VirusSeq was to allow open access to Canadian SARS-CoV-2 genomic sequences and enhanced, standardized contextual data that were unavailable in other repositories and that meet FAIR standards (Findable, Accessible, Interoperable and Reusable). In addition, the Portal data submission pipeline contains data quality checking procedures and appropriate acknowledgement of data generators that encourages collaboration. From inception to execution, the portal was developed with a conscientious focus on strong data governance principles and practices. Extensive efforts ensured a commitment to Canadian privacy laws, data security standards, and organizational processes. This Portal has been coupled with other resources like Viral AI and was further leveraged by the Coronavirus Variants Rapid Response Network (CoVaRR-Net) to produce a suite of continually updated analytical tools and notebooks. Here we highlight this Portal, including its contextual data not available elsewhere, and the 'Duotang', a web platform that presents key genomic epidemiology and modeling analyses on circulating and emerging SARS-CoV-2 variants in Canada. Duotang presents dynamic changes in variant composition of SARS-CoV-2 in Canada and by province, estimates variant growth, and displays complementary interactive visualizations, with a text overview of the current situation. 
The VirusSeq Data Portal and Duotang resources, alongside additional analyses and resources computed from the Portal (COVID-MVP, CoVizu), are all open-source and freely available. Together, they provide an updated picture of SARS-CoV-2 evolution to spur scientific discussions, inform public discourse, and support communication with and within public health authorities. They also serve as a framework for other jurisdictions interested in open, collaborative sequence data sharing and analyses.

2.
Bioinform Adv ; 3(1): vbac099, 2023.
Article in English | MEDLINE | ID: mdl-36698766

ABSTRACT

Motivation: Increasingly complex omics datasets are being generated, along with associated diverse categories of metadata (environmental, clinical, etc.). Looking at the correlation between these variables can be critical to identify potential confounding factors and novel relationships. To date, some correlation globe software has been developed to aid investigations; however, they lack secure, dynamic visualization capability. Results: GlobeCorr.ca is a web-based application designed to provide user-friendly, interactive visualization and analysis of correlation datasets. Users load tabular data listing pairwise variables and their correlation values, and GlobeCorr creates a dynamic visualization using ribbons to represent positive and negative correlations, optionally grouped by domain/category (such as microbiome taxa against other metadata). GlobeCorr runs securely (locally on a user's computer) and provides a simple method for users to visualize and summarize complex datasets. This tool is applicable to a wide range of disciplines and domains of interest, including the bioinformatics/microbiome and metadata examples provided within. Availability and Implementation: See https://GlobeCorr.ca; Code provided under an open source MIT license: https://github.com/brinkmanlab/globecorr.
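The input described above, a table of pairwise variables and their correlation values, can be sketched as follows. This is a minimal illustration, not GlobeCorr's own code: the column names (`var1`, `var2`, `correlation`), the sample measurements, and the helper functions are assumptions, and the exact file layout GlobeCorr expects should be checked against its documentation.

```python
import csv
import io
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pairwise_table(columns):
    """Return rows (var1, var2, r) for every unordered pair of variables."""
    return [(a, b, pearson(columns[a], columns[b]))
            for a, b in combinations(sorted(columns), 2)]


# Hypothetical measurements: one taxon abundance and two metadata variables.
samples = {
    "Bacteroides": [1.0, 2.0, 3.0, 4.0],
    "pH": [2.0, 4.0, 6.0, 8.0],
    "temperature": [8.0, 6.0, 4.0, 2.0],
}

rows = pairwise_table(samples)

# Serialize as tab-separated text (assumed layout, one pair per row).
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(["var1", "var2", "correlation"])
for a, b, r in rows:
    writer.writerow([a, b, f"{r:.3f}"])
table_text = buffer.getvalue()
```

Each row carries one variable pair and a signed coefficient, which is the information a ribbon-style view needs to distinguish positive from negative correlations.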

3.
Nucleic Acids Res ; 51(D1): D690-D699, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36263822

ABSTRACT

The Comprehensive Antibiotic Resistance Database (CARD; card.mcmaster.ca) combines the Antibiotic Resistance Ontology (ARO) with curated AMR gene (ARG) sequences and resistance-conferring mutations to provide an informatics framework for annotation and interpretation of resistomes. As of version 3.2.4, CARD encompasses 6627 ontology terms, 5010 reference sequences, 1933 mutations, 3004 publications, and 5057 AMR detection models that can be used by the accompanying Resistance Gene Identifier (RGI) software to annotate genomic or metagenomic sequences. Focused curation enhancements since 2020 include expanded β-lactamase curation, incorporation of likelihood-based AMR mutations for Mycobacterium tuberculosis, addition of disinfectants and antiseptics plus their associated ARGs, and systematic curation of resistance-modifying agents. This expanded curation includes 180 new AMR gene families, 15 new drug classes, 1 new resistance mechanism, and two new ontological relationships: evolutionary_variant_of and is_small_molecule_inhibitor. In silico prediction of resistomes and prevalence statistics of ARGs has been expanded to 377 pathogens, 21,079 chromosomes, 2,662 genomic islands, 41,828 plasmids and 155,606 whole-genome shotgun assemblies, resulting in collation of 322,710 unique ARG allele sequences. New features include the CARD:Live collection of community submitted isolate resistome data and the introduction of standardized 15 character CARD Short Names for ARGs to support machine learning efforts.


Subject(s)
Data Curation , Databases, Factual , Drug Resistance, Microbial , Machine Learning , Anti-Bacterial Agents/pharmacology , Genes, Bacterial , Likelihood Functions , Software , Molecular Sequence Annotation
4.
PLoS One ; 17(2): e0261103, 2022.
Article in English | MEDLINE | ID: mdl-35196314

ABSTRACT

A variety of islet autoantibodies (AAbs) can predict and possibly dictate eventual type 1 diabetes (T1D) diagnosis. Upwards of 75% of those with T1D are positive for AAbs against glutamic acid decarboxylase (GAD65 or GAD), a producer of gamma-aminobutyric acid (GABA) in human pancreatic beta cells. Interestingly, bacterial populations within the human gut also express GAD and produce GABA. Evidence suggests that dysbiosis of the microbiome may correlate with T1D pathogenesis and physiology. Therefore, autoimmune linkages between the gut microbiome and islets susceptible to autoimmune attack need to be further elucidated. Utilizing in silico analyses, we show that 25 GAD sequences from human gut bacterial sources show sequence and motif similarities to human beta cell GAD65. Our motif analyses determined that most gut GAD sequences contain the pyridoxal-dependent decarboxylase (PDD) domain of human GAD65, which is important for its enzymatic activity. Additionally, we showed overlap with known human GAD65 T cell receptor epitopes, which may implicate the immune destruction of beta cells. Thus, we propose a physiological hypothesis in which changes in the gut microbiome in those with T1D result in a release of bacterial GAD, thus causing miseducation of the host immune system. Due to the notable similarities we found between human and bacterial GAD, these deputized immune cells may then target human beta cells leading to the development of T1D.


Subject(s)
Autoantibodies/immunology , Bacteria/enzymology , Diabetes Mellitus, Type 1/immunology , Diabetes Mellitus, Type 1/microbiology , Gastrointestinal Microbiome/immunology , Glutamate Decarboxylase/genetics , Glutamate Decarboxylase/immunology , Animals , Antigen-Presenting Cells/immunology , Computer Simulation , Diabetes Mellitus, Type 1/enzymology , Epitopes, T-Lymphocyte/immunology , Genes, Bacterial , Humans , Islets of Langerhans/enzymology , Islets of Langerhans/immunology , Mice , Pan troglodytes/microbiology , Phylogeny , Protein Domains , Sequence Alignment/methods , gamma-Aminobutyric Acid/metabolism
5.
Microb Genom ; 6(10), 2020 10.
Article in English | MEDLINE | ID: mdl-33001022

ABSTRACT

Metagenomic methods enable the simultaneous characterization of microbial communities without time-consuming and bias-inducing culturing. Metagenome-assembled genome (MAG) binning methods aim to reassemble individual genomes from this data. However, the recovery of mobile genetic elements (MGEs), such as plasmids and genomic islands (GIs), by binning has not been well characterized. Given the association of antimicrobial resistance (AMR) genes and virulence factor (VF) genes with MGEs, studying their transmission is a public-health priority. The variable copy number and sequence composition of MGEs makes them potentially problematic for MAG binning methods. To systematically investigate this issue, we simulated a low-complexity metagenome comprising 30 GI-rich and plasmid-containing bacterial genomes. MAGs were then recovered using 12 current prediction pipelines and evaluated. While 82-94 % of chromosomes could be correctly recovered and binned, only 38-44 % of GIs and 1-29 % of plasmid sequences were found. Strikingly, no plasmid-borne VF nor AMR genes were recovered, and only 0-45 % of AMR or VF genes within GIs. We conclude that short-read MAG approaches, without further optimization, are largely ineffective for the analysis of mobile genes, including those of public-health importance, such as AMR and VF genes. We propose that researchers should explore developing methods that optimize for this issue and consider also using unassembled short reads and/or long-read approaches to more fully characterize metagenomic data.
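The recovery percentages reported above can be read as the share of reference sequences in each category that end up in some bin. The sketch below illustrates that bookkeeping only. The sequence identifiers, the toy data, and the `recovery_rates` helper are hypothetical, and the study's actual criteria for counting a sequence as correctly recovered and binned are not reproduced here.

```python
from collections import defaultdict


def recovery_rates(reference, recovered):
    """Percent of reference sequences per category that appear in `recovered`.

    reference: dict mapping sequence ID -> category (e.g. "chromosome").
    recovered: set of sequence IDs that were found in any bin.
    """
    totals = defaultdict(int)
    found = defaultdict(int)
    for seq_id, category in reference.items():
        totals[category] += 1
        if seq_id in recovered:
            found[category] += 1
    return {cat: 100.0 * found[cat] / totals[cat] for cat in totals}


# Hypothetical toy data, not taken from the study.
reference = {
    "chr1": "chromosome", "chr2": "chromosome",
    "gi1": "genomic island", "gi2": "genomic island",
    "p1": "plasmid", "p2": "plasmid", "p3": "plasmid", "p4": "plasmid",
}
recovered = {"chr1", "chr2", "gi1", "p1"}

rates = recovery_rates(reference, recovered)
```

Reporting the rate per category, rather than one pooled figure, is what exposes the gap between chromosomes and mobile elements that the abstract describes.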


Subject(s)
Bacteria/genetics , Genomic Islands/genetics , Metagenome/genetics , Metagenomics/methods , Plasmids/genetics , Algorithms , Computer Simulation , Genome, Bacterial/genetics , Microbiota/genetics , Sequence Analysis, DNA
6.
Bioinformatics ; 36(10): 3043-3048, 2020 05 01.
Article in English | MEDLINE | ID: mdl-32108861

ABSTRACT

MOTIVATION: Many methods for microbial protein subcellular localization (SCL) prediction exist; however, none is readily available for analysis of metagenomic sequence data, despite growing interest from researchers studying microbial communities in humans, agri-food relevant organisms and in other environments (e.g. for identification of cell-surface biomarkers for rapid protein-based diagnostic tests). We wished to also identify new markers of water quality from freshwater samples collected from pristine versus pollution-impacted watersheds. RESULTS: We report PSORTm, the first bioinformatics tool designed for prediction of diverse bacterial and archaeal protein SCL from metagenomics data. PSORTm incorporates components of PSORTb, one of the most precise and widely used protein SCL predictors, with an automated classification by cell envelope. An evaluation using 5-fold cross-validation with in silico-fragmented sequences with known localization showed that PSORTm maintains PSORTb's high precision, while sensitivity increases proportionately with metagenomic sequence fragment length. PSORTm's read-based analysis was similar to PSORTb-based analysis of metagenome-assembled genomes (MAGs); however, the latter requires non-trivial manual classification of each MAG by cell envelope, and cannot make use of unassembled sequences. Analysis of the watershed samples revealed the importance of normalization and identified potential biomarkers of water quality. This method should be useful for examining a wide range of microbial communities, including human microbiomes, and other microbiomes of medical, environmental or industrial importance. AVAILABILITY AND IMPLEMENTATION: Documentation, source code and docker containers are available for running PSORTm locally at https://www.psort.org/psortm/ (freely available, open-source software under GNU General Public License Version 3). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
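The evaluation described above relies on two ingredients: fragmenting sequences of known localization in silico and splitting them into five folds. A minimal sketch of both follows, together with precision and sensitivity computed from prediction counts. The fragment length, the non-overlapping fragmentation scheme, the round-robin fold assignment, the example sequence, and all function names are assumptions for illustration; the abstract does not specify how the PSORTm evaluation implemented these steps.

```python
def fragment(sequence, length):
    """Cut a sequence into consecutive, non-overlapping fragments of `length`.

    A trailing piece shorter than `length` is dropped.
    """
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, length)]


def k_folds(items, k=5):
    """Assign items to k folds round-robin; returns a list of k lists."""
    folds = [[] for _ in range(k)]
    for index, item in enumerate(items):
        folds[index % k].append(item)
    return folds


def precision_sensitivity(true_pos, false_pos, false_neg):
    """Precision = TP / (TP + FP); sensitivity (recall) = TP / (TP + FN)."""
    precision = true_pos / (true_pos + false_pos)
    sensitivity = true_pos / (true_pos + false_neg)
    return precision, sensitivity


# Hypothetical 25-residue protein sequence, cut into 5-residue fragments.
protein = "MKTAYIAKQRQISFVKSHFSRQLEE"
fragments = fragment(protein, 5)
folds = k_folds(fragments, k=5)

# In each round, one fold is held out for testing and the rest are used
# for training; here we only collect the held-out and training sets.
rounds = []
for held_out in range(len(folds)):
    test_set = folds[held_out]
    train_set = [f for i, fold in enumerate(folds) if i != held_out for f in fold]
    rounds.append((train_set, test_set))
```

Varying `length` in `fragment` is one way to examine how a metric such as sensitivity changes with fragment size, which is the relationship the abstract reports.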


Subject(s)
Archaea , Metagenomics , Archaea/genetics , Bacteria/genetics , Humans , Metagenome , Software
7.
Nucleic Acids Res ; 45(D1): D566-D573, 2017 01 04.
Article in English | MEDLINE | ID: mdl-27789705

ABSTRACT

The Comprehensive Antibiotic Resistance Database (CARD; http://arpcard.mcmaster.ca) is a manually curated resource containing high quality reference data on the molecular basis of antimicrobial resistance (AMR), with an emphasis on the genes, proteins and mutations involved in AMR. CARD is ontologically structured, model centric, and spans the breadth of AMR drug classes and resistance mechanisms, including intrinsic, mutation-driven and acquired resistance. It is built upon the Antibiotic Resistance Ontology (ARO), a custom built, interconnected and hierarchical controlled vocabulary allowing advanced data sharing and organization. Its design allows the development of novel genome analysis tools, such as the Resistance Gene Identifier (RGI) for resistome prediction from raw genome sequence. Recent improvements include extensive curation of additional reference sequences and mutations, development of a unique Model Ontology and accompanying AMR detection models to power sequence analysis, new visualization tools, and expansion of the RGI for detection of emergent AMR threats. CARD curation is updated monthly based on an interplay of manual literature curation, computational text mining, and genome analysis.


Subject(s)
Computational Biology/methods , Databases, Genetic , Drug Resistance, Microbial , Microbiology , Biological Ontologies , Data Curation , Web Browser
8.
Front Microbiol ; 6: 1036, 2015.
Article in English | MEDLINE | ID: mdl-26483767

ABSTRACT

The International Pseudomonas aeruginosa Consortium is sequencing over 1000 genomes and building an analysis pipeline for the study of Pseudomonas genome evolution, antibiotic resistance and virulence genes. Metadata, including genomic and phenotypic data for each isolate of the collection, are available through the International Pseudomonas Consortium Database (http://ipcd.ibis.ulaval.ca/). Here, we present our strategy and the results that emerged from the analysis of the first 389 genomes. With as yet unmatched resolution, our results confirm that P. aeruginosa strains can be divided into three major groups that are further divided into subgroups, some not previously reported in the literature. We also provide the first snapshot of P. aeruginosa strain diversity with respect to antibiotic resistance. Our approach will allow us to draw potential links between environmental strains and those implicated in human and animal infections, understand how patients become infected and how the infection evolves over time as well as identify prognostic markers for better evidence-based decisions on patient care.
