ABSTRACT
The vast amount of heterogeneous omics data, encompassing a broad range of biomolecular information, requires novel methods of analysis, including those that integrate the available levels of information. In this work, we describe Regression2Net, a computational approach that integrates gene expression and genomic or methylation data in two steps. First, penalized regressions are used to build Expression-Expression (EEnet) and Expression-Genomic or Expression-Methylation (EMnet) networks. Second, network theory is used to highlight important communities of genes. When applying our approach, Regression2Net, to gene expression and methylation profiles for individuals with glioblastoma multiforme, we identified, respectively, 284 and 447 potentially interesting genes in relation to glioblastoma pathology. These genes showed at least one connection in the integrated networks ANDnet and XORnet derived from the aforementioned EEnet and EMnet networks. Whereas the edges in ANDnet occur in both EEnet and EMnet, the edges in XORnet occur in EMnet but not in EEnet. In-depth biological analysis of connected genes in ANDnet and XORnet revealed genes that are related to energy metabolism, cell cycle control (AATF), immune system response, and several cancer types. Importantly, we observed significant overrepresentation of cancer-related pathways including glioma, especially in the XORnet network, suggesting a nonignorable role of methylation in glioblastoma multiforme. In the ANDnet, we furthermore identified potential glioma suppressor genes ACCN3 and ACCN4 linked to the NBPF1 neuroblastoma breakpoint family, as well as numerous ABC transporter genes (ABCA1, ABCB1) suggesting drug resistance of glioblastoma tumors.
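The edge rules stated in this abstract (ANDnet keeps edges found in both EEnet and EMnet; XORnet keeps edges found in EMnet but not in EEnet) can be sketched with plain set operations. This is only an illustration under stated assumptions: edges are treated as undirected, and the helper name `integrate_networks` and the toy gene labels are hypothetical and do not come from Regression2Net.

```python
def integrate_networks(eenet_edges, emnet_edges):
    """Return (ANDnet, XORnet) edge sets built from two edge lists.

    ANDnet: edges present in both EEnet and EMnet.
    XORnet: edges present in EMnet but absent from EEnet
            (the definition given in the abstract).
    Assumption: edges are undirected, so (a, b) equals (b, a).
    """
    ee = {frozenset(edge) for edge in eenet_edges}
    em = {frozenset(edge) for edge in emnet_edges}
    return ee & em, em - ee


# Toy, made-up edges for illustration only.
eenet = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C")]
emnet = [("GENE_B", "GENE_A"), ("GENE_C", "GENE_D")]

andnet, xornet = integrate_networks(eenet, emnet)

# Genes with at least one connection in ANDnet.
connected_in_andnet = set().union(*andnet) if andnet else set()
```

Genes with at least one edge in either integrated network would then be the candidates carried forward to the biological analysis.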
Subjects
DNA Methylation; Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Genomics/methods; Glioblastoma/genetics; Neoplasm Proteins/genetics; Computational Biology/methods; Glioblastoma/pathology; Humans
ABSTRACT
Gene regulatory network (GRN) inference is an active area of research that facilitates understanding the complex interplays between biological molecules. We propose a novel framework to create such GRNs, based on Conditional Inference Forests (CIFs) as proposed by Strobl et al. Our framework consists of using ensembles of Conditional Inference Trees (CITs) and selecting an appropriate aggregation scheme for variant selection prior to network construction. We show on synthetic microarray data that using the original implementation of CIFs with a conditional permutation scheme (CIFcond) may lead to improved performance compared to Breiman's implementation of Random Forests (RF). Among all newly introduced CIF-based methods and five network scenarios obtained from the DREAM4 challenge, CIFcond performed best. Networks derived from well-tuned CIFs, obtained by simply averaging P-values over tree ensembles (CIFmean), are particularly attractive, because they combine adequate performance with computational efficiency. Moreover, thresholds for variable selection are based on significance levels for P-values and, hence, do not need to be tuned. From a practical point of view, our extensive simulations show the potential advantages of CIFmean-based methods. Although more work is needed to improve on speed, especially when fully exploiting the advantages of CITs in the context of heterogeneous and correlated data, we have shown that CIF methodology can be flexibly inserted in a framework to infer biological interactions. Notably, we confirmed a biologically relevant interaction between IL2RA and FOXP1, linked to the IL-2 signaling pathway and to type 1 diabetes.
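The CIFmean idea described in this abstract (average P-values over the trees of an ensemble, then select variables at a fixed significance level) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_regulators`, the input layout, and the rule that a regulator absent from a tree counts as P = 1.0 are all assumptions made for the example.

```python
from statistics import mean


def select_regulators(p_values_by_tree, alpha=0.05):
    """Average per-tree P-values and keep regulators below `alpha`.

    p_values_by_tree: list of dicts {regulator: p_value}, one dict per tree.
    Assumption (not from the source): a regulator that a tree did not
    test contributes a P-value of 1.0 to the average.
    """
    regulators = sorted({r for tree in p_values_by_tree for r in tree})
    averaged = {
        r: mean(tree.get(r, 1.0) for tree in p_values_by_tree)
        for r in regulators
    }
    # The threshold is a significance level, so no tuning step is needed.
    return {r: p for r, p in averaged.items() if p < alpha}


# Made-up P-values for one target gene across three trees.
trees = [
    {"G1": 0.01, "G2": 0.40},
    {"G1": 0.03, "G2": 0.20},
    {"G1": 0.02},
]
selected = select_regulators(trees)
```

Each selected regulator would contribute one edge to the inferred network for the target gene under consideration.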
Subjects
Biomarkers/analysis; Diabetes Mellitus, Type 1/genetics; Forkhead Transcription Factors/genetics; Gene Expression Profiling; Gene Regulatory Networks; Interleukin-2 Receptor alpha Subunit/genetics; Models, Genetic; Repressor Proteins/genetics; Humans
ABSTRACT
The molecular details of the association between the human Fyn-SH3 domain and the fragment of 18.5-kDa myelin basic protein (MBP) spanning residues S38-S107 (denoted as xα2-peptide, murine sequence numbering) were studied in silico via docking and molecular dynamics over 50-ns trajectories. The results show that interaction between the two proteins is energetically favorable and heavily dependent on the MBP proline-rich region (P93-P98) in both aqueous and membrane environments. In aqueous conditions, the xα2-peptide/Fyn-SH3 complex adopts a "sandwich"-like structure. In the membrane context, the xα2-peptide interacts with the Fyn-SH3 domain via the proline-rich region and the β-sheets of Fyn-SH3, with the latter wrapping around the proline-rich region in the form of a clip. Moreover, the simulations corroborate prior experimental evidence of the importance of upstream segments beyond the canonical SH3-ligand. This study thus provides a more detailed glimpse into the context-dependent interaction dynamics and the importance of the β-sheets in Fyn-SH3 and the proline-rich region of MBP. Proteins 2017; 85:1336-1350. © 2017 Wiley Periodicals, Inc.
Subjects
Lipid Bilayers/chemistry; Myelin Basic Protein/chemistry; Proto-Oncogene Proteins c-fyn/chemistry; Water/chemistry; src Homology Domains; Amino Acid Sequence; Animals; Binding Sites; Dimyristoylphosphatidylcholine/chemistry; Humans; Mice; Molecular Docking Simulation; Molecular Dynamics Simulation; Peptides/chemistry; Phosphorylcholine/analogs & derivatives; Phosphorylcholine/chemistry; Proline/chemistry; Protein Binding; Protein Conformation, alpha-Helical; Protein Conformation, beta-Strand; Protein Structure, Tertiary; Thermodynamics; Unithiol/chemistry
ABSTRACT
Genome-wide association studies have revealed a vast number of common loci associated with human complex diseases. Still, a large proportion of heritability remains unexplained. The extent to which rare genetic variants (RVs) are able to explain a relevant portion of the genetic heritability for complex traits leaves room for several debates and paves the way to the collection of RV databases and the development of novel analytic tools to analyze these. To date, several statistical methods have been proposed to uncover the association of RVs with complex diseases, but none of them is the clear winner in all possible scenarios of study design and assumed underlying disease model. The latter may involve differences in the distributions of effect sizes, proportions of causal variants, and ratios of protective to deleterious variants at distinct regions throughout the genome. Therefore, there is a need for robust scalable methods with acceptable overall performance in terms of power and type I error under various realistic scenarios. In this paper, we propose a novel RV association analysis strategy, which satisfies several of the desired properties that an RV analysis tool should exhibit.
Subjects
Genetic Variation; Genome-Wide Association Study; Models, Genetic; Multifactor Dimensionality Reduction; Chromosomes, Human, Pair 4/genetics; Humans
ABSTRACT
OBJECTIVES: Different types of '-omics' data are becoming available in the post-genome era; still a single -omics assessment provides limited insights to understand the biological mechanism of complex diseases. Genomics, epigenomics and transcriptomics data provide insight into the molecular dysregulation of neoplastic diseases, among them urothelial bladder cancer (UBC). Here, we propose a detailed analytical framework necessary to achieve an adequate integration of the three sets of -omics data to ultimately identify previously hidden genetic mechanisms in UBC. METHODS: We built a multi-staged framework to study possible pair-wise combinations and integrated the data in three-way relationships. SNP genotypes, CpG methylation levels and gene expression levels were determined for a total of 70 individuals with UBC and with fresh tumour tissue available. RESULTS: We suggest two main hypothesis-based scenarios for gene regulation based on the -omics integration analysis, where DNA methylation affects gene expression and genetic variants co-regulate gene expression and DNA methylation. We identified several three-way trans-association 'hotspots' that are found at the molecular level and that deserve further studies. CONCLUSIONS: The proposed integrative framework allowed us to identify relationships at the whole-genome level providing some new biological insights and highlighting the importance of integrating -omics data.
Subjects
Disease/genetics; Epigenomics; Gene Expression Profiling; Statistics as Topic; Adult; Aged; Aged, 80 and over; CpG Islands/genetics; DNA Methylation/genetics; Female; Gene Expression Regulation; Gene Frequency/genetics; Genome-Wide Association Study; Genotyping Techniques; Humans; Male; Middle Aged; Polymorphism, Single Nucleotide/genetics; Quantitative Trait Loci/genetics
ABSTRACT
Genome-wide association interaction (GWAI) studies have increased in popularity. Yet to date, no standard protocol exists. In practice, any GWAI workflow involves making choices about quality control strategy, SNP filtering, linkage disequilibrium (LD) pruning, and the analytic tool used to model or test for genetic interactions. Each of these can have an impact on the final epistasis findings and may affect their reproducibility in follow-up analyses. Choosing an analytic tool is not straightforward, as different tools exist and current understanding about their performance is often based on very particular simulation settings. In the present study, we wish to create awareness of the impact that (minor) changes in a GWAI analysis protocol can have on final epistasis findings. In particular, we investigate the influence of marker selection and marker prioritization strategies, LD pruning and the choice of epistasis detection analytics on study results, giving rise to 8 GWAI protocols. Discussions are made in the context of the ankylosing spondylitis (AS) data obtained via the Wellcome Trust Case Control Consortium (WTCCC2). As expected, the largest impact on AS epistasis findings is caused by the choice of marker selection criterion, followed by marker coding and LD pruning. In MB-MDR, co-dominant coding of main effects is more robust to the effects of LD pruning than additive coding. We were able to reproduce the previously reported epistatic involvement of HLA-B and ERAP1 in AS pathology. In addition, our results suggest involvement of MAGI3 and PARK2, responsible for cell adhesion and cellular trafficking. Gene ontology biological function enrichment analysis across the 8 considered GWAI protocols also suggested that AS could be associated with central nervous system malfunctions, specifically in nerve impulse propagation and in neurotransmitter metabolic processes.
Subjects
Databases, Genetic; Epistasis, Genetic; Genotyping Techniques/methods; Linkage Disequilibrium; Polymorphism, Single Nucleotide; Spondylitis, Ankylosing/genetics; Aged; Aminopeptidases/genetics; Female; Genome-Wide Association Study; Genotyping Techniques/standards; HLA-B Antigens/genetics; Humans; Male; Membrane Proteins/genetics; Minor Histocompatibility Antigens; Ubiquitin-Protein Ligases/genetics
ABSTRACT
Shiga toxin-producing Escherichia coli (STEC) can cause severe clinical disease in humans, particularly in young children. Recent advances have led to greater availability of sequencing technologies. We sought to use whole genome sequencing data to identify the presence or absence of known virulence factors in all clinical isolates submitted to our laboratory from Southern Alberta dated 2020-2022 and correlate these virulence factors with clinical outcomes obtained through chart review. Overall, the majority of HUS and hospitalizations were seen in patients with O157:H7 serotypes, and HUS cases were primarily in young children. The frequency of virulence factors differed between O157:H7 and non-O157 serotypes. Within the O157:H7 cases, certain virulence factors, including espP, espX1, and katP, were more frequent in HUS cases. The number of samples was too low to determine statistical significance.
ABSTRACT
A mixed phospholipid-cholesterol bilayer, with cholera toxin B (CTB) units attached to the monosialotetrahexosylganglioside (GM1) binding sites in the distal leaflet, was deposited on a Au(111) electrode surface. Polarization modulation infrared reflection absorption spectroscopy (PM-IRRAS) measurements were used to characterize structural and orientational changes in this model biological membrane upon binding CTB and the application of the electrode potential. The data presented in this article show that binding cholera toxin to the membrane leads to an overall increase in the tilt angle of the fatty acid chains; however, the conformation of the bilayer remains relatively constant as indicated by the small decrease in the total number of gauche conformers of acyl tails. In addition, the bound toxin caused a significant decrease in the hydration of the ester group contained within the lipid bilayer. Furthermore, changes in the applied potential had a minimal effect on the overall structure of the membrane. In contrast, our results showed significant voltage-dependent changes in the average orientation of the protein α-helices that may correspond to the voltage-gated opening and closing of the central pore that resides within the B subunit of cholera toxin.
Subjects
Cholera Toxin/chemistry; Electrochemical Techniques; Gangliosides/chemistry; Binding Sites; Electrodes; Gold/chemistry; Models, Molecular; Spectrophotometry, Infrared; Surface Properties
ABSTRACT
The 18.5 kDa myelin basic protein (MBP), the most abundant splice isoform in adult mammalian myelin, is a multifunctional, intrinsically disordered protein involved in the development and compaction of the myelin sheath in the central nervous system. A highly conserved central segment comprises a membrane-anchoring amphipathic α-helix followed by a proline-rich segment that represents a ligand for SH3 domain-containing proteins. Here, we have determined using solution nuclear magnetic resonance spectroscopy the structure of a 36-residue peptide fragment of MBP (murine 18.5 kDa residues S72-S107, denoted the α2-peptide) comprising these two structural motifs, in association with dodecylphosphocholine (DPC) micelles. The structure was calculated using CS-ROSETTA (version 1.01) because the nuclear Overhauser effect restraints were insufficient for this protein. The experimental studies were complemented by molecular dynamics simulations of a corresponding 24-residue peptide fragment (murine 18.5 kDa residues E80-G103, denoted the MD-peptide), also in association with a DPC micelle in silico. The experimental and theoretical results agreed well with one another, despite the independence of the starting structures and analyses, both showing membrane association via the amphipathic α-helix, and a sharp bend in the vicinity of the Pro93 residue (murine 18.5 kDa sequence numbering). Overall, the conformations elucidated here show how the SH3 ligand is presented to the cytoplasm for interaction with SH3 domain-containing proteins such as Fyn and contribute to our understanding of myelin architecture at the molecular level.
Subjects
Micelles; Molecular Dynamics Simulation; Myelin Basic Protein/chemistry; Nuclear Magnetic Resonance, Biomolecular/methods; Phosphorylcholine/analogs & derivatives; Amino Acid Sequence; Animals; Mice; Phosphorylcholine/chemistry
ABSTRACT
Enterococcus faecium is a ubiquitous opportunistic pathogen that is exhibiting increasing levels of antimicrobial resistance (AMR). Many of the genes that confer resistance and pathogenic functions are localized on mobile genetic elements (MGEs), which facilitate their transfer between lineages. Here, features including resistance determinants, virulence factors and MGEs were profiled in a set of 1273 E. faecium genomes from two disparate geographic locations (in the UK and Canada) from a range of agricultural, clinical and associated habitats. Neither lineages of E. faecium, type A and B, nor MGEs are constrained by geographic proximity, but our results show evidence of a strong association of many profiled genes and MGEs with habitat. Many features were associated with a group of clinical and municipal wastewater genomes that are likely forming a new human-associated ecotype within type A. The evolutionary dynamics of E. faecium make it a highly versatile emerging pathogen, and its ability to acquire, transmit and lose features presents a high risk for the emergence of new pathogenic variants and novel resistance combinations. This study provides a workflow for MGE-centric surveillance of AMR in Enterococcus that can be adapted to other pathogens.
Subjects
Anti-Infective Agents; Enterococcus faecium; One Health; Enterococcus faecium/genetics; Humans; Virulence Factors/genetics; Wastewater
ABSTRACT
Cryptosporidium is a protozoan parasite that is transmitted to both humans and animals through zoonotic or anthroponotic means. When a host is infected with this parasite, it causes a gastrointestinal disease known as cryptosporidiosis. To understand the transmission dynamics of Cryptosporidium, the small subunit (SSU or 18S) rRNA and gp60 genes are commonly studied through PCR analysis and conventional Sanger sequencing. However, analyzing sequence chromatograms manually is both time consuming and prone to human error, especially in the presence of poorly resolved, heterozygous peaks and the absence of a validated database. For this study, we developed a Cryptosporidium genotyping tool, called CryptoGenotyper, which can read raw Sanger sequencing data for the two common Cryptosporidium gene targets (SSU rRNA and gp60) and classify the sequence data into standard nomenclature. The CryptoGenotyper performs quality control and classifies sequences using a high-quality, manually curated reference database, saving users' time and removing bias during data analysis. The incorporated heterozygous base calling algorithms for the SSU rRNA gene target resolve double peaks, thereby recovering data previously classified as inconclusive. The CryptoGenotyper successfully genotyped 99.3% (428/431) and 95.1% (154/162) of SSU rRNA chromatograms containing single and mixed sequences, respectively, and correctly subtyped 95.6% (947/991) of gp60 chromatograms without manual intervention. This new, user-friendly tool can provide both fast and reproducible analyses of Sanger sequencing data for the two most common Cryptosporidium gene targets.
ABSTRACT
Ingestion of food- or waterborne antibiotic-resistant bacteria may lead to dissemination of antibiotic resistance genes (ARGs) in the gut microbiota. The gut microbiota often suffers from various disturbances. It is not clear whether and how disturbed microbiota may affect ARG mobility under antibiotic treatments. For proof of concept, in the presence or absence of streptomycin pre-treatment, mice were inoculated orally with a β-lactam-susceptible Salmonella enterica serovar Heidelberg clinical isolate (recipient) and a β-lactam-resistant Escherichia coli O80:H26 isolate (donor) carrying a blaCMY-2 gene on an IncI2 plasmid. Immediately following inoculation, mice were treated with or without ampicillin in drinking water for 7 days. Faeces were sampled, donor, recipient and transconjugant were enumerated, blaCMY-2 abundance was determined by quantitative PCR, faecal microbial community composition was determined by 16S rRNA amplicon sequencing and cecal samples were observed histologically for evidence of inflammation. In faeces of mice that received streptomycin pre-treatment, the donor abundance remained high, and the abundance of S. Heidelberg transconjugant and the relative abundance of Enterobacteriaceae increased significantly during the ampicillin treatment. Co-blooming of the donor, transconjugant and commensal Enterobacteriaceae in the inflamed intestine promoted significantly (P < 0.05) higher and possibly wider dissemination of the blaCMY-2 gene in the gut microbiota of mice that received the combination of streptomycin pre-treatment and ampicillin treatment (Str-Amp) compared to the other mice. Following cessation of the ampicillin treatment, faecal shedding of S. Heidelberg transconjugant persisted much longer from mice in the Str-Amp group compared to the other mice. In addition, only mice in the Str-Amp group shed a commensal E. coli O2:H6 transconjugant, which carries three copies of the blaCMY-2 gene, one on the IncI2 plasmid and two on the chromosome. The findings highlight the significance of pre-existing gut microbiota for ARG dissemination and persistence during and following antibiotic treatments of infectious diseases.
Subjects
Ampicillin/administration & dosage; Escherichia coli/genetics; Gram-Negative Bacterial Infections/drug therapy; Salmonella enterica/genetics; Streptomycin/administration & dosage; beta-Lactam Resistance; beta-Lactamases/genetics; Ampicillin/pharmacology; Animals; Antibiotic Prophylaxis; Disease Models, Animal; Escherichia coli/drug effects; Escherichia coli/pathogenicity; Feces/microbiology; Female; Gene Transfer, Horizontal; Gram-Negative Bacterial Infections/microbiology; Mice; Proof of Concept Study; RNA, Ribosomal, 16S/genetics; Salmonella Infections; Salmonella enterica/drug effects; Salmonella enterica/pathogenicity; Streptomycin/pharmacology; Whole Genome Sequencing
ABSTRACT
Escherichia coli is a priority foodborne pathogen of public health concern and phenotypic serotyping provides critical information for surveillance and outbreak detection activities. Public health and food safety laboratories are increasingly adopting whole-genome sequencing (WGS) for characterizing pathogens, but it is imperative to maintain serotype designations in order to minimize disruptions to existing public health workflows. Multiple in silico tools have been developed for predicting serotypes from WGS data, including SRST2, SerotypeFinder and EToKi EBEis, but these tools were not designed with the specific requirements of diagnostic laboratories, which include: speciation, input data flexibility (fasta/fastq), quality control information and easily interpretable results. To address these specific requirements, we developed ECTyper (https://github.com/phac-nml/ecoli_serotyping) for performing both speciation within Escherichia and Shigella, and in silico serotype prediction. We compared the serotype prediction performance of each tool on a newly sequenced panel of 185 isolates with confirmed phenotypic serotype information. We found that all tools were highly concordant, with 92-97% for O-antigens and 98-100% for H-antigens, and ECTyper having the highest rate of concordance. We extended the benchmarking to a large panel of 6954 publicly available E. coli genomes to assess the performance of the tools on a more diverse dataset. On the public data, there was a considerable drop in concordance, with 75-91% for O-antigens and 62-90% for H-antigens, and ECTyper and SerotypeFinder being the most concordant. This study highlights that in silico predictions show high concordance with phenotypic serotyping results, but there are notable differences in tool performance. ECTyper provides highly accurate and sensitive in silico serotype predictions, in addition to speciation, and is designed to be easily incorporated into bioinformatic workflows.
Subjects
Antigens, Bacterial/genetics; Computational Biology/methods; Escherichia coli/classification; Hexosyltransferases/genetics; Escherichia coli/genetics; Genetic Speciation; Genome, Bacterial; Serotyping; Software; Whole Genome Sequencing
ABSTRACT
Hierarchical genotyping approaches can provide insights into the source, geography and temporal distribution of bacterial pathogens. Multiple hierarchical SNP genotyping schemes have previously been developed so that new isolates can rapidly be placed within pre-computed population structures, without the need to rebuild phylogenetic trees for the entire dataset. This classification approach has, however, seen limited uptake in routine public health settings due to analytical complexity and the lack of standardized tools that provide clear and easy ways to interpret results. The BioHansel tool was developed to provide an organism-agnostic tool for hierarchical SNP-based genotyping. The tool identifies split k-mers that distinguish predefined lineages in whole genome sequencing (WGS) data using SNP-based genotyping schemes. BioHansel uses the Aho-Corasick algorithm to type isolates from assembled genomes or raw read sequence data in a matter of seconds, with limited computational resources. This makes BioHansel ideal for use by public health agencies that rely on WGS methods for surveillance of bacterial pathogens. Genotyping results are evaluated using a quality assurance module which identifies problematic samples, such as low-quality or contaminated datasets. Using existing hierarchical SNP schemes for Mycobacterium tuberculosis and Salmonella Typhi, we compare the genotyping results obtained with the k-mer-based tools BioHansel and SKA, with those of the organism-specific tools TBProfiler and genotyphi, which use gold-standard reference-mapping approaches. We show that the genotyping results are fully concordant across these different methods, and that the k-mer-based tools are significantly faster. We also test the ability of the BioHansel quality assurance module to detect intra-lineage contamination and demonstrate that it is effective, even in populations with low genetic diversity. We demonstrate the scalability of the tool using a dataset of ~8100 S. Typhi public genomes and provide the aggregated results of geographical distributions as part of the tool's output. BioHansel is an open source Python 3 application available on PyPI and Conda repositories and as a Galaxy tool from the public Galaxy Toolshed. In a public health context, BioHansel enables rapid and high-resolution classification of bacterial pathogens with low genetic diversity.
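The idea of typing an isolate by looking for lineage-specific k-mers can be sketched in a few lines. This is a simplified illustration and not BioHansel's code: it uses a naive sliding window with a dictionary lookup rather than the Aho-Corasick algorithm named in the abstract, it ignores reverse complements and quality checks, and the scheme, the lineage labels and the function name `genotype` are all made up for the example.

```python
def genotype(sequence, scheme, k):
    """Return (deepest lineage found, all lineages found) for `sequence`.

    scheme: dict mapping a marker k-mer to a hierarchical lineage label.
    Assumption: labels nest by dots, e.g. "2.1" is a sublineage of "2".
    """
    hits = set()
    # Naive scan: test every k-length window against the scheme.
    for i in range(len(sequence) - k + 1):
        lineage = scheme.get(sequence[i:i + k])
        if lineage is not None:
            hits.add(lineage)
    if not hits:
        return None, hits
    # The most specific call is the label with the most levels.
    best = max(hits, key=lambda name: name.count("."))
    return best, hits


# Made-up 7-mers: the first two share flanks and differ at the middle base.
scheme = {"ACGTACG": "1", "ACGAACG": "2", "TTGCATT": "2.1"}
call, hits = genotype("GGACGAACGGGTTGCATTCC", scheme, k=7)
```

A multi-pattern matcher such as Aho-Corasick would find the same marker hits in a single pass over the input, which matters when a scheme contains many k-mers.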
Subjects
Bacteria/genetics; Bacterial Typing Techniques/methods; Genotyping Techniques/methods; Polymorphism, Single Nucleotide; Bacteria/classification; Bacteria/isolation & purification; Genetic Variation; Genome, Bacterial; Genotype; Molecular Epidemiology/methods; Mycobacterium tuberculosis/genetics; Phylogeny; Salmonella/genetics; Software; Whole Genome Sequencing
ABSTRACT
Myelin basic protein (MBP), specifically the 18.5 kDa isoform, is a peripheral membrane protein and a major component of mammalian central nervous system myelin. It is an intrinsically disordered and multifunctional protein that binds cytoskeletal and other cytosolic proteins to a membrane surface and thereby acquires ordered structure. These associations are modulated by post-translational modifications of MBP, as well as by interactions of MBP with Ca(2+)-calmodulin (CaM). Enzymatic deimination of usually six arginine residues to citrulline results in a decrease in the net positive charge of the protein from 19 to ≤13. This deiminated form is found in greater amounts in normal children and in adult patients with the demyelinating disease multiple sclerosis. In this paper, we examine the secondary structure of a calmodulin-binding domain, residues A141-L154, when associated with a lipid bilayer in recombinant murine 18.5 kDa forms rmC1 (unmodified) and rmC8 (pseudodeiminated). We demonstrate here by site-directed spin labeling and electron paramagnetic resonance (EPR) spectroscopy that the Y142-L154 segment in membrane-associated rmC1 forms an amphipathic α-helix, with high accessibility to O(2) and low accessibility to NiEDDA. In membrane-associated rmC8, this segment assumed a structure distorted from an α-helix. Spin-labeled residues in rmC1 in solution were more immobilized on binding Ca(2+)-CaM than those in rmC8. Furthermore, rmC8 was dissociated more readily from a lipid bilayer by Ca(2+)-CaM than was rmC1. These results confirm both a predicted induced ordering upon membrane association in a specific segment of 18.5 kDa MBP, and that this segment is a CaM-binding site, with both interactions weakened by deimination of residues outside of this segment. The deiminated form would be more susceptible to regulation of its membrane binding functions by Ca(2+)-CaM than the unmodified form.
Subjects
Calcium/chemistry; Calmodulin/chemistry; Myelin Basic Protein/chemistry; Animals; Binding Sites; Calcium/metabolism; Calmodulin/genetics; Calmodulin/metabolism; Mice; Myelin Basic Protein/genetics; Myelin Basic Protein/metabolism; Protein Binding; Protein Structure, Secondary; Protein Structure, Tertiary; Xenotropic and Polytropic Retrovirus Receptor
ABSTRACT
Bacterial plasmids play a large role in allowing bacteria to adapt to changing environments and can pose a significant risk to human health if they confer virulence and antimicrobial resistance (AMR). Plasmids differ significantly in the taxonomic breadth of host bacteria in which they can successfully replicate; this is commonly referred to as 'host range' and is usually described in qualitative terms of 'narrow' or 'broad'. Understanding the host range potential of plasmids is of great interest due to their ability to disseminate traits such as AMR through bacterial populations and into human pathogens. We developed the MOB-suite to facilitate characterization of plasmids and introduced a whole-sequence-based classification system based on clustering complete plasmid sequences using Mash distances (https://github.com/phac-nml/mob-suite). We updated the MOB-suite database from 12 091 to 23 671 complete sequences, representing 17 779 unique plasmids. With advances in new algorithms for rapidly calculating average nucleotide identity (ANI), we compared clustering characteristics using two different distance measures - Mash and ANI - and three clustering algorithms on the unique set of plasmids. The plasmid nomenclature is designed to group highly similar plasmids together that are unlikely to have multiple representatives within a single cell. Based on our results, we determined that clusters generated using Mash and complete-linkage clustering at a Mash distance of 0.06 resulted in highly homogeneous clusters while maintaining cluster size. The taxonomic distribution of plasmid biomarker sequences for replication and relaxase typing, in combination with MOB-suite whole-sequence-based clusters, has been examined in detail for all high-quality publicly available plasmid sequences.
We have incorporated prediction of plasmid replication host range into the MOB-suite based on observed distributions of these sequence features in combination with known plasmid hosts from the literature. Host range is reported as the highest taxonomic rank that covers all of the plasmids which share replicon or relaxase biomarkers or belong to the same MOB-suite cluster code. Reporting host range based on these criteria allows for comparisons of host range between studies and provides information for plasmid surveillance.
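The host range rule described above can be read as finding the most specific taxonomic rank at which every observed host of a plasmid group still agrees. The sketch below illustrates that reading only; it is an interpretation of the abstract, not MOB-suite code, and the rank list, the function name `host_range` and the input layout are assumptions made for the example.

```python
RANKS = ["phylum", "class", "order", "family", "genus", "species"]


def host_range(lineages):
    """Return (rank, taxon) for the most specific rank shared by all hosts.

    lineages: list of dicts mapping each rank in RANKS to a taxon name,
    one dict per observed host. Returns None if hosts differ at phylum.
    """
    shared = None
    for rank in RANKS:
        taxa = {lineage[rank] for lineage in lineages}
        if len(taxa) != 1:
            break  # hosts diverge here, so the previous rank is the answer
        shared = (rank, taxa.pop())
    return shared


def enterobacteriaceae(genus, species):
    """Build a lineage dict for a member of the family Enterobacteriaceae."""
    return {
        "phylum": "Proteobacteria",
        "class": "Gammaproteobacteria",
        "order": "Enterobacterales",
        "family": "Enterobacteriaceae",
        "genus": genus,
        "species": species,
    }


# Hypothetical set of hosts observed for one replicon or cluster code.
hosts = [
    enterobacteriaceae("Escherichia", "Escherichia coli"),
    enterobacteriaceae("Salmonella", "Salmonella enterica"),
    enterobacteriaceae("Klebsiella", "Klebsiella pneumoniae"),
]
reported = host_range(hosts)
```

Reporting a rank rather than a 'narrow' or 'broad' label is what makes host range comparable between studies, as the abstract notes.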
Subjects
Bacteria/genetics; Host Specificity/genetics; Plasmids/classification; Plasmids/genetics; Conjugation, Genetic/genetics; Databases, Genetic; Humans; Molecular Typing/methods
ABSTRACT
Ingestion of food- or waterborne antibiotic-resistant bacteria may lead to the dissemination of antibiotic-resistance genes in the gut microbiota and the development of antibiotic-resistant bacterial infection, a significant threat to animal and public health. Food or water may be contaminated with multiple resistant bacteria, but animal models of gene transfer have mainly been based on single-strain infections. In this study, we investigated the mobility of β-lactam resistance following infection with single versus multiple strains of resistant bacteria under ampicillin treatment. We characterized three bacterial strains isolated from food-animal production systems, Escherichia coli O80:H26 and Salmonella enterica serovars Bredeney and Heidelberg. Each strain carries at least one conjugative plasmid that encodes a β-lactamase. We orally infected mice with each or all three bacterial strain(s) in the presence or absence of ampicillin treatment. We assessed plasmid transfer from the three donor bacteria to an introduced E. coli CV601gfp recipient in the mouse gut, and evaluated the impacts of the bacterial infection on gut microbiota and gut health. In the absence of ampicillin treatment, none of the donor or recipient bacteria established in the normal gut microbiota and plasmid transfer was not detected. In contrast, the ampicillin treatment disrupted the gut microbiota and enabled S. Bredeney and Heidelberg to colonize and transfer their plasmids to the E. coli CV601gfp recipient. E. coli O80:H26 on its own failed to colonize the mouse gut. However, during co-infection with the two Salmonella strains, E. coli O80:H26 colonized and transferred its plasmid to the E. coli CV601gfp recipient and a residential E. coli O2:H6 strain. The co-infection significantly increased plasmid transfer frequency, enhanced Proteobacteria expansion and resulted in inflammation in the mouse gut.
Our findings suggest that single-strain infection models for evaluating in vivo gene transfer may underrepresent the consequences of multi-strain infections following the consumption of heavily contaminated food or water.
ABSTRACT
Environmental DNA (eDNA) analysis is an effective approach for detecting vertebrates and plants, especially in aquatic ecosystems, but prior studies have largely examined eDNA in cool temperate settings. By contrast, this study employs eDNA to survey the fish fauna in tropical Lake Bacalar (Mexico) with the additional goal of assessing the possible presence of invasive fishes, such as Amazon sailfin catfish and tilapia. Sediment and water samples were collected from eight stations in Lake Bacalar on three occasions over a 4-month interval. Each sample was stored in the presence or absence of lysis buffer to compare eDNA recovery. Short fragments (184-187 bp) of the cytochrome c oxidase I (COI) gene were amplified using fusion primers and then sequenced on Ion Torrent PGM or S5 before their source species were determined using a custom reference sequence database constructed on BOLD. In total, eDNA sequences were recovered from 75 species of vertebrates including 47 fishes, 15 birds, 7 mammals, 5 reptiles, and 1 amphibian. Although all species are known from this region, six fish species represent new records for the study area, while two require verification. Sequences for five species (2 birds, 2 mammals, 1 reptile) were only detected from sediments, while sequences from 52 species were only recovered from water. Because DNA from the Amazon sailfin catfish was not detected, we used a mock eDNA experiment to confirm that our methods would enable its detection. In summary, we developed protocols that recovered eDNA from tropical oligotrophic aquatic ecosystems and confirmed their effectiveness in detecting fishes and other vertebrates.
Subjects
DNA Barcoding, Taxonomic/methods , Ecosystem , Fishes/genetics , Lakes , Vertebrates/genetics , Animals , DNA/chemistry , DNA/genetics , Electron Transport Complex IV/genetics , Fishes/classification , Genetic Variation , Geologic Sediments/chemistry , High-Throughput Nucleotide Sequencing/methods , Mexico , Phylogeny , Species Specificity , Vertebrates/classification , Water/chemistry
ABSTRACT
The reliable taxonomic identification of organisms through DNA sequence data requires a well-parameterized library of curated reference sequences. However, it is estimated that just 15% of described animal species are represented in public sequence repositories. To begin to address this deficiency, we provide DNA barcodes for 1,500,003 animal specimens collected from 23 terrestrial and aquatic ecozones at sites across Canada, a nation that comprises 7% of the planet's land surface. In total, 14 phyla, 43 classes, 163 orders, 1,123 families, 6,186 genera, and 64,264 Barcode Index Numbers (BINs; a proxy for species) are represented. Species-level taxonomy was available for 38% of the specimens, but higher proportions were assigned to a genus (69.5%) and a family (99.9%). Voucher specimens and DNA extracts are archived at the Centre for Biodiversity Genomics, where they are available for further research. The corresponding sequence and taxonomic data can be accessed through the Barcode of Life Data System, GenBank, the Global Biodiversity Information Facility, and the Global Genome Biodiversity Network Data Portal.