ABSTRACT
Genomic data can be used to track the transmission and geographic spread of infectious diseases. However, the sequencing capacity required for genomic surveillance remains limited in many low- and middle-income countries (LMICs), where dog-mediated rabies and/or rabies transmitted by wildlife such as vampire bats pose major public health and economic concerns. We present here a rapid and affordable sample-to-sequence-to-interpretation workflow using nanopore technology. Protocols for sample collection and the diagnosis of rabies are briefly described, followed by details of the optimized whole genome sequencing workflow, including primer design and optimization for multiplex polymerase chain reaction (PCR), a modified, low-cost sequencing library preparation, sequencing with live and offline base calling, genetic lineage designation, and phylogenetic analysis. Implementation of the workflow is demonstrated, and critical steps are highlighted for local deployment, such as pipeline validation, primer optimization, inclusion of negative controls, and the use of publicly available data and genomic tools (GLUE, MADDOG) for classification and placement within regional and global phylogenies. The turnaround time for the workflow is 2-3 days, and the cost ranges from $25 per sample for a 96 sample run to $80 per sample for a 12 sample run. We conclude that setting up rabies virus genomic surveillance in LMICs is feasible and can support progress toward the global goal of zero dog-mediated human rabies deaths by 2030, as well as enhanced monitoring of wildlife rabies spread. Moreover, the platform can be adapted for other pathogens, helping to build a versatile genomic capacity that contributes to epidemic and pandemic preparedness.
Subjects
Chiroptera, Nanopores, Rabies virus, Rabies, Humans, Animals, Dogs, Rabies virus/genetics, Rabies/diagnosis, Rabies/veterinary, Phylogeny, Wild Animals, Technology, Whole Genome Sequencing
ABSTRACT
Carney complex (CNC) is an ultrarare disorder causing cutaneous and cardiac myxomas, primary pigmented nodular adrenocortical disease, hypophyseal adenoma, and gonadal tumours. Genetic alterations are often missed under routine genetic testing. Pathogenic variants in PRKAR1A are identified in most cases, while large exonic or chromosomal deletions have only been reported in a few cases. Our aim was to identify the causal genetic alteration in our kindred with a clinical diagnosis of CNC and prove its pathogenic role by functional investigation. Targeted testing of the PRKAR1A gene, whole exome sequencing and whole genome sequencing (WGS) were performed in the proband, one clinically affected and one unaffected relative. WGS identified a novel, large, 10,662 bp (10.6 kbp; LRG_514t1:c.-10403_-7 + 265del; hg19, chr17:g.66498293_66508954del) heterozygous deletion in the promoter of PRKAR1A in the affected family members. The exact breakpoints were determined, and increased enzyme activity in deletion carriers compared with the wild-type carrier was demonstrated. Segregation analysis and functional evaluation of PKA activity confirmed the pathogenic role of this alteration. A novel deletion upstream of the PRKAR1A gene was shown to be the cause of CNC. Our study underlines the need for WGS in molecular genetic testing of patients with monogenic disorders where conventional genetic analysis fails.
Subjects
Carney Complex, Cyclic AMP-Dependent Protein Kinase RIalpha Subunit, Carney Complex/diagnosis, Carney Complex/genetics, Myxoma/genetics, Humans, Gene Deletion, Pedigree, Genetic Promoter Regions, Male, Female, Whole Genome Sequencing, Cyclic AMP-Dependent Protein Kinase RIalpha Subunit/genetics
ABSTRACT
BACKGROUND: Bisulfite sequencing is a powerful tool for profiling genomic methylation, an epigenetic modification critical in the understanding of cancer, psychiatric disorders, and many other conditions. Raw data generated by whole genome bisulfite sequencing (WGBS) requires several computational steps before it is ready for statistical analysis, and particular care is required to process data in a timely and memory-efficient manner. Alignment to a reference genome is one of the most computationally demanding steps in a WGBS workflow, taking several hours or even days with commonly used WGBS-specific alignment software. This naturally motivates the creation of computational workflows that can utilize GPU-based alignment software to greatly speed up the bottleneck step. In addition, WGBS produces raw data that is large and often unwieldy; a lack of memory-efficient representation of data by existing pipelines renders WGBS impractical or impossible for many researchers. RESULTS: We present BiocMAP, a Bioconductor-friendly methylation analysis pipeline consisting of two modules, to address the above concerns. The first module performs computationally intensive read alignment using Arioc, a GPU-accelerated short-read aligner. Since GPUs are not always available on the same computing environments where traditional CPU-based analyses are convenient, the second module may be run in a GPU-free environment. This module extracts and merges DNA methylation proportions, i.e., the fractions of methylated cytosines across all cells in a sample at a given genomic site. Bioconductor-based output objects in R utilize an on-disk data representation to drastically reduce required main memory and make WGBS projects computationally feasible for more researchers. CONCLUSIONS: BiocMAP is implemented using Nextflow and available at http://research.libd.org/BiocMAP/. To enable reproducible analysis across a variety of typical computing environments, BiocMAP can be containerized with Docker or Singularity, and executed locally or with the SLURM or SGE scheduling engines. By providing Bioconductor objects, BiocMAP's output can be integrated with powerful analytical open source software for analyzing methylation data.
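The methylation proportion defined in the abstract above can be illustrated with a minimal Python sketch. This is not BiocMAP code: the function name, the site keys, and the read counts are hypothetical and chosen only to show the arithmetic (methylated reads divided by all reads covering a cytosine).

```python
def methylation_proportion(methylated, unmethylated):
    """Fraction of reads supporting methylation at one cytosine.

    Returns None when the site has no coverage, because the
    proportion is undefined in that case.
    """
    total = methylated + unmethylated
    if total == 0:
        return None
    return methylated / total


# Hypothetical per-site read counts: (methylated, unmethylated).
site_counts = {
    ("chr1", 100): (8, 2),
    ("chr1", 250): (0, 5),
    ("chr1", 400): (0, 0),  # uncovered site
}

proportions = {
    site: methylation_proportion(m, u) for site, (m, u) in site_counts.items()
}
for site, value in proportions.items():
    print(site, value)
```

Returning `None` for uncovered sites keeps missing data distinct from a true proportion of zero, which matters when proportions from several samples are merged.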
Subjects
Genomics, Sulfites, Humans, DNA Sequence Analysis, Whole Genome Sequencing
ABSTRACT
Ceratocystis canker caused by Ceratocystis destructans is a severe disease of almond, reducing the longevity and productivity of infected trees. Once the disease has established in an individual tree, there is no cure, and management efforts are often limited to removing the infected area of cankers. In this study, we present the genome assemblies of five C. destructans isolates obtained from symptomatic almond trees. The assemblies had a genome size of 27.2 ± 0.9 Mbp with an average of 6924 ± 135 protein-coding genes and an average GC content of 48.8 ± 0.02%. We concentrated our efforts on identifying putative virulence factors of canker pathogens. Analysis of the secreted carbohydrate-active enzymes (CAZymes) showed that the genomes harbored 83.4 ± 1.8 secreted CAZymes, covering all the known categories of CAZymes. AntiSMASH revealed that the genomes had at least 7 biosynthetic gene clusters, with one of the non-ribosomal peptide synthases encoding dimethylcoprogen, a conserved virulence determinant of plant pathogenic ascomycetes. From the predicted proteome, we also annotated cytochrome P450 monooxygenases and transporters, which are well-established virulence determinants of canker pathogens. Moreover, we identified 57.4 ± 2.1 putative effector proteins. Gene Ontology (GO) annotation was applied to compare gene content with two closely related species, C. fimbriata and C. albifundus. This study provides the first genome assemblies for C. destructans, expanding genomic resources for an important almond canker pathogen. The acquired knowledge provides a foundation for further advanced studies, such as molecular interactions with the host, which is critical for breeding for resistance.
Subjects
Geraniaceae, Prunus dulcis, Ceratocystis, Prunus dulcis/genetics, Plant Breeding, California, Whole Genome Sequencing
ABSTRACT
MOTIVATION: The Positional Burrows-Wheeler Transform (PBWT) is a data structure that indexes haplotype sequences in a manner that enables finding maximal haplotype matches in h sequences containing w variation sites in O(hw) time. This represents a significant improvement over classical quadratic-time approaches. However, the original PBWT data structure does not allow for queries over Biobank panels that consist of several millions of haplotypes, if an index of the haplotypes must be kept entirely in memory. RESULTS: In this article, we leverage the notion of r-index proposed for the BWT to present a memory-efficient method for constructing and storing the run-length encoded PBWT, and computing set maximal matches (SMEMs) queries in haplotype sequences. We implement our method, which we refer to as µ-PBWT, and evaluate it on datasets of 1000 Genome Project and UK Biobank data. Our experiments demonstrate that the µ-PBWT reduces the memory usage up to a factor of 20% compared to the best current PBWT-based indexing. In particular, µ-PBWT produces an index that stores high-coverage whole genome sequencing data of chromosome 20 in about a third of the space of its BCF file. µ-PBWT is an adaptation of techniques for the run-length compressed BWT for the PBWT (RLPBWT) and it is based on keeping in memory only a succinct representation of the RLPBWT that still allows the efficient computation of set maximal matches (SMEMs) over the original panel. AVAILABILITY AND IMPLEMENTATION: Our implementation is open source and available at https://github.com/dlcgold/muPBWT. The binary is available at https://bioconda.github.io/recipes/mupbwt/README.html.
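To make the O(hw) construction mentioned above concrete, the following minimal Python sketch builds the positional prefix arrays of a small binary haplotype panel using the standard PBWT update step (a stable partition by allele at each site). It is not the µ-PBWT implementation, which stores a run-length encoded form of this information; the function name and the toy panel are assumptions made for illustration.

```python
def pbwt_prefix_arrays(panel):
    """Return the positional prefix array after each variation site.

    panel: list of h haplotypes, each a list of w alleles (0 or 1).
    After site k, haplotype indices are ordered by their reversed
    prefixes ending at site k, so haplotypes that share long matches
    ending at k sit next to each other. Each site costs O(h), giving
    O(hw) overall.
    """
    h = len(panel)
    w = len(panel[0]) if h else 0
    order = list(range(h))
    arrays = []
    for k in range(w):
        zeros, ones = [], []
        # Stable partition: keeps the previous ordering within each allele.
        for idx in order:
            (zeros if panel[idx][k] == 0 else ones).append(idx)
        order = zeros + ones
        arrays.append(order)
    return arrays


# Toy panel: 4 haplotypes, 3 sites (hypothetical data).
toy_panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
for site, order in enumerate(pbwt_prefix_arrays(toy_panel)):
    print(site, order)
```

The full PBWT also maintains a divergence array alongside each prefix array to report match lengths; it is omitted here to keep the sketch short.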
Subjects
Biological Specimen Banks, Haplotypes, Whole Genome Sequencing, United Kingdom
ABSTRACT
NG-Test CARBA 5 (NG-Biotech) is a rapid in vitro multiplex immunoassay for the phenotypic detection and differentiation of the "big five" carbapenemase families (KPC, OXA-48-like, VIM, IMP, and NDM). Version 2 of this assay was evaluated alongside the Xpert Carba-R assay (Cepheid, Inc.), the modified carbapenem inactivation method (mCIM), and the CIMTris assay, with a collection of carbapenem-resistant non-fermenting Gram-negative bacilli comprising 138 Pseudomonas aeruginosa and 97 Acinetobacter baumannii isolates. Whole-genome sequencing (WGS) was used as the reference standard. For P. aeruginosa, NG-Test CARBA 5 produced an overall percentage agreement (OPA) with WGS of 97.1%, compared with 92.8% for Xpert Carba-R and 90.6% for mCIM. For A. baumannii, as OXA-type carbapenemases (non-OXA-48) are not included, both the NG-Test CARBA 5 and Xpert Carba-R only had an OPA of 6.2%, while the CIMTris performed well with an OPA of 99.0%. The majority of A. baumannii isolates (95.9%) tested falsely positive for IMP on NG-Test CARBA 5; no IMP genes were found on WGS. No clear cause was found for this phenomenon; a cross-reacting protein antigen unique to A. baumannii is a possible culprit. NG-Test CARBA 5 performed well for carbapenemase detection in P. aeruginosa. However, results from A. baumannii isolates should be interpreted with caution.
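The overall percentage agreement (OPA) reported above is the share of isolates for which an assay's call matches the reference call. A minimal Python sketch follows; the function name and the five example calls are hypothetical and are not data from the study.

```python
def overall_percent_agreement(assay_calls, reference_calls):
    """Percentage of isolates whose assay call equals the reference call."""
    if len(assay_calls) != len(reference_calls):
        raise ValueError("call lists must have the same length")
    if not assay_calls:
        raise ValueError("at least one isolate is required")
    agreements = sum(a == r for a, r in zip(assay_calls, reference_calls))
    return 100.0 * agreements / len(assay_calls)


# Hypothetical calls for five isolates ("none" = no carbapenemase detected).
reference = ["VIM", "none", "IMP", "none", "KPC"]
assay = ["VIM", "none", "IMP", "IMP", "KPC"]  # one false-positive IMP call
print(overall_percent_agreement(assay, reference))
```

Because OPA counts any mismatch equally, a systematic false positive (such as the IMP signal described for A. baumannii) lowers it just as a missed carbapenemase would.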
Subjects
Bacterial Proteins, beta-Lactamases, Humans, Bacterial Proteins/genetics, beta-Lactamases/genetics, Whole Genome Sequencing, Carbapenems/pharmacology, Gram-Negative Bacteria/genetics, Pseudomonas aeruginosa/genetics
ABSTRACT
The hard-shell mussel (Mytilus coruscus) is widespread in the temperate coastal areas of the northwest Pacific and holds a significant position in the shellfish aquaculture market in China. However, the natural resources of this species have been declining, and population genetic studies of M. coruscus are also lacking. In this study, we conducted whole-genome resequencing (WGR) of M. coruscus from eight different latitudes along the Chinese coast and identified a total of 25,859,986 single nucleotide polymorphism (SNP) markers. Our findings indicated that the genetic diversity of M. coruscus from the Zhoushan region was lower compared with populations from other regions. Furthermore, we observed that the evolutionary tree clustered into two primary branches, and the Zhangzhou (ZZ) population was in a separate branch. The ZZ population was partly isolated from populations in other regions, but the distribution of branches was not geographically homogeneous, and a nested pattern emerged, consistent with the population differentiation index (FST) results. To investigate the selection characteristics, we utilized the northern M. coruscus populations (Dalian and Qingdao) and the central populations (Zhoushan and Xiangshan) as reference populations and the southern ZZ population as the target population. Our selection scan analysis identified several genes associated with thermal responses, including Hsp70 and CYP450. These genes may play important roles in the adaptation of M. coruscus to different living environments. Overall, our study provides a comprehensive understanding of the genomic diversity of coastal M. coruscus in China and is a valuable resource for future studies on genetic breeding and the evolutionary adaptation of this species.
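For readers unfamiliar with the population differentiation index (FST) used above, the following minimal Python sketch computes Wright's FST at a single biallelic SNP from per-population allele frequencies. The abstract does not state which FST estimator was used, so the equal-weight heterozygosity form below, the function name, and the example frequencies are all assumptions for illustration.

```python
def fst_biallelic(allele_freqs):
    """Wright's F_ST at one biallelic SNP, populations weighted equally.

    allele_freqs: frequency of one allele in each population.
    F_ST = (H_T - H_S) / H_T, where H_T is the expected heterozygosity
    of the pooled population and H_S is the mean within-population
    expected heterozygosity.
    """
    if not allele_freqs:
        raise ValueError("at least one population is required")
    n = len(allele_freqs)
    p_bar = sum(allele_freqs) / n
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # monomorphic site: no differentiation to measure
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / n
    return (h_total - h_within) / h_total


# Hypothetical frequencies in two populations.
print(fst_biallelic([0.1, 0.9]))  # strongly differentiated
print(fst_biallelic([0.5, 0.5]))  # identical frequencies
```

Genome-wide analyses typically combine many SNPs and correct for sample size, which this single-site sketch does not attempt.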
Subjects
Mytilus, Animals, Asian People, Genetic Variation, Mytilus/genetics, Whole Genome Sequencing
ABSTRACT
BACKGROUND: Brucellosis is a zoonotic disease whose causative agent, Brucella spp., is endemic in many countries of the Mediterranean basin, including Greece. Although the occurrence of brucellosis must be reported to the authorities, it is believed that the disease is under-reported in Greece, and knowledge about the genomic diversity of brucellae is lacking. METHODS: Thus, 44 Brucella isolates, primarily B. melitensis, collected between 1999 and 2009 from humans and small ruminants in Greece were subjected to whole genome sequencing using short-read technology. The raw reads and assembled genomes were used for in silico genotyping based on single nucleotide substitutions and alleles. Further, specific genomic regions encoding putative virulence genes were screened for characteristic nucleotide changes, which arose in different genotype lineages. RESULTS: In silico genotyping revealed that the isolates belonged to three of the known sublineages of the East Mediterranean genotype. In addition, a novel subgenotype was identified that was basal to the other East Mediterranean sublineages, comprising two Greek strains. The majority of the isolates can be assumed to be of endemic origin, as they were clustered with strains from the Western Balkans or Turkey, whereas one strain of human origin could be associated with travel to another endemic region, e.g. Portugal. Further, nucleotide substitutions in the housekeeping gene rpoB and virulence-associated genes were detected, which were characteristic of the different subgenotypes. One of the isolates originating from an aborted bovine foetus was identified as B. abortus vaccine strain RB51. CONCLUSION: The results demonstrate the existence of several distinct persistent Brucella sp. foci in Greece. To detect these and for tracing infection chains, extensive sampling initiatives are required.
Subjects
Brucella melitensis, Brucellosis, Humans, Animals, Cattle, Brucella melitensis/genetics, Greece/epidemiology, Multilocus Sequence Typing, Phylogeny, Brucellosis/epidemiology, Brucellosis/veterinary, Genotype, Whole Genome Sequencing
ABSTRACT
Incarvillea younghusbandii Sprague is a traditional tonic herb. The roots are used as herbal medicine for nourishing and strengthening, as well as for treating postpartum milk deficiency and weakness. In this study, the chloroplast genome of I. younghusbandii was sequenced and assembled using high-throughput sequencing technology. The sequence characteristics, sequence repeats, codon usage bias, phylogenetic relationships and estimated divergence time of I. younghusbandii were analyzed. The 159 323 bp sequence contained a large single copy region (80 197 bp), a small single copy region (9 030 bp) and two inverted repeat sequences (35 048 bp each). It contained 120 genes, including 77 protein-coding genes, 8 ribosomal RNA genes and 35 transfer RNA genes. AAA was the most frequent codon in the chloroplast coding sequence of I. younghusbandii. A total of 42 simple sequence repeats were identified in the chloroplast genome. Phylogenetic analysis revealed that I. younghusbandii was most closely related to its taxonomically close relative Incarvillea compacta. The divergence between I. younghusbandii and I. compacta was dated to 4.66 million years ago. This study is significant for the scientific conservation and development of resources related to I. compacta. It also provides a basic genetic resource for subsequent species identification in the genus Incarvillea and for population genetic diversity studies of Bignoniaceae.
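The codon usage analysis mentioned above rests on counting in-frame codons in coding sequences. A minimal Python sketch of that counting step follows; the function name and the toy sequence are hypothetical, and the sketch does not reproduce the study's own codon usage bias statistics.

```python
from collections import Counter


def codon_counts(cds):
    """Count in-frame codons in a coding sequence (reading frame 0).

    Trailing bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


# Toy coding sequence (hypothetical): ATG AAA AAA GGC TAA
toy_cds = "ATGAAAAAAGGCTAA"
counts = codon_counts(toy_cds)
print(counts.most_common(1))
```

Summing such counters over all protein-coding genes of a genome gives the table from which the most frequent codon is read.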
Subjects
Chloroplast Genome, Phylogeny, Molecular Sequence Annotation, DNA Sequence Analysis, Whole Genome Sequencing
ABSTRACT
Campylobacter jejuni and Campylobacter coli are important foodborne zoonotic pathogens and cause for concern due to the increasing trend in antimicrobial resistance. A long-run surveillance study was conducted in animals from different age groups in five dairy cattle farms to investigate the within-farm diversity and transmission dynamics of resistant Campylobacter throughout time. The resistance phenotype of the circulating isolates (170 C. jejuni and 37 C. coli) was determined by broth microdilution and a selection of 56 isolates were whole genome sequenced using the Oxford-Nanopore long-fragment sequencing technology resulting in completely resolved and circularized genomes (both chromosomes and plasmids). C. jejuni was isolated from all farms while C. coli was isolated from only two farms, but resistance rates were higher in C. coli than in C. jejuni and in calves than in adult animals. Some genotypes (e.g. ST-48, gyrA_T86I/tet(O)/blaOXA-61 in farm F1; ST-12000, aadE-Cc/tet(O)/blaOXA-489 in F4) persisted throughout the study while others were only sporadically detected. Acquisition of extracellular genes from other isolates and intracellular mutational events were identified as the processes that led to the emergence of the resistant genotypes that spread within the herds. Monitoring with Oxford Nanopore Technologies sequencing helped to decipher the complex molecular epidemiology underlying the within-farm dissemination of resistant Campylobacter.
Subjects
Anti-Infective Agents, Campylobacter Infections, Campylobacter jejuni, Campylobacter, Cattle, Animals, Farms, Campylobacter Infections/veterinary, Campylobacter Infections/epidemiology, Whole Genome Sequencing, Anti-Infective Agents/pharmacology, Anti-Bacterial Agents/pharmacology, Bacterial Drug Resistance/genetics, Microbial Sensitivity Tests
ABSTRACT
BACKGROUND: Since the introduction of next-generation sequencing (NGS) techniques, whole-exome sequencing (WES) and whole-genome sequencing (WGS) have not only revolutionized research, but also diagnostics. The gradual switch from single gene testing to WES and WGS required a different set of skills, given the amount and type of data generated, while the demand for standardization remained. However, most of the tools currently available are solely applicable for human analysis because they require access to specific databases and/or simply do not support other species. Additionally, a complicating factor in clinical genetics in animals is that genetic diversity is often dangerously low due to the breeding history. Combined, there is a clear need for an easy-to-use, flexible tool that allows standardized data processing and preferably, monitoring of genetic diversity as well. To fill these gaps, we developed the R-package variantscanR that allows an easy and straightforward identification and prioritization of known phenotype-associated variants identified in dogs and other domestic animals. RESULTS: The R-package variantscanR enables the filtering of variant call format (VCF) files for the presence of known phenotype-associated variants and allows for the estimation of genetic diversity using multi-sample VCF files. Next to this, additional functions are available for the quality control and processing of user-defined input files to make the workflow as easy and straightforward as possible. This user-friendly approach enables the standardisation of complex data analysis in clinical settings. CONCLUSION: We developed an R-package for the identification of known phenotype-associated variants and calculation of genetic diversity.
Subjects
Domestic Animals, Software, Humans, Animals, Dogs, Domestic Animals/genetics, Whole Genome Sequencing/methods, Phenotype, Genetic Databases, High-Throughput Nucleotide Sequencing
ABSTRACT
Whether genetic testing in autism can help understand longitudinal health outcomes and health service needs is unclear. The objective of this study was to determine whether carrying an autism-associated rare genetic variant is associated with differences in health system utilization by autistic children and youth. This retrospective cohort study examined 415 autistic children/youth who underwent genome sequencing and data collection through a translational neuroscience program (Province of Ontario Neurodevelopmental Disorders Network). Participant data were linked to provincial health administrative databases to identify historical health service utilization, health care costs, and complex chronic medical conditions during a 3-year period. Health administrative data were compared between participants with and without a rare genetic variant in at least 1 of 74 genes associated with autism. Participants with a rare variant impacting an autism-associated gene (n = 83, 20%) were less likely to have received psychiatric care (at least one psychiatrist visit: 19.3% vs. 34.3%, p = 0.01; outpatient mental health visit: 66% vs. 77%, p = 0.04). Health care costs were similar between groups (median: $5589 vs. $4938, p = 0.4) and genetic status was not associated with odds of being a high-cost participant (top 20%) in this cohort. There were no differences in the proportion with complex chronic medical conditions between those with and without an autism-associated genetic variant. Our study highlights the feasibility and potential value of genomic and health system data linkage to understand health service needs, disparities, and health trajectories in individuals with neurodevelopmental conditions.
Subjects
Autism Spectrum Disorder, Autistic Disorder, Child, Adolescent, Humans, Autistic Disorder/genetics, Autism Spectrum Disorder/epidemiology, Autism Spectrum Disorder/genetics, Retrospective Studies, Proof of Concept Study, Whole Genome Sequencing
ABSTRACT
This study aimed to determine the characteristics of drug-resistant (DR) Mycobacterium tuberculosis from patients with extra-pulmonary tuberculosis (EPTB). Patients were retrospectively studied from January 2020 to December 2021. All isolates were cultured and tested for drug susceptibility, and gene mutations were detected using whole genome sequencing. The correlations among whole genome sequencing results, DR patterns, patient distribution, and transmission were analyzed. The 111 DR-EPTB isolates included pre-XDR-TB (53.2%), MDR-TB (29.7%), and poly-DR-TB (12.6%). Resistance was most frequent to isoniazid (INH), followed by rifampicin (RFP) and streptomycin (SM). The genotypes of the 111 strains were lineage 2 and lineage 4. KatG_p.Ser315Thr was the main gene mutation for resistance to INH; the main mutations for the other drugs were rpsL_p.Lys43Arg for SM, rpoB_p.Ser450Leu for rifampicin, embB_p.Met306Val for ethambutol, gyrA_p.Asp94Gly for fluoroquinolones (FQs), and pncA_p.Thr76Pro for pyrazinamide (PZA). Patient residence was a significant risk factor for cluster transmission, and the phenotypic DR type of the strain was a significant risk factor for lineage 2 transmission. In this area of southwest China, INH, rifampicin and SM were the main drugs to which isolates from patients with DR-EPTB were resistant. KatG_p.Ser315, rpoB_p.Ser450Leu, and rpsL_p.Lys43Arg were the main gene mutations. Phenotypic DR types and residence were the main risk factors for transmission.
Subjects
Mycobacterium tuberculosis, Extrapulmonary Tuberculosis, Multidrug-Resistant Tuberculosis, Humans, Mycobacterium tuberculosis/genetics, Retrospective Studies, Rifampin, Multidrug-Resistant Tuberculosis/genetics, Whole Genome Sequencing, Drug Resistance
ABSTRACT
BACKGROUND: Environmental conditions vary among deserts across the world, spanning from hyper-arid to high-elevation deserts. However, prior genomic studies on desert adaptation have focused on desert and non-desert comparisons overlooking the complexity of conditions within deserts. Focusing on the adaptation mechanisms to diverse desert environments will advance our understanding of how species adapt to extreme desert environments. The hairy-footed jerboas are well adapted to diverse desert environments, inhabiting high-altitude arid regions, hyper-arid deserts, and semi-deserts, but the genetic basis of their adaptation to different deserts remains unknown. RESULTS: Here, we sequenced the whole genome of 83 hairy-footed jerboas from distinct desert zones in China to assess how they responded under contrasting conditions. Population genomics analyses reveal the existence of three species in hairy-footed jerboas distributed in China: Dipus deasyi, Dipus sagitta, and Dipus sowerbyi. Analyses of selection between high-altitude desert (elevation ≥ 3000m) and low-altitude desert (< 500m) populations identified two strongly selected genes, ATR and HIF1AN, associated with intense UV radiation and hypoxia in high-altitude environments. A number of candidate genes involved in energy and water homeostasis were detected in the comparative genomic analyses of hyper-arid desert (average annual precipitation < 70mm) and arid desert (< 200mm) populations versus semi-desert (> 360mm) populations. Hyper-arid desert animals also exhibited stronger adaptive selection in energy homeostasis, suggesting water and resource scarcity may be the main drivers of desert adaptation in hairy-footed jerboas. CONCLUSIONS: Our study challenges the view of deserts as homogeneous environments and shows that distinct genomic adaptations can be found among desert animals depending on their habitats.
Subjects
Acclimatization, Rodentia, Animals, Whole Genome Sequencing, Environment, Altitude
ABSTRACT
The use of dyes in the textile industry has resulted in substantial contamination of soil, water and ecosystems, including fauna and flora. Eco-friendly approaches to dye removal are therefore in great demand. The goal of this research was to develop and test a bacterial consortium for biodegrading dyes in artificial textile effluent (ATE) derived from a mixture of Indigo carmine (40 mg/l), Malachite green (20 mg/l), Cotton blue (40 mg/l), Bromocresol green (20 mg/l) and CI Reactive Red 66 (40 mg/l) dissolved in artificial seawater. A Box-Behnken design (BBD), which combines six variables with three levels each, was used to determine the potential removal of dyes in ATE by the selected microbial consortium (M31 and M69b). The experiments indicated that decolourization of ATE reached 77.36 % under the following conditions: salinity (30 g/l), pH (9), peptone (5 g/l), inoculum size (1.5 × 10^8 CFU/ml), agitation (150 rpm) and contact time (72 h). The decolourization was confirmed by FTIR spectrum analysis of ATE before and after bacterial treatment. The bacterial strains used in this study were identified as Halomonas pacifica M31 and Shewanella algae M69b using 16S rDNA sequences. Moreover, whole-genome analysis of M31 and M69b supported the involvement of bacterial genes in the removal of the dye mixture. The effect of the selected bacterial consortium on ATE decolourization was thus confirmed, and the consortium may be used in industrial wastewater treatment to improve environmental safety.
Subjects
Ecosystem, Microbial Consortia, Microbial Consortia/genetics, Coloring Agents, Bromcresol Green, Whole Genome Sequencing
ABSTRACT
Lyme disease is the most common vector-borne disease in North America and Europe. The clinical manifestations of Lyme disease vary based on the genospecies of the infecting Borrelia burgdorferi spirochete, but the microbial genetic elements underlying these associations are not known. Here, we report the whole genome sequence (WGS) and analysis of 299 B. burgdorferi (Bb) isolates derived from patients in the Eastern and Midwestern US and Central Europe. We develop a WGS-based classification of Bb isolates, confirm and extend the findings of previous single- and multi-locus typing systems, define the plasmid profiles of human-infectious Bb isolates, annotate the core and strain-variable surface lipoproteome, and identify loci associated with disseminated infection. A core genome consisting of ~900 open reading frames and a core set of plasmids consisting of lp17, lp25, lp36, lp28-3, lp28-4, lp54, and cp26 are found in nearly all isolates. Strain-variable (accessory) plasmids and genes correlate strongly with phylogeny. Using genetic association study methods, we identify an accessory genome signature associated with dissemination in humans and define the individual plasmids and genes that make up this signature. Strains within the RST1/WGS A subgroup, particularly a subset marked by the OspC type A genotype, have increased rates of dissemination in humans. OspC type A strains possess a unique set of strongly linked genetic elements including the presence of lp56 and lp28-1 plasmids and a cluster of genes that may contribute to their enhanced virulence compared to other genotypes. These features of OspC type A strains reflect a broader paradigm across Bb isolates, in which near-clonal genotypes are defined by strain-specific clusters of linked genetic elements, particularly those encoding surface-exposed lipoproteins. These clusters of genes are maintained by strain-specific patterns of plasmid occupancy and are associated with the probability of invasive infection.
Subjects
Borrelia burgdorferi, Lyme Disease, Humans, Borrelia burgdorferi/genetics, Genotype, Whole Genome Sequencing, Plasmids/genetics
ABSTRACT
Listeria monocytogenes is a pathogenic bacterium that can survive in adverse environments (low pH, high salinity, and low temperature). Although various whole genome sequencing (WGS) data are available for L. monocytogenes, the genetic differences between stress-resistant and stress-sensitive L. monocytogenes grown under stress conditions have not been fully examined. This study aims to investigate and compare the genetic characteristics of stress-resistant and stress-sensitive L. monocytogenes using WGS. A total of 47 L. monocytogenes strains (43 stress-resistant and 4 stress-sensitive) were selected based on stress-resistance tests at pH 3, 5% salt concentration, and 1 °C. The sequencing library for WGS was prepared and sequenced using an Illumina MiSeq. The genetic characteristics of the two groups were examined by analyzing the pangenome, functionality, virulence, antibiotic resistance, and core and unique genes. The functionality of unique genes in the stress-resistant group differed from that in the stress-sensitive group, for example in carbohydrate and nucleotide transport and metabolism. The lisR virulence gene was detected more frequently in the stress-resistant group than in the stress-sensitive group. Five stress-resistant strains possessed the tet(M) antibiotic resistance gene. This is the first study to suggest that the genomic characteristics of L. monocytogenes may be associated with different levels of resistance under stress conditions. This insight will aid in understanding the genetic relationship between stress-resistant and stress-sensitive L. monocytogenes strains isolated from diverse sources.
KEY POINTS:
• Whole genomes of L. monocytogenes isolated from three different sources were analyzed.
• Differences between the two L. monocytogenes groups were identified in functionality, virulence, and antibiotic resistance genes.
• This study is the first to examine the association between stress resistance and the whole genomes of stress-resistant and stress-sensitive L. monocytogenes.
Subjects
Listeria monocytogenes, Listeria monocytogenes/genetics, Food Microbiology, Virulence/genetics, Virulence Factors/genetics, Whole Genome Sequencing
ABSTRACT
Whole genome sequencing (WGS) in cancer genomics has become widespread with recent technological innovations, and the amount and types of information obtained from WGS are increasing rapidly. Appropriate interpretation of results is becoming increasingly important in clinical applications. This study aimed to evaluate the accuracy of tumor content estimation and its impact on somatic variant detection, using 100 simulated tumor samples covering 10-100% tumor content, constructed from the sequencing data of cell line models. Extensive analysis revealed that the estimation results varied among computational analytical methods. Notably, the discrepancy was large at low tumor content (≤ 30%). Reproducibility decreased in cases in which chromosome-scale copy number changes were observed in normal cells. The minimum tumor content required to detect somatic alterations was estimated to be 10-30%. Whole genome doubling was identified at the lowest tumor content, followed by single nucleotide variation/insertion or deletion, structural variation, and copy number variation. Tumor content had a significantly greater impact on false negatives than on false positives in variant calls. Results should be interpreted cautiously for samples in which tumor content is a concern. These results can form the basis for developing guidelines for evaluating cancer WGS.
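One way to see why low tumor content mainly increases false negatives is a simple binomial read-sampling model. The sketch below is an illustrative assumption, not the method used in the study: it takes a clonal somatic variant, a diploid normal genome, a sequencing depth of 30x, and a hypothetical calling rule of at least three variant reads.

```python
from math import comb


def expected_vaf(purity: float, mutant_copies: int = 1,
                 tumor_cn: int = 2, normal_cn: int = 2) -> float:
    """Expected variant allele fraction of a clonal somatic variant.

    purity is the tumor content as a fraction of cells (0 to 1).
    """
    tumor_alleles = purity * tumor_cn
    normal_alleles = (1.0 - purity) * normal_cn
    return purity * mutant_copies / (tumor_alleles + normal_alleles)


def detection_probability(depth: int, vaf: float, min_alt_reads: int = 3) -> float:
    """Probability of observing at least min_alt_reads variant reads,
    assuming reads are drawn independently (binomial model)."""
    p_below = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_below


# Assumed depth (30x) and threshold (3 reads) are placeholders for illustration.
for purity in (0.1, 0.3, 0.5, 1.0):
    vaf = expected_vaf(purity)
    prob = detection_probability(30, vaf)
    print(f"tumor content {purity:.0%}: expected VAF {vaf:.3f}, "
          f"P(>=3 variant reads at 30x) {prob:.2f}")
```

Under this model the expected allele fraction of a heterozygous variant in a diploid region is half the tumor content, so the chance of missing a true variant rises quickly as tumor content falls, while the model says nothing about false positives.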
Subjects
DNA Copy Number Variations, Neoplasms, Humans, Reproducibility of Results, Neoplasms/diagnosis, Neoplasms/genetics, Whole Genome Sequencing/methods, Genomics, High-Throughput Nucleotide Sequencing/methods
ABSTRACT
Biofilm formation is an intricate, multifaceted process in which individual cells secrete extracellular polymeric substances (EPS) and subsequently aggregate and adhere to various surfaces. Biofilms are a significant public health concern because biofilm-associated microorganisms show increased resistance to antimicrobial agents. The current study describes the whole genome and corresponding functions of a biofilm-inhibiting and biofilm-eradicating actinobacterial isolate identified as Nocardiopsis lucentensis EMB25. N. lucentensis EMB25 has a 6.5 Mbp genome with 71.62% GC content. Genome analysis with the BLAST Ring Image Generator (BRIG) revealed it to be closely related to Nocardiopsis dassonvillei NOCA502F. Interestingly, based on orthologous functional groups reflected by average nucleotide identity (ANI) analysis, it was 81.48% similar to N. arvandica DSM4527. It also produces lanthipeptides and linear azole(in)e-containing peptides (LAPs), as does N. arvandica. The secondary metabolite search revealed major gene clusters involved in terpene, ectoine, siderophore, lanthipeptide, RiPP-like, and T1PKS biosynthesis. After 24 h of treatment, the cell-free extract effectively eradicated the pre-existing biofilm of P. aeruginosa PseA. The isolate also exhibited antibacterial activity against MRSA, Staphylococcus aureus, and Bacillus subtilis. Overall, these findings offer valuable insights into the identification of biosynthetic gene clusters (BGCs), which contain enzymes that play a role in the biosynthesis of natural products, and shed light on the functional aspects of these BGCs in N. lucentensis.
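A GC-content figure such as the one reported above is the percentage of G and C bases among the unambiguous bases of the assembly. The following is a minimal sketch of that calculation; excluding ambiguous symbols such as N from the denominator is an assumption made here, and the example sequence is a made-up toy string rather than genome data.

```python
def gc_content(sequence: str) -> float:
    """Return GC content as a percentage of unambiguous (A/C/G/T) bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total


# Toy sequence for illustration only: 6 of its 8 unambiguous bases are G or C.
toy = "GCGCGCATNN"
print(f"GC content: {gc_content(toy):.2f}%")
```

For a multi-contig assembly the same function can be applied to the concatenated contigs, which weights each contig by its length.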
Subjects
Actinobacteria, Bacillus, Biofilms, Actinobacteria/genetics, Extracellular Polymeric Substance Matrix, Whole Genome Sequencing
ABSTRACT
High-quality whole-genome resequencing of large-scale pig populations with pedigree structure and multiple breeds would enable accurate haplotype construction and robust detection of selection signatures. Here, we sequence 740 pigs, combine them with 149 of our previously published resequencing datasets, and retrieve a further 207 resequencing datasets to form a panel of globally distributed wild boars and aboriginal and highly selected pigs with pedigree structures, amounting to 1096 genomes from 43 breeds. Using their haplotype-informative reads and pedigree structure, we accurately construct a panel of 1874 haploid genomes with 41,964,356 genetic variants. We further demonstrate its value in GWAS by identifying five novel loci for intramuscular fat content, and in genomic selection by increasing the accuracy of estimated breeding values by 36.7%. In evolutionary selection, we detect the MUC13 gene under long-term balancing selection, as well as the NPR3 gene under positive selection for pig stature. Our study provides abundant genomic variation for robust selection-signature detection and accurate haplotypes for deciphering complex traits in pigs.