ABSTRACT
BACKGROUND: Pneumococcus kills over one million children annually, and over 90% of these deaths occur in low-income countries, especially in Sub-Saharan Africa (SSA), where HIV exacerbates the disease burden. In SSA, serotype 1 pneumococci, particularly the endemic ST217 clone, cause the majority of the pneumococcal disease burden. To understand the evolution of the virulent ST217 clone, we analysed ST217 whole genomes from isolates sampled from African and Asian countries.
METHODS: We analysed 226 whole genome sequences from the ST217 lineage sampled from 9 African and 4 Asian countries. We constructed a whole genome alignment and used it for phylogenetic and coalescent analyses. We also screened the genomes to determine the presence of antibiotic resistance-conferring genes.
RESULTS: Population structure analysis grouped the ST217 isolates into five sequence clusters (SCs), which were highly associated with different geographical regions and showed limited intracontinental and intercontinental spread. The SCs showed lower than expected genomic sequence diversity, which suggested strong purifying selection and small population sizes caused by bottlenecks. Recombination rates varied between the SCs but were lower than in other successful clones such as PMEN1. African isolates showed a higher prevalence of antibiotic resistance genes than Asian isolates. Interestingly, certain West African isolates harbored a defective chloramphenicol and tetracycline resistance-conferring element (Tn5253) with a deletion in the loci encoding the chloramphenicol resistance gene (cat pC194), which resulted in lower resistance to chloramphenicol than to tetracycline. Furthermore, certain genes that promote colonisation were absent in the isolates, which may contribute to serotype 1's rarity in carriage and consequently its lower recombination rates.
CONCLUSIONS: The high phylogeographic diversity of the ST217 clone shows that this clone has been in circulation globally for a long time, which allowed its diversification and adaptation in different geographical regions. Such geographic adaptation reflects local variation in selection pressures. Further studies will be required to fully understand the biological mechanisms that make the ST217 clone highly invasive but unable to colonise the human nasopharynx for long durations, which results in lower recombination rates.
Subjects
Pneumococcal Infections/microbiology, Streptococcus pneumoniae/genetics, Africa, Asia, Bacterial Drug Resistance/genetics, Genetic Variation, Humans, Nasopharynx/microbiology, Phylogeny, Pneumococcal Infections/epidemiology, Genetic Recombination, Genetic Selection, Serogroup, Streptococcus pneumoniae/drug effects, Streptococcus pneumoniae/isolation & purification, Tetracycline Resistance/genetics
ABSTRACT
Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites.
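The abstract does not give the fitted function, so the following is only a sketch of one common form of a double-exponential decay model for core genome size: the number of shared genes falls as more genomes are compared and levels off at an asymptote. The function name, the amplitudes and the decay constants are hypothetical; only the asymptote of 1,427 genes is taken from the text.

```python
import math

def core_genome_size(n, k1, tau1, k2, tau2, omega):
    """Expected number of genes shared by n genomes.

    Assumed form: k1*exp(-n/tau1) + k2*exp(-n/tau2) + omega,
    where omega is the asymptotic core genome size.
    """
    return k1 * math.exp(-n / tau1) + k2 * math.exp(-n / tau2) + omega

# Hypothetical amplitudes and decay constants, for illustration only.
# omega = 1427 is the core size reported in the abstract.
params = dict(k1=400.0, tau1=2.0, k2=150.0, tau2=15.0, omega=1427.0)

for n in (1, 5, 20, 70, 140):
    print(n, round(core_genome_size(n, **params)))
```

With any positive decay constants, the two exponential terms vanish as n grows, so the estimate for the full set of 140 isolates is effectively the asymptote.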
Subjects
Bacteremia/microbiology, Bacterial Proteins/genetics, Meningitis/microbiology, Streptococcal Infections/microbiology, Streptococcus pneumoniae/genetics, Bacterial Proteins/metabolism, Bacterial Genome, Genomics, Humans, Streptococcus pneumoniae/classification, Streptococcus pneumoniae/isolation & purification, Streptococcus pneumoniae/metabolism
ABSTRACT
We present whole genome profiling (WGP), a novel next-generation sequencing-based physical mapping technology for construction of bacterial artificial chromosome (BAC) contigs of complex genomes, using Arabidopsis thaliana as an example. WGP leverages short read sequences derived from restriction fragments of two-dimensionally pooled BAC clones to generate sequence tags. These sequence tags are assigned to individual BAC clones, followed by assembly of BAC contigs based on shared regions containing identical sequence tags. Following in silico analysis of WGP sequence tags and simulation of a map of Arabidopsis chromosome 4 and maize, a WGP map of Arabidopsis thaliana ecotype Columbia was constructed de novo using a six-genome equivalent BAC library. Validation of the WGP map using the Columbia reference sequence confirmed that 350 BAC contigs (98%) were assembled correctly, spanning 97% of the 102-Mb calculated genome coverage. We demonstrate that WGP maps can also be generated for more complex plant genomes and will serve as excellent scaffolds to anchor genetic linkage maps and integrate whole genome sequence data.
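The idea of assembling BAC contigs from shared sequence tags can be sketched as a graph problem: link two clones when they share enough tags, then report the connected groups. This is a simplified illustration, not the WGP assembly procedure itself. The function `build_contigs`, the `min_shared` threshold and the clone and tag names are all hypothetical.

```python
from itertools import combinations

def build_contigs(bac_tags, min_shared=2):
    """Group BAC clones into contigs.

    bac_tags maps a clone name to its set of sequence tags. Two clones are
    linked when they share at least min_shared tags (an assumed threshold);
    contigs are the connected components of the resulting graph.
    """
    parent = {bac: bac for bac in bac_tags}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(bac_tags, 2):
        if len(bac_tags[a] & bac_tags[b]) >= min_shared:
            parent[find(a)] = find(b)

    groups = {}
    for bac in bac_tags:
        groups.setdefault(find(bac), []).append(bac)
    return sorted(sorted(members) for members in groups.values())

# Hypothetical clones and tags.
bac_tags = {
    "BAC1": {"t1", "t2", "t3"},
    "BAC2": {"t2", "t3", "t4"},
    "BAC3": {"t4", "t5", "t6"},
    "BAC4": {"t8", "t9"},
    "BAC5": {"t8", "t9", "t10"},
}
contigs = build_contigs(bac_tags)
print(contigs)
```

Here BAC2 and BAC3 share only one tag, so BAC3 stays a singleton under the assumed threshold of two shared tags.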
Subjects
Arabidopsis/genetics, Chromosome Mapping/methods, Plant Genome/genetics, High-Throughput Nucleotide Sequencing, Bacterial Artificial Chromosomes/genetics, Computational Biology, Contig Mapping, Genomic Library
ABSTRACT
The duplicated and highly repetitive nature of the maize genome has historically impeded the development of true single nucleotide polymorphism (SNP) markers in this crop. Recent advances in genome complexity reduction methods coupled with sequencing-by-synthesis technologies permit the implementation of efficient genome-wide SNP discovery in maize. In this study, we applied Complexity Reduction of Polymorphic Sequences technology (Keygene N.V., Wageningen, The Netherlands) to identify informative SNPs between two genetically distinct maize inbred lines of North and South American origins. This approach resulted in the discovery of 1,123 putative SNPs representing low and single copy loci. In silico and experimental (Illumina GoldenGate (GG) assay) validation of putative SNPs resulted in mapping of 604 markers, of which 188 SNPs represented 43 haplotype blocks distributed across all ten chromosomes. We defined a specific combination of stringent criteria (>0.3 minor allele frequency, >0.8 GenTrainScore and >0.5 Chi_test100 score) necessary for the identification of highly polymorphic and genetically stable SNP markers. Using these criteria, we identified a subset of 120 high-quality SNP markers for use in GG assay-based marker-assisted selection projects. A total of 32 high-quality SNPs represented 21 of the 43 haplotypes identified in this study. The information on the selection criteria for highly polymorphic SNPs in a complex genome such as maize and the public availability of these SNP assays will be of great value for the maize molecular genetics and breeding community.
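The three selection criteria quoted above combine into a simple filter, sketched here. The thresholds come from the abstract; the `SnpAssay` record, its field names and the example assays are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SnpAssay:
    name: str
    maf: float          # minor allele frequency
    gentrain: float     # GenTrainScore
    chi_test100: float  # Chi_test100 score

def is_high_quality(snp, min_maf=0.3, min_gentrain=0.8, min_chi=0.5):
    """Apply the three thresholds stated in the abstract (all strict '>')."""
    return (snp.maf > min_maf
            and snp.gentrain > min_gentrain
            and snp.chi_test100 > min_chi)

# Hypothetical assay results, for illustration only.
assays = [
    SnpAssay("SNP_A", maf=0.45, gentrain=0.91, chi_test100=0.80),
    SnpAssay("SNP_B", maf=0.12, gentrain=0.95, chi_test100=0.90),  # fails MAF
    SnpAssay("SNP_C", maf=0.40, gentrain=0.70, chi_test100=0.95),  # fails GenTrainScore
    SnpAssay("SNP_D", maf=0.35, gentrain=0.85, chi_test100=0.55),
]
selected = [snp.name for snp in assays if is_high_quality(snp)]
print(selected)
```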
Subjects
Chromosome Mapping, Plant Chromosomes/genetics, Genetic Markers/genetics, Plant Genome/genetics, Single Nucleotide Polymorphism/genetics, Zea mays/genetics, Breeding, DNA Primers, Plant DNA/genetics, Genetic Linkage, Genotype, Polymerase Chain Reaction
ABSTRACT
Serotype 1 is one of the most common causes of pneumococcal disease worldwide. Pneumococcal protein vaccines are currently being developed as an alternative intervention strategy to pneumococcal conjugate vaccines. Pre-requisites for an efficacious pneumococcal protein vaccine are universal presence and minimal variation of the target antigen in the pneumococcal population, and the capability to induce a robust human immune response. We used in silico analysis to assess the prevalence of seven protein vaccine candidates (CbpA, PcpA, PhtD, PspA, SP0148, SP1912, SP2108) among 445 serotype 1 pneumococci from 26 different countries, across four continents. CbpA (76%), PspA (68%), PhtD (28%) and PcpA (11%) were not universally encoded in the study population, and would not provide full coverage against serotype 1. PcpA was widely present in the European (82%), but not in the African (2%) population. A multi-valent vaccine incorporating CbpA, PcpA, PhtD and PspA was predicted to provide coverage against 86% of the global population. SP0148, SP1912 and SP2108 were universally encoded and we further assessed their predicted amino acid, antigenic and structural variation. Multiple allelic variants of these proteins were identified, and different allelic variants dominated in different continents. The observed variation was predicted to impact the antigenicity and structure of two SP0148 variants, one SP1912 variant and four SP2108 variants; however, these variants were each only present in a small fraction of the global population (<2%). The vast majority of the observed variation was predicted to have no impact on the efficacy of a protein vaccine incorporating a single variant of SP0148, SP1912 and/or SP2108 from S. pneumoniae TIGR4.
Our findings emphasise the importance of taking geographic differences into account when designing global vaccine interventions and support the continued development of SP0148, SP1912 and SP2108 as protein vaccine candidates against this important pneumococcal serotype.
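The predicted coverage of a multi-valent vaccine is not the sum of the individual antigen prevalences, because many isolates encode more than one antigen. A minimal sketch of the calculation follows, assuming an isolate counts as covered when it encodes at least one vaccine component. That definition, the `coverage` function and the five example isolates are assumptions for illustration and not the study's data.

```python
def coverage(isolates, antigens):
    """Fraction of isolates encoding at least one of the given antigens.

    isolates maps an isolate name to the set of antigen genes it encodes.
    """
    wanted = set(antigens)
    covered = sum(1 for genes in isolates.values() if genes & wanted)
    return covered / len(isolates)

# Hypothetical presence/absence calls for five isolates.
isolates = {
    "iso1": {"CbpA", "PspA"},
    "iso2": {"PspA"},
    "iso3": {"PhtD"},
    "iso4": set(),            # encodes none of the four antigens
    "iso5": {"CbpA", "PcpA"},
}

single = coverage(isolates, ["CbpA"])
combined = coverage(isolates, ["CbpA", "PcpA", "PhtD", "PspA"])
print(single, combined)
```

In this toy population, CbpA alone covers two of five isolates, while the four-antigen combination covers four of five.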
Subjects
Antigenic Variation, Bacterial Antigens/genetics, Bacterial Proteins/genetics, Pneumococcal Infections/prevention & control, Pneumococcal Vaccines/administration & dosage, Streptococcus pneumoniae/genetics, Streptococcus pneumoniae/pathogenicity, Africa, Alleles, Amino Acid Sequence, Bacterial Antigens/chemistry, Bacterial Antigens/immunology, Asia, Bacterial Proteins/chemistry, Bacterial Proteins/immunology, Europe, Geography, Global Health, Humans, Molecular Models, Pneumococcal Infections/immunology, Pneumococcal Infections/pathology, Pneumococcal Infections/virology, Pneumococcal Vaccines/biosynthesis, Pneumococcal Vaccines/genetics, Pneumococcal Vaccines/immunology, Serogroup, South America, Streptococcus pneumoniae/classification, Streptococcus pneumoniae/immunology, Vaccination Coverage/statistics & numerical data, Subunit Vaccines, Virulence
ABSTRACT
MOTIVATION: Obtaining large-scale sequence alignments quickly and flexibly is an important step in the analysis of next-generation sequencing data. Applications based on the Smith-Waterman (SW) algorithm are often either not fast enough, limited to dedicated tasks or not sufficiently accurate due to statistical issues. Current SW implementations that run on graphics hardware do not report the alignment details necessary for further analysis.
RESULTS: With the Parallel SW Alignment Software (PaSWAS) it is possible (a) to have easy access to the computational power of NVIDIA-based general purpose graphics processing units (GPGPUs) to perform high-speed sequence alignments, and (b) to retrieve relevant information such as score, number of gaps and mismatches. The software reports multiple hits per alignment. The added value of the new SW implementation is demonstrated with two test cases: (1) tag recovery in next-generation sequence data and (2) isotype assignment within an immunoglobulin 454 sequence data set. Both cases show the usability and versatility of the new parallel Smith-Waterman implementation.
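For orientation, the Smith-Waterman recurrence and the kind of alignment details mentioned above (score, gaps, mismatches) can be sketched on the CPU in a few lines. This is a plain single-threaded illustration with a linear gap penalty, not PaSWAS code. It reports only the single best local alignment, and the scoring values are assumed defaults.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment of a and b with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the score matrix; cells are floored at zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    gaps = mismatches = 0
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            if a[i - 1] != b[j - 1]:
                mismatches += 1
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            gaps += 1
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            gaps += 1
            j -= 1

    return {
        "score": best,
        "gaps": gaps,
        "mismatches": mismatches,
        "a": "".join(reversed(out_a)),
        "b": "".join(reversed(out_b)),
    }

hit = smith_waterman("TTACGTTT", "GGACGTGG")
print(hit)
```

Each cell depends only on its upper, left and upper-left neighbours, so cells on the same anti-diagonal are independent of one another, which is what makes the algorithm a natural fit for parallel hardware.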
Subjects
Sequence Alignment, DNA Sequence Analysis, Base Sequence, High-Throughput Nucleotide Sequencing, Humans, Immunoglobulins/genetics, Single Nucleotide Polymorphism, Software
ABSTRACT
Serotype 1 Streptococcus pneumoniae is a leading cause of invasive pneumococcal disease (IPD) worldwide, with the highest burden in developing countries. We report the whole-genome sequencing analysis of 448 serotype 1 isolates from 27 countries worldwide (including 11 in Africa). The global serotype 1 population shows a strong phylogeographic structure at the continental level, and within Africa there is further region-specific structure. Our results demonstrate that region-specific diversification within Africa has been driven by limited cross-region transfer events, genetic recombination and antimicrobial selective pressure. Clonal replacement of the dominant serotype 1 clones circulating within regions is uncommon; however, here we report on the accessory gene content that has contributed to a rare clonal replacement event of ST3081 with ST618 as the dominant cause of IPD in the Gambia.
ABSTRACT
Conventional marker-based genotyping platforms are widely available, but not without their limitations. In this context, we developed Sequence-Based Genotyping (SBG), a technology for simultaneous marker discovery and co-dominant scoring, using next-generation sequencing. SBG offers users several advantages including a generic sample preparation method, a highly robust genome complexity reduction strategy to facilitate de novo marker discovery across entire genomes, and a uniform bioinformatics workflow strategy to achieve genotyping goals tailored to individual species, regardless of the availability of a reference sequence. The most distinguishing features of this technology are the ability to genotype any population structure, regardless of whether parental data is included, and the ability to co-dominantly score SNP markers segregating in populations. To demonstrate the capabilities of SBG, we performed marker discovery and genotyping in Arabidopsis thaliana and lettuce, two plant species of diverse genetic complexity and backgrounds. Initially we obtained 1,409 SNPs for Arabidopsis and 5,583 SNPs for lettuce. Further filtering of the SNP dataset produced over 1,000 high quality SNP markers for each species. We obtained a genotyping rate of 201.2 genotypes/SNP and 58.3 genotypes/SNP for Arabidopsis (n = 222 samples) and lettuce (n = 87 samples), respectively. Linkage mapping using these SNPs resulted in stable map configurations. We have therefore shown that the SBG approach presented provides users with the utmost flexibility in garnering high quality markers that can be directly used for genotyping and downstream applications. Until advances and costs allow for routine whole-genome sequencing of populations, we expect that sequence-based genotyping technologies such as SBG will be essential for genotyping of model and non-model genomes alike.
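The two genotyping rates are easier to compare as a fraction of samples. The short calculation below assumes that "genotypes/SNP" means the average number of samples with a called genotype per SNP; that reading is an interpretation of the abstract, and the function name is hypothetical.

```python
def call_rate(mean_genotypes_per_snp, n_samples):
    """Average fraction of samples with a called genotype per SNP."""
    return mean_genotypes_per_snp / n_samples

arabidopsis_rate = call_rate(201.2, 222)
lettuce_rate = call_rate(58.3, 87)

print(f"Arabidopsis: {arabidopsis_rate:.1%}")
print(f"Lettuce: {lettuce_rate:.1%}")
```

Under that reading, roughly nine in ten Arabidopsis samples and two in three lettuce samples received a genotype call at a typical SNP.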
Subjects
Arabidopsis/genetics, Genotyping Techniques, High-Throughput Nucleotide Sequencing/methods, Lactuca/genetics, Chromosome Mapping, Computational Biology/methods, Genetic Linkage, Genetic Markers, Plant Genome, Genotype, Single Nucleotide Polymorphism, Reproducibility of Results
ABSTRACT
Application of single nucleotide polymorphisms (SNPs) is revolutionizing human biomedical research. However, discovery of polymorphisms in species with low polymorphism is still a challenging and costly endeavor, despite widespread availability of Sanger sequencing technology. We present CRoPS as a novel approach for polymorphism discovery by combining the power of reproducible genome complexity reduction of AFLP with Genome Sequencer (GS) 20/GS FLX next-generation sequencing technology. With CRoPS, hundreds of thousands of sequence reads derived from complexity-reduced genome sequences of two or more samples are processed and mined for SNPs using a fully automated bioinformatics pipeline. We show that over 75% of putative maize SNPs discovered using CRoPS are successfully converted to SNPWave assays, confirming them to be true SNPs derived from unique (single-copy) genome sequences. By using CRoPS, polymorphism discovery will become affordable in organisms with high levels of repetitive DNA in the genome and/or low levels of polymorphism in the (breeding) germplasm without the need for prior sequence information.