ABSTRACT
Salmonella enterica serovar Typhimurium strain ATCC14028s is commercially available from multiple national type culture collections, and has been widely used since 1960 for quality control of growth media and experiments on fitness ("laboratory evolution"). ATCC14028s has been implicated in multiple cross-contaminations in the laboratory, and has also caused multiple laboratory infections and one known attempt at bioterrorism. According to hierarchical clustering of 3002 core gene sequences, ATCC14028s belongs to HierCC cluster HC20_373 in which most internal branch lengths are only one to three SNPs long. Many natural Typhimurium isolates from humans, domesticated animals and the environment also belong to HC20_373, and their core genomes are almost indistinguishable from those of laboratory strains. These natural isolates have infected humans in Ireland and Taiwan for decades, and are common in the British Isles as well as the Americas. The isolation history of some of the natural isolates confirms the conclusion that they do not represent recent contamination by the laboratory strain, and 10% carry plasmids or bacteriophages which have been acquired in nature by HGT from unrelated bacteria. We propose that ATCC14028s has repeatedly escaped from the laboratory environment into nature via laboratory accidents or infections, but the escaped micro-lineages have only a limited life span. As a result, there is a genetic gap separating HC20_373 from its closest natural relatives due to a divergence between them in the late 19th century followed by repeated extinction events of escaped HC20_373.
Subjects
Bacterial Genome, Laboratories, Salmonella enterica/genetics, Bayes Theorem, Bioterrorism, Genetic Databases, Molecular Evolution, Likelihood Functions, Phylogeny, Salmonella enterica/classification
ABSTRACT
The gastric bacterium Helicobacter pylori shares a coevolutionary history with humans that predates the out-of-Africa diaspora, and the geographical specificities of H. pylori populations reflect multiple well-known human migrations. We extensively sampled H. pylori from 16 ethnically diverse human populations across Siberia to help resolve whether ancient northern Eurasian populations persisted at high latitudes through the last glacial maximum and the relationships between present-day Siberians and Native Americans. A total of 556 strains were cultivated and genotyped by multilocus sequence typing, and 54 representative draft genomes were sequenced. The genetic diversity across Eurasia and the Americas was structured into three populations: hpAsia2, hpEastAsia, and hpNorthAsia. hpNorthAsia is closely related to the subpopulation hspIndigenousAmericas from Native Americans. Siberian bacteria were structured into five other subpopulations, two of which evolved through a divergence from hpAsia2 and hpNorthAsia, while three originated through Holocene admixture. The presence of both anciently diverged and recently admixed strains across Siberia supports both Pleistocene persistence and Holocene recolonization. We also show that hspIndigenousAmericas is endemic in human populations across northern Eurasia. The evolutionary history of hspIndigenousAmericas was reconstructed using approximate Bayesian computation, which showed that it colonized the New World in a single migration event associated with a severe demographic bottleneck followed by low levels of recent admixture across the Bering Strait.
Subjects
Animal Migration/physiology, Helicobacter pylori/physiology, Americas, Biological Evolution, Bacterial Genome, Geography, Helicobacter pylori/classification, Helicobacter pylori/genetics, Humans, Biological Models, Multilocus Sequence Typing, Siberia
ABSTRACT
Bacterial genomes can contain traces of a complex evolutionary history, including extensive homologous recombination, gene loss, gene duplications, and horizontal gene transfer. To reconstruct the phylogenetic and population history of a set of multiple bacteria, it is necessary to examine their pangenome, the composite of all the genes in the set. Here we introduce PEPPAN, a novel pipeline that can reliably construct pangenomes from thousands of genetically diverse bacterial genomes that represent the diversity of an entire genus. PEPPAN outperforms existing pangenome methods by providing consistent gene and pseudogene annotations extended by similarity-based gene predictions, and identifying and excluding paralogs by combining tree- and synteny-based approaches. The PEPPAN package additionally includes PEPPAN_parser, which implements additional downstream analyses, including the calculation of trees based on accessory gene content or allelic differences between core genes. To test the accuracy of PEPPAN, we implemented SimPan, a novel pipeline for simulating the evolution of bacterial pangenomes. We compared the accuracy and speed of PEPPAN with four state-of-the-art pangenome pipelines using both empirical and simulated data sets. PEPPAN was more accurate and more specific than any of the other pipelines and was almost as fast as any of them. As a case study, we used PEPPAN to construct a pangenome of approximately 40,000 genes from 3052 representative genomes spanning at least 80 species of Streptococcus. The resulting gene and allelic trees provide an unprecedented overview of the genomic diversity of the entire Streptococcus genus.
Subjects
Bacteria/classification, Bacterial Genome, Genomics/methods, Phylogeny, Algorithms, Bacterial Genes, Pseudogenes, Software, Streptococcus/classification, Streptococcus/genetics
ABSTRACT
EnteroBase is an integrated software environment that supports the identification of global population structures within several bacterial genera that include pathogens. Here, we provide an overview of how EnteroBase works, what it can do, and its future prospects. EnteroBase has currently assembled more than 300,000 genomes from Illumina short reads from Salmonella, Escherichia, Yersinia, Clostridioides, Helicobacter, Vibrio, and Moraxella and genotyped those assemblies by core genome multilocus sequence typing (cgMLST). Hierarchical clustering of cgMLST sequence types allows mapping a new bacterial strain to predefined population structures at multiple levels of resolution within a few hours after uploading its short reads. Case Study 1 illustrates this process for local transmissions of Salmonella enterica serovar Agama between neighboring social groups of badgers and humans. EnteroBase also supports single nucleotide polymorphism (SNP) calls from both genomic assemblies and after extraction from metagenomic sequences, as illustrated by Case Study 2 which summarizes the microevolution of Yersinia pestis over the last 5000 years of pandemic plague. EnteroBase can also provide a global overview of the genomic diversity within an entire genus, as illustrated by Case Study 3, which presents a novel, global overview of the population structure of all of the species, subspecies, and clades within Escherichia.
Subjects
Genetic Databases, Escherichia/genetics, Bacterial Genome, Genomics, Salmonella/genetics, Yersinia pestis/genetics, Escherichia/classification, Genomics/methods, Metagenome, Metagenomics/methods, Multilocus Sequence Typing, Phylogeny, Salmonella/classification, Software, User-Computer Interface, Web Browser, Yersinia pestis/classification
ABSTRACT
MOTIVATION: Routine infectious disease surveillance is increasingly based on large-scale whole-genome sequencing databases. Real-time surveillance would benefit from immediate assignments of each genome assembly to hierarchical population structures. Here we present pHierCC, a pipeline that defines a scalable clustering scheme, HierCC, based on core genome multi-locus typing that allows incremental, static, multi-level cluster assignments of genomes. We also present HCCeval, which identifies optimal thresholds for assigning genomes to cohesive HierCC clusters. HierCC was implemented in EnteroBase in 2018 and has since genotyped >530 000 genomes from Salmonella, Escherichia/Shigella, Streptococcus, Clostridioides, Vibrio and Yersinia. AVAILABILITY AND IMPLEMENTATION: https://enterobase.warwick.ac.uk/. Source code and instructions: https://github.com/zheminzhou/pHierCC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
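The idea of multi-level cluster assignments from core genome allelic profiles can be illustrated with a small sketch. The abstract does not describe the pHierCC algorithm itself, so the code below assumes plain single-linkage clustering over pairwise allelic distances at several thresholds; the function names, the toy profiles and the use of 0 for a missing allele call are all hypothetical choices made for illustration only.

```python
from itertools import combinations


def allelic_distance(a, b):
    """Count loci where both profiles have a call (non-zero) and the alleles differ."""
    return sum(1 for x, y in zip(a, b) if x and y and x != y)


def single_linkage(profiles, threshold):
    """Label each profile with the smallest index in its cluster.

    Two genomes share a cluster when a chain of pairwise distances
    <= threshold connects them (single linkage, via union-find).
    """
    parent = list(range(len(profiles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(profiles)), 2):
        if allelic_distance(profiles[i], profiles[j]) <= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                # keep the smaller index as the cluster representative
                parent[max(ri, rj)] = min(ri, rj)
    return [find(i) for i in range(len(profiles))]


# Toy allelic profiles: one row per genome, one column per core locus.
profiles = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 2],
    [1, 1, 1, 3, 2],
    [4, 4, 4, 4, 4],
]

# One cluster assignment per threshold gives a multi-level designation.
levels = {t: single_linkage(profiles, t) for t in (0, 1, 5)}
for t, labels in levels.items():
    print(f"HC{t}: {labels}")
```

Under these assumptions, each genome receives one label per threshold, so a coarse level groups what a fine level separates. This sketch recomputes everything from scratch and therefore does not reproduce the incremental, static assignments that the abstract attributes to HierCC.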
ABSTRACT
[This corrects the article DOI: 10.1371/journal.ppat.1002776.]
ABSTRACT
Current methods struggle to reconstruct and visualize the genomic relationships of large numbers of bacterial genomes. GrapeTree facilitates the analyses of large numbers of allelic profiles by a static "GrapeTree Layout" algorithm that supports interactive visualizations of large trees within a web browser window. GrapeTree also implements a novel minimum spanning tree algorithm (MSTree V2) to reconstruct genetic relationships despite high levels of missing data. GrapeTree is a stand-alone package for investigating phylogenetic trees plus associated metadata and is also integrated into EnteroBase to facilitate cutting edge navigation of genomic relationships among bacterial pathogens.
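As a point of reference for tree building from allelic profiles, the following sketch computes a classical minimum spanning tree with Prim's algorithm. It is not MSTree V2, whose details are not given in the abstract; missing allele calls (represented here as None, an assumption) are simply skipped when counting differences, which is a naive treatment. All names and the toy profiles are hypothetical.

```python
def allelic_distance(a, b):
    """Count differing alleles over loci that are called in both profiles."""
    return sum(
        1
        for x, y in zip(a, b)
        if x is not None and y is not None and x != y
    )


def prim_mst(profiles):
    """Return MST edges as (distance, from_index, to_index), grown from node 0."""
    n = len(profiles)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # pick the cheapest edge leaving the current tree
        best = min(
            (allelic_distance(profiles[i], profiles[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append(best)
        in_tree.add(best[2])
    return edges


# Toy allelic profiles; None marks a locus with no allele call.
profiles = [
    [1, 1, 1, 1],
    [1, 1, 1, 2],
    [1, None, 2, 2],
    [3, 3, 3, 3],
]

mst = prim_mst(profiles)
for dist, i, j in mst:
    print(f"{i} -- {j}: {dist} allelic differences")
```

In this toy example the third profile lacks a call at one locus, so its distance to the fourth profile is counted over three loci only. Distances shrinking in this way when calls are missing is the kind of distortion that any method working with incomplete profiles has to account for.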
Subjects
Bacteria/genetics, Taxonomic DNA Barcoding/methods, Bacterial Genome, Phylogeny, Software, Alleles, Bacteria/classification, Bacteria/pathogenicity
ABSTRACT
For many decades, Salmonella enterica has been subdivided by serological properties into serovars or further subdivided for epidemiological tracing by a variety of diagnostic tests with higher resolution. Recently, it has been proposed that so-called eBurst groups (eBGs) based on the alleles of seven housekeeping genes (legacy multilocus sequence typing [MLST]) corresponded to natural populations and could replace serotyping. However, this approach lacks the resolution needed for epidemiological tracing and the existence of natural populations had not been validated by independent criteria. Here, we describe EnteroBase, a web-based platform that assembles draft genomes from Illumina short reads in the public domain or that are uploaded by users. EnteroBase implements legacy MLST as well as ribosomal gene MLST (rMLST), core genome MLST (cgMLST), and whole genome MLST (wgMLST) and currently contains over 100,000 assembled genomes from Salmonella. It also provides graphical tools for visual interrogation of these genotypes and those based on core single nucleotide polymorphisms (SNPs). eBGs based on legacy MLST are largely consistent with eBGs based on rMLST, thus demonstrating that these correspond to natural populations. rMLST also facilitated the selection of representative genotypes for SNP analyses of the entire breadth of diversity within Salmonella. In contrast, cgMLST provides the resolution needed for epidemiological investigations. These observations show that genomic genotyping, with the assistance of EnteroBase, can be applied at all levels of diversity within the Salmonella genus.
Subjects
Genetic Databases, Bacterial Genome, Salmonella/classification, Salmonella/genetics, Multilocus Sequence Typing, Phylogeny, Single Nucleotide Polymorphism
ABSTRACT
Epidemics and pandemics of cholera, a severe diarrheal disease, have occurred since the early 19th century and waves of epidemic disease continue today. Cholera epidemics are caused by individual, genetically monomorphic lineages of Vibrio cholerae: the ongoing seventh pandemic, which has spread globally since 1961, is associated with lineage L2 of biotype El Tor. Previous genomic studies of the epidemiology of the seventh pandemic identified three successive sub-lineages within L2, designated waves 1 to 3, which spread globally from the Bay of Bengal on multiple occasions. However, these studies did not include samples from China, which also experienced multiple epidemics of cholera in recent decades. We sequenced the genomes of 71 strains isolated in China between 1961 and 2010, as well as eight from other sources, and compared them with 181 published genomes. The results indicated that outbreaks in China between 1960 and 1990 were associated with wave 1 whereas later outbreaks were associated with wave 2. However, the previously defined waves overlapped temporally, and are an inadequate representation of the shape of the global genealogy. We therefore suggest replacing them by a series of tightly delineated clades. Between 1960 and 1990 multiple such clades were imported into China, underwent further microevolution there and then spread to other countries. China was thus both a sink and source during the pandemic spread of V. cholerae, and needs to be included in reconstructions of the global patterns of spread of cholera.
Subjects
Cholera/epidemiology, Vibrio cholerae/classification, China/epidemiology, Humans, Pandemics, Single Nucleotide Polymorphism, Vibrio cholerae/genetics
ABSTRACT
Multiple epidemic diseases have been designated as emerging or reemerging because the numbers of clinical cases have increased. Emerging diseases are often suspected to be driven by increased virulence or fitness, possibly associated with the gain of novel genes or mutations. However, the time period over which humans have been afflicted by such diseases is only known for very few bacterial pathogens, and the evidence for recently increased virulence or fitness is scanty. Has Darwinian (diversifying) selection at the genomic level recently driven microevolution within bacterial pathogens of humans? Salmonella enterica serovar Paratyphi A is a major cause of enteric fever, with a microbiological history dating to 1898. We identified seven modern lineages among 149 genomes on the basis of 4,584 SNPs in the core genome and estimated that Paratyphi A originated 450 y ago. During that time period, the effective population size has undergone expansion, reduction, and recent expansion. Mutations, some of which inactivate genes, have occurred continuously over the history of Paratyphi A, as has the gain or loss of accessory genes. We also identified 273 mutations that were under Darwinian selection. However, most genetic changes are transient, continuously being removed by purifying selection, and the genome of Paratyphi A has not changed dramatically over centuries. We conclude that Darwinian selection is not responsible for increased frequency of enteric fever and suggest that environmental changes may be more important for the frequency of disease.
Subjects
Global Health, Salmonella enterica/genetics, Genetic Selection, Typhoid Fever/epidemiology, Bacterial Genes, Humans, Single Nucleotide Polymorphism, Typhoid Fever/microbiology
ABSTRACT
The genus Yersinia has been used as a model system to study pathogen evolution. Using whole-genome sequencing of all Yersinia species, we delineate the gene complement of the whole genus and define patterns of virulence evolution. Multiple distinct ecological specializations appear to have split pathogenic strains from environmental, nonpathogenic lineages. This split demonstrates that contrary to hypotheses that all pathogenic Yersinia species share a recent common pathogenic ancestor, they have evolved independently but followed parallel evolutionary paths in acquiring the same virulence determinants as well as becoming progressively more limited metabolically. Shared virulence determinants are limited to the virulence plasmid pYV and the attachment invasion locus ail. These acquisitions, together with genomic variations in metabolic pathways, have resulted in the parallel emergence of related pathogens displaying an increasingly specialized lifestyle with a spectrum of virulence potential, an emerging theme in the evolution of other important human pathogens.
Subjects
Molecular Evolution, Virulence/genetics, Yersinia/genetics, Yersinia/pathogenicity, Bacterial Genome, Humans, Metabolic Networks and Pathways/genetics, Phylogeny, Species Specificity, Yersinia/metabolism, Yersinia enterocolitica/genetics, Yersinia enterocolitica/metabolism, Yersinia enterocolitica/pathogenicity
ABSTRACT
The widespread use of antibiotics in association with high-density clinical care has driven the emergence of drug-resistant bacteria that are adapted to thrive in hospitalized patients. Of particular concern are globally disseminated methicillin-resistant Staphylococcus aureus (MRSA) clones that cause outbreaks and epidemics associated with health care. The most rapidly spreading and tenacious health-care-associated clone in Europe currently is EMRSA-15, which was first detected in the UK in the early 1990s and subsequently spread throughout Europe and beyond. Using phylogenomic methods to analyze the genome sequences for 193 S. aureus isolates, we were able to show that the current pandemic population of EMRSA-15 descends from a health-care-associated MRSA epidemic that spread throughout England in the 1980s, which had itself previously emerged from a primarily community-associated methicillin-sensitive population. The emergence of fluoroquinolone resistance in this EMRSA-15 subclone in the English Midlands during the mid-1980s appears to have played a key role in triggering pandemic spread, and occurred shortly after the first clinical trials of this drug. Genome-based coalescence analysis estimated that the population of this subclone over the last 20 yr has grown four times faster than its progenitor. Using comparative genomic analysis we identified the molecular genetic basis of 99.8% of the antimicrobial resistance phenotypes of the isolates, highlighting the potential of pathogen genome sequencing as a diagnostic tool. We document the genetic changes associated with adaptation to the hospital environment and with increasing drug resistance over time, and how MRSA evolution likely has been influenced by country-specific drug use regimens.
Subjects
Bacterial Genome, Methicillin-Resistant Staphylococcus aureus/genetics, Staphylococcal Infections/epidemiology, Cluster Analysis, Bacterial Drug Resistance/genetics, Genomics, Genotype, Humans, Methicillin-Resistant Staphylococcus aureus/classification, Pandemics, Phylogeny, Phylogeography, Staphylococcal Infections/transmission, United Kingdom/epidemiology
ABSTRACT
Only a few molecular studies have addressed the age of bacterial pathogens that infected humans before the beginnings of medical bacteriology, but these have provided dramatic insights. The global genetic diversity of Helicobacter pylori, which infects human stomachs, parallels that of its human host. The time to the most recent common ancestor (tMRCA) of these bacteria approximates that of anatomically modern humans, i.e. at least 100 000 years, after calibrating the evolutionary divergence within H. pylori against major ancient human migrations. Similarly, genomic reconstructions of Mycobacterium tuberculosis, the cause of tuberculosis, from ancient skeletons in South America and mummies in Hungary support estimates of less than 6000 years for the tMRCA of M. tuberculosis. Finally, modern global patterns of genetic diversity and ancient DNA studies indicate that during the last 5000 years plague caused by Yersinia pestis has spread globally on multiple occasions from China and Central Asia. Such tMRCA estimates provide only lower bounds on the ages of bacterial pathogens, and additional studies are needed for realistic upper bounds on how long humans and animals have suffered from bacterial diseases.
Subjects
Biological Evolution, Bacterial Genome, Helicobacter pylori/genetics, Mycobacterium tuberculosis/genetics, Yersinia pestis/genetics, Animals, Bacterial DNA/analysis, Humans
ABSTRACT
Both anatomically modern humans and the gastric pathogen Helicobacter pylori originated in Africa, and both species have been associated for at least 100,000 years. Seven geographically distinct H. pylori populations exist, three of which are indigenous to Africa: hpAfrica1, hpAfrica2, and hpNEAfrica. The oldest and most divergent population, hpAfrica2, evolved within San hunter-gatherers, who represent one of the deepest branches of the human population tree. Anticipating the presence of ancient H. pylori lineages within all hunter-gatherer populations, we investigated the prevalence and population structure of H. pylori within Baka Pygmies in Cameroon. Gastric biopsies were obtained by esophagogastroduodenoscopy from 77 Baka from two geographically separated populations, and from 101 non-Baka individuals from neighboring agriculturalist populations, and subsequently cultured for H. pylori. Unexpectedly, Baka Pygmies showed a significantly lower H. pylori infection rate (20.8%) than non-Baka (80.2%). We generated multilocus haplotypes for each H. pylori isolate by DNA sequencing, but were not able to identify Baka-specific lineages, and most isolates in our sample were assigned to hpNEAfrica or hpAfrica1. The population hpNEAfrica, a marker for the expansion of the Nilo-Saharan language family, was divided into East African and Central West African subpopulations. Similarly, a new hpAfrica1 subpopulation, identified mainly among Cameroonians, supports eastern and western expansions of Bantu languages. An age-structured transmission model shows that the low H. pylori prevalence among Baka Pygmies is achievable within the timeframe of a few hundred years and suggests that demographic factors such as small population size and unusually low life expectancy can lead to the eradication of H. pylori from individual human populations. The Baka were thus either H. pylori-free or lost their ancient lineages during past demographic fluctuations. 
Using coalescent simulations and phylogenetic inference, we show that Baka almost certainly acquired their extant H. pylori through secondary contact with their agriculturalist neighbors.
Subjects
Gastrointestinal Tract/microbiology, Population Genetics, Helicobacter Infections/genetics, Helicobacter pylori/genetics, Africa, Biopsy, Black People, Genetic Variation, Growth Disorders/microbiology, Haplotypes, Helicobacter Infections/epidemiology, Helicobacter Infections/microbiology, Helicobacter pylori/pathogenicity, Humans, Phylogeny
ABSTRACT
Salmonella enterica serovar Agona has caused multiple food-borne outbreaks of gastroenteritis since it was first isolated in 1952. We analyzed the genomes of 73 isolates from global sources, comparing five distinct outbreaks with sporadic infections as well as food contamination and the environment. Agona consists of three lineages with minimal mutational diversity: only 846 single nucleotide polymorphisms (SNPs) have accumulated in the non-repetitive, core genome since Agona evolved in 1932 and subsequently underwent a major population expansion in the 1960s. Homologous recombination with other serovars of S. enterica imported 42 recombinational tracts (360 kb) in 5/143 nodes within the genealogy, which resulted in 3,164 additional SNPs. In contrast to this paucity of genetic diversity, Agona is highly diverse according to pulsed-field gel electrophoresis (PFGE), which is used to assign isolates to outbreaks. PFGE diversity reflects a highly dynamic accessory genome associated with the gain or loss (indels) of 51 bacteriophages, 10 plasmids, and 6 integrative conjugational elements (ICE/IMEs), but did not correlate uniquely with outbreaks. Unlike the core genome, indels occurred repeatedly in independent nodes (homoplasies), resulting in inaccurate PFGE genealogies. The accessory genome contained only a few cargo genes relevant to infection, other than antibiotic resistance. Thus, most of the genetic diversity within this recently emerged pathogen reflects changes in the accessory genome, or is due to recombination, but these changes seemed to reflect neutral processes rather than Darwinian selection. Each outbreak was caused by an independent clade, without universal, outbreak-associated genomic features, and none of the variable genes in the pan-genome seemed to be associated with an ability to cause outbreaks.
Subjects
Bacterial DNA, Serogroup, Bacterial DNA/genetics, Disease Outbreaks, Pulsed-Field Gel Electrophoresis, Genomics, Humans, Salmonella Infections, Salmonella enterica/genetics
ABSTRACT
The genetic diversity of Yersinia pestis, the etiologic agent of plague, is extremely limited because of its recent origin coupled with a slow clock rate. Here we identified 2,326 SNPs from 133 genomes of Y. pestis strains that were isolated in China and elsewhere. These SNPs define the genealogy of Y. pestis since its most recent common ancestor. All but 28 of these SNPs represented mutations that happened only once within the genealogy, and they were distributed essentially at random among individual genes. Only seven genes contained a significant excess of nonsynonymous SNPs, suggesting that the fixation of SNPs mainly arises via neutral processes, such as genetic drift, rather than Darwinian selection. However, the rate of fixation varies dramatically over the genealogy: the number of SNPs accumulated by different lineages was highly variable and the genealogy contains multiple polytomies, one of which resulted in four branches near the time of the Black Death. We suggest that demographic changes can affect the speed of evolution in epidemic pathogens even in the absence of natural selection, and hypothesize that neutral SNPs are fixed rapidly during intermittent epidemics and outbreaks.
Subjects
Molecular Evolution, Genetic Drift, Genetic Variation, Mutation Rate, Yersinia pestis/genetics, Base Sequence, China, Population Genetics, Likelihood Functions, Genetic Models, Molecular Epidemiology, Molecular Sequence Data, Phylogeny, Single Nucleotide Polymorphism/genetics, DNA Sequence Analysis
ABSTRACT
Listeria monocytogenes is ubiquitously prevalent in natural environments and is transmitted via the food chain to animals and humans, in whom it can cause life-threatening diseases. We used Multilocus Sequence Typing (MLST) of ~2000 isolates of L. monocytogenes to investigate whether specific associations existed between clonal complexes (CCs) and the environment versus diseased hosts. Most CCs (72%) were not specific for any single source, and many have been isolated from the environment, food products, animals as well as from humans. Our results confirm that the population structure of L. monocytogenes is largely clonal and consists of four lineages (I-IV), three of which contain multiple CCs. Most CCs have remained stable for decades, but one epidemic clone (CC101) was common in the mid-1950s and very rare until recently when it may have begun to re-emerge. The historical perspective used here indicates that the central sequence types of CCs were not ancestral founders but have rather simply increased in frequency over decades.
Subjects
Genetic Variation, Listeria monocytogenes/classification, Multilocus Sequence Typing, Phylogeny, Animals, Bacterial Typing Techniques, Environmental Microbiology, Food Contamination, Genotype, Humans, Listeria monocytogenes/genetics, Listeriosis
ABSTRACT
When modern humans left Africa ca. 60,000 years ago (60 kya), they were already infected with Helicobacter pylori, and these bacteria have subsequently diversified in parallel with their human hosts. But how long were humans infected by H. pylori prior to the out-of-Africa event? Did this co-evolution predate the emergence of modern humans, spanning the species divide? To answer these questions, we investigated the diversity of H. pylori in Africa, where both humans and H. pylori originated. Three distinct H. pylori populations are native to Africa: hpNEAfrica in Afro-Asiatic and Nilo-Saharan speakers, hpAfrica1 in Niger-Congo speakers and hpAfrica2 in South Africa. Rather than representing a sustained co-evolution over millions of years, we find that the coalescent for all H. pylori plus its closest relative H. acinonychis dates to 88-116 kya. At that time the phylogeny split into two primary super-lineages, one of which is associated with the former hunter-gatherers in southern Africa known as the San. H. acinonychis, which infects large felines, resulted from a later host jump from the San, 43-56 kya. These dating estimates, together with striking phylogenetic and quantitative human-bacterial similarities show that H. pylori is approximately as old as are anatomically modern humans. They also suggest that H. pylori may have been acquired via a single host jump from an unknown, non-human host. We also find evidence for a second Out of Africa migration in the last 52,000 years, because hpEurope is a hybrid population between hpAsia2 and hpNEAfrica, the latter of which arose in northeast Africa 36-52 kya, after the Out of Africa migrations around 60 kya.
Subjects
Molecular Evolution, Helicobacter Infections/microbiology, Helicobacter pylori/classification, Helicobacter pylori/genetics, Africa, Animals, Cats, Emigration and Immigration, Genetic Variation, Helicobacter Infections/epidemiology, Humans, Molecular Sequence Data, Pan troglodytes/microbiology, Phylogeny, 16S Ribosomal RNA/genetics
ABSTRACT
Salmonella enterica subspecies enterica is traditionally subdivided into serovars by serological and nutritional characteristics. We used Multilocus Sequence Typing (MLST) to assign 4,257 isolates from 554 serovars to 1092 sequence types (STs). The majority of the isolates and many STs were grouped into 138 genetically closely related clusters called eBurstGroups (eBGs). Many eBGs correspond to a serovar, for example most Typhimurium are in eBG1 and most Enteritidis are in eBG4, but many eBGs contained more than one serovar. Furthermore, most serovars were polyphyletic and are distributed across multiple unrelated eBGs. Thus, serovar designations confounded genetically unrelated isolates and failed to recognize natural evolutionary groupings. An inability of serotyping to correctly group isolates was most apparent for Paratyphi B and its variant Java. Most Paratyphi B were included within a sub-cluster of STs belonging to eBG5, which also encompasses a separate sub-cluster of Java STs. However, diphasic Java variants were also found in two other eBGs and monophasic Java variants were in four other eBGs or STs, one of which is in subspecies salamae and a second of which includes isolates assigned to Enteritidis, Dublin and monophasic Paratyphi B. Similarly, Choleraesuis was found in eBG6 and is closely related to Paratyphi C, which is in eBG20. However, Choleraesuis var. Decatur consists of isolates from seven other, unrelated eBGs or STs. The serological assignment of these Decatur isolates to Choleraesuis likely reflects lateral gene transfer of flagellar genes between unrelated bacteria plus purifying selection. By confounding multiple evolutionary groups, serotyping can be misleading about the disease potential of S. enterica. Unlike serotyping, MLST recognizes evolutionary groupings and we recommend that Salmonella classification by serotyping should be replaced by MLST or its equivalents.
Subjects
Bacterial Typing Techniques/methods, Salmonella enterica/classification, Serotyping/methods, Phylogeny, Salmonella enterica/genetics
ABSTRACT
Salmonella enterica is a bacterial pathogen that causes enteric fever and gastroenteritis in humans and animals. Although its population structure was long described as clonal, based on high linkage disequilibrium between loci typed by enzyme electrophoresis, recent examination of gene sequences has revealed that recombination plays an important evolutionary role. We sequenced around 10% of the core genome of 114 isolates of enterica using a resequencing microarray. Application of two different analysis methods (Structure and ClonalFrame) to our genomic data allowed us to define five clear lineages within S. enterica subspecies enterica, one of which is five times older than the other four and two thirds of the age of the whole subspecies. We show that some of these lineages display more evidence of recombination than others. We also demonstrate that some level of sexual isolation exists between the lineages, so that recombination has occurred predominantly between members of the same lineage. This pattern of recombination is compatible with expectations from the previously described ecological structuring of the enterica population as well as mechanistic barriers to recombination observed in laboratory experiments. In spite of their relatively low level of genetic differentiation, these lineages might therefore represent incipient species.