1.
Genome Res ; 33(6): 988-998, 2023 06.
Article in English | MEDLINE | ID: mdl-37253539

ABSTRACT

Bacterial genome data are accumulating at an unprecedented speed due to the routine use of sequencing in clinical diagnoses, public health surveillance, and population genetics studies. Genealogical reconstruction is fundamental to many of these uses; however, inferring genealogy from large-scale genome data sets quickly, accurately, and flexibly is still a challenge. Here, we extend an alignment- and annotation-free method, PopPUNK, to increase its flexibility and interpretability across data sets. Our method, iterative-PopPUNK, rapidly produces multiple consistent cluster assignments across a range of sequence identities. By constructing a partially resolved genealogical tree with respect to these clusters, users can select a resolution most appropriate for their needs. We showed the accuracy of clusters at all levels of similarity and genealogical inference of iterative-PopPUNK based on simulated data and obtained phylogenetically concordant results in real data sets from seven bacterial species. Using two example sets of Escherichia/Shigella and Vibrio parahaemolyticus genomes, we show that iterative-PopPUNK can achieve cluster resolutions ranging from phylogroup down to sequence typing (ST). The iterative-PopPUNK algorithm is implemented in the "PopPUNK_iterate" program, available as part of the PopPUNK package.
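
The idea of producing consistent cluster assignments across a range of sequence identities can be illustrated with a minimal sketch. This is not the PopPUNK_iterate implementation: it assumes a hypothetical matrix of pairwise distances and simply links samples whose distance is at or below each threshold, so that clusters found at a tight threshold nest inside those found at a looser one.

```python
def clusters_at_threshold(distances, threshold):
    """Group samples whose pairwise distance is <= threshold (single linkage)."""
    n = len(distances)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if distances[i][j] <= threshold:
                parent[find(i)] = find(j)

    # Collect members by root and return the clusters as sets of sample indices.
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), set()).add(i)
    return sorted(groups.values(), key=min)


# Hypothetical distances between four genomes (illustrative values only).
distances = [
    [0.00, 0.01, 0.04, 0.20],
    [0.01, 0.00, 0.04, 0.20],
    [0.04, 0.04, 0.00, 0.20],
    [0.20, 0.20, 0.20, 0.00],
]

fine = clusters_at_threshold(distances, 0.02)    # tighter threshold, more clusters
coarse = clusters_at_threshold(distances, 0.05)  # looser threshold, fewer clusters
```

Because every link accepted at the tight threshold is also accepted at the loose one, each fine cluster is contained in exactly one coarse cluster, which is the property that lets a user choose a resolution after the fact.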


Subject(s)
Algorithms, Bacterial Genome, Bacteria/genetics, Cluster Analysis
2.
PLoS Genet ; 17(9): e1009829, 2021 09.
Article in English | MEDLINE | ID: mdl-34582435

ABSTRACT

Measuring molecular evolution in bacteria typically requires estimation of the rate at which nucleotide changes accumulate in strains sampled at different times that share a common ancestor. This approach has been useful for dating ecological and evolutionary events that coincide with the emergence of important lineages, such as outbreak strains and obligate human pathogens. However, in multi-host (niche) transmission scenarios, where the pathogen is essentially an opportunistic environmental organism, sampling is often sporadic and rarely reflects the overall population, particularly when concentrated on clinical isolates. This means that approaches that assume recent common ancestry are not applicable. Here we present a new approach to estimate the molecular clock rate in Campylobacter that draws on the popular probability conundrum known as the 'birthday problem'. Using large genomic datasets and comparative genomic approaches, we use isolate pairs that share recent common ancestry to estimate the rate of nucleotide change for the population. Identifying synonymous and non-synonymous nucleotide changes, both within and outside of recombined regions of the genome, we quantify clock-like diversification to estimate synonymous rates of nucleotide change for the common pathogenic bacteria Campylobacter coli (2.4 × 10^-6 s/s/y) and Campylobacter jejuni (3.4 × 10^-6 s/s/y). Finally, using estimated total rates of nucleotide change, we infer the number of effective lineages within the sample time frame (analogous to a shared birthday) and assess the rate of turnover of lineages in our sample set over short evolutionary timescales. This provides a generalizable approach to calibrating rates in populations of environmental bacteria and shows that multiple lineages are maintained, implying that large-scale clonal sweeps may take hundreds of years or more in these species.
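
For readers unfamiliar with the 'birthday problem' that the abstract draws on, the textbook calculation is sketched below. This is only the classic probability puzzle, not the rate-estimation procedure of the paper; the function name and the default of 365 equally likely "days" are assumptions made for illustration.

```python
def shared_birthday_probability(people, days=365):
    """Probability that at least two of `people` share one of `days` equally likely birthdays."""
    if people > days:
        return 1.0  # pigeonhole principle: a collision is certain
    p_all_distinct = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct


p23 = shared_birthday_probability(23)  # just over one half
```

The same arithmetic can be read in the other direction: observing how often pairs coincide carries information about how many distinct categories there are, which is loosely the sense in which the abstract speaks of inferring a number of effective lineages.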


Subject(s)
Campylobacter/genetics, Molecular Evolution, Campylobacter/classification, Bacterial Genes, Genetic Variation, Phylogeny, Species Specificity
3.
Nature ; 519(7543): 309-314, 2015 Mar 19.
Article in English | MEDLINE | ID: mdl-25788095

ABSTRACT

Fine-scale genetic variation between human populations is interesting as a signature of historical demographic events and because of its potential for confounding disease studies. We use haplotype-based statistical methods to analyse genome-wide single nucleotide polymorphism (SNP) data from a carefully chosen geographically diverse sample of 2,039 individuals from the United Kingdom. This reveals a rich and detailed pattern of genetic differentiation with remarkable concordance between genetic clusters and geography. The regional genetic differentiation and differing patterns of shared ancestry with 6,209 individuals from across Europe carry clear signals of historical demographic events. We estimate the genetic contribution to southeastern England from Anglo-Saxon migrations to be under half, and identify the regions not carrying genetic material from these migrations. We suggest significant pre-Roman but post-Mesolithic movement into southeastern England from continental Europe, and show that in non-Saxon parts of the United Kingdom, there exist genetically differentiated subgroups rather than a general 'Celtic' population.


Subject(s)
Population Genetics, Haplotypes/genetics, Single Nucleotide Polymorphism/genetics, Algorithms, Humans, Principal Component Analysis, United Kingdom/ethnology, White People/genetics
4.
PLoS Genet ; 13(2): e1006546, 2017 02.
Article in English | MEDLINE | ID: mdl-28231283

ABSTRACT

For the last 500 years, the Americas have been a melting pot both for genetically diverse humans and for the pathogenic and commensal organisms associated with them. One such organism is the stomach-dwelling bacterium Helicobacter pylori, which is highly prevalent in Latin America where it is a major current public health challenge because of its strong association with gastric cancer. By analyzing the genome sequence of H. pylori isolated in North, Central and South America, we found evidence for admixture between H. pylori of European and African origin throughout the Americas, without substantial input from pre-Columbian (hspAmerind) bacteria. In the US, strains of African and European origin have remained genetically distinct, while in Colombia and Nicaragua, bottlenecks and rampant genetic exchange amongst isolates have led to the formation of national gene pools. We found three outer membrane proteins with atypical levels of Asian ancestry in American strains, as well as alleles that were nearly fixed specifically in South American isolates, suggesting a role for the ethnic makeup of hosts in the colonization of incoming strains. Our results show that new H. pylori subpopulations can rapidly arise, spread and adapt during times of demographic flux, and suggest that differences in transmission ecology between high and low prevalence areas may substantially affect the composition of bacterial populations.


Subject(s)
Helicobacter Infections/genetics, Helicobacter pylori/genetics, Phylogeny, Stomach Neoplasms/genetics, Alleles, Mitochondrial DNA/genetics, Molecular Evolution, Bacterial Genome, Helicobacter Infections/epidemiology, Helicobacter pylori/pathogenicity, Humans, North American Indians, Latin America, Stomach Neoplasms/epidemiology, Stomach Neoplasms/microbiology, White People
6.
Mol Biol Evol ; 35(5): 1284-1290, 2018 05 01.
Article in English | MEDLINE | ID: mdl-29474601

ABSTRACT

Powerful approaches to inferring recent or current population structure based on nearest neighbor haplotype "coancestry" have so far been inaccessible to users without high quality genome-wide haplotype data. With a boom in nonmodel organism genomics, there is a pressing need to bring these methods to communities without access to such data. Here, we present RADpainter, a new program designed to infer the coancestry matrix from restriction-site-associated DNA sequencing (RADseq) data. We combine this program with a previously published MCMC clustering algorithm into fineRADstructure, a complete, easy-to-use, and fast population inference package for RADseq data (https://github.com/millanek/fineRADstructure; last accessed February 24, 2018). Finally, with two example data sets, we illustrate its use, benefits, and robustness to missing RAD alleles in double digest RAD sequencing.


Subject(s)
Genomics/methods, Software, Alleles, Caryophyllaceae/genetics, Population, DNA Sequence Analysis
7.
BMC Biol ; 16(1): 84, 2018 08 02.
Article in English | MEDLINE | ID: mdl-30071832

ABSTRACT

BACKGROUND: Helicobacter pylori are stomach-dwelling bacteria that are present in about 50% of the global population. Infection is asymptomatic in most cases, but it has been associated with gastritis, gastric ulcers and gastric cancer. Epidemiological evidence shows that progression to cancer depends upon the host and pathogen factors, but questions remain about why cancer phenotypes develop in a minority of infected people. Here, we use comparative genomics approaches to understand how genetic variation amongst bacterial strains influences disease progression. RESULTS: We performed a genome-wide association study (GWAS) on 173 H. pylori isolates from the European population (hpEurope) with known disease aetiology, including 49 from individuals with gastric cancer. We identified SNPs and genes that differed in frequency between isolates from patients with gastric cancer and those with gastritis. The gastric cancer phenotype was associated with the presence of babA and genes in the cag pathogenicity island, one of the major virulence determinants of H. pylori, as well as non-synonymous variations in several less well-studied genes. We devised a simple risk score based on the risk level of associated elements present, which has the potential to identify strains that are likely to cause cancer but will require refinement and validation. CONCLUSION: There are a number of challenges to applying GWAS to bacterial infections, including the difficulty of obtaining matched controls, multiple strain colonization and the possibility that causative strains may not be present when disease is detected. Our results demonstrate that bacterial factors have a sufficiently strong influence on disease progression that even a small-scale GWAS can identify them. Therefore, H. pylori GWAS can elucidate mechanistic pathways to disease and guide clinical treatment options, including for asymptomatic carriers.


Subject(s)
Genetic Variation, Bacterial Genome, Genome-Wide Association Study, Helicobacter pylori/genetics, Stomach Neoplasms/microbiology, Gastritis/etiology, Humans, Metaplasia/etiology, Single Nucleotide Polymorphism, Risk, Stomach Neoplasms/epidemiology, Virulence Factors/genetics
8.
Mol Biol Evol ; 33(2): 456-71, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26516092

ABSTRACT

Recombination enhances the adaptive potential of organisms by allowing genetic variants to be tested on multiple genomic backgrounds. Its distribution in the genome can provide insight into the evolutionary forces that underlie traits, such as the emergence of pathogenicity. Here, we examined landscapes of realized homologous recombination of 500 genomes from ten bacterial species and found all species have "hot" regions with elevated rates relative to the genome average. We examined the size, gene content, and chromosomal features associated with these regions and the correlations between closely related species. The recombination landscape is variable and evolves rapidly. For example in Salmonella, only short regions of around 1 kb in length are hot whereas in the closely related species Escherichia coli, some hot regions exceed 100 kb, spanning many genes. Only Streptococcus pyogenes shows evidence for the positive correlation between GC content and recombination that has been reported for several eukaryotes. Genes with function related to the cell surface/membrane are often found in recombination hot regions but E. coli is the only species where genes annotated as "virulence associated" are consistently hotter. There is also evidence that some genes with "housekeeping" functions tend to be overrepresented in cold regions. For example, ribosomal proteins showed low recombination in all of the species. Among specific genes, transferrin-binding proteins are recombination hot in all three of the species in which they were found, and are subject to interspecies recombination.


Subject(s)
Bacteria/genetics, Homologous Recombination, Bacteria/pathogenicity, Base Composition, Biological Evolution, Cluster Analysis, Bacterial Genes, Bacterial Genome, Genomics, Genetic Selection, Virulence/genetics
9.
Mol Biol Evol ; 32(6): 1396-410, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25605790

ABSTRACT

We investigated global patterns of variation in 157 whole-genome sequences of Vibrio parahaemolyticus, a free-living and seafood-associated marine bacterium. Pandemic clones, responsible for recent outbreaks of gastroenteritis in humans, have spread globally. However, there are oceanic gene pools, one located in the oceans surrounding Asia and another in the Mexican Gulf. Frequent recombination means that most isolates have acquired the genetic profile of their current location. We investigated the genetic structure in the Asian gene pool by calculating the effective population size in two different ways. Under standard neutral models, the two estimates should give similar answers but we found a 27-fold difference. We propose that this discrepancy is caused by the subdivision of the species into a hundred or more ecotypes which are maintained stably in the population. To investigate the genetic factors involved, we used 51 unrelated isolates to conduct a genome-wide scan for epistatically interacting loci. We found a single example of strong epistasis between distant genome regions. A majority of strains had a type VI secretion system associated with bacterial killing. The remaining strains had genes associated with biofilm formation and regulated by cyclic dimeric GMP signaling. All strains had one or other of the two systems and no isolate had a complete complement of both systems, although several strains had remnants. Further "top down" analysis of patterns of linkage disequilibrium within frequently recombining species will allow a detailed understanding of how selection acts to structure the pattern of variation within natural bacterial populations.


Subject(s)
Gene Pool, Population Genetics, Bacterial Genome, Vibrio parahaemolyticus/genetics, Vibrio parahaemolyticus/isolation & purification, Asia, Biofilms, Bacterial Chromosomes/genetics, Genetic Epistasis, Genetic Loci, Mexico, Oceans and Seas, Phylogeny, Phylogeography, Single Nucleotide Polymorphism, Genetic Recombination, Seawater/microbiology, DNA Sequence Analysis, Vibrio parahaemolyticus/classification
10.
Bioinformatics ; 31(22): 3691-3, 2015 Nov 15.
Article in English | MEDLINE | ID: mdl-26198102

ABSTRACT

A typical prokaryote population sequencing study can now consist of hundreds or thousands of isolates. Interrogating these datasets can provide detailed insights into the genetic structure of prokaryotic genomes. We introduce Roary, a tool that rapidly builds large-scale pan genomes, identifying the core and accessory genes. Roary makes construction of the pan genome of thousands of prokaryote samples possible on a standard desktop without compromising on the accuracy of results. Using a single CPU, Roary can produce a pan genome consisting of 1000 isolates in 4.5 hours using 13 GB of RAM, with further speedups possible using multiple processors. AVAILABILITY AND IMPLEMENTATION: Roary is implemented in Perl and is freely available under an open source GPLv3 license from http://sanger-pathogens.github.io/Roary. CONTACT: roary@sanger.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
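
The distinction between core and accessory genes can be made concrete with a small sketch. It does not reproduce Roary's pipeline, which first has to group genes across isolates; here the grouping is taken as given, and the gene names, isolate names, and the 99% cutoff are all assumptions chosen for illustration.

```python
def split_core_accessory(presence, core_fraction=0.99):
    """Split genes into core and accessory by the fraction of isolates carrying them.

    `presence` maps a gene name to the set of isolates in which it was found.
    `core_fraction` is an assumed cutoff, not a value taken from the abstract.
    """
    isolates = set().union(*presence.values())
    core, accessory = [], []
    for gene, carriers in sorted(presence.items()):
        if len(carriers) / len(isolates) >= core_fraction:
            core.append(gene)
        else:
            accessory.append(gene)
    return core, accessory


# Hypothetical presence/absence data for three isolates.
presence = {
    "geneA": {"iso1", "iso2", "iso3"},
    "geneB": {"iso1", "iso2", "iso3"},
    "geneC": {"iso1"},
    "geneD": {"iso2", "iso3"},
}
core, accessory = split_core_accessory(presence)
```

The pan genome in this toy example is the union of both lists; the expensive part in practice is deciding which sequences count as the same gene, which this sketch deliberately leaves out.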


Subject(s)
Bacterial Genome, Prokaryotic Cells/metabolism, Software, Computer Simulation, Genetic Databases, Salmonella typhi/genetics
11.
Proc Natl Acad Sci U S A ; 110(29): 11923-7, 2013 Jul 16.
Article in English | MEDLINE | ID: mdl-23818615

ABSTRACT

Genome-wide association studies have the potential to identify causal genetic factors underlying important phenotypes but have rarely been performed in bacteria. We present an association mapping method that takes into account the clonal population structure of bacteria and is applicable to both core and accessory genome variation. Campylobacter is a common cause of human gastroenteritis as a consequence of its proliferation in multiple farm animal species and its transmission via contaminated meat and poultry. We applied our association mapping method to identify the factors responsible for adaptation to cattle and chickens among 192 Campylobacter isolates from these and other host sources. Phylogenetic analysis implied frequent host switching but also showed that some lineages were strongly associated with particular hosts. A seven-gene region with a host association signal was found. Genes in this region were almost universally present in cattle but were frequently absent in isolates from chickens and wild birds. Three of the seven genes encoded vitamin B5 biosynthesis. We found that isolates from cattle were better able to grow in vitamin B5-depleted media and propose that this difference may be an adaptation to host diet.


Subject(s)
Biological Evolution, Biosynthetic Pathways/genetics, Campylobacter/genetics, Cattle/microbiology, Genome-Wide Association Study/methods, Host Specificity/genetics, Pantothenic Acid/biosynthesis, Animals, Base Sequence, Chickens/microbiology, Cluster Analysis, Computational Biology, Population Genetics, Bacterial Genome/genetics, Genetic Models, Molecular Sequence Data, Phylogeny, DNA Sequence Analysis
12.
PLoS Genet ; 9(9): e1003775, 2013.
Article in English | MEDLINE | ID: mdl-24068950

ABSTRACT

Both anatomically modern humans and the gastric pathogen Helicobacter pylori originated in Africa, and both species have been associated for at least 100,000 years. Seven geographically distinct H. pylori populations exist, three of which are indigenous to Africa: hpAfrica1, hpAfrica2, and hpNEAfrica. The oldest and most divergent population, hpAfrica2, evolved within San hunter-gatherers, who represent one of the deepest branches of the human population tree. Anticipating the presence of ancient H. pylori lineages within all hunter-gatherer populations, we investigated the prevalence and population structure of H. pylori within Baka Pygmies in Cameroon. Gastric biopsies were obtained by esophagogastroduodenoscopy from 77 Baka from two geographically separated populations, and from 101 non-Baka individuals from neighboring agriculturalist populations, and subsequently cultured for H. pylori. Unexpectedly, Baka Pygmies showed a significantly lower H. pylori infection rate (20.8%) than non-Baka (80.2%). We generated multilocus haplotypes for each H. pylori isolate by DNA sequencing, but were not able to identify Baka-specific lineages, and most isolates in our sample were assigned to hpNEAfrica or hpAfrica1. The population hpNEAfrica, a marker for the expansion of the Nilo-Saharan language family, was divided into East African and Central West African subpopulations. Similarly, a new hpAfrica1 subpopulation, identified mainly among Cameroonians, supports eastern and western expansions of Bantu languages. An age-structured transmission model shows that the low H. pylori prevalence among Baka Pygmies is achievable within the timeframe of a few hundred years and suggests that demographic factors such as small population size and unusually low life expectancy can lead to the eradication of H. pylori from individual human populations. The Baka were thus either H. pylori-free or lost their ancient lineages during past demographic fluctuations. Using coalescent simulations and phylogenetic inference, we show that Baka almost certainly acquired their extant H. pylori through secondary contact with their agriculturalist neighbors.


Subject(s)
Gastrointestinal Tract/microbiology, Population Genetics, Helicobacter Infections/genetics, Helicobacter pylori/genetics, Africa, Biopsy, Black People, Genetic Variation, Growth Disorders/microbiology, Haplotypes, Helicobacter Infections/epidemiology, Helicobacter Infections/microbiology, Helicobacter pylori/pathogenicity, Humans, Phylogeny
13.
Proc Natl Acad Sci U S A ; 110(2): 577-82, 2013 Jan 08.
Article in English | MEDLINE | ID: mdl-23271803

ABSTRACT

The genetic diversity of Yersinia pestis, the etiologic agent of plague, is extremely limited because of its recent origin coupled with a slow clock rate. Here we identified 2,326 SNPs from 133 genomes of Y. pestis strains that were isolated in China and elsewhere. These SNPs define the genealogy of Y. pestis since its most recent common ancestor. All but 28 of these SNPs represented mutations that happened only once within the genealogy, and they were distributed essentially at random among individual genes. Only seven genes contained a significant excess of nonsynonymous SNP, suggesting that the fixation of SNPs mainly arises via neutral processes, such as genetic drift, rather than Darwinian selection. However, the rate of fixation varies dramatically over the genealogy: the number of SNPs accumulated by different lineages was highly variable and the genealogy contains multiple polytomies, one of which resulted in four branches near the time of the Black Death. We suggest that demographic changes can affect the speed of evolution in epidemic pathogens even in the absence of natural selection, and hypothesize that neutral SNPs are fixed rapidly during intermittent epidemics and outbreaks.


Subject(s)
Molecular Evolution, Genetic Drift, Genetic Variation, Mutation Rate, Yersinia pestis/genetics, Base Sequence, China, Population Genetics, Likelihood Functions, Genetic Models, Molecular Epidemiology, Molecular Sequence Data, Phylogeny, Single Nucleotide Polymorphism/genetics, DNA Sequence Analysis
14.
Gut ; 64(4): 554-61, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25007814

ABSTRACT

OBJECTIVE: To study the detailed nature of genomic microevolution during mixed infection with multiple Helicobacter pylori strains in an individual. DESIGN: We sampled 18 isolates from a single biopsy from a patient with chronic gastritis and nephritis. Whole-genome sequencing was applied to these isolates, and statistical genetic tools were used to investigate their evolutionary history. RESULTS: The genomes fall into two clades, reflecting colonisation of the stomach by two distinct strains, and these lineages have accumulated diversity during an estimated 2.8 and 4.2 years of evolution. We detected about 150 clear recombination events between the two clades. Recombination between the lineages is a continuous ongoing process and was detected on both clades, but the effect of recombination in one clade was nearly an order of magnitude higher than in the other. Imputed ancestral sequences also showed evidence of recombination between the two strains prior to their diversification, and we estimate that they have both been infecting the same host for at least 12 years. Recombination tracts between the lineages were, on average, 895 bp in length, and showed evidence for the interspersion of recipient sequences that has been observed in in vitro experiments. The complex evolutionary history of a phage-related protein provided evidence for frequent reinfection of both clades by a single phage lineage during the past 4 years. CONCLUSIONS: Whole genome sequencing can be used to make detailed conclusions about the mechanisms of genetic change of H. pylori based on sampling bacteria from a single gastric biopsy.


Subject(s)
Gastritis/microbiology, Helicobacter Infections/microbiology, Helicobacter pylori/classification, Helicobacter pylori/genetics, Chronic Disease, Coinfection, Genomics, Humans, Male, Middle Aged
15.
Mol Biol Evol ; 31(6): 1593-605, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24586045

ABSTRACT

In eukaryotes, detailed surveys of recombination rates have shown variation at multiple genomic scales and the presence of "hotspots" of highly elevated recombination. In bacteria, studies of recombination rate variation are less developed, in part because there are few analysis methods that take into account the clonal context within which bacterial evolution occurs. Here, we focus in particular on identifying "hot regions" of the genome where DNA is transferred frequently between isolates. We present a computationally efficient algorithm based on the recently developed "chromosome painting" algorithm, which characterizes patterns of haplotype sharing across a genome. We compare the average genome wide painting, which principally reflects clonal descent, with the painting for each site which additionally reflects the specific deviations at the site due to recombination. Using simulated data, we show that hot regions have consistently higher deviations from the genome wide average than normal regions. We applied our approach to previously analyzed Escherichia coli genomes and revealed that the new method is highly correlated with the number of recombination events affecting each site inferred by ClonalOrigin, a method that is only applicable to small numbers of genomes. Furthermore, we analyzed recombination hot regions in Campylobacter jejuni by using 200 genomes. We identified three recombination hot regions, which are enriched for genes related to membrane proteins. Our approach and its implementation, which is downloadable from https://github.com/bioprojects/orderedPainting, will help to develop a new phase of population genomic studies of recombination in prokaryotes.


Subject(s)
Campylobacter jejuni/genetics, Escherichia coli/genetics, Bacterial Genome, Genetic Recombination, Algorithms, Bacterial Proteins/genetics, Chromosome Painting, Computational Biology, Membrane Proteins/genetics, Genetic Models
16.
Annu Rev Genomics Hum Genet ; 13: 337-61, 2012.
Article in English | MEDLINE | ID: mdl-22703172

ABSTRACT

A large number of algorithms have been developed to classify individuals into discrete populations using genetic data. Recent results show that the information used by both model-based clustering methods and principal components analysis can be summarized by a matrix of pairwise similarity measures between individuals. Similarity matrices have been constructed in a number of ways, usually treating markers as independent but differing in the weighting given to polymorphisms of different frequencies. Additionally, methods are now being developed that take linkage into account. We review several such matrices and evaluate their information content. A two-stage approach for population identification is to first construct a similarity matrix and then perform clustering. We review a range of common clustering algorithms and evaluate their performance through a simulation study. The clustering step can be performed either on the matrix or by first using a dimension-reduction technique; we find that the latter approach substantially improves the performance of most algorithms. Based on these results, we describe the population structure signal contained in each similarity matrix and find that accounting for linkage leads to significant improvements for sequence data. We also perform a comparison on real data, where we find that population genetics models outperform generic clustering approaches, particularly with regard to robustness for features such as relatedness between individuals.
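
The two-stage approach described above (build a similarity matrix, then reduce dimension and cluster) can be sketched as follows. This is a deliberately minimal illustration, not any of the methods reviewed in the article: the similarity is a plain covariance-style matrix treating markers as independent, the dimension reduction keeps only the leading eigenvector, and the "clustering" is a sign split into two groups. The genotype values are invented.

```python
import numpy as np


def similarity_matrix(genotypes):
    """Pairwise similarity between individuals (rows), treating markers as independent."""
    centered = genotypes - genotypes.mean(axis=0)
    return centered @ centered.T / genotypes.shape[1]


def split_on_leading_axis(similarity):
    """Reduce to the leading eigenvector, then split individuals by its sign."""
    eigenvalues, eigenvectors = np.linalg.eigh(similarity)
    leading = eigenvectors[:, -1]  # eigh returns eigenvalues in ascending order
    return (leading > 0).astype(int)


# Hypothetical genotypes (0, 1 or 2 copies of an allele) for six individuals at four markers.
genotypes = np.array([
    [2, 2, 0, 0],
    [2, 1, 0, 0],
    [1, 2, 0, 1],
    [0, 0, 2, 2],
    [0, 1, 2, 2],
    [0, 0, 1, 2],
], dtype=float)

labels = split_on_leading_axis(similarity_matrix(genotypes))
```

The sign of an eigenvector is arbitrary, so which group is labelled 0 and which 1 carries no meaning; only the partition does. A realistic analysis would keep several axes and use a proper clustering algorithm on them.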


Subject(s)
Algorithms, Genetic Models, Cluster Analysis, Computer Simulation, Genetic Linkage, Population Genetics, Human Genome, Humans, Genetic Polymorphism, Principal Component Analysis
17.
PLoS Genet ; 8(1): e1002453, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22291602

ABSTRACT

The advent of genome-wide dense variation data provides an opportunity to investigate ancestry in unprecedented detail, but presents new statistical challenges. We propose a novel inference framework that aims to efficiently capture information on population structure provided by patterns of haplotype similarity. Each individual in a sample is considered in turn as a recipient, whose chromosomes are reconstructed using chunks of DNA donated by the other individuals. Results of this "chromosome painting" can be summarized as a "coancestry matrix," which directly reveals key information about ancestral relationships among individuals. If markers are viewed as independent, we show that this matrix almost completely captures the information used by both standard Principal Components Analysis (PCA) and model-based approaches such as STRUCTURE in a unified manner. Furthermore, when markers are in linkage disequilibrium, the matrix combines information across successive markers to increase the ability to discern fine-scale population structure using PCA. In parallel, we have developed an efficient model-based approach to identify discrete populations using this matrix, which offers advantages over PCA in terms of interpretability and over existing clustering algorithms in terms of speed, number of separable populations, and sensitivity to subtle population structure. We analyse Human Genome Diversity Panel data for 938 individuals and 641,000 markers, and we identify 226 populations reflecting differences on continental, regional, local, and family scales. We present multiple lines of evidence that, while many methods capture similar information among strongly differentiated groups, more subtle population structure in human populations is consistently present at a much finer level than currently available geographic labels and is only captured by the haplotype-based approach. The software used for this article, ChromoPainter and fineSTRUCTURE, is available from http://www.paintmychromosomes.com/.
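
To give a feel for what a coancestry matrix records, the sketch below uses a crude stand-in for chromosome painting. It is not the ChromoPainter model: the genome is cut into fixed-size chunks (an assumption), and each chunk of a recipient is credited to whichever other haplotype has the fewest mismatches, with ties shared equally. The haplotypes are invented.

```python
import numpy as np


def nearest_neighbour_coancestry(haplotypes, chunk_size):
    """Count, for each recipient (row), how many chunks are best matched by each donor (column)."""
    n, n_sites = haplotypes.shape
    coancestry = np.zeros((n, n))
    for start in range(0, n_sites, chunk_size):
        chunk = haplotypes[:, start:start + chunk_size]
        for recipient in range(n):
            mismatches = (chunk != chunk[recipient]).sum(axis=1).astype(float)
            mismatches[recipient] = np.inf  # a haplotype cannot donate to itself
            best = np.flatnonzero(mismatches == mismatches.min())
            coancestry[recipient, best] += 1.0 / len(best)  # share ties equally
    return coancestry


# Hypothetical haplotypes: rows 0-1 resemble each other, as do rows 2-3.
haplotypes = np.array([
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
])
coancestry = nearest_neighbour_coancestry(haplotypes, chunk_size=3)
```

Each row sums to the number of chunks, and the block structure of the matrix (rows 0 and 1 copy from each other, as do rows 2 and 3) is the kind of signal a downstream clustering step would pick up.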


Subject(s)
Haplotypes/genetics, Human Genome Project, Single Nucleotide Polymorphism/genetics, Population/genetics, Principal Component Analysis/methods, Racial Groups/genetics, Algorithms, Computer Simulation, Human Genome, Humans, Linkage Disequilibrium/genetics, Theoretical Models, Software
19.
PLoS Genet ; 7(7): e1002191, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21829375

ABSTRACT

Salmonella enterica is a bacterial pathogen that causes enteric fever and gastroenteritis in humans and animals. Although its population structure was long described as clonal, based on high linkage disequilibrium between loci typed by enzyme electrophoresis, recent examination of gene sequences has revealed that recombination plays an important evolutionary role. We sequenced around 10% of the core genome of 114 isolates of enterica using a resequencing microarray. Application of two different analysis methods (Structure and ClonalFrame) to our genomic data allowed us to define five clear lineages within S. enterica subspecies enterica, one of which is five times older than the other four and two thirds of the age of the whole subspecies. We show that some of these lineages display more evidence of recombination than others. We also demonstrate that some level of sexual isolation exists between the lineages, so that recombination has occurred predominantly between members of the same lineage. This pattern of recombination is compatible with expectations from the previously described ecological structuring of the enterica population as well as mechanistic barriers to recombination observed in laboratory experiments. In spite of their relatively low level of genetic differentiation, these lineages might therefore represent incipient species.


Subject(s)
Genetic Recombination, Salmonella enterica/genetics, Genetic Linkage, Bacterial Genome/genetics
20.
Proc Natl Acad Sci U S A ; 108(12): 5033-8, 2011 Mar 22.
Article in English | MEDLINE | ID: mdl-21383187

ABSTRACT

High genetic diversity is a hallmark of the gastric pathogen Helicobacter pylori. We used 454 sequencing technology to perform whole-genome comparisons for five sets of H. pylori strains that had been sequentially cultured from four chronically infected Colombians (isolation intervals=3-16 y) and one human volunteer experimentally infected with H. pylori as part of a vaccine trial. The four sets of genomes from Colombian H. pylori differed by 27-232 isolated SNPs and 16-441 imported clusters of polymorphisms resulting from recombination. Imports (mean length=394 bp) were distributed nonrandomly over the chromosome and frequently occurred in groups, suggesting that H. pylori first takes up long DNA fragments, which subsequently become partially integrated in multiple shorter pieces. Imports were present at significantly increased frequency in members of the hop family of outer membrane gene paralogues, some of which are involved in bacterial adhesion, suggesting diversifying selection. No evidence of recombination and few other differences were identified in the strain pair from an infected volunteer, indicating that the H. pylori genome is stable in the absence of mixed infection. Among these few differences was an OFF/ON switch in the phase-variable adhesin gene hopZ, suggesting strong in vivo selection for this putative adhesin during early colonization.


Subject(s)
Molecular Evolution, Bacterial Genome/physiology, Genomic Instability, Helicobacter Infections/genetics, Helicobacter pylori/genetics, Single Nucleotide Polymorphism, Adolescent, Child, Preschool Child, Female, Genome-Wide Association Study, Helicobacter Infections/metabolism, Helicobacter pylori/metabolism, Helicobacter pylori/pathogenicity, Humans, Male