1.
Genome Res ; 33(6): 988-998, 2023 06.
Article in English | MEDLINE | ID: mdl-37253539

ABSTRACT

Bacterial genome data are accumulating at an unprecedented speed due to the routine use of sequencing in clinical diagnoses, public health surveillance, and population genetics studies. Genealogical reconstruction is fundamental to many of these uses; however, inferring genealogy from large-scale genome data sets quickly, accurately, and flexibly is still a challenge. Here, we extend an alignment- and annotation-free method, PopPUNK, to increase its flexibility and interpretability across data sets. Our method, iterative-PopPUNK, rapidly produces multiple consistent cluster assignments across a range of sequence identities. By constructing a partially resolved genealogical tree with respect to these clusters, users can select a resolution most appropriate for their needs. We showed the accuracy of clusters at all levels of similarity and genealogical inference of iterative-PopPUNK based on simulated data and obtained phylogenetically concordant results in real data sets from seven bacterial species. Using two example sets of Escherichia/Shigella and Vibrio parahaemolyticus genomes, we show that iterative-PopPUNK can achieve cluster resolutions ranging from phylogroup down to sequence typing (ST). The iterative-PopPUNK algorithm is implemented in the "PopPUNK_iterate" program, available as part of the PopPUNK package.
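The idea of cluster assignments that stay consistent across a range of sequence identities can be illustrated with a minimal sketch. This is not the PopPUNK_iterate implementation: it assumes plain single-linkage clustering on a made-up pairwise distance table, and the function name, isolate names and thresholds are all hypothetical.

```python
from itertools import combinations


def single_linkage_clusters(names, dist, threshold):
    """Group names so that any pair at distance <= threshold ends up connected."""
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), set()).add(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical isolates and pairwise distances (assumptions for illustration).
names = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 0.01,
    frozenset(("A", "C")): 0.05,
    frozenset(("B", "C")): 0.05,
    frozenset(("A", "D")): 0.20,
    frozenset(("B", "D")): 0.20,
    frozenset(("C", "D")): 0.20,
}

# Tightening the threshold only ever splits clusters, so the levels are nested.
levels = {t: single_linkage_clusters(names, dist, t) for t in (0.30, 0.10, 0.02)}
```

Because each tighter threshold can only split the clusters found at a looser one, the three levels form a partially resolved tree from which a user could pick a resolution.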


Subjects
Algorithms , Bacterial Genome , Bacteria/genetics , Cluster Analysis
2.
PLoS Genet ; 17(9): e1009829, 2021 09.
Article in English | MEDLINE | ID: mdl-34582435

ABSTRACT

Measuring molecular evolution in bacteria typically requires estimation of the rate at which nucleotide changes accumulate in strains sampled at different times that share a common ancestor. This approach has been useful for dating ecological and evolutionary events that coincide with the emergence of important lineages, such as outbreak strains and obligate human pathogens. However, in multi-host (niche) transmission scenarios, where the pathogen is essentially an opportunistic environmental organism, sampling is often sporadic and rarely reflects the overall population, particularly when concentrated on clinical isolates. This means that approaches that assume recent common ancestry are not applicable. Here we present a new approach to estimate the molecular clock rate in Campylobacter that draws on the popular probability conundrum known as the 'birthday problem'. Using large genomic datasets and comparative genomic approaches, we use isolate pairs that share recent common ancestry to estimate the rate of nucleotide change for the population. Identifying synonymous and non-synonymous nucleotide changes, both within and outside of recombined regions of the genome, we quantify clock-like diversification to estimate synonymous rates of nucleotide change for the common pathogenic bacteria Campylobacter coli (2.4 × 10⁻⁶ s/s/y) and Campylobacter jejuni (3.4 × 10⁻⁶ s/s/y). Finally, using estimated total rates of nucleotide change, we infer the number of effective lineages within the sample time frame (analogous to a shared birthday) and assess the rate of turnover of lineages in our sample set over short evolutionary timescales. This provides a generalizable approach to calibrating rates in populations of environmental bacteria and shows that multiple lineages are maintained, implying that large-scale clonal sweeps may take hundreds of years or more in these species.
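For readers unfamiliar with the 'birthday problem' mentioned in the abstract, the classical puzzle can be computed as follows. This is only the textbook calculation, not the authors' clock-rate estimator; the function name and the `days` parameter are chosen here for illustration.

```python
def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday,
    assuming birthdays are independent and uniform over `days` days."""
    p_all_distinct = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


p23 = prob_shared_birthday(23)  # just over one half
```

In the abstract's analogy, sampled isolates play the role of people and effective lineages play the role of days: the fewer lineages there are, the more often a pair of isolates turns out to share recent ancestry.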


Subjects
Campylobacter/genetics , Molecular Evolution , Campylobacter/classification , Bacterial Genes , Genetic Variation , Phylogeny , Species Specificity
3.
Nature ; 519(7543): 309-314, 2015 Mar 19.
Article in English | MEDLINE | ID: mdl-25788095

ABSTRACT

Fine-scale genetic variation between human populations is interesting as a signature of historical demographic events and because of its potential for confounding disease studies. We use haplotype-based statistical methods to analyse genome-wide single nucleotide polymorphism (SNP) data from a carefully chosen geographically diverse sample of 2,039 individuals from the United Kingdom. This reveals a rich and detailed pattern of genetic differentiation with remarkable concordance between genetic clusters and geography. The regional genetic differentiation and differing patterns of shared ancestry with 6,209 individuals from across Europe carry clear signals of historical demographic events. We estimate the genetic contribution to southeastern England from Anglo-Saxon migrations to be under half, and identify the regions not carrying genetic material from these migrations. We suggest significant pre-Roman but post-Mesolithic movement into southeastern England from continental Europe, and show that in non-Saxon parts of the United Kingdom, there exist genetically differentiated subgroups rather than a general 'Celtic' population.


Subjects
Population Genetics , Haplotypes/genetics , Single Nucleotide Polymorphism/genetics , Algorithms , Humans , Principal Component Analysis , United Kingdom/ethnology , White People/genetics
4.
PLoS Genet ; 13(2): e1006546, 2017 02.
Article in English | MEDLINE | ID: mdl-28231283

ABSTRACT

For the last 500 years, the Americas have been a melting pot both for genetically diverse humans and for the pathogenic and commensal organisms associated with them. One such organism is the stomach-dwelling bacterium Helicobacter pylori, which is highly prevalent in Latin America where it is a major current public health challenge because of its strong association with gastric cancer. By analyzing the genome sequence of H. pylori isolated in North, Central and South America, we found evidence for admixture between H. pylori of European and African origin throughout the Americas, without substantial input from pre-Columbian (hspAmerind) bacteria. In the US, strains of African and European origin have remained genetically distinct, while in Colombia and Nicaragua, bottlenecks and rampant genetic exchange amongst isolates have led to the formation of national gene pools. We found three outer membrane proteins with atypical levels of Asian ancestry in American strains, as well as alleles that were nearly fixed specifically in South American isolates, suggesting a role for the ethnic makeup of hosts in the colonization of incoming strains. Our results show that new H. pylori subpopulations can rapidly arise, spread and adapt during times of demographic flux, and suggest that differences in transmission ecology between high and low prevalence areas may substantially affect the composition of bacterial populations.


Subjects
Helicobacter Infections/genetics , Helicobacter pylori/genetics , Phylogeny , Stomach Neoplasms/genetics , Alleles , Mitochondrial DNA/genetics , Molecular Evolution , Bacterial Genome , Helicobacter Infections/epidemiology , Helicobacter pylori/pathogenicity , Humans , North American Indians , Latin America , Stomach Neoplasms/epidemiology , Stomach Neoplasms/microbiology , White People
6.
Mol Biol Evol ; 35(5): 1284-1290, 2018 05 01.
Article in English | MEDLINE | ID: mdl-29474601

ABSTRACT

Powerful approaches to inferring recent or current population structure based on nearest neighbor haplotype "coancestry" have so far been inaccessible to users without high quality genome-wide haplotype data. With a boom in nonmodel organism genomics, there is a pressing need to bring these methods to communities without access to such data. Here, we present RADpainter, a new program designed to infer the coancestry matrix from restriction-site-associated DNA sequencing (RADseq) data. We combine this program together with a previously published MCMC clustering algorithm into fineRADstructure, a complete, easy-to-use, and fast population inference package for RADseq data (https://github.com/millanek/fineRADstructure; last accessed February 24, 2018). Finally, with two example data sets, we illustrate its use, benefits, and robustness to missing RAD alleles in double digest RAD sequencing.


Subjects
Genomics/methods , Software , Alleles , Caryophyllaceae/genetics , Population , DNA Sequence Analysis
7.
BMC Biol ; 16(1): 84, 2018 08 02.
Article in English | MEDLINE | ID: mdl-30071832

ABSTRACT

BACKGROUND: Helicobacter pylori are stomach-dwelling bacteria that are present in about 50% of the global population. Infection is asymptomatic in most cases, but it has been associated with gastritis, gastric ulcers and gastric cancer. Epidemiological evidence shows that progression to cancer depends upon the host and pathogen factors, but questions remain about why cancer phenotypes develop in a minority of infected people. Here, we use comparative genomics approaches to understand how genetic variation amongst bacterial strains influences disease progression. RESULTS: We performed a genome-wide association study (GWAS) on 173 H. pylori isolates from the European population (hpEurope) with known disease aetiology, including 49 from individuals with gastric cancer. We identified SNPs and genes that differed in frequency between isolates from patients with gastric cancer and those with gastritis. The gastric cancer phenotype was associated with the presence of babA and genes in the cag pathogenicity island, one of the major virulence determinants of H. pylori, as well as non-synonymous variations in several less well-studied genes. We devised a simple risk score based on the risk level of associated elements present, which has the potential to identify strains that are likely to cause cancer but will require refinement and validation. CONCLUSION: There are a number of challenges to applying GWAS to bacterial infections, including the difficulty of obtaining matched controls, multiple strain colonization and the possibility that causative strains may not be present when disease is detected. Our results demonstrate that bacterial factors have a sufficiently strong influence on disease progression that even a small-scale GWAS can identify them. Therefore, H. pylori GWAS can elucidate mechanistic pathways to disease and guide clinical treatment options, including for asymptomatic carriers.


Subjects
Genetic Variation , Bacterial Genome , Genome-Wide Association Study , Helicobacter pylori/genetics , Stomach Neoplasms/microbiology , Gastritis/etiology , Humans , Metaplasia/etiology , Single Nucleotide Polymorphism , Risk , Stomach Neoplasms/epidemiology , Virulence Factors/genetics
8.
Mol Biol Evol ; 33(2): 456-71, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26516092

ABSTRACT

Recombination enhances the adaptive potential of organisms by allowing genetic variants to be tested on multiple genomic backgrounds. Its distribution in the genome can provide insight into the evolutionary forces that underlie traits, such as the emergence of pathogenicity. Here, we examined landscapes of realized homologous recombination of 500 genomes from ten bacterial species and found all species have "hot" regions with elevated rates relative to the genome average. We examined the size, gene content, and chromosomal features associated with these regions and the correlations between closely related species. The recombination landscape is variable and evolves rapidly. For example in Salmonella, only short regions of around 1 kb in length are hot whereas in the closely related species Escherichia coli, some hot regions exceed 100 kb, spanning many genes. Only Streptococcus pyogenes shows evidence for the positive correlation between GC content and recombination that has been reported for several eukaryotes. Genes with function related to the cell surface/membrane are often found in recombination hot regions but E. coli is the only species where genes annotated as "virulence associated" are consistently hotter. There is also evidence that some genes with "housekeeping" functions tend to be overrepresented in cold regions. For example, ribosomal proteins showed low recombination in all of the species. Among specific genes, transferrin-binding proteins are recombination hot in all three of the species in which they were found, and are subject to interspecies recombination.


Subjects
Bacteria/genetics , Homologous Recombination , Bacteria/pathogenicity , Base Composition , Biological Evolution , Cluster Analysis , Bacterial Genes , Bacterial Genome , Genomics , Genetic Selection , Virulence/genetics
9.
Mol Biol Evol ; 32(6): 1396-410, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25605790

ABSTRACT

We investigated global patterns of variation in 157 whole-genome sequences of Vibrio parahaemolyticus, a free-living and seafood-associated marine bacterium. Pandemic clones, responsible for recent outbreaks of gastroenteritis in humans, have spread globally. However, there are oceanic gene pools, one located in the oceans surrounding Asia and another in the Mexican Gulf. Frequent recombination means that most isolates have acquired the genetic profile of their current location. We investigated the genetic structure in the Asian gene pool by calculating the effective population size in two different ways. Under standard neutral models, the two estimates should give similar answers but we found a 27-fold difference. We propose that this discrepancy is caused by the subdivision of the species into a hundred or more ecotypes which are maintained stably in the population. To investigate the genetic factors involved, we used 51 unrelated isolates to conduct a genome-wide scan for epistatically interacting loci. We found a single example of strong epistasis between distant genome regions. A majority of strains had a type VI secretion system associated with bacterial killing. The remaining strains had genes associated with biofilm formation and regulated by cyclic dimeric GMP signaling. All strains had one or other of the two systems and no isolate had complete complements of both systems, although several strains had remnants. Further "top down" analysis of patterns of linkage disequilibrium within frequently recombining species will allow a detailed understanding of how selection acts to structure the pattern of variation within natural bacterial populations.


Subjects
Gene Pool , Population Genetics , Bacterial Genome , Vibrio parahaemolyticus/genetics , Vibrio parahaemolyticus/isolation & purification , Asia , Biofilms , Bacterial Chromosomes/genetics , Genetic Epistasis , Genetic Loci , Mexico , Oceans and Seas , Phylogeny , Phylogeography , Single Nucleotide Polymorphism , Genetic Recombination , Seawater/microbiology , DNA Sequence Analysis , Vibrio parahaemolyticus/classification
10.
Bioinformatics ; 31(22): 3691-3, 2015 Nov 15.
Article in English | MEDLINE | ID: mdl-26198102

ABSTRACT

A typical prokaryote population sequencing study can now consist of hundreds or thousands of isolates. Interrogating these datasets can provide detailed insights into the genetic structure of prokaryotic genomes. We introduce Roary, a tool that rapidly builds large-scale pan genomes, identifying the core and accessory genes. Roary makes construction of the pan genome of thousands of prokaryote samples possible on a standard desktop without compromising on the accuracy of results. Using a single CPU Roary can produce a pan genome consisting of 1000 isolates in 4.5 hours using 13 GB of RAM, with further speedups possible using multiple processors. AVAILABILITY AND IMPLEMENTATION: Roary is implemented in Perl and is freely available under an open source GPLv3 license from http://sanger-pathogens.github.io/Roary. CONTACT: roary@sanger.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
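The distinction between core and accessory genes in a pan genome can be sketched in a few lines. This is not Roary's algorithm, which also has to cluster gene sequences into families first; the sketch assumes the gene families are already known, and the gene names, isolate names and the 0.99 cutoff are assumptions chosen for illustration.

```python
def split_pan_genome(presence, core_fraction=0.99):
    """Split genes into core and accessory sets.

    presence maps a gene name to the set of isolates that carry it.
    A gene is called core when at least `core_fraction` of isolates carry it.
    """
    isolates = set().union(*presence.values())
    n = len(isolates)
    core, accessory = [], []
    for gene, carriers in sorted(presence.items()):
        if len(carriers) / n >= core_fraction:
            core.append(gene)
        else:
            accessory.append(gene)
    return core, accessory


# Hypothetical presence/absence data for three isolates.
presence = {
    "geneA": {"iso1", "iso2", "iso3"},
    "geneB": {"iso1", "iso2", "iso3"},
    "geneC": {"iso2"},
}
core, accessory = split_pan_genome(presence)
```

Here the pan genome is the union of all three genes, with geneA and geneB classed as core and geneC as accessory.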


Subjects
Bacterial Genome , Prokaryotic Cells/metabolism , Software , Computer Simulation , Genetic Databases , Salmonella typhi/genetics
11.
PLoS Genet ; 9(9): e1003775, 2013.
Article in English | MEDLINE | ID: mdl-24068950

ABSTRACT

Both anatomically modern humans and the gastric pathogen Helicobacter pylori originated in Africa, and both species have been associated for at least 100,000 years. Seven geographically distinct H. pylori populations exist, three of which are indigenous to Africa: hpAfrica1, hpAfrica2, and hpNEAfrica. The oldest and most divergent population, hpAfrica2, evolved within San hunter-gatherers, who represent one of the deepest branches of the human population tree. Anticipating the presence of ancient H. pylori lineages within all hunter-gatherer populations, we investigated the prevalence and population structure of H. pylori within Baka Pygmies in Cameroon. Gastric biopsies were obtained by esophagogastroduodenoscopy from 77 Baka from two geographically separated populations, and from 101 non-Baka individuals from neighboring agriculturalist populations, and subsequently cultured for H. pylori. Unexpectedly, Baka Pygmies showed a significantly lower H. pylori infection rate (20.8%) than non-Baka (80.2%). We generated multilocus haplotypes for each H. pylori isolate by DNA sequencing, but were not able to identify Baka-specific lineages, and most isolates in our sample were assigned to hpNEAfrica or hpAfrica1. The population hpNEAfrica, a marker for the expansion of the Nilo-Saharan language family, was divided into East African and Central West African subpopulations. Similarly, a new hpAfrica1 subpopulation, identified mainly among Cameroonians, supports eastern and western expansions of Bantu languages. An age-structured transmission model shows that the low H. pylori prevalence among Baka Pygmies is achievable within the timeframe of a few hundred years and suggests that demographic factors such as small population size and unusually low life expectancy can lead to the eradication of H. pylori from individual human populations. The Baka were thus either H. pylori-free or lost their ancient lineages during past demographic fluctuations. 
Using coalescent simulations and phylogenetic inference, we show that Baka almost certainly acquired their extant H. pylori through secondary contact with their agriculturalist neighbors.


Subjects
Gastrointestinal Tract/microbiology , Population Genetics , Helicobacter Infections/genetics , Helicobacter pylori/genetics , Africa , Biopsy , Black People , Genetic Variation , Growth Disorders/microbiology , Haplotypes , Helicobacter Infections/epidemiology , Helicobacter Infections/microbiology , Helicobacter pylori/pathogenicity , Humans , Phylogeny
12.
Proc Natl Acad Sci U S A ; 110(29): 11923-7, 2013 Jul 16.
Article in English | MEDLINE | ID: mdl-23818615

ABSTRACT

Genome-wide association studies have the potential to identify causal genetic factors underlying important phenotypes but have rarely been performed in bacteria. We present an association mapping method that takes into account the clonal population structure of bacteria and is applicable to both core and accessory genome variation. Campylobacter is a common cause of human gastroenteritis as a consequence of its proliferation in multiple farm animal species and its transmission via contaminated meat and poultry. We applied our association mapping method to identify the factors responsible for adaptation to cattle and chickens among 192 Campylobacter isolates from these and other host sources. Phylogenetic analysis implied frequent host switching but also showed that some lineages were strongly associated with particular hosts. A seven-gene region with a host association signal was found. Genes in this region were almost universally present in cattle but were frequently absent in isolates from chickens and wild birds. Three of the seven genes encoded vitamin B5 biosynthesis. We found that isolates from cattle were better able to grow in vitamin B5-depleted media and propose that this difference may be an adaptation to host diet.


Subjects
Biological Evolution , Biosynthetic Pathways/genetics , Campylobacter/genetics , Cattle/microbiology , Genome-Wide Association Study/methods , Host Specificity/genetics , Pantothenic Acid/biosynthesis , Animals , Base Sequence , Chickens/microbiology , Cluster Analysis , Computational Biology , Population Genetics , Bacterial Genome/genetics , Genetic Models , Molecular Sequence Data , Phylogeny , DNA Sequence Analysis
13.
Proc Natl Acad Sci U S A ; 110(2): 577-82, 2013 Jan 08.
Article in English | MEDLINE | ID: mdl-23271803

ABSTRACT

The genetic diversity of Yersinia pestis, the etiologic agent of plague, is extremely limited because of its recent origin coupled with a slow clock rate. Here we identified 2,326 SNPs from 133 genomes of Y. pestis strains that were isolated in China and elsewhere. These SNPs define the genealogy of Y. pestis since its most recent common ancestor. All but 28 of these SNPs represented mutations that happened only once within the genealogy, and they were distributed essentially at random among individual genes. Only seven genes contained a significant excess of nonsynonymous SNPs, suggesting that the fixation of SNPs mainly arises via neutral processes, such as genetic drift, rather than Darwinian selection. However, the rate of fixation varies dramatically over the genealogy: the number of SNPs accumulated by different lineages was highly variable and the genealogy contains multiple polytomies, one of which resulted in four branches near the time of the Black Death. We suggest that demographic changes can affect the speed of evolution in epidemic pathogens even in the absence of natural selection, and hypothesize that neutral SNPs are fixed rapidly during intermittent epidemics and outbreaks.


Subjects
Molecular Evolution , Genetic Drift , Genetic Variation , Mutation Rate , Yersinia pestis/genetics , Base Sequence , China , Population Genetics , Likelihood Functions , Genetic Models , Molecular Epidemiology , Molecular Sequence Data , Phylogeny , Single Nucleotide Polymorphism/genetics , DNA Sequence Analysis
14.
Gut ; 64(4): 554-61, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25007814

ABSTRACT

OBJECTIVE: To study the detailed nature of genomic microevolution during mixed infection with multiple Helicobacter pylori strains in an individual. DESIGN: We sampled 18 isolates from a single biopsy from a patient with chronic gastritis and nephritis. Whole-genome sequencing was applied to these isolates, and statistical genetic tools were used to investigate their evolutionary history. RESULTS: The genomes fall into two clades, reflecting colonisation of the stomach by two distinct strains, and these lineages have accumulated diversity during an estimated 2.8 and 4.2 years of evolution. We detected about 150 clear recombination events between the two clades. Recombination between the lineages is a continuous ongoing process and was detected on both clades, but the effect of recombination in one clade was nearly an order of magnitude higher than in the other. Imputed ancestral sequences also showed evidence of recombination between the two strains prior to their diversification, and we estimate that they have both been infecting the same host for at least 12 years. Recombination tracts between the lineages were, on average, 895 bp in length, and showed evidence for the interspersion of recipient sequences that has been observed in in vitro experiments. The complex evolutionary history of a phage-related protein provided evidence for frequent reinfection of both clades by a single phage lineage during the past 4 years. CONCLUSIONS: Whole genome sequencing can be used to make detailed conclusions about the mechanisms of genetic change of H. pylori based on sampling bacteria from a single gastric biopsy.


Subjects
Gastritis/microbiology , Helicobacter Infections/microbiology , Helicobacter pylori/classification , Helicobacter pylori/genetics , Chronic Disease , Coinfection , Genomics , Humans , Male , Middle Aged
15.
Mol Biol Evol ; 31(6): 1593-605, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24586045

ABSTRACT

In eukaryotes, detailed surveys of recombination rates have shown variation at multiple genomic scales and the presence of "hotspots" of highly elevated recombination. In bacteria, studies of recombination rate variation are less developed, in part because there are few analysis methods that take into account the clonal context within which bacterial evolution occurs. Here, we focus in particular on identifying "hot regions" of the genome where DNA is transferred frequently between isolates. We present a computationally efficient algorithm based on the recently developed "chromosome painting" algorithm, which characterizes patterns of haplotype sharing across a genome. We compare the average genome wide painting, which principally reflects clonal descent, with the painting for each site which additionally reflects the specific deviations at the site due to recombination. Using simulated data, we show that hot regions have consistently higher deviations from the genome wide average than normal regions. We applied our approach to previously analyzed Escherichia coli genomes and revealed that the new method is highly correlated with the number of recombination events affecting each site inferred by ClonalOrigin, a method that is only applicable to small numbers of genomes. Furthermore, we analyzed recombination hot regions in Campylobacter jejuni by using 200 genomes. We identified three recombination hot regions, which are enriched for genes related to membrane proteins. Our approach and its implementation, which is downloadable from https://github.com/bioprojects/orderedPainting, will help to develop a new phase of population genomic studies of recombination in prokaryotes.


Subjects
Campylobacter jejuni/genetics , Escherichia coli/genetics , Bacterial Genome , Genetic Recombination , Algorithms , Bacterial Proteins/genetics , Chromosome Painting , Computational Biology , Membrane Proteins/genetics , Genetic Models
16.
Annu Rev Genomics Hum Genet ; 13: 337-61, 2012.
Article in English | MEDLINE | ID: mdl-22703172

ABSTRACT

A large number of algorithms have been developed to classify individuals into discrete populations using genetic data. Recent results show that the information used by both model-based clustering methods and principal components analysis can be summarized by a matrix of pairwise similarity measures between individuals. Similarity matrices have been constructed in a number of ways, usually treating markers as independent but differing in the weighting given to polymorphisms of different frequencies. Additionally, methods are now being developed that take linkage into account. We review several such matrices and evaluate their information content. A two-stage approach for population identification is to first construct a similarity matrix and then perform clustering. We review a range of common clustering algorithms and evaluate their performance through a simulation study. The clustering step can be performed either on the matrix or by first using a dimension-reduction technique; we find that the latter approach substantially improves the performance of most algorithms. Based on these results, we describe the population structure signal contained in each similarity matrix and find that accounting for linkage leads to significant improvements for sequence data. We also perform a comparison on real data, where we find that population genetics models outperform generic clustering approaches, particularly with regard to robustness for features such as relatedness between individuals.
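The two-stage approach described in this abstract (build a similarity matrix, reduce its dimension, then cluster) can be sketched with the standard library alone. The sketch makes strong simplifying assumptions that are not taken from the review: the dimension reduction keeps only the leading axis of the double-centred matrix, found by power iteration, and the "clustering" is the crudest possible one, a split at zero along that axis. The similarity values and function names are hypothetical.

```python
def leading_axis(sim, iters=200):
    """Coordinates of each individual on the leading axis of a
    double-centred symmetric similarity matrix (power iteration)."""
    n = len(sim)
    row_mean = [sum(r) / n for r in sim]
    grand_mean = sum(row_mean) / n
    centred = [
        [sim[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]
    # Deterministic, non-constant start vector so it is not wiped out by centring.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        w = [sum(centred[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def two_groups(sim):
    """Stage 2: assign each individual to one of two groups by sign."""
    return [int(x > 0) for x in leading_axis(sim)]


# Stage 1 (assumed given): a toy pairwise similarity matrix for four
# individuals, where 0 and 1 are similar, and 2 and 3 are similar.
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
labels = two_groups(sim)
```

A realistic analysis would retain several axes and apply a proper clustering algorithm to the reduced coordinates; the point here is only the order of the steps.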


Subjects
Algorithms , Genetic Models , Cluster Analysis , Computer Simulation , Genetic Linkage , Population Genetics , Human Genome , Humans , Genetic Polymorphism , Principal Component Analysis
17.
PLoS Genet ; 8(1): e1002453, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22291602

ABSTRACT

The advent of genome-wide dense variation data provides an opportunity to investigate ancestry in unprecedented detail, but presents new statistical challenges. We propose a novel inference framework that aims to efficiently capture information on population structure provided by patterns of haplotype similarity. Each individual in a sample is considered in turn as a recipient, whose chromosomes are reconstructed using chunks of DNA donated by the other individuals. Results of this "chromosome painting" can be summarized as a "coancestry matrix," which directly reveals key information about ancestral relationships among individuals. If markers are viewed as independent, we show that this matrix almost completely captures the information used by both standard Principal Components Analysis (PCA) and model-based approaches such as STRUCTURE in a unified manner. Furthermore, when markers are in linkage disequilibrium, the matrix combines information across successive markers to increase the ability to discern fine-scale population structure using PCA. In parallel, we have developed an efficient model-based approach to identify discrete populations using this matrix, which offers advantages over PCA in terms of interpretability and over existing clustering algorithms in terms of speed, number of separable populations, and sensitivity to subtle population structure. We analyse Human Genome Diversity Panel data for 938 individuals and 641,000 markers, and we identify 226 populations reflecting differences on continental, regional, local, and family scales. We present multiple lines of evidence that, while many methods capture similar information among strongly differentiated groups, more subtle population structure in human populations is consistently present at a much finer level than currently available geographic labels and is only captured by the haplotype-based approach. 
The software used for this article, ChromoPainter and fineSTRUCTURE, is available from http://www.paintmychromosomes.com/.
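How a painting can be summarized as a coancestry matrix is easiest to see on a toy example. The sketch below is not ChromoPainter: it assumes a hard painting is already given (one donor label per site for each recipient), whereas the published method works with expected chunk counts under a probabilistic model. The individual names and the `chunk_counts` helper are hypothetical.

```python
from collections import Counter
from itertools import groupby


def chunk_counts(painting):
    """Count, for each recipient, how many chunks each donor contributes.

    painting maps a recipient to a list of donor labels, one per site.
    A chunk is a maximal run of consecutive sites copied from the same donor.
    """
    return {
        recipient: Counter(donor for donor, _run in groupby(donors))
        for recipient, donors in painting.items()
    }


# Hypothetical painting of three individuals over five sites.
painting = {
    "ind1": ["ind2", "ind2", "ind3", "ind3", "ind2"],
    "ind2": ["ind1", "ind1", "ind1", "ind3", "ind3"],
    "ind3": ["ind2", "ind2", "ind2", "ind2", "ind2"],
}
coancestry = chunk_counts(painting)
```

Each row of `coancestry` is one recipient and each entry is the number of chunks received from a donor, so ind1 receives two chunks from ind2 and one from ind3.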


Subjects
Haplotypes/genetics , Human Genome Project , Single Nucleotide Polymorphism/genetics , Population/genetics , Principal Component Analysis/methods , Racial Groups/genetics , Algorithms , Computer Simulation , Human Genome , Humans , Linkage Disequilibrium/genetics , Theoretical Models , Software
19.
PLoS Genet ; 7(7): e1002191, 2011 Jul.
Article in English | MEDLINE | ID: mdl-21829375

ABSTRACT

Salmonella enterica is a bacterial pathogen that causes enteric fever and gastroenteritis in humans and animals. Although its population structure was long described as clonal, based on high linkage disequilibrium between loci typed by enzyme electrophoresis, recent examination of gene sequences has revealed that recombination plays an important evolutionary role. We sequenced around 10% of the core genome of 114 isolates of enterica using a resequencing microarray. Application of two different analysis methods (Structure and ClonalFrame) to our genomic data allowed us to define five clear lineages within S. enterica subspecies enterica, one of which is five times older than the other four and two thirds of the age of the whole subspecies. We show that some of these lineages display more evidence of recombination than others. We also demonstrate that some level of sexual isolation exists between the lineages, so that recombination has occurred predominantly between members of the same lineage. This pattern of recombination is compatible with expectations from the previously described ecological structuring of the enterica population as well as mechanistic barriers to recombination observed in laboratory experiments. In spite of their relatively low level of genetic differentiation, these lineages might therefore represent incipient species.


Subjects
Genetic Recombination , Salmonella enterica/genetics , Genetic Linkage , Bacterial Genome/genetics
20.
Proc Natl Acad Sci U S A ; 108(12): 5033-8, 2011 Mar 22.
Article in English | MEDLINE | ID: mdl-21383187

ABSTRACT

High genetic diversity is a hallmark of the gastric pathogen Helicobacter pylori. We used 454 sequencing technology to perform whole-genome comparisons for five sets of H. pylori strains that had been sequentially cultured from four chronically infected Colombians (isolation intervals=3-16 y) and one human volunteer experimentally infected with H. pylori as part of a vaccine trial. The four sets of genomes from Colombian H. pylori differed by 27-232 isolated SNPs and 16-441 imported clusters of polymorphisms resulting from recombination. Imports (mean length=394 bp) were distributed nonrandomly over the chromosome and frequently occurred in groups, suggesting that H. pylori first takes up long DNA fragments, which subsequently become partially integrated in multiple shorter pieces. Imports were present at significantly increased frequency in members of the hop family of outer membrane gene paralogues, some of which are involved in bacterial adhesion, suggesting diversifying selection. No evidence of recombination and few other differences were identified in the strain pair from an infected volunteer, indicating that the H. pylori genome is stable in the absence of mixed infection. Among these few differences was an OFF/ON switch in the phase-variable adhesin gene hopZ, suggesting strong in vivo selection for this putative adhesin during early colonization.


Subjects
Molecular Evolution , Bacterial Genome/physiology , Genomic Instability , Helicobacter Infections/genetics , Helicobacter pylori/genetics , Single Nucleotide Polymorphism , Adolescent , Child , Preschool Child , Female , Genome-Wide Association Study , Helicobacter Infections/metabolism , Helicobacter pylori/metabolism , Helicobacter pylori/pathogenicity , Humans , Male