Results 1 - 17 of 17
1.
Mol Biol Evol ; 41(7)2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38958167

ABSTRACT

Admixture between populations and species is common in nature. Since the influx of new genetic material might be either facilitated or hindered by selection, variation in mixture proportions along the genome is expected in organisms undergoing recombination. Various graph-based models have been developed to better understand these evolutionary dynamics of population splits and mixtures. However, current models assume a single mixture rate for the entire genome and do not explicitly account for linkage. Here, we introduce TreeSwirl, a novel method for inferring branch lengths and locus-specific mixture proportions by using genome-wide allele frequency data, assuming that the admixture graph is known or has been inferred. TreeSwirl builds upon TreeMix, which uses Gaussian processes to estimate the presence of gene flow between diverged populations. However, in contrast to TreeMix, our model infers locus-specific mixture proportions employing a hidden Markov model that accounts for linkage. Through simulated data, we demonstrate that TreeSwirl can accurately estimate locus-specific mixture proportions and handle complex demographic scenarios. It also outperforms related D- and f-statistics in terms of accuracy and sensitivity to detect introgressed loci.


Subjects
Gene Frequency , Models, Genetic , Genetics, Population/methods , Markov Chains , Gene Flow , Genome , Computer Simulation , Genetic Linkage
2.
Proc Natl Acad Sci U S A ; 113(25): 6886-91, 2016 06 21.
Article in English | MEDLINE | ID: mdl-27274049

ABSTRACT

Farming and sedentism first appeared in southwestern Asia during the early Holocene and later spread to neighboring regions, including Europe, along multiple dispersal routes. Conspicuous uncertainties remain about the relative roles of migration, cultural diffusion, and admixture with local foragers in the early Neolithization of Europe. Here we present paleogenomic data for five Neolithic individuals from northern Greece and northwestern Turkey spanning the time and region of the earliest spread of farming into Europe. We use a novel approach to recalibrate raw reads and call genotypes from ancient DNA and observe striking genetic similarity both among Aegean early farmers and with those from across Europe. Our study demonstrates a direct genetic link between Mediterranean and Central European early farmers and those of Greece and Anatolia, extending the European Neolithic migratory chain all the way back to southwestern Asia.


Subjects
Agriculture , Anthropology , Europe , Genetics, Population , Humans , Mediterranean Region , Principal Component Analysis
3.
Syst Biol ; 66(6): 950-963, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-28204787

ABSTRACT

Although it is now widely accepted that the rate of phenotypic evolution may not necessarily be constant across large phylogenies, the frequency and phylogenetic position of periods of rapid evolution remain unclear. In his highly influential view of evolution, G. G. Simpson supposed that such evolutionary jumps occur when organisms transition into so-called new adaptive zones, for instance after dispersal into a new geographic area, after rapid climatic changes, or following the appearance of an evolutionary novelty. Only recently, large, accurate and well calibrated phylogenies have become available that allow testing this hypothesis directly, yet inferring evolutionary jumps remains computationally very challenging. Here, we develop a computationally highly efficient algorithm to accurately infer the rate and strength of evolutionary jumps as well as their phylogenetic location. Following previous work we model evolutionary jumps as a compound process, but introduce a novel approach to sample jump configurations that does not require matrix inversions and thus naturally scales to large trees. We then make use of this development to infer evolutionary jumps in Anolis lizards and Loriini parrots where we find strong signal for such jumps at the base of clades that transitioned into new adaptive zones, just as postulated by Simpson's hypothesis. [evolutionary jump; Lévy process; phenotypic evolution; punctuated equilibrium; quantitative traits.]


Subjects
Classification/methods , Models, Genetic , Phylogeny , Algorithms , Animals , Biological Evolution , Lizards/classification , Parrots/classification
4.
J Theor Biol ; 420: 174-179, 2017 05 07.
Article in English | MEDLINE | ID: mdl-28263815

ABSTRACT

The reconstruction of phylogenetic trees from discrete character data typically relies on models that assume the characters evolve under a continuous-time Markov process operating at some overall rate λ. When λ is too high or too low, it becomes difficult to distinguish a short interior edge from a polytomy (the tree that results from collapsing the edge). In this note, we investigate the rate that maximizes the expected log-likelihood ratio (i.e. the Kullback-Leibler separation) between the four-leaf unresolved (star) tree and a four-leaf binary tree with interior edge length ϵ. For a simple two-state model, we show that as ϵ converges to 0 the optimal rate also converges to zero when the four pendant edges have equal length. However, when the four pendant branches have unequal length, two local optima can arise, and it is possible for the globally optimal rate to converge to a non-zero constant as ϵ→0. Moreover, in the setting where the four pendant branches have equal lengths and either (i) we replace the two-state model by an infinite-state model or (ii) we retain the two-state model and replace the Kullback-Leibler separation by Euclidean distance as the maximization goal, then the optimal rate also converges to a non-zero constant.


Subjects
Models, Theoretical , Phylogeny , Evolution, Molecular , Markov Chains , Models, Genetic
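
The quantity discussed in the abstract above can be explored numerically. The following is a minimal sketch, not the authors' derivation. It assumes a symmetric two-state model in which a state changes along an edge of length t with probability (1 - exp(-2λt))/2, a quartet with leaves 1 and 2 on one side of the interior edge, and a separation measured as the Kullback-Leibler divergence from the resolved tree to the star tree. All function names and the grid search are illustrative.

```python
import math
from itertools import product


def p_change(rate, length):
    # Assumed symmetric two-state model: probability that the two ends
    # of an edge of the given length are in different states.
    return 0.5 * (1.0 - math.exp(-2.0 * rate * length))


def edge_prob(a, b, rate, length):
    # Transition probability from state a to state b along one edge.
    p = p_change(rate, length)
    return p if a != b else 1.0 - p


def pattern_probs(rate, pendant, interior):
    # Probability of each of the 16 leaf patterns on the quartet 12|34,
    # summing over the states u and v of the two interior nodes.
    probs = {}
    for leaves in product((0, 1), repeat=4):
        total = 0.0
        for u, v in product((0, 1), repeat=2):
            term = 0.5 * edge_prob(u, v, rate, interior)
            term *= edge_prob(u, leaves[0], rate, pendant[0])
            term *= edge_prob(u, leaves[1], rate, pendant[1])
            term *= edge_prob(v, leaves[2], rate, pendant[2])
            term *= edge_prob(v, leaves[3], rate, pendant[3])
            total += term
        probs[leaves] = total
    return probs


def kl_separation(rate, pendant, interior):
    # KL divergence from the resolved tree to the star tree, where the
    # star tree is the same tree with interior edge length 0.
    resolved = pattern_probs(rate, pendant, interior)
    star = pattern_probs(rate, pendant, 0.0)
    return sum(p * math.log(p / star[k])
               for k, p in resolved.items() if p > 0.0)


def best_rate(pendant, interior, grid):
    # Crude grid search for the rate that maximizes the separation.
    return max(grid, key=lambda r: kl_separation(r, pendant, interior))
```

Calling `best_rate((1.0, 1.0, 1.0, 1.0), eps, grid)` for a decreasing sequence of `eps` values on a grid of strictly positive rates gives a rough picture of how the maximizing rate moves as the interior edge shrinks. A grid search can miss one of several local optima, so the result depends on the grid chosen.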
5.
Mol Ecol Resour ; 24(3): e13913, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38173222

ABSTRACT

The identification of sex-linked scaffolds and the genetic sex of individuals, i.e. their sex karyotype, is a fundamental step in population genomic studies. If sex-linked scaffolds are known, single individuals may be sexed based on read counts of next-generation sequencing data. If both sex-linked scaffolds as well as sex karyotypes are unknown, as is often the case for non-model organisms, they have to be jointly inferred. For both cases, current methods rely on arbitrary thresholds, which limits their power for low-depth data. In addition, most current methods are limited to euploid sex karyotypes (XX and XY). Here we develop BeXY, a fully Bayesian method to jointly infer the posterior probabilities for each scaffold to be autosomal, X- or Y-linked and for each individual to be any of the sex karyotypes XX, XY, X0, XXX, XXY, XYY and XXYY. If the sex-linked scaffolds are known, it also identifies autosomal trisomies and estimates the sex karyotype posterior probabilities for single individuals. As we show with downsampling experiments, BeXY has higher power than all existing methods. It accurately infers the sex karyotype of ancient human samples with as few as 20,000 reads and accurately infers sex-linked scaffolds from data sets of just a handful of samples or with highly imbalanced sex ratios, also in the case of low-quality reference assemblies. We illustrate the power of BeXY by applying it to both whole-genome shotgun and target enrichment sequencing data of ancient and modern humans, as well as several non-model organisms.


Subjects
Genomics , Sex Chromosomes , Humans , Bayes Theorem , Sex Chromosomes/genetics , Genetic Testing , Karyotype
6.
Syst Biol ; 60(6): 813-25, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21828084

ABSTRACT

The analysis of ratios of body measurements is deeply ingrained in the taxonomic literature. Whether for plants or animals, certain ratios are commonly indicated in identification keys, diagnoses, and descriptions. They often provide the only means for separation of cryptic species that mostly lack distinguishing qualitative characters. Additionally, they provide an obvious way to study differences in body proportions, as ratios reflect geometric shape differences. However, when it comes to multivariate analysis of body measurements, for instance, with linear discriminant analysis (LDA) or principal component analysis (PCA), interpretation using body ratios is difficult. Both techniques are commonly applied for separating similar taxa or for exploring the structure of variation, respectively, and require standardized raw or log-transformed variables as input. Here, we develop statistical procedures for the analysis of body ratios in a consistent multivariate statistical framework. In particular, we present algorithms adapted to LDA and PCA that allow the interpretation of numerical results in terms of body proportions. We first introduce a method called the "LDA ratio extractor," which reveals the best ratios for separation of two or more groups with the help of discriminant analysis. We also provide measures for deciding how much of the total differences between individuals or groups of individuals is due to size and how much is due to shape. The second method, a graphical tool called the "PCA ratio spectrum," aims at the interpretation of principal components in terms of body ratios. Based on a similar idea, the "allometry ratio spectrum" is developed which can be used for studying the allometric behavior of ratios. Because size can be defined in different ways, we discuss several concepts of size. Central to this discussion is Jolicoeur's multivariate generalization of the allometry equation, a concept that was derived only with a heuristic argument. Here we present a statistical derivation of the allometric size vector using the method of least squares. The application of the above methods is extensively demonstrated using published data sets from parasitic wasps and rock crabs.


Subjects
Classification/methods , Animals , Brachyura/anatomy & histology , Brachyura/classification , Male , Multivariate Analysis , Species Specificity , Wasps/anatomy & histology , Wasps/classification
7.
BMC Evol Biol ; 11: 85, 2011 Apr 01.
Article in English | MEDLINE | ID: mdl-21457536

ABSTRACT

BACKGROUND: Today many large mammals live in small, fragmented populations, but it is often unclear whether this subdivision is the result of long-term or recent events. Demographic modeling using genetic data can estimate changes in long-term population sizes while temporal sampling provides a way to compare genetic variation present today with that sampled in the past. In order to better understand the dynamics associated with the divergences of great ape populations, these analytical approaches were applied to western gorillas (Gorilla gorilla) and in particular to the isolated and Critically Endangered Cross River gorilla subspecies (G. g. diehli). RESULTS: We used microsatellite genotypes from museum specimens and contemporary samples of Cross River gorillas to infer both the long-term and recent population history. We find that Cross River gorillas diverged from the ancestral western gorilla population ~17,800 years ago (95% HDI: 760, 63,245 years). However, gene flow ceased only ~420 years ago (95% HDI: 200, 16,256 years), followed by a bottleneck beginning ~320 years ago (95% HDI: 200, 2,825 years) that caused a 60-fold decrease in the effective population size of Cross River gorillas. Direct comparison of heterozygosity estimates from museum and contemporary samples suggests a loss of genetic variation over the last 100 years. CONCLUSIONS: The composite history of western gorillas could plausibly be explained by climatic oscillations inducing environmental changes in western equatorial Africa that would have allowed gorilla populations to expand over time but ultimately isolate the Cross River gorillas, which thereafter exhibited a dramatic population size reduction. The recent decrease in the Cross River population is accordingly most likely attributable to increasing anthropogenic pressure over the last several hundred years. Isolation of diverging populations with prolonged concomitant gene flow, but not secondary admixture, appears to be a typical characteristic of the population histories of African great apes, including gorillas, chimpanzees and bonobos.


Subjects
Evolution, Molecular , Gorilla gorilla/genetics , Africa, Western , Animals , Ecosystem , Gene Flow , Genetic Variation , Gorilla gorilla/classification , Gorilla gorilla/growth & development , Microsatellite Repeats , Population Density , Population Dynamics
8.
BMC Bioinformatics ; 11: 116, 2010 Mar 04.
Article in English | MEDLINE | ID: mdl-20202215

ABSTRACT

BACKGROUND: The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC) algorithms allowing one to obtain parameter posterior distributions based on simulations not requiring likelihood computations. RESULTS: Here we present ABCtoolbox, a series of open source programs to perform Approximate Bayesian Computations (ABC). It implements various ABC algorithms including rejection sampling, MCMC without likelihood, a Particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can also interact with most simulation and summary statistics computation programs. The usability of the ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates and to find that males show smaller population sizes but much higher levels of migration than females. CONCLUSION: ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from parameter sampling from prior distributions, through data simulations, computation of summary statistics, estimation of posterior distributions, model choice, and validation of the estimation procedure, to visualization of the results.


Subjects
Bayes Theorem , Genetics, Population , Software , Animals , Arvicolinae/genetics , Evolution, Molecular , Female , Male , Microsatellite Repeats/genetics
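
Of the algorithms named in the abstract above, rejection sampling is the simplest to illustrate. The sketch below is a toy example and does not reproduce ABCtoolbox or its interface. It assumes a normal model with known unit variance, a uniform prior on the mean, and the sample mean as the only summary statistic. All names and parameters are hypothetical.

```python
import random
import statistics


def simulate(theta, n, rng):
    # Toy simulator (an assumption): n draws from Normal(theta, 1).
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(observed, prior_low, prior_high, n_sims, tolerance, seed=0):
    # Keep every prior draw whose simulated summary statistic falls
    # within `tolerance` of the observed one. No likelihood is evaluated.
    rng = random.Random(seed)
    s_obs = statistics.fmean(observed)
    accepted = []
    for _ in range(n_sims):
        theta = rng.uniform(prior_low, prior_high)
        s_sim = statistics.fmean(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) <= tolerance:
            accepted.append(theta)
    return accepted
```

The accepted values approximate a sample from the posterior. A smaller tolerance gives a closer approximation at the cost of fewer accepted simulations.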
9.
Genetics ; 216(4): 1205-1215, 2020 12.
Article in English | MEDLINE | ID: mdl-33067324

ABSTRACT

Allele frequencies vary across populations and loci, even in the presence of migration. While most differences may be due to genetic drift, divergent selection will further increase differentiation at some loci. Identifying those is key in studying local adaptation, but remains statistically challenging. A particularly elegant way to describe allele frequency differences among populations connected by migration is the F-model, which measures differences in allele frequencies by population specific FST coefficients. This model readily accounts for multiple evolutionary forces by partitioning FST coefficients into locus- and population-specific components reflecting selection and drift, respectively. Here we present an extension of this model to linked loci by means of a hidden Markov model (HMM), which characterizes the effect of selection on linked markers through correlations in the locus specific component along the genome. Using extensive simulations, we show that the statistical power of our method is up to twofold higher than that of previous implementations that assume sites to be independent. Finally, we provide evidence of selection in the human genome by applying our method to data from the Human Genome Diversity Project (HGDP).


Subjects
Gene Frequency , Genetic Linkage , Models, Genetic , Selection, Genetic , Evolution, Molecular , Genetic Loci , Genetics, Population/methods , Genome, Human , Genomics/methods , Human Migration , Humans
10.
Curr Biol ; 30(21): 4307-4315.e13, 2020 11 02.
Article in English | MEDLINE | ID: mdl-32888485

ABSTRACT

Lactase persistence (LP), the continued expression of lactase into adulthood, is the most strongly selected single gene trait over the last 10,000 years in multiple human populations. It has been posited that the primary allele causing LP among Eurasians, rs4988235-A [1], only rose to appreciable frequencies during the Bronze and Iron Ages [2, 3], long after humans started consuming milk from domesticated animals. This rapid rise has been attributed to an influx of people from the Pontic-Caspian steppe that began around 5,000 years ago [4, 5]. We investigate the spatiotemporal spread of LP through an analysis of 14 warriors from the Tollense Bronze Age battlefield in northern Germany (∼3,200 before present, BP), the oldest large-scale conflict site north of the Alps. Genetic data indicate that these individuals represent a single unstructured Central/Northern European population. We complemented these data with genotypes of 18 individuals from the Bronze Age site Mokrin in Serbia (∼4,100 to ∼3,700 BP) and 37 individuals from Eastern Europe and the Pontic-Caspian Steppe region, predating both Bronze Age sites (∼5,980 to ∼3,980 BP). We infer low LP in all three regions, i.e., in northern Germany and South-eastern and Eastern Europe, suggesting that the surge of rs4988235 in Central and Northern Europe is unlikely to have been caused by Steppe expansions. We estimate a selection coefficient of 0.06 and conclude that the selection was ongoing in various parts of Europe over the last 3,000 years.


Subjects
DNA, Ancient , Lactase/genetics , Selection, Genetic , White People/genetics , Adult , Body Remains , DNA, Mitochondrial/genetics , Europe , Female , Gene Frequency , Humans , Male , Young Adult
12.
Genetics ; 205(1): 317-332, 2017 01.
Article in English | MEDLINE | ID: mdl-27821432

ABSTRACT

While genetic diversity can be quantified accurately from high coverage sequencing data, it is often desirable to obtain such estimates from data with low coverage, either to save costs or because of low DNA quality, as is observed for ancient samples. Here, we introduce a method to accurately infer heterozygosity probabilistically from sequences with average coverage [Formula: see text] of a single individual. The method relaxes the infinite sites assumption of previous methods, does not require a reference sequence, except for the initial alignment of the sequencing data, and takes into account both variable sequencing errors and potential postmortem damage. It is thus also applicable to nonmodel organisms and ancient genomes. Since error rates as reported by sequencing machines are generally distorted and require recalibration, we also introduce a method to accurately infer recalibration parameters in the presence of postmortem damage. This method does not require knowledge about the underlying genome sequence, but instead works with haploid data (e.g., from the X-chromosome from mammalian males) and integrates over the unknown genotypes. Using extensive simulations we show that a few megabasepairs of haploid data are sufficient for accurate recalibration, even at average coverages as low as [Formula: see text]. At similar coverages, our method also produces very accurate estimates of heterozygosity down to [Formula: see text] within windows of about 1 Mbp. We further illustrate the usefulness of our approach by inferring genome-wide patterns of diversity for several ancient human samples, and we found that 3000-5000-year-old samples showed diversity patterns comparable to those of modern humans. In contrast, two European hunter-gatherer samples exhibited not only considerably lower levels of diversity than modern samples, but also highly distinct distributions of diversity along their genomes. Interestingly, these distributions were also very different between the two samples, supporting earlier conclusions of a highly diverse and structured population in Europe prior to the arrival of farming.


Subjects
DNA, Ancient/analysis , Genetic Carrier Screening/methods , Sequence Analysis, DNA/methods , Base Sequence , Chromosome Mapping/methods , Genetic Variation , Genetics, Population/methods , Genome, Human , Heterozygote , Humans , Male , Software
13.
Genetics ; 203(2): 893-904, 2016 06.
Article in English | MEDLINE | ID: mdl-27052569

ABSTRACT

Methods that bypass analytical evaluations of the likelihood function have become an indispensable tool for statistical inference in many fields of science. These so-called likelihood-free methods rely on accepting and rejecting simulations based on summary statistics, which limits them to low-dimensional models for which the value of the likelihood is large enough to result in manageable acceptance rates. To get around these issues, we introduce a novel, likelihood-free Markov chain Monte Carlo (MCMC) method combining two key innovations: updating only one parameter per iteration and accepting or rejecting this update based on subsets of statistics approximately sufficient for this parameter. This increases acceptance rates dramatically, rendering this approach suitable even for models of very high dimensionality. We further derive that for linear models, a one-dimensional combination of statistics per parameter is sufficient and can be found empirically with simulations. Finally, we demonstrate that our method readily scales to models of very high dimensionality, using toy models as well as by jointly inferring the effective population size, the distribution of fitness effects (DFE) of segregating mutations, and selection coefficients for each locus from data of a recent experiment on the evolution of drug resistance in influenza.


Subjects
Drug Resistance, Viral/genetics , Models, Genetic , Genetic Fitness , Genetic Loci , Mutation , Orthomyxoviridae/drug effects , Orthomyxoviridae/genetics , Probability , Selection, Genetic
14.
Genetics ; 203(2): 831-46, 2016 06.
Article in English | MEDLINE | ID: mdl-27038112

ABSTRACT

The joint and accurate inference of selection and demography from genetic data is considered a particularly challenging question in population genetics, since both processes may lead to very similar patterns of genetic diversity. However, additional information for disentangling these effects may be obtained by observing changes in allele frequencies over multiple time points. Such data are common in experimental evolution studies, as well as in the comparison of ancient and contemporary samples. Leveraging this information, however, has been computationally challenging, particularly when considering multilocus data sets. To overcome these issues, we introduce a novel, discrete approximation for diffusion processes, termed mean transition time approximation, which preserves the long-term behavior of the underlying continuous diffusion process. We then derive this approximation for the particular case of inferring selection and demography from time series data under the classic Wright-Fisher model and demonstrate that our approximation is well suited to describe allele trajectories through time, even when only a few states are used. We then develop a Bayesian inference approach to jointly infer the population size and locus-specific selection coefficients with high accuracy and further extend this model to also infer the rates of sequencing errors and mutations. We finally apply our approach to recent experimental data on the evolution of drug resistance in influenza virus, identifying likely targets of selection and finding evidence for much larger viral population sizes than previously reported.


Subjects
Drug Resistance, Viral/genetics , Evolution, Molecular , Models, Genetic , Orthomyxoviridae/genetics , Markov Chains , Orthomyxoviridae/drug effects , Selection, Genetic
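
The classic Wright-Fisher model named in the abstract above can be simulated directly. This is a minimal sketch of that underlying model only and does not implement the mean transition time approximation. It assumes a haploid population of fixed size in which selection with coefficient s acts before binomial sampling. All names are illustrative.

```python
import random


def expected_next_freq(freq, s):
    # Deterministic effect of selection on a haploid allele with
    # relative fitness 1 + s (assumed parametrization).
    return freq * (1.0 + s) / (1.0 + s * freq)


def wright_fisher_trajectory(n_ind, s, start_freq, generations, seed=0):
    # Each generation: apply selection, then draw n_ind offspring
    # independently (binomial sampling, which models genetic drift).
    rng = random.Random(seed)
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        p = expected_next_freq(freq, s)
        count = sum(rng.random() < p for _ in range(n_ind))
        freq = count / n_ind
        trajectory.append(freq)
    return trajectory
```

Allele frequencies observed at several time points, as in the time series data described above, correspond to a few entries of such a trajectory. Loss (frequency 0) and fixation (frequency 1) are absorbing because this sketch includes no mutation.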
15.
Ecol Evol ; 4(14): 2867-83, 2014 Jul.
Article in English | MEDLINE | ID: mdl-25165525

ABSTRACT

Specialization to nectarivory is associated with radiations within different bird groups, including parrots. One of them, the Australasian lories, was shown to be unexpectedly species rich. Their shift to nectarivory may have created an ecological opportunity promoting species proliferation. Several morphological specializations of the feeding tract to nectarivory have been described for parrots. However, they have never been assessed in a quantitative framework considering phylogenetic nonindependence. Using a phylogenetic comparative approach with broad taxon sampling and 15 continuous characters of the digestive tract, we demonstrate that nectarivorous parrots differ in several traits from the remaining parrots. These trait changes indicate phenotype-environment correlations and parallel evolution, and may reflect adaptations to feed effectively on nectar. Moreover, the diet shift was associated with significant trait shifts at the base of the radiation of the lories, as shown by an alternative statistical approach. Their diet shift might be considered as an evolutionary key innovation which promoted significant non-adaptive lineage diversification through allopatric partitioning of the same new niche. The lack of increased rates of cladogenesis in other nectarivorous parrots indicates that evolutionary innovations need not be associated one-to-one with diversification events.

16.
Genetics ; 184(1): 243-52, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19786619

ABSTRACT

Until recently, the use of Bayesian inference was limited to a few cases because for many realistic probability models the likelihood function cannot be calculated analytically. The situation changed with the advent of likelihood-free inference algorithms, often subsumed under the term approximate Bayesian computation (ABC). A key innovation was the use of a postsampling regression adjustment, allowing larger tolerance values and as such shifting computation time to realistic orders of magnitude. Here we propose a reformulation of the regression adjustment in terms of a general linear model (GLM). This allows the integration into the sound theoretical framework of Bayesian statistics and the use of its methods, including model selection via Bayes factors. We then apply the proposed methodology to the question of population subdivision among western chimpanzees, Pan troglodytes verus.


Subjects
Models, Genetic , Algorithms , Animals , Bayes Theorem , Genetics, Population , Humans , Likelihood Functions , Linear Models , Pan troglodytes/genetics
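
The postsampling regression adjustment mentioned in the abstract above can be sketched for a single parameter and a single summary statistic. This is a simplified, unweighted linear version and not the GLM reformulation the abstract proposes. The function name and arguments are hypothetical.

```python
import statistics


def regression_adjust(thetas, stats, s_obs):
    # Fit theta = a + b * s by ordinary least squares on the accepted
    # simulations, then shift each theta to where it would sit if its
    # simulated statistic had been exactly s_obs.
    mean_t = statistics.fmean(thetas)
    mean_s = statistics.fmean(stats)
    var_s = sum((s - mean_s) ** 2 for s in stats)
    cov = sum((t - mean_t) * (s - mean_s) for t, s in zip(thetas, stats))
    slope = cov / var_s
    return [t - slope * (s - s_obs) for t, s in zip(thetas, stats)]
```

Because the adjustment corrects for the distance between simulated and observed statistics, simulations accepted under a wider tolerance can still contribute, which is the benefit the abstract attributes to this step.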
17.
Genetics ; 182(4): 1207-18, 2009 Aug.
Article in English | MEDLINE | ID: mdl-19506307

ABSTRACT

Approximate Bayesian computation (ABC) techniques permit inferences in complex demographic models, but are computationally inefficient. A Markov chain Monte Carlo (MCMC) approach has been proposed (Marjoram et al. 2003), but it suffers from computational problems and poor mixing. We propose several methodological developments to overcome the shortcomings of this MCMC approach and hence realize substantial computational advances over standard ABC. The principal idea is to relax the tolerance within MCMC to permit good mixing, but retain a good approximation to the posterior by a combination of subsampling the output and regression adjustment. We also propose to use a partial least-squares (PLS) transformation to choose informative statistics. The accuracy of our approach is examined in the case of the divergence of two populations with and without migration. In that case, our ABC-MCMC approach needs considerably less computation time than conventional ABC to reach the same accuracy. We then apply our method to a more complex case with the estimation of divergence times and migration rates between three African populations.


Subjects
Models, Genetic , Population Dynamics , Probability , Africa , Bayes Theorem , Emigration and Immigration , Humans , Likelihood Functions , Markov Chains , Monte Carlo Method , Statistics as Topic
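
A basic likelihood-free MCMC sampler of the kind the abstract above builds on can be sketched as follows. It covers only the plain accept/reject chain and none of the proposed refinements (relaxed tolerance with subsampling, regression adjustment, PLS). It assumes a toy normal model with unit variance, a uniform prior, and a symmetric Gaussian proposal, so the acceptance rule reduces to a prior-support check and a tolerance check. All names are illustrative.

```python
import random
import statistics


def summary(theta, n, rng):
    # Toy simulator plus summary statistic (an assumption):
    # the mean of n draws from Normal(theta, 1).
    return statistics.fmean(rng.gauss(theta, 1.0) for _ in range(n))


def abc_mcmc(s_obs, n, start, steps, tolerance, step_sd,
             prior=(-5.0, 5.0), seed=0):
    rng = random.Random(seed)
    theta = start
    chain = [theta]
    for _ in range(steps):
        proposal = theta + rng.gauss(0.0, step_sd)
        # With a uniform prior and symmetric proposal, the move is
        # accepted when the proposal lies inside the prior and its
        # simulated statistic is within tolerance of the observed one.
        if prior[0] <= proposal <= prior[1]:
            if abs(summary(proposal, n, rng) - s_obs) <= tolerance:
                theta = proposal
        chain.append(theta)
    return chain
```

A very small tolerance makes the chain reject most proposals and mix poorly, which is the kind of shortcoming the abstract sets out to address.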