ABSTRACT
In this paper, we investigate the consequences of dormancy in the 'rare mutation' and 'large population' regime of stochastic adaptive dynamics. Starting from an individual-based micro-model, we first derive the Polymorphic Evolution Sequence of the population, based on a previous work by Baar and Bovier (2018). After passing to a second 'small mutations' limit, we arrive at the Canonical Equation of Adaptive Dynamics, and state a corresponding criterion for evolutionary branching, extending a previous result of Champagnat and Méléard (2011). The criterion allows a quantitative and qualitative analysis of the effects of dormancy in the well-known model of Dieckmann and Doebeli (1999) for sympatric speciation. In fact, quite an intuitive picture emerges: Dormancy enlarges the parameter range for evolutionary branching, increases the carrying capacity and niche width of the post-branching sub-populations, and, depending on the model parameters, can either increase or decrease the 'speed of adaptation' of populations. Finally, dormancy increases diversity by increasing the genetic distance between subpopulations.
Subject(s)
Biological Evolution, Mutation
ABSTRACT
In this paper we investigate the interplay between two fundamental mechanisms of microbial population dynamics and evolution, namely dormancy and horizontal gene transfer. The corresponding traits come in many guises and are ubiquitous in microbial communities, affecting their dynamics in important ways. Recently, they have each moved (separately) into the focus of stochastic individual-based modelling (Billiard et al. 2016, 2018; Champagnat, Méléard and Tran, 2021; Blath and Tóbiás 2020). Here, we investigate their combined effects in a unified model. Indeed, we consider the (idealized) scenario of two sub-populations, respectively carrying 'trait 1' and 'trait 2', where trait 1 individuals are able to switch (under competitive pressure) into a dormant state, and trait 2 individuals are able to execute horizontal gene transfer, which in our case means that they can turn trait 1 individuals into trait 2 ones, at a rate depending on the density of individuals. In the large-population limit, we examine the fate of (i) a single trait 2 individual (called 'mutant') arriving in a trait 1 resident population living in equilibrium, and (ii) a trait 1 individual ('mutant') arriving in a trait 2 resident population. We analyse the invasion dynamics in all cases where the resident population is individually fit and the behaviour of the mutant population is initially non-critical. This leads to the identification of parameter regimes for the invasion and fixation of the new trait, stable coexistence of the two traits, and 'founder control' (where the initial resident always dominates, irrespective of its trait). One of our key findings is that horizontal transfer can lead to stable coexistence even if trait 2 is unfit on its own. In the case of founder control, the limiting dynamical system also exhibits a coexistence equilibrium, which, however, is unstable, and with overwhelming probability none of the mutant sub-populations is able to invade. 
In all cases, we observe the classical (up to three) phases of invasion dynamics à la Champagnat (2006).
Subject(s)
Bacteria, Horizontal Gene Transfer, Humans, Biological Models, Phenotype, Population Dynamics, Probability
ABSTRACT
The goal of this article is to contribute towards the conceptual and quantitative understanding of the evolutionary benefits for (microbial) populations to maintain a seed bank consisting of dormant individuals when facing fluctuating environmental conditions. To this end, we discuss a class of '2-type' branching processes describing populations of individuals that may switch between 'active' and 'dormant' states in a random environment oscillating between a 'healthy' and a 'harsh' state. We incorporate different switching strategies and suggest a method of 'fair comparison' to incorporate potentially varying reproductive costs. We then use this concept to compare the fitness of the different strategies in terms of maximal Lyapunov exponents. This gives rise to a 'fitness map' depicting the environmental regimes where certain switching strategies are uniquely supercritical.
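As a rough illustration of comparing strategies by their maximal Lyapunov exponent, the sketch below estimates the exponent for a product of random 2x2 mean matrices. It is a simplified discrete-time stand-in, not the model of the article: the i.i.d. environment, the function name `lyapunov_exponent`, and all numerical values are assumptions made for illustration only.

```python
import math
import random


def lyapunov_exponent(mean_matrices, env_probs, steps=20000, seed=0):
    """Estimate the maximal Lyapunov exponent of a random matrix product.

    mean_matrices[e][i][j] is the (hypothetical) expected number of type-j
    individuals produced per type-i individual in environment e, where
    type 0 is 'active' and type 1 is 'dormant'. Environments are drawn
    i.i.d. with probabilities env_probs (an assumption of this sketch).
    """
    rng = random.Random(seed)
    v = [0.5, 0.5]          # normalised population composition
    log_growth = 0.0
    for _ in range(steps):
        m = rng.choices(mean_matrices, weights=env_probs)[0]
        # Row vector times matrix: expected composition after one step.
        w = [v[0] * m[0][0] + v[1] * m[1][0],
             v[0] * m[0][1] + v[1] * m[1][1]]
        norm = w[0] + w[1]
        log_growth += math.log(norm)   # accumulate the log growth factor
        v = [w[0] / norm, w[1] / norm]  # renormalise to avoid overflow
    return log_growth / steps


if __name__ == "__main__":
    # Hypothetical mean matrices: actives reproduce in the healthy state and
    # mostly die in the harsh state; dormant individuals persist in both.
    healthy = [[1.6, 0.2], [0.3, 0.7]]
    harsh = [[0.2, 0.3], [0.1, 0.9]]
    print(lyapunov_exponent([healthy, harsh], [0.5, 0.5]))
```

A positive estimate corresponds to a supercritical strategy and a negative one to a subcritical strategy, so evaluating the function over a grid of environmental parameters gives a crude analogue of a fitness map.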
Subject(s)
Biological Evolution, Seed Bank, Humans
ABSTRACT
We investigate scaling limits of the seed bank model when migration (to and from the seed bank) is 'slow' compared to reproduction. This is motivated by models for bacterial dormancy, where periods of dormancy can be orders of magnitude larger than reproductive times. Speeding up time, we encounter a separation of timescales phenomenon which leads to mathematically interesting observations, in particular providing a prototypical example where the scaling limit of a continuous diffusion will be a jump diffusion. For this situation, standard convergence results typically fail. While such a situation could in principle be attacked by the sophisticated analytical scheme of Kurtz (J Funct Anal 12:55-67, 1973), this will require significant technical efforts. Instead, in our situation, we are able to identify and explicitly characterise a well-defined limit via duality in a surprisingly non-technical way. Indeed, we show that moment duality is in a suitable sense stable under passage to the limit and allows a direct and intuitive identification of the limiting semi-group while at the same time providing a probabilistic interpretation of the model. We also obtain a general convergence strategy for continuous-time Markov chains in a separation of timescales regime, which is of independent interest.
Subject(s)
Biological Models, Seed Bank, Time, Diffusion, Markov Chains
ABSTRACT
We derive statistical tools to analyze the patterns of genetic variability produced by models related to seed banks; in particular the Kingman coalescent, its time-changed counterpart describing so-called weak seed banks, the strong seed bank coalescent, and the two-island structured coalescent. As (strong) seed banks stratify a population, we expect them to produce a signal comparable to population structure. We present tractable formulas for Wright's FST and the expected site frequency spectrum for these models, and show that they can distinguish between some models for certain ranges of parameters. We then use pseudo-marginal MCMC to show that the full likelihood can reliably distinguish between all models in the presence of parameter uncertainty under moderate stratification, and point out statistical pitfalls arising from stratification that is either too strong or too weak. We further show that it is possible to infer parameters, and in particular determine whether mutation is taking place in the (strong) seed bank.
Subject(s)
Genetic Models, Seed Bank, Mutation, Probability
ABSTRACT
We investigate various aspects of the (biallelic) Wright-Fisher diffusion with seed bank in conjunction with and contrast to the two-island model analysed e.g. in Kermany et al. (Theor Popul Biol 74(3):226-232, 2008) and Nath and Griffiths (J Math Biol 31(8):841-851, 1993), including moments, stationary distribution and reversibility, for which our main tool is duality. Further, we show that the Wright-Fisher diffusion with seed bank can be reformulated as a one-dimensional stochastic delay differential equation, providing an elegant interpretation of the age structure in the seed bank also forward in time in the spirit of Kaj et al. (J Appl Probab 38(2):285-300, 2001). We also provide a complete boundary classification for this two-dimensional SDE using martingale-based reasoning known as McKean's argument.
Subject(s)
Molecular Evolution, Genetic Drift, Population Genetics/methods, Genetic Models, Computer Simulation, Gene Frequency, Haploidy, Genetic Selection, Stochastic Processes
ABSTRACT
We give recursions for the expected site-frequency spectrum associated with so-called Xi-coalescents, that is, exchangeable coalescents which admit simultaneous multiple mergers of ancestral lineages. Xi-coalescents arise, for example, in association with population models of skewed offspring distributions with diploidy, recurrent advantageous mutations, or strong bottlenecks. In contrast, the simpler Lambda-coalescents admit multiple mergers of lineages, but at most one such merger at a time. Xi-coalescents, as well as Lambda-coalescents, can predict an excess of singletons, compared to the Kingman coalescent. We compare estimates of coalescent parameters when Xi-coalescents are applied to data generated by Lambda-coalescents, and vice versa. In general, Xi-coalescents predict fewer singletons than corresponding Lambda-coalescents, but a higher count of mutations of size larger than singletons. We fit examples of Xi-coalescents to unfolded site-frequency spectra obtained for autosomal loci of the diploid Atlantic cod, and obtain different coalescent parameter estimates than obtained with corresponding Lambda-coalescents. Our results provide new inference tools, and suggest that for autosomal population genetic data from diploid or polyploid highly fecund populations that may have skewed offspring distributions, one should apply Xi-coalescents rather than Lambda-coalescents.
Subject(s)
Population Genetics/methods, Genetic Models, Mutation, Computer Simulation, Diploidy, Fertility/genetics, Reproduction/genetics
ABSTRACT
We establish a link between Wakeley et al.'s (2012) cyclical pedigree model from population genetics and a randomized directed configuration model (DCM) considered by Cooper and Frieze (2004). We then exploit this link in combination with asymptotic results for the in-degree distribution of the corresponding DCM to compute the asymptotic size of the largest strongly connected component S(N) (where N is the population size) of the DCM and, respectively, of the pedigree. The size of the giant component can be characterized explicitly (amounting to approximately 80% of the total population size) and thus contributes to a reduced 'pedigree effective population size'. In addition, the second largest strongly connected component is only of size O(log N). Moreover, we describe the size and structure of the 'domain of attraction' of S(N). In particular, we show that with high probability for any individual the shortest ancestral line reaches S(N) after O(log log N) generations, while almost all other ancestral lines take at most O(log N) generations.
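The abstract does not say how the figure of roughly 80% arises. One heuristic that produces a number of this size, offered here as an assumption rather than as the paper's argument, is to suppose that the number of children per individual is approximately Poisson with mean 2 (two parents per individual in a population of constant size) and to ask with what probability a line of descent never dies out. The sketch below computes that survival probability by fixed-point iteration; the function name is hypothetical.

```python
import math


def poisson_survival_probability(mean_offspring, tol=1e-12, max_iter=10000):
    """Survival probability of a Galton-Watson process with Poisson offspring.

    The extinction probability q is the smallest solution in [0, 1] of
    q = exp(-mean_offspring * (1 - q)); iterating from q = 0 converges to it.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(-mean_offspring * (1.0 - q))
        if abs(q_next - q) < tol:
            return 1.0 - q_next
        q = q_next
    return 1.0 - q


if __name__ == "__main__":
    # Assumed mean of 2 children per individual (heuristic, see lead-in).
    print(round(poisson_survival_probability(2.0), 4))
```

Under this assumed offspring law the survival probability is close to 0.8, which is at least consistent with the size quoted in the abstract.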
Subject(s)
Population Genetics/methods, Genetic Models, Pedigree, Algorithms, Computer Simulation, Humans, Theoretical Models, Poisson Distribution, Population Density, Probability
ABSTRACT
Bacterial genomes are mosaics with fragments showing distinct phylogenetic origins or even being unrelated to any other genetic information (ORFan genes). Thus, the analysis of bacterial population genetics is in large part a collection of explanations for anomalies in relation to classical population genetic models such as the Wright-Fisher model and the Kingman coalescent, which do not adequately describe bacterial population genetics, genomics or evolution. The concept of "species" as an evolutionarily coherent biological group that is genetically isolated and shares genetic information through recombination among its members cannot be applied to any bacterial group. Recently, a simple probabilistic model considering the role of strong seed-bank effects in population genetics has been proposed by Blath et al. This model suggests the existence of a genetic pool with high diversity that is not subject to classical selection and extinction. We reason that certain bacterial population genetics anomalies could be explained by the prevalence of strong seed-bank effects among bacteria. To address this possibility, we analyzed the genome of the bacterium Azotobacter vinelandii and show that genes coding for functions that are essential for the bacterium's biology do not have a relation of ancestry with closely related bacteria, or are ORFan genes. The existence of essential genes that are not inherited from the most recent ancestor cannot be explained by classical population genetics models and is irreconcilable with the current view of genes acquired by horizontal transfer as being accessory or adaptive.
Subject(s)
Azotobacter vinelandii/genetics, Molecular Evolution, Bacterial Genome/physiology, Genetic Models
ABSTRACT
We apply recently developed inference methods based on general coalescent processes to DNA sequence data obtained from various marine species. Several of these species are believed to exhibit so-called shallow gene genealogies, potentially due to extreme reproductive behaviour, e.g. via Hedgecock's "reproduction sweepstakes". Besides the data analysis, in particular the inference of mutation rates and the estimation of the (real) time to the most recent common ancestor, we briefly address the question whether the genealogies might be adequately described by so-called Beta-coalescents (as opposed to Kingman's coalescent), allowing multiple mergers of genealogies. The choice of the underlying coalescent model for the genealogy has drastic implications for the estimation of the above quantities, in particular the real-time embedding of the genealogy.
Subject(s)
Marine Biology, Ostreidae/genetics, DNA Sequence Analysis, Animals
ABSTRACT
We present and discuss new importance sampling schemes for the approximate computation of the sample probability of observed genetic types in the infinitely many sites model from population genetics. More specifically, we extend the 'classical framework', where genealogies are assumed to be governed by Kingman's coalescent, to the more general class of Lambda-coalescents and develop further Hobolth et al.'s (2008) idea of deriving importance sampling schemes based on 'compressed genetrees'. The resulting schemes extend earlier work by Griffiths and Tavaré (1994), Stephens and Donnelly (2000), Birkner and Blath (2008) and Hobolth et al. (2008). We conclude with a performance comparison of classical and new schemes for Beta- and Kingman coalescents.
Subject(s)
Molecular Evolution, Gene Frequency/genetics, Population Genetics/methods, Mutation/genetics, Animals, Markov Chains, Genetic Models, Monte Carlo Method, Sampling
ABSTRACT
Across the tree of life, populations have evolved the capacity to contend with suboptimal conditions by engaging in dormancy, whereby individuals enter a reversible state of reduced metabolic activity. The resulting seed banks are complex, storing information and imparting memory that gives rise to multi-scale structures and networks spanning collections of cells to entire ecosystems. We outline the fundamental attributes and emergent phenomena associated with dormancy and seed banks, with the vision for a unifying and mathematically based framework that can address problems in the life sciences, ranging from global change to cancer biology.
Subject(s)
Plant Dormancy/physiology, Seed Bank, Seedlings/physiology, Seeds/physiology, Ecosystem, Environment, Plant Gene Expression Regulation, Humans, Light, Plant Dormancy/genetics, Seedlings/genetics, Seeds/genetics, Temperature
ABSTRACT
The ability of the site-frequency spectrum (SFS) to reflect the particularities of gene genealogies exhibiting multiple mergers of ancestral lines as opposed to those obtained in the presence of population growth is our focus. An excess of singletons is a well-known characteristic of both population growth and multiple mergers. Other aspects of the SFS, in particular, the weight of the right tail, are, however, affected in specific ways by the two model classes. Using an approximate likelihood method and minimum-distance statistics, our estimates of statistical power indicate that exponential and algebraic growth can indeed be distinguished from multiple-merger coalescents, even for moderate sample sizes, if the number of segregating sites is high enough. A normalized version of the SFS (nSFS) is also used as a summary statistic in an approximate Bayesian computation (ABC) approach. The results give further positive evidence as to the general eligibility of the SFS to distinguish between the different histories.
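For orientation, the neutral constant-size baseline against which both alternatives are measured has a simple closed form: under the Kingman coalescent with infinitely many sites, the expected number of mutations carried by i of n sampled sequences is θ/i, where θ is the scaled mutation rate. The sketch below computes this expected spectrum, its normalized version, and a squared-distance statistic. The function names are hypothetical, and the distance shown is only one possible choice, not necessarily the statistic used in the paper.

```python
def kingman_expected_sfs(n, theta):
    """Expected unfolded SFS under the Kingman coalescent: E[xi_i] = theta / i."""
    return [theta / i for i in range(1, n)]


def normalized(sfs):
    """Divide each class by the total so that the entries sum to one."""
    total = sum(sfs)
    return [x / total for x in sfs]


def squared_distance(observed, expected):
    """Sum of squared differences between two spectra of equal length."""
    if len(observed) != len(expected):
        raise ValueError("spectra must have the same number of classes")
    return sum((o - e) ** 2 for o, e in zip(observed, expected))


if __name__ == "__main__":
    n = 10
    baseline = normalized(kingman_expected_sfs(n, theta=1.0))
    # Hypothetical observed counts with an excess of singletons.
    observed = normalized([30, 6, 4, 3, 2, 2, 1, 1, 1])
    print(squared_distance(observed, baseline))
```

Because θ cancels in the normalization, the normalized baseline depends only on the sample size, which is one reason a normalized spectrum is convenient as a summary statistic.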
Subject(s)
Population Genetics/methods, Genetic Models, Bayes Theorem, Likelihood Functions, Population Growth
ABSTRACT
We analyze patterns of genetic variability of populations in the presence of a large seedbank with the help of a new coalescent structure called the seedbank coalescent. This ancestral process appears naturally as a scaling limit of the genealogy of large populations that sustain seedbanks, if the seedbank size and individual dormancy times are of the same order as those of the active population. Mutations appear as Poisson processes on the active lineages and potentially, at a reduced rate, also on the dormant lineages. The presence of "dormant" lineages leads to qualitatively altered times to the most recent common ancestor and nonclassical patterns of genetic diversity. To illustrate this, we provide a Wright-Fisher model with a seedbank component and mutation, motivated by recent models of microbial dormancy, whose genealogy can be described by the seedbank coalescent. Based on our coalescent model, we derive recursions for the expectation and variance of the time to most recent common ancestor, number of segregating sites, pairwise differences, and singletons. Estimates (obtained by simulations) of the distributions of commonly employed distance statistics, in the presence and absence of a seedbank, are compared. The effect of a seedbank on the expected site-frequency spectrum is also investigated using simulations. Our results indicate that the presence of a large seedbank considerably alters the distribution of some distance statistics, as well as the site-frequency spectrum. Thus, one should be able to detect from genetic data the presence of a large seedbank in natural populations.
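To make the effect on the time to the most recent common ancestor concrete, the following sketch computes its expectation by first-step analysis. The abstract does not state the transition rates, so the ones used here are an assumption: each pair of active lineages coalesces at rate 1, each active lineage becomes dormant at rate c, and each dormant lineage becomes active at rate cK. The function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def expected_tmrca(n, c, K):
    """Expected time to the MRCA of n active lineages (assumed rates, c, K > 0).

    State (a, d) means a active and d dormant lineages. Levels a + d = m are
    solved in increasing order, since a coalescence moves to level m - 1.
    """
    prev = {(1, 0): 0.0, (0, 1): 0.0}   # a single lineage: nothing to wait for
    for m in range(2, n + 1):
        size = m + 1
        mat = [[0.0] * size for _ in range(size)]
        rhs = [0.0] * size
        for a in range(size):
            d = m - a
            coal = a * (a - 1) / 2.0     # pairs of active lineages
            sleep = c * a                # active -> dormant
            wake = c * K * d             # dormant -> active
            mat[a][a] = coal + sleep + wake
            if a > 0:
                mat[a][a - 1] = -sleep
            if a < m:
                mat[a][a + 1] = -wake
            rhs[a] = 1.0 + coal * prev.get((a - 1, d), 0.0)
        x = solve_linear(mat, rhs)
        prev = {(a, m - a): x[a] for a in range(size)}
    return prev[(n, 0)]


if __name__ == "__main__":
    for K in (0.5, 1.0, 10.0):
        print(K, expected_tmrca(2, c=1.0, K=K))
```

For a sample of two active lineages, solving the three resulting equations by hand gives (1 + 1/K)^2 under these rates, independent of c, compared with 1 for the Kingman coalescent; the code reproduces this and extends the computation to larger samples.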
Subject(s)
Genetic Variation, Genetic Models, Spores/genetics, Archaea/genetics, Bacteria/genetics, Eukaryota/genetics, Population Genetics, Mutation
ABSTRACT
Statistical properties of the site-frequency spectrum associated with Λ-coalescents are our objects of study. In particular, we derive recursions for the expected value, variance, and covariance of the spectrum, extending earlier results of Fu (1995) for the classical Kingman coalescent. Estimating coalescent parameters introduced by certain Λ-coalescents for data sets too large for full-likelihood methods is our focus. The recursions for the expected values we obtain can be used to find the parameter values that give the best fit to the observed frequency spectrum. The expected values are also used to approximate the probability a (derived) mutation arises on a branch subtending a given number of leaves (DNA sequences), allowing us to apply a pseudolikelihood inference to estimate coalescence parameters associated with certain subclasses of Λ-coalescents. The properties of the pseudolikelihood approach are investigated on simulated as well as real mtDNA data sets for the high-fecundity Atlantic cod (Gadus morhua). Our results for two subclasses of Λ-coalescents show that one can distinguish these subclasses from the Kingman coalescent, as well as between the Λ-subclasses, even for a moderate (maybe a few hundred) sample size.
Subject(s)
Population Genetics/statistics & numerical data, Population Density, Animals, Aquatic Organisms/genetics, Aquatic Organisms/physiology, Computer Simulation, Mitochondrial DNA/genetics, Female, Fertility/genetics, Gadus morhua/genetics, Gadus morhua/physiology, Likelihood Functions, Male, Genetic Models, Statistical Models, Mutation, Reproduction/genetics
ABSTRACT
A large offspring-number diploid biparental multilocus population model of Moran type is our object of study. At each time step, a pair of diploid individuals drawn uniformly at random contributes offspring to the population. The number of offspring can be large relative to the total population size. Similar "heavily skewed" reproduction mechanisms have been recently considered by various authors (cf. e.g., Eldon and Wakeley 2006, 2008) and reviewed by Hedgecock and Pudovkin (2011). Each diploid parental individual contributes exactly one chromosome to each diploid offspring, and hence ancestral lineages can coalesce only when in distinct individuals. A separation-of-timescales phenomenon is thus observed. A result of Möhle (1998) is extended to obtain convergence of the ancestral process to an ancestral recombination graph necessarily admitting simultaneous multiple mergers of ancestral lineages. The usual ancestral recombination graph is obtained as a special case of our model when the parents contribute only one offspring to the population each time. Due to diploidy and large offspring numbers, novel effects appear. For example, the marginal genealogy at each locus admits simultaneous multiple mergers in up to four groups, and different loci remain substantially correlated even as the recombination rate grows large. Thus, genealogies for loci far apart on the same chromosome remain correlated. Correlation in coalescence times for two loci is derived and shown to be a function of the coalescence parameters of our model. Extending the observations by Eldon and Wakeley (2008), predictions of linkage disequilibrium are shown to be functions of the reproduction parameters of our model, in addition to the recombination rate. 
Correlations in ratios of coalescence times between loci can be high, even when the recombination rate is high and sample size is large, in large offspring-number populations, as suggested by simulations, hinting at how to distinguish between different population models.
Subject(s)
Diploidy, Genetic Models, Genetic Recombination, Algorithms, Animals, Computer Simulation, Molecular Evolution, Female, Genetic Loci, Population Genetics, Humans, Male
ABSTRACT
One of the central problems in mathematical genetics is the inference of evolutionary parameters of a population (such as the mutation rate) based on the observed genetic types in a finite DNA sample. If the population model under consideration is in the domain of attraction of the classical Fleming-Viot process, such as the Wright-Fisher or the Moran model, then the standard means to describe its genealogy is Kingman's coalescent. For this coalescent process, powerful inference methods are well-established. An important feature of the above class of models is, roughly speaking, that the number of offspring of each individual is small when compared to the total population size, and hence all ancestral collisions are binary only. Recently, more general population models have been studied, in particular in the domain of attraction of so-called generalised Lambda-Fleming-Viot processes, as well as their (dual) genealogies, given by the so-called Lambda-coalescents, which allow multiple collisions. Moreover, Eldon and Wakeley (Genetics 172:2621-2633, 2006) provide evidence that such more general coalescents might actually be more adequate to describe real populations with extreme reproductive behaviour, in particular many marine species. In this paper, we extend methods of Ethier and Griffiths (Ann Probab 15(2):515-545, 1987) and Griffiths and Tavaré (Theor Pop Biol 46:131-159, 1994a, Stat Sci 9:307-319, 1994b, Philos Trans Roy Soc Lond Ser B 344:403-410, 1994c, Math Biosci 12:77-98, 1995) to obtain a likelihood-based inference method for general Lambda-coalescents. In particular, we obtain a method to compute (approximate) likelihood surfaces for the observed type probabilities of a given sample. We argue that within the (vast) family of Lambda-coalescents, the parametrisable sub-family of Beta(2 - alpha, alpha)-coalescents, where alpha in (1, 2], is of particular relevance. 
We illustrate our method using simulated datasets, thus obtaining maximum-likelihood estimators of mutation and demographic parameters.
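To make the Beta(2 - alpha, alpha) sub-family concrete, the sketch below evaluates its merger rates. It uses the standard Lambda-coalescent rate, the integral of x^(k-2) (1-x)^(b-k) against Lambda, which for a Beta(2 - alpha, alpha) measure reduces to a ratio of Beta functions. The function names are hypothetical, and the code covers only alpha strictly between 1 and 2, since the boundary case alpha = 2 corresponds to Kingman's coalescent and is not a proper Beta density.

```python
import math


def log_beta(x, y):
    """Logarithm of the Beta function B(x, y) for x, y > 0."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def beta_coalescent_rate(b, k, alpha):
    """Rate at which a given group of k out of b lineages merges.

    lambda_{b,k} = B(k - alpha, b - k + alpha) / B(2 - alpha, alpha),
    valid for 2 <= k <= b and 1 < alpha < 2.
    """
    if not 1.0 < alpha < 2.0:
        raise ValueError("this sketch requires 1 < alpha < 2")
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    return math.exp(log_beta(k - alpha, b - k + alpha)
                    - log_beta(2.0 - alpha, alpha))


def total_merger_rate(b, alpha):
    """Total rate of any merger among b lineages."""
    return sum(math.comb(b, k) * beta_coalescent_rate(b, k, alpha)
               for k in range(2, b + 1))


if __name__ == "__main__":
    for alpha in (1.1, 1.5, 1.9):
        print(alpha, [beta_coalescent_rate(5, k, alpha) for k in range(2, 6)])
```

Smaller values of alpha put relatively more weight on mergers involving many lineages, whereas values near 2 make binary mergers dominate, in line with the Kingman limit.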