Your browser doesn't support javascript.
loading
Mostrar: 20 | 50 | 100
Resultados 1 - 20 de 22
Filtrar
Mais filtros










Base de dados
Intervalo de ano de publicação
1.
J Math Biol ; 86(6): 98, 2023 05 26.
Artigo em Inglês | MEDLINE | ID: mdl-37233854

RESUMO

Recombination is a fundamental evolutionary force, but it is difficult to quantify because the effect of a recombination event on patterns of variation in a sample of genetic data can be hard to discern. Estimators for the recombination rate, which are usually based on the idea of integrating over the unobserved possible evolutionary histories of a sample, can therefore be noisy. Here we consider a related question: how would an estimator behave if the evolutionary history actually was observed? This would offer an upper bound on the performance of estimators used in practice. In this paper we derive an expression for the maximum likelihood estimator for the recombination rate based on a continuously observed, multi-locus, Wright-Fisher diffusion of haplotype frequencies, complementing existing work for an estimator of selection. We show that, contrary to selection, the estimator has unusual properties because the observed information matrix can explode in finite time whereupon the recombination parameter is learned without error. We also show that the recombination estimator is robust to the presence of selection in the sense that incorporating selection into the model leaves the estimator unchanged. We study the properties of the estimator by simulation and show that its distribution can be quite sensitive to the underlying mutation rates.


Assuntos
Evolução Biológica , Recombinação Genética , Haplótipos , Simulação por Computador , Modelos Genéticos
2.
Bioinformatics ; 39(1)2023 01 01.
Artigo em Inglês | MEDLINE | ID: mdl-36629450

RESUMO

MOTIVATION: The Wright-Fisher diffusion is important in population genetics in modelling the evolution of allele frequencies over time subject to the influence of biological phenomena such as selection, mutation and genetic drift. Simulating the paths of the process is challenging due to the form of the transition density. We present EWF, a robust and efficient sampler which returns exact draws for the diffusion and diffusion bridge processes, accounting for general models of selection including those with frequency dependence. RESULTS: Given a configuration of selection, mutation and endpoints, EWF returns draws at the requested sampling times from the law of the corresponding Wright-Fisher process. Output was validated by comparison to approximations of the transition density via the Kolmogorov-Smirnov test and QQ plots. AVAILABILITY AND IMPLEMENTATION: All softwares are available at https://github.com/JaroSant/EWF. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Assuntos
Genética Populacional , Modelos Genéticos , Frequência do Gene , Deriva Genética , Mutação , Seleção Genética
3.
Mol Biol Evol ; 39(2)2022 02 03.
Artigo em Inglês | MEDLINE | ID: mdl-35106601

RESUMO

The evolutionary process of genetic recombination has the potential to rapidly change the properties of a viral pathogen, and its presence is a crucial factor to consider in the development of treatments and vaccines. It can also significantly affect the results of phylogenetic analyses and the inference of evolutionary rates. The detection of recombination from samples of sequencing data is a very challenging problem and is further complicated for SARS-CoV-2 by its relatively slow accumulation of genetic diversity. The extent to which recombination is ongoing for SARS-CoV-2 is not yet resolved. To address this, we use a parsimony-based method to reconstruct possible genealogical histories for samples of SARS-CoV-2 sequences, which enables us to pinpoint specific recombination events that could have generated the data. We propose a statistical framework for disentangling the effects of recurrent mutation from recombination in the history of a sample, and hence provide a way of estimating the probability that ongoing recombination is present. We apply this to samples of sequencing data collected in England and South Africa and find evidence of ongoing recombination.


Assuntos
COVID-19 , SARS-CoV-2 , Genoma Viral , Humanos , Mutação , Filogenia , Recombinação Genética
4.
Bioinformatics ; 37(19): 3277-3284, 2021 Oct 11.
Artigo em Inglês | MEDLINE | ID: mdl-33970217

RESUMO

MOTIVATION: The reconstruction of possible histories given a sample of genetic data in the presence of recombination and recurrent mutation is a challenging problem, but can provide key insights into the evolution of a population. We present KwARG, which implements a parsimony-based greedy heuristic algorithm for finding plausible genealogical histories (ancestral recombination graphs) that are minimal or near-minimal in the number of posited recombination and mutation events. RESULTS: Given an input dataset of aligned sequences, KwARG outputs a list of possible candidate solutions, each comprising a list of mutation and recombination events that could have generated the dataset; the relative proportion of recombinations and recurrent mutations in a solution can be controlled via specifying a set of 'cost' parameters. We demonstrate that the algorithm performs well when compared against existing methods. AVAILABILITY AND IMPLEMENTATION: The software is available at https://github.com/a-ignatieva/kwarg. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

5.
Theor Popul Biol ; 134: 61-76, 2020 08.
Artigo em Inglês | MEDLINE | ID: mdl-32439294

RESUMO

The dynamics of a population exhibiting exponential growth can be modelled as a birth-death process, which naturally captures the stochastic variation in population size over time. In this article, we consider a supercritical birth-death process, started at a random time in the past, and conditioned to have n sampled individuals at the present. The genealogy of individuals sampled at the present time is then described by the reversed reconstructed process (RRP), which traces the ancestry of the sample backwards from the present. We show that a simple, analytic, time rescaling of the RRP provides a straightforward way to derive its inter-event times. The same rescaling characterises other distributions underlying this process, obtained elsewhere in the literature via more cumbersome calculations. We also consider the case of incomplete sampling of the population, in which each leaf of the genealogy is retained with an independent Bernoulli trial with probability ψ, and we show that corresponding results for Bernoulli-sampled RRPs can be derived using time rescaling, for any values of the underlying parameters. A central result is the derivation of a scaling limit as ψ approaches 0, corresponding to the underlying population growing to infinity, using the time rescaling formalism. We show that in this setting, after a linear time rescaling, the event times are the order statistics of n logistic random variables with mode log(1∕ψ); moreover, we show that the inter-event times are approximately exponentially distributed.


Assuntos
Densidade Demográfica , Humanos , Probabilidade
7.
Adv Neural Inf Process Syst ; 31: 8594-8605, 2018 Dec.
Artigo em Inglês | MEDLINE | ID: mdl-33244210

RESUMO

An explosion of high-throughput DNA sequencing in the past decade has led to a surge of interest in population-scale inference with whole-genome data. Recent work in population genetics has centered on designing inference methods for relatively simple model classes, and few scalable general-purpose inference techniques exist for more realistic, complex models. To achieve this, two inferential challenges need to be addressed: (1) population data are exchangeable, calling for methods that efficiently exploit the symmetries of the data, and (2) computing likelihoods is intractable as it requires integrating over a set of correlated, extremely high-dimensional latent variables. These challenges are traditionally tackled by likelihood-free methods that use scientific simulators to generate datasets and reduce them to hand-designed, permutation-invariant summary statistics, often leading to inaccurate inference. In this work, we develop an exchangeable neural network that performs summary statistic-free, likelihood-free inference. Our framework can be applied in a black-box fashion across a variety of simulation-based tasks, both within and outside biology. We demonstrate the power of our approach on the recombination hotspot testing problem, outperforming the state-of-the-art.

8.
Theor Popul Biol ; 122: 67-77, 2018 07.
Artigo em Inglês | MEDLINE | ID: mdl-28993198

RESUMO

The trajectory of the frequency of an allele which begins at x at time 0 and is known to have frequency z at time T can be modelled by the bridge process of the Wright-Fisher diffusion. Bridges when x=z=0 are particularly interesting because they model the trajectory of the frequency of an allele which appears at a time, then is lost by random drift or mutation after a time T. The coalescent genealogy back in time of a population in a neutral Wright-Fisher diffusion process is well understood. In this paper we obtain a new interpretation of the coalescent genealogy of the population in a bridge from a time t∈(0,T). In a bridge with allele frequencies of 0 at times 0 and T the coalescence structure is that the population coalesces in two directions from t to 0 and t to T such that there is just one lineage of the allele under consideration at times 0 and T. The genealogy in Wright-Fisher diffusion bridges with selection is more complex than in the neutral model, but still with the property of the population branching and coalescing in two directions from time t∈(0,T). The density of the frequency of an allele at time t is expressed in a way that shows coalescence in the two directions. A new algorithm for exact simulation of a neutral Wright-Fisher bridge is derived. This follows from knowing the density of the frequency in a bridge and exact simulation from the Wright-Fisher diffusion. The genealogy of the neutral Wright-Fisher bridge is also modelled by branching Pólya urns, extending a representation in a Wright-Fisher diffusion. This is a new very interesting representation that relates Wright-Fisher bridges to classical urn models in a Bayesian setting.


Assuntos
Frequência do Gene , Genética Populacional , Modelos Genéticos , Algoritmos , Alelos , Teorema de Bayes , Simulação por Computador , Genealogia e Heráldica , Deriva Genética , Humanos , Mutação
9.
Theor Popul Biol ; 112: 126-138, 2016 12.
Artigo em Inglês | MEDLINE | ID: mdl-27594345

RESUMO

Duality plays an important role in population genetics. It can relate results from forwards-in-time models of allele frequency evolution with those of backwards-in-time genealogical models; a well known example is the duality between the Wright-Fisher diffusion for genetic drift and its genealogical counterpart, the coalescent. There have been a number of articles extending this relationship to include other evolutionary processes such as mutation and selection, but little has been explored for models also incorporating crossover recombination. Here, we derive from first principles a new genealogical process which is dual to a Wright-Fisher diffusion model of drift, mutation, and recombination. The process is reminiscent of the ancestral recombination graph, a widely-used multilocus genealogical model, but here ancestral lineages are typed and transition rates are regarded as being conditioned on an observed configuration at the leaves of the genealogy. Our approach is based on expressing a putative duality relationship between two models via their infinitesimal generators, and then seeking an appropriate test function to ensure the validity of the duality equation. This approach is quite general, and we use it to find dualities for several important variants, including both a discrete L-locus model of a gene and a continuous model in which mutation and recombination events are scattered along the gene according to continuous distributions. As an application of our results, we derive a series expansion for the transition function of the diffusion. Finally, we study in further detail the case in which mutation is absent. Then the dual process describes the dispersal of ancestral genetic material across the ancestors of a sample. The stationary distribution of this process is of particular interest; we show how duality relates this distribution to haplotype fixation probabilities. We develop an efficient method for computing such probabilities in multilocus models.


Assuntos
Frequência do Gene , Genética Populacional , Haplótipos , Recombinação Genética/genética , Deriva Genética , Modelos Genéticos
10.
Genetics ; 202(4): 1449-72, 2016 Apr.
Artigo em Inglês | MEDLINE | ID: mdl-26857628

RESUMO

Human immunodeficiency virus (HIV) is a rapidly evolving pathogen that causes chronic infections, so genetic diversity within a single infection can be very high. High-throughput "deep" sequencing can now measure this diversity in unprecedented detail, particularly since it can be performed at different time points during an infection, and this offers a potentially powerful way to infer the evolutionary dynamics of the intrahost viral population. However, population genomic inference from HIV sequence data is challenging because of high rates of mutation and recombination, rapid demographic changes, and ongoing selective pressures. In this article we develop a new method for inference using HIV deep sequencing data, using an approach based on importance sampling of ancestral recombination graphs under a multilocus coalescent model. The approach further extends recent progress in the approximation of so-called conditional sampling distributions, a quantity of key interest when approximating coalescent likelihoods. The chief novelties of our method are that it is able to infer rates of recombination and mutation, as well as the effective population size, while handling sampling over different time points and missing data without extra computational difficulty. We apply our method to a data set of HIV-1, in which several hundred sequences were obtained from an infected individual at seven time points over 2 years. We find mutation rate and effective population size estimates to be comparable to those produced by the software BEAST. Additionally, our method is able to produce local recombination rate estimates. The software underlying our method, Coalescenator, is freely available.


Assuntos
Variação Genética , Infecções por HIV/virologia , HIV-1/fisiologia , Algoritmos , Biologia Computacional/métodos , Simulação por Computador , Evolução Molecular , Genoma Viral , Haplótipos , Sequenciamento de Nucleotídeos em Larga Escala , Humanos , Modelos Genéticos , Modelos Estatísticos , Mutação , RNA Viral , Recombinação Genética , Seleção Genética
11.
Artigo em Inglês | MEDLINE | ID: mdl-27375350

RESUMO

Widely used models in genetics include the Wright-Fisher diffusion and its moment dual, Kingman's coalescent. Each has a multilocus extension but under neither extension is the sampling distribution available in closed-form, and their computation is extremely difficult. In this paper we derive two new multilocus population genetic models, one a diffusion and the other a coalescent process, which are much simpler than the standard models, but which capture their key properties for large recombination rates. The diffusion model is based on a central limit theorem for density dependent population processes, and we show that the sampling distribution is a linear combination of moments of Gaussian distributions and hence available in closed-form. The coalescent process is based on a probabilistic coupling of the ancestral recombination graph to a simpler genealogical process which exposes the leading dynamics of the former. We further demonstrate that when we consider the sampling distribution as an asymptotic expansion in inverse powers of the recombination parameter, the sampling distributions of the new models agree with the standard ones up to the first two orders.

12.
Genetics ; 196(1): 295-311, 2014 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-24214345

RESUMO

It is becoming routine to obtain data sets on DNA sequence variation across several thousands of chromosomes, providing unprecedented opportunity to infer the underlying biological and demographic forces. Such data make it vital to study summary statistics that offer enough compression to be tractable, while preserving a great deal of information. One well-studied summary is the site frequency spectrum-the empirical distribution, across segregating sites, of the sample frequency of the derived allele. However, most previous theoretical work has assumed that each site has experienced at most one mutation event in its genealogical history, which becomes less tenable for very large sample sizes. In this work we obtain, in closed form, the predicted frequency spectrum of a site that has experienced at most two mutation events, under very general assumptions about the distribution of branch lengths in the underlying coalescent tree. Among other applications, we obtain the frequency spectrum of a triallelic site in a model of historically varying population size. We demonstrate the utility of our formulas in two settings: First, we show that triallelic sites are more sensitive to the parameters of a population that has experienced historical growth, suggesting that they will have use if they can be incorporated into demographic inference. Second, we investigate a recently proposed alternative mechanism of mutation in which the two derived alleles of a triallelic site are created simultaneously within a single individual, and we develop a test to determine whether it is responsible for the excess of triallelic sites in the human genome.


Assuntos
Frequência do Gene , Genética Populacional/estatística & dados numéricos , Genoma Humano/genética , Modelos Genéticos , Alelos , Sequência de Bases , Biometria , Demografia , Variação Genética , Humanos , Mutação , Densidade Demográfica
13.
J Chem Phys ; 139(16): 164201, 2013 Oct 28.
Artigo em Inglês | MEDLINE | ID: mdl-24182022

RESUMO

We present a versatile new instrument capable of measuring rovibrational transition frequencies of molecular ions with sub-MHz accuracy and precision. A liquid-nitrogen cooled positive column discharge cell, which can produce large column densities of a wide variety of molecular ions, is probed with sub-Doppler spectroscopy enabled by a high-power optical parametric oscillator locked to a moderate finesse external cavity. Frequency modulation (heterodyne) spectroscopy is employed to reduce intensity fluctuations due to the cavity lock, and velocity modulation spectroscopy permits ion-neutral discrimination. The relatively narrow Lamb dips are precisely and accurately calibrated using an optical frequency comb. This method is completely general as it relies on the direct measurement of absorption or dispersion of rovibrational transitions. We expect that this new approach will open up many new possibilities: from providing new benchmarks for state-of-the-art ab initio calculations to supporting astronomical observations to helping assign congested spectra by combination differences. Herein, we describe the instrument in detail and demonstrate its performance by measuring ten R-branch transitions in the ν2 band of H3(+), two transitions in the ν1 band of HCO(+), and the first sub-Doppler transition of CH5(+).

14.
PLoS One ; 7(11): e46947, 2012.
Artigo em Inglês | MEDLINE | ID: mdl-23226196

RESUMO

Genetic exchange between isolated populations, or introgression between species, serves as a key source of novel genetic material on which natural selection can act. While detecting historical gene flow from DNA sequence data is of much interest, many existing methods can be limited by requirements for deep population genomic sampling. In this paper, we develop a scalable genealogy-based method to detect candidate signatures of gene flow into a given population when the source of the alleles is unknown. Our method does not require sequenced samples from the source population, provided that the alleles have not reached fixation in the sampled recipient population. The method utilizes recent advances in algorithms for the efficient reconstruction of ancestral recombination graphs, which encode genealogical histories of DNA sequence data at each site, and is capable of detecting the signatures of gene flow whose footprints are of length up to single genes. Further, we employ a theoretical framework based on coalescent theory to test for statistical significance of certain recombination patterns consistent with gene flow from divergent sources. Implementing these methods for application to whole-genome sequences of environmental yeast isolates, we illustrate the power of our approach to highlight loci with unusual recombination histories. By developing innovative theory and methods to analyze signatures of gene flow from population sequence data, our work establishes a foundation for the continued study of introgression and its evolutionary relevance.


Assuntos
Fluxo Gênico , Genealogia e Heráldica , Genoma Fúngico , Recombinação Genética , Proteínas de Saccharomyces cerevisiae/genética , Saccharomyces cerevisiae/genética , Algoritmos , Alelos , Mapeamento Cromossômico , Evolução Molecular , Loci Gênicos , Variação Genética , Genética Populacional , Modelos Genéticos , Filogenia , Saccharomyces cerevisiae/classificação , Proteínas de Saccharomyces cerevisiae/classificação , Seleção Genética , Análise de Sequência de DNA
15.
Stat Appl Genet Mol Biol ; 11(1): Article 9, 2012 Jan 06.
Artigo em Inglês | MEDLINE | ID: mdl-22499685

RESUMO

To extract full information from samples of DNA sequence data, it is necessary to use sophisticated model-based techniques such as importance sampling under the coalescent. However, these are limited in the size of datasets they can handle efficiently. Chen and Liu (2000) introduced the idea of stopping-time resampling and showed that it can dramatically improve the efficiency of importance sampling methods under a finite-alleles coalescent model. In this paper, a new framework is developed for designing stopping-time resampling schemes under more general models. It is implemented on data both from infinite sites and stepwise models of mutation, and extended to incorporate crossover recombination. A simulation study shows that this new framework offers a substantial improvement in the accuracy of likelihood estimation over a range of parameters, while a direct application of the scheme of Chen and Liu (2000) can actually diminish the estimate. The method imposes no additional computational burden and is robust to the choice of parameters.


Assuntos
Genética Populacional , Modelos Genéticos , Mutação , Recombinação Genética , Funções Verossimilhança
16.
PLoS Genet ; 8(12): e1003090, 2012.
Artigo em Inglês | MEDLINE | ID: mdl-23284288

RESUMO

Estimating fine-scale recombination maps of Drosophila from population genomic data is a challenging problem, in particular because of the high background recombination rate. In this paper, a new computational method is developed to address this challenge. Through an extensive simulation study, it is demonstrated that the method allows more accurate inference, and exhibits greater robustness to the effects of natural selection and noise, compared to a well-used previous method developed for studying fine-scale recombination rate variation in the human genome. As an application, a genome-wide analysis of genetic variation data is performed for two Drosophila melanogaster populations, one from North America (Raleigh, USA) and the other from Africa (Gikongoro, Rwanda). It is shown that fine-scale recombination rate variation is widespread throughout the D. melanogaster genome, across all chromosomes and in both populations. At the fine-scale, a conservative, systematic search for evidence of recombination hotspots suggests the existence of a handful of putative hotspots each with at least a tenfold increase in intensity over the background rate. A wavelet analysis is carried out to compare the estimated recombination maps in the two populations and to quantify the extent to which recombination rates are conserved. In general, similarity is observed at very broad scales, but substantial differences are seen at fine scales. The average recombination rate of the X chromosome appears to be higher than that of the autosomes in both populations, and this pattern is much more pronounced in the African population than the North American population. The correlation between various genomic features-including recombination rates, diversity, divergence, GC content, gene content, and sequence quality-is examined using the wavelet analysis, and it is shown that the most notable difference between D. melanogaster and humans is in the correlation between recombination and diversity.


Assuntos
Mapeamento Cromossômico , Drosophila melanogaster/genética , Genoma de Inseto , Recombinação Genética , África , Animais , Composição de Bases , Cromossomos/genética , Troca Genética , Evolução Molecular , Variação Genética , Metagenômica , América do Norte , Polimorfismo de Nucleotídeo Único
17.
Ann Appl Probab ; 22(2): 576-607, 2012 Apr 01.
Artigo em Inglês | MEDLINE | ID: mdl-23788830

RESUMO

For population genetics models with recombination, obtaining an exact, analytic sampling distribution has remained a challenging open problem for several decades. Recently, a new perspective based on asymptotic series has been introduced to make progress on this problem. Specifically, closed-form expressions have been derived for the first few terms in an asymptotic expansion of the two-locus sampling distribution when the recombination rate ρ is moderate to large. In this paper, a new computational technique is developed for finding the asymptotic expansion to an arbitrary order. Computation in this new approach can be automated easily. Furthermore, it is proved here that only a finite number of terms in the asymptotic expansion is needed to recover (via the method of Padé approximants) the exact two-locus sampling distribution as an analytic function of ρ; this function is exact for all values of ρ ∈ [0, ∞). It is also shown that the new computational framework presented here is flexible enough to incorporate natural selection.

18.
Theor Popul Biol ; 80(2): 158-73, 2011 Sep.
Artigo em Inglês | MEDLINE | ID: mdl-21550359

RESUMO

The sample frequency spectrum of a segregating site is the probability distribution of a sample of alleles from a genetic locus, conditional on observing the sample to be polymorphic. This distribution is widely used in population genetic inferences, including statistical tests of neutrality in which a skew in the observed frequency spectrum across independent sites is taken as a signature of departure from neutral evolution. Theoretical aspects of the frequency spectrum have been well studied and several interesting results are available, but they are usually under the assumption that a site has undergone at most one mutation event in the history of the sample. Here, we extend previous theoretical results by allowing for at most two mutation events per site, under a general finite allele model in which the mutation rate is independent of current allelic state but the transition matrix is otherwise completely arbitrary. Our results apply to both nested and nonnested mutations. Only the former has been addressed previously, whereas here we show it is the latter that is more likely to be observed except for very small sample sizes. Further, for any mutation transition matrix, we obtain the joint sample frequency spectrum of the two mutant alleles at a triallelic site, and derive a closed-form formula for the expected age of the younger of the two mutations given their frequencies in the population. Several large-scale resequencing projects for various species are presently under way and the resulting data will include some triallelic polymorphisms. The theoretical results described in this paper should prove useful in population genomic analyses of such data.


Assuntos
Fatores Etários , Alelos , Mutação , Análise de Elementos Finitos , Humanos
19.
J Comput Biol ; 18(1): 109-27, 2011 Jan.
Artigo em Inglês | MEDLINE | ID: mdl-21210733

RESUMO

Performing inference on contemporary samples of DNA sequence data is an important and challenging task. Computationally intensive methods such as importance sampling (IS) are attractive because they make full use of the available data, but in the presence of recombination the large state space of genealogies can be prohibitive. In this article, we make progress by developing an efficient IS proposal distribution for a two-locus model of sequence data. We show that the proposal developed here leads to much greater efficiency, outperforming existing IS methods that could be adapted to this model. Among several possible applications, the algorithm can be used to find maximum likelihood estimates for mutation and crossover rates, and to perform ancestral inference. We illustrate the method on previously reported sequence data covering two loci either side of the well-studied TAP2 recombination hotspot. The two loci are themselves largely non-recombining, so we obtain a gene tree at each locus and are able to infer in detail the effect of the hotspot on their joint ancestry. We summarize this joint ancestry by introducing the gene graph, a summary of the well-known ancestral recombination graph.


Assuntos
Desequilíbrio de Ligação , Modelos Genéticos , Algoritmos , Simulação por Computador , Humanos , Funções Verossimilhança , Cadeias de Markov , Recombinação Genética
20.
Ann Appl Probab ; 20(3): 1005-1028, 2010 Jun.
Artigo em Inglês | MEDLINE | ID: mdl-20671802

RESUMO

Ewens sampling formula (ESF) is a one-parameter family of probability distributions with a number of intriguing combinatorial connections. This elegant closed-form formula first arose in biology as the stationary probability distribution of a sample configuration at one locus under the infinite-alleles model of mutation. Since its discovery in the early 1970s, the ESF has been used in various biological applications, and has sparked several interesting mathematical generalizations. In the population genetics community, extending the underlying random-mating model to include recombination has received much attention in the past, but no general closed-form sampling formula is currently known even for the simplest extension, that is, a model with two loci. In this paper, we show that it is possible to obtain useful closed-form results in the case the population-scaled recombination rate ρ is large but not necessarily infinite. Specifically, we consider an asymptotic expansion of the two-locus sampling formula in inverse powers of ρ and obtain closed-form expressions for the first few terms in the expansion. Our asymptotic sampling formula applies to arbitrary sample sizes and configurations.

SELEÇÃO DE REFERÊNCIAS
DETALHE DA PESQUISA
...