Results 1 - 20 of 205
1.
Bull Math Biol ; 86(5): 45, 2024 Mar 22.
Article in English | MEDLINE | ID: mdl-38519704

ABSTRACT

Rooted binary galled trees generalize rooted binary trees to allow a restricted class of cycles, known as galls. We build upon the Wedderburn-Etherington enumeration of rooted binary unlabeled trees with n leaves to enumerate rooted binary unlabeled galled trees with n leaves, also enumerating rooted binary unlabeled galled trees with n leaves and g galls, $0 \leq g \leq \lfloor \frac{n-1}{2} \rfloor$. The enumerations rely on a recursive decomposition that considers subtrees descended from the nodes of a gall, adopting a restriction on galls that amounts to considering only the rooted binary normal unlabeled galled trees in our enumeration. We write an implicit expression for the generating function encoding the numbers of trees for all n. We show that the number of rooted binary unlabeled galled trees grows with $0.0779 (4.8230^n) n^{-3/2}$, exceeding the growth $0.3188 (2.4833^n) n^{-3/2}$ of the number of rooted binary unlabeled trees without galls. However, the growth of the number of galled trees with only one gall has the same exponential order 2.4833 as the number with no galls, exceeding it only in the subexponential term, $0.3910 n^{1/2}$ compared to $0.3188 n^{-3/2}$. For a fixed number of leaves n, the number of galls g that produces the largest number of rooted binary unlabeled galled trees lies intermediate between the minimum of $g = 0$ and the maximum of $g = \lfloor \frac{n-1}{2} \rfloor$. We discuss implications in mathematical phylogenetics.


Subjects
Mathematical Concepts , Biological Models , Plant Leaves/metabolism
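
The Wedderburn-Etherington enumeration that the abstract above builds upon can be illustrated with a short sketch. It uses the standard recurrence for counting rooted binary unlabeled trees by number of leaves; it does not attempt the galled-tree extension, and the function name is a hypothetical choice made for this illustration.

```python
def wedderburn_etherington(n_max):
    """Counts of rooted binary unlabeled trees with 1..n_max leaves.

    Standard recurrence (not the galled-tree extension): split the n leaves
    between the two root subtrees, counting each unordered pair once.
    """
    w = [0] * (n_max + 1)
    if n_max >= 1:
        w[1] = 1
    for n in range(2, n_max + 1):
        # Unequal splits i < n - i: the two subtrees are distinguishable by size.
        total = sum(w[i] * w[n - i] for i in range(1, (n + 1) // 2))
        if n % 2 == 0:
            # Equal split: unordered pairs (with repetition) of trees on n/2 leaves.
            half = w[n // 2]
            total += half * (half + 1) // 2
        w[n] = total
    return w


counts = wedderburn_etherington(10)
print(counts[1:])
```

The asymptotic expression quoted in the abstract for trees without galls describes how these counts grow as n becomes large.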
2.
iScience ; 27(2): 108831, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38323008

ABSTRACT

An "Arizona search" is an evaluation of the numbers of pairs of profiles in a forensic-genetic database that possess partial or complete genotypic matches; such a search assists in establishing the extent to which a set of loci provides unique identifications. In forensic genetics, however, the potential for performing Arizona searches is constrained by the limited availability of actual forensic profiles for research purposes. Here, we use genotype imputation to circumvent this problem. From a database of genomes, we impute genotypes of forensic short-tandem-repeat (STR) loci from neighboring single-nucleotide polymorphisms (SNPs), searching for partial STR matches using the imputed profiles. We compare the distributions of the numbers of partial matches in imputed and actual profiles, finding close agreement. Despite limited potential for performing Arizona searches with actual forensic STR profiles, the questions that such searches seek to answer can be posed with imputation-based Arizona searches in increasingly large SNP databases.

3.
Biosystems ; 237: 105153, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38417692

ABSTRACT

The Hill numbers are statistics for biodiversity measurement in ecological studies, closely related to the Rényi and Shannon entropies from information theory. Recent developments in the mathematics of diversity in the setting of population genetics have produced mathematical constraints that characterize how standard measures depend on the highest-frequency class in a discrete probability distribution. Here, we apply these constraints to diversity statistics in ecology, focusing on the Hill numbers and the Rényi and Shannon entropies. The mathematical bounds can shift perspectives on the diversities of communities, in that when upper and lower bounds on Hill numbers are evaluated in a classic butterfly example, Hill numbers that are initially larger in one community switch positions, so that associated normalized Hill numbers are instead smaller than those of the other community. The new bounds hence add to the tools available for interpreting a commonly used family of statistics for ecological data.


Subjects
Biodiversity , Entropy , Mathematics , Probability
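
For readers unfamiliar with the statistics named in the abstract above, the following minimal sketch computes a Hill number from a vector of relative abundances. It assumes the usual textbook definition, $D_q = (\sum_i p_i^q)^{1/(1-q)}$, with the $q = 1$ case taken as the exponential of the Shannon entropy; the abstract itself does not state the formula, and the function name is hypothetical.

```python
import math


def hill_number(p, q):
    """Hill number of order q for relative abundances p (assumed to sum to 1)."""
    p = [x for x in p if x > 0]  # classes with zero abundance do not contribute
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: exponential of the Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))


uneven = [0.7, 0.1, 0.1, 0.1]
for order in (0, 1, 2):
    print(order, hill_number(uneven, order))
```

For a perfectly even community with k classes, every order q gives the value k, which is why Hill numbers are often read as an "effective number" of classes.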
4.
Genetics ; 226(4)2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38289724

ABSTRACT

In a genetically admixed population, admixed individuals possess genealogical and genetic ancestry from multiple source groups. Under a mechanistic model of admixture, we study the number of distinct ancestors from the source populations that the admixture represents. Combining a mechanistic admixture model with a recombination model that describes the probability that a genealogical ancestor is a genetic ancestor, for a member of a genetically admixed population, we count genetic ancestors from the source populations: those genealogical ancestors from the source populations who contribute to the genome of the modern admixed individual. We compare patterns in the numbers of genealogical and genetic ancestors across the generations. To illustrate the enumeration of genetic ancestors from source populations in an admixed group, we apply the model to the African-American population, extending recent results on the numbers of African and European genealogical ancestors that contribute to the pedigree of an African-American chosen at random, so that we also evaluate the numbers of African and European genetic ancestors who contribute to random African-American genomes. The model suggests that the autosomal genome of a random African-American born in the interval 1960-1965 contains genetic contributions from a mean of 162 African (standard deviation 47, interquartile range 127-192) and 32 European ancestors (standard deviation 14, interquartile range 21-43). The enumeration of genetic ancestors can potentially be performed in other diploid species in which admixture and recombination models can be specified.


Subjects
Black or African American , Population Genetics , Humans , Black or African American/genetics , European People/genetics
5.
Bioinformatics ; 40(1)2024 01 02.
Article in English | MEDLINE | ID: mdl-38096585

ABSTRACT

MOTIVATION: In the mixed-membership unsupervised clustering analyses commonly used in population genetics, multiple replicate data analyses can differ in their clustering solutions. Combinatorial algorithms assist in aligning clustering outputs from multiple replicates so that clustering solutions can be interpreted and combined across replicates. Although several algorithms have been introduced, challenges exist in achieving optimal alignments and performing alignments in reasonable computation time. RESULTS: We present Clumppling, a method for aligning replicate solutions in mixed-membership unsupervised clustering. The method uses integer linear programming for finding optimal alignments, embedding the cluster alignment problem in standard combinatorial optimization frameworks. In example analyses, we find that it achieves solutions with preferred values of a desired objective function relative to those achieved by Pong and that it proceeds with less computation time than Clumpak. It is also the first method to permit alignments across replicates with multiple arbitrary values of the number of clusters K. AVAILABILITY AND IMPLEMENTATION: Clumppling is available at https://github.com/PopGenClustering/Clumppling.


Subjects
Linear Programming , Software , Algorithms , Population Genetics , Cluster Analysis
6.
G3 (Bethesda) ; 14(2)2024 Feb 07.
Article in English | MEDLINE | ID: mdl-37972246

ABSTRACT

Runs of homozygosity (ROH) and identity-by-descent (IBD) sharing can be studied in diploid coalescent models by noting that ROH and IBD-sharing at a genomic site are predicted to be inversely related to coalescence times, which in turn can be mathematically obtained in terms of parameters describing consanguinity rates. Comparing autosomal and X-chromosomal coalescent models, we consider ROH and IBD-sharing in relation to consanguinity that proceeds via multiple forms of first-cousin mating. We predict that across populations with different levels of consanguinity, (1) in a manner that is qualitatively parallel to the increase of autosomal IBD-sharing with autosomal ROH, X-chromosomal IBD-sharing increases with X-chromosomal ROH, owing to the dependence of both quantities on consanguinity levels; (2) even in the absence of consanguinity, X-chromosomal ROH and IBD-sharing levels exceed corresponding values for the autosomes, owing to the smaller population size and lower coalescence time for the X chromosome than for autosomes; (3) with matrilateral consanguinity, the relative increase in ROH and IBD-sharing on the X chromosome compared to the autosomes is greater than in the absence of consanguinity. Examining genome-wide SNPs in human populations for which consanguinity levels have been estimated, we find that autosomal and X-chromosomal ROH and IBD-sharing levels generally accord with the predictions. We find that each 1% increase in autosomal ROH is associated with an increase of 2.1% in X-chromosomal ROH, and each 1% increase in autosomal IBD-sharing is associated with an increase of 1.6% in X-chromosomal IBD-sharing. For each calculation, particularly for ROH, the estimate is reasonably close to the increase of 2% predicted by the population-size difference between autosomes and X chromosomes. The results support the utility of coalescent models for understanding patterns of genomic sharing and their dependence on sex-biased processes.


Subjects
Genome , Genomics , Humans , Consanguinity , Homozygote , X Chromosome , Single Nucleotide Polymorphism , Inbreeding
7.
Discrete Appl Math ; 343: 65-81, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38078045

ABSTRACT

To a given gene tree topology G and species tree topology S with leaves labeled bijectively from a fixed set X, one can associate a set of ancestral configurations, each of which encodes a set of gene lineages that can be found at a given node of the species tree. We introduce a lattice structure on ancestral configurations, studying the directed graphs that provide graphical representations of lattices of ancestral configurations. For a matching gene tree topology and species tree topology G=S, we present a method for defining the digraph of ancestral configurations from the tree topology by using iterated cartesian products of graphs. We show that a specific set of paths on the digraph of ancestral configurations is in bijection with the set of labeled histories - a well-known phylogenetic object that enumerates possible temporal orderings of the coalescences of a tree. For each of a series of tree families, we obtain closed-form expressions for the number of labeled histories by using this bijection to count paths on associated digraphs. Finally, we prove that our lattice construction extends to nonmatching tree pairs, and we use it to characterize pairs (G,S) having the maximal number of ancestral configurations for a fixed G. We discuss how the construction provides new methods for performing enumerations of combinatorial aspects of gene and species trees.

8.
Stat Appl Genet Mol Biol ; 22(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-38073574

ABSTRACT

Allele-sharing statistics for a genetic locus measure the dissimilarity between two populations as a mean of the dissimilarity between random pairs of individuals, one from each population. Owing to within-population variation in genotype, allele-sharing dissimilarities can have the property that they have a nonzero value when computed between a population and itself. We consider the mathematical properties of allele-sharing dissimilarities in a pair of populations, treating the allele frequencies in the two populations parametrically. Examining two formulations of allele-sharing dissimilarity, we obtain the distributions of within-population and between-population dissimilarities for pairs of individuals. We then mathematically explore the scenarios in which, for certain allele-frequency distributions, the within-population dissimilarity - the mean dissimilarity between randomly chosen members of a population - can exceed the dissimilarity between two populations. Such scenarios assist in explaining observations in population-genetic data that members of a population can be empirically more genetically dissimilar from each other on average than they are from members of another population. For a population pair, however, the mathematical analysis finds that at least one of the two populations always possesses smaller within-population dissimilarity than the value of the between-population dissimilarity. We illustrate the mathematical results with an application to human population-genetic data.


Subjects
Population Genetics , Humans , Alleles , Gene Frequency , Genotype
9.
Proc Biol Sci ; 290(2011): 20231634, 2023 Nov 29.
Article in English | MEDLINE | ID: mdl-37964528

ABSTRACT

The study of cultural evolution benefits from detailed analysis of cultural transmission in specific human domains. Chess provides a platform for understanding the transmission of knowledge due to its active community of players, precise behaviours and long-term records of high-quality data. In this paper, we perform an analysis of chess in the context of cultural evolution, describing multiple cultural factors that affect move choice. We then build a population-level statistical model of move choice in chess, based on the Dirichlet-multinomial likelihood, to analyse cultural transmission over decades of recorded games played by leading players. For moves made in specific positions, we evaluate the relative effects of frequency-dependent bias, success bias and prestige bias on the dynamics of move frequencies. We observe that negative frequency-dependent bias plays a role in the dynamics of certain moves, and that other moves are compatible with transmission under prestige bias or success bias. These apparent biases may reflect recent changes, namely the introduction of computer chess engines and online tournament broadcasts. Our analysis of chess provides insights into broader questions concerning how social learning biases affect cultural evolution.


Subjects
Social Learning , Humans , Statistical Models
10.
J Comput Graph Stat ; 32(3): 1145-1159, 2023.
Article in English | MEDLINE | ID: mdl-37982130

ABSTRACT

Mixed-membership unsupervised clustering is widely used to extract informative patterns from data in many application areas. For a shared data set, the stochasticity and unsupervised nature of clustering algorithms can cause difficulties in comparing clustering results produced by different algorithms, or even multiple runs of the same algorithm, as outcomes can differ owing to permutation of the cluster labels or genuine differences in clustering results. Here, with a focus on inference of individual genetic ancestry in population-genetic studies, we study the cost of misalignment of mixed-membership unsupervised clustering replicates under a theoretical model of cluster memberships. Using Dirichlet distributions to model membership coefficient vectors, we provide theoretical results quantifying the alignment cost as a function of the Dirichlet parameters and the Hamming permutation difference between replicates. For fixed Dirichlet parameters, the alignment cost is seen to increase with the Hamming distance between permutations. Data sets with low variance across individuals of membership coefficients for specific clusters generally produce high misalignment costs, so that a single optimal permutation has far lower cost than suboptimal permutations. Higher variability in data, as represented by greater variance of membership coefficients, generally results in alignment costs that are similar between the optimal permutation and suboptimal permutations. We demonstrate the application of the theoretical results to data simulated under the Dirichlet model, as well as to membership estimates from inference of human-genetic ancestry. The results can contribute to improving cluster alignment algorithms that seek to find optimal permutations of replicates.
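
The idea of aligning one clustering replicate to another by permuting cluster labels, discussed in the abstract above, can be sketched as follows. This is a brute-force illustration only: the squared-difference cost, the exhaustive search over permutations, and all names are assumptions made here, not the cost function or algorithm of the paper, and exhaustive search is practical only for a small number of clusters K.

```python
from itertools import permutations


def best_alignment(q_ref, q_rep):
    """Find the column permutation of q_rep that best matches q_ref.

    q_ref, q_rep: lists of membership-coefficient rows (one row per individual,
    K entries per row). Returns (perm, cost), where perm[j] is the column of
    q_rep matched to column j of q_ref.
    """
    k = len(q_ref[0])

    def cost(perm):
        # Assumed cost: sum of squared differences over individuals and clusters.
        return sum(
            (row_ref[j] - row_rep[perm[j]]) ** 2
            for row_ref, row_rep in zip(q_ref, q_rep)
            for j in range(k)
        )

    best = min(permutations(range(k)), key=cost)
    return best, cost(best)


reference = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
# Same memberships with cluster labels shuffled: columns reordered as (2, 0, 1).
replicate = [[row[2], row[0], row[1]] for row in reference]
print(best_alignment(reference, replicate))
```

In this toy case only one permutation reproduces the reference exactly, so it has far lower cost than every alternative, which mirrors the low-variance situation the abstract describes.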

11.
J Math Biol ; 87(5): 76, 2023 10 26.
Article in English | MEDLINE | ID: mdl-37884812

ABSTRACT

The measurement of diversity is a central component of studies in ecology and evolution, with broad uses spanning multiple biological scales. Studies of diversity conducted in population genetics and ecology make use of analogous concepts and even employ equivalent mathematical formulas. For the Shannon entropy statistic, recent developments in the mathematics of diversity in population genetics have produced mathematical constraints on the statistic in relation to the frequency of the most frequent allele. These results have characterized the ways in which standard measures depend on the highest-frequency class in a discrete probability distribution. Here, we extend mathematical constraints on the Shannon entropy in relation to entries in specific positions in a vector of species abundances, listed in decreasing order. We illustrate the new mathematical results using abundance data from examples involving coral reefs and sponge microbiomes. The new results update the understanding of the relationship of a standard measure to the abundance vectors from which it is calculated, potentially contributing to improved interpretation of numerical measurements of biodiversity.


Subjects
Ecology , Population Genetics , Biodiversity , Mathematics , Probability
12.
bioRxiv ; 2023 Oct 30.
Article in English | MEDLINE | ID: mdl-37808827

ABSTRACT

Humans constantly encounter new microbes, but few become long-term residents of the adult gut microbiome. Classical theories predict that colonization is determined by the availability of open niches, but it remains unclear whether other ecological barriers limit commensal colonization in natural settings. To disentangle these effects, we used a controlled perturbation with the antibiotic ciprofloxacin to investigate the dynamics of gut microbiome transmission in 22 households of healthy, cohabiting adults. Colonization was rare in three-quarters of antibiotic-taking subjects, whose resident strains rapidly recovered in the week after antibiotics ended. In contrast, the remaining antibiotic-taking subjects exhibited lasting responses, with extensive species losses and transient expansions of potential opportunistic pathogens. These subjects experienced elevated rates of commensal colonization, but only after long delays: many new colonizers underwent sudden, correlated expansions months after the antibiotic perturbation. Furthermore, strains that had previously transmitted between cohabiting partners rarely recolonized after antibiotic disruptions, showing that colonization displays substantial historical contingency. This work demonstrates that there remain substantial ecological barriers to colonization even after major microbiome disruptions, suggesting that dispersal interactions and priority effects limit the pace of community change.

13.
Eur J Hum Genet ; 31(11): 1283-1290, 2023 11.
Article in English | MEDLINE | ID: mdl-37567955

ABSTRACT

In many forensic settings, identity of a DNA sample is sought from poor-quality DNA, for which the typical STR loci tabulated in forensic databases are not possible to reliably genotype. Genome-wide SNPs, however, can potentially be genotyped from such samples via next-generation sequencing, so that queries can in principle compare SNP genotypes from DNA samples of interest to STR genotype profiles that represent proposed matches. We use genetic record-matching to evaluate the possibility of testing SNP profiles obtained from poor-quality DNA samples to identify exact and relatedness matches to STR profiles. Using simulations based on whole-genome sequences, we show that in some settings, similar match accuracies to those seen with full coverage of the genome are obtained by genetic record-matching for SNP data that represent 5-10% genomic coverage. Thus, if even a fraction of random genomic SNPs can be genotyped by next-generation sequencing, then the potential may exist to test the resulting genotype profiles for matches to profiles consisting exclusively of nonoverlapping STR loci. The result has implications in relation to criminal justice, mass disasters, missing-person cases, studies of ancient DNA, and genomic privacy.


Subjects
DNA Fingerprinting , Single Nucleotide Polymorphism , Humans , DNA Fingerprinting/methods , Microsatellite Repeats , Genotype , Genomics , DNA , High-Throughput Nucleotide Sequencing/methods
14.
Genetics ; 224(3)2023 Jul 06.
Article in English | MEDLINE | ID: mdl-37410594

ABSTRACT

Members of genetically admixed populations possess ancestry from multiple source groups, and studies of human genetic admixture frequently estimate ancestry components corresponding to fractions of individual genomes that trace to specific ancestral populations. However, the same numerical ancestry fraction can represent a wide array of admixture scenarios within an individual's genealogy. Using a mechanistic model of admixture, we consider admixture genealogically: how many ancestors from the source populations does the admixture represent? We consider African-Americans, for whom continent-level estimates produce a 75-85% value for African ancestry on average and 15-25% for European ancestry. Genetic studies together with key features of African-American demographic history suggest ranges for parameters of a simple three-epoch model. Considering parameter sets compatible with estimates of current ancestry levels, we infer that if all genealogical lines of a random African-American born during 1960-1965 are traced back until they reach members of source populations, the mean over parameter sets of the expected number of genealogical lines terminating with African individuals is 314 (interquartile range 240-376), and the mean of the expected number terminating in Europeans is 51 (interquartile range 32-69). Across discrete generations, the peak number of African genealogical ancestors occurs in birth cohorts from the early 1700s, and the probability exceeds 50% that at least one European ancestor was born more recently than 1835. Our genealogical perspective can contribute to further understanding the admixture processes that underlie admixed populations. For African-Americans, the results provide insight both on how many of the ancestors of a typical African-American might have been forcibly displaced in the Transatlantic Slave Trade and on how many separate European admixture events might exist in a typical African-American genealogy.


Subjects
Black People , Black or African American , Humans , Black People/genetics , Black or African American/genetics , Population Genetics
15.
Elife ; 12, 2023 04 25.
Article in English | MEDLINE | ID: mdl-37096877

ABSTRACT

From the 15th to the 19th century, the Trans-Atlantic Slave-Trade (TAST) influenced the genetic and cultural diversity of numerous populations. We explore genomic and linguistic data from the nine islands of Cabo Verde, the earliest European colony of the era in Africa, a major Slave-Trade platform between the 16th and 19th centuries, and a previously uninhabited location ideal for investigating early admixture events between Europeans and Africans. Using local-ancestry inference approaches, we find that genetic admixture in Cabo Verde occurred primarily between Iberian and certain Senegambian populations, although forced and voluntary migrations to the archipelago involved numerous other populations. Inter-individual genetic and linguistic variation recapitulates the geographic distribution of individuals' birth-places across Cabo Verdean islands, following an isolation-by-distance model with reduced genetic and linguistic effective dispersals within the archipelago, and suggesting that Kriolu language variants have developed together with genetic divergences at very reduced geographical scales. Furthermore, based on approximate Bayesian computation inferences of highly complex admixture histories, we find that admixture occurred early on each island, long before the 18th-century massive TAST deportations triggered by the expansion of the plantation economy in Africa and the Americas, and after this era mostly during the abolition of the TAST and of slavery in European colonial empires. Our results illustrate how shifting socio-cultural relationships between enslaved and non-enslaved communities during and after the TAST shaped enslaved-African descendants' genomic diversity and structure on both sides of the Atlantic.


Subjects
Enslaved Persons , Linguistics , Humans , Cabo Verde , Bayes Theorem , Africa , Genetic Variation , Population Genetics
16.
Genetics ; 224(2)2023 05 26.
Article in English | MEDLINE | ID: mdl-37075098

ABSTRACT

In studying allele-frequency variation across populations, it is often convenient to classify an allelic type as "rare," with nonzero frequency less than or equal to a specified threshold, "common," with a frequency above the threshold, or entirely unobserved in a population. When sample sizes differ across populations, however, especially if the threshold separating "rare" and "common" corresponds to a small number of observed copies of an allelic type, discreteness effects can lead a sample from one population to possess substantially more rare allelic types than a sample from another population, even if the two populations have extremely similar underlying allele-frequency distributions across loci. We introduce a rarefaction-based sample-size correction for use in comparing rare and common variation across multiple populations whose sample sizes potentially differ. We use our approach to examine rare and common variation in worldwide human populations, finding that the sample-size correction introduces subtle differences relative to analyses that use the full available sample sizes. We introduce several ways in which the rarefaction approach can be applied: we explore the dependence of allele classifications on subsample sizes, we permit more than two classes of allelic types of nonzero frequency, and we analyze rare and common variation in sliding windows along the genome. The results can assist in clarifying similarities and differences in allele-frequency patterns across populations.


Subjects
Genetic Variation , Humans , Gene Frequency
17.
Algorithms Mol Biol ; 18(1): 1, 2023 Feb 13.
Article in English | MEDLINE | ID: mdl-36782318

ABSTRACT

OBJECTIVE: In mathematical phylogenetics, a labeled rooted binary tree topology can possess any of a number of labeled histories, each of which represents a possible temporal ordering of its coalescences. Labeled histories appear frequently in calculations that describe the combinatorics of phylogenetic trees. Here, we generalize the concept of labeled histories from rooted phylogenetic trees to rooted phylogenetic networks, specifically for the class of rooted phylogenetic networks known as rooted galled trees. RESULTS: Extending a recursive algorithm for enumerating the labeled histories of a labeled tree topology, we present a method to enumerate the labeled histories associated with a labeled rooted galled tree. The method relies on a recursive decomposition by which each gall in a galled tree possesses three or more descendant subtrees. We exhaustively provide the numbers of labeled histories for all small galled trees, finding that each gall reduces the number of labeled histories relative to a specified galled tree that does not contain it. CONCLUSION: The results expand the set of structures for which labeled histories can be enumerated, extending a well-known calculation for phylogenetic trees to a class of phylogenetic networks.
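
As background for the abstract above, the well-known recursion for counting the labeled histories of a labeled rooted binary tree (without galls) can be sketched as follows. The nested-tuple tree encoding and the function name are assumptions made for this illustration, and the sketch does not cover the galled-tree extension that the paper introduces.

```python
from math import comb


def labeled_histories(tree):
    """Return (internal_nodes, labeled_history_count) for a rooted binary tree.

    A tree is either a leaf label (any non-tuple value) or a pair (left, right).
    The coalescences in the two root subtrees can be interleaved in
    C(i_left + i_right, i_left) ways, and the root coalescence comes last.
    """
    if not isinstance(tree, tuple):
        return 0, 1
    left, right = tree
    i_left, h_left = labeled_histories(left)
    i_right, h_right = labeled_histories(right)
    count = comb(i_left + i_right, i_left) * h_left * h_right
    return i_left + i_right + 1, count


caterpillar = ((("a", "b"), "c"), "d")
balanced = (("a", "b"), ("c", "d"))
print(labeled_histories(caterpillar)[1], labeled_histories(balanced)[1])
```

The four-leaf caterpillar admits a single temporal ordering of its coalescences, whereas the balanced four-leaf tree admits two, because its two cherries can coalesce in either order.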

18.
Article in English | MEDLINE | ID: mdl-36276878

ABSTRACT

High-dimensional datasets on cultural characters contribute to uncovering insights about factors that influence cultural evolution. Because cultural variation in part reflects descent processes with a hierarchical structure - including the descent of populations and vertical transmission of cultural traits - methods designed for hierarchically structured data have potential to find applications in the analysis of cultural variation. We adapt a network-based hierarchical clustering method for use in analysing cultural variation. Given a set of entities, the method constructs a similarity network, hierarchically depicting community structure among them. We illustrate the approach using four datasets: pronunciation variation in the US mid-Atlantic region, folklore variation in worldwide cultures, phonemic variation across worldwide languages and temporal variation in first names in the US. In these examples, the method provides insights into processes that affect cultural variation, uncovering geographic and other influences on observed patterns and cultural characters that make important contributions to them.

19.
Theor Popul Biol ; 147: 1-15, 2022 10.
Article in English | MEDLINE | ID: mdl-35973448

ABSTRACT

By providing additional opportunities for coalescence within families, the presence of consanguineous unions in a population reduces coalescence times relative to non-consanguineous populations. First-cousin consanguinity can take one of six forms differing in the configuration of sexes in the pedigree of the male and female cousins who join in a consanguineous union: patrilateral parallel, patrilateral cross, matrilateral parallel, matrilateral cross, bilateral parallel, and bilateral cross. Considering populations with each of the six types of first-cousin consanguinity individually and a population with a mixture of the four unilateral types, we examine coalescent models of consanguinity. We previously computed, for first-cousin consanguinity models, the mean coalescence time for X-chromosomal loci and the limiting distribution of coalescence times for autosomal loci. Here, we use the separation-of-time-scales approach to obtain the limiting distribution of coalescence times for X-chromosomal loci. This limiting distribution has an instantaneous coalescence probability that depends on the probability that a union is consanguineous; lineages that do not coalesce instantaneously coalesce according to an exponential distribution. We study the effects on the coalescence time distribution of the type of first-cousin consanguinity, showing that patrilateral-parallel and patrilateral-cross consanguinity have no effect on X-chromosomal coalescence time distributions and that matrilateral-parallel consanguinity decreases coalescence times to a greater extent than does matrilateral-cross consanguinity.


Subjects
Family , Marriage , Consanguinity , Female , Humans , Male , Pedigree
20.
G3 (Bethesda) ; 12(10)2022 09 30.
Article in English | MEDLINE | ID: mdl-35951748

ABSTRACT

Properties of gene genealogies such as tree height (H), total branch length (L), total lengths of external (E) and internal (I) branches, mean length of basal branches (B), and the underlying coalescence times (T) can be used to study population-genetic processes and to develop statistical tests of population-genetic models. Uses of tree features in statistical tests often rely on predictions that depend on pairwise relationships among such features. For genealogies under the coalescent, we provide exact expressions for Taylor approximations to expected values and variances of ratios $X_n/Y_n$, for all 15 pairs among the variables $\{H_n, L_n, E_n, I_n, B_n, T_k\}$, considering $n$ leaves and $2 \leq k \leq n$. For expected values of the ratios, the approximations match closely with empirical simulation-based values. The approximations to the variances are not as accurate, but they generally match simulations in their trends as $n$ increases. Although $E_n$ has expectation 2 and $H_n$ has expectation 2 in the limit as $n \to \infty$, the approximation to the limiting expectation for $E_n/H_n$ is not 1, instead equaling $\pi^2/3 - 2 \approx 1.28987$. The new approximations augment fundamental results in coalescent theory on the shapes of genealogical trees.


Subjects
Genetic Models , Motivation , Computer Simulation , Humans , Phylogeny
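
As a rough numerical companion to the abstract above, the sketch below evaluates two textbook expectations under the standard coalescent and the constant quoted for the limiting ratio. It assumes the usual time scaling in which the coalescence time $T_k$ is exponential with rate $k(k-1)/2$, so that its mean is $2/(k(k-1))$; this scaling and the function names are assumptions of the illustration and are not stated in the abstract, and the paper's Taylor approximations themselves are not reproduced.

```python
from fractions import Fraction
import math


def expected_height(n):
    """E[H_n]: sum of E[T_k] = 2 / (k (k - 1)) for k = 2..n, equal to 2 (1 - 1/n)."""
    return sum(Fraction(2, k * (k - 1)) for k in range(2, n + 1))


def expected_total_length(n):
    """E[L_n]: sum of k * E[T_k] = 2 / (k - 1) for k = 2..n."""
    return sum(Fraction(2, k - 1) for k in range(2, n + 1))


for n in (2, 10, 100):
    print(n, float(expected_height(n)), float(expected_total_length(n)))

# Constant quoted in the abstract for the approximate limit of E[E_n / H_n].
print(math.pi ** 2 / 3 - 2)
```

The expected height approaches 2 as n grows, matching the limiting expectation mentioned in the abstract, while the ratio of two quantities that each have expectation 2 need not have expectation 1.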