Results 1 - 20 of 206
1.
Cell ; 175(3): 848-858.e6, 2018 10 18.
Article in English | MEDLINE | ID: mdl-30318150

ABSTRACT

In familial searching in forensic genetics, a query DNA profile is tested against a database to determine whether it represents a relative of a database entrant. We examine the potential for using linkage disequilibrium to identify pairs of profiles as belonging to relatives when the query and database rely on nonoverlapping genetic markers. Considering data on individuals genotyped with both microsatellites used in forensic applications and genome-wide SNPs, we find that ∼30%-32% of parent-offspring pairs and ∼35%-36% of sib pairs can be identified from the SNPs of one member of the pair and the microsatellites of the other. The method suggests the possibility of performing familial searches of microsatellite databases using query SNP profiles, or vice versa. It also reveals that privacy concerns arising from computations across multiple databases that share no genetic markers in common entail risks, not only for database entrants, but for their close relatives as well.


Subjects
Family, Forensic Genetics/methods, Population Genetics/methods, Genotyping Techniques/methods, Single Nucleotide Polymorphism, Female, Humans, Linkage Disequilibrium, Male, Microsatellite Repeats, Genetic Models, Statistical Models, Pedigree
2.
Bioinformatics ; 40(1)2024 01 02.
Article in English | MEDLINE | ID: mdl-38096585

ABSTRACT

MOTIVATION: In the mixed-membership unsupervised clustering analyses commonly used in population genetics, multiple replicate data analyses can differ in their clustering solutions. Combinatorial algorithms assist in aligning clustering outputs from multiple replicates so that clustering solutions can be interpreted and combined across replicates. Although several algorithms have been introduced, challenges exist in achieving optimal alignments and performing alignments in reasonable computation time. RESULTS: We present Clumppling, a method for aligning replicate solutions in mixed-membership unsupervised clustering. The method uses integer linear programming for finding optimal alignments, embedding the cluster alignment problem in standard combinatorial optimization frameworks. In example analyses, we find that it achieves solutions with preferred values of a desired objective function relative to those achieved by Pong and that it proceeds with less computation time than Clumpak. It is also the first method to permit alignments across replicates with multiple arbitrary values of the number of clusters K. AVAILABILITY AND IMPLEMENTATION: Clumppling is available at https://github.com/PopGenClustering/Clumppling.


Subjects
Linear Programming, Software, Algorithms, Population Genetics, Cluster Analysis
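
The alignment problem described in this abstract can be illustrated with a small sketch. It is not Clumppling's integer-linear-programming formulation: it swaps in a brute-force search over column permutations, which is only practical for a small number of clusters K. The function name `align_clusters`, the squared-difference cost, and the toy membership matrices are all assumptions made for illustration.

```python
from itertools import permutations


def align_clusters(reference, replicate):
    """Find the column permutation of `replicate` that best matches `reference`.

    Both arguments are membership matrices (one row per individual, one
    column per cluster). The cost is the sum of squared differences, an
    assumed objective chosen for this sketch.
    """
    k = len(reference[0])
    best_perm, best_cost = None, float("inf")
    # Exhaustive search: K! permutations, so only feasible for small K.
    for perm in permutations(range(k)):
        cost = sum(
            (ref_row[j] - rep_row[perm[j]]) ** 2
            for ref_row, rep_row in zip(reference, replicate)
            for j in range(k)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost


# Toy data: the replicate is the reference with its cluster labels shuffled.
reference = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
replicate = [[0.1, 0.0, 0.9], [0.8, 0.1, 0.1], [0.2, 0.8, 0.0]]

perm, cost = align_clusters(reference, replicate)
# Reorder the replicate's columns so that cluster j lines up with the reference.
aligned = [[row[perm[j]] for j in range(len(perm))] for row in replicate]
```

Because the search is factorial in K, a sketch like this mainly shows what "alignment" means; a combinatorial-optimization formulation such as the one the abstract describes is what makes larger problems tractable.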
3.
Proc Natl Acad Sci U S A ; 119(13): e2111533119, 2022 03 29.
Article in English | MEDLINE | ID: mdl-35312358

ABSTRACT

Significance: California supports a high cultural and linguistic diversity of Indigenous peoples. In a partnership of researchers with the Muwekma Ohlone tribe, we studied genomes of eight present-day tribal members and 12 ancient individuals from two archaeological sites in the San Francisco Bay Area, spanning ∼2,000 years. We find that compared to genomes of Indigenous individuals from throughout the Americas, the 12 ancient individuals are most genetically similar to ancient individuals from Southern California, and that despite spanning a large time period, they share distinctive ancestry. This ancestry is also shared with present-day tribal members, providing evidence of genetic continuity between past and present Indigenous individuals in the region, in contrast to some popular reconstructions based on archaeological and linguistic information.


Subjects
Genomics, Indigenous Peoples, Archaeology, Ancient DNA, Population Genetics, Ancient History, Humans, Linguistics, San Francisco
4.
Stat Appl Genet Mol Biol ; 22(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-38073574

ABSTRACT

Allele-sharing statistics for a genetic locus measure the dissimilarity between two populations as a mean of the dissimilarity between random pairs of individuals, one from each population. Owing to within-population variation in genotype, allele-sharing dissimilarities can have the property that they have a nonzero value when computed between a population and itself. We consider the mathematical properties of allele-sharing dissimilarities in a pair of populations, treating the allele frequencies in the two populations parametrically. Examining two formulations of allele-sharing dissimilarity, we obtain the distributions of within-population and between-population dissimilarities for pairs of individuals. We then mathematically explore the scenarios in which, for certain allele-frequency distributions, the within-population dissimilarity - the mean dissimilarity between randomly chosen members of a population - can exceed the dissimilarity between two populations. Such scenarios assist in explaining observations in population-genetic data that members of a population can be empirically more genetically dissimilar from each other on average than they are from members of another population. For a population pair, however, the mathematical analysis finds that at least one of the two populations always possesses smaller within-population dissimilarity than the value of the between-population dissimilarity. We illustrate the mathematical results with an application to human population-genetic data.


Subjects
Population Genetics, Humans, Alleles, Gene Frequency, Genotype
5.
Bull Math Biol ; 86(5): 45, 2024 Mar 22.
Article in English | MEDLINE | ID: mdl-38519704

ABSTRACT

Rooted binary galled trees generalize rooted binary trees to allow a restricted class of cycles, known as galls. We build upon the Wedderburn-Etherington enumeration of rooted binary unlabeled trees with n leaves to enumerate rooted binary unlabeled galled trees with n leaves, also enumerating rooted binary unlabeled galled trees with n leaves and g galls, $0 \leq g \leq \lfloor \frac{n-1}{2} \rfloor$. The enumerations rely on a recursive decomposition that considers subtrees descended from the nodes of a gall, adopting a restriction on galls that amounts to considering only the rooted binary normal unlabeled galled trees in our enumeration. We write an implicit expression for the generating function encoding the numbers of trees for all n. We show that the number of rooted binary unlabeled galled trees grows with $0.0779(4.8230^n)n^{-3/2}$, exceeding the growth $0.3188(2.4833^n)n^{-3/2}$ of the number of rooted binary unlabeled trees without galls. However, the growth of the number of galled trees with only one gall has the same exponential order 2.4833 as the number with no galls, exceeding it only in the subexponential term, $0.3910 n^{1/2}$ compared to $0.3188 n^{-3/2}$. For a fixed number of leaves n, the number of galls g that produces the largest number of rooted binary unlabeled galled trees lies intermediate between the minimum of $g = 0$ and the maximum of $g = \lfloor \frac{n-1}{2} \rfloor$. We discuss implications in mathematical phylogenetics.


Subjects
Mathematical Concepts, Biological Models, Plant Leaves/metabolism
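
The Wedderburn-Etherington enumeration that this abstract builds on can be computed with a short recursion. The sketch below covers only the classical gall-free case (rooted binary unlabeled trees with n leaves), not the galled-tree enumeration developed in the paper; the function name `wedderburn_etherington` is chosen here for illustration.

```python
def wedderburn_etherington(max_n):
    """Return [U_1, ..., U_max_n], where U_n counts rooted binary
    unlabeled trees with n leaves (no galls)."""
    u = [0] * (max_n + 1)
    if max_n >= 1:
        u[1] = 1  # the single-leaf tree
    for n in range(2, max_n + 1):
        # Split the n leaves between two root subtrees of unequal sizes i < n - i.
        total = sum(u[i] * u[n - i] for i in range(1, (n - 1) // 2 + 1))
        if n % 2 == 0:
            # Equal-sized subtrees: count unordered pairs, repeats allowed.
            half = u[n // 2]
            total += half * (half + 1) // 2
        u[n] = total
    return u[1:]


counts = wedderburn_etherington(10)
print(counts)  # [1, 1, 1, 2, 3, 6, 11, 23, 46, 98]
```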
6.
PLoS Genet ; 17(2): e1009278, 2021 02.
Article in English | MEDLINE | ID: mdl-33630838

ABSTRACT

The prospect of utilizing CRISPR-based gene-drive technology for controlling populations has generated much excitement. However, the potential for spillovers of gene-drive alleles from the target population to non-target populations has raised concerns. Here, using mathematical models, we investigate the possibility of limiting spillovers to non-target populations by designing differential-targeting gene drives, in which the expected equilibrium gene-drive allele frequencies are high in the target population but low in the non-target population. We find that achieving differential targeting is possible with certain configurations of gene-drive parameters, but, in most cases, only under relatively low migration rates between populations. Under high migration, differential targeting is possible only in a narrow region of the parameter space. Because fixation of the gene drive in the non-target population could severely disrupt ecosystems, we outline possible ways to avoid this outcome. We apply our model to two potential applications of gene drives-field trials for malaria-vector gene drives and control of invasive species on islands. We discuss theoretical predictions of key requirements for differential targeting and their practical implications.


Subjects
Gene Drive Technology/methods, Gene Targeting/methods, Malaria/transmission, Alleles, Animals, CRISPR-Cas Systems, Ecosystem, Gene Frequency, Introduced Species/statistics & numerical data, Genetic Models, Theoretical Models, Rodents
7.
Discrete Appl Math ; 343: 65-81, 2024 Jan 30.
Article in English | MEDLINE | ID: mdl-38078045

ABSTRACT

To a given gene tree topology G and species tree topology S with leaves labeled bijectively from a fixed set X, one can associate a set of ancestral configurations, each of which encodes a set of gene lineages that can be found at a given node of the species tree. We introduce a lattice structure on ancestral configurations, studying the directed graphs that provide graphical representations of lattices of ancestral configurations. For a matching gene tree topology and species tree topology G=S, we present a method for defining the digraph of ancestral configurations from the tree topology by using iterated cartesian products of graphs. We show that a specific set of paths on the digraph of ancestral configurations is in bijection with the set of labeled histories - a well-known phylogenetic object that enumerates possible temporal orderings of the coalescences of a tree. For each of a series of tree families, we obtain closed-form expressions for the number of labeled histories by using this bijection to count paths on associated digraphs. Finally, we prove that our lattice construction extends to nonmatching tree pairs, and we use it to characterize pairs (G,S) having the maximal number of ancestral configurations for a fixed G. We discuss how the construction provides new methods for performing enumerations of combinatorial aspects of gene and species trees.

8.
Proc Biol Sci ; 290(2011): 20231634, 2023 Nov 29.
Article in English | MEDLINE | ID: mdl-37964528

ABSTRACT

The study of cultural evolution benefits from detailed analysis of cultural transmission in specific human domains. Chess provides a platform for understanding the transmission of knowledge due to its active community of players, precise behaviours and long-term records of high-quality data. In this paper, we perform an analysis of chess in the context of cultural evolution, describing multiple cultural factors that affect move choice. We then build a population-level statistical model of move choice in chess, based on the Dirichlet-multinomial likelihood, to analyse cultural transmission over decades of recorded games played by leading players. For moves made in specific positions, we evaluate the relative effects of frequency-dependent bias, success bias and prestige bias on the dynamics of move frequencies. We observe that negative frequency-dependent bias plays a role in the dynamics of certain moves, and that other moves are compatible with transmission under prestige bias or success bias. These apparent biases may reflect recent changes, namely the introduction of computer chess engines and online tournament broadcasts. Our analysis of chess provides insights into broader questions concerning how social learning biases affect cultural evolution.


Subjects
Social Learning, Humans, Statistical Models
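
The Dirichlet-multinomial likelihood named in this abstract has a standard closed form, sketched below. This is the textbook probability mass function only; how the paper parameterizes it in terms of frequency-dependent, success, or prestige bias is not stated in the abstract and is not modeled here. The function name and the example counts are assumptions for illustration.

```python
from math import lgamma, exp


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of observing `counts` (e.g. how often each candidate
    move was played) under a Dirichlet-multinomial with parameters `alpha`."""
    n = sum(counts)
    a = sum(alpha)
    # Normalizing term: n! * Gamma(a) / Gamma(n + a)
    log_p = lgamma(n + 1) + lgamma(a) - lgamma(n + a)
    # Per-category term: Gamma(x + alpha) / (x! * Gamma(alpha))
    for x, al in zip(counts, alpha):
        log_p += lgamma(x + al) - lgamma(x + 1) - lgamma(al)
    return log_p


# With two categories and alpha = (1, 1), every split of n = 3 observations
# is equally likely, so each of the 4 possible outcomes has probability 1/4.
p = exp(dirichlet_multinomial_logpmf([2, 1], [1.0, 1.0]))
```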
9.
J Math Biol ; 87(5): 76, 2023 10 26.
Article in English | MEDLINE | ID: mdl-37884812

ABSTRACT

The measurement of diversity is a central component of studies in ecology and evolution, with broad uses spanning multiple biological scales. Studies of diversity conducted in population genetics and ecology make use of analogous concepts and even employ equivalent mathematical formulas. For the Shannon entropy statistic, recent developments in the mathematics of diversity in population genetics have produced mathematical constraints on the statistic in relation to the frequency of the most frequent allele. These results have characterized the ways in which standard measures depend on the highest-frequency class in a discrete probability distribution. Here, we extend mathematical constraints on the Shannon entropy in relation to entries in specific positions in a vector of species abundances, listed in decreasing order. We illustrate the new mathematical results using abundance data from examples involving coral reefs and sponge microbiomes. The new results update the understanding of the relationship of a standard measure to the abundance vectors from which it is calculated, potentially contributing to improved interpretation of numerical measurements of biodiversity.


Subjects
Ecology, Population Genetics, Biodiversity, Mathematics, Probability
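
As a concrete reference point for this abstract, the sketch below computes the Shannon entropy of an abundance vector listed in decreasing order, together with two elementary bounds. The bounds shown are the familiar ones (entropy is at least the negative log of the largest frequency and at most the log of the number of classes); they are not the position-specific constraints derived in the paper, which the abstract does not state. The abundance values are invented for illustration.

```python
from math import log


def shannon_entropy(abundances):
    """Return (H, freqs): the Shannon entropy in nats and the relative
    abundances sorted in decreasing order."""
    total = sum(abundances)
    freqs = sorted((a / total for a in abundances if a > 0), reverse=True)
    entropy = -sum(p * log(p) for p in freqs)
    return entropy, freqs


# Hypothetical species counts, not data from the paper.
entropy, freqs = shannon_entropy([50, 30, 15, 5])

lower = -log(freqs[0])   # every p_i <= p_1, so H >= -ln(p_1)
upper = log(len(freqs))  # H is maximized by the uniform distribution
print(f"{lower:.4f} <= {entropy:.4f} <= {upper:.4f}")
```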
10.
Proc Natl Acad Sci U S A ; 117(46): 28876-28886, 2020 11 17.
Article in English | MEDLINE | ID: mdl-33139566

ABSTRACT

Genealogical tree modeling is essential for estimating evolutionary parameters in population genetics and phylogenetics. Recent mathematical results concerning ranked genealogies without leaf labels unlock opportunities in the analysis of evolutionary trees. In particular, comparisons between ranked genealogies facilitate the study of evolutionary processes of different organisms sampled at multiple time periods. We propose metrics on ranked tree shapes and ranked genealogies for lineages isochronously and heterochronously sampled. Our proposed tree metrics make it possible to conduct statistical analyses of ranked tree shapes and timed ranked tree shapes or ranked genealogies. Such analyses allow us to assess differences in tree distributions, quantify estimation uncertainty, and summarize tree distributions. We show the utility of our metrics via simulations and an application in infectious diseases.


Subjects
Population Genetics/methods, DNA Sequence Analysis/methods, Biological Evolution, Computer Simulation, Genetic Models, Pedigree, Phylogeny
11.
Genome Res ; 29(12): 2020-2033, 2019 12.
Article in English | MEDLINE | ID: mdl-31694865

ABSTRACT

Analysis of population structure in natural populations using genetic data is a common practice in ecological and evolutionary studies. With large genomic data sets of populations now appearing more frequently across the taxonomic spectrum, it is becoming increasingly possible to reveal many hierarchical levels of structure, including fine-scale genetic clusters. To analyze these data sets, methods need to be appropriately suited to the challenges of extracting multilevel structure from whole-genome data. Here, we present a network-based approach for constructing population structure representations from genetic data. The use of community-detection algorithms from network theory generates a natural hierarchical perspective on the representation that the method produces. The method is computationally efficient, and it requires relatively few assumptions regarding the biological processes that underlie the data. We show the approach by analyzing population structure in the model plant species Arabidopsis thaliana and in human populations. These examples illustrate how network-based approaches for population structure analysis are well-suited to extracting valuable ecological and evolutionary information in the era of large genomic data sets.


Subjects
Algorithms, Nucleic Acid Databases, Human Genome, Genomics, DNA Sequence Analysis, Humans
12.
Theor Popul Biol ; 143: 1-13, 2022 02.
Article in English | MEDLINE | ID: mdl-34757022

ABSTRACT

Gene genealogies are frequently studied by measuring properties such as their height (H), length (L), sum of external branches (E), sum of internal branches (I), and mean of their two basal branches (B), and the coalescence times that contribute to the other genealogical features (T). These tree properties and their relationships can provide insight into the effects of population-genetic processes on genealogies and genetic sequences. Here, under the coalescent model, we study the 15 correlations among pairs of features of genealogical trees: $H_n$, $L_n$, $E_n$, $I_n$, $B_n$, and $T_k$ for a sample of size n, with $2 \leq k \leq n$. We report high correlations among $H_n$, $L_n$, $I_n$, and $B_n$, with all pairwise correlations of these quantities having values greater than or equal to $\sqrt{6}[6\zeta(3)+6-\pi^2]/(\pi\sqrt{18+9\pi^2-\pi^4}) \approx 0.84930$ in the limit as $n \to \infty$, where $\zeta$ is the Riemann zeta function. Although $E_n$ has expectation 2 for all n and $H_n$ has expectation 2 in the $n \to \infty$ limit, their limiting correlation is 0. The results contribute toward understanding features of the shapes of coalescent trees.


Subjects
Population Genetics, Genetic Models, Pedigree, Phylogeny
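
The limiting lower bound on the correlations quoted in this abstract can be checked numerically. The sketch assumes the expression reads sqrt(6)[6ζ(3) + 6 − π²] / (π sqrt(18 + 9π² − π⁴)), with square roots on the leading 6 and on the denominator polynomial. That reading is an assumption about radical signs that plain-text extraction tends to drop; it is adopted here because it reproduces the stated value of about 0.84930, whereas the expression without radicals does not.

```python
from math import pi, sqrt

# Apery's constant, zeta(3), hard-coded to double precision.
ZETA_3 = 1.2020569031595942


def limiting_correlation_bound():
    """Evaluate sqrt(6)[6*zeta(3) + 6 - pi^2] / (pi * sqrt(18 + 9*pi^2 - pi^4))."""
    numerator = sqrt(6) * (6 * ZETA_3 + 6 - pi ** 2)
    denominator = pi * sqrt(18 + 9 * pi ** 2 - pi ** 4)
    return numerator / denominator


bound = limiting_correlation_bound()
print(round(bound, 5))
```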
13.
Theor Popul Biol ; 147: 1-15, 2022 10.
Article in English | MEDLINE | ID: mdl-35973448

ABSTRACT

By providing additional opportunities for coalescence within families, the presence of consanguineous unions in a population reduces coalescence times relative to non-consanguineous populations. First-cousin consanguinity can take one of six forms differing in the configuration of sexes in the pedigree of the male and female cousins who join in a consanguineous union: patrilateral parallel, patrilateral cross, matrilateral parallel, matrilateral cross, bilateral parallel, and bilateral cross. Considering populations with each of the six types of first-cousin consanguinity individually and a population with a mixture of the four unilateral types, we examine coalescent models of consanguinity. We previously computed, for first-cousin consanguinity models, the mean coalescence time for X-chromosomal loci and the limiting distribution of coalescence times for autosomal loci. Here, we use the separation-of-time-scales approach to obtain the limiting distribution of coalescence times for X-chromosomal loci. This limiting distribution has an instantaneous coalescence probability that depends on the probability that a union is consanguineous; lineages that do not coalesce instantaneously coalesce according to an exponential distribution. We study the effects on the coalescence time distribution of the type of first-cousin consanguinity, showing that patrilateral-parallel and patrilateral-cross consanguinity have no effect on X-chromosomal coalescence time distributions and that matrilateral-parallel consanguinity decreases coalescence times to a greater extent than does matrilateral-cross consanguinity.


Subjects
Family, Marriage, Consanguinity, Female, Humans, Male, Pedigree
14.
J Math Biol ; 84(6): 54, 2022 05 12.
Article in English | MEDLINE | ID: mdl-35552538

ABSTRACT

Evolutionary models used for describing molecular sequence variation suppose that at a non-recombining genomic segment, sequences share ancestry that can be represented as a genealogy-a rooted, binary, timed tree, with tips corresponding to individual sequences. Under the infinitely-many-sites mutation model, mutations are randomly superimposed along the branches of the genealogy, so that every mutation occurs at a chromosomal site that has not previously mutated; if a mutation occurs at an interior branch, then all individuals descending from that branch carry the mutation. The implication is that observed patterns of molecular variation from this model impose combinatorial constraints on the hidden state space of genealogies. In particular, observed molecular variation can be represented in the form of a perfect phylogeny, a tree structure that fully encodes the mutational differences among sequences. For a sample of n sequences, a perfect phylogeny might not possess n distinct leaves, and hence might be compatible with many possible binary tree structures that could describe the evolutionary relationships among the n sequences. Here, we investigate enumerative properties of the set of binary ranked and unranked tree shapes that are compatible with a perfect phylogeny, and hence, the binary ranked and unranked tree shapes conditioned on an observed pattern of mutations under the infinitely-many-sites mutation model. We provide a recursive enumeration of these shapes. We consider both perfect phylogenies that can be represented as binary and those that are multifurcating. The results have implications for computational aspects of the statistical inference of evolutionary parameters that underlie sets of molecular sequences.


Subjects
Biological Evolution, Genetic Models, Algorithms, Humans, Mutation, Phylogeny
15.
Mol Biol Evol ; 37(5): 1480-1494, 2020 05 01.
Article in English | MEDLINE | ID: mdl-31860090

ABSTRACT

A labeled gene tree topology that is more probable than the labeled gene tree topology matching a species tree is called "anomalous." Species trees that can generate such anomalous gene trees are said to be in the "anomaly zone." Here, probabilities of "unranked" and "ranked" gene tree topologies under the multispecies coalescent are considered. A ranked tree depicts not only the topological relationship among gene lineages, as an unranked tree does, but also the sequence in which the lineages coalesce. In this article, we study how the parameters of a species tree simulated under a constant-rate birth-death process can affect the probability that the species tree lies in the anomaly zone. We find that with more than five taxa, it is possible for species trees to have both anomalous unranked and ranked gene trees. The probability of being in either type of anomaly zone increases with more taxa. The probability of anomalous gene trees also increases with higher speciation rates. We observe that the probabilities of unranked anomaly zones are higher and grow much faster than those of ranked anomaly zones as the speciation rate increases. Our simulation shows that the most probable ranked gene tree is likely to have the same unranked topology as the species tree. We design the software PRANC, which computes probabilities of ranked gene tree topologies given a species tree under the coalescent model.


Subjects
Genetic Models, Phylogeny, Software, Computer Simulation, Proof of Concept Study
16.
Theor Popul Biol ; 139: 50-65, 2021 06.
Article in English | MEDLINE | ID: mdl-33675872

ABSTRACT

Recent modeling studies interested in runs of homozygosity (ROH) and identity by descent (IBD) have sought to connect these properties of genomic sharing to pairwise coalescence times. Here, we examine a variety of features of pairwise coalescence times in models that consider consanguinity. In particular, we extend a recent diploid analysis of mean coalescence times for lineage pairs within and between individuals in a consanguineous population to derive the variance of coalescence times, studying its dependence on the frequency of consanguinity and the kinship coefficient of consanguineous relationships. We also introduce a separation-of-time-scales approach that treats consanguinity models analogously to mathematically similar phenomena such as partial selfing, using this approach to obtain coalescence-time distributions. This approach shows that the consanguinity model behaves similarly to a standard coalescent, scaling population size by a factor 1-3c, where c represents the kinship coefficient of a randomly chosen mating pair. It provides the explanation for an earlier result describing mean coalescence time in the consanguinity model in terms of c. The results extend the potential to make predictions about ROH and IBD in relation to demographic parameters of diploid populations.


Subjects
Diploidy, Consanguinity, Homozygote, Humans, Population Density
17.
Theor Popul Biol ; 140: 32-43, 2021 08.
Article in English | MEDLINE | ID: mdl-33901539

ABSTRACT

Consanguineous unions increase the frequency at which identical genomic segments are inherited along separate paths of descent, decreasing coalescence times for pairs of alleles drawn from an individual who is the offspring of a consanguineous pair. For an autosomal locus, it has recently been shown that the mean time to the most recent common ancestor (TMRCA) for two alleles in the same individual and the mean TMRCA for two alleles in two separate individuals both decrease with increasing consanguinity in a population. Here, we extend this analysis to the X chromosome, considering X-chromosomal coalescence times under a coalescent model with diploid, male-female mating pairs. We examine four possible first-cousin mating schemes that are equivalent in their effects on autosomes, but that have differing effects on the X chromosome: patrilateral-parallel, patrilateral-cross, matrilateral-parallel, and matrilateral-cross. In each mating model, we calculate mean TMRCA for X-chromosomal alleles sampled either within or between individuals. We describe a consanguinity effect on X-chromosomal TMRCA that differs from the autosomal pattern under matrilateral but not under patrilateral first-cousin mating. For matrilateral first cousins, the effect of consanguinity in reducing TMRCA is stronger on the X chromosome than on the autosomes, with an increased effect of parallel-cousin mating compared to cross-cousin mating. The theoretical computations support the utility of the model in understanding patterns of genomic sharing on the X chromosome.


Subjects
Diploidy, Family, Alleles, Consanguinity, Female, Humans, Male, X Chromosome
18.
Hum Biol ; 92(3): 135-152, 2021 May.
Article in English | MEDLINE | ID: mdl-34057327

ABSTRACT

Recent studies have produced a variety of advances in the investigation of genetic similarities and differences among human populations. In this reprinted article, originally published in Human Biology in 2011 (vol. 83, no. 6, pp. 659-684), I pose a series of questions about human population-genetic similarities and differences, and I then answer these questions by numerical computation with a single shared population-genetic data set. The collection of answers obtained provides an introductory perspective for understanding key results on the features of worldwide human genetic variation. A new foreword discusses the original article in light of the research that has followed.


Subjects
Genetic Variation, Microsatellite Repeats, Alleles, Genetic Variation/genetics, Population Genetics, Humans
19.
Am J Phys Anthropol ; 175(2): 406-421, 2021 06.
Article in English | MEDLINE | ID: mdl-33772750

ABSTRACT

OBJECTIVES: In genetic admixture processes, source groups for an admixed population possess distinct patterns of genotype and phenotype at the onset of admixture. Particularly in the context of recent and ongoing admixture, such differences are sometimes taken to serve as markers of ancestry for individuals-that is, phenotypes initially associated with the ancestral background in one source population are assumed to continue to reflect ancestry in that population. Such phenotypes might possess ongoing significance in social categorizations of individuals, owing in part to perceived continuing correlations with ancestry. However, genotypes or phenotypes initially associated with ancestry in one specific source population have been seen to decouple from overall admixture levels, so that they no longer serve as proxies for genetic ancestry. Here, we aim to develop an understanding of the joint dynamics of admixture levels and phenotype distributions in an admixed population. METHODS: We devise a mechanistic model, consisting of an admixture model, a quantitative trait model, and a mating model. We analyze the behavior of the mechanistic model in relation to the model parameters. RESULTS: We find that it is possible for the decoupling of genetic ancestry and phenotype to proceed quickly, and that it occurs faster if the phenotype is driven by fewer loci. Positive assortative mating attenuates the process of dissociation relative to a scenario in which mating is random with respect to genetic admixture and with respect to phenotype. CONCLUSIONS: The mechanistic framework suggests that in an admixed population, a trait that initially differed between source populations might serve as a reliable proxy for ancestry for only a short time, especially if the trait is determined by few loci. 
It follows that a social categorization based on such a trait is increasingly uninformative about genetic ancestry and about other traits that differed between source populations at the onset of admixture.


Subjects
Gene Frequency/genetics, Population Genetics, Physical Anthropology, Female, Gene Flow/genetics, Human Genome/genetics, Genotype, Humans, Male, Phenotype, Skin Pigmentation/genetics
20.
Discrete Appl Math ; 291: 88-98, 2021 Mar 11.
Article in English | MEDLINE | ID: mdl-33364668

ABSTRACT

Colijn & Plazzotta (Syst. Biol. 67:113-126, 2018) introduced a scheme for bijectively associating the unlabeled binary rooted trees with the positive integers. First, the rank 1 is associated with the 1-leaf tree. Proceeding recursively, ordered pair $(k_1, k_2)$, $k_1 \geq k_2 \geq 1$, is then associated with the tree whose left subtree has rank $k_1$ and whose right subtree has rank $k_2$. Following dictionary order on ordered pairs, the tree whose left and right subtrees have the ordered pair of ranks $(k_1, k_2)$ is assigned rank $k_1(k_1 - 1)/2 + 1 + k_2$. With this ranking, given a number of leaves n, we determine recursions for $a_n$, the smallest rank assigned to some tree with n leaves, and $b_n$, the largest rank assigned to some tree with n leaves. The smallest rank $a_n$ is assigned to the maximally balanced tree, and the largest rank $b_n$ is assigned to the caterpillar. For n equal to a power of 2, the value of $a_n$ is seen to increase exponentially with $2\alpha^n$ for a constant $\alpha \approx 1.24602$; more generally, we show it is bounded $a_n < 1.5^n$. The value of $b_n$ is seen to increase with $2\beta^{(2^n)}$ for a constant $\beta \approx 1.05653$. The great difference in the rates of increase for $a_n$ and $b_n$ indicates that as the index v is incremented, the number of leaves for the tree associated with rank v quickly traverses a wide range of values. We interpret the results in relation to applications in evolutionary biology.
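
The ranking rule in this abstract is concrete enough to enumerate directly. The sketch below applies the stated rank formula, k1(k1 − 1)/2 + 1 + k2 with k1 ≥ k2, to every pair of subtrees and collects the ranks of all trees with a given number of leaves, then reads off the smallest and largest rank for each n. It is a brute-force illustration rather than the recursions derived in the paper, and the function names are chosen here for illustration.

```python
def cp_rank(k1, k2):
    """Rank of the tree whose subtrees have ranks k1 >= k2 >= 1."""
    return k1 * (k1 - 1) // 2 + 1 + k2


def ranks_by_leaf_count(max_n):
    """Map each leaf count n <= max_n to the set of ranks of all
    unlabeled binary rooted trees with n leaves."""
    ranks = {1: {1}}  # the 1-leaf tree has rank 1
    for n in range(2, max_n + 1):
        current = set()
        # Split the n leaves as i + (n - i) between the two root subtrees.
        for i in range(1, n // 2 + 1):
            for r1 in ranks[i]:
                for r2 in ranks[n - i]:
                    # The pair is ordered by rank, not by leaf count.
                    current.add(cp_rank(max(r1, r2), min(r1, r2)))
        ranks[n] = current
    return ranks


ranks = ranks_by_leaf_count(8)
smallest = [min(ranks[n]) for n in range(1, 9)]  # a_1, ..., a_8
largest = [max(ranks[n]) for n in range(1, 9)]   # b_1, ..., b_8
print(smallest)  # [1, 2, 3, 4, 6, 7, 10, 11]
print(largest)   # [1, 2, 3, 5, 12, 68, 2280, 2598062]
```

Even at n = 8 the largest rank already exceeds two million while the smallest is 11, which gives a feel for the very different growth rates of the two sequences that the abstract describes.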
