Results 1 - 15 of 15
1.
Theor Popul Biol ; 156: 93-102, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38367870

ABSTRACT

Given a labeled tree topology t, consider a population P of k leaves chosen among those of t. The clade of P is the minimal subtree of t containing P, and its size is the number of leaves in the clade. When t is selected under the Yule or uniform distribution among the labeled topologies of size n, we study the "clade size" random variable, determining closed formulas for its probability mass function, its mean, and its variance. Our calculations show that for large n the clade size tends to be smaller under the uniform model than under the Yule model, with a larger variability in the first scenario for values of k≥5. We apply our probability formulas to investigate set-theoretic relationships between the clades of two populations in a random tree, determining how likely it is that one clade is contained in, or equal to, the other. Our study relates to earlier calculations of the probability that, under the Yule model, the clade size of P equals the size of P (that is, the population P forms a monophyletic group), and it extends known results for the probability that the minimal (non-trivial) clade containing a random taxon has a given size.
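The clade-size variable described in this abstract can be explored numerically. The sketch below is not the authors' closed formulas. It assumes the standard fact that merging uniformly chosen pairs of lineages, backward in time, produces Yule-distributed labeled topologies, and it reads off the clade of P as the first merged cluster that contains all of P. The function names `yule_clade_size` and `monophyly_frequency` are hypothetical.

```python
import random


def yule_clade_size(n, population, rng):
    """Sample a Yule labeled topology on leaves 0..n-1; return the clade size of `population`."""
    target = frozenset(population)
    if not target or not target <= frozenset(range(n)):
        raise ValueError("population must be a non-empty subset of the n leaves")
    if len(target) == 1:
        return 1  # a single leaf is its own clade
    clusters = [frozenset([i]) for i in range(n)]
    while True:
        # Merge two lineages chosen uniformly at random (one coalescence step).
        i, j = rng.sample(range(len(clusters)), 2)
        merged = clusters[i] | clusters[j]
        if target <= merged:
            # First cluster containing all of P: this is the clade of P.
            return len(merged)
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)


def monophyly_frequency(n, k, trials, seed=0):
    """Monte Carlo estimate of P(clade size of P equals k) for P = {0, ..., k-1}."""
    rng = random.Random(seed)
    hits = sum(yule_clade_size(n, range(k), rng) == k for _ in range(trials))
    return hits / trials
```

For n = 4 and k = 2, a hand calculation over the three coalescence steps gives a monophyly probability of 2/9, so `monophyly_frequency(4, 2, 20000)` should land close to that value.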


Subjects
Genetic Drift, Genetic Models, Phylogeny, Likelihood Functions
2.
J Math Biol ; 84(6): 54, 2022 May 12.
Article in English | MEDLINE | ID: mdl-35552538

ABSTRACT

Evolutionary models used for describing molecular sequence variation suppose that at a non-recombining genomic segment, sequences share ancestry that can be represented as a genealogy: a rooted, binary, timed tree, with tips corresponding to individual sequences. Under the infinitely-many-sites mutation model, mutations are randomly superimposed along the branches of the genealogy, so that every mutation occurs at a chromosomal site that has not previously mutated; if a mutation occurs at an interior branch, then all individuals descending from that branch carry the mutation. The implication is that observed patterns of molecular variation from this model impose combinatorial constraints on the hidden state space of genealogies. In particular, observed molecular variation can be represented in the form of a perfect phylogeny, a tree structure that fully encodes the mutational differences among sequences. For a sample of n sequences, a perfect phylogeny might not possess n distinct leaves, and hence might be compatible with many possible binary tree structures that could describe the evolutionary relationships among the n sequences. Here, we investigate enumerative properties of the set of binary ranked and unranked tree shapes that are compatible with a perfect phylogeny, and hence, the binary ranked and unranked tree shapes conditioned on an observed pattern of mutations under the infinitely-many-sites mutation model. We provide a recursive enumeration of these shapes. We consider both perfect phylogenies that can be represented as binary and those that are multifurcating. The results have implications for computational aspects of the statistical inference of evolutionary parameters that underlie sets of molecular sequences.


Subjects
Biological Evolution, Genetic Models, Algorithms, Humans, Mutation, Phylogeny
3.
Theor Popul Biol ; 134: 92-105, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32485202

ABSTRACT

The Kingman coalescent process is a classical model of gene genealogies in population genetics. It generates Yule-distributed, binary ranked tree topologies (also called histories) with a finite number of n leaves, together with n-1 exponentially distributed time lengths: one for each layer of the history. Using a discrete approach, we study the lengths of the external branches of Yule-distributed histories, where the length of an external branch is defined as the rank of its parent node. We study the multiplicity of external branches of given length in a random history of n leaves. A correspondence between the external branches of the ordered histories of size n and the non-peak entries of the permutations of size n-1 provides easy access to the length distributions of the first and second longest external branches in a random Yule history and coalescent tree of size n. The length of the longest external branch is also studied as a function of the root balance of a random tree. As a practical application, we compare the observed and expected number of mutations on the longest external branches in samples from natural populations.


Subjects
Genetic Models, Trees, Population Genetics, Mutation, Phylogeny, Trees/genetics
4.
J Math Biol ; 79(4): 1205-1225, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31222377

ABSTRACT

A ranked tree topology is a tree topology with a temporal ordering of its coalescence events. Under the multispecies coalescent model, we consider ranked gene tree topologies realized along the branches of ranked species trees, where one gene copy is sampled for each species. Previous results have demonstrated that for almost all ranked species tree topologies with at least five species, there exists a set of branch lengths such that the maximally probable ranked gene tree topologies (those generated with the highest probability under the model) do not match the species tree ranked topology. Here, we focus on the agreement of a ranked species tree with its maximally probable ranked gene tree topologies in terms of their unranked topology, that is, disregarding the ordering of the coalescence events. We show that although the set of maximally probable ranked gene tree topologies for a ranked species tree can contain ranked trees with different unranked topologies, at least one of these maximal ranked gene tree topologies must have the same unranked topology as the species tree. Our results contribute to the study of the relationships between gene trees and species trees.


Subjects
Algorithms, Biological Evolution, Genes/genetics, Genetic Speciation, Genetic Models, Phylogeny, Animals, Humans
5.
Bull Math Biol ; 81(2): 384-407, 2019 Feb.
Article in English | MEDLINE | ID: mdl-28913585

ABSTRACT

An ancestral configuration is one of the combinatorially distinct sets of gene lineages that, for a given gene tree, can reach a given node of a specified species tree. Ancestral configurations have appeared in recursive algebraic computations of the conditional probability that a gene tree topology is produced under the multispecies coalescent model for a given species tree. For matching gene trees and species trees, we study the number of ancestral configurations, considered up to an equivalence relation introduced by Wu (Evolution 66:763-775, 2012) to reduce the complexity of the recursive probability computation. We examine the largest number of non-equivalent ancestral configurations possible for a given tree size n. Whereas the smallest number of non-equivalent ancestral configurations increases polynomially with n, we show that the largest number increases with [Formula: see text], where k is a constant that satisfies [Formula: see text]. Under a uniform distribution on the set of binary labeled trees with a given size n, the mean number of non-equivalent ancestral configurations grows exponentially with n. The results refine an earlier analysis of the number of ancestral configurations considered without applying the equivalence relation, showing that use of the equivalence relation does not alter the exponential nature of the increase with tree size.


Subjects
Genetic Models, Phylogeny, Algorithms, Computational Biology, Molecular Evolution, Genetic Speciation, Mathematical Concepts, Statistical Models, Probability
6.
Bull Math Biol ; 81(2): 452-493, 2019 Feb.
Article in English | MEDLINE | ID: mdl-29876842

ABSTRACT

The neighbor-joining algorithm for phylogenetic inference (NJ) has been seen to have three specific properties when applied to distance matrices that contain an admixed taxon: (1) antecedence of clustering, in which the admixed taxon agglomerates with one of its source taxa before the two source taxa agglomerate with each other; (2) intermediacy of distances, in which the distance on an inferred NJ tree between an admixed taxon and either of its source taxa is smaller than the distance between the two source taxa; and (3) intermediacy of path lengths, in which the number of edges separating the admixed taxon and either of its source taxa is less than or equal to the number of edges between the source taxa. We examine the behavior of neighbor-joining on distance matrices containing an admixed group, investigating the occurrence of antecedence of clustering, intermediacy of distances, and intermediacy of path lengths. We first mathematically predict the frequency with which the properties are satisfied for a labeled unrooted binary tree selected uniformly at random in the absence of admixture. We then introduce a taxon constructed by a linear admixture of distances from two source taxa, examining three admixture scenarios by simulation: a model in which distance matrices are chosen at random, a model in which an admixed taxon is added to a set of taxa that reflect treelike evolution, and a model that introduces a perturbation of the treelike scenario. In contrast to previous conjectures, we observe that the three properties are sometimes violated by distance matrices that include an admixed taxon. However, we also find that they are satisfied more often than is expected by chance when the distance matrix contains an admixed taxon, especially when evolution among the non-admixed taxa is treelike. The results contribute to a deeper understanding of the nature of evolutionary trees constructed from data that do not necessarily reflect a treelike evolutionary process.


Subjects
Algorithms, Phylogeny, Cluster Analysis, Computational Biology, Computer Simulation, Molecular Evolution, Mathematical Concepts, Genetic Models, Statistical Models, Probability
7.
J Math Biol ; 78(1-2): 155-188, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30116881

ABSTRACT

Compact coalescent histories are combinatorial structures that describe for a given gene tree G and species tree S possibilities for the numbers of coalescences of G that take place on the various branches of S. They have been introduced as a data structure for evaluating probabilities of gene tree topologies conditioning on species trees, reducing computation time compared to standard coalescent histories. When gene trees and species trees have a matching labeled topology [Formula: see text], the compact coalescent histories of t are encoded by particular integer labelings of the branches of t, each integer specifying the number of coalescent events of G present in a branch of S. For matching gene trees and species trees, we investigate enumerative properties of compact coalescent histories. We report a recursion for the number of compact coalescent histories for matching gene trees and species trees, using it to study the numbers of compact coalescent histories for small trees. We show that the number of compact coalescent histories equals the number of coalescent histories if and only if the labeled topology is a caterpillar or a bicaterpillar. The number of compact coalescent histories is seen to increase with tree imbalance: we prove that as the number of taxa n increases, the exponential growth of the number of compact coalescent histories follows [Formula: see text] in the case of caterpillar or bicaterpillar labeled topologies and approximately [Formula: see text] and [Formula: see text] for lodgepole and balanced topologies, respectively. We prove that the mean number of compact coalescent histories of a labeled topology of size n selected uniformly at random grows with [Formula: see text]. Our results contribute to the analysis of the computational complexity of algorithms for computing gene tree probabilities, and to the combinatorial study of gene trees and species trees more generally.


Subjects
Genetic Speciation, Genetic Models, Phylogeny, Algorithms, Computational Biology, Molecular Evolution, Population Genetics/statistics & numerical data, Mathematical Concepts, Probability
8.
J Comput Biol ; 24(9): 831-850, 2017 Sep.
Article in English | MEDLINE | ID: mdl-28437136

ABSTRACT

Given a gene tree and a species tree, ancestral configurations represent the combinatorially distinct sets of gene lineages that can reach a given node of the species tree. They have been introduced as a data structure for use in the recursive computation of the conditional probability under the multispecies coalescent model of a gene tree topology given a species tree, the cost of this computation being affected by the number of ancestral configurations of the gene tree in the species tree. For matching gene trees and species trees, we obtain enumerative results on ancestral configurations. We study ancestral configurations in balanced and unbalanced families of trees determined by a given seed tree, showing that for seed trees with more than one taxon, the number of ancestral configurations increases for both families exponentially in the number of taxa n. For fixed n, the maximal number of ancestral configurations tabulated at the species tree root node and the largest number of labeled histories possible for a labeled topology occur for trees with precisely the same unlabeled shape. For ancestral configurations at the root, the maximum increases with [Formula: see text], where [Formula: see text] is a quadratic recurrence constant. Under a uniform distribution over the set of labeled trees of given size, the mean number of root ancestral configurations grows with [Formula: see text] and the variance with ∼[Formula: see text]. The results provide a contribution to the combinatorial study of gene trees and species trees.


Subjects
Molecular Evolution, Genes, Genetic Models, Phylogeny, Algorithms, Animals, Genetic Speciation, Nucleic Acid Sequence Homology
9.
Article in English | MEDLINE | ID: mdl-26452289

ABSTRACT

Coalescent histories provide lists of species tree branches on which gene tree coalescences can take place, and their enumerative properties assist in understanding the computational complexity of calculations central in the study of gene trees and species trees. Here, we solve an enumerative problem left open by Rosenberg (IEEE/ACM Transactions on Computational Biology and Bioinformatics 10: 1253-1262, 2013) concerning the number of coalescent histories for gene trees and species trees with a matching labeled topology that belongs to a generic caterpillar-like family. By bringing a generating function approach to the study of coalescent histories, we prove that for any caterpillar-like family with seed tree t, the sequence (h_n)_{n ≥ 0} describing the number of matching coalescent histories of the nth tree of the family grows asymptotically as a constant multiple of the Catalan numbers. Thus, h_n ∼ β_t c_n, where the asymptotic constant β_t > 0 depends on the shape of the seed tree t. The result extends a claim demonstrated only for seed trees with at most eight taxa to arbitrary seed trees, expanding the set of cases for which detailed enumerative properties of coalescent histories can be determined. We introduce a procedure that computes from t the constant β_t as well as the algebraic expression for the generating function of the sequence (h_n)_{n ≥ 0}.


Subjects
Algorithms, Biological Evolution, Genetic Drift, Genetic Speciation, Genetic Models, Phylogeny, Computer Simulation
10.
J Comput Biol ; 22(10): 918-29, 2015 Oct.
Article in English | MEDLINE | ID: mdl-25973633

ABSTRACT

Coalescent histories are combinatorial structures that describe for a given gene tree and species tree the possible lists of branches of the species tree on which the gene tree coalescences take place. Properties of the number of coalescent histories for gene trees and species trees affect a variety of probabilistic calculations in mathematical phylogenetics. Exact and asymptotic evaluations of the number of coalescent histories, however, are known only in a limited number of cases. Here we introduce a particular family of species trees, the lodgepole species trees (λ_n)_{n ≥ 0}, in which tree λ_n has m = 2n+1 taxa. We determine the number of coalescent histories for the lodgepole species trees, in the case that the gene tree matches the species tree, showing that this number grows with m!! in the number of taxa m. This computation demonstrates the existence of tree families in which the growth in the number of coalescent histories is faster than exponential. Further, it provides a substantial improvement on the lower bound for the ratio of the largest number of matching coalescent histories to the smallest number of matching coalescent histories for trees with m taxa, increasing a previous bound of [Formula: see text] to [Formula: see text]. We discuss the implications of our enumerative results for phylogenetic computations.
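Counts of matching coalescent histories for small trees can be tabulated directly. The sketch below is not taken from this abstract: it assumes a standard recursion over m-extended histories for matching gene and species trees, and it assumes that the lodgepole tree λ_n is built by repeatedly joining the previous tree with a cherry, starting from a single taxon (consistent with m = 2n+1 taxa). The names `extended_histories`, `caterpillar`, and `lodgepole` are hypothetical.

```python
from functools import lru_cache
from math import comb

LEAF = "x"  # a one-taxon tree; internal nodes are (left, right) tuples


@lru_cache(maxsize=None)
def extended_histories(tree, m):
    """Number of m-extended coalescent histories for matching trees of shape `tree`."""
    if tree == LEAF:
        return 1
    left, right = tree
    # Assumed recursion: the root coalescence sits on one of m available branches,
    # which leaves k + 1 branches available to each of the two subtrees.
    return sum(
        extended_histories(left, k + 1) * extended_histories(right, k + 1)
        for k in range(1, m + 1)
    )


def coalescent_histories(tree):
    """Number of coalescent histories when gene tree and species tree match."""
    return extended_histories(tree, 1)


def caterpillar(n):
    """Caterpillar shape with n taxa."""
    tree = LEAF
    for _ in range(n - 1):
        tree = (tree, LEAF)
    return tree


def lodgepole(n):
    """Assumed lodgepole shape with 2n + 1 taxa: attach a cherry n times."""
    tree = LEAF
    for _ in range(n):
        tree = (tree, (LEAF, LEAF))
    return tree


def catalan(n):
    return comb(2 * n, n) // (n + 1)


if __name__ == "__main__":
    for n in range(1, 6):
        m = 2 * n + 1
        print(m, coalescent_histories(caterpillar(m)), coalescent_histories(lodgepole(n)))
```

Under these assumptions the caterpillar counts reproduce the Catalan numbers (exponential growth), which gives a baseline against which the faster lodgepole growth can be compared for equal numbers of taxa.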


Subjects
Computational Biology/methods, Phylogeny, Algorithms, Molecular Evolution, Genes, Genetic Models
11.
Opt Express ; 22(5): 5312-24, 2014 Mar 10.
Article in English | MEDLINE | ID: mdl-24663872

ABSTRACT

Monte Carlo Markovian models of a dual-mode semiconductor laser with quantum well (QW) or quantum dot (QD) active regions are proposed. Accounting for carriers and photons as particles that may exchange energy in the course of time allows an ab initio description of laser dynamics such as the mode competition and intrinsic laser noise. We used these models to evaluate the stability of the dual-mode regime when laser characteristics are varied: mode gains and losses, non-radiative recombination rates, intraband relaxation time, capture time in QD, transfer of excitation between QD via the wetting layer... As a major result, a possible steady-state dual-mode regime is predicted for specially designed QD semiconductor lasers, which thereby act as a CW microwave or terahertz-beating source, whereas it does not occur for QW lasers.

12.
Article in English | MEDLINE | ID: mdl-26357058

ABSTRACT

Analysis of probability distributions conditional on species trees has demonstrated the existence of anomalous ranked gene trees (ARGTs), ranked gene trees that are more probable than the ranked gene tree that accords with the ranked species tree. Here, to improve the characterization of ARGTs, we study enumerative and probabilistic properties of two classes of ranked labeled species trees, focusing on the presence or avoidance of certain subtree patterns associated with the production of ARGTs. We provide exact enumerations and asymptotic estimates for cardinalities of these sets of trees, showing that as the number of species increases without bound, the fraction of all ranked labeled species trees that are ARGT-producing approaches 1. This result extends beyond earlier existence results to provide a probabilistic claim about the frequency of ARGTs.


Subjects
Computational Biology/methods, Genetic Speciation, Genetic Models, Phylogeny
13.
Math Biosci ; 246(1): 139-47, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23994240

ABSTRACT

The Yule process generates a class of binary trees which is fundamental to population genetic models and other applications in evolutionary biology. In this paper, we introduce a family of sub-classes of ranked trees, called Ω-trees, which are characterized by imbalance of internal nodes. The degree of imbalance is defined by an integer ω ≥ 0. For caterpillars, the extreme case of unbalanced trees, ω = 0. Under models of neutral evolution, for instance the Yule model, trees with small ω are unlikely to occur by chance. Indeed, imbalance can be a signature of permanent selection pressure, such as that observable in the genealogies of certain pathogens. From a mathematical point of view it is interesting to observe that the space of Ω-trees maintains several statistical invariants although it is drastically reduced in size compared to the space of unconstrained Yule trees. Using generating functions, we study here some basic combinatorial properties of Ω-trees. We focus on the distribution of the number of subtrees with two leaves. We show that the expectation and variance of this distribution match those for unconstrained trees already for very small values of ω.


Subjects
Population Genetics/statistics & numerical data, Biological Models
14.
PLoS One ; 8(4): e60123, 2013.
Article in English | MEDLINE | ID: mdl-23593168

ABSTRACT

The coalescent with recombination is a fundamental model to describe the genealogical history of DNA sequence samples from recombining organisms. Considering recombination as a process which acts along genomes and which creates sequence segments with shared ancestry, we study the influence of single recombination events upon tree characteristics of the coalescent. We focus on properties such as tree height and tree balance and quantify analytically the changes in these quantities incurred by recombination in terms of probability distributions. We find that changes in tree topology are often relatively mild under conditions of neutral evolution, while changes in tree height are on average quite large. Our results add to a quantitative understanding of the spatial coalescent and provide the neutral reference to which the impact of other evolutionary scenarios, for instance tree distortion by selective sweeps, can be compared.


Subjects
Genetic Models, Genetic Recombination, Trees/anatomy & histology, Trees/genetics, Plant Roots/genetics, Probability
15.
Math Biosci ; 242(2): 195-200, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23396093

ABSTRACT

We consider exact enumerations and probabilistic properties of ranked trees when generated under the random coalescent process. Using a new approach, based on generating functions, we derive several statistics such as the exact probability of finding k cherries in a ranked tree of fixed size n. We then extend our method to consider also the number of pitchforks. We find a recursive formula to calculate the joint and conditional probabilities of cherries and pitchforks when the size of the tree is fixed. These results provide insights into structural properties of coalescent trees under the model of neutral evolution.
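The exact probability of finding k cherries can also be obtained without generating functions. The sketch below swaps in a different technique from the one named in the abstract: a forward recursion that grows the tree one leaf at a time by splitting a uniformly chosen leaf, which is assumed here to give the same distribution on ranked trees as the random coalescent process. Splitting a leaf that belongs to a cherry keeps the cherry count unchanged, and splitting any other leaf adds one cherry. The function names are hypothetical.

```python
from fractions import Fraction


def cherry_distribution(n):
    """Return {k: P(k cherries)} for a random ranked tree with n >= 2 leaves."""
    if n < 2:
        raise ValueError("need at least two leaves")
    dist = {1: Fraction(1)}  # a tree with two leaves is a single cherry
    for size in range(2, n):
        # Grow from `size` leaves to `size + 1` leaves.
        nxt = {}
        for k, p in dist.items():
            in_cherry = Fraction(2 * k, size)  # chance the split leaf lies in a cherry
            if in_cherry > 0:
                nxt[k] = nxt.get(k, 0) + p * in_cherry
            if in_cherry < 1:
                nxt[k + 1] = nxt.get(k + 1, 0) + p * (1 - in_cherry)
        dist = nxt
    return dist


def mean_and_variance(dist):
    """Exact mean and variance of a {value: probability} distribution."""
    mean = sum(k * p for k, p in dist.items())
    second = sum(k * k * p for k, p in dist.items())
    return mean, second - mean * mean
```

Because the arithmetic uses `Fraction`, the probabilities are exact; for n = 4 the recursion gives probability 2/3 for one cherry and 1/3 for two cherries.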


Subjects
Theoretical Models, Probability