*J Theor Biol ; 404: 342-347, 2016 09 07.*

##### ABSTRACT

Consensus trees and supertrees are regularly used in systematic biology in order to obtain a summary for the common agreement of the evolutionary relationships among a collection of phylogenetic trees (hierarchies). When every tree is defined on the same set of taxa then consensus functions are used, while if the trees are defined on different sets then supertree functions are used. For both of these situations we will consider some of the limitations that might arise from the placing of singularly reasonable and apparently innocuous conditions on the functions. Previous work is reviewed together with new material. In particular, we consider the impact of axioms requiring that the removal or addition of a tree that contains no, or no new, branching information should not affect the outcome.

##### Subjects

Biological Models, Phylogeny

*Syst Biol ; 60(2): 232-8, 2011 Mar.*

*Math Biosci ; 228(1): 10-5, 2010 Nov.*

##### ABSTRACT

The construction of a consensus tree to summarize the information of a given set of phylogenetic trees is now routinely a part of many studies in systematic biology. One popular method is the majority-rule consensus tree. In this paper we introduce and characterize a new consensus method that refines the majority-rule tree by adding certain compatible clusters satisfying a simple criterion.
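As a rough illustration of the majority-rule idea mentioned above, the following sketch keeps every cluster that occurs in more than half of the input trees. It assumes each tree is supplied as a set of clusters (frozensets of taxon names); the function name `majority_rule_clusters` and the toy data are hypothetical. The refinement introduced in the paper is not reproduced here, since the abstract does not state its criterion.

```python
from collections import Counter


def majority_rule_clusters(trees):
    """Return the clusters found in more than half of the input trees.

    Assumption: each tree is a set of clusters, and each cluster is a
    frozenset of taxon names.
    """
    # Count, for every cluster, the number of trees that contain it.
    counts = Counter(cluster for tree in trees for cluster in set(tree))
    # Keep only clusters with a strict majority.
    return {cluster for cluster, n in counts.items() if 2 * n > len(trees)}


# Toy profile of three trees on the taxa A, B, C (illustrative data only).
trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("BC"), frozenset("ABC")},
]
consensus = majority_rule_clusters(trees)
```

Here the cluster {A, B} appears in two of three trees and is kept, while {B, C} appears in only one and is dropped.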

##### Subjects

Phylogeny, Systems Biology/methods, Algorithms

*Algorithms Mol Biol ; 5: 2, 2010 Jan 04.*

##### ABSTRACT

BACKGROUND: Supertree methods combine the phylogenetic information from multiple partially-overlapping trees into a larger phylogenetic tree called a supertree. Several supertree construction methods have been proposed to date, but most of these are not designed with any specific properties in mind. Recently, Cotton and Wilkinson proposed extensions of the majority-rule consensus tree method to the supertree setting that inherit many of the appealing properties of the former. RESULTS: We study a variant of one of Cotton and Wilkinson's methods, called majority-rule (+) supertrees. After proving that a key underlying problem for constructing majority-rule (+) supertrees is NP-hard, we develop a polynomial-size exact integer linear programming formulation of the problem. We then present a data reduction heuristic that identifies smaller subproblems that can be solved independently. While this technique is not guaranteed to produce optimal solutions, it can achieve substantial problem-size reduction. Finally, we report on a computational study of our approach on various real data sets, including the 121-taxon, 7-tree Seabirds data set of Kennedy and Page. CONCLUSIONS: The results indicate that our exact method is computationally feasible for moderately large inputs. For larger inputs, our data reduction heuristic makes it feasible to tackle problems that are well beyond the range of the basic integer programming approach. Comparisons between the results obtained by our heuristic and exact solutions indicate that the heuristic produces good answers. Our results also suggest that the majority-rule (+) approach, in both its basic form and with data reduction, yields biologically meaningful phylogenies.

*J Theor Biol ; 253(2): 345-8, 2008 Jul 21.*

##### ABSTRACT

In phylogenetic systematics a problem of great practical and theoretical interest is to construct one or more large phylogenies (evolutionary trees), i.e., supertrees, from a given set of small phylogenies with overlapping sets of leaf labels. Although the methods being used to solve this problem are usually given plausible biological or theoretical justifications, occasionally it is possible to see that the result of a supertree method (SM) is explosive, and therefore logically meaningless, in the sense that it has been inferred from logical propositions that are contradictory. This paper presents the basic ideas and issues of how explosions affect the inference of rooted trees by SMs. We define the relevant concepts, give examples, and show how sometimes it is possible to identify hot spots in the input from which an SM may make explosive inferences that cannot be logically justified.

##### Subjects

Biological Evolution, Phylogeny, Animals, Genetic Models, Terminology as Topic

*Syst Biol ; 47(4): 604-16, 1998 Dec.*

##### ABSTRACT

Two qualitative taxonomic characters are potentially compatible if the states of each can be ordered into a character state tree in such a way that the two resulting character state trees are compatible. The number of potentially compatible pairs (NPCP) of qualitative characters from a data set may be considered to be a measure of its phylogenetic randomness. The value of NPCP depends on the number of evolutionary units (EUs), the number of characters, the number of states in the characters, the distributions of EUs among these states, and the amount and distribution of missing information and so does not directly indicate degree of phylogenetic randomness. Thus, for an observed data set, we used Monte Carlo methods to estimate the probability that a data set chosen equiprobably from among those identical (with respect to all the other above determining features) to the observed data set would have as high (or low) an NPCP as the observed data set. This probability, the realized significance of the observed NPCP, is attractive as an indication of phylogenetic randomness because it does not require the assumptions made by other such methods: No character state trees are assumed and consequently, only potential compatibility can be determined; no particular method of phylogenetic estimation is assumed; and no phylogenetic trees are constructed. We determined the values and significances of NPCP for analyses of 57 data sets taken from 53 published sources. All data sets from 37 of those sources exhibited realized significances of < 0.01, indicating high levels of phylogenetic nonrandomness. From each of the remaining 16 sources, at least one data set was more phylogenetically random. Inclusion of outgroups changed significance in some cases, but not always in the same direction. Data sets with significantly low NPCP may be consistent with an ancient hybrid origin (or other ancient polyphyletic gene exchange, crossing over, viral transfer, etc.) of the study group.

##### Subjects

Phylogeny, Random Allocation, Monte Carlo Method

*Math Biosci ; 123(2): 215-26, 1994 Oct.*

##### ABSTRACT

Let S be a set of n objects. A binary tree of S is a binary tree whose leaves are labeled without repetition from S. The operation of pruning a tree T is that of removing some leaves from T and suppressing all inner vertices of degree 2 which are formed by this deletion. Given two trees T and U, an agreement tree is a tree that can be obtained from T as well as from U by pruning the fewest number of leaves from the two trees. A quadratic algorithm is presented for doing this and two metrics are defined based on agreement trees.
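The pruning operation described above can be sketched as follows. This is only an illustration under stated assumptions: trees are rooted and written as nested tuples with string leaves, and the function name `prune` is hypothetical. It covers the pruning step alone, not the quadratic agreement-tree algorithm of the paper.

```python
def prune(tree, keep):
    """Restrict a nested-tuple tree to the leaves listed in `keep`.

    Assumption: leaves are strings and inner vertices are tuples of
    subtrees. Returns None when no leaf of the subtree survives.
    """
    if isinstance(tree, str):
        return tree if tree in keep else None
    # Prune each child and discard the subtrees that vanished entirely.
    children = [c for c in (prune(child, keep) for child in tree) if c is not None]
    if not children:
        return None
    if len(children) == 1:
        # Suppress an inner vertex left with a single child.
        return children[0]
    return tuple(children)


t = (("a", "b"), ("c", "d"))
pruned = prune(t, {"a", "c", "d"})
```

Removing leaf `b` leaves its parent with one child, so that vertex is suppressed and `a` attaches directly to the root.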

##### Subjects

Classification/methods, Mathematics, Algorithms, Phylogeny

*Comput Appl Biosci ; 9(6): 653-6, 1993 Dec.*

##### ABSTRACT

Although molecular biologists often calculate consensus sequences from aligned DNA or protein sequences, relatively little is known about the properties of many of the consensus methods being used. Consequently, we wrote a program, CONSENSUS, to analyze and compare methods of calculating a consensus result (a base, an ambiguity code or a subset of codes) at a position in an aligned set of molecular sequences. The program supports alphabets of up to four symbols (e.g. [R,Y] or [A,C,G,T]). The program's output makes it suitable for exploratory data analysis or for selecting values of thresholds or confidence levels in consensus methods having such parameters.

##### Subjects

Consensus Sequence, Software, Base Sequence, DNA/genetics, Proteins/genetics

*J Theor Biol ; 159(4): 481-9, 1992 Dec 21.*

##### ABSTRACT

We introduce a parameterized threshold consensus method (th chi) for molecular sequences which is based on a majority-rule voting principle. In contrast to other frequency-based methods, the th chi method uses a single criterion to return ambiguity codes of different lengths. We derive basic features of the method and establish that it returns at most two ambiguity codes at any position of the consensus sequence. We bound from below the size of the frequency gap that exists when the th chi method returns an ambiguity code. Using such properties, we compare the th chi method to other consensus methods for molecular sequences which are defined in terms of threshold or gap criteria.

##### Subjects

Base Sequence, Genetic Models, Consensus Sequence, Mathematics

*Bull Math Biol ; 54(6): 1057-68, 1992 Nov.*

##### Subjects

Consensus Sequence, Decision Theory, Genetic Models, Base Sequence, Mathematics

*Math Biosci ; 111(2): 231-47, 1992 Oct.*

##### ABSTRACT

Our goal is to help researchers interpret the results of a function, based on the concept of plurality rule, that calculates a consensus of a profile of molecular bases. By expressing the plurality rule function as a composition of simpler functions, we obtain both an algorithm to calculate the consensus result and an upper bound on the number of nonequivalent results. Consequently, when used to analyze molecular sequences such as DNA or RNA, the plurality rule function yields at most 48 nonequivalent consensus results. For problems of reasonable size, we describe an algorithm to calculate the probability that each consensus result would occur if the bases were equally likely to appear at every position of the plurality rule function's input profile.
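For orientation, a plain plurality rule can be sketched as below: it returns the set of bases that occur most often in a non-empty profile, with ties kept together. This is an assumption-laden simplification; the abstract does not define the function it analyzes, which may well be richer than this, and the name `plurality` is hypothetical.

```python
from collections import Counter


def plurality(profile):
    """Return the set of most frequent symbols in a non-empty profile.

    Assumption: `profile` is an iterable of base symbols such as
    "A", "C", "G", "T"; ties are returned as a set of several bases.
    """
    counts = Counter(profile)
    top = max(counts.values())
    return {base for base, n in counts.items() if n == top}


single = plurality("AACGT")  # one base is strictly most frequent
tied = plurality("AACC")     # two bases share the highest count
```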

##### Subjects

Consensus Sequence, Genetic Models, Algorithms, Base Sequence, Molecular Sequence Data, Probability

*Nucleic Acids Res ; 20(5): 1093-9, 1992 Mar 11.*

##### ABSTRACT

Consensus methods are recognized as valuable tools for data analysis, especially when some sort of data aggregation is desired. Although consensus methods for sequences play a vital role in molecular biology, researchers pay little heed to the features and limitations of such methods, and so there are risks that criteria for constructing consensus sequences will be misused or misunderstood. To understand better the issues involved, we conducted a critical comparison of nine consensus methods for sequences, of which eight were used in papers appearing in this journal. We report the results of that comparison, and we make recommendations which we hope will assist researchers when they must select particular consensus methods for particular applications.

##### Subjects

Consensus Sequence/genetics, Statistical Data Interpretation, Algorithms, Base Sequence/genetics, Evaluation Studies as Topic

*Bull Math Biol ; 53(5): 679-84, 1991.*

##### ABSTRACT

An axiomatic characterization is presented for consensus functions defined on weak hierarchies. These functions are generalizations of the majority rule consensus.

##### Subjects

Mathematics, Classification, Consensus Sequence

*J Math Biol ; 4(2): 195-200, 1977 May 23.*

##### ABSTRACT

A proof is given of a procedure that has previously appeared claiming to determine when two amino acid positions on a protein could both possibly be divergent taxonomic characters. An algorithm for executing this procedure is described.

##### Subjects

Amino Acid Sequence, Biological Evolution, Biological Models, Phylogeny

*Bull Math Biol ; 39(2): 133-8, 1977.*