Search | Biblioteca Virtual em Saúde Fiocruz

1.

How Well Does Your Phylogenetic Model Fit Your Data?

Shepherd, Daisy A; Klaere, Steffen.

Syst Biol ; 68(1): 157-167, 2019 01 01.

Article in English | MEDLINE | ID: mdl-30329125

ABSTRACT

The test for model-to-data fitness is a fundamental principle within the statistical sciences. The purpose of such a test is to assess whether the selected best-fitting model adequately describes the behavior in the data. Despite their broad application across many areas of statistics, goodness of fit tests for phylogenetic models have received much less attention than model selection methods in the last decade. At present a number of approaches have been suggested. However, these are often flawed, with problems ranging from the presence of systematic error in the models themselves to the difficulties presented by the nature of phylogenetic data. Ultimately these problems lead to an inadequate choice of statistic. This is one of the main reasons why goodness of fit assessment is often a neglected step within phylogenetic analysis. We argue not only for the necessity of these goodness of fit measures to test how well the model reflects the data, but additionally for the need for "useful" tests that explain why the model-to-data fit may be inadequate. Such tests are a critical part of the model building process, allowing the model to be adapted to provide a better model-to-data fit or to reject a model class outright due to such an inadequate fit that the intended use of the class may be compromised. Proposed and existing methods in both the maximum likelihood and Bayesian framework will be discussed here, whilst highlighting their strengths and limitations for assessing goodness of fit. The final section discusses some critical open statistical problems in goodness of fit assessment for this field, with the hope of encouraging more research into such a fundamental yet underdeveloped area of phylogenetic inference. [Bayesian phylogenetics; Goodness of fit; maximum likelihood; molecular phylogenetics; outlier detection; residual diagnostics.].

Subjects

Classification/methods; Models, Biological; Phylogeny; Data Interpretation, Statistical

2.

Model Selection and Parameter Inference in Phylogenetics Using Nested Sampling.

Russel, Patricio Maturana; Brewer, Brendon J; Klaere, Steffen; Bouckaert, Remco R.

Syst Biol ; 68(2): 219-233, 2019 03 01.

Article in English | MEDLINE | ID: mdl-29961836

ABSTRACT

Bayesian inference methods rely on numerical algorithms for both model selection and parameter inference. In general, these algorithms require a high computational effort to yield reliable estimates. One of the major challenges in phylogenetics is the estimation of the marginal likelihood. This quantity is commonly used for comparing different evolutionary models, but its calculation, even for simple models, incurs high computational cost. Another interesting challenge relates to the estimation of the posterior distribution. Often, long Markov chains are required to get sufficient samples to carry out parameter inference, especially for tree distributions. In general, these problems are addressed separately by using different procedures. Nested sampling (NS) is a Bayesian computation algorithm, which provides the means to estimate marginal likelihoods together with their uncertainties, and to sample from the posterior distribution at no extra cost. The methods currently used in phylogenetics for marginal likelihood estimation lack practicality because they depend on many tuning parameters and because, unlike NS, most implementations provide no direct way to calculate the uncertainties associated with the estimates. In this article, we introduce NS to phylogenetics. Its performance is analysed under different scenarios and compared to established methods. We conclude that NS is a competitive and attractive algorithm for phylogenetic inference. An implementation is available as a package for BEAST 2 under the LGPL licence, accessible at https://github.com/BEAST2-Dev/nested-sampling.

Subjects

Classification/methods; Models, Genetic; Phylogeny; Algorithms
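
The nested sampling idea summarised in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, not the BEAST 2 package: it assumes a single parameter with a Uniform(0, 1) prior and a Gaussian likelihood, and it replaces the constrained MCMC moves a phylogenetic implementation would need with plain rejection sampling from the prior. The function names, the number of live points, and the iteration count are all illustrative choices.

```python
import math
import random


def likelihood(theta, mu=0.5, sigma=0.1):
    """Toy Gaussian likelihood (an assumption made for this sketch)."""
    return math.exp(-0.5 * ((theta - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def nested_sampling(n_live=100, n_iter=600, seed=1):
    """Estimate the marginal likelihood and a posterior mean in one run."""
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]  # draws from the prior
    evidence = 0.0
    weighted = []  # (theta, unnormalised posterior weight)
    x_prev = 1.0   # prior mass still enclosed by the likelihood contour
    for i in range(1, n_iter + 1):
        # Discard the live point with the lowest likelihood.
        worst = min(range(n_live), key=lambda j: likelihood(live[j]))
        l_star = likelihood(live[worst])
        x_i = math.exp(-i / n_live)  # expected shrinkage of the prior mass
        weight = l_star * (x_prev - x_i)
        evidence += weight
        weighted.append((live[worst], weight))
        # Replace it with a prior draw that beats the current threshold
        # (rejection sampling; only practical for toy problems).
        while True:
            candidate = rng.random()
            if likelihood(candidate) > l_star:
                live[worst] = candidate
                break
        x_prev = x_i
    # Add the contribution of the remaining live points.
    for theta in live:
        weight = likelihood(theta) * x_prev / n_live
        evidence += weight
        weighted.append((theta, weight))
    posterior_mean = sum(t * w for t, w in weighted) / evidence
    return evidence, posterior_mean


evidence, posterior_mean = nested_sampling()
print(f"evidence ~ {evidence:.3f}, posterior mean ~ {posterior_mean:.3f}")
```

For this toy setup the true evidence is very close to 1 and the true posterior mean is 0.5, so the printed values give a rough check of the estimator. The same discarded points, reweighted, serve as posterior samples, which is the "no extra cost" property the abstract refers to.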

3.

Central control of fever and female body temperature by RANKL/RANK.

Hanada, Reiko; Leibbrandt, Andreas; Hanada, Toshikatsu; Kitaoka, Shiho; Furuyashiki, Tomoyuki; Fujihara, Hiroaki; Trichereau, Jean; Paolino, Magdalena; Qadri, Fatimunnisa; Plehm, Ralph; Klaere, Steffen; Komnenovic, Vukoslav; Mimata, Hiromitsu; Yoshimatsu, Hironobu; Takahashi, Naoyuki; von Haeseler, Arndt; Bader, Michael; Kilic, Sara Sebnem; Ueta, Yoichi; Pifl, Christian; Narumiya, Shuh; Penninger, Josef M.

Nature ; 462(7272): 505-9, 2009 Nov 26.

Article in English | MEDLINE | ID: mdl-19940926

ABSTRACT

Receptor-activator of NF-kappaB ligand (TNFSF11, also known as RANKL, OPGL, TRANCE and ODF) and its tumour necrosis factor (TNF)-family receptor RANK are essential regulators of bone remodelling, lymph node organogenesis and formation of a lactating mammary gland. RANKL and RANK are also expressed in the central nervous system. However, the functional relevance of RANKL/RANK in the brain was entirely unknown. Here we report that RANKL and RANK have an essential role in the brain. In both mice and rats, central RANKL injections trigger severe fever. Using tissue-specific Nestin-Cre and GFAP-Cre rank(floxed) deleter mice, the function of RANK in the fever response was genetically mapped to astrocytes. Importantly, Nestin-Cre and GFAP-Cre rank(floxed) deleter mice are resistant to lipopolysaccharide-induced fever as well as fever in response to the key inflammatory cytokines IL-1beta and TNFalpha. Mechanistically, RANKL activates brain regions involved in thermoregulation and induces fever via the COX2-PGE(2)/EP3R pathway. Moreover, female Nestin-Cre and GFAP-Cre rank(floxed) mice exhibit increased basal body temperatures, suggesting that RANKL and RANK control thermoregulation during normal female physiology. We also show that two children with RANK mutations exhibit impaired fever during pneumonia. These data identify an entirely novel and unexpected function for the key osteoclast differentiation factors RANKL/RANK in female thermoregulation and the central fever response in inflammation.

Subjects

Body Temperature Regulation/drug effects; Body Temperature Regulation/physiology; Fever/chemically induced; Fever/metabolism; RANK Ligand/pharmacology; Receptor Activator of Nuclear Factor-kappa B/metabolism; Sex Characteristics; Animals; Astrocytes/drug effects; Astrocytes/metabolism; Child; Dinoprostone/metabolism; Female; Fever/complications; Gene Expression Profiling; Humans; Injections, Intraventricular; Male; Mice; Mice, Inbred C57BL; Pneumonia/complications; Pneumonia/metabolism; RANK Ligand/administration & dosage; RANK Ligand/antagonists & inhibitors; RANK Ligand/metabolism; Rats; Rats, Wistar; Receptor Activator of Nuclear Factor-kappa B/genetics; Receptors, Prostaglandin E/metabolism; Receptors, Prostaglandin E, EP3 Subtype

4.

MISFITS: evaluating the goodness of fit between a phylogenetic model and an alignment.

Nguyen, Minh Anh Thi; Klaere, Steffen; von Haeseler, Arndt.

Mol Biol Evol ; 28(1): 143-52, 2011 Jan.

Article in English | MEDLINE | ID: mdl-20643866

ABSTRACT

As models of sequence evolution become more and more complicated, many criteria for model selection have been proposed, and tools are available to select the best model for an alignment under a particular criterion. However, in many instances the selected model fails to explain the data adequately as reflected by large deviations between observed pattern frequencies and the corresponding expectation. We present MISFITS, an approach to evaluate the goodness of fit (http://www.cibiv.at/software/misfits). MISFITS introduces a minimum number of "extra substitutions" on the inferred tree to provide a biologically motivated explanation why the alignment may deviate from expectation. These extra substitutions plus the evolutionary model then fully explain the alignment. We illustrate the method on several examples and then give a survey about the goodness of fit of the selected models to the alignments in the PANDIT database.

Subjects

Algorithms; Models, Genetic; Sequence Alignment/methods; Sequence Analysis, Protein/methods; Software; Animals; Base Sequence; DNA, Mitochondrial/analysis; DNA, Mitochondrial/genetics; Databases, Genetic; Evolution, Molecular; Humans; Likelihood Functions; Molecular Sequence Data; Phylogeny; Primates/genetics; Sequence Homology, Nucleic Acid
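
The abstract above describes misfit as large deviations between observed site-pattern frequencies and their expectation under the model. As a rough illustration of that comparison only (this is not the MISFITS procedure, which works with extra substitutions on the tree), the sketch below computes two generic discrepancy statistics, Pearson's chi-square and the likelihood-ratio G statistic. The pattern labels, counts, and expected probabilities are invented for the example.

```python
import math


def pattern_fit_statistics(observed_counts, expected_probs):
    """Return (Pearson X^2, G) for observed pattern counts vs. model expectations.

    Assumes every expected probability is positive and that every observed
    pattern also appears in `expected_probs`.
    """
    n_sites = sum(observed_counts.values())
    pearson = 0.0
    g_stat = 0.0
    for pattern, prob in expected_probs.items():
        obs = observed_counts.get(pattern, 0)
        exp = n_sites * prob
        pearson += (obs - exp) ** 2 / exp
        if obs > 0:  # terms with zero observations contribute nothing to G
            g_stat += 2.0 * obs * math.log(obs / exp)
    return pearson, g_stat


# Hypothetical two-taxon, two-state example: four possible site patterns.
expected = {"00": 0.4, "01": 0.1, "10": 0.1, "11": 0.4}
good_fit = {"00": 40, "01": 10, "10": 10, "11": 40}
poor_fit = {"00": 30, "01": 20, "10": 20, "11": 30}

pearson_good, g_good = pattern_fit_statistics(good_fit, expected)
pearson_poor, g_poor = pattern_fit_statistics(poor_fit, expected)
print(pearson_good, g_good)
print(pearson_poor, g_poor)
```

Statistics like these say that a model fits poorly but not why, which is the gap the abstract's "extra substitutions" are meant to fill.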

5.

The link between segregation and phylogenetic diversity.

Bryant, David; Klaere, Steffen.

J Math Biol ; 64(1-2): 149-62, 2012 Jan.

Article in English | MEDLINE | ID: mdl-21336622

ABSTRACT

We derive an invertible transform linking two widely used measures of species diversity: phylogenetic diversity and the expected proportions of segregating (non-constant) sites. We assume a bi-allelic (two-state), symmetric, finite site model of substitution. Like the Hadamard transform of Hendy and Penny, the transform can be expressed independently of the underlying phylogeny. Our results bridge work on diversity from two quite distinct scientific communities.

Subjects

Biodiversity; Models, Biological; Models, Statistical; Phylogeny

6.

Taxon Selection under Split Diversity.

Minh, Bui Quang; Klaere, Steffen; von Haeseler, Arndt.

Syst Biol ; 58(6): 586-94, 2009 Dec.

Article in English | MEDLINE | ID: mdl-20525611

ABSTRACT

The "phylogenetic diversity" (PD) measure of biodiversity is evaluated using a phylogenetic tree, usually inferred from morphological or molecular data. Consequently, it is vulnerable to errors in that tree, including those resulting from sampling error, model misspecification, or conflicting signals. To improve the robustness of PD, we can evaluate the measure using either a collection (or distribution) of trees or a phylogenetic network. Recently, it has been shown that these 2 approaches are equivalent but that the problem of maximizing PD in the general concept is NP-hard. In this study, we provide an efficient dynamic programming algorithm for maximizing PD when splits in the trees or network form a circular split system. We illustrate our method using a case study of game birds ("Galliformes") and discuss the different choices of taxa based on our approach and PD.

Subjects

Algorithms; Biodiversity; Classification/methods; Computational Biology/methods; Models, Genetic; Phylogeny; Animals; Conservation of Natural Resources/methods; Galliformes/genetics; Research Design
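
To make the split-based measure in the abstract above concrete, the sketch below scores a taxon set by summing the weights of all splits that separate at least two of the chosen taxa, and then finds the best set of a given size by exhaustive search. The exhaustive search is a stand-in chosen for brevity, not the dynamic programming algorithm of the paper, and the four-taxon split system and its weights are invented for illustration.

```python
from itertools import combinations


def split_diversity(splits, taxa):
    """Sum the weights of splits that have selected taxa on both sides.

    Each split is stored as (one_side, weight); the other side is taken to be
    every remaining taxon.
    """
    taxa = set(taxa)
    return sum(w for side, w in splits if taxa & side and taxa - side)


def best_subset(splits, all_taxa, k):
    """Brute-force search over all k-subsets (feasible only for small inputs)."""
    return max(combinations(sorted(all_taxa), k),
               key=lambda subset: split_diversity(splits, subset))


# Hypothetical circular split system on taxa a, b, c, d (circular order a-b-c-d).
# The two non-trivial splits are incompatible, so this is a network, not a tree.
all_taxa = {"a", "b", "c", "d"}
splits = [
    (frozenset({"a"}), 1.0),
    (frozenset({"b"}), 1.0),
    (frozenset({"c"}), 1.5),
    (frozenset({"d"}), 1.0),
    (frozenset({"a", "b"}), 2.0),
    (frozenset({"b", "c"}), 0.5),
]

chosen = best_subset(splits, all_taxa, 2)
print(chosen, split_diversity(splits, chosen))
```

When the split system is compatible (a tree), this score reduces to the usual phylogenetic diversity of the chosen taxa.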

7.

Pre-fermentation fining effects on the aroma chemistry of Marlborough Sauvignon blanc press fractions.

Parish, Katie J; Herbst-Johnstone, Mandy; Bouda, Flo; Klaere, Steffen; Fedrizzi, Bruno.

Food Chem ; 208: 326-35, 2016 Oct 01.

Article in English | MEDLINE | ID: mdl-27132857

ABSTRACT

In the wine industry, fining agents are commonly used with many choices now commercially available. Here the influence of pre-fermentation fining on wine aroma chemistry has been explored. Free run and press fraction Sauvignon blanc juices from two vineyards were fined using gelatin, activated carbon, polyvinylpolypyrrolidone (PVPP) and a combination agent which included bentonite, PVPP and isinglass. Over thirty aroma compounds were quantified in the experimental wines. Results showed that activated carbon fining led to a significant (p<0.05) concentration decrease of hexan-1-ol and linalool in the experimental wines when compared to a control, consistent across all vineyard and fraction combinations. Other aroma compounds were also influenced by fining agent, even if vineyards and press fractions played a crucial role. This study confirmed that fining agents used pre-fermentation can influence wine aroma profiles and therefore need specific tailoring to the style and origin of the grape.

Subjects

Fermentation; Flavoring Agents/chemistry; Food Handling/methods; Smell; Vitis/chemistry; Wine/analysis; Acyclic Monoterpenes; Hexanols/analysis; Monoterpenes/analysis

8.

Regional microbial signatures positively correlate with differential wine phenotypes: evidence for a microbial aspect to terroir.

Knight, Sarah; Klaere, Steffen; Fedrizzi, Bruno; Goddard, Matthew R.

Sci Rep ; 5: 14233, 2015 Sep 24.

Article in English | MEDLINE | ID: mdl-26400688

ABSTRACT

Many crops display differential geographic phenotypes and sensorial signatures, encapsulated by the concept of terroir. The drivers behind these differences remain elusive, and the potential contribution of microbes has been ignored until recently. Significant genetic differentiation between microbial communities and populations from different geographic locations has been demonstrated, but crucially it has not been shown whether this correlates with differential agricultural phenotypes or not. Using wine as a model system, we utilize the regionally genetically differentiated population of Saccharomyces cerevisiae in New Zealand and objectively demonstrate that these populations differentially affect wine phenotype, which is driven by a complex mix of chemicals. These findings reveal the importance of microbial populations for the regional identity of wine, and potentially extend to other important agricultural commodities. Moreover, this suggests that long-term implementation of methods maintaining differential biodiversity may have tangible economic imperatives as well as being desirable in terms of employing agricultural practices that increase responsible environmental stewardship.

Subjects

Biodiversity; Phenotype; Saccharomyces cerevisiae; Wine; Analysis of Variance; Fermentation; Genotype; New Zealand; Saccharomyces cerevisiae/classification; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Volatile Organic Compounds

9.

Split diversity in constrained conservation prioritization using integer linear programming.

Chernomor, Olga; Minh, Bui Quang; Forest, Félix; Klaere, Steffen; Ingram, Travis; Henzinger, Monika; von Haeseler, Arndt.

Methods Ecol Evol ; 6(1): 83-91, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25893087

ABSTRACT

Phylogenetic diversity (PD) is a measure of biodiversity based on the evolutionary history of species. Here, we discuss several optimization problems related to the use of PD, and the more general measure split diversity (SD), in conservation prioritization. Depending on the conservation goal and the information available about species, one can construct optimization routines that incorporate various conservation constraints. We demonstrate how this information can be used to select sets of species for conservation action. Specifically, we discuss the use of species' geographic distributions, the choice of candidates under economic pressure, and the use of predator-prey interactions between the species in a community to define viability constraints. Despite such optimization problems falling into the area of NP-hard problems, it is possible to solve them in a reasonable amount of time using integer programming. We apply integer linear programming to a variety of models for conservation prioritization that incorporate the SD measure. As examples, we show the results for two data sets: the Cape region of South Africa and a Caribbean coral reef community. Finally, we provide user-friendly software at http://www.cibiv.at/software/pda.

10.

ObStruct: a method to objectively analyse factors driving population structure using Bayesian ancestry profiles.

Gayevskiy, Velimir; Klaere, Steffen; Knight, Sarah; Goddard, Matthew R.

PLoS One ; 9(1): e85196, 2014.

Article in English | MEDLINE | ID: mdl-24416362

ABSTRACT

Bayesian inference methods are extensively used to detect the presence of population structure given genetic data. The primary output of software implementing these methods are ancestry profiles of sampled individuals. While these profiles robustly partition the data into subgroups, currently there is no objective method to determine whether the fixed factor of interest (e.g. geographic origin) correlates with inferred subgroups or not, and if so, which populations are driving this correlation. We present ObStruct, a novel tool to objectively analyse the nature of structure revealed in Bayesian ancestry profiles using established statistical methods. ObStruct evaluates the extent of structural similarity between sampled and inferred populations, tests the significance of population differentiation, provides information on the contribution of sampled and inferred populations to the observed structure and crucially determines whether the predetermined factor of interest correlates with inferred population structure. Analyses of simulated and experimental data highlight ObStruct's ability to objectively assess the nature of structure in populations. We show the method is capable of capturing an increase in the level of structure with increasing time since divergence between simulated populations. Further, we applied the method to a highly structured dataset of 1,484 humans from seven continents and a less structured dataset of 179 Saccharomyces cerevisiae from three regions in New Zealand. Our results show that ObStruct provides an objective metric to classify the degree, drivers and significance of inferred structure, as well as providing novel insights into the relationships between sampled populations, and adds a final step to the pipeline for population structure analyses.

Subjects

Models, Genetic; Population Dynamics/statistics & numerical data; Racial Groups/genetics; Software; Bayes Theorem; Genetic Variation; Humans; Microsatellite Repeats; New Zealand; Phylogeography; Racial Groups/classification; Saccharomyces cerevisiae/classification; Saccharomyces cerevisiae/genetics

11.

An algebraic analysis of the two state Markov model on tripod trees.

Klaere, Steffen; Liebscher, Volkmar.

Math Biosci ; 237(1-2): 38-48, 2012 May.

Article in English | MEDLINE | ID: mdl-22430560

ABSTRACT

Methods of phylogenetic inference use more and more complex models to generate trees from data. However, even simple models and their implications are not fully understood. Here, we investigate the two-state Markov model on a tripod tree, inferring conditions under which a given set of observations gives rise to such a model. This type of investigation has been undertaken before by several scientists from different fields of research. In contrast to other work we fully analyse the model, presenting conditions under which one can infer a model from the observation or at least get support for the tree-shaped interdependence of the leaves considered. We also present all conditions under which the results can be extended from tripod trees to quartet trees, a step necessary to reconstruct at least a topology. Apart from finding conditions under which such an extension works we discuss example cases for which such an extension does not work.

Subjects

Markov Chains; Models, Genetic; Phylogeny

12.

On the group theoretical background of assigning stepwise mutations onto phylogenies.

Fischer, Mareike; Klaere, Steffen; Thi Nguyen, Minh Anh; von Haeseler, Arndt.

Algorithms Mol Biol ; 7(1): 36, 2012 Dec 15.

Article in English | MEDLINE | ID: mdl-23241267

ABSTRACT

Recently one step mutation matrices were introduced to model the impact of substitutions on arbitrary branches of a phylogenetic tree on an alignment site. This concept works nicely for the four-state nucleotide alphabet and provides an efficient procedure conjectured to compute the minimal number of substitutions needed to transform one alignment site into another. The present paper delivers a proof of the validity of this algorithm. Moreover, we provide several mathematical insights into the generalization of the OSM matrix to multi-state alphabets. The construction of the OSM matrix is only possible if the matrices representing the substitution types acting on the character states and the identity matrix form a commutative group with respect to matrix multiplication. We illustrate this approach by looking at Abelian groups over twenty states and critically discuss their biological usefulness when investigating amino acids.

13.

Budgeted phylogenetic diversity on circular split systems.

Minh, Bui Quang; Pardi, Fabio; Klaere, Steffen; von Haeseler, Arndt.

IEEE/ACM Trans Comput Biol Bioinform ; 6(1): 22-9, 2009.

Article in English | MEDLINE | ID: mdl-19179696

ABSTRACT

In the last 15 years, Phylogenetic Diversity (PD) has gained interest in the community of conservation biologists as a surrogate measure for assessing biodiversity. We have recently proposed two approaches to select taxa for maximizing PD, namely PD with budget constraints and PD on split systems. In this paper, we will unify these two strategies and present a dynamic programming algorithm to solve the unified framework of selecting taxa with maximal PD under budget constraints on circular split systems. An improved algorithm will also be given if the underlying split system is a tree.

Subjects

Biodiversity; Computational Biology/methods; Conservation of Natural Resources; Phylogeny; Software; Algorithms; Animals; Conservation of Natural Resources/economics; Conservation of Natural Resources/methods; Extinction, Biological; Systems Integration

14.

The impact of single substitutions on multiple sequence alignments.

Klaere, Steffen; Gesell, Tanja; von Haeseler, Arndt.

Philos Trans R Soc Lond B Biol Sci ; 363(1512): 4041-7, 2008 Dec 27.

Article in English | MEDLINE | ID: mdl-18852110

ABSTRACT

We introduce another view of sequence evolution. Contrary to other approaches, we model the substitution process in two steps. First we assume (arbitrary) scaled branch lengths on a given phylogenetic tree. Second we allocate a Poisson distributed number of substitutions on the branches. The probability to place a mutation on a branch is proportional to its relative branch length. More importantly, the action of a single mutation on an alignment column is described by a doubly stochastic matrix, the so-called one-step mutation matrix. This matrix leads to analytical formulae for the posterior probability distribution of the number of substitutions for an alignment column.

Subjects

Amino Acid Substitution/genetics; Evolution, Molecular; Models, Genetic; Phylogeny; Sequence Alignment; Bayes Theorem; Likelihood Functions; Probability
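
The two-step view of substitution described in the abstract above can be simulated in a few lines. The sketch below is an illustration under stated assumptions: the abstract does not give the Poisson mean, so the total tree length is used here as a placeholder, and the branch names and lengths are invented. It covers only the allocation of substitutions to branches, not the one-step mutation matrix itself.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson variate with Knuth's multiplication method (small rates only)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def allocate_substitutions(branch_lengths, rng):
    """Step 1: take the scaled branch lengths as given.
    Step 2: draw a Poisson number of substitutions and place each one on a
    branch with probability proportional to its relative length."""
    total_length = sum(branch_lengths.values())
    # Assumption of this sketch: the Poisson mean equals the total tree length.
    n_subs = sample_poisson(total_length, rng)
    branches = list(branch_lengths)
    weights = [branch_lengths[b] / total_length for b in branches]
    counts = {b: 0 for b in branches}
    for branch in rng.choices(branches, weights=weights, k=n_subs):
        counts[branch] += 1
    return counts


# Hypothetical three-taxon tree with one internal branch.
branch_lengths = {"leaf_A": 0.1, "leaf_B": 0.2, "leaf_C": 0.3, "internal": 0.4}
rng = random.Random(0)
counts = allocate_substitutions(branch_lengths, rng)
print(counts)

# Averaged over many draws, the mean number of substitutions per site
# should approach the Poisson mean (here the total tree length, 1.0).
draws = [sum(allocate_substitutions(branch_lengths, rng).values()) for _ in range(2000)]
mean_subs = sum(draws) / len(draws)
print(mean_subs)
```

Separating "how many substitutions" from "where they fall" is what lets the effect of a single mutation on an alignment column be studied on its own.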

15.

An exact algorithm for the geodesic distance between phylogenetic trees.

Kupczok, Anne; von Haeseler, Arndt; Klaere, Steffen.

J Comput Biol ; 15(6): 577-91, 2008.

Article in English | MEDLINE | ID: mdl-18631022

ABSTRACT

The geometrical representation of the space of phylogenetic trees implies a metric on the space of weighted trees. This metric, the geodesic distance, is the length of the shortest path through that space. We present an exact algorithm to compute this metric. For biologically reasonable trees, the implementation allows fast computations of the geodesic distance, although the running time of the algorithm is worst-case exponential. The algorithm was applied to pairs of 118 gene trees of the metazoa. The results show that a special path in tree space, the cone path, which can be computed in linear time, is a good approximation of the geodesic distance. The program GeoMeTree is a Python implementation of the geodesic distance and its approximations, and is available from www.cibiv.at/software/geometree.

Subjects

Algorithms; Phylogeny

16.

Phylogenetic diversity within seconds.

Minh, Bui Quang; Klaere, Steffen; von Haeseler, Arndt.

Syst Biol ; 55(5): 769-73, 2006 Oct.

Article in English | MEDLINE | ID: mdl-17060198

ABSTRACT

We consider a (phylogenetic) tree with n labeled leaves, the taxa, and a length for each branch in the tree. For any subset of k taxa, the phylogenetic diversity is defined as the sum of the branch-lengths of the minimal subtree connecting the taxa in the subset. We introduce two time-efficient algorithms (greedy and pruning) to compute a subset of size k with maximal phylogenetic diversity in O(n log k) and O[n + (n-k) log (n-k)] time, respectively. The greedy algorithm is an efficient implementation of the so-called greedy strategy (Steel, 2005; Pardi and Goldman, 2005), whereas the pruning algorithm provides an alternative description of the same problem. Both algorithms compute within seconds a subtree with maximal phylogenetic diversity for trees with 100,000 taxa or more.

Subjects

Biodiversity; Phylogeny; Algorithms; Classification/methods; Computer Simulation/standards
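
The definition of phylogenetic diversity and the greedy strategy mentioned in the abstract above can be sketched as follows. This is a deliberately naive version for illustration: it recomputes the diversity from scratch at every step, so it does not achieve the O(n log k) running time reported in the paper. The tree, its node names, and its branch lengths are invented, and the function assumes k is at least 2.

```python
from itertools import combinations


def build_adjacency(edges):
    """Turn a list of (node, node, length) branches into an adjacency map."""
    adj = {}
    for u, v, length in edges:
        adj.setdefault(u, []).append((v, length))
        adj.setdefault(v, []).append((u, length))
    return adj


def phylogenetic_diversity(adj, taxa):
    """Sum of branch lengths of the minimal subtree connecting `taxa`."""
    taxa = set(taxa)
    if len(taxa) < 2:
        return 0.0
    root = next(iter(taxa))
    # Depth-first traversal from one selected taxon, recording parents.
    order, parent = [], {root: (None, 0.0)}
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        for neighbour, length in adj[node]:
            if neighbour not in parent:
                parent[neighbour] = (node, length)
                stack.append(neighbour)
    # A branch is in the subtree if selected taxa hang below it.
    count = {node: (1 if node in taxa else 0) for node in order}
    total = 0.0
    for node in reversed(order):  # children are visited before their parents
        par, length = parent[node]
        if par is not None:
            if count[node] > 0:
                total += length
            count[par] += count[node]
    return total


def greedy_pd(adj, leaves, k):
    """Start from the most distant pair, then add the taxon with the largest gain."""
    chosen = list(max(combinations(leaves, 2),
                      key=lambda pair: phylogenetic_diversity(adj, pair)))
    while len(chosen) < k:
        remaining = [x for x in leaves if x not in chosen]
        chosen.append(max(remaining,
                          key=lambda x: phylogenetic_diversity(adj, chosen + [x])))
    return chosen


# Hypothetical four-taxon tree with internal nodes u and v.
edges = [("A", "u", 1.0), ("B", "u", 2.0), ("u", "v", 1.0),
         ("C", "v", 3.0), ("D", "v", 0.5)]
adj = build_adjacency(edges)
leaves = ["A", "B", "C", "D"]
selected = greedy_pd(adj, leaves, 3)
print(selected, phylogenetic_diversity(adj, selected))
```

On this small tree the greedy choice can be checked against all three-taxon subsets by hand; the result cited in the abstract (Steel, 2005; Pardi and Goldman, 2005) is that the greedy strategy is optimal on trees in general.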
