Results 1 - 20 of 26
1.
Ecol Evol ; 14(3): e11067, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38435021

ABSTRACT

Climate change has the potential to disrupt species interactions across global ecosystems. Ectotherm-endotherm interactions may be especially prone to this risk due to the possible mismatch between the species in physiological response and performance. However, few studies have examined how changing temperatures might differentially impact species' niches or available suitable habitat when they have very different modes of thermoregulation. An ideal system for studying this interaction is the predator-prey system. In this study, we used ecological niche modeling to characterize the niche overlap and examine biogeography in past and future climate conditions of prairie rattlesnakes (Crotalus viridis) and Ord's kangaroo rats (Dipodomys ordii), an endotherm-ectotherm pair typifying a predator-prey species interaction. Our models show a high niche overlap between these two species (D = 0.863 and I = 0.979) and further affirm similar paleoecological distributions during the last glacial maximum (LGM) and mid-Holocene (MH). Under future climate change scenarios, we found that prairie rattlesnakes may experience a reduction in overall suitable habitat (RCP 2.6 = -1.82%, 4.5 = -4.62%, 8.5 = -7.34%), whereas Ord's kangaroo rats may experience an increase (RCP 2.6 = 9.8%, 4.5 = 11.71%, 8.5 = 8.37%). We found a shared trend of stable suitable habitat at northern latitudes but reduced suitability in southern portions of the range, and we propose future monitoring and conservation be focused on those areas. Overall, we demonstrate a biogeographic example of how interacting ectotherm-endotherm species may have mismatched responses under climate change scenarios and the models presented here can serve as a starting point for further investigation into the biogeography of these systems.
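The D and I values reported in this abstract are conventionally Schoener's D and the Hellinger-based I statistic used in niche-overlap studies. The abstract does not say how they were computed, so the following is only a minimal sketch under that assumption. The function names and the toy suitability scores are hypothetical and are not taken from the study.

```python
import math

def normalize(suitability):
    """Rescale raw suitability scores so they sum to 1 across grid cells."""
    total = sum(suitability)
    if total <= 0:
        raise ValueError("suitability scores must have a positive sum")
    return [s / total for s in suitability]

def schoener_d(x, y):
    """Schoener's D: 1 minus half the summed absolute difference between cells."""
    px, py = normalize(x), normalize(y)
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(px, py))

def hellinger_i(x, y):
    """I statistic: 1 minus half the squared Hellinger-type distance."""
    px, py = normalize(x), normalize(y)
    return 1.0 - 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(px, py))

# Hypothetical suitability scores for two species over the same four grid cells.
predator = [0.9, 0.7, 0.2, 0.1]
prey = [0.8, 0.6, 0.4, 0.1]
print(schoener_d(predator, prey), hellinger_i(predator, prey))
```

Both statistics range from 0 (no overlap) to 1 (identical predicted suitability), which is why values such as D = 0.863 and I = 0.979 are read as high overlap.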

2.
PLoS One ; 19(1): e0291801, 2024.
Article in English | MEDLINE | ID: mdl-38206953

ABSTRACT

Phylogenetic analysis of protein sequences provides a powerful means of identifying novel protein functions and subfamilies, and for identifying and resolving annotation errors. However, automation of functional clustering based on phylogenetic trees has been challenging and most of it is done manually. Clustering phylogenetic trees usually requires the delineation of tree-based thresholds (e.g., distances), leading to an ad hoc problem. We propose a new phylogenetic clustering approach that identifies clusters without using ad hoc distances or other pre-defined values. Our workflow combines uniform manifold approximation and projection (UMAP) with Gaussian mixture models as a k-means-like procedure to automatically group sequences into clusters. We then apply a "second pass" clade identification algorithm to resolve non-monophyletic groups. We tested our approach with several well-curated protein families (outer membrane porins, acyltransferase, and nuclear receptors) and showed our automated methods recapitulated known subfamilies. We also applied our methods to a broad range of different protein families from multiple databases, including Pfam, PANTHER, and UniProt, and to alignments of RNA viral genomes. Our results showed that AutoPhy rapidly generated monophyletic clusters (subfamilies) within phylogenetic trees evolving at very different rates both within and among phylogenies. The phylogenetic clusters generated by AutoPhy resolved misannotations and identified new protein functional groups and novel viral strains.


Subjects
Algorithms, Proteins, Phylogeny, Proteins/genetics, Porins/genetics, Amino Acid Sequence
3.
Mov Ecol ; 11(1): 72, 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37919756

ABSTRACT

BACKGROUND: Kangaroo rats are small mammals that are among the most abundant vertebrates in many terrestrial ecosystems in Western North America and are considered both keystone species and ecosystem engineers, providing numerous linkages between other species as both consumers and resources. However, there are challenges to studying the behavior and activity of these species due to the difficulty of observing large numbers of individuals that are small, secretive, and nocturnal. Our goal was to develop an integrated approach of miniaturized animal-borne accelerometry and radiotelemetry to classify the cryptic behavior and activity cycles of kangaroo rats and test hypotheses of how their behavior is influenced by light cycles, moonlight, and weather. METHODS: We provide a proof-of-concept approach to effectively quantify behavioral patterns of small bodied (< 50 g), nocturnal, and terrestrial free-ranging mammals using large acceleration datasets by combining low-mass, miniaturized animal-borne accelerometers with radiotelemetry and advanced machine learning techniques. We developed a method of attachment and retrieval for deploying accelerometers, a non-disruptive method of gathering observational validation datasets for acceleration data on free-ranging nocturnal small mammals, and used these techniques on Merriam's kangaroo rats to analyze how behavioral patterns relate to abiotic factors. RESULTS: We found that Merriam's kangaroo rats are only active during the nighttime phases of the diel cycle and are particularly active during later light phases of the night (i.e., late night, morning twilight, and dawn). We found no reduction in activity or foraging associated with moonlight, indicating that kangaroo rats are actually more lunarphilic than lunarphobic. We also found that kangaroo rats increased foraging effort on more humid nights, most likely as a mechanism to avoid cutaneous water loss. 
CONCLUSIONS: Small mammals are often integral to ecosystem functionality, as many of these species are highly abundant ecosystem engineers driving linkages in energy flow and nutrient transfer across trophic levels. Our work represents the first continuous detailed quantitative description of fine-scale behavioral activity budgets in kangaroo rats, and lays out a general framework for how to use miniaturized biologging devices on small and nocturnal mammals to examine behavioral responses to environmental factors.

4.
Mol Phylogenet Evol ; 189: 107932, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37751827

ABSTRACT

Diplomystidae is an early-diverged family of freshwater catfish endemic to southern South America. We have recently collected five juvenile specimens belonging to this family from the Bueno River Basin, a basin from which the only previous record was a single juvenile specimen collected in 1996. This finding confirms the distribution of the family further South in northern Patagonia, but poses new questions about the origin of this population in an area with a strong glacial history. We used phylogenetic analyses to evaluate three different hypotheses that could explain the origin of this population in the basin. First, the population could have originated in Atlantic basins (East of the Andes) and dispersed to the Bueno Basin after the Last Glacial Maximum (LGM) via river reversals, as has been proposed for other populations of Diplomystes as well as for other freshwater species from Patagonia. Second, the population could have originated in the geographically close Valdivia Basin (West of the Andes) and dispersed south to its current location in the Bueno Basin. Third, regardless of its geographic origin (West or East of the Andes), the Bueno Basin population could have a longer history in the basin, surviving in situ through the LGM. In addition, we conducted species delimitation analyses using a recently developed method that uses a protracted model of speciation. Our goal was to test the species status of the Bueno Basin population along with another controversial population in Central Chile (Biobío Basin), which appeared highly divergent in previous studies with mtDNA. The phylogenetic analyses showed that the population from the Bueno Basin is more closely related to Atlantic than to Pacific lineages, although with a deep divergence that predated the LGM, supporting in situ survival rather than postglacial dispersal. In addition, these analyses also showed that the species D. nahuelbutaensis is polyphyletic, supporting the need for a taxonomic reevaluation.
The species delimitation analyses supported two new species which are described using molecular diagnostic characters: Diplomystes arratiae sp. nov. from the Biobío, Carampangue, and Laraquete basins, maintaining D. nahuelbutaensis valid only for the Imperial Basin, and Diplomystes habitae sp. nov. from the Bueno Basin. This study greatly increases the number of species within both the family Diplomystidae and Patagonia, and contributes substantially to the knowledge of the evolution of southern South American freshwater biodiversity during its glacial history. Given the important contribution to the phylogenetic diversity of the family, we recommend a high conservation priority for both new species. Finally, this study highlights an exemplary scenario where species descriptions based only on DNA data are particularly valuable, bringing additional elements to the ongoing debate on DNA-based taxonomy.


Subjects
Catfishes, Animals, Phylogeny, Catfishes/genetics, Chile, Mitochondrial DNA/genetics, Phylogeography, Genetic Variation
5.
Mol Ecol Resour ; 22(1): 430-438, 2022 Jan.
Article in English | MEDLINE | ID: mdl-34288531

ABSTRACT

A wide range of data types can be used to delimit species and various computer-based tools dedicated to this task are now available. Although these formalized approaches have significantly contributed to increase the objectivity of species delimitation (SD) under different assumptions, they are not routinely used by alpha-taxonomists. One obvious shortcoming is the lack of interoperability among the various independently developed SD programs. Given the frequent incongruences between species partitions inferred by different SD approaches, researchers applying these methods often seek to compare these alternative species partitions to evaluate the robustness of the species boundaries. This procedure is excessively time consuming at present, and the lack of a standard format for species partitions is a major obstacle. Here, we propose a standardized format, SPART, to enable compatibility between different SD tools exporting or importing partitions. This format reports the partitions and describes, for each of them, the assignment of individuals to the "inferred species". The syntax also allows support values to be optionally reported, as well as original trees and the full command lines used in the respective SD analyses. Two variants of this format are proposed, overall using the same terminology but presenting the data either optimized for human readability (matricial SPART) or in a format in which each partition forms a separate block (SPART.XML). ABGD, DELINEATE, GMYC, PTP and TR2 have already been adapted to output SPART files and a new version of LIMES has been developed to import, export, merge and split them.
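The comparison of alternative species partitions described in this abstract can be illustrated with a small sketch. It does not use or reproduce the SPART syntax. It assumes only that each partition assigns every individual to one inferred species, and the individual names and labels below are hypothetical.

```python
def as_blocks(assignment):
    """Turn an individual -> label mapping into a set of blocks (sets of individuals)."""
    blocks = {}
    for individual, label in assignment.items():
        blocks.setdefault(label, set()).add(individual)
    return frozenset(frozenset(members) for members in blocks.values())

def same_partition(a, b):
    """True if two assignments group individuals identically, ignoring label names."""
    return as_blocks(a) == as_blocks(b)

def shared_blocks(a, b):
    """Inferred species (as sets of individuals) recovered by both partitions."""
    return as_blocks(a) & as_blocks(b)

# Hypothetical output of two delimitation tools for the same four individuals.
tool_one = {"ind1": "sp1", "ind2": "sp1", "ind3": "sp2", "ind4": "sp3"}
tool_two = {"ind1": "A", "ind2": "A", "ind3": "B", "ind4": "B"}
print(same_partition(tool_one, tool_two))
print(sorted(sorted(block) for block in shared_blocks(tool_one, tool_two)))
```

Comparing blocks rather than labels matters because independently developed tools name their inferred species differently even when they agree on the grouping.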

6.
J Chromatogr A ; 1660: 462656, 2021 Dec 20.
Article in English | MEDLINE | ID: mdl-34798444

ABSTRACT

Nontargeted analysis based on mass spectrometry is a rising practice in environmental monitoring for identifying contaminants of emerging concern. Nontargeted analysis performed using comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry (GC×GC/TOF-MS) generates large numbers of possible analytes. Moreover, the default spectral library similarity score-based search algorithm used by LECO® ChromaTOF® does not ensure that high similarity scores result in correct library matches. Therefore, an additional manual screening is necessary, but leads to human errors especially when dealing with large amounts of data. To improve the speed and accuracy of the chemical identification, we developed CINeMA.py (Classification Is Never Manual Again). This programming suite automates GC×GC/TOF-MS data interpretation by determining the confidence of a match between the observed analyte mass spectrum and the LECO® ChromaTOF® software generated library hit from the NIST Electron Ionization Mass Spectral (NIST EI-MS) library. Our script allows the user to evaluate the confidence of the match using an algorithmic method that mimics the manual curation process and two different machine learning approaches (neural networks and random forest). The script allows the user to adjust various parameters (e.g., similarity threshold) and study their effects on prediction accuracy. To test CINeMA.py, we used data from two different environmental contaminant studies: an EPA study on household dust and a study on stormwater runoff. Using a reference set based on the analysis performed by highly trained users of the ChromaTOF and GC×GC/TOF-MS systems, the random forest model had the highest prediction accuracies of 86% and 83% on the EPA and Stormwater data sets, respectively. The algorithmic approach had the second-best prediction accuracy (82% and 79%), while the neural network accuracy had the lowest (63% and 67%). 
All the approaches required less than 1 min to classify 986 observed analytes, whereas manual data analysis required hours or days to complete. Our methods were also able to detect high-confidence matches missed during the manual review. Overall, CINeMA.py provides users with a powerful suite of tools that should significantly speed up data analysis while reducing the possibilities of manual errors and discrepancies among users, and may be applicable to other GC/EI-MS instrument-based nontargeted analyses.


Subjects
Electrons, Software, Algorithms, Environmental Monitoring, Gas Chromatography-Mass Spectrometry, Humans
7.
PLoS Comput Biol ; 17(9): e1008949, 2021 09.
Article in English | MEDLINE | ID: mdl-34516547

ABSTRACT

A current strategy for obtaining haplotype information from several individuals involves short-read sequencing of pooled amplicons, where fragments from each individual are identified by a unique DNA barcode. In this paper, we report a new method to recover the phylogeny of haplotypes from short-read sequences obtained using pooled amplicons from a mixture of individuals, without barcoding. The method, AFPhyloMix, accepts an alignment of the mixture of reads against a reference sequence, obtains the single-nucleotide polymorphism (SNP) patterns along the alignment, and constructs the phylogenetic tree according to the SNP patterns. AFPhyloMix adopts a Bayesian inference model to estimate the phylogeny of the haplotypes and their relative abundances, given that the number of haplotypes is known. In our simulations, AFPhyloMix achieved at least 80% accuracy at recovering the phylogenies and relative abundances of the constituent haplotypes, for mixtures with up to 15 haplotypes. AFPhyloMix also worked well on a real data set of kangaroo mitochondrial DNA sequences.


Subjects
Taxonomic DNA Barcoding, Phylogeny, Algorithms, Bayes Theorem, Mitochondrial DNA/genetics, Humans, Markov Chains, Monte Carlo Method, Single Nucleotide Polymorphism
8.
PLoS Comput Biol ; 17(5): e1008924, 2021 05.
Article in English | MEDLINE | ID: mdl-33983918

ABSTRACT

The "multispecies" coalescent (MSC) model that underlies many genomic species-delimitation approaches is problematic because it does not distinguish between genetic structure associated with species versus that of populations within species. Consequently, as both the genomic and spatial resolution of data increases, a proliferation of artifactual species results as within-species population lineages, detected due to restrictions in gene flow, are identified as distinct species. The toll of this extends beyond systematic studies, getting magnified across the many disciplines that rely upon an accurate framework of identified species. Here we present the first of a new class of approaches that addresses this issue by incorporating an extended speciation process for species delimitation. We model the formation of population lineages and their subsequent development into independent species as separate processes and provide a way to incorporate current understanding of the species boundaries in the system through specification of species identities of a subset of population lineages. As a result, species boundaries and within-species lineage boundaries can be discriminated across the entire system, and species identities can be assigned to the remaining lineages of unknown affinities with quantified probabilities. In addition to the identification of species units in nature, the primary goal of species delimitation, the incorporation of a speciation model also gives us insights into the links between population and species-level processes. By explicitly accounting for restrictions in gene flow not only between, but also within, species, we also address the limits of genetic data for delimiting species. Specifically, while genetic data alone is not sufficient for accurate delimitation, when considered in conjunction with other information we are able to not only learn about species boundaries, but also about the tempo of the speciation process itself.


Subjects
Genetic Speciation, Genetic Models, Algorithms, Animals, Computational Biology, Computer Simulation, Gene Flow, Population Genetics, Statistical Models, Phylogeny, Software, Species Specificity, Time Factors
9.
BMC Evol Biol ; 18(1): 123, 2018 08 10.
Article in English | MEDLINE | ID: mdl-30097006

ABSTRACT

BACKGROUND: Macroevolutionary modeling of species diversification plays important roles in inferring large-scale biodiversity patterns. It allows estimation of speciation and extinction rates and statistically testing their relationships with different ecological factors. However, macroevolutionary patterns are ultimately generated by microevolutionary processes acting at population levels, especially when speciation and extinction are considered protracted instead of point events. Neglecting the connection between micro- and macroevolution may hinder our ability to fully understand the underlying mechanisms that drive the observed patterns. RESULTS: In this simulation study, we used the protracted speciation framework to demonstrate that distinct microevolutionary scenarios can generate very similar biodiversity patterns (e.g., latitudinal diversity gradient). We also showed that current macroevolutionary models may not be able to distinguish these different scenarios. CONCLUSIONS: Given the compounded nature of speciation and extinction rates, one needs to be cautious when inferring causal relationships between ecological factors and macroevolutionary rates. Future studies that incorporate microevolutionary processes into current modeling approaches are needed.


Subjects
Biological Evolution, Animals, Biodiversity, Birds/physiology, Biological Extinction, Genetic Speciation, Phylogeny
10.
Am J Bot ; 105(3): 376-384, 2018 03.
Article in English | MEDLINE | ID: mdl-29710372

ABSTRACT

PREMISE OF THE STUDY: Discordant gene trees are commonly encountered when sequences from thousands of loci are applied to estimate phylogenetic relationships. Several processes contribute to this discord. Yet, we have no methods that jointly model different sources of conflict when estimating phylogenies. An alternative to analyzing entire genomes or all the sequenced loci is to identify a subset of loci for phylogenetic analysis. If we can identify data partitions that are most likely to reflect descent from a common ancestor (i.e., discordant loci that indeed reflect incomplete lineage sorting [ILS], as opposed to some other process, such as lateral gene transfer [LGT]), we can analyze this subset using powerful coalescent-based species-tree approaches. METHODS: Test data sets were simulated where discord among loci could arise from ILS and LGT. Data sets were analyzed using the newly developed program CLASSIPHY (Huang et al., ) to assess whether our ability to distinguish the cause of discord among loci varied when ILS and LGT occurred in the recent versus deep past and whether the accuracy of these inferences was affected by the mutational process. KEY RESULTS: We show that accuracy of probabilistic classification of individual loci by the cause of discord differed when ILS and LGT events occurred more recently compared with the distant past and that the signal-to-noise ratio arising from the mutational process contributes to difficulties in inferring LGT data partitions. CONCLUSIONS: We discuss our findings in terms of the promise and limitations of identifying subsets of loci for species-tree inference that will not violate the underlying coalescent model (i.e., data partitions in which ILS, and not LGT, contributes to discord). We also discuss the empirical implications of our work given the many recalcitrant nodes in the tree of life (e.g., origins of angiosperms, amniotes, or Neoaves), and recent arguments for concatenating loci.


Subjects
Horizontal Gene Transfer, Genetic Loci, Genetic Speciation, Genetic Models, Phylogeny, Computer Simulation, Genome, Magnoliopsida/genetics, Mutation
11.
Trends Ecol Evol ; 33(6): 390-398, 2018 06.
Article in English | MEDLINE | ID: mdl-29685579

ABSTRACT

The development of process-based probabilistic models for historical biogeography has transformed the field by grounding it in modern statistical hypothesis testing. However, most of these models abstract away biological differences, reducing species to interchangeable lineages. We present here the case for reintegration of biology into probabilistic historical biogeographical models, allowing a broader range of questions about biogeographical processes beyond ancestral range estimation or simple correlation between a trait and a distribution pattern, as well as allowing us to assess how inferences about ancestral ranges themselves might be impacted by differential biological traits. We show how new approaches to inference might cope with the computational challenges resulting from the increased complexity of these trait-based historical biogeographical models.


Subjects
Animal Distribution, Life History Traits, Phenotype, Phylogeny, Phylogeography, Plant Dispersal, Computational Biology, Biological Models, Statistical Models
12.
Microbiome ; 5(1): 127, 2017 09 25.
Article in English | MEDLINE | ID: mdl-28946894

ABSTRACT

BACKGROUND: Numerous empirical studies suggest that hosts and microbes exert reciprocal selective effects on their ecological partners. Nonetheless, we still lack an explicit framework to model the dynamics of both hosts and microbes under selection. In a previous study, we developed an agent-based forward-time computational framework to simulate the neutral evolution of host-associated microbial communities in a constant-sized, unstructured population of hosts. These neutral models allowed offspring to sample microbes randomly from parents and/or from the environment. Additionally, the environmental pool of available microbes was constituted by fixed and persistent microbial OTUs and by contributions from host individuals in the preceding generation. METHODS: In this paper, we extend our neutral models to allow selection to operate on both hosts and microbes. We do this by constructing a phenome for each microbial OTU consisting of a sample of traits that influence host and microbial fitnesses independently. Microbial traits can influence the fitness of hosts ("host selection") and the fitness of microbes ("trait-mediated microbial selection"). Additionally, the fitness effects of traits on microbes can be modified by their hosts ("host-mediated microbial selection"). We simulate the effects of these three types of selection, individually or in combination, on microbiome diversities and the fitnesses of hosts and microbes over several thousand generations of hosts. RESULTS: We show that microbiome diversity is strongly influenced by selection acting on microbes. Selection acting on hosts only influences microbiome diversity when there is near-complete direct or indirect parental contribution to the microbiomes of offspring. Unsurprisingly, microbial fitness increases under microbial selection. 
Interestingly, when host selection operates, host fitness only increases under two conditions: (1) when there is a strong parental contribution to microbial communities or (2) in the absence of a strong parental contribution, when host-mediated selection acts on microbes concomitantly. CONCLUSIONS: We present a computational framework that integrates different selective processes acting on the evolution of microbiomes. Our framework demonstrates that selection acting on microbes can have a strong effect on microbial diversities and fitnesses, whereas selection on hosts can have weaker outcomes.


Subjects
Computer Simulation, Molecular Evolution, Microbiota/genetics, Bacteria/genetics, Bacteria/pathogenicity, Genetic Fitness, Genetic Variation, Humans, Symbiosis
13.
Proc Natl Acad Sci U S A ; 114(7): 1607-1612, 2017 02 14.
Article in English | MEDLINE | ID: mdl-28137871

ABSTRACT

The multispecies coalescent model underlies many approaches used for species delimitation. In previous work assessing the performance of species delimitation under this model, speciation was treated as an instantaneous event rather than as an extended process involving distinct phases of speciation initiation (structuring) and completion. Here, we use data under simulations that explicitly model speciation as an extended process rather than an instantaneous event and carry out species delimitation inference on these data under the multispecies coalescent. We show that the multispecies coalescent diagnoses genetic structure, not species, and that it does not statistically distinguish structure associated with population isolation vs. species boundaries. Because of the misidentification of population structure as putative species, our work raises questions about the practice of genome-based species discovery, with cascading consequences in other fields. Specifically, all fields that rely on species as units of analysis, from conservation biology to studies of macroevolutionary dynamics, will be impacted by inflated estimates of the number of species, especially as genomic resources provide unprecedented power for detecting increasingly finer-scaled genetic structure under the multispecies coalescent. As such, our work also represents a general call for systematic study to reconsider a reliance on genomic data alone. Until new methods are developed that can discriminate between structure due to population-level processes and that due to species boundaries, genomic-based results should only be considered a hypothesis that requires validation of delimited species with multiple data types, such as phenotypic and ecological information.


Subjects
Gene Flow, Genetic Speciation, Genome/genetics, Genetic Models, Animals, Computer Simulation, Molecular Evolution, Humans, Phenotype, Phylogeny, Species Specificity
14.
Syst Biol ; 65(3): 525-45, 2016 May.
Article in English | MEDLINE | ID: mdl-26715585

ABSTRACT

Current statistical biogeographical analysis methods are limited in the ways ecology can be related to the processes of diversification and geographical range evolution, requiring conflation of geography and ecology, and/or assuming ecologies that are uniform across all lineages and invariant in time. This precludes the possibility of studying a broad class of macroevolutionary biogeographical theories that relate geographical and species histories through lineage-specific ecological and evolutionary dynamics, such as taxon cycle theory. Here we present a new model that generates phylogenies under a complex of superpositioned geographical range evolution, trait evolution, and diversification processes that can communicate with each other. We present a likelihood-free method of inference under our model using discriminant analysis of principal components of summary statistics calculated on phylogenies, with the discriminant functions trained on data generated by simulations under our model. This approach of model selection by classification of empirical data with respect to data generated under training models is shown to be efficient and robust, and to perform well over a broad range of parameter space defined by the relative rates of dispersal, trait evolution, and diversification processes. We apply our method to a case study of the taxon cycle, that is, testing for habitat and trophic level constraints in the dispersal regimes of the Wallacean avifaunal radiation.


Subjects
Discriminant Analysis, Machine Learning, Biological Models, Phylogeography/methods, Computer Simulation, Phylogeny
15.
PLoS Comput Biol ; 11(7): e1004365, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26200800

ABSTRACT

There has been an explosion of research on host-associated microbial communities (i.e., microbiomes). Much of this research has focused on surveys of microbial diversities across a variety of host species, including humans, with a view to understanding how these microbiomes are distributed across space and time, and how they correlate with host health, disease, phenotype, physiology and ecology. Fewer studies have focused on how these microbiomes may have evolved. In this paper, we develop an agent-based framework to study the dynamics of microbiome evolution. Our framework incorporates neutral models of how hosts acquire their microbiomes, and how the environmental microbial community that is available to the hosts is assembled. Most importantly, our framework also incorporates a Wright-Fisher genealogical model of hosts, so that the dynamics of microbiome evolution is studied on an evolutionary timescale. Our results indicate that the extent of parental contribution to microbial availability from one generation to the next significantly impacts the diversity of microbiomes: the greater the parental contribution, the less diverse the microbiomes. In contrast, even when there is only a very small contribution from a constant environmental pool, microbial communities can remain highly diverse. Finally, we show that our models may be used to construct hypotheses about the types of processes that operate to assemble microbiomes over evolutionary time.


Subjects
Biological Evolution, Ecosystem, Genetic Variation/genetics, Host Specificity/genetics, Microbiota/genetics, Genetic Models, Computer Simulation
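The kind of neutral, agent-based model summarized in this abstract can be sketched as follows. This is a deliberately simplified illustration, not the authors' framework. It assumes a fixed environmental pool only (the abstract also describes host contributions to the environment), and every name and parameter value here is hypothetical.

```python
import random

def next_generation(hosts, env_pool, parental_fraction, microbes_per_host, rng):
    """Produce one Wright-Fisher generation of hosts with neutrally assembled microbiomes."""
    offspring = []
    for _ in range(len(hosts)):  # constant host population size
        parent = rng.choice(hosts)  # each offspring picks a parent uniformly at random
        microbiome = []
        for _ in range(microbes_per_host):
            if rng.random() < parental_fraction:
                microbiome.append(rng.choice(parent))    # microbe inherited from the parent
            else:
                microbiome.append(rng.choice(env_pool))  # microbe acquired from the environment
        offspring.append(microbiome)
    return offspring

def richness(hosts):
    """Number of distinct OTUs present across the whole host population."""
    return len({otu for microbiome in hosts for otu in microbiome})

def simulate(parental_fraction, generations=200, n_hosts=50, microbes_per_host=20, seed=1):
    """Run the toy model and return the final population-wide OTU richness."""
    rng = random.Random(seed)
    env_pool = list(range(100))  # 100 hypothetical OTU labels
    hosts = [[rng.choice(env_pool) for _ in range(microbes_per_host)] for _ in range(n_hosts)]
    for _ in range(generations):
        hosts = next_generation(hosts, env_pool, parental_fraction, microbes_per_host, rng)
    return richness(hosts)

print(simulate(parental_fraction=1.0), simulate(parental_fraction=0.9))
```

With purely parental transmission, OTUs can only be lost over time, whereas any environmental input can reintroduce them. That is one intuitive reading of the abstract's result that greater parental contribution yields less diverse microbiomes.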
16.
Evolution ; 68(12): 3607-17, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25213163

ABSTRACT

Establishing that a set of population-splitting events occurred at the same time can be a potentially persuasive argument that a common process affected the populations. Recently, Oaks et al. assessed the ability of an approximate-Bayesian model-choice method (msBayes) to estimate such a pattern of simultaneous divergence across taxa, to which Hickerson et al. responded. Both papers agree that the primary inference enabled by the method is very sensitive to prior assumptions and often erroneously supports shared divergences across taxa when prior uncertainty about divergence times is represented by a uniform distribution. However, the papers differ about the best explanation and solution for this problem. Oaks et al. suggested the method's behavior was caused by the strong weight of uniformly distributed priors on divergence times leading to smaller marginal likelihoods (and thus smaller posterior probabilities) of models with more divergence-time parameters (Hypothesis 1); they proposed alternative prior probability distributions to avoid such strongly weighted posteriors. Hickerson et al. suggested numerical-approximation error causes msBayes analyses to be biased toward models of clustered divergences because the method's rejection algorithm is unable to adequately sample the parameter space of richer models within reasonable computational limits when using broad uniform priors on divergence times (Hypothesis 2). As a potential solution, they proposed a model-averaging approach that uses narrow, empirically informed uniform priors. Here, we use analyses of simulated and empirical data to demonstrate that the approach of Hickerson et al. does not mitigate the method's tendency to erroneously support models of highly clustered divergences, and is dangerous in the sense that the empirically derived uniform priors often exclude from consideration the true values of the divergence-time parameters. Our results also show that the tendency of msBayes analyses to support models of shared divergences is primarily due to Hypothesis 1, whereas Hypothesis 2 is an untenable explanation for the bias. Overall, this series of papers demonstrates that if our prior assumptions place too much weight in unlikely regions of parameter space such that the exact posterior supports the wrong model of evolutionary history, no amount of computation can rescue our inference. Fortunately, as predicted by fundamental principles of Bayesian model choice, more flexible distributions that accommodate prior uncertainty about parameters without placing excessive weight in vast regions of parameter space with low likelihood increase the method's robustness and power to detect temporal variation in divergences.
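
Illustrative note: the reasoning behind Hypothesis 1 can be sketched with a toy calculation. The example below is not msBayes and not the authors' model. It assumes hypothetical divergence-time estimates `x` with Gaussian error of standard deviation `s`, and compares a "shared" model (one divergence time) against an "independent" model (one time per taxon pair), each with Uniform(0, width) priors. All names and numbers are assumptions made for illustration. Because each extra uniformly distributed parameter contributes a factor of roughly 1/width to the marginal likelihood, widening the prior penalizes the richer model more.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_ml_shared(x, s, width):
    """Log marginal likelihood when all pairs share one time t ~ Uniform(0, width).

    Toy assumption: each x_i ~ Normal(t, s). The integral over t is closed-form.
    """
    n = len(x)
    mean = sum(x) / n
    ss = sum((xi - mean) ** 2 for xi in x)  # spread of the data around its mean
    se = s / math.sqrt(n)                   # sd of the Gaussian kernel in t
    mass = normal_cdf((width - mean) / se) - normal_cdf((0.0 - mean) / se)
    return (-0.5 * n * math.log(2.0 * math.pi * s * s)
            - ss / (2.0 * s * s)
            + 0.5 * math.log(2.0 * math.pi * se * se)
            + math.log(mass)
            - math.log(width))              # one prior density factor, 1/width


def log_ml_independent(x, s, width):
    """Log marginal likelihood when each pair has its own t_i ~ Uniform(0, width)."""
    total = 0.0
    for xi in x:
        mass = normal_cdf((width - xi) / s) - normal_cdf((0.0 - xi) / s)
        total += math.log(mass) - math.log(width)  # one 1/width factor per parameter
    return total


def log_bayes_factor(x, s, width):
    """Log Bayes factor of the independent model over the shared model."""
    return log_ml_independent(x, s, width) - log_ml_shared(x, s, width)


if __name__ == "__main__":
    x = [1.0, 3.0, 5.0]  # hypothetical estimates that are clearly not simultaneous
    for width in (10.0, 1e4, 1e6):
        print(width, log_bayes_factor(x, s=0.5, width=width))
```

With these toy numbers the log Bayes factor favors the independent model when the prior width is 10 and favors the shared model when the width is 10^6, even though the data are unchanged.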


Subjects
Biological Evolution, Climate, Biological Models, Animals
17.
BMC Bioinformatics ; 14: 158, 2013 May 13.
Article in English | MEDLINE | ID: mdl-23668630

ABSTRACT

BACKGROUND: Scientists rarely reuse expert knowledge of phylogeny, in spite of years of effort to assemble a great "Tree of Life" (ToL). A notable exception involves the use of Phylomatic, which provides tools to generate custom phylogenies from a large, pre-computed, expert phylogeny of plant taxa. This suggests great potential for a more generalized system that, starting with a query consisting of a list of any known species, would rectify non-standard names, identify expert phylogenies containing the implicated taxa, prune away unneeded parts, and supply branch lengths and annotations, resulting in a custom phylogeny suited to the user's needs. Such a system could become a sustainable community resource if implemented as a distributed system of loosely coupled parts that interact through clearly defined interfaces.
RESULTS: With the aim of building such a "phylotastic" system, the NESCent Hackathons, Interoperability, Phylogenies (HIP) working group recruited 2 dozen scientist-programmers to a weeklong programming hackathon in June 2012. During the hackathon (and a three-month follow-up period), 5 teams produced designs, implementations, documentation, presentations, and tests including: (1) a generalized scheme for integrating components; (2) proof-of-concept pruners and controllers; (3) a meta-API for taxonomic name resolution services; (4) a system for storing, finding, and retrieving phylogenies using semantic web technologies for data exchange, storage, and querying; (5) an innovative new service, DateLife.org, which synthesizes pre-computed, time-calibrated phylogenies to assign ages to nodes; and (6) demonstration projects. These outcomes are accessible via a public code repository (GitHub.com), a website (http://www.phylotastic.org), and a server image.
CONCLUSIONS: Approximately 9 person-months of effort (centered on a software development hackathon) resulted in the design and implementation of proof-of-concept software for 4 core phylotastic components, 3 controllers, and 3 end-user demonstration tools. While these products have substantial limitations, they suggest considerable potential for a distributed system that makes phylogenetic knowledge readily accessible in computable form. Widespread use of phylotastic systems will create an electronic marketplace for sharing phylogenetic knowledge that will spur innovation in other areas of the ToL enterprise, such as annotation of sources and methods and third-party methods of quality assessment.


Subjects
Phylogeny, Software, Internet
18.
Evolution ; 67(4): 991-1010, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23550751

ABSTRACT

Approximate Bayesian computation (ABC) is rapidly gaining popularity in population genetics. One example, msBayes, infers the distribution of divergence times among pairs of taxa, allowing phylogeographers to test hypotheses about historical causes of diversification in co-distributed groups of organisms. Using msBayes, we infer the distribution of divergence times among 22 pairs of populations of vertebrates distributed across the Philippine Archipelago. Our objective was to test whether sea-level oscillations during the Pleistocene caused diversification across the islands. To guide interpretation of our results, we perform a suite of simulation-based power analyses. Our empirical results strongly support a recent simultaneous divergence event for all 22 taxon pairs, consistent with the prediction of the Pleistocene-driven diversification hypothesis. However, our empirical estimates are sensitive to changes in prior distributions, and our simulations reveal low power of the method to detect random variation in divergence times and bias toward supporting clustered divergences. Our results demonstrate that analyses exploring power and prior sensitivity should accompany ABC model selection inferences. The problems we identify are potentially mitigable with uniform priors over divergence models (rather than classes of models) and more flexible prior distributions on demographic and divergence-time parameters.
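
Illustrative note: the rejection form of ABC mentioned above can be sketched in a few lines. This is a generic toy example, not the msBayes model: it assumes a single parameter `theta` with a Uniform(0, prior_upper) prior, data simulated as Normal(theta, 1) draws, and the sample mean as the only summary statistic. The function name, arguments, and numbers are hypothetical.

```python
import random


def abc_rejection(observed_mean, n_obs, n_draws, tolerance, prior_upper, seed=0):
    """Return prior draws of theta whose simulated summary lies near the observed one.

    Toy assumptions: theta ~ Uniform(0, prior_upper); data are n_obs draws
    from Normal(theta, 1); the summary statistic is the sample mean.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, prior_upper)               # 1. draw from the prior
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]  # 2. simulate data
        simulated_mean = sum(simulated) / n_obs             # 3. summarize
        if abs(simulated_mean - observed_mean) <= tolerance:  # 4. keep close matches
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    sample = abc_rejection(observed_mean=4.0, n_obs=20, n_draws=20000,
                           tolerance=0.2, prior_upper=10.0)
    print(len(sample), sum(sample) / len(sample))
```

The accepted draws approximate the posterior of `theta`. In this toy setup, widening the prior (a larger `prior_upper`) lowers the fraction of draws that are accepted, which is one reason prior choice and computational effort interact in rejection-based ABC.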


Subjects
Biological Evolution, Climate, Biological Models, Animals, Genetic Speciation, Geological Phenomena, Islands, Phylogeny
19.
Syst Biol ; 61(4): 675-89, 2012 Jul.
Article in English | MEDLINE | ID: mdl-22357728

ABSTRACT

In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input-output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.


Subjects
Biological Evolution, Computational Biology/standards, Programming Languages, Biodiversity, Classification, Informatics, Biological Models, Phylogeny, Software
20.
Mol Ecol Resour ; 11(2): 364-9, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21429145

ABSTRACT

We present Ginkgo, a software package for agent-based, forward-time simulations of genealogies of multiple unlinked loci from diploid populations. Ginkgo simulates the evolution of one or more species on a spatially explicit landscape of cells. The user of the software can specify the geographical and environmental characteristics of the landscape, and these properties can change according to a prespecified schedule. The geographical elements modelled include the arrangement of cells and movement rates between particular cells. Each species has a function that can calculate a fitness score for any combination of an individual organism's phenotype and environmental characteristics. The user can control the number of fitness factors (the dimensionality of the cell-specific fitness factors and the individuals' phenotypic vectors) and the weighting of each of these dimensions in the fitness calculation. Cell-specific fitness trait optima can be specified across the landscape to mimic differences in habitat. In addition to their differing fitness functions, species can differ in terms of their vagility and fecundity. Genealogies and occurrence data can be produced at any time during the simulation in NEXUS and ESRI ASCII Grid formats, respectively.


Subjects
Computer Simulation, Phylogeography, Software, Biological Evolution, Ecosystem