Results 1 - 20 of 23
1.
BMC Evol Biol ; 18(1): 46, 2018 04 04.
Article in English | MEDLINE | ID: mdl-29618314

ABSTRACT

BACKGROUND: The pattern of data availability in a phylogenetic data set may lead to the formation of terraces, collections of equally optimal trees. Terraces can arise in tree space if trees are scored with parsimony or with partitioned, edge-unlinked maximum likelihood. Theory predicts that terraces can be large, but their prevalence in contemporary data sets has never been surveyed. We selected 26 data sets and phylogenetic trees reported in recent literature and investigated the terraces to which the trees would belong, under a common set of inference assumptions. We examined terrace size as a function of the sampling properties of the data sets, including taxon coverage density (the proportion of taxon-by-gene positions with any data present) and a measure of gene sampling "sufficiency". We evaluated each data set in relation to the theoretical minimum gene sampling depth needed to reduce terrace size to a single tree, and explored the impact of the terraces found in replicate trees in bootstrap methods. RESULTS: Terraces were identified in nearly all data sets with taxon coverage densities < 0.90. They were not found, however, in high-coverage-density (i.e., ≥ 0.94) transcriptomic and genomic data sets. The terraces could be very large, and size varied inversely with taxon coverage density and with gene sampling sufficiency. Few data sets achieved a theoretical minimum gene sampling depth needed to reduce terrace size to a single tree. Terraces found during bootstrap resampling reduced overall support. CONCLUSIONS: If certain inference assumptions apply, trees estimated from empirical data sets often belong to large terraces of equally optimal trees. Terrace size correlates with data set sampling properties. Data sets seldom include enough genes to reduce terrace size to one tree. When bootstrap replicate trees lie on a terrace, statistical support for phylogenetic hypotheses may be reduced. Although some of the published analyses surveyed were conducted with edge-linked inference models (which do not induce terraces), unlinked models have been used and advocated. The present study describes the potential impact of that inference assumption on phylogenetic inference in the context of the kinds of multigene data sets now widely assembled for large-scale tree construction.
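
As an illustration of the taxon coverage density defined in the abstract above (the proportion of taxon-by-gene positions with any data present), the following is a minimal sketch. The function name and the toy presence/absence matrix are hypothetical and are not taken from the study.

```python
def taxon_coverage_density(presence):
    """Return the fraction of taxon-by-gene cells that contain data.

    `presence` is a list of rows (one per taxon); each row holds one
    boolean per gene, True when that taxon has any data for that gene.
    """
    cells = [cell for row in presence for cell in row]
    if not cells:
        raise ValueError("presence matrix is empty")
    return sum(cells) / len(cells)


# Hypothetical 4-taxon x 3-gene matrix: 9 of 12 cells have data.
toy_matrix = [
    [True, True, True],
    [True, False, True],
    [True, True, False],
    [False, True, True],
]
density = taxon_coverage_density(toy_matrix)
print(density)
```

Under the thresholds reported in the abstract, a matrix like this toy one (density below 0.90) falls in the range where terraces were identified in nearly all surveyed data sets.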


Subjects
Databases, Genetic , Phylogeny , Genes , Models, Genetic
2.
Syst Biol ; 64(5): 709-26, 2015 Sep.
Article in English | MEDLINE | ID: mdl-25999395

ABSTRACT

Terraces are sets of trees with precisely the same likelihood or parsimony score, which can be induced by missing sequences in partitioned multi-locus phylogenetic data matrices. The potentially large set of trees on a terrace can be characterized by enumeration algorithms or consensus methods that exploit the pattern of partial taxon coverage in the data, independent of the sequence data themselves. Terraces can add ambiguity and complexity to phylogenetic inference, particularly in settings where inference is already challenging: data sets with many taxa and relatively few loci. In this article we present five new findings about terraces and their impacts on phylogenetic inference. First, we clarify assumptions about partitioning scheme model parameters that are necessary for the existence of terraces. Second, we explore the dependence of terrace size on partitioning scheme and indicate how to find the partitioning scheme associated with the largest terrace containing a given tree. Third, we highlight the impact of terrace size on bootstrap estimates of confidence limits in clades, and characterize the surprising result that the bootstrap proportion for a clade, as it is usually calculated, can be entirely determined by the frequency of bipartitions on a terrace, with some bipartitions receiving high support even when incorrect. Fourth, we dissect some effects of prior distributions of edge lengths on the computed posterior probabilities of clades on terraces, to understand an example in which long edges "attract" each other in Bayesian inference. Fifth, we describe how assuming relationships between edge lengths of different loci, as an attempt to avoid terraces, can also be problematic when taxon coverage is partial, specifically when heterotachy is present. Finally, we discuss strategies for remediation of some of these problems. One promising approach finds a minimal set of taxa which, when deleted from the data matrix, reduces the size of a terrace to a single tree.
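
The bootstrap proportion for a clade mentioned in the abstract above is usually calculated as the fraction of replicate trees that contain the clade. The sketch below shows only that tally. It assumes each replicate tree has already been reduced to a set of clades (each a frozenset of taxon labels); this representation, the function name, and the toy replicates are hypothetical, and no terrace enumeration is attempted.

```python
from collections import Counter


def clade_support(replicate_trees):
    """Map each clade to the fraction of replicate trees containing it.

    `replicate_trees` is a list of trees, each given as an iterable of
    clades, where a clade is a frozenset of taxon labels.
    """
    n = len(replicate_trees)
    if n == 0:
        raise ValueError("no replicate trees supplied")
    # set(tree) guards against counting a clade twice within one tree.
    counts = Counter(clade for tree in replicate_trees for clade in set(tree))
    return {clade: count / n for clade, count in counts.items()}


# Four hypothetical bootstrap replicates on taxa A-D.
replicates = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABD")},
    {frozenset("AC"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABC")},
]
support = clade_support(replicates)
print(support[frozenset("AB")])
```

If many replicate trees are drawn from the same terrace, a tally of this kind reflects how often a bipartition occurs on that terrace rather than signal in the sequence data, which is the concern the abstract raises.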


Subjects
Classification/methods , Computer Simulation/standards , Phylogeny , Models, Genetic
3.
Syst Biol ; 63(5): 812-8, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24789072

ABSTRACT

We introduce molecularevolution.org, a publicly available gateway for high-throughput, maximum-likelihood phylogenetic analysis powered by grid computing. The gateway features a garli 2.0 web service that enables a user to quickly and easily submit thousands of maximum likelihood tree searches or bootstrap searches that are executed in parallel on distributed computing resources. The garli web service allows one to easily specify partitioned substitution models using a graphical interface, and it performs sophisticated post-processing of phylogenetic results. Although the garli web service has been used by the research community for over three years, here we formally announce the availability of the service, describe its capabilities, highlight new features and recent improvements, and provide details about how the grid system efficiently delivers high-quality phylogenetic results.


Subjects
Classification/methods , Phylogeny , Software , Access to Information , Internet
4.
Syst Biol ; 63(5): 645-59, 2014 Sep.
Article in English | MEDLINE | ID: mdl-24721692

ABSTRACT

We describe new methods for characterizing gene tree discordance in phylogenomic data sets, which screen for deviations from neutral expectations, summarize variation in statistical support among gene trees, and allow comparison of the patterns of discordance induced by various analysis choices. Using an exceptionally complete set of genome sequences for the short arm of chromosome 3 in Oryza (rice) species, we applied these methods to identify the causes and consequences of differing patterns of discordance in the sets of gene trees inferred using a panel of 20 distinct analysis pipelines. We found that discordance patterns were strongly affected by aspects of data selection, alignment, and alignment masking. Unusual patterns of discordance evident when using certain pipelines were reduced or eliminated by using alternative pipelines, suggesting that they were the product of methodological biases rather than evolutionary processes. In some cases, once such biases were eliminated, evolutionary processes such as introgression could be implicated. Additionally, patterns of gene tree discordance had significant downstream impacts on species tree inference. For example, inference from supermatrices was positively misleading when pipelines that led to biased gene trees were used. Several results may generalize to other data sets: we found that gene tree and species tree inference gave more reasonable results when intron sequence was included during sequence alignment and tree inference, the alignment software PRANK was used, and detectable "block-shift" alignment artifacts were removed. We discuss our findings in the context of well-established relationships in Oryza and continuing controversies regarding the domestication history of O. sativa.


Subjects
Chromosomes, Plant/genetics , Classification/methods , Oryza/classification , Oryza/genetics , Phylogeny , Genome, Plant/genetics
5.
Syst Biol ; 61(1): 170-3, 2012 Jan.
Article in English | MEDLINE | ID: mdl-21963610

ABSTRACT

Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.


Subjects
Computational Biology/methods , Phylogeny , Software , Algorithms , Computing Methodologies , Evolution, Molecular , Genome
6.
Proc Natl Acad Sci U S A ; 107(51): 22172-7, 2010 Dec 21.
Article in English | MEDLINE | ID: mdl-21127261

ABSTRACT

The genetic basis of parallel innovation remains poorly understood due to the rarity of independent origins of the same complex trait among model organisms. We focus on two groups of teleost fishes that independently gained myogenic electric organs underlying electrical communication. Earlier work suggested that a voltage-gated sodium channel gene (Scn4aa), which arose by whole-genome duplication, was neofunctionalized for expression in electric organ and subsequently experienced strong positive selection. However, it was not possible to determine if these changes were temporally linked to the independent origins of myogenic electric organs in both lineages. Here, we test predictions of such a relationship. We show that Scn4aa co-option and rapid sequence evolution were tightly coupled to the two origins of electric organ, providing strong evidence that Scn4aa contributed to parallel innovations underlying the evolutionary diversification of each electric fish group. Independent evolution of electric organs and Scn4aa co-option occurred more than 100 million years following the origin of Scn4aa by duplication. During subsequent diversification of the electrical communication channels, amino acid substitutions in both groups occurred in the same regions of the sodium channel that likely contribute to electric signal variation. Thus, the phenotypic similarities between independent electric fish groups are also associated with striking parallelism at genetic and molecular levels. Our results show that gene duplication can contribute to remarkably similar innovations in repeatable ways even after long waiting periods between gene duplication and the origins of novelty.


Subjects
Electric Organ/physiology , Evolution, Molecular , Fish Proteins/genetics , Fishes/genetics , Gene Duplication/genetics , Sodium Channels/genetics , Amino Acid Sequence , Amino Acid Substitution , Animals , Genome-Wide Association Study , Humans , Molecular Sequence Data
7.
Proc Natl Acad Sci U S A ; 107(50): 21242-7, 2010 Dec 14.
Article in English | MEDLINE | ID: mdl-21078965

ABSTRACT

Phylogenetic analysis has been widely used to test the a priori hypothesis of epidemiological clustering in suspected transmission chains of HIV-1. Among studies showing strong support for relatedness between HIV samples obtained from infected individuals, evidence for the direction of transmission between epidemiologically related pairs has been lacking. During transmission of HIV, a genetic bottleneck occurs, resulting in the paraphyly of source viruses with respect to those of the recipient. This paraphyly establishes the direction of transmission, from which the source can then be inferred. Here, we present methods and results from two criminal cases, State of Washington v Anthony Eugene Whitfield, case number 04-1-0617-5 (Superior Court of the State of Washington, Thurston County, 2004) and State of Texas v Philippe Padieu, case numbers 219-82276-07, 219-82277-07, 219-82278-07, 219-82279-07, 219-82280-07, and 219-82705-07 (219th Judicial District Court, Collin County, TX, 2009), which provided evidence that direction can be established from blinded case samples. The observed paraphyly from each case study led to the identification of an inferred source (i.e., index case), whose identity was revealed at trial to be that of the defendant.


Subjects
Criminal Law , DNA, Viral/analysis , Forensic Genetics/methods , HIV Infections/transmission , HIV-1/classification , HIV-1/genetics , Sequence Analysis, DNA , DNA, Viral/blood , Databases, Genetic , HIV Infections/genetics , HIV Infections/virology , Humans , Molecular Sequence Data , Phylogeny , Texas , Washington
9.
Nat Genet ; 50(2): 285-296, 2018 02.
Article in English | MEDLINE | ID: mdl-29358651

ABSTRACT

The genus Oryza is a model system for the study of molecular evolution over time scales ranging from a few thousand to 15 million years. Using 13 reference genomes spanning the Oryza species tree, we show that, despite few large-scale chromosomal rearrangements, rapid species diversification is mirrored by lineage-specific emergence and turnover of many novel elements, including transposons, and potential new coding and noncoding genes. Our study resolves controversial areas of the Oryza phylogeny, showing a complex history of introgression among different chromosomes in the young 'AA' subclade containing the two domesticated species. This study highlights the prevalence of functionally coupled disease resistance genes and identifies many new haplotypes of potential use for future crop protection. Finally, this study marks a milestone in modern rice research with the release of a complete long-read assembly of IR 8 'Miracle Rice', which relieved famine and drove the Green Revolution in Asia 50 years ago.


Subjects
Crops, Agricultural/genetics , Evolution, Molecular , Genetic Variation , Oryza/classification , Oryza/genetics , Conserved Sequence , Domestication , Genetic Speciation , Genome, Plant , Phylogeny
10.
Nat Commun ; 5: 5269, 2014 Nov 10.
Article in English | MEDLINE | ID: mdl-25381880

ABSTRACT

The extent and importance of endogenous viral elements have been extensively described in animals but are much less well understood in plants. Here we describe a new genus of Caulimoviridae called 'Florendovirus', members of which have colonized the genomes of a large diversity of flowering plants, sometimes at very high copy numbers (>0.5% total genome content). The genome invasion of Oryza is dated to over 1.8 million years ago (MYA) but phylogeographic evidence points to an even older age of 20-34 MYA for this virus group. Some appear to have had a bipartite genome organization, a unique characteristic among viral retroelements. In Vitis vinifera, 9% of the endogenous florendovirus loci are located within introns and therefore may influence host gene expression. The frequent colocation of endogenous florendovirus loci with TA simple sequence repeats, which are associated with chromosome fragility, suggests sequence capture during repair of double-stranded DNA breaks.


Subjects
Caulimoviridae/genetics , Evolution, Molecular , Genome, Plant/genetics , Oryza/virology , Phylogeny , Gene Dosage/genetics , Genetic Loci/genetics , Introns/genetics , Microsatellite Repeats/genetics , Virus Replication/genetics
11.
PLoS One ; 8(6): e66245, 2013.
Article in English | MEDLINE | ID: mdl-23755303

ABSTRACT

Molecular divergence time analyses often rely on the age of fossil lineages to calibrate node age estimates. Most divergence time analyses are now performed in a Bayesian framework, where fossil calibrations are incorporated as parametric prior probabilities on node ages. It is widely accepted that an ideal parameterization of such node age prior probabilities should be based on a comprehensive analysis of the fossil record of the clade of interest, but there is currently no generally applicable approach for calculating such informative priors. We provide here a simple and easily implemented method that employs fossil data to estimate the likely amount of missing history prior to the oldest fossil occurrence of a clade, which can be used to fit an informative parametric prior probability distribution on a node age. Specifically, our method uses the extant diversity and the stratigraphic distribution of fossil lineages confidently assigned to a clade to fit a branching model of lineage diversification. Conditioning this on a simple model of fossil preservation, we estimate the likely amount of missing history prior to the oldest fossil occurrence of a clade. The likelihood surface of missing history can then be translated into a parametric prior probability distribution on the age of the clade of interest. We show that the method performs well with simulated fossil distribution data, but that the likelihood surface of missing history can at times be too complex for the distribution-fitting algorithm employed by our software tool. An empirical example of the application of our method is performed to estimate echinoid node ages. A simulation-based sensitivity analysis using the echinoid data set shows that node age prior distributions estimated under poor preservation rates are significantly less informative than those estimated under high preservation rates.


Subjects
Genetic Speciation , Models, Genetic , Algorithms , Animals , Bayes Theorem , Calibration , Evolution, Molecular , Fossils , Likelihood Functions , Models, Statistical , Sea Urchins/genetics , Software
12.
PLoS One ; 8(3): e58568, 2013.
Article in English | MEDLINE | ID: mdl-23554903

ABSTRACT

BACKGROUND: Higher-level relationships within the Lepidoptera, and particularly within the species-rich subclade Ditrysia, are generally not well understood, although recent studies have yielded progress. We present the most comprehensive molecular analysis of lepidopteran phylogeny to date, focusing on relationships among superfamilies. METHODOLOGY/PRINCIPAL FINDINGS: 483 taxa spanning 115 of 124 families were sampled for 19 protein-coding nuclear genes, from which maximum likelihood tree estimates and bootstrap percentages were obtained using GARLI. Assessment of heuristic search effectiveness showed that better trees and higher bootstrap percentages probably remain to be discovered even after 1000 or more search replicates, but further search proved impractical even with grid computing. Other analyses explored the effects of sampling nonsynonymous change only versus partitioned and unpartitioned total nucleotide change; deletion of rogue taxa; and compositional heterogeneity. Relationships among the non-ditrysian lineages previously inferred from morphology were largely confirmed, plus some new ones, with strong support. Robust support was also found for divergences among non-apoditrysian lineages of Ditrysia, but only rarely so within Apoditrysia. Paraphyly for Tineoidea is strongly supported by analysis of nonsynonymous-only signal; conflicting, strong support for tineoid monophyly when synonymous signal was added back is shown to result from compositional heterogeneity. CONCLUSIONS/SIGNIFICANCE: Support for among-superfamily relationships outside the Apoditrysia is now generally strong. Comparable support is mostly lacking within Apoditrysia, but dramatically increased bootstrap percentages for some nodes after rogue taxon removal, and concordance with other evidence, strongly suggest that our picture of apoditrysian phylogeny is approximately correct. This study highlights the challenge of finding optimal topologies when analyzing hundreds of taxa. It also shows that some nodes get strong support only when analysis is restricted to nonsynonymous change, while total change is necessary for strong support of others. Thus, multiple types of analyses will be necessary to fully resolve lepidopteran phylogeny.


Subjects
Butterflies/genetics , Moths/genetics , Phylogeny , Animals , Butterflies/classification , Moths/classification
13.
PLoS One ; 7(11): e47450, 2012.
Article in English | MEDLINE | ID: mdl-23185239

ABSTRACT

BACKGROUND: In a previous study of higher-level arthropod phylogeny, analyses of nucleotide sequences from 62 protein-coding nuclear genes for 80 panarthropod species yielded significantly higher bootstrap support for selected nodes than did amino acids. This study investigates the cause of that discrepancy. METHODOLOGY/PRINCIPAL FINDINGS: The hypothesis is tested that failure to distinguish the serine residues encoded by two disjunct clusters of codons (TCN, AGY) in amino acid analyses leads to this discrepancy. In one test, the two clusters of serine codons (Ser1, Ser2) are conceptually translated as separate amino acids. Analysis of the resulting 21-amino-acid data matrix shows striking increases in bootstrap support, in some cases matching that in nucleotide analyses. In a second approach, nucleotide and 20-amino-acid data sets are artificially altered through targeted deletions, modifications, and replacements, revealing the pivotal contributions of distinct Ser1 and Ser2 codons. We confirm that previous methods of coding nonsynonymous nucleotide change are robust and computationally efficient by introducing two new degeneracy coding methods. We demonstrate for degeneracy coding that neither compositional heterogeneity at the level of nucleotides nor codon usage bias between Ser1 and Ser2 clusters of codons (or their separately coded amino acids) is a major source of non-phylogenetic signal. CONCLUSIONS: The incongruity in support between amino-acid and nucleotide analyses of the aforementioned arthropod data set is resolved by showing that "standard" 20-amino-acid analyses yield lower node support specifically when serine provides crucial signal. Separate coding of Ser1 and Ser2 residues yields support commensurate with that found by degenerated nucleotides, without introducing phylogenetic artifacts. While exclusion of all serine data leads to reduced support for serine-sensitive nodes, these nodes are still recovered in the ML topology, indicating that the enhanced signal from Ser1 and Ser2 is not qualitatively different from that of the other amino acids.
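
The conceptual translation of the two serine codon clusters (TCN and AGY) as separate amino acids, described in the abstract above, can be sketched as follows. This is a minimal illustration assuming the standard genetic code; the function name and the choice of "J" as the symbol for Ser2 are arbitrary placeholders, not the coding used in the study.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, ...).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_21(seq, ser2_symbol="J"):
    """Translate DNA to a 21-state alphabet with Ser1 and Ser2 kept apart.

    TCN codons (Ser1) stay "S"; AGT and AGC (Ser2, i.e. AGY) become
    `ser2_symbol`, a hypothetical placeholder. Unrecognized codons map
    to "X"; trailing bases that do not fill a codon are ignored.
    """
    seq = seq.upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        aa = CODON_TABLE.get(codon, "X")
        if aa == "S" and codon.startswith("AG"):
            aa = ser2_symbol
        residues.append(aa)
    return "".join(residues)


print(translate_21("TCTAGCATGAGT"))
```

A standard 20-amino-acid translation of the same sequence would give the same symbol for all three serine codons, which is the loss of signal the abstract attributes to standard amino acid analyses.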


Subjects
Amino Acids/genetics , Arthropods/genetics , Codon/genetics , Genomics/methods , Nucleotides/genetics , Phylogeny , Serine/genetics , Animals , Databases, Genetic , Likelihood Functions , Models, Genetic , Terminology as Topic
14.
Integr Zool ; 4(1): 64-74, 2009 Mar.
Article in English | MEDLINE | ID: mdl-21392277

ABSTRACT

Voltage-dependent sodium channels are critical for electrical excitability. Invertebrates possess a single sodium channel gene; two rounds of genome duplication early in vertebrates increased the number to four. Since the teleost-tetrapod split, independent gene duplications in each lineage have further increased the number of sodium channel genes to 10 in tetrapods and 8 in teleosts. Here we review how the occurrence of multiple sodium channel paralogs has influenced the evolutionary history of three groups of fishes: pufferfish, gymnotiform and mormyriform electric fish. Pufferfish (tetraodontidae) produce a neurotoxin, tetrodotoxin, that binds to and blocks the pore of sodium channels. Pufferfish evolved resistance to their own toxins by amino acid substitutions in the pore of their sodium channels. These substitutions had to occur in parallel across multiple paralogs for organismal resistance to evolve. Gymnotiform and mormyriform fishes independently evolved electric organs to generate electricity for communication and object localization. Two sodium channel genes are expressed in muscle in most fishes. In both groups of weakly electric fishes, one gene lost its expression in muscle and became compartmentalized in the evolutionary novel electric organ, which is a muscle derivative. This gene then evolved at elevated rates, whereas the gene that is still expressed in muscle does not show elevated rates of evolution. In the electric organ-expressing gene, amino acid substitutions occur in parts of the channel involved in determining how long the channel will be open or closed. The enhanced rate of sequence evolution of this gene likely underlies the species-level variations in the electric signal.


Subjects
Electric Fish/physiology , Evolution, Molecular , Sodium Channels/physiology , Tetraodontiformes/physiology , Amino Acid Sequence , Amino Acid Substitution , Animals , Drug Resistance/genetics , Electric Organ/physiology , Genes, Duplicate/genetics , Molecular Sequence Data , Muscle, Skeletal/metabolism , Phylogeny , Sequence Alignment , Sodium Channels/genetics , Tetrodotoxin/toxicity
15.
Philos Trans R Soc Lond B Biol Sci ; 363(1512): 4013-21, 2008 Dec 27.
Article in English | MEDLINE | ID: mdl-18852108

ABSTRACT

Computer simulations provide a flexible method for assessing the power and robustness of phylogenetic inference methods. Unfortunately, simulated data are often obviously atypical of data encountered in studies of molecular evolution. Unrealistic simulations can lead to conclusions that are irrelevant to real-data analyses or can provide a biased view of which methods perform well. Here, we present a software tool designed to generate data under a complex codon model that allows each residue in the protein sequence to have a different set of equilibrium amino acid frequencies. The software can obtain maximum-likelihood estimates of the parameters of the Halpern and Bruno model from empirical data and a fixed tree; given an arbitrary tree and a fixed set of parameters, the software can then simulate artificial datasets. We present the results of a simulation experiment using randomly generated tree shapes and substitution parameters estimated from 1610 mammalian cytochrome b sequences. We tested tree inference at the amino acid, nucleotide and codon levels and under parsimony, maximum-likelihood, Bayesian and distance criteria (for a total of more than 650 analyses on each dataset). Based on these simulations, nucleotide-level analyses seem to be more accurate than amino acid and codon analyses. The performance of distance-based phylogenetic methods appears to be quite sensitive to the choice of model and the form of rate heterogeneity used. Further studies are needed to assess the generality of these conclusions. For example, fitting parameters of the Halpern and Bruno model to sequences from other genes will reveal the extent to which our conclusions were influenced by the choice of cytochrome b. Incorporating codon bias and more sources of heterogeneity into the simulator will be crucial to determining whether the current results are caused by a bias in the current simulation study in favour of nucleotide analyses.


Subjects
Algorithms , Amino Acid Substitution/genetics , Classification/methods , Evolution, Molecular , Models, Genetic , Phylogeny , Bayes Theorem , Codon/genetics , Computer Simulation , Cytochromes b/genetics , Likelihood Functions
16.
J Exp Biol ; 211(Pt 11): 1814-8, 2008 Jun.
Article in English | MEDLINE | ID: mdl-18490397

ABSTRACT

Animal communication systems are subject to natural selection so the imprint of selection must reside in the genome of each species. Electric fish generate electric organ discharges (EODs) from a muscle-derived electric organ (EO) and use these fields for electrolocation and communication. Weakly electric teleosts have evolved at least twice (mormyriforms, gymnotiforms) allowing a comparison of the workings of evolution in two independently evolved sensory/motor systems. We focused on the genes for two Na(+) channels, Nav1.4a and Nav1.4b, which are orthologs of the mammalian muscle-expressed Na(+) channel gene Nav1.4. Both genes are expressed in muscle in non-electric fish. Nav1.4b is expressed in muscle in electric fish, but Nav1.4a expression has been lost from muscle and gained in the evolutionarily novel EO in both groups. We hypothesized that Nav1.4a might be evolving to optimize the EOD for different sensory environments and the generation of species-specific communication signals. We obtained the sequence for Nav1.4a from non-electric, mormyriform and gymnotiform species, estimated a phylogenetic tree, and determined rates of evolution. We observed elevated rates of evolution in this gene in both groups coincident with the loss of Nav1.4a from muscle and its compartmentalization in EO. We found amino acid substitutions at sites known to be critical for channel inactivation; analyses suggest that these changes are likely to be the result of positive selection. We suggest that the diversity of EOD waveforms in both groups of electric fish is correlated with accelerations in the rate of evolution of the Nav1.4a Na(+) channel gene due to changes in selection pressure on the gene once it was solely expressed in the EO.


Subjects
Animal Communication , Electric Fish/genetics , Evolution, Molecular , Amino Acid Sequence , Animals , Fish Proteins/chemistry , Fish Proteins/genetics , Fish Proteins/physiology , Molecular Sequence Data , Muscle Proteins/chemistry , Muscle Proteins/genetics , Muscle Proteins/physiology , Phylogeny , Selection, Genetic , Sequence Alignment , Sodium Channels/chemistry , Sodium Channels/genetics , Sodium Channels/physiology , Species Specificity
17.
Proc Natl Acad Sci U S A ; 103(10): 3675-80, 2006 Mar 07.
Article in English | MEDLINE | ID: mdl-16505358

ABSTRACT

We investigated whether the evolution of electric organs and electric signal diversity in two independently evolved lineages of electric fishes was accompanied by convergent changes on the molecular level. We found that a sodium channel gene (Na(v)1.4a) that is expressed in muscle in nonelectric fishes has lost its expression in muscle and is expressed instead in the evolutionarily novel electric organ in both lineages of electric fishes. This gene appears to be evolving under positive selection in both lineages, facilitated by its restricted expression in the electric organ. This view is reinforced by the lack of evidence for selection on this gene in one electric species in which expression of this gene is retained in muscle. Amino acid replacements occur convergently in domains that influence channel inactivation, a key trait for shaping electric communication signals. Some amino acid replacements occur at or adjacent to sites at which disease-causing mutations have been mapped in human sodium channel genes, emphasizing that these replacements occur in functionally important domains. Selection appears to have acted on the final step in channel inactivation, but complementarily on the inactivation "ball" in one lineage, and its receptor site in the other lineage. Thus, changes in the expression and sequence of the same gene are associated with the independent evolution of signal complexity.


Subjects
Electric Fish/genetics , Molecular Evolution , Sodium Channels/genetics , Amino Acid Sequence , Animals , Electric Fish/classification , Electric Organ/metabolism , Fishes/classification , Fishes/genetics , Gymnotiformes/classification , Gymnotiformes/genetics , Humans , Molecular Sequence Data , Phylogeny , Amino Acid Sequence Homology , Signal Transduction/genetics , Sodium Channels/chemistry , Species Specificity
18.
Syst Biol ; 51(4): 588-98, 2002 Aug.
Article in English | MEDLINE | ID: mdl-12228001

ABSTRACT

Several authors have argued recently that extensive taxon sampling has a positive and important effect on the accuracy of phylogenetic estimates. However, other authors have argued that there is little benefit of extensive taxon sampling, and so phylogenetic problems can or should be reduced to a few exemplar taxa as a means of reducing the computational complexity of the phylogenetic analysis. In this paper we examined five aspects of study design that may have led to these different perspectives. First, we considered the measurement of phylogenetic error across a wide range of taxon sample sizes, and conclude that the expected error based on randomly selecting trees (which varies by taxon sample size) must be considered in evaluating error in studies of the effects of taxon sampling. Second, we addressed the scope of the phylogenetic problems defined by different samples of taxa, and argue that phylogenetic scope needs to be considered in evaluating the importance of taxon-sampling strategies. Third, we examined the claim that fast and simple tree searches are as effective as more thorough searches at finding near-optimal trees that minimize error. We show that a more complete search of tree space reduces phylogenetic error, especially as the taxon sample size increases. Fourth, we examined the effects of simple versus complex simulation models on taxonomic sampling studies. Although benefits of taxon sampling are apparent for all models, data generated under more complex models of evolution produce higher overall levels of error and show greater positive effects of increased taxon sampling. Fifth, we asked if different phylogenetic optimality criteria show different effects of taxon sampling. Although we found strong differences in effectiveness of different optimality criteria as a function of taxon sample size, increased taxon sampling improved the results from all the common optimality criteria. Nonetheless, the method that showed the lowest overall performance (minimum evolution) also showed the least improvement from increased taxon sampling. Taking each of these results into account reinforces the conclusion that increased sampling of taxa is one of the most important ways to increase overall phylogenetic accuracy.


Subjects
Phylogeny , Research Design , Likelihood Functions
19.
Mol Phylogenet Evol ; 25(2): 361-71, 2002 Nov.
Article in English | MEDLINE | ID: mdl-12414316

ABSTRACT

Four New World genera of dwarf boas (Exiliboa, Trachyboa, Tropidophis, and Ungaliophis) have been placed by many systematists in a single group (traditionally called Tropidophiidae). However, the monophyly of this group has been questioned in several studies. Moreover, the overall relationships among basal snake lineages, including the placement of the dwarf boas, are poorly understood. We obtained mtDNA sequence data for 12S, 16S, and intervening tRNA-Val genes from 23 species of snakes representing most major snake lineages, including all four genera of New World dwarf boas. We then examined the phylogenetic position of these species by estimating the phylogeny of the basal snakes. Our phylogenetic analysis suggests that New World dwarf boas are not monophyletic. Instead, we find Exiliboa and Ungaliophis to be most closely related to sand boas (Erycinae), boas (Boinae), and advanced snakes (Caenophidia), whereas Tropidophis and Trachyboa form an independent clade that separated relatively early in snake radiation. Our estimate of snake phylogeny differs significantly in other ways from some previous estimates. For instance, pythons do not cluster with boas and sand boas, but instead show a strong relationship with Loxocemus and Xenopeltis. Additionally, uropeltids cluster strongly with Cylindrophis, and together are embedded in what has previously been considered the macrostomatan radiation. These relationships are supported by both bootstrapping (parametric and nonparametric approaches) and Bayesian analysis, although Bayesian support values are consistently higher than those obtained from nonparametric bootstrapping. Simulations show that Bayesian support values represent much better estimates of phylogenetic accuracy than do nonparametric bootstrap support values, at least under the conditions of our study.
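
As an illustration of the nonparametric bootstrap mentioned in this abstract, the following is a minimal sketch and not the authors' procedure. It resamples alignment columns with replacement and reports how often a chosen pair of taxa comes out as the closest pair. The closest-pair criterion (Hamming distance) is an assumed, deliberately crude stand-in for a real tree search, and the names hamming, closest_pair, and bootstrap_support are hypothetical.

```python
import random


def hamming(a, b):
    # Number of aligned sites at which two sequences differ.
    return sum(x != y for x, y in zip(a, b))


def closest_pair(alignment):
    # Assumed stand-in for tree inference: the pair of taxa with the
    # smallest Hamming distance (ties go to the first pair in sorted order).
    names = sorted(alignment)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    return min(pairs, key=lambda p: hamming(alignment[p[0]], alignment[p[1]]))


def bootstrap_support(alignment, pair, replicates=200, seed=0):
    # Fraction of bootstrap replicates in which `pair` is the closest pair.
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    target = tuple(sorted(pair))
    hits = 0
    for _ in range(replicates):
        # Resample site columns with replacement, keeping the alignment length.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {
            name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()
        }
        if closest_pair(resampled) == target:
            hits += 1
    return hits / replicates


# Toy alignment (invented data, for illustration only).
toy = {"A": "ACGTACGT", "B": "ACGTACGT", "C": "TTGTACAA", "D": "GGCCACAA"}
support_ab = bootstrap_support(toy, ("A", "B"))
```

In a real analysis the closest-pair step would be replaced by a full tree search on each replicate, and support would be tallied per clade rather than per pair.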


Subjects
Boidae/genetics , Phylogeny , Animals , Bayes Theorem , Statistical Data Interpretation , Likelihood Functions , Mitochondria/genetics , Ribosomal RNA/genetics , 16S Ribosomal RNA/genetics
20.
Mol Biol Evol ; 19(10): 1717-26, 2002 Oct.
Article in English | MEDLINE | ID: mdl-12270898

ABSTRACT

We investigated the usefulness of a parallel genetic algorithm for phylogenetic inference under the maximum-likelihood (ML) optimality criterion. Parallelization was accomplished by assigning each "individual" in the genetic algorithm "population" to a separate processor so that the number of processors used was equal to the size of the evolving population (plus one additional processor for the control of operations). The genetic algorithm incorporated branch-length and topological mutation, recombination, selection on the ML score, and (in some cases) migration and recombination among subpopulations. We tested this parallel genetic algorithm with large (228 taxa) data sets of both empirically observed DNA sequence data (for angiosperms) and simulated DNA sequence data. For both observed and simulated data, search-time improvement was nearly linear with respect to the number of processors, so the parallelization strategy appears to be highly effective at improving computation time for large phylogenetic problems using the genetic algorithm. We also explored various ways of optimizing and tuning the parameters of the genetic algorithm. Under the conditions of our analyses, we did not find the best-known solution using the genetic algorithm approach before terminating each run. We discuss some possible limitations of the current implementation of this genetic algorithm as well as avenues for its future improvement.
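
The loop of mutation, recombination, and selection described in this abstract can be sketched roughly as follows. This is a generic, serial toy and not the authors' implementation: individuals are plain parameter vectors rather than trees, toy_score is a hypothetical stand-in for the ML score, and every function name and default value is an assumption made for illustration.

```python
import random


def toy_score(individual):
    # Hypothetical stand-in for a log-likelihood: higher is better, maximum 0.
    return -sum((x - 0.5) ** 2 for x in individual)


def mutate(individual, rng, scale=0.1):
    # Perturb one randomly chosen parameter (loosely analogous to a
    # branch-length mutation).
    child = list(individual)
    i = rng.randrange(len(child))
    child[i] += rng.gauss(0.0, scale)
    return child


def recombine(a, b, rng):
    # One-point crossover between two parents.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(pop_size=8, n_params=5, generations=200, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.random() for _ in range(n_params)] for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        # Scoring each individual is independent of the others; this is the
        # step a parallel version could assign to one processor per individual.
        ranked = sorted(population, key=toy_score, reverse=True)
        history.append(toy_score(ranked[0]))
        # Selection: keep the better half unchanged (elitism).
        survivors = ranked[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(recombine(a, b, rng), rng))
        population = survivors + children
    best = max(population, key=toy_score)
    return best, history


best, history = evolve()
```

Because the better half of each generation is carried over unchanged, the best score in this sketch never decreases from one generation to the next. Migration among subpopulations, which the abstract also mentions, is not modeled here.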


Subjects
Algorithms , Genetic Models , Phylogeny , Computational Biology , Molecular Evolution , Likelihood Functions , Software