Results 1 - 20 of 167
1.
Nature ; 602(7896): 263-267, 2022 02.
Article in English | MEDLINE | ID: mdl-34937052

ABSTRACT

High-throughput sequencing projects generate genome-scale sequence data for species-level phylogenies [1-3]. However, state-of-the-art Bayesian methods for inferring timetrees are computationally limited to small datasets and cannot exploit the growing number of available genomes [4]. In the case of mammals, molecular-clock analyses of limited datasets have produced conflicting estimates of clade ages with large uncertainties [5,6], and thus the timescale of placental mammal evolution remains contentious [7-10]. Here we develop a Bayesian molecular-clock dating approach to estimate a timetree of 4,705 mammal species integrating information from 72 mammal genomes. We show that increasingly larger phylogenomic datasets produce diversification time estimates with progressively smaller uncertainties, facilitating precise tests of macroevolutionary hypotheses. For example, we confidently reject an explosive model of placental mammal origination in the Palaeogene [8] and show that crown Placentalia originated in the Late Cretaceous with unambiguous ordinal diversification in the Palaeocene/Eocene. Our Bayesian methodology facilitates analysis of complete genomes and thousands of species within an integrated framework, making it possible to address hitherto intractable research questions on species diversifications. This approach can be used to address other contentious cases of animal and plant diversifications that require analysis of species-level phylogenomic datasets.


Subject(s)
Evolution, Molecular , Mammals , Phylogeny , Animals , Bayes Theorem , Eutheria/classification , Eutheria/genetics , Female , Mammals/classification , Mammals/genetics , Placenta , Pregnancy , Species Specificity
2.
Nat Rev Genet ; 21(7): 428-444, 2020 07.
Article in English | MEDLINE | ID: mdl-32424311

ABSTRACT

Knowing phylogenetic relationships among species is fundamental for many studies in biology. An accurate phylogenetic tree underpins our understanding of the major transitions in evolution, such as the emergence of new body plans or metabolism, and is key to inferring the origin of new genes, detecting molecular adaptation, understanding morphological character evolution and reconstructing demographic changes in recently diverged species. Although data are ever more plentiful and powerful analysis methods are available, there remain many challenges to reliable tree building. Here, we discuss the major steps of phylogenetic analysis, including identification of orthologous genes or proteins, multiple sequence alignment, and choice of substitution models and inference methodologies. Understanding the different sources of errors and the strategies to mitigate them is essential for assembling an accurate tree of life.


Subject(s)
Genome , Genomics , Models, Genetic , Phylogeny , Animals , Computational Biology/methods , Crosses, Genetic , Databases, Genetic , Evolution, Molecular , Genetic Heterogeneity , Genomics/methods , Humans
3.
Proc Natl Acad Sci U S A ; 120(44): e2310708120, 2023 Oct 31.
Article in English | MEDLINE | ID: mdl-37871206

ABSTRACT

Analyses of genome sequence data have revealed pervasive interspecific gene flow and enriched our understanding of the role of gene flow in speciation and adaptation. Inference of gene flow using genomic data requires powerful statistical methods. Yet current likelihood-based methods involve heavy computation and are feasible for small datasets only. Here, we implement the multispecies-coalescent-with-migration model in the Bayesian program bpp, which can be used to test for gene flow and estimate migration rates, as well as species divergence times and population sizes. We develop Markov chain Monte Carlo algorithms for efficient sampling from the posterior, enabling the analysis of genome-scale datasets with thousands of loci. Implementation of both introgression and migration models in the same program allows us to test whether gene flow occurred continuously over time or in pulses. Analyses of genomic data from Anopheles mosquitoes demonstrate rich information in typical genomic datasets about the mode and rate of gene flow.


Subject(s)
Algorithms , Gene Flow , Animals , Phylogeny , Computer Simulation , Bayes Theorem , Likelihood Functions , Models, Genetic
4.
Syst Biol ; 2024 Jan 08.
Article in English | MEDLINE | ID: mdl-38190300

ABSTRACT

The opposing forces of gene flow and isolation are two major processes shaping genetic diversity. Understanding how these vary across space and time is necessary to identify the environmental features that promote diversification. The detection of considerable geographic structure in taxa from the arid Nearctic has prompted research into the drivers of isolation in the region. Several geographic features have been proposed as barriers to gene flow, including the Colorado River, Western Continental Divide, and a hypothetical Mid-Peninsular Seaway in Baja California. However, recent studies suggest that the role of barriers in genetic differentiation may have been overestimated when compared to other mechanisms of divergence. In this study, we infer historical and spatial patterns of connectivity and isolation in Desert Spiny Lizards (Sceloporus magister) and Baja Spiny Lizards (S. zosteromus), which together form a species complex composed of parapatric lineages with wide distributions in arid western North America. Our analyses incorporate mitochondrial sequences, genomic-scale data, and past and present climatic data to evaluate the nature and strength of barriers to gene flow in the region. Our approach relies on estimates of migration under the multispecies coalescent to understand the history of lineage divergence in the face of gene flow. Results show that the S. magister complex is geographically structured, but we also detect instances of gene flow. The Continental Divide is a strong barrier to gene flow, while the Colorado River is more permeable. Analyses yield conflicting results for the catalyst of differentiation of peninsular lineages in S. zosteromus. Our study shows how large-scale genomic data for thoroughly sampled species can shed new light on biogeography. Furthermore, our approach highlights the need for the combined analysis of multiple sources of evidence to adequately characterize the drivers of divergence.

5.
Mol Biol Evol ; 40(4)2023 04 04.
Article in English | MEDLINE | ID: mdl-37096789

ABSTRACT

The CODEML program in the PAML package has been widely used to analyze protein-coding gene sequences to estimate the synonymous and nonsynonymous rates (dS and dN) and to detect positive Darwinian selection driving protein evolution. For users not familiar with molecular evolutionary analysis, the program is known to have a steep learning curve. Here, we provide a step-by-step protocol to illustrate the commonly used tests available in the program, including the branch models, the site models, and the branch-site models, which can be used to detect positive selection driving adaptive protein evolution affecting particular lineages of the species phylogeny, affecting a subset of amino acid residues in the protein, and affecting a subset of sites along prespecified lineages, respectively. A data set of the myxovirus (Mx) genes from ten mammal and two bird species is used as an example. We discuss a new feature in CODEML that allows users to perform positive selection tests for multiple genes for the same set of taxa, as is common in modern genome-sequencing projects. The PAML package is distributed at https://github.com/abacus-gene/paml under the GNU license, with support provided at its discussion site (https://groups.google.com/g/pamlsoftware). Data files used in this protocol are available at https://github.com/abacus-gene/paml-tutorial.


Subject(s)
Evolution, Molecular , Software , Animals , Codon , Base Sequence , Selection, Genetic , Phylogeny , Mammals/genetics
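
The tests outlined in the abstract above are usually run as likelihood ratio tests between a null and an alternative codon model. The following is a minimal sketch, not CODEML's own code: the function name, the log-likelihood values, and the choice of two degrees of freedom (often used when comparing site models M1a and M2a) are assumptions made for illustration.

```python
import math

def lrt_pvalue_df2(lnl_null, lnl_alt):
    """Return (test statistic, p-value) for a likelihood ratio test with 2 degrees of freedom."""
    # The statistic is twice the log-likelihood difference, floored at zero.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For exactly 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    return stat, math.exp(-stat / 2.0)

# Hypothetical log-likelihoods for a null and an alternative model (not from the paper).
stat, p_value = lrt_pvalue_df2(lnl_null=-1000.0, lnl_alt=-995.0)
print(f"2*delta_lnL = {stat:.2f}, p = {p_value:.4f}")
```

The degrees of freedom depend on the pair of models being compared, so the closed form used here applies only to the two-degree case.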
6.
Mol Biol Evol ; 40(8)2023 08 03.
Article in English | MEDLINE | ID: mdl-37552932

ABSTRACT

Genomic data are informative about the history of species divergence and interspecific gene flow, including the direction, timing, and strength of gene flow. However, gene flow in opposite directions generates similar patterns in multilocus sequence data, such as reduced sequence divergence between the hybridizing species. As a result, inference of the direction of gene flow is challenging. Here, we investigate the information about the direction of gene flow present in genomic sequence data using likelihood-based methods under the multispecies-coalescent-with-introgression model. We analyze the case of two species, and use simulation to examine cases with three or four species. We find that it is easier to infer gene flow from a small population to a large one than in the opposite direction, and easier to infer inflow (gene flow from outgroup species to an ingroup species) than outflow (gene flow from an ingroup species to an outgroup species). It is also easier to infer gene flow if there is a longer time of separate evolution between the initial divergence and subsequent introgression. When introgression is assumed to occur in the wrong direction, the time of introgression tends to be correctly estimated and the Bayesian test of gene flow is often significant, while estimates of introgression probability can be even greater than the true probability. We analyze genomic sequences from Heliconius butterflies to demonstrate that typical genomic datasets are informative about the direction of interspecific gene flow, as well as its timing and strength.


Subject(s)
Butterflies , Animals , Likelihood Functions , Bayes Theorem , Butterflies/genetics , Genome , Genomics , Gene Flow , Phylogeny , Hybridization, Genetic
7.
Syst Biol ; 72(2): 446-465, 2023 Jun 16.
Article in English | MEDLINE | ID: mdl-36504374

ABSTRACT

In the past two decades, genomic data have been widely used to detect historical gene flow between species in a variety of plants and animals. The Tamias quadrivittatus group of North American chipmunks, which originated through a series of rapid speciation events, are known to undergo massive amounts of mitochondrial introgression. Yet in a recent analysis of targeted nuclear loci from the group, no evidence for cross-species introgression was detected, indicating widespread cytonuclear discordance. The study used the heuristic method HYDE to detect gene flow, which may suffer from low power. Here we use the Bayesian method implemented in the program BPP to re-analyze these data. We develop a Bayesian test of introgression, calculating the Bayes factor via the Savage-Dickey density ratio using the Markov chain Monte Carlo (MCMC) sample under the model of introgression. We take a stepwise approach to constructing an introgression model by adding introgression events onto a well-supported binary species tree. The analysis detected robust evidence for multiple ancient introgression events affecting the nuclear genome, with introgression probabilities reaching 63%. We estimate population parameters and highlight the fact that species divergence times may be seriously underestimated if ancient cross-species gene flow is ignored in the analysis. We examine the assumptions and performance of HYDE and demonstrate that it lacks power if gene flow occurs between sister lineages or if the mode of gene flow does not match the assumed hybrid-speciation model with symmetrical population sizes. Our analyses highlight the power of likelihood-based inference of cross-species gene flow using genomic sequence data. [Bayesian test; BPP; chipmunks; introgression; MSci; multispecies coalescent; Savage-Dickey density ratio].


Subject(s)
Gene Flow , Sciuridae , Animals , Phylogeny , Bayes Theorem , Sciuridae/genetics , Likelihood Functions , Heuristics , North America , DNA, Mitochondrial/genetics
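
One way to picture the Bayes factor calculation mentioned in the abstract above is to compare how much prior and posterior mass falls in a small null region near an introgression probability of zero. This is a rough sketch under stated assumptions, not the BPP implementation: it assumes a U(0, 1) prior on the introgression probability, a hypothetical cutoff `eps`, and a toy list standing in for an MCMC sample.

```python
def savage_dickey_b10(phi_samples, eps=0.01):
    """Approximate the Bayes factor for introgression (H1) against no introgression (H0).

    Assumes a U(0, 1) prior on phi, so the prior mass of the null region
    phi < eps equals eps. The posterior mass of that region is estimated as
    the fraction of sampled phi values that fall inside it.
    """
    if not phi_samples:
        raise ValueError("need at least one sample")
    post_null = sum(1 for phi in phi_samples if phi < eps) / len(phi_samples)
    if post_null == 0.0:
        # No sample lies in the null region, so the ratio exceeds what
        # this sample can resolve.
        return float("inf")
    return eps / post_null

# Toy stand-in for an MCMC sample of phi (hypothetical values).
toy_sample = [0.005] * 10 + [0.5] * 90
b10 = savage_dickey_b10(toy_sample, eps=0.01)
```

A value of `b10` well above 1 would favour the introgression model, and a value well below 1 would favour the null. The cutoff that counts as strong evidence is a separate choice.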
8.
Syst Biol ; 72(5): 1119-1135, 2023 11 01.
Article in English | MEDLINE | ID: mdl-37366056

ABSTRACT

Inference of deep phylogenies has almost exclusively used protein rather than DNA sequences based on the perception that protein sequences are less prone to homoplasy and saturation or to issues of compositional heterogeneity than DNA sequences. Here, we analyze a model of codon evolution under an idealized genetic code and demonstrate that those perceptions may be misconceptions. We conduct a simulation study to assess the utility of protein versus DNA sequences for inferring deep phylogenies, with protein-coding data generated under models of heterogeneous substitution processes across sites in the sequence and among lineages on the tree, and then analyzed using nucleotide, amino acid, and codon models. Analysis of DNA sequences under nucleotide-substitution models (possibly with the third codon positions excluded) recovered the correct tree at least as often as analysis of the corresponding protein sequences under modern amino acid models. We also applied the different data-analysis strategies to an empirical dataset to infer the metazoan phylogeny. Our results from both simulated and real data suggest that DNA sequences may be as useful as proteins for inferring deep phylogenies and should not be excluded from such analyses. Analysis of DNA data under nucleotide models has a major computational advantage over protein-data analysis, potentially making it feasible to use advanced models that account for among-site and among-lineage heterogeneity in the nucleotide-substitution process in inference of deep phylogenies.


Subject(s)
Models, Genetic , Nucleotides , Animals , Phylogeny , Base Sequence , Codon , Amino Acids/genetics , Evolution, Molecular
9.
Syst Biol ; 72(4): 820-836, 2023 08 07.
Article in English | MEDLINE | ID: mdl-36961245

ABSTRACT

Cross-species introgression can have significant impacts on phylogenomic reconstruction of species divergence events. Here, we used simulations to show how the presence of even a small amount of introgression can bias divergence time estimates when gene flow is ignored in the analysis. Using advances in analytical methods under the multispecies coalescent (MSC) model, we demonstrate that by accounting for incomplete lineage sorting and introgression using large phylogenomic data sets this problem can be avoided. The multispecies-coalescent-with-introgression (MSci) model is capable of accurately estimating both divergence times and ancestral effective population sizes, even when only a single diploid individual per species is sampled. We characterize some general expectations for biases in divergence time estimation under three different scenarios: 1) introgression between sister species, 2) introgression between non-sister species, and 3) introgression from an unsampled (i.e., ghost) outgroup lineage. We also conducted simulations under the isolation-with-migration (IM) model and found that the MSci model assuming episodic gene flow was able to accurately estimate species divergence times despite high levels of continuous gene flow. We estimated divergence times under the MSC and MSci models from two published empirical datasets with previous evidence of introgression, one of 372 target-enrichment loci from baobabs (Adansonia), and another of 1000 transcriptome loci from 14 species of the tomato relative, Jaltomata. The empirical analyses not only confirm our findings from simulations, demonstrating that the MSci model can reliably estimate divergence times but also show that divergence time estimation under the MSC can be robust to the presence of small amounts of introgression in empirical datasets with extensive taxon sampling. [divergence time; gene flow; hybridization; introgression; MSci model; multispecies coalescent].


Subject(s)
Gene Flow , Hybridization, Genetic , Phylogeny , Models, Genetic
10.
Mol Biol Evol ; 39(5)2022 05 03.
Article in English | MEDLINE | ID: mdl-35417543

ABSTRACT

Full-likelihood implementations of the multispecies coalescent with introgression (MSci) model treat genealogical fluctuations across the genome as a major source of information to infer the history of species divergence and gene flow using multilocus sequence data. However, MSci models are known to have unidentifiability issues, whereby different models or parameters make the same predictions about the data and cannot be distinguished by the data. Previous studies of unidentifiability have focused on heuristic methods based on gene trees and do not make an efficient use of the information in the data. Here we study the unidentifiability of MSci models under the full-likelihood methods. We characterize the unidentifiability of the bidirectional introgression (BDI) model, which assumes that gene flow occurs in both directions. We derive simple rules for arbitrary BDI models, which create unidentifiability of the label-switching type. In general, an MSci model with k BDI events has 2^k unidentifiable modes or towers in the posterior, with each BDI event between sister species creating within-model parameter unidentifiability and each BDI event between nonsister species creating between-model unidentifiability. We develop novel algorithms for processing Markov chain Monte Carlo samples to remove label-switching problems and implement them in the bpp program. We analyze real and synthetic data to illustrate the utility of the BDI models and the new algorithms. We discuss the unidentifiability of heuristic methods and provide guidelines for the use of MSci models to infer gene flow using genomic data.


Subject(s)
Gene Flow , Genomics , Algorithms , Genomics/methods , Models, Genetic , Phylogeny
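
The mode count in the abstract above follows from each BDI event being either flipped or left as is, giving one mode per combination and 2^k in total. The sketch below simply enumerates those combinations. The abstract does not spell out the parameter mapping, so the rule used here (one representative introgression probability per event, replaced by one minus its value on a flip) is an illustrative assumption, and the function names are hypothetical.

```python
from itertools import product

def mirror_modes(k):
    """Return every flipped/unflipped combination across k BDI events."""
    return list(product((False, True), repeat=k))

def apply_flips(phis, flips):
    """Illustrative assumption: flipping an event replaces phi with 1 - phi."""
    return tuple(1.0 - phi if flip else phi for phi, flip in zip(phis, flips))

# Hypothetical introgression probabilities for k = 2 BDI events.
phis = (0.1, 0.3)
modes = [apply_flips(phis, flips) for flips in mirror_modes(len(phis))]
# One entry per combination: len(modes) == 2 ** len(phis).
# The first entry is the unflipped parameter set.
```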
11.
Mol Biol Evol ; 39(8)2022 08 03.
Article in English | MEDLINE | ID: mdl-35907248

ABSTRACT

The multispecies coalescent (MSC) model accommodates both species divergences and within-species coalescent and provides a natural framework for phylogenetic analysis of genomic data when the gene trees vary across the genome. The MSC model implemented in the program bpp assumes a molecular clock and the Jukes-Cantor model, and is suitable for analyzing genomic data from closely related species. Here we extend our implementation to more general substitution models and relaxed clocks to allow the rate to vary among species. The MSC-with-relaxed-clock model allows the estimation of species divergence times and ancestral population sizes using genomic sequences sampled from contemporary species when the strict clock assumption is violated, and provides a simulation framework for evaluating species tree estimation methods. We conducted simulations and analyzed two real datasets to evaluate the utility of the new models. We confirm that the clock-JC model is adequate for inference of shallow trees with closely related species, but it is important to account for clock violation for distant species. Our simulation suggests that there is valuable phylogenetic information in the gene-tree branch lengths even if the molecular clock assumption is seriously violated, and the relaxed-clock models implemented in bpp are able to extract such information. Our Markov chain Monte Carlo algorithms suffer from mixing problems when used for species tree estimation under the relaxed clock and we discuss possible improvements. We conclude that the new models are currently most effective for estimating population parameters such as species divergence times when the species tree is fixed.


Subject(s)
Models, Genetic , Bayes Theorem , Computer Simulation , Markov Chains , Monte Carlo Method , Phylogeny
12.
Mol Biol Evol ; 39(12)2022 12 05.
Article in English | MEDLINE | ID: mdl-36317198

ABSTRACT

Genomic sequence data provide a rich source of information about the history of species divergence and interspecific hybridization or introgression. Despite recent advances in genomics and statistical methods, it remains challenging to infer gene flow, and as a result, one may have to estimate introgression rates and times under misspecified models. Here we use mathematical analysis and computer simulation to examine estimation bias and issues of interpretation when the model of gene flow is misspecified in analysis of genomic datasets, for example, if introgression is assigned to the wrong lineages. In the case of two species, we establish a correspondence between the migration rate in the continuous migration model and the introgression probability in the introgression model. When gene flow occurs continuously through time but in the analysis is assumed to occur at a fixed time point, common evolutionary parameters such as species divergence times are surprisingly well estimated. However, the time of introgression tends to be estimated towards the recent end of the period of continuous gene flow. When introgression events are assigned incorrectly to the parental or daughter lineages, introgression times tend to collapse onto species divergence times, with introgression probabilities underestimated. Overall, our analyses suggest that the simple introgression model is useful for extracting information concerning between-specific gene flow and divergence even when the model may be misspecified. However, for reliable inference of gene flow it is important to include multiple samples per species, in particular, from hybridizing species.


Subject(s)
Gene Flow , Genomics , Computer Simulation
13.
Trends Genet ; 36(11): 845-856, 2020 11.
Article in English | MEDLINE | ID: mdl-32709458

ABSTRACT

Molecular data have been used to date species divergences ever since they were described as documents of evolutionary history in the 1960s. Yet, an inadequate fossil record and discordance between gene trees and species trees are persistently problematic. We examine how, by accommodating gene tree discordance and by scaling branch lengths to absolute time using mutation rate and generation time, multispecies coalescent (MSC) methods can potentially overcome these challenges. We find that time estimates can differ - in some cases, substantially - depending on whether MSC methods or traditional phylogenetic methods that apply concatenation are used, and whether the tree is calibrated with pedigree-based mutation rates or with fossils. We discuss the advantages and shortcomings of both approaches and provide practical guidance for data analysis when using these methods.


Subject(s)
Biological Evolution , Fossils , Mammals/classification , Mammals/genetics , Models, Theoretical , Mutation Rate , Phylogeny , Animals , Gene Flow , Models, Genetic
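
The scaling step described in the abstract above comes down to simple arithmetic: a divergence time measured in expected substitutions per site is divided by a per-generation mutation rate to give generations, then multiplied by the generation time to give years. A minimal sketch follows, with a hypothetical function name and input values that are not taken from the paper.

```python
def divergence_time_years(tau, mu_per_generation, generation_years):
    """Convert a divergence time in substitutions per site to calendar years.

    tau: divergence time in expected substitutions per site.
    mu_per_generation: mutation rate per site per generation.
    generation_years: average generation time in years.
    """
    if mu_per_generation <= 0 or generation_years <= 0:
        raise ValueError("mutation rate and generation time must be positive")
    generations = tau / mu_per_generation
    return generations * generation_years

# Hypothetical inputs: tau = 0.006, mu = 1.2e-8 per site per generation, 25-year generations.
years = divergence_time_years(0.006, 1.2e-8, 25.0)
print(f"about {years / 1e6:.1f} million years")
```

Because the estimate is linear in both the mutation rate and the generation time, uncertainty in either one carries straight through to the date.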
14.
Anal Chem ; 95(2): 1721-1730, 2023 01 17.
Article in English | MEDLINE | ID: mdl-36538756

ABSTRACT

Early diagnosis of pathogenic bacteria and treatment are essential to prevent further infection. Photothermal therapy (PTT) is a promising sterilization method with advantages of minimal invasiveness and high efficiency. The effect of PTT depends on the performance of photothermal materials. Herein, Ti3C2-Au nanomaterials were prepared by the electrostatic self-assembly method, and the absorption characteristics were modulated by changing the morphology of Ti3C2-Au to achieve high photothermal conversion efficiency and sensitive label-free SERS bacterial detection. The results showed that the prepared Ti3C2-Au had better SERS performance than Au and achieved direct and sensitive detection of Escherichia coli (E. coli) and Staphylococcus aureus (S. aureus). Under 808 nm laser irradiation, the photothermal conversion efficiency of Ti3C2-Au nanobipyramids (NBPs) was increased to 50.41% compared with the other two composites. The bactericidal rates of Ti3C2-Au NBPs against E. coli and S. aureus were 95.11 and 99.80% in 8 min, respectively, and the killing rates of nine other bacteria were all above 95%, showing broad-spectrum antibacterial properties. Cell viability studies showed that the Ti3C2-Au NBP had significantly improved biocompatibility compared with the Au NBP and was suitable for biological applications. It can simultaneously realize sensitive bacterial detection and photothermal sterilization and is important for the detection and inhibition of pathogenic bacteria.


Subject(s)
Escherichia coli , Nanostructures , Staphylococcus aureus , Anti-Bacterial Agents/pharmacology , Sterilization
15.
Syst Biol ; 71(5): 1159-1177, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35169847

ABSTRACT

Introgressive hybridization plays a key role in adaptive evolution and species diversification in many groups of species. However, frequent hybridization and gene flow between species make estimation of the species phylogeny and key population parameters challenging. Here, we show that by accounting for phasing and using full-likelihood methods, introgression histories and population parameters can be estimated reliably from whole-genome sequence data. We employ the multispecies coalescent (MSC) model with and without gene flow to infer the species phylogeny and cross-species introgression events using genomic data from six members of the erato-sara clade of Heliconius butterflies. The methods naturally accommodate random fluctuations in genealogical history across the genome due to deep coalescence. To avoid heterozygote phasing errors in haploid sequences commonly produced by genome assembly methods, we process and compile unphased diploid sequence alignments and use analytical methods to average over uncertainties in heterozygote phase resolution. There is robust evidence for introgression across the genome, both among distantly related species deep in the phylogeny and between sister species in shallow parts of the tree. We obtain chromosome-specific estimates of key population parameters such as introgression directions, times and probabilities, as well as species divergence times and population sizes for modern and ancestral species. We confirm ancestral gene flow between the sara clade and an ancestral population of Heliconius telesiphe, a likely hybrid speciation origin for Heliconius hecalesia, and gene flow between the sister species Heliconius erato and Heliconius himera. Inferred introgression among ancestral species also explains the history of two chromosomal inversions deep in the phylogeny of the group. This study illustrates how a full-likelihood approach based on the MSC makes it possible to extract rich historical information of species divergence and gene flow from genomic data. [3s; bpp; gene flow; Heliconius; hybrid speciation; introgression; inversion; multispecies coalescent].


Subject(s)
Butterflies , Animals , Butterflies/genetics , Genomics , Hybridization, Genetic , Likelihood Functions , Phylogeny
16.
Syst Biol ; 71(2): 334-352, 2022 02 10.
Article in English | MEDLINE | ID: mdl-34143216

ABSTRACT

Genome sequencing projects routinely generate haploid consensus sequences from diploid genomes, which are effectively chimeric sequences with the phase at heterozygous sites resolved at random. The impact of phasing errors on phylogenomic analyses under the multispecies coalescent (MSC) model is largely unknown. Here, we conduct a computer simulation to evaluate the performance of four phase-resolution strategies (the true phase resolution, the diploid analytical integration algorithm which averages over all phase resolutions, computational phase resolution using the program PHASE, and random resolution) on estimation of the species tree and evolutionary parameters in analysis of multilocus genomic data under the MSC model. We found that species tree estimation is robust to phasing errors when species divergences were much older than average coalescent times but may be affected by phasing errors when the species tree is shallow. Estimation of parameters under the MSC model with and without introgression is affected by phasing errors. In particular, random phase resolution causes serious overestimation of population sizes for modern species and biased estimation of cross-species introgression probability. In general, the impact of phasing errors is greater when the mutation rate is higher, the data include more samples per species, and the species tree is shallower with recent divergences. Use of phased sequences inferred by the PHASE program produced small biases in parameter estimates. We analyze two real data sets, one of East Asian brown frogs and another of Rocky Mountains chipmunks, to demonstrate that heterozygote phase-resolution strategies have similar impacts on practical data analyses. We suggest that genome sequencing projects should produce unphased diploid genotype sequences if fully phased data are too challenging to generate, and avoid haploid consensus sequences, which have heterozygous sites phased at random. In case the analytical integration algorithm is computationally unfeasible, computational phasing prior to population genomic analyses is an acceptable alternative. [BPP; introgression; multispecies coalescent; phase; species tree].


Subject(s)
Diploidy , Models, Genetic , Computer Simulation , Heterozygote , Phylogeny
17.
Mol Biol Evol ; 38(9): 3993-4009, 2021 08 23.
Article in English | MEDLINE | ID: mdl-33492385

ABSTRACT

The multispecies coalescent model provides a natural framework for species tree estimation accounting for gene-tree conflicts. Although a number of species tree methods under the multispecies coalescent have been suggested and evaluated using simulation, their statistical properties remain poorly understood. Here, we use mathematical analysis aided by computer simulation to examine the identifiability, consistency, and efficiency of different species tree methods in the case of three species and three sequences under the molecular clock. We consider four major species-tree methods including concatenation, two-step, independent-sites maximum likelihood, and maximum likelihood. We develop approximations that predict that the probit transform of the species tree estimation error decreases linearly with the square root of the number of loci. Even in this simplest case, major differences exist among the methods. Full-likelihood methods are considerably more efficient than summary methods such as concatenation and two-step. They also provide estimates of important parameters such as species divergence times and ancestral population sizes, whereas these parameters are not identifiable by summary methods. Our results highlight the need to improve the statistical efficiency of summary methods and the computational efficiency of full likelihood methods of species tree estimation.


Subject(s)
Models, Genetic , Computer Simulation , Phylogeny , Population Density , Probability
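
The approximation described in the abstract above can be read as: the probit of the error probability equals an intercept minus a slope times the square root of the number of loci. The sketch below turns that relation into a prediction and inverts it to find the number of loci needed for a target error. The intercept and slope are hypothetical placeholders rather than values from the paper, and the function names are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def predicted_error(n_loci, intercept, slope):
    """Error probability when probit(error) = intercept - slope * sqrt(n_loci)."""
    return _STD_NORMAL.cdf(intercept - slope * math.sqrt(n_loci))

def loci_needed(target_error, intercept, slope):
    """Smallest number of loci whose predicted error is at or below target_error."""
    z = _STD_NORMAL.inv_cdf(target_error)  # probit of the target error
    root_n = max((intercept - z) / slope, 0.0)
    return math.ceil(root_n ** 2)

# Hypothetical coefficients for one method; a larger slope means a more efficient method.
n = loci_needed(0.05, intercept=0.0, slope=0.1)
print(n, predicted_error(n, 0.0, 0.1))
```

Under this relation, a method with half the slope needs four times as many loci to reach the same error, which is one way to compare efficiency between methods.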
18.
Mol Biol Evol ; 38(7): 2930-2945, 2021 06 25.
Article in English | MEDLINE | ID: mdl-33744959

ABSTRACT

Cis-regulatory elements play important roles in tissue-specific gene expression and in the evolution of various phenotypes, and mutations in promoters and enhancers may be responsible for adaptations of species to environments. TRIM72 is a highly conserved protein that is involved in energy metabolism. Its expression in the heart varies considerably in primates, with high levels of expression in Old World monkeys and near absence in hominids. Here, we combine phylogenetic hypothesis testing and experimentation to demonstrate that mutations in the promoter are responsible for the differences among primate species in the heart-specific expression of TRIM72. Maximum likelihood estimates of lineage-specific substitution rates under local-clock models show that relative to the evolutionary rate of introns, the rate of the promoter was accelerated by 78% in the common ancestor of Old World monkeys, suggesting a role for positive selection in the evolution of the TRIM72 promoter, possibly driven by selective pressure due to changes in cardiac physiology after species divergence. We demonstrate that mutations in the TRIM72 promoter account for the differential myocardial TRIM72 expression of the human and the rhesus macaque. Furthermore, changes in TRIM72 expression alter the expression of genes involved in oxidative phosphorylation, which in turn affects mitochondrial respiration and cardiac energy capacity. On a broader timescale, phylogenetic regression analyses of data from 29 mammalian species show that mammals with high cardiac expression of TRIM72 have high heart rates, suggesting that the expression changes of TRIM72 may be related to differences in the heart physiology of those species.


Subject(s)
Biological Evolution , Myocardium/metabolism , Primates/genetics , Promoter Regions, Genetic/genetics , Tripartite Motif Proteins/genetics , Animals , Basal Metabolism , Gene Expression Regulation/genetics , Heart Rate , Humans , Mutation , Oxidative Phosphorylation , Primates/metabolism , Tripartite Motif Proteins/metabolism
19.
Proc Biol Sci ; 289(1980): 20220596, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35946151

ABSTRACT

Microsatellites have been a workhorse of evolutionary genetic studies for decades and are still commonly in use for estimating signatures of genetic diversity at the population and species level across a multitude of taxa. Yet, the very high mutation rate of these loci is a double-edged sword, conferring great sensitivity at shallow levels of analysis (e.g. paternity analysis) but yielding considerable uncertainty for deeper evolutionary comparisons. For the present study, we used reduced representation genome-wide data (restriction site-associated DNA sequencing (RADseq)) to test for patterns of interspecific hybridization previously characterized using microsatellite data in a contact zone between two closely related mouse lemur species in Madagascar (Microcebus murinus and Microcebus griseorufus). We revisit this system by examining populations in, near, and far from the contact zone, including many of the same individuals that had previously been identified as hybrids with microsatellite data. Surprisingly, we find no evidence for admixed nuclear ancestry. Instead, re-analyses of microsatellite data and simulations suggest that previously inferred hybrids were false positives and that the program NewHybrids can be particularly sensitive to erroneously inferring hybrid ancestry. Combined with results from coalescent-based analyses and evidence for local syntopic co-occurrence, we conclude that the two mouse lemur species are in fact completely reproductively isolated, thus providing a new understanding of the evolutionary rate whereby reproductive isolation can be achieved in a primate.


Subject(s)
Cheirogaleidae , Lemur , Animals , Biological Evolution , Cheirogaleidae/genetics , Hybridization, Genetic , Lemur/genetics , Madagascar , Microsatellite Repeats , Sequence Analysis, DNA
20.
Mol Ecol ; 31(10): 2814-2829, 2022 05.
Article in English | MEDLINE | ID: mdl-35313033

ABSTRACT

Phylogenomic analyses under the multispecies coalescent model assume no recombination within a locus and free recombination among loci. Yet, in real data sets intralocus recombination causes different sites of the same locus to have different genealogical histories so that the model is misspecified. The impact of recombination on various coalescent-based phylogenomic analyses has not been systematically examined. Here, we conduct a computer simulation to examine the impact of recombination on several Bayesian analyses of multilocus sequence data, including species tree estimation, species delimitation (by Bayesian selection of delimitation models) and estimation of evolutionary parameters such as species divergence and introgression times, population sizes for modern and extinct species, and cross-species introgression probabilities. We found that recombination, at rates comparable to estimates from humans, has little impact on coalescent-based species tree estimation, species delimitation and estimation of population parameters. At rates 10 times higher than the human rate, recombination may affect parameter estimation, causing positive biases in introgression times and ancestral population sizes, although species divergence times and cross-species introgression probabilities are estimated with little bias. Overall, the simulation suggests that phylogenomic inferences under the multispecies coalescent model are robust to realistic amounts of intralocus recombination.


Subject(s)
Models, Genetic , Recombination, Genetic , Bayes Theorem , Computer Simulation , Humans , Phylogeny , Recombination, Genetic/genetics