ABSTRACT
We investigate a 2,000-year genetic transect through Scandinavia spanning the Iron Age to the present, based on 48 new and 249 published ancient genomes and genotypes from 16,638 modern individuals. We find regional variation in the timing and magnitude of gene flow from three sources: the eastern Baltic, the British-Irish Isles, and southern Europe. British-Irish ancestry was widespread in Scandinavia from the Viking period, whereas eastern Baltic ancestry is more localized to Gotland and central Sweden. In some regions, a drop in current levels of external ancestry suggests that ancient immigrants contributed proportionately less to the modern Scandinavian gene pool than indicated by the ancestry of genomes from the Viking and Medieval periods. Finally, we show that a north-south genetic cline that characterizes modern Scandinavians is mainly due to the differential levels of Uralic ancestry and that this cline existed in the Viking Age and possibly earlier.
Subject(s)
Human Genome, Humans, Europe, Genetic Variation, Scandinavian and Nordic Countries, United Kingdom, White People/genetics, White People/history, Human Migration
ABSTRACT
Regulatory variation influencing gene expression is a key contributor to phenotypic diversity, both within and between species. Unfortunately, RNA degrades too rapidly to be recovered from fossil remains, limiting functional genomic insights about our extinct hominin relatives. Many Neanderthal sequences survive in modern humans due to ancient hybridization, providing an opportunity to assess their contributions to transcriptional variation and to test hypotheses about regulatory evolution. We developed a flexible Bayesian statistical approach to quantify allele-specific expression (ASE) in complex RNA-seq datasets. We identified widespread expression differences between Neanderthal and modern human alleles, indicating pervasive cis-regulatory impacts of introgression. Brain regions and testes exhibited significant downregulation of Neanderthal alleles relative to other tissues, consistent with natural selection influencing the tissue-specific regulatory landscape. Our study demonstrates that Neanderthal-inherited sequences are not silent remnants of ancient interbreeding but have measurable impacts on gene expression that contribute to variation in modern human phenotypes.
Subject(s)
Molecular Evolution, Gene Expression, Neanderthals/genetics, Animals, Brain/metabolism, Gene Expression Regulation, Humans, Male, Organ Specificity, Single Nucleotide Polymorphism, Quantitative Trait Loci, Testis/metabolism
ABSTRACT
Genomic time series from experimental evolution studies and ancient DNA datasets offer us a chance to directly observe the interplay of various evolutionary forces. We show how the genome-wide variance in allele frequency change between two time points can be decomposed into the contributions of gene flow, genetic drift, and linked selection. In closed populations, the contribution of linked selection is identifiable because it creates covariances between time intervals, and genetic drift does not. However, repeated gene flow between populations can also produce directionality in allele frequency change, creating covariances. We show how to accurately separate the fraction of variance in allele frequency change due to admixture and linked selection in a population receiving gene flow. We use two human ancient DNA datasets, spanning around 5,000 y, as time transects to quantify the contributions to the genome-wide variance in allele frequency change. We find that a large fraction of genome-wide change is due to gene flow. In both cases, after correcting for known major gene flow events, we do not observe a signal of genome-wide linked selection. Thus despite the known role of selection in shaping long-term polymorphism levels, and an increasing number of examples of strong selection on single loci and polygenic scores from ancient DNA, it appears to be gene flow and drift, and not selection, that are the main determinants of recent genome-wide allele frequency change. Our approach should be applicable to the growing number of contemporary and ancient temporal population genomics datasets.
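The covariance argument in this abstract can be illustrated with a minimal sketch. It is not the authors' estimator: it only simulates pure binomial drift in a closed population and computes the covariance, across loci, of allele frequency changes in successive time intervals. Under drift alone the off-diagonal entries are expected to be close to zero, which is why nonzero covariances are informative about linked selection or gene flow. All function names and the simulation settings (number of loci, sample of 100 chromosomes, three generations) are hypothetical choices made for illustration.

```python
import random


def frequency_changes(trajectory):
    """Per-locus allele frequency change between consecutive time points.

    trajectory: list of T lists, each holding the frequencies of L loci.
    Returns T-1 lists of length L.
    """
    return [
        [later[i] - earlier[i] for i in range(len(earlier))]
        for earlier, later in zip(trajectory, trajectory[1:])
    ]


def covariance(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def temporal_covariance_matrix(trajectory):
    """Covariance, across loci, between the changes in every pair of intervals."""
    deltas = frequency_changes(trajectory)
    return [[covariance(a, b) for b in deltas] for a in deltas]


def simulate_drift(n_loci=2000, n_generations=3, n_chromosomes=100, seed=1):
    """Toy Wright-Fisher drift: binomial resampling each generation, no selection."""
    rng = random.Random(seed)
    freqs = [rng.uniform(0.2, 0.8) for _ in range(n_loci)]
    trajectory = [freqs]
    for _ in range(n_generations):
        freqs = [
            sum(rng.random() < p for _ in range(n_chromosomes)) / n_chromosomes
            for p in freqs
        ]
        trajectory.append(freqs)
    return trajectory
```

In this toy setting the diagonal of the matrix holds the per-interval variance from drift, while the off-diagonal terms fluctuate around zero; a real analysis would additionally need to account for sampling noise and admixture, as the abstract describes.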
Subject(s)
Gene Flow, Genetic Selection, Humans, Ancient DNA, Gene Frequency, Genetic Drift, Population Genetics
ABSTRACT
Analyses of genome sequence data have revealed pervasive interspecific gene flow and enriched our understanding of the role of gene flow in speciation and adaptation. Inference of gene flow using genomic data requires powerful statistical methods. Yet current likelihood-based methods involve heavy computation and are feasible for small datasets only. Here, we implement the multispecies-coalescent-with-migration model in the Bayesian program bpp, which can be used to test for gene flow and estimate migration rates, as well as species divergence times and population sizes. We develop Markov chain Monte Carlo algorithms for efficient sampling from the posterior, enabling the analysis of genome-scale datasets with thousands of loci. Implementation of both introgression and migration models in the same program allows us to test whether gene flow occurred continuously over time or in pulses. Analyses of genomic data from Anopheles mosquitoes demonstrate rich information in typical genomic datasets about the mode and rate of gene flow.
Subject(s)
Algorithms, Gene Flow, Animals, Phylogeny, Computer Simulation, Bayes Theorem, Likelihood Functions, Genetic Models
ABSTRACT
Polyploidy is a major evolutionary force that has shaped plant diversity. However, the various pathways toward polyploid formation and interploidy gene flow remain poorly understood. Here, we demonstrated that the immediate progeny of allotriploid AAC Brassica (obtained by crossing allotetraploid Brassica napus and diploid Brassica rapa) consisted predominantly of aneuploids with ploidal levels ranging from near-triploidy to near-hexaploidy, and their chromosome numbers deviated from the theoretical distribution toward higher chromosome numbers, suggesting that they underwent selection. Karyotype and phenotype analyses showed that aneuploid individuals containing fewer imbalanced chromosomes had higher viability and fertility. Within three generations of self-fertilization, allotriploids mainly developed into near or complete allotetraploids similar to B. napus via gradually increasing chromosome numbers and fertility, suggesting that allotriploids could act as a bridge in polyploid formation, with aneuploids as intermediates. Self-fertilized interploidy hybrids ultimately generated new allopolyploids carrying different chromosome combinations, which may create a reproductive barrier preventing reversion from allotetraploidy to diploidy and promote gene flow from diploids to allotetraploids. These results suggest that the maintenance of a proper genome balance and dosage drove the recurrent conversion of allotriploids to allotetraploids, which may contribute to the formation and evolution of polyploids.
Subject(s)
Brassica napus, Brassica, Brassica/genetics, Plant Genome/genetics, Polyploidy, Brassica napus/genetics, Aneuploidy
ABSTRACT
Genome re-arrangements such as chromosomal inversions are often involved in adaptation. As such, they experience natural selection, which can erode genetic variation. Thus, whether and how inversions can remain polymorphic for extended periods of time remains debated. Here we combine genomics, experiments, and evolutionary modeling to elucidate the processes maintaining an inversion polymorphism associated with the use of a challenging host plant (Redwood trees) in Timema stick insects. We show that the inversion is maintained by a combination of processes, finding roles for life-history trade-offs, heterozygote advantage, local adaptation to different hosts, and gene flow. We use models to show how such multi-layered regimes of balancing selection and gene flow provide resilience to help buffer populations against the loss of genetic variation, maintaining the potential for future evolution. We further show that the inversion polymorphism has persisted for millions of years and is not a result of recent introgression. We thus find that rather than being a nuisance, the complex interplay of evolutionary processes provides a mechanism for the long-term maintenance of genetic variation.
Subject(s)
Acclimatization, Chromosome Inversion, Animals, Chromosome Inversion/genetics, Gene Flow, Genomics, Heterozygote, Neoptera
ABSTRACT
Ghost introgression, or the transfer of genetic material from extinct or unsampled lineages to sampled species, has attracted much attention. However, conclusive evidence for ghost introgression, especially in plant species, remains scarce. Here, we newly assembled chromosome-level genomes for both Carya sinensis and Carya cathayensis, and additionally re-sequenced the whole genomes of 43 C. sinensis individuals as well as 11 individuals representing 11 diploid hickory species. These genomic datasets were used to investigate the reticulation and bifurcation patterns within the genus Carya (Juglandaceae), with a particular focus on the beaked hickory C. sinensis. By combining the D-statistic and BPP methods, we obtained compelling evidence that supports the occurrence of ghost introgression in C. sinensis from an extinct ancestral hickory lineage. This conclusion was reinforced through the phylogenetic network analysis and a genome scan method VolcanoFinder, the latter of which can detect signatures of adaptive introgression from unknown donors. Our results not only dispel certain misconceptions about the phylogenetic history of C. sinensis but also further refine our understanding of Carya's biogeography via divergence estimates. Moreover, the successful integration of the D-statistic and BPP methods demonstrates their efficacy in facilitating a more precise identification of introgression types.
Subject(s)
Genetic Introgression, Plant Genome, Phylogeny, Plant Genome/genetics, Genomics, East Asia, East Asian People
ABSTRACT
Advances in genomic studies have revealed that hybridization in nature is pervasive and raised questions about the dynamics of different genetic and evolutionary factors following the initial hybridization event. While recent research has proposed that the genomic outcomes of hybridization might be predictable to some extent, many uncertainties remain. With comprehensive whole-genome sequence data, we investigated the genetic introgression between 2 divergent lineages of 9-spined sticklebacks (Pungitius pungitius) in the Baltic Sea. We found that the intensity and direction of selection on the introgressed variation has varied across different genomic elements: while functionally important regions displayed reduced rates of introgression, promoter regions showed enrichment. Despite the general trend of negative selection, we identified specific genomic regions that were enriched for introgressed variants, and within these regions, we detected footprints of selection, indicating adaptive introgression. Geographically, we found the selection against the functional changes to be strongest in the vicinity of the secondary contact zone and weaken as a function of distance from the initial contact. Altogether, the results suggest that the stabilization of introgressed variation in the genomes is a complex, multistage process involving both negative and positive selection. In spite of the predominance of negative selection against introgressed variants, we also found evidence for adaptive introgression variants likely associated with adaptation to Baltic Sea environmental conditions.
Subject(s)
Genetic Introgression, Smegmamorpha, Animals, Smegmamorpha/genetics, Genome, Genomics, Genetic Hybridization
ABSTRACT
Admixture between populations and species is common in nature. Since the influx of new genetic material might be either facilitated or hindered by selection, variation in mixture proportions along the genome is expected in organisms undergoing recombination. Various graph-based models have been developed to better understand these evolutionary dynamics of population splits and mixtures. However, current models assume a single mixture rate for the entire genome and do not explicitly account for linkage. Here, we introduce TreeSwirl, a novel method for inferring branch lengths and locus-specific mixture proportions by using genome-wide allele frequency data, assuming that the admixture graph is known or has been inferred. TreeSwirl builds upon TreeMix, which uses Gaussian processes to estimate the presence of gene flow between diverged populations. However, in contrast to TreeMix, our model infers locus-specific mixture proportions by employing a hidden Markov model that accounts for linkage. Through simulated data, we demonstrate that TreeSwirl can accurately estimate locus-specific mixture proportions and handle complex demographic scenarios. It also outperforms related D- and f-statistics in terms of accuracy and sensitivity to detect introgressed loci.
Subject(s)
Gene Frequency, Genetic Models, Population Genetics/methods, Markov Chains, Gene Flow, Genome, Computer Simulation, Genetic Linkage
ABSTRACT
Speciation in the face of gene flow is usually associated with a heterogeneous genomic landscape of divergence in nascent species pairs. However, multiple factors, such as divergent selection and local recombination rate variation, can influence the formation of these genomic islands. Examination of the genomic landscapes of species pairs that are still in the early stages of speciation provides an insight into this conundrum. In this study, population genomic analyses were undertaken using a wide range of sampling and whole-genome resequencing data from 96 unrelated individuals of Kentish plover (Charadrius alexandrinus) and white-faced plover (Charadrius dealbatus). We suggest that the two species exhibit varying levels of population admixture along the Chinese coast and on Taiwan Island. Genome-wide analyses for introgression indicate that ancient introgression occurred in the Taiwan population, and that gene flow is still ongoing in mainland coastal populations. Furthermore, we identified a few genomic regions with significant levels of interspecific differentiation and local recombination suppression, which contain several genes potentially associated with disease resistance, coloration, and regulation of plumage molting and thus may be relevant to the phenotypic and ecological divergence of the two nascent species. Overall, our findings suggest that divergent selection in low recombination regions may be a main force in shaping the genomic islands in two incipient shorebird species.
Subject(s)
Genome-Wide Association Study, Genomic Islands, Humans, Genetic Speciation, Genome, Gene Flow, Genetic Recombination, Genetic Selection
ABSTRACT
The heterogeneous landscape of genomic variation has been well documented in population genomic studies. However, disentangling the intricate interplay of evolutionary forces influencing the genetic variation landscape over time remains challenging. In this study, we assembled a chromosome-level genome for Castanopsis eyrei and sequenced the whole genomes of 276 individuals from 12 Castanopsis species, spanning a broad divergence continuum. We found highly correlated genomic variation landscapes across these species. Furthermore, variations in genetic diversity and differentiation along the genome were strongly associated with recombination rates and gene density. These results suggest that long-term linked selection and conserved genomic features have contributed to the formation of a common genomic variation landscape. By examining how correlations between population summary statistics change throughout the species divergence continuum, we determined that background selection alone does not fully explain the observed patterns of genomic variation; the effects of recurrent selective sweeps must be considered. We further revealed that extensive gene flow has significantly influenced patterns of genomic variation in Castanopsis species. The estimated admixture proportion correlated positively with recombination rate and negatively with gene density, supporting a scenario of selection against gene flow. Additionally, putative introgression regions exhibited strong signals of positive selection, an enrichment of functional genes, and reduced genetic burdens, indicating that adaptive introgression has played a role in shaping the genomes of hybridizing species. This study provides insights into how different evolutionary forces have interacted in driving the evolution of the genomic variation landscape.
Subject(s)
Genetic Variation, Genetic Selection, Molecular Evolution, Gene Flow, Fagaceae/genetics
ABSTRACT
Different genomic regions may reflect conflicting phylogenetic topologies primarily due to incomplete lineage sorting and/or gene flow. Genomic data are necessary to reconstruct the true species tree and explore potential causes of phylogenetic conflict. Here, we investigate the phylogenetic relationships of 4 Emberiza species (Aves: Emberizidae) and discuss the potential causes of the observed mitochondrial non-monophyly of Emberiza godlewskii (Godlewski's bunting) using phylogenomic analyses based on whole genome resequencing data from 41 birds. Analyses based on both the whole mitochondrial genome and ~39 kilobases from the non-recombining W chromosome reveal sister relationships of the northern and southern populations of E. godlewskii with E. cioides and E. cia, respectively. In contrast, the monophyly of E. godlewskii is reflected by the phylogenetic signal of autosomal and Z chromosomal sequence data as well as demographic inference analyses, which, in combination, support the following tree topology: ([{E. godlewskii, E. cia}, E. cioides], E. jankowskii). Using D-statistics, we detected multiple gene flow events among different lineages, indicating pervasive introgressive hybridization within this clade. Introgression from an unsampled lineage that is sister to E. cioides or introgression from an unsampled mitochondrial + W chromosomal lineage of E. cioides into northern E. godlewskii may explain the phylogenetic conflict between the species tree estimated from genome-wide data versus mtDNA/W tree topologies. These results underscore the importance of using genomic data for phylogenetic reconstruction and species delimitation.
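For readers unfamiliar with the D-statistic (ABBA-BABA test) mentioned in this abstract, the following is a minimal sketch of the basic count-based form, D = (nABBA - nBABA) / (nABBA + nBABA). It assumes biallelic sites already encoded as four-character patterns in the order (P1, P2, P3, outgroup), with "A" the outgroup allele and "B" the derived allele; the function name and input encoding are hypothetical and are not taken from the study's pipeline.

```python
from collections import Counter


def d_statistic(site_patterns):
    """Patterson's D from per-site patterns ordered (P1, P2, P3, outgroup).

    Only 'ABBA' and 'BABA' sites are informative; all other patterns are ignored.
    A positive value indicates excess allele sharing between P2 and P3,
    a negative value excess sharing between P1 and P3.
    """
    counts = Counter(site_patterns)
    abba, baba = counts["ABBA"], counts["BABA"]
    if abba + baba == 0:
        raise ValueError("no ABBA or BABA sites; D is undefined")
    return (abba - baba) / (abba + baba)


# Toy input: 30 ABBA sites, 10 BABA sites, 60 uninformative BBAA sites.
toy_sites = ["ABBA"] * 30 + ["BABA"] * 10 + ["BBAA"] * 60
toy_d = d_statistic(toy_sites)  # (30 - 10) / (30 + 10) = 0.5
```

In practice, whether D differs significantly from zero is usually assessed with a block jackknife over genomic windows, which this sketch omits.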
Subject(s)
Passeriformes, Phylogeny, Animals, Passeriformes/genetics, Passeriformes/classification, Maternal Inheritance/genetics, Mitochondrial Genome/genetics, Gene Flow
ABSTRACT
The multispecies coalescent (MSC) model accommodates genealogical fluctuations across the genome and provides a natural framework for comparative analysis of genomic sequence data from closely related species to infer the history of species divergence and gene flow. Given a set of populations, hypotheses of species delimitation (and species phylogeny) may be formulated as instances of MSC models (e.g., MSC for one species versus MSC for two species) and compared using Bayesian model selection. This approach, implemented in the program bpp, has been found to be prone to over-splitting. Alternatively, heuristic criteria based on population parameters (such as population split times, population sizes, and migration rates) estimated from genomic data may be used to delimit species. Here we develop hierarchical merge and split algorithms for heuristic species delimitation based on the genealogical divergence index (gdi) and implement them in a Python pipeline called hhsd. We characterize the behavior of the gdi under a few simple scenarios of gene flow. We apply the new approaches to a dataset simulated under a model of isolation by distance as well as three empirical datasets. Our tests suggest that the new approaches produced sensible results and were less prone to over-splitting. We discuss possible strategies for accommodating paraphyletic species in the hierarchical algorithm, as well as the challenges of species delimitation based on heuristic criteria.
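As a rough illustration of the genealogical divergence index, the sketch below uses the simple no-migration form 1 - exp(-2τ/θ), where τ is the population split time and θ the population size parameter, both in expected mutations per site. This form and the rule-of-thumb thresholds of 0.2 and 0.7 come from the wider species-delimitation literature rather than from this abstract, so they should be read as assumptions; the function names are hypothetical and this is not the hhsd implementation, which also handles migration and the hierarchical merge and split steps.

```python
import math


def genealogical_divergence_index(tau, theta):
    """Index for one population under a split with no gene flow.

    tau:   divergence time of the population pair (mutations per site).
    theta: population size parameter, 4 * N * mu, of the focal population.
    Returns a value in [0, 1): near 0 for very recent splits, approaching 1
    for splits that are old relative to the population size.
    """
    if tau < 0 or theta <= 0:
        raise ValueError("tau must be non-negative and theta positive")
    return 1.0 - math.exp(-2.0 * tau / theta)


def classify(index_value, lower=0.2, upper=0.7):
    """Heuristic call using assumed rule-of-thumb thresholds."""
    if index_value < lower:
        return "single species"
    if index_value > upper:
        return "distinct species"
    return "ambiguous"
```

For example, with θ = 0.01 a split at τ = 0.001 gives an index of about 0.18 (below the lower threshold), whereas τ = 0.01 gives about 0.86 (above the upper threshold).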
ABSTRACT
Introgression allows polyploid species to acquire new genomic content from diploid progenitors or from other unrelated diploid or polyploid lineages, contributing to genetic diversity and facilitating adaptive allele discovery. In some cases, high levels of introgression elicit the replacement of large numbers of alleles inherited from the polyploid's ancestral species, profoundly reshaping the polyploid's genomic composition. In such complex polyploids, it is often difficult to determine which taxa were the progenitor species and which taxa provided additional introgressive blocks through subsequent hybridization. Here, we use population-level genomic data to reconstruct the phylogenetic history of Betula pubescens (downy birch), a tetraploid species often assumed to be of allopolyploid origin and which is known to hybridize with at least four other birch species. This was achieved by modeling polyploidization and introgression events under the multispecies coalescent and then using an approximate Bayesian computation rejection algorithm to evaluate and compare competing polyploidization models. We provide evidence that B. pubescens is the outcome of an autoploid genome doubling event in the common ancestor of B. pendula and its extant sister species, B. platyphylla, that took place approximately 178,000-188,000 generations ago. Extensive hybridization with B. pendula, B. nana, and B. humilis followed in the aftermath of autopolyploidization, with the relative contribution of each of these species to the B. pubescens genome varying markedly across the species' range. Functional analysis of B. pubescens loci containing alleles introgressed from B. nana identified multiple genes involved in climate adaptation, while loci containing alleles derived from B. humilis revealed several genes involved in the regulation of meiotic stability and pollen viability in plant species.
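The model-comparison step described in this abstract relies on an approximate Bayesian computation (ABC) rejection algorithm. The sketch below shows the general idea only, on a toy problem: models are drawn from a uniform prior, each draw simulates a single summary statistic, and draws whose statistic falls within a tolerance of the observed value are kept. The share of accepted draws per model approximates its posterior probability. The two Gaussian "models", the tolerance, and all names are hypothetical stand-ins and bear no relation to the coalescent simulations used in the study.

```python
import random


def abc_model_choice(observed, simulators, n_draws=20000, tolerance=0.1, seed=0):
    """ABC rejection for model comparison with a uniform prior over models.

    observed:   observed summary statistic (a single float in this toy case).
    simulators: dict mapping model name -> function(rng) returning a simulated
                summary statistic.
    Returns a dict of approximate posterior model probabilities.
    """
    rng = random.Random(seed)
    names = list(simulators)
    accepted = {name: 0 for name in names}
    for _ in range(n_draws):
        name = rng.choice(names)                # draw a model from the prior
        simulated = simulators[name](rng)       # simulate a summary statistic
        if abs(simulated - observed) <= tolerance:
            accepted[name] += 1                 # keep draws close to the data
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulations accepted; increase tolerance or draws")
    return {name: accepted[name] / total for name in names}


# Hypothetical toy models: the summary statistic is Gaussian with different means.
toy_models = {
    "model_A": lambda rng: rng.gauss(0.0, 1.0),
    "model_B": lambda rng: rng.gauss(2.0, 1.0),
}
```

Calling `abc_model_choice(0.0, toy_models)` should favour `model_A`, since an observed value of 0 is far more likely under a mean of 0 than under a mean of 2.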
Subject(s)
Alleles, Betula, Plant Genome, Phylogeny, Polyploidy, Betula/genetics, Betula/classification, Genetic Introgression, Genetic Hybridization
ABSTRACT
Rates of species formation vary widely across the tree of life and contribute to massive disparities in species richness among clades. This variation can emerge from differences in metapopulation-level processes that affect the rates at which lineages diverge, persist, and evolve reproductive barriers and ecological differentiation. For example, populations that evolve reproductive barriers quickly should form new species at faster rates than populations that acquire reproductive barriers more slowly. This expectation implicitly links microevolutionary processes (the evolution of populations) and macroevolutionary patterns (the profound disparity in speciation rate across taxa). Here, leveraging extensive field sampling from the Neotropical Cerrado biome in a biogeographically controlled natural experiment, we test the role of an important microevolutionary process, the propensity for population isolation, as a control on speciation rate in lizards and snakes. By quantifying population genomic structure across a set of codistributed taxa with extensive and phylogenetically independent variation in speciation rate, we show that broad-scale patterns of species formation are decoupled from demographic and genetic processes that promote the formation of population isolates. Population isolation is likely a critical stage of speciation for many taxa, but our results suggest that interspecific variability in the propensity for isolation has little influence on speciation rates. These results suggest that other stages of speciation, including the rate at which reproductive barriers evolve and the extent to which newly formed populations persist, are likely to play a larger role than population isolation in controlling speciation rate variation in squamates.
Subject(s)
Biological Evolution, Genetic Speciation, Reproductive Isolation, Reptiles/genetics, Animals, Biodiversity, Molecular Evolution, Population Genetics, Lizards/classification, Lizards/genetics, Phylogeny, Phylogeography, Reptiles/classification, Snakes/classification, Snakes/genetics
ABSTRACT
Host-associated microbiomes, particularly gut microbiomes, often harbor related but distinct microbial lineages, but how this diversity arises and is maintained is not well understood. A prerequisite for lineage diversification is reproductive isolation imposed by barriers to gene flow. In host-associated microbes, genetic recombination can be disrupted by confinement to different hosts, for example following host speciation, or by niche partitioning within the same host. Taking advantage of the simple gut microbiome of social bees, we explore the diversification of two groups of gut-associated bacteria, Gilliamella and Snodgrassella, which have evolved for 80 million y with honey bees and bumble bees. Our analyses of sequenced genomes show that these lineages have diversified into discrete populations with limited gene flow. Divergence has occurred between symbionts of different host species and, in some cases, between symbiont lineages within a single host individual. Populations have acquired genes to adapt to specific hosts and ecological niches; for example, Gilliamella lineages differ markedly in abilities to degrade dietary polysaccharides and to use the resulting sugar components. Using engineered fluorescent bacteria in vivo, we show that Gilliamella lineages localize to different hindgut regions, corresponding to differences in their abilities to use spatially concentrated nitrogenous wastes of hosts. Our findings show that bee gut bacteria can diversify due to isolation in different host species and also due to spatial niche partitioning within individual hosts, leading to barriers to gene flow.
Subject(s)
Gastrointestinal Microbiome, Microbiota, Physiological Adaptation, Animals, Bacteria/genetics, Bees, Host Specificity
ABSTRACT
Hybridization has long been recognized as a fundamental evolutionary process in plants but, until recently, our understanding of its phylogenetic distribution and biological significance across deep evolutionary scales has been largely obscure. Over the past decade, genomic and phylogenomic datasets have revealed, perhaps not surprisingly, that hybridization, often associated with polyploidy, has been common throughout the evolutionary history of plants, particularly in various lineages of flowering plants. However, phylogenomic studies have also highlighted the challenges of disentangling signals of ancient hybridization from other sources of genomic conflict (in particular, incomplete lineage sorting). Here, we provide a critical review of ancient hybridization in vascular plants, outlining well-documented cases of ancient hybridization across plant phylogeny, as well as the challenges unique to documenting ancient versus recent hybridization. We provide a definition for ancient hybridization, which, to our knowledge, has not been explicitly attempted before. Further documenting the extent of deep reticulation in plants should remain an important research focus, especially because published examples likely represent the tip of the iceberg in terms of the total extent of ancient hybridization. However, future research should increasingly explore the macroevolutionary significance of this process, in terms of its impact on evolutionary trajectories (e.g. how does hybridization influence trait evolution or the generation of biodiversity over long time scales?), as well as how life history and ecological factors shape, or have shaped, the frequency of hybridization across geologic time and plant phylogeny. Finally, we consider the implications of ubiquitous ancient hybridization for how we conceptualize, analyze, and classify plant phylogeny. 
Networks, as opposed to bifurcating trees, provide more accurate representations of evolutionary history in many cases, although our ability to infer, visualize, and use networks for comparative analyses is highly limited. Developing improved methods for the generation, visualization, and use of networks represents a critical future direction for plant biology. Current classification systems also do not generally allow for the recognition of reticulate lineages, and our classifications themselves are largely based on evidence from the chloroplast genome. Updating plant classification to better reflect nuclear phylogenies, as well as considering whether and how to recognize hybridization in classification systems, will represent an important challenge for the plant systematics community.
Subject(s)
Genetic Hybridization, Magnoliopsida, Phylogeny, Genomics, Genome, Magnoliopsida/genetics, Plants/genetics
ABSTRACT
Impacts of immigration on micro-evolution and population dynamics fundamentally depend on net rates and forms of resulting gene flow into recipient populations. Yet, the degrees to which observed rates and sex ratios of physical immigration translate into multi-generational genetic legacies have not been explicitly quantified in natural meta-populations, precluding inference on how movements translate into effective gene flow and eco-evolutionary outcomes. Our analyses of three decades of complete song sparrow (Melospiza melodia) pedigree data show that multi-generational genetic contributions from regular natural immigrants substantially exceeded those from contemporary natives, consistent with heterosis-enhanced introgression. However, while contributions from female immigrants exceeded those from female natives by up to three-fold, male immigrants' lineages typically went locally extinct soon after arriving. Both the overall magnitude, and the degree of female bias, of effective gene flow therefore greatly exceeded those which would be inferred from observed physical arrivals, altering multiple eco-evolutionary implications of immigration.
Subject(s)
Emigrants and Immigrants, Passeriformes, Animals, Male, Humans, Female, Gene Flow, Population Dynamics
ABSTRACT
The Arctic is warming four times faster than the rest of the world, threatening the persistence of many Arctic species. It is uncertain if Arctic wildlife will have sufficient time to adapt to such rapidly warming environments. We used genetic forecasting to measure the risk of maladaptation to warming temperatures and sea ice loss in polar bears (Ursus maritimus) sampled across the Canadian Arctic. We found evidence for local adaptation to sea ice conditions and temperature. Forecasting of genome-environment mismatches for predicted climate scenarios suggested that polar bears in the Canadian high Arctic had the greatest risk of becoming maladapted to climate warming. While Canadian high Arctic bears may be the most likely to become maladapted, all polar bears face potentially negative outcomes to climate change. Given the importance of the sea ice habitat to polar bears, we expect that maladaptation to future warming is already widespread across Canada.
Subject(s)
Climate Change, Ursidae, Ursidae/genetics, Animals, Canada, Arctic Regions, Physiological Adaptation, Ice Cover, Ecosystem, Temperature
ABSTRACT
Genomic data are informative about the history of species divergence and interspecific gene flow, including the direction, timing, and strength of gene flow. However, gene flow in opposite directions generates similar patterns in multilocus sequence data, such as reduced sequence divergence between the hybridizing species. As a result, inference of the direction of gene flow is challenging. Here, we investigate the information about the direction of gene flow present in genomic sequence data using likelihood-based methods under the multispecies-coalescent-with-introgression model. We analyze the case of two species, and use simulation to examine cases with three or four species. We find that it is easier to infer gene flow from a small population to a large one than in the opposite direction, and easier to infer inflow (gene flow from outgroup species to an ingroup species) than outflow (gene flow from an ingroup species to an outgroup species). It is also easier to infer gene flow if there is a longer time of separate evolution between the initial divergence and subsequent introgression. When introgression is assumed to occur in the wrong direction, the time of introgression tends to be correctly estimated and the Bayesian test of gene flow is often significant, while estimates of introgression probability can be even greater than the true probability. We analyze genomic sequences from Heliconius butterflies to demonstrate that typical genomic datasets are informative about the direction of interspecific gene flow, as well as its timing and strength.