ABSTRACT
The characterization of genes and biological functions underlying functional diversification and the formation of species is a major goal of evolutionary biology. In this study, we investigated the fast radiation of Microtus voles, one of the most speciose groups of mammals, which shows strong genetic divergence despite few readily observable morphological differences. We produced an annotated reference genome for the common vole, Microtus arvalis, and resequenced the genomes of 10 different species and evolutionary lineages spanning the Microtus speciation continuum. Our full genome sequences illustrate the recent and fast diversification of this group, and we identified genes in highly divergent genomic windows that likely have particular roles in their radiation. We found three biological functions enriched for highly divergent genes in most Microtus species and lineages: olfaction, immunity and metabolism. In particular, olfaction-related genes (mostly olfactory receptors and vomeronasal receptors) are fast evolving in all Microtus species, indicating the exceptional importance of the olfactory system in the evolution of these rodents. Of note is, for example, the signature shared among vole species at Olfr1019, which has been associated with fear responses against predator odours in rodents. Our analyses provide a genome-wide basis for the further characterization of the ecological factors and processes of natural and sexual selection that have contributed to the fast radiation of Microtus voles.
ABSTRACT
Admixture is a common biological phenomenon among populations of the same or different species. Identifying admixed tracts within individual genomes can provide valuable information to date admixture events, reconstruct ancestry-specific demographic histories, or detect adaptive introgression, genetic incompatibilities, as well as regions of the genomes affected by (associative) overdominance. Although many local ancestry inference (LAI) methods have been developed in the last decade, their performance was assessed using large reference panels, which are rarely available for non-model organisms or ancient samples. Moreover, the demographic conditions for which LAI becomes unreliable have not been explicitly outlined. Here, we identify the demographic conditions for which local ancestries can be best estimated using very small reference panels. Furthermore, we compare the performance of two LAI methods (RFMix and MOSAIC) with the performance of a newly developed approach (simpLAI) that can be used even when reference populations consist of single individuals. Based on simulations of various demographic models, we also determine the limits of these LAI tools and propose post-painting filtering steps to reduce false-positive rates and improve the precision and accuracy of the inferred admixed tracts. Besides providing a guide for using LAI, our work shows that reasonable inferences can be obtained from a single diploid genome per reference under demographic conditions that are not uncommon among past human groups and non-model organisms.
Subject(s)
Genetics, Population; Genetics, Population/methods; Humans; Computational Biology/methods
ABSTRACT
Modern and ancient genomes are not necessarily drawn from homogeneous populations, as they may have been collected from different places and at different times. This heterogeneous sampling can be an issue for demographic inferences and results in biased demographic parameters and incorrect model choice if not properly considered. When explicitly accounted for, it can result in very complex models and high data dimensionality that are difficult to analyse. In this paper, we formally study the impact of such spatial and temporal sampling heterogeneity on demographic inference, and we introduce a way to circumvent this problem. To deal with structured samples without increasing the dimensionality of the site frequency spectrum (SFS), we introduce a new structured approach to the existing program fastsimcoal2. We assess the efficiency and relevance of this methodological update with simulated and modern human genomic data. We particularly focus on spatial and temporal heterogeneities to demonstrate the value of this new SFS-based approach, which can be especially useful when handling scattered and ancient DNA samples, as in conservation genetics or archaeogenetics.
Subject(s)
Genetics, Population; Genome; Humans; Genomics; DNA, Ancient; Models, Genetic
ABSTRACT
Although some lineages of animals and plants have made impressive adaptive radiations when provided with ecological opportunity, the propensities to radiate vary profoundly among lineages for unknown reasons. In Africa's Lake Victoria region, one cichlid lineage radiated in every lake, with the largest radiation taking place in a lake less than 16,000 years old. We show that all of its ecological guilds evolved in situ. Cycles of lineage fusion through admixture and lineage fission through speciation characterize the history of the radiation. It was jump-started when several swamp-dwelling refugial populations, each of which was of older hybrid descent, met in the newly forming lake, where they fused into a single population, resuspending old admixture variation. Each population contributed a different set of ancient alleles from which a new adaptive radiation assembled in record time, involving additional fusion-fission cycles. We argue that repeated fusion-fission cycles in the history of a lineage make adaptive radiation fast and predictable.
Subject(s)
Adaptation, Biological; Cichlids; Genetic Speciation; Lakes; Animals; Cichlids/classification; Cichlids/genetics; Phylogeny; Africa, Eastern
ABSTRACT
Range expansions have been common in the history of most species. Serial founder effects and subsequent population growth at expansion fronts typically lead to a loss of genomic diversity along the expansion axis. A frequent consequence is the phenomenon of "gene surfing," where variants located near the expanding front can reach high frequencies or even fix in newly colonized territories. Although gene surfing events have been characterized thoroughly for a specific locus, their effects on linked genomic regions and the overall patterns of genomic diversity have been little investigated. In this study, we simulated the evolution of whole genomes during several types of 1D and 2D range expansions differing by the extent of migration, founder events, and recombination rates. We focused on the characterization of local dips of diversity, or "troughs," taken as a proxy for surfing events. We find that, for a given recombination rate, once we consider the amount of diversity lost since the beginning of the expansion, it is possible to predict the initial evolution of trough density and their average width irrespective of the expansion condition. Furthermore, when recombination rates vary across the genome, we find that troughs are over-represented in regions of low recombination. Therefore, range expansions can leave local and global genomic signatures often interpreted as evidence of past selective events. Given the generality of our results, they could be used as a null model for species having gone through recent expansions, and thus be helpful to correctly interpret many evolutionary biology studies.
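The idea of a diversity "trough" described in this abstract can be illustrated with a minimal sketch. It assumes diversity has already been summarized in consecutive genomic windows and treats any run of windows falling below a fraction of the mean as one trough. The function name `find_troughs` and the `threshold_fraction` cutoff are hypothetical choices made for illustration, not the definition used in the study.

```python
def find_troughs(window_diversity, threshold_fraction=0.1):
    """Return (first_window, last_window) index pairs for each trough.

    A trough is taken here (an illustrative assumption) to be a maximal run
    of consecutive windows whose diversity is below
    threshold_fraction * mean diversity across all windows.
    """
    mean_diversity = sum(window_diversity) / len(window_diversity)
    cutoff = threshold_fraction * mean_diversity
    troughs = []
    start = None
    for i, value in enumerate(window_diversity):
        if value < cutoff:
            if start is None:
                start = i  # a new trough opens at this window
        elif start is not None:
            troughs.append((start, i - 1))  # the trough closed at the previous window
            start = None
    if start is not None:
        # the last trough runs to the end of the chromosome
        troughs.append((start, len(window_diversity) - 1))
    return troughs


# Toy per-window diversity values (made up for illustration).
toy_diversity = [1.0, 1.0, 0.0, 0.0, 1.0, 0.05, 1.0, 1.0]
toy_troughs = find_troughs(toy_diversity)
# Trough width in windows, the quantity whose average the abstract refers to.
toy_widths = [end - start + 1 for start, end in toy_troughs]
```

Trough density would then be the number of troughs per unit of genome length, and the widths can be averaged in the same way.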
Subject(s)
Founder Effect; Genomics; Population Growth
ABSTRACT
The precise genetic origins of the first Neolithic farming populations in Europe and Southwest Asia, as well as the processes and the timing of their differentiation, remain largely unknown. Demogenomic modeling of high-quality ancient genomes reveals that the early farmers of Anatolia and Europe emerged from a multiphase mixing of a Southwest Asian population with a strongly bottlenecked western hunter-gatherer population after the last glacial maximum. Moreover, the ancestors of the first farmers of Europe and Anatolia went through a period of extreme genetic drift during their westward range expansion, contributing highly to their genetic distinctiveness. This modeling elucidates the demographic processes at the root of the Neolithic transition and leads to a spatial interpretation of the population history of Southwest Asia and Europe during the late Pleistocene and early Holocene.
Subject(s)
Farmers; Genome; Agriculture; DNA, Mitochondrial/genetics; Europe; Genetic Drift; Genomics; History, Ancient; Human Migration; Humans
ABSTRACT
The field of population genomics has grown rapidly in response to the recent advent of affordable, large-scale sequencing technologies. As opposed to the situation during the majority of the 20th century, in which the development of theoretical and statistical population genetic insights outpaced the generation of data to which they could be applied, genomic data are now being produced at a far greater rate than they can be meaningfully analyzed and interpreted. With this wealth of data has come a tendency to focus on fitting specific (and often rather idiosyncratic) models to data, at the expense of a careful exploration of the range of possible underlying evolutionary processes. For example, the approach of directly investigating models of adaptive evolution in each newly sequenced population or species often neglects the fact that a thorough characterization of ubiquitous nonadaptive processes is a prerequisite for accurate inference. We here describe the perils of these tendencies, present our consensus views on current best practices in population genomic data analysis, and highlight areas of statistical inference and theory that are in need of further attention. Thereby, we argue for the importance of defining a biologically relevant baseline model tuned to the details of each new analysis, of skepticism and scrutiny in interpreting model fitting results, and of carefully defining addressable hypotheses and underlying uncertainties.
Subject(s)
Genomics; Metagenomics; Genomics/methods
ABSTRACT
A strong reduction in diversity around a specific locus is often interpreted as a recent rapid fixation of a positively selected allele, a phenomenon called a selective sweep. Rapid fixation of neutral variants can however lead to a similar reduction in local diversity, especially when the population experiences changes in population size, e.g. bottlenecks or range expansions. The fact that demographic processes can lead to signals of nucleotide diversity very similar to signals of selective sweeps is at the core of an ongoing discussion about the roles of demography and natural selection in shaping patterns of neutral variation. Here, we quantitatively investigate the shape of such neutral valleys of diversity under a simple model of a single population size change, and we compare it to signals of a selective sweep. We analytically describe the expected shape of such "neutral sweeps" and show that selective sweep valleys of diversity are, for the same fixation time, wider than neutral valleys. On the other hand, it is always possible to parametrize our model to find a neutral valley that has the same width as a given selected valley. Our findings provide further insight into how simple demographic models can create valleys of genetic diversity similar to those attributed to positive selection.
Subject(s)
Evolution, Molecular; Models, Genetic; Alleles; Genetic Variation; Genetics, Population; Selection, Genetic
ABSTRACT
Runs of homozygosity (ROH) occur when offspring inherit haplotypes that are identical by descent from each parent. Length distributions of ROH are informative about population history, specifically about the probability of inbreeding mediated by mating system and/or population demography. Here, we investigated whether variation in killer whale (Orcinus orca) demographic history is reflected in genome-wide heterozygosity and ROH length distributions, using a global data set of 26 genomes representative of geographic and ecotypic variation in this species, and two F1 admixed individuals with Pacific-Atlantic parentage. We first reconstructed demographic history for each population as changes in effective population size through time using the pairwise sequential Markovian coalescent (PSMC) method. We found that a subset of populations declined in effective population size during the Late Pleistocene, while others had more stable demography. Genomes inferred to have undergone ancestral declines in effective population size were autozygous at hundreds of short ROH (<1 Mb), reflecting high background relatedness due to coalescence of haplotypes deep within the pedigree. In contrast, longer and therefore younger ROH (>1.5 Mb) were found in low-latitude populations and populations of known conservation concern. These include a Scottish killer whale, for which ROH >1.5 Mb in length covered 37.8% of the autosomes. The fate of this population, in which only two adult males have been sighted in the past five years and fecundity has been zero over the last two decades, may be inextricably linked to its demographic history and consequential inbreeding depression.
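To make the ROH length classes above concrete, here is a deliberately naive sketch. It scans genotype calls along one chromosome and reports homozygous stretches that span at least a minimum length. The function `find_roh` and its input layout are assumptions made for illustration. It is not the method used in the study, and dedicated ROH callers additionally handle genotyping error, marker density and missing data.

```python
def find_roh(sites, min_length_bp=1_500_000):
    """Return (start_bp, end_bp) for each homozygous run of at least min_length_bp.

    sites: list of (position_bp, is_heterozygous) tuples for ONE chromosome,
    sorted by position. A single heterozygous call ends the current run
    (a simplifying assumption; real callers tolerate some errors).
    """
    runs = []
    run_start = None
    last_hom = None
    for position, is_het in sites:
        if is_het:
            # Close the current run, keeping it only if it is long enough.
            if run_start is not None and last_hom - run_start >= min_length_bp:
                runs.append((run_start, last_hom))
            run_start = None
        else:
            if run_start is None:
                run_start = position
            last_hom = position
    if run_start is not None and last_hom - run_start >= min_length_bp:
        runs.append((run_start, last_hom))
    return runs


# Toy genotype calls (positions in bp; True marks a heterozygous site).
toy_sites = [
    (0, False), (1_000_000, False), (2_000_000, False),
    (2_100_000, True),
    (2_200_000, False), (2_500_000, False),
]
long_roh = find_roh(toy_sites)                         # default 1.5 Mb cutoff
all_roh = find_roh(toy_sites, min_length_bp=200_000)   # a looser, arbitrary cutoff
```

Summing the lengths of the returned runs and dividing by the total autosomal length would give the fraction of the genome in ROH, the kind of quantity reported for the Scottish individual.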
Subject(s)
Whale, Killer; Animals; Genome; Homozygote; Inbreeding; Male; Polymorphism, Single Nucleotide; Population Density; Whale, Killer/genetics
ABSTRACT
MOTIVATION: fastsimcoal2 extends fastsimcoal, a continuous time coalescent-based genetic simulation program, by enabling the estimation of demographic parameters under very complex scenarios from the site frequency spectrum under a maximum-likelihood framework. RESULTS: Other improvements include multi-threading, handling of population inbreeding, extended input file syntax facilitating the description of complex demographic scenarios, and more efficient simulations of sparsely structured populations and of large chromosomes. AVAILABILITY AND IMPLEMENTATION: fastsimcoal2 is freely available at http://cmpg.unibe.ch/software/fastsimcoal2/. It includes console versions for Linux, Windows and MacOS, additional scripts for the analysis and visualization of simulated and estimated scenarios, as well as detailed documentation and ready-to-use examples.
Subject(s)
Genetics, Population; Software; Computer Simulation; Biological Evolution; Demography
Subject(s)
Smegmamorpha; Animals; Gene Flow; Lakes; Microsatellite Repeats; Smegmamorpha/genetics
ABSTRACT
The Pacific region is of major importance for addressing questions regarding human dispersals, interactions with archaic hominins and natural selection processes [1]. However, the demographic and adaptive history of Oceanian populations remains largely uncharacterized. Here we report high-coverage genomes of 317 individuals from 20 populations from the Pacific region. We find that the ancestors of Papuan-related ('Near Oceanian') groups underwent a strong bottleneck before the settlement of the region, and separated around 20,000-40,000 years ago. We infer that the East Asian ancestors of Pacific populations may have diverged from Taiwanese Indigenous peoples before the Neolithic expansion, which is thought to have started from Taiwan around 5,000 years ago [2-4]. Additionally, this dispersal was not followed by an immediate, single admixture event with Near Oceanian populations, but involved recurrent episodes of genetic interactions. Our analyses reveal marked differences in the proportion and nature of Denisovan heritage among Pacific groups, suggesting that independent interbreeding with highly structured archaic populations occurred. Furthermore, whereas introgression of Neanderthal genetic information facilitated the adaptation of modern humans related to multiple phenotypes (for example, metabolism, pigmentation and neuronal development), Denisovan introgression was primarily beneficial for immune-related functions. Finally, we report evidence of selective sweeps and polygenic adaptation associated with pathogen exposure and lipid metabolism in the Pacific region, increasing our understanding of the mechanisms of biological adaptation to island environments.
Subject(s)
Adaptation, Biological/genetics; Biological Evolution; Genetics, Population; Genome, Human/genetics; Genomics; Human Migration/history; Islands; Native Hawaiian or Other Pacific Islander/genetics; Animals; Australia; Datasets as Topic; Asia, Eastern; Genetic Introgression; History, Ancient; Humans; Neanderthals/genetics; Oceania; Pacific Ocean; Taiwan
ABSTRACT
Isobiotic mice, with an identical stable microbiota composition, potentially allow models of host-microbial mutualism to be studied over time and between different laboratories. To understand microbiota evolution in these models, we carried out a 6-year experiment in mice colonized with 12 representative taxa. Increased non-synonymous to synonymous mutation rates indicate positive selection in multiple taxa, particularly for genes annotated for nutrient acquisition or replication. Microbial sub-strains that evolved within a single taxon can stably coexist, consistent with niche partitioning of ecotypes in the complex intestinal environment. Dietary shifts trigger rapid transcriptional adaptation to macronutrient and micronutrient changes in individual taxa and alterations in taxa biomass. The proportions of different sub-strains are also rapidly altered after dietary shift. This indicates that microbial taxa within a mouse colony adapt to changes in the intestinal environment by long-term genomic positive selection and short-term effects of transcriptional reprogramming and adjustments in sub-strain proportions.
Subject(s)
Adaptation, Physiological; Gastrointestinal Microbiome/physiology; Microbiota/physiology; Adaptation, Physiological/immunology; Animals; Bacteria/genetics; Female; Gastrointestinal Microbiome/genetics; Gastrointestinal Microbiome/immunology; Genomics; Immunity; Intestines; Male; Metabolomics; Mice; Mice, Inbred C57BL; Ralstonia; Symbiosis
ABSTRACT
In the last ten years, the next generation sequencing revolution has multiplied the amount of genetic data for many organisms by orders of magnitude. This has not only led to evolutionary biologists having more data available but also to new and different types of data: from a handful of allozyme markers in the 70s, we got dozens of restriction fragment length polymorphisms (RFLPs) in the 80s, hundreds of microsatellites in the 90s, thousands to hundreds of thousands of single nucleotide polymorphisms (SNPs) in the 2000s, a few full genomes in the 2010s, and thousands of full genomes in the 2020s. These data have provided information not only on the genetic diversity and evolution of the organisms studied but also on genome-wide patterns of selection, linkage disequilibrium, as well as recombination and mutation processes. Below, we will describe how these new genomic data can be used to infer the past demographic history of populations.
Subject(s)
Demography; Genome/genetics; Genomics; Models, Genetic; Animals; High-Throughput Nucleotide Sequencing; Humans; Linkage Disequilibrium
ABSTRACT
Current procedures for inferring population history generally assume complete neutrality; that is, they neglect both direct selection and the effects of selection on linked sites. We here examine how the presence of direct purifying selection and background selection may bias demographic inference by evaluating two commonly used methods (MSMC and fastsimcoal2), specifically studying how the underlying shape of the distribution of fitness effects and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increase, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the distribution of fitness effects as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.
Subject(s)
Demography/methods; Genetic Fitness; Genetic Techniques; Models, Genetic; Selection, Genetic; Bayes Theorem; Genome Size; Markov Chains; Polymorphism, Single Nucleotide
ABSTRACT
Species conservation can be improved by knowledge of evolutionary and genetic history. Tigers are among the most charismatic of endangered species and garner significant conservation attention. However, their evolutionary history and genomic variation remain poorly known, especially for Indian tigers. With 70% of the world's wild tigers living in India, such knowledge is critical. We re-sequenced 65 individual tiger genomes representing most extant subspecies with a specific focus on tigers from India. As suggested by earlier studies, we found strong genetic differentiation between the putative tiger subspecies. Despite high total genomic diversity in India, individual tigers host longer runs of homozygosity, potentially suggesting recent inbreeding or founding events, possibly due to small and fragmented protected areas. We suggest the impacts of ongoing connectivity loss on inbreeding and persistence of Indian tigers be closely monitored. Surprisingly, demographic models suggest recent divergence (within the last 20,000 years) between subspecies and strong population bottlenecks. Amur tiger genomes revealed the strongest signals of selection related to metabolic adaptation to cold, whereas Sumatran tigers show evidence of weak selection for genes involved in body size regulation. We recommend detailed investigation of local adaptation in Amur and Sumatran tigers prior to initiating genetic rescue.
Subject(s)
Biological Evolution; Genetic Drift; Inbreeding; Selection, Genetic; Tigers/genetics; Animals; Conservation of Natural Resources; Genetic Variation; Genome; India; Phylogeography
ABSTRACT
Current procedures for inferring population history generally assume complete neutrality; that is, they neglect both direct selection and the effects of selection on linked sites. We here examine how the presence of direct purifying selection and background selection may bias demographic inference by evaluating two commonly used methods (MSMC and fastsimcoal2), specifically studying how the underlying shape of the distribution of fitness effects (DFE) and the fraction of directly selected sites interact with demographic parameter estimation. The results show that, even after masking functional genomic regions, background selection may cause the mis-inference of population growth under models of both constant population size and decline. This effect is amplified as the strength of purifying selection and the density of directly selected sites increase, as indicated by the distortion of the site frequency spectrum and levels of nucleotide diversity at linked neutral sites. We also show how simulated changes in background selection effects caused by population size changes can be predicted analytically. We propose a potential method for correcting for the mis-inference of population growth caused by selection. By treating the DFE as a nuisance parameter and averaging across all potential realizations, we demonstrate that even directly selected sites can be used to infer demographic histories with reasonable accuracy.
ABSTRACT
Most human populations exhibit an excess of high-frequency variants, leading to a U-shaped site-frequency spectrum (uSFS). This pattern has been generally interpreted as a signature of ongoing episodes of positive selection, or as evidence for a mis-assignment of ancestral/derived allelic states, but uSFS has also been observed in populations receiving gene flow from a ghost population, in structured populations, or after range expansions. In order to better explain the prevalence of high-frequency variants in humans and other populations, we describe here which patterns of gene flow and population demography can lead to uSFS by using extensive coalescent simulations. We find that uSFS can often be observed in a population if gene flow brings a few ancestral alleles from a well-differentiated population. Gene flow can consist of either single pulses of admixture or continuous immigration, but different demographic conditions are necessary to observe uSFS in these two scenarios. Indeed, an extremely low and recent gene flow is required in the case of single admixture events, while with continuous immigration, uSFS occurs only if gene flow started recently at a high rate or if it lasted for a long time at a low rate. Overall, we find that a neutral uSFS occurs under more restrictive conditions in populations having received single pulses of gene flow than in populations exposed to continuous gene flow. We also show that the uSFS observed in human populations from the 1000 Genomes Project can easily be explained by gene flow from surrounding populations without requiring past episodes of positive selection. These results imply that uSFS should be common in non-isolated populations, such as most wild or domesticated plants and animals.
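For readers unfamiliar with the site-frequency spectrum, the following minimal sketch tabulates an unfolded SFS from per-site derived-allele counts and applies a crude test for a U shape. Both function names and the U-shape criterion are illustrative assumptions and do not reproduce the analysis in the paper.

```python
from collections import Counter


def unfolded_sfs(derived_counts, n_chromosomes):
    """Return the unfolded SFS as a list of length n_chromosomes - 1.

    Entry i - 1 is the number of sites at which the derived allele is carried
    by exactly i of the n sampled chromosomes. Sites with a count of 0 or n
    are not segregating in the sample and are ignored.
    """
    tally = Counter(derived_counts)
    return [tally.get(i, 0) for i in range(1, n_chromosomes)]


def is_u_shaped(sfs):
    """Crude, hypothetical criterion: both the lowest- and the highest-frequency
    classes exceed the smallest interior class."""
    if len(sfs) < 3:
        return False  # no interior classes to compare against
    interior_min = min(sfs[1:-1])
    return sfs[0] > interior_min and sfs[-1] > interior_min


# Toy derived-allele counts at ten sites in a sample of six chromosomes.
toy_counts = [1, 1, 1, 1, 2, 2, 3, 5, 5, 5]
toy_sfs = unfolded_sfs(toy_counts, n_chromosomes=6)
toy_is_u = is_u_shaped(toy_sfs)
```

In the toy data the excess in the highest-frequency class is what gives the spectrum its U shape, which is the pattern the abstract attributes to gene flow bringing in ancestral alleles.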
ABSTRACT
Speciation rates vary considerably among lineages, and our understanding of what drives the rapid succession of speciation events within young adaptive radiations remains incomplete [1-11]. The cichlid fish family provides a notable example of such variation, with many slowly speciating lineages as well as several exceptionally large and rapid radiations [12]. Here, by reconstructing a large phylogeny of all currently described cichlid species, we show that explosive speciation is solely concentrated in species flocks of several large young lakes. Increases in the speciation rate are associated with the absence of top predators; however, this does not sufficiently explain explosive speciation. Across lake radiations, we observe a positive relationship between the speciation rate and enrichment of large insertion or deletion polymorphisms. Assembly of 100 cichlid genomes within the most rapidly speciating cichlid radiation, which is found in Lake Victoria, reveals exceptional 'genomic potential': hundreds of ancient haplotypes bear insertion or deletion polymorphisms, many of which are associated with specific ecologies and shared with ecologically similar species from other older radiations elsewhere in Africa. Network analysis reveals fundamentally non-treelike evolution through recombining old haplotypes, and the origins of ecological guilds are concentrated early in the radiation. Our results suggest that the combination of ecological opportunity, sexual selection and exceptional genomic potential is the key to understanding explosive adaptive radiation.
Subject(s)
Cichlids/genetics; Genetic Speciation; Genome/genetics; Genomics; Phylogeny; Africa; Animals; Haplotypes/genetics; INDEL Mutation; Lakes; Male; Time Factors
ABSTRACT
BACKGROUND: Recent experimental work has shown that the evolutionary dynamics of bacteria expanding across space can differ dramatically from what we expect under well-mixed conditions. During spatial expansion, deleterious mutations can accumulate due to inefficient selection on the expansion front, potentially interfering with and modifying adaptive evolutionary processes. RESULTS: We used whole genome sequencing to follow the genomic evolution of 10 mutator Escherichia coli lines during 39 days (~1650 generations) of a spatial expansion, which allowed us to gain a temporal perspective on the interaction of adaptive and non-adaptive evolutionary processes during range expansions. We used elastic net regression to infer the positive or negative effects of mutations on colony growth. The colony size, measured after three days of growth, decreased at the end of the experiment in all 10 lines, and mutations accumulated at a nearly constant rate over the whole experiment. We find evidence that beneficial mutations accumulate primarily at an early stage of the experiment, leading to a non-linear change of colony size over time. Indeed, the rate of colony size expansion remains almost constant at the beginning of the experiment and then decreases after ~12 days of evolution. We also find that beneficial mutations are enriched in genes encoding transport proteins and genes coding for the membrane structure, whereas deleterious mutations show no enrichment for any biological process. CONCLUSIONS: Our experiment shows that beneficial mutations target specific biological functions mostly involved in inter- or extra-membrane processes, whereas deleterious mutations are randomly distributed over the whole genome. It thus appears that the interaction between genetic drift and the availability or depletion of beneficial mutations determines the change in fitness of bacterial populations during range expansion.