ABSTRACT
Copy number variation represents a major source of genetic divergence, yet the evolutionary dynamics of genic copy number variation in natural populations during differentiation and adaptation remain unclear. We applied a read depth approach to genome resequencing data to detect copy number variants (CNVs) ≥1 kb in wild-caught mice belonging to four populations of Mus musculus domesticus. We complemented the bioinformatics analyses with experimental validation using droplet digital PCR. The specific focus of our analysis is CNVs that include complete genes, as these CNVs could be expected to contribute most directly to evolutionary divergence. In total, 1863 transcription units appear to be completely encompassed within CNVs in at least one individual when compared to the reference assembly. Further, 179 of these CNVs show population-specific copy number differences, and 325 are subject to complete deletion in multiple individuals. Among the most copy-number variable genes are three highly conserved genes that encode the splicing factor CWC22, the spindle protein SFI1, and the Holliday junction recognition protein HJURP. These genes exhibit population-specific expansion patterns that suggest involvement in local adaptations. We found that genes that overlap with large segmental duplications are generally more copy-number variable. These genes encode proteins that are relevant for environmental and behavioral interactions, such as vomeronasal and olfactory receptors, as well as major urinary proteins and several proteins of unknown function. The overall analysis shows that genic CNVs contribute more to population differentiation in mice than in humans and may promote and speed up population divergence.
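The read depth idea behind this kind of CNV detection can be illustrated with a minimal sketch. It assumes that per-window read counts are already available and that the median window reflects the normal diploid state. The function name `estimate_copy_number` and the depth values are hypothetical, and this is not the pipeline used in the study.

```python
from statistics import median


def estimate_copy_number(window_depths, ploidy=2):
    """Estimate an integer copy number per window from read depth.

    Assumption (illustrative only): the median depth across windows
    corresponds to the normal state with `ploidy` copies.
    """
    baseline = median(window_depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")
    # Scale each window's depth to the baseline and round to whole copies.
    return [round(ploidy * depth / baseline) for depth in window_depths]


# Hypothetical read depths for six 1 kb windows in one individual.
depths = [30, 31, 29, 60, 0, 30]
copy_numbers = estimate_copy_number(depths)
```

In this toy input, the window with doubled depth would be called as a duplication and the window with zero depth as a complete deletion. Real data would additionally need correction for GC content and mappability.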
Subject(s)
Cell Cycle Proteins/genetics , DNA Copy Number Variations , DNA-Binding Proteins/genetics , Mice/genetics , Nuclear Proteins/genetics , Adaptation, Biological , Animals , Cell Cycle Proteins/metabolism , Conserved Sequence , DNA-Binding Proteins/metabolism , Evolution, Molecular , Genetics, Population , Genome , Genomics/methods , Mice/classification , Nuclear Proteins/metabolism , RNA-Binding Proteins , Selection, Genetic
ABSTRACT
BACKGROUND: The MHC class I and II loci mediate the adaptive immune response and are among the most polymorphic loci in vertebrate genomes. In fact, the number of different alleles in a given species is often so large that it remains a challenge to provide an evolutionary model that can fully account for this. RESULTS: We provide here a general survey of MHC allele numbers in house mouse populations and two sub-species (M. m. domesticus and M. m. musculus) for H2 class I D and K, as well as class II A and E loci. Between 50 and 90% of the detected different sequences constitute new alleles, confirming that the discovery of new alleles is indeed far from complete. House mice live in separate demes with small effective population sizes, factors that were proposed to reduce, rather than enhance, the possibility for the maintenance of many different alleles. To specifically investigate the occurrence of alleles within demes, we focused on the class II H2-Aa and H2-Eb exon 2 alleles in nine demes of M. m. domesticus from two different geographic regions. We find on the one hand a group of alleles that occur in different sampling regions, and three quarters of these are also found in both sub-species. On the other hand, the larger group of different alleles (56%) occurs only in one of the regions and most of these (89%) only in single demes. We show that most of these region-specific alleles have apparently arisen through recombination and/or partial gene conversion from already existing alleles. CONCLUSIONS: Demes can act as sources of alleles that outnumber the set of alleles that are shared across the species range. These findings support the reservoir model proposed for human MHC diversity, which states that large pools of rare MHC allele variants are continuously generated by neutral mutational mechanisms. Given that these can become important in the defense against newly emerging pathogens, the reservoir model complements the selection-based models for MHC diversity and explains why the exceptional diversity exists.
ABSTRACT
BACKGROUND: The phylogeography of the house mouse (Mus musculus L.), an emblematic species for genetic and biomedical studies, is only partly understood, essentially because of a sampling bias towards its most peripheral populations in Europe, Asia and the Americas. Moreover, the present-day phylogeographic hypotheses stem mostly from the study of mitochondrial lineages. In this article, we complement the mtDNA studies with a comprehensive survey of nuclear markers (19 microsatellite loci) typed in 963 individuals from 47 population samples, with an emphasis on the putative Middle-Eastern centre of dispersal of the species. RESULTS: Based on correspondence analysis, distance and allele-sharing trees, we find a good coherence between geographical origin and genetic make-up of the populations. We thus confirm the clear distinction of the three best described peripheral subspecies, M. m. musculus, M. m. domesticus and M. m. castaneus. A large diversity was found in the Iranian populations, which have had an unclear taxonomic status to date. In addition to samples with clear affiliation to M. m. musculus and M. m. domesticus, we find two genetic groups in Central and South East Iran, which are as distinct from each other as they are from the south-east Asian M. m. castaneus. These groups were previously also found to harbor distinct mitochondrial haplotypes. CONCLUSION: We propose that the Iranian plateau is home to two more taxonomic units displaying complex primary and secondary relationships with their long recognized neighbours. This central region emerges as the area with the highest known diversity of mouse lineages within a restricted geographical area, designating it as the focal place to study the mechanisms of speciation and diversification of this species.
Subject(s)
Mice/classification , Mice/genetics , Phylogeography , Alleles , Animals , DNA, Mitochondrial/genetics , Genetics, Population , Iran , Microsatellite Repeats
ABSTRACT
The evolutionary divergence of cues for mate recognition can contribute to early stages of population separation. We compare here two allopatric populations of house mice (Mus musculus domesticus) that became separated about 3000 years ago. We have used paternity assignments in semi-natural environments to study the degree of mutual mate recognition according to population origin under conditions of free choice and overlapping generations. Our results provide insights into the divergence of mating cues, but also into the mating system of house mice. We find frequent multiple mating, occurrence of inbreeding and formation of extended family groups. In addition, many animals show strong mate fidelity, that is, frequent choice of the same mating partners in successive breeding cycles, indicating a role for familiarity in mating preference. With respect to population divergence, we find evidence for assortative mating, but only under conditions where the animals had time to familiarize themselves with mating partners from their own population. Most interestingly, the first-generation offspring born in the enclosure showed a specific mating pattern. Although matings between animals of hybrid population origin with animals of pure population origin should have occurred with equal frequency with respect to matching the paternal or maternal origin, paternal matching with mates from their own populations occurred much more often. Our findings suggest that paternally imprinted cues play a role in mate recognition between mice and that the cues evolve fast, such that animals of populations that have been separated for no more than 3000 years can differentially recognize them.
Subject(s)
Genomic Imprinting , Mating Preference, Animal , Mice/genetics , Reproduction/genetics , Animals , Biological Evolution , Female , Genetics, Population , Genotype , Hybridization, Genetic , Inbreeding , Male
ABSTRACT
BACKGROUND: Starting from Western Europe, the house mouse (Mus musculus domesticus) has spread across the globe in historic times. However, most oceanic islands were colonized by mice only within the past 300 years. This makes them an excellent model for studying the evolutionary processes during early stages of new colonization. We have focused here on the Kerguelen Archipelago, located within the sub-Antarctic area, and compare the patterns with samples from other Southern Ocean islands. RESULTS: We have typed 18 autosomal and six Y-chromosomal microsatellite loci and obtained mitochondrial D-loop sequences for a total of 534 samples, mainly from the Kerguelen Archipelago, but also from the Falkland Islands, Marion Island, Amsterdam Island, Antipodes Island, Macquarie Island, Auckland Islands and one sample from South Georgia. We find that most of the mice on the Kerguelen Archipelago have the same mitochondrial haplotype and all share the same major Y-chromosomal haplotype. Two small islands (Cochons Island and Cimetière Island) within the archipelago show a different mitochondrial haplotype, are genetically distinct for autosomal loci, but share the major Y-chromosomal haplotype. In the mitochondrial D-loop sequences, we find several single-step mutational derivatives of one of the major mitochondrial haplotypes, suggesting an unusually high mutation rate, or the occurrence of selective sweeps in mitochondria. CONCLUSIONS: Although there was heavy ship traffic for over a hundred years to the Kerguelen Archipelago, it appears that the mice that arrived first colonized the main island (Grande Terre) and most of the associated small islands. The second invasion that we see in our data has occurred on islands that are detached from Grande Terre and were likely to have had no resident mice prior to their arrival. The genetic data suggest that the mice of both primary invasions originated from related source populations. Our data suggest that an area colonized by mice is refractory to further introgression, possibly due to fast adaptations of the resident mice to local conditions.
Subject(s)
Geography , Animals , DNA, Mitochondrial/genetics , Europe , Genetics, Population , Haplotypes/genetics , Mice , Microsatellite Repeats/genetics , Phylogeny , Y Chromosome/genetics
ABSTRACT
The RIIIS/J inbred mouse strain is a model for type 1 von Willebrand disease (VWD), a common human bleeding disorder. Low von Willebrand factor (VWF) levels in RIIIS/J are due to a regulatory mutation, Mvwf1, which directs a tissue-specific switch in expression of a glycosyltransferase, B4GALNT2, from intestine to blood vessel. We recently found that Mvwf1 lies on a founder allele common among laboratory mouse strains. To investigate the evolutionary forces operating at B4galnt2, we conducted a survey of DNA sequence polymorphism and microsatellite variation spanning the B4galnt2 gene region in natural Mus musculus domesticus populations. Two divergent haplotypes segregate in these natural populations, one of which corresponds to the RIIIS/J sequence. Different local populations display dramatic differences in the frequency of these haplotypes, and reduced microsatellite variability near B4galnt2 within the RIIIS/J haplotype is consistent with the recent action of natural selection. The level and pattern of DNA sequence polymorphism in the 5' flanking region of the gene significantly deviates from the neutral expectation and suggests that variation in B4galnt2 expression may be under balancing selection and/or arose from a recently introgressed allele that subsequently increased in frequency due to natural selection. However, coalescent simulations indicate that the heterogeneity in divergence between haplotypes is greater than expected under an introgression model. Analysis of a population where the RIIIS/J haplotype is in high frequency reveals an association between this haplotype, the B4galnt2 tissue-specific switch, and a significant decrease in plasma VWF levels. Given these observations, we propose that low VWF levels may represent a fitness cost that is offset by a yet unknown benefit of the B4galnt2 tissue-specific switch. Similar mechanisms may account for the variability in VWF levels and high prevalence of VWD in other mammals, including humans.
Subject(s)
Enhancer Elements, Genetic/genetics , N-Acetylgalactosaminyltransferases/genetics , Polymorphism, Genetic , Selection, Genetic , von Willebrand Factor/genetics , 5' Flanking Region , Animals , Genetic Variation , Haplotypes , Mice , Tissue Distribution
ABSTRACT
Genome scans of polymorphisms promise to provide insights into the patterns and frequencies of positive selection under natural conditions. The use of microsatellites as markers has the potential to focus on very recent events, since in contrast to SNPs, their high mutation rates should remove signatures of older events. We assess this concept here in a large-scale study. We have analyzed two population pairs of the house mouse, one pair of the subspecies Mus musculus domesticus and the other of M. m. musculus. A total of 915 microsatellite loci chosen to cover the whole genome were assessed in a prescreening procedure, followed by individual typing of candidate loci. Schlötterer's ratio statistics (lnRH) were applied to detect loci with significant deviations from patterns of neutral expectation. For eight loci from each population pair we have determined the size of the potential sweep window and applied a second statistical procedure (linked locus statistics). For the two population pairs, we find five and four significant sweep loci, respectively, with an average estimated window size of 120 kb. On the basis of the analysis of individual allele frequencies, it is possible to identify the most recent sweep, for which we estimate an onset of 400-600 years ago. Given the known population history for the French-German population pair, we infer that the average frequency of selective sweeps in these populations is higher than 1 in 100 generations across the whole genome. We discuss the implications for adaptation processes in natural populations.
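The lnRH ratio statistic mentioned above can be sketched as follows. The sketch assumes the commonly cited form of the statistic, in which expected heterozygosity H is converted to a diversity term (1/(1-H))^2 - 1 under a stepwise mutation model and the log ratio of the two populations is taken. The function names and the allele counts are hypothetical, and the handling of monomorphic loci (H = 0) is left out.

```python
import math


def expected_heterozygosity(allele_counts):
    """Unbiased expected heterozygosity from allele counts at one locus."""
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    sum_sq = sum((count / n) ** 2 for count in allele_counts)
    return n / (n - 1) * (1.0 - sum_sq)


def ln_rh(h_pop1, h_pop2):
    """Log ratio of heterozygosity-based diversity terms for two populations."""
    def diversity_term(h):
        # Undefined for monomorphic loci (h == 0) and for h >= 1.
        if not 0.0 < h < 1.0:
            raise ValueError("heterozygosity must lie strictly between 0 and 1")
        return (1.0 / (1.0 - h)) ** 2 - 1.0

    return math.log(diversity_term(h_pop1) / diversity_term(h_pop2))


# Hypothetical allele counts at one microsatellite locus in two populations.
h1 = expected_heterozygosity([18, 1, 1])     # low diversity, candidate sweep
h2 = expected_heterozygosity([5, 5, 4, 6])   # high diversity
statistic = ln_rh(h1, h2)
```

A strongly negative value indicates reduced variability in the first population relative to the second. In a genome scan, such values would typically be standardized across all loci before outliers are called.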
Subject(s)
Chromosome Mapping , Microsatellite Repeats/genetics , Selection, Genetic , Statistics as Topic/methods , Animals , Evolution, Molecular , Gene Frequency , Genetics, Population , Genome , Mice
ABSTRACT
BACKGROUND: In vertebrates, several anatomical regions located within the nasal cavity mediate olfaction. Among these, the main olfactory epithelium detects most conventional odorants. Olfactory sensory neurons, provided with cilia exposed to the air, detect volatile chemicals via an extremely large family of seven-transmembrane chemoreceptors named odorant receptors. Their genes are expressed in a monogenic and monoallelic fashion: a single allele of a single odorant receptor gene is transcribed in a given mature neuron, through a still uncharacterized molecular mechanism known as odorant receptor gene choice. AIM: Odorant receptor genes are typically arranged in genomic clusters, but a few are isolated (we call them solitary) from the others within a region broader than 1 Mb upstream and downstream with respect to their transcript's coordinates. The study of clustered genes is problematic because of redundancy and ambiguities in their regulatory elements; we propose to use the solitary genes as simplified models to understand odorant receptor gene choice. PROCEDURES: Here we define the number and identity of the solitary genes in the mouse genome (C57BL/6J), and assess the conservation of the solitary status in some mammalian orthologs. Furthermore, we locate their putative promoters, predict their homeodomain binding sites (commonly present in the promoters of odorant receptor genes) and compare candidate promoter sequences with those of wild-caught mice. We also provide expression data from histological sections. RESULTS: In the mouse genome there are eight intact solitary genes: Olfr19 (M12), Olfr49, Olfr266, Olfr267, Olfr370, Olfr371, Olfr466, Olfr1402; five are conserved as solitary in rat. These genes are all expressed in the main olfactory epithelium of three-day-old mice. The C57BL/6J candidate promoter of Olfr370 differs considerably from its wild-type counterpart. Within the putative promoter for Olfr266 a homeodomain binding site is predicted. As a whole, our findings favor Olfr266 as a model gene to investigate odorant receptor gene choice.
Subject(s)
Promoter Regions, Genetic , Receptors, Odorant/genetics , Animals , Binding Sites , Mice, Inbred C57BL , Mice, Transgenic , Olfactory Mucosa/physiology , Pseudogenes , Rats , Transcriptome
ABSTRACT
Wild populations of the house mouse (Mus musculus) represent the raw genetic material for the classical inbred strains in biomedical research and are a major model system for evolutionary biology. We provide whole genome sequencing data of individuals representing natural populations of M. m. domesticus (24 individuals from 3 populations), M. m. helgolandicus (3 individuals), M. m. musculus (22 individuals from 3 populations) and M. spretus (8 individuals from one population). We use a single pipeline to map and call variants for these individuals and also include 10 additional individuals of M. m. castaneus for which genomic data are publicly available. In addition, RNAseq data were obtained from 10 tissues of up to eight adult individuals from each of the three M. m. domesticus populations for which genomic data were collected. Data and analyses are presented via tracks viewable in the UCSC or IGV genome browsers. We also provide information on available outbred stocks and instructions on how to keep them in the laboratory.
Subject(s)
Genome , Genomics , Animals , Biological Evolution , Mice
ABSTRACT
Divergence of gene expression is known to contribute to the differentiation and separation of populations and species, although the dynamics of this process in early stages of population divergence remain unclear. We analyzed gene expression differences in three organs (brain, liver, and testis) between two natural populations of Mus musculus domesticus that have been separated for at most 3000 years. We used two different microarray platforms to corroborate the results at a large scale and identified hundreds of genes with significant expression differences between the populations. We find that although the three tissues have a similar number of differentially expressed genes, brain and liver have more tissue-specific genes than testis. Most genes show changes in a single tissue only, even when expressed in all tissues, supporting the notion that tissue-specific enhancers act as separable targets of evolution. In terms of functional categories, in brain and to a smaller extent in liver, we find transcription factors and their targets to be particularly variable between populations, similar to previous findings in primates. Testis, however, has a different set of differentially expressed genes, both with respect to functional categories and overall correlation with the other tissues, the latter indicating that gene expression divergence of potential importance might be present in other datasets where no differences in fraction of differentially expressed genes were reported. Our results show that a significant amount of gene expression divergence quickly accumulates between allopatric populations.
ABSTRACT
Changes in expression of genes are thought to contribute significantly to evolutionary divergence. To study the relative role of selection and neutrality in shaping expression changes, we analyzed 24 genes in three different tissues of the house mouse (Mus musculus). Samples from two natural populations of the subspecies M. m. domesticus and M. m. musculus were investigated using quantitative PCR assays and sequencing of the upstream region. We have developed an approach to quantify expression polymorphism within such populations and to disentangle technical from biological variation in the data. We found a correlation between expression polymorphism within populations and divergence between populations. Furthermore, we found a correlation between expression polymorphism and sequence polymorphism of the respective genes. These data are most easily interpreted within a framework of a predominantly neutral model of gene expression change, where only a fraction of the changes may have been driven by positive selection. Although most genes investigated were expressed in all three tissues analyzed, significant changes of expression levels occurred predominantly in a single tissue only. This adds to the notion that enhancer-specific effects or transregulatory effects can modulate the evolution of gene expression in a tissue-specific way.
Subject(s)
Mice/genetics , Models, Genetic , Animals , Polymorphism, Genetic , Reverse Transcriptase Polymerase Chain Reaction
ABSTRACT
Glyoxalase 1 (Glo1) has been implicated in anxiety-like behavior in mice and in multiple psychiatric diseases in humans. We used mouse Affymetrix exon arrays to detect copy number variants (CNVs) among inbred mouse strains and thereby identified an approximately 475 kb tandem duplication on chromosome 17 that includes Glo1 (30,174,390-30,651,226 bp; mouse genome build 36). We developed a PCR-based strategy and used it to detect this duplication in 23 of 71 inbred strains tested, and in various outbred and wild-caught mice. Presence of the duplication is associated with a cis-acting expression QTL for Glo1 (LOD>30) in BXD recombinant inbred strains. However, evidence for an eQTL for Glo1 was not obtained when we analyzed single SNPs or 3-SNP haplotypes in a panel of 27 inbred strains. We conclude that association analysis in the inbred strain panel failed to detect an eQTL because the duplication was present on multiple highly divergent haplotypes. Furthermore, we suggest that non-allelic homologous recombination has led to multiple reversions to the non-duplicated state among inbred strains. We show associations between multiple duplication-containing haplotypes, Glo1 expression and anxiety-like behavior in both inbred strain panels and outbred CD-1 mice. Our findings provide a molecular basis for differential expression of Glo1 and further implicate Glo1 in anxiety-like behavior. More broadly, these results identify problems with commonly employed tests for association in inbred strains when CNVs are present. Finally, these data provide an example of biologically significant phenotypic variability in model organisms that can be attributed to CNVs.