ABSTRACT
In this issue, Enard and Petrov present intriguing results on the possibility of genetic traces left behind in our genomes from adaptation to past viral epidemics that may have been initiated by interaction with Neanderthal archaic hominins. The work highlights how powerful infectious agents can act as a selective force to shape our genetic makeup.
Subjects
Hominidae/genetics , Neanderthals/genetics , RNA Viruses , Animals , Genome , Humans
ABSTRACT
Introgression is a common evolutionary phenomenon that results in shared genetic material across non-sister taxa. Existing statistical methods such as Patterson's D statistic can detect introgression by measuring an excess of shared derived alleles between populations. The D statistic is effective at detecting genome-wide patterns of introgression but can give spurious inferences of introgression when applied to local regions. We propose a new statistic, D+, that leverages both shared ancestral and derived alleles to infer local introgressed regions. Incorporating both shared derived and ancestral alleles increases the number of informative sites per region, improving our ability to identify local introgression. We use a coalescent framework to derive the expected value of this statistic as a function of different demographic parameters under an instantaneous admixture model and use coalescent simulations to compute the power and precision of D+. While the power of D and D+ is comparable, D+ has better precision than D. We apply D+ to empirical data from the 1000 Genomes Project and Heliconius butterflies to infer local targets of introgression in humans and in butterflies.
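As an illustration of the statistics named in this abstract, the following is a minimal sketch of how D and D+ might be computed from site-pattern counts. It assumes the conventional four-taxon ordering (P1, P2, P3, outgroup), treats the outgroup allele as ancestral ("A") and any other allele as derived ("B"), and assumes that D+ adds the BAAA and ABAA patterns (ancestral alleles shared with P3) to the usual ABBA and BABA counts. The function names and the toy data are hypothetical and are not taken from the authors' software.

```python
from collections import Counter


def site_pattern(p1, p2, p3, outgroup):
    """Encode one site as a four-letter pattern such as 'ABBA'.

    'A' marks the outgroup (assumed ancestral) allele, 'B' the derived one.
    Sites that are not biallelic across the four samples return None.
    """
    if len({p1, p2, p3, outgroup}) != 2:
        return None
    return "".join("A" if allele == outgroup else "B"
                   for allele in (p1, p2, p3, outgroup))


def d_and_d_plus(sites):
    """Return (D, D+) for an iterable of (p1, p2, p3, outgroup) alleles.

    D  uses shared derived alleles only (ABBA vs. BABA).
    D+ (as assumed here) also counts shared ancestral alleles (BAAA vs. ABAA).
    """
    counts = Counter(site_pattern(*site) for site in sites)
    abba, baba = counts["ABBA"], counts["BABA"]
    baaa, abaa = counts["BAAA"], counts["ABAA"]

    d_denominator = abba + baba
    d = (abba - baba) / d_denominator if d_denominator else float("nan")

    dp_denominator = abba + baba + baaa + abaa
    d_plus = ((abba - baba + baaa - abaa) / dp_denominator
              if dp_denominator else float("nan"))
    return d, d_plus


# Hypothetical toy region: 3 ABBA, 1 BABA, 2 BAAA and 1 ABAA sites.
toy_sites = (
    [("A", "T", "T", "A")] * 3    # ABBA: P2 and P3 share the derived allele
    + [("T", "A", "T", "A")] * 1  # BABA: P1 and P3 share the derived allele
    + [("T", "A", "A", "A")] * 2  # BAAA: P2 and P3 share the ancestral allele
    + [("A", "T", "A", "A")] * 1  # ABAA: P1 and P3 share the ancestral allele
)
d_value, d_plus_value = d_and_d_plus(toy_sites)
```

In this toy region four sites inform D while seven inform D+, which is the sense in which adding shared ancestral alleles increases the number of informative sites per window.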
Subjects
Butterflies , Humans , Animals , Butterflies/genetics , Genome , Biological Evolution
ABSTRACT
Evidence of interbreeding between archaic hominins and humans comes from methods that infer the locations of segments of archaic haplotypes, or 'archaic coverage', using the genomes of people living today. As more estimates of archaic coverage have emerged, it has become clear that most of this coverage is found on the autosomes; very little is retained on chromosome X. Here, we summarize published estimates of archaic coverage on autosomes and chromosome X from extant human samples. We find on average 7 times more archaic coverage on autosomes than on chromosome X, and identify broad continental patterns in this ratio: greatest in European samples, and least in South Asian samples. We also perform extensive simulation studies to investigate how the amount of archaic coverage, lengths of coverage, and rates of purging of archaic coverage are affected by sex-bias caused by an unequal sex ratio within the archaic introgressors. Our results generally confirm that, with increasing male sex-bias, less archaic coverage is retained on chromosome X. Ours is the first study to explicitly model such sex-bias and its potential role in creating the dearth of archaic coverage on chromosome X.
Subjects
Genetic Introgression , Human Genome , Hominidae , X Chromosome , Animals , Humans , Male , Asian People/genetics , Genome , Human Genome/genetics , Hominidae/genetics , Neanderthals/genetics , X Chromosome/genetics , Sex Factors , Haplotypes/genetics , Genetic Introgression/genetics , Human Chromosomes/genetics , Female , South Asian People/genetics , European People/genetics
ABSTRACT
A set of 20 short tandem repeats (STRs) is used by the US criminal justice system to identify suspects and to maintain a database of genetic profiles for individuals who have been previously convicted or arrested. Some of these STRs were identified in the 1990s, with a preference for markers in putative gene deserts to avoid forensic profiles revealing protected medical information. We revisit that assumption, investigating whether forensic genetic profiles reveal information about gene-expression variation or potential medical information. We find six significant correlations (false discovery rate = 0.23) between the forensic STRs and the expression levels of neighboring genes in lymphoblastoid cell lines. We explore possible mechanisms for these associations, showing evidence compatible with forensic STRs causing expression variation or being in linkage disequilibrium with a causal locus in three cases and weaker or potentially spurious associations in the other three cases. Together, these results suggest that forensic genetic loci may reveal expression levels and, perhaps, medical information.
Subjects
Forensic Genetics , Genetic Loci , Microsatellite Repeats , Privacy , Forensic Genetics/legislation & jurisprudence , Forensic Genetics/methods , Gene Frequency , Population Genetics , Humans , Linkage Disequilibrium
ABSTRACT
Recent studies suggest that admixture with archaic hominins played an important role in facilitating biological adaptations to new environments. For example, interbreeding with Denisovans facilitated the adaptation to high-altitude environments on the Tibetan Plateau. Specifically, the EPAS1 gene, a transcription factor that regulates the response to hypoxia, exhibits strong signatures of both positive selection and introgression from Denisovans in Tibetan individuals. Interestingly, despite being geographically closer to the Denisova Cave, East Asian populations do not harbor as much Denisovan ancestry as populations from Melanesia. Recently, two studies have suggested two independent waves of Denisovan admixture into East Asians, one of which is shared with South Asians and Oceanians. Here, we leverage data from EPAS1 in 78 Tibetan individuals to interrogate which of these two introgression events introduced the EPAS1 beneficial sequence into the ancestral population of Tibetans, and we use the distribution of introgressed segment lengths at this locus to infer the timing of the introgression and selection event. We find that the introgression event unique to East Asians most likely introduced the beneficial haplotype into the ancestral population of Tibetans around 48,700 (16,000-59,500) y ago, and selection started around 9,000 (2,500-42,000) y ago. Our estimates suggest that one of the most convincing examples of adaptive introgression is in fact selection acting on standing archaic variation.
Subjects
Basic Helix-Loop-Helix Transcription Factors/genetics , Molecular Evolution , Haplotypes , Physiological Adaptation/genetics , Altitude , Humans , Tibet
ABSTRACT
Although some variation introgressed from Neanderthals has undergone selective sweeps, little is known about its functional significance. We used a Massively Parallel Reporter Assay (MPRA) to assay 5,353 high-frequency introgressed variants for their ability to modulate gene expression within 170 bp of endogenous sequence. We identified 2,548 variants in active putative cis-regulatory elements (CREs) and 292 expression-modulating variants (emVars). These emVars are predicted to alter the binding motifs of important immune transcription factors, are enriched for associations with neutrophil and white blood cell count, and are associated with the expression of genes that function in innate immune pathways including inflammatory response and antiviral defense. We combined the MPRA data with other data sets to identify strong candidates to be driver variants of positive selection, including an emVar that may contribute to protection against severe COVID-19 response. We endogenously deleted two CREs containing expression-modulating variants linked to immune function, rs11624425 and rs80317430, identifying their primary genic targets as ELMSAN1, and PAN2 and STAT2, respectively, three genes differentially expressed during influenza infection. Overall, we present the first database of experimentally identified expression-modulating Neanderthal-introgressed alleles contributing to potential immune response in modern humans.
Subjects
Genetic Variation , Human Genome , Innate Immunity/genetics , Neanderthals , Animals , Gene Expression , Humans , Inflammation , Neanderthals/genetics
ABSTRACT
Variation at the ABO locus was one of the earliest sources of data in the study of human population identity and history, and to this day remains widely genotyped due to its importance in blood and tissue transfusions. Here, we look at ABO blood type variants in our archaic relatives: Neanderthals and Denisovans. Our goal is to understand the genetic landscape of the ABO gene in archaic humans, and how it relates to modern human ABO variation. We found two Neanderthal variants of the O allele in the Siberian Neanderthals (O1 and O2); one of these variants is shared with a European Neanderthal, who is a heterozygote for this O1 variant and a rare cis-AB variant. The Denisovan individual is heterozygous for two variants of the O1 allele, functionally similar to variants found widely in modern humans. Perhaps more surprisingly, the O2 allele variant found in Siberian Neanderthals can be found at low frequencies in modern Europeans and Southeast Asians, and the O1 allele variant found in Siberian and European Neanderthals is also found at very low frequency in modern East Asians. Our genetic distance analyses suggest both alleles survive in modern humans due to interbreeding with Neanderthals. We find that the sequence backgrounds of the surviving Neanderthal-like O alleles in modern humans retain a higher sequence divergence than other surviving Neanderthal genome fragments, supporting a view of balancing selection operating in the Neanderthal ABO alleles by retaining highly diverse haplotypes compared with portions of the genome evolving neutrally.
Subjects
ABO Blood-Group System/genetics , Neanderthals/genetics , Animals , Genetic Variation , Human Genome , Haplotypes , Humans
ABSTRACT
As modern and ancient DNA sequence data from diverse human populations accumulate, evidence is increasing in support of the existence of beneficial variants acquired from archaic humans that may have accelerated adaptation and improved survival in new environments - a process known as adaptive introgression. Within the past few years, a series of studies have identified genomic regions that show strong evidence for archaic adaptive introgression. Here, we provide an overview of the statistical methods developed to identify archaic introgressed fragments in the genome sequences of modern humans and to determine whether positive selection has acted on these fragments. We review recently reported examples of adaptive introgression, grouped by selection pressure, and consider the level of supporting evidence for each. Finally, we discuss challenges and recommendations for inferring selection on introgressed regions.
Subjects
Genetic Models , Biological Adaptation/genetics , Animals , Molecular Evolution , Gene Flow , Human Genome , Haplotypes , Humans , Linkage Disequilibrium , Markov Chains , Neanderthals/genetics , Phylogeny , Genetic Selection
ABSTRACT
As modern humans migrated out of Africa, they encountered many new environmental conditions, including greater temperature extremes, different pathogens and higher altitudes. These diverse environments are likely to have acted as agents of natural selection and to have led to local adaptations. One of the most celebrated examples in humans is the adaptation of Tibetans to the hypoxic environment of the high-altitude Tibetan plateau. A hypoxia pathway gene, EPAS1, was previously identified as having the most extreme signature of positive selection in Tibetans, and was shown to be associated with differences in haemoglobin concentration at high altitude. Re-sequencing the region around EPAS1 in 40 Tibetan and 40 Han individuals, we find that this gene has a highly unusual haplotype structure that can only be convincingly explained by introgression of DNA from Denisovan or Denisovan-related individuals into humans. Scanning a larger set of worldwide populations, we find that the selected haplotype is only found in Denisovans and in Tibetans, and at very low frequency among Han Chinese. Furthermore, the length of the haplotype, and the fact that it is not found in any other populations, make it unlikely that the haplotype sharing between Tibetans and Denisovans was caused by incomplete ancestral lineage sorting rather than introgression. Our findings illustrate that admixture with other hominin species has provided genetic variation that helped humans to adapt to new environments.
Subjects
Physiological Adaptation/genetics , Altitude , DNA/genetics , Genetic Variation , Hominidae/genetics , Animals , Asian People/genetics , Basic Helix-Loop-Helix Transcription Factors/genetics , Gene Frequency , Haplotypes , Humans , Single Nucleotide Polymorphism , Tibet
ABSTRACT
OBJECTIVES: Long-tailed macaques (Macaca fascicularis) are widely distributed throughout the mainland and islands of Southeast Asia, making them a useful model for understanding the complex biogeographical history resulting from drastic changes in sea levels throughout the Pleistocene. Past studies based on mitochondrial genomes (mitogenomes) of long-tailed macaque museum specimens have traced their colonization patterns throughout the archipelago, but mitogenomes trace only the maternal history. Here, our objectives were to trace phylogeographic patterns of long-tailed macaques using low-coverage nuclear DNA (nDNA) data from museum specimens. METHODS: We performed population genetic analyses and phylogenetic reconstruction on nuclear single nucleotide polymorphisms (SNPs) from shotgun sequencing of 75 long-tailed macaque museum specimens from localities throughout Southeast Asia. RESULTS: We show that shotgun sequencing of museum specimens yields sufficient genome coverage (average ~1.7%) for reconstructing population relationships using SNP data. Contrary to expectations of divergent results between nuclear and mitochondrial genomes for a female philopatric species, phylogeographical patterns based on nuclear SNPs proved to be closely similar to those found using mitogenomes. In particular, population genetic analyses and phylogenetic reconstruction from the nDNA identify two major clades within M. fascicularis: Clade A includes all individuals from the mainland along with individuals from northern Sumatra, while Clade B consists of the remaining island-living individuals, including those from southern Sumatra. CONCLUSIONS: Overall, we demonstrate that low-coverage sequencing of nDNA from museum specimens provides enough data for examining broad phylogeographic patterns, although greater genome coverage and sequencing depth would be needed to distinguish between very closely related populations, such as those throughout the Philippines.
Subjects
Macaca fascicularis/classification , Macaca fascicularis/genetics , Animal Migration , Animals , Wild Animals/classification , Wild Animals/genetics , Physical Anthropology , DNA/genetics , Female , Population Genetics , Genome/genetics , Indonesia , Male , Museums , Philippines , Phylogeny , Single Nucleotide Polymorphism/genetics , DNA Sequence Analysis
ABSTRACT
Comparisons of DNA from archaic and modern humans show that these groups interbred, and in some cases received an evolutionary advantage from doing so. This process, adaptive introgression, may lead to a faster rate of adaptation than is predicted from models with mutation and selection alone. Within the last couple of years, a series of studies have identified regions of the genome that are likely examples of adaptive introgression. In many cases, once a region was ascertained as being introgressed, commonly used statistics based on both haplotype and allele frequency information were employed to test for positive selection. Introgression by itself, however, changes both the haplotype structure and the distribution of allele frequencies, thus confounding traditional tests for detecting positive selection. Therefore, patterns generated by introgression alone may lead to false inferences of positive selection. Here we explore models involving both introgression and positive selection to investigate the behavior of various statistics under adaptive introgression. In particular, we find that the number and allelic frequencies of sites that are uniquely shared between archaic humans and specific present-day populations are particularly useful for detecting adaptive introgression. We then examine the 1000 Genomes dataset to characterize the landscape of uniquely shared archaic alleles in human populations. Finally, we identify regions that were likely subject to adaptive introgression and discuss some of the most promising candidate genes located in these regions.
Subjects
Biological Adaptation/genetics , Ancient DNA/analysis , DNA Sequence Analysis/methods , Alleles , Animals , Biological Evolution , Computer Simulation , Nucleic Acid Databases , Molecular Evolution , Gene Frequency , Population Genetics , Haplotypes , Humans , Neanderthals , Phylogeny , Genetic Selection
ABSTRACT
A recent study conducted the first genome-wide scan for selection in Inuit from Greenland using single nucleotide polymorphism chip data. Here, we report that selection in the region with the second most extreme signal of positive selection in Greenlandic Inuit favored a deeply divergent haplotype that is closely related to the sequence in the Denisovan genome, and was likely introgressed from an archaic population. The region contains two genes, WARS2 and TBX15, and has previously been associated with adipose tissue differentiation and body-fat distribution in humans. We show that the adaptively introgressed allele has been under selection in a much larger geographic region than just Greenland. Furthermore, it is associated with changes in expression of WARS2 and TBX15 in multiple tissues including the adrenal gland and subcutaneous adipose tissue, and with regional DNA methylation changes in TBX15.
Subjects
Biological Adaptation/genetics , Inuit/genetics , T-Box Domain Proteins/genetics , Adipose Tissue/physiology , Alleles , Animals , DNA Methylation , Ancient DNA , Greenland , Haplotypes , Humans , Genetic Models , Neanderthals , Single Nucleotide Polymorphism , Genetic Selection , DNA Sequence Analysis/methods
ABSTRACT
Access to a geographically diverse set of modern human samples from the present time and from ancient remains, combined with archaic hominin samples, provides an unprecedented level of resolution to study both human history and adaptation. The amount and quality of ancient human data continue to improve and enable tracking the trajectory of genetic variation over time. These data have the potential to help us redefine or generate new hypotheses of how human evolution occurred and to revise previous conjectures. In this article, we argue that leveraging all these data will help us better detail adaptive histories in humans. As a case in point, we focus on one of the most celebrated examples of human adaptation: the evolution of lactase persistence. We briefly review this dietary adaptation and argue that, effectively, the evolutionary history of lactase persistence is still not fully resolved. We propose that, by leveraging data from multiple populations across time and space, we will find evidence of a more nuanced history than just a simple selective sweep. We support our hypotheses with simulation results and make some cautionary notes regarding the use of haplotype-based summary statistics to estimate evolutionary parameters.
Subjects
Physiological Adaptation/genetics , Molecular Evolution , Gene Frequency/genetics , Hominidae , Lactase/genetics , Microsatellite Repeats/genetics , Animals , DNA Primers , Diet , Genetic Drift , Population Genetics , Haplotypes/genetics , Ancient History , Humans , Lactase/metabolism , Lactose Tolerance Test , Milk , Genetic Selection
ABSTRACT
The interferon lambda 4 gene (IFNL4) encodes IFN-λ4, a new member of the IFN-λ family with antiviral activity. In humans, the IFNL4 open reading frame is truncated by a polymorphic frame-shift insertion that eliminates IFN-λ4 and turns IFNL4 into a polymorphic pseudogene. Functional IFN-λ4 has antiviral activity, but the elimination of IFN-λ4 through pseudogenization is strongly associated with improved clearance of hepatitis C virus (HCV) infection. We show that functional IFN-λ4 is conserved and evolutionarily constrained in mammals and thus functionally relevant. However, the pseudogene has reached moderately high frequency in Africa, America, and Europe, and near fixation in East Asia. In fact, the pseudogenizing variant is among the 0.8% most differentiated SNPs between Africa and East Asia genome-wide. Its rise in frequency is associated with additional evidence of positive selection, which is strongest in East Asia, where this variant falls in the 0.5% tail of SNPs with strongest signatures of recent positive selection genome-wide. Using a new Approximate Bayesian Computation (ABC) approach we infer that the pseudogenizing allele appeared just before the out-of-Africa migration and was immediately targeted by moderate positive selection; selection subsequently strengthened in European and Asian populations, resulting in the high frequency observed today. This provides evidence for a changing adaptive process that, by favoring IFN-λ4 inactivation, has shaped present-day phenotypic diversity and susceptibility to disease.
Subjects
Interleukins/genetics , Genetic Selection , Africa , Animals , Asian People/genetics , Base Sequence , Bayes Theorem , Conserved Sequence , Eastern Asia , Gene Frequency , Genetic Predisposition to Disease , Population Genetics , Human Genome , Haplotypes , Hepatitis C/genetics , Hepatitis C/virology , Humans , Interleukins/physiology , Mammals/genetics , Genetic Models , Single Nucleotide Polymorphism , Pseudogenes , White People/genetics
ABSTRACT
We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with ∂a∂i, the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets.
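For readers unfamiliar with the summary statistic this abstract is built on, the sketch below shows one simple way a site frequency spectrum could be tabulated from per-site derived-allele counts. It is only an illustration of what the SFS is, not the authors' inference framework; the function names and the toy counts are hypothetical, and it assumes every site has complete data for the same number of sampled chromosomes.

```python
def unfolded_sfs(derived_counts, n_chromosomes):
    """Tabulate the unfolded SFS.

    derived_counts: one integer per site, the number of sampled chromosomes
                    carrying the derived allele (0..n_chromosomes).
    Returns a list where entry k is the number of sites with k derived copies.
    """
    sfs = [0] * (n_chromosomes + 1)
    for k in derived_counts:
        if not 0 <= k <= n_chromosomes:
            raise ValueError(f"derived count {k} outside 0..{n_chromosomes}")
        sfs[k] += 1
    return sfs


def fold_sfs(sfs):
    """Fold an unfolded SFS into minor-allele-count classes.

    Useful when the ancestral state is unknown: classes k and n - k are
    merged, so entry j counts sites whose minor allele appears j times.
    """
    n = len(sfs) - 1
    folded = [0] * (n // 2 + 1)
    for k, site_count in enumerate(sfs):
        folded[min(k, n - k)] += site_count
    return folded


# Hypothetical toy data: five sites scored in four chromosomes.
toy_counts = [1, 1, 2, 3, 1]
toy_unfolded = unfolded_sfs(toy_counts, n_chromosomes=4)  # [0, 3, 1, 1, 0]
toy_folded = fold_sfs(toy_unfolded)                       # [0, 4, 1]
```

A likelihood-based method of the kind described would compare an observed spectrum like this against spectra expected under candidate demographic models; that comparison step is not shown here.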
Subjects
Computer Simulation , Demography , Population Genetics , Single Nucleotide Polymorphism/genetics , Human Genome , Genomics , Humans , Population Groups
ABSTRACT
An outstanding question in human genetics has been the degree to which adaptation occurs from standing genetic variation or from de novo mutations. Here, we combine several common statistics used to detect selection in an Approximate Bayesian Computation (ABC) framework, with the goal of discriminating between models of selection and providing estimates of the age of selected alleles and the selection coefficients acting on them. We use simulations to assess the power and accuracy of our method and apply it to seven of the strongest sweeps currently known in humans. We identify two genes, ASPM and PSCA, that are most likely affected by selection on standing variation; and we find three genes, ADH1B, LCT, and EDAR, in which the adaptive alleles seem to have swept from a new mutation. We also confirm evidence of selection for one further gene, TRPV6. In one gene, G6PD, neither neutral models nor models of selective sweeps fit the data, presumably because this locus has been subject to balancing selection.
Subjects
Genetic Variation , Genetic Models , Mutation , Genetic Selection , Algorithms , Bayes Theorem , Computer Simulation , Humans , Reproducibility of Results
ABSTRACT
Environmental or genomic changes during evolution can relax negative selection pressure on specific loci, permitting high-frequency polymorphisms at previously conserved sites. Here, we jointly analyze population genomic and comparative genomic data to search for functional processes showing relaxed negative selection specifically in the human lineage while remaining evolutionarily conserved in other mammals. Consistent with previous studies, we find that olfactory receptor genes display such a signature of relaxation in humans. Intriguingly, proteasome genes also show a prominent signal of human-specific relaxation: multiple proteasome subunits, including four members of the catalytic core particle, contain high-frequency nonsynonymous polymorphisms at sites conserved across mammals. Chimpanzee proteasome genes do not display a similar trend. Human proteasome genes also bear no evidence of recent positive or balancing selection. These results suggest human-specific relaxation of negative selection in proteasome subunits; the exact biological causes, however, remain unknown.
Subjects
Genetic Polymorphism , Proteasome Endopeptidase Complex/genetics , Genetic Selection , Animals , Molecular Evolution , Gene Frequency , Human Genome , Humans , Pan troglodytes , Single Nucleotide Polymorphism
ABSTRACT
The Tibetan and Andean Plateaus and Ethiopian highlands are the largest regions to have long-term high-altitude residents. Such populations are exposed to lower barometric pressures and hence atmospheric partial pressures of oxygen. Such "hypobaric hypoxia" may limit physical functional capacity, reproductive health, and even survival. As such, selection of genetic variants advantageous to hypoxic adaptation is likely to have occurred. Identifying signatures of such selection is likely to help understanding of hypoxic adaptive processes. Here, we seek evidence of such positive selection using five Ethiopian populations, three of which are from high-altitude areas in Ethiopia. As these populations may have been recipients of Eurasian gene flow, we correct for this admixture. Using single-nucleotide polymorphism genotype data from multiple populations, we find the strongest signal of selection in BHLHE41 (also known as DEC2 or SHARP1). Remarkably, a major role of this gene is regulation of the same hypoxia response pathway on which selection has most strikingly been observed in both Tibetan and Andean populations. Because it is also an important player in the circadian rhythm pathway, BHLHE41 might also provide insights into the mechanisms underlying the recognized impacts of hypoxia on the circadian clock. These results support the view that Ethiopian, Andean, and Tibetan populations living at high altitude have adapted to hypoxia differently, with convergent evolution affecting different genes from the same pathway.
Subjects
Acclimatization/genetics , Altitude , Genome-Wide Association Study , Transcriptome , Biological Evolution , Ethiopia , Gene Regulatory Networks , Population Genetics , Humans , Hypoxia/genetics , Single Nucleotide Polymorphism , Genetic Selection
ABSTRACT
A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries of genetic variation, like allele frequencies, are also correlated with recombination rate and whether these correlations can be explained solely by negative selection against deleterious mutations or whether positive selection acting on favorable alleles is also required. Here we attempt to address these questions by analyzing three different genome-wide resequencing datasets from European individuals. We document several significant correlations between different genomic features. In particular, we find that average minor allele frequency and diversity are reduced in regions of low recombination and that human diversity, human-chimp divergence, and average minor allele frequency are reduced near genes. Population genetic simulations show that either positive natural selection acting on favorable mutations or negative natural selection acting against deleterious mutations can explain these correlations. However, models with strong positive selection on nonsynonymous mutations and little negative selection predict a stronger negative correlation between neutral diversity and nonsynonymous divergence than observed in the actual data, supporting the importance of negative, rather than positive, selection throughout the genome. Further, we show that the widespread presence of weakly deleterious alleles, rather than a small number of strongly positively selected mutations, is responsible for the correlation between neutral genetic diversity and recombination rate. This work suggests that natural selection has affected multiple aspects of linked neutral variation throughout the human genome and that positive selection is not required to explain these observations.