Results 1 - 20 of 55
1.
Mol Biol Evol ; 41(3)2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38466119

ABSTRACT

Ancient DNA can directly reveal the contribution of natural selection to human genomic variation. However, while the analysis of ancient DNA has been successful at identifying genomic signals of selection, inferring the phenotypic consequences of that selection has been more difficult. Most trait-associated variants are noncoding, so we expect that a large proportion of the phenotypic effects of selection will also act through noncoding variation. Since we cannot measure gene expression directly in ancient individuals, we used an approach (Joint-Tissue Imputation [JTI]) developed to predict gene expression from genotype data. We tested for changes in the predicted expression of 17,384 protein coding genes over a time transect of 4,500 years using 91 present-day and 616 ancient individuals from Britain. We identified 28 genes at seven genomic loci with significant (false discovery rate [FDR] < 0.05) changes in predicted expression levels in this time period. We compared the results from our transcriptome-wide scan to a genome-wide scan based on estimating per-single nucleotide polymorphism (SNP) selection coefficients from time series data. At five previously identified loci, our approach allowed us to highlight small numbers of genes with evidence for significant shifts in expression from peaks that in some cases span tens of genes. At two novel loci (SLC44A5 and NUP85), we identify selection on gene expression not captured by scans based on genomic signatures of selection. Finally, we show how classical selection statistics (iHS and SDS) can be combined with JTI models to incorporate functional information into scans that use present-day data alone. These results demonstrate the potential of this type of information to explore both the causes and consequences of natural selection.


Subjects
Ancient DNA , Genetic Selection , Humans , United Kingdom , Genome , Genotype , Single Nucleotide Polymorphism , Genome-Wide Association Study
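
The abstract above reports genes passing a false discovery rate threshold (FDR < 0.05). As a rough illustration of what such a threshold does, here is a minimal sketch of the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR procedure was used, so the choice of Benjamini-Hochberg, the function name `benjamini_hochberg`, and the example p-values are all assumptions made for illustration.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha.

    Illustrative sketch only; not taken from the study's code.
    """
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Step-up rule: find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical per-gene p-values, for illustration only.
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_calls = benjamini_hochberg(example_pvalues, alpha=0.05)
```

The threshold for each p-value depends on its rank, so a p-value just under 0.05 is not necessarily called significant once many genes are tested.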
2.
Genetics ; 226(4)2024 Apr 03.
Article in English | MEDLINE | ID: mdl-38386895

ABSTRACT

Understanding natural selection and other forms of non-neutrality is a major focus for the use of machine learning in population genetics. Existing methods rely on computationally intensive simulated training data. Unlike efficient neutral coalescent simulations for demographic inference, realistic simulations of selection typically require slow forward simulations. Because there are many possible modes of selection, a high dimensional parameter space must be explored, with no guarantee that the simulated models are close to the real processes. Finally, it is difficult to interpret trained neural networks, leading to a lack of understanding about what features contribute to classification. Here we develop a new approach to detect selection and other local evolutionary processes that requires relatively few selection simulations during training. We build upon a generative adversarial network trained to simulate realistic neutral data. This consists of a generator (fitted demographic model), and a discriminator (convolutional neural network) that predicts whether a genomic region is real or fake. As the generator can only generate data under neutral demographic processes, regions of real data that the discriminator recognizes as having a high probability of being "real" do not fit the neutral demographic model and are therefore candidates for targets of selection. To incentivize identification of a specific mode of selection, we fine-tune the discriminator with a small number of custom non-neutral simulations. We show that this approach has high power to detect various forms of selection in simulations, and that it finds regions under positive selection identified by state-of-the-art population genetic methods in three human populations. Finally, we show how to interpret the trained networks by clustering hidden units of the discriminator based on their correlation patterns with known summary statistics.


Subjects
Machine Learning , Computer Neural Networks , Humans , Genomics , Genetic Selection , Population Genetics
3.
Nat Hum Behav ; 8(2): 243-255, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38081999

ABSTRACT

The rules and structure of human culture impact health as much as genetics or environment. To study these relationships, we combine ancient DNA (n = 230), skeletal metrics (n = 391), palaeopathology (n = 606) and dietary stable isotopes (n = 873) to analyse stature variation in Early Neolithic Europeans from North Central, South Central, Balkan and Mediterranean regions. In North Central Europe, stable isotopes and linear enamel hypoplasias indicate high environmental stress across sexes, but female stature is low, despite polygenic scores identical to males, and suggests that cultural factors preferentially supported male recovery from stress. In Mediterranean populations, sexual dimorphism is reduced, indicating male vulnerability to stress and no strong cultural preference for males. Our analysis indicates that biological effects of sex-specific inequities can be linked to cultural influences at least as early as 7,000 yr ago, and culture, more than environment or genetics, drove height disparities in Early Neolithic Europe.


Subjects
Population Genetics , Sex Characteristics , Female , Male , Humans , Mitochondrial DNA , Europe , Isotopes
4.
Genome Biol Evol ; 15(11)2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37935112

ABSTRACT

To elucidate the population history of the Caucasus, we conducted a survey of genetic diversity in Samegrelo (Mingrelia), western Georgia. We collected DNA samples and genealogical information from 485 individuals residing in 30 different locations, the vast majority of whom were Mingrelian speaking. From these DNA samples, we generated mitochondrial DNA (mtDNA) control region sequences for all 485 participants (female and male) and Y-short tandem repeat haplotypes for the 372 male participants, and analyzed all samples at nearly 590,000 autosomal single nucleotide polymorphisms (SNPs) plus around 33,000 on the sex chromosomes, with 27,000 SNPs removed for missingness, using the GenoChip 2.0+ microarray. The resulting data were compared with those from populations from Anatolia, the Caucasus, the Near East, and Europe. Overall, Mingrelians exhibited considerable mtDNA haplogroup diversity, having high frequencies of common West Eurasian haplogroups (H, HV, I, J, K, N1, R1, R2, T, U, W, and X2) and low frequencies of East Eurasian haplogroups (A, C, D, F, and G). From a Y-chromosome standpoint, Mingrelians possessed a variety of haplogroups, including E1b1b, G2a, I2, J1, J2, L, Q, R1a, and R1b. Analysis of autosomal SNP data further revealed that Mingrelians are genetically homogeneous and cluster with other modern-day South Caucasus populations. When compared with ancient DNA samples from Bronze Age archaeological contexts in the broader region, these data indicate that the Mingrelian gene pool began taking its current form at least by this period, probably in conjunction with the formation of a distinct linguistic community.


Subjects
Human Y Chromosomes , Population Genetics , Humans , Male , Female , Georgia (Republic) , Human Y Chromosomes/genetics , Mitochondrial DNA/genetics , Europe , Haplotypes , Genetic Variation
5.
bioRxiv ; 2023 Oct 19.
Article in English | MEDLINE | ID: mdl-37904954

ABSTRACT

Ancient DNA can directly reveal the contribution of natural selection to human genomic variation. However, while the analysis of ancient DNA has been successful at identifying genomic signals of selection, inferring the phenotypic consequences of that selection has been more difficult. Most trait-associated variants are non-coding, so we expect that a large proportion of the phenotypic effects of selection will also act through non-coding variation. Since we cannot measure gene expression directly in ancient individuals, we used an approach (Joint-Tissue Imputation; JTI) developed to predict gene expression from genotype data. We tested for changes in the predicted expression of 17,384 protein coding genes over a time transect of 4500 years using 91 present-day and 616 ancient individuals from Britain. We identified 28 genes at seven genomic loci with significant (FDR < 0.05) changes in predicted expression levels in this time period. We compared the results from our transcriptome-wide scan to a genome-wide scan based on estimating per-SNP selection coefficients from time series data. At five previously identified loci, our approach allowed us to highlight small numbers of genes with evidence for significant shifts in expression from peaks that in some cases span tens of genes. At two novel loci (SLC44A5 and NUP85), we identify selection on gene expression not captured by scans based on genomic signatures of selection. Finally we show how classical selection statistics (iHS and SDS) can be combined with JTI models to incorporate functional information into scans that use present-day data alone. These results demonstrate the potential of this type of information to explore both the causes and consequences of natural selection.

6.
Curr Biol ; 33(20): R1064-R1066, 2023 10 23.
Article in English | MEDLINE | ID: mdl-37875084

ABSTRACT

A new study aims to identify how genetic and physiological adaptations to altitude affect pregnancy, childbirth and neonatal health in one of the most extreme environments on Earth, the Tibetan Plateau.


Subjects
Physiological Adaptation , Population Genetics , Newborn Infant , Humans , Physiological Adaptation/genetics , Altitude , Tibet
7.
PLoS Genet ; 19(7): e1010807, 2023 07.
Article in English | MEDLINE | ID: mdl-37418489

ABSTRACT

Germline mutation is the mechanism by which genetic variation in a population is created. Inferences derived from mutation rate models are fundamental to many population genetics methods. Previous models have demonstrated that nucleotides flanking polymorphic sites (the local sequence context) explain variation in the probability that a site is polymorphic. However, limitations to these models exist as the size of the local sequence context window expands. These include a lack of robustness to data sparsity at typical sample sizes, lack of regularization to generate parsimonious models and lack of quantified uncertainty in estimated rates to facilitate comparison between models. To address these limitations, we developed Baymer, a regularized Bayesian hierarchical tree model that captures the heterogeneous effect of sequence contexts on polymorphism probabilities. Baymer implements an adaptive Metropolis-within-Gibbs Markov Chain Monte Carlo sampling scheme to estimate the posterior distributions of sequence-context based probabilities that a site is polymorphic. We show that Baymer accurately infers polymorphism probabilities and well-calibrated posterior distributions, robustly handles data sparsity, appropriately regularizes to return parsimonious models, and scales computationally at least up to 9-mer context windows. We demonstrate application of Baymer in three ways: first, identifying differences in polymorphism probabilities between continental populations in the 1000 Genomes Phase 3 dataset; second, in a sparse data setting, examining the use of polymorphism models as a proxy for de novo mutation probabilities as a function of variant age, sequence context window size, and demographic history; and third, comparing model concordance between different great ape species. We find a shared context-dependent mutation rate architecture underlying our models, enabling a transfer-learning inspired strategy for modeling germline mutations. In summary, Baymer is an accurate polymorphism probability estimation algorithm that automatically adapts to data sparsity at different sequence context levels, thereby making efficient use of the available data.


Subjects
Human Genome , Mutation Rate , Humans , Human Genome/genetics , Bayes Theorem , Mutation , Genetic Polymorphism , Markov Chains , Monte Carlo Method
8.
HGG Adv ; 4(3): 100202, 2023 07 13.
Article in English | MEDLINE | ID: mdl-37255673

ABSTRACT

Mitochondrial DNA copy number (mtCN) is often treated as a proxy for mitochondrial (dys-) function and disease risk. Pathological changes in mtCN are common symptoms of rare mitochondrial disorders, but reported associations between mtCN and common diseases vary across studies. To understand the biology of mtCN, we carried out genome- and phenome-wide association studies of mtCN in 30,666 individuals from the Penn Medicine BioBank (PMBB), a diverse cohort of largely African and European ancestry. We estimated mtCN in peripheral blood using exome sequence data, taking cell composition into account. We replicated known genetic associations of mtCN in the PMBB and found that their effects are highly correlated between individuals of European and African ancestry. However, the heritability of mtCN was much higher among individuals of largely African ancestry (h² = 0.3) compared with European ancestry individuals (h² = 0.1). Admixture mapping suggests that there are undiscovered variants underlying mtCN that are differentiated in frequency between individuals with African and European ancestry. We show that mtCN is associated with many health-related phenotypes. We discovered robust associations between mtDNA copy number and diseases of metabolically active tissues, such as cardiovascular disease and liver damage, that were consistent across African and European ancestry individuals. Other associations, such as epilepsy and prostate cancer, were only discovered in either individuals with European or African ancestry but not both. We show that mtCN-phenotype associations can be sensitive to blood cell composition and environmental modifiers, explaining why such associations are inconsistent across studies. Thus, mtCN-phenotype associations must be interpreted with care.


Subjects
DNA Copy Number Variations , Mitochondrial DNA , Male , Animals , Mitochondrial DNA/genetics , DNA Copy Number Variations/genetics , Mitochondria/genetics , Leukocytes/metabolism , Phenotype
9.
Genetics ; 224(2)2023 05 26.
Article in English | MEDLINE | ID: mdl-37036411

ABSTRACT

Most variants identified in human genome-wide association studies and scans for selection are noncoding. Interpretation of their effects and the way in which they contribute to phenotypic variation and adaptation in human populations is therefore limited by our understanding of gene regulation and the difficulty of confidently linking noncoding variants to genes. To overcome this, we developed a gene-wise test for population-specific selection based on combinations of regulatory variants. Specifically, we use the QX statistic to test for polygenic selection on cis-regulatory variants based on whether the variance across populations in the predicted expression of a particular gene is higher than expected under neutrality. We then applied this approach to human data, testing for selection on 17,388 protein-coding genes in 26 populations from the Thousand Genomes Project. We identified 45 genes with significant evidence (FDR<0.1) for selection, including FADS1, KHK, SULT1A2, ITGAM, and several genes in the HLA region. We further confirm that these signals correspond to plausible population-level differences in predicted expression. While the small number of significant genes (0.2%) is consistent with most cis-regulatory variation evolving under genetic drift or stabilizing selection, it remains possible that there are effects not captured in this study. Our gene-level QX score is independent of standard genomic tests for selection, and may therefore be useful in combination with traditional selection scans to specifically identify selection on regulatory variation. Overall, our results demonstrate the utility of combining population-level genomic data with functional data to understand the evolution of gene expression.


Subjects
Genetic Testing , Genome-Wide Association Study , Humans , Genetic Drift , Genome , Gene Expression , Genetic Selection
10.
Nat Hum Behav ; 7(5): 790-801, 2023 05.
Article in English | MEDLINE | ID: mdl-36864135

ABSTRACT

Identifying genetic determinants of reproductive success may highlight mechanisms underlying fertility and identify alleles under present-day selection. Using data in 785,604 individuals of European ancestry, we identified 43 genomic loci associated with either number of children ever born (NEB) or childlessness. These loci span diverse aspects of reproductive biology, including puberty timing, age at first birth, sex hormone regulation, endometriosis and age at menopause. Missense variants in ARHGAP27 were associated with higher NEB but shorter reproductive lifespan, suggesting a trade-off at this locus between reproductive ageing and intensity. Other genes implicated by coding variants include PIK3IP1, ZFP82 and LRP4, and our results suggest a new role for the melanocortin 1 receptor (MC1R) in reproductive biology. As NEB is one component of evolutionary fitness, our identified associations indicate loci under present-day natural selection. Integration with data from historical selection scans highlighted an allele in the FADS1/2 gene locus that has been under selection for thousands of years and remains so today. Collectively, our findings demonstrate that a broad range of biological mechanisms contribute to reproductive success.


Subjects
Fertility , Reproduction , Child , Female , Humans , Aging/physiology , Fertility/genetics , Menopause/genetics , Reproduction/genetics , Genetic Selection
11.
Curr Biol ; 33(7): 1365-1371.e3, 2023 04 10.
Article in English | MEDLINE | ID: mdl-36963383

ABSTRACT

Ancient DNA has revealed multiple episodes of admixture in human prehistory during geographic expansions associated with cultural innovations. One important example is the expansion of Neolithic agricultural groups out of the Near East into Europe and their consequent admixture with Mesolithic hunter-gatherers. Ancient genomes from this period provide an opportunity to study the role of admixture in providing new genetic variation for selection to act upon, and also to identify genomic regions that resisted hunter-gatherer introgression and may thus have contributed to agricultural adaptations. We used genome-wide DNA from 677 individuals spanning Mesolithic and Neolithic Europe to infer ancestry deviations in the genomes of admixed individuals and to test for natural selection after admixture by testing for deviations from a genome-wide null distribution. We find that the region around the pigmentation-associated gene SLC24A5 shows the greatest overrepresentation of Neolithic local ancestry in the genome (|Z| = 3.46). In contrast, we find the greatest overrepresentation of Mesolithic ancestry across the major histocompatibility complex (MHC; |Z| = 4.21), a major immunity locus, which also shows allele frequency deviations indicative of selection following admixture (p = 1 × 10⁻⁵⁶). This could reflect negative frequency-dependent selection on MHC alleles common in Neolithic populations or that Mesolithic alleles were positively selected for and facilitated adaptation in Neolithic populations to pathogens or other environmental factors. Our study extends previous results that highlight immune function and pigmentation as targets of adaptation in more recent populations to selection processes in the Stone Age.


Subjects
DNA , Farmers , Humans , Europe , Alleles , Genetic Selection
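
The |Z| values quoted in the abstract above describe how far a region's local ancestry departs from the genome-wide distribution. A minimal sketch of that idea follows, assuming a plain standardization against the genome-wide mean and sample standard deviation. The study's actual estimator may well differ; the function name `ancestry_z_scores` and the input values are hypothetical.

```python
from statistics import mean, stdev


def ancestry_z_scores(local_ancestry):
    """Standardize per-region ancestry proportions against the genome-wide distribution.

    local_ancestry: one ancestry proportion per genomic region (hypothetical input).
    Returns one Z-score per region. Illustrative sketch only.
    """
    mu = mean(local_ancestry)
    sigma = stdev(local_ancestry)  # sample standard deviation across regions
    if sigma == 0:
        raise ValueError("all regions have identical ancestry; Z-scores are undefined")
    return [(x - mu) / sigma for x in local_ancestry]


# Hypothetical proportions for five regions; the last one is an outlier.
example_z = ancestry_z_scores([0.5, 0.5, 0.5, 0.5, 0.9])
```

Regions with the largest absolute Z-score are the ones that depart most from the genome-wide background and would be candidates for follow-up.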
12.
bioRxiv ; 2023 Mar 15.
Article in English | MEDLINE | ID: mdl-36993413

ABSTRACT

Klunk et al. analyzed ancient DNA data from individuals in London and Denmark before, during and after the Black Death [1], and argued that allele frequency changes at immune genes were too large to be produced by random genetic drift and thus must reflect natural selection. They also identified four specific variants that they claimed show evidence of selection including at ERAP2, for which they estimate a selection coefficient of 0.39, several times larger than any selection coefficient on a common human variant reported to date. Here we show that these claims are unsupported for four reasons. First, the signal of enrichment of large allele frequency changes in immune genes comparing people in London before and after the Black Death disappears after an appropriate randomization test is carried out: the P value increases by ten orders of magnitude and is no longer significant. Second, a technical error in the estimation of allele frequencies means that none of the four originally reported loci actually pass the filtering thresholds. Third, the filtering thresholds do not adequately correct for multiple testing. Finally, in the case of the ERAP2 variant rs2549794, which Klunk et al. show experimentally may be associated with a host interaction with Y. pestis, we find no evidence of significant frequency change either in the data that Klunk et al. report, or in published data spanning 2,000 years. While it remains plausible that immune genes were subject to natural selection during the Black Death, the magnitude of this selection and which specific genes may have been affected remain unknown.
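
The first point in the abstract above turns on a randomization test. The abstract does not describe how that test was constructed, so the following is only a generic label-permutation sketch: it asks whether the mean of a per-variant statistic in a candidate set is larger than expected when the candidate labels are shuffled. The function name `permutation_p_value`, its parameters, and the example inputs are hypothetical.

```python
import random


def permutation_p_value(values, is_candidate, n_perm=10000, seed=0):
    """One-sided permutation p-value for the mean of `values` in the candidate set.

    values: per-variant statistic, e.g. absolute allele frequency change (hypothetical).
    is_candidate: booleans marking the candidate variants (e.g. in immune genes).
    Illustrative sketch only; not the procedure used in the preprint.
    """
    rng = random.Random(seed)

    def candidate_mean(labels):
        selected = [v for v, flag in zip(values, labels) if flag]
        return sum(selected) / len(selected)

    observed = candidate_mean(is_candidate)
    labels = list(is_candidate)  # copy so the caller's list is not shuffled
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between label and value
        if candidate_mean(labels) >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_perm + 1)
```

Because the null distribution is built from the data themselves, a test of this kind needs no distributional assumption about the statistic, which is one reason randomization can change a parametric p-value substantially.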

13.
bioRxiv ; 2023 Jul 09.
Article in English | MEDLINE | ID: mdl-36945387

ABSTRACT

Understanding natural selection in humans and other species is a major focus for the use of machine learning in population genetics. Existing methods rely on computationally intensive simulated training data. Unlike efficient neutral coalescent simulations for demographic inference, realistic simulations of selection typically require slow forward simulations. Because there are many possible modes of selection, a high dimensional parameter space must be explored, with no guarantee that the simulated models are close to the real processes. Mismatches between simulated training data and real test data can lead to incorrect inference. Finally, it is difficult to interpret trained neural networks, leading to a lack of understanding about what features contribute to classification. Here we develop a new approach to detect selection that requires relatively few selection simulations during training. We use a Generative Adversarial Network (GAN) trained to simulate realistic neutral data. The resulting GAN consists of a generator (fitted demographic model) and a discriminator (convolutional neural network). For a genomic region, the discriminator predicts whether it is "real" or "fake" in the sense that it could have been simulated by the generator. As the "real" training data includes regions that experienced selection and the generator cannot produce such regions, regions with a high probability of being real are likely to have experienced selection. To further incentivize this behavior, we "fine-tune" the discriminator with a small number of selection simulations. We show that this approach has high power to detect selection in simulations, and that it finds regions under selection identified by state-of-the-art population genetic methods in three human populations. Finally, we show how to interpret the trained networks by clustering hidden units of the discriminator based on their correlation patterns with known summary statistics. In summary, our approach is a novel, efficient, and powerful way to use machine learning to detect natural selection.

14.
PLoS Genet ; 19(1): e1010584, 2023 01.
Article in English | MEDLINE | ID: mdl-36656851

ABSTRACT

Loss or absence of hearing is common at both extremes of human lifespan, in the forms of congenital deafness and age-related hearing loss. While these are often studied separately, there is increasing evidence that their genetic basis is at least partially overlapping. In particular, both common and rare variants in genes associated with monogenic forms of hearing loss also contribute to the more polygenic basis of age-related hearing loss. Here, we directly test this model in the Penn Medicine BioBank, a healthcare system cohort of around 40,000 individuals with linked genetic and electronic health record data. We show that increased burden of predicted deleterious variants in Mendelian hearing loss genes is associated with increased risk and severity of adult-onset hearing loss. As a specific example, we identify one gene (TCOF1, responsible for a syndromic form of congenital hearing loss) in which deleterious variants are also associated with adult-onset hearing loss. We also identify four additional novel candidate genes (COL5A1, HMMR, RAPGEF3, and NNT) in which rare variant burden may be associated with hearing loss. Our results confirm that rare variants in Mendelian hearing loss genes contribute to polygenic risk of hearing loss, and emphasize the utility of healthcare system cohorts to study common complex traits and diseases.


Subjects
Deafness , Sensorineural Hearing Loss , Hearing Loss , Humans , Adult , Deafness/genetics , Hearing Loss/genetics , Sensorineural Hearing Loss/genetics , Multifactorial Inheritance , Hearing , Mutation
15.
Genome Res ; 32(11-12): 2057-2067, 2022.
Article in English | MEDLINE | ID: mdl-36316157

ABSTRACT

We developed a novel method for efficiently estimating time-varying selection coefficients from genome-wide ancient DNA data. In simulations, our method accurately recovers selective trajectories and is robust to misspecification of population size. We applied it to a large data set of ancient and present-day human genomes from Britain and identified seven loci with genome-wide significant evidence of selection in the past 4500 yr. Almost all of them can be related to increased vitamin D or calcium levels, suggesting strong selective pressure on these or related phenotypes. However, the strength of selection on individual loci varied substantially over time, suggesting that cultural or environmental factors moderated the genetic response. Of 28 complex anthropometric and metabolic traits, skin pigmentation was the only one with significant evidence of polygenic selection, further underscoring the importance of phenotypes related to vitamin D. Our approach illustrates the power of ancient DNA to characterize selection in human populations and illuminates the recent evolutionary history of Britain.


Subjects
Ancient DNA , Genetic Selection , Humans , United Kingdom , Skin Pigmentation , Human Genome
17.
Genome Biol Evol ; 13(11)2021 11 05.
Article in English | MEDLINE | ID: mdl-34718543

ABSTRACT

As humans populated the world, they adapted to many varying environmental factors, including climate, diet, and pathogens. Because many of these adaptations were mediated by multiple noncoding variants with small effects on gene regulation, it has been difficult to link genomic signals of selection to specific genes, and to describe the regulatory response to selection. To overcome this challenge, we adapted PrediXcan, a machine learning method for imputing gene regulation from genotype data, to analyze low-coverage ancient human DNA (aDNA). First, we used simulated genomes to benchmark strategies for adapting PrediXcan to increase robustness to incomplete data. Applying the resulting models to 490 ancient Eurasians, we found that genes with the strongest divergent regulation among ancient populations with hunter-gatherer, pastoralist, and agricultural lifestyles are enriched for metabolic and immune functions. Next, we explored the contribution of divergent gene regulation to two traits with strong evidence of recent adaptation: dietary metabolism and skin pigmentation. We found enrichment for divergent regulation among genes proposed to be involved in diet-related local adaptation, and the predicted effects on regulation often suggest explanations for known signals of selection, for example, at FADS1, GPX1, and LEPR. In contrast, skin pigmentation genes show little regulatory change over a 38,000-year time series of 2,999 ancient Europeans, suggesting that adaptation mainly involved large-effect coding variants. This work demonstrates that combining aDNA with present-day genomes is informative about the biological differences among ancient populations, the role of gene regulation in adaptation, and the relationship between genetic diversity and complex traits.


Subjects
Biological Adaptation , Human Genome , Biological Adaptation/genetics , Physiological Adaptation , Biological Evolution , Ancient DNA , Humans , Multifactorial Inheritance , Genetic Selection
18.
Am J Hum Genet ; 108(9): 1558-1563, 2021 09 02.
Article in English | MEDLINE | ID: mdl-34331855

ABSTRACT

The omnigenic model was proposed as a framework to understand the highly polygenic architecture of complex traits revealed by genome-wide association studies (GWASs). I argue that this model also explains recent observations about cross-population genetic effects, specifically the low transferability of polygenic scores and the lack of clear evidence for polygenic selection. In particular, the omnigenic model explains why the effects of most GWAS variants vary between populations. This interpretation has several consequences for the evolutionary interpretation and practical use of GWAS summary statistics and polygenic scores. First, some polygenic scores may be applicable only in populations of the same ancestry and environment as the discovery population. Second, most GWAS associations will have differing effects between populations and are unlikely to be robust clinical targets. Finally, it may not always be possible to detect polygenic selection from population genetic data. These considerations make it difficult to interpret the clinical and evolutionary meanings of polygenic scores without an explicit model of genetic architecture.


Subjects
Population Genetics/methods , Genetic Models , Multifactorial Inheritance , Heritable Quantitative Trait , Computer Simulation , Genetic Variation , Genome-Wide Association Study , Humans , Phenotype , Single Nucleotide Polymorphism , Quantitative Trait Loci
19.
Mol Ecol Resour ; 21(8): 2689-2705, 2021 Nov.
Article in English | MEDLINE | ID: mdl-33745225

ABSTRACT

Population genetics relies heavily on simulated data for validation, inference and intuition. In particular, since the evolutionary 'ground truth' for real data is always limited, simulated data are crucial for training supervised machine learning methods. Simulation software can accurately model evolutionary processes but requires many hand-selected input parameters. As a result, simulated data often fail to mirror the properties of real genetic data, which limits the scope of methods that rely on it. Here, we develop a novel approach to estimating parameters in population genetic models that automatically adapts to data from any population. Our method, pg-gan, is based on a generative adversarial network that gradually learns to generate realistic synthetic data. We demonstrate that our method is able to recover input parameters in a simulated isolation-with-migration model. We then apply our method to human data from the 1000 Genomes Project and show that we can accurately recapitulate the features of real data.


Subjects
Software , Computer Simulation , Demography , Humans
20.
Proc Natl Acad Sci U S A ; 118(1)2021 01 05.
Article in English | MEDLINE | ID: mdl-33443182

ABSTRACT

Skin pigmentation is a classic example of a polygenic trait that has experienced directional selection in humans. Genome-wide association studies have identified well over a hundred pigmentation-associated loci, and genomic scans in present-day and ancient populations have identified selective sweeps for a small number of light pigmentation-associated alleles in Europeans. It is unclear whether selection has operated on all of the genetic variation associated with skin pigmentation as opposed to just a small number of large-effect variants. Here, we address this question using ancient DNA from 1,158 individuals from West Eurasia covering a period of 40,000 y combined with genome-wide association summary statistics from the UK Biobank. We find a robust signal of directional selection in ancient West Eurasians on 170 skin pigmentation-associated variants ascertained in the UK Biobank. However, we also show that this signal is driven by a limited number of large-effect variants. Consistent with this observation, we find that a polygenic selection test in present-day populations fails to detect selection with the full set of variants. Our data allow us to disentangle the effects of admixture and selection. Most notably, a large-effect variant at SLC24A5 was introduced to Western Europe by migrations of Neolithic farming populations but continued to be under selection post-admixture. This study shows that the response to selection for light skin pigmentation in West Eurasia was driven by a relatively small proportion of the variants that are associated with present-day phenotypic variation.


Subjects
Ancient DNA/analysis , Genetic Selection/genetics , Skin Pigmentation/genetics , Alleles , Asia , Asian People/genetics , Biological Evolution , Genetic Databases , Europe , Gene Frequency/genetics , Genome-Wide Association Study/methods , Genotype , Haplotypes/genetics , Humans , Multifactorial Inheritance/genetics , Single Nucleotide Polymorphism/genetics , Skin Pigmentation/physiology , White People/genetics