ABSTRACT
We conduct high-coverage (>30×) whole-genome sequencing of 180 individuals from 12 indigenous African populations. We identify millions of unreported variants, many predicted to be functionally important. We observe that the ancestors of southern African San and central African rainforest hunter-gatherers (RHG) diverged from other populations >200 kya and maintained a large effective population size. We observe evidence for ancient population structure in Africa and for multiple introgression events from "ghost" populations with highly diverged genetic lineages. Although currently geographically isolated, we observe evidence for gene flow between eastern and southern Khoesan-speaking hunter-gatherer populations lasting until ∼12 kya. We identify signatures of local adaptation for traits related to skin color, immune response, height, and metabolic processes. We identify a positively selected variant in the lightly pigmented San that influences pigmentation in vitro by regulating the enhancer activity and gene expression of PDPK1.
Subject(s)
Acclimatization , Skin Pigmentation , Humans , Whole Genome Sequencing , Population Density , Africa , 3-Phosphoinositide-Dependent Protein Kinases
ABSTRACT
Polygenic risk scores (PRSs) summarize the genetic predisposition of a complex human trait or disease and may become a valuable tool for advancing precision medicine. However, PRSs that are developed in populations of predominantly European genetic ancestries can increase health disparities due to poor predictive performance in individuals of diverse and complex genetic ancestries. We describe genetic and modifiable risk factors that limit the transferability of PRSs across populations and review the strengths and weaknesses of existing PRS construction methods for diverse ancestries. Developing PRSs that benefit global populations in research and clinical settings provides an opportunity for innovation and is essential for health equity.
Subject(s)
Genetic Predisposition to Disease , Humans , Risk Factors , Multifactorial Inheritance , Precision Medicine , Genome-Wide Association Study
ABSTRACT
SUMMARY: Admixed populations, with their unique and diverse genetic backgrounds, are often underrepresented in genetic studies. This oversight not only limits our understanding but also exacerbates existing health disparities. One major barrier has been the lack of efficient tools tailored for the special challenges of genetic studies of admixed populations. Here, we present admix-kit, an integrated toolkit and pipeline for genetic analyses of admixed populations. Admix-kit implements a suite of methods to facilitate genotype and phenotype simulation, association testing, genetic architecture inference, and polygenic scoring in admixed populations. AVAILABILITY AND IMPLEMENTATION: The admix-kit package is open source and available at https://github.com/KangchengHou/admix-kit. Additionally, users can use the pipeline designed for admixed genotype simulation available at https://github.com/UW-GAC/admix-kit_workflow.
Subject(s)
Software , Genotype , Phenotype
ABSTRACT
Human genomic diversity has been shaped by both ancient and ongoing challenges from viruses. The current coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has had a devastating impact on population health. However, genetic diversity and evolutionary forces impacting host genes related to SARS-CoV-2 infection are not well understood. We investigated global patterns of genetic variation and signatures of natural selection at host genes relevant to SARS-CoV-2 infection (angiotensin converting enzyme 2 [ACE2], transmembrane protease serine 2 [TMPRSS2], dipeptidyl peptidase 4 [DPP4], and lymphocyte antigen 6 complex locus E [LY6E]). We analyzed data from 2,012 ethnically diverse Africans and 15,977 individuals of European and African ancestry with electronic health records and integrated with global data from the 1000 Genomes Project. At ACE2, we identified 41 nonsynonymous variants that were rare in most populations, several of which impact protein function. However, three nonsynonymous variants (rs138390800, rs147311723, and rs145437639) were common among central African hunter-gatherers from Cameroon (minor allele frequency 0.083 to 0.164) and are on haplotypes that exhibit signatures of positive selection. We identify signatures of selection impacting variation at regulatory regions influencing ACE2 expression in multiple African populations. At TMPRSS2, we identified 13 amino acid changes that are adaptive and specific to the human lineage compared with the chimpanzee genome. Genetic variants that are targets of natural selection are associated with clinical phenotypes common in patients with COVID-19. Our study provides insights into global variation at host genes related to SARS-CoV-2 infection, which have been shaped by natural selection in some populations, possibly due to prior viral infections.
Subject(s)
COVID-19 , Africa , Angiotensin-Converting Enzyme 2/genetics , COVID-19/genetics , Genetic Variation , Humans , Phenotype , SARS-CoV-2/genetics , Selection, Genetic
ABSTRACT
BACKGROUND: Exfoliation syndrome (XFS) is an age-related systemic disorder characterized by excessive production and progressive accumulation of abnormal extracellular material, with pathognomonic ocular manifestations. It is the most common cause of secondary glaucoma, resulting in widespread global blindness. The largest global meta-analysis of XFS in 123,457 multi-ethnic individuals from 24 countries identified seven loci with the strongest association signal in the chr15q22-25 region near LOXL1. Expression analyses have so far correlated coding and a few non-coding variants in the region with LOXL1 expression levels, but the functional effects of these variants are unclear. We hypothesize that analysis of the contribution of the genetically determined component of gene expression to XFS risk can provide a powerful method to elucidate potential roles of additional genes and clarify the biology that underlies XFS. RESULTS: Transcriptome-wide association studies (TWAS) using PrediXcan models trained in 48 GTEx tissues were performed, leveraging results from the multi-ethnic and European ancestry GWAS. To eliminate the possibility of false-positive results due to linkage disequilibrium (LD) contamination, we i) performed PrediXcan analysis in reduced models removing variants in LD with LOXL1 missense variants associated with XFS, and variants in LOXL1 models in both multi-ethnic and European ancestry individuals, ii) conducted conditional analysis of the significant signals in European ancestry individuals, iii) filtered signals based on correlated gene expression, LD, and shared eQTLs, and iv) conducted expression validation analysis in human iris tissues. We observed twenty-eight genes in the chr15q22-25 region that showed statistically significant associations, which were narrowed down to ten genes after statistical validation.
In experimental analysis, mRNA transcript levels for ARID3B, CD276, LOXL1, NEO1, SCAMP2, and UBL7 were significantly decreased in iris tissues from XFS patients compared to control samples. TWAS genes for XFS were significantly enriched for genes associated with inflammatory conditions. We also observed a higher incidence of XFS comorbidity with inflammatory and connective tissue diseases. CONCLUSION: Our results implicate a role for connective tissues and inflammation pathways in the etiology of XFS. Targeting the inflammatory pathway may be a potential therapeutic option to reduce progression in XFS.
Subject(s)
Exfoliation Syndrome , Humans , Exfoliation Syndrome/genetics , Exfoliation Syndrome/complications , Exfoliation Syndrome/metabolism , Amino Acid Oxidoreductases/genetics , RNA, Messenger , Mutation, Missense , Gene Expression , Polymorphism, Single Nucleotide , DNA-Binding Proteins/genetics , B7 Antigens/genetics
ABSTRACT
Anatomically modern humans arose in Africa ∼300,000 years ago, but the demographic and adaptive histories of African populations are not well characterized. Here, we have generated a genome-wide dataset from 840 Africans, residing in western, eastern, southern, and northern Africa, belonging to 50 ethnicities, and speaking languages belonging to four language families. In addition to agriculturalists and pastoralists, our study includes 16 populations that practice, or until recently have practiced, a hunting-gathering (HG) lifestyle. We observe that genetic structure in Africa is broadly correlated not only with geography but also, to a lesser extent, with linguistic affiliation and subsistence strategy. Four East African HG (EHG) populations that are geographically distant from each other show evidence of common ancestry: the Hadza and Sandawe in Tanzania, who speak languages with clicks classified as Khoisan; the Dahalo in Kenya, whose language has remnant clicks; and the Sabue in Ethiopia, who speak an unclassified language. Additionally, we observed common ancestry between central African rainforest HGs and southern African San, the latter of whom speak languages with clicks classified as Khoisan. With the exception of the EHG, central African rainforest HGs, and San, other HG groups in Africa appear genetically similar to neighboring agriculturalist or pastoralist populations. We additionally demonstrate that infectious disease, immune response, and diet have played important roles in the adaptive landscape of African history. However, while the broad biological processes involved in recent human adaptation in Africa are often consistent across populations, the specific loci affected by selective pressures more often vary across populations.
Subject(s)
Black People/genetics , Ethnicity/genetics , Genetic Variation , Genome, Human , Language , Phylogeny , Female , Humans , Male
ABSTRACT
Leukocyte telomere length (LTL), which reflects telomere length in other somatic tissues, is a complex genetic trait. Eleven SNPs have been shown in genome-wide association studies to be associated with LTL at a genome-wide level of significance within cohorts of European ancestry. It has been observed that LTL is longer in African Americans than in Europeans. The underlying reason for this difference is unknown. Here we show that LTL is significantly longer in sub-Saharan Africans than in both Europeans and African Americans. Based on the 11 LTL-associated alleles and genetic data in phase 3 of the 1000 Genomes Project, we show that the shifts in allele frequency within Europe and between Europe and Africa do not fit the pattern expected by neutral genetic drift. Our findings suggest that differences in LTL within Europeans and between Europeans and Africans are influenced by polygenic adaptation and that differences in LTL between Europeans and Africans might explain, in part, ethnic differences in risks for human diseases that have been linked to LTL.
Subject(s)
Leukocytes/cytology , Telomere Homeostasis/genetics , Telomere Shortening/genetics , Telomere/genetics , Adolescent , Adult , Black or African American/genetics , Aged , Aged, 80 and over , Alleles , Black People/genetics , Child , Female , Genetic Drift , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , White People/genetics
ABSTRACT
Mammalian gestation and pregnancy are fast evolving processes that involve the interaction of the fetal, maternal and paternal genomes. Version 1.0 of the GEneSTATION database (http://genestation.org) integrates diverse types of omics data across mammals to advance understanding of the genetic basis of gestation and pregnancy-associated phenotypes and to accelerate the translation of discoveries from model organisms to humans. GEneSTATION is built using tools from the Generic Model Organism Database project, including the biology-aware database CHADO, new tools for rapid data integration, and algorithms that streamline synthesis and user access. GEneSTATION contains curated life history information on pregnancy and reproduction from 23 high-quality mammalian genomes. For every human gene, GEneSTATION contains diverse evolutionary (e.g. gene age, population genetic and molecular evolutionary statistics), organismal (e.g. tissue-specific gene and protein expression, differential gene expression, disease phenotype), and molecular data types (e.g. Gene Ontology Annotation, protein interactions), as well as links to many general (e.g. Entrez, PubMed) and pregnancy disease-specific (e.g. PTBgene, dbPTB) databases. By facilitating the synthesis of diverse functional and evolutionary data in pregnancy-associated tissues and phenotypes and enabling their quick, intuitive, accurate and customized meta-analysis, GEneSTATION provides a novel platform for comprehensive investigation of the function and evolution of mammalian pregnancy.
Subject(s)
Databases, Genetic , Evolution, Molecular , Pregnancy/genetics , Animals , Cats , Cattle , Dogs , Female , Gene Expression , Genomics , Guinea Pigs , Humans , Mice , Organ Specificity , Phenotype , Pregnancy/metabolism , Pregnancy Complications/genetics , Pregnancy Complications/metabolism , Rabbits , Rats , Reproduction/genetics
ABSTRACT
In humans, the ability to digest lactose, the sugar in milk, declines after weaning because of decreasing levels of the enzyme lactase-phlorizin hydrolase, encoded by LCT. However, some individuals maintain high enzyme amounts and are able to digest lactose into adulthood (i.e., they have the lactase-persistence [LP] trait). It is thought that selection has played a major role in maintaining this genetically determined phenotypic trait in different human populations that practice pastoralism. To identify variants associated with the LP trait and to study its evolutionary history in Africa, we sequenced MCM6 introns 9 and 13 and ~2 kb of the LCT promoter region in 819 individuals from 63 African populations and in 154 non-Africans from nine populations. We also genotyped four microsatellites in an ~198 kb region in a subset of 252 individuals to reconstruct the origin and spread of LP-associated variants in Africa. Additionally, we examined the association between LP and genetic variability at candidate regulatory regions in 513 individuals from eastern Africa. Our analyses confirmed the association between the LP trait and three common variants in intron 13 (C-14010, G-13907, and G-13915). Furthermore, we identified two additional LP-associated SNPs in intron 13 and the promoter region (G-12962 and T-956, respectively). Using neutrality tests based on the allele frequency spectrum and long-range linkage disequilibrium, we detected strong signatures of recent positive selection in eastern African populations and the Fulani from central Africa. In addition, haplotype analysis supported an eastern African origin of the C-14010 LP-associated mutation in southern Africa.
Subject(s)
Lactase/metabolism , Africa , Humans , Introns , Lactase-Phlorizin Hydrolase/genetics , Lactase-Phlorizin Hydrolase/metabolism , Microsatellite Repeats/genetics , Minichromosome Maintenance Complex Component 6/genetics , Polymerase Chain Reaction , Polymorphism, Single Nucleotide , Promoter Regions, Genetic
ABSTRACT
Disease susceptibility can arise as a consequence of adaptation to infectious disease. Recent findings have suggested that higher rates of chronic kidney disease (CKD) in individuals with recent African ancestry might be attributed to two risk alleles (G1 and G2) at the serum-resistance-associated (SRA)-interacting-domain-encoding region of APOL1. These two alleles appear to have arisen adaptively, possibly as a result of their protective effects against human African trypanosomiasis (HAT), or African sleeping sickness. In order to explore the distribution of potential functional variation at APOL1, we studied nucleotide variation in 187 individuals across ten geographically and genetically diverse African ethnic groups with exposure to two Trypanosoma brucei subspecies that cause HAT. We observed unusually high levels of nonsynonymous polymorphism in the regions encoding the functional domains that are required for lysing parasites. Whereas allele frequencies of G2 were similar across all populations (3%-8%), the G1 allele was only common in the Yoruba (39%). Additionally, we identified a haplotype (termed G3) that contains a nonsynonymous change at the membrane-addressing-domain-encoding region of APOL1 and is present in all populations except for the Yoruba. Analyses of long-range patterns of linkage disequilibrium indicate evidence of recent selection acting on the G3 haplotype in Fulani from Cameroon. Our results indicate that the G1 and G2 variants in APOL1 are geographically restricted and that there might be other functional variants that could play a role in HAT resistance and CKD risk in African populations.
Subject(s)
Apolipoproteins/genetics , Black People/genetics , Lipoproteins, HDL/genetics , Polymorphism, Single Nucleotide , Selection, Genetic , Adaptation, Biological , Africa , Alleles , Apolipoprotein L1 , Disease Resistance/genetics , Evolution, Molecular , Exons , Gene Frequency , Genetic Predisposition to Disease , Genetics, Population/methods , Haplotypes , Humans , Linkage Disequilibrium , Molecular Sequence Data , Renal Insufficiency, Chronic/ethnology , Renal Insufficiency, Chronic/genetics , Risk Factors , Trypanosomiasis, African/ethnology , Trypanosomiasis, African/genetics
ABSTRACT
A SNP in the gene encoding lactase (LCT) (C/T-13910) is associated with the ability to digest milk as adults (lactase persistence) in Europeans, but the genetic basis of lactase persistence in Africans was previously unknown. We conducted a genotype-phenotype association study in 470 Tanzanians, Kenyans and Sudanese and identified three SNPs (G/C-14010, T/G-13915 and C/G-13907) that are associated with lactase persistence and that have derived alleles that significantly enhance transcription from the LCT promoter in vitro. These SNPs originated on different haplotype backgrounds from the European C/T-13910 SNP and from each other. Genotyping across a 3-Mb region demonstrated haplotype homozygosity extending >2.0 Mb on chromosomes carrying C-14010, consistent with a selective sweep over the past approximately 7,000 years. These data provide a marked example of convergent evolution due to strong selective pressure resulting from shared cultural traits: animal domestication and adult milk consumption.
Subject(s)
Adaptation, Biological , Lactase/genetics , Lactose/metabolism , Adult , Africa , Animals , Caco-2 Cells , Europe , Evolution, Molecular , Gene Frequency , Haplotypes , Humans , Lactose/blood , Lactose Tolerance Test , Milk/metabolism , Polymorphism, Single Nucleotide , Selection, Genetic
ABSTRACT
Bitter taste perception influences human nutrition and health, and the genetic variation underlying this trait may play a role in disease susceptibility. To better understand the genetic architecture and patterns of phenotypic variability of bitter taste perception, we sequenced a 996 bp region, encompassing the coding exon of TAS2R16, a bitter taste receptor gene, in 595 individuals from 74 African populations and in 94 non-Africans from 11 populations. We also performed genotype-phenotype association analyses of threshold levels of sensitivity to salicin, a bitter anti-inflammatory compound, in 296 individuals from Central and East Africa. In addition, we characterized TAS2R16 mutants in vitro to investigate the effects of polymorphic loci identified at this locus on receptor function. Here, we report striking signatures of positive selection, including significant Fay and Wu's H statistics predominantly in East Africa, indicating strong local adaptation and greater genetic structure among African populations than expected under neutrality. Furthermore, we observed a "star-like" phylogeny for haplotypes with the derived allele at polymorphic site 516 associated with increased bitter taste perception that is consistent with a model of selection for "high-sensitivity" variation. In contrast, haplotypes carrying the "low-sensitivity" ancestral allele at site 516 showed evidence of strong purifying selection. We also demonstrated, for the first time, the functional effect of nonsynonymous variation at site 516 on salicin phenotypic variance in vivo in diverse Africans and showed that most other nonsynonymous substitutions have weak or no effect on cell surface expression in vitro, suggesting that one main polymorphism at TAS2R16 influences salicin recognition. Additionally, we detected geographic differences in levels of bitter taste perception in Africa not previously reported and infer an East African origin for high salicin sensitivity in human populations.
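As an illustration of the Fay and Wu's H statistic mentioned in this abstract, the following is a minimal sketch of the unnormalized statistic, H = θπ − θH, computed from an unfolded site frequency spectrum. The function name, the dictionary input format, and the toy counts are assumptions made for illustration only; they are not the study's data or pipeline.

```python
def fay_wu_h(derived_counts, n):
    """Unnormalized Fay and Wu's H (theta_pi - theta_H).

    derived_counts: dict mapping a derived-allele count i (1 <= i <= n - 1)
                    to the number of segregating sites with that count.
    n: number of sampled chromosomes.
    """
    if n < 2:
        raise ValueError("need at least two chromosomes")
    for i in derived_counts:
        if not 1 <= i <= n - 1:
            raise ValueError("derived-allele counts must lie in 1..n-1")
    pairs = n * (n - 1)
    # theta_pi: average pairwise differences, from the frequency spectrum
    theta_pi = sum(2 * i * (n - i) * k for i, k in derived_counts.items()) / pairs
    # theta_H: weights sites by the squared derived-allele count
    theta_h = sum(2 * i * i * k for i, k in derived_counts.items()) / pairs
    return theta_pi - theta_h


# Hypothetical toy spectrum: 4 chromosomes, two singletons and one site
# where the derived allele is carried by 3 of the 4 chromosomes.
h = fay_wu_h({1: 2, 3: 1}, n=4)
```

A negative value reflects an excess of high-frequency derived alleles, the pattern usually read as a signal of a recent selective sweep; significance is assessed against a neutral null distribution, which this sketch does not attempt.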
Subject(s)
Benzyl Alcohols/chemistry , Black People/genetics , Glucosides/chemistry , Receptors, G-Protein-Coupled/genetics , Taste/genetics , Alleles , Evolution, Molecular , Exons , Genetic Association Studies , Genetic Variation , Haplotypes , Humans , Malaria/epidemiology , Malaria/genetics , Models, Genetic , Phylogeny , Phylogeography , Polymorphism, Single Nucleotide , Receptors, G-Protein-Coupled/metabolism , Selection, Genetic
ABSTRACT
Malaria has been a very strong selection pressure in recent human evolution, particularly in Africa. Of the one million deaths per year due to malaria, more than 90% are in sub-Saharan Africa, a region with high levels of genetic variation and population substructure. However, there have been few studies of nucleotide variation at genetic loci that are relevant to malaria susceptibility across geographically and genetically diverse ethnic groups in Africa. Invasion of erythrocytes by Plasmodium falciparum parasites is central to the pathology of malaria. Glycophorin A (GYPA) and B (GYPB), which determine MN and Ss blood types, are two major receptors that are expressed on erythrocyte surfaces and interact with parasite ligands. We analyzed nucleotide diversity of the glycophorin gene family in 15 African populations with different levels of malaria exposure. High levels of nucleotide diversity and gene conversion were found at these genes. We observed divergent patterns of genetic variation between these duplicated genes and between different extracellular domains of GYPA. Specifically, we identified fixed adaptive changes at exons 3-4 of GYPA. By contrast, we observed an allele frequency spectrum skewed toward a significant excess of intermediate-frequency alleles at GYPA exon 2 in many populations; the degree of spectrum distortion is correlated with malaria exposure, possibly because of the joint effects of gene conversion and balancing selection. We also identified a haplotype causing three amino acid changes in the extracellular domain of glycophorin B. This haplotype might have evolved adaptively in five populations with high exposure to malaria.
Subject(s)
Endemic Diseases , Genetic Predisposition to Disease , Glycophorins/genetics , MNSs Blood-Group System/genetics , Malaria, Falciparum/genetics , Selection, Genetic , Africa South of the Sahara , Amino Acid Substitution , Animals , Base Sequence , Erythrocytes/metabolism , Erythrocytes/parasitology , Ethnicity/genetics , Exons , Genetic Loci , Glycophorins/chemistry , Glycophorins/classification , Humans , Malaria, Falciparum/blood , Malaria, Falciparum/epidemiology , Molecular Sequence Data , Phylogeny , Plasmodium falciparum , Polymorphism, Single Nucleotide , Protein Structure, Tertiary
ABSTRACT
Bitter taste perception, mediated by receptors encoded by the TAS2R loci, has important roles in human health and nutrition. Prior studies have demonstrated that nonsynonymous variation at site 516 in the coding exon of TAS2R16, a bitter taste receptor gene on chromosome 7, has been subject to positive selection and is strongly correlated with differences in sensitivity to salicin, a bitter anti-inflammatory compound, in human populations. However, a recent study suggested that the derived G-allele at rs702424 in the TAS2R16 promoter has also been the target of recent selection and may have an additional effect on the levels of salicin bitter taste perception. Here, we examined alleles at rs702424 for signatures of selection using Extended Haplotype Homozygosity (EHH) and FST statistics in diverse populations from West Central, Central and East Africa. We also performed a genotype-phenotype analysis of salicin sensitivity in a subset of 135 individuals from East Africa. Based on our data, we did not find evidence for positive selection at rs702424 in African populations, suggesting that nucleotide position 516 is likely the site under selection at TAS2R16. Moreover, we did not detect a significant association between rs702424 alleles and salicin bitter taste recognition, implying that this site does not contribute to salicin phenotypic variance. Overall, this study of African diversity provides further information regarding the genetic architecture and evolutionary history of a biologically relevant trait in humans.
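The abstract does not state which FST estimator was used. As a rough illustration of what the statistic measures, the sketch below computes the basic heterozygosity-based FST, (H_T − H_S) / H_T, for one biallelic site in two equally weighted populations. The function name and the example allele frequencies are hypothetical, and sample-size corrections used by estimators applied to real data are deliberately left out.

```python
def fst_two_populations(p1, p2):
    """Heterozygosity-based FST for one biallelic site, two populations.

    p1, p2: frequency of the same allele in each population (0..1).
    Populations are weighted equally; no sample-size correction is applied.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # site is monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-pop
    return (h_total - h_within) / h_total


# Hypothetical frequencies: strongly differentiated vs. identical populations.
differentiated = fst_two_populations(0.1, 0.9)
identical = fst_two_populations(0.3, 0.3)
```

Values near 0 indicate similar allele frequencies across populations, and values near 1 indicate near-fixed differences. An unusually high value at one site relative to the genome-wide distribution is the kind of signal that is taken as suggestive of local selection.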
Subject(s)
Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Receptors, G-Protein-Coupled/genetics , Taste Perception/genetics , Africa, Eastern , Alleles , Anti-Inflammatory Agents/pharmacology , Benzyl Alcohols/pharmacology , Evolution, Molecular , Genetic Association Studies , Glucosides/pharmacology , Humans , Receptors, G-Protein-Coupled/metabolism
ABSTRACT
Primary open-angle glaucoma (POAG), a leading cause of irreversible blindness globally, shows disparity in prevalence and manifestations across ancestries. We perform meta-analysis across 15 biobanks (of the Global Biobank Meta-analysis Initiative) (n = 1,487,441: cases = 26,848) and merge with previous multi-ancestry studies, with the combined dataset representing the largest and most diverse POAG study to date (n = 1,478,037: cases = 46,325) and identify 17 novel significant loci, 5 of which were ancestry specific. Gene-enrichment and transcriptome-wide association analyses implicate vascular and cancer genes, a fifth of which are primary ciliary related. We perform an extensive statistical analysis of SIX6 and CDKN2B-AS1 loci in human GTEx data and across large electronic health records showing interaction between SIX6 gene and causal variants in the chr9p21.3 locus, with expression effect on CDKN2A/B. Our results suggest that some POAG risk variants may be ancestry specific, sex specific, or both, and support the contribution of genes involved in programmed cell death in POAG pathogenesis.
Subject(s)
Genetic Predisposition to Disease , Glaucoma, Open-Angle , Male , Female , Humans , Genetic Predisposition to Disease/genetics , Glaucoma, Open-Angle/genetics , Glaucoma, Open-Angle/epidemiology , Polymorphism, Single Nucleotide , Cell Proliferation , Biology
ABSTRACT
Although human bitter taste perception is hypothesized to be a dietary adaptation, little is known about genetic signatures of selection and patterns of bitter taste perception variability in ethnically diverse populations with different diets, particularly from Africa. To better understand the genetic basis and evolutionary history of bitter taste sensitivity, we sequenced a 2,975 bp region encompassing TAS2R38, a bitter taste receptor gene, in 611 Africans from 57 populations in West Central and East Africa with diverse subsistence patterns, as well as in a comparative sample of 132 non-Africans. We also examined the association between genetic variability at this locus and threshold levels of phenylthiocarbamide (PTC) bitterness in 463 Africans from the above populations to determine how variation influences bitter taste perception. Here, we report striking patterns of variation at TAS2R38, including a significant excess of novel rare nonsynonymous polymorphisms that recently arose only in Africa, high frequencies of haplotypes in Africa associated with intermediate bitter taste sensitivity, a remarkably similar frequency of common haplotypes across genetically and culturally distinct Africans, and an ancient coalescence time of common variation in global populations. Additionally, several of the rare nonsynonymous substitutions significantly modified levels of PTC bitter taste sensitivity in diverse Africans. While ancient balancing selection likely maintained common haplotype variation across global populations, we suggest that recent selection pressures may have also resulted in the unusually high level of rare nonsynonymous variants in Africa, implying a complex model of selection at the TAS2R38 locus in African populations. Furthermore, the distribution of common haplotypes in Africa is not correlated with diet, raising the possibility that common variation may be under selection due to its role in nondietary biological processes.
In addition, our data indicate that novel rare mutations contribute to the phenotypic variance of PTC sensitivity, illustrating the influence of rare variation on a common trait, as well as the relatively recent evolution of functionally diverse alleles at this locus.
Subject(s)
Black People/genetics , Evolution, Molecular , Receptors, G-Protein-Coupled/genetics , Taste/genetics , Adaptation, Biological/genetics , Africa , Alleles , Genetic Variation , Haplotypes/genetics , Humans , Mutation , Phenotype
ABSTRACT
Malaria is one of the strongest selective pressures in recent human evolution. African populations have been and continue to be at risk for malarial infections. However, few studies have re-sequenced malaria susceptibility loci across geographically and genetically diverse groups in Africa. We examined nucleotide diversity at Intercellular adhesion molecule-1 (ICAM-1), a malaria susceptibility candidate locus, in a number of human populations with a specific focus on diverse African ethnic groups. We used tests of neutrality to assess whether natural selection has impacted this locus and tested whether SNP variation at ICAM-1 is correlated with malaria endemicity. We observe differing patterns of nucleotide and haplotype variation in global populations and higher levels of diversity in Africa. Although we do not observe a deviation from neutrality based on the allele frequency distribution, we do observe several alleles at ICAM-1, including the ICAM-1 (Kilifi) allele, that are correlated with malaria endemicity. We show that the ICAM-1 (Kilifi) allele, which is common in Africa and Asia, exists on distinct haplotype backgrounds and is likely to have arisen more recently in Asia. Our results suggest that correlation analyses of allele frequencies and malaria endemicity may be useful for identifying candidate functional variants that play a role in malaria resistance and susceptibility.
Subject(s)
Ethnicity/genetics , Genetic Predisposition to Disease/genetics , Genetic Variation , Intercellular Adhesion Molecule-1/genetics , Malaria/genetics , Base Sequence , Black People/genetics , DNA Primers/genetics , Gene Frequency , Genetics, Population , Haplotypes/genetics , Humans , Linkage Disequilibrium , Malaria/ethnology , Molecular Sequence Data , Polymorphism, Single Nucleotide/genetics , Sequence Alignment , Sequence Analysis, DNA
ABSTRACT
BACKGROUND: Mapping of quantitative trait loci (QTL) associated with molecular phenotypes is a powerful approach for identifying the genes and molecular mechanisms underlying human traits and diseases, though most studies have focused on individuals of European descent. While important progress has been made to study a greater diversity of human populations, many groups remain unstudied, particularly among indigenous populations within Africa. To better understand the genetics of gene regulation in East Africans, we perform expression and splicing QTL mapping in whole blood from a cohort of 162 diverse Africans from Ethiopia and Tanzania. We assess replication of these QTLs in cohorts of predominantly European ancestry and identify candidate genes under selection in human populations. RESULTS: We find the gene regulatory architecture of African and non-African populations is broadly shared, though there is a considerable amount of variation at individual loci across populations. Comparing our analyses to an equivalently sized cohort of European Americans, we find that QTL mapping in Africans improves the detection of expression QTLs and fine-mapping of causal variation. Integrating our QTL scans with signatures of natural selection, we find several genes related to immunity and metabolism that are highly differentiated between Africans and non-Africans, as well as a gene associated with pigmentation. CONCLUSION: Extending QTL mapping studies beyond European ancestry, particularly to diverse indigenous populations, is vital for a complete understanding of the genetic architecture of human traits and can reveal novel functional variation underlying human traits and disease.
Subject(s)
East African People , Quantitative Trait Loci , Humans , Chromosome Mapping , Gene Expression , Tanzania , Genetic Variation
ABSTRACT
Polygenic risk scores (PRSs) have been widely explored in precision medicine. However, few studies have thoroughly investigated their best practices in global populations across different diseases. Here, we used data from the Global Biobank Meta-analysis Initiative (GBMI) to explore methodological considerations and PRS performance in 9 different biobanks for 14 disease endpoints. Specifically, we constructed PRSs using pruning and thresholding (P + T) and PRS-continuous shrinkage (CS). For both methods, using a European-based linkage disequilibrium (LD) reference panel resulted in comparable or higher prediction accuracy compared with several other non-European-based panels. PRS-CS overall outperformed the classic P + T method, especially for endpoints with higher SNP-based heritability. Notably, prediction accuracy is heterogeneous across endpoints, biobanks, and ancestries, especially for asthma, which has known variation in disease prevalence across populations. Overall, we provide lessons for PRS construction, evaluation, and interpretation using GBMI resources and highlight the importance of best practices for PRS in the biobank-scale genomics era.
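To make the P + T idea concrete, the sketch below shows only the thresholding and scoring steps: variants whose GWAS p-value falls below a cutoff are kept, and each individual's score is the sum of effect size times effect-allele dosage over the kept variants. It assumes the variants have already been LD-pruned, so the pruning step is omitted. The function name, the default cutoff, and the toy numbers are hypothetical and are not drawn from the GBMI analysis.

```python
def threshold_and_score(dosages, effects, p_values, p_threshold=5e-8):
    """Thresholding and scoring steps of a P + T polygenic score.

    dosages:  one list per individual of effect-allele dosages (0 to 2),
              ordered like `effects` and `p_values`.
    effects:  per-variant GWAS effect sizes (for example, log odds ratios).
    p_values: per-variant GWAS p-values.
    Assumes variants were LD-pruned beforehand (pruning is not done here).
    """
    if len(effects) != len(p_values):
        raise ValueError("effects and p_values must have the same length")
    # Thresholding: keep variants with p-value below the cutoff.
    kept = [j for j, p in enumerate(p_values) if p < p_threshold]
    scores = []
    for row in dosages:
        if len(row) != len(effects):
            raise ValueError("each dosage row must match the variant count")
        # Scoring: weighted sum of dosages over the kept variants.
        scores.append(sum(effects[j] * row[j] for j in kept))
    return scores


# Hypothetical toy data: two individuals, three variants.
toy_scores = threshold_and_score(
    dosages=[[0, 1, 2], [2, 2, 0]],
    effects=[0.5, -0.2, 0.1],
    p_values=[1e-9, 0.3, 1e-8],
)
```

In practice the p-value cutoff is treated as a tuning parameter and chosen in a validation sample. Shrinkage methods such as PRS-CS instead retain all variants and shrink their effect sizes, which is a different procedure from the one sketched here.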