ABSTRACT
Burkitt lymphoma (BL) is an aggressive B-cell lymphoma that significantly contributes to childhood cancer burden in sub-Saharan Africa. Plasmodium falciparum, which causes malaria, is geographically associated with BL, but the evidence remains insufficient for causal inference. Inference could be strengthened by demonstrating that Mendelian genes known to protect against malaria, such as the sickle cell trait variant HBB-rs334(T), also protect against BL. We investigated this hypothesis among 800 BL cases and 3845 controls in four East African countries using genome-scan data to detect polymorphisms in 22 genes known to affect malaria risk. We fit generalized linear mixed models to estimate odds ratios (OR) and 95% confidence intervals (95% CI), controlling for age, sex, country, and ancestry. The ORs of the loci with BL and P. falciparum infection among controls were correlated (Spearman's ρ = 0.37, p = .039). HBB-rs334(T) was associated with lower P. falciparum infection risk among controls (OR = 0.752, 95% CI 0.628-0.900; p = .00189) and BL risk (OR = 0.687, 95% CI 0.533-0.885; p = .0037). ABO-rs8176703(T) was associated with decreased risk of BL (OR = 0.591, 95% CI 0.379-0.992; p = .00271), but not of P. falciparum infection. Our results increase support for the etiological correlation between P. falciparum and BL risk.
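As a reading aid for the odds ratios quoted above, the following minimal sketch shows how an OR, its 95% CI, and a p-value relate under a Wald approximation on the log-odds scale. This is an assumption made for illustration: the abstract reports generalized linear mixed models, and the function name `wald_p_from_or_ci` is hypothetical, not part of the study's analysis code.

```python
import math
from statistics import NormalDist


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, level=0.95):
    """Back out a Wald z statistic and two-sided p-value from an OR and its CI.

    Assumes the interval is symmetric on the log-odds scale (a Wald interval),
    which is an illustrative assumption, not the study's stated method.
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Values quoted in the abstract for HBB-rs334(T)
z_bl, p_bl = wald_p_from_or_ci(0.687, 0.533, 0.885)  # Burkitt lymphoma
z_pf, p_pf = wald_p_from_or_ci(0.752, 0.628, 0.900)  # P. falciparum infection
print(f"BL: z = {z_bl:.2f}, p = {p_bl:.4f}")
print(f"Pf: z = {z_pf:.2f}, p = {p_pf:.4f}")
```

Under this approximation, both recovered p-values land close to the reported ones (about .0037 for BL and about .0019 for P. falciparum infection), which suggests the quoted intervals and p-values for HBB-rs334(T) are mutually consistent.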
Subject(s)
Burkitt Lymphoma , Malaria, Falciparum , Malaria , Sickle Cell Trait , Humans , Africa, Eastern , Alleles , Burkitt Lymphoma/epidemiology , Burkitt Lymphoma/genetics , Malaria, Falciparum/epidemiology , Malaria, Falciparum/genetics , Malaria, Falciparum/complications , Sickle Cell Trait/epidemiology , Sickle Cell Trait/genetics , Sickle Cell Trait/complications , Nectins/metabolism
ABSTRACT
Serum lipids are biomarkers of cardiometabolic disease risk, and understanding genomic factors contributing to their distribution is of interest. Studies of lipids in Africans are rare, though it is expected that such studies could identify novel loci. We conducted a GWAS of 4317 Africans enrolled from Nigeria, Ghana and Kenya. We evaluated linear mixed models of high-density lipoprotein cholesterol (HDLC), low-density lipoprotein cholesterol (LDLC), total cholesterol (CHOL), triglycerides (TG) and TG/HDLC. Replication was attempted in 9542 African Americans (AA). In our main analysis, we identified 28 novel associations in Africans. Of the 18 of these that could be tested in AA, three associations replicated (GPNMB-TG, ENPP1-TG and SMARCA4-LDLC). Five additional novel loci were discovered upon meta-analysis with AA (rs138282551-TG, PGBD5-HDLC, CD80-TG/HDLC, SLC44A1-CHOL and TLL2-CHOL). Analyses considering only those with predominantly West African ancestry (Nigeria, Ghana and AA) yielded new insights: ORC5-LDLC and chr20:60973327-CHOL. Among our novel findings are some loci with known connections to lipid pathways. For instance, rs147706369 (TLL2) alters a regulatory motif for sterol regulatory element-binding proteins, a family of transcription factors that control the expression of a range of enzymes involved in cholesterol, fatty acid and TG synthesis, and rs115749422 (SMARCA4) is an independent association near the known LDLR locus that is rare or absent in populations without African ancestry. These findings demonstrate the utility of conducting genomic analyses in Africans for discovering novel loci and provide some preliminary evidence for caution against treating 'African ancestry' as a monolithic category.
Subject(s)
Black People/genetics , Genetic Heterogeneity , Genome-Wide Association Study , Lipid Metabolism , Quantitative Trait Loci , Quantitative Trait, Heritable , Africa , Humans
ABSTRACT
BACKGROUND: Sex differences in Parkinson's disease (PD) risk are well-known. However, the role of sex chromosomes in the development and progression of PD is still unclear. OBJECTIVE: The objective of this study was to perform the first X-chromosome-wide association study for PD risk in a Latin American cohort. METHODS: We used data from three admixed cohorts: (1) the Latin American Research consortium on the Genetics of Parkinson's Disease (n = 1504) as the discovery cohort, and (2) the Latino cohort from the International Parkinson Disease Genomics Consortium (n = 155) and (3) the Bambui Aging cohort (n = 1442) as replication cohorts. We also developed an X-chromosome framework specifically designed for admixed populations. RESULTS: We identified eight linkage disequilibrium regions associated with PD. We replicated one of these regions (top variant rs525496; discovery odds ratio [95% confidence interval]: 0.60 [0.478-0.77], P = 3.13 × 10⁻⁵; replication odds ratio: 0.60 [0.37-0.98], P = 0.04). rs525496 is associated with multiple expression quantitative trait loci in brain and non-brain tissues, including RAB9B, H2BFM, TMSB15B, and GLRA4, but colocalization analysis suggests that rs525496 may not mediate risk by expression of these genes. We also replicated a previous X-chromosome-wide association study finding (rs28602900), showing that this variant is associated with PD in non-European populations. CONCLUSIONS: Our results reinforce the importance of including X-chromosome and diverse populations in genetic studies. © 2023 The Authors. Movement Disorders published by Wiley Periodicals LLC on behalf of International Parkinson and Movement Disorder Society.
Subject(s)
Chromosomes, Human, X , Parkinson Disease , Female , Humans , Male , Genome-Wide Association Study , Hispanic or Latino , Latin America , Parkinson Disease/genetics , Sex Factors , Chromosomes, Human, X/genetics , Linkage Disequilibrium/genetics
ABSTRACT
Populations in sub-Saharan Africa have historically been exposed to intense selection from chronic infection with falciparum malaria. Interestingly, populations with the highest malaria intensity can be identified by the increased occurrence of endemic Burkitt Lymphoma (eBL), a pediatric cancer that affects populations with intense malaria exposure, in the so-called "eBL belt" in sub-Saharan Africa. However, the effects of intense malaria exposure and sub-Saharan populations' genetic histories remain poorly explored. To determine if historical migrations and intense malaria exposure have shaped the genetic composition of the eBL belt populations, we genotyped ~4.3 million SNPs in 1,708 individuals from Ghana and Northern Uganda, located on opposite sides of the eBL belt and with ≥ 7 months/year of intense malaria exposure and published evidence of high incidence of BL. Among 35 Ghanaian tribes, we showed a predominantly West-Central African ancestry and genomic footprints of gene flow from Gambian and East African populations. In Uganda, the North West population showed a predominantly Nilotic ancestry, and the North Central population was a mixture of Nilotic and Southern Bantu ancestry, while the Southwest Ugandan population showed a predominant Southern Bantu ancestry. Our results support the hypothesis of diverse ancestral origins of the Ugandan, Kenyan and Tanzanian Great Lakes African populations, reflecting a confluence of Nilotic, Cushitic and Bantu migrations in the last 3000 years. Natural selection analyses suggest, for the first time, a strong positive selection signal in the ATP2B4 gene (rs10900588) in Northern Ugandan populations. These findings provide important baseline genomic data to facilitate disease association studies, including of eBL, in eBL belt populations.
Subject(s)
Burkitt Lymphoma/genetics , Gene Flow , Malaria, Falciparum/genetics , Selection, Genetic , Adolescent , Africa South of the Sahara , Aged , Burkitt Lymphoma/epidemiology , Case-Control Studies , Child , Child, Preschool , Endemic Diseases , Female , Genetics, Population , Genome-Wide Association Study , Ghana/epidemiology , Human Migration , Humans , Incidence , Infant , Infant, Newborn , Malaria, Falciparum/epidemiology , Male , Middle Aged , Models, Genetic , Plasma Membrane Calcium-Transporting ATPases/genetics , Polymorphism, Single Nucleotide , Uganda/epidemiology
ABSTRACT
The Transatlantic Slave Trade transported more than 9 million Africans to the Americas between the early 16th and the mid-19th centuries. We performed a genome-wide analysis using 6,267 individuals from 25 populations to infer how different African groups contributed to North American, South American, and Caribbean populations, in the context of geographic and geopolitical factors, and compared genetic data with demographic history records of the Transatlantic Slave Trade. We observed that West-Central Africa and Western Africa-associated ancestry clusters are more prevalent in northern latitudes of the Americas, whereas the South/East Africa-associated ancestry cluster is more prevalent in southern latitudes of the Americas. This pattern results from geographic and geopolitical factors leading to population differentiation. However, there is a substantial decrease in the between-population differentiation of the African gene pool within the Americas, when compared with the regions of origin from Africa, underscoring the importance of historical factors favoring admixture between individuals with different African origins in the New World. This between-population homogenization in the Americas is consistent with the excess of West-Central Africa ancestry (the most prevalent in the Americas) in the United States and Southeast-Brazil, with respect to historical-demography expectations. We also inferred that in most of the Americas, intercontinental admixture intensification occurred between 1750 and 1850, which correlates strongly with the peak of arrivals from Africa. This study contributes a population genetics perspective to the ongoing social, cultural, and political debate regarding ancestry, admixture, and the mestizaje process in the Americas.
Subject(s)
Black People/genetics , Enslavement/history , Gene Pool , Genome, Human , Human Migration/history , Africa , Americas , History, 16th Century , History, 17th Century , History, 18th Century , History, 19th Century , Humans , Phylogeography
ABSTRACT
BACKGROUND/OBJECTIVES: Admixed populations are a resource to study the global genetic architecture of complex phenotypes, which is critical, considering that non-European populations are severely underrepresented in genomic studies. Here, we study the genetic architecture of BMI in children, young adults, and elderly individuals from the admixed population of Brazil. SUBJECTS/METHODS: Leveraging admixture in Brazilians, whose chromosomes are mosaics of fragments of Native American, European, and African origins, we used genome-wide data to perform admixture mapping/fine-mapping of body mass index (BMI) in three Brazilian population-based cohorts from Northeast (Salvador), Southeast (Bambuí), and South (Pelotas). RESULTS: We found significant associations with African-associated alleles in children from Salvador (PALD1 and ZMIZ1 genes), and in young adults from Pelotas (NOD2 and MTUS2 genes). More importantly, in Pelotas, rs114066381, mapped in a potential regulatory region, is significantly associated with BMI only in females (p = 2.76e-06). This variant is rare in Europeans but has frequencies of ~3% in West Africa and a strong female-specific effect (95% CI: 2.32-5.65 kg/m² per A allele). We confirmed this sex-specific association and replicated its strong effect for an adjusted fat mass index in the same Pelotas cohort, and for BMI in another Brazilian cohort from São Paulo (Southeast Brazil). A meta-analysis confirmed the significant association. Remarkably, we observed that while the frequency of the rs114066381-A allele ranges from 0.8 to 2.1% in the studied populations, it attains ~9% among women with morbid obesity from Pelotas, São Paulo, and Bambuí. The effect size of rs114066381 is at least five times higher than that of the FTO SNPs rs9939609 and rs1558902, already emblematic for their high effects. CONCLUSIONS: We identified six candidate SNPs associated with BMI. rs114066381 stands out for its high effect that was replicated and its high frequency in women with morbid obesity. We demonstrate how admixed populations are a source of new relevant phenotype-associated genetic variants.
Subject(s)
Body Mass Index , Genetics, Population , Polymorphism, Single Nucleotide , Aged , Aged, 80 and over , Alleles , Brazil , Child , Child, Preschool , Chromosome Mapping , Female , Humans , Male , Middle Aged , Phenotype , Regulatory Sequences, Nucleic Acid , Sex Factors , Young Adult
ABSTRACT
While South Americans are underrepresented in human genomic diversity studies, Brazil has been a classical model for population genetics studies on admixture. We present the results of the EPIGEN Brazil Initiative, the most comprehensive up-to-date genomic analysis of any Latin-American population. A population-based genome-wide analysis of 6,487 individuals was performed in the context of worldwide genomic diversity to elucidate how ancestry, kinship, and inbreeding interact in three populations with different histories from the Northeast (African ancestry: 50%), Southeast, and South (both with European ancestry >70%) of Brazil. We showed that ancestry-positive assortative mating permeated Brazilian history. We traced European ancestry in the Southeast/South to a wider European/Middle Eastern region with respect to the Northeast, where ancestry seems restricted to Iberia. By developing an approximate Bayesian computation framework, we infer more recent European immigration to the Southeast/South than to the Northeast. Also, the observed low Native-American ancestry (6-8%) was mostly introduced in different regions of Brazil soon after the European Conquest. We broadened our understanding of the African diaspora, the major destination of which was Brazil, by revealing that Brazilians display two within-Africa ancestry components: one associated with non-Bantu/western Africans (more evident in the Northeast and African Americans) and one associated with Bantu/eastern Africans (more present in the Southeast/South). Furthermore, the whole-genome analysis of 30 individuals (42-fold deep coverage) shows that continental admixture rather than local post-Columbian history is the main and complex determinant of the individual amount of deleterious genotypes.
Subject(s)
Genetics, Population , Mutation , Black People/genetics , Brazil , Humans , White People/genetics
ABSTRACT
BACKGROUND: Asthma is a chronic disease of the airways and, despite the advances in the knowledge of associated genetic regions in recent years, their mechanisms have yet to be explored. Several genome-wide association studies have been carried out in recent years, but none of these have involved Latin American populations with a high level of miscegenation, as is seen in the Brazilian population. METHODS: 1246 children were recruited from a longitudinal cohort study in Salvador, Brazil. Asthma symptoms were identified in accordance with an International Study of Asthma and Allergies in Childhood (ISAAC) questionnaire. Following quality control, 1,877,526 autosomal SNPs were tested for association with childhood asthma symptoms by logistic regression using an additive genetic model. We complemented the analysis with an estimate of the phenotypic variance explained by common genetic variants. Replications were investigated in independent Mexican and US Latino samples. RESULTS: Two chromosomal regions reached genome-wide significance level for childhood asthma symptoms: the 14q11 region flanking the DAD1 and OXA1L genes (rs1999071, MAF 0.32, OR 1.78, 95% CI 1.45-2.18, p-value 2.83 × 10⁻⁸) and 15q22 region flanking the FOXB1 gene (rs10519031, MAF 0.04, OR 3.0, 95% CI 2.02-4.49, p-value 6.68 × 10⁻⁸ and rs8029377, MAF 0.03, OR 2.49, 95% CI 1.76-3.53, p-value 2.45 × 10⁻⁷). eQTL analysis suggests that rs1999071 regulates the expression of OXA1L gene. However, the original findings were not replicated in the Mexican or US Latino samples. CONCLUSIONS: We conclude that the 14q11 and 15q22 regions may be associated with asthma symptoms in childhood.
Subject(s)
Asthma/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Child , Child, Preschool , Chromosomes, Human, Pair 14/genetics , Female , Humans , Latin America , Male , Metabolic Networks and Pathways/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , Principal Component Analysis
ABSTRACT
BACKGROUND: Archaeology reports millenary cultural contacts between Peruvian Coast-Andes and the Amazon Yunga, a rainforest transitional region between Andes and Lower Amazonia. To clarify the relationships between cultural and biological evolution of these populations, in particular between Amazon Yungas and Andeans, we used DNA-sequence data, a model-based Bayesian approach and several statistical validations to infer a set of demographic parameters. RESULTS: We found that the genetic diversity of the Shimaa (an Amazon Yunga population) is a subset of that of Quechuas from Central-Andes. Using the Isolation-with-Migration population genetics model, we inferred that the Shimaa ancestors were a small subgroup that split less than 5300 years ago (after the development of complex societies) from an ancestral Andean population. After the split, the most plausible scenario compatible with our results is that the ancestors of Shimaas moved toward the Peruvian Amazon Yunga and incorporated the culture and language of some of their neighbors, but not a substantial amount of their genes. We validated our results using Approximate Bayesian Computations, posterior predictive tests and the analysis of pseudo-observed datasets. CONCLUSIONS: We presented a case study in which model-based Bayesian approaches, combined with necessary statistical validations, shed light on the prehistoric demographic relationship between Andeans and a population from the Amazon Yunga. Our results offer a testable model for the peopling of this large transitional environmental region between the Andes and the Lower Amazonia. However, studies on larger samples and involving more populations of these regions are necessary to confirm whether the predominant Andean biological origin of the Shimaas is the rule, and not the exception.
Subject(s)
Genetics, Population , Indians, South American/genetics , Bayes Theorem , Biological Evolution , Genetic Variation , Human Migration , Humans , Molecular Sequence Data , Population Groups , South America
ABSTRACT
BACKGROUND: Type 2 diabetes (T2D) has reached epidemic proportions globally, including in Africa. However, molecular studies to understand the pathophysiology of T2D remain scarce outside Europe and North America. The aims of this study are to use an untargeted metabolomics approach to identify: (a) metabolites that are differentially expressed between individuals with and without T2D and (b) a metabolic signature associated with T2D in a population of Sub-Saharan Africa (SSA). METHODS: A total of 580 adult Nigerians from the Africa America Diabetes Mellitus (AADM) study were studied. The discovery study included 310 individuals (210 without T2D, 100 with T2D). Metabolites in plasma were assessed by reverse-phase ultra-performance liquid chromatography-tandem mass spectrometry (RP/UPLC-MS/MS) methods on the Metabolon Platform. Welch's two-sample t-test was used to identify differentially expressed metabolites (DEMs), followed by the construction of a biomarker panel using a random forest (RF) algorithm. The biomarker panel was evaluated in a replication sample of 270 individuals (110 without T2D and 160 with T2D) from the same study. RESULTS: Untargeted metabolomic analyses revealed 280 DEMs between individuals with and without T2D. The DEMs predominantly belonged to the lipid (51%, 142/280), amino acid (21%, 59/280), xenobiotics (13%, 35/280), carbohydrate (4%, 10/280) and nucleotide (4%, 10/280) super pathways. At the sub-pathway level, glycolysis, free fatty acid, bile metabolism, and branched chain amino acid catabolism were altered in T2D individuals. A 10-metabolite biomarker panel including glucose, gluconate, mannose, mannonate, 1,5-anhydroglucitol, fructose, fructosyl-lysine, 1-carboxylethylleucine, metformin, and methyl-glucopyranoside predicted T2D with an area under the curve (AUC) of 0.924 (95% CI: 0.845-0.966) and a predicted accuracy of 89.3%. The panel was validated with a similar AUC (0.935, 95% CI 0.906-0.958) in the replication cohort. The 10 metabolites in the biomarker panel correlated significantly with several T2D-related glycemic indices, including HbA1c, insulin resistance (HOMA-IR), and diabetes duration. CONCLUSIONS: We demonstrate that metabolomic dysregulation associated with T2D in Nigerians affects multiple processes, including glycolysis, free fatty acid and bile metabolism, and branched chain amino acid catabolism. Our study replicated previous findings in other populations and identified a metabolic signature that could be used as a biomarker panel of T2D risk and glycemic control, thus enhancing our knowledge of molecular pathophysiologic changes in T2D. The metabolomics dataset generated in this study represents an invaluable addition to publicly available multi-omics data on understudied African ancestry populations.
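The abstract above names Welch's two-sample t-test as the screen for differentially expressed metabolites. A minimal sketch of that statistic for a single metabolite follows, using only the Python standard library. The function name `welch_t` and the toy values are hypothetical and are not study data; the p-value step is omitted because it needs a Student t distribution, which the standard library does not provide.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike the pooled-variance t-test, each group keeps its own sample
    variance, so unequal variances and group sizes are handled.
    """
    va = variance(a) / len(a)  # squared standard error of mean, group a
    vb = variance(b) / len(b)  # squared standard error of mean, group b
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Toy (hypothetical) abundance values for one metabolite in two groups
without_t2d = [1.0, 2.0, 3.0, 4.0, 5.0]
with_t2d = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(without_t2d, with_t2d)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

In a screen over many metabolites, a statistic like this would be computed per metabolite and the resulting p-values would typically be adjusted for multiple testing; the abstract does not state which adjustment, if any, was applied.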
Subject(s)
Diabetes Mellitus, Type 2 , West African People , Adult , Humans , Chromatography, Liquid , Fatty Acids, Nonesterified , Tandem Mass Spectrometry , Amino Acids, Branched-Chain , Biomarkers
ABSTRACT
Burkitt lymphoma (BL) is responsible for many childhood cancers in sub-Saharan Africa, where it is linked to recurrent or chronic infection by Epstein-Barr virus or Plasmodium falciparum. However, whether human leukocyte antigen (HLA) polymorphisms, which regulate immune response, are associated with BL has not been well investigated, which limits our understanding of BL etiology. Here we investigate this association among 4,645 children aged 0-15 years, 800 with BL, enrolled in Uganda, Tanzania, Kenya, and Malawi. HLA alleles are imputed with accuracy >90% for HLA class I and 85-89% for class II alleles. BL risk is elevated with HLA-DQA1*04:01 (adjusted odds ratio [OR] = 1.61, 95% confidence interval [CI] = 1.32-1.97, P = 3.71 × 10⁻⁶), with rs2040406(G) in HLA-DQA1 region (OR = 1.43, 95% CI = 1.26-1.63, P = 4.62 × 10⁻⁸), and with amino acid Gln at position 53 versus other variants in HLA-DQA1 (OR = 1.36, P = 2.06 × 10⁻⁶). The associations with HLA-DQA1*04:01 (OR = 1.29, P = 0.03) and rs2040406(G) (OR = 1.68, P = 0.019) persist in mutually adjusted models. The higher risk rs2040406(G) variant for BL is associated with decreased HLA-DQB1 expression in eQTLs in EBV transformed lymphocytes. Our results support the role of HLA variation in the etiology of BL and suggest that a promising area of research might be understanding the link between HLA variation and EBV control.
Subject(s)
Burkitt Lymphoma , Epstein-Barr Virus Infections , Child , Humans , Burkitt Lymphoma/genetics , Epstein-Barr Virus Infections/complications , Epstein-Barr Virus Infections/genetics , Herpesvirus 4, Human/genetics , HLA-DQ alpha-Chains/genetics
ABSTRACT
Chronic kidney disease is a leading cause of death and disability globally and impacts individuals of African ancestry (AFR) or with ancestry in the Americas (AMS) who are under-represented in genome-wide association studies (GWASs) of kidney function. To address this bias, we conducted a large meta-analysis of GWASs of estimated glomerular filtration rate (eGFR) in 145,732 AFR and AMS individuals. We identified 41 loci at genome-wide significance (p < 5 × 10⁻⁸), of which two have not been previously reported in any ancestry group. We integrated fine-mapped loci with epigenomic and transcriptomic resources to highlight potential effector genes relevant to kidney physiology and disease, and reveal key regulatory elements and pathways involved in renal function and development. We demonstrate the varying but increased predictive power offered by a multi-ancestry polygenic score for eGFR and highlight the importance of population diversity in GWASs and multi-omics resources to enhance opportunities for clinical translation for all.
Subject(s)
Genome-Wide Association Study , Renal Insufficiency, Chronic , Humans , Renal Insufficiency, Chronic/diagnosis , Glomerular Filtration Rate/genetics , Multifactorial Inheritance/genetics , Kidney/physiology
ABSTRACT
Latin Americans are underrepresented in genetic studies, increasing disparities in personalized genomic medicine. Despite available genetic data from thousands of Latin Americans, navigating the bureaucratic hurdles for consent or access remains challenging. To address this, we introduce the Genetics of Latin American Diversity (GLAD) Project, compiling genome-wide information from 53,738 Latin Americans across 39 studies representing 46 geographical regions. Through GLAD, we identified heterogeneous ancestry composition and recent gene flow across the Americas. Additionally, we developed GLAD-match, a simulated annealing-based algorithm, to match the genetic background of external samples to our database, sharing summary statistics (i.e., allele and haplotype frequencies) without transferring individual-level genotypes. Finally, we demonstrate the potential of GLAD as a critical resource for evaluating statistical genetic software in the presence of admixture. By providing this resource, we promote genomic research in Latin Americans and help extend the promise of personalized medicine to more people.
ABSTRACT
The vast majority of human populations and individuals have mixed ancestry. Consequently, adjustment for locus-specific ancestry is essential for genetic association studies. To empower association studies for all populations, it is necessary to integrate effects of locus-specific ancestry and genotype. We developed a joint test of ancestry and association that can be performed with summary statistics, is independent of study design, can take advantage of locus-specific ancestry effects to boost power in association testing, and can utilize association effects to fine map admixture peaks. We illustrate the test using the association between serum triglycerides and LPL. By combining data from African Americans, European Americans, and West Africans, we identify three conditionally independent variants with varying amounts of ancestrally differentiated allele frequencies. Using out-of-sample data, we demonstrate improved prediction achievable by accounting for multiple causal variants and locus-specific ancestry effects at a single locus.
Subject(s)
Black or African American , Genome-Wide Association Study , Humans , Linkage Disequilibrium , Black or African American/genetics , Gene Frequency , White
ABSTRACT
European-ancestry populations are recognized as stratified but not as admixed, implying that residual confounding by locus-specific ancestry can affect studies of association, polygenic adaptation, and polygenic risk scores. We integrate individual-level genome-wide data from ~19,000 European-ancestry individuals across 79 European populations and five European American cohorts. We generate a new reference panel that captures ancestral diversity missed by both the 1000 Genomes and Human Genome Diversity Projects. Both Europeans and European Americans are admixed at the subcontinental level, with admixture dates differing among subgroups of European Americans. After adjustment for both genome-wide and locus-specific ancestry, associations between a highly differentiated variant in LCT (rs4988235) and height or LDL-cholesterol were confirmed to be false positives whereas the association between LCT and body mass index was genuine. We provide formal evidence of subcontinental admixture in individuals with European ancestry, which, if not properly accounted for, can produce spurious results in genetic epidemiology studies.
Subject(s)
European People , Genetics, Population , Humans , European People/genetics , Molecular Epidemiology
ABSTRACT
In regions where reads don't align well to a reference, it is generally difficult to characterize structural variation using short read sequencing. Here, we utilize machine learning classifiers and short sequence reads to genotype structural variants in the alpha globin locus on chromosome 16, a medically-relevant region that is challenging to genotype in individuals. Using models trained only with simulated data, we accurately genotype two hard-to-distinguish deletions in two separate human cohorts. Furthermore, population allele frequencies produced by our methods across a wide set of ancestries agree more closely with previously-determined frequencies than those obtained using currently available genotyping software.
ABSTRACT
Sex differences in Parkinson Disease (PD) risk are well-known. However, the role of sex chromosomes in the development and progression of PD is still unclear. We performed the first X-chromosome Wide Association Study (XWAS) for PD risk in Latin American individuals. We used data from three admixed cohorts: (i) the Latin American Research consortium on the Genetics of Parkinson's Disease (n = 1,504) as the discovery cohort and (ii) the Latino cohort from the International Parkinson Disease Genomics Consortium (n = 155) and (iii) the Bambui Aging cohort (n = 1,442) as replication cohorts. After developing an X-chromosome framework specifically designed for admixed populations, we identified eight linkage disequilibrium regions associated with PD. We fully replicated one of these regions (top variant rs525496; discovery OR [95% CI]: 0.60 [0.478-0.77], p = 3.13 × 10⁻⁵; replication OR: 0.60 [0.37-0.98], p = 0.04). rs525496 is an expression quantitative trait locus for several genes expressed in brain tissues, including RAB9B, H2BFM, TMSB15B and GLRA4. We also replicated a previous XWAS finding (rs28602900), showing that this variant is associated with PD in non-European populations. Our results reinforce the importance of including X-chromosome and diverse populations in genetic studies.
ABSTRACT
In high-income countries, mosaic chromosomal alterations in peripheral blood leukocytes are associated with an elevated risk of adverse health outcomes, including hematologic malignancies. We investigate mosaic chromosomal alterations in sub-Saharan Africa among 931 children with Burkitt lymphoma, an aggressive lymphoma commonly characterized by immunoglobulin-MYC chromosomal rearrangements, 3822 Burkitt lymphoma-free children, and 674 cancer-free men from Ghana. We find autosomal and X chromosome mosaic chromosomal alterations in 3.4% and 1.7% of Burkitt lymphoma-free children, and 8.4% and 3.7% of children with Burkitt lymphoma (P-values = 5.7 × 10⁻¹¹ and 3.74 × 10⁻², respectively). Autosomal mosaic chromosomal alterations are detected in 14.0% of Ghanaian men and increase with age. Mosaic chromosomal alterations in Burkitt lymphoma cases include gains on chromosomes 1q and 8, the latter spanning MYC, while mosaic chromosomal alterations in Burkitt lymphoma-free children include copy-neutral loss of heterozygosity on chromosomes 10, 14, and 16. Our results highlight mosaic chromosomal alterations in sub-Saharan African populations as a promising area of research.
Subject(s)
Burkitt Lymphoma , Male , Child , Humans , Burkitt Lymphoma/genetics , Burkitt Lymphoma/pathology , Ghana , Chromosome Aberrations , Leukocytes/pathology , Immunoglobulins/genetics , Translocation, Genetic
ABSTRACT
Since the 1960s, East African athletes, mainly from Kenya and Ethiopia, have dominated long-distance running events in both the male and female categories. Further demographic studies have shown that two ethnic groups are overrepresented among elite endurance runners in each of these countries: the Kalenjin, from Kenya, and the Oromo, from Ethiopia, raising the possibility that this dominance results from genetic and/or cultural factors. However, looking at the life history of these athletes or at loci previously associated with endurance athletic performance, no compelling explanation has emerged. Here, we used a population approach to identify peaks of genetic differentiation for these two ethnicities and compared the list of genes close to these regions with a list, manually curated by us, of genes that have been associated in GWAS with traits possibly relevant to endurance running, and found a significant enrichment in both populations (Kalenjin, P = 0.048, and Oromo, P = 1.6 × 10⁻⁵). Those traits are mainly related to anthropometry, circulatory and respiratory systems, energy metabolism, and calcium homeostasis. Our results reinforce the notion that endurance running is a systemic activity with a complex genetic architecture, and indicate new candidate genes for future studies. Finally, we argue that a deterministic relationship between genetics and sports must be avoided, as it is both scientifically incorrect and prone to reinforcing population (racial) stereotyping.
Subject(s)
Athletic Performance , Running , Black People/genetics , Ethnicity/genetics , Female , Humans , Male , Physical Endurance/genetics
ABSTRACT
BACKGROUND: A complex set of perturbations occur in cytokines and hormones in the etiopathogenesis of obesity and related cardiometabolic conditions such as type 2 diabetes (T2D). Evidence for the genetic regulation of these cytokines and hormones is limited, particularly in African-ancestry populations. In order to improve our understanding of the biology of cardiometabolic traits, we investigated the genetic architecture of a large panel of obesity-related cytokines and hormones among Africans with replication analyses in African Americans. METHODS: We performed genome-wide association studies (GWAS) in 4432 continental Africans, enrolled from Ghana, Kenya, and Nigeria as part of the Africa America Diabetes Mellitus (AADM) study, for 13 obesity-related cytokines and hormones, including adipsin, glucose-dependent insulinotropic peptide (GIP), glucagon-like peptide-1 (GLP-1), interleukin-1 receptor antagonist (IL1-RA), interleukin-6 (IL-6), interleukin-10 (IL-10), leptin, plasminogen activator inhibitor-1 (PAI-1), resistin, visfatin, insulin, glucagon, and ghrelin. Exact and local replication analyses were conducted in African Americans (n = 7990). The effects of sex, body mass index (BMI), and T2D on results were investigated through stratified analyses. RESULTS: GWAS identified 39 significant (P value < 5 × 10⁻⁸) loci across all 13 traits. Notably, 14 loci were African-ancestry specific. In this first GWAS for adipsin and ghrelin, we detected 13 and 4 genome-wide significant loci respectively. Stratified analyses by sex, BMI, and T2D showed a strong effect of these variables on detected loci. Eight novel loci were successfully replicated: adipsin (3), GIP (1), GLP-1 (1), and insulin (3). Annotation of these loci revealed promising links between these adipocytokines and cardiometabolic outcomes as illustrated by rs201751833 for adipsin and blood pressure and locus rs759790 for insulin level and T2D in lean individuals. CONCLUSIONS: Our study identified genetic variants underlying variation in multiple adipocytokines, including the first loci for adipsin and ghrelin. We identified population differences in variants associated with adipocytokines and highlight the importance of stratification for discovery of loci. The high number of African-specific loci detected emphasizes the need for GWAS in African-ancestry populations, as these loci could not have been detected in other populations. Overall, our work contributes to the understanding of the biology linking adipocytokines to cardiometabolic traits.