ABSTRACT
BACKGROUND: Genome-wide association studies (GWASs) have identified multiple risk loci for Parkinson's disease (PD). However, identifying the functional (or potential causal) variants in the reported risk loci and elucidating their roles in PD pathogenesis remain major challenges. To identify the potential causal (or functional) variants in the reported PD risk loci and to elucidate their regulatory mechanisms, we report a functional genomics study of PD. METHODS: We first integrated chromatin immunoprecipitation sequencing (ChIP-Seq) (from neuronal cells and human brain tissues) data and GWAS-identified single-nucleotide polymorphisms (SNPs) in PD risk loci. We then conducted a series of experiments and analyses to validate the regulatory effects of these (i.e., functional) SNPs, including reporter gene assays, allele-specific expression (ASE), transcription factor (TF) knockdown, CRISPR-Cas9-mediated genome editing, and expression quantitative trait loci (eQTL) analysis. RESULTS: We identified 44 SNPs (from 11 risk loci) affecting the binding of 12 TFs and we validated the regulatory effects of 15 TF binding-disrupting SNPs. In addition, we also identified the potential target genes regulated by these TF binding-disrupting SNPs through eQTL analysis. Finally, we showed that 4 eQTL genes of these TF binding-disrupting SNPs were dysregulated in PD cases compared with controls. CONCLUSION: Our study systematically reveals the gene regulatory mechanisms of PD risk variants (including widespread disruption of CTCF binding), generates the landscape of potential PD causal variants, and pinpoints promising candidate genes for further functional characterization and drug development.
Subjects
Genome-Wide Association Study , Parkinson Disease , Genetic Predisposition to Disease/genetics , Genomics , Humans , Parkinson Disease/genetics , Polymorphism, Single Nucleotide/genetics
ABSTRACT
Bone mineral density (BMD) assessed by DXA is used to evaluate bone health. In children, total body (TB) measurements are commonly used; in older individuals, BMD at the lumbar spine (LS) and femoral neck (FN) is used to diagnose osteoporosis. To date, genetic variants in more than 60 loci have been identified as associated with BMD. To investigate the genetic determinants of TB-BMD variation along the life course and test for age-specific effects, we performed a meta-analysis of 30 genome-wide association studies (GWASs) of TB-BMD including 66,628 individuals overall and divided across five age strata, each spanning 15 years. We identified variants associated with TB-BMD at 80 loci, of which 36 have not been previously identified; overall, they explain approximately 10% of the TB-BMD variance when combining all age groups and influence the risk of fracture. Pathway and enrichment analysis of the association signals showed clustering within gene sets implicated in the regulation of cell growth and SMAD proteins, overexpressed in the musculoskeletal system, and enriched in enhancer and promoter regions. These findings reveal TB-BMD as a relevant trait for genetic studies of osteoporosis, enabling the identification of variants and pathways influencing different bone compartments. Only variants in ESR1 and close proximity to RANKL showed a clear effect dependency on age. This most likely indicates that the majority of genetic variants identified influence BMD early in life and that their effect can be captured throughout the life course.
Subjects
Bone Density/genetics , Genome-Wide Association Study , Adolescent , Age Factors , Animals , Child , Child, Preschool , Genetic Loci , Humans , Infant , Infant, Newborn , Mice, Knockout , Polymorphism, Single Nucleotide/genetics , Quantitative Trait, Heritable , Regression Analysis
ABSTRACT
Genome-wide association studies (GWASs) have identified a multitude of genetic loci involved with traits and diseases. However, it is often unclear which genes are affected in such loci and whether the associated genetic variants lead to increased or decreased gene function. To mitigate this, we integrated associations of common genetic variants in 57 GWASs with 24 studies of expression quantitative trait loci (eQTLs) from a broad range of tissues by using a Mendelian randomization approach. We discovered a total of 3,484 instances of gene-trait-associated changes in expression at a false-discovery rate < 0.05. These genes were often not closest to the genetic variant and were primarily identified in eQTLs derived from pathophysiologically relevant tissues. For instance, genes with expression changes associated with lipid traits were mostly identified in the liver, and those associated with cardiovascular disease were identified in arterial tissue. The affected genes additionally point to biological processes implicated in the interrogated traits, such as the interleukin-27 pathway in rheumatoid arthritis. Further, comparing trait-associated gene expression changes across traits suggests that pleiotropy is a widespread phenomenon and points to specific instances of both agonistic and antagonistic pleiotropy. For instance, expression of SNX19 and ABCB9 is positively correlated with both the risk of schizophrenia and educational attainment. To facilitate interpretation, we provide this lexicon of how common trait-associated genetic variants alter gene expression in various tissues as the online database GWAS2Genes.
Subjects
Gene Expression Regulation , Genetic Predisposition to Disease , Genetic Variation , Quantitative Trait Loci/genetics , Quantitative Trait, Heritable , Educational Status , Gene Regulatory Networks , Genetic Pleiotropy , Genome-Wide Association Study , Humans , Schizophrenia/genetics
ABSTRACT
BACKGROUND: Over the relatively short history of genome-wide association studies (GWASs), hundreds of GWASs have been published and thousands of disease risk-associated SNPs have been identified. Summary statistics from the conducted GWASs are often available and can be used to identify SNP features associated with the level of GWAS statistical significance. Those features could be used to select SNPs from gray zones (SNPs that are nominally significant but do not reach the genome-wide level of significance) for targeted analyses. METHODS: We used summary statistics from recently published breast cancer, lung cancer, and scleroderma GWASs to explore the association between the level of GWAS statistical significance and the expression quantitative trait loci (eQTL) status of the SNP. Data from the Genotype-Tissue Expression Project (GTEx) were used to identify eQTL SNPs. RESULTS: We found that SNPs reported as eQTLs were more significant in GWAS (higher -log10(p)) regardless of the tissue specificity of the eQTL. Pan-tissue eQTLs (those reported as eQTLs in multiple tissues) tended to be more significant in the GWAS than those reported as eQTLs in only one tissue type. eQTL density in the ±5 kb adjacent region of a given SNP was also positively associated with the level of GWAS statistical significance regardless of the eQTL status of the SNP. We found that SNPs located in regions of high eQTL density were more likely to be located in regulatory elements (transcription factor or miRNA binding sites). When SNPs were stratified by the level of statistical significance, the proportion of eQTLs was positively associated with the mean level of statistical significance in the group. The association curve reaches a plateau around -log10(p) ≈ 5. The observed associations suggest that quasi-significant SNPs (5 × 10⁻⁸ < p < 10⁻⁵) and SNPs at the genome-wide level of statistical significance (p < 5 × 10⁻⁸) may have similar proportions of risk-associated SNPs. CONCLUSIONS: The results of this study indicate that the SNP's eQTL status, as well as eQTL density in the adjacent region, are positively associated with the level of statistical significance of the SNP in GWAS.
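The stratification described in the abstract above can be illustrated with a minimal sketch. It assumes each SNP is given as a (p-value, eQTL flag) pair; the function names, the band labels, and the toy data are hypothetical and do not come from the study. Only the two thresholds (5 × 10⁻⁸ and 10⁻⁵) are taken from the abstract.

```python
import math
from collections import defaultdict

GENOME_WIDE = 5e-8  # genome-wide significance threshold quoted in the abstract
GRAY_ZONE = 1e-5    # upper p-value bound of the quasi-significant "gray zone"


def neg_log10(p):
    """Return -log10(p), the significance scale used in the abstract."""
    return -math.log10(p)


def significance_band(p):
    """Assign a p-value to one of three bands (labels are illustrative)."""
    if p < GENOME_WIDE:
        return "genome-wide"
    if p < GRAY_ZONE:
        return "gray-zone"
    return "not-significant"


def eqtl_proportion_by_band(snps):
    """snps: iterable of (p_value, is_eqtl) pairs.

    Returns {band: fraction of SNPs in that band flagged as eQTLs}.
    """
    counts = defaultdict(lambda: [0, 0])  # band -> [eQTL count, total count]
    for p, is_eqtl in snps:
        band = significance_band(p)
        counts[band][0] += 1 if is_eqtl else 0
        counts[band][1] += 1
    return {band: hits / total for band, (hits, total) in counts.items()}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy = [(1e-9, True), (3e-8, False), (1e-6, True),
           (2e-7, True), (0.01, False), (0.2, True)]
    print(eqtl_proportion_by_band(toy))
```

Comparing the per-band proportions is the kind of summary the abstract refers to when it says the eQTL proportion rises with significance and then plateaus; the real analysis used finer strata and GTEx annotations.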
Subjects
Breast Neoplasms/genetics , Genome-Wide Association Study/methods , Lung Neoplasms/genetics , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Scleroderma, Systemic/genetics , Female , Gene Expression Profiling , Gene Expression Regulation , Genetic Predisposition to Disease , Humans , Male , Models, Statistical , Organ Specificity , Regulatory Elements, Transcriptional
ABSTRACT
Long noncoding RNAs (lncRNAs) have emerged as key players in several biological processes and complex diseases, including type 2 diabetes mellitus (T2DM). The purpose of this study was to investigate the expression levels of SNHG17 and TTC28-AS1 in T2DM patients. Quantitative real-time RT-PCR analysis was performed using peripheral blood mononuclear cell (PBMC) samples from patients diagnosed with T2DM and healthy controls. Binary logistic regression analysis was carried out to determine the odds of developing T2DM based on expression levels of the lncRNAs and clinical characteristics of the subjects. Spearman's correlation analysis was used to assess the correlation between SNHG17 and TTC28-AS1 expression and metabolic features. We found that SNHG17 and TTC28-AS1 were down-regulated in the T2DM group compared to the healthy control group. The logistic regression revealed that body mass index (BMI), systolic blood pressure (SBP), fasting blood glucose (FBG) and TTC28-AS1 expression substantially affect T2DM susceptibility. Furthermore, expression of SNHG17 was negatively correlated with high-density lipoprotein cholesterol (HDL-C) and expression of TTC28-AS1 was positively correlated with low-density lipoprotein cholesterol (LDL-C). Decreased expression of the lncRNAs TTC28-AS1 and SNHG17 in T2DM is possibly associated with the development of T2DM.
Subjects
Diabetes Mellitus, Type 2/genetics , RNA, Long Noncoding/blood , Body Mass Index , Case-Control Studies , Cholesterol, HDL/blood , Cholesterol, HDL/genetics , Cholesterol, LDL/blood , Cholesterol, LDL/genetics , Diabetes Mellitus, Type 2/blood , Gene Expression Regulation , Humans , Leukocytes, Mononuclear/physiology , Logistic Models
ABSTRACT
In spite of the success of genome-wide association studies (GWASs), only a small proportion of heritability for each complex trait has been explained by identified genetic variants, mainly SNPs. Likely reasons include genetic heterogeneity (i.e., multiple causal genetic variants) and small effect sizes of causal variants, for which pathway analysis has been proposed as a promising alternative to the standard single-SNP-based analysis. A pathway contains a set of functionally related genes, each of which includes multiple SNPs. Here we propose a pathway-based test that is adaptive at both the gene and SNP levels, thus maintaining high power across a wide range of situations with varying numbers of the genes and SNPs associated with a trait. The proposed method is applicable to both common variants and rare variants and can incorporate biological knowledge on SNPs and genes to boost statistical power. We use extensively simulated data and a WTCCC GWAS dataset to compare our proposal with several existing pathway-based and SNP-set-based tests, demonstrating its promising performance and its potential use in practice.
Subjects
Genes/genetics , Genetic Diseases, Inborn/genetics , Genome-Wide Association Study/methods , Models, Genetic , Polymorphism, Single Nucleotide/genetics , Computer Simulation , Humans , Multifactorial Inheritance
ABSTRACT
Psoriasis is an inflammatory skin disease with an estimated heritability of around 70%. Previous GWASs have detected several risk loci for psoriasis. To further improve the understanding of the genetic risk factors impacting the disease, we conducted a discovery GWAS in FinnGen and a subsequent replication and meta-analysis with data from the Estonian Biobank and the UK Biobank; the study sample included 925,649 individuals (22,659 cases and 902,990 controls), the largest sample for psoriasis yet. In addition, we conducted downstream analyses to learn more about the cross-trait genetic correlations and causal relationships of psoriasis. We report 6 risk loci that are, to our knowledge, previously unreported, most of which harbor genes related to the NF-κB signaling pathway and overall immunity. Genetic correlations highlight the relationship between psoriasis and smoking, higher body weight, and lower education level. In addition, we report causal relationships between psoriasis and mood symptoms as well as a bidirectional causal relationship between psoriasis and lower education level. Our results provide further knowledge on psoriasis risk factors, which may be useful in the development of future treatment strategies.
ABSTRACT
Sjögren's syndrome (SS) is a chronic autoimmune disorder characterised by lymphocytic infiltration of the exocrine glands, which leads to dryness of the eyes and mouth; systemic manifestations such as arthritis, vasculitis, and interstitial lung disease; and increased risks of lymphoma and cardiovascular diseases. SS predominantly affects women, with a strong genetic component linked to sex chromosomes. Genome-wide association studies (GWASs) have identified numerous single-nucleotide polymorphisms (SNPs) associated with primary SS (pSS), revealing insights into its pathogenesis. The adaptive and innate immune systems are crucial to the development of SS, with viral infections implicated as environmental triggers that exacerbate autoimmune responses in genetically susceptible individuals. Moreover, recent research has highlighted the role of vitamin D in modulating immune responses in pSS patients, suggesting potential therapeutic implications. In this review, we focus on the recently identified SNPs in genes such as OAS1, NUDT15, LINC00243, TNXB, and THBS1, which have been associated with increased risks of developing more severe symptoms and other conditions such as fatigue, lymphoma, neuromyelitis optica spectrum disorder (NMOSD), dry eye syndrome (DES), and adverse drug reactions. Future studies should focus on larger, multi-ethnic cohorts with standardised protocols to validate findings and identify new associations. Integrating genetic testing into clinical practice holds promise for improving SS management and treatment strategies, enabling personalised interventions based on comprehensive genetic profiles. By focusing on specific SNPs, vitamin D, and their implications, future research can lead to more effective and personalised approaches for managing pSS and its complications.
ABSTRACT
BACKGROUND: In recent years, various coronaviruses have caused severe respiratory illnesses worldwide, for example the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) infections of the COVID-19 outbreak that began in 2019 in Wuhan, China. Genome-wide association studies (GWAS) have significantly expanded our comprehension of how specific genetic variations are linked to diseases. Research has demonstrated the existence of genetic factors influencing susceptibility to coronaviruses. The objective of this study was to examine the association of certain loci with COVID-19 in the Saudi population. METHODS: In the present study, we examined the link between COVID-19 and certain genetic variants in hospitalized COVID-19 patients (n = 16) in Tabuk and Bisha, Kingdom of Saudi Arabia. We used the Genome Analysis Toolkit (GATK), and comprehensive variant annotation was performed using different databases and tools such as Search Tool for the Retrieval of Interacting Genes (STRING), PanelApp and PolyPhen-2. RESULTS: The study showed that genetic variants associated with genes such as Homeostatic Iron Regulator (HFE) (found in 7 patients, representing 44%), complement factor H (CFH) (6 patients, 38%), cadherin 23 (CDH23) (4 patients, 25%), cytotoxic T-lymphocyte associated protein 4 (CTLA-4) (3 patients, 19%), Transforming Growth Factor Beta 1 (TGFB1) (3 patients, 19%), CREB-binding protein (CREBBP) (2 patients, 13%), E1A Binding Protein P300 (EP300) (2 patients, 13%), hemoglobin subunit beta (HBB) (2 patients, 13%), interferon regulatory factor 7 (IRF7) (2 patients, 13%), and unc-119 lipid binding chaperone (UNC119) (2 patients, 13%) might be associated with susceptibility to coronavirus. We also identified mutations in the COVID-19 patients that are pathogenic or likely pathogenic. CONCLUSION: A recurrent pathogenic mutation, HFE p.His63Asp (H63D), was identified in 7 patients, suggesting its potential contribution to disease severity. Additionally, a likely pathogenic variant, HBB p.Glu7Val (E7V), was present in 2 patients, highlighting its potential role in disease susceptibility. Our results shed light on the key genetic mechanisms of COVID-19 pathogenesis and help to identify and stratify the individuals or populations that are at risk of coronavirus infection. The identification of susceptible individuals or populations can assist in prevention and/or treatment programs.
Subjects
COVID-19 , Exome Sequencing , Hemochromatosis Protein , SARS-CoV-2 , Humans , COVID-19/genetics , COVID-19/virology , COVID-19/mortality , SARS-CoV-2/genetics , Male , Middle Aged , Female , Saudi Arabia/epidemiology , Hemochromatosis Protein/genetics , Adult , Genetic Predisposition to Disease , Aged , Mutation , Genome-Wide Association Study
ABSTRACT
The human milk microbiota (HMM) is thought to influence the long-term health of offspring. However, its role in asthma and atopy and the impact of host genomics on HMM composition remain unclear. Through the CHILD Cohort Study, we followed 885 pregnant mothers and their offspring from birth to 5 years and determined that HMM was associated with maternal genomics and prevalence of childhood asthma and allergic sensitization (atopy) among human milk-fed infants. Network analysis identified modules of correlated microbes in human milk that were associated with subsequent asthma and atopy in preschool-aged children. Moreover, reduced alpha-diversity and increased Lawsonella abundance in HMM were associated with increased prevalence of childhood atopy. Genome-wide association studies (GWASs) identified maternal genetic loci (e.g., ADAMTS8, NPR1, and COTL1) associated with HMM implicated with asthma and atopy, notably Lawsonella and alpha-diversity. Thus, our study elucidates the role of host genomics on the HMM and its potential impact on childhood asthma and atopy.
Subjects
Asthma , Genome-Wide Association Study , Hypersensitivity , Microbiota , Milk, Human , Humans , Asthma/genetics , Asthma/microbiology , Asthma/immunology , Female , Child, Preschool , Milk, Human/microbiology , Milk, Human/immunology , Infant , Hypersensitivity/microbiology , Hypersensitivity/genetics , Genomics , Infant, Newborn , Pregnancy , Male , Cohort Studies , Adult
ABSTRACT
Ischemic stroke is a heterogeneous condition influenced by a combination of genetic and environmental factors. Recent advancements have explored genetics in relation to various aspects of ischemic stroke, including the alteration of individual stroke occurrence risk, modulation of treatment response, and effectiveness of post-stroke functional recovery. This article aims to review the recent findings from genetic studies related to various clinical and molecular aspects of ischemic stroke. The potential clinical applications of these genetic insights in stratifying stroke risk, guiding personalized therapy, and identifying new therapeutic targets are discussed herein.
ABSTRACT
Genome-wide association studies reveal the complex polygenic architecture underlying psychiatric disorder risk, but there is an unmet need to validate causal variants, resolve their target gene(s), and explore their functional impacts on disorder-related mechanisms. Disorder-associated loci regulate transcription of target genes in a cell type- and context-specific manner, which can be measured through expression quantitative trait loci. In this review, we discuss methods and insights from context-specific modeling of genetically and environmentally regulated expression. Human induced pluripotent stem cell-derived cell type and organoid models have uncovered context-specific psychiatric disorder associations by investigating tissue-, cell type-, sex-, age-, and stressor-specific genetic regulation of expression. Techniques such as massively parallel reporter assays and pooled CRISPR (clustered regularly interspaced short palindromic repeats) screens make it possible to functionally fine-map genome-wide association study loci and validate their target genes at scale. Integration of disorder-associated contexts with these patient-specific human induced pluripotent stem cell models makes it possible to uncover gene-by-environment interactions that mediate disorder risk, which will ultimately improve our ability to diagnose and treat psychiatric disorders.
Subjects
Induced Pluripotent Stem Cells , Mental Disorders , Humans , Genome-Wide Association Study/methods , Induced Pluripotent Stem Cells/metabolism , Quantitative Trait Loci , Mental Disorders/genetics , Mental Disorders/metabolism , Gene Expression Regulation
ABSTRACT
Age-related macular degeneration (AMD) is a leading cause of blindness in older adults. Investigating shared genetic components between metabolites and AMD can enhance our understanding of its pathogenesis. We conduct metabolite genome-wide association studies (mGWASs) using multi-ethnic genetic and metabolomic data from up to 28,000 participants. With bidirectional Mendelian randomization analysis involving 16,144 advanced AMD cases and 17,832 controls, we identify 108 putatively causal relationships between plasma metabolites and advanced AMD. These metabolites are enriched in glycerophospholipid metabolism, lysophospholipid, triradylglycerol, and long-chain polyunsaturated fatty acid pathways. Bayesian genetic colocalization analysis and a customized metabolome-wide association approach prioritize putative causal AMD-associated metabolites. We find limited evidence linking urine metabolites to AMD risk. Our study emphasizes the contribution of plasma metabolites, particularly lipid-related pathways and genes, to AMD risk and uncovers numerous putative causal associations between metabolites and AMD risk.
Subjects
Genome-Wide Association Study , Macular Degeneration , Humans , Aged , Bayes Theorem , Macular Degeneration/genetics , Macular Degeneration/metabolism , Metabolomics , Metabolome/genetics
ABSTRACT
PURPOSE: Associations of adiponectin with type 2 diabetes mellitus (T2DM) and glucose homeostasis (including β-cell function index (HOMA-β), insulin resistance (HOMA-IR), fasting insulin (FI) and fasting glucose (FG)) have been reported in epidemiological studies. However, these observational studies are prone to biases, such as reverse causation and residual confounding. Herein, a Mendelian randomization (MR) study was conducted to determine whether causal effects exist among them. MATERIALS AND METHODS: Two-sample MR analyses and multiple sensitivity analyses were performed using summary data from the ADIPOGen consortium, the MAGIC Consortium, and a meta-analysis of GWAS with a considerable sample of T2DM (62,892 cases and 596,424 controls of European ancestry). After excluding probable invalid or pleiotropic variants, we obtained eight valid genetic variants to estimate the causal effect of adiponectin on T2DM and glucose homeostasis. RESULTS: Adiponectin was not associated with T2DM (odds ratio (OR) = 1.004; 95% confidence interval (CI): 0.740, 1.363) when using MR Egger after removing the invalid SNPs, and the results were consistent when using the other four methods. Similar results were found for adiponectin and HOMA-β, HOMA-IR, FI, and FG. CONCLUSION: Our MR study revealed that adiponectin had no causal effect on T2DM or glucose homeostasis and that the associations among them in observational studies may be due to confounding factors.
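As a rough illustration of how a two-sample MR estimate is formed from summary statistics, the sketch below implements the standard fixed-effect inverse-variance weighted (IVW) estimator. This is an assumption for illustration: the abstract names MR Egger and "four other methods" without listing them, so IVW is not confirmed as one of the authors' methods. The function names and input values are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted (IVW) MR estimate.

    beta_exposure: per-SNP effects on the exposure (e.g. adiponectin level)
    beta_outcome:  per-SNP effects on the outcome (e.g. log-odds of T2DM)
    se_outcome:    standard errors of the outcome effects
    Returns (causal effect estimate, its standard error).
    """
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds estimate to an odds ratio with an approximate 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    # Invented summary statistics for two instruments, for illustration only.
    beta, se = ivw_estimate([0.1, 0.2], [0.01, 0.02], [0.05, 0.05])
    print(odds_ratio_with_ci(beta, se))
```

A confidence interval that spans 1, as in the OR reported in the abstract, is what "not associated" means on this scale.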
ABSTRACT
Although genome-wide association studies (GWASs) have successfully identified thousands of risk variants for human complex diseases, understanding the biological function and molecular mechanisms of the associated SNPs involved in complex diseases is challenging. Here we developed a framework named integrative multi-omics network-based approach (IMNA), aiming to identify potential key genes in regulatory networks by integrating molecular interactions across multiple biological scales, including GWAS signals, gene expression-based signatures, chromatin interactions and protein interactions from the network topology. We applied this approach to breast cancer, and prioritized key genes involved in regulatory networks. We also developed an abnormal gene expression score (AGES) signature based on the gene expression deviation of the top 20 rank-ordered genes in breast cancer. The AGES values are associated with genetic variants, tumor properties and patient survival outcomes. Among the top 20 genes, RNASEH2A was identified as a new candidate gene for breast cancer. Thus, our integrative network-based approach provides a genetics-driven framework to unveil tissue-specific interactions from multiple biological scales and reveal potential key regulatory genes for breast cancer. This approach can also be applied to other complex diseases such as ovarian cancer to unravel underlying mechanisms and help develop therapeutic targets.
ABSTRACT
Human diseases are usually linked to multilocus genetic alterations, including single-nucleotide polymorphisms (SNPs). Methods to use these SNPs for disease risk prediction (DRP) are of clinical interest. DRP algorithms explored by commercial companies to date have tended to be complex and have led to controversial prediction results. Here, we present a general approach for establishing a logistic model-based DRP algorithm, in which multiple SNP risk factors from different publications are directly used. In particular, the coefficient β of each SNP is set as the natural logarithm of the reported odds ratio, and the constant coefficient β0 is determined jointly by the coefficient and frequency of each SNP and the average disease risk in populations. Furthermore, a homozygous SNP is treated as a dummy variable, and the SNPs are updated (addition, deletion and modification) if necessary. Importantly, we validated this algorithm as a proof of concept: two patients with lung cancer were identified as the maximum-risk cases among 57 Chinese individuals. Our logistic model-based DRP algorithm is apparently more intuitive and self-evident than the algorithms explored by commercial companies, and it may facilitate DRP commercialization in the era of personalized medicine.
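The logistic construction described above can be sketched as follows. Setting each coefficient to the natural logarithm of the reported odds ratio follows the abstract. The intercept formula is an assumption: the abstract only says that β0 depends on the coefficients, the SNP frequencies, and the average population risk, so the sketch uses one simple approximation, β0 = logit(average risk) − Σ βᵢ·fᵢ, with each SNP coded as a 0/1 indicator of carrying the risk genotype. All function names and numbers are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))


def build_model(odds_ratios, risk_genotype_freqs, average_risk):
    """Return (beta0, betas) for a logistic risk model.

    betas[i] = ln(OR_i), as stated in the abstract.
    beta0 is an ASSUMED approximation that centres the linear predictor
    so that an "average" genotype maps to the average population risk.
    """
    betas = [math.log(odds) for odds in odds_ratios]
    beta0 = logit(average_risk) - sum(
        b * f for b, f in zip(betas, risk_genotype_freqs))
    return beta0, betas


def predict_risk(beta0, betas, indicators):
    """indicators[i] is 1 if the individual carries risk genotype i, else 0."""
    return sigmoid(beta0 + sum(b * x for b, x in zip(betas, indicators)))


if __name__ == "__main__":
    # Invented odds ratios, frequencies, and prevalence, for illustration only.
    beta0, betas = build_model([1.3, 1.8], [0.25, 0.10], average_risk=0.05)
    print(predict_risk(beta0, betas, [1, 1]), predict_risk(beta0, betas, [0, 0]))
```

Because the coefficients come directly from published odds ratios, adding, deleting, or modifying a SNP only changes one entry in each input list, which matches the update step mentioned in the abstract.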
Subjects
Algorithms , Logistic Models , Lung Neoplasms/genetics , Polymorphism, Single Nucleotide/genetics , China , Female , Humans , Lung Neoplasms/blood , Male , Risk Factors
ABSTRACT
Although genome-wide association studies (GWASs) have identified some risk single-nucleotide polymorphisms in East Asian never-smoking females, the unexplained missing heritability still needs to be investigated. Runs of homozygosity (ROHs) are thought to be a type of genetic variation acting on human complex traits and diseases. We detected ROHs in 8,881 East Asian never-smoking women. The summed ROHs were used to fit a logistic regression model, which revealed a significant association between ROHs and a decreased risk of lung cancer (P < 0.05). We identified 4 common ROH regions located at 2p22.1, which were significantly associated with a decreased risk of lung cancer (P = 2.00 × 10⁻⁴ to 1.35 × 10⁻⁴). Functional annotation was conducted to investigate the regulatory function of ROHs. The common ROHs overlapped with potential regulatory elements, such as active epigenome elements and chromatin states in lung-derived cell lines. SOS1 and ARHGEF33, the putative target genes of the identified ROHs, were significantly up-regulated in lung cancer samples according to the analysis of differentially expressed genes. Our results suggest that ROHs could act as recessive contributing factors and regulatory elements that influence the risk of lung cancer in never-smoking East Asian females.
ABSTRACT
BACKGROUND: Genome-wide association studies (GWASs) have revealed relationships between over 57,000 genetic variants and diseases. However, unlike Mendelian diseases, complex diseases arise from the interplay of multiple genetic and environmental factors. Natural selection has led to a high tendency of risk alleles to be enriched in minor alleles in Mendelian diseases. Therefore, an allele that was previously advantageous or neutral may later become harmful, making it a risk allele. METHODS: Using data in the NHGRI-EBI Catalog and the VARIMED database, we investigated whether (1) GWASs more easily detect risk alleles and (2) facilitate evolutionary insights by comparing risk allele frequencies of different diseases. We conducted computer simulations of P-values for association tests when major and minor alleles were risk alleles. We compared the expected proportion of SNVs whose risk alleles were minor alleles with the observed proportion. RESULTS: Our statistical results revealed that risk alleles were enriched in minor alleles, especially for variants with low minor allele frequencies (MAFs < 0.1). Our computer simulations revealed that > 50% of risk alleles were minor alleles because of the larger difference in the power of GWASs to differentiate between minor and major alleles, especially with low MAFs or when the number of controls exceeds the number of cases. However, the observed ratios between minor and major alleles at low MAFs (< 0.1) were much larger than the ratios expected from the GWAS power imbalance, especially for diseases whose average risk allele frequencies were low, such as myopia, sudden cardiac arrest, and systemic lupus erythematosus. CONCLUSIONS: Minor alleles are more likely to be risk alleles in the published GWASs on complex diseases. One reason is that minor alleles are more easily detected as risk alleles in GWASs. Even when correcting for the GWAS power imbalance, minor alleles are more likely to be risk alleles, especially in some diseases whose average risk allele frequencies are low. These analyses serve as a starting point for future studies on quantifying the degree of negative natural selection in various complex diseases.