Results 1 - 20 of 6,693
1.
Am J Hum Genet ; 111(5): 966-978, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38701746

ABSTRACT

Replicability is the cornerstone of modern scientific research. Reliable identifications of genotype-phenotype associations that are significant in multiple genome-wide association studies (GWASs) provide stronger evidence for the findings. Current replicability analysis relies on the independence assumption among single-nucleotide polymorphisms (SNPs) and ignores the linkage disequilibrium (LD) structure. We show that such a strategy may produce either overly liberal or overly conservative results in practice. We develop an efficient method, ReAD, to detect replicable SNPs associated with the phenotype from two GWASs accounting for the LD structure. The local dependence structure of SNPs across two heterogeneous studies is captured by a four-state hidden Markov model (HMM) built on two sequences of p values. By incorporating information from adjacent locations via the HMM, our approach provides more accurate SNP significance rankings. ReAD is scalable, platform independent, and more powerful than existing replicability analysis methods with effective false discovery rate control. Through analysis of datasets from two asthma GWASs and two ulcerative colitis GWASs, we show that ReAD can identify replicable genetic loci that existing methods might otherwise miss.


Subject(s)
Asthma , Genome-Wide Association Study , Linkage Disequilibrium , Polymorphism, Single Nucleotide , Genome-Wide Association Study/methods , Humans , Asthma/genetics , Markov Chains , Colitis, Ulcerative/genetics , Reproducibility of Results , Phenotype , Genotype
2.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38711368

ABSTRACT

Common genetic variants and susceptibility loci associated with Alzheimer's disease (AD) have been discovered through large-scale genome-wide association studies (GWAS), GWAS by proxy (GWAX) and meta-analysis of GWAS and GWAX (GWAS+GWAX). However, due to the very low repeatability of AD susceptibility loci and the low heritability of AD, these AD genetic findings have been questioned. We summarize AD genetic findings from the past 10 years and provide a new interpretation of these findings in the context of statistical heterogeneity. We discovered that only 17% of AD risk loci demonstrated reproducibility with a genome-wide significance of P < 5.00E-08 across all AD GWAS and GWAS+GWAX datasets. We highlighted that the AD GWAS+GWAX with the largest sample size failed to identify the most significant signals, the maximum number of genome-wide significant genetic variants or maximum heritability. Additionally, we identified widespread statistical heterogeneity in AD GWAS+GWAX datasets, but not in AD GWAS datasets. We consider that statistical heterogeneity may have attenuated the statistical power in AD GWAS+GWAX and may contribute to explaining the low repeatability (17%) of genome-wide significant AD susceptibility loci and the decreased AD heritability (40-2%) as the sample size increased. Importantly, evidence supports the idea that a decrease in statistical heterogeneity facilitates the identification of genome-wide significant genetic loci and contributes to an increase in AD heritability. Collectively, current AD GWAX and GWAS+GWAX findings should be meticulously assessed and warrant additional investigation, and AD GWAS+GWAX should employ multiple meta-analysis methods, such as random-effects inverse variance-weighted meta-analysis, which is designed specifically for statistical heterogeneity.


Subject(s)
Alzheimer Disease , Genetic Predisposition to Disease , Genome-Wide Association Study , Alzheimer Disease/genetics , Humans , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide , Genetic Heterogeneity
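Illustration for entry 2: the abstract recommends random-effects inverse variance-weighted meta-analysis when statistical heterogeneity is present. The block below is a minimal sketch of that idea, assuming the DerSimonian-Laird moment estimator for the between-study variance (the abstract does not say which estimator the authors used). The function name `random_effects_meta` and all input values are hypothetical.

```python
import math


def random_effects_meta(betas, ses):
    """Random-effects inverse-variance-weighted meta-analysis.

    Assumption: between-study variance (tau^2) is estimated with the
    DerSimonian-Laird moment estimator. `betas` are per-study effect
    estimates and `ses` their standard errors.
    """
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need at least two studies with matching inputs")

    k = len(betas)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    beta_fixed = sum(wi * bi for wi, bi in zip(w, betas)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (bi - beta_fixed) ** 2 for wi, bi in zip(w, betas))

    # DerSimonian-Laird estimate of tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each study's sampling variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    sum_w_star = sum(w_star)
    beta_random = sum(wi * bi for wi, bi in zip(w_star, betas)) / sum_w_star
    se_random = math.sqrt(1.0 / sum_w_star)

    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    return {"beta": beta_random, "se": se_random, "tau2": tau2, "q": q, "i2": i2}
```

When the studies agree, tau^2 is estimated as zero and the result matches the fixed-effect estimate; when they disagree, the pooled standard error widens to reflect the heterogeneity.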
3.
Int J Mol Sci ; 25(9)2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38731885

ABSTRACT

Lysine is an essential amino acid that cannot be synthesized in humans. Rice is a global staple food for humans but has a rather low lysine content. Identification of the quantitative trait nucleotides (QTNs) and genes underlying lysine content is crucial to increase lysine accumulation. In this study, five grain and three leaf lysine content datasets and 4,630,367 single nucleotide polymorphisms (SNPs) of 387 rice accessions were used to perform a genome-wide association study (GWAS) with ten statistical models. A total of 248 and 71 common QTNs associated with grain and leaf lysine content, respectively, were identified. The accuracy of genomic selection/prediction RR-BLUP models was up to 0.85, and the significant correlation between the number of favorable alleles per accession and lysine content was up to 0.71, which validated the reliability and additive effects of these QTNs. Several key genes were uncovered for fine-tuning lysine accumulation. Additionally, 20 and 30 QTN-by-environment interactions (QEIs) were detected in grains and leaves, respectively. LOC_Os01g21380, the candidate gene for QEI-sf0111954416 that putatively accounts for a gene-by-environment interaction, was identified in grains. These findings suggest that the application of multi-model GWAS facilitates a better understanding of lysine accumulation in rice. The identified QTNs and genes hold potential for breeding lysine-rich rice with a normal phenotype.


Subject(s)
Genome-Wide Association Study , Lysine , Oryza , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Oryza/genetics , Oryza/metabolism , Lysine/metabolism , Genome-Wide Association Study/methods , Phenotype , Gene-Environment Interaction , Edible Grain/genetics , Edible Grain/metabolism
4.
Eur J Med Res ; 29(1): 261, 2024 May 02.
Article in English | MEDLINE | ID: mdl-38698427

ABSTRACT

BACKGROUND: Prior observational research has investigated the association between dietary patterns and Alzheimer's disease (AD) risk. Nevertheless, due to constraints in past observational studies, establishing a causal link between dietary habits and AD remains challenging. METHODS: Methodology involved the utilization of extensive cohorts sourced from publicly accessible genome-wide association study (GWAS) datasets of European descent for conducting Mendelian randomization (MR) analyses. The principal analytical technique utilized was the inverse-variance weighted (IVW) method. RESULTS: The MR analysis conducted in this study found no statistically significant causal association between 20 dietary habits and the risk of AD (All p > 0.05). These results were consistent across various MR methods employed, including MR-Egger, weighted median, simple mode, and weighted mode approaches. Moreover, there was no evidence of horizontal pleiotropy detected (All p > 0.05). CONCLUSION: In this MR analysis, our finding did not provide evidence to support the causal genetic relationships between dietary habits and AD risk.


Subject(s)
Alzheimer Disease , Genome-Wide Association Study , Mendelian Randomization Analysis , Alzheimer Disease/genetics , Alzheimer Disease/epidemiology , Alzheimer Disease/etiology , Humans , Mendelian Randomization Analysis/methods , Genome-Wide Association Study/methods , Risk Factors , Feeding Behavior/physiology , Diet/adverse effects , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease
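Illustration for entry 4: the abstract names the inverse-variance weighted (IVW) method as its principal Mendelian randomization technique. The block below is a minimal sketch of the standard fixed-effect IVW estimator computed from per-SNP summary statistics, assuming first-order weights (outcome standard errors only) and a normal approximation for the p value. It is not the authors' pipeline; the function name `ivw_estimate` and the example inputs are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate from per-SNP summary statistics.

    Each SNP contributes a Wald ratio (beta_outcome / beta_exposure),
    weighted by beta_exposure^2 / se_outcome^2 (first-order weights).
    Returns (estimate, standard error, two-sided p value).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    if not beta_exposure:
        raise ValueError("at least one instrument is required")

    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )

    estimate = numerator / denominator
    std_err = math.sqrt(1.0 / denominator)

    # Two-sided p value under a standard normal approximation.
    z = estimate / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return estimate, std_err, p_value
```

A result such as "All p > 0.05" in the abstract corresponds to this p value exceeding 0.05 for every exposure tested.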
5.
Sci Adv ; 10(19): eadj1424, 2024 May 10.
Article in English | MEDLINE | ID: mdl-38718126

ABSTRACT

The ongoing expansion of human genomic datasets propels therapeutic target identification; however, extracting gene-disease associations from gene annotations remains challenging. Here, we introduce Mantis-ML 2.0, a framework integrating AstraZeneca's Biological Insights Knowledge Graph and numerous tabular datasets, to assess gene-disease probabilities throughout the phenome. We use graph neural networks, capturing the graph's holistic structure, and train them on hundreds of balanced datasets via a robust semi-supervised learning framework to provide gene-disease probabilities across the human exome. Mantis-ML 2.0 incorporates natural language processing to automate disease-relevant feature selection for thousands of diseases. The enhanced models demonstrate a 6.9% average classification power boost, achieving a median receiver operating characteristic (ROC) area under curve (AUC) score of 0.90 across 5220 diseases from Human Phenotype Ontology, OpenTargets, and Genomics England. Notably, Mantis-ML 2.0 prioritizes associations from an independent UK Biobank phenome-wide association study (PheWAS), providing a stronger form of triaging and mitigating against underpowered PheWAS associations. Results are exposed through an interactive web resource.


Subject(s)
Biological Specimen Banks , Neural Networks, Computer , Humans , Genome-Wide Association Study/methods , Phenotype , United Kingdom , Phenomics/methods , Genetic Predisposition to Disease , Genomics/methods , Databases, Genetic , Algorithms , Computational Biology/methods , UK Biobank
6.
CNS Neurosci Ther ; 30(5): e14741, 2024 05.
Article in English | MEDLINE | ID: mdl-38702940

ABSTRACT

AIMS: Despite the success of single-cell RNA sequencing in identifying cellular heterogeneity in ischemic stroke, clarifying the mechanisms underlying these associations of differentially expressed genes remains challenging. Several studies have proposed integrating gene expression and expression quantitative trait loci (eQTLs) with genome-wide association study (GWAS) data to determine their causal role. METHODS: Here, we combined a Mendelian randomization (MR) framework and single-cell (sc) RNA sequencing to study how differentially expressed genes (DEGs) mediate the effect of gene expression on ischemic stroke. The hub gene was further validated in an in vitro model. RESULTS: We identified 2339 DEGs in 10 cell clusters. Among these DEGs, 58 genes were associated with the risk of ischemic stroke. After external validation with an eQTL dataset, lactate dehydrogenase B (LDHB) was identified as positively associated with ischemic stroke. The expression of LDHB was also validated in sc RNA-seq, with dominant expression in microglia and astrocytes, and melatonin was able to reduce LDHB expression and activity in in vitro ischemic models. CONCLUSION: Our study identifies LDHB as a novel biomarker for ischemic stroke by combining sc RNA-seq and MR analysis.


Subject(s)
Ischemic Stroke , L-Lactate Dehydrogenase , Melatonin , Mendelian Randomization Analysis , Sequence Analysis, RNA , Animals , Humans , Genome-Wide Association Study/methods , Ischemic Stroke/genetics , Ischemic Stroke/metabolism , Isoenzymes/genetics , Isoenzymes/metabolism , L-Lactate Dehydrogenase/metabolism , L-Lactate Dehydrogenase/genetics , Mendelian Randomization Analysis/methods , Quantitative Trait Loci , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Mice
7.
Front Immunol ; 15: 1277720, 2024.
Article in English | MEDLINE | ID: mdl-38633255

ABSTRACT

Background: Chronic pain increases susceptibility to viral infection and is now widely recognized as a major manifestation of long-term coronavirus disease 2019 (COVID-19) infection. Given the ongoing COVID-19 pandemic, it is imperative to explore the genetic associations between chronic pain and predisposition to COVID-19. Methods: We conducted genetic analysis at the single nucleotide polymorphism (SNP), gene, and molecular levels using summary statistics of genome-wide association studies (GWAS) and analyzed drug targets by summary data-based Mendelian randomization (SMR) analysis to alleviate multi-site chronic pain in COVID-19. Additionally, we applied a latent causal variable (LCV) method to investigate the causal relationship between chronic pain and susceptibility to COVID-19. Results: The cross-trait meta-analysis identified 19 significant SNPs shared between COVID-19 and chronic pain. Coloc analysis indicated that the posterior probability of association (PPH4) for three loci was above 70% in both critical COVID-19 and COVID-19, with the corresponding top three SNPs being rs13135092, rs7588831, and rs13135092. A total of 482 significant overlapping genes were detected from MAGMA and CPASSOC results. Additionally, the gene ANAPC4 was identified as a potential drug target for treating chronic pain (P=7.66E-05) in COVID-19 (P=8.23E-03). Tissue enrichment analysis highlighted the amygdala (P=7.81E-04) and prefrontal cortex (P=8.19E-05) as pivotal in regulating chronic pain of critical COVID-19. KEGG pathway enrichment further revealed the enrichment of pleiotropic genes in both COVID-19 (P=3.20E-03, Padjust=4.77E-02, hsa05171) and neurotrophic pathways (P=9.03E-04, Padjust=2.55E-02, hsa04621). Finally, the latent causal variable (LCV) model indicated that the genetic component of critical COVID-19 was causal for multi-site chronic pain (P=0.015), with a genetic causality proportion (GCP) of 0.60.
Conclusions: In this study, we identified several functional genes and underscored the pivotal role of the inflammatory system in the correlation between the paired traits. Notably, heat shock proteins emerged as potential objective biomarkers for chronic pain symptoms in individuals with COVID-19. Additionally, the ubiquitin system might play a role in mediating the impact of COVID-19 on chronic pain. These findings contribute to a more comprehensive understanding of the pleiotropy between COVID-19 and chronic pain, offering insights for therapeutic trials.


Subject(s)
COVID-19 , Chronic Pain , Humans , Genome-Wide Association Study/methods , Genetic Predisposition to Disease , Pandemics
8.
Hum Genomics ; 18(1): 39, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38632618

ABSTRACT

Age-related cataract and hearing difficulties are major sensory disorders that often co-exist in the elderly population worldwide and have a tangible influence on quality of life. However, the epidemiologic association between cataract and hearing difficulties remains unexplored, and little is known about whether the two share a genetic etiology. We first investigated the clinical association between cataract and hearing difficulties using the UK Biobank, covering 502,543 individuals. Both an unmatched analysis (adjusted for confounders) and a matched analysis (one control matched to each patient with cataract according to confounding factors) were undertaken and confirmed that cataract was associated with hearing difficulties (OR, 2.12; 95% CI, 1.98-2.27; OR, 2.03; 95% CI, 1.86-2.23, respectively). Furthermore, we explored and quantified the shared genetic architecture of these two complex sensory disorders at the common variant level using the bivariate causal mixture model (MiXeR) and the conditional/conjunctional false discovery rate method, based on the largest available genome-wide association studies of cataract (N = 585,243) and hearing difficulties (N = 323,978). Despite detecting only a negligible genetic correlation, we observed polygenic overlap between cataract and hearing difficulties and identified 6 shared loci with mixed directions of effects. Follow-up analysis of the shared loci implicates the candidate genes QKI, STK17A, TYR, NSF, and TCF4, which likely contribute to the pathophysiology of cataract and hearing difficulties. In conclusion, this study demonstrates an epidemiologic association between cataract and hearing difficulties and provides new insights into the shared genetic architecture of these two disorders at the common variant level.


Subject(s)
Cataract , Hearing Loss , Aged , Middle Aged , Humans , Genome-Wide Association Study/methods , Quality of Life , Hearing , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Genetic Loci , Protein Serine-Threonine Kinases , Apoptosis Regulatory Proteins
9.
BMC Genomics ; 25(1): 386, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641604

ABSTRACT

BACKGROUND: The growth and development of organisms depend on genetic effects, environmental effects, and their interaction. In recent decades, many candidate additive genetic markers and genes have been detected using genome-wide association studies (GWAS). However, because of limited computing power and a lack of practical tools, the interactive effects of markers and genes have not been clearly revealed, and these interactive markers are difficult to use in breeding and prediction, such as genomic selection (GS). RESULTS: Through the Power-FDR curve, the GbyE algorithm can detect more significant genetic loci at different levels of genetic correlation and heritability, especially at low heritability levels. The additive effect of GbyE exhibits high significance on certain chromosomes, while the interactive effect detects more significant sites on other chromosomes, which were not detected in the first two parts. In prediction accuracy testing, for most combinations of heritability and genetic correlation, the prediction accuracy of GbyE is significantly higher than that of the mean method, regardless of whether the rrBLUP model or the BGLR model is used. The GbyE algorithm improves the prediction accuracy of the three Bayesian models BRR, BayesA, and BayesLASSO using information from genetic-by-environment interaction (G × E), increasing the prediction accuracy by 9.4%, 9.1%, and 11%, respectively, relative to the mean method. The GbyE algorithm is significantly superior to the mean method in the absence of a single environment, regardless of the combination of heritability and genetic correlation, especially in the case of high genetic correlation and heritability. CONCLUSIONS: This study constructed a new genotype design model program (GbyE) for GWAS and GS using the Kronecker product, which was able to clearly estimate the additive and interactive effects separately.
The results showed that GbyE can provide higher statistical power for GWAS and higher prediction accuracy for GS models. In addition, GbyE gives varying degrees of improvement in prediction accuracy in three Bayesian models (BRR, BayesA, and BayesCpi). Whether phenotypes were missing in a single environment or in multiple environments, GbyE also makes better predictions for the inference population set. This study helps us understand the interactive relationship between genome and environment in complex traits. The GbyE source code is available on GitHub ( https://github.com/liu-xinrui/GbyE ).


Subject(s)
Quantitative Trait Loci , Selection, Genetic , Bayes Theorem , Models, Genetic , Phenotype , Genotype , Genome-Wide Association Study/methods , Polymorphism, Single Nucleotide
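Illustration for entry 9: the abstract states that the genotype design was built with a Kronecker product so that additive and interactive effects can be estimated separately. The block below is only a sketch of how a Kronecker product can stack a genotype matrix across environments into an additive block and an interaction block. It is not taken from the GbyE source code; the coding of environments as a +1/-1 contrast and the names `kron` and `stacked_design` are assumptions made for illustration.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def stacked_design(genotypes, env_contrast):
    """Build a stacked [additive | interaction] design matrix.

    genotypes    : n x m matrix of marker codes (rows are individuals).
    env_contrast : one contrast value per environment, e.g. [1, -1]
                   (hypothetical coding chosen for this sketch).

    The additive block repeats the genotypes in every environment
    (ones vector kron G); the interaction block multiplies them by
    the environment contrast (contrast vector kron G).
    """
    ones = [[1] for _ in env_contrast]
    contrast = [[c] for c in env_contrast]
    additive = kron(ones, genotypes)
    interaction = kron(contrast, genotypes)
    # Place the two blocks side by side, row by row.
    return [add_row + int_row for add_row, int_row in zip(additive, interaction)]
```

With n individuals, m markers, and e environments, the resulting matrix has n × e rows and 2m columns, so each marker gets one additive column and one interaction column.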
10.
PLoS Comput Biol ; 20(4): e1011990, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38598551

ABSTRACT

Prostate cancer is a heritable disease with ancestry-biased incidence and mortality. Polygenic risk scores (PRSs) offer promising advancements in predicting disease risk, including prostate cancer. While their accuracy continues to improve, research aimed at enhancing their effectiveness within African and Asian populations remains key for equitable use. Recent algorithmic developments for PRS derivation have resulted in improved pan-ancestral risk prediction for several diseases. In this study, we benchmark the predictive power of six widely used PRS derivation algorithms, four of which adjust for ancestry, against prostate cancer cases and controls from the UK Biobank and All of Us cohorts. We find modest improvement in discriminatory ability when compared with a simple method that prioritizes variants, clumping, and published polygenic risk scores. Our findings underscore the importance of improving upon risk prediction algorithms and the sampling of diverse cohorts.


Subject(s)
Algorithms , Benchmarking , Genetic Predisposition to Disease , Multifactorial Inheritance , Prostatic Neoplasms , Humans , Prostatic Neoplasms/genetics , Male , Benchmarking/methods , Genetic Predisposition to Disease/genetics , Multifactorial Inheritance/genetics , Cohort Studies , Risk Factors , Polymorphism, Single Nucleotide/genetics , Genome-Wide Association Study/methods , Computational Biology/methods , Risk Assessment/methods , Case-Control Studies , Genetic Risk Score
11.
PLoS One ; 19(4): e0298906, 2024.
Article in English | MEDLINE | ID: mdl-38625909

ABSTRACT

Detecting epistatic drivers of human phenotypes is a considerable challenge. Traditional approaches use regression to sequentially test multiplicative interaction terms involving pairs of genetic variants. For higher-order interactions and genome-wide large-scale data, this strategy is computationally intractable. Moreover, multiplicative terms used in regression modeling may not capture the form of biological interactions. Building on the Predictability, Computability, Stability (PCS) framework, we introduce the epiTree pipeline to extract higher-order interactions from genomic data using tree-based models. The epiTree pipeline first selects a set of variants derived from tissue-specific estimates of gene expression. Next, it uses iterative random forests (iRF) to search training data for candidate Boolean interactions (pairwise and higher-order). We derive significance tests for interactions, based on a stabilized likelihood ratio test, by simulating Boolean tree-structured null (no epistasis) and alternative (epistasis) distributions on hold-out test data. Finally, our pipeline computes PCS epistasis p-values that probabilistically quantify improvement in prediction accuracy via bootstrap sampling on the test set. We validate the epiTree pipeline in two case studies using data from the UK Biobank: predicting red hair and multiple sclerosis (MS). In the case of predicting red hair, epiTree recovers known epistatic interactions surrounding MC1R and novel interactions, representing non-linearities not captured by logistic regression models. In the case of predicting MS, a more complex phenotype than red hair, epiTree rankings prioritize novel interactions surrounding HLA-DRB1, a variant previously associated with MS in several populations. Taken together, these results highlight the potential for epiTree rankings to help reduce the design space for follow-up experiments.


Subject(s)
Epistasis, Genetic , Genome-Wide Association Study , Humans , Genome-Wide Association Study/methods , Phenotype , Multifactorial Inheritance/genetics , Logistic Models , Polymorphism, Single Nucleotide
12.
Bioinformatics ; 40(4)2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38632050

ABSTRACT

MOTIVATION: As the availability of larger and more ethnically diverse reference panels grows, there is an increase in demand for ancestry-informed imputation of genome-wide association studies (GWAS), and other downstream analyses, e.g. fine-mapping. Performing such analyses at the genotype level is computationally challenging and necessitates, at best, a laborious process to access individual-level genotype and phenotype data. Summary-statistics-based tools, not requiring individual-level data, provide an efficient alternative that streamlines computational requirements and promotes open science by simplifying the re-analysis and downstream analysis of existing GWAS summary data. However, existing tools perform only disparate parts of needed analysis, have only command-line interfaces, and are difficult to extend/link by applied researchers. RESULTS: To address these challenges, we present Genome Analysis Using Summary Statistics (GAUSS)-a comprehensive and user-friendly R package designed to facilitate the re-analysis/downstream analysis of GWAS summary statistics. GAUSS offers an integrated toolkit for a range of functionalities, including (i) estimating ancestry proportion of study cohorts, (ii) calculating ancestry-informed linkage disequilibrium, (iii) imputing summary statistics of unobserved variants, (iv) conducting transcriptome-wide association studies, and (v) correcting for "Winner's Curse" biases. Notably, GAUSS utilizes an expansive, multi-ethnic reference panel consisting of 32 953 genomes from 29 ethnic groups. This panel enhances the range and accuracy of imputable variants, including the ability to impute summary statistics of rarer variants. As a result, GAUSS elevates the quality and applicability of existing GWAS analyses without requiring access to subject-level genotypic and phenotypic information. 
AVAILABILITY AND IMPLEMENTATION: The GAUSS R package, complete with its source code, is readily accessible to the public via our GitHub repository at https://github.com/statsleelab/gauss. To further assist users, we provided illustrative use-case scenarios that are conveniently found at https://statsleelab.github.io/gauss/, along with a comprehensive user guide detailed in Supplementary Text S1.


Subject(s)
Genome-Wide Association Study , Linkage Disequilibrium , Software , Genome-Wide Association Study/methods , Humans , Polymorphism, Single Nucleotide , Genotype , Cohort Studies
13.
BMC Genomics ; 25(1): 375, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38627641

ABSTRACT

BACKGROUND: Approximately 95% of samples analyzed in univariate genome-wide association studies (GWAS) are of European ancestry. This bias toward European ancestry populations in association screening also exists for other analyses and methods that are often developed and tested on European ancestry only. However, existing data in non-European populations, which are often of modest sample size, could benefit from innovative approaches as recently illustrated in the context of polygenic risk scores. METHODS: Here, we extend and assess the potential limitations and gains of our multi-trait GWAS pipeline, JASS (Joint Analysis of Summary Statistics), for the analysis of non-European ancestries. To this end, we conducted the joint GWAS of 19 hematological traits and glycemic traits across five ancestries (European (EUR), admixed American (AMR), African (AFR), East Asian (EAS), and South-East Asian (SAS)). RESULTS: We detected 367 new genome-wide significant associations in non-European populations (15 in admixed American (AMR), 72 in African (AFR) and 280 in East Asian (EAS)). New associations detected represent 5%, 17% and 13% of associations in the AFR, AMR and EAS populations, respectively. Overall, multi-trait testing increases the replication of European associated loci in non-European ancestry by 15%. Pleiotropic effects were highly similar at significant loci across ancestries (e.g. the mean correlation between multi-trait genetic effects of EUR and EAS ancestries was 0.88). For hematological traits, strong discrepancies in multi-trait genetic effects are tied to known evolutionary divergences, such as the ACKR1 locus, which is adaptive against P. vivax-induced malaria. CONCLUSIONS: Multi-trait GWAS can be a valuable tool to narrow the genetic knowledge gap between European and non-European populations.


Subject(s)
Asian People , Black People , Genome-Wide Association Study , Humans , Asian People/genetics , Black People/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Phenotype , Polymorphism, Single Nucleotide , European People/genetics
14.
J Transl Med ; 22(1): 356, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38627847

ABSTRACT

Machine learning (ML) methods are increasingly becoming crucial in genome-wide association studies for identifying key genetic variants or SNPs that statistical methods might overlook. Statistical methods predominantly identify SNPs with notable effect sizes by conducting association tests on individual genetic variants, one at a time, to determine their relationship with the target phenotype. These genetic variants are then used to create polygenic risk scores (PRSs), estimating an individual's genetic risk for complex diseases like cancer or cardiovascular disorders. Unlike traditional methods, ML algorithms can identify groups of low-risk genetic variants that improve prediction accuracy when combined in a mathematical model. However, the application of ML strategies requires addressing the feature selection challenge to prevent overfitting. Moreover, ensuring the ML model depends on a concise set of genomic variants enhances its clinical applicability, where testing is feasible for only a limited number of SNPs. In this study, we introduce a robust pipeline that applies ML algorithms in combination with feature selection (ML-FS algorithms), aimed at identifying the most significant genomic variants associated with the coronary artery disease (CAD) phenotype. The proposed computational approach was tested on individuals from the UK Biobank, differentiating between CAD and non-CAD individuals within this extensive cohort, and benchmarked against standard PRS-based methodologies like LDpred2 and Lassosum. Our strategy incorporates cross-validation to ensure a more robust evaluation of genomic variant-based prediction models. This method is commonly applied in machine learning strategies but has often been neglected in previous studies assessing the predictive performance of polygenic risk scores. 
Our results demonstrate that the ML-FS algorithm can identify panels with as few as 50 genetic markers that can achieve approximately 80% accuracy when used in combination with known risk factors. The modest increase in accuracy over PRS performances is noteworthy, especially considering that PRS models incorporate a substantially larger number of genetic variants. This extensive variant selection can pose practical challenges in clinical settings. Additionally, the proposed approach revealed novel CAD-genetic variant associations.


Subject(s)
Coronary Artery Disease , Humans , Coronary Artery Disease/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Risk Factors , Genetic Risk Score , Machine Learning , Genomics
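Illustration for entry 14: the abstract describes polygenic risk scores (PRSs) as built from variants found by single-variant association tests. In its simplest additive form, a PRS is the sum of an individual's risk-allele dosages weighted by per-variant effect sizes. The block below is a minimal sketch of that additive form only; it does not reproduce LDpred2, Lassosum, or the authors' ML-FS pipeline, and the function name and inputs are hypothetical.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Additive polygenic risk score for one individual.

    dosages      : risk-allele counts per variant (0, 1, or 2, or an
                   imputed fractional dosage between 0 and 2).
    effect_sizes : per-variant weights, e.g. GWAS log odds ratios.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    if any(d < 0 or d > 2 for d in dosages):
        raise ValueError("dosages must lie between 0 and 2")
    return sum(d * w for d, w in zip(dosages, effect_sizes))


def score_cohort(genotype_rows, effect_sizes):
    """Apply the same weights to every individual in a cohort."""
    return [polygenic_risk_score(row, effect_sizes) for row in genotype_rows]
```

Because every variant adds one term to the sum, a panel of 50 markers, as in the abstract, needs only 50 genotype calls per individual.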
15.
Nat Genet ; 56(4): 615-626, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38594305

ABSTRACT

Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer-gene maps from disease-relevant tissues. Building enhancer-gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer-gene maps. These maps were highly enriched for causal variants in expression quantitative trait loci and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer-gene maps, essential for defining noncoding variant function.


Subject(s)
Genome-Wide Association Study , Regulatory Sequences, Nucleic Acid , Humans , Alleles , Genome-Wide Association Study/methods , Chromosome Mapping , Phenotype , Chromatin/genetics , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease/genetics
16.
Cell Genom ; 4(4): 100539, 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38604127

ABSTRACT

Polygenic risk scores (PRSs) are now showing promising predictive performance on a wide variety of complex traits and diseases, but there exists a substantial performance gap across populations. We propose MUSSEL, a method for ancestry-specific polygenic prediction that borrows information in summary statistics from genome-wide association studies (GWASs) across multiple ancestry groups via Bayesian hierarchical modeling and ensemble learning. In our simulation studies and data analyses across four distinct studies, totaling 5.7 million participants with a substantial ancestral diversity, MUSSEL shows promising performance compared to alternatives. For example, MUSSEL has an average gain in prediction R² across 11 continuous traits of 40.2% and 49.3% compared to PRS-CSx and CT-SLEB, respectively, in the African ancestry population. The best-performing method, however, varies by GWAS sample size, target ancestry, trait architecture, and linkage disequilibrium reference samples; thus, ultimately a combination of methods may be needed to generate the most robust PRSs across diverse populations.


Subject(s)
Bivalvia , Multifactorial Inheritance , Humans , Animals , Multifactorial Inheritance/genetics , Genome-Wide Association Study/methods , Bayes Theorem , Phenotype , Genetic Risk Score
17.
Cell Death Dis ; 15(4): 251, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38589365

ABSTRACT

Cell death mediated by genetically defined signaling pathways influences the health and dynamics of all tissues; however, the tissue specificity of cell death pathways and the relationships between these pathways and human disease are not well understood. We analyzed the expression profiles of an array of 44 cell death genes involved in the apoptosis, necroptosis, and pyroptosis pathways across 49 human tissues from GTEx to elucidate the landscape of cell death gene expression across human tissues and the relationship between tissue-specific genetically determined expression and the human phenome. We uncovered unique cell death gene expression profiles across tissue types, suggesting that there are physiologically distinct cell death programs in different tissues. Using summary statistics-based transcriptome-wide association studies (TWAS) on human traits in the UK Biobank (n ~ 500,000), we evaluated 513 traits encompassing ICD-10 defined diagnoses and laboratory-derived traits. Our analysis revealed hundreds of significant (FDR < 0.05) associations between genetically regulated cell death gene expression and an array of human phenotypes encompassing both clinical diagnoses and hematologic parameters, which were independently validated in another large-scale DNA biobank (BioVU) at Vanderbilt University Medical Center (n = 94,474) with matching phenotypes. Cell death genes were highly enriched for significant associations with blood traits relative to non-cell-death genes, with apoptosis-associated genes enriched for leukocyte and platelet traits. Our findings are also concordant with independently published studies (e.g., associations between BCL2L11/BIM expression and platelet and lymphocyte counts). Overall, these results suggest that cell death genes play distinct roles in their contribution to human phenotypes and that they influence a diverse array of human traits.


Subject(s)
Genome-Wide Association Study , Transcriptome , Humans , Genome-Wide Association Study/methods , Phenotype , Cell Death/genetics , Polymorphism, Single Nucleotide , Genetic Predisposition to Disease
18.
Genes (Basel) ; 15(4)2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38674346

ABSTRACT

Ketosis is a common metabolic disorder in the early lactation of dairy cows. It is typically diagnosed by measuring the concentration of β-hydroxybutyrate (BHB) in the blood. This study aimed to estimate the genetic parameters of blood BHB and to conduct a genome-wide association study (GWAS) based on the estimated breeding value. Phenotypic data were collected from December 2019 to August 2023, comprising blood BHB concentrations in 45,617 Holstein cows during the three weeks post-calving across seven dairy farms. Genotypic data were obtained using the Neogen Geneseek Genomic Profiler (GGP) Bovine 100K SNP Chip and GGP Bovine SNP50 v3 (Illumina Inc., San Diego, CA, USA). The estimated heritability and repeatability values for blood BHB levels were 0.167 and 0.175, respectively. The GWAS detected a total of ten genome-wide significant associations with blood BHB. Significant SNPs were distributed on Bos taurus autosomes (BTA) 2, 6, 9, 11, 13, and 23, with 48 annotated candidate genes. These candidate genes included those associated with insulin regulation, such as INSIG2, and those linked to fatty acid metabolism, such as HADHB, HADHA, and PANK2. Enrichment analysis of the candidate genes for blood BHB revealed molecular functions and biological processes involved in fatty acid and lipid metabolism in dairy cattle. The identification of novel genomic regions in this study contributes to the characterization of key genes and pathways underlying susceptibility to ketosis in dairy cattle.


Subject(s)
3-Hydroxybutyric Acid , Genome-Wide Association Study , Lactation , Polymorphism, Single Nucleotide , Animals , Cattle/genetics , 3-Hydroxybutyric Acid/blood , Genome-Wide Association Study/methods , Genome-Wide Association Study/veterinary , Female , Lactation/genetics , Ketosis/veterinary , Ketosis/genetics , Ketosis/blood , Genetic Background , Cattle Diseases/genetics , Cattle Diseases/blood , Genotype
19.
Genes (Basel) ; 15(4)2024 Mar 26.
Article in English | MEDLINE | ID: mdl-38674348

ABSTRACT

Coleoptile length is crucial for determining the sowing depth of oats in low-precipitation regions, which makes it an important trait for oat breeding programs. In this study, a diverse panel of 243 oat accessions was used to explore coleoptile length in two independent experiments. The panel exhibited significant variation in coleoptile length, ranging from 4.66 to 8.76 cm. Accessions from Africa, America, and the Mediterranean region had longer coleoptiles than those from Asia and Europe. Genome-wide association studies (GWASs) using 26,196 SNPs identified 34 SNPs, representing 32 quantitative trait loci (QTLs), significantly associated with coleoptile length. Among these QTLs, six were consistently detected in both experiments, explaining 6.43% to 10.07% of the phenotypic variation. The favorable alleles at these stable loci additively increased coleoptile length, offering insights for pyramid breeding. Gene Ontology (GO) analysis of the 350 candidate genes underlying the six stable QTLs revealed significant enrichment in cell development-related processes. Several phytochrome-related genes, including auxin transporter-like protein 1 and cytochrome P450 proteins, were found within these QTLs. Further validation of these loci will enhance our understanding of coleoptile length regulation. This study provides new insights into the genetic architecture of coleoptile length in oats.


Subject(s)
Avena , Cotyledon , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Avena/genetics , Avena/growth & development , Genome-Wide Association Study/methods , Cotyledon/genetics , Cotyledon/growth & development , Phenotype , Genome, Plant , Plant Breeding
20.
BMC Med ; 22(1): 152, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38589871

ABSTRACT

BACKGROUND: Despite substantial research revealing that patients with rheumatoid arthritis (RA) have excess morbidity and mortality from cardiovascular disease (CVD), the mechanism underlying this association is not fully understood. This study aims to systematically investigate the phenotypic and genetic correlation between RA and CVD. METHODS: Based on UK Biobank, we conducted two cohort studies to evaluate the phenotypic relationships between RA and CVD, including atrial fibrillation (AF), coronary artery disease (CAD), heart failure (HF), and stroke. Next, we used linkage disequilibrium score regression, Local Analysis of [co]Variant Association, and bivariate causal mixture model (MiXeR) methods to examine the genetic correlation and polygenic overlap between RA and CVD, using genome-wide association summary statistics. Furthermore, we explored specific shared genetic loci by conjunctional false discovery rate analysis and association analysis based on subsets. RESULTS: Compared with the general population, RA patients showed a higher incidence of CVD (hazard ratio [HR] = 1.21, 95% confidence interval [CI]: 1.15-1.28). We observed positive genetic correlations of RA with AF and stroke, and a mixture of negative and positive local genetic correlations underlying the global genetic correlation for CAD and HF, with 13-33% of genetic variants shared for these trait pairs. We further identified 23 pleiotropic loci associated with RA and at least one CVD, including one novel locus (rs7098414, TSPAN14, 10q23.1). Genes mapped to these shared loci were enriched in immune- and inflammation-related pathways and in modifiable risk factors, such as high diastolic blood pressure. CONCLUSIONS: This study revealed the shared genetic architecture of RA and CVD, which may facilitate drug target identification and improved clinical management.


Subject(s)
Arthritis, Rheumatoid , Cardiovascular Diseases , Coronary Artery Disease , Heart Failure , Stroke , Humans , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/genetics , Genome-Wide Association Study/methods , Genetic Predisposition to Disease/genetics , Arthritis, Rheumatoid/genetics , Arthritis, Rheumatoid/epidemiology , Coronary Artery Disease/genetics , Stroke/epidemiology , Stroke/genetics , Polymorphism, Single Nucleotide/genetics