Results 1 - 20 of 34
1.
Cell ; 187(2): 464-480.e10, 2024 01 18.
Article in English | MEDLINE | ID: mdl-38242088

ABSTRACT

Primary open-angle glaucoma (POAG), the leading cause of irreversible blindness worldwide, disproportionately affects individuals of African ancestry. We conducted a genome-wide association study (GWAS) for POAG in 11,275 individuals of African ancestry (6,003 cases; 5,272 controls). We detected 46 risk loci associated with POAG at genome-wide significance. Replication and post-GWAS analyses, including functionally informed fine-mapping, multiple trait co-localization, and in silico validation, implicated two previously undescribed variants (rs1666698 mapping to DBF4P2; rs34957764 mapping to ROCK1P1) and one previously associated variant (rs11824032 mapping to ARHGEF12) as likely causal. For individuals of African ancestry, a polygenic risk score (PRS) for POAG from our mega-analysis (African ancestry individuals) outperformed a PRS from summary statistics of a much larger GWAS derived from European ancestry individuals. This study quantifies the genetic architecture similarities and differences between African and non-African ancestry populations for this blinding disease.


Subject(s)
Genome-Wide Association Study , Glaucoma, Open-Angle , Humans , Genetic Predisposition to Disease , Glaucoma, Open-Angle/genetics , Black People/genetics , Polymorphism, Single Nucleotide/genetics
2.
Am J Hum Genet ; 110(4): 575-591, 2023 04 06.
Article in English | MEDLINE | ID: mdl-37028392

ABSTRACT

Leveraging linkage disequilibrium (LD) patterns as representative of population substructure enables the discovery of additive association signals in genome-wide association studies (GWASs). Standard GWASs are well-powered to interrogate additive models; however, new approaches are required for investigating other modes of inheritance such as dominance and epistasis. Epistasis, or non-additive interaction between genes, exists across the genome but often goes undetected because of a lack of statistical power. Furthermore, the adoption of LD pruning as customary in standard GWASs excludes detection of sites that are in LD but might underlie the genetic architecture of complex traits. We hypothesize that uncovering long-range interactions between loci with strong LD due to epistatic selection can elucidate genetic mechanisms underlying common diseases. To investigate this hypothesis, we tested for associations between 23 common diseases and 5,625,845 epistatic SNP-SNP pairs (determined by Ohta's D statistics) in long-range LD (>0.25 cM). Across five disease phenotypes, we identified one significant and four near-significant associations that replicated in two large genotype-phenotype datasets (UK Biobank and eMERGE). The genes that were most likely involved in the replicated associations were (1) members of highly conserved gene families with complex roles in multiple pathways, (2) essential genes, and/or (3) genes that were associated in the literature with complex traits that display variable expressivity. These results support the highly pleiotropic and conserved nature of variants in long-range LD under epistatic selection. Our work supports the hypothesis that epistatic interactions regulate diverse clinical mechanisms and might especially be driving factors in conditions with a wide range of phenotypic outcomes.
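
Illustrative note (not part of the abstract): the abstract relies on LD between pairs of SNPs. As a rough sketch only, the basic pairwise LD coefficient D and the squared correlation r2 can be computed from phased 0/1 haplotypes as below. This is the textbook two-locus measure, not Ohta's D statistics used in the study (which partition LD within and between subpopulations); the function name `pairwise_ld` and the toy haplotypes are assumptions made for illustration.

```python
import numpy as np

def pairwise_ld(hap_a, hap_b):
    """Return (D, r2) for two biallelic loci.

    hap_a, hap_b: phased haplotypes coded 0/1 (1 = alternate allele),
    one entry per chromosome, in the same order at both loci.
    """
    a = np.asarray(hap_a, dtype=float)
    b = np.asarray(hap_b, dtype=float)
    p_a = a.mean()           # alternate-allele frequency at locus A
    p_b = b.mean()           # alternate-allele frequency at locus B
    p_ab = (a * b).mean()    # frequency of the haplotype carrying both alternates
    d = p_ab - p_a * p_b     # classic LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r2 is undefined when either locus is monomorphic
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2

# Toy example: identical haplotype patterns are in complete LD
d, r2 = pairwise_ld([0, 0, 1, 1], [0, 0, 1, 1])
```

In practice an LD-pruning step removes one SNP of any pair whose r2 exceeds a chosen threshold, which is the step the abstract argues can hide pairs of interest.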


Subject(s)
Epistasis, Genetic , Genome-Wide Association Study , Linkage Disequilibrium/genetics , Genotype , Biological Specimen Banks , United Kingdom , Polymorphism, Single Nucleotide/genetics
3.
Proc Natl Acad Sci U S A ; 119(21): e2123000119, 2022 05 24.
Article in English | MEDLINE | ID: mdl-35580180

ABSTRACT

Human genomic diversity has been shaped by both ancient and ongoing challenges from viruses. The current coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has had a devastating impact on population health. However, genetic diversity and evolutionary forces impacting host genes related to SARS-CoV-2 infection are not well understood. We investigated global patterns of genetic variation and signatures of natural selection at host genes relevant to SARS-CoV-2 infection (angiotensin converting enzyme 2 [ACE2], transmembrane protease serine 2 [TMPRSS2], dipeptidyl peptidase 4 [DPP4], and lymphocyte antigen 6 complex locus E [LY6E]). We analyzed data from 2,012 ethnically diverse Africans and 15,977 individuals of European and African ancestry with electronic health records and integrated with global data from the 1000 Genomes Project. At ACE2, we identified 41 nonsynonymous variants that were rare in most populations, several of which impact protein function. However, three nonsynonymous variants (rs138390800, rs147311723, and rs145437639) were common among central African hunter-gatherers from Cameroon (minor allele frequency 0.083 to 0.164) and are on haplotypes that exhibit signatures of positive selection. We identify signatures of selection impacting variation at regulatory regions influencing ACE2 expression in multiple African populations. At TMPRSS2, we identified 13 amino acid changes that are adaptive and specific to the human lineage compared with the chimpanzee genome. Genetic variants that are targets of natural selection are associated with clinical phenotypes common in patients with COVID-19. Our study provides insights into global variation at host genes related to SARS-CoV-2 infection, which have been shaped by natural selection in some populations, possibly due to prior viral infections.


Subject(s)
COVID-19 , Africa , Angiotensin-Converting Enzyme 2/genetics , COVID-19/genetics , Genetic Variation , Humans , Phenotype , SARS-CoV-2/genetics , Selection, Genetic
4.
Am J Hum Genet ; 108(3): 482-501, 2021 03 04.
Article in English | MEDLINE | ID: mdl-33636100

ABSTRACT

Rare monogenic disorders of the primary cilium, termed ciliopathies, are characterized by extreme presentations of otherwise common diseases, such as diabetes, hepatic fibrosis, and kidney failure. However, despite a recent revolution in our understanding of the cilium's role in rare disease pathogenesis, the organelle's contribution to common disease remains largely unknown. Hypothesizing that common genetic variants within Mendelian ciliopathy genes might contribute to common complex diseases pathogenesis, we performed association studies of 16,874 common genetic variants across 122 ciliary genes with 12 quantitative laboratory traits characteristic of ciliopathy syndromes in 452,593 individuals in the UK Biobank. We incorporated tissue-specific gene expression analysis, expression quantitative trait loci, and Mendelian disease phenotype information into our analysis and replicated our findings in meta-analysis. 101 statistically significant associations were identified across 42 of the 122 examined ciliary genes (including eight novel replicating associations). These ciliary genes were widely expressed in tissues relevant to the phenotypes being studied, and eQTL analysis revealed strong evidence for correlation between ciliary gene expression levels and laboratory traits. Perhaps most interestingly, our analysis identified different ciliary subcompartments as being specifically associated with distinct sets of phenotypes. Taken together, our data demonstrate the utility of a Mendelian pathway-based approach to genomic association studies, challenge the widely held belief that the cilium is an organelle important mainly in development and in rare syndromic disease pathogenesis, and provide a framework for the continued integration of common and rare disease genetics to provide insight into the pathophysiology of human diseases of immense public health burden.


Subject(s)
Cilia/genetics , Ciliopathies/genetics , Genetic Diseases, Inborn/genetics , Rare Diseases/genetics , Cilia/pathology , Ciliopathies/pathology , Genetic Association Studies , Genetic Diseases, Inborn/pathology , Genetic Predisposition to Disease , Genomics , Humans , Phenotype , Quantitative Trait Loci/genetics , Rare Diseases/pathology
5.
PLoS Genet ; 17(6): e1009534, 2021 06.
Article in English | MEDLINE | ID: mdl-34086673

ABSTRACT

Assumptions are made about the genetic model of single nucleotide polymorphisms (SNPs) when choosing a traditional genetic encoding: additive, dominant, and recessive. Furthermore, SNPs across the genome are unlikely to demonstrate identical genetic models. However, running SNP-SNP interaction analyses with every combination of encodings raises the multiple testing burden. Here, we present a novel and flexible encoding for genetic interactions, the elastic data-driven genetic encoding (EDGE), in which SNPs are assigned a heterozygous value based on the genetic model they demonstrate in a dataset prior to interaction testing. We assessed the power of EDGE to detect genetic interactions using 29 combinations of simulated genetic models and found it outperformed the traditional encoding methods across 10%, 30%, and 50% minor allele frequencies (MAFs). Further, EDGE maintained a low false-positive rate, while additive and dominant encodings demonstrated inflation. We evaluated EDGE and the traditional encodings with genetic data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes: age-related macular degeneration (AMD), age-related cataract, glaucoma, type 2 diabetes (T2D), and resistant hypertension. A multi-encoding genome-wide association study (GWAS) for each phenotype was performed using the traditional encodings, and the top results of the multi-encoding GWAS were considered for SNP-SNP interaction using the traditional encodings and EDGE. EDGE identified a novel SNP-SNP interaction for age-related cataract that no other method identified: rs7787286 (MAF: 0.041; intergenic region of chromosome 7)-rs4695885 (MAF: 0.34; intergenic region of chromosome 4) with a Bonferroni LRT p of 0.018. A SNP-SNP interaction was found in data from the UK Biobank within 25 kb of these SNPs using the recessive encoding: rs60374751 (MAF: 0.030) and rs6843594 (MAF: 0.34) (Bonferroni LRT p: 0.026). We recommend using EDGE to flexibly detect interactions between SNPs exhibiting diverse action.
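
Illustrative note (not part of the abstract): the abstract says each SNP receives a data-driven heterozygous value but does not give the formula. The sketch below assumes one plausible reading: fit a codominant model and take the ratio of the heterozygous effect to the homozygous-alternate effect as the heterozygous code (0.5 recovers additive, 1 dominant, 0 recessive). For a quantitative trait without covariates, those codominant effects are simply differences in genotype-group means, which is what the code uses. The names `edge_alpha` and `edge_encode` are hypothetical, and this is not the authors' implementation.

```python
import numpy as np

def edge_alpha(genotypes, trait):
    """Estimate a heterozygous code from a codominant fit (assumed reading).

    genotypes: 0/1/2 alternate-allele counts; trait: quantitative values.
    With no covariates, codominant regression effects equal differences
    in group means relative to the homozygous-reference group.
    """
    g = np.asarray(genotypes)
    y = np.asarray(trait, dtype=float)
    mean_ref = y[g == 0].mean()
    beta_het = y[g == 1].mean() - mean_ref   # heterozygous effect
    beta_hom = y[g == 2].mean() - mean_ref   # homozygous-alternate effect
    # Unstable when beta_hom is close to zero; a real analysis would guard this.
    return beta_het / beta_hom

def edge_encode(genotypes, alpha):
    """Recode 0/1/2 genotypes as 0 / alpha / 1."""
    g = np.asarray(genotypes)
    return np.where(g == 1, alpha, np.where(g == 2, 1.0, 0.0))

# Toy example: a perfectly additive trait yields alpha = 0.5
g = [0, 0, 1, 1, 2, 2]
alpha = edge_alpha(g, [0.0, 0.0, 1.0, 1.0, 2.0, 2.0])
encoded = edge_encode(g, alpha)
```

The recoded SNPs would then enter the interaction model in place of the fixed additive, dominant, or recessive codes, so only one interaction test per SNP pair is needed.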


Subject(s)
Models, Genetic , Cataract/genetics , Datasets as Topic , Diabetes Mellitus, Type 2/genetics , Gene Frequency , Genome-Wide Association Study , Glaucoma/genetics , Humans , Hypertension/genetics , Macular Degeneration/genetics , Phenotype , Polymorphism, Single Nucleotide
6.
Proc Natl Acad Sci U S A ; 118(42)2021 10 19.
Article in English | MEDLINE | ID: mdl-34593624

ABSTRACT

The coronaviruses responsible for severe acute respiratory syndrome (SARS-CoV), COVID-19 (SARS-CoV-2), Middle East respiratory syndrome-CoV, and other coronavirus infections express a nucleocapsid protein (N) that is essential for viral replication, transcription, and virion assembly. Phosphorylation of N from SARS-CoV by glycogen synthase kinase 3 (GSK-3) is required for its function and inhibition of GSK-3 with lithium impairs N phosphorylation, viral transcription, and replication. Here we report that the SARS-CoV-2 N protein contains GSK-3 consensus sequences and that this motif is conserved in diverse coronaviruses, raising the possibility that SARS-CoV-2 may be sensitive to GSK-3 inhibitors, including lithium. We conducted a retrospective analysis of lithium use in patients from three major health systems who were PCR-tested for SARS-CoV-2. We found that patients taking lithium have a significantly reduced risk of COVID-19 (odds ratio = 0.51 [0.35-0.74], P = 0.005). We also show that the SARS-CoV-2 N protein is phosphorylated by GSK-3. Knockout of GSK3A and GSK3B demonstrates that GSK-3 is essential for N phosphorylation. Alternative GSK-3 inhibitors block N phosphorylation and impair replication in SARS-CoV-2 infected lung epithelial cells in a cell-type-dependent manner. Targeting GSK-3 may therefore provide an approach to treat COVID-19 and future coronavirus outbreaks.


Subject(s)
COVID-19/prevention & control , Coronavirus Nucleocapsid Proteins/metabolism , Glycogen Synthase Kinase 3/antagonists & inhibitors , Lithium Compounds/therapeutic use , Adult , Aged , Female , Glycogen Synthase Kinase 3/metabolism , HEK293 Cells , Humans , Lithium Compounds/pharmacology , Male , Middle Aged , Molecular Targeted Therapy , Phosphoproteins/metabolism , Phosphorylation/drug effects , Retrospective Studies
7.
J Transl Med ; 21(1): 415, 2023 06 26.
Article in English | MEDLINE | ID: mdl-37365631

ABSTRACT

BACKGROUND: Computational drug repurposing is crucial for identifying candidate therapeutic medications to address the urgent need for developing treatments for newly emerging infectious diseases. The recent COVID-19 pandemic has taught us the importance of rapidly discovering candidate drugs and providing them to medical and pharmaceutical experts for further investigation. Network-based approaches can provide repurposable drugs quickly by leveraging comprehensive relationships among biological components. However, in the case of a newly emerging disease, applying repurposing methods with only pre-existing knowledge networks may prove inadequate due to the insufficiency of information flow caused by the novel nature of the disease. METHODS: We proposed a network-based complementary linkage method for drug repurposing to solve the lack of incoming new disease-specific information in knowledge networks. We simulate our method under the controlled repurposing scenario that we faced in the early stage of the COVID-19 pandemic. First, the disease-gene-drug multi-layered network was constructed as the backbone network by fusing comprehensive knowledge databases. Then, complementary information for COVID-19, containing data on 18 comorbid diseases and 17 relevant proteins, was collected from publications or preprint servers as of May 2020. We estimated connections between the novel COVID-19 node and the backbone network to construct a complemented network. Network-based drug scoring for COVID-19 was performed by applying graph-based semi-supervised learning, and the resulting scores were used to validate prioritized drugs for population-scale electronic health records-based medication analyses. RESULTS: The backbone networks consisted of 591 diseases, 26,681 proteins, and 2,173 drug nodes based on pre-pandemic knowledge. After incorporating the 35 entities comprised of complemented information into the backbone network, drug scoring screened the top 30 potential repurposable drugs for COVID-19. The prioritized drugs were subsequently analyzed in electronic health records obtained from patients in the Penn Medicine COVID-19 Registry as of October 2021 and 8 of these were found to be statistically associated with a COVID-19 phenotype. CONCLUSION: We found that 8 of the 30 drugs identified by graph-based scoring on complemented networks as potential candidates for COVID-19 repurposing were additionally supported by real-world patient data in follow-up analyses. These results show that our network-based complementary linkage method and drug scoring algorithm are promising strategies for identifying candidate repurposable drugs when newly emerging disease outbreaks occur.
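
Illustrative note (not part of the abstract): the abstract names graph-based semi-supervised learning for drug scoring without specifying the algorithm. As a hedged sketch, one common member of that family is closed-form label propagation on a symmetrically normalized adjacency matrix, shown below on a toy three-node chain. The function name `propagate_scores`, the damping value `alpha=0.5`, and the toy graph are assumptions; the study's actual scoring method may differ.

```python
import numpy as np

def propagate_scores(adjacency, seed_labels, alpha=0.5):
    """Closed-form label propagation: f = (I - alpha * S)^-1 y.

    adjacency: symmetric non-negative weight matrix of the network.
    seed_labels: 1 for seed nodes (e.g. the disease node), 0 elsewhere.
    alpha: damping in (0, 1); larger values spread scores further.
    """
    w = np.asarray(adjacency, dtype=float)
    y = np.asarray(seed_labels, dtype=float)
    degree = w.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    connected = degree > 0
    inv_sqrt[connected] = 1.0 / np.sqrt(degree[connected])
    # Symmetric normalization S = D^-1/2 W D^-1/2
    s = w * inv_sqrt[:, None] * inv_sqrt[None, :]
    return np.linalg.solve(np.eye(len(y)) - alpha * s, y)

# Toy chain: seed (node 0) - intermediate (node 1) - candidate (node 2)
chain = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
scores = propagate_scores(chain, [1, 0, 0])
```

Nodes closer to the seed receive higher scores, which is the property a ranking of candidate drug nodes would rely on.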


Subject(s)
COVID-19 , Humans , COVID-19/epidemiology , Pandemics , Algorithms , Proteins , Drug Repositioning/methods
8.
Am J Hum Genet ; 102(4): 592-608, 2018 04 05.
Article in English | MEDLINE | ID: mdl-29606303

ABSTRACT

Most phenome-wide association studies (PheWASs) to date have used a small to moderate number of SNPs for association with phenotypic data. We performed a large-scale single-cohort PheWAS, using electronic health record (EHR)-derived case-control status for 541 diagnoses using International Classification of Disease version 9 (ICD-9) codes and 25 median clinical laboratory measures. We calculated associations between these diagnoses and traits with ∼630,000 common frequency SNPs with minor allele frequency > 0.01 for 38,662 individuals. In this landscape PheWAS, we explored results within diseases and traits, comparing results to those previously reported in genome-wide association studies (GWASs), as well as previously published PheWASs. We further leveraged the context of functional impact from protein-coding to regulatory regions, providing a deeper interpretation of these associations. The comprehensive nature of this PheWAS allows for novel hypothesis generation, the identification of phenotypes for further study for future phenotypic algorithm development, and identification of cross-phenotype associations.


Subject(s)
Clinical Laboratory Techniques , Electronic Health Records , Genome-Wide Association Study , International Classification of Diseases , Chromatin/genetics , DNA, Intergenic/genetics , Gene Expression Regulation , Genome, Human , Haplotypes/genetics , Humans , Molecular Sequence Annotation , Open Reading Frames/genetics , Phenotype , Reproducibility of Results , Sequence Analysis, RNA
9.
PLoS Genet ; 12(9): e1006186, 2016 09.
Article in English | MEDLINE | ID: mdl-27623284

ABSTRACT

Primary open angle glaucoma (POAG) is a complex disease and is one of the leading causes of blindness worldwide. Genome-wide association studies have successfully identified several common variants associated with glaucoma; however, most of these variants only explain a small proportion of the genetic risk. Apart from the standard approach to identify main effects of variants across the genome, it is believed that gene-gene interactions can help elucidate part of the missing heritability by allowing for the test of interactions between genetic variants to mimic the complex nature of biology. To explain the etiology of glaucoma, we first performed a genome-wide association study (GWAS) on glaucoma case-control samples obtained from electronic medical records (EMR) to establish the utility of EMR data in detecting non-spurious and relevant associations; this analysis was aimed at confirming already known associations with glaucoma and validating the EMR derived glaucoma phenotype. Our findings from GWAS suggest consistent evidence of several known associations in POAG. We then performed an interaction analysis for variants found to be marginally associated with glaucoma (SNPs with main effect p-value <0.01) and observed interesting findings in the electronic MEdical Records and GEnomics (eMERGE) Network dataset. Genes from the top epistatic interactions from eMERGE data (Likelihood Ratio Test i.e. LRT p-value <1e-05) were then tested for replication in the NEIGHBOR consortium dataset. To replicate our findings, we performed a gene-based SNP-SNP interaction analysis in NEIGHBOR and observed significant gene-gene interactions (p-value <0.001) among the top 17 gene-gene models identified in the discovery phase. Variants from gene-gene interaction analysis that we found to be associated with POAG explain 3.5% of additional genetic variance in the eMERGE dataset above what is explained by the SNPs in genes that are replicated from previous GWAS studies (which was only 2.1% variance explained in the eMERGE dataset); in the NEIGHBOR dataset, adding replicated SNPs from gene-gene interaction analysis explains 3.4% of total variance whereas GWAS SNPs alone explain only 2.8% of variance. Exploring gene-gene interactions may provide additional insights into many complex traits when explored in properly designed and powered association studies.


Subject(s)
Epistasis, Genetic , Glaucoma, Open-Angle/genetics , Polymorphism, Single Nucleotide , Case-Control Studies , Female , Genome-Wide Association Study , Humans , Male , Phenotype
10.
BMC Bioinformatics ; 19(1): 120, 2018 04 04.
Article in English | MEDLINE | ID: mdl-29618318

ABSTRACT

BACKGROUND: Phenome-wide association studies (PheWAS) are a high-throughput approach to evaluate comprehensive associations between genetic variants and a wide range of phenotypic measures. PheWAS has varying sample sizes for quantitative traits, and variable numbers of cases and controls for binary traits across the many phenotypes of interest, which can affect the statistical power to detect associations. The motivation of this study is to investigate the various parameters which affect the estimation of statistical power in PheWAS, including sample size, case-control ratio, minor allele frequency, and disease penetrance. RESULTS: We performed a PheWAS simulation study, where we investigated variations in statistical power based on different parameters, such as overall sample size, number of cases, case-control ratio, minor allele frequency, and disease penetrance. The simulation was performed on both binary and quantitative phenotypic measures. Our simulation on binary traits suggests that the number of cases has more impact on statistical power than the case to control ratio; also, we found that a sample size of 200 cases or more maintains the statistical power to identify associations for common variants. For quantitative traits, a sample size of 1000 or more individuals performed best in the power calculations. We focused on common genetic variants (MAF > 0.01) in this study; however, in future studies, we will be extending this effort to perform similar simulations on rare variants. CONCLUSIONS: This study provides a series of PheWAS simulation analyses that can be used to estimate statistical power for some potential scenarios. These results can be used to provide guidelines for appropriate study design for future PheWAS analyses.
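
Illustrative note (not part of the abstract): the abstract describes estimating power by simulation across sample size, case-control ratio, and minor allele frequency. A minimal sketch of that idea for a binary trait follows. It assumes an allelic two-proportion z-test and a nominal two-sided 0.05 threshold (`z_crit=1.96`), with no multiple-testing correction; the study's own simulation design, test, and thresholds are not given in the abstract, and `simulate_power` is a hypothetical name.

```python
import numpy as np

def simulate_power(n_cases, n_controls, maf_controls, maf_cases,
                   n_sims=2000, z_crit=1.96, seed=0):
    """Fraction of simulated studies in which the allelic test rejects.

    Alternate-allele counts are drawn from binomials over 2N chromosomes
    per group; the test is a pooled two-proportion z-test on allele
    frequencies.
    """
    rng = np.random.default_rng(seed)
    alt_cases = rng.binomial(2 * n_cases, maf_cases, size=n_sims)
    alt_controls = rng.binomial(2 * n_controls, maf_controls, size=n_sims)
    p_cases = alt_cases / (2 * n_cases)
    p_controls = alt_controls / (2 * n_controls)
    pooled = (alt_cases + alt_controls) / (2 * (n_cases + n_controls))
    se = np.sqrt(pooled * (1 - pooled)
                 * (1 / (2 * n_cases) + 1 / (2 * n_controls)))
    with np.errstate(divide="ignore", invalid="ignore"):
        # Treat degenerate replicates (no variation) as non-rejections
        z = np.where(se > 0, (p_cases - p_controls) / se, 0.0)
    return float(np.mean(np.abs(z) > z_crit))

# Same number of controls and same effect, but few versus many cases
power_few_cases = simulate_power(50, 1000, 0.20, 0.25)
power_many_cases = simulate_power(500, 1000, 0.20, 0.25)
```

Holding the controls fixed and varying only the number of cases is one way to probe the abstract's observation that case count matters more than the case-control ratio.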


Subject(s)
Computer Simulation , Disease/genetics , Genetic Association Studies , Genome-Wide Association Study , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Algorithms , Humans
11.
Genet Epidemiol ; 39(5): 376-84, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25982363

ABSTRACT

Bioinformatics approaches to examine gene-gene models provide a means to discover interactions between multiple genes that underlie complex disease. Extensive computational demands and adjusting for multiple testing make uncovering genetic interactions a challenge. Here, we address these issues using our knowledge-driven filtering method, Biofilter, to identify putative single nucleotide polymorphism (SNP) interaction models for cataract susceptibility, thereby reducing the number of models for analysis. Models were evaluated in 3,377 European Americans (1,185 controls, 2,192 cases) from the Marshfield Clinic, a study site of the Electronic Medical Records and Genomics (eMERGE) Network, using logistic regression. All statistically significant models from the Marshfield Clinic were then evaluated in an independent dataset of 4,311 individuals (742 controls, 3,569 cases), using independent samples from additional study sites in the eMERGE Network: Mayo Clinic, Group Health/University of Washington, Vanderbilt University Medical Center, and Geisinger Health System. Eighty-three SNP-SNP models replicated in the independent dataset at likelihood ratio test P < 0.05. Among the most significant replicating models was rs12597188 (intron of CDH1)-rs11564445 (intron of CTNNB1). These genes are known to be involved in processes that include: cell-to-cell adhesion signaling, cell-cell junction organization, and cell-cell communication. Further Biofilter analysis of all replicating models revealed a number of common functions among the genes harboring the 83 replicating SNP-SNP models, which included signal transduction and PI3K-Akt signaling pathway. These findings demonstrate the utility of Biofilter as a biology-driven method, applicable for any genome-wide association study dataset.
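
Illustrative note (not part of the abstract): the abstract evaluates SNP-SNP models with logistic regression and a likelihood ratio test (LRT). A sketch of that test for one SNP pair is given below, under the assumptions of additive 0/1/2 coding, no covariates, and a one-degree-of-freedom comparison of a model with and without the product term. The logistic fit is a plain Newton-Raphson loop written out to keep the example self-contained; function names are hypothetical and this is not the Biofilter pipeline itself.

```python
import math
import numpy as np

def logistic_loglik(X, y, n_iter=25):
    """Maximized log-likelihood of a logistic model via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        hessian = X.T @ (X * w[:, None])
        beta += np.linalg.solve(hessian, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def interaction_lrt(snp1, snp2, y):
    """LRT for the snp1*snp2 term: returns (statistic, p-value), 1 df."""
    snp1 = np.asarray(snp1, dtype=float)
    snp2 = np.asarray(snp2, dtype=float)
    y = np.asarray(y, dtype=float)
    ones = np.ones_like(snp1)
    reduced = np.column_stack([ones, snp1, snp2])
    full = np.column_stack([ones, snp1, snp2, snp1 * snp2])
    stat = 2.0 * (logistic_loglik(full, y) - logistic_loglik(reduced, y))
    stat = max(stat, 0.0)  # guard against tiny negative values from rounding
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2))
    return stat, math.erfc(math.sqrt(stat / 2.0))

# Simulated example with a built-in interaction effect (hypothetical values)
rng = np.random.default_rng(1)
n = 4000
g1 = rng.binomial(2, 0.3, n)
g2 = rng.binomial(2, 0.3, n)
logit = -1.0 + 0.1 * g1 + 0.1 * g2 + 0.8 * g1 * g2
status = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)
lrt_stat, lrt_p = interaction_lrt(g1, g2, status)
```

Filtering candidate pairs beforehand, as the abstract describes, matters because this test would otherwise be repeated for every pair of SNPs genome-wide.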


Subject(s)
Cataract/genetics , Computational Biology/methods , Data Interpretation, Statistical , Electronic Health Records , Gene-Environment Interaction , Models, Genetic , Age Factors , Case-Control Studies , Cell Adhesion , Female , Genome-Wide Association Study , Genomics/methods , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide/genetics , Population Groups/genetics , Signal Transduction , Software
12.
Pac Symp Biocomput ; 29: 611-626, 2024.
Article in English | MEDLINE | ID: mdl-38160310

ABSTRACT

Polygenic risk scores (PRS) have predominantly been derived from genome-wide association studies (GWAS) conducted in European ancestry (EUR) individuals. In this study, we present an in-depth evaluation of PRS based on multi-ancestry GWAS for five cardiometabolic phenotypes in the Penn Medicine BioBank (PMBB) followed by a phenome-wide association study (PheWAS). We examine the PRS performance across all individuals and separately in African ancestry (AFR) and EUR ancestry groups. For AFR individuals, PRS derived using the multi-ancestry LD panel showed a higher effect size for four out of five PRSs (DBP, SBP, T2D, and BMI) than those derived from the AFR LD panel. In contrast, for EUR individuals, the multi-ancestry LD panel PRS demonstrated a higher effect size for two out of five PRSs (SBP and T2D) compared to the EUR LD panel. These findings underscore the potential benefits of utilizing a multi-ancestry LD panel for PRS derivation in diverse genetic backgrounds and demonstrate overall robustness in all individuals. Our results also revealed significant associations between PRS and various phenotypic categories. For instance, CAD PRS was linked with 18 phenotypes in AFR and 82 in EUR, while T2D PRS correlated with 84 phenotypes in AFR and 78 in EUR. Notably, associations like hyperlipidemia, renal failure, atrial fibrillation, coronary atherosclerosis, obesity, and hypertension were observed across different PRSs in both AFR and EUR groups, with varying effect sizes and significance levels. However, in AFR individuals, the strength and number of PRS associations with other phenotypes were generally reduced compared to EUR individuals. Our study underscores the need for future research to prioritize 1) conducting GWAS in diverse ancestry groups and 2) creating a cosmopolitan PRS methodology that is universally applicable across all genetic backgrounds. Such advances will foster a more equitable and personalized approach to precision medicine.
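
Illustrative note (not part of the abstract): a polygenic risk score is, at its core, a weighted sum of a person's allele dosages, with weights taken from GWAS effect estimates. The sketch below shows only that final scoring step and a within-cohort standardization. Deriving the weights with a particular LD reference panel, which is what the abstract compares, is not shown; the dosages, weights, and function names are hypothetical.

```python
import numpy as np

def polygenic_risk_score(dosages, weights):
    """Weighted sum of allele dosages.

    dosages: (n_individuals, n_variants) effect-allele counts in [0, 2].
    weights: per-variant effect sizes (e.g. log odds ratios).
    """
    g = np.asarray(dosages, dtype=float)
    w = np.asarray(weights, dtype=float)
    return g @ w

def standardize(scores):
    """Center and scale scores so effect sizes are per standard deviation."""
    s = np.asarray(scores, dtype=float)
    return (s - s.mean()) / s.std()

# Two individuals, three variants
raw = polygenic_risk_score([[0, 1, 2],
                            [2, 0, 1]],
                           [0.1, -0.2, 0.3])
z_scores = standardize(raw)
```

Standardizing within each ancestry group is one reason effect sizes can be compared across groups, although the abstract does not state how its scores were scaled.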


Subject(s)
Diabetes Mellitus, Type 2 , Hypertension , Humans , Genetic Risk Score , Genome-Wide Association Study/methods , Genetic Predisposition to Disease , Precision Medicine , Multifactorial Inheritance , Computational Biology , Phenotype , Hypertension/genetics , Diabetes Mellitus, Type 2/genetics , Risk Factors
13.
AMIA Jt Summits Transl Sci Proc ; 2023: 487-496, 2023.
Article in English | MEDLINE | ID: mdl-37350926

ABSTRACT

Modeling with longitudinal electronic health record (EHR) data proves challenging given the high dimensionality, redundancy, and noise captured in EHR. In order to improve precision medicine strategies and identify predictors of disease risk in advance, evaluating meaningful patient disease trajectories is essential. In this study, we develop the algorithm DiseasE Trajectory fEature extraCTion (DETECT) for feature extraction and trajectory generation in high-throughput temporal EHR data. This algorithm can 1) simulate longitudinal individual-level EHR data, specified to user parameters of scale, complexity, and noise and 2) use a convergent relative risk framework to test intermediate codes occurring between specified index code(s) and outcome code(s) to determine if they are predictive features of the outcome. Temporal range can be specified to investigate predictors occurring during a specific period of time prior to onset of the outcome. We benchmarked our method on simulated data and generated real-world disease trajectories using DETECT in a cohort of 145,575 individuals diagnosed with hypertension in Penn Medicine EHR for severe cardiometabolic outcomes.

14.
J Am Heart Assoc ; 12(5): e026561, 2023 03 07.
Article in English | MEDLINE | ID: mdl-36846987

ABSTRACT

Background Cardiometabolic diseases are highly comorbid, but their relationship with female-specific or overwhelmingly female-predominant health conditions (breast cancer, endometriosis, pregnancy complications) is understudied. This study aimed to estimate the cross-trait genetic overlap and influence of genetic burden of cardiometabolic traits on health conditions unique to women. Methods and Results Using electronic health record data from 71 008 ancestrally diverse women, we examined relationships between 23 obstetrical/gynecological conditions and 4 cardiometabolic phenotypes (body mass index, coronary artery disease, type 2 diabetes, and hypertension) by performing 4 analyses: (1) cross-trait genetic correlation analyses to compare genetic architecture, (2) polygenic risk score-based association tests to characterize shared genetic effects on disease risk, (3) Mendelian randomization for significant associations to assess cross-trait causal relationships, and (4) chronology analyses to visualize the timeline of events unique to groups of women with high and low genetic burden for cardiometabolic traits and highlight the disease prevalence in risk groups by age. We observed 27 significant associations between cardiometabolic polygenic scores and obstetrical/gynecological conditions (body mass index and endometrial cancer, body mass index and polycystic ovarian syndrome, type 2 diabetes and gestational diabetes, type 2 diabetes and polycystic ovarian syndrome). Mendelian randomization analysis provided additional evidence of independent causal effects. We also identified an inverse association between coronary artery disease and breast cancer. High cardiometabolic polygenic scores were associated with early development of polycystic ovarian syndrome and gestational hypertension. Conclusions We conclude that polygenic susceptibility to cardiometabolic traits is associated with elevated risk of certain female-specific health conditions.


Subject(s)
Coronary Artery Disease , Diabetes Mellitus, Type 2 , Polycystic Ovary Syndrome , Humans , Female , Diabetes Mellitus, Type 2/epidemiology , Diabetes Mellitus, Type 2/genetics , Coronary Artery Disease/epidemiology , Coronary Artery Disease/genetics , Polycystic Ovary Syndrome/epidemiology , Polycystic Ovary Syndrome/genetics , Risk Factors , Phenotype
15.
Cell Rep Med ; 3(12): 100855, 2022 12 20.
Article in English | MEDLINE | ID: mdl-36513072

ABSTRACT

Nonalcoholic fatty liver disease is common and highly heritable. Genetic studies of hepatic fat have not sufficiently addressed non-European and rare variants. In a medical biobank, we quantitate hepatic fat from clinical computed tomography (CT) scans via deep learning in 10,283 participants with whole-exome sequences available. We conduct exome-wide associations of single variants and rare predicted loss-of-function (pLOF) variants with CT-based hepatic fat and perform cross-modality replication in the UK Biobank (UKB) by linking whole-exome sequences to MRI-based hepatic fat. We confirm single variants previously associated with hepatic fat and identify several additional variants, including two (FGD5 H600Y and CITED2 S198_G199del) that replicated in UKB. A burden of rare pLOF variants in LMF2 is associated with increased hepatic fat and replicates in UKB. Quantitative phenotypes generated from clinical imaging studies and intersected with genomic data in medical biobanks have the potential to identify molecular pathways associated with human traits and disease.


Subject(s)
Exome , Non-alcoholic Fatty Liver Disease , Humans , Exome/genetics , Biological Specimen Banks , Phenotype , Tomography, X-Ray Computed , Non-alcoholic Fatty Liver Disease/diagnostic imaging , Non-alcoholic Fatty Liver Disease/genetics , Repressor Proteins/genetics , Trans-Activators/genetics
16.
Curr Protoc ; 2(11): e603, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36441943

ABSTRACT

Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of many complex diseases. Regardless of the context, the practical utility of this information ultimately depends upon the quality of the data used for statistical analyses. Quality control (QC) procedures for GWAS are constantly evolving. Here, we enumerate some of the challenges in QC of genotyped GWAS data and describe the approaches involving genotype imputation of a sample dataset along with post-imputation quality assurance, thereby minimizing potential bias and error in GWAS results. We discuss common issues associated with QC of the GWAS data (genotyped and imputed), including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We provide detailed guidelines along with a sample dataset to suggest current best practices and discuss areas of ongoing and future research. © 2022 Wiley Periodicals LLC.
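
To make the "marker quality" item concrete, here is a minimal sketch of two common per-marker checks, call rate and minor allele frequency (MAF). The function names and the thresholds (0.95 and 0.01) are illustrative assumptions, not values taken from the protocol. Genotypes are assumed to be coded as 0, 1, or 2 copies of the alternate allele, with `None` for a missing call.

```python
from typing import List, Optional

Genotype = Optional[int]  # 0, 1, or 2 alternate-allele copies; None = missing call


def call_rate(genotypes: List[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype at one marker."""
    if not genotypes:
        return 0.0
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes: List[Genotype]) -> float:
    """Minor allele frequency among called genotypes (0.0 if none are called)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))  # each diploid sample carries 2 alleles
    return min(alt_freq, 1 - alt_freq)


def passes_marker_qc(
    genotypes: List[Genotype],
    min_call_rate: float = 0.95,  # assumed threshold, for illustration only
    min_maf: float = 0.01,        # assumed threshold, for illustration only
) -> bool:
    """Keep a marker only if it is well genotyped and not too rare."""
    return (
        call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    )
```

A real QC workflow would add further checks named in the abstract, such as sex chromosome anomalies, relatedness, and batch effects, and would normally rely on dedicated software rather than hand-written filters.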


Subject(s)
Genome-Wide Association Study , Research Design , Humans , Quality Control , Genotype , Sex Chromosome Aberrations
17.
J Pers Med ; 12(12)2022 Nov 29.
Article in English | MEDLINE | ID: mdl-36556195

ABSTRACT

The Penn Medicine BioBank (PMBB) is an electronic health record (EHR)-linked biobank at the University of Pennsylvania (Penn Medicine). A large variety of health-related information, ranging from diagnosis codes to laboratory measurements, imaging data, and lifestyle information, is integrated with genomic and biomarker data in the PMBB to facilitate discoveries and translational science. To date, 174,712 participants have been enrolled into the PMBB, approximately 30% of whom are of non-European ancestry, making it one of the most diverse medical biobanks. A median of seven years of longitudinal EHR data is available on participants, who also consent to being recontacted. Herein, we describe the operations and infrastructure of the PMBB, summarize the phenotypic architecture of the enrolled participants, and use body mass index (BMI) as a proof-of-concept quantitative phenotype for PheWAS, LabWAS, and GWAS. The substantial representation of African-American participants in the PMBB addresses the essential need to expand diversity in genetic and translational research. There is a critical need for a "medical biobank consortium" to facilitate replication, increase power for rare phenotypes and variants, and promote harmonized collaboration to optimize the potential for biological discovery and precision medicine.

18.
Nat Commun ; 13(1): 3428, 2022 06 14.
Article in English | MEDLINE | ID: mdl-35701404

ABSTRACT

Clinical and epidemiological studies have shown that circulatory system diseases and nervous system disorders often co-occur in patients. However, genetic susceptibility factors shared between these disease categories remain largely unknown. Here, we characterized pleiotropy across 107 circulatory system and 40 nervous system traits using an ensemble of methods in the eMERGE Network and UK Biobank. Using a formal test of pleiotropy, five genomic loci demonstrated statistically significant evidence of pleiotropy. We observed region-specific patterns of direction of genetic effects for the two disease categories, suggesting potential antagonistic and synergistic pleiotropy. Our findings provide insights into the relationship between circulatory system diseases and nervous system disorders which can provide context for future prevention and treatment strategies.


Subject(s)
Cardiovascular Diseases , Nervous System Diseases , Cardiovascular Diseases/genetics , Genetic Pleiotropy , Genetic Predisposition to Disease , Genome-Wide Association Study , Genomics , Humans , Nervous System Diseases/genetics , Polymorphism, Single Nucleotide
19.
BioData Min ; 14(1): 32, 2021 Jul 17.
Article in English | MEDLINE | ID: mdl-34273980

ABSTRACT

BACKGROUND: Genomic studies increasingly integrate expression quantitative trait loci (eQTL) information into their analysis pipelines, but few tools exist for the visualization of colocalization between eQTL and GWAS results. Those tools that do exist are limited in their analysis options, and do not integrate eQTL and GWAS information into a single figure panel, making the visualization of colocalization difficult. RESULTS: To address this issue, we developed the intuitive and user-friendly R package eQTpLot. eQTpLot takes as input standard GWAS and cis-eQTL summary statistics, and optional pairwise LD information, to generate a series of plots visualizing colocalization, correlation, and enrichment between eQTL and GWAS signals for a given gene-trait pair. With eQTpLot, investigators can easily generate a series of customizable plots clearly illustrating, for a given gene-trait pair: 1) colocalization between GWAS and eQTL signals, 2) correlation between GWAS and eQTL p-values, 3) enrichment of eQTLs among trait-significant variants, 4) the LD landscape of the locus in question, and 5) the relationship between the direction of effect of eQTL signals and the direction of effect of colocalizing GWAS peaks. These clear and comprehensive plots provide a unique view of eQTL-GWAS colocalization, allowing for a more complete understanding of the interaction between gene expression and trait associations. CONCLUSIONS: eQTpLot provides a unique, user-friendly, and intuitive means of visualizing eQTL and GWAS signal colocalization, incorporating novel features not found in other eQTL visualization software. We believe eQTpLot will prove a useful tool for investigators seeking a convenient and customizable visualization of eQTL and GWAS data colocalization. AVAILABILITY AND IMPLEMENTATION: The eQTpLot R package and tutorial are available at https://github.com/RitchieLab/eQTpLot.
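
The "correlation between GWAS and eQTL p-values" in item 2 can be sketched in a few lines. This is not the eQTpLot API (eQTpLot is an R package, and its exact computation is not described in this abstract). It is a hypothetical Python illustration that assumes p-values are keyed by variant ID and strictly greater than zero. The function name `neg_log10_p_correlation` is an assumption.

```python
import math
from typing import Dict


def neg_log10_p_correlation(gwas_p: Dict[str, float], eqtl_p: Dict[str, float]) -> float:
    """Pearson correlation of -log10(p) across variants present in both inputs."""
    shared = sorted(set(gwas_p) & set(eqtl_p))  # variants tested in both analyses
    if len(shared) < 2:
        raise ValueError("need at least two shared variants")
    # Assumes every p-value is > 0; math.log10 raises ValueError otherwise.
    x = [-math.log10(gwas_p[v]) for v in shared]
    y = [-math.log10(eqtl_p[v]) for v in shared]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / math.sqrt(var_x * var_y)
```

A correlation near 1 across a locus suggests that the variants most strongly associated with the trait are also those most strongly associated with expression, which is the pattern a colocalization plot is meant to make visible.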

20.
Nat Genet ; 53(7): 972-981, 2021 07.
Article in English | MEDLINE | ID: mdl-34140684

ABSTRACT

Plasma lipids are known heritable risk factors for cardiovascular disease, but increasing evidence also supports shared genetics with diseases of other organ systems. We devised a comprehensive three-phase framework to identify new lipid-associated genes and study the relationships among lipids, genotypes, gene expression and hundreds of complex human diseases from the Electronic Medical Records and Genomics (347 traits) and the UK Biobank (549 traits). Aside from 67 new lipid-associated genes with strong replication, we found evidence for pleiotropic SNPs/genes between lipids and diseases across the phenome. These include discordant pleiotropy in the HLA region between lipids and multiple sclerosis and putative causal paths between triglycerides and gout, among several others. Our findings give insights into the genetic basis of the relationship between plasma lipids and diseases on a phenome-wide scale and can provide context for future prevention and treatment strategies.


Subject(s)
Biomarkers , Disease Susceptibility , Electronic Health Records , Lipids/blood , Alleles , Biological Specimen Banks , Genetic Association Studies , Genetic Predisposition to Disease , Humans , Polymorphism, Single Nucleotide , Public Health Surveillance , Quantitative Trait, Heritable , United Kingdom