Results 1 - 20 of 21
1.
Nat Genet ; 56(7): 1412-1419, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38862854

ABSTRACT

Coronary artery disease (CAD) exists on a spectrum of disease represented by a combination of risk factors and pathogenic processes. An in silico score for CAD built using machine learning and clinical data in electronic health records captures disease progression, severity and underdiagnosis on this spectrum and could enhance genetic discovery efforts for CAD. Here we tested associations of rare and ultrarare coding variants with the in silico score for CAD in the UK Biobank, All of Us Research Program and BioMe Biobank. We identified associations in 17 genes; of these, 14 show at least moderate levels of prior genetic, biological and/or clinical support for CAD. We also observed an excess of ultrarare coding variants in 321 aggregated CAD genes, suggesting more ultrarare variant associations await discovery. These results expand our understanding of the genetic etiology of CAD and illustrate how digital markers can enhance genetic association investigations for complex diseases.


Subject(s)
Coronary Artery Disease , Genetic Predisposition to Disease , Machine Learning , Coronary Artery Disease/genetics , Humans , Exome/genetics , Exome Sequencing/methods , Genetic Variation , Genome-Wide Association Study/methods , Female , Single Nucleotide Polymorphism
2.
Am J Ophthalmol ; 267: 204-212, 2024 Nov.
Article in English | MEDLINE | ID: mdl-38906208

ABSTRACT

PURPOSE: Polygenic risk scores (PRSs) likely predict risk and prognosis of glaucoma. We compared the PRS performance for primary open-angle glaucoma (POAG), defined using International Classification of Diseases (ICD) codes vs manual medical record review. DESIGN: Retrospective cohort study. METHODS: We identified POAG cases in the Mount Sinai BioMe and Mass General Brigham (MGB) biobanks using ICD codes. We confirmed POAG based on optical coherence tomograms and visual fields. In a separate 5% sample, the absence of POAG was confirmed with intraocular pressure and cup-disc ratio criteria. We used genotype data and either self-reported glaucoma diagnoses or ICD-10 codes for glaucoma diagnoses from the UK Biobank and the lassosum method to compute a genome-wide POAG PRS. We compared the area under the curve (AUC) for POAG prediction based on ICD codes vs medical records. RESULTS: We reviewed 804 of 996 BioMe and 367 of 1006 MGB ICD-identified cases. In BioMe and MGB, respectively, positive predictive value was 53% and 55%; negative predictive value was 96% and 97%; sensitivity was 97% and 97%; and specificity was 44% and 53%. Adjusted PRS AUCs for POAG using ICD codes vs manual record review in BioMe were not statistically different (P ≥.21) by ancestry: 0.77 vs 0.75 for African, 0.80 vs 0.80 for Hispanic, and 0.81 vs 0.81 for European. Results were similar in MGB (P ≥.18): 0.72 vs 0.80 for African, 0.83 vs 0.86 for Hispanic, and 0.74 vs 0.73 for European. CONCLUSIONS: A POAG PRS performed similarly using either manual review or ICD codes in 2 electronic health record-linked biobanks; manual assessment of glaucoma status might not be necessary for some PRS studies. However, caution should be exercised when using ICD codes for glaucoma diagnosis given their low specificity (44%-53%) for manually confirmed cases of glaucoma.


Subject(s)
Electronic Health Records , Open-Angle Glaucoma , Intraocular Pressure , Humans , Open-Angle Glaucoma/genetics , Open-Angle Glaucoma/diagnosis , Retrospective Studies , Male , Female , Intraocular Pressure/physiology , Aged , Middle Aged , Biological Specimen Banks , Risk Factors , International Classification of Diseases , Visual Fields/physiology , Multifactorial Inheritance , Area Under Curve , Optical Coherence Tomography , Genome-Wide Association Study , Risk Assessment/methods , ROC Curve , Predictive Value of Tests , Genetic Risk Score
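
The validation measures reported in this abstract (positive and negative predictive value, sensitivity, specificity) all derive from a 2×2 table comparing ICD-code labels against manual record review. The following is a minimal sketch of those standard formulas; the class name, function name and counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts: ICD-code label versus chart review (the reference standard)."""
    true_pos: int   # ICD-positive, confirmed by review
    false_pos: int  # ICD-positive, not confirmed by review
    true_neg: int   # ICD-negative, confirmed absent by review
    false_neg: int  # ICD-negative, but disease found on review


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the four standard validation measures as proportions."""
    return {
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
        "npv": c.true_neg / (c.true_neg + c.false_neg),
    }


# Hypothetical counts, chosen only to illustrate the formulas.
example = ConfusionCounts(true_pos=90, false_pos=60, true_neg=40, false_neg=10)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With counts like these, high sensitivity can coexist with low specificity, which is the pattern the abstract cautions about.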
3.
Diabetes Care ; 47(6): 1042-1047, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38652672

ABSTRACT

OBJECTIVE: To identify genetic risk factors for incident cardiovascular disease (CVD) among people with type 2 diabetes (T2D). RESEARCH DESIGN AND METHODS: We conducted a multiancestry time-to-event genome-wide association study for incident CVD among people with T2D. We also tested 204 known coronary artery disease (CAD) variants for association with incident CVD. RESULTS: Among 49,230 participants with T2D, 8,956 had incident CVD events (event rate 18.2%). We identified three novel genetic loci for incident CVD: rs147138607 (near CACNA1E/ZNF648, hazard ratio [HR] 1.23, P = 3.6 × 10⁻⁹), rs77142250 (near HS3ST1, HR 1.89, P = 9.9 × 10⁻⁹), and rs335407 (near TFB1M/NOX3, HR 1.25, P = 1.5 × 10⁻⁸). Among 204 known CAD loci, 5 were associated with incident CVD in T2D (multiple comparison-adjusted P < 0.00024, 0.05/204). A standardized polygenic score of these 204 variants was associated with incident CVD with HR 1.14 (P = 1.0 × 10⁻¹⁶). CONCLUSIONS: The data point to novel and known genomic regions associated with incident CVD among individuals with T2D.


Subject(s)
Cardiovascular Diseases , Type 2 Diabetes Mellitus , Genome-Wide Association Study , Humans , Type 2 Diabetes Mellitus/genetics , Type 2 Diabetes Mellitus/epidemiology , Type 2 Diabetes Mellitus/complications , Cardiovascular Diseases/genetics , Cardiovascular Diseases/epidemiology , Female , Male , Middle Aged , Aged , Single Nucleotide Polymorphism
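
The two pieces of arithmetic stated in this abstract, the event rate and the multiple-comparison threshold written as 0.05/204, can be reproduced directly. The sketch below uses only figures quoted in the abstract; the function names are illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def event_rate(n_events: int, n_participants: int) -> float:
    """Proportion of participants who experienced the event."""
    return n_events / n_participants


# Figures quoted in the abstract: 204 known CAD variants tested,
# 8,956 incident CVD events among 49,230 participants with T2D.
threshold = bonferroni_threshold(0.05, 204)
rate = event_rate(8_956, 49_230)

print(f"Bonferroni threshold: {threshold:.6f}")
print(f"Event rate: {rate:.1%}")
```

The quotient 0.05/204 is about 0.000245, which the abstract reports as 0.00024.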
4.
Cell Rep Med ; 5(5): 101518, 2024 May 21.
Article in English | MEDLINE | ID: mdl-38642551

ABSTRACT

Population-based genomic screening may help diagnose individuals with disease-risk variants. Here, we perform a genome-first evaluation for nine disorders in 29,039 participants with linked exome sequences and electronic health records (EHRs). We identify 614 individuals with 303 pathogenic/likely pathogenic or predicted loss-of-function (P/LP/LoF) variants, yielding 644 observations; 487 observations (76%) lack a corresponding clinical diagnosis in the EHR. Upon further investigation, 75 clinically undiagnosed observations (15%) have evidence of symptomatic untreated disease, including familial hypercholesterolemia (3 of 6 [50%] undiagnosed observations with disease evidence) and breast cancer (23 of 106 [22%]). These genetic findings enable targeted phenotyping that reveals new diagnoses in previously undiagnosed individuals. Disease yield is greater with variants in penetrant genes for which disease is observed in carriers in an independent cohort. The prevalence of P/LP/LoF variants exceeds that of clinical diagnoses, and some clinically undiagnosed carriers are discovered to have disease. These results highlight the potential of population-based genomic screening.


Subject(s)
Exome Sequencing , Exome , Humans , Female , Male , Exome/genetics , Exome Sequencing/methods , Middle Aged , Adult , Inborn Genetic Diseases/genetics , Inborn Genetic Diseases/diagnosis , Inborn Genetic Diseases/epidemiology , Genetic Predisposition to Disease , Electronic Health Records , Genetic Testing/methods , Human Genome , Aged , Delivery of Health Care , Adolescent , Genomics/methods , Young Adult
6.
medRxiv ; 2023 Jul 28.
Article in English | MEDLINE | ID: mdl-37546893

ABSTRACT

BACKGROUND: Type 2 diabetes mellitus (T2D) confers a two- to three-fold increased risk of cardiovascular disease (CVD). However, the mechanisms underlying increased CVD risk among people with T2D are only partially understood. We hypothesized that a genetic association study among people with T2D at risk for developing incident cardiovascular complications could provide insights into molecular genetic aspects underlying CVD. METHODS: From 16 studies of the Cohorts for Heart & Aging Research in Genomic Epidemiology (CHARGE) Consortium, we conducted a multi-ancestry time-to-event genome-wide association study (GWAS) for incident CVD among people with T2D using Cox proportional hazards models. Incident CVD was defined based on a composite of coronary artery disease (CAD), stroke, and cardiovascular death that occurred at least one year after the diagnosis of T2D. Cohort-level estimated effect sizes were combined using inverse variance weighted fixed effects meta-analysis. We also tested 204 known CAD variants for association with incident CVD among patients with T2D. RESULTS: A total of 49,230 participants with T2D were included in the analyses (31,118 European ancestries and 18,112 non-European ancestries) which consisted of 8,956 incident CVD cases over a range of mean follow-up duration between 3.2 and 33.7 years (event rate 18.2%). We identified three novel, distinct genetic loci for incident CVD among individuals with T2D that reached the threshold for genome-wide significance (P<5.0×10⁻⁸): rs147138607 (intergenic variant between CACNA1E and ZNF648) with a hazard ratio (HR) 1.23, 95% confidence interval (CI) 1.15 - 1.32, P=3.6×10⁻⁹, rs11444867 (intergenic variant near HS3ST1) with HR 1.89, 95% CI 1.52 - 2.35, P=9.9×10⁻⁹, and rs335407 (intergenic variant between TFB1M and NOX3) HR 1.25, 95% CI 1.16 - 1.35, P=1.5×10⁻⁸.
Among 204 known CAD loci, 32 were associated with incident CVD in people with T2D with P<0.05, and 5 were significant after Bonferroni correction (P<0.00024, 0.05/204). A polygenic score of these 204 variants was significantly associated with incident CVD with HR 1.14 (95% CI 1.12 - 1.16) per 1 standard deviation increase (P=1.0×10⁻¹⁶). CONCLUSIONS: The data point to novel and known genomic regions associated with incident CVD among individuals with T2D.
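
The METHODS above state that cohort-level effect sizes were combined by inverse variance weighted fixed effects meta-analysis. The following is a minimal sketch of that general technique for a single variant, assuming each cohort supplies a log hazard ratio and its standard error. The function names and input values are hypothetical, and this is not the consortium's actual pipeline.

```python
import math


def fixed_effects_meta(betas: list[float], ses: list[float]) -> tuple[float, float]:
    """Pool per-cohort estimates with inverse-variance weights.

    betas: per-cohort log hazard ratios; ses: their standard errors.
    Returns the pooled log hazard ratio and its standard error.
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se**2 for se in ses]  # weight = 1 / variance
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


def hazard_ratio_ci(beta: float, se: float, z: float = 1.96) -> tuple[float, float, float]:
    """Convert a log hazard ratio to HR with an approximate 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical estimates from three cohorts for one variant.
pooled_beta, pooled_se = fixed_effects_meta([0.18, 0.25, 0.21], [0.08, 0.10, 0.06])
hr, lower, upper = hazard_ratio_ci(pooled_beta, pooled_se)
print(f"HR {hr:.2f} (95% CI {lower:.2f} - {upper:.2f})")
```

Cohorts with smaller standard errors receive larger weights, so the pooled estimate sits closest to the most precise cohorts.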

7.
Nat Genet ; 55(7): 1106-1115, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37308786

ABSTRACT

The current understanding of the genetic determinants of thoracic aortic aneurysms and dissections (TAAD) has largely been informed through studies of rare, Mendelian forms of disease. Here, we conducted a genome-wide association study (GWAS) of TAAD, testing ~25 million DNA sequence variants in 8,626 participants with and 453,043 participants without TAAD in the Million Veteran Program, with replication in an independent sample of 4,459 individuals with and 512,463 without TAAD from six cohorts. We identified 21 TAAD risk loci, 17 of which have not been previously reported. We leverage multiple downstream analytic methods to identify causal TAAD risk genes and cell types and provide human genetic evidence that TAAD is a non-atherosclerotic aortic disorder distinct from other forms of vascular disease. Our results demonstrate that the genetic architecture of TAAD mirrors that of other complex traits and that it is not solely inherited through protein-altering variants of large effect size.


Subject(s)
Thoracic Aortic Aneurysm , Aortic Dissection , Veterans , Humans , Genome-Wide Association Study , Pedigree , Thoracic Aortic Aneurysm/genetics , Aortic Dissection/genetics
8.
Nat Med ; 29(6): 1540-1549, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37248299

ABSTRACT

Preeclampsia and gestational hypertension are common pregnancy complications associated with adverse maternal and child outcomes. Current tools for prediction, prevention and treatment are limited. Here we tested the association of maternal DNA sequence variants with preeclampsia in 20,064 cases and 703,117 control individuals and with gestational hypertension in 11,027 cases and 412,788 control individuals across discovery and follow-up cohorts using multi-ancestry meta-analysis. Altogether, we identified 18 independent loci associated with preeclampsia/eclampsia and/or gestational hypertension, 12 of which are new (for example, MTHFR-CLCN6, WNT3A, NPR3, PGR and RGL3), including two loci (PLCE1 and FURIN) identified in the multitrait analysis. Identified loci highlight the role of natriuretic peptide signaling, angiogenesis, renal glomerular function, trophoblast development and immune dysregulation. We derived genome-wide polygenic risk scores that predicted preeclampsia/eclampsia and gestational hypertension in external cohorts, independent of clinical risk factors, and reclassified eligibility for low-dose aspirin to prevent preeclampsia. Collectively, these findings provide mechanistic insights into the hypertensive disorders of pregnancy and have the potential to advance pregnancy risk stratification.


Subject(s)
Eclampsia , Pregnancy-Induced Hypertension , Hypertension , Pre-Eclampsia , Pregnancy , Female , Child , Humans , Pregnancy-Induced Hypertension/genetics , Pre-Eclampsia/genetics , Pre-Eclampsia/prevention & control , Aspirin , Risk Factors
10.
medRxiv ; 2023 Dec 24.
Article in English | MEDLINE | ID: mdl-38196638

ABSTRACT

It is estimated that as many as 1 in 16 people worldwide suffer from rare diseases. Rare disease patients face difficulty finding diagnosis and treatment for their conditions, including long diagnostic odysseys, multiple incorrect diagnoses, and unavailable or prohibitively expensive treatments. As a result, it is likely that large electronic health record (EHR) systems include high numbers of participants suffering from undiagnosed rare disease. While this has been shown in detail for specific diseases, these studies are expensive and time consuming and have only been feasible to perform for a handful of the thousands of known rare diseases. The bulk of these undiagnosed cases are effectively hidden, with no straightforward way to differentiate them from healthy controls. The ability to access them at scale would enormously expand our capacity to study and develop drugs for rare diseases, adding to tools aimed at increasing availability of study cohorts for rare disease. In this study, we train a deep learning transformer algorithm, RarePT (Rare-Phenotype Prediction Transformer), to impute undiagnosed rare disease from EHR diagnosis codes in 436,407 participants in the UK Biobank and validate it on an independent cohort of 3,333,560 individuals from the Mount Sinai Health System. We applied our model to 155 rare diagnosis codes with fewer than 250 cases each in the UK Biobank and predicted participants with elevated risk for each diagnosis, with the number of participants predicted to be at risk ranging from 85 to 22,000 for different diagnoses. These risk predictions are significantly associated with increased mortality for 65% of diagnoses, with disease burden expressed as disability-adjusted life years (DALY) for 73% of diagnoses, and with 72% of available disease-specific diagnostic tests.
They are also highly enriched for known rare diagnoses in patients not included in the training set, with an odds ratio (OR) of 48.0 in cross-validation cohorts of the UK Biobank and an OR of 30.6 in the independent Mount Sinai Health System cohort. Most importantly, RarePT successfully screens for undiagnosed patients in 32 rare diseases with available diagnostic tests in the UK Biobank. Using the trained model to estimate the prevalence of undiagnosed disease in the UK Biobank for these 32 rare phenotypes, we find that at least 50% of patients remain undiagnosed for 20 of 32 diseases. These estimates provide empirical evidence of a high prevalence of undiagnosed rare disease, as well as demonstrating the enormous potential benefit of using RarePT to screen for undiagnosed rare disease patients in large electronic health systems.

11.
Nat Commun ; 13(1): 6914, 2022 Nov 14.
Article in English | MEDLINE | ID: mdl-36376295

ABSTRACT

Heart failure is a leading cause of cardiovascular morbidity and mortality. However, the contribution of common genetic variation to heart failure risk has not been fully elucidated, particularly in comparison to other common cardiometabolic traits. We report a multi-ancestry genome-wide association study meta-analysis of all-cause heart failure including up to 115,150 cases and 1,550,331 controls of diverse genetic ancestry, identifying 47 risk loci. We also perform multivariate genome-wide association studies that integrate heart failure with related cardiac magnetic resonance imaging endophenotypes, identifying 61 risk loci. Gene-prioritization analyses including colocalization and transcriptome-wide association studies identify known and previously unreported candidate cardiomyopathy genes and cellular processes, which we validate in gene-expression profiling of failing and healthy human hearts. Colocalization, gene expression profiling, and Mendelian randomization provide convergent evidence for the roles of BCKDHA and circulating branched-chain amino acids in heart failure and cardiac structure. Finally, proteome-wide Mendelian randomization identifies 9 circulating proteins associated with heart failure or quantitative imaging traits. These analyses highlight similarities and differences among heart failure and associated cardiovascular imaging endophenotypes, implicate common genetic variation in the pathogenesis of heart failure, and identify circulating proteins that may represent cardiomyopathy treatment targets.


Subject(s)
Genome-Wide Association Study , Heart Failure , Humans , Genome-Wide Association Study/methods , Phenotype , Heart Failure/genetics , Heart , Gene Expression Profiling , Single Nucleotide Polymorphism , Genetic Predisposition to Disease
13.
Nat Genet ; 54(7): 950-962, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35710981

ABSTRACT

More than 800 million people suffer from kidney disease, yet the mechanism of kidney dysfunction is poorly understood. In the present study, we define the genetic association with kidney function in 1.5 million individuals and identify 878 (126 new) loci. We map the genotype effect on the methylome in 443 kidneys, transcriptome in 686 samples and single-cell open chromatin in 57,229 kidney cells. Heritability analysis reveals that methylation variation explains a larger fraction of heritability than gene expression. We present a multi-stage prioritization strategy and prioritize target genes for 87% of kidney function loci. We highlight key roles of proximal tubules and metabolism in kidney function regulation. Furthermore, the causal role of SLC47A1 in kidney disease is defined in mice with genetic loss of Slc47a1 and in human individuals carrying loss-of-function variants. Our findings emphasize the key role of bulk and single-cell epigenomic information in translating genome-wide association studies into identifying causal genes, cellular origins and mechanisms of complex traits.


Subject(s)
Epigenomics , Kidney Diseases , Animals , Genome-Wide Association Study , Humans , Kidney Diseases/genetics , Mice , Single Nucleotide Polymorphism/genetics , Transcriptome/genetics
14.
JAMA ; 327(4): 350-359, 2022 Jan 25.
Article in English | MEDLINE | ID: mdl-35076666

ABSTRACT

Importance: Population-based assessment of disease risk associated with gene variants informs clinical decisions and risk stratification approaches. Objective: To evaluate the population-based disease risk of clinical variants in known disease predisposition genes. Design, Setting, and Participants: This cohort study included 72 434 individuals with 37 780 clinical variants who were enrolled in the BioMe Biobank from 2007 onwards with follow-up until December 2020 and the UK Biobank from 2006 to 2010 with follow-up until June 2020. Participants had linked exome and electronic health record data, were older than 20 years, and were of diverse ancestral backgrounds. Exposures: Variants previously reported as pathogenic or predicted to cause a loss of protein function by bioinformatic algorithms (pathogenic/loss-of-function variants). Main Outcomes and Measures: The primary outcome was the disease risk associated with clinical variants. The risk difference (RD) between the prevalence of disease in individuals with a variant allele (penetrance) vs in individuals with a normal allele was measured. Results: Among 72 434 study participants, 43 395 were from the UK Biobank (mean [SD] age, 57 [8.0] years; 24 065 [55%] women; 2948 [7%] non-European) and 29 039 were from the BioMe Biobank (mean [SD] age, 56 [16] years; 17 355 [60%] women; 19 663 [68%] non-European). Of 5360 pathogenic/loss-of-function variants, 4795 (89%) were associated with an RD less than or equal to 0.05. Mean penetrance was 6.9% (95% CI, 6.0%-7.8%) for pathogenic variants and 0.85% (95% CI, 0.76%-0.95%) for benign variants reported in ClinVar (difference, 6.0 [95% CI, 5.6-6.4] percentage points), with a median of 0% for both groups due to large numbers of nonpenetrant variants. 
Penetrance of pathogenic/loss-of-function variants for late-onset diseases was modified by age: mean penetrance was 10.3% (95% CI, 9.0%-11.6%) in individuals 70 years or older and 8.5% (95% CI, 7.9%-9.1%) in individuals 20 years or older (difference, 1.8 [95% CI, 0.40-3.3] percentage points). Penetrance of pathogenic/loss-of-function variants was heterogeneous even in known disease predisposition genes, including BRCA1 (mean [range], 38% [0%-100%]), BRCA2 (mean [range], 38% [0%-100%]), and PALB2 (mean [range], 26% [0%-100%]). Conclusions and Relevance: In 2 large biobank cohorts, the estimated penetrance of pathogenic/loss-of-function variants was variable but generally low. Further research of population-based penetrance is needed to refine variant interpretation and clinical evaluation of individuals with these variant alleles.


Subject(s)
Genetic Predisposition to Disease , Genetic Variation , Loss of Function Mutation , Penetrance , Aged , Biological Specimen Banks , Cohort Studies , Female , Humans , Male , Mutation , United Kingdom
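
This abstract defines its main measure as the risk difference (RD) between disease prevalence in carriers of a variant allele (penetrance) and prevalence in non-carriers. The sketch below illustrates that definition only; the function names and counts are hypothetical, and it makes no attempt to reproduce the study's estimation or confidence intervals.

```python
def prevalence(n_affected: int, n_total: int) -> float:
    """Proportion of a group with the disease."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_affected / n_total


def risk_difference(
    carriers_affected: int,
    carriers_total: int,
    noncarriers_affected: int,
    noncarriers_total: int,
) -> float:
    """Penetrance in carriers minus disease prevalence in non-carriers."""
    penetrance = prevalence(carriers_affected, carriers_total)
    baseline = prevalence(noncarriers_affected, noncarriers_total)
    return penetrance - baseline


# Hypothetical counts for one variant: 7 of 100 carriers affected,
# 100 of 10,000 non-carriers affected.
rd = risk_difference(7, 100, 100, 10_000)
print(f"Risk difference: {rd:.2f}")
```

An RD at or below 0.05, the cutoff quoted in the abstract, means that carriers have at most five percentage points more disease than non-carriers.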
15.
Kidney Med ; 3(4): 653-658, 2021.
Article in English | MEDLINE | ID: mdl-33942030

ABSTRACT

Recent case reports suggest that coronavirus disease 2019 (COVID-19) is associated with collapsing glomerulopathy in African Americans with apolipoprotein L1 gene (APOL1) risk alleles; however, it is unclear whether disease pathogenesis is similar to HIV-associated nephropathy. RNA sequencing analysis of a kidney biopsy specimen from a patient with COVID-19-associated collapsing glomerulopathy and APOL1 risk alleles (G1/G1) revealed similar levels of APOL1 and angiotensin-converting enzyme 2 (ACE2) messenger RNA transcripts as compared with 12 control kidney samples downloaded from the GTEx (Genotype-Tissue Expression) Portal. Whole-genome sequencing of the COVID-19-associated collapsing glomerulopathy kidney sample identified 4 indel gene variants, 3 of which are of unknown significance with respect to chronic kidney disease and/or focal segmental glomerulosclerosis. Molecular profiling of the kidney demonstrated activation of COVID-19-associated cell injury pathways such as inflammation and coagulation. Evidence for direct severe acute respiratory syndrome coronavirus 2 infection of kidney cells was lacking, which is consistent with the findings of several recent studies. Interestingly, immunostaining of kidney biopsy sections revealed increased expression of phospho-STAT3 (signal transducer and activator of transcription 3) in both COVID-19-associated collapsing glomerulopathy and HIV-associated nephropathy as compared with control kidney tissue. Importantly, interleukin 6-induced activation of STAT3 may be a targetable mechanism driving COVID-19-associated acute kidney injury.

16.
Hum Mutat ; 42(8): 969-977, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34005834

ABSTRACT

Biobanks with exomes linked to electronic health records (EHRs) enable the study of genetic pleiotropy between rare variants and seemingly disparate diseases. We performed robust clinical phenotyping of rare, putatively deleterious variants (loss-of-function [LoF] and deleterious missense variants) in ERCC6, a gene implicated in inherited retinal disease. We analyzed 213,084 exomes, along with a targeted set of retinal, cardiac, and immune phenotypes from two large-scale EHR-linked biobanks. In the primary analysis, a burden of deleterious variants in ERCC6 was strongly associated with (1) retinal disorders; (2) cardiac and electrocardiogram perturbations; and (3) immunodeficiency and decreased immunoglobulin levels. Meta-analysis of results from the BioMe Biobank and UK Biobank showed a significant association of deleterious ERCC6 burden with retinal dystrophy (odds ratio [OR] = 2.6, 95% confidence interval [CI]: 1.5-4.6; p = 8.7 × 10⁻⁴), atypical atrial flutter (OR = 3.5, 95% CI: 1.9-6.5; p = 6.2 × 10⁻⁵), arrhythmia (OR = 1.5, 95% CI: 1.2-2.0; p = 2.7 × 10⁻³), and lymphocyte immunodeficiency (OR = 3.8, 95% CI: 2.1-6.8; p = 5.0 × 10⁻⁶). Carriers of ERCC6 LoF variants who lacked a diagnosis of these conditions exhibited increased symptoms, indicating underdiagnosis. These results reveal a unique genetic link among retinal, cardiac, and immune disorders and underscore the value of EHR-linked biobanks in assessing the full clinical profile of carriers of rare variants.


Subject(s)
Genetic Pleiotropy , Retinal Dystrophies , Cardiac Arrhythmias , DNA Helicases , DNA Repair Enzymes , Exome , Humans , Poly-ADP-Ribose Binding Proteins , Retinal Dystrophies/genetics , Exome Sequencing/methods
18.
Hum Mol Genet ; 30(10): 952-960, 2021 May 29.
Article in English | MEDLINE | ID: mdl-33704450

ABSTRACT

Diabetic retinopathy (DR) is a common consequence in type 2 diabetes (T2D) and a leading cause of blindness in working-age adults. Yet, its genetic predisposition is largely unknown. Here, we examined the polygenic architecture underlying DR by deriving and assessing a genome-wide polygenic risk score (PRS) for DR. We evaluated the PRS in 6079 individuals with T2D of European, Hispanic, African and other ancestries from a large-scale multi-ethnic biobank. Main outcomes were PRS association with DR diagnosis, symptoms and complications, and time to diagnosis, and transferability to non-European ancestries. We observed that PRS was significantly associated with DR. A standard deviation increase in PRS was accompanied by an adjusted odds ratio (OR) of 1.12 [95% confidence interval (CI) 1.04-1.20; P = 0.001] for DR diagnosis. When stratified by ancestry, PRS was associated with the highest OR in European ancestry (OR = 1.22, 95% CI 1.02-1.41; P = 0.049), followed by African (OR = 1.15, 95% CI 1.03-1.28; P = 0.028) and Hispanic ancestries (OR = 1.10, 95% CI 1.00-1.10; P = 0.050). Individuals in the top PRS decile had a 1.8-fold elevated risk for DR versus the bottom decile (P = 0.002). Among individuals without DR diagnosis, the top PRS decile had more DR symptoms than the bottom decile (P = 0.008). The PRS was associated with retinal hemorrhage (OR = 1.44, 95% CI 1.03-2.02; P = 0.03) and earlier DR presentation (10% probability of DR by 4 years in the top PRS decile versus 8 years in the bottom decile). These results establish the significant polygenic underpinnings of DR and indicate the need for more diverse ancestries in biobanks to develop multi-ancestral PRS.


Subject(s)
Type 2 Diabetes Mellitus/epidemiology , Diabetic Retinopathy/epidemiology , Genetic Predisposition to Disease , Genome-Wide Association Study , Adult , Aged , Black People/genetics , Type 2 Diabetes Mellitus/complications , Type 2 Diabetes Mellitus/genetics , Type 2 Diabetes Mellitus/pathology , Diabetic Retinopathy/complications , Diabetic Retinopathy/genetics , Diabetic Retinopathy/pathology , Hispanic or Latino/genetics , Humans , Middle Aged , Multifactorial Inheritance/genetics , Risk Assessment , Risk Factors , White People/genetics
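
The abstract reports effects per standard deviation increase in a polygenic risk score but does not describe how the score was built. As a general illustration, a PRS is commonly a weighted sum of risk-allele dosages that is then standardized across the cohort, and the sketch below assumes that form. The function names, dosages and effect sizes are hypothetical.

```python
import statistics


def polygenic_score(dosages: list[float], effect_sizes: list[float]) -> float:
    """Weighted sum of allele dosages (0, 1 or 2) for one individual."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have equal length")
    return sum(d * w for d, w in zip(dosages, effect_sizes))


def standardize(scores: list[float]) -> list[float]:
    """Rescale scores to mean 0 and standard deviation 1 across the cohort."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mean) / sd for s in scores]


# Hypothetical effect sizes (log odds ratios) for three variants and
# hypothetical dosages for three individuals.
effects = [0.10, -0.20, 0.30]
cohort = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
raw_scores = [polygenic_score(person, effects) for person in cohort]
z_scores = standardize(raw_scores)
print([round(z, 2) for z in z_scores])
```

Once scores are standardized, an odds ratio "per standard deviation" corresponds to a one-unit change in the z-score.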
19.
PLoS Genet ; 17(1): e1009337, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33493176

ABSTRACT

Understanding the relationship between natural selection and phenotypic variation has been a long-standing challenge in human population genetics. With the emergence of biobank-scale datasets, along with new statistical metrics to approximate strength of purifying selection at the variant level, it is now possible to correlate a proxy of individual relative fitness with a range of medical phenotypes. We calculated a per-individual deleterious load score by summing the total number of derived alleles per individual after incorporating a weight that approximates strength of purifying selection. We assessed four methods for the weight, including GERP, phyloP, CADD, and fitCons. By quantitatively tracking each of these scores with the site frequency spectrum, we identified phyloP as the most appropriate weight. The phyloP-weighted load score was then calculated across 15,129,142 variants in 335,161 individuals from the UK Biobank and tested for association with 1,380 medical phenotypes. After correcting for multiple testing, we observed a strong association of the load score amongst coding sites only on 27 traits including body mass, adiposity and metabolic rate. We further observed that the association signals were driven by common variants (derived allele frequency > 5%) with high phyloP score (phyloP > 2). Finally, through permutation analyses, we showed that the load score amongst coding sites had an excess of nominally significant associations on many medical phenotypes. These results suggest a broad impact of deleterious load on medical phenotypes and highlight the deleterious load score as a tool to disentangle the complex relationship between natural selection and medical phenotypes.


Subject(s)
Molecular Evolution , Genetic Fitness/genetics , Population Genetics , Genetic Selection/genetics , Alleles , Biological Specimen Banks , Body Mass Index , Female , Gene Frequency , Genetic Association Studies , Genetic Predisposition to Disease , Genetic Variation/genetics , Humans , Male , United Kingdom
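
The abstract describes the per-individual deleterious load score as a sum over derived alleles with a per-variant weight approximating the strength of purifying selection. The sketch below shows one plain reading of that description. The function name, the variant identifiers and the weights are hypothetical, and the study's exact weighting and filtering steps are not given here.

```python
def load_score(derived_counts: dict[str, int], weights: dict[str, float]) -> float:
    """Weighted count of derived alleles for one individual.

    derived_counts: variant id -> number of derived alleles carried (0, 1 or 2).
    weights: variant id -> conservation weight (for example a phyloP score).
    Variants without a weight are skipped.
    """
    return sum(
        count * weights[variant]
        for variant, count in derived_counts.items()
        if variant in weights
    )


# Hypothetical variants and weights, for illustration only.
weights = {"var1": 1.5, "var2": 2.0, "var3": 0.5}
individual = {"var1": 0, "var2": 1, "var3": 2, "var4": 1}  # var4 has no weight
score = load_score(individual, weights)
print(f"Load score: {score}")
```

Setting every weight to 1 reduces the score to a simple count of derived alleles, which is the unweighted baseline that the weighting refines.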
20.
Mol Biol Evol ; 34(11): 2792-2807, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-28981697

ABSTRACT

It remains a challenge in evolutionary genetics to elucidate how beneficial mutations arise and propagate in a population and how selective pressures on mutant alleles are structured over space and time. By identifying "sweeping haplotypes (SHs)" that putatively carry beneficial alleles and are increasing (or have increased) rapidly in frequency, and surveying the geographic distribution of SH frequencies, we can indirectly infer how selective sweeps unfold in time and thus which modes of positive selection underlie those sweeps. Using population genomic data from African Drosophila melanogaster, we identified SHs from 37 candidate loci under selection. At more than half of loci, we identify single SHs. However, many other loci harbor multiple independent SHs, namely soft selective sweeps, either due to parallel evolution across space or a high beneficial mutation rate. At about a quarter of the loci, intermediate SH frequencies are found across multiple populations, which cannot be explained unless a certain form of frequency-dependent positive selection, such as heterozygote advantage, is invoked given the reasonable range of migration rates between African populations. At one locus, many independent SHs are observed over multiple populations but always together with ancestral haplotypes. This complex pattern is compatible with a large number of mutational targets in a gene and frequency-dependent selection on new variants. We conclude that very diverse modes of positive selection are operating at different sets of loci in D. melanogaster populations.


Subject(s)
Drosophila melanogaster/genetics , Genetic Selection/genetics , Africa , Alleles , Animals , Biological Evolution , Nucleic Acid Databases , Molecular Evolution , Gene Frequency/genetics , Genetic Variation , Population Genetics/methods , Insect Genome , Haplotypes/genetics , Heterozygote , Genetic Models , Mutation