1.
Cell ; 165(6): 1530-1545, 2016 Jun 02.
Article in English | MEDLINE | ID: mdl-27259154

ABSTRACT

Genome-wide association studies (GWAS) have successfully identified thousands of associations between common genetic variants and human disease phenotypes, but the majority of these variants are non-coding, often requiring genetic fine-mapping, epigenomic profiling, and individual reporter assays to delineate potential causal variants. We employ a massively parallel reporter assay (MPRA) to simultaneously screen 2,756 variants in strong linkage disequilibrium with 75 sentinel variants associated with red blood cell traits. We show that this assay identifies elements with endogenous erythroid regulatory activity. Across 23 sentinel variants, we conservatively identified 32 MPRA functional variants (MFVs). We used targeted genome editing to demonstrate endogenous enhancer activity across 3 MFVs that predominantly affect the transcription of SMIM1, RBM38, and CD164. Functional follow-up of RBM38 delineates a key role for this gene in the alternative splicing program occurring during terminal erythropoiesis. Finally, we provide evidence for how common GWAS-nominated variants can disrupt cell-type-specific transcriptional regulatory pathways.


Subject(s)
Erythrocytes , Genetic Techniques , Genetic Variation , Alternative Splicing , Cell Line , Cell Lineage/genetics , Erythropoiesis/genetics , Gene Library , Genes, Reporter , Humans , Regulatory Sequences, Nucleic Acid , Transcription, Genetic
2.
Am J Hum Genet ; 109(1): 33-49, 2022 01 06.
Article in English | MEDLINE | ID: mdl-34951958

ABSTRACT

The identification of genes that evolve under recessive natural selection is a long-standing goal of population genetics research that has important applications to the discovery of genes associated with disease. We found that commonly used methods to evaluate selective constraint at the gene level are highly sensitive to genes under heterozygous selection but ubiquitously fail to detect recessively evolving genes. Additionally, more sophisticated likelihood-based methods designed to detect recessivity similarly lack power for a human gene of realistic length from current population sample sizes. However, extensive simulations suggested that recessive genes may be detectable in aggregate. Here, we offer a method informed by population genetics simulations designed to detect recessive purifying selection in gene sets. Applying this to empirical gene sets produced significant enrichments for strong recessive selection in genes previously inferred to be under recessive selection in a consanguineous cohort and in genes involved in autosomal recessive monogenic disorders.


Subject(s)
Gene Frequency , Genes, Recessive , Genetics, Population , Selection, Genetic , Algorithms , Alleles , Genes, Dominant , Genetic Predisposition to Disease , Genetic Variation , Genetics, Population/methods , Genomics/methods , Genotype , Humans , Inheritance Patterns , Likelihood Functions , Models, Genetic , Mutation , United Kingdom
3.
Arterioscler Thromb Vasc Biol ; 44(2): 491-504, 2024 02.
Article in English | MEDLINE | ID: mdl-38095106

ABSTRACT

BACKGROUND: Venous thromboembolism (VTE) is a major cause of morbidity and mortality worldwide. Current risk assessment tools, such as the Caprini and Padua scores and Wells criteria, have limitations in their applicability and accuracy. This study aimed to develop machine learning models using structured electronic health record data to predict diagnosis and 1-year risk of VTE. METHODS: We trained and validated models on data from 159 001 participants in the Mount Sinai Data Warehouse. We then externally tested them on 401 723 participants in the UK Biobank and 123 039 participants in All of Us. All data sets contain populations of diverse ancestries and clinical histories. We used these data sets to develop small, medium, and large models with increasing numbers of features, spanning a range from optimizing portability to maximizing performance. We make trained models publicly available in click-and-run format at https://doi.org/10.17632/tkwzysr4y6.6. RESULTS: In the holdout and external test sets, respectively, models achieved areas under the receiver operating characteristic curve of 0.80 to 0.83 and 0.72 to 0.82 for VTE diagnosis prediction and 0.76 to 0.78 and 0.64 to 0.69 for 1-year risk prediction, significantly outperforming the Padua score. Models also demonstrated robust performance across different VTE types and patient subsets, including ethnicity, age, and surgical and hospitalization status. Models identified both established and novel clinical features contributing to VTE risk, offering valuable insights into its underlying pathophysiology. CONCLUSIONS: Machine learning models using structured electronic health record data can significantly improve VTE diagnosis and 1-year risk prediction in diverse populations. Model probability scores exist on a continuum, affecting mortality risk in both healthy individuals and VTE cases. 
Integrating these models into electronic health record systems to generate real-time predictions may enhance VTE risk assessment, early detection, and preventative measures, ultimately reducing the morbidity and mortality associated with VTE.
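
Illustrative note (not part of the published abstract): an area under the receiver operating characteristic curve, as reported above, can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The sketch below computes it that way in plain Python. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study or its released models.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise case/control comparisons.

    labels: 1 for a case, 0 for a control (hypothetical toy data below).
    scores: model probability for each individual, same order as labels.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Toy example: 3 cases and 4 controls with made-up scores.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
print(round(auroc(labels, scores), 3))
```

The quadratic pairwise loop is chosen for readability; a rank-based formulation gives the same value and scales better to biobank-sized cohorts.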


Subject(s)
Population Health , Venous Thromboembolism , Humans , Electronic Health Records , Risk Factors , Venous Thromboembolism/diagnosis , Venous Thromboembolism/epidemiology , Venous Thromboembolism/etiology , Risk Assessment , Machine Learning , Retrospective Studies
4.
Nature ; 570(7762): 514-518, 2019 06.
Article in English | MEDLINE | ID: mdl-31217584

ABSTRACT

Genome-wide association studies (GWAS) have laid the foundation for investigations into the biology of complex traits, drug development and clinical guidelines. However, the majority of discovery efforts are based on data from populations of European ancestry [1-3]. In light of the differential genetic architecture that is known to exist between populations, bias in representation can exacerbate existing disease and healthcare disparities. Critical variants may be missed if they have a low frequency or are completely absent in European populations, especially as the field shifts its attention towards rare variants, which are more likely to be population-specific [4-10]. Additionally, effect sizes and the risk prediction scores derived from them in one population may not accurately extrapolate to other populations [11,12]. Here we demonstrate the value of diverse, multi-ethnic participants in large-scale genomic studies. The Population Architecture using Genomics and Epidemiology (PAGE) study conducted a GWAS of 26 clinical and behavioural phenotypes in 49,839 non-European individuals. Using strategies tailored for analysis of multi-ethnic and admixed populations, we describe a framework for analysing diverse populations, identify 27 novel loci and 38 secondary signals at known loci, as well as replicate 1,444 GWAS catalogue associations across these traits. Our data show evidence of effect-size heterogeneity across ancestries for published GWAS associations, substantial benefits for fine-mapping using diverse cohorts and insights into clinical implications. In the United States, where minority populations have a disproportionately higher burden of chronic conditions [13], the lack of representation of diverse populations in genetic research will result in inequitable access to precision medicine for those with the highest burden of disease. 
We strongly advocate for continued, large genome-wide efforts in diverse populations to maximize genetic discovery and reduce health disparities.


Subject(s)
Asian People/genetics , Black People/genetics , Genome-Wide Association Study/methods , Hispanic or Latino/genetics , Minority Groups , Multifactorial Inheritance/genetics , Women's Health , Body Height/genetics , Cohort Studies , Female , Genetics, Medical/methods , Health Equity/trends , Health Status Disparities , Humans , Male , United States
5.
Lancet ; 401(10372): 215-225, 2023 Jan 21.
Article in English | MEDLINE | ID: mdl-36563696

ABSTRACT

BACKGROUND: Binary diagnosis of coronary artery disease does not preserve the complexity of disease or quantify its severity or its associated risk of death; hence, a quantitative marker of coronary artery disease is warranted. We evaluated a quantitative marker of coronary artery disease derived from probabilities of a machine learning model. METHODS: In this cohort study, we developed and validated a coronary artery disease-predictive machine learning model using 95 935 electronic health records and assessed its probabilities as in-silico scores for coronary artery disease (ISCAD; range 0 [lowest probability] to 1 [highest probability]) in participants in two longitudinal biobank cohorts. We measured the association of ISCAD with clinical outcomes, namely coronary artery stenosis, obstructive coronary artery disease, multivessel coronary artery disease, all-cause death, and coronary artery disease sequelae. FINDINGS: Among 95 935 participants, 35 749 were from the BioMe Biobank (median age 61 years [IQR 18]; 14 599 [41%] were male and 21 150 [59%] were female; 5130 [14%] had diagnosed coronary artery disease) and 60 186 were from the UK Biobank (median age 62 [15] years; 25 031 [42%] male and 35 155 [58%] female; 8128 [14%] with diagnosed coronary artery disease). The model predicted coronary artery disease with an area under the receiver operating characteristic curve of 0·95 (95% CI 0·94-0·95; sensitivity of 0·94 [0·94-0·95] and specificity of 0·82 [0·81-0·83]) and 0·93 (0·92-0·93; sensitivity of 0·90 [0·89-0·90] and specificity of 0·88 [0·87-0·88]) in the BioMe validation and holdout sets, respectively, and 0·91 (0·91-0·91; sensitivity of 0·84 [0·83-0·84] and specificity of 0·83 [0·82-0·83]) in the UK Biobank external test set. ISCAD captured coronary artery disease risk from known risk factors, pooled cohort equations, and polygenic risk scores. 
Coronary artery stenosis increased quantitatively with ascending ISCAD quartiles (increase per quartile of 12 percentage points), including risk of obstructive coronary artery disease, multivessel coronary artery disease, and stenosis of major coronary arteries. Hazard ratios (HRs) and prevalence of all-cause death increased stepwise over ISCAD deciles (decile 1: HR 1·0 [95% CI 1·0-1·0], 0·2% prevalence; decile 6: 11 [3·9-31], 3·1% prevalence; and decile 10: 56 [20-158], 11% prevalence). A similar trend was observed for recurrent myocardial infarction. 12 (46%) undiagnosed individuals with high ISCAD (≥0·9) had clinical evidence of coronary artery disease according to the 2014 American College of Cardiology/American Heart Association Task Force guidelines. INTERPRETATION: Electronic health record-based machine learning was used to generate an in-silico marker for coronary artery disease that can non-invasively quantify atherosclerosis and risk of death on a continuous spectrum, and identify underdiagnosed individuals. FUNDING: National Institutes of Health.


Subject(s)
Coronary Artery Disease , Coronary Stenosis , Humans , Male , Female , Middle Aged , Coronary Artery Disease/diagnosis , Coronary Artery Disease/epidemiology , Cohort Studies , Predictive Value of Tests , Coronary Stenosis/diagnosis , Risk Factors , Machine Learning , Coronary Angiography
6.
PLoS Genet ; 17(1): e1009337, 2021 01.
Article in English | MEDLINE | ID: mdl-33493176

ABSTRACT

Understanding the relationship between natural selection and phenotypic variation has been a long-standing challenge in human population genetics. With the emergence of biobank-scale datasets, along with new statistical metrics to approximate strength of purifying selection at the variant level, it is now possible to correlate a proxy of individual relative fitness with a range of medical phenotypes. We calculated a per-individual deleterious load score by summing the total number of derived alleles per individual after incorporating a weight that approximates strength of purifying selection. We assessed four methods for the weight, including GERP, phyloP, CADD, and fitcons. By quantitatively tracking each of these scores with the site frequency spectrum, we identified phyloP as the most appropriate weight. The phyloP-weighted load score was then calculated across 15,129,142 variants in 335,161 individuals from the UK Biobank and tested for association on 1,380 medical phenotypes. After accounting for multiple test correction, we observed a strong association of the load score amongst coding sites only on 27 traits including body mass, adiposity and metabolic rate. We further observed that the association signals were driven by common variants (derived allele frequency > 5%) with high phyloP score (phyloP > 2). Finally, through permutation analyses, we showed that the load score amongst coding sites had an excess of nominally significant associations on many medical phenotypes. These results suggest a broad impact of deleterious load on medical phenotypes and highlight the deleterious load score as a tool to disentangle the complex relationship between natural selection and medical phenotypes.
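
Illustrative note (not part of the published abstract): the per-individual load score described above is a weighted count of derived alleles. The sketch below shows one way to express that sum in Python. The function name `load_score`, the toy genotypes, and the toy weights are hypothetical; the study's exact weighting scheme and variant filters are not given in the abstract, so a simple product of derived-allele count and per-variant weight is assumed here.

```python
def load_score(derived_allele_counts, weights):
    """Weighted deleterious load for one individual.

    derived_allele_counts: 0, 1, or 2 derived alleles at each variant.
    weights: per-variant conservation weight (for example a phyloP-like
             score); the values used below are made up for illustration.
    """
    if len(derived_allele_counts) != len(weights):
        raise ValueError("one weight is required per variant")
    for count in derived_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("derived allele counts must be 0, 1, or 2")
    return sum(c * w for c, w in zip(derived_allele_counts, weights))


# Toy example: four variants in one diploid individual.
genotype = [0, 1, 2, 1]
toy_weights = [0.5, 2.0, 3.0, 1.0]
print(load_score(genotype, toy_weights))
```

With all weights set to 1 the score reduces to an unweighted count of derived alleles, which is a convenient sanity check.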


Subject(s)
Evolution, Molecular , Genetic Fitness/genetics , Genetics, Population , Selection, Genetic/genetics , Alleles , Biological Specimen Banks , Body Mass Index , Female , Gene Frequency , Genetic Association Studies , Genetic Predisposition to Disease , Genetic Variation/genetics , Humans , Male , United Kingdom
7.
Int J Mol Sci ; 25(13)2024 Jul 05.
Article in English | MEDLINE | ID: mdl-39000484

ABSTRACT

Circulating biomarkers play a pivotal role in personalized medicine, offering potential for disease screening, prevention, and treatment. Despite established associations between numerous biomarkers and diseases, elucidating their causal relationships is challenging. Mendelian Randomization (MR) can address this issue by employing genetic instruments to discern causal links. Additionally, using multiple MR methods with overlapping results enhances the reliability of discovered relationships. Here, we report an MR study using multiple methods, including inverse variance weighted, simple mode, weighted mode, weighted median, and MR-Egger. We use the MR-base resource (v0.5.6) from Hemani et al. 2018 to evaluate causal relationships between 212 circulating biomarkers (curated from UK Biobank analyses by Neale lab and from Shin et al. 2014, Roederer et al. 2015, and Kettunen et al. 2016) and 99 complex diseases (curated from several consortia by MRC IEU and Biobank Japan). We report novel causal relationships found by four or more MR methods between glucose and bipolar disorder (Mean Effect Size estimate across methods: 0.39) and between cystatin C and bipolar disorder (Mean Effect Size: -0.31). Based on agreement in four or more methods, we also identify previously known links between urate with gout and creatine with chronic kidney disease, as well as biomarkers that may be causal of cardiovascular conditions: apolipoprotein B, cholesterol, LDL, lipoprotein A, and triglycerides in coronary heart disease, as well as lipoprotein A, LDL, cholesterol, and apolipoprotein B in myocardial infarction. This Mendelian Randomization study not only corroborates known causal relationships between circulating biomarkers and diseases but also uncovers two novel biomarkers associated with bipolar disorder that warrant further investigation. 
Our findings provide insight into understanding how biological processes reflecting circulating biomarkers and their associated effects may contribute to disease etiology, which can eventually help improve precision diagnostics and intervention.
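
Illustrative note (not part of the published abstract): of the MR methods listed above, the inverse variance weighted (IVW) estimator is the simplest to write down. The sketch below implements the standard fixed-effect IVW formula from per-variant summary statistics in plain Python. It is not the MR-base implementation used in the study; the function name `ivw_estimate` and the toy effect sizes are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted MR estimate.

    beta_exposure: variant effects on the biomarker (exposure).
    beta_outcome:  variant effects on the disease (outcome).
    se_outcome:    standard errors of the outcome effects.
    Returns (causal effect estimate, standard error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have one entry per variant")

    numerator = 0.0
    denominator = 0.0
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        weight = 1.0 / (se * se)          # inverse variance of outcome effect
        numerator += bx * by * weight
        denominator += bx * bx * weight
    if denominator == 0.0:
        raise ValueError("instruments have no effect on the exposure")

    return numerator / denominator, math.sqrt(1.0 / denominator)


# Toy example: three made-up instruments with a true ratio of 0.4.
estimate, std_error = ivw_estimate(
    [0.1, 0.2, 0.3], [0.04, 0.08, 0.12], [0.01, 0.01, 0.01]
)
print(round(estimate, 3), round(std_error, 4))
```

Agreement between this estimator and pleiotropy-robust alternatives such as weighted median or MR-Egger is the kind of cross-method consistency the abstract relies on.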


Subject(s)
Biomarkers , Mendelian Randomization Analysis , Humans , Biomarkers/blood , Bipolar Disorder/genetics , Bipolar Disorder/blood , Cardiovascular Diseases/genetics , Cardiovascular Diseases/blood , Risk Factors , Cystatin C/blood , Cystatin C/genetics , Gout/genetics , Gout/blood
8.
Circulation ; 146(16): 1225-1242, 2022 10 18.
Article in English | MEDLINE | ID: mdl-36154123

ABSTRACT

BACKGROUND: Venous thromboembolism (VTE) is a life-threatening vascular event with environmental and genetic determinants. Recent VTE genome-wide association studies (GWAS) meta-analyses involved nearly 30 000 VTE cases and identified up to 40 genetic loci associated with VTE risk, including loci not previously suspected to play a role in hemostasis. The aim of our research was to expand discovery of new genetic loci associated with VTE by using cross-ancestry genomic resources. METHODS: We present new cross-ancestry meta-analyzed GWAS results involving up to 81 669 VTE cases from 30 studies, with replication of novel loci in independent populations and loci characterization through in silico genomic interrogations. RESULTS: In our genetic discovery effort that included 55 330 participants with VTE (47 822 European, 6320 African, and 1188 Hispanic ancestry), we identified 48 novel associations, of which 34 were replicated after correction for multiple testing. In our combined discovery-replication analysis (81 669 VTE participants) and ancestry-stratified meta-analyses (European, African, and Hispanic), we identified another 44 novel associations, which are new candidate VTE-associated loci requiring replication. In total, across all GWAS meta-analyses, we identified 135 independent genomic loci significantly associated with VTE risk. A genetic risk score of the significantly associated loci in Europeans identified a 6-fold increase in risk for those in the top 1% of scores compared with those with average scores. We also identified 31 novel transcript associations in transcriptome-wide association studies and 8 novel candidate genes with protein quantitative-trait locus Mendelian randomization analyses. 
In silico interrogations of hemostasis and hematology traits and a large phenome-wide association analysis of the 135 GWAS loci provided insights to biological pathways contributing to VTE, with some loci contributing to VTE through well-characterized coagulation pathways and others providing new data on the role of hematology traits, particularly platelet function. Many of the replicated loci are outside of known or currently hypothesized pathways to thrombosis. CONCLUSIONS: Our cross-ancestry GWAS meta-analyses identified new loci associated with VTE. These findings highlight new pathways to thrombosis and provide novel molecules that may be useful in the development of improved antithrombosis treatments.


Subject(s)
Thrombosis , Venous Thromboembolism , Genetic Predisposition to Disease , Genome-Wide Association Study , Genomics , Humans , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Thrombosis/genetics , Venous Thromboembolism/diagnosis , Venous Thromboembolism/genetics
9.
Clin Infect Dis ; 77(6): 839-847, 2023 09 18.
Article in English | MEDLINE | ID: mdl-37227948

ABSTRACT

BACKGROUND: Lyme disease is the most prevalent vector-borne disease in the US, yet its host factors are poorly understood and diagnostic tests are limited. We evaluated patients in a large health system to uncover cholesterol's role in the susceptibility, severity, and machine learning-based diagnosis of Lyme disease. METHODS: A longitudinal health system cohort comprised 1 019 175 individuals with electronic health record data and 50 329 with linked genetic data. Associations of blood cholesterol level, cholesterol genetic scores comprising common genetic variants, and burden of rare loss-of-function (LoF) variants in cholesterol metabolism genes with Lyme disease were investigated. A portable machine learning model was constructed and tested to predict Lyme disease using routine lipid and clinical measurements. RESULTS: There were 3832 cases of Lyme disease. Increasing cholesterol was associated with greater risk of Lyme disease and hypercholesterolemia was more prevalent in Lyme disease cases than in controls. Cholesterol genetic scores and rare LoF variants in CD36 and LDLR were associated with Lyme disease risk. Serological profiling of cases revealed parallel trajectories of rising cholesterol and immunoglobulin levels over the disease course, including marked increases in individuals with LoF variants and high cholesterol genetic scores. The machine learning model predicted Lyme disease solely using routine lipid panel, blood count, and metabolic measurements. CONCLUSIONS: These results demonstrate the value of large-scale genetic and clinical data to reveal host factors underlying infectious disease biology, risk, and prognosis and the potential for their clinical translation to machine learning diagnostics that do not need specialized assays.


Subject(s)
Hypercholesterolemia , Lyme Disease , Humans , Lyme Disease/diagnosis , Lyme Disease/epidemiology , Cholesterol , Prognosis , Machine Learning
10.
Hum Mol Genet ; 30(10): 952-960, 2021 05 29.
Article in English | MEDLINE | ID: mdl-33704450

ABSTRACT

Diabetic retinopathy (DR) is a common consequence in type 2 diabetes (T2D) and a leading cause of blindness in working-age adults. Yet, its genetic predisposition is largely unknown. Here, we examined the polygenic architecture underlying DR by deriving and assessing a genome-wide polygenic risk score (PRS) for DR. We evaluated the PRS in 6079 individuals with T2D of European, Hispanic, African and other ancestries from a large-scale multi-ethnic biobank. Main outcomes were PRS association with DR diagnosis, symptoms and complications, and time to diagnosis, and transferability to non-European ancestries. We observed that PRS was significantly associated with DR. A standard deviation increase in PRS was accompanied by an adjusted odds ratio (OR) of 1.12 [95% confidence interval (CI) 1.04-1.20; P = 0.001] for DR diagnosis. When stratified by ancestry, PRS was associated with the highest OR in European ancestry (OR = 1.22, 95% CI 1.02-1.41; P = 0.049), followed by African (OR = 1.15, 95% CI 1.03-1.28; P = 0.028) and Hispanic ancestries (OR = 1.10, 95% CI 1.00-1.10; P = 0.050). Individuals in the top PRS decile had a 1.8-fold elevated risk for DR versus the bottom decile (P = 0.002). Among individuals without DR diagnosis, the top PRS decile had more DR symptoms than the bottom decile (P = 0.008). The PRS was associated with retinal hemorrhage (OR = 1.44, 95% CI 1.03-2.02; P = 0.03) and earlier DR presentation (10% probability of DR by 4 years in the top PRS decile versus 8 years in the bottom decile). These results establish the significant polygenic underpinnings of DR and indicate the need for more diverse ancestries in biobanks to develop multi-ancestral PRS.
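
Illustrative note (not part of the published abstract): a polygenic risk score is a weighted sum of risk-allele dosages, usually standardized across the cohort so that effects can be quoted per standard deviation, as in the odds ratio of 1.12 above. The Python sketch below shows that pipeline on toy data. The function names, variant weights, and dosages are hypothetical, and converting a standardized score to an odds multiplier assumes a log-linear (logistic) model, which the abstract does not spell out.

```python
import statistics


def raw_prs(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("one weight is required per variant")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(values):
    """Convert raw scores to z-scores using the cohort mean and SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(v - mean) / sd for v in values]


def odds_multiplier(z_score, or_per_sd=1.12):
    """Odds relative to an average individual, assuming a log-linear model.

    The default of 1.12 per SD is the adjusted odds ratio quoted in the
    abstract; any other use of it here is illustrative only.
    """
    return or_per_sd ** z_score


# Toy cohort: three individuals, three variants, made-up effect weights.
toy_weights = [0.2, -0.1, 0.05]
cohort_dosages = [[2, 0, 1], [1, 1, 0], [0, 2, 2]]
raw_scores = [raw_prs(d, toy_weights) for d in cohort_dosages]
z_scores = standardize(raw_scores)
print([round(odds_multiplier(z), 3) for z in z_scores])
```

Because the weights are typically estimated in one ancestry group, the same score can standardize and perform differently in another, which is the transferability question the abstract raises.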


Subject(s)
Diabetes Mellitus, Type 2/epidemiology , Diabetic Retinopathy/epidemiology , Genetic Predisposition to Disease , Genome-Wide Association Study , Adult , Aged , Black People/genetics , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/pathology , Diabetic Retinopathy/complications , Diabetic Retinopathy/genetics , Diabetic Retinopathy/pathology , Hispanic or Latino/genetics , Humans , Middle Aged , Multifactorial Inheritance/genetics , Risk Assessment , Risk Factors , White People/genetics
11.
BMC Med ; 21(1): 316, 2023 08 21.
Article in English | MEDLINE | ID: mdl-37605270

ABSTRACT

BACKGROUND: Micronutrients, namely vitamins and minerals, are associated with cancer outcomes; however, their reported effects have been inconsistent across studies. We aimed to identify the causally estimated effects of micronutrients on cancer by applying the Mendelian randomization (MR) method, using single-nucleotide polymorphisms associated with micronutrient levels as instrumental variables. METHODS: We obtained instrumental variables of 14 genetically predicted micronutrient levels and applied two-sample MR to estimate their causal effects on 22 cancer outcomes from a meta-analysis of the UK Biobank (UKB) and FinnGen cohorts (overall cancer and 21 site-specific cancers, including breast, colorectal, lung, and prostate cancer), in addition to six major cancer outcomes and 20 cancer subset outcomes from cancer consortia. We used sensitivity MR methods, including weighted median, MR-Egger, and MR-PRESSO, to assess potential horizontal pleiotropy or heterogeneity. Genome-wide association summary statistical data of European descent were used for both exposure and outcome data, including up to 940,633 participants of European descent with 133,384 cancer cases. RESULTS: In total, 672 MR tests (14 micronutrients × 48 cancer outcomes) were performed. The following two associations met Bonferroni significance by the number of associations (P < 0.00016) in the UKB plus FinnGen cohorts: increased risk of breast cancer with magnesium levels (odds ratio [OR] = 1.281 per 1 standard deviation [SD] higher magnesium level, 95% confidence interval [CI] = 1.151 to 1.426, P < 0.0001) and increased risk of colorectal cancer with vitamin B12 level (OR = 1.22 per 1 SD higher vitamin B12 level, 95% CI = 1.107 to 1.345, P < 0.0001). These two associations remained significant in the analysis of the cancer consortia. No significant heterogeneity or horizontal pleiotropy was observed. Micronutrient levels were not associated with overall cancer risk. 
CONCLUSIONS: Our results may aid clinicians in deciding whether to regulate the intake of certain micronutrients, particularly in high-risk groups without nutritional deficiencies, and may help in the design of future clinical trials.
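
Illustrative note (not part of the published abstract): the test count and significance threshold quoted above can be checked with simple arithmetic. The Python sketch below reproduces the 672 tests (14 micronutrients × 48 outcomes) and shows that a Bonferroni threshold of about 0.00016 is consistent with dividing 0.05 by 308. That divisor (14 micronutrients × 22 outcomes in the UKB plus FinnGen meta-analysis) and the 0.05 family-wise level are assumptions inferred from the numbers in the abstract, not stated by the authors.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


N_MICRONUTRIENTS = 14
N_ALL_OUTCOMES = 48          # 22 UKB+FinnGen + 6 major + 20 subset outcomes
N_UKB_FINNGEN_OUTCOMES = 22  # assumed basis for the quoted threshold

total_tests = N_MICRONUTRIENTS * N_ALL_OUTCOMES
primary_tests = N_MICRONUTRIENTS * N_UKB_FINNGEN_OUTCOMES
threshold = bonferroni_threshold(0.05, primary_tests)

print(total_tests, primary_tests, f"{threshold:.5f}")
```

Correcting for all 672 tests instead would give a stricter threshold of roughly 0.000074, so which family of tests is counted matters for borderline associations.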


Subject(s)
Breast Neoplasms , Micronutrients , Humans , Male , Genome-Wide Association Study , Magnesium , Mendelian Randomization Analysis , Female
12.
Nature ; 544(7649): 235-239, 2017 04 12.
Article in English | MEDLINE | ID: mdl-28406212

ABSTRACT

A major goal of biomedicine is to understand the function of every gene in the human genome. Loss-of-function mutations can disrupt both copies of a given gene in humans and phenotypic analysis of such 'human knockouts' can provide insight into gene function. Consanguineous unions are more likely to result in offspring carrying homozygous loss-of-function mutations. In Pakistan, consanguinity rates are notably high. Here we sequence the protein-coding regions of 10,503 adult participants in the Pakistan Risk of Myocardial Infarction Study (PROMIS), designed to understand the determinants of cardiometabolic diseases in individuals from South Asia. We identified individuals carrying homozygous predicted loss-of-function (pLoF) mutations, and performed phenotypic analysis involving more than 200 biochemical and disease traits. We enumerated 49,138 rare (<1% minor allele frequency) pLoF mutations. These pLoF mutations are estimated to knock out 1,317 genes, each in at least one participant. Homozygosity for pLoF mutations at PLA2G7 was associated with absent enzymatic activity of soluble lipoprotein-associated phospholipase A2; at CYP2F1, with higher plasma interleukin-8 concentrations; at TREH, with lower concentrations of apoB-containing lipoprotein subfractions; at either A3GALT2 or NRG4, with markedly reduced plasma insulin C-peptide concentrations; and at SLC9A3R1, with mediators of calcium and phosphate signalling. Heterozygous deficiency of APOC3 has been shown to protect against coronary heart disease; we identified APOC3 homozygous pLoF carriers in our cohort. We recruited these human knockouts and challenged them with an oral fat load. Compared with family members lacking the mutation, individuals with APOC3 knocked out displayed marked blunting of the usual post-prandial rise in plasma triglycerides. 
Overall, these observations provide a roadmap for a 'human knockout project', a systematic effort to understand the phenotypic consequences of complete disruption of genes in humans.


Subject(s)
Consanguinity , DNA Mutational Analysis , Gene Deletion , Genes/genetics , Genetic Association Studies/methods , Homozygote , Phenotype , 1-Alkyl-2-acetylglycerophosphocholine Esterase/deficiency , 1-Alkyl-2-acetylglycerophosphocholine Esterase/genetics , Apolipoprotein C-III/deficiency , Apolipoprotein C-III/genetics , Cohort Studies , Coronary Disease/blood , Coronary Disease/genetics , Cytochrome P450 Family 2/genetics , Dietary Fats/pharmacology , Exome/genetics , Fasting/blood , Female , Gene Frequency , Humans , Interleukin-8/blood , Male , Middle Aged , Myocardial Infarction/blood , Myocardial Infarction/genetics , Neuregulins/genetics , Pakistan , Pedigree , Phosphoproteins/genetics , Postprandial Period , RNA Splice Sites/genetics , Reverse Genetics/methods , Sodium-Hydrogen Exchangers/genetics , Triglycerides/blood
13.
PLoS Genet ; 16(3): e1008684, 2020 03.
Article in English | MEDLINE | ID: mdl-32226016

ABSTRACT

Lipid levels are important markers for the development of cardio-metabolic diseases. Although hundreds of associated loci have been identified through genetic association studies, the contribution of genetic factors to variation in lipids is not fully understood, particularly in U.S. minority groups. We performed genome-wide association analyses for four lipid traits in over 45,000 ancestrally diverse participants from the Population Architecture using Genomics and Epidemiology (PAGE) Study, followed by a meta-analysis with several European ancestry studies. We identified nine novel lipid loci, five of which showed evidence of replication in independent studies. Furthermore, we discovered one novel gene in a PrediXcan analysis, minority-specific independent signals at eight previously reported loci, and potential functional variants at two known loci through fine-mapping. Systematic examination of known lipid loci revealed smaller effect estimates in African American and Hispanic ancestry populations than those in Europeans, and better performance of polygenic risk scores based on minority-specific effect estimates. Our findings provide new insight into the genetic architecture of lipid traits and highlight the importance of conducting genetic studies in diverse populations in the era of precision medicine.


Subject(s)
Lipids/blood , Lipids/genetics , Racial Groups/genetics , Databases, Genetic , Female , Genome-Wide Association Study/methods , Genotype , Humans , Lipids/analysis , Male , Metagenomics/methods , Minority Groups , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide/genetics , United States/epidemiology
14.
Stroke ; 53(3): 875-885, 2022 03.
Article in English | MEDLINE | ID: mdl-34727735

ABSTRACT

BACKGROUND AND PURPOSE: Stroke is the leading cause of death and long-term disability worldwide. Previous genome-wide association studies identified 51 loci associated with stroke (mostly ischemic) and its subtypes among predominantly European populations. Using whole-genome sequencing in ancestrally diverse populations from the Trans-Omics for Precision Medicine (TOPMed) Program, we aimed to identify novel variants, especially low-frequency or ancestry-specific variants, associated with all stroke, ischemic stroke and its subtypes (large artery, cardioembolic, and small vessel), and hemorrhagic stroke and its subtypes (intracerebral and subarachnoid). METHODS: Whole-genome sequencing data were available for 6833 stroke cases and 27 116 controls, including 22 315 European, 7877 Black, 2616 Hispanic/Latino, 850 Asian, 54 Native American, and 237 other ancestry participants. In TOPMed, we performed single variant association analysis examining 40 million common variants and aggregated association analysis focusing on rare variants. We also combined TOPMed European populations with over 28 000 additional European participants from the UK BioBank genome-wide array data through meta-analysis. RESULTS: In the single variant association analysis in TOPMed, we identified one novel locus, 13q33, for large artery stroke at whole-genome-wide significance (P<5.00×10⁻⁹) and 4 novel loci at genome-wide significance (P<5.00×10⁻⁸), all of which need confirmation in independent studies. Lead variants in all 5 loci are low-frequency but are more common in non-European populations. An aggregation of synonymous rare variants within the gene C6orf26 demonstrated suggestive evidence of association for hemorrhagic stroke (P<3.11×10⁻⁶). By meta-analyzing European ancestry samples in TOPMed and UK BioBank, we replicated several previously reported stroke loci including PITX2, HDAC9, ZFHX3, and LRCH1. 
CONCLUSIONS: We present the first association analysis for stroke and its subtypes using whole-genome sequencing data from ancestrally diverse populations. While our findings suggest the potential benefits of combining whole-genome sequencing data with populations of diverse genetic backgrounds to identify possible low-frequency or ancestry-specific variants, they also highlight the need to increase genome coverage and sample sizes.


Subject(s)
Genetic Loci , Genetic Predisposition to Disease , Polymorphism, Single Nucleotide , Precision Medicine , Racial Groups/genetics , Stroke/genetics , Aged , Aged, 80 and over , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Whole Genome Sequencing
15.
Am Heart J ; 250: 29-33, 2022 08.
Article in English | MEDLINE | ID: mdl-35526571

ABSTRACT

Genetic risk for coronary artery disease (CAD) is commonly measured with polygenic risk scores (PRS); yet, the relationship of atherosclerotic burden with PRS in healthy individuals not at high clinical risk for CAD (ie, without a high pooled cohort equations [PCE] score) is unknown. Here, we implemented a novel recall-by-PRS strategy to measure coronary artery calcium (CAC) scores prospectively in 53 healthy individuals with extreme high PRS (median [IQR] PRS = 94% [83-98]) and low PRS (median [IQR] PRS = 3.6% [1.2-10]). The high PRS group was associated with a 2.8-fold greater CAC than the low PRS group, adjusted for age, sex, BMI, smoking, and statin use, and had a 6.7-fold greater proportion of individuals with CAC exceeding 300 HU. These findings reveal that extreme PRS tracks with CAD risk even in those without high clinical risk and demonstrate proof of principle for recall-by-PRS approaches that should be assessed prospectively in larger trials.


Subject(s)
Calcium , Coronary Artery Disease , Calcium, Dietary , Cohort Studies , Coronary Artery Disease/genetics , Humans , Risk Assessment , Risk Factors
16.
Nature ; 536(7616): 285-91, 2016 08 18.
Article in English | MEDLINE | ID: mdl-27535533

ABSTRACT

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.


Subject(s)
Exome/genetics , Genetic Variation/genetics , DNA Mutational Analysis , Datasets as Topic , Humans , Phenotype , Proteome/genetics , Rare Diseases/genetics , Sample Size
17.
JAMA ; 327(4): 350-359, 2022 01 25.
Article in English | MEDLINE | ID: mdl-35076666

ABSTRACT

Importance: Population-based assessment of disease risk associated with gene variants informs clinical decisions and risk stratification approaches. Objective: To evaluate the population-based disease risk of clinical variants in known disease predisposition genes. Design, Setting, and Participants: This cohort study included 72 434 individuals with 37 780 clinical variants who were enrolled in the BioMe Biobank from 2007 onwards with follow-up until December 2020 and the UK Biobank from 2006 to 2010 with follow-up until June 2020. Participants had linked exome and electronic health record data, were older than 20 years, and were of diverse ancestral backgrounds. Exposures: Variants previously reported as pathogenic or predicted to cause a loss of protein function by bioinformatic algorithms (pathogenic/loss-of-function variants). Main Outcomes and Measures: The primary outcome was the disease risk associated with clinical variants. The risk difference (RD) between the prevalence of disease in individuals with a variant allele (penetrance) vs in individuals with a normal allele was measured. Results: Among 72 434 study participants, 43 395 were from the UK Biobank (mean [SD] age, 57 [8.0] years; 24 065 [55%] women; 2948 [7%] non-European) and 29 039 were from the BioMe Biobank (mean [SD] age, 56 [16] years; 17 355 [60%] women; 19 663 [68%] non-European). Of 5360 pathogenic/loss-of-function variants, 4795 (89%) were associated with an RD less than or equal to 0.05. Mean penetrance was 6.9% (95% CI, 6.0%-7.8%) for pathogenic variants and 0.85% (95% CI, 0.76%-0.95%) for benign variants reported in ClinVar (difference, 6.0 [95% CI, 5.6-6.4] percentage points), with a median of 0% for both groups due to large numbers of nonpenetrant variants. 
Penetrance of pathogenic/loss-of-function variants for late-onset diseases was modified by age: mean penetrance was 10.3% (95% CI, 9.0%-11.6%) in individuals 70 years or older and 8.5% (95% CI, 7.9%-9.1%) in individuals 20 years or older (difference, 1.8 [95% CI, 0.40-3.3] percentage points). Penetrance of pathogenic/loss-of-function variants was heterogeneous even in known disease predisposition genes, including BRCA1 (mean [range], 38% [0%-100%]), BRCA2 (mean [range], 38% [0%-100%]), and PALB2 (mean [range], 26% [0%-100%]). Conclusions and Relevance: In 2 large biobank cohorts, the estimated penetrance of pathogenic/loss-of-function variants was variable but generally low. Further research of population-based penetrance is needed to refine variant interpretation and clinical evaluation of individuals with these variant alleles.


Subject(s)
Genetic Predisposition to Disease , Genetic Variation , Loss of Function Mutation , Penetrance , Aged , Biological Specimen Banks , Cohort Studies , Female , Humans , Male , Mutation , United Kingdom
18.
Hum Mutat ; 42(8): 969-977, 2021 08.
Article in English | MEDLINE | ID: mdl-34005834

ABSTRACT

Biobanks with exomes linked to electronic health records (EHRs) enable the study of genetic pleiotropy between rare variants and seemingly disparate diseases. We performed robust clinical phenotyping of rare, putatively deleterious variants (loss-of-function [LoF] and deleterious missense variants) in ERCC6, a gene implicated in inherited retinal disease. We analyzed 213,084 exomes, along with a targeted set of retinal, cardiac, and immune phenotypes from two large-scale EHR-linked biobanks. In the primary analysis, a burden of deleterious variants in ERCC6 was strongly associated with (1) retinal disorders; (2) cardiac and electrocardiogram perturbations; and (3) immunodeficiency and decreased immunoglobulin levels. Meta-analysis of results from the BioMe Biobank and UK Biobank showed a significant association of deleterious ERCC6 burden with retinal dystrophy (odds ratio [OR] = 2.6, 95% confidence interval [CI]: 1.5-4.6; p = 8.7 × 10⁻⁴), atypical atrial flutter (OR = 3.5, 95% CI: 1.9-6.5; p = 6.2 × 10⁻⁵), arrhythmia (OR = 1.5, 95% CI: 1.2-2.0; p = 2.7 × 10⁻³), and lymphocyte immunodeficiency (OR = 3.8, 95% CI: 2.1-6.8; p = 5.0 × 10⁻⁶). Carriers of ERCC6 LoF variants who lacked a diagnosis of these conditions exhibited increased symptoms, indicating underdiagnosis. These results reveal a unique genetic link among retinal, cardiac, and immune disorders and underscore the value of EHR-linked biobanks in assessing the full clinical profile of carriers of rare variants.


Subject(s)
Genetic Pleiotropy , Retinal Dystrophies , Arrhythmias, Cardiac , DNA Helicases , DNA Repair Enzymes , Exome , Humans , Poly-ADP-Ribose Binding Proteins , Retinal Dystrophies/genetics , Exome Sequencing/methods
19.
Annu Rev Genomics Hum Genet ; 19: 289-301, 2018 08 31.
Article in English | MEDLINE | ID: mdl-29641912

ABSTRACT

While sequence-based genetic tests have long been available for specific loci, especially for Mendelian disease, the rapidly falling costs of genome-wide genotyping arrays, whole-exome sequencing, and whole-genome sequencing are moving us toward a future where full genomic information might inform the prognosis and treatment of a variety of diseases, including complex disease. Similarly, the availability of large populations with full genomic information has enabled new insights about the etiology and genetic architecture of complex disease. Insights from the latest generation of genomic studies suggest that our categorization of diseases as complex may conceal a wide spectrum of genetic architectures and causal mechanisms that ranges from Mendelian forms of complex disease to complex regulatory structures underlying Mendelian disease. Here, we review these insights, along with advances in the prediction of disease risk and outcomes from full genomic information.


Subject(s)
Genetic Diseases, Inborn/genetics , Genetic Diseases, Inborn/complications , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Exome Sequencing
20.
Am J Hum Genet ; 103(6): 930-947, 2018 12 06.
Article in English | MEDLINE | ID: mdl-30503522

ABSTRACT

Diamond-Blackfan anemia (DBA) is a rare bone marrow failure disorder that affects 7 out of 1,000,000 live births and has been associated with mutations in components of the ribosome. In order to characterize the genetic landscape of this heterogeneous disorder, we recruited a cohort of 472 individuals with a clinical diagnosis of DBA and performed whole-exome sequencing (WES). We identified relevant rare and predicted damaging mutations for 78% of individuals. The majority of mutations were singletons, absent from population databases, predicted to cause loss of function, and located in 1 of 19 previously reported ribosomal protein (RP)-encoding genes. Using exon coverage estimates, we identified and validated 31 deletions in RP genes. We also observed an enrichment for extended splice site mutations and validated their diverse effects using RNA sequencing in cell lines obtained from individuals with DBA. Leveraging the size of our cohort, we observed robust genotype-phenotype associations with congenital abnormalities and treatment outcomes. We further identified rare mutations in seven previously unreported RP genes that may cause DBA, as well as several distinct disorders that appear to phenocopy DBA, including nine individuals with biallelic CECR1 mutations that result in deficiency of ADA2. However, no new genes were identified at exome-wide significance, suggesting that there are no unidentified genes containing mutations readily identified by WES that explain >5% of DBA-affected case subjects. Overall, this report should inform not only clinical practice for DBA-affected individuals, but also the design and analysis of rare variant studies for heterogeneous Mendelian disorders.


Subject(s)
Anemia, Diamond-Blackfan/genetics , Adolescent , Child , Child, Preschool , Cohort Studies , Exome/genetics , Exons/genetics , Female , Gene Deletion , Genetic Association Studies/methods , Humans , Intercellular Signaling Peptides and Proteins/genetics , Male , Mutation/genetics , Phenotype , Ribosomal Proteins/genetics , Ribosomes/genetics , Sequence Analysis, RNA/methods , Exome Sequencing/methods