ABSTRACT
Peripheral artery disease (PAD) is a form of atherosclerotic cardiovascular disease, affecting ~8 million Americans, and is known to have racial and ethnic disparities. PAD has been reported to have a significantly higher prevalence in African Americans (AAs) compared to non-Hispanic European Americans (EAs). Hispanic/Latinos (HLs) have been reported to have lower or similar rates of PAD compared to EAs, despite having a paradoxically high burden of PAD risk factors; however, recent work suggests prevalence may differ between sub-groups. Here, we examined a large cohort of diverse adults in the BioMe biobank in New York City. We observed the prevalence of PAD at 1.7% in EAs vs. 8.5% and 9.4% in AAs and HLs, respectively, and among HL sub-groups, the prevalence was found at 11.4% and 11.5% in Puerto Rican and Dominican populations, respectively. Follow-up analysis that adjusted for common risk factors demonstrated that Dominicans had the highest increased risk for PAD relative to EAs [OR = 3.15 (95% CI 2.33-4.25), p < 6.44 × 10⁻¹⁴]. To investigate whether genetic factors may explain this increased risk, we performed admixture mapping by testing the association between local ancestry and PAD in Dominican BioMe participants (N = 1,813) separately for European, African, and Native American (NAT) continental ancestry tracts. The top association with PAD was an NAT ancestry tract at chromosome 2q35 [OR = 1.96 (SE = 0.16), p < 2.75 × 10⁻⁵] with 22.6% vs. 12.9% PAD prevalence in heterozygous NAT tract carriers versus non-carriers, respectively. Fine-mapping at this locus implicated tag SNP rs78529201 located within a long intergenic non-coding RNA (lincRNA) LINC00607, a gene expression regulator of key genes related to thrombosis and extracellular remodeling of endothelial cells, suggesting a putative link of the 2q35 locus to PAD etiology. Efforts to reproduce the signal in other Hispanic cohorts were unsuccessful.
In summary, we showed how leveraging health system data helped understand nuances of PAD risk across HL sub-groups and admixture mapping approaches elucidated a putative risk locus in a Dominican population.
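As a rough sanity check on the 2q35 figures quoted above, the following sketch derives a crude (unadjusted) odds ratio from the two reported prevalences and a Wald-type 95% interval from the reported OR and SE. It is illustrative only: the function names are hypothetical, the abstract does not say how the authors computed their estimate, and treating SE = 0.16 as the standard error of the log odds ratio is an assumption.

```python
import math


def odds_ratio_from_prevalence(p_carriers, p_noncarriers):
    """Crude odds ratio from two prevalence proportions (no covariate adjustment)."""
    odds_carriers = p_carriers / (1.0 - p_carriers)
    odds_noncarriers = p_noncarriers / (1.0 - p_noncarriers)
    return odds_carriers / odds_noncarriers


def wald_interval(odds_ratio, se_log_or, z=1.96):
    """Wald interval on the log-odds scale, mapped back to the odds-ratio scale.

    Assumes `se_log_or` is the standard error of log(OR).
    """
    log_or = math.log(odds_ratio)
    return math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or)


# Prevalences reported for heterozygous NAT tract carriers vs. non-carriers.
crude_or = odds_ratio_from_prevalence(0.226, 0.129)

# Reported OR and SE; the SE is assumed to be on the log-odds scale.
ci_low, ci_high = wald_interval(1.96, 0.16)

print(f"crude OR = {crude_or:.2f}")
print(f"approximate 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

The crude ratio ignores the covariates used in the adjusted model, so agreement with the reported estimate is only expected to be approximate.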
ABSTRACT
Fibromyalgia is a complex disease of unclear etiology that is complicated by difficulties in diagnosis, treatment, and clinical heterogeneity. To clarify this etiology, healthcare-based data are leveraged to assess the influences on fibromyalgia in several domains. Prevalence is less than 1% of females in our population register data, and about 1/10th that in males. Fibromyalgia often presents with co-occurring conditions including back pain, rheumatoid arthritis, and anxiety. More comorbidities are identified with hospital-associated biobank data, falling into three broad categories of pain-related, autoimmune, and psychiatric disorders. Selecting representative phenotypes with published genome-wide association results for polygenic scoring, we confirm genetic predispositions to psychiatric, pain sensitivity, and autoimmune conditions show associations with fibromyalgia, although these may differ by ancestry group. We conduct a genome-wide association analysis of fibromyalgia in biobank samples, which did not result in any genome-wide significant loci; further studies with increased sample size are necessary to identify specific genetic effects on fibromyalgia. Overall, fibromyalgia appears to have strong clinical and likely genetic links to several disease categories, and could usefully be understood as a composite manifestation of these etiological sources.
Subject(s)
Rheumatoid Arthritis, Fibromyalgia, Male, Female, Humans, Fibromyalgia/genetics, Fibromyalgia/diagnosis, Fibromyalgia/epidemiology, Genome-Wide Association Study, Pain/genetics, Pain/complications, Pain/diagnosis, Comorbidity, Rheumatoid Arthritis/complications, Rheumatoid Arthritis/diagnosis, Rheumatoid Arthritis/epidemiology
ABSTRACT
Peripheral artery disease (PAD) is a form of atherosclerotic cardiovascular disease, affecting ~8 million Americans, and is known to have racial and ethnic disparities. PAD has been reported to have significantly higher prevalence in African Americans (AAs) compared to non-Hispanic European Americans (EAs). Hispanic/Latinos (HLs) have been reported to have lower or similar rates of PAD compared to EAs, despite having a paradoxically high burden of PAD risk factors; however, recent work suggests prevalence may differ between sub-groups. Here we examined a large cohort of diverse adults in the BioMe biobank in New York City (NYC). We observed the prevalence of PAD at 1.7% in EAs vs 8.5% and 9.4% in AAs and HLs, respectively; and among HL sub-groups, at 11.4% and 11.5% in Puerto Rican and Dominican populations, respectively. Follow-up analysis that adjusted for common risk factors demonstrated that Dominicans had the highest increased risk for PAD relative to EAs [OR = 3.15 (95% CI 2.33-4.25), P < 6.44 × 10⁻¹⁴]. To investigate whether genetic factors may explain this increased risk, we performed admixture mapping by testing the association between local ancestry (LA) and PAD in Dominican BioMe participants (N = 1,940) separately for European (EUR), African (AFR) and Native American (NAT) continental ancestry tracts. We identified a NAT ancestry tract at chromosome 2q35 that was significantly associated with PAD [OR = 2.05 (95% CI 1.51-2.78), P < 4.06 × 10⁻⁶] with 22.5% vs 12.5% PAD prevalence in heterozygous NAT tract carriers versus non-carriers, respectively. Fine-mapping at this locus implicated tag SNP rs78529201 located within a long intergenic non-coding RNA (lincRNA) LINC00607, a gene expression regulator of key genes related to thrombosis and extracellular remodeling of endothelial cells, suggesting a putative link of the 2q35 locus to PAD etiology.
In summary, we showed how leveraging health systems data helped understand nuances of PAD risk across HL sub-groups and admixture mapping approaches elucidated a novel risk locus in a Dominican population.
ABSTRACT
Individuals of admixed ancestries (for example, African Americans) inherit a mosaic of ancestry segments (local ancestry) originating from multiple continental ancestral populations. This offers the unique opportunity of investigating the similarity of genetic effects on traits across ancestries within the same population. Here we introduce an approach to estimate correlation of causal genetic effects (r_admix) across local ancestries and analyze 38 complex traits in African-European admixed individuals (N = 53,001) to observe very high correlations (meta-analysis r_admix = 0.95, 95% credible interval 0.93-0.97), much higher than correlation of causal effects across continental ancestries. We replicate our results using regression-based methods from marginal genome-wide association study summary statistics. We also report realistic scenarios where regression-based methods yield inflated heterogeneity-by-ancestry due to ancestry-specific tagging of causal effects, and/or polygenicity. Our results motivate genetic analyses that assume minimal heterogeneity in causal effects by ancestry, with implications for the inclusion of ancestry-diverse individuals in studies.
Subject(s)
Population Genetics, Multifactorial Inheritance, Humans, Multifactorial Inheritance/genetics, Genome-Wide Association Study/methods, Racial Groups/genetics, Black or African American/genetics, Single Nucleotide Polymorphism/genetics
ABSTRACT
BACKGROUND & AIMS: Genetic variants affecting liver disease risk vary among racial and ethnic groups. Hispanics/Latinos in the United States have a high prevalence of PNPLA3 I148M, which increases liver disease risk, and a low prevalence of HSD17B13 predicted loss-of-function (pLoF) variants, which reduce risk. Less is known about the prevalence of liver disease-associated variants among Hispanic/Latino subpopulations defined by country of origin and genetic ancestry. We evaluated the prevalence of HSD17B13 pLoF variants and PNPLA3 I148M, and their associations with quantitative liver phenotypes in Hispanic/Latino participants from an electronic health record-linked biobank in New York City. METHODS: This study included 8739 adult Hispanic/Latino participants of the BioMe biobank with genotyping and exome sequencing data. We estimated the prevalence of Hispanic/Latino individuals harboring HSD17B13 and PNPLA3 variants, stratified by genetic ancestry, and performed association analyses between variants and liver enzymes and Fibrosis-4 (FIB-4) scores. RESULTS: Individuals with ancestry from Ecuador and Mexico had the lowest frequency of HSD17B13 pLoF variants (10%/7%) and the highest frequency of PNPLA3 I148M (54%/65%). These ancestry groups had the highest outpatient alanine aminotransferase (ALT) and aspartate aminotransferase (AST) levels, and the largest proportion of individuals with a FIB-4 score greater than 2.67. HSD17B13 pLoF variants were associated with reduced ALT level (P = .002), AST level (P < .001), and FIB-4 score (P = .045). PNPLA3 I148M was associated with increased ALT level, AST level, and FIB-4 score (P < .001 for all). HSD17B13 pLoF variants mitigated the increase in ALT conferred by PNPLA3 I148M (P = .006). CONCLUSIONS: Variation in HSD17B13 and PNPLA3 variants across genetic ancestry groups may contribute to differential risk for liver fibrosis among Hispanic/Latino individuals.
Subject(s)
Liver Cirrhosis, Non-alcoholic Fatty Liver Disease, Humans, Genetic Predisposition to Disease, Hispanic or Latino/genetics, Liver Cirrhosis/enzymology, Liver Cirrhosis/genetics, Non-alcoholic Fatty Liver Disease/enzymology, Non-alcoholic Fatty Liver Disease/genetics, Single Nucleotide Polymorphism
ABSTRACT
Groups of distantly related individuals who share a short segment of their genome identical-by-descent (IBD) can provide insights about rare traits and diseases in massive biobanks using IBD mapping. Clustering algorithms play an important role in finding these groups accurately and at scale. We set out to analyze the fitness of commonly used, fast and scalable clustering algorithms for IBD mapping applications. We designed a realistic benchmark for local IBD graphs and utilized it to compare the statistical power of clustering algorithms via simulating 2.3 million clusters across 850 experiments. We found Infomap and Markov Clustering (MCL) community detection methods to have high statistical power in most of the scenarios. They yield a 30% increase in power compared to the current state-of-the-art approach, with a runtime three orders of magnitude lower. We also found that standard clustering metrics, such as modularity, cannot predict statistical power of algorithms in IBD mapping applications. We extend our findings to real datasets by analyzing the Population Architecture using Genomics and Epidemiology (PAGE) Study dataset with 51,000 samples and 2 million shared segments on Chromosome 1, resulting in the extraction of 39 million local IBD clusters. We demonstrate the power of our approach by recovering signals of rare genetic variation in the Whole-Exome Sequence data of 200,000 individuals in the UK Biobank. We provide an efficient implementation to enable clustering at scale for IBD mapping for various populations and scenarios.
Supplementary Information: The code, along with supplementary methods and figures, is available at https://github.com/roohy/localIBDClustering.
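To make the notion of a "local IBD graph" concrete, here is a minimal sketch that builds the graph of individuals sharing an IBD segment over one genomic position and groups them by connected components. Connected components are a deliberately simple stand-in, not Infomap or MCL, and not the authors' implementation. The `(id_a, id_b, start, end)` segment tuples and all names are hypothetical.

```python
from collections import defaultdict


def local_ibd_graph(segments, position):
    """Adjacency sets linking pairs whose shared IBD segment covers `position`."""
    graph = defaultdict(set)
    for id_a, id_b, start, end in segments:
        if start <= position <= end:
            graph[id_a].add(id_b)
            graph[id_b].add(id_a)
    return graph


def connected_components(graph):
    """Group nodes by reachability using an iterative depth-first search."""
    seen = set()
    clusters = []
    for node in graph:
        if node in seen:
            continue
        seen.add(node)
        stack = [node]
        component = set()
        while stack:
            current = stack.pop()
            component.add(current)
            for neighbour in graph[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(component)
    return clusters


# Hypothetical pairwise IBD segments: (individual A, individual B, start, end).
segments = [
    ("s1", "s2", 100, 500),
    ("s2", "s3", 300, 900),
    ("s4", "s5", 200, 400),
    ("s1", "s5", 600, 800),  # does not cover position 350, so it is ignored
]

clusters = connected_components(local_ibd_graph(segments, position=350))
print(sorted(sorted(cluster) for cluster in clusters))
```

Connected components will merge groups joined by a single spurious edge, which is one reason community detection methods such as those named in the abstract are of interest for this task.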
Subject(s)
Algorithms, Computational Biology, Humans, Genomics, Cluster Analysis
ABSTRACT
CDH1 pathogenic variants confer a markedly elevated lifetime risk of developing diffuse gastric cancer (DGC) and lobular breast cancer (LBC). The aim of this study was to evaluate the prevalence and clinical impact of CDH1 pathogenic variants in the unselected and ancestrally diverse BioMe Biobank. We evaluated exome sequence data from 30,223 adult BioMe participants to identify CDH1 positive individuals, defined as those harboring a variant previously classified as pathogenic or likely pathogenic or a predicted loss-of-function variant in CDH1. We reviewed electronic health records and BioMe enrollment surveys for personal and family history of malignancy and evidence of prior clinical genetic testing. Using a genomics-first approach, we identified 6 CDH1 positive individuals in BioMe (~ 1 in 5000). CDH1 positive individuals had a median age of 42 years (range 35-62 years), all were non-European by self-report, and one was female. None had evidence of either a personal or family history of DGC or LBC. Our findings suggest a low risk of DGC and LBC in unselected patients harboring a pathogenic variant in CDH1. Knowledge of CDH1-related cancer risk in individuals with no personal or family history may better inform surveillance and prophylactic measures.
Subject(s)
CD Antigens, Cadherins, Germ-Line Mutation, Stomach Neoplasms, Adult, CD Antigens/genetics, Cadherins/genetics, Female, Genetic Predisposition to Disease, Humans, Male, Middle Aged, Risk, Stomach Neoplasms/epidemiology, Stomach Neoplasms/genetics, Stomach Neoplasms/pathology, Exome Sequencing
ABSTRACT
The integration of genomic data into health systems offers opportunities to identify genomic factors underlying the continuum of rare and common disease. We applied a population-scale haplotype association approach based on identity-by-descent (IBD) in a large multi-ethnic biobank to a spectrum of disease outcomes derived from electronic health records (EHRs) and uncovered a risk locus for liver disease. We used genome sequencing and in silico approaches to fine-map the signal to a non-coding variant (c.2784-12T>C) in the gene ABCB4. In vitro analysis confirmed the variant disrupted splicing of the ABCB4 pre-mRNA. Four of five homozygotes had evidence of advanced liver disease, and there was a significant association with liver disease among heterozygotes, suggesting the variant is linked to increased risk of liver disease in an allele dose-dependent manner. Population-level screening revealed the variant to be at a carrier rate of 1.95% in Puerto Rican individuals, likely as the result of a Puerto Rican founder effect. This work demonstrates that integrating EHR and genomic data at a population scale can facilitate strategies for understanding the continuum of genomic risk for common diseases, particularly in populations underrepresented in genomic medicine.
Subject(s)
Delivery of Health Care/organization & administration, Genetic Predisposition to Disease, Liver Diseases/genetics, ATP Binding Cassette Transporter Subfamily B/genetics, Electronic Health Records, Haplotypes, Heterozygote, Hispanic or Latino/genetics, Homozygote, Humans, Puerto Rico
ABSTRACT
Identity-by-descent (IBD), the detection of shared segments inherited from a common ancestor, is a fundamental concept in genomics with broad applications in the characterization and analysis of genomes. While historically the concept of IBD was extensively utilized through linkage analyses and in studies of founder populations, applications of IBD-based methods subsided during the genome-wide association study era. This was primarily due to the computational expense of IBD detection, which becomes increasingly relevant as the field moves toward the analysis of biobank-scale datasets that encompass individuals from highly diverse backgrounds. To address these computational barriers, the past several years have seen new methodological advances enabling IBD detection for datasets in the hundreds of thousands to millions of individuals, enabling novel analyses at an unprecedented scale. Here, we describe the latest innovations in IBD detection and describe opportunities for the application of IBD-based methods across a broad range of questions in the field of genomics.
ABSTRACT
Importance: Up to two-thirds of African American individuals carry the benign rs2814778-CC genotype that lowers total white blood cell (WBC) count. Objective: To examine whether the rs2814778-CC genotype is associated with an increased likelihood of receiving a bone marrow biopsy (BMB) for an isolated low WBC count. Design, Setting, and Participants: This retrospective genetic association study assessed African American patients younger than 90 years who underwent a BMB at Vanderbilt University Medical Center, Mount Sinai Health System, or Children's Hospital of Philadelphia from January 1, 1998, to December 31, 2020. Exposure: The rs2814778-CC genotype. Main Outcomes and Measures: The proportion of individuals with the CC genotype who underwent BMB for an isolated low WBC count and had a normal biopsy result compared with the proportion of individuals with the CC genotype who underwent BMB for other indications and had a normal biopsy result. Results: Among 399 individuals who underwent a BMB (mean [SD] age, 41.8 [22.5] years, 234 [59%] female), 277 (69%) had the CC genotype. A total of 35 patients (9%) had clinical histories of isolated low WBC counts, and 364 (91%) had other histories. Of those with a clinical history of isolated low WBC count, 34 of 35 (97%) had the CC genotype vs 243 of 364 (67%) of those without a low WBC count history. Among those with the CC genotype, 33 of 34 (97%) had normal results for biopsies performed for isolated low WBC counts compared with 134 of 243 individuals (55%) with biopsies performed for other histories (P < .001). Conclusions and Relevance: In this genetic association study, among patients of African American race who had a BMB with a clinical history of isolated low WBC counts, the rs2814778-CC genotype was highly prevalent, and 97% of these BMBs identified no hematologic abnormality. Accounting for the rs2814778-CC genotype in clinical decision-making could avoid unnecessary BMB procedures.
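The proportions in the Results can be reproduced from the reported counts, and the final comparison (33 of 34 vs 134 of 243 normal biopsies among CC carriers) can be examined with a small exact test. The abstract reports P < .001 without naming the test, so the two-sided Fisher exact test below is an assumption chosen for illustration, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(x):
        # Probability of x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    tolerance = 1 + 1e-9  # guard against floating-point ties
    return sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * tolerance
    )


# Counts reported among CC-genotype carriers: (normal biopsy, abnormal biopsy).
isolated_low_wbc = (33, 34 - 33)
other_indications = (134, 243 - 134)

p_value = fisher_exact_two_sided(*isolated_low_wbc, *other_indications)

print(f"CC genotype overall: {277 / 399:.0%}")
print(f"CC among isolated low WBC: {34 / 35:.0%}; among other histories: {243 / 364:.0%}")
print(f"Normal biopsy: {33 / 34:.0%} vs {134 / 243:.0%}")
print(f"Fisher exact p < .001: {p_value < 0.001}")
```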
Subject(s)
Biopsy, Black or African American/genetics, Bone Marrow Examination, Duffy Blood-Group System/genetics, Neutropenia, Cell Surface Receptors/genetics, Adult, Biopsy/methods, Biopsy/statistics & numerical data, Bone Marrow Examination/methods, Bone Marrow Examination/statistics & numerical data, Female, Gene Expression Profiling/statistics & numerical data, Genetic Profile, Genome-Wide Association Study, Humans, Leukocyte Count, Male, Neutropenia/diagnosis, Neutropenia/ethnology, Neutropenia/genetics, Single Nucleotide Polymorphism, United States/epidemiology, Unnecessary Procedures/methods, Unnecessary Procedures/statistics & numerical data
ABSTRACT
The ability to identify segments of genomes identical-by-descent (IBD) is a part of standard workflows in both statistical and population genetics. However, traditional methods for finding local IBD across all pairs of individuals scale poorly leading to a lack of adoption in very large-scale datasets. Here, we present iLASH, an algorithm based on similarity detection techniques that shows equal or improved accuracy in simulations compared to current leading methods and speeds up analysis by several orders of magnitude on genomic datasets, making IBD estimation tractable for millions of individuals. We apply iLASH to the PAGE dataset of ~52,000 multi-ethnic participants, including several founder populations with elevated IBD sharing, identifying IBD segments in ~3 minutes per chromosome compared to over 6 days for a state-of-the-art algorithm. iLASH enables efficient analysis of very large-scale datasets, as we demonstrate by computing IBD across the UK Biobank (~500,000 individuals), detecting 12.9 billion pairwise connections.
Subject(s)
Population Genetics/methods, Genomics/methods, Algorithms, Computer Simulation, Genetic Databases, Human Genome, Haplotypes, Humans, Pedigree, Single Nucleotide Polymorphism, Quality Control, United Kingdom/epidemiology, United Kingdom/ethnology
ABSTRACT
Understanding population health disparities is an essential component of equitable precision health efforts. Epidemiology research often relies on definitions of race and ethnicity, but these population labels may not adequately capture disease burdens and environmental factors impacting specific sub-populations. Here, we propose a framework for repurposing data from electronic health records (EHRs) in concert with genomic data to explore the demographic ties that can impact disease burdens. Using data from a diverse biobank in New York City, we identified 17 communities sharing recent genetic ancestry. We observed 1,177 health outcomes that were statistically associated with a specific group and demonstrated significant differences in the segregation of genetic variants contributing to Mendelian diseases. We also demonstrated that fine-scale population structure can impact the prediction of complex disease risk within groups. This work reinforces the utility of linking genomic data to EHRs and provides a framework toward fine-scale monitoring of population health.
Subject(s)
Ethnicity/genetics, Population Health, Genetic Databases, Electronic Health Records, Genomics, Humans, Self Report
ABSTRACT
BACKGROUND: Population-based genomic screening has the predicted ability to reduce morbidity and mortality associated with medically actionable conditions. However, much research is needed to develop standards for genomic screening and to understand the perspectives of people offered this new testing modality. This is particularly true for non-European ancestry populations who are vastly underrepresented in genomic medicine research. Therefore, we implemented a pilot genomic screening program in the BioMe Biobank in New York City, where the majority of participants are of non-European ancestry. METHODS: We initiated genomic screening for well-established genes associated with hereditary breast and ovarian cancer syndrome (HBOC), Lynch syndrome (LS), and familial hypercholesterolemia (FH). We evaluated and included an additional gene (TTR) associated with hereditary transthyretin amyloidosis (hATTR), which has a common founder variant in African ancestry populations. We evaluated the characteristics of 74 participants who received results associated with these conditions. We also assessed the preferences of 7461 newly enrolled BioMe participants to receive genomic results. RESULTS: In the pilot genomic screening program, 74 consented participants received results related to HBOC (N = 26), LS (N = 6), FH (N = 8), and hATTR (N = 34). Thirty-three of 34 (97.1%) participants who received a result related to hATTR were self-reported African American/African (AA) or Hispanic/Latinx (HL), compared to 14 of 40 (35.0%) participants who received a result related to HBOC, LS, or FH. Among the 7461 participants enrolled after the BioMe protocol modification to allow the return of genomic results, 93.4% indicated that they would want to receive results. Younger participants, women, and HL participants were more likely to opt to receive results. 
CONCLUSIONS: The addition of TTR to a pilot genomic screening program meant that we returned results to a higher proportion of AA and HL participants, in comparison with genes traditionally included in genomic screening programs in the USA. We found that the majority of participants in a multi-ethnic biobank are interested in receiving genomic results for medically actionable conditions. These findings increase knowledge about the perspectives of diverse research participants on receiving genomic results and inform the broader implementation of genomic medicine in underrepresented patient populations.
Subject(s)
Genetic Testing, Population Genetics, Adolescent, Adult, Age Factors, Aged, Aged 80 and over, Ethnicity/genetics, Female, Genomic Medicine, Humans, Male, Middle Aged, Pilot Projects, Surveys and Questionnaires, Young Adult
ABSTRACT
Elevated plasma cholesterol and type 2 diabetes (T2D) are associated with coronary artery disease (CAD). Individuals treated with cholesterol-lowering statins have increased T2D risk, while individuals with hypercholesterolemia have reduced T2D risk. We explore the relationship between lipid and glucose control by constructing network models from the STARNET study with sequencing data from seven cardiometabolic tissues obtained from CAD patients during coronary artery by-pass grafting surgery. By integrating gene expression, genotype, metabolomic, and clinical data, we identify a glucose and lipid determining (GLD) regulatory network showing inverse relationships with lipid and glucose traits. Master regulators of the GLD network also impact lipid and glucose levels in inverse directions. Experimental inhibition of one of the GLD network master regulators, lanosterol synthase (LSS), in mice confirms the inverse relationships to glucose and lipid levels as predicted by our model and provides mechanistic insights.
Subject(s)
Blood Glucose/metabolism, Coronary Artery Disease/metabolism, Type 2 Diabetes Mellitus/metabolism, Glucose/metabolism, Lipid Metabolism, Biological Models, Animals, Cholesterol/blood, Coronary Artery Disease/blood, Coronary Artery Disease/genetics, Type 2 Diabetes Mellitus/blood, Type 2 Diabetes Mellitus/genetics, Female, Gene Expression Regulation, Gene Regulatory Networks, Genetic Predisposition to Disease/genetics, Genome-Wide Association Study/methods, Humans, Hypercholesterolemia/blood, Hypercholesterolemia/genetics, Hypercholesterolemia/metabolism, Inbred C57BL Mice, Single Nucleotide Polymorphism
ABSTRACT
OBJECTIVE: To describe a founder mutation effect and the clinical phenotype of homozygous FRRS1L c.737_739delGAG (p.Gly246del) variant in 15 children of Puerto Rican (Boricua) ancestry presenting with early infantile epileptic encephalopathy (EIEE-37) with prominent movement disorder. BACKGROUND: EIEE-37 is caused by biallelic loss of function variants in the FRRS1L gene, which is critical for AMPA-receptor function, resulting in intractable epilepsy and dyskinesia. METHODS: A retrospective, multicenter chart review of patients sharing the same homozygous FRRS1L (p.Gly246del) pathogenic variant identified by clinical genetic testing. Clinical information was collected regarding neurodevelopmental outcomes, neuroimaging, electrographic features and clinical response to antiseizure medications. RESULTS: Fifteen patients from 12 different families of Puerto Rican ancestry were homozygous for the FRRS1L (p.Gly246del) pathogenic variant, with ages ranging from 1 to 25 years. The onset of seizures was from 6 to 24 months. All had hypotonia, severe global developmental delay, and most had hyperkinetic involuntary movements. Developmental regression during the first year of life was common (86%). Electroencephalogram showed hypsarrhythmia in 66% (10/15), with many older children evolving into Lennox-Gastaut syndrome. Six patients demonstrated progressive volume loss and/or cerebellar atrophy on brain magnetic resonance imaging (MRI). CONCLUSIONS: We describe the largest cohort to date of patients with epileptic encephalopathy. We estimate that 0.76% of unaffected individuals of Puerto Rican ancestry carry this pathogenic variant due to a founder effect. Children homozygous for the FRRS1L (p.Gly246del) Boricua variant exhibit a very homogenous phenotype of early developmental regression and epilepsy, starting with infantile spasms and evolving into Lennox-Gastaut syndrome with hyperkinetic movement disorder.
Subject(s)
Hispanic or Latino/genetics, Lennox Gastaut Syndrome/genetics, Membrane Proteins/genetics, Mutation/genetics, Nerve Tissue Proteins/genetics, Infantile Spasms/genetics, Adolescent, Adult, Child, Preschool Child, Cohort Studies, Electroencephalography, Female, Hispanic or Latino/statistics & numerical data, Humans, Infant, Male, Puerto Rico, Retrospective Studies, Infantile Spasms/physiopathology, Young Adult
ABSTRACT
The emergence of genomic data in biobanks and health systems offers new ways to derive medically important phenotypes, including acute phenotypes occurring during inpatient clinical care. Here we study the genetic underpinnings of the rapid response to phenylephrine, an α1-adrenergic receptor agonist commonly used to treat hypotension during anesthesia and surgery. We quantified this response by extracting blood pressure (BP) measurements 5 min before and after the administration of phenylephrine. Based on this derived phenotype, we show that systematic differences exist between self-reported ancestry groups: European-Americans (EA; n = 1387) have a significantly higher systolic response to phenylephrine than African-Americans (AA; n = 1217) and Hispanic/Latinos (HA; n = 1713) (31.3% increase, p value < 6e-08 and 22.9% increase, p value < 5e-05 respectively), after adjusting for genetic ancestry, demographics, and relevant clinical covariates. We performed a genome-wide association study to investigate genetic factors underlying individual differences in this derived phenotype. We discovered genome-wide significant association signals in loci and genes previously associated with BP measured in ambulatory settings, and a general enrichment of association in these genes. Finally, we discovered two low frequency variants, present at ~1% in EAs and AAs, respectively, where patients carrying one copy of these variants show no phenylephrine response. This work demonstrates our ability to derive a quantitative phenotype suited for comparative statistics and genome-wide association studies from dense clinical and physiological measures captured for managing patients during surgery. We identify genetic variants underlying non response to phenylephrine, with implications for preemptive pharmacogenomic screening to improve safety during surgery.
Subject(s)
Adrenergic Agents/therapeutic use, Phenylephrine/therapeutic use, Black or African American/genetics, Blood Pressure/drug effects, Blood Pressure/genetics, Female, Genome-Wide Association Study/methods, Genomics/methods, Humans, Male, Middle Aged, Perioperative Period/methods, Phenotype, Single Nucleotide Polymorphism/genetics, White People/genetics
ABSTRACT
PURPOSE: Limited data are available on the prevalence and clinical impact of Lynch syndrome (LS)-associated genomic variants in non-European ancestry populations. We identified and characterized individuals harboring LS-associated variants in the ancestrally diverse BioMe Biobank in New York City. PATIENTS AND METHODS: Exome sequence data from 30,223 adult BioMe participants were evaluated for pathogenic, likely pathogenic, and predicted loss-of-function variants in MLH1, MSH2, MSH6, and PMS2. Survey and electronic health record data from variant-positive individuals were reviewed for personal and family cancer histories. RESULTS: We identified 70 individuals (0.2%) harboring LS-associated variants in MLH1 (n = 12; 17%), MSH2 (n = 13; 19%), MSH6 (n = 16; 23%), and PMS2 (n = 29; 41%). The overall prevalence was 1 in 432, with higher prevalence among individuals of self-reported African ancestry (1 in 299) than among Hispanic/Latinx (1 in 654) or European (1 in 518) ancestries. Thirteen variant-positive individuals (19%) had a personal history, and 19 (27%) had a family history of an LS-related cancer. LS-related cancer rates were highest in individuals with MSH6 variants (31%) and lowest in those with PMS2 variants (7%). LS-associated variants were associated with increased risk of colorectal (odds ratio [OR], 5.0; P = .02) and endometrial (OR, 30.1; P = 8.5 × 10⁻⁹) cancers in BioMe. Only 2 variant-positive individuals (3%) had a documented diagnosis of LS. CONCLUSION: We found a higher prevalence of LS-associated variants among individuals of African ancestry in New York City. Although cancer risk is significantly increased among variant-positive individuals, the majority do not harbor a clinical diagnosis of LS, suggesting underrecognition of this disease.
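The "1 in N" prevalence and the per-gene percentages follow directly from the reported counts. A short sketch of that arithmetic is given below. The helper name is hypothetical, and the ancestry-specific figures are not recomputed because the abstract does not give the per-ancestry denominators.

```python
def one_in_n(cases, total):
    """Express a prevalence of cases/total as the N in '1 in N', rounded."""
    return round(total / cases)


cohort_size = 30223
carriers_by_gene = {"MLH1": 12, "MSH2": 13, "MSH6": 16, "PMS2": 29}
total_carriers = sum(carriers_by_gene.values())

# Share of all variant-positive individuals attributable to each gene, in percent.
gene_share_percent = {
    gene: round(100 * count / total_carriers)
    for gene, count in carriers_by_gene.items()
}

print(f"{total_carriers} carriers, 1 in {one_in_n(total_carriers, cohort_size)}")
print(f"overall prevalence: {100 * total_carriers / cohort_size:.1f}%")
print(gene_share_percent)
```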
ABSTRACT
Though discovered over 100 years ago, the molecular foundation of sporadic Alzheimer's disease (AD) remains elusive. To better characterize the complex nature of AD, we constructed multiscale causal networks on a large human AD multi-omics dataset, integrating clinical features of AD, DNA variation, and gene and protein expression. These probabilistic causal models enabled detection, prioritization and replication of high-confidence master regulators of AD-associated networks, including the top predicted regulator, VGF. Overexpression of neuropeptide precursor VGF in 5xFAD mice partially rescued beta-amyloid-mediated memory impairment and neuropathology. Molecular validation of network predictions downstream of VGF was also achieved in this AD model, with significant enrichment for homologous genes identified as differentially expressed in 5xFAD brains overexpressing VGF. Our findings support a causal role for VGF in protecting against AD pathogenesis and progression.
Subject(s)
Alzheimer Disease/etiology , Brain/pathology , Nerve Growth Factors/metabolism , Protein Interaction Maps , Aged , Aged, 80 and over , Alzheimer Disease/pathology , Amyloid beta-Peptides/metabolism , Animals , Datasets as Topic , Disease Models, Animal , Female , Gene Expression Profiling , Gene Regulatory Networks , Genome-Wide Association Study , Humans , Male , Mice , Mice, Transgenic , Nerve Growth Factors/genetics , Protein Interaction Mapping , Proteomics
ABSTRACT
On average, Peruvian individuals are among the shortest in the world¹. Here we show that Native American ancestry is associated with reduced height in an ethnically diverse group of Peruvian individuals, and identify a population-specific, missense variant in the FBN1 gene (E1297G) that is significantly associated with lower height. Each copy of the minor allele (frequency of 4.7%) reduces height by 2.2 cm (4.4 cm in homozygous individuals). To our knowledge, this is the largest effect size known for a common height-associated variant. FBN1 encodes the extracellular matrix protein fibrillin 1, which is a major structural component of microfibrils. We observed less densely packed fibrillin-1-rich microfibrils with irregular edges in the skin of individuals who were homozygous for G1297 compared with individuals who were homozygous for E1297. Moreover, we show that the E1297G locus is under positive selection in non-African populations, and that the E1297 variant shows subtle evidence of positive selection specifically within the Peruvian population. This variant is also significantly more frequent in coastal Peruvian populations than in populations from the Andes or the Amazon, which suggests that short stature might be the result of adaptation to factors that are associated with the coastal environment in Peru.
Subject(s)
Body Height/genetics , Fibrillin-1/genetics , Mutation, Missense , Selection, Genetic , Female , Gene Frequency , Genome-Wide Association Study , Heredity , Humans , Indians, South American/genetics , Male , Microfibrils/chemistry , Microfibrils/genetics , Peru
ABSTRACT
BACKGROUND & AIMS: The Ile148Met variant (rs738409) in the PNPLA3 gene has the largest effect on non-alcoholic fatty liver disease (NAFLD), increasing the risk of progression to severe forms of liver disease. It remains unknown if the variant plays a role in age of NAFLD onset. We aimed to determine if rs738409 affects the age of NAFLD diagnosis. METHODS: We applied a novel natural language processing (NLP) algorithm to a longitudinal electronic health records (EHR) dataset of >27,000 individuals with genetic data from a multi-ethnic biobank, defining NAFLD cases (n = 1,703) and confirming controls (n = 8,119). We conducted i) a survival analysis to determine if age at diagnosis differed by rs738409 genotype, ii) a receiver operating characteristics analysis to assess the utility of the rs738409 genotype in discriminating NAFLD cases from controls, and iii) a phenome-wide association study (PheWAS) between rs738409 and 10,095 EHR-derived disease diagnoses. RESULTS: The PNPLA3 G risk allele was associated with: i) earlier age of NAFLD diagnosis, with the strongest effect in Hispanics (hazard ratio 1.33; 95% CI 1.15-1.53; p < 0.0001) among whom a NAFLD diagnosis was 15% more likely in risk allele carriers vs. non-carriers; ii) increased NAFLD risk (odds ratio 1.61; 95% CI 1.349-1.73; p < 0.0001), with the strongest effect among Hispanics (odds ratio 1.43; 95% CI 1.28-1.59; p < 0.0001); iii) additional liver diseases in a PheWAS (p < 4.95 × 10⁻⁶) where the risk variant was also associated with earlier age of diagnosis. CONCLUSION: Given the role of rs738409 in NAFLD diagnosis age, our results suggest that stratifying risk within populations known to have an enhanced risk of liver disease, such as Hispanic carriers of the rs738409 variant, would be effective in earlier identification of those who would benefit most from early NAFLD prevention and treatment strategies.
LAY SUMMARY: Despite clear associations between the PNPLA3 rs738409 variant and elevated risk of progression from non-alcoholic fatty liver disease (NAFLD) to more severe forms of liver disease, it remains unknown if PNPLA3 rs738409 plays a role in the age of NAFLD onset. Herein, we found that this risk variant is associated with an earlier age of NAFLD and other liver disease diagnoses, an observation most pronounced in Hispanic Americans. We conclude that PNPLA3 rs738409 could be used to better understand liver disease risk within vulnerable populations and identify patients who may benefit from early prevention strategies.