ABSTRACT
Polygenic scores (PGSs) have limited portability across different groupings of individuals (for example, by genetic ancestries and/or social determinants of health), preventing their equitable use [1-3]. PGS portability has typically been assessed using a single aggregate population-level statistic (for example, R²) [4], ignoring inter-individual variation within the population. Here, using a large and diverse Los Angeles biobank [5] (ATLAS, n = 36,778) along with the UK Biobank [6] (UKBB, n = 487,409), we show that PGS accuracy decreases individual-to-individual along the continuum of genetic ancestries [7] in all considered populations, even within traditionally labelled 'homogeneous' genetic ancestries. The decreasing trend is well captured by a continuous measure of genetic distance (GD) from the PGS training data: Pearson correlation of -0.95 between GD and PGS accuracy averaged across 84 traits. When applying PGS models trained on individuals labelled as white British in the UKBB to individuals with European ancestries in ATLAS, individuals in the furthest GD decile have 14% lower accuracy relative to the closest decile; notably, the closest GD decile of individuals with Hispanic Latino American ancestries shows similar PGS performance to the furthest GD decile of individuals with European ancestries. GD is significantly correlated with PGS estimates themselves for 82 of 84 traits, further emphasizing the importance of incorporating the continuum of genetic ancestries in PGS interpretation. Our results highlight the need to move away from discrete genetic ancestry clusters towards the continuum of genetic ancestries when considering PGSs.
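As a rough illustration of the quantities named in this abstract, the sketch below computes a genetic distance, its Pearson correlation with accuracy, and a relative accuracy drop between two deciles. The abstract does not define GD; the code assumes it is the Euclidean distance in principal-component space between an individual and the centroid of the training sample. The function names are hypothetical and all numbers are made-up toy values, not results from the study.

```python
import math

def genetic_distance(pcs, training_center):
    """Euclidean distance between one individual's PC coordinates and the
    training-sample centroid (an assumed definition of GD)."""
    return math.sqrt(sum((p - c) ** 2 for p, c in zip(pcs, training_center)))

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def relative_drop(closest, furthest):
    """Fractional loss of accuracy going from the closest to the furthest bin."""
    return (closest - furthest) / closest

# Toy data: four GD bins summarised by a representative point in 2-D PC space,
# each with an invented accuracy value.
training_center = (0.0, 0.0)
bin_positions = [(0.1, 0.0), (0.3, 0.4), (0.6, 0.8), (1.2, 0.9)]
accuracy = [0.50, 0.46, 0.40, 0.31]

gd = [genetic_distance(p, training_center) for p in bin_positions]
r = pearson(gd, accuracy)  # negative when accuracy falls with distance
```

With these toy values the correlation is strongly negative, which is the direction of the trend the abstract reports; the magnitude here says nothing about the real data.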
Subject(s)
Multifactorial Inheritance , Racial Groups , Humans , Europe/ethnology , Hispanic or Latino/genetics , Multifactorial Inheritance/genetics , Racial Groups/genetics , United Kingdom , White People/genetics , European People/genetics , Los Angeles , Databases, Genetic
ABSTRACT
Venous thromboembolism (VTE) is a significant contributor to morbidity and mortality, with large disparities in incidence rates between Black and White Americans. Polygenic risk scores (PRSs) limited to variants discovered in genome-wide association studies in European-ancestry samples can identify European-ancestry individuals at high risk of VTE. However, there is limited evidence on whether high-dimensional PRS constructed using more sophisticated methods and more diverse training data can enhance the predictive ability and their utility across diverse populations. We developed PRSs for VTE using summary statistics from the International Network against Venous Thrombosis (INVENT) consortium genome-wide association studies meta-analyses of European- (71 771 cases and 1 059 740 controls) and African-ancestry samples (7482 cases and 129 975 controls). We used LDpred2 and PRS-CSx to construct ancestry-specific and multi-ancestry PRSs and evaluated their performance in an independent European- (6781 cases and 103 016 controls) and African-ancestry sample (1385 cases and 12 569 controls). Multi-ancestry PRSs with weights tuned in European-ancestry samples slightly outperformed ancestry-specific PRSs in European-ancestry test samples (e.g. the area under the receiver operating curve [AUC] was 0.609 for PRS-CSx_combinedEUR and 0.608 for PRS-CSxEUR [P = 0.00029]). Multi-ancestry PRSs with weights tuned in African-ancestry samples also outperformed ancestry-specific PRSs in African-ancestry test samples (PRS-CSxAFR: AUC = 0.58, PRS-CSx_combined AFR: AUC = 0.59), although this difference was not statistically significant (P = 0.34). The highest fifth percentile of the best-performing PRS was associated with 1.9-fold and 1.68-fold increased risk for VTE among European- and African-ancestry subjects, respectively, relative to those in the middle stratum. 
These findings suggest that the multi-ancestry PRS might be used to improve performance across diverse populations to identify individuals at highest risk for VTE.
Subject(s)
Genetic Risk Score , Venous Thromboembolism , Female , Humans , Male , Black or African American/genetics , Case-Control Studies , Genetic Predisposition to Disease , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Venous Thromboembolism/genetics , Venous Thromboembolism/epidemiology , White/genetics
ABSTRACT
We report a 25-year-old female confirmed to have Smith-Magenis syndrome (SMS) due to a de novo RAI1 variant. Her past history is significant for developmental and intellectual delay, early and escalating maladaptive behaviors, and features consistent with significant sleep disturbance, the etiology of which was not confirmed for over two decades. The diagnosis of SMS was initially suspected in 1998 (at age 12 years), but that was 5 years before the initial report of RAI1 variants as causative of the SMS phenotype; cytogenetic fluorescence in situ hybridization studies failed to confirm an interstitial deletion of 17p11.2. Re-evaluation for suspected SMS was pursued with RAI1 sequencing analysis in response to urgent parental concerns of escalating behaviors and aggression with subsequent incarceration of the subject for assault of a health professional. Genetic analysis revealed a de novo RAI1 (NM_030665.3) nonsense variant, c.5536C>T; p.Q1846X. This case illustrates the importance of confirming the SMS diagnosis, which is associated with cognitive and functional impairment, as well as significant psychiatric co-morbidities and behavioral problems. The diagnosis was particularly relevant to the legal discussion and determination of her competence to stand trial. As other similar cases may exist, this report will help to increase awareness of the possibility of a very late diagnosis of SMS, with the need for re-evaluation of individuals suspected to have SMS who were initially evaluated prior to the identification of the RAI1 gene. © 2016 Wiley Periodicals, Inc.
Subject(s)
Codon, Nonsense , Genetic Association Studies , Phenotype , Smith-Magenis Syndrome/diagnosis , Smith-Magenis Syndrome/genetics , Transcription Factors/genetics , Adult , Chromosome Deletion , Chromosomes, Human, Pair 17 , DNA Copy Number Variations , DNA Mutational Analysis , Delayed Diagnosis , Facies , Female , Humans , Pedigree , Polymorphism, Single Nucleotide , Trans-Activators
ABSTRACT
Tobacco use is a major risk factor for many diseases and is heavily influenced by environmental factors with significant underlying genetic contributions. Here, we evaluated the predictive performance, risk stratification, and potential systemic health effects of tobacco use disorder (TUD) predisposing germline variants using a European-ancestry-derived polygenic score (PGS) in 24,202 participants from the multi-ancestry, hospital-based UCLA ATLAS biobank. Among genetically inferred ancestry groups (GIAs), TUD-PGS was significantly associated with TUD in European American (EA) (OR: 1.20, CI: [1.16, 1.24]), Hispanic/Latin American (HL) (OR: 1.19, CI: [1.11, 1.28]), and East Asian American (EAA) (OR: 1.18, CI: [1.06, 1.31]) GIAs but not in African American (AA) GIA (OR: 1.04, CI: [0.93, 1.17]). Similarly, TUD-PGS offered strong risk stratification across PGS quantiles in EA and HL GIAs and inconsistently in EAA and AA GIAs. In a cross-ancestry phenome-wide association meta-analysis, TUD-PGS was associated with cardiometabolic, respiratory, and psychiatric phecodes (17 phecodes at P < 2.7E-05). In individuals with no history of smoking, the top TUD-PGS associations with obesity and alcohol-related disorders (P = 3.54E-07, 1.61E-06) persist. Mendelian Randomization (MR) analysis provides evidence of a causal association between adiposity measures and tobacco use. Inconsistent predictive performance of the TUD-PGS across GIAs motivates the inclusion of multiple ancestry populations at all levels of genetic research of tobacco use for equitable clinical translation of TUD-PGS. Phenome associations suggest that TUD-predisposed individuals may require comprehensive tobacco use prevention and management approaches to address underlying addictive tendencies.
Subject(s)
Biological Specimen Banks , Tobacco Use Disorder , Humans , Los Angeles , Tobacco Use , Tobacco Use Disorder/genetics , Risk Factors , Obesity , Genome-Wide Association Study
ABSTRACT
AIMS: In patients with atrial fibrillation (AF) and stroke risk factors, randomized trials have demonstrated that anticoagulation decreases the risk of ischemic stroke. However, all trials to date have excluded patients with significant liver disease, leaving guidelines to extrapolate recommendations. We aim to evaluate the impact of anticoagulation on safety events in patients with AF and cirrhosis. METHODS AND RESULTS: In this retrospective cohort study, we obtained de-identified health record data to extract anticoagulation strategy, comorbidities, prescriptions, lab values, and procedures for a cohort of patients with cirrhosis who develop AF. After selecting a propensity matched population to match patients with various anticoagulation strategies, we tracked data on outcomes for death, transfusion requirements, hospital and ICU admissions. After propensity score weighting and multivariable adjustment, anticoagulation strategy was associated with increased hospital admission count (OR = 1.74 per admission, P < .001), binary risk of hospital admission (OR = 1.54, P = .010) and risk of ICU admission (OR = 1.41, P = .047). We detected no significant differences in mortality, transfusion of blood products, or average length of stay. Direct oral anticoagulant (DOAC) prescriptions were associated with increased binary risk of hospital admission compared to warfarin prescriptions. In a third comparison, DOAC strategy alone was associated with increased hospital admission count (OR = 1.41 per admission, P < .001) and binary risk of hospital admission (OR = 1.52, P = .038) compared to no anticoagulation strategy. CONCLUSION: Anticoagulation strategy in patients with cirrhosis and AF was associated with increased rate of hospital admission and ICU admission but not associated with increased risk of mortality or transfusion requirement.
Subject(s)
Anticoagulants , Atrial Fibrillation , Liver Cirrhosis , Humans , Atrial Fibrillation/drug therapy , Atrial Fibrillation/complications , Atrial Fibrillation/mortality , Atrial Fibrillation/diagnosis , Male , Retrospective Studies , Female , Liver Cirrhosis/complications , Liver Cirrhosis/mortality , Aged , Middle Aged , Anticoagulants/adverse effects , Anticoagulants/therapeutic use , Anticoagulants/administration & dosage , Risk Factors , Treatment Outcome , Blood Transfusion , Risk Assessment , Warfarin/adverse effects , Warfarin/therapeutic use , Time Factors , Hemorrhage/chemically induced , Patient Admission , Stroke/prevention & control , Stroke/mortality , Stroke/epidemiology
ABSTRACT
Polygenic scores (PGS) have emerged as the tool of choice for genomic prediction in a wide range of fields. We show that PGS performance varies broadly across contexts and biobanks. Contexts such as age, sex and income can impact PGS accuracy with similar magnitudes as genetic ancestry. Here we introduce an approach (CalPred) that models all contexts jointly to produce prediction intervals that vary across contexts to achieve calibration (include the trait with 90% probability), whereas existing methods are miscalibrated. In analyses of 72 traits across large and diverse biobanks (All of Us and UK Biobank), we find that prediction intervals required adjustment by up to 80% for quantitative traits. For disease traits, PGS-based predictions were miscalibrated across socioeconomic contexts such as annual household income levels, further highlighting the need of accounting for context information in PGS-based prediction across diverse populations.
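The notion of calibration used in this abstract (a 90% prediction interval should contain the observed trait about 90% of the time in every context) can be checked empirically. The sketch below is not the CalPred model; it is a simplified illustration that assumes normally distributed residuals and a hand-picked residual standard deviation per context. The context labels, function names, and values are all hypothetical.

```python
from statistics import NormalDist

# Half-width multiplier for a two-sided 90% interval under a normal assumption.
Z90 = NormalDist().inv_cdf(0.95)

def interval(pred, sd, z=Z90):
    """Symmetric prediction interval around a point prediction."""
    return (pred - z * sd, pred + z * sd)

def coverage(records, sd_by_context):
    """Fraction of observations inside their interval, per context.

    records: iterable of (context, prediction, observed) tuples.
    sd_by_context: residual standard deviation to use for each context.
    """
    hits, totals = {}, {}
    for ctx, pred, obs in records:
        lo, hi = interval(pred, sd_by_context[ctx])
        totals[ctx] = totals.get(ctx, 0) + 1
        hits[ctx] = hits.get(ctx, 0) + (lo <= obs <= hi)
    return {ctx: hits[ctx] / totals[ctx] for ctx in totals}

# Toy records: predictions are all 0.0, so the observed value is the residual.
# Context "A" has small residuals and context "B" has large ones.
records = (
    [("A", 0.0, v) for v in (-0.5, 0.2, 0.4, -0.1, 0.3)]
    + [("B", 0.0, v) for v in (-3.0, 2.5, 1.0, -2.0, 3.5)]
)

shared = coverage(records, {"A": 1.0, "B": 1.0})    # one SD for everyone
specific = coverage(records, {"A": 0.4, "B": 2.5})  # SD varies by context
```

In this toy setup a single shared interval width covers context "B" far too rarely, while context-specific widths cover both groups. Five points per group are too few to estimate a 90% rate; the example only shows the mechanics of the check.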
Subject(s)
Genome-Wide Association Study , Models, Genetic , Multifactorial Inheritance , Humans , Multifactorial Inheritance/genetics , Genome-Wide Association Study/methods , Female , Male , Calibration , Biological Specimen Banks , Phenotype , Genomics/methods , Polymorphism, Single Nucleotide
ABSTRACT
Polygenic scores (PGSs) summarize the combined effect of common risk variants and are associated with breast cancer risk in patients without identifiable monogenic risk factors. One of the most well-validated PGSs in breast cancer to date is PGS313, which was developed from a Northern European biobank but has shown attenuated performance in non-European ancestries. We further investigate the generalizability of the PGS313 for American women of European (EA), African (AFR), Asian (EAA), and Latinx (HL) ancestry within one institution with a singular electronic health record (EHR) system, genotyping platform, and quality control process. We found that the PGS313 achieved overlapping areas under the receiver operating characteristic (ROC) curve (AUCs) in females of HL (AUC = 0.68, 95% confidence interval [CI] = 0.65-0.71) and EA ancestry (AUC = 0.70, 95% CI = 0.69-0.71) but lower AUCs for the AFR and EAA populations (AFR: AUC = 0.61, 95% CI = 0.56-0.65; EAA: AUC = 0.64, 95% CI = 0.60-0.68). While PGS313 is associated with hormone-receptor-positive (HR+) disease in EA Americans (odds ratio [OR] = 1.42, 95% CI = 1.16-1.64), this association is lost in African, Latinx, and Asian Americans. In summary, we found that PGS313 was significantly associated with breast cancer but with attenuated accuracy in women of AFR and EAA descent within a singular health system in Los Angeles. Our work further highlights the need for additional validation in diverse cohorts prior to the clinical implementation of PGSs.
Subject(s)
Biological Specimen Banks , Breast Neoplasms , Genetic Predisposition to Disease , Humans , Breast Neoplasms/genetics , Breast Neoplasms/epidemiology , Breast Neoplasms/ethnology , Female , Los Angeles/epidemiology , Middle Aged , Risk Factors , Multifactorial Inheritance , ROC Curve , Adult , Aged , Polymorphism, Single Nucleotide
ABSTRACT
Importance: Polygenic risk scores (PRSs) for coronary artery disease (CAD) are a growing clinical and commercial reality. Whether existing scores provide similar individual-level assessments of disease liability is a critical consideration for clinical implementation that remains uncharacterized. Objective: Characterize the reliability of CAD PRSs that perform equivalently at the population level at predicting individual-level risk. Design: Cross-sectional Study. Setting: All of Us Research Program (AOU), Penn Medicine Biobank (PMBB), and UCLA ATLAS Precision Health Biobank. Participants: Volunteers of diverse genetic backgrounds enrolled in AOU, PMBB, and UCLA with available electronic health record and genotyping data. Exposures: Polygenic risk for CAD from previously published PRSs and new PRSs developed separately from the testing cohorts. Main Outcomes and Measures: Sets of CAD PRSs that perform population prediction equivalently were identified by comparing calibration and discrimination (Brier score and AUROC) of generalized linear models of prevalent CAD using Bayesian analysis of variance. Among equivalently performing scores, individual-level agreement between risk estimates was tested with intraclass correlation (ICC) and Light's Kappa, measures of inter-rater reliability. Results: 50 PRSs were calculated for 171,095 AOU participants. When included in a model of prevalent CAD, 48 scores had practically equivalent Brier scores and AUROCs (region of practical equivalence = 0.02). Across these scores, 84% of participants had at least one score in both the top and bottom risk quintile. Continuous agreement of individual risk predictions from the 48 scores was poor, with an ICC of 0.351 (95% CI; 0.349, 0.352). Agreement between two statistically equivalent scores was moderate, with an ICC of 0.649 (95% CI; 0.646, 0.652). 
Light's Kappa, used to evaluate consistency of assignment to high-risk thresholds, did not exceed 0.56 (interpreted as 'fair') across statistically and practically equivalent scores. Repeating the analysis among 41,193 PMBB and 50,748 UCLA participants yielded different sets of statistically and practically equivalent scores which also lacked strong individual agreement. Conclusions and Relevance: Across three diverse biobanks, CAD PRSs that performed equivalently at the population level produced unreliable individual risk estimates. Approaches to clinical implementation of CAD PRSs must consider the potential for discordant individual risk estimates from otherwise indistinguishable scores.
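The quintile-discordance figure in this abstract (the share of participants who land in the top risk quintile under one score and the bottom quintile under another) can be illustrated with a short sketch. This is an assumption-laden toy version, not the authors' pipeline: quintiles are assigned by simple rank within each score, ties are broken by input order, and the function names and data are hypothetical.

```python
def quintile_labels(values):
    """Label each value with its quintile, 0 (lowest) to 4 (highest),
    based on its rank within this one score."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = min(4, rank * 5 // n)
    return labels

def discordant_fraction(score_columns):
    """Fraction of participants with at least one score in the bottom
    quintile and at least one score in the top quintile.

    score_columns: one list per PRS, all aligned by participant index.
    """
    per_score = [quintile_labels(col) for col in score_columns]
    n = len(score_columns[0])
    discordant = 0
    for i in range(n):
        seen = {labels[i] for labels in per_score}
        if 0 in seen and 4 in seen:
            discordant += 1
    return discordant / n

# Toy example: two scores over five participants. The first and last
# participants swap between the extremes; the middle three agree.
score_a = [1.0, 2.0, 3.0, 4.0, 5.0]
score_b = [5.0, 2.0, 3.0, 4.0, 1.0]
fraction = discordant_fraction([score_a, score_b])
```

Two scores can have identical population-level discrimination and still disagree for individual participants in this way, which is the kind of discordance the abstract describes.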
ABSTRACT
Venous thromboembolism (VTE) is a significant contributor to morbidity and mortality, with large disparities in incidence rates between Black and White Americans. Polygenic risk scores (PRSs) limited to variants discovered in genome-wide association studies in European-ancestry samples can identify European-ancestry individuals at high risk of VTE. However, there is limited evidence on whether high-dimensional PRS constructed using more sophisticated methods and more diverse training data can enhance the predictive ability and their utility across diverse populations. We developed PRSs for VTE using summary statistics from the International Network against Venous Thrombosis (INVENT) consortium GWAS meta-analyses of European- (71,771 cases and 1,059,740 controls) and African-ancestry samples (7,482 cases and 129,975 controls). We used LDpred2 and PRSCSx to construct ancestry-specific and multi-ancestry PRSs and evaluated their performance in an independent European- (6,261 cases and 88,238 controls) and African-ancestry sample (1,385 cases and 12,569 controls). Multi-ancestry PRSs with weights tuned in European- and African-ancestry samples, respectively, outperformed ancestry-specific PRSs in European- (PRSCSXEUR: AUC=0.61 (0.60, 0.61), PRSCSX_combinedEUR: AUC=0.61 (0.60, 0.62)) and African-ancestry test samples (PRSCSXAFR: AUC=0.58 (0.57, 0.6), PRSCSX_combined AFR: AUC=0.59 (0.57, 0.60)). The highest fifth percentile of the best-performing PRS was associated with 1.9-fold and 1.68-fold increased risk for VTE among European- and African-ancestry subjects, respectively, relative to those in the middle stratum. These findings suggest that the multi-ancestry PRS may be used to identify individuals at highest risk for VTE and provide guidance for the most effective treatment strategy across diverse populations.
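The AUC values compared in this abstract have a simple rank interpretation: the probability that a randomly chosen case has a higher score than a randomly chosen control. The sketch below computes that quantity directly by pairwise comparison. It is a minimal illustration with made-up scores and a hypothetical function name, not the evaluation code used in the study, and its quadratic cost would be impractical at biobank scale.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve as the fraction of case-control pairs in
    which the case scores higher; tied pairs count as half."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Toy polygenic scores for three cases and four controls.
cases = [0.9, 0.4, 0.7]
controls = [0.1, 0.5, 0.3, 0.7]
toy_auc = auc(cases, controls)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which puts reported values near 0.6 in context.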
ABSTRACT
Background: Bilirubin is a potent antioxidant with a protective role in many diseases. We examined the relationships between serum bilirubin (SB) levels, tobacco smoking (a known cause of low SB), and aerodigestive cancers, grouped as lung (LC) and head and neck (HNC). Methods: We examined the associations between SB, LC and HNC using data from 393,210 participants from UCLA Health, employing regression models, propensity score matching, and polygenic scores. Results: Current tobacco smokers showed lower SB (-0.04 mg/dL, 95% CI: [-0.04, -0.03]) compared to never-smokers. Lower SB levels were observed in HNC and LC cases (-0.10 mg/dL, CI [-0.13, -0.09] and -0.09 mg/dL, CI [-0.1, -0.07], respectively) compared to cancer-free controls, with the effect persisting after adjusting for smoking. SB levels were inversely associated with HNC and LC risk (ORs per SD change in SB: 0.64, CI [0.59, 0.69] and 0.57, CI [0.43, 0.75], respectively). Lastly, a polygenic score (PGS) for SB was associated with LC (OR per SD change of SB-PGS: 0.71, CI [0.67, 0.76]). Conclusions: Low SB levels are associated with an increased risk of both HNC and LC, independent of the effect of tobacco smoking, with tobacco smoking demonstrating a strong interaction with SB on LC risk. Additionally, genetically predicted low SB (from polygenic scores) is negatively associated with LC. Impact: These findings suggest that SB could serve as a potential early biomarker for LC and HNC.
ABSTRACT
Background: Bilirubin is a potent antioxidant with a protective role in many diseases. We examined the relationships between serum bilirubin (SB) levels, tobacco smoking (a known cause of low SB), and aerodigestive cancers, grouped as lung cancers (LC) and head and neck cancers (HNC). Methods: We examined the associations between SB, LC, and HNC using data from 393,210 participants from a real-world, diverse, de-identified data repository and biobank linked to the UCLA Health system. We employed regression models, propensity score matching, and polygenic scores to investigate the associations and interactions between SB, tobacco smoking, LC, and HNC. Results: Current tobacco smokers showed lower SB (-0.04 mg/dL, 95% CI: [-0.04, -0.03]) compared to never-smokers. Lower SB levels were observed in HNC and LC cases (-0.10 mg/dL, CI [-0.13, -0.09] and -0.09 mg/dL, CI [-0.1, -0.07], respectively) compared to cancer-free controls, with the effect persisting after adjusting for smoking. SB levels were inversely associated with HNC and LC risk (ORs per SD change in SB: 0.64, CI [0.59, 0.69] and 0.57, CI [0.43, 0.75], respectively). Lastly, a polygenic score (PGS) for SB was associated with LC (OR per SD change of SB-PGS: 0.71, CI [0.67, 0.76]). Conclusions: Low SB levels are associated with an increased risk of both HNC and LC, independent of the effect of tobacco smoking. Additionally, tobacco smoking demonstrated a strong interaction with SB on LC risk. Lastly, genetically predicted low SB (using a polygenic score) is negatively associated with LC. These findings suggest that SB could serve as a potential early and low-cost biomarker for LC and HNC. The interaction with tobacco smoking suggests that smokers with lower bilirubin could be at higher risk for LC compared to never-smokers, suggesting the utility of SB in risk stratification for patients at risk for LC.
Lastly, the results of the polygenic score analyses suggest potential shared biological pathways between the genetic control of SB and the risk of LC development.
ABSTRACT
BACKGROUND: Large medical centers in urban areas, like Los Angeles, care for a diverse patient population and offer the potential to study the interplay between genetic ancestry and social determinants of health. Here, we explore the implications of genetic ancestry within the University of California, Los Angeles (UCLA) ATLAS Community Health Initiative, an ancestrally diverse biobank of genomic data linked with de-identified electronic health records (EHRs) of UCLA Health patients (N=36,736). METHODS: We quantify the extensive continental and subcontinental genetic diversity within the ATLAS data through principal component analysis, identity-by-descent, and genetic admixture. We assess the relationship between genetically inferred ancestry (GIA) and >1500 EHR-derived phenotypes (phecodes). Finally, we demonstrate the utility of genetic data linked with EHR to perform ancestry-specific and multi-ancestry genome and phenome-wide scans across a broad set of disease phenotypes. RESULTS: We identify 5 continental-scale GIA clusters including European American (EA), African American (AA), Hispanic Latino American (HL), South Asian American (SAA) and East Asian American (EAA) individuals and 7 subcontinental GIA clusters within the EAA GIA corresponding to Chinese American, Vietnamese American, and Japanese American individuals. Although we broadly find that self-identified race/ethnicity (SIRE) is highly correlated with GIA, we still observe marked differences between the two, emphasizing that the populations defined by these two criteria are not analogous. We find a total of 259 significant associations between continental GIA and phecodes even after accounting for individuals' SIRE, demonstrating that for some phenotypes, GIA provides information not already captured by SIRE. GWAS identifies significant associations for liver disease in the 22q13.31 locus across the HL and EAA GIA groups (HL p-value=2.32×10⁻¹⁶, EAA p-value=6.73×10⁻¹¹).
A subsequent PheWAS at the top SNP reveals significant associations with neurologic and neoplastic phenotypes specifically within the HL GIA group. CONCLUSIONS: Overall, our results explore the interplay between SIRE and GIA within a disease context and underscore the utility of studying the genomes of diverse individuals through biobank-scale genotyping linked with EHR-based phenotyping.
Subject(s)
Electronic Health Records , Public Health , Asian People , Biological Specimen Banks , Genomics , Humans
ABSTRACT
Even in well-described genetic syndromes, such as neurofibromatosis type 1, expansion of the phenotype should be considered as a possible explanation for atypical presentations. However, it is critical to complete the evaluation for a potential dual diagnosis, as there could be significant prognostic and management implications.
ABSTRACT
Rapid advances in DNA synthesis techniques have made it possible to engineer viruses and biochemical pathways and to assemble bacterial genomes. Here, we report the synthesis of a functional 272,871-base pair designer eukaryotic chromosome, synIII, which is based on the 316,617-base pair native Saccharomyces cerevisiae chromosome III. Changes to synIII include TAG/TAA stop-codon replacements, deletion of subtelomeric regions, introns, transfer RNAs, transposons, and silent mating loci as well as insertion of loxPsym sites to enable genome scrambling. SynIII is functional in S. cerevisiae. Scrambling of the chromosome in a heterozygous diploid reveals a large increase in a-mater derivatives resulting from loss of the MATα allele on synIII. The complete design and synthesis of synIII establishes S. cerevisiae as the basis for designer eukaryotic genome biology.