Search | Virtual Health Library

1.

Polygenic scoring accuracy varies across the genetic ancestry continuum.

Ding, Yi; Hou, Kangcheng; Xu, Ziqi; Pimplaskar, Aditya; Petter, Ella; Boulier, Kristin; Privé, Florian; Vilhjálmsson, Bjarni J; Olde Loohuis, Loes M; Pasaniuc, Bogdan.

Nature ; 618(7966): 774-781, 2023 Jun.

Article in English | MEDLINE | ID: mdl-37198491

ABSTRACT

Polygenic scores (PGSs) have limited portability across different groupings of individuals (for example, by genetic ancestries and/or social determinants of health), preventing their equitable use [1-3]. PGS portability has typically been assessed using a single aggregate population-level statistic (for example, R²) [4], ignoring inter-individual variation within the population. Here, using a large and diverse Los Angeles biobank [5] (ATLAS, n = 36,778) along with the UK Biobank [6] (UKBB, n = 487,409), we show that PGS accuracy decreases individual-to-individual along the continuum of genetic ancestries [7] in all considered populations, even within traditionally labelled 'homogeneous' genetic ancestries. The decreasing trend is well captured by a continuous measure of genetic distance (GD) from the PGS training data: Pearson correlation of -0.95 between GD and PGS accuracy averaged across 84 traits. When applying PGS models trained on individuals labelled as white British in the UKBB to individuals with European ancestries in ATLAS, individuals in the furthest GD decile have 14% lower accuracy relative to the closest decile; notably, the closest GD decile of individuals with Hispanic Latino American ancestries show similar PGS performance to the furthest GD decile of individuals with European ancestries. GD is significantly correlated with PGS estimates themselves for 82 of 84 traits, further emphasizing the importance of incorporating the continuum of genetic ancestries in PGS interpretation. Our results highlight the need to move away from discrete genetic ancestry clusters towards the continuum of genetic ancestries when considering PGSs.

Subject(s)

Multifactorial Inheritance , Racial Groups , Humans , Europe/ethnology , Hispanic or Latino/genetics , Multifactorial Inheritance/genetics , Racial Groups/genetics , United Kingdom , White People/genetics , European People/genetics , Los Angeles , Databases, Genetic
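
The abstract of record 1 summarizes the trend with a Pearson correlation between genetic distance (GD) and PGS accuracy, and with the relative accuracy drop between the closest and furthest GD deciles. A minimal sketch of those two quantities follows. The bin values are made up for illustration and are not results from the paper; the function names are hypothetical, and how GD itself is computed is not specified in the abstract and is not modelled here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def relative_drop(closest, furthest):
    """Fractional loss of accuracy from the closest to the furthest bin."""
    return (closest - furthest) / closest


# Made-up illustrative values, NOT results from the paper:
# mean genetic distance and mean PGS accuracy in five GD bins.
gd_bins = [0.1, 0.2, 0.3, 0.4, 0.5]
accuracy_bins = [0.50, 0.47, 0.45, 0.44, 0.43]

r = pearson(gd_bins, accuracy_bins)  # strongly negative for these toy bins
drop = relative_drop(accuracy_bins[0], accuracy_bins[-1])
```

With real data, each bin would hold the individuals of one GD decile and the accuracy would be estimated within that bin.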

2.

Inferring disease architecture and predictive ability with LDpred2-auto.

Privé, Florian; Albiñana, Clara; Arbel, Julyan; Pasaniuc, Bogdan; Vilhjálmsson, Bjarni J.

Am J Hum Genet ; 110(12): 2042-2055, 2023 Dec 07.

Article in English | MEDLINE | ID: mdl-37944514

ABSTRACT

LDpred2 is a widely used Bayesian method for building polygenic scores (PGSs). LDpred2-auto can infer the two parameters from the LDpred model, the SNP heritability h² and polygenicity p, so that it does not require an additional validation dataset to choose best-performing parameters. The main aim of this paper is to properly validate the use of LDpred2-auto for inferring multiple genetic parameters. Here, we present a new version of LDpred2-auto that adds an optional third parameter α to its model, for modeling negative selection. We then validate the inference of these three parameters (or two, when using the previous model). We also show that LDpred2-auto provides per-variant probabilities of being causal that are well calibrated and can therefore be used for fine-mapping purposes. We also introduce a formula to infer the out-of-sample predictive performance r² of the resulting PGS directly from the Gibbs sampler of LDpred2-auto. Finally, we extend the set of HapMap3 variants recommended to use with LDpred2 with 37% more variants to improve the coverage of this set, and we show that this new set of variants captures 12% more heritability and provides 6% more predictive performance, on average, in UK Biobank analyses.

Subject(s)

Genome-Wide Association Study , Multifactorial Inheritance , Humans , Bayes Theorem , Genome-Wide Association Study/methods , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics

3.

Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort.

Privé, Florian; Aschard, Hugues; Carmi, Shai; Folkersen, Lasse; Hoggart, Clive; O'Reilly, Paul F; Vilhjálmsson, Bjarni J.

Am J Hum Genet ; 109(1): 12-23, 2022 01 06.

Article in English | MEDLINE | ID: mdl-34995502

ABSTRACT

The low portability of polygenic scores (PGSs) across global populations is a major concern that must be addressed before PGSs can be used for everyone in the clinic. Indeed, prediction accuracy has been shown to decay as a function of the genetic distance between the training and test cohorts. However, such cohorts differ not only in their genetic distance but also in their geographical distance and their data collection and assaying, conflating multiple factors. In this study, we examine the extent to which PGSs are transferable between ancestries by deriving polygenic scores for 245 curated traits from the UK Biobank data and applying them in nine ancestry groups from the same cohort. By restricting both training and testing to the UK Biobank data, we reduce the risk of environmental and genotyping confounding from using different cohorts. We define the nine ancestry groups at a sub-continental level, based on a simple, robust, and effective method that we introduce here. We then apply two different predictive methods to derive polygenic scores for all 245 phenotypes and show a systematic and dramatic reduction in portability of PGSs trained using Northwestern European individuals and applied to nine ancestry groups. These analyses demonstrate that prediction already drops off within European ancestries and reduces globally in proportion to genetic distance. Altogether, our study provides unique and robust insights into the PGS portability problem.

Subject(s)

Genetic Association Studies/methods , Genetic Predisposition to Disease , Genetics, Population/methods , Multifactorial Inheritance , Algorithms , Alleles , Biological Specimen Banks , Genetic Variation , Genome-Wide Association Study , Genotype , Humans , Models, Genetic , Phenotype , Reproducibility of Results , United Kingdom

4.

Accounting for age of onset and family history improves power in genome-wide association studies.

Pedersen, Emil M; Agerbo, Esben; Plana-Ripoll, Oleguer; Grove, Jakob; Dreier, Julie W; Musliner, Katherine L; Bækvad-Hansen, Marie; Athanasiadis, Georgios; Schork, Andrew; Bybjerg-Grauholm, Jonas; Hougaard, David M; Werge, Thomas; Nordentoft, Merete; Mors, Ole; Dalsgaard, Søren; Christensen, Jakob; Børglum, Anders D; Mortensen, Preben B; McGrath, John J; Privé, Florian; Vilhjálmsson, Bjarni J.

Am J Hum Genet ; 109(3): 417-432, 2022 03 03.

Article in English | MEDLINE | ID: mdl-35139346

ABSTRACT

Genome-wide association studies (GWASs) have revolutionized human genetics, allowing researchers to identify thousands of disease-related genes and possible drug targets. However, case-control status does not account for the fact that not all controls may have lived through their period of risk for the disorder of interest. This can be quantified by examining the age-of-onset distribution and the age of the controls or the age of onset for cases. The age-of-onset distribution may also depend on information such as sex and birth year. In addition, family history is not routinely included in the assessment of control status. Here, we present LT-FH++, an extension of the liability threshold model conditioned on family history (LT-FH), which jointly accounts for age of onset and sex as well as family history. Using simulations, we show that, when family history and the age-of-onset distribution are available, the proposed approach yields statistically significant power gains over LT-FH and large power gains over genome-wide association study by proxy (GWAX). We applied our method to four psychiatric disorders available in the iPSYCH data and to mortality in the UK Biobank and found 20 genome-wide significant associations with LT-FH++, compared to ten for LT-FH and eight for a standard case-control GWAS. As more genetic data with linked electronic health records become available to researchers, we expect methods that account for additional health information, such as LT-FH++, to become even more beneficial.

Subject(s)

Genetic Predisposition to Disease , Genome-Wide Association Study , Age of Onset , Case-Control Studies , Genome-Wide Association Study/methods , Humans , Medical History Taking
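
The abstract of record 4 notes that a control who has not yet lived through the period of risk is less informative than one who has. The generic liability threshold arithmetic behind that idea can be sketched as below. This is not the LT-FH++ implementation: family history, sex and birth year are left out, liability is assumed standard normal, the function names are hypothetical, and the cumulative incidence values are made up.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: liability ~ N(0, 1)


def liability_threshold(cumulative_incidence):
    """Threshold T such that P(liability > T) equals the cumulative incidence."""
    return STD_NORMAL.inv_cdf(1.0 - cumulative_incidence)


def mean_liability_control(cumulative_incidence):
    """Expected liability of someone still unaffected: E[l | l < T]."""
    t = liability_threshold(cumulative_incidence)
    # Mean of a standard normal truncated above at t.
    return -STD_NORMAL.pdf(t) / STD_NORMAL.cdf(t)


# Hypothetical cumulative incidence by age (made-up numbers).
incidence_by_age = {20: 0.005, 40: 0.02, 60: 0.05}

young = mean_liability_control(incidence_by_age[20])
old = mean_liability_control(incidence_by_age[60])
```

In this toy setting the older control has the lower expected liability, because remaining unaffected through more of the risk period rules out more of the upper tail.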

5.

Deep integrative models for large-scale human genomics.

Sigurdsson, Arnór I; Louloudis, Ioannis; Banasik, Karina; Westergaard, David; Winther, Ole; Lund, Ole; Ostrowski, Sisse Rye; Erikstrup, Christian; Pedersen, Ole Birger Vesterager; Nyegaard, Mette; Brunak, Søren; Vilhjálmsson, Bjarni J; Rasmussen, Simon.

Nucleic Acids Res ; 51(12): e67, 2023 07 07.

Article in English | MEDLINE | ID: mdl-37224538

ABSTRACT

Polygenic risk scores (PRSs) are expected to play a critical role in precision medicine. Currently, PRS predictors are generally based on linear models using summary statistics, and more recently individual-level data. However, these predictors mainly capture additive relationships and are limited in data modalities they can use. We developed a deep learning framework (EIR) for PRS prediction which includes a model, genome-local-net (GLN), specifically designed for large-scale genomics data. The framework supports multi-task learning, automatic integration of other clinical and biochemical data, and model explainability. When applied to individual-level data from the UK Biobank, the GLN model demonstrated a competitive performance compared to established neural network architectures, particularly for certain traits, showcasing its potential in modeling complex genetic relationships. Furthermore, the GLN model outperformed linear PRS methods for Type 1 Diabetes, likely due to modeling non-additive genetic effects and epistasis. This was supported by our identification of widespread non-additive genetic effects and epistasis in the context of T1D. Finally, we constructed PRS models that integrated genotype, blood, urine, and anthropometric data and found that this improved performance for 93% of the 290 diseases and disorders considered. EIR is available at https://github.com/arnor-sigurdsson/EIR.

Subject(s)

Models, Genetic , Multifactorial Inheritance , Polymorphism, Single Nucleotide , Humans , Genetic Predisposition to Disease , Genome, Human , Genome-Wide Association Study , Genomics/methods , Genotype , Risk Factors

6.

Leveraging both individual-level genetic data and GWAS summary statistics increases polygenic prediction.

Albiñana, Clara; Grove, Jakob; McGrath, John J; Agerbo, Esben; Wray, Naomi R; Bulik, Cynthia M; Nordentoft, Merete; Hougaard, David M; Werge, Thomas; Børglum, Anders D; Mortensen, Preben Bo; Privé, Florian; Vilhjálmsson, Bjarni J.

Am J Hum Genet ; 108(6): 1001-1011, 2021 06 03.

Article in English | MEDLINE | ID: mdl-33964208

ABSTRACT

The accuracy of polygenic risk scores (PRSs) to predict complex diseases increases with the training sample size. PRSs are generally derived based on summary statistics from large meta-analyses of multiple genome-wide association studies (GWASs). However, it is now common for researchers to have access to large individual-level data as well, such as the UK Biobank data. To the best of our knowledge, it has not yet been explored how best to combine both types of data (summary statistics and individual-level data) to optimize polygenic prediction. The most widely used approach to combine data is the meta-analysis of GWAS summary statistics (meta-GWAS), but we show that it does not always provide the most accurate PRS. Through simulations and using 12 real case-control and quantitative traits from both iPSYCH and UK Biobank along with external GWAS summary statistics, we compare meta-GWAS with two alternative data-combining approaches, stacked clumping and thresholding (SCT) and meta-PRS. We find that, when large individual-level data are available, the linear combination of PRSs (meta-PRS) is both a simple alternative to meta-GWAS and often more accurate.

Subject(s)

Disease/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Models, Statistical , Multifactorial Inheritance , Polymorphism, Single Nucleotide , Case-Control Studies , Humans , Phenotype
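
The abstract of record 6 describes meta-PRS as a linear combination of PRSs fitted when individual-level data are available. A minimal sketch of that idea follows, using ordinary least squares as a stand-in for whatever regression the authors actually used (an assumption). The function names and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_meta_prs(prs_columns, phenotype):
    """Least-squares weights (intercept first) for combining several PRSs."""
    rows = [[1.0] + list(vals) for vals in zip(*prs_columns)]
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, phenotype)) for i in range(k)]
    return solve(xtx, xty)


def meta_prs(weights, prs_values):
    """Combined score for one individual from their individual PRS values."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], prs_values))


# Toy data (made up): phenotype is exactly 1 + 2*prs_a + 0.5*prs_b.
prs_a = [0.0, 1.0, 2.0, 3.0, 1.5]
prs_b = [1.0, 0.0, 2.0, 1.0, 3.0]
pheno = [1 + 2 * a + 0.5 * b for a, b in zip(prs_a, prs_b)]
weights = fit_meta_prs([prs_a, prs_b], pheno)
```

Here `prs_a` might stand for a score built from external summary statistics and `prs_b` for one trained on the individual-level data; the weights should be learned on individuals not used for evaluation.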

7.

Interplay of polygenic liability with birth-related, somatic, and psychosocial factors in anorexia nervosa risk: a nationwide study.

Papini, Natalie M; Presseller, Emily; Bulik, Cynthia M; Holde, Katrine; Larsen, Janne T; Thornton, Laura M; Albiñana, Clara; Vilhjálmsson, Bjarni J; Mortensen, Preben B; Yilmaz, Zeynep; Petersen, Liselotte V.

Psychol Med ; : 1-14, 2024 Feb 13.

Article in English | MEDLINE | ID: mdl-38347808

ABSTRACT

BACKGROUND: Although several types of risk factors for anorexia nervosa (AN) have been identified, including birth-related factors, somatic, and psychosocial risk factors, their interplay with genetic susceptibility remains unclear. Genetic and epidemiological interplay in AN risk were examined using data from Danish nationwide registers. AN polygenic risk score (PRS) and risk factor associations, confounding from AN PRS and/or parental psychiatric history on the association between the risk factors and AN risk, and interactions between AN PRS and each level of target risk factor on AN risk were estimated. METHODS: Participants were individuals born in Denmark between 1981 and 2008 including nationwide-representative data from the iPSYCH2015, and Danish AN cases from the Anorexia Nervosa Genetics Initiative and Eating Disorder Genetics Initiative cohorts. A total of 7003 individuals with AN and 45 229 individuals without a registered AN diagnosis were included. We included 22 AN risk factors from Danish registers. RESULTS: Risk factors showing association with PRS for AN included urbanicity, parental ages, genitourinary tract infection, and parental socioeconomic factors. Risk factors showed the expected association to AN risk, and this association was only slightly attenuated when adjusted for parental history of psychiatric disorders or/and for the AN PRS. The interaction analyses revealed a differential effect of AN PRS according to the level of the following risk factors: sex, maternal age, genitourinary tract infection, C-section, parental socioeconomic factors and psychiatric history. CONCLUSIONS: Our findings provide evidence for interactions between AN PRS and certain risk-factors, illustrating potential diverse risk pathways to AN diagnosis.

8.

Shared familial risk for type 2 diabetes mellitus and psychiatric disorders: a nationwide multigenerational genetics study.

Wimberley, Theresa; Brikell, Isabell; Astrup, Aske; Larsen, Janne T; Petersen, Liselotte V; Albiñana, Clara; Vilhjálmsson, Bjarni J; Bulik, Cynthia M; Chang, Zheng; Fanelli, Giuseppe; Bralten, Janita; Mota, Nina R; Salas-Salvadó, Jordi; Fernandez-Aranda, Fernando; Bulló, Monica; Franke, Barbara; Børglum, Anders; Mortensen, Preben B; Horsdal, Henriette T; Dalsgaard, Søren.

Psychol Med ; : 1-10, 2024 May 27.

Article in English | MEDLINE | ID: mdl-38801094

ABSTRACT

BACKGROUND: Psychiatric disorders and type 2 diabetes mellitus (T2DM) are heritable, polygenic, and often comorbid conditions, yet knowledge about their potential shared familial risk is lacking. We used family designs and T2DM polygenic risk score (T2DM-PRS) to investigate the genetic associations between psychiatric disorders and T2DM. METHODS: We linked 659 906 individuals born in Denmark 1990-2000 to their parents, grandparents, and aunts/uncles using population-based registers. We compared rates of T2DM in relatives of children with and without a diagnosis of any or one of 11 specific psychiatric disorders, including neuropsychiatric and neurodevelopmental disorders, using Cox regression. In a genotyped sample (iPSYCH2015) of individuals born 1981-2008 (n = 134 403), we used logistic regression to estimate associations between a T2DM-PRS and these psychiatric disorders. RESULTS: Among 5 235 300 relative pairs, relatives of individuals with a psychiatric disorder had an increased risk for T2DM with stronger associations for closer relatives (parents: hazard ratio = 1.38, 95% confidence interval 1.35-1.42; grandparents: 1.14, 1.13-1.15; and aunts/uncles: 1.19, 1.16-1.22). In the genetic sample, one standard deviation increase in T2DM-PRS was associated with an increased risk for any psychiatric disorder (odds ratio = 1.11, 1.08-1.14). Both familial T2DM and T2DM-PRS were significantly associated with seven of 11 psychiatric disorders, most strongly with attention-deficit/hyperactivity disorder and conduct disorder, and inversely with anorexia nervosa. CONCLUSIONS: Our findings of familial co-aggregation and higher T2DM polygenic liability associated with psychiatric disorders point toward shared familial risk. This suggests that part of the comorbidity is explained by shared familial risks. The underlying mechanisms still remain largely unknown and the contributions of genetics and environment need further investigation.

9.

Phenomewide Association Study of Health Outcomes Associated With the Genetic Correlates of 25 Hydroxyvitamin D Concentration and Vitamin D Binding Protein Concentration.

Kresge, Hailey A; Blostein, Freida; Goleva, Slavina; Albiñana, Clara; Revez, Joana A; Wray, Naomi R; Vilhjálmsson, Bjarni J; Zhu, Zhihong; McGrath, John J; Davis, Lea K.

Twin Res Hum Genet ; 27(2): 69-79, 2024 Apr.

Article in English | MEDLINE | ID: mdl-38644690

ABSTRACT

While it is known that vitamin D deficiency is associated with adverse bone outcomes, it remains unclear whether low vitamin D status may increase the risk of a wider range of health outcomes. We had the opportunity to explore the association between common genetic variants associated with both 25 hydroxyvitamin D (25OHD) and the vitamin D binding protein (DBP, encoded by the GC gene) with a comprehensive range of health disorders and laboratory tests in a large academic medical center. We used summary statistics for 25OHD and DBP to generate polygenic scores (PGS) for 66,482 participants with primarily European ancestry and 13,285 participants with primarily African ancestry from the Vanderbilt University Medical Center Biobank (BioVU). We examined the predictive properties of PGS25OHD, and two scores related to DBP concentration with respect to 1322 health-related phenotypes and 315 laboratory-measured phenotypes from electronic health records. In those with European ancestry: (a) the PGS25OHD and PGSDBP scores, and individual SNPs rs4588 and rs7041 were associated with both 25OHD concentration and 1,25 dihydroxyvitamin D concentrations; (b) higher PGS25OHD was associated with decreased concentrations of triglycerides and cholesterol, and reduced risks of vitamin D deficiency, disorders of lipid metabolism, and diabetes. In general, the findings for the African ancestry group were consistent with findings from the European ancestry analyses. Our study confirms the utility of PGS and two key variants within the GC gene (rs4588 and rs7041) to predict the risk of vitamin D deficiency in clinical settings and highlights the shared biology between vitamin D-related genetic pathways and a range of health outcomes.

Subject(s)

Vitamin D-Binding Protein , Vitamin D , Humans , Vitamin D-Binding Protein/genetics , Vitamin D/blood , Vitamin D/genetics , Vitamin D/analogs & derivatives , Female , Male , Middle Aged , Adult , Genome-Wide Association Study , Polymorphism, Single Nucleotide , White People/genetics , Phenotype , Aged , Vitamin D Deficiency/genetics , Vitamin D Deficiency/blood , Vitamin D Deficiency/epidemiology , Multifactorial Inheritance/genetics

10.

Multitrait GWAS to connect disease variants and biological mechanisms.

Julienne, Hanna; Laville, Vincent; McCaw, Zachary R; He, Zihuai; Guillemot, Vincent; Lasry, Carla; Ziyatdinov, Andrey; Nerin, Cyril; Vaysse, Amaury; Lechat, Pierre; Ménager, Hervé; Le Goff, Wilfried; Dube, Marie-Pierre; Kraft, Peter; Ionita-Laza, Iuliana; Vilhjálmsson, Bjarni J; Aschard, Hugues.

PLoS Genet ; 17(8): e1009713, 2021 08.

Article in English | MEDLINE | ID: mdl-34460823

ABSTRACT

Genome-wide association studies (GWASs) have uncovered a wealth of associations between common variants and human phenotypes. Here, we present an integrative analysis of GWAS summary statistics from 36 phenotypes to decipher multitrait genetic architecture and its link with biological mechanisms. Our framework incorporates multitrait association mapping along with an investigation of the breakdown of genetic associations into clusters of variants harboring similar multitrait association profiles. Focusing on two subsets of immunity and metabolism phenotypes, we then demonstrate how genetic variants within clusters can be mapped to biological pathways and disease mechanisms. Finally, for the metabolism set, we investigate the link between gene cluster assignment and the success of drug targets in randomized controlled trials.

Subject(s)

Computational Biology/methods , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Cluster Analysis , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Phenotype

11.

Polygenic liability, stressful life events and risk for secondary-treated depression in early life: a nationwide register-based case-cohort study.

Musliner, Katherine L; Andersen, Klaus K; Agerbo, Esben; Albiñana, Clara; Vilhjalmsson, Bjarni J; Rajagopal, Veera M; Bybjerg-Grauholm, Jonas; Bækved-Hansen, Marie; Pedersen, Carsten B; Pedersen, Marianne G; Munk-Olsen, Trine; Benros, Michael E; Als, Thomas D; Grove, Jakob; Werge, Thomas; Børglum, Anders D; Hougaard, David M; Mors, Ole; Nordentoft, Merete; Mortensen, Preben B; Suppli, Nis P.

Psychol Med ; 53(1): 217-226, 2023 01.

Article in English | MEDLINE | ID: mdl-33949298

ABSTRACT

BACKGROUND: In this study, we examined the relationship between polygenic liability for depression and number of stressful life events (SLEs) as risk factors for early-onset depression treated in inpatient, outpatient or emergency room settings at psychiatric hospitals in Denmark. METHODS: Data were drawn from the iPSYCH2012 case-cohort sample, a population-based sample of individuals born in Denmark between 1981 and 2005. The sample included 18 532 individuals who were diagnosed with depression by a psychiatrist by age 31 years, and a comparison group of 20 184 individuals. Information on SLEs was obtained from nationwide registers and operationalized as a time-varying count variable. Hazard ratios and cumulative incidence rates were estimated using Cox regressions. RESULTS: Risk for depression increased by 35% with each standard deviation increase in polygenic liability (p < 0.0001), and 36% (p < 0.0001) with each additional SLE. There was a small interaction between polygenic liability and SLEs (β = -0.04, p = 0.0009). The probability of being diagnosed with depression in a hospital-based setting between ages 15 and 31 years ranged from 1.5% among males in the lowest quartile of polygenic liability with 0 events by age 15, to 18.8% among females in the highest quartile of polygenic liability with 4+ events by age 15. CONCLUSIONS: These findings suggest that although there is minimal interaction between polygenic liability and SLEs as risk factors for hospital-treated depression, combining information on these two important risk factors could potentially be useful for identifying high-risk individuals.

Subject(s)

Depression , Life Change Events , Male , Female , Humans , Infant , Adult , Cohort Studies , Risk Factors , Proportional Hazards Models , Case-Control Studies

12.

Genome-wide association study of febrile seizures implicates fever response and neuronal excitability genes.

Skotte, Line; Fadista, João; Bybjerg-Grauholm, Jonas; Appadurai, Vivek; Hildebrand, Michael S; Hansen, Thomas F; Banasik, Karina; Grove, Jakob; Albiñana, Clara; Geller, Frank; Bjurström, Carmen F; Vilhjálmsson, Bjarni J; Coleman, Matthew; Damiano, John A; Burgess, Rosemary; Scheffer, Ingrid E; Pedersen, Ole Birger Vesterager; Erikstrup, Christian; Westergaard, David; Nielsen, Kaspar René; Sørensen, Erik; Bruun, Mie Topholm; Liu, Xueping; Hjalgrim, Henrik; Pers, Tune H; Mortensen, Preben Bo; Mors, Ole; Nordentoft, Merete; Dreier, Julie W; Børglum, Anders D; Christensen, Jakob; Hougaard, David M; Buil, Alfonso; Hviid, Anders; Melbye, Mads; Ullum, Henrik; Berkovic, Samuel F; Werge, Thomas; Feenstra, Bjarke.

Brain ; 145(2): 555-568, 2022 04 18.

Article in English | MEDLINE | ID: mdl-35022648

ABSTRACT

Febrile seizures represent the most common type of pathological brain activity in young children and are influenced by genetic, environmental and developmental factors. In a minority of cases, febrile seizures precede later development of epilepsy. We conducted a genome-wide association study of febrile seizures in 7635 cases and 83 966 controls identifying and replicating seven new loci, all with P < 5 × 10⁻¹⁰. Variants at two loci were functionally related to altered expression of the fever response genes PTGER3 and IL10, and four other loci harboured genes (BSN, ERC2, GABRG2, HERC1) influencing neuronal excitability by regulating neurotransmitter release and binding, vesicular transport or membrane trafficking at the synapse. Four previously reported loci (SCN1A, SCN2A, ANO3 and 12q21.33) were all confirmed. Collectively, the seven novel and four previously reported loci explained 2.8% of the variance in liability to febrile seizures, and the single nucleotide polymorphism heritability based on all common autosomal single nucleotide polymorphisms was 10.8%. GABRG2, SCN1A and SCN2A are well-established epilepsy genes and, overall, we found positive genetic correlations with epilepsies (rg = 0.39, P = 1.68 × 10⁻⁴). Further, we found that higher polygenic risk scores for febrile seizures were associated with epilepsy and with history of hospital admission for febrile seizures. Finally, we found that polygenic risk of febrile seizures was lower in febrile seizure patients with neuropsychiatric disease compared to febrile seizure patients in a general population sample. In conclusion, this largest genetic investigation of febrile seizures to date implicates central fever response genes as well as genes affecting neuronal excitability, including several known epilepsy genes. Further functional and genetic studies based on these findings will provide important insights into the complex pathophysiological processes of seizures with and without fever.

Subject(s)

Epilepsy , Seizures, Febrile , Anoctamins/genetics , Child , Child, Preschool , Epilepsy/genetics , Fever/complications , Fever/genetics , Genome-Wide Association Study , Humans , NAV1.1 Voltage-Gated Sodium Channel/genetics , Seizures, Febrile/genetics

13.

Making the Most of Clumping and Thresholding for Polygenic Scores.

Privé, Florian; Vilhjálmsson, Bjarni J; Aschard, Hugues; Blum, Michael G B.

Am J Hum Genet ; 105(6): 1213-1221, 2019 12 05.

Article in English | MEDLINE | ID: mdl-31761295

ABSTRACT

Polygenic prediction has the potential to contribute to precision medicine. Clumping and thresholding (C+T) is a widely used method to derive polygenic scores. When using C+T, several p value thresholds are tested to maximize predictive ability of the derived polygenic scores. Along with this p value threshold, we propose to tune three other hyper-parameters for C+T. We implement an efficient way to derive thousands of different C+T scores corresponding to a grid over four hyper-parameters. For example, it takes a few hours to derive 123K different C+T scores for 300K individuals and 1M variants using 16 physical cores. We find that optimizing over these four hyper-parameters improves the predictive performance of C+T in both simulations and real data applications as compared to tuning only the p value threshold. A particularly large increase can be noted when predicting depression status, from an AUC of 0.557 (95% CI: [0.544-0.569]) when tuning only the p value threshold to an AUC of 0.592 (95% CI: [0.580-0.604]) when tuning all four hyper-parameters we propose for C+T. We further propose stacked clumping and thresholding (SCT), a polygenic score that results from stacking all derived C+T scores. Instead of choosing one set of hyper-parameters that maximizes prediction in some training set, SCT learns an optimal linear combination of all C+T scores by using an efficient penalized regression. We apply SCT to eight different case-control diseases in the UK biobank data and find that SCT substantially improves prediction accuracy with an average AUC increase of 0.035 over standard C+T.

Subject(s)

Algorithms , Disease/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide , Biological Specimen Banks , Case-Control Studies , Computer Simulation , Humans , Models, Genetic , United Kingdom
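
The abstract of record 13 describes deriving many clumping and thresholding (C+T) scores over a grid of hyper-parameters. A minimal sketch of that idea over two standard C+T hyper-parameters (the clumping r² cutoff and the p value cutoff) is given below. It is not the bigsnpr implementation: the clumping window and the other hyper-parameters tuned in the paper are omitted, pairwise r² values are assumed to be precomputed, and all names and inputs are hypothetical.

```python
def clump(variants, r2, r2_threshold):
    """Greedy clumping: keep the most significant variant, drop correlated ones.

    variants: dict id -> (beta, p_value)
    r2: dict frozenset({id1, id2}) -> squared correlation (missing means 0)
    """
    kept = []
    for vid in sorted(variants, key=lambda v: variants[v][1]):
        if all(r2.get(frozenset((vid, k)), 0.0) <= r2_threshold for k in kept):
            kept.append(vid)
    return kept


def ct_score(genotype, variants, kept, p_threshold):
    """Sum of beta * allele count over clumped variants passing the p cutoff."""
    return sum(variants[v][0] * genotype[v]
               for v in kept if variants[v][1] <= p_threshold)


def ct_grid(genotype, variants, r2, r2_thresholds, p_thresholds):
    """One score per (r2_threshold, p_threshold) pair in the grid."""
    grid = {}
    for rt in r2_thresholds:
        kept = clump(variants, r2, rt)  # clumping is shared across p cutoffs
        for pt in p_thresholds:
            grid[(rt, pt)] = ct_score(genotype, variants, kept, pt)
    return grid


# Toy inputs (made up): three variants, rs1 and rs2 in strong LD.
variants = {"rs1": (0.30, 1e-8), "rs2": (0.25, 1e-6), "rs3": (-0.10, 0.04)}
r2 = {frozenset(("rs1", "rs2")): 0.9}
genotype = {"rs1": 2, "rs2": 1, "rs3": 1}

grid = ct_grid(genotype, variants, r2, [0.2, 0.95], [1e-5, 0.05])
```

Standard C+T would pick the single grid cell that predicts best in a training set. Stacking, as described in the abstract, would instead feed all grid scores into a penalized regression; that step is not shown here.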

14.

LDpred2: better, faster, stronger.

Privé, Florian; Arbel, Julyan; Vilhjálmsson, Bjarni J.

Bioinformatics ; 36(22-23): 5424-5431, 2021 Apr 01.

Article in English | MEDLINE | ID: mdl-33326037

ABSTRACT

MOTIVATION: Polygenic scores have become a central tool in human genetics research. LDpred is a popular method for deriving polygenic scores based on summary statistics and a matrix of correlation between genetic variants. However, LDpred has limitations that may reduce its predictive performance. RESULTS: Here, we present LDpred2, a new version of LDpred that addresses these issues. We also provide two new options in LDpred2: a 'sparse' option that can learn effects that are exactly 0, and an 'auto' option that directly learns the two LDpred parameters from data. We benchmark predictive performance of LDpred2 against the previous version on simulated and real data, demonstrating substantial improvements in robustness and predictive accuracy compared to LDpred1. We then show that LDpred2 also outperforms other polygenic score methods recently developed, with a mean AUC over the 8 real traits analyzed here of 65.1%, compared to 63.8% for lassosum, 62.9% for PRS-CS and 61.5% for SBayesR. Note that LDpred2 provides more accurate polygenic scores when run genome-wide, instead of per chromosome. AVAILABILITY AND IMPLEMENTATION: LDpred2 is implemented in R package bigsnpr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

15.

Performing Highly Efficient Genome Scans for Local Adaptation with R Package pcadapt Version 4.

Privé, Florian; Luu, Keurcien; Vilhjálmsson, Bjarni J; Blum, Michael G B.

Mol Biol Evol ; 37(7): 2153-2154, 2020 07 01.

Article in English | MEDLINE | ID: mdl-32343802

ABSTRACT

R package pcadapt is a user-friendly R package for performing genome scans for local adaptation. Here, we present version 4 of pcadapt which substantially improves computational efficiency while providing similar results. This improvement is made possible by using a different format for storing genotypes and a different algorithm for computing principal components of the genotype matrix, which is the most computationally demanding step in method pcadapt. These changes are seamlessly integrated into the existing pcadapt package, and users will experience a large reduction in computation time (by a factor of 20-60 in our analyses) as compared with previous versions.

Subject(s)

Adaptation, Biological , Genomics/methods , Software

16.

Efficient toolkit implementing best practices for principal component analysis of population genetic data.

Privé, Florian; Luu, Keurcien; Blum, Michael G B; McGrath, John J; Vilhjálmsson, Bjarni J.

Bioinformatics ; 36(16): 4449-4457, 2020 08 15.

Article in English | MEDLINE | ID: mdl-32415959

ABSTRACT

MOTIVATION: Principal component analysis (PCA) of genetic data is routinely used to infer ancestry and control for population structure in various genetic analyses. However, conducting PCA analyses can be complicated and has several potential pitfalls. These pitfalls include (i) capturing linkage disequilibrium (LD) structure instead of population structure, (ii) projected PCs that suffer from shrinkage bias, (iii) detecting sample outliers and (iv) uneven population sizes. In this work, we explore these potential issues when using PCA, and present efficient solutions to these. Following applications to the UK Biobank and the 1000 Genomes project datasets, we make recommendations for best practices and provide efficient and user-friendly implementations of the proposed solutions in R packages bigsnpr and bigutilsr. RESULTS: For example, we find that PC19-PC40 in the UK Biobank capture complex LD structure rather than population structure. Using our automatic algorithm for removing long-range LD regions, we recover 16 PCs that capture population structure only. Therefore, we recommend using only 16-18 PCs from the UK Biobank to account for population structure confounding. We also show how to use PCA to restrict analyses to individuals of homogeneous ancestry. Finally, when projecting individual genotypes onto the PCA computed from the 1000 Genomes project data, we find a shrinkage bias that becomes large for PC5 and beyond. We then demonstrate how to obtain unbiased projections efficiently using bigsnpr. Overall, we believe this work would be of interest for anyone using PCA in their analyses of genetic data, as well as for other omics data. AVAILABILITY AND IMPLEMENTATION: R packages bigsnpr and bigutilsr can be installed from either CRAN or GitHub (see https://github.com/privefl/bigsnpr). A tutorial on the steps to perform PCA on 1000G data is available at https://privefl.github.io/bigsnpr/articles/bedpca.html. All code used for this paper is available at https://github.com/privefl/paper4-bedpca/tree/master/code. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
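Pitfall (ii), the shrinkage of projected PCs, can be reproduced in a small simulation. The sketch below is an illustration only, not the correction implemented in bigsnpr: all sizes, the group shift, and the helper names are assumptions chosen for speed. It fits the first principal axis on a reference sample with many more variables than individuals, then projects a fresh sample drawn from the same two groups; the projected scores come out less spread than the reference scores.

```python
import math
import random

random.seed(1)

N_VARIANTS = 400    # many more variables than samples, as in genotype data
N_PER_GROUP = 20    # individuals per group (illustrative)
SHIFT = 0.2         # group means are +SHIFT and -SHIFT on every variable


def simulate(n_per_group):
    """Draw individuals from two groups with opposite mean shifts."""
    return [[sign * SHIFT + random.gauss(0, 1) for _ in range(N_VARIANTS)]
            for sign in (1, -1) for _ in range(n_per_group)]


def centre(rows, means):
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_axis(rows, iters=200):
    """Unit-length first principal axis, via power iteration on the Gram matrix."""
    n = len(rows)
    gram = [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(n)]
            for i in range(n)]
    u = [random.gauss(0, 1) for _ in range(n)]
    for _ in range(iters):
        u = [sum(g * x for g, x in zip(gram_row, u)) for gram_row in gram]
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]
    # Map the sample-space vector back to variable space: v is proportional to X^T u.
    v = [sum(u[i] * rows[i][k] for i in range(n)) for k in range(len(rows[0]))]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def project(rows, axis):
    return [sum(x * a for x, a in zip(row, axis)) for row in rows]


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


reference = simulate(N_PER_GROUP)
means = [sum(col) / len(reference) for col in zip(*reference)]
reference_centred = centre(reference, means)
axis = first_axis(reference_centred)

# New individuals are centred with the reference means, then projected.
new_centred = centre(simulate(N_PER_GROUP), means)

train_rms = rms(project(reference_centred, axis))
test_rms = rms(project(new_centred, axis))
print(f"PC1 spread: reference {train_rms:.2f}, projected {test_rms:.2f}")
```

The gap arises because the estimated axis partly fits noise in the reference sample, which inflates reference scores but not those of new individuals. Comparing projected and reference scores directly can therefore misplace new individuals unless the projection is corrected.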

Subject(s)

Genetics, Population , Software , Algorithms , Humans , Linkage Disequilibrium , Principal Component Analysis

17.

Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort.

Privé, Florian; Aschard, Hugues; Carmi, Shai; Folkersen, Lasse; Hoggart, Clive; O'Reilly, Paul F; Vilhjálmsson, Bjarni J.

Am J Hum Genet ; 109(2): 373, 2022 Feb 03.

Article in English | MEDLINE | ID: mdl-35120604

18.

Genetic liability to major depression and risk of childhood asthma.

Liu, Xiaoqin; Munk-Olsen, Trine; Albiñana, Clara; Vilhjálmsson, Bjarni J; Pedersen, Emil M; Schlünssen, Vivi; Bækvad-Hansen, Marie; Bybjerg-Grauholm, Jonas; Nordentoft, Merete; Børglum, Anders D; Werge, Thomas; Hougaard, David M; Mortensen, Preben B; Agerbo, Esben.

Brain Behav Immun ; 89: 433-439, 2020 10.

Article in English | MEDLINE | ID: mdl-32735934

ABSTRACT

OBJECTIVE: Major depression and asthma frequently co-occur, suggesting shared genetic vulnerability between these two disorders. We aimed to determine whether a higher genetic liability to major depression was associated with increased childhood asthma risk, and if so, whether such an association differed by sex of the child. METHODS: We conducted a population-based cohort study comprising 16,687 singletons born between 1991 and 2005 in Denmark. We calculated the polygenic risk score (PRS) for major depression as a measure of genetic liability based on the summary statistics from the Major Depressive Disorder Psychiatric Genomics Consortium collaboration. The outcome was incident asthma from age 5 to 15 years, identified from the Danish National Patient Registry and the Danish National Prescription Registry. Stratified Cox regression was used to analyze the data. RESULTS: Greater genetic liability to major depression was associated with an increased asthma risk with a hazard ratio (HR) of 1.06 (95% CI: 1.01-1.10) per standard deviation increase in PRS. Children in the highest major depression PRS quartile had a HR for asthma of 1.20 (95% CI: 1.06-1.36), compared with children in the lowest quartile. However, major depression PRS explained only 0.03% of asthma variance (Pseudo-R2). The HRs of asthma by major depression PRS did not differ between boys and girls. CONCLUSION: Our results suggest a shared genetic contribution to major depression and childhood asthma, and there is no evidence of a sex-specific difference in the association.
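To read the per-standard-deviation hazard ratio on other scales, note that a Cox model is linear on the log-hazard scale, so the HR for a difference of k standard deviations is the per-SD HR raised to the power k. The sketch below assumes this log-linear relationship holds across the PRS range (the abstract does not state this) and uses a hypothetical helper name; it does not reproduce the quartile contrast, which was estimated separately.

```python
import math


def scaled_hazard_ratio(hr_per_sd, sd_difference):
    """Hazard ratio for a PRS difference of `sd_difference` standard deviations.

    Assumes the log-hazard is linear in the PRS, as in a standard Cox model.
    """
    return math.exp(math.log(hr_per_sd) * sd_difference)


# Reported estimate and 95% CI per 1 SD increase in PRS.
HR, CI_LOW, CI_HIGH = 1.06, 1.01, 1.10

# Implied hazard ratio for children 2 SD apart in PRS.
point = scaled_hazard_ratio(HR, 2)
lower = scaled_hazard_ratio(CI_LOW, 2)
upper = scaled_hazard_ratio(CI_HIGH, 2)
print(f"HR for a 2 SD difference: {point:.2f} ({lower:.2f}-{upper:.2f})")
```

The confidence bounds can be scaled the same way because the interval is symmetric on the log scale.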

Subject(s)

Asthma , Depressive Disorder, Major , Adolescent , Asthma/epidemiology , Asthma/genetics , Child , Child, Preschool , Cohort Studies , Depression , Depressive Disorder, Major/genetics , Female , Humans , Male , Multifactorial Inheritance

19.

Linking the association between circRNAs and Alzheimer's disease progression by multi-tissue circular RNA characterization.

Lo, IJu; Hill, Jamie; Vilhjálmsson, Bjarni J; Kjems, Jørgen.

RNA Biol ; 17(12): 1789-1797, 2020 12.

Article in English | MEDLINE | ID: mdl-32618510

ABSTRACT

Alzheimer's disease (AD) has devastating consequences for patients during its slow, progressive course. It is important to understand the pathology of AD onset. Recently, circular RNAs (circRNAs) have been found to participate in many human diseases including cancers and neurodegenerative conditions. In this study, we mined the published dataset on the AMP-AD Knowledge Portal from the Mount Sinai Brain Bank (MSBB) to describe the circRNA profiles at different AD stages in brain samples from four brain regions: anterior prefrontal cortex, superior temporal lobe, parahippocampal gyrus and inferior frontal gyrus. In total, we found 147 circRNAs to be differentially expressed (DE) for different AD severity levels in the four regions. We also characterized the mRNA-circRNA co-expression network and annotated the potential function of circRNAs based on the co-expressed modules. Based on our results, we found that the most circRNA-regulated region in AD patients with severe symptoms was the parahippocampal gyrus. The strongest negatively AD severity-correlated module in the parahippocampal gyrus was enriched in cognitive disability and pathological-associated pathways such as synapse organization and regulation of membrane potential. Finally, a regression model based on the expression pattern of DE circRNAs in the module could help to distinguish the disease severity of patients, further supporting a role for circRNAs in AD pathology. In conclusion, our findings indicate that circRNAs in parahippocampal gyrus are possible biomarkers and regulators of AD as well as potential therapeutic targets.

Subject(s)

Alzheimer Disease/genetics , Alzheimer Disease/pathology , Gene Expression Regulation , RNA, Circular/genetics , Alzheimer Disease/metabolism , Biomarkers , Brain/metabolism , Brain/pathology , Computational Biology/methods , Disease Progression , Disease Susceptibility , Gene Expression Profiling/methods , Gene Regulatory Networks , Humans , Molecular Sequence Annotation , Organ Specificity/genetics , RNA, Messenger/genetics , ROC Curve , Transcriptome

20.

The nature of confounding in genome-wide association studies.

Vilhjálmsson, Bjarni J; Nordborg, Magnus.

Nat Rev Genet ; 14(1): 1-2, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23165185

ABSTRACT

The authors argue that population structure per se is not a problem in genome-wide association studies - the true sources are the environment and the genetic background, and the latter is greatly underappreciated. They conclude that mixed models effectively address this issue.

Subject(s)

Genome-Wide Association Study , Animals , Confounding Factors, Epidemiologic , Gene-Environment Interaction , Humans

