ABSTRACT
There is increasing interest in the potential contribution of the gut microbiome to autism spectrum disorder (ASD). However, previous studies have been underpowered and have not been designed to address potential confounding factors in a comprehensive way. We performed a large autism stool metagenomics study (n = 247) based on participants from the Australian Autism Biobank and the Queensland Twin Adolescent Brain project. We found negligible direct associations between ASD diagnosis and the gut microbiome. Instead, our data support a model whereby ASD-related restricted interests are associated with less-diverse diet, and in turn reduced microbial taxonomic diversity and looser stool consistency. In contrast to ASD diagnosis, our dataset was well powered to detect microbiome associations with traits such as age, dietary intake, and stool consistency. Overall, microbiome differences in ASD may reflect dietary preferences that relate to diagnostic features, and we caution against claims that the microbiome has a driving role in ASD.
Subject(s)
Autistic Disorder/microbiology , Feeding Behavior , Gastrointestinal Microbiome , Adolescent , Age Factors , Autistic Disorder/diagnosis , Behavior , Child , Child, Preschool , Feces/microbiology , Female , Humans , Male , Phenotype , Phylogeny , Species Specificity
ABSTRACT
The evidence that most adult-onset common diseases have a polygenic genetic architecture fully consistent with robust biological systems supported by multiple back-up mechanisms is now overwhelming. In this context, we consider the recent "omnigenic" or "core genes" model. A key assumption of the model is that there is a relatively small number of core genes relevant to any disease. While intuitively appealing, this model may underestimate the biological complexity of common disease, and therefore, the goal to discover core genes should not guide experimental design. We consider other implications of polygenicity, concluding that a focus on patient stratification is needed to achieve the goals of precision medicine.
Subject(s)
Disease/genetics , Models, Genetic , Genome-Wide Association Study , Humans , Multifactorial Inheritance/genetics , Precision Medicine
ABSTRACT
The coefficient of determination (R²) is a well-established measure to indicate the predictive ability of polygenic scores (PGSs). However, the sampling variance of R² is rarely considered so that 95% confidence intervals (CI) are not usually reported. Moreover, when comparisons are made between PGSs based on different discovery samples, the sampling covariance of R² is required to test the difference between them. Here, we show how to estimate the variance and covariance of R² values to assess the 95% CI and p value of the R² difference. We apply this approach to real data calculating PGSs in 28,880 European participants derived from UK Biobank (UKBB) and Biobank Japan (BBJ) GWAS summary statistics for cholesterol and BMI. We quantify the significantly higher predictive ability of UKBB PGSs compared to BBJ PGSs (p value 7.6e-31 for cholesterol and 1.4e-50 for BMI). A joint model of UKBB and BBJ PGSs significantly improves the predictive ability, compared to a model of UKBB PGS only (p value 3.5e-05 for cholesterol and 1.3e-28 for BMI). We also show that the predictive ability of regulatory SNPs is significantly enriched over non-regulatory SNPs for cholesterol (p value 8.9e-26 for UKBB and 3.8e-17 for BBJ). We suggest that the proposed approach (available in R package r2redux) should be used to test the statistical significance of difference between pairs of PGSs, which may help to draw a correct conclusion about the comparative predictive ability of PGSs.
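As a rough illustration of why the sampling variance of R² matters, the sketch below computes an approximate 95% CI for a single R² using a standard large-sample approximation to its variance. This is an assumption made for illustration and is not necessarily the estimator implemented in r2redux; the function name and the example value R² = 0.10 are hypothetical. Only the sample size (28,880) is taken from the abstract.

```python
import math

def r2_confidence_interval(r2, n, k=1, z=1.96):
    """Approximate CI for R^2 from n samples and k predictors.

    Uses the large-sample approximation
        Var(R^2) ~= 4 R^2 (1 - R^2)^2 (n - k - 1)^2 / ((n^2 - 1)(n + 3)).
    This is an illustrative assumption, not the r2redux implementation.
    """
    var = 4.0 * r2 * (1.0 - r2) ** 2 * (n - k - 1) ** 2 / ((n ** 2 - 1) * (n + 3))
    se = math.sqrt(var)
    return r2 - z * se, r2 + z * se, se

# Hypothetical R^2 of 0.10 in a target sample of 28,880 individuals.
lower, upper, se = r2_confidence_interval(r2=0.10, n=28880)
print(f"SE = {se:.5f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

Comparing two PGSs evaluated in the same target sample additionally requires the covariance between the two R² values, which this single-score sketch does not cover.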
Subject(s)
Multifactorial Inheritance , Polymorphism, Single Nucleotide , Humans , Genome-Wide Association Study
ABSTRACT
Gene-based association tests aggregate multiple SNP-trait associations into sets defined by gene boundaries and are widely used in post-GWAS analysis. A common approach for gene-based tests is to combine SNP associations by computing the sum of χ² statistics. However, this strategy ignores the directions of SNP effects, which could result in a loss of power for SNPs with masking effects, e.g., when the product of two SNP effects and the linkage disequilibrium (LD) correlation is negative. Here, we introduce "mBAT-combo," a set-based test that is better powered than other methods to detect multi-SNP associations in the context of masking effects. We validate the method through simulations and applications to real data. We find that of 35 blood and urine biomarker traits in the UK Biobank, 34 traits show evidence for masking effects in a total of 4,273 gene-trait pairs, indicating that masking effects are common in complex traits. We further validate the improved power of our method in height, body mass index, and schizophrenia with different GWAS sample sizes and show that on average 95.7% of the genes detected only by mBAT-combo with smaller sample sizes can be identified by the single-SNP approach with a 1.7-fold increase in sample sizes. Eleven genes significant only in mBAT-combo for schizophrenia are confirmed by functionally informed fine-mapping or Mendelian randomization integrating gene expression data. The framework of mBAT-combo can be applied to any set of SNPs to refine trait-association signals hidden in genomic regions with complex LD structures.
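The masking condition described above can be made concrete with a small arithmetic sketch. It assumes two SNPs with standardized genotypes, so that the marginal (single-SNP) effect of each SNP equals its joint effect plus the LD correlation times the other SNP's joint effect. The function names and numeric values are hypothetical, and the sketch only illustrates the masking condition; it is not the mBAT-combo test itself.

```python
def marginal_effects(b1, b2, r):
    """Marginal effects of two standardized SNPs with joint effects b1, b2
    and LD correlation r (assumes standardized genotypes)."""
    return b1 + r * b2, b2 + r * b1

def is_masking(b1, b2, r):
    """Masking: the product of the two joint effects and the LD correlation is negative."""
    return b1 * b2 * r < 0

# Hypothetical joint effects of opposite sign at two SNPs in positive LD.
b1, b2, r = 0.10, -0.10, 0.8
m1, m2 = marginal_effects(b1, b2, r)
print(is_masking(b1, b2, r), m1, m2)
```

In this example each marginal effect shrinks to roughly a fifth of the joint effect, so single-SNP χ² statistics (and their sum) carry much less signal than a test that models the two SNPs jointly.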
Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Humans , Genome-Wide Association Study/methods , Phenotype , Linkage Disequilibrium , Genomics , Polymorphism, Single Nucleotide/genetics
ABSTRACT
In polygenic score (PGS) analysis, the coefficient of determination (R²) is a key statistic to evaluate efficacy. R² is the proportion of phenotypic variance explained by the PGS, calculated in a cohort that is independent of the genome-wide association study (GWAS) that provided estimates of allelic effect sizes. The SNP-based heritability (h²SNP, the proportion of total phenotypic variance attributable to all common SNPs) is the theoretical upper limit of the out-of-sample prediction R². However, in real data analyses R² has been reported to exceed h²SNP, which occurs in parallel with the observation that h²SNP estimates tend to decline as the number of cohorts being meta-analyzed increases. Here, we quantify why and when these observations are expected. Using theory and simulation, we show that if heterogeneities in cohort-specific h²SNP exist, or if genetic correlations between cohorts are less than one, h²SNP estimates can decrease as the number of cohorts being meta-analyzed increases. We derive conditions when the out-of-sample prediction R² will be greater than h²SNP and show the validity of our derivations with real data from a binary trait (major depression) and a continuous trait (educational attainment). Our research calls for a better approach to integrating information from multiple cohorts to address issues of between-cohort heterogeneity.
Subject(s)
Genome-Wide Association Study , Polymorphism, Single Nucleotide , Humans , Polymorphism, Single Nucleotide/genetics , Multifactorial Inheritance/genetics , Phenotype , Computer Simulation
ABSTRACT
Vitamin D status, a complex trait influenced by environmental and genetic factors, is tightly associated with skin colour and ancestry. Yet very few studies have investigated the genetic underpinnings of vitamin D levels across diverse ancestries, and those that have relied on small sample sizes, yielding inconclusive results. Here, we conduct genome-wide association studies (GWAS) of 25-hydroxyvitamin D (25OHD), the main circulating form of vitamin D, in 442,435 individuals from four broad genetically-determined ancestry groups represented in the UK Biobank: European (N = 421,867), South Asian (N = 9,983), African (N = 8,306) and East Asian (N = 2,279). We identify a new genetic determinant of 25OHD (rs146759773) in individuals of African ancestry, which was not detected in previous analysis of much larger European cohorts due to low minor allele frequency. We show genome-wide significant evidence of dominance effects in 25OHD that protect against vitamin D deficiency. Given that key events in the synthesis of 25OHD occur in the skin and are affected by pigmentation levels, we conduct GWAS of 25OHD stratified by skin colour and identify new associations. Lastly, we test the interaction between skin colour and variants associated with variance in 25OHD levels and identify two loci (rs10832254 and rs1352846) whose association with 25OHD differs in individuals of distinct complexions. Collectively, our results provide new insights into the complex relationship between 25OHD and skin colour and highlight the importance of diversity in genomic studies. Despite the much higher rates of vitamin D deficiency that we and others report for ancestry groups with dark skin (e.g., South Asian), our study highlights the importance of considering ancestral background and/or skin colour when assessing the implications of low vitamin D.
Subject(s)
Genome-Wide Association Study , Vitamin D Deficiency , Humans , Polymorphism, Single Nucleotide/genetics , Vitamin D/genetics , Vitamin D Deficiency/genetics
ABSTRACT
Testing the effect of rare variants on phenotypic variation is difficult due to the need for extremely large cohorts to identify associated variants given expected effect sizes. An alternative approach is to investigate the effect of rare genetic variants on DNA methylation (DNAm) as effect sizes are expected to be larger for molecular traits compared with complex traits. Here, we investigate DNAm in healthy ageing populations (the Lothian Birth Cohorts of 1921 and 1936) and identify both transient and stable outlying DNAm levels across the genome. We find an enrichment of rare genetic single nucleotide polymorphisms (SNPs) within 1 kb of DNAm sites in individuals with stable outlying DNAm, implying genetic control of this extreme variation. Using a family-based cohort, the Brisbane Systems Genetics Study, we observed increased sharing of DNAm outliers among more closely related individuals, consistent with these outliers being driven by rare genetic variation. We demonstrated that outlying DNAm levels have a functional consequence on gene expression levels, with extreme levels of DNAm being associated with gene expression levels toward the tails of the population distribution. This study demonstrates the role of rare SNPs in the phenotypic variation of DNAm and the effect of extreme levels of DNAm on gene expression.
Subject(s)
DNA Methylation , Gene Expression Regulation , Humans , DNA Methylation/genetics , Phenotype , Multifactorial Inheritance , Epigenesis, Genetic
ABSTRACT
The dominant ('general') version of the diathesis-stress theory of depression views stressors and genetic vulnerability as independent risks. In the Australian Genetics of Depression Study (N = 14,146; 75% female), we tested whether polygenic scores (PGS) for major depression, bipolar disorder, schizophrenia, anxiety, ADHD, and neuroticism were associated with reported exposure to 32 childhood, past-year, lifetime, and accumulated stressful life events (SLEs). In false discovery rate-corrected models, the clearest PGS-SLE relationships were for the ADHD- and depression-PGSs, and to a lesser extent, the anxiety- and schizophrenia-PGSs. We describe the associations for childhood and accumulated SLEs, and the 2-3 strongest past-year/lifetime SLE associations. Higher ADHD-PGS was associated with all childhood SLEs (emotional abuse, emotional neglect, physical neglect; ORs = 1.09-1.14; p's < 1.3 × 10⁻⁵), more accumulated SLEs, and reported exposure to sudden violent death (OR = 1.23; p = 3.6 × 10⁻⁵), legal troubles (OR = 1.15; p = 0.003), and sudden accidental death (OR = 1.14; p = 0.006). Higher depression-PGS was associated with all childhood SLEs (ORs = 1.07-1.12; p's < 0.013), more accumulated SLEs, and severe human suffering (OR = 1.17; p = 0.003), assault with a weapon (OR = 1.12; p = 0.003), and living in unpleasant surroundings (OR = 1.11; p = 0.001). Higher anxiety-PGS was associated with childhood emotional abuse (OR = 1.08; p = 1.6 × 10⁻⁴), more accumulated SLEs, and serious accident (OR = 1.23; p = 0.004), physical assault (OR = 1.08; p = 2.2 × 10⁻⁴), and transportation accident (OR = 1.07; p = 0.001). Higher schizophrenia-PGS was associated with all childhood SLEs (ORs = 1.12-1.19; p's < 9.3 × 10⁻⁸), more accumulated SLEs, and severe human suffering (OR = 1.16; p = 0.003). Higher neuroticism-PGS was associated with living in unpleasant surroundings (OR = 1.09; p = 0.007) and major financial troubles (OR = 1.06; p = 0.014). 
A reversed pattern was seen for the bipolar-PGS, with lower odds of reported physical assault (OR = 0.95; p = 0.014), major financial troubles (OR = 0.93; p = 0.004), and living in unpleasant surroundings (OR = 0.92; p = 0.007). Genetic risk for several mental disorders influences reported exposure to SLEs among adults with moderately severe, recurrent depression. Our findings emphasise that stressors and diatheses are inter-dependent and challenge diagnosis and subtyping (e.g., reactive/endogenous) based on life events.
Subject(s)
Life Change Events , Multifactorial Inheritance , Neuroticism , Stress, Psychological , Humans , Female , Male , Adult , Multifactorial Inheritance/genetics , Stress, Psychological/genetics , Middle Aged , Depressive Disorder, Major/genetics , Depressive Disorder, Major/epidemiology , Depressive Disorder, Major/psychology , Depression/genetics , Depression/psychology , Australia/epidemiology , Genetic Predisposition to Disease/genetics , Schizophrenia/genetics , Schizophrenia/epidemiology , Bipolar Disorder/genetics , Bipolar Disorder/psychology , Attention Deficit Disorder with Hyperactivity/genetics , Attention Deficit Disorder with Hyperactivity/psychology , Mental Disorders/genetics , Mental Disorders/epidemiology , Mental Disorders/psychology , Anxiety Disorders/genetics , Anxiety Disorders/epidemiology , Anxiety/genetics , Adverse Childhood Experiences/psychology , Child
ABSTRACT
The genetic correlation describes the genetic relationship between two traits and can contribute to a better understanding of the shared biological pathways and/or the causality relationships between them. The rarity of large family cohorts with recorded instances of two traits, particularly disease traits, has made it difficult to estimate genetic correlations using traditional epidemiological approaches. However, advances in genomic methodologies, such as genome-wide association studies, and widespread sharing of data now allow genetic correlations to be estimated for virtually any trait pair. Here, we review the definition, estimation, interpretation and uses of genetic correlations, with a focus on applications to human disease.
Subject(s)
Disease , Genome-Wide Association Study , Humans , Models, Genetic , Multifactorial Inheritance , Phenotype
ABSTRACT
The environment and events that we are exposed to in utero, during birth and in early childhood influence our future physical and mental health. The underlying mechanisms that lead to these outcomes are unclear, but long-term changes in epigenetic marks, such as DNA methylation, could act as a mediating factor or biomarker. DNA methylation data were assayed at 713 522 CpG sites from 9537 participants of the Generation Scotland: Scottish Family Health Study, a family-based cohort with extensive genetic, medical, family history and lifestyle information. Methylome-wide association studies of eight early life environment phenotypes and two adult mental health phenotypes (major depressive disorder and brief resilience scale) were conducted using DNA methylation data collected from adult whole blood samples. Two genes involved with different developmental pathways (PRICKLE2, Prickle Planar Cell Polarity Protein 2 and ABI1, Abl-Interactor-1) were annotated to CpG sites associated with preterm birth (P < 1.27 × 10⁻⁹). A further two genes important to the development of sensory pathways (SOBP, Sine Oculis Binding Protein Homolog and RPGRIP1, Retinitis Pigmentosa GTPase Regulator Interacting Protein) were annotated to sites associated with low birth weight (P < 4.35 × 10⁻⁸). The examination of methylation profile scores and genes and gene-sets annotated from associated CpG sites found no evidence of overlap between the early life environment and mental health conditions. Birth date was associated with a significant difference in estimated lymphocyte and neutrophil counts. Previous studies have shown that early life environments influence the risk of developing mental health disorders later in life; however, this study found no evidence that this is mediated by stable changes to the methylome detectable in peripheral blood.
Subject(s)
Depressive Disorder, Major , Premature Birth , Adaptor Proteins, Signal Transducing , Child, Preschool , CpG Islands/genetics , Cytoskeletal Proteins , DNA Methylation/genetics , Epigenesis, Genetic , Epigenome , Female , Humans , Infant, Newborn , Mental Health , Pregnancy
ABSTRACT
Polygenic risk scores (PRSs) enable early prediction of disease risk. Evaluating PRS performance for binary traits commonly relies on the area under the receiver operating characteristic curve (AUC). However, the widely used DeLong's method for comparative significance tests suffers from limitations, including computational time and the lack of a one-to-one mapping between test statistics based on AUC and R². To overcome these limitations, we propose a novel approach that leverages the Delta method to derive the variance and covariance of AUC values, enabling a comprehensive and efficient comparative significance test. Our approach offers notable advantages over DeLong's method, including reduced computation time (up to 150-fold), making it suitable for large-scale analyses and ideal for integration into machine learning frameworks. Furthermore, our method allows for a direct one-to-one mapping between AUC and R² values for comparative significance tests, providing enhanced insights into the relationship between these measures and facilitating their interpretation. We validated our proposed approach through simulations and applied it to real data comparing PRSs for diabetes and coronary artery disease (CAD) prediction in a cohort of 28,880 European individuals. The PRSs were derived using genome-wide association study summary statistics from two distinct sources. Our approach enabled a comprehensive and informative comparison of the PRSs, shedding light on their respective predictive abilities for diabetes and CAD. This advancement contributes to the assessment of genetic risk factors and personalized disease prediction, supporting better healthcare decision-making.
Subject(s)
Area Under Curve , Coronary Artery Disease , Genetic Predisposition to Disease , Genome-Wide Association Study , Multifactorial Inheritance , Humans , Genome-Wide Association Study/methods , Coronary Artery Disease/genetics , Coronary Artery Disease/diagnosis , ROC Curve , Polymorphism, Single Nucleotide , Risk Factors , Machine Learning , Diabetes Mellitus/genetics
ABSTRACT
Across species, offspring of related individuals often exhibit significant reduction in fitness-related traits, known as inbreeding depression (ID), yet the genetic and molecular basis for ID remains elusive. Here, we develop a method to quantify enrichment of ID within specific genomic annotations and apply it to human data. We analyzed the phenomes and genomes of ~350,000 unrelated participants of the UK Biobank and found, on average over 11 traits, significant enrichment of ID within genomic regions with high recombination rates (>21-fold; p < 10⁻⁵), with conserved function across species (>19-fold; p < 10⁻⁴), and within regulatory elements such as DNase I hypersensitive sites (~5-fold; p = 8.9 × 10⁻⁷). We also quantified enrichment of ID within trait-associated regions and found suggestive evidence that genomic regions contributing to additive genetic variance in the population are enriched for ID signal. We find strong correlations between functional enrichment of SNP-based heritability and that of ID (r = 0.8, standard error: 0.1). These findings provide empirical evidence that ID is most likely due to many partially recessive deleterious alleles in low linkage disequilibrium regions of the genome. Our study suggests that functional characterization of ID may further elucidate the genetic architectures and biological mechanisms underlying complex traits and diseases.
Subject(s)
Genome-Wide Association Study , Genomics/methods , Inbreeding Depression/genetics , Linkage Disequilibrium , Multifactorial Inheritance/genetics , Phenotype , Polymorphism, Single Nucleotide , Female , Humans , Male
ABSTRACT
Non-additive genetic variance for complex traits is traditionally estimated from data on relatives. It is notoriously difficult to estimate without bias in non-laboratory species, including humans, because of possible confounding with environmental covariance among relatives. In principle, non-additive variance attributable to common DNA variants can be estimated from a random sample of unrelated individuals with genome-wide SNP data. Here, we jointly estimate the proportion of variance explained by additive (h²SNP), dominance (δ²SNP) and additive-by-additive (η²SNP) genetic variance in a single analysis model. We first show by simulations that our model leads to unbiased estimates and provide a new theory to predict standard errors estimated using either least-squares or maximum likelihood. We then apply the model to 70 complex traits using 254,679 unrelated individuals from the UK Biobank and 1.1 M genotyped and imputed SNPs. We found strong evidence for additive variance (average h²SNP across traits = 0.208). In contrast, the average estimate of δ²SNP across traits was 0.001, implying negligible dominance variance at causal variants tagged by common SNPs. The average epistatic variance η²SNP across the traits was 0.055, not significantly different from zero because of the large sampling variance. Our results provide new evidence that genetic variance for complex traits is predominantly additive and that sample sizes of many millions of unrelated individuals are needed to estimate epistatic variance with sufficient precision.
Subject(s)
Datasets as Topic , Multifactorial Inheritance/genetics , Polymorphism, Single Nucleotide/genetics , Biological Specimen Banks , Epistasis, Genetic , Female , Genotype , Humans , Male , Models, Genetic , Phenotype , Reproducibility of Results , United Kingdom
ABSTRACT
The accuracy of polygenic risk scores (PRSs) to predict complex diseases increases with the training sample size. PRSs are generally derived based on summary statistics from large meta-analyses of multiple genome-wide association studies (GWASs). However, it is now common for researchers to have access to large individual-level data as well, such as the UK Biobank data. To the best of our knowledge, it has not yet been explored how best to combine both types of data (summary statistics and individual-level data) to optimize polygenic prediction. The most widely used approach to combine data is the meta-analysis of GWAS summary statistics (meta-GWAS), but we show that it does not always provide the most accurate PRS. Through simulations and using 12 real case-control and quantitative traits from both iPSYCH and UK Biobank along with external GWAS summary statistics, we compare meta-GWAS with two alternative data-combining approaches, stacked clumping and thresholding (SCT) and meta-PRS. We find that, when large individual-level data are available, the linear combination of PRSs (meta-PRS) is both a simple alternative to meta-GWAS and often more accurate.
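The idea of a meta-PRS as a linear combination of PRSs can be sketched as follows. This is a minimal illustration on simulated data, assuming the weights are obtained by ordinary least squares of the phenotype on two PRSs; the abstract does not specify how the weights are fitted, and all names, noise levels and sample sizes here are hypothetical.

```python
import random

def cov(x, y):
    """Sample covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def r2(pred, obs):
    """Squared Pearson correlation between predictor and phenotype."""
    return cov(pred, obs) ** 2 / (cov(pred, pred) * cov(obs, obs))

def meta_prs_weights(prs_a, prs_b, y):
    """Least-squares weights for two PRSs (2x2 normal equations on centred data)."""
    saa, sbb, sab = cov(prs_a, prs_a), cov(prs_b, prs_b), cov(prs_a, prs_b)
    say, sby = cov(prs_a, y), cov(prs_b, y)
    det = saa * sbb - sab ** 2
    return (say * sbb - sby * sab) / det, (sby * saa - say * sab) / det

# Simulated data: two noisy PRSs tracking the same hypothetical genetic value g.
random.seed(0)
n = 2000
g = [random.gauss(0, 1) for _ in range(n)]
prs_a = [gi + random.gauss(0, 1) for gi in g]   # e.g. from external summary statistics
prs_b = [gi + random.gauss(0, 2) for gi in g]   # e.g. from individual-level data
y = [gi + random.gauss(0, 2) for gi in g]       # phenotype

w_a, w_b = meta_prs_weights(prs_a, prs_b, y)
meta_prs = [w_a * a + w_b * b for a, b in zip(prs_a, prs_b)]
print(r2(prs_a, y), r2(prs_b, y), r2(meta_prs, y))
```

The comparison printed here is in-sample, where the combined score cannot do worse than either input; in practice the weights would be fitted in one set of individuals and the accuracy assessed in an independent set.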
Subject(s)
Disease/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Models, Statistical , Multifactorial Inheritance , Polymorphism, Single Nucleotide , Case-Control Studies , Humans , Phenotype
ABSTRACT
BACKGROUND: Diagnostic criteria for major depressive disorder allow for heterogeneous symptom profiles but genetic analysis of major depressive symptoms has the potential to identify clinical and etiological subtypes. There are several challenges to integrating symptom data from genetically informative cohorts, such as sample size differences between clinical and community cohorts and various patterns of missing data. METHODS: We conducted genome-wide association studies of major depressive symptoms in three cohorts that were enriched for participants with a diagnosis of depression (Psychiatric Genomics Consortium, Australian Genetics of Depression Study, Generation Scotland) and three community cohorts who were not recruited on the basis of diagnosis (Avon Longitudinal Study of Parents and Children, Estonian Biobank, and UK Biobank). We fit a series of confirmatory factor models with factors that accounted for how symptom data was sampled and then compared alternative models with different symptom factors. RESULTS: The best fitting model had a distinct factor for Appetite/Weight symptoms and an additional measurement factor that accounted for the skip-structure in community cohorts (use of Depression and Anhedonia as gating symptoms). CONCLUSION: The results show the importance of assessing the directionality of symptoms (such as hypersomnia versus insomnia) and of accounting for study and measurement design when meta-analyzing genetic association data.
Subject(s)
Depressive Disorder, Major , Genome-Wide Association Study , Humans , Depressive Disorder, Major/genetics , Male , Adult , Female , Middle Aged , Cohort Studies , Australia/epidemiology , Aged , Scotland
ABSTRACT
While it is known that vitamin D deficiency is associated with adverse bone outcomes, it remains unclear whether low vitamin D status may increase the risk of a wider range of health outcomes. We had the opportunity to explore the association between common genetic variants associated with both 25-hydroxyvitamin D (25OHD) and the vitamin D binding protein (DBP, encoded by the GC gene) with a comprehensive range of health disorders and laboratory tests in a large academic medical center. We used summary statistics for 25OHD and DBP to generate polygenic scores (PGS) for 66,482 participants with primarily European ancestry and 13,285 participants with primarily African ancestry from the Vanderbilt University Medical Center Biobank (BioVU). We examined the predictive properties of PGS25OHD, and two scores related to DBP concentration with respect to 1322 health-related phenotypes and 315 laboratory-measured phenotypes from electronic health records. In those with European ancestry: (a) the PGS25OHD and PGSDBP scores, and individual SNPs rs4588 and rs7041 were associated with both 25OHD concentration and 1,25-dihydroxyvitamin D concentrations; (b) higher PGS25OHD was associated with decreased concentrations of triglycerides and cholesterol, and reduced risks of vitamin D deficiency, disorders of lipid metabolism, and diabetes. In general, the findings for the African ancestry group were consistent with findings from the European ancestry analyses. Our study confirms the utility of PGS and two key variants within the GC gene (rs4588 and rs7041) to predict the risk of vitamin D deficiency in clinical settings and highlights the shared biology between vitamin D-related genetic pathways and a range of health outcomes.
Subject(s)
Vitamin D-Binding Protein , Vitamin D , Humans , Vitamin D-Binding Protein/genetics , Vitamin D/blood , Vitamin D/genetics , Vitamin D/analogs & derivatives , Female , Male , Middle Aged , Adult , Genome-Wide Association Study , Polymorphism, Single Nucleotide , White People/genetics , Phenotype , Aged , Vitamin D Deficiency/genetics , Vitamin D Deficiency/blood , Vitamin D Deficiency/epidemiology , Multifactorial Inheritance/genetics
ABSTRACT
Fisher's partitioning of genotypic values and genetic variance is highly relevant in the current era of genome-wide association studies (GWASs). However, despite being more than a century old, a number of persistent misconceptions related to nonadditive genetic effects remain. We developed a user-friendly web tool, the Falconer ShinyApp, to show how the combination of gene action and allele frequencies at causal loci translate to genetic variance and genetic variance components for a complex trait. The app can be used to demonstrate the relationship between a SNP effect size estimated from GWAS and the variation the SNP generates in the population, i.e., how locus-specific effects lead to individual differences in traits. In addition, it can also be used to demonstrate how within and between locus interactions (dominance and epistasis, respectively) usually do not lead to a large amount of nonadditive variance relative to additive variance, and therefore, that these interactions usually do not explain individual differences in a population.
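The link between gene action, allele frequency and variance components at a single locus can be sketched with the textbook single-locus model. The code assumes genotypic values +a, d and -a for genotypes A1A1, A1A2 and A2A2, allele frequency p for A1, and Hardy-Weinberg proportions; the function name and example values are hypothetical and this is not code from the Falconer ShinyApp.

```python
def variance_components(p, a, d):
    """Single-locus additive and dominance variance under Hardy-Weinberg equilibrium.

    Genotypic values: A1A1 = +a, A1A2 = d, A2A2 = -a; p is the frequency of A1.
    """
    q = 1.0 - p
    alpha = a + d * (q - p)              # average effect of allele substitution
    v_a = 2.0 * p * q * alpha ** 2       # additive variance
    v_d = (2.0 * p * q * d) ** 2         # dominance variance
    return v_a, v_d

def total_genetic_variance(p, a, d):
    """Genotypic variance computed directly from genotype frequencies (a check)."""
    q = 1.0 - p
    freqs = (p * p, 2.0 * p * q, q * q)
    values = (a, d, -a)
    mean = sum(f * v for f, v in zip(freqs, values))
    return sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))

# Complete dominance (d = a) with a rare A1 allele.
v_a, v_d = variance_components(p=0.1, a=1.0, d=1.0)
print(v_a, v_d, v_a / (v_a + v_d))
```

Even with complete dominance at the locus, about 95% of the genetic variance in this example is additive, which illustrates why dominant gene action need not translate into much dominance variance. Epistasis (between-locus interaction) is not covered by this single-locus sketch.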
Subject(s)
Genes/genetics , Genetic Variation , Genome-Wide Association Study , Internet , Software , Epistasis, Genetic , Gene Frequency , Genes, Dominant , Genetic Loci/genetics , Genotype , Humans , Models, Genetic , Polymorphism, Single Nucleotide
ABSTRACT
INTRODUCTION: Sex steroid hormone fluctuations may underlie both reproductive disorders and sex differences in lifetime depression prevalence. Previous studies report high comorbidity among reproductive disorders and between reproductive disorders and depression. This study sought to assess the multivariate genetic architecture of reproductive disorders and their loading onto a common genetic factor and investigated whether this latent factor shares a common genetic architecture with female depression, including perinatal depression (PND). METHOD: Using UK Biobank and FinnGen data, genome-wide association meta-analyses were conducted for nine reproductive disorders, and genetic correlation between disorders was estimated. Genomic Structural Equation Modelling identified a latent genetic factor underlying disorders, accounting for their significant genetic correlations. SNPs significantly associated with both latent factor and depression were identified. RESULTS: Excellent model fit existed between a latent factor underlying five reproductive disorders (χ2 (5) = 6.4; AIC = 26.4; CFI = 1.00; SRMR = 0.03) with high standardised loadings for menorrhagia (0.96, SE = 0.05); ovarian cysts (0.94, SE = 0.05); endometriosis (0.83, SE = 0.05); menopausal symptoms (0.77, SE = 0.10); and uterine fibroids (0.65, SE = 0.05). This latent factor was genetically correlated with PND (rG = 0.37, SE = 0.15, p = 1.4e-03), depression in females only (rG = 0.48, SE = 0.06, p = 7.2e-11), and depression in both males and females (MD) (rG = 0.35, SE = 0.03, p = 1.8e-30), with its top locus associated with FSHB/ARL14EP (rs11031006; p = 9.1e-33). SNPs intronic to ESR1, significantly associated with the latent factor, were also associated with PND, female depression, and MD. CONCLUSION: A common genetic factor, correlated with depression, underlies risk of reproductive disorders, with implications for aetiology and treatment. 
Genetic variation in ESR1 is associated with reproductive disorders and depression, highlighting the importance of oestrogen signalling for both reproductive and mental health.
Subject(s)
Depression , Genome-Wide Association Study , Pregnancy , Humans , Male , Female , Reproduction , Risk Factors , Comorbidity
ABSTRACT
Samples can be prone to ascertainment and attrition biases. The Australian Genetics of Depression Study (AGDS) is a large publicly recruited cohort (n = 20,689) established to increase the understanding of depression and antidepressant treatment response. This study investigates differences between participants who donated a saliva sample or agreed to linkage of their records compared to those who did not. We observed that older, male participants with higher education were more likely to donate a saliva sample. Self-reported bipolar disorder, ADHD, panic disorder, PTSD, substance use disorder, and social anxiety disorder were associated with lower odds of donating a saliva sample, whereas anorexia was associated with higher odds of donation. Male and younger participants showed higher odds of agreeing to record linkage. Participants with higher neuroticism scores and those with a history of bipolar disorder were also more likely to agree to record linkage whereas participants with a diagnosis of anorexia were less likely to agree. Increased likelihood of consent was associated with increased genetic susceptibility to anorexia and reduced genetic risk for depression and schizophrenia. Overall, our results show moderate differences among these subsamples. Most current epidemiological studies do not search for attrition biases at the genetic level. The ability to do so is a strength of samples such as the AGDS. Our results suggest that analyses can be made more robust by identifying attrition biases at both the phenotypic and genetic level, and either contextualising them as a potential limitation or performing sensitivity analyses adjusting for them.