ABSTRACT
Assumptions are made about the genetic model of single nucleotide polymorphisms (SNPs) when choosing a traditional genetic encoding: additive, dominant, and recessive. Furthermore, SNPs across the genome are unlikely to demonstrate identical genetic models. However, running SNP-SNP interaction analyses with every combination of encodings raises the multiple testing burden. Here, we present a novel and flexible encoding for genetic interactions, the elastic data-driven genetic encoding (EDGE), in which SNPs are assigned a heterozygous value based on the genetic model they demonstrate in a dataset prior to interaction testing. We assessed the power of EDGE to detect genetic interactions using 29 combinations of simulated genetic models and found it outperformed the traditional encoding methods across 10%, 30%, and 50% minor allele frequencies (MAFs). Further, EDGE maintained a low false-positive rate, while additive and dominant encodings demonstrated inflation. We evaluated EDGE and the traditional encodings with genetic data from the Electronic Medical Records and Genomics (eMERGE) Network for five phenotypes: age-related macular degeneration (AMD), age-related cataract, glaucoma, type 2 diabetes (T2D), and resistant hypertension. A multi-encoding genome-wide association study (GWAS) for each phenotype was performed using the traditional encodings, and the top results of the multi-encoding GWAS were considered for SNP-SNP interaction using the traditional encodings and EDGE. EDGE identified a novel SNP-SNP interaction for age-related cataract that no other method identified: rs7787286 (MAF: 0.041; intergenic region of chromosome 7)-rs4695885 (MAF: 0.34; intergenic region of chromosome 4) with a Bonferroni LRT p of 0.018. A SNP-SNP interaction was found in data from the UK Biobank within 25 kb of these SNPs using the recessive encoding: rs60374751 (MAF: 0.030) and rs6843594 (MAF: 0.34) (Bonferroni LRT p: 0.026). 
We recommend using EDGE to flexibly detect interactions between SNPs exhibiting diverse action.
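The idea of assigning each SNP a data-driven heterozygous value can be illustrated with a small sketch. It assumes a covariate-free case-control setting and estimates the heterozygote weight as the ratio of the heterozygote log-odds effect to the homozygous-alternate log-odds effect, then re-encodes genotypes as 0, alpha, 1. The function names, the lack of covariates, and the 0.5 continuity correction are assumptions made for illustration; the abstract does not specify these details.

```python
import math


def log_odds(cases, controls):
    # 0.5 continuity correction (an assumption) keeps sparse cells finite.
    return math.log((cases + 0.5) / (controls + 0.5))


def heterozygous_weight(genotypes, phenotypes):
    """Estimate a heterozygote weight for one SNP.

    genotypes: 0/1/2 copies of the alternate allele.
    phenotypes: 0 (control) or 1 (case).
    """
    counts = {g: [0, 0] for g in (0, 1, 2)}  # genotype -> [controls, cases]
    for g, y in zip(genotypes, phenotypes):
        counts[g][y] += 1
    ref = log_odds(counts[0][1], counts[0][0])
    beta_het = log_odds(counts[1][1], counts[1][0]) - ref
    beta_hom = log_odds(counts[2][1], counts[2][0]) - ref
    if beta_hom == 0:
        raise ValueError("homozygous-alternate effect is zero; weight undefined")
    return beta_het / beta_hom


def encode(genotypes, alpha):
    # Homozygous reference -> 0, heterozygous -> alpha, homozygous alternate -> 1.
    mapping = {0: 0.0, 1: alpha, 2: 1.0}
    return [mapping[g] for g in genotypes]
```

Under this sketch a weight near 0 corresponds to recessive-like action, near 0.5 to additive action, and near 1 to dominant-like action; the encoded values would then be used in the interaction test.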
Subject(s)
Models, Genetic , Cataract/genetics , Datasets as Topic , Diabetes Mellitus, Type 2/genetics , Gene Frequency , Genome-Wide Association Study , Glaucoma/genetics , Humans , Hypertension/genetics , Macular Degeneration/genetics , Phenotype , Polymorphism, Single Nucleotide
ABSTRACT
Vaccination is the most effective way to provide long-lasting immunity against viral infection; thus, rapid assessment of vaccine acceptance is a pressing challenge for health authorities. Prior studies have applied survey techniques to investigate vaccine acceptance, but these may be slow and expensive. This study investigates 29 million vaccine-related tweets from August 8, 2020 to April 19, 2021 and proposes a social media-based approach that derives a vaccine acceptance index (VAI) to quantify Twitter users' opinions on COVID-19 vaccination. This index is calculated based on opinion classifications identified with the aid of natural language processing techniques and provides a quantitative metric to indicate the level of vaccine acceptance across different geographic scales in the U.S. The VAI is easily calculated from the number of positive and negative tweets posted by specific users and groups of users; it can be compiled for regions such as counties or states to provide geospatial information; and it can be tracked over time to assess changes in vaccine acceptance as related to trends in the media and politics. At the national level, the VAI moved from negative to positive in 2020 and remained steady after January 2021. Through exploratory analysis of state- and county-level data, reliable assessments of VAI against subsequent vaccination rates could be made for counties with at least 30 users. The paper discusses information characteristics that enable consistent estimation of VAI. The findings support the use of social media to understand opinions and to offer a timely and cost-effective way to assess vaccine acceptance.
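A minimal sketch of how an index of this kind might be compiled by region is given below. The abstract states only that the VAI is calculated from counts of positive and negative tweets; the normalized difference used here, the record layout, and the function names are assumptions for illustration. The 30-user threshold mirrors the county-level criterion mentioned above.

```python
from collections import defaultdict


def acceptance_index(positive, negative):
    """Assumed form: (positive - negative) / (positive + negative), in [-1, 1]."""
    total = positive + negative
    if total == 0:
        return None  # no classified tweets, index undefined
    return (positive - negative) / total


def index_by_region(records, min_users=30):
    """records: iterable of (region, user_id, label), label is 'pos' or 'neg'.

    Regions with fewer than min_users distinct users are skipped.
    """
    pos = defaultdict(int)
    neg = defaultdict(int)
    users = defaultdict(set)
    for region, user_id, label in records:
        users[region].add(user_id)
        if label == "pos":
            pos[region] += 1
        elif label == "neg":
            neg[region] += 1
    return {
        region: acceptance_index(pos[region], neg[region])
        for region in users
        if len(users[region]) >= min_users
    }
```

Applying the same function to records filtered by date would give the time series described in the abstract.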
Subject(s)
COVID-19 , Social Media , COVID-19/prevention & control , COVID-19 Vaccines , Humans , Natural Language Processing , Vaccination
ABSTRACT
Genome-wide association studies (GWAS) have identified numerous loci associated with human phenotypes. This approach, however, does not consider the richly diverse and complex environment with which humans interact throughout the life course, nor does it allow for interrelationships between genetic loci and across traits. As we move toward making precision medicine a reality, whereby we make predictions about disease risk based on genomic profiles, we need to identify improved predictive models of the relationship between genome and phenome. Methods that embrace pleiotropy (the effect of one locus on more than one trait), and gene-environment (G×E) and gene-gene (G×G) interactions, will further unveil the impact of alterations in biological pathways and identify genes that are only involved with disease in the context of the environment. This valuable information can be used to assess personal risk and choose the most appropriate medical interventions based on the genotype and environment of an individual, the whole premise of precision medicine.
Subject(s)
Genetic Association Studies , Genetic Predisposition to Disease , Genome-Wide Association Study , Precision Medicine , Gene-Environment Interaction , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide
ABSTRACT
We performed a Phenome-wide association study (PheWAS) utilizing diverse genotypic and phenotypic data existing across multiple populations in the National Health and Nutrition Examination Surveys (NHANES), conducted by the Centers for Disease Control and Prevention (CDC), and accessed by the Epidemiological Architecture for Genes Linked to Environment (EAGLE) study. We calculated comprehensive tests of association in Genetic NHANES using 80 SNPs and 1,008 phenotypes (grouped into 184 phenotype classes), stratified by race-ethnicity. Genetic NHANES includes three surveys (NHANES III, 1999-2000, and 2001-2002) and three race-ethnicities: non-Hispanic whites (n = 6,634), non-Hispanic blacks (n = 3,458), and Mexican Americans (n = 3,950). We identified 69 PheWAS associations replicating across surveys for the same SNP, phenotype-class, direction of effect, and race-ethnicity at p<0.01, allele frequency >0.01, and sample size >200. Of these 69 PheWAS associations, 39 replicated previously reported SNP-phenotype associations, 9 were related to previously reported associations, and 21 were novel associations. Fourteen results had the same direction of effect across more than one race-ethnicity: one result was novel, 11 replicated previously reported associations, and two were related to previously reported results. Thirteen SNPs showed evidence of pleiotropy. We further explored results with gene-based biological networks, contrasting the direction of effect for pleiotropic associations across phenotypes. One PheWAS result was ABCG2 missense SNP rs2231142, associated with uric acid levels in both non-Hispanic whites and Mexican Americans, protoporphyrin levels in non-Hispanic whites and Mexican Americans, and blood pressure levels in Mexican Americans. 
Another example was SNP rs1800588 near LIPC, significantly associated with the novel phenotypes of folate levels (Mexican Americans), vitamin E levels (non-Hispanic whites) and triglyceride levels (non-Hispanic whites), and replication for cholesterol levels. The results of this PheWAS show the utility of this approach for exposing more of the complex genetic architecture underlying multiple traits, through generating novel hypotheses for future research.
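The replication criteria stated above (p<0.01, allele frequency >0.01, sample size >200, and the same SNP, phenotype class, direction of effect, and race-ethnicity across surveys) can be sketched as a simple filter. The record field names and the requirement of at least two distinct surveys are assumptions made for this example.

```python
from collections import defaultdict


def replicated_associations(results, p_max=0.01, af_min=0.01, n_min=200):
    """Return keys of associations that pass the thresholds in 2+ surveys.

    results: iterable of dicts with hypothetical keys 'snp', 'phenotype_class',
    'race_ethnicity', 'survey', 'beta', 'p', 'af', and 'n'.
    """
    surveys_by_key = defaultdict(set)
    for r in results:
        if r["p"] < p_max and r["af"] > af_min and r["n"] > n_min:
            direction = 1 if r["beta"] > 0 else -1
            key = (r["snp"], r["phenotype_class"], r["race_ethnicity"], direction)
            surveys_by_key[key].add(r["survey"])
    # Keep only associations seen in at least two distinct surveys.
    return sorted(k for k, s in surveys_by_key.items() if len(s) >= 2)
```

Including the direction of effect in the grouping key means that a SNP with opposite-signed effects in two surveys is not counted as replicating.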
Subject(s)
Gene-Environment Interaction , Genome-Wide Association Study , Phenotype , Adult , Environment , Epidemiologic Research Design , Ethnicity/genetics , Ethnicity/statistics & numerical data , Female , Gene Frequency , Genome-Wide Association Study/statistics & numerical data , Humans , Male , Middle Aged , Nutrition Surveys , Polymorphism, Single Nucleotide , Quantitative Trait, Heritable , United States/epidemiology
ABSTRACT
Bioinformatics approaches to examine gene-gene models provide a means to discover interactions between multiple genes that underlie complex disease. Extensive computational demands and adjusting for multiple testing make uncovering genetic interactions a challenge. Here, we address these issues using our knowledge-driven filtering method, Biofilter, to identify putative single nucleotide polymorphism (SNP) interaction models for cataract susceptibility, thereby reducing the number of models for analysis. Models were evaluated in 3,377 European Americans (1,185 controls, 2,192 cases) from the Marshfield Clinic, a study site of the Electronic Medical Records and Genomics (eMERGE) Network, using logistic regression. All statistically significant models from the Marshfield Clinic were then evaluated in an independent dataset of 4,311 individuals (742 controls, 3,569 cases), using independent samples from additional study sites in the eMERGE Network: Mayo Clinic, Group Health/University of Washington, Vanderbilt University Medical Center, and Geisinger Health System. Eighty-three SNP-SNP models replicated in the independent dataset at likelihood ratio test P < 0.05. Among the most significant replicating models was rs12597188 (intron of CDH1)-rs11564445 (intron of CTNNB1). These genes are known to be involved in processes that include: cell-to-cell adhesion signaling, cell-cell junction organization, and cell-cell communication. Further Biofilter analysis of all replicating models revealed a number of common functions among the genes harboring the 83 replicating SNP-SNP models, which included signal transduction and PI3K-Akt signaling pathway. These findings demonstrate the utility of Biofilter as a biology-driven method, applicable for any genome-wide association study dataset.
Subject(s)
Cataract/genetics , Computational Biology/methods , Data Interpretation, Statistical , Electronic Health Records , Gene-Environment Interaction , Models, Genetic , Age Factors , Case-Control Studies , Cell Adhesion , Female , Genome-Wide Association Study , Genomics/methods , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide/genetics , Population Groups/genetics , Signal Transduction , Software
ABSTRACT
The inherent complexity of biological systems can be leveraged for a greater understanding of the impact of genetic architecture on outcomes, traits, and pharmacological response. The genome-wide association study (GWAS) approach has well-developed and relatively straightforward methods; however, the bigger picture of the impact of genetic architecture on phenotypic outcome remains to be elucidated despite an ever-growing number of GWAS. Greater consideration of the complexity of biological processes, using more data from the phenome, exposome, and diverse -omic resources, including the interplay of pleiotropy and genetic interactions, may provide additional leverage for making the most of the wealth of information available for study. Here, we describe how incorporating greater complexity into analyses through the use of additional phenotypic data and widespread deployment of phenome-wide association studies may provide new insights into genetic factors influencing diseases, traits, and pharmacological response.
Subject(s)
Genome-Wide Association Study , Environment , Epistasis, Genetic , Genetic Pleiotropy , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics
ABSTRACT
PURPOSE: Cataract is the leading cause of blindness in the world, and in the United States accounts for approximately 60% of Medicare costs related to vision. The purpose of this study was to identify genetic markers for age-related cataract through a genome-wide association study (GWAS). METHODS: In the Electronic Medical Records and Genomics (eMERGE) Network, we ran an electronic phenotyping algorithm on individuals in each of five sites with electronic medical records linked to DNA biobanks. We performed a GWAS using 530,101 SNPs from the Illumina 660W-Quad in a total of 7,397 individuals (5,503 cases and 1,894 controls). We also performed an age-at-diagnosis case-only analysis. RESULTS: We identified several statistically significant associations with age-related cataract (45 SNPs) as well as age at diagnosis (44 SNPs). The 45 SNPs associated with cataract at p < 1×10⁻⁵ are in several interesting genes, including ALDOB, MAP3K1, and MEF2C. All have potential biologic relationships with cataracts. CONCLUSIONS: This is the first genome-wide association study of age-related cataract, and several regions of interest have been identified. The eMERGE Network has pioneered the exploration of genomic associations in biobanks linked to electronic health records, and this study is another example of the utility of such resources. Explorations of age-related cataract including validation and replication of the association results identified herein are needed in future studies.
Subject(s)
Cataract/genetics , Electronic Health Records/statistics & numerical data , Fructose-Bisphosphate Aldolase/genetics , Genetic Predisposition to Disease , MAP Kinase Kinase Kinase 1/genetics , Polymorphism, Single Nucleotide , Age Factors , Aged , Aged, 80 and over , Algorithms , Cataract/pathology , Databases, Nucleic Acid , Female , Genetic Markers , Genome, Human , Genome-Wide Association Study , Health Care Costs , Humans , MEF2 Transcription Factors/genetics , Male , Middle Aged , Quantitative Trait Loci , United States
ABSTRACT
Cross-sectional data allow the investigation of how genetics influence health at a single time point, but to understand how the genome impacts phenotype development, one must use repeated measures data. Ignoring the dependency inherent in repeated measures can exacerbate false positives and requires the utilization of methods other than general or generalized linear models. Many methods can accommodate longitudinal data, including the commonly used linear mixed model and generalized estimating equation, as well as the less popular fixed-effects model, cluster-robust standard error adjustment, and aggregate regression. We simulated longitudinal data and applied these five methods alongside naïve linear regression, which ignored the dependency and served as a baseline, to compare their power, false positive rate, estimation accuracy, and precision. The results showed that the naïve linear regression and fixed-effects models incurred high false positive rates when analyzing a predictor that is fixed over time, making them unviable for studying time-invariant genetic effects. The linear mixed models maintained low false positive rates and unbiased estimation. The generalized estimating equation was similar to the former in terms of power and estimation, but it had increased false positives when the sample size was low, as did cluster-robust standard error adjustment. Aggregate regression produced biased estimates when predictor effects varied over time. To show how the method choice affects downstream results, we performed longitudinal analyses in an adolescent cohort of African and European ancestry. We examined how developing post-traumatic stress symptoms were predicted by polygenic risk, traumatic events, exposure to sexual abuse, and income using four approaches-linear mixed models, generalized estimating equations, cluster-robust standard error adjustment, and aggregate regression. 
While the directions of effect were generally consistent, coefficient magnitudes and statistical significance differed across methods. Our in-depth comparison of longitudinal methods showed that linear mixed models and generalized estimating equations were applicable in most scenarios requiring longitudinal modeling, but no approach produced identical results even if fit to the same data. Since result discrepancies can result from methodological choices, it is crucial that researchers determine their model a priori, refrain from testing multiple approaches to obtain favorable results, and utilize as similar as possible methods when seeking to replicate results.
ABSTRACT
Sex and gender differences play a crucial role in health and disease outcomes. This study used data from the National Health and Nutrition Examination Survey to explore how environmental exposures affect health-related traits differently in males and females. We utilized a sex-stratified phenomic environment-wide association study (PheEWAS), which allowed the identification of associations across a wide range of phenotypes and environmental exposures. We examined associations between 272 environmental exposures, including smoking-related exposures such as cotinine levels and smoking habits, and 58 clinically relevant blood phenotypes, such as serum albumin and homocysteine levels. Our analysis identified 119 sex-specific associations. For example, smoking-related exposures had a stronger impact on increasing homocysteine, hemoglobin, and hematocrit levels in females while reducing serum albumin and bilirubin levels and increasing c-reactive protein levels more significantly in males. These findings suggest mechanisms by which smoking exposure may pose higher cardiovascular risks and greater induced hypoxia for women, and greater inflammatory and immune responses in men. The results highlight the importance of considering sex differences in biomedical research. Understanding these differences can help develop more personalized and effective health interventions and improve clinical outcomes for both men and women.
Subject(s)
Environmental Exposure , Humans , Female , Male , Environmental Exposure/adverse effects , Middle Aged , Adult , Sex Factors , Smoking/adverse effects , Nutrition Surveys , Phenotype , Sex Characteristics , Cotinine/blood
ABSTRACT
Alzheimer's disease (AD) is a neurodegenerative disorder characterized by memory and functional impairments. Two of 3 patients with AD are biologically female; therefore, the biological underpinnings of this diagnosis disparity may inform interventions that slow AD progression. To bridge this gap, we conducted analyses of 1078 male and female participants from the Alzheimer's Disease Neuroimaging Initiative to examine associations between levels of cerebrospinal fluid (CSF)/neuroimaging biomarkers and cognitive/functional outcomes. The Chow test was used to quantify sex differences by determining if biological sex affects relationships between the studied biomarkers and outcomes. Multiple magnetic resonance imaging (whole brain, entorhinal cortex, middle temporal gyrus, fusiform gyrus, hippocampus), positron emission tomography (AV45), and CSF (P-TAU, TAU) biomarkers were differentially associated with cognitive and functional outcomes. Post-hoc bootstrapped and association analyses confirmed these differential effects and emphasized the necessity of using separate, sex-stratified models. The studied imaging/CSF biomarkers may account for some of the sex-based variation in AD pathophysiology. The identified sex-varying relationships between CSF/imaging biomarkers and cognitive/functional outcomes warrant future biological investigation in independent cohorts.
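For readers unfamiliar with the Chow test, the sketch below computes its F statistic for the simplest case: one biomarker predicting one outcome, fit separately in two groups and in the pooled sample. Restricting the model to a single predictor with no covariates and the function names are assumptions for illustration; the study's actual models are not described in the abstract.

```python
def residual_sum_of_squares(x, y):
    """RSS of an ordinary least squares fit y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def chow_f_statistic(x1, y1, x2, y2):
    """Chow F statistic for equality of intercept and slope between two groups."""
    k = 2  # parameters per model: intercept and slope
    rss_pooled = residual_sum_of_squares(list(x1) + list(x2), list(y1) + list(y2))
    rss_split = residual_sum_of_squares(x1, y1) + residual_sum_of_squares(x2, y2)
    dof = len(x1) + len(x2) - 2 * k
    return ((rss_pooled - rss_split) / k) / (rss_split / dof)
```

The statistic is compared against an F distribution with k and n1 + n2 - 2k degrees of freedom; a large value indicates that the biomarker-outcome relationship differs between the two groups.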
Subject(s)
Alzheimer Disease , Cognitive Dysfunction , Humans , Male , Female , Alzheimer Disease/pathology , Neuroimaging , Brain/diagnostic imaging , Brain/pathology , Cognition , Biomarkers , tau Proteins , Amyloid beta-Peptides , Cognitive Dysfunction/pathology
ABSTRACT
BACKGROUND: The additive model of inheritance assumes that heterozygotes (Aa) are exactly intermediate in respect to homozygotes (AA and aa). While this model is commonly used in single-locus genetic association studies, significant deviations from additivity are well-documented and contribute to phenotypic variance across many traits and systems. This assumption can introduce type I and type II errors by overestimating or underestimating the effects of variants that deviate from additivity. Alternative genotype encoding strategies have been explored to account for different inheritance patterns, but they often incur significant computational or methodological costs. To address these challenges, we introduce PAGER (Phenotype Adjusted Genotype Encoding and Ranking), an efficient pre-processing method that encodes each genetic variant based on normalized mean phenotypic differences between diallelic genotype classes (AA, Aa, and aa). This approach more accurately reflects each variant's true inheritance model, improving model precision while minimizing the costs associated with alternative encoding strategies. RESULTS: Through extensive benchmarking on SNPs simulated with both binary and continuous phenotypes, we demonstrate that PAGER accurately represents various inheritance patterns (including additive, dominant, recessive, and heterosis), achieves levels of statistical power that meet or exceed other encoding strategies, and attains computation speeds up to 55 times faster than a similar method, EDGE. We also apply PAGER to publicly available real-world data and identify a novel, relevant putative QTL associated with body mass index in rats (Rattus norvegicus) that is not detected with the additive model. CONCLUSIONS: Overall, we show that PAGER is an efficient genotype encoding approach that can uncover sources of missing heritability and reveal novel insights in the study of complex traits while incurring minimal costs.
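A rough sketch of a phenotype-based genotype encoding in the spirit described above follows. It computes the mean phenotype within each genotype class and rescales those means to the range 0 to 1. The min-max scaling and the function name are assumptions; the abstract says only that the encoding is based on normalized mean phenotypic differences between genotype classes, so this should not be read as the published PAGER procedure.

```python
def phenotype_based_encoding(genotypes, phenotypes):
    """Map each genotype class (0, 1, 2) to its min-max scaled mean phenotype."""
    totals = {}
    counts = {}
    for g, y in zip(genotypes, phenotypes):
        totals[g] = totals.get(g, 0.0) + y
        counts[g] = counts.get(g, 0) + 1
    means = {g: totals[g] / counts[g] for g in totals}
    low, high = min(means.values()), max(means.values())
    if high == low:
        # No phenotypic difference between classes: nothing to encode.
        return {g: 0.0 for g in means}
    return {g: (m - low) / (high - low) for g, m in means.items()}
```

With this scaling, class means of 0, 1, 2 give an additive pattern (0, 0.5, 1), means of 0, 2, 2 give a dominant pattern (0, 1, 1), and means of 0, 2, 0 give a heterosis pattern (0, 1, 0).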
ABSTRACT
Background: Circulating small RNAs (smRNAs) originate from diverse tissues and organs. Previous studies investigating smRNAs as potential biomarkers for Parkinson's disease (PD) have yielded inconsistent results. We investigated whether smRNA profiles from neuronally-enriched serum exosomes and microvesicles are altered in PD patients and discriminate PD subjects from controls. Methods: Demographic, clinical, and serum samples were obtained from 60 PD subjects and 40 age- and sex-matched controls. Exosomes and microvesicles were extracted and isolated using a validated neuronal membrane marker (CD171). Sequencing and bioinformatics analyses were used to identify differentially expressed smRNAs in PD and control samples. SmRNAs also were tested for association with clinical metrics. Logistic regression and random forest classification models evaluated the discriminative value of the smRNAs. Results: In serum CD171 enriched exosomes and microvesicles, a panel of 29 smRNAs was expressed differentially between PD and controls (false discovery rate (FDR) < 0.05). Among the smRNAs, 23 were upregulated and 6 were downregulated in PD patients. Pathway analysis revealed links to cellular proliferation regulation and signaling. Least absolute shrinkage and selection operator adjusted for the multicollinearity of these smRNAs and association tests to clinical parameters via linear regression did not yield significant results. Univariate logistic regression models showed that four smRNAs achieved an AUC ≥ 0.74 to discriminate PD subjects from controls. The random forest model had an AUC of 0.942 for the 29 smRNA panel. Conclusion: CD171-enriched exosomes and microvesicles contain the differential expression of smRNAs between PD and controls. Future studies are warranted to follow up on the findings and understand the scientific and clinical relevance.
ABSTRACT
Macrophages play a pivotal role in mediating inflammation and subsequent resolution of inflammation. The availability of selenium as a micronutrient and the subsequent biosynthesis of selenoproteins, containing the 21st amino acid selenocysteine (Sec), are important for the physiological functions of macrophages. Selenoproteins regulate the redox tone in macrophages during inflammation, the early onset of which involves oxidative burst of reactive oxygen and nitrogen species. SELENOW is a highly expressed selenoprotein in bone marrow-derived macrophages (BMDMs). Beyond its described general role as a thiol and peroxide reductase and as an interacting partner for 14-3-3 proteins, its cellular functions, particularly in macrophages, remain largely unknown. In this study, we utilized Selenow knock-out (KO) murine bone marrow-derived macrophages (BMDMs) to address the role of SELENOW in inflammation following stimulation with bacterial endotoxin lipopolysaccharide (LPS). RNAseq-based temporal analyses of expression of selenoproteins and the Sec incorporation machinery genes suggested no major differences in the selenium utilization pathway in the Selenow KO BMDMs compared to their wild-type counterparts. However, selective enrichment of oxidative stress-related selenoproteins and increased ROS in Selenow-/- BMDMs indicated anomalies in redox homeostasis associated with hierarchical expression of selenoproteins. Selenow-/- BMDMs also exhibited reduced expression of arginase-1, a key enzyme associated with anti-inflammatory (M2) phenotype necessary to resolve inflammation, along with a significant decrease in efferocytosis of neutrophils that triggers pathways of resolution. Parallel targeted metabolomics analysis also confirmed an impairment in arginine metabolism in Selenow-/- BMDMs. Furthermore, Selenow-/- BMDMs lacked the ability to enhance characteristic glycolytic metabolism during inflammation. 
Instead, these macrophages atypically relied on oxidative phosphorylation for energy production when glucose was used as an energy source. These findings suggest that SELENOW expression in macrophages may have important implications on cellular redox processes and bioenergetics during inflammation and its resolution.
Subject(s)
Selenium , Selenoprotein W , Mice , Animals , Selenoprotein W/genetics , Selenoprotein W/metabolism , Selenium/metabolism , Selenoproteins/genetics , Selenoproteins/metabolism , Macrophages/metabolism , Oxidation-Reduction , Inflammation/genetics
ABSTRACT
Inflammation skews bone marrow hematopoiesis, increasing the production of myeloid effector cells at the expense of steady-state erythropoiesis. A compensatory stress erythropoiesis response is induced to maintain homeostasis until inflammation is resolved. In contrast to steady-state erythroid progenitors, stress erythroid progenitors (SEPs) utilize signals induced by inflammatory stimuli. However, the mechanistic basis for this is not clear. Here we reveal a nitric oxide (NO)-dependent regulatory network underlying two stages of stress erythropoiesis, namely proliferation and the transition to differentiation. In the proliferative stage, immature SEPs and cells in the niche increased expression of inducible nitric oxide synthase (Nos2 or iNOS) to generate NO. Increased NO rewires SEP metabolism to increase anabolic pathways, which drive the biosynthesis of nucleotides, amino acids and other intermediates needed for cell division. This NO-dependent metabolism promotes cell proliferation while also inhibiting erythroid differentiation, leading to the amplification of a large population of non-committed progenitors. The transition of these progenitors to differentiation is mediated by the activation of nuclear factor erythroid 2-related factor 2 (Nfe2l2 or Nrf2). Nrf2 acts as an anti-inflammatory regulator that decreases NO production, which removes the NO-dependent erythroid inhibition and allows for differentiation. These data provide a paradigm for how alterations in metabolism allow inflammatory signals to amplify immature progenitors prior to differentiation. Key points: Nitric oxide (NO)-dependent signaling favors an anabolic metabolism that promotes proliferation and inhibits differentiation. Activation of Nfe2l2 (Nrf2) decreases NO production, allowing erythroid differentiation.
ABSTRACT
Relapse of acute myeloid leukemia (AML) remains a significant concern due to persistent leukemia-initiating stem cells (LICs) that are typically not targeted by most existing therapies. Using a murine AML model, human AML cell lines, and patient samples, we show that AML LICs are sensitive to endogenous and exogenous cyclopentenone prostaglandin-J (CyPG), Δ12-PGJ2, and 15d-PGJ2, which are increased upon dietary selenium supplementation via the cyclooxygenase-hematopoietic PGD synthase pathway. CyPGs are endogenous ligands for peroxisome proliferator-activated receptor gamma and GPR44 (CRTH2; PTGDR2). Deletion of GPR44 in a mouse model of AML exacerbated the disease suggesting that GPR44 activation mediates selenium-mediated apoptosis of LICs. Transcriptomic analysis of GPR44-/- LICs indicated that GPR44 activation by CyPGs suppressed KRAS-mediated MAPK and PI3K/AKT/mTOR signaling pathways, to enhance apoptosis. Our studies show the role of GPR44, providing mechanistic underpinnings of the chemopreventive and chemotherapeutic properties of selenium and CyPGs in AML.
Subject(s)
Leukemia, Myeloid, Acute , Selenium , Humans , Mice , Animals , Phosphatidylinositol 3-Kinases , Signal Transduction , Cell Line
ABSTRACT
Alzheimer's disease (AD) is the leading cause of dementia; however, men and women face differential AD prevalence, presentation, and progression risks. Characterizing metabolomic profiles during AD progression is fundamental to understand the metabolic disruptions and the biological pathways involved. However, outstanding questions remain of whether peripheral metabolic changes occur equally in men and women with AD. Here, we evaluated differential effects of metabolomic and brain volume associations between sexes. We used three cohorts from the Alzheimer's Disease Neuroimaging Initiative (ADNI), evaluated 1,368 participants, two metabolomic platforms with 380 metabolites in total, and six brain segment volumes. Using dimension reduction techniques, we took advantage of the correlation structure of the brain volume phenotypes and the metabolite concentration values to reduce the number of tests while aggregating relevant biological structures. Using WGCNA, we aggregated modules of highly co-expressed metabolites. On the other hand, we used partial least squares regression-discriminant analysis (PLS-DA) to extract components of brain volumes that maximally co-vary with AD diagnosis as phenotypes. We tested for differences in effect sizes between sexes in the association between single metabolite and metabolite modules with the brain volume components. We found five metabolite modules and 125 single metabolites with significant differences between sexes. These results highlight a differential lipid disruption in AD progression between sexes. Men showed a greater negative association of phosphatidylcholines and sphingomyelins and a positive association of VLDL and large LDL with AD progression. In contrast, women showed a positive association of triglycerides in VLDL and small and medium LDL with AD progression. Explicitly identifying sex differences in metabolomics during AD progression can highlight particular metabolic disruptions in each sex. 
Our research study and strategy can lead to better-tailored studies and better-suited treatments that take sex differences into account.
ABSTRACT
New insights into mechanisms linking obesity to poor health outcomes suggest a role for cellular aging pathways, casting obesity as a disease of accelerated biological aging. Although obesity has been linked to accelerated epigenetic aging in middle-aged adults, the impact during childhood remains unclear. We tested the association between body mass index (BMI) and accelerated epigenetic aging in a cohort of high-risk children. Participants were children (N = 273, aged 8 to 14 years, 82% investigated for maltreatment) recruited to the Child Health Study, an ongoing prospective study of youth investigated for maltreatment and comparison youth. BMI was measured as a continuous variable. Accelerated epigenetic aging of blood leukocytes was defined as the age-adjusted residuals of several established epigenetic aging clocks (Horvath, Hannum, GrimAge, PhenoAge) along with a newer algorithm, the DunedinPoAm, developed to quantify the pace of aging. Hypotheses were tested with generalized linear models. Higher age- and sex-adjusted z-scored BMI was significantly correlated with household income, blood cell counts, and three of the accelerated epigenetic aging measures: GrimAge (r = 0.31, P < .0001), PhenoAge (r = 0.24, P < .0001), and DunedinPoAm (r = 0.38, P < .0001). In fully adjusted models, GrimAge (β = 0.07; P = .0009) and DunedinPoAm (β = 0.0017; P < .0001) remained significantly associated with higher age- and sex-adjusted z-scored BMI. Maltreatment status was not associated with accelerated epigenetic aging. In a high-risk cohort of children, higher BMI predicted epigenetic aging as assessed by two epigenetic aging clocks. These results suggest the association between obesity and accelerated epigenetic aging begins in early life, with implications for future morbidity and mortality risk.
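The definition of accelerated epigenetic aging as age-adjusted residuals can be made concrete with a short sketch: regress the clock's estimated age on chronological age and keep each participant's residual. The single-predictor least squares fit and the function names are assumptions for illustration; the study may have adjusted for additional variables.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def age_acceleration(chronological_age, epigenetic_age):
    """Residual of epigenetic age after regressing on chronological age.

    Positive values mean the clock reads older than expected for that age.
    """
    intercept, slope = fit_line(chronological_age, epigenetic_age)
    return [
        e - (intercept + slope * a)
        for a, e in zip(chronological_age, epigenetic_age)
    ]
```

These residuals are uncorrelated with chronological age by construction, which is why they can be related to BMI without age acting as a shared driver.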
Subject(s)
DNA Methylation , Epigenesis, Genetic , Adolescent , Adult , Aging/genetics , Child , Humans , Middle Aged , Obesity/genetics , Prospective Studies
ABSTRACT
Environmental exposure pathophysiology related to smoking can yield metabolic changes that are difficult to describe in a biologically informative fashion with manual proprietary software. Nuclear magnetic resonance (NMR) spectroscopy detects compounds found in biofluids, yielding a metabolic snapshot. We applied our semi-automated NMR pipeline for a secondary analysis of a smoking study (MTBLS374 from the MetaboLights repository) (n = 112). This involved quality control (in the form of data preprocessing), automated metabolite quantification, and analysis. With our approach we putatively identified 79 metabolites that were previously unreported in the dataset. Quantified metabolites were used for metabolic pathway enrichment analysis, which replicated 1 enriched pathway from the original study and identified 3 previously unreported pathways. Our pipeline generated a new random forest (RF) classifier between smoking classes that revealed several combinations of compounds. This study broadens our metabolomic understanding of smoking exposure by 1) notably increasing the number of quantified metabolites with our analytic pipeline, 2) suggesting smoking exposure may lead to heterogeneous metabolic responses according to random forest modeling, and 3) modeling how newly quantified individual metabolites can determine smoking status. Our approach can be applied to other NMR studies to characterize environmental risk factors, allowing for the discovery of new biomarkers of disease and exposure status.
Subject(s)
Non-Smokers , Smokers , Computational Biology , Humans , Magnetic Resonance Spectroscopy , Metabolomics
ABSTRACT
BACKGROUND: Childhood sexual abuse (CSA) confers elevated risks for obesity in females. Mechanisms that explain this link remain unclear. This study tracked serum basal cortisol levels with body mass index (BMI) from childhood into adulthood to test whether hypothalamic-pituitary-adrenal (HPA) axis attenuation accounts for elevated obesity risks for sexually abused females. METHODS: Data were drawn from six timepoints of a longitudinal study of the impact of CSA on development. Participants were females aged 6-16 years at time of study enrollment with substantiated CSA and demographically matched non-abused peers. Analyses included only participants who did not have obesity at study enrollment. Main outcomes were BMI growth trajectories across ages 6-27 (n = 150; 66 abused, 84 comparison) and early adulthood obesity status (ages 20-27; n = 133; 62 abused, 71 comparison). HPA axis functioning indicators were intercept and linear slope parameters extracted from multilevel growth trajectories of serum basal cortisol levels across development. Racial-ethnic minority status, parity, steroid medication use, depression history, and disordered eating history were covaried. RESULTS: While controlling for covariates, multilevel modeling indicated that high initial serum basal cortisol levels in childhood and attenuated cortisol growth rate over time (i.e., HPA axis attenuation) were associated with accelerated BMI accumulation (p < .01). Attenuated cortisol growth rate mediated the effect of CSA on accelerated BMI accumulation and on elevated adulthood obesity rates (p < .05). CONCLUSION: This work establishes a mechanistic association between HPA axis attenuation and obesity, suggesting that trauma treatments for abuse survivors should include interventions that reduce health consequences associated with dysregulated stress physiology.
Subject(s)
Child Abuse, Sexual , Hypothalamo-Hypophyseal System , Obesity , Pituitary-Adrenal System , Adolescent , Adult , Case-Control Studies , Child , Child Abuse, Sexual/statistics & numerical data , Female , Humans , Hydrocortisone/blood , Hypothalamo-Hypophyseal System/physiopathology , Longitudinal Studies , Obesity/blood , Obesity/epidemiology , Pituitary-Adrenal System/physiopathology , Risk Assessment , Young Adult
ABSTRACT
Phenome-wide association studies (PheWAS) allow agnostic investigation of common genetic variants in relation to a variety of phenotypes, but preserving the power of PheWAS requires careful phenotypic quality control (QC) procedures. While QC of genetic data is well-defined, no established QC practices exist for multi-phenotypic data. Manually imposing sample size restrictions, identifying variable types/distributions, and locating problems such as missing data or outliers is arduous in large, multivariate datasets. In this paper, we perform two PheWAS on epidemiological data and, utilizing the novel software CLARITE (CLeaning to Analysis: Reproducibility-based Interface for Traits and Exposures), showcase a transparent and replicable phenome QC pipeline which we believe is a necessity for the field. Using data from the Ludwigshafen Risk and Cardiovascular (LURIC) Health Study, we ran two PheWAS, one on cardiac-related diseases and the other on polyunsaturated fatty acid levels. These phenotypes underwent a stringent quality control screen and were regressed on a genome-wide sample of single nucleotide polymorphisms (SNPs). Seven SNPs were significant in association with dihomo-γ-linolenic acid, of which five were within fatty acid desaturases FADS1 and FADS2. PheWAS is a useful tool to elucidate the genetic architecture of complex disease phenotypes within a single experimental framework. However, to reduce computational and multiple-comparisons burden, careful assessment of phenotype quality and removal of low-quality data is prudent. Herein we perform two PheWAS while applying a detailed phenotype QC process, for which we provide a replicable pipeline that is modifiable for application to other large datasets with heterogeneous phenotypes.
As investigation of complex traits continues beyond traditional genome-wide association studies (GWAS), such QC considerations and tools such as CLARITE are crucial in the analysis of non-genetic big data such as clinical measurements, lifestyle habits, and polygenic traits.
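The phenotype QC steps named above (sample size restrictions, variable typing, handling of missing data) and the multiple-comparisons burden can be sketched in plain Python. This sketch does not use or reproduce the CLARITE API; the function names `qc_phenotypes` and `bonferroni_threshold`, the thresholds `min_n` and `max_categories`, and the toy data are all hypothetical.

```python
def qc_phenotypes(data, min_n=200, max_categories=6):
    """Filter and type phenotype columns.

    data: dict mapping phenotype name -> list of values (None = missing).
    Returns a dict mapping each retained phenotype to an assigned type.
    """
    kept = {}
    for name, values in data.items():
        observed = [v for v in values if v is not None]
        if len(observed) < min_n:
            continue  # too few non-missing observations
        n_unique = len(set(observed))
        if n_unique < 2:
            continue  # constant variable carries no information
        if n_unique == 2:
            kept[name] = "binary"
        elif n_unique <= max_categories:
            kept[name] = "categorical"
        else:
            kept[name] = "continuous"
    return kept


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Toy phenotype table (values are illustrative only).
toy = {
    "binary_trait": [0, 1, 1, 0, None],
    "lab_value": [1.2, 1.5, 0.9, 2.1, 1.7, 1.1, 1.3],
    "rare_measure": [1, None, None],
    "constant": [1, 1, 1, 1],
}
retained = qc_phenotypes(toy, min_n=4, max_categories=3)
print(retained)
print(bonferroni_threshold(0.05, len(retained)))
```

Dropping low-quality phenotypes before testing shrinks the number of tests, which relaxes the corrected per-test threshold for those that remain.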