ABSTRACT
BACKGROUND: Menopausal hormone therapy (MHT), a common treatment to relieve symptoms of menopause, is associated with a lower risk of colorectal cancer (CRC). To inform CRC risk prediction and MHT risk-benefit assessment, we aimed to evaluate the joint association of a polygenic risk score (PRS) for CRC and MHT on CRC risk. METHODS: We used data from 28,486 postmenopausal women (11,519 cases and 16,967 controls) of European descent. A PRS based on 141 CRC-associated genetic variants was modeled as a categorical variable in quartiles. Multiplicative interaction between PRS and MHT use was evaluated using logistic regression. Additive interaction was measured using the relative excess risk due to interaction (RERI). 30-year cumulative risks of CRC for 50-year-old women according to MHT use and PRS were calculated. RESULTS: The reduction in odds ratios by MHT use was larger in women within the highest quartile of PRS compared to that in women within the lowest quartile of PRS (p-value = 2.7 × 10⁻⁸). At the highest quartile of PRS, the 30-year CRC risk was statistically significantly lower for women taking any MHT than for women not taking any MHT, 3.7% (3.3%-4.0%) vs 6.1% (5.7%-6.5%) (difference 2.4%, P-value = 1.83 × 10⁻¹⁴); these differences were also statistically significant but smaller in magnitude in the lowest PRS quartile, 1.6% (1.4%-1.8%) vs 2.2% (1.9%-2.4%) (difference 0.6%, P-value = 1.01 × 10⁻³), indicating a 4 times greater reduction in absolute risk associated with any MHT use in the highest compared to the lowest quartile of genetic CRC risk. CONCLUSIONS: MHT use has a greater impact on the reduction of CRC risk for women at higher genetic risk. These findings have implications for the development of risk prediction models for CRC and potentially for the consideration of genetic information in the risk-benefit assessment of MHT use.
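The "4 times greater reduction" reported above follows directly from the two absolute risk differences. A minimal sketch of that arithmetic, using only the point estimates quoted in the abstract; the dictionary layout and function name are illustrative assumptions, not part of the study.

```python
# 30-year cumulative CRC risks (%) as quoted in the abstract, by PRS quartile.
# The data structure and names below are hypothetical, chosen for illustration.
risks = {
    "highest": {"no_mht": 6.1, "mht": 3.7},
    "lowest": {"no_mht": 2.2, "mht": 1.6},
}


def risk_difference(group):
    """Absolute risk difference (percentage points): non-users minus MHT users."""
    return round(group["no_mht"] - group["mht"], 1)


diff_high = risk_difference(risks["highest"])
diff_low = risk_difference(risks["lowest"])
ratio = round(diff_high / diff_low, 1)

print(diff_high, diff_low, ratio)
```

The ratio compares point estimates only and ignores the confidence intervals reported alongside them.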
Subject(s)
Colorectal Neoplasms , Genetic Predisposition to Disease , Humans , Female , Colorectal Neoplasms/genetics , Colorectal Neoplasms/epidemiology , Middle Aged , Case-Control Studies , Risk Factors , Aged , Hormone Replacement Therapy/adverse effects , Risk Assessment , Menopause , Postmenopause , Estrogen Replacement Therapy/adverse effects
ABSTRACT
BACKGROUND: Colorectal cancer (CRC) is a common, fatal cancer. Identifying subgroups who may benefit more from intervention is of critical public health importance. Previous studies have assessed multiplicative interaction between genetic risk scores and environmental factors, but few have assessed additive interaction, the relevant public health measure. METHODS: Using resources from colorectal cancer consortia including 45,247 CRC cases and 52,671 controls, we assessed multiplicative and additive interaction (relative excess risk due to interaction, RERI) using logistic regression between 13 harmonized environmental factors and genetic risk score including 141 variants associated with CRC risk. RESULTS: There was no evidence of multiplicative interaction between environmental factors and genetic risk score. There was additive interaction where, for individuals with high genetic susceptibility, either heavy drinking [RERI = 0.24, 95% confidence interval, CI, (0.13, 0.36)], ever smoking [0.11 (0.05, 0.16)], high BMI [female 0.09 (0.05, 0.13), male 0.10 (0.05, 0.14)], or high red meat intake [highest versus lowest quartile 0.18 (0.09, 0.27)] was associated with excess CRC risk greater than that for individuals with average genetic susceptibility. Conversely, we estimate those with high genetic susceptibility may benefit more from reducing CRC risk with aspirin/NSAID use [-0.16 (-0.20, -0.11)] or higher intake of fruit, fiber, or calcium [highest quartile versus lowest quartile -0.12 (-0.18, -0.050); -0.16 (-0.23, -0.09); -0.11 (-0.18, -0.05), respectively] than those with average genetic susceptibility. CONCLUSIONS: Additive interaction is important to assess for identifying subgroups who may benefit from intervention. The subgroups identified in this study may help inform precision CRC prevention.
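The additive interaction measure named above, RERI, is conventionally computed from three odds ratios that share a common reference group (neither factor present). A minimal sketch, assuming odds ratios approximate relative risks; the function name and the example values are hypothetical and are not taken from the study.

```python
def reri(or_both: float, or_genetic_only: float, or_exposure_only: float) -> float:
    """Relative excess risk due to interaction: OR11 - OR10 - OR01 + 1.

    All three odds ratios are relative to the group with neither
    high genetic risk nor the environmental exposure.
    """
    return or_both - or_genetic_only - or_exposure_only + 1.0


# Hypothetical odds ratios, for illustration only.
example = reri(or_both=2.0, or_genetic_only=1.5, or_exposure_only=1.2)
print(round(example, 2))

# Under exact additivity of excess risks, RERI is zero.
additive = reri(or_both=1.7, or_genetic_only=1.5, or_exposure_only=1.2)
print(round(additive, 2))
```

A positive RERI indicates that the joint excess risk exceeds the sum of the separate excess risks; a negative value indicates the reverse.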
ABSTRACT
BACKGROUND: Diabetes is an established risk factor for colorectal cancer. However, the mechanisms underlying this relationship still require investigation and it is not known if the association is modified by genetic variants. To address these questions, we undertook a genome-wide gene-environment interaction analysis. METHODS: We used data from 3 genetic consortia (CCFR, CORECT, GECCO; 31,318 colorectal cancer cases/41,499 controls) and undertook genome-wide gene-environment interaction analyses with colorectal cancer risk, including interaction tests of genetics (G) × diabetes (1-degree of freedom; d.f.) and joint testing of G × diabetes, G-colorectal cancer association (2-d.f. joint test) and G-diabetes correlation (3-d.f. joint test). RESULTS: Based on the joint tests, we found that the association of diabetes with colorectal cancer risk is modified by loci on chromosomes 8q24.11 (rs3802177, SLC30A8 - OR_AA: 1.62, 95% CI: 1.34-1.96; OR_AG: 1.41, 95% CI: 1.30-1.54; OR_GG: 1.22, 95% CI: 1.13-1.31; 3-d.f. p-value: 5.46 × 10⁻¹¹) and 13q14.13 (rs9526201, LRCH1 - OR_GG: 2.11, 95% CI: 1.56-2.83; OR_GA: 1.52, 95% CI: 1.38-1.68; OR_AA: 1.13, 95% CI: 1.06-1.21; 2-d.f. p-value: 7.84 × 10⁻⁹). DISCUSSION: These results suggest that variation in genes related to insulin signaling (SLC30A8) and immune function (LRCH1) may modify the association of diabetes with colorectal cancer risk and provide novel insights into the biology underlying the diabetes and colorectal cancer relationship.
Subject(s)
Colorectal Neoplasms , Diabetes Mellitus , Humans , Gene-Environment Interaction , Genetic Predisposition to Disease , Risk Factors , Diabetes Mellitus/genetics , Colorectal Neoplasms/genetics , Polymorphism, Single Nucleotide , Genome-Wide Association Study/methods , Microfilament Proteins/genetics
ABSTRACT
BACKGROUND: There is a need to match characteristics of tobacco users with cessation treatments and risks of tobacco attributable diseases such as lung cancer. The rate at which the body metabolizes nicotine has proven an important predictor of these outcomes. Nicotine metabolism is primarily catalyzed by the enzyme cytochrome P450 2A6 (CYP2A6) and CYP2A6 activity can be measured as the ratio of two nicotine metabolites: trans-3'-hydroxycotinine to cotinine (NMR). Measurements of these metabolites are only possible in current tobacco users and vary by biofluid source, timing of collection, and protocols; unfortunately, this has limited their use in clinical practice. The NMR depends highly on genetic variation near CYP2A6 on chromosome 19 as well as ancestry, environmental, and other genetic factors. Thus, we aimed to develop prediction models of nicotine metabolism using genotypes and basic individual characteristics (age, gender, height, and weight). RESULTS: We identified four multiethnic studies with nicotine metabolites and DNA samples. We constructed a 263 marker panel from filtering genome-wide association scans of the NMR in each study. We then applied seven machine learning techniques to train models of nicotine metabolism on the largest and most ancestrally diverse dataset (N=2239). The models were then validated using the other three studies (total N=1415). Using cross-validation, we found the correlations between the observed and predicted NMR ranged from 0.69 to 0.97 depending on the model. When predictions were averaged in an ensemble model, the correlation was 0.81. The ensemble model generalizes well in the validation studies across ancestries, despite differences in the measurements of NMR between studies, with correlations of: 0.52 for African ancestry, 0.61 for Asian ancestry, and 0.46 for European ancestry.
The most influential predictors of NMR identified in more than two models were rs56113850, rs11878604, and 21 other genetic variants near CYP2A6 as well as age and ancestry. CONCLUSIONS: We have developed an ensemble of seven models for predicting the NMR across ancestries from genotypes and age, gender and BMI. These models were validated using three datasets and are associated with nicotine dosages. The knowledge of how an individual metabolizes nicotine could be used to help select the optimal path to reducing or quitting tobacco use, as well as to evaluate the risks of tobacco use.
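The abstract describes averaging the predictions of several models into an ensemble and scoring it by the correlation between observed and predicted NMR. A minimal sketch of that idea, assuming equal-weight averaging and Pearson correlation; the function names and the toy values are hypothetical and do not reproduce the authors' models or data.

```python
from statistics import mean


def ensemble_predict(model_predictions):
    """Equal-weight average of per-subject predictions across models."""
    return [mean(per_subject) for per_subject in zip(*model_predictions)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Toy data (hypothetical): three models, four subjects.
predictions = [
    [0.30, 0.45, 0.20, 0.60],
    [0.35, 0.40, 0.25, 0.55],
    [0.25, 0.50, 0.15, 0.65],
]
observed = [0.32, 0.48, 0.18, 0.58]

ensemble = ensemble_predict(predictions)
r = pearson(observed, ensemble)
print([round(p, 2) for p in ensemble], round(r, 3))
```

Averaging tends to damp the idiosyncratic errors of individual models, which is one common motivation for reporting an ensemble alongside its component models.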
Subject(s)
Cotinine , Nicotine , Cotinine/metabolism , Genome-Wide Association Study , Genotype , Humans , Nicotine/metabolism , Smoking/genetics , Smoking/metabolism
ABSTRACT
Background: Substance use disorder (SUD) is a heterogeneous disorder. Adapting machine learning algorithms to allow for the parsing of intrapersonal and interpersonal heterogeneity in meaningful ways may accelerate the discovery and implementation of clinically actionable interventions in SUD research. Objectives: Inspired by a study of heavy drinkers that collected daily drinking and substance use (ABQ DrinQ), we develop tools to estimate subject-specific risk trajectories of heavy drinking; estimate and perform inference on patient characteristics and time-varying covariates; and present results in easy-to-use Jupyter notebooks. Methods: We recast support vector machines (SVMs) into a Bayesian model extended to handle mixed effects. We then apply these methods to ABQ DrinQ to model alcohol use patterns. ABQ DrinQ consists of 190 heavy drinkers (44% female) with 109,580 daily observations. Results: We identified male gender (point estimate; 95% credible interval: -0.25;-0.29,-0.21), older age (-0.03;-0.03,-0.03), and time-varying use of nicotine (1.68;1.62,1.73), cannabis (0.05;0.03,0.07), and other drugs (1.16;1.01,1.35) as statistically significant factors of heavy drinking behavior. By adopting random effects to capture the subject-specific longitudinal trajectories, the algorithm outperforms traditional SVM (classifies 84% of heavy drinking days correctly versus 73%). Conclusions: We developed a mixed effects variant of SVM and compared it to the traditional formulation, with an eye toward elucidating the importance of incorporating random effects to account for underlying heterogeneity in SUD data. These tools and examples are packaged into a repository for researchers to explore. Understanding patterns and risk of substance use could be used for developing individualized interventions.
Subject(s)
Substance-Related Disorders , Support Vector Machine , Bayes Theorem , Female , Humans , Male , Substance-Related Disorders/epidemiology
ABSTRACT
INTRODUCTION: The nicotine metabolite ratio and nicotine equivalents are measures of metabolism rate and intake. Genome-wide prediction of these nicotine biomarkers in multiethnic samples will enable tobacco-related biomarker, behavioral, and exposure research in studies without measured biomarkers. AIMS AND METHODS: We screened genetic variants genome-wide using marginal scans and applied statistical learning algorithms on top-ranked genetic variants, age, ethnicity and sex, and, in additional modeling, cigarettes per day (CPD) to build prediction models for the urinary nicotine metabolite ratio (uNMR) and creatinine-standardized total nicotine equivalents (TNE) in 2239 current cigarette smokers in five ethnic groups. We predicted these nicotine biomarkers using model ensembles and evaluated external validity using dependence measures in 1864 treatment-seeking smokers in two ethnic groups. RESULTS: The genomic regions with the most selected and included variants for measured biomarkers were chr19q13.2 (uNMR, without and with CPD) and chr15q25.1 and chr10q25.3 (TNE, without and with CPD). We observed ensemble correlations between measured and predicted biomarker values for the uNMR and TNE without (with CPD) of 0.67 (0.68) and 0.65 (0.72) in the training sample. We observed inconsistency in penalized regression models of TNE (with CPD) with fewer variants at chr15q25.1 selected and included. In treatment-seeking smokers, predicted uNMR (without CPD) was significantly associated with CPD and predicted TNE (without CPD) with CPD, time-to-first-cigarette, and Fagerström total score. CONCLUSIONS: Nicotine metabolites, genome-wide data, and statistical learning approaches developed novel robust predictive models for urinary nicotine biomarkers in multiple ethnic groups. Predicted biomarker associations helped define genetically influenced components of nicotine dependence.
IMPLICATIONS: We demonstrate development of robust models and multiethnic prediction of the uNMR and TNE using statistical and machine learning approaches. Variants included in trained models for nicotine biomarkers include top-ranked variants in multiethnic genome-wide studies of smoking behavior, nicotine metabolites, and related disease. Association of the two predicted nicotine biomarkers with Fagerström Test for Nicotine Dependence items supports models of nicotine biomarkers as predictors of physical dependence and nicotine exposure. Predicted nicotine biomarkers may facilitate tobacco-related disease and treatment research in samples with genomic data and limited nicotine metabolite or tobacco exposure data.
Subject(s)
Tobacco Products , Tobacco Use Disorder , Biomarkers , Humans , Nicotine , Smoking/genetics , Tobacco Use Disorder/genetics
ABSTRACT
Modern epidemiologic studies permit investigation of the complex pathways that mediate effects of social, behavioral, and molecular factors on health outcomes. Conventional analytical approaches struggle with high-dimensional data, leading to high likelihoods of both false-positive and false-negative inferences. Herein, we describe a novel Bayesian pathway analysis approach, the algorithm for learning pathway structure (ALPS), which addresses key limitations in existing approaches to complex data analysis. ALPS uses prior information about pathways in concert with empirical data to identify and quantify complex interactions within networks of factors that mediate an association between an exposure and an outcome. We illustrate ALPS through application to a complex gene-drug interaction analysis in the Predictors of Breast Cancer Recurrence (ProBe CaRe) Study, a Danish cohort study of premenopausal breast cancer patients (2002-2011), for which conventional analyses severely limit the quality of inference.
Subject(s)
Algorithms , Bayes Theorem , Drug Resistance, Neoplasm/genetics , Pharmacogenomic Testing , Antineoplastic Agents, Hormonal/metabolism , Antineoplastic Agents, Hormonal/therapeutic use , Breast Neoplasms/drug therapy , Female , Humans , Tamoxifen/metabolism , Tamoxifen/therapeutic use
ABSTRACT
Introduction: Human genetic research has succeeded in definitively identifying multiple genetic variants associated with risk for nicotine dependence and heavy smoking. To build on these advances, and to aid in reducing the prevalence of smoking and its consequent health harms, the next frontier is to identify genetic predictors of successful smoking cessation and also of the efficacy of smoking cessation treatments ("pharmacogenomics"). More broadly, additional biomarkers that can be quantified from biosamples also promise to aid "Precision Medicine" and the personalization of treatment, both pharmacological and behavioral. Aims and Methods: To motivate ongoing and future efforts, here we review several compelling genetic and biomarker findings related to smoking cessation and treatment. Results: Key results involve genetic variants in the nicotinic receptor subunit gene CHRNA5, variants in the nicotine metabolism gene CYP2A6, and the nicotine metabolite ratio. We also summarize reports of epigenetic changes related to smoking behavior. Conclusions: The results to date demonstrate the value and utility of data generated from biosamples in clinical treatment trial settings. This article cross-references a companion paper in this issue that provides practical guidance on how to incorporate biosample collection into a planned clinical trial and discusses avenues for harmonizing data and fostering consortium-based, collaborative research on the pharmacogenomics of smoking cessation. Implications: Evidence is emerging that certain genotypes and biomarkers are associated with smoking cessation success and efficacy of smoking cessation treatments. We review key findings that open potential avenues for personalizing smoking cessation treatment according to an individual's genetic or metabolic profile. These results provide important incentive for smoking cessation researchers to collect biosamples and perform genotyping in research studies and clinical trials.
Subject(s)
Clinical Trials as Topic/methods , Epigenesis, Genetic/genetics , Metabolomics/methods , Smoking Cessation/methods , Smoking/genetics , Smoking/metabolism , Biomarkers/metabolism , Genotype , Humans , Pharmacogenetics/methods , Precision Medicine/methods , Smoking/therapy
ABSTRACT
Implications: This article outlines a framework for the consistent integration of biological data/samples into smoking cessation pharmacotherapy trials, aligned with the objectives of the recently unveiled Precision Medicine Initiative. Our goal is to encourage and provide support for treatment researchers to consider biosample collection and genotyping their existing samples as well as integrating genetic analyses into their study design in order to realize precision medicine in treatment of nicotine dependence.
Subject(s)
Genomics/methods , Precision Medicine/methods , Smoking Cessation/methods , Smoking/genetics , Smoking/therapy , Clinical Trials as Topic/methods , Humans , Precision Medicine/psychology , Smoking/psychology , Smoking Cessation/psychology , Tobacco Use Cessation Devices , Tobacco Use Disorder/genetics , Tobacco Use Disorder/psychology , Tobacco Use Disorder/therapy
ABSTRACT
BACKGROUND: Addictive disorders are a class of chronic, relapsing mental disorders that are responsible for increased risk of mental and medical disorders and represent the largest, potentially modifiable cause of death. Tobacco dependence is associated with increased risk of disease and premature death. While tobacco control efforts and therapeutic interventions have made good progress in reducing smoking prevalence, challenges remain in optimizing their effectiveness based on patient characteristics, including genetic variation. In order to maximize collaborative efforts to advance addiction research, we have developed a genotyping array called Smokescreen. This custom array builds upon previous work in the analyses of human genetic variation, the genetics of addiction, drug metabolism, and response to therapy, with an emphasis on smoking and nicotine addiction. RESULTS: The Smokescreen genotyping array includes 646,247 markers in 23 categories. The array design covers genome-wide common variation (65.67, 82.37, and 90.72% in African (YRI), East Asian (ASN), and European (EUR) respectively); most of the variation with a minor allele frequency ≥ 0.01 in 1014 addiction genes (85.16, 89.51, and 90.49% for YRI, ASN, and EUR respectively); and nearly all variation from the 1000 Genomes Project Phase 1, NHLBI GO Exome Sequencing Project and HapMap databases in the regions related to smoking behavior and nicotine metabolism: CHRNA5-CHRNA3-CHRNB4 and CYP2A6-CYP2B6. Of the 636 pilot DNA samples derived from blood or cell line biospecimens that were genotyped on the array, 622 (97.80%) passed quality control. In passing samples, 90.08% of markers passed quality control. The genotype reproducibility in 25 replicate pairs was 99.94%. For 137 samples that overlapped with HapMap2 release 24, the genotype concordance was 99.76%. 
In a genome-wide association analysis of the nicotine metabolite ratio in 315 individuals participating in nicotine metabolism laboratory studies, we identified genome-wide significant variants in the CYP2A6 region (min p = 9.10E-15). CONCLUSIONS: We developed a comprehensive genotyping array for addiction research and demonstrated its analytic validity and utility through pilot genotyping of HapMap and study samples. This array allows researchers to perform genome-wide, candidate gene, and pathway-based association analyses of addiction, tobacco-use, treatment response, comorbidities, and associated diseases in a standardized, high-throughput platform.
Subject(s)
Genotype , Oligonucleotide Array Sequence Analysis/methods , Tobacco Use Disorder/genetics , Asian People , Black People , Chromosome Mapping , Exons , Genetic Markers , Genome-Wide Association Study , Humans , Nicotine/metabolism , Polymorphism, Single Nucleotide , Smoking/genetics , White People
ABSTRACT
INTRODUCTION: Metabolic enzyme variation and other patient and environmental characteristics influence smoking behaviors, treatment success, and risk of related disease. Population-specific variation in metabolic genes contributes to challenges in developing and optimizing pharmacogenetic interventions. We applied a custom genome-wide genotyping array for addiction research (Smokescreen), to three laboratory-based studies of nicotine metabolism with oral or venous administration of labeled nicotine and cotinine, to model nicotine metabolism in multiple populations. The trans-3'-hydroxycotinine/cotinine ratio, the nicotine metabolite ratio (NMR), was the nicotine metabolism measure analyzed. METHODS: Three hundred twelve individuals of self-identified European, African, and Asian American ancestry were genotyped and included in ancestry-specific genome-wide association scans (GWAS) and a meta-GWAS analysis of the NMR. We modeled natural-log transformed NMR with covariates: principal components of genetic ancestry, age, sex, body mass index, and smoking status. RESULTS: African and Asian American NMRs were statistically significantly (P values ≤ 5E-5) lower than European American NMRs. Meta-GWAS analysis identified 36 genome-wide significant variants over a 43 kilobase pair region at CYP2A6 with minimum P = 2.46E-18 at rs12459249, proximal to CYP2A6. Additional minima were located in intron 4 (rs56113850, P = 6.61E-18) and in the CYP2A6-CYP2A7 intergenic region (rs34226463, P = 1.45E-12). Most (34/36) genome-wide significant variants suggested reduced CYP2A6 activity; functional mechanisms were identified and tested in knowledge-bases. Conditional analysis resulted in intergenic variants of possible interest (P values < 5E-5). 
CONCLUSIONS: This meta-GWAS of the NMR identifies CYP2A6 variants, replicates the top-ranked single nucleotide polymorphism from a recent Finnish meta-GWAS of the NMR, identifies functional mechanisms, and provides pan-continental population biomarkers for nicotine metabolism. IMPLICATIONS: This multiple ancestry meta-GWAS of the laboratory study-based NMR provides novel evidence and replication for genome-wide association of CYP2A6 single nucleotide and insertion-deletion polymorphisms. We identify three regions of genome-wide significance: proximal, intronic, and distal to CYP2A6. We replicate the top-ranking single nucleotide polymorphism from a recent GWAS of the NMR in Finnish smokers, identify a functional mechanism for this intronic variant from in silico analyses of RNA-seq data that is consistent with CYP2A6 expression measured in postmortem lung and liver, and provide additional support for the intergenic region between CYP2A6 and CYP2A7.
Subject(s)
Cytochrome P-450 CYP2A6/genetics , Nicotine/genetics , Nicotine/metabolism , Smoking/genetics , Tobacco Use Disorder/genetics , Adult , Asian People/genetics , Black People/genetics , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , White People/genetics , Young Adult
ABSTRACT
BACKGROUND: Several lifestyle and environmental exposures have been suspected as risk factors for oral clefts, although few have been convincingly demonstrated. Studies across global diverse populations could offer additional insight given varying types and levels of exposures. METHODS: We performed an international case-control study in the Democratic Republic of the Congo (133 cases, 301 controls), Vietnam (75 cases, 158 controls), the Philippines (102 cases, 152 controls), and Honduras (120 cases, 143 controls). Mothers were recruited from hospitals and their exposures were collected from interviewer-administered questionnaires. We used logistic regression modeling to estimate odds ratios (OR) and 95% confidence intervals (CI). RESULTS: Family history of clefts was strongly associated with increased risk (maternal: OR = 4.7; 95% CI, 3.0-7.2; paternal: OR = 10.5; 95% CI, 5.9-18.8; siblings: OR = 5.3; 95% CI, 1.4-19.9). Advanced maternal age (5 year OR = 1.2; 95% CI, 1.0-1.3), pregestational hypertension (OR = 2.6; 95% CI, 1.3-5.1), and gestational seizures (OR = 2.9; 95% CI, 1.1-7.4) were statistically significant risk factors. Lower maternal (secondary school OR = 1.6; 95% CI, 1.2-2.2; primary school OR = 2.4, 95% CI, 1.6-2.8) and paternal education (OR = 1.9; 95% CI, 1.4-2.5; and OR = 1.8; 95% CI, 1.1-2.9, respectively) and paternal tobacco smoking (OR = 1.5, 95% CI, 1.1-1.9) were associated with an increased risk. No other significant associations between maternal and paternal factors were found; some environmental factors including rural residency, indoor cooking with wood, chemicals and water source appeared to be associated with an increased risk in adjusted models. CONCLUSION: Our study represents one of the first international studies investigating risk factors for clefts among multiethnic underserved populations. Our findings suggest a multifactorial etiology including both maternal and paternal factors.
Subject(s)
Cleft Palate/epidemiology , Models, Biological , Adult , Africa, Central , Asia, Southeastern , Asian People , Case-Control Studies , Central America , Child, Preschool , Cleft Palate/etiology , Female , Humans , Indians, Central American , Indians, South American , Infant , Infant, Newborn , Male , Risk Factors , Socioeconomic Factors
ABSTRACT
BACKGROUND: The Total Exposure Study was a stratified, multi-center, cross-sectional study designed to estimate levels of biomarkers of tobacco-specific and non-specific exposure and of potential harm in U.S. adult current cigarette smokers (≥one manufactured cigarette per day over the last year) and tobacco product non-users (no smoking or use of any nicotine containing products over the last 5 years). The study was designed and sponsored by a tobacco company and implemented by contract research organizations in 2002-2003. Multiple analyses of smoking behavior, demographics, and biomarkers were performed. Study data and banked biospecimens were transferred from the sponsor to the Virginia Tobacco and Health Research Repository in 2010, and then to SRI International in 2012, for independent analysis and dissemination. METHODS: We analyzed biomarker distributions overall, and by biospecimen availability, for comparison with existing studies, and to evaluate generalizability to the entire sample. We calculated genome-wide statistical power for a priori hypotheses. We performed clinical chemistries, nucleic acid extractions and genotyping, and report correlation and quality control metrics. RESULTS: Vital signs, clinical chemistries, and laboratory measures of tobacco specific and non-specific toxicants are available from 3585 current cigarette smokers, and 1077 non-users. Peripheral blood mononuclear cells, red blood cells, plasma and 24-h urine biospecimens are available from 3073 participants (2355 smokers and 719 non-users). In multivariate analysis, participants with banked biospecimens were significantly more likely to self-identify as White, to be older, to have increased total nicotine equivalents per cigarette, decreased serum cotinine, and increased forced vital capacity, compared to participants without. Effect sizes were small (Cohen's d-values ≤ 0.11). 
Power for a priori hypotheses was 57% in non-Hispanic Black (N = 340), and 96% in non-Hispanic White (N = 1840), smokers. All DNA samples had genotype completion rates ≥97.5%; 68% of RNA samples yielded RIN scores ≥6.0. CONCLUSIONS: Total Exposure Study clinical and laboratory assessments and biospecimens comprise a unique resource for cigarette smoke health effects research. The Total Exposure Study Analysis Consortium seeks to perform molecular studies in multiple domains and will share data and analytic results in public repositories and the peer-reviewed literature. Data and banked biospecimens are available for independent or collaborative research.
Subject(s)
Cotinine/blood , Smoking/blood , Tobacco Use Disorder/blood , Adult , Biomarkers/blood , Black People/statistics & numerical data , Chemistry Techniques, Analytical/methods , Cross-Sectional Studies , Hispanic or Latino/statistics & numerical data , Humans , Male , Nicotine/analysis , Risk Factors , Smoke/adverse effects , United States/epidemiology , Virginia/epidemiology , White People/statistics & numerical data
ABSTRACT
Genome-wide association studies (GWAS) for orofacial clefts have identified several susceptibility regions, but have largely focused on non-Hispanic White populations in developed countries. We performed a targeted genome-wide study of single nucleotide polymorphisms (SNPs) in exons using the Illumina HumanExome+ array with custom fine mapping of 16 cleft susceptibility regions in three underserved populations: Congolese (87 case-mother, 210 control-mother pairs), Vietnamese (131 case-parent trios), and Filipinos (42 case-mother, 99 control-mother pairs). All cases were children with cleft lip with or without cleft palate. Families were recruited from local hospitals and parental exposures were collected using interviewer-administered questionnaires. We used logistic regression models for case-control analyses, family-based association tests for trios, and fixed-effect meta-analyses to determine individual SNP effects corrected for multiple testing. Of the 16 known susceptibility regions tested, SNPs in four regions reached statistical significance in one or more of these populations: 1q32.2 (IRF6), 10q25.3 (VAX1), and 17q22 (NOG). Due to different linkage disequilibrium patterns, significant SNPs in these regions differed between the Vietnamese and Filipino populations from the index SNP selected from previous GWAS studies. Among Africans, there were no significant associations identified for any of the susceptibility regions. rs10787738 near VAX1 (P = 4.98E-3) and rs7987165 (P = 6.1E-6) were significant in the meta-analysis of all three populations combined. These results confirm several known susceptibility regions and identify novel risk alleles in understudied populations.
Subject(s)
Asian People/genetics , Black People/genetics , Cleft Lip/genetics , Cleft Palate/genetics , Genetic Predisposition to Disease/genetics , Adult , Alleles , Case-Control Studies , Female , Genome-Wide Association Study/methods , Genotype , Humans , Linkage Disequilibrium/genetics , Logistic Models , Male , Polymorphism, Single Nucleotide/genetics , Risk Factors , Young Adult
ABSTRACT
Importance: Recently, the Food and Drug Administration gave pre-marketing approval to an algorithm based on its purported ability to identify genetic risk for opioid use disorder. However, the clinical utility of the candidate genes comprising the algorithm has not been independently demonstrated. Objective: To assess the utility of 15 variants in candidate genes from an algorithm intended to predict opioid use disorder risk. Design: This case-control study examined the association of 15 candidate genetic variants with risk of opioid use disorder using available electronic health record data from December 20, 1992 to September 30, 2022. Setting: Electronic health record data, including pharmacy records, from Million Veteran Program participants across the United States. Participants: Participants were opioid-exposed individuals enrolled in the Million Veteran Program (n = 452,664). Opioid use disorder cases were identified using International Classification of Disease diagnostic codes, and controls were individuals with no opioid use disorder diagnosis. Exposures: Number of risk alleles present across 15 candidate genetic variants. Main Outcome and Measures: Predictive performance of 15 genetic variants for opioid use disorder risk assessed via logistic regression and machine learning models. Results: Opioid-exposed individuals (n = 33,669 cases) were on average 61.15 (SD = 13.37) years old, 90.46% male, and had varied genetic similarity to global reference panels. Collectively, the 15 candidate genetic variants accounted for 0.4% of variation in opioid use disorder risk. The accuracy of the ensemble machine learning model using the 15 genes as predictors was 52.8% (95% CI = 52.1 - 53.6%) in an independent testing sample. Conclusions and Relevance: Candidate genes that comprise the approved algorithm do not meet reasonable standards of efficacy in predicting opioid use disorder risk.
Given the algorithm's limited predictive accuracy, its use in clinical care would lead to high rates of false positive and negative findings. More clinically useful models are needed to identify individuals at risk of developing opioid use disorder.
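To make the reported accuracy easier to interpret, the sketch below shows one common way to compute a classification accuracy and an approximate 95% confidence interval from test-set counts. It uses a normal-approximation (Wald) interval, which is an assumption: the abstract does not state how the interval was derived. The function name and the counts are hypothetical and are not taken from the study.

```python
import math


def accuracy_with_wald_ci(n_correct, n_total, z=1.96):
    """Return (accuracy, lower, upper) using a normal-approximation interval.

    n_correct and n_total are test-set counts; z = 1.96 gives an
    approximate 95% interval.
    """
    if n_total <= 0 or not 0 <= n_correct <= n_total:
        raise ValueError("need 0 <= n_correct <= n_total and n_total > 0")
    acc = n_correct / n_total
    half_width = z * math.sqrt(acc * (1.0 - acc) / n_total)
    # Clip to [0, 1] because the Wald interval can overshoot near the bounds.
    return acc, max(0.0, acc - half_width), min(1.0, acc + half_width)


# Hypothetical counts, chosen only for illustration.
acc, lower, upper = accuracy_with_wald_ci(n_correct=8448, n_total=16000)
print(f"accuracy = {acc:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

In a test set with equal numbers of cases and controls, random guessing yields an accuracy of about 50%, so an interval that sits only a few percentage points above 0.5 indicates weak discrimination.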
ABSTRACT
Regular, long-term aspirin use may act synergistically with genetic variants, particularly those in mechanistically relevant pathways, to confer a protective effect on colorectal cancer (CRC) risk. We leveraged pooled data from 52 clinical trial, cohort, and case-control studies that included 30,806 CRC cases and 41,861 controls of European ancestry to conduct a genome-wide interaction scan between regular aspirin/nonsteroidal anti-inflammatory drug (NSAID) use and imputed genetic variants. After adjusting for multiple comparisons, we identified statistically significant interactions between regular aspirin/NSAID use and variants in 6q24.1 (top hit rs72833769), which shows evidence of influencing expression of TBC1D7 (a subunit of the TSC1-TSC2 complex, a key regulator of MTOR activity), and variants in 5p13.1 (top hit rs350047), which is associated with expression of PTGER4 (which encodes a cell surface receptor directly involved in the mode of action of aspirin). Genetic variants with functional impact may modulate the chemopreventive effect of regular aspirin use, and our study identifies putative, previously unidentified targets for additional mechanistic interrogation.
Subject(s)
Anti-Inflammatory Agents, Non-Steroidal , Colorectal Neoplasms , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Humans , Colorectal Neoplasms/genetics , Colorectal Neoplasms/drug therapy , Anti-Inflammatory Agents, Non-Steroidal/pharmacology , Aspirin/pharmacology , Receptors, Prostaglandin E, EP4 Subtype/genetics , Receptors, Prostaglandin E, EP4 Subtype/metabolism , Male , Genetic Predisposition to Disease , Female , Case-Control Studies , Middle Aged , Genetic Loci , Aged
ABSTRACT
BACKGROUND: Testing for marginal associations between numerous genetic variants and disease may miss complex relationships among variables (e.g., gene-gene interactions). Bayesian approaches can model multiple variables together and offer advantages over conventional model-building strategies, including using existing biological evidence as modeling priors and acknowledging that many models may fit the data well. With many candidate variables, Bayesian approaches to variable selection rely on algorithms, such as Markov chain Monte Carlo (MCMC), to approximate the posterior distribution of models. Unfortunately, MCMC is difficult to parallelize and requires many iterations to adequately sample the posterior. We introduce a scalable algorithm called PEAK that improves the efficiency of MCMC by dividing a large set of variables into related groups using a rooted graph that resembles a mountain peak. Our algorithm takes advantage of parallel computing and existing biological databases when available. RESULTS: By using graphs to manage a model space with more than 500,000 candidate variables, we were able to improve MCMC efficiency and uncover the true simulated causal variables, including a gene-gene interaction. We applied PEAK to a case-control study of childhood asthma with 2,521 genetic variants. We used an informative graph for oxidative stress derived from Gene Ontology and identified several variants in ERBB4, OXR1, and BCL2 with strong evidence for associations with childhood asthma. CONCLUSIONS: We introduced an extremely flexible analysis framework capable of efficiently performing Bayesian variable selection on many candidate variables. The PEAK algorithm can be provided with an informative graph, which can be advantageous when considering gene-gene interactions, or a symmetric graph, which simply divides the model space into manageable regions.
The PEAK framework is compatible with various model forms, allowing for the algorithm to be configured for different study designs and applications, such as pathway or rare-variant analyses, by simple modifications to the model likelihood and proposal functions.
Subject(s)
Computational Biology/methods , Genetic Association Studies/methods , Knowledge Bases , Models, Genetic , Algorithms , Asthma/genetics , Bayes Theorem , Case-Control Studies , Child , Humans , Markov Chains , Monte Carlo Method , Polymorphism, Single Nucleotide
ABSTRACT
BACKGROUND: Alcohol use disorder (AUD) has been described as a chronic disease given the high rates at which affected individuals return to drinking after a change attempt. Many studies have characterized predictors of aggregated alcohol use (e.g., percent heavy drinking days) following treatment for AUD. However, to inform future research on predicting drinking as an AUD outcome measure, a better understanding is needed of the patterns of drinking that surround a treatment episode and which clinical measures predict patterns of drinking. METHODS: We analyzed data from the Project MATCH and COMBINE studies (MATCH: n = 1726; 24.3% female, 20.0% non-White; COMBINE: n = 1383; 30.9% female, 23.2% non-White). Daily drinking was measured in the 90 days prior to treatment, 90 days (MATCH) and 120 days (COMBINE) during treatment, and 365 days following treatment. Gradient boosting machine learning methods were used to explore baseline predictors of drinking patterns. RESULTS: Drinking patterns during a prior time period were the most consistent predictors of future drinking patterns. Social network drinking, AUD severity, mental health symptoms, and constructs based on the addiction cycle (incentive salience, negative emotionality, and executive function) were associated with patterns of drinking prior to treatment. Addiction cycle constructs, AUD severity, purpose in life, social network, legal history, craving, and motivation were associated with drinking during the treatment period and following treatment. CONCLUSIONS: There is heterogeneity in drinking patterns around an AUD treatment episode. This study provides novel information about variables that may be important to measure to improve the prediction of drinking patterns during and following treatment. Future research should consider which patterns of drinking it aims to predict and which period of drinking is most important to predict.
The current findings could guide the selection of predictor variables and generate hypotheses for those predictors.
ABSTRACT
In this work, we develop a novel Bayesian regression framework that can be used to perform variable selection in high-dimensional settings. Unlike existing techniques, the proposed approach can leverage side information to inform the sparsity structure of the regression coefficients. This is accomplished by replacing the usual inclusion probability in the spike-and-slab prior with a binary regression model that assimilates this extra source of information. To facilitate model fitting, a computationally efficient and easy-to-implement Markov chain Monte Carlo posterior sampling algorithm is developed via carefully chosen priors and data augmentation steps. The finite-sample performance of our methodology is assessed through numerical simulations, and we further illustrate our approach by using it to identify genetic markers associated with the nicotine metabolite ratio, a key biological marker associated with nicotine dependence and smoking cessation treatment.
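The prior structure described above can be illustrated with a short simulation. The sketch below draws regression coefficients from a spike-and-slab prior in which each variable's inclusion probability comes from a binary regression on side information. It is a minimal illustration under stated assumptions, not the authors' method: the logistic link, the Gaussian slab, the point-mass spike at zero, and all names (`sample_coefficients`, `side_info`, `alpha`, `slab_sd`) are hypothetical choices, and the posterior sampler is not shown.

```python
import numpy as np


def sample_coefficients(side_info, alpha, slab_sd=1.0, rng=None):
    """Draw one set of coefficients from a spike-and-slab prior.

    side_info : (p, q) array, one row of side information per variable
    alpha     : (q,) array of binary-regression coefficients (assumed known here)
    Returns (inclusion_prob, gamma, beta), each of shape (p,).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: a logistic link maps side information to inclusion probability.
    inclusion_prob = 1.0 / (1.0 + np.exp(-(side_info @ alpha)))
    # gamma[j] is True when variable j is included in the model.
    gamma = rng.random(inclusion_prob.shape) < inclusion_prob
    # Included coefficients come from the Gaussian slab; excluded ones are exactly 0.
    slab_draws = rng.normal(0.0, slab_sd, size=inclusion_prob.shape)
    beta = np.where(gamma, slab_draws, 0.0)
    return inclusion_prob, gamma, beta


# Hypothetical side information: an intercept plus a binary annotation flag.
side_info = np.column_stack([np.ones(6), [1, 1, 1, 0, 0, 0]])
alpha = np.array([-3.0, 4.0])
prob, gamma, beta = sample_coefficients(side_info, alpha, rng=np.random.default_rng(0))
print(np.round(prob, 3))
```

With these illustrative values, annotated variables receive a much higher prior inclusion probability than unannotated ones, which is how side information can shape the sparsity pattern.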
Subject(s)
Algorithms , Bayes Theorem , Genetic Markers , Markov Chains
ABSTRACT
Once an infrequent disease in parts of Asia, colorectal cancer appears to have been steadily increasing in incidence in recent decades. Colorectal cancer represents one of the most important causes of cancer mortality worldwide, including in many regions of Asia. The notable increase in the incidence of colorectal cancer in many Asian countries has been attributed to rapid changes in socioeconomic conditions and lifestyle habits. Using published data from the International Agency for Research on Cancer (IARC), we examined available continuous data to determine which Asian nations had a rise in colorectal cancer rates. We found that East and Southeast Asian countries had a significant rise in colorectal cancer rates. We then summarize the known genetic and environmental risk factors for colorectal cancer among populations in this region, as well as approaches to screening and early detection that have been considered across various countries in the region.