ABSTRACT
Mendelian randomization uses genetic variants as instrumental variables to make causal inferences on the effect of an exposure on an outcome. Due to the recent abundance of high-powered genome-wide association studies, many putative causal exposures of interest have large numbers of independent genetic variants with which they associate, each representing a potential instrument for use in a Mendelian randomization analysis. Such polygenic analyses increase the power of the study design to detect causal effects; however, they also increase the potential for bias due to instrument invalidity. Recent attention has been given to dealing with bias caused by correlated pleiotropy, which results from violation of the "instrument strength independent of direct effect" assumption. Although methods have been proposed that can account for this bias, a number of restrictive conditions remain in many commonly used techniques. In this paper, we propose a Bayesian framework for Mendelian randomization that provides valid causal inference under very general settings. We propose the methods MR-Horse and MVMR-Horse, which can be performed without access to individual-level data, using only summary statistics of the type commonly published by genome-wide association studies, and can account for both correlated and uncorrelated pleiotropy. In simulation studies, we show that the approach retains type I error rates below nominal levels even in high-pleiotropy scenarios. We demonstrate the proposed approaches in applied examples in both univariable and multivariable settings, some with very weak instruments.
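The model can be made concrete with a short sketch. The following is a minimal illustration (not the authors' released code) of a summary-data Bayesian MR model with horseshoe-type shrinkage on the pleiotropic effects, in the spirit of MR-Horse; it assumes PyMC is available, uses illustrative names (`bx`, `sx`, `by`, `sy` for SNP-exposure and SNP-outcome estimates and standard errors), and omits the correlation structure the full method uses to capture correlated pleiotropy.

```python
# Minimal sketch of a summary-data Bayesian MR model with horseshoe-type
# shrinkage on pleiotropic effects (illustrative; not the MR-Horse release).
import pymc as pm

def mr_horseshoe_sketch(bx, sx, by, sy, draws=2000, tune=2000):
    """bx, sx: SNP-exposure estimates/SEs; by, sy: SNP-outcome estimates/SEs."""
    J = len(bx)
    with pm.Model():
        theta = pm.Normal("theta", mu=0.0, sigma=3.0)         # causal effect
        tau = pm.HalfCauchy("tau", beta=1.0)                  # global shrinkage
        lam = pm.HalfCauchy("lam", beta=1.0, shape=J)         # local shrinkage
        alpha = pm.Normal("alpha", mu=0.0, sigma=tau * lam, shape=J)  # pleiotropy
        gamma = pm.Normal("gamma", mu=0.0, sigma=1.0, shape=J)  # latent SNP-exposure effects
        pm.Normal("bx_obs", mu=gamma, sigma=sx, observed=bx)  # measurement model
        pm.Normal("by_obs", mu=theta * gamma + alpha, sigma=sy, observed=by)
        return pm.sample(draws=draws, tune=tune)
```

Treating the SNP-exposure effects as latent quantities, rather than plugging in their estimates, is what lets the posterior propagate uncertainty from both sets of summary statistics, which matters in the weak-instrument settings mentioned above.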
Subject(s)
Genome-Wide Association Study; Mendelian Randomization Analysis; Animals; Horses; Bayes Theorem; Computer Simulation; Multifactorial Inheritance

ABSTRACT
Mendelian randomization (MR) utilizes genome-wide association study (GWAS) summary data to infer causal relationships between exposures and outcomes, offering a valuable tool for identifying disease risk factors. Multivariable MR (MVMR) estimates the direct effects of multiple exposures on an outcome. This study tackles the issue of highly correlated exposures commonly observed in metabolomic data, a situation where existing MVMR methods often face reduced statistical power due to multicollinearity. We propose a robust extension of the MVMR framework that leverages constrained maximum likelihood (cML) and employs a Bayesian approach for identifying independent clusters of exposure signals. Applying our method to the UK Biobank metabolomic data for the largest Alzheimer disease (AD) cohort through a two-sample MR approach, we identified two independent signal clusters for AD: glutamine and lipids, with posterior inclusion probabilities (PIPs) of 95.0% and 81.5%, respectively. Our findings corroborate the hypothesized roles of glutamate and lipids in AD, providing quantitative support for their potential involvement.
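The multicollinearity problem described here can be inspected directly in the matrix of SNP-exposure effect estimates that MVMR regresses on. Below is a small diagnostic sketch (not the paper's clustering method); the function name and data are hypothetical.

```python
# Illustrative multicollinearity check on the SNP-exposure effect matrix used
# in MVMR: large variance inflation factors (VIFs) or a large condition
# number signal highly correlated exposures.
import numpy as np

def exposure_collinearity(B):
    """B: (n_snps, n_exposures) matrix of SNP-exposure effect estimates."""
    Bc = (B - B.mean(0)) / B.std(0, ddof=1)   # standardize columns
    corr = np.corrcoef(Bc, rowvar=False)
    vif = np.diag(np.linalg.inv(corr))        # VIF_j = jth diagonal of inverse correlation
    cond = np.linalg.cond(Bc)
    return vif, cond

rng = np.random.default_rng(0)
x = rng.normal(size=(500, 1))
B = np.hstack([x, x + 0.05 * rng.normal(size=(500, 1))])  # two nearly collinear "exposures"
print(exposure_collinearity(B))               # huge VIFs and condition number
```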
Subject(s)
Alzheimer Disease; Bayes Theorem; Genome-Wide Association Study; Mendelian Randomization Analysis; Metabolomics; Humans; Alzheimer Disease/genetics; Metabolomics/methods; Polymorphism, Single Nucleotide; Glutamine/metabolism; Glutamine/genetics; Lipids/blood; Lipids/genetics

ABSTRACT
The existing framework of Mendelian randomization (MR) infers the causal effect of one or multiple exposures on one single outcome. It is not designed to jointly model multiple outcomes, as would be necessary to detect causes of more than one outcome and would be relevant to model multimorbidity or other related disease outcomes. Here, we introduce multi-response Mendelian randomization (MR2), an MR method specifically designed for multiple outcomes to identify exposures that cause more than one outcome or, conversely, exposures that exert their effect on distinct responses. MR2 uses a sparse Bayesian Gaussian copula regression framework to detect causal effects while estimating the residual correlation between summary-level outcomes, i.e., the correlation that cannot be explained by the exposures, and vice versa. We show both theoretically and in a comprehensive simulation study how unmeasured shared pleiotropy induces residual correlation between outcomes irrespective of sample overlap. We also reveal how non-genetic factors that affect more than one outcome contribute to their correlation. We demonstrate that by accounting for residual correlation, MR2 has higher power to detect shared exposures causing more than one outcome. It also provides more accurate causal effect estimates than existing methods that ignore the dependence between related responses. Finally, we illustrate how MR2 detects shared and distinct causal exposures for five cardiovascular diseases in two applications considering cardiometabolic and lipidomic exposures and uncovers residual correlation between summary-level outcomes reflecting known relationships between cardiovascular diseases.
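The mechanism claimed here, that unmeasured shared pleiotropy induces residual correlation between outcomes, is easy to reproduce in a toy simulation. The sketch below uses made-up effect sizes and is only an illustration of the phenomenon, not the MR2 model itself.

```python
# Toy simulation: SNPs with a shared pleiotropic pathway induce residual
# correlation between two outcomes even after removing the exposure-mediated
# component (illustrative effect sizes).
import numpy as np

rng = np.random.default_rng(1)
J = 2000
gamma = rng.normal(0, 1, J)            # SNP-exposure effects
pleio = rng.normal(0, 1, J)            # shared (unmeasured) pleiotropic pathway
theta1, theta2 = 0.3, 0.0              # causal effects of the exposure on outcomes 1 and 2
by1 = theta1 * gamma + 0.5 * pleio + rng.normal(0, 0.1, J)
by2 = theta2 * gamma + 0.5 * pleio + rng.normal(0, 0.1, J)

# Residuals after regressing each outcome's SNP effects on the exposure's
r1 = by1 - np.polyfit(gamma, by1, 1)[0] * gamma
r2 = by2 - np.polyfit(gamma, by2, 1)[0] * gamma
print(np.corrcoef(r1, r2)[0, 1])       # strongly positive residual correlation
```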
Subject(s)
Cardiovascular Diseases; Humans; Cardiovascular Diseases/epidemiology; Cardiovascular Diseases/genetics; Bayes Theorem; Multimorbidity; Mendelian Randomization Analysis/methods; Causality; Genome-Wide Association Study

ABSTRACT
Mendelian randomization (MR) is a powerful tool for causal inference with observational genome-wide association study (GWAS) summary data. Compared to the more commonly used univariable MR (UVMR), multivariable MR (MVMR) not only is more robust to the notorious problem of genetic (horizontal) pleiotropy but also estimates the direct effect of each exposure on the outcome after accounting for possible mediating effects of other exposures. Despite promising applications, there is a lack of studies on MVMR's theoretical properties and robustness in applications. In this work, we propose an efficient and robust MVMR method based on constrained maximum likelihood (cML), called MVMR-cML, with strong theoretical support. Extensive simulations demonstrate that MVMR-cML performs better than other existing MVMR methods while possessing the above two advantages over its univariable counterpart. An application to several large-scale GWAS summary datasets to infer causal relationships between eight cardiometabolic risk factors and coronary artery disease (CAD) highlights the usefulness and some advantages of the proposed method. For example, after accounting for possible pleiotropic and mediating effects, triglyceride (TG), low-density lipoprotein cholesterol (LDL), and systolic blood pressure (SBP) had direct effects on CAD; in contrast, the effects of high-density lipoprotein cholesterol (HDL), diastolic blood pressure (DBP), and body height diminished after accounting for other risk factors.
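For orientation, the standard baseline against which MVMR-cML is compared is the multivariable inverse-variance weighted (IVW) estimator: a weighted regression of SNP-outcome effects on the matrix of SNP-exposure effects. A minimal sketch follows (the baseline only, not the constrained-likelihood algorithm itself; variable names are assumptions).

```python
# Minimal multivariable IVW estimator: weighted least squares of SNP-outcome
# effects on SNP-exposure effects, weighting by inverse outcome variance.
import numpy as np

def mvmr_ivw(B_exp, beta_out, se_out):
    """B_exp: (J, K) SNP-exposure effects; beta_out, se_out: length-J arrays.
    Returns direct-effect estimates for the K exposures and their SEs."""
    sw = np.sqrt(1.0 / se_out**2)
    Xw, yw = B_exp * sw[:, None], beta_out * sw
    XtX = Xw.T @ Xw
    theta = np.linalg.solve(XtX, Xw.T @ yw)     # weighted least squares, no intercept
    se = np.sqrt(np.diag(np.linalg.inv(XtX)))   # fixed-effect standard errors
    return theta, se
```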
Subject(s)
Coronary Artery Disease; Mendelian Randomization Analysis; Humans; Mendelian Randomization Analysis/methods; Genome-Wide Association Study; Risk Factors; Causality; Coronary Artery Disease/genetics; Cholesterol, HDL/genetics

ABSTRACT
Mendelian randomization (MR) is a statistical method that utilizes genetic variants as instrumental variables (IVs) to investigate causal relationships between risk factors and outcomes. Although MR has gained popularity in recent years due to its ability to analyze summary statistics from genome-wide association studies (GWAS), it requires a substantial number of single nucleotide polymorphisms (SNPs) as IVs to ensure sufficient power for detecting causal effects. Unfortunately, the complex genetic heritability of many traits can lead to the use of invalid IVs that affect both the risk factor and the outcome directly or through an unobserved confounder. This can result in biased and imprecise estimates, as reflected by a larger mean squared error (MSE). In this study, we focus on the widely used two-stage least squares (2SLS) method and derive formulas for its bias and MSE when estimating causal effects using invalid IVs. Using those formulas, we identify conditions under which the 2SLS estimate is unbiased and reveal how the independent or correlated pleiotropic effects influence the accuracy and precision of the 2SLS estimate. We validate these formulas through extensive simulation studies and demonstrate the application of those formulas in an MR study to evaluate the causal effect of the waist-to-hip ratio on various sleeping patterns. Our results can aid in designing future MR studies and serve as benchmarks for assessing more sophisticated MR methods.
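As background, the 2SLS estimator whose bias and MSE the paper characterizes is simple to state: regress the exposure on the instruments, then regress the outcome on the fitted exposure. The sketch below is illustrative (simulated data, not the paper's derivations) and shows how a single invalid IV with a direct effect on the outcome pulls the estimate away from the true causal effect.

```python
# Two-stage least squares with individual-level data, plus a toy example of
# the bias induced by an invalid IV (illustrative simulation).
import numpy as np

def tsls(G, X, Y):
    """G: (n, J) instruments; X: (n,) exposure; Y: (n,) outcome."""
    G1 = np.column_stack([np.ones(len(X)), G])
    xhat = G1 @ np.linalg.lstsq(G1, X, rcond=None)[0]   # stage 1: fitted exposure
    Z = np.column_stack([np.ones(len(X)), xhat])
    return np.linalg.lstsq(Z, Y, rcond=None)[0][1]      # stage 2: causal slope

rng = np.random.default_rng(2)
n = 50_000
G = rng.binomial(2, 0.3, size=(n, 10)).astype(float)
U = rng.normal(size=n)                                  # unmeasured confounder
X = G @ rng.uniform(0.05, 0.15, 10) + U + rng.normal(size=n)
Y = 0.5 * X + 0.2 * G[:, 0] + U + rng.normal(size=n)    # G[:, 0] is an invalid IV
print(tsls(G, X, Y))                                    # drifts away from the true 0.5
```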
Subject(s)
Genome-Wide Association Study; Mendelian Randomization Analysis; Humans; Mendelian Randomization Analysis/methods; Models, Genetic; Risk Factors; Causality; Bias

ABSTRACT
Genome-wide association studies (GWAS) have provided large numbers of genetic markers that can be used as instrumental variables in a Mendelian Randomisation (MR) analysis to assess the causal effect of a risk factor on an outcome. An extension of MR analysis, multivariable MR, has been proposed to handle multiple risk factors. However, adjusting or stratifying the outcome on a variable that is associated with it may induce collider bias. For an outcome that represents progression of a disease, conditioning by selecting only the cases may bias the MR estimate of the causal effect of the risk factor of interest on the progression outcome. Recently, we developed instrument effect regression and corrected weighted least squares (CWLS) to adjust for collider bias in observational associations. In this paper, we highlight the importance of adjusting for collider bias in MR with a risk factor of interest and disease progression as the outcome. A generalised version of the instrument effect regression and CWLS adjustment is proposed based on a multivariable MR model. We highlight the assumptions required for this approach and demonstrate its utility for bias reduction. We give an illustrative application to the effect of smoking initiation and smoking cessation on Crohn's disease prognosis, finding no evidence to support a causal effect.
ABSTRACT
Gene-environment (GxE) interactions play a crucial role in understanding the complex etiology of various traits, but assessing them using observational data can be challenging due to unmeasured confounders for lifestyle and environmental risk factors. Mendelian randomization (MR) has emerged as a valuable method for assessing causal relationships based on observational data. This approach utilizes genetic variants as instrumental variables (IVs) with the aim of providing a valid statistical test and estimation of causal effects in the presence of unmeasured confounders. MR has gained substantial popularity in recent years, largely due to the success of genome-wide association studies. Many methods have been developed for MR; however, limited work has been done on evaluating GxE interactions. In this paper, we focus on two primary IV approaches, two-stage predictor substitution (2SPS) and two-stage residual inclusion (2SRI), and extend them to accommodate GxE interactions under the linear and logistic regression models for continuous and binary outcomes, respectively. Comprehensive simulation studies and analytical derivations reveal that handling the linear regression model is relatively straightforward; in contrast, the logistic regression model presents a considerably more intricate challenge that demands additional effort.
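For the continuous-outcome linear case, both approaches fit in a few lines; a hedged sketch follows (illustrative variable names, statsmodels assumed). The logistic-outcome case flagged above as more intricate is not attempted here.

```python
# Sketch of 2SPS and 2SRI for a GxE interaction with a continuous outcome:
# 2SPS substitutes the fitted exposure, 2SRI keeps the observed exposure and
# adds the first-stage residual (illustrative, linear model only).
import numpy as np
import statsmodels.api as sm

def gxe_two_stage(G, E, X, Y, method="2SPS"):
    stage1 = sm.OLS(X, sm.add_constant(G)).fit()        # exposure on instruments
    xhat, resid = stage1.fittedvalues, stage1.resid
    if method == "2SPS":
        design = np.column_stack([xhat, E, xhat * E])   # substitute fitted exposure
    else:  # "2SRI"
        design = np.column_stack([X, E, X * E, resid])  # include stage-1 residual
    return sm.OLS(Y, sm.add_constant(design)).fit().params
```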
Subject(s)
Gene-Environment Interaction; Genome-Wide Association Study; Mendelian Randomization Analysis; Humans; Logistic Models; Linear Models; Polymorphism, Single Nucleotide; Models, Genetic; Genetic Variation; Computer Simulation

ABSTRACT
Instrumental variable (IV) analysis has been widely applied in epidemiology to infer causal relationships using observational data. Genetic variants can also be viewed as valid IVs in Mendelian randomization and transcriptome-wide association studies. However, most multivariate IV approaches cannot scale to high-throughput experimental data. Here, we leverage the flexibility of our previous work, a hierarchical model that jointly analyzes marginal summary statistics (hJAM), to build a scalable framework (SHA-JAM) that can be applied to a large number of intermediates and a large number of correlated genetic variants, situations often encountered in modern experiments leveraging omic technologies. SHA-JAM aims to estimate the conditional effects of high-dimensional risk factors on an outcome by incorporating estimates from single-nucleotide polymorphism (SNP)-intermediate or SNP-gene expression association analyses as prior information in a hierarchical model. Results from extensive simulation studies demonstrate that SHA-JAM yields a higher area under the receiver operating characteristic curve (AUC), a lower mean-squared error of the estimates, and a much faster computation speed, compared to an existing approach for similar analyses. In two applied examples for prostate cancer, we investigated metabolite and transcriptome associations, respectively, using summary statistics from a prostate cancer GWAS of more than 140,000 men and high-dimensional, publicly available summary data for metabolites and transcriptomes.
Subject(s)
Polymorphism, Single Nucleotide; Prostatic Neoplasms; Humans; Prostatic Neoplasms/genetics; Male; Genome-Wide Association Study/methods; Models, Statistical; Mendelian Randomization Analysis; ROC Curve; Computer Simulation

ABSTRACT
Transcriptome-wide association studies (TWAS) have been increasingly applied to identify (putative) causal genes for complex traits and diseases. TWAS can be regarded as a two-sample two-stage least squares method for instrumental variable (IV) regression for causal inference. The standard TWAS (called TWAS-L) considers only a linear relationship between a gene's expression and a trait in stage 2, which may lose statistical power when this does not hold. Recently, an extension of TWAS (called TWAS-LQ) considers both the linear and quadratic effects of a gene on a trait, which, however, is not flexible enough due to its parametric nature and may have low power for nonquadratic nonlinear effects. On the other hand, a deep learning (DL) approach, called DeepIV, has been proposed to nonparametrically model a nonlinear effect in IV regression. However, it is both slow and unstable due to the ill-posed inverse problem of solving an integral equation with Monte Carlo approximations. Furthermore, in the original DeepIV approach, statistical inference, that is, hypothesis testing, was not studied. Here, we propose a novel DL approach, called DeLIVR, to overcome the major drawbacks of DeepIV by estimating a related but different target function and including a hypothesis testing framework. We show through simulations that DeLIVR was both faster and more stable than DeepIV. We applied both the parametric and DL approaches to the GTEx and UK Biobank data, showcasing that DeLIVR detected an additional 8 and 7 genes nonlinearly associated with high-density lipoprotein (HDL) cholesterol and low-density lipoprotein (LDL) cholesterol, respectively, all of which would be missed by TWAS-L, TWAS-LQ, and DeepIV; these genes include BUD13, associated with HDL, and SLC44A2 and GMIP, associated with LDL, all supported by previous studies.
Subject(s)
Deep Learning; Transcriptome; Humans; Quantitative Trait Loci; Phenotype; Genome-Wide Association Study/methods; Cholesterol; Genetic Predisposition to Disease; Polymorphism, Single Nucleotide

ABSTRACT
Mendelian randomization is a statistical method for inferring the causal relationship between exposures and outcomes using an economics-derived instrumental variable approach. The research results are relatively complete when both exposures and outcomes are continuous variables. However, due to the non-collapsibility of the logistic model, the existing methods inherited from the linear model for binary outcomes cannot take the effects of confounding factors into account, which leads to biased estimates of the causal effect. In this article, we propose an integrated likelihood method, MR-BOIL, to investigate causal relationships for binary outcomes by treating confounders as latent variables in one-sample Mendelian randomization. Under the assumption of a joint normal distribution of the confounders, we use an expectation-maximization algorithm to estimate the causal effect. Extensive simulations demonstrate that the estimator of MR-BOIL is asymptotically unbiased and that our method improves statistical power without inflating the type I error rate. We then apply this method to analyze data from the Atherosclerosis Risk in Communities Study. The results show that MR-BOIL can better identify plausible causal relationships with high reliability, compared with the unreliable results of existing methods. MR-BOIL is implemented in R, and the corresponding R code is provided for free download.
Subject(s)
Mendelian Randomization Analysis; Models, Genetic; Humans; Likelihood Functions; Mendelian Randomization Analysis/methods; Reproducibility of Results; Causality

ABSTRACT
Recently, a bespoke instrumental variable method was proposed, which, under certain assumptions, can eliminate bias due to unmeasured confounding when estimating the causal exposure effect among the exposed. This method uses data from both the study population of interest and a reference population in which the exposure is completely absent. In this paper, we extend the bespoke instrumental variable method to allow for a non-ideal reference population that may include exposed subjects. Such an extension is particularly important in randomized trials with nonadherence, where even subjects in the control arm may have access to the treatment under investigation. We further scrutinize the assumptions underlying the bespoke instrumental variable method, and caution the reader about the potential non-robustness of the method to violations of these assumptions.
ABSTRACT
Traffic-related air pollution is a major concern for perinatal health. Determining causal associations, however, is difficult, since high-traffic areas tend to coincide with lower-socioeconomic-status neighborhoods and other environmental exposures. To overcome confounding, we compared pregnant individuals living downwind and upwind of the same high-traffic road. We leveraged vital statistics data for Texas from 2007-2016 (n=3,570,272 births) and computed hourly wind estimates for residential addresses within 500 m of high-traffic roads (i.e., annual average daily traffic greater than 25,000), covering 10.9% of births. We matched pregnant individuals predominantly upwind to pregnant neighbors downwind of the same road segment (n=37,631 pairs). Living downwind was associated with an 11.6 g decrease in term birth weight (95% CI: -18.01, -5.21). No associations were observed with low term birth weight, preterm birth, or very preterm birth. In distance-stratified models, living within 50 m downwind was associated with a 36.3 g decrease in term birth weight (95% CI: -67.74, -4.93), and living 51-100 m downwind was associated with an odds ratio of 3.68 (95% CI: 1.71, 7.90) for very preterm birth. These results suggest that traffic-related air pollution is associated with adverse birth outcomes, with steep distance-decay gradients around major roads.
ABSTRACT
One obstacle to adopting instrumental variable (IV) methods in pharmacoepidemiology is their reliance on strong, unverifiable assumptions. We can falsify IV assumptions by leveraging the causal structure, which can strengthen or refute their plausibility and increase the validity of effect estimates. We illustrate a systematic approach to evaluating calendar-time IV assumptions in estimating the known effect of thiazolidinediones on hospitalized heart failure. Using cohort entry time (before versus after 09/2010, when the U.S. Food and Drug Administration issued a safety communication) as a proposed IV, we estimated IV and propensity score-weighted 2-year risk differences (RDs) using Medicare data (2008-2014). We (i) performed inequality tests, (ii) identified the negative control IV/outcome using causal assumptions, (iii) estimated RDs after narrowing the calendar time range and excluding patients likely affected by unmeasured confounding, (iv) derived bounds for RDs, and (v) estimated the proportion of compliers and their characteristics. The findings revealed that the IV assumptions were violated and RDs were extreme, but the assumptions became more plausible upon narrowing the calendar time range and restricting the cohort by excluding prevalent heart failure (the strongest measured predictor of the outcome). Systematically evaluating IV assumptions could help detect bias in IV estimators and increase their validity.
ABSTRACT
With the increasing availability of large-scale GWAS summary data on various complex traits and diseases, there has been tremendous interest in applications of Mendelian randomization (MR) to investigate causal relationships between pairs of traits using SNPs as instrumental variables (IVs) based on observational data. In spite of the potential significance of such applications, the validity of their causal conclusions critically depends on some strong modeling assumptions required by MR, which may be violated due to widespread (horizontal) pleiotropy. Although many MR methods have been proposed recently to relax the assumptions, they mainly deal with uncorrelated pleiotropy, and only a few can handle correlated pleiotropy, in which some SNPs/IVs may be associated with hidden confounders, such as heritable factors shared by both traits. Here we propose a simple and effective approach based on constrained maximum likelihood and model averaging, called cML-MA, applicable to GWAS summary data. To deal with more challenging situations in which many invalid IVs have only weak pleiotropic effects, we modify and improve the method with data perturbation. Extensive simulations demonstrated that the proposed methods control the type I error rate better, while achieving higher power, than other competitors. Applications to 48 risk factor-disease pairs based on large-scale GWAS summary data of 3 cardio-metabolic diseases (coronary artery disease, stroke, and type 2 diabetes), asthma, and 12 risk factors confirmed its superior performance.
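A simplified sketch of the constrained-likelihood idea is given below. It is not the authors' implementation, which additionally chooses the number of invalid IVs by an information criterion, averages across the resulting models, and applies data perturbation; here, for a fixed number K of invalid IVs, the sketch alternates between estimating the causal effect from the currently "valid" SNPs and flagging the K SNPs with the largest standardized residuals.

```python
# Simplified constrained-ML-style iteration for a fixed number K of invalid
# IVs (illustrative sketch only; the published method adds BIC selection over
# K, model averaging, and data perturbation).
import numpy as np

def cml_fixed_k(bx, by, sy, K, iters=100):
    """bx: SNP-exposure effects; by, sy: SNP-outcome effects and SEs."""
    valid = np.ones(len(bx), dtype=bool)
    theta = 0.0
    for _ in range(iters):
        w = 1.0 / sy[valid] ** 2                   # IVW estimate on "valid" SNPs
        theta = np.sum(w * bx[valid] * by[valid]) / np.sum(w * bx[valid] ** 2)
        z = np.abs(by - theta * bx) / sy           # standardized residuals
        new_valid = np.ones(len(bx), dtype=bool)
        if K > 0:
            new_valid[np.argsort(z)[-K:]] = False  # flag K largest as invalid
        if np.array_equal(new_valid, valid):
            break                                  # converged
        valid = new_valid
    return theta, ~valid                           # estimate and flagged invalid IVs
```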
Subject(s)
Algorithms; Genetic Pleiotropy; Likelihood Functions; Mendelian Randomization Analysis/methods; Asthma/etiology; Cardiovascular Diseases/etiology; Causality; Computer Simulation; Diabetes Mellitus, Type 2/etiology; Humans; Models, Statistical; Risk Factors

ABSTRACT
Instrumental variable (IV) methods offer an opportunity to address unmeasured confounding in causal inference. However, most IV methods are applicable only to discrete or continuous outcomes, with very few available for censored survival outcomes. In this article, we propose nonparametric estimators for the local average treatment effect on survival probabilities under both covariate-dependent and outcome-dependent censoring. We provide an efficient influence function-based estimator and a simple estimation procedure when the IV is either binary or continuous. The proposed estimators possess double-robustness properties and can easily incorporate nonparametric estimation using machine learning tools. In simulation studies, we demonstrate the flexibility and double robustness of our proposed estimators under various plausible scenarios. We apply our method to the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial to estimate the causal effect of screening on survival probabilities and investigate the causal contrasts between the two interventions under different censoring assumptions.
Subject(s)
Computer Simulation; Humans; Causality; Probability

ABSTRACT
OBJECTIVES: The goal of the research was to assess the quantitative relationship between median progression-free survival (PFS) and median overall survival (OS) among patients with relapsed/refractory multiple myeloma (RRMM), based on published randomized controlled trials (RCTs). METHODS: Two bibliographic databases (PubMed and Embase, 1970-2017) were systematically searched for RCTs in RRMM that reported OS and PFS, followed by an updated search of studies published between 2010 and 2022 in 3 databases (Embase, MEDLINE, and EBM Reviews). The association between median PFS and median OS was assessed using the nonparametric Spearman rank and parametric Pearson correlation coefficients. Subsequently, the quantitative relationship between PFS and OS was assessed using weighted least-squares regression adjusted for covariates including age, sex, and publication year, with study arms weighted by the number of patients in each arm. RESULTS: A total of 31 RCTs (56 treatment arms, 10,450 patients with RRMM) were included in the analysis. The average median PFS and median OS were 7.1 months (SD 5.5) and 28.1 months (SD 11.8), respectively. The Spearman and Pearson correlation coefficients between median PFS and median OS were 0.80 (P < 0.0001) and 0.79 (P < 0.0001), respectively. In individual treatment arms of RRMM trials, each 1-month increase in median PFS was associated with a 1.72-month (95% CI 1.26-2.17) increase in median OS. CONCLUSION: Analysis of the relationship between PFS and OS incorporating more recent RRMM studies further substantiates the use of PFS to predict OS in RRMM.
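Given published arm-level medians and sample sizes, the core of this analysis can be reproduced in a few lines; the sketch below (illustrative names, covariate adjustment omitted) computes the two correlation coefficients and the patient-weighted regression slope of median OS on median PFS.

```python
# Sketch of the trial-arm analysis: Spearman and Pearson correlations between
# median PFS and median OS, and a weighted least-squares slope with arms
# weighted by patient counts (covariates omitted in this illustration).
import numpy as np
from scipy.stats import pearsonr, spearmanr
import statsmodels.api as sm

def pfs_os_association(pfs, os_, n_patients):
    """pfs, os_: arm-level medians (months); n_patients: patients per arm."""
    rho, _ = spearmanr(pfs, os_)
    r, _ = pearsonr(pfs, os_)
    wls = sm.WLS(os_, sm.add_constant(np.asarray(pfs)),
                 weights=n_patients).fit()
    return rho, r, wls.params[1], wls.conf_int()[1]  # slope: months of OS per month of PFS
```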
Subject(s)
Multiple Myeloma; Progression-Free Survival; Randomized Controlled Trials as Topic; Multiple Myeloma/mortality; Multiple Myeloma/therapy; Multiple Myeloma/pathology; Humans; Neoplasm Recurrence, Local/mortality; Female; Male

ABSTRACT
Mediation analysis is a strategy for understanding the mechanisms by which interventions affect later outcomes. However, unobserved confounding concerns may be compounded in mediation analyses, as there may be unobserved exposure-outcome, exposure-mediator, and mediator-outcome confounders. Instrumental variables (IVs) are a popular identification strategy in the presence of unobserved confounding. However, in contrast to the rich literature on the use of IV methods to identify and estimate the total effect of a non-randomized exposure, there has been almost no research into using IVs as an identification strategy for mediational indirect effects. In response, we define and nonparametrically identify novel estimands, double complier interventional direct and indirect effects, when two possibly related IVs are available: one for the exposure and another for the mediator. We propose nonparametric, robust, efficient estimators for these effects and apply them to a housing voucher experiment.
Subject(s)
Mediation Analysis; Confounding Factors, Epidemiologic

ABSTRACT
HIV prevalence estimation using data from the demographic and health surveys (DHS) is limited by the presence of non-response and test refusals. Conventional adjustments such as imputation require the data to be missing at random. Methods that use instrumental variables allow for the possibility that prevalence differs between respondents and non-respondents, but their performance depends critically on the validity of the instrument. Using Manski's partial identification approach, we form instrumental variable bounds for HIV prevalence from a pool of candidate instruments. Our method does not require all candidate instruments to be valid. We use a simulation study to evaluate and compare our method against its competitors, and we illustrate it using DHS data from Zambia, Malawi, and Kenya. Our simulations show that imputation leads to seriously biased results even under mild violations of non-random missingness. Worst-case identification bounds that make no assumptions about the non-response mechanism are robust but not informative. Taking the union of instrumental variable bounds balances the informativeness of the bounds against robustness to the inclusion of some invalid instruments. Non-response and refusals are ubiquitous in population-based HIV data such as those collected under the DHS. Partial identification bounds provide a robust solution to HIV prevalence estimation without strong assumptions, and union bounds are significantly more informative than the worst-case bounds without sacrificing credibility.
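The bounding logic can be sketched for a binary outcome with nonresponse. Under the worst case, the prevalence lies between P(Y=1, R=1) and P(Y=1, R=1) + P(R=0); a valid instrument (one independent of true status) allows these bounds to be intersected across its levels, and a pool of candidate instruments is handled by taking the union of the per-instrument bounds, which remains valid as long as at least one candidate is valid. The code below illustrates that logic and is not the paper's software.

```python
# Worst-case (Manski) bounds, per-instrument intersection bounds, and their
# union over a pool of candidate instruments, for a binary outcome with
# nonresponse (illustrative sketch).
import numpy as np

def worst_case(y, r):
    """y: binary outcome (only meaningful where r == 1); r: response indicator."""
    p11 = np.mean((r == 1) & (y == 1))        # P(Y=1, R=1)
    return p11, p11 + np.mean(r == 0)         # upper bound adds P(R=0)

def iv_bounds(y, r, z):
    """Intersect worst-case bounds across levels of a candidate instrument z."""
    los, his = zip(*(worst_case(y[z == v], r[z == v]) for v in np.unique(z)))
    return max(los), min(his)                 # an empty interval falsifies the IV

def union_bounds(y, r, candidates):
    """Union of per-instrument bounds over a pool of candidate instruments."""
    bounds = [iv_bounds(y, r, z) for z in candidates]
    return min(b[0] for b in bounds), max(b[1] for b in bounds)
```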
Subject(s)
Computer Simulation; HIV Infections; Health Surveys; Humans; HIV Infections/epidemiology; Kenya/epidemiology; Prevalence; Malawi/epidemiology; Models, Statistical; Zambia/epidemiology; Male; Female; Bias; Data Interpretation, Statistical

ABSTRACT
BACKGROUND: Mendelian randomization is a popular method for causal inference with observational data that uses genetic variants as instrumental variables. Similarly to a randomized trial, a standard Mendelian randomization analysis estimates the population-averaged effect of an exposure on an outcome. Dividing the population into subgroups can reveal effect heterogeneity to inform who would most benefit from intervention on the exposure. However, as covariates are measured post-"randomization", naive stratification typically induces collider bias in stratum-specific estimates. METHOD: We extend a previously proposed stratification method (the "doubly-ranked method") to form strata based on a single covariate, and introduce a data-adaptive random forest method to calculate stratum-specific estimates that are robust to collider bias based on a high-dimensional covariate set. We also propose measures based on the Q statistic to assess heterogeneity between stratum-specific estimates (to understand whether estimates are more variable than expected due to chance alone) and variable importance (to identify the key drivers of effect heterogeneity). RESULT: We show that the effect of body mass index (BMI) on lung function is heterogeneous, depending most strongly on hip circumference and weight. While for most individuals, the predicted effect of increasing BMI on lung function is negative, it is positive for some individuals and strongly negative for others. CONCLUSION: Our data-adaptive approach allows for the exploration of effect heterogeneity in the relationship between an exposure and an outcome within a Mendelian randomization framework. This can yield valuable insights into disease aetiology and help identify specific groups of individuals who would derive the greatest benefit from targeted interventions on the exposure.
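The heterogeneity measure described in the methods is based on Cochran's Q applied to the stratum-specific estimates; a minimal version is sketched below (illustrative, assuming independent strata).

```python
# Cochran's Q across stratum-specific MR estimates: tests whether estimates
# vary more than expected by chance under homogeneity (illustrative sketch).
import numpy as np
from scipy.stats import chi2

def q_heterogeneity(theta, se):
    """theta, se: stratum-specific causal estimates and standard errors."""
    theta, se = np.asarray(theta), np.asarray(se)
    w = 1.0 / se**2
    pooled = np.sum(w * theta) / np.sum(w)        # inverse-variance pooled estimate
    Q = np.sum(w * (theta - pooled) ** 2)
    return Q, chi2.sf(Q, df=len(theta) - 1)       # statistic and p-value
```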
Subject(s)
Genetic Variation; Mendelian Randomization Analysis; Humans; Mendelian Randomization Analysis/methods; Causality; Bias; Body Mass Index

ABSTRACT
Mendelian randomization (MR) requires strong unverifiable assumptions to estimate causal effects. However, for categorical exposures, the MR assumptions can be falsified using a method known as the instrumental inequalities. To apply the instrumental inequalities to a continuous exposure, investigators must coarsen the exposure, a process which can itself violate the MR conditions. Violations of the instrumental inequalities for an MR model with a coarsened exposure might therefore reflect the effect of coarsening rather than other sources of bias. We aim to evaluate how exposure coarsening affects the ability of the instrumental inequalities to detect bias in MR models with multiple proposed instruments under various causal structures. To do so, we simulated data mirroring existing studies of the effect of alcohol consumption on cardiovascular disease, under a variety of exposure-outcome effects, in which the MR assumptions were met for a continuous exposure. We categorized the exposure based on subject matter knowledge or the observed data distribution and applied the instrumental inequalities to MR models for the effects of the coarsened exposure. In simulations with multiple binary instruments, the instrumental inequalities did not detect bias under any magnitude of exposure-outcome effect when the exposure was coarsened into more than 2 categories. However, in simulations of both single and multiple proposed instruments, the instrumental inequalities were violated in some scenarios when the exposure was dichotomized. These results suggest that the instrumental inequalities are largely insensitive to bias due to exposure coarsening into more than 2 categories, and could be used with coarsened exposures to evaluate the required assumptions in applied MR studies, even when the underlying exposure is truly continuous.
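For reference, the instrumental inequalities applied in such analyses are, for discrete data, Pearl's constraints: for every exposure level x, the sum over outcome levels y of max_z P(X=x, Y=y | Z=z) must not exceed 1. The sketch below checks these constraints empirically for a single discrete proposed instrument (illustrative only).

```python
# Empirical check of Pearl's instrumental inequality for discrete Z, X, Y:
# for each x, the sum over y of max over z of P(X=x, Y=y | Z=z) must be <= 1
# (illustrative sketch for a single proposed instrument).
import numpy as np

def instrumental_inequality_violated(z, x, y):
    zs, xs, ys = np.unique(z), np.unique(x), np.unique(y)
    for xv in xs:
        total = sum(
            max(np.mean((x[z == zv] == xv) & (y[z == zv] == yv)) for zv in zs)
            for yv in ys
        )
        if total > 1.0:
            return True            # data are incompatible with the IV model
    return False
```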