ABSTRACT
Cognitive functioning in older age profoundly impacts quality of life and health. While most research on cognition in older age has focused on mean levels, intraindividual variability (IIV) around this mean may have risk factors and outcomes independent of the mean value. Investigating risk factors associated with IIV has typically involved deriving a summary statistic for each person from the residual error around a fitted mean. However, this ignores uncertainty in the estimates, prohibits exploring associations with time-varying factors, and is biased by floor/ceiling effects. To address this, we propose a mixed-effects location scale beta-binomial model for estimating the average probability and IIV in a word recall test in the English Longitudinal Study of Ageing. After adjusting for mean performance, an analysis of 9,873 individuals across 7 waves (mean = 3.4 waves per person; 2002-2015) found IIV to be greater at older ages, with lower education, in females, with more difficulties in activities of daily living, in later birth cohorts, and when interviewers recorded issues potentially affecting test performance. Our study introduces a novel method for identifying groups with greater IIV in bounded discrete outcomes. Our findings have implications for daily functioning and care, and further work is needed to identify the impact of greater IIV on future health outcomes.
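The mixed-effects location scale idea can be sketched numerically: give the beta-binomial mean and its precision (the inverse of IIV) separate linear predictors and maximise the likelihood. Everything below is simulated stand-in data with a single hypothetical age covariate, not the ELSA analysis itself.

```python
# Sketch of a beta-binomial location-scale model: the recall probability
# (location) and the precision (scale; lower = more intra-individual
# variability) each get their own linear predictor. All data are simulated.
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln, expit

rng = np.random.default_rng(0)
n_trials = 10                       # words per recall test
age = rng.uniform(-1, 1, 500)       # standardised age (hypothetical covariate)

mu = expit(0.5 - 0.8 * age)         # true mean recall probability
phi = np.exp(1.5 - 1.0 * age)       # true precision: falls with age (more IIV)
y = rng.binomial(n_trials, rng.beta(mu * phi, (1 - mu) * phi))

def negloglik(theta):
    b0, b1, g0, g1 = theta
    m = expit(b0 + b1 * age)        # location submodel
    s = np.exp(g0 + g1 * age)       # scale (precision) submodel
    a, b = m * s, (1 - m) * s
    # beta-binomial log-likelihood; binomial coefficient dropped (constant)
    return -np.sum(betaln(y + a, n_trials - y + b) - betaln(a, b))

fit = minimize(negloglik, x0=[0.0, 0.0, 1.0, 0.0], method="Nelder-Mead",
               options={"maxiter": 5000, "maxfev": 5000})
print(fit.x)  # b1 and g1 should come out negative, as simulated
```

The real model would add person-level random effects in both submodels; this sketch keeps only the fixed-effect structure to show how location and scale are separated.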
Subject(s)
Activities of Daily Living, Quality of Life, Aged, Female, Humans, Aging/psychology, Cognition, Longitudinal Studies, Statistical Models, Risk Factors, Male
ABSTRACT
OBJECTIVE: To investigate the diagnostic role of anti-Müllerian hormone (AMH) in polycystic ovary syndrome (PCOS) using an advanced marginal beta-binomial statistical model, and to present optimal cut-offs by age group, geographical location, body mass index (BMI) and other relevant factors. DATA SOURCES: A comprehensive and systematic literature search was conducted in ISI Web of Science, PubMed/Medline, Scopus, Cochrane Library, Embase and ProQuest up to August 2024. STUDY ELIGIBILITY CRITERIA: Epidemiological studies whose diagnostic criterion for PCOS was that of the Androgen Excess Society (AES), the National Institutes of Health (NIH) or Rotterdam were included in the current meta-analysis. Studies were eligible for inclusion if they reported the sensitivity and specificity of AMH, related data through which these parameters could be calculated, and/or data on odds ratios and means. METHODS: The diagnostic role of AMH was assessed using the marginal beta-binomial statistical model and the summary receiver operating characteristic (SROC) method in terms of pooled sensitivity, specificity, and diagnostic odds ratio (DOR) with 95% confidence intervals (CIs). Pooled weighted mean differences (WMDs) and pooled odds ratios (ORs) with 95% CIs were estimated using random effects models. RESULTS: A total of 202 observational studies were included in the pooled analysis, of which 106 studies (19,465 cases and 29,318 controls) were used for the meta-analysis of sensitivity/specificity and 186 studies (30,656 cases and 34,360 controls) for the meta-analysis of mean difference. The pooled sensitivity, specificity, and DOR for AMH were 0.79 (95% CI: 0.52 to 0.97), 0.82 (95% CI: 0.64 to 0.99) and 17.12 (95% CI: 14.37 to 20.32), respectively. The area under the curve (AUC) based on the SROC model was 0.90 (95% CI: 0.87 to 0.93). AMH levels were significantly higher in women with PCOS than in control women (WMD = 4.91; 95% CI: 4.57-5.27).
In addition, individuals with a higher level of AMH were more likely to be affected by PCOS (OR = 23.17; 95% CI: 18.74-28.66; I² = 94%; P < 0.001). A serum AMH concentration of >5.39 ng/mL was associated with PCOS (sensitivity = 88.6%; specificity = 92.75%; likelihood ratio for a positive test result (LR+) = 12.21; likelihood ratio for a negative test result (LR-) = 0.12). CONCLUSION: According to the results of this meta-analysis, serum AMH concentration is a valuable biomarker for the diagnosis of PCOS. The cut-off points suggested by the current meta-analysis need to be evaluated and validated by future studies before their implementation in clinical practice.
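The cut-off statistics reported above are internally consistent and easy to reproduce: the likelihood ratios follow directly from sensitivity and specificity (values taken from the abstract; tiny differences are rounding).

```python
# Reproduce the likelihood ratios for the reported AMH cut-off (>5.39 ng/mL)
# from its sensitivity and specificity alone.
sens, spec = 0.886, 0.9275

lr_pos = sens / (1 - spec)        # LR+ = sens / (1 - spec)
lr_neg = (1 - sens) / spec        # LR- = (1 - sens) / spec
dor = lr_pos / lr_neg             # diagnostic odds ratio at this cut-off

print(round(lr_pos, 2), round(lr_neg, 2))  # 12.22 0.12 (abstract: 12.21, 0.12)
```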
ABSTRACT
Meta-analysis of binary data is challenging when the event under investigation is rare, and standard models for random-effects meta-analysis perform poorly in such settings. In this simulation study, we investigate the performance of different random-effects meta-analysis models in terms of point and interval estimation of the pooled log odds ratio in rare events meta-analysis. First and foremost, we evaluate the performance of a hypergeometric-normal model from the family of generalized linear mixed models (GLMMs), which has been recommended, but has not yet been thoroughly investigated for rare events meta-analysis. Performance of this model is compared to performance of the beta-binomial model, which yielded favorable results in previous simulation studies, and to the performance of models that are frequently used in rare events meta-analysis, such as the inverse variance model and the Mantel-Haenszel method. In addition to considering a large number of simulation parameters inspired by real-world data settings, we study the comparative performance of the meta-analytic models under two different data-generating models (DGMs) that have been used in past simulation studies. The results of this study show that the hypergeometric-normal GLMM is useful for meta-analysis of rare events when moderate to large heterogeneity is present. In addition, our study reveals important insights with regard to the performance of the beta-binomial model under different DGMs from the binomial-normal family. In particular, we demonstrate that although misalignment of the beta-binomial model with the DGM affects its performance, it shows more robustness to the DGM than its competitors.
Subject(s)
Statistical Models, Odds Ratio, Computer Simulation, Linear Models
ABSTRACT
BACKGROUND: Meta-analyses are used to summarise the results of several studies on a specific research question. Standard methods for meta-analysis, namely inverse variance random effects models, have unfavourable properties if only very few (2-4) studies are available. Therefore, alternative meta-analytic methods are needed. In the case of binary data, the "common-rho" beta-binomial model has shown good results in situations with sparse data or few studies. The major concern with this model is that it ignores the fact that each treatment arm is paired with a respective control arm from the same study. Thus, the randomisation to a study arm of a specific study is disrespected, which may lead to compromised estimates of the treatment effect. Therefore, we extended this model to a version that respects randomisation. The aim of this simulation study was to compare the "common-rho" beta-binomial model and several other beta-binomial models with standard meta-analysis models, including generalised linear mixed models and several inverse variance random effects models. METHODS: We conducted a simulation study comparing beta-binomial models and various standard meta-analysis methods. The design of the simulation aimed to reflect meta-analytic situations occurring in practice. RESULTS: No method performed well in scenarios with only 2 studies under the random effects scenario. In this situation, a fixed effect model or a qualitative summary of the study results may be preferable. In scenarios with 3 or 4 studies, most methods satisfied the nominal coverage probability. The "common-rho" beta-binomial model showed the highest power under the alternative hypothesis. The beta-binomial model respecting randomisation did not improve performance. CONCLUSION: The "common-rho" beta-binomial model appears to be a good option for meta-analyses of very few studies.
As residual concerns about the consequences of disrespecting randomisation may still exist, we recommend a sensitivity analysis with a standard meta-analysis method that respects randomisation.
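A minimal sketch of the "common-rho" idea: each arm keeps its own event probability, but a single intraclass correlation rho is shared by all arms. The parameterisation a = p(1-rho)/rho, b = (1-p)(1-rho)/rho is the standard beta-binomial ICC form; the counts below are invented.

```python
# "Common-rho" beta-binomial likelihood sketch: arm-specific probabilities,
# one shared overdispersion parameter rho. Counts are hypothetical.
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln, comb, expit

control = [(3, 50), (1, 40), (4, 60)]   # (events, arm size) per study
treated = [(1, 50), (0, 40), (2, 60)]

def bb_logpmf(y, n, p, rho):
    a = p * (1 - rho) / rho
    b = (1 - p) * (1 - rho) / rho
    return np.log(comb(n, y)) + betaln(y + a, n - y + b) - betaln(a, b)

def negloglik(theta):
    pc, pt, rho = expit(theta)          # keep all parameters in (0, 1)
    ll = sum(bb_logpmf(y, n, pc, rho) for y, n in control)
    ll += sum(bb_logpmf(y, n, pt, rho) for y, n in treated)
    return -ll

fit = minimize(negloglik, x0=[-2.0, -3.0, -2.0], method="Nelder-Mead",
               options={"maxiter": 2000})
pc, pt, rho = expit(fit.x)
log_or = np.log(pt / (1 - pt)) - np.log(pc / (1 - pc))   # treatment effect
print(pc, pt, rho, log_or)
```

The randomisation-respecting extension discussed above would tie each treated arm to its own study's control arm rather than pooling arms freely as here.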
Subject(s)
Statistical Models, Humans, Probability, Linear Models, Computer Simulation
ABSTRACT
In cluster randomized trials, it is often of interest to estimate the common intraclass correlation at the design stage for sample size and power calculations, which are greatly affected by its value. In this article, we construct confidence intervals (CIs) for the common intraclass correlation coefficient of several treatment groups. We consider the profile likelihood (PL)-based approach using beta-binomial models and an approach based on the concept of generalized pivots using the ANOVA estimator and its asymptotic variance. We compare both approaches with a number of large-sample procedures, as well as parametric and nonparametric bootstrap procedures, in terms of coverage and expected CI length through a simulation study, and illustrate the methodology with two examples from biomedical fields. The results support the use of the PL-based CI, as it holds the preassigned confidence level very well and overall gives a very competitive length.
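A grid-based profile-likelihood CI for a common ICC can be sketched in a few lines: at each candidate rho, maximise each group's beta-binomial likelihood over its own success probability, then keep the rho values within half a chi-square critical value of the maximum. Data are invented clustered counts.

```python
# Profile-likelihood 95% CI for a common intraclass correlation rho under
# beta-binomial models (a = p(1-rho)/rho, b = (1-p)(1-rho)/rho), on a grid.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import betaln
from scipy.stats import chi2

# Hypothetical clustered binary data: (successes, cluster size), per group
groups = [
    [(5, 10), (2, 10), (9, 12), (4, 8), (1, 10), (7, 11)],
    [(3, 10), (8, 10), (2, 12), (6, 9), (4, 10), (1, 8)],
]

def profile_loglik(rho):
    """Max over each group's p of the beta-binomial log-likelihood at fixed rho."""
    total = 0.0
    for g in groups:
        def nll(p):
            a = p * (1 - rho) / rho
            b = (1 - p) * (1 - rho) / rho
            return -sum(betaln(y + a, n - y + b) - betaln(a, b) for y, n in g)
        total -= minimize_scalar(nll, bounds=(1e-4, 1 - 1e-4), method="bounded").fun
    return total

rhos = np.linspace(0.005, 0.6, 120)
prof = np.array([profile_loglik(r) for r in rhos])
inside = prof >= prof.max() - chi2.ppf(0.95, df=1) / 2   # likelihood-ratio cut
ci = (rhos[inside].min(), rhos[inside].max())
print(ci)   # grid-based 95% profile-likelihood CI for rho
```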
Subject(s)
Computer Simulation/statistics & numerical data, Factual Databases/statistics & numerical data, Randomized Controlled Trials as Topic/methods, Cluster Analysis, Confidence Intervals, Humans, Randomized Controlled Trials as Topic/statistics & numerical data
ABSTRACT
Longitudinal binomial data are frequently generated from multiple questionnaires and assessments in various scientific settings, and such binomial data are often overdispersed. The standard generalized linear mixed effects model may severely underestimate the standard errors of estimated regression parameters in such cases and hence potentially bias the statistical inference. In this paper, we propose a longitudinal beta-binomial model for overdispersed binomial data and estimate the regression parameters under a probit model using the generalized estimating equation method. A hybrid algorithm combining Fisher scoring and the method of moments is implemented for estimation. Extensive simulation studies are conducted to evaluate the validity of the proposed method. Finally, the proposed method is applied to analyze functional impairment in subjects who are at risk of Huntington disease, using data from a multisite observational study of prodromal Huntington disease. Copyright © 2016 John Wiley & Sons, Ltd.
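The overdispersion these models target has a simple moment structure: under a beta-binomial, Var(Y) = n p (1-p) [1 + (n-1) rho], the binomial variance inflated by the pairwise correlation rho. A toy method-of-moments check on simulated data (the paper embeds this idea in a GEE with a probit link):

```python
# Method-of-moments recovery of the beta-binomial overdispersion parameter
# rho from the variance inflation over binomial. Simulated data only.
import numpy as np

rng = np.random.default_rng(1)
n, p, rho = 20, 0.4, 0.15                 # items per assessment, mean, true ICC
a, b = p * (1 - rho) / rho, (1 - p) * (1 - rho) / rho
y = rng.binomial(n, rng.beta(a, b, size=4000))

p_hat = y.mean() / n
binom_var = n * p_hat * (1 - p_hat)               # variance if truly binomial
rho_hat = (y.var() / binom_var - 1) / (n - 1)     # moment estimator of rho
print(p_hat, rho_hat)   # close to 0.4 and 0.15
```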
Subject(s)
Binomial Distribution, Longitudinal Studies, Algorithms, Statistical Data Interpretation, Humans, Huntington Disease/epidemiology, Linear Models, Statistical Models, Prodromal Symptoms, Risk Factors
ABSTRACT
Meta-analysis of diagnostic test accuracy often involves a mixture of case-control and cohort studies. The existing bivariate random-effects models, which jointly model bivariate accuracy indices (e.g., sensitivity and specificity), do not differentiate cohort studies from case-control studies and thus do not utilize the prevalence information contained in the cohort studies. The recently proposed trivariate generalized linear mixed-effects models are only applicable to cohort studies, and more importantly, they assume a common correlation structure across studies and trivariate normality of disease prevalence, test sensitivity, and specificity after transformation by pre-specified link functions. In practice, very few studies provide justification for these assumptions, and sometimes the assumptions are violated. In this paper, we evaluate the performance of the commonly used random-effects model under violations of these assumptions and propose a simple and robust method to fully utilize the information contained in case-control and cohort studies. The proposed method avoids making the aforementioned assumptions and can provide valid joint inferences for any functions of the overall summary measures of diagnostic accuracy. Through simulation studies, we find that the proposed method is more robust to model misspecification than the existing methods. We apply the proposed method to a meta-analysis of diagnostic test accuracy for the detection of recurrent ovarian carcinoma. Copyright © 2016 John Wiley & Sons, Ltd.
Subject(s)
Routine Diagnostic Tests, Meta-Analysis as Topic, Local Neoplasm Recurrence/diagnosis, Ovarian Neoplasms/diagnosis, Research Design, Computer Simulation, Female, Humans, Multivariate Analysis
ABSTRACT
When conducting a meta-analysis of studies with bivariate binary outcomes, challenges arise when the within-study correlation and between-study heterogeneity must be taken into account. In this paper, we propose a marginal beta-binomial model for the meta-analysis of studies with binary outcomes. This model is based on the composite likelihood approach and has several attractive features compared with existing models such as the bivariate generalized linear mixed model (Chu and Cole, 2006) and the Sarmanov beta-binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities on the original scale, requiring no transformation of probabilities or link function, having a closed-form likelihood, and placing no constraints on the correlation parameter. More importantly, because the marginal beta-binomial model is based only on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study-specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using current methods. We compare the performance of the marginal beta-binomial model with the bivariate generalized linear mixed model and the Sarmanov beta-binomial model in simulation studies. Interestingly, the results show that the marginal beta-binomial model performs better than the Sarmanov beta-binomial model whether or not the true model is Sarmanov beta-binomial, and that the marginal beta-binomial model is more robust than the bivariate generalized linear mixed model under model misspecification. Two meta-analyses of diagnostic accuracy studies and a meta-analysis of case-control studies are presented for illustration.
Subject(s)
Meta-Analysis as Topic, Statistical Models, Arylamine N-Acetyltransferase/metabolism, Bias, Tumor Biomarkers/metabolism, Biostatistics/methods, Case-Control Studies, Colorectal Neoplasms/enzymology, Computer Simulation, Early Diagnosis, Humans, Likelihood Functions, Linear Models, Melanoma/diagnosis, Melanoma/secondary, Software, Urinary Bladder Neoplasms/diagnosis, Urinary Bladder Neoplasms/metabolism
ABSTRACT
DNA methylation is an important epigenetic modification involved in gene regulation. Advances in next-generation sequencing technology have enabled the retrieval of DNA methylation information at single-base resolution. However, due to the sequencing process and the limited amount of isolated DNA, DNA methylation data are often noisy and sparse, which complicates the identification of differentially methylated regions (DMRs), especially when few replicates are available. We present a varying-coefficient model for detecting DMRs using single-base-resolved methylation information. The model simultaneously smooths the methylation profiles and allows detection of DMRs, while accounting for additional covariates. The proposed model accounts for possible overdispersion by using a beta-binomial distribution. The overdispersion itself can be modeled as a function of the genomic region and explanatory variables. We illustrate the properties of the proposed model by applying it to two real-life case studies.
Subject(s)
DNA Methylation, DNA Sequence Analysis, Humans, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing
ABSTRACT
In meta-analyses of rare events, it can be challenging to obtain a reliable estimate of the pooled effect, in particular when the meta-analysis is based on a small number of studies. Recent simulation studies have shown that the beta-binomial model is a promising candidate in this situation, but have thus far only investigated its performance in a frequentist framework. In this study, we aim to make the beta-binomial model for meta-analysis of rare events amenable to Bayesian inference by proposing prior distributions for the effect parameter and investigating the model's robustness to different specifications of priors for the scale parameter. To evaluate the performance of Bayesian beta-binomial models with different priors, we conducted a simulation study with two different data-generating models in which we varied the size of the pooled effect, the degree of heterogeneity, the baseline probability, and the sample size. Our results show that while some caution must be exercised when using the Bayesian beta-binomial model in meta-analyses with extremely sparse data, the use of a weakly informative prior for the effect parameter is beneficial in terms of mean bias, mean squared error, and coverage. For the scale parameter, half-normal and exponential distributions are identified as candidate priors in meta-analysis of rare events using the Bayesian beta-binomial model.
Subject(s)
Statistical Models, Bayes Theorem, Computer Simulation, Probability, Sample Size
ABSTRACT
Risk difference is a frequently used effect measure for binary outcomes. In a meta-analysis, commonly used methods to synthesize risk differences include: (1) two-step methods that first estimate study-specific risk differences, followed by a univariate common-effect model, fixed-effects model, or random-effects model; and (2) one-step methods using bivariate random-effects models to estimate the summary risk difference from study-specific risks. These methods are expected to have similar performance when the number of studies is large and the event rate is not rare. However, studies with zero events are common in meta-analyses, and bias may occur with the conventional two-step methods from excluding zero-event studies or applying an artificial continuity correction to zero events. In contrast, zero-event studies can be included and modeled by bivariate random-effects models in a single step. This article compares various methods to estimate risk differences in meta-analyses. Specifically, we present two case studies and three simulation studies to compare the performance of conventional two-step methods and bivariate random-effects models in the presence or absence of zero-event studies. In conclusion, we recommend that researchers use bivariate random-effects models to estimate risk differences in meta-analyses, particularly in the presence of zero events.
Subject(s)
Statistical Models, Computer Simulation
ABSTRACT
BACKGROUND: Patients with esophageal squamous cell carcinoma (ESCC) who have lymph node metastasis may be misclassified as pN0 due to an insufficient number of lymph nodes examined (LNE). The purpose of this study was to assess whether patients with ESCC staged as pN0 are truly node-negative and to propose an adequate LNE count for correct nodal staging using the nodal staging score (NSS) developed from a beta-binomial model. METHODS: A total of 1249 patients from the Surveillance, Epidemiology, and End Results (SEER) database between 2000 and 2017, and 1404 patients diagnosed with ESCC in our database between 2005 and 2018, were included. The NSS was developed to assess the probability of true pN0 status based on both databases. The effectiveness of the NSS was verified using survival analysis, including Kaplan-Meier curves and Cox models. RESULTS: Many patients were misclassified as pN0 by our algorithm due to insufficient LNE. As the number of LNE increased, false-negative findings dropped and, accordingly, the NSS increased. In addition, NSS was an independent prognostic indicator for pN0 patients with ESCC in the SEER database (hazard ratio [HR] 0.182, 95% confidence interval [CI] 0.046-0.730, p = 0.016) and our database (HR 0.215, 95% CI 0.055-0.842, p = 0.027). A certain number of nodes must be examined to achieve an NSS of 90%. CONCLUSIONS: The NSS can determine the probability of true pN0 status, is sufficient for predicting survival, and yields adequate numbers for lymphadenectomy.
Subject(s)
Esophageal Neoplasms, Esophageal Squamous Cell Carcinoma, Humans, Esophageal Squamous Cell Carcinoma/pathology, Neoplasm Staging, Esophageal Neoplasms/pathology, Lymphatic Metastasis/pathology, Lymph Nodes/pathology, Lymph Node Excision, Prognosis
ABSTRACT
In evidence synthesis, dealing with zero-events studies is an important and complicated task that has generated broad discussion. Numerous methods provide valid solutions for synthesizing data from studies with zero events, based on either a frequentist or a Bayesian framework. Among frequentist frameworks, the one-stage methods have unique advantages in dealing with zero-events studies, especially double-zero-events studies. In this article, we give a concise overview of the one-stage frequentist methods. We conducted simulation studies to compare the statistical properties of these methods with the two-stage frequentist method (continuity correction) for meta-analysis with zero-events studies when double-zero-events studies were included. Our simulation studies demonstrated that the generalized estimating equation with unstructured correlation and the beta-binomial method had the best performance among the one-stage methods. The random intercepts generalized linear mixed model showed good performance in the absence of obvious between-study variance. Our results also showed that the continuity correction with the inverse-variance heterogeneous (IVhet) analytic model, based on the two-stage framework, performed well when the between-study variance was obvious and the group sizes of the included studies were balanced. In summary, the one-stage framework has unique advantages in dealing with zero-events studies and is not sensitive to the group size ratio. It should be considered in future meta-analyses whenever possible.
Subject(s)
Statistical Models, Research Design, Bayes Theorem, Computer Simulation, Linear Models
ABSTRACT
BACKGROUND: Lymph node status can predict the prognosis of patients with rectal cancer treated with surgery. We therefore sought to establish a standard for the minimum number of lymph nodes (LNs) examined in patients with rectal cancer by evaluating the probability that a patient with pathologically negative LNs is in fact node-positive. PATIENTS AND METHODS: We extracted information on 31,853 patients with stage I-III rectal carcinoma registered between 2004 and 2013 from the Surveillance, Epidemiology, and End Results database and divided them into two groups: SURG, patients who received surgery directly, and NEO, those who underwent neo-adjuvant therapy. Using a beta-binomial model, we developed a nodal staging score (NSS) based on pT/ypT stage and the number of LNs retrieved. RESULTS: In both cohorts, the false-negative rate was estimated to be 16% when 12 LNs were examined, but it dropped to 10% when 20 LNs were evaluated. In the SURG cohort, to rule out false staging with 90% certainty, 3, 7, 28, and 32 LNs would need to be examined in patients with pT1-4 disease, respectively. In the NEO cohort, 4, 7, 12, and 16 LNs would need to be examined in patients with ypT1-4 disease to guarantee an NSS of 90%. CONCLUSION: By determining whether a rectal cancer patient with negative LNs has been appropriately staged, the NSS model developed in this study may assist in tailoring postoperative management.
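The beta-binomial machinery behind an NSS can be sketched directly: if the per-node positivity rate of a truly node-positive patient follows a Beta(a, b) distribution, the chance that all k examined nodes are negative is E[(1-p)^k] = B(a, b+k)/B(a, b), and Bayes' rule converts that into the probability of true pN0 given k negative nodes. The prevalence and Beta parameters below are illustrative only, not fitted to SEER.

```python
# Nodal staging score sketch: probability of true node-negative status given
# k negative nodes, under a beta-binomial false-negative model. Parameters
# (prevalence, a, b) are hypothetical.
import numpy as np
from scipy.special import betaln

prevalence = 0.35          # assumed share of patients truly node-positive
a, b = 0.8, 2.5            # assumed Beta(a, b) for the per-node positivity rate

def nss(k):
    # P(all k examined nodes negative | truly node-positive) = B(a, b+k)/B(a, b)
    fn = np.exp(betaln(a, b + k) - betaln(a, b))
    # Bayes: P(truly node-negative | k negative nodes observed)
    return (1 - prevalence) / ((1 - prevalence) + prevalence * fn)

for k in (1, 12, 20, 40):
    print(k, round(nss(k), 3))   # NSS rises as more nodes are examined
```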
ABSTRACT
For meta-analysis of studies that report outcomes as binomial proportions, the most popular measure of effect is the odds ratio (OR), usually analyzed as log(OR). Many meta-analyses use the risk ratio (RR) and its logarithm because of its simpler interpretation. Although log(OR) and log(RR) are both unbounded, use of log(RR) must ensure that estimates are compatible with study-level event rates in the interval (0, 1). These complications pose a particular challenge for random-effects models, both in applications and in generating data for simulations. As background, we review the conventional random-effects model and then binomial generalized linear mixed models (GLMMs) with the logit link function, which do not have these complications. We then focus on log-binomial models and explore implications of using them; theoretical calculations and simulation show evidence of biases. The main competitors to the binomial GLMMs use the beta-binomial (BB) distribution, either in BB regression or by maximizing a BB likelihood; a simulation produces mixed results. Two examples and an examination of Cochrane meta-analyses that used RR suggest bias in the results from the conventional inverse-variance-weighted approach. Finally, we comment on other measures of effect that have range restrictions, including risk difference, and outline further research.
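The boundary problem with log(RR) is easy to see numerically: adding a log risk ratio to a log baseline risk can push the implied risk above 1, whereas the logit link cannot leave (0, 1). Toy numbers only (the logit case treats the same effect size as a log odds ratio):

```python
# Why log-binomial models are awkward: exp(linear predictor) must stay in
# (0, 1), a constraint the logit link satisfies automatically. Toy numbers.
import numpy as np
from scipy.special import expit, logit

base_risk = 0.4
effect = np.log(3.0)                       # a log risk ratio of 3

risk_log_link = np.exp(np.log(base_risk) + effect)    # 0.4 * 3 = 1.2: invalid
risk_logit_link = expit(logit(base_risk) + effect)    # same effect as log OR

print(risk_log_link, risk_logit_link)      # 1.2 and about 0.667
```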
Subject(s)
Tricyclic Antidepressants/adverse effects, Tricyclic Antidepressants/therapeutic use, Depression/drug therapy, Meta-Analysis as Topic, Risk Assessment/methods, Risk, Algorithms, Computer Simulation, Diuretics/therapeutic use, Female, Humans, Likelihood Functions, Linear Models, Odds Ratio, Pre-Eclampsia/drug therapy, Pregnancy, Regression Analysis
ABSTRACT
While there is consensus that the efficacy of parasiticides is properly assessed using the Abbott formula, there is as yet no general consensus on the use of arithmetic versus geometric mean numbers of surviving parasites in the formula. The purpose of this paper is to investigate the accuracy and precision of various efficacy estimators based on the Abbott formula that alternatively use arithmetic mean, geometric mean, and median numbers of surviving parasites; we also consider a maximum likelihood estimator. Our study shows that the best estimators using geometric means are competitive, with respect to root mean squared error, with the conventional Abbott estimator using arithmetic means, as they have lower average and lower median root mean squared error over the parameter scenarios we investigated. However, our study confirms that Abbott estimators using geometric means are potentially biased upwards, and this upward bias is substantial in particular when the test product has substandard efficacy (90% and below). For this reason, we recommend that the Abbott estimator be calculated using arithmetic means.
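The comparison is easy to reproduce with the Abbott formula, efficacy = 100 × (1 − T/C), where T and C are mean surviving-parasite counts in treated and control groups. The counts below are invented; note that the geometric-mean variant (with the usual +1 shift for zero counts) comes out higher, the direction of the upward bias discussed above.

```python
# Abbott-formula efficacy with arithmetic versus geometric mean counts of
# surviving parasites. Counts are illustrative only.
import numpy as np

control = np.array([48, 45, 50, 42, 47])   # surviving parasites, control group
treated = np.array([3, 0, 5, 1, 2])        # surviving parasites, treated group

def abbott(t, c, mean):
    return 100 * (1 - mean(t) / mean(c))

def geo(x):
    # Geometric mean with +1 shift, a common workaround for zero counts
    return np.exp(np.log(x + 1).mean()) - 1

arith = abbott(treated, control, np.mean)
geom = abbott(treated, control, geo)
print(round(arith, 1), round(geom, 1))   # 95.3 96.3: geometric version higher
```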
Subject(s)
Ectoparasitic Infestations/drug therapy, Biological Models, Pesticides/pharmacology, Animals, Likelihood Functions, Statistical Models
ABSTRACT
The psychometric function describes how an experimental variable, such as stimulus strength, influences the behaviour of an observer. Estimation of psychometric functions from experimental data plays a central role in fields such as psychophysics, experimental psychology and the behavioural neurosciences. Experimental data may exhibit substantial overdispersion, which may result from non-stationarity in the behaviour of observers. Here we extend the standard binomial model typically used for psychometric function estimation to a beta-binomial model. We show that use of the beta-binomial model makes it possible to determine accurate credible intervals even in data that exhibit substantial overdispersion. This goes beyond classical goodness-of-fit measures, which can detect overdispersion but provide no method for correct inference on overdispersed data. We use Bayesian inference methods for estimating the posterior distribution of the parameters of the psychometric function. Unlike previous Bayesian psychometric inference methods, our software implementation, psignifit 4, performs numerical integration of the posterior within automatically determined bounds. This avoids the use of Markov chain Monte Carlo (MCMC) methods, which typically require expert knowledge. Extensive numerical tests show the validity of the approach, and we discuss implications of overdispersion for experimental design. A comprehensive MATLAB toolbox implementing the method is freely available; a python implementation providing the basic capabilities is also available.
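A stripped-down version of the model: a logistic psychometric function for the mean, plus a beta-binomial precision parameter to absorb overdispersion, fitted here by maximum likelihood rather than the full Bayesian integration psignifit 4 performs. Everything below is simulated and simplified (no guess or lapse rates).

```python
# Logistic psychometric function with beta-binomial trial counts: fit the
# threshold, slope, and overdispersion by ML. Simulated, simplified sketch.
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln, expit

rng = np.random.default_rng(2)
levels = np.repeat(np.linspace(-2, 2, 9), 8)   # 9 stimulus strengths, 8 blocks
n = 30                                          # trials per block
phi = 25.0                                      # precision: lower = overdispersed
p_true = expit(2.0 * (levels - 0.3))            # threshold 0.3, slope 2
y = rng.binomial(n, rng.beta(p_true * phi, (1 - p_true) * phi))

def negloglik(theta):
    thr, logslope, logphi = theta
    m = np.clip(expit(np.exp(logslope) * (levels - thr)), 1e-6, 1 - 1e-6)
    s = np.exp(logphi)
    a, b = m * s, (1 - m) * s
    return -np.sum(betaln(y + a, n - y + b) - betaln(a, b))

fit = minimize(negloglik, x0=[0.0, 0.0, 2.0], method="Nelder-Mead",
               options={"maxiter": 4000, "maxfev": 4000})
print(fit.x[0])   # estimated threshold, near the simulated 0.3
```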
Subject(s)
Bayes Theorem, Statistical Data Interpretation, Psychometrics/methods, Psychophysics/methods, Humans, Statistical Models, Sensory Threshold
ABSTRACT
In animal studies of ectoparasiticide efficacy, the total number of parasites with which experimental animals are infested is not always equal to the intended number (usually n = 50 per experimental animal in the case of ticks, and n = 50 or n = 100 in the case of fleas). That is, in the practical implementation of a study protocol, the infestation of experimental animals may be subject to variability, so that the total infestation is not known precisely. The purpose of the present study is to assess the impact of this variability on the accuracy and precision of efficacy estimates. The results of a thorough simulation study show clearly that uncertainty in total parasite infestation, of the magnitude encountered in well-controlled animal studies, has virtually no effect on the accuracy and precision of estimators of ectoparasiticide efficacy.
Subject(s)
Antiparasitic Agents/standards, Drug Evaluation/standards, Ectoparasitic Infestations/parasitology, Parasite Load/standards, Uncertainty, Animals, Antiparasitic Agents/therapeutic use, Computer Simulation, Ectoparasitic Infestations/drug therapy, Flea Infestations/drug therapy, Flea Infestations/parasitology, Host-Parasite Interactions, Reproducibility of Results, Tick Infestations/drug therapy, Tick Infestations/parasitology
ABSTRACT
In retrospective studies, the odds ratio is often used as the measure of association. Under an independent beta prior assumption, the exact posterior distribution of the odds ratio given a single 2 × 2 table has been derived in the literature. However, independence between risks within the same study may be an oversimplified assumption because cases and controls in the same study are likely to share some common factors and thus to be correlated. Furthermore, in a meta-analysis of case-control studies, investigators usually have multiple 2 × 2 tables. In this article, we first extend the published results on a single 2 × 2 table to allow within-study prior correlation while retaining the advantage of a closed-form posterior formula, and then extend the results to multiple 2 × 2 tables and the regression setting. The hyperparameters, including the within-study correlation, are estimated via an empirical Bayes approach. The overall odds ratio and the exact posterior distribution of the study-specific odds ratio are inferred based on the estimated hyperparameters. We conduct simulation studies to verify our exact posterior distribution formulas and investigate the finite sample properties of the inference for the overall odds ratio. The results are illustrated through a twin study for genetic heritability and a meta-analysis of the association between N-acetyltransferase 2 (NAT2) acetylation status and colorectal cancer.
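The independence case the paper starts from can be checked by simulation: with independent Beta priors, each risk's posterior is itself Beta, so odds-ratio draws follow immediately. Counts and priors below are illustrative, and this sketch does not include the paper's within-study correlation extension.

```python
# Monte Carlo stand-in for the closed-form posterior of a single 2x2 table's
# odds ratio under independent Beta priors. Illustrative counts only.
import numpy as np

rng = np.random.default_rng(3)
exp_cases, n_cases = 30, 100      # exposed among cases
exp_ctrls, n_ctrls = 15, 100      # exposed among controls
a0, b0 = 1.0, 1.0                 # independent Beta(1, 1) priors on each risk

p1 = rng.beta(a0 + exp_cases, b0 + n_cases - exp_cases, size=200_000)
p2 = rng.beta(a0 + exp_ctrls, b0 + n_ctrls - exp_ctrls, size=200_000)
or_draws = (p1 / (1 - p1)) / (p2 / (1 - p2))

or_median = np.median(or_draws)
ci = np.percentile(or_draws, [2.5, 97.5])
print(or_median, ci)              # posterior median OR and 95% credible interval
```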
Subject(s)
Bayes Theorem, Case-Control Studies, Meta-Analysis as Topic, Statistics as Topic/methods, Arylamine N-Acetyltransferase/genetics, Colorectal Neoplasms/genetics, Genetic Predisposition to Disease/genetics, Humans, Odds Ratio, Regression Analysis, Space Simulation, Twin Studies as Topic/methods
ABSTRACT
Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels), I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data. 
Finally, both OLRE and Beta-Binomial models performed poorly when models contained <5 levels of the random intercept term, especially for estimating variance components, and this effect appeared independent of total sample size. These results suggest that OLRE are a useful tool for modelling overdispersion in Binomial data, but that they do not perform well in all circumstances and researchers should take care to verify the robustness of parameter estimates of OLRE models.
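The two overdispersion-generating processes contrasted above can be simulated in a few lines: a Pearson dispersion statistic near 1 indicates binomial data, while both the beta-binomial mixture and logit-normal noise on the linear predictor (the process an OLRE absorbs) push it well above 1. Intercept-only sketch with invented parameters.

```python
# Simulate both overdispersion-generating processes and compare the Pearson
# dispersion (Pearson statistic / df) against plain binomial data.
import numpy as np
from scipy.special import expit, logit

rng = np.random.default_rng(4)
n, m, p = 25, 2000, 0.3           # trials per unit, sample size, mean proportion

rho = 0.2
a, b = p * (1 - rho) / rho, (1 - p) * (1 - rho) / rho
y_bb = rng.binomial(n, rng.beta(a, b, size=m))                    # BB mixture
y_olre = rng.binomial(n, expit(logit(p) + rng.normal(0, 1, m)))   # logit noise
y_plain = rng.binomial(n, p, size=m)                              # no overdisp.

def dispersion(y):
    """Pearson statistic over df: ~1 for binomial, >1 when overdispersed."""
    p_hat = y.mean() / n
    return np.sum((y - n * p_hat) ** 2 / (n * p_hat * (1 - p_hat))) / (len(y) - 1)

print([round(dispersion(y), 2) for y in (y_plain, y_bb, y_olre)])
```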