1.
Biostatistics ; 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38058013

ABSTRACT

Assessing the impact of an intervention by using time-series observational data on multiple units and outcomes is a frequent problem in many fields of scientific research. Here, we propose a novel Bayesian multivariate factor analysis model for estimating intervention effects in such settings and develop an efficient Markov chain Monte Carlo algorithm to sample from the high-dimensional and nontractable posterior of interest. The proposed method is one of the few that can simultaneously deal with outcomes of mixed type (continuous, binomial, count), increase efficiency in the estimates of the causal effects by jointly modeling multiple outcomes affected by the intervention, and easily provide uncertainty quantification for all causal estimands of interest. Using the proposed approach, we evaluate the impact that Local Tracing Partnerships had on the effectiveness of England's Test and Trace programme for COVID-19.

2.
Lancet ; 399(10332): 1303-1312, 2022 04 02.
Article in English | MEDLINE | ID: mdl-35305296

ABSTRACT

BACKGROUND: The omicron variant (B.1.1.529) of SARS-CoV-2 has demonstrated partial vaccine escape and high transmissibility, with early studies indicating lower severity of infection than that of the delta variant (B.1.617.2). We aimed to better characterise omicron severity relative to delta by assessing the relative risk of hospital attendance, hospital admission, or death in a large national cohort. METHODS: Individual-level data on laboratory-confirmed COVID-19 cases resident in England between Nov 29, 2021, and Jan 9, 2022, were linked to routine datasets on vaccination status, hospital attendance and admission, and mortality. The relative risk of hospital attendance or admission within 14 days, or death within 28 days after confirmed infection, was estimated using proportional hazards regression. Analyses were stratified by test date, 10-year age band, ethnicity, residential region, and vaccination status, and were further adjusted for sex, index of multiple deprivation decile, evidence of a previous infection, and year of age within each age band. A secondary analysis estimated variant-specific and vaccine-specific vaccine effectiveness and the intrinsic relative severity of omicron infection compared with delta (ie, the relative risk in unvaccinated cases). FINDINGS: The adjusted hazard ratio (HR) of hospital attendance (not necessarily resulting in admission) with omicron compared with delta was 0·56 (95% CI 0·54-0·58); for hospital admission and death, HR estimates were 0·41 (0·39-0·43) and 0·31 (0·26-0·37), respectively. Omicron versus delta HR estimates varied with age for all endpoints examined. The adjusted HR for hospital admission was 1·10 (0·85-1·42) in those younger than 10 years, decreasing to 0·25 (0·21-0·30) in 60-69-year-olds, and then increasing to 0·47 (0·40-0·56) in those aged at least 80 years. 
For both variants, past infection gave some protection against death both in vaccinated (HR 0·47 [0·32-0·68]) and unvaccinated (0·18 [0·06-0·57]) cases. In vaccinated cases, past infection offered no additional protection against hospital admission beyond that provided by vaccination (HR 0·96 [0·88-1·04]); however, for unvaccinated cases, past infection gave moderate protection (HR 0·55 [0·48-0·63]). Omicron versus delta HR estimates were lower for hospital admission (0·30 [0·28-0·32]) in unvaccinated cases than the corresponding HR estimated for all cases in the primary analysis. Booster vaccination with an mRNA vaccine was highly protective against hospitalisation and death in omicron cases (HR for hospital admission 8-11 weeks post-booster vs unvaccinated: 0·22 [0·20-0·24]), with the protection afforded after a booster not being affected by the vaccine used for doses 1 and 2. INTERPRETATION: The risk of severe outcomes following SARS-CoV-2 infection is substantially lower for omicron than for delta, with higher reductions for more severe endpoints and significant variation with age. Underlying the observed risks is a larger reduction in intrinsic severity (in unvaccinated individuals) counterbalanced by a reduction in vaccine effectiveness. Documented previous SARS-CoV-2 infection offered some protection against hospitalisation and high protection against death in unvaccinated individuals, but only offered additional protection in vaccinated individuals for the death endpoint. Booster vaccination with mRNA vaccines maintains over 70% protection against hospitalisation and death in breakthrough confirmed omicron infections. FUNDING: Medical Research Council, UK Research and Innovation, Department of Health and Social Care, National Institute for Health Research, Community Jameel, and Engineering and Physical Sciences Research Council.


Subject(s)
COVID-19, SARS-CoV-2, COVID-19/epidemiology, COVID-19/prevention & control, Cohort Studies, England/epidemiology, Hospitalization, Humans, Synthetic Vaccines, mRNA Vaccines

3.
Stat Med ; 42(13): 2191-2225, 2023 06 15.
Article in English | MEDLINE | ID: mdl-37086186

ABSTRACT

Longitudinal observational data on patients can be used to investigate causal effects of time-varying treatments on time-to-event outcomes. Several methods have been developed for estimating such effects by controlling for the time-dependent confounding that typically occurs. The most commonly used is marginal structural models (MSM) estimated using inverse probability of treatment weights (IPTW) (MSM-IPTW). An alternative, the sequential trials approach, is increasingly popular, and involves creating a sequence of "trials" from new time origins and comparing treatment initiators and non-initiators. Individuals are censored when they deviate from their treatment assignment at the start of each "trial" (initiator or noninitiator), which is accounted for using inverse probability of censoring weights. The analysis uses data combined across trials. We show that the sequential trials approach can estimate the parameters of a particular MSM. The causal estimand that we focus on is the marginal risk difference between the sustained treatment strategies of "always treat" vs "never treat." We compare how the sequential trials approach and MSM-IPTW estimate this estimand, and discuss their assumptions and how data are used differently. The performance of the two approaches is compared in a simulation study. The sequential trials approach, which tends to involve less extreme weights than MSM-IPTW, results in greater efficiency for estimating the marginal risk difference at most follow-up times, but this can, in certain scenarios, be reversed at later time points and relies on modelling assumptions. We apply the methods to longitudinal observational data from the UK Cystic Fibrosis Registry to estimate the effect of dornase alfa on survival.


Subject(s)
Statistical Models, Humans, Causality, Structural Models, Probability, Survival Analysis, Treatment Outcome, Longitudinal Studies

4.
J Infect Dis ; 226(5): 808-811, 2022 09 13.
Article in English | MEDLINE | ID: mdl-35184201

ABSTRACT

To investigate if the AY.4.2 sublineage of the SARS-CoV-2 delta variant is associated with hospitalization and mortality risks that differ from non-AY.4.2 delta risks, we performed a retrospective cohort study of sequencing-confirmed COVID-19 cases in England based on linkage of routine health care datasets. Using stratified Cox regression, we estimated adjusted hazard ratios (aHR) of hospital admission (aHR = 0.85; 95% confidence interval [CI], .77-.94), hospital admission or emergency care attendance (aHR = 0.87; 95% CI, .81-.94), and COVID-19 mortality (aHR = 0.85; 95% CI, .71-1.03). The results indicate that the risks of hospitalization and mortality are similar or lower for AY.4.2 compared to cases with other delta sublineages.


Subject(s)
COVID-19, SARS-CoV-2, Hospitalization, Humans, Retrospective Studies

5.
Stat Med ; 40(16): 3779-3790, 2021 07 20.
Article in English | MEDLINE | ID: mdl-33942919

ABSTRACT

Using data from observational studies to estimate the causal effect of a time-varying exposure, repeatedly measured over time, on an outcome of interest requires careful adjustment for confounding. Standard regression adjustment for observed time-varying confounders is unsuitable, as it can eliminate part of the causal effect and induce bias. Inverse probability weighting, g-computation, and g-estimation have been proposed as being more suitable methods. G-estimation has some advantages over the other two methods, but until recently there has been a lack of flexible g-estimation methods for a survival time outcome. The recently proposed Structural Nested Cumulative Survival Time Model (SNCSTM) is such a method. Efficient estimation of the parameters of this model required bespoke software. In this article we show how the SNCSTM can be fitted efficiently via g-estimation using standard software for fitting generalised linear models. The ability to implement g-estimation for a survival outcome using standard statistical software greatly increases the potential uptake of this method. We illustrate the use of this method of fitting the SNCSTM by reanalyzing data from the UK Cystic Fibrosis Registry, and provide example R code to facilitate the use of this approach by other researchers.


Subject(s)
Statistical Models, Bias, Causality, Humans, Linear Models, Probability

6.
Biom J ; 63(7): 1526-1541, 2021 10.
Article in English | MEDLINE | ID: mdl-33983641

ABSTRACT

Observational longitudinal data on treatments and covariates are increasingly used to investigate treatment effects, but are often subject to time-dependent confounding. Marginal structural models (MSMs), estimated using inverse probability of treatment weighting or the g-formula, are popular for handling this problem. With increasing development of advanced causal inference methods, it is important to be able to assess their performance in different scenarios to guide their application. Simulation studies are a key tool for this, but their use to evaluate causal inference methods has been limited. This paper focuses on the use of simulations for evaluations involving MSMs in studies with a time-to-event outcome. In a simulation, it is important to be able to generate the data in such a way that the correct forms of any models to be fitted to those data are known. However, this is not straightforward in the longitudinal setting because it is natural for data to be generated in a sequential conditional manner, whereas MSMs involve fitting marginal rather than conditional hazard models. We provide general results that enable the form of the correctly specified MSM to be derived based on a conditional data generating procedure, and show how the results can be applied when the conditional hazard model is an Aalen additive hazard or Cox model. Using conditional additive hazard models is advantageous because they imply additive MSMs that can be fitted using standard software. We describe and illustrate a simulation algorithm. Our results will help researchers to effectively evaluate causal inference methods via simulation.


Subject(s)
Statistical Models, Computer Simulation, Structural Models, Proportional Hazards Models

7.
Stat Med ; 39(22): 2921-2935, 2020 09 30.
Article in English | MEDLINE | ID: mdl-32677726

ABSTRACT

We develop and demonstrate methods to perform sensitivity analyses to assess sensitivity to plausible departures from missing at random in incomplete repeated binary outcome data. We use multiple imputation in the not at random fully conditional specification framework, which includes one or more sensitivity parameters (SPs) for each incomplete variable. The use of an online elicitation questionnaire is demonstrated to obtain expert opinion on the SPs, and highest prior density regions are used alongside opinion pooling methods to display credible regions for SPs. We demonstrate that substantive conclusions can be far more sensitive to departures from the missing at random assumption (MAR) when control and intervention nonresponders depart from MAR differently, and show that the correlation of arm-specific SPs in expert opinion is particularly important. We illustrate these methods on the iQuit in Practice smoking cessation trial, which compared the impact of a tailored text messaging system versus standard care on smoking cessation. We show that conclusions about the effect of intervention on smoking cessation outcomes at 8 weeks and 6 months are broadly insensitive to departures from MAR, with conclusions significantly affected only when the differences in behavior between the nonresponders in the two trial arms are larger than expert opinion judges to be realistic.


Subject(s)
Research Design, Smoking Cessation, Statistical Data Interpretation, Humans, Surveys and Questionnaires

8.
Biostatistics ; 19(4): 407-425, 2018 10 01.
Article in English | MEDLINE | ID: mdl-29028922

ABSTRACT

Cohort data are often incomplete because some subjects drop out of the study, and inverse probability weighting (IPW), multiple imputation (MI), and linear increments (LI) are methods that deal with such missing data. In cohort studies of ageing, missing data can arise from dropout or death. Methods that do not distinguish between these reasons for missingness typically provide inference about a hypothetical cohort where no one can die (immortal cohort). It has been suggested that inference about the cohort composed of those who are still alive at any time point (partly conditional inference) may be more meaningful. MI, LI, and IPW can all be adapted to provide partly conditional inference. In this article, we clarify and compare the assumptions required by these MI, LI, and IPW methods for partly conditional inference on continuous outcomes. We also propose augmented IPW estimators for making partly conditional inference. These are more efficient than IPW estimators and more robust to model misspecification. Our simulation studies show that the methods give approximately unbiased estimates of partly conditional estimands when their assumptions are met, but may be biased otherwise. We illustrate the application of the missing data methods using data from the 'Origins of Variance in the Old-old' Twin study.


Subject(s)
Biomedical Research/methods, Biostatistics/methods, Cohort Studies, Statistical Data Interpretation, Statistical Models, Research Design, Humans

9.
Epidemiology ; 30(1): 29-37, 2019 01.
Article in English | MEDLINE | ID: mdl-30234550

ABSTRACT

BACKGROUND: Cystic fibrosis (CF) is an inherited, chronic, progressive condition affecting around 10,000 individuals in the United Kingdom and over 70,000 worldwide. Survival in CF has improved considerably over recent decades, and it is important to provide up-to-date information on patient prognosis. METHODS: The UK Cystic Fibrosis Registry is a secure centralized database, which collects annual data on almost all CF patients in the United Kingdom. Data from 43,592 annual records from 2005 to 2015 on 6181 individuals were used to develop a dynamic survival prediction model that provides personalized estimates of survival probabilities given a patient's current health status using 16 predictors. We developed the model using the landmarking approach, giving predicted survival curves up to 10 years from 18 to 50 years of age. We compared several models using cross-validation. RESULTS: The final model has good discrimination (C-indexes: 0.873, 0.843, and 0.804 for 2-, 5-, and 10-year survival prediction) and low prediction error (Brier scores: 0.036, 0.076, and 0.133). It identifies individuals at low and high risk of short- and long-term mortality based on their current status. For patients 20 years of age during 2013-2015, for example, over 80% had a greater than 95% probability of 2-year survival and 40% were predicted to survive 10 years or more. CONCLUSIONS: Dynamic personalized prediction models can guide treatment decisions and provide personalized information for patients. Our application illustrates the utility of the landmarking approach for making the best use of longitudinal and survival data and shows how models can be defined and compared in terms of predictive performance.
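
The Brier scores quoted in this abstract measure prediction error as a mean squared difference between predicted probabilities and observed outcomes. The following is a minimal sketch of that calculation for a fully observed binary outcome; the function name `brier_score` and the toy numbers are illustrative assumptions, and the sketch ignores censoring, which an evaluation of survival predictions would normally have to handle (for example by weighting).

```python
from typing import Sequence


def brier_score(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Hypothetical helper for illustration; assumes every outcome is observed
    (no censoring), which is a simplification of the survival setting.
    """
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("predicted and observed must be non-empty and equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted, observed)]
    return sum(squared_errors) / len(squared_errors)


# Toy example: predicted probabilities of surviving 2 years, and whether each
# person actually survived (1) or not (0). The values are made up.
example_predictions = [0.9, 0.1, 0.8, 0.3]
example_outcomes = [1, 0, 1, 0]
example_score = brier_score(example_predictions, example_outcomes)
```

Lower values indicate better predictions; a model that always predicts 0.5 scores 0.25 on any binary outcome.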


Subject(s)
Cystic Fibrosis/mortality, Statistical Models, Adult, Cohort Studies, Female, Humans, Male, Middle Aged, Probability, Prognosis, Registries, United Kingdom/epidemiology

10.
Stat Sci ; 33(2): 184-197, 2018.
Article in English | MEDLINE | ID: mdl-29731541

ABSTRACT

Most methods for handling incomplete data can be broadly classified as inverse probability weighting (IPW) strategies or imputation strategies. The former model the occurrence of incomplete data; the latter, the distribution of the missing variables given observed variables in each missingness pattern. Imputation strategies are typically more efficient, but they can involve extrapolation, which is difficult to diagnose and can lead to large bias. Double robust (DR) methods combine the two approaches. They are typically more efficient than IPW and more robust to model misspecification than imputation. We give a formal introduction to DR estimation of the mean of a partially observed variable, before moving to more general incomplete-data scenarios. We review strategies to improve the performance of DR estimators under model misspecification, reveal connections between DR estimators for incomplete data and 'design-consistent' estimators used in sample surveys, and explain the value of double robustness when using flexible data-adaptive methods for IPW or imputation.
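
As an illustration of how a double robust estimator combines the two approaches, the sketch below computes an augmented IPW estimate of the mean of a partially observed variable. It is not taken from the article: it assumes a single discrete covariate `x`, missingness at random given `x`, and saturated (per-stratum) working models for both the probability of being observed and the outcome mean. The name `dr_mean` and the data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

# Each record is (covariate x, outcome y), with y = None when it is missing.
Record = Tuple[int, Optional[float]]


def dr_mean(records: Sequence[Record]) -> float:
    """Augmented IPW (double robust) estimate of the mean of y.

    Assumed working models, both saturated in x:
      pi(x) = proportion of records in stratum x with y observed
      m(x)  = mean of the observed y in stratum x
    Estimator: average over all records of m(x) + r * (y - m(x)) / pi(x),
    where r indicates that y is observed.
    """
    n_by_x: Dict[int, int] = defaultdict(int)
    observed_by_x: Dict[int, List[float]] = defaultdict(list)
    for x, y in records:
        n_by_x[x] += 1
        if y is not None:
            observed_by_x[x].append(y)

    pi: Dict[int, float] = {}
    m: Dict[int, float] = {}
    for x, n in n_by_x.items():
        ys = observed_by_x[x]
        if not ys:
            raise ValueError(f"no observed outcomes in stratum x={x}")
        pi[x] = len(ys) / n
        m[x] = sum(ys) / len(ys)

    total = 0.0
    for x, y in records:
        total += m[x]                      # imputation (outcome model) part
        if y is not None:
            total += (y - m[x]) / pi[x]    # inverse-probability-weighted residual
    return total / len(records)


# Made-up data: stratum x=0 has half its outcomes missing, stratum x=1 has none.
toy = [(0, 1.0), (0, 3.0), (0, None), (0, None), (1, 10.0), (1, 12.0)]
toy_estimate = dr_mean(toy)
```

With saturated working models, as here, the weighted residual terms cancel within each stratum, so the estimate coincides with both the pure IPW and the pure imputation estimates. The double robustness property matters when the two working models are parametric and one of them may be misspecified.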

11.
Biometrics ; 74(4): 1427-1437, 2018 12.
Article in English | MEDLINE | ID: mdl-29772074

ABSTRACT

We propose semi-parametric methods to model cohort data where repeated outcomes may be missing due to death and non-ignorable dropout. Our focus is to obtain inference about the cohort composed of those who are still alive at any time point (partly conditional inference). We propose: i) an inverse probability weighted method that upweights observed subjects to represent subjects who are still alive but are not observed; ii) an outcome regression method that replaces missing outcomes of subjects who are alive with their conditional mean outcomes given past observed data; and iii) an augmented inverse probability method that combines the previous two methods and is double robust against model misspecification. These methods are described for both monotone and non-monotone missing data patterns, and are applied to a cohort of elderly adults from the Health and Retirement Study. Sensitivity analysis to departures from the assumption that missingness at some visit t is independent of the outcome at visit t given past observed data and time of death is used in the data application.


Subject(s)
Biometry/methods, Computer Simulation/statistics & numerical data, Regression Analysis, Aged, Aged 80 and over, Bias, Cohort Studies, Death, Humans, Longitudinal Studies, Probability

12.
Biometrics ; 74(4): 1438-1449, 2018 12.
Article in English | MEDLINE | ID: mdl-29870056

ABSTRACT

The nested case-control and case-cohort designs are two main approaches for carrying out a substudy within a prospective cohort. This article adapts multiple imputation (MI) methods for handling missing covariates in full-cohort studies for nested case-control and case-cohort studies. We consider data missing by design and data missing by chance. MI analyses that make use of full-cohort data and MI analyses based on substudy data only are described, alongside an intermediate approach in which the imputation uses full-cohort data but the analysis uses only the substudy. We describe adaptations to two imputation methods: the approximate method (MI-approx) of White and Royston (2009) and the "substantive model compatible" (MI-SMC) method of Bartlett et al. (2015). We also apply the "MI matched set" approach of Seaman and Keogh (2015) to nested case-control studies, which does not require any full-cohort information. The methods are investigated using simulation studies and all perform well when their assumptions hold. Substantial gains in efficiency can be made by imputing data missing by design using the full-cohort approach or by imputing data missing by chance in analyses using the substudy only. The intermediate approach brings greater gains in efficiency relative to the substudy approach and is more robust to imputation model misspecification than the full-cohort approach. The methods are illustrated using the ARIC Study cohort. Supplementary Materials provide R and Stata code.


Subject(s)
Biometry/methods, Case-Control Studies, Cohort Studies, Computer Simulation/statistics & numerical data, Statistical Data Interpretation, Humans

13.
Biometrics ; 71(4): 1150-9, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26237003

ABSTRACT

Analysis of matched case-control studies is often complicated by missing data on covariates. Analysis can be restricted to individuals with complete data, but this is inefficient and may be biased. Multiple imputation (MI) is an efficient and flexible alternative. We describe two MI approaches. The first uses a model for the data on an individual and includes matching variables; the second uses a model for the data on a whole matched set and avoids the need to model the matching variables. Within each approach, we consider three methods: full-conditional specification (FCS), joint model MI using a normal model, and joint model MI using a latent normal model. We show that FCS MI is asymptotically equivalent to joint model MI using a restricted general location model that is compatible with the conditional logistic regression analysis model. The normal and latent normal imputation models are not compatible with this analysis model. All methods allow for multiple partially-observed covariates, non-monotone missingness, and multiple controls per case. They can be easily applied in standard statistical software and valid variance estimates obtained using Rubin's Rules. We compare the methods in a simulation study. The approach of including the matching variables is most efficient. Within each approach, the FCS MI method generally yields the least-biased odds ratio estimates, but normal or latent normal joint model MI is sometimes more efficient. All methods have good confidence interval coverage. Data on colorectal cancer and fibre intake from the EPIC-Norfolk study are used to illustrate the methods, in particular showing how efficiency is gained relative to just using individuals with complete data.
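
The abstract mentions obtaining variance estimates with Rubin's Rules. As a rough sketch of what that pooling step involves for a single scalar parameter (for example a log odds ratio), the helper below combines estimates from several imputed datasets. The function name `pool_rubin` and the numbers are illustrative assumptions, not code from the article.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Pool one scalar estimate across m imputed datasets using Rubin's Rules.

    estimates: the point estimate from each imputed dataset
    variances: the squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)        # average of the m point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up results from three imputed datasets.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The square root of the total variance gives the pooled standard error; interval construction additionally needs the associated degrees of freedom, which this sketch omits.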


Subject(s)
Case-Control Studies, Statistical Data Interpretation, Biometry/methods, Colorectal Neoplasms/etiology, Computer Simulation, Confidence Intervals, Dietary Fiber/administration & dosage, Disease/etiology, Humans, Statistical Models, Odds Ratio, Risk Factors

14.
Paediatr Perinat Epidemiol ; 29(6): 567-75, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26332368

ABSTRACT

BACKGROUND: Informative birth size occurs when the average outcome depends on the number of infants per birth. Although analysis methods have been proposed for handling informative birth size, their performance is not well understood. Our aim was to evaluate the performance of these methods and to provide recommendations for their application in randomised trials including infants from single and multiple births. METHODS: Three generalised estimating equation (GEE) approaches were considered for estimating the effect of treatment on a continuous or binary outcome: cluster weighted GEEs, which produce treatment effects with a mother-level interpretation when birth size is informative; standard GEEs with an independence working correlation structure, which produce treatment effects with an infant-level interpretation when birth size is informative; and standard GEEs with an exchangeable working correlation structure, which do not account for informative birth size. The methods were compared through simulation and analysis of an example dataset. RESULTS: Treatment effect estimates were affected by informative birth size in the simulation study when the effect of treatment in singletons differed from that in multiples (i.e. in the presence of a treatment group by multiple birth interaction). The strength of evidence supporting the effectiveness of treatment varied between methods in the example dataset. CONCLUSIONS: Informative birth size is always a possibility in randomised trials including infants from both single and multiple births, and analysis methods should be pre-specified with this in mind. We recommend estimating treatment effects using standard GEEs with an independence working correlation structure to give an infant-level interpretation.
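
The difference between the infant-level and mother-level interpretations can be seen in the simplest possible case, a mean outcome in one group with no covariates. In that intercept-only setting, a GEE with an independence working correlation reduces to the average over all infants, while a cluster-weighted GEE (each infant weighted by the inverse of its birth size) reduces to the average of per-birth means. The sketch below illustrates this reduction only; the function names and birthweights are made up and nothing here reproduces the article's analysis.

```python
from typing import List, Sequence


def infant_level_mean(births: Sequence[Sequence[float]]) -> float:
    """Average over all infants: every infant counts equally."""
    outcomes: List[float] = [y for birth in births for y in birth]
    return sum(outcomes) / len(outcomes)


def mother_level_mean(births: Sequence[Sequence[float]]) -> float:
    """Average of per-birth means: every birth (mother) counts equally."""
    per_birth = [sum(birth) / len(birth) for birth in births]
    return sum(per_birth) / len(per_birth)


# Hypothetical birthweights in kg: two singletons, one twin pair, one triplet set.
# Outcomes are lower in larger births, so birth size is informative.
toy_births = [[3.5], [3.4], [2.4, 2.6], [2.0, 2.2, 2.1]]
infant_estimate = infant_level_mean(toy_births)
mother_estimate = mother_level_mean(toy_births)
```

Because multiples contribute more infants and have lower outcomes in this toy example, the infant-level mean is lower than the mother-level mean. When birth size is not informative, the two targets agree.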


Subject(s)
Fetal Growth Retardation/epidemiology, Low Birth Weight Infant, Premature Infant, Multiple Pregnancy/statistics & numerical data, Premature Birth/epidemiology, Adult, Female, Humans, Newborn Infant, Male, Population Surveillance, Pregnancy, Randomized Controlled Trials as Topic, Reference Standards

15.
Am J Epidemiol ; 180(3): 318-24, 2014 Aug 01.
Article in English | MEDLINE | ID: mdl-24966219

ABSTRACT

The C statistic is a commonly reported measure of screening test performance. Optimistic estimation of the C statistic is a frequent problem because of overfitting of statistical models in small data sets, and methods exist to correct for this issue. However, many studies do not use such methods, and those that do correct for optimism use diverse methods, some of which are known to be biased. We used clinical data sets (United Kingdom Down syndrome screening data from Glasgow (1991-2003), Edinburgh (1999-2003), and Cambridge (1990-2006), as well as Scottish national pregnancy discharge data (2004-2007)) to evaluate different approaches to adjustment for optimism. We found that sample splitting, cross-validation without replication, and leave-1-out cross-validation produced optimism-adjusted estimates of the C statistic that were biased and/or associated with greater absolute error than other available methods. Cross-validation with replication, bootstrapping, and a new method (leave-pair-out cross-validation) all generated unbiased optimism-adjusted estimates of the C statistic and had similar absolute errors in the clinical data set. Larger simulation studies confirmed that all 3 methods performed similarly with 10 or more events per variable, or when the C statistic was 0.9 or greater. However, with lower events per variable or lower C statistics, bootstrapping tended to be optimistic but with lower absolute and mean squared errors than both methods of cross-validation.
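
For readers unfamiliar with the quantity being adjusted, the C statistic for a binary outcome is the proportion of case/non-case pairs in which the case receives the higher predicted score, with ties counted as one half. The sketch below computes this apparent (unadjusted) value by direct pair counting; the name `c_statistic` and the data are illustrative assumptions. It does not implement any of the optimism corrections compared in the article.

```python
from typing import Sequence


def c_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Apparent C statistic: concordance between scores and 0/1 labels.

    Counts, over all (case, non-case) pairs, how often the case has the
    higher score; tied scores contribute one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                concordant += 1.0
            elif case_score == other_score:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Made-up predicted risks and outcomes.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.4]
toy_labels = [1, 0, 1, 0, 0]
apparent_c = c_statistic(toy_scores, toy_labels)
```

When the scores come from a model fitted to the same data, this apparent value tends to be too high, which is the optimism the abstract refers to.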


Subject(s)
Statistical Data Interpretation, Factual Databases, Statistical Models, Down Syndrome, Epidemiologic Methods, Humans, Logistic Models, Multivariate Analysis, ROC Curve

17.
Biometrics ; 70(2): 449-56, 2014 Jun.
Article in English | MEDLINE | ID: mdl-24479899

ABSTRACT

Clustered data commonly arise in epidemiology. We assume each cluster member has an outcome Y and covariates X. When there are missing data in Y, the distribution of Y given X in all cluster members ("complete clusters") may be different from the distribution just in members with observed Y ("observed clusters"). Often the former is of interest, but when data are missing because in a fundamental sense Y does not exist (e.g., quality of life for a person who has died), the latter may be more meaningful (quality of life conditional on being alive). Weighted and doubly weighted generalized estimating equations and shared random-effects models have been proposed for observed-cluster inference when cluster size is informative, that is, the distribution of Y given X in observed clusters depends on observed cluster size. We show these methods can be seen as actually giving inference for complete clusters and may not also give observed-cluster inference. This is true even if observed clusters are complete in themselves rather than being the observed part of larger complete clusters: here methods may describe imaginary complete clusters rather than the observed clusters. We show under which conditions shared random-effects models proposed for observed-cluster inference do actually describe members with observed Y. A psoriatic arthritis dataset is used to illustrate the danger of misinterpreting estimates from shared random-effects models.


Subject(s)
Biometry/methods, Cluster Analysis, Epidemiologic Methods, Psoriatic Arthritis/epidemiology, Female, Humans, Male, Statistical Models

18.
Stat Med ; 33(1): 88-104, 2014 Jan 15.
Article in English | MEDLINE | ID: mdl-23922236

ABSTRACT

We are concerned with multiple imputation of the ratio of two variables, which is to be used as a covariate in a regression analysis. If the numerator and denominator are not missing simultaneously, it seems sensible to make use of the observed variable in the imputation model. One such strategy is to impute missing values for the numerator and denominator, or the log-transformed numerator and denominator, and then calculate the ratio of interest; we call this 'passive' imputation. Alternatively, missing ratio values might be imputed directly, with or without the numerator and/or the denominator in the imputation model; we call this 'active' imputation. In two motivating datasets, one involving body mass index as a covariate and the other involving the ratio of total to high-density lipoprotein cholesterol, we assess the sensitivity of results to the choice of imputation model and, as an alternative, explore fully Bayesian joint models for the outcome and incomplete ratio. Fully Bayesian approaches using WinBUGS were unusable in both datasets because of computational problems. In our first dataset, multiple imputation results are similar regardless of the imputation model; in the second, results are sensitive to the choice of imputation model. Sensitivity depends strongly on the coefficient of variation of the ratio's denominator. A simulation study demonstrates that passive imputation without transformation is risky because it can lead to downward bias when the coefficient of variation of the ratio's denominator is larger than about 0.1. Active imputation or passive imputation after log-transformation is preferable.


Subject(s)
Bayes Theorem, Statistical Models, Regression Analysis, Body Mass Index, CD4 Lymphocyte Count, Cholesterol/blood, Cohort Studies, Computer Simulation, Female, HIV Infections/blood, HIV Infections/drug therapy, Hemoglobins/analysis, Humans, Male, Neoplasms/metabolism, South Africa

19.
BMC Med Res Methodol ; 14: 28, 2014 Feb 21.
Article in English | MEDLINE | ID: mdl-24559129

ABSTRACT

BACKGROUND: Chained equations imputation is widely used in medical research. It uses a set of conditional models, so is more flexible than joint modelling imputation for the imputation of different types of variables (e.g. binary, ordinal or unordered categorical). However, chained equations imputation does not correspond to drawing from a joint distribution when the conditional models are incompatible. Concurrently with our work, other authors have shown the equivalence of the two imputation methods in finite samples. METHODS: Taking a different approach, we prove, in finite samples, sufficient conditions for chained equations and joint modelling to yield imputations from the same predictive distribution. Further, we apply this proof in four specific cases and conduct a simulation study which explores the consequences when the conditional models are compatible but the conditions otherwise are not satisfied. RESULTS: We provide an additional "non-informative margins" condition which, together with compatibility, is sufficient. We show that the non-informative margins condition is not satisfied, despite compatible conditional models, in a situation as simple as two continuous variables and one binary variable. Our simulation study demonstrates that as a consequence of this violation order effects can occur; that is, systematic differences depending upon the ordering of the variables in the chained equations algorithm. However, the order effects appear to be small, especially when associations between variables are weak. CONCLUSIONS: Since chained equations is typically used in medical research for datasets with different types of variables, researchers must be aware that order effects are likely to be ubiquitous, but our results suggest they may be small enough to be negligible.


Subject(s)
Biomedical Research/methods, Computer Simulation, Statistical Models, Algorithms, Humans, Biological Models, Statistics as Topic

20.
Stat Med ; 32(26): 4639-50, 2013 Nov 20.
Article in English | MEDLINE | ID: mdl-23776143

ABSTRACT

In phase II cancer trials, tumour response is either the primary or an important secondary endpoint. Tumour response is a binary composite endpoint determined, according to the Response Evaluation Criteria in Solid Tumors, by (1) whether the percentage change in tumour size is greater than a prescribed threshold and (2) (binary) criteria such as whether a patient develops new lesions. Further binary criteria, such as death or serious toxicity, may be added to these criteria. The probability of tumour response (i.e. 'success' on the composite endpoint) would usually be estimated simply as the proportion of successes among patients. This approach uses the tumour size variable only through a discretised form, namely whether or not it is above the threshold. In this article, we propose a method that also estimates the probability of success but that gains precision by using the information on the undiscretised (i.e. continuous) tumour size variable. This approach can also be used to increase the power to detect a difference between the probabilities of success under two different treatments in a comparative trial. We demonstrate these increases in precision and power using simulated data. We also apply the method to real data from a phase II cancer trial and show that it results in a considerably narrower confidence interval for the probability of tumour response.


Subject(s)
Antineoplastic Agents/therapeutic use, Phase II Clinical Trials as Topic/methods, Statistical Data Interpretation, Neoplasms/drug therapy, Capecitabine, Computer Simulation, Deoxycytidine/adverse effects, Deoxycytidine/analogs & derivatives, Fluorouracil/adverse effects, Fluorouracil/analogs & derivatives, Hand-Foot Syndrome/etiology, Humans, Neoplasms/pathology