ABSTRACT
The estimands framework outlined in ICH E9 (R1) describes the components needed to precisely define the effects to be estimated in clinical trials, including how post-baseline 'intercurrent' events (IEs) are to be handled. In late-stage clinical trials, it is common to handle IEs like 'treatment discontinuation' using the treatment policy strategy and target the treatment effect on outcomes regardless of treatment discontinuation. For continuous repeated measures, this type of effect is often estimated using all observed data before and after discontinuation using either a mixed model for repeated measures (MMRM) or multiple imputation (MI) to handle any missing data. In basic form, both these estimation methods ignore treatment discontinuation in the analysis and therefore may be biased if patient outcomes after treatment discontinuation differ from those of patients still assigned to treatment and missing data are more common for patients who have discontinued treatment. We therefore propose and evaluate a set of MI models that can accommodate differences between outcomes before and after treatment discontinuation. The models are evaluated in the context of planning a Phase 3 trial for a respiratory disease. We show that analyses ignoring treatment discontinuation can introduce substantial bias and can sometimes underestimate variability. We also show that some of the MI models proposed can successfully correct the bias, but inevitably lead to increases in variance. We conclude that some of the proposed MI models are preferable to the traditional analysis ignoring treatment discontinuation, but the precise choice of MI model will likely depend on the trial design, disease of interest and amount of observed and missing data following treatment discontinuation.
ABSTRACT
BACKGROUND: Older people admitted to hospital in an emergency often have prolonged inpatient stays that worsen their outcomes, increase health-care costs, and reduce bed availability. Growing evidence suggests that the biopsychosocial complexity of their problems, which include cognitive impairment, depression, anxiety, multiple medical illnesses, and care needs resulting from functional dependency, prolongs hospital stays by making medical treatment less efficient and the planning of post-discharge care more difficult. We aimed to assess the effects of enhancing older inpatients' care with Proactive Integrated Consultation-Liaison Psychiatry (PICLP) in The HOME Study. We have previously described the benefits of PICLP reported by patients and clinicians. In this Article, we report the effectiveness and cost-effectiveness of PICLP-enhanced care, compared with usual care alone, in reducing time in hospital. METHODS: We did a parallel-group, multicentre, randomised controlled trial in 24 medical wards of three English acute general hospitals. Patients were eligible to take part if they were 65 years or older, had been admitted in an emergency, and were expected to remain in hospital for at least 2 days from the time of enrolment. Participants were randomly allocated to PICLP or usual care in a 1:1 ratio by a database software algorithm that used stratification by hospital, sex, and age, and randomly selected block sizes to ensure allocation concealment. PICLP clinicians (consultation-liaison psychiatrists supported by assisting clinicians) made proactive biopsychosocial assessments of patients' problems, then delivered discharge-focused care as integrated members of ward teams. The primary outcome was time spent as an inpatient (during the index admission and any emergency readmissions) in the 30 days post-randomisation. 
Secondary outcomes were the rate of discharge from hospital for the total length of the index admission; discharge destination; the length of the index admission after random allocation truncated at 30 days; the number of emergency readmissions to hospital, the number of days spent as an inpatient in an acute general hospital, and the rate of death in the year after random allocation; the patient's experience of the hospital stay; their view on the length of the hospital stay; anxiety (Generalized Anxiety Disorder-2); depression (Patient Health Questionnaire-2); cognitive function (Montreal Cognitive Assessment-Telephone version); independent functioning (Barthel Index of Activities of Daily Living); health-related quality of life (five-level EuroQol five-dimension questionnaire); and overall quality of life. Statisticians and data collectors were masked to treatment allocation; participants and ward staff could not be. Analyses were intention-to-treat. The trial had a patient and public involvement panel and was registered with ISRCTN (ISRCTN86120296). FINDINGS: 2744 participants (1399 [51·0%] male and 1345 [49·0%] female) were enrolled between May 2, 2018, and March 5, 2020; 1373 were allocated to PICLP and 1371 to usual care. Participants' mean age was 82·3 years (SD 8·2) and 2565 (93·5%) participants were White. The mean time spent in hospital in the 30 days post-randomisation (analysed for 2710 [98·8%] participants) was 11·37 days (SD 8·74) with PICLP and 11·85 days (SD 9·00) with usual care; adjusted mean difference -0·45 (95% CI -1·11 to 0·21; p=0·18). The only statistically and clinically significant difference in secondary outcomes was the rate of discharge, which was 8·5% higher (rate ratio 1·09 [95% CI 1·00 to 1·17]; p=0·042) with PICLP, a difference most apparent in patients who stayed for more than 2 weeks. Compared with usual care, PICLP was estimated to be modestly cost-saving and cost-effective over 1 and 3, but not 12, months. 
No intervention-related serious adverse events occurred. INTERPRETATION: This is the first randomised controlled trial of PICLP. PICLP is experienced by older medical inpatients and ward staff as enhancing medical care. It is also likely to be cost-saving in the short-term. Although the trial does not provide strong evidence that PICLP reduces time in hospital, it does support and inform its future development and evaluation. FUNDING: UK National Institute for Health and Care Research.
Subject(s)
Length of Stay , Referral and Consultation , Humans , Female , Male , Aged , England , Length of Stay/statistics & numerical data , Cost-Benefit Analysis , Aged, 80 and over , Inpatients/psychology , Hospitalization , Mental Disorders/therapy
ABSTRACT
In clinical settings with no commonly accepted standard-of-care, multiple treatment regimens are potentially useful, but some treatments may not be appropriate for some patients. A personalized randomized controlled trial (PRACTical) design has been proposed for this setting. For a network of treatments, each patient is randomized only among treatments which are appropriate for them. The aim is to produce treatment rankings that can inform clinical decisions about treatment choices for individual patients. Here we propose methods for determining sample size in a PRACTical design, since standard power-based methods are not applicable. We derive a sample size by evaluating information gained from trials of varying sizes. For a binary outcome, we quantify how many adverse outcomes would be prevented by choosing the top-ranked treatment for each patient based on trial results rather than choosing a random treatment from the appropriate personalized randomization list. In simulations, we evaluate three performance measures: mean reduction in adverse outcomes using sample information, proportion of simulated patients for whom the top-ranked treatment performed as well or almost as well as the best appropriate treatment, and proportion of simulated trials in which the top-ranked treatment performed better than a randomly chosen treatment. We apply the methods to a trial evaluating eight different combination antibiotic regimens for neonatal sepsis (NeoSep1), in which a PRACTical design addresses varying patterns of antibiotic choice based on disease characteristics and resistance. Our proposed approach produces results that are more relevant to complex decision making by clinicians and policy makers.
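As a minimal sketch of the randomisation step described above, the following Python snippet allocates each patient at random among only the treatments that are appropriate for them. The treatment labels, the example patients, and the use of equal allocation probabilities are illustrative assumptions; none of them is taken from the NeoSep1 design.

```python
import random

# Hypothetical treatment network; labels are placeholders, not NeoSep1 regimens.
TREATMENTS = {"A", "B", "C", "D"}

def randomise(eligible, rng):
    """Allocate one patient among their appropriate treatments.

    `eligible` is the patient's personalised randomisation list.
    Equal allocation probabilities are an assumption of this sketch.
    """
    eligible = set(eligible)
    if not eligible:
        raise ValueError("patient has no appropriate treatment")
    if not eligible <= TREATMENTS:
        raise ValueError("unknown treatment in randomisation list")
    # Sort first so the draw does not depend on set iteration order.
    return rng.choice(sorted(eligible))

rng = random.Random(2024)

# Each hypothetical patient has their own personalised randomisation list.
patients = {
    "p1": {"A", "B", "C"},
    "p2": {"B", "D"},
    "p3": {"A", "B", "C", "D"},
}
allocation = {pid: randomise(elig, rng) for pid, elig in patients.items()}
```

Because patients share some but not all treatments, their allocations connect the treatments into a network, which is what allows rankings across treatments that were never all available to a single patient.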
Subject(s)
Precision Medicine , Randomized Controlled Trials as Topic , Humans , Randomized Controlled Trials as Topic/methods , Sample Size , Precision Medicine/methods , Computer Simulation , Infant, Newborn , Sepsis/drug therapy , Models, Statistical
ABSTRACT
BACKGROUND: Risk prediction models are routinely used to assist in clinical decision making. A small sample size for model development can compromise model performance when the model is applied to new patients. For binary outcomes, the calibration slope (CS) and the mean absolute prediction error (MAPE) are two key measures on which sample size calculations for the development of risk models have been based. CS quantifies the degree of model overfitting while MAPE assesses the accuracy of individual predictions. METHODS: Recently, two formulae were proposed to calculate the sample size required, given anticipated features of the development data such as the outcome prevalence and c-statistic, to ensure that the expectation of the CS and MAPE (over repeated samples) in models fitted using MLE will meet prespecified target values. In this article, we use a simulation study to evaluate the performance of these formulae. RESULTS: We found that both formulae work reasonably well when the anticipated model strength is not too high (c-statistic < 0.8), regardless of the outcome prevalence. However, for higher model strengths the CS formula underestimates the sample size substantially. For example, for c-statistic = 0.85 and 0.9, the sample size needed to be increased by at least 50% and 100%, respectively, to meet the target expected CS. On the other hand, the MAPE formula tends to overestimate the sample size for high model strengths. These conclusions were more pronounced for higher prevalence than for lower prevalence. Similar results were drawn when the outcome was time to event with censoring. Given these findings, we propose a simulation-based approach, implemented in the new R package 'samplesizedev', to correctly estimate the sample size even for high model strengths. The software can also calculate the variability in CS and MAPE, thus allowing for assessment of model stability. 
CONCLUSIONS: The calibration and MAPE formulae suggest sample sizes that are generally appropriate for use when the model strength is not too high. However, they tend to be biased for higher model strengths, which are not uncommon in clinical risk prediction studies. On those occasions, our proposed adjustments to the sample size calculations will be relevant.
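The two performance measures named above can be illustrated with a small sketch. It assumes a hypothetical developed model whose linear predictor is 1.5 times too extreme (an overfitted model), and it uses a hand-written Newton-Raphson logistic fit rather than the 'samplesizedev' package, whose interface is not shown here. The calibration slope is estimated by regressing the outcome on the model's linear predictor in new data, and MAPE is computed as the mean absolute difference between predicted and true probabilities, which is only possible in simulation where the true probabilities are known.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; returns [b0, b1]."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = expit(X @ beta)
        w = p * (1.0 - p)
        grad = X.T @ (y - p)               # score vector
        hess = X.T @ (X * w[:, None])      # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(1)
n = 20000                                  # large hypothetical validation sample

# True linear predictor and outcomes (values chosen for illustration only).
true_lp = rng.normal(-1.0, 1.0, size=n)
y = rng.binomial(1, expit(true_lp)).astype(float)

# Hypothetical overfitted model: its predictions are too extreme.
pred_lp = 1.5 * true_lp

# Calibration slope: slope from regressing the outcome on the predicted
# linear predictor. A value below 1 signals overfitting.
calibration_slope = fit_logistic(pred_lp, y)[1]

# MAPE: mean absolute difference between predicted and true probabilities.
mape = np.mean(np.abs(expit(pred_lp) - expit(true_lp)))
```

In this setup the calibration slope should be close to 1/1.5, since shrinking the predicted linear predictor by that factor recovers the true one.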
Subject(s)
Models, Statistical , Humans , Sample Size , Risk Assessment/methods , Risk Assessment/statistics & numerical data , Computer Simulation , Algorithms
ABSTRACT
Targeted maximum likelihood estimation (TMLE) is increasingly used for doubly robust causal inference, but how missing data should be handled when using TMLE with data-adaptive approaches is unclear. Based on data (1992-1998) from the Victorian Adolescent Health Cohort Study, we conducted a simulation study to evaluate 8 missing-data methods in this context: complete-case analysis, extended TMLE incorporating an outcome-missingness model, the missing covariate missing indicator method, and 5 multiple imputation (MI) approaches using parametric or machine-learning models. We considered 6 scenarios that varied in terms of exposure/outcome generation models (presence of confounder-confounder interactions) and missingness mechanisms (whether outcome influenced missingness in other variables and presence of interaction/nonlinear terms in missingness models). Complete-case analysis and extended TMLE had small biases when outcome did not influence missingness in other variables. Parametric MI without interactions had large bias when exposure/outcome generation models included interactions. Parametric MI including interactions performed best in bias and variance reduction across all settings, except when missingness models included a nonlinear term. When choosing a method for handling missing data in the context of TMLE, researchers must consider the missingness mechanism and, for MI, compatibility with the analysis method. In many settings, a parametric MI approach that incorporates interactions and nonlinearities is expected to perform well.
Subject(s)
Causality , Humans , Likelihood Functions , Adolescent , Data Interpretation, Statistical , Bias , Models, Statistical , Computer Simulation
ABSTRACT
Frequent use of methylchloroisothiazolinone/methylisothiazolinone (MCI/MI) and MI in cosmetic products has been the main cause of widespread sensitization and allergic contact dermatitis to these preservatives (biocides). Their use in non-cosmetic products is also an important source of sensitization. Less is known about sensitization rates and use of benzisothiazolinone (BIT), octylisothiazolinone (OIT), and dichlorooctylisothiazolinone (DCOIT), which have never been permitted in cosmetic products in Europe. BIT and OIT have occasionally been routinely patch-tested. These preservatives are often used together in chemical products and articles. In this study, we review the occurrence of contact allergy to MI, BIT, OIT, and DCOIT over time, based on concomitant patch testing in large studies, and case reports. We review EU legislations, and we discuss the role of industry, regulators, and dermatology in prevention of sensitization and protection of health. The frequency of contact allergy to MI, BIT, and OIT has increased. The frequency of contact allergy to DCOIT is not known because it has seldom been patch-tested. Label information on isothiazolinones in chemical products and articles, irrespective of concentration, is required for assessment of relevance, information to patients, and avoidance of exposure and allergic contact dermatitis.
Subject(s)
Cosmetics , Dermatitis, Allergic Contact , Disinfectants , Thiazoles , Humans , Dermatitis, Allergic Contact/epidemiology , Dermatitis, Allergic Contact/etiology , Dermatitis, Allergic Contact/prevention & control , Cosmetics/adverse effects , Disinfectants/adverse effects , Europe/epidemiology , Preservatives, Pharmaceutical/adverse effects , Patch Tests/adverse effects
ABSTRACT
INTRODUCTION: Network meta-analyses (NMAs) have gained popularity and grown in number due to their ability to provide estimates of the comparative effectiveness of multiple treatments for the same condition. The aim of this study is to conduct a methodological review to compile a preliminary list of concepts related to bias in NMAs. METHODS AND ANALYSIS: We included papers that present items related to bias, reporting or methodological quality, papers assessing the quality of NMAs, or method papers. We searched MEDLINE, the Cochrane Library and unpublished literature (up to July 2020). We extracted items related to bias in NMAs. An item was excluded if it related to general systematic review quality or bias and was included in currently available tools such as ROBIS or AMSTAR 2. We reworded items, typically structured as questions, into concepts (i.e. general notions). RESULTS: One hundred eighty-one articles were assessed in full text and 58 were included. Of these articles, 12 were tools, checklists or journal standards; 13 were guidance documents for NMAs; 27 were studies related to bias or NMA methods; and 6 were papers assessing the quality of NMAs. These studies yielded 99 items of which the majority related to general systematic review quality and biases and were therefore excluded. The 22 items we included were reworded into concepts specific to bias in NMAs. CONCLUSIONS: A list of 22 concepts was included. This list is not intended to be used to assess biases in NMAs, but to inform the development of items to be included in our tool.
HIGHLIGHTS:
• Our research aimed to develop a preliminary list of concepts related to bias with the goal of developing the first tool for assessing the risk of bias in the results and conclusions of a network meta-analysis (NMA).
• We followed the methodology proposed by Whiting (2017) and Sanderson (2007) for creating systematically developed lists of quality items, as a first step in the development of a risk of bias tool for network meta-analysis (RoB NMA Tool).
• We included items related to biases in NMAs and excluded items that are equally applicable to all systematic reviews as they are covered by other tools (e.g. ROBIS, AMSTAR 2).
• Fifty-seven studies were included generating 99 items, which when screened, yielded 22 included items. These items were then reworded into concepts in preparation for a Delphi process for further vetting by external experts.
• A limitation of our study is the challenge in retrieving methods studies as methods collections are not regularly updated.
Subject(s)
Checklist , Humans , Bias , Network Meta-Analysis
ABSTRACT
Multiple imputation (MI) is a popular method for handling missing data. Auxiliary variables can be added to the imputation model(s) to improve MI estimates. However, the choice of which auxiliary variables to include is not always straightforward. Several data-driven auxiliary variable selection strategies have been proposed, but there has been limited evaluation of their performance. Using a simulation study we evaluated the performance of eight auxiliary variable selection strategies: (1, 2) two versions of selection based on correlations in the observed data; (3) selection using hypothesis tests of the "missing completely at random" assumption; (4) replacing auxiliary variables with their principal components; (5, 6) forward and forward stepwise selection; (7) forward selection based on the estimated fraction of missing information; and (8) selection via the least absolute shrinkage and selection operator (LASSO). A complete case analysis and an MI analysis using all auxiliary variables (the "full model") were included for comparison. We also applied all strategies to a motivating case study. The full model outperformed all auxiliary variable selection strategies in the simulation study, with the LASSO strategy the best performing auxiliary variable selection strategy overall. All MI analysis strategies that we were able to apply to the case study led to similar estimates, although computational time was substantially reduced when variable selection was employed. This study provides further support for adopting an inclusive auxiliary variable strategy where possible. Auxiliary variable selection using the LASSO may be a promising alternative when the full model fails or is too burdensome.
Subject(s)
Computer Simulation
ABSTRACT
Individual participant data (IPD) meta-analyses of randomised trials are considered a reliable way to assess participant-level treatment effect modifiers but may not make the best use of the available data. Traditionally, effect modifiers are explored one covariate at a time, which gives rise to the possibility that evidence of treatment-covariate interaction may be due to confounding from a different, related covariate. We aimed to evaluate current practice when estimating treatment-covariate interactions in IPD meta-analysis, specifically focusing on involvement of additional covariates in the models. We reviewed 100 IPD meta-analyses of randomised trials, published between 2015 and 2020, that assessed at least one treatment-covariate interaction. We identified four approaches to handling additional covariates: (1) Single interaction model (unadjusted): No additional covariates included (57/100 IPD meta-analyses); (2) Single interaction model (adjusted): Adjustment for the main effect of at least one additional covariate (35/100); (3) Multiple interactions model: Adjustment for at least one two-way interaction between treatment and an additional covariate (3/100); and (4) Three-way interaction model: Three-way interaction formed between treatment, the additional covariate and the potential effect modifier (5/100). IPD is not being utilised to its fullest extent. In an exemplar dataset, we demonstrate how these approaches lead to different conclusions. Researchers should adjust for additional covariates when estimating interactions in IPD meta-analysis providing they adjust their main effects, which is already widely recommended. Further, they should consider whether more complex approaches could provide better information on who might benefit most from treatments, improving patient choice and treatment policy and practice.
Subject(s)
Meta-Analysis as Topic , Models, Statistical , Humans , Randomized Controlled Trials as Topic
ABSTRACT
Simulation studies are powerful tools in epidemiology and biostatistics, but they can be hard to conduct successfully. Sometimes unexpected results are obtained. We offer advice on how to check a simulation study when this occurs, and how to design and conduct the study to give results that are easier to check. Simulation studies should be designed to include some settings in which answers are already known. They should be coded in stages, with data-generating mechanisms checked before simulated data are analysed. Results should be explored carefully, with scatterplots of standard error estimates against point estimates surprisingly powerful tools. Failed estimation and outlying estimates should be identified and dealt with by changing data-generating mechanisms or coding realistic hybrid analysis procedures. Finally, we give a series of ideas that have been useful to us in the past for checking unexpected results. Following our advice may help to prevent errors and to improve the quality of published simulation studies.
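Some of the checks described above can be sketched in a few lines of Python. The example below is an assumption-laden toy, not the authors' code: it estimates a mean in a setting where the answer is known, then computes the bias with its Monte Carlo standard error, the coverage of a nominal 95% interval, a flag for outlying estimates, and the paired (point estimate, standard error) values that would go into a scatterplot. The sample sizes, the normal data-generating mechanism, and the outlier threshold of 4 standard deviations are all arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(7)
n_rep, n_obs, true_mean = 1000, 50, 2.0    # setting with a known answer

estimates = np.empty(n_rep)
std_errors = np.empty(n_rep)
for r in range(n_rep):
    # Data-generating mechanism: check this stage before analysing anything.
    y = rng.normal(true_mean, 1.0, size=n_obs)
    # Analysis: sample mean and its model-based standard error.
    estimates[r] = y.mean()
    std_errors[r] = y.std(ddof=1) / np.sqrt(n_obs)

# Bias and its Monte Carlo standard error.
bias = estimates.mean() - true_mean
mcse_bias = estimates.std(ddof=1) / np.sqrt(n_rep)

# Coverage of the nominal 95% normal-approximation interval.
lower = estimates - 1.96 * std_errors
upper = estimates + 1.96 * std_errors
coverage = np.mean((lower <= true_mean) & (true_mean <= upper))

# Flag outlying estimates (threshold is an arbitrary choice).
z = (estimates - estimates.mean()) / estimates.std(ddof=1)
outlying = np.flatnonzero(np.abs(z) > 4)

# Pairs to plot: standard error estimates against point estimates.
points = np.column_stack([estimates, std_errors])
```

Plotting the second column of `points` against the first, with any plotting library, gives the kind of scatterplot the abstract recommends; repetitions listed in `outlying` are the first ones to inspect.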
Subject(s)
Biostatistics , Humans , Monte Carlo Method , Computer Simulation
ABSTRACT
For simulation studies that evaluate methods of handling missing data, we argue that generating partially observed data by fixing the complete data and repeatedly simulating the missingness indicators is a superficially attractive idea but only rarely appropriate to use.
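The two ways of generating partially observed data can be contrasted in a short sketch. It illustrates one consequence of fixing the complete data, and is not a reconstruction of the authors' full argument. The estimator (a complete-case mean), the missing-completely-at-random mechanism, and all numerical settings are assumptions chosen for simplicity.

```python
import numpy as np

rng = np.random.default_rng(11)
n_rep, n_obs, p_miss = 500, 200, 0.3       # arbitrary illustrative settings

def complete_case_mean(y, rng):
    """Delete values completely at random, then average what remains."""
    observed = rng.random(y.size) >= p_miss
    return y[observed].mean()

# Approach 1: fix one complete dataset, re-simulate only the missingness.
y_fixed = rng.normal(0.0, 1.0, size=n_obs)
est_fixed = np.array(
    [complete_case_mean(y_fixed, rng) for _ in range(n_rep)]
)

# Approach 2: simulate new complete data and new missingness each time.
est_full = np.array(
    [complete_case_mean(rng.normal(0.0, 1.0, size=n_obs), rng)
     for _ in range(n_rep)]
)

# Empirical standard deviations of the estimates under each approach.
sd_fixed = est_fixed.std(ddof=1)
sd_full = est_full.std(ddof=1)
```

With the complete data fixed, the spread of the estimates reflects only the variability due to the missingness indicators, so `sd_fixed` is smaller than `sd_full`, which also includes ordinary sampling variability.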
Subject(s)
Research , Data Interpretation, Statistical , Computer Simulation
ABSTRACT
Although new biostatistical methods are published at a very high rate, many of these developments are not trustworthy enough to be adopted by the scientific community. We propose a framework to think about how a piece of methodological work contributes to the evidence base for a method. Similar to the well-known phases of clinical research in drug development, we propose to define four phases of methodological research. These four phases cover (I) proposing a new methodological idea while providing, for example, logical reasoning or proofs, (II) providing empirical evidence, first in a narrow target setting, then (III) in an extended range of settings and for various outcomes, accompanied by appropriate application examples, and (IV) investigations that establish a method as sufficiently well-understood to know when it is preferred over others and when it is not; that is, its pitfalls. We suggest basic definitions of the four phases to provoke thought and discussion rather than devising an unambiguous classification of studies into phases. Too many methodological developments finish before phase III/IV, but we give two examples with references. Our concept rebalances the emphasis to studies in phases III and IV, that is, carefully planned method comparison studies and studies that explore the empirical properties of existing methods in a wider range of problems.
Subject(s)
Biostatistics , Research Design
ABSTRACT
BACKGROUND: A 2×2 factorial design evaluates two interventions (A versus control and B versus control) by randomising to control, A-only, B-only or both A and B together. Extended factorial designs are also possible (e.g. 3×3 or 2×2×2). Factorial designs often require fewer resources and participants than alternative randomised controlled trials, but they are not widely used. We identified several issues that investigators considering this design need to address, before they use it in a late-phase setting. METHODS: We surveyed journal articles published in 2000-2022 relating to designing factorial randomised controlled trials. We identified issues to consider based on these and our personal experiences. RESULTS: We identified clinical, practical, statistical and external issues that make factorial randomised controlled trials more desirable. Clinical issues are (1) interventions can be easily co-administered; (2) risk of safety issues from co-administration above individual risks of the separate interventions is low; (3) safety or efficacy data are wanted on the combination intervention; (4) potential for interaction (e.g. effect of A differing when B administered) is low; (5) it is important to compare interventions with other interventions balanced, rather than allowing randomised interventions to affect the choice of other interventions; (6) eligibility criteria for different interventions are similar. Practical issues are (7) recruitment is not harmed by testing many interventions; (8) each intervention and associated toxicities is unlikely to reduce either adherence to the other intervention or overall follow-up; (9) blinding is easy to implement or not required. Statistical issues are (10) a suitable scale of analysis can be identified; (11) adjustment for multiplicity is not required; (12) early stopping for efficacy or lack of benefit can be done effectively. 
External issues are (13) adequate funding is available and (14) the trial is not intended for licensing purposes. An overarching issue (15) is that factorial design should give a lower sample size requirement than alternative designs. Across designs with varying non-adherence, retention, intervention effects and interaction effects, 2×2 factorial designs require lower sample size than a three-arm alternative when one intervention effect is reduced by no more than 24%-48% in the presence of the other intervention compared with in the absence of the other intervention. CONCLUSIONS: Factorial designs are not widely used and should be considered more often using our issues to consider. Low potential for at most small to modest interaction is key, for example, where the interventions have different mechanisms of action or target different aspects of the disease being studied.
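To illustrate why low interaction matters for the sample size argument, the sketch below simulates a 2×2 factorial trial with a continuous outcome and no interaction, then estimates each intervention effect by comparing everyone allocated to that intervention with everyone not allocated to it, so that all participants contribute to both comparisons. The effect sizes, the sample size, the use of simple (unblocked) randomisation, and this "at the margins" analysis are assumptions of the sketch rather than details from the article.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4000
effect_a, effect_b = 0.5, 0.3              # hypothetical effects, no interaction

# Randomise each participant independently to A (yes/no) and B (yes/no),
# giving the four cells: control, A-only, B-only, both.
a = rng.integers(0, 2, size=n)
b = rng.integers(0, 2, size=n)
y = effect_a * a + effect_b * b + rng.normal(0.0, 1.0, size=n)

# Each effect is estimated from all n participants.
est_a = y[a == 1].mean() - y[a == 0].mean()
est_b = y[b == 1].mean() - y[b == 0].mean()

# Number of participants in each of the four cells.
cell_counts = {
    (i, j): int(np.sum((a == i) & (b == j))) for i in (0, 1) for j in (0, 1)
}
```

If the effect of A were reduced in the presence of B, `est_a` would average the two conditional effects and shrink towards zero, which is why the sample size advantage depends on the interaction being small.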