ABSTRACT
BACKGROUND: In cluster randomized trials, patients are typically recruited after clusters are randomized, and the recruiters and patients may not be blinded to the assignment. This often leads to differential recruitment and, consequently, systematic differences in the baseline characteristics of recruited patients between the intervention and control arms, inducing post-randomization selection bias. We aim to rigorously define causal estimands in the presence of selection bias. We elucidate the conditions under which standard covariate adjustment methods can validly estimate these estimands. We further discuss the additional data and assumptions necessary for estimating causal effects when such conditions are not met. METHODS: Adopting the principal stratification framework in causal inference, we clarify that there are two average treatment effect (ATE) estimands in cluster randomized trials: one for the overall population and one for the recruited population. We derive analytical formulas for the two estimands in terms of principal-stratum-specific causal effects. Furthermore, using simulation studies, we assess the empirical performance of the multivariable regression adjustment method under different data-generating processes leading to selection bias. RESULTS: When treatment effects are heterogeneous across principal strata, the average treatment effect on the overall population generally differs from the average treatment effect on the recruited population. A naïve intention-to-treat analysis of the recruited sample leads to biased estimates of both average treatment effects. In the presence of post-randomization selection and without additional data on the non-recruited subjects, the average treatment effect on the recruited population is estimable only when the treatment effects are homogeneous between principal strata, and the average treatment effect on the overall population is generally not estimable.
The extent to which covariate adjustment can remove selection bias depends on the degree of effect heterogeneity across principal strata. CONCLUSION: There is a need and opportunity to improve the analysis of cluster randomized trials that are subject to post-randomization selection bias. For studies prone to selection bias, it is important to explicitly specify the target population on which the causal estimands are defined and to adopt design and estimation strategies accordingly. To draw valid inferences about treatment effects, investigators should (1) assess the possibility of heterogeneous treatment effects, and (2) consider collecting data on covariates that are predictive of the recruitment process, and on the non-recruited population from external sources such as electronic health records.
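The selection mechanism described in this abstract can be illustrated with a small simulation. This is a minimal sketch under invented parameter values, not the simulation design used in the study: recruitment in the intervention arm is made to depend on a patient covariate that also predicts the outcome, so the naïve intention-to-treat contrast on the recruited sample overestimates a treatment effect that is in fact homogeneous.

```python
import numpy as np

rng = np.random.default_rng(0)

n_clusters, n_per_cluster = 200, 50
true_effect = 1.0  # homogeneous treatment effect, for reference

# Randomize clusters 1:1 to intervention (1) or control (0)
arm = rng.permutation(np.repeat([0, 1], n_clusters // 2))
treat = np.repeat(arm, n_per_cluster)

zf = rng.normal(size=treat.size)                    # patient-level covariate
re = np.repeat(rng.normal(scale=0.5, size=n_clusters), n_per_cluster)  # cluster effect

# Outcome depends on both the covariate and treatment
y = 2.0 + true_effect * treat + 1.5 * zf + re + rng.normal(size=treat.size)

# Differential (post-randomization) recruitment: the intervention arm
# preferentially recruits patients with high covariate values
p_recruit = 1 / (1 + np.exp(-(treat * zf)))
recruited = rng.random(treat.size) < p_recruit

# Naïve intention-to-treat contrast computed on the recruited sample only
itt = y[recruited & (treat == 1)].mean() - y[recruited & (treat == 0)].mean()
print(f"true ATE = {true_effect:.2f}, naive recruited-sample estimate = {itt:.2f}")
```

Because recruitment in the intervention arm favors high-covariate patients, the recruited-sample contrast absorbs the covariate's effect on the outcome and lands well above the true value of 1.0.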
Subject(s)
Research Design , Bias , Causality , Computer Simulation , Humans , Intention to Treat Analysis , Randomized Controlled Trials as Topic , Selection Bias
ABSTRACT
BACKGROUND: Pragmatic trials provide the opportunity to study the effectiveness of health interventions to improve care in real-world settings. However, use of open-cohort designs with patients becoming eligible after randomization and reliance on electronic health records (EHRs) to identify participants may lead to a form of selection bias referred to as identification bias. This bias can occur when individuals identified as a result of the treatment group assignment are included in analyses. METHODS: To demonstrate the importance of identification bias and how it can be addressed, we consider a motivating case study, the PRimary care Opioid Use Disorders treatment (PROUD) Trial. PROUD is an ongoing pragmatic, cluster-randomized implementation trial in six health systems to evaluate a program for increasing medication treatment of opioid use disorders (OUDs). A main study objective is to evaluate whether the PROUD intervention decreases acute care utilization among patients with OUD (effectiveness aim). Identification bias is a particular concern, because OUD is underdiagnosed in the EHR at baseline, and because the intervention is expected to increase OUD diagnosis among current patients and attract new patients with OUD to the intervention site. We propose a framework for addressing this source of bias in the statistical design and analysis. RESULTS: The statistical design sought to balance the competing goals of fully capturing intervention effects and mitigating identification bias, while maximizing power. For the primary analysis of the effectiveness aim, identification bias was avoided by defining the study sample using pre-randomization data (pre-trial modeling demonstrated that the optimal approach was to use individuals with a prior OUD diagnosis). To expand generalizability of study findings, secondary analyses were planned that also included patients newly diagnosed post-randomization, with analytic methods to account for identification bias. 
CONCLUSION: As more studies seek to leverage existing data sources, such as EHRs, to make clinical trials more affordable and generalizable and to apply novel open-cohort study designs, the potential for identification bias is likely to become increasingly common. This case study highlights how this bias can be addressed in the statistical study design and analysis. TRIAL REGISTRATION: ClinicalTrials.gov, NCT03407638. Registered on 23 January 2018.
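The key design move described above, defining the primary-analysis sample from pre-randomization data only, can be sketched in a few lines. All record values, field names, and dates below are illustrative placeholders, not actual PROUD trial data:

```python
from datetime import date

# Hypothetical EHR diagnosis records; field names are illustrative
records = [
    {"patient_id": 1, "dx": "OUD", "dx_date": date(2017, 6, 1)},
    {"patient_id": 2, "dx": "OUD", "dx_date": date(2018, 9, 15)},
    {"patient_id": 3, "dx": "OUD", "dx_date": date(2016, 2, 20)},
]

randomization_date = date(2018, 2, 1)  # illustrative, not the trial's actual date

# Primary-analysis sample: patients whose qualifying diagnosis predates
# randomization, so inclusion cannot depend on the arm assignment
primary_sample = {r["patient_id"] for r in records
                  if r["dx"] == "OUD" and r["dx_date"] < randomization_date}
print(sorted(primary_sample))  # patient 2 is diagnosed post-randomization and excluded
```

Patients first diagnosed after randomization (like patient 2 here) would enter only the secondary analyses, with methods that account for identification bias.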
Subject(s)
Electronic Health Records/statistics & numerical data , Opioid-Related Disorders/diagnosis , Opioid-Related Disorders/therapy , Bias , Cluster Analysis , Cohort Studies , Electronic Health Records/standards , Humans , Opioid-Related Disorders/epidemiology , Prevalence , Primary Health Care , Program Evaluation , Research Design , Sensitivity and Specificity
ABSTRACT
OBJECTIVE: To review how articles are retrieved from bibliographic databases, what article identification and translation problems have affected research, and how these problems can contribute to research waste and affect clinical practice. METHODS: This literature review sought and appraised articles regarding identification and translation bias in the medical and dental literature, which limit the ability of users to find research articles and to use them in practice. RESULTS: Articles can be retrieved from bibliographic databases by performing a word or index-term (for example, MeSH for MEDLINE) search. Identification of articles is challenging when it is not clear which words are most relevant and which terms have been allocated to indexing fields. Poor reporting quality of abstracts and articles has been reported across the medical literature at large. Specifically in dentistry, research regarding time-to-event survival analyses found that the allocation of MeSH terms was inconsistent and inaccurate, that important words were omitted from abstracts by authors, and that the quality of reporting in the body of articles was generally poor. These shortcomings mean that articles will be difficult to identify, and difficult to understand if found. Use of specialized electronic search strategies can decrease identification bias, and use of tailored reporting guidelines can decrease translation bias. Research that cannot be found, or cannot be used, results in research waste and undermines clinical practice. SIGNIFICANCE: Identification and translation bias have been shown to affect time-to-event dental articles, are likely to affect other fields of research, and are largely unrecognized by authors and evidence seekers alike. By understanding that these problems exist, solutions can be sought to improve the identification and translation of our research.
Subject(s)
Abstracting and Indexing/standards , Databases, Bibliographic , Dental Research , Information Storage and Retrieval , Medical Subject Headings , Research Report/standards , Writing/standards , Humans , MEDLINE
ABSTRACT
Software-aided identification facilitates the handling of large sets of bat call recordings, which is particularly useful in extensive acoustic surveys with several collaborators. Species lists are generated by "objective" automated classification. Subsequent validation consists of removing any species not believed to be present. So far, very little is known about the identification bias introduced by individual validation by operators with varying degrees of experience. Effects on the quality of the resulting data may be considerable, especially for bat species that are difficult to identify acoustically. Using the batcorder system as an example, we compared validation results from 21 volunteer operators with 1-26 years of experience working with bats. All of them validated identical recordings of bats from eastern Austria. The final outcomes were individual validated lists of plausible species. A questionnaire was used to enquire about individual experience and validation procedures. In the course of species validation, the operators reduced the software's estimate of species richness. The most experienced operators accepted the smallest percentage of species from the software's output and validated conservatively, with low interoperator variability. Operators with intermediate experience accepted the largest percentage, with larger variability. Sixty-six percent of the operators, mainly with intermediate and low levels of experience, reintroduced species to their validated lists which had been identified by the automated classification, but were finally excluded from the unvalidated lists. These were, in many cases, rare and infrequently recorded species. The average dissimilarity of the validated species lists dropped with increasing numbers of recordings, tending toward a level of approximately 20%. Our results suggest that the operators succeeded in removing false positives and that they detected species that had been wrongly excluded during automated classification.
Thus, manual validation of the software's unvalidated output is indispensable for reasonable results. However, although application seems easy, software-aided bat call identification requires an advanced level of operator experience. Identification bias during validation is a major issue, particularly in studies with more than one participant. Measures should be taken to standardize the validation process and harmonize the results of different operators.
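To make the interoperator dissimilarity comparison concrete, here is a minimal sketch of how an average pairwise dissimilarity between operators' validated species lists might be computed. The abstract does not name the index used; Jaccard dissimilarity is assumed here, and the operator lists are invented for illustration:

```python
from itertools import combinations

# Hypothetical validated species lists from three operators
lists = {
    "op1": {"Pipistrellus pipistrellus", "Nyctalus noctula", "Myotis myotis"},
    "op2": {"Pipistrellus pipistrellus", "Nyctalus noctula"},
    "op3": {"Pipistrellus pipistrellus", "Nyctalus noctula", "Plecotus austriacus"},
}

def jaccard_dissimilarity(a, b):
    """1 - |A ∩ B| / |A ∪ B|: the fraction of species the two lists disagree on."""
    return 1 - len(a & b) / len(a | b)

pairs = list(combinations(lists.values(), 2))
avg = sum(jaccard_dissimilarity(a, b) for a, b in pairs) / len(pairs)
print(f"average pairwise dissimilarity: {avg:.0%}")  # → 39% for these example lists
```

Tracking such a summary statistic across operators, as the number of validated recordings grows, is one way to quantify whether harmonization measures are actually reducing identification bias.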