ABSTRACT
The widespread testing for severe acute respiratory syndrome coronavirus 2 infection has facilitated the use of test-negative designs (TNDs) for modeling coronavirus disease 2019 (COVID-19) vaccination and outcomes. Despite the comprehensive literature on TND, the use of TND in COVID-19 studies is relatively new and calls for robust design and analysis to adapt to a rapidly changing and dynamically evolving pandemic and to account for changes in testing and reporting practices. In this commentary, we aim to draw the attention of researchers to COVID-specific challenges in using TND as we are analyzing data amassed over more than two years of the pandemic. We first review when and why TND works and general challenges in TND studies presented in the literature. We then discuss COVID-specific challenges which have not received adequate acknowledgment but may add to the risk of invalid conclusions in TND studies of COVID-19.
Subjects
COVID-19, Humans, COVID-19 Vaccines, COVID-19 Testing, Vaccination
ABSTRACT
BACKGROUND: The rule of thumb that there is little gain in statistical power by obtaining more than 4 controls per case is based on type-1 error α = 0.05. However, association studies that evaluate thousands or millions of associations use smaller α and may have access to plentiful controls. We investigate power gains, and reductions in p-values, when increasing well beyond 4 controls per case, for small α. METHODS: We calculate the power, the median expected p-value, and the minimum detectable odds ratio (OR), as a function of the number of controls/case, as α decreases. RESULTS: As α decreases, at each ratio of controls per case, the increase in power is larger than for α = 0.05. For α between 10⁻⁶ and 10⁻⁹ (typical for thousands or millions of associations), increasing from 4 controls per case to 10-50 controls per case increases power. For example, a study with power = 0.2 (α = 5 × 10⁻⁸) with 1 control/case has power = 0.65 with 4 controls/case, but with 10 controls/case has power = 0.78, and with 50 controls/case has power = 0.84. For situations where obtaining more than 4 controls per case provides small increases in power beyond 0.9 (at small α), the expected p-value can decrease by orders of magnitude below α. Increasing from 1 to 4 controls/case reduces the minimum detectable OR toward the null by 20.9%, and from 4 to 50 controls/case reduces it by an additional 9.7%, a result which applies regardless of α and hence also applies to "regular" α = 0.05 epidemiology. CONCLUSIONS: At small α, versus 4 controls/case, recruiting 10 or more controls/case can increase power, reduce the expected p-value by 1-2 orders of magnitude, and meaningfully reduce the minimum detectable OR. These benefits of increasing the controls/case ratio increase as the number of cases increases, although the amount of benefit depends on exposure frequencies and true OR. Provided that controls are comparable to cases, our findings suggest greater sharing of comparable controls in large-scale association studies.
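The power figures in this abstract can be approximated with a short calculation. The following is a minimal sketch, not necessarily the authors' method. It assumes a fixed number of cases, a normal approximation to the test statistic, and a variance of the log OR estimate proportional to (1 + 1/k) for k controls per case. The function names `power_at_ratio` and `log_or_reduction` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_at_ratio(power_at_one, alpha, k):
    """Approximate power with k controls per case, given power at k = 1.

    Assumption: var(log OR) is proportional to (1 + 1/k), so the expected
    test statistic grows by sqrt(2 / (1 + 1/k)) relative to k = 1.
    The negligible lower rejection tail is ignored.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift_one = z_crit + STD_NORMAL.inv_cdf(power_at_one)  # expected z at k = 1
    shift_k = shift_one * sqrt(2 / (1 + 1 / k))
    return STD_NORMAL.cdf(shift_k - z_crit)


def log_or_reduction(k_from, k_to):
    """Fractional shrinkage of the minimum detectable log OR toward the null."""
    return 1 - sqrt((1 + 1 / k_to) / (1 + 1 / k_from))


for k in (1, 4, 10, 50):
    print(k, round(power_at_ratio(0.2, 5e-8, k), 2))
print(round(100 * log_or_reduction(1, 4), 1), round(100 * log_or_reduction(4, 50), 1))
```

Under these assumptions the results land close to the values quoted above (power of roughly 0.65, 0.78, and 0.84, and reductions of roughly 20.9% and 9.7% on the log OR scale). The reduction does not depend on α because `log_or_reduction` involves only the two ratios.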
Subjects
Control Groups, Odds Ratio, Research Design, Humans
ABSTRACT
Many pneumonia etiology case-control studies exclude controls with respiratory illness from enrollment or analyses. Herein we argue that selecting controls regardless of respiratory symptoms provides the least biased estimates of pneumonia etiology. We review 3 reasons investigators may choose to exclude controls with respiratory symptoms in light of epidemiologic principles of control selection and present data from the Pneumonia Etiology Research for Child Health (PERCH) study where relevant to assess their validity. We conclude that exclusion of controls with respiratory symptoms will result in biased estimates of etiology. Randomly selected community controls, with or without respiratory symptoms, as long as they do not meet the criteria for case-defining pneumonia, are most representative of the general population from which cases arose and the least subject to selection bias.
Subjects
Epidemiologic Research Design, Pneumonia/etiology, Research Design, Respiratory Tract Infections, Child, Statistical Data Interpretation, Female, Humans, Male, Multicenter Studies as Topic, Pneumonia/epidemiology, Bacterial Pneumonia/epidemiology, Viral Pneumonia/epidemiology, Respiratory Tract Infections/diagnosis, Respiratory Tract Infections/etiology, Risk Factors, Selection Bias
ABSTRACT
Scholarly debate on the use of deceased controls in epidemiologic research continues. This systematic review examined published epidemiologic research using deceased persons as a control group. A systematic search of 5 major biomedical literature databases (MEDLINE, CINAHL, PsycINFO, Scopus, and EMBASE) was conducted, using variations of the search terms "deceased" and "controls" to identify relevant peer-reviewed journal articles. Information was sought on study design, rationale for using deceased controls, application of theoretical principles of control selection, and discussion of the use of deceased controls. The review identified 134 studies using deceased controls published in English between 1978 and 2015. Common health outcomes under investigation included cancer (n = 31; 23.1%), nervous system diseases (n = 26; 19.4%), and injury and other external causes (n = 22; 16.4%). The majority of studies used deceased controls for comparison with deceased cases (n = 95; 70.9%). Investigators rarely presented their rationale for control selection (n = 25/134; 18.7%); however, common reasons included comparability of information on exposures, lack of appropriate controls from other sources, and counteracting bias associated with living controls. Comparable accuracy was the most frequently observed principle of control selection (n = 92; 68.7%). This review highlights the breadth of research using deceased controls and indicates their appropriateness in studies using deceased cases.
Subjects
Control Groups, Epidemiologic Studies, Case-Control Studies, Data Accuracy, Death Certificates, Humans
ABSTRACT
Background: Recent studies suggest that cardiac amyloidosis (CA) is significantly underdiagnosed. For rare diseases like CA, the optimal selection of cases and controls for artificial intelligence model training is unknown and can significantly impact model performance. Objectives: This study evaluates the performance of electrocardiogram (ECG) waveform-based artificial intelligence models for CA screening and assesses the impact of different criteria for defining cases and controls. Methods: Using a primary cohort of ~1.3 million ECGs from 341,989 patients, models were trained using different case and control definitions. Case definitions included ECGs from patients with an amyloidosis diagnosis by International Classification of Diseases-9/10 code, patients with CA, and patients seen in CA clinic. Models were then tested on test cohorts with identical selection criteria as well as a Cedars-Sinai general patient population cohort. Results: In matched held-out test data sets, different model AUCs ranged from 0.660 (95% CI: 0.642-0.736) to 0.898 (95% CI: 0.868-0.924). However, algorithms exhibited variable generalizability when tested on a Cedars-Sinai general patient population cohort, with AUCs dropping to 0.467 (95% CI: 0.443-0.491) to 0.898 (95% CI: 0.870-0.923). Models trained on more well-curated patient cases resulted in higher AUCs on similarly constructed test cohorts. However, all models performed similarly in the overall Cedars-Sinai general patient population cohort. A model trained with International Classification of Diseases 9/10 cases and population controls matched for age and sex resulted in the best screening performance. Conclusions: Models performed similarly in population screening, regardless of stringency of cases used during training, showing that institutions without dedicated amyloid clinics can train meaningful models on less curated CA cases.
Additionally, AUC or other metrics alone are insufficient in evaluating deep learning algorithm performance. Instead, evaluation in the most clinically meaningful population is key.
ABSTRACT
Unmatched spatially stratified random sampling (SSRS) of non-cases selects geographically balanced controls by dividing the study area into spatial strata and randomly selecting controls from all non-cases within each stratum. The performance of SSRS control selection was evaluated in a case study spatial analysis of preterm birth in Massachusetts. In a simulation study, we fit generalized additive models using controls selected by SSRS or simple random sample (SRS) designs. We compared mean squared error (MSE), bias, relative efficiency (RE), and statistically significant map results to the model results with all non-cases. SSRS designs had lower average MSE (0.0042-0.0044) and higher RE (77-80%) compared to SRS designs (MSE: 0.0072-0.0073; RE across designs: 71%). SSRS map results were more consistent across simulations, reliably identifying statistically significant areas. SSRS designs improved efficiency by selecting controls that are geographically distributed, particularly from low population density areas, and may be more appropriate for spatial analyses.
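The selection step described in this abstract can be sketched in a few lines. This is an illustrative sketch only: it assumes square grid cells as the spatial strata and an equal target number of controls per stratum, neither of which is stated in the abstract. The function name `ssrs_controls` and the `(id, x, y)` record layout are hypothetical.

```python
import random
from collections import defaultdict


def ssrs_controls(non_cases, cell_size, per_stratum, seed=0):
    """Select controls by spatially stratified random sampling.

    non_cases: iterable of (id, x, y) records for all non-cases.
    cell_size: side length of the square grid cells used as strata (assumed).
    per_stratum: number of controls drawn per stratum (fewer if the
        stratum holds fewer non-cases).
    """
    rng = random.Random(seed)

    # Assign every non-case to the grid cell that contains it.
    strata = defaultdict(list)
    for record in non_cases:
        _, x, y = record
        cell = (int(x // cell_size), int(y // cell_size))
        strata[cell].append(record)

    # Draw a simple random sample without replacement inside each stratum.
    selected = []
    for cell in sorted(strata):
        members = strata[cell]
        selected.extend(rng.sample(members, min(per_stratum, len(members))))
    return selected


# Hypothetical data: a dense urban cell and a sparse rural cell.
urban = [(f"u{i}", 0.5, 0.5) for i in range(1000)]
rural = [(f"r{i}", 3.5, 3.5) for i in range(5)]
controls = ssrs_controls(urban + rural, cell_size=1.0, per_stratum=3)
print(len(controls))
```

With these made-up inputs, each cell contributes the same number of controls, whereas a simple random sample of the same size would draw almost entirely from the dense cell. That matches the abstract's point about low population density areas.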
Subjects
Premature Birth, Female, Humans, Newborn Infant, Bias, Computer Simulation, Premature Birth/epidemiology, Research Design, Spatial Analysis, Pregnancy
ABSTRACT
BACKGROUND: Effect sizes are the most useful quantities for communicating the practical significance of results and helping to facilitate cumulative science. We hypothesize that the selection of the best-fitted controls can significantly affect the estimated effect sizes in case-control studies. Therefore, we decided to exemplify and clarify this effect on effect size using a large data set. The objective of this study was to investigate the association among variables in functional gastrointestinal disorders (FGIDs) and mental health problems, common ailments that reduce the quality of life of a large proportion of the community worldwide. METHOD: In this methodological study, we constitute case and control groups in our study framework using the Epidemiology of Psychological, Alimentary Health and Nutrition (SEPAHAN) dataset of 4763 participants. We devised four definitions for control in this extensive database of FGID patients and analyzed the effect of these definitions on the odds ratio (OR): 1. conventional control: without target disorder/syndrome (sample size 4040); 2. without any positive criteria: criterion-free control (sample size 1053); 3. syndrome-free control: without any disorder/syndrome (sample size 847); 4. symptom-free control: without any symptoms (sample size 204). We considered a fixed case group that included 723 patients with a Rome III-based definition of functional dyspepsia. Psychological distress, anxiety, and depression were considered as dependent variables in the analysis. Logistic regression was used for association analysis, and the odds ratio and 95% confidence interval (95%CI) for OR were reported as the effect size. RESULTS: The estimated ORs indicate that the strength of the association in the first case-control group is the lowest, and the fourth case-control group, including controls with completely asymptomatic people, is the highest. 
Ascending effect sizes were obtained in the conventional, criterion-free, syndrome-free, and symptom-free control groups. These results are consistent for all three psychological outcomes: psychological distress, anxiety, and depression. CONCLUSIONS: This study shows that a precise definition of the control is mandatory in every case-control study and affects the estimated effect size. In clinical settings, the selection of symptomatic controls using the conventional definition could significantly diminish the effect size.
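To illustrate how the control definition alone can move the estimate, the sketch below computes a crude odds ratio with a Wald 95% confidence interval from a 2x2 table. It is not the study's analysis, which used logistic regression. The group sizes (723 cases, 4040 conventional controls, 204 symptom-free controls) come from the abstract, but every exposure count is hypothetical and chosen only for illustration, as is the function name `odds_ratio_ci`.

```python
from math import exp, log, sqrt


def odds_ratio_ci(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table."""
    odds_ratio = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    # Standard error of log(OR) (Woolf's method).
    se = sqrt(1 / exp_cases + 1 / unexp_cases + 1 / exp_controls + 1 / unexp_controls)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Fixed case group of 723; the exposed/unexposed split is HYPOTHETICAL.
cases = (300, 423)
# Control group sizes follow the abstract; exposure counts are HYPOTHETICAL.
control_groups = {
    "conventional (n=4040)": (1000, 3040),
    "symptom-free (n=204)": (20, 184),
}

results = {}
for name, (exposed, unexposed) in control_groups.items():
    results[name] = odds_ratio_ci(cases[0], cases[1], exposed, unexposed)
    print(name, [round(v, 2) for v in results[name]])
```

Because the cases are identical in both comparisons, any difference in the odds ratio comes from the exposure prevalence among controls. The smaller symptom-free group also gives a wider interval.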
Subjects
Dyspepsia, Gastrointestinal Diseases, Anxiety, Case-Control Studies, Gastrointestinal Diseases/epidemiology, Humans, Quality of Life
ABSTRACT
Genome-wide association studies have identified more than 150 susceptibility loci for coronary artery disease (CAD); however, there is still a large proportion of missing heritability remaining to be investigated. This study sought to identify population-based genetic variation associated with acute coronary syndromes (ACS) in individuals of Chinese Han descent. We proposed a novel strategy integrating a well-developed risk prediction model into control selection in order to lower the potential misclassification bias and increase the statistical power. An exome-wide association analysis was performed for 1,669 ACS patients and 1,935 healthy controls. Promising variants were further replicated using the existing in silico dataset. Additionally, we performed gene- and pathway-based analyses to investigate the aggregate effect of multiple variants within the same genes or pathways. Although none of the association signals were consistent across studies after Bonferroni correction, one promising variant, rs10409124 at STRN4, showed potential impact on ACS in both European and East Asian populations. Gene-based analysis explored four genes (ANXA7, ZNF655, ZNF347, and ZNF750) that showed evidence for association with ACS after multiple test correction, and identification of ZNF655 was successfully replicated by another dataset. Pathway-based analysis revealed that 32 potential pathways might be involved in the pathogenesis of ACS. Our study identified several candidate genes and pathways associated with ACS. Future studies are needed to further validate these findings and explore these genes and pathways as potential therapeutic targets in ACS.
ABSTRACT
Based on the unique characteristics of influenza, the concept of "monitoring" influenza vaccine effectiveness (VE) across the seasons using the same observational study design has been developed. In recent years, there has been a growing number of influenza VE reports using the test-negative design, which can minimize both misclassification of diseases and confounding by health care-seeking behavior. Although the test-negative designs offer considerable advantages, there are some concerns that widespread use of the test-negative design without knowledge of the basic principles of epidemiology could produce invalid findings. In this article, we briefly review the basic concepts of the test-negative design with respect to classic study design such as cohort studies or case-control studies. We also mention selection bias, which may be of concern in some countries where rapid diagnostic testing is frequently used in routine clinical practices, as in Japan.
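In a test-negative design, vaccine effectiveness is conventionally estimated as VE = (1 - OR) × 100%, where OR is the odds of vaccination among test-positive patients divided by the odds among test-negative patients. The sketch below shows only the crude, unadjusted calculation. The counts are hypothetical and the function name `crude_ve_percent` is an assumption for illustration.

```python
def crude_ve_percent(vacc_pos, unvacc_pos, vacc_neg, unvacc_neg):
    """Crude vaccine effectiveness (%) from test-negative design counts.

    vacc_pos / unvacc_pos: vaccinated / unvaccinated patients testing positive.
    vacc_neg / unvacc_neg: vaccinated / unvaccinated patients testing negative.
    """
    odds_ratio = (vacc_pos * unvacc_neg) / (unvacc_pos * vacc_neg)
    return (1 - odds_ratio) * 100


# HYPOTHETICAL counts among patients who all sought care and were tested.
ve = crude_ve_percent(vacc_pos=40, unvacc_pos=160, vacc_neg=200, unvacc_neg=300)
print(f"Crude VE: {ve:.1f}%")
```

Restricting both groups to tested patients is what addresses confounding by health care-seeking behavior. Real analyses would typically also adjust for factors such as age and calendar time, which this sketch omits.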
Subjects
Influenza Vaccines/immunology, Human Influenza/prevention & control, Research Design, Vaccine Potency, Case-Control Studies, Humans, Human Influenza/epidemiology, Japan/epidemiology, Seasons, Vaccination
ABSTRACT
INTRODUCTION: The New South Wales (NSW) Cancer, Lifestyle and Evaluation of Risk Study (CLEAR) is an open epidemiological bioresource, using an all-cancer unmatched case-spouse control design. Participant characteristics and selected confirmed associations are compared to published estimates: current smoking and lung cancer; country of birth and melanoma; body mass index (BMI) and bowel cancer; and paternal history of prostate cancer and prostate cancer, to illustrate the validity of this design. MATERIAL AND METHODS: Cases are NSW residents, ≥18 years, with an incident cancer of any type. Controls are cancer-free spouses of cases. Participants complete a consent form and a questionnaire, and provide an optional blood sample. For analyses, odds ratios for males and females are calculated for cancers and exposures of interest, by sex-matching controls to cases. RESULTS: 10,816 participants (8569 cases, 2247 controls, 54% female) have been recruited to date; median age: 61.6 y for cases, 61.3 y for controls. The top five cancer types are female breast (n=1691), prostate (n=1102), bowel (n=888), melanoma (n=608), and lung (n=265). Adjusted odds ratios (OR) were: 20.65 (95% CI: 13.25-32.19) for lung cancer in current versus never smokers; 1.16 (1.05-1.28) for bowel cancer per 5 kg/m² increment in BMI; 1.41 (1.01-1.96) for melanoma in Australian-born compared to those born in UK/Ireland; and 2.47 (1.82-3.37) for prostate cancer in men with versus without a paternal history of prostate cancer. DISCUSSION: This study design, where controls are the spouses of cases diagnosed with a variety of cancers and which are analysed unmatched, avoids potential biases due to overmatching, considered problematic in standard case-spouse control studies, and illustrates that risk estimates analysed are consistent with the published literature. CLEAR methodology provides a practical design to advance local knowledge on the causes of various leading and emerging cancers.