ABSTRACT
BACKGROUND: Developing guidelines on testing is known to be difficult. It requires consideration of various aspects, such as accuracy, purpose of testing, and consequences for management and people-important outcomes. These can be outlined in a test-management pathway. We aimed to create and user-test a step-by-step guide for guideline developers for designing a test-management pathway. METHODS: Developmental design with a co-creative strategy. We created a draft step-by-step guide that was user-tested in a workshop with 19 experts and by interviewing 7 guideline panel members. RESULTS: Our proposed guide consists of five blocks of signalling questions: patients/population, index test(s), current practice/comparison/control, people-important outcomes, and the link between testing and outcome(s). The user testing led to refinement of the signalling questions, the use of inclusive terminology, and the addition of a test-management pathway figure with detailed explanation. CONCLUSIONS: The step-by-step guide for formulating focused guideline questions regarding healthcare-related testing can help in identifying relevant characteristics of the population, tests, and outcomes and in creating a test-management pathway. This should facilitate the formulation of evidence-based guideline recommendations about healthcare-related testing.
Subject(s)
Practice Guidelines as Topic , Humans , Practice Guidelines as Topic/standards , Delivery of Health Care/standards
ABSTRACT
The prediction of the risk of developing complications after colorectal surgery for colorectal carcinoma remains imprecise. Body composition measurements on a computed tomography (CT) scan can potentially contribute to a better preoperative risk assessment. The aim of this systematic review is to evaluate the evidence for the use of body composition measurements on CT scans to predict short-term complications after colorectal cancer surgery. A literature search (in PubMed, Embase and Web of Science) was performed up to 1 August 2022. Two researchers independently screened the articles, extracted data and assessed the quality of the studies using the Quality in Prognosis Studies tool. The primary outcome measure was the occurrence of complications within 30 days after surgery. Meta-analysis was conducted using a random-effects model to synthesize a pooled odds ratio (OR). The study protocol was registered in PROSPERO (CRD42021281010). Forty-five articles with a total of 16 537 patients were included. In total, 26 body composition measures were investigated: 8 muscle-related measures, 11 adipose tissue measures, 4 combined muscle and adipose tissue measures, and 3 other measures. These were investigated as potential predictors for more than 50 differently defined postoperative complications. Meta-analysis was only possible for two measurements and showed that higher amounts of visceral fat increase the risk of developing overall complications (OR: 2.52 [1.58-4.00], P < 0.0001) and anastomotic leakage (OR: 1.76 [1.17-2.65], P = 0.006). A wide variety of body composition measurements on preoperative CT scans have been investigated as a predictive factor for postoperative complications. Visceral fat appeared to be associated with overall complications and anastomotic leakage; however, the association is weak, and its clinical relevance or applicability is questionable. The current evidence is limited by methodological heterogeneity and the risk of bias. 
To improve comparability of results across studies and improve decision-making, future studies should use standardized methods for measuring body composition on CT scans, outcome definitions and statistical analyses.
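As an illustrative aside, the pooled odds ratios reported above can be sanity-checked from their confidence limits. The Python sketch below assumes the intervals are symmetric Wald intervals on the log-odds scale (an assumption; the abstract does not state how they were computed). The function name `log_or_summary` is hypothetical and not part of the review.

```python
import math

def log_or_summary(or_point, ci_low, ci_high, z_crit=1.96):
    """Back-calculate SE, z and a two-sided p-value from an OR and its 95% CI.

    Assumes a Wald interval on the log scale: log(OR) +/- z_crit * SE.
    """
    log_or = math.log(or_point)
    # The CI width on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    # For a log-symmetric interval, the geometric mean of the limits
    # should sit close to the point estimate.
    midpoint = math.sqrt(ci_low * ci_high)
    return {"se": se, "z": z, "p": p, "midpoint": midpoint}

# Values taken from the abstract.
overall = log_or_summary(2.52, 1.58, 4.00)   # overall complications
leakage = log_or_summary(1.76, 1.17, 2.65)   # anastomotic leakage

for label, res in (("overall", overall), ("leakage", leakage)):
    print(f"{label}: SE={res['se']:.3f}, z={res['z']:.2f}, "
          f"p={res['p']:.4f}, midpoint={res['midpoint']:.2f}")
```

Under these assumptions the back-calculated z statistics are roughly 3.9 and 2.7, which is in line with the reported P values; small differences are expected because the published limits are rounded.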
ABSTRACT
BACKGROUND: Diagnosing people with a SARS-CoV-2 infection played a critical role in managing the COVID-19 pandemic and remains a priority for the transition to long-term management of COVID-19. Initial shortages of extraction and reverse transcription polymerase chain reaction (RT-PCR) reagents impaired the desired upscaling of testing in many countries, which led to the search for alternatives to RNA extraction/purification and RT-PCR testing. Reference standard methods for diagnosing the presence of SARS-CoV-2 infection rely primarily on real-time reverse transcription-polymerase chain reaction (RT-PCR). Alternatives to RT-PCR could, if sufficiently accurate, have a positive impact by expanding the range of diagnostic tools available for the timely identification of people infected by SARS-CoV-2, access to testing and the use of resources. OBJECTIVES: To assess the diagnostic accuracy of alternative (to RT-PCR assays) laboratory-based molecular tests for diagnosing SARS-CoV-2 infection. SEARCH METHODS: We searched the COVID-19 Open Access Project living evidence database from the University of Bern until 30 September 2020 and the WHO COVID-19 Research Database until 31 October 2022. We did not apply language restrictions. SELECTION CRITERIA: We included studies of people with suspected or known SARS-CoV-2 infection, or where tests were used to screen for infection, and studies evaluating commercially developed laboratory-based molecular tests for the diagnosis of SARS-CoV-2 infection considered as alternatives to RT-PCR testing. We also included all reference standards to define the presence or absence of SARS-CoV-2, including RT-PCR tests and established clinical diagnostic criteria. DATA COLLECTION AND ANALYSIS: Two authors independently screened studies and resolved disagreements by discussing them with a third author. Two authors independently extracted data and assessed the risk of bias and applicability of the studies using the QUADAS-2 tool. 
We presented sensitivity and specificity, with 95% confidence intervals (CIs), for each test using paired forest plots and summarised results using average sensitivity and specificity using a bivariate random-effects meta-analysis. We illustrated the findings per index test category and assay brand compared to the WHO's acceptable sensitivity and specificity threshold for diagnosing SARS-CoV-2 infection using nucleic acid tests. MAIN RESULTS: We included data from 64 studies reporting 94 cohorts of participants and 105 index test evaluations, with 74,753 samples and 7517 confirmed SARS-CoV-2 cases. We did not identify any published or preprint reports of accuracy for a considerable number of commercially produced NAAT assays. Most cohorts were judged at unclear or high risk of bias in more than three QUADAS-2 domains. Around half of the cohorts were considered at high risk of selection bias because of recruitment based on COVID status. Three quarters of 94 cohorts were at high risk of bias in the reference standard domain because of reliance on a single RT-PCR result to determine the absence of SARS-CoV-2 infection or were at unclear risk of bias due to a lack of clarity about the time interval between the index test assessment and the reference standard, the number of missing results, or the absence of a participant flow diagram. 
For index test categories with four or more evaluations and when summary estimations were possible, we found that: a) for RT-PCR assays designed to omit/adapt RNA extraction/purification, the average sensitivity was 95.1% (95% CI 91.1% to 97.3%), and the average specificity was 99.7% (95% CI 98.5% to 99.9%; based on 27 evaluations, 2834 samples and 1178 SARS-CoV-2 cases); b) for RT-LAMP assays, the average sensitivity was 88.4% (95% CI 83.1% to 92.2%), and the average specificity was 99.7% (95% CI 98.7% to 99.9%; 24 evaluations, 29,496 samples and 2255 SARS-CoV-2 cases); c) for TMA assays, the average sensitivity was 97.6% (95% CI 95.2% to 98.8%), and the average specificity was 99.4% (95% CI 94.9% to 99.9%; 14 evaluations, 2196 samples and 942 SARS-CoV-2 cases); d) for digital PCR assays, the average sensitivity was 98.5% (95% CI 95.2% to 99.5%), and the average specificity was 91.4% (95% CI 60.4% to 98.7%; five evaluations, 703 samples and 354 SARS-CoV-2 cases); e) for RT-LAMP assays omitting/adapting RNA extraction, the average sensitivity was 73.1% (95% CI 58.4% to 84%), and the average specificity was 100% (95% CI 98% to 100%; 24 evaluations, 14,342 samples and 1502 SARS-CoV-2 cases). Only two index test categories fulfil the WHO-acceptable sensitivity and specificity requirements for SARS-CoV-2 nucleic acid tests: RT-PCR assays designed to omit/adapt RNA extraction/purification and TMA assays. In addition, WHO-acceptable performance criteria were met for two assays out of 35 when tests were used according to manufacturer instructions. At 5% prevalence using a cohort of 1000 people suspected of SARS-CoV-2 infection, the positive predictive value of RT-PCR assays omitting/adapting RNA extraction/purification will be 94%, with 3 in 51 positive results being false positives, and around two missed cases. For TMA assays, the positive predictive value will be 89%, with 6 in 55 positive results being false positives, and around one missed case. 
AUTHORS' CONCLUSIONS: Alternative laboratory-based molecular tests aim to enhance testing capacity in different ways, such as reducing the time, steps and resources needed to obtain valid results. Several index test technologies with these potential advantages have not been evaluated or have been assessed by only a few studies of limited methodological quality, so the performance of these kits was undetermined. Only two index test categories with enough evaluations for meta-analysis fulfil the WHO set of acceptable accuracy standards for SARS-CoV-2 nucleic acid tests: RT-PCR assays designed to omit/adapt RNA extraction/purification and TMA assays. These assays might prove to be suitable alternatives to RT-PCR for identifying people infected by SARS-CoV-2, especially when the alternative would be not having access to testing. However, these findings need to be interpreted and used with caution because of several limitations in the evidence, including reliance on retrospective samples without information about the symptom status of participants and the timing of assessment. No extrapolation of found accuracy data for these two alternatives to any test brands using the same techniques can be made as, for both groups, one test brand with high accuracy was overrepresented with 21/26 and 12/14 included studies, respectively. Although we used a comprehensive search and had broad eligibility criteria to include a wide range of tests that could be alternatives to RT-PCR methods, further research is needed to assess the performance of alternative COVID-19 tests and their role in pandemic management.
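The predictive-value arithmetic in the main results (5% prevalence, cohort of 1000 people) can be reproduced with a few lines of Python. This is a minimal sketch: the function name `natural_frequencies` is hypothetical, and rounding to whole people before computing the predictive value is an assumption made here to mirror how the abstract reports counts.

```python
def natural_frequencies(prevalence, sensitivity, specificity, cohort=1000):
    """Convert sensitivity/specificity into whole-person counts and a PPV.

    Counts are rounded to whole people first (an assumption of this sketch).
    """
    infected = round(prevalence * cohort)
    not_infected = cohort - infected
    tp = round(sensitivity * infected)            # true positives
    fn = infected - tp                            # missed cases
    fp = round((1 - specificity) * not_infected)  # false positives
    ppv = tp / (tp + fp)                          # positive predictive value
    return {"tp": tp, "fn": fn, "fp": fp, "positives": tp + fp, "ppv": ppv}

# Average accuracy estimates taken from the abstract.
rt_pcr_no_extraction = natural_frequencies(0.05, 0.951, 0.997)
tma = natural_frequencies(0.05, 0.976, 0.994)

for label, res in (("RT-PCR without extraction", rt_pcr_no_extraction),
                   ("TMA", tma)):
    print(f"{label}: {res['fp']} of {res['positives']} positives are false, "
          f"{res['fn']} missed, PPV {res['ppv']:.0%}")
```

Working in whole-person counts keeps the result easy to communicate, at the cost of small rounding differences relative to a calculation on unrounded proportions.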
Subject(s)
COVID-19 Nucleic Acid Testing , COVID-19 , RNA, Viral , SARS-CoV-2 , Sensitivity and Specificity , Humans , Bias , COVID-19/diagnosis , COVID-19 Nucleic Acid Testing/methods , False Negative Reactions , False Positive Reactions , Pandemics , Real-Time Polymerase Chain Reaction/methods , Reverse Transcriptase Polymerase Chain Reaction/methods , Reverse Transcriptase Polymerase Chain Reaction/standards , RNA, Viral/analysis , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification
ABSTRACT
Aim: To assess the accuracy and technical characteristics of CYP2C19 point-of-care tests (POCTs). Patients & methods: Systematic review of primary studies, in any population or setting, that evaluated POCTs for detecting CYP2C19 loss-of-function (LOF) alleles. Results: Eleven studies provided accuracy data (eight Spartan; one Genomadix Cube; one GMEX; one Genedrive). The POCTs had very high sensitivity and specificity for the alleles they tested for. Twenty-two studies reported technical characteristics: POCTs were easy to operate and provided results quickly. Limited data were reported for test failure rate and cost. Conclusion: CYP2C19 POCTs may be a useful alternative to laboratory-based testing to guide antiplatelet therapy. Further data are required on accuracy (GMEX; Genedrive), test failure and cost (all POCTs).
Subject(s)
Cytochrome P-450 CYP2C19 , Humans , Alleles , Cytochrome P-450 CYP2C19/genetics , Platelet Aggregation Inhibitors/therapeutic use , Point-of-Care Systems/standards , Point-of-Care Testing/standards
ABSTRACT
BACKGROUND: Identifying patients with COVID-19 disease who will deteriorate can be useful to assess whether they should receive intensive care, or whether they can be treated in a less intensive way or through outpatient care. In clinical care, routine laboratory markers, such as C-reactive protein, are used to assess a person's health status. OBJECTIVES: To assess the accuracy of routine blood-based laboratory tests to predict mortality and deterioration to severe or critical (from mild or moderate) COVID-19 in people with SARS-CoV-2. SEARCH METHODS: On 25 August 2022, we searched the Cochrane COVID-19 Study Register, encompassing searches of various databases such as MEDLINE via PubMed, CENTRAL, Embase, medRxiv, and ClinicalTrials.gov. We did not apply any language restrictions. SELECTION CRITERIA: We included studies of all designs that produced estimates of prognostic accuracy in participants who presented to outpatient services, or were admitted to general hospital wards with confirmed SARS-CoV-2 infection, and studies that were based on serum banks of samples from people. All routine blood-based laboratory tests performed during the first encounter were included. We included any reference standard used to define deterioration to severe or critical disease that was provided by the authors. DATA COLLECTION AND ANALYSIS: Two review authors independently extracted data from each included study, and independently assessed the methodological quality using the Quality Assessment of Prognostic Accuracy Studies tool. As studies reported different thresholds for the same test, we used the Hierarchical Summary Receiver Operator Curve model for meta-analyses to estimate summary curves in SAS 9.4. We estimated the sensitivity at points on the SROC curves that corresponded to the median and interquartile range boundaries of specificities in the included studies. 
Direct and indirect comparisons were exclusively conducted for biomarkers with an estimated sensitivity and 95% CI of ≥ 50% at a specificity of ≥ 50%. The relative diagnostic odds ratio was calculated as a summary of the relative accuracy of these biomarkers. MAIN RESULTS: We identified a total of 64 studies, including 71,170 participants, of whom 8169 died and 4031 deteriorated to severe/critical condition. The studies assessed 53 different laboratory tests. For some tests, both increases and decreases relative to the normal range were included. There was important heterogeneity between tests and their cut-off values. None of the included studies had a low risk of bias or low concern for applicability for all domains. None of the tests included in this review demonstrated high sensitivity or specificity, or both. The five tests with summary sensitivity and specificity above 50% were: C-reactive protein increase, neutrophil-to-lymphocyte ratio increase, lymphocyte count decrease, d-dimer increase, and lactate dehydrogenase increase. Inflammation: For mortality, summary sensitivity of a C-reactive protein increase was 76% (95% CI 73% to 79%) at median specificity of 59% (low-certainty evidence). For deterioration, summary sensitivity was 78% (95% CI 67% to 86%) at median specificity of 72% (very low-certainty evidence). For the combined outcome of mortality or deterioration, or both, summary sensitivity was 70% (95% CI 49% to 85%) at median specificity of 60% (very low-certainty evidence). For mortality, summary sensitivity of an increase in neutrophil-to-lymphocyte ratio was 69% (95% CI 66% to 72%) at median specificity of 63% (very low-certainty evidence). For deterioration, summary sensitivity was 75% (95% CI 59% to 87%) at median specificity of 71% (very low-certainty evidence). For mortality, summary sensitivity of a decrease in lymphocyte count was 67% (95% CI 56% to 77%) at median specificity of 61% (very low-certainty evidence). 
For deterioration, summary sensitivity of a decrease in lymphocyte count was 69% (95% CI 60% to 76%) at median specificity of 67% (very low-certainty evidence). For the combined outcome, summary sensitivity was 83% (95% CI 67% to 92%) at median specificity of 29% (very low-certainty evidence). For mortality, summary sensitivity of a lactate dehydrogenase increase was 82% (95% CI 66% to 91%) at median specificity of 60% (very low-certainty evidence). For deterioration, summary sensitivity of a lactate dehydrogenase increase was 79% (95% CI 76% to 82%) at median specificity of 66% (low-certainty evidence). For the combined outcome, summary sensitivity was 69% (95% CI 51% to 82%) at median specificity of 62% (very low-certainty evidence). Hypercoagulability: For mortality, summary sensitivity of a d-dimer increase was 70% (95% CI 64% to 76%) at median specificity of 56% (very low-certainty evidence). For deterioration, summary sensitivity was 65% (95% CI 56% to 74%) at median specificity of 63% (very low-certainty evidence). For the combined outcome, summary sensitivity was 65% (95% CI 52% to 76%) at median specificity of 54% (very low-certainty evidence). To predict mortality, neutrophil-to-lymphocyte ratio increase had higher accuracy compared to d-dimer increase (relative diagnostic odds ratio (RDOR) 2.05, 95% CI 1.30 to 3.24), C-reactive protein increase (RDOR 2.64, 95% CI 2.09 to 3.33), and lymphocyte count decrease (RDOR 2.63, 95% CI 1.55 to 4.46). D-dimer increase had higher accuracy compared to lymphocyte count decrease (RDOR 1.49, 95% CI 1.23 to 1.80), C-reactive protein increase (RDOR 1.31, 95% CI 1.03 to 1.65), and lactate dehydrogenase increase (RDOR 1.42, 95% CI 1.05 to 1.90). Additionally, lactate dehydrogenase increase had higher accuracy compared to lymphocyte count decrease (RDOR 1.30, 95% CI 1.13 to 1.49). To predict deterioration to severe disease, C-reactive protein increase had higher accuracy compared to d-dimer increase (RDOR 1.76, 95% CI 1.25 to 2.50). 
The neutrophil-to-lymphocyte ratio increase had higher accuracy compared to d-dimer increase (RDOR 2.77, 95% CI 1.58 to 4.84). Lastly, lymphocyte count decrease had higher accuracy compared to d-dimer increase (RDOR 2.10, 95% CI 1.44 to 3.07) and lactate dehydrogenase increase (RDOR 2.22, 95% CI 1.52 to 3.26). AUTHORS' CONCLUSIONS: Laboratory tests associated with hypercoagulability and hyperinflammatory response were better at predicting severe disease and mortality in patients with SARS-CoV-2 than other laboratory tests. However, to safely rule out severe disease, tests should have high sensitivity (> 90%), and none of the identified laboratory tests met this criterion. In clinical practice, a more comprehensive assessment of a patient's health status is usually required, for example, by incorporating these laboratory tests into clinical prediction rules together with clinical symptoms, radiological findings, and patient characteristics.
Subject(s)
C-Reactive Protein , COVID-19 , SARS-CoV-2 , Humans , COVID-19/mortality , COVID-19/blood , COVID-19/diagnosis , C-Reactive Protein/analysis , Biomarkers/blood , Prognosis , Clinical Deterioration , Bias , Pandemics , Sensitivity and Specificity , Severity of Illness Index , COVID-19 Testing/methods
ABSTRACT
BACKGROUND: Before a new test can be used routinely, its reliability must be established in the laboratory where it will be used. International standards demand validation and verification procedures for new tests. The International Organization for Standardization (ISO) 15189 standard was recently updated, and the European Commission's In Vitro Diagnostic Regulation (IVDR) came into effect. These events will likely increase the need for validation and verification procedures. OBJECTIVES: This paper aims to provide practical guidance on validating or verifying microbiology tests, including antimicrobial susceptibility tests, in a clinical microbiology laboratory. SOURCES: It summarizes and interprets certain parts of standards, such as ISO 15189:2022, and regulations, such as IVDR 2017/746, regarding validation or verification of a new test in a routine clinical microbiology laboratory. CONTENT: The reasons for choosing a new test and the outline of the validation and verification plan are discussed. Furthermore, the following topics are covered: the choice of reference standard, number of samples, testing procedures, how to resolve discrepancies between results from the new test and the reference standard, and acceptance criteria. Arguments for selecting certain parameters (such as reference standard and sample size) and examples are given. IMPLICATIONS: With the expected increase in validation and verification procedures because of the implementation of IVDR, this paper may aid in planning and executing these procedures.
Subject(s)
Reference Standards , Humans , Reproducibility of Results , Microbial Sensitivity Tests/standards , Diagnostic Tests, Routine/standards , Diagnostic Tests, Routine/methods , Microbiological Techniques/standards , Microbiological Techniques/methods , Validation Studies as Topic , Laboratories, Clinical/standards
ABSTRACT
OBJECTIVES: To define the minimum knowledge required for guideline panel members (healthcare professionals and consumers) involved in developing recommendations about healthcare-related testing. STUDY DESIGN AND SETTING: A developmental study with a multistaged approach. We derived a first set of knowledge components from literature and subsequently performed semistructured interviews with 9 experts. We refined the set of knowledge components and checked it with the interviewees for final approval. RESULTS: Understanding the test-management pathway, for example, how test results should be used in the context of decisions about interventions, is the key knowledge component. The final list includes 26 items on the following topics: health question, test-management pathway, target population, test, test result, interpretation of test results and subsequent management, and impact on people-important outcomes. For each item, the required level of knowledge is defined. CONCLUSION: We developed a list of knowledge components required for guideline panels to formulate recommendations on healthcare-related testing. The list could be used to design specific training programs for guideline panel members when developing recommendations about tests and testing strategies in healthcare.
Subject(s)
Practice Guidelines as Topic , Humans , Practice Guidelines as Topic/standards , Health Knowledge, Attitudes, Practice , Health Personnel
ABSTRACT
BACKGROUND: Prenatal ultrasound is widely used to screen for structural anomalies before birth. While this is traditionally done in the second trimester, there is an increasing use of first-trimester ultrasound for early detection of lethal and certain severe structural anomalies. OBJECTIVES: To evaluate the diagnostic accuracy of ultrasound in detecting fetal structural anomalies before 14 and 24 weeks' gestation in low-risk and unselected pregnant women and to compare the current two main prenatal screening approaches: a single second-trimester scan (single-stage screening) and a first- and second-trimester scan combined (two-stage screening) in terms of anomaly detection before 24 weeks' gestation. SEARCH METHODS: We searched MEDLINE, EMBASE, Science Citation Index Expanded (Web of Science), Social Sciences Citation Index (Web of Science), Arts & Humanities Citation Index and Emerging Sources Citation Index (Web of Science) from 1 January 1997 to 22 July 2022. We limited our search to studies published after 1997 and excluded animal studies, reviews and case reports. No further restrictions were applied. We also screened reference lists and citing articles of each of the included studies. SELECTION CRITERIA: Studies were eligible if they included low-risk or unselected pregnant women undergoing a first- and/or second-trimester fetal anomaly scan, conducted at 11 to 14 or 18 to 24 weeks' gestation, respectively. The reference standard was detection of anomalies at birth or postmortem. DATA COLLECTION AND ANALYSIS: Two review authors independently undertook study selection, quality assessment (QUADAS-2), data extraction and evaluation of the certainty of evidence (GRADE approach). We used univariate random-effects logistic regression models for the meta-analysis of sensitivity and specificity. MAIN RESULTS: Eighty-seven studies covering 7,057,859 fetuses (including 25,202 with structural anomalies) were included. 
No study was deemed low risk across all QUADAS-2 domains. Main methodological concerns included risk of bias in the reference standard domain and risk of partial verification. Applicability concerns were common in studies evaluating first-trimester scans and two-stage screening in terms of patient selection due to frequent recruitment from single tertiary centres without exclusion of referrals. We reported ultrasound accuracy for fetal structural anomalies overall, by severity, affected organ system and for 46 specific anomalies. Detection rates varied widely across categories, with the highest estimates of sensitivity for thoracic and abdominal wall anomalies and the lowest for gastrointestinal anomalies across all tests. The summary sensitivity of a first-trimester scan was 37.5% for detection of structural anomalies overall (95% confidence interval (CI) 31.1 to 44.3; low-certainty evidence) and 91.3% for lethal anomalies (95% CI 83.9 to 95.5; moderate-certainty evidence), with an overall specificity of 99.9% (95% CI 99.9 to 100; low-certainty evidence). Two-stage screening had a combined sensitivity of 83.8% (95% CI 74.7 to 90.1; low-certainty evidence), while single-stage screening had a sensitivity of 50.5% (95% CI 38.5 to 62.4; very low-certainty evidence). The specificity of two-stage screening was 99.9% (95% CI 99.7 to 100; low-certainty evidence) and for single-stage screening, it was 99.8% (95% CI 99.2 to 100; moderate-certainty evidence). Indirect comparisons suggested superiority of two-stage screening across all analyses regarding sensitivity, with no significant difference in specificity. However, the certainty of the evidence is very low due to the absence of direct comparisons. AUTHORS' CONCLUSIONS: A first-trimester scan has the potential to detect lethal and certain severe anomalies with high accuracy before 14 weeks' gestation, despite its limited overall sensitivity. 
Conversely, two-stage screening shows high accuracy in detecting most fetal structural anomalies before 24 weeks' gestation with high sensitivity and specificity. In a hypothetical cohort of 100,000 fetuses, the first-trimester scan is expected to correctly identify 113 out of 124 fetuses with lethal anomalies (91.3%) and 665 out of 1776 fetuses with any anomaly (37.5%). However, 79 false-positive diagnoses are anticipated among 98,224 fetuses (0.08%). Two-stage screening is expected to correctly identify 1488 out of 1776 cases of structural anomalies overall (83.8%), with 118 false positives (0.1%). In contrast, single-stage screening is expected to correctly identify 896 out of 1776 cases before 24 weeks' gestation (50.5%), with 205 false-positive diagnoses (0.2%). This represents a difference of 592 fewer correct identifications and 88 more false positives compared to two-stage screening. However, it is crucial to acknowledge the uncertainty surrounding the additional benefits of two-stage versus single-stage screening, as there are no studies directly comparing them. Moreover, the evidence supporting the accuracy of first-trimester ultrasound and two-stage screening approaches primarily originates from studies conducted in single tertiary care facilities, which restricts the generalisability of the results of this meta-analysis to the broader population.
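The hypothetical-cohort figures follow from multiplying each summary sensitivity by the number of affected fetuses. The Python sketch below illustrates that arithmetic under stated assumptions: the helper names are hypothetical, the sensitivities are the rounded values printed in the abstract, and results can therefore differ by a case or so from the abstract's counts, which were presumably derived from unrounded estimates.

```python
def expected_detected(sensitivity, affected):
    """Expected number of affected fetuses correctly identified (rounded)."""
    return round(sensitivity * affected)

def false_positive_percent(false_positives, unaffected):
    """False positives as a percentage of unaffected fetuses."""
    return 100 * false_positives / unaffected

# Cohort composition as stated in the abstract (per 100,000 fetuses).
ANY_ANOMALY = 1776
LETHAL_ANOMALY = 124
UNAFFECTED = 100_000 - ANY_ANOMALY  # 98,224

first_trimester_lethal = expected_detected(0.913, LETHAL_ANOMALY)
two_stage = expected_detected(0.838, ANY_ANOMALY)
single_stage = expected_detected(0.505, ANY_ANOMALY)

print("First-trimester scan, lethal anomalies detected:", first_trimester_lethal)
print("Two-stage screening, anomalies detected:", two_stage)
print("Single-stage screening, anomalies detected:", single_stage)
print("Additional detections with two-stage screening:", two_stage - single_stage)
print(f"First-trimester false-positive percentage: "
      f"{false_positive_percent(79, UNAFFECTED):.2f}%")
```

Because the comparison between the two strategies is indirect, the difference printed here carries the same very low certainty that the abstract describes.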
Subject(s)
Pregnancy Trimester, First , Pregnancy Trimester, Second , Ultrasonography, Prenatal , Female , Humans , Pregnancy , Bias , Congenital Abnormalities/diagnostic imaging , Sensitivity and Specificity , Ultrasonography, Prenatal/statistics & numerical data
ABSTRACT
OBJECTIVES: To review the findings of studies that have evaluated the design and/or usability of key risk of bias (RoB) tools for the assessment of RoB in primary studies, as categorized by the Library of Assessment Tools and InsTruments Used to assess Data validity in Evidence Synthesis Network (a searchable library of RoB tools for evidence synthesis): Prediction model Risk Of Bias ASsessment Tool (PROBAST), Risk of Bias-2 (RoB2), Risk Of Bias In Non-randomised Studies of Interventions (ROBINS-I), Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2), Quality Assessment of Diagnostic Accuracy Studies-Comparative (QUADAS-C), Quality Assessment of Prognostic Accuracy Studies (QUAPAS), Risk Of Bias in Non-randomised Studies of Exposures (ROBINS-E), and the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) RoB checklist. STUDY DESIGN AND SETTING: Systematic review of methodological studies. We conducted a forward citation search from the primary report of each tool to identify primary studies that aimed to evaluate the design and/or usability of the tool. Two reviewers assessed studies for inclusion. We extracted tool features into Microsoft Word and used NVivo for document analysis, comprising a mix of deductive and inductive approaches. We summarized findings within each tool and explored common findings across tools. RESULTS: We identified 13 tool evaluations meeting our inclusion criteria: PROBAST (3), RoB2 (3), ROBINS-I (4), and QUADAS-2 (3). We identified no evaluations for the other tools. Evaluations varied in clinical topic area, methodology, approach to bias assessment, and tool user background. Some had limitations affecting generalizability. 
We identified common findings across tools for 6/14 themes: (1) challenging items (eg, RoB2/ROBINS-I "deviations from intended interventions" domain), (2) overall RoB judgment (concerns with overall risk calculation in PROBAST/ROBINS-I), (3) tool usability (concerns about complexity), (4) time to complete tool (varying demands on time, eg, depending on number of outcomes assessed), (5) user agreement (varied across tools), and (6) recommendations for future use (eg, piloting) and development (add intermediate domain answer to QUADAS-2/PROBAST; provide clearer guidance for all tools). Of the other eight themes, seven only had findings for the QUADAS-2 tool, limiting comparison across tools, and one ("reorganization of questions") had no findings. CONCLUSION: Evaluations of key RoB tools have posited common challenges and recommendations for tool use and development. These findings may be helpful to people who use or develop RoB tools. Guidance is necessary to support the design and implementation of future RoB tool evaluations.
Subject(s)
Bias , Humans , Research Design/standards
Subject(s)
Anemia, Sickle Cell , Neonatal Screening , Humans , Anemia, Sickle Cell/mortality , Anemia, Sickle Cell/epidemiology , Anemia, Sickle Cell/diagnosis , Anemia, Sickle Cell/complications , Infant, Newborn , Netherlands/epidemiology , Follow-Up Studies , Female , Male , Child , Child, Preschool , Infant , Adolescent
ABSTRACT
OBJECTIVES: To provide guidance on rating imprecision in a body of evidence assessing the accuracy of a single test. This guide will clarify when Grading of Recommendations Assessment, Development and Evaluation (GRADE) users should consider rating down the certainty of evidence by one or more levels for imprecision in test accuracy. STUDY DESIGN AND SETTING: A project group within the GRADE working group conducted iterative discussions and presentations at GRADE working group meetings to produce this guidance. RESULTS: Before rating the certainty of evidence, GRADE users should define the target of their certainty rating. GRADE recommends that users set judgment thresholds defining what they consider a very accurate, accurate, inaccurate, and very inaccurate test. These thresholds should be set after considering consequences of testing and effects on people-important outcomes. GRADE's primary criterion for judging imprecision in test accuracy evidence is considering confidence intervals (i.e., the CI approach) of absolute test accuracy results (true and false positive and negative results in a cohort of people). Based on the CI approach, when a CI appreciably crosses the predefined judgment threshold(s), one should consider rating down certainty of evidence by one or more levels, depending on the number of thresholds crossed. When the CI does not cross judgment threshold(s), GRADE suggests considering the sample size for an adequately powered test accuracy review (optimal information size [OIS] or review information size [RIS]) in rating imprecision. If the combined sample size of the included studies in the review is smaller than the required OIS/RIS, one should consider rating down by one or more levels for imprecision. 
CONCLUSION: This paper extends previous GRADE guidance for rating imprecision in single test accuracy systematic reviews and guidelines, with a focus on the circumstances in which one should consider rating down one or more levels for imprecision.
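The CI approach summarized in this abstract can be illustrated with a short sketch. This is a minimal illustration only, not GRADE's official procedure: the function names, the example thresholds, and the example numbers are hypothetical, and the strict comparison used here is a simplification of the judgment about whether a CI "appreciably" crosses a threshold.

```python
from typing import Optional, Sequence


def thresholds_crossed(ci_low: float, ci_high: float,
                       thresholds: Sequence[float]) -> int:
    """Count the judgment thresholds that lie strictly inside the CI."""
    return sum(1 for t in thresholds if ci_low < t < ci_high)


def imprecision_concern(ci_low: float, ci_high: float,
                        thresholds: Sequence[float],
                        sample_size: Optional[int] = None,
                        required_size: Optional[int] = None) -> str:
    """Return a hedged suggestion following the two-step logic of the abstract.

    Step 1: does the CI cross any predefined judgment threshold?
    Step 2: if not, is the combined sample size below the OIS/RIS?
    """
    crossed = thresholds_crossed(ci_low, ci_high, thresholds)
    if crossed > 0:
        return f"consider rating down: CI crosses {crossed} threshold(s)"
    if (sample_size is not None and required_size is not None
            and sample_size < required_size):
        return "consider rating down: sample size below OIS/RIS"
    return "no rating down suggested"


# Hypothetical example: false negatives per 1000 people tested, with
# made-up thresholds separating the four accuracy categories.
example_thresholds = [20, 50, 100]
print(imprecision_concern(30, 60, example_thresholds))
print(imprecision_concern(25, 40, example_thresholds,
                          sample_size=300, required_size=500))
```

How many levels to rate down remains a judgment for the guideline panel; the sketch only reports how many thresholds the interval spans.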
Subject(s)
GRADE Approach , Group Processes , Humans , Judgment , Sample Size
ABSTRACT
BACKGROUND: We aimed to summarize the published evidence on the fall-risk-reducing potential of cardiovascular diagnostics and treatments in older adults. METHODS: Design: scoping review and evidence map. DATA SOURCES: Medline and Embase. ELIGIBILITY CRITERIA: all available published evidence; key search concepts: "older adults," "cardiovascular evaluation," "cardiovascular intervention," and "falls." Studies reporting on the fall-risk-reducing effect of the diagnostic/treatment were included in the evidence map. Studies that investigated cardiovascular diagnostics or treatments within the context of falls, but without reporting a fall-related outcome, were included in the scoping review for qualitative synthesis. RESULTS: Two articles on cardiovascular diagnostics and eight articles on cardiovascular treatments were included in the evidence map. Six of the ten studies concerned pacemaker intervention, one of which was a meta-analysis that included randomized controlled trials with contradictory results. A combined cardiovascular assessment/evaluation (one study) and pharmacotherapy in orthostatic hypotension (one study) showed fall-reducing potential. The scoping review contained 40 articles on cardiovascular diagnostics and one on cardiovascular treatments. It provides an extensive overview of several diagnostics (e.g., orthostatic blood pressure measurements, heart rhythm assessment) useful in fall prevention. Diagnostics that could potentially provide added value in fall prevention (e.g., blood pressure variability and head turning) were also identified. CONCLUSION: Although the majority of studies showed a reduction in falls after the intervention, the total amount of evidence regarding the effect of cardiovascular diagnostics/treatments on falls is small. Our findings can be used to optimize fall prevention strategies and develop an evidence-based fall prevention care pathway.
In line with the recommendations of the World guidelines on fall prevention, it is crucial to undertake a standardized assessment of cardiovascular risk factors, followed by supplementary testing and corresponding interventions, as effective components of fall prevention strategies. In addition, accompanying diagnostics such as blood pressure variability and head turning can be of added value.
Subject(s)
Accidental Falls , Accidental Falls/prevention & control , Blood Pressure
ABSTRACT
OBJECTIVES: To facilitate informed decision making on participating in colorectal cancer (CRC) screening, we assessed the benefit-harm balance of CRC screening for a wide range of subgroups over different time horizons. METHODS: The study combined incidence proportions of benefits and harms of (not) participating in CRC screening estimated by the Adenoma and Serrated pathway to CAncer microsimulation model, a preference-eliciting survey, and benefit-harm balance modeling combining all outcomes to determine the net health benefit of CRC screening over 10, 20, and 30 years. Probability of net health benefit was estimated for 210 different subgroups based on age, sex, previous participation in CRC screening, and lifestyle. RESULTS: CRC screening was net beneficial in 183 of 210 subgroups over 30 years (median probability [MP] across subgroups of 0.79; interquartile range [IQR] 0.69-0.85). Net health benefit was greater for men (MP 0.82; IQR 0.69-0.89) than for women (MP 0.76; IQR 0.67-0.83) and for those without a history of participation in previous screenings (MP 0.84; IQR 0.80-0.89) compared with those with such a history (MP 0.69; IQR 0.59-0.75). Net health benefit decreased with increasing age, from an MP of 0.84 (IQR 0.80-0.86) at age 55 to 0.61 (IQR 0.56-0.71) at age 75. Shorter time horizons led to lower benefit, with an MP of 0.70 (IQR 0.62-0.80) over 20 years and 0.54 (IQR 0.48-0.67) over 10 years. CONCLUSIONS: Our benefit-harm analysis provides information about the net health benefit of screening participation, based on important characteristics and preferences of individuals, which could assist screening invitees in making informed decisions on screening participation.
Subject(s)
Colorectal Neoplasms , Early Detection of Cancer , Male , Humans , Female , Aged , Infant , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/epidemiology , Decision Making , Mass Screening
ABSTRACT
BACKGROUND: Clinical and laboratory diagnosis of cutaneous leishmaniasis (CL) is hampered by the under-ascertainment of cases by direct microscopy. METHODS: This study compared the diagnostic accuracy of qPCR on DNA extracted from filter paper with that of direct smear slide microscopy in participants presenting with a cutaneous lesion suspected of leishmaniasis to 16 rural healthcare centers in the Ecuadorian Amazon and Pacific regions, from January 2019 to June 2021. We used Bayesian latent class analysis to estimate test sensitivity, specificity, likelihood ratios (LR), and predictive values (PV) with their 95% credible intervals (95%CrI). The impact of sociodemographic and clinical characteristics on predictive values was assessed as a secondary objective. RESULTS: Of 320 initially included participants, paired valid test results were available and included in the diagnostic accuracy analysis for 129 from the Amazon and 185 from the Pacific region. We estimated sensitivity of 68% (95%CrI 49% to 82%) and 73% (95%CrI 73% to 83%) for qPCR, and 51% (95%CrI 36% to 66%) and 76% (95%CrI 65% to 86%) for microscopy in the Amazon and Pacific region, respectively. In the Amazon, with an estimated disease prevalence among participants of 73%, the negative PV was 54% (95%CrI 5% to 77%) for qPCR and 44% (95%CrI 4% to 65%) for microscopy. In the Pacific (prevalence 88%), the negative PV was 34% (95%CrI 3% to 58%) and 37% (95%CrI 3% to 63%), respectively. The addition of qPCR parallel to microscopy in the Amazon increased the observed prevalence from 38% to 64% (+26 [95%CrI 19 to 34] percentage points). CONCLUSION: The accuracy of either qPCR on DNA extracted from filter paper or microscopy for CL diagnosis as a stand-alone test seems to be unsatisfactory and region-dependent. We recommend further studies to confirm the clinically relevant increment found in the diagnostic yield due to the addition of qPCR.
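The low negative predictive values reported above follow from the high disease prevalence among participants. The relationship can be sketched with Bayes' theorem. This is an illustration only: the function name is hypothetical, the specificity used in the example is an assumed value (the abstract does not report specificity), and a plug-in calculation like this need not reproduce the posterior estimates of the study's Bayesian latent class analysis.

```python
import math
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) from sensitivity, specificity, and prevalence.

    All inputs are proportions between 0 and 1. A predictive value is
    returned as NaN when its denominator is zero.
    """
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be proportions between 0 and 1")

    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) > 0 else math.nan
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) > 0 else math.nan
    return ppv, npv


# Sensitivity (0.68) and prevalence (0.73) are the qPCR Amazon figures
# from the abstract; the specificity of 0.95 is an ASSUMED value.
ppv, npv = predictive_values(sensitivity=0.68, specificity=0.95, prevalence=0.73)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

With sensitivity and specificity held fixed, raising the prevalence lowers the NPV, which is consistent with the lower negative predictive values reported for the higher-prevalence Pacific region.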
Subject(s)
Cutaneous Leishmaniasis , Microscopy , Humans , Ecuador/epidemiology , Latent Class Analysis , Bayes Theorem , Cutaneous Leishmaniasis/diagnosis , Cutaneous Leishmaniasis/epidemiology , DNA , Sensitivity and Specificity
ABSTRACT
Objective: To evaluate open science policies of imaging journals, and compliance with these policies in published articles. Methods: From imaging journals listed we extracted open science policy details: protocol registration, reporting guidelines, funding, ethics and conflicts of interest (COI), data sharing, and open access publishing. The 10 most recently published studies from each journal were assessed to determine adherence to these policies. We summarized the proportion of open science policies met as an Open Science Score (OSS) for all journals and articles. We evaluated relationships between OSS and journal/article level variables. Results: 82 journals/820 articles were included. The OSS of journals and articles was 58.3% and 31.8%, respectively. Of the journals, 65.9% had registration and 78.1% had reporting guideline policies. 79.3% of journals were members of COPE, 81.7% had plagiarism policies, 100% required disclosure of funding, and 97.6% required disclosure of COI and ethics approval. 81.7% had data sharing policies and 15.9% were fully open access. 7.8% of articles had a registered protocol, 8.4% followed a reporting guideline, 77.4% disclosed funding, 88.7% disclosed COI, and 85.6% reported ethics approval. 12.3% of articles shared their data. 51% of articles were available through open access or as a preprint. OSS was higher for journals with DOAJ membership (80% vs 54.2%; P < .0001). Impact factor was not correlated with journal OSS. Knowledge synthesis articles had a higher OSS (44.5%) than prospective and retrospective studies (32.6% and 30.0%, respectively; P < .0001). Conclusion: Imaging journals endorsed just over half of open science practices considered; however, the application of these practices at the article level was lower.
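A score of this kind can be sketched as the percentage of assessed practices that a journal or article satisfies. This is a minimal sketch under stated assumptions: the function name, the item names, and the equal weighting of items are hypothetical, since the abstract does not give the exact item list or scoring rules.

```python
from typing import Mapping


def open_science_score(practices: Mapping[str, bool]) -> float:
    """Return the percentage of assessed practices that are satisfied.

    Each practice counts equally (an assumption; the abstract does not
    describe any weighting).
    """
    if not practices:
        raise ValueError("at least one practice must be assessed")
    met = sum(1 for satisfied in practices.values() if satisfied)
    return 100.0 * met / len(practices)


# Hypothetical article; the item names loosely follow the policy areas
# named in the abstract and are not the study's actual checklist.
example_article = {
    "registered_protocol": False,
    "reporting_guideline": False,
    "funding_disclosed": True,
    "coi_disclosed": True,
    "ethics_reported": True,
    "data_shared": False,
    "open_access_or_preprint": True,
}
print(f"OSS = {open_science_score(example_article):.1f}%")
```

Averaging such per-item percentages across journals or articles would give group-level figures comparable in form to those reported in the abstract.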
ABSTRACT
BACKGROUND: Although many predictive parameters have been studied, an internationally accepted, validated model to predict the clinical outcome of asphyxiated infants suffering from hypoxic-ischemic encephalopathy (HIE) is currently lacking. The aim of this study was to identify, appraise, and summarize available clinical prediction models, and to provide an overview of all investigated predictors for the outcome of death or neurodevelopmental impairment in this population. METHODS: A systematic literature search was performed in Medline and Embase. Two reviewers independently included eligible studies and extracted data. Quality was assessed using the PROBAST tool for prediction model studies and the QUIPS tool for predictor studies. RESULTS: A total of nine prediction models were included. These models were very heterogeneous in the number of predictors assessed, methods of model derivation, and primary outcomes. All studies had a high risk of bias following the PROBAST assessment and low applicability due to complex model presentation. A total of 104 predictor studies were included, showing tremendous heterogeneity in investigated predictors, timing of predictors, primary outcomes, results, and methodological quality according to QUIPS. Selected high-quality studies with accurate discriminating performance provide clinicians and researchers with an evidence map of predictors for prognostication after HIE in newborns. CONCLUSION: Given the low methodological quality of the currently published clinical prediction models, implementation into clinical practice is not yet possible. Therefore, there is an urgent need to develop a prediction model that complies with the PROBAST guideline. An overview of potential predictors to include in a prediction model is presented.
Subject(s)
Brain Hypoxia-Ischemia , Statistical Models , Infant , Newborn Infant , Humans , Prognosis , Brain Hypoxia-Ischemia/diagnosis
ABSTRACT
Background: This is a systematic review and meta-analysis of diagnostic test accuracy studies to assess the predictive value of both tuberculin skin test (TST) and interferon-gamma release assays (IGRA) for active tuberculosis (TB) among solid organ transplantation (SOT) recipients. Methods: Medline, Embase, and the CENTRAL databases were searched from 1946 until June 30, 2022. Two independent assessors extracted data from studies. Methodological quality of each publication was assessed using QUADAS-2. Sensitivity analyses were performed to investigate the effect of studies with high or low risk of bias. Results: A total of 43 studies (36 403 patients) with patients who were screened for latent TB infection (LTBI) and who underwent SOT were included: 18 were comparative and 25 noncomparative (19 TST, 6 QuantiFERON-TB Gold In-Tube [QFT-GIT]). For IGRA tests taken together, positive predictive value (PPV) and negative predictive value (NPV) were 1.2% and 99.6%, respectively. For TST, PPV was 2.13% and NPV was 95.5%. Overall, PPV was higher when TB burden was higher, regardless of test type, although still low in absolute terms. Incidence of active TB was similar between studies using LTBI prophylaxis (mean incidence 1.22%; 95% confidence interval [CI], 0.2179-2.221) and those not using prophylaxis (mean incidence 1.045%; 95% CI, 0.2731-1.817; P = .7717). Strengths of this study include the large number of studies available from multiple different countries; limitations include the absence of a gold standard for diagnosis of latent TB and the low incidence of active TB. Conclusions: We found that both TST and IGRA had a low PPV and high NPV for the development of active TB posttransplant. Further studies are needed to better understand how to prevent active TB in the SOT population.