ABSTRACT
BACKGROUND: Diagnosing people with a SARS-CoV-2 infection played a critical role in managing the COVID-19 pandemic and remains a priority for the transition to long-term management of COVID-19. Initial shortages of extraction and reverse transcription polymerase chain reaction (RT-PCR) reagents impaired the desired upscaling of testing in many countries, which led to the search for alternatives to RNA extraction/purification and RT-PCR testing. Reference standard methods for diagnosing the presence of SARS-CoV-2 infection rely primarily on real-time reverse transcription-polymerase chain reaction (RT-PCR). Alternatives to RT-PCR could, if sufficiently accurate, have a positive impact by expanding the range of diagnostic tools available for the timely identification of people infected by SARS-CoV-2, access to testing and the use of resources. OBJECTIVES: To assess the diagnostic accuracy of alternative (to RT-PCR assays) laboratory-based molecular tests for diagnosing SARS-CoV-2 infection. SEARCH METHODS: We searched the COVID-19 Open Access Project living evidence database from the University of Bern until 30 September 2020 and the WHO COVID-19 Research Database until 31 October 2022. We did not apply language restrictions. SELECTION CRITERIA: We included studies of people with suspected or known SARS-CoV-2 infection, or where tests were used to screen for infection, and studies evaluating commercially developed laboratory-based molecular tests for the diagnosis of SARS-CoV-2 infection considered as alternatives to RT-PCR testing. We also included all reference standards to define the presence or absence of SARS-CoV-2, including RT-PCR tests and established clinical diagnostic criteria. DATA COLLECTION AND ANALYSIS: Two authors independently screened studies and resolved disagreements by discussing them with a third author. Two authors independently extracted data and assessed the risk of bias and applicability of the studies using the QUADAS-2 tool. 
We presented sensitivity and specificity, with 95% confidence intervals (CIs), for each test using paired forest plots and summarised results using average sensitivity and specificity using a bivariate random-effects meta-analysis. We illustrated the findings per index test category and assay brand compared to the WHO's acceptable sensitivity and specificity threshold for diagnosing SARS-CoV-2 infection using nucleic acid tests. MAIN RESULTS: We included data from 64 studies reporting 94 cohorts of participants and 105 index test evaluations, with 74,753 samples and 7517 confirmed SARS-CoV-2 cases. We did not identify any published or preprint reports of accuracy for a considerable number of commercially produced NAAT assays. Most cohorts were judged at unclear or high risk of bias in more than three QUADAS-2 domains. Around half of the cohorts were considered at high risk of selection bias because of recruitment based on COVID status. Three quarters of 94 cohorts were at high risk of bias in the reference standard domain because of reliance on a single RT-PCR result to determine the absence of SARS-CoV-2 infection or were at unclear risk of bias due to a lack of clarity about the time interval between the index test assessment and the reference standard, the number of missing results, or the absence of a participant flow diagram. 
For index test categories with four or more evaluations, and where summary estimates were possible, we found that: a) for RT-PCR assays designed to omit/adapt RNA extraction/purification, the average sensitivity was 95.1% (95% CI 91.1% to 97.3%), and the average specificity was 99.7% (95% CI 98.5% to 99.9%; 27 evaluations, 2834 samples and 1178 SARS-CoV-2 cases); b) for RT-LAMP assays, the average sensitivity was 88.4% (95% CI 83.1% to 92.2%), and the average specificity was 99.7% (95% CI 98.7% to 99.9%; 24 evaluations, 29,496 samples and 2255 SARS-CoV-2 cases); c) for TMA assays, the average sensitivity was 97.6% (95% CI 95.2% to 98.8%), and the average specificity was 99.4% (95% CI 94.9% to 99.9%; 14 evaluations, 2196 samples and 942 SARS-CoV-2 cases); d) for digital PCR assays, the average sensitivity was 98.5% (95% CI 95.2% to 99.5%), and the average specificity was 91.4% (95% CI 60.4% to 98.7%; five evaluations, 703 samples and 354 SARS-CoV-2 cases); e) for RT-LAMP assays omitting/adapting RNA extraction, the average sensitivity was 73.1% (95% CI 58.4% to 84%), and the average specificity was 100% (95% CI 98% to 100%; 24 evaluations, 14,342 samples and 1502 SARS-CoV-2 cases). Only two index test categories fulfil the WHO-acceptable sensitivity and specificity requirements for SARS-CoV-2 nucleic acid tests: RT-PCR assays designed to omit/adapt RNA extraction/purification and TMA assays. In addition, WHO-acceptable performance criteria were met for two assays out of 35 when tests were used according to manufacturer instructions. At 5% prevalence using a cohort of 1000 people suspected of SARS-CoV-2 infection, the positive predictive value of RT-PCR assays omitting/adapting RNA extraction/purification will be 94%, with three in 51 positive results being false positives, and around two missed cases. For TMA assays, the positive predictive value will be 89%, with six in 55 positive results being false positives, and around one missed case.
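The predictive values quoted above follow from the summary sensitivity, summary specificity and an assumed prevalence. The sketch below is illustrative only: the function name is hypothetical, and rounding each cell to whole people before computing the positive predictive value is an assumption chosen because it reproduces the quoted figures, not a method stated in the abstract.

```python
def natural_frequencies(sensitivity, specificity, prevalence, cohort=1000):
    """Expected whole-person counts and PPV for a hypothetical cohort.

    Assumption: each cell is rounded to whole people first, and the PPV
    is then computed from the rounded counts.
    """
    cases = round(cohort * prevalence)
    non_cases = cohort - cases
    true_pos = round(sensitivity * cases)
    false_neg = cases - true_pos
    false_pos = round((1 - specificity) * non_cases)
    ppv = true_pos / (true_pos + false_pos)
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "ppv_percent": round(100 * ppv),
    }


# Summary estimates quoted in the abstract, at 5% prevalence.
extraction_free_rt_pcr = natural_frequencies(0.951, 0.997, 0.05)
tma = natural_frequencies(0.976, 0.994, 0.05)

print(extraction_free_rt_pcr)
print(tma)
```

With these inputs the extraction-free RT-PCR category gives 3 false positives among 51 positive results and 2 missed cases, and the TMA category gives 6 false positives among 55 positive results and 1 missed case.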
AUTHORS' CONCLUSIONS: Alternative laboratory-based molecular tests aim to enhance testing capacity in different ways, such as reducing the time, steps and resources needed to obtain valid results. Several index test technologies with these potential advantages have not been evaluated or have been assessed by only a few studies of limited methodological quality, so the performance of these kits was undetermined. Only two index test categories with enough evaluations for meta-analysis fulfil the WHO set of acceptable accuracy standards for SARS-CoV-2 nucleic acid tests: RT-PCR assays designed to omit/adapt RNA extraction/purification and TMA assays. These assays might prove to be suitable alternatives to RT-PCR for identifying people infected by SARS-CoV-2, especially when the alternative would be not having access to testing. However, these findings need to be interpreted and used with caution because of several limitations in the evidence, including reliance on retrospective samples without information about the symptom status of participants and the timing of assessment. No extrapolation of found accuracy data for these two alternatives to any test brands using the same techniques can be made as, for both groups, one test brand with high accuracy was overrepresented with 21/26 and 12/14 included studies, respectively. Although we used a comprehensive search and had broad eligibility criteria to include a wide range of tests that could be alternatives to RT-PCR methods, further research is needed to assess the performance of alternative COVID-19 tests and their role in pandemic management.
Subject(s)
COVID-19 Nucleic Acid Testing, COVID-19, Viral RNA, SARS-CoV-2, Sensitivity and Specificity, Humans, Bias, COVID-19/diagnosis, COVID-19 Nucleic Acid Testing/methods, False Negative Reactions, False Positive Reactions, Pandemics, Real-Time Polymerase Chain Reaction/methods, Reverse Transcriptase Polymerase Chain Reaction/methods, Reverse Transcriptase Polymerase Chain Reaction/standards, Viral RNA/analysis, SARS-CoV-2/genetics, SARS-CoV-2/isolation & purification
ABSTRACT
BACKGROUND: Identifying patients with COVID-19 disease who will deteriorate can be useful to assess whether they should receive intensive care, or whether they can be treated in a less intensive way or through outpatient care. In clinical care, routine laboratory markers, such as C-reactive protein, are used to assess a person's health status. OBJECTIVES: To assess the accuracy of routine blood-based laboratory tests to predict mortality and deterioration to severe or critical (from mild or moderate) COVID-19 in people with SARS-CoV-2. SEARCH METHODS: On 25 August 2022, we searched the Cochrane COVID-19 Study Register, encompassing searches of various databases such as MEDLINE via PubMed, CENTRAL, Embase, medRxiv, and ClinicalTrials.gov. We did not apply any language restrictions. SELECTION CRITERIA: We included studies of all designs that produced estimates of prognostic accuracy in participants who presented to outpatient services, or were admitted to general hospital wards with confirmed SARS-CoV-2 infection, and studies that were based on serum banks of samples from people. All routine blood-based laboratory tests performed during the first encounter were included. We included any reference standard used to define deterioration to severe or critical disease that was provided by the authors. DATA COLLECTION AND ANALYSIS: Two review authors independently extracted data from each included study, and independently assessed the methodological quality using the Quality Assessment of Prognostic Accuracy Studies tool. As studies reported different thresholds for the same test, we used the Hierarchical Summary Receiver Operator Curve model for meta-analyses to estimate summary curves in SAS 9.4. We estimated the sensitivity at points on the SROC curves that corresponded to the median and interquartile range boundaries of specificities in the included studies. 
Direct and indirect comparisons were conducted only for biomarkers with an estimated sensitivity and 95% CI of ≥ 50% at a specificity of ≥ 50%. The relative diagnostic odds ratio was calculated as a summary of the relative accuracy of these biomarkers. MAIN RESULTS: We identified a total of 64 studies, including 71,170 participants, of whom 8169 died and 4031 deteriorated to severe/critical condition. The studies assessed 53 different laboratory tests. For some tests, both increases and decreases relative to the normal range were included. There was important heterogeneity between tests and their cut-off values. None of the included studies had a low risk of bias or low concern for applicability for all domains. None of the tests included in this review demonstrated high sensitivity or specificity, or both. The five tests with summary sensitivity and specificity above 50% were: C-reactive protein increase, neutrophil-to-lymphocyte ratio increase, lymphocyte count decrease, d-dimer increase, and lactate dehydrogenase increase. Inflammation: For mortality, summary sensitivity of a C-reactive protein increase was 76% (95% CI 73% to 79%) at median specificity of 59% (low-certainty evidence). For deterioration, summary sensitivity was 78% (95% CI 67% to 86%) at median specificity of 72% (very low-certainty evidence). For the combined outcome of mortality or deterioration, or both, summary sensitivity was 70% (95% CI 49% to 85%) at median specificity of 60% (very low-certainty evidence). For mortality, summary sensitivity of an increase in neutrophil-to-lymphocyte ratio was 69% (95% CI 66% to 72%) at median specificity of 63% (very low-certainty evidence). For deterioration, summary sensitivity was 75% (95% CI 59% to 87%) at median specificity of 71% (very low-certainty evidence). For mortality, summary sensitivity of a decrease in lymphocyte count was 67% (95% CI 56% to 77%) at median specificity of 61% (very low-certainty evidence).
For deterioration, summary sensitivity of a decrease in lymphocyte count was 69% (95% CI 60% to 76%) at median specificity of 67% (very low-certainty evidence). For the combined outcome, summary sensitivity was 83% (95% CI 67% to 92%) at median specificity of 29% (very low-certainty evidence). For mortality, summary sensitivity of a lactate dehydrogenase increase was 82% (95% CI 66% to 91%) at median specificity of 60% (very low-certainty evidence). For deterioration, summary sensitivity of a lactate dehydrogenase increase was 79% (95% CI 76% to 82%) at median specificity of 66% (low-certainty evidence). For the combined outcome, summary sensitivity was 69% (95% CI 51% to 82%) at median specificity of 62% (very low-certainty evidence). Hypercoagulability: For mortality, summary sensitivity of a d-dimer increase was 70% (95% CI 64% to 76%) at median specificity of 56% (very low-certainty evidence). For deterioration, summary sensitivity was 65% (95% CI 56% to 74%) at median specificity of 63% (very low-certainty evidence). For the combined outcome, summary sensitivity was 65% (95% CI 52% to 76%) at median specificity of 54% (very low-certainty evidence). To predict mortality, neutrophil-to-lymphocyte ratio increase had higher accuracy compared to d-dimer increase (relative diagnostic odds ratio (RDOR) 2.05, 95% CI 1.30 to 3.24), C-reactive protein increase (RDOR 2.64, 95% CI 2.09 to 3.33), and lymphocyte count decrease (RDOR 2.63, 95% CI 1.55 to 4.46). D-dimer increase had higher accuracy compared to lymphocyte count decrease (RDOR 1.49, 95% CI 1.23 to 1.80), C-reactive protein increase (RDOR 1.31, 95% CI 1.03 to 1.65), and lactate dehydrogenase increase (RDOR 1.42, 95% CI 1.05 to 1.90). Additionally, lactate dehydrogenase increase had higher accuracy compared to lymphocyte count decrease (RDOR 1.30, 95% CI 1.13 to 1.49). To predict deterioration to severe disease, C-reactive protein increase had higher accuracy compared to d-dimer increase (RDOR 1.76, 95% CI 1.25 to 2.50).
The neutrophil-to-lymphocyte ratio increase had higher accuracy compared to d-dimer increase (RDOR 2.77, 95% CI 1.58 to 4.84). Lastly, lymphocyte count decrease had higher accuracy compared to d-dimer increase (RDOR 2.10, 95% CI 1.44 to 3.07) and lactate dehydrogenase increase (RDOR 2.22, 95% CI 1.52 to 3.26). AUTHORS' CONCLUSIONS: Laboratory tests, associated with hypercoagulability and hyperinflammatory response, were better at predicting severe disease and mortality in patients with SARS-CoV-2 compared to other laboratory tests. However, to safely rule out severe disease, tests should have high sensitivity (> 90%), and none of the identified laboratory tests met this criterion. In clinical practice, a more comprehensive assessment of a patient's health status is usually required by, for example, incorporating these laboratory tests into clinical prediction rules together with clinical symptoms, radiological findings, and patient's characteristics.
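For readers unfamiliar with the RDOR values reported above, the standard textbook definitions are given below. The subscripts A and B are placeholder labels for any two tests being compared. The abstract does not detail how the review estimated the RDOR within its meta-analytic model, so this is background only and not a description of the authors' computation.

```latex
\[
\mathrm{DOR}
  = \frac{\mathrm{TP} \times \mathrm{TN}}{\mathrm{FP} \times \mathrm{FN}}
  = \frac{\mathrm{sensitivity} / (1 - \mathrm{sensitivity})}
         {(1 - \mathrm{specificity}) / \mathrm{specificity}},
\qquad
\mathrm{RDOR}_{A \text{ vs } B} = \frac{\mathrm{DOR}_A}{\mathrm{DOR}_B}
\]
```

An RDOR above 1 indicates that test A discriminates better than test B, which is how the comparisons above are phrased.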
Subject(s)
C-Reactive Protein, COVID-19, SARS-CoV-2, Humans, COVID-19/mortality, COVID-19/blood, COVID-19/diagnosis, C-Reactive Protein/analysis, Biomarkers/blood, Prognosis, Clinical Deterioration, Bias, Pandemics, Sensitivity and Specificity, Severity of Illness Index, COVID-19 Testing/methods
ABSTRACT
BACKGROUND: Prenatal ultrasound is widely used to screen for structural anomalies before birth. While this is traditionally done in the second trimester, there is an increasing use of first-trimester ultrasound for early detection of lethal and certain severe structural anomalies. OBJECTIVES: To evaluate the diagnostic accuracy of ultrasound in detecting fetal structural anomalies before 14 and 24 weeks' gestation in low-risk and unselected pregnant women and to compare the current two main prenatal screening approaches: a single second-trimester scan (single-stage screening) and a first- and second-trimester scan combined (two-stage screening) in terms of anomaly detection before 24 weeks' gestation. SEARCH METHODS: We searched MEDLINE, EMBASE, Science Citation Index Expanded (Web of Science), Social Sciences Citation Index (Web of Science), Arts & Humanities Citation Index and Emerging Sources Citation Index (Web of Science) from 1 January 1997 to 22 July 2022. We limited our search to studies published after 1997 and excluded animal studies, reviews and case reports. No further restrictions were applied. We also screened reference lists and citing articles of each of the included studies. SELECTION CRITERIA: Studies were eligible if they included low-risk or unselected pregnant women undergoing a first- and/or second-trimester fetal anomaly scan, conducted at 11 to 14 or 18 to 24 weeks' gestation, respectively. The reference standard was detection of anomalies at birth or postmortem. DATA COLLECTION AND ANALYSIS: Two review authors independently undertook study selection, quality assessment (QUADAS-2), data extraction and evaluation of the certainty of evidence (GRADE approach). We used univariate random-effects logistic regression models for the meta-analysis of sensitivity and specificity. MAIN RESULTS: Eighty-seven studies covering 7,057,859 fetuses (including 25,202 with structural anomalies) were included. 
No study was deemed low risk across all QUADAS-2 domains. Main methodological concerns included risk of bias in the reference standard domain and risk of partial verification. Applicability concerns were common in studies evaluating first-trimester scans and two-stage screening in terms of patient selection due to frequent recruitment from single tertiary centres without exclusion of referrals. We reported ultrasound accuracy for fetal structural anomalies overall, by severity, affected organ system and for 46 specific anomalies. Detection rates varied widely across categories, with the highest estimates of sensitivity for thoracic and abdominal wall anomalies and the lowest for gastrointestinal anomalies across all tests. The summary sensitivity of a first-trimester scan was 37.5% for detection of structural anomalies overall (95% confidence interval (CI) 31.1 to 44.3; low-certainty evidence) and 91.3% for lethal anomalies (95% CI 83.9 to 95.5; moderate-certainty evidence), with an overall specificity of 99.9% (95% CI 99.9 to 100; low-certainty evidence). Two-stage screening had a combined sensitivity of 83.8% (95% CI 74.7 to 90.1; low-certainty evidence), while single-stage screening had a sensitivity of 50.5% (95% CI 38.5 to 62.4; very low-certainty evidence). The specificity of two-stage screening was 99.9% (95% CI 99.7 to 100; low-certainty evidence) and for single-stage screening, it was 99.8% (95% CI 99.2 to 100; moderate-certainty evidence). Indirect comparisons suggested superiority of two-stage screening across all analyses regarding sensitivity, with no significant difference in specificity. However, the certainty of the evidence is very low due to the absence of direct comparisons. AUTHORS' CONCLUSIONS: A first-trimester scan has the potential to detect lethal and certain severe anomalies with high accuracy before 14 weeks' gestation, despite its limited overall sensitivity. 
Conversely, two-stage screening detects most fetal structural anomalies before 24 weeks' gestation with high sensitivity and specificity. In a hypothetical cohort of 100,000 fetuses, the first-trimester scan is expected to correctly identify 113 out of 124 fetuses with lethal anomalies (91.3%) and 665 out of 1776 fetuses with any anomaly (37.5%). However, 79 false-positive diagnoses are anticipated among 98,224 fetuses (0.08%). Two-stage screening is expected to correctly identify 1488 out of 1776 cases of structural anomalies overall (83.8%), with 118 false positives (0.1%). In contrast, single-stage screening is expected to correctly identify 896 out of 1776 cases before 24 weeks' gestation (50.5%), with 205 false-positive diagnoses (0.2%). This represents a difference of 592 fewer correct identifications and 88 more false positives compared to two-stage screening. However, it is crucial to acknowledge the uncertainty surrounding the additional benefits of two-stage versus single-stage screening, as there are no studies directly comparing them. Moreover, the evidence supporting the accuracy of first-trimester ultrasound and two-stage screening approaches primarily originates from studies conducted in single tertiary care facilities, which restricts the generalisability of the results of this meta-analysis to the broader population.
Subject(s)
First Pregnancy Trimester, Second Pregnancy Trimester, Prenatal Ultrasonography, Female, Humans, Pregnancy, Bias, Congenital Abnormalities/diagnostic imaging, Sensitivity and Specificity, Prenatal Ultrasonography/statistics & numerical data
ABSTRACT
BACKGROUND: Echocardiogram is the reference standard for the diagnosis of haemodynamically significant patent ductus arteriosus (hsPDA) in preterm infants. A simple blood assay for brain natriuretic peptide (BNP) or amino-terminal pro-B-type natriuretic peptide (NT-proBNP) may be useful in the diagnosis and management of hsPDA, but a summary of the diagnostic accuracy has not been reviewed recently. OBJECTIVES: Primary objective: To determine the diagnostic accuracy of the cardiac biomarkers BNP and NT-proBNP for diagnosis of haemodynamically significant patent ductus arteriosus (hsPDA) in preterm neonates. Our secondary objectives were: to compare the accuracy of BNP and NT-proBNP; and to explore possible sources of heterogeneity among studies evaluating BNP and NT-proBNP, including type of commercial assay, chronological age of the infant at testing, gestational age at birth, whether used to initiate medical or surgical treatment, test threshold, and criteria of the reference standard (type of echocardiographic parameter used for diagnosis, clinical symptoms or physical signs if data were available). SEARCH METHODS: We searched the following databases in September 2021: MEDLINE, Embase, Cumulative Index to Nursing and Allied Health Literature (CINAHL) and Web of Science. We also searched clinical trial registries and conference abstracts. We checked references of included studies and conducted cited reference searches of included studies. We did not apply any language or date restrictions to the electronic searches or use methodological filters, so as to maximise sensitivity. SELECTION CRITERIA: We included prospective or retrospective, cohort or cross-sectional studies, which evaluated BNP or NT-proBNP (index tests) in preterm infants (participants) with suspected hsPDA (target condition) in comparison with echocardiogram (reference standard). 
DATA COLLECTION AND ANALYSIS: Two authors independently screened titles/abstracts and full texts, resolving any inclusion disagreements through discussion or with a third reviewer. We extracted data from included studies to create 2 × 2 tables. Two independent assessors performed quality assessment using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. We excluded studies that did not report data in sufficient detail to construct 2 × 2 tables, and where this information was not available from the primary investigators. We used bivariate and hierarchical summary receiver operating characteristic (HSROC) random-effects models for meta-analysis and generated summary receiver operating characteristic (SROC) curves. Since both BNP and NT-proBNP are continuous variables, sensitivity and specificity were reported at multiple thresholds. We dealt with the threshold effect by reporting summary ROC curves without summary points. MAIN RESULTS: We included 34 studies: 13 evaluated BNP and 21 evaluated NT-proBNP in the diagnosis of hsPDA. Studies varied by methodological quality, type of commercial assay, thresholds, age at testing, gestational age and whether the assay was used to initiate medical or surgical therapy. We noted some variability in the definition of hsPDA among the included studies. For BNP, the summary curve is reported in the ROC space (13 studies, 768 infants, low-certainty evidence). The estimated specificities from the ROC curve at fixed values of sensitivities at median (83%), lower and upper quartiles (79% and 92%) were 93.6% (95% confidence interval (CI) 77.8 to 98.4), 95.5% (95% CI 83.6 to 98.9) and 81.1% (95% CI 50.6 to 94.7), respectively. Subgroup comparisons revealed differences by type of assay and better diagnostic accuracy at lower threshold cut-offs (< 250 pg/ml compared to ≥ 250 pg/ml), testing at gestational age < 30 weeks and chronological age at testing at one to three days.
Data were insufficient for subgroup analysis of whether the BNP testing was indicated for medical or surgical management of PDA. For NT-proBNP, the summary ROC curve is reported in the ROC space (21 studies, 1459 infants, low-certainty evidence). The estimated specificities from the ROC curve at fixed values of sensitivities at median (92%), lower and upper quartiles (85% and 94%) were 83.6% (95% CI 73.3 to 90.5), 90.6% (95% CI 83.8 to 94.7) and 79.4% (95% CI 67.5 to 87.8), respectively. Subgroup analyses by threshold (< 6000 pg/ml and ≥ 6000 pg/ml) did not reveal any differences. Subgroup analysis by mean gestational age (< 30 weeks vs 30 weeks and above) showed better accuracy with < 30 weeks, and chronological age at testing (days one to three vs over three) showed testing at days one to three had better diagnostic accuracy. Data were insufficient for subgroup analysis of whether the NT-proBNP testing was indicated for medical or surgical management of PDA. We performed meta-regression for BNP and NT-proBNP using the covariates: assay type, threshold, mean gestational age and chronological age; none of the covariates significantly affected summary sensitivity and specificity. AUTHORS' CONCLUSIONS: Low-certainty evidence suggests that BNP and NT-proBNP have moderate accuracy in diagnosing hsPDA and may work best as a triage test to select infants for echocardiography. The studies evaluating the diagnostic accuracy of BNP and NT-proBNP for hsPDA varied considerably by assay characteristics (assay kit and threshold) and infant characteristics (gestational and chronological age); hence, generalisability between centres is not possible. We recommend that BNP or NT-proBNP assays be locally validated for specific populations and outcomes, to initiate therapy or follow response to therapy.
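The 2 × 2 tables mentioned above are the raw material for each study-level sensitivity and specificity at a given biomarker threshold. The sketch below shows that step for a single table. The counts are invented for illustration, the function names are hypothetical, and the Wilson score interval is an assumed choice of confidence interval; the abstract does not say which interval method the review used for individual studies.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity, with intervals, from one 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "sens_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "spec_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts for one study at one biomarker threshold (not review data).
result = accuracy_from_2x2(tp=45, fp=20, fn=5, tn=80)
print(result)
```

Because BNP and NT-proBNP are continuous, each study can yield several such tables, one per threshold, which is why the review summarises them as curves rather than single points.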
Subject(s)
Premature Infant, Brain Natriuretic Peptide, Humans, Infant, Newborn Infant, Cross-Sectional Studies, Prospective Studies, Retrospective Studies
ABSTRACT
BACKGROUND: The diagnostic challenges associated with the COVID-19 pandemic resulted in rapid development of diagnostic test methods for detecting SARS-CoV-2 infection. Serology tests to detect the presence of antibodies to SARS-CoV-2 enable detection of past infection and may detect cases of SARS-CoV-2 infection that were missed by earlier diagnostic tests. Understanding the diagnostic accuracy of serology tests for SARS-CoV-2 infection may enable development of effective diagnostic and management pathways, inform public health management decisions and understanding of SARS-CoV-2 epidemiology. OBJECTIVES: To assess the accuracy of antibody tests, firstly, to determine if a person presenting in the community, or in primary or secondary care has current SARS-CoV-2 infection according to time after onset of infection and, secondly, to determine if a person has previously been infected with SARS-CoV-2. Sources of heterogeneity investigated included: timing of test, test method, SARS-CoV-2 antigen used, test brand, and reference standard for non-SARS-CoV-2 cases. SEARCH METHODS: The COVID-19 Open Access Project living evidence database from the University of Bern (which includes daily updates from PubMed and Embase and preprints from medRxiv and bioRxiv) was searched on 30 September 2020. We included additional publications from the Evidence for Policy and Practice Information and Co-ordinating Centre (EPPI-Centre) 'COVID-19: Living map of the evidence' and the Norwegian Institute of Public Health 'NIPH systematic and living map on COVID-19 evidence'. We did not apply language restrictions. SELECTION CRITERIA: We included test accuracy studies of any design that evaluated commercially produced serology tests, targeting IgG, IgM, IgA alone, or in combination. Studies must have provided data for sensitivity, that could be allocated to a predefined time period after onset of symptoms, or after a positive RT-PCR test. 
Small studies with fewer than 25 SARS-CoV-2 infection cases were excluded. We included any reference standard to define the presence or absence of SARS-CoV-2 (including reverse transcription polymerase chain reaction tests (RT-PCR), clinical diagnostic criteria, and pre-pandemic samples). DATA COLLECTION AND ANALYSIS: We use standard screening procedures with three reviewers. Quality assessment (using the QUADAS-2 tool) and numeric study results were extracted independently by two people. Other study characteristics were extracted by one reviewer and checked by a second. We present sensitivity and specificity with 95% confidence intervals (CIs) for each test and, for meta-analysis, we fitted univariate random-effects logistic regression models for sensitivity by eligible time period and for specificity by reference standard group. Heterogeneity was investigated by including indicator variables in the random-effects logistic regression models. We tabulated results by test manufacturer and summarised results for tests that were evaluated in 200 or more samples and that met a modification of UK Medicines and Healthcare products Regulatory Agency (MHRA) target performance criteria. MAIN RESULTS: We included 178 separate studies (described in 177 study reports, with 45 as pre-prints) providing 527 test evaluations. The studies included 64,688 samples including 25,724 from people with confirmed SARS-CoV-2; most compared the accuracy of two or more assays (102/178, 57%). Participants with confirmed SARS-CoV-2 infection were most commonly hospital inpatients (78/178, 44%), and pre-pandemic samples were used by 45% (81/178) to estimate specificity. Over two-thirds of studies recruited participants based on known SARS-CoV-2 infection status (123/178, 69%). All studies were conducted prior to the introduction of SARS-CoV-2 vaccines and present data for naturally acquired antibody responses. 
Seventy-nine percent (141/178) of studies reported sensitivity by week after symptom onset and 66% (117/178) for convalescent phase infection. Studies evaluated enzyme-linked immunosorbent assays (ELISA) (165/527; 31%), chemiluminescent assays (CLIA) (167/527; 32%) or lateral flow assays (LFA) (188/527; 36%). Risk of bias was high because of participant selection (172, 97%); application and interpretation of the index test (35, 20%); weaknesses in the reference standard (38, 21%); and issues related to participant flow and timing (148, 82%). We judged that there were high concerns about the applicability of the evidence related to participants in 170 (96%) studies, and about the applicability of the reference standard in 162 (91%) studies. Average sensitivities for current SARS-CoV-2 infection increased by week after onset for all target antibodies. Average sensitivity for the combination of either IgG or IgM was 41.1% in week one (95% CI 38.1 to 44.2; 103 evaluations; 3881 samples, 1593 cases), 74.9% in week two (95% CI 72.4 to 77.3; 96 evaluations, 3948 samples, 2904 cases) and 88.0% by week three after onset of symptoms (95% CI 86.3 to 89.5; 103 evaluations, 2929 samples, 2571 cases). Average sensitivity during the convalescent phase of infection (up to a maximum of 100 days since onset of symptoms, where reported) was 89.8% for IgG (95% CI 88.5 to 90.9; 253 evaluations, 16,846 samples, 14,183 cases), 92.9% for IgG or IgM combined (95% CI 91.0 to 94.4; 108 evaluations, 3571 samples, 3206 cases) and 94.3% for total antibodies (95% CI 92.8 to 95.5; 58 evaluations, 7063 samples, 6652 cases). Average sensitivities for IgM alone followed a similar pattern but were of a lower test accuracy in every time slot. Average specificities were consistently high and precise, particularly for pre-pandemic samples which provide the least biased estimates of specificity (ranging from 98.6% for IgM to 99.8% for total antibodies). 
Subgroup analyses suggested small differences in sensitivity and specificity by test technology; however, heterogeneity in study results, timing of sample collection, and smaller sample numbers in some groups made comparisons difficult. For IgG, CLIAs were the most sensitive (convalescent-phase infection) and specific (pre-pandemic samples) compared to both ELISAs and LFAs (P < 0.001 for differences across test methods). The antigen(s) used (whether from the spike protein or nucleocapsid) appeared to have some effect on average sensitivity in the first weeks after onset but there was no clear evidence of an effect during convalescent-phase infection. Investigations of test performance by brand showed considerable variation in sensitivity between tests, and in results between studies evaluating the same test. For tests that were evaluated in 200 or more samples, the lower bound of the 95% CI for sensitivity was 90% or more for only a small number of tests (IgG, n = 5; IgG or IgM, n = 1; total antibodies, n = 4). More test brands met the MHRA minimum criteria for specificity of 98% or above (IgG, n = 16; IgG or IgM, n = 5; total antibodies, n = 7). Seven assays met the specified criteria for both sensitivity and specificity. In a low-prevalence (2%) setting, where antibody testing is used to diagnose COVID-19 in people with symptoms but who have had a negative PCR test, we would anticipate that 1 (1 to 2) case would be missed and 8 (5 to 15) would be falsely positive in 1000 people undergoing IgG or IgM testing in week three after onset of SARS-CoV-2 infection. In a seroprevalence survey, where prevalence of prior infection is 50%, we would anticipate that 51 (46 to 58) cases would be missed and 6 (5 to 7) would be falsely positive in 1000 people having IgG tests during the convalescent phase (21 to 100 days post-symptom onset or post-positive PCR) of SARS-CoV-2 infection.
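The missed-case figures for the seroprevalence scenario can be traced back to the convalescent-phase IgG sensitivity reported earlier in this abstract (89.8%, 95% CI 88.5 to 90.9). The sketch below is illustrative: the function name is hypothetical and half-up rounding to whole people is an assumption. It uses decimal arithmetic so that values such as 45.5 round predictably. The false-positive counts are left out because the specificity underlying them is not stated for this scenario.

```python
from decimal import Decimal, ROUND_HALF_UP


def missed_cases(sensitivity_pct, prevalence_pct, cohort=1000):
    """Expected false negatives in a cohort, rounded half-up to whole people."""
    cases = Decimal(cohort) * Decimal(prevalence_pct) / 100
    missed = cases * (100 - Decimal(sensitivity_pct)) / 100
    return int(missed.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# IgG sensitivity in the convalescent phase, at 50% prevalence of prior infection.
point = missed_cases("89.8", "50")
fewest = missed_cases("90.9", "50")  # upper confidence limit of sensitivity
most = missed_cases("88.5", "50")    # lower confidence limit of sensitivity

print(f"{point} ({fewest} to {most}) missed per 1000 tested")
```

Under these assumptions the result is 51 (46 to 58) missed cases per 1000 people tested, matching the figure quoted above.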
AUTHORS' CONCLUSIONS: Some antibody tests could be a useful diagnostic tool for those in whom molecular- or antigen-based tests have failed to detect the SARS-CoV-2 virus, including in those with ongoing symptoms of acute infection (from week three onwards) or those presenting with post-acute sequelae of COVID-19. However, antibody tests have an increasing likelihood of detecting an immune response to infection as time since onset of infection progresses and have demonstrated adequate performance for detection of prior infection for sero-epidemiological purposes. The applicability of results for detection of vaccination-induced antibodies is uncertain.
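The anticipated numbers of missed and falsely positive results per 1000 people tested follow directly from prevalence, sensitivity and specificity. The following is a minimal sketch of that arithmetic for the seroprevalence scenario, using the convalescent-phase IgG sensitivity of 89.8% reported in the results. The abstract does not state the IgG specificity used for this scenario, so the 98.8% below is an illustrative assumption, and the function name `per_thousand` is hypothetical.

```python
def per_thousand(prevalence, sensitivity, specificity, n=1000):
    """Return (missed cases, false positives) expected among n people tested."""
    infected = n * prevalence
    not_infected = n - infected
    missed = infected * (1 - sensitivity)          # false negatives
    false_pos = not_infected * (1 - specificity)   # false positives
    return round(missed), round(false_pos)


# Seroprevalence survey: 50% prior infection, IgG tests in the convalescent phase.
# Sensitivity 0.898 is quoted in the abstract; specificity 0.988 is an assumed value.
missed, false_pos = per_thousand(0.50, 0.898, 0.988)
print(f"missed: {missed}, false positives: {false_pos}")
```

With these inputs the sketch gives 51 missed cases per 1000, in line with the figure quoted for the seroprevalence survey; the false-positive count depends entirely on the assumed specificity.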
Subject(s)
COVID-19; SARS-CoV-2; Humans; COVID-19/diagnosis; COVID-19/epidemiology; Antibodies, Viral; Immunoglobulin G; COVID-19 Vaccines; Pandemics; Seroepidemiologic Studies; Immunoglobulin M
ABSTRACT
BACKGROUND: Good patient adherence to antiretroviral (ART) medication determines effective HIV viral suppression, and thus reduces the risk of progression and transmission of HIV. With accurate methods to monitor treatment adherence, we could use simple triage to target adherence support interventions that could help in the community or at health centres in resource-limited settings. OBJECTIVES: To determine the accuracy of simple measures of ART adherence (including patient self-report, tablet counts, pharmacy records, electronic monitoring, or composite methods) for detecting non-suppressed viral load in people living with HIV and receiving ART treatment. SEARCH METHODS: The Cochrane Infectious Diseases Group Information Specialists searched CENTRAL, MEDLINE, Embase, LILACS, CINAHL, African-Wide information, and Web of Science up to 22 April 2021. They also searched the World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) and ClinicalTrials.gov for ongoing studies. No restrictions were placed on the language or date of publication when searching the electronic databases. SELECTION CRITERIA: We included studies of all designs that evaluated a simple measure of adherence (index test) such as self-report, tablet counts, pharmacy records or secondary database analysis, or both, electronic monitoring or composite measures of any of those tests, in people living with HIV and receiving ART treatment. We used a viral load assay with a limit of detection ranging from 10 copies/mL to 400 copies/mL as the reference standard. We created 2 × 2 tables to calculate sensitivity and specificity. DATA COLLECTION AND ANALYSIS: We screened studies, extracted data, and assessed risk of bias using QUADAS-2 independently and in duplicate. We assessed the certainty of evidence using the GRADE method. The results of estimated sensitivity and specificity were presented using paired forest plots and tabulated summaries. 
We encountered a high level of variation among studies which precluded a meaningful meta-analysis or comparison of adherence measures. We explored heterogeneity using pre-defined subgroup analysis. MAIN RESULTS: We included 51 studies involving children and adults with HIV, mostly living in low- and middle-income settings, conducted between 2003 and 2021. Several studies assessed more than one index test, and the most common measure of adherence to ART was self-report.
- Self-report questionnaires (25 studies, 9211 participants; very low-certainty): sensitivity ranged from 10% to 85% and specificity ranged from 10% to 99%.
- Self-report using a visual analogue scale (VAS) (11 studies, 4235 participants; very low-certainty): sensitivity ranged from 0% to 58% and specificity ranged from 55% to 100%.
- Tablet counts (12 studies, 3466 participants; very low-certainty): sensitivity ranged from 0% to 100% and specificity ranged from 5% to 99%.
- Electronic monitoring devices (3 studies, 186 participants; very low-certainty): sensitivity ranged from 60% to 88% and the specificity ranged from 27% to 67%.
- Pharmacy records or secondary databases (6 studies, 2254 participants; very low-certainty): sensitivity ranged from 17% to 88% and the specificity ranged from 9% to 95%.
- Composite measures (9 studies, 1513 participants; very low-certainty): sensitivity ranged from 10% to 100% and specificity ranged from 49% to 100%.
Across all included studies, the ability of adherence measures to detect viral non-suppression showed a large variation in both sensitivity and specificity that could not be explained by subgroup analysis. We assessed the overall certainty of the evidence as very low due to risk of bias, indirectness, inconsistency, and imprecision. The risk of bias and the applicability concerns for patient selection, index test, and reference standard domains were generally low or unclear due to unclear reporting. 
The main methodological issues identified were related to flow and timing due to high numbers of missing data. For all index tests, we assessed the certainty of the evidence as very low due to limitations in the design and conduct of the studies, applicability concerns and inconsistency of results. AUTHORS' CONCLUSIONS: We encountered high variability for all index tests, and the overall certainty of evidence in all areas was very low. No measure consistently offered either a sufficiently high sensitivity or specificity to detect viral non-suppression. These concerns limit their value in triaging patients for viral load monitoring or enhanced adherence support interventions.
Subject(s)
Anti-Retroviral Agents; HIV Infections; Adult; Anti-Retroviral Agents/therapeutic use; Child; HIV Infections/complications; HIV Infections/drug therapy; Humans; Reference Standards; Sensitivity and Specificity; Viral Load
ABSTRACT
BACKGROUND: Accurate rapid diagnostic tests for SARS-CoV-2 infection would be a useful tool to help manage the COVID-19 pandemic. Testing strategies that use rapid antigen tests to detect current infection have the potential to increase access to testing, speed detection of infection, and inform clinical and public health management decisions to reduce transmission. This is the second update of this review, which was first published in 2020. OBJECTIVES: To assess the diagnostic accuracy of rapid, point-of-care antigen tests for diagnosis of SARS-CoV-2 infection. We consider accuracy separately in symptomatic and asymptomatic population groups. Sources of heterogeneity investigated included setting and indication for testing, assay format, sample site, viral load, age, timing of test, and study design. SEARCH METHODS: We searched the COVID-19 Open Access Project living evidence database from the University of Bern (which includes daily updates from PubMed and Embase and preprints from medRxiv and bioRxiv) on 08 March 2021. We included independent evaluations from national reference laboratories, FIND and the Diagnostics Global Health website. We did not apply language restrictions. SELECTION CRITERIA: We included studies of people with either suspected SARS-CoV-2 infection, known SARS-CoV-2 infection or known absence of infection, or those who were being screened for infection. We included test accuracy studies of any design that evaluated commercially produced, rapid antigen tests. We included evaluations of single applications of a test (one test result reported per person) and evaluations of serial testing (repeated antigen testing over time). Reference standards for presence or absence of infection were any laboratory-based molecular test (primarily reverse transcription polymerase chain reaction (RT-PCR)) or pre-pandemic respiratory sample. DATA COLLECTION AND ANALYSIS: We used standard screening procedures with three people. 
Two people independently carried out quality assessment (using the QUADAS-2 tool) and extracted study results. Other study characteristics were extracted by one review author and checked by a second. We present sensitivity and specificity with 95% confidence intervals (CIs) for each test, and pooled data using the bivariate model. We investigated heterogeneity by including indicator variables in the random-effects logistic regression models. We tabulated results by test manufacturer and compliance with manufacturer instructions for use and according to symptom status. MAIN RESULTS: We included 155 study cohorts (described in 166 study reports, with 24 as preprints). The main results relate to 152 evaluations of single test applications including 100,462 unique samples (16,822 with confirmed SARS-CoV-2). Studies were mainly conducted in Europe (101/152, 66%), and evaluated 49 different commercial antigen assays. Only 23 studies compared two or more brands of test. Risk of bias was high because of participant selection (40, 26%); interpretation of the index test (6, 4%); weaknesses in the reference standard for absence of infection (119, 78%); and participant flow and timing (41, 27%). Characteristics of participants (45, 30%) and index test delivery (47, 31%) differed from the way in which and in whom the test was intended to be used. Nearly all studies (91%) used a single RT-PCR result to define presence or absence of infection. The 152 studies of single test applications reported 228 evaluations of antigen tests. Estimates of sensitivity varied considerably between studies, with consistently high specificities. Average sensitivity was higher in symptomatic (73.0%, 95% CI 69.3% to 76.4%; 109 evaluations; 50,574 samples, 11,662 cases) compared to asymptomatic participants (54.7%, 95% CI 47.7% to 61.6%; 50 evaluations; 40,956 samples, 2641 cases). 
Average sensitivity was higher in the first week after symptom onset (80.9%, 95% CI 76.9% to 84.4%; 30 evaluations, 2408 cases) than in the second week of symptoms (53.8%, 95% CI 48.0% to 59.6%; 40 evaluations, 1119 cases). For those who were asymptomatic at the time of testing, sensitivity was higher when an epidemiological exposure to SARS-CoV-2 was suspected (64.3%, 95% CI 54.6% to 73.0%; 16 evaluations; 7677 samples, 703 cases) compared to where COVID-19 testing was reported to be widely available to anyone on presentation for testing (49.6%, 95% CI 42.1% to 57.1%; 26 evaluations; 31,904 samples, 1758 cases). Average specificity was similarly high for symptomatic (99.1%) or asymptomatic (99.7%) participants. We observed a steady decline in summary sensitivities as measures of sample viral load decreased. Sensitivity varied between brands. When tests were used according to manufacturer instructions, average sensitivities by brand ranged from 34.3% to 91.3% in symptomatic participants (20 assays with eligible data) and from 28.6% to 77.8% for asymptomatic participants (12 assays). For symptomatic participants, summary sensitivities for seven assays were 80% or more (meeting acceptable criteria set by the World Health Organization (WHO)). The WHO acceptable performance criterion of 97% specificity was met by 17 of 20 assays when tests were used according to manufacturer instructions, 12 of which demonstrated specificities above 99%. For asymptomatic participants the sensitivities of only two assays approached but did not meet WHO acceptable performance standards in one study each; specificities for asymptomatic participants were in a similar range to those observed for symptomatic people. At 5% prevalence using summary data in symptomatic people during the first week after symptom onset, the positive predictive value (PPV) of 89% means that 1 in 10 positive results will be a false positive, and around 1 in 5 cases will be missed. 
At 0.5% prevalence using summary data for asymptomatic people, where testing was widely available and where epidemiological exposure to COVID-19 was suspected, resulting PPVs would be 38% to 52%, meaning that between 2 in 5 and 1 in 2 positive results will be false positives, and between 1 in 2 and 1 in 3 cases will be missed. AUTHORS' CONCLUSIONS: Antigen tests vary in sensitivity. In people with signs and symptoms of COVID-19, sensitivities are highest in the first week of illness when viral loads are higher. Assays that meet appropriate performance standards, such as those set by WHO, could replace laboratory-based RT-PCR when immediate decisions about patient care must be made, or where RT-PCR cannot be delivered in a timely manner. However, they are more suitable for use as triage to RT-PCR testing. The variable sensitivity of antigen tests means that people who test negative may still be infected. Many commercially available rapid antigen tests have not been evaluated in independent validation studies. Evidence for testing in asymptomatic cohorts has increased; however, sensitivity is lower and there is a paucity of evidence for testing in different settings. Questions remain about the use of antigen test-based repeat testing strategies. Further research is needed to evaluate the effectiveness of screening programmes at reducing transmission of infection, whether mass screening or targeted approaches including schools, healthcare settings and traveller screening.
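The dependence of positive predictive value on prevalence can be made concrete with Bayes' rule. The following is a minimal sketch, assuming the summary sensitivities for asymptomatic participants quoted in the results (49.6% where testing was widely available, 64.3% where exposure was suspected) and the 99.7% average specificity for asymptomatic participants. The review's own PPVs rest on subgroup-specific summary estimates that the abstract does not report in full, so the outputs here may differ slightly from the 38% to 52% quoted. The function name `ppv` is hypothetical.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive predictive value: P(infected | positive test)."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


prevalence = 0.005  # 0.5%, as in the asymptomatic scenario
scenarios = {
    "testing widely available": 0.496,  # sensitivity quoted in the abstract
    "exposure suspected": 0.643,        # sensitivity quoted in the abstract
}
for label, sens in scenarios.items():
    value = ppv(prevalence, sens, 0.997)  # 0.997: asymptomatic average specificity
    # Share of cases missed is simply 1 - sensitivity, independent of prevalence.
    print(f"{label}: PPV {value:.1%}, cases missed {1 - sens:.1%}")
```

The share of cases missed (about 1 in 2 and about 1 in 3) depends only on sensitivity, whereas the PPV falls sharply as prevalence drops even when specificity is very high.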
Subject(s)
COVID-19; COVID-19/diagnosis; COVID-19 Testing; Humans; Pandemics; Point-of-Care Systems; SARS-CoV-2; Sensitivity and Specificity
ABSTRACT
BACKGROUND: Cellular tests for Lyme borreliosis might be able to overcome major shortcomings of serological testing, such as its low sensitivity in early stages of infection. Therefore, we aimed to assess the sensitivity and specificity of three cellular tests. METHODS: This was a nationwide, prospective, multiple-gate case-control study done in the Netherlands. Patients with physician-confirmed Lyme borreliosis, either early localised or disseminated, were consecutively included as cases at the start of antibiotic treatment. Controls were those without Lyme borreliosis from the general population (healthy controls) and those with potentially cross-reactive conditions (eg, autoimmune disease). We used three cellular tests for Lyme borreliosis (Spirofind Revised, iSpot Lyme, and LTT-MELISA) as index tests, and standard two-tier serological testing (STTT) as a comparator. Clinical data from Lyme borreliosis patients were collected at baseline and at 12 weeks after inclusion, and blood samples were obtained at baseline, 6 weeks, and 12 weeks. Control participants underwent clinical and laboratory assessments at baseline only. FINDINGS: Cases comprised 271 patients with Lyme borreliosis (of whom 245 had early-localised Lyme borreliosis and 26 had disseminated disease) and controls comprised 228 participants without Lyme borreliosis from the general population and 41 participants with potentially cross-reactive conditions. Recruitment occurred between May 14, 2018, and March 16, 2020. The specificity of STTT in healthy controls (216 of 228 samples [94·7%, 95% CI 91·5-97·7]) was higher than that of the cellular tests: Spirofind (140 of 171 [81·9%, 76·1-87·2]), iSpot Lyme (32 of 103 [31·1%, 21·5-40·3]) and LTT-MELISA (100 of 190 [52·6%, 44·9-60·3]). Cellular tests had varying sensitivities: Spirofind (88 of 204 [43·1%, 36·4-50·4]), iSpot Lyme (51 of 94 [54·3%, 44·5-63·7]), and LTT-MELISA (66 of 218 [30·3%, 23·8-36·7]). 
The Spirofind and iSpot Lyme outperformed STTT for sensitivity, but were similar to the C6-ELISA (C6-ELISA: 135 of 270 [50·0%, 44·5-55·5]; STTT: 76 of 270 [28·1%, 23·0-33·6]). INTERPRETATION: The cellular tests for Lyme borreliosis used in this study have a low specificity compared with serological tests, which leads to a high number of false-positive test results. We conclude that these cellular tests are unfit for clinical use at this stage. FUNDING: Netherlands Organization for Health Research and Development, AMC Foundation (Amsterdam UMC), and Ministry of Health of the Netherlands.
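The reported sensitivities and specificities are simple proportions of the counts given in the findings. The following is a minimal sketch that recomputes the point estimates from those counts; the dictionary layout and the helper name `pct` are illustrative choices, and the confidence intervals are not reproduced because the abstract does not state how they were calculated.

```python
# (positives among cases, cases tested) and (negatives among healthy controls, controls tested),
# copied from the counts reported in the findings.
counts = {
    "Spirofind":  {"sensitivity": (88, 204), "specificity": (140, 171)},
    "iSpot Lyme": {"sensitivity": (51, 94),  "specificity": (32, 103)},
    "LTT-MELISA": {"sensitivity": (66, 218), "specificity": (100, 190)},
    "STTT":       {"sensitivity": (76, 270), "specificity": (216, 228)},
}


def pct(numerator, denominator):
    """Proportion expressed as a percentage, rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


for test, result in counts.items():
    sens = pct(*result["sensitivity"])
    spec = pct(*result["specificity"])
    # False positives expected per 100 healthy controls follow from specificity.
    print(f"{test}: sensitivity {sens}%, specificity {spec}%, "
          f"false positives per 100 healthy controls {round(100 - spec, 1)}")
```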
Subject(s)
Lyme Disease; Antibodies, Bacterial; Case-Control Studies; Europe; Humans; Prospective Studies; Sensitivity and Specificity; Serologic Tests
ABSTRACT
BACKGROUND: COVID-19 illness is highly variable, ranging from infection with no symptoms through to pneumonia and life-threatening consequences. Symptoms such as fever, cough, or loss of sense of smell (anosmia) or taste (ageusia), can help flag early on if the disease is present. Such information could be used either to rule out COVID-19 disease, or to identify people who need to go for COVID-19 diagnostic tests. This is the second update of this review, which was first published in 2020. OBJECTIVES: To assess the diagnostic accuracy of signs and symptoms to determine if a person presenting in primary care or to hospital outpatient settings, such as the emergency department or dedicated COVID-19 clinics, has COVID-19. SEARCH METHODS: We undertook electronic searches up to 10 June 2021 in the University of Bern living search database. In addition, we checked repositories of COVID-19 publications. We used artificial intelligence text analysis to conduct an initial classification of documents. We did not apply any language restrictions. SELECTION CRITERIA: Studies were eligible if they included people with clinically suspected COVID-19, or recruited known cases with COVID-19 and also controls without COVID-19 from a single-gate cohort. Studies were eligible when they recruited people presenting to primary care or hospital outpatient settings. Studies that included people who contracted SARS-CoV-2 infection while admitted to hospital were not eligible. The minimum eligible sample size of studies was 10 participants. All signs and symptoms were eligible for this review, including individual signs and symptoms or combinations. We accepted a range of reference standards. DATA COLLECTION AND ANALYSIS: Pairs of review authors independently selected all studies, at both title and abstract, and full-text stage. They resolved any disagreements by discussion with a third review author. 
Two review authors independently extracted data and assessed risk of bias using the QUADAS-2 checklist, and resolved disagreements by discussion with a third review author. Analyses were restricted to prospective studies only. We presented sensitivity and specificity in paired forest plots, in receiver operating characteristic (ROC) space and in dumbbell plots. We estimated summary parameters using a bivariate random-effects meta-analysis whenever five or more primary prospective studies were available, and whenever heterogeneity across studies was deemed acceptable. MAIN RESULTS: We identified 90 studies; for this update we focused on the results of 42 prospective studies with 52,608 participants. Prevalence of COVID-19 disease varied from 3.7% to 60.6% with a median of 27.4%. Thirty-five studies were set in emergency departments or outpatient test centres (46,878 participants), three in primary care settings (1230 participants), two in a mixed population of in- and outpatients in a paediatric hospital setting (493 participants), and two overlapping studies in nursing homes (4007 participants). The studies did not clearly distinguish mild COVID-19 disease from COVID-19 pneumonia, so we present the results for both conditions together. Twelve studies had a high risk of bias for selection of participants because they used a high level of preselection to decide whether reverse transcription polymerase chain reaction (RT-PCR) testing was needed, or because they enrolled a non-consecutive sample, or because they excluded individuals while they were part of the study base. We rated 36 of the 42 studies as high risk of bias for the index tests because there was little or no detail on how, by whom and when, the symptoms were measured. 
For most studies, eligibility for testing was dependent on the local case definition and testing criteria that were in effect at the time of the study, meaning most people who were included in studies had already been referred to health services based on the symptoms that we are evaluating in this review. The applicability of the results of this review iteration improved in comparison with the previous reviews. This version has more studies of people presenting to ambulatory settings, which is where the majority of assessments for COVID-19 take place. Only three studies presented any data on children separately, and only one focused specifically on older adults. We found data on 96 symptoms or combinations of signs and symptoms. Evidence on individual signs as diagnostic tests was rarely reported, so this review reports mainly on the diagnostic value of symptoms. Results were highly variable across studies. Most had very low sensitivity and high specificity. RT-PCR was the most often used reference standard (40/42 studies). Only cough (11 studies) had a summary sensitivity above 50% (62.4%, 95% CI 50.6% to 72.9%); its specificity was low (45.4%, 95% CI 33.5% to 57.9%). Presence of fever had a sensitivity of 37.6% (95% CI 23.4% to 54.3%) and a specificity of 75.2% (95% CI 56.3% to 87.8%). The summary positive likelihood ratio of cough was 1.14 (95% CI 1.04 to 1.25) and that of fever 1.52 (95% CI 1.10 to 2.10). Sore throat had a summary positive likelihood ratio of 0.814 (95% CI 0.714 to 0.929), which means that its presence increases the probability of having an infectious disease other than COVID-19. Dyspnoea (12 studies) and fatigue (8 studies) had a sensitivity of 23.3% (95% CI 16.4% to 31.9%) and 40.2% (95% CI 19.4% to 65.1%) respectively. Their specificity was 75.7% (95% CI 65.2% to 83.9%) and 73.6% (95% CI 48.4% to 89.3%). 
The summary positive likelihood ratio of dyspnoea was 0.96 (95% CI 0.83 to 1.11) and that of fatigue 1.52 (95% CI 1.21 to 1.91), which means that the presence of fatigue slightly increases the probability of having COVID-19. Anosmia alone (7 studies), ageusia alone (5 studies), and anosmia or ageusia (6 studies) had summary sensitivities below 50% but summary specificities over 90%. Anosmia had a summary sensitivity of 26.4% (95% CI 13.8% to 44.6%) and a specificity of 94.2% (95% CI 90.6% to 96.5%). Ageusia had a summary sensitivity of 23.2% (95% CI 10.6% to 43.3%) and a specificity of 92.6% (95% CI 83.1% to 97.0%). Anosmia or ageusia had a summary sensitivity of 39.2% (95% CI 26.5% to 53.6%) and a specificity of 92.1% (95% CI 84.5% to 96.2%). The summary positive likelihood ratios of anosmia alone and anosmia or ageusia were 4.55 (95% CI 3.46 to 5.97) and 4.99 (95% CI 3.22 to 7.75) respectively, which is just below our arbitrary definition of a 'red flag', that is, a positive likelihood ratio of at least 5. The summary positive likelihood ratio of ageusia alone was 3.14 (95% CI 1.79 to 5.51). Twenty-four studies assessed combinations of different signs and symptoms, mostly combining olfactory symptoms. By combining symptoms with other information such as contact or travel history, age, gender, and a local recent case detection rate, some multivariable prediction scores reached a sensitivity as high as 90%. AUTHORS' CONCLUSIONS: Most individual symptoms included in this review have poor diagnostic accuracy. Neither absence nor presence of symptoms is accurate enough to rule in or rule out the disease. The presence of anosmia or ageusia may be useful as a red flag for the presence of COVID-19. The presence of cough also supports further testing. There is currently no evidence to support further testing with PCR in any individuals presenting only with upper respiratory symptoms such as sore throat, coryza or rhinorrhoea. 
Combinations of symptoms with other readily available information such as contact or travel history, or the local recent case detection rate may prove more useful and should be further investigated in an unselected population presenting to primary care or hospital outpatient settings. The diagnostic accuracy of symptoms for COVID-19 is moderate to low and any testing strategy using symptoms as a selection mechanism will result in both large numbers of missed cases and large numbers of people requiring testing. Which one of these is minimised is determined by the goal of COVID-19 testing strategies, that is, controlling the epidemic by isolating every possible case versus identifying those with clinically important disease so that they can be monitored or treated to optimise their prognosis. The former will require a testing strategy that uses very few symptoms as an entry criterion for testing; the latter could focus on more specific symptoms such as fever and anosmia.
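The positive likelihood ratios in the results are tied to sensitivity and specificity by LR+ = sensitivity / (1 - specificity), and a likelihood ratio converts a pre-test probability into a post-test probability through odds. The following is a minimal sketch using summary estimates quoted in the abstract; the function names are hypothetical. Ratios recomputed from rounded summary values can differ slightly from those estimated directly in the bivariate model (for example, anosmia or ageusia).

```python
def positive_lr(sensitivity, specificity):
    """Positive likelihood ratio: how much a positive finding raises the odds of disease."""
    return sensitivity / (1 - specificity)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Summary sensitivity and specificity as quoted in the abstract.
symptoms = {
    "cough": (0.624, 0.454),
    "fever": (0.376, 0.752),
    "anosmia": (0.264, 0.942),
    "ageusia": (0.232, 0.926),
}
median_prevalence = 0.274  # median prevalence across the included studies

for name, (sens, spec) in symptoms.items():
    lr = positive_lr(sens, spec)
    post = post_test_probability(median_prevalence, lr)
    print(f"{name}: LR+ {lr:.2f}, post-test probability {post:.1%}")
```

At the median prevalence of 27.4%, cough barely shifts the probability of COVID-19, whereas anosmia raises it to roughly 63%, which illustrates why only the olfactory symptoms approach the 'red flag' threshold.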
Subject(s)
Ageusia; COVID-19; Pharyngitis; Aged; Ageusia/complications; Anosmia/diagnosis; Anosmia/etiology; Artificial Intelligence; COVID-19/diagnosis; COVID-19/epidemiology; COVID-19 Testing; Child; Cough/etiology; Dyspnea; Fatigue/etiology; Fever/diagnosis; Fever/etiology; Hospitals; Humans; Outpatients; Primary Health Care; Prospective Studies; SARS-CoV-2; Sensitivity and Specificity
ABSTRACT
BACKGROUND: Our March 2021 edition of this review showed thoracic imaging computed tomography (CT) to be sensitive and moderately specific in diagnosing COVID-19 pneumonia. This new edition is an update of the review. OBJECTIVES: Our objectives were to evaluate the diagnostic accuracy of thoracic imaging in people with suspected COVID-19; assess the rate of positive imaging in people who had an initial reverse transcriptase polymerase chain reaction (RT-PCR) negative result and a positive RT-PCR result on follow-up; and evaluate the accuracy of thoracic imaging for screening COVID-19 in asymptomatic individuals. The secondary objective was to assess threshold effects of index test positivity on accuracy. SEARCH METHODS: We searched the COVID-19 Living Evidence Database from the University of Bern, the Cochrane COVID-19 Study Register, The Stephen B. Thacker CDC Library, and repositories of COVID-19 publications through to 17 February 2021. We did not apply any language restrictions. SELECTION CRITERIA: We included diagnostic accuracy studies of all designs, except for case-control, that recruited participants of any age group suspected to have COVID-19. Studies had to assess chest CT, chest X-ray, or ultrasound of the lungs for the diagnosis of COVID-19, use a reference standard that included RT-PCR, and report estimates of test accuracy or provide data from which we could compute estimates. We excluded studies that used imaging as part of the reference standard and studies that excluded participants with normal index test results. DATA COLLECTION AND ANALYSIS: The review authors independently and in duplicate screened articles, extracted data and assessed risk of bias and applicability concerns using QUADAS-2. We presented sensitivity and specificity per study on paired forest plots, and summarized pooled estimates in tables. We used a bivariate meta-analysis model where appropriate. MAIN RESULTS: We included 98 studies in this review. 
Of these, 94 were included for evaluating the diagnostic accuracy of thoracic imaging in the evaluation of people with suspected COVID-19. Eight studies were included for assessing the rate of positive imaging in individuals with initial RT-PCR negative results and positive RT-PCR results on follow-up, and 10 studies were included for evaluating the accuracy of thoracic imaging for imaging asymptomatic individuals. For all 98 included studies, risk of bias was high or unclear in 52 (53%) studies with respect to participant selection, in 64 (65%) studies with respect to reference standard, in 46 (47%) studies with respect to index test, and in 48 (49%) studies with respect to flow and timing. Concerns about the applicability of the evidence to: participants were high or unclear in eight (8%) studies; index test were high or unclear in seven (7%) studies; and reference standard were high or unclear in seven (7%) studies. Imaging in people with suspected COVID-19: We included 94 studies. Eighty-seven studies evaluated one imaging modality, and seven studies evaluated two imaging modalities. All studies used RT-PCR alone or in combination with other criteria (for example, clinical signs and symptoms, positive contacts) as the reference standard for the diagnosis of COVID-19. For chest CT (69 studies, 28,285 participants, 14,342 (51%) cases), sensitivities ranged from 45% to 100%, and specificities from 10% to 99%. The pooled sensitivity of chest CT was 86.9% (95% confidence interval (CI) 83.6 to 89.6), and pooled specificity was 78.3% (95% CI 73.7 to 82.3). Definition for index test positivity was a source of heterogeneity for sensitivity, but not specificity. Reference standard was not a source of heterogeneity. For chest X-ray (17 studies, 8529 participants, 5303 (62%) cases), the sensitivity ranged from 44% to 94% and specificity from 24% to 93%. The pooled sensitivity of chest X-ray was 73.1% (95% CI 64. to 80.5), and pooled specificity was 73.3% (95% CI 61.9 to 82.2). 
Definition for index test positivity and reference standard were not found to be sources of heterogeneity. For ultrasound of the lungs (15 studies, 2410 participants, 1158 (48%) cases), the sensitivity ranged from 73% to 94% and the specificity ranged from 21% to 98%. The pooled sensitivity of ultrasound was 88.9% (95% CI 84.9 to 92.0), and the pooled specificity was 72.2% (95% CI 58.8 to 82.5). Definition for index test positivity and reference standard were not found to be sources of heterogeneity. Indirect comparisons of modalities evaluated across all 94 studies indicated that chest CT and ultrasound gave higher sensitivity estimates than X-ray (P = 0.0003 and P = 0.001, respectively). Chest CT and ultrasound gave similar sensitivities (P = 0.42). All modalities had similar specificities (CT versus X-ray P = 0.36; CT versus ultrasound P = 0.32; X-ray versus ultrasound P = 0.89). Imaging in PCR-negative people who subsequently became positive: For rate of positive imaging in individuals with initial RT-PCR negative results, we included 8 studies (7 CT, 1 ultrasound) with a total of 198 participants suspected of having COVID-19, all of whom had a final diagnosis of COVID-19. Most studies (7/8) evaluated CT. Of 177 participants with initially negative RT-PCR who had positive RT-PCR results on follow-up testing, 75.8% (95% CI 45.3 to 92.2) had positive CT findings. Imaging in asymptomatic PCR-positive people: For imaging asymptomatic individuals, we included 10 studies (7 CT, 1 X-ray, 2 ultrasound) with a total of 3548 asymptomatic participants, of whom 364 (10%) had a final diagnosis of COVID-19. For chest CT (7 studies, 3134 participants, 315 (10%) cases), the pooled sensitivity was 55.7% (95% CI 35.4 to 74.3) and the pooled specificity was 91.1% (95% CI 82.6 to 95.7). 
AUTHORS' CONCLUSIONS: Chest CT and ultrasound of the lungs are sensitive and moderately specific in diagnosing COVID-19. Chest X-ray is moderately sensitive and moderately specific in diagnosing COVID-19. Thus, chest CT and ultrasound may have more utility for ruling out COVID-19 than for differentiating SARS-CoV-2 infection from other causes of respiratory illness. The uncertainty resulting from high or unclear risk of bias and the heterogeneity of included studies limit our ability to confidently draw conclusions based on our results.
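The statement that CT and ultrasound may be more useful for ruling out than for ruling in can be illustrated with negative and positive likelihood ratios computed from the pooled estimates for people with suspected COVID-19. The following is a minimal sketch under that framing; the review itself does not report likelihood ratios in this abstract, so these derived values and the function name `likelihood_ratios` are illustrative only.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Pooled sensitivity and specificity in people with suspected COVID-19, from the abstract.
modalities = {
    "chest CT": (0.869, 0.783),
    "chest X-ray": (0.731, 0.733),
    "lung ultrasound": (0.889, 0.722),
}

for name, (sens, spec) in modalities.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    # A smaller LR- means a negative result lowers the odds of disease more strongly.
    print(f"{name}: LR+ {lr_pos:.2f}, LR- {lr_neg:.2f}")
```

On these derived figures, a negative CT or ultrasound lowers the odds of COVID-19 by a factor of about six, while a negative chest X-ray lowers them by less than a factor of three.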
Subject(s)
COVID-19; COVID-19/diagnostic imaging; Humans; SARS-CoV-2; Sensitivity and Specificity; Tomography, X-Ray Computed; Ultrasonography
ABSTRACT
BACKGROUND: Systematic screening in high-burden settings is recommended as a strategy for early detection of pulmonary tuberculosis disease, reducing mortality, morbidity and transmission, and improving equity in access to care. Questioning for symptoms and chest radiography (CXR) have historically been the most widely available tools to screen for tuberculosis disease. Their accuracy is important for the design of tuberculosis screening programmes and determines, in combination with the accuracy of confirmatory diagnostic tests, the yield of a screening programme and the burden on individuals and the health service. OBJECTIVES: To assess the sensitivity and specificity of questioning for the presence of one or more tuberculosis symptoms or symptom combinations, CXR, and combinations of these as screening tools for detecting bacteriologically confirmed pulmonary tuberculosis disease in HIV-negative adults and adults with unknown HIV status who are considered eligible for systematic screening for tuberculosis disease. Second, to investigate sources of heterogeneity, especially in relation to regional, epidemiological, and demographic characteristics of the study populations. SEARCH METHODS: We searched the MEDLINE, Embase, LILACS, and HTA (Health Technology Assessment) databases using pre-specified search terms and consulted experts for unpublished reports, for the period 1992 to 2018. The search date was 10 December 2018. This search was repeated on 2 July 2021. SELECTION CRITERIA: Studies were eligible if participants were screened for tuberculosis disease using symptom questions, or abnormalities on CXR, or both, and were offered confirmatory testing with a reference standard. We included studies if diagnostic two-by-two tables could be generated for one or more index tests, even if not all participants were subjected to a microbacteriological reference standard. We excluded studies evaluating self-reporting of symptoms. 
DATA COLLECTION AND ANALYSIS: We categorized symptom and CXR index tests according to commonly used definitions. We assessed the methodological quality of included studies using the QUADAS-2 instrument. We examined the forest plots and receiver operating characteristic plots visually for heterogeneity. We estimated summary sensitivities and specificities (and 95% confidence intervals (CI)) for each index test using bivariate random-effects methods. We analyzed potential sources of heterogeneity in a hierarchical mixed-model. MAIN RESULTS: The electronic database search identified 9473 titles and abstracts. Through expert consultation, we identified 31 reports on national tuberculosis prevalence surveys as eligible (of which eight were already captured in the search of the electronic databases), and we identified 957 potentially relevant articles through reference checking. After removal of duplicates, we assessed 10,415 titles and abstracts, of which we identified 430 (4%) for full text review, whereafter we excluded 364 articles. In total, 66 articles provided data on 59 studies. We assessed the 2 July 2021 search results; seven studies were potentially eligible but would make no material difference to the review findings or grading of the evidence, and were not added in this edition of the review. We judged most studies at high risk of bias in one or more domains, most commonly because of incorporation bias and verification bias. We judged applicability concerns low in more than 80% of studies in all three domains. 
The three most common symptom index tests, cough for two or more weeks (41 studies), any cough (21 studies), and any tuberculosis symptom (29 studies), showed a summary sensitivity of 42.1% (95% CI 36.6% to 47.7%), 51.3% (95% CI 42.8% to 59.7%), and 70.6% (95% CI 61.7% to 78.2%, all very low-certainty evidence), and a specificity of 94.4% (95% CI 92.6% to 95.8%, high-certainty evidence), 87.6% (95% CI 81.6% to 91.8%, low-certainty evidence), and 65.1% (95% CI 53.3% to 75.4%, low-certainty evidence), respectively. The data on symptom index tests were more heterogeneous than those for CXR. The studies on any tuberculosis symptom were the most heterogeneous, but had the lowest number of variables explaining this variation. Symptom index tests also showed regional variation. The summary sensitivity of any CXR abnormality (23 studies) was 94.7% (95% CI 92.2% to 96.4%, very low-certainty evidence) and 84.8% (95% CI 76.7% to 90.4%, low-certainty evidence) for CXR abnormalities suggestive of tuberculosis (19 studies), and specificity was 89.1% (95% CI 85.6% to 91.8%, low-certainty evidence) and 95.6% (95% CI 92.6% to 97.4%, high-certainty evidence), respectively. Sensitivity was more heterogeneous than specificity, and could be explained by regional variation. The addition of cough for two or more weeks, whether to any (pulmonary) CXR abnormality or to CXR abnormalities suggestive of tuberculosis, resulted in a summary sensitivity and specificity of 99.2% (95% CI 96.8% to 99.8%) and 84.9% (95% CI 81.2% to 88.1%) (15 studies; certainty of evidence not assessed). AUTHORS' CONCLUSIONS: The summary estimates of the symptom and CXR index tests may inform the choice of screening and diagnostic algorithms in any given setting or country where screening for tuberculosis is being implemented. The high sensitivity of CXR index tests, with or without symptom questions in parallel, suggests a high yield of persons with tuberculosis disease.
However, additional considerations will determine the design of screening and diagnostic algorithms, such as the availability and accessibility of CXR facilities or the resources to fund them, and the need for more or fewer diagnostic tests to confirm the diagnosis (depending on screening test specificity), which also has resource implications. These review findings should be interpreted with caution due to methodological limitations in the included studies and regional variation in sensitivity and specificity. The sensitivity and specificity of an index test in a specific setting cannot be predicted with great precision due to heterogeneity. This should be borne in mind when planning for and implementing tuberculosis screening programmes.
Subject(s)
HIV Infections , Pulmonary Tuberculosis , Adult , Cough , HIV Infections/complications , Humans , Mass Screening , Radiography , Sensitivity and Specificity , Pulmonary Tuberculosis/diagnostic imaging , Pulmonary Tuberculosis/epidemiology
ABSTRACT
BACKGROUND: Accurate rapid diagnostic tests for SARS-CoV-2 infection could contribute to clinical and public health strategies to manage the COVID-19 pandemic. Point-of-care antigen and molecular tests to detect current infection could increase access to testing and early confirmation of cases, and expedite clinical and public health management decisions that may reduce transmission. OBJECTIVES: To assess the diagnostic accuracy of point-of-care antigen and molecular-based tests for diagnosis of SARS-CoV-2 infection. We consider accuracy separately in symptomatic and asymptomatic population groups. SEARCH METHODS: Electronic searches of the Cochrane COVID-19 Study Register and the COVID-19 Living Evidence Database from the University of Bern (which includes daily updates from PubMed and Embase and preprints from medRxiv and bioRxiv) were undertaken on 30 Sept 2020. We checked repositories of COVID-19 publications and included independent evaluations from national reference laboratories, the Foundation for Innovative New Diagnostics and the Diagnostics Global Health website to 16 Nov 2020. We did not apply language restrictions. SELECTION CRITERIA: We included studies of people with either suspected SARS-CoV-2 infection, known SARS-CoV-2 infection or known absence of infection, or those who were being screened for infection. We included test accuracy studies of any design that evaluated commercially produced, rapid antigen or molecular tests suitable for a point-of-care setting (minimal equipment, sample preparation, and biosafety requirements, with results within two hours of sample collection). We included all reference standards that define the presence or absence of SARS-CoV-2 (including reverse transcription polymerase chain reaction (RT-PCR) tests and established diagnostic criteria). DATA COLLECTION AND ANALYSIS: Studies were screened independently in duplicate with disagreements resolved by discussion with a third author.
Study characteristics were extracted by one author and checked by a second; extraction of study results and assessments of risk of bias and applicability (made using the QUADAS-2 tool) were undertaken independently in duplicate. We present sensitivity and specificity with 95% confidence intervals (CIs) for each test and pooled data using the bivariate model separately for antigen and molecular-based tests. We tabulated results by test manufacturer and compliance with manufacturer instructions for use and according to symptom status. MAIN RESULTS: Seventy-eight study cohorts were included (described in 64 study reports, including 20 pre-prints), reporting results for 24,087 samples (7,415 with confirmed SARS-CoV-2). Studies were mainly from Europe (n = 39) or North America (n = 20), and evaluated 16 antigen and five molecular assays. We considered risk of bias to be high in 29 (50%) studies because of participant selection; in 66 (85%) because of weaknesses in the reference standard for absence of infection; and in 29 (45%) for participant flow and timing. Studies of antigen tests were of a higher methodological quality compared to studies of molecular tests, particularly regarding the risk of bias for participant selection and the index test. Characteristics of participants in 35 (45%) studies differed from those in whom the test was intended to be used and the delivery of the index test in 39 (50%) studies differed from the way in which the test was intended to be used. Nearly all studies (97%) defined the presence or absence of SARS-CoV-2 based on a single RT-PCR result, and none included participants meeting case definitions for probable COVID-19. Antigen tests: Forty-eight studies reported 58 evaluations of antigen tests. Estimates of sensitivity varied considerably between studies.
There were differences between symptomatic (72.0%, 95% CI 63.7% to 79.0%; 37 evaluations; 15530 samples, 4410 cases) and asymptomatic participants (58.1%, 95% CI 40.2% to 74.1%; 12 evaluations; 1581 samples, 295 cases). Average sensitivity was higher in the first week after symptom onset (78.3%, 95% CI 71.1% to 84.1%; 26 evaluations; 5769 samples, 2320 cases) than in the second week of symptoms (51.0%, 95% CI 40.8% to 61.0%; 22 evaluations; 935 samples, 692 cases). Sensitivity was high in those with cycle threshold (Ct) values on PCR ≤25 (94.5%, 95% CI 91.0% to 96.7%; 36 evaluations; 2613 cases) compared to those with Ct values >25 (40.7%, 95% CI 31.8% to 50.3%; 36 evaluations; 2632 cases). Sensitivity varied between brands. Using data from instructions for use (IFU) compliant evaluations in symptomatic participants, summary sensitivities ranged from 34.1% (95% CI 29.7% to 38.8%; Coris Bioconcept) to 88.1% (95% CI 84.2% to 91.1%; SD Biosensor STANDARD Q). Average specificities were high in symptomatic and asymptomatic participants, and for most brands (overall summary specificity 99.6%, 95% CI 99.0% to 99.8%). At 5% prevalence using data for the most sensitive assays in symptomatic people (SD Biosensor STANDARD Q and Abbott Panbio), positive predictive values (PPVs) of 84% to 90% mean that between 1 in 10 and 1 in 6 positive results will be a false positive, and between 1 in 4 and 1 in 8 cases will be missed. At 0.5% prevalence applying the same tests in asymptomatic people would result in PPVs of 11% to 28% meaning that between 7 in 10 and 9 in 10 positive results will be false positives, and between 1 in 2 and 1 in 3 cases will be missed. No studies assessed the accuracy of repeated lateral flow testing or self-testing. Rapid molecular assays: Thirty studies reported 33 evaluations of five different rapid molecular tests. Sensitivities varied according to test brand. Most of the data relate to the ID NOW and Xpert Xpress assays.
Using data from evaluations following the manufacturer's instructions for use, the average sensitivity of ID NOW was 73.0% (95% CI 66.8% to 78.4%) and average specificity 99.7% (95% CI 98.7% to 99.9%; 4 evaluations; 812 samples, 222 cases). For Xpert Xpress, the average sensitivity was 100% (95% CI 88.1% to 100%) and average specificity 97.2% (95% CI 89.4% to 99.3%; 2 evaluations; 100 samples, 29 cases). Insufficient data were available to investigate the effect of symptom status or time after symptom onset. AUTHORS' CONCLUSIONS: Antigen tests vary in sensitivity. In people with signs and symptoms of COVID-19, sensitivities are highest in the first week of illness when viral loads are higher. The assays shown to meet appropriate criteria, such as WHO's priority target product profiles for COVID-19 diagnostics ('acceptable' sensitivity ≥ 80% and specificity ≥ 97%), can be considered as a replacement for laboratory-based RT-PCR when immediate decisions about patient care must be made, or where RT-PCR cannot be delivered in a timely manner. Positive predictive values suggest that confirmatory testing of those with positive results may be considered in low prevalence settings. Due to the variable sensitivity of antigen tests, people who test negative may still be infected. Evidence for testing in asymptomatic cohorts was limited. Test accuracy studies cannot adequately assess the ability of antigen tests to differentiate those who are infectious and require isolation from those who pose no risk, as there is no reference standard for infectiousness. A small number of molecular tests showed high accuracy and may be suitable alternatives to RT-PCR. However, further evaluations of the tests in settings as they are intended to be used are required to fully establish performance in practice. Several important studies in asymptomatic individuals have been reported since the close of our search and will be incorporated at the next update of this review. 
Comparative studies of antigen tests in their intended use settings and according to test operator (including self-testing) are required.
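The predictive values quoted in this abstract follow from Bayes' rule applied to sensitivity, specificity and prevalence. The following is a minimal sketch of that arithmetic, not the review's own calculation. The function name `predictive_values` is illustrative, and pairing the SD Biosensor STANDARD Q summary sensitivity (88.1%) with the overall summary specificity (99.6%) is an assumption made here. The review used brand-specific estimates, so the output is not expected to reproduce its reported PPV ranges.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence.

    All arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed inputs: one brand's summary sensitivity combined with the
# overall summary specificity (an illustrative pairing, see lead-in).
SENSITIVITY = 0.881
SPECIFICITY = 0.996

for prevalence in (0.05, 0.005):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With sensitivity and specificity held fixed, the PPV falls as prevalence falls, which is why the abstract suggests confirmatory testing of positive results in low prevalence settings.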
Subject(s)
Viral Antigens/analysis , COVID-19 Serological Testing/methods , COVID-19/diagnosis , Molecular Diagnostic Techniques/methods , Point-of-Care Systems , SARS-CoV-2/immunology , Adult , Asymptomatic Infections , Bias , COVID-19 Nucleic Acid Testing , COVID-19 Serological Testing/standards , Child , Cohort Studies , False Negative Reactions , False Positive Reactions , Humans , Molecular Diagnostic Techniques/standards , Predictive Value of Tests , Reference Standards , Sensitivity and Specificity
ABSTRACT
BACKGROUND: The respiratory illness caused by SARS-CoV-2 infection continues to present diagnostic challenges. Our 2020 edition of this review showed thoracic (chest) imaging to be sensitive and moderately specific in the diagnosis of coronavirus disease 2019 (COVID-19). In this update, we include new relevant studies, and have removed studies with case-control designs, and those not intended to be diagnostic test accuracy studies. OBJECTIVES: To evaluate the diagnostic accuracy of thoracic imaging (computed tomography (CT), X-ray and ultrasound) in people with suspected COVID-19. SEARCH METHODS: We searched the COVID-19 Living Evidence Database from the University of Bern, the Cochrane COVID-19 Study Register, The Stephen B. Thacker CDC Library, and repositories of COVID-19 publications through to 30 September 2020. We did not apply any language restrictions. SELECTION CRITERIA: We included studies of all designs, except for case-control, that recruited participants of any age group suspected to have COVID-19 and that reported estimates of test accuracy or provided data from which we could compute estimates. DATA COLLECTION AND ANALYSIS: The review authors independently and in duplicate screened articles, extracted data and assessed risk of bias and applicability concerns using the QUADAS-2 domain-list. We presented the results of estimated sensitivity and specificity using paired forest plots, and we summarised pooled estimates in tables. We used a bivariate meta-analysis model where appropriate. We presented the uncertainty of accuracy estimates using 95% confidence intervals (CIs). MAIN RESULTS: We included 51 studies with 19,775 participants suspected of having COVID-19, of whom 10,155 (51%) had a final diagnosis of COVID-19. Forty-seven studies evaluated one imaging modality each, and four studies evaluated two imaging modalities each. 
All studies used RT-PCR as the reference standard for the diagnosis of COVID-19, with 47 studies using only RT-PCR and four studies using a combination of RT-PCR and other criteria (such as clinical signs, imaging tests, positive contacts, and follow-up phone calls) as the reference standard. Studies were conducted in Europe (33), Asia (13), North America (3) and South America (2); including only adults (26), all ages (21), children only (1), adults over 70 years (1), and unclear (2); in inpatients (2), outpatients (32), and setting unclear (17). Risk of bias was high or unclear in thirty-two (63%) studies with respect to participant selection, 40 (78%) studies with respect to reference standard, 30 (59%) studies with respect to index test, and 24 (47%) studies with respect to participant flow. For chest CT (41 studies, 16,133 participants, 8110 (50%) cases), the sensitivity ranged from 56.3% to 100%, and specificity ranged from 25.4% to 97.4%. The pooled sensitivity of chest CT was 87.9% (95% CI 84.6 to 90.6) and the pooled specificity was 80.0% (95% CI 74.9 to 84.3). There was no statistical evidence indicating that reference standard conduct and definition for index test positivity were sources of heterogeneity for CT studies. Nine chest CT studies (2807 participants, 1139 (41%) cases) used the COVID-19 Reporting and Data System (CO-RADS) scoring system, which has five thresholds to define index test positivity. At a CO-RADS threshold of 5 (7 studies), the sensitivity ranged from 41.5% to 77.9% and the pooled sensitivity was 67.0% (95% CI 56.4 to 76.2); the specificity ranged from 83.5% to 96.2%; and the pooled specificity was 91.3% (95% CI 87.6 to 94.0). At a CO-RADS threshold of 4 (7 studies), the sensitivity ranged from 56.3% to 92.9% and the pooled sensitivity was 83.5% (95% CI 74.4 to 89.7); the specificity ranged from 77.2% to 90.4% and the pooled specificity was 83.6% (95% CI 80.5 to 86.4). 
For chest X-ray (9 studies, 3694 participants, 2111 (57%) cases) the sensitivity ranged from 51.9% to 94.4% and specificity ranged from 40.4% to 88.9%. The pooled sensitivity of chest X-ray was 80.6% (95% CI 69.1 to 88.6) and the pooled specificity was 71.5% (95% CI 59.8 to 80.8). For ultrasound of the lungs (5 studies, 446 participants, 211 (47%) cases) the sensitivity ranged from 68.2% to 96.8% and specificity ranged from 21.3% to 78.9%. The pooled sensitivity of ultrasound was 86.4% (95% CI 72.7 to 93.9) and the pooled specificity was 54.6% (95% CI 35.3 to 72.6). Based on an indirect comparison using all included studies, chest CT had a higher specificity than ultrasound. For indirect comparisons of chest CT and chest X-ray, or chest X-ray and ultrasound, the data did not show differences in specificity or sensitivity. AUTHORS' CONCLUSIONS: Our findings indicate that chest CT is sensitive and moderately specific for the diagnosis of COVID-19. Chest X-ray is moderately sensitive and moderately specific for the diagnosis of COVID-19. Ultrasound is sensitive but not specific for the diagnosis of COVID-19. Thus, chest CT and ultrasound may have more utility for excluding COVID-19 than for differentiating SARS-CoV-2 infection from other causes of respiratory illness. Future diagnostic accuracy studies should pre-define positive imaging findings, include direct comparisons of the various modalities of interest in the same participant population, and implement improved reporting practices.
Subject(s)
COVID-19/diagnostic imaging , Thoracic Radiography , X-Ray Computed Tomography , Ultrasonography , Adolescent , Adult , Aged , Bias , COVID-19 Nucleic Acid Testing/standards , Child , Confidence Intervals , Humans , Lung/diagnostic imaging , Middle Aged , Thoracic Radiography/standards , Thoracic Radiography/statistics & numerical data , Reference Standards , Sensitivity and Specificity , X-Ray Computed Tomography/standards , X-Ray Computed Tomography/statistics & numerical data , Ultrasonography/standards , Ultrasonography/statistics & numerical data , Young Adult
ABSTRACT
BACKGROUND: The clinical implications of SARS-CoV-2 infection are highly variable. Some people with SARS-CoV-2 infection remain asymptomatic, whilst the infection can cause mild to moderate COVID-19 and COVID-19 pneumonia in others. This can lead to some people requiring intensive care support and, in some cases, to death, especially in older adults. Symptoms such as fever, cough, or loss of smell or taste, and signs such as oxygen saturation are the first and most readily available diagnostic information. Such information could be used to either rule out COVID-19, or select patients for further testing. This is an update of this review, the first version of which was published in July 2020. OBJECTIVES: To assess the diagnostic accuracy of signs and symptoms to determine if a person presenting in primary care or to hospital outpatient settings, such as the emergency department or dedicated COVID-19 clinics, has COVID-19. SEARCH METHODS: For this review iteration we undertook electronic searches up to 15 July 2020 in the Cochrane COVID-19 Study Register and the University of Bern living search database. In addition, we checked repositories of COVID-19 publications. We did not apply any language restrictions. SELECTION CRITERIA: Studies were eligible if they included patients with clinically suspected COVID-19, or if they recruited known cases with COVID-19 and controls without COVID-19. Studies were eligible when they recruited patients presenting to primary care or hospital outpatient settings. Studies in hospitalised patients were only included if symptoms and signs were recorded on admission or at presentation. Studies including patients who contracted SARS-CoV-2 infection while admitted to hospital were not eligible. The minimum eligible sample size of studies was 10 participants. All signs and symptoms were eligible for this review, including individual signs and symptoms or combinations. We accepted a range of reference standards.
DATA COLLECTION AND ANALYSIS: Pairs of review authors independently selected all studies, at both title and abstract stage and full-text stage. They resolved any disagreements by discussion with a third review author. Two review authors independently extracted data and resolved disagreements by discussion with a third review author. Two review authors independently assessed risk of bias using the Quality Assessment tool for Diagnostic Accuracy Studies (QUADAS-2) checklist. We presented sensitivity and specificity in paired forest plots, in receiver operating characteristic space and in dumbbell plots. We estimated summary parameters using a bivariate random-effects meta-analysis whenever five or more primary studies were available, and whenever heterogeneity across studies was deemed acceptable. MAIN RESULTS: We identified 44 studies including 26,884 participants in total. Prevalence of COVID-19 varied from 3% to 71% with a median of 21%. There were three studies from primary care settings (1824 participants), nine studies from outpatient testing centres (10,717 participants), 12 studies performed in hospital outpatient wards (5061 participants), seven studies in hospitalised patients (1048 participants), 10 studies in the emergency department (3173 participants), and three studies in which the setting was not specified (5061 participants). The studies did not clearly distinguish mild from severe COVID-19, so we present the results for all disease severities together. Fifteen studies had a high risk of bias for selection of participants because inclusion in the studies depended on the applicable testing and referral protocols, which included many of the signs and symptoms under study in this review. This may have especially influenced the sensitivity of those features used in referral protocols, such as fever and cough. Five studies only included participants with pneumonia on imaging, suggesting that this is a highly selected population. 
In an additional 12 studies, we were unable to assess the risk for selection bias. This makes it very difficult to judge the validity of the diagnostic accuracy of the signs and symptoms from these included studies. The applicability of the results of this review update improved in comparison with the original review. A greater proportion of studies included participants who presented to outpatient settings, which is where the majority of clinical assessments for COVID-19 take place. However, still none of the studies presented any data on children separately, and only one focused specifically on older adults. We found data on 84 signs and symptoms. Results were highly variable across studies. Most had very low sensitivity and high specificity. Only cough (25 studies) and fever (7 studies) had a pooled sensitivity of at least 50% but specificities were moderate to low. Cough had a sensitivity of 67.4% (95% confidence interval (CI) 59.8% to 74.1%) and specificity of 35.0% (95% CI 28.7% to 41.9%). Fever had a sensitivity of 53.8% (95% CI 35.0% to 71.7%) and a specificity of 67.4% (95% CI 53.3% to 78.9%). The pooled positive likelihood ratio of cough was only 1.04 (95% CI 0.97 to 1.11) and that of fever 1.65 (95% CI 1.41 to 1.93). Anosmia alone (11 studies), ageusia alone (6 studies), and anosmia or ageusia (6 studies) had sensitivities below 50% but specificities over 90%. Anosmia had a pooled sensitivity of 28.0% (95% CI 17.7% to 41.3%) and a specificity of 93.4% (95% CI 88.3% to 96.4%). Ageusia had a pooled sensitivity of 24.8% (95% CI 12.4% to 43.5%) and a specificity of 91.4% (95% CI 81.3% to 96.3%). Anosmia or ageusia had a pooled sensitivity of 41.0% (95% CI 27.0% to 56.6%) and a specificity of 90.5% (95% CI 81.2% to 95.4%). 
The pooled positive likelihood ratios of anosmia alone and anosmia or ageusia were 4.25 (95% CI 3.17 to 5.71) and 4.31 (95% CI 3.00 to 6.18) respectively, which is just below our arbitrary definition of a 'red flag', that is, a positive likelihood ratio of at least 5. The pooled positive likelihood ratio of ageusia alone was only 2.88 (95% CI 2.02 to 4.09). Only two studies assessed combinations of different signs and symptoms, mostly combining fever and cough with other symptoms. These combinations had a specificity above 80%, but at the cost of very low sensitivity (< 30%). AUTHORS' CONCLUSIONS: The majority of individual signs and symptoms included in this review appear to have very poor diagnostic accuracy, although this should be interpreted in the context of selection bias and heterogeneity between studies. Based on currently available data, neither absence nor presence of signs or symptoms is accurate enough to rule in or rule out COVID-19. The presence of anosmia or ageusia may be useful as a red flag for COVID-19. The presence of fever or cough, given their high sensitivities, may also be useful to identify people for further testing. Prospective studies in an unselected population presenting to primary care or hospital outpatient settings, examining combinations of signs and symptoms to evaluate the syndromic presentation of COVID-19, are still urgently needed. Results from such studies could inform subsequent management decisions.
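The positive likelihood ratios in this abstract can be approximated from the pooled sensitivity and specificity as LR+ = sensitivity / (1 - specificity). The following is a minimal sketch of that relationship. The function name `positive_likelihood_ratio` and the `RED_FLAG_THRESHOLD` constant are illustrative, and plugging pooled point estimates into the formula is a simplification assumed here: the review derived its pooled ratios from the meta-analysis model, so small differences from the reported values are expected.

```python
RED_FLAG_THRESHOLD = 5.0  # the review's stated 'red flag' cut-off for LR+


def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = P(sign present | disease) / P(sign present | no disease)."""
    return sensitivity / (1 - specificity)


# Pooled (sensitivity, specificity) point estimates quoted in the abstract.
pooled_estimates = {
    "cough": (0.674, 0.350),
    "fever": (0.538, 0.674),
    "anosmia": (0.280, 0.934),
}

for sign, (sensitivity, specificity) in pooled_estimates.items():
    lr_plus = positive_likelihood_ratio(sensitivity, specificity)
    is_red_flag = lr_plus >= RED_FLAG_THRESHOLD
    print(f"{sign}: LR+ = {lr_plus:.2f}, red flag: {is_red_flag}")
```

A sign with LR+ close to 1, such as cough here, barely shifts the probability of disease when present, however sensitive it is.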
Subject(s)
Ambulatory Care , COVID-19/diagnosis , Primary Health Care , SARS-CoV-2 , Symptom Assessment , Ageusia/diagnosis , Ageusia/etiology , Anosmia/diagnosis , Anosmia/etiology , Arthralgia/diagnosis , Arthralgia/etiology , Bias , COVID-19/complications , COVID-19/epidemiology , Cough/diagnosis , Cough/etiology , Diarrhea/diagnosis , Diarrhea/etiology , Dyspnea/diagnosis , Dyspnea/etiology , Fatigue/diagnosis , Fatigue/etiology , Fever/diagnosis , Fever/etiology , Headache/diagnosis , Headache/etiology , Humans , Myalgia/diagnosis , Myalgia/etiology , Hospital Outpatient Clinics/statistics & numerical data , Pandemics , Physical Examination , Selection Bias , Symptom Assessment/classification , Symptom Assessment/statistics & numerical data
ABSTRACT
BACKGROUND: The respiratory illness caused by SARS-CoV-2 infection continues to present diagnostic challenges. Early research showed thoracic (chest) imaging to be sensitive but not specific in the diagnosis of coronavirus disease 2019 (COVID-19). However, this is a rapidly developing field and these findings need to be re-evaluated in the light of new research. This is the first update of this 'living systematic review'. This update focuses on people suspected of having COVID-19 and excludes studies with only confirmed COVID-19 participants. OBJECTIVES: To evaluate the diagnostic accuracy of thoracic imaging (computed tomography (CT), X-ray and ultrasound) in people with suspected COVID-19. SEARCH METHODS: We searched the COVID-19 Living Evidence Database from the University of Bern, the Cochrane COVID-19 Study Register, The Stephen B. Thacker CDC Library, and repositories of COVID-19 publications through to 22 June 2020. We did not apply any language restrictions. SELECTION CRITERIA: We included studies of all designs that recruited participants of any age group suspected to have COVID-19, and which reported estimates of test accuracy, or provided data from which estimates could be computed. When studies used a variety of reference standards, we retained the classification of participants as COVID-19 positive or negative as used in the study. DATA COLLECTION AND ANALYSIS: We screened studies, extracted data, and assessed the risk of bias and applicability concerns using the QUADAS-2 domain-list independently, in duplicate. 
We categorised included studies into three groups based on classification of index test results: studies that reported specific criteria for index test positivity (group 1); studies that did not report specific criteria, but had the test reader(s) explicitly classify the imaging test result as either COVID-19 positive or negative (group 2); and studies that reported an overview of index test findings, without explicitly classifying the imaging test as either COVID-19 positive or negative (group 3). We presented the results of estimated sensitivity and specificity using paired forest plots, and summarised in tables. We used a bivariate meta-analysis model where appropriate. We presented uncertainty of the accuracy estimates using 95% confidence intervals (CIs). MAIN RESULTS: We included 34 studies: 30 were cross-sectional studies with 8491 participants suspected of COVID-19, of which 4575 (54%) had a final diagnosis of COVID-19; four were case-control studies with 848 cases and controls in total, of which 464 (55%) had a final diagnosis of COVID-19. Chest CT was evaluated in 31 studies (8014 participants, 4224 (53%) cases), chest X-ray in three studies (1243 participants, 784 (63%) cases), and ultrasound of the lungs in one study (100 participants, 31 (31%) cases). Twenty-six per cent (9/34) of all studies were available only as preprints. Nineteen studies were conducted in Asia, 10 in Europe, four in North America and one in Australia. Sixteen studies included only adults, 15 studies included both adults and children and one included only children. Two studies did not report the ages of participants. Twenty-four studies included inpatients, four studies included outpatients, while the remaining six studies were conducted in unclear settings. The majority of included studies had a high or unclear risk of bias with respect to participant selection, index test, reference standard, and participant flow. 
For chest CT in suspected COVID-19 participants (31 studies, 8014 participants, 4224 (53%) cases) the sensitivity ranged from 57.4% to 100%, and specificity ranged from 0% to 96.0%. The pooled sensitivity of chest CT in suspected COVID-19 participants was 89.9% (95% CI 85.7 to 92.9) and the pooled specificity was 61.1% (95% CI 42.3 to 77.1). Sensitivity analyses showed that when the studies from China were excluded, the studies from other countries demonstrated higher specificity compared to the overall included studies. When studies that did not classify index tests as positive or negative for COVID-19 (group 3) were excluded, the remaining studies (groups 1 and 2) demonstrated higher specificity compared to the overall included studies. Sensitivity analyses limited to cross-sectional studies, or studies where at least two reverse transcriptase polymerase chain reaction (RT-PCR) tests were conducted if the first was negative, did not substantively alter the accuracy estimates. We did not identify publication status as a source of heterogeneity. For chest X-ray in suspected COVID-19 participants (3 studies, 1243 participants, 784 (63%) cases) the sensitivity ranged from 56.9% to 89.0% and specificity from 11.1% to 88.9%. The sensitivity and specificity of ultrasound of the lungs in suspected COVID-19 participants (1 study, 100 participants, 31 (31%) cases) were 96.8% and 62.3%, respectively. We could not perform a meta-analysis for chest X-ray or ultrasound due to the limited number of included studies. AUTHORS' CONCLUSIONS: Our findings indicate that chest CT is sensitive and moderately specific for the diagnosis of COVID-19 in suspected patients, meaning that CT may have limited capability in differentiating SARS-CoV-2 infection from other causes of respiratory illness. However, we are limited in our confidence in these results due to the poor study quality and the heterogeneity of included studies. 
Because of limited data, accuracy estimates of chest X-ray and ultrasound of the lungs for the diagnosis of suspected COVID-19 cases should be carefully interpreted. Future diagnostic accuracy studies should pre-define positive imaging findings, include direct comparisons of the various modalities of interest on the same participant population, and implement improved reporting practices. Planned updates of this review will aim to: increase precision around the accuracy estimates for chest CT (ideally with low risk of bias studies); obtain further data to inform accuracy of chest X-rays and ultrasound; and obtain data to further fulfil secondary objectives (e.g. 'threshold' effects, comparing accuracy estimates across different imaging modalities) to inform the utility of imaging along different diagnostic pathways.
Subject(s)
COVID-19/diagnostic imaging , Radiography, Thoracic , SARS-CoV-2 , Tomography, X-Ray Computed , Ultrasonography , Adult , Bias , Case-Control Studies , Child , Cross-Sectional Studies/statistics & numerical data , Diagnostic Errors/statistics & numerical data , Humans , Lung/diagnostic imaging , Radiography, Thoracic/statistics & numerical data , Reverse Transcriptase Polymerase Chain Reaction/statistics & numerical data , Sensitivity and Specificity , Tomography, X-Ray Computed/statistics & numerical data , Ultrasonography/statistics & numerical data
ABSTRACT
BACKGROUND: Specific diagnostic tests to detect severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and resulting COVID-19 disease are not always available and take time to obtain results. Routine laboratory markers such as white blood cell count, measures of anticoagulation, C-reactive protein (CRP) and procalcitonin, are used to assess the clinical status of a patient. These laboratory tests may be useful for the triage of people with potential COVID-19 to prioritize them for different levels of treatment, especially in situations where time and resources are limited. OBJECTIVES: To assess the diagnostic accuracy of routine laboratory testing as a triage test to determine if a person has COVID-19. SEARCH METHODS: On 4 May 2020 we undertook electronic searches in the Cochrane COVID-19 Study Register and the COVID-19 Living Evidence Database from the University of Bern, which is updated daily with published articles from PubMed and Embase and with preprints from medRxiv and bioRxiv. In addition, we checked repositories of COVID-19 publications. We did not apply any language restrictions. SELECTION CRITERIA: We included both case-control designs and consecutive series of patients that assessed the diagnostic accuracy of routine laboratory testing as a triage test to determine if a person has COVID-19. The reference standard could be reverse transcriptase polymerase chain reaction (RT-PCR) alone; RT-PCR plus clinical expertise and/or imaging; repeated RT-PCR several days apart or from different samples; WHO and other case definitions; and any other reference standard used by the study authors. DATA COLLECTION AND ANALYSIS: Two review authors independently extracted data from each included study. They also assessed the methodological quality of the studies, using QUADAS-2. We used the 'NLMIXED' procedure in SAS 9.4 for the hierarchical summary receiver operating characteristic (HSROC) meta-analyses of tests for which we included four or more studies. 
To facilitate interpretation of results, for each meta-analysis we estimated summary sensitivity at the points on the SROC curve that corresponded to the median and interquartile range boundaries of specificities in the included studies. MAIN RESULTS: We included 21 studies in this review, including 14,126 COVID-19 patients and 56,585 non-COVID-19 patients in total. Studies evaluated a total of 67 different laboratory tests. Although we were interested in the diagnostic accuracy of routine tests for COVID-19, the included studies used detection of SARS-CoV-2 infection through RT-PCR as reference standard. There was considerable heterogeneity between tests, threshold values and the settings in which they were applied. For some tests a positive result was defined as a decrease compared to normal values, for other tests a positive result was defined as an increase, and for some tests both increase and decrease may have indicated test positivity. None of the studies had either low risk of bias on all domains or low concerns for applicability for all domains. Only three of the tests evaluated had a summary sensitivity and specificity over 50%. These were: increase in interleukin-6, increase in C-reactive protein and lymphocyte count decrease. Blood count: Eleven studies evaluated a decrease in white blood cell count, with a median specificity of 93% and a summary sensitivity of 25% (95% CI 8.0% to 27%; very low-certainty evidence). The 15 studies that evaluated an increase in white blood cell count had a lower median specificity and a lower corresponding sensitivity. Four studies evaluated a decrease in neutrophil count. Their median specificity was 93%, corresponding to a summary sensitivity of 10% (95% CI 1.0% to 56%; low-certainty evidence). The 11 studies that evaluated an increase in neutrophil count had a lower median specificity and a lower corresponding sensitivity. 
The summary sensitivity of an increase in neutrophil percentage (4 studies) was 59% (95% CI 1.0% to 100%) at median specificity (38%; very low-certainty evidence). The summary sensitivity of an increase in monocyte count (4 studies) was 13% (95% CI 6.0% to 26%) at median specificity (73%; very low-certainty evidence). The summary sensitivity of a decrease in lymphocyte count (13 studies) was 64% (95% CI 28% to 89%) at median specificity (53%; low-certainty evidence). Four studies that evaluated a decrease in lymphocyte percentage showed a lower median specificity and lower corresponding sensitivity. The summary sensitivity of a decrease in platelets (4 studies) was 19% (95% CI 10% to 32%) at median specificity (88%; low-certainty evidence). Liver function tests: The summary sensitivity of an increase in alanine aminotransferase (9 studies) was 12% (95% CI 3% to 34%) at median specificity (92%; low-certainty evidence). The summary sensitivity of an increase in aspartate aminotransferase (7 studies) was 29% (95% CI 17% to 45%) at median specificity (81%; low-certainty evidence). The summary sensitivity of a decrease in albumin (4 studies) was 21% (95% CI 3% to 67%) at median specificity (66%; low-certainty evidence). The summary sensitivity of an increase in total bilirubin (4 studies) was 12% (95% CI 3.0% to 34%) at median specificity (92%; very low-certainty evidence). Markers of inflammation: The summary sensitivity of an increase in CRP (14 studies) was 66% (95% CI 55% to 75%) at median specificity (44%; very low-certainty evidence). The summary sensitivity of an increase in procalcitonin (6 studies) was 3% (95% CI 1% to 19%) at median specificity (86%; very low-certainty evidence). The summary sensitivity of an increase in IL-6 (4 studies) was 73% (95% CI 36% to 93%) at median specificity (58%; very low-certainty evidence). 
Other biomarkers: The summary sensitivity of an increase in creatine kinase (5 studies) was 11% (95% CI 6% to 19%) at median specificity (94%; low-certainty evidence). The summary sensitivity of an increase in serum creatinine (4 studies) was 7% (95% CI 1% to 37%) at median specificity (91%; low-certainty evidence). The summary sensitivity of an increase in lactate dehydrogenase (4 studies) was 25% (95% CI 15% to 38%) at median specificity (72%; very low-certainty evidence). AUTHORS' CONCLUSIONS: Although these tests give an indication about the general health status of patients and some tests may be specific indicators for inflammatory processes, none of the tests we investigated are useful for accurately ruling in or ruling out COVID-19 on their own. Studies were done in specific hospitalized populations, and future studies should consider non-hospital settings to evaluate how these tests would perform in people with milder symptoms.
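One way to see why these tests cannot rule COVID-19 in or out on their own is to convert the reported figures into likelihood ratios. The sketch below is illustrative only: it treats the summary sensitivity and the median specificity quoted above as if they were a single paired estimate, which is a simplifying assumption (the review derived sensitivity at a point on an SROC curve), and the function name `likelihood_ratios` is hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test, with inputs given as fractions.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    if not (0 < sensitivity <= 1 and 0 < specificity < 1):
        raise ValueError("sensitivity must be in (0, 1] and specificity in (0, 1)")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# (summary sensitivity, median specificity) as quoted in the abstract.
# Pairing them this way is an assumption made for illustration.
tests = {
    "CRP increase": (0.66, 0.44),
    "IL-6 increase": (0.73, 0.58),
    "Lymphocyte count decrease": (0.64, 0.53),
}

for name, (sens, spec) in tests.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

For all three tests both ratios stay fairly close to 1, so a positive or negative result shifts the probability of COVID-19 only modestly.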
Subject(s)
COVID-19 Testing/methods , COVID-19/diagnosis , Diagnostic Tests, Routine/methods , SARS-CoV-2/isolation & purification , Bias , Biomarkers/blood , C-Reactive Protein/analysis , COVID-19/blood , COVID-19/epidemiology , COVID-19 Testing/standards , Creatine Kinase/blood , Creatinine/blood , Diagnostic Tests, Routine/standards , Humans , Interleukin-6/blood , L-Lactate Dehydrogenase/blood , Leukocyte Count , Liver Function Tests , Lymphocyte Count , Pandemics , Platelet Count , ROC Curve , Reference Values , Reverse Transcriptase Polymerase Chain Reaction/standards , Sensitivity and Specificity , Triage
ABSTRACT
BACKGROUND: In children with urinary tract infection (UTI), only those with pyelonephritis (and not cystitis) are at risk for developing long-term renal sequelae. If non-invasive biomarkers could accurately differentiate children with cystitis from children with pyelonephritis, treatment and follow-up could potentially be individualized. This is an update of a review first published in 2015. OBJECTIVES: The objectives of this review were to 1) determine whether procalcitonin (PCT), C-reactive protein (CRP), or erythrocyte sedimentation rate (ESR) can replace the acute DMSA scan in the diagnostic evaluation of children with UTI; 2) assess the influence of patient and study characteristics on the diagnostic accuracy of these tests; and 3) compare the performance of the three tests to each other. SEARCH METHODS: We searched MEDLINE, EMBASE, DARE, Web of Science, and BIOSIS Previews up to 17 December 2019 for this review. The reference lists of all included articles and relevant systematic reviews were searched to identify additional studies not found through the electronic search. SELECTION CRITERIA: We only considered published studies that evaluated the results of an index test (PCT, CRP, ESR) against the results of an acute-phase 99Tc-dimercaptosuccinic acid (DMSA) scan (conducted within 30 days of the UTI) in children aged 0 to 18 years with a culture-confirmed episode of UTI. The following cut-off values were used for the primary analysis: 0.5 ng/mL for procalcitonin, 20 mg/L for CRP and 30 mm/hour for ESR. DATA COLLECTION AND ANALYSIS: Two authors independently applied the selection criteria to all citations and independently abstracted data. We used the bivariate model to calculate random-effects pooled sensitivity and specificity values. MAIN RESULTS: A total of 36 studies met our inclusion criteria. 
Twenty-five studies provided data for the primary analysis: 12 studies (1000 children) included data on PCT, 16 studies (1895 children) included data on CRP, and eight studies (1910 children) included data on ESR (some studies had data on more than one test). The summary sensitivity estimates (95% CI) for the PCT, CRP, and ESR tests at the aforementioned cut-offs were 0.81 (0.67 to 0.90), 0.93 (0.86 to 0.96), and 0.83 (0.71 to 0.91), respectively. The summary specificity values for the PCT, CRP, and ESR tests at these cut-offs were 0.76 (0.66 to 0.84), 0.37 (0.24 to 0.53), and 0.57 (0.41 to 0.72), respectively. AUTHORS' CONCLUSIONS: The ESR test does not appear to be sufficiently accurate to be helpful in differentiating children with cystitis from children with pyelonephritis. A low CRP value (< 20 mg/L) appears to be somewhat useful in ruling out pyelonephritis (decreasing the probability of pyelonephritis to < 20%), but unexplained heterogeneity in the data prevents us from making recommendations at this time. The procalcitonin test seems better suited for ruling in pyelonephritis, but the limited number of studies and the marked heterogeneity between studies prevent us from reaching definitive conclusions. Thus, at present, we do not find any compelling evidence to recommend the routine use of any of these tests in clinical practice.
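The ruling-out and ruling-in statements above follow from Bayes' theorem in odds form. The sketch below applies it to the summary sensitivity and specificity reported in the abstract. The pre-test probability of 50% is a hypothetical value chosen only for illustration (the abstract does not state the one the authors used), and the function name `post_test_probability` is likewise an assumption.

```python
def post_test_probability(pre_test_prob, sensitivity, specificity, test_positive):
    """Update a pre-test probability using the likelihood ratio of a test result.

    All probabilities are fractions. Uses post-test odds = pre-test odds x LR.
    """
    if not (0 < pre_test_prob < 1):
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    if not (0 < specificity < 1 and 0 < sensitivity <= 1):
        raise ValueError("sensitivity must be in (0, 1] and specificity in (0, 1)")
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    if test_positive:
        likelihood_ratio = sensitivity / (1 - specificity)
    else:
        likelihood_ratio = (1 - sensitivity) / specificity
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical pre-test probability of pyelonephritis (an assumption, not from the review).
PRE_TEST = 0.50

# CRP < 20 mg/L (negative test), summary sensitivity 0.93 and specificity 0.37.
crp_negative = post_test_probability(PRE_TEST, 0.93, 0.37, test_positive=False)

# PCT >= 0.5 ng/mL (positive test), summary sensitivity 0.81 and specificity 0.76.
pct_positive = post_test_probability(PRE_TEST, 0.81, 0.76, test_positive=True)

print(f"Probability after a negative CRP: {crp_negative:.1%}")
print(f"Probability after a positive PCT: {pct_positive:.1%}")
```

Under this assumed 50% starting point, a negative CRP lowers the probability to about 16% and a positive PCT raises it to about 77%. Other pre-test probabilities give different figures.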
Subject(s)
Blood Sedimentation , C-Reactive Protein/analysis , Calcitonin/blood , Cystitis/diagnosis , Calcitonin-Related Polypeptide Alpha/blood , Pyelonephritis/diagnosis , Acute Disease , Biomarkers/blood , Child , Cystitis/blood , Diagnosis, Differential , Humans , Pyelonephritis/blood , Pyelonephritis/complications , Randomized Controlled Trials as Topic , Sensitivity and Specificity , Urinary Tract Infections/blood
ABSTRACT
BACKGROUND: The diagnosis of infection by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) presents major challenges. Reverse transcriptase polymerase chain reaction (RT-PCR) testing is used to diagnose a current infection, but its utility as a reference standard is constrained by sampling errors, limited sensitivity (71% to 98%), and dependence on the timing of specimen collection. Chest imaging tests are being used in the diagnosis of COVID-19 disease, or when RT-PCR testing is unavailable. OBJECTIVES: To determine the diagnostic accuracy of chest imaging (computed tomography (CT), X-ray and ultrasound) in people with suspected or confirmed COVID-19. SEARCH METHODS: We searched the COVID-19 Living Evidence Database from the University of Bern, the Cochrane COVID-19 Study Register, and The Stephen B. Thacker CDC Library. In addition, we checked repositories of COVID-19 publications. We did not apply any language restrictions. We conducted searches for this review iteration up to 5 May 2020. SELECTION CRITERIA: We included studies of all designs that produce estimates of test accuracy or provide data from which estimates can be computed. We included two types of cross-sectional designs: a) where all patients suspected of the target condition enter the study through the same route and b) where it is not clear up front who has and who does not have the target condition, or where the patients with the target condition are recruited in a different way or from a different population from the patients without the target condition. When studies used a variety of reference standards, we included all of them. DATA COLLECTION AND ANALYSIS: We screened studies and extracted data independently, in duplicate. We also assessed the risk of bias and applicability concerns independently, in duplicate, using the QUADAS-2 checklist. We presented estimates of sensitivity and specificity using paired forest plots and summarised them in tables. 
We used a hierarchical meta-analysis model where appropriate. We presented uncertainty of the accuracy estimates using 95% confidence intervals (CIs). MAIN RESULTS: We included 84 studies, falling into two categories: studies with participants with confirmed diagnoses of COVID-19 at the time of recruitment (71 studies with 6331 participants) and studies with participants suspected of COVID-19 (13 studies with 1948 participants, including three case-control studies with 549 cases and controls). Chest CT was evaluated in 78 studies (8105 participants), chest X-ray in nine studies (682 COVID-19 cases), and chest ultrasound in two studies (32 COVID-19 cases). All evaluations of chest X-ray and ultrasound were conducted in studies with confirmed diagnoses only. Twenty-five per cent (21/84) of all studies were available only as preprints, 15/71 studies in the confirmed cases group and 6/13 of the studies in the suspected group. Among 71 studies that included confirmed cases, 41 studies had included symptomatic cases only, 25 studies had included cases regardless of their symptoms, five studies had included asymptomatic cases only, three of which included a combination of confirmed and suspected cases. Seventy studies were conducted in Asia, two in Europe, two in North America and one in South America. Fifty-one studies included inpatients while the remaining 24 studies were conducted in mixed or unclear settings. Risk of bias was high in most studies, mainly due to concerns about selection of participants and applicability. Among the 13 studies that included suspected cases, nine studies were conducted in Asia, and one in Europe. Seven studies included inpatients while the remaining three studies were conducted in mixed or unclear settings. In studies that included confirmed cases the pooled sensitivity of chest CT was 93.1% (95% CI 90.2 to 95.0; 65 studies, 5759 cases) and for X-ray 82.1% (95% CI 62.5 to 92.7; 9 studies, 682 cases). 
Heterogeneity judged by visual assessment of the ROC plots was considerable. Two studies evaluated the diagnostic accuracy of point-of-care ultrasound and both reported zero false negatives (with 10 and 22 participants having undergone ultrasound, respectively). These studies only reported true-positive and false-negative data; therefore, it was not possible to pool and derive estimates of specificity. In studies that included suspected cases, the pooled sensitivity of CT was 86.2% (95% CI 71.9 to 93.8; 13 studies, 2346 participants) and specificity was 18.1% (95% CI 3.71 to 55.8). Heterogeneity judged by visual assessment of the forest plots was high. Chest CT may give approximately the same proportion of positive results for patients with and without a SARS-CoV-2 infection: the chances of getting a positive CT result are 86% (95% CI 72 to 94) in patients with a SARS-CoV-2 infection and 82% (95% CI 44 to 96) in patients without. AUTHORS' CONCLUSIONS: The uncertainty resulting from the poor study quality and the heterogeneity of included studies limits our ability to confidently draw conclusions based on our results. Our findings indicate that chest CT is sensitive but not specific for the diagnosis of COVID-19 in suspected patients, meaning that CT may not be capable of differentiating SARS-CoV-2 infection from other causes of respiratory illness. This low specificity could also be the result of the poor sensitivity of the reference standard (RT-PCR), as CT could potentially be more sensitive than RT-PCR in some cases. Because of limited data, accuracy estimates of chest X-ray and ultrasound of the lungs for the diagnosis of COVID-19 should be carefully interpreted. Future diagnostic accuracy studies should avoid cases-only studies and pre-define positive imaging findings. 
Planned updates of this review will aim to: increase precision around the accuracy estimates for CT (ideally with low risk of bias studies); obtain further data to inform accuracy of chest X-rays and ultrasound; and continue to search for studies that fulfil secondary objectives to inform the utility of imaging along different diagnostic pathways.
Subject(s)
Betacoronavirus , Clinical Laboratory Techniques/methods , Coronavirus Infections/diagnostic imaging , Pneumonia, Viral/diagnostic imaging , Adult , COVID-19 , COVID-19 Testing , Child , Coronavirus Infections/diagnosis , Humans , Lung/diagnostic imaging , Pandemics , Radiography, Thoracic/statistics & numerical data , SARS-CoV-2 , Sensitivity and Specificity , Tomography, X-Ray Computed/statistics & numerical data , Ultrasonography/statistics & numerical data
ABSTRACT
BACKGROUND: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and the resulting COVID-19 pandemic present important diagnostic challenges. Several diagnostic strategies are available to identify or rule out current infection, identify people in need of care escalation, or to test for past infection and immune response. Point-of-care antigen and molecular tests to detect current SARS-CoV-2 infection have the potential to allow earlier detection and isolation of confirmed cases compared to laboratory-based diagnostic methods, with the aim of reducing household and community transmission. OBJECTIVES: To assess the diagnostic accuracy of point-of-care antigen and molecular-based tests to determine if a person presenting in the community or in primary or secondary care has current SARS-CoV-2 infection. SEARCH METHODS: On 25 May 2020 we undertook electronic searches in the Cochrane COVID-19 Study Register and the COVID-19 Living Evidence Database from the University of Bern, which is updated daily with published articles from PubMed and Embase and with preprints from medRxiv and bioRxiv. In addition, we checked repositories of COVID-19 publications. We did not apply any language restrictions. SELECTION CRITERIA: We included studies of people with suspected current SARS-CoV-2 infection, known to have, or not to have SARS-CoV-2 infection, or where tests were used to screen for infection. We included test accuracy studies of any design that evaluated antigen or molecular tests suitable for a point-of-care setting (minimal equipment, sample preparation, and biosafety requirements, with results available within two hours of sample collection). We included all reference standards to define the presence or absence of SARS-CoV-2 (including reverse transcription polymerase chain reaction (RT-PCR) tests and established clinical diagnostic criteria). 
DATA COLLECTION AND ANALYSIS: Two review authors independently screened studies and resolved any disagreements by discussion with a third review author. One review author independently extracted study characteristics, which were checked by a second review author. Two review authors independently extracted 2x2 contingency table data and assessed risk of bias and applicability of the studies using the QUADAS-2 tool. We present sensitivity and specificity, with 95% confidence intervals (CIs), for each test using paired forest plots. We pooled data using the bivariate hierarchical model separately for antigen and molecular-based tests, with simplifications when few studies were available. We tabulated available data by test manufacturer. MAIN RESULTS: We included 22 publications reporting on a total of 18 study cohorts with 3198 unique samples, of which 1775 had confirmed SARS-CoV-2 infection. Ten studies took place in North America, two in South America, four in Europe, one in China and one was conducted internationally. We identified data for eight commercial tests (four antigen and four molecular) and one in-house antigen test. Five of the studies included were only available as preprints. We did not find any studies at low risk of bias for all quality domains and had concerns about applicability of results across all studies. We judged patient selection to be at high risk of bias in 50% of the studies because of deliberate over-sampling of samples with confirmed COVID-19 infection and unclear in seven out of 18 studies because of poor reporting. Sixteen (89%) studies used only a single, negative RT-PCR to confirm the absence of COVID-19 infection, risking missing infection. There was a lack of information on blinding of index test (n = 11), and around participant exclusions from analyses (n = 10). We did not observe differences in methodological quality between antigen and molecular test evaluations. 
Antigen tests: Sensitivity varied considerably across studies (from 0% to 94%); the average sensitivity was 56.2% (95% CI 29.5% to 79.8%) and average specificity was 99.5% (95% CI 98.1% to 99.9%; based on 8 evaluations in 5 studies on 943 samples). Data for individual antigen tests were limited, with no more than two studies for any test. Rapid molecular assays: Sensitivity showed less variation compared to antigen tests (from 68% to 100%); average sensitivity was 95.2% (95% CI 86.7% to 98.3%) and specificity 98.9% (95% CI 97.3% to 99.5%), based on 13 evaluations in 11 studies on 2255 samples. Predicted values based on a hypothetical cohort of 1000 people with suspected COVID-19 infection (with a prevalence of 10%) result in 105 positive test results including 10 false positives (positive predictive value 90%), and 895 negative results including 5 false negatives (negative predictive value 99%). Individual tests: We calculated pooled results of individual tests for ID NOW (Abbott Laboratories) (5 evaluations) and Xpert Xpress (Cepheid Inc) (6 evaluations). Summary sensitivity for the Xpert Xpress assay (99.4%, 95% CI 98.0% to 99.8%) was 22.6 (95% CI 18.8 to 26.3) percentage points higher than that of ID NOW (76.8%, 95% CI 72.9% to 80.3%), whilst the specificity of Xpert Xpress (96.8%, 95% CI 90.6% to 99.0%) was marginally lower than that of ID NOW (99.6%, 95% CI 98.4% to 99.9%; a difference of -2.8 percentage points (95% CI -6.4 to 0.8)). AUTHORS' CONCLUSIONS: This review identifies early-stage evaluations of point-of-care tests for detecting SARS-CoV-2 infection, largely based on remnant laboratory samples. The findings currently have limited applicability, as we are uncertain whether tests will perform in the same way in clinical practice, and according to symptoms of COVID-19, duration of symptoms, or in asymptomatic people. 
Rapid tests have the potential to be used to inform triage of RT-PCR use, allowing earlier detection of those testing positive, but the evidence currently is not strong enough to determine how useful they are in clinical practice. Prospective and comparative evaluations of rapid tests for COVID-19 infection in clinically relevant settings are urgently needed. Studies should recruit consecutive series of eligible participants, including both those presenting for testing due to symptoms and asymptomatic people who may have come into contact with confirmed cases. Studies should clearly describe symptomatic status and document time from symptom onset or time since exposure. Point-of-care tests must be conducted on samples according to manufacturer instructions for use and be conducted at the point of care. Any future research study report should conform to the Standards for Reporting of Diagnostic Accuracy (STARD) guideline.
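The hypothetical-cohort figures quoted for rapid molecular assays (1000 people, 10% prevalence, sensitivity 95.2%, specificity 98.9%) can be reproduced with a few lines of arithmetic. The sketch below is one way to do so. Rounding each cell to whole people before computing predictive values is an assumption about how the figures were derived, and the function name `cohort_outcomes` is hypothetical.

```python
def cohort_outcomes(n, prevalence, sensitivity, specificity):
    """Expected 2x2 table and predictive values for a hypothetical cohort.

    Counts are rounded to whole people (an assumption made for illustration)
    before the predictive values are calculated.
    """
    infected = round(n * prevalence)
    uninfected = n - infected

    true_positives = round(sensitivity * infected)
    false_negatives = infected - true_positives
    true_negatives = round(specificity * uninfected)
    false_positives = uninfected - true_negatives

    positives = true_positives + false_positives
    negatives = true_negatives + false_negatives

    return {
        "true_positives": true_positives,
        "false_positives": false_positives,
        "true_negatives": true_negatives,
        "false_negatives": false_negatives,
        "positives": positives,
        "negatives": negatives,
        "ppv": true_positives / positives,
        "npv": true_negatives / negatives,
    }


# Inputs taken from the abstract: rapid molecular assays at 10% prevalence.
result = cohort_outcomes(n=1000, prevalence=0.10, sensitivity=0.952, specificity=0.989)

print(f"Positive results: {result['positives']} "
      f"(false positives: {result['false_positives']}, PPV {result['ppv']:.0%})")
print(f"Negative results: {result['negatives']} "
      f"(false negatives: {result['false_negatives']}, NPV {result['npv']:.0%})")
```

With these inputs the sketch gives 105 positive results (10 false positives, PPV about 90%) and 895 negative results (5 false negatives, NPV about 99%), matching the figures in the abstract. Changing `prevalence` shows how strongly the predictive values depend on the testing population.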