ABSTRACT
While HC2 and GP5+/6+ PCR-EIA were pivotal in test validation of new HPV assays, they represent the first generation of comparator tests based upon technologies that are not in widespread use anymore. In the current guideline, criteria for second-generation comparator tests are presented that include more detailed resolution of HPV genotypes. Second-generation comparator tests should preferentially target only the 12 genotypes classified as carcinogenic (IARC-group I), and show consistent non-inferior sensitivity for CIN2+ and CIN3+ and specificity for ≤CIN1 compared to one of the first-generation comparators, in at least three validation studies using benchmarks of 0.95 for relative sensitivity and 0.98 for relative specificity. Validation should take into account the storage media used and other sample handling procedures. Meta-analyses were conducted to identify the assays that fulfill these stringent criteria. Four tests fulfilled the new criteria: (1) RealTime High-Risk HPV Test (Abbott), (2) Cobas-4800 HPV test (Roche Molecular System), (3) Onclarity HPV Assay (BD Diagnostics), and (4) Anyplex II HPV HR Detection (Seegene), each evaluated in three to six studies. Whereas the four assays target 14 carcinogenic genotypes, the first two identify HPV16 and 18 separately, the third assay identifies five types separately and the fourth identifies all the types separately.
Subjects
Early Detection of Cancer , Papillomaviridae , Papillomavirus Infections , Sensitivity and Specificity , Uterine Cervical Neoplasms , Female , Humans , DNA, Viral/genetics , Early Detection of Cancer/methods , Genotype , Human Papillomavirus DNA Tests/methods , Human Papillomavirus DNA Tests/standards , Molecular Diagnostic Techniques/methods , Molecular Diagnostic Techniques/standards , Papillomaviridae/genetics , Papillomaviridae/classification , Papillomaviridae/isolation & purification , Papillomavirus Infections/diagnosis , Papillomavirus Infections/virology , Uterine Cervical Neoplasms/diagnosis , Uterine Cervical Neoplasms/virology
ABSTRACT
BACKGROUND: Previous studies indicated that pleural fluid complement C1q was helpful for diagnosing tuberculous pleural effusion (TPE), but the participants in these studies were young. The diagnostic accuracy of C1q for TPE in elderly patients remains unknown. This study aimed to investigate the diagnostic accuracy of C1q for TPE in elderly patients. METHODS: We prospectively recruited patients with undiagnosed pleural effusion who visited the Affiliated Hospital of Inner Mongolia Medical University between September 2018 and July 2021. C1q in their pleural fluid was measured, and the diagnostic accuracy of C1q was assessed by receiver operating characteristic (ROC) curve analysis. RESULTS: The median ages of patients with TPE and non-TPE were 75 and 71 years, respectively. TPE patients had significantly higher C1q than non-TPE patients. The area under the ROC curve (AUC) of C1q was 0.67 (95% CI: 0.51-0.82). At the threshold of 100 mg/L, C1q had a sensitivity of 0.44 (95% CI: 0.19-0.69) and specificity of 0.79 (95% CI: 0.71-0.86). CONCLUSION: C1q in pleural fluid has low diagnostic accuracy for TPE in elderly patients.
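To make the relationship between the AUC and the threshold-based sensitivity and specificity concrete, the following minimal sketch computes all three from made-up C1q values. The numbers are hypothetical and not taken from the study, the function names are introduced here for illustration, and treating a value at or above the threshold as a positive test is an assumption.

```python
# Hypothetical pleural-fluid C1q values in mg/L; NOT data from the study.
tpe_c1q = [120, 95, 80, 110]          # patients with TPE (cases)
non_tpe_c1q = [70, 90, 100, 60, 85]   # patients without TPE (controls)


def auc_mann_whitney(cases, controls):
    """AUC as the share of case/control pairs in which the case has the
    higher value; tied pairs count as half a win."""
    wins = 0.0
    for case_value in cases:
        for control_value in controls:
            if case_value > control_value:
                wins += 1.0
            elif case_value == control_value:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def sens_spec_at(threshold, cases, controls):
    """Sensitivity and specificity when 'value >= threshold' is a positive test
    (the direction of the inequality is an assumption)."""
    sensitivity = sum(v >= threshold for v in cases) / len(cases)
    specificity = sum(v < threshold for v in controls) / len(controls)
    return sensitivity, specificity


auc = auc_mann_whitney(tpe_c1q, non_tpe_c1q)
sens, spec = sens_spec_at(100, tpe_c1q, non_tpe_c1q)
print(f"AUC={auc:.2f}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

An AUC close to 0.5 means a randomly chosen TPE patient is barely more likely than not to have the higher C1q value, which is why an AUC of 0.67 is read as low accuracy.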
ABSTRACT
BACKGROUND: Pleural biomarkers represent potential diagnostic tools for tuberculous pleural effusion (TPE) due to their advantages of low cost, short turnaround time, and less invasiveness. This study evaluated the diagnostic accuracy of two CXCR3 ligands, C-X-C motif chemokine ligand 9 (CXCL9) and CXCL11, for TPE. In addition, we investigated the cellular origins and biological roles of CXCL9 and CXCL11 in the development of TPE. METHODS: This double-blind study prospectively enrolled patients with undiagnosed pleural effusion from two centers (Hohhot and Changshu) in China. Pleural fluid on admission was obtained and levels of CXCL9 and CXCL11 were measured by an enzyme-linked immunosorbent assay (ELISA). The receiver operating characteristic (ROC) curve and the decision curve analysis (DCA) were used to evaluate their diagnostic accuracy and net benefit, respectively. THP-1 cell-derived macrophages were treated with Bacillus Calmette-Guérin (BCG), and quantitative real-time PCR (qRT-PCR) and ELISA were used to determine the mRNA and protein levels of CXCL9 and CXCL11. The chemoattractant activities of CXCL9 and CXCL11 for T helper (Th) cells were analyzed by a transwell assay. RESULTS: One hundred and fifty-three (20 TPEs and 133 non-TPEs) patients were enrolled in the Hohhot Center, and 58 (13 TPEs and 45 non-TPEs) were enrolled in the Changshu Center. In both centers, we observed increased CXCL9 and CXCL11 in TPE patients. The areas under the ROC curves (AUCs) of pleural CXCL9 and CXCL11 in the Hohhot Center were 0.70 (95% CI: 0.55-0.85) and 0.68 (95% CI: 0.52-0.84), respectively. In the Changshu Center, the AUCs of CXCL9 and CXCL11 were 0.96 (95% CI: 0.92-1.00) and 0.97 (95% CI: 0.94-1.00), respectively. The AUCs of CXCL9 and CXCL11 decreased with the advancement of age. The decision curves of CXCL9 and CXCL11 showed net benefits in both centers. CXCL9 and CXCL11 were upregulated in BCG-treated macrophages. 
Pleural fluid from TPE and conditioned medium from BCG-treated macrophages were chemotactic for Th cells. Anti-CXCL9 or CXCL11 neutralizing antibodies could partly block the chemotactic activity. CONCLUSIONS: Pleural CXCL9 and CXCL11 are potential diagnostic markers for TPE, but their diagnostic accuracy is compromised in elderly patients. CXCL9 and CXCL11 can promote the migration of peripheral Th cells, thus representing a therapeutic target for the treatment of TPE.
Subjects
Chemokine CXCL11 , Chemokine CXCL9 , Pleural Effusion , Receptors, CXCR3 , Tuberculosis, Pleural , Humans , Chemokine CXCL9/metabolism , Chemokine CXCL11/metabolism , Male , Female , Middle Aged , Pleural Effusion/metabolism , Pleural Effusion/diagnosis , Receptors, CXCR3/metabolism , Tuberculosis, Pleural/diagnosis , Tuberculosis, Pleural/metabolism , Adult , Ligands , Double-Blind Method , THP-1 Cells , Biomarkers/metabolism , Macrophages/metabolism , Prospective Studies , Aged , ROC Curve
ABSTRACT
The summary receiver operating characteristic (SROC) curve has been recommended as one important meta-analytical summary to represent the accuracy of a diagnostic test in the presence of heterogeneous cutoff values. However, selective publication of diagnostic studies for meta-analysis can induce publication bias (PB) on the estimate of the SROC curve. Several sensitivity analysis methods have been developed to quantify PB on the SROC curve, and all these methods utilize parametric selection functions to model the selective publication mechanism. The main contribution of this article is to propose a new sensitivity analysis approach that derives the worst-case bounds for the SROC curve by adopting nonparametric selection functions under minimal assumptions. The estimation procedures of the worst-case bounds use the Monte Carlo method to approximate the bias on the SROC curves along with the corresponding area under the curves, and then the maximum and minimum values of PB under a range of marginal selection probabilities are optimized by nonlinear programming. We apply the proposed method to real-world meta-analyses to show that the worst-case bounds of the SROC curves can provide useful insights for discussing the robustness of meta-analytical findings on diagnostic test accuracy.
Subjects
Meta-Analysis as Topic , Publication Bias , ROC Curve , Humans , Computer Simulation , Data Interpretation, Statistical , Diagnostic Tests, Routine/statistics & numerical data , Models, Statistical , Monte Carlo Method , Publication Bias/statistics & numerical data , Statistics, Nonparametric
ABSTRACT
INTRODUCTION: Cognitive impairment is a critical concern in stroke care, and international guidelines recommend early cognitive screening. The aim of this study was to determine the prognostic accuracy of both the short and standard forms of the Montreal Cognitive Assessment (MoCA) in predicting long-term cognitive recovery following a stroke. METHODS: For this study, we used data from the Efficacy of Fluoxetine - a Randomized Controlled Trial in Stroke (EFFECTS) study, which encompassed stroke patients from 35 Swedish centers over the period from 2014 to 2019. Cognitive assessments were initially conducted at 2-15 days post-stroke, with follow-up data gathered at 6 months. We used the MoCA for objective cognitive evaluation. For assessing subjective cognitive impairment, we used the memory and thinking domain of the Stroke Impact Scale. For psychometric evaluation of the short Swedish version of MoCA (s-MoCA-SWE), we used cross tables and binary logistic regression. RESULTS: The study included 1,141 patients (62.2% men; median [interquartile range; IQR] age, 72.3 [13.2] years; median [IQR] stroke severity, 3.0 [3.0]). At baseline, the prevalence of cognitive impairment was 71.7% according to the s-MoCA-SWE (≤12) and 67.0% according to the MoCA (≤25). The s-MoCA-SWE demonstrated a sensitivity of 92.3% for correctly identifying patients with objective cognitive impairment and 81.5% for identifying those with subjective impairments at 6 months. Although the s-MoCA-SWE had higher sensitivity, the MoCA had a more balanced sensitivity and specificity in detecting both subjective and objective cognitive impairments. In both crude and multivariable models, the s-MoCA-SWE was more strongly associated than the MoCA with cognitive impairment at 6 months. CONCLUSIONS: Both the short and standard versions of the MoCA appear to be effective in identifying individuals likely to experience persistent cognitive issues following a stroke. 
Considering the limited time available in an acute stroke unit, the short-form version may be more practical. Nevertheless, further prospective studies are required to validate these findings.
ABSTRACT
BACKGROUND: A Generalized Linear Mixed Model (GLMM) is recommended to meta-analyze diagnostic test accuracy studies (DTAs) based on aggregate or individual participant data. Since a GLMM does not have a closed-form likelihood function or parameter solutions, computational methods are conventionally used to approximate the likelihoods and obtain parameter estimates. The most commonly used computational methods are the Iteratively Reweighted Least Squares (IRLS), the Laplace approximation (LA), and the Adaptive Gauss-Hermite quadrature (AGHQ). Despite being widely used, it has not been clear how these computational methods compare and perform in the context of an aggregate data meta-analysis (ADMA) of DTAs. METHODS: We compared and evaluated the performance of three commonly used computational methods for GLMM - the IRLS, the LA, and the AGHQ, via a comprehensive simulation study and real-life data examples, in the context of an ADMA of DTAs. By varying several parameters in our simulations, we assessed the performance of the three methods in terms of bias, root mean squared error, confidence interval (CI) width, coverage of the 95% CI, convergence rate, and computational speed. RESULTS: For most of the scenarios, especially when the meta-analytic data were not sparse (i.e., there were no or negligible studies with perfect diagnosis), the three computational methods were comparable for the estimation of sensitivity and specificity. However, the LA had the largest bias and root mean squared error for pooled sensitivity and specificity when the meta-analytic data were sparse. Moreover, the AGHQ took a longer computational time to converge relative to the other two methods, although it had the best convergence rate. CONCLUSIONS: We recommend practitioners and researchers carefully choose an appropriate computational algorithm when fitting a GLMM to an ADMA of DTAs. We do not recommend the LA for sparse meta-analytic data sets. 
However, either the AGHQ or the IRLS can be used regardless of the characteristics of the meta-analytic data.
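For readers unfamiliar with why a GLMM for DTA meta-analysis needs numerical approximation, the following is a sketch of the bivariate binomial-normal model that is commonly used for this purpose. The abstract does not spell out its model, so the exact form and the notation (y_{1i}, n_{1i}, \mu_1, \sigma_1, \rho and so on) are assumptions introduced here. For study i, y_{1i} is the number of true positives among n_{1i} diseased participants and y_{0i} the number of true negatives among n_{0i} non-diseased participants.

```latex
\begin{aligned}
y_{1i} \mid \mathrm{Se}_i &\sim \mathrm{Binomial}(n_{1i}, \mathrm{Se}_i), \qquad
y_{0i} \mid \mathrm{Sp}_i \sim \mathrm{Binomial}(n_{0i}, \mathrm{Sp}_i), \\
\begin{pmatrix} \operatorname{logit}\mathrm{Se}_i \\ \operatorname{logit}\mathrm{Sp}_i \end{pmatrix}
&\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_{1} \\ \mu_{0} \end{pmatrix},
\begin{pmatrix} \sigma_{1}^{2} & \rho\,\sigma_{1}\sigma_{0} \\
\rho\,\sigma_{1}\sigma_{0} & \sigma_{0}^{2} \end{pmatrix}
\right), \\
L(\mu, \Sigma) &= \prod_{i=1}^{k} \int_{\mathbb{R}^{2}}
\Pr(y_{1i} \mid u_{1})\,\Pr(y_{0i} \mid u_{0})\,
\phi(u_{1}, u_{0};\, \mu, \Sigma)\; du_{1}\, du_{0}.
\end{aligned}
```

The integral over the study-specific logits (u_1, u_0) has no closed form. The Laplace approximation replaces each integrand by a Gaussian centred at its mode, which is equivalent to adaptive Gauss-Hermite quadrature with a single node; using more nodes improves the approximation at a higher computational cost, in line with the trade-off the abstract reports.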
Subjects
Computer Simulation , Diagnostic Tests, Routine , Meta-Analysis as Topic , Humans , Diagnostic Tests, Routine/methods , Diagnostic Tests, Routine/standards , Diagnostic Tests, Routine/statistics & numerical data , Linear Models , Algorithms , Likelihood Functions , Sensitivity and Specificity
ABSTRACT
OBJECTIVE: To evaluate the sensitivity of human papillomavirus (HPV) tested urine to detect high-grade cervical precancer (cervical intraepithelial neoplasia grade 2+ [CIN2+]) using two urine collection devices. DESIGN: Randomised controlled trial. SETTING: St Mary's Hospital, Manchester, UK. POPULATION: Colposcopy attendees with abnormal cervical screening; a total of 480 participants were randomised. Matched urine and cervical samples were available for 235 and 230 participants using a first-void urine (FVU)-collection device and standard pot, respectively. METHODS: Urine was self-collected and mixed with preservative - randomised 1:1 to FVU-collection device (Novosanis Colli-pee® 10 mL with urine conservation medium [UCM]) or standard pot. Matched clinician-collected cervical samples were taken before colposcopy. HPV testing used Roche cobas® 8800. A questionnaire evaluated urine self-sampling acceptability. MAIN OUTCOME MEASURES: The primary outcome measured sensitivity of HPV-tested urine (FVU-collection device and standard pot) for CIN2+ detection. Secondary outcomes compared HPV-tested cervical and urine samples for CIN2+ and evaluated the acceptability of urine self-sampling. RESULTS: Urine HPV test sensitivity for CIN2+ was higher with the FVU-collection device (90.3%, 95% CI 83.7%-94.9%, 112/124) than the standard pot (73.4%, 95% CI 64.7%-80.9%, 91/124, p = 0.0005). The relative sensitivity of FVU-device-collected urine was 0.92 (95% CI 0.87-0.97, pMcN = 0.004) compared with cervical, considering that all women were referred after a positive cervical HPV test. Urine-based sampling was acceptable to colposcopy attendees. CONCLUSIONS: Testing of FVU-device-collected urine for HPV was superior to standard-pot-collected urine in colposcopy attendees and has promising sensitivity for CIN2+ detection. 
General population HPV testing of FVU-device-collected urine will establish its clinical performance and acceptability as an alternative to routine cervical screening.
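The reported sensitivities can be reproduced directly from the counts given in the results (112/124 and 91/124). The sketch below does so and adds a Wilson score interval; the abstract does not say which interval method was used, so the Wilson choice is an assumption and its bounds may differ slightly from the published ones. The function names are introduced here for illustration.

```python
from math import sqrt


def sensitivity(detected, diseased):
    """Proportion of CIN2+ cases that tested HPV-positive."""
    return detected / diseased


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives ~95% coverage)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Counts as stated in the abstract: HPV-positive urine samples among CIN2+ cases.
arms = {"FVU-collection device": (112, 124), "standard pot": (91, 124)}

for label, (detected, cin2_plus) in arms.items():
    low, high = wilson_interval(detected, cin2_plus)
    print(f"{label}: {100 * sensitivity(detected, cin2_plus):.1f}% "
          f"(Wilson 95% CI {100 * low:.1f}%-{100 * high:.1f}%)")
```

The point estimates match the reported 90.3% and 73.4%; the interval bounds are illustrative only.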
Subjects
Colposcopy , Early Detection of Cancer , Papillomavirus Infections , Sensitivity and Specificity , Urine Specimen Collection , Uterine Cervical Dysplasia , Uterine Cervical Neoplasms , Humans , Female , Adult , Papillomavirus Infections/diagnosis , Papillomavirus Infections/urine , Uterine Cervical Neoplasms/diagnosis , Uterine Cervical Neoplasms/virology , Uterine Cervical Dysplasia/diagnosis , Uterine Cervical Dysplasia/virology , Early Detection of Cancer/methods , Middle Aged , Urine Specimen Collection/methods , Urine Specimen Collection/instrumentation , Papillomaviridae/isolation & purification , Specimen Handling/methods , Specimen Handling/instrumentation , Vaginal Smears/instrumentation , Human Papillomavirus Viruses
ABSTRACT
BACKGROUND: Acute respiratory distress syndrome (ARDS) is a life-threatening respiratory condition with high mortality rates, accounting for 10% of all intensive care unit admissions. Lung ultrasound (LUS) as a diagnostic tool for acute respiratory failure has garnered widespread recognition and was recently incorporated into the updated definitions of ARDS. This raised the hypothesis that LUS is a reliable method for diagnosing ARDS. OBJECTIVES: We aimed to establish the accuracy of LUS for ARDS diagnosis and classification of focal versus non-focal ARDS subphenotypes. METHODS: This systematic review and meta-analysis used a systematic search strategy, which was applied to the PubMed, EMBASE and Cochrane databases. Studies investigating the diagnostic accuracy of LUS compared to thoracic CT or chest radiography (CXR) in ARDS diagnosis or focal versus non-focal subphenotypes in adult patients were included. Quality of studies was evaluated using the QUADAS-2 tool. Statistical analyses were performed using "Mada" in RStudio, version 4.0.3. Sensitivity and specificity with 95% confidence interval of each separate study were summarized in a forest plot. RESULTS: The search resulted in 2648 unique records. After selection, 11 reports were included, involving 2075 patients and 598 ARDS cases (29%). Nine studies reported on ARDS diagnosis and two reported on focal versus non-focal ARDS subphenotypes classification. Meta-analysis showed a pooled sensitivity of 0.631 (95% CI 0.450-0.782) and pooled specificity of 0.942 (95% CI 0.856-0.978) of LUS for ARDS diagnosis. In two studies, LUS could accurately differentiate between focal versus non-focal ARDS subphenotypes. Insufficient data were available to perform a meta-analysis. CONCLUSION: This review confirms the hypothesis that LUS is a reliable method for diagnosing ARDS in adult patients. For the classification of focal or non-focal subphenotypes, LUS showed promising results, but more research is needed.
Subjects
Lung , Respiratory Distress Syndrome , Ultrasonography , Humans , Respiratory Distress Syndrome/diagnostic imaging , Respiratory Distress Syndrome/classification , Ultrasonography/methods , Ultrasonography/standards , Lung/diagnostic imaging , Phenotype
ABSTRACT
OBJECTIVE: To determine the diagnostic test accuracy of transvaginal ultrasound (TVS) using a standardized technique for the diagnosis of deep endometriosis (DE) of the uterosacral ligaments (USLs) and adjacent torus uterinus (TU). METHODS: This was a prospective diagnostic test accuracy study conducted at the McMaster University Medical Center Tertiary Endometriosis Clinic, Hamilton, ON, Canada. Consecutive participants were enrolled if they successfully underwent TVS and surgery by our team from 10 August 2020 to 31 October 2021. The index test was TVS using a standardized posterior approach performed and interpreted by an expert sonologist. The reference standard included direct surgical visualization on laparoscopy by the same person who performed and interpreted the ultrasound scans. Accuracy, sensitivity, specificity, positive and negative predictive values (PPV and NPV) and positive and negative likelihood ratios were calculated for the TVS posterior approach for each location using the reference standard. RESULTS: There were 54 consecutive participants included upon completion of laparoscopy and histological assessment. The prevalence of DE for the left USL, right USL and TU was 42.6%, 22.2% and 14.8%, respectively. Based on surgical visualization as the reference standard, TVS demonstrated an accuracy of 92.6% (95% CI, 82.1-97.9%), sensitivity of 82.6% (95% CI, 61.2-95.1%), specificity of 100% (95% CI, 88.8-100%), PPV of 100% and NPV of 88.6% (95% CI, 76.1-95.0%) for diagnosing DE in the left USL. For DE of the right USL, TVS demonstrated an accuracy of 94.4% (95% CI, 84.6-98.8%), sensitivity of 75.0% (95% CI, 42.8-94.5%), specificity of 100% (95% CI, 91.6-100%), PPV of 100% and NPV of 93.3% (95% CI, 84.0-97.4%). For DE of the TU, TVS demonstrated an accuracy of 100% (95% CI, 93.4-100%), sensitivity of 100% (95% CI, 63.1-100%), specificity of 100% (95% CI, 92.3-100%), PPV of 100% and NPV of 100%. 
CONCLUSIONS: We observed high diagnostic test accuracy of the evaluated standardized TVS technique for assessing DE of the USLs and TU. Further studies evaluating this technique should be performed, particularly with less experienced observers, before considering this technique as the standard approach. © 2023 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.
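The accuracy measures reported for the left USL can be tied together with a 2x2 table. The abstract does not list raw counts, so the counts below (19 true positives, 4 false negatives, 0 false positives, 31 true negatives) are back-calculated here from the reported 54 participants, 42.6% prevalence, 82.6% sensitivity and 100% specificity; they are an assumption, as is the function name.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from a 2x2 table of index test vs reference standard."""
    total = tp + fn + fp + tn
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "prevalence": (tp + fn) / total,
        "accuracy": (tp + tn) / total,
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # LR+ is unbounded when there are no false positives.
        "lr_positive": sens / (1 - spec) if spec < 1 else float("inf"),
        "lr_negative": (1 - sens) / spec,
    }


# Back-calculated counts for DE of the left USL (assumption, see lead-in).
left_usl = diagnostic_metrics(tp=19, fn=4, fp=0, tn=31)

for name, value in left_usl.items():
    print(f"{name}: {value:.3f}")
```

With these counts the sketch reproduces the reported prevalence (42.6%), accuracy (92.6%), sensitivity (82.6%), specificity (100%), PPV (100%) and NPV (88.6%). It also shows why a specificity of 100% leaves the positive likelihood ratio without a finite value.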
Subjects
Endometriosis , Vagina , Female , Pregnancy , Humans , Vagina/diagnostic imaging , Vagina/pathology , Endometriosis/diagnostic imaging , Endometriosis/surgery , Sensitivity and Specificity , Prospective Studies , Ultrasonography/methods , Ligaments/diagnostic imaging , Ligaments/pathology , Diagnostic Tests, Routine
ABSTRACT
BACKGROUND: In two-step population screening for colorectal cancer (CRC), a simple non-invasive test, commonly a fecal immunochemical test for hemoglobin (FIT), is first undertaken to predict, based on the fecal hemoglobin concentration (f-Hb), who is more likely to have colorectal neoplasia and needs colonoscopy. AIM: To evaluate the importance of being able to adjust the f-Hb threshold that triggers follow-up colonoscopy (the "positivity threshold"), we evaluated the predictive value of f-Hb for colorectal neoplasia and its implications for the configuration of new non-invasive tests. METHODS: A literature review was conducted on the use of quantitative FIT to select the positivity threshold, followed by using f-Hb from a large population to model how adjusting the positivity threshold enabled achievement of the desired program outcomes in a feasible manner. RESULTS: The literature review and the modeling found that while the f-Hb positivity threshold is predictive for colorectal neoplasia across a wide range of f-Hb, there is a complex relationship between program outcomes and f-Hb. The threshold determines not just clinical accuracy (including true- and false-positive results for CRC and/or advanced precursor lesions), but also the colonoscopy workload. A lower f-Hb threshold is associated with a higher sensitivity for neoplasia but a lower specificity and a heavier load of follow-up colonoscopies. Consequently, the threshold determines a program's impact on population CRC mortality and incidence, but also its feasibility and cost-effectiveness within a health-care system. DISCUSSION: We are entering a new era of non-invasive screening tests, where multiple biomarkers found in biological samples such as blood as well as feces, are being developed and evaluated. 
These typically specify a non-transparent algorithm, developed with machine learning, to provide a predictive dichotomous positive/negative result with a fixed associated clinical accuracy and colonoscopy workload. This will restrict use of new tests in jurisdictions where the accuracy and workload implications do not match the desired screening program outcomes. CONCLUSION: However, similar to flexible FIT positivity thresholds, it would be ideal if new tests also provide capacity for screening program providers to select the positivity threshold that delivers their desired screening outcomes in a feasible manner. How marketing, distribution and reimbursement of non-invasive tests are approved, funded and implemented varies widely across jurisdictions and must be taken into account.
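The trade-off described above, where a lower f-Hb threshold raises sensitivity but lowers specificity and increases the colonoscopy workload, can be illustrated with a toy cohort. Everything in the sketch is hypothetical: the f-Hb values, the disease labels, the candidate thresholds and the function name are made up for illustration and do not come from the study's population data.

```python
# Hypothetical screening cohort: (f-Hb in µg Hb/g feces, has colorectal neoplasia).
cohort = [
    (2, False), (5, False), (8, False), (12, False), (15, True), (18, False),
    (25, False), (30, True), (45, False), (60, True), (90, True), (150, True),
]


def evaluate_threshold(cohort, threshold):
    """Outcomes when everyone with f-Hb at or above the threshold is referred."""
    referred = [(hb, diseased) for hb, diseased in cohort if hb >= threshold]
    true_pos = sum(diseased for _, diseased in referred)
    false_pos = len(referred) - true_pos
    n_diseased = sum(diseased for _, diseased in cohort)
    n_healthy = len(cohort) - n_diseased
    return {
        "sensitivity": true_pos / n_diseased,
        "specificity": (n_healthy - false_pos) / n_healthy,
        "colonoscopies": len(referred),  # follow-up workload
    }


for threshold in (10, 20, 40, 80):
    result = evaluate_threshold(cohort, threshold)
    print(f"threshold {threshold:>3}: sensitivity {result['sensitivity']:.2f}, "
          f"specificity {result['specificity']:.2f}, "
          f"colonoscopies {result['colonoscopies']}")
```

A test that reports only a fixed positive/negative result corresponds to a single row of this output, which is the restriction the discussion points to: a program cannot move along the curve to match its own colonoscopy capacity.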
ABSTRACT
BACKGROUND: Lipid accumulation product (LAP) is a novel predictor index of central lipid accumulation associated with metabolic and cardiovascular diseases. This study aims to investigate the accuracy of LAP for the screening of metabolic syndrome (MetS) in general adult males and females and its comparison with other lipid-related indicators. METHODS: A systematic literature search was conducted in PubMed, Scopus, Web of Science, Cumulative Index to Nursing and Allied Health Literature (CINAHL), and ProQuest for eligible studies up to May 8, 2024. Outcomes were pooled mean difference (MD), odds ratio (OR), and diagnostic accuracy parameters (sensitivity, specificity, and area under the summary receiver operating characteristic [AUSROC] curve). Comparative analysis was conducted using Z-test. RESULTS: Forty-three studies involving 202,313 participants (98,164 males and 104,149 females) were included. Pooled MD analysis showed that LAP was 45.92 (P < 0.001) and 41.70 units (P < 0.001) higher in men and women with MetS, respectively. LAP was also significantly associated with MetS, with pooled ORs of 1.07 (P < 0.001) in men and 1.08 (P < 0.001) in women. In men, LAP could detect MetS with a pooled sensitivity of 85% (95% CI: 82%-87%), specificity of 81% (95% CI: 80%-83%), and AUSROC curve of 0.88 (95% CI: 0.85-0.90), while in women, LAP had a sensitivity of 83% (95% CI: 80%-86%), specificity of 80% (95% CI: 78%-82%), and AUSROC curve of 0.88 (95% CI: 0.85-0.91). LAP had a significantly higher AUSROC curve (P < 0.05) for detecting MetS compared to body mass index (BMI), waist-to-height ratio (WHtR), waist-to-hip ratio (WHR), body roundness index (BRI), a body shape index (ABSI), body adiposity index (BAI), conicity index (CI) in both genders, and waist circumference (WC) and abdominal volume index (AVI) in females. CONCLUSION: LAP may serve as a simple, cost-effective, and more accurate screening tool for MetS in general adult male and female populations.
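The abstract does not state how LAP is calculated. The sketch below uses the commonly cited definition, (waist circumference in cm minus 65) times fasting triglycerides in mmol/L for men, with 58 in place of 65 for women; whether every included study used exactly this form is an assumption, and the function name and example values are hypothetical.

```python
def lipid_accumulation_product(waist_cm, triglycerides_mmol_l, sex):
    """LAP under the commonly cited definition (assumed here, not given in the abstract):
    men:   (waist circumference [cm] - 65) * triglycerides [mmol/L]
    women: (waist circumference [cm] - 58) * triglycerides [mmol/L]
    """
    offsets = {"male": 65.0, "female": 58.0}
    if sex not in offsets:
        raise ValueError("sex must be 'male' or 'female'")
    return (waist_cm - offsets[sex]) * triglycerides_mmol_l


# Hypothetical individuals, for illustration only.
print(lipid_accumulation_product(100, 1.5, "male"))    # (100 - 65) * 1.5
print(lipid_accumulation_product(90, 1.5, "female"))   # (90 - 58) * 1.5
```

Because the index needs only a tape measure and a routine triglyceride result, it fits the abstract's description of a simple and cost-effective screening tool.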
Subjects
Adiposity , Lipid Accumulation Product , Metabolic Syndrome , Humans , Metabolic Syndrome/diagnosis , Female , Male , Adult , ROC Curve , Mass Screening/methods , Sex Factors , Waist Circumference
ABSTRACT
The development of methods for the meta-analysis of diagnostic test accuracy (DTA) studies is still an active area of research. While methods for the standard case where each study reports a single pair of sensitivity and specificity are nearly routinely applied nowadays, methods to meta-analyze receiver operating characteristic (ROC) curves are not widely used. This situation is more complex, as each primary DTA study may report on several pairs of sensitivity and specificity, each corresponding to a different threshold. In a case study published earlier, we applied a number of methods for meta-analyzing DTA studies with multiple thresholds to a real-world data example (Zapf et al., Biometrical Journal. 2021; 63(4): 699-711). To date, no simulation study exists that systematically compares different approaches with respect to their performance in various scenarios when the truth is known. In this article, we aim to fill this gap and present the results of a simulation study that compares three frequentist approaches for the meta-analysis of ROC curves. We performed a systematic simulation study, motivated by an example from medical research. In the simulations, all three approaches worked partially well. The approach by Hoyer and colleagues was slightly superior in most scenarios and is recommended in practice.
Subjects
Biometry , Meta-Analysis as Topic , ROC Curve , Biometry/methods , Diagnostic Tests, Routine/methods , Humans , Computer Simulation
ABSTRACT
INTRODUCTION: The diagnosis of endometriomas in patients with endometriosis is of primary importance because it influences the management and prognosis of infertility and pain. Imaging techniques are evolving constantly. This study aimed to systematically assess the diagnostic accuracy of transvaginal ultrasound (TVUS) and magnetic resonance imaging (MRI) in detecting endometrioma using the surgical visualisation of lesions with or without histopathological confirmation as reference standards in patients of reproductive age with suspected endometriosis. METHODS: PubMed, Embase, Web of Science, Cumulative Index to Nursing and Allied Health Literature and ClinicalTrials.gov databases were searched from their inception to 12 October 2022, using a manual search for additional articles. Two authors independently performed title, abstract and full-text screening of the identified records, extracted study details and quantitative data and assessed the quality of the studies using the 'Quality Assessment of Diagnostic Accuracy Study 2' tool. Bivariate random-effects models were used to determine the pooled sensitivity and specificity, compare the two imaging modalities and evaluate the sources of heterogeneity. RESULTS: Sixteen prospective studies (10 assessing TVUS, 4 assessing MRI and 2 assessing both TVUS and MRI) were included, representing 1976 participants. Pooled TVUS and MRI sensitivities for endometrioma were 0.89 (95% confidence interval 'CI', 0.86-0.92) and 0.94 (95% CI, 0.74-0.99), respectively (indirect comparison p-value of 0.47). Pooled TVUS and MRI specificities for endometrioma were 0.95 (95% CI, 0.92-0.97) and 0.94 (95% CI, 0.89-0.97), respectively (indirect comparison p-value of 0.51). These studies had a high or unclear risk of bias. A direct comparison (all participants undergoing TVUS and MRI) of the modalities was available in only two studies. 
CONCLUSION: TVUS and MRI have high accuracy for diagnosing endometriomas; however, high-quality studies comparing the two modalities are lacking.
The diagnosis of endometriomas in patients with endometriosis impacts infertility and pain management. We performed a systematic review and meta-analysis to assess the accuracy of transvaginal ultrasound and magnetic resonance imaging for the diagnosis of endometrioma in patients of reproductive age with suspected endometriosis, and to compare the accuracy of the two imaging modalities. Five databases (PubMed, Embase, Web of Science, Cumulative Index to Nursing and Allied Health Literature and ClinicalTrials.gov databases) were searched. Sixteen prospective studies were included, representing 1976 participants. We found high accuracy of transvaginal ultrasound and magnetic resonance imaging for diagnosing endometriomas. There was no statistically significant difference in diagnostic accuracy between the two modalities. However, high-quality studies comparing the two modalities in the same population are lacking.
Subjects
Endometriosis , Magnetic Resonance Imaging , Ultrasonography , Adult , Female , Humans , Endometriosis/diagnostic imaging , Magnetic Resonance Imaging/methods , Sensitivity and Specificity , Ultrasonography/methods , Vagina/diagnostic imaging
ABSTRACT
BACKGROUND & AIMS: We performed a clinical trial that aimed to inform the clinical utility of anorectal manometry (ARM) and balloon expulsion time (BET) as up-front tests to predict outcomes with community-based pelvic floor physical therapy as the next best step to address chronic constipation after failing an empiric trial of soluble fiber supplementation or osmotic laxatives. METHODS: We enrolled 60 treatment-naïve patients with Rome IV functional constipation failing 2 weeks of soluble fiber supplementation or osmotic laxatives. All patients underwent ARM/BET (London protocol) followed by community-based pelvic floor physical therapy. Outcomes were assessed at baseline and 12 weeks. The primary end point was clinical response (Patient Assessment of Constipation-Symptoms instrument). RESULTS: Fifty-three patients completed pelvic rehabilitation and the post-treatment questionnaire. Contemporary frameworks define dyssynergia on balloon expulsion time and dyssynergic patterns (ARM), but these parameters did not inform clinical outcomes (area under the curve [AUC], <0.6). Squeeze pressure (>192.5 mm Hg on at least 1 of 3 attempts; sensitivity, 47.6%; specificity, 83.9%) and limited squeeze duration (inability to sustain 50% of squeeze pressure for >20 seconds; sensitivity, 71.4%; specificity, 58.1%) were the strongest predictors of clinical outcomes. Combining BET with squeeze duration (BET greater than 6.5 seconds and limited squeeze duration) improved predictive accuracy (AUC, 0.75; 95% CI, 0.59-0.90). BET poorly predicted outcomes as a single test (AUC, 0.54; 95% CI, 0.38-0.69). CONCLUSIONS: Using ARM to evaluate squeeze profiles, rather than dyssynergia, appears useful to screen patients with chronic constipation for up-front pelvic floor physical therapy based on likelihood of response. BET appears noninformative as a single screening test (ClinicalTrials.gov: NCT04159350).
Subjects
Laxatives , Pelvic Floor , Humans , Anal Canal , Ataxia/therapy , Constipation/diagnosis , Constipation/therapy , Defecation/physiology , Manometry/methods , Pelvic Floor/physiology , Physical Therapy Modalities , Rectum
ABSTRACT
BACKGROUND & AIMS: Rectal evacuation disorders are common among constipated patients. We aimed to evaluate the accuracy of an investigational point-of-care test (rectal expulsion device [RED]) to predict outcomes with community-based pelvic floor physical therapy. METHODS: We enrolled patients meeting Rome IV criteria for functional constipation failing fiber/laxatives for more than 2 weeks. RED was inserted and self-inflated, and then time-to-expel was measured in a left lateral position. All patients underwent empiric community-based pelvic floor physical therapy in routine care with outcomes measured at 12 weeks. The primary end point was global clinical response (Patient Assessment of Constipation Symptoms score reduction, >0.75 vs baseline). Secondary end points included improvement in health-related quality-of-life (Patient Assessment of Constipation Quality of Life score reduction, >1.0) and complete spontaneous bowel movement frequency (Food and Drug Administration complete spontaneous bowel movement responder definition). RESULTS: Thirty-nine patients enrolled in a feasibility phase to develop the use-case protocol. Sixty patients enrolled in a blinded validation phase; 52 patients (mean age, 46.9 years; 94.2% women) were included in the intention-to-treat analysis. In the left lateral position, RED predicted global clinical response (generalized area under the curve [gAUC], 0.67; 95% CI, 0.58-0.76), health-related quality-of-life response (gAUC, 0.67; 95% CI, 0.58-0.77; P < .001), and complete spontaneous bowel movement response (gAUC, 0.63; 95% CI, 0.57-0.71; P < .001). As a screening test, a normal RED effectively rules out evacuation disorders (expected clinical response, 8.9%; P = .042). Abnormal RED in the left lateral position (defined as expulsion within 5 seconds or >120 seconds) predicted 48.9% clinical response to physical therapy.
A seated maneuver enhanced the likelihood of clinical response (71.1% response with seated RED retained >13 seconds) but likely is unnecessary in most settings. CONCLUSIONS: RED offers an opportunity to disrupt the paradigm by offering a personalized approach to managing chronic constipation in the community (ClinicalTrials.gov: NCT04159350).
Subjects
Pelvic Floor , Rectal Diseases , Humans , Female , Male , Quality of Life , Constipation/diagnosis , Constipation/therapy , Defecation/physiology , Treatment Outcome , Physical Therapy Modalities
ABSTRACT
BACKGROUND: The global spread of COVID-19 created an explosion in rapid tests with results in < 1 hour, but their relative performance characteristics are not yet fully understood. Our aim was to determine the most sensitive and specific rapid test for the diagnosis of SARS-CoV-2. METHODS: DESIGN: Rapid review and diagnostic test accuracy network meta-analysis (DTA-NMA). ELIGIBILITY CRITERIA: Randomized controlled trials (RCTs) and observational studies assessing rapid antigen and/or rapid molecular test(s) to detect SARS-CoV-2 in participants of any age, with or without suspected SARS-CoV-2 infection. INFORMATION SOURCES: Embase, MEDLINE, and Cochrane Central Register of Controlled Trials, up to September 12, 2021. OUTCOME MEASURES: Sensitivity and specificity of rapid antigen and molecular tests suitable for detecting SARS-CoV-2. DATA EXTRACTION AND RISK OF BIAS ASSESSMENT: Screening of literature search results was conducted by one reviewer; data abstraction was completed by one reviewer and independently verified by a second reviewer. Risk of bias was not assessed in the included studies. DATA SYNTHESIS: Random-effects meta-analysis and DTA-NMA. RESULTS: We included 93 studies (reported in 88 articles) relating to 36 rapid antigen tests in 104,961 participants and 23 rapid molecular tests in 10,449 participants. Overall, rapid antigen tests had a sensitivity of 0.75 (95% confidence interval 0.70-0.79) and specificity of 0.99 (0.98-0.99). Rapid antigen test sensitivity was higher when nasal or combined samples (e.g., combinations of nose, throat, mouth, or saliva samples) were used, but lower when nasopharyngeal samples were used, and in those classified as asymptomatic at the time of testing. Rapid molecular tests may result in fewer false negatives than rapid antigen tests (sensitivity: 0.93, 0.88-0.96; specificity: 0.98, 0.97-0.99).
The tests with the highest sensitivity and specificity estimates were the Xpert Xpress rapid molecular test by Cepheid (sensitivity: 0.99, 0.83-1.00; specificity: 0.97, 0.69-1.00) among the 23 commercial rapid molecular tests and the COVID-VIRO test by AAZ-LMB (sensitivity: 0.93, 0.48-0.99; specificity: 0.98, 0.44-1.00) among the 36 rapid antigen tests we examined. CONCLUSIONS: Rapid molecular tests were associated with both high sensitivity and specificity, while rapid antigen tests were mainly associated with high specificity, according to the minimum performance requirements by WHO and Health Canada. Our rapid review was limited to English, peer-reviewed published results of commercial tests, and study risk of bias was not assessed. A full systematic review is required. REVIEW REGISTRATION: PROSPERO CRD42021289712.
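To make the pooled estimates above concrete, the sketch below converts them into expected false negatives and false positives per 1,000 people tested. The pooled sensitivity and specificity values are taken from the abstract; the 10% prevalence and the function name are assumptions chosen only for illustration and are not reported by the review.

```python
def expected_errors(sensitivity, specificity, prevalence, n_tested=1000):
    """Expected (false negatives, false positives) among n_tested people."""
    infected = prevalence * n_tested
    uninfected = n_tested - infected
    false_negatives = (1.0 - sensitivity) * infected
    false_positives = (1.0 - specificity) * uninfected
    return false_negatives, false_positives


# Pooled point estimates quoted in the abstract.
POOLED = {
    "rapid antigen": (0.75, 0.99),
    "rapid molecular": (0.93, 0.98),
}

# Hypothetical prevalence among those tested (an assumption, not from the review).
ASSUMED_PREVALENCE = 0.10

if __name__ == "__main__":
    for name, (sens, spec) in POOLED.items():
        fn, fp = expected_errors(sens, spec, ASSUMED_PREVALENCE)
        print(f"{name}: about {fn:.0f} false negatives and "
              f"{fp:.0f} false positives per 1000 tested")
```

The balance between missed infections and false alarms depends strongly on the assumed prevalence, so the comparison should be rerun with a value appropriate to the testing setting.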
Subjects
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , COVID-19/diagnosis , Network Meta-Analysis , Bias , Diagnostic Tests, Routine , Sensitivity and Specificity , COVID-19 Testing
ABSTRACT
Forensic neuropsychological examinations to detect malingering in patients with neurocognitive, physical, and psychological dysfunction have tremendous social, legal, and economic importance. Thousands of studies have been published to develop and validate methods to forensically detect malingering based largely on approximately 50 validity tests, including embedded and stand-alone performance and symptom validity tests. This is Part II of a two-part review of statistical and methodological issues in the forensic prediction of malingering based on validity tests. The Part I companion paper explored key statistical issues. Part II examines related methodological issues through conceptual analysis, statistical simulations, and reanalysis of findings from prior validity test validation studies. Methodological issues examined include the distinction between analog simulation and forensic studies, the effect of excluding too-close-to-call (TCTC) cases from analyses, the distinction between criterion-related and construct validation studies, and the application of the Revised Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) to all Test of Memory Malingering (TOMM) validation studies published within approximately the first 20 years following its initial publication to assess risk of bias. Findings include that analog studies are commonly mistaken for forensic validation studies, and that construct validation studies are routinely presented as if they were criterion-referenced validation studies. After accounting for the exclusion of TCTC cases, actual classification accuracy was found to be well below claimed levels. QUADAS-2 results revealed that extant TOMM validation studies all had a high risk of bias; not a single one had a low risk of bias.
Recommendations include adoption of well-established guidelines from the biomedical diagnostics literature for good quality criterion-referenced validation studies and examination of implications for malingering determination practices. Design of future studies may hinge on the availability of an incontrovertible reference standard of the malingering status of examinees.
ABSTRACT
The thoughtful commentaries by Drs. Bush, Jewsbury, and Faust in this volume add to the impact of the two reviews, also in this volume, of statistical and methodological issues in the forensic neuropsychological determination of malingering based on performance and symptom validity tests (PVTs and SVTs). In his commentary, Dr. Bush raises, among others, the important question of whether such malingering determinations can still be considered as meeting the legal Daubert standard, which is the basis for neuropsychological expert testimony. Dr. Jewsbury focuses mostly on statistical issues and agrees with two key points of the statistical review: Positive likelihood chaining is not a mathematically tenable method to combine findings of multiple PVTs and SVTs, and the Simple Bayes method is not applicable to malingering determinations. Dr. Faust adds important narrative texture to the implications for forensic neuropsychological practice and points to a need for research into factors other than malingering that may explain PVT and SVT failures. These commentaries put into even sharper focus the serious questions raised in the reviews about the scientific basis of present practices in the forensic neuropsychological determination of malingering.
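One way to see why chaining positive likelihood ratios can mislead is to consider the extreme case of two validity tests that always give the same result. This is a minimal sketch under stated assumptions, and not necessarily the argument made in the reviews or commentaries: the sensitivity, specificity, and prior probability are hypothetical, and the second test is assumed to be a perfect duplicate of the first.

```python
def positive_lr(sensitivity, specificity):
    """Positive likelihood ratio of a single binary test."""
    return sensitivity / (1.0 - specificity)


def posterior_probability(prior, likelihood_ratio):
    """Update a prior probability with a likelihood ratio via odds."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical values, chosen only for illustration.
SENSITIVITY, SPECIFICITY, PRIOR = 0.60, 0.90, 0.30

lr_single = positive_lr(SENSITIVITY, SPECIFICITY)

# Chaining multiplies the ratios, which treats the two tests as
# conditionally independent given true status.
chained = posterior_probability(PRIOR, lr_single * lr_single)

# If test B merely duplicates test A, two failures carry no more
# information than one, so the correct joint ratio equals lr_single.
correct = posterior_probability(PRIOR, lr_single)

if __name__ == "__main__":
    print(f"chained posterior: {chained:.3f}")
    print(f"correct posterior for duplicated tests: {correct:.3f}")
```

Real validity tests fall somewhere between full independence and full duplication, so the size of the overstatement depends on how strongly the tests are correlated.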
ABSTRACT
Forensic neuropsychological examinations with determination of malingering have tremendous social, legal, and economic consequences. Thousands of studies have been published aimed at developing and validating methods to diagnose malingering in forensic settings, based largely on approximately 50 validity tests, including embedded and stand-alone performance validity tests. This is the first part of a two-part review. Part I explores three statistical issues related to the validation of validity tests as predictors of malingering, including (a) the need to report a complete set of classification accuracy statistics, (b) how to detect and handle collinearity among validity tests, and (c) how to assess the classification accuracy of algorithms for aggregating information from multiple validity tests. In the Part II companion paper, three closely related research methodological issues will be examined. Statistical issues are explored through conceptual analysis, statistical simulations, and through reanalysis of findings from prior validation studies. Findings suggest extant neuropsychological validity tests are collinear and contribute redundant information to the prediction of malingering among forensic examinees. Findings further suggest that existing diagnostic algorithms may miss diagnostic accuracy targets under most realistic conditions. The review makes several recommendations to address these concerns, including (a) reporting of full confusion table statistics with 95% confidence intervals in diagnostic trials, (b) the use of logistic regression, and (c) adoption of the consensus model on the "transparent reporting of multivariate prediction models for individual prognosis or diagnosis" (TRIPOD) in the malingering literature.
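The recommendation to report full confusion table statistics with 95% confidence intervals can be sketched as follows. The 2x2 counts are hypothetical and serve only to show the calculation; the Wilson score interval is one common choice of interval and is an assumption here, since the abstract does not name a specific method.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half


def confusion_table_stats(tp, fn, fp, tn):
    """Return {statistic: (estimate, (lower, upper))} for a 2x2 table."""
    cells = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "positive predictive value": (tp, tp + fp),
        "negative predictive value": (tn, tn + fn),
    }
    return {name: (k / n, wilson_interval(k, n)) for name, (k, n) in cells.items()}


if __name__ == "__main__":
    # Hypothetical counts: 50 reference-positive and 100 reference-negative examinees.
    stats = confusion_table_stats(tp=40, fn=10, fp=15, tn=85)
    for name, (estimate, (lower, upper)) in stats.items():
        print(f"{name}: {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Predictive values computed this way apply only at the base rate of the sample, so they should be recalculated for settings where the proportion of reference-positive examinees differs.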
ABSTRACT
BACKGROUND: Accurate diagnosis of axillary lymph node metastasis (ALNM) of breast cancer patients is important to guide local and systemic treatment. PURPOSE: To evaluate the diagnostic performance of different imaging modalities for ALNM in patients with breast cancer. STUDY TYPE: Systematic review and network meta-analysis (NMA). SUBJECTS: Sixty-one original articles with 8011 participants. FIELD STRENGTH: 1.5 T and 3.0 T. ASSESSMENT: We used the QUADAS-2 and QUADAS-C tools to assess the risk of bias in eligible studies. The identified articles assessed ultrasonography (US), MRI, mammography, ultrasound elastography (UE), PET, CT, PET/CT, scintimammography, and PET/MRI. STATISTICAL ANALYSIS: We used random-effects conventional meta-analyses and Bayesian network meta-analyses for data analyses. We used sensitivity and specificity, relative sensitivity and specificity, superiority index, and summary receiver operating characteristic curve (SROC) analysis to compare the diagnostic value of different imaging modalities. RESULTS: Sixty-one studies evaluated nine imaging modalities. At patient level, sensitivities of the nine imaging modalities ranged from 0.27 to 0.84 and specificities ranged from 0.84 to 0.95. Patient-based NMA showed that UE had the highest superiority index (5.95) with the highest relative sensitivity of 1.13 (95% confidence interval [CI]: 0.93-1.29) among all imaging methods when compared to US. At lymph node level, MRI had the highest superiority index (6.91) with highest relative sensitivity of 1.13 (95% CI: 1.01-1.23) and highest relative specificity of 1.11 (95% CI: 0.95-1.23) among all imaging methods when compared to US. SROCs also showed that UE and MRI had the largest area under the curve (AUC) at patient level and lymph node level of 0.92 and 0.94, respectively. 
DATA CONCLUSION: UE and MRI may be superior to other imaging modalities in the diagnosis of ALNM in breast cancer patients at the patient level and the lymph node level, respectively. Further studies are needed to provide high-quality evidence to validate our findings. EVIDENCE LEVEL: 3. TECHNICAL EFFICACY: Stage 2.