Results 1 - 20 of 280
1.
Gastrointest Endosc ; 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38462057

ABSTRACT

BACKGROUND AND AIMS: The modified Rutgeerts' score (mRS) is widely used for the assessment of endoscopic postoperative recurrence (ePOR) in Crohn's disease (CD) after ileocolic resection to guide therapeutic decisions. To improve the validity and prognostic value of this endoscopic assessment, two new scores have been proposed. This study assessed the interobserver agreement of the current (mRS) and new endoscopic scores for ePOR in CD. METHODS: Sixteen Dutch academic and non-academic IBD specialists assessed endoscopic videos (n=71) of postoperative CD patients (n=66) retrieved from nine Dutch centers. Each video was assessed for the degree of inflammation by four gastroenterologists using the mRS and the new proposed endoscopic score: REMIND score (separate score of anastomosis and neoterminal ileum) and updated Rutgeerts score (assessment of lesions at the anastomotic line, ileal inlet, ileal body and neoterminal ileum). In addition, lesions at the ileal body, ileal inlet, neoterminal ileum, colonic and/or ileal blind loop were separately assessed. Interobserver agreement was assessed using Fleiss' weighted kappa. RESULTS: Fleiss' weighted kappa for the mRS was 0.67 (95% confidence interval [CI] 0.59-0.74). The weighted kappa for the REMIND score was 0.73 (95% CI 0.65-0.80) for lesions in the neoterminal ileum and 0.46 (95% CI 0.35-0.58) for anastomotic lesions. The weighted kappa for the updated Rutgeerts' score was 0.69 (95% CI 0.62-0.77). The weighted kappa for lesions in the ileal body, ileal inlet, neoterminal ileum, colonic and ileal blind loop was 0.61 (95% CI 0.49-0.73), 0.63 (95% CI 0.54-0.72), 0.61 (95% CI 0.49-0.74), 0.83 (95% CI 0.62-1.00) and 0.68 (95% CI 0.46-0.89). CONCLUSION: The interobserver agreement of the mRS is substantial. Similarly, the interobserver agreement is substantial for the updated Rutgeerts' score. 
According to the REMIND score, the interobserver agreement was substantial for lesions in the neoterminal ileum, whereas only moderate for anastomotic lesions. Since therapeutic decisions in clinical practice are based on these assessments and these scores are used as outcome measure in clinical studies, further improvement of the interobserver agreement is essential.
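
As an illustration of the agreement statistic named in this abstract, the sketch below computes an unweighted Fleiss' kappa in plain Python. It is a simplification: the study reports a weighted variant whose weighting scheme the abstract does not specify. The function name `fleiss_kappa` and the example scores are hypothetical and are not study data.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Unweighted Fleiss' kappa.

    ratings: list of subjects; each subject is a list of category labels,
    one per rater. Every subject must have the same number of ratings.
    Undefined (division by zero) if all ratings fall in a single category.
    """
    n_raters = len(ratings[0])
    if any(len(subject) != n_raters for subject in ratings):
        raise ValueError("every subject needs the same number of ratings")

    category_totals = Counter()
    observed = 0.0
    for subject in ratings:
        counts = Counter(subject)
        category_totals.update(counts)
        # Share of rater pairs that agree on this subject.
        agreeing_pairs = sum(c * (c - 1) for c in counts.values())
        observed += agreeing_pairs / (n_raters * (n_raters - 1))
    observed /= len(ratings)

    # Chance agreement from the overall category proportions.
    total = len(ratings) * n_raters
    expected = sum((c / total) ** 2 for c in category_totals.values())
    return (observed - expected) / (1 - expected)


# Hypothetical scores: three videos, four raters each (not study data).
example = [
    ["i0", "i0", "i0", "i0"],
    ["i2", "i2", "i2", "i2"],
    ["i0", "i0", "i2", "i2"],
]
print(round(fleiss_kappa(example), 3))
```

Fleiss' kappa only needs the number of raters per category for each subject, so it accommodates designs like this one, in which each video is scored by a different subset of four raters.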

2.
BMC Pregnancy Childbirth ; 24(1): 136, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38355457

ABSTRACT

BACKGROUND: While the effectiveness of cardiotocography in reducing neonatal morbidity is still debated, it remains the primary method for assessing fetal well-being during labor. Evaluating how accurately professionals interpret cardiotocography signals is essential for its effective use. The objective was to evaluate the accuracy of fetal hypoxia prediction by practitioners through the interpretation of cardiotocography signals and clinical variables during labor. MATERIAL AND METHODS: We conducted a cross-sectional online survey, involving 120 obstetric healthcare providers from several countries. One hundred cases, including fifty cases of fetal hypoxia, were randomly assigned to participants who were invited to predict the fetal outcome (binary criterion of pH with a threshold of 7.15) based on the cardiotocography signals and clinical variables. After describing the participants, we calculated (with a 95% confidence interval) the success rate, sensitivity and specificity to predict the fetal outcome for the whole population and according to pH ranges, professional groups and number of years of experience. Interobserver agreement and reliability were evaluated using the proportion of agreement and Cohen's kappa respectively. RESULTS: The overall ability to predict a pH level below 7.15 yielded a success rate of 0.58 (95% CI 0.56-0.60), a sensitivity of 0.58 (95% CI 0.56-0.60) and a specificity of 0.63 (95% CI 0.61-0.65). No significant difference in the success rates was observed with respect to profession and number of years of experience. The success rate was higher for the cases with a pH level below 7.05 (0.69) and above 7.20 (0.66) compared to those falling between 7.05 and 7.20 (0.48). The proportion of agreement between participants was good (0.82), with an overall kappa coefficient indicating substantial reliability (0.63). 
CONCLUSIONS: The use of an online tool enabled us to collect a large amount of data to analyze how practitioners interpret cardiotocography data during labor. Despite a good level of agreement and reliability among practitioners, the overall accuracy is poor, particularly for cases with a neonatal pH between 7.05 and 7.20. Factors such as profession and experience level do not present notable impact on the accuracy of the annotations. The implementation and use of a computerized cardiotocography analysis software has the potential to enhance the accuracy to detect fetal hypoxia, especially for ambiguous cardiotocography tracings.
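
The success rate, sensitivity and specificity reported above can be sketched as follows. This is a minimal sketch that assumes one boolean prediction per case and uses the pH threshold of 7.15 stated in the abstract. The function names and the pH values are hypothetical.

```python
def is_hypoxic(ph, threshold=7.15):
    """Binary outcome used in the abstract: pH below the threshold."""
    return ph < threshold


def diagnostic_accuracy(predicted, actual):
    """Return (success_rate, sensitivity, specificity) for boolean lists.

    Assumes at least one positive and one negative case in `actual`.
    """
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if p and a:
            tp += 1
        elif p and not a:
            fp += 1
        elif not p and a:
            fn += 1
        else:
            tn += 1
    success_rate = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # hypoxic cases correctly flagged
    specificity = tn / (tn + fp)   # non-hypoxic cases correctly cleared
    return success_rate, sensitivity, specificity


# Hypothetical data: six cases, one participant's predictions.
ph_values = [7.02, 7.10, 7.12, 7.18, 7.25, 7.30]
predictions = [True, True, False, True, False, False]
outcomes = [is_hypoxic(ph) for ph in ph_values]
print(diagnostic_accuracy(predictions, outcomes))
```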


Subject(s)
Cardiotocography , Fetal Hypoxia , Pregnancy , Newborn Infant , Female , Humans , Cardiotocography/methods , Fetal Hypoxia/diagnosis , Observer Variation , Reproducibility of Results , Cross-Sectional Studies , Fetal Heart Rate
3.
Acta Obstet Gynecol Scand ; 103(1): 68-76, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37890863

ABSTRACT

INTRODUCTION: It is a shortcoming of traditional cardiotocography (CTG) classification table formats that CTG traces are frequently classified differently by different users, resulting in poor interobserver agreements. A fast-and-frugal tree (FFTree) flow chart may help provide better concordance because it is straightforward and has clearly structured binary questions with understandable "yes" or "no" responses. The initial triage to determine whether a fetus is suitable for labor when utilizing fetal ECG ST analysis (STAN) is very important, since a fetus with restricted capacity to respond to hypoxic stress may not generate STAN events and therefore may become falsely negative. This study aimed to compare physiology-focused FFTree CTG interpretation with FIGO classification for assessing the suitability for STAN monitoring. MATERIAL AND METHODS: A retrospective study of 36 CTG traces with a high proportion of adverse outcomes (17/36) selected from a European multicenter study database. Eight experienced European obstetricians evaluated the initial 40 minutes of the CTG recordings and judged whether STAN was a suitable fetal surveillance method and whether intervention was indicated. The experts rated the CTGs using the FFTree and FIGO classifications at least 6 weeks apart. Interobserver agreements were calculated using proportions of agreement and Fleiss' kappa (κ). RESULTS: The proportions of agreement for "not suitable for STAN" were for FIGO 47% (95% confidence interval [CI] 42%-52%) and for FFTree 60% (95% CI 56-64), ie a significant difference; the corresponding figures for "yes, suitable" were 74% (95% CI 71-77) and 70% (95% CI 67-74). For "intervention needed" the figures were 52% (95% CI 47-56) vs 58% (95% CI 54-62) and for "expectant management" 74% (95% CI 71-77) vs 72% (95% CI 69-75). 
Fleiss' κ agreement on "suitability for STAN" was 0.50 (95% CI 0.44-0.56) for the FIGO classification and 0.57 (95% CI 0.51-0.63) for the FFTree classification; the corresponding figures for "intervention or expectancy" were 0.53 (95% CI 0.47-0.59) and 0.57 (95% CI 0.51-0.63). CONCLUSIONS: The proportion of agreement among expert obstetricians using the FFTree physiological approach was significantly higher compared with the traditional FIGO classification system in rejecting cases not suitable for STAN monitoring. That might be of importance to avoid false negative STAN recordings. Other agreement figures were similar. It remains to be shown whether the FFTree simplicity will benefit less experienced users and how it will work in real-world clinical scenarios.
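
The abstract does not say how the per-category "proportions of agreement" were computed. One common choice for several raters is the proportion of specific agreement, sketched below as an assumption rather than as the authors' method. The function name and the ratings are hypothetical.

```python
def specific_agreement(ratings, category):
    """Proportion of specific agreement for one category.

    ratings: list of cases, each a list of labels (same number of raters
    per case). For every rating equal to `category`, this counts how many
    of the other raters of the same case chose that category too.
    """
    n_raters = len(ratings[0])
    agreeing = 0
    possible = 0
    for case in ratings:
        n_cat = sum(1 for label in case if label == category)
        agreeing += n_cat * (n_cat - 1)
        possible += n_cat * (n_raters - 1)
    return agreeing / possible


# Hypothetical ratings: three traces, four raters each.
traces = [
    ["yes", "yes", "yes", "yes"],
    ["yes", "yes", "yes", "no"],
    ["no", "no", "yes", "yes"],
]
print(specific_agreement(traces, "yes"), specific_agreement(traces, "no"))
```

Reporting agreement per category, as the abstract does, can reveal that raters agree well on one answer ("yes, suitable") while agreeing poorly on the other, which a single pooled figure would hide.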


Subject(s)
Electrocardiography , Fetal Monitoring , Triage , Female , Humans , Pregnancy , Cardiotocography/methods , Electrocardiography/methods , Fetal Monitoring/methods , Fetus , Fetal Heart Rate/physiology , Observer Variation , Retrospective Studies
4.
Actas Dermosifiliogr ; 2024 Jul 05.
Article in English, Spanish | MEDLINE | ID: mdl-38972585

ABSTRACT

INTRODUCTION: Because dermatopathology is not an exact science, it is prone to subjectivity, which sometimes causes disagreement on the diagnosis and on the assessment of some histological features. In melanoma, some variables, such as regression, are associated with low interobserver agreement. In contrast, others, such as the measurement of Breslow thickness, show high reproducibility. OBJECTIVE: The main objective of our study was to assess multiple features of 60 consecutive cases of melanoma to establish their interobserver reproducibility. METHODS AND MAIN RESULTS: We conducted an observational, descriptive study at Hospital de Manises (Valencia, Spain), IVO Foundation (Valencia, Spain), and Hospital 12 de Octubre (Madrid, Spain). The mean level of agreement across all study variables was moderate (Cohen's kappa = 0.5). The highest agreement was found for polypoid morphology, pigmentation, ulceration, and solar elastosis. The lowest agreement was found for the presence of cellular pleomorphism and tumor necrosis. CONCLUSIONS: Our mean level of agreement was moderate, which indicates that some of the measured characteristics, such as cellular pleomorphism or the presence of necrosis, either cannot be used in future studies or must be redefined and have their reproducibility reestablished. When conducting a research study, it is necessary to analyze the study variables to demonstrate their validity for measuring or classifying a given feature. It is also advisable to ensure that the variables are reproducible so that they can be used in other studies or in routine clinical practice.
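
Verbal labels such as "moderate" usually come from a fixed banding of kappa values. The sketch below assumes the widely used Landis and Koch bands, which this abstract does not name explicitly. Other records in this listing use different labels, so the mapping is illustrative only, and the function name is hypothetical.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch verbal band (assumed scale)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


print(landis_koch_label(0.5))
```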

5.
Stroke ; 54(6): 1587-1592, 2023 06.
Article in English | MEDLINE | ID: mdl-37154054

ABSTRACT

BACKGROUND: The Heidelberg Bleeding Classification, developed for computed tomography, is also frequently used to classify intracranial hemorrhage (ICH) on magnetic resonance imaging. Additionally, the presence of any ICH is frequently used as (safety) outcome measure in clinical stroke trials that evaluate acute interventions. We assessed the interobserver agreement on the presence of any ICH and the type of ICH according to the Heidelberg Bleeding Classification on magnetic resonance imaging in patients treated with reperfusion therapy. METHODS: We used 300 magnetic resonance imaging scans including susceptibility-weighted imaging or T2*-weighted gradient echo imaging of ischemic stroke patients within 1 week after reperfusion therapy. Six observers, blinded to clinical characteristics except for suspected location of the infarction, independently rated ICH according to the Heidelberg Bleeding Classification in random pairs. Percent agreement and Cohen's kappa (κ) were estimated for the presence of any ICH (yes/no), and for agreement on the Heidelberg Bleeding Classification class 1 and 2. For the Heidelberg Bleeding Classification class 1 and 2, weighted κ was estimated to take the degree of disagreement into account. RESULTS: In 297 of 300 scans, the quality of scans was sufficient to score ICH. Observers agreed on the presence or absence of any ICH in 264 of 297 scans (88.9%; κ 0.78 [95% CI, 0.71-0.85]). There was agreement on the Heidelberg Bleeding Classification class 1 and 2 and no ICH in class 1 and 2 in 226 of 297 scans (76.1%; κ 0.63 [95% CI, 0.56-0.69]; weighted κ 0.90 [95% CI, 0.87-0.93]). CONCLUSIONS: The presence of any ICH can be reliably scored on magnetic resonance imaging and can, therefore, be used as (safety) outcome measure in clinical stroke trials that evaluate acute interventions. Agreement of ICH types according to the Heidelberg Bleeding Classification is substantial and disagreements are small.
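
The gap between the unweighted and weighted kappa in these results reflects how each treats near-misses. The sketch below computes Cohen's kappa for two raters with optional linear weights. The abstract does not state which weighting was used, so linear weights are an assumption, and the grades in the example are hypothetical.

```python
def cohen_kappa(rater_a, rater_b, categories, weighted=False):
    """Cohen's kappa for two raters.

    categories: ordered list of all possible labels (at least two).
    weighted=True applies linear disagreement weights |i - j| / (k - 1),
    so adjacent categories count as a partial disagreement.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Joint proportions of (rater A category, rater B category).
    table = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        table[index[a]][index[b]] += 1 / n
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    observed = sum(disagreement(i, j) * table[i][j]
                   for i in range(k) for j in range(k))
    expected = sum(disagreement(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k))
    return 1 - observed / expected


# Hypothetical ordinal grades from two observers on six scans.
grades = [0, 1, 2]
obs_1 = [0, 0, 1, 1, 2, 2]
obs_2 = [0, 1, 1, 2, 2, 2]
print(cohen_kappa(obs_1, obs_2, grades))
print(cohen_kappa(obs_1, obs_2, grades, weighted=True))
```

In this toy data every disagreement is between adjacent grades, so the weighted value is higher than the unweighted one, which mirrors the pattern the abstract describes as "disagreements are small".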


Subject(s)
Brain Ischemia , Ischemic Stroke , Stroke , Humans , Brain Ischemia/diagnostic imaging , Brain Ischemia/therapy , Observer Variation , Intracranial Hemorrhages/diagnostic imaging , Intracranial Hemorrhages/pathology , Stroke/therapy , Magnetic Resonance Imaging/methods , Cerebral Hemorrhage
6.
Mod Pathol ; 36(1): 100009, 2023 01.
Article in English | MEDLINE | ID: mdl-36788064

ABSTRACT

The classification of human epidermal growth factor receptor 2 (HER2) expression is optimized to detect HER2-amplified breast cancer (BC). However, novel HER2-targeting agents are also effective for BCs with low levels of HER2. This raises the question whether the current guidelines for HER2 testing are sufficiently reproducible to identify HER2-low BC. The aim of this multicenter international study was to assess the interobserver agreement of specific HER2 immunohistochemistry scores in cases with negative HER2 results (0, 1+, or 2+/in situ hybridization negative) according to the current American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) guidelines. Furthermore, we evaluated whether the agreement improved by redefining immunohistochemistry (IHC) scoring criteria or by adding fluorescent in situ hybridization (FISH). We conducted a 2-round study of 105 nonamplified BCs. During the first assessment, 16 pathologists used the latest version of the ASCO/CAP guidelines. After a consensus meeting, the same pathologists scored the same digital slides using modified IHC scoring criteria based on the 2007 ASCO/CAP guidelines, and an extra "ultralow" category was added. Overall, the interobserver agreement was limited (4.7% of cases with 100% agreement) in the first round, but this was improved by clustering IHC categories. In the second round, the highest reproducibility was observed when comparing IHC 0 with the ultralow/1+/2+ grouped cluster (74.3% of cases with 100% agreement). The FISH results were not statistically different between HER2-0 and HER2-low cases, regardless of the IHC criteria used. In conclusion, our study suggests that the modified 2007 ASCO/CAP criteria were more reproducible in distinguishing HER2-0 from HER2-low cases than the 2018 ASCO/CAP criteria. However, the reproducibility was still moderate, which was not improved by adding FISH. 
This could lead to a suboptimal selection of patients eligible for novel HER2-targeting agents. If the threshold between HER2 IHC 0 and 1+ is to be clinically actionable, there is a need for clearer, more reproducible IHC definitions, training, and/or development of more accurate methods to detect this subtle difference in protein expression levels.
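
The "percentage of cases with 100% agreement" and the effect of clustering categories can be sketched as below. The function name, the scores, and the merged label "low" are hypothetical and only illustrate the idea of grouping ultralow, 1+ and 2+ against IHC 0.

```python
def unanimous_fraction(scores_per_case, mapping=None):
    """Share of cases in which every rater chose the same category.

    mapping: optional dict that merges categories before comparison,
    e.g. {"1+": "low", "2+": "low"}; unmapped labels are kept as-is.
    """
    unanimous = 0
    for scores in scores_per_case:
        if mapping:
            merged = {mapping.get(s, s) for s in scores}
        else:
            merged = set(scores)
        if len(merged) == 1:
            unanimous += 1
    return unanimous / len(scores_per_case)


# Hypothetical scores: four cases, three pathologists each.
cases = [
    ["0", "0", "0"],
    ["0", "ultralow", "1+"],
    ["1+", "2+", "1+"],
    ["ultralow", "1+", "1+"],
]
cluster = {"ultralow": "low", "1+": "low", "2+": "low"}
print(unanimous_fraction(cases))
print(unanimous_fraction(cases, cluster))
```

Merging categories can only keep or raise the unanimous share, so a higher figure after clustering shows where the disagreement sat, not that raters became more accurate.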


Subject(s)
Breast Neoplasms , Humans , Female , Fluorescence In Situ Hybridization/methods , Breast Neoplasms/pathology , Observer Variation , Immunohistochemistry , Reproducibility of Results , ErbB-2 Receptor/genetics , Tumor Biomarkers
7.
Mod Pathol ; 36(5): 100154, 2023 05.
Article in English | MEDLINE | ID: mdl-36925069

ABSTRACT

Reliable, reproducible methods to interpret programmed death ligand-1 (PD-L1) expression on tumor cells (TC) and immune cells (IC) are needed for pathologists to inform decisions associated with checkpoint inhibitor therapies. Our international study compared interpathologist agreement of PD-L1 expression using the combined positive score (CPS) under standardized conditions on samples from patients with gastric/gastroesophageal junction/esophageal adenocarcinoma. Tissue sections from 100 adenocarcinoma pretreatment biopsies were stained in a single laboratory using the PD-L1 immunohistochemistry 28-8 and 22C3 (Agilent) pharmDx immunohistochemical assays. PD-L1 CPS was evaluated by 12 pathologists on scanned whole slide images of these biopsies before and after a 2-hour CPS training session by Agilent. Additionally, pathologists determined PD-L1-positive TC, IC, and total viable TC on a single tissue fragment from 35 of 100 biopsy samples. Scoring agreement among pathologists was assessed using the intraclass correlation coefficient (ICC). Interobserver variability for CPS for 100 biopsies was high, with only fair agreement among pathologists both pre- (range, 0.45-0.55) and posttraining (range, 0.56-0.57) for both assays. For the 35 single biopsy samples, poor/fair agreement was also observed for the total number of viable TC (ICC, 0.09), number of PD-L1-positive IC (ICC, 0.19), number of PD-L1-positive TC (ICC, 0.54), and calculated CPS (ICC, 0.14), whereas calculated TC score (positive TC/total TC) showed excellent agreement (ICC, 0.82). Retrospective histologic review of samples with the poorest interpathologist agreement revealed the following as possible confounding factors: (1) ambiguous identification of positively staining stromal cells, (2) faint or variable intensity of staining, (3) difficulty in distinguishing membranous from cytoplasmic tumor staining, and (4) cautery and crush artifacts. 
These results emphasize the need for objective techniques to standardize the interpretation of PD-L1 expression when using the CPS methodology on gastric/gastroesophageal junction cancer biopsies to accurately identify patients most likely to benefit from immune checkpoint inhibitor therapy.
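
To show why the ratio-based TC score can agree well even when raw counts do not, the sketch below computes both scores. It assumes the commonly used CPS definition (PD-L1-positive tumor cells plus positive immune cells, divided by viable tumor cells, times 100, capped at 100), which the abstract does not spell out. The TC score follows the abstract's "positive TC/total TC". The counts are hypothetical.

```python
def combined_positive_score(positive_tc, positive_ic, viable_tc):
    """CPS under the commonly used definition (an assumption here)."""
    if viable_tc <= 0:
        raise ValueError("viable tumor cell count must be positive")
    return min(100.0, 100.0 * (positive_tc + positive_ic) / viable_tc)


def tumor_cell_score(positive_tc, viable_tc):
    """TC score as given in the abstract: positive TC / total TC."""
    if viable_tc <= 0:
        raise ValueError("viable tumor cell count must be positive")
    return positive_tc / viable_tc


# Hypothetical counts from two pathologists on the same fragment.
reader_a = {"positive_tc": 10, "positive_ic": 5, "viable_tc": 100}
reader_b = {"positive_tc": 20, "positive_ic": 30, "viable_tc": 200}
for reader in (reader_a, reader_b):
    print(combined_positive_score(**reader),
          tumor_cell_score(reader["positive_tc"], reader["viable_tc"]))
```

In this toy case the two readers count very different absolute numbers yet reach the same TC score, while their CPS values differ because the immune cell counts do not scale the same way.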


Subject(s)
Adenocarcinoma , Stomach Neoplasms , Humans , B7-H1 Antigen/metabolism , Retrospective Studies , Observer Variation , Pathologists , Tumor Biomarkers , Adenocarcinoma/pathology , Esophagogastric Junction/metabolism , Esophagogastric Junction/pathology , Stomach Neoplasms/pathology
8.
Strahlenther Onkol ; 199(11): 973-981, 2023 11.
Article in English | MEDLINE | ID: mdl-37268767

ABSTRACT

PURPOSE: The aim of this study was to evaluate interobserver agreement (IOA) on target volume definition for pancreatic cancer (PACA) within the Radiosurgery and Stereotactic Radiotherapy Working Group of the German Society of Radiation Oncology (DEGRO) and to identify the influence of imaging modalities on the definition of the target volumes. METHODS: Two cases of locally advanced PACA and one local recurrence were selected from a large stereotactic body radiotherapy (SBRT) database. Delineation was based on a planning 4D CT with or without (w/wo) IV contrast, w/wo PET/CT, and w/wo diagnostic MRI. Unlike previous studies, this study used a combination of four metrics to integrate several aspects of target volume segmentation: the Dice coefficient (DSC), the Hausdorff distance (HD), the probabilistic distance (PBD), and the volumetric similarity (VS). RESULTS: For all three gross tumor volumes (GTVs), the median DSC was 0.75 (range 0.17-0.95), the median HD was 15 mm (range 3.22-67.11 mm), the median PBD was 0.33 (range 0.06-4.86), and the median VS was 0.88 (range 0.31-1). For internal target volumes (ITVs) and planning target volumes (PTVs), the results were similar. When comparing the imaging modalities for delineation, the best agreement for the GTV was achieved using PET/CT, and for the ITV and PTV using 4D PET/CT, in treatment position with abdominal compression. CONCLUSION: Overall, there was good GTV agreement (DSC). Combined metrics appeared to allow a more valid detection of interobserver variation. For SBRT, either 4D PET/CT or 3D PET/CT in treatment position with abdominal compression leads to better agreement and should be considered a very useful imaging modality for the definition of treatment volumes in pancreatic SBRT. Contouring does not appear to be the weakest link in the treatment planning chain of SBRT for PACA.
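
Three of the four metrics named in this abstract can be sketched on contours stored as sets of voxel coordinates. This is a minimal sketch under stated assumptions: the Hausdorff distance is taken over all voxels and is returned in voxel units rather than millimetres, the probabilistic distance is omitted, and the two "contours" are hypothetical 2D squares.

```python
import math


def dice(a, b):
    """Dice coefficient: overlap relative to the mean size of both sets."""
    return 2 * len(a & b) / (len(a) + len(b))


def volumetric_similarity(a, b):
    """1 minus the relative volume difference; ignores where voxels lie."""
    return 1 - abs(len(a) - len(b)) / (len(a) + len(b))


def hausdorff(a, b):
    """Largest distance from any voxel of one set to the other set."""
    def directed(src, dst):
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))


# Hypothetical contours: two 4x4 squares shifted by two voxels.
contour_1 = {(x, y) for x in range(4) for y in range(4)}
contour_2 = {(x, y) for x in range(2, 6) for y in range(4)}
print(dice(contour_1, contour_2))
print(volumetric_similarity(contour_1, contour_2))
print(hausdorff(contour_1, contour_2))
```

The toy contours have identical volume, so volumetric similarity is perfect even though only half of the voxels overlap. This is one reason a combination of metrics, as used in the study, describes segmentation agreement more fully than any single one.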


Subject(s)
Adenocarcinoma , Lung Neoplasms , Pancreatic Neoplasms , Radiosurgery , Humans , Radiosurgery/methods , Adenocarcinoma/diagnostic imaging , Adenocarcinoma/radiotherapy , Adenocarcinoma/surgery , Positron Emission Tomography Computed Tomography , Observer Variation , Pancreatic Neoplasms/diagnostic imaging , Pancreatic Neoplasms/radiotherapy , Pancreatic Neoplasms/surgery , Computer-Assisted Radiotherapy Planning/methods , Lung Neoplasms/radiotherapy , Pancreatic Neoplasms
9.
AJR Am J Roentgenol ; 220(1): 126-133, 2023 01.
Article in English | MEDLINE | ID: mdl-35946860

ABSTRACT

BACKGROUND. The simplified MR index of activity (MaRIA) score is used to assess the severity of small-bowel inflammation without use of IV contrast material. OBJECTIVE. The purposes of this study were to assess interreader agreement on the use of simplified MaRIA scores for evaluation of the inflammatory activity of terminal ileal Crohn disease in children and young adults and to assess whether simplified MaRIA scores change after biologic medical therapy. METHODS. This analysis was ancillary to a previously reported primary prospective research investigation. The study included 20 children and young adults with newly diagnosed ileal Crohn disease and 15 healthy control participants who underwent research small-bowel MRI examinations between December 2018 and October 2021. The participants with Crohn disease underwent baseline MRI and MRI 6 weeks and 6 months after beginning anti-tumor necrosis factor α-treatment as well as weighted pediatric Crohn disease activity index (wPCDAI) and C-reactive protein (CRP) assessment on the day of each examination. Control participants underwent one MRI examination. Four pediatric radiologists independently assigned simplified MaRIA scores using axial and coronal T2-weighted SSFSE images. Median simplified MaRIA score among readers was computed. Interreader agreement was assessed with Fleiss kappa coefficients and intra-class correlation coefficient (ICC). Analysis included the Mann-Whitney U test, Friedman test, and Spearman rank correlation. RESULTS. Simplified MaRIA scores (across time points and study groups) had substantial interreader agreement (κ = 0.65 [95% CI, 0.56-0.74]; ICC, 0.71 [95% CI, 0.63-0.78]). Median scores were higher in participants with Crohn disease at baseline than in healthy control participants (3.5 [IQR, 2.5-4.9] vs 0.5 [IQR, 0-2.0]; p < .001). Scores decreased after medical treatment in participants with Crohn disease (p = .005). 
The median score was 3.5 (IQR, 2.5-4.9) at baseline, 2.3 (IQR, 1.6-3.9) at 6 weeks, and 2.0 (IQR, 0.5-2.5) at 6 months. In participants with Crohn disease, median scores had significant correlations with wPCDAI (ρ = 0.46 [95% CI, 0.18-0.64]; p < .001) and CRP level (ρ = 0.48 [95% CI, 0.27-0.65]; p < .001). CONCLUSION. Radiologists had substantial agreement in use of simplified MaRIA scores to assess intestinal inflammation in ileal Crohn disease. Scores changed over time after medical therapy. CLINICAL IMPACT. The results support the simplified MaRIA score as an objective MRI-based clinical measure of intestinal inflammation in children and young adults with Crohn disease.
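
The analysis steps of taking the median score among readers and correlating it with a clinical measure can be sketched as follows. This is a plain-Python Spearman rank correlation with average ranks for ties. The function names and all values are hypothetical and are not study data.

```python
import math
import statistics


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: four readers' scores per examination, then the median.
reader_scores = [[1, 1, 2, 1], [2, 3, 3, 2], [3, 4, 3, 4], [4, 5, 4, 4], [5, 5, 5, 4]]
median_scores = [statistics.median(s) for s in reader_scores]
clinical_index = [5, 6, 7, 8, 7]   # hypothetical activity index values
print(median_scores, round(spearman_rho(median_scores, clinical_index), 3))
```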


Subject(s)
Crohn Disease , Young Adult , Humans , Child , Crohn Disease/diagnostic imaging , Crohn Disease/pathology , Prospective Studies , Small Intestine/diagnostic imaging , Small Intestine/pathology , Magnetic Resonance Imaging/methods , Inflammation
10.
Article in English | MEDLINE | ID: mdl-37970762

ABSTRACT

OBJECTIVES: Timely and accurate preoperative diagnosis of uterine sarcoma will increase patient survival. The primary aim of this study was to describe the ultrasound features of uterine sarcoma compared with those of uterine leiomyoma based on the terms and definitions of the Morphological Uterus Sonographic Assessment (MUSA) group. A secondary aim was to assess the interobserver agreement for reporting on ultrasound features according to MUSA terminology. METHODS: This was a retrospective cohort study of patients with uterine sarcoma or uterine leiomyoma treated in a single tertiary center during the periods 1997-2019 and 2016-2019, respectively. Demographic characteristics, presenting symptoms and surgical outcomes were extracted from patients' files. Ultrasound images were re-evaluated independently by two sonologists using MUSA terms and definitions. Descriptive statistics were calculated and interobserver agreement was assessed using Cohen's κ (with squared weights) or intraclass correlation coefficient, as appropriate. RESULTS: A total of 107 patients were included, of whom 16 had a uterine sarcoma and 91 had a uterine leiomyoma. Abnormal uterine bleeding was the most frequent presenting symptom (69/107 (64%)). Compared with leiomyoma cases, patients with uterine sarcoma were older (median age, 65 (interquartile range (IQR), 60-70) years vs 48 (IQR, 43-52) years) and more likely to be postmenopausal (13/16 (81%) vs 15/91 (16%)). In the uterine sarcoma cohort, leiomyosarcoma was the most frequent histological type (6/16 (38%)), followed by adenosarcoma (4/16 (25%)). On ultrasound evaluation, according to Observers 1 and 2, the tumor border was irregular in most sarcomas (11/16 (69%) and 13/16 (81%) cases, respectively), but regular in most leiomyomas (65/91 (71%) and 82/91 (90%) cases, respectively). 
Lesion echogenicity was classified as non-uniform in 68/91 (75%) and 51/91 (56%) leiomyomas by Observers 1 and 2, respectively, and 15/16 (94%) uterine sarcomas by both observers. More than 60% of the uterine sarcomas showed acoustic shadows (11/16 (69%) and 10/16 (63%) cases by Observers 1 and 2, respectively), whereas calcifications were reported in a small minority (0/16 (0%) and 2/16 (13%) cases by Observers 1 and 2, respectively). In uterine sarcomas, intralesional vascularity was reported as moderate to abundant in 13/16 (81%) cases by Observer 1 and 15/16 (94%) cases by Observer 2, while circumferential vascularity was scored as moderate to abundant in 6/16 (38%) by both observers. Interobserver agreement for the presence of cystic areas, calcifications, acoustic shadow, central necrosis, color score (overall, intralesional and circumferential) and maximum diameter of the lesion was moderate. The agreement for shape of lesion, tumor border and echogenicity was fair. CONCLUSIONS: A postmenopausal patient presenting with abnormal uterine bleeding and a new or growing mesenchymal mass with irregular tumor borders, moderate-to-abundant intralesional vascularity, cystic areas and an absence of calcifications on ultrasonography is at a higher risk of having a uterine sarcoma. Interobserver agreement for most MUSA terms and definitions is moderate. Future studies should validate the abovementioned clinical and ultrasound findings on uterine mesenchymal tumors in a prospective multicenter fashion. © 2023 International Society of Ultrasound in Obstetrics and Gynecology.

11.
Ultrasound Obstet Gynecol ; 61(3): 399-407, 2023 03.
Article in English | MEDLINE | ID: mdl-35802514

ABSTRACT

OBJECTIVES: To evaluate the reproducibility of lower uterine segment (LUS) thickness measurement before induction of labor (IOL), and to assess the relationship between LUS thickness and IOL outcomes. METHODS: This was a prospective cohort study of pregnant women undergoing IOL at term, conducted in a single tertiary hospital between July 2014 and February 2017. Women with a singleton pregnancy at ≥ 37 weeks' gestation, with a live fetus in cephalic presentation and a Bishop score of ≤ 6, were eligible for inclusion. Both nulliparous and parous women, and those with a previous Cesarean section (CS), were eligible. All women underwent transvaginal ultrasound assessment before IOL admission, and cervical length and LUS thickness were measured offline after delivery. Maternal and obstetric characteristics and Bishop score were recorded. The main outcome was the overall rate of CS after IOL, and secondary outcomes were CS for either failure to progress in the active phase of labor or failed IOL, and CS for failed IOL only. Interobserver agreement for measurement of LUS thickness between two operators was assessed using the intraclass correlation coefficient (ICC) and Bland-Altman analysis with the ANOVA test to evaluate systematic bias. Univariable and multivariable analysis were employed to evaluate the relationship between clinical and sonographic characteristics and IOL outcomes. RESULTS: Of 265 women included in the analysis, 195 (73.6%) had a vaginal delivery and 70 (26.4%) required a CS after IOL. Reproducibility analysis showed excellent interobserver agreement for the measurement of LUS thickness (ICC, 0.96 (95% CI, 0.93-0.98)). On Bland-Altman analysis, the mean difference in LUS thickness between the two operators was 0.15 mm (95% limits of agreement, -1.84 to 2.14 mm), and there was no evidence of systematic bias (ANOVA test, P = 0.46). 
Univariable analysis showed that LUS thickness was associated significantly with overall CS (P = 0.002), CS for failure to progress in the active phase of labor or failed IOL (P = 0.03) and CS for failed IOL (P = 0.037). On multivariable logistic regression analysis, LUS thickness was an independent predictive factor for overall CS (odds ratio (OR), 1.149 (95% CI, 1.031-1.281)) and CS for failure to progress in the active phase of labor or failed IOL (OR, 1.226 (95% CI, 1.039-1.445)). CONCLUSIONS: In women undergoing IOL at term, measurement of LUS thickness is feasible and reproducible, and is associated significantly with IOL outcome. © 2022 International Society of Ultrasound in Obstetrics and Gynecology.
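
The Bland-Altman figures reported above (mean difference and 95% limits of agreement) can be sketched as below, assuming the conventional limits of mean ± 1.96 standard deviations of the paired differences. The function name and the measurements are hypothetical.

```python
import statistics


def bland_altman(operator_a, operator_b):
    """Return (mean difference, lower limit, upper limit) of agreement."""
    diffs = [a - b for a, b in zip(operator_a, operator_b)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)          # sample standard deviation
    return mean_diff, mean_diff - 1.96 * sd, mean_diff + 1.96 * sd


# Hypothetical thickness measurements in mm by two operators.
op_1 = [4.0, 5.0, 6.0, 7.0]
op_2 = [3.5, 5.5, 5.5, 7.5]
print(bland_altman(op_1, op_2))
```

A mean difference near zero suggests no systematic bias between operators, while the width of the limits shows how far two measurements of the same patient may plausibly differ.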


Subject(s)
Cesarean Section , Prenatal Ultrasonography , Pregnancy , Female , Humans , Prospective Studies , Reproducibility of Results , Induced Labor
12.
Pediatr Dev Pathol ; 26(4): 333-344, 2023.
Article in English | MEDLINE | ID: mdl-37082923

ABSTRACT

INTRODUCTION: Placental pathology is key for investigating adverse pregnancy outcomes; however, lack of standardization in reporting has limited its clinical utility. We evaluated a novel placental pathology synoptic report, comparing its robustness to narrative reports, and assessed interobserver agreement. METHODS: One hundred singleton placentas were included. Histology slides were examined by 2 senior perinatal pathologists and 2 pathology residents using a synoptic report (32 lesions). Historical narrative reports were compared to synoptic reports. Kappa scores were calculated for interobserver agreement between senior pathologists, between resident pathologists, and between senior and resident pathologists. RESULTS: Synoptic reporting detected 169 (51.4%) lesion instances initially not included in historical reports. Amongst senior pathologists, 64% of all lesions examined demonstrated fair-to-excellent agreement (Kappa ≥0.41), with only 26% of Kappas ≥0.41 amongst those examined by resident pathologists. Well-characterized lesions (e.g., chorioamnionitis) demonstrated higher agreement, with lower agreement for uncommon lesions and those previously shown to have poor consensus. DISCUSSION: Synoptic reporting is one proposed method to address issues in placenta pathology reporting. The synoptic report generally identifies more lesions than the narrative report; however, the clinical significance remains unclear. Interobserver agreement is likely related to differences in experience. Further efforts to improve overall standardization of placenta pathology reporting are needed.


Subject(s)
Clinical Pathology , Placenta , Pregnancy , Female , Humans , Observer Variation , Pregnancy Outcome , Research Report
13.
J Ultrasound Med ; 42(4): 843-851, 2023 Apr.
Article in English | MEDLINE | ID: mdl-35796343

ABSTRACT

OBJECTIVES: Lung ultrasound (LUS) has sparked significant interest during COVID-19. LUS is based on the detection and analysis of imaging patterns. Vertical artifacts and consolidations are some of the recognized patterns in COVID-19. However, the interrater reliability (IRR) of these findings has not yet been thoroughly investigated. The goal of this study is to assess IRR in LUS COVID-19 data and determine how many LUS videos and operators are required to obtain a reliable result. METHODS: A total of 1035 LUS videos from 59 COVID-19 patients were included. Videos were randomly selected from a dataset of 1807 videos and scored by six human operators (HOs). The videos were also analyzed by artificial intelligence (AI) algorithms. Fleiss' kappa coefficient results are presented, evaluated at both the video and prognostic levels. RESULTS: Findings show a stable agreement when evaluating a minimum of 500 videos. The statistical analysis illustrates that, at the video level, a Fleiss' kappa coefficient of 0.464 (95% confidence interval [CI] = 0.455-0.473) and 0.404 (95% CI = 0.396-0.412) is obtained for pairs of HOs and for AI versus HOs, respectively. At the prognostic level, a Fleiss' kappa coefficient of 0.505 (95% CI = 0.448-0.562) and 0.506 (95% CI = 0.458-0.555) is obtained for pairs of HOs and for AI versus HOs, respectively. CONCLUSIONS: To examine IRR and obtain a reliable evaluation, a minimum of 500 videos is recommended. Moreover, the employed AI algorithms achieve results that are comparable with HOs. This research further provides a methodology that can be useful to benchmark future LUS studies.


Subject(s)
COVID-19 , Humans , Artificial Intelligence , Reproducibility of Results , Lung/diagnostic imaging , Ultrasonography/methods
14.
Tech Coloproctol ; 27(12): 1219-1225, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37036637

ABSTRACT

PURPOSE: When an optical colonoscopy is carried out, Scope Guide can assist the endoscopist in determining the localization. In colon capsule endoscopy (CCE), this support is not available. To our knowledge, the interobserver agreement on landmark identification has never been studied. This study aims to investigate the interobserver agreement on landmark identification in CCE. METHODS: An interobserver study was carried out comparing the landmark identification (the ileocecal valve, hepatic flexure, splenic flexure, and anus) in CCE investigations between an external private contractor and three in-house CCE readers with different levels of experience. All CCE investigations analyzed in this study were carried out as a part of the Danish screening program for colorectal cancer. Patients were between 50 and 74 years old with a positive fecal immunochemical test (FIT). A random sample of 20 CCE investigations was taken from the total sample of more than 800 videos. RESULTS: Overall interobserver agreement on all landmarks was 51%. Interobserver agreement on the first cecal image (ileocecal valve), hepatic flexure, splenic flexure, and last rectal image (anus) was 72%, 29%, 22%, and 83%, respectively. The overall interobserver agreement, including only examinations with adequate bowel preparation (n = 16), was 54%, and for individual landmarks, 73%, 32%, 24%, and 85%. CONCLUSION: Overall interobserver agreement on all four landmarks from CCE was poor. Measures are needed to improve landmark identification in CCE investigations. Artificial intelligence could be a possible solution to this problem.


Subject(s)
Capsule Endoscopy , Colorectal Neoplasms , Humans , Middle Aged , Aged , Observer Variation , Artificial Intelligence , Colorectal Neoplasms/diagnostic imaging , Prospective Studies , Colonoscopy/methods
15.
J Shoulder Elbow Surg ; 32(4): 713-728, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36481456

ABSTRACT

BACKGROUND: Ultrasound is commonly used to assess rotator cuff repair (RCR), but no standardized criterion exists to characterize the tendon. PURPOSE: The aims of this study were to (1) develop content validity for ultrasound-specific criteria to grade the postoperative appearance of a tendon after RCR, (2) assess the reliability of the criteria, and (3) assess the feasibility of using these assessments. METHODOLOGY: Following expert consultation and literature review for content validity, 2 scales were created: (1) the Fibrillar matrix, Echogenicity, Contour, Thickness, and Suture (FECTS) scale and (2) the Rotator Cuff Repair-Investigator Global Assessment (RCR-IGA). A prospective cohort study was undertaken on patients who had received an RCR and serial B-mode ultrasound images. Four raters assessed the 64 ultrasound images in a blinded fashion using the scales created, and reliability was evaluated using intraclass correlation coefficients. RESULTS: The FECTS scale was a composite score with 5 key parameters and the RCR-IGA scale was a 5-point global score. The intrarater reliability for the FECTS scale was excellent for the most experienced rater (0.92) and fair for the rater with no experience (0.72). The intrarater reliability for the RCR-IGA scale was excellent for 3 of the 4 raters (0.80-0.87) and fair when used by the least experienced rater (0.56). Inter-rater testing for all the FECTS scale parameters had excellent reliability (0.82-0.92) except for Fibrillar matrix (0.73). The average time to complete the scale per image was 23 seconds for the FECTS scale and 11 seconds for the RCR-IGA scale. CONCLUSION: The FECTS scale and the RCR-IGA scale are reliable tools to assess the ultrasonic appearance of the repaired rotator cuff tendon. The FECTS scale was more reliable for less experienced assessors. The RCR-IGA scale was easier, more time efficient and reliable for those with experience.


Subject(s)
Rotator Cuff Injuries , Humans , Arthroscopy/methods , Immunoglobulin A , Prospective Studies , Reproducibility of Results , Rotator Cuff/diagnostic imaging , Rotator Cuff/surgery , Rotator Cuff Injuries/diagnostic imaging , Rotator Cuff Injuries/surgery , Sutures , Treatment Outcome , Ultrasonography
16.
Emerg Radiol ; 30(4): 419-423, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37273151

ABSTRACT

INTRODUCTION: Grey Scale Inversion Imaging (GSII), a radiology reading software, has been utilized to improve anatomical and pathological delineation and consequently increase the diagnostic accuracy in a variety of trauma and orthopaedic conditions. OBJECTIVE/AIM: The objective of this study was to assess whether GSII has any impact on the diagnostic accuracy and inter-observer reliability in diagnosing neck of femur fractures. METHOD: We performed a retrospective, single-centre study to identify 50 consecutive anteroposterior (AP) pelvis radiographs of patients who presented to our unit with suspected neck of femur fractures between 2020 and 2021. The images included a combination of normal pelvic radiographs and others with features suggestive of either intracapsular or extracapsular neck of femur fractures, which had been confirmed on computed tomography (CT), magnetic resonance imaging (MRI) and/or subsequent surgery. Four independent observers (two Trauma and Orthopaedics (T&O) consultants, one T&O Trainee Registrar (ST3 level) and one Trainee Senior House Officer (SHO) in T&O) reviewed the images and graded each radiograph using a Likert scale in response to the statement "there is a fracture". Following this, the same radiographs were converted to GSII images and reassessed. RAND correlation was used for statistical analysis. RESULTS: Overall, observers appeared to have similar accuracy with normal radiographic imaging and with GSII images. CONCLUSION: GSII of digital radiographs did not affect the diagnostic accuracy of detecting neck of femur fractures in our study.


Subject(s)
Femoral Fractures , Humans , Reproducibility of Results , Retrospective Studies , Femoral Fractures/diagnostic imaging , X-Ray Computed Tomography/methods , Femur , Observer Variation
17.
J Foot Ankle Surg ; 62(4): 644-650, 2023.
Article in English | MEDLINE | ID: mdl-36813634

ABSTRACT

This study aimed to develop a comprehensive classification system for fractures of the lateral process of the talus (LPTF) based on CT, and to evaluate its prognostic value, reliability and reproducibility. We retrospectively reviewed 42 patients with LPTF, with an average follow-up of 35.9 months, for clinical and radiographic evaluations. In order to develop a comprehensive classification, a panel of experienced orthopedic surgeons discussed the cases. All fractures were classified according to the Hawkins, McCrory-Bladin and newly proposed classifications by 6 observers. Interobserver and intraobserver agreement was measured using kappa statistics. The new classification included 2 types based on the presence or absence of concomitant injuries, with type I consisting of 3 subtypes and type II of 5 subtypes. The average AOFAS score was 91.5 in type Ia of the new classification, 86 in type Ib, 90.5 in type Ic, 89 in type IIa, 76.7 in type IIb, 76.6 in type IIc, 91.3 in type IId, and 83.5 in type IIe. Interobserver and intraobserver reliability of the new classification system were almost perfect (κ = 0.776 and 0.837, respectively), higher than those of the Hawkins classification (κ = 0.572 and 0.649, respectively) and the McCrory-Bladin classification (κ = 0.582 and 0.685, respectively). The new classification system is a comprehensive one that takes into account concomitant injuries and shows good prognostic value with clinical outcomes. It is more reliable and reproducible and could be a useful tool for decision-making on treatment options for LPTF.


Subject(s)
Bone Fractures , Talus , Humans , Reproducibility of Results , Retrospective Studies , Talus/diagnostic imaging , Observer Variation , Bone Fractures/diagnostic imaging , X-Ray Computed Tomography
18.
Turk J Med Sci ; 53(5): 1214-1223, 2023.
Article in English | MEDLINE | ID: mdl-38813029

ABSTRACT

Background and aim: To evaluate and compare magnetic resonance imaging (MRI) sequences that could potentially be used in the diagnosis of coronavirus disease 2019 (COVID-19). Materials and methods: Included in the study were 42 patients who underwent thorax computed tomography (CT) for COVID-19 pneumonia and thorax MRI for any reason within 24 h after CT. The T2-weighted fast spin echo periodically rotated overlapping parallel lines with enhanced reconstruction (PROPELLER) (T2W-FSE-P), fast imaging employing steady-state acquisition, T2 fat-saturated FSE, axial T1 liver acquisition with volume acceleration (LAVA) and single-shot FSE images were compared in terms of their ability to show COVID-19 findings. Results: The mean age of the patients was 47.2 ± 24 years. Of the patients, 22 (52.4%) were male and 20 (47.6%) were female. The interobserver intraclass correlation coefficient (ICC) for the image quality score was highest in the T2W-FSE-P sequence and lowest in the T1 LAVA sequence. All of the lesion-based evaluations of the interobserver agreement were statistically significant, with the kappa value varying between 0.798 and 0.998. Conclusion: All 5 sequences evaluated in the study were successful in showing the parenchymal findings of COVID-19. Since the T2W-FSE-P sequence had the best scores in both interobserver agreement and ICC for the image quality score, it was considered suitable for inclusion in thorax MRI examinations to assist the diagnosis of COVID-19.


Subject(s)
COVID-19 , Magnetic Resonance Imaging , SARS-CoV-2 , Humans , COVID-19/diagnostic imaging , Male , Female , Middle Aged , Magnetic Resonance Imaging/methods , Adult , Thorax/diagnostic imaging , X-Ray Computed Tomography/methods , Aged , Lung/diagnostic imaging
19.
Ophthalmology ; 129(7): e69-e76, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35157950

ABSTRACT

PURPOSE: To validate a vascular severity score as an appropriate output for artificial intelligence (AI) Software as a Medical Device (SaMD) for retinopathy of prematurity (ROP) through comparison with ordinal disease severity labels for stage and plus disease assigned by the International Classification of Retinopathy of Prematurity, Third Edition (ICROP3), committee. DESIGN: Validation study of an AI-based ROP vascular severity score. PARTICIPANTS: A total of 34 ROP experts from the ICROP3 committee. METHODS: Two separate datasets of 30 fundus photographs each for stage (0-5) and plus disease (plus, preplus, neither) were labeled by members of the ICROP3 committee using an open-source platform. Averaging these results produced a continuous label for plus (1-9) and stage (1-3) for each image. Experts were also asked to compare each image to each other in terms of relative severity for plus disease. Each image was also labeled with a vascular severity score from the Imaging and Informatics in ROP deep learning system, which was compared with each grader's diagnostic labels for correlation, as well as the ophthalmoscopic diagnosis of stage. MAIN OUTCOME MEASURES: Weighted kappa and Pearson correlation coefficients (CCs) were calculated between each pair of grader classification labels for stage and plus disease. The Elo algorithm was also used to convert pairwise comparisons for each expert into an ordered set of images from least to most severe. RESULTS: The mean weighted kappa and CC for all interobserver pairs for plus disease image comparison were 0.67 and 0.88, respectively. The vascular severity score was found to be highly correlated with both the average plus disease classification (CC = 0.90, P < 0.001) and the ophthalmoscopic diagnosis of stage (P < 0.001 by analysis of variance) among all experts. 
CONCLUSIONS: The ROP vascular severity score correlates well with the International Classification of Retinopathy of Prematurity committee members' labels for plus disease and stage, which had significant intergrader variability. Generation of a consensus for a validated scoring system for ROP SaMD can facilitate global innovation and regulatory authorization of these technologies.
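The abstract above mentions using the Elo algorithm to turn each expert's pairwise comparisons into an ordering of images from least to most severe. A minimal sketch of that idea follows. The K-factor of 32, the starting rating of 1500, the 400-point logistic scale, the function name `elo_order`, and the image identifiers are all conventional or hypothetical choices, not details reported by the study.

```python
def elo_order(items, comparisons, k=32.0, start=1500.0):
    """Order items from least to most severe using Elo updates.

    items:       iterable of image identifiers.
    comparisons: sequence of (more_severe, less_severe) pairs, processed in order.
    k, start:    assumed defaults (standard chess-style values), not from the study.
    Returns (ordered_items, ratings).
    """
    ratings = {item: start for item in items}
    for more, less in comparisons:
        # Expected probability that `more` is judged the more severe image.
        expected = 1.0 / (1.0 + 10 ** ((ratings[less] - ratings[more]) / 400.0))
        delta = k * (1.0 - expected)
        # Zero-sum update: the "winner" gains what the "loser" gives up.
        ratings[more] += delta
        ratings[less] -= delta
    ordered = sorted(ratings, key=lambda item: ratings[item])
    return ordered, ratings


# Invented example: img_c judged more severe than img_b, and img_b than img_a.
order, scores = elo_order(
    ["img_a", "img_b", "img_c"],
    [("img_b", "img_a"), ("img_c", "img_b"), ("img_c", "img_a")],
)
print(order)
```

Because Elo updates depend on the order in which comparisons are processed, a practical analysis might shuffle and repeat the comparisons several times and average the ratings; that refinement is left out here.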


Subject(s)
Retinopathy of Prematurity , Artificial Intelligence , Diagnostic Imaging , Gestational Age , Humans , Newborn Infant , Ophthalmoscopy/methods , Reproducibility of Results , Retinopathy of Prematurity/diagnosis
20.
Histopathology ; 80(4): 648-655, 2022 Mar.
Article in English | MEDLINE | ID: mdl-34601750

ABSTRACT

AIMS: Management of anal dysplasia relies upon the accurate diagnosis of anal biopsy specimens. As institutions move towards subspecialty signout (SSSO), decisions must be made regarding whether to assign anal biopsies to the gastrointestinal (GI) or gynaecological (GYN) pathology service. MATERIALS AND RESULTS: We identified 200 archival tissue biopsies of anal mucosa and circulated them among three GI pathologists and three GYN pathologists. Each pathologist separately scored each biopsy as normal, atypical, low-grade squamous intra-epithelial lesion (LSIL) or high-grade squamous intra-epithelial lesion (HSIL). Every case that was called HSIL by at least one pathologist was stained with p16 immunostain and a 'gold standard' interpretation of whether or not a case represented HSIL was made. The GI pathologists agreed on 97 (49%) cases prior to consensus; the GYN pathologists agreed on 33 (17%). The sensitivities of the three GI pathologists in detecting HSIL against the 'gold standard' were 47, 100 and 21% and for the GYN pathologists the sensitivities were 74, 89 and 84%; the sensitivities of the GI and GYN consensus diagnoses were 74% each. The specificities of the three GI pathologists in detecting HSIL were 99, 90 and 100% and for the GYN pathologists the specificities were 99, 92 and 91%; the specificities of both the GI and GYN consensus diagnoses were 100%. CONCLUSIONS: A mild to moderate degree of interobserver variability exists in the diagnosis of anal dysplasia among pathologists. Our study indicates the utility of some form of consensus conference, as overall agreement among GI pathologists and among GYN pathologists improved following in-person consensus.
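The sensitivities and specificities quoted above compare each pathologist's HSIL calls with the 'gold standard' interpretation. As a minimal sketch of that calculation, assuming each case is reduced to a boolean "called HSIL" value, one could write the following; the function name and the example calls are hypothetical and do not reproduce the study's data.

```python
def sensitivity_specificity(calls, gold):
    """Sensitivity and specificity of binary calls against a gold standard.

    calls: list of booleans, True if the reader called the case HSIL.
    gold:  list of booleans, True if the gold standard says HSIL.
    Hypothetical helper for illustration only.
    """
    if len(calls) != len(gold):
        raise ValueError("calls and gold must have the same length")

    tp = sum(1 for c, g in zip(calls, gold) if c and g)          # true positives
    fn = sum(1 for c, g in zip(calls, gold) if not c and g)      # missed HSIL
    tn = sum(1 for c, g in zip(calls, gold) if not c and not g)  # true negatives
    fp = sum(1 for c, g in zip(calls, gold) if c and not g)      # overcalls

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative gold case")

    sensitivity = tp / (tp + fn)  # share of gold-standard HSIL that was detected
    specificity = tn / (tn + fp)  # share of non-HSIL correctly not called HSIL
    return sensitivity, specificity


# Invented example: 5 biopsies, 2 of which are HSIL by the gold standard.
reader_calls = [True, True, False, False, False]
gold_standard = [True, False, True, False, False]
print(sensitivity_specificity(reader_calls, gold_standard))
```

A reader who rarely calls HSIL can reach high specificity with low sensitivity, which is the pattern the per-pathologist figures in the abstract illustrate.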


Subject(s)
Anal Canal/pathology , Anus Neoplasms/pathology , Consensus Development Conferences as Topic , Gastroenterology , Gynecology , Clinical Pathology , Biopsy , Humans , Observer Variation