Results 1 - 20 of 134
1.
J Sci Med Sport ; 2024 Aug 15.
Article in English | MEDLINE | ID: mdl-39217069

ABSTRACT

OBJECTIVES: To determine whether spotters with medical training and experience in managing concussion have higher inter-rater reliability and accuracy than non-medical personnel when identifying video signs associated with concussion in Australian football. DESIGN: Retrospective cohort study. METHODS: Video clips were collected of all impacts potentially resulting in concussion during 2012 and 2013 Australian Football League (AFL) seasons. Raters were divided into medical doctors and a non-medical group comprising allied health practitioners (physiotherapists) and non-medical/non-allied health personnel (performance analysts). Raters assessed 102 randomly selected videos for signs of concussion. The inter-rater reliability was calculated. Sensitivity, specificity, positive and negative predictive values were calculated by comparing the rater responses to the consensus opinion from two highly experienced clinicians with expertise in concussion. RESULTS: No statistically significant difference in inter-rater reliability was observed between the medical doctors and the non-medical group. Both groups demonstrated good to excellent agreement for slow to get up, clutching at head/face and facial injury. Both groups displayed intra-class coefficient >0.55 for no protective action-floppy, loss of responsiveness, and motor incoordination, and displayed lowest agreement for no protective action-tonic posturing, impact seizure and blank/vacant look. No statistically significant difference was found between the groups for sensitivity, specificity, positive and negative predictive values for correctly classifying video signs compared to the expert consensus opinion. CONCLUSIONS: After completing sufficient standardised training and testing, medical and non-medical personnel demonstrate comparable reliability in identifying video signs of concussion in professional Australian football and may be suitable for the role of video spotter.
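
The accuracy measures named in this abstract (sensitivity, specificity, positive and negative predictive values) can be illustrated with a minimal sketch. It assumes each rater call and each expert-consensus call is coded as 1 (sign present) or 0 (sign absent); the function name and the sample data are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(rater, reference):
    """Compare binary rater calls with binary reference (consensus) calls."""
    if len(rater) != len(reference):
        raise ValueError("rater and reference must have the same length")

    pairs = list(zip(rater, reference))
    tp = sum(1 for r, g in pairs if r == 1 and g == 1)  # true positives
    tn = sum(1 for r, g in pairs if r == 0 and g == 0)  # true negatives
    fp = sum(1 for r, g in pairs if r == 1 and g == 0)  # false positives
    fn = sum(1 for r, g in pairs if r == 0 and g == 1)  # false negatives

    def ratio(num, den):
        # Return NaN when a metric is undefined (empty denominator).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical example: eight video clips scored by one rater.
example = diagnostic_metrics(
    rater=[1, 1, 0, 0, 1, 0, 0, 0],
    reference=[1, 0, 0, 0, 1, 1, 0, 0],
)
```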

2.
Radiat Oncol ; 19(1): 90, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39010133

ABSTRACT

BACKGROUND: Planning radiation therapy (RT) for pancreatic cancer (PC) requires a dosimetric computed tomography (CT) scan to define the gross tumor volume (GTV). The main objective of this study was to compare the inter-observer variability in RT planning between the arterial and the venous phases following intravenous contrast. METHODS: PANCRINJ was a prospective monocentric study that included twenty patients with non-metastatic PC. Patients underwent a pre-therapeutic CT scan at the arterial and venous phases. The delineation of the GTV was performed by one radiologist (gold standard, GS) and two senior radiation oncologists (operators). The primary objective was to compare the Jaccard conformity index (JCI) for the GTVs, computed between the GS and the operators, between the arterial and the venous phases with a Wilcoxon signed rank test for paired samples. The secondary endpoints were the geographical miss index (GMI), the kappa index, the intra-operator variability, and the dose-volume histograms between the arterial and venous phases. RESULTS: The median JCIs for the arterial and venous phases were 0.50 (range, 0.17-0.64) and 0.41 (range, 0.23-0.61), respectively (p = 0.10). The median GS-GTV was statistically significantly smaller than that of the operators at the arterial (p < 0.0001) and venous phases (p < 0.001). The GMI values were low, with few tumors missed for all patients: the median GMI was 0.07 (range, 0-0.79) and 0.05 (range, 0-0.39) at the arterial and venous phases, respectively (p = 0.15). There was moderate agreement between the radiation oncologists, with a median kappa index of 0.52 (range 0.38-0.57) on the arterial phase and 0.52 (range 0.36-0.57) on the venous phase (p = 0.08). The intra-observer variability for GTV delineation was lower at the venous phase than at the arterial phase for the two operators. There was no significant difference between the arterial and the venous phases regarding the dose-volume histogram for the operators. CONCLUSIONS: Our results showed inter- and intra-observer variability in delineating GTV for PC without significant differences between the arterial and the venous phases. The use of both phases should be encouraged. Our findings suggest the need to provide training for radiation oncologists in pancreatic imaging and to collaborate within a multidisciplinary team.


Subject(s)
Pancreatic Neoplasms , Radiotherapy Planning, Computer-Assisted , Tomography, X-Ray Computed , Humans , Pancreatic Neoplasms/radiotherapy , Pancreatic Neoplasms/diagnostic imaging , Pancreatic Neoplasms/pathology , Radiotherapy Planning, Computer-Assisted/methods , Prospective Studies , Male , Female , Aged , Middle Aged , Tomography, X-Ray Computed/methods , Radiotherapy Dosage , Aged, 80 and over , Observer Variation , Tumor Burden
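
The two overlap measures used in this entry can be sketched as follows. This is a minimal illustration, assuming each delineated GTV is represented as a set of voxel coordinates, that the Jaccard conformity index is intersection over union, and that the geographical miss index is the fraction of the gold-standard volume not covered by the operator. These definitions and the function names are assumptions; the abstract does not give formulas.

```python
def jaccard_index(volume_a, volume_b):
    """Intersection over union of two voxel sets (1.0 = identical volumes)."""
    a, b = set(volume_a), set(volume_b)
    union = a | b
    if not union:
        return 1.0  # two empty volumes are treated as identical
    return len(a & b) / len(union)


def geographical_miss_index(reference, operator):
    """Fraction of the reference volume that the operator failed to include."""
    ref, op = set(reference), set(operator)
    if not ref:
        raise ValueError("reference volume must not be empty")
    return len(ref - op) / len(ref)


# Hypothetical 2D example: two 4-voxel contours sharing 2 voxels.
gold_standard = {(0, 0), (0, 1), (1, 0), (1, 1)}
operator = {(0, 1), (1, 1), (0, 2), (1, 2)}
jci = jaccard_index(gold_standard, operator)
gmi = geographical_miss_index(gold_standard, operator)
```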
3.
J Pers Med ; 14(7)2024 Jul 15.
Article in English | MEDLINE | ID: mdl-39064003

ABSTRACT

BACKGROUND: Managing osteochondral cartilage defects (OCDs) of the talus is a common daily challenge in orthopaedics as they predispose patients to further cartilage damage and progression to osteoarthritis. Therefore, the implementation of a reliable tool to quantify the amount of cartilage damage that is present is of the essence. METHODS: We retrospectively identified 15 adult patients diagnosed with uncontained OCDs of the talus measuring <150 mm2, which were treated arthroscopically with bone marrow stimulation. Five independent assessors evaluated the pre-operative MRI scans with the AMADEUS scoring system (i.e., MR-based pre-operative assessment system) and the intra-/inter-observer variability was then calculated by means of the intraclass correlation coefficients (ICC) and Kappa (κ) statistics, respectively. In addition, the correlation between the mean AMADEUS scores and pre-operative self-reported outcomes as measured by the Manchester-Oxford foot questionnaire (MOxFQ) was assessed. RESULTS: The mean ICC and the κ statistic were 0.82 (95% CI [0.71, 0.94]) and 0.42 (95% CI [0.25, 0.59]). The Pearson correlation coefficient was found to be r = -0.618 (p = 0.014). CONCLUSIONS: The AMADEUS tool, which was originally designed to quantify knee osteochondral defect severity prior to cartilage repair surgery, demonstrated good reliability and moderate inter-observer variability for small OCDs of the talar shoulder. Given the strong negative correlation between the AMADEUS tool and pre-operative clinical scores, this tool could be implemented in clinical practise to reliably quantify the extent of the osteochondral defects of the talus.

4.
Neuroradiology ; 66(11): 2033-2042, 2024 Nov.
Article in English | MEDLINE | ID: mdl-38980343

ABSTRACT

PURPOSE: For patients with vestibular schwannomas (VS), a conservative observational approach is increasingly used. Therefore, accurate and reliable volumetric tumor monitoring is important. Currently, a volumetric cutoff of 20% increase in tumor volume is widely used to define tumor growth in VS. The study investigates the tumor volume dependency on the limits of agreement (LoA) for volumetric measurements of VS by means of an inter-observer study. METHODS: This retrospective study included 100 VS patients who underwent contrast-enhanced T1-weighted MRI. Five observers volumetrically annotated the images. Observer agreement and reliability were measured using the LoA, estimated using the limits of agreement with the mean (LOAM) method, and the intraclass correlation coefficient (ICC). RESULTS: The 100 patients had a median average tumor volume of 903 mm³ (IQR: 193-3101). Patients were divided into four volumetric size categories based on tumor volume quartile. The smallest tumor volume quartile showed a LOAM relative to the mean of 26.8% (95% CI: 23.7-33.6), whereas for the largest tumor volume quartile this figure was found to be 7.3% (95% CI: 6.5-9.7) and when excluding peritumoral cysts: 4.8% (95% CI: 4.2-6.2). CONCLUSION: Agreement limits within volumetric annotation of VS are affected by tumor volume, since the LoA improves with increasing tumor volume. As a result, for tumors larger than 200 mm³, growth can reliably be detected at an earlier stage, compared to the currently widely used cutoff of 20%. However, for very small tumors, growth should be assessed with higher agreement limits than previously thought.


Subject(s)
Contrast Media , Magnetic Resonance Imaging , Neuroma, Acoustic , Observer Variation , Tumor Burden , Humans , Neuroma, Acoustic/diagnostic imaging , Neuroma, Acoustic/pathology , Female , Male , Magnetic Resonance Imaging/methods , Middle Aged , Retrospective Studies , Adult , Aged , Reproducibility of Results , Aged, 80 and over , Image Enhancement/methods
5.
J Imaging ; 10(5)2024 May 09.
Article in English | MEDLINE | ID: mdl-38786570

ABSTRACT

Hyperfluorescence (HF) and reduced autofluorescence (RA) are important biomarkers in fundus autofluorescence images (FAF) for the assessment of health of the retinal pigment epithelium (RPE), an important indicator of disease progression in geographic atrophy (GA) or central serous chorioretinopathy (CSCR). Autofluorescence images have been annotated by human raters, but distinguishing biomarkers (whether signals are increased or decreased) from the normal background proves challenging, with borders being particularly open to interpretation. Consequently, significant variations emerge among different graders, and even within the same grader during repeated annotations. Tests on in-house FAF data show that even highly skilled medical experts, despite previously discussing and settling on precise annotation guidelines, reach a pair-wise agreement measured in a Dice score of no more than 63-80% for HF segmentations and only 14-52% for RA. The data further show that the agreement of our primary annotation expert with herself is a 72% Dice score for HF and 51% for RA. Given these numbers, the task of automated HF and RA segmentation cannot simply be refined to the improvement in a segmentation score. Instead, we propose the use of a segmentation ensemble. Learning from images with a single annotation, the ensemble reaches expert-like performance with an agreement of a 64-81% Dice score for HF and 21-41% for RA with all our experts. In addition, utilizing the mean predictions of the ensemble networks and their variance, we devise ternary segmentations where FAF image areas are labeled either as confident background, confident HF, or potential HF, ensuring that predictions are reliable where they are confident (97% Precision), while detecting all instances of HF (99% Recall) annotated by all experts.

6.
Neurochirurgie ; 70(4): 101566, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38749318

ABSTRACT

BACKGROUND: The results of a clinical trial are given in terms of primary and secondary outcomes that are obtained for each patient. Just as an instrument should provide the same result when the same object is measured repeatedly, the agreement of the adjudication of a clinical outcome between various raters is fundamental to interpret study results. The reliability of the adjudication of study endpoints determined by examination of the electronic case report forms of a pragmatic trial has not previously been tested. METHODS: The electronic case report forms of 62/434 (14%) patients selected to be observed in a study on brain AVMs were independently examined twice (4 weeks apart) by 8 raters who judged whether each patient had reached the following study endpoints: (1) new intracranial hemorrhage related to AVM or to treatment; (2) new non-hemorrhagic neurological event; (3) increase in mRS ≥1; (4) serious adverse events (SAE). Inter- and intra-rater reliability were assessed using Gwet's AC1 (κG) statistics, and correlations with mRS score using Cramer's V test. RESULTS: There was almost perfect agreement for intracranial hemorrhage (92% agreement; κG = 0.84 [95% CI: 0.76-0.93]), and substantial agreement for SAEs (88% agreement; κG = 0.77 [95% CI: 0.67-0.86]) and new non-hemorrhagic neurological event (80% agreement; κG = 0.61 [95% CI: 0.50-0.72]). Most endpoints correlated (V = 0.21-0.57) with an increase in mRS of ≥1, an endpoint which was itself moderately reliable (76% agreement; κG = 0.54 [95% CI: 0.43-0.64]). CONCLUSION: Study endpoints of a pragmatic trial were shown to be reliable. More studies on the reliability of pragmatic trial endpoints are needed.


Subject(s)
Intracranial Arteriovenous Malformations , Humans , Reproducibility of Results , Female , Male , Treatment Outcome , Adult , Intracranial Hemorrhages/etiology , Intracranial Hemorrhages/diagnosis , Middle Aged , Endpoint Determination
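
Gwet's AC1, the agreement statistic reported in this entry, can be sketched for the simplest case of two raters. The study itself used eight raters, so the code below is only an illustration of the two-rater form of the coefficient, with chance agreement computed from the averaged category prevalences. The function name and the sample ratings are hypothetical.

```python
from collections import Counter


def gwet_ac1(rater_a, rater_b, categories=None):
    """Two-rater Gwet's AC1: (p_a - p_e) / (1 - p_e)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")

    # Use the supplied category list, or fall back to the observed categories.
    cats = list(set(categories) if categories is not None
                else set(rater_a) | set(rater_b))
    if len(cats) < 2:
        raise ValueError("AC1 needs at least two categories")

    n = len(rater_a)
    # Observed proportion of subjects on which the raters agree.
    p_a = sum(1 for x, y in zip(rater_a, rater_b) if x == y) / n

    # Average prevalence of each category across both raters.
    counts = Counter(rater_a) + Counter(rater_b)
    prevalence = [counts[c] / (2 * n) for c in cats]

    # Chance agreement under Gwet's model.
    p_e = sum(p * (1 - p) for p in prevalence) / (len(cats) - 1)
    return (p_a - p_e) / (1 - p_e)


# Hypothetical example: 1 = endpoint reached, 0 = not reached.
ac1 = gwet_ac1([1, 1, 1, 1, 0, 1, 1, 1, 1, 0],
               [1, 1, 1, 1, 0, 1, 1, 1, 0, 1])
```

With skewed prevalence such as in this example, AC1 tends to stay closer to the raw percent agreement than Cohen's kappa does, which is one common reason for choosing it.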
7.
Breast Cancer ; 31(4): 671-683, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38619787

ABSTRACT

BACKGROUND: Visual assessment of mammographic breast composition remains the most common method worldwide, although subjective variability limits its reproducibility. This study aimed to investigate the inter- and intra-observer variability in qualitative visual assessment of mammographic breast composition through a multi-institutional observer performance study for the first time in Japan. METHODS: This study enrolled 10 Japanese physicians from five different institutions. They used the new Japanese breast-composition classification system 4th edition to subjectively evaluate the breast composition in 200 pairs of right and left normal mediolateral oblique mammograms (number determined using precise sample size calculations) twice, with a 1-month interval (median patient age: 59 years [range 40-69 years]). The primary endpoint of this study was the inter-observer variability using kappa (κ) value. RESULTS: Inter-observer variability for the four and two classes of breast-composition assessment revealed moderate agreement (Fleiss' κ: first and second reading = 0.553 and 0.587, respectively) and substantial agreement (Fleiss' κ: first and second reading = 0.689 and 0.70, respectively). Intra-observer variability for the four and two classes of breast-composition assessment demonstrated substantial agreement (Cohen's κ, median = 0.758) and almost perfect agreement (Cohen's κ, median = 0.813). Assessments of consensus between the 10 physicians and the automated software Volpara® revealed slight agreement (Cohen's κ; first and second reading: 0.104 and 0.075, respectively). CONCLUSIONS: Qualitative visual assessment of mammographic breast composition using the new Japanese classification revealed excellent intra-observer reproducibility. However, inter-observer variability persists, which presents a challenge in establishing it as the gold standard in Japan.


Subject(s)
Breast Neoplasms , Mammography , Observer Variation , Humans , Middle Aged , Female , Mammography/methods , Adult , Japan , Aged , Reproducibility of Results , Breast Neoplasms/diagnostic imaging , Breast/diagnostic imaging , Breast/pathology , Physicians , Breast Density
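
Fleiss' kappa, the multi-rater statistic reported in this entry, can be sketched with the standard textbook formula. The input layout is an assumption: one row per mammogram pair, one column per composition class, each cell holding the number of readers who chose that class. The function name and the small example table are hypothetical.

```python
def fleiss_kappa(table):
    """Fleiss' kappa for a subjects x categories table of rater counts."""
    if not table or not table[0]:
        raise ValueError("table must contain at least one subject and category")
    n_subjects = len(table)
    n_raters = sum(table[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in table):
        raise ValueError("every subject needs the same number (>= 2) of ratings")
    n_categories = len(table[0])

    # Overall share of all ratings that fell into each category.
    p_j = [sum(row[j] for row in table) / (n_subjects * n_raters)
           for j in range(n_categories)]

    # Per-subject agreement: proportion of rater pairs that agree.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in table]

    p_bar = sum(p_i) / n_subjects      # mean observed agreement
    p_e = sum(p * p for p in p_j)      # agreement expected by chance
    if p_e == 1.0:
        return float("nan")            # undefined: only one category ever used
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical example: 4 subjects, 3 readers, 2 classes (dense / non-dense).
kappa = fleiss_kappa([[3, 0], [2, 1], [0, 3], [3, 0]])
```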
8.
Eur Radiol ; 34(4): 2791-2804, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37733025

ABSTRACT

OBJECTIVES: To investigate the intra- and inter-rater reliability of the total radiomics quality score (RQS) and the reproducibility of individual RQS items' score in a large multireader study. METHODS: Nine raters with different backgrounds were randomly assigned to three groups based on their proficiency with RQS utilization: Groups 1 and 2 represented the inter-rater reliability groups with or without prior training in RQS, respectively; group 3 represented the intra-rater reliability group. Thirty-three original research papers on radiomics were evaluated by raters of groups 1 and 2. Of the 33 papers, 17 were evaluated twice with an interval of 1 month by raters of group 3. Intraclass coefficient (ICC) for continuous variables, and Fleiss' and Cohen's kappa (k) statistics for categorical variables were used. RESULTS: The inter-rater reliability was poor to moderate for total RQS (ICC 0.30-0.55, p < 0.001) and very low to good for item's reproducibility (k - 0.12 to 0.75) within groups 1 and 2 for both inexperienced and experienced raters. The intra-rater reliability for total RQS was moderate for the less experienced rater (ICC 0.522, p = 0.009), whereas experienced raters showed excellent intra-rater reliability (ICC 0.91-0.99, p < 0.001) between the first and second read. Intra-rater reliability on RQS items' score reproducibility was higher and most of the items had moderate to good intra-rater reliability (k - 0.40 to 1). CONCLUSIONS: Reproducibility of the total RQS and the score of individual RQS items is low. There is a need for a robust and reproducible assessment method to assess the quality of radiomics research. CLINICAL RELEVANCE STATEMENT: There is a need for reproducible scoring systems to improve quality of radiomics research and consecutively close the translational gap between research and clinical implementation. KEY POINTS: • Radiomics quality score has been widely used for the evaluation of radiomics studies. • Although the intra-rater reliability was moderate to excellent, intra- and inter-rater reliability of total score and point-by-point scores were low with radiomics quality score. • A robust, easy-to-use scoring system is needed for the evaluation of radiomics research.


Subject(s)
Radiomics , Reading , Humans , Observer Variation , Reproducibility of Results
9.
J Appl Clin Med Phys ; 25(1): e14220, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37994694

ABSTRACT

PURPOSE: This study aimed to demonstrate the potential clinical applicability of an organ-contour-driven auto-matching algorithm in image-guided radiotherapy. METHODS: This study included eleven consecutive patients with cervical cancer who underwent radiotherapy in 23 or 25 fractions. Daily and reference magnetic resonance images were converted into mesh models. A weight-based algorithm was implemented to optimize the distance between the mesh model vertices and surface of the reference model during the positioning process. Within the cost function, weight parameters were employed to prioritize specific organs for positioning. In this study, three scenarios with different weight parameters were prepared. The optimal translation and rotation values for the cervix and uterus were determined based on the calculated translations alone or in combination with rotations, with a rotation limit of ±3°. Subsequently, the coverage probabilities of the following two planning target volumes (PTV), an isotropic 5 mm and anisotropic margins derived from a previous study, were evaluated. RESULTS: The percentage of translations exceeding 10 mm varied from 9% to 18% depending on the scenario. For small PTV sizes, more than 80% of all fractions had a coverage of 80% or higher. In contrast, for large PTV sizes, more than 90% of all fractions had a coverage of 95% or higher. The difference between the median coverage with translational positioning alone and that with both translational and rotational positioning was 1% or less. CONCLUSION: This algorithm facilitates quantitative positioning by utilizing a cost function that prioritizes organs for positioning. Consequently, consistent displacement values were algorithmically generated. This study also revealed that the impact of rotational corrections, limited to ±3°, on PTV coverage was minimal.


Subject(s)
Radiotherapy, Image-Guided , Radiotherapy, Intensity-Modulated , Female , Humans , Radiotherapy, Image-Guided/methods , Radiotherapy Dosage , Radiotherapy Planning, Computer-Assisted/methods , Radiotherapy, Intensity-Modulated/methods , Algorithms
10.
Iran J Pathol ; 18(3): 335-346, 2023.
Article in English | MEDLINE | ID: mdl-37942205

ABSTRACT

Background & Objective: Invasive breast carcinoma (IBC) is the most commonly diagnosed cancer among women in India. The conventional visual method of evaluation of Tumor-Stroma Ratio (TSR) and Stromal Tumor-Infiltrating Lymphocytes (sTIL) appears to be subjective. The present study aims to evaluate the utility of the indigenously designed square grid method for the evaluation of tumor-stroma ratio and stromal tumor-infiltrating lymphocytes in invasive breast carcinoma by assessing the inter-observer variability. Methods: This was a retrospective study conducted at a rural tertiary care referral institute from July 2018 to June 2020. In each case, microphotographs were taken from 10 representative fields in H&E-stained sections for evaluating TSR in low-power and sTIL in high-power. Both parameters were evaluated employing an indigenously designed square grid applied onto microphotographs in PowerPoint slides by making use of principles of the Pythagorean theorem. Both parameters were separately evaluated by two pathologists. Cohen's kappa statistic was used to analyze inter-observer variability. Results: Thirty cases were analyzed. Invasive breast carcinoma of no special type (IBC-NST) was the most common histopathological type (26 cases (86.67%)). For TSR evaluation, a Kappa value of 0.78 suggested substantial agreement with an agreement of 91.67%. For sTIL evaluation, a Kappa value of 0.51 suggested moderate agreement with an agreement of 88.33%. The P-values were statistically highly significant (P<0.001). Conclusion: Square grid method is a novel technique for evaluating TSR and sTIL in invasive breast carcinoma. It can be considered an example of the application of Pythagoras' theorem in Pathology.
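
The relationship between percent agreement and Cohen's kappa reported in this entry can be illustrated with a minimal two-rater sketch. It uses the standard unweighted formula; the function name and the sample category labels are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)

    # Observed agreement: share of items given the same category.
    p_o = sum(1 for x, y in zip(rater_a, rater_b) if x == y) / n

    # Chance agreement from each rater's own category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)

    if p_e == 1.0:
        return float("nan")  # undefined: both raters used one identical category
    return (p_o - p_e) / (1 - p_e)


# Hypothetical example: two pathologists, "high" vs "low" category per field.
pathologist_1 = ["high", "high", "low", "low", "high", "low", "high", "high", "low", "high"]
pathologist_2 = ["high", "low", "low", "low", "high", "low", "high", "high", "high", "high"]
kappa = cohen_kappa(pathologist_1, pathologist_2)
```

In this hypothetical example the raters agree on 8 of 10 fields (80%), yet kappa is noticeably lower because part of that agreement is expected by chance.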

11.
J Magn Reson Imaging ; 2023 Oct 17.
Article in English | MEDLINE | ID: mdl-37846440

ABSTRACT

BACKGROUND: Accurate breast density evaluation allows for more precise risk estimation but suffers from high inter-observer variability. PURPOSE: To evaluate the feasibility of reducing inter-observer variability of breast density assessment through artificial intelligence (AI) assisted interpretation. STUDY TYPE: Retrospective. POPULATION: Six hundred and twenty-one patients without breast prosthesis or reconstructions were randomly divided into training (N = 377), validation (N = 98), and independent test (N = 146) datasets. FIELD STRENGTH/SEQUENCE: 1.5 T and 3.0 T; T1-weighted spectral attenuated inversion recovery. ASSESSMENT: Five radiologists independently assessed each scan in the independent test set to establish the inter-observer variability baseline and to reach a reference standard. Deep learning and three radiomics models were developed for three classification tasks: (i) four Breast Imaging-Reporting and Data System (BI-RADS) breast composition categories (A-D), (ii) dense (categories C, D) vs. non-dense (categories A, B), and (iii) extremely dense (category D) vs. moderately dense (categories A-C). The models were tested against the reference standard on the independent test set. AI-assisted interpretation was performed by majority voting between the models and each radiologist's assessment. STATISTICAL TESTS: Inter-observer variability was assessed using linear-weighted kappa (κ) statistics. Kappa statistics, accuracy, and area under the receiver operating characteristic curve (AUC) were used to assess models against reference standard. RESULTS: In the independent test set, five readers showed an overall substantial agreement on tasks (i) and (ii), but moderate agreement for task (iii). The best-performing model showed substantial agreement with reference standard for tasks (i) and (ii), but moderate agreement for task (iii). With the assistance of the AI models, almost perfect inter-observer agreement was obtained for tasks (i) (mean κ = 0.86), (ii) (mean κ = 0.94), and (iii) (mean κ = 0.94). DATA CONCLUSION: Deep learning and radiomics models have the potential to help reduce inter-observer variability of breast density assessment. LEVEL OF EVIDENCE: 3 TECHNICAL EFFICACY: Stage 1.

12.
J Appl Clin Med Phys ; 24(11): e14170, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37788333

ABSTRACT

INTRODUCTION: In the Library-of-Plans (LoP) approach, correct plan selection is essential for delivering radiotherapy treatment accurately. However, poor image quality of the cone-beam computed tomography (CBCT) may introduce inter-observer variability and thereby hamper accurate plan selection. In this study, we investigated whether new techniques to improve the CBCT image quality and improve consistency in plan selection affect the accuracy of LoP selection in cervical cancer patients. MATERIALS AND METHODS: CBCT images of 12 patients were used to investigate the inter-observer variability of plan selection based on different CBCT image types. Six observers were asked to individually select a plan based on clinical X-ray Volumetric Imaging (XVI) CBCT, iterative reconstructed CBCT (iCBCT) and synthetic CTs (sCT). Selections were performed before and after a consensus meeting with the entire group, in which guidelines were created. A scoring by all observers on the image quality and plan selection procedure was also included. For plan selection, Fleiss' kappa (κ) statistical test was used to determine the inter-observer variability within one image type. RESULTS: The agreement between observers was significantly higher on sCT compared to CBCT. The consensus meeting improved the duration and inter-observer variability. In this manuscript, the guidelines attributed the overall results in the plan selection. Before the meeting, the gold standard was selected in 76% of the cases on XVI CBCT, 74% on iCBCT, and 76% on sCT. After the meeting, the gold standard was selected in 83% of the cases on XVI CBCT, 81% on iCBCT, and 90% on sCT. CONCLUSION: The use of sCTs can increase the agreement of plan selection among observers and the gold standard was indicated to be selected more often. It is important that clear guidelines for plan selection are implemented in order to benefit from the increased image quality, accurate selection, and decreased inter-observer variability.


Subject(s)
Spiral Cone-Beam Computed Tomography , Uterine Cervical Neoplasms , Female , Humans , Uterine Cervical Neoplasms/diagnostic imaging , Uterine Cervical Neoplasms/radiotherapy , Observer Variation , Radiotherapy Planning, Computer-Assisted/methods , Cone-Beam Computed Tomography/methods
13.
J Contemp Brachytherapy ; 15(4): 253-260, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37799120

ABSTRACT

Purpose: Delineation is a critical and challenging step in radiotherapy planning. Differences in delineation among observers are common, despite the presence of contouring guidelines. This study aimed to identify the inter-observer variability in the target volume delineation of computed tomography (CT)-guided brachytherapy for cervical cancer. Material and methods: Four radiation oncologists (ROs) with different expertise levels delineated high-risk (HR) and intermediate-risk (IR) clinical target volume (CTV) according to GYN GEC-ESTRO recommendations, in a blinded manner on every CT set of ten locally advanced cervical cancer cases. The most experienced RO's contours were determined as the index and used for comparison. Dice similarity coefficient (DSC) and pairwise Hausdorff distance (HD) metrics were applied to compare the overlap and gross deviations of all contours. Results: Median DSC for HR-CTV and IR-CTV were 0.73 and 0.76, respectively, and a good concordance was achieved for both in majority of contours. While there was no difference in DSC measurements for HR-CTV among the three ROs, RO-3 provided improved DSC values for IR-CTV (p = 0.01). Median HD95 was 5.02 mm and 6.83 mm, and median HDave was 1.69 mm and 2.21 mm for HR-CTV and IR-CTV, respectively. There was no significant difference among ROs in HR-CTV for HD95 or HDave; however, IR-CTV value was significantly improved according to RO-3 (p = 0.01). Case-by-case HD analysis showed no significant inter-observer variations, except for two cases. Conclusions: The inter-observer agreement is generally high for target volumes in CT-guided brachytherapy for cervical cancer. The agreement is lower for IR-CTV than HR-CTV. The individual characteristics of each case and different expertise levels of the ROs may have caused the differences. Despite the good concordance for delineation, dosimetric consequences can still be clinically significant.
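
The two contour-comparison metrics in this entry can be sketched as follows. This is a minimal illustration, assuming each contour is a set of point coordinates in millimetres; it computes the Dice similarity coefficient and the plain (maximum) symmetric Hausdorff distance, not the 95th-percentile or average variants reported in the abstract. Function names and sample points are hypothetical.

```python
import math


def dice_coefficient(contour_a, contour_b):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(contour_a), set(contour_b)
    if not a and not b:
        return 1.0  # two empty contours are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


def directed_hausdorff(points_a, points_b):
    """Largest distance from any point in A to its nearest point in B."""
    return max(min(math.dist(p, q) for q in points_b) for p in points_a)


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))


# Hypothetical example: an index contour and one observer's contour.
index_contour = {(0, 0), (0, 1), (1, 0), (1, 1)}
observer_contour = {(0, 1), (1, 1), (0, 2), (1, 2)}
dsc = dice_coefficient(index_contour, observer_contour)
hd = hausdorff_distance(index_contour, observer_contour)
```

The brute-force nearest-point search is quadratic in the number of points, which is acceptable for a small illustration but would be slow on full-resolution clinical contours.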

14.
Tumori ; 109(6): 570-575, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37688419

ABSTRACT

This study quantified the incidental dose to the first axillary level (L1) in locoregional treatment plan for breast cancer. Eighteen radiotherapy centres contoured L1-L4 on three different patients (P1,2,3), created the L2-L4 planning target volume (single centre planning target volume, SC-PTV) and elaborated a locoregional treatment plan. The L2-L4 gold standard clinical target volume (CTV) along with the gold standard L1 contour (GS-L1) were created by an expert consensus. The SC-PTV was then replaced by the GS-PTV and the incidental dose to GS-L1 was measured. Dosimetric data were analysed with Kruskal-Wallis test. Plans were intensity modulated radiotherapy (IMRT)-based. P3 with 90° arm setup had statistically significant higher L1 dose across the board than P1 and P2, with the mean dose (Dmean) reaching clinical significance. Dmean of P1 and P2 was consistent with the literature (77.4% and 74.7%, respectively). The incidental dose depended mostly on L1 proportion included in the breast fields, underlining the importance of the setup, even in case of IMRT.


Subject(s)
Breast Neoplasms , Radiotherapy, Intensity-Modulated , Humans , Female , Breast Neoplasms/radiotherapy , Radiotherapy Planning, Computer-Assisted , Radiotherapy Dosage , Observer Variation , Breast
15.
Comput Biol Med ; 159: 106856, 2023 06.
Article in English | MEDLINE | ID: mdl-37075600

ABSTRACT

BACKGROUND: Among all the cancers known today, prostate cancer is one of the most commonly diagnosed in men. With modern advances in medicine, its mortality has been considerably reduced. However, it is still a leading type of cancer in terms of deaths. The diagnosis of prostate cancer is mainly conducted by biopsy test. From this test, Whole Slide Images are obtained, from which pathologists diagnose the cancer according to the Gleason scale. Within this scale from 1 to 5, grade 3 and above is considered malignant tissue. Several studies have shown an inter-observer discrepancy between pathologists in assigning the value of the Gleason scale. Due to the recent advances in artificial intelligence, its application to the computational pathology field with the aim of supporting and providing a second opinion to the professional is of great interest. METHOD: In this work, the inter-observer variability of a local dataset of 80 whole-slide images annotated by a team of 5 pathologists from the same group was analyzed at both area and label level. Four approaches were followed to train six different Convolutional Neural Network architectures, which were evaluated on the same dataset on which the inter-observer variability was analyzed. RESULTS: An inter-observer variability of 0.6946 κ was obtained, with 46% discrepancy in terms of area size of the annotations performed by the pathologists. The best trained models achieved 0.826±0.014κ on the test set when trained with data from the same source. CONCLUSIONS: The obtained results show that deep learning-based automatic diagnosis systems could help reduce the widely-known inter-observer variability that is present among pathologists and support them in their decision, serving as a second opinion or as a triage tool for medical centers.


Subject(s)
Deep Learning , Prostatic Neoplasms , Male , Humans , Artificial Intelligence , Neoplasm Grading , Observer Variation , Reproducibility of Results
16.
Clin Imaging ; 99: 38-40, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37060680

ABSTRACT

Indeterminate lung nodules detected on CT are common findings in clinical practice, and the correct assessment of their size is critical for patient evaluation and management. We compared the stability of three definitions of nodule diameter (Feret's mean diameter, Martin's mean diameter and area-equivalent diameter) to inter-observer variability on a population of 336 solid nodules from 207 subjects. We found that inter-observer agreement was highest with Martin's mean diameter (intra-class correlation coefficient = 0.977, 95% confidence interval = 0.977-0.978), followed by area-equivalent diameter (0.972, 0.971-0.973) and Feret's mean diameter (0.965, 0.964-0.966). The differences were statistically significant. In conclusion, although all three diameter definitions achieved very good inter-observer agreement (ICC > 0.96), Martin's mean diameter was significantly better than the others. Future guidelines may consider adopting Martin's mean diameter as an alternative to the currently used Feret's (caliper) diameter for assessing the size of lung nodules on CT.
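Of the three diameter definitions named above, the area-equivalent diameter has the simplest closed form: it is the diameter of the circle whose area equals the segmented nodule area. A minimal sketch follows, assuming the area is already available in mm². The function name is hypothetical, and the abstract does not describe how the authors computed it.

```python
import math


def area_equivalent_diameter(area_mm2):
    """Diameter (mm) of the circle whose area equals area_mm2."""
    if area_mm2 < 0:
        raise ValueError("area must be non-negative")
    # A = pi * (d / 2) ** 2  =>  d = 2 * sqrt(A / pi)
    return 2.0 * math.sqrt(area_mm2 / math.pi)


# A circular cross-section of radius 5 mm has an equivalent diameter of 10 mm.
print(area_equivalent_diameter(math.pi * 25.0))
```

Feret's and Martin's mean diameters depend on the outline's shape and orientation, so they need the contour itself rather than its area alone.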


Subject(s)
Lung Neoplasms , Humans , Lung Neoplasms/diagnostic imaging , Tomography, X-Ray Computed , Observer Variation , Lung
17.
Med Phys ; 50(6): 3324-3337, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36940384

ABSTRACT

BACKGROUND: Absorbable hydrogel spacer injected between prostate and rectum is gaining popularity for rectal sparing. The spacer alters patient anatomy and thus requires new auto-contouring models. PURPOSE: To report the development and comprehensive evaluation of two deep-learning models for patients injected with a radio-transparent (model I) versus radiopaque (model II) spacer. METHODS AND MATERIALS: Model I was trained and cross-validated by 135 cases with transparent spacer and tested on 24 cases. Using refined training methods, model II was trained and cross-validated by the same dataset, but with the Hounsfield Unit distribution in the spacer overridden by that obtained from ten cases with opaque spacer. Model II was tested on 64 cases. The models auto-contour eight regions of interest (ROIs): spacer, prostate, proximal seminal vesicles (SVs), left and right femurs, bladder, rectum, and penile bulb. Qualitatively, each auto contour (AC), as well as the composite set, was assessed against manual contour (MC), by a radiation oncologist using a 1 (accepted directly or after minor editing), 2 (accepted after moderate editing), 3 (accepted after major editing), and 4 (rejected) scoring scale. The efficiency gain was characterized by the mean score as nearly complete [1-1.75], substantial (1.75-2.5], meaningful (2.5-3.25], and no (3.25-4.00]. Quantitatively, the geometric similarity between AC and MC was evaluated by dice similarity coefficient (DSC) and mean distance to agreement (MDA), using tolerance recommended by AAPM TG-132 Report. The results by the two models were compared to examine the outcome of the refined training methods. The large number of testing cases for model II allowed further investigation of inter-observer variability in clinical dataset. The correlation between score and DSC/MDA was studied on the ROIs with 10 or more counts of each acceptable score (1, 2, 3). 
RESULTS: For model I/model II: the mean score was 3.63/1.30 for transparent/opaque spacer, 2.71/2.16 for prostate, 3.25/2.44 for proximal SVs, 1.13/1.02 for both femurs, 2.25/1.25 for bladder, 3.00/2.06 for rectum, 3.38/2.42 for penile bulb, and 2.79/2.20 for the composite set; the mean DSC was 0.52/0.84 for spacer, 0.84/0.85 for prostate, 0.60/0.62 for proximal SVs, 0.94/0.96 for left femur, 0.95/0.96 for right femur, 0.91/0.95 for bladder, 0.81/0.84 for rectum, and 0.65/0.65 for penile bulb; and the mean MDA was 2.9/0.9 mm for spacer, 1.9/1.7 mm for prostate, 2.4/2.3 mm for proximal SVs, 0.8/0.5 mm for left femur, 0.7/0.5 mm for right femur, 1.5/0.9 mm for bladder, 2.3/1.9 mm for rectum, and 2.2/2.2 mm for penile bulb. Model II showed significantly improved scores for all ROIs, and metrics for spacer, femurs, bladder, and rectum. Significant inter-observer variability was only found for prostate. Highly linear correlation between the score and DSC was found for the two qualified ROIs (prostate and rectum). CONCLUSIONS: The overall efficiency gain was meaningful for model I and substantial for model II. The ROIs meeting the clinical deployment criteria (mean score below 3.25, DSC above 0.8, and MDA below 2.5 mm) included prostate, both femurs, bladder and rectum for both models, and spacer for model II.
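Two quantities in this abstract lend themselves to a short illustration: the Dice similarity coefficient between an auto contour and a manual contour, and the mapping from mean score to efficiency-gain category. The sketch below assumes contours are given as collections of voxel coordinates. Both function names are hypothetical, and the authors' actual implementation is not described in the abstract.

```python
def dice_coefficient(auto_voxels, manual_voxels):
    """Dice similarity coefficient, 2|A n M| / (|A| + |M|), for two voxel sets."""
    auto_set, manual_set = set(auto_voxels), set(manual_voxels)
    if not auto_set and not manual_set:
        # Convention chosen here: two empty contours agree perfectly.
        return 1.0
    overlap = len(auto_set & manual_set)
    return 2.0 * overlap / (len(auto_set) + len(manual_set))


def efficiency_gain(mean_score):
    """Map a mean score to the categories given in the abstract:
    [1, 1.75] nearly complete, (1.75, 2.5] substantial,
    (2.5, 3.25] meaningful, (3.25, 4] no.
    """
    if not 1.0 <= mean_score <= 4.0:
        raise ValueError("mean score must lie in [1, 4]")
    if mean_score <= 1.75:
        return "nearly complete"
    if mean_score <= 2.5:
        return "substantial"
    if mean_score <= 3.25:
        return "meaningful"
    return "no"


# Hypothetical 2D voxel sets sharing two of three voxels each.
auto = {(0, 0), (0, 1), (1, 0)}
manual = {(0, 1), (1, 0), (1, 1)}
print(dice_coefficient(auto, manual))

# Composite-set mean scores reported for model I and model II.
print(efficiency_gain(2.79), efficiency_gain(2.20))
```

Applied to the reported composite-set scores, the mapping gives "meaningful" for model I and "substantial" for model II, consistent with the stated conclusions.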


Subject(s)
Deep Learning , Prostatic Neoplasms , Male , Humans , Hydrogels , Radiotherapy Planning, Computer-Assisted/methods , Prostatic Neoplasms/diagnostic imaging , Prostatic Neoplasms/radiotherapy , Prostate/diagnostic imaging , Prostate/anatomy & histology
18.
J Med Radiat Sci ; 70 Suppl 2: 59-69, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36751021

ABSTRACT

INTRODUCTION: Magnetic resonance imaging (MRI) demonstrates superior soft tissue contrast and is increasingly being used in radiotherapy planning. This study evaluated the impact of an education workshop in minimising inter-observer variation (IOV) for nasopharyngeal organs at risk (OAR) delineation on MRI. METHODS: Ten observers delineated 14 OARs on 4 retrospective nasopharyngeal MRI data sets. Standard contouring guidelines were provided pre-workshop. Following an education workshop on MRI OAR delineation, observers blinded to their original contours repeated the 14 OAR delineations. For comparison, reference volumes were delineated by two head and neck radiation oncologists. IOV was evaluated using dice similarity coefficient (DSC), Hausdorff distance (HD) and relative volume. Location of largest deviations was evaluated with centroid values. Observer confidence pre- and post-workshop was also recorded using a 6-point Likert scale. The workshop was deemed beneficial for an OAR if ≥50% of observers' mean scores improved in any metric and ≥50% of observers' confidence improved. RESULTS: All OARs had ≥50% of observers improve in at least one metric. Base of tongue, larynx, spinal cord and right temporal lobe were the only OARs achieving a mean DSC score of ≥0.7. Base of tongue, left and right lacrimal glands, larynx, left optic nerve and right parotid gland all exhibited statistically significant HD improvements post-workshop (P < 0.05). Brainstem and left and right temporal lobes all had statistically significant relative volume improvements post-workshop (P < 0.05). Post-workshop observer confidence improvement was observed for all OARs (P < 0.001). CONCLUSIONS: The educational workshop reduced IOV and improved observers' confidence when delineating nasopharyngeal OARs on MRI.
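The Hausdorff distance used above measures the worst-case mismatch between two contours. A minimal brute-force sketch follows, assuming each contour is a non-empty list of point coordinates in mm. The function names are hypothetical, and the study may have used a different variant (for example a percentile-based HD), which the abstract does not specify.

```python
import math


def directed_hausdorff(points_a, points_b):
    """Largest distance from any point in points_a to its nearest point in points_b."""
    return max(min(math.dist(a, b) for b in points_b) for a in points_a)


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    return max(directed_hausdorff(points_a, points_b),
               directed_hausdorff(points_b, points_a))


# Hypothetical observer and reference contour points (mm).
observer = [(0.0, 0.0), (1.0, 0.0)]
reference = [(0.0, 0.0), (4.0, 0.0)]
print(hausdorff_distance(observer, reference))
```

Unlike DSC, which summarises overall overlap, HD is driven by the single largest deviation, so the two metrics can disagree for the same contour pair.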


Subject(s)
Magnetic Resonance Imaging , Radiation Oncology , Humans , Retrospective Studies , Neck , Organs at Risk , Radiotherapy Planning, Computer-Assisted/methods , Observer Variation
19.
Comput Biol Med ; 154: 106536, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36708654

ABSTRACT

PROBLEM: Convolutional Neural Networks (CNNs) for medical image analysis usually only output a probability value, providing no further information about the original image or inter-relationships between different images. Dimensionality Reduction Techniques (DRTs) are used for visualization of high dimensional medical image data, but they are not intended for discriminative classification analysis. AIM: We develop an interactive phenotype distribution field visualization system for medical images to accurately reflect the pathological characteristics of lesions and their similarity to assist radiologists in diagnosis and medical research. METHODS: We propose a novel method, Classification Regularized Uniform Manifold Approximation and Projection (UMAP), referred to as CReUMAP, combining the advantages of CNN and DRT, to project the extracted feature vector fused with the malignant probability predicted by a CNN to a two-dimensional space, and then apply a spatial segmentation classifier trained on 2614 ultrasound images for prediction of thyroid nodule malignancy and guidance to radiologists. RESULTS: The CReUMAP embedding correlates well with the TI-RADS categories of thyroid nodules. The parametric version, which embeds an external test dataset of 303 images in the presence of the training data with known pathological diagnosis, significantly improves the benign and malignant nodule diagnostic accuracy (p-value = 0.016) and confidence (p-value = 1.902 × 10⁻⁶) of eight radiologists of different experience levels, as well as their inter-observer agreement (kappa ≥ 0.75). CReUMAP achieves 90.8% accuracy, 92.1% sensitivity and 88.6% specificity on the test set. CONCLUSION: CReUMAP embedding is well correlated with the pathological diagnosis of thyroid nodules, and helps radiologists achieve more accurate, confident and consistent diagnosis. It allows a medical center to generate its locally adapted embedding using an already-trained classification model in an updateable manner on an ever-growing local database, as long as the extracted feature vectors and predicted diagnostic probabilities of the corresponding classification model can be output.


Subject(s)
Thyroid Neoplasms , Thyroid Nodule , Humans , Thyroid Nodule/diagnostic imaging , Ultrasonography/methods , Neural Networks, Computer , Thyroid Neoplasms/diagnostic imaging , Probability
20.
Radiother Oncol ; 180: 109461, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36634852

ABSTRACT

BACKGROUND AND PURPOSE: The use of SBRT for the treatment of oligometastatic prostate cancer is increasing rapidly. While consensus guidelines are available for non-spinal bone metastases, practice continues to vary widely. The aim of this study is to look at inter-observer variability in the contouring of prostate cancer non-spinal bone metastases with different imaging modalities. MATERIALS AND METHODS: 15 metastases from 13 patients treated at our centre were selected. 4 observers independently contoured clinical target volumes (CTV) on planning CT alone, planning CT with MRI fusion, planning CT with PET-CT fusion and planning CT with both MRI and PET-CT fusion combined. The mean inter-observer agreement on each modality was compared by measuring the delineated volume, generalized conformity index (CIgen), and the distance of the centre of mass (dCOM), calculated per metastasis and imaging modality. RESULTS: Mean CTV volume delineated on planning CT with MRI and PET-CT fusion combined was significantly larger compared to other imaging modalities (p = 0.0001). CIgen showed marked variation between modalities, with the highest agreement between planning CT + PET-CT (mean CIgen 0.55, range 0.32-0.73) and planning CT + MRI + PET-CT (mean CIgen 0.59, range 0.34-0.73). dCOM showed small variations between imaging modalities, but a significantly shorter distance was found on planning CT + PET-CT when compared with planning CT + PET-CT + MRI combined (p = 0.03). CONCLUSIONS: Highest consistency in CTV delineation between observers was seen with planning CT + PET-CT and planning CT + PET-CT + MRI combined.
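The abstract does not give a formula for CIgen. A minimal sketch follows, assuming the commonly used pairwise definition: the sum of pairwise intersection volumes divided by the sum of pairwise union volumes over all observer pairs. Contours are represented here as sets of voxel identifiers, and the function name is hypothetical.

```python
from itertools import combinations


def generalized_conformity_index(contours):
    """CIgen = sum over observer pairs |Ai n Aj| / sum over observer pairs |Ai u Aj|.

    Assumption: this pairwise definition matches the one used in the
    study; the abstract itself does not state the formula.
    """
    voxel_sets = [set(c) for c in contours]
    if len(voxel_sets) < 2:
        raise ValueError("at least two contours are required")
    pairs = list(combinations(voxel_sets, 2))
    intersection_total = sum(len(a & b) for a, b in pairs)
    union_total = sum(len(a | b) for a, b in pairs)
    if union_total == 0:
        raise ValueError("all contours are empty")
    return intersection_total / union_total


# Three hypothetical observers contouring overlapping voxel ranges.
observers = [{1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}]
print(generalized_conformity_index(observers))
```

With only two observers this reduces to the Jaccard index, and a value of 1 means all observers drew identical volumes.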


Subject(s)
Bone Neoplasms , Prostatic Neoplasms , Radiosurgery , Radiotherapy Planning, Computer-Assisted , Bone Neoplasms/diagnostic imaging , Bone Neoplasms/radiotherapy , Magnetic Resonance Imaging , Neoplasm Metastasis/diagnostic imaging , Neoplasm Metastasis/radiotherapy , Observer Variation , Positron Emission Tomography Computed Tomography , Prostatic Neoplasms/pathology , Prostatic Neoplasms/surgery , Tomography, X-Ray Computed , Humans , Male