Results 1 - 20 of 1,404
1.
Gastroenterology ; 167(3): 493-504.e10, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38467384

ABSTRACT

BACKGROUND & AIMS: Histologic evaluation of gut biopsies is a cornerstone for diagnosis and management of celiac disease (CeD). Despite its wide use, the method depends on proper biopsy orientation, and it suffers from interobserver variability. Biopsy proteome measurement reporting on the tissue state can be obtained by mass spectrometry analysis of formalin-fixed paraffin-embedded tissue. Here we aimed to transform biopsy proteome data into numerical scores that give observer-independent measures of mucosal remodeling in CeD. METHODS: A pipeline using glass-mounted formalin-fixed paraffin-embedded sections for mass spectrometry-based proteome analysis was established. Proteome data were converted to numerical scores using 2 complementary approaches: a rank-based enrichment score and a score based on machine learning using logistic regression. The 2 scoring approaches were compared with each other and with histology analyzing 18 patients with CeD with biopsies collected before and after treatment with a gluten-free diet as well as biopsies from patients with CeD with varying degree of remission (n = 22). Biopsies from individuals without CeD (n = 32) were also analyzed. RESULTS: The method yielded reliable proteome scoring of both unstained and H&E-stained glass-mounted sections. The scores of the 2 approaches were highly correlated, reflecting that both approaches pick up proteome changes in the same biological pathways. The proteome scores correlated with villus height-to-crypt depth ratio. Thus, the method is able to score biopsies with poor orientation. CONCLUSIONS: Biopsy proteome scores give reliable observer and orientation-independent measures of mucosal remodeling in CeD. The proteomic method can readily be implemented by nonexpert laboratories in parallel to histology assessment and easily scaled for clinical trial settings.


Subjects
Celiac Disease , Gluten-Free Diet , Intestinal Mucosa , Proteome , Proteomics , Celiac Disease/pathology , Celiac Disease/metabolism , Celiac Disease/diagnosis , Humans , Intestinal Mucosa/pathology , Intestinal Mucosa/metabolism , Biopsy , Proteome/analysis , Proteomics/methods , Female , Male , Adult , Machine Learning , Middle Aged , Mass Spectrometry , Observer Variation , Predictive Value of Tests , Paraffin Embedding , Reproducibility of Results , Case-Control Studies
2.
Breast Cancer Res ; 26(1): 31, 2024 Feb 23.
Article in English | MEDLINE | ID: mdl-38395930

ABSTRACT

BACKGROUND: Accurate classification of breast cancer molecular subtypes is crucial in determining treatment strategies and predicting clinical outcomes. This classification largely depends on the assessment of human epidermal growth factor receptor 2 (HER2), estrogen receptor (ER), and progesterone receptor (PR) status. However, variability in interpretation among pathologists poses challenges to the accuracy of this classification. This study evaluates the role of artificial intelligence (AI) in enhancing the consistency of these evaluations. METHODS: AI-powered HER2 and ER/PR analyzers, consisting of cell and tissue models, were developed using 1,259 HER2, 744 ER, and 466 PR-stained immunohistochemistry (IHC) whole-slide images of breast cancer. An external validation cohort comprising HER2, ER, and PR IHCs of 201 breast cancer cases was analyzed with these AI-powered analyzers. Three board-certified pathologists independently assessed these cases without AI annotation. Then, cases with differing interpretations between pathologists and the AI analyzer were revisited with AI assistance, focusing on the influence of AI assistance on the concordance among pathologists during the revised evaluation compared to the initial assessment. RESULTS: Reevaluation was required in 61 (30.3%), 42 (20.9%), and 80 (39.8%) of HER2, in 15 (7.5%), 17 (8.5%), and 11 (5.5%) of ER, and in 26 (12.9%), 24 (11.9%), and 28 (13.9%) of PR evaluations by the pathologists, respectively. Compared to initial interpretations, the assistance of AI led to a notable increase in the agreement among the three pathologists on the status of HER2 (from 49.3 to 74.1%, p < 0.001), ER (from 93.0 to 96.5%, p = 0.096), and PR (from 84.6 to 91.5%, p = 0.006). This improvement was especially evident in cases of HER2 2+ and 1+, where the concordance significantly increased from 46.2 to 68.4% and from 26.5 to 70.7%, respectively. Consequently, a refinement in the classification of breast cancer molecular subtypes (from 58.2 to 78.6%, p < 0.001) was achieved with AI assistance. CONCLUSIONS: This study underscores the significant role of AI analyzers in improving pathologists' concordance in the classification of breast cancer molecular subtypes.


Subjects
Breast Neoplasms , Humans , Female , Breast Neoplasms/diagnosis , Breast Neoplasms/metabolism , Estrogen Receptors/metabolism , Tumor Biomarkers/metabolism , Artificial Intelligence , Observer Variation , Progesterone Receptors/metabolism , ErbB-2 Receptor/metabolism
3.
Breast Cancer Res Treat ; 204(2): 415-422, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38157098

ABSTRACT

PURPOSE: Ki-67 expression levels in breast cancer have prognostic and predictive significance. Therefore, accurate Ki-67 evaluation is important for optimal patient care. Although an algorithm developed by the International Ki-67 in Breast Cancer Working Group (IKWG) improves interobserver variability, it is tedious and time-consuming. In this study, we simplify the IKWG algorithm and evaluate its interobserver agreement among breast pathologists in Ki-67 evaluation. METHODS: Six subspecialized breast pathologists (4 juniors, 2 seniors) assessed the percentage of positive cells in 5% increments in 57 immunostained Ki-67 slides. The time spent on each slide was recorded. Two rounds of a ring study (R1, R2) were performed before and after training with the modified IKWG algorithm (eyeballing method at 400× instead of counting 100 tumor nuclei per area). Concordance was assessed using Kendall's and Kappa coefficients. RESULTS: Analysis of ordinal scale ratings for all categories with 5% increments showed almost perfect agreement in R1 (0.821) and substantial agreement in R2 (0.793); seniors and juniors had substantial agreement in R1 (0.718 vs. 0.649) and R2 (0.756 vs. 0.658). In dichotomous scale analysis using 20% as the cutoff, the overall agreement was moderate in R1 (0.437) and R2 (0.479), among seniors (R1: 0.436; R2: 0.437) and juniors (R1: 0.445; R2: 0.505). Average scoring time per case was higher in R2 (71 vs. 37 s). CONCLUSION: The modified IKWG algorithm does not significantly improve interobserver agreement. A better algorithm or assistance from digital image analysis is needed to improve interobserver variability in Ki-67 evaluation.


Subjects
Breast Neoplasms , Humans , Female , Breast Neoplasms/pathology , Ki-67 Antigen/metabolism , Observer Variation , Pathologists , Breast/pathology , Reproducibility of Results
4.
Breast Cancer Res Treat ; 205(2): 403-411, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38441847

ABSTRACT

PURPOSE: The recent findings from the DESTINY-Breast04 trial highlighted the clinical importance of distinguishing between HER2 immunohistochemistry (IHC) scores 0 and 1+ in metastatic breast cancer (BC). However, pathologist interpretation of HER2 IHC scoring is subjective, and standardized methodology is needed. We evaluated the consistency of HER2 IHC scoring among pathologists and the accuracy of digital image analysis (DIA) in interpreting HER2 IHC staining in cases of HER2-low BC. METHODS: Fifty whole-slide biopsies of BC with HER2 IHC staining were evaluated, comprising 25 cases originally reported as IHC score 0 and 25 as 1+. These slides were digitally scanned. Six pathologists with breast expertise independently reviewed and scored the scanned images, and DIA was applied. Agreement among pathologists and concordance between pathologist scores and DIA results were statistically analyzed using Kendall coefficient of concordance (W) tests. RESULTS: Substantial agreement among at least five of the six pathologists was found for 18 of the score 0 cases (72%) and 15 of the score 1+ cases (60%), indicating excellent interobserver agreement (W = 0.828). DIA scores were highly concordant with pathologist scores in 96% of cases (47/49), indicating excellent concordance (W = 0.959). CONCLUSION: Although breast subspecialty pathologists were relatively consistent in evaluating BC with HER2 IHC scores of 0 and 1+, DIA may be a reliable supplementary tool to enhance the standardization and quantification of HER2 IHC assessment, especially in challenging cases where results may be ambiguous (i.e., scores 0-1+). These findings hold promise for improving the accuracy and consistency of HER2 testing.
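
As a rough illustration of the Kendall coefficient of concordance (W) named in this abstract, the sketch below computes W from a matrix of ranks. It assumes every rater supplies a complete ranking with no ties, which would not hold for two-level scores such as 0 versus 1+ (a real analysis would apply a tie correction). The function name `kendalls_w` and the example ranks are hypothetical and are not taken from the study.

```python
def kendalls_w(rank_matrix):
    """Kendall's W for m raters ranking n items, assuming no tied ranks.

    rank_matrix[r][i] is the rank (1..n) that rater r gave to item i.
    """
    m = len(rank_matrix)       # number of raters
    n = len(rank_matrix[0])    # number of items
    # Sum of the ranks each item received across all raters.
    rank_sums = [sum(row[i] for row in rank_matrix) for i in range(n)]
    # Every complete ranking sums to n(n+1)/2, so the mean rank sum is m(n+1)/2.
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    # W = 12 S / (m^2 (n^3 - n)); 1 means identical rankings.
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical ranks from three raters for four slides.
example_ranks = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [2, 1, 3, 4],
]
print(round(kendalls_w(example_ranks), 3))
```

W ranges from 0 (rank sums do not differ between items) to 1 (all raters give the same ranking).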


Subjects
Breast Neoplasms , Immunohistochemistry , Observer Variation , ErbB-2 Receptor , Humans , Breast Neoplasms/pathology , Breast Neoplasms/metabolism , ErbB-2 Receptor/metabolism , Female , Immunohistochemistry/methods , Reproducibility of Results , Tumor Biomarkers/metabolism , Tumor Biomarkers/analysis , Computer-Assisted Image Processing/methods
5.
Radiology ; 312(1): e233341, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38980184

ABSTRACT

Background Due to conflicting findings in the literature, there are concerns about a lack of objectivity in grading knee osteoarthritis (KOA) on radiographs. Purpose To examine how artificial intelligence (AI) assistance affects the performance and interobserver agreement of radiologists and orthopedists of various experience levels when evaluating KOA on radiographs according to the established Kellgren-Lawrence (KL) grading system. Materials and Methods In this retrospective observer performance study, consecutive standing knee radiographs from patients with suspected KOA were collected from three participating European centers between April 2019 and May 2022. Each center recruited four readers across radiology and orthopedic surgery at in-training and board-certified experience levels. KL grading (KL-0 = no KOA, KL-4 = severe KOA) on the frontal view was assessed by readers with and without assistance from a commercial AI tool. The majority vote of three musculoskeletal radiology consultants established the reference standard. The ordinal receiver operating characteristic method was used to estimate grading performance. Light kappa was used to estimate interrater agreement, and bootstrapped t statistics were used to compare groups. Results Seventy-five studies were included from each center, totaling 225 studies (mean patient age, 55 years ± 15 [SD]; 113 female patients). The KL grades were KL-0, 24.0% (n = 54); KL-1, 28.0% (n = 63); KL-2, 21.8% (n = 49); KL-3, 18.7% (n = 42); and KL-4, 7.6% (n = 17). Eleven readers completed their readings. Three of the six junior readers showed higher KL grading performance with versus without AI assistance (area under the receiver operating characteristic curve, 0.81 ± 0.017 [SEM] vs 0.88 ± 0.011 [P < .001]; 0.76 ± 0.018 vs 0.86 ± 0.013 [P < .001]; and 0.89 ± 0.011 vs 0.91 ± 0.009 [P = .008]). 
Interobserver agreement for KL grading among all readers was higher with versus without AI assistance (κ = 0.77 ± 0.018 [SEM] vs 0.85 ± 0.013; P < .001). Board-certified radiologists achieved almost perfect agreement for KL grading when assisted by AI (κ = 0.90 ± 0.01), which was higher than that achieved by the reference readers independently (κ = 0.84 ± 0.017; P = .01). Conclusion AI assistance increased junior readers' radiographic KOA grading performance and increased interobserver agreement for osteoarthritis grading across all readers and experience levels. Published under a CC BY 4.0 license. Supplemental material is available for this article.


Subjects
Artificial Intelligence , Observer Variation , Knee Osteoarthritis , Humans , Female , Male , Knee Osteoarthritis/diagnostic imaging , Middle Aged , Retrospective Studies , Radiography/methods , Aged
6.
Ann Rheum Dis ; 83(8): 1060-1071, 2024 Jul 15.
Article in English | MEDLINE | ID: mdl-38531611

ABSTRACT

OBJECTIVES: The main objective was to generate a GLobal OMERACT Ultrasound DActylitis Score (GLOUDAS) in psoriatic arthritis and to test its reliability. To this end, we assessed the validity, feasibility and applicability of ultrasound assessment of finger entheses to incorporate them into the scoring system. METHODS: The study consisted of a stepwise process. First, in cadaveric specimens, we identified enthesis sites of the fingers by ultrasound and gross anatomy, and then verified the presence of entheseal tissue in histological samples. We then selected the entheses to be incorporated into a dactylitis scoring system through a Delphi consensus process among international experts. Next, we established and defined the ultrasound components of dactylitis and their scoring systems using Delphi methodology. Finally, we tested the interobserver and intraobserver reliability of the consensus-based scoring system in patients with psoriatic dactylitis. RESULTS: 32 entheses were identified in cadaveric fingers. The presence of entheseal tissues was confirmed in all cadaveric samples. Of these, following the consensus process, 12 entheses were selected for inclusion in GLOUDAS. Ultrasound components of GLOUDAS agreed on through the Delphi process were synovitis, tenosynovitis, enthesitis, subcutaneous tissue inflammation and periextensor tendon inflammation. The scoring system for each component was also agreed on. Interobserver reliability was fair to good (κ 0.39-0.71) and intraobserver reliability good to excellent (κ 0.80-0.88) for dactylitis components. Interobserver and intraobserver agreement for the total B-mode and Doppler mode scores (sum of the scores of the individual abnormalities) were excellent (interobserver intraclass correlation coefficient (ICC) 0.98 for B-mode and 0.99 for Doppler mode; intraobserver ICC 0.98 for both modes). CONCLUSIONS: We have produced a consensus-driven ultrasound dactylitis scoring system that has shown acceptable interobserver reliability and excellent intraobserver reliability. Through anatomical knowledge, small entheses of the fingers were identified and histologically validated.


Subjects
Psoriatic Arthritis , Finger Joint , Severity of Illness Index , Ultrasonography , Humans , Psoriatic Arthritis/diagnostic imaging , Reproducibility of Results , Finger Joint/diagnostic imaging , Finger Joint/pathology , Ultrasonography/methods , Male , Female , Delphi Technique , Synovitis/diagnostic imaging , Synovitis/pathology , Middle Aged , Observer Variation , Enthesopathy/diagnostic imaging , Tenosynovitis/diagnostic imaging , Cadaver , Feasibility Studies , Adult , Aged , Fingers/diagnostic imaging , Fingers/pathology
7.
J Anat ; 244(4): 620-627, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38214341

ABSTRACT

Imaging techniques in anatomy have developed rapidly over the last decades through the emergence of various 3D scanning systems. Depending on the dissection level, non-contact or tactile contact methods can be applied on the targeted structure. The aim of this study was to assess the inter and intra-observer reproducibility of an ArUco-based localisation stylus, that is, a manual technique on a hand-held stylus. Ten fresh-frozen, unembalmed adult arms were used to digitalise the glenoid cartilage related to the glenohumeral joint and the contour of the clavicle cartilage related to the acromioclavicular joint. Three operators performed consecutive digitalisations of each cartilage contour using an ArUco-based localisation stylus recorded by a single monocular camera. The shape of each cartilage was defined by nine shape parameters. Intra-observer repeatability and inter-observer reproducibility were computed using an intra-class correlation (ICC) for each of these parameters. Overall, 35.2 ± 2.4 s and 26.6 ± 10.2 s were required by each examiner to digitalise the contour of a glenoid and acromioclavicular cartilage, respectively. For most parameters, good-to-excellent agreements were observed concerning intra-observer (ICC ranging between 0.81 and 1.00) and inter-observer (ICC ranging between 0.75 and 0.99) reproducibility. To conclude, through a fast and versatile process, the use of an ArUco-based localisation stylus can be a reliable low-cost alternative to conventional imaging methods to digitalise shoulder cartilage contours.
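
The intra-class correlation used in this abstract can be sketched as follows. The abstract does not state which ICC form was computed, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1). The function name `icc_2_1` and the sample values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings[i][j] is the value recorded for subject i by rater j.
    The choice of this ICC form is an assumption, not taken from the study.
    """
    n = len(ratings)       # subjects (e.g. cartilage specimens)
    k = len(ratings[0])    # raters (e.g. operators)
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical shape-parameter values: 4 specimens measured by 3 operators.
sample = [
    [10.1, 10.3, 10.0],
    [12.4, 12.2, 12.5],
    [9.0, 9.4, 9.1],
    [11.2, 11.0, 11.3],
]
print(round(icc_2_1(sample), 3))
```

Because this form penalises systematic offsets between raters, it is stricter than a consistency-type ICC computed on the same data.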


Subjects
Shoulder Joint , Shoulder , Adult , Humans , Reproducibility of Results , Observer Variation , Cartilage
8.
J Vasc Res ; 61(3): 122-128, 2024.
Article in English | MEDLINE | ID: mdl-38547846

ABSTRACT

INTRODUCTION: We aimed to compare conventional vessel wall MR imaging techniques and quantitative susceptibility mapping (QSM) to determine the optimal sequence for detecting carotid artery calcification. METHODS: Twenty-two patients who underwent carotid vessel wall MR imaging and neck CT were enrolled. Four slices of 6-mm sections from the bilateral internal carotid bifurcation were subdivided into 4 segments according to clock position (0-3, 3-6, 6-9, and 9-12) and assessed for calcification. Two blinded radiologists independently reviewed a total of 704 segments and scored the likelihood of calcification using a 5-point scale on spin-echo imaging, FLASH, and QSM. The observer performance for detecting calcification was evaluated by a multireader, multiple-case receiver operating characteristic study. Weighted κ statistics were calculated to assess interobserver agreement. RESULTS: QSM had a mean area under the receiver operating characteristic curve of 0.85, which was significantly higher than that of any other sequence (p < 0.01) and showed substantial interreader agreement (κ = 0.68). A segment with a score of 3-5 was defined as positive, and a segment with a score of 1-2 was defined as negative; the sensitivity and specificity of QSM were 0.75 and 0.87, respectively. CONCLUSION: QSM was the most reliable MR sequence for the detection of plaque calcification.
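
For readers unfamiliar with the weighted κ statistic mentioned in this abstract, a minimal sketch for two readers scoring on an ordinal scale is given below. The abstract does not say whether linear or quadratic weights were used; linear weights are assumed here. The function name `weighted_kappa` and the example scores are hypothetical.

```python
def weighted_kappa(scores_a, scores_b, categories):
    """Linearly weighted Cohen's kappa for two raters on an ordinal scale.

    Linear weighting is an assumption; the abstract does not specify it.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(scores_a)

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(scores_a, scores_b):
        observed[index[a]][index[b]] += 1 / n

    row_marg = [sum(observed[i]) for i in range(k)]
    col_marg = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # disagreement weight: 0 on the diagonal
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row_marg[i] * col_marg[j]

    # Undefined (division by zero) if both raters use a single category only.
    return 1 - weighted_observed / weighted_expected


# Hypothetical 5-point likelihood scores from two readers for eight segments.
reader_1 = [1, 2, 5, 4, 3, 1, 2, 5]
reader_2 = [1, 3, 5, 4, 2, 1, 2, 4]
print(round(weighted_kappa(reader_1, reader_2, [1, 2, 3, 4, 5]), 3))
```

With linear weights, a one-point disagreement on the 5-point scale costs a quarter as much as a disagreement between the two extreme scores.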


Subjects
Carotid Artery Diseases , Observer Variation , Atherosclerotic Plaque , Predictive Value of Tests , Vascular Calcification , Humans , Vascular Calcification/diagnostic imaging , Vascular Calcification/pathology , Female , Male , Aged , Middle Aged , Carotid Artery Diseases/diagnostic imaging , Carotid Artery Diseases/pathology , Reproducibility of Results , Magnetic Resonance Angiography , Retrospective Studies , Aged 80 and Over , Computed Tomography Angiography , Internal Carotid Artery/diagnostic imaging , Internal Carotid Artery/pathology , Magnetic Resonance Imaging
9.
Histopathology ; 85(1): 171-181, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38571446

ABSTRACT

AIMS: Following the increased use of neoadjuvant therapy for pancreatic cancer, grading of tumour regression (TR) has become part of routine diagnostics. However, it suffers from marked interobserver variation, which is mainly ascribed to the subjectivity of the defining criteria of the categories in TR grading systems. We hypothesized that a further cause for the interobserver variation is the use of divergent and nonspecific morphological criteria to identify tumour regression. METHODS AND RESULTS: Twenty treatment-naïve pancreatic cancers and 20 pancreatic cancers treated with neoadjuvant chemotherapy were reviewed by three experienced pancreatic pathologists who, blinded for treatment status, categorized each tumour as treatment-naïve or neoadjuvantly treated, and annotated all tissue areas they considered showing tumour regression. Only 50%-65% of the cases were categorized correctly, and the annotated tissue areas were highly discrepant (only 3%-41% overlap). When the prevalence of various morphological features deemed to indicate TR was compared between treatment-naïve and neoadjuvantly treated tumours, only one pattern, characterized by reduced cancer cell density and prominent stroma affecting a large area of the tumour bed, occurred significantly more frequently, but not exclusively, in the neoadjuvantly treated group. Finally, stromal features, both morphological and biological, were investigated as possible markers for tumour regression, but failed to distinguish TR from native tumour stroma. CONCLUSION: There is considerable divergence in opinion between pathologists when it comes to the identification of tumour regression. Reliable identification of TR is only possible if it is extensive, while lesser degrees of treatment effect cannot be recognized with certainty.


Subjects
Neoadjuvant Therapy , Pancreatic Neoplasms , Humans , Pancreatic Neoplasms/pathology , Pancreatic Neoplasms/diagnosis , Pancreatic Neoplasms/therapy , Male , Female , Aged , Middle Aged , Observer Variation , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , Neoplasm Grading
10.
Histopathology ; 85(1): 81-91, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38477366

ABSTRACT

AIMS: Immune checkpoint inhibitors targeting programmed death-ligand 1 (PD-L1) have shown promising clinical outcomes in urothelial carcinoma (UC). The combined positive score (CPS) quantifies PD-L1 22C3 expression in UC, but it can vary between pathologists due to the consideration of both immune and tumour cell positivity. METHODS AND RESULTS: An artificial intelligence (AI)-powered PD-L1 CPS analyser was developed using 1,275,907 cells and 6175.42 mm2 of tissue annotated by pathologists, extracted from 400 PD-L1 22C3-stained whole slide images of UC. We validated the AI model on 543 UC PD-L1 22C3 cases collected from three institutions. There were 446 cases (82.1%) where the CPS results (CPS ≥10 or <10) were in complete agreement between three pathologists, and 486 cases (89.5%) where the AI-powered CPS results matched the consensus of two or more pathologists. In the pathologist's assessment of the CPS, statistically significant differences were noted depending on the source hospital (P = 0.003). Three pathologists reevaluated discrepancy cases with AI-powered CPS results. After using the AI as a guide and revising, the complete agreement increased to 93.9%. The AI model contributed to improving the concordance between pathologists across various factors including hospital, specimen type, pathologic T stage, histologic subtypes, and dominant PD-L1-positive cell type. In the revised results, the evaluation discordance among slides from different hospitals was mitigated. CONCLUSION: This study suggests that AI models can help pathologists to reduce discrepancies between pathologists in quantifying immunohistochemistry including PD-L1 22C3 CPS, especially when evaluating data from different institutions, such as in a telepathology setting.


Subjects
Artificial Intelligence , B7-H1 Antigen , Transitional Cell Carcinoma , Observer Variation , Urinary Bladder Neoplasms , Humans , B7-H1 Antigen/analysis , B7-H1 Antigen/metabolism , Urinary Bladder Neoplasms/pathology , Urinary Bladder Neoplasms/diagnosis , Urinary Bladder Neoplasms/metabolism , Transitional Cell Carcinoma/pathology , Transitional Cell Carcinoma/metabolism , Transitional Cell Carcinoma/diagnosis , Tumor Biomarkers/analysis , Tumor Biomarkers/metabolism , Urologic Neoplasms/pathology , Urologic Neoplasms/diagnosis , Male , Immunohistochemistry/methods , Female , Aged
11.
Eur J Nucl Med Mol Imaging ; 51(6): 1741-1752, 2024 May.
Article in English | MEDLINE | ID: mdl-38273003

ABSTRACT

PURPOSE: Prostate-specific membrane antigen (PSMA) positron emission tomography/computed tomography (PET/CT) is recognized as the most accurate imaging modality for detection of metastatic high-risk prostate cancer (PCa). Its role in the local staging of disease is yet unclear. We assessed the intra- and interobserver variability, as well as the diagnostic accuracy, of the PSMA PET/CT-based molecular imaging local tumour stage (miT-stage) for local tumour stage assessment in a large, multicentre cohort of patients with intermediate and high-risk primary PCa, with the radical prostatectomy specimen (pT-stage) serving as the reference standard. METHODS: A total of 600 patients who underwent staging PSMA PET/CT before robot-assisted radical prostatectomy were studied. In 579 PSMA-positive primary prostate tumours, a comparison was made between the miT-stage as assessed by four nuclear physicians and the pT-stage according to ISUP protocol. Sensitivity, specificity and diagnostic accuracy were determined. In a representative subset of 100 patients, the intra- and interobserver variability were assessed using kappa estimates. RESULTS: The sensitivity and specificity of the PSMA PET/CT-based miT-stage were 58% and 59% for pT3a-stage, 30% and 97% for ≥ pT3b-stage, and 68% and 61% for overall ≥ pT3-stage, respectively. No statistically significant differences in diagnostic accuracy were found between tracers. We found substantial intra-observer agreement for PSMA PET/CT assessment of ≥ T3-stage (κ 0.70) and ≥ T3b-stage (κ 0.75), whereas the interobserver agreement for the assessment of ≥ T3-stage (κ 0.47) and ≥ T3b-stage (κ 0.41) was moderate. CONCLUSION: In a large, multicentre study evaluating 600 patients with newly diagnosed intermediate and high-risk PCa, we showed that PSMA PET/CT may have value in local tumour staging when pathological tumour stage in the radical prostatectomy specimen was used as the reference standard. The intra-observer and interobserver variability of assessment of tumour extent on PSMA PET/CT was moderate to substantial.


Subjects
Surface Antigens , Glutamate Carboxypeptidase II , Neoplasm Staging , Observer Variation , Positron Emission Tomography Computed Tomography , Prostatic Neoplasms , Humans , Male , Prostatic Neoplasms/diagnostic imaging , Prostatic Neoplasms/pathology , Prostatic Neoplasms/surgery , Aged , Middle Aged , Glutamate Carboxypeptidase II/metabolism
12.
J Magn Reson Imaging ; 60(3): 1037-1048, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38100302

ABSTRACT

BACKGROUND: MR elastography (MRE) may provide quantitative imaging biomarkers of lumbar back muscles (LBMs), complementing MRI in spinal diseases by assessing muscle mechanical properties. However, reproducibility analyses for MRE of LBM are lacking. PURPOSE: To assess technical failure, within-day and inter-day reproducibility, robustness with the excitation source positioning, and inter-observer agreement of MRE of muscles. STUDY TYPE: Prospective. SUBJECTS: Seventeen healthy subjects (mean age 28 ± 4 years; 11 females). FIELD STRENGTH/SEQUENCE: 1.5 T, gradient-echo MRE, T1-weighted turbo spin echo. ASSESSMENT: The pneumatic driver was centered at L3 level. Four MRE were performed during two visits, 2-4 weeks apart, each consisting of two MRE with less than 10 minutes inter-scan interval. At Visit 1, after the first MRE, the coil and driver were removed, then reinstalled. The MRE was repeated. At Visit 2, following the first MRE, only the driver was moved down 5 cm. The MRE was repeated. Two radiologists segmented the multifidus and erector spinae muscles. STATISTICAL TESTS: Paired t-test, analysis of variance, intraclass correlation coefficients (ICCs). P-values <0.05 were considered statistically significant. RESULTS: Mean stiffness of LBM ranged from 1.44 to 1.60 kPa. Mean technical failure rate was 2.5%. Inter-observer agreement was excellent (ICC ranging from 0.82 [0.64-0.96] to 0.99 [0.98-0.99] in the multifidus, and from 0.85 [0.69-0.92] to 0.99 [0.97-0.99] in the erector spinae muscles). Within-day reproducibility was fair in the multifidus (ICC: 0.53 [0.47-0.77]) and good in the erector spinae muscles (ICC: 0.74 [0.48-0.88]). Reproducibility after moving the driver was excellent in both multifidus (ICC: 0.85 [0.69-0.93]) and erector spinae muscles (ICC: 0.84 [0.67-0.92]). Inter-day reproducibility was excellent in the multifidus (ICC: 0.76 [0.48-0.89]) and poor in the erector spinae muscles (ICC: 0.23 [-0.61 to 0.63]). 
DATA CONCLUSION: MRE of LBM provides measurements of stiffness with fair to excellent reproducibility and excellent inter-observer agreement. However, inter-day reproducibility in the multifidus muscles indicated that the herein used MRE protocol may not be optimal for this muscle. EVIDENCE LEVEL: 2 TECHNICAL EFFICACY: Stage 1.


Subjects
Back Muscles , Elasticity Imaging Techniques , Magnetic Resonance Imaging , Humans , Female , Elasticity Imaging Techniques/methods , Reproducibility of Results , Adult , Male , Prospective Studies , Magnetic Resonance Imaging/methods , Back Muscles/diagnostic imaging , Observer Variation , Lumbosacral Region/diagnostic imaging , Healthy Volunteers , Lumbar Vertebrae/diagnostic imaging , Young Adult
13.
BJU Int ; 134(1): 89-95, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38627205

ABSTRACT

OBJECTIVES: To assess the intra- and inter-observer reliability of cystoscopic sphincter evaluation (CSE) in men undergoing sling surgery for urinary incontinence and, if possible, to evaluate its correlation with the final clinical decision. PATIENTS AND METHODS: Two expert urologists prospectively filmed and recorded cystoscopies of incontinent patients according to a standard scenario. Anonymised recordings were randomly offered to the same observer twice. The observers (medical students, urology residents and full urologists with 0-5, 5-10, >10 years of practice, respectively) were asked to assess and score the recordings without knowing any of the patients' characteristics. RESULTS: In total, 37 recordings were scored twice by the 26 observers. The intraclass correlation coefficient (ICC) for intra-observer reliability of the CSE was 0.54 (moderate), 0.58 (moderate) and 0.60 (substantial) for medical students, residents, and urologists, respectively. However, when stratifying observers according to their experience, the lowest agreement values were found between experts with >10 years of experience. The inter-observer reliability ICCs for the CSE ranged between 0.31 and 0.53, with the lowest ICC value observed between urologists (0.31). CONCLUSIONS: The study demonstrates poor intra- and inter-observer reliability of the CSE. According to these results, a CSE does not add valuable information to the clinical evaluation. In this scenario, it should not be considered in isolation from the patient's characteristics.


Subjects
Cystoscopy , Observer Variation , Humans , Male , Reproducibility of Results , Prospective Studies , Suburethral Slings , Middle Aged , Aged , Adult , Urinary Incontinence/diagnosis , Clinical Competence
14.
World J Urol ; 42(1): 450, 2024 Jul 27.
Article in English | MEDLINE | ID: mdl-39066902

ABSTRACT

PURPOSE: Urothelial bladder cancer (UCB) care requires frequent follow-up cystoscopy and surgery. Confocal laser endomicroscopy (CLE) is a probe-based optical technique that can provide real-time microscopic evaluation with the potential for outpatient grading of UCB. This study aims to investigate the diagnostic accuracy and interobserver variability for the grading of UCB with CLE during flexible cystoscopy (fCLE). METHODS: Participants scheduled for transurethral resection of papillary bladder tumors were prospectively included for intra-operative fCLE. Exclusion criteria were flat lesions, fluorescein allergy or pregnancy. Two independent observers evaluated fCLE, classifying tumors as low- or high-grade urothelial carcinoma (LGUC/HGUC) or benign. Interobserver agreement was calculated with Cohen's kappa (κ) and diagnostic accuracy with 2 × 2 tables. Histopathology was the reference test. RESULTS: Histopathology of 34 lesions revealed 14 HGUC, 14 LGUC and 6 benign tumors. Diagnostic yield for fCLE was 80-85% with a κ of 0.75. Respectively, sensitivity, specificity, NPV and PPV were: for benign tumors 0-20%, 96-100%, unmeasurable-50% and 87%, for LGUC 57-64%, 41-58%, 44-53% and 54-69% and for HGUC 38-57%, 56-68%, 38-57% and 56-68%, with an interobserver agreement of κ 0.61. CONCLUSION: fCLE is currently insufficient to grade UCB.
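
The 2 × 2 table calculation referred to in this abstract can be sketched as below. A ratio whose denominator is zero is returned as `None`, which is one plausible way a predictive value ends up reported as unmeasurable (for instance when an observer never assigns a given class). The function name `diagnostic_metrics` and the counts are hypothetical and do not reproduce the study data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2 x 2 table.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives against the reference test.
    A metric is None when its denominator is zero (not measurable).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # detected among reference-positive
        "specificity": ratio(tn, tn + fp),  # cleared among reference-negative
        "ppv": ratio(tp, tp + fp),          # correct among test-positive
        "npv": ratio(tn, tn + fn),          # correct among test-negative
    }


# Hypothetical one-versus-rest table for a class the observer never assigned:
# there are no test-positive calls, so the PPV has no denominator.
print(diagnostic_metrics(tp=0, fp=0, fn=6, tn=28))
```

For a three-class grading task such as benign, LGUC and HGUC, one table per class (that class versus the other two) would be built in this way.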


Subjects
Transitional Cell Carcinoma , Cystoscopy , Confocal Microscopy , Neoplasm Grading , Urinary Bladder Neoplasms , Humans , Confocal Microscopy/methods , Cystoscopy/methods , Urinary Bladder Neoplasms/pathology , Female , Aged , Male , Middle Aged , Transitional Cell Carcinoma/pathology , Prospective Studies , Aged 80 and over , Observer Variation
15.
Eur Radiol ; 34(7): 4494-4503, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38165429

ABSTRACT

OBJECTIVES: The aim of this study is to improve the reliability of subjective image quality (IQ) assessment using a pairwise comparison (PC) method instead of a Likert scale method in abdominal CT scans. METHODS: Abdominal CT scans (single-center) were retrospectively selected between September 2019 and February 2020 in a prior study. Sample variance in IQ was obtained by adding artificial noise using dedicated reconstruction software, including reconstructions with filtered backprojection and varying iterative reconstruction strengths. Two datasets (each n = 50) were composed with either higher or lower IQ variation with the 25 original scans being part of both datasets. Using in-house developed software, six observers (five radiologists, one resident) rated both datasets via both the PC method (forcing observers to choose preferred scans out of pairs of scans resulting in a ranking) and a 5-point Likert scale. The PC method was optimized using a sorting algorithm to minimize necessary comparisons. The inter- and intraobserver agreements were assessed for both methods with the intraclass correlation coefficient (ICC). RESULTS: Twenty-five patients (mean age 61 years ± 15.5; 56% men) were evaluated. The ICC for interobserver agreement for the high-variation dataset increased from 0.665 (95%CI 0.396-0.814) to 0.785 (95%CI 0.676-0.867) when the PC method was used instead of a Likert scale. For the low-variation dataset, the ICC increased from 0.276 (95%CI 0.034-0.500) to 0.562 (95%CI 0.337-0.729). Intraobserver agreement increased for four out of six observers. CONCLUSION: The PC method is more reliable for subjective IQ assessment indicated by improved inter- and intraobserver agreement. CLINICAL RELEVANCE STATEMENT: This study shows that the pairwise comparison method is a more reliable method for subjective image quality assessment. 
Improved reliability is of key importance for optimization studies, validation of automatic image quality assessment algorithms, and training of AI algorithms. KEY POINTS: • Subjective assessment of diagnostic image quality via Likert scale has limited reliability. • A pairwise comparison method improves the inter- and intraobserver agreement. • The pairwise comparison method is more reliable for CT optimization studies.
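
The abstract states that a sorting algorithm was used to reduce the number of pairwise comparisons, but it does not say which one. As a rough sketch of the idea, the code below lets Python's built-in sort ask a simulated observer for forced-choice preferences and counts how many pairs are actually shown. The function names, the scan labels, and the hidden quality scores are all hypothetical; a real observer would answer each comparison by viewing the two scans.

```python
import random
from functools import cmp_to_key


def rank_by_pairwise_comparison(items, prefers):
    """Rank items best-first using only forced-choice answers.

    prefers(a, b) must return True when the observer prefers a over b.
    Returns the ranking and the number of comparisons that were asked.
    """
    comparisons = 0

    def compare(a, b):
        nonlocal comparisons
        comparisons += 1
        return -1 if prefers(a, b) else 1

    ranking = sorted(items, key=cmp_to_key(compare))
    return ranking, comparisons


# Hypothetical dataset of 50 scans with a hidden "true" quality score each.
rng = random.Random(0)
scans = [f"scan_{i:02d}" for i in range(50)]
hidden_quality = dict(zip(scans, rng.sample(range(1000), len(scans))))


def simulated_observer(a, b):
    # Stand-in for a human reader: always prefers the higher hidden quality.
    return hidden_quality[a] > hidden_quality[b]


ranking, comparisons = rank_by_pairwise_comparison(scans, simulated_observer)
exhaustive = len(scans) * (len(scans) - 1) // 2  # every possible pair
print(f"{comparisons} comparisons asked instead of {exhaustive}")
```

Showing every possible pair of 50 scans would require 1,225 comparisons; a comparison sort needs far fewer, which is the practical reason for driving the pairing with a sorting procedure. The sketch assumes a perfectly consistent observer, which real readers are not.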


Subjects
X-Ray Computed Tomography , Humans , Male , Female , X-Ray Computed Tomography/methods , Reproducibility of Results , Middle Aged , Retrospective Studies , Observer Variation , Computer-Assisted Radiographic Image Interpretation/methods , Abdominal Radiography/methods , Algorithms , Software
16.
Eur Radiol ; 34(7): 4801-4809, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38165432

ABSTRACT

OBJECTIVE: To evaluate the learning progress of less experienced readers in prostate MRI segmentation. MATERIALS AND METHODS: One hundred bi-parametric prostate MRI scans were retrospectively selected from the Göteborg Prostate Cancer Screening 2 Trial (single center). Nine readers with varying degrees of segmentation experience were involved: one expert radiologist, two experienced radiology residents, two inexperienced radiology residents, and four novices. The task was to segment the whole prostate gland. The expert's segmentations were used as reference. For all other readers except three novices, the 100 MRI scans were divided into five rounds (cases 1-10, 11-25, 26-50, 51-76, 76-100). Three novices segmented only 50 cases (three rounds). After each round, a one-on-one feedback session between the expert and the reader was held, with feedback on systematic errors and potential improvements for the next round. Dice similarity coefficient (DSC) > 0.8 was considered accurate. RESULTS: Using DSC > 0.8 as the threshold, the novices had a total of 194 accurate segmentations out of 250 (77.6%). The residents had a total of 397/400 (99.2%) accurate segmentations. In round 1, the novices had 19/40 (47.5%) accurate segmentations, in round 2 41/60 (68.3%), and in round 3 84/100 (84.0%) indicating learning progress. CONCLUSIONS: Radiology residents, regardless of prior experience, showed high segmentation accuracy. Novices showed larger interindividual variation and lower segmentation accuracy than radiology residents. To prepare datasets for artificial intelligence (AI) development, employing radiology residents seems safe and provides a good balance between cost-effectiveness and segmentation accuracy. Employing novices should only be considered on an individual basis. CLINICAL RELEVANCE STATEMENT: Employing radiology residents for prostate MRI segmentation seems safe and can potentially reduce the workload of expert radiologists. 
Employing novices should only be considered on an individual basis. KEY POINTS: • Using less experienced readers for prostate MRI segmentation is cost-effective but may reduce quality. • Radiology residents provided high accuracy segmentations while novices showed large inter-reader variability. • To prepare datasets for AI development, employing radiology residents seems safe and might provide a good balance between cost-effectiveness and segmentation accuracy while novices should only be employed on an individual basis.
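
The accuracy criterion in this abstract is the Dice similarity coefficient (DSC), defined for two segmentation masks A and B as 2|A ∩ B| / (|A| + |B|). The following minimal sketch computes it for small 2D binary masks and applies the abstract's DSC > 0.8 threshold. The function name and the toy masks are assumptions for illustration; real prostate segmentations would be 3D volumes.

```python
def dice_similarity(mask_a, mask_b):
    """Dice similarity coefficient of two equally sized 2D binary masks."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("Masks must have the same number of pixels")
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        return 1.0  # two empty masks are treated as identical
    return 2 * intersection / total


# Toy masks (hypothetical): the reference covers 4 pixels, the reader 6.
reference = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
reader = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
dsc = dice_similarity(reference, reader)
is_accurate = dsc > 0.8  # threshold quoted in the abstract
print(f"DSC = {dsc:.2f}, accurate: {is_accurate}")
```

Here the overlap is 4 pixels, so DSC = 8/10 = 0.8. Because the threshold is a strict inequality, this toy segmentation would not count as accurate.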


Subjects
Clinical Competence , Magnetic Resonance Imaging , Prostatic Neoplasms , Humans , Male , Prostatic Neoplasms/diagnostic imaging , Magnetic Resonance Imaging/methods , Retrospective Studies , Internship and Residency , Radiologists , Middle Aged , Radiology/education , Aged , Computer-Assisted Image Interpretation/methods , Prostate/diagnostic imaging , Observer Variation
17.
Eur Radiol ; 34(8): 5415-5424, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38165430

ABSTRACT

OBJECTIVES: The aim of our study was to examine how breast radiologists would be affected by high cancer prevalence and the use of artificial intelligence (AI) for decision support. MATERIALS AND METHOD: This reader study was based on selection of screening mammograms, including the original radiologist assessment, acquired in 2010 to 2013 at the Karolinska University Hospital, with a ratio of 1:1 cancer versus healthy based on a 2-year follow-up. A commercial AI system generated an exam-level positive or negative read, and image markers. Double-reading and consensus discussions were first performed without AI and later with AI, with a 6-week wash-out period in between. The chi-squared test was used to test for differences in contingency tables. RESULTS: Mammograms of 758 women were included, half with cancer and half healthy. 52% were 40-55 years; 48% were 56-75 years. In the original non-enriched screening setting, the sensitivity was 61% (232/379) at specificity 98% (323/379). In the reader study, the sensitivity without and with AI was 81% (307/379) and 75% (284/379) respectively (p < 0.001). The specificity without and with AI was 67% (255/379) and 86% (326/379) respectively (p < 0.001). The tendency to change assessment from positive to negative based on erroneous AI information differed between readers and was affected by type and number of image signs of malignancy. CONCLUSION: Breast radiologists reading a list with high cancer prevalence performed at considerably higher sensitivity and lower specificity than the original screen-readers. Adding AI information, calibrated to a screening setting, decreased sensitivity and increased specificity. CLINICAL RELEVANCE STATEMENT: Radiologist screening mammography assessments will be biased towards higher sensitivity and lower specificity by high-risk triaging and nudged towards the sensitivity and specificity setting of AI reads. 
After AI implementation in clinical practice, there is reason to carefully follow screening metrics to ensure the impact is desired. KEY POINTS: • Breast radiologists' sensitivity and specificity will be affected by changes brought by artificial intelligence. • Reading in a high cancer prevalence setting markedly increased sensitivity and decreased specificity. • Reviewing the binary reads by AI, negative or positive, biased screening radiologists towards the sensitivity and specificity of the AI system.
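
The reader-study percentages in this abstract can be reproduced directly from the counts it reports: sensitivity is the share of the 379 cancer cases read as positive, and specificity is the share of the 379 healthy cases read as negative. The short sketch below does that arithmetic; the helper name and dictionary layout are illustrative choices. The original-screening specificity is left out because the printed 98% does not match its printed count (323/379 is about 85%).

```python
def percent(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("Denominator must be positive")
    return 100.0 * numerator / denominator


CANCER_CASES = 379   # women with cancer, per the abstract
HEALTHY_CASES = 379  # healthy women, per the abstract

# Counts of correct reads as reported in the abstract.
reader_study = {
    "sensitivity without AI": (307, CANCER_CASES),
    "sensitivity with AI": (284, CANCER_CASES),
    "specificity without AI": (255, HEALTHY_CASES),
    "specificity with AI": (326, HEALTHY_CASES),
}

results = {name: percent(n, d) for name, (n, d) in reader_study.items()}
for name, value in results.items():
    print(f"{name}: {value:.1f}%")
```

Rounded to whole numbers these give 81%, 75%, 67%, and 86%, matching the values in the abstract.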


Subjects
Artificial Intelligence , Breast Neoplasms , Early Detection of Cancer , Mammography , Sensitivity and Specificity , Humans , Female , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/epidemiology , Middle Aged , Mammography/methods , Aged , Adult , Prevalence , Early Detection of Cancer/methods , Observer Variation , Clinical Decision Support Systems
18.
Eur Radiol ; 34(8): 5360-5369, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38206404

ABSTRACT

OBJECTIVE: To evaluate the reproducibility of vessel wall magnetic resonance imaging (VW-MRI) in diagnosing giant cell arteritis (GCA) among groups of radiologists with varying levels of expertise. METHODS: This institutional review board-approved retrospective single-center study recruited patients with suspected GCA between December 2014 and September 2021. Patients underwent 3-T VW-MRI before temporal artery biopsy. Ten radiologists with varying levels of expertise, blinded to all data, evaluated several intracranial and extracranial arteries to assess GCA diagnosis. Interobserver reproducibility and diagnostic performance were evaluated. RESULTS: Fifty patients (27 women and 23 men) with a mean age of 75.9 ± 9 years were included. Thirty-one of 50 (62%) had a final diagnosis of GCA. VW-MRI had an almost perfect reproducibility among expert readers (kappa = 0.93; 95% CI 0.77-1) and substantial reproducibility among all readers, junior and non-expert senior readers (kappa = 0.7; 95% CI 0.66-0.73; kappa = 0.67; 95% CI 0.59-0.74; kappa = 0.65; 95% CI 0.43-0.88 respectively) when diagnosing GCA. Substantial interobserver agreement was observed for the frontal branch of superficial temporal artery. Moderate interobserver agreement was observed for the superficial temporal artery and its parietal branch, as well as ophthalmic arteries in all groups of readers. Sensitivity, specificity, positive predictive value, negative predictive value, and accuracy varied depending on the group of readers. CONCLUSION: VW-MRI is a reproducible and accurate imaging modality for detecting GCA, even among less-experienced readers. This study advocates for the use of VW-MRI when diagnosing GCA even in less-experienced centers. CLINICAL RELEVANCE STATEMENT: VW-MRI is a reproducible and accurate imaging modality for detecting GCA, even among less-experienced readers, and it could be used as a first-line diagnostic tool for GCA in centers with limited expertise in GCA diagnosis. 
KEY POINTS: • Vessel wall magnetic resonance imaging (VW-MRI) is a reproducible and accurate imaging modality for detecting giant cell arteritis (GCA) in both extracranial and intracranial arteries. • The reproducibility of vessel wall magnetic resonance imaging for giant cell arteritis diagnosis was high among expert readers and moderate among less-experienced readers. • The use of vessel wall magnetic resonance imaging for giant cell arteritis diagnosis can be recommended even in centers with less-experienced readers.


Subjects
Giant Cell Arteritis , Magnetic Resonance Imaging , Temporal Arteries , Humans , Giant Cell Arteritis/diagnostic imaging , Female , Male , Reproducibility of Results , Aged , Retrospective Studies , Magnetic Resonance Imaging/methods , Temporal Arteries/diagnostic imaging , Temporal Arteries/pathology , Observer Variation , Clinical Competence , Sensitivity and Specificity , Aged 80 and over
19.
Eur Radiol ; 34(8): 5228-5238, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38244046

ABSTRACT

OBJECTIVE: To determine the inter-reader reliability and diagnostic performance of classification and severity scales of Neuropathy Score Reporting And Data System (NS-RADS) among readers of differing experience levels after limited teaching of the scoring system. METHODS: This is a multi-institutional, cross-sectional, retrospective study of MRI cases of proven peripheral neuropathy (PN) conditions. Thirty-two radiology readers with varying experience levels were recruited from different institutions. Each reader attended and received a structured presentation that described the NS-RADS classification system containing examples and reviewed published articles on this subject. The readers were then asked to perform NS-RADS scoring with recording of category, subcategory, and most likely diagnosis. Inter-reader agreements were evaluated by Conger's kappa and diagnostic accuracy was calculated for each reader as percent correct diagnosis. A linear mixed model was used to estimate and compare accuracy between trainees and attendings. RESULTS: Across all readers, agreement was good for NS-RADS category and moderate for subcategory. Inter-reader agreement of trainees was comparable to attendings (0.65 vs 0.65). Reader accuracy for attendings was 75% (95% CI 73%, 77%), slightly higher than for trainees (71% (69%, 72%), p = 0.0006) for nerves and comparable for muscles (attendings, 87.5% (95% CI 86.1-88.8%) and trainees, 86.6% (95% CI 85.2-87.9%), p = 0.4). NS-RADS accuracy was also higher than average accuracy for the most plausible diagnosis for attending radiologists at 67% (95% CI 63%, 71%) and for trainees at 65% (95% CI 60%, 69%) (p = 0.036). CONCLUSION: Non-expert radiologists interpreted PN conditions with good accuracy and moderate-to-good inter-reader reliability using the NS-RADS scoring system. 
CLINICAL RELEVANCE STATEMENT: The Neuropathy Score Reporting And Data System (NS-RADS) is an accurate and reliable MRI-based image scoring system for practical use for the diagnosis and grading of severity of peripheral neuromuscular disorders by both experienced and general radiologists. KEY POINTS: • The Neuropathy Score Reporting And Data System (NS-RADS) can be used effectively by non-expert radiologists to categorize peripheral neuropathy. • Across 32 different experience-level readers, the agreement was good for NS-RADS category and moderate for NS-RADS subcategory. • NS-RADS accuracy was higher than the average accuracy for the most plausible diagnosis for both attending radiologists and trainees (at 75%, 71% and 65%, 65%, respectively).


Subjects
Magnetic Resonance Imaging , Observer Variation , Peripheral Nervous System Diseases , Humans , Peripheral Nervous System Diseases/diagnostic imaging , Magnetic Resonance Imaging/methods , Cross-Sectional Studies , Retrospective Studies , Reproducibility of Results , Female , Male , Middle Aged , Adult , Aged , Severity of Illness Index , Radiologists , Clinical Competence , Radiology/education
20.
Eur Radiol ; 34(7): 4504-4515, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38099965

ABSTRACT

OBJECTIVES: The aim of this proof-of-principle study combining data analysis and computer simulation was to evaluate the robustness of apparent diffusion coefficient (ADC) values for lymph node classification in prostate cancer under conditions comparable to clinical practice. MATERIALS AND METHODS: To assess differences in ADC and inter-rater variability, ADC values of 359 lymph nodes in 101 patients undergoing simultaneous prostate-specific membrane antigen (PSMA)-PET/MRI were retrospectively measured by two blinded readers and compared in a node-by-node analysis with respect to lymph node status. In addition, a phantom and 13 patients with 86 lymph nodes were prospectively measured on two different MRI scanners to analyze inter-scanner agreement. To estimate the diagnostic quality of the ADC in real-world application, a computer simulation was used to emulate the blurring caused by scanner and reader variability. To account for intra-individual correlation, the statistical analyses and simulations were based on linear mixed models. RESULTS: The mean ADC of lymph nodes showing PSMA signals in PET was markedly lower (0.77 × 10⁻³ mm²/s) compared to inconspicuous nodes (1.46 × 10⁻³ mm²/s, p < 0.001). High inter-reader agreement was observed for ADC measurements (ICC 0.93, 95%CI [0.92, 0.95]). Good inter-scanner agreement was observed in the phantom study and confirmed in vivo (ICC 0.89, 95%CI [0.84, 0.93]). With a median AUC of 0.95 (95%CI [0.92, 0.97]), the simulation study confirmed the diagnostic potential of ADC for lymph node classification in prostate cancer. CONCLUSION: Our model-based simulation approach implicates a high potential of ADC for lymph node classification in prostate cancer, even when inter-rater and inter-scanner variability are considered. CLINICAL RELEVANCE STATEMENT: The ADC value shows a high diagnostic potential for lymph node classification in prostate cancer. 
The robustness to scanner and reader variability implicates that this easy to measure and widely available method could be readily integrated into clinical routine. KEY POINTS: • The diagnostic value of the apparent diffusion coefficient (ADC) for lymph node classification in prostate cancer is unclear in the light of inter-rater and inter-scanner variability. • Metastatic and inconspicuous lymph nodes differ significantly in ADC, resulting in a high diagnostic potential that is robust to inter-scanner and inter-rater variability. • ADC has a high potential for lymph node classification in prostate cancer that is maintained under conditions comparable to clinical practice.


Subjects
Lymph Nodes , Lymphatic Metastasis , Prostatic Neoplasms , Humans , Male , Prostatic Neoplasms/diagnostic imaging , Prostatic Neoplasms/pathology , Lymphatic Metastasis/diagnostic imaging , Aged , Middle Aged , Lymph Nodes/diagnostic imaging , Lymph Nodes/pathology , Retrospective Studies , Imaging Phantoms , Diffusion Magnetic Resonance Imaging/methods , Positron-Emission Tomography/methods , Computer Simulation , Prospective Studies , Multimodal Imaging/methods , Observer Variation