Results 1 - 7 of 7
1.
Eur Radiol ; 2024 Jun 11.
Article in English | MEDLINE | ID: mdl-38861161

ABSTRACT

PURPOSE: This work aims to assess standard evaluation practices used by the research community for evaluating medical imaging classifiers, with a specific focus on the implications of class imbalance. The analysis is performed on chest X-rays as a case study and encompasses a comprehensive definition of model performance, considering both discriminative capabilities and model calibration. MATERIALS AND METHODS: We conduct a concise literature review to examine prevailing scientific practices used when evaluating X-ray classifiers. Then, we perform a systematic experiment on two major chest X-ray datasets to showcase a didactic example of the behavior of several performance metrics under different class ratios and highlight how widely adopted metrics can conceal performance in the minority class. RESULTS: Our literature study confirms that: (1) even when dealing with highly imbalanced datasets, the community tends to use metrics that are dominated by the majority class; and (2) it is still uncommon to include calibration studies for chest X-ray classifiers, despite their importance in the context of healthcare. Moreover, our systematic experiments confirm that current evaluation practices may not reflect model performance in real clinical scenarios and suggest complementary metrics to better reflect the performance of the system in such scenarios. CONCLUSION: Our analysis underscores the need for enhanced evaluation practices, particularly in the context of class-imbalanced chest X-ray classifiers. We recommend the inclusion of complementary metrics such as the area under the precision-recall curve (AUC-PR), adjusted AUC-PR, and balanced Brier score, to offer a more accurate depiction of system performance in real clinical scenarios, considering metrics that reflect both discrimination and calibration performance.
CLINICAL RELEVANCE STATEMENT: This study underscores the critical need for refined evaluation metrics in medical imaging classifiers, emphasizing that prevalent metrics may mask poor performance in minority classes, potentially impacting clinical diagnoses and healthcare outcomes. KEY POINTS: Common scientific practices in papers dealing with X-ray computer-assisted diagnosis (CAD) systems may be misleading. We highlight limitations in reporting of evaluation metrics for X-ray CAD systems in highly imbalanced scenarios. We propose adopting alternative metrics based on experimental evaluation on large-scale datasets.
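
The effect of class ratio on these metrics can be illustrated with a small sketch. It is not the authors' code: the toy scores are invented, AUC-PR is approximated by step-wise average precision, and the "balanced" Brier score is assumed here to be the mean of the per-class Brier scores, which may differ from the paper's exact definition. The adjusted AUC-PR is not covered.

```python
def roc_auc(labels, scores):
    """AUROC as the chance that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Assumes no positive shares a score with a negative.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / hits


def brier(labels, probs):
    """Mean squared error between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def balanced_brier(labels, probs):
    """ASSUMPTION: mean of the Brier scores computed separately per class."""
    per_class = []
    for c in (0, 1):
        errors = [(p - y) ** 2 for y, p in zip(labels, probs) if y == c]
        per_class.append(sum(errors) / len(errors))
    return sum(per_class) / 2


def toy_dataset(neg_copies):
    """Hypothetical scores: 2 positives, 8 * neg_copies negatives."""
    pos_scores = [0.9, 0.4]
    neg_scores = [0.6, 0.5, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1] * neg_copies
    labels = [1] * len(pos_scores) + [0] * len(neg_scores)
    return labels, pos_scores + neg_scores


for copies in (1, 10):
    labels, scores = toy_dataset(copies)
    print(
        f"negatives x{copies}:",
        f"AUROC={roc_auc(labels, scores):.3f}",
        f"AP={average_precision(labels, scores):.3f}",
        f"Brier={brier(labels, scores):.3f}",
        f"balanced Brier={balanced_brier(labels, scores):.3f}",
    )
```

In this toy setting, replicating the negatives tenfold leaves AUROC at 0.875 while average precision falls from 0.75 to about 0.545. The plain Brier score improves slightly because the easy majority class dominates it, whereas the per-class average is unchanged.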

2.
Sci Data ; 11(1): 511, 2024 May 17.
Article in English | MEDLINE | ID: mdl-38760409

ABSTRACT

The development of successful artificial intelligence models for chest X-ray analysis relies on large, diverse datasets with high-quality annotations. While several databases of chest X-ray images have been released, most include disease diagnosis labels but lack detailed pixel-level anatomical segmentation labels. To address this gap, we introduce an extensive chest X-ray multi-center segmentation dataset with uniform and fine-grain anatomical annotations for images coming from five well-known publicly available databases: ChestX-ray8, CheXpert, MIMIC-CXR-JPG, Padchest, and VinDr-CXR, resulting in 657,566 segmentation masks. Our methodology utilizes the HybridGNet model to ensure consistent and high-quality segmentations across all datasets. Rigorous validation, including expert physician evaluation and automatic quality control, was conducted to validate the resulting masks. Additionally, we provide individualized quality indices per mask and an overall quality estimation per dataset. This dataset serves as a valuable resource for the broader scientific community, streamlining the development and assessment of innovative methodologies in chest X-ray analysis.


Subjects
Radiography, Thoracic ; Humans ; Databases, Factual ; Artificial Intelligence ; Lung/diagnostic imaging
3.
World Neurosurg ; 187: e363-e382, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38649028

ABSTRACT

BACKGROUND: Measuring spinal alignment with radiological parameters is essential in patients with spinal conditions likely to be treated surgically. These evaluations are not usually included in the radiological report. As a result, spinal surgeons commonly perform the measurement, which is time-consuming and subject to errors. We aim to develop a fully automated artificial intelligence (AI) tool to assist in measuring alignment parameters in whole-spine lateral radiographs (WSL X-rays). METHODS: We developed a tool called Vertebrai that automatically calculates the global spinal parameters (GSPs): pelvic incidence, sacral slope, pelvic tilt, L1-L4 angle, L4-S1 lumbo-pelvic angle, T1 pelvic angle, sagittal vertical axis, cervical lordosis, C1-C2 lordosis, lumbar lordosis, mid-thoracic kyphosis, proximal thoracic kyphosis, global thoracic kyphosis, T1 slope, C2-C7 plummet, spino-sacral angle, C7 tilt, global tilt, spinopelvic tilt, and hip odontoid axis. We assessed human-AI interaction instead of AI performance alone. We compared the time to measure GSPs and inter-rater agreement with and without AI assistance. Two institutional datasets were created, with 2267 multilabel images for classification and 784 WSL X-rays with reference-standard landmarks labeled by spinal surgeons. RESULTS: Vertebrai significantly reduced measurement time: spine surgeons with AI assistance took 3 minutes, versus 0.26 minutes for the AI algorithm alone, without human intervention (P < 0.05). Vertebrai achieved an average accuracy of 83% in detecting abnormal alignment values, with the sacral slope parameter exhibiting the lowest accuracy at 61.5% and spinopelvic tilt demonstrating the highest accuracy at 100%. Intraclass correlation analysis revealed a high level of correlation and consistency in the global alignment parameters. CONCLUSIONS: Vertebrai's measurements can accurately detect alignment parameters, making it a promising tool for measuring GSPs automatically.
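
Three of the listed parameters can be derived from a handful of landmarks, as the following sketch suggests. It uses the usual geometric definitions of sacral slope, pelvic tilt, and pelvic incidence, not Vertebrai's actual implementation, which the abstract does not describe. The coordinate convention (y pointing up, patient facing +x) and all landmark names and values are assumptions made for illustration.

```python
import math


def _angle_deg(u, v):
    """Unsigned angle between two 2D vectors, in degrees."""
    dot = u[0] * v[0] + u[1] * v[1]
    cross = u[0] * v[1] - u[1] * v[0]
    return math.degrees(math.atan2(abs(cross), dot))


def pelvic_parameters(s1_posterior, s1_anterior, femoral_head):
    """Return (sacral slope, pelvic tilt, pelvic incidence) in degrees.

    Landmarks are (x, y) points: the posterior and anterior corners of the
    S1 superior endplate and the centre of the femoral heads.
    """
    mid = ((s1_posterior[0] + s1_anterior[0]) / 2,
           (s1_posterior[1] + s1_anterior[1]) / 2)
    dx = s1_anterior[0] - s1_posterior[0]
    dy = s1_anterior[1] - s1_posterior[1]

    # Sacral slope: angle between the S1 endplate and the horizontal.
    sacral_slope = math.degrees(math.atan2(abs(dy), abs(dx)))

    # Pelvic tilt: angle between the vertical and the line from the
    # femoral head centre to the endplate midpoint.
    to_mid = (mid[0] - femoral_head[0], mid[1] - femoral_head[1])
    pelvic_tilt = _angle_deg(to_mid, (0.0, 1.0))

    # Pelvic incidence: angle between the endplate normal (the one pointing
    # downwards) and the line from the midpoint to the femoral head centre.
    normal = (-dy, dx) if dx < 0 else (dy, -dx)
    to_head = (femoral_head[0] - mid[0], femoral_head[1] - mid[1])
    pelvic_incidence = _angle_deg(normal, to_head)

    return sacral_slope, pelvic_tilt, pelvic_incidence


# Hypothetical landmarks built so that the slope is 40 degrees and the tilt 15.
posterior = (0.0, 0.0)
anterior = (4 * math.cos(math.radians(40)), -4 * math.sin(math.radians(40)))
midpoint = (anterior[0] / 2, anterior[1] / 2)
head = (midpoint[0] + 10 * math.sin(math.radians(15)),
        midpoint[1] - 10 * math.cos(math.radians(15)))

ss, pt, pi_angle = pelvic_parameters(posterior, anterior, head)
print(f"SS={ss:.1f}  PT={pt:.1f}  PI={pi_angle:.1f}")
```

For this configuration the three values satisfy the geometric identity PI = PT + SS, which is a convenient sanity check on automatically detected landmarks.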


Subjects
Artificial Intelligence ; Humans ; Radiography/methods ; Spine/diagnostic imaging ; Spine/surgery ; Female ; Male ; Lordosis/diagnostic imaging ; Lordosis/surgery ; Adult ; Middle Aged
4.
IEEE Trans Med Imaging ; 42(2): 546-556, 2023 02.
Article in English | MEDLINE | ID: mdl-36423313

ABSTRACT

Anatomical segmentation is a fundamental task in medical image computing, generally tackled with fully convolutional neural networks which produce dense segmentation masks. These models are often trained with loss functions such as cross-entropy or Dice, which assume pixels to be independent of each other, thus ignoring topological errors and anatomical inconsistencies. We address this limitation by moving from pixel-level to graph representations, which allow anatomical constraints to be incorporated naturally by construction. To this end, we introduce HybridGNet, an encoder-decoder neural architecture that leverages standard convolutions for image feature encoding and graph convolutional neural networks (GCNNs) to decode plausible representations of anatomical structures. We also propose a novel image-to-graph skip connection layer which allows localized features to flow from standard convolutional blocks to GCNN blocks, and show that it improves segmentation accuracy. The proposed architecture is extensively evaluated in a variety of domain shift and image occlusion scenarios, and audited considering different types of demographic domain shift. Our comprehensive experimental setup compares HybridGNet with other landmark and pixel-based models for anatomical segmentation in chest X-ray images, and shows that it produces anatomically plausible results in challenging scenarios where other models tend to fail.
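
The point that pixel-wise scores ignore topology can be seen in a toy example. The sketch below is not HybridGNet or any part of it. It compares two invented predictions that obtain the same Dice coefficient even though one of them contains a disconnected island, which is anatomically implausible for a single organ. Masks are represented as sets of (row, column) pixels and connectivity is assumed to be 4-neighbour.

```python
def dice(mask_a, mask_b):
    """Dice coefficient between two masks given as sets of (row, col) pixels."""
    return 2 * len(mask_a & mask_b) / (len(mask_a) + len(mask_b))


def count_components(mask):
    """Number of 4-connected components in a mask (iterative flood fill)."""
    remaining = set(mask)
    components = 0
    while remaining:
        components += 1
        stack = [remaining.pop()]
        while stack:
            r, c = stack.pop()
            for neighbour in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    stack.append(neighbour)
    return components


# Hypothetical ground truth: a 3x3 block of pixels.
truth = {(r, c) for r in range(3) for c in range(3)}

# Prediction A: one extra pixel touching the block (small boundary error).
pred_boundary = truth | {(0, 3)}

# Prediction B: one extra pixel far away (a spurious second structure).
pred_island = truth | {(5, 5)}

for name, pred in (("boundary", pred_boundary), ("island", pred_island)):
    print(name, f"Dice={dice(truth, pred):.3f}", f"components={count_components(pred)}")
```

Both predictions score 18/19 on Dice, yet only the second breaks the expected topology, which is the kind of error a pixel-independent loss cannot penalise.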


Subjects
Image Processing, Computer-Assisted ; Neural Networks, Computer ; X-Rays ; Image Processing, Computer-Assisted/methods ; Radiography ; Thorax/diagnostic imaging
5.
Neurol India ; 71(5): 902-906, 2023.
Article in English | MEDLINE | ID: mdl-37929425

ABSTRACT

Background: The delay in the referral of patients with potential surgical vertebral metastasis (VM) to the spine surgeon is strongly associated with a worse outcome. The spinal instability neoplastic score (SINS) allows for determining the risk of instability of a spine segment with VM; however, it is almost exclusively used by specialists or residents in neurosurgery or orthopedics. The objective of this work is to report the delay in surgical consultation of patients with potentially unstable and unstable VM (SINS >6) at our center. Material: We performed a 5-year single-center retrospective analysis of patients with spine metastasis on computed tomography (CT). Patients were divided into Group 1 (G1), potentially unstable VM (SINS 7-12), and Group 2 (G2), unstable VM (SINS 13-18). Time to surgical referral was calculated as the number of days between the report of the VM in the CT and the first clinical assessment of a spinal surgeon on the medical records. Results: We analyzed 220 CT scans, and 98 met the selection criteria. Group 1 had 85 patients (86.7%) and Group 2 had 13 (13.3%). We observed a mean time to referral of 83.5 days in the entire cohort (std = 127.6); 87.6 days (std = 135.1) for G1, and 57.2 days (std = 53.8) for G2. The delay in referral showed no significant correlation with the SINS score. Conclusion: We report a mean delay of 83.5 days in the surgical referral of VM (SINS >6, n = 98). Both groups showed cases of serious referral delay, with 25% of patients having the first surgical consultation more than three months after the CT study.
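
The grouping and delay calculation described above can be sketched as follows. The SINS cut-offs follow the abstract (7-12 potentially unstable, 13-18 unstable), with scores of 0-6 treated as stable. The function names and the patient records are hypothetical and are not data from the study.

```python
from datetime import date
from statistics import mean


def sins_group(score):
    """Map a SINS score (0-18) to the groups used in the abstract."""
    if not 0 <= score <= 18:
        raise ValueError("SINS ranges from 0 to 18")
    if score <= 6:
        return "stable"  # outside the study's selection criteria (SINS > 6)
    if score <= 12:
        return "G1"      # potentially unstable
    return "G2"          # unstable


def referral_delay_days(ct_report, first_consult):
    """Days between the CT report and the first spinal surgeon assessment."""
    return (first_consult - ct_report).days


# Hypothetical records: (SINS score, CT report date, first surgical consult).
records = [
    (8, date(2020, 1, 1), date(2020, 3, 31)),
    (11, date(2020, 2, 10), date(2020, 2, 20)),
    (14, date(2020, 5, 4), date(2020, 6, 3)),
]

delays = {}
for score, ct_report, consult in records:
    delays.setdefault(sins_group(score), []).append(
        referral_delay_days(ct_report, consult)
    )

for group, values in sorted(delays.items()):
    print(group, f"n={len(values)}", f"mean delay={mean(values):.1f} days")
```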


Subjects
Spinal Neoplasms ; Humans ; Latin America ; Retrospective Studies ; Spinal Neoplasms/diagnostic imaging ; Spinal Neoplasms/pathology ; Spinal Neoplasms/secondary ; Spinal Neoplasms/surgery ; Surgeons ; Referral and Consultation ; Time-to-Treatment ; Tomography, X-Ray Computed ; Spine/diagnostic imaging ; Spine/pathology ; Spine/surgery
6.
Stud Health Technol Inform ; 294: 8-12, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612006

ABSTRACT

The acceptance of artificial intelligence (AI) systems by health professionals is crucial to obtain a positive impact on the diagnosis pathway. We evaluated user satisfaction with an AI system for the automated detection of findings in chest x-rays, after five months of use at the Emergency Department. We collected quantitative and qualitative data to analyze the main aspects of user satisfaction, following the Technology Acceptance Model. We selected the intended users of the system as study participants: radiology residents and emergency physicians. We found that both groups of users shared a high satisfaction with the system's ease of use, while their perception of output quality (i.e., diagnostic performance) differed notably. The perceived usefulness of the application yielded positive evaluations, focusing on its utility to confirm that no findings were omitted, and also presenting distinct patterns across the two groups of users. Our results highlight the importance of clearly differentiating the intended users of AI applications in clinical workflows, to enable the design of specific modifications that better suit their particular needs. This study confirmed that measuring user acceptance and recognizing the perception that professionals have of the AI system after daily use can provide important insights for future implementations.


Subjects
Artificial Intelligence ; Personal Satisfaction ; Hospitals ; Humans ; Radiography ; X-Rays
7.
Comput Methods Programs Biomed ; 206: 106130, 2021 Jul.
Article in English | MEDLINE | ID: mdl-34023576

ABSTRACT

BACKGROUND AND OBJECTIVES: The multiple chest X-ray datasets released in recent years have ground-truth labels intended for different computer vision tasks, suggesting that performance in automated chest X-ray interpretation might improve by using a method that can exploit diverse types of annotations. This work presents a Deep Learning method based on the late fusion of different convolutional architectures, which allows training with heterogeneous data with a simple implementation, and evaluates its performance on independent test data. We focused on obtaining a clinically useful tool that could be successfully integrated into a hospital workflow. MATERIALS AND METHODS: Based on expert opinion, we selected four target chest X-ray findings, namely lung opacities, fractures, pneumothorax and pleural effusion. For each finding we defined the most suitable type of ground-truth label, and built four training datasets combining images from public chest X-ray datasets and our institutional archive. We trained four different Deep Learning architectures and combined their outputs with a late fusion strategy, obtaining a unified tool. The performance was measured on two test datasets: an external openly-available dataset, and a retrospective institutional dataset, to estimate performance on the local population. RESULTS: The external and local test sets had 4376 and 1064 images, respectively, for which the model showed an area under the Receiver Operating Characteristics curve of 0.75 (95%CI: 0.74-0.76) and 0.87 (95%CI: 0.86-0.89) in the detection of abnormal chest X-rays. For the local population, a sensitivity of 86% (95%CI: 84-90), and a specificity of 88% (95%CI: 86-90) were obtained, with no significant differences between demographic subgroups. We present examples of heatmaps to show the accomplished level of interpretability, examining true and false positives.
CONCLUSION: This study presents a new approach for exploiting heterogeneous labels from different chest x-ray datasets, by choosing Deep Learning architectures according to the radiological characteristics of each pathological finding. We estimated the tool's performance on the local population, obtaining results comparable to state-of-the-art metrics. We believe this approach is closer to the actual reading process of chest x-rays by professionals, and therefore more likely to be successful in a real clinical setting.
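
A late fusion step of this kind might look like the sketch below. The abstract does not state the fusion rule, so taking the maximum of the four per-finding scores is an assumption. The finding keys, the 0.5 threshold, and the example studies are likewise invented for illustration.

```python
FINDINGS = ("lung_opacity", "fracture", "pneumothorax", "pleural_effusion")


def fuse_max(finding_scores):
    """ASSUMED fusion rule: a study is as abnormal as its most likely finding."""
    return max(finding_scores[name] for name in FINDINGS)


def sensitivity_specificity(labels, predictions):
    """Sensitivity and specificity for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical outputs of four per-finding models, with an abnormal/normal label.
studies = [
    ({"lung_opacity": 0.80, "fracture": 0.10, "pneumothorax": 0.05, "pleural_effusion": 0.20}, 1),
    ({"lung_opacity": 0.20, "fracture": 0.10, "pneumothorax": 0.10, "pleural_effusion": 0.30}, 0),
    ({"lung_opacity": 0.30, "fracture": 0.40, "pneumothorax": 0.20, "pleural_effusion": 0.10}, 1),
    ({"lung_opacity": 0.10, "fracture": 0.60, "pneumothorax": 0.10, "pleural_effusion": 0.10}, 0),
]

THRESHOLD = 0.5  # hypothetical operating point
labels = [label for _, label in studies]
predictions = [int(fuse_max(scores) >= THRESHOLD) for scores, _ in studies]
sensitivity, specificity = sensitivity_specificity(labels, predictions)
print(f"sensitivity={sensitivity:.2f}  specificity={specificity:.2f}")
```

Because each finding keeps its own model, a rule like this lets every architecture be trained on the label type best suited to its finding while still producing a single abnormal/normal output.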


Subjects
Deep Learning ; Radiography ; Retrospective Studies ; Triage ; X-Rays