ABSTRACT
A Food and Drug Administration (FDA)-cleared artificial intelligence (AI) algorithm misdiagnosed a finding as an intracranial hemorrhage in a patient who was ultimately diagnosed with an ischemic stroke. This scenario highlights a notable failure mode of AI tools and emphasizes the importance of human-machine interaction. In this report, the authors summarize the FDA review processes for software as a medical device and the unique regulatory designs for radiologic AI/machine learning algorithms intended to ensure their safety in clinical practice. The challenges that clinical implementation poses to maximizing the efficacy of these tools are then discussed.
Subjects
Algorithms , Artificial Intelligence , United States , Humans , United States Food and Drug Administration , Software , Machine Learning
ABSTRACT
Since 2000, there have been more than 8000 publications on radiology artificial intelligence (AI). AI breakthroughs allow complex tasks to be automated and even performed beyond human capabilities. However, the lack of detail on methods and algorithm code undercuts the scientific value of these publications. Many science subfields have recently faced a reproducibility crisis, eroding trust in processes and results and contributing to the rise in retractions of scientific papers. For the same reasons, research in deep learning (DL) also requires reproducibility. Although several valuable manuscript checklists for AI in medical imaging exist, they are not focused specifically on reproducibility. In this study, we conducted a systematic review of recently published papers in the field of DL to evaluate whether the description of their methodology would allow their findings to be reproduced. We focused on the Journal of Digital Imaging (JDI), a specialized journal that publishes papers on AI and medical imaging. We used the keyword "Deep Learning" and collected the articles published between January 2020 and January 2022. We screened all the articles and included those that reported the development of a DL tool in medical imaging. From each included article, we extracted the reported details about the dataset, data handling steps, data splitting, model details, and performance metrics. We found 148 articles. Eighty were included after screening for articles that reported developing a DL model for medical image analysis. Five studies made their code publicly available, and 35 studies used publicly available datasets. We provided figures to show the ratio and absolute count of reported items from included studies. According to our cross-sectional study, authors of JDI publications on DL in medical imaging infrequently report the key elements needed to make their studies reproducible.
Subjects
Artificial Intelligence , Diagnostic Imaging , Humans , Cross-Sectional Studies , Reproducibility of Results , Algorithms
ABSTRACT
INTRODUCTION: Glioblastomas (GBMs) are highly aggressive tumors. A common clinical challenge after standard of care treatment is differentiating tumor progression from treatment-related changes, also known as pseudoprogression (PsP). Usually, PsP resolves or stabilizes without further treatment or after a course of steroids, whereas true progression (TP) requires more aggressive management. Differentiating PsP from TP will affect the patient's outcome. This study investigated using deep learning to distinguish PsP MRI features from progressive disease. METHOD: We included GBM patients with a new or increasingly enhancing lesion within the original radiation field. We labeled those who subsequently were stable or improved on imaging and clinically as PsP and those with clinical and imaging deterioration as TP. A subset of subjects underwent a second resection. We labeled these subjects as PsP or TP based on the histological diagnosis. We coregistered contrast-enhanced T1 MRIs with T2-weighted images for each patient and used them as input to a 3-D DenseNet121 model, using five-fold cross-validation to predict TP vs PsP. RESULT: We included 124 patients who met the criteria, and of those, 63 were PsP and 61 were TP. We trained a deep learning model that achieved 76.4% (range 70-84%, SD 5.122) mean accuracy over the 5 folds, 0.7560 (range 0.6553-0.8535, SD 0.069) mean AUROC, 88.72% (SD 6.86) mean sensitivity, and 62.05% (SD 9.11) mean specificity. CONCLUSION: We report the development of a deep learning model that distinguishes PsP from TP in GBM patients treated per the Stupp protocol. Further refinement and external validation are required prior to widespread adoption in clinical practice.
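As an illustration of the five-fold cross-validation described above, the following is a minimal sketch of a class-balanced (stratified) fold assignment for the reported cohort of 63 PsP and 61 TP patients. The abstract does not say whether the folds were stratified, so stratification, the function name `stratified_folds`, and the random seed are assumptions made for illustration only.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, spreading each class evenly."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Cohort sizes taken from the abstract; the ordering of labels is arbitrary.
labels = ["PsP"] * 63 + ["TP"] * 61
folds = stratified_folds(labels)

for i, held_out in enumerate(folds):
    n_psp = sum(labels[j] == "PsP" for j in held_out)
    n_tp = len(held_out) - n_psp
    print(f"fold {i}: {len(held_out)} held out ({n_psp} PsP, {n_tp} TP)")
```

In each round, one fold serves as the validation set and the remaining four are used for training, so every patient is evaluated exactly once; per-fold metrics can then be averaged as in the reported results.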
Subjects
Brain Neoplasms , Deep Learning , Glioblastoma , Disease Progression , Humans , Magnetic Resonance Imaging , Retrospective Studies
ABSTRACT
Thyroid ultrasound (US) is the primary method to evaluate thyroid nodules. Deep learning (DL) has been playing a significant role in evaluating thyroid cancer. We propose a DL-based pipeline to detect and classify thyroid nodules into benign or malignant groups relying on two views of US imaging. Transverse and longitudinal US images of thyroid nodules from 983 patients were collected retrospectively. Eighty-one cases were held out as a testing set, and the rest of the data were used in five-fold cross-validation (CV). Two You Only Look Once (YOLO) v5 models were trained to detect nodules and classify them. For each view, five models were developed during the CV, which were ensembled by using non-max suppression (NMS) to boost their collective generalizability. An extreme gradient boosting (XGBoost) model was trained on the outputs of the ensembled models for both views to yield a final prediction of malignancy for each nodule. The test set was evaluated by an expert radiologist using the American College of Radiology Thyroid Imaging Reporting and Data System (ACR-TIRADS). The ensemble models for each view achieved a mAP0.5 of 0.797 (transverse) and 0.716 (longitudinal). The whole pipeline reached an AUROC of 0.84 (95% CI: 0.75-0.91) with sensitivity and specificity of 84% and 63%, respectively, while the ACR-TIRADS evaluation of the same set had a sensitivity of 76% and specificity of 34% (p-value = 0.003). Our proposed work demonstrates the potential of a deep learning model to achieve diagnostic performance for thyroid nodule evaluation.
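The ensembling step can be illustrated with a minimal sketch of greedy non-max suppression applied to detections pooled from several models. This is not the authors' implementation: the box format `(x1, y1, x2, y2)`, the IoU threshold of 0.5, the class-agnostic treatment of boxes, and the example detections are all assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it."""
    remaining = sorted(detections, key=lambda d: d[1], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [d for d in remaining if iou(best[0], d[0]) < iou_threshold]
    return kept


# Hypothetical (box, confidence) detections pooled from three fold models.
pooled = [
    ((10, 10, 50, 50), 0.90),
    ((12, 11, 52, 49), 0.80),       # overlaps the first box heavily
    ((100, 100, 140, 150), 0.70),   # a separate region
]
kept = nms(pooled)
print(kept)
```

Pooling the predictions of all fold models and suppressing overlapping boxes leaves one detection per region, scored by the most confident model for that region.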
ABSTRACT
Synovial sarcoma is a rare type of soft tissue sarcoma that typically arises in the lower extremities and rarely in the upper extremities. Here, we present an unusual case of a middle-aged man who complained of dyspnea, dry cough, and chest pain and was found to have a mass-like lesion on the ulnar side of his left wrist during physical examination. The patient also exhibited gynecomastia and had elevated β-human chorionic gonadotropin (βHCG) levels. Subsequent imaging and histopathological analysis of the wrist mass confirmed the diagnosis of synovial sarcoma with disseminated lung metastasis. This article aims to provide a comprehensive overview of the clinical and pathological characteristics of synovial sarcoma, highlight the importance of considering synovial sarcoma as a differential diagnosis in patients with abnormal hormonal assays, and emphasize the need for clinicians to be vigilant about any pathologic lesions existing on the upper extremity to avoid late diagnosis and the development of advanced cancerous diseases.
ABSTRACT
In recent years, the role of artificial intelligence (AI) in medical imaging has become increasingly prominent, with the majority of AI applications approved by the FDA being in imaging and radiology in 2023. The surge in AI model development to tackle clinical challenges underscores the necessity for preparing high-quality medical imaging data. Proper data preparation is crucial as it fosters the creation of standardized and reproducible AI models while minimizing biases. Data curation transforms raw data into a valuable, organized, and dependable resource and is fundamental to the success of machine learning and analytical projects. Considering the plethora of tools available for the different stages of data curation, it is crucial to stay informed about the most relevant tools within specific research areas. In the current work, we propose a descriptive outline of the steps of data curation and, for each step, furnish compilations of tools collected from a survey conducted among members of the Society for Imaging Informatics in Medicine (SIIM). This collection has the potential to enhance the decision-making process for researchers as they select the most appropriate tool for their specific tasks.
Subjects
Artificial Intelligence , Diagnostic Imaging , Diagnostic Imaging/methods , Humans , Societies, Medical , Medical Informatics/methods , Surveys and Questionnaires , Data Curation/methods , Machine Learning
ABSTRACT
The application of deep learning (DL) in medicine introduces transformative tools with the potential to enhance prognosis, diagnosis, and treatment planning. However, ensuring transparent documentation is essential for researchers to enhance reproducibility and refine techniques. Our study addresses the unique challenges presented by DL in medical imaging by developing a comprehensive checklist using the Delphi method to enhance reproducibility and reliability in this dynamic field. We compiled a preliminary checklist based on a comprehensive review of existing checklists and relevant literature. A panel of 11 experts in medical imaging and DL assessed these items using Likert scales, with two survey rounds to refine responses and gauge consensus. We also employed the content validity ratio with a cutoff of 0.59 to determine item face and content validity. Round 1 included a 27-item questionnaire, with 12 items demonstrating high consensus for face and content validity that were then left out of round 2. Round 2 involved refining the checklist, resulting in an additional 17 items. In the last round, 3 items were deemed non-essential or infeasible, while 2 newly suggested items received unanimous agreement for inclusion, resulting in a final 26-item DL model reporting checklist derived from the Delphi process. The 26-item checklist facilitates the reproducible reporting of DL tools and enables scientists to replicate the study's results.
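The content validity ratio mentioned above can be sketched as follows, assuming it follows Lawshe's standard definition, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating an item essential and N is the panel size. The abstract reports only the panel size (11) and the cutoff (0.59); the function name and the vote counts in the loop are hypothetical.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR: ranges from -1 (no one agrees) to +1 (everyone agrees)."""
    half = n_panelists / 2
    return (n_essential - half) / half


PANEL_SIZE = 11  # panel size reported in the abstract
CUTOFF = 0.59    # cutoff reported in the abstract

# Hypothetical vote counts, shown only to locate where the cutoff falls.
for n_essential in range(7, PANEL_SIZE + 1):
    cvr = content_validity_ratio(n_essential, PANEL_SIZE)
    verdict = "meets cutoff" if cvr >= CUTOFF else "below cutoff"
    print(f"{n_essential}/{PANEL_SIZE} rate essential: CVR = {cvr:.3f} ({verdict})")
```

Under this definition, an item reaches the 0.59 cutoff with an 11-member panel only when at least 9 panelists rate it essential.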
Subjects
Checklist , Deep Learning , Delphi Technique , Diagnostic Imaging , Humans , Reproducibility of Results , Diagnostic Imaging/methods , Diagnostic Imaging/standards , Surveys and Questionnaires
ABSTRACT
Purpose of Review: In this study, we planned and carried out a scoping review of the literature to learn how machine learning (ML) has been investigated in cardiovascular imaging (CVI). Recent Findings: During our search, we found numerous studies that developed or utilized existing ML models for segmentation, classification, object detection, generation, and regression applications involving cardiovascular imaging data. We first quantitatively investigated the different aspects of study characteristics, data handling, model development, and performance evaluation in all studies that were included in our review. We then supplemented these findings with a qualitative synthesis to highlight the common themes in the studied literature and provided recommendations to pave the way for upcoming research. Summary: ML is a subfield of artificial intelligence (AI) that enables computers to learn human-like decision-making from data. Due to its novel applications, ML is gaining increasing attention from researchers in the healthcare industry. Cardiovascular imaging is an active area of research in medical imaging with ample room for incorporating new technologies, like ML. Supplementary Information: The online version contains supplementary material available at 10.1007/s40134-022-00407-8.
ABSTRACT
Machine-learning (ML) and deep-learning (DL) algorithms are part of a group of modeling algorithms that grasp the hidden patterns in data based on a training process, enabling them to extract complex information from the input data. In the past decade, these algorithms have been increasingly used for image processing, specifically in the medical domain. Cardiothoracic imaging is one of the early adopters of ML/DL research, and the COVID-19 pandemic resulted in more research focus on the feasibility and applications of ML/DL in cardiothoracic imaging. In this scoping review, we systematically searched the available peer-reviewed medical literature on cardiothoracic imaging and quantitatively extracted key data elements to obtain an overall picture of how ML/DL have been used in the rapidly evolving cardiothoracic imaging field. In this report, we provide insights on different applications of ML/DL and some nuances pertaining to this specific field of research. Finally, we provide general suggestions on how researchers can move their work beyond proof-of-concept and toward clinical adoption.
ABSTRACT
There are increasing concerns about the bias and fairness of artificial intelligence (AI) models as they are put into clinical practice. Among the steps for implementing machine learning tools into clinical workflow, model development is an important stage where different types of biases can occur. This report focuses on four aspects of model development where such bias may arise: data augmentation, model and loss function, optimizers, and transfer learning. This report emphasizes appropriate considerations and practices that can mitigate biases in radiology AI studies. Keywords: Model, Bias, Machine Learning, Deep Learning, Radiology © RSNA, 2022.
ABSTRACT
The increasing use of machine learning (ML) algorithms in clinical settings raises concerns about bias in ML models. Bias can arise at any step of ML creation, including data handling, model development, and performance evaluation. Potential biases in the ML model can be minimized by implementing these steps correctly. This report focuses on performance evaluation and discusses model fitness, as well as a set of performance evaluation toolboxes: namely, performance metrics, performance interpretation maps, and uncertainty quantification. By discussing the strengths and limitations of each toolbox, our report highlights strategies and considerations to mitigate and detect biases during performance evaluations of radiology artificial intelligence models. Keywords: Segmentation, Diagnosis, Convolutional Neural Network (CNN) © RSNA, 2022.