Search | VHL Regional Portal

1.

OTMorph: Unsupervised Multi-domain Abdominal Medical Image Registration Using Neural Optimal Transport.

Kim, Boah; Zhuang, Yan; Mathai, Tejas Sudharshan; Summers, Ronald M.

IEEE Trans Med Imaging ; PP2024 Aug 02.

Article in English | MEDLINE | ID: mdl-39093684

ABSTRACT

Deformable image registration is one of the essential processes in analyzing medical images. In particular, when diagnosing abdominal diseases such as hepatic cancer and lymphoma, multi-domain images scanned from different modalities or different imaging protocols are often used. However, they are not aligned due to scanning times, patient breathing, movement, etc. Although recent learning-based approaches can provide deformations in real-time with high performance, multi-domain abdominal image registration using deep learning is still challenging since the images in different domains have different characteristics such as image contrast and intensity ranges. To address this, this paper proposes a novel unsupervised multi-domain image registration framework using neural optimal transport, dubbed OTMorph. When moving and fixed volumes are given as input, a transport module of our proposed model learns the optimal transport plan to map data distributions from the moving to the fixed volumes and estimates a domain-transported volume. Subsequently, a registration module taking the transported volume can effectively estimate the deformation field, leading to deformation performance improvement. Experimental results on multi-domain image registration using multi-modality and multi-parametric abdominal medical images demonstrate that the proposed method provides superior deformable registration via the domain-transported image that alleviates the domain gap between the input images. Also, we attain the improvement even on out-of-distribution data, which indicates the superior generalizability of our model for the registration of various medical images. Our source code is available at https://github.com/boahK/OTMorph.

2.

A New Benchmark: Clinical Uncertainty and Severity Aware Labeled Chest X-Ray Images with Multi-Relationship Graph Learning.

Zhang, Mengliang; Hu, Xinyue; Gu, Lin; Liu, Liangchen; Kobayashi, Kazuma; Harada, Tatsuya; Yan, Yan; Summers, Ronald M; Zhu, Yingying.

IEEE Trans Med Imaging ; PP2024 Aug 09.

Article in English | MEDLINE | ID: mdl-39120990

ABSTRACT

Chest radiography, commonly known as CXR, is frequently utilized in clinical settings to detect cardiopulmonary conditions. However, even seasoned radiologists might offer different evaluations regarding the seriousness and uncertainty associated with observed abnormalities. Previous research has attempted to utilize clinical notes to extract abnormal labels for training deep-learning models in CXR image diagnosis. However, these methods often neglected the varying degrees of severity and uncertainty linked to different labels. In our study, we initially assembled a comprehensive new dataset of CXR images based on clinical textual data, which incorporated radiologists' assessments of uncertainty and severity. Using this dataset, we introduced a multi-relationship graph learning framework that leverages spatial and semantic relationships while addressing expert uncertainty through a dedicated loss function. Our research showcases a notable enhancement in CXR image diagnosis and the interpretability of the diagnostic model, surpassing existing state-of-the-art methodologies. The dataset address of disease severity and uncertainty we extracted is: https://physionet.org/content/cad-chest/1.0/.

3.

Classification of Multi-Parametric Body MRI Series Using Deep Learning.

Kim, Boah; Mathai, Tejas Sudharshan; Helm, Kimberly; Pinto, Peter A; Summers, Ronald M.

IEEE J Biomed Health Inform ; PP2024 Aug 23.

Article in English | MEDLINE | ID: mdl-39178097

ABSTRACT

Multi-parametric magnetic resonance imaging (mpMRI) exams have various series types acquired with different imaging protocols. The DICOM headers of these series often have incorrect information due to the sheer diversity of protocols and occasional technologist errors. To address this, we present a deep learning-based classification model to classify 8 different body mpMRI series types so that radiologists read the exams efficiently. Using mpMRI data from various institutions, multiple deep learning-based classifiers of ResNet, EfficientNet, and DenseNet are trained to classify 8 different MRI series, and their performance is compared. Then, the best-performing classifier is identified, and its classification capability under the setting of different training data quantities is studied. Also, the model is evaluated on the out-of-training-distribution datasets. Moreover, the model is trained using mpMRI exams obtained from different scanners in two training strategies, and its performance is tested. Experimental results show that the DenseNet-121 model achieves the highest F1-score and accuracy of 0.966 and 0.972 over the other classification models with p-value 0.05. The model shows greater than 0.95 accuracy when trained with over 729 studies of the training data, whose performance improves as the training data quantities grew larger. On the external data with the DLDS and CPTAC-UCEC datasets, the model yields 0.872 and 0.810 accuracy for each. These results indicate that in both the internal and external datasets, the DenseNet-121 model attains high accuracy for the task of classifying 8 body MRI series types.

4.

Weakly-Supervised Detection of Bone Lesions in CT.

Sheng, Tao; Mathai, Tejas Sudharshan; Shieh, Alexander; Summers, Ronald M.

Proc SPIE Int Soc Opt Eng ; 129272024 Feb.

Article in English | MEDLINE | ID: mdl-38974478

ABSTRACT

The skeletal region is one of the common sites of metastatic spread of cancer in the breast and prostate. CT is routinely used to measure the size of lesions in the bones. However, they can be difficult to spot due to the wide variations in their sizes, shapes, and appearances. Precise localization of such lesions would enable reliable tracking of interval changes (growth, shrinkage, or unchanged status). To that end, an automated technique to detect bone lesions is highly desirable. In this pilot work, we developed a pipeline to detect bone lesions (lytic, blastic, and mixed) in CT volumes via a proxy segmentation task. First, we used the bone lesions that were prospectively marked by radiologists in a few 2D slices of CT volumes and converted them into weak 3D segmentation masks. Then, we trained a 3D full-resolution nnUNet model using these weak 3D annotations to segment the lesions and thereby detected them. Our automated method detected bone lesions in CT with a precision of 96.7% and recall of 47.3% despite the use of incomplete and partial training data. To the best of our knowledge, we are the first to attempt the direct detection of bone lesions in CT via a proxy segmentation task.

5.

AI-based abdominal CT measurements of orthotopic and ectopic fat predict mortality and cardiometabolic disease risk in adults.

Lee, Matthew H; Zea, Ryan; Garrett, John W; Summers, Ronald M; Pickhardt, Perry J.

Eur Radiol ; 2024 Jul 12.

Article in English | MEDLINE | ID: mdl-38995381

ABSTRACT

OBJECTIVES: To evaluate the utility of CT-based abdominal fat measures for predicting the risk of death and cardiometabolic disease in an asymptomatic adult screening population. METHODS: Fully automated AI tools quantifying abdominal adipose tissue (L3 level visceral [VAT] and subcutaneous [SAT] fat area, visceral-to-subcutaneous fat ratio [VSR], VAT attenuation), muscle attenuation (L3 level), and liver attenuation were applied to non-contrast CT scans in asymptomatic adults undergoing CT colonography (CTC). Longitudinal follow-up documented subsequent deaths, cardiovascular events, and diabetes. ROC and time-to-event analyses were performed to generate AUCs and hazard ratios (HR) binned by octile. RESULTS: A total of 9223 adults (mean age, 57 years; 4071:5152 M:F) underwent screening CTC from April 2004 to December 2016. 549 patients died on follow-up (median, nine years). Fat measures outperformed BMI for predicting mortality risk-5-year AUCs for muscle attenuation, VSR, and BMI were 0.721, 0.661, and 0.499, respectively. Higher visceral, muscle, and liver fat were associated with increased mortality risk-VSR > 1.53, HR = 3.1; muscle attenuation < 15 HU, HR = 5.4; liver attenuation < 45 HU, HR = 2.3. Higher VAT area and VSR were associated with increased cardiovascular event and diabetes risk-VSR > 1.59, HR = 2.6 for cardiovascular event; VAT area > 291 cm2, HR = 6.3 for diabetes (p < 0.001). A U-shaped association was observed for SAT with a higher risk of death for very low and very high SAT. CONCLUSION: Fully automated CT-based measures of abdominal fat are predictive of mortality and cardiometabolic disease risk in asymptomatic adults and uncover trends that are not reflected in anthropomorphic measures. CLINICAL RELEVANCE STATEMENT: Fully automated CT-based measures of abdominal fat soundly outperform anthropometric measures for mortality and cardiometabolic risk prediction in asymptomatic patients. KEY POINTS: Abdominal fat depots associated with metabolic dysregulation and cardiovascular disease can be derived from abdominal CT. Fully automated AI body composition tools can measure factors associated with increased mortality and cardiometabolic risk. CT-based abdominal fat measures uncover trends in mortality and cardiometabolic risk not captured by BMI in asymptomatic outpatients.

6.

Weakly supervised detection of pheochromocytomas and paragangliomas in CT using noisy data.

Oluigbo, David; Mathai, Tejas Sudharshan; Santra, Bikash; Mukherjee, Pritam; Liu, Jianfei; Jha, Abhishek; Patel, Mayank; Pacak, Karel; Summers, Ronald M.

Comput Med Imaging Graph ; 116: 102419, 2024 Sep.

Article in English | MEDLINE | ID: mdl-39053035

ABSTRACT

Pheochromocytomas and Paragangliomas (PPGLs) are rare adrenal and extra-adrenal tumors that have metastatic potential. Management of patients with PPGLs mainly depends on the makeup of their genetic cluster: SDHx, VHL/EPAS1, kinase, and sporadic. CT is the preferred modality for precise localization of PPGLs, such that their metastatic progression can be assessed. However, the variable size, morphology, and appearance of these tumors in different anatomical regions can pose challenges for radiologists. Since radiologists must routinely track changes across patient visits, manual annotation of PPGLs is quite time-consuming and cumbersome to do across all axial slices in a CT volume. As such, PPGLs are only weakly annotated on axial slices by radiologists in the form of RECIST measurements. To ameliorate the manual effort spent by radiologists, we propose a method for the automated detection of PPGLs in CT via a proxy segmentation task. Weak 3D annotations (derived from 2D bounding boxes) were used to train both 2D and 3D nnUNet models to detect PPGLs via segmentation. We evaluated our approaches on an in-house dataset comprised of chest-abdomen-pelvis CTs of 255 patients with confirmed PPGLs. On a test set of 53 CT volumes, our 3D nnUNet model achieved a detection precision of 70% and sensitivity of 64.1%, and outperformed the 2D model that obtained a precision of 52.7% and sensitivity of 27.5% (p< 0.05). SDHx and sporadic genetic clusters achieved the highest precisions of 73.1% and 72.7% respectively. Our state-of-the art findings highlight the promising nature of the challenging task of automated PPGL detection.

Subject(s)

Adrenal Gland Neoplasms , Paraganglioma , Pheochromocytoma , Tomography, X-Ray Computed , Humans , Pheochromocytoma/diagnostic imaging , Paraganglioma/diagnostic imaging , Adrenal Gland Neoplasms/diagnostic imaging , Tomography, X-Ray Computed/methods , Radiographic Image Interpretation, Computer-Assisted/methods

7.

Interpretable medical image Visual Question Answering via multi-modal relationship graph learning.

Hu, Xinyue; Gu, Lin; Kobayashi, Kazuma; Liu, Liangchen; Zhang, Mengliang; Harada, Tatsuya; Summers, Ronald M; Zhu, Yingying.

Med Image Anal ; 97: 103279, 2024 Oct.

Article in English | MEDLINE | ID: mdl-39079429

ABSTRACT

Medical Visual Question Answering (VQA) is an important task in medical multi-modal Large Language Models (LLMs), aiming to answer clinically relevant questions regarding input medical images. This technique has the potential to improve the efficiency of medical professionals while relieving the burden on the public health system, particularly in resource-poor countries. However, existing medical VQA datasets are small and only contain simple questions (equivalent to classification tasks), which lack semantic reasoning and clinical knowledge. Our previous work proposed a clinical knowledge-driven image difference VQA benchmark using a rule-based approach (Hu et al., 2023). However, given the same breadth of information coverage, the rule-based approach shows an 85% error rate on extracted labels. We trained an LLM method to extract labels with 62% increased accuracy. We also comprehensively evaluated our labels with 2 clinical experts on 100 samples to help us fine-tune the LLM. Based on the trained LLM model, we proposed a large-scale medical VQA dataset, Medical-CXR-VQA, using LLMs focused on chest X-ray images. The questions involved detailed information, such as abnormalities, locations, levels, and types. Based on this dataset, we proposed a novel VQA method by constructing three different relationship graphs: spatial relationships, semantic relationships, and implicit relationship graphs on the image regions, questions, and semantic labels. We leveraged graph attention to learn the logical reasoning paths for different questions. These learned graph VQA reasoning paths can be further used for LLM prompt engineering and chain-of-thought, which are crucial for further fine-tuning and training multi-modal large language models. Moreover, we demonstrate that our approach has the qualities of evidence and faithfulness, which are crucial in the clinical field. The code and the dataset is available at https://github.com/Holipori/Medical-CXR-VQA.

Subject(s)

Machine Learning , Humans , Image Interpretation, Computer-Assisted/methods , Semantics

8.

Hidden flaws behind expert-level accuracy of multimodal GPT-4 vision in medicine.

Jin, Qiao; Chen, Fangyuan; Zhou, Yiliang; Xu, Ziyang; Cheung, Justin M; Chen, Robert; Summers, Ronald M; Rousseau, Justin F; Ni, Peiyun; Landsman, Marc J; Baxter, Sally L; Al'Aref, Subhi J; Li, Yijia; Chen, Alexander; Brejt, Josef A; Chiang, Michael F; Peng, Yifan; Lu, Zhiyong.

NPJ Digit Med ; 7(1): 190, 2024 Jul 23.

Article in English | MEDLINE | ID: mdl-39043988

ABSTRACT

Recent studies indicate that Generative Pre-trained Transformer 4 with Vision (GPT-4V) outperforms human physicians in medical challenge tasks. However, these evaluations primarily focused on the accuracy of multi-choice questions alone. Our study extends the current scope by conducting a comprehensive analysis of GPT-4V's rationales of image comprehension, recall of medical knowledge, and step-by-step multimodal reasoning when solving New England Journal of Medicine (NEJM) Image Challenges-an imaging quiz designed to test the knowledge and diagnostic capabilities of medical professionals. Evaluation results confirmed that GPT-4V performs comparatively to human physicians regarding multi-choice accuracy (81.6% vs. 77.8%). GPT-4V also performs well in cases where physicians incorrectly answer, with over 78% accuracy. However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (35.5%), most prominent in image comprehension (27.2%). Regardless of GPT-4V's high accuracy in multi-choice questions, our findings emphasize the necessity for further in-depth evaluations of its rationales before integrating such multimodal AI models into clinical workflows.

9.

Clinical, Cultural, Computational, and Regulatory Considerations to Deploy AI in Radiology: Perspectives of RSNA and MICCAI Experts.

Linguraru, Marius George; Bakas, Spyridon; Aboian, Mariam; Chang, Peter D; Flanders, Adam E; Kalpathy-Cramer, Jayashree; Kitamura, Felipe C; Lungren, Matthew P; Mongan, John; Prevedello, Luciano M; Summers, Ronald M; Wu, Carol C; Adewole, Maruf; Kahn, Charles E.

Radiol Artif Intell ; 6(4): e240225, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38984986

ABSTRACT

The Radiological Society of North of America (RSNA) and the Medical Image Computing and Computer Assisted Intervention (MICCAI) Society have led a series of joint panels and seminars focused on the present impact and future directions of artificial intelligence (AI) in radiology. These conversations have collected viewpoints from multidisciplinary experts in radiology, medical imaging, and machine learning on the current clinical penetration of AI technology in radiology and how it is impacted by trust, reproducibility, explainability, and accountability. The collective points-both practical and philosophical-define the cultural changes for radiologists and AI scientists working together and describe the challenges ahead for AI technologies to meet broad approval. This article presents the perspectives of experts from MICCAI and RSNA on the clinical, cultural, computational, and regulatory considerations-coupled with recommended reading materials-essential to adopt AI technology successfully in radiology and, more generally, in clinical practice. The report emphasizes the importance of collaboration to improve clinical deployment, highlights the need to integrate clinical and medical imaging data, and introduces strategies to ensure smooth and incentivized integration. Keywords: Adults and Pediatrics, Computer Applications-General (Informatics), Diagnosis, Prognosis © RSNA, 2024.

Subject(s)

Artificial Intelligence , Radiology , Humans , Radiology/methods , Societies, Medical

10.

Leveraging Professional Radiologists' Expertise to Enhance LLMs' Evaluation for Radiology Reports.

Zhu, Qingqing; Chen, Xiuying; Jin, Qiao; Hou, Benjamin; Mathai, Tejas Sudharshan; Mukherjee, Pritam; Gao, Xin; Summers, Ronald M; Lu, Zhiyong.

ArXiv ; 2024 Feb 17.

Article in English | MEDLINE | ID: mdl-38903745

ABSTRACT

In radiology, Artificial Intelligence (AI) has significantly advanced report generation, but automatic evaluation of these AI-produced reports remains challenging. Current metrics, such as Conventional Natural Language Generation (NLG) and Clinical Efficacy (CE), often fall short in capturing the semantic intricacies of clinical contexts or overemphasize clinical details, undermining report clarity. To overcome these issues, our proposed method synergizes the expertise of professional radiologists with Large Language Models (LLMs), like GPT-3.5 and GPT-4. Utilizing In-Context Instruction Learning (ICIL) and Chain of Thought (CoT) reasoning, our approach aligns LLM evaluations with radiologist standards, enabling detailed comparisons between human and AI-generated reports. This is further enhanced by a Regression model that aggregates sentence evaluation scores. Experimental results show that our "Detailed GPT-4 (5-shot)" model achieves a 0.48 score, outperforming the METEOR metric by 0.19, while our "Regressed GPT-4" model shows even greater alignment with expert evaluations, exceeding the best existing metric by a 0.35 margin. Moreover, the robustness of our explanations has been validated through a thorough iterative strategy. We plan to publicly release annotations from radiology experts, setting a new standard for accuracy in future assessments. This underscores the potential of our approach in enhancing the quality assessment of AI-driven medical reports.

11.

A Comparison of CT-Based Pancreatic Segmentation Deep Learning Models.

Suri, Abhinav; Mukherjee, Pritam; Pickhardt, Perry J; Summers, Ronald M.

Acad Radiol ; 2024 Jun 28.

Article in English | MEDLINE | ID: mdl-38944630

ABSTRACT

RATIONALE AND OBJECTIVES: Pancreas segmentation accuracy at CT is critical for the identification of pancreatic pathologies and is essential for the development of imaging biomarkers. Our objective was to benchmark the performance of five high-performing pancreas segmentation models across multiple metrics stratified by scan and patient/pancreatic characteristics that may affect segmentation performance. MATERIALS AND METHODS: In this retrospective study, PubMed and ArXiv searches were conducted to identify pancreas segmentation models which were then evaluated on a set of annotated imaging datasets. Results (Dice score, Hausdorff distance [HD], average surface distance [ASD]) were stratified by contrast status and quartiles of peri-pancreatic attenuation (5 mm region around pancreas). Multivariate regression was performed to identify imaging characteristics and biomarkers (n = 9) that were significantly associated with Dice score. RESULTS: Five pancreas segmentation models were identified: Abdomen Atlas [AAUNet, AASwin, trained on 8448 scans], TotalSegmentator [TS, 1204 scans], nnUNetv1 [MSD-nnUNet, 282 scans], and a U-Net based model for predicting diabetes [DM-UNet, 427 scans]. These were evaluated on 352 CT scans (30 females, 25 males, 297 sex unknown; age 58 ± 7 years [ ± 1 SD], 327 age unknown) from 2000-2023. Overall, TS, AAUNet, and AASwin were the best performers, Dice= 80 ± 11%, 79 ± 16%, and 77 ± 18%, respectively (pairwise Sidak test not-significantly different). AASwin and MSD-nnUNet performed worse (for all metrics) on non-contrast scans (vs contrast, P < .001). The worst performer was DM-UNet (Dice=67 ± 16%). All algorithms except TS showed lower Dice scores with increasing peri-pancreatic attenuation (P < .01). Multivariate regression showed non-contrast scans, (P < .001; MSD-nnUNet), smaller pancreatic length (P = .005, MSD-nnUNet), and height (P = .003, DM-UNet) were associated with lower Dice scores. CONCLUSION: The convolutional neural network-based models trained on a diverse set of scans performed best (TS, AAUnet, and AASwin). TS performed equivalently to AAUnet and AASwin with only 13% of the training set size (8488 vs 1204 scans). Though trained on the same dataset, a transformer network (AASwin) had poorer performance on non-contrast scans whereas its convolutional network counterpart (AAUNet) did not. This study highlights how aggregate assessment metrics of pancreatic segmentation algorithms seen in other literature are not enough to capture differential performance across common patient and scanning characteristics in clinical populations.

12.

AI-based opportunistic quantitative image analysis of lung cancer screening CTs to reduce disparities in osteoporosis screening.

Huber, Florian A; Bunnell, Katherine M; Garrett, John W; Flores, Efren J; Summers, Ronald M; Pickhardt, Perry J; Bredella, Miriam A.

Bone ; 186: 117176, 2024 Sep.

Article in English | MEDLINE | ID: mdl-38925254

ABSTRACT

Osteoporosis is underdiagnosed, especially in ethnic and racial minorities who are thought to be protected against bone loss, but often have worse outcomes after an osteoporotic fracture. We aimed to determine the prevalence of osteoporosis by opportunistic CT in patients who underwent lung cancer screening (LCS) using non-contrast CT in the Northeastern United States. Demographics including race and ethnicity were retrieved. We assessed trabecular bone and body composition using a fully-automated artificial intelligence algorithm. ROIs were placed at T12 vertebral body for attenuation measurements in Hounsfield Units (HU). Two validated thresholds were used to diagnose osteoporosis: high-sensitivity threshold (115-165 HU) and high specificity threshold (<115 HU). We performed descriptive statistics and ANOVA to compare differences across sex, race, ethnicity, and income class according to neighborhoods' mean household incomes. Forward stepwise regression modeling was used to determine body composition predictors of trabecular attenuation. We included 3708 patients (mean age 64 ± 7 years, 54 % males) who underwent LCS, had available demographic information and an evaluable CT for trabecular attenuation analysis. Using the high sensitivity threshold, osteoporosis was more prevalent in females (74 % vs. 65 % in males, p < 0.0001) and Whites (72 % vs 49 % non-Whites, p < 0.0001). However, osteoporosis was present across all races (38 % Black, 55 % Asian, 56 % Hispanic) and affected all income classes (69 %, 69 %, and 91 % in low, medium, and high-income class, respectively). High visceral/subcutaneous fat-ratio, aortic calcification, and hepatic steatosis were associated with low trabecular attenuation (p < 0.01), whereas muscle mass was positively associated with trabecular attenuation (p < 0.01). In conclusion, osteoporosis is prevalent across all races, income classes and both sexes in patients undergoing LCS. Opportunistic CT using a fully-automated algorithm and uniform imaging protocol is able to detect osteoporosis and body composition without additional testing or radiation. Early identification of patients traditionally thought to be at low risk for bone loss will allow for initiating appropriate treatment to prevent future fragility fractures. CLINICALTRIALS.GOV IDENTIFIER: N/A.

Subject(s)

Early Detection of Cancer , Lung Neoplasms , Osteoporosis , Tomography, X-Ray Computed , Aged , Female , Humans , Male , Middle Aged , Artificial Intelligence , Early Detection of Cancer/methods , Image Processing, Computer-Assisted/methods , Lung Neoplasms/diagnostic imaging , Osteoporosis/diagnostic imaging , Osteoporosis/epidemiology , Tomography, X-Ray Computed/methods

13.

MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRI.

Zhuang, Yan; Mathai, Tejas Sudharshan; Mukherjee, Pritam; Khoury, Brandon; Kim, Boah; Hou, Benjamin; Rabbee, Nusrat; Suri, Abhinav; Summers, Ronald M.

ArXiv ; 2024 Jun 24.

Article in English | MEDLINE | ID: mdl-38903743

ABSTRACT

BACKGROUND: Segmentation of organs and structures in abdominal MRI is useful for many clinical applications, such as disease diagnosis and radiotherapy. Current approaches have focused on delineating a limited set of abdominal structures (13 types). To date, there is no publicly available abdominal MRI dataset with voxel-level annotations of multiple organs and structures. Consequently, a segmentation tool for multi-structure segmentation is also unavailable. METHODS: We curated a T1-weighted abdominal MRI dataset consisting of 195 patients who underwent imaging at National Institutes of Health (NIH) Clinical Center. The dataset comprises of axial pre-contrast T1, arterial, venous, and delayed phases for each patient, thereby amounting to a total of 780 series (69,248 2D slices). Each series contains voxel-level annotations of 62 abdominal organs and structures. A 3D nnUNet model, dubbed as MRISegmentator-Abdomen (MRISegmentator in short), was trained on this dataset, and evaluation was conducted on an internal test set and two large external datasets: AMOS22 and Duke Liver. The predicted segmentations were compared against the ground-truth using the Dice Similarity Coefficient (DSC) and Normalized Surface Distance (NSD). FINDINGS: MRISegmentator achieved an average DSC of 0.861$\pm$0.170 and a NSD of 0.924$\pm$0.163 in the internal test set. On the AMOS22 dataset, MRISegmentator attained an average DSC of 0.829$\pm$0.133 and a NSD of 0.908$\pm$0.067. For the Duke Liver dataset, an average DSC of 0.933$\pm$0.015 and a NSD of 0.929$\pm$0.021 was obtained. INTERPRETATION: The proposed MRISegmentator provides automatic, accurate, and robust segmentations of 62 organs and structures in T1-weighted abdominal MRI sequences. The tool has the potential to accelerate research on various clinical topics, such as abnormality detection, radiotherapy, disease classification among others.

14.

AUTOMATED CLASSIFICATION OF MULTI-PARAMETRIC BODY MRI SERIES.

Kim, Boah; Mathai, Tejas Sudharshan; Helm, Kimberly; Summers, Ronald M.

ArXiv ; 2024 May 14.

Article in English | MEDLINE | ID: mdl-38903740

ABSTRACT

Multi-parametric MRI (mpMRI) studies are widely available in clinical practice for the diagnosis of various diseases. As the volume of mpMRI exams increases yearly, there are concomitant inaccuracies that exist within the DICOM header fields of these exams. This precludes the use of the header information for the arrangement of the different series as part of the radiologist's hanging protocol, and clinician oversight is needed for correction. In this pilot work, we propose an automated framework to classify the type of 8 different series in mpMRI studies. We used 1,363 studies acquired by three Siemens scanners to train a DenseNet-121 model with 5-fold cross-validation. Then, we evaluated the performance of the DenseNet-121 ensemble on a held-out test set of 313 mpMRI studies. Our method achieved an average precision of 96.6%, sensitivity of 96.6%, specificity of 99.6%, and F 1 score of 96.6% for the MRI series classification task. To the best of our knowledge, we are the first to develop a method to classify the series type in mpMRI studies acquired at the level of the chest, abdomen, and pelvis. Our method has the capability for robust automation of hanging protocols in modern radiology practice.

15.

Deep Learning Segmentation of Ascites on Abdominal CT Scans for Automatic Volume Quantification.

Hou, Benjamin; Lee, Sungwon; Lee, Jung-Min; Koh, Christopher; Xiao, Jing; Pickhardt, Perry J; Summers, Ronald M.

Radiol Artif Intell ; 6(5): e230601, 2024 Sep.

Article in English | MEDLINE | ID: mdl-38900043

ABSTRACT

Purpose To evaluate the performance of an automated deep learning method in detecting ascites and subsequently quantifying its volume in patients with liver cirrhosis and patients with ovarian cancer. Materials and Methods This retrospective study included contrast-enhanced and noncontrast abdominal-pelvic CT scans of patients with cirrhotic ascites and patients with ovarian cancer from two institutions, National Institutes of Health (NIH) and University of Wisconsin (UofW). The model, trained on The Cancer Genome Atlas Ovarian Cancer dataset (mean age [±SD], 60 years ± 11; 143 female), was tested on two internal datasets (NIH-LC and NIH-OV) and one external dataset (UofW-LC). Its performance was measured by the F1/Dice coefficient, SDs, and 95% CIs, focusing on ascites volume in the peritoneal cavity. Results On NIH-LC (25 patients; mean age, 59 years ± 14; 14 male) and NIH-OV (166 patients; mean age, 65 years ± 9; all female), the model achieved F1/Dice scores of 85.5% ± 6.1 (95% CI: 83.1, 87.8) and 82.6% ± 15.3 (95% CI: 76.4, 88.7), with median volume estimation errors of 19.6% (IQR, 13.2%-29.0%) and 5.3% (IQR: 2.4%-9.7%), respectively. On UofW-LC (124 patients; mean age, 46 years ± 12; 73 female), the model had a F1/Dice score of 83.0% ± 10.7 (95% CI: 79.8, 86.3) and median volume estimation error of 9.7% (IQR, 4.5%-15.1%). The model showed strong agreement with expert assessments, with r2 values of 0.79, 0.98, and 0.97 across the test sets. Conclusion The proposed deep learning method performed well in segmenting and quantifying the volume of ascites in patients with cirrhosis and those with ovarian cancer, in concordance with expert radiologist assessments. Keywords: Abdomen/GI, Cirrhosis, Deep Learning, Segmentation Supplemental material is available for this article. © RSNA, 2024 See also commentary by Aisen and Rodrigues in this issue.

Subject(s)

Ascites , Deep Learning , Liver Cirrhosis , Ovarian Neoplasms , Tomography, X-Ray Computed , Humans , Female , Middle Aged , Ascites/diagnostic imaging , Retrospective Studies , Tomography, X-Ray Computed/methods , Aged , Ovarian Neoplasms/diagnostic imaging , Ovarian Neoplasms/complications , Male , Liver Cirrhosis/diagnostic imaging , Liver Cirrhosis/complications , Radiographic Image Interpretation, Computer-Assisted/methods

16.

Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge.

Holste, Gregory; Zhou, Yiliang; Wang, Song; Jaiswal, Ajay; Lin, Mingquan; Zhuge, Sherry; Yang, Yuzhe; Kim, Dongkyun; Nguyen-Mau, Trong-Hieu; Tran, Minh-Triet; Jeong, Jaehyup; Park, Wongi; Ryu, Jongbin; Hong, Feng; Verma, Arsh; Yamagishi, Yosuke; Kim, Changhyun; Seo, Hyeryeong; Kang, Myungjoo; Celi, Leo Anthony; Lu, Zhiyong; Summers, Ronald M; Shih, George; Wang, Zhangyang; Peng, Yifan.

Med Image Anal ; 97: 103224, 2024 Oct.

Article in English | MEDLINE | ID: mdl-38850624

ABSTRACT

Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" - there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification. To engage with the research community on this emerging topic, we conducted an open challenge, CXR-LT, on long-tailed, multi-label thorax disease classification from chest X-rays (CXRs). We publicly release a large-scale benchmark dataset of over 350,000 CXRs, each labeled with at least one of 26 clinical findings following a long-tailed distribution. We synthesize common themes of top-performing solutions, providing practical recommendations for long-tailed, multi-label medical image classification. Finally, we use these insights to propose a path forward involving vision-language foundation models for few- and zero-shot disease classification.

Subject(s)

Radiography, Thoracic , Humans , Radiography, Thoracic/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Thoracic Diseases/diagnostic imaging , Thoracic Diseases/classification , Algorithms

17.

AI and machine learning in medical imaging: key points from development to translation.

Samala, Ravi K; Drukker, Karen; Shukla-Dave, Amita; Chan, Heang-Ping; Sahiner, Berkman; Petrick, Nicholas; Greenspan, Hayit; Mahmood, Usman; Summers, Ronald M; Tourassi, Georgia; Deserno, Thomas M; Regge, Daniele; Näppi, Janne J; Yoshida, Hiroyuki; Huo, Zhimin; Chen, Quan; Vergara, Daniel; Cha, Kenny H; Mazurchuk, Richard; Grizzard, Kevin T; Huisman, Henkjan; Morra, Lia; Suzuki, Kenji; Armato, Samuel G; Hadjiiski, Lubomir.

BJR Artif Intell ; 1(1): ubae006, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38828430

ABSTRACT

Innovation in medical imaging artificial intelligence (AI)/machine learning (ML) demands extensive data collection, algorithmic advancements, and rigorous performance assessments encompassing aspects such as generalizability, uncertainty, bias, fairness, trustworthiness, and interpretability. Achieving widespread integration of AI/ML algorithms into diverse clinical tasks will demand a steadfast commitment to overcoming issues in model design, development, and performance assessment. The complexities of AI/ML clinical translation present substantial challenges, requiring engagement with relevant stakeholders, assessment of cost-effectiveness for user and patient benefit, timely dissemination of information relevant to robust functioning throughout the AI/ML lifecycle, consideration of regulatory compliance, and feedback loops for real-world performance evidence. This commentary addresses several hurdles for the development and adoption of AI/ML technologies in medical imaging. Comprehensive attention to these underlying and often subtle factors is critical not only for tackling the challenges but also for exploring novel opportunities for the advancement of AI in radiology.

18.

Post-contrast CT liver attenuation alone is superior to the liver-spleen difference for identifying moderate hepatic steatosis.

Pickhardt, Perry J; Blake, Glen M; Moeller, Alex; Garrett, John W; Summers, Ronald M.

Eur Radiol ; 2024 Jun 04.

Article in English | MEDLINE | ID: mdl-38834787

ABSTRACT

OBJECTIVE: To assess the diagnostic performance of post-contrast CT for predicting moderate hepatic steatosis in an older adult cohort undergoing a uniform CT protocol, utilizing hepatic and splenic attenuation values. MATERIALS AND METHODS: A total of 1676 adults (mean age, 68.4 ± 10.2 years; 1045M/631F) underwent a CT urothelial protocol that included unenhanced, portal venous, and 10-min delayed phases through the liver and spleen. Automated hepatosplenic segmentation for attenuation values (in HU) was performed using a validated deep-learning tool. Unenhanced liver attenuation < 40.0 HU, corresponding to > 15% MRI-based proton density fat, served as the reference standard for moderate steatosis. RESULTS: The prevalence of moderate or severe steatosis was 12.9% (216/1676). The diagnostic performance of portal venous liver HU in predicting moderate hepatic steatosis (AUROC = 0.943) was significantly better than the liver-spleen HU difference (AUROC = 0.814) (p < 0.001). Portal venous phase liver thresholds of 80 and 90 HU had a sensitivity/specificity for moderate steatosis of 85.6%/89.6%, and 94.9%/74.7%, respectively, whereas a liver-spleen difference of -40 HU and -10 HU had a sensitivity/specificity of 43.5%/90.0% and 92.1%/52.5%, respectively. Furthermore, livers with moderate-severe steatosis demonstrated significantly less post-contrast enhancement (mean, 35.7 HU vs 47.3 HU; p < 0.001). CONCLUSION: Moderate steatosis can be reliably diagnosed on standard portal venous phase CT using liver attenuation values alone. Consideration of splenic attenuation appears to add little value. Moderate steatosis not only has intrinsically lower pre-contrast liver attenuation values (< 40 HU), but also enhances less, typically resulting in post-contrast liver attenuation values of 80 HU or less. CLINICAL RELEVANCE STATEMENT: Moderate steatosis can be reliably diagnosed on post-contrast CT using liver attenuation values alone. Livers with at least moderate steatosis enhance less than those with mild or no steatosis, which combines with the lower intrinsic attenuation to improve detection. KEY POINTS: The liver-spleen attenuation difference is frequently utilized in routine practice but appears to have performance limitations. The liver-spleen attenuation difference is less effective than liver attenuation for moderate steatosis. Moderate and severe steatosis can be identified on standard portal venous phase CT using liver attenuation alone.

19.

Adjusting for the effect of IV contrast on automated CT body composition measures during the portal venous phase.

Moeller, Alexander R; Garrett, John W; Summers, Ronald M; Pickhardt, Perry J.

Abdom Radiol (NY) ; 49(7): 2543-2551, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38744704

ABSTRACT

OBJECTIVE: Fully-automated CT-based algorithms for quantifying numerous biomarkers have been validated for unenhanced abdominal scans. There is great interest in optimizing the documentation and reporting of biophysical measures present on all CT scans for the purposes of opportunistic screening and risk profiling. The purpose of this study was to determine and adjust the effect of intravenous (IV) contrast on these automated body composition measures at routine portal venous phase post-contrast imaging. METHODS: Final study cohort consisted of 1,612 older adults (mean age, 68.0 years; 594 women) all imaged utilizing a uniform CT urothelial protocol consisting of pre-contrast, portal venous, and delayed excretory phases. Fully-automated CT-based algorithms for quantifying numerous biomarkers, including muscle and fat area and density, bone mineral density, and solid organ volume were applied to pre-contrast and portal venous phases. The effect of IV contrast upon these body composition measures was analyzed. Regression analyses, including square of the Pearson correlation coefficient (r2), were performed for each comparison. RESULTS: We found that simple, linear relationships can be derived to determine non-contrast equivalent values from the post-contrast CT biomeasures. Excellent positive linear correlation (r2 = 0.91-0.99) between pre- and post-contrast values was observed for all automated soft tissue measures, whereas moderate positive linear correlation was observed for bone attenuation (r2 = 0.58-0.76). In general, the area- and volume-based measurement require less adjustment than attenuation-based measures, as expected. CONCLUSION: Fully-automated quantitative CT-biomarker measures at portal venous phase abdominal CT can be adjusted to a non-contrast equivalent using simple, linear relationships.

Subject(s)

Body Composition , Contrast Media , Portal Vein , Tomography, X-Ray Computed , Humans , Female , Aged , Male , Tomography, X-Ray Computed/methods , Portal Vein/diagnostic imaging , Algorithms , Radiographic Image Interpretation, Computer-Assisted/methods , Biomarkers , Middle Aged , Aged, 80 and over

20.

Segmentation of mediastinal lymph nodes in CT with anatomical priors.

Mathai, Tejas Sudharshan; Liu, Bohan; Summers, Ronald M.

Int J Comput Assist Radiol Surg ; 19(8): 1537-1544, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38740719

ABSTRACT

PURPOSE: Lymph nodes (LNs) in the chest have a tendency to enlarge due to various pathologies, such as lung cancer or pneumonia. Clinicians routinely measure nodal size to monitor disease progression, confirm metastatic cancer, and assess treatment response. However, variations in their shapes and appearances make it cumbersome to identify LNs, which reside outside of most organs. METHODS: We propose to segment LNs in the mediastinum by leveraging the anatomical priors of 28 different structures (e.g., lung, trachea etc.) generated by the public TotalSegmentator tool. The CT volumes from 89 patients available in the public NIH CT Lymph Node dataset were used to train three 3D off-the-shelf nnUNet models to segment LNs. The public St. Olavs dataset containing 15 patients (out-of-training-distribution) was used to evaluate the segmentation performance. RESULTS: For LNs with short axis diameter ≥ 8 mm, the 3D cascade nnUNet model obtained the highest Dice score of 67.9 ± 23.4 and lowest Hausdorff distance error of 22.8 ± 20.2. For LNs of all sizes, the Dice score was 58.7 ± 21.3 and this represented a ≥ 10% improvement over a recently published approach evaluated on the same test dataset. CONCLUSION: To our knowledge, we are the first to harness 28 distinct anatomical priors to segment mediastinal LNs, and our work can be extended to other nodal zones in the body. The proposed method has the potential for improved patient outcomes through the identification of enlarged nodes in initial staging CT scans.

Subject(s)

Lymph Nodes , Mediastinum , Tomography, X-Ray Computed , Humans , Lymph Nodes/diagnostic imaging , Mediastinum/diagnostic imaging , Tomography, X-Ray Computed/methods , Imaging, Three-Dimensional/methods , Lung Neoplasms/diagnostic imaging , Lung Neoplasms/pathology , Lymphatic Metastasis/diagnostic imaging

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL