Search | Brasil - Virtual Health Library

1.

Weakly supervised detection of pheochromocytomas and paragangliomas in CT using noisy data.

Oluigbo, David; Mathai, Tejas Sudharshan; Santra, Bikash; Mukherjee, Pritam; Liu, Jianfei; Jha, Abhishek; Patel, Mayank; Pacak, Karel; Summers, Ronald M.

Comput Med Imaging Graph ; 116: 102419, 2024 Jul 20.

Article in English | MEDLINE | ID: mdl-39053035

ABSTRACT

Pheochromocytomas and Paragangliomas (PPGLs) are rare adrenal and extra-adrenal tumors that have metastatic potential. Management of patients with PPGLs mainly depends on the makeup of their genetic cluster: SDHx, VHL/EPAS1, kinase, and sporadic. CT is the preferred modality for precise localization of PPGLs, such that their metastatic progression can be assessed. However, the variable size, morphology, and appearance of these tumors in different anatomical regions can pose challenges for radiologists. Since radiologists must routinely track changes across patient visits, manual annotation of PPGLs is quite time-consuming and cumbersome to do across all axial slices in a CT volume. As such, PPGLs are only weakly annotated on axial slices by radiologists in the form of RECIST measurements. To ameliorate the manual effort spent by radiologists, we propose a method for the automated detection of PPGLs in CT via a proxy segmentation task. Weak 3D annotations (derived from 2D bounding boxes) were used to train both 2D and 3D nnUNet models to detect PPGLs via segmentation. We evaluated our approaches on an in-house dataset comprised of chest-abdomen-pelvis CTs of 255 patients with confirmed PPGLs. On a test set of 53 CT volumes, our 3D nnUNet model achieved a detection precision of 70% and sensitivity of 64.1%, and outperformed the 2D model that obtained a precision of 52.7% and sensitivity of 27.5% (p< 0.05). SDHx and sporadic genetic clusters achieved the highest precisions of 73.1% and 72.7% respectively. Our state-of-the art findings highlight the promising nature of the challenging task of automated PPGL detection.

2.

Leveraging Professional Radiologists' Expertise to Enhance LLMs' Evaluation for Radiology Reports.

Zhu, Qingqing; Chen, Xiuying; Jin, Qiao; Hou, Benjamin; Mathai, Tejas Sudharshan; Mukherjee, Pritam; Gao, Xin; Summers, Ronald M; Lu, Zhiyong.

ArXiv ; 2024 Feb 17.

Article in English | MEDLINE | ID: mdl-38903745

ABSTRACT

In radiology, Artificial Intelligence (AI) has significantly advanced report generation, but automatic evaluation of these AI-produced reports remains challenging. Current metrics, such as Conventional Natural Language Generation (NLG) and Clinical Efficacy (CE), often fall short in capturing the semantic intricacies of clinical contexts or overemphasize clinical details, undermining report clarity. To overcome these issues, our proposed method synergizes the expertise of professional radiologists with Large Language Models (LLMs), like GPT-3.5 and GPT-4. Utilizing In-Context Instruction Learning (ICIL) and Chain of Thought (CoT) reasoning, our approach aligns LLM evaluations with radiologist standards, enabling detailed comparisons between human and AI-generated reports. This is further enhanced by a Regression model that aggregates sentence evaluation scores. Experimental results show that our "Detailed GPT-4 (5-shot)" model achieves a 0.48 score, outperforming the METEOR metric by 0.19, while our "Regressed GPT-4" model shows even greater alignment with expert evaluations, exceeding the best existing metric by a 0.35 margin. Moreover, the robustness of our explanations has been validated through a thorough iterative strategy. We plan to publicly release annotations from radiology experts, setting a new standard for accuracy in future assessments. This underscores the potential of our approach in enhancing the quality assessment of AI-driven medical reports.

3.

A Comparison of CT-Based Pancreatic Segmentation Deep Learning Models.

Suri, Abhinav; Mukherjee, Pritam; Pickhardt, Perry J; Summers, Ronald M.

Acad Radiol ; 2024 Jun 28.

Article in English | MEDLINE | ID: mdl-38944630

ABSTRACT

RATIONALE AND OBJECTIVES: Pancreas segmentation accuracy at CT is critical for the identification of pancreatic pathologies and is essential for the development of imaging biomarkers. Our objective was to benchmark the performance of five high-performing pancreas segmentation models across multiple metrics stratified by scan and patient/pancreatic characteristics that may affect segmentation performance. MATERIALS AND METHODS: In this retrospective study, PubMed and ArXiv searches were conducted to identify pancreas segmentation models which were then evaluated on a set of annotated imaging datasets. Results (Dice score, Hausdorff distance [HD], average surface distance [ASD]) were stratified by contrast status and quartiles of peri-pancreatic attenuation (5 mm region around pancreas). Multivariate regression was performed to identify imaging characteristics and biomarkers (n = 9) that were significantly associated with Dice score. RESULTS: Five pancreas segmentation models were identified: Abdomen Atlas [AAUNet, AASwin, trained on 8448 scans], TotalSegmentator [TS, 1204 scans], nnUNetv1 [MSD-nnUNet, 282 scans], and a U-Net based model for predicting diabetes [DM-UNet, 427 scans]. These were evaluated on 352 CT scans (30 females, 25 males, 297 sex unknown; age 58 ± 7 years [ ± 1 SD], 327 age unknown) from 2000-2023. Overall, TS, AAUNet, and AASwin were the best performers, Dice= 80 ± 11%, 79 ± 16%, and 77 ± 18%, respectively (pairwise Sidak test not-significantly different). AASwin and MSD-nnUNet performed worse (for all metrics) on non-contrast scans (vs contrast, P < .001). The worst performer was DM-UNet (Dice=67 ± 16%). All algorithms except TS showed lower Dice scores with increasing peri-pancreatic attenuation (P < .01). Multivariate regression showed non-contrast scans, (P < .001; MSD-nnUNet), smaller pancreatic length (P = .005, MSD-nnUNet), and height (P = .003, DM-UNet) were associated with lower Dice scores. CONCLUSION: The convolutional neural network-based models trained on a diverse set of scans performed best (TS, AAUnet, and AASwin). TS performed equivalently to AAUnet and AASwin with only 13% of the training set size (8488 vs 1204 scans). Though trained on the same dataset, a transformer network (AASwin) had poorer performance on non-contrast scans whereas its convolutional network counterpart (AAUNet) did not. This study highlights how aggregate assessment metrics of pancreatic segmentation algorithms seen in other literature are not enough to capture differential performance across common patient and scanning characteristics in clinical populations.

4.

MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRI.

Zhuang, Yan; Mathai, Tejas Sudharshan; Mukherjee, Pritam; Khoury, Brandon; Kim, Boah; Hou, Benjamin; Rabbee, Nusrat; Suri, Abhinav; Summers, Ronald M.

ArXiv ; 2024 Jun 24.

Article in English | MEDLINE | ID: mdl-38903743

ABSTRACT

BACKGROUND: Segmentation of organs and structures in abdominal MRI is useful for many clinical applications, such as disease diagnosis and radiotherapy. Current approaches have focused on delineating a limited set of abdominal structures (13 types). To date, there is no publicly available abdominal MRI dataset with voxel-level annotations of multiple organs and structures. Consequently, a segmentation tool for multi-structure segmentation is also unavailable. METHODS: We curated a T1-weighted abdominal MRI dataset consisting of 195 patients who underwent imaging at National Institutes of Health (NIH) Clinical Center. The dataset comprises of axial pre-contrast T1, arterial, venous, and delayed phases for each patient, thereby amounting to a total of 780 series (69,248 2D slices). Each series contains voxel-level annotations of 62 abdominal organs and structures. A 3D nnUNet model, dubbed as MRISegmentator-Abdomen (MRISegmentator in short), was trained on this dataset, and evaluation was conducted on an internal test set and two large external datasets: AMOS22 and Duke Liver. The predicted segmentations were compared against the ground-truth using the Dice Similarity Coefficient (DSC) and Normalized Surface Distance (NSD). FINDINGS: MRISegmentator achieved an average DSC of 0.861$\pm$0.170 and a NSD of 0.924$\pm$0.163 in the internal test set. On the AMOS22 dataset, MRISegmentator attained an average DSC of 0.829$\pm$0.133 and a NSD of 0.908$\pm$0.067. For the Duke Liver dataset, an average DSC of 0.933$\pm$0.015 and a NSD of 0.929$\pm$0.021 was obtained. INTERPRETATION: The proposed MRISegmentator provides automatic, accurate, and robust segmentations of 62 organs and structures in T1-weighted abdominal MRI sequences. The tool has the potential to accelerate research on various clinical topics, such as abnormality detection, radiotherapy, disease classification among others.

5.

Enhancing chest X-ray datasets with privacy-preserving large language models and multi-type annotations: a data-driven approach for improved classification.

Lanfredi, Ricardo Bigolin; Mukherjee, Pritam; Summers, Ronald.

ArXiv ; 2024 Mar 06.

Article in English | MEDLINE | ID: mdl-38711436

ABSTRACT

In chest X-ray (CXR) image analysis, rule-based systems are usually employed to extract labels from reports, but concerns exist about label quality. These datasets typically offer only presence labels, sometimes with binary uncertainty indicators, which limits their usefulness. In this work, we present MAPLEZ (Medical report Annotations with Privacy-preserving Large language model using Expeditious Zero shot answers), a novel approach leveraging a locally executable Large Language Model (LLM) to extract and enhance findings labels on CXR reports. MAPLEZ extracts not only binary labels indicating the presence or absence of a finding but also the location, severity, and radiologists' uncertainty about the finding. Over eight abnormalities from five test sets, we show that our method can extract these annotations with an increase of 5 percentage points (pp) in F1 score for categorical presence annotations and more than 30 pp increase in F1 score for the location annotations over competing labelers. Additionally, using these improved annotations in classification supervision, we demonstrate substantial advancements in model quality, with an increase of 1.7 pp in AUROC over models trained with annotations from the state-of-the-art approach. We share code and annotations.

6.

Weakly Supervised Detection of Pheochromocytomas and Paragangliomas in CT.

Oluigbo, David C; Santra, Bikash; Mathai, Tejas Sudharshan; Mukherjee, Pritam; Liu, Jianfei; Jha, Abhishek; Patel, Mayank; Pacak, Karel; Summers, Ronald M.

ArXiv ; 2024 Feb 12.

Article in English | MEDLINE | ID: mdl-38529074

ABSTRACT

Pheochromocytomas and Paragangliomas (PPGLs) are rare adrenal and extra-adrenal tumors which have the potential to metastasize. For the management of patients with PPGLs, CT is the preferred modality of choice for precise localization and estimation of their progression. However, due to the myriad variations in size, morphology, and appearance of the tumors in different anatomical regions, radiologists are posed with the challenge of accurate detection of PPGLs. Since clinicians also need to routinely measure their size and track their changes over time across patient visits, manual demarcation of PPGLs is quite a time-consuming and cumbersome process. To ameliorate the manual effort spent for this task, we propose an automated method to detect PPGLs in CT studies via a proxy segmentation task. As only weak annotations for PPGLs in the form of prospectively marked 2D bounding boxes on an axial slice were available, we extended these 2D boxes into weak 3D annotations and trained a 3D full-resolution nnUNet model to directly segment PPGLs. We evaluated our approach on a dataset consisting of chest-abdomen-pelvis CTs of 255 patients with confirmed PPGLs. We obtained a precision of 70% and sensitivity of 64.1% with our proposed approach when tested on 53 CT studies. Our findings highlight the promising nature of detecting PPGLs via segmentation, and furthers the state-of-the-art in this exciting yet challenging area of rare cancer management.

7.

Automated Classification of Body MRI Sequence Type Using Convolutional Neural Networks.

Helm, Kimberly; Mathai, Tejas Sudharshan; Kim, Boah; Mukherjee, Pritam; Liu, Jianfei; Summers, Ronald M.

ArXiv ; 2024 Feb 12.

Article in English | MEDLINE | ID: mdl-38529076

ABSTRACT

Multi-parametric MRI of the body is routinely acquired for the identification of abnormalities and diagnosis of diseases. However, a standard naming convention for the MRI protocols and associated sequences does not exist due to wide variations in imaging practice at institutions and myriad MRI scanners from various manufacturers being used for imaging. The intensity distributions of MRI sequences differ widely as a result, and there also exists information conflicts related to the sequence type in the DICOM headers. At present, clinician oversight is necessary to ensure that the correct sequence is being read and used for diagnosis. This poses a challenge when specific series need to be considered for building a cohort for a large clinical study or for developing AI algorithms. In order to reduce clinician oversight and ensure the validity of the DICOM headers, we propose an automated method to classify the 3D MRI sequence acquired at the levels of the chest, abdomen, and pelvis. In our pilot work, our 3D DenseNet-121 model achieved an F1 score of 99.5% at differentiating 5 common MRI sequences obtained by three Siemens scanners (Aera, Verio, Biograph mMR). To the best of our knowledge, we are the first to develop an automated method for the 3D classification of MRI sequences in the chest, abdomen, and pelvis, and our work has outperformed the previous state-of-the-art MRI series classifiers.

8.

Segmentation of pelvic structures in T2 MRI via MR-to-CT synthesis.

Zhuang, Yan; Mathai, Tejas Sudharshan; Mukherjee, Pritam; Summers, Ronald M.

Comput Med Imaging Graph ; 112: 102335, 2024 03.

Article in English | MEDLINE | ID: mdl-38271870

ABSTRACT

Segmentation of multiple pelvic structures in MRI volumes is a prerequisite for many clinical applications, such as sarcopenia assessment, bone density measurement, and muscle-to-fat volume ratio estimation. While many CT-specific datasets and automated CT-based multi-structure pelvis segmentation methods exist, there are few MRI-specific multi-structure segmentation methods in literature. In this pilot work, we propose a lightweight and annotation-free pipeline to synthetically translate T2 MRI volumes of the pelvis to CT, and subsequently leverage an existing CT-only tool called TotalSegmentator to segment 8 pelvic structures in the generated CT volumes. The predicted masks were then mapped back to the original MR volumes as segmentation masks. We compared the predicted masks against the expert annotations of the public TCGA-UCEC dataset and an internal dataset. Experiments demonstrated that the proposed pipeline achieved Dice measures ≥65% for 8 pelvic structures in T2 MRI. The proposed pipeline is an alternative method to obtain multi-organ and structure segmentations without being encumbered by time-consuming manual annotations. By exploiting the significant research progress in CTs, it is possible to extend the proposed pipeline to other MRI sequences in principle. Our research bridges the chasm between the current CT-based multi-structure segmentation and MRI-based segmentation. The manually segmented structures in the TCGA-UCEC dataset are publicly available.

Subject(s)

Image Processing, Computer-Assisted , Pelvis , Image Processing, Computer-Assisted/methods , Pelvis/diagnostic imaging , Tomography, X-Ray Computed , Magnetic Resonance Imaging/methods

9.

Longitudinal follow-up of incidental renal calculi on computed tomography.

Mukherjee, Pritam; Lee, Sungwon; Elton, Daniel C; Pickhardt, Perry J; Summers, Ronald M.

Abdom Radiol (NY) ; 49(1): 173-181, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37906271

ABSTRACT

RATIONALE AND OBJECTIVES: Measuring small kidney stones on CT is a time-consuming task often neglected. Volumetric assessment provides a better measure of size than linear dimensions. Our objective is to analyze the growth rate and prognosis of incidental kidney stones in asymptomatic patients on CT. MATERIALS AND METHODS: This retrospective study included 4266 scans from 2030 asymptomatic patients who underwent two or more nonenhanced CT scans for colorectal screening between 2004 and 2016. The DL software identified and measured the volume, location, and attenuation of 883 stones. The corresponding scans were manually evaluated, and patients without follow-up were excluded. At each follow-up, the stones were categorized as new, growing, persistent, or resolved. Stone size (volume and diameter), attenuation, and location were correlated with the outcome and growth rates of the stones. RESULTS: The stone cohort comprised 407 scans from 189 (M: 124, F: 65, median age: 55.4 years) patients. The median number of stones per scan was 1 (IQR: [1, 2]). The median stone volume was 17.1 mm3 (IQR: [7.4, 43.6]) and the median peak attenuation was 308 HU (IQR: [204, 532]. The 189 initial scans contained 291stones; 91 (31.3%) resolved, 142 (48.8%) grew, and 58 (19.9) remained persistent at the first follow-up. At the second follow-up (for 27 patients with 2 follow-ups), 14/44 (31.8%) stones had resolved, 19/44 (43.2%) grew and 11/44 (25%) were persistent. The median growth rate of growing stones was 3.3 mm3/year, IQR: [1.4,7.4]. Size and attenuation had a moderate correlation (Spearman rho 0.53, P < .001 for volume, and 0.50 P < .001 for peak attenuation) with the growth rate. Growing and persistent stones had significantly greater maximum axial diameter (2.7 vs 2.3 mm, P =.047) and peak attenuation (300 vs 258 HU, P =.031) CONCLUSION: We report a 12.7% prevalence of incidental kidney stones in asymptomatic adults, of which about half grew during follow-up with a median growth rate of about 3.3 mm3/year.

Subject(s)

Kidney Calculi , Adult , Humans , Middle Aged , Follow-Up Studies , Retrospective Studies , Kidney Calculi/diagnostic imaging , Tomography, X-Ray Computed/methods , Kidney

10.

Automated detection of incidental abdominal aortic aneurysms on computed tomography.

Chatterjee, Devina; Shen, Thomas C; Mukherjee, Pritam; Lee, Sungwon; Garrett, John W; Zacharias, Nicholas; Pickhardt, Perry J; Summers, Ronald M.

Abdom Radiol (NY) ; 49(2): 642-650, 2024 Feb.

Article in English | MEDLINE | ID: mdl-38091064

ABSTRACT

PURPOSE: To detect and assess abdominal aortic aneurysms (AAAs) on CT in a large asymptomatic adult patient population using fully-automated deep learning software. MATERIALS AND METHODS: The abdominal aorta was segmented using a fully-automated deep learning model trained on 66 manually-segmented abdominal CT scans from two datasets. The axial diameters of the segmented aorta were extracted to detect the presence of AAAs-maximum axial aortic diameter greater than 3 cm were labeled as AAA positive. The trained system was then externally-validated on CT colonography scans of 9172 asymptomatic outpatients (mean age, 57 years) referred for colorectal cancer screening. Using a previously-validated automated calcified atherosclerotic plaque detector, we correlated abdominal aortic Agatston and volume scores with the presence of AAA. RESULTS: The deep learning software detected AAA on the external validation dataset with a sensitivity, specificity, and AUC of 96%, (95% CI 89%, 100%), 96% (96%, 97%), and 99% (98%, 99%) respectively. The Agatston and volume scores of reported AAA-positive cases were statistically significantly greater than those of reported AAA-negative cases (p < 0.0001). Using plaque alone as a AAA detector, at a threshold Agatston score of 2871, the sensitivity and specificity were 84% (73%, 94%) and 87% (86%, 87%), respectively. CONCLUSION: Fully-automated detection and assessment of AAA on CT is feasible and accurate. There was a strong statistical association between the presence of AAA and the quantity of abdominal aortic calcified atherosclerotic plaque.

Subject(s)

Aortic Aneurysm, Abdominal , Plaque, Atherosclerotic , Adult , Humans , Middle Aged , Aortic Aneurysm, Abdominal/diagnostic imaging , Aortic Aneurysm, Abdominal/epidemiology , Aorta, Abdominal/diagnostic imaging , Tomography, X-Ray Computed , Sensitivity and Specificity

11.

Feasibility of Using the Privacy-preserving Large Language Model Vicuna for Labeling Radiology Reports.

Mukherjee, Pritam; Hou, Benjamin; Lanfredi, Ricardo B; Summers, Ronald M.

Radiology ; 309(1): e231147, 2023 10.

Article in English | MEDLINE | ID: mdl-37815442

ABSTRACT

Background Large language models (LLMs) such as ChatGPT, though proficient in many text-based tasks, are not suitable for use with radiology reports due to patient privacy constraints. Purpose To test the feasibility of using an alternative LLM (Vicuna-13B) that can be run locally for labeling radiography reports. Materials and Methods Chest radiography reports from the MIMIC-CXR and National Institutes of Health (NIH) data sets were included in this retrospective study. Reports were examined for 13 findings. Outputs reporting the presence or absence of the 13 findings were generated by Vicuna by using a single-step or multistep prompting strategy (prompts 1 and 2, respectively). Agreements between Vicuna outputs and CheXpert and CheXbert labelers were assessed using Fleiss κ. Agreement between Vicuna outputs from three runs under a hyperparameter setting that introduced some randomness (temperature, 0.7) was also assessed. The performance of Vicuna and the labelers was assessed in a subset of 100 NIH reports annotated by a radiologist with use of area under the receiver operating characteristic curve (AUC). Results A total of 3269 reports from the MIMIC-CXR data set (median patient age, 68 years [IQR, 59-79 years]; 161 male patients) and 25 596 reports from the NIH data set (median patient age, 47 years [IQR, 32-58 years]; 1557 male patients) were included. Vicuna outputs with prompt 2 showed, on average, moderate to substantial agreement with the labelers on the MIMIC-CXR (κ median, 0.57 [IQR, 0.45-0.66] with CheXpert and 0.64 [IQR, 0.45-0.68] with CheXbert) and NIH (κ median, 0.52 [IQR, 0.41-0.65] with CheXpert and 0.55 [IQR, 0.41-0.74] with CheXbert) data sets, respectively. Vicuna with prompt 2 performed at par (median AUC, 0.84 [IQR, 0.74-0.93]) with both labelers on nine of 11 findings. Conclusion In this proof-of-concept study, outputs of the LLM Vicuna reporting the presence or absence of 13 findings on chest radiography reports showed moderate to substantial agreement with existing labelers. © RSNA, 2023 Supplemental material is available for this article. See also the editorial by Cai in this issue.

Subject(s)

Camelids, New World , Radiology , United States , Humans , Male , Animals , Aged , Middle Aged , Privacy , Feasibility Studies , Retrospective Studies , Language

12.

Performance of alternative manual and automated deep learning segmentation techniques for the prediction of benign and malignant lung nodules.

Selby, Heather M; Mukherjee, Pritam; Parham, Christopher; Malik, Sachin B; Gevaert, Olivier; Napel, Sandy; Shah, Rajesh P.

J Med Imaging (Bellingham) ; 10(4): 044006, 2023 Jul.

Article in English | MEDLINE | ID: mdl-37564098

ABSTRACT

Purpose: We aim to evaluate the performance of radiomic biopsy (RB), best-fit bounding box (BB), and a deep-learning-based segmentation method called no-new-U-Net (nnU-Net), compared to the standard full manual (FM) segmentation method for predicting benign and malignant lung nodules using a computed tomography (CT) radiomic machine learning model. Materials and Methods: A total of 188 CT scans of lung nodules from 2 institutions were used for our study. One radiologist identified and delineated all 188 lung nodules, whereas a second radiologist segmented a subset (n=20) of these nodules. Both radiologists employed FM and RB segmentation methods. BB segmentations were generated computationally from the FM segmentations. The nnU-Net, a deep-learning-based segmentation method, performed automatic nodule detection and segmentation. The time radiologists took to perform segmentations was recorded. Radiomic features were extracted from each segmentation method, and models to predict benign and malignant lung nodules were developed. The Kruskal-Wallis and DeLong tests were used to compare segmentation times and areas under the curve (AUC), respectively. Results: For the delineation of the FM, RB, and BB segmentations, the two radiologists required a median time (IQR) of 113 (54 to 251.5), 21 (9.25 to 38), and 16 (12 to 64.25) s, respectively (p=0.04). In dataset 1, the mean AUC (95% CI) of the FM, RB, BB, and nnU-Net model were 0.964 (0.96 to 0.968), 0.985 (0.983 to 0.987), 0.961 (0.956 to 0.965), and 0.878 (0.869 to 0.888). In dataset 2, the mean AUC (95% CI) of the FM, RB, BB, and nnU-Net model were 0.717 (0.705 to 0.729), 0.919 (0.913 to 0.924), 0.699 (0.687 to 0.711), and 0.644 (0.632 to 0.657). Conclusion: Radiomic biopsy-based models outperformed FM and BB models in prediction of benign and malignant lung nodules in two independent datasets while deep-learning segmentation-based models performed similarly to FM and BB. RB could be a more efficient segmentation method, but further validation is needed.

13.

Utilizing Longitudinal Chest X-Rays and Reports to Pre-Fill Radiology Reports.

Zhu, Qingqing; Mathai, Tejas Sudharshan; Mukherjee, Pritam; Peng, Yifan; Summers, Ronald M; Lu, Zhiyong.

ArXiv ; 2023 Oct 10.

Article in English | MEDLINE | ID: mdl-37502627

ABSTRACT

Despite the reduction in turn-around times in radiology reporting with the use of speech recognition software, persistent communication errors can significantly impact the interpretation of radiology reports. Pre-filling a radiology report holds promise in mitigating reporting errors, and despite multiple efforts in literature to generate comprehensive medical reports, there lacks approaches that exploit the longitudinal nature of patient visit records in the MIMIC-CXR dataset. To address this gap, we propose to use longitudinal multi-modal data, i.e., previous patient visit CXR, current visit CXR, and the previous visit report, to pre-fill the "findings" section of the patient's current visit. We first gathered the longitudinal visit information for 26,625 patients from the MIMIC-CXR dataset, and created a new dataset called Longitudinal-MIMIC. With this new dataset, a transformer-based model was trained to capture the multi-modal longitudinal information from patient visit records (CXR images + reports) via a cross-attention-based multi-modal fusion module and a hierarchical memory-driven decoder. In contrast to previous works that only uses current visit data as input to train a model, our work exploits the longitudinal information available to pre-fill the "findings" section of radiology reports. Experiments show that our approach outperforms several recent approaches. Code will be published at https://github.com/CelestialShine/Longitudinal-Chest-X-Ray.

14.

SCOPE: predicting future diagnoses in office visits using electronic health records.

Mukherjee, Pritam; Humbert-Droz, Marie; Chen, Jonathan H; Gevaert, Olivier.

Sci Rep ; 13(1): 11005, 2023 07 07.

Article in English | MEDLINE | ID: mdl-37419945

ABSTRACT

We propose an interpretable and scalable model to predict likely diagnoses at an encounter based on past diagnoses and lab results. This model is intended to aid physicians in their interaction with the electronic health records (EHR). To accomplish this, we retrospectively collected and de-identified EHR data of 2,701,522 patients at Stanford Healthcare over a time period from January 2008 to December 2016. A population-based sample of patients comprising 524,198 individuals (44% M, 56% F) with multiple encounters with at least one frequently occurring diagnosis codes were chosen. A calibrated model was developed to predict ICD-10 diagnosis codes at an encounter based on the past diagnoses and lab results, using a binary relevance based multi-label modeling strategy. Logistic regression and random forests were tested as the base classifier, and several time windows were tested for aggregating the past diagnoses and labs. This modeling approach was compared to a recurrent neural network based deep learning method. The best model used random forest as the base classifier and integrated demographic features, diagnosis codes, and lab results. The best model was calibrated and its performance was comparable or better than existing methods in terms of various metrics, including a median AUROC of 0.904 (IQR [0.838, 0.954]) over 583 diseases. When predicting the first occurrence of a disease label for a patient, the median AUROC with the best model was 0.796 (IQR [0.737, 0.868]). Our modeling approach performed comparably as the tested deep learning method, outperforming it in terms of AUROC (p < 0.001) but underperforming in terms of AUPRC (p < 0.001). Interpreting the model showed that the model uses meaningful features and highlights many interesting associations among diagnoses and lab results. We conclude that the multi-label model performs comparably with RNN based deep learning model while offering simplicity and potentially superior interpretability. While the model was trained and validated on data obtained from a single institution, its simplicity, interpretability and performance makes it a promising candidate for deployment.

Subject(s)

Electronic Health Records , Neural Networks, Computer , Humans , Retrospective Studies , Forecasting , Logistic Models

15.

Fully Automated Longitudinal Assessment of Renal Stone Burden on Serial CT Imaging Using Deep Learning.

Mukherjee, Pritam; Lee, Sungwon; Elton, Daniel C; Nakada, Stephen Y; Pickhardt, Perry J; Summers, Ronald M.

J Endourol ; 37(8): 948-955, 2023 08.

Article in English | MEDLINE | ID: mdl-37310890

ABSTRACT

Purpose: Use deep learning (DL) to automate the measurement and tracking of kidney stone burden over serial CT scans. Materials and Methods: This retrospective study included 259 scans from 113 symptomatic patients being treated for urolithiasis at a single medical center between 2006 and 2019. These patients underwent a standard low-dose noncontrast CT scan followed by ultra-low-dose CT scans limited to the level of the kidneys. A DL model was used to detect, segment, and measure the volume of all stones in both initial and follow-up scans. The stone burden was characterized by the total volume of all stones in a scan (SV). The absolute and relative change of SV, (SVA and SVR, respectively) over serial scans were computed. The automated assessments were compared with manual assessments using concordance correlation coefficient (CCC), and their agreement was visualized using Bland-Altman and scatter plots. Results: Two hundred twenty-eight out of 233 scans with stones were identified by the automated pipeline; per-scan sensitivity was 97.8% (95% confidence interval [CI]: 96.0-99.7). The per-scan positive predictive value was 96.6% (95% CI: 94.4-98.8). The median SV, SVA, and SVR were 476.5 mm3, -10 mm3, and 0.89, respectively. After removing outliers outside the 5th and 95th percentiles, the CCC measuring agreement on SV, SVA, and SVR were 0.995 (0.992-0.996), 0.980 (0.972-0.986), and 0.915 (0.881-0.939), respectively Conclusions: The automated DL-based measurements showed good agreement with the manual assessments of the stone burden and its interval change on serial CT scans.

Subject(s)

Deep Learning , Kidney Calculi , Urolithiasis , Humans , Retrospective Studies , Kidney Calculi/diagnostic imaging , Tomography, X-Ray Computed/methods

16.

Early Detection of Lung Cancer in the NLST Dataset.

Mukherjee, Pritam; Brezhneva, Anna; Napel, Sandy; Gevaert, Olivier.

medRxiv ; 2023 Mar 02.

Article in English | MEDLINE | ID: mdl-36909593

ABSTRACT

Lung Cancer is the leading cause of cancer mortality in the U.S. The effectiveness of standard treatments, including surgery, chemotherapy or radiotherapy, depends on several factors like type and stage of cancer, with the survival rate being much worse for later cancer stages. The National Lung Screening Trial (NLST) established that patients screened using low-dose Computed Tomography (CT) had a 15 to 20 percent lower risk of dying from lung cancer than patients screened using chest X-rays. While CT excelled at detecting small early stage malignant nodules, a large proportion of patients (> 25%) screened positive and only a small fraction (< 10%) of these positive screens actually had or developed cancer in the subsequent years. We developed a model to distinguish between high and low risk patients among the positive screens, predicting the likelihood of having or developing lung cancer at the current time point or in subsequent years non-invasively, based on current and previous CT imaging data. However, most of the nodules in NLST are very small, and nodule segmentations or even precise locations are unavailable. Our model comprises two stages: the first stage is a neural network model trained on the Lung Image Database Consortium (LIDC-IDRI) cohort which detects nodules and assigns them malignancy scores. The second part of our model is a boosted tree which outputs a cancer probability for a patient based on the nodule information (location and malignancy score) predicted by the first stage. Our model, built on a subset of the NLST cohort (n = 1138) shows excellent performance, achieving an area under the receiver operating characteristics curve (ROC AUC) of 0.85 when predicting based on CT images from all three time points available in the NLST dataset.

17.

Multimodal deep learning to predict prognosis in adult and pediatric brain tumors.

Steyaert, Sandra; Qiu, Yeping Lina; Zheng, Yuanning; Mukherjee, Pritam; Vogel, Hannes; Gevaert, Olivier.

Commun Med (Lond) ; 3(1): 44, 2023 Mar 29.

Article in English | MEDLINE | ID: mdl-36991216

ABSTRACT

BACKGROUND: The introduction of deep learning in both imaging and genomics has significantly advanced the analysis of biomedical data. For complex diseases such as cancer, different data modalities may reveal different disease characteristics, and the integration of imaging with genomic data has the potential to unravel additional information than when using these data sources in isolation. Here, we propose a DL framework that combines these two modalities with the aim to predict brain tumor prognosis. METHODS: Using two separate glioma cohorts of 783 adults and 305 pediatric patients we developed a DL framework that can fuse histopathology images with gene expression profiles. Three strategies for data fusion were implemented and compared: early, late, and joint fusion. Additional validation of the adult glioma models was done on an independent cohort of 97 adult patients. RESULTS: Here we show that the developed multimodal data models achieve better prediction results compared to the single data models, but also lead to the identification of more relevant biological pathways. When testing our adult models on a third brain tumor dataset, we show our multimodal framework is able to generalize and performs better on new data from different cohorts. Leveraging the concept of transfer learning, we demonstrate how our pediatric multimodal models can be used to predict prognosis for two more rare (less available samples) pediatric brain tumors. CONCLUSIONS: Our study illustrates that a multimodal data fusion approach can be successfully implemented and customized to model clinical outcome of adult and pediatric brain tumors.

An increasing amount of complex patient data is generated when treating patients with cancer, including histopathology data (where the appearance of a tumor is examined under a microscope) and molecular data (such as analysis of a tumor's genetic material). Computational methods to integrate these data types might help us to predict outcomes in patients with cancer. Here, we propose a deep learning method which involves computer software learning from patterns in the data, to combine histopathology and molecular data to predict outcomes in patients with brain cancers. Using three cohorts of patients, we show that our method combining the different datasets performs better than models using one data type. Methods like ours might help clinicians to better inform patients about their prognosis and make decisions about their care.

18.

Machine intelligence for radiation science: summary of the Radiation Research Society 67th annual meeting symposium.

Wilson, Lydia J; Kiffer, Frederico C; Berrios, Daniel C; Bryce-Atkinson, Abigail; Costes, Sylvain V; Gevaert, Olivier; Matarèse, Bruno F E; Miller, Jack; Mukherjee, Pritam; Peach, Kristen; Schofield, Paul N; Slater, Luke T; Langen, Britta.

Int J Radiat Biol ; 99(8): 1291-1300, 2023.

Article in English | MEDLINE | ID: mdl-36735963

ABSTRACT

The era of high-throughput techniques created big data in the medical field and research disciplines. Machine intelligence (MI) approaches can overcome critical limitations on how those large-scale data sets are processed, analyzed, and interpreted. The 67th Annual Meeting of the Radiation Research Society featured a symposium on MI approaches to highlight recent advancements in the radiation sciences and their clinical applications. This article summarizes three of those presentations regarding recent developments for metadata processing and ontological formalization, data mining for radiation outcomes in pediatric oncology, and imaging in lung cancer.

Subject(s)

Artificial Intelligence , Lung Neoplasms , Child , Humans , Big Data , Data Mining

19.

Identifying key multifunctional components shared by critical cancer and normal liver pathways via SparseGMM.

Bakr, Shaimaa; Brennan, Kevin; Mukherjee, Pritam; Argemi, Josepmaria; Hernaez, Mikel; Gevaert, Olivier.

Cell Rep Methods ; 3(1): 100392, 2023 01 23.

Article in English | MEDLINE | ID: mdl-36814838

ABSTRACT

Despite the abundance of multimodal data, suitable statistical models that can improve our understanding of diseases with genetic underpinnings are challenging to develop. Here, we present SparseGMM, a statistical approach for gene regulatory network discovery. SparseGMM uses latent variable modeling with sparsity constraints to learn Gaussian mixtures from multiomic data. By combining coexpression patterns with a Bayesian framework, SparseGMM quantitatively measures confidence in regulators and uncertainty in target gene assignment by computing gene entropy. We apply SparseGMM to liver cancer and normal liver tissue data and evaluate discovered gene modules in an independent single-cell RNA sequencing (scRNA-seq) dataset. SparseGMM identifies PROCR as a regulator of angiogenesis and PDCD1LG2 and HNF4A as regulators of immune response and blood coagulation in cancer. Furthermore, we show that more genes have significantly higher entropy in cancer compared with normal liver. Among high-entropy genes are key multifunctional components shared by critical pathways, including p53 and estrogen signaling.

Subject(s)

Gene Expression Profiling , Liver Neoplasms , Humans , Bayes Theorem , Gene Regulatory Networks , Liver Neoplasms/genetics

20.

Topological data analysis of thoracic radiographic images shows improved radiomics-based lung tumor histology prediction.

Vandaele, Robin; Mukherjee, Pritam; Selby, Heather Marie; Shah, Rajesh Pravin; Gevaert, Olivier.

Patterns (N Y) ; 4(1): 100657, 2023 Jan 13.

Article in English | MEDLINE | ID: mdl-36699734

ABSTRACT

Topological data analysis provides tools to capture wide-scale structural shape information in data. Its main method, persistent homology, has found successful applications to various machine-learning problems. Despite its recent gain in popularity, much of its potential for medical image analysis remains undiscovered. We explore the prominent learning problems on thoracic radiographic images of lung tumors for which persistent homology improves radiomic-based learning. It turns out that our topological features well capture complementary information important for benign versus malignant and adenocarcinoma versus squamous cell carcinoma tumor prediction while contributing less consistently to small cell versus non-small cell-an interesting result in its own right. Furthermore, while radiomic features are better for predicting malignancy scores assigned by expert radiologists through visual inspection, we find that topological features are better for predicting more accurate histology assessed through long-term radiology review, biopsy, surgical resection, progression, or response.

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL