Results 1 - 20 of 314
1.
IEEE Trans Med Imaging ; PP, 2024 Aug 02.
Article in English | MEDLINE | ID: mdl-39093684

ABSTRACT

Deformable image registration is one of the essential processes in analyzing medical images. In particular, when diagnosing abdominal diseases such as hepatic cancer and lymphoma, multi-domain images scanned with different modalities or different imaging protocols are often used. However, these images are typically misaligned owing to differences in scan timing, patient breathing, movement, etc. Although recent learning-based approaches can provide deformations in real time with high performance, multi-domain abdominal image registration using deep learning remains challenging because images from different domains have different characteristics, such as image contrast and intensity ranges. To address this, we propose a novel unsupervised multi-domain image registration framework using neural optimal transport, dubbed OTMorph. Given moving and fixed volumes as input, a transport module learns the optimal transport plan that maps the data distribution of the moving volume to that of the fixed volume and estimates a domain-transported volume. A registration module then takes the transported volume and estimates the deformation field, improving registration performance. Experimental results on multi-modality and multi-parametric abdominal medical images demonstrate that the proposed method provides superior deformable registration via the domain-transported image, which alleviates the domain gap between the input images. The improvement holds even on out-of-distribution data, indicating the strong generalizability of our model across diverse medical images. Our source code is available at https://github.com/boahK/OTMorph.
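As a rough sketch of the registration half of such a pipeline (the neural optimal-transport module is omitted), the snippet below warps a moving volume with a predicted dense displacement field in PyTorch. The warp_volume helper and the (B, C, D, H, W) tensor layout are illustrative assumptions, not the authors' code.

    import torch
    import torch.nn.functional as F

    def warp_volume(moving, flow):
        # moving: (B, 1, D, H, W) intensities; flow: (B, 3, D, H, W) voxel displacements
        B, _, D, H, W = moving.shape
        zz, yy, xx = torch.meshgrid(torch.arange(D), torch.arange(H),
                                    torch.arange(W), indexing="ij")
        grid = torch.stack((zz, yy, xx)).float().unsqueeze(0)   # identity grid (1, 3, D, H, W)
        coords = grid + flow                                    # displaced voxel coordinates
        for i, size in enumerate((D, H, W)):                    # normalize each axis to [-1, 1]
            coords[:, i] = 2.0 * coords[:, i] / (size - 1) - 1.0
        coords = coords.permute(0, 2, 3, 4, 1)[..., [2, 1, 0]]  # grid_sample expects (x, y, z)
        return F.grid_sample(moving, coords, align_corners=True)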

2.
IEEE Trans Med Imaging ; PP, 2024 Aug 09.
Article in English | MEDLINE | ID: mdl-39120990

ABSTRACT

Chest radiography, commonly known as CXR, is frequently utilized in clinical settings to detect cardiopulmonary conditions. However, even seasoned radiologists might offer different evaluations regarding the seriousness and uncertainty associated with observed abnormalities. Previous research has attempted to utilize clinical notes to extract abnormal labels for training deep-learning models in CXR image diagnosis. However, these methods often neglected the varying degrees of severity and uncertainty linked to different labels. In our study, we first assembled a comprehensive new dataset of CXR images based on clinical textual data, incorporating radiologists' assessments of uncertainty and severity. Using this dataset, we introduced a multi-relationship graph learning framework that leverages spatial and semantic relationships while addressing expert uncertainty through a dedicated loss function. Our research showcases a notable enhancement in CXR image diagnosis and the interpretability of the diagnostic model, surpassing existing state-of-the-art methodologies. The dataset of disease severity and uncertainty labels we extracted is available at: https://physionet.org/content/cad-chest/1.0/.

3.
Eur Radiol ; 2024 Jul 12.
Article in English | MEDLINE | ID: mdl-38995381

ABSTRACT

OBJECTIVES: To evaluate the utility of CT-based abdominal fat measures for predicting the risk of death and cardiometabolic disease in an asymptomatic adult screening population. METHODS: Fully automated AI tools quantifying abdominal adipose tissue (L3-level visceral [VAT] and subcutaneous [SAT] fat area, visceral-to-subcutaneous fat ratio [VSR], VAT attenuation), muscle attenuation (L3 level), and liver attenuation were applied to non-contrast CT scans in asymptomatic adults undergoing CT colonography (CTC). Longitudinal follow-up documented subsequent deaths, cardiovascular events, and diabetes. ROC and time-to-event analyses were performed to generate AUCs and hazard ratios (HR) binned by octile. RESULTS: A total of 9223 adults (mean age, 57 years; 4071:5152 M:F) underwent screening CTC from April 2004 to December 2016. 549 patients died (median follow-up, nine years). Fat measures outperformed BMI for predicting mortality risk: 5-year AUCs for muscle attenuation, VSR, and BMI were 0.721, 0.661, and 0.499, respectively. Higher visceral, muscle, and liver fat were associated with increased mortality risk: VSR > 1.53, HR = 3.1; muscle attenuation < 15 HU, HR = 5.4; liver attenuation < 45 HU, HR = 2.3. Higher VAT area and VSR were associated with increased cardiovascular event and diabetes risk: VSR > 1.59, HR = 2.6 for cardiovascular events; VAT area > 291 cm2, HR = 6.3 for diabetes (p < 0.001). A U-shaped association was observed for SAT, with a higher risk of death for very low and very high SAT. CONCLUSION: Fully automated CT-based measures of abdominal fat are predictive of mortality and cardiometabolic disease risk in asymptomatic adults and uncover trends that are not reflected in anthropometric measures. CLINICAL RELEVANCE STATEMENT: Fully automated CT-based measures of abdominal fat soundly outperform anthropometric measures for mortality and cardiometabolic risk prediction in asymptomatic patients. KEY POINTS: Abdominal fat depots associated with metabolic dysregulation and cardiovascular disease can be derived from abdominal CT. Fully automated AI body composition tools can measure factors associated with increased mortality and cardiometabolic risk. CT-based abdominal fat measures uncover trends in mortality and cardiometabolic risk not captured by BMI in asymptomatic outpatients.
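As a toy illustration of how such measures might be used downstream (the cohort values and column names below are invented for illustration, not study data), one could compute a per-measure AUC for 5-year mortality and flag patients using the reported thresholds:

    import pandas as pd
    from sklearn.metrics import roc_auc_score

    # Hypothetical automated CT measures for three patients (values invented)
    df = pd.DataFrame({"muscle_hu": [12.0, 38.5, 22.1],
                       "vsr": [1.8, 0.7, 1.2],
                       "liver_hu": [40.0, 55.0, 62.0],
                       "died_5yr": [1, 0, 0]})

    # Discrimination of 5-year mortality by muscle attenuation (lower = higher risk)
    auc = roc_auc_score(df["died_5yr"], -df["muscle_hu"])

    # Flag patients using the thresholds reported in the abstract
    df["high_risk"] = ((df["vsr"] > 1.53) | (df["muscle_hu"] < 15)
                       | (df["liver_hu"] < 45))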

4.
Radiol Artif Intell ; 6(4): e240225, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38984986

ABSTRACT

The Radiological Society of North America (RSNA) and the Medical Image Computing and Computer Assisted Intervention (MICCAI) Society have led a series of joint panels and seminars focused on the present impact and future directions of artificial intelligence (AI) in radiology. These conversations have collected viewpoints from multidisciplinary experts in radiology, medical imaging, and machine learning on the current clinical penetration of AI technology in radiology and how it is impacted by trust, reproducibility, explainability, and accountability. The collective points, both practical and philosophical, define the cultural changes needed for radiologists and AI scientists working together and describe the challenges ahead for AI technologies to meet broad approval. This article presents the perspectives of experts from MICCAI and RSNA on the clinical, cultural, computational, and regulatory considerations, coupled with recommended reading materials, essential to adopt AI technology successfully in radiology and, more generally, in clinical practice. The report emphasizes the importance of collaboration to improve clinical deployment, highlights the need to integrate clinical and medical imaging data, and introduces strategies to ensure smooth and incentivized integration. Keywords: Adults and Pediatrics, Computer Applications-General (Informatics), Diagnosis, Prognosis © RSNA, 2024.


Subject(s)
Artificial Intelligence ; Radiology ; Humans ; Radiology/methods ; Societies, Medical
5.
Article in English | MEDLINE | ID: mdl-38974478

ABSTRACT

The skeleton is one of the most common sites of metastatic spread of breast and prostate cancer. CT is routinely used to measure the size of bone lesions. However, these lesions can be difficult to spot due to wide variations in their sizes, shapes, and appearances. Precise localization of such lesions would enable reliable tracking of interval changes (growth, shrinkage, or unchanged status). To that end, an automated technique to detect bone lesions is highly desirable. In this pilot work, we developed a pipeline to detect bone lesions (lytic, blastic, and mixed) in CT volumes via a proxy segmentation task. First, we used the bone lesions that were prospectively marked by radiologists on a few 2D slices of CT volumes and converted them into weak 3D segmentation masks. Then, we trained a 3D full-resolution nnUNet model using these weak 3D annotations to segment the lesions and thereby detect them. Our automated method detected bone lesions in CT with a precision of 96.7% and recall of 47.3% despite the use of incomplete and partial training data. To the best of our knowledge, we are the first to attempt the direct detection of bone lesions in CT via a proxy segmentation task.
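One plausible way to turn a radiologist's 2D lesion annotation into the kind of weak 3D mask described above is to replicate it across a few neighboring slices and dilate; this is an assumption about the procedure, not the authors' exact conversion:

    import numpy as np
    from scipy import ndimage

    def weak_3d_mask(marked_2d, volume_shape, slice_idx, z_extent=2):
        # Copy the 2D annotation to nearby axial slices, then dilate slightly
        mask = np.zeros(volume_shape, dtype=np.uint8)
        lo = max(0, slice_idx - z_extent)
        hi = min(volume_shape[0], slice_idx + z_extent + 1)
        mask[lo:hi] = marked_2d
        return ndimage.binary_dilation(mask, iterations=1).astype(np.uint8)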

6.
NPJ Digit Med ; 7(1): 190, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39043988

ABSTRACT

Recent studies indicate that Generative Pre-trained Transformer 4 with Vision (GPT-4V) outperforms human physicians in medical challenge tasks. However, these evaluations primarily focused on the accuracy of multi-choice questions alone. Our study extends the current scope by conducting a comprehensive analysis of GPT-4V's rationales of image comprehension, recall of medical knowledge, and step-by-step multimodal reasoning when solving New England Journal of Medicine (NEJM) Image Challenges, an imaging quiz designed to test the knowledge and diagnostic capabilities of medical professionals. Evaluation results confirmed that GPT-4V performs comparably to human physicians regarding multi-choice accuracy (81.6% vs. 77.8%). GPT-4V also performs well in cases that physicians answer incorrectly, with over 78% accuracy. However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choice (35.5%), most prominently in image comprehension (27.2%). Despite GPT-4V's high accuracy on multi-choice questions, our findings emphasize the necessity for further in-depth evaluation of its rationales before integrating such multimodal AI models into clinical workflows.

7.
Comput Med Imaging Graph ; 116: 102419, 2024 Jul 20.
Article in English | MEDLINE | ID: mdl-39053035

ABSTRACT

Pheochromocytomas and paragangliomas (PPGLs) are rare adrenal and extra-adrenal tumors that have metastatic potential. Management of patients with PPGLs mainly depends on the makeup of their genetic cluster: SDHx, VHL/EPAS1, kinase, and sporadic. CT is the preferred modality for precise localization of PPGLs, such that their metastatic progression can be assessed. However, the variable size, morphology, and appearance of these tumors in different anatomical regions can pose challenges for radiologists. Since radiologists must routinely track changes across patient visits, manual annotation of PPGLs is quite time-consuming and cumbersome to perform across all axial slices in a CT volume. As such, PPGLs are only weakly annotated on axial slices by radiologists in the form of RECIST measurements. To reduce the manual effort spent by radiologists, we propose a method for the automated detection of PPGLs in CT via a proxy segmentation task. Weak 3D annotations (derived from 2D bounding boxes) were used to train both 2D and 3D nnUNet models to detect PPGLs via segmentation. We evaluated our approaches on an in-house dataset comprising chest-abdomen-pelvis CTs of 255 patients with confirmed PPGLs. On a test set of 53 CT volumes, our 3D nnUNet model achieved a detection precision of 70% and sensitivity of 64.1%, and outperformed the 2D model, which obtained a precision of 52.7% and sensitivity of 27.5% (p < 0.05). The SDHx and sporadic genetic clusters achieved the highest precisions of 73.1% and 72.7%, respectively. Our state-of-the-art results highlight the promise of automated PPGL detection, a challenging task.
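Detection precision and sensitivity from a proxy segmentation can be scored at the lesion level with connected components, along the following lines (a generic scheme, not the paper's exact matching criterion):

    import numpy as np
    from scipy import ndimage

    def detection_stats(pred_mask, gt_mask):
        # A predicted component counts as a hit if it overlaps any ground-truth voxel
        pred_cc, n_pred = ndimage.label(pred_mask)
        gt_cc, n_gt = ndimage.label(gt_mask)
        tp_pred = sum(1 for i in range(1, n_pred + 1) if gt_mask[pred_cc == i].any())
        hit_gt = sum(1 for j in range(1, n_gt + 1) if pred_mask[gt_cc == j].any())
        precision = tp_pred / max(n_pred, 1)
        sensitivity = hit_gt / max(n_gt, 1)
        return precision, sensitivity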

8.
Med Image Anal ; 97: 103279, 2024 Jul 20.
Article in English | MEDLINE | ID: mdl-39079429

ABSTRACT

Medical Visual Question Answering (VQA) is an important task for medical multi-modal Large Language Models (LLMs), aiming to answer clinically relevant questions about input medical images. This technique has the potential to improve the efficiency of medical professionals while relieving the burden on the public health system, particularly in resource-poor countries. However, existing medical VQA datasets are small and contain only simple questions (equivalent to classification tasks), which lack semantic reasoning and clinical knowledge. Our previous work proposed a clinical knowledge-driven image-difference VQA benchmark using a rule-based approach (Hu et al., 2023). However, given the same breadth of information coverage, the rule-based approach shows an 85% error rate on extracted labels. We trained an LLM-based method to extract labels, increasing accuracy by 62%. We also comprehensively evaluated our labels with two clinical experts on 100 samples to help us fine-tune the LLM. Based on the trained LLM, we proposed a large-scale medical VQA dataset, Medical-CXR-VQA, focused on chest X-ray images. The questions involve detailed information, such as abnormalities, locations, levels, and types. Based on this dataset, we proposed a novel VQA method that constructs three different relationship graphs: spatial, semantic, and implicit relationship graphs over the image regions, questions, and semantic labels. We leveraged graph attention to learn the logical reasoning paths for different questions. These learned graph VQA reasoning paths can be further used for LLM prompt engineering and chain-of-thought reasoning, which are crucial for further fine-tuning and training of multi-modal large language models. Moreover, we demonstrate that our approach has the qualities of evidence and faithfulness, which are crucial in the clinical field. The code and the dataset are available at https://github.com/Holipori/Medical-CXR-VQA.
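A single graph-attention step over region/question nodes, of the kind such a framework builds on, can be sketched with PyTorch Geometric's GATConv; the toy graph, edges, and dimensions below are assumptions, not the paper's architecture:

    import torch
    from torch_geometric.nn import GATConv

    # Toy graph: 4 nodes (e.g., image regions or question tokens), 16-dim features
    x = torch.randn(4, 16)
    edge_index = torch.tensor([[0, 1, 2, 3],
                               [1, 2, 3, 0]])   # directed edges (source, target)

    gat = GATConv(in_channels=16, out_channels=8, heads=4, concat=True)
    out = gat(x, edge_index)   # (4, 32): attention-weighted neighborhood features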

9.
Radiol Artif Intell ; : e230601, 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38900043

ABSTRACT

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To evaluate the performance of an automated deep learning method in detecting ascites and subsequently quantifying its volume in patients with liver cirrhosis and ovarian cancer. Materials and Methods This retrospective study included contrast-enhanced and noncontrast abdominal-pelvic CT scans of patients with cirrhotic ascites and patients with ovarian cancer from two institutions, National Institutes of Health (NIH) and University of Wisconsin (UofW). The model, trained on The Cancer Genome Atlas Ovarian Cancer dataset (mean age, 60 years ± 11 [SD]; 143 female), was tested on two internal (NIH-LC and NIH-OV) and one external dataset (UofW-LC). Its performance was measured by the Dice coefficient, standard deviations, and 95% confidence intervals, focusing on ascites volume in the peritoneal cavity. Results On NIH-LC (25 patients; mean age, 59 years ± 14; 14 male) and NIH-OV (166 patients; mean age, 65 years ± 9; all female), the model achieved Dice scores of 85.5% ± 6.1% (CI: 83.1%-87.8%) and 82.6% ± 15.3% (CI: 76.4%-88.7%), with median volume estimation errors of 19.6% (IQR: 13.2%-29.0%) and 5.3% (IQR: 2.4%- 9.7%), respectively. On UofW-LC (124 patients; mean age, 46 years ± 12; 73 female), the model had a Dice score of 83.0% ± 10.7% (CI: 79.8%-86.3%) and median volume estimation error of 9.7% (IQR: 4.5%-15.1%). The model showed strong agreement with expert assessments, with r2 values of 0.79, 0.98, and 0.97 across the test sets. Conclusion The proposed deep learning method performed well in segmenting and quantifying the volume of ascites in concordance with expert radiologist assessments. ©RSNA, 2024.

10.
Eur Radiol ; 2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38834787

ABSTRACT

OBJECTIVE: To assess the diagnostic performance of post-contrast CT for predicting moderate hepatic steatosis in an older adult cohort undergoing a uniform CT protocol, utilizing hepatic and splenic attenuation values. MATERIALS AND METHODS: A total of 1676 adults (mean age, 68.4 ± 10.2 years; 1045M/631F) underwent a CT urothelial protocol that included unenhanced, portal venous, and 10-min delayed phases through the liver and spleen. Automated hepatosplenic segmentation for attenuation values (in HU) was performed using a validated deep-learning tool. Unenhanced liver attenuation < 40.0 HU, corresponding to > 15% MRI-based proton density fat fraction, served as the reference standard for moderate steatosis. RESULTS: The prevalence of moderate or severe steatosis was 12.9% (216/1676). The diagnostic performance of portal venous liver HU in predicting moderate hepatic steatosis (AUROC = 0.943) was significantly better than the liver-spleen HU difference (AUROC = 0.814) (p < 0.001). Portal venous phase liver thresholds of 80 and 90 HU had a sensitivity/specificity for moderate steatosis of 85.6%/89.6% and 94.9%/74.7%, respectively, whereas a liver-spleen difference of -40 HU and -10 HU had a sensitivity/specificity of 43.5%/90.0% and 92.1%/52.5%, respectively. Furthermore, livers with moderate-to-severe steatosis demonstrated significantly less post-contrast enhancement (mean, 35.7 HU vs 47.3 HU; p < 0.001). CONCLUSION: Moderate steatosis can be reliably diagnosed on standard portal venous phase CT using liver attenuation values alone. Consideration of splenic attenuation appears to add little value. Moderate steatosis not only has intrinsically lower pre-contrast liver attenuation values (< 40 HU), but also enhances less, typically resulting in post-contrast liver attenuation values of 80 HU or less. CLINICAL RELEVANCE STATEMENT: Moderate steatosis can be reliably diagnosed on post-contrast CT using liver attenuation values alone. Livers with at least moderate steatosis enhance less than those with mild or no steatosis, which combines with the lower intrinsic attenuation to improve detection. KEY POINTS: The liver-spleen attenuation difference is frequently utilized in routine practice but appears to have performance limitations. The liver-spleen attenuation difference is less effective than liver attenuation for moderate steatosis. Moderate and severe steatosis can be identified on standard portal venous phase CT using liver attenuation alone.
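A minimal sketch of applying the reported 80 HU portal venous threshold to automated liver attenuation values (the function name and input format are assumptions for illustration):

    import numpy as np

    def moderate_steatosis_portal_venous(liver_hu_values, threshold_hu=80.0):
        # Mean liver attenuation at or below 80 HU flags moderate steatosis
        # (85.6% sensitivity / 89.6% specificity per the abstract above)
        return float(np.mean(liver_hu_values)) <= threshold_hu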

11.
Acad Radiol ; 2024 Jun 28.
Article in English | MEDLINE | ID: mdl-38944630

ABSTRACT

RATIONALE AND OBJECTIVES: Pancreas segmentation accuracy at CT is critical for the identification of pancreatic pathologies and is essential for the development of imaging biomarkers. Our objective was to benchmark the performance of five high-performing pancreas segmentation models across multiple metrics, stratified by scan and patient/pancreatic characteristics that may affect segmentation performance. MATERIALS AND METHODS: In this retrospective study, PubMed and arXiv searches were conducted to identify pancreas segmentation models, which were then evaluated on a set of annotated imaging datasets. Results (Dice score, Hausdorff distance [HD], average surface distance [ASD]) were stratified by contrast status and quartiles of peri-pancreatic attenuation (5 mm region around the pancreas). Multivariate regression was performed to identify imaging characteristics and biomarkers (n = 9) that were significantly associated with Dice score. RESULTS: Five pancreas segmentation models were identified: Abdomen Atlas [AAUNet, AASwin, trained on 8448 scans], TotalSegmentator [TS, 1204 scans], nnUNetv1 [MSD-nnUNet, 282 scans], and a U-Net based model for predicting diabetes [DM-UNet, 427 scans]. These were evaluated on 352 CT scans (30 females, 25 males, 297 sex unknown; age 58 ± 7 years [±1 SD], 327 age unknown) from 2000-2023. Overall, TS, AAUNet, and AASwin were the best performers, with Dice scores of 80 ± 11%, 79 ± 16%, and 77 ± 18%, respectively (pairwise Sidak tests not significantly different). AASwin and MSD-nnUNet performed worse (for all metrics) on non-contrast scans (vs. contrast, p < .001). The worst performer was DM-UNet (Dice = 67 ± 16%). All algorithms except TS showed lower Dice scores with increasing peri-pancreatic attenuation (p < .01). Multivariate regression showed that non-contrast scans (p < .001; MSD-nnUNet), smaller pancreatic length (p = .005, MSD-nnUNet), and height (p = .003, DM-UNet) were associated with lower Dice scores. CONCLUSION: The models trained on large and diverse sets of scans (TS, AAUNet, and AASwin) performed best. TS performed equivalently to AAUNet and AASwin with roughly 14% of their training set size (1204 vs. 8448 scans). Though trained on the same dataset, the transformer network (AASwin) had poorer performance on non-contrast scans, whereas its convolutional counterpart (AAUNet) did not. This study highlights how the aggregate assessment metrics of pancreatic segmentation algorithms seen in other literature are not enough to capture differential performance across common patient and scanning characteristics in clinical populations.
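A simple reference implementation of the symmetric Hausdorff distance between two binary masks, computed on boundary voxels scaled to millimeters (not the benchmark's exact evaluation code; the spacing argument is an assumption):

    import numpy as np
    from scipy import ndimage
    from scipy.spatial.distance import directed_hausdorff

    def hausdorff_mm(pred, gt, spacing=(1.0, 1.0, 1.0)):
        def surface(mask):
            # Boundary voxels = mask minus its erosion, scaled to mm
            eroded = ndimage.binary_erosion(mask)
            pts = np.argwhere(mask & ~eroded).astype(float)
            return pts * np.asarray(spacing)
        sp, sg = surface(pred.astype(bool)), surface(gt.astype(bool))
        return max(directed_hausdorff(sp, sg)[0], directed_hausdorff(sg, sp)[0])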

12.
Med Image Anal ; 97: 103224, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38850624

ABSTRACT

Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed": there are a few common findings, followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of long-tailed learning in medical image recognition, few have studied the interaction of label imbalance and label co-occurrence posed by long-tailed, multi-label disease classification. To engage with the research community on this emerging topic, we conducted an open challenge, CXR-LT, on long-tailed, multi-label thorax disease classification from chest X-rays (CXRs). We publicly release a large-scale benchmark dataset of over 350,000 CXRs, each labeled with at least one of 26 clinical findings following a long-tailed distribution. We synthesize common themes of top-performing solutions, providing practical recommendations for long-tailed, multi-label medical image classification. Finally, we use these insights to propose a path forward involving vision-language foundation models for few- and zero-shot disease classification.
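One simple baseline for the label imbalance described above is per-class positive re-weighting of a binary cross-entropy loss; a minimal PyTorch sketch with invented label counts (not a summary of the challenge's winning methods):

    import torch
    import torch.nn as nn

    # Hypothetical label matrix: N studies x 26 findings, heavily long-tailed
    labels = torch.randint(0, 2, (1000, 26)).float()

    # Up-weight rare positives: pos_weight = (#negatives / #positives) per class
    pos = labels.sum(dim=0).clamp(min=1)
    pos_weight = (labels.shape[0] - pos) / pos
    criterion = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

    logits = torch.randn(1000, 26)        # stand-in model outputs
    loss = criterion(logits, labels)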

13.
BJR Artif Intell ; 1(1): ubae006, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38828430

ABSTRACT

Innovation in medical imaging artificial intelligence (AI)/machine learning (ML) demands extensive data collection, algorithmic advancements, and rigorous performance assessments encompassing aspects such as generalizability, uncertainty, bias, fairness, trustworthiness, and interpretability. Achieving widespread integration of AI/ML algorithms into diverse clinical tasks will demand a steadfast commitment to overcoming issues in model design, development, and performance assessment. The complexities of AI/ML clinical translation present substantial challenges, requiring engagement with relevant stakeholders, assessment of cost-effectiveness for user and patient benefit, timely dissemination of information relevant to robust functioning throughout the AI/ML lifecycle, consideration of regulatory compliance, and feedback loops for real-world performance evidence. This commentary addresses several hurdles for the development and adoption of AI/ML technologies in medical imaging. Comprehensive attention to these underlying and often subtle factors is critical not only for tackling the challenges but also for exploring novel opportunities for the advancement of AI in radiology.

14.
Bone ; 186: 117176, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38925254

ABSTRACT

Osteoporosis is underdiagnosed, especially in ethnic and racial minorities, who are thought to be protected against bone loss but often have worse outcomes after an osteoporotic fracture. We aimed to determine the prevalence of osteoporosis by opportunistic CT in patients who underwent lung cancer screening (LCS) using non-contrast CT in the Northeastern United States. Demographics including race and ethnicity were retrieved. We assessed trabecular bone and body composition using a fully automated artificial intelligence algorithm. ROIs were placed at the T12 vertebral body for attenuation measurements in Hounsfield units (HU). Two validated thresholds were used to diagnose osteoporosis: a high-sensitivity threshold (115-165 HU) and a high-specificity threshold (<115 HU). We performed descriptive statistics and ANOVA to compare differences across sex, race, ethnicity, and income class according to neighborhoods' mean household incomes. Forward stepwise regression modeling was used to determine body composition predictors of trabecular attenuation. We included 3708 patients (mean age 64 ± 7 years, 54% males) who underwent LCS, had available demographic information, and had an evaluable CT for trabecular attenuation analysis. Using the high-sensitivity threshold, osteoporosis was more prevalent in females (74% vs. 65% in males, p < 0.0001) and Whites (72% vs. 49% non-Whites, p < 0.0001). However, osteoporosis was present across all races (38% Black, 55% Asian, 56% Hispanic) and affected all income classes (69%, 69%, and 91% in the low, medium, and high-income classes, respectively). A high visceral/subcutaneous fat ratio, aortic calcification, and hepatic steatosis were associated with low trabecular attenuation (p < 0.01), whereas muscle mass was positively associated with trabecular attenuation (p < 0.01). In conclusion, osteoporosis is prevalent across all races, income classes, and both sexes in patients undergoing LCS. Opportunistic CT using a fully automated algorithm and a uniform imaging protocol is able to detect osteoporosis and body composition without additional testing or radiation. Early identification of patients traditionally thought to be at low risk for bone loss will allow for initiating appropriate treatment to prevent future fragility fractures. CLINICALTRIALS.GOV IDENTIFIER: N/A.
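The two validated thresholds can be applied to an automated T12 attenuation measurement as a simple binning function (a sketch of the thresholding logic only; the function name and category strings are assumptions):

    def osteoporosis_category(t12_hu):
        # <115 HU: high-specificity call; 115-165 HU: high-sensitivity band
        if t12_hu < 115:
            return "osteoporosis (high-specificity threshold)"
        if t12_hu <= 165:
            return "possible osteoporosis (high-sensitivity range)"
        return "not suggestive of osteoporosis"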


Subject(s)
Early Detection of Cancer ; Lung Neoplasms ; Osteoporosis ; Tomography, X-Ray Computed ; Aged ; Female ; Humans ; Male ; Middle Aged ; Artificial Intelligence ; Early Detection of Cancer/methods ; Image Processing, Computer-Assisted/methods ; Lung Neoplasms/diagnostic imaging ; Osteoporosis/diagnostic imaging ; Osteoporosis/epidemiology ; Tomography, X-Ray Computed/methods
15.
ArXiv ; 2024 May 14.
Article in English | MEDLINE | ID: mdl-38903740

ABSTRACT

Multi-parametric MRI (mpMRI) studies are widely available in clinical practice for the diagnosis of various diseases. As the volume of mpMRI exams increases yearly, concomitant inaccuracies exist within the DICOM header fields of these exams. This precludes using the header information to arrange the different series as part of the radiologist's hanging protocol, and clinician oversight is needed for correction. In this pilot work, we propose an automated framework to classify eight different series types in mpMRI studies. We used 1,363 studies acquired by three Siemens scanners to train a DenseNet-121 model with 5-fold cross-validation. Then, we evaluated the performance of the DenseNet-121 ensemble on a held-out test set of 313 mpMRI studies. Our method achieved an average precision of 96.6%, sensitivity of 96.6%, specificity of 99.6%, and F1 score of 96.6% for the MRI series classification task. To the best of our knowledge, we are the first to develop a method to classify the series type in mpMRI studies acquired at the level of the chest, abdomen, and pelvis. Our method has the capability for robust automation of hanging protocols in modern radiology practice.
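The classification backbone can be set up in a few lines with torchvision; the weights choice and input size below are assumptions for illustration, not the paper's training configuration:

    import torch
    import torch.nn as nn
    from torchvision import models

    # DenseNet-121 with an 8-way head for the eight series types
    model = models.densenet121(weights=None)
    model.classifier = nn.Linear(model.classifier.in_features, 8)

    logits = model(torch.randn(1, 3, 224, 224))   # (1, 8) series-type scores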

16.
ArXiv ; 2024 Jun 24.
Article in English | MEDLINE | ID: mdl-38903743

ABSTRACT

BACKGROUND: Segmentation of organs and structures in abdominal MRI is useful for many clinical applications, such as disease diagnosis and radiotherapy. Current approaches have focused on delineating a limited set of abdominal structures (13 types). To date, there is no publicly available abdominal MRI dataset with voxel-level annotations of multiple organs and structures. Consequently, a segmentation tool for multi-structure segmentation is also unavailable. METHODS: We curated a T1-weighted abdominal MRI dataset consisting of 195 patients who underwent imaging at the National Institutes of Health (NIH) Clinical Center. The dataset comprises axial pre-contrast T1, arterial, venous, and delayed phases for each patient, amounting to a total of 780 series (69,248 2D slices). Each series contains voxel-level annotations of 62 abdominal organs and structures. A 3D nnUNet model, dubbed MRISegmentator-Abdomen (MRISegmentator in short), was trained on this dataset, and evaluation was conducted on an internal test set and two large external datasets: AMOS22 and Duke Liver. The predicted segmentations were compared against the ground truth using the Dice Similarity Coefficient (DSC) and Normalized Surface Distance (NSD). FINDINGS: MRISegmentator achieved an average DSC of 0.861 ± 0.170 and an NSD of 0.924 ± 0.163 on the internal test set. On the AMOS22 dataset, MRISegmentator attained an average DSC of 0.829 ± 0.133 and an NSD of 0.908 ± 0.067. For the Duke Liver dataset, an average DSC of 0.933 ± 0.015 and an NSD of 0.929 ± 0.021 were obtained. INTERPRETATION: The proposed MRISegmentator provides automatic, accurate, and robust segmentations of 62 organs and structures in T1-weighted abdominal MRI sequences. The tool has the potential to accelerate research on various clinical topics, such as abnormality detection, radiotherapy, and disease classification, among others.

17.
ArXiv ; 2024 Feb 17.
Article in English | MEDLINE | ID: mdl-38903745

ABSTRACT

In radiology, Artificial Intelligence (AI) has significantly advanced report generation, but automatic evaluation of these AI-produced reports remains challenging. Current metrics, such as conventional Natural Language Generation (NLG) and Clinical Efficacy (CE) metrics, often fall short in capturing the semantic intricacies of clinical contexts or overemphasize clinical details, undermining report clarity. To overcome these issues, our proposed method synergizes the expertise of professional radiologists with Large Language Models (LLMs), such as GPT-3.5 and GPT-4. Utilizing In-Context Instruction Learning (ICIL) and Chain of Thought (CoT) reasoning, our approach aligns LLM evaluations with radiologist standards, enabling detailed comparisons between human and AI-generated reports. This is further enhanced by a regression model that aggregates sentence evaluation scores. Experimental results show that our "Detailed GPT-4 (5-shot)" model achieves a score of 0.48, outperforming the METEOR metric by 0.19, while our "Regressed GPT-4" model shows even greater alignment with expert evaluations, exceeding the best existing metric by a margin of 0.35. Moreover, the robustness of our explanations has been validated through a thorough iterative strategy. We plan to publicly release annotations from radiology experts, setting a new standard for accuracy in future assessments. This underscores the potential of our approach in enhancing the quality assessment of AI-driven medical reports.
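A schematic of the LLM-as-evaluator idea using the OpenAI Python client (the prompt wording, model choice, and scoring format are placeholders, not the paper's ICIL/CoT prompts):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def score_report(reference, candidate):
        # Ask an LLM to grade a generated report against a radiologist reference
        prompt = (
            "You are a radiologist grading a generated report against a reference.\n"
            "Think step by step, then output a score from 0 to 1.\n"
            f"Reference: {reference}\nCandidate: {candidate}\nScore:")
        resp = client.chat.completions.create(
            model="gpt-4", messages=[{"role": "user", "content": prompt}])
        return resp.choices[0].message.content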

18.
ArXiv ; 2024 Mar 06.
Article in English | MEDLINE | ID: mdl-38711428

ABSTRACT

Accurate training labels are a key component of multi-class medical image segmentation. Their annotation is costly and time-consuming because it requires domain expertise. In our previous work, a dual-branch network was developed to segment single-class edematous adipose tissue. Its inputs include a few strong labels from manual annotation and many inaccurate weak labels from existing segmentation methods. The dual-branch network consists of a shared encoder and two decoders that process weak and strong labels. Self-supervision iteratively updates the weak labels during the training process. This work follows this strategy to automatically improve training labels for multi-class image segmentation. Whereas the previous work used the weak and strong labels to train the network only once, here transfer learning is used to train the network and improve the weak labels sequentially. The dual-branch network is first trained on the weak labels alone to initialize the model parameters. After the network has stabilized, the shared encoder is frozen, and the strong and weak decoders are fine-tuned on the strong and weak labels together. The accuracy of the weak labels is iteratively improved during fine-tuning. The proposed method was applied to a three-class segmentation of muscle, subcutaneous adipose tissue, and visceral adipose tissue on abdominal CT scans. Validation results on 11 patients showed that the accuracy of the training labels was statistically significantly improved, with the Dice similarity coefficients of muscle, subcutaneous, and visceral adipose tissue increasing from 74.2% to 91.5%, 91.2% to 95.6%, and 77.6% to 88.5%, respectively (p < 0.05). In comparison with our earlier method, the label accuracy was also significantly improved (p < 0.05). These experimental results suggest that the combination of a dual-branch network and transfer learning is an efficient means of improving training labels for multi-class segmentation.
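The staged training described above hinges on freezing the shared encoder while fine-tuning both decoders; a minimal PyTorch sketch with an invented toy architecture (the layer shapes and names are assumptions, not the authors' network):

    import torch.nn as nn

    class DualBranch(nn.Module):
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
            self.strong_decoder = nn.Conv2d(16, 4, 1)   # 4 classes: bg + 3 tissues
            self.weak_decoder = nn.Conv2d(16, 4, 1)

        def forward(self, x):
            h = self.encoder(x)
            return self.strong_decoder(h), self.weak_decoder(h)

    net = DualBranch()
    # Stage 2: freeze the shared encoder, fine-tune both decoders
    for p in net.encoder.parameters():
        p.requires_grad = False
    trainable = [p for p in net.parameters() if p.requires_grad]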

19.
Int J Comput Assist Radiol Surg ; 19(8): 1537-1544, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38740719

ABSTRACT

PURPOSE: Lymph nodes (LNs) in the chest have a tendency to enlarge due to various pathologies, such as lung cancer or pneumonia. Clinicians routinely measure nodal size to monitor disease progression, confirm metastatic cancer, and assess treatment response. However, variations in their shapes and appearances make it cumbersome to identify LNs, which reside outside of most organs. METHODS: We propose to segment LNs in the mediastinum by leveraging the anatomical priors of 28 different structures (e.g., lung, trachea, etc.) generated by the public TotalSegmentator tool. The CT volumes from 89 patients available in the public NIH CT Lymph Node dataset were used to train three 3D off-the-shelf nnUNet models to segment LNs. The public St. Olavs dataset containing 15 patients (out of the training distribution) was used to evaluate segmentation performance. RESULTS: For LNs with a short-axis diameter ≥ 8 mm, the 3D cascade nnUNet model obtained the highest Dice score of 67.9 ± 23.4 and the lowest Hausdorff distance error of 22.8 ± 20.2. For LNs of all sizes, the Dice score was 58.7 ± 21.3, representing a ≥ 10% improvement over a recently published approach evaluated on the same test dataset. CONCLUSION: To our knowledge, we are the first to harness 28 distinct anatomical priors to segment mediastinal LNs, and our work can be extended to other nodal zones in the body. The proposed method has the potential to improve patient outcomes through the identification of enlarged nodes in initial staging CT scans.
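One plausible encoding of the 28 anatomical priors is to stack them as extra input channels alongside the CT volume; the helper below is an assumption about the input format, not the authors' pipeline:

    import numpy as np

    def stack_with_priors(ct_volume, prior_masks):
        # ct_volume: (D, H, W) intensities; prior_masks: list of 28 binary (D, H, W) masks
        priors = np.stack(prior_masks, axis=0).astype(np.float32)        # (28, D, H, W)
        return np.concatenate([ct_volume[None].astype(np.float32),
                               priors], axis=0)                          # (29, D, H, W)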


Subject(s)
Lymph Nodes ; Mediastinum ; Tomography, X-Ray Computed ; Humans ; Lymph Nodes/diagnostic imaging ; Mediastinum/diagnostic imaging ; Tomography, X-Ray Computed/methods ; Imaging, Three-Dimensional/methods ; Lung Neoplasms/diagnostic imaging ; Lung Neoplasms/pathology ; Lymphatic Metastasis/diagnostic imaging
20.
Int J Comput Assist Radiol Surg ; 19(8): 1589-1596, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38758290

ABSTRACT

PURPOSE: Body composition measurements from routine abdominal CT can yield personalized risk assessments for asymptomatic and diseased patients. In particular, attenuation and volume measures of muscle and fat are associated with important clinical outcomes, such as cardiovascular events, fractures, and death. This study evaluates the reliability of an internal tool for the segmentation of muscle and fat (subcutaneous and visceral) compared with the well-established public TotalSegmentator tool. METHODS: We assessed the tools across 900 CT series from the publicly available SAROS dataset, focusing on muscle, subcutaneous fat, and visceral fat. The Dice score was employed to assess accuracy in subcutaneous fat and muscle segmentation. Due to the lack of ground-truth segmentations for visceral fat, Cohen's Kappa was used to assess segmentation agreement between the tools. RESULTS: Our internal tool achieved a 3% higher Dice score (83.8 vs. 80.8) for subcutaneous fat and a 5% improvement (87.6 vs. 83.2) for muscle segmentation. A Wilcoxon signed-rank test revealed that our results were statistically different, with p < 0.01. For visceral fat, a Cohen's Kappa score of 0.856 indicated near-perfect agreement between the two tools. Our internal tool also showed very strong correlations for muscle volume (R² = 0.99), muscle attenuation (R² = 0.93), and subcutaneous fat volume (R² = 0.99), with a moderate correlation for subcutaneous fat attenuation (R² = 0.45). CONCLUSION: Our findings indicate that our internal tool outperformed TotalSegmentator in measuring subcutaneous fat and muscle. The high Cohen's Kappa score for visceral fat suggests a reliable level of agreement between the two tools. These results demonstrate the potential of our tool in advancing the accuracy of body composition analysis.
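Cohen's Kappa between the two tools' visceral-fat calls can be computed directly with scikit-learn; the labels below are invented for illustration:

    from sklearn.metrics import cohen_kappa_score

    # Per-voxel (or per-region) visceral-fat calls from the two tools, flattened
    tool_a = [1, 1, 0, 1, 0, 0, 1]   # hypothetical binary labels
    tool_b = [1, 1, 0, 1, 0, 1, 1]
    kappa = cohen_kappa_score(tool_a, tool_b)   # ~0.7 here; 0.856 reported above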


Subject(s)
Body Composition ; Tomography, X-Ray Computed ; Humans ; Tomography, X-Ray Computed/methods ; Body Composition/physiology ; Reproducibility of Results ; Male ; Intra-Abdominal Fat/diagnostic imaging ; Female ; Subcutaneous Fat/diagnostic imaging ; Muscle, Skeletal/diagnostic imaging ; Middle Aged ; Aged