Search | VHL Regional Portal

1.

GPT-4 Turbo with Vision fails to outperform text-only GPT-4 Turbo in the Japan Diagnostic Radiology Board Examination.

Hirano, Yuichiro; Hanaoka, Shouhei; Nakao, Takahiro; Miki, Soichiro; Kikuchi, Tomohiro; Nakamura, Yuta; Nomura, Yukihiro; Yoshikawa, Takeharu; Abe, Osamu.

Jpn J Radiol ; 2024 May 11.

Article in English | MEDLINE | ID: mdl-38733472

ABSTRACT

PURPOSE: To assess the performance of GPT-4 Turbo with Vision (GPT-4TV), OpenAI's latest multimodal large language model, by comparing its ability to process both text and image inputs with that of the text-only GPT-4 Turbo (GPT-4 T) in the context of the Japan Diagnostic Radiology Board Examination (JDRBE). MATERIALS AND METHODS: The dataset comprised questions from JDRBE 2021 and 2023. A total of six board-certified diagnostic radiologists discussed the questions and provided ground-truth answers by consulting relevant literature as necessary. The following questions were excluded: those lacking associated images, those with no unanimous agreement on answers, and those including images rejected by the OpenAI application programming interface. The inputs for GPT-4TV included both text and images, whereas those for GPT-4 T were entirely text. Both models were deployed on the dataset, and their performance was compared using McNemar's exact test. The radiological credibility of the responses was assessed by two diagnostic radiologists through the assignment of legitimacy scores on a five-point Likert scale. These scores were subsequently used to compare model performance using Wilcoxon's signed-rank test. RESULTS: The dataset comprised 139 questions. GPT-4TV correctly answered 62 questions (45%), whereas GPT-4 T correctly answered 57 questions (41%). A statistical analysis found no significant performance difference between the two models (P = 0.44). The GPT-4TV responses received significantly lower legitimacy scores from both radiologists than the GPT-4 T responses. CONCLUSION: No significant enhancement in accuracy was observed when using GPT-4TV with image input compared with that of using text-only GPT-4 T for JDRBE questions.
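As a rough illustration of the comparison described in the abstract above, McNemar's exact test depends only on the discordant pairs, that is, the questions one model answered correctly and the other did not. The abstract does not report those counts, so the values below are hypothetical placeholders, and the function name `mcnemar_exact` is an illustrative choice rather than anything the authors used. A minimal sketch using only the Python standard library:

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: items only the first model answered correctly
    c: items only the second model answered correctly
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Probability of a split at least this lopsided in one direction
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double for a two-sided test and cap at 1
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts (NOT taken from the paper)
p_value = mcnemar_exact(b=20, c=15)
print(f"p = {p_value:.3f}")
```

Questions that both models answered correctly, or both answered incorrectly, do not enter the calculation, which is why overall accuracies alone are not enough to reproduce the reported P value.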

2.

Towards Improved Radiological Diagnostics: Investigating the Utility and Limitations of GPT-3.5 Turbo and GPT-4 with Quiz Cases.

Kikuchi, Tomohiro; Nakao, Takahiro; Nakamura, Yuta; Hanaoka, Shouhei; Mori, Harushi; Yoshikawa, Takeharu.

AJNR Am J Neuroradiol ; 2024 May 07.

Article in English | MEDLINE | ID: mdl-38719605

ABSTRACT

BACKGROUND AND PURPOSE: The rise of large language models such as generative pre-trained transformers (GPTs) has sparked significant interest in radiology, especially in interpreting radiological reports and image findings. While existing research has focused on GPTs estimating diagnoses from radiological descriptions, exploring alternative diagnostic information sources is also crucial. This study introduces the use of GPTs (GPT-3.5 Turbo and GPT-4) for information retrieval and summarization, searching relevant case reports via PubMed, and investigates their potential to aid diagnosis. MATERIALS AND METHODS: From October 2021 to December 2023, we selected 115 cases from the "Case of the Week" series on the American Journal of Neuroradiology website. Their Description and Legend sections were presented to the GPTs for the two tasks. For the Direct Diagnosis task, the models provided three differential diagnoses that were considered correct if they matched the diagnosis in the diagnosis section. For the Case Report Search task, the models generated two keywords per case, creating PubMed search queries to extract up to three relevant reports. A response was considered correct if reports containing the disease name stated in the diagnosis section were extracted. McNemar's test was employed to evaluate whether adding a Case Report Search to Direct Diagnosis improved overall accuracy. RESULTS: In the Direct Diagnosis task, GPT-3.5 Turbo achieved a correct response rate of 26% (30/115 cases), whereas GPT-4 achieved 41% (47/115). For the Case Report Search task, GPT-3.5 Turbo scored 10% (11/115), and GPT-4 scored 7% (8/115). Correct responses totaled 32% (37/115) with three overlapping cases for GPT-3.5 Turbo, whereas GPT-4 had 43% (50/115) of correct responses with five overlapping cases. Adding Case Report Search improved GPT-3.5 Turbo's performance (p = 0.023) but not that of GPT-4 (p = 0.248). 
CONCLUSIONS: The effectiveness of adding Case Report Search to GPT-3.5 Turbo was particularly pronounced, suggesting its potential as an alternative diagnostic approach to GPTs, particularly in scenarios where direct diagnoses from GPTs are not obtainable. Nevertheless, the overall performance of GPT models in both direct diagnosis and case report retrieval tasks remains not optimal, and users should be aware of their limitations. ABBREVIATIONS: AI = Artificial Intelligence, GPT = generative pretrained transformer, LLM = large language model.

3.

Performance changes due to differences among annotating radiologists for training data in computerized lesion detection.

Nomura, Yukihiro; Hanaoka, Shouhei; Hayashi, Naoto; Yoshikawa, Takeharu; Koshino, Saori; Sato, Chiaki; Tatsuta, Momoko; Tanaka, Yuya; Kano, Shintaro; Nakaya, Moto; Inui, Shohei; Kusakabe, Masashi; Nakao, Takahiro; Miki, Soichiro; Watadani, Takeyuki; Nakaoka, Ryusuke; Shimizu, Akinobu; Abe, Osamu.

Int J Comput Assist Radiol Surg ; 2024 Apr 16.

Article in English | MEDLINE | ID: mdl-38625446

ABSTRACT

PURPOSE: The quality and bias of annotations by annotators (e.g., radiologists) affect the performance changes in computer-aided detection (CAD) software using machine learning. We hypothesized that the difference in the years of experience in image interpretation among radiologists contributes to annotation variability. In this study, we focused on how the performance of CAD software changes with retraining by incorporating cases annotated by radiologists with varying experience. METHODS: We used two types of CAD software for lung nodule detection in chest computed tomography images and cerebral aneurysm detection in magnetic resonance angiography images. Twelve radiologists with different years of experience independently annotated the lesions, and the performance changes were investigated by repeating the retraining of the CAD software twice, with the addition of cases annotated by each radiologist. Additionally, we investigated the effects of retraining using integrated annotations from multiple radiologists. RESULTS: The performance of the CAD software after retraining differed among annotating radiologists. In some cases, the performance was degraded compared to that of the initial software. Retraining using integrated annotations showed different performance trends depending on the target CAD software, notably in cerebral aneurysm detection, where the performance decreased compared to using annotations from a single radiologist. CONCLUSIONS: Although the performance of the CAD software after retraining varied among the annotating radiologists, no direct correlation with their experience was found. The performance trends differed according to the type of CAD software used when integrated annotations from multiple radiologists were used.

4.

Impact of CT-determined low kidney volume on renal function decline: a propensity score-matched analysis.

Kikuchi, Tomohiro; Hanaoka, Shouhei; Nakao, Takahiro; Nomura, Yukihiro; Mori, Harushi; Yoshikawa, Takeharu.

Insights Imaging ; 15(1): 102, 2024 Apr 05.

Article in English | MEDLINE | ID: mdl-38578554

ABSTRACT

OBJECTIVES: To investigate the relationship between low kidney volume and subsequent estimated glomerular filtration rate (eGFR) decline in eGFR category G2 (60-89 mL/min/1.73 m²) population. METHODS: In this retrospective study, we evaluated 5531 individuals with eGFR category G2 who underwent medical checkups at our institution between November 2006 and October 2017. Exclusion criteria were absent for follow-up visit, missing data, prior renal surgery, current renal disease under treatment, large renal masses, and horseshoe kidney. We developed a 3D U-net-based automated system for renal volumetry on CT images. Participants were grouped by sex-specific kidney volume deviations set at mean minus one standard deviation. After 1:1 propensity score matching, we obtained 397 pairs of individuals in the low kidney volume (LKV) and control groups. The primary endpoint was progression of eGFR categories within 5 years, assessed using Cox regression analysis. RESULTS: This study included 3220 individuals (mean age, 60.0 ± 9.7 years; men, n = 2209). The kidney volume was 404.6 ± 67.1 and 376.8 ± 68.0 cm³ in men and women, respectively. The low kidney volume (LKV) cutoff was 337.5 and 308.8 cm³ for men and women, respectively. LKV was a significant risk factor for the endpoint with an adjusted hazard ratio of 1.64 (95% confidence interval: 1.09-2.45; p = 0.02). CONCLUSION: Low kidney volume may adversely affect subsequent eGFR maintenance; hence, the use of imaging metrics may help predict eGFR decline. CRITICAL RELEVANCE STATEMENT: Low kidney volume is a significant predictor of reduced kidney function over time; thus, kidney volume measurements could aid in early identification of individuals at risk for declining kidney health. KEY POINTS: • This study explores how kidney volume affects subsequent kidney function maintenance. • Low kidney volume was associated with estimated glomerular filtration rate decreases.
• Low kidney volume is a prognostic indicator of estimated glomerular filtration rate decline.
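The sex-specific cutoffs in the abstract above follow directly from the stated rule of mean minus one standard deviation. The short sketch below reruns that arithmetic on the reported means and standard deviations; the names `kidney_volume` and `lkv_cutoff` are illustrative and do not come from the paper.

```python
# Mean and standard deviation of kidney volume (cm^3), as reported in the abstract
kidney_volume = {
    "men": (404.6, 67.1),
    "women": (376.8, 68.0),
}


def lkv_cutoff(mean: float, sd: float) -> float:
    """Low-kidney-volume cutoff: mean minus one standard deviation."""
    return round(mean - sd, 1)


cutoffs = {sex: lkv_cutoff(mean, sd) for sex, (mean, sd) in kidney_volume.items()}
print(cutoffs)  # {'men': 337.5, 'women': 308.8}
```

The results match the cutoffs of 337.5 and 308.8 cm³ given in the abstract.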

5.

Capability of GPT-4V(ision) in the Japanese National Medical Licensing Examination: Evaluation Study.

Nakao, Takahiro; Miki, Soichiro; Nakamura, Yuta; Kikuchi, Tomohiro; Nomura, Yukihiro; Hanaoka, Shouhei; Yoshikawa, Takeharu; Abe, Osamu.

JMIR Med Educ ; 10: e54393, 2024 Mar 12.

Article in English | MEDLINE | ID: mdl-38470459

ABSTRACT

BACKGROUND: Previous research applying large language models (LLMs) to medicine was focused on text-based information. Recently, multimodal variants of LLMs acquired the capability of recognizing images. OBJECTIVE: We aim to evaluate the image recognition capability of generative pretrained transformer (GPT)-4V, a recent multimodal LLM developed by OpenAI, in the medical field by testing how visual information affects its performance to answer questions in the 117th Japanese National Medical Licensing Examination. METHODS: We focused on 108 questions that had 1 or more images as part of a question and presented GPT-4V with the same questions under two conditions: (1) with both the question text and associated images and (2) with the question text only. We then compared the difference in accuracy between the 2 conditions using the exact McNemar test. RESULTS: Among the 108 questions with images, GPT-4V's accuracy was 68% (73/108) when presented with images and 72% (78/108) when presented without images (P=.36). For the 2 question categories, clinical and general, the accuracies with and those without images were 71% (70/98) versus 78% (76/98; P=.21) and 30% (3/10) versus 20% (2/10; P≥.99), respectively. CONCLUSIONS: The additional information from the images did not significantly improve the performance of GPT-4V in the Japanese National Medical Licensing Examination.

Subject(s)

Licensure , Medicine , Japan , Language

6.

Axillary Lymphadenopathy after COVID-19 Vaccination: Follow-up for Enlarged Lymph Nodes on MR Imaging.

Kanemaru, Noriko; Yoshikawa, Takeharu; Miki, Soichiro; Nakao, Takahiro; Nakamura, Yuta; Fujimoto, Kotaro; Abe, Osamu.

Magn Reson Med Sci ; 2024 Feb 07.

Article in English | MEDLINE | ID: mdl-38325833

ABSTRACT

PURPOSE: The purpose of this study was to investigate the longitudinal MRI characteristic of COVID-19-vaccination-related axillary lymphadenopathy by evaluating the size, T2-weighted signal intensity, and apparent diffusion coefficient (ADC) values. METHODS: COVID-19-vaccination-related axillary lymphadenopathy was observed in 90 of 433 health screening program participants on the chest region of whole-body axial MRIs in 2021, as reported in our previous study. Follow-up MRI was performed at an interval of approximately 1 year after the second vaccination dose from 2022 to 2023. The diameter, signal intensity on T2-weighted images, and ADC of the largest enlarged lymph nodes were measured on chest MRI. The values were compared between the post-vaccination MRI and the follow-up MRI, and statistically analyzed. RESULTS: Out of the 90 participants who had enlarged lymph nodes of 5 mm or larger in short axis after the second vaccination dose, 76 participants (45 men and 31 women, mean age: 61 years) were enrolled in the present study. The median short- and long-axis diameter of the enlarged lymph nodes was 7 mm and 9 mm for post-vaccination MRI and 4 mm and 6 mm for follow-up MRI, respectively. The median signal intensity relative to the muscle on T2-weighted images decreased (5.1 for the initial post-vaccination MRI and 3.6 for the follow-up MRI, P < .0001). The ADC values did not show a notable change and remained in a normal range. CONCLUSION: The enlarged axillary lymph nodes decreased both in size and in signal intensity on T2-weighted images of follow-up MRI. The ADC remained unchanged. Our findings may provide important information to establish evidence-based guidelines for conducting proper assessment and management of post-vaccination lymphadenopathy.

7.

Synthesis of Hybrid Data Consisting of Chest Radiographs and Tabular Clinical Records Using Dual Generative Models for COVID-19 Positive Cases.

Kikuchi, Tomohiro; Hanaoka, Shouhei; Nakao, Takahiro; Takenaga, Tomomi; Nomura, Yukihiro; Mori, Harushi; Yoshikawa, Takeharu.

J Imaging Inform Med ; 37(3): 1217-1227, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38351224

ABSTRACT

To generate synthetic medical data incorporating image-tabular hybrid data by merging an image encoding/decoding model with a table-compatible generative model and assess their utility. We used 1342 cases from the Stony Brook University Covid-19-positive cases, comprising chest X-ray radiographs (CXRs) and tabular clinical data as a private dataset (pDS). We generated a synthetic dataset (sDS) through the following steps: (I) dimensionally reducing CXRs in the pDS using a pretrained encoder of the auto-encoding generative adversarial networks (αGAN) and integrating them with the correspondent tabular clinical data; (II) training the conditional tabular GAN (CTGAN) on this combined data to generate synthetic records, encompassing encoded image features and clinical data; and (III) reconstructing synthetic images from these encoded image features in the sDS using a pretrained decoder of the αGAN. The utility of sDS was assessed by the performance of the prediction models for patient outcomes (deceased or discharged). For the pDS test set, the area under the receiver operating characteristic (AUC) curve was calculated to compare the performance of prediction models trained separately with pDS, sDS, or a combination of both. We created an sDS comprising CXRs with a resolution of 256 × 256 pixels and tabular data containing 13 variables. The AUC for the outcome was 0.83 when the model was trained with the pDS, 0.74 with the sDS, and 0.87 when combining pDS and sDS for training. Our method is effective for generating synthetic records consisting of both images and tabular clinical data.

Subject(s)

COVID-19 , Radiography, Thoracic , SARS-CoV-2 , Humans , COVID-19/diagnostic imaging , Radiography, Thoracic/methods , Female , Male , Middle Aged , Aged , ROC Curve , Adult

8.

Improved identification of tumors in 18F-FDG-PET examination by normalizing the standard uptake in the liver based on blood test data.

Alam, Md Ashraful; Hanaoka, Shouhei; Nomura, Yukihiro; Kikuchi, Tomohiro; Nakao, Takahiro; Takenaga, Tomomi; Hayashi, Naoto; Yoshikawa, Takeharu; Abe, Osamu.

Int J Comput Assist Radiol Surg ; 19(3): 581-590, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38180621

ABSTRACT

PURPOSE: Standardized uptake values (SUVs) derived from 18F-fluoro-2-deoxy-D-glucose positron emission tomography/computed tomography are a crucial parameter for identifying tumors or abnormalities in an organ. Moreover, exploring ways to improve the identification of tumors or abnormalities using a statistical measurement tool is important in clinical research. Therefore, we developed a fully automatic method to create a personally normalized Z-score map of the liver SUV. METHODS: The normalized Z-score map for each patient was created using the SUV mean and standard deviation estimated from blood-test-derived variables, such as alanine aminotransferase and aspartate aminotransferase, as well as other demographic information. This was performed using the least absolute shrinkage and selection operator (LASSO)-based estimation formula. We also used receiver operating characteristic (ROC) to analyze the results of people with and without hepatic tumors and compared them to the ROC curve of normal SUV. RESULTS: A total of 7757 people were selected for this study. Of these, 7744 were healthy, while 13 had abnormalities. The area under the ROC curve results indicated that the anomaly detection approach (0.91) outperformed only the maximum SUV (0.89). To build the LASSO regression, sets of covariates, including sex, weight, body mass index, blood glucose level, triglyceride, total cholesterol, γ-glutamyl transpeptidase, total protein, creatinine, insulin, albumin, and cholinesterase, were used to determine the SUV mean, whereas weight was used to determine the SUV standard deviation. CONCLUSION: The Z-score normalizes the mean and standard deviation. It is effective in ROC curve analysis and increases the clarity of the abnormality. This normalization is a key technique for effective measurement of maximum glucose consumption by tumors in the liver.

Subject(s)

Fluorodeoxyglucose F18 , Neoplasms , Humans , Radiopharmaceuticals , Positron-Emission Tomography/methods , Neoplasms/diagnostic imaging , Liver/diagnostic imaging
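The abstract of this entry describes normalizing liver SUV with a patient-specific mean and standard deviation. Assuming the usual definition of a Z-score, the per-voxel calculation might look like the sketch below. The predicted mean and standard deviation would come from the LASSO-based formula in the paper, which is not reproduced here; every number in the example is hypothetical, as are the names `z_score` and `z_map`.

```python
def z_score(observed_suv: float, predicted_mean: float, predicted_sd: float) -> float:
    """Standard Z-score: distance from the expected value in units of SD."""
    if predicted_sd <= 0:
        raise ValueError("predicted_sd must be positive")
    return (observed_suv - predicted_mean) / predicted_sd


# Hypothetical liver voxel SUVs and hypothetical patient-specific estimates
voxel_suvs = [2.1, 2.4, 3.9]
z_map = [z_score(v, predicted_mean=2.2, predicted_sd=0.4) for v in voxel_suvs]
print([round(z, 2) for z in z_map])
```

A voxel with a large positive Z-score has uptake well above what is expected for that patient, which is the signal the anomaly detection approach relies on.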

9.

Development and evaluation of an integrated liver nodule diagnostic method by combining the liver segment division and lesion localization/classification models for enhanced focal liver lesion detection.

Takenaga, Tomomi; Hanaoka, Shouhei; Nomura, Yukihiro; Nakao, Takahiro; Shibata, Hisaichi; Miki, Soichiro; Yoshikawa, Takeharu; Hayashi, Naoto; Abe, Osamu.

Radiol Phys Technol ; 17(1): 103-111, 2024 Mar.

Article in English | MEDLINE | ID: mdl-37917288

ABSTRACT

The purpose of the study was to develop a liver nodule diagnostic method that accurately localizes and classifies focal liver lesions and identifies the specific liver segments in which they reside by integrating a liver segment division algorithm using a four-dimensional (4D) fully convolutional residual network (FC-ResNet) with a localization and classification model. We retrospectively collected data and divided 106 gadolinium-ethoxybenzyl-diethylenetriamine pentaacetic acid-enhanced magnetic resonance examinations into Case-sets 1, 2, and 3. A liver segment division algorithm was developed using a 4D FC-ResNet and trained with semi-automatically created silver-standard annotations; performance was evaluated using manually created gold-standard annotations by calculating the Dice scores for each liver segment. The performance of the liver nodule diagnostic method was assessed by comparing the results with those of the original radiology reports. The mean Dice score between the output of the liver segment division model and the gold standard was 0.643 for Case-set 2 (normal liver contours) and 0.534 for Case-set 1 (deformed liver contours). Among the 64 lesions in Case-set 3, the diagnostic method localized 37 lesions, classified 33 lesions, and identified the liver segments for 30 lesions. A total of 28 lesions were true positives, matching the original radiology reports. The liver nodule diagnostic method, which integrates a liver segment division algorithm with a lesion localization and classification model, exhibits great potential for localizing and classifying focal liver lesions and identifying the liver segments in which they reside. Further improvements and validation using larger sample sizes will enhance its performance and clinical applicability.

Subject(s)

Contrast Media , Liver Neoplasms , Humans , Liver Neoplasms/diagnostic imaging , Liver Neoplasms/pathology , Retrospective Studies , Liver/diagnostic imaging , Gadolinium DTPA , Magnetic Resonance Imaging/methods
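The Dice score used in this entry to compare the segment division output with the gold standard is conventionally defined as twice the overlap divided by the total size of the two masks. The sketch below applies that definition to toy binary masks stored as flat lists. It is an illustration under that assumption, not the authors' evaluation code, and the masks are invented.

```python
def dice_score(pred, truth) -> float:
    """Dice = 2 * |A intersect B| / (|A| + |B|) for binary masks as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy masks (hypothetical): 1 = voxel belongs to the segment
predicted = [1, 1, 0, 0, 1]
reference = [1, 0, 0, 1, 1]
print(round(dice_score(predicted, reference), 3))
```

A score of 1 means perfect overlap and 0 means none, so the reported means of 0.643 and 0.534 indicate partial agreement with the gold-standard segments.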

10.

Relationship between Thyroid CT Density, Volume, and Future TSH Elevation: A 5-Year Follow-Up Study.

Kikuchi, Tomohiro; Hanaoka, Shouhei; Nakao, Takahiro; Nomura, Yukihiro; Yoshikawa, Takeharu; Alam, Md Ashraful; Mori, Harushi; Hayashi, Naoto.

Life (Basel) ; 13(12)2023 Dec 06.

Article in English | MEDLINE | ID: mdl-38137904

ABSTRACT

This study aimed to explore the relationship between thyroid-stimulating hormone (TSH) elevation and the baseline computed tomography (CT) density and volume of the thyroid. We examined 86 cases with new-onset hypothyroidism (TSH > 4.5 IU/mL) and 1071 controls from a medical check-up database over 5 years. A deep learning-based thyroid segmentation method was used to assess CT density and volume. Statistical tests and logistic regression were employed to determine differences and odds ratios. Initially, the case group showed a higher CT density (89.8 vs. 81.7 Hounsfield units (HUs)) and smaller volume (13.0 vs. 15.3 mL) than those in the control group. For every +10 HU in CT density and -3 mL in volume, the odds of developing hypothyroidism increased by 1.40 and 1.35, respectively. Over the course of the study, the case group showed a notable CT density reduction (median: -8.9 HU), whereas the control group had a minor decrease (-2.9 HU). Thyroid volume remained relatively stable for both groups. Higher CT density and smaller thyroid volume at baseline are correlated with future TSH elevation. Over time, there was a substantial and minor decrease in CT density in the case and control groups, respectively. Thyroid volumes remained consistent in both cohorts.

11.

Predicting Breast Cancer Risk Using Radiomics Features of Mammography Images.

Suzuki, Yusuke; Hanaoka, Shouhei; Tanabe, Masahiko; Yoshikawa, Takeharu; Seto, Yasuyuki.

J Pers Med ; 13(11)2023 Oct 25.

Article in English | MEDLINE | ID: mdl-38003843

ABSTRACT

Mammography images contain a lot of information about not only the mammary glands but also the skin, adipose tissue, and stroma, which may reflect the risk of developing breast cancer. We aimed to establish a method to predict breast cancer risk using radiomics features of mammography images and to enable further examinations and prophylactic treatment to reduce breast cancer mortality. We used mammography images of 4000 women with breast cancer and 1000 healthy women from the 'starting point set' of the OPTIMAM dataset, a public dataset. We trained a Light Gradient Boosting Machine using radiomics features extracted from mammography images of women with breast cancer (only the healthy side) and healthy women. This model was a binary classifier that could discriminate whether a given mammography image was of the contralateral side of women with breast cancer or not, and its performance was evaluated using five-fold cross-validation. The average area under the curve for five folds was 0.60122. Some radiomics features, such as 'wavelet-H_glcm_Correlation' and 'wavelet-H_firstorder_Maximum', showed distribution differences between the malignant and normal groups. Therefore, a single radiomics feature might reflect the breast cancer risk. The odds ratio of breast cancer incidence was 7.38 in women whose estimated malignancy probability was ≥0.95. Radiomics features from mammography images can help predict breast cancer risk.

12.

Characterization of Brain Volume Changes in Aging Individuals With Normal Cognition Using Serial Magnetic Resonance Imaging.

Fujita, Shohei; Mori, Susumu; Onda, Kengo; Hanaoka, Shouhei; Nomura, Yukihiro; Nakao, Takahiro; Yoshikawa, Takeharu; Takao, Hidemasa; Hayashi, Naoto; Abe, Osamu.

JAMA Netw Open ; 6(6): e2318153, 2023 06 01.

Article in English | MEDLINE | ID: mdl-37378985

ABSTRACT

Importance: Characterizing longitudinal patterns of regional brain volume changes in a population with normal cognition at the individual level could improve understanding of the brain aging process and may aid in the prevention of age-related neurodegenerative diseases. Objective: To investigate age-related trajectories of the volumes and volume change rates of brain structures in participants without dementia. Design, Setting, and Participants: This cohort study was conducted from November 1, 2006, to April 30, 2021, at a single academic health-checkup center among 653 individuals who participated in a health screening program with more than 10 years of serial visits. Exposure: Serial magnetic resonance imaging, Mini-Mental State Examination, health checkup. Main Outcomes and Measures: Volumes and volume change rates across brain tissue types and regions. Results: The study sample included 653 healthy control individuals (mean [SD] age at baseline, 55.1 [9.3] years; median age, 55 years [IQR, 47-62 years]; 447 men [69%]), who were followed up annually for up to 15 years (mean [SD], 11.5 [1.8] years; mean [SD] number of scans, 12.1 [1.9]; total visits, 7915). Each brain structure showed characteristic age-dependent volume and atrophy change rates. In particular, the cortical gray matter showed a consistent pattern of volume loss in each brain lobe with aging. The white matter showed an age-related decrease in volume and an accelerated atrophy rate (regression coefficient, -0.016 [95% CI, -0.012 to -0.011]; P < .001). An accelerated age-related volume increase in the cerebrospinal fluid-filled spaces, particularly in the inferior lateral ventricle and the Sylvian fissure, was also observed (ventricle regression coefficient, 0.042 [95% CI, 0.037-0.047]; P < .001; sulcus regression coefficient, 0.021 [95% CI, 0.018-0.023]; P < .001). 
The temporal lobe atrophy rate accelerated from approximately 70 years of age, preceded by acceleration of atrophy in the hippocampus and amygdala. Conclusions and Relevance: In this cohort study of adults without dementia, age-dependent brain structure volumes and volume change rates in various brain structures were characterized using serial magnetic resonance imaging scans. These findings clarified the normal distributions in the aging brain, which are essential for understanding the process of age-related neurodegenerative diseases.

Subject(s)

Brain , Dementia , Male , Adult , Humans , Middle Aged , Child , Cohort Studies , Brain/diagnostic imaging , Brain/pathology , Aging/pathology , Magnetic Resonance Imaging , Cognition , Atrophy , Dementia/pathology

13.

Automated volume measurement of abdominal adipose tissue from entire abdominal cavity in Dixon MR images using deep learning.

Takahashi, Masato; Takenaga, Tomomi; Nomura, Yukihiro; Hanaoka, Shouhei; Hayashi, Naoto; Nemoto, Mitsutaka; Nakao, Takahiro; Miki, Soichiro; Yoshikawa, Takeharu; Kobayashi, Tomoya; Abe, Shinji.

Radiol Phys Technol ; 16(1): 28-38, 2023 Mar.

Article in English | MEDLINE | ID: mdl-36344662

ABSTRACT

The purpose of this study was to realize an automated volume measurement of abdominal adipose tissue from the entire abdominal cavity in Dixon magnetic resonance (MR) images using deep learning. Our algorithm involves a combination of extraction of the abdominal cavity and body trunk regions using deep learning and extraction of a fat region based on automatic thresholding. To evaluate the proposed method, we calculated the Dice coefficient (DC) between the extracted regions using deep learning and labeled images. We also compared the visceral adipose tissue (VAT) and subcutaneous adipose tissue volumes calculated by employing the proposed method with those calculated from computed tomography (CT) images scanned on the same day using the automatic calculation method previously developed by our group. We implemented our method as a plug-in in a web-based medical image processing platform. The DCs of the abdominal cavity and body trunk regions were 0.952 ± 0.014 and 0.995 ± 0.002, respectively. The VAT volume measured from MR images using the proposed method was almost equivalent to that measured from CT images. The time required for our plug-in to process the test set was 118.9 ± 28.0 s. Using our proposed method, the VAT volume measured from MR images can be an alternative to that measured from CT images.

Subject(s)

Abdominal Cavity , Deep Learning , Reproducibility of Results , Abdominal Fat/diagnostic imaging , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Adipose Tissue

14.

Significance of FDG-PET standardized uptake values in predicting thyroid disease.

Kikuchi, Tomohiro; Hanaoka, Shouhei; Nakao, Takahiro; Nomura, Yukihiro; Yoshikawa, Takeharu; Alam, Ashraful; Mori, Harushi; Hayashi, Naoto.

Eur Thyroid J ; 12(1)2023 02 01.

Article in English | MEDLINE | ID: mdl-36562641

ABSTRACT

Objective: This study aimed to determine a standardized cut-off value for abnormal 18F-fluorodeoxyglucose (FDG) accumulation in the thyroid gland. Methods: Herein, 7013 FDG-PET/CT scans were included. An automatic thyroid segmentation method using two U-nets (2D- and 3D-U-net) was constructed; mean FDG standardized uptake value (SUV), CT value, and volume of the thyroid gland were obtained from each participant. The values were categorized by thyroid function into three groups based on serum thyroid-stimulating hormone levels. Thyroid function and mean SUV with increments of 1 were analyzed, and risk for thyroid dysfunction was calculated. Thyroid dysfunction detection ability was examined using a machine learning method (LightGBM, Microsoft) with age, sex, height, weight, CT value, volume, and mean SUV as explanatory variables. Results: Mean SUV was significantly higher in females with hypothyroidism. Almost 98.9% of participants in the normal group had mean SUV < 2 and 93.8% participants with mean SUV < 2 had normal thyroid function. The hypothyroidism group had more cases with mean SUV ≥ 2. The relative risk of having abnormal thyroid function was 4.6 with mean SUV ≥ 2. The sensitivity and specificity for detecting thyroid dysfunction using LightGBM (Microsoft) were 14.5 and 99%, respectively. Conclusions: Mean SUV ≥ 2 was strongly associated with abnormal thyroid function in this large cohort, indicating that mean SUV with FDG-PET/CT can be used as a criterion for thyroid evaluation. Preliminarily, this study shows the potential utility of detecting thyroid dysfunction based on imaging findings.

Subject(s)

Hypothyroidism , Thyroid Diseases , Female , Humans , Fluorodeoxyglucose F18 , Positron Emission Tomography Computed Tomography , Tomography, X-Ray Computed/methods , Thyroid Diseases/diagnostic imaging
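For readers less familiar with the two metrics reported for the LightGBM classifier in this entry, sensitivity and specificity can be computed from confusion-matrix counts as sketched below. The abstract does not give the underlying counts, so the numbers are hypothetical and deliberately unrelated to the study, and the function name is illustrative.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true cases that were detected
    specificity = tn / (tn + fp)  # share of non-cases correctly left unflagged
    return sensitivity, specificity


# Hypothetical counts (NOT from the study)
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=85, fp=5)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

A classifier can combine high specificity with low sensitivity, as in the reported 99% and 14.5%, when it rarely raises false alarms but also misses most true cases.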

15.

Axillary Lymphadenopathy after Pfizer-BioNTech and Moderna COVID-19 Vaccination: MRI Evaluation.

Yoshikawa, Takeharu; Miki, Soichiro; Nakao, Takahiro; Koshino, Saori; Hayashi, Naoto; Abe, Osamu.

Radiology ; 306(1): 270-278, 2023 01.

Article in English | MEDLINE | ID: mdl-36098641

ABSTRACT

Background COVID-19 vaccination-related axillary lymphadenopathy has become an important problem in cancer imaging. Data are needed to update or support imaging guidelines for conducting appropriate follow-up. Purpose To investigate the prevalence, predisposing factors, and MRI characteristics of COVID-19 vaccination-related axillary lymphadenopathy. Materials and Methods Prospectively collected prevaccination and postvaccination chest MRI scans were secondarily analyzed. Participants who underwent two doses of either the Pfizer-BioNTech or Moderna COVID-19 vaccine and chest MRI from June to October 2021 were included. Enlarged axillary lymph nodes were identified on postvaccination MRI scans compared with prevaccination scans. The lymph node diameter, signal intensity with T2-weighted imaging, and apparent diffusion coefficient (ADC) of the largest enlarged lymph nodes were measured. These values were compared between prevaccination and postvaccination MRI by using the Wilcoxon signed-rank test. Results Overall, 433 participants (mean age, 65 years ± 11 [SD]; 300 men and 133 women) were included. The prevalence of axillary lymphadenopathy in participants 1-14 days after vaccination was 65% (30 of 46). Participants with lymphadenopathy were younger than those without lymphadenopathy (P < .001). Female sex and the Moderna vaccine were predisposing factors (P = .005 and P = .003, respectively). Five or more enlarged lymph nodes were noted in 2% (eight of 433) of participants. Enlarged lymph nodes greater than or equal to 10 mm in the short axis were noted in 1% (four of 433) of participants. The median signal intensity relative to the muscle on T2-weighted images was 4.0; enlarged lymph nodes demonstrated a higher signal intensity (P = .002). The median ADC of enlarged lymph nodes after vaccination in 90 participants was 1.1 × 10⁻³ mm²/sec (range, 0.6-2.0 × 10⁻³ mm²/sec), thus ADC values remained normal. 
Conclusion Axillary lymphadenopathy after the second dose of the Pfizer-BioNTech or Moderna COVID-19 vaccines was frequent within 2 weeks after vaccination, was typically less than 10 mm in size, and had a normal apparent diffusion coefficient. © RSNA, 2022.

Subject(s)

COVID-19 , Lymphadenopathy , Male , Female , Humans , Aged , COVID-19 Vaccines , 2019-nCoV Vaccine mRNA-1273 , Sensitivity and Specificity , COVID-19/pathology , Magnetic Resonance Imaging/methods , Lymph Nodes/pathology , Vaccination

16.

Preliminary study of generalized semiautomatic segmentation for 3D voxel labeling of lesions based on deep learning.

Nomura, Yukihiro; Hanaoka, Shouhei; Takenaga, Tomomi; Nakao, Takahiro; Shibata, Hisaichi; Miki, Soichiro; Yoshikawa, Takeharu; Watadani, Takeyuki; Hayashi, Naoto; Abe, Osamu.

Int J Comput Assist Radiol Surg ; 16(11): 1901-1913, 2021 Nov.

Article in English | MEDLINE | ID: mdl-34652606

ABSTRACT

PURPOSE: The three-dimensional (3D) voxel labeling of lesions requires significant radiologists' effort in the development of computer-aided detection software. To reduce the time required for the 3D voxel labeling, we aimed to develop a generalized semiautomatic segmentation method based on deep learning via a data augmentation-based domain generalization framework. In this study, we investigated whether a generalized semiautomatic segmentation model trained using two types of lesion can segment previously unseen types of lesion. METHODS: We targeted lung nodules in chest CT images, liver lesions in hepatobiliary-phase images of Gd-EOB-DTPA-enhanced MR imaging, and brain metastases in contrast-enhanced MR images. For each lesion, the 32 × 32 × 32 isotropic volume of interest (VOI) around the center of gravity of the lesion was extracted. The VOI was input into a 3D U-Net model to define the label of the lesion. For each type of target lesion, we compared five types of data augmentation and two types of input data. RESULTS: For all considered target lesions, the highest dice coefficients among the training patterns were obtained when using a combination of the existing data augmentation-based domain generalization framework and random monochrome inversion and when using the resized VOI as the input image. The dice coefficients were 0.639 ± 0.124 for the lung nodules, 0.660 ± 0.137 for the liver lesions, and 0.727 ± 0.115 for the brain metastases. CONCLUSIONS: Our generalized semiautomatic segmentation model could label unseen three types of lesion with different contrasts from the surroundings. In addition, the resized VOI as the input image enables the adaptation to the various sizes of lesions even when the size distribution differed between the training set and the test set.

Subject(s)

Deep Learning , Humans , Liver , Magnetic Resonance Imaging , Thorax , Tomography, X-Ray Computed

17.

Automatic detection of actionable radiology reports using bidirectional encoder representations from transformers.

Nakamura, Yuta; Hanaoka, Shouhei; Nomura, Yukihiro; Nakao, Takahiro; Miki, Soichiro; Watadani, Takeyuki; Yoshikawa, Takeharu; Hayashi, Naoto; Abe, Osamu.

BMC Med Inform Decis Mak ; 21(1): 262, 2021 09 11.

Article in English | MEDLINE | ID: mdl-34511100

ABSTRACT

BACKGROUND: It is essential for radiologists to communicate actionable findings to the referring clinicians reliably. Natural language processing (NLP) has been shown to help identify free-text radiology reports including actionable findings. However, the application of recent deep learning techniques to radiology reports, which can improve the detection performance, has not been thoroughly examined. Moreover, free-text that clinicians input in the ordering form (order information) has seldom been used to identify actionable reports. This study aims to evaluate the benefits of two new approaches: (1) bidirectional encoder representations from transformers (BERT), a recent deep learning architecture in NLP, and (2) using order information in addition to radiology reports. METHODS: We performed a binary classification to distinguish actionable reports (i.e., radiology reports tagged as actionable in actual radiological practice) from non-actionable ones (those without an actionable tag). 90,923 Japanese radiology reports in our hospital were used, of which 788 (0.87%) were actionable. We evaluated four methods, statistical machine learning with logistic regression (LR) and with gradient boosting decision tree (GBDT), and deep learning with a bidirectional long short-term memory (LSTM) model and a publicly available Japanese BERT model. Each method was used with two different inputs, radiology reports alone and pairs of order information and radiology reports. Thus, eight experiments were conducted to examine the performance. RESULTS: Without order information, BERT achieved the highest area under the precision-recall curve (AUPRC) of 0.5138, which showed a statistically significant improvement over LR, GBDT, and LSTM, and the highest area under the receiver operating characteristic curve (AUROC) of 0.9516. Simply coupling the order information with the radiology reports slightly increased the AUPRC of BERT but did not lead to a statistically significant improvement. This may be due to the complexity of clinical decisions made by radiologists. CONCLUSIONS: BERT was assumed to be useful to detect actionable reports. More sophisticated methods are required to use order information effectively.

Subject(s)

Natural Language Processing , Radiology , Humans , Logistic Models , Machine Learning , Radiography

18.

Performance changes due to differences in training data for cerebral aneurysm detection in head MR angiography images.

Nomura, Yukihiro; Hanaoka, Shouhei; Nakao, Takahiro; Hayashi, Naoto; Yoshikawa, Takeharu; Miki, Soichiro; Watadani, Takeyuki; Abe, Osamu.

Jpn J Radiol ; 39(11): 1039-1048, 2021 Nov.

Article in English | MEDLINE | ID: mdl-34125368

ABSTRACT

PURPOSE: The performance of computer-aided detection (CAD) software depends on the quality and quantity of the dataset used for machine learning. If the data characteristics in development and practical use are different, the performance of CAD software degrades. In this study, we investigated changes in detection performance due to differences in training data for cerebral aneurysm detection software in head magnetic resonance angiography images. MATERIALS AND METHODS: We utilized three types of CAD software for cerebral aneurysm detection in MRA images, which were based on 3D local intensity structure analysis, graph-based features, and convolutional neural network. For each type of CAD software, we compared three types of training pattern, which were two types of training using single-site data and one type of training using multisite data. We also carried out internal and external evaluations. RESULTS: In training using single-site data, the performance of CAD software largely and unpredictably fluctuated when the training dataset was changed. Training using multisite data did not show the lowest performance among the three training patterns for any CAD software and dataset. CONCLUSION: The training of cerebral aneurysm detection software using data collected from multiple sites is desirable to ensure the stable performance of the software.

Subject(s)

Intracranial Aneurysm , Angiography , Cerebral Angiography , Humans , Intracranial Aneurysm/diagnostic imaging , Machine Learning , Magnetic Resonance Angiography , Magnetic Resonance Imaging , Neural Networks, Computer

19.

Multichannel three-dimensional fully convolutional residual network-based focal liver lesion detection and classification in Gd-EOB-DTPA-enhanced MRI.

Takenaga, Tomomi; Hanaoka, Shouhei; Nomura, Yukihiro; Nakao, Takahiro; Shibata, Hisaichi; Miki, Soichiro; Yoshikawa, Takeharu; Hayashi, Naoto; Abe, Osamu.

Int J Comput Assist Radiol Surg ; 16(9): 1527-1536, 2021 Sep.

Article in English | MEDLINE | ID: mdl-34075548

ABSTRACT

PURPOSE: Gadolinium ethoxybenzyl diethylenetriamine pentaacetic acid (Gd-EOB-DTPA)-enhanced magnetic resonance imaging (MRI) has high diagnostic accuracy in the detection of liver lesions. There is a demand for computer-aided detection/diagnosis software for Gd-EOB-DTPA-enhanced MRI. We propose a deep learning-based method using one three-dimensional fully convolutional residual network (3D FC-ResNet) for liver segmentation and another 3D FC-ResNet for simultaneous detection and classification of a focal liver lesion in Gd-EOB-DTPA-enhanced MRI. METHODS: We prepared a five-phase (unenhanced, arterial, portal venous, equilibrium, and hepatobiliary phases) series as the input image sets and labeled focal liver lesion (hepatocellular carcinoma, metastasis, hemangiomas, cysts, and scars) images as the output image sets. We used 100 cases to train our model, 42 cases to determine the hyperparameters of our model, and 42 cases to evaluate our model. We evaluated our model by free-response receiver operating characteristic curve analysis and using a confusion matrix. RESULTS: Our model simultaneously detected and classified focal liver lesions. In the test cases, the detection accuracy for whole focal liver lesions had a true-positive ratio of 0.6 at an average of 25 false positives per case. The classification accuracy was 0.790. CONCLUSION: We proposed the simultaneous detection and classification of a focal liver lesion in Gd-EOB-DTPA-enhanced MRI using multichannel 3D FC-ResNet. Our results indicated simultaneous detection and classification are possible using a single network. It is necessary to further improve detection sensitivity to help radiologists.

Subject(s)

Carcinoma, Hepatocellular , Liver Neoplasms , Contrast Media , Gadolinium DTPA , Humans , Liver/diagnostic imaging , Liver Neoplasms/diagnostic imaging , Magnetic Resonance Imaging

20.

Detectability of pancreatic lesions by low-dose unenhanced computed tomography using iterative reconstruction.

Sugawara, Haruto; Yoshikawa, Takeharu; Kunimatsu, Akira; Akai, Hiroyuki; Yasaka, Koichiro; Abe, Osamu.

Eur J Radiol ; 141: 109776, 2021 Aug.

Article in English | MEDLINE | ID: mdl-34029934

ABSTRACT

OBJECTIVES: To investigate the detectability of pancreatic cystic lesions and main pancreatic duct dilation by low-dose unenhanced computed tomography (CT). MATERIAL AND METHODS: This study included 2684 patients who underwent low-dose unenhanced CT using iterative reconstruction and magnetic resonance imaging (MRI) as a part of a health-screening program between February 1, 2019 and December 31, 2019. Patients diagnosed with pancreatic cystic lesions and/or dilatations of the main pancreatic duct on MRI were identified. Detection rates by low dose CT in terms of lesion size were tested for significance by Fisher's exact test. RESULTS: Of the 2684 patients, 558 (20.8 %) had pancreatic cystic lesions and 22 (0.8 %) had main pancreatic duct dilatation on MRI. The low-dose CT detection rates among the pancreatic cystic lesions were as follows: 1-9-mm cysts, three (0.65 %) of 461; 10-19-mm cysts, 17 (21.25 %) of 80, and ≥20-mm cysts, eight (47.06 %) of 17. The detection rates were significantly higher in the 10-19-mm and the ≥20-mm cyst group than in the 1-9-mm cyst group (p < 0.001). The detection rates among the main pancreatic duct dilatations were as follows: 3-5-mm dilatations, two (11.76 %) of 17 and ≥6-mm dilatations, four (80 %) of five, which were significantly higher rates than that for the 3-5-mm dilatations (p = 0.009). CONCLUSION: Small pancreatic cysts and slight main pancreatic duct dilatation were practically undetectable by low-dose unenhanced CT. The application of a low-dose CT protocol as a screening tool in the detection of pancreatic abnormalities is not recommended.

Subject(s)

Pancreatic Cyst , Pancreatic Neoplasms , Humans , Magnetic Resonance Imaging , Pancreas/diagnostic imaging , Pancreatic Cyst/diagnostic imaging , Pancreatic Ducts/diagnostic imaging , Retrospective Studies , Tomography, X-Ray Computed
