Search | VHL CLAP/WR-PAHO/WHO

1.

The Association of Metabolic Brain MRI, Amyloid PET, and Clinical Factors: A Study of Alzheimer's Disease and Normal Controls From the Open Access Series of Imaging Studies Dataset.

Matsushita, Shu; Tatekawa, Hiroyuki; Ueda, Daiju; Takita, Hirotaka; Horiuchi, Daisuke; Tsukamoto, Taro; Shimono, Taro; Miki, Yukio.

J Magn Reson Imaging ; 59(4): 1341-1348, 2024 Apr.

Article in English | MEDLINE | ID: mdl-37424114

ABSTRACT

BACKGROUND: Although brain activities in Alzheimer's disease (AD) might be evaluated using MRI and PET, the relationships between brain temperature (BT), the index of diffusivity along the perivascular space (ALPS index), and amyloid deposition in the cerebral cortex are still unclear. PURPOSE: To investigate the relationship between metabolic imaging measurements and clinical information in patients with AD and normal controls (NCs). STUDY TYPE: Retrospective analysis of a prospective dataset. POPULATION: 58 participants (78.3 ± 6.8 years; 30 female): 29 AD patients and 29 age- and sex-matched NCs from the Open Access Series of Imaging Studies dataset. FIELD STRENGTH/SEQUENCE: 3T; T1-weighted magnetization-prepared rapid gradient-echo, diffusion tensor imaging with 64 directions, and dynamic 18F-florbetapir PET. ASSESSMENT: Imaging metrics were compared between AD and NCs. These included BT calculated by the diffusivity of the lateral ventricles, ALPS index that reflects the glymphatic system, the mean standardized uptake value ratio (SUVR) of amyloid PET in the cerebral cortex, and clinical information, such as age, sex, and MMSE. STATISTICAL TESTS: Pearson's or Spearman's correlation and multiple linear regression analyses. P values <0.05 were defined as statistically significant. RESULTS: Significant positive correlations were found between BT and ALPS index (r = 0.44 for NCs), while significant negative correlations were found between age and ALPS index (rs = -0.43 for AD and -0.47 for NCs). The SUVR of amyloid PET was not significantly associated with BT (P = 0.81 for AD and 0.21 for NCs) or ALPS index (P = 0.10 for AD and 0.52 for NCs). In the multiple regression analysis, age was significantly associated with BT, while age, sex, and presence of AD were significantly associated with the ALPS index. DATA CONCLUSION: Impairment of the glymphatic system measured using MRI was associated with lower BT and aging. LEVEL OF EVIDENCE: 3. TECHNICAL EFFICACY STAGE: 1.

Subject(s)

Alzheimer Disease , Humans , Female , Alzheimer Disease/diagnostic imaging , Alzheimer Disease/metabolism , Diffusion Tensor Imaging/methods , Retrospective Studies , Prospective Studies , Access to Information , Positron-Emission Tomography/methods , Magnetic Resonance Imaging/methods , Amyloid , Amyloidogenic Proteins , Cerebral Cortex

2.

ChatGPT's diagnostic performance based on textual vs. visual information compared to radiologists' diagnostic performance in musculoskeletal radiology.

Horiuchi, Daisuke; Tatekawa, Hiroyuki; Oura, Tatsushi; Shimono, Taro; Walston, Shannon L; Takita, Hirotaka; Matsushita, Shu; Mitsuyama, Yasuhito; Miki, Yukio; Ueda, Daiju.

Eur Radiol ; 2024 Jul 12.

Article in English | MEDLINE | ID: mdl-38995378

ABSTRACT

OBJECTIVES: To compare the diagnostic accuracy of Generative Pre-trained Transformer (GPT)-4-based ChatGPT, GPT-4 with vision (GPT-4V) based ChatGPT, and radiologists in musculoskeletal radiology. MATERIALS AND METHODS: We included 106 "Test Yourself" cases from Skeletal Radiology between January 2014 and September 2023. We input the medical history and imaging findings into GPT-4-based ChatGPT and the medical history and images into GPT-4V-based ChatGPT, then both generated a diagnosis for each case. Two radiologists (a radiology resident and a board-certified radiologist) independently provided diagnoses for all cases. The diagnostic accuracy rates were determined based on the published ground truth. Chi-square tests were performed to compare the diagnostic accuracy of GPT-4-based ChatGPT, GPT-4V-based ChatGPT, and radiologists. RESULTS: GPT-4-based ChatGPT significantly outperformed GPT-4V-based ChatGPT (p < 0.001) with accuracy rates of 43% (46/106) and 8% (9/106), respectively. The radiology resident and the board-certified radiologist achieved accuracy rates of 41% (43/106) and 53% (56/106). The diagnostic accuracy of GPT-4-based ChatGPT was comparable to that of the radiology resident, but was lower than that of the board-certified radiologist although the differences were not significant (p = 0.78 and 0.22, respectively). The diagnostic accuracy of GPT-4V-based ChatGPT was significantly lower than those of both radiologists (p < 0.001 and < 0.001, respectively). CONCLUSION: GPT-4-based ChatGPT demonstrated significantly higher diagnostic accuracy than GPT-4V-based ChatGPT. While GPT-4-based ChatGPT's diagnostic performance was comparable to radiology residents, it did not reach the performance level of board-certified radiologists in musculoskeletal radiology. 
CLINICAL RELEVANCE STATEMENT: GPT-4-based ChatGPT outperformed GPT-4V-based ChatGPT and was comparable to radiology residents, but it did not reach the level of board-certified radiologists in musculoskeletal radiology. Radiologists should comprehend ChatGPT's current performance as a diagnostic tool for optimal utilization. KEY POINTS: This study compared the diagnostic performance of GPT-4-based ChatGPT, GPT-4V-based ChatGPT, and radiologists in musculoskeletal radiology. GPT-4-based ChatGPT was comparable to radiology residents, but did not reach the level of board-certified radiologists. When utilizing ChatGPT, it is crucial to input appropriate descriptions of imaging findings rather than the images.
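
The chi-square comparisons in the abstract above can be sketched from the reported counts alone (46/106, 9/106, 43/106, and 56/106 correct). The following is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the use of a Yates continuity correction is an assumption, since the abstract does not say whether one was applied. Under that assumption, the resident and board-certified comparisons come out close to the reported p = 0.78 and 0.22.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d, yates=True):
    """Return (chi2, p) for the 2x2 table [[a, b], [c, d]], 1 degree of freedom."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Assumed continuity correction; the corrected difference cannot go below 0.
        diff = max(0.0, diff - n / 2)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * diff ** 2 / denom
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p = erfc(sqrt(chi2 / 2))
    return chi2, p


# Correct diagnoses out of 106 cases, as reported in the abstract.
total = 106
correct = {"GPT-4": 46, "GPT-4V": 9, "resident": 43, "board-certified": 56}


def compare(first, second):
    """Compare two readers on correct vs. incorrect counts."""
    a, c = correct[first], correct[second]
    return chi_square_2x2(a, total - a, c, total - c)


for pair in [("GPT-4", "GPT-4V"), ("GPT-4", "resident"), ("GPT-4", "board-certified")]:
    chi2, p = compare(*pair)
    print(f"{pair[0]} vs {pair[1]}: chi2 = {chi2:.2f}, p = {p:.3g}")
```

Passing `yates=False` gives the uncorrected statistic, which yields somewhat smaller p values for the same tables.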

3.

Accuracy of ChatGPT generated diagnosis from patient's medical history and imaging findings in neuroradiology cases.

Horiuchi, Daisuke; Tatekawa, Hiroyuki; Shimono, Taro; Walston, Shannon L; Takita, Hirotaka; Matsushita, Shu; Oura, Tatsushi; Mitsuyama, Yasuhito; Miki, Yukio; Ueda, Daiju.

Neuroradiology ; 66(1): 73-79, 2024 Jan.

Article in English | MEDLINE | ID: mdl-37994939

ABSTRACT

PURPOSE: The noteworthy performance of Chat Generative Pre-trained Transformer (ChatGPT), an artificial intelligence text generation model based on the GPT-4 architecture, has been demonstrated in various fields; however, its potential applications in neuroradiology remain unexplored. This study aimed to evaluate the diagnostic performance of GPT-4-based ChatGPT in neuroradiology. METHODS: We collected 100 consecutive "Case of the Week" cases from the American Journal of Neuroradiology between October 2021 and September 2023. ChatGPT generated a diagnosis from the patient's medical history and imaging findings for each case. The diagnostic accuracy rate was then determined using the published ground truth. Each case was categorized by anatomical location (brain, spine, and head & neck), and brain cases were further divided into central nervous system (CNS) tumor and non-CNS tumor groups. Fisher's exact test was conducted to compare the accuracy rates among the three anatomical locations, as well as between the CNS tumor and non-CNS tumor groups. RESULTS: ChatGPT achieved a diagnostic accuracy rate of 50% (50/100 cases). There were no significant differences between the accuracy rates of the three anatomical locations (p = 0.89). The accuracy rate was significantly lower for the CNS tumor group compared to the non-CNS tumor group in the brain cases (16% [3/19] vs. 62% [36/58], p < 0.001). CONCLUSION: This study demonstrated the diagnostic performance of ChatGPT in neuroradiology. ChatGPT's diagnostic accuracy varied depending on disease etiologies, and its diagnostic accuracy was significantly lower in CNS tumors compared to non-CNS tumors.
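
The CNS tumor versus non-CNS tumor comparison above (3/19 vs. 36/58 correct) can be reproduced in outline with a hand-rolled Fisher's exact test. This is a minimal sketch using only the counts given in the abstract; the function name is hypothetical, and the two-sided definition used here (summing all tables no more probable than the observed one) is an assumption, because the abstract does not state how the test was configured.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    def weight(k):
        # Number of ways to obtain k "successes" in row 1 with fixed margins.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    total = comb(row1 + row2, col1)
    # Exact integer comparison avoids floating-point ties.
    extreme = sum(weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed)
    return extreme / total


# Rows: CNS tumor, non-CNS tumor; columns: correct, incorrect (from the abstract).
p_value = fisher_exact_two_sided(3, 19 - 3, 36, 58 - 36)
print(f"two-sided p = {p_value:.2g}")
```

Working with integer binomial coefficients keeps the calculation exact until the final division.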

Subject(s)

Artificial Intelligence , Neoplasms , Humans , Head , Brain , Neck

4.

Evaluation of cranial nerve involvement in chordomas and chondrosarcomas: a retrospective imaging study.

Oura, Tatsushi; Shimono, Taro; Horiuchi, Daisuke; Goto, Takeo; Takita, Hirotaka; Tsukamoto, Taro; Tatekawa, Hiroyuki; Ueda, Daiju; Matsushita, Shu; Mitsuyama, Yasuhito; Atsukawa, Natsuko; Miki, Yukio.

Neuroradiology ; 66(6): 955-961, 2024 Jun.

Article in English | MEDLINE | ID: mdl-38407581

ABSTRACT

PURPOSE: Cranial nerve involvement (CNI) influences the treatment strategies and prognosis of head and neck tumors. However, its incidence in skull base chordomas and chondrosarcomas remains to be investigated. This study evaluated the imaging features of chordoma and chondrosarcoma, with a focus on the differences in CNI. METHODS: Forty-two patients (26 and 16 patients with chordomas and chondrosarcomas, respectively) treated at our institution between January 2007 and January 2023 were included in this retrospective study. Imaging features, such as the maximum diameter, tumor location (midline or off-midline), calcification, signal intensity on T2-weighted images, mean apparent diffusion coefficient (ADC) values, contrast enhancement, and CNI, were evaluated and compared using Fisher's exact test or the Mann-Whitney U-test. The odds ratio (OR) was calculated to evaluate the association between the histological type and imaging features. RESULTS: The incidence of CNI in chondrosarcomas was significantly higher than that in chordomas (63% vs. 8%, P < 0.001). An off-midline location was more common in chondrosarcomas than in chordomas (86% vs. 13%; P < 0.001). The mean ADC values of chondrosarcomas were significantly higher than those of chordomas (P < 0.001). Significant associations were identified between chondrosarcomas and CNI (OR = 20.00; P < 0.001), location (OR = 53.70; P < 0.001), and mean ADC values (OR = 1.01; P = 0.002). CONCLUSION: CNI and an off-midline location were significantly more frequent in chondrosarcomas than in chordomas. CNI, tumor location, and the mean ADC can help distinguish between these entities.

Subject(s)

Chondrosarcoma , Chordoma , Skull Base Neoplasms , Humans , Female , Male , Retrospective Studies , Middle Aged , Chordoma/diagnostic imaging , Chordoma/pathology , Adult , Chondrosarcoma/diagnostic imaging , Chondrosarcoma/pathology , Aged , Skull Base Neoplasms/diagnostic imaging , Contrast Media , Adolescent , Magnetic Resonance Imaging/methods

5.

AI-based Virtual Synthesis of Methionine PET from Contrast-enhanced MRI: Development and External Validation Study.

Takita, Hirotaka; Matsumoto, Toshimasa; Tatekawa, Hiroyuki; Katayama, Yutaka; Nakajo, Kosuke; Uda, Takehiro; Mitsuyama, Yasuhito; Walston, Shannon L; Miki, Yukio; Ueda, Daiju.

Radiology ; 308(2): e223016, 2023 08.

Article in English | MEDLINE | ID: mdl-37526545

ABSTRACT

Background Carbon 11 (11C)-methionine is a useful PET radiotracer for the management of patients with glioma, but radiation exposure and lack of molecular imaging facilities limit its use. Purpose To generate synthetic methionine PET images from contrast-enhanced (CE) MRI through an artificial intelligence (AI)-based image-to-image translation model and to compare its performance for grading and prognosis of gliomas with that of real PET. Materials and Methods An AI-based model to generate synthetic methionine PET images from CE MRI was developed and validated from patients who underwent both methionine PET and CE MRI at a university hospital from January 2007 to December 2018 (institutional data set). Pearson correlation coefficients for the maximum and mean tumor to background ratio (TBRmax and TBRmean, respectively) of methionine uptake and the lesion volume between synthetic and real PET were calculated. Two additional open-source glioma databases of preoperative CE MRI without methionine PET were used as the external test set. Using the TBRs, the area under the receiver operating characteristic curve (AUC) for classifying high-grade and low-grade gliomas and overall survival were evaluated. Results The institutional data set included 362 patients (mean age, 49 years ± 19 [SD]; 195 female, 167 male; training, n = 294; validation, n = 34; test, n = 34). In the internal test set, Pearson correlation coefficients were 0.68 (95% CI: 0.47, 0.81), 0.76 (95% CI: 0.59, 0.86), and 0.92 (95% CI: 0.85, 0.95) for TBRmax, TBRmean, and lesion volume, respectively. The external test set included 344 patients with gliomas (mean age, 53 years ± 15; 192 male, 152 female; high grade, n = 269). The AUC for TBRmax was 0.81 (95% CI: 0.75, 0.86) and the overall survival analysis showed a significant difference between the high (2-year survival rate, 27%) and low (2-year survival rate, 71%; P < .001) TBRmax groups. 
Conclusion The AI-based model-generated synthetic methionine PET images strongly correlated with real PET images and showed good performance for glioma grading and prognostication. Published under a CC BY 4.0 license. Supplemental material is available for this article.

Subject(s)

Brain Neoplasms , Glioma , Humans , Male , Female , Middle Aged , Methionine , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/pathology , Artificial Intelligence , Positron-Emission Tomography/methods , Neoplasm Grading , Glioma/diagnostic imaging , Glioma/pathology , Magnetic Resonance Imaging/methods , Racemethionine

6.

Brain temperature remains stable during the day: a study of diffusion-weighted imaging thermometry in healthy individuals.

Horiuchi, Daisuke; Shimono, Taro; Tatekawa, Hiroyuki; Tsukamoto, Taro; Takita, Hirotaka; Matsushita, Shu; Miki, Yukio.

Neuroradiology ; 65(8): 1239-1246, 2023 Aug.

Article in English | MEDLINE | ID: mdl-36949255

ABSTRACT

PURPOSE: To investigate the daily fluctuations in brain temperature in healthy individuals using magnetic resonance (MR) diffusion-weighted imaging (DWI) thermometry and to clarify the associations between the brain and body temperatures and sex. METHODS: Thirty-two age-matched healthy male and female volunteers (male = 16, 20-38 years) were recruited between July 2021 and January 2022. Brain MR examinations were performed in the morning and evening phases on the same day to calculate the brain temperatures using DWI thermometry. Body temperature was also measured in each MR examination. Group comparisons of body and brain temperatures between the two phases were performed using paired t-tests. A multiple linear regression model was used to predict the morning brain temperature using sex, evening brain temperature, and the interaction between sex and evening brain temperature as covariates. RESULTS: Body temperatures were significantly higher in the evening than in the morning in all participants, the male group, and the female group (p < 0.001, = 0.001, and < 0.001, respectively). Meanwhile, no significant difference was observed between the morning and evening brain temperatures in each analysis (p = 0.23, 0.70, and 0.16, respectively). Multiple linear regression analysis showed significant associations of morning brain temperature with sex (p = 0.038), evening brain temperature (p < 0.001), and the interaction between sex and evening brain temperature (p = 0.036). CONCLUSION: Unlike body temperature, brain temperature showed no significant daily fluctuations; however, daily fluctuations in brain temperature may vary depending on sex.

Subject(s)

Body Temperature , Thermometry , Male , Humans , Female , Temperature , Brain/diagnostic imaging , Thermometry/methods , Diffusion Magnetic Resonance Imaging/methods

7.

Frequency and imaging features of the adjacent osseous changes of salivary gland carcinomas in the head and neck region.

Horiuchi, Daisuke; Shimono, Taro; Tatekawa, Hiroyuki; Tsukamoto, Taro; Takita, Hirotaka; Okazaki, Masahiro; Miki, Yukio.

Neuroradiology ; 64(9): 1869-1877, 2022 Sep.

Article in English | MEDLINE | ID: mdl-35524819

ABSTRACT

PURPOSE: The association between salivary gland carcinomas and adjacent osseous changes in the head and neck region is not clear. We evaluated the frequency and imaging features of such changes and investigated the specific characteristics of salivary gland carcinomas associated with them. METHODS: A total of 118 patients with histologically proven salivary gland carcinomas were retrospectively reviewed. The imaging characteristics of osseous changes were sorted into three categories based on computed tomography images: sclerotic change, erosive change, and lytic change. The frequency of all these osseous changes and any one of them was compared between different pathologies using Fisher's exact test. Odds ratios were calculated to evaluate the association between these changes and perineural spread. RESULTS: Osseous changes were found in 21 (18%) of 118 cases. Among these, seven (6%) cases were with sclerotic, nine (8%) with erosive, and nine (8%) with lytic changes (four with mixed change). Adenoid cystic carcinoma showed a significantly higher frequency of sclerotic and erosive changes, and either osseous change, than the other salivary gland carcinomas (p < 0.001 for each). Sclerotic changes were only present in the adenoid cystic carcinomas. Perineural spread was a significant factor in showing higher osseous change frequencies (odds ratio = 3.98, p = 0.006). CONCLUSION: Among salivary gland carcinomas in the head and neck region, adenoid cystic carcinomas had a significantly higher frequency of adjacent osseous changes, especially sclerotic changes, than other salivary gland carcinomas.

Subject(s)

Carcinoma, Adenoid Cystic , Salivary Gland Neoplasms , Carcinoma, Adenoid Cystic/diagnostic imaging , Carcinoma, Adenoid Cystic/pathology , Humans , Neck/pathology , Retrospective Studies , Salivary Gland Neoplasms/diagnostic imaging , Salivary Gland Neoplasms/pathology , Salivary Glands/pathology

8.

Deep Learning-based Angiogram Generation Model for Cerebral Angiography without Misregistration Artifacts.

Ueda, Daiju; Katayama, Yutaka; Yamamoto, Akira; Ichinose, Tsutomu; Arima, Hironori; Watanabe, Yusuke; Walston, Shannon L; Tatekawa, Hiroyuki; Takita, Hirotaka; Honjo, Takashi; Shimazaki, Akitoshi; Kabata, Daijiro; Ichida, Takao; Goto, Takeo; Miki, Yukio.

Radiology ; 299(3): 675-681, 2021 06.

Article in English | MEDLINE | ID: mdl-33787336

ABSTRACT

Background Digital subtraction angiography (DSA) generates an image by subtracting a mask image from a dynamic angiogram. However, misregistration artifacts caused by patient movement can result in unclear DSA images that interrupt procedures. Purpose To train and to validate a deep learning (DL)-based model to produce DSA-like cerebral angiograms directly from dynamic angiograms and then quantitatively and visually evaluate these angiograms for clinical usefulness. Materials and Methods A retrospective model development and validation study was conducted on dynamic and DSA image pairs consecutively collected from January 2019 through April 2019. Angiograms showing misregistration were first separated per patient by two radiologists and sorted into the misregistration test data set. Nonmisregistration angiograms were divided into development and external test data sets at a ratio of 8:1 per patient. The development data set was divided into training and validation data sets at a ratio of 3:1 per patient. The DL model was created by using the training data set, tuned with the validation data set, and then evaluated quantitatively with the external test data set and visually with the misregistration test data set. Quantitative evaluations used the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) with mixed linear models. Visual evaluation was conducted by using a numerical rating scale. Results The training, validation, nonmisregistration test, and misregistration test data sets included 10 751, 2784, 1346, and 711 paired images collected from 40 patients (mean age, 62 years ± 11 [standard deviation]; 33 women). In the quantitative evaluation, DL-generated angiograms showed a mean PSNR value of 40.2 dB ± 4.05 and a mean SSIM value of 0.97 ± 0.02, indicating high coincidence with the paired DSA images. 
In the visual evaluation, the median ratings of the DL-generated angiograms were similar to or better than those of the original DSA images for all 24 sequences. Conclusion The deep learning-based model provided clinically useful cerebral angiograms free from clinically significant artifacts directly from dynamic angiograms. Published under a CC BY 4.0 license. Supplemental material is available for this article.
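
For readers unfamiliar with the PSNR metric reported above, the standard definition is PSNR = 10 · log10(MAX² / MSE), where MSE is the mean squared error between the reference and generated images. The snippet below is a minimal sketch of that definition on flattened pixel lists; the function name, the toy pixel values, and the assumed 8-bit maximum of 255 are illustrative assumptions and are not taken from the study.

```python
from math import log10


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * log10(max_value ** 2 / mse)


# Hypothetical flattened 2x2 images; max_value=255 assumes 8-bit data.
reference_pixels = [0, 64, 128, 255]
generated_pixels = [2, 62, 130, 251]
print(f"PSNR = {psnr(reference_pixels, generated_pixels):.1f} dB")
```

Higher values indicate closer agreement; identical images give an infinite PSNR because the MSE is zero.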

Subject(s)

Cerebral Angiography , Deep Learning , Image Enhancement/methods , Adult , Aged , Aged, 80 and over , Angiography, Digital Subtraction , Artifacts , Female , Humans , Image Processing, Computer-Assisted/methods , Male , Middle Aged , Retrospective Studies , Signal-To-Noise Ratio

9.

ChatGPT's Diagnostic Performance from Patient History and Imaging Findings on the Diagnosis Please Quizzes.

Ueda, Daiju; Mitsuyama, Yasuhito; Takita, Hirotaka; Horiuchi, Daisuke; Walston, Shannon L; Tatekawa, Hiroyuki; Miki, Yukio.

Radiology ; 308(1): e231040, 2023 07.

Article in English | MEDLINE | ID: mdl-37462501

Subject(s)

Curriculum , Educational Measurement , Humans , Educational Measurement/methods

10.

Evaluating Biases and Quality Issues in Intermodality Image Translation Studies for Neuroradiology: A Systematic Review.

Walston, Shannon L; Tatekawa, Hiroyuki; Takita, Hirotaka; Miki, Yukio; Ueda, Daiju.

AJNR Am J Neuroradiol ; 45(6): 826-832, 2024 06 07.

Article in English | MEDLINE | ID: mdl-38663993

ABSTRACT

BACKGROUND: Intermodality image-to-image translation is an artificial intelligence technique for generating images of one modality from those of another. PURPOSE: This review was designed to systematically identify and quantify biases and quality issues preventing validation and clinical application of artificial intelligence models for intermodality image-to-image translation of brain imaging. DATA SOURCES: PubMed, Scopus, and IEEE Xplore were searched through August 2, 2023, for artificial intelligence-based image translation models of radiologic brain images. STUDY SELECTION: This review collected 102 works published between April 2017 and August 2023. DATA ANALYSIS: Eligible studies were evaluated for quality using the Checklist for Artificial Intelligence in Medical Imaging (CLAIM) and for bias using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). Medically-focused article adherence was compared with that of engineering-focused articles overall with the Mann-Whitney U test and for each criterion using the Fisher exact test. DATA SYNTHESIS: Median adherence was 69% for the relevant CLAIM criteria and 38% for PROBAST questions. CLAIM adherence was lower for engineering-focused articles compared with medically-focused articles (65% versus 73%, P < .001). Engineering-focused studies had higher adherence for model description criteria, and medically-focused studies had higher adherence for data set and evaluation descriptions. LIMITATIONS: Our review is limited by the study design and model heterogeneity. CONCLUSIONS: Nearly all studies revealed critical issues preventing clinical application, with engineering-focused studies showing higher adherence for the technical model description but significantly lower overall adherence than medically-focused studies. The pursuit of clinical application requires collaboration from both fields to improve reporting.

Subject(s)

Neuroimaging , Humans , Neuroimaging/methods , Neuroimaging/standards , Bias , Artificial Intelligence

11.

Comparing the Diagnostic Performance of GPT-4-based ChatGPT, GPT-4V-based ChatGPT, and Radiologists in Challenging Neuroradiology Cases.

Horiuchi, Daisuke; Tatekawa, Hiroyuki; Oura, Tatsushi; Oue, Satoshi; Walston, Shannon L; Takita, Hirotaka; Matsushita, Shu; Mitsuyama, Yasuhito; Shimono, Taro; Miki, Yukio; Ueda, Daiju.

Clin Neuroradiol ; 2024 May 28.

Article in English | MEDLINE | ID: mdl-38806794

ABSTRACT

PURPOSE: To compare the diagnostic performance among Generative Pre-trained Transformer (GPT)-4-based ChatGPT, GPT-4 with vision (GPT-4V) based ChatGPT, and radiologists in challenging neuroradiology cases. METHODS: We collected 32 consecutive "Freiburg Neuropathology Case Conference" cases from the journal Clinical Neuroradiology between March 2016 and December 2023. We input the medical history and imaging findings into GPT-4-based ChatGPT and the medical history and images into GPT-4V-based ChatGPT, then both generated a diagnosis for each case. Six radiologists (three radiology residents and three board-certified radiologists) independently reviewed all cases and provided diagnoses. ChatGPT and radiologists' diagnostic accuracy rates were evaluated based on the published ground truth. Chi-square tests were performed to compare the diagnostic accuracy of GPT-4-based ChatGPT, GPT-4V-based ChatGPT, and radiologists. RESULTS: GPT-4 and GPT-4V-based ChatGPTs achieved accuracy rates of 22% (7/32) and 16% (5/32), respectively. Radiologists achieved the following accuracy rates: three radiology residents 28% (9/32), 31% (10/32), and 28% (9/32); and three board-certified radiologists 38% (12/32), 47% (15/32), and 44% (14/32). GPT-4-based ChatGPT's diagnostic accuracy was lower than each radiologist, although not significantly (all p > 0.07). GPT-4V-based ChatGPT's diagnostic accuracy was also lower than each radiologist and significantly lower than two board-certified radiologists (p = 0.02 and 0.03) (not significant for radiology residents and one board-certified radiologist [all p > 0.09]). CONCLUSION: While GPT-4-based ChatGPT demonstrated relatively higher diagnostic performance than GPT-4V-based ChatGPT, the diagnostic performance of GPT-4 and GPT-4V-based ChatGPTs did not reach the performance level of either radiology residents or board-certified radiologists in challenging neuroradiology cases.

12.

Comparison of clinical and radiological characteristics of inflammatory and non-inflammatory Rathke cleft cysts.

Matsushita, Shu; Shimono, Taro; Maeda, Hiroyuki; Tsukamoto, Taro; Horiuchi, Daisuke; Oura, Tatsushi; Ishibashi, Kenichi; Takita, Hirotaka; Tatekawa, Hiroyuki; Atsukawa, Natsuko; Goto, Takeo; Miki, Yukio.

Jpn J Radiol ; 2024 Aug 20.

Article in English | MEDLINE | ID: mdl-39162782

ABSTRACT

PURPOSE: Rathke cleft cysts are commonly encountered sellar lesions, and their inflammation induces symptoms and recurrence. Cyst wall enhancement is related to inflammation; however, its range and frequency have not yet been investigated. This study aimed to investigate the clinical and radiological differences between inflammatory and non-inflammatory Rathke cleft cysts. METHODS: Forty-one patients who underwent cyst decompression surgery for Rathke cleft cysts between January 2008 and July 2022 were retrospectively analyzed. Based on the pathological reports, patients were divided into inflammatory and non-inflammatory groups. Clinical assessments, endocrinological evaluations, cyst content analysis, and imaging metrics (mean computed tomographic value, maximum diameter, mean apparent diffusion coefficient [ADC] value, and qualitative features) were analyzed. Receiver operating characteristic curve analysis was performed to determine the ADC cutoff value for differentiating the inflammatory group from the non-inflammatory group. RESULTS: In total, 21 and 20 cases were categorized into the inflammatory and non-inflammatory groups, respectively. The inflammatory group displayed a higher incidence of central diabetes insipidus (arginine vasopressin deficiency) (p = 0.04), turbid cyst content (p = 0.03), significantly lower mean ADC values (p = 0.04), and more extensive circumferential wall enhancement on magnetic resonance imaging (MRI) (p < 0.001). In the inflammatory group, all cases revealed circumferential wall enhancement, with some exhibiting thick wall enhancement. There were no significant differences in other radiological features. The ADC cutoff value for differentiating the two groups was 1.57 × 10⁻³ mm²/s, showing a sensitivity of 81.3% and specificity of 66.7%. CONCLUSION: Inflammatory Rathke cleft cysts tended to show a higher incidence of central diabetes insipidus and turbid cyst content. 
Radiologically, they exhibited lower mean ADC values and greater circumferential wall enhancement on MRI.

13.

Data set terminology of deep learning in medicine: a historical review and recommendation.

Walston, Shannon L; Seki, Hiroshi; Takita, Hirotaka; Mitsuyama, Yasuhito; Sato, Shingo; Hagiwara, Akifumi; Ito, Rintaro; Hanaoka, Shouhei; Miki, Yukio; Ueda, Daiju.

Jpn J Radiol ; 2024 Jun 10.

Article in English | MEDLINE | ID: mdl-38856878

ABSTRACT

Medicine and deep learning-based artificial intelligence (AI) engineering represent two distinct fields each with decades of published history. The current rapid convergence of deep learning and medicine has led to significant advancements, yet it has also introduced ambiguity regarding data set terms common to both fields, potentially leading to miscommunication and methodological discrepancies. This narrative review aims to give historical context for these terms, accentuate the importance of clarity when these terms are used in medical deep learning contexts, and offer solutions to mitigate misunderstandings by readers from either field. Through an examination of historical documents, including articles, writing guidelines, and textbooks, this review traces the divergent evolution of terms for data sets and their impact. Initially, the discordant interpretations of the word 'validation' in medical and AI contexts are explored. We then show that in the medical field as well, terms traditionally used in the deep learning domain are becoming more common, with the data for creating models referred to as the 'training set', the data for tuning of parameters referred to as the 'validation (or tuning) set', and the data for the evaluation of models as the 'test set'. Additionally, the test sets used for model evaluation are classified into internal (random splitting, cross-validation, and leave-one-out) sets and external (temporal and geographic) sets. This review then identifies often misunderstood terms and proposes pragmatic solutions to mitigate terminological confusion in the field of deep learning in medicine. We support the accurate and standardized description of these data sets and the explicit definition of data set splitting terminologies in each publication. These are crucial methods for demonstrating the robustness and generalizability of deep learning applications in medicine. 
This review aspires to enhance the precision of communication, thereby fostering more effective and transparent research methodologies in this interdisciplinary field.
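
The training, validation (tuning), and internal test set terminology described above can be made concrete with a short patient-level random split. This is a minimal sketch under stated assumptions, not a procedure from the review: the function name, the split fractions, the seed, and the patient identifiers are all hypothetical. Splitting by patient rather than by image is a common precaution so that data from one patient never appear in more than one set.

```python
import random


def split_by_patient(patient_ids, test_fraction=0.1, validation_fraction=0.2, seed=0):
    """Randomly assign unique patient IDs to training, validation, and test sets."""
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(unique_ids)
    n_test = max(1, round(len(unique_ids) * test_fraction))
    n_validation = max(1, round(len(unique_ids) * validation_fraction))
    test = set(unique_ids[:n_test])  # held out for final evaluation only
    validation = set(unique_ids[n_test:n_test + n_validation])  # used for tuning
    training = set(unique_ids[n_test + n_validation:])  # used to fit the model
    return training, validation, test


# Hypothetical cohort of 20 patients.
cohort = [f"P{index:02d}" for index in range(20)]
training_ids, validation_ids, test_ids = split_by_patient(cohort)
print(len(training_ids), len(validation_ids), len(test_ids))
```

A test set produced this way is an internal (random-splitting) test set in the review's classification; an external test set would instead come from a different time period or institution.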

14.

Deep learning-based diffusion tensor image generation model: a proof-of-concept study.

Tatekawa, Hiroyuki; Ueda, Daiju; Takita, Hirotaka; Matsumoto, Toshimasa; Walston, Shannon L; Mitsuyama, Yasuhito; Horiuchi, Daisuke; Matsushita, Shu; Oura, Tatsushi; Tomita, Yuichiro; Tsukamoto, Taro; Shimono, Taro; Miki, Yukio.

Sci Rep ; 14(1): 2911, 2024 02 05.

Article in English | MEDLINE | ID: mdl-38316892

ABSTRACT

This study created an image-to-image translation model that synthesizes diffusion tensor images (DTI) from conventional diffusion weighted images, and validated the similarities between the original and synthetic DTI. Thirty-two healthy volunteers were prospectively recruited. DTI and DWI were obtained with six and three directions of the motion probing gradient (MPG), respectively. The identical imaging plane was paired for the image-to-image translation model that synthesized one direction of the MPG from DWI. This process was repeated six times in the respective MPG directions. Regions of interest (ROIs) in the lentiform nucleus, thalamus, posterior limb of the internal capsule, posterior thalamic radiation, and splenium of the corpus callosum were created and applied to maps derived from the original and synthetic DTI. The mean values and signal-to-noise ratio (SNR) of the original and synthetic maps for each ROI were compared. The Bland-Altman plot between the original and synthetic data was evaluated. Although the test dataset showed a larger standard deviation of all values and lower SNR in the synthetic data than in the original data, the Bland-Altman plots showed each plot localizing in a similar distribution. Synthetic DTI could be generated from conventional DWI with an image-to-image translation model.
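
The Bland-Altman comparison mentioned above summarizes agreement between paired measurements by the mean difference (bias) and the 95% limits of agreement, conventionally bias ± 1.96 × the standard deviation of the differences. The following is a minimal sketch of that calculation; the function name and the paired values are hypothetical placeholders, not data from the study.

```python
from statistics import mean, stdev


def bland_altman(original, synthetic):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(original) != len(synthetic) or len(original) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [s - o for o, s in zip(original, synthetic)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Hypothetical paired ROI values (for example, a diffusion metric per region).
original_values = [0.70, 0.65, 0.72, 0.68, 0.74]
synthetic_values = [0.69, 0.67, 0.70, 0.69, 0.72]
bias, lower, upper = bland_altman(original_values, synthetic_values)
print(f"bias = {bias:.3f}, limits of agreement = [{lower:.3f}, {upper:.3f}]")
```

In a Bland-Altman plot, each difference is drawn against the mean of its pair, with horizontal lines at the bias and the two limits.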

Subject(s)

Deep Learning , White Matter , Humans , Corpus Callosum/diagnostic imaging , Signal-To-Noise Ratio , Internal Capsule , Diffusion Magnetic Resonance Imaging/methods

15.

Diagnostic accuracy of vision-language models on Japanese diagnostic radiology, nuclear medicine, and interventional radiology specialty board examinations.

Oura, Tatsushi; Tatekawa, Hiroyuki; Horiuchi, Daisuke; Matsushita, Shu; Takita, Hirotaka; Atsukawa, Natsuko; Mitsuyama, Yasuhito; Yoshida, Atsushi; Murai, Kazuki; Tanaka, Rikako; Shimono, Taro; Yamamoto, Akira; Miki, Yukio; Ueda, Daiju.

Jpn J Radiol ; 2024 Jul 20.

Article in English | MEDLINE | ID: mdl-39031270

ABSTRACT

PURPOSE: The performance of vision-language models (VLMs) with image interpretation capabilities, such as GPT-4 omni (GPT-4o), GPT-4 vision (GPT-4V), and Claude-3, has not been compared and remains unexplored in specialized radiological fields, including nuclear medicine and interventional radiology. This study aimed to evaluate and compare the diagnostic accuracy of various VLMs, including GPT-4 + GPT-4V, GPT-4o, Claude-3 Sonnet, and Claude-3 Opus, using Japanese diagnostic radiology, nuclear medicine, and interventional radiology (JDR, JNM, and JIR, respectively) board certification tests. MATERIALS AND METHODS: In total, 383 questions from the JDR test (358 images), 300 from the JNM test (92 images), and 322 from the JIR test (96 images) from 2019 to 2023 were consecutively collected. The accuracy rates of the GPT-4 + GPT-4V, GPT-4o, Claude-3 Sonnet, and Claude-3 Opus were calculated for all questions or questions with images. The accuracy rates of the VLMs were compared using McNemar's test. RESULTS: GPT-4o demonstrated the highest accuracy rates across all evaluations with the JDR (all questions, 49%; questions with images, 48%), JNM (all questions, 64%; questions with images, 59%), and JIR tests (all questions, 43%; questions with images, 34%), followed by Claude-3 Opus with the JDR (all questions, 40%; questions with images, 38%), JNM (all questions, 42%; questions with images, 43%), and JIR tests (all questions, 40%; questions with images, 30%). For all questions, McNemar's test showed that GPT-4o significantly outperformed the other VLMs (all P < 0.007), except for Claude-3 Opus in the JIR test. For questions with images, GPT-4o outperformed the other VLMs in the JDR and JNM tests (all P < 0.001), except Claude-3 Opus in the JNM test. CONCLUSION: The GPT-4o had the highest success rates for questions with images and all questions from the JDR, JNM, and JIR board certification tests.
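McNemar's test, named in the methods above, compares two models on the same questions using only the discordant pairs (questions one model answered correctly and the other did not). The sketch below uses the exact binomial form of the test; the abstract does not state which variant the authors used, so that choice, the function name `mcnemar_exact`, and the example counts are assumptions for illustration only.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: questions model A got right and model B got wrong
    c: questions model A got wrong and model B got right
    Under the null hypothesis, each discordant pair is equally likely
    to fall in either cell, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, not data from the study.
p_value = mcnemar_exact(0, 10)
```

Questions that both models answered correctly, or both answered incorrectly, do not enter the calculation.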

16.

A deep learning-based model to estimate pulmonary function from chest x-rays: multi-institutional model development and validation study in Japan.

Ueda, Daiju; Matsumoto, Toshimasa; Yamamoto, Akira; Walston, Shannon L; Mitsuyama, Yasuhito; Takita, Hirotaka; Asai, Kazuhisa; Watanabe, Tetsuya; Abo, Koji; Kimura, Tatsuo; Fukumoto, Shinya; Watanabe, Toshio; Takeshita, Tohru; Miki, Yukio.

Lancet Digit Health ; 6(8): e580-e588, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38981834

ABSTRACT

BACKGROUND: Chest x-ray is a basic, cost-effective, and widely available imaging method that is used for static assessments of organic diseases and anatomical abnormalities, but its ability to estimate dynamic measurements such as pulmonary function is unknown. We aimed to estimate two major pulmonary functions from chest x-rays. METHODS: In this retrospective model development and validation study, we trained, validated, and externally tested a deep learning-based artificial intelligence (AI) model to estimate forced vital capacity (FVC) and forced expiratory volume in 1 s (FEV1) from chest x-rays. We included consecutively collected results of spirometry and any associated chest x-rays that had been obtained between July 1, 2003, and Dec 31, 2021, from five institutions in Japan (labelled institutions A-E). Eligible x-rays had been acquired within 14 days of spirometry and were labelled with the FVC and FEV1. X-rays from three institutions (A-C) were used for training, validation, and internal testing, with the testing dataset being independent of the training and validation datasets, and then x-rays from the two other institutions (D and E) were used for independent external testing. Performance for estimating FVC and FEV1 was evaluated by calculating the Pearson's correlation coefficient (r), intraclass correlation coefficient (ICC), mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) compared with the results of spirometry. FINDINGS: We included 141 734 x-ray and spirometry pairs from 81 902 patients from the five institutions. 
The training, validation, and internal test datasets included 134 307 x-rays from 75 768 patients (37 718 [50%] female, 38 050 [50%] male; mean age 56 years [SD 18]), and the external test datasets included 2137 x-rays from 1861 patients (742 [40%] female, 1119 [60%] male; mean age 65 years [SD 17]) from institution D and 5290 x-rays from 4273 patients (1972 [46%] female, 2301 [54%] male; mean age 63 years [SD 17]) from institution E. External testing for FVC yielded r values of 0·91 (99% CI 0·90-0·92) for institution D and 0·90 (0·89-0·91) for institution E, ICC of 0·91 (99% CI 0·90-0·92) and 0·89 (0·88-0·90), MSE of 0·17 L² (99% CI 0·15-0·19) and 0·17 L² (0·16-0·19), RMSE of 0·41 L (99% CI 0·39-0·43) and 0·41 L (0·39-0·43), and MAE of 0·31 L (99% CI 0·29-0·32) and 0·31 L (0·30-0·32). External testing for FEV1 yielded r values of 0·91 (99% CI 0·90-0·92) for institution D and 0·91 (0·90-0·91) for institution E, ICC of 0·90 (99% CI 0·89-0·91) and 0·90 (0·90-0·91), MSE of 0·13 L² (99% CI 0·12-0·15) and 0·11 L² (0·10-0·12), RMSE of 0·37 L (99% CI 0·35-0·38) and 0·33 L (0·32-0·35), and MAE of 0·28 L (99% CI 0·27-0·29) and 0·25 L (0·25-0·26). INTERPRETATION: This deep learning model allowed estimation of FVC and FEV1 from chest x-rays, showing high agreement with spirometry. The model offers an alternative to spirometry for assessing pulmonary function, which is especially useful for patients who are unable to undergo spirometry, and might enhance the customisation of CT imaging protocols based on insights gained from chest x-rays, improving the diagnosis and management of lung diseases. Future studies should investigate the performance of this AI model in combination with clinical information to enable more appropriate and targeted use. FUNDING: None.
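The agreement metrics reported above (Pearson's r, MSE, RMSE, and MAE) can be computed from paired predicted and measured values as in the following minimal sketch. The function name `regression_metrics` and the example values are hypothetical and are not data from the study; the ICC is omitted because the abstract does not state which ICC form was used.

```python
import math


def regression_metrics(predicted, measured):
    """Return Pearson's r, MSE, RMSE, and MAE for paired values (e.g., in litres)."""
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    mse = sum(e * e for e in errors) / n   # mean square error, in squared units
    rmse = math.sqrt(mse)                  # root mean square error, in original units
    mae = sum(abs(e) for e in errors) / n  # mean absolute error
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    r = cov / math.sqrt(var_p * var_m)     # Pearson's correlation coefficient
    return {"r": r, "mse": mse, "rmse": rmse, "mae": mae}


# Hypothetical values for illustration only.
example = regression_metrics([2.0, 3.0, 4.0], [2.5, 3.0, 3.5])
```

Because RMSE is the square root of MSE, the reported pairs are mutually consistent: an MSE of 0.17 corresponds to an RMSE of about 0.41.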

Subject(s)

Deep Learning , Humans , Japan , Male , Female , Retrospective Studies , Middle Aged , Aged , Vital Capacity , Lung/diagnostic imaging , Lung/physiology , Forced Expiratory Volume , Radiography, Thoracic , Spirometry/methods , Adult , Respiratory Function Tests/methods

17.

Improved reproducibility of diffusion tensor image analysis along the perivascular space (DTI-ALPS) index: an analysis of reorientation technique of the OASIS-3 dataset.

Tatekawa, Hiroyuki; Matsushita, Shu; Ueda, Daiju; Takita, Hirotaka; Horiuchi, Daisuke; Atsukawa, Natsuko; Morishita, Yuka; Tsukamoto, Taro; Shimono, Taro; Miki, Yukio.

Jpn J Radiol ; 41(4): 393-400, 2023 Apr.

Article in English | MEDLINE | ID: mdl-36472803

ABSTRACT

PURPOSE: Diffusion tensor image analysis along the perivascular space (DTI-ALPS) index is intended to reflect the glymphatic function of the brain; however, head rotation may reduce reproducibility and reliability. This study aimed to evaluate whether reorientation of DTI data improves the reproducibility of the ALPS index using the OASIS-3 dataset. MATERIALS AND METHODS: 234 cognitively normal subjects from the OASIS-3 dataset were included. Original and reoriented ALPS indices were calculated using a technique that registered vector information of DTI to another space and created reoriented diffusivity maps. The F test was used to compare variances of the original and reoriented ALPS indices. Subsequently, subjects with head rotation around the z- (inferior-superior; n = 43) or x axis (right-left; n = 25) and matched subjects with neutral head position were selected for evaluation of intra- and inter-rater reliability. Intraclass correlation coefficients (ICCs) of the original and reoriented ALPS indices for participants with head rotation and neutral head position were calculated separately. The Bland-Altman plot comparing the original and reoriented ALPS indices was also evaluated. RESULTS: The reoriented ALPS index exhibited a significantly smaller variance than the original ALPS index (p < 0.001). For intra- and inter-reliability, the reorientation technique showed good-to-excellent reproducibility in calculating the ALPS index even in subjects with head rotation (ICCs of original ALPS index: 0.52-0.81; ICCs of reoriented ALPS index: > 0.85). A wider range of the 95% limit of agreement of the Bland-Altman plot for subjects with x axis rotation was identified, indicating that x axis rotation may remarkably affect calculation of the ALPS index. CONCLUSION: The technique used in this study enabled the creation of reoriented diffusivity maps and improved reproducibility in calculating the ALPS index.

Subject(s)

Diffusion Tensor Imaging , Image Processing, Computer-Assisted , Humans , Diffusion Tensor Imaging/methods , Reproducibility of Results , Diffusion Magnetic Resonance Imaging/methods , Brain/diagnostic imaging

18.

Correlation between Phase-difference-enhanced MR Imaging and Amyloid Positron Emission Tomography: A Study on Alzheimer's Disease Patients and Normal Controls.

Takita, Hirotaka; Doishita, Satoshi; Yoneda, Tetsuya; Tatekawa, Hiroyuki; Abe, Takato; Itoh, Yoshiaki; Horiuchi, Daisuke; Tsukamoto, Taro; Shimono, Taro; Miki, Yukio.

Magn Reson Med Sci ; 22(1): 67-78, 2023 Jan 01.

Article in English | MEDLINE | ID: mdl-35082221

ABSTRACT

PURPOSE: While amyloid-β deposition in the cerebral cortex for Alzheimer's disease (AD) is often evaluated by amyloid positron emission tomography (PET), amyloid-β-related iron can be detected using phase difference enhanced (PADRE) imaging; however, no study has validated the association between PADRE imaging and amyloid PET. This study investigated whether the degree of hypointense areas on PADRE imaging correlated with the uptake of amyloid PET. METHODS: PADRE imaging and amyloid PET were performed in 8 patients with AD and 10 age-matched normal controls. ROIs in the cuneus, precuneus, superior frontal gyrus (SFG), and superior temporal gyrus (STG) were automatically segmented. The degree of hypointense areas on PADRE imaging in each ROI was evaluated using 4-point scaling of visual assessment or volumetric semiquantitative assessment (the percentage of hypointense volume within each ROI). The mean standardized uptake value ratio (SUVR) of amyloid PET in each ROI was also calculated. The Spearman's correlation coefficient between the 4-point scale of PADRE imaging and SUVR of amyloid PET or between the semiquantitative hypointense volume percentage and SUVR in each ROI was evaluated. RESULTS: In the precuneus, a significant positive correlation was identified between the 4-point scale of PADRE imaging and SUVR of amyloid PET (Rs = 0.5; P = 0.034) in all subjects. In the cuneus, a significant positive correlation was identified between the semiquantitative volume percentage of PADRE imaging and SUVR of amyloid PET (Rs = 0.55; P = 0.02) in all subjects. CONCLUSION: Amyloid-β-enhancing PADRE imaging can be used to predict the SUVR of amyloid PET, especially in the cuneus and precuneus, and may have the potential to be used for diagnosing AD by detecting amyloid deposition.

Subject(s)

Alzheimer Disease , Humans , Alzheimer Disease/diagnostic imaging , Alzheimer Disease/pathology , Positron-Emission Tomography/methods , Amyloid beta-Peptides/metabolism , Magnetic Resonance Imaging/methods , Cerebral Cortex

19.

Orbital apex schwannoma with a high titer of proteinase 3 antineutrophil cytoplasmic antibody.

Oura, Tatsushi; Shimono, Taro; Pas, Maciej; Takita, Hirotaka; Horiuchi, Daisuke; Mitsuyama, Yasuhito; Miki, Yukio.

Radiol Case Rep ; 17(4): 1120-1123, 2022 Apr.

Article in English | MEDLINE | ID: mdl-35169412

ABSTRACT

Here, we present a very unusual case of orbital apex schwannoma with a high titer of proteinase 3 antineutrophil cytoplasmic antibody (PR3-ANCA). A 67-year-old man presented with a 3-month history of double vision. Radiological examinations revealed a mass lesion at the left orbital apex, and laboratory examination revealed a high PR3-ANCA titer of 49.1 U/mL (reference range, <2.0). After the surgery, the lesion was histologically diagnosed as schwannoma, and the PR3-ANCA titer decreased to 8.4 U/mL. Although making a correct diagnosis of orbital apex schwannoma may be difficult due to the need to differentiate from granulomatosis with polyangiitis when PR3-ANCA serum levels are elevated, careful examination of the radiological findings may aid the diagnosis.

20.

Malignant transformation of a dysembryoplastic neuroepithelial tumor presenting with intraventricular hemorrhage.

Takita, Hirotaka; Shimono, Taro; Uda, Takehiro; Ikota, Hayato; Kawashima, Toshiyuki; Horiuchi, Daisuke; Terayama, Eisaku; Tsukamoto, Taro; Miki, Yukio.

Radiol Case Rep ; 17(3): 939-943, 2022 Mar.

Article in English | MEDLINE | ID: mdl-35140831

ABSTRACT

Dysembryoplastic neuroepithelial tumors (DNTs) are benign brain tumors classified as grade 1 in the 2021 World Health Organization (WHO) classification of central nervous system tumors. DNTs rarely undergo malignant transformation or cause symptomatic intracranial hemorrhage. We report a case of malignant transformation of DNT presenting with intraventricular hemorrhage and review the literature on malignant transformation of DNTs. An 18-year-old woman with a history of epilepsy presented with a sudden headache and vomiting. Radiological examination revealed a mass lesion in the left parietal lobe and intraventricular hemorrhage. The patient underwent an emergency craniotomy for brain tumor resection. The lesion was pathologically diagnosed as a malignant transformation of DNT. She has been followed up without tumor recurrence for 2 years after surgery.
