Results 1 - 20 of 57
1.
Jpn J Radiol ; 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38856878

ABSTRACT

Medicine and deep learning-based artificial intelligence (AI) engineering represent two distinct fields each with decades of published history. The current rapid convergence of deep learning and medicine has led to significant advancements, yet it has also introduced ambiguity regarding data set terms common to both fields, potentially leading to miscommunication and methodological discrepancies. This narrative review aims to give historical context for these terms, accentuate the importance of clarity when these terms are used in medical deep learning contexts, and offer solutions to mitigate misunderstandings by readers from either field. Through an examination of historical documents, including articles, writing guidelines, and textbooks, this review traces the divergent evolution of terms for data sets and their impact. Initially, the discordant interpretations of the word 'validation' in medical and AI contexts are explored. We then show that in the medical field as well, terms traditionally used in the deep learning domain are becoming more common, with the data for creating models referred to as the 'training set', the data for tuning of parameters referred to as the 'validation (or tuning) set', and the data for the evaluation of models as the 'test set'. Additionally, the test sets used for model evaluation are classified into internal (random splitting, cross-validation, and leave-one-out) sets and external (temporal and geographic) sets. This review then identifies often misunderstood terms and proposes pragmatic solutions to mitigate terminological confusion in the field of deep learning in medicine. We support the accurate and standardized description of these data sets and the explicit definition of data set splitting terminologies in each publication. These are crucial methods for demonstrating the robustness and generalizability of deep learning applications in medicine. 
This review aspires to enhance the precision of communication, thereby fostering more effective and transparent research methodologies in this interdisciplinary field.
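
The three-way split named in this abstract (training, validation or tuning, and test sets) can be sketched as below. This is only an illustration of an internal random split; the function name `split_dataset`, the `fractions` parameter, and the 70/15/15 proportions are assumptions made for the example and do not come from the review.

```python
import random

def split_dataset(cases, fractions=(0.7, 0.15, 0.15), seed=0):
    """Randomly split cases into training, validation (tuning), and test sets.

    The fractions are hypothetical defaults, not values taken from the review.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * fractions[0])
    n_val = int(len(shuffled) * fractions[1])
    return {
        "training": shuffled[:n_train],                   # used to create the model
        "validation": shuffled[n_train:n_train + n_val],  # used to tune parameters
        "test": shuffled[n_train + n_val:],               # used only for evaluation
    }

splits = split_dataset(range(100))
print({name: len(items) for name, items in splits.items()})
```

An external test set (temporal or geographic) would instead come from a separate time period or institution rather than from this random split.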

2.
Article in English | MEDLINE | ID: mdl-38719605

ABSTRACT

BACKGROUND AND PURPOSE: The rise of large language models such as generative pre-trained transformers (GPTs) has sparked significant interest in radiology, especially in interpreting radiological reports and image findings. While existing research has focused on GPTs estimating diagnoses from radiological descriptions, exploring alternative diagnostic information sources is also crucial. This study introduces the use of GPTs (GPT-3.5 Turbo and GPT-4) for information retrieval and summarization, searching relevant case reports via PubMed, and investigates their potential to aid diagnosis. MATERIALS AND METHODS: From October 2021 to December 2023, we selected 115 cases from the "Case of the Week" series on the American Journal of Neuroradiology website. Their Description and Legend sections were presented to the GPTs for the two tasks. For the Direct Diagnosis task, the models provided three differential diagnoses that were considered correct if they matched the diagnosis in the diagnosis section. For the Case Report Search task, the models generated two keywords per case, creating PubMed search queries to extract up to three relevant reports. A response was considered correct if reports containing the disease name stated in the diagnosis section were extracted. McNemar's test was employed to evaluate whether adding a Case Report Search to Direct Diagnosis improved overall accuracy. RESULTS: In the Direct Diagnosis task, GPT-3.5 Turbo achieved a correct response rate of 26% (30/115 cases), whereas GPT-4 achieved 41% (47/115). For the Case Report Search task, GPT-3.5 Turbo scored 10% (11/115), and GPT-4 scored 7% (8/115). Correct responses totaled 32% (37/115) with three overlapping cases for GPT-3.5 Turbo, whereas GPT-4 had 43% (50/115) of correct responses with five overlapping cases. Adding Case Report Search improved GPT-3.5 Turbo's performance (p = 0.023) but not that of GPT-4 (p = 0.248). 
CONCLUSIONS: The effectiveness of adding Case Report Search to GPT-3.5 Turbo was particularly pronounced, suggesting its potential as an alternative diagnostic approach to GPTs, particularly in scenarios where direct diagnoses from GPTs are not obtainable. Nevertheless, the overall performance of GPT models in both direct diagnosis and case report retrieval tasks remains suboptimal, and users should be aware of their limitations. ABBREVIATIONS: AI = artificial intelligence, GPT = generative pretrained transformer, LLM = large language model.
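
The paired comparison in this abstract can be illustrated with a small sketch of McNemar's test. The abstract does not say which variant of the test was used; the code below assumes the chi-squared form with continuity correction. The discordant counts are also an inference rather than reported data: if adding Case Report Search can only turn an incorrect case into a correct one, the totals imply 7 gained and 0 lost cases for GPT-3.5 Turbo (30 to 37) and 3 gained and 0 lost for GPT-4 (47 to 50). Under those assumptions the computed values round to 0.023 and 0.248, in line with the reported p-values.

```python
import math

def mcnemar_corrected(b, c):
    """McNemar's chi-squared test with continuity correction.

    b and c are the two discordant-pair counts. Returns (statistic, p_value).
    """
    if b + c == 0:
        return 0.0, 1.0
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Assumed discordant counts inferred from the reported totals (not stated in the abstract).
_, p_gpt35 = mcnemar_corrected(7, 0)  # GPT-3.5 Turbo: 30 -> 37 correct
_, p_gpt4 = mcnemar_corrected(3, 0)   # GPT-4: 47 -> 50 correct
print(round(p_gpt35, 3), round(p_gpt4, 3))
```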

3.
Jpn J Radiol ; 2024 May 11.
Article in English | MEDLINE | ID: mdl-38733472

ABSTRACT

PURPOSE: To assess the performance of GPT-4 Turbo with Vision (GPT-4TV), OpenAI's latest multimodal large language model, by comparing its ability to process both text and image inputs with that of the text-only GPT-4 Turbo (GPT-4T) in the context of the Japan Diagnostic Radiology Board Examination (JDRBE). MATERIALS AND METHODS: The dataset comprised questions from JDRBE 2021 and 2023. A total of six board-certified diagnostic radiologists discussed the questions and provided ground-truth answers by consulting relevant literature as necessary. The following questions were excluded: those lacking associated images, those with no unanimous agreement on answers, and those including images rejected by the OpenAI application programming interface. The inputs for GPT-4TV included both text and images, whereas those for GPT-4T were entirely text. Both models were deployed on the dataset, and their performance was compared using McNemar's exact test. The radiological credibility of the responses was assessed by two diagnostic radiologists through the assignment of legitimacy scores on a five-point Likert scale. These scores were subsequently used to compare model performance using Wilcoxon's signed-rank test. RESULTS: The dataset comprised 139 questions. GPT-4TV correctly answered 62 questions (45%), whereas GPT-4T correctly answered 57 questions (41%). A statistical analysis found no significant performance difference between the two models (P = 0.44). The GPT-4TV responses received significantly lower legitimacy scores from both radiologists than the GPT-4T responses. CONCLUSION: No significant enhancement in accuracy was observed when using GPT-4TV with image input compared with that of using text-only GPT-4T for JDRBE questions.
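
McNemar's exact test, as named in this abstract, reduces to a two-sided binomial test on the discordant pairs (questions answered correctly by one model but not the other). A minimal sketch follows. The abstract reports only each model's total correct answers, not the discordant counts, so the counts in the usage line are hypothetical and the resulting p-value is not meant to reproduce the reported one.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar test.

    b: pairs where only the first model is correct.
    c: pairs where only the second model is correct.
    Under the null hypothesis, min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical discordant counts, for illustration only.
p_value = mcnemar_exact(b=12, c=7)
print(p_value)
```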

4.
Article in English | MEDLINE | ID: mdl-38625446

ABSTRACT

PURPOSE: The quality and bias of annotations by annotators (e.g., radiologists) affect the performance changes in computer-aided detection (CAD) software using machine learning. We hypothesized that the difference in the years of experience in image interpretation among radiologists contributes to annotation variability. In this study, we focused on how the performance of CAD software changes with retraining by incorporating cases annotated by radiologists with varying experience. METHODS: We used two types of CAD software for lung nodule detection in chest computed tomography images and cerebral aneurysm detection in magnetic resonance angiography images. Twelve radiologists with different years of experience independently annotated the lesions, and the performance changes were investigated by repeating the retraining of the CAD software twice, with the addition of cases annotated by each radiologist. Additionally, we investigated the effects of retraining using integrated annotations from multiple radiologists. RESULTS: The performance of the CAD software after retraining differed among annotating radiologists. In some cases, the performance was degraded compared to that of the initial software. Retraining using integrated annotations showed different performance trends depending on the target CAD software, notably in cerebral aneurysm detection, where the performance decreased compared to using annotations from a single radiologist. CONCLUSIONS: Although the performance of the CAD software after retraining varied among the annotating radiologists, no direct correlation with their experience was found. The performance trends differed according to the type of CAD software used when integrated annotations from multiple radiologists were used.

5.
Insights Imaging ; 15(1): 102, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38578554

ABSTRACT

OBJECTIVES: To investigate the relationship between low kidney volume and subsequent estimated glomerular filtration rate (eGFR) decline in the eGFR category G2 (60-89 mL/min/1.73 m²) population. METHODS: In this retrospective study, we evaluated 5531 individuals with eGFR category G2 who underwent medical checkups at our institution between November 2006 and October 2017. Exclusion criteria were absence of a follow-up visit, missing data, prior renal surgery, current renal disease under treatment, large renal masses, and horseshoe kidney. We developed a 3D U-net-based automated system for renal volumetry on CT images. Participants were grouped using sex-specific kidney volume cutoffs set at the mean minus one standard deviation. After 1:1 propensity score matching, we obtained 397 pairs of individuals in the low kidney volume (LKV) and control groups. The primary endpoint was progression of eGFR categories within 5 years, assessed using Cox regression analysis. RESULTS: This study included 3220 individuals (mean age, 60.0 ± 9.7 years; men, n = 2209). The kidney volume was 404.6 ± 67.1 and 376.8 ± 68.0 cm³ in men and women, respectively. The LKV cutoff was 337.5 and 308.8 cm³ for men and women, respectively. LKV was a significant risk factor for the endpoint with an adjusted hazard ratio of 1.64 (95% confidence interval: 1.09-2.45; p = 0.02). CONCLUSION: Low kidney volume may adversely affect subsequent eGFR maintenance; hence, the use of imaging metrics may help predict eGFR decline. CRITICAL RELEVANCE STATEMENT: Low kidney volume is a significant predictor of reduced kidney function over time; thus, kidney volume measurements could aid in early identification of individuals at risk for declining kidney health. KEY POINTS: • This study explores how kidney volume affects subsequent kidney function maintenance. • Low kidney volume was associated with estimated glomerular filtration rate decreases. 
• Low kidney volume is a prognostic indicator of estimated glomerular filtration rate decline.
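
The sex-specific cutoffs reported in this abstract follow directly from the stated rule of mean minus one standard deviation, which the short sketch below reproduces. The function names and the choice of a strict "less than" comparison for group assignment are assumptions made for illustration.

```python
def lkv_cutoff(mean_volume_cm3, sd_volume_cm3):
    """Cutoff defined as the mean minus one standard deviation."""
    return mean_volume_cm3 - sd_volume_cm3

# Means and standard deviations as reported in the abstract.
cutoffs = {
    "men": lkv_cutoff(404.6, 67.1),
    "women": lkv_cutoff(376.8, 68.0),
}

def is_low_kidney_volume(volume_cm3, sex):
    # Assumption: a volume strictly below the cutoff counts as low.
    return volume_cm3 < cutoffs[sex]

print({sex: round(value, 1) for sex, value in cutoffs.items()})
```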

6.
JMIR Med Educ ; 10: e54393, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38470459

ABSTRACT

BACKGROUND: Previous research applying large language models (LLMs) to medicine was focused on text-based information. Recently, multimodal variants of LLMs acquired the capability of recognizing images. OBJECTIVE: We aim to evaluate the image recognition capability of generative pretrained transformer (GPT)-4V, a recent multimodal LLM developed by OpenAI, in the medical field by testing how visual information affects its performance to answer questions in the 117th Japanese National Medical Licensing Examination. METHODS: We focused on 108 questions that had 1 or more images as part of a question and presented GPT-4V with the same questions under two conditions: (1) with both the question text and associated images and (2) with the question text only. We then compared the difference in accuracy between the 2 conditions using the exact McNemar test. RESULTS: Among the 108 questions with images, GPT-4V's accuracy was 68% (73/108) when presented with images and 72% (78/108) when presented without images (P=.36). For the 2 question categories, clinical and general, the accuracies with and those without images were 71% (70/98) versus 78% (76/98; P=.21) and 30% (3/10) versus 20% (2/10; P≥.99), respectively. CONCLUSIONS: The additional information from the images did not significantly improve the performance of GPT-4V in the Japanese National Medical Licensing Examination.


Subject(s)
Licensure , Medicine , Japan , Language
7.
J Imaging Inform Med ; 37(3): 1217-1227, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38351224

ABSTRACT

To generate synthetic medical data incorporating image-tabular hybrid data by merging an image encoding/decoding model with a table-compatible generative model and assess their utility. We used 1342 cases from the Stony Brook University Covid-19-positive cases, comprising chest X-ray radiographs (CXRs) and tabular clinical data as a private dataset (pDS). We generated a synthetic dataset (sDS) through the following steps: (I) dimensionally reducing CXRs in the pDS using a pretrained encoder of the auto-encoding generative adversarial networks (αGAN) and integrating them with the correspondent tabular clinical data; (II) training the conditional tabular GAN (CTGAN) on this combined data to generate synthetic records, encompassing encoded image features and clinical data; and (III) reconstructing synthetic images from these encoded image features in the sDS using a pretrained decoder of the αGAN. The utility of sDS was assessed by the performance of the prediction models for patient outcomes (deceased or discharged). For the pDS test set, the area under the receiver operating characteristic (AUC) curve was calculated to compare the performance of prediction models trained separately with pDS, sDS, or a combination of both. We created an sDS comprising CXRs with a resolution of 256 × 256 pixels and tabular data containing 13 variables. The AUC for the outcome was 0.83 when the model was trained with the pDS, 0.74 with the sDS, and 0.87 when combining pDS and sDS for training. Our method is effective for generating synthetic records consisting of both images and tabular clinical data.
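
The AUC used here to compare prediction models can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way. The function name and the scores are hypothetical and are not data from the study.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve from pairwise comparisons of scores."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical model outputs for illustration only.
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(auc)
```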


Subject(s)
COVID-19 , Radiography, Thoracic , SARS-CoV-2 , Humans , COVID-19/diagnostic imaging , Radiography, Thoracic/methods , Female , Male , Middle Aged , Aged , ROC Curve , Adult
8.
Int J Comput Assist Radiol Surg ; 19(3): 581-590, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38180621

ABSTRACT

PURPOSE: Standardized uptake values (SUVs) derived from 18F-fluoro-2-deoxy-D-glucose positron emission tomography/computed tomography are a crucial parameter for identifying tumors or abnormalities in an organ. Moreover, exploring ways to improve the identification of tumors or abnormalities using a statistical measurement tool is important in clinical research. Therefore, we developed a fully automatic method to create a personally normalized Z-score map of the liver SUV. METHODS: The normalized Z-score map for each patient was created using the SUV mean and standard deviation estimated from blood-test-derived variables, such as alanine aminotransferase and aspartate aminotransferase, as well as other demographic information. This was performed using the least absolute shrinkage and selection operator (LASSO)-based estimation formula. We also used receiver operating characteristic (ROC) analysis to evaluate the results of people with and without hepatic tumors and compared them to the ROC curve of normal SUV. RESULTS: A total of 7757 people were selected for this study. Of these, 7744 were healthy, while 13 had abnormalities. The area under the ROC curve results indicated that the anomaly detection approach (0.91) outperformed the maximum SUV alone (0.89). To build the LASSO regression, sets of covariates, including sex, weight, body mass index, blood glucose level, triglyceride, total cholesterol, γ-glutamyl transpeptidase, total protein, creatinine, insulin, albumin, and cholinesterase, were used to determine the SUV mean, whereas weight was used to determine the SUV standard deviation. CONCLUSION: The Z-score normalizes the mean and standard deviation. It is effective in ROC curve analysis and increases the clarity of the abnormality. This normalization is a key technique for effective measurement of maximum glucose consumption by tumors in the liver.
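
The core of a Z-score map is the standard normalization z = (SUV - mean) / SD applied to every voxel, using a mean and standard deviation estimated for the individual patient. A minimal sketch follows. The LASSO-based estimation itself is not reproduced here; `predicted_mean` and `predicted_sd` are hypothetical stand-ins for its outputs, and the voxel values are invented for illustration.

```python
def z_score_map(suv_values, predicted_mean, predicted_sd):
    """Convert voxel SUVs to Z-scores using patient-specific mean and SD."""
    if predicted_sd <= 0:
        raise ValueError("predicted_sd must be positive")
    return [(value - predicted_mean) / predicted_sd for value in suv_values]

# Hypothetical liver voxel SUVs and patient-specific estimates.
z_map = z_score_map([2.0, 3.0, 4.0], predicted_mean=2.0, predicted_sd=0.5)
print(z_map)
```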


Subject(s)
Fluorodeoxyglucose F18 , Neoplasms , Humans , Radiopharmaceuticals , Positron-Emission Tomography/methods , Neoplasms/diagnostic imaging , Liver/diagnostic imaging
9.
Radiol Phys Technol ; 17(1): 103-111, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37917288

ABSTRACT

The purpose of the study was to develop a liver nodule diagnostic method that accurately localizes and classifies focal liver lesions and identifies the specific liver segments in which they reside by integrating a liver segment division algorithm using a four-dimensional (4D) fully convolutional residual network (FC-ResNet) with a localization and classification model. We retrospectively collected data and divided 106 gadolinium-ethoxybenzyl-diethylenetriamine pentaacetic acid-enhanced magnetic resonance examinations into Case-sets 1, 2, and 3. A liver segment division algorithm was developed using a 4D FC-ResNet and trained with semi-automatically created silver-standard annotations; performance was evaluated using manually created gold-standard annotations by calculating the Dice scores for each liver segment. The performance of the liver nodule diagnostic method was assessed by comparing the results with those of the original radiology reports. The mean Dice score between the output of the liver segment division model and the gold standard was 0.643 for Case-set 2 (normal liver contours) and 0.534 for Case-set 1 (deformed liver contours). Among the 64 lesions in Case-set 3, the diagnostic method localized 37 lesions, classified 33 lesions, and identified the liver segments for 30 lesions. A total of 28 lesions were true positives, matching the original radiology reports. The liver nodule diagnostic method, which integrates a liver segment division algorithm with a lesion localization and classification model, exhibits great potential for localizing and classifying focal liver lesions and identifying the liver segments in which they reside. Further improvements and validation using larger sample sizes will enhance its performance and clinical applicability.
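
The Dice score used to evaluate the liver segment division is twice the overlap between the predicted and reference regions divided by the sum of their sizes. The sketch below computes it on sets of voxel coordinates. The function name, the convention of returning 1.0 when both regions are empty, and the example voxels are assumptions for illustration.

```python
def dice_score(predicted, reference):
    """Dice score between two regions given as collections of voxel coordinates."""
    predicted, reference = set(predicted), set(reference)
    if not predicted and not reference:
        return 1.0  # assumed convention: two empty regions agree perfectly
    overlap = len(predicted & reference)
    return 2 * overlap / (len(predicted) + len(reference))

# Hypothetical voxel coordinates (z, y, x) for one liver segment.
model_output = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
gold_standard = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
score = dice_score(model_output, gold_standard)
print(score)
```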


Subject(s)
Contrast Media , Liver Neoplasms , Humans , Liver Neoplasms/diagnostic imaging , Liver Neoplasms/pathology , Retrospective Studies , Liver/diagnostic imaging , Gadolinium DTPA , Magnetic Resonance Imaging/methods
10.
Clin Nutr ; 43(1): 134-141, 2024 01.
Article in English | MEDLINE | ID: mdl-38041939

ABSTRACT

BACKGROUND & AIMS: While skeletal muscle index (SMI) is the most widely used indicator of low muscle mass (or sarcopenia) in oncology, optimal cut-offs (or definitions) to better predict survival are not standardized. METHODS: We compared five major definitions of SMI-based low muscle mass using an Asian patient cohort with gastrointestinal or genitourinary cancers. We analyzed 2015 patients with surgically-treated gastrointestinal (n = 1382) or genitourinary (n = 633) cancer with pre-surgical computed tomography images. We assessed the associations of clinical parameters, including low muscle mass by each definition, with cancer-specific survival (CSS) and overall survival (OS). RESULTS: During a median follow-up period of 61 months, 303 (15%) died of cancer, and 147 died of other causes. An Asian-based definition diagnosed 17.8% of patients as having low muscle mass, while the other Caucasian-based ones classified most (>70%) patients as such. All definitions significantly discriminated both CSS and OS between patients with low or normal muscle mass. Low muscle mass using any definition but one predicted a lower CSS on multivariate Cox regression analyses. All definitions were independent predictors of lower OS. The original multivariate model without incorporating low muscle mass had c-indices of 0.63 for CSS and 0.66 for OS, which increased to 0.64-0.67 for CSS and 0.67-0.70 for OS when low muscle mass was considered. The model with an Asian-based definition had the highest c-indices (0.67 for CSS and 0.70 for OS). CONCLUSIONS: The Asian-specific definition had the best predictive ability for mortality in this Asian patient cohort.


Subject(s)
Neoplasms , Sarcopenia , Humans , Prognosis , Sarcopenia/etiology , Muscle, Skeletal/diagnostic imaging , Muscle, Skeletal/pathology , Tomography, X-Ray Computed , Neoplasms/complications , Retrospective Studies
11.
Life (Basel) ; 13(12)2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38137904

ABSTRACT

This study aimed to explore the relationship between thyroid-stimulating hormone (TSH) elevation and the baseline computed tomography (CT) density and volume of the thyroid. We examined 86 cases with new-onset hypothyroidism (TSH > 4.5 IU/mL) and 1071 controls from a medical check-up database over 5 years. A deep learning-based thyroid segmentation method was used to assess CT density and volume. Statistical tests and logistic regression were employed to determine differences and odds ratios. Initially, the case group showed a higher CT density (89.8 vs. 81.7 Hounsfield units (HUs)) and smaller volume (13.0 vs. 15.3 mL) than those in the control group. For every +10 HU in CT density and -3 mL in volume, the odds of developing hypothyroidism increased by 1.40 and 1.35, respectively. Over the course of the study, the case group showed a notable CT density reduction (median: -8.9 HU), whereas the control group had a minor decrease (-2.9 HU). Thyroid volume remained relatively stable for both groups. Higher CT density and smaller thyroid volume at baseline are correlated with future TSH elevation. Over time, CT density decreased substantially in the case group and only slightly in the control group. Thyroid volumes remained consistent in both cohorts.

12.
J Pers Med ; 13(11)2023 Oct 25.
Article in English | MEDLINE | ID: mdl-38003843

ABSTRACT

Mammography images contain a lot of information about not only the mammary glands but also the skin, adipose tissue, and stroma, which may reflect the risk of developing breast cancer. We aimed to establish a method to predict breast cancer risk using radiomics features of mammography images and to enable further examinations and prophylactic treatment to reduce breast cancer mortality. We used mammography images of 4000 women with breast cancer and 1000 healthy women from the 'starting point set' of the OPTIMAM dataset, a public dataset. We trained a Light Gradient Boosting Machine using radiomics features extracted from mammography images of women with breast cancer (only the healthy side) and healthy women. This model was a binary classifier that could discriminate whether a given mammography image was of the contralateral side of women with breast cancer or not, and its performance was evaluated using five-fold cross-validation. The average area under the curve for five folds was 0.60122. Some radiomics features, such as 'wavelet-H_glcm_Correlation' and 'wavelet-H_firstorder_Maximum', showed distribution differences between the malignant and normal groups. Therefore, a single radiomics feature might reflect the breast cancer risk. The odds ratio of breast cancer incidence was 7.38 in women whose estimated malignancy probability was ≥0.95. Radiomics features from mammography images can help predict breast cancer risk.

13.
Front Neurosci ; 17: 1220848, 2023.
Article in English | MEDLINE | ID: mdl-37662100

ABSTRACT

Resting-state functional magnetic resonance imaging (rsfMRI) has been widely applied to investigate spontaneous neural activity, often based on its macroscopic organization that is termed resting-state networks (RSNs). Although the neurophysiological mechanisms underlying the RSN organization remain largely unknown, accumulating evidence points to a substantial contribution from the global signals to their structured synchronization. This study further explored the phenomenon by taking advantage of the inter- and intra-subject variations of the time delay and correlation coefficient of the signal timeseries in each region using the global mean signal as the reference signal. Consistent with the hypothesis based on the empirical and theoretical findings, the time lag and correlation, which have consistently been proven to represent local hemodynamic status, were shown to organize networks equivalent to RSNs. The results not only provide further evidence that the local hemodynamic status could be the direct source of the RSNs' spatial patterns but also explain how the regional variations in the hemodynamics, combined with the changes in the global events' power spectrum, lead to the observations. While the findings pose challenges to interpretations of rsfMRI studies, they further support the view that rsfMRI can offer detailed information related to global neurophysiological phenomena as well as local hemodynamics that would have great potential as biomarkers.

14.
JAMA Netw Open ; 6(6): e2318153, 2023 06 01.
Article in English | MEDLINE | ID: mdl-37378985

ABSTRACT

Importance: Characterizing longitudinal patterns of regional brain volume changes in a population with normal cognition at the individual level could improve understanding of the brain aging process and may aid in the prevention of age-related neurodegenerative diseases. Objective: To investigate age-related trajectories of the volumes and volume change rates of brain structures in participants without dementia. Design, Setting, and Participants: This cohort study was conducted from November 1, 2006, to April 30, 2021, at a single academic health-checkup center among 653 individuals who participated in a health screening program with more than 10 years of serial visits. Exposure: Serial magnetic resonance imaging, Mini-Mental State Examination, health checkup. Main Outcomes and Measures: Volumes and volume change rates across brain tissue types and regions. Results: The study sample included 653 healthy control individuals (mean [SD] age at baseline, 55.1 [9.3] years; median age, 55 years [IQR, 47-62 years]; 447 men [69%]), who were followed up annually for up to 15 years (mean [SD], 11.5 [1.8] years; mean [SD] number of scans, 12.1 [1.9]; total visits, 7915). Each brain structure showed characteristic age-dependent volume and atrophy change rates. In particular, the cortical gray matter showed a consistent pattern of volume loss in each brain lobe with aging. The white matter showed an age-related decrease in volume and an accelerated atrophy rate (regression coefficient, -0.016 [95% CI, -0.012 to -0.011]; P < .001). An accelerated age-related volume increase in the cerebrospinal fluid-filled spaces, particularly in the inferior lateral ventricle and the Sylvian fissure, was also observed (ventricle regression coefficient, 0.042 [95% CI, 0.037-0.047]; P < .001; sulcus regression coefficient, 0.021 [95% CI, 0.018-0.023]; P < .001). 
The temporal lobe atrophy rate accelerated from approximately 70 years of age, preceded by acceleration of atrophy in the hippocampus and amygdala. Conclusions and Relevance: In this cohort study of adults without dementia, age-dependent brain structure volumes and volume change rates in various brain structures were characterized using serial magnetic resonance imaging scans. These findings clarified the normal distributions in the aging brain, which are essential for understanding the process of age-related neurodegenerative diseases.


Subject(s)
Brain , Dementia , Male , Adult , Humans , Middle Aged , Child , Cohort Studies , Brain/diagnostic imaging , Brain/pathology , Aging/pathology , Magnetic Resonance Imaging , Cognition , Atrophy , Dementia/pathology
15.
Radiol Phys Technol ; 16(1): 28-38, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36344662

ABSTRACT

The purpose of this study was to realize an automated volume measurement of abdominal adipose tissue from the entire abdominal cavity in Dixon magnetic resonance (MR) images using deep learning. Our algorithm involves a combination of extraction of the abdominal cavity and body trunk regions using deep learning and extraction of a fat region based on automatic thresholding. To evaluate the proposed method, we calculated the Dice coefficient (DC) between the extracted regions using deep learning and labeled images. We also compared the visceral adipose tissue (VAT) and subcutaneous adipose tissue volumes calculated by employing the proposed method with those calculated from computed tomography (CT) images scanned on the same day using the automatic calculation method previously developed by our group. We implemented our method as a plug-in in a web-based medical image processing platform. The DCs of the abdominal cavity and body trunk regions were 0.952 ± 0.014 and 0.995 ± 0.002, respectively. The VAT volume measured from MR images using the proposed method was almost equivalent to that measured from CT images. The time required for our plug-in to process the test set was 118.9 ± 28.0 s. Using our proposed method, the VAT volume measured from MR images can be an alternative to that measured from CT images.


Subject(s)
Abdominal Cavity , Deep Learning , Reproducibility of Results , Abdominal Fat/diagnostic imaging , Image Processing, Computer-Assisted/methods , Magnetic Resonance Imaging/methods , Adipose Tissue
16.
Eur Thyroid J ; 12(1)2023 02 01.
Article in English | MEDLINE | ID: mdl-36562641

ABSTRACT

Objective: This study aimed to determine a standardized cut-off value for abnormal 18F-fluorodeoxyglucose (FDG) accumulation in the thyroid gland. Methods: Herein, 7013 FDG-PET/CT scans were included. An automatic thyroid segmentation method using two U-nets (2D- and 3D-U-net) was constructed; mean FDG standardized uptake value (SUV), CT value, and volume of the thyroid gland were obtained from each participant. The values were categorized by thyroid function into three groups based on serum thyroid-stimulating hormone levels. Thyroid function and mean SUV with increments of 1 were analyzed, and risk for thyroid dysfunction was calculated. Thyroid dysfunction detection ability was examined using a machine learning method (LightGBM, Microsoft) with age, sex, height, weight, CT value, volume, and mean SUV as explanatory variables. Results: Mean SUV was significantly higher in females with hypothyroidism. Almost 98.9% of participants in the normal group had mean SUV < 2, and 93.8% of participants with mean SUV < 2 had normal thyroid function. The hypothyroidism group had more cases with mean SUV ≥ 2. The relative risk of having abnormal thyroid function was 4.6 with mean SUV ≥ 2. The sensitivity and specificity for detecting thyroid dysfunction using LightGBM (Microsoft) were 14.5% and 99%, respectively. Conclusions: Mean SUV ≥ 2 was strongly associated with abnormal thyroid function in this large cohort, indicating that mean SUV with FDG-PET/CT can be used as a criterion for thyroid evaluation. Preliminarily, this study shows the potential utility of detecting thyroid dysfunction based on imaging findings.
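
Sensitivity and specificity, as reported for the classifier in this abstract, come from the four cells of a confusion matrix. The sketch below shows the calculation. The abstract does not give the underlying counts, so the numbers in the usage line are hypothetical and were chosen only so that the result matches the reported percentages.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # detected fraction of participants with dysfunction
    specificity = tn / (tn + fp)  # correctly cleared fraction of participants without it
    return sensitivity, specificity

# Hypothetical counts for illustration; not data from the study.
sens, spec = sensitivity_specificity(tp=29, fn=171, tn=990, fp=10)
print(f"sensitivity={sens:.1%}, specificity={spec:.0%}")
```

A pairing like this (low sensitivity, high specificity) means that a positive result is rarely a false alarm, while most true cases go undetected.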


Subject(s)
Hypothyroidism , Thyroid Diseases , Female , Humans , Fluorodeoxyglucose F18 , Positron Emission Tomography Computed Tomography , Tomography, X-Ray Computed/methods , Thyroid Diseases/diagnostic imaging
17.
Tomography ; 8(5): 2129-2152, 2022 08 24.
Article in English | MEDLINE | ID: mdl-36136875

ABSTRACT

Ultra-sparse-view computed tomography (CT) algorithms can reduce radiation exposure for patients, but these algorithms lack an explicit cycle consistency loss minimization and an explicit log-likelihood maximization in testing. Here, we propose X2CT-FLOW for the maximum a posteriori (MAP) reconstruction of a three-dimensional (3D) chest CT image from a single or a few two-dimensional (2D) projection images using a progressive flow-based deep generative model, especially for ultra-low-dose protocols. The MAP reconstruction can simultaneously optimize the cycle consistency loss and the log-likelihood. We applied X2CT-FLOW for the reconstruction of 3D chest CT images from biplanar projection images without noise contamination (assuming a standard-dose protocol) and with strong noise contamination (assuming an ultra-low-dose protocol). We simulated an ultra-low-dose protocol. With the standard-dose protocol, our images reconstructed from 2D projected images and 3D ground-truth CT images showed good agreement in terms of structural similarity (SSIM, 0.7675 on average), peak signal-to-noise ratio (PSNR, 25.89 dB on average), mean absolute error (MAE, 0.02364 on average), and normalized root mean square error (NRMSE, 0.05731 on average). Moreover, with the ultra-low-dose protocol, our images reconstructed from 2D projected images and the 3D ground-truth CT images also showed good agreement in terms of SSIM (0.7008 on average), PSNR (23.58 dB on average), MAE (0.02991 on average), and NRMSE (0.07349 on average).
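
Two of the agreement metrics quoted in this abstract, MAE and PSNR, can be sketched in a few lines. The example assumes intensities scaled to the range 0 to 1 (hence `data_range=1.0`) and treats images as flat lists of voxel values; the abstract does not state the normalization used, and the sample values are invented for illustration.

```python
import math

def mae(reference, reconstructed):
    """Mean absolute error between two equally sized images."""
    if len(reference) != len(reconstructed):
        raise ValueError("images must have the same number of voxels")
    return sum(abs(r - x) for r, x in zip(reference, reconstructed)) / len(reference)

def psnr(reference, reconstructed, data_range=1.0):
    """Peak signal-to-noise ratio in dB; data_range is the assumed maximum intensity span."""
    if len(reference) != len(reconstructed):
        raise ValueError("images must have the same number of voxels")
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(data_range ** 2 / mse)

# Hypothetical voxel intensities in [0, 1].
ground_truth = [0.0, 0.5, 1.0, 0.5]
reconstruction = [0.1, 0.5, 0.9, 0.5]
print(mae(ground_truth, reconstruction), psnr(ground_truth, reconstruction))
```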


Subject(s)
Algorithms , Tomography, X-Ray Computed , Humans , Imaging, Three-Dimensional/methods , Radiation Dosage , Signal-To-Noise Ratio , Tomography, X-Ray Computed/methods
18.
J Obstet Gynaecol Res ; 48(11): 2973-2978, 2022 Nov.
Article in English | MEDLINE | ID: mdl-35915563

ABSTRACT

Imaging and histological changes occurring in adenomyosis due to pregnancy are unclear. A 38-year-old nulliparous woman presented with dysmenorrhea and infertility. Pelvic magnetic resonance imaging (MRI) showed diffuse-type adenomyosis. Following pregnancy by in vitro fertilization, she was hospitalized at 23 weeks of gestation due to fetal growth restriction and subsequently diagnosed with preeclampsia. A second MRI performed due to an elevated inflammatory response at 31 weeks of gestation detected no obvious degenerative findings. An emergency cesarean section was performed at 33 weeks of gestation because of nonreassuring fetal status. On postpartum day 2, she showed uterine tenderness with a dramatically elevated inflammatory response. A third MRI showed cyst-like degenerations with hemorrhagic changes without abscess. By postpartum day 7, her symptoms had resolved and she was discharged from the hospital. A fourth MRI at postpartum month 4 confirmed the disappearance of the degenerations. This is the first report of imaging findings of early postpartum degeneration of adenomyosis.


Subject(s)
Adenomyosis , Cysts , Humans , Pregnancy , Female , Adult , Cesarean Section , Magnetic Resonance Imaging , Postpartum Period , Hemorrhage
19.
Eur J Radiol ; 154: 110445, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35901601

ABSTRACT

PURPOSE: To assess the clinical effectiveness of temporal subtraction computed tomography (TS CT) using deep learning to improve vertebral bone metastasis detection. METHOD: This retrospective study used TS CT comprising bony landmark detection, bone segmentation with a multi-atlas-based method, and spatial registration of two images by a log-domain diffeomorphic Demons algorithm. Paired current and past CT images of 50 patients without vertebral metastasis, recorded during June 2011-September 2016, were included as training data. A deep learning-based method estimated registration errors and suppressed false positives. Thereafter, paired CT images of 40 cancer patients with newly developed vertebral metastases and 40 control patients without vertebral metastases were evaluated. Six board-certified radiologists and five radiology residents independently interpreted 80 paired CT images with and without TS CT. RESULTS: Records of 40 patients in the metastasis group (median age: 64.5 years; 20 males) and 40 patients in the control group (median age: 64.0 years; 20 males) were evaluated. With TS CT, the overall figure of merit (FOM) of the board-certified radiologist and resident groups improved from 0.848 to 0.876 (p = 0.01) and from 0.752 to 0.799 (p = 0.02), respectively. The sub-analysis focusing on attenuation changes in lesions revealed that the FOM of osteoblastic lesions significantly improved in both the board-certified radiologist and resident groups using TS CT. The sub-analysis focusing on lesion location showed that the FOM of the resident group significantly improved in the vertebral arch (p = 0.04). CONCLUSIONS: TS CT was effective in detecting bone metastasis by both board-certified radiologists and radiology residents.


Subject(s)
Bone Neoplasms , Deep Learning , Bone Neoplasms/diagnostic imaging , Bone Neoplasms/secondary , Humans , Male , Middle Aged , Retrospective Studies , Subtraction Technique , Tomography, X-Ray Computed/methods
20.
Stud Health Technol Inform ; 290: 253-257, 2022 Jun 06.
Article in English | MEDLINE | ID: mdl-35673012

ABSTRACT

Medical artificial intelligence (AI) systems need to learn to recognize synonyms or paraphrases describing the same anatomy, disease, treatment, etc. to better understand real-world clinical documents. Existing linguistic resources focus on variants at the word or sentence level. To handle linguistic variations on a broader scale, we proposed the Medical Text Radiology Report section Japanese version (MedTxt-RR-JA), the first clinical comparable corpus. MedTxt-RR-JA was built by recruiting nine radiologists to diagnose the same 15 lung cancer cases in Radiopaedia, an open-access radiological repository. The 135 radiology reports in MedTxt-RR-JA were shown to contain word-, sentence-, and document-level variations while maintaining similar content. MedTxt-RR-JA is also the first publicly available Japanese radiology report corpus, which should help to overcome poor data availability for Japanese medical AI systems. Moreover, our methodology can be applied widely to building clinical corpora without privacy concerns.


Subject(s)
Artificial Intelligence , Radiology , Humans , Language , Radiography , Radiologists