ABSTRACT
BACKGROUND: Coronary artery disease (CAD) has one of the highest mortality rates in humans worldwide. Single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI) provides clinicians with myocardial metabolic information non-invasively. However, there are some limitations to interpreting SPECT images performed by physicians or automatic quantitative approaches. Radiomics analyzes images objectively by extracting quantitative features and can potentially reveal biological characteristics that the human eye cannot detect. However, the reproducibility and repeatability of some radiomic features can be highly susceptible to segmentation and imaging conditions. PURPOSE: We aimed to assess the reproducibility of radiomic features extracted from uncorrected MPI-SPECT images reconstructed with 15 different settings before and after ComBat harmonization, along with evaluating the effectiveness of ComBat in realigning feature distributions. MATERIALS AND METHODS: A total of 200 patients (50% normal and 50% abnormal) including rest and stress (without attenuation and scatter corrections) MPI-SPECT images were included. Images were reconstructed using 15 combinations of filter cut-off frequencies, filter orders, filter types, reconstruction algorithms, number of iterations and subsets resulting in 6000 images. Image segmentation was performed on the left ventricle in the first reconstruction for each patient and applied to 14 others. A total of 93 radiomic features were extracted from the segmented area, and ComBat was used to harmonize them. The intraclass correlation coefficient (ICC) and overall concordance correlation coefficient (OCCC) tests were performed before and after ComBat to examine the impact of each parameter on feature robustness and to assess harmonization efficiency. The ANOVA and the Kruskal-Wallis tests were performed to evaluate the effectiveness of ComBat in correcting feature distributions. 
In addition, the Student's t-test, Wilcoxon rank-sum, and signed-rank tests were implemented to assess the significance level of the impacts made by each parameter of different batches and patient groups (normal vs. abnormal) on radiomic features. RESULTS: Before applying ComBat, the majority of features (ICC: 82, OCCC: 61) achieved high reproducibility (ICC/OCCC ≥ 0.900) under every batch except Reconstruction. The largest and smallest number of poor features (ICC/OCCC < 0.500) were obtained by IterationSubset and Order batches, respectively. The most reliable features were from the first-order (FO) and gray-level co-occurrence matrix (GLCM) families. Following harmonization, the minimum number of robust features increased (ICC: 84, OCCC: 78). Applying ComBat showed that Order and Reconstruction were the least and the most responsive batches, respectively. The most robust families, in a descending order, were found to be FO, neighborhood gray-tone difference matrix (NGTDM), GLCM, gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), and gray-level dependence matrix (GLDM) under Cut-off, Filter, and Order batches. The Wilcoxon rank-sum test showed that the number of robust features significantly differed under most batches in the Normal and Abnormal groups. CONCLUSION: The majority of radiomic features show high levels of robustness across different OSEM reconstruction parameters in uncorrected MPI-SPECT. ComBat is effective in realigning feature distributions and enhancing radiomic features reproducibility.
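The reproducibility screening described above can be illustrated with a minimal sketch. The abstract does not state which ICC form was used, so the one-way random-effects ICC(1,1) below is an assumption, as are the function names, the table layout (one row per patient, one column per reconstruction setting) and the "intermediate" label. Only the 0.900 and 0.500 thresholds come from the abstract.

```python
from statistics import mean


def icc_one_way(table):
    """One-way random-effects ICC(1,1) for a single radiomic feature.

    table: list of rows, one row per patient, one column per
    reconstruction setting (hypothetical layout, not from the paper).
    """
    n = len(table)       # number of patients
    k = len(table[0])    # number of settings per patient
    grand = mean(v for row in table for v in row)
    row_means = [mean(row) for row in table]
    # Between-patient and within-patient mean squares.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (v - m) ** 2 for row, m in zip(table, row_means) for v in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def robustness_label(icc):
    """Thresholds quoted in the abstract: >= 0.900 high, < 0.500 poor."""
    if icc >= 0.900:
        return "high"
    if icc < 0.500:
        return "poor"
    return "intermediate"  # label chosen here for illustration only


# Toy feature values for three patients under three settings (made up).
toy_feature = [
    [10.0, 10.1, 9.9],
    [20.0, 20.2, 19.8],
    [30.0, 29.9, 30.1],
]
toy_icc = icc_one_way(toy_feature)
toy_label = robustness_label(toy_icc)
```

In this toy table the spread between patients dwarfs the spread across settings, so the feature would be labelled as highly reproducible.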
ABSTRACT
INTRODUCTION: We propose a fully automated framework to conduct a region-wise image quality assessment (IQA) on whole-body 18F-FDG PET scans. This framework (1) can be valuable in daily clinical image acquisition procedures to instantly recognize low-quality scans for potential rescanning and/or image reconstruction, and (2) can make a significant impact in dataset collection for the development of artificial intelligence-driven 18F-FDG PET analysis models by rejecting low-quality images and those presenting with artifacts, toward building clean datasets. PATIENTS AND METHODS: Two experienced nuclear medicine physicians separately evaluated the quality of 174 18F-FDG PET images from 87 patients, for each body region, based on a 5-point Likert scale. The body regions included the following: (1) the head and neck, including the brain, (2) the chest, (3) the chest-abdomen interval (diaphragmatic region), (4) the abdomen, and (5) the pelvis. Intrareader and interreader reproducibility of the quality scores were calculated using 39 randomly selected scans from the dataset. Utilizing a binarized classification, images were dichotomized into low-quality versus high-quality for physician quality scores ≤3 versus >3, respectively. Taking the 18F-FDG PET/CT scans as input, our proposed fully automated framework applies 2 deep learning (DL) models on CT images to perform region identification and whole-body contour extraction (excluding extremities), then classifies PET regions as low or high quality. For classification, 2 mainstream artificial intelligence-driven approaches, including machine learning (ML) from radiomic features and DL, were investigated. All models were trained and evaluated on scores attributed by each physician, and the average of the scores reported. 
The performance evaluation was carried out on the same test dataset for radiomics-ML and DL models using the area under the curve, accuracy, sensitivity, and specificity, and models were compared using the DeLong test with P values <0.05 regarded as statistically significant. RESULTS: In the head and neck, chest, chest-abdomen interval, abdomen, and pelvis regions, the best models achieved area under the curve, accuracy, sensitivity, and specificity of [0.97, 0.95, 0.96, and 0.95], [0.85, 0.82, 0.87, and 0.76], [0.83, 0.76, 0.68, and 0.80], [0.73, 0.72, 0.64, and 0.77], and [0.72, 0.68, 0.70, and 0.67], respectively. In all regions, models achieved the highest performance when developed on the quality scores with higher intrareader reproducibility. Comparison of DL and radiomics-ML models did not show any statistically significant differences, though DL models showed overall improved trends. CONCLUSIONS: We developed a fully automated and human-perceptive equivalent model to conduct region-wise IQA over 18F-FDG PET images. Our analysis emphasizes the necessity of developing separate models for body regions and performing data annotation based on multiple experts' consensus in IQA studies.
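The dichotomization rule and the reported metrics can be sketched as follows. The ≤3 versus >3 split comes from the abstract. Treating "low" as the positive class, the function names and the toy scores are assumptions made only for illustration.

```python
def dichotomize(score):
    """Physician Likert score <= 3 -> low quality, > 3 -> high quality."""
    return "low" if score <= 3 else "high"


def binary_metrics(truth, predicted, positive="low"):
    """Accuracy, sensitivity and specificity for two label lists.

    The choice of "low" as the positive class is an assumption.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical physician scores for six regions and model outputs.
likert_scores = [1, 2, 3, 4, 5, 4]
truth = [dichotomize(s) for s in likert_scores]
predicted = ["low", "low", "high", "high", "high", "low"]
metrics = binary_metrics(truth, predicted)
```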
Subjects
Fluorodeoxyglucose F18; Image Processing, Computer-Assisted; Positron-Emission Tomography; Humans; Positron-Emission Tomography/standards; Positron-Emission Tomography/methods; Image Processing, Computer-Assisted/methods; Automation; Male; Female; Middle Aged; Quality Control; Aged; Whole Body Imaging; Adult
ABSTRACT
The current study aimed to predict lymphovascular invasion (LVI) using multiple machine learning algorithms and multi-segmentation positron emission tomography (PET) radiomics in non-small cell lung cancer (NSCLC) patients, offering new avenues for personalized treatment strategies and improving patient outcomes. One hundred and twenty-six patients with NSCLC were enrolled in this study. Various automated and semi-automated PET image segmentation methods were applied, including Local Active Contour (LAC), Fuzzy-C-mean (FCM), K-means (KM), Watershed, Region Growing (RG), and Iterative thresholding (IT) with different percentages of the threshold. One hundred five radiomic features were extracted from each region of interest (ROI). Multiple feature selection methods, including Minimum Redundancy Maximum Relevance (MRMR), Recursive Feature Elimination (RFE), and Boruta, and multiple classifiers, including Multilayer Perceptron (MLP), Logistic Regression (LR), XGBoost (XGB), Naive Bayes (NB), and Random Forest (RF), were employed. Synthetic Minority Oversampling Technique (SMOTE) was also used to determine if it boosts the area under the ROC curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE). Our results indicated that the combination of SMOTE, IT (with 45% threshold), RFE feature selection and LR classifier showed the best performance (AUC = 0.93, ACC = 0.84, SEN = 0.85, SPE = 0.84) followed by SMOTE, FCM segmentation, MRMR feature selection, and LR classifier (AUC = 0.92, ACC = 0.87, SEN = 1, SPE = 0.84). The highest ACC belonged to the IT segmentation (with 45 and 50% thresholds) alongside Boruta feature selection and the NB classifier without SMOTE (ACC = 0.9, AUC = 0.78 and 0.76, SEN = 0.7, and SPE = 0.94, respectively). Our results indicate that selection of appropriate segmentation method and machine learning algorithm may be helpful in successful prediction of LVI in patients with NSCLC with high accuracy using PET radiomics analysis.
ABSTRACT
PURPOSE: Non-small cell lung cancer is the most common subtype of lung cancer. Patient survival prediction using machine learning (ML) and radiomics analysis proved to provide promising outcomes. However, most studies reported in the literature focused on information extracted from malignant lesions. This study aims to explore the relevance and additional value of information extracted from healthy organs in addition to tumoral tissue using ML algorithms. PATIENTS AND METHODS: This study included PET/CT images of 154 patients collected from available online databases. The gross tumor volume and 33 volumes of interest defined on healthy organs were segmented using nnU-Net deep learning-based segmentation. Subsequently, 107 radiomic features were extracted from PET and CT images (Organomics). Clinical information was combined with PET and CT radiomics from organs and gross tumor volumes considering 19 different combinations of inputs. Finally, different feature selection (FS; 5 methods) and ML (6 algorithms) algorithms were tested in a 3-fold data split cross-validation scheme. The performance of the models was quantified in terms of the concordance index (C-index) metric. RESULTS: For an input combination of all radiomics information, most of the selected features belonged to PET Organomics and CT Organomics. The highest C-index (0.68) was achieved using univariate C-index FS method and random survival forest ML model using CT Organomics + PET Organomics as input as well as minimum depth FS method and CoxPH ML model using PET Organomics as input. Considering all 17 combinations with C-index higher than 0.65, Organomics from PET or CT images were used as input in 16 of them. CONCLUSIONS: The selected features and C-indices demonstrated that the additional information extracted from healthy organs of both PET and CT imaging modalities improved the ML performance. 
Organomics could be a step toward exploiting the whole information available from multimodality medical images, contributing to the emerging field of digital twins in health care.
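Since the models above are ranked by the concordance index, a small sketch of Harrell's C-index may help in reading the reported values. The function name, the handling of tied risk scores (counted as 0.5) and the toy data are assumptions. The abstract does not say which implementation was used.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times:  follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    risks:  model risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed
            # event strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical follow-up data for three patients.
toy_times = [5, 10, 15]
toy_events = [1, 1, 0]
c_good = concordance_index(toy_times, toy_events, [0.9, 0.5, 0.1])
c_bad = concordance_index(toy_times, toy_events, [0.1, 0.5, 0.9])
c_flat = concordance_index(toy_times, toy_events, [0.5, 0.5, 0.5])
```

A value of 0.5 corresponds to uninformative ranking and 1.0 to perfect ranking, which gives context for the 0.65 to 0.68 range reported above.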
Subjects
Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Machine Learning; Positron Emission Tomography Computed Tomography; Humans; Carcinoma, Non-Small-Cell Lung/diagnostic imaging; Carcinoma, Non-Small-Cell Lung/pathology; Lung Neoplasms/diagnostic imaging; Male; Prognosis; Female; Aged; Middle Aged; Image Processing, Computer-Assisted/methods; Aged, 80 and over; Adult; Radiomics
ABSTRACT
BACKGROUND: Overall Survival (OS) and Progression-Free Survival (PFS) analyses are crucial metrics for evaluating the efficacy and impact of treatment. This study evaluated the role of clinical biomarkers and dosimetry parameters on survival outcomes of patients undergoing 90Y selective internal radiation therapy (SIRT). MATERIALS/METHODS: This preliminary and retrospective analysis included 17 patients with hepatocellular carcinoma (HCC) treated with 90Y SIRT. The patients underwent personalized treatment planning and voxel-wise dosimetry. After the procedure, the OS and PFS were evaluated. Three structures were delineated including tumoral liver (TL), normal perfused liver (NPL), and whole normal liver (WNL). 289 dose-volume constraints (DVCs) were extracted from dose-volume histograms of physical and biological effective dose (BED) maps calculated on 99mTc-MAA and 90Y SPECT/CT images. Subsequently, the DVCs and 16 clinical biomarkers were used as features for univariate and multivariate analysis. Cox proportional hazard ratio (HR) was employed for univariate analysis. HR and the concordance index (C-Index) were calculated for each feature. Using eight different strategies, a cross-combination of various models and feature selection (FS) methods was applied for multivariate analysis. The performance of each model was assessed using an averaged C-Index on a three-fold nested cross-validation framework. The Kaplan-Meier (KM) curve was employed for univariate and machine learning (ML) model performance assessment. RESULTS: The median OS was 11 months [95% CI: 8.5, 13.09], whereas the PFS was seven months [95% CI: 5.6, 10.98]. Univariate analysis demonstrated the presence of Ascites (HR: 9.2[1.8,47]) and the aim of SIRT (segmentectomy, lobectomy, palliative) (HR: 0.066 [0.0057, 0.78]), Aspartate aminotransferase (AST) level (HR:0.1 [0.012-0.86]), and MAA-Dose-V205(%)-TL (HR:8.5[1,72]) as predictors for OS. 
90Y-derived parameters were associated with PFS but not with OS. MAA-Dose-V205(%)-WNL, MAA-BED-V400(%)-WNL with (HR:13 [1.5-120]) and 90Y-Dose-mean-TL, 90Y-D50-TL-Gy, 90Y-Dose-V205(%)-TL, 90Y-Dose-D50-TL-Gy, and 90Y-BED-V400(%)-TL (HR:15 [1.8-120]) were highly associated with PFS among dosimetry parameters. The highest C-index observed in multivariate analysis using ML was 0.94 ± 0.13, obtained from Variable Hunting-variable-importance (VH.VIMP) FS and the Cox Proportional Hazard model predicting OS using clinical features. However, the combination of the VH.VIMP FS method with a Generalized Linear Model Network model predicting OS using Therapy strategy features outperformed the other models in terms of both C-index and stratification of KM curves (C-Index: 0.93 ± 0.14 and log-rank p-value of 0.023 for KM curve stratification). CONCLUSION: This preliminary study confirmed the role played by baseline clinical biomarkers and dosimetry parameters in predicting the treatment outcome, paving the way for the establishment of a dose-effect relationship. In addition, the feasibility of using ML along with these features was demonstrated as a helpful tool in the clinical management of patients, both prior to and following 90Y-SIRT.
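The Kaplan-Meier curves and median survival times mentioned above follow the product-limit estimator, sketched here on made-up data. The function names and the toy follow-up times are hypothetical, and confidence intervals are omitted.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at event times.

    times:  follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up in months for five patients (not study data).
toy_times = [2, 4, 4, 6, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
toy_median = median_survival(toy_curve)
```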
Subjects
Carcinoma, Hepatocellular; Liver Neoplasms; Machine Learning; Microspheres; Precision Medicine; Radiometry; Yttrium Radioisotopes; Humans; Male; Female; Yttrium Radioisotopes/therapeutic use; Middle Aged; Liver Neoplasms/radiotherapy; Liver Neoplasms/diagnostic imaging; Aged; Carcinoma, Hepatocellular/radiotherapy; Carcinoma, Hepatocellular/diagnostic imaging; Precision Medicine/methods; Progression-Free Survival; Retrospective Studies; Glass; Biomarkers, Tumor
ABSTRACT
BACKGROUND: Contrast-enhanced computed tomography (CECT) provides much more information compared to non-enhanced CT images, especially for the differentiation of malignancies, such as liver carcinomas. Contrast media injection phase information is usually missing on public datasets and not standardized in the clinic even in the same region and language. This is a barrier to effective use of available CECT images in clinical research. PURPOSE: The aim of this study is to detect contrast media injection phase from CT images by means of organ segmentation and machine learning algorithms. METHODS: A total of 2509 CT images split into four subsets of non-contrast (class #0), arterial (class #1), venous (class #2), and delayed (class #3) after contrast media injection were collected from two CT scanners. Masks of seven organs, including the liver, spleen, heart, kidneys, lungs, urinary bladder, and aorta, along with body contour masks, were generated by pre-trained deep learning algorithms. Subsequently, five first-order statistical features, including the average, standard deviation, and 10th, 50th, and 90th percentiles, extracted from the above-mentioned masks were fed to machine learning models after feature selection and reduction to classify the CT images into one of the four above-mentioned classes. A 10-fold data split strategy was followed. The performance of our methodology was evaluated in terms of classification accuracy metrics. RESULTS: The best performance was achieved by Boruta feature selection and the RF model, with an average area under the curve of more than 0.999 and accuracy of 0.9936 averaged over four classes and 10 folds. Boruta feature selection selected all predictor features. The lowest classification accuracy was observed for class #2 (0.9888), which is already an excellent result. In the 10-fold strategy, only 33 of 2509 cases (~1.4%) were misclassified. The performance over all folds was consistent. 
CONCLUSIONS: We developed a fast, accurate, reliable, and explainable methodology to classify contrast media phases which may be useful in data curation and annotation in big online datasets or local datasets with non-standard or no series description. Our model containing two steps of deep learning and machine learning may help to exploit available datasets more effectively.
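The five first-order features named in the methods (mean, standard deviation, and the 10th, 50th and 90th percentiles) can be computed per organ mask roughly as below. This is a sketch under assumptions: flattened voxel lists stand in for real CT volumes, the population standard deviation and linear-interpolation percentiles are choices the abstract does not specify, and all names are hypothetical.

```python
from statistics import mean, pstdev, quantiles


def masked_values(image, mask):
    """Keep the voxel values where the organ mask is non-zero."""
    return [v for v, m in zip(image, mask) if m]


def first_order_features(voxels):
    """Mean, standard deviation and 10th/50th/90th percentiles."""
    # method="inclusive" interpolates linearly between data points.
    deciles = quantiles(voxels, n=10, method="inclusive")
    return {
        "mean": mean(voxels),
        "std": pstdev(voxels),
        "p10": deciles[0],
        "p50": deciles[4],
        "p90": deciles[8],
    }


# Toy flattened "image" and organ mask (values are made up).
toy_image = list(range(11)) + [500, 600]
toy_mask = [1] * 11 + [0, 0]
toy_features = first_order_features(masked_values(toy_image, toy_mask))
```

Stacking such a dictionary for each of the organ and body-contour masks would give one feature vector per scan for the downstream classifier.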
Subjects
Automation; Contrast Media; Image Processing, Computer-Assisted; Machine Learning; Tomography, X-Ray Computed; Humans; Image Processing, Computer-Assisted/methods; Radiography, Abdominal; Abdomen/diagnostic imaging
ABSTRACT
PURPOSE: This study aimed to examine the robustness of positron emission tomography (PET) radiomic features extracted via different segmentation methods before and after ComBat harmonization in patients with non-small cell lung cancer (NSCLC). METHODS: We included 120 patients (positive recurrence = 46 and negative recurrence = 74) referred for PET scanning as a routine part of their care. All patients had a biopsy-proven NSCLC. Nine segmentation methods were applied to each image, including manual delineation, K-means (KM), watershed, fuzzy-C-mean, region-growing, local active contour (LAC), and iterative thresholding (IT) with 40, 45, and 50% thresholds. Diverse image discretizations, both without a filter and with different wavelet decompositions, were applied to PET images. Overall, 6741 radiomic features were extracted from each image (749 radiomic features from each segmented area). Non-parametric empirical Bayes (NPEB) ComBat harmonization was used to harmonize the features. A Linear Support Vector Classifier (LinearSVC) with L1 regularization was used for feature selection, and a Support Vector Machine (SVM) classifier with fivefold nested cross-validation (StratifiedKFold with 'n_splits' set to 5) was used to predict recurrence in NSCLC patients and assess the impact of ComBat harmonization on the outcome. RESULTS: From 749 extracted radiomic features, 206 (27%) and 389 (51%) features showed excellent reliability (ICC ≥ 0.90) against segmentation method variation before and after NPEB ComBat harmonization, respectively. Among all, 39 features demonstrated poor reliability, which declined to 10 after ComBat harmonization. The 64 fixed bin widths (without any filter) and wavelet (LLL)-based radiomic feature sets achieved the best performance in terms of robustness against diverse segmentation techniques before and after ComBat harmonization. 
The first-order and GLRLM and also first-order and NGTDM feature families showed the largest number of robust features before and after ComBat harmonization, respectively. In terms of predicting recurrence in NSCLC, our findings indicate that using ComBat harmonization can significantly enhance machine learning outcomes, particularly improving the accuracy of watershed segmentation, which initially had fewer reliable features than manual contouring. Following the application of ComBat harmonization, the majority of cases saw substantial increase in sensitivity and specificity. CONCLUSION: Radiomic features are vulnerable to different segmentation methods. ComBat harmonization might be considered a solution to overcome the poor reliability of radiomic features.
Subjects
Carcinoma, Non-Small-Cell Lung; Image Processing, Computer-Assisted; Lung Neoplasms; Humans; Carcinoma, Non-Small-Cell Lung/diagnostic imaging; Lung Neoplasms/diagnostic imaging; Image Processing, Computer-Assisted/methods; Male; Female; Middle Aged; Aged; Positron-Emission Tomography/methods; Support Vector Machine; Adult; Radiomics
ABSTRACT
This study investigated the impact of ComBat harmonization on the reproducibility of radiomic features extracted from magnetic resonance images (MRI) acquired on different scanners, using various data acquisition parameters and multiple image pre-processing techniques using a dedicated MRI phantom. Four scanners were used to acquire an MRI of a nonanatomic phantom as part of the TCIA RIDER database. In fast spin-echo inversion recovery (IR) sequences, several inversion durations were employed, including 50, 100, 250, 500, 750, 1000, 1500, 2000, 2500, and 3000 ms. In addition, a 3D fast spoiled gradient recalled echo (FSPGR) sequence was used to investigate several flip angles (FA): 2, 5, 10, 15, 20, 25, and 30 degrees. Nineteen phantom compartments were manually segmented. Different approaches were used to pre-process each image: Bin discretization, Wavelet filter, Laplacian of Gaussian, logarithm, square, square root, and gradient. Overall, 92 first-, second-, and higher-order statistical radiomic features were extracted. ComBat harmonization was also applied to the extracted radiomic features. Finally, the Intraclass Correlation Coefficient (ICC) and Kruskal-Wallis's (KW) tests were implemented to assess the robustness of radiomic features. The number of non-significant features in the KW test ranged between 0-5 and 29-74 for various scanners, 31-91 and 37-92 for three times tests, 0-33 to 34-90 for FAs, and 3-68 to 65-89 for IRs before and after ComBat harmonization, with different image pre-processing techniques, respectively. The number of features with ICC over 90% ranged between 0-8 and 6-60 for various scanners, 11-75 and 17-80 for three times tests, 3-83 to 9-84 for FAs, and 3-49 to 3-63 for IRs before and after ComBat harmonization, with different image pre-processing techniques, respectively. The use of various scanners, IRs, and FAs has a great impact on radiomic features. However, the majority of scanner-robust features is also robust to IR and FA. 
Among the effective parameters in MR images, several tests in one scanner have a negligible impact on radiomic features. Different scanners and acquisition parameters using various image pre-processing might affect radiomic features to a large extent. ComBat harmonization might significantly impact the reproducibility of MRI radiomic features.
Subjects
Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Phantoms, Imaging; Magnetic Resonance Imaging/methods; Magnetic Resonance Imaging/standards; Reproducibility of Results; Humans; Image Processing, Computer-Assisted/methods; Algorithms; Radiomics
ABSTRACT
BACKGROUND: Notwithstanding the encouraging results of previous studies reporting on the efficiency of deep learning (DL) in COVID-19 prognostication, clinical adoption of the developed methodology still needs to be improved. To overcome this limitation, we set out to predict the prognosis of a large multi-institutional cohort of patients with COVID-19 using a DL-based model. PURPOSE: This study aimed to evaluate the performance of deep privacy-preserving federated learning (DPFL) in predicting COVID-19 outcomes using chest CT images. METHODS: After applying inclusion and exclusion criteria, 3055 patients from 19 centers, including 1599 alive and 1456 deceased, were enrolled in this study. Data from all centers were split (randomly with stratification respective to each center and class) into a training/validation set (70%/10%) and a hold-out test set (20%). For the DL model, feature extraction was performed on 2D slices, and averaging was performed at the final layer to construct a 3D model for each scan. The DenseNet model was used for feature extraction. The model was developed using centralized and FL approaches. For FL, we employed DPFL approaches. Membership inference attack was also evaluated in the FL strategy. For model evaluation, different metrics were reported in the hold-out test sets. In addition, models trained in two scenarios, centralized and FL, were compared using the DeLong test for statistical differences. RESULTS: The centralized model achieved an accuracy of 0.76, while the DPFL model had an accuracy of 0.75. Both the centralized and DPFL models achieved a specificity of 0.77. The centralized model achieved a sensitivity of 0.74, while the DPFL model had a sensitivity of 0.73. Mean AUCs of 0.82 (95% CI: 0.79-0.85) and 0.81 (95% CI: 0.77-0.84) were achieved by the centralized model and the DPFL model, respectively. 
The DeLong test did not prove statistically significant differences between the two models (p-value = 0.98). The AUC values for the inference attacks fluctuate between 0.49 and 0.51, with an average of 0.50 ± 0.003 and 95% CI for the mean AUC of 0.500 to 0.501. CONCLUSION: The performance of the proposed model was comparable to centralized models while operating on large and heterogeneous multi-institutional datasets. In addition, the model was resistant to inference attacks, ensuring the privacy of shared data during the training process.
Subjects
COVID-19; Deep Learning; Tomography, X-Ray Computed; COVID-19/diagnostic imaging; Humans; Prognosis; Male; Female; Aged; Middle Aged; Privacy; Radiography, Thoracic; Datasets as Topic
ABSTRACT
BACKGROUND: This study aimed to investigate the value of clinical features, radiomic features extracted from gross tumor volumes (GTVs) delineated on CT images, dose distributions (Dosiomics), and fusion of CT and dose distributions to predict outcomes in head and neck cancer (HNC) patients. METHODS: A cohort of 240 HNC patients from five different centers was obtained from The Cancer Imaging Archive. Seven strategies, including four non-fusion (Clinical, CT, Dose, DualCT-Dose) and three fusion algorithms (latent low-rank representation (LLRR), wavelet, and weighted least squares (WLS)), were applied. The fusion algorithms were used to fuse the pre-treatment CT images and 3-dimensional dose maps. Overall, 215 radiomics and Dosiomics features were extracted from the GTVs, and seven clinical features were incorporated. Five feature selection (FS) methods in combination with six machine learning (ML) models were implemented. The performance of the models was quantified using the concordance index (CI) in one-center-leave-out 5-fold cross-validation for overall survival (OS) prediction considering the time-to-event. RESULTS: The mean CI and Kaplan-Meier curves were used for further comparisons. The CoxBoost ML model using the Minimal Depth (MD) FS method and the glmnet model using the Variable hunting (VH) FS method showed the best performance with CI = 0.73 ± 0.15 for features extracted from LLRR fused images. In addition, both glmnet-Cindex and Coxph-Cindex classifiers achieved a CI of 0.72 ± 0.14 by employing the dose images (+ incorporated clinical features) only. CONCLUSION: Our results demonstrated that clinical features, Dosiomics and fusion of dose and CT images by specific ML-FS models could predict the overall survival of HNC patients with acceptable accuracy. Besides, the performance of ML methods among the three different strategies was almost comparable.
Subjects
Head and Neck Neoplasms; Radiomics; Humans; Prognosis; Head and Neck Neoplasms/diagnostic imaging; Head and Neck Neoplasms/radiotherapy; Machine Learning; Tomography, X-Ray Computed
ABSTRACT
BACKGROUND: There are few studies on detecting rhythm abnormalities among healthy children and adolescents. The aim of the study was to investigate the prevalence of abnormal electrocardiographic findings in the young Iranian population and its association with blood pressure and obesity. METHODS: A total of 15084 children and adolescents were examined in a randomly selected population of Tehran city, Iran, between October 2017 and December 2018. Anthropometric values and blood pressure measurements were also assessed. A standard 12-lead electrocardiogram was recorded by a unique recorder, and the recordings were examined by electrophysiologists. RESULTS: The mean age of all students was 12.3 ± 3.1 years (6-18 years), and 52% were boys. A total of 2900 students (192.2/1000 persons; 95% confidence interval 186-198.6) had electrocardiographic abnormalities. The rate of electrocardiographic abnormalities was higher in boys than girls (p < 0.001). Electrocardiographic abnormalities were significantly more frequent in thin than obese students (p < 0.001), and hypertensive individuals tended to have more electrocardiographic abnormalities than normotensive individuals (p = 0.063). Based on the multivariable analysis, individuals with electrocardiographic abnormalities were less likely to be girls (odds ratio 0.745, 95% confidence interval 0.682-0.814) and had a lower body mass index (odds ratio 0.961, 95% confidence interval 0.944-0.979). CONCLUSIONS: In this large-scale study, there was a high prevalence of electrocardiographic abnormalities among the young population. In addition, electrocardiographic findings were significantly influenced by increasing age, sex, obesity, and blood pressure levels. This community-based study revealed the implications of electrocardiographic screening to improve care delivery through early detection.
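As a worked check of the prevalence arithmetic, the sketch below recomputes the rate per 1000 persons from the counts in the abstract (2900 of 15084) together with a 95% interval. The abstract does not name the interval method, so the Wilson score interval is an assumption. With these counts it lands close to the reported 186 to 198.6 per 1000.

```python
from math import sqrt


def wilson_interval(cases, n, z=1.96):
    """Wilson score interval for a proportion (assumed method)."""
    p = cases / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


cases, n = 2900, 15084  # counts taken from the abstract
prevalence_per_1000 = 1000 * cases / n
low, high = wilson_interval(cases, n)
low_per_1000, high_per_1000 = 1000 * low, 1000 * high
```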
Subjects
Blood Pressure; Electrocardiography; Humans; Iran/epidemiology; Male; Female; Adolescent; Child; Prevalence; Blood Pressure/physiology; Hypertension/epidemiology; Cross-Sectional Studies; Pediatric Obesity/epidemiology; Arrhythmias, Cardiac/epidemiology; Arrhythmias, Cardiac/diagnosis; Body Mass Index; Risk Factors
ABSTRACT
BACKGROUND: PET/CT images combining anatomic and metabolic data provide complementary information that can improve clinical task performance. PET image segmentation algorithms exploiting the multi-modal information available are still lacking. PURPOSE: Our study aimed to assess the performance of PET and CT image fusion for gross tumor volume (GTV) segmentations of head and neck cancers (HNCs) utilizing conventional, deep learning (DL), and output-level voting-based fusions. METHODS: The current study is based on a total of 328 histologically confirmed HNCs from six different centers. The images were automatically cropped to a 200 × 200 head and neck region box, and CT and PET images were normalized for further processing. Eighteen conventional image-level fusions were implemented. In addition, a modified U2-Net architecture as DL fusion model baseline was used. Three different input, layer, and decision-level information fusions were used. Simultaneous truth and performance level estimation (STAPLE) and majority voting to merge different segmentation outputs (from PET and image-level and network-level fusions), that is, output-level information fusion (voting-based fusions) were employed. Different networks were trained in a 2D manner with a batch size of 64. Twenty percent of the dataset with stratification concerning the centers (20% in each center) were used for final result reporting. Different standard segmentation metrics and conventional PET metrics, such as SUV, were calculated. RESULTS: In single modalities, PET had a reasonable performance with a Dice score of 0.77 ± 0.09, while CT did not perform acceptably and reached a Dice score of only 0.38 ± 0.22. 
Conventional fusion algorithms obtained a Dice score range of [0.76-0.81] with guided-filter-based context enhancement (GFCE) at the low end, and anisotropic diffusion and Karhunen-Loeve transform fusion (ADF), multi-resolution singular value decomposition (MSVD), and multi-level image decomposition based on latent low-rank representation (MDLatLRR) at the high end. All DL fusion models achieved Dice scores of 0.80. Output-level voting-based models outperformed all other models, achieving superior results with a Dice score of 0.84 for Majority_ImgFus, Majority_All, and Majority_Fast. A mean error of almost zero was achieved for all fusions using SUVpeak, SUVmean, and SUVmedian. CONCLUSION: PET/CT information fusion adds significant value to segmentation tasks, considerably outperforming PET-only and CT-only methods. In addition, both conventional image-level and DL fusions achieve competitive results. Meanwhile, output-level voting-based fusion using majority voting of several algorithms results in statistically significant improvements in the segmentation of HNC.
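The Dice score and the majority-voting fusion mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flattened toy masks, and the strict "more than half" voting rule are assumptions made for illustration, and STAPLE is not shown.

```python
from typing import List, Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A and B| / (|A| + |B|) for two binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def majority_vote(masks: Sequence[Sequence[int]]) -> List[int]:
    """Mark a voxel as tumor when more than half of the 0/1 masks do."""
    if not masks:
        raise ValueError("need at least one mask")
    n = len(masks)
    return [1 if 2 * sum(votes) > n else 0 for votes in zip(*masks)]


# Hypothetical flattened masks from three segmentation models.
truth = [0, 1, 1, 1, 0, 0]
model_a = [0, 1, 1, 0, 0, 0]
model_b = [0, 1, 1, 1, 1, 0]
model_c = [1, 1, 0, 1, 0, 0]

fused = majority_vote([model_a, model_b, model_c])
print(dice_score(model_a, truth), dice_score(fused, truth))
```

In this toy case each model makes a different error, so the vote cancels them out and the fused mask matches the reference exactly, which is the intuition behind output-level fusion.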
Subjects
Head and Neck Neoplasms , Positron Emission Tomography Computed Tomography , Humans , Positron Emission Tomography Computed Tomography/methods , Algorithms , Head and Neck Neoplasms/diagnostic imaging , Computer-Assisted Image Processing/methods
ABSTRACT
This study aims to predict in-hospital and 6-month mortality, as well as 30-day and 90-day hospital readmission, using a machine learning (ML) approach based on conventional features. A total of 737 patients remained after applying the exclusion criteria to 1101 heart failure patients. Thirty-four conventional features were collected for each patient. First, the data were divided into train and test cohorts with a 70-30% ratio. Then the train data were normalized using the Z-score method, and the train mean and standard deviation were applied to the test data. Subsequently, Boruta, RFE, and MRMR feature selection methods were utilized to select the most important features in the training set. In the next step, eight ML approaches were used for modeling. Next, hyperparameters were optimized using tenfold cross-validation and grid search in the train dataset. All model development steps (normalization, feature selection, and hyperparameter optimization) were performed on the train set without touching the hold-out test set. Then, bootstrapping was done 1000 times on the hold-out test data. Finally, the obtained results were evaluated using four metrics: area under the ROC curve (AUC), accuracy (ACC), specificity (SPE), and sensitivity (SEN). The RFE-LR (AUC: 0.91, ACC: 0.84, SPE: 0.84, SEN: 0.83) and Boruta-LR (AUC: 0.90, ACC: 0.85, SPE: 0.85, SEN: 0.83) models generated the best results in terms of in-hospital mortality. In terms of 30-day rehospitalization, Boruta-SVM (AUC: 0.73, ACC: 0.81, SPE: 0.85, SEN: 0.50) and MRMR-LR (AUC: 0.71, ACC: 0.68, SPE: 0.69, SEN: 0.63) models performed the best. The best model for 3-month rehospitalization was MRMR-KNN (AUC: 0.60, ACC: 0.63, SPE: 0.66, SEN: 0.53) and regarding 6-month mortality, the MRMR-LR (AUC: 0.61, ACC: 0.63, SPE: 0.44, SEN: 0.66) and MRMR-NB (AUC: 0.59, ACC: 0.61, SPE: 0.48, SEN: 0.63) models outperformed the others.
Reliable models were developed for 30-day rehospitalization and in-hospital mortality using conventional features and ML techniques. Such models can support personalized treatment, decision-making, and better budget allocation. Results for the 3-month rehospitalization and 6-month mortality endpoints were modest, and further experiments with additional information are needed to obtain promising results for these endpoints.
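The normalization step described in this abstract, where the Z-score parameters are estimated on the train cohort and then reused unchanged on the test cohort, can be sketched as follows. The function names and the toy "age" column are hypothetical, and the use of the population standard deviation is an assumption; the abstract does not say which estimator was used.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def fit_zscore(train_column: List[float]) -> Tuple[float, float]:
    """Estimate mean and standard deviation on the training data only."""
    mu = mean(train_column)
    sigma = pstdev(train_column)  # assumption: population standard deviation
    if sigma == 0:
        sigma = 1.0  # constant feature: avoid division by zero
    return mu, sigma


def apply_zscore(column: List[float], mu: float, sigma: float) -> List[float]:
    """Scale any cohort with parameters that were fitted elsewhere."""
    return [(x - mu) / sigma for x in column]


# Hypothetical single feature (age in years) for the two cohorts.
train_age = [50.0, 60.0, 70.0]
test_age = [60.0, 80.0]

mu, sigma = fit_zscore(train_age)
train_scaled = apply_zscore(train_age, mu, sigma)
test_scaled = apply_zscore(test_age, mu, sigma)  # no test statistics are used
print(mu, sigma, test_scaled)
```

Reusing the train parameters keeps information from the hold-out set out of model development, which is the point the abstract makes about not touching the test set.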
Subjects
Heart Failure , Patient Readmission , Humans , Hospital Mortality , Machine Learning
ABSTRACT
PURPOSE: The electrocardiogram (ECG) has been extensively used to detect rhythm disturbances. We sought to determine the accuracy of different machine learning algorithms in distinguishing abnormal ECGs from normal ones in children who were examined using a resting 12-lead ECG machine, and we also compared manual measurement of ECG features with automated measurement using the Modular ECG Analysis System (MEANS) algorithm. METHODS: Altogether, 10745 ECGs were recorded for students aged 6 to 18. Manual and automatic ECG features were extracted for each participant. Features were normalized using Z-score normalization and tested for relevance using the Student's t-test and chi-squared test. We applied the Boruta algorithm for feature selection and then implemented eight classifier algorithms. The dataset was split into training (80%) and test (20%) partitions. The performance of the classifiers was evaluated on the unseen test data using 1000 bootstrap resamples, and sensitivity (SEN), specificity (SPE), AUC, and accuracy (ACC) were reported. RESULTS: In univariate analysis, the best-performing features were heart rate and RR interval in the manual dataset and heart rate in the automated dataset, with AUCs of 0.72 and 0.71, respectively. The best classifiers in the manual dataset were random forest (RF) and quadratic discriminant analysis (QDA) with AUC, ACC, SEN, and SPE equal to 0.93, 0.98, 0.69, 0.99, and 0.90, 0.95, 0.75, 0.96, respectively. In the automated dataset, QDA (AUC: 0.89, ACC: 0.92, SEN: 0.71, SPE: 0.93) and stack learning (SL) (AUC: 0.89, ACC: 0.96, SEN: 0.61, SPE: 0.99) reached the best performances. CONCLUSION: This study demonstrated that manual measurement of 12-lead ECG features performed better than automated measurement (MEANS algorithm), but some classifiers showed promising results in discriminating between normal and abnormal cases.
Further studies can help us evaluate the applicability and efficacy of machine-learning approaches for distinguishing abnormal ECGs in community-based investigations in both adults and children.
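The evaluation protocol in this abstract, resampling the held-out test set 1000 times and reporting SEN, SPE, and ACC, can be sketched as below. This is an illustrative percentile bootstrap under assumed names and toy labels, not the authors' code; AUC is omitted because it needs continuous scores rather than hard predictions.

```python
import random
from typing import Dict, List, Sequence, Tuple


def confusion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy for binary labels (1 = abnormal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sen = tp / (tp + fn) if tp + fn else float("nan")
    spe = tn / (tn + fp) if tn + fp else float("nan")
    return {"SEN": sen, "SPE": spe, "ACC": (tp + tn) / len(y_true)}


def bootstrap_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], n_boot: int = 1000, seed: int = 0
) -> Dict[str, Tuple[float, float, float]]:
    """Return (mean, 2.5th percentile, 97.5th percentile) per metric."""
    rng = random.Random(seed)
    n = len(y_true)
    samples: Dict[str, List[float]] = {"SEN": [], "SPE": [], "ACC": []}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        m = confusion_metrics([y_true[i] for i in idx], [y_pred[i] for i in idx])
        for name, value in m.items():
            if value == value:  # skip NaN from resamples that lack one class
                samples[name].append(value)
    summary = {}
    for name, values in samples.items():
        if not values:
            continue
        values.sort()
        lo = values[int(0.025 * (len(values) - 1))]
        hi = values[int(0.975 * (len(values) - 1))]
        summary[name] = (sum(values) / len(values), lo, hi)
    return summary


# Hypothetical test-set labels and predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
point = confusion_metrics(y_true, y_pred)
intervals = bootstrap_metrics(y_true, y_pred)
print(point, intervals)
```

Because only the test set is resampled, the spread reflects uncertainty in the evaluation, not variability from retraining the classifier.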
Subjects
Algorithms , Machine Learning , Adult , Child , Humans , Adolescent , Cohort Studies , Cardiac Arrhythmias/diagnosis , Electrocardiography/methods
ABSTRACT
PURPOSE: Glioblastoma Multiforme (GBM) represents the predominant aggressive primary tumor of the brain with short overall survival (OS) time. We aim to assess the potential of radiomic features in predicting the time-to-event OS of patients with GBM using machine learning (ML) algorithms. MATERIALS AND METHODS: One hundred nineteen patients with GBM, who had T1-weighted contrast-enhanced and T2-FLAIR MRI sequences, along with clinical data and survival time, were enrolled. Image preprocessing methods included 64 bin discretization, Laplacian of Gaussian (LOG) filters with three Sigma values and eight variations of Wavelet Transform. Images were then segmented, followed by the extraction of 1212 radiomic features. Seven feature selection (FS) methods and six time-to-event ML algorithms were utilized. The combination of preprocessing, FS, and ML algorithms (12 × 7 × 6 = 504 models) was evaluated by multivariate analysis. RESULTS: Our multivariate analysis showed that the best prognostic FS/ML combinations are the Mutual Information (MI)/Cox Boost, MI/Generalized Linear Model Boosting (GLMB) and MI/Generalized Linear Model Network (GLMN), all of which were done via the LOG (Sigma = 1 mm) preprocessing method (C-index = 0.77). The LOG filter with Sigma = 1 mm preprocessing method, MI, GLMB and GLMN achieved significantly higher C-indices than other preprocessing, FS, and ML methods (all p values < 0.05, mean C-indices of 0.65, 0.70, and 0.64, respectively). CONCLUSION: ML algorithms are capable of predicting the time-to-event OS of patients using MRI-based radiomic and clinical features. MRI-based radiomics analysis in combination with clinical variables might appear promising in assisting clinicians in the survival prediction of patients with GBM. Further research is needed to establish the applicability of radiomics in the management of GBM in the clinic.
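The C-index used to rank the time-to-event models can be illustrated with a short sketch of Harrell's concordance index. The function name and the four-patient toy data are hypothetical, and ties in survival time are simply skipped here, which is a simplification; established survival packages handle ties and censoring more carefully.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float], events: Sequence[int], risks: Sequence[float]
) -> float:
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event (events[i] == 1). Higher risk should mean
    shorter survival. Pairs tied on time are ignored (simplification).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort: survival in months, event flag, predicted risk.
times = [5.0, 10.0, 15.0, 20.0]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported C-index of 0.77.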
Subjects
Brain Neoplasms , Glioblastoma , Humans , Glioblastoma/pathology , Magnetic Resonance Imaging/methods , Brain/pathology , Prognosis , Signal Transducing Adaptor Proteins
ABSTRACT
Heart failure caused by iron deposits in the myocardium is the primary cause of mortality in beta-thalassemia major patients. Cardiac magnetic resonance imaging (CMRI) T2* is the primary screening technique used to detect myocardial iron overload, but inherently bears some limitations. In this study, we aimed to differentiate beta-thalassemia major patients with myocardial iron overload from those without myocardial iron overload (detected by T2*CMRI) based on radiomic features extracted from echocardiography images and machine learning (ML) in patients with normal left ventricular ejection fraction (LVEF > 55%) in echocardiography. Out of 91 cases, 44 patients with thalassemia major with normal LVEF (> 55%) and T2* ≤ 20 ms and 47 people with LVEF > 55% and T2* > 20 ms as the control group were included in the study. Radiomic features were extracted for each end-systolic (ES) and end-diastolic (ED) image. Then, three feature selection (FS) methods and six different classifiers were used. The models were evaluated using various metrics, including the area under the ROC curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE). Maximum relevance-minimum redundancy-eXtreme gradient boosting (MRMR-XGB) (AUC = 0.73, ACC = 0.73, SPE = 0.73, SEN = 0.73), ANOVA-MLP (AUC = 0.69, ACC = 0.69, SPE = 0.56, SEN = 0.83), and recursive feature elimination-K-nearest neighbors (RFE-KNN) (AUC = 0.65, ACC = 0.65, SPE = 0.64, SEN = 0.65) were the best models in ED, ES, and ED&ES datasets. Using radiomic features extracted from echocardiographic images and ML, it is feasible to predict cardiac problems caused by iron overload.
Subjects
Iron Overload , Thalassemia , Left Ventricular Dysfunction , beta-Thalassemia , Humans , beta-Thalassemia/complications , beta-Thalassemia/diagnostic imaging , Stroke Volume , Left Ventricular Function , Thalassemia/complications , Thalassemia/diagnostic imaging , Myocardium , Echocardiography/methods , Iron Overload/complications , Iron Overload/diagnostic imaging , Magnetic Resonance Imaging/methods , Left Ventricular Dysfunction/etiology , Left Ventricular Dysfunction/complications
ABSTRACT
PURPOSE: Image artefacts continue to pose challenges in clinical molecular imaging, resulting in misdiagnoses, additional radiation doses to patients and financial costs. Mismatch and halo artefacts occur frequently in gallium-68 (68Ga)-labelled compounds whole-body PET/CT imaging. Correcting for these artefacts is not straightforward and requires algorithmic developments, given that conventional techniques have failed to address them adequately. In the current study, we employed differential privacy-preserving federated transfer learning (FTL) to manage clinical data sharing and tackle privacy issues for building centre-specific models that detect and correct artefacts present in PET images. METHODS: Altogether, 1413 patients with 68Ga prostate-specific membrane antigen (PSMA)/DOTA-TATE (TOC) PET/CT scans from 3 countries, including 8 different centres, were enrolled in this study. CT-based attenuation and scatter correction (CT-ASC) was used in all centres for quantitative PET reconstruction. Prior to model training, an experienced nuclear medicine physician reviewed all images to ensure the use of high-quality, artefact-free PET images (421 patients' images). A deep neural network (modified U2Net) was trained on 80% of the artefact-free PET images to utilize centre-based (CeBa), centralized (CeZe) and the proposed differential privacy FTL frameworks. Quantitative analysis was performed in 20% of the clean data (with no artefacts) in each centre. A panel of two nuclear medicine physicians conducted qualitative assessment of image quality, diagnostic confidence and image artefacts in 128 patients with artefacts (256 images for CT-ASC and FTL-ASC). RESULTS: The three approaches investigated in this study for 68Ga-PET imaging (CeBa, CeZe and FTL) resulted in a mean absolute error (MAE) of 0.42 ± 0.21 (CI 95%: 0.38 to 0.47), 0.32 ± 0.23 (CI 95%: 0.27 to 0.37) and 0.28 ± 0.15 (CI 95%: 0.25 to 0.31), respectively. 
Statistical analysis using the Wilcoxon test revealed significant differences between the three approaches, with FTL outperforming CeBa and CeZe (p-value < 0.05) in the clean test set. The qualitative assessment demonstrated that FTL-ASC significantly improved image quality and diagnostic confidence and decreased image artefacts, compared to CT-ASC in 68Ga-PET imaging. In addition, mismatch and halo artefacts were successfully detected and disentangled in the chest, abdomen and pelvic regions in 68Ga-PET imaging. CONCLUSION: The proposed approach benefits from using large datasets from multiple centres while preserving patient privacy. Qualitative assessment by nuclear medicine physicians showed that the proposed model correctly addressed two main challenging artefacts in 68Ga-PET imaging. This technique could be integrated in the clinic for 68Ga-PET imaging artefact detection and disentanglement using multicentric heterogeneous datasets.
Subjects
Positron Emission Tomography Computed Tomography , Prostatic Neoplasms , Male , Humans , Positron Emission Tomography Computed Tomography/methods , Artifacts , Gallium Radioisotopes , Privacy , Positron-Emission Tomography/methods , Machine Learning , Computer-Assisted Image Processing/methods
ABSTRACT
This study aimed to investigate the diagnostic performance of machine learning-based radiomics analysis to diagnose coronary artery disease status and risk from rest/stress myocardial perfusion imaging (MPI) single-photon emission computed tomography (SPECT). A total of 395 patients with suspected coronary artery disease who underwent 2-day stress-rest protocol MPI SPECT were enrolled in this study. The left ventricle myocardium, excluding the cardiac cavity, was manually delineated on rest and stress images to define a volume of interest. In addition to clinical features (age, sex, family history, diabetes status, smoking, and ejection fraction), a total of 118 radiomics features were extracted from rest and stress MPI SPECT images to establish different feature sets, including Rest-, Stress-, Delta-, and Combined-radiomics (all together) feature sets. The data were randomly divided into 80% and 20% subsets for training and testing, respectively. The performance of classifiers built from combinations of three feature selection methods and nine machine learning algorithms was evaluated for two different diagnostic tasks, including 1) normal/abnormal (no CAD vs. CAD) classification, and 2) low-risk/high-risk CAD classification. Different metrics, including the area under the ROC curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE), were reported for model evaluation. Overall, models built on the Stress feature set (compared to other feature sets) and models for the second task (compared to task 1 models) showed better performance. The Stress-mRMR-KNN (feature set-feature selection-classifier) reached the highest performance for task 1 with AUC, ACC, SEN, and SPE equal to 0.61, 0.63, 0.64, and 0.6, respectively. The Stress-Boruta-GB model achieved the highest performance for task 2 with AUC, ACC, SEN, and SPE of 0.79, 0.76, 0.75, and 0.76, respectively.
Diabetes status, from the clinical feature family, and dependence count non-uniformity normalized, from the NGLDM family (a measure of non-uniformity in the region of interest), were the most frequently selected features from the Stress feature set for CAD risk classification. This study revealed promising results for CAD risk classification using machine learning models built on MPI SPECT radiomics. The proposed models could help alleviate the labor-intensive MPI SPECT interpretation process regarding CAD status and can potentially expedite the diagnostic process.
Subjects
Coronary Artery Disease , Diabetes Mellitus , Myocardial Perfusion Imaging , Humans , Coronary Artery Disease/diagnostic imaging , Machine Learning , Single-Photon Emission Computed Tomography , Male , Female
ABSTRACT
PURPOSE: In Parkinson's disease (PD), 5-10% of cases are of genetic origin with mutations identified in several genes such as leucine-rich repeat kinase 2 (LRRK2) and glucocerebrosidase (GBA). We aim to predict these two gene mutations using hybrid machine learning systems (HMLS), via imaging and non-imaging data, with the long-term goal of predicting conversion to active disease. METHODS: We studied 264 and 129 patients with known LRRK2 and GBA mutation status, respectively, from the PPMI database. Each dataset includes 513 features such as clinical features (CFs), conventional imaging features (CIFs) and radiomic features (RFs) extracted from DAT-SPECT images. Features, normalized by Z-score, were univariately analyzed for statistical significance by the t-test and chi-square test, adjusted by Benjamini-Hochberg correction. Multiple HMLSs, including 11 feature extraction algorithms (FEA) or 10 feature selection algorithms (FSA) linked with 21 classifiers, were utilized. We also employed Ensemble Voting (EV) to classify the genes. RESULTS: For prediction of LRRK2 mutation status, a number of HMLSs resulted in accuracies of 0.98 ± 0.02 and 1.00 in 5-fold cross-validation (80% out of total data points) and external testing (remaining 20%), respectively. For predicting GBA mutation status, multiple HMLSs resulted in high accuracies of 0.90 ± 0.08 and 0.96 in 5-fold cross-validation and external testing, respectively. We additionally showed that SPECT-based RFs added value to the specific prediction of GBA mutation status. CONCLUSION: We demonstrated that combining medical information with SPECT-based imaging features, and optimal utilization of HMLS, can produce excellent prediction of mutation status in PD patients.
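The Benjamini-Hochberg adjustment applied to the univariate tests in this abstract can be written in a few lines. The sketch below is a generic step-up implementation with hypothetical p-values; it is not taken from the study, and the function name is an assumption.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input.

    Each p-value is scaled by m / rank (rank 1 = smallest), then a running
    minimum from the largest rank downward enforces monotonicity.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four features.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
print(adj)
```

With 513 features per dataset, controlling the false discovery rate in this way limits the number of features flagged as significant purely by chance.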
Subjects
Parkinson Disease , Humans , Parkinson Disease/diagnostic imaging , Parkinson Disease/genetics , Leucine-Rich Repeat Serine-Threonine Protein Kinase-2/genetics , Mutation/genetics , Single-Photon Emission Computed Tomography , Glucosylceramidase/genetics
ABSTRACT
OBJECTIVES: This study aims to use ultrasound-derived features as biomarkers to assess the malignancy of thyroid nodules in patients who were candidates for FNA according to the ACR TI-RADS guidelines. METHODS: Two hundred and ten patients who met the selection criteria were enrolled in the study and subjected to ultrasound-guided FNA of thyroid nodules. Different radiomics features were extracted from sonographic images, including intensity, shape, and texture feature sets. Least Absolute Shrinkage and Selection Operator (LASSO), Minimum Redundancy Maximum Relevance (MRMR), and Random Forests/Extreme Gradient Boosting Machine (XGBoost) algorithms were used for feature selection and classification of the univariate and multivariate modeling, respectively. Models were evaluated using accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). RESULTS: In the univariate analysis, Gray Level Run Length Matrix - Run-Length Non-Uniformity (GLRLM-RLNU) and gray-level zone length matrix - Run-Length Non-Uniformity (GLZLM-GLNU) (both with an AUC of 0.67) were the top-performing features for predicting nodule malignancy. In the multivariate analysis of the training dataset, the AUC of all combinations of feature selection algorithms and classifiers was 0.99, and the highest sensitivity was obtained with the XGBoost classifier and the MRMR feature selection algorithm (0.99). Finally, the test dataset was used to evaluate our model, in which the XGBoost classifier with MRMR and LASSO feature selection algorithms had the highest performance (AUC = 0.95). CONCLUSIONS: Ultrasound-extracted features can be used as non-invasive biomarkers for predicting the malignancy of thyroid nodules.