ABSTRACT
BACKGROUND: Overall Survival (OS) and Progression-Free Survival (PFS) analyses are crucial metrics for evaluating the efficacy and impact of treatment. This study evaluated the role of clinical biomarkers and dosimetry parameters on survival outcomes of patients undergoing 90Y selective internal radiation therapy (SIRT). MATERIALS/METHODS: This preliminary and retrospective analysis included 17 patients with hepatocellular carcinoma (HCC) treated with 90Y SIRT. The patients underwent personalized treatment planning and voxel-wise dosimetry. After the procedure, the OS and PFS were evaluated. Three structures were delineated including tumoral liver (TL), normal perfused liver (NPL), and whole normal liver (WNL). 289 dose-volume constraints (DVCs) were extracted from dose-volume histograms of physical and biological effective dose (BED) maps calculated on 99mTc-MAA and 90Y SPECT/CT images. Subsequently, the DVCs and 16 clinical biomarkers were used as features for univariate and multivariate analysis. Cox proportional hazard ratio (HR) was employed for univariate analysis. HR and the concordance index (C-Index) were calculated for each feature. Using eight different strategies, a cross-combination of various models and feature selection (FS) methods was applied for multivariate analysis. The performance of each model was assessed using an averaged C-Index on a three-fold nested cross-validation framework. The Kaplan-Meier (KM) curve was employed for univariate and machine learning (ML) model performance assessment. RESULTS: The median OS was 11 months [95% CI: 8.5, 13.09], whereas the PFS was seven months [95% CI: 5.6, 10.98]. Univariate analysis identified the presence of ascites (HR: 9.2 [1.8, 47]), the aim of SIRT (segmentectomy, lobectomy, palliative) (HR: 0.066 [0.0057, 0.78]), aspartate aminotransferase (AST) level (HR: 0.1 [0.012, 0.86]), and MAA-Dose-V205(%)-TL (HR: 8.5 [1, 72]) as predictors of OS. 
90Y-derived parameters were associated with PFS but not with OS. Among dosimetry parameters, MAA-Dose-V205(%)-WNL and MAA-BED-V400(%)-WNL (HR: 13 [1.5, 120]) and 90Y-Dose-mean-TL, 90Y-D50-TL-Gy, 90Y-Dose-V205(%)-TL, 90Y-Dose-D50-TL-Gy, and 90Y-BED-V400(%)-TL (HR: 15 [1.8, 120]) were highly associated with PFS. The highest C-index observed in multivariate analysis using ML was 0.94 ± 0.13, obtained with Variable Hunting-variable-importance (VH.VIMP) FS and a Cox Proportional Hazard model predicting OS from clinical features. However, the combination of the VH.VIMP FS method with a Generalized Linear Model Network model predicting OS from Therapy strategy features outperformed the other models in terms of both C-index and stratification of KM curves (C-Index: 0.93 ± 0.14; log-rank p-value of 0.023 for KM curve stratification). CONCLUSION: This preliminary study confirmed the role played by baseline clinical biomarkers and dosimetry parameters in predicting the treatment outcome, paving the way for the establishment of a dose-effect relationship. In addition, the feasibility of using ML along with these features was demonstrated as a helpful tool in the clinical management of patients, both prior to and following 90Y-SIRT.
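The concordance index (C-index) used to score the survival models can be illustrated with a minimal sketch of Harrell's pairwise definition. This is an assumption about the variant: the abstract does not say which C-index estimator or software was used, the handling of tied times here is a simplification, and the toy data are hypothetical rather than taken from the study.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs that the risk score orders correctly."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times; a pair is comparable only if the
        # shorter time corresponds to an observed event (not a censored one).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk failed earlier: correct ordering
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, event flag (1 = event), model risk.
times = [5, 8, 11, 13, 20]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.7, 0.4, 0.8, 0.1]
c_index = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported values of 0.93 to 0.94 should be read.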
Subject(s)
Hepatocellular Carcinoma , Liver Neoplasms , Machine Learning , Microspheres , Precision Medicine , Radiometry , Yttrium Radioisotopes , Humans , Male , Female , Yttrium Radioisotopes/therapeutic use , Middle Aged , Liver Neoplasms/radiotherapy , Liver Neoplasms/diagnostic imaging , Aged , Hepatocellular Carcinoma/radiotherapy , Hepatocellular Carcinoma/diagnostic imaging , Precision Medicine/methods , Progression-Free Survival , Retrospective Studies , Glass , Tumor Biomarkers
ABSTRACT
BACKGROUND: Few studies have addressed the detection of rhythm abnormalities among healthy children and adolescents. The aim of the study was to investigate the prevalence of abnormal electrocardiographic findings in the young Iranian population and its association with blood pressure and obesity. METHODS: A total of 15084 children and adolescents were examined in a randomly selected population of Tehran city, Iran, between October 2017 and December 2018. Anthropometric values and blood pressure measurements were also assessed. A standard 12-lead electrocardiogram was recorded with a single recorder, and the recordings were reviewed by electrophysiologists. RESULTS: The mean age of the students was 12.3 ± 3.1 years (range 6-18 years), and 52% were boys. A total of 2900 students (192.2/1000 persons; 95% confidence interval 186-198.6) had electrocardiographic abnormalities. The rate of electrocardiographic abnormalities was higher in boys than in girls (p < 0.001). Electrocardiographic abnormalities were significantly more frequent in thin than in obese students (p < 0.001), and there was a trend towards more electrocardiographic abnormalities in hypertensive than in normotensive individuals (p = 0.063). Based on the multivariable analysis, individuals with electrocardiographic abnormalities were less likely to be girls (odds ratio 0.745, 95% confidence interval 0.682-0.814) and had a lower body mass index (odds ratio 0.961, 95% confidence interval 0.944-0.979). CONCLUSIONS: In this large-scale study, there was a high prevalence of electrocardiographic abnormalities among the young population. In addition, electrocardiographic findings were significantly influenced by increasing age, sex, obesity, and blood pressure levels. This community-based study revealed the implications of electrocardiographic screening to improve the care delivery by early detection.
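The reported prevalence and its confidence interval can be checked from the two counts given in the abstract (2900 of 15084). The sketch below uses a Wilson score interval, which is an assumption: the abstract does not state which interval method the authors used.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the abstract.
cases, total = 2900, 15084
prevalence_per_1000 = 1000 * cases / total
low_per_1000, high_per_1000 = (1000 * bound for bound in wilson_interval(cases, total))
```

Under this assumption the result is close to the published figures of roughly 192 per 1000 with bounds near 186 and 198.6.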
Subject(s)
Blood Pressure , Electrocardiography , Humans , Iran/epidemiology , Male , Female , Adolescent , Child , Prevalence , Blood Pressure/physiology , Hypertension/epidemiology , Cross-Sectional Studies , Pediatric Obesity/epidemiology , Cardiac Arrhythmias/epidemiology , Cardiac Arrhythmias/diagnosis , Body Mass Index , Risk Factors
ABSTRACT
PURPOSE: Image artefacts continue to pose challenges in clinical molecular imaging, resulting in misdiagnoses, additional radiation doses to patients and financial costs. Mismatch and halo artefacts occur frequently in gallium-68 (68Ga)-labelled compounds whole-body PET/CT imaging. Correcting for these artefacts is not straightforward and requires algorithmic developments, given that conventional techniques have failed to address them adequately. In the current study, we employed differential privacy-preserving federated transfer learning (FTL) to manage clinical data sharing and tackle privacy issues for building centre-specific models that detect and correct artefacts present in PET images. METHODS: Altogether, 1413 patients with 68Ga prostate-specific membrane antigen (PSMA)/DOTA-TATE (TOC) PET/CT scans from 3 countries, including 8 different centres, were enrolled in this study. CT-based attenuation and scatter correction (CT-ASC) was used in all centres for quantitative PET reconstruction. Prior to model training, an experienced nuclear medicine physician reviewed all images to ensure the use of high-quality, artefact-free PET images (421 patients' images). A deep neural network (modified U2Net) was trained on 80% of the artefact-free PET images to utilize centre-based (CeBa), centralized (CeZe) and the proposed differential privacy FTL frameworks. Quantitative analysis was performed in 20% of the clean data (with no artefacts) in each centre. A panel of two nuclear medicine physicians conducted qualitative assessment of image quality, diagnostic confidence and image artefacts in 128 patients with artefacts (256 images for CT-ASC and FTL-ASC). RESULTS: The three approaches investigated in this study for 68Ga-PET imaging (CeBa, CeZe and FTL) resulted in a mean absolute error (MAE) of 0.42 ± 0.21 (CI 95%: 0.38 to 0.47), 0.32 ± 0.23 (CI 95%: 0.27 to 0.37) and 0.28 ± 0.15 (CI 95%: 0.25 to 0.31), respectively. 
Statistical analysis using the Wilcoxon test revealed significant differences between the three approaches, with FTL outperforming CeBa and CeZe (p-value < 0.05) in the clean test set. The qualitative assessment demonstrated that FTL-ASC significantly improved image quality and diagnostic confidence and decreased image artefacts, compared to CT-ASC in 68Ga-PET imaging. In addition, mismatch and halo artefacts were successfully detected and disentangled in the chest, abdomen and pelvic regions in 68Ga-PET imaging. CONCLUSION: The proposed approach benefits from using large datasets from multiple centres while preserving patient privacy. Qualitative assessment by nuclear medicine physicians showed that the proposed model correctly addressed two main challenging artefacts in 68Ga-PET imaging. This technique could be integrated in the clinic for 68Ga-PET imaging artefact detection and disentanglement using multicentric heterogeneous datasets.
Subject(s)
Positron Emission Tomography Computed Tomography , Prostatic Neoplasms , Male , Humans , Positron Emission Tomography Computed Tomography/methods , Artifacts , Gallium Radioisotopes , Privacy , Positron-Emission Tomography/methods , Machine Learning , Computer-Assisted Image Processing/methods
ABSTRACT
BACKGROUND: Patients' rights are integral to medical ethics. This study aimed to perform sentiment analysis and opinion mining on patients' messages by a combination of lexicon-based and machine learning methods to identify positive or negative comments and to determine the different ward and staff names mentioned in patients' messages. METHODS: The level of satisfaction and observance of the rights of 250 service recipients of the hospital was evaluated through the related checklists by the evaluator. In total, 822 Persian messages, composed of 540 negative and 282 positive comments, were collected and labeled by the evaluator. Pre-processing was performed on the messages and followed by 2 feature vectors which were extracted from the messages, including the term frequency-inverse document frequency (TFIDF) vector and a combination of the multifeature (MF) (a lexicon-based method) and TFIDF (MF + TFIDF) vectors. Six feature selectors and 5 classifiers were used in this study. For the evaluations, 5-fold cross-validation with different metrics including area under the receiver operating characteristic curve (AUC), accuracy (ACC), F1 score, sensitivity (SEN), specificity (SPE) and Precision-Recall Curves (PRC) were reported. Message tag detection, which featured different hospital wards and identified staff names mentioned in the study patients' messages, was implemented by the lexicon-based method. RESULTS: The best classifier was Multinomial Naïve Bayes in combination with MF + TFIDF feature vector and SelectFromModel (SFM) feature selection (ACC = 0.89 ± 0.03, AUC = 0.87 ± 0.03, F1 = 0.92 ± 0.03, SEN = 0.93 ± 0.04, and SPE = 0.82 ± 0.02, PRC-AUC = 0.97). Two methods of assessment by the evaluator and artificial intelligence as well as survey systems were compared. 
CONCLUSION: Our results demonstrated that the lexicon-based method, in combination with machine learning classifiers, could extract sentiments in patients' comments and classify them into positive and negative categories. We also developed an online survey system to analyze patients' satisfaction in different wards and to remove conventional assessments by the evaluator.
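The classification metrics reported above (ACC, F1, SEN, SPE) all derive from the four cells of a binary confusion matrix. The following is a minimal sketch with hypothetical counts; the abstract does not give the confusion matrix or say which class was treated as positive.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "ACC": accuracy,
        "SEN": sensitivity,
        "SPE": specificity,
        "precision": precision,
        "F1": f1,
    }


# Hypothetical counts for illustration only; not taken from the study.
metrics = binary_metrics(tp=90, fp=10, tn=40, fn=10)
```

With imbalanced classes, as in the 540 negative versus 282 positive messages, accuracy alone can be misleading, which is one reason to report sensitivity and specificity alongside it.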
Subject(s)
Artificial Intelligence , Patient Satisfaction , Humans , Bayes Theorem , Machine Learning , ROC Curve
ABSTRACT
PURPOSE: Glioblastoma Multiforme (GBM) represents the predominant aggressive primary tumor of the brain with short overall survival (OS) time. We aim to assess the potential of radiomic features in predicting the time-to-event OS of patients with GBM using machine learning (ML) algorithms. MATERIALS AND METHODS: One hundred nineteen patients with GBM, who had T1-weighted contrast-enhanced and T2-FLAIR MRI sequences, along with clinical data and survival time, were enrolled. Image preprocessing methods included 64 bin discretization, Laplacian of Gaussian (LOG) filters with three Sigma values and eight variations of Wavelet Transform. Images were then segmented, followed by the extraction of 1212 radiomic features. Seven feature selection (FS) methods and six time-to-event ML algorithms were utilized. The combination of preprocessing, FS, and ML algorithms (12 × 7 × 6 = 504 models) was evaluated by multivariate analysis. RESULTS: Our multivariate analysis showed that the best prognostic FS/ML combinations are the Mutual Information (MI)/Cox Boost, MI/Generalized Linear Model Boosting (GLMB) and MI/Generalized Linear Model Network (GLMN), all of which were done via the LOG (Sigma = 1 mm) preprocessing method (C-index = 0.77). The LOG filter with Sigma = 1 mm preprocessing method, MI, GLMB and GLMN achieved significantly higher C-indices than other preprocessing, FS, and ML methods (all p values < 0.05, mean C-indices of 0.65, 0.70, and 0.64, respectively). CONCLUSION: ML algorithms are capable of predicting the time-to-event OS of patients using MRI-based radiomic and clinical features. MRI-based radiomics analysis in combination with clinical variables might appear promising in assisting clinicians in the survival prediction of patients with GBM. Further research is needed to establish the applicability of radiomics in the management of GBM in the clinic.
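The figure of 504 models follows from a full grid over preprocessing, feature selection and learner choices. One plausible reading of the 12 preprocessing variants is one 64-bin discretization, three LOG filters and eight wavelet variants; this breakdown is an assumption, and every label below is a placeholder rather than a name from the study.

```python
from itertools import product

# Placeholder labels; the breakdown 1 + 3 + 8 = 12 is an assumed reading.
preprocessing = (
    ["bin64"]
    + [f"LOG_filter_{k}" for k in range(1, 4)]
    + [f"wavelet_{k}" for k in range(1, 9)]
)
feature_selectors = [f"FS_{k}" for k in range(1, 8)]  # seven FS methods
learners = [f"ML_{k}" for k in range(1, 7)]  # six time-to-event algorithms

# Every (preprocessing, FS, ML) triple is one candidate model.
model_grid = list(product(preprocessing, feature_selectors, learners))
```

Each triple in `model_grid` would then be trained and scored by C-index, so the grid size directly sets the number of comparisons behind the reported p values.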
Subject(s)
Brain Neoplasms , Glioblastoma , Humans , Glioblastoma/pathology , Magnetic Resonance Imaging/methods , Brain/pathology , Prognosis , Signal Transducing Adaptor Proteins
ABSTRACT
Heart failure caused by iron deposits in the myocardium is the primary cause of mortality in beta-thalassemia major patients. Cardiac magnetic resonance imaging (CMRI) T2* is the primary screening technique used to detect myocardial iron overload, but inherently bears some limitations. In this study, we aimed to differentiate beta-thalassemia major patients with myocardial iron overload from those without myocardial iron overload (detected by T2*CMRI) based on radiomic features extracted from echocardiography images and machine learning (ML) in patients with normal left ventricular ejection fraction (LVEF > 55%) in echocardiography. Out of 91 cases, 44 patients with thalassemia major with normal LVEF (> 55%) and T2* ≤ 20 ms and 47 people with LVEF > 55% and T2* > 20 ms as the control group were included in the study. Radiomic features were extracted for each end-systolic (ES) and end-diastolic (ED) image. Then, three feature selection (FS) methods and six different classifiers were used. The models were evaluated using various metrics, including the area under the ROC curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE). Maximum relevance-minimum redundancy-eXtreme gradient boosting (MRMR-XGB) (AUC = 0.73, ACC = 0.73, SPE = 0.73, SEN = 0.73), ANOVA-MLP (AUC = 0.69, ACC = 0.69, SPE = 0.56, SEN = 0.83), and recursive feature elimination-K-nearest neighbors (RFE-KNN) (AUC = 0.65, ACC = 0.65, SPE = 0.64, SEN = 0.65) were the best models in ED, ES, and ED&ES datasets. Using radiomic features extracted from echocardiographic images and ML, it is feasible to predict cardiac problems caused by iron overload.
Subject(s)
Iron Overload , Thalassemia , Left Ventricular Dysfunction , beta-Thalassemia , Humans , beta-Thalassemia/complications , beta-Thalassemia/diagnostic imaging , Stroke Volume , Left Ventricular Function , Thalassemia/complications , Thalassemia/diagnostic imaging , Myocardium , Echocardiography/methods , Iron Overload/complications , Iron Overload/diagnostic imaging , Magnetic Resonance Imaging/methods , Left Ventricular Dysfunction/etiology , Left Ventricular Dysfunction/complications
ABSTRACT
A U-shaped contraction pattern was shown to be associated with a better Cardiac resynchronization therapy (CRT) response. The main goal of this study is to automatically recognize left ventricular contractile patterns using machine learning algorithms trained on conventional quantitative features (ConQuaFea) and radiomic features extracted from Gated single-photon emission computed tomography myocardial perfusion imaging (GSPECT MPI). Among 98 patients with standard resting GSPECT MPI included in this study, 29 received CRT therapy and 69 did not (also had CRT inclusion criteria but did not receive treatment yet at the time of data collection, or refused treatment). A total of 69 non-CRT patients were employed for training, and the 29 were employed for testing. The models were built utilizing features from three distinct feature sets (ConQuaFea, radiomics, and ConQuaFea + radiomics (combined)), which were chosen using Recursive feature elimination (RFE) feature selection (FS), and then trained using seven different machine learning (ML) classifiers. In addition, CRT outcome prediction was assessed by different treatment inclusion criteria as the study's final phase. The MLP classifier had the highest performance among ConQuaFea models (AUC, SEN, SPE = 0.80, 0.85, 0.76). RF achieved the best performance in terms of AUC, SEN, and SPE with values of 0.65, 0.62, and 0.68, respectively, among radiomic models. GB and RF approaches achieved the best AUC, SEN, and SPE values of 0.78, 0.92, and 0.63 and 0.74, 0.93, and 0.56, respectively, among the combined models. A promising outcome was obtained when using radiomic and ConQuaFea from GSPECT MPI to detect left ventricular contractile patterns by machine learning.
Subject(s)
Myocardial Perfusion Imaging , Humans , Single-Photon Emission Computed Tomography , Machine Learning , Algorithms , Perfusion
ABSTRACT
BACKGROUND: The aim of this work was to assess the robustness of cardiac SPECT radiomic features against changes in imaging settings, including acquisition and reconstruction parameters. METHODS: Four commercial SPECT and SPECT/CT cameras were used to acquire images of a static cardiac phantom mimicking typical myocardial perfusion imaging using 185 MBq of 99mTc. The effects of different image acquisition and reconstruction parameters, including number of views, view matrix size, attenuation correction, as well as image reconstruction related parameters (algorithm, number of iterations, number of subsets, type of post-reconstruction filter, and its associated parameters, including filter order and cut-off frequency) were studied. In total, 5,063 transverse views were reconstructed by varying the aforementioned factors. Eighty-seven radiomic features including first-, second-, and high-order textures were extracted from these images. To assess reproducibility and repeatability, the coefficient of variation (COV), as a widely adopted metric, was measured for each of the radiomic features over the different imaging settings. RESULTS: The Inverse Difference Moment Normalized (IDMN) and Inverse Difference Normalized (IDN) features from the Gray Level Co-occurrence Matrix (GLCM), Run Percentage (RP) from the Gray Level Run Length Matrix (GLRLM), Zone Entropy (ZE) from the Gray Level Size Zone Matrix (GLSZM), and Dependence Entropy (DE) from the Gray Level Dependence Matrix (GLDM) feature sets were the only features that exhibited high reproducibility (COV ≤ 5%) against changes in all imaging settings. In addition, Large Area Low Gray Level Emphasis (LALGLE), Small Area Low Gray Level Emphasis (SALGLE) and Low Gray Level Zone Emphasis (LGLZE) from GLSZM, and Small Dependence Low Gray Level Emphasis (SDLGLE) from GLDM feature sets turned out to be less reproducible (COV > 20%) against changes in imaging settings. 
The GLRLM feature set had the largest share of highly reproducible features (31.88% with COV < 5%), whereas the GLDM feature set had the largest share of poorly reproducible features (54.2% with COV > 20%). Matrix size had the largest impact on feature variability: most features were not repeatable when the matrix size was modified, with 82.8% of them having a COV > 20%. CONCLUSION: The repeatability and reproducibility of SPECT/CT cardiac radiomic features under different imaging settings is feature-dependent. Different image acquisition and reconstruction protocols have variable effects on radiomic features. The radiomic features exhibiting low COV are potential candidates for future clinical studies.
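The coefficient of variation and the two thresholds quoted in the abstract (COV ≤ 5% for high reproducibility, COV > 20% for low) can be sketched as follows. The use of the sample standard deviation, the "intermediate" label for values in between, and the example feature values are assumptions made for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """COV in percent: sample standard deviation divided by the absolute mean."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("COV is undefined for a zero mean")
    return 100 * statistics.stdev(values) / abs(mean)


def reproducibility_band(cov_percent):
    """Map a COV to the bands quoted in the abstract; 'intermediate' is an assumed label."""
    if cov_percent <= 5:
        return "high"
    if cov_percent > 20:
        return "low"
    return "intermediate"


# Hypothetical values of one feature across five imaging settings.
stable_feature = [100, 102, 98, 101, 99]
unstable_feature = [10, 20, 30]
stable_band = reproducibility_band(coefficient_of_variation(stable_feature))
unstable_band = reproducibility_band(coefficient_of_variation(unstable_feature))
```

Because COV divides by the mean, features whose mean is near zero can show inflated values, which is worth keeping in mind when comparing feature families.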
Subject(s)
Cardiac Imaging Techniques/methods , Computer-Assisted Image Processing , Imaging Phantoms , Single-Photon Emission Computed Tomography/methods , Humans , Reproducibility of Results , Single Photon Emission Computed Tomography Computed Tomography
ABSTRACT
OBJECTIVE: We demonstrate the feasibility of direct generation of attenuation and scatter-corrected images from uncorrected images (PET-nonASC) using deep residual networks in whole-body 18F-FDG PET imaging. METHODS: Two- and three-dimensional deep residual networks using 2D successive slices (DL-2DS), 3D slices (DL-3DS) and 3D patches (DL-3DP) as input were constructed to perform joint attenuation and scatter correction on uncorrected whole-body images in an end-to-end fashion. We included 1150 clinical whole-body 18F-FDG PET/CT studies, among which 900, 100 and 150 patients were randomly partitioned into training, validation and independent validation sets, respectively. The images generated by the proposed approach were assessed using various evaluation metrics, including the root-mean-squared-error (RMSE) and absolute relative error (ARE %) using CT-based attenuation and scatter-corrected (CTAC) PET images as reference. PET image quantification variability was also assessed through voxel-wise standardized uptake value (SUV) bias calculation in different regions of the body (head, neck, chest, liver-lung, abdomen and pelvis). RESULTS: Our proposed attenuation and scatter correction (Deep-JASC) algorithm provided good image quality, comparable with that produced by CTAC. Across the 150 patients of the independent external validation set, the voxel-wise REs (%) were -1.72 ± 4.22%, 3.75 ± 6.91% and -3.08 ± 5.64% for DL-2DS, DL-3DS and DL-3DP, respectively. Overall, the DL-2DS approach led to superior performance compared with the other two 3D approaches. The brain and neck regions had the highest and lowest RMSE values between Deep-JASC and CTAC images, respectively. However, the largest ARE was observed in the chest (15.16 ± 3.96%) and liver/lung (11.18 ± 3.23%) regions for DL-2DS. DL-3DS and DL-3DP performed slightly better in the chest region, leading to AREs of 11.16 ± 3.42% and 11.69 ± 2.71%, respectively (p value < 0.05). 
The joint histogram analysis resulted in correlation coefficients of 0.985, 0.980 and 0.981 for DL-2DS, DL-3DS and DL-3DP approaches, respectively. CONCLUSION: This work demonstrated the feasibility of direct attenuation and scatter correction of whole-body 18F-FDG PET images using emission-only data via a deep residual network. The proposed approach achieved accurate attenuation and scatter correction without the need for anatomical images, such as CT and MRI. The technique is applicable in a clinical setting on standalone PET or PET/MRI systems. Although Deep-JASC showed promising quantitative accuracy, vulnerability to noise was observed, leading to pseudo hot/cold spots and/or poor organ boundary definition in the resulting PET images.
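The voxel-wise error metrics named in the abstract (RMSE, relative error and absolute relative error against the CTAC reference) can be written out as a minimal sketch. The exact definitions used by the authors are not given, so the formulas below are the common ones and are an assumption; voxels with a zero reference value are skipped for the relative errors, and the SUV values are hypothetical.

```python
import math


def rmse(predicted, reference):
    """Root-mean-squared error between two equally long voxel lists."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def relative_errors(predicted, reference):
    """Mean signed RE (%) and mean absolute ARE (%), ignoring zero-reference voxels."""
    re = [100 * (p - r) / r for p, r in zip(predicted, reference) if r != 0]
    mean_re = sum(re) / len(re)
    mean_are = sum(abs(x) for x in re) / len(re)
    return mean_re, mean_are


# Hypothetical SUV values for four voxels: reference (CTAC) and network output.
reference_suv = [2.0, 4.0, 5.0, 10.0]
predicted_suv = [2.2, 3.6, 5.0, 9.0]
error_rmse = rmse(predicted_suv, reference_suv)
mean_re, mean_are = relative_errors(predicted_suv, reference_suv)
```

The signed RE can be close to zero while the ARE stays large, because positive and negative voxel errors cancel; this is why the abstract reports both.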
Subject(s)
Fluorodeoxyglucose F18 , Positron Emission Tomography Computed Tomography , Humans , Computer-Assisted Image Processing , Magnetic Resonance Imaging , Positron-Emission Tomography , X-Ray Computed Tomography
ABSTRACT
PURPOSE: To identify optimal classification methods for computed tomography (CT) radiomics-based preoperative prediction of clear cell renal cell carcinoma (ccRCC) grade. MATERIALS AND METHODS: Seventy-one ccRCC patients (31 low grade and 40 high grade) were included in this study. Tumors were manually segmented on CT images followed by the application of three image preprocessing techniques (Laplacian of Gaussian, wavelet filter, and discretization of the intensity values) on delineated tumor volumes. Overall, 2530 radiomics features (tumor shape and size, intensity statistics, and texture) were extracted from each segmented tumor volume. Univariate analysis was performed to assess the association between each feature and the histological condition. Multivariate analysis involved the use of machine learning (ML) algorithms and the following three feature selection algorithms: the least absolute shrinkage and selection operator, Student's t test, and minimum Redundancy Maximum Relevance. These selected features were then used to construct three classification models (SVM, random forest, and logistic regression) to discriminate high from low-grade ccRCC at nephrectomy. Lastly, multivariate model performance was evaluated on the bootstrapped validation cohort using the area under the receiver operating characteristic curve (AUC) metric. RESULTS: The univariate analysis demonstrated that among the different image sets, 128 bin-discretized images have statistically significant different texture parameters with a mean AUC of 0.74 ± 3 (q value < 0.05). The three ML-based classifiers showed proficient discrimination between high and low-grade ccRCC. The AUC was 0.78 for logistic regression, 0.62 for random forest, and 0.83 for the SVM model, respectively. CONCLUSION: CT radiomic features can be considered as a useful and promising noninvasive methodology for preoperative evaluation of ccRCC Fuhrman grades.
Subject(s)
Renal Cell Carcinoma/diagnostic imaging , Renal Cell Carcinoma/pathology , Kidney Neoplasms/diagnostic imaging , Kidney Neoplasms/pathology , Machine Learning , Computer-Assisted Radiographic Image Interpretation/methods , X-Ray Computed Tomography/methods , Female , Humans , Male , Middle Aged , Neoplasm Grading
ABSTRACT
BACKGROUND: Contrast-enhanced computed tomography (CECT) provides much more information compared to non-enhanced CT images, especially for the differentiation of malignancies, such as liver carcinomas. Contrast media injection phase information is usually missing on public datasets and not standardized in the clinic even in the same region and language. This is a barrier to effective use of available CECT images in clinical research. PURPOSE: The aim of this study is to detect contrast media injection phase from CT images by means of organ segmentation and machine learning algorithms. METHODS: A total number of 2509 CT images split into four subsets of non-contrast (class #0), arterial (class #1), venous (class #2), and delayed (class #3) after contrast media injection were collected from two CT scanners. Seven organs including the liver, spleen, heart, kidneys, lungs, urinary bladder, and aorta along with body contour masks were generated by pre-trained deep learning algorithms. Subsequently, five first-order statistical features including average, standard deviation, 10, 50, and 90 percentiles extracted from the above-mentioned masks were fed to machine learning models after feature selection and reduction to classify the CT images in one of four above mentioned classes. A 10-fold data split strategy was followed. The performance of our methodology was evaluated in terms of classification accuracy metrics. RESULTS: The best performance was achieved by Boruta feature selection and RF model with average area under the curve of more than 0.999 and accuracy of 0.9936 averaged over four classes and 10 folds. Boruta feature selection selected all predictor features. The lowest classification was observed for class #2 (0.9888), which is already an excellent result. In the 10-fold strategy, only 33 cases from 2509 cases (∼1.4%) were misclassified. The performance over all folds was consistent. 
CONCLUSIONS: We developed a fast, accurate, reliable, and explainable methodology to classify contrast media phases which may be useful in data curation and annotation in big online datasets or local datasets with non-standard or no series description. Our model containing two steps of deep learning and machine learning may help to exploit available datasets more effectively.
Subject(s)
Automation , Contrast Media , Computer-Assisted Image Processing , Machine Learning , X-Ray Computed Tomography , Humans , Computer-Assisted Image Processing/methods , Abdominal Radiography , Abdomen/diagnostic imaging
ABSTRACT
BACKGROUND: This study aimed to investigate the value of clinical features, radiomic features extracted from gross tumor volumes (GTVs) delineated on CT images, dose distributions (Dosiomics), and fusion of CT and dose distributions to predict outcomes in head and neck cancer (HNC) patients. METHODS: A cohort of 240 HNC patients from five different centers was obtained from The Cancer Imaging Archive. Seven strategies were applied, including four non-fusion strategies (Clinical, CT, Dose, DualCT-Dose) and three fusion algorithms (latent low-rank representation (LLRR), wavelet, and weighted least squares (WLS)). The fusion algorithms were used to fuse the pre-treatment CT images and 3-dimensional dose maps. Overall, 215 radiomics and Dosiomics features were extracted from the GTVs, and seven clinical features were incorporated. Five feature selection (FS) methods in combination with six machine learning (ML) models were implemented. The performance of the models was quantified using the concordance index (CI) in one-center-leave-out 5-fold cross-validation for overall survival (OS) prediction considering the time-to-event. RESULTS: The mean CI and Kaplan-Meier curves were used for further comparisons. The CoxBoost ML model using the Minimal Depth (MD) FS method and the glmnet model using the Variable hunting (VH) FS method showed the best performance with CI = 0.73 ± 0.15 for features extracted from LLRR fused images. In addition, both glmnet-Cindex and Coxph-Cindex classifiers achieved a CI of 0.72 ± 0.14 by employing the dose images (+ incorporated clinical features) only. CONCLUSION: Our results demonstrated that clinical features, Dosiomics and fusion of dose and CT images by specific ML-FS models could predict the overall survival of HNC patients with acceptable accuracy. Besides, the performance of ML methods among the three different strategies was almost comparable.
Subject(s)
Head and Neck Neoplasms , Radiomics , Humans , Prognosis , Head and Neck Neoplasms/diagnostic imaging , Head and Neck Neoplasms/radiotherapy , Machine Learning , X-Ray Computed Tomography
ABSTRACT
PURPOSE: Non-small cell lung cancer is the most common subtype of lung cancer. Patient survival prediction using machine learning (ML) and radiomics analysis proved to provide promising outcomes. However, most studies reported in the literature focused on information extracted from malignant lesions. This study aims to explore the relevance and additional value of information extracted from healthy organs in addition to tumoral tissue using ML algorithms. PATIENTS AND METHODS: This study included PET/CT images of 154 patients collected from available online databases. The gross tumor volume and 33 volumes of interest defined on healthy organs were segmented using nnU-Net deep learning-based segmentation. Subsequently, 107 radiomic features were extracted from PET and CT images (Organomics). Clinical information was combined with PET and CT radiomics from organs and gross tumor volumes considering 19 different combinations of inputs. Finally, different feature selection (FS; 5 methods) and ML (6 algorithms) algorithms were tested in a 3-fold data split cross-validation scheme. The performance of the models was quantified in terms of the concordance index (C-index) metric. RESULTS: For an input combination of all radiomics information, most of the selected features belonged to PET Organomics and CT Organomics. The highest C-index (0.68) was achieved using univariate C-index FS method and random survival forest ML model using CT Organomics + PET Organomics as input as well as minimum depth FS method and CoxPH ML model using PET Organomics as input. Considering all 17 combinations with C-index higher than 0.65, Organomics from PET or CT images were used as input in 16 of them. CONCLUSIONS: The selected features and C-indices demonstrated that the additional information extracted from healthy organs of both PET and CT imaging modalities improved the ML performance. 
Organomics could be a step toward exploiting the whole information available from multimodality medical images, contributing to the emerging field of digital twins in health care.
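The concordance index used above to rank the model and feature-selection combinations can be sketched as follows. This is a minimal illustration of Harrell's C-index, not the study's implementation; the function name and the toy data are hypothetical, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs that the risk score orders correctly.

    times  -- follow-up time per patient
    events -- 1 if the event was observed, 0 if censored
    risks  -- predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Comparable only if the times differ and the shorter one is an observed event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
toy_times = [5, 8, 12, 20]
toy_events = [1, 1, 0, 1]
toy_c = concordance_index(toy_times, toy_events, [0.9, 0.7, 0.4, 0.1])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported best C-index of 0.68 in context.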
Subject(s)
Carcinoma, Non-Small-Cell Lung; Lung Neoplasms; Machine Learning; Positron Emission Tomography Computed Tomography; Humans; Carcinoma, Non-Small-Cell Lung/diagnostic imaging; Carcinoma, Non-Small-Cell Lung/pathology; Lung Neoplasms/diagnostic imaging; Male; Prognosis; Female; Aged; Middle Aged; Image Processing, Computer-Assisted/methods; Aged, 80 and over; Adult; Radiomics
ABSTRACT
This study investigated the impact of ComBat harmonization on the reproducibility of radiomic features extracted from magnetic resonance images (MRI) of a dedicated phantom acquired on different scanners, with various data acquisition parameters and multiple image pre-processing techniques. A nonanatomic phantom was imaged on four scanners as part of the TCIA RIDER database. In fast spin-echo inversion recovery (IR) sequences, several inversion times were employed: 50, 100, 250, 500, 750, 1000, 1500, 2000, 2500, and 3000 ms. In addition, a 3D fast spoiled gradient recalled echo (FSPGR) sequence was used to investigate several flip angles (FA): 2, 5, 10, 15, 20, 25, and 30 degrees. Nineteen phantom compartments were manually segmented. Different approaches were used to pre-process each image: bin discretization, wavelet filter, Laplacian of Gaussian, logarithm, square, square root, and gradient. Overall, 92 first-, second-, and higher-order statistical radiomic features were extracted. ComBat harmonization was also applied to the extracted radiomic features. Finally, the intraclass correlation coefficient (ICC) and the Kruskal-Wallis (KW) test were used to assess the robustness of radiomic features. Depending on the image pre-processing technique, the number of features showing no significant difference in the KW test ranged from 0-5 before to 29-74 after ComBat harmonization across scanners, from 31-91 to 37-92 across three repeated tests, from 0-33 to 34-90 across FAs, and from 3-68 to 65-89 across IRs. Likewise, the number of features with ICC over 90% ranged from 0-8 before to 6-60 after ComBat harmonization across scanners, from 11-75 to 17-80 across three repeated tests, from 3-83 to 9-84 across FAs, and from 3-49 to 3-63 across IRs. The use of various scanners, IRs, and FAs has a great impact on radiomic features. However, the majority of scanner-robust features are also robust to IR and FA.
Among the parameters examined, repeated tests on a single scanner had a negligible impact on radiomic features. Different scanners and acquisition parameters, combined with various image pre-processing techniques, might affect radiomic features to a large extent. ComBat harmonization might significantly improve the reproducibility of MRI radiomic features.
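The ICC criterion used above (features with ICC over 90% counted as robust) can be illustrated with a short sketch. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption here; the function name and the small rating table (the classic Shrout and Fleiss example) are illustrative only.

```python
def icc_2_1(table):
    """ICC(2,1) for a table with one row per target and one column per setting.

    In a phantom study, a row could be a compartment and a column a scanner;
    each cell holds the value of one radiomic feature.
    """
    n = len(table)       # number of targets (rows)
    k = len(table[0])    # number of settings (columns)
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(table[i][j] for i in range(n)) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Classic 6-target, 4-rater example; the resulting ICC(2,1) is about 0.29.
example_table = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
example_icc = icc_2_1(example_table)
```

Because ICC(2,1) penalizes systematic offsets between columns, a feature can correlate well across scanners and still fall below the 0.90 threshold when one scanner is consistently shifted.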
Subject(s)
Image Processing, Computer-Assisted; Magnetic Resonance Imaging; Phantoms, Imaging; Magnetic Resonance Imaging/methods; Magnetic Resonance Imaging/standards; Reproducibility of Results; Humans; Image Processing, Computer-Assisted/methods; Algorithms; Radiomics
ABSTRACT
BACKGROUND: Coronary artery disease (CAD) has one of the highest mortality rates in humans worldwide. Single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI) provides clinicians with myocardial metabolic information non-invasively. However, there are some limitations to interpreting SPECT images performed by physicians or automatic quantitative approaches. Radiomics analyzes images objectively by extracting quantitative features and can potentially reveal biological characteristics that the human eye cannot detect. However, the reproducibility and repeatability of some radiomic features can be highly susceptible to segmentation and imaging conditions. PURPOSE: We aimed to assess the reproducibility of radiomic features extracted from uncorrected MPI-SPECT images reconstructed with 15 different settings before and after ComBat harmonization, along with evaluating the effectiveness of ComBat in realigning feature distributions. MATERIALS AND METHODS: A total of 200 patients (50% normal and 50% abnormal) including rest and stress (without attenuation and scatter corrections) MPI-SPECT images were included. Images were reconstructed using 15 combinations of filter cut-off frequencies, filter orders, filter types, reconstruction algorithms, number of iterations and subsets resulting in 6000 images. Image segmentation was performed on the left ventricle in the first reconstruction for each patient and applied to 14 others. A total of 93 radiomic features were extracted from the segmented area, and ComBat was used to harmonize them. The intraclass correlation coefficient (ICC) and overall concordance correlation coefficient (OCCC) tests were performed before and after ComBat to examine the impact of each parameter on feature robustness and to assess harmonization efficiency. The ANOVA and the Kruskal-Wallis tests were performed to evaluate the effectiveness of ComBat in correcting feature distributions. 
In addition, the Student's t-test, Wilcoxon rank-sum, and signed-rank tests were implemented to assess the significance level of the impacts made by each parameter of different batches and patient groups (normal vs. abnormal) on radiomic features. RESULTS: Before applying ComBat, the majority of features (ICC: 82, OCCC: 61) achieved high reproducibility (ICC/OCCC ≥ 0.900) under every batch except Reconstruction. The largest and smallest number of poor features (ICC/OCCC < 0.500) were obtained by IterationSubset and Order batches, respectively. The most reliable features were from the first-order (FO) and gray-level co-occurrence matrix (GLCM) families. Following harmonization, the minimum number of robust features increased (ICC: 84, OCCC: 78). Applying ComBat showed that Order and Reconstruction were the least and the most responsive batches, respectively. The most robust families, in a descending order, were found to be FO, neighborhood gray-tone difference matrix (NGTDM), GLCM, gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), and gray-level dependence matrix (GLDM) under Cut-off, Filter, and Order batches. The Wilcoxon rank-sum test showed that the number of robust features significantly differed under most batches in the Normal and Abnormal groups. CONCLUSION: The majority of radiomic features show high levels of robustness across different OSEM reconstruction parameters in uncorrected MPI-SPECT. ComBat is effective in realigning feature distributions and enhancing radiomic features reproducibility.
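The OCCC reported above can be sketched as follows, using one common definition of the overall concordance correlation coefficient that reduces to Lin's pairwise CCC when only two columns are compared. Whether the study used exactly this estimator is not stated in the abstract, so treat it as an assumption; the function name and inputs are hypothetical.

```python
def occc(columns):
    """Overall concordance correlation coefficient for J >= 2 columns.

    Each column holds one feature measured on the same patients under one
    reconstruction setting. Assumes at least one column is not constant.
    """
    n = len(columns[0])
    j = len(columns)
    means = [sum(col) / n for col in columns]

    def cov(a, mean_a, b, mean_b):
        # Population (1/n) covariance, as in Lin's original CCC.
        return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n

    variances = [cov(col, m, col, m) for col, m in zip(columns, means)]
    cov_sum = 0.0
    shift_sum = 0.0
    for p in range(j - 1):
        for q in range(p + 1, j):
            cov_sum += cov(columns[p], means[p], columns[q], means[q])
            shift_sum += (means[p] - means[q]) ** 2  # penalty for mean offsets
    return 2.0 * cov_sum / ((j - 1) * sum(variances) + shift_sum)


# Two settings that differ by a constant offset: perfectly correlated,
# yet concordance drops below 1 because of the shift.
shifted = occc([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
```

The example shows why concordance is stricter than correlation: a feature that is merely offset between reconstructions is not counted as reproducible until harmonization removes the offset.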
ABSTRACT
INTRODUCTION: We propose a fully automated framework to conduct a region-wise image quality assessment (IQA) on whole-body 18F-FDG PET scans. This framework (1) can be valuable in daily clinical image acquisition procedures to instantly recognize low-quality scans for potential rescanning and/or image reconstruction, and (2) can make a significant impact in dataset collection for the development of artificial intelligence-driven 18F-FDG PET analysis models by rejecting low-quality images and those presenting with artifacts, toward building clean datasets. PATIENTS AND METHODS: Two experienced nuclear medicine physicians separately evaluated the quality of 174 18F-FDG PET images from 87 patients, for each body region, based on a 5-point Likert scale. The body regions included the following: (1) the head and neck, including the brain, (2) the chest, (3) the chest-abdomen interval (diaphragmatic region), (4) the abdomen, and (5) the pelvis. Intrareader and interreader reproducibility of the quality scores were calculated using 39 randomly selected scans from the dataset. Images were dichotomized into low-quality versus high-quality for physician quality scores ≤3 versus >3, respectively. Taking the 18F-FDG PET/CT scans as input, our proposed fully automated framework applies 2 deep learning (DL) models on CT images to perform region identification and whole-body contour extraction (excluding extremities), then classifies PET regions as low or high quality. For classification, 2 mainstream artificial intelligence-driven approaches, machine learning (ML) from radiomic features and DL, were investigated. All models were trained and evaluated on the scores attributed by each physician and on the average of those scores. DL and radiomics-ML models were evaluated on the same test dataset.
The performance of the radiomics-ML and DL models was evaluated using the area under the curve, accuracy, sensitivity, and specificity, and compared using the DeLong test, with P values <0.05 regarded as statistically significant. RESULTS: In the head and neck, chest, chest-abdomen interval, abdomen, and pelvis regions, the best models achieved area under the curve, accuracy, sensitivity, and specificity of [0.97, 0.95, 0.96, and 0.95], [0.85, 0.82, 0.87, and 0.76], [0.83, 0.76, 0.68, and 0.80], [0.73, 0.72, 0.64, and 0.77], and [0.72, 0.68, 0.70, and 0.67], respectively. In all regions, models performed best when developed on the quality scores with higher intrareader reproducibility. Comparison of DL and radiomics-ML models did not show any statistically significant differences, though DL models showed overall improved trends. CONCLUSIONS: We developed a fully automated and human-perceptive equivalent model to conduct region-wise IQA over 18F-FDG PET images. Our analysis emphasizes the necessity of developing separate models for body regions and performing data annotation based on multiple experts' consensus in IQA studies.
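The dichotomization of Likert scores (≤3 versus >3) and the accuracy, sensitivity, and specificity reported above can be sketched as below. Treating low quality as the positive class is an assumption made for this illustration, and the function names and toy scores are hypothetical.

```python
def binarize_quality(scores, cutoff=3):
    """Map 5-point Likert scores to 1 (low quality, score <= cutoff) or 0 (high quality)."""
    return [1 if score <= cutoff else 0 for score in scores]


def classification_metrics(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of low-quality regions detected
        "specificity": tn / (tn + fp),  # share of high-quality regions kept
    }


# Hypothetical physician scores for five regions and a model's predictions.
reference = binarize_quality([1, 3, 4, 5, 2])
metrics = classification_metrics(reference, [1, 0, 0, 1, 1])
```

In a rescanning workflow, sensitivity under this convention is the fraction of poor scans that would be flagged, while specificity is the fraction of good scans that would not trigger an unnecessary repeat.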
Subject(s)
Fluorodeoxyglucose F18; Image Processing, Computer-Assisted; Positron-Emission Tomography; Humans; Positron-Emission Tomography/standards; Positron-Emission Tomography/methods; Image Processing, Computer-Assisted/methods; Automation; Male; Female; Middle Aged; Quality Control; Aged; Whole Body Imaging; Adult
ABSTRACT
PURPOSE: This study aimed to examine the robustness of positron emission tomography (PET) radiomic features extracted via different segmentation methods before and after ComBat harmonization in patients with non-small cell lung cancer (NSCLC). METHODS: We included 120 patients (positive recurrence = 46 and negative recurrence = 74) referred for PET scanning as a routine part of their care. All patients had a biopsy-proven NSCLC. Nine segmentation methods were applied to each image, including manual delineation, K-means (KM), watershed, fuzzy-C-mean, region-growing, local active contour (LAC), and iterative thresholding (IT) with 40, 45, and 50% thresholds. Diverse image discretizations, both without a filter and with different wavelet decompositions, were applied to PET images. Overall, 6741 radiomic features were extracted from each image (749 radiomic features from each segmented area). Non-parametric empirical Bayes (NPEB) ComBat harmonization was used to harmonize the features. A Linear Support Vector Classifier (LinearSVC) with L1 regularization was used for feature selection, and a Support Vector Machine (SVM) classifier was trained with fivefold nested cross-validation (StratifiedKFold with 'n_splits' set to 5) to predict recurrence in NSCLC patients and assess the impact of ComBat harmonization on the outcome. RESULTS: From 749 extracted radiomic features, 206 (27%) and 389 (51%) features showed excellent reliability (ICC ≥ 0.90) against segmentation method variation before and after NPEB ComBat harmonization, respectively. Among all, 39 features demonstrated poor reliability, which declined to 10 after ComBat harmonization. The 64 fixed bin widths (without any filter) and wavelets (LLL)-based radiomic features set achieved the best performance in terms of robustness against diverse segmentation techniques before and after ComBat harmonization.
The first-order and GLRLM feature families showed the largest number of robust features before ComBat harmonization, and the first-order and NGTDM families after it. In terms of predicting recurrence in NSCLC, our findings indicate that using ComBat harmonization can significantly enhance machine learning outcomes, particularly improving the accuracy of watershed segmentation, which initially had fewer reliable features than manual contouring. Following the application of ComBat harmonization, the majority of cases saw a substantial increase in sensitivity and specificity. CONCLUSION: Radiomic features are vulnerable to different segmentation methods. ComBat harmonization might be considered a solution to overcome the poor reliability of radiomic features.
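The fivefold nested, stratified cross-validation mentioned in the methods can be sketched without external libraries as below. The study reports using scikit-learn's StratifiedKFold; this stand-in only mimics the idea (each fold keeps roughly the same class ratio) and does not reproduce that library's exact fold assignment or shuffling. Function names are hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, n_splits=5):
    """Assign sample positions to folds so each class is spread evenly."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % n_splits].append(idx)  # round-robin within each class
    return folds


def nested_cv_splits(labels, n_splits=5):
    """Yield (inner_train, inner_validation, outer_test) index lists.

    The outer loop holds out a test fold for the final estimate; the inner
    loop splits the remaining data for feature selection and tuning.
    """
    for test in stratified_folds(labels, n_splits):
        held_out = set(test)
        train = [i for i in range(len(labels)) if i not in held_out]
        inner = stratified_folds([labels[i] for i in train], n_splits)
        for val_positions in inner:
            val = [train[p] for p in val_positions]
            val_set = set(val)
            inner_train = [i for i in train if i not in val_set]
            yield inner_train, val, test


# Class sizes taken from the abstract: 46 recurrences, 74 non-recurrences.
cohort_labels = [1] * 46 + [0] * 74
splits = list(nested_cv_splits(cohort_labels))
```

The point of the nesting is that feature selection (here, the L1-regularized LinearSVC) only ever sees the inner training data, so the outer test fold gives an estimate that is not inflated by selection bias.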
Subject(s)
Carcinoma, Non-Small-Cell Lung; Image Processing, Computer-Assisted; Lung Neoplasms; Humans; Carcinoma, Non-Small-Cell Lung/diagnostic imaging; Lung Neoplasms/diagnostic imaging; Image Processing, Computer-Assisted/methods; Male; Female; Middle Aged; Aged; Positron-Emission Tomography/methods; Support Vector Machine; Adult; Radiomics
ABSTRACT
BACKGROUND: PET/CT images combining anatomic and metabolic data provide complementary information that can improve clinical task performance. PET image segmentation algorithms exploiting the multi-modal information available are still lacking. PURPOSE: Our study aimed to assess the performance of PET and CT image fusion for gross tumor volume (GTV) segmentation of head and neck cancers (HNCs) utilizing conventional, deep learning (DL), and output-level voting-based fusions. METHODS: The current study is based on a total of 328 histologically confirmed HNCs from six different centers. The images were automatically cropped to a 200 × 200 head and neck region box, and CT and PET images were normalized for further processing. Eighteen conventional image-level fusions were implemented. In addition, a modified U2-Net architecture was used as the baseline DL fusion model. Three different information fusions were used, at the input, layer, and decision levels. Simultaneous truth and performance level estimation (STAPLE) and majority voting were employed to merge different segmentation outputs (from PET and from image-level and network-level fusions), that is, output-level information fusion (voting-based fusion). Different networks were trained in a 2D manner with a batch size of 64. Twenty percent of the dataset, stratified by center (20% from each center), was used for final result reporting. Different standard segmentation metrics and conventional PET metrics, such as SUV, were calculated. RESULTS: In single modalities, PET had a reasonable performance with a Dice score of 0.77 ± 0.09, while CT did not perform acceptably and reached a Dice score of only 0.38 ± 0.22.
Conventional fusion algorithms obtained Dice scores in the range 0.76-0.81, with guided-filter-based context enhancement (GFCE) at the low end, and anisotropic diffusion and Karhunen-Loeve transform fusion (ADF), multi-resolution singular value decomposition (MSVD), and multi-level image decomposition based on latent low-rank representation (MDLatLRR) at the high end. All DL fusion models achieved Dice scores of 0.80. Output-level voting-based models outperformed all other models, achieving a Dice score of 0.84 for Majority_ImgFus, Majority_All, and Majority_Fast. A mean error of almost zero was achieved for all fusions using SUVpeak, SUVmean, and SUVmedian. CONCLUSION: PET/CT information fusion adds significant value to segmentation tasks, considerably outperforming PET-only and CT-only methods. In addition, both conventional image-level and DL fusions achieve competitive results. Meanwhile, output-level voting-based fusion using majority voting of several algorithms results in statistically significant improvements in the segmentation of HNC.
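The Dice score and the majority-voting fusion described above can be sketched on flattened binary masks as follows. This is an illustrative sketch only: the mask values are made up, ties with an even number of voters are resolved to background by assumption, and STAPLE, which weights each segmentation by its estimated performance, is not shown.

```python
def dice_score(mask_a, mask_b):
    """Dice coefficient between two binary masks given as flat lists of 0/1."""
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(mask_a) + sum(mask_b)
    return 1.0 if total == 0 else 2.0 * overlap / total


def majority_vote(masks):
    """Fuse several binary masks: a voxel is tumor if more than half of them agree."""
    n_voters = len(masks)
    return [1 if 2 * sum(votes) > n_voters else 0 for votes in zip(*masks)]


# Hypothetical reference contour and three imperfect segmentations.
truth = [1, 1, 1, 0, 0, 0]
seg_1 = [1, 1, 0, 0, 0, 0]
seg_2 = [1, 0, 1, 1, 0, 0]
seg_3 = [0, 1, 1, 0, 0, 1]
fused = majority_vote([seg_1, seg_2, seg_3])
```

In this toy case each individual mask misses or adds voxels, but their errors do not coincide, so the vote recovers the reference exactly; this is the intuition behind the reported gain of voting-based fusion over any single model.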
Subject(s)
Head and Neck Neoplasms; Positron Emission Tomography Computed Tomography; Humans; Positron Emission Tomography Computed Tomography/methods; Algorithms; Head and Neck Neoplasms/diagnostic imaging; Image Processing, Computer-Assisted/methods
ABSTRACT
The current study aimed to predict lymphovascular invasion (LVI) using multiple machine learning algorithms and multi-segmentation positron emission tomography (PET) radiomics in non-small cell lung cancer (NSCLC) patients, offering new avenues for personalized treatment strategies and improving patient outcomes. One hundred and twenty-six patients with NSCLC were enrolled in this study. Various automated and semi-automated PET image segmentation methods were applied, including Local Active Contour (LAC), Fuzzy-C-mean (FCM), K-means (KM), Watershed, Region Growing (RG), and Iterative thresholding (IT) with different percentages of the threshold. One hundred five radiomic features were extracted from each region of interest (ROI). Multiple feature selection methods, including Minimum Redundancy Maximum Relevance (MRMR), Recursive Feature Elimination (RFE), and Boruta, and multiple classifiers, including Multilayer Perceptron (MLP), Logistic Regression (LR), XGBoost (XGB), Naive Bayes (NB), and Random Forest (RF), were employed. Synthetic Minority Oversampling Technique (SMOTE) was also used to determine if it boosts the area under the ROC curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE). Our results indicated that the combination of SMOTE, IT (with 45% threshold), RFE feature selection and LR classifier showed the best performance (AUC = 0.93, ACC = 0.84, SEN = 0.85, SPE = 0.84) followed by SMOTE, FCM segmentation, MRMR feature selection, and LR classifier (AUC = 0.92, ACC = 0.87, SEN = 1, SPE = 0.84). The highest ACC belonged to the IT segmentation (with 45 and 50% thresholds) alongside Boruta feature selection and the NB classifier without SMOTE (ACC = 0.9, AUC = 0.78 and 0.76, SEN = 0.7, and SPE = 0.94, respectively). Our results indicate that selection of appropriate segmentation method and machine learning algorithm may be helpful in successful prediction of LVI in patients with NSCLC with high accuracy using PET radiomics analysis.
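The SMOTE step mentioned above can be sketched as follows: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. This is a minimal stand-alone illustration, not the implementation used in the study; the function name, the default of three neighbours, and the toy feature vectors are assumptions.

```python
import random


def smote_samples(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by neighbour interpolation.

    minority -- list of feature vectors (lists of floats), at least two of them
    k        -- number of nearest minority neighbours to choose from
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to `base`.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to mate
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return synthetic


# Hypothetical two-feature minority class (e.g. LVI-positive cases).
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_samples(minority_class, n_new=10)
```

Oversampling of this kind is normally applied to the training portion only, since synthetic points derived from test cases would leak information into the evaluation.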
ABSTRACT
BACKGROUND: Notwithstanding the encouraging results of previous studies reporting on the efficiency of deep learning (DL) in COVID-19 prognostication, clinical adoption of the developed methodology still needs to be improved. To overcome this limitation, we set out to predict the prognosis of a large multi-institutional cohort of patients with COVID-19 using a DL-based model. PURPOSE: This study aimed to evaluate the performance of deep privacy-preserving federated learning (DPFL) in predicting COVID-19 outcomes using chest CT images. METHODS: After applying inclusion and exclusion criteria, 3055 patients from 19 centers, including 1599 alive and 1456 deceased, were enrolled in this study. Data from all centers were split (randomly, with stratification by center and class) into a training/validation set (70%/10%) and a hold-out test set (20%). For the DL model, feature extraction was performed on 2D slices, and averaging was performed at the final layer to construct a 3D model for each scan. The DenseNet model was used for feature extraction. The model was developed using centralized and FL approaches. For FL, we employed DPFL approaches. A membership inference attack was also evaluated in the FL strategy. For model evaluation, different metrics were reported on the hold-out test sets. In addition, models trained in the two scenarios, centralized and FL, were compared using the DeLong test for statistical differences. RESULTS: The centralized model achieved an accuracy of 0.76, while the DPFL model had an accuracy of 0.75. Both the centralized and DPFL models achieved a specificity of 0.77. The centralized model achieved a sensitivity of 0.74, while the DPFL model had a sensitivity of 0.73. Mean AUCs of 0.82 (95% CI: 0.79-0.85) and 0.81 (95% CI: 0.77-0.84) were achieved by the centralized model and the DPFL model, respectively.
The DeLong test did not show a statistically significant difference between the two models (p-value = 0.98). The AUC values for the inference attacks fluctuated between 0.49 and 0.51, with an average of 0.50 ± 0.003 and a 95% CI for the mean AUC of 0.500 to 0.501. CONCLUSION: The performance of the proposed model was comparable to that of centralized models while operating on large and heterogeneous multi-institutional datasets. In addition, the model was resistant to inference attacks, ensuring the privacy of shared data during the training process.
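The AUC values quoted above, both for outcome prediction and for the membership inference attack, can be read through the rank-based definition of the AUC sketched below: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name and toy scores are hypothetical, and this is not the evaluation code used in the study.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count as half. For a membership inference attack, "positive" would be
    samples that were in the training set and "negative" samples that were not.
    """
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# A perfectly separating score versus one that carries no information.
separable = auc_from_scores([0.9, 0.8], [0.1, 0.2])
uninformative = auc_from_scores([0.2, 0.8], [0.2, 0.8])
```

Under this reading, an attack AUC of about 0.50 means the attacker's scores rank training members above non-members no better than a coin flip, which is what supports the abstract's claim of resistance to inference attacks.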