ABSTRACT
BACKGROUND AND PURPOSE: Artificial intelligence models in radiology are frequently developed and validated using data sets from a single institution and are rarely tested on independent, external data sets, raising questions about their generalizability and applicability in clinical practice. The American Society of Functional Neuroradiology (ASFNR) organized a multicenter artificial intelligence competition to evaluate the proficiency of developed models in identifying various pathologies on NCCT, assessing age-based normality and estimating medical urgency. MATERIALS AND METHODS: In total, 1201 anonymized, full-head NCCT clinical scans from 5 institutions were pooled to form the data set. The data set encompassed studies with normal findings as well as those with pathologies, including acute ischemic stroke, intracranial hemorrhage, traumatic brain injury, and mass effect (detection of these, task 1). NCCTs were also assessed to determine if findings were consistent with expected brain changes for the patient's age (task 2: age-based normality assessment) and to identify any abnormalities requiring immediate medical attention (task 3: evaluation of findings for urgent intervention). Five neuroradiologists labeled each NCCT, with consensus interpretations serving as the ground truth. The competition was announced online, inviting academic institutions and companies. Independent central analysis assessed the performance of each model. Accuracy, sensitivity, specificity, positive and negative predictive values, and receiver operating characteristic (ROC) curves were generated for each artificial intelligence model, along with the area under the ROC curve. RESULTS: Four teams processed 1177 studies. The median age of patients was 62 years, with an interquartile range of 33 years. Nineteen teams from various academic institutions registered for the competition. Of these, 4 teams submitted their final results. No commercial entities participated in the competition. 
For task 1, areas under the ROC curve ranged from 0.49 to 0.59. For task 2, two teams completed the task with area under the ROC curve values of 0.57 and 0.52. For task 3, teams had little to no agreement with the ground truth. CONCLUSIONS: To assess the performance of artificial intelligence models in real-world clinical scenarios, we analyzed their performance in the ASFNR Artificial Intelligence Competition. The first ASFNR Competition underscored the gap between expectation and reality: the models largely fell short in their assessments. As the integration of artificial intelligence tools into clinical workflows increases, neuroradiologists must carefully recognize the capabilities, constraints, and consistency of these technologies. Before institutions adopt these algorithms, thorough validation is essential to ensure acceptable levels of performance in clinical settings.
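The performance measures named in the abstract above can be illustrated with a minimal sketch. This is not the competition's central analysis code; the function names and the toy labels and scores are hypothetical, and a fixed threshold of 0.5 is assumed for turning scores into binary predictions.

```python
from typing import Dict, Sequence


def confusion_metrics(truth: Sequence[int], pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, PPV and NPV for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def roc_auc(truth: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for t, s in zip(truth, scores) if t == 1]
    neg = [s for t, s in zip(truth, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): consensus labels and model scores for 8 studies.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.6, 0.7]
pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = confusion_metrics(truth, pred)
auc = roc_auc(truth, scores)
print(metrics, auc)
```

On this toy data every confusion-matrix metric is 0.75 and the AUC is 0.875. An AUC of 0.5 corresponds to chance-level ranking, which is the context for reading the 0.49 to 0.59 range reported for task 1.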
Subjects
Artificial Intelligence , Humans , Male , United States , Middle Aged , Adult , Female , Aged , Tomography, X-Ray Computed/methods , Societies, Medical , Brain Diseases/diagnostic imaging , Sensitivity and Specificity , Reproducibility of Results , Young Adult
ABSTRACT
[This corrects the article DOI: 10.1117/1.JMI.9.1.016001.].
ABSTRACT
[This corrects the article DOI: 10.1093/noajnl/vdaa066.].
ABSTRACT
Purpose: Deep learning has shown promise for predicting the molecular profiles of gliomas using MR images. Prior to clinical implementation, ensuring robustness to real-world problems, such as patient motion, is crucial. The purpose of this study is to perform a preliminary evaluation of the effects of simulated motion artifact on glioma marker classifier performance and determine if motion correction can restore classification accuracies. Approach: T2w images and molecular information were retrieved from the TCIA and TCGA databases. Simulated motion was added in the k-space domain along the phase encoding direction. Classifier performance for IDH mutation, 1p/19q co-deletion, and MGMT methylation was assessed over the range of 0% to 100% corrupted k-space lines. Rudimentary motion correction networks were trained on the motion-corrupted images. The performance of the three glioma marker classifiers was then evaluated on the motion-corrected images. Results: Glioma marker classifier performance decreased markedly with increasing motion corruption. Applying motion correction effectively restored classification accuracy for even the most motion-corrupted images. For isocitrate dehydrogenase (IDH) classification, 99% accuracy was achieved, exceeding the original performance of the network and representing a new benchmark in non-invasive MRI-based IDH classification. Conclusions: Robust motion correction can facilitate highly accurate deep learning MRI-based molecular marker classification, rivaling invasive tissue-based characterization methods. Motion correction may be able to increase classification accuracy even in the absence of a visible artifact, representing a new strategy for boosting classifier performance.
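The abstract above does not give the details of its motion simulation, so the following is only a sketch of one common way to corrupt a chosen fraction of k-space lines along the phase-encoding direction. It assumes rigid in-plane translation, modeled as a linear phase ramp applied to randomly selected lines; the function name, the `max_shift` parameter, and the square phantom are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def corrupt_kspace_lines(image, fraction, max_shift=3.0, seed=0):
    """Corrupt a fraction of phase-encode lines with random translations.

    Rows of k-space (axis 0) are treated as phase-encode lines. Each selected
    line is multiplied by the phase ramp that a spatial shift of (dy, dx)
    pixels would produce, mimicking a patient who moved before that line
    was acquired. This is an assumed motion model, not the study's.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = np.random.default_rng(seed)
    n_pe, n_fe = image.shape
    kspace = np.fft.fft2(image)

    ky = np.fft.fftfreq(n_pe)  # cycles per pixel along the phase-encode axis
    kx = np.fft.fftfreq(n_fe)  # cycles per pixel along the frequency-encode axis

    n_corrupt = int(round(fraction * n_pe))
    lines = rng.choice(n_pe, size=n_corrupt, replace=False)
    for line in lines:
        dy, dx = rng.uniform(-max_shift, max_shift, size=2)
        ramp = np.exp(-2j * np.pi * (ky[line] * dy + kx * dx))
        kspace[line, :] *= ramp

    # Magnitude image, as would be reconstructed from the corrupted k-space.
    return np.abs(np.fft.ifft2(kspace))


# Hypothetical phantom: a bright rectangle on a dark background.
phantom = np.zeros((32, 32))
phantom[8:24, 10:22] = 1.0

clean = corrupt_kspace_lines(phantom, fraction=0.0)
corrupted = corrupt_kspace_lines(phantom, fraction=1.0)
```

Sweeping `fraction` from 0.0 to 1.0 mirrors the 0% to 100% range of corrupted lines described in the abstract. Because each line is only multiplied by unit-magnitude phase terms, the total signal energy is unchanged and the corruption appears as ghosting and blurring rather than as a change in overall intensity.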
ABSTRACT
BACKGROUND: One of the most important recent discoveries in brain glioma biology has been the identification of the isocitrate dehydrogenase (IDH) mutation and 1p/19q co-deletion status as markers for therapy and prognosis. 1p/19q co-deletion is the defining genomic marker for oligodendrogliomas and confers a better prognosis and treatment response than gliomas without it. Our group has previously developed a highly accurate deep-learning network for determining IDH mutation status using T2-weighted (T2w) MRI only. The purpose of this study was to develop a similar 1p/19q deep-learning classification network. METHODS: Multiparametric brain MRI and corresponding genomic information were obtained for 368 subjects from The Cancer Imaging Archive and The Cancer Genome Atlas. 1p/19q co-deletions were present in 130 subjects. Two hundred thirty-eight subjects were non-co-deleted. A T2w image-only network (1p/19q-net) was developed to perform 1p/19q co-deletion status classification and simultaneous single-label tumor segmentation using 3D-Dense-UNets. Three-fold cross-validation was performed to generalize the network performance. Receiver operating characteristic analysis was also performed. Dice scores were computed to determine tumor segmentation accuracy. RESULTS: 1p/19q-net demonstrated a mean cross-validation accuracy of 93.46% across the 3 folds (93.4%, 94.35%, and 92.62%, SD = 0.8) in predicting 1p/19q co-deletion status, with a sensitivity and specificity of 0.90 ± 0.003 and 0.95 ± 0.01, respectively, and a mean area under the curve of 0.95 ± 0.01. The whole tumor segmentation mean Dice score was 0.80 ± 0.007. CONCLUSION: We demonstrate high 1p/19q co-deletion classification accuracy using only T2w MR images. This represents an important milestone toward using MRI to predict glioma histology, prognosis, and response to treatment.
ABSTRACT
We developed a fully automated method for brain tumor segmentation using deep learning; 285 brain tumor cases with multiparametric magnetic resonance images from the BraTS2018 data set were used. We designed 3 separate 3D-Dense-UNets to simplify the complex multiclass segmentation problem into individual binary-segmentation problems for each subcomponent. We implemented a 3-fold cross-validation to generalize the network's performance. The mean cross-validation Dice-scores for whole tumor (WT), tumor core (TC), and enhancing tumor (ET) segmentations were 0.92, 0.84, and 0.80, respectively. We then retrained the individual binary-segmentation networks using 265 of the 285 cases, with 20 cases held-out for testing. We also tested the network on 46 cases from the BraTS2017 validation data set, 66 cases from the BraTS2018 validation data set, and 52 cases from an independent clinical data set. The average Dice-scores for WT, TC, and ET were 0.90, 0.84, and 0.80, respectively, on the 20 held-out testing cases. The average Dice-scores for WT, TC, and ET on the BraTS2017 validation data set, the BraTS2018 validation data set, and the clinical data set were as follows: 0.90, 0.80, and 0.78; 0.90, 0.82, and 0.80; and 0.85, 0.80, and 0.77, respectively. A fully automated deep learning method was developed to segment brain tumors into their subcomponents, which achieved high prediction accuracy on the BraTS data set and on the independent clinical data set. This method is promising for implementation into a clinical workflow.
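The decomposition into binary problems and the Dice metric described above can be sketched as follows. The label-to-region mapping assumes the BraTS 2018 annotation convention (1 = necrotic and non-enhancing core, 2 = edema, 4 = enhancing tumor), which the abstract itself does not spell out; the function names and the tiny flattened label maps are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumed BraTS 2018 label convention; not stated in the abstract.
REGION_LABELS = {
    "WT": {1, 2, 4},  # whole tumor
    "TC": {1, 4},     # tumor core
    "ET": {4},        # enhancing tumor
}


def to_binary_masks(labels: Sequence[int]) -> Dict[str, List[int]]:
    """Split one multiclass label map into three binary masks (WT, TC, ET)."""
    return {
        region: [1 if v in members else 0 for v in labels]
        for region, members in REGION_LABELS.items()
    }


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 8-voxel example (flattened volumes).
truth_labels = [0, 2, 2, 1, 4, 4, 0, 0]
pred_labels = [0, 2, 1, 1, 4, 0, 0, 2]

truth_masks = to_binary_masks(truth_labels)
pred_masks = to_binary_masks(pred_labels)
dice = {r: dice_score(pred_masks[r], truth_masks[r]) for r in REGION_LABELS}
print(dice)
```

For this toy example the Dice scores are 0.8 for WT and 2/3 for both TC and ET. Scoring each region against its own binary mask is what allows one network per subcomponent to be trained and evaluated independently.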
Subjects
Brain Neoplasms , Deep Learning , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/genetics , Humans , Image Processing, Computer-Assisted , Magnetic Resonance Imaging , Neural Networks, Computer
ABSTRACT
BACKGROUND: Isocitrate dehydrogenase (IDH) mutation status has emerged as an important prognostic marker in gliomas. Currently, reliable IDH mutation determination requires invasive surgical procedures. The purpose of this study was to develop a highly accurate, MRI-based, voxelwise deep-learning IDH classification network using T2-weighted (T2w) MR images and compare its performance to a multicontrast network. METHODS: Multiparametric brain MRI data and corresponding genomic information were obtained for 214 subjects (94 IDH-mutated, 120 IDH wild-type) from The Cancer Imaging Archive and The Cancer Genome Atlas. Two separate networks were developed, including a T2w image-only network (T2-net) and a multicontrast (T2w, fluid attenuated inversion recovery, and T1 postcontrast) network (TS-net) to perform IDH classification and simultaneous single label tumor segmentation. The networks were trained using 3D Dense-UNets. Three-fold cross-validation was performed to generalize the networks' performance. Receiver operating characteristic analysis was also performed. Dice scores were computed to determine tumor segmentation accuracy. RESULTS: T2-net demonstrated a mean cross-validation accuracy of 97.14% ± 0.04 in predicting IDH mutation status, with a sensitivity of 0.97 ± 0.03, specificity of 0.98 ± 0.01, and an area under the curve (AUC) of 0.98 ± 0.01. TS-net achieved a mean cross-validation accuracy of 97.12% ± 0.09, with a sensitivity of 0.98 ± 0.02, specificity of 0.97 ± 0.001, and an AUC of 0.99 ± 0.01. The mean whole tumor segmentation Dice scores were 0.85 ± 0.009 for T2-net and 0.89 ± 0.006 for TS-net. CONCLUSION: We demonstrate high IDH classification accuracy using only T2-weighted MR images. This represents an important milestone toward clinical translation.
Subjects
Brain Neoplasms/diagnostic imaging , Brain Neoplasms/genetics , Deep Learning , Glioma/diagnostic imaging , Glioma/genetics , Isocitrate Dehydrogenase/genetics , Magnetic Resonance Imaging , Female , Humans , Male , Middle Aged , Sensitivity and Specificity
ABSTRACT
Transcranial infrared laser stimulation (TILS) has shown effectiveness in improving human cognition and was investigated using broadband near-infrared spectroscopy (bb-NIRS) in our previous study, but the effect of laser heating on the actual bb-NIRS measurements was not investigated. To address this potential confounding factor, 11 human participants were studied. First, we measured time-dependent temperature increases on forehead skin using clinical-grade thermometers following the TILS experimental protocol used in our previous study. Second, a subject-averaged, time-dependent temperature alteration curve was obtained, based on which a heat generator was controlled to induce the same temperature increase at the same forehead location where TILS was delivered in each participant. Third, the same bb-NIRS system was employed to monitor hemodynamic and metabolic changes of forehead tissue near the thermal stimulation site before, during, and after the heat stimulation. The results showed that cytochrome-c-oxidase of forehead tissue was not significantly modified by this heat stimulation. Significant differences in oxyhemoglobin, total hemoglobin, and differential hemoglobin concentrations were observed during the heat stimulation period versus the laser stimulation. The study demonstrated a transient hemodynamic effect of heat-based stimulation distinct from that of TILS. We concluded that the observed effects of TILS on cerebral hemodynamics and metabolism are not induced by heating the skin.
ABSTRACT
Transcranial infrared laser stimulation (TILS) is a noninvasive form of brain photobiomodulation. Cytochrome-c-oxidase (CCO), the terminal enzyme in the mitochondrial electron transport chain, is hypothesized to be the primary intracellular photoacceptor. We hypothesized that TILS up-regulates cerebral CCO and causes hemodynamic changes. We delivered 1064-nm laser stimulation to the forehead of healthy participants (n = 11), while broadband near-infrared spectroscopy was utilized to acquire light reflectance from the TILS-treated cortical region before, during, and after TILS. Placebo experiments were also performed for accurate comparison. Time courses of spectroscopic readings were analyzed and fitted to the modified Beer-Lambert law. With respect to the placebo readings, we observed (1) significant increases in cerebral concentrations of oxidized CCO (Δ[CCO]; >0.08 µM; p < 0.01), oxygenated hemoglobin (Δ[HbO]; >0.8 µM; p < 0.01), and total hemoglobin (Δ[HbT]; >0.5 µM; p < 0.01) during and after TILS, and (2) linear relationships between Δ[CCO] and Δ[HbO] and between Δ[CCO] and Δ[HbT]. Ratios of Δ[CCO]/Δ[HbO] and Δ[CCO]/Δ[HbT] were introduced as TILS-induced metabolic-hemodynamic coupling indices to quantify the coupling strength between TILS-enhanced cerebral metabolism and blood oxygen supply. This study provides the first demonstration that TILS causes up-regulation of oxidized CCO in the human brain, and contributes important insight into the physiological mechanisms.
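The fitting step mentioned above can be sketched as a least-squares inversion of the modified Beer-Lambert law, ΔOD(λ) = Σ ε_i(λ) · Δc_i · d · DPF, across wavelengths. This is an illustration under stated assumptions, not the study's analysis pipeline: the extinction matrix below contains placeholder numbers rather than real spectra, the path length and differential pathlength factor (DPF) are arbitrary, units are arbitrary, a wavelength-independent DPF is assumed, and NumPy is assumed to be available.

```python
import numpy as np


def mbll_concentration_changes(baseline, measured, extinction, path_length, dpf):
    """Estimate chromophore concentration changes from two intensity spectra.

    baseline, measured: detected intensities per wavelength, shape (n_wavelengths,)
    extinction: extinction coefficients, shape (n_wavelengths, n_chromophores)
    Returns the least-squares estimate of the concentration changes.
    """
    delta_od = np.log10(np.asarray(baseline, float) / np.asarray(measured, float))
    design = np.asarray(extinction, float) * path_length * dpf
    delta_c, *_ = np.linalg.lstsq(design, delta_od, rcond=None)
    return delta_c


# Placeholder extinction values (NOT real spectra); columns are HbO, HHb, CCO.
extinction = np.array([
    [1.0, 0.3, 0.2],
    [0.8, 0.5, 0.4],
    [0.6, 0.7, 0.9],
    [0.4, 0.9, 0.5],
    [0.3, 1.1, 0.3],
])
path_length, dpf = 3.0, 6.0  # arbitrary illustrative values

# Forward-simulate a measurement from known changes, then recover them.
true_delta_c = np.array([0.5, -0.1, 0.05])
baseline = np.full(5, 1000.0)
measured = baseline * 10.0 ** (-(extinction * path_length * dpf) @ true_delta_c)

estimated = mbll_concentration_changes(baseline, measured, extinction, path_length, dpf)
coupling_index = estimated[2] / estimated[0]  # ratio in the spirit of Δ[CCO]/Δ[HbO]
```

Using more wavelengths than chromophores makes the system overdetermined, which is why a least-squares solve is used here instead of a direct matrix inverse.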