ABSTRACT
Significant advances in artificial intelligence (AI) over the past decade potentially may lead to dramatic effects on clinical practice. Digitized histology represents an area ripe for AI implementation. We describe several current needs within the world of gastrointestinal histopathology, and outline, using currently studied models, how AI potentially can address them. We also highlight pitfalls as AI makes inroads into clinical practice.
Subjects
Artificial Intelligence; Gastrointestinal Diseases; Humans; Gastrointestinal Diseases/pathology; Gastrointestinal Diseases/diagnosis; Gastrointestinal Tract/pathology; Histocytochemistry/methods
ABSTRACT
OBJECTIVE: To develop a whole-body low-dose CT (WBLDCT) deep learning model and determine its accuracy in predicting the presence of cytogenetic abnormalities in multiple myeloma (MM). MATERIALS AND METHODS: WBLDCTs of MM patients performed within a year of diagnosis were included. Cytogenetic assessments of clonal plasma cells via fluorescence in situ hybridization (FISH) were used to risk-stratify patients as high-risk (HR) or standard-risk (SR). Presence of any of del(17p), t(14;16), t(4;14), and t(14;20) on FISH was defined as HR. The dataset was evenly divided into five groups (folds) at the individual patient level for model training. Mean and standard deviation (SD) of the area under the receiver operating characteristic curve (AUROC) across the folds were recorded. RESULTS: One hundred fifty-one patients with MM were included in the study. The model performed best for t(4;14), mean (SD) AUROC of 0.874 (0.073). The lowest AUROC was observed for trisomies: AUROC of 0.717 (0.058). Two- and 5-year survival rates for HR cytogenetics were 87% and 71%, respectively, compared to 91% and 79% for SR cytogenetics. Survival predictions by the WBLDCT deep learning model revealed 2- and 5-year survival rates for patients with HR cytogenetics as 87% and 71%, respectively, compared to 92% and 81% for SR cytogenetics. CONCLUSION: A deep learning model trained on WBLDCT scans predicted the presence of cytogenetic abnormalities used for risk stratification in MM. Assessment of the model's performance revealed good to excellent classification of the various cytogenetic abnormalities.
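The patient-level fold split and the mean (SD) AUROC summary described above can be sketched as follows. This is a minimal illustration, not the study's code: the function names, the round-robin assignment, the random seed, and the use of the sample standard deviation are all assumptions.

```python
import random
from statistics import mean, stdev


def patient_level_folds(patient_ids, k=5, seed=0):
    """Assign each unique patient to exactly one of k folds.

    Splitting on patient IDs (rather than on scans or slices) keeps all
    data from one patient in a single fold, which avoids leakage.
    """
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)  # hypothetical fixed seed for repeatability
    rng.shuffle(unique_ids)
    # Round-robin slicing keeps fold sizes within one patient of each other.
    return [unique_ids[i::k] for i in range(k)]


def summarize_auroc(fold_aurocs):
    """Return the mean and sample standard deviation across folds."""
    return mean(fold_aurocs), stdev(fold_aurocs)


# Hypothetical IDs standing in for the 151 patients in the abstract.
folds = patient_level_folds([f"patient_{i:03d}" for i in range(151)])
fold_sizes = [len(fold) for fold in folds]
```

With 151 patients and five folds, one fold necessarily holds one more patient than the others, so "evenly divided" can only hold to within one patient.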
ABSTRACT
BACKGROUND: Revision total hip arthroplasty (THA) requires preoperatively identifying in situ implants, a time-consuming and sometimes unachievable task. Although deep learning (DL) tools have been developed to automate this process, existing approaches are limited: they classify few femoral and no acetabular components, classify only on anterior-posterior (AP) radiographs, and do not report prediction uncertainty or flag outlier data. METHODS: This study introduces Total Hip Arthroplasty Automated Implant Detector (THA-AID), a DL tool trained on 241,419 radiographs that identifies common designs of 20 femoral and 8 acetabular components from AP, lateral, or oblique views and reports prediction uncertainty using conformal prediction and outlier detection using a custom framework. We evaluated THA-AID using internal, external, and out-of-domain test sets and compared its performance with human experts. RESULTS: THA-AID achieved internal test set accuracies of 98.9% for both femoral and acetabular components with no significant differences based on radiographic view. The femoral classifier also achieved 97.0% accuracy on the external test set. Adding conformal prediction increased true label prediction by 0.1% for acetabular and 0.7 to 0.9% for femoral components. More than 99% of out-of-domain and >89% of in-domain outlier data were correctly identified by THA-AID. CONCLUSIONS: The THA-AID is an automated tool for implant identification from radiographs with exceptional performance on internal and external test sets and no decrement in performance based on radiographic view. Importantly, this is the first study in orthopedics to our knowledge including uncertainty quantification and outlier detection of a DL model.
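The abstract names conformal prediction but does not say which variant THA-AID uses. As a rough illustration only, the sketch below shows generic split conformal prediction for a classifier, assuming a held-out calibration set and the nonconformity score 1 - p(true class). The function names and the choice of score are assumptions, not the authors' implementation.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Calibrate a threshold on the nonconformity score 1 - p(true class).

    cal_probs: list of per-class probability lists for calibration cases.
    cal_labels: list of true class indices for the same cases.
    alpha: acceptable error level (target coverage is 1 - alpha).
    """
    scores = sorted(1.0 - probs[label]
                    for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank of the calibration quantile.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration cases for this alpha: include every class.
        return float("inf")
    return scores[rank - 1]


def prediction_set(probs, threshold):
    """Return every class whose nonconformity score is within the threshold."""
    return [c for c, p in enumerate(probs) if 1.0 - p <= threshold]
```

A prediction set with more than one class signals an uncertain case that could be routed to a human reviewer.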
Subjects
Arthroplasty, Replacement, Hip; Deep Learning; Hip Prosthesis; Humans; Uncertainty; Acetabulum/surgery; Retrospective Studies
ABSTRACT
In recent years, deep learning (DL) has shown impressive performance in radiologic image analysis. However, for a DL model to be useful in a real-world setting, its confidence in a prediction must also be known. Each DL model's output has an estimated probability, and these estimated probabilities are not always reliable. Uncertainty represents the trustworthiness (validity) of estimated probabilities. The higher the uncertainty, the lower the validity. Uncertainty quantification (UQ) methods determine the uncertainty level of each prediction. Predictions made without UQ methods are generally not trustworthy. By implementing UQ in medical DL models, users can be alerted when a model does not have enough information to make a confident decision. Consequently, a medical expert could reevaluate the uncertain cases, which would eventually lead to gaining more trust when using a model. This review focuses on recent trends using UQ methods in DL radiologic image analysis within a conceptual framework. Also discussed in this review are potential applications, challenges, and future directions of UQ in DL radiologic image analysis.
Subjects
Deep Learning; Radiology; Humans; Uncertainty; Image Processing, Computer-Assisted
ABSTRACT
BACKGROUND: Whole-body low-dose CT is the recommended initial imaging modality to evaluate bone destruction as a result of multiple myeloma. Accurate interpretation of these scans to detect small lytic bone lesions is time intensive. A functional deep learning (DL) algorithm to detect lytic lesions on CTs could improve the value of these CTs for myeloma imaging. Our objectives were to develop a DL algorithm and determine its performance at detecting lytic lesions of multiple myeloma. METHODS: Axial slices (2-mm section thickness) from whole-body low-dose CT scans of subjects with biochemically confirmed plasma cell dyscrasias were included in the study. Data were split into train and test sets at the patient level targeting a 90%/10% split. Two musculoskeletal radiologists annotated lytic lesions on the images with bounding boxes. Subsequently, we developed a two-step deep learning model comprising bone segmentation followed by lesion detection. U-Net and "You Only Look Once" (YOLO) models were used as bone segmentation and lesion detection algorithms, respectively. Diagnostic performance was determined using the area under the receiver operating characteristic curve (AUROC). RESULTS: Forty whole-body low-dose CTs from 40 subjects yielded 2193 image slices. A total of 5640 lytic lesions were annotated. The two-step model achieved a sensitivity of 91.6% and a specificity of 84.6%. Lesion detection AUROC was 90.4%. CONCLUSION: We developed a deep learning model that detects lytic bone lesions of multiple myeloma on whole-body low-dose CTs with high performance. External validation is required prior to widespread adoption in clinical practice.
Subjects
Deep Learning; Multiple Myeloma; Osteolysis; Humans; Multiple Myeloma/diagnostic imaging; Multiple Myeloma/pathology; Algorithms; Tomography, X-Ray Computed/methods
ABSTRACT
BACKGROUND: In this work, we applied and validated an artificial intelligence technique known as generative adversarial networks (GANs) to create large volumes of high-fidelity synthetic anteroposterior (AP) pelvis radiographs that can enable deep learning (DL)-based image analyses, while ensuring patient privacy. METHODS: AP pelvis radiographs with native hips were gathered from an institutional registry between 1998 and 2018. The data was used to train a model to create 512 × 512 pixel synthetic AP pelvis images. The network was trained on 25 million images produced through augmentation. A set of 100 random images (50/50 real/synthetic) was evaluated by 3 orthopaedic surgeons and 2 radiologists to discern real versus synthetic images. Two models (joint localization and segmentation) were trained using synthetic images and tested on real images. RESULTS: The final model was trained on 37,640 real radiographs (16,782 patients). In a computer assessment of image fidelity, the final model achieved an "excellent" rating. In a blinded review of paired images (1 real, 1 synthetic), orthopaedic surgeon reviewers were unable to correctly identify which image was synthetic (accuracy = 55%, Kappa = 0.11), highlighting synthetic image fidelity. The synthetic and real images showed equivalent performance when they were assessed by established DL models. CONCLUSION: This work shows the ability to use a DL technique to generate a large volume of high-fidelity synthetic pelvis images not discernible from real imaging by computers or experts. These images can be used for cross-institutional sharing and model pretraining, further advancing the performance of DL models without risk to patient data safety. LEVEL OF EVIDENCE: Level III.
Subjects
Deep Learning; Humans; Artificial Intelligence; Privacy; Image Processing, Computer-Assisted/methods; Pelvis/diagnostic imaging
ABSTRACT
Glioblastoma (GBM) is the most common primary malignant brain tumor in adults. The standard treatment for GBM consists of surgical resection followed by concurrent chemoradiotherapy and adjuvant temozolomide. O-6-methylguanine-DNA methyltransferase (MGMT) promoter methylation status is an important prognostic biomarker that predicts the response to temozolomide and guides treatment decisions. At present, the only reliable way to determine MGMT promoter methylation status is through the analysis of tumor tissues. Considering the complications of the tissue-based methods, an imaging-based approach is preferred. This study aimed to compare three different deep learning-based approaches for predicting MGMT promoter methylation status. We obtained 576 T2-weighted images (T2WI) with their corresponding tumor masks and MGMT promoter methylation status from the Brain Tumor Segmentation (BraTS) 2021 dataset. We developed three different models: voxel-wise, slice-wise, and whole-brain. For voxel-wise classification, methylated and unmethylated MGMT tumor masks were encoded as 1 and 2, respectively, with 0 for background. We converted each T2WI into 32 × 32 × 32 patches. We trained a 3D V-Net model for tumor segmentation. After inference, we reconstructed the whole brain volume based on the patches' coordinates. The final prediction of MGMT methylation status was made by majority voting among the predicted voxel values of the largest connected component. For slice-wise classification, we trained an object detection model for tumor detection and MGMT methylation status prediction and then used majority voting for the final prediction. For the whole-brain approach, we trained a 3D DenseNet121 for prediction. Accuracy for the whole-brain, slice-wise, and voxel-wise approaches was 65.42% (SD 3.97%), 61.37% (SD 1.48%), and 56.84% (SD 4.38%), respectively.
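The majority-voting step used by the voxel-wise and slice-wise approaches can be sketched as below. It assumes the predicted labels have already been restricted to the largest connected component (or to the slices with a detected tumor) and uses the label encoding from the abstract. The function name and the tie-breaking rule are assumptions.

```python
from collections import Counter

# Label encoding stated in the abstract.
BACKGROUND = 0
METHYLATED = 1
UNMETHYLATED = 2


def majority_vote(labels, ignore=BACKGROUND):
    """Return the most frequent non-background label, or None if none exist.

    Ties are broken in favor of the smaller label value; this rule is an
    illustrative assumption, since the abstract does not describe one.
    """
    votes = Counter(label for label in labels if label != ignore)
    if not votes:
        return None
    best = max(votes.values())
    return min(label for label, count in votes.items() if count == best)


# Hypothetical voxel predictions for one tumor component.
status = majority_vote([0, 0, 1, 1, 1, 2, 2])
```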
Subjects
Brain Neoplasms; Deep Learning; Glioblastoma; Adult; Humans; Glioblastoma/diagnostic imaging; Glioblastoma/genetics; Glioblastoma/pathology; Temozolomide/therapeutic use; Brain Neoplasms/diagnostic imaging; Brain Neoplasms/genetics; Brain Neoplasms/pathology; DNA Methylation; Brain/diagnostic imaging; Magnetic Resonance Imaging/methods; O(6)-Methylguanine-DNA Methyltransferase/genetics; DNA Modification Methylases/genetics; Tumor Suppressor Proteins/genetics; DNA Repair Enzymes/genetics
ABSTRACT
Since 2000, there have been more than 8000 publications on radiology artificial intelligence (AI). AI breakthroughs allow complex tasks to be automated and even performed beyond human capabilities. However, the lack of details on the methods and algorithm code undercuts its scientific value. Many science subfields have recently faced a reproducibility crisis, eroding trust in processes and results, and influencing the rise in retractions of scientific papers. For the same reasons, conducting research in deep learning (DL) also requires reproducibility. Although several valuable manuscript checklists for AI in medical imaging exist, they are not focused specifically on reproducibility. In this study, we conducted a systematic review of recently published papers in the field of DL to evaluate if the description of their methodology could allow the reproducibility of their findings. We focused on the Journal of Digital Imaging (JDI), a specialized journal that publishes papers on AI and medical imaging. We used the keyword "Deep Learning" and collected the articles published between January 2020 and January 2022. We screened all the articles and included the ones which reported the development of a DL tool in medical imaging. We extracted the reported details about the dataset, data handling steps, data splitting, model details, and performance metrics of each included article. We found 148 articles. Eighty were included after screening for articles that reported developing a DL model for medical image analysis. Five studies have made their code publicly available, and 35 studies have utilized publicly available datasets. We provided figures to show the ratio and absolute count of reported items from included studies. According to our cross-sectional study, in JDI publications on DL in medical imaging, authors infrequently report the key elements of their study to make it reproducible.
Subjects
Artificial Intelligence; Diagnostic Imaging; Humans; Cross-Sectional Studies; Reproducibility of Results; Algorithms
ABSTRACT
BACKGROUND AND AIMS: The risk of progression in Barrett's esophagus (BE) increases with development of dysplasia. There is a critical need to improve the diagnosis of BE dysplasia, given substantial interobserver disagreement among expert pathologists and overdiagnosis of dysplasia by community pathologists. We developed a deep learning model to predict dysplasia grade on whole-slide imaging. METHODS: We digitized nondysplastic BE (NDBE), low-grade dysplasia (LGD), and high-grade dysplasia (HGD) histology slides. Two expert pathologists confirmed all histology and digitally annotated areas of dysplasia. Training, validation, and test sets were created (by a random 70/20/10 split). We used an ensemble approach combining a "you only look once" model to identify regions of interest and histology class (NDBE, LGD, or HGD) followed by a ResNet101 model pretrained on ImageNet applied to the regions of interest. Diagnostic performance was determined for the whole slide. RESULTS: We included slides from 542 patients (164 NDBE, 226 LGD, and 152 HGD) yielding 8596 bounding boxes in the training set, 1946 bounding boxes in the validation set, and 840 boxes in the test set. When the ensemble model was used, sensitivity and specificity for LGD were 81.3% and 100%, respectively, and both were >90% for NDBE and HGD. The combined positive predictive value and sensitivity metric (calculated as the F1 score) was .91 for NDBE, .90 for LGD, and 1.0 for HGD. CONCLUSIONS: We successfully trained and validated a deep learning model to accurately identify dysplasia on whole-slide images. This model can potentially help improve the histologic diagnosis of BE dysplasia and the appropriate application of endoscopic therapy.
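The F1 score is the harmonic mean of positive predictive value (PPV) and sensitivity. As a consistency check on the LGD figures, a specificity of 100% means no false positives, so PPV is 1.0 whenever at least one LGD slide is detected. The sketch below assumes the reported F1 was computed per class from the same sensitivity, which the abstract does not state explicitly; the function name is illustrative.

```python
def f1_score(ppv, sensitivity):
    """Harmonic mean of positive predictive value and sensitivity."""
    if ppv + sensitivity == 0:
        return 0.0  # convention when there are no true positives
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# LGD: sensitivity 81.3%, and specificity 100% implies PPV = 1.0.
lgd_f1 = f1_score(ppv=1.0, sensitivity=0.813)
```

Under these assumptions the value is about 0.897, which agrees with the reported F1 of .90 for LGD after rounding.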
Subjects
Adenocarcinoma; Barrett Esophagus; Deep Learning; Esophageal Neoplasms; Humans; Barrett Esophagus/diagnosis; Barrett Esophagus/pathology; Esophageal Neoplasms/pathology; Adenocarcinoma/pathology; Disease Progression; Hyperplasia
ABSTRACT
INTRODUCTION: Glioblastomas (GBMs) are highly aggressive tumors. A common clinical challenge after standard of care treatment is differentiating tumor progression from treatment-related changes, also known as pseudoprogression (PsP). Usually, PsP resolves or stabilizes without further treatment or a course of steroids, whereas true progression (TP) requires more aggressive management. Differentiating PsP from TP will affect the patient's outcome. This study investigated using deep learning to distinguish PsP MRI features from progressive disease. METHOD: We included GBM patients with a new or increasingly enhancing lesion within the original radiation field. We labeled those who subsequently were stable or improved on imaging and clinically as PsP and those with clinical and imaging deterioration as TP. A subset of subjects underwent a second resection. We labeled these subjects as PsP or TP based on the histological diagnosis. We coregistered contrast-enhanced T1 MRIs with T2-weighted images for each patient and used them as input to a 3-D Densenet121 model, with five-fold cross-validation, to predict TP vs PsP. RESULT: We included 124 patients who met the criteria, and of those, 63 were PsP and 61 were TP. We trained a deep learning model that achieved 76.4% (range 70-84%, SD 5.122) mean accuracy over the 5 folds, 0.7560 (range 0.6553-0.8535, SD 0.069) mean AUROC, 88.72% (SD 6.86) mean sensitivity, and 62.05% (SD 9.11) mean specificity. CONCLUSION: We report the development of a deep learning model that distinguishes PsP from TP in GBM patients treated per the Stupp protocol. Further refinement and external validation are required prior to widespread adoption in clinical practice.
Subjects
Brain Neoplasms; Deep Learning; Glioblastoma; Disease Progression; Humans; Magnetic Resonance Imaging; Retrospective Studies
ABSTRACT
OBJECTIVE: To evaluate the performance of the automated abstract screening tool Rayyan. METHODS: The records obtained from the search for three systematic reviews were manually screened in four stages. At the end of each stage, Rayyan was used to predict the eligibility score for the remaining records. At two different thresholds (≤2.5 and <2.5 for exclusion of a record), Rayyan-generated ratings were compared with the decisions made by human reviewers in the manual screening process, and the tool's accuracy metrics were calculated. RESULTS: Two thousand fifty-four records were screened manually, of which 379 were judged to be eligible for full-text assessment, and 112 were eventually included in the final review. For finding records eligible for full-text assessment, at the threshold of <2.5 for exclusion, Rayyan achieved sensitivity values of 97-99% with specificity values of 19-58%, while at the threshold of ≤2.5 for exclusion it had a specificity of 100% with sensitivity values of 1-29%. For the task of finding eligible reports for inclusion in the final review, similar results were obtained. DISCUSSION: At the threshold of <2.5 for exclusion, Rayyan was a reliable tool for excluding ineligible records, but it was not very reliable for finding eligible records. We emphasize that this study was conducted on diagnostic test accuracy reviews, which are more difficult to screen due to inconsistent terminology.
Subjects
Diagnostic Tests, Routine; Research; Delivery of Health Care; Humans
ABSTRACT
BACKGROUND: Stroke is one of the leading causes of disability worldwide. Recently, stroke prognosis estimation has received much attention. This study investigates the prognostic role of aspartate transaminase/alanine transaminase (De Ritis, AAR), alkaline phosphatase/alanine transaminase (ALP/ALT), and aspartate transaminase/alkaline phosphatase (AST/ALP) ratios in acute ischemic stroke (AIS). METHODS: This retrospective cohort study involved patients who experienced their first-ever AIS between September 2019 and June 2021. Clinical and laboratory data were collected within the first 24 hours after admission. Functional and mortality outcomes were evaluated 90 days after hospital discharge in clinical follow-up. Functional outcome was assessed by the modified Rankin Scale (mRS). The correlation between the laboratory data and study outcomes was evaluated using univariate analysis. In addition, regression models were developed to evaluate the predictive role of AST/ALP, ALP/ALT, and AAR ratios on the study outcomes. RESULTS: Two hundred seventy-seven patients (mean age 69.10 ± 13.55, 53.1% female) were included. According to univariate analysis, there was a weak association between 3-month mRS and both AST/ALT (r = 0.222, P < 0.001) and AST/ALP (r = 0.164, P = 0.008). In addition, higher levels of these ratios and absolute values of AST, ALT, and ALP were reported in deceased patients. Based on regression models adjusted for covariates (age, gender, underlying disease, and history of smoking), AST/ALT and AST/ALP ratios had a significant independent association with 3-month mRS (CI: 1.37-4.52, p = 0.003, and CI: 4.45-11,547.32, p = 0.007, respectively) and mortality (CI: 0.17-1.06, adjusted R2 = 0.21, p = 0.007, and CI: 0.10-2.91, p = 0.035, adjusted R2 = 0.20, respectively). CONCLUSIONS: Elevated AST/ALP and AAR ratios at admission were correlated with poorer outcomes at 3 months in patients with first-ever AIS.
Prospective studies in larger cohorts are required to confirm our findings and to evaluate further whether the AST/ALP and De Ritis ratios may represent a useful tool for determining the prognosis of AIS patients.
Subjects
Ischemic Stroke; Stroke; Humans; Female; Male; Ischemic Stroke/diagnosis; Alkaline Phosphatase; Alanine Transaminase; Prospective Studies; Prognosis; Retrospective Studies; Aspartate Aminotransferases; Stroke/diagnosis
ABSTRACT
Trustworthiness is crucial for artificial intelligence (AI) models in clinical settings, and a fundamental aspect of trustworthy AI is uncertainty quantification (UQ). Conformal prediction as a robust uncertainty quantification (UQ) framework has been receiving increasing attention as a valuable tool in improving model trustworthiness. An area of active research is the method of non-conformity score calculation for conformal prediction. We propose deep conformal supervision (DCS), which leverages the intermediate outputs of deep supervision for non-conformity score calculation, via weighted averaging based on the inverse of mean calibration error for each stage. We benchmarked our method on two publicly available datasets focused on medical image classification: a pneumonia chest radiography dataset and a preprocessed version of the 2019 RSNA Intracranial Hemorrhage dataset. Our method achieved mean coverage errors of 16e-4 (CI: 1e-4, 41e-4) and 5e-4 (CI: 1e-4, 10e-4) compared to baseline mean coverage errors of 28e-4 (CI: 2e-4, 64e-4) and 21e-4 (CI: 8e-4, 3e-4) on the two datasets, respectively (p < 0.001 on both datasets). Based on our findings, the baseline results of conformal prediction already exhibit small coverage errors. However, our method shows a significant improvement on coverage error, particularly noticeable in scenarios involving smaller datasets or when considering smaller acceptable error levels, which are crucial in developing UQ frameworks for healthcare AI applications.
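The weighting idea described above, averaging the intermediate outputs with weights based on the inverse of each stage's mean calibration error, can be sketched as follows. This is one plausible reading of the abstract rather than the published method: the function names, the normalization of the weights to sum to 1, and the small `eps` guard are assumptions.

```python
def inverse_error_weights(stage_errors, eps=1e-12):
    """Weight each deep-supervision stage by the inverse of its mean
    calibration error, normalized so the weights sum to 1.

    eps is a hypothetical guard against division by zero.
    """
    inverse = [1.0 / (error + eps) for error in stage_errors]
    total = sum(inverse)
    return [value / total for value in inverse]


def combined_nonconformity(stage_scores, weights):
    """Weighted average of the per-stage nonconformity scores for one case."""
    return sum(w * s for w, s in zip(weights, stage_scores))


# Hypothetical calibration errors for three stages (deepest stage last).
weights = inverse_error_weights([0.20, 0.10, 0.05])
score = combined_nonconformity([0.6, 0.4, 0.3], weights)
```

Stages with lower calibration error contribute more to the combined score, which would then feed the usual conformal calibration step.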
ABSTRACT
OBJECTIVES: Research has demonstrated that chronic stress experienced early in life can lead to impairments in memory and learning. These deficits are attributed to an imbalance in the interaction between glucocorticoids, the end product of the hypothalamic-pituitary-adrenal (HPA) axis, and glucocorticoid receptors in brain regions responsible for mediating memory, such as the hippocampus. This imbalance can result in detrimental conditions like neuroinflammation. The aim of this study was to assess the impact of sumatriptan, a selective agonist for 5-HT 1B/1D receptors, on fear learning capabilities in a chronic social isolation stress model in mice, with a particular focus on the role of the HPA axis. METHODS: Mice were assigned to two opposing conditions, including social condition (SC) and isolated condition (IC) for a duration of five weeks. All mice underwent passive avoidance test, with their subsequent freezing behavior serving as an indicator of fear retrieval. Mice in the IC group were administered either a vehicle, sumatriptan, GR-127935 (a selective antagonist for 5-HT 1B/1D receptors), or a combination of sumatriptan and GR-127935 during the testing sessions. At the end, all mice were sacrificed and samples of their serum and hippocampus were collected for further analysis. RESULTS: Isolation was found to significantly reduce freezing behavior (p<0.001). An increase in the freezing response among IC mice was observed following the administration of varying doses of sumatriptan, as indicated by a one-way ANOVA analysis (p<0.001). However, the mitigating effects of sumatriptan were reversed upon the administration of GR-127935. An ELISA assay conducted before and after the passive avoidance test revealed no significant change in serum corticosterone levels among SC mice. In contrast, a significant increase was observed among IC mice, suggesting hyper-responsiveness of the HPA axis in isolated animals. 
This hyper-responsiveness was ameliorated following the administration of sumatriptan. Furthermore, both the sumatriptan and SC groups exhibited a similar trend, showing a significant increase in the expression of hippocampal glucocorticoid receptors following the stress of the passive avoidance test. Lastly, the elevated production of inflammatory cytokines (TNF-α, IL-1ß) observed following social isolation was attenuated in the sumatriptan group. CONCLUSION: Sumatriptan improved fear learning probably through modulation of HPA axis and hippocampus neuroinflammation.
Subjects
Hypothalamo-Hypophyseal System; Sumatriptan; Mice; Animals; Hypothalamo-Hypophyseal System/metabolism; Sumatriptan/pharmacology; Sumatriptan/metabolism; Receptors, Glucocorticoid/metabolism; Serotonin/metabolism; Neuroinflammatory Diseases; Pituitary-Adrenal System/metabolism; Corticosterone; Stress, Psychological/metabolism; Social Isolation; Fear
ABSTRACT
Introduction: Dual-energy CT (DECT) is a non-invasive way to determine the presence of monosodium urate (MSU) crystals in the workup of gout. Color-coding distinguishes MSU from calcium following material decomposition and post-processing. Manually identifying these foci (most commonly labeled green) is tedious, and an automated detection system could streamline the process. This study aims to evaluate the impact of a deep-learning (DL) algorithm developed for detecting green pixelations on DECT on reader time, accuracy, and confidence. Methods: We collected a sample of positive and negative DECTs, each reviewed twice, once with and once without the DL tool, with a 2-week washout period. An attending musculoskeletal radiologist and a fellow separately reviewed the cases, simulating clinical workflow. Metrics such as time taken, confidence in diagnosis, and the tool's helpfulness were recorded and statistically analyzed. Results: We included thirty DECTs from different patients. The DL tool significantly reduced the reading time for the trainee radiologist (p = 0.02), but not for the attending radiologist (p = 0.15). Diagnostic confidence remained unchanged for both (p = 0.45). However, the DL model identified tiny MSU deposits that led to a change in diagnosis in two cases for the in-training radiologist and one case for the attending radiologist. In 3/3 of these cases, the diagnosis was correct when using DL. Conclusions: The implementation of the developed DL model slightly reduced reading time for our less experienced reader and led to improved diagnostic accuracy. There was no statistically significant difference in diagnostic confidence when studies were interpreted without and with the DL model.
ABSTRACT
BACKGROUND AND PURPOSE: Spontaneous intracranial hypotension is an increasingly recognized condition. Spontaneous intracranial hypotension is caused by a CSF leak, which is commonly related to a CSF-venous fistula. In patients with spontaneous intracranial hypotension, multiple intracranial abnormalities can be observed on brain MR imaging, including dural enhancement, "brain sag," and pituitary engorgement. This study seeks to create a deep learning model for the accurate diagnosis of CSF-venous fistulas via brain MR imaging. MATERIALS AND METHODS: A review of patients with clinically suspected spontaneous intracranial hypotension who underwent digital subtraction myelogram imaging preceded by brain MR imaging was performed. The patients were categorized as having a definite CSF-venous fistula, no fistula, or indeterminate findings on a digital subtraction myelogram. The data set was split into 5 folds at the patient level and stratified by label. A 5-fold cross-validation was then used to evaluate the reliability of the model. The predictive value of the model to identify patients with a CSF leak was assessed by using the area under the receiver operating characteristic curve for each validation fold. RESULTS: A total of 129 patients were included in this study. The median age was 54 years, and 66 (51.2%) had a CSF-venous fistula. In discriminating between positive and negative cases for CSF-venous fistulas, the classifier demonstrated an average area under the receiver operating characteristic curve of 0.8668 with a standard deviation of 0.0254 across the folds. CONCLUSIONS: This study developed a deep learning model that can predict the presence of a spinal CSF-venous fistula based on brain MR imaging in patients with suspected spontaneous intracranial hypotension. However, further model refinement and external validation are necessary before clinical adoption.
This research highlights the substantial potential of deep learning in diagnosing CSF-venous fistulas by using brain MR imaging.
Subjects
Abnormalities, Multiple; Deep Learning; Fistula; Intracranial Hypotension; Humans; Middle Aged; Brain/diagnostic imaging; Cerebrospinal Fluid Leak/diagnostic imaging; Cerebrospinal Fluid Leak/complications; Fistula/complications; Intracranial Hypotension/complications; Intracranial Hypotension/diagnostic imaging; Magnetic Resonance Imaging/methods; Myelography/methods; Reproducibility of Results
ABSTRACT
Autism spectrum disorder (ASD) represents a panel of conditions that begin during the developmental period and result in impairments of personal, social, academic, or occupational functioning. Early diagnosis is directly related to a better prognosis. Unfortunately, the diagnosis of ASD requires a long and exhausting subjective process. In this study, we aimed to review the state of the art in automated autism diagnosis and recognition. In February 2022, we searched multiple databases and sources of gray literature for eligible studies. We used an adapted version of the QUADAS-2 tool to assess the risk of bias in the studies. A brief report of the methods and results of each study is presented. Data were synthesized for each modality separately using the Split Component Synthesis (SCS) method. We assessed heterogeneity using the I² statistic and evaluated publication bias using trim and fill tests combined with ln DOR. Confidence in cumulative evidence was assessed using the GRADE approach for diagnostic studies. We included 344 studies with 186,020 participants (51,129 are estimated to be unique) for nine different modalities in this review, of which 232 reported sufficient data for meta-analysis. The area under the curve was in the range of 0.71-0.90 for all the modalities. The studies on EEG data provided the best accuracy, with the area under the curve ranging between 0.85 and 0.93. We found that the literature is rife with bias and methodological/reporting flaws. Recommendations are provided for future research to provide better studies and fill in the current knowledge gaps.
Subjects
Autism Spectrum Disorder; Autistic Disorder; Humans; Autism Spectrum Disorder/diagnosis; Artificial Intelligence
ABSTRACT
Background: Dual-energy CT (DECT) is a non-invasive way to determine the presence of monosodium urate (MSU) crystals in the workup of gout. Color-coding distinguishes MSU from calcium following material decomposition and post-processing. Most software labels MSU as green and calcium as blue. Current image processing methods for segmenting green-encoded pixels have limitations. Additionally, identifying green foci is tedious, and automated detection would improve workflow. This study aimed to determine the optimal deep learning (DL) algorithm for segmenting green-encoded pixels of MSU crystals on DECT. Methods: DECT images of positive and negative gout cases were retrospectively collected. The dataset was split into training (N = 28) and held-out test (N = 30) sets. For cross-validation, the training set was split into seven folds. The images were presented to two musculoskeletal radiologists, who independently identified green-encoded voxels. Two 3D U-Net-based DL models, SegResNet and SwinUNETR, were trained, and the Dice similarity coefficient (DSC), sensitivity, and specificity were reported as the segmentation metrics. Results: SegResNet showed superior performance, achieving a DSC of 0.9999 for the background pixels, 0.7868 for the green pixels, and an average DSC of 0.8934 across both pixel types. According to the post-processed results, SegResNet reached voxel-level sensitivity and specificity of 98.72% and 99.98%, respectively. Conclusion: In this study, we compared two DL-based segmentation approaches for detecting MSU deposits in a DECT dataset. SegResNet yielded superior performance metrics. The developed algorithm is a potentially fast, consistent, highly sensitive, and specific computer-aided diagnosis tool. Ultimately, such an algorithm could be used by radiologists to streamline DECT workflow and improve accuracy in the detection of gout.
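The segmentation metrics named above can be sketched in a few lines. The example below is a minimal illustration on flattened binary label sequences, assuming 1 marks a green-encoded (MSU) voxel and 0 marks background; the function names and the toy labels are hypothetical and are not taken from the study's code.

```python
def confusion_counts(pred, truth, positive=1):
    """Count TP, FP, FN, TN over two equal-length label sequences."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        p_pos, t_pos = (p == positive), (t == positive)
        if p_pos and t_pos:
            tp += 1
        elif p_pos:
            fp += 1
        elif t_pos:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def dice(pred, truth, positive=1):
    """Dice similarity coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, truth, positive)
    denom = 2 * tp + fp + fn
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2 * tp / denom


def sensitivity(pred, truth, positive=1):
    """TP / (TP + FN); NaN when there are no positive voxels in truth."""
    tp, _, fn, _ = confusion_counts(pred, truth, positive)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity(pred, truth, positive=1):
    """TN / (TN + FP); NaN when there are no negative voxels in truth."""
    _, fp, _, tn = confusion_counts(pred, truth, positive)
    return tn / (tn + fp) if (tn + fp) else float("nan")


# Toy flattened masks (hypothetical): 1 = green-encoded voxel, 0 = background.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]

green_dsc = dice(pred, truth, positive=1)
background_dsc = dice(pred, truth, positive=0)
print("green DSC:", round(green_dsc, 4))
print("background DSC:", round(background_dsc, 4))
print("mean DSC:", round((green_dsc + background_dsc) / 2, 4))
print("sensitivity:", round(sensitivity(pred, truth), 4))
print("specificity:", round(specificity(pred, truth), 4))
```

Because background voxels vastly outnumber green voxels in a real scan, the background DSC sits near 1 almost regardless of model quality, which is why the green-class DSC is the more informative of the two.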
ABSTRACT
In recent years, the role of artificial intelligence (AI) in medical imaging has become increasingly prominent; as of 2023, the majority of FDA-approved AI applications were in imaging and radiology. The surge in AI model development to tackle clinical challenges underscores the need to prepare high-quality medical imaging data. Proper data preparation is crucial because it fosters standardized and reproducible AI models while minimizing biases. Data curation transforms raw data into a valuable, organized, and dependable resource and is fundamental to the success of machine learning and analytical projects. Given the plethora of tools available for the different stages of data curation, it is crucial to stay informed about the most relevant tools within specific research areas. In the current work, we outline the steps of data curation and, for each step, compile tools identified through a survey of members of the Society for Imaging Informatics in Medicine (SIIM). This collection has the potential to help researchers select the most appropriate tool for their specific tasks.
Subjects
Artificial Intelligence , Diagnostic Imaging , Diagnostic Imaging/methods , Humans , Medical Societies , Medical Informatics/methods , Surveys and Questionnaires , Data Curation/methods , Machine Learning
ABSTRACT
The application of deep learning (DL) in medicine introduces transformative tools with the potential to enhance prognosis, diagnosis, and treatment planning. However, transparent documentation is essential if researchers are to reproduce and refine these techniques. Our study addresses the unique challenges presented by DL in medical imaging by developing a comprehensive checklist using the Delphi method to enhance reproducibility and reliability in this dynamic field. We compiled a preliminary checklist based on a comprehensive review of existing checklists and relevant literature. A panel of 11 experts in medical imaging and DL assessed these items using Likert scales, with two survey rounds to refine responses and gauge consensus. We also employed the content validity ratio with a cutoff of 0.59 to determine item face and content validity. Round 1 included a 27-item questionnaire, with 12 items demonstrating high consensus for face and content validity that were then left out of round 2. Round 2 involved refining the checklist, resulting in an additional 17 items. In the last round, 3 items were deemed non-essential or infeasible, while 2 newly suggested items received unanimous agreement for inclusion, resulting in a final 26-item DL model reporting checklist derived from the Delphi process. The 26-item checklist facilitates reproducible reporting of DL tools and helps other researchers replicate a study's results.
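The content validity ratio mentioned above is commonly computed with Lawshe's formula, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating an item essential and N is the panel size. The sketch below applies that formula with the 0.59 cutoff and 11-member panel from the abstract. The mapping from Likert ratings to "essential" (a rating of 4 or higher on a 5-point scale) is an assumption for illustration, as are the function names and example ratings.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to 1."""
    if n_experts <= 0 or not 0 <= n_essential <= n_experts:
        raise ValueError("need 0 <= n_essential <= n_experts and n_experts > 0")
    half = n_experts / 2
    return (n_essential - half) / half


def item_retained(ratings, essential_threshold=4, cutoff=0.59):
    """Decide whether an item meets the CVR cutoff.

    Assumption: a Likert rating >= essential_threshold counts as
    'essential'; the abstract does not state how ratings were mapped.
    """
    n_essential = sum(1 for r in ratings if r >= essential_threshold)
    return content_validity_ratio(n_essential, len(ratings)) >= cutoff


# Hypothetical ratings from an 11-member panel for two checklist items.
item_a = [5, 5, 4, 4, 4, 5, 5, 4, 4, 2, 3]  # 9 of 11 rate it essential
item_b = [5, 5, 4, 4, 4, 5, 5, 4, 3, 2, 3]  # 8 of 11 rate it essential

print("item A retained:", item_retained(item_a))
print("item B retained:", item_retained(item_b))
```

With 11 panelists, 9 "essential" ratings give a CVR of about 0.64, which clears the 0.59 cutoff, while 8 give about 0.45, which does not.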