ABSTRACT
The fairness of artificial intelligence and machine learning models has long been a concern, with model bias often caused by imbalanced datasets. While many efforts aim to minimize model bias, this study suggests that traditional fairness evaluation methods may themselves be biased, highlighting the need for a proper evaluation scheme with multiple evaluation metrics, since results can vary under different criteria. Moreover, the limited data size of minority groups introduces significant data uncertainty, which can undermine judgments of fairness. This paper introduces an innovative evaluation approach that estimates the data uncertainty in minority groups by bootstrapping from majority groups, enabling a more objective statistical assessment. Extensive experiments reveal that traditional evaluation methods might have drawn inaccurate conclusions about model fairness. The proposed method delivers an unbiased fairness assessment by adeptly addressing the inherent complications of model evaluation on imbalanced datasets. The results show that such a comprehensive evaluation can provide more confidence when adopting those models.
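To make the bootstrapping idea concrete, below is a minimal Python sketch (the function name, the choice of accuracy as the metric, and the comparison logic are illustrative assumptions, not the authors' implementation): majority-group predictions are repeatedly resampled at the minority-group size, and the resulting metric distribution shows how much of an observed group gap could be explained by small-sample uncertainty alone.

```python
import numpy as np
from sklearn.metrics import accuracy_score

def bootstrap_fairness_check(y_true_maj, y_pred_maj, y_true_min, y_pred_min,
                             n_boot=1000, seed=0):
    """Hypothetical sketch: compare the minority-group accuracy against the
    distribution of majority-group accuracies computed on bootstrap subsets
    of the same (small) size as the minority group."""
    y_true_maj, y_pred_maj = np.asarray(y_true_maj), np.asarray(y_pred_maj)
    rng = np.random.default_rng(seed)
    n_min = len(y_true_min)                       # minority-group sample size
    acc_min = accuracy_score(y_true_min, y_pred_min)

    boot_accs = []
    for _ in range(n_boot):
        idx = rng.choice(len(y_true_maj), size=n_min, replace=True)
        boot_accs.append(accuracy_score(y_true_maj[idx], y_pred_maj[idx]))
    boot_accs = np.asarray(boot_accs)

    # Fraction of bootstrap majority subsets that score no better than the
    # minority group; a large value suggests the gap may be sampling noise.
    p_value = float(np.mean(boot_accs <= acc_min))
    return acc_min, float(boot_accs.mean()), p_value
```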
ABSTRACT
Alzheimer's disease (AD) is the most prevalent neurodegenerative disease, yet current treatments are limited to slowing disease progression. Moreover, the effectiveness of these treatments remains uncertain due to the heterogeneity of the disease. Therefore, it is essential to identify disease subtypes at a very early stage. Current data-driven approaches can classify subtypes during later stages of AD or related disorders, but making predictions in the asymptomatic or prodromal stage is challenging. Furthermore, the classifications of most existing models lack explainability, and these models rely solely on a single modality for assessment, limiting the scope of their analysis. Thus, we propose a multimodal framework that uses early-stage indicators, including imaging, genetics, and clinical assessments, to classify AD patients into progression-specific subtypes at an early stage. In our framework, we introduce a tri-modal co-attention mechanism (Tri-COAT) to explicitly capture cross-modal feature associations. Data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) (slow progressing = 177, intermediate = 302, and fast = 15) were used to train and evaluate Tri-COAT with a 10-fold stratified cross-testing approach. Our proposed model outperforms baseline models and sheds light on essential associations across multimodal features supported by known biological mechanisms. The multimodal design behind Tri-COAT allows it to achieve the highest classification area under the receiver operating characteristic curve while simultaneously providing interpretability to the model predictions through the co-attention mechanism.
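As a rough illustration of what a tri-modal co-attention block might look like, the following PyTorch sketch lets each modality's tokens attend over the other two modalities' tokens; the class name, dimensions, and the specific attention wiring are assumptions for illustration and do not reproduce the published Tri-COAT architecture.

```python
import torch
import torch.nn as nn

class TriModalCoAttention(nn.Module):
    """Hypothetical sketch of tri-modal co-attention: imaging, genetic, and
    clinical token embeddings each query the concatenation of the other two."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.ModuleDict({
            m: nn.MultiheadAttention(dim, heads, batch_first=True)
            for m in ("img", "gen", "clin")
        })

    def forward(self, img, gen, clin):
        # img/gen/clin: (batch, tokens_m, dim) embeddings for each modality
        ctx = {"img": torch.cat([gen, clin], dim=1),
               "gen": torch.cat([img, clin], dim=1),
               "clin": torch.cat([img, gen], dim=1)}
        out = {}
        for name, x in (("img", img), ("gen", gen), ("clin", clin)):
            attended, _ = self.attn[name](x, ctx[name], ctx[name])
            out[name] = attended     # cross-modally refined features per modality
        return out
```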
ABSTRACT
Purpose To determine whether saliency maps in radiology artificial intelligence (AI) are vulnerable to subtle perturbations of the input, which could lead to misleading interpretations, using prediction-saliency correlation (PSC) to evaluate the sensitivity and robustness of saliency methods. Materials and Methods In this retrospective study, locally trained deep learning models and a research prototype provided by a commercial vendor were systematically evaluated on 191 229 chest radiographs from the CheXpert dataset and 7022 MR images from a human brain tumor classification dataset. Two radiologists performed a reader study on 270 chest radiograph pairs. A model-agnostic approach for computing the PSC coefficient was used to evaluate the sensitivity and robustness of seven commonly used saliency methods. Results The saliency methods had low sensitivity (maximum PSC, 0.25; 95% CI: 0.12, 0.38) and weak robustness (maximum PSC, 0.12; 95% CI: 0.0, 0.25) on the CheXpert dataset, as demonstrated by leveraging locally trained model parameters. Further evaluation showed that, even without knowledge of the model specifics, the saliency maps generated from a commercial prototype could be irrelevant to the model output (the area under the receiver operating characteristic curve decreased by 8.6% without affecting the saliency map). The human observer studies confirmed that it is difficult for experts to identify the perturbed images; the experts identified them with less than 44.8% accuracy. Conclusion Popular saliency methods scored low PSC values on the two datasets of perturbed chest radiographs, indicating weak sensitivity and robustness. The proposed PSC metric provides a valuable quantification tool for validating the trustworthiness of medical AI explainability. Keywords: Saliency Maps, AI Trustworthiness, Dynamic Consistency, Sensitivity, Robustness Supplemental material is available for this article. © RSNA, 2023 See also the commentary by Yanagawa and Sato in this issue.
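One plausible way to compute a prediction-saliency correlation is sketched below: perturb the input several times, record how much the prediction changes and how much the saliency map changes, and correlate the two. The helper names, the additive-noise perturbations, and the Pearson-based map-change measure are assumptions for illustration; the paper's exact PSC definition may differ.

```python
import numpy as np
from scipy.stats import pearsonr

def psc_coefficient(predict, saliency, image, perturbations):
    """Hypothetical PSC sketch. `predict(x)` returns a scalar probability for
    the class of interest; `saliency(x)` returns a saliency map as an array."""
    base_pred = predict(image)
    base_sal = saliency(image).ravel()

    d_pred, d_sal = [], []
    for delta in perturbations:                  # e.g., small noise patterns
        x = image + delta
        d_pred.append(abs(predict(x) - base_pred))
        sal = saliency(x).ravel()
        # 1 - Pearson correlation between maps as a saliency-change measure
        d_sal.append(1.0 - pearsonr(base_sal, sal)[0])

    r, _ = pearsonr(d_pred, d_sal)               # correlation across perturbations
    return r
```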
Subjects
Artificial Intelligence, Radiology, Humans, Retrospective Studies, Radiography, Radiologists
ABSTRACT
Fusing intraoperative 2-D ultrasound (US) frames with preoperative 3-D magnetic resonance (MR) images for guiding interventions has become the clinical gold standard in image-guided prostate cancer biopsy. However, developing an automatic image registration system for this application is challenging because of the modality gap between US/MR and the dimensionality gap between 2-D/3-D data. To overcome these challenges, we propose a novel US frame-to-volume registration (FVReg) pipeline to bridge the dimensionality gap between 2-D US frames and 3-D US volumes. The pipeline is implemented with deep neural networks and is fully automatic, requiring no external tracking devices. The framework consists of three major components: (1) a frame-to-frame registration network (Frame2Frame) that estimates the current frame's 3-D spatial position based on previous video context, (2) a frame-to-slice correction network (Frame2Slice) that adjusts the estimated frame position using the 3-D US volumetric information, and (3) a similarity filtering (SF) mechanism that selects the candidate with the highest image similarity to the query frame. We validated our method on a clinical dataset with 618 subjects and tested its potential on real-time 2-D-US to 3-D-MR fusion navigation tasks. The proposed FVReg achieved an average target navigation error of 1.93 mm at 5-14 fps. Our source code is publicly available at https://github.com/DIAL-RPI/Frame-to-Volume-Registration.
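The similarity filtering (SF) step lends itself to a compact sketch: among candidate slices resampled from the 3-D US volume around the estimated pose, keep the one most similar to the 2-D query frame. Normalized cross-correlation is used here only as an assumed similarity measure; the released code may use a different criterion.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized 2-D images."""
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float((a * b).mean())

def similarity_filtering(query_frame, candidate_slices):
    """Hypothetical sketch of SF: pick the candidate slice that best matches
    the query US frame."""
    scores = [ncc(query_frame, s) for s in candidate_slices]
    best = int(np.argmax(scores))
    return best, scores[best]
```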
Subjects
Deep Learning, Prostatic Neoplasms, Male, Humans, Three-Dimensional Imaging/methods, Ultrasonography, Prostatic Neoplasms/diagnostic imaging, Prostatic Neoplasms/surgery, Neural Networks (Computer)
ABSTRACT
In the past several years, various adversarial training (AT) approaches have been developed to robustify deep learning models against adversarial attacks. However, mainstream AT methods assume that the training and testing data are drawn from the same distribution and that the training data are annotated. When these two assumptions are violated, existing AT methods fail because they either cannot pass knowledge learnt from a source domain to an unlabeled target domain or are confused by the adversarial samples in that unlabeled space. In this paper, we first point out this new and challenging problem: adversarial training in an unlabeled target domain. We then propose a novel framework named Unsupervised Cross-domain Adversarial Training (UCAT) to address this problem. UCAT effectively leverages the knowledge of the labeled source domain to prevent adversarial samples from misleading the training process, under the guidance of automatically selected high-quality pseudo labels of the unannotated target domain data together with the discriminative and robust anchor representations of the source domain data. Experiments on four public benchmarks show that models trained with UCAT achieve both high accuracy and strong robustness. The effectiveness of the proposed components is demonstrated through a large set of ablation studies. The source code is publicly available at https://github.com/DIAL-RPI/UCAT.
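A minimal training-step sketch in the spirit of UCAT is given below: adversarial examples are crafted with PGD for the labeled source batch, and only target samples whose pseudo labels exceed a confidence threshold contribute to the adversarial loss. The threshold, the loss weighting, and the omission of the anchor-representation component are simplifying assumptions; see the released code for the actual method.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard PGD attack used to craft adversarial examples."""
    x_adv = x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv.detach() + alpha * grad.sign()).clamp(x - eps, x + eps)
    return x_adv.detach()

def ucat_style_step(model, optimizer, x_src, y_src, x_tgt, conf_thresh=0.9):
    """Hypothetical sketch of one cross-domain adversarial training step:
    source labels plus high-confidence target pseudo labels guide AT."""
    with torch.no_grad():
        probs = F.softmax(model(x_tgt), dim=1)
        conf, pseudo = probs.max(dim=1)
        keep = conf > conf_thresh                 # select reliable pseudo labels

    loss = F.cross_entropy(model(pgd_attack(model, x_src, y_src)), y_src)
    if keep.any():
        x_adv_tgt = pgd_attack(model, x_tgt[keep], pseudo[keep])
        loss = loss + F.cross_entropy(model(x_adv_tgt), pseudo[keep])

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```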
ABSTRACT
Transrectal ultrasound is commonly used for guiding prostate cancer biopsy, where 3D ultrasound volume reconstruction is often desired. Current methods for 3D reconstruction from freehand ultrasound scans require external tracking devices to provide spatial information about the ultrasound transducer. This paper presents a novel deep learning approach for sensorless ultrasound volume reconstruction, which efficiently exploits content correspondence between ultrasound frames to reconstruct 3D volumes without external tracking. The underlying deep learning model, the deep contextual-contrastive network (DC²-Net), utilizes self-attention to focus on the speckle-rich areas to estimate spatial movement and then minimizes a margin ranking loss for contrastive feature learning. A case-wise correlation loss over the entire input video helps further smooth the estimated trajectory. We train and validate DC²-Net on two independent datasets, one containing 619 transrectal scans and the other containing 100 transperineal scans. Our proposed approach attained superior performance compared with other methods, with a drift rate of 9.64% and a prostate Dice of 0.89. The promising results demonstrate the capability of deep neural networks for universal ultrasound volume reconstruction from freehand 2D ultrasound scans without tracking information.
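The case-wise correlation loss mentioned above can be illustrated with a short sketch: over all frames of a scan, the predicted inter-frame motion parameters are encouraged to correlate with the ground truth, degree of freedom by degree of freedom. The exact formulation in DC²-Net may differ; this is an assumed Pearson-style variant.

```python
import torch

def case_correlation_loss(pred_params, true_params, eps=1e-8):
    """Hypothetical case-wise correlation loss: pred/true are (frames, dof)
    tensors of motion parameters for one scan; the loss is 0 when each degree
    of freedom is perfectly correlated with the ground truth over the scan."""
    pred_c = pred_params - pred_params.mean(dim=0, keepdim=True)
    true_c = true_params - true_params.mean(dim=0, keepdim=True)
    corr = (pred_c * true_c).sum(dim=0) / (
        pred_c.norm(dim=0) * true_c.norm(dim=0) + eps)
    return (1.0 - corr).mean()
```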
Subjects
Three-Dimensional Imaging, Neural Networks (Computer), Male, Humans, Three-Dimensional Imaging/methods, Ultrasonography/methods, Prostate/diagnostic imaging, Movement
ABSTRACT
In the past few years, convolutional neural networks (CNNs) have proven powerful in extracting image features crucial for medical image registration. However, challenging applications and recent advances in computer vision suggest that CNNs are limited in their ability to understand the spatial correspondence between features, which is at the core of image registration. The issue is further exacerbated in multi-modal image registration, where the appearances of the input images can differ significantly. This paper presents a novel cross-modal attention mechanism for correlating features extracted from the multi-modal input images and mapping such correlation to the image registration transformation. To efficiently train the developed network, a contrastive learning-based pre-training method is also proposed to aid the network in extracting high-level features across the input modalities for the subsequent cross-modal attention learning. We validated the proposed method on transrectal ultrasound (TRUS) to magnetic resonance (MR) registration, a clinically important procedure that benefits prostate cancer biopsy. Our experimental results demonstrate that for MR-TRUS registration, a deep neural network embedded with the cross-modal attention block outperforms other advanced CNN-based networks ten times its size. We also incorporated visualization techniques to improve the interpretability of our network, which helps bring insights into deep learning-based image registration methods. The source code of our work is available at https://github.com/DIAL-RPI/Attention-Reg.
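A bare-bones version of a cross-modal attention block is sketched below: flattened MR feature voxels query flattened TRUS feature voxels, so the output encodes their spatial correspondence. The channel counts and the use of nn.MultiheadAttention are assumptions for illustration rather than the published Attention-Reg design.

```python
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Hypothetical cross-modal attention: one modality's 3-D feature map
    attends to the other's to expose feature correspondence for registration."""
    def __init__(self, channels=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, feat_mr, feat_us):
        # feat_mr, feat_us: (batch, channels, D, H, W) feature maps
        q = feat_mr.flatten(2).transpose(1, 2)    # (batch, voxels, channels)
        kv = feat_us.flatten(2).transpose(1, 2)
        corr, _ = self.attn(q, kv, kv)            # MR voxels attend to US voxels
        return corr.transpose(1, 2).reshape(feat_mr.shape)
```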
Subjects
Prostate, Prostatic Neoplasms, Humans, Male, Prostate/diagnostic imaging, Neural Networks (Computer), Magnetic Resonance Imaging/methods, Prostatic Neoplasms/pathology, Ultrasonography/methods
ABSTRACT
Significance: Functional near-infrared spectroscopy (fNIRS), a well-established neuroimaging technique, enables monitoring cortical activation while subjects are unconstrained. However, motion artifact is a common type of noise that can hamper the interpretation of fNIRS data. Current methods that have been proposed to mitigate motion artifacts in fNIRS data are still dependent on expert-based knowledge and the post hoc tuning of parameters. Aim: Here, we report a deep learning method that aims at motion artifact removal from fNIRS data while being assumption free. To the best of our knowledge, this is the first investigation to report on the use of a denoising autoencoder (DAE) architecture for motion artifact removal. Approach: To facilitate the training of this deep learning architecture, we (i) designed a specific loss function and (ii) generated data to mimic the properties of recorded fNIRS sequences. Results: The DAE model outperformed conventional methods in lowering residual motion artifacts, decreasing mean squared error, and increasing computational efficiency. Conclusion: Overall, this work demonstrates the potential of deep learning models for accurate and fast motion artifact removal in fNIRS data.
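For readers unfamiliar with the architecture family, a minimal 1-D convolutional denoising autoencoder for a single fNIRS channel is sketched below; the layer sizes and training target (clean vs. motion-corrupted pairs from simulated sequences) are assumptions and do not reproduce the published model or its loss function.

```python
import torch.nn as nn

class FNIRSDenoisingAE(nn.Module):
    """Hypothetical DAE sketch: maps a motion-corrupted fNIRS time series
    (batch, 1, time) to a denoised reconstruction of the same length."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=2, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=2, padding=4), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(32, 16, kernel_size=9, stride=2,
                               padding=4, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(16, 1, kernel_size=9, stride=2,
                               padding=4, output_padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Training would minimize a reconstruction loss (e.g., MSE) between the output
# and the artifact-free signal on simulated motion-corrupted sequences.
```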
ABSTRACT
Parkinson's disease (PD) is the second most common neurodegenerative disease and presents a complex etiology with genomic and environmental factors and no recognized cures. Genotype data, such as single nucleotide polymorphisms (SNPs), could be used as a prodromal factor for early detection of PD. However, the polygenic nature of PD presents a challenge as the complex relationships between SNPs towards disease development are difficult to model. Traditional assessment methods such as polygenic risk scores and machine learning approaches struggle to capture the complex interactions present in the genotype data, thus limiting their discriminative capabilities in diagnosis. On the other hand, deep learning models are better suited for this task. Nevertheless, they encounter difficulties of their own such as a lack of interpretability. To overcome these limitations, in this work, a novel transformer encoder-based model is introduced to classify PD patients from healthy controls based on their genotype. This method is designed to effectively model complex global feature interactions and enable increased interpretability through the learned attention scores. The proposed framework outperformed traditional machine learning models and multilayer perceptron (MLP) baseline models. Moreover, visualization of the learned SNP-SNP associations provides not only interpretability to the model but also valuable insights into the biochemical pathways underlying PD development, which are corroborated by pathway enrichment analysis. Our results suggest novel SNP interactions to be further studied in wet lab and clinical settings.
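The modeling idea can be sketched briefly: each SNP becomes a token, its genotype (0/1/2 minor-allele copies) is embedded, a learned per-SNP embedding is added, and transformer self-attention captures SNP-SNP interactions whose attention scores can later be inspected. The dimensions, pooling, and layer counts below are assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class SNPTransformerClassifier(nn.Module):
    """Hypothetical sketch of a transformer encoder over SNP genotypes."""
    def __init__(self, n_snps, dim=32, heads=4, layers=2, n_classes=2):
        super().__init__()
        self.genotype_emb = nn.Embedding(3, dim)           # 0, 1, or 2 copies
        self.snp_emb = nn.Parameter(torch.randn(1, n_snps, dim) * 0.02)
        enc_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                               batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.head = nn.Linear(dim, n_classes)

    def forward(self, genotypes):                          # (batch, n_snps) ints
        tokens = self.genotype_emb(genotypes) + self.snp_emb
        encoded = self.encoder(tokens)       # self-attention models SNP-SNP links
        return self.head(encoded.mean(dim=1))              # pooled classification
```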
ABSTRACT
Gait is a unique biometric feature that can be recognized at a distance; thus, it has broad applications in crime prevention, forensic identification, and social security. To portray a gait, existing gait recognition methods utilize either a gait template, which makes it difficult to preserve temporal information, or a gait sequence, which maintains unnecessary sequential constraints and thus loses the flexibility of gait recognition. In this paper, we present a novel perspective that treats gait as a deep set: a set of gait frames is integrated by a global-local fused deep network, inspired by the way our left and right hemispheres process information, to learn features that can be used for identification. Based on this deep set perspective, our method is immune to frame permutations and can naturally integrate frames from different videos that have been acquired under different scenarios, such as diverse viewing angles, different clothes, or different item-carrying conditions. Experiments show that under normal walking conditions, our single-model method achieves an average rank-1 accuracy of 96.1 percent on the CASIA-B gait dataset and an accuracy of 87.9 percent on the OU-MVLP gait dataset. Under various complex scenarios, our model also exhibits a high level of robustness: it achieves accuracies of 90.8 and 70.3 percent on CASIA-B under bag-carrying and coat-wearing walking conditions, respectively, significantly outperforming the best existing methods. Moreover, the proposed method maintains satisfactory accuracy even when only small numbers of frames are available in the test samples; for example, it achieves 85.0 percent on CASIA-B even when using only 7 frames. The source code has been released at https://github.com/AbnerHqC/GaitSet.
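The permutation-invariance property at the heart of the set view can be shown with a small sketch: each silhouette frame is encoded independently and the per-frame features are aggregated with an order-invariant max, so shuffling frames (or mixing frames from different videos) leaves the gait feature unchanged. This toy encoder is an assumption for illustration, not the GaitSet network.

```python
import torch.nn as nn

class SetPoolingGaitEncoder(nn.Module):
    """Hypothetical deep-set sketch: per-frame CNN features max-pooled over
    the frame set to obtain a permutation-invariant gait representation."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.frame_encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )

    def forward(self, frames):                # (batch, n_frames, 1, H, W)
        b, n = frames.shape[:2]
        per_frame = self.frame_encoder(frames.flatten(0, 1)).view(b, n, -1)
        return per_frame.max(dim=1).values    # invariant to frame order
```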
Subjects
Algorithms, Deep Learning, Gait, Software, Walking
ABSTRACT
Cancer patients have a higher risk of cardiovascular disease (CVD) mortality than the general population. Low-dose computed tomography (LDCT) for lung cancer screening offers an opportunity for simultaneous CVD risk estimation in at-risk patients. Our deep learning CVD risk prediction model, trained with 30,286 LDCT scans from the National Lung Screening Trial, achieves an area under the curve (AUC) of 0.871 on a separate test set of 2,085 subjects and identifies patients with high CVD mortality risk (AUC of 0.768). We validate our model against ECG-gated cardiac CT-based markers, including the coronary artery calcification (CAC) score, the CAD-RADS score, and the MESA 10-year risk score, on an independent dataset of 335 subjects. Our work shows that, in high-risk patients, deep learning can convert LDCT for lung cancer screening into a dual-screening quantitative tool for CVD risk estimation.
Subjects
Cardiovascular Diseases/epidemiology, Deep Learning, Image Processing (Computer-Assisted)/methods, Lung Neoplasms/diagnosis, Mass Screening/statistics & numerical data, Adult, Aged, Aged 80 and over, Cardiovascular Diseases/diagnosis, Cardiovascular Diseases/etiology, Clinical Trials as Topic, Coronary Vessels/diagnostic imaging, Datasets as Topic, Electrocardiography, Female, Follow-Up Studies, Humans, Lung/diagnostic imaging, Lung Neoplasms/complications, Male, Mass Screening/methods, Middle Aged, ROC Curve, Retrospective Studies, Risk Assessment/methods, Risk Assessment/statistics & numerical data, Risk Factors, X-Ray Computed Tomography/statistics & numerical data
ABSTRACT
PURPOSE: Severity scoring is a key step in managing patients with COVID-19 pneumonia. However, manual quantitative analysis by radiologists is a time-consuming task, while qualitative evaluation may be fast but highly subjective. This study aims to develop artificial intelligence (AI)-based methods to quantify disease severity and predict COVID-19 patient outcome. METHODS: We develop an AI-based framework that employs deep neural networks to efficiently segment lung lobes and pulmonary opacities. The volume ratio of pulmonary opacities inside each lung lobe gives the severity scores of the lobes, which are then used to predict ICU admission and mortality with three different machine learning methods. The developed methods were evaluated on datasets from two hospitals (site A: Firoozgar Hospital, Iran, 105 patients; site B: Massachusetts General Hospital, USA, 88 patients). RESULTS: AI-based severity scores are strongly associated with those evaluated by radiologists (Spearman's rank correlation 0.837, [Formula: see text]). Using AI-based scores produced significantly higher ([Formula: see text]) area under the ROC curve (AUC) values. The developed AI method achieved the best performance of AUC = 0.813 (95% CI [0.729, 0.886]) in predicting ICU admission and AUC = 0.741 (95% CI [0.640, 0.837]) in mortality estimation on the two datasets. CONCLUSIONS: Accurate severity scores can be obtained using the developed AI methods over chest CT images. The computed severity scores achieved better performance than radiologists in predicting COVID-19 patient outcome by consistently quantifying image features. Such developed techniques of severity assessment may be extended to other lung diseases beyond the current pandemic.
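The lobe-wise severity score described above reduces to a simple volume ratio, as the sketch below illustrates; the mask encodings (lobe labels 1-5, binary opacity mask) are assumptions about the segmentation outputs.

```python
import numpy as np

def lobe_severity_scores(lobe_mask, opacity_mask, n_lobes=5):
    """Sketch: per-lobe severity = volume of pulmonary opacity inside the lobe
    divided by the lobe volume, computed from voxel-wise segmentation masks."""
    scores = {}
    for lobe in range(1, n_lobes + 1):
        lobe_voxels = (lobe_mask == lobe)
        opacity_voxels = lobe_voxels & (opacity_mask > 0)
        scores[lobe] = float(opacity_voxels.sum()) / max(int(lobe_voxels.sum()), 1)
    return scores      # these ratios are the inputs to the outcome predictors
```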
Subjects
Artificial Intelligence, COVID-19/diagnostic imaging, Thorax/diagnostic imaging, Adult, Aged, Aged 80 and over, Factual Databases, Female, Hospitalization, Humans, Lung/diagnostic imaging, Male, Middle Aged, Neural Networks (Computer), Pandemics, Prognosis, Retrospective Studies, Severity of Illness Index, X-Ray Computed Tomography/methods, Treatment Outcome
ABSTRACT
While image analysis of chest computed tomography (CT) for COVID-19 diagnosis has been studied intensively, little work has been done on image-based patient outcome prediction. Managing high-risk patients with early intervention is key to lowering the fatality rate of COVID-19 pneumonia, as the majority of patients recover naturally. Therefore, an accurate prediction of disease progression from baseline imaging at the time of the initial presentation can help in patient management. Rather than relying only on the size and volume of pulmonary abnormalities extracted through deep learning-based image segmentation, here we combine radiomics of lung opacities with non-imaging features from demographic data, vital signs, and laboratory findings to predict the need for intensive care unit (ICU) admission. To our knowledge, this is the first study to use holistic patient information, including both imaging and non-imaging data, for outcome prediction. The proposed methods were thoroughly evaluated on datasets separately collected from three hospitals, one in the United States, one in Iran, and one in Italy, with a total of 295 patients with reverse transcription polymerase chain reaction (RT-PCR) assay-positive COVID-19 pneumonia. Our experimental results demonstrate that adding non-imaging features significantly improves prediction performance, achieving an AUC of up to 0.884 and a sensitivity as high as 96.1%, which can be valuable for clinical decision support in managing COVID-19 patients. Our methods may also be applied to other lung diseases, including but not limited to community-acquired pneumonia. The source code of our work is available at https://github.com/DIAL-RPI/COVID19-ICUPrediction.
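The fusion of imaging and non-imaging information can be illustrated with a deliberately simple sketch: radiomic features of the lung opacities are concatenated with clinical variables and fed to a standard classifier. The logistic-regression choice and feature handling are assumptions; the study evaluates its own set of models.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def train_icu_predictor(radiomics_feats, clinical_feats, icu_labels):
    """Hypothetical sketch: concatenate radiomic and non-imaging features
    (demographics, vital signs, labs) and fit an ICU-admission classifier."""
    X = np.hstack([np.asarray(radiomics_feats), np.asarray(clinical_feats)])
    clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    clf.fit(X, np.asarray(icu_labels))
    return clf
```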
Subjects
COVID-19/diagnostic imaging, Intensive Care Units/statistics & numerical data, Patient Admission/statistics & numerical data, Viral Pneumonia/diagnostic imaging, Adult, Aged, COVID-19/epidemiology, Datasets as Topic, Disease Progression, Female, Humans, Iran/epidemiology, Italy/epidemiology, Male, Middle Aged, Predictive Value of Tests, Prognosis, SARS-CoV-2, United States/epidemiology