Results 1 - 5 of 5
1.
J Digit Imaging; 35(6): 1445-1462, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35819537

ABSTRACT

The limited availability of medical imaging datasets is a critical obstacle when using "data-hungry" deep learning to gain performance improvements. To deal with this issue, transfer learning has become a de facto standard, where a convolutional neural network (CNN) pre-trained on natural images (e.g., ImageNet) is fine-tuned on medical images. Meanwhile, pre-trained transformers, which are self-attention-based models, have become the de facto standard in natural language processing (NLP) and the state of the art in image classification due to their powerful transfer-learning abilities. Inspired by the success of transformers in NLP and image classification, large-scale transformers (such as the vision transformer) have been trained on natural images. Building on these recent developments, this research explores the efficacy of pre-trained natural-image transformers for medical images. Specifically, we analyze a pre-trained vision transformer on the CheXpert and pediatric pneumonia datasets, using standard CNN models including VGGNet and ResNet as baselines. By examining the learned representations and results, we find that transfer learning from a pre-trained vision transformer yields better results than transfer learning from a pre-trained CNN, demonstrating the greater transfer ability of transformers in medical imaging.
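As a concrete illustration of this transfer-learning setup, the following is a minimal Python sketch of fine-tuning an ImageNet-pre-trained vision transformer on a medical image classification task, assuming the timm and torchvision libraries; the dataset path, class count, and hyperparameters are illustrative assumptions, not values from the paper.

    import timm
    import torch
    from torch import nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    # ImageNet-pre-trained ViT; the classification head is re-initialized
    # for two classes (e.g., normal vs. pneumonia).
    model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=2)

    # Chest X-rays are grayscale; replicating to 3 channels matches the
    # pre-trained model's expected input format.
    preprocess = transforms.Compose([
        transforms.Grayscale(num_output_channels=3),
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5] * 3, std=[0.5] * 3),
    ])

    train_set = datasets.ImageFolder("pneumonia/train", transform=preprocess)  # hypothetical path
    loader = DataLoader(train_set, batch_size=32, shuffle=True)

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # small LR for fine-tuning
    criterion = nn.CrossEntropyLoss()

    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()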


Subjects
Machine Learning; Neural Networks, Computer; Humans; Child; Radiography
2.
J Digit Imaging; 32(6): 1027-1043, 2019 Dec.
Article in English | MEDLINE | ID: mdl-30980262

ABSTRACT

Surgical telementoring systems have attracted considerable interest, especially in remote locations. However, bandwidth constraints have been the primary bottleneck for efficient telementoring. This study aims to establish an efficient surgical telementoring system in which a qualified surgeon (mentor) provides real-time guidance and technical assistance for surgical procedures to an on-spot physician (surgeon). High Efficiency Video Coding (HEVC/H.265)-based video compression has shown promising results for telementoring applications. However, there is a trade-off between the bandwidth required for video transmission and the quality of the video received by the remote surgeon. To compress and transmit real-time surgical videos efficiently, a hybrid lossless-lossy approach is proposed in which the surgical incision region is coded in high quality while the background is coded in progressively lower quality based on its distance from the incision region. State-of-the-art deep learning (DL) architectures for semantic segmentation could be used to extract the surgical incision region, but their computational complexity is high, resulting in long training and inference times. Because encoding time is crucial for telementoring systems, very deep architectures are not suitable for surgical incision extraction. In this study, we propose a shallow convolutional neural network (S-CNN)-based segmentation approach, consisting of an encoder network only, for surgical region extraction. The segmentation performance of the S-CNN is compared with a state-of-the-art image segmentation network (SegNet), and the results demonstrate the effectiveness of the proposed network. The proposed telementoring system is efficient and explicitly exploits the physiological nature of the human visual system to encode the video, providing good overall visual quality at the location of surgery. The proposed S-CNN-based segmentation achieved a pixel accuracy of 97% and a mean intersection-over-union of 79%. HEVC experiments showed that the proposed surgical-region-based encoding scheme achieved an average bitrate reduction of 88.8% at high-quality settings compared with default full-frame HEVC encoding, with an average encoding-performance (signal-to-noise ratio) gain of 11.5 dB in the surgical region. For a fair comparison, the bitrate savings and visual quality of the proposed optimal bit-allocation scheme were compared with a mean-shift-segmentation-based coding scheme; the results show that the proposed scheme maintains high visual quality in the surgical incision region while achieving good bitrate savings. Based on these comparisons and results, the proposed encoding algorithm can be considered an efficient and effective solution for surgical telementoring over low-bandwidth networks.
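To make the distance-based bit allocation concrete, here is a small Python sketch that assigns a per-block quantization parameter (QP) from a binary incision mask, assuming the mask comes from the S-CNN segmenter; the block size, QP range, and linear distance ramp are illustrative assumptions, not the paper's actual scheme.

    import numpy as np
    from scipy.ndimage import distance_transform_edt

    def qp_map(incision_mask, block=64, qp_roi=22, qp_bg=45, ramp_px=256.0):
        """Per-block QP: fine (qp_roi) inside the incision region,
        coarsening with distance up to qp_bg in the far background."""
        # Distance of each pixel to the nearest incision pixel (0 inside it).
        dist = distance_transform_edt(~incision_mask.astype(bool))
        rows = incision_mask.shape[0] // block
        cols = incision_mask.shape[1] // block
        qps = np.empty((rows, cols), dtype=int)
        for r in range(rows):
            for c in range(cols):
                tile = dist[r * block:(r + 1) * block, c * block:(c + 1) * block]
                frac = min(tile.min() / ramp_px, 1.0)  # 0 at the ROI, 1 far away
                qps[r, c] = round(qp_roi + frac * (qp_bg - qp_roi))
        return qps

A QP map like this can then be fed to the encoder's rate control so that blocks near the incision are coded almost losslessly while distant background blocks absorb the bitrate savings.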


Subjects
Mentoring/methods; Neural Networks, Computer; Surgical Procedures, Operative/methods; Telemedicine/methods; Video Recording; Humans
3.
Comput Biol Med; 166: 107521, 2023 Sep 23.
Article in English | MEDLINE | ID: mdl-37778213

ABSTRACT

The ability to accurately locate all indicators of disease within medical images is vital for understanding the effects of a disease, as well as for weakly supervised segmentation and localization of its diagnostic correlators. Existing methods either use classifiers to make predictions based on class-salient regions or use adversarial-learning-based image-to-image translation to capture such disease effects. However, the former do not capture all the features relevant for visual attribution (VA) and are prone to data biases, while the latter can produce adversarial (misleading) and inefficient solutions when operating directly on pixel values. To address these issues, we propose a novel approach, Visual Attribution using Adversarial Latent Transformations (VA2LT). Our method uses adversarial learning to generate counterfactual (CF) normal images from abnormal images by finding and modifying discrepancies in the latent space, using cycle consistency between the query and CF latent representations to guide training. We evaluate our method on three datasets: a synthetic dataset, the Alzheimer's Disease Neuroimaging Initiative dataset, and the BraTS dataset. Our method outperforms baseline and related methods on all three.
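As a rough illustration of the latent-space counterfactual idea, the following Python sketch shows an encoder-transform-decoder pipeline with a cycle-consistency loss between the transformed latent code and the re-encoded counterfactual; all module architectures and sizes are illustrative assumptions, not the paper's actual networks.

    import torch
    from torch import nn

    encoder = nn.Sequential(nn.Flatten(), nn.Linear(128 * 128, 256))    # image -> latent
    transform = nn.Sequential(nn.Linear(256, 256), nn.Tanh(),
                              nn.Linear(256, 256))                      # latent shift
    decoder = nn.Sequential(nn.Linear(256, 128 * 128), nn.Sigmoid())    # latent -> image

    def cycle_consistency_loss(x_abnormal):
        z = encoder(x_abnormal)    # latent code of the abnormal query
        z_cf = transform(z)        # adversarially shifted "normal" code
        x_cf = decoder(z_cf)       # counterfactual normal image
        z_back = encoder(x_cf)     # re-encode the counterfactual
        return nn.functional.mse_loss(z_back, z_cf)

    # The visual attribution map is then the difference between the query
    # and its counterfactual, e.g. (x_abnormal - x_cf).abs().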

4.
IEEE Trans Cybern; 51(9): 4515-4527, 2021 Sep.
Article in English | MEDLINE | ID: mdl-31880579

ABSTRACT

This article presents an efficient fingerprint identification system that performs an initial classification for search-space reduction, followed by minutiae-neighbor-based feature encoding and matching. Current state-of-the-art fingerprint classification methods use a deep convolutional neural network (DCNN) to assign a confidence to the classification prediction, and based on this prediction the input fingerprint is matched against only the subset of the database that belongs to the predicted class. As DCNN architectures deepen, the later layers learn more abstract information from the input images, which yields higher prediction accuracy. The downside is that DCNNs are data-hungry and require large amounts of annotated (labeled) data to learn generalized parameters for the deeper layers. In this article, a shallow multi-feature-view CNN (SMV-CNN) fingerprint classifier is proposed that extracts 1) fine-grained features from the input image and 2) abstract features from representations explicitly derived from the input image. The multi-feature views are fed to a fully connected neural network (NN) to compute a global classification prediction. The classification results show that the SMV-CNN achieved an improvement of 2.8% over a baseline CNN using a single grayscale view on an open-source database. Moreover, compared with the state-of-the-art residual network (ResNet-50) image classification model, the proposed method performs comparably while being less complex and more efficient to train. Classification-based fingerprint identification reduced the search space by over 50% without degrading identification accuracy.
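As a sketch of the multi-feature-view idea, the following Python code builds two shallow convolutional branches, one for the raw grayscale fingerprint and one for an explicitly derived view (e.g., a gradient-magnitude image), and fuses their features in a fully connected head; the layer sizes and five-class output are illustrative assumptions.

    import torch
    from torch import nn

    def shallow_branch():
        return nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),   # -> 32 * 4 * 4 = 512 features
        )

    class SMVClassifier(nn.Module):
        def __init__(self, n_classes=5):             # e.g., Galton-Henry classes
            super().__init__()
            self.raw_branch = shallow_branch()       # grayscale view
            self.derived_branch = shallow_branch()   # derived-representation view
            self.head = nn.Sequential(nn.Linear(1024, 128), nn.ReLU(),
                                      nn.Linear(128, n_classes))

        def forward(self, raw_view, derived_view):
            feats = torch.cat([self.raw_branch(raw_view),
                               self.derived_branch(derived_view)], dim=1)
            return self.head(feats)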


Subjects
Neural Networks, Computer
5.
IEEE Access; 8: 22812-22825, 2020.
Article in English | MEDLINE | ID: mdl-32391238

ABSTRACT

Tuberculosis (TB) is an infectious disease that can lead to death if left untreated. TB detection involves extracting complex TB manifestation features, such as lung cavities, air-space consolidation, endobronchial spread, and pleural effusions, from chest X-rays (CXRs). Deep-learning approaches such as convolutional neural networks (CNNs) can learn complex features from CXR images. The main problem is that a CNN with a softmax layer does not account for uncertainty when classifying CXRs: it cannot present calibrated probabilities or distinguish confusing cases during TB detection. This paper presents a solution for TB identification using a Bayesian convolutional neural network (B-CNN), which handles uncertain cases in which TB and non-TB CXRs are difficult to distinguish. The proposed B-CNN-based TB identification methodology is evaluated on two TB benchmark datasets, Montgomery and Shenzhen. For training and testing, we used the Google Colab platform, which provides an NVIDIA Tesla K80 GPU with 12 GB of VRAM, a single 2.3 GHz Xeon core, 12 GB of RAM, and 320 GB of disk. The B-CNN achieves 96.42% and 86.46% accuracy on the two datasets, respectively, outperforming state-of-the-art machine learning and CNN approaches. Moreover, the B-CNN validates its results by flagging CXRs as confusing cases when the variance of its predicted outputs exceeds a threshold. The results demonstrate the superiority of the B-CNN over its counterparts for identifying TB and non-TB CXRs in terms of accuracy, variance of the predicted probabilities, and model uncertainty.
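As an illustration of the variance-based filtering step, the following Python sketch approximates the Bayesian CNN with Monte Carlo dropout (a common stand-in; the paper's exact B-CNN formulation may differ) and flags predictions whose variance across stochastic forward passes exceeds a threshold.

    import torch
    from torch import nn

    def mc_predict(model, x, passes=30):
        """Predictive mean and variance over `passes` stochastic forward passes."""
        model.train()                    # keep dropout active at test time
        with torch.no_grad():
            probs = torch.stack([torch.softmax(model(x), dim=1)
                                 for _ in range(passes)])
        return probs.mean(dim=0), probs.var(dim=0)

    # Usage: flag CXRs whose predicted TB probability is too uncertain.
    # mean, var = mc_predict(bcnn, cxr_batch)
    # confusing = var.max(dim=1).values > 0.05   # threshold is an assumption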
