RESUMO
Deep learning methods are widely applied in digital pathology to address clinical challenges such as prognosis and diagnosis. As one of the most recent applications, deep models have also been used to extract molecular features from whole slide images. Although molecular tests carry rich information, they are often expensive, time-consuming, and require additional tissue to sample. In this paper, we propose tRNAsformer, an attention-based topology that can learn both to predict the bulk RNA-seq from an image and represent the whole slide image of a glass slide simultaneously. The tRNAsformer uses multiple instance learning to solve a weakly supervised problem while the pixel-level annotation is not available for an image. We conducted several experiments and achieved better performance and faster convergence in comparison to the state-of-the-art algorithms. The proposed tRNAsformer can assist as a computational pathology tool to facilitate a new generation of search and classification methods by combining the tissue morphology and the molecular fingerprint of the biopsy samples.
Assuntos
Algoritmos , Sequência de Bases , RNA-SeqRESUMO
Out-of-distribution (OOD) generalization, especially for medical setups, is a key challenge in modern machine learning which has only recently received much attention. We investigate how different convolutional pre-trained models perform on OOD test data-that is data from domains that have not been seen during training-on histopathology repositories attributed to different trial sites. Different trial site repositories, pre-trained models, and image transformations are examined as specific aspects of pre-trained models. A comparison is also performed among models trained entirely from scratch (i.e., without pre-training) and models already pre-trained. The OOD performance of pre-trained models on natural images, i.e., (1) vanilla pre-trained ImageNet, (2) semi-supervised learning (SSL), and (3) semi-weakly-supervised learning (SWSL) models pre-trained on IG-1B-Targeted are examined in this study. In addition, the performance of a histopathology model (i.e., KimiaNet) trained on the most comprehensive histopathology dataset, i.e., TCGA, has also been studied. Although the performance of SSL and SWSL pre-trained models are conducive to better OOD performance in comparison to the vanilla ImageNet pre-trained model, the histopathology pre-trained model is still the best in overall. In terms of top-1 accuracy, we demonstrate that diversifying the images in the training using reasonable image transformations is effective to avoid learning shortcuts when the distribution shift is significant. In addition, XAI techniques-which aim to achieve high-quality human-understandable explanations of AI decisions-are leveraged for further investigations.
Assuntos
Aprendizado de Máquina , Redes Neurais de Computação , Humanos , Aprendizado de Máquina SupervisionadoRESUMO
Medical centers and healthcare providers have concerns and hence restrictions around sharing data with external collaborators. Federated learning, as a privacy-preserving method, involves learning a site-independent model without having direct access to patient-sensitive data in a distributed collaborative fashion. The federated approach relies on decentralized data distribution from various hospitals and clinics. The collaboratively learned global model is supposed to have acceptable performance for the individual sites. However, existing methods focus on minimizing the average of the aggregated loss functions, leading to a biased model that performs perfectly for some hospitals while exhibiting undesirable performance for other sites. In this paper, we improve model "fairness" among participating hospitals by proposing a novel federated learning scheme called Proportionally Fair Federated Learning, short Prop-FFL. Prop-FFL is based on a novel optimization objective function to decrease the performance variations among participating hospitals. This function encourages a fair model, providing us with more uniform performance across participating hospitals. We validate the proposed Prop-FFL on two histopathology datasets as well as two general datasets to shed light on its inherent capabilities. The experimental results suggest promising performance in terms of learning speed, accuracy, and fairness.
Assuntos
Hospitais , Patologia , Aprendizado de Máquina Supervisionado , HumanosRESUMO
Whole Slide Images (WSIs) in digital pathology are used to diagnose cancer subtypes. The difference in procedures to acquire WSIs at various trial sites gives rise to variability in the histopathology images, thus making consistent diagnosis challenging. These differences may stem from variability in image acquisition through multi-vendor scanners, variable acquisition parameters, and differences in staining procedure; as well, patient demographics may bias the glass slide batches before image acquisition. These variabilities are assumed to cause a domain shift in the images of different hospitals. It is crucial to overcome this domain shift because an ideal machine-learning model must be able to work on the diverse sources of images, independent of the acquisition center. A domain generalization technique is leveraged in this study to improve the generalization capability of a Deep Neural Network (DNN), to an unseen histopathology image set (i.e., from an unseen hospital/trial site) in the presence of domain shift. According to experimental results, the conventional supervisedlearning regime generalizes poorly to data collected from different hospitals. However, the proposed hospital-agnostic learning can improve the generalization considering the lowdimensional latent space representation visualization, and classification accuracy results.
Assuntos
Neoplasias , Redes Neurais de Computação , Hospitais , Humanos , Aprendizado de Máquina , Neoplasias/diagnóstico por imagemRESUMO
Feature vectors provided by pre-trained deep artificial neural networks have become a dominant source for image representation in recent literature. Their contribution to the performance of image analysis can be improved through fine-tuning. As an ultimate solution, one might even train a deep network from scratch with the domain-relevant images, a highly desirable option which is generally impeded in pathology by lack of labeled images and the computational expense. In this study, we propose a new network, namely KimiaNet, that employs the topology of the DenseNet with four dense blocks, fine-tuned and trained with histopathology images in different configurations. We used more than 240,000 image patches with 1000×1000 pixels acquired at 20× magnification through our proposed "high-cellularity mosaic" approach to enable the usage of weak labels of 7126 whole slide images of formalin-fixed paraffin-embedded human pathology samples publicly available through The Cancer Genome Atlas (TCGA) repository. We tested KimiaNet using three public datasets, namely TCGA, endometrial cancer images, and colorectal cancer images by evaluating the performance of search and classification when corresponding features of different networks are used for image representation. As well, we designed and trained multiple convolutional batch-normalized ReLU (CBR) networks. The results show that KimiaNet provides superior results compared to the original DenseNet and smaller CBR networks when used as feature extractor to represent histopathology images.
Assuntos
Neoplasias , Redes Neurais de Computação , Humanos , Processamento de Imagem Assistida por Computador , Neoplasias/diagnóstico por imagemRESUMO
As many algorithms depend on a suitable representation of data, learning unique features is considered a crucial task. Although supervised techniques using deep neural networks have boosted the performance of representation learning, the need for a large sets of labeled data limits the application of such methods. As an example, high-quality delineations of regions of interest in the field of pathology is a tedious and time-consuming task due to the large image dimensions. In this work, we explored the performance of a deep neural network and triplet loss in the area of representation learning. We investigated the notion of similarity and dissimilarity in pathology whole-slide images and compared different setups from unsupervised and semi-supervised to supervised learning in our experiments. Additionally, different approaches were tested, applying few-shot learning on two publicly available pathology image datasets. We achieved high accuracy and generalization when the learned representations were applied to two different pathology datasets.