Results 1 - 20 of 1,907
1.
Biometrics ; 80(1), 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38465982

ABSTRACT

In many modern machine learning applications, changes in covariate distributions and difficulty in acquiring outcome information have posed challenges to robust model training and evaluation. Numerous transfer learning methods have been developed to robustly adapt the model itself to some unlabeled target populations using existing labeled data in a source population. However, there is a paucity of literature on transferring performance metrics, especially receiver operating characteristic (ROC) parameters, of a trained model. In this paper, we aim to evaluate the performance of a trained binary classifier on an unlabeled target population based on ROC analysis. We propose Semisupervised Transfer lEarning of Accuracy Measures (STEAM), an efficient three-step estimation procedure that employs (1) double-index modeling to construct calibrated density ratio weights and (2) robust imputation to leverage the large amount of unlabeled data to improve estimation efficiency. We establish the consistency and asymptotic normality of the proposed estimator under the correct specification of either the density ratio model or the outcome model. We also correct for potential overfitting bias in the estimators in finite samples with cross-validation. We compare our proposed estimators to existing methods and show reductions in bias and gains in efficiency through simulations. We illustrate the practical utility of the proposed method by evaluating the prediction performance of a phenotyping model for rheumatoid arthritis (RA) on a temporally evolving EHR cohort.


Subjects
Machine Learning; Supervised Machine Learning; Humans; ROC Curve; Research Design; Bias
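As a rough illustration of the idea behind entry 1 (evaluating a classifier on a target population by reweighting labeled source data), the sketch below computes an importance-weighted ROC point and AUC in plain Python. It is not the STEAM procedure: the weights are assumed to be density ratio estimates supplied by the caller, and the function names `weighted_roc_point` and `weighted_auc` are hypothetical.

```python
def weighted_roc_point(scores, labels, weights, threshold):
    """Return the importance-weighted (TPR, FPR) at a single threshold.

    weights are assumed to be per-sample density ratio estimates
    (target density / source density); how they are fitted is out of scope.
    """
    tp = fp = pos = neg = 0.0
    for s, y, w in zip(scores, labels, weights):
        if y == 1:
            pos += w
            if s >= threshold:
                tp += w
        else:
            neg += w
            if s >= threshold:
                fp += w
    if pos == 0.0 or neg == 0.0:
        raise ValueError("need positive total weight in both classes")
    return tp / pos, fp / neg


def weighted_auc(scores, labels, weights):
    """Weighted probability that a positive outranks a negative (ties count 1/2)."""
    num = den = 0.0
    for sp, yp, wp in zip(scores, labels, weights):
        if yp != 1:
            continue
        for sn, yn, wn in zip(scores, labels, weights):
            if yn != 0:
                continue
            pair_weight = wp * wn
            den += pair_weight
            if sp > sn:
                num += pair_weight
            elif sp == sn:
                num += 0.5 * pair_weight
    if den == 0.0:
        raise ValueError("need positive total weight in both classes")
    return num / den


scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(weighted_auc(scores, labels, [1, 1, 1, 1]))  # 0.75 (unweighted)
print(weighted_auc(scores, labels, [1, 1, 3, 1]))  # 0.625 (low-scoring positive upweighted)
```

With uniform weights the function reduces to the ordinary pairwise AUC; upweighting the low-scoring positive lowers the estimate, which is the kind of shift a covariate change can produce.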
2.
Sci Rep ; 14(1): 6100, 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38480815

ABSTRACT

Endoscopy, a widely used medical procedure for examining the gastrointestinal (GI) tract to detect potential disorders, poses challenges in manual diagnosis due to non-specific symptoms and difficulties in accessing affected areas. While supervised machine learning models have proven effective in assisting clinical diagnosis of GI disorders, the scarcity of image-label pairs created by medical experts limits their availability. To address these limitations, we propose a curriculum self-supervised learning framework inspired by human curriculum learning. Our approach leverages the HyperKvasir dataset, which comprises 100k unlabeled GI images for pre-training and 10k labeled GI images for fine-tuning. By adopting our proposed method, we achieved an impressive top-1 accuracy of 88.92% and an F1 score of 73.39%. This represents a 2.1% increase over vanilla SimSiam for the top-1 accuracy and a 1.9% increase for the F1 score. The combination of self-supervised learning and a curriculum-based approach demonstrates the efficacy of our framework in advancing the diagnosis of GI disorders. Our study highlights the potential of curriculum self-supervised learning in utilizing unlabeled GI tract images to improve the diagnosis of GI disorders, paving the way for more accurate and efficient diagnosis in GI endoscopy.


Subjects
Curriculum; Self-Management; Humans; Endoscopy, Gastrointestinal; Gastrointestinal Tract; Supervised Machine Learning
3.
Sci Rep ; 14(1): 6086, 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38480847

ABSTRACT

Research on machine learning (ML) methods has become incredibly popular during the past few decades. However, researchers who are not familiar with statistics may find it difficult to understand how to evaluate the performance of ML models and compare them with each other. Here, we introduce the most common evaluation metrics used for the typical supervised ML tasks including binary, multi-class, and multi-label classification, regression, image segmentation, object detection, and information retrieval. We explain how to choose a suitable statistical test for comparing models, how to obtain enough values of the metric for testing, and how to perform the test and interpret its results. We also present a few practical examples of comparing convolutional neural networks used to classify X-rays with different lung infections and detect cancer tumors in positron emission tomography images.


Subjects
Image Processing, Computer-Assisted; Machine Learning; Image Processing, Computer-Assisted/methods; Neural Networks, Computer; Supervised Machine Learning; Positron-Emission Tomography
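To make the binary-classification metrics mentioned in entry 3 concrete, here is a minimal sketch in plain Python. The function name `binary_metrics` and the choice to return 0.0 when a denominator is empty are assumptions made for illustration, not something taken from the article.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 from 0/1 labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    # Convention (an assumption): undefined ratios are reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
print(binary_metrics(y_true, y_pred))
```

In this example accuracy is 0.75 while precision, recall and F1 are all 2/3, which shows how accuracy can look better than the class-specific metrics when negatives dominate.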
4.
BMC Bioinformatics ; 25(1): 103, 2024 Mar 08.
Article in English | MEDLINE | ID: mdl-38459463

ABSTRACT

BACKGROUND: Blood tests are extensively performed for screening, diagnosis and surveillance purposes. Although it is possible to automatically evaluate raw blood test data with advanced deep self-supervised machine learning approaches, this has not yet been thoroughly investigated or implemented. RESULTS: This paper proposes deep machine learning algorithms with multi-dimensional adaptive feature elimination, self-feature weighting and novel feature selection approaches. To classify health risks based on the data processed by the deep layers, four machine learning algorithms with properties ranging from entirely model-free to gradient-driven are modified. CONCLUSIONS: The results show that the proposed deep machine learning algorithms can remove unnecessary features, assign self-importance weights, select the most informative features and classify health risks automatically from the worst-case low to worst-case high values.


Subjects
Algorithms; Machine Learning; Supervised Machine Learning
5.
Sci Rep ; 14(1): 6791, 2024 Mar 21.
Article in English | MEDLINE | ID: mdl-38514697

ABSTRACT

Extracellular vesicles (EVs) released from cells attract interest for their possible role in health and diseases. The detection and characterization of EVs is challenging due to the lack of specialized methodologies. Raman spectroscopy, however, has been suggested as a novel approach for biochemical analysis of EVs. To extract information from the spectra, a novel deep learning architecture is explored as a versatile variant of autoencoders. The proposed architecture considers the frequency range separately from the intensity of the spectra. This enables the model to adapt to the frequency range, rather than requiring that all spectra be pre-processed to the same frequency range as it was trained on. It is demonstrated that the proposed architecture accepts Raman spectra of EVs and lipoproteins from 13 biological sources and from two laboratories. High reconstruction accuracy is maintained despite large variances in frequency range and noise level. It is also shown that the architecture is able to cluster the biological nanoparticles by their Raman spectra and differentiate them by their origin without pre-processing of the spectra or supervision during learning. The model performs label-free differentiation, including separating EVs from activated vs. non-activated blood platelets and EVs/lipoproteins from prostate cancer patients versus non-cancer controls. The differentiation is evaluated by creating a neural network classifier that observes the features extracted by the model to classify the spectra according to their sample origin. The classification reveals a test sensitivity of 92.2 % and selectivity of 92.3 % over 769 measurements from two labs that have different measurement configurations.


Subjects
Extracellular Vesicles; Nanoparticles; Prostatic Neoplasms; Male; Humans; Extracellular Vesicles/chemistry; Prostatic Neoplasms/diagnosis; Lipoproteins; Supervised Machine Learning; Spectrum Analysis, Raman/methods
6.
Artif Intell Med ; 149: 102778, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38462280

ABSTRACT

Many computational methods have been proposed to identify potential drug-target interactions (DTIs) to expedite drug development. Graph neural network (GNN) methods are considered to be one of the most effective approaches. However, shallow GNN methods can only aggregate local information from nodes. Also, deep GNN methods may result in over-smoothing while obtaining long-distance neighbourhood information. As a result, existing GNN methods struggle to extract the complete features of the graph. Additionally, the number of known DTIs is insufficient, and there are far more unknown drug-target pairs than known DTIs, leading to class imbalance. This article proposes a model that combines graph autoencoder and self-supervised learning to accurately encode multilevel features of graphs using only a small number of labelled samples. We introduce a positive sample compensation coefficient to the objective function to mitigate the impact of class imbalance. Experiments on two datasets demonstrated that our model outperforms the four baseline methods, and the new DTIs predicted by the SSLDTI model were verified by the DrugBank database.


Subjects
Drug Development; Neural Networks, Computer; Databases, Factual; Supervised Machine Learning
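Entry 6 mentions a positive sample compensation coefficient added to the objective to counter class imbalance, but the abstract does not give its form. One common way to realize such an idea is a binary cross-entropy in which the positive term is multiplied by a coefficient; the sketch below shows that generic loss. The function name `weighted_bce` and the parameter `pos_coef` are hypothetical, and this should not be read as the SSLDTI objective.

```python
import math


def weighted_bce(y_true, y_prob, pos_coef=1.0, eps=1e-12):
    """Mean binary cross-entropy with the positive-class term scaled by pos_coef.

    pos_coef > 1 makes errors on the rare positive class (known interactions)
    cost more than errors on the abundant negative class.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        if y == 1:
            total += -pos_coef * math.log(p)
        else:
            total += -math.log(1.0 - p)
    return total / len(y_true)


# One positive and one negative, both predicted at 0.5.
print(weighted_bce([1, 0], [0.5, 0.5], pos_coef=1.0))  # ln 2
print(weighted_bce([1, 0], [0.5, 0.5], pos_coef=3.0))  # 2 * ln 2
```

A frequent heuristic is to set the coefficient near the ratio of negative to positive samples, although the value used in the article is not stated in the abstract.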
7.
Comput Med Imaging Graph ; 113: 102351, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38335784

ABSTRACT

Low resolution of positron emission tomography (PET) limits its diagnostic performance. Deep learning has been successfully applied to achieve super-resolution PET. However, commonly used supervised learning methods in this context require many pairs of low- and high-resolution (LR and HR) PET images. Although unsupervised learning utilizes unpaired images, the results are not as good as those obtained with supervised deep learning. In this paper, we propose a quasi-supervised learning method, which is a new type of weakly-supervised learning method, to recover HR PET images from their LR counterparts by leveraging the similarity between unpaired LR and HR image patches. Specifically, LR image patches are taken from a patient as inputs, while the most similar HR patches from other patients are found as labels. The similarity between the matched HR and LR patches serves as a prior for network construction. Our proposed method can be implemented by designing a new network or modifying an existing network. As an example in this study, we have modified the cycle-consistent generative adversarial network (CycleGAN) for super-resolution PET. Our numerical and experimental results qualitatively and quantitatively show the merits of our method relative to the state-of-the-art methods. The code is publicly available at https://github.com/PigYang-ops/CycleGAN-QSDL.


Subjects
Positron-Emission Tomography; Supervised Machine Learning; Humans
8.
Bioinformatics ; 40(2), 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38366925

ABSTRACT

MOTIVATION: Cell-type annotation is fundamental in revealing cell heterogeneity for single-cell data analysis. Although a host of methods have been developed, low signal-to-noise-ratio single-cell RNA-sequencing data, which suffer from batch effects and dropout, still pose obstacles to discovering grouped patterns for cell types by unsupervised learning and by its alternative, semi-supervised learning, which utilizes a few labeled cells as guidance for cell-type annotation. RESULTS: We propose a robust cell-type annotation method, scSemiGCN, based on graph convolutional networks. Built upon a denoised network structure that characterizes reliable cell-to-cell connections, scSemiGCN generates pseudo labels for unannotated cells. Then supervised contrastive learning follows to refine the noisy single-cell data. Finally, message passing with the refined features over the denoised network structure is conducted for semi-supervised cell-type annotation. Comparison over several datasets with six methods under extremely limited supervision validates the effectiveness and efficiency of scSemiGCN for cell-type annotation. AVAILABILITY AND IMPLEMENTATION: Implementation of scSemiGCN is available at https://github.com/Jane9898/scSemiGCN.


Subjects
Neural Networks, Computer; Single-Cell Analysis; Signal-To-Noise Ratio; Supervised Machine Learning
9.
Bioinformatics ; 40(2), 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38317054

ABSTRACT

MOTIVATION: Effective identification of cell types is of critical importance in single-cell RNA-sequencing (scRNA-seq) data analysis. To date, many supervised machine learning-based predictors have been implemented to identify cell types from scRNA-seq datasets. Despite the technical advances of these state-of-the-art tools, most existing predictors were single classifiers, of which the performances can still be significantly improved. It is therefore highly desirable to employ the ensemble learning strategy to develop more accurate computational models for robust and comprehensive identification of cell types on scRNA-seq datasets. RESULTS: We propose a two-layer stacking model, termed CTISL (Cell Type Identification by Stacking ensemble Learning), which integrates multiple classifiers to identify cell types. In the first layer, given a reference scRNA-seq dataset with known cell types, CTISL dynamically combines multiple cell-type-specific classifiers (i.e. support-vector machine and logistic regression) as the base learners to deliver the outcomes for the input of a meta-classifier in the second layer. We conducted a total of 24 benchmarking experiments on 17 human and mouse scRNA-seq datasets to evaluate and compare the prediction performance of CTISL and other state-of-the-art predictors. The experiment results demonstrate that CTISL achieves superior or competitive performance compared to these state-of-the-art approaches. We anticipate that CTISL can serve as a useful and reliable tool for cost-effective identification of cell types from scRNA-seq datasets. AVAILABILITY AND IMPLEMENTATION: The webserver and source code are freely available at http://bigdata.biocie.cn/CTISLweb/home and https://zenodo.org/records/10568906, respectively.


Subjects
Single-Cell Analysis; Single-Cell Gene Expression Analysis; Animals; Humans; Mice; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Software; Supervised Machine Learning; Gene Expression Profiling/methods; Cluster Analysis
10.
PLoS One ; 19(2): e0299487, 2024.
Article in English | MEDLINE | ID: mdl-38421999

ABSTRACT

AIMS: Metabolic dysfunction Associated Steatotic Liver Disease (MASLD) outcomes such as MASH (metabolic dysfunction associated steatohepatitis), fibrosis and cirrhosis are ordinarily determined by resource-intensive and invasive biopsies. We aim to show that routine clinical tests offer sufficient information to predict these endpoints. METHODS: Using the LITMUS Metacohort derived from the European NAFLD Registry, the largest MASLD dataset in Europe, we create three combinations of features that vary in how difficult they are to procure, including a 19-variable feature set that can be obtained through a routine clinical appointment or blood test. These data were used to train predictive models using the supervised machine learning (ML) algorithm XGBoost, alongside the missing-data imputation technique MICE and the class balancing algorithm SMOTE. Shapley Additive exPlanations (SHAP) were added to determine the relative importance of each clinical variable. RESULTS: Analysing nine biopsy-derived MASLD outcomes with cohort sizes ranging between 5385 and 6673 subjects, we were able to predict individuals at training set AUCs ranging from 0.719-0.994, including classifying individuals who are At-Risk MASH at an AUC = 0.899. Using two further feature combinations of 26 variables and 35 variables, which included composite scores known to be good indicators for MASLD endpoints and advanced specialist tests, we found predictive performance did not sufficiently improve. We are also able to present local and global explanations for each ML model, offering clinicians interpretability without worsening predictive performance. CONCLUSIONS: This study developed a series of ML models of accuracy ranging from 71.9-99.4% using only easily extractable and readily available information in predicting MASLD outcomes which are usually determined through highly invasive means.


Subjects
Metabolic Diseases; Non-alcoholic Fatty Liver Disease; Humans; Machine Learning; Non-alcoholic Fatty Liver Disease/diagnosis; Patients; Supervised Machine Learning
11.
Cancer Imaging ; 24(1): 30, 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38424612

ABSTRACT

BACKGROUND: Prostate-specific membrane antigen (PSMA) PET/CT imaging is widely used for quantitative image analysis, especially in radioligand therapy (RLT) for metastatic castration-resistant prostate cancer (mCRPC). Unknown features influencing PSMA biodistribution can be explored by analyzing segmented organs at risk (OAR) and lesions. Manual segmentation is time-consuming and labor-intensive, so automated segmentation methods are desirable. Training deep-learning segmentation models is challenging due to the scarcity of high-quality annotated images. Addressing this, we developed shifted windows UNEt TRansformers (Swin UNETR) for fully automated segmentation. Within a self-supervised framework, the model's encoder was pre-trained on unlabeled data. The entire model was fine-tuned, including its decoder, using labeled data. METHODS: In this work, 752 whole-body [68Ga]Ga-PSMA-11 PET/CT images were collected from two centers. For self-supervised model pre-training, 652 unlabeled images were employed. The remaining 100 images were manually labeled for supervised training. In the supervised training phase, 5-fold cross-validation was used with 64 images for model training and 16 for validation, from one center. For testing, 20 hold-out images, evenly distributed between two centers, were used. Image segmentation and quantification metrics were evaluated on the test set compared to the ground-truth segmentation conducted by a nuclear medicine physician. RESULTS: The model generates high-quality OARs and lesion segmentation in lesion-positive cases, including mCRPC. The results show that self-supervised pre-training significantly improved the average dice similarity coefficient (DSC) for all classes by about 3%. Compared to nnU-Net, a well-established model in medical image segmentation, our approach achieved a 5% higher DSC. This improvement was attributed to our model's combined use of self-supervised pre-training and supervised fine-tuning, specifically when applied to PET/CT input. Our best model had the lowest DSC for lesions at 0.68 and the highest for liver at 0.95. CONCLUSIONS: We developed a state-of-the-art neural network using self-supervised pre-training on whole-body [68Ga]Ga-PSMA-11 PET/CT images, followed by fine-tuning on a limited set of annotated images. The model generates high-quality OARs and lesion segmentation for PSMA image analysis. The generalizable model holds potential for various clinical applications, including enhanced RLT and patient-specific internal dosimetry.


Subjects
Positron Emission Tomography Computed Tomography; Prostatic Neoplasms, Castration-Resistant; Male; Humans; Positron Emission Tomography Computed Tomography/methods; Gallium Radioisotopes; Organs at Risk; Tissue Distribution; Supervised Machine Learning; Image Processing, Computer-Assisted/methods
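Entry 11 reports segmentation quality as the Dice similarity coefficient (DSC). For readers unfamiliar with it, the sketch below computes the standard DSC, 2|A ∩ B| / (|A| + |B|), on flattened binary masks. The function name `dice_coefficient` is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something stated in the article.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two flat 0/1 masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    total = sum(pred) + sum(truth)
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


pred = [1, 1, 0, 0]
truth = [1, 0, 1, 0]
print(dice_coefficient(pred, truth))  # 0.5
```

A DSC of 0.95 (liver) therefore means the predicted and reference masks overlap almost completely, while 0.68 (lesions) indicates substantially more disagreement, which is typical for small structures.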
12.
Nat Commun ; 15(1): 1014, 2024 Feb 03.
Article in English | MEDLINE | ID: mdl-38307875

ABSTRACT

A crucial step in the analysis of single-cell data is annotating cells to cell types and states. While a myriad of approaches has been proposed, manual labeling of cells to create training datasets remains tedious and time-consuming. In the field of machine learning, active and self-supervised learning methods have been proposed to improve the performance of a classifier while reducing both annotation time and label budget. However, the benefits of such strategies for single-cell annotation have yet to be evaluated in realistic settings. Here, we perform a comprehensive benchmarking of active and self-supervised labeling strategies across a range of single-cell technologies and cell type annotation algorithms. We quantify the benefits of active learning and self-supervised strategies in the presence of cell type imbalance and variable similarity. We introduce adaptive reweighting, a heuristic procedure tailored to single-cell data (including a marker-aware version) that shows competitive performance with existing approaches. In addition, we demonstrate that having prior knowledge of cell type markers improves annotation accuracy. Finally, we summarize our findings into a set of recommendations for those implementing cell type annotation procedures or platforms. An R package implementing the heuristic approaches introduced in this work may be found at https://github.com/camlab-bioml/leader.


Subjects
Algorithms; Machine Learning; Technology; Awareness; Supervised Machine Learning; Single-Cell Analysis
13.
Sci Rep ; 14(1): 3202, 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38331955

ABSTRACT

Developing a clinical AI model requires a large, highly curated dataset that has been carefully annotated by multiple medical experts, which increases development time and costs. Self-supervised learning (SSL) is a method that enables AI models to leverage unlabelled data to acquire domain-specific background knowledge that can enhance their performance on various downstream tasks. In this work, we introduce CypherViT, a cluster-based histo-pathology phenotype representation learning by self-supervised multi-class-token hierarchical Vision Transformer (ViT). CypherViT is a novel backbone that can be integrated into an SSL pipeline, accommodating both coarse and fine-grained feature learning for histopathological images via a hierarchical feature agglomerative attention module with multiple classification (cls) tokens in ViT. Our qualitative analysis showcases that our approach successfully learns semantically meaningful regions of interest that align with morphological phenotypes. To validate the model, we utilize the DINO SSL framework to train CypherViT on a substantial dataset of unlabeled breast cancer histopathological images. This trained model proves to be a generalizable and robust feature extractor for colorectal cancer images. Notably, our model demonstrates promising performance in patch-level tissue phenotyping tasks across four public datasets. The results from our quantitative experiments highlight significant advantages over existing state-of-the-art SSL models and traditional transfer learning methods, such as those relying on ImageNet pre-training.


Subjects
Electric Power Supplies; Self-Management; Humans; Knowledge; Phenotype; Supervised Machine Learning
14.
Phys Med Biol ; 69(6), 2024 Mar 13.
Article in English | MEDLINE | ID: mdl-38324897

ABSTRACT

Objective. In the field of medicine, semi-supervised segmentation algorithms hold crucial research significance while also facing substantial challenges, primarily due to the extreme scarcity of expert-level annotated medical image data. However, many existing semi-supervised methods still process labeled and unlabeled data in inconsistent ways, which can lead to knowledge learned from labeled data being discarded to some extent. Such methods not only lack a variety of perturbations to explore potentially robust information in unlabeled data but also ignore the confirmation bias and class imbalance issues in pseudo-labeling methods. Approach. To solve these problems, this paper proposes a semi-supervised medical image segmentation method, 'mixup-decoupling training (MDT)', that combines the ideas of consistency and pseudo-labeling. Firstly, MDT introduces a new perturbation strategy, 'mixup-decoupling', to fully regularize training data. It not only mixes labeled and unlabeled data at the data level but also performs decoupling operations between the output predictions of mixed target data and labeled data at the feature level to obtain strong version predictions of unlabeled data. Then it establishes a dual learning paradigm based on consistency and pseudo-labeling. Secondly, MDT employs a novel categorical entropy filtering approach to pick high-confidence pseudo-labels for unlabeled data, facilitating more refined supervision. Main results. This paper compares MDT with other advanced semi-supervised methods on 2D and 3D datasets separately. A large number of experimental results show that MDT achieves competitive segmentation performance and outperforms other state-of-the-art semi-supervised segmentation methods. Significance. This paper proposes a semi-supervised medical image segmentation method, MDT, which greatly reduces the demand for manually labeled data and eases the difficulty of data annotation to a great extent. In addition, MDT not only outperforms many advanced semi-supervised image segmentation methods in quantitative and qualitative experimental results, but also provides a new and developable idea for semi-supervised learning and computer-aided diagnosis technology research.


Subjects
Algorithms; Diagnosis, Computer-Assisted; Entropy; Head; Supervised Machine Learning; Image Processing, Computer-Assisted
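Entry 14 describes picking high-confidence pseudo-labels by categorical entropy filtering, without giving details. The sketch below shows the generic idea only: compute the Shannon entropy of each predicted class distribution and keep the argmax label when the entropy falls below a threshold. The function names, the threshold value and the per-sample treatment (the article works on image segmentation, where filtering would more plausibly act per pixel or per class) are all assumptions.

```python
import math


def categorical_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def filter_pseudo_labels(prob_rows, max_entropy):
    """Keep (index, argmax label) for predictions whose entropy is low enough.

    max_entropy is a hypothetical tuning parameter; low entropy is used
    here as a proxy for high confidence.
    """
    kept = []
    for i, probs in enumerate(prob_rows):
        if categorical_entropy(probs) <= max_entropy:
            label = max(range(len(probs)), key=lambda k: probs[k])
            kept.append((i, label))
    return kept


predictions = [
    [0.98, 0.01, 0.01],  # confident, entropy about 0.11
    [0.40, 0.30, 0.30],  # uncertain, entropy about 1.09
    [0.05, 0.90, 0.05],  # fairly confident, entropy about 0.39
]
print(filter_pseudo_labels(predictions, max_entropy=0.5))  # [(0, 0), (2, 1)]
```

Discarding the uncertain middle prediction is what limits confirmation bias in this kind of scheme: a wrong but low-confidence guess never becomes a training target.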
15.
Ultrasound Med Biol ; 50(5): 703-711, 2024 May.
Article in English | MEDLINE | ID: mdl-38350787

ABSTRACT

OBJECTIVE: The aim of this study was to address the challenges posed by the manual labeling of fetal ultrasound images by introducing an unsupervised approach, the fetal ultrasound semantic clustering (FUSC) method. The primary objective was to automatically cluster a large volume of ultrasound images into various fetal views, reducing or eliminating the need for labor-intensive manual labeling. METHODS: The FUSC method was developed by using a substantial data set comprising 88,063 images. The methodology involves an unsupervised clustering approach to categorize ultrasound images into diverse fetal views. The method's effectiveness was further evaluated on an additional, unseen data set consisting of 8187 images. The evaluation included assessment of the clustering purity, and the entire process is detailed to provide insights into the method's performance. RESULTS: The FUSC method exhibited notable success, achieving >92% clustering purity on the evaluation data set of 8187 images. The results signify the feasibility of automatically clustering fetal ultrasound images without relying on manual labeling. The study showcases the potential of this approach in handling a large volume of ultrasound scans encountered in clinical practice, with implications for improving efficiency and accuracy in fetal ultrasound imaging. CONCLUSION: The findings of this investigation suggest that the FUSC method holds significant promise for the field of fetal ultrasound imaging. By automating the clustering of ultrasound images, this approach has the potential to reduce the manual labeling burden, making the process more efficient. The results pave the way for advanced automated labeling solutions, contributing to the enhancement of clinical practices in fetal ultrasound imaging. Our code is available at https://github.com/BioMedIA-MBZUAI/FUSC.


Subjects
Semantics; Ultrasonography, Prenatal; Pregnancy; Female; Humans; Pregnancy Trimester, Second; Ultrasonography, Prenatal/methods; Supervised Machine Learning; Cluster Analysis
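Entry 15 evaluates its clusters by clustering purity. Under the usual definition, each cluster is credited with the count of its most frequent true label, and purity is the sum of those counts divided by the number of samples. The sketch below implements that textbook definition; the function name `clustering_purity` is hypothetical and the article's exact evaluation protocol may differ.

```python
from collections import Counter, defaultdict


def clustering_purity(cluster_ids, true_labels):
    """Fraction of samples that carry the majority label of their cluster."""
    if len(cluster_ids) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    labels_by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        labels_by_cluster[cluster][label] += 1
    majority_total = sum(max(counts.values())
                         for counts in labels_by_cluster.values())
    return majority_total / len(true_labels)


clusters = [0, 0, 0, 1, 1, 1]
views = ["head", "head", "femur", "femur", "femur", "head"]
print(clustering_purity(clusters, views))  # 4 of 6 samples match their cluster majority
```

Purity rises trivially as the number of clusters grows (one cluster per image gives purity 1.0), so it is normally read together with the number of clusters used.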
16.
Sci Rep ; 14(1): 4506, 2024 Feb 24.
Article in English | MEDLINE | ID: mdl-38402356

ABSTRACT

One drawback of existing artificial intelligence (AI)-based histopathological prediction models is the lack of interpretability. The objective of this study is to extract p16-positive oropharyngeal squamous cell carcinoma (OPSCC) features in a form that can be interpreted by pathologists using an AI model. We constructed a model for predicting p16 expression using a dataset of whole-slide images from 114 OPSCC biopsy cases. We used the clustering-constrained attention-based multiple-instance learning (CLAM) model, a weakly supervised learning approach. To improve performance, we incorporated tumor annotation into the model (Annot-CLAM) and achieved a mean area under the receiver operating characteristic curve of 0.905. Utilizing the image patches on which the model focused, we examined the features of model interest via histopathologic morphological analysis and cycle-consistent adversarial network (CycleGAN) image translation. The histopathologic morphological analysis evaluated the histopathological characteristics of image patches, revealing significant differences in the numbers of nuclei, the perimeters of the nuclei, and the intercellular bridges between p16-negative and p16-positive image patches. By using the CycleGAN-converted images, we confirmed that the sizes and densities of nuclei are significantly converted. This novel approach improves interpretability in histopathological morphology-based AI models and contributes to the advancement of clinically valuable histopathological morphological features.


Subjects
Carcinoma, Squamous Cell; Head and Neck Neoplasms; Oropharyngeal Neoplasms; Humans; Carcinoma, Squamous Cell/pathology; Artificial Intelligence; Pathologists; Oropharyngeal Neoplasms/pathology; Squamous Cell Carcinoma of Head and Neck; Supervised Machine Learning
17.
Sci Rep ; 14(1): 4489, 2024 Feb 24.
Article in English | MEDLINE | ID: mdl-38396157

ABSTRACT

Many critical issues arise when training deep neural networks using limited biological datasets. These include overfitting, exploding/vanishing gradients and other inefficiencies which are exacerbated by class imbalances and can affect the overall accuracy of a model. There is a need to develop semi-supervised models that can reduce the need for large, balanced, manually annotated datasets so that researchers can easily employ neural networks for experimental analysis. In this work, Iterative Pseudo Balancing (IPB) is introduced to classify stem cell microscopy images while performing on-the-fly dataset balancing using a student-teacher meta-pseudo-label framework. In addition, multi-scale patches of multi-label images are incorporated into the network training to provide previously inaccessible image features with both local and global information for effective and efficient learning. The combination of these inputs is shown to increase the classification accuracy of the proposed deep neural network by 3% over baseline, which is determined to be statistically significant. This work represents a novel use of pseudo-labeling for data-limited settings, which are common in biological image datasets, and highlights the importance of the exhaustive use of available image features for improving performance of semi-supervised networks. The proposed methods can be used to reduce the need for expensive manual dataset annotation and in turn accelerate the pace of scientific research involving non-invasive cellular imaging.


Subjects
Learning; Microscopy; Humans; Neural Networks, Computer; Product Labeling; Stem Cells; Image Processing, Computer-Assisted; Supervised Machine Learning
18.
Comput Biol Med ; 170: 108026, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38308865

ABSTRACT

Automatic segmentation of histopathology whole-slide images (WSI) usually involves supervised training of deep learning models with pixel-level labels to classify each pixel of the WSI into tissue regions such as benign or cancerous. However, fully supervised segmentation requires large-scale data manually annotated by experts, which can be expensive and time-consuming to obtain. Non-fully supervised methods, ranging from semi-supervised to unsupervised, have been proposed to address this issue and have been successful in WSI segmentation tasks. But these methods have mainly been focused on technical advancements in algorithmic performance rather than on the development of practical tools that could be used by pathologists or researchers in real-world scenarios. In contrast, we present DEPICTER (Deep rEPresentatIon ClusTERing), an interactive segmentation tool for histopathology annotation that produces a patch-wise dense segmentation map at WSI level. The interactive nature of DEPICTER leverages self- and semi-supervised learning approaches to allow the user to participate in the segmentation producing reliable results while reducing the workload. DEPICTER consists of three steps: first, a pretrained model is used to compute embeddings from image patches. Next, the user selects a number of benign and cancerous patches from the multi-resolution image. Finally, guided by the deep representations, label propagation is achieved using our novel seeded iterative clustering method or by directly interacting with the embedding space via feature space gating. We report both real-time interaction results with three pathologists and evaluate the performance on three public cancer classification dataset benchmarks through simulations. The code and demos of DEPICTER are publicly available at https://github.com/eduardchelebian/depicter.


Subjects
Benchmarking , Supervised Machine Learning , Cluster Analysis , Workload , Image Processing, Computer-Assisted
19.
Comput Biol Med ; 170: 108006, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38325216

ABSTRACT

BACKGROUND: AI-assisted polyp segmentation in colonoscopy plays a crucial role in enabling prompt diagnosis and treatment of colorectal cancer. However, the lack of sufficient annotated data poses a significant challenge for supervised learning approaches. Existing semi-supervised learning methods also suffer from performance degradation, mainly due to task-specific characteristics, such as class imbalance in polyp segmentation. PURPOSE: The purpose of this work is to develop an effective semi-supervised learning framework for accurate polyp segmentation in colonoscopy, addressing limited annotated data and class imbalance challenges. METHODS: We proposed PolypMixNet, a semi-supervised framework, for colorectal polyp segmentation, utilizing novel augmentation techniques and a Mean Teacher architecture to improve model performance. PolypMixNet introduces the polyp-aware mixup (PolypMix) algorithm and incorporates dual-level consistency regularization. PolypMix addresses the class imbalance in colonoscopy datasets and enhances the diversity of training data. By performing a polyp-aware mixup on unlabeled samples, it generates mixed images with polyp context along with their artificial labels. A polyp-directed soft pseudo-labeling (PDSPL) mechanism was proposed to generate high-quality pseudo labels and eliminate the dilution of lesion features caused by mixup operations. To ensure consistency in the training phase, we introduce the PolypMix prediction consistency (PMPC) loss and PolypMix attention consistency (PMAC) loss, enforcing consistency at both image and feature levels. Code is available at https://github.com/YChienHung/PolypMix. RESULTS: PolypMixNet was evaluated on four public colonoscopy datasets, achieving 88.97% Dice and 88.85% mIoU on the benchmark dataset of Kvasir-SEG. In scenarios where the labeled training data is limited to 15%, PolypMixNet outperforms the state-of-the-art semi-supervised approaches with a 2.88-point improvement in Dice. 
It also shows the ability to reach performance comparable to the fully supervised counterpart. Additionally, we conducted extensive ablation studies to validate the effectiveness of each module and highlight the superiority of our proposed approach. CONCLUSION: PolypMixNet effectively addresses the challenges posed by limited annotated data and unbalanced class distributions in polyp segmentation. By leveraging unlabeled data and incorporating novel augmentation and consistency regularization techniques, our method achieves state-of-the-art performance. We believe that the insights and contributions presented in this work will pave the way for further advancements in semi-supervised polyp segmentation and inspire future research in the medical imaging domain.
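The results above are reported as Dice and mIoU. As a reminder of what these overlap metrics measure, here is a minimal sketch for a single pair of binary masks. The function name `dice_and_iou` is hypothetical, masks are assumed to be flattened lists of 0/1 values, and the convention of returning 1.0 when both masks are empty is an assumption; the abstract does not say how the authors average IoU into mIoU.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# One overlapping pixel out of two predicted and two true foreground pixels.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
print(dice, iou)
```

For a single mask pair the two scores are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.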


Subjects
Algorithms , Benchmarking , Colonoscopy , Supervised Machine Learning , Image Processing, Computer-Assisted
20.
Sensors (Basel) ; 24(3)2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38339601

ABSTRACT

Deep learning models have gained prominence in human activity recognition using ambient sensors, particularly for telemonitoring older adults' daily activities in real-world scenarios. However, collecting large volumes of annotated sensor data presents a formidable challenge, given the time-consuming and costly nature of traditional manual annotation methods, especially for extensive projects. In response to this challenge, we propose a novel AttCLHAR model rooted in the self-supervised learning framework SimCLR and augmented with a self-attention mechanism. This model is designed for human activity recognition utilizing ambient sensor data, tailored explicitly for scenarios with limited or no annotations. AttCLHAR encompasses unsupervised pre-training and fine-tuning phases, sharing a common encoder module with two convolutional layers and a long short-term memory (LSTM) layer. The output is further connected to a self-attention layer, allowing the model to selectively focus on different input sequence segments. The incorporation of sharpness-aware minimization (SAM) aims to enhance model generalization by penalizing loss sharpness. The pre-training phase focuses on learning representative features from abundant unlabeled data, capturing both spatial and temporal dependencies in the sensor data. It facilitates the extraction of informative features for subsequent fine-tuning tasks. We extensively evaluated the AttCLHAR model using three CASAS smart home datasets (Aruba-1, Aruba-2, and Milan). We compared its performance against the SimCLR framework, SimCLR with SAM, and SimCLR with the self-attention layer. The experimental results demonstrate the superior performance of our approach, especially in semi-supervised and transfer learning scenarios. It outperforms existing models, marking a significant advancement in using self-supervised learning to extract valuable insights from unlabeled ambient sensor data in real-world environments.
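The abstract builds on SimCLR, whose pre-training objective is the normalized temperature-scaled cross-entropy (NT-Xent) contrastive loss. The sketch below shows that loss in plain Python for two lists of embeddings, where `z1[i]` and `z2[i]` are assumed to be two augmented views of the same sensor window. It illustrates the generic SimCLR loss only; the encoder, self-attention layer, and SAM optimizer of AttCLHAR are not modeled, and the function names and the temperature of 0.5 are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(z1, z2, temperature=0.5):
    """Mean NT-Xent loss over all 2N views; z1[i] and z2[i] form a positive pair."""
    z = z1 + z2
    n = len(z1)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner of view i
        positive = cosine(z[i], z[j]) / temperature
        # The denominator runs over every other view, positive included.
        others = [
            cosine(z[i], z[k]) / temperature for k in range(2 * n) if k != i
        ]
        log_denominator = math.log(sum(math.exp(s) for s in others))
        total += log_denominator - positive
    return total / (2 * n)


# Aligned views give a lower loss than mismatched ones.
views_a = [[1.0, 0.0], [0.0, 1.0]]
print(nt_xent(views_a, [[1.0, 0.0], [0.0, 1.0]]))
print(nt_xent(views_a, [[0.0, 1.0], [1.0, 0.0]]))
```

Minimizing this loss pulls the two views of each window together and pushes apart views of different windows, which is how features can be learned from unlabeled data before fine-tuning.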


Subjects
Awareness , Human Activities , Humans , Aged , Memory, Long-Term , Recognition, Psychology , Supervised Machine Learning