Results 1 - 20 of 401
1.
Comput Med Imaging Graph ; 116: 102416, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-39018640

ABSTRACT

Although deep learning has achieved state-of-the-art performance for automatic medical image segmentation, it often requires a large amount of pixel-level manual annotation for training. Obtaining these high-quality annotations is time-consuming and requires specialized knowledge, which hinders the widespread application of models that rely on such annotations for good segmentation performance. Using scribble annotations can substantially reduce the annotation cost, but often leads to poor segmentation performance due to insufficient supervision. In this work, we propose a novel framework named ScribSD+ that is based on multi-scale knowledge distillation and class-wise contrastive regularization for learning from scribble annotations. For a student network supervised by scribbles and a teacher based on an Exponential Moving Average (EMA) of the student, we first introduce multi-scale prediction-level Knowledge Distillation (KD) that leverages soft predictions of the teacher network to supervise the student at multiple scales, and then propose class-wise contrastive regularization that encourages feature similarity within the same class and dissimilarity across different classes, thereby effectively improving the segmentation performance of the student network. Experimental results on the ACDC dataset for heart structure segmentation and a fetal MRI dataset for placenta and fetal brain segmentation demonstrate that our method significantly improves the student's performance and outperforms five state-of-the-art scribble-supervised learning methods. Consequently, the method has potential to reduce the annotation cost of developing deep learning models for clinical diagnosis.
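The EMA teacher and multi-scale soft-prediction distillation described in this abstract can be sketched in a few lines. This is an illustrative numpy sketch under assumed names and a hypothetical temperature, not the authors' implementation:

```python
import numpy as np

def softmax(x, t=1.0, axis=-1):
    z = x / t
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ema_update(teacher_w, student_w, alpha=0.99):
    # Teacher weights track an exponential moving average of the student's.
    return {k: alpha * teacher_w[k] + (1 - alpha) * student_w[k]
            for k in teacher_w}

def multiscale_kd_loss(student_logits, teacher_logits, t=2.0):
    # Average KL(teacher || student) over prediction scales, using
    # temperature-softened teacher predictions as soft targets.
    losses = []
    for s, tch in zip(student_logits, teacher_logits):
        p_t = softmax(tch, t)
        p_s = softmax(s, t)
        losses.append(np.sum(p_t * (np.log(p_t + 1e-8) - np.log(p_s + 1e-8))))
    return float(np.mean(losses))
```

When student and teacher agree at every scale, the distillation loss vanishes; the gradient of this loss on the student pulls its soft predictions toward the teacher's.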

2.
Radiat Oncol ; 19(1): 89, 2024 Jul 09.
Article in English | MEDLINE | ID: mdl-38982452

ABSTRACT

BACKGROUND AND PURPOSE: To investigate the feasibility of synthesizing computed tomography (CT) images from magnetic resonance (MR) images in multi-center datasets using generative adversarial networks (GANs) for rectal cancer MR-only radiotherapy. MATERIALS AND METHODS: Conventional T2-weighted MR and CT images were acquired from 90 rectal cancer patients at Peking University People's Hospital and 19 patients in public datasets. This study proposed a new model combining contrastive learning loss and consistency regularization loss to enhance the generalization of the model for multi-center pelvic MRI-to-CT synthesis. The CT-to-sCT image similarity was evaluated by computing the mean absolute error (MAE), peak signal-to-noise ratio (SNRpeak), structural similarity index (SSIM), and Generalization Performance (GP). The dosimetric accuracy of synthetic CT was verified against CT-based dose distributions for the photon plan. Relative dose differences in the planning target volume and organs at risk were computed. RESULTS: Our model presented excellent generalization with a GP of 0.911 on unseen datasets and outperformed the plain CycleGAN: MAE decreased from 47.129 to 42.344, SNRpeak improved from 25.167 to 26.979, and SSIM increased from 0.978 to 0.992. The dosimetric analysis demonstrated that most of the relative differences in dose and volume histogram (DVH) indicators between synthetic CT and real CT were less than 1%. CONCLUSION: The proposed model can generate accurate synthetic CT in multi-center datasets from T2w-MR images. Most dosimetric differences were within clinically acceptable criteria for photon radiotherapy, demonstrating the feasibility of an MRI-only workflow for patients with rectal cancer.
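Two of the CT-to-sCT similarity metrics reported here, MAE and peak SNR, are straightforward to compute. A minimal numpy sketch, assuming images are given as arrays on a common intensity range (the exact preprocessing in the study is not specified here):

```python
import numpy as np

def mae(ct, sct):
    # Mean absolute intensity error between reference CT and synthetic CT.
    return float(np.mean(np.abs(ct - sct)))

def psnr(ct, sct, data_range):
    # Peak signal-to-noise ratio in dB for a given intensity range.
    mse = np.mean((ct - sct) ** 2)
    return float(10 * np.log10(data_range ** 2 / mse))
```

For example, a uniform 0.1 error on a unit-range image gives a PSNR of 20 dB; the MAE values in the abstract are in HU-scale units, so the corresponding `data_range` would be much larger.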


Subject(s)
Deep Learning , Magnetic Resonance Imaging , Radiotherapy Planning, Computer-Assisted , Rectal Neoplasms , Tomography, X-Ray Computed , Humans , Tomography, X-Ray Computed/methods , Magnetic Resonance Imaging/methods , Radiotherapy Planning, Computer-Assisted/methods , Rectal Neoplasms/radiotherapy , Rectal Neoplasms/diagnostic imaging , Female , Male , Middle Aged , Radiotherapy Dosage , Organs at Risk/radiation effects , Adult , Aged , Pelvis/diagnostic imaging , Image Processing, Computer-Assisted/methods , Feasibility Studies
3.
Sensors (Basel) ; 24(13)2024 Jun 25.
Article in English | MEDLINE | ID: mdl-39000906

ABSTRACT

Rock image classification is a challenging fine-grained image classification task characterized by subtle differences among closely related rock categories. Contrastive learning methods currently prevalent in fine-grained image classification limit the model's capacity to discern critical features contrastively from image pairs, and are typically too large for deployment on the mobile devices used for in situ rock identification. In this work, we introduce an innovative and compact model generation framework anchored by the design of a Feature Positioning Comparison Network (FPCN). The FPCN facilitates interaction between feature vectors from localized regions within image pairs, capturing both shared and distinctive features. Further, it accommodates the variable scales of objects depicted in images, which correspond to differing quantities of inherent object information, directing the network's attention to additional contextual details based on object size variability. Leveraging knowledge distillation, the architecture is streamlined, with a focus on nuanced information at activation boundaries to master the precise fine-grained decision boundaries, thereby enhancing the small model's accuracy. Empirical evidence demonstrates that our proposed method based on FPCN improves the classification accuracy of lightweight mobile models by nearly 2% while maintaining the same time and space consumption.

4.
Neural Netw ; 179: 106516, 2024 Jul 06.
Article in English | MEDLINE | ID: mdl-39003981

ABSTRACT

Temporal Knowledge Graphs (TKGs) enable effective modeling of knowledge dynamics and event evolution, facilitating deeper insights and analysis of temporal information. Recently, extrapolation in TKG reasoning has attracted great attention due to its remarkable ability to capture historical correlations and predict future events. Existing studies of extrapolation aim mainly at encoding the structural and temporal semantics based on snapshot sequences, which contain graph aggregators for the associations within snapshots and recurrent units for the evolution. However, these methods are limited in modeling long-distance history, as they primarily focus on capturing temporal correlations over shorter periods. In addition, a few approaches rely on compiling historical repetitive statistics of TKGs for predicting future facts, but they often overlook explicit interactions in the graph structure among concurrent events. To address these issues, we propose a PotentiaL concurrEnt Aggregation and contraStive learnING (PLEASING) method for TKG extrapolation. PLEASING is a two-step reasoning framework that effectively leverages the historical and potential features of TKGs. It includes two encoders for historical and global events with an adaptive gated mechanism, acquiring predictions with appropriate weights for the two aspects. Specifically, PLEASING constructs two auxiliary graphs to capture temporal interactions among timestamps and correlations among potential concurrent events, respectively, enabling a holistic investigation of temporal characteristics and future possibilities in TKGs. Furthermore, PLEASING incorporates contrastive learning to strengthen its capacity to identify whether queries are related to history. Extensive experiments on seven benchmark datasets demonstrate the state-of-the-art performance of PLEASING and its comprehensive ability to model TKG semantics.
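The adaptive gated mechanism that weighs the historical and global encoders can be illustrated with a toy sketch. All names, shapes, and the sigmoid gate parameterization here are assumptions for illustration, not the paper's actual design:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(hist_scores, glob_scores, w, b=0.0):
    # A scalar gate in [0, 1] is predicted from the concatenated branch
    # scores; it weights the historical branch against the global one.
    gate = sigmoid(np.concatenate([hist_scores, glob_scores]) @ w + b)
    return gate * hist_scores + (1 - gate) * glob_scores
```

With a saturated gate the fusion degenerates to a single branch, so the learned gate interpolates smoothly between "history-driven" and "globally driven" predictions.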

5.
Neural Netw ; 179: 106503, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38986189

ABSTRACT

Fusion-style Deep Multi-view Clustering (FDMC) can efficiently integrate comprehensive feature information from latent embeddings of multiple views and has drawn much attention recently. However, existing FDMC methods suffer from the interference of view-specific information in the fusion representation, affecting the learning of a discriminative cluster structure. In this paper, we propose a new framework, Progressive Neighbor-masked Contrastive Learning for FDMC (PNCL-FDMC), to tackle these issues. Specifically, by using neighbor-masked contrastive learning, PNCL-FDMC can explicitly maintain the local structure during embedding aggregation, which benefits the enhancement of common semantics in the fusion view. Based on the consistent aggregation, the fusion view is further enhanced by diversity-aware cluster structure enhancement. In this process, the enhanced cluster assignments and cluster discrepancies are employed to guide the weighted neighbor-masked contrastive alignment of semantic structure between individual views and the fusion view. Extensive experiments validate the effectiveness of the proposed framework, revealing its ability to learn discriminative representations and improve clustering performance.
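The neighbor-masking idea, excluding a sample's neighbors from the negative set of a contrastive loss so that local structure is not torn apart, can be sketched as a masked InfoNCE over two views. This is a generic sketch under assumed shapes and temperature, not the PNCL-FDMC loss itself:

```python
import numpy as np

def neighbor_masked_infonce(z1, z2, neighbor_mask, temp=0.5):
    # z1, z2: (n, d) embeddings of the same n samples in two views.
    # neighbor_mask[i, j] = True means j is a neighbor of i and is
    # excluded from i's negatives, preserving local structure.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temp                 # cross-view similarities
    n = sim.shape[0]
    losses = []
    for i in range(n):
        keep = ~neighbor_mask[i]
        keep[i] = True                     # matched sample is the positive
        denom = np.exp(sim[i][keep]).sum()
        losses.append(-(sim[i, i] - np.log(denom)))
    return float(np.mean(losses))
```

Masking all non-self entries as neighbors leaves only the positive in the denominator and the loss collapses to zero, which makes the effect of the mask easy to verify.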

6.
Curr Med Imaging ; 20(1): e15734056313837, 2024.
Article in English | MEDLINE | ID: mdl-39039669

ABSTRACT

INTRODUCTION: This study introduces SkinLiTE, a lightweight supervised contrastive learning model tailored to enhance the detection and typification of skin lesions in dermoscopic images. The core of SkinLiTE lies in its unique integration of supervised and contrastive learning approaches, which leverages labeled data to learn generalizable representations. This approach is particularly adept at handling the challenge of complexities and imbalances inherent in skin lesion datasets. METHODS: The methodology encompasses a two-phase learning process. In the first phase, SkinLiTE utilizes an encoder network and a projection head to transform and project dermoscopic images into a feature space where contrastive loss is applied, focusing on minimizing intra-class variations while maximizing inter-class differences. The second phase freezes the encoder's weights, leveraging the learned representations for classification through a series of dense and dropout layers. The model was evaluated using three datasets from Skin Cancer ISIC 2019-2020, covering a wide range of skin conditions. RESULTS: SkinLiTE demonstrated superior performance across various metrics, including accuracy, AUC, and F1 scores, particularly when compared with traditional supervised learning models. Notably, SkinLiTE achieved an accuracy of 0.9087 using AugMix augmentation for binary classification of skin lesions. It also showed comparable results with the state-of-the-art approaches of ISIC challenge without relying on external data, underscoring its efficacy and efficiency. The results highlight the potential of SkinLiTE as a significant step forward in the field of dermatological AI, offering a robust, efficient, and accurate tool for skin lesion detection and classification. 
Its lightweight architecture and ability to handle imbalanced datasets make it particularly suited for integration into Internet of Medical Things environments, paving the way for enhanced remote patient monitoring and diagnostic capabilities. CONCLUSION: This research contributes to the evolving landscape of AI in healthcare, demonstrating the impact of innovative learning methodologies in medical image analysis.
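The first phase of SkinLiTE applies a contrastive loss that minimizes intra-class variation while maximizing inter-class differences; a supervised contrastive loss in the style of Khosla et al. behaves this way. A minimal numpy sketch, with illustrative names and temperature, not the model's actual training code:

```python
import numpy as np

def supcon_loss(feats, labels, temp=0.1):
    # Supervised contrastive loss: projected features sharing a label are
    # pulled together; all other samples act as negatives.
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = f @ f.T / temp
    n = len(labels)
    loss = 0.0
    for i in range(n):
        pos = (labels == labels[i])
        pos[i] = False                      # exclude the anchor itself
        if not pos.any():
            continue
        denom = np.exp(sim[i][np.arange(n) != i]).sum()
        loss += -np.mean(sim[i][pos] - np.log(denom))
    return loss / n
```

In the second phase the encoder would be frozen and only a classification head trained on these representations; the loss is lower when same-class features cluster together, which is the property the frozen encoder hands to the classifier.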


Subject(s)
Dermoscopy , Skin Neoplasms , Supervised Machine Learning , Humans , Dermoscopy/methods , Skin Neoplasms/diagnostic imaging , Image Interpretation, Computer-Assisted/methods , Skin/diagnostic imaging
7.
Neural Netw ; 178: 106485, 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38959597

ABSTRACT

Detecting Out-of-Distribution (OOD) inputs is essential for reliable deep learning in the open world. However, most existing OOD detection methods have been developed on training sets with balanced class distributions, making them susceptible when confronted with training sets following a long-tailed distribution. To alleviate this problem, we propose an effective three-branch training framework, which demonstrates the efficacy of incorporating an extra rejection class along with auxiliary outlier training data for effective OOD detection in long-tailed image classification. In our proposed framework, all outlier training samples are assigned the label of the rejection class. We employ an inlier loss, an outlier loss, and a Tail-class prototype induced Supervised Contrastive Loss (TSCL) to train both the in-distribution classifier and the OOD detector within one network. During inference, the OOD detector is constructed using the rejection class. Extensive experimental results demonstrate the superior OOD detection performance of our proposed method in long-tailed image classification. For example, in the more challenging case where CIFAR100-LT is used as in-distribution, our method improves the average AUROC by 1.23% and reduces the average FPR95 by 3.18% compared to the baseline method utilizing Outlier Exposure (OE). Code is available on GitHub.
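The inference step, building the OOD detector from the rejection class, can be illustrated by using the rejection-class probability as the OOD score. A hypothetical sketch (names, threshold, and the argmax-over-remaining-classes rule are assumptions):

```python
import numpy as np

def softmax(x):
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def detect_ood(logits, reject_idx, threshold=0.5):
    # The probability of the rejection class serves as the OOD score;
    # in-distribution predictions come from the remaining classes.
    probs = softmax(logits)
    ood_score = probs[:, reject_idx]
    is_ood = ood_score > threshold
    in_dist_pred = np.argmax(np.delete(probs, reject_idx, axis=1), axis=1)
    return is_ood, in_dist_pred, ood_score
```

A sample whose rejection logit dominates is flagged as OOD; otherwise the classifier falls back to the ordinary class head.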

8.
Accid Anal Prev ; 206: 107698, 2024 Jul 03.
Article in English | MEDLINE | ID: mdl-38964139

ABSTRACT

With the development of driving behavior monitoring technologies, commercial transportation enterprises have leveraged aberrant driving event detection results to evaluate crash risk and trigger proactive interventions. State-of-the-art applications were established upon instant associations between events and crash occurrence, which assume crash risk surges with aberrant events. Consequently, the generated crash risk monitoring results contain only discrete abrupt changes, failing to depict the time-varying trend of crash risk and posing challenges for interventions. Given the multiple types of aberrant events and their various temporal combinations, the key to depicting the time-varying trend of crash risk is analyzing the temporal coupling influence of multi-type events. Existing studies employed event frequency to model combined influence, lacking the capability to differentiate the temporal sequential characteristics of events. Hence, there is an urgent need to further explore the temporal coupling influence of multi-type events on crash risk. In this study, the temporal associations between multi-type aberrant driving events and crash occurrence are explored. Specifically, a contrastive learning method, fusing prior domain knowledge and empirical data, was proposed to analyze the temporal influence of single events on crash risk. After that, a novel Crash Risk Evaluation Transformer (RiskFormer) was developed. In the RiskFormer, a unified encoding method for different events, as well as a self-attention mechanism, were established to learn the temporal coupling influence of multi-type events. Empirical data from online ride-hailing services were employed, and the modeling results unveiled three distinct time-varying patterns of crash risk: decay, increasing, and increasing-decay.
Additionally, RiskFormer exhibited remarkable crash risk evaluation performance, demonstrating a 12.8% improvement in the Area Under Curve (AUC) score compared to the conventional instant-association-based model. Furthermore, the practical utility of RiskFormer was illustrated through a crash risk monitoring sample case. Finally, applications of the proposed methods and their further investigations have been discussed.

9.
Front Neurorobot ; 18: 1428785, 2024.
Article in English | MEDLINE | ID: mdl-38947247

ABSTRACT

Next Point-of-Interest (POI) recommendation aims to predict the next POI for users from their historical activities. Existing methods typically rely on location-level POI check-in trajectories to explore user sequential transition patterns, which suffer from the severe check-in data sparsity issue. Taking region-level and category-level POI sequences into account can help address this issue. Moreover, collaborative information between different granularities of POI sequences is not well utilized; exploiting it can facilitate mutual enhancement and benefit user preference learning. To address these challenges, we propose multi-granularity contrastive learning (MGCL) for next POI recommendation, which utilizes multi-granularity representation and contrastive learning to improve next POI recommendation performance. Specifically, the location-level POI graph and the category-level and region-level sequences are first constructed. Then, we use graph convolutional networks on the POI graph to extract cross-user sequential transition patterns. Furthermore, self-attention networks are used to learn individual user sequential transition patterns for each granularity level. To capture the collaborative signals across granularities, we apply contrastive learning. Finally, we jointly train the recommendation and contrastive learning tasks. Extensive experiments demonstrate that MGCL is more effective than state-of-the-art methods.

10.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-38975895

ABSTRACT

Spatial transcriptomics provides valuable insights into gene expression within the native tissue context, effectively merging molecular data with spatial information to uncover intricate cellular relationships and tissue organization. In this context, deciphering cellular spatial domains becomes essential for revealing complex cellular dynamics and tissue structures. However, current methods encounter challenges in seamlessly integrating gene expression data with spatial information, resulting in less informative representations of spots and suboptimal accuracy in spatial domain identification. We introduce stCluster, a novel method that integrates graph contrastive learning with multi-task learning to refine informative representations for spatial transcriptomic data, consequently improving spatial domain identification. stCluster first leverages graph contrastive learning to obtain discriminative representations capable of recognizing spatially coherent patterns. By jointly optimizing multiple tasks, stCluster further fine-tunes the representations to capture complex relationships between gene expression and spatial organization. Benchmarked against six state-of-the-art methods, the experimental results reveal its proficiency in accurately identifying complex spatial domains across various datasets and platforms, spanning tissue, organ, and embryo levels. Moreover, stCluster can effectively denoise spatial gene expression patterns and enhance spatial trajectory inference. The source code of stCluster is freely available at https://github.com/hannshu/stCluster.


Subject(s)
Gene Expression Profiling , Transcriptome , Gene Expression Profiling/methods , Computational Biology/methods , Algorithms , Humans , Animals , Software , Machine Learning
11.
Med Phys ; 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39031488

ABSTRACT

BACKGROUND: With the rapid advancement of medical imaging technologies, precise image analysis and diagnosis play a crucial role in enhancing treatment outcomes and patient care. Computed tomography (CT) and magnetic resonance imaging (MRI), as pivotal technologies in medical imaging, exhibit unique advantages in bone imaging and soft tissue contrast, respectively. However, cross-domain medical image registration confronts significant challenges due to the substantial differences in contrast, texture, and noise levels between different imaging modalities. PURPOSE: The purpose of this study is to address the major challenges encountered in cross-domain medical image registration by proposing a spatial-aware contrastive learning approach that effectively integrates shared information from CT and MRI images. Our objective is to optimize the feature space representation by employing advanced reconstruction and contrastive loss functions, overcoming the limitations of traditional registration methods when dealing with different imaging modalities. Through this approach, we aim to enhance the model's ability to learn structural similarities across domain images, improve registration accuracy, and provide more precise imaging analysis tools for clinical diagnosis and treatment planning. METHODS: With the prior knowledge that images from different domains (CT and MRI) share the same content-style information, we extract equivalent feature spaces from both images, enabling accurate cross-domain point matching. We employ a structure resembling that of an autoencoder, augmented with designed reconstruction and contrastive losses to fulfill our objectives. We also propose a region mask to resolve the conflict between spatial correlation and distinctiveness, obtaining a better representation space.
RESULTS: Our research results demonstrate the significant superiority of the proposed spatial-aware contrastive learning approach in the domain of cross-domain medical image registration. Quantitatively, our method achieved an average Dice similarity coefficient (DSC) of 85.68%, target registration error (TRE) of 1.92 mm, and mean Hausdorff distance (MHD) of 1.26 mm, surpassing current state-of-the-art methods. Additionally, the registration processing time was significantly reduced to 2.67 s on a GPU, highlighting the efficiency of our approach. The experimental outcomes not only validate the effectiveness of our method in improving the accuracy of cross-domain image registration but also prove its adaptability across different medical image analysis scenarios, offering robust support for enhancing diagnostic precision and patient treatment outcomes. CONCLUSIONS: The spatial-aware contrastive learning approach proposed in this paper introduces a new perspective and solution to the domain of cross-domain medical image registration. By effectively optimizing the feature space representation through carefully designed reconstruction and contrastive loss functions, our method significantly improves the accuracy and stability of registration between CT and MRI images. The experimental results demonstrate the clear advantages of our approach in enhancing the accuracy of cross-domain image registration, offering significant application value in promoting precise diagnosis and personalized treatment planning. In the future, we look forward to further exploring the application of this method in a broader range of medical imaging datasets and its potential integration with other advanced technologies, contributing more innovations to the field of medical image analysis and processing.
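The Dice similarity coefficient (DSC) reported in the results measures the overlap between registered and reference segmentations. A minimal numpy sketch of the standard definition, assuming binary masks as input:

```python
import numpy as np

def dice(a, b, eps=1e-8):
    # DSC = 2 |A ∩ B| / (|A| + |B|) for binary masks a and b.
    a = a.astype(bool)
    b = b.astype(bool)
    inter = np.logical_and(a, b).sum()
    return float(2 * inter / (a.sum() + b.sum() + eps))
```

Identical masks score 1, disjoint masks score 0; the 85.68% average DSC in the abstract would be this quantity averaged over structures and cases.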

12.
Int J Mol Sci ; 25(13)2024 Jun 30.
Article in English | MEDLINE | ID: mdl-39000335

ABSTRACT

In various domains, including everyday activities, agricultural practices, and medical treatments, the escalating challenge of antibiotic resistance poses a significant concern. Traditional approaches to studying antibiotic resistance genes (ARGs) often require substantial time and effort and are limited in accuracy. Moreover, the decentralized nature of existing data repositories complicates comprehensive analysis of antibiotic resistance gene sequences. In this study, we introduce a novel computational framework named TGC-ARG designed to predict potential ARGs. This framework takes protein sequences as input, utilizes SCRATCH-1D for protein secondary structure prediction, and employs feature extraction techniques to derive distinctive features from both sequence and structural data. Subsequently, a Siamese network is employed to foster a contrastive learning environment, enhancing the model's ability to effectively represent the data. Finally, a multi-layer perceptron (MLP) integrates and processes sequence embeddings alongside predicted secondary structure embeddings to forecast ARG presence. To evaluate our approach, we curated a pioneering open dataset termed ARSS (Antibiotic Resistance Sequence Statistics). Comprehensive comparative experiments demonstrate that our method surpasses current state-of-the-art methodologies. Additionally, through detailed case studies, we illustrate the efficacy of our approach in predicting potential ARGs.
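The Siamese network described here fosters a contrastive learning environment over paired inputs; a classic formulation is the margin-based contrastive loss of Hadsell et al. over twin embeddings. The sketch below is that generic loss, with assumed names and margin, not the TGC-ARG training objective itself:

```python
import numpy as np

def siamese_contrastive_loss(f1, f2, same, margin=1.0):
    # f1, f2: (n, d) embeddings from the two branches of a Siamese network.
    # same[i] = 1 pulls a pair together; same[i] = 0 pushes it apart
    # until the embeddings are at least `margin` away.
    d = np.linalg.norm(f1 - f2, axis=1)
    return float(np.mean(same * d ** 2
                         + (1 - same) * np.maximum(0.0, margin - d) ** 2))
```

A matched pair at distance zero contributes nothing, while an unmatched pair at distance zero pays the full squared margin, which is what drives the two classes apart in embedding space.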


Subject(s)
Drug Resistance, Microbial , Drug Resistance, Microbial/genetics , Computational Biology/methods , Protein Structure, Secondary , Machine Learning , Anti-Bacterial Agents/pharmacology , Drug Resistance, Bacterial/genetics , Neural Networks, Computer
13.
Gigascience ; 132024 Jan 02.
Article in English | MEDLINE | ID: mdl-39028588

ABSTRACT

BACKGROUND: Integrative analysis of spatially resolved transcriptomics datasets empowers a deeper understanding of complex biological systems. However, integrating multiple tissue sections presents challenges for batch effect removal, particularly when the sections are measured by various technologies or collected at different times. FINDINGS: We propose spatiAlign, an unsupervised contrastive learning model that employs the expression of all measured genes and the spatial location of cells, to integrate multiple tissue sections. It enables the joint downstream analysis of multiple datasets not only in low-dimensional embeddings but also in the reconstructed full expression space. CONCLUSIONS: In benchmarking analysis, spatiAlign outperforms state-of-the-art methods in learning joint and discriminative representations for tissue sections, each potentially characterized by complex batch effects or distinct biological characteristics. Furthermore, we demonstrate the benefits of spatiAlign for the integrative analysis of time-series brain sections, including spatial clustering, differential expression analysis, and particularly trajectory inference that requires a corrected gene expression matrix.


Subject(s)
Gene Expression Profiling , Transcriptome , Unsupervised Machine Learning , Gene Expression Profiling/methods , Computational Biology/methods , Humans , Algorithms , Animals , Cluster Analysis , Brain/metabolism
14.
Heliyon ; 10(12): e32929, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-39022062

ABSTRACT

Criteria Based Content Analysis (CBCA) is a forensic tool that analyzes victim statements. It categorizes victims' statements into 19 distinct criteria classifications and plays a crucial role in evaluating the authenticity of testimonies by discerning whether they are rooted in genuine experiences or fabricated accounts. The exclusion of subjective opinion becomes imperative to assess statements through this forensic tool objectively. This study proposes an objective classification model for CBCA-based statement analysis using natural language processing techniques. Nevertheless, achieving optimal classification performance proves challenging due to imbalances in the data distribution among the various criterion classifications. To enhance the accuracy and reliability of the classification model, this research employs data augmentation techniques and dual contrastive learning methods to fine-tune the RoBERTa language model. Furthermore, model-based optimization techniques are applied to identify augmented hyper-parameters and maximize the model's classification performance. The study's findings, including an 8.5% improvement in macro F1 score over human classification results and, relative to previous classification results, a 24% improvement in macro F1 score and a 13% improvement in accuracy, suggest that the proposed model is highly effective in reducing the influence of human subjectivity in statement analysis. The proposed model has significant implications for legal proceedings and criminal investigations, as it can provide a more objective and reliable method for evaluating the credibility of victim statements. Reducing human subjectivity in the statement analysis process can increase the accuracy of verdicts and help ensure that justice is served.
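The macro F1 score used as the headline metric here averages per-class F1 scores with equal weight, which is why it is the right metric for the imbalanced criteria classes the abstract mentions. A minimal numpy sketch of the standard definition:

```python
import numpy as np

def macro_f1(y_true, y_pred):
    # Per-class F1 averaged with equal weight, so rare classes count
    # as much as frequent ones.
    classes = np.unique(np.concatenate([y_true, y_pred]))
    f1s = []
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(f1s))
```

Because every class contributes equally, a model that ignores a minority criterion is penalized here even if its plain accuracy stays high.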

15.
Brief Bioinform ; 25(4)2024 May 23.
Article in English | MEDLINE | ID: mdl-39038935

ABSTRACT

Functional peptides play crucial roles in various biological processes and hold significant potential in many fields such as drug discovery and biotechnology. Accurately predicting the functions of peptides is essential for understanding their diverse effects and designing peptide-based therapeutics. Here, we propose CELA-MFP, a deep learning framework that incorporates feature Contrastive Enhancement and Label Adaptation for predicting Multi-Functional therapeutic Peptides. CELA-MFP utilizes a protein language model (pLM) to extract features from peptide sequences, which are then fed into a Transformer decoder for function prediction, effectively modeling correlations between different functions. To enhance the representation of each peptide sequence, contrastive learning is employed during training. Experimental results demonstrate that CELA-MFP outperforms state-of-the-art methods on most evaluation metrics for two widely used datasets, MFBP and MFTP. The interpretability of CELA-MFP is demonstrated by visualizing attention patterns in pLM and Transformer decoder. Finally, a user-friendly online server for predicting multi-functional peptides is established as the implementation of the proposed CELA-MFP and can be freely accessed at http://dreamai.cmii.online/CELA-MFP.


Subject(s)
Deep Learning , Peptides , Peptides/chemistry , Computational Biology/methods , Software , Humans , Algorithms , Protein Databases
16.
Front Artif Intell ; 7: 1414352, 2024.
Article in English | MEDLINE | ID: mdl-38933470

ABSTRACT

Time series is a typical data type in numerous domains; however, labeling large amounts of time series data can be costly and time-consuming, and learning effective representations from unlabeled time series data is a challenging task. Contrastive learning stands out as a promising method to acquire representations of unlabeled time series data. Therefore, we propose a self-supervised time-series representation learning framework via Time-Frequency Fusion Contrasting (TF-FC) to learn time-series representations from unlabeled data. Specifically, TF-FC combines time-domain augmentation with frequency-domain augmentation to generate diverse samples. For time-domain augmentation, the raw time series passes through a time-domain augmentation bank (such as jitter, scaling, permutation, and masking) to obtain time-domain augmented data. For frequency-domain augmentation, the raw time series is first converted into the frequency domain via the Fast Fourier Transform (FFT); the frequency data then passes through a frequency-domain augmentation bank (such as low-pass filtering, frequency removal, frequency addition, and phase shifting) to obtain frequency-domain augmented data. The fusion method for time-domain and frequency-domain augmented data is kernel PCA, which is useful for extracting nonlinear features in high-dimensional spaces. By capturing both the time and frequency domains of the time series, the proposed approach is able to extract more informative features from the data, enhancing the model's capacity to distinguish between different time series. To verify the effectiveness of the TF-FC method, we conducted experiments on four time series domain datasets (i.e., SleepEEG, HAR, Gesture, and Epilepsy). Experimental results show that TF-FC significantly improves recognition accuracy compared with other SOTA methods.
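Two of the frequency-domain augmentations named in this abstract, low-pass filtering and phase shifting, are easy to express with numpy's real FFT. A minimal sketch with illustrative parameter values (the cutoff ratio and phase offset are assumptions, not the paper's settings):

```python
import numpy as np

def low_pass_augment(x, keep_ratio=0.25):
    # Zero out the high-frequency bins of the real FFT, then invert.
    spec = np.fft.rfft(x)
    cutoff = max(1, int(len(spec) * keep_ratio))
    spec[cutoff:] = 0
    return np.fft.irfft(spec, n=len(x))

def phase_shift_augment(x, shift=0.5):
    # Rotate every frequency bin by a fixed phase, then invert.
    spec = np.fft.rfft(x)
    spec = spec * np.exp(1j * shift)
    return np.fft.irfft(spec, n=len(x))
```

A constant signal survives the low-pass filter untouched (only the DC bin carries energy), while a high-frequency sinusoid is wiped out entirely, which is the contrast the augmentation bank exploits to create diverse views of one series.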

17.
Neural Netw ; 178: 106477, 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38936109

ABSTRACT

Clothing change person re-identification (CC-ReID) aims to match images of the same person wearing different clothes across diverse scenes. Leveraging biological features or clothing labels, existing CC-ReID methods have demonstrated promising performance. However, current research primarily focuses on supervised CC-ReID methods, which require a substantial number of manually annotated labels. To tackle this challenge, we propose a novel clothing-invariant contrastive learning (CICL) framework for the unsupervised CC-ReID task. First, to obtain clothing-change positive pairs at low computational cost, we propose a random clothing augmentation (RCA) method. RCA partitions clothing regions based on parsing images, then applies random augmentation to different clothing regions, generating clothing-change positive pairs that facilitate clothing-invariant learning. Second, to generate pseudo-labels strongly correlated with identity in an unsupervised manner, we design semantic fusion clustering (SFC), which enhances identity-related information through semantic fusion. Additionally, we develop a semantic alignment contrastive loss (SAC loss) that encourages the model to learn features strongly correlated with identity and improves its robustness to clothing changes. Unlike existing optimization methods that forcibly pull together clusters with different pseudo-labels, SAC loss aligns the clustering results of real image features with those generated by SFC, forming a mutually reinforcing scheme with SFC. Experimental results on multiple CC-ReID datasets demonstrate that the proposed CICL not only outperforms existing unsupervised methods but even achieves performance competitive with supervised CC-ReID methods. Code is made available at https://github.com/zqpang/CICL.
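The contrastive objective behind such positive-pair learning can be illustrated with a standard InfoNCE-style loss, where each anchor image should be most similar to its own augmented (e.g., clothing-changed) counterpart within the batch. This is a generic NumPy sketch of that loss, not the SAC loss from the abstract; the temperature and feature sizes are assumed.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss: each anchor should be closest to
    its own augmented positive among all positives in the batch."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))             # diagonal = matching pairs

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 32))
close_pos = feats + 0.01 * rng.standard_normal((8, 32))  # good positives
rand_pos = rng.standard_normal((8, 32))                  # unrelated samples
loss_good = info_nce(feats, close_pos)
loss_bad = info_nce(feats, rand_pos)
```

As expected, well-matched positives yield a much lower loss than unrelated pairs, which is the signal that drives clothing-invariant feature learning.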

18.
Entropy (Basel) ; 26(6)2024 May 29.
Article in English | MEDLINE | ID: mdl-38920479

ABSTRACT

Multi-view clustering requires simultaneous attention to both the consistency and the diversity of information across views. Deep learning techniques have shown impressive abilities to learn complex features from extensive datasets; however, existing deep multi-view clustering methods often focus only on consistency information or only on diversity information, making it difficult to balance the two. Therefore, this paper proposes view-driven multi-view clustering using a contrastive double-learning method (VMC-CD), aiming to generate better clustering results. The method first adopts a view-driven approach that considers information from other views to encourage diversity, thus guiding feature learning. It then applies dual contrastive learning to align views at both the clustering level and the feature level. Experimental results on three datasets substantiate the superiority of VMC-CD over various cutting-edge methods, affirming its effectiveness.
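The idea of dual contrastive learning at two levels can be sketched generically: a feature-level term contrasts per-sample embeddings across views (rows), while a cluster-level term contrasts per-cluster assignment vectors (columns of the soft-assignment matrix). This is an illustrative NumPy sketch under those assumptions, not the VMC-CD objective itself.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between the rows of a and b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def nce(a, b, tau=0.5):
    """Contrastive term: row i of a should match row i of b."""
    logits = cosine_sim(a, b) / tau
    logits -= logits.max(axis=1, keepdims=True)
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(logp))

def dual_contrastive(z1, z2, c1, c2):
    """Feature-level term on embeddings (rows) plus cluster-level term
    on soft cluster assignments (columns, hence the transpose)."""
    return nce(z1, z2) + nce(c1.T, c2.T)

rng = np.random.default_rng(0)
z1 = rng.standard_normal((16, 8))            # view-1 embeddings
z2 = z1 + 0.05 * rng.standard_normal((16, 8))  # view-2 embeddings
c1 = rng.random((16, 3))                     # view-1 soft assignments, 3 clusters
c2 = c1 + 0.05 * rng.random((16, 3))         # view-2 soft assignments
loss = dual_contrastive(z1, z2, c1, c2)
```

Minimizing both terms jointly pushes the two views to agree on individual samples and on cluster structure at the same time.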

19.
Front Artif Intell ; 7: 1398844, 2024.
Article in English | MEDLINE | ID: mdl-38873178

ABSTRACT

Active learning is a field of machine learning that seeks the most informative samples to annotate within a given budget, particularly when obtaining labeled data is expensive or infeasible. This is becoming increasingly important with the growing success of learning-based methods, which often require large amounts of labeled data. Computer vision is one area where active learning has shown promise, in tasks such as image classification, semantic segmentation, and object detection. In this research, we propose a pool-based semi-supervised active learning method for image classification that takes advantage of both labeled and unlabeled data. Many active learning approaches do not utilize unlabeled data, but we believe that incorporating these data can improve performance. Our method involves several steps. First, we cluster the latent space of a pre-trained convolutional autoencoder. Then, we use a proposed clustering contrastive loss to strengthen the latent space's clustering while using a small amount of labeled data. Finally, we query the samples with the highest uncertainty for annotation by an oracle, and repeat this process until the budget is exhausted. Our method is effective when the number of annotated samples is small, and we have validated its effectiveness through experiments on benchmark datasets. Our empirical results demonstrate the strength of our method for image classification in terms of accuracy.
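The pool-based query loop at the core of such methods is simple to sketch. The version below uses entropy-based uncertainty sampling with a logistic-regression classifier on toy 2-D data; the classifier, data, seed set, and budget are all illustrative stand-ins for the autoencoder-based pipeline in the abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy pool: two well-separated Gaussian blobs (classes 0 and 1).
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

labeled = [0, 50, 100, 150]   # tiny seed set covering both classes
budget = 20

for _ in range(budget):
    # Retrain on everything the oracle has labeled so far.
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    probs = clf.predict_proba(X)
    # Predictive entropy as the uncertainty score for each pool sample.
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    entropy[labeled] = -np.inf           # never re-query labeled points
    labeled.append(int(entropy.argmax()))  # oracle labels the most uncertain
print(len(labeled))  # prints 24
```

The abstract's method replaces the raw features here with autoencoder latents shaped by a clustering contrastive loss, but the query-retrain loop is the same.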

20.
J Imaging Inform Med ; 2024 Jun 10.
Article in English | MEDLINE | ID: mdl-38858260

ABSTRACT

To develop a robust segmentation model, encoding the underlying features/structures of the input data is essential for discriminating the target structure from the background. To enrich the extracted feature maps, contrastive learning and self-learning techniques are employed, particularly when the size of the training dataset is limited. In this work, we set out to investigate the impact of contrastive learning and self-learning on the performance of deep learning-based semantic segmentation. To this end, three datasets were employed, used for brain tumor and hippocampus delineation from MR images (BraTS and Decathlon datasets, respectively) and kidney segmentation from CT images (Decathlon dataset). Since data augmentation techniques also aim to enhance the performance of deep learning methods, a deformable data augmentation technique was proposed and compared with the contrastive learning and self-learning frameworks. Segmentation accuracy on the three datasets was assessed with and without data augmentation, contrastive learning, and self-learning, to isolate the impact of each technique. The self-learning and deformable data augmentation techniques exhibited comparable performance, with Dice indices of 0.913 ± 0.030 and 0.920 ± 0.022 for kidney segmentation, 0.890 ± 0.035 and 0.898 ± 0.027 for hippocampus segmentation, and 0.891 ± 0.045 and 0.897 ± 0.040 for lesion segmentation, respectively. These two approaches significantly outperformed contrastive learning and the original model, which achieved Dice indices of 0.871 ± 0.039 and 0.868 ± 0.042 for kidney segmentation, 0.872 ± 0.045 and 0.865 ± 0.048 for hippocampus segmentation, and 0.870 ± 0.049 and 0.860 ± 0.058 for lesion segmentation, respectively. The combination of self-learning with deformable data augmentation led to a robust segmentation model with no outliers in the outcomes.
This work demonstrated the beneficial impact of self-learning and deformable data augmentation on organ and lesion segmentation, with no additional training datasets needed.
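Deformable (elastic) augmentation of the kind compared above is commonly implemented by warping the image with a smooth random displacement field. The sketch below is one standard recipe (random noise blurred by a Gaussian, then bilinear resampling); the deformation strength `alpha` and smoothness `sigma` are assumed values, not the paper's settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

rng = np.random.default_rng(0)

def elastic_deform(img, alpha=10.0, sigma=4.0):
    """Deformable augmentation: warp the image with a smooth random
    displacement field (uniform noise blurred by a Gaussian)."""
    dx = gaussian_filter(rng.uniform(-1, 1, img.shape), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, img.shape), sigma) * alpha
    ys, xs = np.meshgrid(np.arange(img.shape[0]), np.arange(img.shape[1]),
                         indexing="ij")
    coords = np.stack([ys + dy, xs + dx])
    # Bilinear interpolation at the displaced coordinates.
    return map_coordinates(img, coords, order=1, mode="reflect")

img = rng.random((64, 64))          # stand-in for a 2-D MR/CT slice
warped = elastic_deform(img)
print(warped.shape)  # (64, 64)
```

For segmentation training, the same displacement field must be applied to the label mask (with nearest-neighbor interpolation, `order=0`) so image and annotation stay aligned.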
