Search | VHL Regional Portal (BVS)

Exploring Generalizable Distillation for Efficient Medical Image Segmentation.

Qi, Xingqun; Wu, Zhuojie; Zou, Wenxuan; Ren, Min; Gao, Yifan; Sun, Muyi; Zhang, Shanghang; Shan, Caifeng; Sun, Zhenan.

IEEE J Biomed Health Inform ; 28(7): 4170-4183, 2024 Jul.

Article in English | MEDLINE | ID: mdl-38954557

ABSTRACT

Efficient medical image segmentation aims to provide accurate pixel-wise predictions with a lightweight implementation framework. However, existing lightweight networks generally overlook generalizability across cross-domain medical segmentation tasks. In this paper, we propose Generalizable Knowledge Distillation (GKD), a novel framework that enhances the performance of lightweight networks on cross-domain medical segmentation by distilling generalizable knowledge from powerful teacher networks. Considering the domain gaps between different medical datasets, we propose Model-Specific Alignment Networks (MSAN) to obtain domain-invariant representations. Meanwhile, a customized Alignment Consistency Training (ACT) strategy is designed to promote MSAN training. Based on the domain-invariant vectors in MSAN, we propose two generalizable distillation schemes, Dual Contrastive Graph Distillation (DCGD) and Domain-Invariant Cross Distillation (DICD). In DCGD, two implicit contrastive graphs are designed to model the intra-coupling and inter-coupling semantic correlations. Then, in DICD, the domain-invariant semantic vectors are reconstructed from the two networks (i.e., teacher and student) in a crossover manner to hierarchically achieve simultaneous generalization of the lightweight networks. Moreover, a metric named Fréchet Semantic Distance (FSD) is tailored to verify the effectiveness of the regularized domain-invariant features. Extensive experiments conducted on the Liver, Retinal Vessel and Colonoscopy segmentation datasets demonstrate the superiority of our method in terms of performance and generalization ability on lightweight networks.

Subject(s)

Image Processing, Computer-Assisted ; Humans ; Image Processing, Computer-Assisted/methods ; Algorithms ; Neural Networks, Computer ; Databases, Factual ; Deep Learning
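
The abstract above builds on knowledge distillation from a teacher network to a lightweight student. As a rough illustration of that general idea only, the following sketch implements the classic soft-target distillation loss (temperature-softened KL divergence) averaged over pixels. It is not the GKD, DCGD, or DICD scheme from the paper, whose details the abstract does not give; the function names `softmax`, `kd_loss`, and `pixelwise_kd_loss` and the default temperature are assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic soft-target distillation, assumed here for illustration;
    it is not the loss defined in the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def pixelwise_kd_loss(student_map, teacher_map, temperature=2.0):
    """Mean distillation loss over pixels.

    Each map is a list of per-pixel class-logit lists (hypothetical layout).
    """
    losses = [kd_loss(s, t, temperature) for s, t in zip(student_map, teacher_map)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    teacher = [[3.0, 0.5], [0.2, 2.5]]   # two pixels, two classes
    student = [[1.0, 1.0], [0.0, 1.0]]
    print(pixelwise_kd_loss(student, teacher))
```

The loss is zero when student and teacher logits agree and grows as their softened class distributions diverge, which is what pushes the student toward the teacher's per-pixel predictions.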

A multimodal physiological dataset for driving behaviour analysis.

Tao, Xiaoming; Gao, Dingcheng; Zhang, Wenqi; Liu, Tianqi; Du, Bing; Zhang, Shanghang; Qin, Yanjun.

Sci Data ; 11(1): 378, 2024 Apr 12.

Article in English | MEDLINE | ID: mdl-38609440

ABSTRACT

Physiological signal monitoring and driver behavior analysis have gained increasing attention in both fundamental and applied research. This study involved the analysis of driving behavior using multimodal physiological data collected from 35 participants. The data included 59-channel EEG, single-channel ECG, 4-channel EMG, single-channel GSR, and eye movement data obtained via a six-degree-of-freedom driving simulator. We categorized driving behavior into five groups: smooth driving, acceleration, deceleration, lane changing, and turning. Through extensive experiments, we confirmed that both physiological and vehicle data met the requirements. Subsequently, we developed classification models, including linear discriminant analysis (LDA), MMPNet, and EEGNet, to demonstrate the correlation between physiological data and driving behaviors. Notably, we propose a multimodal physiological dataset for analyzing driving behavior (MPDB). The MPDB dataset's scale, accuracy, and multimodality provide unprecedented opportunities for researchers in the autonomous driving field and beyond. With this dataset, we will contribute to the field of traffic psychology and behavior.

Subject(s)

Automobile Driving ; Eye Movements ; Humans

EfficientBioAI: making bioimaging AI models efficient in energy and latency.

Zhou, Yu; Cao, Jiajun; Sonneck, Justin; Banerjee, Sweta; Dörr, Stefanie; Grüneboom, Anika; Lorenz, Kristina; Zhang, Shanghang; Chen, Jianxu.

Nat Methods ; 21(3): 368-369, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38267660

Biphasic Face Photo-Sketch Synthesis via Semantic-Driven Generative Adversarial Network With Graph Representation Learning.

Qi, Xingqun; Sun, Muyi; Wang, Zijian; Liu, Jiaming; Li, Qi; Zhao, Fang; Zhang, Shanghang; Shan, Caifeng.

IEEE Trans Neural Netw Learn Syst ; PP, 2023 Dec 19.

Article in English | MEDLINE | ID: mdl-38113153

ABSTRACT

Biphasic face photo-sketch synthesis has significant practical value in wide-ranging fields such as digital entertainment and law enforcement. Previous approaches directly generate the photo-sketch in a global view; they suffer from the low quality of sketches and complex photograph variations, leading to unnatural and low-fidelity results. In this article, we propose a novel semantic-driven generative adversarial network, combined with graph representation learning, to address these issues. Considering that human faces have distinct spatial structures, we first inject class-wise semantic layouts into the generator to provide style-based spatial information for synthesized face photographs and sketches. In addition, to enhance the authenticity of details in generated faces, we construct two types of representational graphs via semantic parsing maps of the input faces, dubbed the intraclass semantic graph (IASG) and the interclass structure graph (IRSG). Specifically, the IASG models the intraclass semantic correlations of each facial semantic component, thus producing realistic facial details. To keep the generated faces structurally coordinated, the IRSG models interclass structural relations among all facial components by graph representation learning. To further enhance the perceptual quality of synthesized images, we present a biphasic interactive cycle training strategy that takes full advantage of the multilevel feature consistency between the photograph and the sketch. Extensive experiments demonstrate that our method outperforms state-of-the-art competitors on the CUHK Face Sketch (CUFS) and CUHK Face Sketch FERET (CUFSF) datasets.

A Review of Single-Source Deep Unsupervised Visual Domain Adaptation.

Zhao, Sicheng; Yue, Xiangyu; Zhang, Shanghang; Li, Bo; Zhao, Han; Wu, Bichen; Krishna, Ravi; Gonzalez, Joseph E; Sangiovanni-Vincentelli, Alberto L; Seshia, Sanjit A; Keutzer, Kurt.

IEEE Trans Neural Netw Learn Syst ; 33(2): 473-493, 2022 Feb.

Article in English | MEDLINE | ID: mdl-33095718

ABSTRACT

Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks. However, in many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data. To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain. Unfortunately, direct transfer across domains often performs poorly due to the presence of domain shift or dataset bias. Domain adaptation (DA) is a machine learning paradigm that aims to learn a model from a source domain that can perform well on a different (but related) target domain. In this article, we review the latest single-source deep unsupervised DA methods focused on visual tasks and discuss new perspectives for future research. We begin with the definitions of different DA strategies and the descriptions of existing benchmark datasets. We then summarize and compare different categories of single-source unsupervised DA methods, including discrepancy-based methods, adversarial discriminative methods, adversarial generative methods, and self-supervision-based methods. Finally, we discuss future research directions with challenges and possible solutions.

Subject(s)

Machine Learning ; Neural Networks, Computer ; Adaptation, Physiological ; Benchmarking
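
Among the method families this review lists, discrepancy-based methods penalize a statistical distance between source and target feature distributions. A minimal sketch of one such distance follows, assuming the simplest case: squared maximum mean discrepancy (MMD) with a linear kernel, which reduces to the squared distance between the two feature means. The abstract does not name a specific discrepancy measure, so this choice and the function names `mean_vector` and `linear_mmd2` are assumptions for illustration.

```python
def mean_vector(features):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(features)
    dim = len(features[0])
    return [sum(f[d] for f in features) / n for d in range(dim)]


def linear_mmd2(source, target):
    """Squared MMD with a linear kernel (biased estimate).

    Equals the squared Euclidean distance between the source and
    target feature means.
    """
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


if __name__ == "__main__":
    source_feats = [[0.0, 0.0], [2.0, 2.0]]  # hypothetical source-domain features
    target_feats = [[1.0, 1.0], [3.0, 3.0]]  # hypothetical target-domain features
    print(linear_mmd2(source_feats, target_feats))  # 2.0
```

In a discrepancy-based adaptation setup, a term like this would typically be added to the supervised source loss so that the feature extractor is encouraged to produce similar statistics on both domains.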
