2.
Sci Rep ; 13(1): 18369, 2023 Oct 26.
Article in English | MEDLINE | ID: mdl-37884556

ABSTRACT

Existing knowledge distillation (KD) methods are mainly based on features, logits, or attention, where features and logits represent the results of reasoning at different stages of a convolutional neural network, and attention maps symbolize the reasoning process. Because the two are temporally continuous stages of the same inference, transferring only one of them to the student network leads to unsatisfactory results. We study knowledge transfer between the teacher and student networks at different levels, revealing the importance of simultaneously transferring knowledge about both the reasoning process and the reasoning results, which offers a new perspective on KD. On this basis, we propose a knowledge distillation method based on attention and feature transfer (AFT-KD). First, we use transformation structures to convert intermediate features into attention and feature blocks (AFBs) that contain both inference-process and inference-outcome information, and we force the student to learn the knowledge in the AFBs. To save computation during learning, we use block operations to align the teacher and student networks. In addition, to balance the attenuation rates of the different losses, we design an adaptive loss function based on the loss optimization rate. Experiments show that AFT-KD achieves state-of-the-art performance on multiple benchmarks.
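
The pairing at the heart of AFT-KD, transferring the reasoning process (an attention map) and the reasoning result (the feature itself) from the same stage together, can be sketched briefly. The following PyTorch sketch is an approximation under stated assumptions, not the paper's AFB: the attention map is taken as the channel-wise mean of squared activations (standard attention transfer), a hypothetical 1x1 adapter aligns channel counts, and fixed alpha/beta weights stand in for the adaptive loss based on the loss optimization rate.

import torch.nn as nn
import torch.nn.functional as F

def attention_map(feat):
    # Collapse channels into a normalized spatial attention map, shape (N, H*W).
    return F.normalize(feat.pow(2).mean(dim=1).flatten(1), dim=1)

class AFTLoss(nn.Module):
    # Illustrative attention-plus-feature distillation loss, not the paper's exact AFB.
    def __init__(self, s_channels, t_channels, alpha=1.0, beta=1.0):
        super().__init__()
        self.adapter = nn.Conv2d(s_channels, t_channels, kernel_size=1)  # align channels
        self.alpha, self.beta = alpha, beta

    def forward(self, s_feat, t_feat):
        t_feat = t_feat.detach()                 # teacher is not trained
        s_feat = self.adapter(s_feat)
        if s_feat.shape[-2:] != t_feat.shape[-2:]:
            s_feat = F.interpolate(s_feat, size=t_feat.shape[-2:])
        # Process knowledge: match attention maps. Result knowledge: match features.
        loss_attn = F.mse_loss(attention_map(s_feat), attention_map(t_feat))
        loss_feat = F.mse_loss(s_feat, t_feat)
        return self.alpha * loss_attn + self.beta * loss_feat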

3.
PLoS One ; 18(10): e0292517, 2023.
Article in English | MEDLINE | ID: mdl-37812605

ABSTRACT

Previous studies have shown that deep models are often over-parameterized, and this parameter redundancy makes deep compression possible. The redundancy of model weights typically manifests as low rank and sparsity. Ignoring either characteristic, or the different ways the two are distributed across the model, leads to low accuracy and a low compression rate. To make full use of the difference between low rank and sparsity, we propose a unified framework combining low-rank tensor decomposition and structured pruning: a hybrid model compression method based on sensitivity grouping (HMC). This framework unifies the existing additive hybrid compression method (AHC) and our proposed non-additive hybrid compression method (NaHC) into one model. The latter groups the network layers according to how sensitive each convolutional layer is to the different compression methods, integrating the low-rank and sparse characteristics of the model better than the former. Experiments show that, when compressing the ResNet family of models, our approach achieves a better trade-off between test accuracy and compression ratio than other recent compression methods that use a single strategy or additive hybrid compression.


Subject(s)
Data Compression, Physical Phenomena
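
The sensitivity-grouping step that distinguishes NaHC can be sketched as a simple probe-and-assign loop: temporarily compress each convolutional layer with each strategy, measure the accuracy drop, and group the layer with the strategy it tolerates best. This Python sketch is illustrative only; evaluate, compress_low_rank, and compress_prune are hypothetical callables, not the paper's implementation.

import copy

def group_layers(model, layer_names, evaluate, compress_low_rank, compress_prune):
    # evaluate(model) -> validation accuracy; each compress_* callable applies
    # one compression method to the single named layer and returns the model.
    baseline = evaluate(model)
    plan = {}
    for name in layer_names:
        drop_lr = baseline - evaluate(compress_low_rank(copy.deepcopy(model), name))
        drop_sp = baseline - evaluate(compress_prune(copy.deepcopy(model), name))
        # Group each layer with the strategy it is least sensitive to.
        plan[name] = "low_rank" if drop_lr <= drop_sp else "prune"
    return plan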
4.
Sensors (Basel) ; 23(13)2023 Jun 27.
Article in English | MEDLINE | ID: mdl-37447828

ABSTRACT

Image dehazing based on convolutional neural networks has achieved significant success; however, problems remain, such as incomplete dehazing, color deviation, and loss of detail. To address these issues, in this study we propose a multi-scale dehazing network with dark channel priors (MSDN-DCP). First, we introduce a feature extraction module (FEM), which effectively enhances feature extraction and correlation through a two-branch residual structure. Second, a feature fusion module (FFM) is devised to combine multi-scale features adaptively at different stages. Finally, we propose a dark channel refinement module (DCRM) that applies the dark channel prior to guide the network in learning the features of hazy regions, ultimately refining the feature maps that the network extracts. Experiments on the Haze4K dataset achieve a peak signal-to-noise ratio of 29.57 dB and a structural similarity of 98.1%, showing that MSDN-DCP achieves superior dehazing compared with other algorithms in terms of both objective metrics and visual perception.


Subject(s)
Algorithms, Benchmarking, Learning, Neural Networks, Computer, Signal-To-Noise Ratio
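
The dark channel prior that the DCRM draws on is the classical observation that, in haze-free outdoor images, most local patches contain at least one pixel that is dark in some color channel, so the per-patch channel minimum rises with haze density. A minimal PyTorch sketch of that prior follows; it computes the guidance signal only, not the DCRM module itself, and the 15-pixel patch size is an assumption carried over from the classical formulation.

import torch.nn.functional as F

def dark_channel(img, patch=15):
    # img: (N, 3, H, W) tensor with values in [0, 1].
    min_rgb = img.min(dim=1, keepdim=True).values  # per-pixel minimum over RGB
    # Local minimum filter = negated max-pooling of the negated map;
    # max_pool2d pads with -inf, so borders use only valid pixels.
    return -F.max_pool2d(-min_rgb, kernel_size=patch, stride=1, padding=patch // 2)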