Results 1 - 3 of 3
1.
Neural Netw ; 176: 106380, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38754289

ABSTRACT

Most trackers formulate visual tracking as common classification and regression (i.e., bounding box regression) tasks. Correlation features computed through depth-wise convolution or channel-wise multiplication operations are fed into both the classification and regression branches for inference. However, this matching computation with a linear correlation method tends to lose semantic features and converge to only a local optimum. Moreover, these trackers use an unreliable ranking based on the classification score and the intersection over union (IoU) loss for regression training, which degrades tracking performance. In this paper, we introduce a deformable transformer model, which effectively computes the correlation features of the training and search sets. A new loss, called the quality-aware focal loss (QAFL), is used to train the classification network; it efficiently alleviates the inconsistency between the classification and localization quality predictions. We use a new regression loss, called α-GIoU, to train the regression network, and it effectively improves localization accuracy. To further improve the tracker's robustness, the candidate object location is predicted by combining online learning scores, obtained with a transformer-assisted framework, and classification scores. Extensive experiments on six test datasets demonstrate the effectiveness of our method. In particular, the proposed method attains a success score of 71.7% on the OTB-2015 dataset and an AUC score of 67.3% on the NFS30 dataset.
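To make the regression objective above concrete, the sketch below implements an α-GIoU-style bounding-box loss in PyTorch, assuming the common Alpha-IoU generalisation of GIoU (loss = 1 − IoU^α + penalty^α, where the penalty is the fraction of the smallest enclosing box not covered by the union). The function name, the default α = 3, and the box format are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of an alpha-GIoU-style loss for axis-aligned boxes (x1, y1, x2, y2).
# Assumes the Alpha-IoU generalisation: loss = 1 - IoU^alpha + (|C \ (A u B)| / |C|)^alpha,
# where C is the smallest enclosing box. Names and the default alpha are illustrative.
import torch

def alpha_giou_loss(pred: torch.Tensor, target: torch.Tensor, alpha: float = 3.0) -> torch.Tensor:
    """pred, target: (N, 4) boxes as (x1, y1, x2, y2); returns a per-box loss of shape (N,)."""
    # Intersection area
    lt = torch.max(pred[:, :2], target[:, :2])   # top-left corner of the overlap
    rb = torch.min(pred[:, 2:], target[:, 2:])   # bottom-right corner of the overlap
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]

    # Union area
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter

    # Smallest enclosing box C
    lt_c = torch.min(pred[:, :2], target[:, :2])
    rb_c = torch.max(pred[:, 2:], target[:, 2:])
    wh_c = (rb_c - lt_c).clamp(min=0)
    area_c = wh_c[:, 0] * wh_c[:, 1]

    eps = 1e-7
    iou = inter / (union + eps)
    penalty = (area_c - union) / (area_c + eps)

    # Alpha-power generalisation of the GIoU loss
    return 1.0 - iou.pow(alpha) + penalty.pow(alpha)
```

Under this formulation, alpha_giou_loss(pred_boxes, gt_boxes).mean() could serve as the regression term during training, and setting alpha = 1 recovers the standard GIoU loss.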


Subjects
Neural Networks, Computer; Humans; Algorithms; Eye-Tracking Technology
2.
Neural Netw ; 174: 106218, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38518709

ABSTRACT

In image watermark removal, popular methods depend on given reference watermark-free images in a supervised way to remove watermarks. However, reference watermark-free images are difficult to obtain in the real world. At the same time, images captured by digital devices often suffer from noise. To resolve these issues, in this paper, we present a self-supervised network for image denoising and watermark removal (SSNet). SSNet uses a parallel network, trained in a self-supervised manner, to remove noise and watermarks. Specifically, each sub-network contains two sub-blocks. The upper sub-network uses the first sub-block to remove noise, following the noise-to-noise principle. The second sub-block in the upper sub-network then removes watermarks according to the distributions of the watermarks. To prevent the loss of important information, the lower sub-network simultaneously learns noise and watermarks in a self-supervised manner. Moreover, the two sub-networks interact via attention to extract more complementary salient information. The proposed method does not depend on paired images to learn a blind denoising and watermark removal model, which is very meaningful for real applications. It is also more effective than popular image watermark removal methods on public datasets. Code can be found at https://github.com/hellloxiaotian/SSNet.
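As a rough structural sketch only (not the authors' SSNet implementation, which is available at the repository above), the PyTorch snippet below illustrates the parallel design described in the abstract: an upper branch that first denoises and then removes the watermark, a lower branch that handles both degradations jointly, and a simple attention gate fusing the two outputs. The layer counts, channel sizes, and attention module are assumptions made for illustration.

```python
# Rough structural sketch of the parallel two-branch design described above;
# NOT the authors' SSNet code. Block depths, channel widths, and the attention
# gate are illustrative assumptions.
import torch
import torch.nn as nn

def conv_block(channels: int = 64, depth: int = 4) -> nn.Sequential:
    """A plain Conv-ReLU stack standing in for one sub-block (3-channel in/out)."""
    layers = [nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True)]
    for _ in range(depth - 2):
        layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True)]
    layers += [nn.Conv2d(channels, 3, 3, padding=1)]
    return nn.Sequential(*layers)

class ParallelRestorer(nn.Module):
    """Upper branch: denoise then remove the watermark; lower branch: learn both jointly.
    A channel-attention gate fuses the two branch outputs."""
    def __init__(self):
        super().__init__()
        self.denoise = conv_block()      # upper branch, sub-block 1 (noise)
        self.dewatermark = conv_block()  # upper branch, sub-block 2 (watermark)
        self.joint = conv_block()        # lower branch, noise + watermark together
        self.attn = nn.Sequential(       # simple channel attention over both outputs
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(6, 6, 1),
            nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(6, 3, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        upper = self.dewatermark(self.denoise(x))   # sequential removal
        lower = self.joint(x)                        # joint removal
        both = torch.cat([upper, lower], dim=1)
        return self.fuse(both * self.attn(both))     # attention-weighted fusion
```

In a noise-to-noise style of training, such a model would be optimised against independently degraded observations of the same scene rather than clean reference images, which is what allows it to avoid paired watermark-free supervision.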
