Your browser doesn't support javascript.
loading
Neighbor-Guided Pseudo-Label Generation and Refinement for Single-Frame Supervised Temporal Action Localization.
IEEE Trans Image Process ; 33: 2419-2430, 2024.
Article in En | MEDLINE | ID: mdl-38517712
ABSTRACT
Due to the sparse single-frame annotations, current Single-Frame Temporal Action Localization (SF-TAL) methods generally employ threshold-based pseudo-label generation strategies. However, these approaches suffer from inefficient data utilization, as only parts of unlabeled frames with confidence scores surpassing a predefined threshold are selected for training. Moreover, the variability of single-frame annotations and unreliable model predictions introduce pseudo-label noise. To address these challenges, we propose two strategies by using the relationship of the video segments with their neighbors' 1) temporal neighbor-guided soft pseudo-label generation (TNPG); and 2) semantic neighbor-guided pseudo-label refinement (SNPR). TNPG utilizes a local-global self-attention mechanism in a transformer encoder to capture temporal neighbor information while focusing on the whole video. Then the generated self-attention map is multiplied by the network predictions to propagate information between labeled and unlabeled frames, and produce soft pseudo-label for all segments. Despite this, label noise persists due to unreliable model predictions. To mitigate this, SNPR refines pseudo-labels based on the assumption that predictions should resemble their semantic nearest neighbors'. Specifically, we search for semantic nearest neighbors of each video segment by cosine similarity in the feature space. Then the refined soft pseudo-labels can be obtained by a weight combination of the original pseudo-label and the semantic nearest neighbors'. Finally, the model can be trained with the refined pseudo-labels, and the performance has been greatly improved. Comprehensive experimental results on different benchmarks show that we achieve state-of-the-art performances on THUMOS14, ActivityNet1.2, and ActivityNet1.3 datasets.

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: IEEE Trans Image Process Journal subject: INFORMATICA MEDICA Year: 2024 Document type: Article

Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: IEEE Trans Image Process Journal subject: INFORMATICA MEDICA Year: 2024 Document type: Article