Face anti-spoofing with cross-stage relation enhancement and spoof material perception.
Li, Daiyuan; Chen, Guo; Wu, Xixian; Yu, Zitong; Tan, Mingkui.
Affiliation
  • Li D; South China University of Technology, Guangzhou, 510006, Guangdong, China; Pazhou Laboratory, Guangzhou, 510000, Guangdong, China; Key Laboratory of Big Data and Intelligent Robot, Ministry of Education, Guangzhou, 510000, Guangdong, China. Electronic address: selidaiyuan@mail.scut.edu.cn.
  • Chen G; South China University of Technology, Guangzhou, 510006, Guangdong, China; CSSC Systems Engineering Research Institute, Beijing, 100000, Beijing, China. Electronic address: qwead134@gmail.com.
  • Wu X; HuNan Gmax Intelligent Technology, Changsha, 410000, Hunan, China. Electronic address: wuxixian@gmax-ai.com.
  • Yu Z; Great Bay University, Dongguan, 523000, Guangdong, China. Electronic address: zitong.yu@ieee.org.
  • Tan M; South China University of Technology, Guangzhou, 510006, Guangdong, China. Electronic address: mingkuitan@scut.edu.cn.
Neural Netw ; 175: 106275, 2024 Jul.
Article in En | MEDLINE | ID: mdl-38653078
ABSTRACT
Face Anti-Spoofing (FAS) seeks to protect face recognition systems from spoofing attacks and is applied extensively in scenarios such as access control, electronic payment, and security surveillance. FAS requires the integration of local details and global semantic information. Existing CNN-based methods rely on small-stride or image-patch-based feature extraction structures, which struggle to capture spatial and cross-layer feature correlations effectively. Meanwhile, Transformer-based methods have limitations in extracting discriminative detailed features. To address these issues, we introduce a multi-stage CNN-Transformer-based framework that extracts local features through convolutional layers and long-distance feature relationships via self-attention. Building on this, we propose a cross-attention multi-stage feature fusion, which uses semantically rich high-stage features to query task-relevant features in low-stage features for further cross-stage feature fusion. To enhance the discrimination of local features for subtle differences, we design pixel-wise material classification supervision and add an auxiliary branch in the intermediate layers of the model. Moreover, to address the limitations of a single acquisition environment and the scarcity of acquisition devices in existing Near-Infrared datasets, we create a large-scale Near-Infrared Face Anti-Spoofing dataset with 380k pictures of 1040 identities. The proposed method achieves state-of-the-art performance on OULU-NPU and on our proposed Near-Infrared dataset with just 1.3 GFLOPs and 3.2M parameters, which demonstrates its effectiveness.
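The cross-stage fusion idea described in the abstract, where high-stage features act as queries over low-stage features, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`softmax`, `cross_attention_fuse`), the omission of learned query/key/value projections, and the residual sum used for fusion are all assumptions made for illustration, and the toy token values are invented.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention_fuse(high_tokens, low_tokens):
    """Each high-stage token queries all low-stage tokens (hypothetical helper).

    Assumptions: identity projections instead of learned Q/K/V weights,
    and a residual sum as the fusion step.
    """
    dim = len(high_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for q in high_tokens:
        # Scaled dot-product similarity between the query and every low-stage token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in low_tokens]
        weights = softmax(scores)
        # Attention-weighted average of the low-stage tokens.
        attended = [sum(w * v[j] for w, v in zip(weights, low_tokens))
                    for j in range(dim)]
        # Residual fusion: keep the high-stage token and add the retrieved detail.
        fused.append([qi + ai for qi, ai in zip(q, attended)])
    return fused

# Toy example: 2 high-stage tokens querying 3 low-stage tokens of dimension 2.
high = [[1.0, 0.0], [0.0, 1.0]]
low = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention_fuse(high, low)
```

The output has one fused vector per high-stage token, so the fused map keeps the high-stage resolution while drawing detail from the low stage. A real model would add learned projections, multiple heads, and normalization.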

Full text: 1 Database: MEDLINE Main subject: Neural Networks, Computer Language: En Publication year: 2024 Document type: Article
