Semantic aware-based instruction embedding for binary code similarity detection.
Jia, Yuhao; Yu, Zhicheng; Hong, Zhen.
Affiliation
  • Jia Y; College of Information Engineering, Zhejiang University of Technology, Hangzhou, Zhejiang, China.
  • Yu Z; College of Information Engineering, Zhejiang University of Technology, Hangzhou, Zhejiang, China.
  • Hong Z; College of Information Engineering, Zhejiang University of Technology, Hangzhou, Zhejiang, China.
PLoS One ; 19(6): e0305299, 2024.
Article in English | MEDLINE | ID: mdl-38861533
ABSTRACT
Binary code similarity detection plays a crucial role in binary security applications such as vulnerability detection and malicious software analysis. However, existing methods produce binary embedding representations that are poorly differentiated across compilation environments and lack dynamic high-level semantics. Moreover, current approaches often neglect multi-level semantic feature extraction and therefore fail to acquire precise semantic information about the binary code. To address these limitations, this paper introduces a novel detection solution called BinBcla. The method employs an enhanced pre-training model to generate instruction embeddings with dynamic semantics for binary functions. A multi-feature fusion technique then extracts local semantic information and long-distance global features from the code, and self-attention is used to capture the code's structural information. Finally, an improved cosine similarity method learns the relationships among all elements of the distance vectors, thereby enhancing the model's robustness to new sample functions. Experiments are conducted across different architectures, compilers, and optimization levels. The results indicate that BinBcla achieves higher accuracy, precision, and F1 score than existing methods.
Subjects

Full text: 1 | Collections: 01-internacional | Database: MEDLINE | Main subject: Semantics | Language: En | Journal: PLoS One | Journal subject: SCIENCE / MEDICINE | Year of publication: 2024 | Document type: Article | Country of affiliation: China | Country of publication: United States
