Re_Trans: Combined Retrieval and Transformer Model for Source Code Summarization.
Zhang, Chunyan; Zhou, Qinglei; Qiao, Meng; Tang, Ke; Xu, Lianqiu; Liu, Fudong.
Affiliation
  • Zhang C; State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China.
  • Zhou Q; School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China.
  • Qiao M; State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China.
  • Tang K; State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China.
  • Xu L; State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China.
  • Liu F; State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China.
Entropy (Basel); 24(10), 2022 Sep 27.
Article in En | MEDLINE | ID: mdl-37420392
ABSTRACT
Source code summarization (SCS) is a natural language description of source code functionality. It can help developers understand programs and maintain software efficiently. Retrieval-based methods generate SCS by reorganizing terms selected from source code or by reusing the SCS of similar code snippets. Generative methods generate SCS via an attentional encoder-decoder architecture. A generative method can generate SCS for any code, but its accuracy is sometimes still far from expectations (due to the lack of numerous high-quality training sets). A retrieval-based method is considered to have higher accuracy, but it usually fails to generate SCS for source code in the absence of a similar candidate in the database. In order to effectively combine the advantages of retrieval-based methods and generative methods, we propose a new method: Re_Trans. For a given code, we first utilize the retrieval-based method to obtain its most semantically similar code and the corresponding SCS (S_RM). Then, we input the given code and the similar code into the trained discriminator. If the discriminator outputs one, we take S_RM as the result; otherwise, we utilize the generative model, a transformer, to generate the given code's SCS. In particular, we use AST-augmented (Abstract Syntax Tree) and code sequence-augmented information to make the source code semantic extraction more complete. Furthermore, we build a new SCS retrieval library from the public dataset. We evaluate our method on a dataset of 2.1 million Java code-comment pairs, and experimental results show improvement over the state-of-the-art (SOTA) benchmarks, which demonstrates the effectiveness and efficiency of our method.
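The retrieve-then-decide control flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the token-set Jaccard similarity stands in for the paper's semantic retrieval, and `discriminator` and `generator` are hypothetical callables standing in for the trained discriminator and the transformer model. All function names here are invented for the example.

```python
from typing import Callable, Sequence, Tuple

# A retrieval library entry: (code snippet, its summary).
Entry = Tuple[str, str]


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity (ASSUMPTION: a simple stand-in
    for the semantic similarity used by the retrieval component)."""
    ta, tb = set(a.split()), set(b.split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def retrieve(code: str, library: Sequence[Entry]) -> Entry:
    """Return the library entry whose code is most similar to `code`.
    The library must be non-empty."""
    return max(library, key=lambda entry: jaccard(code, entry[0]))


def summarize(
    code: str,
    library: Sequence[Entry],
    discriminator: Callable[[str, str], int],  # hypothetical: returns 1 or 0
    generator: Callable[[str], str],           # hypothetical: generative model
) -> str:
    """Use the retrieved summary (S_RM) if the discriminator outputs 1,
    otherwise fall back to the generative model."""
    similar_code, s_rm = retrieve(code, library)
    if discriminator(code, similar_code) == 1:
        return s_rm
    return generator(code)


# Toy stand-ins so the sketch can be run end to end.
library = [
    ("int add ( int a , int b ) { return a + b ; }", "adds two integers"),
    ("void close ( ) { stream . close ( ) ; }", "closes the stream"),
]


def toy_discriminator(a: str, b: str) -> int:
    # ASSUMPTION: a fixed similarity threshold replaces the trained model.
    return 1 if jaccard(a, b) >= 0.8 else 0


def toy_generator(code: str) -> str:
    # ASSUMPTION: placeholder for a transformer-based summarizer.
    return "<generated summary>"


seen = summarize(library[0][0], library, toy_discriminator, toy_generator)
unseen = summarize(
    "String name ( ) { return this . name ; }",
    library, toy_discriminator, toy_generator,
)
```

In this toy run, `seen` reuses the retrieved summary because the query matches a library entry exactly, while `unseen` has no sufficiently similar candidate and falls through to the generator. The 0.8 threshold is arbitrary and chosen only for the example.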
Full text: 1 Collection: 01-internacional Database: MEDLINE Language: En Journal: Entropy (Basel) Year: 2022 Type: Article Affiliation country: China
