On-line Access: 2022-05-24
Received: 2021-08-07
Revision Accepted: 2022-03-24
Yichao SHAO, Zhiqiu HUANG, Weiwei LI, Yaoshen YU. Fast code recommendation via approximate sub-tree matching[J]. Frontiers of Information Technology & Electronic Engineering, 2022. https://doi.org/10.1631/FITEE.2100379
@article{shao2022fast,
title="Fast code recommendation via approximate sub-tree matching",
author="Yichao SHAO and Zhiqiu HUANG and Weiwei LI and Yaoshen YU",
journal="Frontiers of Information Technology \& Electronic Engineering",
year="2022",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.2100379"
}
%0 Journal Article
%T Fast code recommendation via approximate sub-tree matching
%A Yichao SHAO
%A Zhiqiu HUANG
%A Weiwei LI
%A Yaoshen YU
%J Frontiers of Information Technology & Electronic Engineering
%@ 2095-9184
%D 2022
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2100379
TY  - JOUR
T1  - Fast code recommendation via approximate sub-tree matching
A1  - Yichao SHAO
A1  - Zhiqiu HUANG
A1  - Weiwei LI
A1  - Yaoshen YU
JO  - Frontiers of Information Technology & Electronic Engineering
SN  - 2095-9184
Y1  - 2022
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2100379
ER  -
Abstract: Software developers often write code whose functionality is similar to that of existing code segments. A code recommendation tool that helps developers reuse these code fragments can significantly improve their efficiency. Several methods have been proposed in recent years. Some use sequence matching algorithms to find related recommendations; most of these are time-consuming and can leverage only low-level textual information from the code. Others extract features from the code and compute similarity over numerical feature vectors. However, similarity between feature vectors is often not equivalent to similarity between the original code fragments, because structural information is lost when abstract syntax trees are transformed into vectors. We propose a method based on approximate sub-tree matching to solve this problem. Unlike existing tree-based approaches that match feature vectors, it retains the tree structure of the query code during matching to find the code fragments that best match the current query. It uses a fast approximate sub-tree matching algorithm that transforms the sub-tree matching problem into matching a tree against a list. In this way, structural information can be used in code recommendation tasks with strict time requirements. To evaluate the effectiveness of our method, we have constructed several real-world code databases covering different languages and granularities. The results show that our method outperforms the two compared methods in terms of recall on all the datasets and can be applied to large datasets.
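
The abstract does not give the algorithm itself, so the following is only an illustrative sketch of what matching a query tree against a flattened candidate might look like; it is not the authors' method. It assumes Python source, uses the standard `ast` module, flattens each candidate into a pre-order list of (depth, node type) pairs, and walks the query tree greedily over that list. The names `flatten`, `subtree_end`, `match`, `similarity`, and `recommend`, as well as the greedy first-match strategy and the matched-node ratio used as a score, are assumptions made for illustration.

```python
import ast


def flatten(tree):
    """Return the pre-order list of (depth, node-type name) for an AST."""
    flat = []

    def visit(node, depth):
        flat.append((depth, type(node).__name__))
        for child in ast.iter_child_nodes(node):
            visit(child, depth + 1)

    visit(tree, 0)
    return flat


def subtree_end(flat, i, hi):
    """Index one past the last descendant of flat[i], capped at hi."""
    depth = flat[i][0]
    j = i + 1
    while j < hi and flat[j][0] > depth:
        j += 1
    return j


def match(qnode, flat, lo, hi):
    """Greedily match the query subtree rooted at qnode inside flat[lo:hi].

    Returns (number of query nodes matched, position to resume from).
    The query keeps its tree shape; only the candidate is a list.
    """
    label = type(qnode).__name__
    for i in range(lo, hi):
        if flat[i][1] == label:
            end = subtree_end(flat, i, hi)
            matched, pos = 1, i + 1
            # Children must be found, in order, among the descendants of flat[i].
            for child in ast.iter_child_nodes(qnode):
                got, pos = match(child, flat, pos, end)
                matched += got
            return matched, end
    return 0, lo  # label not found: skip this query subtree


def similarity(query_src, candidate_src):
    """Fraction of query AST nodes matched in the candidate (hypothetical score)."""
    query = ast.parse(query_src)
    flat = flatten(ast.parse(candidate_src))
    matched, _ = match(query, flat, 0, len(flat))
    return matched / sum(1 for _ in ast.walk(query))


def recommend(query_src, candidates):
    """Rank candidate snippets by descending similarity to the query."""
    return sorted(candidates, key=lambda c: similarity(query_src, c), reverse=True)
```

Calling `recommend("for x in xs:\n    total += x", corpus)` on a small list of snippets ranks those containing a loop with an augmented assignment ahead of unrelated ones. Because each candidate is flattened once and then scanned linearly, the candidate lists could be precomputed for a large database, which is the kind of time saving the abstract alludes to.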