Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

An oversampling approach for mining program specifications

Abstract: Automatic protocol mining is a promising approach for inferring accurate and complete API protocols. However, as with any data-mining technique, it requires sufficient training data, here object usage scenarios (OUSs). Existing approaches address this by analyzing more programs, which may cause significant runtime overhead. In this paper, we propose an inheritance-based oversampling approach for OUSs. Our technique is based on the inheritance relationship in object-oriented programs. Given an object-oriented program p, the number of OUSs that can be collected from a run of p is generally no greater than the number of objects used during the run. With our technique, up to n times more OUSs can be collected, where n is the average number of super-classes of all general OUSs. To investigate the effect of our technique, we implement it in our previous prototype tool, ISpecMiner, and use the tool to mine protocols from several real-world programs. Experimental results show that our technique can collect 1.95 times more OUSs than general approaches. Additionally, accurate and complete API protocols are more likely to be obtained. Furthermore, our technique can mine API protocols for classes that are never used in the programs; such protocols are valuable for validating software architectures and program documentation, and for program understanding. Although our technique introduces some runtime overhead, the overhead is small and acceptable.

Key words: Object usage scenario, API protocol mining, Program temporal specification mining, Oversampling
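
The abstract does not spell out how an OUS is oversampled, so the following is only a minimal sketch of one plausible reading: the call sequence observed on an object is projected onto each super-class of the object's class, keeping only the calls that the super-class can see. The class model, the function names (`superclasses`, `visible_methods`, `oversample`), and the projection rule itself are assumptions made for illustration; they are not taken from the paper or from ISpecMiner.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical class model (an assumption, not from the paper):
# class name -> (direct super-class or None, methods declared by that class).
ClassModel = Dict[str, Tuple[Optional[str], Set[str]]]

CLASS_MODEL: ClassModel = {
    "Reader": (None, {"read", "close"}),
    "BufferedReader": ("Reader", {"readLine"}),
}


def superclasses(model: ClassModel, cls: str) -> List[str]:
    """Return the super-classes of cls, nearest ancestor first."""
    chain: List[str] = []
    parent = model[cls][0]
    while parent is not None:
        chain.append(parent)
        parent = model[parent][0]
    return chain


def visible_methods(model: ClassModel, cls: str) -> Set[str]:
    """Methods declared by cls itself or inherited from its ancestors."""
    methods = set(model[cls][1])
    for sup in superclasses(model, cls):
        methods |= model[sup][1]
    return methods


def oversample(model: ClassModel, cls: str, trace: List[str]) -> Dict[str, List[str]]:
    """Derive one extra OUS per super-class from a single observed OUS.

    Assumed rule: keep, in order, only the calls visible in the super-class.
    Empty projections are dropped because they carry no usage information.
    """
    scenarios: Dict[str, List[str]] = {cls: list(trace)}
    for sup in superclasses(model, cls):
        allowed = visible_methods(model, sup)
        projected = [call for call in trace if call in allowed]
        if projected:
            scenarios[sup] = projected
    return scenarios


if __name__ == "__main__":
    observed = ["read", "readLine", "close"]  # one OUS on a BufferedReader object
    for owner, ous in oversample(CLASS_MODEL, "BufferedReader", observed).items():
        print(owner, ous)
```

Under this assumed rule, a single observed scenario yields one scenario for the concrete class plus at most one per super-class, which is consistent with the bound of up to n times more OUSs stated in the abstract. It also suggests how a protocol could be mined for a class that is never instantiated directly, since that class can still receive projected scenarios from its subclasses.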

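Summary (translated from Chinese): An oversampling approach for mining program specifications

Summary: Automatic protocol mining is an effective way to obtain accurate and complete API usage protocols. However, as with other data-mining applications, it needs sufficient training data (i.e., object usage scenarios) as input. Although more object usage scenarios can be extracted by increasing the size of the programs analyzed, this leads to high runtime overhead for program analysis. This paper proposes an inheritance-based oversampling method for object usage scenarios in object-oriented programs. Given an object-oriented program p, the number of object usage scenarios obtainable by executing p generally does not exceed the number of objects instantiated at run time. The proposed method can obtain up to n times that number, where n is the average number of super-classes of the general object usage scenarios in p. To validate its effect, the method was integrated into ISpecMiner, our earlier prototype tool for dynamic mining of API usage protocols, and experiments were carried out. The extended ISpecMiner was used to mine API usage protocols from several real-world programs. The results show that the number of object usage scenarios obtained with the proposed method is 1.95 times that of the general approach. Moreover, comparative experiments indicate that the method helps mine more accurate and complete API usage protocols. Notably, the method also applies to classes that cannot be instantiated and mines API usage protocols for them. Such protocols are important for validating software architecture and program documentation, and for program understanding. Although the method adds some runtime overhead, the overhead remains within an acceptable range.

Key words: Object usage scenario; API protocol mining; Program temporal constraint mining; Oversampling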







DOI: 10.1631/FITEE.1601783

CLC number: TP311


On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2018-06-12

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE