Frontiers of Information Technology & Electronic Engineering
ISSN 2095-9184 (print), ISSN 2095-9230 (online)
2017 Vol.18 No.8 P.1082-1107
Improved binary similarity measures for software modularization
Abstract: Various binary similarity measures have been employed in clustering approaches to form homogeneous groups of similar entities in data. These similarity measures are mostly based only on the presence or absence of features. Binary similarity measures have also been explored with different clustering approaches (e.g., agglomerative hierarchical clustering) for software modularization, to make software systems understandable and manageable. Each similarity measure has its own strengths and weaknesses, which respectively improve and degrade the clustering results. We highlight the strengths of some well-known existing binary similarity measures for software modularization. Furthermore, based on these existing similarity measures, we introduce several new, improved binary similarity measures. We present correctness proofs with illustrative examples, and a series of experiments, to evaluate the effectiveness of the new binary similarity measures.
Key words: Binary similarity measure, Binary features, Combination of measures, Software modularization
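Innovation: This paper highlights the strengths of some well-known existing binary similarity measures for software modularization. In addition, based on these existing measures, it proposes several new, improved similarity measures.
Method: First, the strengths of some well-known existing binary similarity measures for software modularization are described. Next, several new, improved similarity measures are proposed. Concrete examples show that the new measures integrate the strengths of the existing binary similarity measures JC, JNM, and RR. Finally, experiments compare the results of the new measures with those of the existing ones to verify the effectiveness of the proposed measures.
Conclusion: Experimental results show that, compared with existing similarity measures, the proposed new binary similarity measures yield more reliable results. The new measures reduce the number of arbitrary decisions and increase the number of clusters during the clustering process. Although the new measures are based only on the binary feature vector representation of the data, they can be used to test software systems written in any programming language.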
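To make the existing measures named above concrete, the following is a minimal Python sketch of how JC, JNM, and RR might be computed from two binary feature vectors. It assumes that the abbreviations stand for the Jaccard, Jaccard-NM, and Russell and Rao measures, and it uses their commonly cited formulas in terms of the counts a (features present in both entities), b and c (features present in only one), and d (features absent from both). Neither the formulas nor the function names come from this page, and the new measures proposed in the paper are not reproduced here.

```python
from typing import Sequence, Tuple


def contingency(x: Sequence[int], y: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count feature agreements between two equal-length binary vectors.

    Returns (a, b, c, d): a = present in both, b = only in x,
    c = only in y, d = absent from both.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)
    b = sum(1 for xi, yi in zip(x, y) if xi and not yi)
    c = sum(1 for xi, yi in zip(x, y) if not xi and yi)
    d = sum(1 for xi, yi in zip(x, y) if not xi and not yi)
    return a, b, c, d


def jaccard(x: Sequence[int], y: Sequence[int]) -> float:
    """JC (assumed form): a / (a + b + c); ignores joint absences."""
    a, b, c, _ = contingency(x, y)
    denom = a + b + c
    return a / denom if denom else 0.0


def russell_rao(x: Sequence[int], y: Sequence[int]) -> float:
    """RR (assumed form): a / (a + b + c + d); joint absences lower the score."""
    a, b, c, d = contingency(x, y)
    n = a + b + c + d
    return a / n if n else 0.0


def jaccard_nm(x: Sequence[int], y: Sequence[int]) -> float:
    """JNM (assumed form): a / (2 * (a + b + c) + d)."""
    a, b, c, d = contingency(x, y)
    denom = 2 * (a + b + c) + d
    return a / denom if denom else 0.0


# Hypothetical entities described by five binary features each.
e1 = [1, 1, 0, 0, 1]
e2 = [1, 0, 0, 1, 1]
print(jaccard(e1, e2), russell_rao(e1, e2), jaccard_nm(e1, e2))
```

For these two hypothetical vectors the counts are a = 2, b = 1, c = 1, d = 1, so the three measures assign different scores to the same pair of entities. Differences of this kind are what determine how often ties, and hence arbitrary decisions, arise during clustering.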
DOI: 10.1631/FITEE.1500373
CLC number: TP311
On-line Access: 2017-09-08
Received: 2015-10-30
Revision Accepted: 2016-04-12
Crosschecked: 2017-08-04