Frontiers of Information Technology & Electronic Engineering
ISSN 2095-9184 (print), ISSN 2095-9230 (online)
2022 Vol.23 No.5 P.715-731
A software defect prediction method with metric compensation based on feature selection and transfer learning
Abstract: Cross-project software defect prediction addresses the shortage of training data in traditional defect prediction and overcomes the challenge of applying models learned from multiple different source projects to a target project. At the same time, two new problems emerge: (1) too many irrelevant and redundant features in the training data reduce training efficiency and, in turn, the prediction accuracy of the model; (2) the distribution of metric values varies greatly from project to project because of the development environment and other factors, which lowers prediction accuracy when a model is applied across projects. In this paper, we propose a software defect prediction method with metric compensation based on feature selection and transfer learning. In the proposed method, the Pearson feature selection method is introduced to address data redundancy, and a metric-compensation-based transfer learning technique is used to address the large differences in data distribution between the source project and the target project. The experimental results show that the model constructed with this method achieves better results in terms of the area under the receiver operating characteristic curve (AUC) and the F1-measure.
Key words: Defect prediction; Feature selection; Transfer learning; Metric compensation
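The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pearson_select` and `compensate`, the choice to rank features by their absolute Pearson correlation with the defect label, and the choice to rescale each source metric by the ratio of target mean to source mean are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def pearson_select(X, y, k):
    """Return the indices of the k features with the largest |Pearson r| to y.

    Assumption: features are ranked against the defect label; the paper's
    actual selection criterion may differ.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; treat their correlation as 0.
    r = np.divide(Xc.T @ yc, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    return np.sort(np.argsort(-np.abs(r))[:k])


def compensate(X_source, X_target):
    """Rescale each source metric so that its mean matches the target mean.

    Assumption: compensation is a per-metric ratio of means applied to the
    source data; the direction and form used in the paper may differ.
    """
    src_mean = X_source.mean(axis=0)
    tgt_mean = X_target.mean(axis=0)
    ratio = np.divide(tgt_mean, src_mean,
                      out=np.ones_like(src_mean), where=src_mean != 0)
    return X_source * ratio


# Synthetic data standing in for two projects with shifted metric scales.
rng = np.random.default_rng(0)
X_src = rng.normal(loc=[10.0, 50.0, 5.0], scale=1.0, size=(200, 3))
y_src = (X_src[:, 1] > 50.0).astype(int)  # label driven by feature 1 only
X_tgt = rng.normal(loc=[20.0, 25.0, 5.0], scale=1.0, size=(100, 3))

selected = pearson_select(X_src, y_src, k=1)
X_src_adj = compensate(X_src, X_tgt)
```

In a full pipeline, a classifier would then be trained on the compensated, selected source metrics and evaluated on the target project with AUC and F1-measure.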
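1School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China
2Jiangsu Key Laboratory of Security Technology for Industrial Cyberspace, Jiangsu University, Zhenjiang 212013, China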
DOI: 10.1631/FITEE.2100468
CLC number: TP311.5
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2022-02-05