Frontiers of Information Technology & Electronic Engineering
ISSN 2095-9184 (print), ISSN 2095-9230 (online)
2015 Vol.16 No.12 P.1059-1068
An ensemble method for data stream classification in the presence of concept drift
Abstract: Data stream management and processing is a recent area of interest in computer science. By ‘data stream’, we mean continuously and rapidly generated packages of data. Data streams differ from standard types of data in their immense volume, high production rate, limited processing time, and concept drift. A key problem for data streams is the classification of input data. This paper proposes a novel ensemble classifier. The classifier applies two weighting functions to its base classifiers, chosen according to the data input conditions. In addition, a new method is used to detect drift, which emphasizes the precision of the algorithm. The method also removes different numbers of base classifiers depending on their quality, and it applies a weighting mechanism to the base classifiers at the decision-making stage. This improves adaptability when drifts occur and leads to more efficient classifiers. The proposed method is tested on a set of standard data sets, and the results confirm higher accuracy than available ensemble classifiers and single classifiers. In some cases the proposed classifier is also faster and needs less storage space.
Key words: Data stream, Classification, Ensemble classifiers, Concept drift
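Innovation: Building on data stream classifiers, a method is proposed that incorporates concept drift detection, base classifier removal, and a dynamic weighting mechanism.
Methods: (1) Two weighting functions are applied to the base classifiers under different data input conditions; (2) the Kappa coefficient is used to determine concept drift, improving the precision of the algorithm; (3) different numbers of base classifiers are removed based on their quality; (4) a weighting mechanism is applied to the base classifiers at the decision-making stage, improving the algorithm's adaptability to drift and the efficiency of the classifier.
Conclusions: Tested on standard data sets, the proposed method achieves higher accuracy than existing ensemble classifiers and single classifiers; in some cases it also saves running time and memory.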
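The summary above names two building blocks: a Kappa coefficient used to determine concept drift, and a weighted decision over the base classifiers. The sketch below illustrates both in their generic textbook form. It is not the paper's algorithm: the summary does not give the paper's weighting functions, drift rule, or removal policy, so the function names, the `drop_threshold` parameter, and the "flag drift when Kappa falls sharply" rule are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa between true labels and predictions on one data chunk."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("chunks must be non-empty and of equal length")
    # Observed agreement: fraction of instances classified correctly.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: computed from the label and prediction marginals.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Only one class appears in both labels and predictions.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def drift_suspected(prev_kappa, curr_kappa, drop_threshold=0.2):
    """Hypothetical rule: suspect drift when kappa drops by more than the threshold."""
    return (prev_kappa - curr_kappa) > drop_threshold


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight among base classifiers."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)
```

In a chunk-based stream setting, one would typically compute `cohen_kappa` for the ensemble on each incoming chunk, compare it with the previous chunk's value through `drift_suspected`, and combine the base classifiers' outputs with `weighted_vote`. How the weights themselves are derived is left open here because the summary does not specify it.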
DOI: 10.1631/FITEE.1400398
CLC number: TP391
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2015-11-11