JZUS - Journal of Zhejiang University SCIENCE

Journal of Zhejiang University SCIENCE C

ISSN 1869-1951 (Print), 1869-196X (Online), Monthly

2011 Vol.12 No.8 P.615-628

Clustering feature decision trees for semi-supervised classification from high-speed data streams

Wen-hua Xu, Zheng Qin, Yang Chang

Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; School of Software, Tsinghua University, Beijing 100084, China

xwh07@mails.tsinghua.edu.cn, zhqing@mails.tsinghua.edu.cn

Abstract: Most stream data classification algorithms apply a supervised learning strategy, which requires large amounts of labeled data. Such approaches are impractical because labeled data are usually hard to obtain in practice. In this paper, we build a clustering feature decision tree model, CFDT, from data streams that contain unlabeled examples together with a small number of labeled examples. CFDT applies a micro-clustering algorithm that scans the data only once to provide the statistical summaries needed for incremental decision tree induction. Micro-clusters also serve as classifiers in the tree leaves, which improves classification accuracy and reinforces the any-time property. Our experiments on synthetic and real-world datasets show that CFDT scales well to data streams while achieving high classification accuracy at high speed.

Key words: Clustering feature vector, Decision tree, Semi-supervised learning, Stream data classification, Very fast decision tree
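
Illustrative sketch (not taken from the paper): the abstract describes micro-clusters that summarize a stream in a single pass and also act as classifiers. The Python below shows one common way to do this, assuming a BIRCH-style clustering feature triple (count, linear sum, squared sum) extended with per-class label counts. The names MicroCluster, absorb, predict, and the max_dist threshold are hypothetical, and the distance-threshold rule is an assumption rather than the CFDT procedure itself.

```python
import math


class MicroCluster:
    """Clustering feature summary: count, linear sum, squared sum, label counts."""

    def __init__(self, dim):
        self.n = 0
        self.ls = [0.0] * dim   # per-dimension sum of values
        self.ss = [0.0] * dim   # per-dimension sum of squared values
        self.label_counts = {}  # counts of labeled examples only

    def add(self, x, label=None):
        # Incremental update: each example is seen once and then discarded.
        self.n += 1
        for i, v in enumerate(x):
            self.ls[i] += v
            self.ss[i] += v * v
        if label is not None:
            self.label_counts[label] = self.label_counts.get(label, 0) + 1

    def centroid(self):
        return [s / self.n for s in self.ls]

    def radius(self):
        # Root-mean-square distance of members from the centroid.
        var = sum(ss / self.n - (ls / self.n) ** 2
                  for ls, ss in zip(self.ls, self.ss))
        return math.sqrt(max(var, 0.0))

    def majority_label(self):
        if not self.label_counts:
            return None
        return max(self.label_counts, key=self.label_counts.get)


def nearest(clusters, x):
    return min(clusters, key=lambda c: math.dist(c.centroid(), x))


def absorb(clusters, x, label=None, max_dist=1.0):
    """Add x to the nearest micro-cluster, or open a new one if it is too far."""
    if clusters:
        c = nearest(clusters, x)
        if math.dist(c.centroid(), x) <= max_dist:
            c.add(x, label)
            return c
    c = MicroCluster(len(x))
    c.add(x, label)
    clusters.append(c)
    return c


def predict(clusters, x):
    """Classify x by the majority label of the nearest labeled micro-cluster."""
    labeled = [c for c in clusters if c.label_counts]
    if not labeled:
        return None
    return nearest(labeled, x).majority_label()


# Toy stream: two labeled and two unlabeled examples (None = unlabeled).
stream = [((0.0, 0.0), "a"), ((0.2, 0.0), None),
          ((5.0, 5.0), "b"), ((5.2, 5.0), None)]
clusters = []
for x, y in stream:
    absorb(clusters, x, y)
```

Because the summaries are additive, unlabeled examples still refine each cluster's centroid and radius, while the few labeled examples determine the class that the cluster votes for.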

DOI: 10.1631/jzus.C1000330

CLC number: TP391

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2011-07-04

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE