CLC number: TP181
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2011-07-14
Cited: 1
Clicked: 7644
Zhi-yong Yan, Cong-fu Xu, Yun-he Pan. Improving naive Bayes classifier by dividing its decision regions[J]. Journal of Zhejiang University Science C, 2011, 12(8): 647-657.
@article{Yan2011,
title="Improving naive Bayes classifier by dividing its decision regions",
author="Zhi-yong Yan, Cong-fu Xu, Yun-he Pan",
journal="Journal of Zhejiang University Science C",
volume="12",
number="8",
pages="647-657",
year="2011",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.C1000437"
}
%0 Journal Article
%T Improving naive Bayes classifier by dividing its decision regions
%A Zhi-yong Yan
%A Cong-fu Xu
%A Yun-he Pan
%J Journal of Zhejiang University SCIENCE C
%V 12
%N 8
%P 647-657
%@ 1869-1951
%D 2011
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.C1000437
TY - JOUR
T1 - Improving naive Bayes classifier by dividing its decision regions
A1 - Zhi-yong Yan
A1 - Cong-fu Xu
A1 - Yun-he Pan
JO - Journal of Zhejiang University Science C
VL - 12
IS - 8
SP - 647
EP - 657
SN - 1869-1951
Y1 - 2011
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.C1000437
ER -
Abstract: Classification can be regarded as dividing the data space into decision regions separated by decision boundaries. In this paper we analyze decision tree algorithms and the NBTree algorithm from this perspective. A decision tree can thus be regarded as a classifier tree, in which the classifier on each non-root node is trained in the decision regions of the classifier on its parent node. Likewise, the NBTree algorithm, which generates a classifier tree with the C4.5 algorithm as the root classifier and naive Bayes classifiers at the leaves, can be regarded as training naive Bayes classifiers in the decision regions of the C4.5 algorithm. We propose a second division (SD) algorithm and three soft second division (SD-soft) algorithms that train classifiers in the decision regions of the naive Bayes classifier. These four novel algorithms all generate two-level classifier trees with the naive Bayes classifier as the root classifier. The SD and the three SD-soft algorithms can make good use both of the information contained in instances near decision boundaries and of instances that may be ignored by the naive Bayes classifier. Finally, we conduct experiments on 30 data sets from the UC Irvine (UCI) repository. The results show that the SD algorithm obtains better generalization ability than the NBTree and averaged one-dependence estimators (AODE) algorithms when the C4.5 algorithm and the support vector machine (SVM) are used as leaf classifiers. Further experiments indicate that the three SD-soft algorithms can achieve better generalization ability than the SD algorithm when their argument values are selected appropriately.
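To make the two-level scheme concrete, the following is a minimal sketch of the second-division idea as stated in the abstract: train a naive Bayes root, split the training set by the root's predicted class (i.e., by its decision regions), and fit one leaf classifier per region. The sketch assumes scikit-learn, with GaussianNB standing in for the naive Bayes root and DecisionTreeClassifier for a C4.5-style leaf; the class name SecondDivisionSketch is hypothetical, and the paper's soft-division (SD-soft) variants and argument choices are not reproduced.

# Minimal sketch (not the paper's implementation) of a two-level classifier tree:
# a naive Bayes root whose decision regions each receive their own leaf classifier.
# X and y are assumed to be NumPy arrays.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

class SecondDivisionSketch:
    def fit(self, X, y):
        # Level 1: train the naive Bayes root on all instances.
        self.root = GaussianNB().fit(X, y)
        region = self.root.predict(X)          # decision region of each training instance
        # Level 2: train one leaf classifier inside each decision region.
        self.leaves = {}
        for r in np.unique(region):
            mask = (region == r)
            if len(np.unique(y[mask])) > 1:    # only if the region still mixes classes
                self.leaves[r] = DecisionTreeClassifier().fit(X[mask], y[mask])
        return self

    def predict(self, X):
        region = self.root.predict(X)
        y_pred = region.copy()                 # default to the root's own decision
        for r, leaf in self.leaves.items():
            mask = (region == r)
            if mask.any():
                y_pred[mask] = leaf.predict(X[mask])
        return y_pred

Replacing DecisionTreeClassifier with sklearn.svm.SVC would give an SVM-leaf variant, analogous to the SVM leaf classifiers used in the experiments reported above.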
[1]Bezdek, J.C., 1981. Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York, USA.
[2]Bishop, C.M., 2006. Pattern Recognition and Machine Learning. Series: Information Science and Statistics. Springer-Verlag, New York, p.179-181.
[3]Domingos, P., Pazzani, M., 1997. On the optimality of the simple Bayesian classifier under zero-one loss. Mach. Learn., 29(2-3):103-130.
[4]Frank, A., Asuncion, A., 2010. UCI Machine Learning Repository. School of Information and Computer Science, University of California, Irvine, CA, USA. Available from http://archive.ics.uci.edu/ml [Accessed on July 7, 2010].
[5]Frank, E., Witten, I.H., 1998. Generating Accurate Rule Sets without Global Optimization. 15th Int. Conf. on Machine Learning, p.144-151.
[6]Frosyniotis, D., Stafylopatis, A., Likas, A., 2003. A divide-and-conquer method for multi-net classifiers. Pattern Anal. Appl., 6(1):32-40.
[7]Huang, K.Z., Yang, H.Q., King, I., Lyu, M., 2008. Machine Learning: Modeling Data Locally and Globally. Springer-Verlag, New York, p.1-28.
[8]Kohavi, R., 1996. Scaling up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid. 2nd Int. Conf. on Knowledge Discovery and Data Mining, p.202-207.
[9]Mitchell, T.M., 1997. Machine Learning. WCB/McGraw-Hill, New York, p.14-15.
[10]Pal, S.K., Mitra, S., 1992. Multi-layer perceptron, fuzzy sets, and classification. IEEE Trans. Neur. Networks, 3(5):683-697.
[11]Platt, J.C., 1999. Fast Training of Support Vector Machines Using Sequential Minimal Optimization. In: Scholkopf, B., Burges, C., Smola, A. (Eds.), Advances in Kernel Methods: Support Vector Machines. MIT Press, Cambridge, MA, USA, p.185-208.
[12]Quinlan, J.R., 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, USA.
[13]Quinlan, J.R., 1996. Improved use of continuous attributes in C4.5. J. Artif. Intell. Res., 4(1):77-90.
[14]Vapnik, V.N., 1995. The Nature of Statistical Learning Theory. Springer, Berlin Heidelberg.
[15]Vlassis, N., Likas, A., 2002. A greedy EM algorithm for Gaussian mixture learning. Neur. Process. Lett., 15(1):77-87.
[16]Webb, G.I., Boughton, J.R., Wang, Z.H., 2005. Not so naive Bayes: aggregating one-dependence estimators. Mach. Learn., 58(1):5-24.
[17]Witten, I.H., Frank, E., 2005. Data Mining: Practical Machine Learning Tools and Techniques (2nd Ed.). Morgan Kaufmann, San Francisco, CA, USA.
[18]Wu, X.D., Kumar, V., Quinlan, J.R., Ghosh, J., Yang, Q., Motoda, H., McLachlan, G.J., Ng, A., Liu, B., Yu, P.S., et al., 2008. Top 10 algorithms in data mining. Knowl. Inform. Syst., 14(1):1-37.
[19]Zheng, F., Webb, G.I., 2005. A Comparative Study of Semi-Naive Bayes Methods in Classification Learning. Fourth Australasian Data Mining Workshop, p.141-156.