CLC number: TP311;TP391
On-line Access:
Received: 2002-10-25
Revision Accepted: 2003-04-18
Crosschecked: 0000-00-00
Cited: 0
Clicked: 4700
LIU Jun-qiang, PAN Yun-he. An efficient algorithm for mining closed itemsets[J]. Journal of Zhejiang University Science A, 2004, 5(1): 8-15.
@article{title="An efficient algorithm for mining closed itemsets",
author="LIU Jun-qiang, PAN Yun-he",
journal="Journal of Zhejiang University Science A",
volume="5",
number="1",
pages="8-15",
year="2004",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.2004.0008"
}
%0 Journal Article
%T An efficient algorithm for mining closed itemsets
%A LIU Jun-qiang
%A PAN Yun-he
%J Journal of Zhejiang University SCIENCE A
%V 5
%N 1
%P 8-15
%@ 1869-1951
%D 2004
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.2004.0008
TY - JOUR
T1 - An efficient algorithm for mining closed itemsets
A1 - LIU Jun-qiang
A1 - PAN Yun-he
J0 - Journal of Zhejiang University Science A
VL - 5
IS - 1
SP - 8
EP - 15
%@ 1869-1951
Y1 - 2004
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/jzus.2004.0008
Abstract: This paper presents a new efficient algorithm for mining frequent closed itemsets. It enumerates the closed set of frequent itemsets by using a novel compound frequent itemset tree that facilitates fast growth and efficient pruning of search space. It also employs a hybrid approach that adapts search strategies, representations of projected transaction subsets, and projecting methods to the characteristics of the dataset. Efficient local pruning, global subsumption checking, and fast hashing methods are detailed in this paper. The principle that balances the overheads of search space growth and pruning is also discussed. Extensive experimental evaluations on real world and artificial datasets showed that our algorithm outperforms CHARM by a factor of five and is one to three orders of magnitude more efficient than CLOSET and MAFIA.
[1] Agarwal, R., Aggarwal, C. and Prasad, V., 2000. Depth First Generation of Long Patterns. In: The 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, USA.
[2] Agrawal, R. and Srikant, R., 1994. Fast Algorithms for Mining Association Rules. In: VLDB'94, Santiago, Chile, p.487-499.
[3] Burdick, D., Calimlim, M. and Gehrke, J., 2001. MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases. In: The 17th International Conference on Data Engineering, Heidelberg, Germany.
[4] Liu, J., Pan, Y., Wang, K. and Han, J., 2002. Mining Frequent Itemsets by Opportunistic Projection. In: The 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Alberta, Canada, p.229-238.
[5] Pasquier, N., Bastide, Y., Taouil, R. and Lakhal, L., 1998. Pruning Closed Itemset Lattices for Association Rules. In: The BDA French Conference on Advanced Databases, France.
[6] Pasquier, N., Bastide, Y., Taouil, R. and Lakhal., L., 1999. Discovering Frequent Closed Itemsets for Association Rules. In : ICDT'99, Jerusalem, Israel, p.398-416.
[7] Pei, J., Han, J. and Mao, R., 2000. CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets. In: The ACM-SIGMOD International Workshop on Data Mining and Knowledge Discovery, Dallas, TX.
[8] Wang, K., Liu, T., Han, J. and Liu, J., 2002. Top Down FP-Growth for Association Rule Mining. In: The 6th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Taipei, p.334-340.
[9] Zaki, M.J. and Hsiao, C.J., 2002. CHARM: An Efficient Algorithm for Closed Itemset Mining. In: The 2nd SIAM International Conference on Data Mining, Arlington, VA, USA.
Open peer comments: Debate/Discuss/Question/Opinion
<1>