CLC number: TP311.13
On-line Access: 2008-05-09
Received: 2007-10-16
Revision Accepted: 2008-01-28
Crosschecked: 0000-00-00
Cited: 3
Clicked: 5334
Yi-jun BEI, Gang CHEN, Jin-xiang DONG, Ke CHEN. Bottom-up mining of XML query patterns to improve XML querying[J]. Journal of Zhejiang University Science A, 2008, 9(6): 744-757.
@article{title="Bottom-up mining of XML query patterns to improve XML querying",
author="Yi-jun BEI, Gang CHEN, Jin-xiang DONG, Ke CHEN",
journal="Journal of Zhejiang University Science A",
volume="9",
number="6",
pages="744-757",
year="2008",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.A071551"
}
%0 Journal Article
%T Bottom-up mining of XML query patterns to improve XML querying
%A Yi-jun BEI
%A Gang CHEN
%A Jin-xiang DONG
%A Ke CHEN
%J Journal of Zhejiang University SCIENCE A
%V 9
%N 6
%P 744-757
%@ 1673-565X
%D 2008
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.A071551
TY - JOUR
T1 - Bottom-up mining of XML query patterns to improve XML querying
A1 - Yi-jun BEI
A1 - Gang CHEN
A1 - Jin-xiang DONG
A1 - Ke CHEN
J0 - Journal of Zhejiang University Science A
VL - 9
IS - 6
SP - 744
EP - 757
%@ 1673-565X
Y1 - 2008
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/jzus.A071551
Abstract: Querying XML data is a computationally expensive process due to the complex nature of both the XML data and the XML queries. In this paper we propose an approach to expedite XML query processing by caching the results of frequent queries. We discover frequent query patterns from user-issued queries using an efficient bottom-up mining approach called VBUXMiner. VBUXMiner consists of two main steps. First, all queries are merged into a summary structure named “compressed global tree guide” (CGTG). Second, a bottom-up traversal scheme based on the CGTG is employed to generate frequent query patterns. We use the frequent query patterns in a cache mechanism to improve the XML query performance. Experimental results show that our proposed mining approach outperforms the previous mining algorithms for XML queries, such as XQPMinerTID and FastXMiner, and that by caching the results of frequent query patterns, XML query performance can be dramatically improved.
[1] Al-Khalifa, S., Jagadish, H.V., Koudas, N., Patel, J.M., Srivastava, D., Wu, Y., 2002. Structural Joins: A Primitive for Efficient XML Query Pattern Matching. Proc. 18th ICDE, p.141-152.
[2] Asai, T., Abe, K., Kawasoe, S., Arimura, H., Satamoto, H., Arikawa, S., 2002. Efficient Substructure Discovery from Large Semi-structured Data. Proc. 2nd SIAM Int. Conf. on Data Mining, p.158-174.
[3] Asai, T., Arimura, H., Uno, T., Nakano, S., 2003. Discovering Frequent Substructures in Large Unordered Trees. Proc. 6th Int. Conf. on Discovery Science, p.47-61.
[4] Bei, Y.J., Chen, G., Dong, J.X., 2007. BUXMiner: An Efficient Bottom-Up Approach to Mining XML Query Patterns. Proc. APWEB/WAIM, p.709-720.
[5] Chehreghani, M.H., Rahgozar, M., Lucas, C., 2007. Mining Maximal Embedded Unordered Tree Patterns. Proc. IEEE Symp. on Computational Intelligence and Data Mining, p.437-443.
[6] Chen, L., Rundensteiner, E.A., Wang, S., 2002. XCache: A Semantic Caching System for XML Queries. Proc. ACM SIGMOD Int. Conf. on Management of Data, p.618.
[7] Chen, L., Bhowmick, S.S., Chia, L.T., 2005. Mining Positive and Negative Association Rules from XML Query Patterns for Caching. Proc. 10th DASFAA, p.736-747.
[8] Chi, Y., Yang, Y., Muntz, R.R., 2003. Indexing and Mining Free Trees. Proc. 3rd ICDM, p.509-512.
[9] Chi, Y., Yang, Y., Muntz, R.R., 2004. HybridTreeMiner: An Efficient Algorihtm for Mining Frequent Rooted Trees and Free Trees Using Canonical Forms. Proc. 16th Int. Conf. on Scientific and Statistical Database Management, p.11-20.
[10] Gu, M.S., Hwang, J.H., Ryu, K.H., 2007. Frequent XML Query Pattern Mining Based on FP-Tree. Proc. DEXA Workshops, p.555-559.
[11] Hong, J.W., Kang, H., 2005. Data Integration and Cache-Answerability of Queries through XML View of Data Source on the Web. Proc. IMSA, p.242-247.
[12] Kim, Y., Park, S.H., Kim, T.S., Lee, J.H., Park, T.S., 2006. An Efficient Index Scheme for XML Databases. Proc. SOFSEM, p.370-378.
[13] Kutty, S., Nayak, R., Li, Y., 2007. PCITMiner-Prefix-Based Closed Induced Tree Miner for Finding Closed Induced Frequent Subtrees. Proc. 6th AusDM, p.151-160.
[14] Luccio, F., Enriquez, A.M., Rieumont, P.O., Pagli, L., 2001. Exact Rooted Subtree Matching in Sublinear Time. Technical Report TR-01-14.
[15] Nayak, R., Iryadi, W., 2006. XMine: A Methodology for Mining XML Structure. Proc. APWeb, p.786-792.
[16] Paik, J., Kim, U.M., 2006. A Simple Yet Efficient Approach for Maximal Frequent Subtrees Extraction from a Collection of XML Documents. Proc. WISE Workshops, p.94-103.
[17] Seo, D.M., Yoo, J.S., Cho, K.H., 2007. An Efficient XML Index Structure with Bottom-Up Query Processing. Proc. ICCS, p.813-820.
[18] Yang, L.H., Lee, M.L., Hsu, W., 2003a. Efficient Mining of XML Query Patterns for Caching. Proc. 29th VLDB, p.69-80.
[19] Yang, L.H., Lee, M.L., Hsu, W., Acharya, S., 2003b. Mining Frequent Query Patterns from XML Queries. Proc. 8th DASFAA, p.355-362.
[20] Zaki, M.J., 2002. Efficiently Mining Frequent Trees in a Forest. Proc. 8th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, p.71-80.
[21] Zaki, M.J., 2005. Efficiently mining frequent embedded unordered trees. Fundamenta Informaticae, 65:33-52.
Open peer comments: Debate/Discuss/Question/Opinion
<1>