CLC number: TP311.13
On-line Access:
Received: 2009-01-02
Revision Accepted: 2009-04-15
Crosschecked: 2009-10-18
Cited: 0
Clicked: 5973
Jin-hua JIANG, Ke CHEN, Xiao-yan LI, Gang CHEN, Li-dan SHOU. Efficient processing of ordered XML twig pattern matching based on extended Dewey[J]. Journal of Zhejiang University Science A, 2009, 10(12): 1769-1783.
@article{title="Efficient processing of ordered XML twig pattern matching based on extended Dewey",
author="Jin-hua JIANG, Ke CHEN, Xiao-yan LI, Gang CHEN, Li-dan SHOU",
journal="Journal of Zhejiang University Science A",
volume="10",
number="12",
pages="1769-1783",
year="2009",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.A0920006"
}
%0 Journal Article
%T Efficient processing of ordered XML twig pattern matching based on extended Dewey
%A Jin-hua JIANG
%A Ke CHEN
%A Xiao-yan LI
%A Gang CHEN
%A Li-dan SHOU
%J Journal of Zhejiang University SCIENCE A
%V 10
%N 12
%P 1769-1783
%@ 1673-565X
%D 2009
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.A0920006
TY - JOUR
T1 - Efficient processing of ordered XML twig pattern matching based on extended Dewey
A1 - Jin-hua JIANG
A1 - Ke CHEN
A1 - Xiao-yan LI
A1 - Gang CHEN
A1 - Li-dan SHOU
J0 - Journal of Zhejiang University Science A
VL - 10
IS - 12
SP - 1769
EP - 1783
%@ 1673-565X
Y1 - 2009
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/jzus.A0920006
Abstract: Finding all occurrences of a twig pattern is a core operation of extensible markup language (XML) query processing. Holistic twig join algorithms, which avoid a large number of intermediate results, represent the state-of-the-art algorithms. However, ordered XML twig join is mentioned rarely in the literature and previous algorithms developed in attempts to solve the problem of ordered twig pattern (OTP) matching have poor performance. In this paper, we first propose a novel children linked stacks encoding scheme to represent compactly the partial ordered twig join results. Based on this encoding scheme and extended Dewey, we design a novel holistic OTP matching algorithm, called OTJFast, which needs only to access the labels of the leaf query nodes. Furthermore, we propose a new algorithm, named OTJFaster, incorporating three effective optimization rules to avoid unnecessary computations. This works well on available indices (such as B+-tree), skipping useless elements. Thus, not only is disk access reduced greatly, but also many unnecessary computations are avoided. Finally, our extensive experiments over both real and synthetic datasets indicate that our algorithms are superior to previous approaches.
[1] Al-Khalifa, S., Jagadish, H.V., Koudas, N., Patel, J.M., Srivastava, D., Wu, Y., 2002. Structural Joins: A Primitive for Efficient XML Query Pattern Matching. Proc. 18th Int. Conf. on Data Engineering, p.141-152.
[2] Berglund, A., Boag, S., Chamberlin, D., Fernandez, M.F., Kay, M., Robie, J., Simeon, J., 2002. XML Path Language (XPath) 2.0. Technical Report WD-xpath20-20020816. W3C.
[3] Boag, S., Chamberlin, D., Fernandez, M.F., Florescu, D., Robie, J., Simeon, J., 2002. XQuery 1.0: An XML Query Language. Technical Report WD-xquery-20020816. W3C.
[4] Bruno, N., Koudas, N., Srivastava, D., 2002. Holistic Twig Joins: Optimal XML Pattern Matching. Proc. ACM SIGMOD Int. Conf. on Management of Data, p.310-321.
[5] Chen, S., Li, H.G., Tatemura, J., Hsiung, W.P., Agrawal, D., Candan, K.S., 2006. Twig2Stack: Bottom-up Processing of Generalized-tree-pattern Queries over XML Documents. Proc. 32nd Very Large Database, p.283-294.
[6] Chen, T., Lu, J.H., Ling, T.W., 2005. On Boosting Holism in XML Twig Pattern Matching Using Structural Indexing Techniques. Proc. ACM SIGMOD Int. Conf. on Management of Data, p.455-466.
[7] Jiang, H.F., Wang, W., Lu, H., 2003. Holistic Twig Joins on Indexed XML Documents. Proc. 29th Very Large Database, p.273-284.
[8] Jiang, J.H., Chen, G., Shou, L.D., Chen, K., 2008. OTJFast: Processing Ordered XML Twig Join Fast. Proc. IEEE Asia-Pacific Services Computing Conf., p.1289-1294.
[9] Jiang, Z.W., Luo, C., Hou, W.C., 2006. An Efficient One-phase Holistic Twig Join Algorithm for XML Data. Proc. 15th ACM Int. Conf. on Information and Knowledge Management, p.786-787.
[10] Li, G.L., Feng, J.H., Zhang,Y., Zhou, L.Z., 2007. Efficient Holistic Twig Joins in Leaf-to-Root Combining with Root-to-Leaf Way. Proc. 12th Int. Conf. on Database Systems for Advanced Applications, p.834-849.
[11] Lu, J.H., Chen, T., Ling, T.W., 2004. Efficient Processing of XML Twig Patterns with Parent Child Edges: A Look-ahead Approach. Proc. 13th ACM SIGMOD Int. Conf. on Management of Data, p.533-542.
[12] Lu, J.H., Ling, T.W., Chan, C.Y., Chen, T., 2005a. From Region Encoding to Extended Dewey: On Efficient Processing of XML Twig Pattern Matching. Proc. 31st Very Large Database, p.193-204.
[13] Lu, J.H., Ling, T.W., Yu, T., Li, C., Ni, W., 2005b. Efficient Processing of Ordered XML Twig Pattern. Proc. Int. Conf. on Database and Expert Systems Applications, p.300-309.
[14] Printer, R.Y., 1985. Efficient String Matching with Don’t Care Patterns. In: Apostolico A., Galil, Z. (Eds.), Combinatorial Algorithms on Words. NATO Advanced Science Institute Series F: Computer and System Sciences. Springer-Verlag, New York, p.11-29.
[15] Prűfer, H., 1918. Neuer beweis eines satzes űber permutationen. Arch. Math. Phys., 27:142-144 (in Spanish).
[16] Rao, P., Moon, B., 2004. PRIX: Indexing and Querying XML Using Prűfer Sequences. Proc. 20th Int. Conf. on Data Engineering, p.288-299.
[17] Rao, P., Moon, B., 2006. Sequencing XML data and query twig for fast pattern matching. ACM Trans. Database Syst., 31(1):299-345.
[18] Tatarinov, I., Viglas, S.D., Beyer, K., Shanmugasundaram, J., Shekita, E., Zhang, C., 2002. Storing and Querying Ordered XML Using a Relational Database System. Proc. ACM SIGMOD Int. Conf. on Management of Data, p.204-215.
[19] Zhang, C., Naughton, J., Dewitt, D.J., Luo, Q., Lohman, G.M., 2001. On supporting containment queries in relational database management systems. ACM SIGMOD Rec., 30(2):425-436.
[20] Zhu, J.Q., Wang, W., Meng, X.F., 2008. Efficient Processing of Complex Twig Pattern Matching. Proc. 9th Int. Conf. on Web-Age Information Management, p.135-140.
[21] Zografoula, V., Nick, K., Divesh, S., Vassilis, J.T., 2005. Answering Order-based Queries over XML Data. Proc. 14th Int. World Wide Web Conf., p.1162-1163.
Open peer comments: Debate/Discuss/Question/Opinion
<1>