eCite Digital Repository

Extended XML tree pattern matching: theories and algorithms


Lu, J and Ling, TW and Bao, Z and Wang, C, Extended XML tree pattern matching: theories and algorithms, IEEE Transactions on Knowledge and Data Engineering, 23, (3) pp. 402-416. ISSN 1041-4347 (2011) [Refereed Article]

Copyright Statement

Copyright 2011 IEEE Computer Society

DOI: doi:10.1109/TKDE.2010.126


As business and enterprises generate and exchange XML data more often, there is an increasing need for efficient processing of queries on XML data. Searching for the occurrences of a tree pattern query in an XML database is a core operation in XML query processing. Prior works demonstrate that holistic twig pattern matching algorithm is an efficient technique to answer an XML tree pattern with parent-child (P-C) and ancestor-descendant (A-D) relationships, as it can effectively control the size of intermediate results during query processing. However, XML query languages (e.g., XPath and XQuery) define more axes and functions such as negation function, order-based axis, and wildcards. In this paper, we research a large set of XML tree pattern, called extended XML tree pattern, which may include P-C, A-D relationships, negation functions, wildcards, and order restriction. We establish a theoretical framework about "matching cross" which demonstrates the intrinsic reason in the proof of optimality on holistic algorithms. Based on our theorems, we propose a set of novel algorithms to efficiently process three categories of extended XML tree patterns. A set of experimental results on both real-life and synthetic data sets demonstrate the effectiveness and efficiency of our proposed theories and algorithms.

Item Details

Item Type:Refereed Article
Keywords:structured query processing, semi-structured data, algorithms, tree pattern, XML/XSL/RDF
Research Division:Information and Computing Sciences
Research Group:Data management and data science
Research Field:Data management and data science not elsewhere classified
Objective Division:Information and Communication Services
Objective Group:Information systems, technologies and services
Objective Field:Information systems, technologies and services not elsewhere classified
UTAS Author:Bao, Z (Dr Zhifeng Bao)
ID Code:92170
Year Published:2011
Web of Science® Times Cited:27
Deposited By:Information and Communication Technology
Deposited On:2014-06-09
Last Modified:2015-02-13

Repository Staff Only: item control page