University of Tasmania
Browse

File(s) under permanent embargo

Extended XML tree pattern matching: theories and algorithms

journal contribution
posted on 2023-05-18, 01:21 authored by Lu, J, Ling, TW, Bao, Z, Wang, C
As business and enterprises generate and exchange XML data more often, there is an increasing need for efficient processing of queries on XML data. Searching for the occurrences of a tree pattern query in an XML database is a core operation in XML query processing. Prior works demonstrate that holistic twig pattern matching algorithm is an efficient technique to answer an XML tree pattern with parent-child (P-C) and ancestor-descendant (A-D) relationships, as it can effectively control the size of intermediate results during query processing. However, XML query languages (e.g., XPath and XQuery) define more axes and functions such as negation function, order-based axis, and wildcards. In this paper, we research a large set of XML tree pattern, called extended XML tree pattern, which may include P-C, A-D relationships, negation functions, wildcards, and order restriction. We establish a theoretical framework about “matching cross” which demonstrates the intrinsic reason in the proof of optimality on holistic algorithms. Based on our theorems, we propose a set of novel algorithms to efficiently process three categories of extended XML tree patterns. A set of experimental results on both real-life and synthetic data sets demonstrate the effectiveness and efficiency of our proposed theories and algorithms.

History

Publication title

IEEE Transactions on Knowledge and Data Engineering

Volume

23

Pagination

402-416

ISSN

1041-4347

Department/School

School of Information and Communication Technology

Publisher

IEEE Computer Soc

Place of publication

10662 Los Vaqueros Circle, Po Box 3014, Los Alamitos, USA, Ca, 90720-1314

Rights statement

Copyright 2011 IEEE Computer Society

Repository Status

  • Restricted

Socio-economic Objectives

Information systems, technologies and services not elsewhere classified

Usage metrics

    University Of Tasmania

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC