University of Tasmania
JCSE12 - best papers of DASFAA12.pdf (4.89 MB)

Fast result enumeration for keyword queries on XML data

journal contribution
Posted on 2023-05-18, 01:21. Authored by Zhou, J; Chen, Z; Tang, X; Bao, Z; Ling, TW
In this paper, we focus on the efficient construction of tightest matched subtree (TMSubtree) results for keyword queries on extensible markup language (XML) data, based on smallest lowest common ancestor (SLCA) semantics. Here, “matched” means that every node in a returned subtree satisfies the constraint that the set of distinct keywords of the subtree rooted at that node is not subsumed by that of any of its sibling nodes, while “tightest” means that no two subtrees rooted at sibling nodes contain the same set of keywords. Let d be the depth of a given TMSubtree and m the number of keywords of a given query Q. We prove that if d ≤ m, a matched subtree result has at most 2m! nodes; otherwise, the size of a matched subtree result is bounded by (d - m + 2)m!. Based on this theoretical result, we propose a pipelined algorithm that constructs TMSubtree results without rescanning all node labels. Experiments verify the benefits of our algorithm in aiding keyword search over XML data.
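
The two sibling constraints and the size bound from the abstract can be illustrated with a minimal Python sketch. It is not the paper's pipelined algorithm. The names `size_bound` and `prune_siblings` are hypothetical, "subsumed" is assumed to mean "is a proper subset of", and the bound is assumed to read 2m! when d ≤ m and (d - m + 2)m! otherwise.

```python
from math import factorial


def size_bound(d: int, m: int) -> int:
    """Upper bound on the node count of a matched subtree result.

    Assumption: the bound is 2*m! when d <= m and (d - m + 2)*m! otherwise,
    where d is the subtree depth and m the number of query keywords.
    """
    if d <= m:
        return 2 * factorial(m)
    return (d - m + 2) * factorial(m)


def prune_siblings(siblings):
    """Filter one group of sibling nodes by their keyword sets.

    `siblings` is a list of (name, frozenset_of_keywords) pairs, where the
    set holds the distinct query keywords found in the subtree rooted at
    that node. Returns the names of the siblings that are kept.
    """
    kept = []
    seen = set()
    for name, keywords in siblings:
        # "matched" (assumed reading): drop a node whose keyword set is a
        # proper subset of some sibling's keyword set.
        if any(keywords < other for _, other in siblings):
            continue
        # "tightest": keep only the first sibling for each distinct set.
        if keywords in seen:
            continue
        seen.add(keywords)
        kept.append(name)
    return kept


# Hypothetical sibling group for a three-keyword query.
example = [
    ("a", frozenset({"k1"})),
    ("b", frozenset({"k1", "k2"})),
    ("c", frozenset({"k1", "k2"})),
    ("d", frozenset({"k3"})),
]
print(prune_siblings(example))  # ['b', 'd']
print(size_bound(5, 3))         # 24
```

In this example, node "a" is dropped because its keyword set is contained in that of "b", and node "c" is dropped because it duplicates the keyword set of "b". Note that the two cases of the assumed bound agree at d = m, where both give 2m!.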

History

Publication title

Journal of Computing Science and Engineering

Volume

6

Pagination

127-140

ISSN

1976-4677

Department/School

School of Information and Communication Technology

Publisher

Korean Institute of Information Scientists and Engineers

Place of publication

Republic of Korea

Rights statement

Licensed under Creative Commons Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0) http://creativecommons.org/licenses/by-nc/3.0/

Repository Status

  • Open

Socio-economic Objectives

Electronic information storage and retrieval services
