eCite Digital Repository
Towards an effective XML keyword search
Citation
Bao, Z and Lu, J and Ling, TK and Chen, B, Towards an effective XML keyword search, IEEE Transactions on Knowledge and Data Engineering, 22, (8) pp. 1077-1092. ISSN 1041-4347 (2010) [Refereed Article]
Copyright Statement
Copyright 2010 IEEE Computer Society
Official URL: http://dx.doi.org/10.1109/TKDE.2010.63
DOI: 10.1109/TKDE.2010.63
Abstract
Inspired by the great success of information retrieval (IR) style keyword search on the web, keyword search on XML has
emerged recently. The differences between text databases and XML databases result in three new challenges: 1) Identify the user's
search intention, i.e., identify the XML node types that the user wants to search for and search via. 2) Resolve keyword ambiguity
problems: a keyword can appear as both a tag name and a text value of some node; a keyword can appear as the text values of
different XML node types and carry different meanings; a keyword can appear as the tag name of different XML node types with
different meanings. 3) As the search results are subtrees of the XML document, a new scoring function is needed to estimate their
relevance to a given query. However, existing methods cannot resolve these challenges and thus return results of low quality in terms of query
relevance. In this paper, we propose an IR-style approach that uses the statistics of the underlying XML data to address
these challenges. We first propose specific guidelines that a search engine should meet in both search intention identification and
relevance-oriented ranking of search results. Then, based on these guidelines, we design novel formulae to identify the search-for
nodes and search-via nodes of a query, and present a novel XML TF*IDF ranking strategy to rank the individual matches of all possible
search intentions. To complement our result ranking framework, we also take popularity into consideration for results that have
comparable relevance scores. Lastly, extensive experiments have been conducted to show the effectiveness of our approach.
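The keyword ambiguity described in the second challenge can be made concrete with a small sketch. The snippet below is an illustration only and does not reproduce the formulae proposed in the paper: the sample document, the function name `keyword_roles`, and the use of the root-to-node tag path as a stand-in for "node type" are all assumptions made for this example. It counts, per node type, how often a keyword occurs as a tag name and how often it occurs inside a text value, which is the kind of data statistic an IR-style approach could draw on.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document, invented for this illustration.
SAMPLE = """
<bib>
  <author><name>Smith</name><paper><title>XML search</title></paper></author>
  <author><name>Lee</name><paper><title>Title extraction</title></paper></author>
  <editor><name>Smith</name></editor>
</bib>
"""

def keyword_roles(root, keyword):
    """Count occurrences of `keyword` per node type, split by role.

    A node type is approximated here by the tag path from the root
    (an assumption of this sketch). Returns two Counters:
    occurrences as a tag name and occurrences in a text value.
    """
    kw = keyword.lower()
    as_tag, as_text = Counter(), Counter()

    def walk(node, path):
        node_type = path + "/" + node.tag
        if node.tag.lower() == kw:
            as_tag[node_type] += 1
        # Only the node's own leading text is inspected, split on whitespace.
        if kw in (node.text or "").lower().split():
            as_text[node_type] += 1
        for child in node:
            walk(child, node_type)

    walk(root, "")
    return as_tag, as_text

root = ET.fromstring(SAMPLE.strip())
for word in ("title", "smith"):
    tags, texts = keyword_roles(root, word)
    print(word, dict(tags), dict(texts))
```

In this sample, "title" occurs both as a tag name and as a text value, while "smith" occurs as a text value under two different node types (an author name and an editor name), which are the first two ambiguity cases listed in the abstract.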
Item Details
Item Type: | Refereed Article |
---|---|
Keywords: | XML, keyword search, ranking |
Research Division: | Information and Computing Sciences |
Research Group: | Data management and data science |
Research Field: | Data management and data science not elsewhere classified |
Objective Division: | Information and Communication Services |
Objective Group: | Information services |
Objective Field: | Electronic information storage and retrieval services |
UTAS Author: | Bao, Z (Dr Zhifeng Bao) |
ID Code: | 92171 |
Year Published: | 2010 |
Web of Science® Times Cited: | 24 |
Deposited By: | Information and Communication Technology |
Deposited On: | 2014-06-09 |
Last Modified: | 2014-12-08 |
Downloads: | 0 |