Top-down keyword query processing on XML data

Junfeng Zhou, Xingmin Zhao, Wei Wang, Ziyang Chen, Jeffrey Xu Yu

Published: 2013, Last Modified: 27 Jul 2024. Venue: CIKM 2013. License: CC BY-SA 4.0

Abstract: Efficiently answering XML keyword queries has attracted much research effort in the last decade. Two key factors behind the inefficiency of existing methods are the common-ancestor-repetition (CAR) and visiting-useless-nodes (VUN) problems. In this paper, we propose a generic top-down processing strategy to answer a given keyword query w.r.t. LCA/SLCA/ELCA semantics. By top-down, we mean that we visit all common ancestor (CA) nodes in a depth-first, left-to-right order, thus avoiding the CAR problem; by generic, we mean that our method is independent of the labeling schemes and query semantics. We show that the satisfiability of a node v w.r.t. the given semantics can be determined by v's child nodes, based on which our methods avoid the VUN problem. We propose two algorithms that are based on either traditional inverted lists or our newly proposed LLists to improve the overall performance. The experimental results verify the benefits of our methods according to various evaluation metrics.
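
To make the top-down idea concrete, the following is a minimal sketch for SLCA semantics only. It is not the paper's algorithm: the `Node` class, the `index` helper, and the in-memory `subtree_words` sets are assumptions made for illustration, standing in for the inverted lists or LLists mentioned in the abstract. It shows how each CA node can be visited exactly once in depth-first, left-to-right order, and how a node's status can be decided by looking only at its children.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical in-memory XML node; not a structure from the paper.
    name: str
    words: frozenset = frozenset()            # keywords directly in this node
    children: list = field(default_factory=list)
    subtree_words: frozenset = frozenset()    # filled in by index()


def index(node):
    """Record, for every node, the keywords occurring in its subtree.

    Assumption: this precomputation stands in for an index lookup.
    """
    acc = set(node.words)
    for child in node.children:
        acc |= index(child)
    node.subtree_words = frozenset(acc)
    return node.subtree_words


def slca_top_down(root, query):
    """Return the names of SLCA nodes in depth-first, left-to-right order."""
    query = frozenset(query)
    results = []

    def visit(v):
        # v is already known to be a common ancestor (CA) of all keywords.
        # Only children that are themselves CAs are descended into, so
        # subtrees that cannot contain an answer are never entered.
        ca_children = [c for c in v.children if query <= c.subtree_words]
        if not ca_children:
            # No child covers every keyword, so v is a smallest CA.
            results.append(v.name)
        for child in ca_children:
            visit(child)

    if query <= root.subtree_words:
        visit(root)
    return results


# Small example tree (made-up data).
root = Node("r", children=[
    Node("a", children=[Node("a1", frozenset({"xml"})),
                        Node("a2", frozenset({"keyword"}))]),
    Node("b", children=[Node("b1", frozenset({"xml", "keyword"})),
                        Node("b2", frozenset({"xml"}))]),
    Node("c", children=[Node("c1", frozenset({"xml"}))]),
])
index(root)
answers = slca_top_down(root, {"xml", "keyword"})
print(answers)
```

In this example, node `c` and its subtree are never entered because `c` does not cover both keywords, and `b` is not reported because its child `b1` is already a CA. Each CA node is reached once from its parent, which is the intuition behind avoiding repeated common-ancestor computation.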