Reachable subwebs for traversal-based query executionOpen Website

2014 (modified: 12 Nov 2022)WWW (Companion Volume) 2014Readers: Everyone
Abstract: Traversal-based approaches to execute queries over data on the Web have recently been studied. These approaches make use of up-to-date data from initially unknown data sources and, thus, enable applications to tap the full potential of the Web. While existing work focuses primarily on implementation techniques, a principled analysis of subwebs that are reachable by such approaches is missing. Such an analysis may help to gain new insight into the problem of optimizing the response time of traversal-based query engines. Furthermore, a better understanding of characteristics of such subwebs may also inform approaches to benchmark these engines. This paper provides such an analysis. In particular, we identify typical graph-based properties of query-specific reachable subwebs and quantify their diversity. Furthermore, we investigate whether vertex scoring methods (e.g., PageRank) are able to predict query-relevance of data sources when applied to such subwebs.
0 Replies

Loading