Subject-Oriented Classification Based on Scale Probing in the Deep Web

Published: 01 Jan 2008, Last Modified: 17 Apr 2025, WAIM 2008, CC BY-SA 4.0
Abstract: To access large-scale data sources efficiently and automatically, it is necessary to classify them into different domains and categories. In this paper, we propose a novel approach that uses query probing to classify data sources into detailed domain subjects. We train sample instances for each subject category and use them to probe the data scale of each source in each category. We then build a matrix to classify a data source into one or more subject categories, and we develop a decision algorithm based on probing iteration to rectify the classification result. Our experiments over real deep web sources show that our approach can achieve higher accuracy across a variety of data sources.
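The probing-and-matrix idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `build_scale_matrix`, `classify`, and `count_hits`, the share-based threshold `min_share`, and the toy counts. The iterative rectification step is omitted because the abstract does not specify how it works.

```python
from typing import Callable, Dict, List


def build_scale_matrix(
    sources: List[str],
    probes: Dict[str, List[str]],
    count_hits: Callable[[str, str], int],
) -> Dict[str, Dict[str, int]]:
    """Rows are sources, columns are subject categories. Each cell is the
    total number of results a source reports for that category's probes."""
    return {
        src: {
            cat: sum(count_hits(src, query) for query in queries)
            for cat, queries in probes.items()
        }
        for src in sources
    }


def classify(row: Dict[str, int], min_share: float = 0.3) -> List[str]:
    """Assign every category whose share of the source's probed results
    reaches min_share (a hypothetical threshold, not from the paper)."""
    total = sum(row.values())
    if total == 0:
        return []
    return sorted(cat for cat, n in row.items() if n / total >= min_share)


# Toy stand-in for real query probing: invented result counts per query.
FAKE_COUNTS = {
    ("bookstore", "python"): 120,
    ("bookstore", "novel"): 100,
    ("cardealer", "novel"): 1,
    ("cardealer", "sedan"): 80,
}


def fake_count_hits(source: str, query: str) -> int:
    return FAKE_COUNTS.get((source, query), 0)


# Hypothetical probe queries, one list per subject category.
PROBES = {"computing": ["python"], "fiction": ["novel"], "autos": ["sedan"]}

matrix = build_scale_matrix(["bookstore", "cardealer"], PROBES, fake_count_hits)
labels = {src: classify(row) for src, row in matrix.items()}
```

With these toy counts, the bookstore source is assigned two subject categories and the car dealer one, which mirrors the abstract's point that a source may belong to one or more categories. In a real setting `count_hits` would submit the query to the source's search interface and read back the reported result count.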
