Keyword Search Technology in Content Addressable Storage System

Published: 01 Jan 2020, Last Modified: 13 Nov 2024. Venue: HPCC/DSS/SmartCity 2020. License: CC BY-SA 4.0
Abstract: In this study, we first analyze the characteristics of content addressable storage (CAS) systems and discuss four key problems of keyword search in such systems: data collection, index structure, index storage, and result ranking. We then choose the InterPlanetary File System (IPFS), a CAS system, as the experimental platform and build on it CASearch, a keyword search engine for CAS systems. In the design, we deliberately set aside IPFS-specific characteristics and focus on the problems common to CAS systems. A CASearch node collects keywords proactively. CASearch binds keywords through variable index files, a feature of CAS. It also ranks search results by the distance between nodes in the distributed hash table (DHT) network. Together, these solutions address the four basic problems listed above. Finally, we evaluate CASearch in terms of data collection, time overhead, storage overhead, and result ranking to demonstrate its feasibility and advantages.
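The abstract does not say how the distance between nodes is computed. As a minimal sketch only, the following assumes a Kademlia-style XOR distance over hashed node identifiers, which is the metric used by the IPFS DHT; the function names, the string node identifiers, and the result mapping are hypothetical and are not taken from CASearch.

```python
import hashlib
from typing import Dict, List


def node_key(node_id: str) -> int:
    """Map a node identifier to a 256-bit integer key (hypothetical scheme)."""
    digest = hashlib.sha256(node_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia-style distance: bitwise XOR of two keys, read as an integer."""
    return a ^ b


def rank_by_distance(querier: str, results: Dict[str, str]) -> List[str]:
    """Order content identifiers by how close their holding node is to the querier.

    `results` maps a content identifier to the identifier of a node that
    holds it (an assumed shape for search results). Nearest nodes come first.
    """
    origin = node_key(querier)
    return sorted(
        results,
        key=lambda cid: xor_distance(origin, node_key(results[cid])),
    )
```

Under this assumption, a result held by the querying node itself has distance zero and is always ranked first; the ordering of the remaining results depends on the hashed identifiers, not on network latency or physical location.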