Abstract: In this study, we first analyze the characteristics of content-addressable storage (CAS) systems and discuss the key problems of keyword search in such systems: data collection, index structure, index storage, and result ranking. We then choose the InterPlanetary File System (IPFS), a CAS system, as the experimental platform and propose CASearch, a keyword search engine for CAS systems built on IPFS. In the design, we set aside IPFS-specific characteristics as far as possible and focus on the problems common to CAS systems. A CASearch node collects keywords proactively. CASearch binds keywords through variable index files, which is a feature of CAS. CASearch also ranks search results by the distance between nodes in the distributed hash table network. Together, these solutions address the four basic problems mentioned above. Finally, we evaluate CASearch in terms of data collection, time overhead, storage overhead, and result ranking to demonstrate its feasibility and advantages.
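As an illustration of ranking by node distance in a distributed hash table (DHT), the following minimal sketch orders search hits by the XOR distance between the searching node and each provider node. It assumes a Kademlia-style XOR metric, which the DHT used by IPFS is based on; the abstract does not state the exact metric CASearch uses. The names `node_key`, `xor_distance`, and `rank_results`, and the use of SHA-256 to derive keys from peer identifiers, are hypothetical choices made for this example only.

```python
import hashlib
from typing import List, Tuple


def node_key(peer_id: str) -> int:
    """Map a peer identifier to a 256-bit integer key (assumption: SHA-256)."""
    return int.from_bytes(hashlib.sha256(peer_id.encode("utf-8")).digest(), "big")


def xor_distance(peer_a: str, peer_b: str) -> int:
    """Kademlia-style distance: XOR of the two node keys, read as an integer."""
    return node_key(peer_a) ^ node_key(peer_b)


def rank_results(searcher: str, results: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (content_id, provider_peer_id) pairs, closest provider first.

    This is a sketch of distance-based ranking, not the CASearch implementation.
    """
    return sorted(results, key=lambda hit: xor_distance(searcher, hit[1]))


if __name__ == "__main__":
    # Hypothetical content identifiers and peer names, for demonstration only.
    hits = [("cid-a", "peer-far"), ("cid-b", "peer-self"), ("cid-c", "peer-mid")]
    for content_id, provider in rank_results("peer-self", hits):
        print(content_id, provider)
```

In this sketch a hit stored on the searching node itself always ranks first, because the XOR distance from a node to itself is zero.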