IDIOMS: Index-powered Distributed Object-centric Metadata Search for Scientific Data Management

Published: 01 Jan 2024, Last Modified: 14 May 2025, Venue: CCGrid 2024, Readers: Everyone, License: CC BY-SA 4.0
Abstract: Affix-oriented metadata search is an essential fuzzy search capability that lets users find data of interest in voluminous data sets using incomplete query conditions. With the recent transition towards object-centric data management systems in the science community, there is a pressing need to support such features in distributed settings. However, existing metadata search solutions either do not support efficient affix-oriented metadata search or are not well suited to the distributed setting of object-centric data management systems. To bridge this gap, we introduce IDIOMS, a metadata search solution underpinned by a distributed metadata index and designed to enable high-performance affix-oriented metadata search for parallel object-centric storage. IDIOMS efficiently supports four distinct types of highly demanded metadata queries, and it flexibly caters to both independent and collective metadata search operations. Our experimental comparisons with SoMeta, a state-of-the-art metadata query method, demonstrate a performance improvement of more than 400× for independent queries and up to 300× for collective queries, while keeping index management overhead small.
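To make the term "affix-oriented" concrete, the following minimal Python sketch shows what such queries look like over key-value metadata attached to objects. It is an illustration only, not the IDIOMS implementation: the abstract does not enumerate the four query types, so the sketch assumes they are exact, prefix, suffix, and infix matches written with a `*` wildcard. The function names `affix_match` and `search`, the wildcard syntax, and the sample metadata are all hypothetical. The sketch is also a brute-force scan on a single node, whereas IDIOMS answers such queries through a distributed index.

```python
def affix_match(pattern: str, value: str) -> bool:
    """Match `value` against a pattern with an optional leading/trailing '*'.

    Assumed (hypothetical) syntax: 'abc' exact, 'abc*' prefix,
    '*abc' suffix, '*abc*' infix.
    """
    if len(pattern) >= 2 and pattern.startswith("*") and pattern.endswith("*"):
        return pattern[1:-1] in value           # infix
    if pattern.startswith("*"):
        return value.endswith(pattern[1:])      # suffix
    if pattern.endswith("*"):
        return value.startswith(pattern[:-1])   # prefix
    return value == pattern                     # exact


def search(metadata: dict, key_pattern: str, value_pattern: str) -> list:
    """Return sorted IDs of objects having at least one attribute whose
    key and value both match. Linear scan, for illustration only."""
    return sorted(
        obj_id
        for obj_id, attrs in metadata.items()
        if any(
            affix_match(key_pattern, key) and affix_match(value_pattern, str(val))
            for key, val in attrs.items()
        )
    )


# Hypothetical sample metadata: object ID -> attribute key/value pairs.
metadata = {
    "obj1": {"run_id": "exp_042"},
    "obj2": {"run_label": "calib_042"},
    "obj3": {"detector": "exp_007"},
}

print(search(metadata, "run*", "*042"))       # prefix on key, suffix on value
print(search(metadata, "*tect*", "exp_007"))  # infix on key, exact on value
```

A scan like this touches every attribute of every object for each query, which is the cost a dedicated metadata index is meant to avoid.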