Learned Index with Dynamic $\epsilon$

Daoyuan Chen; Wuchao Li; Yaliang Li; Bolin Ding; Kai Zeng; Defu Lian; Jingren Zhou

Learned Index with Dynamic $\epsilon$

Daoyuan Chen, Wuchao Li, Yaliang Li, Bolin Ding, Kai Zeng, Defu Lian, Jingren Zhou

Published: 01 Feb 2023, Last Modified: 23 Feb 2023ICLR 2023 posterReaders: Everyone

Keywords: Learned Index, Dynamic $\epsilon$

TL;DR: Based on the theoretically derived prediction error bounds, we propose a mathematically-grounded learned index framework with dynamic $\epsilon$, which is efficient and pluggable to several state-of-the-art learned index methods.

Abstract: Index structure is a fundamental component in database and facilitates broad data retrieval applications. Recent learned index methods show superior performance by learning hidden yet useful data distribution with the help of machine learning, and provide a guarantee that the prediction error is no more than a pre-defined $\epsilon$. However, existing learned index methods adopt a fixed $\epsilon$ for all the learned segments, neglecting the diverse characteristics of different data localities. In this paper, we propose a mathematically-grounded learned index framework with dynamic $\epsilon$, which is efficient and pluggable to existing learned index methods. We theoretically analyze prediction error bounds that link $\epsilon$ with data characteristics for an illustrative learned index method. Under the guidance of the derived bounds, we learn how to vary $\epsilon$ and improve the index performance with a better space-time trade-off. Experiments with real-world datasets and several state-of-the-art methods demonstrate the efficiency, effectiveness and usability of the proposed framework.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: General Machine Learning (ie none of the above)

9 Replies

Loading