B²-Tree: Cache-Friendly String Indexing within B-Trees

Published: 01 Jan 2021, Last Modified: 12 May 2025BTW 2021EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Recently proposed index structures, that combine trie-based and comparison-based search mechanisms, considerably improve retrieval throughput for in-memory database systems. However, most of these index structures allocate small memory chunks when required. This stands in contrast to block-based index structures, that are necessary for disk-accesses of beyond main-memory database systems such as Umbra. We therefore present the B²-tree. The outer structure is identical to that of an ordinary B+-tree. It still stores elements in a dense array in sorted order, enabling efficient range scan operations. However, B²-tree is composed of multiple trees, each page integrates another trie-based search tree, which is used to determine a small memory region where a sought entry may be found. An embedded tree thereby consists of decision nodes, which operate on a single byte at a time, and span nodes, which are used to store common prefixes. This architecture usually accesses fewer cache lines than a vanilla B+-tree as our performance evaluation proved. As a result, the B²-tree is able to answer point queries considerably faster.
Loading