Abstract: The recently proposed learned Bloom filter (LBF) opens a new perspective on how to reconstruct Bloom filters with machine learning. However, the LBF incurs a high time cost and does not apply to multidimensional spatial data. In this paper, we propose a prefix-based and adaptive learned Bloom filter (PA-LBF) for spatial data that efficiently supports insertion and deletion. The PA-LBF consists of three parts. (1) Prefix-based classification: a <i>Z</i>-order space-filling curve is used to map the data to codes, take their prefixes, and classify the data accordingly. (2) Adaptive learning: multiple independent adaptive sub-LBFs are trained on the suffixes of the data and, combined with part 1, reduce the false positive rate (FPR) as well as the time consumed by querying and learning. (3) Backup filters: two kinds of backup counting Bloom filter (CBF) are constructed to handle different insertion and deletion frequencies. Experimental results validate the theory and show that, with the same memory usage, the PA-LBF reduces the FPR by 84.87%, 79.53%, and 43.01% compared with the LBF on three real-world spatial datasets.
Moreover, the time consumption of PA-LBF can be reduced to 5× and 2.05× that of the LBF on the query and learning process, respectively.
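The prefix-based classification in part 1 can be illustrated with a minimal sketch. It assumes two-dimensional coordinates that have already been quantized to non-negative integers; the function names, the bit widths, and the prefix length are hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x bits occupy even positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y bits occupy odd positions
    return code


def split_prefix(code: int, total_bits: int, prefix_bits: int) -> tuple:
    """Split a Z-order code into its high-order prefix and low-order suffix."""
    suffix_bits = total_bits - prefix_bits
    prefix = code >> suffix_bits
    suffix = code & ((1 << suffix_bits) - 1)
    return prefix, suffix


def group_by_prefix(points, bits: int = 16, prefix_bits: int = 8) -> dict:
    """Classify points by Z-order prefix; each group keeps only the suffixes.

    Assumption: each group would feed one sub-filter, which sees suffixes only.
    """
    groups = defaultdict(list)
    for x, y in points:
        code = interleave_bits(x, y, bits)
        prefix, suffix = split_prefix(code, 2 * bits, prefix_bits)
        groups[prefix].append(suffix)
    return dict(groups)


if __name__ == "__main__":
    # Nearby points tend to share a prefix, so they land in the same group.
    print(group_by_prefix([(3, 5), (2, 5), (12, 12)], bits=4, prefix_bits=4))
```

Because the Z-order curve tends to keep spatially close points close in code space, points that share a prefix fall into the same group, and a per-group model only has to distinguish the shorter suffixes.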