Abstract: Approximating a series of timestamped data points through a sequence of line segments with a maximum error guarantee is a fundamental data compression problem, known as Piecewise Linear Approximation (PLA). As the demand for analyzing large volumes of time-series data across various domains continues to grow, this problem has recently received considerable attention. Recent PLA algorithms have emerged to help us handle the overwhelming amount of information, albeit at the expense of some precision. More precisely, these algorithms strike a delicate balance between the maximum acceptable precision loss and the space savings that can be achieved. In our recent work, we proposed Sim-Piece, which offers a fresh perspective on the long-standing PLA problem. Sim-Piece identifies similarities among line segments in a PLA representation, enabling their grouping and joint representation. In this way, Sim-Piece achieves space savings that surpass even those of the optimal PLA. In this work, we present Mix-Piece, an improved PLA compression algorithm that builds upon the core idea of Sim-Piece (i.e., exploiting similar PLA segments) but further improves its performance by (1) considering multiple candidate PLA segments when ingesting a time series, (2) enabling the grouping of additional segments not utilized by Sim-Piece, and (3) making use of a versatile output format that exploits all segment similarities. Our experimental evaluation demonstrates that Mix-Piece outperforms Sim-Piece and previous competing techniques, attaining compression ratios that improve, on average, more than twofold over what PLA algorithms can offer. This in turn allows significantly higher accuracy at equivalent space requirements.
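To make the notion of a maximum error guarantee concrete, the following is a minimal sketch of a generic greedy PLA. It is not the Sim-Piece or Mix-Piece algorithm, and it performs none of the segment grouping described above. The names greedy_pla, reconstruct and eps are hypothetical and chosen for illustration. The sketch assumes strictly increasing timestamps and anchors every segment at its first data point.

```python
def greedy_pla(points, eps):
    """Generic greedy PLA sketch (NOT Sim-Piece or Mix-Piece).

    points: list of (timestamp, value) pairs with strictly increasing timestamps.
    eps:    maximum allowed absolute error per point (assumed >= 0).
    Returns a list of segments (t_start, v_start, slope, t_end).
    """
    segments = []
    i, n = 0, len(points)
    while i < n:
        t0, v0 = points[i]  # the segment is anchored at its first point
        lo, hi = float("-inf"), float("inf")  # feasible slope interval
        j = i + 1
        while j < n:
            t, v = points[j]
            dt = t - t0
            # Slopes that keep the line within eps of point j.
            new_lo = max(lo, (v - eps - v0) / dt)
            new_hi = min(hi, (v + eps - v0) / dt)
            if new_lo > new_hi:
                break  # no single slope fits all points; close the segment
            lo, hi = new_lo, new_hi
            j += 1
        # A segment covering a single point has no slope constraint; use 0.
        slope = 0.0 if j == i + 1 else (lo + hi) / 2
        segments.append((t0, v0, slope, points[j - 1][0]))
        i = j
    return segments


def reconstruct(segments, t):
    """Evaluate the approximation at timestamp t (t must lie in some segment)."""
    for t0, v0, slope, t_end in segments:
        if t0 <= t <= t_end:
            return v0 + slope * (t - t0)
    raise ValueError("timestamp not covered by any segment")


if __name__ == "__main__":
    data = [(0, 0.0), (1, 1.0), (2, 2.1), (3, 2.9), (4, 10.0), (5, 11.0)]
    segs = greedy_pla(data, eps=0.2)
    worst = max(abs(reconstruct(segs, t) - v) for t, v in data)
    print(len(segs), "segments, worst error", worst)
```

A larger eps lets each segment cover more points, so fewer segments need to be stored. This is the trade-off between precision loss and space savings that the abstract refers to.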
External IDs:dblp:journals/vldb/KitsiosLPK24