Codecs for DNA-based Data Storage Systems with Multiple Constraints for Internet of Things

Published: 01 Jan 2023, Last Modified: 28 Sept 2024, Venue: GLOBECOM 2023, License: CC BY-SA 4.0
Abstract: Internet of Things (IoT) devices are severely constrained in computational capacity, battery life, and data storage, and therefore cannot meet the demands of mass data storage. With the explosive growth of data to be stored, deoxyribonucleic acid (DNA)-based storage has become a promising direction for IoT data storage because of its advantages, e.g., high capacity, long-term durability, and scalability. However, DNA synthesis and sequencing are subject to errors due to certain biochemical properties of DNA. In this paper, an explicit encoding and decoding scheme is designed for constrained systems that satisfy both the 3-RLL constraint and the strong-(4,1)-locally-GC-balanced constraint. We propose using a state-splitting algorithm to encode binary strong-(4,1)-locally-balanced constrained systems at rate 2:3, and a state-dependent decoding algorithm to decode the encoded data. The calculation results show that the codebook of the proposed encoding scheme is larger than that of the existing scheme: the total number of codewords of length 24 is more than 6 times that of the existing scheme. The information rate is higher than that of existing coding schemes, and the required encoding table is two orders of magnitude smaller than that of the existing scheme.
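The two constraints named in the abstract can be illustrated with a small checker. This is a minimal sketch and not the paper's codec: the abstract does not define the constraints, so the sketch assumes the definitions commonly used in the constrained-coding literature. Under that assumption, 3-RLL means no run of identical nucleotides is longer than 3, and strong-(4,1)-locally-GC-balanced means every substring of length L >= 4 contains between L/2 - 1 and L/2 + 1 G/C nucleotides. The function names and parameters (`satisfies_rll`, `is_strong_locally_gc_balanced`, `k`, `ell`, `delta`) are hypothetical and chosen only for illustration.

```python
def max_run_length(seq: str) -> int:
    """Length of the longest run of identical symbols in seq (0 if empty)."""
    longest = 0
    current = 0
    prev = None
    for ch in seq:
        current = current + 1 if ch == prev else 1
        longest = max(longest, current)
        prev = ch
    return longest


def satisfies_rll(seq: str, k: int = 3) -> bool:
    """Assumed k-RLL definition: no run of identical nucleotides exceeds k."""
    return max_run_length(seq) <= k


def is_strong_locally_gc_balanced(seq: str, ell: int = 4, delta: int = 1) -> bool:
    """Assumed strong-(ell, delta) definition: every substring of length
    L >= ell has a G/C count within [L/2 - delta, L/2 + delta]."""
    # Prefix sums of the G/C indicator give each window's count in O(1).
    prefix = [0]
    for ch in seq:
        prefix.append(prefix[-1] + (1 if ch in "GC" else 0))

    n = len(seq)
    for start in range(n):
        for end in range(start + ell, n + 1):
            length = end - start
            gc = prefix[end] - prefix[start]
            # |gc - length/2| <= delta, written without fractions.
            if abs(2 * gc - length) > 2 * delta:
                return False
    return True


def satisfies_both(seq: str) -> bool:
    """Check the assumed 3-RLL and strong-(4,1)-locally-GC-balanced constraints."""
    return satisfies_rll(seq, 3) and is_strong_locally_gc_balanced(seq, 4, 1)
```

For example, under these assumed definitions `satisfies_both("ACGTACGT")` holds, while `"AAAAC"` violates the run-length limit and `"AATTAC"` violates the balance condition because the window `AATT` contains no G or C. The quadratic window scan is acceptable for short codewords such as the length-24 words mentioned in the abstract.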