Abstract: Synthetic polymer-based storage promises to accommodate the ever-increasing demand for archival storage. It involves designing molecules of distinct masses to represent the bits {0, 1}, followed by the synthesis of a polymer of molecular units that reflects the order of bits in the information string. The stored data can be read by means of a tandem mass spectrometer, which fragments the polymer into shorter substrings and provides their corresponding masses, from which the composition, i.e., the number of 1s and 0s in each substring, can be inferred. Prior works tackled the problem of unique string reconstruction from the multiset of all possible compositions, called the composition multiset. This was accomplished either by determining which string lengths always allow unique reconstruction, or by formulating coding constraints that enable unique reconstruction for all string lengths. Additionally, error-correcting schemes have been suggested to deal with substitution errors caused by imprecise fragmentation during the readout process. This work extends previously considered error models, which were mainly confined to substitutions of compositions. Our new error models consider insertions and deletions of compositions. The robustness of the reconstruction codebook proposed by Pattabiraman et al. to such errors is examined, and whenever necessary, new coding constraints are proposed to ensure unique reconstruction.
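To make the notion of a composition multiset concrete, the following minimal Python sketch computes it for a short binary string. It assumes that the multiset ranges over all contiguous substrings (including the full string) and that a composition is recorded as a pair (number of 0s, number of 1s). The function name `composition_multiset` is hypothetical and chosen here for illustration only; it is not taken from the paper.

```python
from collections import Counter


def composition_multiset(s: str) -> Counter:
    """Return the multiset of (zeros, ones) compositions over all
    contiguous substrings of the binary string s.

    Assumption for illustration: every contiguous substring is observed
    exactly once and only its composition (not its order) is recorded.
    """
    n = len(s)
    # prefix[i] holds the number of 1s in s[:i]
    prefix = [0]
    for ch in s:
        prefix.append(prefix[-1] + (ch == "1"))

    comps = Counter()
    for i in range(n):
        for j in range(i + 1, n + 1):
            ones = prefix[j] - prefix[i]
            zeros = (j - i) - ones
            comps[(zeros, ones)] += 1
    return comps


if __name__ == "__main__":
    # A string and its reversal yield the same composition multiset,
    # so reconstruction can be unique at best up to reversal.
    print(composition_multiset("001") == composition_multiset("100"))
```

A string of length n contributes n(n+1)/2 compositions under this assumption. Because a string and its reversal always produce the same multiset, reconstruction can only ever be unique up to reversal.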