Abstract: We propose a new algorithm for computing, for each suffix of a given string of length \(n\) over a constant-sized alphabet of size \(\sigma\), the longest prefix that occurs elsewhere in the string with Hamming distance at most \(k\). Specifically, we show that the proposed algorithm requires time \(\mathcal{O}(n (\sigma R)^k \log \log n (\log k + \log \log n))\) on average, where \(R = \lceil (k+2)(\log_{\sigma} n + 1) \rceil\), and space \(\mathcal{O}(n)\). This improves upon the state-of-the-art average-case time complexity for the case \(k=1\) [23] by a factor of \(\log n / \log^3 \log n\). In addition, we show how the proposed technique can be adapted to compute the longest previous factors under the Hamming distance model within the same complexities. In terms of real-world applications, we show that our technique can be directly applied to the problem of genome mappability.
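To make the problem statement concrete, the following is a naive quadratic-time reference sketch of the quantity being computed. It is not the proposed algorithm and does not achieve the stated bounds. The function name `longest_k_mismatch_prefixes` is hypothetical, and the sketch assumes that "elsewhere" means any starting position other than the suffix's own, with matches truncated at the end of the string.

```python
def longest_k_mismatch_prefixes(s: str, k: int) -> list[int]:
    """Naive baseline (illustration only, not the paper's algorithm).

    For every position i, return the length of the longest prefix of s[i:]
    that also occurs at some other position j != i with at most k
    mismatches (Hamming distance <= k).
    """
    n = len(s)
    result = [0] * n
    for i in range(n):
        best = 0
        for j in range(n):
            if j == i:
                continue  # "elsewhere": skip the suffix's own position
            mismatches = 0
            length = 0
            # Extend the alignment of s[i:] against s[j:] character by character.
            while i + length < n and j + length < n:
                if s[i + length] != s[j + length]:
                    mismatches += 1
                    if mismatches > k:
                        break  # a (k+1)-th mismatch ends this alignment
                length += 1
            best = max(best, length)
        result[i] = best
    return result


if __name__ == "__main__":
    # Exact matching (k = 0) versus one allowed mismatch (k = 1).
    print(longest_k_mismatch_prefixes("abcabd", 0))
    print(longest_k_mismatch_prefixes("abcabd", 1))
```

Such a baseline is mainly useful for checking a faster implementation on small inputs; on the string `abcabd`, allowing one mismatch lengthens the value for the first suffix because `abc` aligns with `abd`.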