Abstract: We study the average-case complexity of finding all occurrences of a given pattern $\alpha$ in an input text string. Over an alphabet of $q$ symbols, let $c(\alpha, n)$ be the minimum average number of characters that need to be examined in a random text string of length $n$. We prove that, for large $m$, almost all patterns $\alpha$ of length $m$ satisfy \[ c(\alpha, n) = \Theta\left( \left\lceil \log_q \left( \frac{n - m}{\ln m} + 2 \right) \right\rceil \right) \quad \text{if } m \leq n \leq 2m, \] and \[ c(\alpha, n) = \Theta\left( \frac{\lceil \log_q m \rceil}{m}\, n \right) \quad \text{if } n > 2m. \] This in particular confirms a conjecture raised in a recent paper by Knuth, Morris, and Pratt (Fast pattern matching in strings, SIAM J. Comput., 6 (1977), pp. 323–350).
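To get a feel for the two regimes, the following minimal sketch evaluates the expressions that appear inside the asymptotic bounds for concrete values of $q$, $m$, and $n$. It is only an illustration: the bounds hide constant factors and hold for almost all patterns when $m$ is large, so the returned numbers are growth terms, not values of $c(\alpha, n)$. The function names `ceil_log` and `growth_term` are hypothetical and do not come from the paper.

```python
import math


def ceil_log(q: int, x: float) -> int:
    """Return the smallest integer k >= 0 with q**k >= x.

    For x >= 1 this equals ceil(log_q x); repeated multiplication avoids
    floating-point rounding in math.log at exact powers of q.
    """
    k, power = 0, 1
    while power < x:
        power *= q
        k += 1
    return k


def growth_term(q: int, m: int, n: int) -> float:
    """Evaluate the expression inside the asymptotic bound (illustrative only).

    q: alphabet size, m: pattern length, n: text length.
    Assumes q >= 2, m >= 2 (so that ln m > 0), and n >= m.
    """
    if q < 2 or m < 2 or n < m:
        raise ValueError("expected q >= 2, m >= 2 and n >= m")
    if n <= 2 * m:
        # Short-text regime: ceil(log_q((n - m) / ln m + 2)).
        return ceil_log(q, (n - m) / math.log(m) + 2)
    # Long-text regime: (ceil(log_q m) / m) * n.
    return ceil_log(q, m) / m * n


if __name__ == "__main__":
    for n in (1024, 2048, 1_000_000):
        print(n, growth_term(2, 1024, n))
```

With a binary alphabet and $m = 1024$, the term stays a small integer while $n \leq 2m$ and then grows linearly in $n$ with slope $\lceil \log_q m \rceil / m$, which is well below one character examined per text position.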