Published online by Cambridge University Press: 27 July 2009
A practical source coding scheme based on approximate string matching is proposed. It is an approximate fixed-length string matching data compression combined with a block-coder based on the empirical distribution. A lemma on approximate string matching, which is an extension of the Kac Lemma, is proved. It is shown, based on the lemma, that the deterministic algorithm converts the stationary and ergodic source, u, into an output process v, and under the assumption that v is a stationary process, after the scheme has run for an infinite time, the optimal compression ratio R(D) is achieved. This reduces the problem of the universal lossy coder to the proof of stationarity of the output process ν in the proposed algorithm. The main advantages of the proposed method are the asymptotic sequential behavior of the encoder and the simplicity of the decoder.