Published online by Cambridge University Press: 12 September 2008
We consider the string editing problem in a probabilistic framework. The problem is of considerable interest in many areas of science, most notably molecular biology and computer science. String editing transforms one string into another by a series of weighted edit operations whose overall cost is maximal (or minimal). The problem is equivalent to finding an optimal path in a weighted grid graph. In this paper we provide several results on the typical behaviour of such a path. In particular, we observe that the cost of the optimal path (i.e. the edit distance) is almost surely (a.s.) equal to αn for large n, where α is a constant and n is the sum of the lengths of the two strings. More importantly, we show that the edit distance is well concentrated around its average value. In the so-called independent model, in which all weights in the associated grid graph are statistically independent, we derive bounds for the constant α. As a by-product of our results, we also present a precise estimate of the number of alignments between two strings. To prove these findings we use techniques from random walks, diffusion limiting processes, generating functions, and the method of bounded differences.
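The equivalence between string editing and an optimal path in a weighted grid graph can be illustrated with the standard dynamic program below. This is only a minimal sketch, not the method analysed in the paper: the function names `edit_cost` and `levenshtein`, the three weight callbacks, and the `maximize` flag are all hypothetical names chosen for illustration. Cell (i, j) of the table is a grid vertex, and the three ways of reaching it correspond to a vertical (deletion), horizontal (insertion), or diagonal (substitution or match) edge.

```python
def edit_cost(a, b, w_del, w_ins, w_sub, maximize=False):
    """Cost of an optimal path from (0, 0) to (len(a), len(b)) in the grid graph.

    w_del(x), w_ins(y) and w_sub(x, y) are assumed, user-supplied edge weights.
    With maximize=False the path of minimum total weight is found,
    with maximize=True the path of maximum total weight.
    """
    best = max if maximize else min
    m, n = len(a), len(b)
    # d[i][j] = optimal cost of editing a[:i] into b[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):          # first column: only vertical edges
        d[i][0] = d[i - 1][0] + w_del(a[i - 1])
    for j in range(1, n + 1):          # first row: only horizontal edges
        d[0][j] = d[0][j - 1] + w_ins(b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = best(
                d[i - 1][j] + w_del(a[i - 1]),                # vertical edge
                d[i][j - 1] + w_ins(b[j - 1]),                # horizontal edge
                d[i - 1][j - 1] + w_sub(a[i - 1], b[j - 1]),  # diagonal edge
            )
    return d[m][n]


def levenshtein(a, b):
    """Unit-cost special case: every edit costs 1, a match costs 0."""
    return edit_cost(a, b,
                     lambda x: 1,
                     lambda y: 1,
                     lambda x, y: 0 if x == y else 1)


print(levenshtein("kitten", "sitting"))  # prints 3
```

Passing zero deletion and insertion weights, a substitution weight of 1 for matching symbols and 0 otherwise, and `maximize=True` turns the same recurrence into a longest common subsequence computation, which is one example of the maximum-cost variant mentioned above. To experiment with the independent model, the callbacks could be replaced by independently drawn random weights, though the choice of distribution would be an additional assumption.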