Searching

Sandeep Sen; Amit Kumar

doi:10.1017/9781108654937.007

The problem of searching is basic in the computer science field and vast amount of literature is devoted to many fascinating aspects of this problem. Starting with searching for a given key in a pre-processed set to the more recent techniques developed for searching documents, the modern civilization forges ahead using Google Search. Discussing the latter techniques is outside the scope of this chapter, so we focus on the more traditional framework. Knuth [83] is one of the most comprehensive sources of the earlier techniques; all textbooks on data structures address common techniques like binary search and balanced tree-based dictionaries like AVL (Adelson-Velsky and Landis) trees, red–black trees, B-trees, etc. We expect the reader to be familiar with such basic methods. Instead, we focus on some of the simpler and lesser known alternatives to the traditional data structures. Many of these rely on innovative use of randomized techniques, and are easier to generalize for a variety of applications. They are driven by a somewhat different perspective of the problem of searching that enables us to get a better understanding including practical scenarios where the universe is much smaller. The underlying assumption in comparison-based searching is that the universe may be infinite, that is, we can be searching real numbers. While this is a powerful framework, we miss out on many opportunities to develop faster alternatives based on hashing in a bounded universe. We will address both these frameworks so that the reader can make an informed choice for a specific application.

Skip-Lists – A Simple Dictionary

Skip-list is a data structure introduced by Pugh [119] as an alternative to balanced binary search trees for handling dictionary operations on ordered lists. The reader may recall that linked lists are very amenable to modifications in O(1) time although they do not support fast searches like binary search trees. We substitute complex book-keeping information used for maintaining balance conditions for binary trees by random sampling techniques. It has been shown that given access to random bits, the expected search time in a skip-list of n elements is O(logn), which compares very favorably with balanced binary trees.

Book contents

6 - Searching

Summary

Access options

Book contents

6 - Searching

Summary

Access options

Save book to Kindle

Save book to Dropbox

Save book to Google Drive