Even though the literature about Lisp is abundant and already accessible to the reading public, this book still fills a need. The logical substratum on which Lisp and Scheme are founded demands that modern users be able to read programs that use (and even abuse) advanced techniques: higher-order functions, objects, continuations, and so forth. Tomorrow's concepts will be built on these bases, so not knowing them blocks your path to the future.
To explain these entities, their origins, and their variations, this book goes into great detail. Folklore tells us that even if a Lisp user knows the value of every construction in use, he or she generally does not know its cost. This work also intends to fill that mythical hole with an in-depth study of the semantics and implementation of various features of Lisp, made more solid by more than thirty years of history.
Lisp is an enjoyable language in which numerous fundamental and non-trivial problems can be studied simply. Along with ML, which is strongly typed and has few side effects, Lisp is the most representative of the applicative languages. The concepts that this class of languages illustrates absolutely must be mastered by the students and computer scientists of today and tomorrow. Based on the idea of “function,” an idea that has matured over several centuries of mathematical research, applicative languages are omnipresent in computing; they appear in various forms, such as the composition of UN⋆X byte streams, the extension language of the EMACS editor, and other scripting languages.
Concrete data consists of constructions that can be inspected, taken apart, or joined to form larger constructions. Lists are an example of concrete data. We can test whether or not a list is empty, and divide a non-empty list into its head and tail. New elements can be joined to a list. This chapter introduces several other forms of concrete data, including trees and logical propositions.
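To make this concrete, here is a minimal Standard ML sketch (the function name split is ours, not the book's) of joining an element to a list and taking a non-empty list apart by pattern-matching:

    (* Divide a list into its head and tail, if it has any. *)
    fun split []        = NONE              (* the empty list cannot be divided *)
      | split (x :: xs) = SOME (x, xs);     (* head x, tail xs *)

    val joined  = 1 :: [2, 3];              (* join a new element: [1, 2, 3] *)
    val example = split joined;             (* SOME (1, [2, 3]) *)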
The ML datatype declaration defines a new type along with its constructors. In an expression, constructors create values of a datatype; in patterns, constructors describe how to take such values apart. A datatype can represent a class consisting of distinct subclasses — like Pascal's variant records, but without their complications and insecurities. A recursive datatype typically represents a tree. Functions on datatypes are declared by pattern-matching.
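The following Standard ML sketch shows the kind of declaration described here; the constructor names and the information they carry are illustrative of the King-and-subjects example rather than a quotation of the chapter's code:

    (* Four classes of individual, each carrying appropriate information. *)
    datatype person = King
                    | Peer    of string * string * int   (* degree, territory, number *)
                    | Knight  of string                   (* name *)
                    | Peasant of string;                  (* name *)

    (* Functions on datatypes are declared by pattern-matching. *)
    fun title King                  = "His Majesty the King"
      | title (Peer (deg, terr, _)) = "The " ^ deg ^ " of " ^ terr
      | title (Knight name)         = "Sir " ^ name
      | title (Peasant name)        = name;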
The special datatype exn is the type of exceptions, which stand for error conditions. Errors can be signalled and trapped. An exception handler tests for particular errors by pattern-matching.
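A small sketch, with an illustrative exception name of our own choosing, of how an error is signalled and then trapped by pattern-matching:

    exception Failure of string;             (* a new constructor of type exn *)

    fun divide (_, 0) = raise Failure "division by zero"   (* signal the error *)
      | divide (m, n) = m div n;

    (* An exception handler tests for particular errors by pattern-matching. *)
    val result = divide (7, 0) handle Failure _ => 0;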
Chapter outline
This chapter describes datatypes, pattern-matching, exception handling and trees. It contains the following sections:
The datatype declaration. Datatypes, constructors and pattern-matching are illustrated through examples. To represent the King and his subjects, a single type person comprises four classes of individual and associates appropriate information with each.
Exceptions. These represent a class of error values. Exceptions can be declared for each possible error.
With each reprinting of this book, a dozen minor errors have silently disappeared. But a reprinting is no occasion for making improvements, however valuable, that would affect the page numbering: we should then have several slightly different, incompatible editions. An accumulation of major changes (and the Editor's urgings) have prompted this second edition.
As luck would have it, changes to ML have come about at the same time. ML has a new standard library and the language itself has been revised. It is worth stressing that the changes do not compromise ML's essential stability. Some obscure technical points have been simplified. Anomalies in the original definition have been corrected. Existing programs will run with few or no changes. The most visible changes are the new character type and a new set of top level library functions.
The new edition brings the book up to date and greatly improves the presentation. Modules are now introduced early — in Chapter 2 instead of Chapter 7 — and used throughout. This effects a change of emphasis, from data structures (say, binary search trees) to abstract types (say, dictionaries). A typical section introduces an abstract type and presents its ML signature. Then it explains the ideas underlying the implementation, and finally presents the code as an ML structure. Though reviewers have been kind to the first edition, many readers have requested such a restructuring.
Ignored, abused, unjustly criticized, insufficiently justified (theoretically), macros are no less than one of the fundamental bases of Lisp and have contributed significantly to the longevity of the language itself. While functions abstract computations and objects abstract data, macros abstract the structure of programs. This chapter presents macros and explores the problems they pose. Macros are by far one of the least studied topics in Lisp, and their treatment varies enormously from one implementation of Lisp or Scheme to another. Though this chapter contains few programs, it tries to sweep through the domain in which these little-known beings — macros — have evolved.
Invented by Timothy P. Hart [SG93] in 1963, shortly after the publication of the Lisp 1.5 reference manual, macros turned out to be one of the essential ingredients of Lisp. Macros allow programmers to imagine and implement the language appropriate to their own problem. Just as in mathematics, where we continually invent new abbreviations appropriate for expressing new concepts, dialects of Lisp extend the language by means of new syntactic constructions. Don't get me wrong: I'm not talking about augmenting the language with a library of functions covering a particular domain. A Lisp with a library of graphic functions for drawing is still Lisp and no more than Lisp. The kind of extension I'm talking about introduces new syntactic forms that actually increase the programmer's power.
Extending a language means introducing new notation that announces that we can write X when we want to signify Y.
Most programmers know how hard it is to make a program work. In the 1970s, it became apparent that programmers could no longer cope with software projects that were growing ever more complex. Systems were delayed and cancelled; costs escalated. In response to this software crisis, several new methodologies have arisen — each an attempt to master the complexity of large systems.
Structured programming seeks to organize programs into simple parts with simple interfaces. An abstract data type lets the programmer view a data structure, with its operations, as a mathematical object. The next chapter, on modules, will say more about these topics.
Functional programming and logic programming aim to express computations directly in mathematics. The complicated machine state is made invisible; the programmer has to understand only one expression at a time.
Program correctness proofs are introduced in this chapter. Like the other responses to the software crisis, formal methods aim to increase our understanding. The first lesson is that a program only ‘works’ if it is correct with respect to its specification. Our minds cannot cope with the billions of steps in an execution. If the program is expressed in a mathematical form, however, then each stage of the computation can be described by a formula. Programs can be verified — proved correct — or derived from a specification. Most of the early work on program verification focused on Pascal and similar languages; functional programs are easier to reason about because they involve no machine state.
In a public lecture, C. A. R. Hoare (1989a) described his algorithm for finding the ith smallest integer in a collection. This algorithm is subtle, but Hoare described it with admirable clarity as a game of solitaire. Each playing card carried an integer. Moving cards from pile to pile by simple rules, the required integer could quickly be found.
Then Hoare changed the rules of the game. Each card occupied a fixed position, and could only be moved if exchanged with another card. This described the algorithm in terms of arrays. Arrays have great efficiency, but they also have a cost. They probably defeated much of the audience, as they defeat experienced programmers. Mills and Linger (1986) claim that programmers become more productive when arrays are restricted to stacks, queues, etc., without subscripting.
Functional programmers often process collections of items using lists. Like Hoare's stacks of cards, lists allow items to be dealt with one at a time, with great clarity. Lists are easy to understand mathematically, and turn out to be more efficient than commonly thought.
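As an illustration of that clarity, here is a sketch (our own recasting on lists, not Hoare's code or the book's) of selecting the i-th smallest element, counting from 0, by partitioning around the first element:

    (* Partition around the first element, then search one side only. *)
    fun select (i, x :: xs) =
          let val smaller = List.filter (fn y => y < x)  xs
              val larger  = List.filter (fn y => y >= x) xs
              val k = length smaller
          in  if i < k then select (i, smaller)          (* answer lies below x *)
              else if i = k then x                       (* x is the i-th smallest *)
              else select (i - k - 1, larger)            (* answer lies above x *)
          end
      | select (_, []) = raise Empty;

    val third = select (2, [31, 4, 15, 9, 26]);          (* 15 *)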
Chapter outline
This chapter describes how to program with lists in Standard ML. It presents several examples that would normally involve arrays, such as matrix operations and sorting.
The chapter contains the following sections:
Introduction to lists. The notion of list is introduced. Standard ML operates on lists using pattern-matching.
Commit protocols are used for concurrency control in distributed data bases. Thus they belong to the application layer. For an introduction to this area we recommend the book by Bernstein et al. [BHG87, Chapter 7].
If a data base is distributed over several sites, it is very possible that a data base operation which is logically a single action in fact involves more than one site of the data base. For example, consider the transfer of a sum of money sm from one bank account to another. The balance of the first bank account has to be decreased by sm, while the balance of the second has to be increased by sm. These two subactions might have to take place at different sites. It is imperative that both subactions are executed, not just one of them. If it is not possible to execute one of them, e.g. because its site is temporarily down, then neither should be executed.
In data base management such a logically single action is called a transaction, and it should behave as if it were an atomic action. At some point in the execution of the transaction it has to be decided either that the transaction is (going to be) executed as a whole and will never be revoked (commit), or that the transaction cannot be completed and the parts already done will be undone (abort). In general, an algorithm to ensure that the transaction can be viewed as an atomic action is called an atomic commitment protocol. Thus all processes participating in an atomic commitment protocol have to reach agreement upon whether to commit or to abort the transaction under consideration.
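To make the decision rule concrete, here is a toy Standard ML sketch of the commit/abort decision only, not of the protocol skeletons studied later: the transaction is committed exactly when every participating site votes to commit.

    datatype vote     = Yes | No;
    datatype decision = Commit | Abort;

    (* Commit only if every participant can carry out its subaction. *)
    fun decide votes =
          if List.all (fn v => v = Yes) votes then Commit else Abort;

    val bothSitesUp = decide [Yes, Yes];    (* Commit *)
    val oneSiteDown = decide [Yes, No];     (* Abort: undo any parts already done *)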
A basic problem that must be addressed in any design of a distributed network is the routing of messages. That is, if a node in the network wants to send a message to some other node, or receives a message destined for some other node, it needs a method to decide over which outgoing link it has to send this message. Algorithms for this problem are called routing algorithms. In the sequel we will consider only distributed routing algorithms, in which effective message handling and delivery are guaranteed by the cooperative behavior of the local routing protocols of the nodes.
Desirable properties of routing algorithms are, for example, correctness, optimality, and robustness. Correctness seems easy to achieve in a static network, but the problem is far less trivial when links and nodes are allowed to go down and come up, as they can in practice. Optimality is concerned with finding the “quickest” routes. Ideally, a message should be routed along a path on which it will encounter the least delay but, as this depends on the amount of traffic along the way, such routes are hard to predict and the goal is difficult to achieve. A frequent compromise is to minimize the number of hops, i.e., the number of links over which the message travels from origin to destination. We will restrict ourselves to minimum-hop routing. Robustness is concerned with the ease with which the routing scheme adapts to topological changes.
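As a point of reference, here is a minimal Standard ML sketch of minimum-hop route computation on a small static network by breadth-first search; the graph representation and names are illustrative only, and the distributed, fault-tolerant algorithms considered later are considerably more involved.

    (* Number of hops on a shortest path from source to dest, if one exists.
       neighbours maps a node to the list of nodes it has a link to. *)
    fun minHops neighbours (source, dest) =
      let
        fun bfs ([], _) = NONE
          | bfs ((n, d) :: queue, visited) =
              if n = dest then SOME d
              else if List.exists (fn v => v = n) visited then bfs (queue, visited)
              else bfs (queue @ map (fn m => (m, d + 1)) (neighbours n), n :: visited)
      in
        bfs ([(source, 0)], [])
      end;

    (* A four-node ring 0 - 1 - 2 - 3 - 0. *)
    fun ring 0 = [1, 3] | ring 1 = [0, 2]
      | ring 2 = [1, 3] | ring 3 = [2, 0] | ring _ = [];
    val hops = minHops ring (0, 2);    (* SOME 2 *)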
Originally, the research reported in this book was motivated by the way the material used for an introductory course on Distributed Computing (taught in the spring of 1985) was presented in the literature. The teacher of the course, Jan van Leeuwen, and I felt that many results were presented in a way that needed clarification, and that correctness proofs, where they existed, were often far from convincing, if correct at all. Thus we started to develop correctness proofs for some distributed protocols. Gradually a methodology emerged for such proofs, based on the idea of “protocol skeletons” and “system-wide invariants”. Similar ideas were developed by others in the context of formal proof systems for parallel and distributed programs.
I thank the ESPRIT Basic Research Action No. 7141 (project ALCOM II: Algorithms and Complexity), the Netherlands Organization for Scientific Research (NWO) under contract NF 62-376 (NFI project ALADDIN: Algorithmic Aspects of Parallel and Distributed Systems), and the Department of Computer Science at Utrecht University for giving me the opportunity to do research on this topic, and for providing such a stimulating environment. I thank my coauthors Jan van Leeuwen, Hans Bodlaender, and Gerard Tel, and also Petra van Haaften, Hans Zantema, and Netty van Gasteren for all the discussions we had.
Later the idea came up to write a thesis about this subject, and I especially thank my thesis advisor Jan van Leeuwen for his stimulating support. The first four chapters of the thesis served as preliminary versions for the first four chapters of this book, while chapter 5 on commit protocols was added later.
Consider a communication network in which processors want to transmit many short messages to each other. The processors are not necessarily connected by a communication channel. Usually this service is provided by protocols in the transport layer. A protocol can incorporate such a message in a packet and send the packet to the destination processor. As discussed in chapter 1, in the transport layer it is again necessary to consider communication errors, even though we can assume that communication over channels is handled correctly by the lower layers. Thus we have to assume that the communication network can lose packets, copy packets (due to necessary retransmissions), delay packets arbitrarily long, and deliver packets in a different order than the order in which they were sent.
We consider the design of some protocols that handle the communication of messages correctly, in the sense that there is no loss or duplication of messages (cf. Belsnes [Bel76]). To specify this more precisely, suppose processor i wants to transmit a message m to processor j. The message m is said to be lost if i thinks that j received m while this is not the case, and m is said to be duplicated if j receives two or more copies of m from i and thinks that they are different messages.
If a processor i has a message or a sequence of messages to send to j, it sets up a temporary connection with j, which is closed as soon as i knows that j received the message(s) (or that j is not in a position to receive them).
In this chapter we consider some link-level protocols and show their partial correctness by assertional verification. Link-level protocols, i.e., protocols residing in the data link layer, are designed to control the exchange of information between two computing stations, e.g. computers or processors, over a full-duplex link. They should guard against the loss of information when the transmission medium is unreliable. We only discuss transmission errors that occur while the link is up, and thus use the model of a static network consisting of two nodes i and j and a bidirectional link (i, j). We will not deal with the problems caused by links or nodes going down, nor with the termination of a protocol; these issues will be dealt with in a different context in later chapters.
In section 2.1 we discuss a generalization of the sliding window protocol. This protocol is meant to control the exchange of messages in an asynchronous environment. Although sliding window protocols belong to the data link layer, we will see in chapter 4 that the generalization can also be used as a basis for connection management, which belongs to the transport layer. We show that the alternating bit protocol and the “balanced” two-way sliding window protocol are instances of this one general protocol skeleton, which contains several further parameters to tune the simultaneous transmission of data over a full-duplex link. After proving the partial correctness of the protocol skeleton, we discuss how the optimal choice of the parameters depends on the propagation delay of the link, the transmission speed of the senders, and the error rate of the link.
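To fix the idea of a window before the formal treatment, here is a toy illustration (ours, not the protocol skeleton itself) of the sender-side rule: with window size w, a packet numbered s may be transmitted only while it lies fewer than w positions above the lowest sequence number not yet acknowledged; roughly, the alternating bit protocol corresponds to a window of size one.

    (* May a packet with sequence number s be sent, given the lowest
       unacknowledged sequence number and the window size w? *)
    fun maySend (lowestUnacked, w) s =
          lowestUnacked <= s andalso s < lowestUnacked + w;

    val ok      = maySend (4, 3) 6;    (* true: 6 lies in the window [4, 7) *)
    val blocked = maySend (4, 3) 7;    (* false: wait for an acknowledgement first *)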
In the past two decades, distributed computing has evolved rapidly from a virtually non-existent field into an important area of computer science research. As hardware costs declined, single mainframe computers with a few simple terminals were replaced by all kinds of general and special purpose computers and workstations, as the latter became more cost effective. At many sites it became necessary to interconnect all these computers to make communication and file exchange possible, thus creating a computer network. Given a set of computers that can communicate, it is also desirable that they can cooperate in some sense, for example by contributing to one and the same computation. Thus a network of computers is turned into a distributed system, capable of performing distributed computations. The field of distributed computing is concerned with the problems that arise in the cooperation and coordination between computers performing distributed tasks.
Distributed algorithms (or: protocols) range from algorithms for communication to algorithms for distributed computations. Algorithms in a distributed system are conceptually far more complex than those in a single processing unit environment. With a single processing unit only one action can occur at a time, while in a distributed system the number of possibilities of what can happen where and when tends to be enormous, and our human minds are just not able to keep track of all of them.
This leads to the problem of determining whether the executions of a distributed algorithm indeed have the desired effect in all possible circumstances and combinations of events. Testing such algorithms has become completely infeasible: some form of “verification” is the only way out.