This collection of papers from various areas of mathematical logic showcases the remarkable breadth and richness of the field. Leading authors reveal how contemporary technical results touch upon foundational questions about the nature of mathematics. Highlights of the volume include: a history of Tennenbaum's theorem in arithmetic; a number of papers on Tennenbaum phenomena in weak arithmetics as well as on other aspects of arithmetics, such as interpretability; the transcript of Gödel's previously unpublished 1972–1975 conversations with Sue Toledo, along with an appreciation of the same by Curtis Franks; Hugh Woodin's paper arguing against the generic multiverse view; Anne Troelstra's history of intuitionism through 1991; and Aki Kanamori's history of the Suslin problem in set theory. The book provides a historical and philosophical treatment of particular theorems in arithmetic and set theory, and is ideal for researchers and graduate students in mathematical logic and philosophy of mathematics.
We have reached the final destination of our journey into unification grammars, and can pause and look back at what we have done. Our main purpose has been the presentation of a powerful formalism for specifying grammars, for both formal and natural languages. In doing so, we intended to combine insights from both linguistics and computer science.
From linguistics, we have adopted various analyses of complex syntactic constructs, such as long-distance dependencies or subject/object control, that prevail in natural languages and are easily recognized as inadequately represented, say, by context-free grammars. Several linguistic theories deal with such constructs; a common denominator of many of them is the use of features (and their values) to capture the properties of strings, based on which a grammar can be specified. However, the use of features is mostly informal. The formalism we presented adopts (from linguistics) feature structures as its main data-structure, but with a rigorous formalization. As a result, claims about grammars can be made and proved. We believe that resorting to proofs (in the mathematical sense) should become a major endeavor in any study of theoretical linguistics, and the ability to prove claims should be a major ingredient in the education of theoretical linguists. By understanding its underlying mathematics, one can better understand the properties of the formalism, recognizing both its strengths and weaknesses as a tool for studying the syntax of natural languages.
From computer science, we took several insights and adapted them to also suit the analysis of natural language.
In previous chapters we presented a linguistically motivated grammatical formalism and focused on its mathematical formulation. We said very little about the computational properties and processing of grammars expressed in the formalism. This chapter is concerned with such aspects. We focus on the expressiveness of the formalism, and on computational procedures for processing it.
The expressiveness of a linguistic formalism F (specifying grammars) determines the class of (formal) languages that can be defined by the grammars in F. Typically, it also bears on the computational complexity of solving the universal recognition problem with respect to the formalism: the problem of determining, given an arbitrary grammar G in F and an arbitrary string w, whether w ∈ L(G). Thus, context-free grammars define (all and only) the context-free languages, and the universal recognition problem for CFGs can be solved in time cubic in the length of the input string. Regular expressions define the regular languages, and their universal recognition problem can be solved in linear time. Turing machines define all the recursively enumerable languages, but the universal recognition problem for this formalism is undecidable.
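The cubic-time claim for CFGs can be made concrete with a sketch of the CYK recognition algorithm. The encoding below is an illustrative assumption, not the book's notation: binary rules are tuples (A, B, C) meaning A → B C, lexical rules are tuples (A, word), the start symbol is 'S', and the grammar is assumed to be in Chomsky normal form.

```python
def cyk_recognize(binary_rules, lexical_rules, words):
    """Decide whether `words` is derivable from the start symbol 'S'.

    Runs in O(n^3 * |G|) time in the length n of the input string.
    """
    n = len(words)
    # table[i][j] holds the set of nonterminals deriving words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for a, word in lexical_rules:
            if word == w:
                table[i][i + 1].add(a)
    for span in range(2, n + 1):            # width of the substring words[i:j]
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):       # split point between the two halves
                for a, b, c in binary_rules:
                    if b in table[i][k] and c in table[k][j]:
                        table[i][j].add(a)
    return 'S' in table[0][n]
```

For instance, with binary rules S → NP VP and VP → V NP and a small lexicon, the recognizer accepts "John loves Mary" and rejects ill-formed permutations of the same words.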
We begin this chapter with a discussion of the expressive power of unification grammars in Section 6.1. In Section 6.2 we show that unification grammars are equivalent to Turing machines. This implies that the universal recognition problem for the formalism is undecidable.
Motivated by the violations of the context-free grammar G0, discussed in the previous chapter, we now extend the CFG formalism with additional mechanisms that will facilitate the expression of information that is missing in G0 in a uniform and compact way. The core idea is to incorporate into the grammar the properties of symbols in terms of which violations of G0 were stated. Properties are represented by means of feature structures. As we show in this chapter, feature structures provide a natural representation for the kind of linguistic information that grammars specify, a natural (partial) order on the amount of information stored in these representations and an efficient operation for combining the information stored in two representations.
We begin this chapter with an overview of feature structures, motivating their use as a representation of linguistic information (Section 2.1). We then present four different views of these entities. We begin with feature graphs (Section 2.2), which are just a special case of ordinary (labeled, directed) graphs. The well-studied mathematical and computational properties of graphs make this view easy to understand and very suitable for computational implementation. This view, however, introduces a level of arbitrariness, which is expressed in the identities of the nodes. We therefore introduce two additional views. One, simply called feature structures (Section 2.3), is defined as equivalence classes of isomorphic feature graphs, abstracting over node identities. The other, called abstract feature structures (Section 2.4), uses sets rather than graphs and is easy to work with mathematically.
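The graph view can be sketched very concretely. In the toy encoding below (the node names q0–q4 and the features CAT, AGR, NUM, PERS are hypothetical, not a signature from the text), a feature graph is a mapping from node identifiers to either outgoing labeled edges or atomic values. Renaming the nodes consistently yields an isomorphic graph with the same information, which is exactly the arbitrariness that the feature-structure view abstracts away.

```python
# A minimal encoding of a feature graph as a labeled directed graph.
# Internal nodes map features to target nodes; sinks carry atomic values.
feature_graph = {
    'q0': {'CAT': 'q1', 'AGR': 'q2'},   # root node
    'q1': 'np',                          # atomic value
    'q2': {'NUM': 'q3', 'PERS': 'q4'},
    'q3': 'pl',
    'q4': 'third',
}

def value_at(graph, root, path):
    """Follow a sequence of features from `root`; return the node reached."""
    node = root
    for feat in path:
        node = graph[node][feat]
    return node
```

Following the path (AGR, NUM) from the root reaches the node whose atomic value is 'pl'; the particular identifier of that node ('q3' here) is an artifact of this graph, not part of the linguistic information.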
This book grew out of lecture notes for a course we started teaching at the Department of Computer Science at the Technion, Haifa, in the spring of 1996, and later also taught at the University of Haifa. The students were advanced undergraduates and graduates who had good knowledge of formal languages but limited background in linguistics. We intended it to be an introductory course in computational linguistics, but we wanted to focus on contemporary linguistic theories and the necessary mechanisms for reasoning about and implementing them, rather than on traditional (statistical, corpus-based) natural language processing (NLP) techniques.
We realized that no good textbook existed that covered the material we wanted to teach. Although quite a few good introductions to NLP exist, including Pereira and Shieber (1987), Gazdar and Mellish (1989), Covington (1994), Allen (1995), and more recently, Manning and Schütze (1999), Jurafsky and Martin (2000), and Bird et al. (2009), none of them provides the mathematical and computational infrastructure needed for our purposes.
The focus of this book is two dimensional. On one hand, we focus on a certain formalism, unification grammars, for presenting, studying, and reasoning about grammars. Although it is not the sole formalism used in computational linguistics, it has gained much popularity, and it now underlies many ongoing projects. On the other hand, we also focus on fundamental natural language syntactic constructions, and the way they are specified in grammars expressed in this formalism.
Feature structures are the building blocks of unification grammars, as they serve as the counterpart of the terminal and nonterminal symbols in CFGs. However, in order to define grammars and derivations, one needs some extension of feature structures to sequences thereof. In this chapter we present multirooted feature structures that are aimed at capturing complex, ordered information and are used for representing rules and sentential forms of unification grammars; we motivate this extension in Section 4.1. In parallel to the exposition of feature structures in Chapter 2, we start by defining multirooted feature graphs (Section 4.2), a natural extension of feature graphs. We then abstract away from the identities of nodes in the graphs in two ways: by defining multirooted feature structures, which are equivalence classes of isomorphic multirooted feature graphs, and by defining abstract multirooted structures (Section 4.3). Finally, we define the concept of multi-AVMs (Section 4.4), which are an extension of AVMs, and show how they correspond to multirooted graphs. The crucial concept of unification in context is discussed in Section 4.5.
We then utilize this machinery for defining unification grammars. We begin by defining (sentential) forms and grammar rules (Section 4.6). Then, we define the concept of derivation for unification grammars, providing a means for defining the languages generated by such grammars (Section 4.7). We explore derivation trees in Section 4.8.
The move from context-free grammars to unification grammars is motivated by linguistic considerations (the need to provide better generalizations and more compact representations).
The previous chapter presented four different views of feature structures, with several correspondences among them. For each of the views, a subsumption relation was defined in a natural way. In this chapter we define the operation of unification for the different views. The subsumption relation compares the information content of feature structures. Unification combines the information that is contained in two (compatible) feature structures. We use the term “unification” to refer to both the operation and its result. In the sequel, whenever two feature structures are related, they are assumed to be over the same signature.
The mathematical interpretation of “combining” two members of a partially ordered set is to take the least upper bound of the two operands with respect to the partial order; in our case, subsumption. Indeed, feature structure unification is exactly that. However, since subsumption is antisymmetric for feature structures and AFSs but not for feature graphs and AVMs, a unique least upper bound cannot be guaranteed for all four views. We begin with feature graphs and define unification for this view first, extending it to feature structures in Section 3.3. We then (Section 3.2) provide a constructive definition of feature graph unification and prove that it corresponds to the least upper bound definition in a natural way. We also provide in Section 3.4 an algorithm for computing the unification of two feature graphs. AVM unification can then be defined indirectly, using the correspondence between feature graphs and AVMs. We define unification directly for AFSs in Section 3.5. We conclude this chapter with a discussion of generalization, a dual operation to unification.
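To make “combining the information” concrete, here is a toy sketch of unification over reentrancy-free feature structures, encoded as nested Python dicts with strings as atomic values. This is an illustration only, under a strong simplifying assumption: genuine feature-graph unification must also merge reentrant nodes (shared substructures), which this encoding cannot express.

```python
class UnificationFailure(Exception):
    """Raised when the two structures carry conflicting information."""

def unify(f, g):
    """Return the least structure containing all information in f and g."""
    if isinstance(f, dict) and isinstance(g, dict):
        result = dict(f)
        for feat, val in g.items():
            # shared features must unify; features unique to g are copied over
            result[feat] = unify(result[feat], val) if feat in result else val
        return result
    if f == g:
        return f
    raise UnificationFailure(f'{f!r} conflicts with {g!r}')
```

Unifying {'CAT': 'np', 'AGR': {'NUM': 'pl'}} with {'AGR': {'PERS': 'third'}} yields a structure carrying all the information of both operands, while unifying incompatible values such as 'sg' and 'pl' under the same feature fails, mirroring the fact that only compatible feature structures have an upper bound.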
Natural languages are among Nature's most extraordinary phenomena. While humans acquire language naturally and use it with great ease, the formalization of language, which is the focus of research in linguistics, remains elusive. As in other sciences, attempts at formalization involve idealization: ignoring exceptions, defining fragments, and the like. In the second half of the twentieth century, the field of linguistics underwent a revolution: The themes that are studied, the vocabulary with which they are expressed, and the methods and techniques for investigating them have changed dramatically. While the traditional aims of linguistic research have been the description of particular languages (both synchronically and diachronically), sometimes with respect to other, related languages, modern theoretical linguistics seeks the universal principles that underlie all natural languages; it is looking for structural generalizations that hold across languages, as well as across various phrase types in a single language, and it attempts to delimit the class of possible natural languages by formal means.
The revolution in linguistics, which is attributed mainly to Noam Chomsky, has influenced the young field of computer science. With the onset of programming languages, research in computer science began to explore different kinds of languages: formal languages that are constructed as a product of concise, rigorous rules. The pioneering work of Chomsky provided the means for applying the results obtained in the study of natural languages to the investigation of formal languages.
We developed an elaborate theory of unification grammars, motivated by the failure of context-free grammars to capture some of the linguistic generalizations one would like to express with respect to natural languages. In this chapter, we put the theory to use by accounting for several of the phenomena that motivated the construction. Specifically, we account for all the language fragments discussed in Section 1.3.
Much of the appeal of unification-based approaches to grammar stems from their ability to account for linguistic phenomena in a concise way; in other words, unification grammars facilitate the expression of linguistic generalizations. This is mediated through two main mechanisms: First, the notion of grammatical category is expressed via feature structures, thereby allowing for complex categories as first-class citizens of the grammatical theory. Second, reentrancy provides a concise machinery for expressing “movement,” or more generally, relations that hold at a deeper level than a phrase-structure tree. Still, the formalism remains monostratal, without any transformations that yield a surface structure from some other structural representation.
Complex categories are used to express similarities between utterances that are not identical. With atomic categories of the type employed by context-free grammars, two categories can be either identical or different. With feature structures as categories, two categories can be identical along one axis but different along another.
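The contrast can be shown with two hypothetical complex categories (the features CAT and NUM below are illustrative, not drawn from a grammar in the text). As atomic symbols, NPsg and NPpl would simply be different; as feature structures, they agree along the category axis while differing in number.

```python
# Two complex categories: identical along CAT, different along NUM.
np_sg = {'CAT': 'NP', 'NUM': 'sg'}
np_pl = {'CAT': 'NP', 'NUM': 'pl'}

def same_along(cat1, cat2, feature):
    """Do the two categories agree on the value of `feature`?"""
    return cat1.get(feature) == cat2.get(feature)
```

A grammar rule can thus refer to "any NP" by constraining only CAT, or to "a plural NP" by constraining both features, which is exactly the kind of partial identity atomic categories cannot express.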
This chapter gives the definition of ‘category’ in Section 1.1, and follows that by four sections devoted entirely to examples of categories of various kinds. If you have never met the notion of a category before, you should quite quickly read through Definition 1.1.1 and then go to Section 1.2. There you will find some examples of categories that you are familiar with, although you may not have recognized the categorical structure before. In this way you will begin to see what Definition 1.1.1 is getting at. After that you can move around the chapter as you like.
Remember that it is probably better not to start at this page and read each word, sentence, paragraph, …, in turn. Move around a bit. If there is something you don't understand, or don't see the point of, then leave it for a while and come back to it later.
Life isn't linear, but written words are.
Categories defined
This section contains the definition of ‘category’, follows that with a few bits and pieces, and concludes with a discussion of some examples. No examples are looked at in detail; that is done in the remaining four sections. Section 1.2 contains a collection of simpler examples, some of which you will know already. You might want to dip into that section as you read this section.
In Chapter 2, Sections 2.3 to 2.7, we looked at some simple examples of limits and colimits. These are brought together in Table 2.1, which is repeated here as Table 4.1. In this chapter we generalize the idea.
Before we begin the details, it is useful to outline the five steps we go through, together with the associated notions for each step. After that we look at each step in more detail.
Template
This is the shape ∇ that a particular kind of diagram can have. It is a picture consisting of nodes (blobs) and edges (arrows). The central column of Table 4.1 lists a few of the simpler templates. Technically, a template is often a directed graph or more generally a category.
Diagram
This is an instantiation of a particular template ∇ in a category C. Each node of ∇ is instantiated with an object of C, and each edge is instantiated with an arrow of C. There are some obvious source and target restrictions that must be met, and the diagram may require that some cells commute. Thus we sometimes use a category as a template.
Posed problem
Each diagram in a category C poses two problems, the left (blunt end) problem and the right (sharp end) problem. We never actually say what the problem is (which is perhaps the reason why it is rarely mentioned) but we do say what a solution is.
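As a preview of how the notions of template, diagram, and posed problem fit together (the worked examples come later), consider the simplest non-trivial template: a discrete graph with two nodes and no edges. The sketch below is illustrative; one of the two problems posed by a diagram of this shape is solved by the familiar binary product.

```latex
% Template: two nodes, no edges.  A diagram of this shape in a
% category C instantiates the two nodes as objects:
\[
  A \qquad B
\]
% One of the two problems posed by this diagram is solved by the
% binary product: an object $A \times B$ together with projection arrows
\[
  A \xleftarrow{\;\pi_A\;} A \times B \xrightarrow{\;\pi_B\;} B
\]
% through which every pair of arrows $A \leftarrow X \rightarrow B$
% factors uniquely.
```

The dual problem posed by the same diagram is solved by the coproduct, with the arrows reversed.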