We use cookies to distinguish you from other users and to provide you with a better experience on our websites. Close this message to accept cookies or find out how to manage your cookie settings.
To save content items to your account,
please confirm that you agree to abide by our usage policies.
If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your account.
Find out more about saving content to .
To save content items to your Kindle, first ensure [email protected]
is added to your Approved Personal Document E-mail List under your Personal Document Settings
on the Manage Your Content and Devices page of your Amazon account. Then enter the ‘name’ part
of your Kindle email address below.
Find out more about saving to your Kindle.
Note you can select to save to either the @free.kindle.com or @kindle.com variations.
‘@free.kindle.com’ emails are free but can only be saved to your device when it is connected to wi-fi.
‘@kindle.com’ emails can be delivered even when you are not connected to wi-fi, but note that service fees apply.
This book addresses the broad community of researchers in various fields who use relations in their scientific work. Relations occur or are used in such diverse areas as psychology, pedagogy, social choice theory, linguistics, preference modelling, ranking, multicriteria decision studies, machine learning, voting theories, spatial reasoning, data base optimization, and many more. In all these fields, and of course in mathematics and computer science, relations are used to express, to model, to reason, and to compute with. Today, problems arising in applications are increasingly handled using relational means.
In some of the above mentioned areas it sometimes looks as if the wheel is being reinvented when standard results are rediscovered in a new specialized context. Some areas are highly mathematical, others require only a moderate use of mathematics. Not all researchers are aware of the developments that relational methods have enjoyed in recent years.
A coherent text on this topic has so far not been available, and it is intended to provide one with this book. Being an overview of the field it offers an easy start, that is nevertheless theoretically sound and up to date. It will be a help to scientists even if they are not overly versed in mathematics but have to apply it. The exposition does not stress the mathematical side too early. Instead, visualizations of ideas – mainly via matrices but also via graphs – are presented first, while proofs during the early introductory chapters are postponed to an appendix.
So far we have concentrated on the foundations of relational mathematics. Now we switch to applications. A first area of applications concerns all the different variants of orderings as they originated in operations research: weakorders, semiorders, intervalorders, and block-transitive orderings. With the Scott-Suppes Theorem in relational form as well as with the study of the consecutive 1s property, we here approach research level.
The second area of applications concerns modelling preferences with relations. The hierarchy of orderings is considered investigating indifference and incomparability, often starting from so-called preference structures, i.e., relational outcomes of assessment procedures. A bibliography on early preference considerations is contained in [2].
The area of aggregating preferences with relations, studied as a third field of applications, is relatively new. It presents relational measures and integration in order to treat the trust and belief of the Dempster–Shafer Theory in relational form. In contrast, the fuzzy approach is well known, with the coeficients of matrices stemming from the real interval [0, 1]; it comes closer and closer to relational algebra proper. In the present book, a direct attempt is made. Also t-norms and De Morgan triples can be generalized to a relational form.
Part I starts by recalling more or less trivial facts on sets, their elements or subsets, and relations between them. It is rather sketchy and will probably be uninteresting for literate scientists such as mathematicians and/or logicians.
Sets, elements, subsets, and relations can be represented in different ways. We will give hints to the programmer how to work with concrete relations and put particular emphasis on methods for switching from one form to another. Such transitions may be achieved on the representation level; they may, however, also touch a higher relation-algebraic level which we hide at this early stage.
We are going to recall how a partition is presented, or a permutation. Permutations may lead to a different presentation of a set or of a relation on or between sets. Functions between sets may be given in various forms, as a table, as a list, or in some other form. A partition may reduce the problem size when factorizing according to it. We show how relations emerge. This may be simply by writing down a matrix, or by abstracting with a cut from a real-valued matrix. For testing purposes, the relation may be generated randomly.
There is a clash in attitudes and expectations between mathematicians and information engineers. While the first group is interested in reasoning about properties, the second aims at computing and evaluating around these properties and thus tries to have the relations in question in their hands.
Typechecking is a static analysis on programs that enables the detection of potential dynamic type errors. Since it is already a standard technique in programming languages and since XML's schemas are analogous to data types in programming languages, we might expect existing typechecking techniques to be usable also for the verification of XML processing programs. However, a concrete analysis algorithm must be renewed entirely since schemas have a mathematical structure completely different from conventional data types. In this chapter, we first overview various approaches to XML typechecking and then consider, as a case study, a core of the XDuce type system in detail.
Compact taxonomy
Since the aim of typechecking is the rejection of possibly incorrect programs, it is a language feature clearly visible to the user. Therefore, in constructing a typechecker we must take into account not only the internal analysis algorithm but also the external specification, which further influences the design of the whole language.
The first design consideration is the kind of type error that should be detected. That is, what sort of type error should be raised from a program in the target language? The most typical kind of error occurs when the final result does not conform to the specified “output type.” However, some languages also check intermediate results against types. For example, the XDuce language has a pattern match facility, according to which a match error occurs when a form of conformance test on an intermediate result fails (Section 5.1).
Computer science, like other mathematical fields, cannot live without a tight relationship with reality. However, such a relationship is, frankly, not very common. This is probably why people so enthusiastically welcome a true meeting of theory and practice. In that sense, the coming together of XML and tree automata theory was a beautiful marriage. Thus I have written this book in the earnest hope that the news of this marriage will be spread and celebrated all over the world!
The book is a summary of my ten years' work. It could not have been realized without my collaborators, Peter Buneman, Giuseppe Castagna, Alain Frisch, Vladimir Gapeyev, Kazuhiro Inaba, Shinya Kawanaka, Hiromasa Kido, Michael Y. Levin, Sebastian Maneth, Makoto Murata, Benjamin C. Pierce, Tadahiro Suda, Jérôme Vouillon, Takeshi Yashiro, and Philip Wadler. In particular, I thank Kazuhiro Inaba and Sebastian Maneth, who made uncountable comments on the draft and thus contributed to a huge improvement of the book. Lastly, I thank my dear wife, Ayako, who gave me the warmest and unceasing encouragement to finish the book.
This chapter presents efficient algorithms for several important problems related to XML processing, namely: (1) membership for tree automata (to test whether a given tree is accepted by a given tree automaton); (2) evaluation of marking tree automata (to collect the set of bindings yielded by the matching of a given tree against a given marking tree automaton); and (3) containment for tree automata (to test whether the languages of given two tree automata are in subset relation). For these problems we describe several “on-the-fly” algorithms, in which only a part of the whole state space is explored to obtain the final result. In such algorithms there are two basic approaches, top-down and bottom-up. The top-down approach explores the state space from the initial states whereas the bottom-up approach does this from the final states. In general, the bottom-up approach tends to have a lower complexity in the worst case whereas the top-down often gives a higher efficiency in practical cases. We will also consider a further improvement from combining these ideas.
Membership algorithms
In this section, we will consider three algorithms for testing membership. The first is a top-down algorithm, which can be obtained rather simply from the semantics of tree automata but takes time that is exponential in the size of the input tree, in the worst case.
There exist many other areas of possibly fruitful application of relational mathematics from which we select three topics – and thus omit many others. In the first section, we will again study the lifting into the powerset. Credit is mainly due to theoretical computer scientists (e.g. [22, 39]) who have used and propagated such constructs as an existential image or a power transpose. When relations (and not just mappings as for homomorphisms) are employed to compare state transitions, we will correspondingly need relators as a substitute for functors. We will introduce the power relator. While rules concerning power operations have often been only postulated, we go further here and deduce such rules in our general axiomatic setting.
In Section 19.2, we treat questions of simulation using relational means. When trying to compare the actions of two black box state transition systems, one uses the concept of a bisimulation to model the idea of behavioral equivalence. In Section 19.3, a glimpse of state transition systems and system dynamics is offered.
The key concept is that of an orbit, the sequence of sets of states that result when continuously executing transitions of the system starting from some given set of states. There exist many orbits in a delicately related relative position. Here the application of relations seems particularly promising.
Tree transducers form a class of finite-state models that not only accept trees but also transform them. While in Chapter 7 we described a simple yet very powerful tree transformation language, μXDuce, tree transducers are more restricted and work with only a finite number of states. An important aspect of the restriction is that tree transducers can never use intermediate data structure. However, this restriction is not too unrealistic, as it is one of the design principles of the XSLT language – currently the most popular language for XML transformation. In addition, this restriction leads to several nice properties that otherwise would hardly hold; among the most important is the exact typechecking property to be detailed in Chapter 11. The present chapter introduces two standard tree transducer frameworks, namely, top-down tree transducers and macro tree transducers, and shows their basic properties.
Top-down tree transducers
Top-down tree transducers involve a form of transformation that traverses a given tree from the root to the leaves while at each node producing a fragment of the output tree determined by the label of the current node and the current state. Since a state of a tree transducer can then be seen as a rule from a label to a tree fragment, we call a state a procedure from now on.
When considering the tree transducer family, we often take a nondeterministic approach, as in the case of automata.
A comparison may help to describe the intention of this book: natural sciences and engineering sciences have their differential and integral calculi. Whenever practical work is to be done, one will easily find a numerical algebra package at the computing center which one will be able to use. This applies to solving linear equations or determining eigenvalues, for example, in connection with finite element methods.
The situation is different for various forms of information sciences as in the study of vagueness, fuzziness, spatial or temporal reasoning, handling of uncertain/rough/ qualitative knowledge in mathematical psychology, sociology, and computational linguistics, to mention a few areas. These also model theoretically with certain calculi, the calculi of logic, of sets, the calculus of relations, etc. However, for applications practitioners will usually apply Prolog-like calculi. Hardly anybody confronted with practical problems knows how to apply relational calculi; there is almost no broadly available computer support. There is usually no package able to handle problems beyond toy size. One will have to approach theoreticians since there are not many practitioners in such fields. So it might seem that George Boole in 1854 [26, 28] was right in saying:
It would, perhaps, be premature to speculate here upon the question whether the methods of abstract science are likely at any future day to render service in the investigation of social problems at all commensurate with those which they have rendered in various departments of physical inquiry.
The data format known as extensible mark-up language (XML) describes tree structures based on mark-up texts. The tree structures are formed by inserting, between text fragments, open and end tags that are balanced, like parentheses. A data set thus obtained is often called a document. On the surface, XML resembles hypertext mark-up language (HTML), the most popular display format for the Web. The essential difference, however, is that in XML the structure permitted to documents, including the set of tag names and their usage conventions, is not fixed a priori.
More precisely, XML allows users to define their own schemas; a schema determines the permitted structure of a document. In this sense, it is often said that a schema defines a “subset of XML” and thus XML is a “format for data formats.” With the support of schemas each individual application can define its own data format, while virtually all applications can share generic software tools for manipulating XML documents. This genericity is a prominent strength of XML in comparison with other existing formats. Indeed, XML has been adopted with unprecedented speed and range: an enormous number of XML schemas have been defined and used in practice. To raise a few examples, extensible HTML (XHTML) is the XML version of HTML, simple object access protocol (SOAP) is an XML message format for remote procedure calls, scalable vector graphics (SVG) is a vector graphics format in XML, and MathML is an XML format for mathematical formulas.
Many of the problems handled in applications are traditionally formulated in terms of graphs. This means that graphs will often be drawn and the reasoning will be pictorial. On the one hand, this is nice and intuitive when executed with chalk on a blackboard. On the other hand there is a considerable gap from this point to treating the problem on a computer. Often ad hoc programs are written in which more time is spent on I/O handling than on precision of the algorithm. Graphs are well suited to visualization of a result, even with the possibility of generating the graph via a graph drawing program. What is nearly impossible is the input of a problem given by means of a graph – when not using RELVIEW'S interactive graph input (see [18], for example.). In such cases some sort of relational interpretation of the respective graph is usually generated and input in some way.
We will treat reducibility and irreducibility first, mentioning also partial decomposability. Then difunctional relations are studied in the homogeneous context which provides additional results. The main aim of this chapter is to provide algorithms to determine relationally specified subsets of a graph or relation in a declarative way.
Reducibility and irreducibility
We are now going to study in more detail the reducibility and irreducibility introduced in a phenomenological form as Def. 6.12. Many of these results for relations stem from Georg Frobenius [54] and his study of eigenvalues of non-negative realvalued matrices; a comprehensive presentation is given in [92].
The work in this chapter – although stemming from various application fields – is characterized by two antitone mappings leading in opposite directions that cooperate in a certain way. In most cases they are related to one or more relations which are often heterogeneous. An iteration leads to a fixed point of a Galois correspondence. Important classes of applications lead to these investigations. Trying to find out where a program terminates, and thus correctness considerations, also invoke such iterations. Looking for the solution of games is accompanied by these iterations. Applying the Hungarian alternating chain method to find maximum matchings or to solve assignment problems subsumes to these iterations. All this is done in structurally the same way, and deserves to be studied separately.
Galois iteration
When Evariste Galois wrote down his last notes, in preparation for the duel in 1832, in which he expected to die, he probably could not have imagined to what extent these notes would later influence mathematics and applications. What he had observed may basically be presented with the correspondence of permutations of a set and their fixed points. Consider the 5-element sequence {1, 2, 3, 4, 5} for which there exist in total 5! = 120 permutations. The idea is now to observe which set of permutations leaves which set of elements fixed. Demanding more elements to be untouched by a permutation results, of course, in fewer permutations.