Separation logics have assertions—for example P * (x ↦ y) * Q—that describe objects in some underlying model—for example “heaplets”—that separate in some way—such as “the heaplet satisfying P can join with (is disjoint from) the heaplet satisfying x ↦ y.” In this chapter we investigate the objects in the underlying models: what kinds of objects will we have, and what does it mean for them to join?
This study of join relations is the study of separation algebras. Once we know how the underlying objects join, this will explain the meaning of the * operator (and other operators), and will justify the reasoning rules for these operators.
In a typical separation logic, the state has a stack ρ for local variables and a heap m for pointers and arrays. Typically, m is a partial function from addresses to values. The key idea in separation logic is that each assertion characterizes the domain of this function as well as its values. The separating conjunction P * Q requires that P and Q operate on subheaps with disjoint domains.
In contrast, for the stack we do not often worry about separation: we may assume that both P and Q operate on the entirety of the stack ρ.
For now, let us ignore stacks ρ, and let us assume that assertions P are just predicates on heaps, so m ⊨ P is simply P(m).
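The disjoint-domain reading of * can be sketched concretely. The following is a minimal illustration (in Python, not the book's Coq development): heaplets are dicts from addresses to values, two heaplets join only when their domains are disjoint, and P * Q holds on m exactly when m splits into two disjoint sub-heaplets satisfying P and Q respectively.

```python
def join(m1, m2):
    """Join two heaplets; defined only when their domains are disjoint."""
    if m1.keys() & m2.keys():
        return None  # the heaplets do not separate
    return {**m1, **m2}

def sep_conj(P, Q):
    """P * Q holds on m iff m splits into disjoint m1, m2 with P(m1), Q(m2)."""
    def pred(m):
        items = list(m.items())
        # enumerate all ways to split m into two disjoint sub-heaplets
        for mask in range(2 ** len(items)):
            m1 = dict(it for i, it in enumerate(items) if mask >> i & 1)
            m2 = dict(it for i, it in enumerate(items) if not mask >> i & 1)
            if P(m1) and Q(m2):
                return True
        return False
    return pred

def maps_to(x, y):
    """x ↦ y: the heaplet containing exactly the cell x holding value y."""
    return lambda m: m == {x: y}

m = {1: 10, 2: 20}
P = sep_conj(maps_to(1, 10), maps_to(2, 20))
print(P(m))        # True: m splits into {1: 10} and {2: 20}
print(P({1: 10}))  # False: no sub-heaplet satisfies 2 ↦ 20
```

The exponential enumeration of splits is of course only for illustration; the point is that * quantifies over ways of dividing the heaplet, with "divides" fixed by the join relation.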
For convenient application of the VST program logic for C light, we have synthetic or derived rules: lemmas built from common combinations of the primitive inference rules for C light. We also have proof automation: programs that look at proof goals and choose which rules to apply.
For example, consider the C-language statements x:=e→f; and e1→f := e2; where x is a variable, f is the name of a structure field, and e, e1, e2 are expressions. The first command is a load field statement, and the second is a store field. Proofs about these statements could be done using the general semax-load and semax-store rules—along with the mapsto operator—but these require a lot of reasoning about field l-values. It's best to define a synthetic field_mapsto predicate that can be used as if it were a primitive:
We do not show the definition here (see floyd/field_mapsto.v), but basically field_mapsto π τ f v1 v2 is a predicate meaning: τ is a struct type whose field f of type τ2 has address-offset δ from the base address of the struct; the size/signedness of f is ch; v1 is a pointer to a struct of type τ; and the heaplet contains exactly the value v2 (of type τ2) at address v1 + δ, with permission-share π.
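A rough model of what such a predicate asserts about a heaplet can be sketched as follows (Python, purely illustrative; the layout table, struct name, and cell representation are assumptions, not VST's actual definitions):

```python
# Toy layout: struct type -> {field name: (offset delta, field type)}.
# Illustrative only; VST computes offsets from the C type environment.
STRUCT_LAYOUT = {
    "list": {"head": (0, "int"), "tail": (4, "ptr")},
}

def field_mapsto(pi, tau, f, v1, v2):
    """Predicate on heaplets: the heaplet contains exactly the cell at
    address v1 + delta (delta = offset of field f in struct tau),
    holding value v2 with permission-share pi."""
    delta, _field_ty = STRUCT_LAYOUT[tau][f]
    return lambda m: m == {v1 + delta: (pi, v2)}

# The heaplet holding exactly the 'tail' field of a struct list at 100:
p = field_mapsto("top", "list", "tail", 100, 0)
print(p({104: ("top", 0)}))   # True: cell at 100 + 4 with share "top"
print(p({100: ("top", 7)}))   # False: wrong address (that is 'head')
```

The payoff of the synthetic predicate is exactly this packaging: the offset computation and the single-cell heaplet shape are hidden behind one name, so load and store rules can be applied without re-deriving the field l-value each time.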
An important application of separation algebras is to model Hoare logics of programming languages with mutable memory. We generate an appropriate separation logic by choosing the correct semantic model, that is, the correct separation algebra. A natural choice is to simply take the program heaps as the elements of the separation algebra together with some appropriate join relation.
In most of the early work in this direction, heaps were modeled as partial functions from addresses to values. In those models, two heaps join iff their domains are disjoint, the result being the union of the two heaps. However, this simple model is too restrictive, especially when one considers concurrency. It rules out useful and interesting protocols where two or more threads agree to share read permission to an area of memory.
There are a number of different ways to do the necessary permission accounting. Bornat et al. [27] present two different methods: one based on fractional permissions, and another based on token counting. Parkinson, in chapter 5 of his thesis [74], presents a more sophisticated system capable of handling both methods. However, this model has some drawbacks, which we shall address below.
Fractional permissions are used to handle the sorts of accounting situations that arise from concurrent divide-and-conquer algorithms. In such algorithms, a worker thread has read-only permission to the dataset and it needs to divide this permission among various child threads.
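The divide-and-conquer accounting pattern can be sketched in a few lines (Python; a simplified model in the style of fractional permissions, not any particular system's definitions): a full permission is 1, any fraction in (0, 1) grants read-only access, and shares split among children must rejoin to exactly the parent's share.

```python
from fractions import Fraction

def split(share, n):
    """Divide a permission share equally among n child threads."""
    assert 0 < share <= 1 and n > 0
    return [Fraction(share) / n] * n

def rejoin(shares):
    """Recombine child shares; the total must not exceed full ownership."""
    total = sum(shares, Fraction(0))
    assert total <= 1, "shares over-count the resource"
    return total

parent = Fraction(1, 2)          # parent holds a read share of 1/2
children = split(parent, 3)      # each child gets 1/6, still read-only
print(children)                  # [Fraction(1, 6), Fraction(1, 6), Fraction(1, 6)]
print(rejoin(children) == parent)  # True: the parent's share is restored
```

Exact rational arithmetic matters here: the soundness of the accounting rests on shares summing back precisely, which is why fractions rather than floats model permissions.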
Mappings are logical specifications of the relationship between schemas. In data exchange, one typically restricts the kind of dependencies allowed in mappings, either to be able to find more efficient procedures for constructing solutions and answering target queries, or to make mappings have desirable properties, such as closure under composition. These two goals can conflict. For instance, the mapping language of SO tgds ensures closure under composition, but such mappings include a form of second-order quantification that can be difficult to handle in practice. Thus, it is desirable to replace an SO tgd by an equivalent set of st-tgds whenever possible.
In this chapter, we consider the problem of simplifying schema mappings by providing characterizations of the most common classes of mappings in terms of the structural properties they satisfy. The main goal for studying these properties is to isolate the features that different classes of mappings satisfy, and to understand what one can lose or gain by switching from one class of mappings to another. We present basic structural properties and then we use them to characterize the class of mappings specified by st-tgds, both generally, and in LAV and GAV scenarios. We also show that the structural characterizations can be used to derive complexity-theoretical results for testing definability of a mapping into some class of mappings.
So far we have tacitly assumed that one uses a native XML DBMS for performing data exchange tasks. However, this is not the only (and perhaps not even the most common) route: XML documents are often stored in relational DBMSs. Thus, it is natural to ask whether relational data exchange techniques, developed in PART TWO, can be used to perform XML data exchange tasks.
In XML terminology, translations from XML to relations are referred to as shredding of documents, whereas translations going the other way, from relations to XML, are referred to as publishing. Thus, to use relational technology for XML data exchange tasks, we can employ a two-step approach:
shred XML data into relations;
then apply a relational data-exchange engine (and publish the result back as an XML document if necessary).
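The shredding step can be illustrated with a minimal sketch (Python; the edge-relation encoding and column names are one common choice, assumed here for illustration): each element of the document tree becomes a row (id, parent_id, label, text).

```python
import xml.etree.ElementTree as ET

def shred(xml_text):
    """Shred an XML document into an edge relation:
    one row (id, parent_id, label, text) per element."""
    rows, counter = [], [0]
    def visit(node, parent_id):
        counter[0] += 1
        nid = counter[0]
        rows.append((nid, parent_id, node.tag, (node.text or "").strip()))
        for child in node:
            visit(child, nid)
    visit(ET.fromstring(xml_text), None)
    return rows

doc = "<books><book><title>Data Exchange</title></book></books>"
for row in shred(doc):
    print(row)
# (1, None, 'books', '')
# (2, 1, 'book', '')
# (3, 2, 'title', 'Data Exchange')
```

Publishing is the inverse translation: reassembling the tree from the relation by following parent_id links. The correctness question of this chapter is whether running a relational data-exchange engine between the shredded source and target commutes with these translations.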
This seems very natural, but the key question is whether it will work correctly. That is, are we guaranteed to have the same result as we would have gotten had we implemented a native XML data-exchange system? This is what we investigate in this chapter. It turns out that we need to impose restrictions on XML schema mappings to enable this approach, and the restrictions are similar to those we needed to ensure tractability of data exchange tasks in the previous chapters.
Translations and correctness
We now describe what we mean by correctness of translations that enable a relational data exchange system to perform XML data exchange tasks.
In data exchange, we are interested in computing certain answers to a query. However, we do not yet know when such a computation is feasible. The goal of this chapter is to answer this question.
The bad news is that the problem of computing certain answers for relational calculus (equivalently, relational algebra or FO) queries is undecidable, even in the absence of target dependencies. But the good news is that the problem becomes decidable, and, indeed, tractable, for unions of conjunctive queries over mappings with a weakly acyclic set of tgds. Conjunctive queries, as was already mentioned several times, play a very important role and are very common, and mappings with weakly acyclic sets of tgds, as we have seen, are the ones behaving particularly well when it comes to materializing solutions.
The positive result, however, breaks down when we extend conjunctive queries with inequalities. But we can still find a meaningful class of queries capable of expressing interesting properties in the data exchange context that extends conjunctive queries with a limited amount of negation and shares most of their good properties for data exchange.
Finally, we study the notion of query rewriting, i.e., when certain answers to a query Q can be computed by posing a possibly different query Q′ over a materialized solution. Such rewritings are easy for unions of conjunctive queries; our study concentrates on rewritings of relational algebra queries.
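For unions of conjunctive queries the rewriting is especially simple: evaluate the query naively over a materialized universal solution, then discard any answer tuple that mentions a labeled null. A small sketch (Python; the relation, query, and null names are illustrative):

```python
class Null:
    """A labeled null introduced while materializing the solution."""
    def __init__(self, name): self.name = name
    def __repr__(self): return f"_N{self.name}"

# Materialized target instance Emp(name, dept);
# Bob's department is unknown, so it is a labeled null.
emp = [("Alice", "Sales"), ("Bob", Null("d1"))]

# Conjunctive query Q(y) :- Emp(x, y): project the departments.
answers = {(dept,) for (name, dept) in emp}

# Certain answers: keep only tuples free of labeled nulls.
certain = {t for t in answers
           if not any(isinstance(v, Null) for v in t)}
print(certain)  # {('Sales',)}
```

The null-laden tuple is dropped because a null stands for some unknown value: it need not denote the same constant in every solution, so a tuple containing it is not certain. It is exactly this discard step that fails once inequalities are added to the query language.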
So far we have concentrated on handling data in data exchange, i.e., transforming source databases into target ones, and answering queries over them. We now look at manipulating information about schemas and schema mappings, known as metadata, i.e., we deal with metadata management. In this short chapter we outline the key problems that need to be addressed in the context of metadata management. These are divided into two groups of problems. The first concerns reasoning about mappings, and the second group of problems is about manipulating mappings, i.e., building new mappings from existing ones.
Reasoning about schema mappings
As we have seen, mappings are logical specifications of the relationship between schemas, both in the relational and XML scenarios. In particular, we have seen many different logical languages that are used to specify mappings. Thus, a first natural problem that one would like to study in the context of metadata management is to characterize the properties that a mapping satisfies depending on the logical formulae that are used to define it. More precisely, one would like, in the first place, to understand whether the logical formulae used to specify a mapping are excessively restrictive in the sense that no source instance admits a solution, or at least restrictive in the sense that some source instances do not admit solutions. Note that this is different from the problem of checking for the existence of solutions, studied in Chapter 5.
Data exchange, as the name suggests, is the problem of exchanging data between different databases that have different schemas. One often needs to exchange data between existing legacy databases, whose schemas cannot be easily modified, and thus one needs to specify rules for translating data from one database to the other. These rules are known as schema mappings. Once a source database and a schema mapping are given, one needs to transfer data to the target, i.e., construct a target database. And once the target database is constructed, one needs to answer queries against it.
This problem is quite old; it has been studied, and systems have been built, but in a rather ad hoc way. A systematic study of the problem of data exchange commenced with the 2003 paper “Data exchange: semantics and query answering” by Fagin, Kolaitis, Miller, and Popa, published in the proceedings of the International Conference on Database Theory. A large number of follow-up papers appeared, and for a while data exchange was one of the most active research topics in databases. Foundational questions related to data exchange largely revolved around three key problems:
how to build a target solution;
how to answer queries over target solutions; and
how to manipulate schema mappings themselves.
The last question is also known under the name of metadata management, since mappings represent metadata, rather than data in the database.