1. Introduction
To handle notions of differentiation that have become more prominent in computer science, two categorical structures have been useful: monoidal differential categories (Blute et al., Reference Blute, Cockett and Seely2006) and Cartesian differential category (CDCs) (Blute et al., Reference Blute, Cockett and Seely2009). Each axiomatizes a different aspect of differentiation: monoidal differential categories axiomatize the linear maps and then derive the smooth maps from them; conversely, CDCs axiomatize the smooth maps and derive the linear maps from them. While these structures have been very useful, they both only represent the “forward” aspect of differentiation. For uses of the derivative in supervised learning, the “reverse” derivative is more relevant.
To understand the difference between forward and reverse differentiation, let us provide a simple example. Consider the smooth map $f: \mathbb{R}^2 \to \mathbb{R}$ defined by $f(x_1,x_2) = x_1^2x_2 + \sin(x_2)$ . The Jacobian matrix of f, at $(x_1, x_2)$ , is the $1 \times 2$ matrix whose components are the partial derivatives of f:
The directional (forward) derivative of f is the map $\mathsf{D}[f]: \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ given by multiplying the Jacobian matrix of f at the first input vector by the second input vector (seen as a $2 \times 1$ matrix):
Note that this “pushes vectors forwards”: at a point of $\mathbb{R}^2$ , the directional derivative $\mathsf{D}[f]$ takes a vector in $\mathbb{R}^2$ to a vector in $\mathbb{R}$ ; that is, vectors are moved in the same direction as the map f itself.
Conversely, the reverse derivative moves vectors in the opposite direction. The reverse derivative uses the transpose of the Jacobian of f at $(x_1, x_2)$ , which is the $2 \times 1$ matrix:
Then the reverse derivative of f is $\mathsf{R}[f]: \mathbb{R}^2 \times \mathbb{R} \to \mathbb{R}^2$ defined by multiplying the transpose of the Jacobian f at the first input vector by the second input vector (this time seen as a $1 \times 1$ matrix):
Thus, this operation indeed moves vectors in the opposite direction as f; that is, it takes vectors from the codomain of f and returns vectors in the domain. The reverse derivative is better suited for supervised learning situations, in which one knows a change in the codomain (e.g., the error of some function) and wants to know how much adjustment to make in the domain.
Thus, a natural question is how to modify monoidal and CDCs to handle reverse differentiation. For the Cartesian side of the picture, this was already accomplished in Cockett et al. (Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020). While a CDC involves a category which comes equipped with an operator $\mathsf{D}$ which for any map $f: A \to B$ outputs a map $\mathsf{D}[f]: A \times A \to B$ , a Cartesian reverse differential category (CRDC) comes equipped with an operator $\mathsf{R}$ which for any map $f: A \to B$ outputs a map $\mathsf{R}[f]: A \times B \to A$ . It was shown in Cockett et al. (Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020) that a CRDC can be seen as a CDC with additional structure. Specifically, a CRDC is equivalent to giving a CDC in which the subcategory of linear maps in each simple slice has a transpose operator, which categorically speaking is a special type of of dagger structure. The explicit connection with supervised learning was then made in Cruttwell et al. (Reference Cruttwell, Gavranović, Ghani, Wilson and Zanasi2022), which showed how to describe several supervised-learning techniques in the abstract setting of a CRDC.
However, the first CRDC paper Cockett et al. (Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020) left open the question of what an MRDC should be. The goal of this paper is to fill in this gap by defining monoidal reverse differential categories and establishing their fundamental relationships to the existing categorical differential structures described above.
What should this structure look like? As mentioned above, CDCs axiomatize smooth maps, while MDCs axiomatize linear maps. However, as noted above, for a CRDC, its subcategory of linear maps has dagger structure. So at a minimum, an MRDC should have dagger structure. However, we argue in this paper that an MRDC should be an even stronger: it should be self-dual compact closed.
Why do we ask for this additional structure? There are two important requirements we ask of an MRDC.
(1) Just as every CRDC gives a CDC so should an MRDC give a monoidal differential category (MDC); moreover, we should be able to characterize precisely what structure is required of an MDC to make it an MRDC (as we can in the Cartesian case (Cockett et al., Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Theorem 41).
(2) Just as the coKleisli category of an MDC is a CDC (Blute et al., Reference Blute, Cockett and Seely2009, Proposition 3.2.1), so should the coKleisli category of an MRDC be a CRDC.
We shall see in Section 4.1 that these requirements force an MRDC to be self-dual compact closed.
To prove these results, it will be helpful to investigate the basic structure of an MDC more closely. In Section 2, we review monoidal differential categories and add a new aspect to their story: a “context fibration” which helps to relate the structure of MDCs to CDCs (and then similarly between MRDCs and CRDCs).
Thus, the main contributions of this paper are as follows:
• Give the basic definition of an MRDC, along with examples, including some unexpected ones in quantum computation.
• Prove theorems that describe the relationships of MRDCs to CDCs, CRDCs, and MDCs.
• Provide additional material about the relationship of MDCs to CDCs via a “context fibration”.
This work leaves open many future avenues for exploration; we describe some of these in Section 5.
Related Work: In this paper, we study the linear logic categorical semantics for reverse differentiation. This was also studied by Smeding and Vákár when they provide the categorical semantics for CHAD, their programming language for automatic differentiation (which includes both forward and reverse differentiation) (Vákár and Smeding, Reference Vákár and Smeding2022). The work in this paper is also related to work done in categorical quantum mechanics. Indeed, the categorical semantics of (differential) linear logic that we consider in this paper also comes with the added assumption of dagger-compact-closed structure. Compact-closed categories have long been considered as models for linear logic (Hyland and Schalk, Reference Hyland and Schalk2003), and they form a setting that is often studied by those in the categorical quantum community, sometimes called multiplicative categorical quantum logic (Duncan, Reference Duncan2006, Chapter 4). We are specifically interested in compact-closed models of linear logic with exponentials. This is a setting that was studied by Selinger and Valiron when they developed a programming language for quantum computation with classical control (Selinger and Valiron, Reference Selinger and Valiron2008), as well as by Vicary who studied categorical quantum harmonic oscillators (Vicary, Reference Vicary2008). Cockett, Comfort, and Srinivasan also provide a generalization of linear logic with exponentials for categorical quantum mechanics, by generalizing compact-closed categories to linear distributive categories with daggers (Srinivasan et al., Reference Srinivasan, Comfort and Cockett2021).
Outline: A reader interested in just the definition of MRDC can skip ahead to Section 4. However, an important part of the paper is the justification of why we define MRDCs the way we do: in particular, why the self-dual compact-closed requirement is important. For this, it was helpful for us to expand on a number of aspects of MDCs and CRDCs. In particular, in Section 2.2, we define a canonical “context” fibration associated to any MDC. Then, in Section 2.7, we show that when we can build an associated CDC from an MDC, the canonical fibration associated to the MDC is equivalent to the canonical fibration of linear maps associated to a CDC. This result is key in seeing why MRDCs must be self-dual compact closed. Thus, prior to defining an MRDC, we review the background of MDCs, CDCs, and CRDCs, but also add some important new theory of these structures, which will in turn be helpful in understanding how our definition of an MRDC comes about. Section 4 contains the main definition of the paper, that of an MRDC. We also describe examples and prove the required properties. We conclude the section by describing additional ways to build CRDCs. Finally, in Section 5, we describe some ways this work can be expanded on in the future.
Conventions: In this paper, we will use the same terminology, notation, graphical calculus, and conventions as found in Blute et al. (Reference Blute, Cockett, Lemay and Seely2020). In particular, our string diagrams are to be read from top to bottom, and we write composition diagrammatically, that is, the composition of maps $f: A \to B$ and $g: B \to C$ is denoted as $f;\;g: A \to C$ .
2. Forward Differential Categories
In this section, we review the theory of monoidal and CDCs and add an important new element to the story: a canonical fibration associated to any coalgebra modality (in particular, to any differential category); see Section 2.2. When the differential category has products, so that we can build its associated CDC, we show that this canonical fibration is isomorphic to the canonical linear fibration of the CDC; see Theorem 24. This isomorphism will be very useful when we go from MRDCs to CRDCs.
2.1 Coalgebra modalities
The central structure on which a monoidal differential category rests is a coalgebra modality.
Definition 1. A coalgebra modality (Blute et al., Reference Blute, Cockett and Seely2006, Definition 2.1) on a symmetric monoidal category $\mathbb{X}$ is a quintuple $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ consisting of an endofunctor ${\rm{!}}: \mathbb{X} \to \mathbb{X}$ and four natural transformations:
such that $({\rm{!}}, \delta, \varepsilon)$ is a comonad and for each object A, $({\rm{!}} A, \Delta, e)$ is a cocommutative comonoid and $\delta_A$ is a comonoid morphism, that is, the diagrams found in (Reference Blute, Cockett, Lemay and SeelyBlute et al., 2020, Definition 1) commute.
Note that requiring that $\Delta$ and e be natural transformations is equivalent to asking that for each map $f: A \to B$ , ${\rm{!}}(f): {\rm{!}} A \to {\rm{!}} B$ is also a comonoid morphism. In the graphical calculus, we will use functor boxes when dealing with string diagrams involving the endofunctor, that is, a mere map $f: A \to B$ will be encased in a circle while ${\rm{!}}(f): {\rm{!}} A \to {\rm{!}} B$ will be encased in a box:
The remaining coalgebra modality structure maps are drawn as follows:
We will occasionally make use of the following canonical natural transformation associated to any coalgebra modality.
Definition 2. For a coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ on a symmetric monoidal category $\mathbb{X}$ , its coderiving transformation (Cockett and Lemay, Reference Cockett and Lemay2018, Definition 2.2) is the natural transformation $\mathsf{d}^\circ_A: {\rm{!}} A \to {\rm{!}} A \otimes A$ defined as follows:
See Cockett and Lemay (Reference Cockett and Lemay2018, Proposition 2.1) for a list of identities the coderiving transformation satisfies.
We now turn our attention to when our symmetric monoidal category also has finite products.
Definition 3. For a coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ on a symmetric monoidal category $\mathbb{X}$ with finite products, the Seely maps consist of the natural transformations:
defined, respectively, as follows:
A coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ on a symmetric monoidal category $\mathbb{X}$ with finite products has Seely isomorphisms (Reference Blute, Cockett, Lemay and SeelyBlute et al., 2020, Definition 10) if the Seely maps are isomorphisms, so that ${\rm{!}}(A \times B) \cong {\rm{!}} A \otimes {\rm{!}} B$ and ${\rm{!}} \top \cong k$ .
Coalgebra modalities with Seely isomorphisms can equivalently be described as monoidal coalgebra modalities (Blute et al., Reference Blute, Cockett, Lemay and Seely2020, Definition 2), which are coalgebra modalities equipped with extra structure: a natural transformation ${\mathsf{m}_{A,B}: {\rm{!}} A \otimes {\rm{!}} B \to {\rm{!}}(A \otimes B)}$ and a map $\mathsf{m}_k: k \to {\rm{!}} k$ such that the underlying comonad ${\rm{!}}$ is a symmetric monoidal comonad, and that $\Delta$ and $\mathsf{e}$ are both monoidal transformations and ${\rm{!}}$ -coalgebra morphisms (which imply that $\mathsf{m}_{A,B}$ and $\mathsf{m}_k$ are comonoid morphisms). See Blute et al. (Reference Blute, Cockett, Lemay and Seely2020, Section 7) for how to build $\mathsf{m}$ and $\mathsf{m}_k$ from the Seely isomorphisms, and vice versa. Note however that monoidal coalgebra modalities can be defined without the need of finite products; however, as they do not play a central role in this paper, we have elected to only briefly mention them. Many examples of (monoidal) coalgebra modalities can be found throughout the literature, see for example Hyland and Schalk (Reference Hyland and Schalk2003, Section 2.4) for a very nice list of various kinds of examples of (monoidal) coalgebra modalities.
We conclude this section by briefly discussing coalgebra modalities in the presence of additive structure. Indeed, the underlying categorical structure of a differential category is not only a symmetric monoidal category but also an additive symmetric monoidal category.
Definition 4. An additive symmetric monoidal category (Blute et al., Reference Blute, Cockett, Lemay and Seely2020, Definition 3) is a symmetric monoidal category $\mathbb{X}$ such that each hom-set $\mathbb{X}(A,B)$ is a commutative monoid with zero map $0 \in \mathbb{X}(A,B)$ and addition ${+:\; \mathbb{X}(A,B) \times \mathbb{X}(A,B) \to \mathbb{X}(A,B)}$ , $(f,g) \mapsto f +g$ , and, such that composition and the tensor product preserves the additive structure, that is, the following equalities hold:
By Blute et al. (Reference Blute, Cockett, Lemay and Seely2020, Theorem 1), for additive symmetric monoidal categories, monoidal coalgebra modalities can equivalently be described as additive bialgebra modalities (Blute et al., Reference Blute, Cockett, Lemay and Seely2020, Definition 3). This implies that, in the additive case, we also have two extra natural transformations $\nabla_A: {\rm{!}} A \otimes {\rm{!}} A \to {\rm{!}} A$ and $u_A: k \to {\rm{!}} A$ such that ${\rm{!}} A$ is a bimonoid. In particular, this implies that ${\rm{!}} A$ is a commutative monoid. In the graphical calculus:
If an additive symmetric monoidal category has finite products, then the product $\times$ is in fact a biproduct and the terminal object $\top$ is a zero object. Thus, for an additive symmetric monoidal category with finite (bi)products, a coalgebra modality with Seely isomorphisms is an additive bialgebra modality and vice versa (Blute et al., Reference Blute, Cockett, Lemay and Seely2020, Theorem 6). In particular, the inverse maps $\chi^{-1}_{A,B}: {\rm{!}} A \otimes {\rm{!}} B \to {\rm{!}} (A \times B)$ and $\chi^{-1}_\top: k \to {\rm{!}} \top$ are constructed as follows using the monoid structure of ${\rm{!}} A$ :
where $\iota_0: A \to A \times B$ and $\iota_1: B \to A \times B$ are the injection maps of the biproduct.
2.2 The context fibration associated to a coalgebra modality
In this section, we describe a canonical fibration associated to any coalgebra modality, whose individual fibres were studied in Ehrhard and Jafarrahmani (Reference Ehrhard and Jafarrahmani2021), Hyland and Schalk (Reference Hyland and Schalk1999). We assume the reader is familiar with the theory of fibrations (as, for example, presented in Jacobs, Reference Jacobs1999, Section 2.1). The fibration in question will be over the coKleisli category of the comonad ${\rm{!}}$ . As we will be working with coKleisli categories, we will use the notation in Blute et al. (Reference Blute, Cockett and Seely2015), where interpretation brackets $\left[\kern-0.15em\left[ - \right]\kern-0.15em\right]$ are used to translate between maps in the coKleisli category and maps in the base category. That is, for a comonad $({\rm{!}}, \delta, \varepsilon)$ on a category $\mathbb{X}$ if $\mathbb{X}_{\rm{!}}$ is its coKleisli category, then a map $f: A \to B$ in $\mathbb{X}_{{\rm{!}}}$ corresponds to a map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ in $\mathbb{X}$ . Using this notation, recall that composition and identities in $\mathbb{X}_{\rm{!}}$ are defined as
There are canonical adjoint functors $\mathsf{U}_{\rm{!}}: \mathbb{X}_{\rm{!}} \to \mathbb{X}$ and $\mathsf{F}_{\rm{!}}: \mathbb{X} \to \mathbb{X}_{\rm{!}}$ defined as follows:
We now describe the canonical fibration over the coKleisli category of a coalgebra modality.
Definition 5. Let $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ be a coalgebra modality on a symmetric monoidal category $\mathbb{X}$ . Define the category $\mathcal{L}_{\rm{!}}[\mathbb{X}]$ as follows:
(i) The objects of $\mathcal{L}_{\rm{!}}[\mathbb{X}]$ are pairs of objects (X,A) of $\mathbb{X}$ ; that is:
(ii) The maps of $\mathcal{L}_{\rm{!}}[\mathbb{X}]$ are pairs $(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right],g): (X,A) \to (Y,B)$ consisting of a coKleisli map ${\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} X \to Y}$ and a map $g: {\rm{!}} X \otimes A \to B$ , that is,
(iii) The identity map of (X,A) is defined as $(\left[\kern-0.15em\left[ 1_A \right]\kern-0.15em\right], e_X \otimes 1_A) = (\varepsilon_A, e_X \otimes 1_A): (X,A) \to (X,A)$ ;
(iv) The composition of maps $(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right],g): (X,A) \to (Y,B)$ and $(\left[\kern-0.15em\left[ h \right]\kern-0.15em\right], k): (Y,B) \to (Z,C)$ is defined as follows:
Let $\mathsf{p}_{\rm{!}}: \mathcal{L}_{\rm{!}}[\mathbb{X}] \to \mathbb{X}_{\rm{!}}$ be the forgetful functor, which is defined on objects as $\mathsf{p}_{\rm{!}}(X,A) = X$ and on maps as ${\mathsf{p}_{\rm{!}}(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right],g) = \left[\kern-0.15em\left[ f \right]\kern-0.15em\right]}$ .
The following is then straightforward:
Proposition 6. Let $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ be a coalgebra modality on a symmetric monoidal category $\mathbb{X}$ . Then $\mathsf{p}_{\rm{!}}: \mathcal{L}_{\rm{!}}[\mathbb{X}] \to \mathbb{X}_{\rm{!}}$ is a fibration where the Cartesian maps are those of the form:
We now describe the fibres of this fibration. The fibres are examples of Hyland and Schalk’s comonoid indexing (Hyland and Schalk, Reference Hyland and Schalk1999, Section 4) over the cofree ${\rm{!}}$ -coalgebras, which are also used by Ehrhard and Jafarrahmani for studying fixed point formulas (Ehrhard and Jafarrahmani, Reference Ehrhard and Jafarrahmani2021). In particular, since ${\rm{!}} X$ is a comonoid, ${\rm{!}} X \otimes -$ is a comonad and, furthermore, its coKleisli category is precisely the fibre over X.
Lemma 7. Let $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ be a coalgebra modality on a symmetric monoidal category $\mathbb{X}$ . For any object $X \in \mathbb{X}$ , the fibre over X of the fibration $\mathsf{p}_{\rm{!}}: \mathcal{L}_{\rm{!}}[\mathbb{X}] \to \mathbb{X}_{\rm{!}}$ is written as $\mathcal{L}_{\rm{!}}[X]$ and given by
(i) The objects of $\mathcal{L}_{\rm{!}}[X]$ are the same as the objects of $\mathbb{X}$ , that is, $Ob\left( \mathcal{L}_{\rm{!}}[X] \right) = Ob\left( \mathbb{X} \right)$ ;
(ii) The maps of $\mathcal{L}_{\rm{!}}[X]$ are maps $f: {\rm{!}} X \otimes A \to B$ , that is, $\mathcal{L}_{\rm{!}}[X]\left( A,B \right) = \mathbb{X}({\rm{!}} X \otimes A, B)$ ;
(iii) The identity map of A is defined as $e_X \otimes 1_A: {\rm{!}} X \otimes A \to A$ ;
(iv) The composition of maps $f: {\rm{!}} X \otimes A \to B$ and $g: {\rm{!}} X \otimes B \to C$ is defined as follows:
For every coKleisli map $\left[\kern-0.15em\left[ h \right]\kern-0.15em\right]: {\rm{!}} X \to Y$ , define the substitution functor $\left[\kern-0.15em\left[ h \right]\kern-0.15em\right]^\ast_{\rm{!}}: \mathcal{L}_{\rm{!}}[Y] \to \mathcal{L}_{\rm{!}}[X]$ on objects as $\left[\kern-0.15em\left[ h \right]\kern-0.15em\right]^\ast_{\rm{!}}(A) = A$ and on maps $f: {\rm{!}} Y \otimes A \to B$ as follows:
Every fibre is also a symmetric monoidal category.
Lemma 8. (Hyland and Schalk, Reference Hyland and Schalk1999, Proposition 4.1) Let $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ be a coalgebra modality on a symmetric monoidal category $\mathbb{X}$ . For every object $X \in \mathbb{X}$ , $\mathcal{L}_{\rm{!}}[X]$ is a symmetric monoidal category where the tensor product $\otimes$ is defined on objects as the tensor product in $\mathbb{X}$ , and on maps $f: {\rm{!}} X \otimes A \to B$ and $g: {\rm{!}} X \otimes C \to D$ as follows:
and where the monoidal unit is the same as $\mathbb{X}$ . Furthermore, for every coKleisli map $\left[\kern-0.15em\left[ h \right]\kern-0.15em\right]: {\rm{!}} X \to Y$ , the substitution functor $\left[\kern-0.15em\left[ h \right]\kern-0.15em\right]^\ast_{\rm{!}}: \mathcal{L}_{\rm{!}}[Y] \to \mathcal{L}_{\rm{!}}[X]$ is a strict symmetric monoidal functor.
That is, as described in Moeller and Vasilakopoulou (Reference Moeller and Vasilakopoulou2020, Remark 3.5), this fibration is a pseudomonoid in the 2-category of fibrations over $\mathbb{X}_{\rm{!}}$ . This is different from a monoidal fibration, which is defined to be a pseudomonoid in the 2-category of fibrations with non-fixed base (Moeller and Vasilakopoulou, Reference Moeller and Vasilakopoulou2020, Definition 3.1).
However, if the base category has finite products $\mathbb{X}$ , then so does $\mathbb{X}_{\rm{!}}$ : on objects the product is defined as in $\mathbb{X}$ , and the remaining data is defined as follows:
Moreover, such a fibration in which the base category is Cartesian is a monoidal fibration: see Moeller and Vasilakopoulou (Reference Moeller and Vasilakopoulou2020, Theorem 4.1) and Shulman (Reference Shulman2008, Theorem 12.8). In particular, this means that the total category of the fibration is monoidal, and the following corollary describes its structure.
Corollary 9. Let $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ be a coalgebra modality on a symmetric monoidal category $\mathbb{X}$ with finite products. Then $\mathcal{L}_{\rm{!}}[\mathbb{X}]$ is a symmetric monoidal category where the tensor product $\otimes_{\rm{!}}$ is defined on objects as $(X,A) \otimes (Y,B) = (X \times Y, A \otimes B)$ , and on maps $(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right], g): (X,A) \to (Y,B)$ and $(\left[\kern-0.15em\left[ h \right]\kern-0.15em\right], k): (Z,C) \to (W,D)$ , $ (\left[\kern-0.15em\left[ f \right]\kern-0.15em\right], g) \otimes (\left[\kern-0.15em\left[ h \right]\kern-0.15em\right], k)$ is defined as follows:
and where the monoidal unit is $(\top, k)$ . Furthermore, $\mathsf{p}_{\rm{!}}: \mathcal{L}_{\rm{!}}[\mathbb{X}] \to \mathbb{X}_{\rm{!}}$ is a monoidal fibration in the sense of Reference Moeller and VasilakopoulouMoeller and Vasilakopoulou (2020, Definition 3.1).
It also interesting to note that for monoidal coalgebra modalities, each fibre also comes equipped with monoidal coalgebra modality structure (Ehrhard and Jafarrahmani, Reference Ehrhard and Jafarrahmani2021; Hyland and Schalk, Reference Hyland and Schalk1999). Furthermore, if one also assumes finite products, we can extend this to a monoidal coalgebra modality on the whole fibration. However, these results are not necessary for the rest of the story of this paper.
2.3 Monoidal differential categories
We now recall one of the central structures of this paper. Monoidal differential categories (these were originally simply called differential categories, but here we add “monoidal” to help differentiate the various structures we are considering). For a more detailed introduction to monoidal differential categories, we refer the reader to Blute et al. (Reference Blute, Cockett, Lemay and Seely2020, Reference Blute, Cockett and Seely2006).
Definition 10. A monoidal differential category (Blute et al., Reference Blute, Cockett and Seely2006, Definition 2.4) is an additive symmetric monoidal category $\mathbb{X}$ with a coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ which comes equipped with a deriving transformation (Blute et al., Reference Blute, Cockett, Lemay and Seely2020, Definition 7); that is, a natural transformation $\mathsf{d}_A: {\rm{!}} A \otimes A \to {\rm{!}} A$ , which is drawn in the graphical calculus as
and such that the following axioms hold:
[d.1] Constant rule:
[d.2] Leibniz rule (or product rule):
[d.3] Linear rule:
[d.4] Chain rule:
[d.5] Interchange rule:
For lists of many examples of differential categories, we invite the reader to see Blute et al. (Reference Blute, Cockett, Lemay and Seely2020, Reference Blute, Cockett and Seely2006).
Definition 11. A monoidal differential storage category (Blute et al., Reference Blute, Cockett and Seely2006, Section 4) is a monoidal differential category with finite products whose coalgebra modality has Seely isomorphisms.
For a monoidal differential storage category, the differential structure can also equivalently be described in terms of a codereliction (Blute et al., Reference Blute, Cockett and Seely2006, Definition 4.11), which is a natural transformation $\eta_A: A \to {\rm{!}} A$ satisfying the axioms found in Blute et al. (Reference Blute, Cockett, Lemay and Seely2020, Definition 9). By Blute et al. (Reference Blute, Cockett, Lemay and Seely2020, Theorem 4), for coalgebra modalities with Seely isomorphisms (or more generally monoidal coalgebra modalities), there is a bijective correspondence between coderelictions and deriving transformations. Starting with a deriving transformation $\mathsf{d}$ , we construct a codereliction as follows:
Conversely, starting with a codereliction $\eta$ , we construct a deriving transformation as follows:
These constructions are inverses of each other.
2.4 Cartesian differential categories
Another central structure of this paper is a CDC. For a full detailed introduction to CDCs, see Blute et al. (Reference Blute, Cockett and Seely2009), Garner and Lemay (Reference Garner and Lemay2021).
The underlying structure of a CDC is that of a Cartesian left additive category. A category is said to be left additive if it is skew-enriched over the category of commutative monoids (Garner and Lemay, Reference Garner and Lemay2021, Section 2.1), or in other words, if each hom-set is a commutative monoid such that precomposition preserves the additive structure. This allows one to have zero maps and sums of maps while allowing for maps which do not preserve the additive structure. Maps which do preserve the additive structure are called additive maps.
Definition 12. A Cartesian left additive category (Blute et al., Reference Blute, Cockett and Seely2009, Definition 1.2.1) is a category $\mathbb{X}$ with finite products such that each hom-set $\mathbb{X}(A,B)$ is a commutative monoid with zero map $0 \in \mathbb{X}(A,B)$ and addition ${+: \mathbb{X}(A,B) \times \mathbb{X}(A,B) \to \mathbb{X}(A,B)}$ , $(f,g) \mapsto f +g$ , and, such that:
(i) Precomposition preserves the additive structure; that is, the following equalities hold: $f;\; 0 = 0$ and $f;\;(g+h) = f;\;g + f;\;h$
(ii) Postcomposition by the projection maps preserve the additive structure; that is, the following equalities hold: $0;\;\pi_i= 0$ and $(f+g);\;\pi_i = f;\;\pi_i + g;\;\pi_i$ .
In a Cartesian left additive category, a map $f: A \to B$ is additive (Reference Blute, Cockett and SeelyBlute et al., 2009, Definition 1.1.1) if postcomposition by f preserves the additive structure; that is, the following equalities hold: $(g+h);\;f = g;\;f + h;\;f$ and $0;\;f = 0$ (note that the projection maps are additive).
Here are now some important maps for CDCs that can be defined in any Cartesian left additive category. In a Cartesian left additive category $\mathbb{X}$ :
(i) For each pair of objects A and B, define the injection maps $\iota_0: A \to A \times B$ and $\iota_1: B \to A \times B$ respectively as $\iota_0 := \langle 1_A, 0 \rangle$ and $\iota_1 := \langle 0, 1_B \rangle$
(ii) For each object A, define the sum map $+_A: A \times A \to A$ as $+_A := \pi_0 + \pi_1$ .
(iii) For each object A, define the lifting map $\ell_A: A \times A \to (A \times A) \times (A \times A)$ as follows $\ell_A := \iota_0 \times \iota_1$ .
(iv) For each object A, define the interchange map $c_A: (A \times A) \times (A \times A) \to (A \times A) \times (A \times A)$ as follows $c_A : = \left \langle \pi_0 \times \pi_0, \pi_1 \times \pi_1 \right \rangle$ .
Observe that while c is natural in the obvious sense, the same cannot be said for the rest. Indeed, the injection maps $\iota_j$ , the sum map $\nabla$ , and the lifting map $\ell$ are not natural transformations. In particular, since the injection maps are not natural, it follows that these injection maps do not make the product a coproduct, and therefore not a biproduct. However, the well-known biproduct identities still hold in a Cartesian left additive category.
Definition 13. A CDC (Blute et al., Reference Blute, Cockett and Seely2009, Definition 2.1.1) is a Cartesian left additive category $\mathbb{X}$ equipped with a differential combinator $\mathsf{D}$ which is a family of operators:
where $\mathsf{D}[f]$ is called the derivative of f, such that the following seven axioms hold:
[CD.1] Additivity of the differentiation:
[CD.2] Additivity of the derivative in its second variable:
[CD.3] Coherence with identities and projections:
[CD.4] Coherence with pairings:
[CD.5] Chain rule:
[CD.6] Linearity of the derivative in its second variable:
[CD.7] Symmetry of mixed partial derivatives:
More discussions on the intuition for the differential combinator axioms can be found in Blute et al. (Reference Blute, Cockett and Seely2009, Remark 2.1.3). There are many interesting (and sometimes very exotic) examples of CDCs in the literature: see Cockett and Lemay (Reference Cockett and Lemay2022), Garner and Lemay (Reference Garner and Lemay2021).
2.5 Linear fibration of a CDC
Just as any monoidal differential category has a canonical fibration associated to it, so too does a CDC. To understand this fibration, we begin by describing what it means for a map in a CDC to be linear.
Definition 14. In a CDC $\mathbb{X}$ with differential combinator $\mathsf{D}$ , a map $f: A \to B$ is linear (Blute et al., Reference Blute, Cockett and Seely2009, Definition 2.2.1) if the following diagram commutes:
or equivalently ( Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk Cockett et al., 2020 , Lemma 12), if the following diagram commutes:
Define the subcategory of linear maps $\mathsf{LIN}[\mathbb{X}]$ to be the category whose objects are the same as $\mathbb{X}$ and whose maps are linear in $\mathbb{X}$ .
A modification of this notion allows one to describe maps which are only “linear in one variable”.
Definition 15. In a CDC $\mathbb{X}$ with differential combinator $\mathsf{D}$ , a map ${f\!: X \!\times\! A \to B}$ is linear in its second argument A (or linear in context X) (Cockett et al., Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Definition 9) if the following diagram commutes:
or equivalently ( Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk Cockett et al., 2020 , Lemma 12), if the following diagram commutes:
With such maps, we can define the canonical fibration associated to a CDC.
Definition 16. For a CDC $\mathbb{X}$ with differential combinator $\mathsf{D}$ , define the category $\mathcal{L}[\mathbb{X}]$ as follows:
(i) The objects of $\mathcal{L}[\mathbb{X}]$ are pairs of elements (X,A) of $\mathbb{X}$ , that is, $Ob\left( \mathcal{L}[\mathbb{X}] \right) = Ob\left( \mathbb{X} \right) \times Ob\left( \mathbb{X} \right)$ ;
(ii) The maps of $\mathcal{L}[\mathbb{X}]$ are pairs of maps $(f,g): (X,A) \to (Y,B)$ consisting of an arbitrary map ${f: X \to Y}$ and a map $g: X \times A \to B$ which is linear in context X;
(iii) The identity map of (X,A) is the pair $(1_X, \pi_1): (X,A) \to (X,A)$ ;
(iv) The composition of maps $(f,g): (X,A) \to (Y,B)$ and $(h, k): (Y,B) \to (Z,C)$ is defined as follows:
Let $\mathsf{p}: \mathcal{L}[\mathbb{X}] \to \mathbb{X}$ be the forgetful functor defined on objects as $\mathsf{p}(X,A) = X$ and on maps as $\mathsf{p}(f,g) = f$ .
Note that this is a subcategory of the simple fibration over $\mathbb{X}$ Jacobs (Reference Jacobs1999, Definition 1.3.1). It is then straightforward to show that:
Proposition 17. Let $\mathbb{X}$ be a CDC with differential combinator $\mathsf{D}$ . Then the forgetful functor ${\mathsf{p}: \mathcal{L}[\mathbb{X}] \to \mathbb{X}}$ is a fibration where the Cartesian maps are those of the form $(f, \pi_1): (X,A) \to (Y,A)$ .
It will be useful to have an explicit description of the fibres of this fibration:
Lemma 18. Let $\mathbb{X}$ be a CDC with differential combinator $\mathsf{D}$ . For any object $X \in \mathbb{X}$ , the fibre over X of the fibration $\mathsf{p}: \mathcal{L}[\mathbb{X}] \to \mathbb{X}$ is written as $\mathcal{L}[X]$ and given by
(i) The objects of $\mathcal{L}[X]$ are the same as the objects of $\mathbb{X}$ , that is, $Ob\left( \mathcal{L}[X] \right) = Ob\left( \mathbb{X} \right)$ ;
(ii) The maps of $\mathcal{L}[X]$ are maps $f: X \times A \to B$ which are linear in context X;
(iii) The identity map of A is defined as $\pi_1: X \times A \to A$ ;
(iv) The composition of maps $f: X \times A \to B$ and $g: X \times B \to C$ is defined as follows:
For every map $h: X \to Y$ , define the substitution functor $h^\ast: \mathcal{L}[Y] \to \mathcal{L}[X]$ on objects as $h^\ast(A) = A$ and on maps $f: Y \times A \to B$ as follows:
Furthermore, note that for the terminal object $\top$ , there is an isomorphism $\mathcal{L}[\top] \cong \mathsf{LIN}[\mathbb{X}]$ .
2.6 The coKleisli construction
In this section, we review a very important source of CDCs: the coKleisli categories of monoidal differential categories.
Before constructing the differential combinator, we must first describe the additive structure of the coKleisli category. So let $({\rm{!}}, \delta, \varepsilon)$ be a comonad on a category $\mathbb{X}$ with finite biproducts. Then $\mathbb{X}_{\rm{!}}$ is a Cartesian left additive category (Blute et al., Reference Blute, Cockett and Seely2009, Proposition 1.3.3) where the additive structure is defined as follows:
Furthermore, the injection maps, sum maps, lifting maps, and interchange maps in the coKleisli category are easily computed out to be:
If one starts with a differential category, then using the deriving transformation, we are able to construct a differential combinator for the coKleisli category.
Proposition 19. (Blute et al., Reference Blute, Cockett and Seely2009, Proposition 3.2.1) Let $\mathbb{X}$ be a monoidal differential category with coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ and deriving transformation $\mathsf{d}: {\rm{!}} A \otimes A \to {\rm{!}} A$ , and finite (bi)products (which we denote here using the product notation). Then the coKleisli category $\mathbb{X}_{\rm{!}}$ is a CDC with Cartesian left additive structure defined above and differential combinator $\mathsf{D}$ defined as follows on a coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ :
where $\chi_{A \times A}: {\rm{!}} (A \times A) \to {\rm{!}} A \otimes {\rm{!}} A$ is defined as in Definition 3.
It is important to note that the above proposition does not require the coalgebra modality to be monoidal or, equivalently, to have Seely isomorphisms. In addition, we could have also expressed the differential combinator of the coKleisli category in terms of the coderiving transformation $\mathsf{d}^\circ$ (Definition 2) as follows on a coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ :
We now turn our attention to giving an explicit description of the linear maps in the coKleisli category.
Lemma 20. Let $\mathbb{X}$ be a monoidal differential category with coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ and deriving transformation $\mathsf{d}: {\rm{!}} A \otimes A \to {\rm{!}} A$ , and finite (bi)products. Then:
(i) A coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ is linear in $\mathbb{X}_{\rm{!}}$ if and only if the following diagram commutes:
(ii) For every map $g: A \to B$ in $\mathbb{X}$ , $\left[\kern-0.15em\left[ \mathsf{F}_{\rm{!}}(g) \right]\kern-0.15em\right] = \varepsilon_A ; g: {\rm{!}} A \to B$ is linear in $\mathbb{X}_{\rm{!}}$ . Therefore, there is a functor $\mathsf{F}_{\mathsf{L}}: \mathbb{X} \to \mathsf{LIN}[\mathbb{X}_{\rm{!}}]$ defined on objects as $\mathsf{F}_{\mathsf{L}}(A) = A$ and on maps $g: A \to B$ as $\left[\kern-0.15em\left[ \mathsf{F}_{\mathsf{L}}(g) \right]\kern-0.15em\right] = \left[\kern-0.15em\left[ \mathsf{F}_{\rm{!}}(g) \right]\kern-0.15em\right] = \varepsilon_A ;\; g$ .
Proof. For (i), first observe that for any coKleisli map $\left[\kern-0.15em\left[ k \right]\kern-0.15em\right]: {\rm{!}} (A \times A) \to B$ , precomposing by $\left[\kern-0.15em\left[ \iota_1 \right]\kern-0.15em\right] = \varepsilon_A;\; \iota_1$ is equal to $\left[\kern-0.15em\left[ \iota_1;\; k \right]\kern-0.15em\right] = {\rm{!}}(\iota_1);\; \left[\kern-0.15em\left[ k \right]\kern-0.15em\right]$ . Therefore, for any coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ , we compute the following:
So $\left[\kern-0.15em\left[ \iota_1;\; \mathsf{D}[f] \right]\kern-0.15em\right] = \mathsf{d}^\circ_A;\; \left( {\rm{!}}(0) \otimes 1_A \right); \; \mathsf{d}_A;\; \left[\kern-0.15em\left[ f \right]\kern-0.15em\right]$ . Therefore, by definition, $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ is linear if and only if $ \mathsf{d}^\circ_A;\; \left( {\rm{!}}(0) \otimes 1_A \right); \; \mathsf{d}_A;\; \left[\kern-0.15em\left[ f \right]\kern-0.15em\right] = \left[\kern-0.15em\left[ \iota_1;\; \mathsf{D}[f] \right]\kern-0.15em\right] = \left[\kern-0.15em\left[ f \right]\kern-0.15em\right]$ . For (ii), we use the linear rule [d.3] to compute:
Therefore, $\left[\kern-0.15em\left[ \mathsf{F}_{\rm{!}}(g) \right]\kern-0.15em\right] = \varepsilon_A ;\; g$ is linear. As a consequence, $\mathsf{F}_{\mathsf{L}}: \mathbb{X} \to \mathsf{LIN}[\mathbb{X}_{\rm{!}}]$ is well-defined and is functor since $\mathsf{F}_{\rm{!}}: \mathbb{X} \to \mathbb{X}_{\rm{!}}$ is a functor.
It is important to note that for an arbitrary differential category $\mathbb{X}$ with finite products, not every linear map in the coKleisli category is of the form $\varepsilon_B;\; g$ . Therefore, $\mathsf{LIN}[\mathbb{X}_{\rm{!}}]$ is not necessarily isomorphic to the base category $\mathbb{X}$ . However, for differential storage categories, the desired isomorphism holds, which is a fundamental concept in differential linear logic.
Corollary 21. Let $\mathbb{X}$ be a monoidal differential storage category with coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ with Seely isomorphisms, deriving transformation $\mathsf{d}: {\rm{!}} A \otimes A \to {\rm{!}} A$ (or equivalently codereliction $\eta_A: A \to {\rm{!}} A$ ), and finite (bi)products. Then:
(i) A coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ is linear in $\mathbb{X}_{\rm{!}}$ if and only if the following diagram commutes:
(ii) $\mathsf{F}_{\mathsf{L}}: \mathbb{X} \to \mathsf{LIN}[\mathbb{X}_{\rm{!}}]$ is an isomorphism with inverse $\mathsf{F}^{-1}_{\mathsf{L}}: \mathsf{LIN}[\mathbb{X}_{\rm{!}}] \to \mathbb{X}$ defined on objects as $\mathsf{F}^{-1}_{\mathsf{L}}(A) = A$ and on maps $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ as $\mathsf{F}^{-1}_{L}\left( \left[\kern-0.15em\left[ f \right]\kern-0.15em\right] \right) = \eta_A;\; \left[\kern-0.15em\left[ f \right]\kern-0.15em\right]$ .
In other words, a coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ is linear in $\mathbb{X}_{\rm{!}}$ if and only if $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right] = \varepsilon_A;\; g$ for some (necessarily unique) map $g: A \to B$ in $\mathbb{X}$ .
Proof. For (i), recall that in an additive bialgebra modality, ${\rm{!}}(0) = e_A;\; u_A$ . Therefore, for any coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ we have that:
So $\mathsf{d}^\circ_A;\; \left( {\rm{!}}(0) \otimes 1_A \right);\; \mathsf{d}_A;\; \left[\kern-0.15em\left[ f \right]\kern-0.15em\right] = \eta_A;\; \varepsilon_A;\; \left[\kern-0.15em\left[ f \right]\kern-0.15em\right]$ . Then by Lemma 20(i), $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ is linear if and only if $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]= \mathsf{d}^\circ_A;\; \left( {\rm{!}}(0) \otimes 1_A \right);\; \mathsf{d}_A;\; \left[\kern-0.15em\left[ f \right]\kern-0.15em\right] = \eta_A;\; \varepsilon_A;\; \left[\kern-0.15em\left[ f \right]\kern-0.15em\right]$ .
For (ii), usually we would first have to check that $\mathsf{F}^{-1}_{\mathsf{L}}$ is a functor; that is, $\mathsf{F}^{-1}_{\mathsf{L}}$ preserves composition and identities. However, it turns out that there is a way around this by applying Mac Lane (2013, Chapter IV, Theorem 2) to isomorphisms. Briefly, if $\mathsf{F}: \mathbb{X} \to \mathbb{Y}$ is a functor and $\mathsf{G}: \mathbb{Y} \to \mathbb{X}$ is a well-defined mapping on objects and maps such that $\mathsf{F} \circ \mathsf{G} = 1_\mathbb{Y}$ and $\mathsf{G} \circ \mathsf{F} = 1_\mathbb{X}$ , then $\mathsf{G}$ is a functor and so $\mathsf{F}$ is an isomorphism with inverse $\mathsf{G}$ . By Lemma 20(ii), $\mathsf{F}_{\mathsf{L}}: \mathbb{X} \to \mathsf{LIN}[\mathbb{X}_{\rm{!}}]$ is a functor, so it remains to show that $\mathsf{F}^{-1}_{\mathsf{L}} \circ \mathsf{F}_\mathsf{L} = 1_\mathbb{X}$ and $\mathsf{F}_{\mathsf{L}} \circ \mathsf{F}^{-1}_\mathsf{L} = 1_{\mathsf{LIN}[\mathbb{X}_{\rm{!}}]}$ . Starting with the former, on objects this is immediate, $\mathsf{F}^{-1}_{\mathsf{L}}\mathsf{F}_\mathsf{L} (A) = A$ . While on maps, recall that the linear rule for the codereliction says that $\eta_A;\; \varepsilon_A = 1_A$ , therefore:
So $\mathsf{F}^{-1}_{\mathsf{L}} \circ \mathsf{F}_\mathsf{L} = 1_\mathbb{X}$ . For the other direction, on objects this is again immediate $\mathsf{F}_{\mathsf{L}}\mathsf{F}^{-1}_\mathsf{L} (A) = A$ . For a linear coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ , by Lemma 20(i), we have that:
So $\mathsf{F}_{\mathsf{L}} \circ \mathsf{F}^{-1}_\mathsf{L} = 1_{\mathsf{LIN}[\mathbb{X}_{\rm{!}}]}$ . Therefore, $\mathsf{F}^{-1}_{\mathsf{L}}: \mathsf{LIN}[\mathbb{X}_{\rm{!}}] \to \mathbb{X}$ is a functor and is an inverse of $\mathsf{F}_{\mathsf{L}}: \mathbb{X} \to \mathsf{LIN}[\mathbb{X}_{\rm{!}}]$ . So we conclude that $\mathbb{X} \cong \mathsf{LIN}[\mathbb{X}_{\rm{!}}]$ .
2.7 Equivalence of linear fibrations
Consider a monoidal differential category with finite products. On the one hand, we have the fibration ${\mathsf{p}_{\rm{!}}: \mathcal{L}_{\rm{!}}[\mathbb{X}] \to \mathbb{X}_{\rm{!}}}$ of Proposition 6 associated to any coalgebra modality. On the other hand, by Proposition 19, $\mathbb{X}_{{\rm{!}}}$ is a CDC, and so we also have its associated linear fibration ${\mathsf{p}: \mathcal{L}[\mathbb{X}_{\rm{!}}] \to \mathbb{X}_{\rm{!}}}$ of Proposition 17. The objective of this section is to show that they are in fact isomorphic (as fibrations over $\mathbb{X}_{{\rm{!}}}$ ). We begin by providing an explicit description of maps which are linear in context in the coKleisli category.
Lemma 22. Let $\mathbb{X}$ be a differential category with coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ and deriving transformation $\mathsf{d}: {\rm{!}} A \otimes A \to {\rm{!}} A$ , and finite (bi)products. Then:
(i) For every object X and A, the following diagram commutes:
(ii) A coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}}(X \times A) \to B$ is linear in context X if and only if the following diagram commutes:
(iii) For every map $g: {\rm{!}} X \otimes A \to B$ in $\mathbb{X}$ , the composite:
is linear in context X in $\mathbb{X}_{\rm{!}}$ .
Proof. For (i), recall the following useful compatibility relation between the deriving transformation and coderiving transformation (Cockett and Lemay, Reference Cockett and Lemay2018, Proposition 4.1):
Then by using the above identity and the biproduct coherences, we compute
For (ii), first note that for ${\left[\kern-0.15em\left[ k \right]\kern-0.15em\right]: {\rm{!}}\left( (X \times A) \times (X \times A) \right) \to B}$ , precomposing by ${\left[\kern-0.15em\left[ \iota_0 \times \iota_1 \right]\kern-0.15em\right] = \varepsilon_{X \times A};\; (\iota_0 \times \iota_1)}$ is equal to $\left[\kern-0.15em\left[ (\iota_0 \times \iota_1);\; k \right]\kern-0.15em\right] = {\rm{!}} (\iota_0 \times \iota_1);\; \left[\kern-0.15em\left[ k \right]\kern-0.15em\right]$ . Therefore, for any coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}}(X \times A) \to B$ , we compute
So
Therefore, by definition, $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}}(X \times A) \to B$ is linear in context X if and only if
For (iii), it is automatic by (i) that we have that:
Then by (ii), it follows that $\mathsf{d}^\circ_{X \times A};\; ({\rm{!}}(\pi_0) \otimes \pi_1);\; g$ is linear in context X.
Below we will show that, in fact, a coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}}(X \times A) \to B$ is linear in context X if and only if it is of the form $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right] = \mathsf{d}^\circ_{X \times A};\; ({\rm{!}}(\pi_0) \otimes \pi_1);\; g$ for some (necessarily unique) map $g: {\rm{!}} X \otimes A \to B$ . If we have the Seely isomorphisms, we may also re-express linearity in context using the codereliction.
Corollary 23. Let $\mathbb{X}$ be a differential storage category with coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ with Seely isomorphisms, deriving transformation $\mathsf{d}: {\rm{!}} A \otimes A \to {\rm{!}} A$ (or equivalently codereliction $\eta_A: A \to {\rm{!}} A$ ), and finite (bi)products. Then a coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}}(X \times A) \to B$ is linear in context X if and only if the following diagram commutes:
Proof. By expressing the deriving transformation in terms of the multiplication and codereliction, for any coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}}(X \times A) \to B$ , we compute:
So we have that:
Then by Lemma 22(ii), $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} (X \times A) \to B$ is linear if and only if the following equality holds:
So the desired equivalence holds.
We now prove the main result of this section, that we have an isomorphism of fibrations. It is important to note that the following result does not require the Seely isomorphisms.
Theorem 24. Let $\mathbb{X}$ be a monoidal differential category with coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ and deriving transformation $\mathsf{d}: {\rm{!}} A \otimes A \to {\rm{!}} A$ , and finite (bi)products. Then the fibrations ${\mathsf{p}_{\rm{!}}: \mathcal{L}_{\rm{!}}[\mathbb{X}] \to \mathbb{X}_{\rm{!}}}$ and ${\mathsf{p}: \mathcal{L}[\mathbb{X}_{\rm{!}}] \to \mathbb{X}_{\rm{!}}}$ are isomorphic via the functors $\mathsf{E}: \mathcal{L}_{\rm{!}}[\mathbb{X}] \to \mathcal{L}[\mathbb{X}_{\rm{!}}]$ and $\mathsf{E}^{-1}: \mathcal{L}[\mathbb{X}_{\rm{!}}] \to \mathcal{L}_{\rm{!}}[\mathbb{X}]$ where:
(i) $\mathsf{E}$ is defined on objects as $\mathsf{E}(X,A) = (X,A)$ , and on maps $(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right], g): (X,A) \to (Y,B)$ as follows:
(ii) $\mathsf{E}^{-1}$ is defined on objects as $\mathsf{E}^{-1}(X,A) = (X,A)$ , and on maps $(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right], \left[\kern-0.15em\left[ g \right]\kern-0.15em\right]): (X,A) \to (Y,B)$ as follows:
Proof. For this proof, we will follow the same strategy as in the proof of Corollary 21. We will first prove that $\mathsf{E}: \mathcal{L}_{\rm{!}}[\mathbb{X}] \to \mathcal{L}[\mathbb{X}_{\rm{!}}]$ is a well-defined morphism of fibrations (that is, a functor that preserves Cartesian maps). Then we will prove that $\mathsf{E} \circ \mathsf{E}^{-1} = 1_{ \mathcal{L}[\mathbb{X}_{\rm{!}}]}$ and $\mathsf{E}^{-1} \circ \mathsf{E} = 1_{ \mathcal{L}_{\rm{!}}[\mathbb{X}]}$ . Therefore, it follows that ${\mathsf{E}^{-1}: \mathcal{L}[\mathbb{X}_{\rm{!}}] \to \mathcal{L}_{\rm{!}}[\mathbb{X}]}$ is also a morphism of fibrations, and that $\mathsf{E}$ and $\mathsf{E}^{-1}$ are isomorphisms and inverses of each other.
By Lemma 22(iii), $\mathsf{E}$ is well-defined. To show that $\mathsf{E}$ preserve composition, first observe that by using Lemma 22(i) we see that composition in $\mathcal{L}[\mathbb{X}_{\rm{!}}]$ can be expressed as follows (which we leave as an exercise for the reader to work out for themselves):
Then by Lemma 22(ii), we compute that:
Next we show that $\mathsf{E}$ preserves identities. First note that the identity in $\mathcal{L}[\mathbb{X}_{\rm{!}}]$ is easily computed out to be $(\left[\kern-0.15em\left[ 1_X \right]\kern-0.15em\right], \left[\kern-0.15em\left[ \pi_1 \right]\kern-0.15em\right]) = (\varepsilon_X, \varepsilon_{X \times A};\; \pi_1)$ . Then we compute:
So $\mathsf{E}$ is a functor. To show that $\mathsf{E}$ is a fibration morphism, we must show that $\mathsf{p} \circ \mathsf{E} = \mathsf{p}_{\rm{!}}$ and that $\mathsf{E}$ preserves Cartesian maps. Starting with the former, this is straightforward on both maps and objects since $\mathsf{p}\mathsf{E}(X,A) = X = \mathsf{p}_{\rm{!}}(X,A)$ and $\mathsf{p}\mathsf{E}(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right], g) = \left[\kern-0.15em\left[ f \right]\kern-0.15em\right] = \mathsf{p}_{\rm{!}}(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right],g)$ . Next we must show that $\mathsf{E}$ also preserves Cartesian maps. By Proposition 17, note that in $\mathcal{L}[\mathbb{X}_{\rm{!}}]$ these are easily computed out to be the maps of the form $(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right], \left[\kern-0.15em\left[ \pi_1 \right]\kern-0.15em\right]) = (\left[\kern-0.15em\left[ f \right]\kern-0.15em\right], \varepsilon_{X \times A};\; \pi_1)$ , while we recall that by Proposition 6, Cartesian maps in $\mathcal{L}_{\rm{!}}[\mathbb{X}]$ are of the form $(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right], e_X \otimes 1_A)$ . By a similar calculation as the one above, we easily compute that $\mathsf{E} (\left[\kern-0.15em\left[ f \right]\kern-0.15em\right], e_X \otimes 1_A) = (\left[\kern-0.15em\left[ f \right]\kern-0.15em\right], \varepsilon_{X \times A};\; \pi_1)$ . So $\mathsf{E}$ preserves Cartesian maps, and we conclude that $\mathsf{E}$ is a fibration morphism.
Next we show that $\mathsf{E}$ and $\mathsf{E}^{-1}$ are inverses of each other. Starting with $\mathsf{E} \circ \mathsf{E}^{-1}$ , clearly on objects $\mathsf{E}\mathsf{E}^{-1}(X,A) = (X,A)$ , while on maps we use Lemma 22(ii):
So $\mathsf{E} \circ \mathsf{E}^{-1} = 1_{ \mathcal{L}[\mathbb{X}_{\rm{!}}]}$ . Next for $\mathsf{E}^{-1} \circ \mathsf{E}$ , again this is clear on objects since $\mathsf{E}^{-1}\mathsf{E}(X,A) = (X,A)$ , while on maps we use Lemma 22(i):
So $\mathsf{E}^{-1} \circ \mathsf{E} = 1_{ \mathcal{L}_{\rm{!}}[\mathbb{X}]}$ . As a consequence, it follows that $\mathsf{E}^{-1}$ is also a fibration morphism. Therefore, $\mathsf{E}$ and $\mathsf{E}^{-1}$ are fibration isomorphisms and inverses of each other, and so we conclude that the fibrations ${\mathsf{p}_{\rm{!}}: \mathcal{L}_{\rm{!}}[\mathbb{X}] \to \mathbb{X}_{\rm{!}}}$ and ${\mathsf{p}: \mathcal{L}[\mathbb{X}_{\rm{!}}] \to \mathbb{X}_{\rm{!}}}$ are isomorphic.
As an immediate consequence, we have that fibres over the same object are isomorphic.
Corollary 25. Let $\mathbb{X}$ be a monoidal differential category with coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ and deriving transformation $\mathsf{d}: {\rm{!}} A \otimes A \to {\rm{!}} A$ , and finite (bi)products. Then for each object X, $\mathcal{L}_{\rm{!}}[X]$ is isomorphic to $\mathcal{L}[\mathbb{X}]$ via the functors $\mathsf{E}_X: \mathcal{L}_{\rm{!}}[X] \to \mathcal{L}[X]$ and $\mathsf{E}^{-1}_X: \mathcal{L}[X] \to \mathcal{L}_{\rm{!}}[X]$ where:
(i) $\mathsf{E}_X$ is defined on objects as $\mathsf{E}_X(A) = A$ , and on maps $g: {\rm{!}} X \otimes A \to B$ as follows:
(ii) $\mathsf{E}_X^{-1}$ is defined on objects as $\mathsf{E}^{-1}_X(A) = A$ , and on maps $\left[\kern-0.15em\left[ g \right]\kern-0.15em\right]: {\rm{!}}(X \times A) \to B$ as follows:
As such, a coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}}(X \times A) \to B$ is linear in context X if and only if there exists a (necessarily unique) map $g: {\rm{!}} X \otimes A \to B$ such that $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right] = \mathsf{d}^\circ_{X \times A};\; ({\rm{!}}(\pi_0) \otimes \pi_1);\; g$ .
3. Cartesian Reverse Differential Categories
In this section, we recall the key definitions and results on CRDCs from Cockett et al. (Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020).
3.1 Definition
Definition 26. (Cockett et al., Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Definition 13) A CRDC is a Cartesian left additive category $\mathbb{X}$ equipped with a reverse differential combinator $\mathsf{R}$ , which is a family of operators ${\mathsf{R}: \mathbb{X}(A,B) \to \mathbb{X}(A \times B,A)}$ , $f \mapsto \mathsf{R}[f]$ , where $\mathsf{R}[f]$ is called the reverse derivative of f, such that the following seven axioms hold:
[RD.1] Additivity of reverse differentiation:
[RD.2] Additivity of the reverse derivative in its second variable:
[RD.3] Coherence with identities and projections:
[RD.4] Coherence with pairings:
[RD.5] Reverse chain rule:
[RD.6] Linearity of the reverse derivative in its second variable:
[RD.7] Symmetry of mixed partial derivatives:
For more discussion on the definition and examples, see Cockett et al. (Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020). One of the central results of that paper is that any CRDC also has the structure of a CDC:
Theorem 27. (Cockett et al., Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Theorem 16) If $\mathbb{X}$ is a CRDC with reverse differential combinator $\mathsf{R}$ , then $\mathbb{X}$ is also a CDC, where for a map $f: A \to B$ , its derivative $D[f]: A \times A \to B$ is defined as follows:
In general, however, there is no reason why a CDC should have the structure of a CRDC. In the next two sections, we look at what additional structure is needed on a CDC to get a CRDC.
3.2 Dagger fibrations
In this section, we review the structure necessary to go from a CDC to a reverse CDC: a dagger fibration structure on its fibration of linear maps.
The idea of a dagger fibration is to capture (from the fibrational point of view) the idea of each fibre being a dagger category. Recalling that a dagger category involves a functor from a category to its opposite, from the fibrational point of view, this must then involve a map from a fibration to its dual fibration, a fibration which takes the original fibration and takes the opposite category in each fibre. This fibration can be defined directly as follows:
Definition 28. (Jacobs, Reference Jacobs1999, Defn. 1.10.11) If $p: \mathbb{E} \to \mathbb{B}$ is a fibration, its dual fibration is the fibration $p^*: \mathbb{E}^* \to \mathbb{B}$ given by:
(i) The objects of $\mathbb{E}^\ast$ are the same as the objects of $\mathbb{E}$ ; that is, $Ob(\mathbb{E}^\ast) = Ob(\mathbb{E})$ ;
(ii) A map from E to E’ in $\mathbb{E}^*$ consists of an equivalence class of spans
where v is vertical and c Cartesian. Such a span is equivalent to (S’,v’,c’) if there is a vertical isomorphism $\alpha: S \to S'$ which makes the relevant triangles commute.
The following are our two primary examples of the dual fibration construction.
Example 29. The dual fibration of the fibration of Proposition 6 has objects pairs (X,A), with a map from (X,A) to (Y,B) consisting of a coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} X \to Y$ and a map
Example 30. The dual fibration of the fibration of Proposition 17 has objects pairs (X,A), with a map from (X,A) to (Y,B) consisting of a map $f: X \to Y$ and a map
Lemma 31. (Jacobs, Reference Jacobs1999, Lemma 1.10.12) If $p: \mathbb{E} \to \mathbb{B}$ is a fibration, then:
(i). For each object B of $\mathbb{B}$ , there is an isomorphism of categories
which is natural in B;
(ii) There is an isomorphism of fibrations $(\mathbb{E}^*)^* \cong \mathbb{E}$ over $\mathbb{B}$ .
The following does not appear in any published accounts on the dual fibration, but is straightforward:
Lemma 32. If $(p: \mathbb{E} \to \mathbb{B})$ and $(p': \mathbb{E}' \to \mathbb{B}')$ are fibrations and
is a morphism of fibrations, then there is a morphism of fibrations
which sends a span $(S, v, C): X \to X'$ to $(FS,F(v),F(c)): FX \to FX'$ .
We can now succinctly define what it means for a fibration to have dagger category structure in each fibre.
Definition 33. (Cockett et al., Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Definition 33) A dagger fibration consists of a fibration $p: \mathbb{E} \to \mathbb{B}$ together with a morphism of fibrations
which is stationary on objects and “is its own inverse”; that is, the composite
is the identity functor.
Example 34. Let us consider what it would mean to have dagger structure on the linear fibration $\mathcal{L}[X]$ (Definition 15) of a CDC $\mathbb{X}$ . In particular, this would mean that for each map $f: X \times A \to B$ which is linear in context X, we would need to give a map $f^{\dagger[X]}: X \times B \to A$ which is also linear in context X. The axioms for a dagger fibration are then equivalent to asking that each fibre $\mathcal{L}[X]$ (Lemma 18) be a dagger category with dagger $\dagger[X]: \mathcal{L}[X]^{\mathsf{op}} \to \mathcal{L}[X]$ such that each substitution functor preserves the dagger. Explicitly, the operation $\dagger[\!-\!]$ satisfies the following:
(i) Contravariant functoriality: $(\langle \pi_0, f \rangle;\; g)^{\dagger[X]} = \langle \pi_0, g^{\dagger[X]} \rangle;\; f^{\dagger[X]}$ and $\pi_1^{\dagger[X]} = \pi_1$
(ii) Involutive: ${f^{\dagger[X]}}^{\dagger[X]} =f$
(iii) Change of Base: for every map $h: Y \to X$ in $\mathbb{X}$ , its associated substitution functor $h^\ast: \mathcal{L}[X] \to \mathcal{L}[Y]$ preserves the dagger, that is, $(h^\ast(f))^{\dagger[Y]} = h^\ast\left( f^{\dagger[X]} \right)$ .
If $\mathbb{X}$ is a CRDC, then by Theorem 27, it is also a CDC, and in this case its associated linear fibration has dagger structure:
Theorem 35. (Cockett et al., Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Theorem 37) If $\mathbb{X}$ is a CRDC, then its associated fibration of linear maps $\mathcal{L}[\mathbb{X}]$ is a dagger fibration, where for a map $f: X \times A \to B$ which is linear in context X,
It will also be useful (see Lemma 53) to characterize when the fibration associated to any coalgebra modality has dagger fibration structure.
Example 36. To give a dagger fibration structure on the fibration of Proposition 6 corresponds to associating every map $f: {\rm{!}} X \otimes A \to B$ to a map $f^{\dagger[X]}: {\rm{!}} X \otimes B \to A$ , which we draw in the graphical calculus simply as
Once again, the axioms for a dagger fibration in this case are equivalent to asking that each of the fibres $\mathcal{L}_{\rm{!}}[X]$ be a dagger category with dagger $\dagger[X]: \mathcal{L}_{\rm{!}}[X]^{\mathsf{op}} \to \mathcal{L}_{\rm{!}}[X]$ such that each substitution functor preserves the dagger. Explicitly, the operation $\dagger[\!-\!]$ satisfies the following:
(i) Contravariant functorality:
-
(ii) Involution: ${f^{\dagger[X]}}^{\dagger[X]} =f$
-
(iii) Change of Base: For every coKleisli map $\left[\kern-0.15em\left[ h \right]\kern-0.15em\right]: {\rm{!}} Y \to X$ , the substitution functor ${\left[\kern-0.15em\left[ h \right]\kern-0.15em\right]^\ast: \mathcal{L}_{\rm{!}}[Y] \to \mathcal{L}_{\rm{!}}[X]}$ preserves the dagger, that is, $ \left( (\delta \otimes 1) ({\rm{!}}(h) \otimes 1) f \right)^{\dagger[Y]} = (\delta \otimes 1) ({\rm{!}}(h) \otimes 1)f^{\dagger[X]}$
3.3 Characterization theorem for CRDCs
With one extra ingredient, the structure described in Theorem 35 is enough to characterize CRDCs.
Definition 37. A CDC $\mathbb{X}$ has a contextual linear dagger if $\mathcal{L}[\mathbb{X}]$ has a dagger fibration structure for which each fibre $\mathcal{L}[\mathbb{X}]$ has dagger biproducts.
Theorem 38. (Cockett et al., Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Theorem 42) A CRDC is precisely the same as a Cartesian differential $\mathbb{X}$ with a contextual linear dagger.
In particular, given such a CDC, its reverse combinator is given by taking the dagger of its (forward) derivative $\mathsf{D}$ . So for a map $f: A \to B$ , its derivative is $\mathsf{D}[f]: A \times A \to B$ (which is linear in its second variable by [CD.6]), so we define the reverse derivative as follows:
4. Monoidal Reverse Differential Categories
This section introduces the main subject of the article: MRDCs. As noted in the introduction, we would like these structures to satisfy several requirements:
(1) Just as every CRDC is a CDC, so should every MRDC be a monoidal differential category.
(2) Just as every monoidal differential storage category has CDC structure on its coKleisli category, so should every monoidal reverse differential storage category have Cartesian reverse differential structure on its coKleisli category.
(3) Examples of this structure should be interesting and varied.
In the next section, we will see that requirements 1 and 2 will force MRDCs to be self-dual compact closed.
4.1 MRDCs should be self-dual compact closed
First, let us recall the relevant definitions.
Definition 39. In a symmetric monoidal category, a self-dual object (Heunen and Vicary, Reference Heunen and Vicary2019, Definition 3.1) is a triple $(A, \cup_A, \cap_A)$ consisting of an object A and two maps $\cup_A: A \otimes A \to k$ and $\cap_A: k \to A \otimes A$ , drawn in the graphical calculus as follows:
such that the following diagram commutes (often called the snake equations):
A self-dual object $(A, \cup_A, \cap_A)$ is said to satisfy the twist equations if the cup and cap are symmetry invariant, that is, the following diagram commutes:
A self-dual compact-closed category (Reference SelingerSelinger, 2010, Section 5) is a symmetric monoidal category $\mathbb{X}$ equipped with a family of maps $\cup_A: A \otimes A \to k$ and $\cap_A: k \to A \otimes A$ such that for each object A, $(A, \cup_A, \cap_A)$ is a self-dual object which satisfies the twist equations.
Without loss of generality, in a self-dual compact-closed category, we use the convention that for each pair of objects A and B:
and that for the unit k, $\cup_k = \cap_k = 1_k$ (which are just empty drawings in the graphical calculus). We also point out that the twist equations are not strictly necessary for the story of this paper, or for defining a compact-closed category where $A = A^\ast$ as in Selinger (Reference Selinger2010). However, in this paper, we have elected to include them in the definition as it is more practical, greatly simplifying our string diagram computations, and all of the examples of MRDCs that we have discovered so far satisfy this twist equation. Furthermore, in most of the literature when considering self-dual compact-closed categories, the twist equations are often taken as axioms such as for the ZX calculus (Coecke and Duncan, Reference Coecke and Duncan2008), quantum computing (which considers the free self-dual compact-closed PROP) (Hadzihasanovic, Reference Hadzihasanovic2015), and hypergraph categories (Fong and Spivak, Reference Fong and Spivak2019).
We now discuss the induced dagger functor of a self-dual compact-closed category:
Lemma 40. (Selinger, Reference Selinger2010, Remark 4.5) Let $\mathbb{X}$ be a self-dual compact-closed category with cups $\cup$ and caps $\cap$ . Then $\mathbb{X}$ is a dagger category whose dagger functor $(\_)^\ast: \mathbb{X}^{\mathsf{op}} \to \mathbb{X}$ is defined on objects as $A^\ast = A$ and for a map $f: A \to B$ , $f^\ast: B \to A$ (Heunen and Vicary, Reference Heunen and Vicary2019, Definition 3.9) is defined as follows:
Explicitly, $1_A^\ast = 1_A$ , $(f;\;g)^\ast = g^\ast;\;f^\ast$ and $f^{\ast\ast} = f$ . Furthermore, for any map $f: A \to B$ , the following diagrams commute (Reference Heunen and VicaryHeunen and Vicary, 2019, Lemma 3.12 & 3.26):
The above identities are sometimes referred to as the sliding equations.
We will now justify that MRDCs should be self-dual compact closed. We begin in the Seely case. By requirement 1. at the beginning of this section, if we start with an MRDC which satisfies the Seely requirements, then it should be a differential storage category $\mathbb{X}$ , with coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ (with the Seely isomorphisms), and deriving transformation $\mathsf{d}: {\rm{!}} A \otimes A \to {\rm{!}} A$ (or equivalently codereliction $\eta_A: A \to {\rm{!}} A$ ). Of course this means that the coKleisli category $\mathbb{X}_{\rm{!}}$ is a CDC. But then by requirement (2), $\mathbb{X}_{\rm{!}}$ should be a CRDC, which means its fibration of linear maps $\mathcal{L}[\mathbb{X}_{\rm{!}}]$ is a dagger fibration. As explained in Example 34, each of the fibres $\mathcal{L}[X]$ is then a $\dagger$ -category. By Corollary 21, we also have the isomorphisms $\mathcal{L}[0] \cong \mathsf{LIN}[\mathbb{X}_{\rm{!}}] \cong \mathbb{X}$ , which implies that the base category $\mathbb{X}$ is itself a $\dagger$ -category. To distinguish between the two dagger structures, we will use $\dagger$ for the daggers in $\mathbb{X}_{\rm{!}}$ , and $\ast$ for the daggers in $\mathbb{X}$ . If we assume that $\dagger$ is monoidal, then using the comonad counit $\varepsilon$ and the codereliction $\eta$ , we can now build cups and caps to make every object of $\mathbb{X}$ a self-dual object. Unfortunately, since the twist equation is non-canonical, it does not appear to come for free from this approach, and so $\mathbb{X}$ is not a self-dual compact-closed category as defined in this paper. Of course this should somewhat be expected since we asked for the twist equation for practical reasons. Nevertheless, this still justifies the link between reverse differentiation and self-duality.
Using Theorem 24, pushing the dagger through the fibration equivalence $\mathcal{L}[\mathbb{X}_{\rm{!}}] \cong \mathcal{L}_{\rm{!}}[\mathbb{X}]$ , we also have that $\mathcal{L}_{\rm{!}}[\mathbb{X}]$ is a dagger fibration as in Example 36. Consider then the map $\varepsilon_A: {\rm{!}} A \to A$ interpreted as a map in the fibre $\mathcal{L}_{\rm{!}}[\mathbb{X}](k,A)$ . Taking its dagger we obtain a map $\varepsilon_A^{\dagger[A]}: {\rm{!}} A \otimes A \to k$ . Precomposing this map with the codereliction we obtain our cup $\cup_A: A \otimes A \to k$ :
To build the cap $\cap_A: k \to A \otimes A$ , we use the dagger $\ast$ on the base category $\mathbb{X}$ :
Lemma 41. Let $\mathbb{X}$ be a differential storage category. Suppose $\mathbb{X}_{\rm{!}}$ is a CRDC, and therefore has a contextual linear dagger $\dagger$ . If $\dagger$ is (strict) monoidal then every object of $\mathbb{X}$ is a self-dual object where the cups and caps are defined as above.
Proof. We must show that the cups and caps satisfy the snake equations. To do so we will need some simple identities. First observe that for a map $f: {\rm{!}} X \otimes A \to B$ and a map $g: B \to D$ , since the dagger is contravariant, it straightforward to show that:
Next, using the assumption that $\dagger$ is monoidal, we have that for any map $f: {\rm{!}} X \otimes A \to B$ and any object C:
The last required identity comes from the fact that in any CDC with a contextual linear dagger, the dagger preserves linearity in context. Translating this in terms of the fibration $\mathcal{L}_{\rm{!}}[\mathbb{X}]$ , by Corollary 21 we have that for any map $f: {\rm{!}} X \otimes A \to B$ :
Now we compute one of the snake equations:
The proof for the other snake equation is similar. So we conclude that $(A, \cup_A, \cap_A)$ is a self-dual object, and that $\mathbb{X}$ is a self-dual compact-closed category.
Even if the coalgebra modality does not have the Seely isomorphisms, it is still possible to show that in each of the fibres $\mathcal{L}_{\rm{!}}[X]$ every object is self-dual. However, the computations in the proof are more complicated and not necessarily more enlightening. So we will omit the proof and simply provide the construction. In the fibre $\mathcal{L}_{\rm{!}}[X]$ , the cup is the map $\cup^X_A: {\rm{!}} X \otimes A \otimes A \to k$ defined as follows:
while the cap $\cap^X_A: {\rm{!}} X \to A \otimes A$ is defined as the dagger in the fibre of the cap: $\cap^X_A := {\cup^X_A}^{\dagger[X]}$ .
4.2 Definition and examples
Given the discussion of the previous section, MRDCs should at least be self-dual compact closed. We now give the full definition.
Definition 42. A MRDC is an additive symmetric monoidal category $\mathbb{X}$ , such that $\mathbb{X}$ is a self-dual compact-closed category, equipped with a coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ and a reverse deriving transformation which is a family of maps $\mathsf{r}_A: {\rm{!}} A \otimes {\rm{!}} A \to A$ drawn in the graphical calculus as:
such that the following axioms hold:
[r.N] Reverse naturality rule:
[r.1] Reverse constant rule:
[r.2] Reverse Leibniz rule (or Reverse product rule):
[r.3] Reverse linear rule:
[r.4] Reverse chain rule:
[r.5] Reverse interchange rule:
Before we get to examples, it will be useful to describe the relationship between monoidal differential categories and MRDCs. In particular, it shows that our definition satisfies requirement 1 described in the introduction to this section.
Theorem 43. A reverse differential category is precisely a differential category which is also self-dual compact closed. Explicitly:
(i) If $\mathbb{X}$ is a reverse differential category, then $\mathbb{X}$ is a differential category where the deriving transformation $\mathsf{d}_A: {\rm{!}} A \otimes A \to {\rm{!}} A$ is defined as
(ii) If $\mathbb{X}$ is a differential category which is also a self-dual compact-closed category, then $\mathbb{X}$ is a reverse differential category where the reverse deriving transformation is defined as
Furthermore, these constructions are inverses of each other.
Remark 44. As noted above, the story of these MRDCs could, in theory, be told without assuming the twist equation. If we drop that axiom, then the above constructions include an extra twist (depending on the convention of if $A^\ast$ if on the left/right for the cup/cap):
However, as explained above, we have elected to assume the twist equation to simplify our string diagrams.
Proof. The axioms of a reverse deriving transformation $\mathsf{r}$ correspond precisely to the axioms of a deriving transformation $\mathsf{d}$ . Naturality of $\mathsf{d}$ corresponds to the reverse naturality rule [r.N], while [d.n] corresponds to [r.n] for $n = 1, 2, ..., 5$ . The correspondence follows from using the snake equations (Definition 39) and the sliding equations (Lemma 40). Since the computations are all similar, we will not work out the full proof in detail and will instead provide two examples, prove that the constructions are inverses of each other, and then leave the rest as an exercise for the reader.
Starting with a reverse deriving transformation $\mathsf{r}$ , we will show that the constructed $\mathsf{d}$ satisfies the Leibniz rule [d.2] by using the snake equations and sliding equations on the reverse Leibniz rule [r.2]:
To show that $\mathsf{d}$ is natural and satisfies the rest of the deriving transformation axioms is similar. Conversely, starting with a deriving transformation $\mathsf{d}$ , we will show that the constructed $\mathsf{r}$ satisfies the reverse chain rule [r.4] by using the snake equations and sliding equations on the chain rule [d.4]:
To show that $\mathsf{r}$ satisfies the rest of the reverse deriving transformations axioms is similar. Lastly, we use the snake equations to prove that these constructions are inverses of each other:
So we conclude that a reverse differential category is precisely the same thing as differential category which is also self-dual compact closed.
We note that a self-dual compact-closed differential category is also a codifferential category (the dual of a differential category). Indeed, observe that if $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ is a coalgebra modality on a self-dual compact-closed category $\mathbb{X}$ , then we can define an algebra modality (the dual of a coalgebra modality). The monad endofunctor ${\rm{?}}: \mathbb{X} \to \mathbb{X}$ is defined on objects as ${\rm{?}} A = {\rm{!}} A$ and on maps ${\rm{?}} f = \left( {\rm{!}}(f^\ast) \right)^\ast$ , while the remaining natural transformations are the duals of the coalgebra modality natural transformations. Explicitly, $({\rm{?}}, \delta^\ast, \varepsilon^\ast, \Delta^\ast, e^\ast)$ is an algebra modality on $\mathbb{X}$ . It is crucial to observe that ${\rm{!}}$ and ${\rm{?}}$ are equal on objects but not maps, and that $\delta^\ast$ , $\varepsilon^\ast$ , $\Delta^\ast$ and $e^\ast$ are not necessarily natural with respect to ${\rm{!}}$ . For example, ${\rm{?}}(f);\; \Delta^\ast = \Delta^\ast;\; ({\rm{?}}(f) \otimes {\rm{?}}(f))$ but $\Delta^\ast;\; {\rm{!}}(f) $ may not be equal to $({\rm{!}}(f) \otimes {\rm{!}}(f));\; \Delta^\ast$ . Furthermore, if $\mathbb{X}$ is also a differential category, where $\mathsf{d}$ is a deriving transformation for $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ , then $\mathbb{X}$ is also a codifferential category where $\mathsf{d}^\ast$ is a deriving transformation for $({\rm{?}}, \delta^\ast, \varepsilon^\ast, \Delta^\ast, e^\ast)$ .
Before providing examples of reverse differential categories, let us first discuss the relation between the reverse deriving transformation and the coderiving transformation. Indeed, observe that if $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ is a coalgebra modality on a self-dual compact-closed category $\mathbb{X}$ , then there is a canonical map of the desired type ${\rm{!}} A \otimes {\rm{!}} A \to A$ defined as the composite:
While this is a map of the right type, this is not automatically a reverse deriving transformation. This map is a reverse deriving transformation if and only if the coderiving transformation is a deriving transformation for the induced algebra modality.
Lemma 45. If $\mathbb{X}$ is a self-dual compact-closed category, which is additive symmetric monoidal and equipped with a coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ , the following are equivalent:
(1) $\mathsf{r} := (1 \otimes \mathsf{d}^\circ);\;(\cup \otimes 1)$ is a reverse deriving transformation;
(2) $\mathsf{d}^\circ$ is a deriving transformation for the algebra modality $({\rm{?}}, \delta^\ast, \varepsilon^\ast, \Delta^\ast, e^\ast)$ ; that is, the dual diagrams of Definition 10 commute.
Furthermore, in this case, $\mathsf{d}^\ast = \mathsf{d}^\circ$ .
Proof. For $(i) \Rightarrow (ii)$ : by Theorem 43, we obtain a deriving transformation $\mathsf{d}_A: {\rm{!}} A \otimes A \to {\rm{!}} A$ . We will now show that $\mathsf{d}^\ast = \mathsf{d}^\circ$ . So using the snake equations and Theorem 43, we compute
So $\mathsf{d}^\ast = \mathsf{d}^\circ$ . Therefore, by the above discussion, $\mathsf{d}^\ast = \mathsf{d}^\circ$ is a deriving transformation for the algebra modality $({\rm{?}}, \delta^\ast, \varepsilon^\ast, \Delta^\ast, e^\ast)$ . Conversely, for $(ii) \Rightarrow (i)$ : by the dual of the above discussion, we have that ${\mathsf{d}^\circ}^\ast$ is a deriving transformation for the coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ . Then by Theorem 43, we obtain a reverse deriving transformation $\mathsf{r}_A: {\rm{!}} A \otimes {\rm{!}} A \to A$ . Expanding out the construction, we compute
So we conclude that $\mathsf{r} := (1 \otimes \mathsf{d}^\circ);\;(\cup \otimes 1)$ .
In the presence of Seely isomorphisms, if all the important structure maps are duals of one another, then the reverse deriving transformation is of the above form.
Definition 46. A reverse differential storage category is a reverse differential category with finite products whose coalgebra modality has Seely isomorphisms and such that $\eta = \varepsilon^\ast$ , $\nabla= \Delta^\ast$ , and $u = e^\ast$ (where $\eta$ is the induced codereliction, and $\nabla$ and $\mathsf{u}$ are the induced natural monoid structure).
Corollary 47. A reverse differential storage category is precisely a differential storage category which is also self-dual compact closed and such that $\eta = \varepsilon^\ast$ , $\nabla= \Delta^\ast$ , and $u = e^\ast$ . Furthermore, in a reverse differential storage category, the reverse deriving transformation is of the form $\mathsf{r} = (1 \otimes \mathsf{d}^\circ);\;(\cup \otimes 1)$ .
Proof. The first part of the statement is simply an extension of Theorem 43. For the second part, recall that in a differential storage category the deriving transformation is of the form $\mathsf{d} = (1 \otimes \eta);\; \nabla$ . By assumption, the dual of the deriving transformation is computed out to be:
So $\mathsf{d}^\ast = \mathsf{d}^\circ$ . However, recall that since we are in the self-dual case, $\mathsf{d}^\ast = \mathsf{d}^\circ$ is a deriving transformation for the algebra modality $({\rm{?}}, \delta^\ast, \varepsilon^\ast, \Delta^\ast, e^\ast)$ . Then by Lemma 45, it follows that $\mathsf{r} = (1 \otimes \mathsf{d}^\circ);\;(\cup \otimes 1)$ .
We conclude this section with examples of reverse differential categories.
Example 48. Let $\mathsf{REL}$ be the category of sets and relations, where we recall that the objects are sets and the maps are relations between them; that is, a relation from a set X to a set Y, denoted $R: X \to Y$ , is a subset $R \subseteq X \times Y$ . $\mathsf{REL}$ is a symmetric monoidal category where the monoidal product is given by the Cartesian product of sets, $X \otimes Y = X \times Y$ , and where the monoidal unit is a chosen singleton $\lbrace \ast \rbrace$ . With this monoidal structure, $\mathsf{REL}$ is also a self-dual compact-closed category where for a set X, its cup $\cup_X: X \times X \to \lbrace \ast \rbrace$ and cap $\cap_X: \lbrace \ast \rbrace \to X \times X$ are the dual relations which relate the single element to all pairs of copies of elements of X:
$\mathsf{REL}$ is also a (monoidal) differential category. The additive symmetric monoidal structure is induced by the biproduct, which is given by the disjoint union of sets, $X \sqcup Y$ , and where the terminal object is the empty set $\emptyset$ . As such, the sum of parallel relations ${R: X \to Y}$ and $S: X \to Y$ is defined as their union $R + S := R \cup S$ , while the zero map $0: X \to Y$ is the empty relation $0 := \emptyset$ . The coalgebra modality on $\mathsf{REL}$ is given by finite bags (also called finite multisets). So for a set X, let ${\rm{!}} X$ be the set of all finite bags of X. This coalgebra modality has the Seely isomorphisms so ${\rm{!}}(X \sqcup Y) \cong {\rm{!}} X \times {\rm{!}} Y$ and $ {\rm{!}} \emptyset \cong \lbrace \ast \rbrace$ . The deriving transformation ${\mathsf{d}_X: {\rm{!}} X \times X \to {\rm{!}} X}$ is defined as the relation which adds an element into the bag:
where $\left[\kern-0.15em\left[ x \right]\kern-0.15em\right]$ is the one element bag and $\sqcup$ is the (necessarily disjoint) union of finite bags. For more details on this differential category, see Blute et al. (Reference Blute, Cockett and Seely2006, Section 2.5.1). By applying Theorem 43, $\mathsf{REL}$ is also a reverse differential category where the reverse deriving transformation $\mathsf{r}_X: {\rm{!}} X \times {\rm{!}} X \to X$ is the relation that relates two bags that differ by one element to that said element:
Example 49. The above example generalizes to the weighted relational model (Laird et al., Reference Laird, Manzonetto, McCusker and Pagani2013; Ong, Reference Ong2017). The underlying category is the biproduct completion of a complete commutative semiring. Briefly recall that a complete commutative semiring is a commutative semiring where one can have sums indexed by arbitrary sets I, which we denote by $\sum \limits_{i \in I}$ , such that these summation operations satisfy certain distributivity and partitions axioms, see (Ong, Reference Ong2017, Section III.B). For a complete commutative semiring R, define the category $R^\Pi$ whose objects are sets X and where a map from X to Y is a set function $f: X \times Y \to R$ , and where composition and identities are defined as in (Ong, Reference Ong2017, Section III.B). Note that when we take the two-element Boolean algebra $B = \lbrace 0, 1 \rbrace$ , then $B^\Pi$ is isomorphic to $\mathsf{REL}$ . For any complete commutative semiring R, $R^\Pi$ is a symmetric monoidal category where the monoidal product is given by the Cartesian product of sets, $X \otimes Y = X \times Y$ , and where the monoidal unit is a chosen singleton $\lbrace \ast \rbrace$ . $R^\Pi$ is also a self-dual compact-closed category where for a set X, its cup $\cup_X: X \times X \to \lbrace \ast \rbrace$ and cap $\cap_X: \lbrace \ast \rbrace \to X \times X$ are the functions defined as follows:
$R^\Pi$ is also a differential category. The additive symmetric monoidal structure is induced by the biproduct, which is given by the disjoint union of sets, $X \sqcup Y$ , and where the terminal object is the empty set $\emptyset$ . The sum of ${f: X \to Y}$ and $g: X \to Y$ (which recall are functions $X \times Y \to R$ ) is defined pointwise, $(f+g)(x,y) = f(x,y) + g(x,y)$ , while the zero map $0: X \to Y$ is the function which maps everything to zero, $0(x,y) = 0$ . The coalgebra modality on $R^\Pi$ is again given by finite bags, that is, for a set X, let ${\rm{!}} X$ be the set of all finite bags of X, and this coalgebra modality has the Seely isomorphisms as in the previous example. The deriving transformation ${\mathsf{d}_X: {\rm{!}} X \times X \to {\rm{!}} X}$ is defined as follows:
where $\vert B \vert$ is the cardinality of the finite bag. The image by $\vert B^\prime \vert = \vert B \vert + 1$ takes into account that if we were in the unordered case, there would be $n+1$ possible ways of putting an element into a bag of size n. Of course the factor disappears in the case that the semiring is additively idempotent (i.e. $1+1 = 1$ ), such as the two-element Boolean algebra B. Which is why the factor does not appear in the differential structure of $\mathsf{REL}$ as described in the previous example. For more details on this differential category, see Lemay (Reference Lemay2020, Section 6). By applying Theorem 43, $R^\Pi$ is also a reverse differential category where the reverse deriving transformation $\mathsf{r}_X: {\rm{!}} X \times {\rm{!}} X \to X$ is defined as follows:
Example 50. Let k be a field and $\mathsf{FVEC}_k$ the category of finite dimensional k-vector spaces and k-linear maps between them. While $\mathsf{FVEC}_k$ is a compact-closed category, it is not canonically self-dual compact closed, since to give a self-dual structure corresponds to providing a basis. So let $\mathsf{FVEC}^{\mathcal{B}}_k$ be the category whose objects are pairs $(V, B_V)$ consisting of a finite dimensional k-vector space and a basis $B_V$ of V, and whose maps are arbitrary k-linear maps between the underlying vector spaces. $\mathsf{FVEC}^{\mathcal{B}}_k$ is a self-dual compact-closed category, where the tensor product is defined as
the monoidal unit is $(k, \lbrace 1 \rbrace)$ , and where the cup $\cup_{(V,B_V)}: (V, B_V) \otimes (V, B_V) \to (k, \lbrace 1 \rbrace)$ is defined as on basis elements $v, w \in B_V$ as follows:
and the cap $\cap_{(V,B_V)}: (k, \lbrace 1 \rbrace) \to (V, B_V) \otimes (V, B_V)$ is the k-linear map defined as
which is well-defined since $B_V$ is a finite set. $\mathsf{FVEC}^{\mathcal{B}}_k$ is also an additive symmetric monoidal category where the additive structure is induced by the direct sum of vector spaces (which is the categorical biproduct):
and where $(0, \emptyset)$ is the zero object. Unfortunately, however, as explained in Lemay (Reference Lemay2019), $\mathsf{FVEC}^{\mathcal{B}}_k$ does not usually have a (non-trivial) differential category structure. This problem is solved when we consider $k = \mathbb{Z}_2$ , as was done in Hyland and Schalk (Reference Hyland and Schalk2003), Lemay (Reference Lemay2019). $\mathsf{FVEC}^{\mathcal{B}}_{\mathbb{Z}_2}$ is then a differential category where the coalgebra modality is induced by the exterior algebra, which is defined as follows:
Recall that the wedge product satisfies that $v \wedge v = 0$ . Usually, one also has that the wedge product is anticommutative, that is, $v \wedge w = -w \wedge v$ . But in the case of $\mathbb{Z}_2$ , $1=-1$ and therefore $v \wedge w = w \wedge v$ , which is key to obtaining a coalgebra modality. The deriving transformation $\mathsf{d}_{(V,B_V)}: {\rm{!}}(V, B_V) \otimes (V, B_V) \to {\rm{!}}(V, B_V)$ is defined on basis elements as follows:
See Lemay (Reference Lemay2019, Example 2.6.(iii)) for more details on this differential category. By applying Theorem 43, $\mathsf{FVEC}^{\mathcal{B}}_{\mathbb{Z}_2}$ is also a reverse differential category where the reverse deriving transformation $\mathsf{r}_{(V,B_V)}: {\rm{!}}(V, B_V) \otimes {\rm{!}}(V, B_V) \to (V, B_V)$ is defined on basis elements as follows:
Example 51. Pagani, Selinger, and Valiron’s categorical model of a quantum lambda calculus, $\overline{\textbf{CPMs}}^\oplus$ (Pagani et al., Reference Pagani, Selinger and Valiron2014, Section 4.2), is a reverse differential storage category. The objects of $\overline{\textbf{CPMs}}^\oplus$ are families of pairs of natural numbers and subgroups of permutations, while the maps of $\overline{\textbf{CPMs}}^\oplus$ can be interpreted as completely positive (continuous) module homomorphisms (Pagani et al., Reference Pagani, Selinger and Valiron2014, Propostion 16). $\overline{\textbf{CPMs}}^\oplus$ is self-dual compact closed (Pagani et al., Reference Pagani, Selinger and Valiron2014, Section 4.3.3) and has (in)finite biproducts (Pagani et al., Reference Pagani, Selinger and Valiron2014, Section 4.3.1), and so is an additive symmetric monoidal category as well. Furthermore, $\overline{\textbf{CPMs}}^\oplus$ is a Lafont category, that is, $\overline{\textbf{CPMs}}^\oplus$ has a coalgebra modality ${\rm{!}}$ which is given by cofree cocommutative comonoids, and these coalgebra modalities are called free exponential modalities (Pagani et al., Reference Pagani, Selinger and Valiron2014, Section 4.3.4). By (Lemay, Reference Lemay, Gadducci and Silva2021, Theorem 21), in the presence of biproducts, any free exponential modality has a (canonical) deriving transformation, and so any Lafont category with biproducts is a differential (storage) category. Therefore, $\overline{\textbf{CPMs}}^\oplus$ is a differential (storage) category. Since $\overline{\textbf{CPMs}}^\oplus$ is also self-dual compact closed, by applying Theorem 43, we conclude that $\overline{\textbf{CPMs}}^\oplus$ is also a reverse differential category. In future work, it would be interesting to study in more detail the consequence of reverse differential structure in this model of quantum lambda calculus.
Example 52. There is another interesting relationship between reverse differential categories and categorical quantum mechanics. Every reverse differential storage category whose coalgebra modality is a free exponential modality is a model of Vicary’s categorical quantum harmonic oscillator (Vicary, Reference Vicary2008, Defintion 3.1). We note, however, that the converse is not necessarily true since the required base category for a categorical quantum harmonic oscillator need only be a $\dagger$ -symmetric monoidal category instead of a ( $\dagger$ -)compact-closed category. That said, as discussed in Vicary (Reference Vicary2008, Section 6), in future work it would be interesting to revisit Vicary’s categorical quantum harmonic oscillators from the point of view of (reverse) differential categories.
4.3 From MRDCs to CRDCs
In this section, we prove that our definition satisfies requirement 2 of an MRDC, that is, that the coKleisli category of an MRDC is a CRDC.
First, note that we already have part of what we need. By Theorem 38, to give a CRDC is equivalent to giving a CDC and a contextual linear dagger. Moreover, by Theorem 43, any MRDC is a differential category, and by Theorem 19, if $\mathbb{X}$ is a differential category then its coKleisli category has the structure of a CDC. Thus, putting this together, for any MRDC, its coKleisli category is a CDC.
Thus, all that remains to show is that the coKleisli category has a contextual linear dagger, and for this, we need its linear fibration, $\mathcal{L}[\mathbb{X}_{{\rm{!}}}]$ to be a dagger fibration. However, Theorem 24 showed that there is an isomorphism of fibrations $\mathcal{L}[\mathbb{X}_{{\rm{!}}}] \cong \mathcal{L}_{{\rm{!}}}[\mathbb{X}]$ . Thus, it will suffice to give a dagger fibration structure on $\mathcal{L}_{{\rm{!}}}[\mathbb{X}]$ :
Lemma 53 If $\mathbb{X}$ is a self-dual compact-closed category, then the fibration $\mathcal{L}_{{\rm{!}}}[\mathbb{X}]$ has dagger fibration structure, where for a map $f: {\rm{!}} X \otimes A \to B$ , its dagger $f^{\dagger[X]}: {\rm{!}} X \otimes B \to A$ is defined as the following composite:
Furthermore, for any map $f: A \to B$ in $\mathbb{X}$ , the dagger of $e_X \otimes f: {\rm{!}} X \otimes A \to B$ is $(e_X \otimes f)^{\dagger[X]} = e_X \otimes f^\ast: {\rm{!}} X \otimes B \to A$ , where $f^\ast: B \to A$ is defined as in Lemma 40.
Proof. Per Example 36, it suffices to prove that the dagger operation satisfies contravariant functoriality, involution, and change of base. We begin by showing that the dagger is contravariant on composition:
Next we show that the dagger preserves identities:
Next we show that the dagger is involutive using the snake equations:
Lastly we show that the substitution functors preserves the dagger, which is automatic by definition:
So we conclude that $\mathcal{L}_{{\rm{!}}}[\mathbb{X}]$ has dagger fibration. Next, for any map $f: A \to B$ , using the snake equations and the sliding equations, we compute
So the desired equality holds.
The following then follows from the remarks above:
Corollary 54. Let $\mathbb{X}$ be a differential category which is self-dual compact closed. Then the coKleisli category $\mathbb{X}_{\rm{!}}$ is a CDC with a contextual linear dagger, where for a map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}}(X \times A) \to B$ which is linear in context X, its dagger $\left[\kern-0.15em\left[ f^{\dagger[X]} \right]\kern-0.15em\right]: {\rm{!}} (X \times B) \to A$ is defined as follows:
where $\mathsf{E}_X$ and $\mathsf{E}^{-1}_X$ are defined on in Corollary 25, and the $\dagger[X]$ on the right-hand side is defined as in Lemma 53.
Proof. It follows from the equivalence of Theorem 24, that by giving a dagger fibration on $\mathcal{L}_{\rm{!}}[\mathbb{X}]$ (Lemma 53), we obtain a dagger fibration on $\mathcal{L}[\mathbb{X}_{\rm{!}}]$ defined as follows:
Zooming in on the fibres, we have that the dagger on maps which are linear in context is defined as $\left[\kern-0.15em\left[ f^{\dagger[X]} \right]\kern-0.15em\right] = \mathsf{E}_X\left( \mathsf{E}^{-1}_X(\left[\kern-0.15em\left[ f \right]\kern-0.15em\right])^{\dagger[X]} \right)$ . It remains to show that each fibre also has dagger-biproducts. First note that in the fibres, the projections maps and injections maps are, respectively:
By applying $\mathsf{E}^{-1}_X$ to the projection, we obtain the following:
By Lemma 53, their dagger is given the dual:
However by Houston (Reference Houston2008), the dual of the projections are the injections (and vice versa). So:
Lastly, applying $\mathsf{E}_X$ we finally obtain that:
So we conclude that $\left[\kern-0.15em\left[ 1_X \times \pi_i \right]\kern-0.15em\right]^{\dagger[X]} = \left[\kern-0.15em\left[ 1_X \times \iota_i \right]\kern-0.15em\right]$ , and that therefore each fibre $\mathcal{L}[X]$ has dagger biproducts. Thus, $\mathbb{X}_{\rm{!}}$ is a CDC with a contextual linear dagger.
We then obtain one of the main results of this paper:
Theorem 55. Let $\mathbb{X}$ be a reverse differential category with coalgebra modality $({\rm{!}}, \delta, \varepsilon, \Delta, e)$ and reverse deriving transformation $\mathsf{r}: {\rm{!}} A \otimes {\rm{!}} A \to A$ , and finite (bi)products (which we denote here using the product notation). Then the coKleisli category $\mathbb{X}_{\rm{!}}$ is a CRDC with Cartesian left additive structure defined in Section 2.6 and reverse differential combinator $\mathsf{R}$ defined as follows on a coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ :
where $\chi_{A \times B}: {\rm{!}} (A \times B) \to {\rm{!}} A \otimes {\rm{!}} B$ is defined as in Definition 3 and $(\_)^\ast$ is defined as in Lemma 40. Furthermore, the induced differential combinator is precisely that of Proposition 19, and the induced contextual linear dagger is precisely that of Corollary 54.
Proof. By Proposition 19, $\mathbb{X}_{\rm{!}}$ is a CDC, and since $\mathbb{X}$ is compact closed, by Corollary 54, $\mathbb{X}_{\rm{!}}$ also has a contextual linear dagger. Therefore by Theorem 38, $\mathbb{X}_{\rm{!}}$ is a Cartesian reverse differential category where for a coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} X \to B$ , its reverse derivative $\left[\kern-0.15em\left[ \mathsf{R}[f] \right]\kern-0.15em\right]: {\rm{!}} (A \times B) \to A$ is defined as $\left[\kern-0.15em\left[ \mathsf{R}[f] \right]\kern-0.15em\right] = \left[\kern-0.15em\left[ \mathsf{D}[f]^{\dagger[A]} \right]\kern-0.15em\right]$ . Expanding this out, we compute
So we conclude that the reverse differential combinator of $\mathbb{X}_{\rm{!}}$ is induced by the reverse deriving transformation of $\mathbb{X}$ .
Similarly to the differential combinator, the reverse differential combinator can also be expressed in terms of the coderiving transformation as follows on a coKleisli map $\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B$ :
4.4 Other constructions of CRDCs
The inconvenience of MRDCs is that the self-dual compact-closed requirement is quite strong. Indeed, there are not many interesting or well-studied models of differential linear logic in the literature that are self-dual compact closed. In fact, from a linear logic perspective, such models are often considered somewhat “degenerate” (Hyland and Schalk, Reference Hyland and Schalk2003, Definition 3). Therefore, examples of CRDCs arising from MRDCs will often not appear naturally. There is, however, another way of construction CRDCs from coKleisli categories. In particular, this slightly altered construction can be done with any monoidal differential category. Instead of requiring all objects in the base category to be self-dual, we can instead take the full subcategory of the coKleisli category of self-dual objects in the base category. The proof that this subcategory is a CRDC is essentially the same as Theorem 55. Being a full subcategory of a CDC that is closed under finite products implies that said subcategory is also a CDC. Then using the self-duality, we can build a contextual linear dagger and we conclude that we have a CRDC.
Definition 56. Let $({\rm{!}}, \delta, \varepsilon)$ be a comonad on a symmetric monoidal category $\mathbb{X}$ . Define $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ as the full subcategory of the coKleisli category $\mathbb{X}_{\rm{!}}$ whose objects are self-dual objects (Definition 39) of $\mathbb{X}$ , so triples $(A, \cup_A, \cap_A)$ . Recall that by a full subcategory, the maps of $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ are all those of $\mathbb{X}_{\rm{!}}$ between the underlying objects, that is, $\mathsf{R}[\mathbb{X}_{\rm{!}}]\left( (A, \cup, \cap), (B, \cup, \cap) \right) = \mathbb{X}_{\rm{!}}(A,B) = \mathbb{X}({\rm{!}} A, B)$ , and both composition and identities are the same as in $\mathbb{X}_{\rm{!}}$ .
Suppose that the base symmetric monoidal category $\mathbb{X}$ has finite biproducts. The zero object is self-dual, where the cups and caps are simply the zero morphisms (Heunen and Vicary, Reference Heunen and Vicary2019, Lemma 3.19), and the biproduct of self-dual objects is again self-dual (Heunen and Vicary, Reference Heunen and Vicary2019, Lemma 3.23). Explicitly, if $(A, \cup_A, \cap_A)$ and $(B, \cup_B, \cap_B)$ are self-dual objects, then $(A \times B, \cup_{A \times B}, \cap_{A \times B})$ is also a self-dual object where the cup and cap are defined, respectively, as follows:
Therefore, it follows that $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ has finite products. Note that any full subcategory of a CDC whose objects are closed under finite products is again a CDC. Therefore, if the starting base category $\mathbb{X}$ is a differential category, $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ will be a CDC.
Lemma 57. Let $\mathbb{X}$ be a differential category with finite biproducts. Then $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ is a CDC where the differential combinator is defined as in Proposition 19.
Now that we have established that $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ is a CDC, to show that it is also a CRDC, it remains only to show that $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ has a contextual linear dagger. However, we may define the dagger in the same way that it was done in Corollary 54.
Lemma 58. Let $\mathbb{X}$ be a differential category with finite biproducts. Then $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ is a CDC with a contextual linear dagger, where the dagger is defined in the same way as in Corollary 54.
Proof. Using essentially the same proof as throughout Section 4.3, it follows from self-duality that we obtain a contextual linear dagger.
As a result, it follows that $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ is a CRDC. Since the base category does not necessarily have a reverse deriving transformation, we will explicitly write the reverse differential combinator of $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ in terms of the deriving transformation and the cups and caps.
Proposition 59. Let $\mathbb{X}$ be a differential category with finite products. Then $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ is a CRDC, where the reverse differential combinator is defined as follows for a coKleisli map ${\left[\kern-0.15em\left[ f \right]\kern-0.15em\right]: {\rm{!}} A \to B}$ :
Proof. Since $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ is a CDC with a contextual linear dagger, then by Theorem 38, $\mathsf{R}[\mathbb{X}_{\rm{!}}]$ is a CRDC. By essentially the same calculations as in the proof of Theorem 55, we can show that the resulting reverse differential combinator is precisely the desired one.
We could have also expressed the reverse differential combinator in terms of the coderiving transformation as follows:
As mentioned above, the advantage of this construction is that we construct a CRDC from any differential category, with or without the Seely isomorphisms. In most cases, the self-dual objects of a differential category are of a “finite-dimensional” flavour. We conclude this section by applying this construction to well-known examples of differential categories to reconstruct some of the main examples of CRDCs.
Example 60. This example recaptures the reverse differentiation of polynomials from (Cockett et al., Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Example 14.1). For simplicity, we will work with vector fields over a field but we note that this example can be generalized to the category of modules over any commutative semiring. Let k be a field and $\mathsf{VEC}_k$ be the category of k-vector spaces and k-linear maps between them. Then $\mathsf{VEC}^{\mathsf{op}}_k$ is a differential category where ${\rm{!}} V$ is the free symmetric algebra over V:
where $\otimes_s^n$ is the n-fold symmetrized tensor product of V. If X is a basis of V, then ${\rm{!}} V \cong k[X]$ , where k[X] is the polynomial ring over X. From this point of view, the deriving transformation can be described as a map $\mathsf{d}_V: k[X] \to k[X] \otimes V$ which maps a polynomial to the sum of its partial derivatives:
Thus $\mathsf{VEC}^{\mathsf{op}}_{k}$ is a differential category, whose differential structure captures polynomial differentiation. For more details on this (co)differential category, see Blute et al. (Reference Blute, Cockett and Seely2006, Section 2.5.3). The self-dual objects in $\mathsf{VEC}^{\mathsf{op}}_k$ are precisely the finite-dimensional vector spaces, and since self-dual objects are self-dual, the same is true in $\mathsf{VEC}^{\mathsf{op}}_k$ . Therefore, $\mathsf{R}[{\mathsf{VEC}^{\mathsf{op}}_k}_{\rm{!}}]$ is equivalent, as CRDCs, to $\mathsf{POLY}_k$ from Cockett et al. (Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Example 14.1).
Example 61. This example recaptures the reverse differentiation of smooth functions from Cockett et al. (Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Example 14.2). Let $\mathbb{R}$ be the field of real numbers. While the differential structure on $\mathsf{VEC}^{\mathsf{op}}_\mathbb{R}$ from the above example captures polynomial differentiation, $\mathsf{VEC}^{\mathsf{op}}_\mathbb{R}$ has another differential structure where this time the deriving transformation corresponds to differentiating (real) smooth functions. The key to this example is the notion of $C^\infty$ -rings, which recall are defined as the algebras of the Lawvere theory whose morphisms are smooth maps between the Euclidean spaces $\mathbb{R}^n$ . Equivalently, a $C^\infty$ -ring is a set A equipped with a family of functions ${\Phi_f: A^n \to A}$ indexed by the smooth functions $f: \mathbb{R}^n \to \mathbb{R}$ and which satisfies certain coherence equations. For example, $C^\infty(\mathbb{R}^n) = {\lbrace f: \mathbb{R}^n \to \mathbb{R} \vert~ f \textrm{ is smooth} \rbrace}$ is a $C^\infty$ -ring. For every $\mathbb{R}$ -vector space V, there exists a free $C^\infty$ -ring over V (Cruttwell et al., Reference Cruttwell, Lemay and Lucyshyn-Wright2021, Section 4), which we denote as $\mathsf{S}^\infty(V)$ . If V is finite dimensional of dimension n, then $\mathsf{S}^\infty(V) \cong C^\infty(\mathbb{R}^n)$ as $C^\infty$ -rings, and in particular, $\mathsf{S}^\infty(\mathbb{R}^n) = C^\infty(\mathbb{R}^n)$ . Then $\mathsf{VEC}^{\mathsf{op}}_\mathbb{R}$ is a differential category with respect to the coalgebra modality $\mathsf{S}^\infty$ and whose deriving transformation is induced by differentiating smooth functions. In particular for $\mathbb{R}^n$ , the deriving transformation $\mathsf{d}: C^\infty(\mathbb{R}^n) \to C^\infty(\mathbb{R}^n) \otimes \mathbb{R}^n$ maps a smooth function $f: \mathbb{R}^n \to \mathbb{R}$ to the sum of its partial derivatives:
Hence $\mathsf{VEC}^{\mathsf{op}}_\mathbb{R}$ is a monoidal differential category, whose differential structure captures smooth function differentiation. For more details on this differential category, see Cruttwell et al. (Reference Cruttwell, Lemay and Lucyshyn-Wright2021). As explained in the above example, the self-dual objects of $\mathsf{VEC}^{\mathsf{op}}_\mathbb{R}$ are the finite-dimensional vector spaces. Therefore, $\mathsf{R}[{\mathsf{VEC}^{\mathsf{op}}_\mathbb{R}}_{\mathsf{S}^\infty}]$ is equivalent as a CRDC to the example $\mathsf{SMOOTH}$ from Cockett et al. (Reference Cockett, Cruttwell, Gallagher, Lemay, MacAdam, Plotkin and Pronk2020, Example 14.2).
5. Conclusions and Future Work
In this paper, we have filled in a gap in the literature on categorical differential structures by providing a definition of a MRDC. We have also provided key results to relate this structure to others, showing how MRDCs relate to monoidal differential categories, CDCs, and CRDCs. This work provides many additional avenues for exploration; we briefly discuss some of them here.
• To understand what the structure of MRDCs should be, this paper started from an MDC and looked at what would happen if it’s associated CDC was a CRDC. However, there is another approach one could take. In Blute et al. (Reference Blute, Cockett and Seely2015), the authors look at what additional structure on a CDC would be necessary to form an MDC. Thus, alternatively, one could start with a CRDC with such structure, and show that one gets an MRDC. We leave this for future work.
• In Garner and Lemay (Reference Garner and Lemay2021), the authors describe how CDCs can be seen as a type of skew-enriched category, and use this result to demonstrate how every CDC embeds into a CDC associated to an MDC. Similar results for CRDCS and MRDCs would be very useful.
• In the world of “reverse” differential structures, the analog of tangent structures (Cockett and Cruttwell, Reference Cockett and Cruttwell2013) has yet to be described. Such a structure would axiomatize the cotangent bundle in differential geometry. Understanding such a structure’s relationship to MRDC and CRDCs will then help further bridge the gap between differential geometry and differentiation in computer science.
• All of the above items are theoretical; however, there is an important applied avenue which this work allows one to pursue. As Examples 51 and 52 demonstrated, several abstract models of quantum computation are MRDCs. Then in particular, by Theorem 55, the coKleisli category associated to these models is a CRDC. By the results of Cruttwell et al. (Reference Cruttwell, Gavranović, Ghani, Wilson and Zanasi2022), this means that one could apply supervised learning techniques to these examples. This possibility of combining quantum computation with supervised learning is an exciting direction we hope will be pursued in the future.
Acknowledgements
For the work on this research project: Geoff Cruttwell was supported by an NSERC Discovery grant; Jonathan Gallagher was financially supported by an AARMS postdoctoral fellowship; Jean-Simon Pacaud Lemay was financially supported by an NSERC Postdoctoral Fellowship (PDF) – Award #: 456414649; and Dorette Pronk was supported by an NSERC Discovery grant.