What have I been thinking about?

So Beeminder is about to go off on me if I do not post something soon.  Since I am graduating this semester, I have been very busy writing my thesis, so blog posts have been put on the back burner for a bit.  I want to stay active anyway, so I decided to post a snapshot of what I have been thinking about research-wise.

The snapshot is in the form of a diagram.  The diagram illustrates the topics of my ongoing research program on developing a core logic where induction and coinduction can live harmoniously, and how those topics fit together. A major property of this core logic is that it must have non-trivial categorical models, something that cannot currently be said of bi-intuitionistic logic.

Finally, here is what I am thinking about in a single diagram:


I am currently drafting a post explaining — in most likely too much detail — LNL-models. So stay tuned for that.

Posted in category theory, computer science, Logic, metatheory, Semantics, Type Theory | Leave a comment

The Category of Deterministic Automata

One thing I have recently decided is that category theory provides a unique perspective on computation. Thus, it can be used throughout computer science. Many people in my field — theory of programming languages — already know this, or at least know that category theory can be used in the study and implementation of programming languages, but I claim more! I also claim that not every computer scientist knows that category theory can provide this alternate perspective. In a number of posts I am going to share some discoveries I have found in the literature that support this claim. I am not sure how many posts there will be, but I plan to write a bunch. I really want to share some little nuggets of fun that I have been finding. In this post I want to define the category of deterministic automata. I learned of this from the wonderful reports titled “A Junction between Computer Science and Category Theory” by Goguen, Thatcher, Wagner, and Wright.

Automaton

Throughout this post I often use the word “automaton” to mean “deterministic automaton.” Given a set of inputs called \(X\), a set of states \(S\), and a function \( \delta : X \times S \to S\), called the transition function, we can define the deterministic automaton \(\mathcal{A} = \langle X,S,\delta \rangle\). The transition function \(\delta\) tells us how to move from one state to the next. We call \(\mathcal{A}\) a finite automaton if and only if \(X\) and \(S\) are finite sets.

Let's consider a quick example. Suppose we have the following:

  • \(X = \{ A,B\} \)
  • \(S = \{ S_0,S_1,S_2\} \)
  • \(\delta(A, S_0) = S_1 \)
  • \(\delta(A, S_1) = S_1 \)
  • \(\delta(B, S_1) = S_2 \)
  • \(\delta(A, S_2) = S_1 \)
  • \(\delta(B, S_2) = S_2 \)
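To make the example concrete, here is a minimal Python sketch of this automaton; the dictionary encoding and all names are my own, not from the original reports:

```python
# The automaton A = <X, S, delta> from the example above.
X = {"A", "B"}                 # input symbols
S = {"S0", "S1", "S2"}         # states

# The transition function delta as a dictionary:
# delta[(input, state)] = next state.
delta = {
    ("A", "S0"): "S1",
    ("A", "S1"): "S1",
    ("B", "S1"): "S2",
    ("A", "S2"): "S1",
    ("B", "S2"): "S2",
}

print(delta[("A", "S0")])  # S1
```

Note that, as written in the example, \(\delta\) is partial: there is no \(B\)-transition out of \(S_0\).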

There is an equivalent graphical formulation of an automaton. The graphical formulation of the previous definition is as follows:


We can see that the graphical formulation is a graph where the nodes are the states of the automaton, and the edges make up the definition of the transition function, and are labeled by the input symbols. The graphical formulation makes it really easy to determine reachability between states. For example, the state \(S_2\) is reachable from the state \(S_0\) using the input word \(AB\) by the following reasoning:

  • \(\delta(B, \delta(A, S_0)) = \delta(B, S_1) = S_2 \)

We can confirm this reasoning very easily using the graphical formulation: we simply follow the edge labeled \(A\) to the state \(S_1\), and then follow the edge labeled \(B\) to the state \(S_2\). Instead of transitioning between states one symbol at a time, is it possible to extend \( \delta \) so that we can transition directly using groupings of symbols — usually called words?

It turns out that we can, and I know of two different ways this can be done. Perhaps there are more, but I want to give an extension of \( \delta \) that is somewhat different. First we must define the monoid of all functions between states, that is functions from \(S\) to \(S\). Suppose that we have two functions \(f : S \to S\) and \(g : S \to S\), then we can define a new function called \(f ; g : S \to S\). So suppose \(s \in S\), then \((f ; g)(s) = g(f(s))\). We can define a second function called the identity function \(id : S \to S\) by \(id(s) = s\), and we can see that \(f ; id = id ; f = f\) for any function \(f : S \to S\). This data defines the monoid \([S,S]\) of all functions between states, and I leave checking the axioms to the reader.
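Diagrammatic-order composition and the identity of the monoid \([S,S]\) can be sketched in Python as follows (the names `compose` and `identity` are mine):

```python
# Diagrammatic composition in the monoid [S, S]: (f ; g)(s) = g(f(s)).
def compose(f, g):
    return lambda s: g(f(s))

# The identity of the monoid: id(s) = s.
identity = lambda s: s

# A quick check of the unit laws on a sample endofunction on states.
f = lambda s: s.upper()
assert compose(f, identity)("s0") == compose(identity, f)("s0") == f("s0")
print(compose(f, identity)("s0"))  # S0
```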

Suppose \(A \in X\), then notice that we can define a function \(\delta_A : S \to S\) as \(\delta_A(s) = \delta(A,s)\) for any \(s \in S\). Thus, \(\delta_A \in [S,S]\). Taking this one step further, we can define the function \(\delta_X : X \to [S,S]\) as \(\delta_X(A) = \delta_A\). Based on the definition of an automaton we know that \(X\) is a set of symbols, but we can think of it as an alphabet. Then we can use this alphabet as the carrier set of the monoid \(X^*\), the free monoid of all strings generated by juxtaposition of the elements of \(X\), where the empty string is the identity of the monoid. Since we can take a set of symbols and generate a free monoid, we should be able to take a function from the set of symbols to a monoid and extend it into a monoid homomorphism from the free monoid generated by the domain to the target monoid. This means we can take the function \(\delta_X : X \to [S,S] \) and extend it into a function \(\delta^*_X : X^* \to [S,S]\), where

  • \(\delta^*_X(A_1 \cdots A_n) = \delta^*_X(A_1);\cdots ;\delta^*_X(A_n) = \delta_X(A_1);\cdots ;\delta_X(A_n) = \delta_{A_1};\cdots ;\delta_{A_n}\)

The previous equation is the defining characteristic of a monoid homomorphism. The existence and uniqueness of this extension is the universal property of \(X^*\). We leave the details of the proof that the extension is indeed unique to the reader; see Proposition 1.6 of the paper I cited above. Finally, we can define the extension of \(\delta\), called \(\delta^+\), by \(\delta^+(w,s) = (\delta^*_X(w))(s)\).
Recalling our example from above, we know

  • \(\delta(B, \delta(A, S_0)) = \delta(B, S_1) = S_2 \)

Using the extension we can now see that

  • \(\begin{array}{llll}
    \delta^+(AB,S_0) & = (\delta^*_X(AB))(S_0) \\
    & = (\delta^*_X(A);\delta^*_X(B))(S_0) \\
    & = (\delta_A;\delta_B)(S_0)\\
    & = \delta_B(\delta_A(S_0))\\
    & = \delta(B, \delta(A, S_0)) \\
    & = S_2
    \end{array}\)
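Computationally, \(\delta^+\) is just a fold of \(\delta\) over the word. A small Python sketch of the running example (the function name `delta_plus` is mine):

```python
delta = {("A", "S0"): "S1", ("A", "S1"): "S1", ("B", "S1"): "S2",
         ("A", "S2"): "S1", ("B", "S2"): "S2"}

def delta_plus(word, state):
    """delta^+(w, s): run delta over the whole word w starting from s.
    The empty word acts as the identity, matching the unit of X^*."""
    for symbol in word:
        state = delta[(symbol, state)]
    return state

print(delta_plus("AB", "S0"))  # S2, agreeing with the derivation above
```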

Automaton Homomorphisms

In this section I will define what I call automaton homomorphisms. This is nonstandard terminology, but I think it fits best. These are simple little mappings from one automaton to another that satisfy an additional invariant.

Suppose we have automata \(\mathcal{A} = \langle X_{\mathcal{A}},S_{\mathcal{A}}, \delta_{\mathcal{A}}\rangle \) and \(\mathcal{B} = \langle X_{\mathcal{B}},S_{\mathcal{B}}, \delta_{\mathcal{B}} \rangle \). How would we map \(\mathcal{A}\) into \(\mathcal{B}\)? First, we would have to map \(X_{\mathcal{A}}\) into \(X_{\mathcal{B}}\) and \(S_{\mathcal{A}}\) into \(S_{\mathcal{B}}\). Then, using these mappings, we would have to map \(\delta_{\mathcal{A}}\) into \(\delta_{\mathcal{B}}\). Here we have to be careful, because this final mapping should not break transitions when moving from \(\mathcal{A}\) to \(\mathcal{B}\): if we can transition from one state to another in \(\mathcal{A}\), then the states these two map to in \(\mathcal{B}\) should also have a transition between them.

This intuition guides us to the definition of an automaton homomorphism between \(\mathcal{A}\) and \(\mathcal{B}\). These are pairs \((\phi, \psi)\) where \(\phi : X_{\mathcal{A}} \to X_{\mathcal{B}}\) and \(\psi : S_{\mathcal{A}} \to S_{\mathcal{B}}\). Furthermore, the following diagram must commute:


The definition of \(\phi \times \psi\) is rather simple: \((\phi \times \psi)(A,s) = (\phi(A),\psi(s))\). The previous diagram simply guarantees that our invariant holds.
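The commuting diagram can be checked pointwise. Here is a small Python sketch (the helper `is_homomorphism` and the dictionary encoding are my own) verifying that the identity pair is a homomorphism on the running example:

```python
# (phi, psi) is a homomorphism when psi(delta_A(x, s)) = delta_B(phi(x), psi(s))
# for every input x and state s on which delta_A is defined.
def is_homomorphism(phi, psi, delta_A, delta_B):
    return all(psi[s2] == delta_B[(phi[x], psi[s])]
               for (x, s), s2 in delta_A.items())

delta = {("A", "S0"): "S1", ("A", "S1"): "S1", ("B", "S1"): "S2",
         ("A", "S2"): "S1", ("B", "S2"): "S2"}
id_X = {"A": "A", "B": "B"}
id_S = {s: s for s in ["S0", "S1", "S2"]}

print(is_homomorphism(id_X, id_S, delta, delta))  # True
```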

Automata are Categories

Now I would like to show that automata are in fact categories. This is similar to how we can think of sets as categories. Suppose we have an automaton \(\mathcal{A} = \langle X,S,\delta\rangle \). Then we define the category \(\mathsf{Tr}_{\mathcal{A}}\) as follows:

  • Objects: \(|\mathsf{Tr}_{\mathcal{A}}|_0 = S\)
  • Morphisms: \(|\mathsf{Tr}_{\mathcal{A}}|_1 = \{w : S_1 \to S_2 \mid w \in X^* \text{ and } \delta^+(w,S_1) = S_2 \} \)

Now we have to check the axioms of a category. Clearly, the identity is \(\lambda : S_0 \to S_0\) for any state \(S_0 \in S \) where \(\lambda \in X^*\) is the empty string. Suppose \(w_1 : S_1 \to S_2 \in |\mathsf{Tr}_{\mathcal{A}}|_1\) and \(w_2 : S_2 \to S_3 \in |\mathsf{Tr}_{\mathcal{A}}|_1\). Then we know \(w_{1}w_{2} \in X^*\) and \(\delta^+(w_{1}w_{2},S_1) = S_3\). Therefore, \(w_{1}w_{2} : S_1 \to S_3 \in |\mathsf{Tr}_{\mathcal{A}}|_1\). So we can define composition as \(w_1 ; w_2 = w_{1}w_{2} : S_1 \to S_3\). The remainder of the axioms are easily verified.
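Concretely, a morphism of \(\mathsf{Tr}_{\mathcal{A}}\) is a word and composition is concatenation. A Python sketch that enumerates a hom-set up to a length bound — a finite approximation, since hom-sets are generally infinite; the names `run` and `hom` are mine:

```python
from itertools import product

delta = {("A", "S0"): "S1", ("A", "S1"): "S1", ("B", "S1"): "S2",
         ("A", "S2"): "S1", ("B", "S2"): "S2"}

def run(word, state):
    """delta^+, returning None when a transition is missing."""
    for symbol in word:
        if (symbol, state) not in delta:
            return None
        state = delta[(symbol, state)]
    return state

def hom(s1, s2, max_len=3):
    """Words w of length <= max_len with delta^+(w, s1) = s2."""
    words = [""] + ["".join(p) for n in range(1, max_len + 1)
                    for p in product("AB", repeat=n)]
    return [w for w in words if run(w, s1) == s2]

print(hom("S0", "S2"))  # ['AB', 'AAB', 'ABB']
```

The empty word shows up as the identity morphism: it always lies in `hom(s, s)`.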

The Category of Deterministic Automata

Finally, we can take everything we have defined so far and put it all together to form the category of deterministic automata \(\mathsf{DAut}\). The objects are all deterministic automata. The morphisms are all automaton homomorphisms between deterministic automata. Certainly there is an identity: for any automaton \(\mathcal{A}\) there is an identity homomorphism from \(\mathcal{A}\) to \(\mathcal{A}\) defined by \((\mathsf{id}_X,\mathsf{id}_S)(B,S_0) = (B,S_0)\), and the invariant holds:


Now we must define composition. Suppose \((\phi_1,\psi_1) : \mathcal{A} \to \mathcal{B}\) and \((\phi_2,\psi_2) : \mathcal{B} \to \mathcal{C}\) are automaton homomorphisms. Then their composition is defined as \((\phi_1,\psi_1) ; (\phi_2,\psi_2) = (\phi_1;\phi_2,\psi_1;\psi_2)\). We must check that this definition of composition is an automaton homomorphism, hence we must verify the invariant. We know by assumption that the invariant holds for \((\phi_1,\psi_1)\) and \((\phi_2,\psi_2)\):

Using these diagrams we can show that the following diagram commutes:


The following equational reasoning shows the previous diagram commutes:
\(
\begin{array}{lllll}
\delta_{\mathcal{A}};\psi_1;\psi_2 & = (\delta_{\mathcal{A}};\psi_1);\psi_2 & \text{(associativity)}\\
& = ((\phi_1 \times \psi_1);\delta_{\mathcal{B}});\psi_2 & \text{(top diagram)}\\
& = (\phi_1 \times \psi_1);(\delta_{\mathcal{B}};\psi_2) & \text{(associativity)}\\
& = (\phi_1 \times \psi_1);((\phi_2 \times \psi_2);\delta_{\mathcal{C}}) & \text{(bottom diagram)}\\
& = (\phi_1 \times \psi_1);(\phi_2 \times \psi_2);\delta_{\mathcal{C}} & \text{(associativity)}\\
\end{array}
\)
Thus, composition as we have defined it is indeed an automaton homomorphism. I leave checking the remainder of the axioms of a category as an exercise.
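The componentwise composition is easy to mirror in code. A sketch with the maps encoded as dictionaries (the encoding is my own), checking the unit law on identities:

```python
# Compose (phi1, psi1) ; (phi2, psi2) componentwise, with maps as dictionaries.
def compose_hom(hom1, hom2):
    (phi1, psi1), (phi2, psi2) = hom1, hom2
    phi = {x: phi2[phi1[x]] for x in phi1}
    psi = {s: psi2[psi1[s]] for s in psi1}
    return (phi, psi)

id_X = {"A": "A", "B": "B"}
id_S = {"S0": "S0", "S1": "S1", "S2": "S2"}

# Identities compose to the identity, as the category axioms require.
print(compose_hom((id_X, id_S), (id_X, id_S)) == (id_X, id_S))  # True
```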

This post talks specifically about deterministic automata, but we can form the category of nondeterministic automata as well. In fact, the definition is exactly the same. Then using these we can form the category \(\mathsf{Aut}\) of all automata.

Fun Facts

Here I list some fun facts that can be deduced from the categories we have looked at in this post.

  • To check that a word \(w\) is accepted by an automaton \(\mathcal{A}\), one only needs to check that there exist two states \(S_0\) and \(S_1\) such that \(w \in \mathsf{Tr}_{\mathcal{A}}(S_0,S_1)\).
  • An automaton \(\mathcal{A}\) is strongly connected if and only if \(\mathsf{Tr}_{\mathcal{A}}(S_0,S_1) \not= \emptyset\) for all states \(S_0,S_1 \in S\).
  • An automaton \(\mathcal{A}\) is reachable from the state \(S_0\) if and only if \(\mathsf{Tr}_{\mathcal{A}}(S_0,S_1) \not= \emptyset\) for all states \(S_1 \in S\).
  • The sets \(\mathsf{Tr}_{\mathcal{A}}(S_0,S_1)\) are the sets definable by the automaton \(\mathcal{A} \). Furthermore, these correspond to subsets — including their finite unions — of \(X^*\).
Posted in category theory, computer science | Leave a comment

Codebreaker Screening at UIowa!

The year 2012 was the year of Alan Turing. The celebration took place mainly in Europe, but there were a lot of great events all over the world. Check out:

http://www.mathcomp.leeds.ac.uk/turing2012/

 

for a ton of cool stuff that went on all over the world. One project that I was really excited about was the production of a new film about Alan Turing’s life called Codebreaker. The website for the film is here: http://www.turingfilm.com/.

Once the film was produced, the executive producer Patrick Sammon started doing special screenings at various universities around the world. Recently, the film was finally released on DVD. I had always planned on buying a copy of the film, so I decided to pitch the idea of showing the film as a movie night in our department (Computer Science at the University of Iowa). I pitched the idea to Sheryl Semler, the academic services coordinator for our department. She thought it was a great idea. When I casually mentioned the fact that Patrick does screenings at universities, she thought that we should bring the idea of having an actual screening to our department chair, Alberto Segre.

Alberto thought this was a great idea! Every year we have a student-run ACM conference on technology and computer science. He pitched the idea of making the screening the keynote for the conference. He then asked the ACM chapter if they liked the idea, and they did. The only problem was getting the money together. We were able to get several sponsors for the film: the Department of Cinema and Comparative Studies, the Department of Gender, Women, and Sexuality Studies, and, as you might have guessed, the CS department.

I am very excited to announce that the screening is happening, and it is taking place next week. The screening will take place on Friday, February 28th, 2014 at 6:30pm. It is open to the public, and there will be snacks! The film will be introduced by the executive producer, and after the film he will take questions. I really encourage all who can come to please do! It is going to be a lot of fun. More on the screening can be found here:
https://www.cs.uiowa.edu/resources/features/2014-ui-computing-conference-228-31.

Posted in Uncategorized | Leave a comment

Dualized Intuitionistic Logic: Introduction

This is the first of a few posts introducing a new line of research I am currently working on. In this post I only introduce the ideas we are considering. Each post after will fill in the details, so if something seems interesting or does not seem to make sense, just wait; it will. I have tried to write this post about three times now. We kept having to change our results so much that what I had written became obsolete and incorrect. This is one of the reasons I have decided to break the post up into multiple posts. We are finally getting into a position in which I can reveal where the research is taking us.

Over the course of a little over a year I have become increasingly interested in classical type theories. A classical type theory is a type theory which, when moving from type theory to logic, corresponds to classical logic as stated by the three perspectives of computation (also known as the Curry-Howard correspondence, or the proofs-as-programs, propositions-as-types correspondence). I managed to get my advisor, Aaron Stump, interested. Getting Aaron interested is quite a joy. He, like many of us, is very passionate, and has great ideas. In addition we have scooped up a new graduate student, Ryan McCleeary, to work on these ideas with us. One idea I am profoundly excited about is our idea to study duality in constructive logic. To understand what I mean by this, let's consider what duality we have in classical logic.

Logic is a beautiful universe. We can use it to express truth (with respect to a certain semantics), and we can use it to express computation. Like I said, logic is awesome! In classical logic there is a beautiful notion of duality. For example, we can show that conjunction is dual to disjunction. That is, we can prove \( A \land B \iff \lnot ( \lnot A \lor \lnot B ) \). Here we can see that conjunction is the same as a negated disjunction whose disjuncts are both negated. We can see this duality also in the inference rules for the classical sequent calculus:

Here the left rules are the negative rules. So conjunction on the left is actually disjunction, likewise for disjunction on the left, which is negative conjunction. So far we have only seen dualities between conjunction and disjunction, but what about implication? There is a left and right rule for implication:

Again the left rule is negative implication, but is there an operator which is dual to implication? The answer is yes. We know in classical logic that \(A \to B\) is logically equivalent to \(\lnot A \lor B\). Then we also know that \(A \to B\) is logically equivalent to \(\lnot (A \land \lnot B) \). This tells us that the dual to implication must be logically equivalent to \(A \land \lnot B\). This looks very much like set difference if we interpret conjunction as intersection and negation as complement. This is the reason we call the dual to implication subtraction, written \(A - B\), and it is indeed logically equivalent to \(A \land \lnot B\). This operator was first studied by Cecylia Rauszer. It has since been studied by a number of people like Pierre-Louis Curien and Hugo Herbelin, Gianluigi Bellin, Tristan Crolard, Philip Wadler, Peter Selinger, and many more. We will delay the definition of its inference rules for now; we will introduce these in the next post.
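The classical equivalences in this paragraph are easy to confirm with a truth table. A quick Python check (two-valued and classical, so it says nothing about the constructive case):

```python
from itertools import product

def implies(a, b):  return (not a) or b
def subtract(a, b): return a and (not b)   # A - B, the dual of implication

for a, b in product([False, True], repeat=2):
    # A -> B is equivalent to not A or B, and to not (A and not B) = not (A - B).
    assert implies(a, b) == ((not a) or b) == (not subtract(a, b))
print("all four rows agree")
```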

So far we have seen some examples of duality in classical logic, but what are some applications? One application has been to study the long-standing conjecture that the call-by-value reduction strategy is dual to the call-by-name reduction strategy. This has been answered in the positive by exploiting the very dualities we have been discussing. The first to investigate this line of work was Peter Selinger. He proved the conjecture using a categorical semantics of the \(\lambda\mu\)-calculus. However, the equivalences he uses are up to isomorphism. Pierre-Louis Curien and Hugo Herbelin followed suit, but designed a new calculus called the \(\bar\lambda\tilde\mu\)-calculus. This calculus adopted subtraction as the dual to implication. Philip Wadler designed yet another calculus to solve this problem, but instead of taking implication and subtraction as primitive he took conjunction, disjunction, and negation as primitive and defined implication and subtraction.
The problem of proving CBV dual to CBN uses classical logic, and indeed so far we have only considered classical logic. The applications of duality we are considering are in the area of constructive logic. Tristan Crolard has studied this as well, but with an emphasis on using subtraction to model constructive coroutines. We are interested in so much more.

What types of applications do we have in mind? Inhabitants of negative types (throughout the sequel we will call types like \(\lnot A\) negative types) give rise to codata. We are finding programming with codata to be very interesting. Codata allow for the definition of infinite data types using coinduction in a language without lazy evaluation. In addition we are finding that codata give rise to first-class patterns! No need for a primitive notion of pattern matching; we get it for free! Our main application is to show that induction and coinduction can live harmoniously within the same type theory without any funny restrictions (Agda) or sacrificing type safety (Coq). In addition we think (keep in mind this is all ongoing research) that we might get some very interesting control operators by exploiting the fact that we have a primitive notion of duality.

So this all sounds great, and this is where I will leave this post. In the next post I will give several perspectives on subtraction. I will define it in Heyting algebras, give a categorical account of it, and then define it logically by defining Dualized Intuitionistic Logic (DIL). Following the definition of DIL I will give a few examples. To find out what the third post is going to be about, read the second. Stay tuned!

Posted in Logic, metatheory | 3 Comments

Hanging with Kripke

Lately, I have been thinking a lot about Kripke models of intuitionistic propositional logic for my research. During this time I also played around a bit with the models. Here I would like to write down some of the stuff I played with.

Def. 1. A Kripke model is a tuple \(M = \langle W , \preceq , \mathcal{V} \rangle \), where \(W\) is a non-empty set of objects called worlds, \(\preceq \) is a binary relation on \(W\), and \( \mathcal{V} \) is a function mapping each world to a set of atomic propositions. This model must satisfy the following conditions:

  • \( \preceq \) is a preorder, and
  • (Hereditary Condition) For every atomic formula \(p\) and worlds \(w\) and \(w'\), if \(w \preceq w'\) and \(p \in \mathcal{V}(w)\), then \(p \in \mathcal{V}(w')\).

We call the tuple \( F = \langle W, \preceq \rangle\) a Kripke frame.
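Def. 1 is easy to check mechanically on a small example. A Python sketch; the two-world model here is my own toy example, not one from the text:

```python
# A toy Kripke model: two worlds with w0 <= w1 in the preorder.
worlds = {"w0", "w1"}
order = {("w0", "w0"), ("w1", "w1"), ("w0", "w1")}
V = {"w0": {"p"}, "w1": {"p", "q"}}          # valuation: world -> atoms

reflexive = all((w, w) in order for w in worlds)
transitive = all((a, d) in order
                 for (a, b) in order for (c, d) in order if b == c)
# Hereditary condition: moving up the order never loses atomic propositions.
hereditary = all(V[u] <= V[v] for (u, v) in order)

print(reflexive and transitive and hereditary)  # True
```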

Categorizing Kripke

Kripke frames are simply preordered sets. Thus, they are categories.

Lemma. 1. Suppose \(F = \langle W, \preceq \rangle\) is a Kripke frame. Then \(F\) is a category.
Proof. Simply take \(W\) as the set of objects. An arrow exists between two worlds \(w\) and \(w’\) if and only if \(w \preceq w’\). Then identity arrows exist by reflexivity of \(\preceq\) and composition is defined using transitivity. It is easy to see that left and right identities behave correctly and composition is associative. \( \Box\)

We denote the category of a Kripke frame \(F\) by \(Cat(F)\). A Kripke model is then a Kripke frame together with an interpretation function \( \mathcal{V} \) from the set of worlds to sets of atomic propositions. We can interpret a model as \(Cat(F)\) for some Kripke frame \(F\) together with the following functor \(\mathcal{\subseteq}: Cat(F) \to Set\):

  • \( \mathcal{\subseteq}(w) = A\), for some set \(A\) of atomic propositions, and
  • \(\mathcal{\subseteq}(f : w \to w') = g : \mathcal{\subseteq}(w) \to \mathcal{\subseteq}(w')\), where \(g(p) = p\) (if \(p \in \mathcal{\subseteq}(w)\) then \(p \in \mathcal{\subseteq}(w')\)).

Therefore, a Kripke model is the tuple \(\langle Cat(F), \mathcal{\subseteq}\rangle\).

What about the category of Kripke models?

Def. 2. The category of Kripke models, \( Kat \), is defined as follows:

  • Objects: The set of all Kripke models.
  • Arrows: The set of homomorphisms from the Kripke frame of one model to the Kripke frame of another. That is for models \(M_1 = \langle W_1,\preceq_1,\mathcal{V}_1\rangle\) and \(M_2 = \langle W_2,\preceq_2,\mathcal{V}_2\rangle\) the arrow from \(M_1\) to \(M_2\) is a homomorphism from \(W_1\) to \(W_2\) such that the preorder structure is preserved.
  • Identities: Clearly, identities exist; just take the identity homomorphism.
  • Composition: Homomorphisms are composable, and they respect identities and associativity.

So that is the definition using Def. 1; let's try to give one in terms of pure categories. Since a functor is a generalization of a homomorphism between categories, we simply move from arrows being homomorphisms in \(Set\) to functors.

Def. 3. The category of Kripke models, \( Kat_{Cat} \), is defined as follows:

  • Objects: The set of all Kripke models \(M = \langle Cat(\langle W, \preceq \rangle),\subseteq\rangle \).
  • Arrows: The set of functors between frames.
  • Identities: Take the identity functors \( \mathcal{I}_{M} \) for any model \(M\).
  • Composition: Composition is well defined for functors. Identities play nice and composition is associative.

Frame it!

We can eliminate the pesky functor \(\subseteq\) and the interpretation mapping \(\mathcal{V}\) altogether. Instead of leaving the worlds unspecified we can adopt sets of atomic propositions as worlds. Doing this results in Kripke frames becoming Kripke models. Intuitionistic logic can be interpreted in this configuration in the usual way, except that instead of interpreting atomic propositions as members of the interpretation of a world, they are actual members of the world. Some may have philosophical objections to this style, because the semantics is not completely distinct from the object language, but I do not think this is that far off base from the semantics of type theories. There we interpret types as sets of normal forms, which are members of the object language. This is also exactly how worlds are defined in the work on actualism, where this is called possible world semantics. I am a fan of both styles.

Posted in category theory, Logic, metatheory, philosophy, Semantics, Type Theory | 4 Comments

Completeness of a Multi-Succedent Intuitionistic Propositional Logic

Over the course of about four informal lectures I presented how to show completeness for multi-succedent intuitionistic propositional logic.  I used Mints’ book “A short introduction to intuitionistic logic” as my reference.  There is a beautiful proof of completeness done in section 8.2. However, there are many problems with Mints’ presentation and proofs.  Thus, I decided to write my own lecture notes correcting these problems.  I wanted to post them here in case anyone else might find them helpful.

The lecture notes can be found here.

Posted in Uncategorized | Leave a comment

You and Your Research

Phil Wadler wrote a nice post on writing, communication, and being successful in research.  You can find it here: You and Your Research and the Elements of Style.

Wadler points to an inspiring video of a lecture by Richard Hamming:

I highly recommend watching the above video.

Posted in Graduate School Advice | Leave a comment

Church-Encoded Pairs are not Categorical Products

Recently, John Altidor, a PhD student from the University of Massachusetts, has been visiting UIowa. His research right now is concerned with subtyping. While his work can be applied to a wide range of programming languages, he and his advisor have been applying it to Java. Lately, he and I have been talking a lot about his work on subtyping. As I said, he has been applying his work to Java, and one thing I find interesting is thinking about how much of OO can be done in a dependent type theory. His work sparked me to read a few papers on subtyping. One such paper is “An extension of system F with subtyping” by L. Cardelli et al. In this paper the authors claim something I was not aware of. They state that Church-encoded pairs in system F do not correspond to actual categorical constructs. Now this is not saying that they are unsound, but it is saying that they are not products. It is important to note that this claim is with respect to \(\beta\eta\)-equality. I am not sure, but it may be possible to show that Church-encoded pairs can be identified with products using some other notion of equality. The authors claim this is well-known and do not give the details. Curiosity has led me to filling in the details myself.

I plan to present the details of this claim in a three-part post. First, I will set the stage by defining system F with \(\beta\eta\)-equality and Church-encoded pairs. In the second post I will present all the categorical notions we will need on the journey. Finally, in the third post I will present the main results. The entirety of this series will be presented and formalized in Coq. I assume the reader is familiar with Coq and the locally nameless binding representation. However, for the uninitiated I give plenty of references.

System F was invented independently by both J. Girard and J. Reynolds. It is the extension of the simply typed \(\lambda\)-calculus with second order universal quantification. For a more in depth look at system F see the amazing book “Proofs and Types.” I start by defining the syntax. The syntax for types and terms are defined by the following inductive definitions. Note that I will not be able to give every detail of the formalization, however in the final post I will provide a link to the entire formalization.

Inductive typ : Set := 
  | typ_base 
  | typ_fvar   (n:nat)
  | typ_bvar   (n:nat)
  | typ_forall (A:typ)
  | typ_arrow  (A1:typ) (A2:typ).

Inductive trm : Set :=
  | trm_bvar  (n:nat)
  | trm_fvar  (n:nat)
  | trm_lam   (A:typ) (b:trm)
  | trm_Lam   (b:trm)
  | trm_app   (a:trm) (b:trm)
  | trm_tapp  (a:trm) (T:typ)
  | trm_c.

There are a few things that must be said before moving on. I am using the locally nameless representation for binders. See this for more information on that. System F has the types, typ_bvar _ bound variables, typ_fvar _ free variables, typ_base a base type, typ_forall _ the universal type, and typ_arrow _ _ implication (or the function type). Now terms are made up of bound variables trm_bvar _, free variables trm_fvar _, \(\lambda\)-abstractions trm_lam _ _, type abstractions trm_Lam _, applications trm_app _ _, type applications trm_tapp _ _, and a constant trm_c.

The syntax for terms and types allow for the definition of non-well-formed types and terms. An example of a non-well-formed type is typ_bvar 0. This type contains an unbound bound variable. Another example is typ_arrow (typ_bvar 0) (typ_fvar 0) which also contains an unbound bound variable. A similar game can be played using terms. A type or a term is well-formed iff all bound variables are bound. Well-formed types and terms are characterized by the following two predicates.

Inductive lc_T : typ -> Prop :=
  | lcT_base : lc_T typ_base

  | lcT_fvar : forall n, lc_T (typ_fvar n)

  | lcT_arrow : forall a b,
    lc_T a ->
    lc_T b ->
    lc_T (typ_arrow a b)

  | lcT_forall : forall t L,
    (forall x, ~(In x L) -> lc_T (open_T 0 (typ_fvar x) t)) ->
    lc_T (typ_forall t).

Inductive lc : trm -> Prop :=
  | lc_fvar : forall n, lc (trm_fvar n)

  | lc_lam : forall L t A, 
    lc_T A ->
    (forall x, ~(In x L) -> lc (open_t 0 (trm_fvar x) t)) ->
    lc (trm_lam A t)

  | lc_Lam : forall t,
    lc t ->
    lc (trm_Lam t)

  | lc_app : forall t1 t2,
    lc t1 ->
    lc t2 ->
    lc (trm_app t1 t2)

  | lc_tapp : forall T t,
    lc_T T ->
    lc t ->
    lc (trm_tapp t T)

  | lc_c : lc trm_c.

A type T or a term t is called locally closed iff lc_T T or lc t holds, respectively. Both of these definitions depend on two functions called open_T (n:nat) (t:typ) (t':typ) : typ and open_t (n:nat) (t:trm) (t':trm) : trm. The former replaces all bound variables typ_bvar n with the type t in t'. The latter does the same with respect to terms. There is also a third function we will see, open_tT (n:nat) (t:typ) (t':trm) : trm, which replaces all bound type variables typ_bvar n with the type t in the term t'.

The definition of full \(\beta\eta\)-reduction is next. This is the standard definition so I do not comment on it. The reflexive and transitive closure of this relation is denoted by t1 ~*> t2.

Inductive step_t : trm -> trm -> Prop :=
  | ST_lam : forall A q q' L,
    lc_T A ->
    (forall y, 
      ~ In y L -> 
      exists q'', open_t 0 (trm_fvar y) q ~> q'' /\ q' = close_t 0 y q'') -> 
    (trm_lam A q) ~> (trm_lam A q')

  | ST_Lam : forall t t',
    t ~> t' ->
    (trm_Lam t) ~> (trm_Lam t')

  | ST_beta : forall T t1 t2,
    lc (trm_lam T t1) -> 
    lc t2 ->
    (trm_app (trm_lam T t1) t2) ~> (open_t 0 t2 t1)

  | ST_eta : forall t T,
    lc t ->
    (trm_lam T (trm_app t (trm_bvar 0))) ~> t

  | ST_type : forall t T,
    lc t ->
    lc_T T ->
    (trm_tapp (trm_Lam t) T) ~> (open_tT 0 T t)

  | ST_app1 : forall t1 t1' t2,
    lc t2 ->
    t1 ~> t1' ->
    trm_app t1 t2 ~> trm_app t1' t2

  | ST_app2 : forall t1 t2 t2',
    lc t1 ->
    t2 ~> t2' ->
    trm_app t1 t2 ~> trm_app t1 t2'

  | ST_tapp : forall t t' T, 
    lc_T T ->
    t ~> t' ->
    trm_tapp t T ~> trm_tapp t' T
where "t1 '~>' t2" := (step_t t1 t2).

The standard definition of typing for Church-style System F goes something like the following, where context is defined to be List (nat*typ); in fact, all names are nats. The proposition Ok G, where G:context, states that the context G is well-formed, meaning that all variables in the domain of G are distinct.

Inductive type_check (G:context) : trm -> typ -> Prop :=
  | T_Var : forall T x,
    Ok G -> In (x,T) G ->
    type_check G (trm_fvar x) T

  | T_Lam : forall L t T1 T2,
    (forall (x:nat),
      ~ (In x L) ->
      type_check ((x,T1) :: G) (open_t 0 (trm_fvar x) t) T2) ->
    type_check G (trm_lam T1 t) (typ_arrow T1 T2)

  | T_TLam : forall L t T,
    (forall (x:nat),
      ~ (In x L) ->
      type_check G (open_tT 0 (typ_fvar x) t) (open_T 0 (typ_fvar x) T)) ->
    type_check G (trm_Lam t) (typ_forall T)

  | T_App : forall T1 T2 t1 t2,
    type_check G t1 (typ_arrow T1 T2) ->
    type_check G t2 T1 ->
    type_check G (trm_app t1 t2) T2

  | T_AppT : forall t T T' A,
    lc_T T' ->
    type_check G t (typ_forall T) ->
    (open_T 0 T' T) = A ->
    type_check G (trm_tapp t T') A

  | T_C :
    Ok G ->
    type_check G trm_c typ_base
where "G :> t ; T" := (type_check G t T).


Finally, I define \(\beta\eta\)-equality. This definition is rather long and tedious; for a clearer picture, see page 8 of Cardelli's paper cited above, where he gives the definition of \(\beta\eta\)-equality for System F with subtyping, on which the following definition is based.

Inductive trm_eq (G:context) : trm -> trm -> typ -> Prop :=
  | Eq_var :  forall n T, 
    G :> (trm_fvar n) ; T -> 
    G :> (trm_fvar n) <~> (trm_fvar n) ; T

  | Eq_lam : forall L t1 t2 T1 T2,
    (forall (x:nat),
     ~ (In x L) ->
     ((x,T1) :: G) :> (open_t 0 (trm_fvar x) t1) <~> (open_t 0 (trm_fvar x) t2) ; T2) ->
    G :> (trm_lam T1 t1) <~> (trm_lam T1 t2) ; (typ_arrow T1 T2)

  | Eq_tlam : forall L t1 t2 T,
    (forall (n:nat),
      ~ In n L ->
      G :> (open_tT 0 (typ_fvar n) t1) <~> (open_tT 0 (typ_fvar n) t2) ; (open_T 0 (typ_fvar n) T)) ->
    G :> (trm_Lam t1) <~> (trm_Lam t2) ; (typ_forall T)

  | Eq_app : forall a a' b b' A B,
    G :> a <~> a' ; typ_arrow A B ->
    G :> b <~> b' ; A ->
    G :> (trm_app a b) <~> (trm_app a' b') ; B
  
  | Eq_tapp : forall a a' A B C,
    lc_T B ->
    G :> a <~> a' ; typ_forall A ->
    C = (open_T 0 B A) ->
    G :> (trm_tapp a B) <~> (trm_tapp a' B) ; C

  | Eq_beta : forall L a a' b b' c A B,
    (forall (x:nat),
      ~ (In x L) ->
      ((x,A) :: G) :> (open_t 0 (trm_fvar x) b) <~> (open_t 0 (trm_fvar x) b') ; B) ->
    G :> a <~> a' ; A ->
    c = (open_t 0 a' b') -> 
    G :> (trm_app (trm_lam A b) a) <~>  c ; B 

  | Eq_tbeta : forall L a b c A B C,
    lc_T B ->
    (forall (n:nat),
      ~ In n L ->
      G :> (open_tT 0 (typ_fvar n) a) <~> (open_tT 0 (typ_fvar n) b) ; (open_T 0 (typ_fvar n) A)) ->
    c = open_tT 0 B b -> 
    C = open_T 0 B A ->
    G :> (trm_tapp (trm_Lam a) B) <~>  c ; C

  | Eq_eta : forall b b' A B,
    G :> b <~> b' ; typ_arrow A B ->
    G :> (trm_lam A (trm_app b (trm_bvar 0))) <~> b' ; typ_arrow A B
 
  | Eq_teta : forall b b' A,
    G :> b <~> b' ; typ_forall A ->
    G :> (trm_Lam (trm_tapp b (typ_bvar 0))) <~> b' ; typ_forall A

  | Eq_sym : forall t1 t2 T, 
    G :> t2 <~> t1 ; T -> 
    G :> t1 <~> t2 ; T 

  | Eq_trans : forall t1 t2 t3 T,  
    G :> t1 <~> t3 ; T ->  
    G :> t3 <~> t2 ; T -> G :> t1 <~> t2 ; T 
where "G :> t1 <~> t2 ; T" := (trm_eq G t1 t2 T).

This concludes the definition of System F with \(\beta\eta\)-equality. The next step is to define the Church encoding of pairs. Following the definitions, I state several correctness properties certifying that we have the right ones.

The following defines the constructor for pairs.

Definition pair_cons := 
  trm_Lam (trm_Lam (trm_lam (typ_bvar 1) 
  (trm_lam (typ_bvar 0) 
    (trm_Lam (trm_lam (typ_arrow (typ_bvar 2) (typ_arrow (typ_bvar 1) (typ_bvar 0))) 
      ((trm_app (trm_app (trm_bvar 0) (trm_bvar 2)) (trm_bvar 1)))))))).

This is a little hard to read, so let's take a look at it in mathematical notation.
\[
\Lambda X.\Lambda Y.\lambda x:X.\lambda y:Y.\Lambda Z.\lambda z:X \rightarrow (Y \rightarrow Z).(z\,x)\,y.
\]
Here the variables \(x\) and \(y\) are the data we are packaging up in the pair. The variable \(z\) is used to project out the data of the pair. Next we define the type of pair_cons.

Definition pair_cons_type := 
 typ_forall (typ_forall (typ_arrow (typ_bvar 1) (typ_arrow (typ_bvar 0) 
   (typ_forall (typ_arrow 
     (typ_arrow (typ_bvar 2) (typ_arrow (typ_bvar 1) (typ_bvar 0))) (typ_bvar 0)))))).

In mathematical notation this looks like the following:

\[
\forall X.\forall Y.X \rightarrow (Y \rightarrow \forall Z.((X \rightarrow (Y \rightarrow Z)) \rightarrow Z)).
\]

The previous definitions show how to encode pairs in System F. Next are two convenience definitions which allow us to easily construct pairs. The following is the definition of a pair given two terms a:A and b:B.

Definition pair (A B:typ) (a b:trm) := trm_app (trm_app (trm_tapp (trm_tapp pair_cons A) B) a) b.
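Like pair_cons, this reads better in mathematical notation; writing \([A]\) for type application, it is simply:

\[
\mathsf{pair}\;A\;B\;a\;b = (((\mathsf{pair\_cons}\,[A])\,[B])\,a)\,b.
\]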

The following is the type of pair.

Definition pair_type (A B:typ) := 
  typ_forall (typ_arrow (typ_arrow A (typ_arrow B (typ_bvar 0))) (typ_bvar 0)).
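Reading off the de Bruijn indices, this is the familiar impredicative product type:

\[
\mathsf{pair\_type}\;A\;B = \forall Z.((A \rightarrow (B \rightarrow Z)) \rightarrow Z).
\]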

We need to be sure these are the correct definitions, so first I show that pair_cons has the type pair_cons_type and that pair has the type pair_type. I omit the proofs for brevity.

Lemma pair_cons_typing : nil :> pair_cons ; pair_cons_type.

Lemma pair_typing : forall A B a b, 
  lc_T A ->
  lc_T B ->
  nil :> a ; A ->
  nil :> b ; B ->
  nil :> pair A B a b ; pair_type A B.

Using our pair definition we can construct pairs, but we still need ways of getting the data back out. Next I define the first and second projections of our pairs. First, I need two very simple functions; they are used for deciding which projection we want. The first function, arg_one, is used to get the first projection, while the second, arg_two, is used to get the second.

Definition arg_one A B := trm_lam A (trm_lam B (trm_bvar 1)).
Definition arg_two A B := trm_lam A (trm_lam B (trm_bvar 0)).
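In mathematical notation these are just the two selector functions:

\[
\mathsf{arg\_one}\;A\;B = \lambda x{:}A.\lambda y{:}B.x
\qquad
\mathsf{arg\_two}\;A\;B = \lambda x{:}A.\lambda y{:}B.y.
\]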

Now using arg_one and arg_two I can define the first and second projections as follows:

Definition fst (A:typ) (B:typ) (p:trm) := trm_app (trm_tapp p A) (arg_one A B).
Definition snd (A:typ) (B:typ) (p:trm) := trm_app (trm_tapp p B) (arg_two A B).

Again I must be sure these are the correct definitions. First I show that fst and snd have the expected types.

Lemma fst_typing : forall A B p,
  lc_T A ->
  lc_T B ->
  nil :> p ; pair_type A B ->
  nil :> fst A B p ; A.

Lemma snd_typing : forall A B p,
  lc_T A ->
  lc_T B ->
  nil :> p ; pair_type A B ->
  nil :> snd A B p ; B.

Recall the definition of \(\beta\eta\)-equality above; all equational reasoning will be done with it. Using it, I show that applying fst to a pair is equivalent to the first component of the pair.

Lemma fst_eq : forall A B a b,
  nil :> a ; A ->
  nil :> b ; B ->
  nil :> (pair A B a b) ; pair_type A B ->
  nil :> fst A B (pair A B a b) <~> a ; A.
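To see why this lemma should hold, here is the reduction sequence underlying it, writing \([A]\) for type application; each step is an instance of ST_type or ST_beta:

\[
\begin{array}{rcl}
\mathsf{fst}\;A\;B\;(\mathsf{pair}\;A\;B\;a\;b)
 & = & ((\mathsf{pair}\;A\;B\;a\;b)\,[A])\,(\lambda x{:}A.\lambda y{:}B.x)\\
 & \leadsto^{*} & ((\Lambda Z.\lambda z{:}A \rightarrow (B \rightarrow Z).(z\,a)\,b)\,[A])\,(\lambda x{:}A.\lambda y{:}B.x)\\
 & \leadsto & (\lambda z{:}A \rightarrow (B \rightarrow A).(z\,a)\,b)\,(\lambda x{:}A.\lambda y{:}B.x)\\
 & \leadsto & ((\lambda x{:}A.\lambda y{:}B.x)\,a)\,b\\
 & \leadsto & (\lambda y{:}B.a)\,b \;\leadsto\; a.
\end{array}
\]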

Finally, I do the same for snd.

Lemma snd_eq : forall A B a b,
  nil :> a ; A ->
  nil :> b ; B ->
  nil :> (pair A B a b) ; pair_type A B ->
  nil :> snd A B (pair A B a b) <~> b ; B.

We have set out on a journey to better understand the relationship between Church-encoded pairs in System F and categorical products. So far we have defined System F with \(\beta\eta\)-equality and the Church encoding of pairs. In the next post we will lay out all the category theory needed for the main result. Stay tuned.

Posted in category theory, Coq, Logic, metatheory, Semantics, Type Theory | Tagged , , , , , | Leave a comment

Coq, Category Theory, and Steve Awodey’s Book.

There is a great book on category theory called, you guessed it, “Category Theory” by Steve Awodey. I really enjoy it. I am interested in category theory because it provides a possible interpretation of type theory. Actually, I have been thinking of category theory and type theory as different perspectives on the same thing: computation. As I learn more category theory, I thought it would be useful to start thinking about how to formalize results involving category theory in Coq.

To my surprise I found that Adam Megacz has started a nice library called coq-categories. His formalization follows Awodey’s book exactly; in fact, to find a concept in the formalization one has to know which chapter of Awodey’s book it appears in. It is a great formalization. It uses type classes to define categories and functors, and type classes are a very nice fit for categories.
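To give a flavor of the type-class approach, a category can be declared roughly as follows. This is a minimal sketch of my own, with field names I chose for illustration; it is not Megacz’s actual definition.

Class Category (Obj : Type) (Hom : Obj -> Obj -> Type) := {
  (* Identity morphism at each object. *)
  id : forall a, Hom a a;
  (* Composition of morphisms, written in diagrammatic order. *)
  comp : forall a b c, Hom a b -> Hom b c -> Hom a c;
  (* The category laws become fields holding proofs. *)
  left_identity : forall a b (f : Hom a b), comp a a b (id a) f = f;
  right_identity : forall a b (f : Hom a b), comp a b b f (id b) = f;
  assoc : forall a b c d (f : Hom a b) (g : Hom b c) (h : Hom c d),
    comp a c d (comp a b c f g) h = comp a b d f (comp b c d g h)
}.

An instance of this class packages up objects, morphisms, and the proofs of the laws, which is exactly why type classes fit categories so well.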

There are two issues with this formalization. First, it does not formalize anything beyond chapter 7 of the book. Second, it has not been worked on in over a year: the last commit was 2011-04-24, and it was developed with Coq 8.3pl2.

I decided to clone Adam’s git repo and get it working with Coq 8.4. It took only about fifteen minutes. The only problems I ran into were related to unification of implicit arguments.

This post just serves as a place to put up a patch in case anyone is looking for one. I plan to develop this a bit further. I would like to formalize control categories and play around with the semantics of a few different type theories. Here is the patch.

I plan to post more about my extensions in the near future.

Posted in category theory, Coq, Fun, Logic, metatheory, Semantics, Type Theory | Tagged , , , , | 2 Comments

Cartesian Closed Comics

There is a new comic called Cartesian Closed Comics. They are pretty good.

Below is my favorite one:

Find more by clicking the above link.

Posted in Fun | Tagged , , | Leave a comment