Setoids and Cats

This paper argues that a pragmatic approach is needed for integrating design and formalisation of complex systems. We report on our approach to designing the seL4 operating system microkernel API and its formalisation in Isabelle/HOL. The formalisation consists of the systematic translation of significant parts of the functional programming language Haskell into Isabelle/HOL, including monad-based code. We give an account of the experience, decisions and outcomes in this translation as well as the technical problems we encountered together with our solutions. The longer-term goal is to demonstrate that formalisation and verification of a large, complex, OS-level code base is feasible with current tools and methods and is in the order of magnitude of traditional development cost.

In this blog, I will be (to start with, at least) mainly talking about my work on simulating dependent types in Haskell, and using these simulated dependent types to prove theorems and to prove properties of code.

Genuine dependent types (as in, e.g., PVS) refer to one or more values. Since Haskell does not allow mixing of types and values, it is not possible to have genuine dependent types in Haskell; however, we can simulate them. The basic idea is to "lift" individual values to the type level - something like this, for example (this code snippet uses a Glasgow extension to Haskell - hence the compiler options pragma):

{-# OPTIONS_GHC -fglasgow-exts #-}

-- Zero exists
data Zero = Zero
-- The successor of n exists iff n is a natural number
data Succ n = Succ (Nat n)

data Nat n where
-- Zero is a natural number
NatZero :: Nat Zero
-- If m is a natural number, then so is its successor
NatSucc :: Nat m -> Nat (Succ m)

Because Zero and Succ Zero and so on are type-level representations of values (in particular, the values 0 and 1) I call them metaterms. Because Nat is a type constructor that classifies metaterms, I call it a metatype. So terms are "lifted" to metaterms, and types are "lifted" to metatypes, in my approach.

However, the approach I'm following is a little more complicated than the code sample shown above. The reason why has to do with equality (a thorny topic that I'll be elaborating on in my next post).

I initially thought that it would be sufficient to treat metaterms of most metatypes as equal iff their representations as Haskell types were treated as identical by the compiler. Clearly, a metaterm such as Succ (Succ Zero) should be treated as equal only to itself, I thought.

However, there is one clear exception which doesn't fit this pattern, which I encountered early on: functions. As a formal methods person, I would like to be able to specify arbitrary postconditions of functions. However, a typical postcondition - because it is just a relation between the inputs and outputs of a function - can be constructed in many different, equivalent ways. We would like to be able to abstract away from the particular representation of postconditions, and consider two postconditions to be equal iff they represent "the same" relation between inputs and outputs.

When are two relations the same? If we model a relation as a set of tuples, we can consider two relations to be the same if they are modelled by the same set. However, we have to be careful about the definition of equality used for the elements of the tuples. The relation might, for example, be between two functions - in which case we would want to consider a function with a postcondition of "f(x) == 2x" as equal to a function with a postcondition of "f(x) == x+x". In other words, when considering equality of higher-order functions, we need to apply our notion of equality recursively to postconditions.

So my next idea was to create a generalised abstract data type for equality, with two branches: one for ordinary, "identical type-level representation" equality, and one for equality of functions. You would use this GADT, which I called Equiv, as a universal equality relation.

Of course, this hack wasn't good enough. It quickly became evident that the problem I had with functions carried over to many other data types - functors, categories, etc.

So what to do? It wouldn't be modular to require developers to just add a new branch to Equiv whenever they wanted a custom equality relation defined over a new metatype. How ugly, and how unworkable - existing code would have to be updated every time a new branch was added, and it would expose irrelevant details to the whole system.

My solution was to mandate that every metatype in my system was to represent a setoid of values. A setoid is simply a set together with an equivalence relation defined over that set. This way, every metatype carries around with it a definition of the precise conditions that need to be satisfied in order for two metaterms of that metatype to be considered equivalent.

And in what I think is a nice piece of syntactic hackery (made possible by Haskell's ability to define not just operators on values, but also type operators), the notion of reflexivity:

a a1 a1 -- "a1 is equal to a1 under the equivalence relation given by a"

is quite literally equated with the meta-typing relation I have created ::*

a1 ::* a -- "a1 is of (meta)type a"

This is because the definition of ::* is a type synonym that looks like this (after expanding macros):

type (::*) (a1 :: *) (r :: * -> * -> *) = r a1 a1

This does not cause any problems, because by definition, a value is equal to itself under the equivalence relation of the setoid iff the value is a member of the setoid. As a happy consequence, we don't need to state, and nor do we need to prove, the reflexivity law for equality relations - because it is true for all setoids by definition!

However, my problems with equality weren't over - of which more next time.

Setoids and Cats

22 August 2006

Formalising a High-Performance Microkernel

Formalising a High-Performance Microkernel

Abstract

Online Copy

18 August 2006

Types-as-values suggest setoids-as-metatypes

About Me

Links

Previous Posts

Archives