A short exploration of GHC’s instance resolution hiding mistakes from the type checker.

I was recently working on some Haskell code (for research, with Jack Hughes) and happened to be using a monoid (via the Monoid type class) and I was rushing. I accidentally wrote x `mempty` y instead of x `mappend` y. The code with mempty type checked and compiled, but I quickly noticed some tests giving unexpected results. After a pause, I checked the recent diff and noticed this mistake, but I had to think for a moment about why this mistake was not leading to a type error. I thought this was an interesting little example of how type class instance resolution can sometimes trip you up in Haskell, and how to uncover what is going on. This also points to a need for GHC to explain its instance resolution, something others have thought about; I will briefly mention some links at the end.

The nub of the problem

In isolation, my mistake was essentially this:

whoops :: Monoid d => d -> d -> d
whoops x y = mempty x y  -- Bug here. Should be "mappend x y"

So, given two parameters of type d for which there is a monoid structure on d, we use both parameters as arguments to mempty.

Recall the Monoid type class is essentially:

class Monoid d where
  mempty :: d
  mappend :: d -> d -> d

(Note that Monoid also has a derived operation mconcat, and is nowadays decomposed into Semigroup and Monoid, but I elide those details here; see https://hackage.haskell.org/package/base-4.14.0.0/docs/Prelude.html#t:Monoid.)

We might naively think that whoops would therefore not type check, since we do not know that d is a function type. However, whoops is well-typed, and evaluating whoops [1,2,3] [4,5,6] returns []. Had the code been as I intended (using mappend instead of mempty), we would expect [1,2,3,4,5,6] according to the usual monoid on lists.

This is not a type error because of GHC’s instance resolution and the following instance of Monoid provided by base:

instance Monoid b => Monoid (a -> b) where
  mempty = \_ -> mempty
  mappend f g = \x -> f x `mappend` g x

That is, functions are monoids if their codomain is a monoid, with mempty as the constant function returning the mempty element of b, and mappend as the pointwise lifting of the monoid on b to the function space.
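To see this instance in action, here is a small illustrative GHCi session of my own (not from the original code): mappend combines two list functions pointwise, while mempty at a function type ignores its argument and returns the empty list.

*Main> (take 1 `mappend` drop 2) [1,2,3,4]
[1,3,4]

*Main> (mempty :: [Int] -> [Int]) [1,2,3]
[]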

In this case, we can dig into what is happening by compiling¹ with -ddump-ds-preopt and looking at GHC’s desugared output before optimisation, where all the type class instances have been resolved. I’ve cleaned up the output a little (mostly renaming):

whoops :: forall d. Monoid d => d -> d -> d
whoops = \ (@ d) ($dMonoid :: Monoid d) ->
  let
    $dMonoid_f :: Monoid (d -> d)
    $dMonoid_f = GHC.Base.$fMonoid-> @ d @ d $dMonoid

    $dMonoid_ff :: Monoid (d -> d -> d)
    $dMonoid_ff = GHC.Base.$fMonoid-> @ (d -> d) @ d $dMonoid_f
  in
    \ (x :: d) (y :: d) -> mempty @ (d -> d -> d) $dMonoid_ff x y

The second line shows whoops has a type parameter (written @ d) and the incoming dictionary $dMonoid representing the Monoid d type class instance (type classes are implemented as data types called dictionaries, and I use the names $dMonoidX for these here).

Via the explicit type application (of the form @ t for a type term t) we can see in the last line that mempty is being resolved at the type d -> d -> d with the monoid instance Monoid (d -> d -> d) given here by the $dMonoid_ff construction just above. This is in turn derived from the Monoid (d -> d) given by the dictionary construction $dMonoid_f just above that. Thus we have gone twice through the lifting of a monoid to a function space, and so our use of mempty here is:

mempty @ (d -> d -> d) $dMonoid_ff = \_ -> (\_ -> mempty @ d $dMonoid)

thus

mempty @ (d -> d -> d) $dMonoid_ff x y = mempty @ d $dMonoid

That’s why the program type checks and we see the mempty element of the original intended monoid on d when applying whoops to some arguments and evaluating.
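Incidentally, the mistake could have been forced into the open by annotating mempty with the type I expected it to have. Here is a sketch of my own (assuming the ScopedTypeVariables extension, so that d can be mentioned in the body):

{-# LANGUAGE ScopedTypeVariables #-}

whoops :: forall d . Monoid d => d -> d -> d
whoops x y = (mempty :: d) x y
-- Rejected: the annotation pins mempty to type d, but applying it
-- to x and y would require d to also be a function type, so GHC
-- reports a type error here.

With the annotation in place, instance resolution can no longer quietly pick the function-space monoid.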

Epilogue

I luckily spotted my mistake quite quickly, but this kind of bug can be a confounding experience for beginners. There has been some discussion about extending GHCi with a feature allowing users to ask GHC to explain its instance resolution. Michael Sloan has a nice write-up discussing the idea, and there is a GHC ticket by Icelandjack proposing something similar, which seems like it would work well in this context, where you want to ask what the instance resolution was for a particular expression. Many far more confusing situations can be hidden by the implicit nature of instance resolution, so I think this would be a very useful feature for beginner and expert Haskell programmers alike. And it certainly would have explained this particular error to me quickly, without me having to scribble on paper and then check -ddump-ds-preopt to confirm my suspicion.

Additional: I should also point out that this kind of situation could be avoided if there were ways to scope, import, and even name type class instances. The monoid instance Monoid b => Monoid (a -> b) is very useful, but having it in scope by default as part of base was really the main issue here.


¹ To compile, put this in a file with a main stub, e.g.

main :: IO ()
main = return ()

Considering the order of results when computing Cartesian product [short]

Sometimes in programming you need to do a pairwise comparison of some elements coming from two collections, for example, checking possible collisions between particles (which may be embedded inside a quadtree representation for efficiency). A handy operation is then the Cartesian product of the two sets of elements, to get the set of all pairs, which can then be traversed.

Whenever I need a Cartesian product of two lists in Haskell, I whip out the list monad to generate the Cartesian product:

cartesianProduct :: [a] -> [b] -> [(a, b)]
cartesianProduct as bs = as >>= (\a -> bs >>= (\b -> [(a, b)]))

Thus for every element a in as we get every element b in bs and form the singleton list of the pair (a, b), all of which get concatenated to produce the result. For example:

*Main> cartesianProduct [1..4] [1..4]
[(1,1),(1,2),(1,3),(1,4),(2,1),(2,2),(2,3),(2,4),(3,1),(3,2),(3,3),(3,4),(4,1),(4,2),(4,3),(4,4)]

This traverses the Cartesian space in row order. That is, if we imagine the square grid of possibilities here (where the element at the ith row and jth column corresponds to element (i, j)), then cartesianProduct generates the pairs in the following order:

[Figure: row-major traversal order of the 4×4 grid]
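As an aside, the same function can be written as a list comprehension, which desugars to exactly this use of the list monad and produces pairs in the same row-major order (the name cartesianProduct' is mine, just for illustration):

cartesianProduct' :: [a] -> [b] -> [(a, b)]
cartesianProduct' as bs = [ (a, b) | a <- as, b <- bs ]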

In mathematics, the Cartesian product is defined on sets, so the order of these pairs is irrelevant. Indeed, if we want to examine all pairs, then it may not matter in what order. However, if we learn something as we look at the pairs (i.e., accumulating some state), then the order can be important.

In some recent research, I was building an automated algebra tool to find axioms for some algebraic structure, based on an interpretation. This involved generating all pairs of syntax trees t and t' of terms over the algebraic structure, up to a particular depth, and evaluating whether t = t'. I also had some automated proof-search machinery to make sure that this equality wasn’t already derivable from the previously generated axioms. I could have done this derivability check as a filtering afterwards, but I was exploring a very large space, and did not expect to even be able to generate all the possibilities. I just wanted to let the thing run for a few hours and see how far it got. Therefore, I needed the Cartesian product to get all pairs, but the order in which I generated the pairs became important for the effectiveness of my approach. The above ordering (row-major order) was not so useful, as I was unlikely to find interesting axioms quickly by traversing the span of a (very long) row; I needed to unpack the space in a more balanced way.

My first attempt at a more balanced Cartesian product was the following:

import Data.List (tails)

cartesianProductBalanced :: [a] -> [b] -> [(a, b)]
cartesianProductBalanced as bs =
     concatMap (zip as) (tails bs)
  ++ concatMap (flip zip bs) (tail (tails as))

tails (from Data.List) gives the list of all suffixes obtained by successively applying tail, e.g., tails [1..4] = [[1,2,3,4],[2,3,4],[3,4],[4],[]].

This definition for cartesianProductBalanced essentially traverses the diagonal of the space, then the upper triangle of the matrix (the remainder of the first line of code), progressing away from the diagonal towards the top-right corner, and then the lower triangle of the matrix (the last line of code), progressing away from the diagonal towards the bottom-left corner. The ordering for a 4×4 space is then:

*Main> cartesianProductBalanced [1..4] [1..4]
[(1,1),(2,2),(3,3),(4,4),(1,2),(2,3),(3,4),(1,3),(2,4),(1,4),(2,1),(3,2),(4,3),(3,1),(4,2),(4,1)]

[Figure: diagonal-then-triangles traversal order of the 4×4 grid]
This was more balanced, giving me a better view of the space, but does not scale well: traversing the elements of this Cartesian product linearly means first traversing all the way down the diagonal, which could be very long! So, we’ve ended up with a similar problem to the row-major traversal.

Instead, I finally settled on a “tiling” Cartesian product:

cartesianProductTiling :: [a] -> [b] -> [(a, b)]
cartesianProductTiling [] _ = []
cartesianProductTiling _ [] = []
cartesianProductTiling [a] [b] = [(a, b)]
cartesianProductTiling as bs =
       cartesianProductTiling as1 bs1
    ++ cartesianProductTiling as2 bs1
    ++ cartesianProductTiling as1 bs2
    ++ cartesianProductTiling as2 bs2
  where
    (as1, as2) = splitAt ((length as) `div` 2) as
    (bs1, bs2) = splitAt ((length bs) `div` 2) bs

This splits the space into four quadrants and recursively applies itself to the upper-left, then lower-left, then upper-right, then lower-right, e.g.,

*Main> cartesianProductTiling [1..4] [1..4]
[(1,1),(2,1),(1,2),(2,2),(3,1),(4,1),(3,2),(4,2),(1,3),(2,3),(1,4),(2,4),(3,3),(4,3),(3,4),(4,4)]

[Figure: tiled (Z-order) traversal of the 4×4 grid]

Thus, we explore the upper-left portion first, without having to traverse down a full row, column, or diagonal. This turned out to be much better for my purposes of searching the space of possible axioms for an algebraic structure.

Note that this visits the elements in Z-order (also known as the Lebesgue space-filling curve) [Actually, it is a reflected Z-order curve, but that doesn’t matter here].

The only downside to this tiling approach is that it does not work for infinite spaces (as it calculates the length of as and bs), so we cannot exploit Haskell’s laziness here to help us in the infinite case. I’ll leave it as an exercise for the reader to define a tiling version that works for infinite lists.

Of course, there are many other possibilities depending on your application domain!


Later edit: In the case of the example I mentioned, I actually do not need the full Cartesian product, since the order within a pair was not relevant to me (equality is symmetric), nor do I need the pairs of identical elements (equality is reflexive). So for my program, computing just one triangle of the Cartesian product square, excluding the diagonal, is much more efficient, i.e.:

cartesianProductLowerTriangle :: [a] -> [b] -> [(a, b)]
cartesianProductLowerTriangle as bs = concatMap (zip as) (tail (tails bs))
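For example (output obtained by running the definition above):

*Main> cartesianProductLowerTriangle [1..4] [1..4]
[(1,2),(2,3),(3,4),(1,3),(2,4),(1,4)]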

However, this does not have the nice property of the tiling product, where we visit the elements in a more balanced fashion. I’ll leave that one for another day (I got what I needed from my program in the end anyway).

Rearranging equations using a zipper

Whilst experimenting with some ideas for a project, I realised I needed a quick piece of code to rearrange equations (defined in terms of +, *, -, and /) in AST form, e.g., given an AST for the equation x = y + 3, rearrange to get y = x - 3.

I realised that equations can be formulated as zippers over an AST, where operations for navigating the zipper essentially rearrange the equation. I thought this was quite neat, so I thought I would show the technique here. The code is in simple Haskell.

I’ll show the construction for a simple arithmetic calculus with the following AST data type of terms:

data Term = Add Term Term 
          | Mul Term Term 
          | Div Term Term
          | Sub Term Term  
          | Neg Term
          | Var String
          | Const Integer

with some standard pretty printing code:

instance Show Term where 
    show (Add t1 t2) = (show' t1) ++ " + " ++ (show' t2)
    show (Mul t1 t2) = (show' t1) ++ " * " ++ (show' t2)
    show (Sub t1 t2) = (show' t1) ++ " - " ++ (show' t2)
    show (Div t1 t2) = (show' t1) ++ " / " ++ (show' t2)
    show (Neg t) = "-" ++ (show' t) 
    show (Var v) = v
    show (Const n) = show n

where show' is a helper to minimise brackets, e.g., pretty printing “-(v)” as “-v”:

show' :: Term -> String
show' (Var v) = v
show' (Const n) = show n
show' t@(Neg (Var _)) = show t
show' t@(Neg (Const _)) = show t
show' t = "(" ++ show t ++ ")"

Equations can be defined as pairs of terms, i.e., ‘T1 = T2’ where T1 and T2 are both represented by values of Term. However, instead, I’m going to represent equations via a zipper.

Zippers (described beautifully in the paper by Huet) represent values that have some subvalue “in focus”. The position of the focus can then be shifted through the value, refocussing on different parts. This is encoded by pairing a focal subvalue with a path to this focus, which records the rest of the value that is not in focus. For equations, the zipper type pairs a focus Term (which we’ll think of as the left-hand side of the equation) with a path (which we’ll think of as the right-hand side of the equation).

data Equation = Eq Term Path

Paths give a sequence of direction markers, essentially providing an address to the term in focus, starting from the root, where each marker is accompanied with the label of the parent node and the subtree of the branch not taken, i.e., a path going left is paired with the right subtree (which is not on the path to the focus).

data Path = Top (Either Integer String)  -- At top: constant or variable
           | Bin Op                -- OR in a binary operation Op,
                 Dir               --    in either left (L) or right (R) branch
                 Term              --    with the untaken branch 
                 Path              --    and the rest of the equation
           | N Path                -- OR in the unary negation operation

data Dir = L | R 
data Op = A | M | D | S | So | Do

The Op type gives tags for every operation, as well as additional tags So and Do which represent sub and divide but with arguments flipped. This is used to get an isomorphism between the operations that zip “up” and “down” the equation zipper, refocussing on subterms.

A useful helper maps tags to their operations:

opToTerm :: Op -> (Term -> Term -> Term)
opToTerm A = Add
opToTerm M = Mul
opToTerm D = Div
opToTerm S = Sub
opToTerm So = (\x -> \y -> Sub y x)
opToTerm Do = (\x -> \y -> Div y x)

Equations are pretty printed as follows:

instance Show Path where
    show p = show . pathToTerm $ p
    
instance Show Equation where
    show (Eq t p) = (show t) ++ " = " ++ (show p)

where pathToTerm converts paths to terms:

pathToTerm :: Path -> Term
pathToTerm (Top (Left c)) = Const c
pathToTerm (Top (Right v))= Var v
pathToTerm (Bin op L t p) = (opToTerm op) (pathToTerm p) t
pathToTerm (Bin op R t p) = (opToTerm op) t (pathToTerm p)
pathToTerm (N p)          = Neg (pathToTerm p)
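For example (a small worked evaluation of my own), consider the path Bin S L (Const 3) (Top (Right "x")): inside a subtraction, in its left branch, with Const 3 as the untaken branch and the variable x as the rest of the equation. It converts back to a term as follows:

pathToTerm (Bin S L (Const 3) (Top (Right "x")))
  = Sub (pathToTerm (Top (Right "x"))) (Const 3)
  = Sub (Var "x") (Const 3)    -- i.e., x - 3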

Now onto the zipper operations, which provide the rearranging of the equation. Equations are zipped-down by left and right which, for a binary operation, focus on the left or right argument respectively; for unary negation, focus on the single argument; and for constants or variables, do nothing. When going left or right, the equations are rebalanced with the inverse arithmetic operations (shown in the comments here):

left :: Equation -> Equation
left (Eq (Var s) p)     = Eq (Var s) p
left (Eq (Const n) p)   = Eq (Const n) p
left (Eq (Add t1 t2) p) = Eq t1 (Bin S L t2 p)   -- t1 + t2 = p  -> t1 = p - t2
left (Eq (Mul t1 t2) p) = Eq t1 (Bin D L t2 p)   -- t1 * t2 = p  -> t1 = p / t2
left (Eq (Div t1 t2) p) = Eq t1 (Bin M L t2 p)   -- t1 / t2 = p  -> t1 = p * t2
left (Eq (Sub t1 t2) p) = Eq t1 (Bin A L t2 p)   -- t1 - t2 = p  -> t1 = p + t2
left (Eq (Neg t) p)     = Eq t (N p)             -- -t = p       -> t = -p

right :: Equation -> Equation
right (Eq (Var s) p)     = Eq (Var s) p
right (Eq (Const n) p)   = Eq (Const n) p
right (Eq (Add t1 t2) p) = Eq t2 (Bin So R t1 p)  -- t1 + t2 = p -> t2 = p - t1
right (Eq (Mul t1 t2) p) = Eq t2 (Bin Do R t1 p)  -- t1 * t2 = p -> t2 = p / t1
right (Eq (Div t1 t2) p) = Eq t2 (Bin D R t1 p)   -- t1 / t2 = p -> t2 = t1 / p
right (Eq (Sub t1 t2) p) = Eq t2 (Bin S R t1 p)   -- t1 - t2 = p -> t2 = t1 - p
right (Eq (Neg t) p)     = Eq t (N p)

In both left and right, Add and Mul become subtraction and division, but in right, in order for the zipping-up operation to be the inverse, subtraction and division are represented using the flipped So and Do markers.

Equations are zipped-up by up, which unrolls one step of the path and reforms the term on the left-hand side from that on the right. This is the inverse of left and right:

up :: Equation -> Equation
up (Eq t1 (Top a))        = Eq t1 (Top a)
up (Eq t1 (Bin A L t2 p)) = Eq (Sub t1 t2) p -- t1 = t2 + p -> t1 - t2 = p
up (Eq t1 (Bin M L t2 p)) = Eq (Div t1 t2) p -- t1 = t2 * p -> t1 / t2 = p
up (Eq t1 (Bin D L t2 p)) = Eq (Mul t1 t2) p -- t1 = p / t2 -> t1 * t2 = p
up (Eq t1 (Bin S L t2 p)) = Eq (Add t1 t2) p -- t1 = p - t2 -> t1 + t2 = p

up (Eq t1 (Bin So R t2 p)) = Eq (Add t2 t1) p -- t1 = p - t2 -> t2 + t1 = p
up (Eq t1 (Bin Do R t2 p)) = Eq (Mul t2 t1) p -- t1 = p / t2 -> t2 * t1 = p
up (Eq t1 (Bin D R t2 p))  = Eq (Div t2 t1) p -- t1 = t2 / p -> t2 / t1 = p
up (Eq t1 (Bin S R t2 p))  = Eq (Sub t2 t1) p -- t1 = t2 - p -> t2 - t1 = p

up (Eq t1 (N p))           = Eq (Neg t1) p    -- t1 = -p     -> -t1 = p

And that’s it! Here is an example of its use from GHCi.

foo = Eq (Sub (Mul (Add (Var "x") (Var "y")) (Add (Var "x") 
           (Const 1))) (Const 1)) (Top (Left 0))

*Main> foo
((x + y) * (x + 1)) - 1 = 0

*Main> left $ foo
(x + y) * (x + 1) = 0 + 1

*Main> right . left $ foo
x + 1 = (0 + 1) / (x + y)

*Main> left . right . left $ foo
x = ((0 + 1) / (x + y)) - 1

*Main> up . left . right . left $ foo
x + 1 = (0 + 1) / (x + y)

*Main> up . up . left . right . left $ foo
(x + y) * (x + 1) = 0 + 1

*Main> up . up . up . left . right . left $ foo
((x + y) * (x + 1)) - 1 = 0
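Returning to the rearrangement promised at the start, the equation x = y + 3 (oriented with y + 3 in focus) rearranges to y = x - 3 in a single navigation step. Here bar is a hypothetical value of my own, following the same pattern as foo:

bar = Eq (Add (Var "y") (Const 3)) (Top (Right "x"))

*Main> bar
y + 3 = x

*Main> left bar
y = x - 3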

It is straightforward to prove that up . left $ x = x (when left x is not equal to x) and up . right $ x = x (when right x is not equal to x).

Note, I am simply rebalancing the syntax of equations: this technique does not help if you have multiple uses of a variable and you want to solve the question for a particular variable, e.g. y = x + 1/(3x), or quadratics.

Here’s a concluding thought. The navigation operations left, right, and up essentially apply the inverse of the operation in focus to each side of the equation. We could therefore reformulate the navigation operations in terms of any group: given a term L ⊕ R under focus, where ⊕ is the binary operation of a group with inverse operation ⁻¹, navigating left applies ⊕ R⁻¹ to both sides and navigating right applies ⊕ L⁻¹. However, in this blog post there is a slight difference: navigating applies the inverse to both sides and then reduces the term on the left-hand side using the group axioms X ⊕ X⁻¹ = I (where I is the identity element of the group) and X ⊕ I = X, so that the term does not grow larger and larger with inverses.

I wonder if there are other applications with a group structure (or a number of interacting groups) for which the above zipper approach would be useful?

Subcategories & “Exofunctors” in Haskell

In my previous post I discussed the new constraint kinds extension to GHC, which provides a way to get type-indexed constraint families in GHC/Haskell. The extension provides some very useful expressivity. In this post I’m going to explain a possible use of the extension.

In Haskell, the Functor class is misleadingly named: it actually captures the notion of an endofunctor, not functors in general. This post shows a use of constraint kinds to define a type class of exofunctors, that is, functors that are not necessarily endofunctors. I will explain what all of this means.

This example is just one from a draft note (edit July 2012: draft note subsumed by my TFP 2012 submission) explaining the use of constraint families, via the constraint kinds extension, for describing abstract structures from category theory that are parameterised by subcategories, including non-endofunctors, relative monads, and relative comonads.

I will try to concisely describe any relevant concepts from category theory, through the lens of functional programming, although I’ll elide some details.

The Hask category

The starting point of the idea is that programs in Haskell can be understood as providing definitions within some category, which we will call Hask. Categories comprise a collection of objects and a collection of morphisms which are mappings between objects. Categories come equipped with identity morphisms for every object and an associative composition operation for morphisms (see Wikipedia for a more complete, formal definition). For Hask, the objects are Haskell types, morphisms are functions in Haskell, identity morphisms are provided by the identity function, and composition is the usual function composition operation. For the purpose of this discussion we are not really concerned about the exact properties of Hask, just that Haskell acts as a kind of internal language for category theory, within some arbitrary category Hask (Dan Piponi provides some discussion on this topic).

Subcategories

Given some category C, a subcategory of C comprises a subcollection of the objects of C and a subcollection of the morphisms of C which map only between objects in the subcollection of this subcategory.

We can define for Hask a singleton subcategory for each type, which has just that one type as an object and functions from that type to itself as morphisms e.g. the Int-subcategory of Hask has one object, the Int type, and has functions of type Int → Int as morphisms. If this subcategory has all the morphisms Int → Int it is called a full subcategory. Is there a way to describe “larger” subcategories with more than just one object?

Via universal quantification we could define the trivial (“non-proper”) subcategory of Hask with objects of type a (implicitly universally quantified) and morphisms a -> b, which is just Hask again. Is there a way to describe “smaller” subcategories with fewer than all the objects, but more than one object? Yes. For this we use type classes.

Subcategories as type classes

The instances of a single parameter type class can be interpreted as describing the members of a set of types (or a relation on types for multi-parameter type classes). In a type signature, a universally quantified type variable constrained by a type class constraint represents a collection of types that are members of the class. E.g. for the Eq class, the following type signature describes a collection of types for which there are instances of Eq:

      Eq a => a

The members of Eq are a subcollection of the objects of Hask. Similarly, the type:

      (Eq a, Eq b) => (a -> b)

represents a subcollection of the morphisms of Hask mapping between objects in the subcollection of objects which are members of Eq. Thus, the Eq class defines an Eq-subcategory of Hask with the above subcollections of objects and morphisms.

Type classes can thus be interpreted as describing subcategories in Haskell. In a type signature, a type class constraint on a type variable thus specifies the subcategory which the type variable ranges over the objects of. We will go on to use the constraint kinds extension to define constraint-kinded type families, allowing structures from category theory to be parameterised by subcategories, encoded as type class constraints. We will use functors as the example in this post (more examples here).

Functors in Haskell

In category theory, a functor provides a mapping between categories, e.g. F : C → D, mapping the objects and morphisms of C to objects and morphisms of D. Functors preserve identities and composition between the source and target categories (see Wikipedia for more). An endofunctor is a functor where C and D are the same category.

The type constructor of a parametric data type in Haskell provides an object mapping from Hask to Hask e.g. given a data type data F a = ... the type constructor F maps objects (types) of Hask to other objects in Hask. A functor in Haskell is defined by a parametric data type, providing an object mapping, and an instance of the well-known Functor class for that data type:

class Functor f where
   fmap :: (a -> b) -> f a -> f b

which provides a mapping on morphisms, called fmap. There are many examples of functors in Haskell, for example lists, where the fmap operation is the usual map operation, or the Maybe type. However, not all parametric data types are functors.

It is well-known that the Set data type in Haskell cannot be made an instance of the Functor class. The Data.Set library provides a map operation of type:

Set.map :: (Ord a, Ord b) => (a -> b) -> Set a -> Set b

The Ord constraint on the element types is due to the implementation of Set using balanced binary trees, thus elements must be comparable. Whilst the data type is declared polymorphic, the constructors and transformers of Set allow only elements of a type that is an instance of Ord.

Using Set.map to define an instance of the Functor class for Set causes a type error:

instance Functor Set where
   fmap = Data.Set.map

...

foo.lhs:4:14:
    No instances for (Ord b, Ord a)
      arising from a use of `Data.Set.map'
    In the expression: Data.Set.map
    In an equation for `fmap': fmap = Data.Set.map
    In the instance declaration for `Functor Set'

The type error occurs because the signature for fmap has no constraints (or rather, the empty, always-true constraint), whereas Set.map requires Ord constraints; the mismatch produces the error.

The type error is however well justified from a mathematical perspective.

Haskell functors are not functors, but endofunctors

First of all, the name Functor is a misnomer; the class actually describes endofunctors, that is, functors which have the same category for their source and target. If we understand type class constraints as specifying a subcategory, then the lack of constraints on fmap means that Functor describes endofunctors Hask → Hask.

The Set data type is not an endofunctor; it is a functor which maps from the Ord-subcategory of Hask to Hask, i.e., Set : Ord → Hask. The class constraints on the element types in Set.map declare the source subcategory of the Set functor, to which the mapped morphisms belong.

Type class of exofunctors

Can we define a type class which captures functors that are not necessarily endofunctors, but may have distinct source and target categories? Yes, using an associated type family of kind Constraint.

The following ExoFunctor type class describes a functor from a subcategory of Hask to Hask:

{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE TypeFamilies #-}

import Prelude hiding (fmap)   -- our fmap replaces the Prelude's
import GHC.Exts (Constraint)   -- Data.Kind (Constraint) in more recent GHC

class ExoFunctor f where
   type SubCat f x :: Constraint
   fmap :: (SubCat f a, SubCat f b) => (a -> b) -> f a -> f b

The SubCat family defines the source subcategory for the functor, which depends on f. The target subcategory is just Hask, since f a and f b do not have any constraints.

We can now define the following instance for Set:

instance ExoFunctor Set where
   type SubCat Set x = Ord x
   fmap = Set.map

Endofunctors can also be made instances of ExoFunctor using the empty constraint, e.g.:

instance ExoFunctor [] where
    type SubCat [] a = ()
    fmap = map
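With these instances in place (and the Prelude’s fmap hidden, so that the class method is the one in scope), a GHCi session might look like the following; the results follow directly from Set.map and map:

*Main> fmap (+1) (Set.fromList [1,2,3])
fromList [2,3,4]

*Main> fmap (+1) [1,2,3]
[2,3,4]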

(Aside: one might be wondering whether we should also have a way to restrict the target subcategory to something other than Hask here. By covariance we can always “cast” a functor C → D, where D is a subcategory of some other category E, to C → E without any problems. Thus, there is nothing to be gained from restricting the target to a subcategory, as it can always be reinterpreted as Hask.)

Conclusion (implementational restrictions = subcategories)

Subcategory constraints are needed when a data type is restricted in its polymorphism by its operations, usually because of some hidden implementational details that have permeated to the surface. These implementational details have until now been painful for Haskell programmers, and have threatened abstractions such as functors, monads, and comonads. Categorically, these implementational restrictions can be formulated succinctly with subcategories, for which there are corresponding structures of non-endofunctors, relative monads, and relative comonads. Until now there has been no succinct way to describe such structures in Haskell.

Using constraint kinds we can define associated type families, of kind Constraint, which allow abstract categorical structures, described via their operations as a type class, to be parameterised by subcategories on a per-instance basis. We can thus define a class of exofunctors, i.e. functors that are not necessarily endofunctors, which we showed here. The other related structures which are difficult to describe in Haskell without constraint kinds: relative monads and relative comonads, are discussed further in a draft note (edit July 2012: draft note subsumed by my TFP 2012 submission). The note includes examples of a Set monad and an unboxed array comonad, both of which expose their implementational restrictions as type class constraints which can be described as subcategory constraints.
Any feedback on this post or the draft note is greatly appreciated. Thanks.

Constraint kinds in Haskell, finally bringing us constraint families

Back in 2009, Tom Schrijvers and I wrote a paper entitled Haskell Type Constraints Unleashed [1], which appeared at FLOPS 2010 in April. In the paper we fleshed out the idea of adding constraint synonyms and constraint families to GHC/Haskell, building upon various existing proposals for class families/indexed constraints. The general idea in our paper, and in the earlier proposals, is to add a mechanism to GHC/Haskell to allow constraints, such as type class or equality constraints, to be indexed by a type in the same way that type families and data families allow types to be indexed by types.

As an example of why constraint families are useful, consider the following type class which describes a simple, polymorphic, embedded language in Haskell, in the “finally tagless” style [2] (this example appears in [1]):

class Expr sem where
    constant :: a -> sem a
    add :: sem a -> sem a -> sem a

Instances of Expr provide different evaluation semantics for the language. For example, we might like to evaluate the language for numerical values, so we might try and write the following instance:

data E a = E {eval :: a}

instance Expr E where
     constant c = E c
     add e1 e2 = E $ (eval e1) + (eval e2)

However, this instance does not type check. GHC returns the type error:

    No instance for (Num a)
      arising from a use of `+'
    In the second argument of `($)', namely `(eval e1) + (eval e2)'
    ...

The + operation requires the Num a constraint, yet the signature for add states no constraints on type variable a, thus the constraint is never satisfied in this instance. We could add the Num a constraint to the signature of add, but this would restrict the polymorphism of the language: further instances would have this constraint forced upon them. Other useful semantics for the language may require other constraints e.g. Show a for pretty printing. Should we just add more and more constraints to the class? By no means!

Constraint families, as we describe in [1], provide a solution: by associating a constraint family to the class, we can vary, or index, the constraints in the types of add and constant by the type of an instance of Expr. The solution we suggest looks something like:

class Expr sem where
    constraint Pre sem a
    constant :: Pre sem a => a -> sem a
    add :: Pre sem a => sem a -> sem a -> sem a

instance Expr E where
    constraint Pre E a = Num a
    ... -- methods as before

Pre is the name of a constraint family that takes two type parameters and returns a constraint, where the first type parameter is the type parameter of the Expr class.

We could add some further instances:

data P a = P {prettyP :: String}

instance Expr P where
    constraint Pre P a = Show a
    constant c = P (show c)
    add e1 e2 = P $ prettyP e1 ++ " + " ++ prettyP e2

e.g.

*Main> prettyP (add (constant 1) (constant 2))
"1 + 2"

At the time of writing the paper I had only a prototype implementation in the form of a preprocessor that desugared constraint families into some equivalent constructions (which were significantly more awkward and ugly of course). There has not been a proper implementation in GHC, or of anything similar. Until now.

At CamHac, the Cambridge Haskell Hackathon, last month, Max Bolingbroke started work on an extension for GHC called “constraint kinds”. The new extension unifies types and constraints such that the only distinction is that constraints have a special kind, denoted Constraint. Thus, for example, the Show class constructor is actually a type constructor, of kind:

Show :: * -> Constraint

For type signatures of the form C => t, the left-hand side is now a type term of kind Constraint. As another example, the equality constraint constructor ~ now has kind:

~ :: * -> * -> Constraint

i.e. it takes two types and returns a constraint.

Since constraints are now just types, existing type system features on type terms, such as synonyms and families, can be reused on constraints. Thus we can now define constraint synonyms via standard type synonyms, e.g.:

type ShowNum a = (Num a, Show a)
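For example, a single ShowNum a constraint now abbreviates the pair of constraints (showDouble is a hypothetical function of my own, just to illustrate):

showDouble :: ShowNum a => a -> String
showDouble x = show (x + x)    -- needs both Num a (for +) and Show a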

And most excitingly, constraint families can be defined via type families of return kind Constraint. Our previous example can be written:

class Expr sem where
    type Pre sem a :: Constraint
    constant :: Pre sem a => a -> sem a
    add :: Pre sem a => sem a -> sem a -> sem a

instance Expr E where
    type Pre E a = Num a
    ...

Thus, Pre is a type family of kind * -> * -> Constraint. And it all works!
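For example, using the E instance from earlier, I would expect the following in GHCi:

*Main> eval (add (constant 1) (constant 2))
3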

The constraint kinds extension can be turned on via the pragma:

{-# LANGUAGE ConstraintKinds #-}

Max has written about the extension over at his blog, which has some more examples, so do go check that out. As far as I am aware, the extension should be hitting the streets with version 7.4 of GHC. But if you can’t wait, it is already in the GHC HEAD revision, so you can check out a development snapshot and give it a whirl.

In my next post I will show how we can use the constraint kinds extension to describe abstract structures from category theory in Haskell that are defined upon subcategories. I already have a draft note about this (edit July 2012: draft note subsumed by my TFP 2012 submission) if you can’t wait!

 

References


[1] Orchard, D., Schrijvers, T.: Haskell Type Constraints Unleashed. FLOPS 2010.

[2] Carette, J., Kiselyov, O., Shan, C.-C.: Finally Tagless, Partially Evaluated. APLAS 2007.

Paper: Ypnos, Declarative Parallel Structured Grid Programming

Max Bolingbroke, Alan Mycroft, and I have written a paper on a new DSL for programming structured grid computations with a view to parallelisation, called Ypnos, submitted to DAMP ’10.

Abstract:

A fully automatic, compiler-driven approach to parallelisation can result in unpredictable time and space costs for compiled code. On the other hand, a fully manual, human-driven approach to parallelisation can be long, tedious, prone to errors, hard to debug, and often architecture-specific. We present a declarative domain-specific language, Ypnos, for expressing structured grid computations which encourages manual specification of causally sequential operations but then allows a simple, predictable, static analysis to generate optimised, parallel implementations. We introduce the language and provide some discussion on the theoretical aspects of the language semantics, particularly the structuring of computations around the category theoretic notion of a comonad.

Any feedback is welcome.