# 1 Functorial Structure

In category theory, a functor \(F: C \to D\) between categories \(C\) and \(D\) is

- an assignment
- to objects \(a\) in \(C\) of objects \(F a\) in \(D\) and
- to arrows \(f : a \to b\) of arrows \(Ff : Fa \to Fb\) in \(D\),

- such that the laws of
- identity: \(F : \operatorname{id}_a = \operatorname{id}_{Fa}\), for all objects \(a\) in \(C\), and
- composition: \(F(g \circ f) = Fg \circ Ff\), for all (composable) morphisms \(f : a \to b\) and \(g : b \to c\) in \(C\),

Functors can also be considered as objects or as morphisms between objects. Indeed, in the category of categories, functors are morphisms, while in the category of functors, functors are objects, with natural transformations between them as the morphisms.

This categorical notion carries over to Haskell and justifies the choice of the name of the corresponding typeclass. Indeed,

which means that `f`

is a type constructor of kind `* -> *`

or `Type -> Type`

the parentheses in `(f a -> f b)`

are syntactically superfluous due to right-associativity of the arrow `->`

(it’s `infixr 0 ->`

), and which is such that the identity law

and the composition law

are satisfied. This must be verified manually but can be automated by black-box (structural or property) testing with, e.g., QuickCheck.

The essence of the `Functor`

typeclass is that its `fmap`

functor preserves the structure of the functorial value `x :: f a`

it acts on. From the type signature of `fmap`

it is obvious that it does not know anything about `f`

as such. All it gets as input is an arbitrary functorial value `x`

of type `f a`

. To be more precise, with the GHC extension `ExplicitForAll`

, we can correctly specify

The identity law gives us the consistent assignment of objects to objects.

In other words, `fmap`

can be considered as structure-preserving lifting. It even must preserve the length of a list, `[a]`

, which also is a `Functor`

. Just note that those types `f a`

and `f b`

can be functorial, too. In other words, `fmap`

would take a function (morphism in one category) `f a -> f b`

and map it to a function (morphism in another category) `g (f a) -> g (f b)`

. But the types represented by the type variables `a`

and `b`

have nothing to do with the functorial structure of the type constructor `f :: * -> *`

, as any type is good for it. Let’s get a bit deeper into this nesting of functorial structures that we’ve just discovered.

As a remark, `(<$>) = fmap`

with `infixl 4 <$>`

, i.e., `<$>`

is the infix synonym for `fmap`

, associative to the left of precedence order `4`

. Another side note is the regular confusion that arises due to the use of the letter `f`

for the functorial type constructor and for the function-type argument to `fmap`

. They live on different levels of the syntax, the former lives on the type level (types have kinds, akin “types of types”), whereas the latter on the term level (terms inhabit types).

# 2 Composition or Superposition of Functorial Structures

As usual, given an abstract notion, different perspectives let us discover new applications. Nesting of functorial structure is just one application of the categorical notion carried over to Haskell. Now, how do we express that nested `fmap`

above? By just giving `a ~ f c`

and `b ~ f d`

,^{1} where `a`

, `b`

, `c`

, `d`

are arbitrary types and `f`

is a functorial type constructor^{2}.

But how can we go a step further and lift a function `j :: a -> b`

to a function of type `g (f a) -> g (f b)`

or even `k (h (g (f a))) -> k (h (g (f b)))`

? It turns out, just by composition of `fmap`

s. Here’s how we do this.

```
(.) :: (b -> c) -> (a -> b) -> (a -> c)
fmap . fmap :: (Functor f, Functor g)
=> (a -> b) -> (g (f a) -> g (f b))
fmap . fmap . fmap :: (Functor f, Functor g, Functor h)
=> (a -> b) -> (h (g (f a)) -> h( g (f b)))
fmap . fmap . fmap . fmap :: (Functor f, Functor g, Functor h, Functor k)
=> (a -> b) -> (k (h (g (f a))) -> k (h( g (f b))))
```

Another question: when would we need this? Let’s take two of the simplest `Functor`

instances, `Maybe`

(free structure^{3} on computations that can fail) and `[]`

(non-deterministic computations^{4}). Imagine we have a pure function `j :: a -> b`

that performs a trivial transformation, say `j = const 'j'`

, where `const :: a -> b -> a`

can do only one thing, namely take a value of type `a`

, ignore the second argument of type `b`

, and result in that same first argument; indeed this is completely encoded in the type signature: `const`

expects a value of an arbitrary type `a`

, it doesn’t know which concrete type the type variable `a`

is until it’s bound, so it can’t use any class methods absent any type constraints, and it must result in a value of that same type `a`

, hence it can only act as `id :: a -> a`

on the first argument and cannot process the second one of a potentially different type `b`

.

Now imagine we have a value `x :: [Maybe Char]`

such as `[Just 'c', Nothing]`

. First note that both `[] :: * -> *`

and `Maybe :: * -> *`

are `Functor`

instances. We have therefore a nested functorial structure here, which we can write as `g (f a)`

where `g ~ []`

and `f ~ Maybe`

. So there are two levels of nesting. How about a value `y :: [Maybe String]`

such as `[Just "cd", Nothing]`

? It’s different now, since `type String = [Char]`

by definition, which gives one deeper level of functorial nesting and is a structure `g (f (h a))`

where `h ~ []`

. And if we add a `Maybe`

encapsulation, say, to obtain a value `z = Just y`

, we’ll end up with `k (g (f (h a)))`

where `k ~ Maybe`

. Let’s see how this plays out.

So now we have

```
x = [Just 'c', Nothing] :: [Maybe Char]
y = [Just "cd", Nothing] :: [Maybe String] ~ [Maybe [Char]]
z = Just y :: Maybe [Maybe String]
```

We first apply our constant `j :: a -> Char`

directly to each of the three values. In the comments I specify the concrete type of `j`

and what this `a`

is, for each of those three values, `x`

, `y`

and `z`

.

```
j x = 'j' -- j :: a ~ [Maybe Char] -> Char
j y = 'j' -- j :: a ~ [Maybe String] -> Char
j z = 'j' -- j :: a ~ Maybe [Maybe String] -> Char
```

Now, let’s lift `j`

to the context of the first level of the nesting.

```
fmap j x = ['j', 'j'] -- j :: a ~ Maybe Char -> Char
fmap j y = ['j', 'j'] -- j :: a ~ Maybe String -> Char
fmap j z = Just 'j' -- j :: a ~ [Maybe String] -> Char
```

Let’s go one level deeper,

```
ff = fmap . fmap
ff j x = [Just 'j', Nothing] -- j :: a ~ Char -> Char
ff j y = [Just 'j', Nothing] -- j :: a ~ String -> Char
ff j z = Just "jj" -- j :: a ~ Maybe String -> Char, so ff j :: (a -> Maybe String) -> f a -> f Char
```

The pattern is apparent: for `j`

, we peel off one outer layer after another. This is because

whereby if `j :: a -> b`

, where `b ~ Char`

, and

`x :: g (f a) ~ [Maybe Char]`

, then necessarily`g ~ []`

and`f ~ Maybe`

, and hence`a ~ Char`

, and therefore`j :: Char -> Char`

, and the curried application of`fmap . fmap`

to`j`

results in a function of type`Char -> g (f b) ~ [Maybe Char]`

;`y :: g (f a) ~ [Maybe String]`

, then necessarily`g ~ []`

and`f Maybe`

, and hence`a ~ String`

, and therefore we must have`j :: String -> Char`

and`fmap . fmap $ j :: String -> g (f b) ~ [Maybe Char]`

;`z :: g (f a) ~ Maybe [Maybe String]`

, then necessarily`g ~ Maybe`

,`f ~ []`

and`a ~ Maybe String`

, whereby`j :: Maybe String -> Char`

and`fmap .fmap $ j :: Maybe [Maybe String] -> g (f b) ~ Maybe [Char] = Maybe String`

.

Such a proof can be referred to as manual or explicit **type inference**. This is one of Haskell and its compilers’ strengths. We never need to do this, that’s the compiler’s job. But we may want to do this to justify a refactorization, the introduction of a new abstraction to our code, or the correctness of an instance definition for a type constructor (including nullary type constructors which are equivalent to concrete or proper types). The more experience we gain with explicit type inference, the more easily can we reason about our code. So this is an excellent exercise. As a suggestion, try using pen and paper, the academic way, it’s an illuminating experience if you’re not yet used to it.

To understand better why this works out this way, consider the result of lifting `j`

, multiple times in a row. I’ll explicitly include some syntactic peculiarities that functional programmers keep in their mind. So our basic setup is delineated in the following code block, where we go one level deep.

```
j :: a -> Char -- b ~ Char
fmap :: Functor f => (a -> b) -> (f a -> f b)
fmap j :: Functor f => f a -> f Char
```

Let’s descend another level deeper or lift another level higher. Note the remark on the syntactic precedence of function application, also denoted by the infix operator `($) :: (a -> b) -> a -> b`

with `infixr 0 $`

, over the composition operator `(.) :: (b -> c) -> (a -> b) -> (a -> c)`

with `infixl 9 .`

which means that function application as in `fmap j`

takes precedence over composition as in `fmap . fmap j`

, which is in contrast to the desired composition `(fmap . fmap)`

applied to `j`

. I’ll specify the corresponding types explicitly, so as to immediate recognize that it’s a different object.

```
fmap . fmap :: (Functor f, Functor g)
=> (a -> b) -> (g (f a) -> g (f b))
(fmap . fmap) j :: (Functor f, Functor g)
=> g (f a) -> g (f Char)
fmap . fmap j -- Precedence of application: 'infixr 9 .' vs 'infixr 0 $'.
== fmap . (fmap j) :: (Functor f, Functor g)
=> (a -> b) -> (f a -> f Char)
```

Good, next level.

```
fmap . fmap . fmap :: (Functor f, Functor g, Functor h)
=> (a -> b) -> (h (g (f a)) -> h (g (f b)))
(fmap . fmap . fmap) j :: (Functor f, Functor g, Functor h)
=> (h (g (f a)) -> h (g (f b)))
fmap . fmap . fmap j
== fmap . fmap . (fmap j) :: (Functor f, Functor g)
=> (a -> b) -> (g (f a) -> g (f Char))
```

And finally the fourth.

```
fmap . fmap . fmap .fmap :: (Functor f, Functor g, Functor h, Functor k)
=> (a -> b) -> (k (h (g (f a))) -> k (h (g (f b))))
(fmap . fmap . fmap . fmap ) j :: (Functor f, Functor g, Functor h, Functor k)
=> (k (h (g (f a))) -> k (h (g (f b))))
fmap . fmap . fmap . fmap j
== fmap . fmap . fmap . (fmap j) :: (Functor f, Functor g, Functor h)
=> (a -> b) -> (h (g (f a)) -> h (g (f Char)))
```

This way we now can access nested levels of functorial superstructures such as `k (h (g (f a)))`

, which could be more complex, such as `k (h (g a) (f b))`

, where `h`

is a `Bifunctor`

instance^{5}, any tree nesting basically, and with recursive, linear, or dependent types it goes much farther than this. We will leave these abstractions for later. This idea will come in handy when we deal with lifting monadic values in the context of monad transformers.

# 3 Implementations for `Maybe`

and `[]`

Let’s implement the `Functor`

type class for the `Maybe`

and the `[]`

type constructors. For correctness, enable the `InstanceSigs`

language extension in GHC, otherwise comment out the type signature inside.

## 3.1 A `Functor`

instance for `Maybe`

```
instance Functor Maybe where -- Recall: f ~ Maybe :: * -> *
fmap :: (a -> b) -> f a -> f b
fmap _ Nothing = Nothing
fmap f (Just x) = Just (f x)
```

- identity law:

- composition law:

```
fmap g . fmap f $ Nothing = fmap g Nothing = Nothing
== fmap (g . f) Nothing
fmap g . fmap f $ Just x = fmap g (Just f x) = Just ((g . f) x)
== fmap (g . f) (Just x)
```

This instance makes `Maybe`

a functor: a `Maybe`

value can now be mapped over with `fmap`

, it now conforms to the essence of the functor. And here you can see how our instance preserves the `Maybe`

structure:

- a
`Nothing`

becomes a`Nothing`

, and - a
`Just x`

value, where`x :: a`

and`a`

is an arbitrary type, becomes a`Just (f x)`

value.

This consistency is crucial. Our mapped function needs not and cannot know anything about the particular `Functor`

instance `f`

. If this feels tedious, feel free to skip such proofs on first reading. But they are essential. Here’s why. Let’s write an incorrect instance which the compiler won’t raise a red flag on.

which is equivalent to simply

This instance already fails even the identity law. Indeed, in this case, `fmap id (Just 'x') = Nothing /= Just 'x' = id (Just 'x')`

.

Alternatively, we can fix some arbitrary value.

This instance also violates the identity law. Indeed, in this case, `fmap id Nothing = Just 'x' /= Nothing = id Nothing`

.

But we could also define `Maybe`

to be the **identity functor**, which regardless of the mapped function always gives back the input values.

This instance now seems to preserve the structure. Indeed, it seems to satisfy both

- the identity law, for
`fmap id Nothing = Nothing == id Nothing`

and`fmap id (Just x) = Just x == id (Just x)`

, for any`x`

,

- and the composition law, for
`fmap g . fmap f $ Nothing = fmap g Nothing = Nothing = fmap (g . f) Nothing`

, and`fmap g . fmap f $ Just x = fmap g (Just x) = Just x = fmap (g . f) (Just x)`

, for any composable`g`

and`f`

and every`x`

.

But it doesn’t in fact type-check! Recall that `fmap :: (a -> b) -> f a -> f b`

. So what if `a ~ Int`

and `b ~ Char`

, so that the argument function `f :: Int -> Char`

? Then the first pattern on `Nothing`

would type-check flawlessly, but the second on `Just x`

would not if `x`

is not both `Int`

and `Char`

at the same time. But `Int`

and `Char`

are distinct! This is a contradiction. All would be well if only we could assume that `b ~ a`

so we could have taken `f = id`

, but we cannot. In other words, we *must* incorporate the transformation `a -> b`

somehow in the instance definition. I can’t express in words how much I appreciate that GHC is so incredibly smart! ♥ It turns out, this was actually not the identity functor in the first place!

And now an example where everything type-checks and the identity law is satisfied but the law of composition is violated. Is it even possible? We need to try to fail the fourth identity above. Can we?

```
instance Functor Maybe where
fmap _ Nothing = Nothing
fmap _ (Just _) = undefined -- The bottom value, which inhabits every type.
```

This type-checks flawlessly. But, is `undefined == undefined`

? Let’s see what we need to assert.

`fmap id Nothing = Nothing == id Nothing`

, this is correct.`fmap id (Just x) = undefined == id undefined`

, this is questionable.`fmap g . fmap f $ Nothing = fmap g Nothing = Nothing == fmap (g . f) Nothing`

, which is correct.`fmap g . fmap f $ Just x = fmap g undefined = undefined == fmap (g . f) (Just x)`

, which also is questionable.

Anyway, this is a pathological corner case, one which we must be aware of, nonetheless. As noted, `undefined`

is the bottom value, which inhabits every type. In other words, this is the only value that can take on any type, exactly as we need here. Due to Haskell’s lazy evaluation strategy, it’s harmless as long it is not evaluated and sits as a placeholder essentially, that makes our programs type-check and compile, as long as we have not yet filled the space it occupies with good code. But *if it is evaluated*, the program exits with a runtime error. In fact, `undefined = error "Prelude.undefined"`

. And `length [undefined, undefined] == 2`

since no evaluation of the values inside the list takes place. When we type-check we force an evaluation up to the weak-head normal form (WHNF), which is the evaluation up to the outermost constructor.

As a side note, we cannot instantiate `Maybe`

more than once, for a given type class. To overcome this limitation, `newtype`

s are introduced that are transparent wrappers at compile time so as to facilitate type checking, and are discarded in the binary code. For each new type, such as `newtype Maybe' a = Maybe' {runMaybe' :: Maybe a}`

, a new instance of the same class can be defined, e.g., `instance Functor Maybe' where`

, which describes a different behavior — that nonetheless must be law-abiding!

## 3.2 A `Functor`

instance for lists, `[]`

Our second concrete type of (linked) lists, `[]`

, is a functor too (and also so much more!) and its instance is also relatively straightforward to specify. Recall I said earlier that the length of a list is its inherent property and that functorial mapping `fmap`

is guaranteed to preserve the structure of a functorial value. Granted, lift extension or shrinkage is trivial, but it is not a functorial action! So our basic tenet for the instantiation of `[]`

as a `Functor`

will be maintenance of the structure. For comparison, first our `Maybe`

instance.

```
instance Functor [] where -- Recall: f ~ Maybe :: * -> *
fmap :: (a -> b) -> f a ~ Maybe a -> f b ~ Maybe b
fmap f Nothing = Nothing
fmap f (Just x) = Just (f x)
```

We ensured its correctness above by performing type-inference for the identity and the composition laws that are part of the definition of a functor, both in category theory and in Haskell. So if you are reading this and want to implement a functor in your language of choice, then make sure your instance also satisfies the laws, it is an overarching, language-independent inherent part of the notion.

Now, let’s look at our list instance.

```
instance Functor [] where
fmap :: (a -> b) -> f a ~ [a] -> f b ~ [b]
fmap f [] = []
fmap f (x:xs) = f x : fmap f xs -- Recursive!
```

We now need to verify the functorial laws.

identity law:

`fmap id [] = [] = id []`

and`fmap id xs'@(x:xs) = x : fmap id xs = ... = xs = id xs'`

, for any list`xs' :: [a]`

.composition law: for this we will invoke the method of proof by structural induction

- induction basis:
`fmap (g . f) [] = [] = fmap g [] = fmap g (fmap f []) = (fmap g . fmap f) []`

- induction hypothesis: suppose
`xs'@(x:xs)`

are such that`fmap (g . f) xs = (fmap g . fmap f) xs`

, and we need to show that, given this assumption,`fmap (g . f) xs' = (fmap g . fmap f) xs`

:

where we used the hypothesis in the last equality. And, conversely,

- induction basis:

That’s it. One important nit-pick: ellipses `...`

in proofs are, well, kinda acceptable on this informal level of discourse as we are maintaining here, but if you wanted to do it formally correct, you would have to express them as a complete induction with induction basis and induction hypothesis. In future, this will be an exercise for you. So if you feel you didn’t get it from this particular example, just read up on it on Wikipedia, it’s wonderful.^{6} ♥

# 4 Other Standard Data Types

## 4.1 Sum and Product Types

Let’s look at the type constructor `Either`

that generalizes `Maybe`

and the type constructor tuple `(,)`

. They are defined as

The most fundamental notion that they represent is that of sum types and product types, respectively. I mean, we can write equivalently

A sum type represents alternatives and has multiple constructors, here `Left`

and `Right`

or `One`

and `Two`

; logically this corresponds to `OR`

. A product type represents conjunction and has but one constructor, here `(,)`

or `Product'`

.^{7}

A propositional logical system can be well-defined with just negation and either of conjunction or disjunction. Indeed, the implication \[p \to q \equiv \neg (p \wedge \neg q) \equiv \neg p \vee q.\]

For predicate logics Haskell offers explicit and implicit `forall`

s. Existential quantification can be given in terms of the universal quantification by means of negation. We will not go as far yet. But note that GHC offers rank-\(n\) types and existential types.

*Exercise* Do you see in how what sense `Either`

generalizes `Maybe`

and `Maybe`

specifies `Either`

? Work this out.

By induction, sum and product types are well-defined for any finite natural number \(n\). This basic idea is crucial for dependent types, too.

*Exercise* If \(\sigma \in \Sigma\) is a sum type, and \(\pi \in \Pi\) a product type, where the capital letters represent sets of sum and product types, respectively, then what types belong to the sum set \[\Sigma + \Pi := \left\{ \sigma + \pi \;\mid\; \sigma \in \Sigma,\; \pi \in \Pi \right\}\] and the product sets \(\Pi\Sigma\) and \[\Sigma \Pi := \left\{ \sigma \pi \;\mid\; \sigma \in \Sigma,\; \pi \in \Pi \right\}\] respectively, where we define \(\sigma + \pi = \sigma | \pi\) and \(\sigma\pi = \sigma\, \pi\) as above in terms of Haskell?

### 4.1.1 Functor Instance for `Either`

Let’s define our instance. There is only one difference from `Maybe`

, and this is the “arity” of `Either`

, i.e., its kind is `* -> * -> *`

whereas `Maybe :: * -> *`

, and a `Functor`

can be any type constructor `f :: * -> *`

. So we need to make `Either`

a binary type constructor. In Haskell, kinds are also curried, so we can simply bind the first type variable. In short: `Maybe`

is a unary wheres `Either`

is a binary type constructor.^{8}

```
instance Functor (Either a) where -- f :: * -> * whereas Either :: * -> * -> *
fmap _ (Left x) = Left x
fmap f (Right x) = Right (f x)
```

- identity law:
`fmap id (Left x) = Left x = id (Left x)`

and`fmap if (Right x) = Right x = id (Right x)`

. - composition law:
`fmap (g . f) (Left x) = Left x = fmap g (fmap f (Left x)) = fmap g . fmap f $ Left x`

.

*Exercise* Make sure you can follow these lines. Write out on paper if you don’t understand every single step here.

Why did we choose to bind the first subtype and not the second? Well, in Haskell we don’t have kind-level \(\lambda\)-expressions. Other than this I’m not aware of any technical or formal reason, so it seems to be a bit arbitrary. We can, if necessary, define another sum type like `Either' a b = Left' b | Right' b`

in which we flip the order manually.

*Remark*. Recall, by induction, this is valid for any finite natural number of subtypes in a sum type.

### 4.1.2 Functor Instance for `(,)`

```
instance Functor ((,) a) where -- (,) :: * -> * -> *, and (,) a = (a,) :: * -> *
fmap g (x,y) = (x, g y)
```

- identity law:
`fmap id (x,y) = (x, y) = id (x,y)`

- composition law:
`fmap (g . f) (x,y) = (x, (g.f) y)`

and`fmap g (fmap f (x,y)) = fmap g (x, f y) = fmap g (x, g (f y))`

.

*Exercise* I’ve shortened the notation a bit here. Check whether this was indeed admissible.

Again, why did we choose to bind the first subtype and not the second? Same reason: no kind-level \(\lambda\)-expressions.

*Exercise* If you’re wondering why we didn’t take `(g x, g y)`

, then look at the types.

(*Hint*: the functorial type constructor here `f ~ (t,)`

, for a fixed arbitrary type `t`

, and `fmap :: (a -> b) -> f a ~ (t,a) -> f b ~ (t,b)`

. So `(g x, g y) :: (b, b)`

instead of `f b ~ (a,b)`

!)

*Remark*. Recall, by induction, this is valid for any finite natural number of subtypes in a product type.

## 4.2 Function Type `(->)`

Now this is where it gets mind boggling. Spoiler: it turns out `fmap`

ping over a function is just function composition! And it *has to be*! Wonder why? Let’s take a look! First, as always, our instance definition. Again, since the function type constructor is `(->) :: * -> * -> *`

, we need to bind its first type argument.

… erm, now we’re stuck… How do we even refer to a function here? Let’s go backwards. Let’s first analyze the types and see how we can meet our requirements. Okay, so `fmap :: (a -> b) -> f a -> f b`

and `f ~ (t ->)`

here, for a fixed arbitrary type `t`

. So `fmap :: (a -> b) -> (t -> a) -> (t -> b)`

. But that’s exactly the signature of function composition `(.) :: (b -> c) -> (a -> b) -> (a -> c)`

. So in order to simply type-check, we can try using function composition. This is by no means an exhaustive, at most a heuristic, approach to finding the definition. But it works here just perfectly. When we deal below with the two functorial laws, we’ll see why this definition is sufficient and necessary! Let’s now finish our definition.

```
instance Functor ((->) a) where -- f ~ (->) a = (a ->) :: * -> *
--fmap = (.) -- For brevity.
fmap g f = g . f -- For clarity.
```

Now, the functorial laws.

Identity law:

`fmap id f = id . f = f`

— now that was easy.Composition law:

^{9}for this we will need to assert**associativity of composition**^{10},

*Exercise* Make sure you understand this. Write this out otherwise.

Now, why is `(.)`

the only right way here? What if we had other functions with the same signature? Now do we actually? Let’s see… a function `x :: (b -> c) -> (a -> b) -> (a -> c)`

, for arbitrary types `a`

, `b`

, and `c`

… Let’s just use Hoogle! Well, there are a few, but they all are just the same composition! What if this is theoretically justified? Note that there are no constraints on the types `a`

, `b`

or `c`

, which means that `x`

must be able to deal with any type! What was special about `id :: a -> a`

? It knows nothing about `a`

, so it can’t process `a`

in any way, as there’s no way to know *just how* it *could* process `a`

in the first place. This situation is in fact not unique. We will encounter it later in the context of **existential types**, which formalize separation of concerns by transferring specification of argument types from the callee to the caller.

And here, `x`

must take two functions of arbitrary arguments, of which it only knows that they are coherent by ordering and repetition — the only actual bit of knowledge about them is that `b`

is input to the first and output to the second argument. Now given this information alone (number of arguments, their types (function types), and the type `b`

), we have to somehow fill in the gap here. On the other hand, we have the two functorial laws, but as you see they are trivial for function composition. So it’s really that consistency about `b`

.

*Exercise*: Work this out.

Just kidding! Look, we want to understand why only `(.)`

fits that type signature. We have to make this type-check. Let’s just spell it out according to the type signature alone.

Woah! But that’s just `g (f a)`

, i.e., the function composition `(g . f) a`

. So indeed `x = (.)`

. This was a bit more tedious than it should be, but I hope it’s clear enough to be able to follow the lines of the argument, even if you’re not yet comfortable enough with this form of reasoning. In mathematics, the higher the degree of formal maturity, the shorter simple arguments become, but the more complex are the constructions. The complexity of expression shifts somewhat.

*Exercise* This is still a bit hand-wavy. Figure out how to fill the blanks. Not kidding this time. It isn’t hard. Just repeat the argument and see what may be missing.

### 4.2.1 Functor Instances for `Sum'`

and `Product'`

There is no difference beyond notation between `Sum'`

and `Either`

on the one hand and `Product'`

and `(,)`

on the other. Suffices to write it out explicitly.

```
data Sum' a b = One a | Two b
data Product' a b = Product a b
instance Functor (Sum' a) where
fmap f (One a) = One a
fmap f (Two b) = Two (f b)
instance Functor (Product' a) where
fmap f (Product' a b) = Product' a (f b)
```

It is a bit different for phantom types

or in other words *essentially* `fmap _ = id`

.

But not for existential types:

```
data Existential a = forall e. Existential a e
instance Functor Existential where
fmap f (Existential a e) = Existential (f a) e
```

*Exercise* Verify the `Functor`

laws for each of these four instances.

## 4.3 Reader, Writer, and State Monads

You might be wondering what the point of the above academic “nonsense” was. Good question! Look, in Haskell and functional programming in general, we make use of such academic abstractions to ensure correctness of our code. This is not much different from the object-oriented design patterns most developers are used to. Here we have a strong theoretic background to ensure correctness. And we will use it where it makes sense and is reasonable.

The abstract take above paved the road, for example, for the Reader, Writer, and State monads. They all express how an effectful computation interacts with its environment. Computations in the context of Reader can only access read-only values stored in the environment, in the Writer monad, they can only write to the environment, and in the State context, a computation can both read from and write to its particular environment. This is separation of concerns at its foundations. From these building blocks we will go further and rise higher.

But first, let’s implement their `Functor`

instances. For this we first need to know how Reader, Writer, and State are defined. As I said earlier, they are nothing but what we already know. So we can implement them just in terms of:

```
newtype Reader r a = Reader { runReader :: r -> a }
newtype Writer w a = Writer { runWriter :: (w, a) }
newtype State r w a = State { runState :: r -> (w, a) }
```

*Remark* (Transformers). In Haskell they are implemented in the Transformers library (transformers) and in the Monad Transformers library (mtl), where the latter builds atop the former.

*Remark* (Records notation). Let’s dissect the definition of `Reader`

here.

`newtype Reader r a`

:`newtype`

: this is akin to`data`

, albeit admits only a single value constructor, a transparent type wrapper, defines a syntactically (but not semantically) distinct type (in contrast to`type`

synonyms, which are syntactically the same types).`Reader r a`

: type constructor`Reader`

takes two arbitrary types`r`

and`a`

, its kind is `* -> * -> *`.

`=`

reads like “*… is defined as …*” or “*… can be constructed as …*”.`Reader { runReader :: r -> a }`

: this describes how a value of type`Reader r a`

can be constructed: just provide a function`r -> a`

, for any type`r`

and any type`a`

.`Reader`

: this is the name of the value constructor which we can use for pattern matching, where the action of pattern matching would have the type signature`Reader r a -> (r -> a)`

, whereas the type signature of the value constructor is`Reader :: (r -> a) -> Reader r a`

. Don’t get confused by the same name; this is commonplace in Haskell; the type constructor`Reader`

and the value constructor`Reader`

live on different levels of syntax.`{ runReader :: r -> a }`

: this is a “record”, which essentially is just the same as`Reader (r -> a)`

with an implicit accessor`runReader :: Reader r a -> (r -> a)`

.

So we have three actions here:

`Reader :: (r -> a) -> Reader r a -- Value constructor. runReader :: Reader r a -> (r -> a) -- Accessor, or deconstructor. patternMatch :: Reader r a -> (r -> a) -- Pattern matching action (destructuring).`

The “destructuring” is not like in C++ or Java, though. Alright. But since we defined

`Reader`

,`Writer`

, and`State`

in terms of what we already have`Functor`

instances for, do we need to define (and verify) the respective instances for these three type constructors? Well, we do. But only because we used`newtype`

expressions. Had we used`type`

expressions, we wouldn’t need to, albeit we wouldn’t be able to use the record syntax, not that we need it:`type Reader' r a = r -> a`

and then`runReader' :: Reader' r a -> (r -> a); runReader' = id`

, but this would be pretty much useless. As I said earlier,`newtype`

s are there so we can implement different (but correct) instances for the same type; they are nothing but transparent wrappers about a type (sum, product, or any other type that is admissible); syntactically they appear as distinct types, even though under the hood they are the same, and once the program is compiled, there is no distinction anymore, at runtime they are the same. So let’s implement those instances. But we don’t have to reinvent the wheel! We can simply refer to the`run...`

accessors now! Look:

```
instance Functor (Reader r) where -- f ~ Reader r ~ (r ->) :: * -> *
fmap :: (a -> b) -> f a -> f b
--fmap :: (a -> b) -> Reader r a -> Reader r b
--fmap :: (a -> b) -> (r -> a) -> (r -> b)
fmap g x = Reader -- ^ Reader :: (r -> b)-> Reader r b -- Repack.
$ fmap g -- ^ fmap g = (g.) :: f a ~ (r -> a) -> f b ~ (r -> b) -- Delegate.
$ runReader x -- ^ runReader :: Reader r a -> (r -> a) -- Unwrap.
```

The laws are certainly satisfied.

*Exercise* Check that blatant assertion.

As you see, we don’t even need to think much about how to connect the dots here because we simply can delegate this task to `fmap`

we have for the function embedded in `Reader`

, simply because it already is functorial. This is true of all `newtype`

declarations, whenever we just want a fancy synonym. But as I said, this is not the actual purpose of `newtype`

s, just a convenient side-effect.

On we go to Writer.

```
instance Functor (Writer w) where
fmap g x = Writer -- Writer :: (w, b) -> Writer w b -- Repack.
$ fmap g -- fmap g a = (w, g a) :: (w, a) -> (w, b) -- Delegate.
$ runReader x -- runReadr :: Writer w a -> (w, a) -- Unwrap.
```

*Exercise* Verify the `Functor`

laws.

And finally the State.

```
instance Functor (State r w) where -- State r w :: * -> *
fmap g x = State -- State :: (r -> (w, b)) -> State r w b
$ (fmap . fmap) g -- fmap . fmap g :: g (f a) -> g (f b)
$ runState x -- :: State r w a -> (r -> (w, a)) ~ Reader r (Writer w a)
```

*Exercise* Verify the `Functor`

laws.

Note how we used `fmap . fmap`

to delegate twice, level by level, where `g ~ (r->) ~ Reader r`

and `f ~ (w,) ~ Writer w`

. Can you recognize how our earlier tedious abstraction `fmap . fmap`

takes the burden and saves our precious time here? We had to do that only once, and can reuse this abstraction everywhere, as we are doing here! This is due to the remark concerning type equality in the fourth line above.

*Exercise* Work this out in more detail. Specifically, write out what `(fmap . fmap) g`

does here (*Hint*: simply write out the definitions explicitly) and see how everything type-checks.

*Exercise* Express `State`

in terms of `Reader`

and `Writer`

as above. What are the semantics of `Writer w (Reader r a)`

? Let `State' r w a = State' { runState' :: Reader r (Writer w a) }`

, and then write and verify `Functor`

instances for both.

If you’re wondering why the header calls these derivative types “monads,” this is just how they are most commonly referred to, if you want to look them up elsewhere, because they are instances of the `Monad`

type class. I’m going to explain this in more detail a next article. We will encounter these standard type constructors often enough. Come back to this post anytime you feel you forgot how we defined them.

And that’s it. Let’s wrap up. With all the wrapping and unwrapping we’ve done above, it sounds almost like a pun.

# 5 Key Take-Aways

Haskell the language as such is essentially about management of guarantees and advanced quality assurance, ensuring consistency and correctness of our code. It doesn’t come free: the initial mental burden is arguably significant. But once you pass a certain threshold, get used to thinking just enough formally to make sense of it, you’re good to go, and GHC will become a great virtual partner in this sort of pair programming.

Let’s conclude this post with a quick bullet-point summary.

- A
`Functor`

is a type class for type constructors`f :: * -> *`

that exhibit the behavior of preservation of structure of the value on every`fmap`

application. - We can compose functors with functors, i.e., nest them inside like a ladder. This becomes handy when we have embedded structures and want to
`fmap`

across all of them to reach the leaf that stores the value we actually want to transform — all while being guaranteed to preserve the functorial structure! - Important standard functors are the canonical sum type constructors
`Maybe`

and`Either`

, the list type constructor`[]`

, the function type constructor`(r->)`

for any`r`

, the canonical product type constructor`(w,)`

for any`w`

, and their derivatives`Reader r`

,`Writer w`

, and`State r w`

. - It is important to not only define an instance of a type class but also verify that the particular instance satisfies its laws. When we later deal with black-box (property-based) testing in Haskell, we’ll see how to automate this a bit, which is especially convenient for hard cases. This way you can untangle your CI/CD pipeline and simplify the entire lifecycle, including refactoring. One of the major factors is that type signatures provide an excellent expression of intent.

# 6 Bonus Material

The following two sections are a bit advanced and serve only as appetizers for related concepts to discover and improve your Haskell and functional skills.

## 6.1 Somewhat Esoteric Functor Instances

### 6.1.1 `Identity`

*Exercise* Work out the `Functor`

laws.

The point of `Identity`

is to fill the gaps and to validate other pieces of code. It is an important cornercase on the type level akin to the function `id`

on the term level.

### 6.1.2 `Const`

Another peculiar parametric type is `Const`

, which similar to the function `const :: a -> b -> a`

that ignores its second argument and is commonly used in the form `y = const x :: b -> a`

where `x :: a`

, to specify a **constant** (morphism), which for any argument gives the same output `x`

. It’s dual to `id`

in a way.

*Exercise* Work out the `Functor`

laws.

The essential peculiarity from the point of view of type theory is the opaque or **phantom type** variable `b`

present in the type constructor `Const`

on the left-hand side but absent from the value constructor `Const`

on the right-hand side. For each distinct type `b`

, the resulting type `Const a b`

is distinct, for every type `a`

. In other words, we can encode additional information in the particular type `Cosnt a T`

, where `T`

is a concrete type that the type variable `b`

assumes, without making it accessible to the construction–desctructuring by pattern matching.

The dual notion is that of **existential types**, which are given as

and where, dually, the type constructor `Existential`

doesn’t know anything about that type `b`

its witnesses (values) are constructed with by using the value (or witness) constructor `Existential`

, which by the way must be oblivious to that `b`

, as by this explicit `forall`

declaration we transfer the choice of the particular type `b`

to the caller; in other words value construction must work independently of the concrete type `b`

, i.e., for all types `b`

.

### 6.1.3 `IO`

Among Haskell’s types, the `IO`

type constructor is the most peculiar, especially in its implementation via the `RealWorld`

construct. More on this in another post. The most important thing to take away here is that `IO`

is a `Functor`

, which enable us to use `fmap`

on its values. Such an `IO`

value `x :: IO a`

, for a given type `a`

, describes a computation of type `a`

that involves input or output, albeit need not be executed. It is a formulation of the computation, not the computation as such. When an `IO a`

value is evaluated, its execution takes place.

For instance, `putStrLn :: String -> IO ()`

and `getLine :: IO String`

. So let `x = getLine`

, so we can apply `fmap ("Hi, " ++) :: Functor f => f String -> f String`

to `x`

and get another `IO String`

value. We will dive deeper into `IO`

when we deal with monads, as it’s the natural playground for `IO`

in Haskell. Stay tuned!

## 6.2 Natural Transformations

I said in the introduction above that functors can be considered as

- morphisms between categories with categories as objects, in the category \(\mathbf{Cat}\) of cateogories, or
- objects with natural transformations between them as morphisms, in the category of functors.

So far we’ve only covered the first perspective. Now it’s time for the second one.

Let \(F : C \to D\) and \(G : C \to D\) denote a functors between a pair of categories \(C\) and \(D\). A **natural transformation** \(\eta : F \to G\) between these functors is a family of morphisms, indexed by the objects of \(C\), such that

- \(\eta\) (consistently) associates the morphism \(\eta_a : Fa \to Ga\) in \(D\), for every object \(a\) in \(C\),
- coherent commutativity: \(\eta_b \circ Ff = Gf \circ \eta_a\), for every morphism \(f : a \to b\) in \(C\).

It captures the essence of mutating the structure `f -> g`

while preserving the content of the type `a`

.

So while we have the morphism `fmap :: (a -> b) -> (f a -> f b)`

in the category of categories, we want a morphism `nat :: (f -> g) -> (f a -> g a)`

in the category of functors, where `f`

and `g`

are both `Functor`

instances. Apparently, we have to level up. In GHC this model is implemented as the (now obsolete) `Rank2Types`

language extension, a particular case (and now just a synonym) for higher-kinded types handled by the `RankNTypes`

language extension. Both imply `ExplicitForall`

.

### 6.2.1 `RankNTypes`

Let’s specify two concrete types `S`

and `T`

for brevity. You can replace them with `forall a. a`

everywhere. Let’s then check what the rank of the function types is.

```
data S = S; data T = T
r1 :: forall b c. b -> c -> T
r2 :: (forall d. d -> d) -> forall b c. b -> c -> T
r3 :: ((forall d. d -> d) -> S) -> forall b c. b -> c -> T
u :: S -> forall d. d -> T
v :: S -> forall d. Num d => d -> T
```

By convention, the scope of binding of the universal quantifier \(\forall\) extends to the right until the end of the subexpression. This is the case with `forall`

too. Variables not bound are called **free**. So in

`forall a b. a -> b`

,`a`

and`b`

are bound by that`forall`

;`forall a. (forall b. b -> a) -> b`

, the latter`a`

was bound on the outside of the parentheses, whereas the latter`b`

is free because the scope opf of the innermost`forall`

expression is limited by the subexpression in the parentheses, and thereby invalidates the entire expression;`forall a (forall b. b -> c) -> a`

, the inner`c`

is free but the latter`a`

is bound.

By Skolemization, we can normalize the expression in line 4 to the equivalent expression in line 3, in other words, we can prefix all `forall a b c`

in fron of all the arrows instead of quantifying each type variable individually in a chain of arrows; in other words we can move `forall b`

to the left of `a -> b`

but we cannot move `forall a`

to the right of `a -> b`

since `forall`

binds to the right only.

The **rank** of a type is the level of nesting of such universally quantified subexpressions, plus one for the outermost level. Therefore:

- The function
`r1`

is of a rank-one type; in its type signature only the implicit universal quantification of`b`

and`c`

was made explicit, nothing fancy yet. - It is different with
`r2`

where we prefix a function argument in front of it, which can only be`id`

, as you may recall, which isn’t essential here, as the argument can also be of type`forall d. d -> S`

, say. This is indeed a rank-two type because a rank one argument to the right, which is precisely`r1`

’s type, follows another rank-one type argument. - With
`r3`

it is similar: it is a rank-three (function) type, because it has a rank-two type preceding the outermost rank-one type function arrow`forall b c. b -> c -> T`

. - The functions
`u`

and`v`

have a rank-one type but are not valid Haskell98, and are handled by the`RankNTypes`

extension, and their type signatures are reduced to the equivalent`u :: S ->d -> T`

and`v :: Num d => S -> d -> T`

.

Now let’s replace `T`

with something more complicated, e.g., the types below.

```
newtype Rank1Type a b = Rank1Type (a -> b) -- Our ordinary rank-one types.
newtype ExistType a = forall b. ExistType (a -> b) -- Rank-one existential type.
newtype Rank2Type a = Rank2Type (forall b. a -> b) -- Rank-two type.
newtype Rank2Type' a = Rank2Type' (a -> forall b. b) -- Syntactically equivalent.
newtype Rank2TypeEx a = forall c. Rank2TypeEx (forall b. a -> b -> c) -- Existential rank-2 type.
newtype Rank2TypeExP a p = forall c. Rank2TypeExP (forall b. a -> b -> c) -- Phantom existential rank-2 type.
newtype Rank3Type a = Rank3Type (forall b c. a -> b -> c) -- Rank-three type.
newtype Rank4Type a = Rank4Type (forall b c d. a -> b -> c -> d) -- Rank-four type.
```

Here, the value constructors are higher-rank functions, as they map `Rank2Type :: forall a. (forall b. a -> b) -> Rank2Type a`

. But they construct a `Rank2Type`

value from a rank-one function!

*Exercise* Count the arrows to make sure the names correctly reflect the semantics. What exactly is here rank-two? The value constructor or the resulting type?

For ordinary types, such as `Rank1Type`

in the first line, the universal quantification of `a`

and `b`

is implicit on the left-hand side in the type constructor. The step to existential quantification is completed when we universally quantify a type in the constructor that does not belong to the type.

And a change to a phantomization in line 6, where type `p`

is the corresponding phantom type, is completed by adding an extra type to the type constructor that does not belong to the respective value constructors.

### 6.2.2 Natural Transformation as an Existential Type?

So we said that a natural transformation is a family of morhisms. Look, if we write

we’ll get a family of value constructors `Eta`

, indexed by objects, which are the types `a`

here. So, for every type `a`

, `Eta a :: Eta`

. What is it? We’re obviously missing the functors. So let

whose value constructor has the type `forall f g a. (f a -> f a) -> Eta f g`

which, however, is *not equivalent* to `forall f g. (forall a. f a -> g a) -> Eta f g`

, which is a rank-two type.

We wanted `Eta`

to be oblivious to `a`

so it cannot touch it in the first place.

### 6.2.3 Natural Transformation as a Higher-Rank Function

Here, `f`

and `g`

both must have the kind `* -> *`

as follows from their use in the constructor, where `a`

is a rigid type variable, of kind `*`

. Here, the type of the value constructor `Eta :: (forall a. f a -> g a) -> Eta f g`

is of rank two. The accessor (or deconstructor) `runEta :: forall f g. Eta f g -> forall a. f a -> g a`

has rank one because by Skolemization we arrive at the equivalent expression `runEta :: forall f g a. Eta f g -> f a -> g a`

, without any nested `forall`

subexpressions, as also prior to Skolemization.

Now we can map any type constructor `f`

to any to constructor `g`

, expressed by `Nat f g`

. But that’s not what we originally wanted. The image \(\eta F\) of a functor \(F\) under a natural transformation \(\eta\) must be the functor \(G\), as otherwise the desired mapping would not be well-defined, and also satisfy both the defining properties of a naturl transformation.

So let’s first impose the `Functor`

constraint on `f`

since we know that the argument to our natural transformation must be a functor, and then implement `Eta f`

as a `Functor`

. But note first the `Eta :: (* -> *) -> (* -> *) -> *`

, but `Functor :: (* -> *) -> Constraint`

, i.e., expects a type constructor `* -> *`

but not `(* -> *) -> *`

. So simple type binding, i.e., `Eta f :: (* -> *) -> *`

, won’t work here! How can we tackle this?

```
instance Functor f => Functor (Eta f) where
fmap :: (g -> h) -> (Eta f g) -> (Eta f h)
fmap k x = ...
```

I hope this gives at least a rough picture of what I said initially and what is possible to model and verify with Haskell. There is so much more to it! But first of all we must ensure we have a working understanding of the fundamental notions of functional software architecture and are capable to recognize its benefits and deficiencies. Haskell the platform is not perfect but it is excellent, with a wonderful patient and selfless community!♥

The tilde character

`~`

denotes “type equality.”↩︎There are many ways to express verbally what

`f`

is. We use “**functorial type**” for “*constructor***instance of the type class**” (short: “`Functor`

**Functor instance**”), which it is since`f :: * -> *`

; one can also say that`f`

is a`Functor`

or a “functor,” which would arguably be a bit less precise. This disambiguation is not really important if your audience understands what exactly you are referring to. You can also refer to it as a**parametrically polymorphic**type`f`

. A**functorial value**is any value`x`

of type`f a`

, where`f a`

is a functorial type.↩︎In a category \(C\), a

**free object**is the image of a**free functor**\(F : \mathbf{Set} \to C\), which assigns any set \(a\) the object \(Fa\) in \(C\) that represents the minimal structure on the set that it must carry to be qualified to be an object in \(C\). For example, given a set \(a\), a free semigroup \(Fa\) of \(a\) is a semigroup structure superimposed on \(a\), which may require that some elements have to be discarded in order to satisfy the associativity of the semigroup operation as given by the functor \(F\), where also a choice between elements may have to be made. So this is not as trivial as it might seem on the surface.

The*dual notion*is that of a**forgetful functor**\(U : C \to \mathbf{Set}\), which strips all structure of \(C\) from an object \(a\) to produce a set \(Ua\), e.g., if \(a\) is a semigroup, then \(Ua\) is its carrier with exactly the same elements but without the semigroup operation.↩︎A computation is said to be

**nondeterministic**if given the same input, in produces different output, which doesn’t necessarily mean that it is a random variable, a deterministic list of values is enough because it leaves the choice of a single element open and so provides a branching decision tree. Admittedly, it’s a bit contrived. Moreover, the notion of nondeterminism in science is quite broad. In dynamical systems, nondeterminism can be modeled stochastically. Essentially nondeterminism means that multiple outcomes are possible.↩︎A

**bifunctor**is essentially a bivariate functor, functorial in each argument. There are also contravariant functors and combinations with contravariance, which we need not be concerned with now.↩︎This will become crucial for recursive data types, where the induction takes the form of

**structural induction**, which we actually also are doing here. Moreover, in Haskell, lists are “infinite” types. The complete induction is only valid for any finite natural number, e.g., when you want to prove some assertion like “*for all \(n \in \mathbb{N}\), …*” but not directly for an assertion like “*the cardinality of this set is an infinite cardinal number…*”. For \(\mathbb{N}\), depending on the axioms you choose, say Peano’s, you can invoke*tertium non datur*and derive the result from a contradiction, whereas in intuitionistic formulations it’s a bit harder. However, this is not my field of expertise. So for infinities, you’ll need the**transfinite induction**over the ordinals (the complete induction goes only over the naturals, and structural induction goes by the construction).↩︎I’m using apostrophe suffixes here to distinguish them from the corresponding

`newtype`

s in Prelude for`Monoid`

instances.↩︎There is an important difference between a

**type constructor**and a**data (or value) constructor**. In an expression like`data T a = C a`

,`T`

is a type constructor,`T a`

is a type, whereas`C`

is a data (or value) constructor, and`x :: T a`

is a value of type`T a`

that was constructed by the constructor`C`

, which we can use in pattern matching when we write`x@(C y)`

to extract the value`y :: a`

from the`C`

context. Recall that in Haskell, type variables are written lowercase, whereas concrete types are capital.↩︎Wait…

**composition of composition**? Yes! This works too! Look (no obscenities intended) the*reflexivity*is a bit hard to grasp, just follow the types:

`(.) :: (s -> t) -> ((a -> s) -> (a -> t)) (.) :: s ~ (b -> c) -> t ~ ((d -> b) -> (d -> c)) ((.) .) :: (a -> s) ~ (a -> b -> c) -> (a -> t) ~ a -> (d -> b) -> (d -> c) (.) :: a' ~ (b -> c) -> b' ~ (a -> b) -> c' ~ (a -> c) (.) . (.) :: a' ~ (b -> c) -> (d -> b') ~ (d -> a -> b) -> (d -> c') ~ (d -> a -> c)`

*Exercise*Retrace each step. Soon we will see how such seemingly complex expressions turn out to be very useful in certain circumstances, especially library code can benefit from it. You certaintly don’t need to use them always. Most people don’t understand them. To write excellent Haskell, you don’t need this! ↩︎But associativity of function composition is

*the*basic tenet of almost everything we do. Semigroups need this, and by extension categories.↩︎