# List Homomorphisms and Parallelism

A *list homomorphism* is a function `h`

on lists for which there
exists an
associative
binary operator `⊙`

such that

`h (x ++ y) = h x ⊙ h y`

for any `x`

and `y`

, where `++`

denotes list concatenation. Or to put
it another way, a function `h`

is a list homomorphism if we can split
the input list any way we wish, apply `h`

to the parts independently,
and combine the results using some operator `⊙`

.

Simple examples of list homomorphisms:

- The identity function, with
`⊙`

being`++`

. - Summation, with
`⊙`

being`+`

.

Operationally, when computing a list homomorphism we can split the
input into any number of *chunks*, compute a result per chunk, and
then combine the results into a final result for the whole list. Each
chunk can be processed independently of the others, in parallel. For
the purpose of this text, we can consider “list” to mean “array”,
which is perhaps more practical. We will not depend on our “lists”
having the behaviour of linked lists, and using linked lists would
actually inhibit parallelisation.

In principle `h`

need not be defined for empty inputs, but we’ll
assume that it is, such that

`h [] = e`

where `e`

is necessarily an identity
element for `⊙`

.
This is strictly not required for a list homomorphism, but it makes
the following discussion simpler.

## Example of a nontrivial homomorphism

The maximum sum subarray
problem (also
known as *maximum segment sum*) is about finding the largest sum of a
subarray `A[i:j]`

of some array `A`

. This is not a list
homomorphism - knowing `mssp x`

and `mssp y`

is not enough to compute
`mssp (x++y)`

. Example:

```
mssp [0, 3,-2] = 3
mssp [4,-1, 0] = 4
mssp ([0,3,-2] ++ [4,-1,0]) = 5
```

But if we extend the domain a bit, we can indeed obtain a
homomorphism. This is called a *near homomorphism*: it computes the
result we care about, plus some ancillary information used to combine
partial results. In this case, we will compute a tuple with four
integer elements:

The maximum subarray sum (i.e., the final result we are actually interested in).

The maximum subarray sum starting from the

*first*element.The maximum subarray sum ending at the

*last*element.The sum of the entire array.

Note that the first three must be non-negative, as a subarray can always be empty.

Now define a function `f`

that morally computes such a tuple for
single element subarrays:

`f x = (max x 0, max x 0, max x 0, x)`

Then we define an associative operator for combining our tuples:

```
(mssx, misx, mcsx, tsx) ⊙ (mssy, misy, mcsy, tsy) =
(max mssx (max mssy (mcsx + misy)),
max misx (tsx+misy),
max mcsy (mcsx+tsy),
tsx + tsy)
```

(Proof of associativity left for the reader.) This operator has an identity element:

`e = (0, 0, 0, 0)`

Now we can define a list homomorphism for solving the MSSP:

```
h [] = e
h [x] = f x
h (x ++ y) = h x ⊙ h y
```

In a real parallel language, we would probably write this as

`reduce ⊙ e (map f A)`

Why is this the same? Keep reading!

## The list homomorphism theorems

The first two *list homomorphism theorems* were published by Richard
S. Bird in 1987, and the third by Gibbons in 1995 (although he notes
it had appeared as a “folk theorem” before then). Especially the
Gibbons paper (link below) is recommended reading for a precise
exposition that clarifies some things I’m leaving fuzzy here.

### The first homomorphism theorem

If `h`

is a list homomorphism, then there is an operator `⊙`

and
function `f`

such that

`h xs = reduce ⊙ e (map f xs)`

This theorem means that we can represent a list homomorphism as a
function `f : a -> b`

, an associative binary operator `⊙ : b -> b -> b`

, and its identity element `e`

. In many cases `f`

is merely the
identity function, which gives us the `reduce`

commonly found in
parallel programming systems.

### The second homomorphism theorem

If `(f,⊙,e)`

represents a list homormorphism, then

`reduce ⊙ e (map f xs) = foldl ⊕ e xs = foldr ⊗ e xs`

where

`a ⊕ b = f a ⊙ b`

and

`a ⊗ b = a ⊙ f b`

This means that any list homomorphism can be computed with either a
left or a right
fold
using a *specialised function* derived from `⊙`

and `f`

. This is
essentially a form of loop
fusion, as it
allows us to avoid manifesting the result of the `map`

. In a parallel
implementation of reduction, we might break the input into a chunk per
processor, then use the second list homomorphism theorem to compute an
optimised sequential fold for each chunk.

### The third homomorphism theorem

If `h`

can be expressed with both a leftwards and rightwards fold,
then `h`

is also a list homomorphism. This implies that if we can
write a function as both a leftwards and a rightwards fold, *then we
can write that function as a parallel reduction*. This is possible
whenever we can find a function `g`

such that

`h ∘ g ∘ h = h`

That is, `g`

is similar to (but not exactly) an inverse of the
homomorphism `h`

. Gibbons’ proof shows that such a `g`

always exists.

Unfortunately, Gibbons’ proof of the theorem does not tell us exactly
how to construct the (`f`

, `⊙`

, `e`

), or `g`

for that matter. We know
it must exist, but not what it looks like. It also does not promise
that the homomorphism is going to be as asymptotically efficient as
any of the original folds. In particular, it would be nice if we
could take a fold implementing Kadane’s
algorithm
and mechanically derive the solution to MSSP shown above. Still, this
theorem can inspire us to look for a parallel implementation.

Indeed, the paper *Automatic Inversion Generates Divide-and-Conquer
Parallel Programs* attacks the problem of obtaining `g`

through a
program synthesis technique based on the user providing the leftwards
and rightwards definitions of `h`

. The technique is at least effective
enough to handle `mssp`

. but there is still no guarantee that a
correct `g`

can be found. It is however guaranteed that *if* it is
found, it is going to be efficient. Another synthesis-based approach
is discussed in *Decomposition-based Synthesis for Applying
Divide-and-Conquer-like Algorithmic Paradigms*, which is demonstrated
to work well for many (small) problems.

## References

*An Introduction to the Theory of Lists*(pdf), by Richard S. Bird (1987)*The Third Homomorphism Theorem*(pdf), by Jeremy Gibbons (1995)*Parallel Programming, List Homomorphisms and the Maximum Segment Sum Problem*(pdf), by Murray Cole (1993)*Automatic Inversion Generates Divide-and-Conquer Parallel Programs*(pdf), by Kazutaka Morita, Akimasa Morihata, Kiminori Matsuzaki, Zhenjiang Hu, and Masato Takeichi (2007)*Decomposition-based Synthesis for Applying Divide-and-Conquer-like Algorithmic Paradigms*(pdf), by Ruyi Ji, Yuwei Zhao, Yingfei Xiong, Di Wang, Lu Zhang, and Zhenjiang Hu.