List Homomorphisms and Parallelism
A list homomorphism is a function h
on lists for which there
exists an
associative
binary operator ⊙
such that
h (x ++ y) = h x ⊙ h y
for any x
and y
, where ++
denotes list concatenation. Or to put
it another way, a function h
is a list homomorphism if we can split
the input list any way we wish, apply h
to the parts independently,
and combine the results using some operator ⊙
.
Simple examples of list homomorphisms:
- The identity function, with
⊙
being++
. - Summation, with
⊙
being+
.
Operationally, when computing a list homomorphism we can split the input into any number of chunks, compute a result per chunk, and then combine the results into a final result for the whole list. Each chunk can be processed independently of the others, in parallel. For the purpose of this text, we can consider “list” to mean “array”, which is perhaps more practical. We will not depend on our “lists” having the behaviour of linked lists, and using linked lists would actually inhibit parallelisation.
In principle h
need not be defined for empty inputs, but we’ll
assume that it is, such that
h [] = e
where e
is necessarily an identity
element for ⊙
.
This is strictly not required for a list homomorphism, but it makes
the following discussion simpler.
Example of a nontrivial homomorphism
The maximum sum subarray
problem (also
known as maximum segment sum) is about finding the largest sum of a
subarray A[i:j]
of some array A
. This is not a list
homomorphism - knowing mssp x
and mssp y
is not enough to compute
mssp (x++y)
. Example:
mssp [0, 3,-2] = 3
mssp [4,-1, 0] = 4
mssp ([0,3,-2] ++ [4,-1,0]) = 5
But if we extend the domain a bit, we can indeed obtain a homomorphism. This is called a near homomorphism: it computes the result we care about, plus some ancillary information used to combine partial results. In this case, we will compute a tuple with four integer elements:
The maximum subarray sum (i.e., the final result we are actually interested in).
The maximum subarray sum starting from the first element.
The maximum subarray sum ending at the last element.
The sum of the entire array.
Note that the first three must be non-negative, as a subarray can always be empty.
Now define a function f
that morally computes such a tuple for
single element subarrays:
f x = (max x 0, max x 0, max x 0, x)
Then we define an associative operator for combining our tuples:
(mssx, misx, mcsx, tsx) ⊙ (mssy, misy, mcsy, tsy) =
(max mssx (max mssy (mcsx + misy)),
max misx (tsx+misy),
max mcsy (mcsx+tsy),
tsx + tsy)
(Proof of associativity left for the reader.) This operator has an identity element:
e = (0, 0, 0, 0)
Now we can define a list homomorphism for solving the MSSP:
h [] = e
h [x] = f x
h (x ++ y) = h x ⊙ h y
In a real parallel language, we would probably write this as
reduce ⊙ e (map f A)
Why is this the same? Keep reading!
The list homomorphism theorems
The first two list homomorphism theorems were published by Richard S. Bird in 1987, and the third by Gibbons in 1995 (although he notes it had appeared as a “folk theorem” before then). Especially the Gibbons paper (link below) is recommended reading for a precise exposition that clarifies some things I’m leaving fuzzy here.
The first homomorphism theorem
If h
is a list homomorphism, then there is an operator ⊙
and
function f
such that
h xs = reduce ⊙ e (map f xs)
This theorem means that we can represent a list homomorphism as a
function f : a -> b
, an associative binary operator ⊙ : b -> b -> b
, and its identity element e
. In many cases f
is merely the
identity function, which gives us the reduce
commonly found in
parallel programming systems.
The second homomorphism theorem
If (f,⊙,e)
represents a list homormorphism, then
reduce ⊙ e (map f xs) = foldl ⊕ e xs = foldr ⊗ e xs
where
a ⊕ b = f a ⊙ b
and
a ⊗ b = a ⊙ f b
This means that any list homomorphism can be computed with either a
left or a right
fold
using a specialised function derived from ⊙
and f
. This is
essentially a form of loop
fusion, as it
allows us to avoid manifesting the result of the map
. In a parallel
implementation of reduction, we might break the input into a chunk per
processor, then use the second list homomorphism theorem to compute an
optimised sequential fold for each chunk.
The third homomorphism theorem
If h
can be expressed with both a leftwards and rightwards fold,
then h
is also a list homomorphism. This implies that if we can
write a function as both a leftwards and a rightwards fold, then we
can write that function as a parallel reduction. This is possible
whenever we can find a function g
such that
h ∘ g ∘ h = h
That is, g
is similar to (but not exactly) an inverse of the
homomorphism h
. Gibbons’ proof shows that such a g
always exists.
Unfortunately, Gibbons’ proof of the theorem does not tell us exactly
how to construct the (f
, ⊙
, e
), or g
for that matter. We know
it must exist, but not what it looks like. It also does not promise
that the homomorphism is going to be as asymptotically efficient as
any of the original folds. In particular, it would be nice if we
could take a fold implementing Kadane’s
algorithm
and mechanically derive the solution to MSSP shown above. Still, this
theorem can inspire us to look for a parallel implementation.
Indeed, the paper Automatic Inversion Generates Divide-and-Conquer
Parallel Programs attacks the problem of obtaining g
through a
program synthesis technique based on the user providing the leftwards
and rightwards definitions of h
. The technique is at least effective
enough to handle mssp
. but there is still no guarantee that a
correct g
can be found. It is however guaranteed that if it is
found, it is going to be efficient. Another synthesis-based approach
is discussed in Decomposition-based Synthesis for Applying
Divide-and-Conquer-like Algorithmic Paradigms, which is demonstrated
to work well for many (small) problems.
References
An Introduction to the Theory of Lists (pdf), by Richard S. Bird (1987)
The Third Homomorphism Theorem (pdf), by Jeremy Gibbons (1995)
Parallel Programming, List Homomorphisms and the Maximum Segment Sum Problem (pdf), by Murray Cole (1993)
Automatic Inversion Generates Divide-and-Conquer Parallel Programs (pdf), by Kazutaka Morita, Akimasa Morihata, Kiminori Matsuzaki, Zhenjiang Hu, and Masato Takeichi (2007)
Decomposition-based Synthesis for Applying Divide-and-Conquer-like Algorithmic Paradigms (pdf), by Ruyi Ji, Yuwei Zhao, Yingfei Xiong, Di Wang, Lu Zhang, and Zhenjiang Hu.