This guide (which is also a Literate Haskell program) describes basic Hakyll metaphors in a way that I would have found useful when first learning Hakyll, using a working example site to illustrate the concepts. I was using Hakyll 188.8.131.52 and GHC 7.8.1 when writing this article.
Hakyll is the static web site generator used to create this site. You use it by defining the behaviour and structure of your site as a Haskell program that uses various facilities exposed by the Hakyll modules, a style familiar to users of Xmonad. A number of tutorials are available on the Hakyll website, but when I was trying to learn Hakyll (some years ago), I was sorely missing a guide aimed at experienced Haskell programmers, one that defined the basic abstractions and metaphors in terms of the data types actually exposed by Hakyll. After much trial and error, I eventually came to understand what was going on, and decided to document it as the kind of tutorial that I would have found useful. I still recommend skimming the other tutorials, as I will probably skip things that I don’t find very interesting. In fact, this should probably be considered an “advanced tutorial” (that certainly sounds much better than “stream-of-consciousness snapshot”).
Hakyll is used by compiling and running a program in the directory containing the input files, which then generates the site as a set of output files. Hence, we define a
Main module that exports a
main function and imports all the modules we’ll need. The
OverloadedStrings language extension, as a small convenience, lets us write
"*.md" instead of
Fundamentally, Hakyll is extremely simple: it is an association between Identifiers and Compilers, and everything else is just scaffolding around that. An identifier is just that: a name consisting of an optional “group” and a mandatory “path”, which need not be a file path, but often is. A compiler is an action in the monad
Compiler. When the site is generated, the compiler for each identifier is executed. A compiler can in principle do anything to generate its output, but one interesting thing it can do is ask for the compilation result some other (known) identifier. Hakyll will automatically run the compilers in the proper order such that these requirements are satisfied, although cycles are forbidden. Since you cannot in general know the type of values generated by the compiler of some arbitrary identifier, you can get a type error at runtime if you ask for the compilation result as a different type than the compiler produces (there are some typeclass constraints that I’ll get into later).
The output from a compiler is not necessarily put anywhere unless another compiler asks for it, but it is possible to define a “route” for a compiler, which causes Hakyll to write the result to a file in the site directory. The route is not part of the compiler itself, but defined as an association of the identifier. Routes are the way we can map, for example, a Markdown file to the HTML file it should become in the generated site.
When working with Hakyll in practise, you will define compilers though a monadic DSL incarnated in the Rules monad. This monad makes it easy to apply similar compilers to all files matching some given pattern. For example, let us define a rule that copies the
style.css file verbatim into our site.
return () is to make the type
Rules, rather than
RulesM (Pattern CopyFile) –
Rules is just
RulesM (). The
match function runs the given rules for each file in its match, although the above will only match exactly one file, namely
style.css in the same directory as our Hakyll source file. In general terms, the
compile action associates the given compiler with each of the identifiers matching the current pattern. The set of possible identifiers (which are then filtered by the pattern) is taken from the files (not directories) in the directory from which Hakyll is run. The compiler given to
compile must take a
Resource as its input, which is really just a wrapper around an identifier guaranteed to refer to a file.
Let’s do something more complicated: we want our site to contain a bunch of pages, but all of them should contain a complete list (a menu) of all pages on the site. This sounds like a problem: in order to generate one page, we must already have seen every other page, which violates the requirement that compiler dependencies must be acyclic. The solution is to use two logical passes: first run a set of compilers that somehow generate a list of all pages, then actually generate the content for each page while including that information. This is possible since the menu does not need to know the content of each page, but only its eventual location on the site, which in Hakyll terms is the route for each identifier corresponding to an input file. So far, so good, but how do we easily generate such lists? To begin with, we define a pattern that matches all the content pages on our site.
This pattern matches all
.md files, including those in subdirectories to any depth. Our site may have other files - images and other resources - but we do not want these to show up in the menu anyway.
When we try to define rules for storing menu information, we may run into the problem that the compiler we pass to
compile is really quite restricted in its output type: it has to implement various type classes permitting serialisation, as it could in principle be asked to write its result to a file. In this case we’re in luck, as plain
Strings implement the required instances. A second problem is that we conceptually wish to associate two compilers with each input file - one that generates its data for the menu, and one that actually renders the page to HTML. The solution is to tag the identifiers related to the menu compilers with a version (here,
"menu"), which makes them different from the identifiers used for the actual content, which have no version.
For every file that matches the
content pattern, we create an identifier corresponding of the file path and the version
"menu". This identifier we associate with the compiler
destination. It is important to note that
match filters the input, whereas
version modifies the identifiers of the output.
destination compiler does not use its underlying file for anything, but instead obtains the identifier being compiled, setting the version of that identifier to
Nothing, getting the route for the resulting identifier, and if that identifier doesn’t exist (the route is
Nothing), returning the empty string. This is a a bit of a hack, but we don’t really expect getRouteFor to ever return
Nothing, as that would mean we have been asked to add a menu entry for a file that will not exist on the site. Since the only difference between the identifiers used for generating the menu and the content is that the latter have version
Nothing, this will compute the output path of the compiler responsible for generating the content for the respective file. The reason this works is because you do not need to run the compiler in order to determine where it will put its output - that is defined in the
Rules DSL, and hence available simply by querying the identifier.
The rule for defining our content pages is quite simple. We replace the existing extension (
md according to the
content pattern) with
html, then pass the page through a three-step compiler that first converts the page from Markdown to HTML, then applies an HTML template, then finally converts absolute URLs into relative URLs so the resulting files can be put anywhere (don’t worry about this last stage, it’s not important). To understand how
loadAndApplyTemplate works, we first have to understand Hakyll templates and contexts. Templates are simply files in which variables can be written as
$var$. When applying the template, each such instance is replaced with the value of the corresponding variable in the passed context. Our
template.html file will contain the text
$menu$ where we intend our menu to show up, and hence we need to create a context that contains a
menu field containing an HTML-rendering of the menu for that page.
Apart from the
menu field, the template also needs some standard fields like
"title". These are provided by
getMenu compiler is the one that actually produces the menu. The real trick here is the use of
loadAll, which lets us obtain a list of all compiler outputs for identifiers in the
"menu" group. That means a list of all routes for our content pages!
showMenu itself is just a plain Haskell function that produces an HTML list with the current page highlighted. In practice, we’d use a proper HTML combinator library, but let’s stick with strings for simplicity.
In order to use a template, we have to tell Hakyll it exists. That’s what the
templateCompiler is about. This is not a very interesting definition.
Finally, the Hakyll main function is written in a rather stylised manner, with
hakyll being given a
Rules () monadic action. Note that although I execute
compileContent for conceptual reasons, I could swap them around and it would still work.
That’s all there is to it. To try it out, download some input files, put this hakyll.lhs into the directory, then run
$ ghc --make hakyll.lhs && ./hakyll build && ./hakyll preview 8080
and point your web browser at
localhost:8080. The result should be something similar to this. If you change the code, remember to run
./hakyll clean as well, as Hakyll’s cache system might otherwise not realise that something is different.
You may also want to look at sigkill.lhs, the program generating my own website.