Opinionated Hakyll Tutorial

This guide (which is also a Literate Haskell program) describes basic Hakyll metaphors in a way that I would have found useful when first learning Hakyll, using a working example site to illustrate the concepts. The mechanism described in this article are not necessarily the literal implementation Hakyll, but are rather a glimpse of the metaphors that I have found useful in practise when working with it. Also, it should probably be noted that I was using Hakyll 4.6.8.1 and GHC 7.8.1 when writing this article.

Introduction

Hakyll is the static web site generator used to create this site. You use it by defining the behaviour and structure of your site as a Haskell program that uses various facilities exposed by the Hakyll modules, a style familiar to users of Xmonad. A number of tutorials are available on the Hakyll website, but when I was trying to learn Hakyll (some years ago), I was sorely missing a guide aimed at experienced Haskell programmers, one that defined the basic abstractions and metaphors in terms of the data types actually exposed by Hakyll. After much trial and error, I eventually came to understand what was going on, and decided to document it as the kind of tutorial that I would have found useful. I still recommend skimming the other tutorials, as I will probably skip things that I don’t find very interesting. In fact, this should probably be considered an “advanced tutorial” (that certainly sounds much better than “stream-of-consciousness snapshot”).

Modules

Hakyll is used by compiling and running a program in the directory containing the input files, which then generates the site as a set of output files. Hence, we define a Main module that exports a main function and imports all the modules we’ll need. The OverloadedStrings language extension is a small convenience lets us write "*.md" instead of parseGlob "*.md.

> {-# LANGUAGE OverloadedStrings #-}
> module Main(main) where
> import Hakyll
> import Data.Maybe

Basic Operation

Fundamentally, Hakyll is extremely simple: it is an association between Identifiers and Compilers, and everything else is just scaffolding around that. An identifier is just that: a name consisting of an optional “group” and a mandatory “path”, which need not be a file path, but often is. A compiler is an action in the monad Compiler. When the site is generated, the compiler for each identifier is executed. A compiler can in principle do anything to generate its output, but one interesting thing it can do is ask for the compilation result some other (known) identifier. Hakyll will automatically run the compilers in the proper order such that these requirements are satisfied, although cycles are forbidden. Since you cannot in general know the type of values generated by the compiler of some arbitrary identifier, you can get type error at runtime, if you ask for the compilation result as a different type than the compiler produces (there are some typeclass constraints that I’ll get into later).

The output from a compiler is not necessarily put anywhere unless another compiler asks for it, but it is possible to define a “route” for a compiler, which causes Hakyll to write the result to a file in the site directory. The route is not part of the compiler itself, but defined as an association of the identifier. Routes are the way we can map, for example, a Markdown file to the HTML file it should become in the generated site.

Rules

When working with Hakyll in practise, you will define compilers though a monadic DSL incarnated in the [Rules][4] monad. This monad makes it easy to apply similar compilers to all files matching some given pattern. For example, let us define a rule that copies the style.css file verbatim into our site.

> compileCss :: Rules ()
> compileCss = match "style.css" $ do
>   route   idRoute
>   _ <- compile copyFileCompiler
>   return ()

The return () is to make the type Rules, rather than RulesM (Pattern CopyFile)Rules is just RulesM (). The match function runs the given rules for each file in its match, although the above will only match exactly one file, namely style.css in the same directory as our Hakyll source file. In general terms, the compile action associates the given compiler with each of the identifiers matching the current pattern. The set of possible identifiers (which are then filtered by the pattern) is taken from the files (not directories) in the directory from which Hakyll is run. The compiler given to compile must take a Resource as its input, which is really just a wrapper around an identifier guaranteed to refer to a file.

[4]: 3: https://hackage.haskell.org/package/hakyll-4.6.9.0/docs/Hakyll-Core-Rules.html

A Menu

Let’s do something more complicated: we want our site to contain a bunch of pages, but all of them should contain a complete list (a menu) of all pages on the site. This sounds like a problem: in order to generate one page, we must already have seen every other page, which violates the requirement that compiler dependencies must be acyclic. The solution is to use two logical passes: first run a set of compilers that somehow generate a list of all pages, then actually generate the content for each page while including that information. This is possible since the menu does not need to know the content of each page, but only its eventual location on the site, which in Hakyll terms is the route for each identifier corresponding to an input file. So far, so good, but how do we easily generate such lists? To begin with, we define a pattern that matches all the content pages on our site.

> content :: Pattern
> content = "**.md"

This pattern matches all .md files, including those in subdirectories to any depth. Our site may have other files - images and other resources - but we do not want these to show up in the menu anyway.

When we try to define rules for storing menu information, we may run into the problem that the compiler we pass to compile is really quite restricted in its output type: it has to implement various type classes permitting serialisation, as it could in principle be asked to write its result to a file. In this case we’re in luck, as plain Strings implement the required instances. A second problem is that we conceptually wish to associate two compilers with each input file - one that generates its data for the menu, and one that actually renders the page to HTML. The solution is to tag the identifiers related to the menu compilers with a version (here, "menu"), which makes them different from the identifiers used for the actual content, which have no version.

> compileMenu :: Rules ()
> compileMenu = match content $ version "menu" $ compile destination

For every file that matches the content pattern, we create an identifier corresponding of the file path and the version "menu". This identifier we associate with the compiler destination. It is important to note that match filters the input, whereas version modifies the identifiers of the output.

> destination :: Compiler (Item String)
> destination = setVersion Nothing <$> getUnderlying
>               >>= getRoute
>               >>= makeItem . fromMaybe ""

The destination compiler does not use its underlying file for anything, but instead obtains the identifier being compiled, setting the version of that identifier to Nothing, getting the route for the resulting identifier, and if that identifier doesn’t exist (the route is Nothing), returning the empty string. This is a a bit of a hack, but we don’t really expect getRouteFor to ever return Nothing, as that would mean we have been asked to add a menu entry for a file that will not exist on the site. Since the only difference between the identifiers used for generating the menu and the content is that the latter have version Nothing, this will compute the output path of the compiler responsible for generating the content for the respective file. The reason this works is because you do not need to run the compiler in order to determine where it will put its output - that is defined in the Rules DSL, and hence available simply by querying the identifier.

Using the Menu

The rule for defining our content pages is quite simple. We replace the existing extension (md according to the content pattern) with html, then pass the page through a three-step compiler that first converts the page from Markdown to HTML, then applies an HTML template, then finally converts absolute URLs into relative URLs so the resulting files can be put anywhere (don’t worry about this last stage, it’s not important). To understand how loadAndApplyTEmpalte works, we first have to understand Hakyll templates and contexts. Templates are simply files in which variables can be written as $var$. When applying the template, each such instance is replaced with the value of the corresponding variable in the passed context. Our template.html file will contain the text $menu$ where we intend our menu to show up, and hence we need to create a context that contains a menu field containing an HTML-rendering of the menu for that page.

> compileContent :: Rules ()
> compileContent = match content $ do
>   route $ setExtension "html"
>   compile $ do
>     menu <- contentContext
>     pandocCompiler
>       >>= loadAndApplyTemplate "template.html" menu
>       >>= relativizeUrls

Apart from the menu field, the template also needs some standard fields like "content" and "title". These are provided by defaultContext.

> contentContext :: Compiler (Context String)
> contentContext = do
>   menu <- getMenu
>   return $
>     defaultContext `mappend`
>     constField "menu" menu

The getMenu compiler is the one that actually produces the menu. The real trick here is the use of loadAll, which lets us obtain a list of all compiler outputs for identifiers in the "menu" group. That means a list of all routes for our content pages!

> getMenu :: Compiler String
> getMenu = do
>   menu <- map itemBody <$> loadAll (fromVersion $ Just "menu")
>   myRoute <- getRoute =<< getUnderlying
>   return $ case myRoute of
>              Nothing -> showMenu "" menu
>              Just me -> showMenu me menu

showMenu itself is just a plain Haskell function that produces an HTML list with the current page highlighted. In practice, we’d use a proper HTML combinator library, but let’s stick with strings for simplicity.

> showMenu :: FilePath -> [FilePath] -> String
> showMenu this items = "<ul>"++concatMap li items++"</ul>"
>   where li item = "<li><a href=\"/"++item++"\">"++name item++"</a></li>"
>         name item | item == this = "<strong>"++item++"</strong>"
>                   | otherwise    = item

In order to use a template, we have to tell Hakyll it exists. That’s what the templateCompiler is about. This is not a very interesting definition.

> compileTemplates :: Rules ()
> compileTemplates = match "template.html" $ compile templateCompiler

Finally, the Hakyll main function is written in a rather stylised manner, with hakyll being given a Rules () monadic action. Note that although I execute compileMenu before compileContent for conceptual reasons, I could swap them around and it would still work.

> main :: IO ()
> main = hakyll $ do
>          compileCss
>          compileMenu
>          compileContent
>          compileTemplates

That’s all there is to it. To try it out, download some input files, put this hakyll.lhs into the directory, then run

$ ghc --make hakyll.lhs && ./hakyll build && ./hakyll preview 8080

and point your web browser at localhost:8080. The result should be something similar to this. If you change the code, remember to run ./hakyll clean as well, as Hakyll’s cache system might otherwise not realise that something is different.

You may also want to look at sigkill.lhs, the program generating my own website.