The Sigkill.dk generator
This is the Hakyll program for generating sigkill.dk (see my Hakyll tutorial). Look at this Git repository for the data files as well. The most defining trait of the site is the tree menu at the top, which contains every content page on the site. Apart from that, I also do a lot of small hacks to generate various bits of the site. There is also a simple blog system, with one file per post.
{-# LANGUAGE OverloadedStrings #-}
module Main(main) where
import Control.Arrow (first, (&&&))
import Control.Monad
import Data.Char
import Data.List hiding (group)
import Data.Ord
import Data.Maybe
import Data.Monoid
import System.FilePath
import System.Directory (removeFile)
import Text.Blaze.Internal (preEscapedString)
import Text.Blaze.Html5 ((!))
import qualified Text.Blaze.Html5 as H
import qualified Text.Blaze.Html5.Attributes as A
import Text.Blaze.Html.Renderer.String (renderHtml)
import Text.Pandoc
import Text.Pandoc.Walk
import Hakyll.Web.Pandoc (renderPandocWith, renderPandocWithTransform)
import Hakyll
Hierarchical menu
We are going to define a data type and associated helper functions for generating a menu. Conceptually, the site is a directory tree, with a page being a leaf of the tree. The menu for a given page will illustrate the path taken from the root to the page, namely which intermediary directories were entered.
A level (or “line”, if you look at its actual visual appearance) of
the menu consists of two lists: the elements preceding and succeeding
the focused element. The focused element itself is the first
element of the aftItems
list. This definition ensures that we have
at most a single focused element per menu level. Each element is a
pair consisting of an URL and a name.
data MenuLevel = MenuLevel { prevItems :: [(FilePath,String)]
aftItems :: [(FilePath,String)]
,
}
allItems :: MenuLevel -> [(FilePath, String)]
= prevItems l ++ aftItems l allItems l
emptyMenuLevel :: MenuLevel
= MenuLevel [] [] emptyMenuLevel
First, let us define a function for inserting an element into a sorted list, returning the original list if the element is already there.
insertUniq :: Ord a => a -> [a] -> [a]
| x `elem` xs = xs
insertUniq x xs | otherwise = insert x xs
We can use this function to insert a non-focused element into a
MenuLevel
. We take care to put the new element in its proper sorted
position relative to the focused element, if any.
insertItem :: MenuLevel -> (FilePath, String) -> MenuLevel
= case aftItems l of
insertItem l v -> atPrev
[] :xs) | v < x -> atPrev
(x| otherwise -> l { aftItems = x:insertUniq v xs }
where atPrev = l { prevItems = insertUniq v (prevItems l) }
When inserting a focused element, we have to split the elements into those that go before and those that come after the focused element.
insertFocused :: MenuLevel -> (FilePath, String) -> MenuLevel
= MenuLevel bef (v:aft)
insertFocused l v where (bef, aft) = partition (<v) (delete v $ allItems l)
Finally, a menu is just a list of menu levels.
newtype Menu = Menu { menuLevels :: [MenuLevel] }
emptyMenu :: Menu
= Menu [] emptyMenu
I am using the BlazeHTML library for
HTML generation, so the result of rendering a menu is an H.Html
value. The rendering will consist of one HTML <ul>
block per menu
level, each with the CSS class menuN
, where N
is the number of the
level.
showMenu :: Menu -> H.Html
= zipWithM_ showMenuLevel [0..] . menuLevels showMenu
The focus element is tagged with the CSS class thisPage
.
showMenuLevel :: Int -> MenuLevel -> H.Html
=
showMenuLevel d m mapM_ H.li elems) ! A.class_ (H.toValue $ "menu" ++ show d)
H.ul (where showElem (p,k) = H.a (H.toHtml k) ! A.href (H.toValue p)
= showElem (p,k) ! A.class_ "thisPage"
showFocusElem (p,k) = map showElem (prevItems m) ++
elems case aftItems m of [] -> []
:ls) -> showFocusElem l :
(lmap showElem ls
Building the menu
Recall that the directory structure of the site is a tree. To construct a menu, we are given the current node (page) and a list of all possible nodes of the tree (all pages on the site), and we then construct the minimum tree that contains all nodes on the path from the root to the current node, as well as all siblings of those nodes. In file system terms, we show the files contained in each directory traversed from the root to the current page (as well as any children of the current page, if it is a directory).
To begin, we define a function that given the current path, decomposes some other path into the part that should be visible. For example:
relevant "foo/bar/baz" "foo/bar/quux" = ["foo/","bar/","quux"]
relevant "foo/bar/baz" "foo/bar/quux/" = ["foo/","bar/","quux/"]
relevant "foo/bar/baz" "foo/bar/quux/zog" = ["foo/","bar/","quux/"]
relevant "foo/bar/baz" "quux/zog" = ["quux/"]
relevant :: FilePath -> FilePath -> [FilePath]
= relevant' (splitPath this) (splitPath other)
relevant this other where relevant' (x:xs) (y:ys) = y : if x == y then relevant' xs ys else []
:_) = [y]
relevant' [] (y= [] relevant' _ _
To construct a full menu given the current path and a list of all paths, we repeatedly extend it by a single path. Recall that menu elements are pairs of names and paths - we generate those names by taking the file name and dropping the extension of the path, also dropping any trailing “index.html” from paths.
buildMenu :: FilePath -> [FilePath] -> Menu
= foldl (extendMenu this) emptyMenu
buildMenu this . map (first dropIndex . (id &&& dropExtension . takeFileName))
dropIndex :: FilePath -> FilePath
| takeBaseName p == "index" = dropFileName p
dropIndex p | otherwise = p
extendMenu :: FilePath -> Menu -> (FilePath, String) -> Menu
=
extendMenu this m (path, name) if path' `elem` ["./", "/", ""] then m else
Menu $ add (menuLevels m) (relevant this' path') "/"
where add ls [] _ = ls
:xs) p
add ls (x| x `elem` focused = insertFocused l (p++x,name') : add ls' xs (p++x)
| otherwise = insertItem l (p++x,name') : add ls' xs (p++x)
where (l,ls') = case ls of [] -> (emptyMenuLevel, [])
:ks -> (k,ks)
k= if hasTrailingPathSeparator x then x else name
name' = splitPath this'
focused = normalise path
path' = normalise this this'
For convenience, we define a Hakyll rule that adds anything currently
matched to the menu. To do this, we first need two convenience
functions. The first checks whether the identifier of the current
compilation is defined with some other version (recall that a version
is identified by a Maybe String
, not just a String
), and if so,
returns the route of that identifier.
routeWithVersion :: Maybe String -> Compiler (Maybe FilePath)
= getRoute =<< setVersion v <$> getUnderlying routeWithVersion v
The second extracts the route for the current identifier with no version. As a matter of convenience, we return an empty path if the identifier has no associated route. This should never occur in practice.
normalRoute :: Compiler FilePath
= fromMaybe "" <$> routeWithVersion Nothing normalRoute
The "menu"
version will contain an identifier for every page that
should show up in the site menu, with the compiler for each identifier
generating a pathname.
addToMenu :: Rules ()
= version "menu" $ compile $ makeItem =<< normalRoute addToMenu
To generate the menu for a given page, we use loadAll
to obtain a
list of everything with the version “menu” (the pathnames) and use it
to build the menu, which is immediately rendered to HTML. If a
compiler has been defined for these identifiers that creates anything
but a FilePath
, Hakyll will signal a run-time type error.
getMenu :: Compiler String
= do
getMenu <- map itemBody <$> loadAll (fromVersion $ Just "menu")
menu <- getRoute =<< getUnderlying
myRoute return $ renderHtml $ showMenu $ case myRoute of
Nothing -> buildMenu "" menu
Just me -> buildMenu me menu
Extracting descriptive texts from small programs.
I have a number of small programs and scripts of my site, and I want to automatically generate a list and description for each of them. Each program starts with a descriptive comment containing Markdown markup, so the challenge becomes extracting that comment. I define functions for extracting the leading comment from shell, C and Haskell, respectively.
For shell scripts, we take all leading lines that have a comment
character in the first column, excepting the hashbang (#!
). This
also works for many other languages.
shDocstring :: String -> String
= unlines
shDocstring . map (drop 2)
. takeWhile ("#" `isPrefixOf`)
. dropWhile (all (`elem` ['#', ' ']))
. dropWhile ("#!" `isPrefixOf`)
. lines
For C, we extract the first multi-line comment. At this point we should probably have used a regular expression library.
cDocstring :: String -> String
= unlines
cDocstring . map (dropWhile (==' ')
. dropWhile (=='*')
. dropWhile (==' '))
. maybe [] lines
. (return . reverse . cut . reverse
<=< find ("*/" `isSuffixOf`) . inits
<=< return . cut
<=< find ("/*" `isPrefixOf`) . tails)
where cut s | "/*" `isPrefixOf` s = cut $ drop 2 s
| otherwise = dropWhile isSpace s
Haskell is processed much like shell script: We extract the leading line comments.
hsDocstring :: String -> String
= unlines
hsDocstring . map (drop 3)
. takeWhile ("--" `isPrefixOf`)
. dropWhile ("#!" `isPrefixOf`)
. lines
A hack compiler produces, from a script file, a String
containing
its name and docstring in HTML format, based on an HTML template. The
docstring is assumed to be in Markdown format, so we pass the entire
thing through a Markdown-to-HTML compiler. Unfortunately, it seems
that renderPandoc
inspects the pathname given Item
to figure out
the input format, so we force Markdown interpretation by pretending
the docstring is in a file of name ".md"
.
hackCompiler :: Compiler (Item String)
= do
hackCompiler <- getUnderlyingExtension
ext <- itemBody <$> getResourceString
src <- return $ case ext of
desc ".c" -> cDocstring src
".hs" -> hsDocstring src
-> shDocstring src
_ <- normalRoute
dest <- renderPandoc $ Item ".md" desc
desc' let name = takeFileName dest
= constField "name" name <>
ctx "script" dest <>
constField "description" (itemBody desc')
constField "templates/hack.html" ctx =<< getResourceString loadAndApplyTemplate
Adding the hacks to a page is now just loading everything with version
"hacks"
, and wrapping it in an HTML list.
addHacks :: Item String -> Compiler (Item String)
= do
addHacks item <- loadAll (fromVersion $ Just "hacks")
hacks return item { itemBody = itemBody item <> renderHtml (mapM_ hackHtml hacks) }
where hackHtml = preEscapedString . itemBody
Extracting update times
We extract a timestamp by invoking git log
on the given file,
passing a bunch of options that ultimately results in the ISO 8601
date (YY-MM-DD) for the most recent commit for the file.
getUpdateTime :: FilePath -> Compiler String
=
getUpdateTime f "git" ["log", "-1", "--format=%ad", "--date=format:%F", "--", f] "" unixFilter
Putting it all together
First, we define three convenience compilers. The first selects between different options based on the underlying identifier.
byPattern :: a -> [(Pattern, a)] -> Compiler a
= do
byPattern def options <- getUnderlying
ident return $ fromMaybe def (snd <$> find ((`matches` ident) . fst) options)
This can be used to create a compiler that performs initial creation.
createByPattern :: Compiler a -> [(Pattern, Compiler a)] -> Compiler a
= join $ byPattern def options createByPattern def options
It can also be used to create a compiler that modifies the result of another compiler.
modifyByPattern :: (b -> Compiler a) -> [(Pattern, b -> Compiler a)] -> b -> Compiler a
= join $ byPattern def options <*> pure x modifyByPattern def options x
When we instantiate the final page template, we will need to provide a
suitable context. Apart from the default fields, my website makes use
of two others: a "menu"
field containing the menu, and a "source"
field that contains a link to the raw (“source”) file for the current
page.
contentContext :: Compiler (Context String)
= do
contentContext <- getMenu
menu <- getResourceFilePath
source <- getUpdateTime source
updated return $
<>
defaultContext "menu" menu <>
constField "source" source <>
constField "updated" updated constField
Furthermore, blog entries need a "date"
field, which is extracted
from the file name of the post.
postContext :: Compiler (Context String)
= do
postContext <- contentContext
ctx return $ dateField "date" "%B %e, %Y" `mappend` ctx
postCtx :: Context String
= mconcat
postCtx "mtime" "%U"
[ modificationTimeField "date" "%B %e, %Y"
, dateField
, defaultContext ]
I extend the default Hakyll configuration with information on how to
use rsync
to copy the site contents to the server.
config :: Configuration
= defaultConfiguration
config = "rsync --chmod=Do+rx,Fo+r --checksum -ave 'ssh -p 22' \
{ deployCommand \_site/* --exclude pub athas@sigkill.dk:/var/www/htdocs/sigkill.dk"
}
Now we’re ready to describe the entire site.
main :: IO ()
= hakyllWith config $ do main
CSS files are compressed, data files, my public key, and the
robots.txt
are copied verbatim.
"css/*" $ do
match
route idRoute
compile compressCssCompiler"files/**" static
match "pubkey.asc" static
match "id_rsa.pub" static
match "robots.txt" static match
One of our primary objectives is the ability to write content for the
site without having to modify this generator program. Therefore, we
define content as a non-hidden file contained in any of the directories
me
, writings
, hacks
, programs
or projects
, as well as
any .md
file in the root directory. This property is checked by the
content
pattern.
We divide the content into two sets: content pages, which is all
content of types .md
, .lhs
, .fut
, and .man
, and content data, which
is the rest.
let inContentDir = "me/**" .||. "writings/**" .||. "hacks/**" .||.
"programs/**" .||. "projects/**" .||. "*.md"
= complement "**/.**" .&&. complement ".*/**" .&&. complement "**flycheck**"
nothidden = inContentDir .&&. nothidden
content = content .&&. fromRegex "\\.(md|lhs|fut|man)$"
contentPages = content .&&. complement contentPages contentData
Content data is copied verbatim, as it is expected to be images and similar non-processable data.
match contentData static
Content pages will end up as HTML files. This is conceptually a
simple process: they are added to list of pages contained in the menu,
processed by a content compiler, which is manCompiler
for
manpages, futCompiler
for Futhark programs, and contentCompiler
(which we will see later) for all other files, although everything
gets instantiated with the same template in the end. Some pages also
need special generated content: the hacks-page needs a list of hacks.
These are special-cased in an intermediate step.
$ do
match contentPages
addToMenu$ setExtension "html"
route $ do
compile <- contentContext
context "**.fut", futCompiler),
createByPattern contentCompiler [("**.man", manCompiler)]
(>>= modifyByPattern return [("hacks/index.md", addHacks)]
>>= loadAndApplyTemplate "templates/default.html" context
>>= relativizeUrls
The group "source"
contains the source files of all literate
programs. This is practical, as the processed literate program will
be an HTML file, and thus probably no longer compilable by the
literate system. While the user could in some cases copy the text
from the browser into a source file (this is possible for literate
Haskell), it is more convenient to have the original source file
available. Source files are, of course, copied verbatim.
$ version "source" static match contentPages
The group "hacks"
contains small program descriptions generated by
hackCompiler
. It is included by addHacks
.
"hacks/scripts/*" $ version "hacks" $ compile hackCompiler match
For the blog, there are two main tasks: first, we must create a page
for every post; second, we must create an overview page. A blog post
is simply a Markdown file in the "blog/"
subdirectory. Note that
this is not matched by contentPages
.
let blogArticles = "blog/*-*-*-*.md" .&&. hasNoVersion
Each post post gives rise to a corresponding HTML file, which uses the post template. The template expects a date field, whose value we extract from the file name. Importantly, individual blog articles are not added to the menu - there would quickly be far too many.
$ version "source" static
match blogArticles
$ do
match blogArticles $ setExtension "html"
route $ do
compile <- postContext
postCtx
contentCompiler>>= loadAndApplyTemplate "templates/post.html" postCtx
>>= saveSnapshot "content"
>>= loadAndApplyTemplate "templates/default.html" postCtx
>>= relativizeUrls
The posts list is created by a template. The blog posts are sorted by
date, with the date encoded into the filename of the blog entry. The
created page is "blog/index.html"
rather than "blog.html"
in order
for blog entries to be considered children of the blog entry in the
menu.
"blog/index.html"] $ do
create [
addToMenu
route idRoute$ do
compile <- contentContext
ctx <- recentFirst =<< loadAll blogArticles
posts let ctx' = constField "title" "Blog" <>
"posts" postCtx (return posts) <>
listField
ctx""
makeItem >>= loadAndApplyTemplate "templates/posts.html" ctx'
>>= loadAndApplyTemplate "templates/default.html" ctx'
>>= relativizeUrls
As the final touch on the blog, we produce an Atom feed. I have no particular reason for choosing Atom over the alternatives, except that it seems slightly more modern. Hakyll seems to support most formats, so I can add more if I feel like it.
"atom.xml"] $ do
create [
route idRoute$ do
compile let feedCtx = postCtx `mappend` bodyField "description"
<- fmap (take 10) . recentFirst =<<
posts "content"
loadAllSnapshots blogArticles let feedConfiguration = FeedConfiguration
= "Troels Henriksen's blog"
{ feedTitle = "What some hacker has written"
, feedDescription = "Troels Henriksen"
, feedAuthorName = "athas@sigkill.dk"
, feedAuthorEmail = "http://sigkill.dk"
, feedRoot
} renderAtom feedConfiguration feedCtx posts
Finally, HTML templates are, of course, handled by the default template compiler.
"templates/*" $ compile templateCompiler match
We’re done with the main function. All we need to do now is some fleshing out. To start with, static files are merely copied into position.
static :: Rules ()
= route idRoute >> compile copyFileCompiler >> return () static
Compiling content pages is done with the default
pandocCompiler
, but we extend it slightly with a transformation
that makes every headline link to itself. First, we define the
function that transforms a Header
block into a Header
block with
a self-link.
selfLinkHeader :: Block -> Block
Header n (ident, classes, kvs) b) =
selfLinkHeader (Header n (ident, classes, kvs) [b']
where b' = Link (ident <> "-link", ["titlelink"], []) b ("#" <> ident, ident)
= x selfLinkHeader x
Then we can define our contentCompiler
as a pandocCompiler
with an
additional transformation step.
contentCompiler :: Compiler (Item String)
= pandocCompilerWithTransform
contentCompiler
defaultHakyllReaderOptions$
defaultHakyllWriterOptions walk selfLinkHeader
Compiling man pages is done using the system groff
program. The
output from groff
will contain control characters (notably
backspaces), which we process with col -b
to generate plain text.
Finally, we insert the text into an HTML pre
element to preserve the
whitespace formatting.
manCompiler :: Compiler (Item String)
= getResourceString
manCompiler >>= withItemBody (unixFilter "groff" (words "-m mandoc -T utf8")
>=> unixFilter "col" ["-b"]
>=> return . renderHtml . H.pre . H.toHtml)
Futhark programs (with the .fut
extension) are taken to be literate
Futhark programs, and processed with futhark literate
. We need to play tricks with the itemIdentifier
because renderPandocWith
expects an .md
extension.
futCompiler :: Compiler (Item String)
= do
futCompiler <- getResourceFilePath
source $ unixFilter "futhark" ["literate", source] mempty
void let mdfile = source `replaceExtension` "md"
<- makeItem =<< unsafeCompiler (readFile mdfile)
item let oldident = itemIdentifier item
$ removeFile mdfile
unsafeCompiler <- renderPandocWithTransform defaultHakyllReaderOptions defaultHakyllWriterOptions
item'
(walk selfLinkHeader)= fromFilePath mdfile }
item { itemIdentifier pure item' { itemIdentifier = oldident }
And that’s it.