Makefile-Based Blogging

Yet another static site generator using pandoc(1) and make(1).

December 12, 2022 AD

A few days ago, I got the gumption to start blogging again. The last time I wrote with any frequency, I lovingly hand-crafted each HTML file before rsyncing it to my web server. This time, I wanted a more efficient workflow.

I surveyed the vast number of static site generators available on GitHub, but most of them seemed like overkill for my humble website. I figured that by the time I wrapped by head around one of them, I could have just written a Makefile.

Finally, I came across pandoc-blog, which gave me inspiration and showed me the ideal pandoc incantations for generating HTML from markdown files. And thus, my Makefile-based static site generator was born. You’re reading the inaugural post!

Generating the HTML

The workhorse of this thing is pandoc, which is a ubiquitous open-source document converter. Transforming markdown into HTML is as simple as:

pandoc document.md -o document.html

Simple! But to generate an entire website, we’ll need some of pandoc’s additional features: custom templates and document metadata.

Custom Templates

The layout of pandoc’s output document is determined by the template in use. Pandoc includes default templates for a variety of document formats, but you can also specify your own.

A very simple HTML template might look something like this:

<html lang="en">
  <head>
    <meta name="author" content="$author-meta$">
    <meta name="description" content="$description$">
  </head>
  <body>
    <h1 class="title">$title$</h1>
$body$
  </body>
</html>

My pandoc template is what generates the navigation bar at the top of this page.

The variable $body$ is replaced by the content of your markdown document when pandoc renders the template. The other variables are replaced by their corresponding values from the document’s metadata.

Document Metadata

Each pandoc source document can have associated metadata values. There are three ways of specifying metadata: the --medatata flag, a dedicated metadata file, or a YAML metadata block embedded within the document itself. We’ll be using the embedded metadata blocks.

Each markdown document for my website starts with a YAML metadata block. The metadata for the post you’re currently reading looks like this:

---
title: Makefile-Based Blogging
date: December 12, 2022
subtitle: Yet another static site generator using `pandoc(1)` and `make(1)`.
description: Building a markdown-based static site generator using pandoc and make.
---

You can put whatever YAML you like in your markdown files, as long as the metadata starts and ends with three hyphens.

Automating pandoc with make

Using a Makefile, we can automatically invoke pandoc to convert each markdown file in our blog to HTML. In addition, make will keep track of which source files have changed since the last run and rebuild them accordingly.

First, lets describe the project layout:

The Makefile used to build this website is located here. I’ve reproduced a simplified version below, to make it easier to step through.

######################
# Variable definitions
######################

# These variables are used to generate the RSS feed
URL              = https://www.sacredheartsc.com
FEED_TITLE       = sacredheartsc blog
FEED_DESCRIPTION = Carolina-grown articles about self-hosting, privacy, unix, and more.

# The number of blog posts to show on the homepage
BLOG_LIST_LIMIT = 5

# File extensions (other than .md) that should be included in public/ directory
STATIC_REGEX = .*\.(html|css|jpg|jpeg|png|xml|txt)

# Pandoc template used to generate HTML
TEMPLATE = templates/default.html

# List of subdirectories to create
SOURCE_DIRS := $(shell find src -mindepth 1 -type d)

# List of source markdown files
SOURCE_MARKDOWN := $(shell find src -type f -name '*.md' -and ! -name .bloglist.md)

# List of static assets
SOURCE_STATIC := $(shell find src               \
                     -type f                    \
                     -regextype posix-extended  \
                     -iregex '$(STATIC_REGEX)')

# List of all blog posts (excluding the main blog page)
BLOG_POSTS := $(shell find src/blog               \
                  -type f                         \
                  -name '*.md'                    \
                  -and ! -name .bloglist.md       \
                  -and ! -path src/blog/index.md)

# Subdirectories to create under public/
OUTPUT_DIRS := $(patsubst src/%, public/%, $(SOURCE_DIRS))

# .html files under public/, corresponding to each .md file under src/
OUTPUT_MARKDOWN := $(patsubst src/%, public/%, $(patsubst %.md, %.html, $(SOURCE_MARKDOWN)))

# Static file targets under public/
OUTPUT_STATIC := $(patsubst src/%, public/%, $(SOURCE_STATIC))

# Script to generate RSS feed
RSSGEN = scripts/rss.py               \
  src/blog                            \
  --title="$(FEED_TITLE)"             \
  --description="$(FEED_DESCRIPTION)" \
  --url=$(URL)                        \
  --blog-path=/blog                   \
  --feed-path=/blog/rss/feed.xml


######################
# File Targets
######################

# Default target: convert .md to .html, copy static assets, and generate RSS
public:                \
  $(OUTPUT_DIRS)       \
  $(OUTPUT_MARKDOWN)   \
  $(OUTPUT_STATIC)     \
  public/blog/feed.xml

# Homepage (/)
public/index.html: src/index.md src/.bloglist.md $(TEMPLATE)
    sed $$'/__BLOG_LIST__/{r src/.bloglist.md\nd}' $< \
      | pandoc --template=$(TEMPLATE) --output=$@

# Markdown list of 5 most recent blog posts
src/.bloglist.md: $(BLOG_POSTS) scripts/bloglist.py
    scripts/bloglist.py src/blog $(BLOG_LIST_LIMIT) > $@

# The main blog listing (/blog/)
public/blog/index.html: src/blog/index.md src/blog/.bloglist.md $(TEMPLATE)
    sed $$'/__BLOG_LIST__/{r src/blog/.bloglist.md\nd}' $< \
      | pandoc --template=$(TEMPLATE) --output=$@

# Markdown list of _all_ blog posts
src/blog/.bloglist.md: $(BLOG_POSTS) scripts/bloglist.py
    scripts/bloglist.py src/blog > $@

# Convert all other .md files to .html
public/%.html: src/%.md $(TEMPLATE)
    pandoc --template=$(TEMPLATE) --output=$@ $<

# Catch-all: copy static assets in src/ to public/
public/%: src/%
    cp --preserve=timestamps $< $@

# RSS feed
public/blog/feed.xml: $(BLOG_POSTS) scripts/rss.py
    $(RSSGEN) > $@


######################
# Phony Targets
######################

.PHONY: serve rsync clean

# Run a local HTTP server in the output directory
serve: public
    cd public && python3 -m http.server

# Deploy the site to your webserver
rsync: public
    rsync -rlphv --delete public/ webserver.example.com:/var/www/html

clean:
    rm -rf public
    rm -f src/.bloglist.md
    rm -f src/blog/.bloglist.md

Closing Thoughts

I admit, there is a small amount of hackery involved. You obviously can’t generate a time-sorted list of blog posts using pure markdown, so I’m generating the markdown list using a Python script in an intermediate step. I then (ab)use sed to shove that list into the markdown source on the fly. This means that changing the look of the blog list requires hacking up the Python code.

But overall, I’ve been quite happy with this little project. There’s just something about writing paragraphs in vi and typing :!make that warms my soul with memories of simpler times.