Making Markdown More HTML5 with Kramdown

April 25th, 2015

Markdown dates back to 2004, incidentally the same year the WHATWG started working on HTML5. Now that HTML 5.0 is a W3C Recommendation, it would be awesome if the popular Markdown converters also supported the new elements that HTML5 brings with it.

Footnotes

Vanilla Markdown has no support for footnotes, although Daring Fireball does use them. MultiMarkdown added footnote support, and its output mimics the HTML used by Daring Fireball:

<div class="footnotes">
  <hr>
  <ol>
    <li id="fn1"><p>Footnote</p></li>
  </ol>
</div>

This has led pretty much every Markdown converter that supports MMD's new features to do this in exactly the same way, even though HTML5 added a new semantic element called <footer> that I think not only simplifies the markup but also gets rid of the extra <hr> element^[1].

<footer>
  <ol>
    <li id="fn1"><p>Footnote</p></li>
  </ol>
</footer>

Figures

Another new semantic element HTML5 added is <figure>. According to the spec, this element is used to represent:

… unit of content, optionally with a caption, that is self-contained, that is typically referenced as a single unit from the main flow of the document, and that can be moved away from the main flow of the document without affecting the document’s meaning.

In a blogging context, this means pretty much all the images, code examples, and whatever else is used to illustrate the main text. One fancy thing <figure> adds is a straightforward way to caption these figures. For images in blog articles, this would mean that a more correct way to output

![alt text](http://example.com/image.jpg "Title text")

would be

<figure>
  <img src="http://example.com/image.jpg" alt="alt text" title="Title text">
  <figcaption>Title text</figcaption>
</figure>

[Figure: Screenshot of Half-Life 2 cards. Caption: An example figure with a caption]

Also, this applies equally to the code examples above.
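The image-to-figure mapping above can be sketched in plain Ruby, independent of any Markdown converter. This is only an illustration: `figure_html` is a name made up for this example, and it assumes the caption should simply repeat the title text, as in the markup shown earlier.

```ruby
require 'cgi'

# Hypothetical helper (not part of any Markdown library): builds the <figure>
# markup for a block-level image, adding a caption only when a title is given.
def figure_html(src, alt, title = nil)
  attrs = { 'src' => src, 'alt' => alt, 'title' => title }
          .reject { |_, value| value.nil? }
          .map { |name, value| %( #{name}="#{CGI.escapeHTML(value)}") }
          .join
  # The title ends up as element content here, so it is escaped as well.
  caption = title ? "\n  <figcaption>#{CGI.escapeHTML(title)}</figcaption>" : ''
  "<figure>\n  <img#{attrs}>#{caption}\n</figure>\n"
end

puts figure_html('http://example.com/image.jpg', 'alt text', 'Title text')
```

Called with the values from the Markdown example, this prints the same four lines of markup as the hand-written figure above. Without a title, the `<figcaption>` line is left out.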

Extending Kramdown

Fortunately, among its other totally awesome features, Kramdown allows subclassing its HTML converter.

module Kramdown
  class Converter::Html5 < Converter::Html

This class then simply needs to override the convert_codeblock and footnote_content methods with almost identical copies in which the <div>s in the output strings are changed to <figure> and <footer>, respectively.
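For the footnotes, a lazier alternative to copying the whole method is to reuse the parent's output and swap only the outer element. The sketch below assumes the parent's `footnote_content` returns the complete footnotes block as a string that starts with `<div class="footnotes"`. `FootnoteMarkup` and `div_to_footer` are names made up for this example, and the `class` attribute is kept so existing CSS keeps working.

```ruby
begin
  require 'kramdown'
rescue LoadError
  # Kramdown is not installed; the string helper below still works on its own.
end

# Hypothetical helper, not part of Kramdown.
module FootnoteMarkup
  OPENING = /\A<div class="footnotes"([^>]*)>/

  # Swaps the outer <div class="footnotes"> wrapper for a <footer>, keeping
  # any other attributes and the list inside untouched. Input that does not
  # start with the expected wrapper is returned unchanged.
  def self.div_to_footer(html)
    return html unless html.match?(OPENING)

    html.sub(OPENING, '<footer class="footnotes"\1>')
        .sub(%r{</div>\s*\z}, "</footer>\n")
  end
end

# Only hook into Kramdown when it is actually available.
if defined?(Kramdown::Converter::Html)
  module Kramdown
    module Converter
      class Html5 < Html
        # Assumption: super returns the whole footnotes block as one string.
        def footnote_content
          FootnoteMarkup.div_to_footer(super)
        end
      end
    end
  end
end
```

String surgery like this is fragile if the wrapper markup ever changes, which is why the helper gives up rather than guessing when the opening tag is not what it expects.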

However, since <img> can also appear inline in text, for example as an emoticon, some additional code is needed to pick out only the block-level images. A more “correct” approach would probably be to identify block-level images in the parser (as Kramdown’s parser does for <code>). One way to achieve this in the converter, though very likely not a very good one, is to add a bit of logic to convert_p.

def convert_p(el, indent)
  if el.options[:transparent]
    inner(el, indent)
  # Check if the paragraph only contains an image and treat it as a figure instead.
  elsif !el.children.nil? && el.children.count == 1 && el.children.first.type == :img
    convert_figure(el.children.first, indent)
  else
    format_as_block_html(el.type, el.attr, inner(el, indent), indent)
  end
end

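def convert_figure(el, indent)
  # Append the Bootstrap class instead of overwriting a class set in the Markdown source.
  el.attr['class'] = [el.attr['class'], 'img-responsive'].compact.join(' ')
  title = el.attr['title']
  # The title is emitted as element content here, so it has to be escaped.
  caption = title ? "<figcaption>#{escape_html(title)}</figcaption>" : ''
  "#{' ' * indent}<figure><img#{html_attributes(el.attr)} />#{caption}</figure>\n"
end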

To get this new converter to work in, for example, Jekyll, you also need to create a new Jekyll converter that tells Kramdown to call #to_html5 instead of #to_html:

class Jekyll::Converters::Markdown::MyFancierMarkdown
  # ... snip ...

  def convert(content)
    # ... snip ...
    Kramdown::Document.new(content, Jekyll::Utils.symbolize_hash_keys(@config['kramdown'])).to_html5
  end
end

Also, change the default Markdown converter in _config.yml to:

markdown: MyFancierMarkdown
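Nothing in the Jekyll converter defines #to_html5 explicitly. Kramdown::Document maps to_* calls to the converter class with the matching name, which is why naming the subclass Html5 is enough. The toy below sketches that style of dispatch. Everything in the `Toy` namespace is made up for illustration and is not Kramdown's actual source.

```ruby
module Toy
  module Converter
    class Html
      def self.convert(text)
        "<p>#{text}</p>"
      end
    end

    class Html5 < Html
      def self.convert(text)
        "<figure>#{text}</figure>"
      end
    end
  end

  class Document
    def initialize(text)
      @text = text
    end

    # to_html5 -> Converter::Html5, to_html -> Converter::Html, and so on.
    def method_missing(name, *args)
      converter = converter_for(name)
      converter ? converter.convert(@text) : super
    end

    def respond_to_missing?(name, include_private = false)
      !converter_for(name).nil? || super
    end

    private

    # Looks up a converter class only inside Toy::Converter itself.
    def converter_for(name)
      match = /\Ato_([a-z]\w*)\z/.match(name.to_s)
      return nil unless match

      const = match[1].capitalize
      Converter.const_defined?(const, false) ? Converter.const_get(const, false) : nil
    end
  end
end

puts Toy::Document.new('hello').to_html5
```

Calling a to_* method with no matching class, such as `to_pdf` here, falls through to the usual NoMethodError.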

There are probably other things that would make converted Markdown more civilized in an HTML5 world, but these two things, footnotes and figures, have been the two itches that have bothered me the most.

[1] Obviously, even without HTML5 the horizontal line could (and probably should anyway) be added with a CSS border-top rule on the div.footnotes element.