Serve AI-readable Markdown from Hugo
When an LLM fetches a documentation or tutorial page, it frequently has to slog through the morass of “modern” HTML: headers, site navigation, inline scripts, inline styles, ads (not on this site, of course!), footers, and then, maybe somewhere in the middle, the actual content. Some sites don’t even include the actual content, choosing to render it via JavaScript!
Claude’s Web Fetch tool is reasonably good at finding the main content, but if your site is client-side rendered, it won’t find it at all. Even on a server-rendered site, the surrounding noise (even with “dynamic filtering”) eats into the context window and raises the chance that the model misses important details.
A plain Markdown file lets the LLM skip all of these shenanigans, so the model gets the text immediately, at a fraction of the token cost.
If you’re lucky enough to use Hugo (like this site), custom output formats handle this automatically. Every page gets an index.md sibling, and the site root gets a /llms.txt index. No build plugins, no post-processing.
Define output formats
Add two named formats to your config.toml (or hugo.toml):
[outputFormats.markdown]
name = "markdown"
baseName = "index"
mediaType = "text/markdown"
isPlainText = true
[outputFormats.llms]
name = "llms"
baseName = "llms"
mediaType = "text/plain"
isPlainText = true
isPlainText = true tells Hugo to use Go’s text/template package instead of
html/template, so it won’t escape angle brackets or quotes in your Markdown content.
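To see why that matters, here is a minimal Go sketch that runs the same one-line template through both packages. The template string, the function names, and the sample text are made up for illustration; Hugo's internals are more involved than this.

```go
package main

import (
	"bytes"
	"fmt"
	htmltemplate "html/template"
	texttemplate "text/template"
)

// tmplSrc is a hypothetical stand-in for a plain-text layout such as single.md.
const tmplSrc = `{{ .Body }}`

// renderText executes tmplSrc with text/template, which inserts values verbatim.
func renderText(body string) (string, error) {
	t, err := texttemplate.New("plain").Parse(tmplSrc)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, map[string]string{"Body": body}); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// renderHTML executes the same source with html/template, which escapes
// values for the HTML context they land in.
func renderHTML(body string) (string, error) {
	t, err := htmltemplate.New("html").Parse(tmplSrc)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, map[string]string{"Body": body}); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	body := `Press <kbd>Enter</kbd> to "confirm"`
	plain, err := renderText(body)
	if err != nil {
		fmt.Println("text/template error:", err)
		return
	}
	escaped, err := renderHTML(body)
	if err != nil {
		fmt.Println("html/template error:", err)
		return
	}
	fmt.Println("text/template:", plain)
	fmt.Println("html/template:", escaped)
}
```

The text/template line comes back unchanged, while the html/template line has its angle brackets and quotes replaced with entities, which is exactly what you don't want in a Markdown file.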
Assign formats to page kinds
Tell Hugo which formats to generate for pages and the home page:
[outputs]
page = ["HTML", "markdown"]
home = ["HTML", "RSS", "llms"]
The first entry in each array is the primary format. Everything else is an alternate.
See Hugo’s “configure outputs” documentation for the full list of
page kinds (section, taxonomy, term, etc.) you can assign.
Create the page template
Hugo picks a template based on the page kind, the output format name, and the media type’s file suffix. For the markdown format on regular pages, create
layouts/_default/single.md:
---
title: {{ .Title }}
url: {{ .Permalink }}
{{- with .Description }}
description: {{ . }}
{{- end }}
{{- with .Date }}
date: {{ .Format "2006-01-02" }}
{{- end }}
{{- with .Params.keywords }}
keywords: {{ delimit . ", " }}
{{- end }}
---
{{ .RawContent }}
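RawContent is the original Markdown source with the front matter stripped. Hugo doesn’t render it, so your
headings, code blocks, and links stay intact. Shortcode calls are left unprocessed too, so they appear verbatim in the output.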
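On the consuming side, a client can peel the metadata off the top of the generated file before handing the body to a model. The following is a rough Go sketch, assuming the flat "key: value" front matter that the template above emits; splitFrontMatter and the sample document are hypothetical, and this is not a real YAML parser.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontMatter separates a leading "---" delimited block from the body.
// It only understands flat "key: value" lines and returns the document
// untouched when no complete front matter block is found.
func splitFrontMatter(doc string) (map[string]string, string) {
	meta := map[string]string{}
	const fence = "---\n"
	if !strings.HasPrefix(doc, fence) {
		return meta, doc
	}
	header, body, ok := strings.Cut(doc[len(fence):], "\n---\n")
	if !ok {
		return meta, doc
	}
	for _, line := range strings.Split(header, "\n") {
		// Cut at the first colon only, so URLs keep their "https:" part.
		key, value, found := strings.Cut(line, ":")
		if !found {
			continue
		}
		meta[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return meta, body
}

func main() {
	// Hypothetical output of the single.md template for one page.
	doc := "---\ntitle: Some Page\nurl: https://example.com/some-page/\ndate: 2024-01-02\n---\n# Heading\n\nText.\n"
	meta, body := splitFrontMatter(doc)
	fmt.Println("title:", meta["title"])
	fmt.Println("url:", meta["url"])
	fmt.Print(body)
}
```

Titles that contain colons or quotes would need proper YAML handling on both the template and the parsing side; the sketch sidesteps that.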
Create the llms.txt template
The llms.txt spec describes a simple index format: a
site title, an optional description, and a list of pages with brief summaries.
Create layouts/_default/index.llms.txt:
# {{ .Site.Title }}
> {{ .Site.Params.description }}
{{ .Site.BaseURL }}
## Documentation
{{ range .Site.RegularPages }}
- [{{ .Title }}]({{ .Permalink }}index.md): {{ with .Description }}{{ . | htmlUnescape }}{{ else }}{{ .Summary | plainify | htmlUnescape | truncate 120 }}{{ end }}
{{- end }}
Each entry links directly to the index.md version of the page, not the HTML version.
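To give a feel for how a client might consume that index, here is a small Go sketch that pulls the link entries out of an llms.txt body. The Entry type, parseEntries, and the sample text are assumptions for illustration; the regular expression only covers the "- [title](url): summary" shape the template above produces, not every variation the spec permits.

```go
package main

import (
	"fmt"
	"regexp"
)

// Entry is one link line from llms.txt (hypothetical type for this sketch).
type Entry struct {
	Title   string
	URL     string
	Summary string
}

// entryRe matches lines such as "- [Title](https://example.com/x/index.md): Summary".
// The summary part is optional. Titles containing "]" are not handled.
var entryRe = regexp.MustCompile(`(?m)^- \[([^\]]+)\]\(([^)\s]+)\)(?::[ \t]*(.*))?$`)

// parseEntries returns every link entry found in the given llms.txt text.
func parseEntries(llmsTxt string) []Entry {
	var entries []Entry
	for _, m := range entryRe.FindAllStringSubmatch(llmsTxt, -1) {
		entries = append(entries, Entry{Title: m[1], URL: m[2], Summary: m[3]})
	}
	return entries
}

func main() {
	// Hypothetical llms.txt content, shaped like the template's output.
	sample := "# Example Site\n" +
		"> A site about things.\n" +
		"https://example.com/\n" +
		"## Documentation\n\n" +
		"- [First Post](https://example.com/first/index.md): Intro to things.\n" +
		"- [Second Post](https://example.com/second/index.md): More things.\n"
	for _, e := range parseEntries(sample) {
		fmt.Printf("%s -> %s (%s)\n", e.Title, e.URL, e.Summary)
	}
}
```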
Add a discovery link
The common convention for Markdown variants is to append .md to the URL, but that doesn’t work with Hugo’s pretty URLs like https://example.com/some-page/, where the Markdown version lives at https://example.com/some-page/index.md. Instead, add a <link rel="alternate"> tag to your HTML so clients can discover the Markdown version.
In layouts/_default/baseof.html (or wherever your <head> is defined):
{{- with .OutputFormats.Get "markdown" }}
<link
href="{{ .Permalink }}"
rel="alternate"
type="text/markdown"
title="{{ $.Title }}"
/>
{{- end }}
The with guard ensures the tag only appears on pages that have a Markdown
output. It won’t show up on the home page or section indexes unless you add
markdown to those kinds’ outputs.
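For a sense of what discovery looks like from the client's side, here is a rough Go sketch that scans an HTML document for that tag and returns its href. It sticks to the standard library and uses regular expressions rather than a real HTML parser, so treat it as an illustration only: findMarkdownAlternate is a hypothetical helper, and it assumes double-quoted attributes like the ones in the template above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// linkTagRe finds <link ...> tags, including ones that span several lines.
	linkTagRe = regexp.MustCompile(`(?is)<link\b[^>]*>`)
	// attrRe picks out name="value" pairs; unquoted values are not handled.
	attrRe = regexp.MustCompile(`([a-zA-Z-]+)\s*=\s*"([^"]*)"`)
)

// findMarkdownAlternate returns the href of the first
// <link rel="alternate" type="text/markdown"> tag, if there is one.
func findMarkdownAlternate(htmlDoc string) (string, bool) {
	for _, tag := range linkTagRe.FindAllString(htmlDoc, -1) {
		attrs := map[string]string{}
		for _, m := range attrRe.FindAllStringSubmatch(tag, -1) {
			attrs[strings.ToLower(m[1])] = m[2]
		}
		if strings.EqualFold(attrs["rel"], "alternate") &&
			strings.EqualFold(attrs["type"], "text/markdown") &&
			attrs["href"] != "" {
			return attrs["href"], true
		}
	}
	return "", false
}

func main() {
	// Hypothetical page head containing an RSS link and the Markdown alternate.
	page := `<head>
<link href="https://example.com/index.xml" rel="alternate" type="application/rss+xml" />
<link
href="https://example.com/some-page/index.md"
rel="alternate"
type="text/markdown"
title="Some Page"
/>
</head>`
	if href, ok := findMarkdownAlternate(page); ok {
		fmt.Println("Markdown version:", href)
	} else {
		fmt.Println("no Markdown alternate found")
	}
}
```

A production client would be better served by a proper HTML parser, since minified output may drop the quotes this sketch relies on.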
Set Content-Type headers
By default, Hugo’s dev server and most static hosts serve .md files without a charset
parameter. Without one, some HTTP clients guess the encoding and misinterpret non-ASCII characters.
Tragically, this causes critical emoji (probably not actually critical) to render as mojibake.
Set the Content-Type header for .md files to include charset=utf-8.
For the Hugo dev server, add to config.toml:
[[server.headers]]
for = "**.md"
[server.headers.values]
Content-Type = "text/markdown; charset=utf-8"
[[server.headers]]
for = "**.txt"
[server.headers.values]
Content-Type = "text/plain; charset=utf-8"
For Cloudflare and other static hosts that support a _headers file, add to
static/_headers:
/*.md
  Content-Type: text/markdown; charset=utf-8
/llms.txt
  Content-Type: text/plain; charset=utf-8
Other hosts have equivalent mechanisms: Netlify uses netlify.toml headers,
Vercel uses vercel.json headers. Visit the documentation for your host to find
out how to set custom headers for static files correctly.
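If you want to check the header programmatically rather than by eye, Go's standard mime package can parse a Content-Type value. A minimal sketch follows; declaresUTF8 is a hypothetical helper name, and the header strings are sample values rather than output from any particular host.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// declaresUTF8 reports whether a Content-Type header value carries
// an explicit charset=utf-8 parameter (compared case-insensitively).
func declaresUTF8(contentType string) bool {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		// Missing or malformed header values count as "not declared".
		return false
	}
	return strings.EqualFold(params["charset"], "utf-8")
}

func main() {
	// Sample header values: with and without the charset parameter.
	for _, value := range []string{
		"text/markdown; charset=utf-8",
		"text/markdown",
		"text/plain; charset=UTF-8",
	} {
		fmt.Printf("%-32s utf-8 declared: %v\n", value, declaresUTF8(value))
	}
}
```

In a real check you would feed it the value of resp.Header.Get("Content-Type") from an http.Head request against your deployed index.md.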
Verify
After rebuilding (hugo or hugo server), check that everything works:
# Per-page Markdown output
curl -I https://yoursite.com/some-page/index.md
# Expect: Content-Type: text/markdown; charset=utf-8
# Site index
curl https://yoursite.com/llms.txt
# Expect: a Markdown list of all pages
# Discovery link in HTML
curl -s https://yoursite.com/some-page/ | grep 'rel="alternate"'
# Expect: <link href="..." rel="alternate" type="text/markdown" ...>
Locally: run hugo server, open any page in your browser, and inspect the
<head> for the alternate link.
