Machine-readable formats
Offer JSON, RSS, or plain markdown endpoints alongside HTML where it makes sense. Agents and feed readers prefer typed data over scraped HTML.
What it is
A machine-readable format is a representation of your content designed to be consumed by software rather than rendered for humans. HTML is the universal default, but for lists, feeds, and structured records it is the wrong shape. JSON, RSS, Atom, and plain markdown are better.
Most sites already publish one without thinking about it: sitemap.xml. Adding a feed and an optional JSON endpoint covers the cases agents and aggregators care about most.
<!-- /feed -->
<rss version="2.0">
  <channel>
    <title>Example Corp Blog</title>
    <link>https://example.com/blog</link>
    <description>Notes on building things.</description>
    <item>
      <title>Setting up CSP</title>
      <link>https://example.com/blog/csp</link>
      <pubDate>Tue, 12 May 2026 10:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
// /index.json
{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "Example Corp Blog",
  "home_page_url": "https://example.com/blog",
  "feed_url": "https://example.com/index.json",
  "items": [
    {
      "id": "https://example.com/blog/csp",
      "url": "https://example.com/blog/csp",
      "title": "Setting up CSP",
      "date_published": "2026-05-12T10:00:00Z",
      "content_html": "<p>...</p>"
    }
  ]
}
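One way to keep the two documents above in sync is to generate both from the same post records. The following is a minimal sketch using only the Python standard library; the POSTS list and the build_rss and build_json_feed helpers are hypothetical names for illustration, and a real site would load posts from its CMS.

```python
import json
from datetime import datetime, timezone
from email.utils import format_datetime
from xml.etree import ElementTree as ET

# Hypothetical post records standing in for whatever the CMS returns.
POSTS = [
    {
        "url": "https://example.com/blog/csp",
        "title": "Setting up CSP",
        "published": datetime(2026, 5, 12, 10, 0, tzinfo=timezone.utc),
        "html": "<p>...</p>",
    }
]


def build_rss(posts):
    """Serialize posts as an RSS 2.0 document (returned as a string)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example Corp Blog"
    ET.SubElement(channel, "link").text = "https://example.com/blog"
    ET.SubElement(channel, "description").text = "Notes on building things."
    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["url"]
        # RSS dates use the RFC 822 style; usegmt requires a UTC datetime.
        published = post["published"].astimezone(timezone.utc)
        ET.SubElement(item, "pubDate").text = format_datetime(published, usegmt=True)
    return ET.tostring(rss, encoding="unicode")


def build_json_feed(posts):
    """Serialize the same posts as a JSON Feed 1.1 document."""
    feed = {
        "version": "https://jsonfeed.org/version/1.1",
        "title": "Example Corp Blog",
        "home_page_url": "https://example.com/blog",
        "feed_url": "https://example.com/index.json",
        "items": [
            {
                "id": post["url"],
                "url": post["url"],
                "title": post["title"],
                # JSON Feed dates use RFC 3339; normalise to UTC first.
                "date_published": post["published"]
                .astimezone(timezone.utc)
                .strftime("%Y-%m-%dT%H:%M:%SZ"),
                "content_html": post["html"],
            }
            for post in posts
        ],
    }
    return json.dumps(feed, indent=2)
```

Building both outputs from one list means a new post cannot appear in one feed and be missing from the other.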
Why it matters
- Agents that need a list of recent posts can fetch one URL instead of scraping a paginated index.
- Feed readers, archive tools, and notification services have decades of support for RSS and Atom.
- JSON Feed is friendlier to modern toolchains and easier to extend.
- For documentation, exposing a .md next to each .html gives agents the source without HTML extraction.
The cost is small and the upside compounds: every consumer that does not have to parse HTML is one less source of bad quotes.
How to implement
- Sitemap. Every site should have one; see XML sitemaps.
- A feed. Publish RSS 2.0 or Atom at a discoverable URL such as /feed, /feed.xml, or /rss.xml. Optionally publish a JSON Feed at /feed.json or /index.json. Link both from <head>, using application/feed+json (the MIME type the JSON Feed 1.1 spec recommends) for the JSON Feed:
  <link rel="alternate" type="application/rss+xml" title="Blog" href="/feed.xml">
  <link rel="alternate" type="application/feed+json" title="Blog" href="/feed.json">
- Markdown sources. For documentation, serve page.md alongside page.html; see Per-page Markdown source endpoints for the full pattern, including Accept: text/markdown content negotiation done correctly with Vary: Accept.
- Content negotiation. Powerful, but easy to misuse. Always set Vary: Accept on every representation; otherwise a shared cache can store whichever representation it saw first and serve it to every subsequent visitor, whatever they asked for. When the formats are very different (HTML vs JSON), distinct URLs are usually simpler.
- Stable schemas. Once published, treat the field names as a contract, the same rule that applies to URLs.
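The content negotiation step can be sketched as a small framework-independent function. This is an illustration under stated assumptions, not a complete implementation: parse_accept, negotiate, and the AVAILABLE list are hypothetical names, the matching ignores media-range specificity, and an unsupported Accept header falls back to HTML rather than returning 406.

```python
# Representations this hypothetical page can serve; the first is the default.
AVAILABLE = ["text/html", "text/markdown"]


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    ranges = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        ranges.append((media_type, q))
    # sorted() is stable, so equal weights keep their order in the header.
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


def negotiate(accept_header):
    """Pick a representation and return it with the response headers."""
    chosen = AVAILABLE[0]
    for media_type, q in parse_accept(accept_header or "*/*"):
        if q <= 0:
            continue
        if media_type in AVAILABLE:
            chosen = media_type
            break
        if media_type in ("*/*", "text/*"):
            break  # a wildcard matches the default representation
    headers = {
        "Content-Type": f"{chosen}; charset=utf-8",
        # Sent on every representation so caches key on the Accept header.
        "Vary": "Accept",
    }
    return chosen, headers
```

Note that Vary: Accept is returned for the HTML default as well as for the markdown variant; setting it on only one of them is the caching mistake described above.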
Common mistakes
- Truncating feed items to a one-line teaser. If you want agents to quote you fairly, give them the full content.
- Forgetting to update pubDate or date_published, so readers think nothing has changed.
- Letting the feed go stale because the CMS export broke quietly. Monitor it.
- Mixing absolute and relative URLs inside feed items. Use absolute everywhere.
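Most of the mistakes above can be caught by a small check run against the published JSON Feed. The sketch below is a heuristic, not a definitive linter: lint_feed is a hypothetical helper, and the max_age_days and min_chars thresholds are arbitrary assumptions to tune per site.

```python
import re
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse

# Rough pattern for href/src attribute values inside content_html.
LINK_ATTR = re.compile(r'(?:href|src)\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def has_scheme(url):
    """True for URLs such as https://... and False for /path or //host/path."""
    return bool(urlparse(url).scheme)


def parse_date(value):
    # fromisoformat() only accepts a trailing "Z" on Python 3.11+, so normalise it.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def lint_feed(feed, now, max_age_days=90, min_chars=200):
    """Return human-readable problems found in a parsed JSON Feed dict."""
    problems = []
    dates = []
    for item in feed.get("items", []):
        label = item.get("id", "<no id>")
        body = item.get("content_html") or item.get("content_text") or ""
        if len(body) < min_chars:
            problems.append(f"{label}: content looks truncated ({len(body)} chars)")
        links = [item.get("url", "")] + LINK_ATTR.findall(item.get("content_html", ""))
        for link in links:
            if link and not has_scheme(link):
                problems.append(f"{label}: relative URL {link!r}")
        if "date_published" in item:
            dates.append(parse_date(item["date_published"]))
        else:
            problems.append(f"{label}: missing date_published")
    if dates and now - max(dates) > timedelta(days=max_age_days):
        problems.append(f"feed looks stale: newest item is from {max(dates).date()}")
    return problems
```

Running this on a schedule with now set to datetime.now(timezone.utc), and alerting when the list is non-empty, is one way to notice a quietly broken CMS export.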
Verification
- Validate RSS at validator.w3.org/feed.
- Validate JSON Feed against the spec.
- Subscribe to your own feed in a reader and confirm new posts appear.
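For the JSON Feed check, a short script can confirm the fields the JSON Feed spec marks as required: version, title, and items at the top level, plus an id and at least one of content_html or content_text on every item. This is a minimal sketch, with check_json_feed as a hypothetical helper name; it covers only those required fields, so it does not replace reading the spec.

```python
import json

KNOWN_VERSIONS = {
    "https://jsonfeed.org/version/1",
    "https://jsonfeed.org/version/1.1",
}


def check_json_feed(text):
    """Return a list of errors for a JSON Feed document given as a string."""
    try:
        feed = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(feed, dict):
        return ["top level must be a JSON object"]

    errors = []
    if feed.get("version") not in KNOWN_VERSIONS:
        errors.append("version must be a JSON Feed version URL")
    if not isinstance(feed.get("title"), str):
        errors.append("title is required and must be a string")

    items = feed.get("items")
    if not isinstance(items, list):
        errors.append("items is required and must be an array")
        items = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            errors.append(f"items[{index}] must be an object")
            continue
        if "id" not in item:
            errors.append(f"items[{index}] is missing id")
        if "content_html" not in item and "content_text" not in item:
            errors.append(f"items[{index}] needs content_html or content_text")
    return errors
```

An empty list means the required fields are present; anything else is worth fixing before pointing readers at the feed.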
Sources & further reading
- RSS 2.0 Specification — RSS Advisory Board
- JSON Feed 1.1 — JSON Feed
- Is It Agent Ready? — Is It Agent Ready?