# The Website Specification
> A platform-agnostic, full specification of the technical features a good website should have. Built in the open under an MIT licence.
This is the llms.txt index for https://specification.website. The site is a platform-agnostic specification of the technical features a good website should have. It covers HTML foundations, SEO, accessibility, security, well-known URIs, agent readiness, performance, privacy, resilience, and internationalisation.
Every individual spec page is available as raw Markdown two ways:
1. Append `.md` to any spec URL — e.g. https://specification.website/spec/security/content-security-policy.md
2. Send `Accept: text/markdown` to the canonical (slash-terminated) URL — the same page returns Markdown with `Content-Location` pointing at the .md path and `Vary: Accept` set for caches.
For the full content of every spec page concatenated into one file, see https://specification.website/llms-full.txt.
For richer access, two named surfaces are available:
- MCP server at `https://mcp.specification.website/mcp` — stateless Streamable HTTP, no auth, with `search`, `list_topics`, `get_topic`, `get_checklist`, and `audit_url` tools. Server card: https://specification.website/.well-known/mcp/server-card.json.
- Agent Skill at https://specification.website/.well-known/agent-skills/specification-website/SKILL.md — discoverable via https://specification.website/.well-known/agent-skills/index.json. Teaches a compatible agent when and how to query the spec.
## Foundations
The HTML, head, and document basics every page needs.
- [The HTML doctype](https://specification.website/spec/foundations/doctype/): Every HTML document must start with as its first line. This opts the browser into standards mode; without it, you get quirks mode and broken layout.
- [The lang attribute on ](https://specification.website/spec/foundations/html-lang/): Set a valid BCP 47 language tag on the element so screen readers, translators, search engines, and browsers know what language the page is in.
- [](https://specification.website/spec/foundations/meta-charset/): Declare UTF-8 as the document character encoding in the first 1024 bytes of the HTML, so browsers parse text correctly before they hit any non-ASCII content.
- [](https://specification.website/spec/foundations/meta-viewport/): Tell mobile browsers to render the page at the device's actual width instead of pretending to be a 980-pixel desktop. One line, and never disable user scaling.
- [The
element](https://specification.website/spec/foundations/title/): Every HTML document must have exactly one non-empty element inside . It is used by browsers, search engines, screen readers, social previews, and AI agents.
- [](https://specification.website/spec/foundations/meta-description/): A short, unique summary of the page used by search engines and social platforms as a snippet. Google may rewrite it, but a good one is rewritten less often.
- [Canonical URL (rel="canonical")](https://specification.website/spec/foundations/canonical-url/): Declare the preferred URL for a page so search engines and crawlers consolidate ranking signals on one address, even when several URLs serve the same content.
- [Favicons and app icons](https://specification.website/spec/foundations/favicons/): Ship an SVG favicon, an ICO fallback at /favicon.ico, an apple-touch-icon, and a maskable PWA icon. Five files cover every browser and home-screen surface.
- [](https://specification.website/spec/foundations/theme-color/): Tints the browser chrome and OS surfaces to match your brand. Use the media attribute to ship one colour for light mode and another for dark mode.
- [](https://specification.website/spec/foundations/color-scheme/): Tells the browser which colour schemes your page is designed for. Prevents the white flash that dark-mode users see before your CSS loads, and lets the browser style scrollbars, form controls, and the page background to match.
- [Open Graph protocol](https://specification.website/spec/foundations/open-graph/): Open Graph tags control how pages look when shared on social platforms and chat apps. Set og:title, og:description, og:image, og:url, and og:type on every page.
- [Feed discovery with rel="alternate"](https://specification.website/spec/foundations/feed-discovery/): If your site publishes a feed — RSS, Atom, or JSON Feed — announce it in with . Feed readers, agents, and browsers discover it without guessing the URL.
- [Feed content hygiene](https://specification.website/spec/foundations/feed-hygiene/): If you publish a feed, ship it well-formed. Identify the feed inside itself with atom:link rel="self", give every item a stable guid, declare an update cadence with the Syndication module, and validate before deploy.
- [Popover API](https://specification.website/spec/foundations/popover-api/): Replace ARIA-puzzled JavaScript modals, menus, and tooltips with a native top-layer primitive that the browser opens, closes, and accessibility-wires for you.
## SEO
Search visibility — robots.txt, sitemaps, canonicals, structured data.
- [robots.txt](https://specification.website/spec/seo/robots-txt/): A plain-text file at the site root that tells crawlers which paths they may or may not fetch. Standardised in RFC 9309 and supported by every major search engine.
- [XML sitemaps](https://specification.website/spec/seo/xml-sitemaps/): An XML file listing the canonical URLs of a site, with optional metadata about when each was last changed. The fastest way to tell a search engine what exists.
- [Sitemap index files](https://specification.website/spec/seo/sitemap-index/): A sitemap of sitemaps. Used when a site has more than 50,000 URLs or wants to split sitemaps by content type for cleaner reporting.
- [Image and video sitemap extensions](https://specification.website/spec/seo/image-sitemaps/): Optional XML extensions that add image and video metadata to sitemap entries. Useful when media is loaded by JavaScript or hosted on a CDN that crawlers cannot reach by following links.
- [URL structure](https://specification.website/spec/seo/url-structure/): URLs are the most stable identifier on the web. Keep them lowercase, hyphenated, descriptive, and shallow. Treat them as a public API for your content.
- [Redirects (301/302/308)](https://specification.website/spec/seo/redirects/): HTTP redirects send a client from one URL to another. Use 301 or 308 for permanent moves, 302 or 307 for temporary ones, and never chain more than necessary.
- [Soft 404s](https://specification.website/spec/seo/soft-404/): A page that looks like a 'not found' message to a user but returns 200 OK to a crawler. Search engines treat soft 404s as a quality problem and often refuse to index them.
- [Meta robots and X-Robots-Tag](https://specification.website/spec/seo/meta-robots/): Every page must have an explicit, correct indexing policy — either implicit (default index, follow) on public pages, or an explicit noindex / X-Robots-Tag on staging, admin, thin, or private content. Get this wrong and you either disappear from search or expose what you didn't mean to.
- [Heading hierarchy](https://specification.website/spec/seo/heading-hierarchy/): Headings describe the sections of a page. They must form a nested outline, never be used for visual styling alone, and never skip levels.
- [Internal linking](https://specification.website/spec/seo/internal-linking/): Links from one page on a site to another. The strongest signal you control for telling crawlers and AI agents what a page is about and how important it is.
- [Structured data (JSON-LD)](https://specification.website/spec/seo/structured-data/): Machine-readable annotations that describe the content of a page using the schema.org vocabulary. JSON-LD is the format search engines and AI agents expect.
- [Breadcrumbs](https://specification.website/spec/seo/breadcrumbs/): A short trail showing the page's position in the site hierarchy. Visible in the UI for users, marked up as BreadcrumbList JSON-LD for search engines.
- [IndexNow](https://specification.website/spec/seo/indexnow/): An open protocol for telling participating search engines that a URL has changed. One HTTP request pushes Bing, Yandex, Naver, and Seznam to recrawl — Google does not participate.
## Accessibility
WCAG-aligned rules so people of all abilities can use the site.
- [Colour contrast](https://specification.website/spec/accessibility/color-contrast/): Text and meaningful non-text elements must have enough contrast against their background so people with low vision and people in harsh light can read them.
- [Image alt text](https://specification.website/spec/accessibility/image-alt-text/): Every element must have an alt attribute. The value describes the image's purpose to screen readers, search engines, and anyone whose image fails to load.
- [Form labels](https://specification.website/spec/accessibility/form-labels/): Every form control needs a programmatically associated label. A placeholder is not a label, and an unlabelled input is unusable for screen-reader and voice-control users.
- [Keyboard navigation](https://specification.website/spec/accessibility/keyboard-navigation/): Every interactive element on the page must be reachable and operable with a keyboard alone, in a logical order, with no traps that hold focus.
- [Visible focus indicators](https://specification.website/spec/accessibility/focus-indicators/): Whenever a control receives keyboard focus, the page must show a clear, high-contrast indicator. Removing focus outlines without a replacement is a top accessibility failure.
- [Skip links](https://specification.website/spec/accessibility/skip-links/): A 'skip to main content' link as the first focusable element lets keyboard and screen-reader users jump past repeated navigation on every page.
- [Semantic HTML and landmarks](https://specification.website/spec/accessibility/semantic-html/): Use the right HTML element for the job. Landmarks like ,