# The Website Specification

> A platform-agnostic, full specification of the technical features a good website should have. Built in the open under an MIT licence.

Source: https://specification.website · Repository: https://github.com/jdevalk/specification.website · Content licence: CC BY 4.0

---

# Foundations

The HTML, head, and document basics every page needs.

## The HTML doctype

Status: required · Source page: https://specification.website/spec/foundations/doctype/

Every HTML document must start with `<!doctype html>` as its first line. This opts the browser into standards mode; without it, you get quirks mode and broken layout.

## What it is

The doctype is a short declaration at the very top of an HTML document that tells the browser which rendering mode to use. In modern HTML, there is exactly one correct form:

```html
<!doctype html>
```

It is case-insensitive, so `<!DOCTYPE html>` is equally valid. It must be the first thing in the document, before `<html>`, with no whitespace, comments, or byte-order mark trickery in front of it.

## Why it matters

Without a doctype, browsers fall back to **quirks mode**, a compatibility layer that emulates the buggy behaviour of browsers from the late 1990s. In quirks mode:

- The CSS box model changes (widths include padding and border, like the old IE5 model).
- Inline elements behave differently around whitespace.
- Many modern CSS features are unreliable or disabled.
- Table cell heights, image alignment, and font sizing all shift.

With `<!doctype html>` you get **standards mode**, where the browser follows current specifications. There is also a "limited-quirks" (almost-standards) mode triggered by some legacy doctypes, but you should never need it.

In short: one missing line at the top of the document silently changes how every CSS rule on the page is interpreted. It is the cheapest correctness fix on the web.

## How to implement

Put the doctype on line one of every HTML response. No XML prolog, no comment, no blank line:

```html
<!doctype html>
<html lang="en">
  <head>
    <title>Page title</title>
  </head>
  <body>
    ...
  </body>
</html>
```

The old HTML 4 and XHTML doctypes (`HTML 4.01 Transitional`, `XHTML 1.0 Strict`, etc.) are obsolete. Replace them with `<!doctype html>`. The short form is part of the HTML Living Standard and works in every browser back to IE6.

For XML-serialised HTML (XHTML served as `application/xhtml+xml`), the doctype is optional, but the document must still parse as XML. Almost no public sites need this; serve HTML.

## Common mistakes

- A blank line, comment, or BOM before `<!doctype html>`. Anything before the doctype can trigger quirks mode.
- Using a legacy HTML 4 or XHTML doctype copied from a 2005 tutorial.
- Sending the doctype only on some pages. Error pages, print views, and embedded iframes need it too.
- Letting a templating engine strip it during minification.

## Verification

- View source on the page. The very first bytes must be `<!doctype html>`.
- In DevTools, run `document.compatMode` in the console. It should return `"CSS1Compat"` (standards mode), not `"BackCompat"` (quirks).
- Check error pages (404, 500), redirect destinations, and any HTML fragments returned by APIs.

### Sources

- HTML Living Standard, The DOCTYPE (WHATWG): https://html.spec.whatwg.org/multipage/syntax.html#the-doctype
- MDN, Doctype: https://developer.mozilla.org/en-US/docs/Glossary/Doctype
- MDN, Quirks Mode and Standards Mode: https://developer.mozilla.org/en-US/docs/Web/HTML/Quirks_Mode_and_Standards_Mode

---

## The lang attribute on `<html>`

Status: required · Source page: https://specification.website/spec/foundations/html-lang/

Set a valid BCP 47 language tag on the `<html>` element so screen readers, translators, search engines, and browsers know what language the page is in.

## What it is

Every HTML document should declare its primary language on the root element using the `lang` attribute:

```html
<html lang="en">
```

The value is a **BCP 47 language tag**: a short, standardised string that identifies a language and, optionally, a region or script. Examples: `en`, `en-GB`, `nl`, `pt-BR`, `zh-Hant`, `de-AT`.
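As a rough way to check a `lang` value before it ships, the sketch below leans on the standard `Intl.getCanonicalLocales` function, which throws a `RangeError` for structurally malformed tags and otherwise returns the tag with conventional casing. The helper name `canonicalLangTag` is hypothetical and not part of this specification. Note that it only checks structure; it does not confirm that the subtags are actually registered.

```javascript
// Hypothetical helper (an illustration, not part of the spec):
// returns the canonical form of a lang value, or null when the
// value is not a structurally valid BCP 47 language tag.
function canonicalLangTag(value) {
  if (typeof value !== "string" || value.trim() === "") {
    return null;
  }
  try {
    // Intl.getCanonicalLocales throws a RangeError on malformed tags
    // and normalises casing (for example "en-gb" becomes "en-GB").
    const [canonical] = Intl.getCanonicalLocales(value);
    return canonical;
  } catch (err) {
    if (err instanceof RangeError) {
      return null;
    }
    throw err;
  }
}

console.log(canonicalLangTag("en-gb"));   // en-GB
console.log(canonicalLangTag("zh-hant")); // zh-Hant
console.log(canonicalLangTag("en_GB"));   // null (underscore is not a valid separator)
```

In a browser console, the same helper could be pointed at the live page with `canonicalLangTag(document.documentElement.lang)`; a `null` result would suggest the attribute is missing or malformed.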
## Why it matters

The language of the page is metadata that many systems rely on:

- **Screen readers** switch pronunciation engines based on `lang`. Without it, VoiceOver and NVDA read English text with a Dutch accent, or vice versa, making content unintelligible.
- **Browsers** offer translation prompts ("Translate this page from French?") only when they know the source language.
- **Search engines** use it as one signal for which audience to show the page to.
- **Spell checkers** in `