SEO
Search visibility — robots.txt, sitemaps, canonicals, structured data.
13 topics in this category.
-
robots.txt
Recommended: A plain-text file at the site root that tells crawlers which paths they may or may not fetch. Standardised in RFC 9309 and supported by every major search engine.
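As a rough illustration, the sketch below checks URLs against a robots.txt body using Python's standard-library parser. The rules, paths, and the example.com host are placeholders, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; paths and sitemap URL are placeholders.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(url: str, user_agent: str = "*") -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)
```

Python's parser applies rules in file order, while RFC 9309 specifies longest-match precedence. The rule order above was chosen so that both approaches give the same answer.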
-
XML sitemaps
Recommended: An XML file listing the canonical URLs of a site, with optional metadata about when each was last changed. The fastest way to tell a search engine what exists.
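A minimal sketch of generating such a file with the standard library follows. The function name `build_sitemap` and the input shape (pairs of URL and last-modified date) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries: list[tuple[str, str]]) -> str:
    """Build a sitemap from (url, lastmod) pairs; lastmod is a W3C date string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

ElementTree escapes characters such as `&` in URLs, which is the main reason to prefer it over string concatenation here.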
-
Sitemap index files
Recommended: A sitemap of sitemaps. Needed when a site exceeds the per-file limit of 50,000 URLs (or 50 MB uncompressed), and useful for splitting sitemaps by content type for cleaner reporting.
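The splitting step can be sketched as below, assuming the URL list is already in memory. The helper names `chunk_urls` and `build_sitemap_index` are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit in the sitemaps.org protocol


def chunk_urls(urls: list[str], size: int = MAX_URLS_PER_SITEMAP) -> list[list[str]]:
    """Split a URL list into groups small enough for one sitemap file each."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls: list[str]) -> str:
    """Build an index document that points at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Each chunk would be written out as its own sitemap file, and the index would list the public URLs of those files.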
-
Image and video sitemap extensions
Optional: XML extensions that add image and video metadata to sitemap entries. Useful when media is loaded by JavaScript or hosted on a CDN that crawlers cannot reach by following links.
-
URL structure
Recommended: URLs are the most stable identifier on the web. Keep them lowercase, hyphenated, descriptive, and shallow. Treat them as a public API for your content.
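One way to enforce the lowercase, hyphenated convention is a small slug helper like the sketch below. The name `slugify` and the choice to drop non-ASCII accents are assumptions; a site with non-Latin content would need a different policy.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated URL segment."""
    # Decompose accented characters, then drop anything outside ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
```

For example, `slugify("Hello, World!")` returns `"hello-world"`.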
-
Redirects (301/302/308)
Required: HTTP redirects send a client from one URL to another. Use 301 or 308 for permanent moves, 302 or 307 for temporary ones, and keep redirect chains as short as possible, ideally a single hop.
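Chains tend to build up as redirect rules accumulate over time. The sketch below flattens a redirect map so every source points straight at its final destination; the dictionary format and the name `collapse_redirects` are assumptions for illustration.

```python
def collapse_redirects(redirects: dict[str, str]) -> dict[str, str]:
    """Point every source URL directly at its final destination."""
    collapsed = {}
    for source in redirects:
        seen = {source}
        target = redirects[source]
        # Follow the chain until the target is no longer itself redirected.
        while target in redirects:
            if target in seen:
                raise ValueError(f"redirect loop involving {target}")
            seen.add(target)
            target = redirects[target]
        collapsed[source] = target
    return collapsed
```

Given `{"/a": "/b", "/b": "/c"}`, the result is `{"/a": "/c", "/b": "/c"}`, so no request needs more than one hop.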
-
Soft 404s
Avoid: A page that shows users a 'not found' message but returns 200 OK instead of 404. Search engines treat soft 404s as a quality problem and often refuse to index them.
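A crude audit for this can be sketched as a phrase check on pages that returned 200. The phrase list and the function name are hypothetical, and real search engines use far more sophisticated detection than this.

```python
# Hypothetical phrases that suggest a "not found" page; tune for your own templates.
NOT_FOUND_PHRASES = ("page not found", "no longer exists", "nothing was found")


def looks_like_soft_404(status_code: int, body_text: str) -> bool:
    """Flag responses that say 'not found' in the body but report success."""
    if status_code != 200:
        return False  # a real 404 or 410 is the correct behaviour, not a soft 404
    text = body_text.lower()
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)
```

Pages flagged this way would normally be fixed by returning a real 404 or 410 status from the server.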
-
Meta robots and X-Robots-Tag
Required: Every page must have a deliberate, correct indexing policy: either the implicit default (index, follow) on public pages, or an explicit noindex meta tag or X-Robots-Tag header on staging, admin, thin, or private content. Get this wrong and you either disappear from search or expose what you didn't mean to.
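Centralising that decision in one place makes it harder to get wrong. The sketch below picks an X-Robots-Tag value from an environment name and a page category; both labels and the function name are assumptions, not part of any framework.

```python
from typing import Optional

# Hypothetical page categories that should never be indexed.
NOINDEX_KINDS = {"admin", "thin", "private"}


def x_robots_tag(environment: str, page_kind: str) -> Optional[str]:
    """Return the X-Robots-Tag header value, or None to send no header."""
    if environment != "production" or page_kind in NOINDEX_KINDS:
        return "noindex, nofollow"
    # No header: crawlers fall back to the default of index, follow.
    return None
```

A response hook would call this once per request and set the header only when the result is not None.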
-
Heading hierarchy
Required: Headings describe the sections of a page. They must form a nested outline, never be used for visual styling alone, and never skip levels.
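The no-skipped-levels rule is easy to check mechanically. The sketch below uses the standard-library HTML parser to list each jump of more than one level; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect the level of every h1 to h6 element in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_levels(html: str) -> list[tuple[int, int]]:
    """Return (previous, current) pairs where a heading jumps down too far."""
    parser = HeadingOutline()
    parser.feed(html)
    levels = parser.levels
    return [(prev, cur) for prev, cur in zip(levels, levels[1:]) if cur > prev + 1]
```

Moving back up the outline, such as an h2 following an h3, is valid and is not reported.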
-
Internal linking
Recommended: Links from one page on a site to another. The strongest signal you control for telling crawlers and AI agents what a page is about and how important it is.
-
Structured data (JSON-LD)
Recommended: Machine-readable annotations that describe the content of a page using the schema.org vocabulary. JSON-LD is the format search engines and AI agents expect.
-
Breadcrumbs
Recommended: A short trail showing the page's position in the site hierarchy. Visible in the UI for users, marked up as BreadcrumbList JSON-LD for search engines.
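As an illustration, the sketch below turns a breadcrumb trail into BreadcrumbList JSON-LD. The function name and the (name, url) input shape are assumptions.

```python
import json


def breadcrumb_jsonld(trail: list[tuple[str, str]]) -> str:
    """Serialise (name, url) pairs, ordered from site root to current page."""
    data = {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }
    return json.dumps(data, indent=2)
```

The resulting string would be embedded in the page inside a script element with the type application/ld+json.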
-
IndexNow
Optional: An open protocol for telling participating search engines that a URL has changed. One HTTP request notifies Bing, Yandex, Naver, and Seznam that the URL is ready to be recrawled. Google does not participate.
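A submission can be sketched as a single JSON POST. The code below only builds the request and does not send it; the host, key value, and the assumption that the key file is served from the site root as `<key>.txt` are placeholders to adapt.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host: str, key: str, urls: list[str]) -> urllib.request.Request:
    """Prepare a bulk IndexNow submission for URLs that all belong to host."""
    payload = {
        "host": host,
        "key": key,
        # Assumes the key file is hosted at the site root.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": urls,
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect first.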