
llms.txt in 2026: A Practical Guide for AI Discovery

llms.txt helps AI systems understand your site quickly. This is a practical guide to writing an llms.txt file that’s useful, honest, and easy to maintain.

16 min · January 29, 2026 · Updated January 27, 2026

TL;DR

  • llms.txt is a machine-friendly “front door.”
  • Include the pages that define your product and best content.
  • Keep it short, structured, and updated.

What Is llms.txt (And What It’s Not)

llms.txt is an open, human-readable (and machine-friendly) file that gives AI assistants a curated map of your site in a single place.

It exists because AI systems often struggle with:

  • noisy HTML
  • navigation wrappers
  • JavaScript-heavy pages
  • huge sites that exceed context windows

Instead of “crawl everything,” llms.txt provides:

  • quick context (what the site is)
  • the best canonical links
  • an optional section for low-priority items

What it is not

  • It’s not a replacement for robots.txt
  • It’s not a crawler directive (it’s a content map)
  • It’s not a place to dump hundreds of links
  • It’s not a secret storage mechanism (treat it as public)

llms.txt vs robots.txt vs sitemap.xml

These files solve different problems:

| File | Audience | Purpose |
| --- | --- | --- |
| robots.txt | web crawlers | allow/deny crawling paths |
| sitemap.xml | search engines | enumerate URLs for crawling/indexing |
| llms.txt | AI assistants | provide curated, LLM-friendly entry points |

If you do SEO/AEO seriously in 2026, you’ll usually want all three.
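As a quick sanity check, you can verify which of the three files a site actually serves. A minimal sketch in Python (the URL paths are the conventional locations; the `check_discovery_files` helper is illustrative, not part of any standard tooling):

```python
import urllib.request

DISCOVERY_PATHS = ("/robots.txt", "/sitemap.xml", "/llms.txt")

def discovery_urls(base_url: str) -> list[str]:
    """Build the three conventional discovery-file URLs for a site."""
    return [base_url.rstrip("/") + p for p in DISCOVERY_PATHS]

def check_discovery_files(base_url: str, timeout: float = 10.0) -> dict[str, bool]:
    """HEAD-request each file and report which ones respond with HTTP 200."""
    results = {}
    for url in discovery_urls(base_url):
        try:
            req = urllib.request.Request(url, method="HEAD")
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                results[url] = resp.status == 200
        except OSError:
            results[url] = False
    return results
```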


The llms.txt Format

The common pattern is Markdown:

# Site Name
> One-sentence description of what this site is.

## Quick context
- 2–5 bullets that explain what matters

## Key links
- [Homepage](https://example.com/)
- [Docs](https://example.com/docs)

## Featured resources
- [Best guide](https://example.com/blog/best-guide): why it matters

## Optional
- [Low-priority section](https://example.com/more)

Why this structure works

  • AI systems can quickly parse headings and lists
  • The “Featured resources” section gives assistants a small, high-signal set of pages to answer from
  • The “Optional” section lets systems with smaller context windows skip less important items
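Because the format is just headings and lists, the file is trivial to split into sections. A minimal parsing sketch (the `parse_llms_txt` function and its output shape are my own illustration, not a standard API):

```python
def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt file into its H1 title, blockquote summary, and H2 sections."""
    title, summary, sections = None, None, {}
    current = None
    for line in text.splitlines():
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("> ") and summary is None:
            summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return {"title": title, "summary": summary, "sections": sections}
```

A consumer with a tight context budget can simply drop `sections["Optional"]` before feeding the rest to a model.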

What to Include (The High-Signal Set)

Think of llms.txt as a “front door.” Include the pages that define your product and your best, most evergreen content.

1) Quick context (what the site is)

Add a few bullets that remove ambiguity:

  • who it’s for
  • what it does
  • how it’s organized
  • what content is the most authoritative

2) Key links (canonical pages)

Include canonical pages:

  • homepage
  • product/service pages
  • docs or help center
  • blog/journal
  • case studies/projects
  • contact

3) Featured resources

Pick 5–15 high-quality pieces:

  • “core guide” posts
  • category overviews
  • foundational pages you want cited

If you include too many, you recreate the same context-window problem you’re trying to solve.
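One way to keep yourself honest is a small check that counts links against a curation budget. A sketch (the 15-link default mirrors the 5–15 range above; the function name and return shape are illustrative):

```python
import re

# Matches Markdown links of the form [label](https://…)
MD_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")

def featured_links(text: str, max_links: int = 15) -> tuple[list[str], list[str]]:
    """Extract Markdown link URLs and flag overflow beyond the curated budget."""
    urls = [m.group(2) for m in MD_LINK.finditer(text)]
    warnings = []
    if len(urls) > max_links:
        warnings.append(f"{len(urls)} links found; consider trimming to {max_links} or fewer")
    return urls, warnings
```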

4) Policies and trust pages (optional but useful)

Depending on your site:

  • privacy policy
  • terms
  • security page
  • data retention summary

These matter for enterprise evaluation and for safe citations.


What Not to Include (Avoid These Mistakes)

Don’t dump a full sitemap

If the file becomes a 300-link directory, you lose the benefit of curation.

Don’t include secrets or private endpoints

llms.txt is public. Never include:

  • internal admin links
  • private dashboards
  • secrets or tokens
  • customer-specific URLs

Don’t write marketing fluff

AI assistants will (sometimes) cite your llms.txt. If it’s vague, the citations will be vague.

Prefer specific, plain language.


Maintenance: Keep It Honest and Fresh

llms.txt becomes harmful if it points to outdated pages.

Practical maintenance rules:

  • update when navigation changes
  • update when “featured posts” change
  • review monthly if you ship content frequently
  • keep links canonical (avoid duplicate hostnames)

If you have both www and non-www, pick one canonical host and stick to it.
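Mixed hosts are easy to detect mechanically. A sketch that flags www/non-www pairs in a link list (the helper name is my own):

```python
from urllib.parse import urlparse

def host_inconsistencies(urls: list[str]) -> set[str]:
    """Return www hostnames whose bare (non-www) counterpart also appears."""
    hosts = {urlparse(u).netloc.lower() for u in urls}
    return {h for h in hosts if h.startswith("www.") and h[4:] in hosts}
```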


Validation (Don’t Ship Blind)

Use a validator to ensure formatting is easy to parse.

If you change llms.txt, run a quick check:

  • does the file still render as simple Markdown?
  • do the links resolve?
  • is the “Optional” section correctly labeled?
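The checks above can be automated. A minimal validator sketch covering the structural checks (link resolution would need HTTP requests on top of this; the `validate_llms_txt` function is illustrative, not an existing tool):

```python
import re

def validate_llms_txt(text: str) -> list[str]:
    """Run basic pre-ship checks on an llms.txt file; returns a list of problems."""
    problems = []
    lines = text.splitlines()
    if not any(l.startswith("# ") for l in lines):
        problems.append("missing H1 title")
    if not any(l.startswith("> ") for l in lines):
        problems.append("missing one-sentence summary blockquote")
    if not any(l.strip() == "## Optional" for l in lines):
        problems.append("no '## Optional' section for low-priority links")
    for l in lines:
        # Bullet lines that look like links must parse as [label](https://…)
        if l.lstrip().startswith("- [") and not re.search(r"\[[^\]]+\]\(https?://[^)\s]+\)", l):
            problems.append(f"malformed link line: {l.strip()}")
    return problems
```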

Implementation on Static Sites (Astro Example)

On static sites (like Astro), implementation is simple:

  • put the file in public/llms.txt
  • it becomes available at /llms.txt
  • keep it small and curated

If you want to go further, you can also publish Markdown-friendly versions of key pages (so AI systems can fetch clean text without HTML noise). This is optional but increasingly common for AI discovery flows.
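If you prefer generating the file from a single source of truth rather than editing it by hand, the rendering step is a few lines. A sketch in Python for illustration (an Astro site would more likely do this in a JS build step; the `PAGES` list and function names are hypothetical):

```python
from pathlib import Path

# Hypothetical source of truth; swap in your own content index.
PAGES = [
    ("Homepage", "https://example.com/"),
    ("Docs", "https://example.com/docs"),
]

def render_llms_txt(site_name: str, summary: str, pages) -> str:
    """Render the Markdown llms.txt structure from structured data."""
    lines = [f"# {site_name}", f"> {summary}", "", "## Key links"]
    lines += [f"- [{title}]({url})" for title, url in pages]
    return "\n".join(lines) + "\n"

def write_llms_txt(out_dir: str = "public") -> Path:
    """Write the rendered file where a static site will serve it at /llms.txt."""
    path = Path(out_dir) / "llms.txt"
    path.write_text(render_llms_txt("Example Site", "What this site is.", PAGES))
    return path
```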


Implementation Checklist

  • llms.txt is served at https://<your-domain>/llms.txt
  • H1 + one-sentence summary present
  • “Quick context” explains what the site is and who it’s for
  • Key links cover the canonical top pages
  • Featured resources are curated (5–15 high-signal links)
  • Optional section exists for low-priority links
  • No secrets, private URLs, or customer-specific links
  • File is reviewed and updated on a schedule

FAQ

Does llms.txt affect Google SEO?

It’s aimed primarily at AI discovery systems, but the structure and clarity it encourages often improve overall discoverability and internal-linking discipline.

Should I block AI bots in robots.txt and still publish llms.txt?

That depends on your strategy. robots.txt is crawler policy; llms.txt is content mapping. Some teams publish llms.txt for assistants while still restricting certain bots. If you do this, be explicit and consistent.

How long should llms.txt be?

Short enough to be read quickly. If it doesn’t fit comfortably in a single context window, you probably included too much.
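If you want a rough number rather than a gut feeling, a common heuristic is ~4 characters per token for English text. A sketch (the 2,000-token budget is an arbitrary illustrative threshold, not a standard):

```python
def rough_token_count(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_comfortably(text: str, budget_tokens: int = 2000) -> bool:
    """True if the file stays within a conservative context budget."""
    return rough_token_count(text) <= budget_tokens
```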


