What Is llms.txt? The Complete Guide (with Examples)

llms.txt is a simple Markdown file you place at the root of your website (at /llms.txt) that gives AI crawlers and large language models a clean, curated map of your most important content. Proposed by Jeremy Howard in 2024, the llms.txt file acts like a guidebook for LLMs such as ChatGPT, Claude, and Perplexity: instead of forcing them to wade through bloated HTML, navigation menus, ads, and JavaScript, it hands them an organized list of links to the pages that actually matter. If you have ever asked "what is llms.txt and do I need one," this complete guide explains the format, shows a full llms.txt example, and walks you through how to create one step by step.

What is llms.txt, exactly?

At its core, llms.txt answers a single question: when an AI model wants to understand your website, which pages should it read first? Traditional web pages are built for humans and browsers. They are full of headers, sidebars, cookie banners, and scripts that add noise for a language model trying to extract meaning. The llms.txt file strips that away. It is a plain-text, Markdown-formatted document that lists your key URLs with short descriptions, written specifically so machines can parse it quickly and reliably.

The name is intentionally modeled on robots.txt. Both live at the root of your domain and both speak to automated agents. But they do very different jobs, which is the most common point of confusion, so let's clear it up first.

llms.txt vs robots.txt vs sitemap.xml

People often assume llms.txt vs robots.txt is a rivalry, but they are complementary. Each file talks to bots, yet each has a distinct purpose. Here is the clearest way to think about the three root-level files that matter for AI crawlers and search engines.

How llms.txt, robots.txt, and sitemap.xml differ

| File | Format | Primary audience | What it does |
| --- | --- | --- | --- |
| llms.txt | Markdown | LLMs / AI assistants | Curates and explains your most important content so models can understand and cite it |
| robots.txt | Plain text rules | All crawlers / bots | Tells crawlers which paths they may or may not access |
| sitemap.xml | XML | Search engine crawlers | Lists every indexable URL so search engines can discover pages |

In short: robots.txt controls access (what bots are allowed to crawl), sitemap.xml aids discovery (a complete inventory of pages for indexing), and llms.txt aids comprehension (a hand-picked, explained subset that helps an LLM grasp what your site is about). A sitemap might list 5,000 URLs; a good llms.txt might highlight the 20 that define your business. They solve different problems and can happily coexist.

Key distinction: robots.txt and sitemap.xml are long-established web conventions that major search engines such as Google honor. llms.txt is a newer, community-driven proposal aimed specifically at AI models, not traditional search indexing.
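
To make the "robots.txt controls access" point concrete, here is a minimal Python sketch using the standard library's urllib.robotparser. The rules, the bot name ExampleBot, and the example.com URLs are invented for illustration; the point is only that access decisions come from robots.txt rules, and llms.txt plays no part in them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, inlined so the example needs no network access.
ROBOTS_RULES = [
    "User-agent: *",
    "Disallow: /admin/",
]


def build_parser(lines):
    """Return a RobotFileParser loaded with the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(ROBOTS_RULES)

# robots.txt answers one question: may this bot fetch this URL?
docs_allowed = parser.can_fetch("ExampleBot", "https://example.com/docs/quickstart")
admin_allowed = parser.can_fetch("ExampleBot", "https://example.com/admin/users")

# llms.txt is just another URL here; it carries no allow/deny rules of its own.
llms_allowed = parser.can_fetch("ExampleBot", "https://example.com/llms.txt")

print(docs_allowed, admin_allowed, llms_allowed)
```

Under these assumed rules, everything outside /admin/ is fetchable, including /llms.txt itself.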

Why llms.txt matters for GEO

GEO (Generative Engine Optimization) is the practice of optimizing your content so that AI engines understand, trust, and cite it in their answers. As more people get information from ChatGPT, Claude, Perplexity, and Google's AI Overviews instead of clicking ten blue links, being citable by these systems becomes a real growth channel. This is where the llms.txt file earns its keep.

LLMs operate under tight context limits. When a model needs to reason about your site, it cannot ingest every page; it has to choose. A machine-readable map does three useful things:

  • Reduces noise. By pointing models at clean, relevant content, you lower the chance they misread your site because of cluttered markup.
  • Signals priority. You explicitly tell the model which pages represent your core offering, documentation, or expertise, rather than leaving it to guess.
  • Improves citability. When a model can quickly locate accurate, well-structured information, it is more likely to summarize and cite you correctly.

Think of llms.txt as the GEO equivalent of a well-organized table of contents. It will not single-handedly get you cited, but it removes friction for the AI systems that increasingly mediate how people find you.

The llms.txt format

One reason adoption is easy is that the format is deliberately minimal and human-readable. It is just Markdown, following a loose but consistent structure. A valid llms.txt file generally contains these elements, in order:

  1. An H1 with the name of your site or project. This is the only required line.
  2. A blockquote (>) with a short summary of what the site or company does.
  3. Optional paragraphs of additional context or important notes.
  4. One or more H2 sections (for example ## Docs, ## Products, ## About) containing Markdown link lists.
  5. Each link uses the format [Title](URL): optional description so the model knows what each page covers.
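
To show how little machinery the structure above needs, here is a rough Python sketch of a parser. It is an illustration under simplifying assumptions, not an official grammar: the function name parse_llms_txt, the LINK_RE pattern, and the sample text are all made up for this example, and real consumers may parse the file differently.

```python
import re

# Assumed link pattern: "- [Title](URL)" with an optional ": description".
LINK_RE = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text):
    """Parse llms.txt Markdown into a dict with title, summary, and sections."""
    result = {"title": None, "summary": None, "sections": {}}
    current = None  # name of the H2 section currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif line.startswith("> ") and result["summary"] is None:
            result["summary"] = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                result["sections"][current].append({
                    "title": match.group("title"),
                    "url": match.group("url"),
                    "description": (match.group("desc") or "").strip(),
                })
    return result


SAMPLE = """# Acme Analytics

> Privacy-first product analytics for SaaS teams.

## Documentation

- [Quickstart Guide](https://acme.com/docs/quickstart): Send your first event.
- [API Reference](https://acme.com/docs/api)

## Optional

- [Blog](https://acme.com/blog): Articles on analytics.
"""

parsed = parse_llms_txt(SAMPLE)
print(parsed["title"], list(parsed["sections"]))
```

A link without a description still parses; its description is simply an empty string.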

There is also an optional companion file, llms-full.txt. While /llms.txt is the concise index of links, /llms-full.txt contains the actual full text of your key documentation inlined into one large Markdown file. The idea is that a model can pull llms-full.txt and have your entire knowledge base in a single, clean document without crawling page by page. Use /llms.txt for navigation and /llms-full.txt when you want to expose complete content.
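
As a rough illustration of the companion file, the Python sketch below joins a few pages into one Markdown document. The layout (one H1 for the site, one H2 per page) is an assumption made for this example, since the text above only says the content is inlined; build_llms_full and the sample pages are hypothetical.

```python
def build_llms_full(site_name, pages):
    """Join (title, markdown_body) pairs into one Markdown document.

    The H1-plus-H2 layout is an illustrative choice, not a required format.
    """
    parts = [f"# {site_name}"]
    for title, body in pages:
        parts.append(f"## {title}\n\n{body.strip()}")
    return "\n\n".join(parts) + "\n"


# Hypothetical page content, already converted to clean Markdown.
PAGES = [
    ("Quickstart Guide", "Install the SDK, then send your first event.\n"),
    ("Data Privacy", "We store events without third-party cookies.\n"),
]

full_text = build_llms_full("Acme Analytics", PAGES)
print(full_text)
```

In practice the page bodies would come from your own Markdown sources or an HTML-to-Markdown step, which is outside the scope of this sketch.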

The structure at a glance

A useful mental model: the H1 says who you are, the blockquote says what you do, and each H2 section groups related links the way you would organize a clean documentation sidebar. Sections labeled ## Optional are conventionally understood as lower priority, so a model can skip them when context is tight.
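
One way a consumer might act on the ## Optional convention is sketched below in Python. This is a guess at reasonable behavior, not a description of how any particular model or tool works; select_links, the max_links budget, and the sample URLs are assumptions for illustration.

```python
def select_links(sections, max_links):
    """Pick up to max_links URLs, taking the 'Optional' section last.

    sections maps an H2 heading to a list of URLs, in file order.
    """
    core = [
        url
        for name, urls in sections.items()
        if name != "Optional"
        for url in urls
    ]
    optional = sections.get("Optional", [])
    return (core + optional)[:max_links]


SECTIONS = {
    "Documentation": [
        "https://acme.com/docs/quickstart",
        "https://acme.com/docs/api",
    ],
    "Optional": ["https://acme.com/blog"],
    "Product": ["https://acme.com/pricing"],
}

tight = select_links(SECTIONS, 3)   # small budget: Optional links are dropped
roomy = select_links(SECTIONS, 10)  # large budget: Optional links come last
print(tight)
print(roomy)
```

With a budget of three, the blog link is skipped even though it appears before the pricing page in the file.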

A complete llms.txt example

Talk is cheap, so here is a realistic, complete llms.txt example for a fictional SaaS company. Notice the H1, the blockquote summary, the grouped H2 sections, and the descriptive link format. You can adapt this template directly for your own site.

# Acme Analytics

> Acme Analytics is a privacy-first product analytics platform that helps SaaS teams understand user behavior without third-party cookies.

This file helps AI assistants find our most important pages. For full documentation content, see /llms-full.txt.

## Documentation

- [Quickstart Guide](https://acme.com/docs/quickstart): Install the SDK and send your first event in five minutes.
- [API Reference](https://acme.com/docs/api): Complete REST and JavaScript API documentation.
- [Data Privacy](https://acme.com/docs/privacy): How we handle and store user data without cookies.

## Product

- [Features Overview](https://acme.com/features): Funnels, retention, and cohort analysis explained.
- [Pricing](https://acme.com/pricing): Plans, limits, and what is included in each tier.
- [Integrations](https://acme.com/integrations): Connect Acme with Segment, Stripe, and webhooks.

## Company

- [About Us](https://acme.com/about): Our mission and the team behind Acme Analytics.
- [Security](https://acme.com/security): SOC 2 compliance and infrastructure details.

## Optional

- [Blog](https://acme.com/blog): Articles on analytics, growth, and privacy.
- [Changelog](https://acme.com/changelog): Recent product updates and releases.

That is the whole file. No XML schema, no special tooling, no validation server required. If you can write Markdown, you can write llms.txt.

How to create your llms.txt step by step

Creating your first llms.txt file takes about twenty minutes. Here is the process from start to finish.

  1. Inventory your most important pages. List the URLs that best explain what your site does: documentation, key product pages, pricing, an about page, and core guides. Prioritize quality over completeness.
  2. Write the H1 and summary. Open a plain-text editor and add # Your Site Name on the first line, followed by a one- or two-sentence blockquote (>) describing what you do.
  3. Group your links into H2 sections. Create logical sections such as ## Docs, ## Product, and ## Company. Put lower-priority links under ## Optional.
  4. Add descriptions to each link. Use [Page Title](URL): short description. Keep descriptions factual and specific so a model knows exactly what it will find.
  5. Save the file as `llms.txt`. Use plain UTF-8 text. Do not save it as llms.txt.md, and do not include a byte order mark (BOM).
  6. Upload it to your domain root. Deploy it so it is reachable at https://yourdomain.com/llms.txt.
  7. (Optional) Generate llms-full.txt. If you want to expose full content, concatenate your key pages as clean Markdown and publish it at /llms-full.txt.
  8. Test that it loads. Visit the URL in a browser and confirm it returns plain text, not a 404 or an HTML page.

Tip: Keep your llms.txt in version control alongside your site. When you ship a major new doc or product page, update the file in the same pull request so it never goes stale.
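
If you would rather generate the file than hand-write it, steps 2 through 5 can be sketched in Python roughly as follows. The function names render_llms_txt and write_llms_txt, and the sample data, are hypothetical; treat this as a starting point under those assumptions rather than a reference tool.

```python
from pathlib import Path


def render_llms_txt(name, summary, sections):
    """Render llms.txt text from a site name, summary, and sections.

    sections maps an H2 heading to a list of (title, url, description) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in links:
            entry = f"- [{title}]({url})"
            if description:
                entry += f": {description}"
            lines.append(entry)
        lines.append("")
    return "\n".join(lines).rstrip("\n") + "\n"


def write_llms_txt(path, content):
    """Write the file as UTF-8; the plain 'utf-8' codec does not add a BOM."""
    Path(path).write_text(content, encoding="utf-8")


content = render_llms_txt(
    "Acme Analytics",
    "Privacy-first analytics.",
    {
        "Docs": [
            ("Quickstart", "https://acme.com/docs/quickstart", "Install the SDK."),
        ],
    },
)
print(content)
```

Calling write_llms_txt("llms.txt", content) would save the result next to the script; uploading it to the domain root is still a separate deployment step.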

Where to put it and how to test it

Placement is not optional or flexible: the file must live at the root of your domain and be served at exactly /llms.txt. Just as crawlers look for robots.txt at https://yourdomain.com/robots.txt, tools and models expect llms.txt at https://yourdomain.com/llms.txt. A file buried at /docs/llms.txt or /files/llms.txt will not be discovered.

To verify it is live and correct, run through this quick checklist:

  • Open https://yourdomain.com/llms.txt directly in a browser and confirm the raw Markdown renders as plain text.
  • Check the HTTP response is 200 OK, not a redirect or 404.
  • Confirm the Content-Type is text/plain or text/markdown, not text/html.
  • Click each link in the file to make sure none are broken or pointing to staging URLs.
  • Run your site through a GEO checker like Check GEO Score to confirm the file is detected.
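
The first three checks in this list can also be scripted. The Python sketch below uses only the standard library; check_response and fetch_and_check are hypothetical helper names, and the H1 check is an extra sanity test added for this example. It does not verify the individual links in the file.

```python
import urllib.error
import urllib.request

ACCEPTED_TYPES = ("text/plain", "text/markdown")


def check_response(status, content_type, body):
    """Return a list of problems found in an llms.txt HTTP response."""
    problems = []
    if status != 200:
        problems.append(f"expected status 200, got {status}")
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ACCEPTED_TYPES:
        problems.append(f"unexpected Content-Type: {media_type or 'missing'}")
    if not body.lstrip().startswith("# "):
        problems.append("body does not start with a Markdown H1")
    return problems


def fetch_and_check(url):
    """Fetch url and check the response (requires network access)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            body = response.read().decode("utf-8", errors="replace")
            content_type = response.headers.get("Content-Type", "")
            problems = check_response(response.status, content_type, body)
            # urlopen follows redirects silently, so compare the final URL.
            if response.geturl() != url:
                problems.append(f"redirected to {response.geturl()}")
            return problems
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses instead of returning them.
        return [f"expected status 200, got {err.code}"]


good = check_response(200, "text/plain; charset=utf-8", "# Acme Analytics\n")
bad = check_response(200, "text/html", "<!doctype html><html></html>")
print(good)
print(bad)
```

To run it against a live site, call fetch_and_check("https://yourdomain.com/llms.txt"); an empty list means none of these particular checks failed.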

Do you actually need llms.txt? (an honest take)

Here is the candid answer: llms.txt is a promising, emerging standard, not a guaranteed ranking lever. As of 2026, it is a community-driven proposal with growing but incomplete adoption. Major AI providers have not all publicly committed to reading it the way Google formally honors robots.txt and sitemaps. So you should approach it with realistic expectations.

That said, the cost-benefit math is favorable. Creating an llms.txt file is cheap, low-risk, and quick. If AI engines adopt it broadly, you are already prepared. If they do not, you have lost twenty minutes and gained a tidy, human-readable index of your best content as a side effect. For documentation-heavy sites, SaaS products, and content businesses that want to be cited by AI assistants, it is a sensible, forward-looking addition to your GEO toolkit.

Just do not treat it as a magic bullet. llms.txt complements strong fundamentals; it does not replace them. Clear writing, semantic HTML, structured data, fast pages, and genuine expertise still do the heavy lifting for both SEO and GEO.

Common llms.txt mistakes

Most problems with llms.txt come down to a handful of avoidable errors. Watch out for these:

  • Placing it in the wrong location. It must be at /llms.txt, not in a subfolder. Anywhere else and it will not be found.
  • Serving it as HTML. If your server returns the file wrapped in HTML or with a text/html content type, parsers may choke on it. Serve it as plain text.
  • Listing every page. llms.txt is a curated map, not a sitemap. Dumping thousands of links defeats the purpose and buries your important pages.
  • Skipping descriptions. A bare list of URLs gives the model far less to work with. Always add a short, factual description per link.
  • Letting it go stale. Outdated links and removed pages erode trust. Update the file when your site changes.
  • Linking to staging or broken URLs. Double-check every link resolves to a live, public page.
  • Confusing it with robots.txt. llms.txt does not block or allow crawling; do not use it to try to control access.
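
Several of these mistakes are easy to catch automatically before you deploy. The Python sketch below is one possible lint pass; the lint_llms_txt name, the STAGING_HINTS markers, and the MAX_LINKS threshold are arbitrary choices for illustration, not rules from the llms.txt proposal.

```python
import re

# Assumed link pattern: "[Title](URL)" with an optional ": description".
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(\S.*))?")
STAGING_HINTS = ("staging.", "localhost", "127.0.0.1")  # illustrative markers
MAX_LINKS = 100  # arbitrary threshold for "this looks like a sitemap"


def lint_llms_txt(text):
    """Return a list of warnings for common llms.txt mistakes."""
    warnings = []
    links = []
    for line in text.splitlines():
        match = LINK_RE.search(line)
        if match and line.lstrip().startswith("-"):
            links.append(match.groups())
    if not text.lstrip().startswith("# "):
        warnings.append("missing H1 title on the first line")
    if len(links) > MAX_LINKS:
        warnings.append(f"{len(links)} links: curate instead of listing every page")
    for title, url, description in links:
        if not description:
            warnings.append(f"no description: {title}")
        if any(hint in url for hint in STAGING_HINTS):
            warnings.append(f"staging or local URL: {url}")
    return warnings


SAMPLE = (
    "# Acme\n\n## Docs\n\n"
    "- [Quickstart](https://acme.com/docs/quickstart): Install the SDK.\n"
    "- [Blog](https://staging.acme.com/blog)\n"
)

found = lint_llms_txt(SAMPLE)
print(found)
```

Broken links and a stale file still need a live check or a review step; this pass only looks at the text of the file.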

Frequently asked questions about llms.txt

Is llms.txt the same as robots.txt?

No. They share a naming convention and both live at your domain root, but they do very different jobs. robots.txt is a long-standing standard that tells crawlers which paths they may or may not access. llms.txt is a newer Markdown file that curates and explains your best content so large language models can understand and cite it. One controls access; the other aids comprehension.

Do AI engines actually read llms.txt?

Some tools and AI developer workflows already consume llms.txt, and several documentation platforms publish one. However, as of 2026 the major AI providers have not all formally confirmed they read it for live answers the way search engines honor sitemaps. Adoption is growing but still partial, so treat it as a low-cost, forward-looking measure rather than a guaranteed input.

Where do I put the llms.txt file?

At the root of your domain, served at exactly https://yourdomain.com/llms.txt. It must not be in a subfolder like /docs/ or /files/. Tools expect it in the same root location where robots.txt lives.

What is the difference between llms.txt and llms-full.txt?

llms.txt is a concise index: an H1, a summary, and curated link lists pointing to your key pages. llms-full.txt is an optional, larger file that inlines the actual full text of those pages into one clean Markdown document, so a model can read your entire knowledge base without crawling page by page. Use llms.txt for navigation and llms-full.txt for complete content.

Does llms.txt help SEO?

Not directly. In traditional Google SEO, robots.txt and sitemaps govern crawling and discovery, while ranking depends on signals such as content quality, links, and Core Web Vitals; llms.txt plays no part in either. Its value is on the GEO side: helping AI assistants understand and cite your content. Think of it as complementary to SEO, not a replacement for it.

What format should llms.txt use?

Plain Markdown saved as UTF-8 text. Start with an H1 site name, add a blockquote summary, then group links under H2 headings using the [Title](URL): description format. No XML, JSON, or special schema is required.

How do I create an llms.txt file?

List your most important pages, write an H1 with your site name and a blockquote summary, group the links into H2 sections with short descriptions, save it as llms.txt in UTF-8, and upload it to your domain root so it loads at /llms.txt. The whole process takes about twenty minutes.

How big should llms.txt be?

Keep it curated. There is no hard size limit, but the entire point is to highlight your most important content, so a focused file with dozens of well-chosen links is far more useful than one with thousands. If you need to expose full content, use llms-full.txt instead.

Will llms.txt block AI bots from training on my site?

No. llms.txt is not an access-control mechanism and does not block crawling or training. To restrict bot access you would use robots.txt directives or server-level rules. llms.txt only invites models to read curated content; it cannot stop them from reading anything else.

How do I test that my llms.txt works?

Open https://yourdomain.com/llms.txt in a browser and confirm it returns plain-text Markdown with a 200 status, not a 404 or an HTML page. Verify every link resolves to a live page, and run your site through a GEO checker such as Check GEO Score to confirm the file is detected.