llms.txt has been hyped as the GEO equivalent of robots.txt: a curated content map that LLMs will preferentially crawl. The data on actual citation lift is brutal: fewer than 0.001% of AI citations trace back to a llms.txt file as the source. I ship llms.txt on 500k.io. I also wrote a 4,000-word version called llms-full.txt. Neither of them is what’s earning my citations. Pretending otherwise sells courses, not citations.
If you’ve already read GEO 2026: how to get cited by ChatGPT and Perplexity, this article is the counter-position to one of the items I half-recommended in there. Same author, less polite this time.
What is llms.txt, and where did it come from?
llms.txt is a Markdown-formatted file proposed by Jeremy Howard at Answer.AI in September 2024. The idea: give LLM crawlers a curated, machine-readable index of your domain’s most important content, a hand-picked subset of your pages optimized for LLM consumption [1]. It is loosely modeled on robots.txt (which tells crawlers what NOT to crawl) and sitemap.xml (which tells crawlers which URLs you want crawled). The pitch was: LLMs need help filtering signal from noise on a typical domain, and a Markdown manifest is the cleanest interface.
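To make the "Markdown manifest" concrete, here is a minimal sketch of a generator for such a file. The layout it emits (an H1 with the site name, a blockquote summary, then H2 sections containing link lists) reflects the published proposal as commonly described; the `Link` class, the `render_llms_txt` function, and the example site content are all hypothetical names and placeholders, not anything taken from 500k.io.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One curated entry in the manifest (hypothetical helper type)."""
    title: str
    url: str
    note: str = ""


def render_llms_txt(site_name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render a Markdown manifest: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            # The note after the colon is optional.
            suffix = f": {link.note}" if link.note else ""
            lines.append(f"- [{link.title}]({link.url}){suffix}")
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content; titles and URLs are illustrative only.
example = render_llms_txt(
    "Example Site",
    "Short description of what the site covers.",
    {"Articles": [Link("First article", "https://example.com/first", "what it covers")]},
)
print(example)
```

Writing the returned string to `/llms.txt` at the domain root is the whole deployment step, which is why the effort estimate stays so low.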
The proposal has merit on paper. The execution data, 18 months in, is the problem.
What the actual citation data says
OtterlyAI runs ongoing audits of AI citation sources across ChatGPT, Perplexity, Claude, and Gemini. Their late-2025 dataset is the cleanest public benchmark on which sites actually get cited and why. The findings on llms.txt:
- Of all citations across the four major engines, fewer than 0.001% (under 1 in 100,000) trace back to a llms.txt file as the citation source [2]
- Sites that ship llms.txt are not measurably more likely to be cited than sites that don’t, controlling for content depth and schema discipline
- ChatGPT, Perplexity, Claude, and Gemini have not formally documented using llms.txt for retrieval (as of May 2026)
The signal is consistent across categories. B2B SaaS, founder economics, technical tutorials, e-commerce — none of them show a measurable citation lift from llms.txt presence.
Why the hype kept growing anyway
If the data is brutal, why does every “GEO checklist” article in 2026 recommend llms.txt? Three reasons.
1. It’s an easy “GEO win” to recommend
Consultancies need deliverables. “We shipped llms.txt for your domain” is concrete, fast, and feels modern. Clients see a new file on their domain and check a box. The fact that it doesn’t measurably move citations is invisible to them — and to most consultancies.
2. Confirmation bias from concurrent improvements
The pattern I’ve seen most often: a site ships llms.txt and 12 other GEO improvements (schema, TLDR boxes, FAQ sections, internal linking, brand mentions) within the same week. Citations rise over the following 90 days. The site owner credits llms.txt because it was the most novel of the 13 changes. The other 12 changes were the actual driver.
This is the same fallacy as crediting your morning coffee for a productive day when you also slept 8 hours, ate a real breakfast, and started early.
3. It’s a defensible practice in case adoption happens
If ChatGPT or Perplexity announces tomorrow that llms.txt is officially preferred for retrieval, every site that already ships it gets a head start. The cost of being wrong on llms.txt is minimal (30 minutes of writing, ~5KB of bandwidth). The cost of being wrong on other GEO bets is much higher. So the “ship it anyway” calculus survives even with weak data.
What actually drives AI citations
The data doesn’t lie about what works either. Across the OtterlyAI dataset, Profound’s Q1 2026 study, and Ahrefs’ December 2025 brand mentions analysis, six signals dominate citation likelihood.
| Signal | Citation lift correlation | Effort to ship |
|---|---|---|
| Brand mentions across 5+ authoritative sites | 3.0x vs sites with under 2 | High (months) |
| FAQPage schema with 5+ Q&A | 2.4x vs no FAQ schema | Low (hours) |
| TLDR / summary box at top of article | 2.1x vs no summary | Low (minutes per article) |
| Recency (dateModified within 90 days) | 1.8x vs static | Low (quarterly refresh) |
| Source diversity (cited by ≥3 external sites) | 1.7x vs cited by 0-1 | Medium (months) |
| Topic depth (3000+ words on the canonical query) | 1.5x vs under 1500 words | Medium (writing time) |
| llms.txt presence | No measurable lift | Low (30 min) |
The contrast is the punchline. Brand mentions, schema, and recency move the needle. llms.txt does not.
If you have 30 minutes free, here’s the honest priority list:
- Add FAQPage schema to your top 3 articles
- Add a TLDR box to your top 3 articles
- Bump dateModified on articles older than 90 days (after a real refresh)
- Submit your brand to Wikidata
- Comment thoughtfully on 3 Reddit threads in your niche
Any one of those will move citations more than llms.txt.
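For the FAQPage item, a minimal sketch of what the markup involves, assuming you keep your questions and answers as plain strings. The `faq_jsonld` and `script_tag` helpers are hypothetical names; the `FAQPage`, `Question`, and `Answer` types and the `mainEntity`, `name`, `acceptedAnswer`, and `text` properties come from schema.org.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build a schema.org FAQPage JSON-LD document from (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2, ensure_ascii=False)


def script_tag(jsonld: str) -> str:
    """Wrap JSON-LD for embedding in HTML; escape '</' so it cannot close the tag early."""
    safe = jsonld.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{safe}\n</script>'


faq = faq_jsonld([
    ("Should I still ship llms.txt?", "Yes, ship it anyway. Treat it as insurance, not strategy."),
])
print(script_tag(faq))
```

Paste the printed tag into the article's HTML and run the page through a structured-data validator before relying on it.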
So why do I still ship it?
Because the “ship it anyway” calculus survives. Here’s my own logic on 500k.io:
- llms.txt cost me 30 minutes to write and 5 minutes to ship
- llms-full.txt (the longer 4,000-word version with full content excerpts) cost me 90 minutes
- Total cost: 2 hours, mostly automated via a build script
- Total benefit if no LLM ever adopts: zero
- Total benefit if any major LLM adopts in 2026-2027: I’m already there
Cheap insurance. I don’t credit it for any citation lift on 500k.io. I credit schema discipline, TLDR boxes, FAQ sections, weekly content cadence, and the fact that I link to my stack, skills, and dashboard from articles consistently. That’s what’s earning the 0-2 weekly citations I’m tracking on Day 90.
“Ship llms.txt. Don’t ship a strategy around llms.txt. The difference between those two postures is the difference between covering your downside and pretending you’ve moved the needle.” — Maxime Le Morillon, building 500k.io in public
What the llms.txt advocates have right
Steel-manning the case. llms.txt advocates are correct on three points:
1. The intent is good
Giving LLMs a clean signal-to-noise interface IS a real problem. Most websites have 80% noise — affiliate links, sidebar widgets, footer junk. A curated Markdown index could in theory help.
2. The cost is genuinely low
Unlike many GEO tactics that require months of distribution work, llms.txt ships in 30 minutes. The downside is bounded. Hard to argue against on cost grounds alone.
3. Standards adoption takes time
robots.txt took years to become universal. Sitemap.xml took longer. If llms.txt is the future, it’ll keep looking exactly like it does today (proposed, partially adopted, low signal) for 2-3 more years before any inflection.
The defensible position is: I respect the intent, I’ll ship the file, but I won’t spend more than 2 hours on it and I won’t recommend it as a strategy. Anyone telling clients llms.txt is a meaningful GEO lever is selling, not informing.
What I’d ship instead, in 30 minutes
If a founder asked me “I have 30 minutes, what’s the highest-leverage GEO action,” llms.txt wouldn’t be in my top 5. Here’s what I’d actually do:
- Add FAQPage schema to your top article (10 min). Validates in Google Rich Results Test. Real citation lift.
- Write a TLDR box for your top 3 articles (15 min). 60-80 words, self-contained, citation-magnet.
- Submit your brand to Wikidata (5 min for first draft). 90-day delay for approval, but high authority signal.
That’s 30 minutes. Each item has documented citation lift. None of them is llms.txt.
If you have an hour, add: comment thoughtfully on 1 Reddit thread in your niche, write a 4-FAQ section on your homepage, and bump dateModified on your 2 oldest articles after a real content refresh. All of these correlate more strongly with citations than llms.txt does.
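Finding which articles are due for a refresh is easy to script. The sketch below assumes you can export each article's last-modified date into a mapping; the `stale_articles` function, the slugs, and the dates are hypothetical. It only flags candidates: the dateModified bump still belongs after a real content refresh, not before.

```python
from datetime import date, timedelta


def stale_articles(articles: dict[str, date], today: date, max_age_days: int = 90) -> list[str]:
    """Return slugs whose last-modified date is more than max_age_days old, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    stale = [slug for slug, modified in articles.items() if modified < cutoff]
    return sorted(stale, key=lambda slug: articles[slug])


# Placeholder slugs and dates, for illustration only.
inventory = {
    "fresh-post": date(2026, 4, 10),
    "oldest-post": date(2025, 11, 2),
    "older-post": date(2026, 1, 15),
}
print(stale_articles(inventory, today=date(2026, 5, 1)))
```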
What I’d watch over the next 12 months
I don’t think llms.txt is dead. I think it’s premature. Here’s what I’m watching:
- Anthropic announcing ClaudeBot uses llms.txt for retrieval. This would be the inflection. So far: no announcement.
- Perplexity adding llms.txt to their crawler docs. Currently silent.
- OpenAI documenting llms.txt support in their developer docs. Currently silent.
- Google announcing AIO uses llms.txt. Highly unlikely given Google’s existing crawl infrastructure.
- Citation rate for llms.txt-equipped sites diverging in OtterlyAI data. Quarterly check-in.
If any of these flip, I’ll update this article. dateModified bumps. The intellectually honest move is to publish the current data and revise when the data changes.
Why I wrote a counter-position essay
Most GEO content in 2026 is consultative-positive. “Here are 47 things you can do for AI citations” with no signal weighting. The reader walks away thinking 47 actions matter equally. They don’t. 5 of the 47 do most of the work. llms.txt isn’t in the 5.
Counter-positions are how I stay honest with my own audience. If I told you “ship llms.txt, you’ll see citations rise” because that’s what every other GEO blog says, I’d be selling you a hope, not a lever. The brand promise of 500k.io is real numbers from real operators. That includes saying “this thing you’ve been told is critical isn’t.”
FAQ
What is llms.txt?
It’s a Markdown-formatted file at the root of your domain (yoursite.com/llms.txt) that’s supposed to give LLM crawlers a curated index of your most important content. Proposed by Jeremy Howard in late 2024. Major LLMs have not formally committed to using it for retrieval.
Does llms.txt actually work?
Not measurably, in 2026. OtterlyAI’s ongoing audit shows less than 0.001% of AI-cited content traces back to a llms.txt file as the citation source. The signal is aspirational, not proven.
Should I still ship llms.txt?
Yes, ship it anyway. It’s 30 minutes of work, it doesn’t hurt anything, and if any major LLM does start using it, you’re covered. Treat it as insurance, not strategy.
What actually drives AI citations in 2026?
Brand mentions across the web, schema discipline (FAQPage, Article, Person), TLDR boxes for LLM extraction, recency signals (dateModified bumps), source diversity, and topic depth. Boring fundamentals beat speculative protocols.
Why do people still recommend llms.txt?
Three reasons: it’s an easy “GEO win” to recommend to clients, confirmation bias from concurrent improvements, and it’s a defensible practice in case adoption happens. None of these mean it currently moves citations.
Is llms.txt the same as robots.txt?
No. robots.txt is a 30-year-old standard all major crawlers respect — including LLM bots. llms.txt is a 2024 proposal no major LLM has formally adopted. Robots.txt is mandatory for GEO; llms.txt is optional and unproven.
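Because robots.txt is the file LLM bots do honor, it is worth confirming that yours is not blocking them by accident. A minimal sketch using Python's standard-library `urllib.robotparser`; the robots.txt content, the `allowed` helper, and the example.com URLs are hypothetical, while GPTBot and ClaudeBot are the user-agent names published by OpenAI and Anthropic.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: GPTBot is kept out of /private/, everyone else is allowed everywhere.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for bot in ("GPTBot", "ClaudeBot"):
    for path in ("/blog/post", "/private/page"):
        print(bot, path, allowed(ROBOTS_TXT, bot, f"https://example.com{path}"))
```

To check a live site, `RobotFileParser` can also load the file itself via `set_url()` and `read()` instead of parsing a string.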
Going further
- GEO 2026: how to get cited by ChatGPT and Perplexity
- AEO vs GEO vs SEO in 2026
- How to Rank in Perplexity When You’re a DR 0 Founder
- The AI SEO Playbook 2026
- The 500K dashboard
Footnotes