@vimalk78
Last active February 6, 2026 07:56
agent-friendly content check
❯ python3 ./agent_friendly_check.py
==============================================================================
AGENT-FRIENDLY CONTENT CHECK
Do websites already serve agent-friendly content?
==============================================================================
Hypothesis: If AI agents browse the web on behalf of users, websites
should offer agent-friendly content instead of HTML. This program
checks 34 major websites to see if any already do.
For each site, we probe for:
  1. /llms.txt, /llms-full.txt: proposed convention for LLM-readable pages
  2. Content negotiation: requesting JSON, Markdown, or plain text
     via the HTTP Accept header
  3. Machine endpoints: /.well-known/ai-plugin.json, OpenAPI specs
  4. RSS/Atom feeds: legacy machine-readable format
  5. robots.txt AI rules: do they mention GPTBot, Claude, etc.?
  6. Sitemap & API endpoints: /sitemap.xml, /api
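The script's source is not included in this output, so the following is only a minimal sketch of how the fixed-path probes in the list above could be set up. The names `PROBE_PATHS`, `probe_urls`, and `looks_like_text_file` are hypothetical, and the "not an HTML fallback page" heuristic is an assumption, not the author's method.

```python
from urllib.parse import urljoin

# Paths named in the probe list. The original script's exact path table is
# not shown, so this mapping is an assumption for illustration.
PROBE_PATHS = {
    "llms.txt": "/llms.txt",
    "llms-full.txt": "/llms-full.txt",
    "ai-plugin": "/.well-known/ai-plugin.json",
    "robots.txt": "/robots.txt",
    "sitemap": "/sitemap.xml",
    "api": "/api",
}


def probe_urls(host):
    """Return a {probe name: absolute URL} mapping for one host."""
    base = f"https://{host}/"
    return {name: urljoin(base, path) for name, path in PROBE_PATHS.items()}


def looks_like_text_file(status, content_type, body):
    """Heuristic check that a response is a real text file.

    Some sites answer unknown paths with a 200 status and an HTML page,
    so a bare status check could over-count llms.txt adoption.
    """
    if status != 200 or not body.strip():
        return False
    if "text/html" in content_type.lower():
        return False
    return not body.lstrip().lower().startswith(("<!doctype html", "<html"))
```

Calling `probe_urls("docs.stripe.com")` yields the URLs to fetch; each response's status, `Content-Type`, and body can then be passed to `looks_like_text_file` before counting it as a hit.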
Probing docs.anthropic.com... done (15.4s) — 1: Markdown
Probing platform.openai.com... done (12.5s) — 3: llms.txt, llms-full.txt, sitemap
Probing ai.google.dev... done (11.1s) — 1: sitemap
Probing docs.mistral.ai... done (1.9s) — 3: llms.txt, llms-full.txt, OpenAPI
Probing docs.cohere.com... done (16.9s) — 4: llms.txt, llms-full.txt, text/plain, sitemap
Probing docs.perplexity.ai... done (4.8s) — 5: llms.txt, llms-full.txt, Markdown, OpenAPI, sitemap
Probing huggingface.co... done (6.9s) — 3: OpenAPI, sitemap, API
Probing docs.aws.amazon.com... done (2.6s) — 2: llms.txt, llms-full.txt
Probing learn.microsoft.com... done (7.5s) — none
Probing docs.together.ai... done (10.5s) — 5: llms.txt, llms-full.txt, Markdown, OpenAPI, sitemap
Probing docs.fireworks.ai... done (10.5s) — 4: llms.txt, llms-full.txt, Markdown, sitemap
Probing docs.groq.com... done (10.1s) — none
Probing docs.langchain.com... done (6.5s) — 4: llms.txt, llms-full.txt, Markdown, sitemap
Probing docs.llamaindex.ai... done (4.0s) — none
Probing docs.cursor.com... done (4.8s) — none
Probing docs.python.org... done (1.1s) — 1: sitemap
Probing docs.stripe.com... done (7.7s) — 5: llms.txt, JSON, text/plain, sitemap, API
Probing developer.mozilla.org... done (10.9s) — 2: RSS, sitemap
Probing docs.github.com... done (1.0s) — 1: llms.txt
Probing vercel.com... done (10.3s) — 3: llms.txt, Markdown, sitemap
Probing nextjs.org... done (2.5s) — 2: llms.txt, sitemap
Probing react.dev... done (2.1s) — 2: llms.txt, RSS
Probing www.bbc.com... done (3.0s) — 2: sitemap, robots.txt:AI
Probing www.nytimes.com... done (4.9s) — 2: RSS, robots.txt:AI
Probing www.theguardian.com... done (5.5s) — 2: RSS, robots.txt:AI
Probing en.wikipedia.org... done (6.6s) — 1: RSS
Probing stackoverflow.com... done (2.7s) — none
Probing www.amazon.com... done (17.5s) — 1: robots.txt:AI
Probing www.ebay.com... done (17.7s) — 1: robots.txt:AI
Probing www.usa.gov... done (17.2s) — 1: sitemap
Probing data.gov... done (19.8s) — 1: sitemap
Probing www.reddit.com... done (11.0s) — 1: RSS
Probing github.com... done (7.2s) — 1: JSON
Probing www.cloudflare.com... done (5.5s) — 1: sitemap
==============================================================================
RESULTS MATRIX
==============================================================================
Site                    llms  JSON  MD    TXT   RSS   API   OAI   Map
------------------------------------------------------------------------------
docs.anthropic.com      -     -     YES   -     -     -     -     -
platform.openai.com     YES   -     -     -     -     -     -     YES
ai.google.dev           -     -     -     -     -     -     -     YES
docs.mistral.ai         YES   -     -     -     -     -     YES   -
docs.cohere.com         YES   -     -     YES   -     -     -     YES
docs.perplexity.ai      YES   -     YES   -     -     -     YES   YES
huggingface.co          -     -     -     -     -     YES   YES   YES
docs.aws.amazon.com     YES   -     -     -     -     -     -     -
learn.microsoft.com     -     -     -     -     -     -     -     -
docs.together.ai        YES   -     YES   -     -     -     YES   YES
docs.fireworks.ai       YES   -     YES   -     -     -     -     YES
docs.groq.com           -     -     -     -     -     -     -     -
docs.langchain.com      YES   -     YES   -     -     -     -     YES
docs.llamaindex.ai      -     -     -     -     -     -     -     -
docs.cursor.com         -     -     -     -     -     -     -     -
docs.python.org         -     -     -     -     -     -     -     YES
docs.stripe.com         YES   YES   -     YES   -     YES   -     YES
developer.mozilla.org   -     -     -     -     YES   -     -     YES
docs.github.com         YES   -     -     -     -     -     -     -
vercel.com              YES   -     YES   -     -     -     -     YES
nextjs.org              YES   -     -     -     -     -     -     YES
react.dev               YES   -     -     -     YES   -     -     -
www.bbc.com             -     -     -     -     -     -     -     YES
www.nytimes.com         -     -     -     -     YES   -     -     -
www.theguardian.com     -     -     -     -     YES   -     -     -
en.wikipedia.org        -     -     -     -     YES   -     -     -
stackoverflow.com       -     -     -     -     -     -     -     -
www.amazon.com          -     -     -     -     -     -     -     -
www.ebay.com            -     -     -     -     -     -     -     -
www.usa.gov             -     -     -     -     -     -     -     YES
data.gov                -     -     -     -     -     -     -     YES
www.reddit.com          -     -     -     -     YES   -     -     -
github.com              -     YES   -     -     -     -     -     -
www.cloudflare.com      -     -     -     -     -     -     -     YES
llms = /llms.txt | JSON = Accept: application/json | MD = Accept: text/markdown
TXT = Accept: text/plain | RSS = RSS/Atom feed | API = /api endpoint
OAI = OpenAPI spec | Map = sitemap.xml
==============================================================================
llms.txt ADOPTION (13/34 sites)
==============================================================================
platform.openai.com:
/llms.txt — 18,764 bytes
/llms-full.txt — 1,525,306 bytes
docs.mistral.ai:
/llms.txt — 14,660 bytes
/llms-full.txt — 991,794 bytes
docs.cohere.com:
/llms.txt — 102,615 bytes
/llms-full.txt — 2,886,704 bytes
docs.perplexity.ai:
/llms.txt — 20,490 bytes
/llms-full.txt — 957,176 bytes
docs.aws.amazon.com:
/llms.txt — 282,390 bytes
/llms-full.txt — 577,989 bytes
docs.together.ai:
/llms.txt — 30,585 bytes
/llms-full.txt — 1,233,716 bytes
docs.fireworks.ai:
/llms.txt — 42,853 bytes
/llms-full.txt — 820,341 bytes
docs.langchain.com:
/llms.txt — 84,102 bytes
/llms-full.txt — 7,108,302 bytes
docs.stripe.com:
/llms.txt — 88,769 bytes
docs.github.com:
/llms.txt — 1,914 bytes
vercel.com:
/llms.txt — 222,839 bytes
nextjs.org:
/llms.txt — 6,668 bytes
react.dev:
/llms.txt — 14,347 bytes
==============================================================================
CONTENT NEGOTIATION (9/34 sites respond to Accept headers)
==============================================================================
docs.anthropic.com: Markdown
docs.cohere.com: Plain text
docs.perplexity.ai: Markdown
docs.together.ai: Markdown
docs.fireworks.ai: Markdown
docs.langchain.com: Markdown
docs.stripe.com: JSON, Plain text
vercel.com: Markdown
github.com: JSON
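A site counts as "responding to content negotiation" when a request with a non-HTML `Accept` header comes back in the requested format. The sketch below shows one way such a check could work, using only the standard library. `ACCEPT_TYPES`, `honored`, and `negotiate` are hypothetical names, and comparing the response `Content-Type` against the requested media type is an assumed criterion; the original script may decide differently.

```python
import urllib.request

# Labels and media types taken from the report's legend.
ACCEPT_TYPES = {
    "JSON": "application/json",
    "Markdown": "text/markdown",
    "Plain text": "text/plain",
}


def honored(requested, content_type_header):
    """True if the response media type equals the requested one.

    Parameters such as "; charset=utf-8" are ignored, and the comparison
    is case-insensitive.
    """
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    return media_type == requested.lower()


def negotiate(url, timeout=10.0):
    """Request url once per format and list the labels the server honored."""
    found = []
    for label, mime in ACCEPT_TYPES.items():
        request = urllib.request.Request(url, headers={"Accept": mime})
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                if honored(mime, response.headers.get("Content-Type", "")):
                    found.append(label)
        except OSError:
            # Covers URLError, HTTPError, and timeouts: treat as "not served".
            continue
    return found
```

A server that ignores `Accept` and always returns `text/html` produces an empty list, which would correspond to a `-` in the JSON, MD, and TXT columns of the matrix.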
==============================================================================
robots.txt AI AGENT RULES (5/34 sites mention AI bots)
==============================================================================
www.bbc.com:
GPTBot: User-agent: GPTBot; Disallow: /
ChatGPT-User: User-agent: ChatGPT-User; Disallow: /
anthropic: User-agent: anthropic-ai; Disallow: /
Claude-Web: User-agent: Claude-Web; Disallow: /
Amazonbot: User-agent: Amazonbot; Disallow: /
www.nytimes.com:
GPTBot: User-agent: GPTBot; Disallow: /
ChatGPT-User: User-agent: ChatGPT-User; Disallow: /
anthropic: User-agent: anthropic-ai; Disallow: /
Claude-Web: User-agent: Claude-Web; Disallow: /
CCBot: User-agent: CCBot; Disallow: /
www.theguardian.com:
anthropic: User-agent: anthropic-ai
Amazonbot: User-agent: Amazonbot
CCBot: User-agent: CCBot
Bytespider: User-agent: Bytespider
PerplexityBot: User-Agent: PerplexityBot
www.amazon.com:
GPTBot: User-agent: GPTBot; Disallow: /
ChatGPT-User: User-agent: ChatGPT-User; Disallow: /
CCBot: User-agent: CCBot; Disallow: /
Google-Extended: User-agent: Google-Extended; Disallow: /
Bytespider: User-agent: Bytespider; Disallow: /
www.ebay.com:
anthropic: User-agent: anthropic-ai; Disallow: /
Amazonbot: User-agent: AmazonBot; Disallow: /;
CCBot: User-agent: CCBot; Disallow: /
Bytespider: User-agent: Bytespider; Disallow: /
PerplexityBot: User-agent: PerplexityBot; Disallow: /
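The listing above pairs each AI crawler name with the rule that applies to it. As a rough sketch of how that pairing could be extracted, the function below walks a robots.txt body group by group. `AI_BOTS` contains only the bot names that appear in this report, and `ai_bot_rules` is a hypothetical helper, not the script's actual parser.

```python
# Bot names seen in the report; a real check may use a longer list.
AI_BOTS = (
    "GPTBot", "ChatGPT-User", "anthropic-ai", "Claude-Web", "Amazonbot",
    "CCBot", "Google-Extended", "Bytespider", "PerplexityBot",
)


def ai_bot_rules(robots_txt):
    """Map each known AI bot to the Allow/Disallow rules in its group."""
    rules = {}
    group = []        # user-agents of the group currently being read
    in_rules = False  # True once the group has started listing rules
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after rules opens a new group
                group, in_rules = [], False
            group.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in group:
                for bot in AI_BOTS:
                    if agent.lower() == bot.lower():
                        rules.setdefault(bot, []).append(
                            f"{field.capitalize()}: {value}"
                        )
    return rules
```

Because several `User-agent` lines can share one rule block, a scanner that only reads the line immediately after the bot name may miss the `Disallow`, which is one possible reason some entries above show no rule.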
==============================================================================
SCORECARD
==============================================================================
Have /llms.txt [███████████████████████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 13/34
Have /llms-full.txt [████████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 8/34
Serve JSON via Accept header [██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 2/34
Serve Markdown via Accept header [██████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 6/34
Serve plain text via Accept header [██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 2/34
Have RSS/Atom feeds [██████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 6/34
Have OpenAPI spec [████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 4/34
Have /api endpoint [██████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 2/34
Have sitemap.xml [███████████████████████████████████████████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 17/34
Have AI-plugin manifest [░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 0/34
Mention AI bots in robots.txt [███████████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 5/34
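The bars above appear to use three characters per site (for example, 2/34 renders as six filled blocks). Assuming that scale, a bar could be rendered as in the sketch below. `bar` and `scorecard_line` are hypothetical names, and the scale is inferred from the output rather than taken from the script.

```python
def bar(count, total, chars_per_site=3):
    """Render a fixed-width bar: filled blocks for hits, shaded for the rest."""
    filled = count * chars_per_site
    empty = (total - count) * chars_per_site
    return "[" + "█" * filled + "░" * empty + "]"


def scorecard_line(label, count, total):
    """Format one scorecard row as 'label [bar] count/total'."""
    return f"{label} {bar(count, total)} {count}/{total}"


print(scorecard_line("Have /llms.txt", 13, 34))
```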
==============================================================================
CONCLUSION
==============================================================================
Of 34 major sites tested:
- 13 offer /llms.txt (purpose-built for LLMs)
- 9 respond to content negotiation with non-HTML formats
- 5 acknowledge AI agents in robots.txt (mostly to BLOCK them)
- 6 have RSS feeds (legacy machine-readable format)
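For comparison across features, the counts in the conclusion can be expressed as shares of the 34 sites. This is plain arithmetic on the numbers reported above; `share` is a hypothetical helper.

```python
TOTAL_SITES = 34

# Counts copied from the conclusion.
counts = {
    "llms.txt": 13,
    "content negotiation": 9,
    "robots.txt AI rules": 5,
    "RSS feeds": 6,
}


def share(count, total=TOTAL_SITES):
    """Format count/total as a percentage with one decimal place."""
    return f"{count / total:.1%}"


for feature, count in counts.items():
    print(f"{feature}: {count}/{TOTAL_SITES} = {share(count)}")
```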