# The AI Readiness Kit

> Make your website understandable to AI before ChatGPT, Claude, Gemini, or Perplexity guess wrong.

- **Site:** https://ai.silverbackmarketing.com
- **Organization:** Silverback Marketing
- **Skill repo:** https://github.com/silverbackmarketing/ai-readiness
- **Guide PDF:** https://ai.silverbackmarketing.com/ai-readiness-guide.pdf
- **HTML version:** https://ai.silverbackmarketing.com

## The new front door

When people want the best product, the right service, or an answer, they ask AI assistants before they visit a website. Without clear, current, structured signals from you, those systems guess, and often guess wrong.

## The solution

16 files you place at the root of your website. They speak a format AI already understands, so your brand shows up accurately, completely, and on your terms.

### What the kit covers

- **Control access:** Tell every major AI crawler exactly what it can and cannot see.
- **Give the full story:** Cheat sheets and deep dives written for LLMs and RAG pipelines.
- **Map intent:** Route real user questions straight to the best page on your site.
- **Set the rules:** Publish clear policies on training data use and AI transparency.

## The 16 files

### Identity & Permissions

The files that introduce your brand and control which AI systems get access.

- **robots.txt** (Critical): THE DOORMAN. Controls who gets in and where they can go. One of the oldest files on the web, upgraded for AI. Tells every crawler and AI bot exactly which sections of your site they can access, and points them toward your AI-specific readiness files.
  - Deploy at: `/robots.txt`
- **ai.txt** (Critical): YOUR AI BUSINESS CARD. Purpose-built for the age of AI. Your brand's complete introduction to every AI system. Contains company identity, what you sell, what you are known for, authoritative topics, and explicit rules for what AI systems can and cannot do with your content.
  - Deploy at: `/ai.txt`
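
Before deploying access rules, it can help to test them locally. The sketch below uses Python's standard `urllib.robotparser` to check what a given crawler may fetch. The sample rules, paths, and `example.com` URLs are hypothetical and are not taken from the kit's own robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the rules and paths are illustrative only.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /checkout/
Disallow: /admin/

User-agent: *
Disallow: /admin/
"""


def can_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("GPTBot", "https://example.com/llms.txt"),
        ("GPTBot", "https://example.com/checkout/cart"),
        ("ClaudeBot", "https://example.com/admin/users"),
    ]:
        print(agent, url, can_fetch(SAMPLE_ROBOTS_TXT, agent, url))
```

Note that `urllib.robotparser` applies the first matching rule in file order, so a broad `Allow: /` placed above a `Disallow` line would win in this parser. Other crawlers may resolve conflicts differently.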

### Content Files

The files that give AI the full story about your site in readable text.

- **llms.txt** (Critical): THE CHEAT SHEET. A quick-read summary of your entire website, written specifically for large language models. Follows the llmstxt.org standard. Company name, short description, organized sections for every part of your site with direct links, and the most common questions people ask.
  - Deploy at: `/llms.txt`
- **llms-full.txt** (Critical): THE DEEP DIVE. The extended edition. Rich context, detailed Q&A sections, full category explanations, and everything an AI system needs to speak knowledgeably. llms.txt is the back-of-book summary; llms-full.txt is the complete book.
  - Deploy at: `/llms-full.txt`
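
As a rough illustration of the llms.txt shape described above (a title, a short blockquote summary, then sections of links), here is a minimal Python sketch. The function name `build_llms_txt` and all sample values are hypothetical, and the layout is a simplified reading of the llmstxt.org format rather than the kit's exact template.

```python
def build_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render a minimal llms.txt: H1 title, blockquote summary, H2 link sections.

    sections maps a heading to (link title, URL, short note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical company and pages, for illustration only.
    print(
        build_llms_txt(
            "Example Co",
            "Hypothetical store selling widgets.",
            {"Products": [("Widgets", "https://example.com/widgets", "All widgets")]},
        )
    )
```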

### Map & Navigation

The files that help AI navigate your site structure efficiently.

- **ai-sitemap.xml** (High): THE GPS. An upgraded sitemap with AI-specific directions. Beyond standard URL + lastmod, every entry includes content-type labels, topic descriptions, and plain-English one-sentence summaries so AI knows exactly what each page contains before visiting.
  - Deploy at: `/ai-sitemap.xml`
- **sitemap.md** (Medium): THE HUMAN-READABLE MAP. A plain-text, markdown-formatted overview of your entire website structure. No XML, no JSON, just organized sections, links, and conversational descriptions. The most approachable file for both humans and AI.
  - Deploy at: `/sitemap.md`
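
One way to picture an AI-annotated sitemap entry is a standard sitemap `<url>` element with extra child elements in a separate namespace. The sketch below builds one with Python's `xml.etree.ElementTree`. The `ai` namespace URI and the element names `contentType`, `topics`, and `summary` are assumptions made for illustration; the kit's actual ai-sitemap.xml schema may differ.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"  # standard sitemap namespace
AI_NS = "https://example.com/schemas/ai-sitemap"  # hypothetical extension namespace

ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("ai", AI_NS)


def build_ai_sitemap(pages: list[dict]) -> str:
    """Serialize pages into a sitemap with hypothetical AI annotation elements."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # Standard sitemap fields.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page["loc"]
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = page["lastmod"]
        # Assumed AI-specific fields: content type, topics, one-sentence summary.
        ET.SubElement(url, f"{{{AI_NS}}}contentType").text = page["content_type"]
        ET.SubElement(url, f"{{{AI_NS}}}topics").text = ", ".join(page["topics"])
        ET.SubElement(url, f"{{{AI_NS}}}summary").text = page["summary"]
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(
        build_ai_sitemap(
            [
                {
                    "loc": "https://example.com/pricing",
                    "lastmod": "2026-01-15",
                    "content_type": "pricing",
                    "topics": ["plans", "billing"],
                    "summary": "Lists plans and prices.",
                }
            ]
        )
    )
```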

### Intelligence Files

The files that give AI deep knowledge of your entities, products, and user intents.

- **ai-entities.json** (High): THE ENCYCLOPEDIA. A structured catalog of every significant element: product categories, subcategories, services, brands, and key concepts. Each entity has name, type, description, related URLs, and connections. Powers the knowledge graphs AI builds internally.
  - Deploy at: `/ai-entities.json`
- **ai-intent.json** (High): THE TRAFFIC DIRECTOR. A lookup table mapping real user questions (the exact things people type into AI assistants) to the single best page on your website to answer them. Transforms AI from a general information source into a precise navigation tool.
  - Deploy at: `/ai-intent.json`
- **ai-schema.json** (High): THE IDENTITY CARD. Uses the Schema.org standard (backed by Google, Microsoft, Yahoo, and Yandex) in machine-readable JSON-LD format. Defines organization type, founding date, address, social profiles, contact info, and site search functionality with zero ambiguity.
  - Deploy at: `/ai-schema.json`
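
To make the Schema.org part concrete, the sketch below assembles a small JSON-LD document with an `Organization` node and a `WebSite` node carrying a `SearchAction`. The property names are standard Schema.org vocabulary; the function name `build_org_schema`, the `@id` convention, and every sample value are hypothetical. Address and contact details are left out to keep it short.

```python
import json


def build_org_schema(
    name: str,
    url: str,
    founding_date: str,
    same_as: list[str],
    search_url_template: str,
) -> dict:
    """Return a JSON-LD graph describing an organization and its site search."""
    org_id = f"{url}#organization"
    return {
        "@context": "https://schema.org",
        "@graph": [
            {
                "@type": "Organization",
                "@id": org_id,
                "name": name,
                "url": url,
                "foundingDate": founding_date,
                "sameAs": same_as,  # social profile URLs
            },
            {
                "@type": "WebSite",
                "@id": f"{url}#website",
                "url": url,
                "publisher": {"@id": org_id},
                "potentialAction": {
                    "@type": "SearchAction",
                    "target": {
                        "@type": "EntryPoint",
                        "urlTemplate": search_url_template,
                    },
                    "query-input": "required name=search_term_string",
                },
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder values, not real company data.
    schema = build_org_schema(
        "Example Co",
        "https://example.com/",
        "2010-01-01",
        ["https://www.linkedin.com/company/example"],
        "https://example.com/search?q={search_term_string}",
    )
    print(json.dumps(schema, indent=2))
```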

### Research Files

The files that power AI research and retrieval pipelines.

- **rag-index.json** (High): THE RESEARCH DATABASE. A pre-built index designed for Retrieval-Augmented Generation (RAG). A JSON array of records (one per major page/section) containing URL, title, and topics covered. AI systems can load this directly into LlamaIndex, LangChain, Pinecone, Weaviate, etc.
  - Deploy at: `/rag-index.json`
- **rag-index.jsonl** (High): THE STREAMLINED DATABASE. The same content as rag-index.json, but in JSON Lines (newline-delimited) format. Preferred by large-scale ML pipelines, OpenAI fine-tuning, and streaming vector DB ingestion because records can be processed one at a time without loading the entire file into memory.
  - Deploy at: `/rag-index.jsonl`
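
The relationship between the two RAG files can be sketched in a few lines of Python: one function turns a JSON array into JSON Lines, and a generator reads records back one at a time. The record fields (`url`, `title`, `topics`) follow the description above; the function names and sample records are hypothetical, and this is not the implementation behind the `generate_rag_jsonl` tool.

```python
import io
import json
from typing import Iterator, TextIO


def json_array_to_jsonl(rag_index_json: str) -> str:
    """Convert a JSON array of records into JSON Lines (one record per line)."""
    records = json.loads(rag_index_json)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def iter_jsonl(stream: TextIO) -> Iterator[dict]:
    """Yield records one by one, without holding the whole file in memory."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


if __name__ == "__main__":
    # Hypothetical records for illustration.
    source = json.dumps(
        [
            {"url": "https://example.com/a", "title": "A", "topics": ["x"]},
            {"url": "https://example.com/b", "title": "B", "topics": ["y", "z"]},
        ]
    )
    jsonl = json_array_to_jsonl(source)
    for record in iter_jsonl(io.StringIO(jsonl)):
        print(record["url"], record["topics"])
```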

### Policy Files

The files that set the rules for how AI can use your content.

- **ai-disclosure.txt** (Medium): THE TRANSPARENCY REPORT. Your public statement about how your organization uses AI in products, operations, and customer interactions. Like a privacy policy for the AI era. Answers: Is AI generating content? Making decisions? Can users reach a real person?
  - Deploy at: `/ai-disclosure.txt`
- **training-data-policy.txt** (Medium): THE LICENSE AGREEMENT. Your formal published position on whether AI companies can use your content to train their models. Clearly states what is permitted (e.g. real-time RAG) vs. what requires a license (commercial model training). Protects your organization legally.
  - Deploy at: `/training-data-policy.txt`

### Operations Files

The files that help your team deploy and maintain everything.

- **structured-data-guide.md** (Low): THE DEVELOPER HANDBOOK. The most technical file, written for your development team, not AI. A comprehensive step-by-step guide with ready-to-use JSON-LD examples for every major page type: Product, LocalBusiness, BlogPosting, Event, etc. Powers rich results and page-level understanding.
  - Deploy at: `/structured-data-guide.md`
- **manifest.json** (Medium): THE MASTER INVENTORY. The single source of truth for your entire AI readiness implementation. Lists every file deployed, what it does, its URL, format, intended audience, and update frequency. Includes a summary scorecard. Lets anyone instantly audit your readiness status.
  - Deploy at: `/manifest.json`
- **deployment-checklist.md** (Internal): THE LAUNCH PLAN. The practical playbook your team follows to move from files on a computer to files correctly serving live traffic. Organized in clear phases with verification steps for each critical file. Nothing is skipped, forgotten, or misconfigured.
  - Deploy at: `/deployment-checklist.md`

## How to become AI-ready

1. **Deploy critical identity and content files:** Start with robots.txt, ai.txt, llms.txt, and llms-full.txt at your site root. These four critical files deliver the biggest immediate impact and are what most AI systems look for first.

2. **Add map and navigation files:** Publish ai-sitemap.xml and sitemap.md so AI systems can discover every page efficiently and understand what each page contains before they crawl it.

3. **Publish intelligence and research files:** Deploy ai-entities.json, ai-intent.json, ai-schema.json, and the rag-index files. This is where precision happens: structured data that powers accurate answers and retrieval.

4. **Add policies, verify, and maintain:** Publish ai-disclosure.txt and training-data-policy.txt. Use manifest.json and deployment-checklist.md to verify every file is live, returning correct status codes, and kept current.
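
The verification in step 4 can be partly automated. The sketch below requests each file path and reports any that do not return HTTP 200. The path list mirrors the files named in the steps above; the function names, the use of `HEAD` requests, and the injectable `fetch_status` parameter are assumptions for illustration. Some servers reject `HEAD`, in which case a `GET` would be needed.

```python
from typing import Callable
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

# Paths taken from the four steps above.
PUBLIC_FILES = [
    "/robots.txt", "/ai.txt", "/llms.txt", "/llms-full.txt",
    "/ai-sitemap.xml", "/sitemap.md",
    "/ai-entities.json", "/ai-intent.json", "/ai-schema.json",
    "/rag-index.json", "/rag-index.jsonl",
    "/ai-disclosure.txt", "/training-data-policy.txt", "/manifest.json",
]


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for url, or 0 if the host is unreachable."""
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # e.g. 404 or 403
    except URLError:
        return 0


def check_deployment(
    base_url: str,
    paths: list[str] = PUBLIC_FILES,
    fetch_status: Callable[[str], int] = http_status,
) -> dict[str, int]:
    """Map each path to the status code returned for base_url + path."""
    base = base_url.rstrip("/")
    return {path: fetch_status(base + path) for path in paths}


def missing_files(results: dict[str, int]) -> list[str]:
    """List the paths that did not return 200."""
    return sorted(path for path, status in results.items() if status != 200)


if __name__ == "__main__":
    results = check_deployment("https://example.com")  # placeholder domain
    print(missing_files(results))
```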

## MCP server

Connect a coding agent to the hosted Model Context Protocol server — no API key, no local install.

- **Endpoint:** https://ai.silverbackmarketing.com/api/mcp
- **Transport:** Streamable HTTP
- **Authentication:** None (public)
- **HTML section:** https://ai.silverbackmarketing.com/#mcp

### Tools (6)

- **generate_ai_readiness_files:** Start the full 18-file workflow for any domain — args: `url`
- **list_output_files:** List all output files in generation order
- **get_file_spec:** Detailed spec for one file (llms.txt, ai-entities.json, …) — args: `filename`
- **get_skill_instructions:** Full research and generation workflow (SKILL.md)
- **get_site_classification_guide:** Site-type taxonomy (SaaS, e-commerce, healthcare, …)
- **generate_rag_jsonl:** Convert rag-index.json to JSONL for embedding pipelines — args: `rag_index_json`
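
For readers curious about what a client sends under the hood, MCP tool invocations are JSON-RPC 2.0 messages with the method `tools/call`. The sketch below only builds the message body and the headers commonly used with the Streamable HTTP transport; it does not contact the server. The helper name `tools_call_request` is hypothetical, and a real session normally begins with an `initialize` handshake that your MCP client performs for you.

```python
import json
from itertools import count

# Simple incrementing JSON-RPC request ids.
_request_ids = count(1)

# Headers typically sent with Streamable HTTP POST requests.
STREAMABLE_HTTP_HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}


def tools_call_request(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' message for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Tool name and argument taken from the list above.
    message = tools_call_request("get_file_spec", {"filename": "llms.txt"})
    print(json.dumps(message, indent=2))
```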

### Resources

- `ai-readiness://skill` — AI Readiness skill instructions (markdown)
- `ai-readiness://file-specs` — File specifications for all 18 outputs (markdown)

### Prompt

- `generate-ai-readiness` — Start the full workflow for a website URL

### Setup by client

#### Claude Code

- Config: `.mcp.json` (Streamable HTTP)

```json
{
  "mcpServers": {
    "ai-readiness": {
      "type": "http",
      "url": "https://ai.silverbackmarketing.com/api/mcp"
    }
  }
}
```

1. Add `.mcp.json` to your project root, or run: `claude mcp add --transport http ai-readiness https://ai.silverbackmarketing.com/api/mcp`
2. Restart the Claude Code session and run `/mcp` to verify.
3. Try: "Generate AI readiness files for mysite.com using ai-readiness"

#### Cursor

- Config: `.cursor/mcp.json` (Streamable HTTP)

```json
{
  "mcpServers": {
    "ai-readiness": {
      "url": "https://ai.silverbackmarketing.com/api/mcp"
    }
  }
}
```

1. Save the config to .cursor/mcp.json in your project (or Cursor Settings → MCP).
2. Refresh MCP servers in settings, or restart Cursor.
3. Try: "Use ai-readiness to generate files for example.com"

#### VS Code / Copilot

- Config: `.vscode/mcp.json` (Streamable HTTP)

```json
{
  "servers": {
    "ai-readiness": {
      "type": "http",
      "url": "https://ai.silverbackmarketing.com/api/mcp"
    }
  }
}
```

1. Save to .vscode/mcp.json in your workspace.
2. Open Command Palette → MCP: List Servers and confirm ai-readiness is running.
3. Enable agent mode in Copilot Chat and reference the MCP tools.

#### Claude Desktop

- Config: `claude_desktop_config.json` (stdio bridge via mcp-remote)

> Claude Desktop does not connect to remote URLs directly — mcp-remote proxies HTTP to stdio. Requires Node.js 18+.

```json
{
  "mcpServers": {
    "ai-readiness": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://ai.silverbackmarketing.com/api/mcp"
      ]
    }
  }
}
```

1. Edit `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS).
2. Fully quit and reopen Claude Desktop.
3. Look for MCP tools in the chat input area.

#### Antigravity

- Config: `~/.gemini/antigravity/mcp_config.json` (uses serverUrl, not url)

> Antigravity uses serverUrl instead of url — copy the config exactly as shown.

```json
{
  "mcpServers": {
    "ai-readiness": {
      "serverUrl": "https://ai.silverbackmarketing.com/api/mcp"
    }
  }
}
```

1. Agent panel → … → MCP Servers → Manage → View raw config.
2. Paste the config, save, and click Refresh.
3. Try in agent mode: "Use ai-readiness to list all 18 output files"

#### OpenAI Codex

- Config: `~/.codex/config.toml` (TOML)

```toml
[mcp_servers.ai-readiness]
url = "https://ai.silverbackmarketing.com/api/mcp"
enabled = true
```

1. Add to `~/.codex/config.toml`, or run: `codex mcp add ai-readiness --url https://ai.silverbackmarketing.com/api/mcp`
2. Start a Codex session and run `/mcp` to list connected tools.
3. Try: "Use ai-readiness generate_ai_readiness_files for stripe.com"

### Example prompts

- "Generate AI readiness files for example.com"
- "Use ai-readiness to list all 18 output files"
- "Get the file spec for llms.txt from ai-readiness"

## FAQ

### Getting Started

The basics of AI readiness and why it matters for every website.

**What are AI Readiness Files?**

They are structured files you deploy at the root of your website to tell AI systems exactly who you are, what you offer, and where to send people. They speak a format AI already understands, so assistants stop guessing about your brand based on old training data.

**Why does AI readiness matter now?**

When someone wants a product recommendation, a service provider, or a straight answer, they often ask ChatGPT, Claude, Gemini, or Perplexity before they visit a website. If you are not giving those systems clear, current signals, they guess. And they often guess wrong.

**How many files are in the kit?**

The kit includes 16 purpose-built files across 7 categories: Identity & Permissions, Content Files, Map & Navigation, Intelligence Files, Research Files, Policy Files, and Operations Files. Each one helps AI understand and represent your site more accurately.

**Where should I start?**

Start with the critical files: robots.txt, ai.txt, llms.txt, and llms-full.txt. They deliver the biggest immediate impact and are what most AI systems look for first. Then add sitemaps, intelligence JSON files, policies, and operational files using the deployment checklist.

### Identity & Permissions

How you introduce your brand and control AI crawler access.

**What does robots.txt do for AI?**

robots.txt sits at the front door of your website and tells automated visitors (search engines, AI bots, and scrapers) which sections they can access. For AI readiness, it includes instructions for major crawlers like GPTBot and ClaudeBot, pointing them toward your llms.txt and ai-sitemap.xml while keeping checkout, login, and admin pages off limits.

**What is ai.txt?**

ai.txt is your brand's introduction to every AI system on the internet. Where robots.txt controls access, ai.txt goes further. It covers what you sell, what you are known for, your authoritative topics, and the rules for what AI systems can and cannot do with your content.

**How is ai.txt different from robots.txt?**

robots.txt controls access: which pages bots can crawl. ai.txt controls identity: who you are, what you offer, and your ground rules for AI use of your content. Think of robots.txt as the doorman and ai.txt as the briefing document you hand a journalist before an interview.

### Content & Navigation

Files that give AI the full story and a map of your site.

**What is llms.txt?**

llms.txt is a structured text file built for large language models. It gives AI a quick map of your website: company name, description, organized sections with direct links, and the most common questions your site answers. It follows the proposed llmstxt.org standard, which is aimed at assistants such as ChatGPT, Claude, Gemini, and Perplexity.

**What's the difference between llms.txt and llms-full.txt?**

llms.txt is the cheat sheet: a quick summary an AI can scan in seconds. llms-full.txt is the deep dive with rich descriptions, detailed Q&A, and full category explanations. If llms.txt is a Wikipedia summary, llms-full.txt is the full article.

**Why do I need an ai-sitemap.xml?**

A regular sitemap lists pages and timestamps. The ai-sitemap adds what each page is about, its content type, and a plain-English summary. It is the difference between a paper road map and GPS with points of interest. AI crawlers can understand pages without fetching and parsing each one individually.

### Intelligence & Research

Structured data that powers accurate AI answers and retrieval.

**What is ai-intent.json?**

ai-intent.json maps real user questions (the things people type into AI assistants) to the best page on your website to answer each one. Without it, AI guesses which page to recommend. With it, you give AI a direct URL for every common query.
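
The lookup idea can be sketched as a normalized question-to-URL table. The entry shape used here (a list of objects with `question` and `url` keys) is an assumption for illustration and may not match the kit's ai-intent.json schema; the function names and sample entries are hypothetical as well.

```python
import re
from typing import Optional


def normalize(question: str) -> str:
    """Lowercase and strip punctuation so near-identical phrasings match."""
    return " ".join(re.findall(r"[a-z0-9]+", question.lower()))


def build_lookup(intents: list[dict]) -> dict[str, str]:
    """Index hypothetical intent entries by their normalized question."""
    return {normalize(item["question"]): item["url"] for item in intents}


def resolve(lookup: dict[str, str], question: str) -> Optional[str]:
    """Return the mapped URL, or None when the question is not in the table."""
    return lookup.get(normalize(question))


if __name__ == "__main__":
    # Hypothetical entries for illustration.
    lookup = build_lookup(
        [
            {"question": "What are your shipping rates?", "url": "https://example.com/shipping"},
            {"question": "How do I return an item?", "url": "https://example.com/returns"},
        ]
    )
    print(resolve(lookup, "what are your shipping rates"))
    print(resolve(lookup, "Do you sell gift cards?"))
```

Exact matching after normalization is the simplest possible strategy. An assistant consuming the file would more likely match on meaning than on wording.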

**What are ai-entities.json and ai-schema.json for?**

ai-entities.json is a structured catalog of the important parts of your site: products, categories, services, and key concepts. It powers AI knowledge graphs. ai-schema.json uses the Schema.org standard to describe your organization in machine-readable format so search engines and AI systems can identify you without ambiguity.

**What are the RAG index files?**

rag-index.json and rag-index.jsonl are ready-made indexes of your site built for AI research and retrieval pipelines. When a user asks a question, AI first pulls the most relevant documents from this index before generating an answer, which keeps responses grounded in your actual content.
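
To show the retrieval step in miniature, the sketch below ranks index records by how many words they share with the question. This keyword overlap is a deliberately simple stand-in for the embedding-based search a real RAG pipeline would use. The record fields follow the `url`, `title`, and `topics` description given for rag-index.json; the function names and sample records are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Split text into a set of lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(records: list[dict], question: str, top_k: int = 2) -> list[dict]:
    """Return up to top_k records sharing the most words with the question."""
    query = tokenize(question)
    scored = []
    for record in records:
        haystack = tokenize(record["title"] + " " + " ".join(record["topics"]))
        score = len(query & haystack)
        if score > 0:  # drop records with no overlap at all
            scored.append((score, record))
    # Highest score first; the sort is stable, so ties keep index order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored[:top_k]]


if __name__ == "__main__":
    # Hypothetical index records for illustration.
    index = [
        {"url": "https://example.com/returns", "title": "Returns policy", "topics": ["refunds", "returns"]},
        {"url": "https://example.com/shipping", "title": "Shipping", "topics": ["delivery", "shipping rates"]},
    ]
    for hit in retrieve(index, "How do returns and refunds work?"):
        print(hit["url"])
```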

### Policies & Maintenance

Rules for AI use of your content and keeping files current.

**What is a training data policy?**

training-data-policy.txt sets formal rules for how AI companies can use your content, including model training, RAG indexing, and commercial use. It answers a practical question every brand should decide upfront: what are others allowed to do with your work?

**What is ai-disclosure.txt?**

ai-disclosure.txt is a public transparency report explaining how your organization uses AI, whether for content generation, recommendations, or customer interactions. It builds trust by answering those questions before someone has to ask.

**How often should I update my AI readiness files?**

It depends on the file. Update llms.txt and sitemaps monthly or when your site structure changes. Refresh ai.txt and llms-full.txt quarterly or when products change. Review policy files annually. The manifest.json file lists recommended update frequencies for every file in the kit.
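
A small script can flag which files are due for review under the cadence above. The sketch assumes monthly means 30 days, quarterly 90, and annually 365, and it takes last-updated dates as input rather than reading them from manifest.json, whose exact format is not shown here. The function name `files_due` and the sample dates are hypothetical.

```python
from datetime import date, timedelta

# Assumed intervals in days: monthly = 30, quarterly = 90, annually = 365.
REVIEW_INTERVAL_DAYS = {
    "llms.txt": 30,
    "ai-sitemap.xml": 30,
    "sitemap.md": 30,
    "ai.txt": 90,
    "llms-full.txt": 90,
    "ai-disclosure.txt": 365,
    "training-data-policy.txt": 365,
}


def files_due(last_updated: dict[str, date], today: date) -> list[str]:
    """Return files whose last update is at least one review interval old."""
    due = []
    for filename, updated in last_updated.items():
        interval = REVIEW_INTERVAL_DAYS.get(filename)
        if interval is not None and today - updated >= timedelta(days=interval):
            due.append(filename)
    return sorted(due)


if __name__ == "__main__":
    # Hypothetical last-updated dates for illustration.
    print(
        files_due(
            {
                "llms.txt": date(2026, 5, 20),
                "ai.txt": date(2026, 1, 1),
                "sitemap.md": date(2026, 4, 1),
                "training-data-policy.txt": date(2025, 7, 1),
            },
            today=date(2026, 6, 1),
        )
    )
```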

## Get started

Download the skill from https://github.com/silverbackmarketing/ai-readiness and use it in your favorite AI tool to generate AI visibility files for your website.

© 2026 Silverback Marketing. All rights reserved.