TL;DR
This glossary translates the moving pieces of Answer Engine Optimisation (AEO) into plain English. You’ll find concise, accurate definitions with links to primary sources so you can brief stakeholders, align teams, and make confident decisions about the files, tags, and structures that help AI assistants find, understand, cite, and action your brand. Where a term is a Rankmeon-specific concept, we label it clearly.
Why this glossary exists
AI assistants (ChatGPT, Claude, Perplexity, Gemini and others) are becoming a first stop for product research and support. Instead of ranking links, these systems assemble answers and often cite sources. To participate, your site needs to be machine-readable and policy-clear. That involves a handful of technical artifacts (for example robots.txt, sitemap.xml, schema markup) with rules defined in open documentation such as Google Search Central, Schema.org and Sitemaps.org. This glossary consolidates those definitions and adds AEO-specific language so technical and non-technical stakeholders share the same map.
Core AEO terms
Answer Engine Optimisation (AEO)
The practice of structuring your site and content so AI assistants can discover, understand, cite, and enable actions related to your brand. AEO builds on web standards (robots.txt, sitemaps, structured data) and embraces assistant-specific affordances like llms.txt (a proposal; see below), plus clear policy and pricing pages that can be quoted verbatim. Sources for underlying web standards: Google’s robots.txt guidance, Sitemaps.org protocol, and Schema.org vocabulary.
Answer Engine
An AI-powered system that returns direct answers (often with citations) instead of a ranked list of links. Examples include ChatGPT, Claude, Perplexity, and Google’s AI features. Mechanistically, many of these systems combine model knowledge with real-time retrieval, and their crawlers are expected to respect publisher policies such as robots.txt.
AI Crawler (User Agent)
Automated software used by AI companies to fetch public pages. Examples include OpenAI’s GPTBot, Anthropic’s ClaudeBot, and PerplexityBot. Their behaviour is documented (to varying degrees) and commonly governed by robots.txt. See OpenAI’s GPTBot page and official support docs from Anthropic and Perplexity.
Citations (AI)
When an assistant attributes part of its answer to a source page. In Perplexity, for instance, citations appear as linked sources; in ChatGPT’s browsing or enterprise retrieval modes, citations may surface as footnotes. Citation visibility and rules vary by product and are evolving.
Discoverability (AEO)
How easily assistants and their crawlers can find your content. Practically: publish a valid robots.txt, expose sitemap.xml (and index files if needed), and add structured data so parsers can identify entities and facts.
Action Enablement
Your readiness for assistants to trigger useful outcomes (book a demo, start a checkout, open a support chat). This relies on clear calls-to-action, publicly documented flows, and machine-readable endpoints where applicable.
AI Visibility (or AI Visibility Score) (Rankmeon concept)
A 0–100 scoring approach we use to summarise how assistants currently represent your brand across dimensions like discoverability, citations, pricing clarity, answer readiness, and action enablement. See About Rankmeon.ai for the six dimensions we audit.
Web-standard building blocks
robots.txt
A plain-text file at the domain root (for example https://example.com/robots.txt) that tells crawlers which paths are allowed or disallowed. It’s not a security mechanism and doesn’t remove URLs from Google on its own; it manages crawl access. Google documents the file format and how it interprets rules.
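To make the allow/disallow idea concrete, here is a minimal sketch that parses a robots.txt file with Python's standard-library `urllib.robotparser`. The file content, the `/admin/` path, and the `ExampleBot` user agent are illustrative assumptions, not recommendations. Note that Python's parser applies the first matching rule, whereas Google documents most-specific-match precedence; the two agree for this simple file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for example.com (illustrative paths only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /

Sitemap: https://example.com/sitemap.xml
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a made-up user agent; it falls under the "*" group.
blog_allowed = parser.can_fetch("ExampleBot", "https://example.com/blog/post")
admin_allowed = parser.can_fetch("ExampleBot", "https://example.com/admin/users")

print(blog_allowed, admin_allowed)  # True False
print(parser.site_maps())           # ['https://example.com/sitemap.xml']
```

A check like this only tells you how a compliant crawler would read the file; it says nothing about whether a given crawler actually complies.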
Sitemap / sitemap.xml
An XML file listing URLs you want crawled, with optional metadata (lastmod, changefreq, priority). Google states that it ignores changefreq and priority, so an accurate lastmod is the field worth maintaining. Place the file at a stable URL and reference it from robots.txt with a Sitemap: directive. The canonical protocol is described at Sitemaps.org, with Google providing implementation guidance.
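As a rough illustration of the file's shape, the sketch below builds a small sitemap with Python's standard-library `xml.etree.ElementTree`. The URLs and dates are placeholders, and `build_sitemap` is a hypothetical helper name; only the namespace and the `urlset`/`url`/`loc`/`lastmod` elements come from the Sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries: list[tuple[str, str]]) -> str:
    """Return sitemap XML for (url, lastmod) pairs.

    lastmod is expected as a W3C date string such as "2025-01-15".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs and dates, for illustration only.
sitemap_xml = build_sitemap([
    ("https://example.com/", "2025-01-15"),
    ("https://example.com/pricing", "2025-01-10"),
])
print(sitemap_xml)
```

In practice most sites generate this file from their CMS or build pipeline; the point here is simply that each `url` entry pairs a location with a last-modified date.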
Schema Markup (Structured Data)
Machine-readable annotations (commonly JSON-LD) using the Schema.org vocabulary to describe entities (Organization, Product, FAQPage, Review, HowTo, etc.). Correct markup helps parsers disambiguate names, addresses, products, and relationships. Start with Organization to ground brand identity data.
Organization schema
A Schema.org type for representing your company or project. Include identifiers such as name, URL, logo, sameAs links (e.g., LinkedIn, Crunchbase), contact points, and address where relevant, to reduce ambiguity across assistants that reference knowledge graphs.
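The fields above map onto a JSON-LD object fairly directly. The following is a minimal sketch using Python's `json` module; the company name, URLs, and the `organization_jsonld` helper are placeholders we made up for illustration, while `@context`, `@type`, `name`, `url`, `logo`, and `sameAs` are Schema.org terms.

```python
import json


def organization_jsonld(name: str, url: str, logo: str, same_as: list[str]) -> dict:
    """Build a Schema.org Organization object as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
        "sameAs": list(same_as),
    }


# Placeholder values; replace with your organisation's real identifiers.
org = organization_jsonld(
    name="Example Co",
    url="https://example.com",
    logo="https://example.com/logo.png",
    same_as=["https://www.linkedin.com/company/example-co"],
)

# JSON-LD is commonly embedded in a page inside a script tag of this type.
script_tag = (
    '<script type="application/ld+json">\n'
    + json.dumps(org, indent=2)
    + "\n</script>"
)
print(script_tag)
```

Keeping the values in one place like this makes it easier to ensure the name, URL, and sameAs links stay consistent across every page that embeds them.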
Assistant and crawler controls
Google-Extended
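A robots.txt product token that lets publishers control whether content Google crawls from their site may be used to train Google’s generative AI models (introduced in the Bard era; now Gemini and Vertex AI). It is not a separate crawler: you set it with a User-agent: Google-Extended group in robots.txt, and it does not affect a site’s inclusion in Google Search. Coverage and semantics have evolved; see reporting on Google’s introduction of Google-Extended.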
GPTBot
OpenAI’s crawler. You can allow or block it in robots.txt. OpenAI publishes the user agent and policy page. Third-party trackers (e.g., Dark Visitors) also document observed behaviour.
ClaudeBot
Anthropic’s crawler. The company explains how to opt out via robots.txt in its support docs.
PerplexityBot
Perplexity’s crawler for surfacing links and powering live answers. Documentation covers its user agent and recommended robots.txt allowances. Note that recent reporting alleges “stealth crawling” behaviour that bypasses blocks; Perplexity disputes these claims. Treat all crawler controls as best-effort rather than absolute.
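The controls in this section all come down to per-agent groups in robots.txt. The sketch below shows one hypothetical policy and uses Python's `urllib.robotparser` to report what a compliant parser would conclude for each token. The policy itself (which agents are blocked from which paths) and the `access_report` helper are assumptions for illustration, not a recommendation, and the check models the file only, not any crawler's real behaviour.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: paths and choices are illustrative, not advice.
AI_POLICY = """\
User-agent: GPTBot
Disallow: /drafts/

User-agent: ClaudeBot
User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Disallow: /admin/
"""

AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def access_report(robots_text: str, agents: list[str], url: str) -> dict[str, bool]:
    """Map each user-agent token to whether the rules permit fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


pricing = access_report(AI_POLICY, AGENTS, "https://example.com/pricing")
drafts = access_report(AI_POLICY, AGENTS, "https://example.com/drafts/launch")

print(pricing)
print(drafts)
```

Under this policy, GPTBot can read the pricing page but not drafts, ClaudeBot and PerplexityBot can read both, and the Google-Extended token is disallowed everywhere.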
AEO-specific artefacts and proposals
llms.txt
A community proposal, not a formal standard, to publish a Markdown-like file at /llms.txt with assistant-friendly pointers (what to read, which endpoints to use, status pages, and links to deeper machine-readable docs). Think of it as “documentation for LLMs and agents,” discoverable from your root path. The proposal is documented at llmstxt.org; some developers also experiment with embedding llms.txt content inline in HTML.
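For a sense of what such a file looks like, here is a minimal sketch that renders one in the layout described at llmstxt.org: an H1 title, a blockquote summary, then H2 sections containing Markdown link lists. The site name, section names, URLs, and the `render_llms_txt` helper are all hypothetical.

```python
def render_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists.

    sections maps a heading to (link name, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content for a hypothetical company site.
llms_txt = render_llms_txt(
    title="Example Co",
    summary="Example Co sells scheduling software for small clinics.",
    sections={
        "Docs": [
            ("Pricing", "https://example.com/pricing.md", "Current plans and limits"),
            ("API reference", "https://example.com/api.md", "Endpoints for booking"),
        ],
        "Policies": [
            ("AI data use", "https://example.com/ai-policy.md", "Training and citation terms"),
        ],
    },
)
print(llms_txt)
```

Because the proposal is not a standard, treat the output as a convenience for agents that choose to read it rather than something any assistant is guaranteed to consume.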
Answer Cards (Rankmeon concept)
Small, structured JSON documents that state verifiable facts (e.g., “What is it?”, “Who is it for?”, “Pricing from…”, “How to buy”) to make responses quotable and reduce hallucination. We design these to mirror how assistants extract snippets.
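As a loose illustration only, the sketch below models a card as a question, an answer, and the page that backs it, then serialises a set of cards to JSON. The field names (`question`, `answer`, `source_url`), the `AnswerCard` class, and the sample content are hypothetical and do not describe a published Rankmeon schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AnswerCard:
    """Hypothetical card shape: one verifiable fact plus its source page."""
    question: str
    answer: str
    source_url: str


def cards_to_json(cards: list[AnswerCard]) -> str:
    """Serialise cards, rejecting any card without an HTTPS source."""
    for card in cards:
        if not card.source_url.startswith("https://"):
            raise ValueError(f"Card needs an https source: {card.question!r}")
    return json.dumps([asdict(card) for card in cards], indent=2)


# Placeholder facts for a made-up product.
cards = [
    AnswerCard(
        question="What is it?",
        answer="Example Co is scheduling software for small clinics.",
        source_url="https://example.com/about",
    ),
    AnswerCard(
        question="How to buy",
        answer="Book a demo from the pricing page.",
        source_url="https://example.com/pricing",
    ),
]
cards_json = cards_to_json(cards)
print(cards_json)
```

Requiring a source URL on every card is one simple way to keep each stated fact checkable against a public page.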
Content and credibility concepts
Quotable Blocks
Sentences or bullets crafted to be copied verbatim by assistants. Keep them factual, short, and source-backed. Example: “Our platform supports JSON-LD schema for Product and Organization.”
Canonical URL
The preferred URL for a resource when duplicates exist. Canonicals help consolidate signals so crawlers (and assistants using web indices) understand which page to represent.
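In HTML, the canonical is usually declared with a `link` element whose `rel` is `canonical`. The sketch below extracts it with Python's standard-library `html.parser`; the sample page and the `find_canonical` helper are assumptions for illustration, and a production check would also need to consider HTTP `Link` headers and redirects.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens; compare case-insensitively.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the declared canonical URL, or None if the page has none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Sample page: a tracking-parameter variant pointing at its clean URL.
PAGE = """
<html><head>
  <title>Pricing</title>
  <link rel="stylesheet" href="/site.css">
  <link rel="canonical" href="https://example.com/pricing">
</head><body>Plans</body></html>
"""

print(find_canonical(PAGE))               # https://example.com/pricing
print(find_canonical("<html></html>"))    # None
```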
Open Graph (OG) / Social Cards
Metadata and preview images that control how your content appears when shared. While OG is not strictly for AEO, strong previews increase the chances that cited links are clicked and re-shared, which may indirectly help wherever an assistant or its underlying index weighs engagement.
FAQPage / HowTo / Product schema
Schema types that frame facts and steps in machine-readable form. Appropriate use increases answer-readiness for common questions. See Schema.org for full definitions.
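To show how question-and-answer content becomes machine-readable, here is a minimal sketch that builds FAQPage JSON-LD in Python. The `mainEntity`, `Question`, `acceptedAnswer`, and `Answer` terms are from Schema.org; the `faq_jsonld` helper is a hypothetical name, and the sample entries are taken from this glossary's own FAQ.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a Schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = faq_jsonld([
    ("Do I need llms.txt?", "It's optional and not a standard."),
    ("Will robots.txt stop every AI crawler?", "No."),
])
print(json.dumps(faq, indent=2))
```

The markup should mirror text that is visible on the page; the structured version restates the answers, it does not replace them.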
Governance, policies, and ethical controls
Data Use Policy (AI)
A page explaining whether and how you permit AI training or assistant use of your content. Link it from robots.txt and llms.txt if you maintain explicit policies.
Opt-out vs. Access Control
Opt-out signals (robots.txt, Google-Extended) are voluntary: many crawlers honour them, but not all, and the 2025 allegations of crawler evasion show the limits of such protocols. Access control (authentication, paywalls, WAF rules) enforces policy on the server, although protected endpoints may still be probed. Plan accordingly.
Practical AEO metrics (Rankmeon framing)
Discovery Rate
Share of high-value pages present in your sitemaps and reachable (200 status, not blocked).
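One way to compute this, sketched under our own assumptions about the inputs: a page counts only if it is listed in a sitemap, returned HTTP 200 when last checked, and is not blocked by robots.txt. The function name, parameter names, and sample data below are hypothetical; gathering the inputs (crawling, parsing sitemaps) is out of scope here.

```python
def discovery_rate(
    high_value: list[str],
    in_sitemap: set[str],
    status_codes: dict[str, int],
    blocked: set[str],
) -> float:
    """Share of high-value URLs that are in a sitemap, return 200, and are unblocked."""
    if not high_value:
        return 0.0
    discoverable = [
        url
        for url in high_value
        if url in in_sitemap
        and status_codes.get(url) == 200
        and url not in blocked
    ]
    return len(discoverable) / len(high_value)


# Made-up audit data for four pages.
pages = ["/", "/pricing", "/docs", "/contact"]
rate = discovery_rate(
    high_value=pages,
    in_sitemap={"/", "/pricing", "/docs"},              # /contact is missing
    status_codes={"/": 200, "/pricing": 404, "/docs": 200, "/contact": 200},
    blocked={"/docs"},                                   # disallowed in robots.txt
)
print(rate)  # 0.25
```

Here only the home page passes all three checks, so the rate is one in four.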
Citation Rate
How often assistants include your page as a referenced source for relevant prompts.
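A simple way to express this as a number, assuming you have already recorded which URLs each assistant answer cited for a set of test prompts: count the prompts where at least one citation points at your domain. The data shape, the `citation_rate` helper, and the sample prompts are hypothetical.

```python
from urllib.parse import urlparse


def is_on_domain(url: str, domain: str) -> bool:
    """True if url's host is the domain itself or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return host == domain or host.endswith("." + domain)


def citation_rate(observations: dict[str, list[str]], domain: str) -> float:
    """Share of prompts whose cited URLs include at least one page on domain."""
    if not observations:
        return 0.0
    cited = sum(
        1
        for urls in observations.values()
        if any(is_on_domain(url, domain) for url in urls)
    )
    return cited / len(observations)


# Made-up observations: prompt -> URLs the assistant cited in its answer.
observed = {
    "best clinic scheduling software": [
        "https://example.com/pricing",
        "https://reviews.example.org/top-10",
    ],
    "how to book a demo with Example Co": ["https://docs.example.com/demo"],
    "clinic software alternatives": ["https://reviews.example.org/alternatives"],
}
rate = citation_rate(observed, "example.com")
print(f"{rate:.0%} of prompts cited example.com")
```

Because assistant answers vary between runs, a figure like this is best treated as a sample over repeated prompts rather than a fixed property of the site.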
Action Readiness
How many assistant answers can end in a real business action because your page provides unambiguous contact, pricing, or booking flows.
Frequently Asked Questions
Is AEO just “SEO for AI”?
AEO uses some of the same primitives (robots.txt, sitemaps, schema) but optimises for assistant answers rather than SERP rankings. The goal is to be cited and actioned inside conversations, not merely listed.
Do I need llms.txt?
It’s optional and not a standard. It can still be useful as a discoverable map for agents. If you use it, label it clearly and keep it consistent with your policies.
Will robots.txt stop every AI crawler?
No. Reputable crawlers document and honour robots.txt; others may not. Recent reporting alleges evasion by some bots, which those companies dispute. Use layered controls if content sensitivity is high.
What structured data should every company start with?
Organization plus any relevant complements (Product, FAQPage, HowTo, LocalBusiness). Ensure fields like name, url, logo, and sameAs are correct.
The Bottom Line
AEO is the operational layer that turns your site into assistant-ready knowledge: findable via sitemaps and robots.txt, understandable via schema, citable via quotable facts, and actionable through clear flows. Use this glossary to align teams and as a checklist for implementation.
Want a tailored glossary for your stack and policies? Start an AEO audit with Rankmeon.ai.
References (Harvard style)
- Google Developers (n.d.) Introduction to robots.txt. Available at: https://developers.google.com/search/docs/crawling-indexing/robots/intro.
- Google Developers (n.d.) How Google interprets robots.txt. Available at: https://developers.google.com/search/docs/crawling-indexing/robots/robots_txt.
- Sitemaps.org (n.d.) Sitemaps XML protocol. Available at: https://www.sitemaps.org/protocol.html.
- Google Developers (n.d.) Build and submit a sitemap. Available at: https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap.
- Schema.org (n.d.) Organization (type) and Schemas overview. Available at: https://schema.org/Organization and https://schema.org/docs/schemas.html.
- OpenAI (n.d.) GPTBot. Available at: https://platform.openai.com/docs/bots.
- Anthropic (n.d.) Does Anthropic crawl data from the web? Available at: https://support.claude.com/en/articles/8896518-does-anthropic-crawl-data-from-the-web-and-how-can-site-owners-block-the-crawler.
- Perplexity (n.d.) Perplexity Crawlers. Available at: https://docs.perplexity.ai/guides/bots.
- Search Engine Land (2023) Google introduces Google-Extended… Available at: https://searchengineland.com/google-extended-crawler-432636.
- The Verge (2025) Cloudflare says Perplexity’s AI bots are ‘stealth crawling’… Available at: https://www.theverge.com/news/718319/perplexity-stealth-crawling-cloudflare-ai-bots-report.
- Business Insider (2025) An AI data trap catches Perplexity impersonating Google… Available at: https://www.businessinsider.com/ai-data-trap-catches-perplexity-impersonating-google-cloudflare-2025-8.
- TechRadar Pro (2025) Perplexity accused of breaking a major online AI scraping rule… Available at: https://www.techradar.com/pro/cloudflare-says-perplexity-is-breaking-a-major-online-ai-scraping-rule.
- llmstxt.org (2024) The /llms.txt file (proposal). Available at: https://llmstxt.org/.
- Vercel (2025) A proposal for inline LLM instructions in HTML based on llms.txt. Available at: https://vercel.com/blog/a-proposal-for-inline-llm-instructions-in-html.