AI Search Optimization Glossary

Definitions of the key terms used in generative engine optimization (GEO), AI search readiness, and LLM citation practices. Use this as your reference when reading our audit reports and blog articles.

31 terms · last updated March 2026

AI Search Readiness

A measure of how well a website is prepared to be discovered, parsed, and cited by AI-powered search engines such as ChatGPT, Perplexity, and Google AI Overviews. Readiness is evaluated across four dimensions: Machine Readability, Extractability, Trust & Authority, and Offering Readiness.

Generative Engine Optimization (GEO)

The practice of optimising web content to be cited as a source in AI-generated responses from large language models (LLMs). GEO is the AI-era equivalent of traditional SEO: instead of ranking in a list of blue links, the goal is to appear as a cited source in a generated answer. Also referred to as Answer Engine Optimization (AEO) or LLM SEO.

Answer Engine Optimization (AEO)

Optimising content to be surfaced as an answer by AI-powered query engines (Perplexity, ChatGPT, Google AI Overviews, Bing Copilot). AEO focuses on content structure, extractability, and trust signals rather than traditional keyword density and backlink metrics.

LLM SEO

Informal term for the set of technical and content practices that help large language models (LLMs) discover, trust, and cite a website. Encompasses structured data, content clarity, trust signals, and technical accessibility for AI crawlers.

Machine Readability (MR)

One of four AI Search Readiness baskets (weight: 25%). Measures whether AI bots can technically access and parse a page: proper robots.txt configuration, absence of JavaScript-only rendering, valid SSL, canonical URLs, and complete meta tags. A site with low MR is functionally invisible to AI crawlers regardless of content quality.

Extractability (EX)

One of four AI Search Readiness baskets (weight: 30%). The highest-weighted dimension — measures whether an LLM can extract a specific, quotable answer from a page. Key signals: FAQ blocks, clear heading hierarchy (H1/H2), content depth, meta description quality, and BLUF structure. AI systems do not cite pages; they cite passages.

Trust & Authority (TR)

One of four AI Search Readiness baskets (weight: 25%). Measures whether AI systems have sufficient reason to trust a source: verified business identity (NAP), customer reviews with AggregateRating schema, named authorship, GTIN/MPN product identifiers, and accessible contact and privacy pages.

Offering Readiness (OR)

One of four AI Search Readiness baskets (weight: 20%). Measures how well product and service data is machine-readable: image alt text coverage, price and currency in Schema.org Offer markup, category breadcrumbs, and overall product content quality. Most relevant to e-commerce; informational sites naturally score lower.
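Taken together, the four basket weights (25% + 30% + 25% + 20%) sum to 100%. As an illustrative sketch, an overall readiness score could be combined from the four subscores like this — the function name and the 0–100 scale are our assumptions, not a published formula:

```python
# Basket weights as defined above: MR 25%, EX 30%, TR 25%, OR 20%.
WEIGHTS = {"MR": 0.25, "EX": 0.30, "TR": 0.25, "OR": 0.20}

def overall_readiness(subscores: dict) -> float:
    """Weighted average of the four basket subscores (each 0-100)."""
    return sum(WEIGHTS[basket] * subscores[basket] for basket in WEIGHTS)

# Example: strong Machine Readability but weak Trust & Authority.
print(overall_readiness({"MR": 80, "EX": 40, "TR": 20, "OR": 60}))  # → 49.0
```

Note how the weighting makes Extractability failures the most expensive: a 10-point drop in EX costs 3 overall points, versus 2 for the same drop in Offering Readiness.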

Citation Rate

The percentage of relevant queries for which an AI search engine (Perplexity, ChatGPT, etc.) cites a specific website as a source. A citation rate of 0% means the site is never referenced as an authority, even for queries directly related to its products or expertise.

AI Crawler

An automated bot that indexes web content for use in AI-generated responses. Major AI crawlers include GPTBot (OpenAI), PerplexityBot, and ClaudeBot (Anthropic); Google AI Overviews draws on content crawled by the standard Googlebot. Unlike Googlebot, dedicated AI crawlers typically have a limited JavaScript execution budget, making server-rendered content critical.

JavaScript Rendering Penalty

A scoring reduction applied when a page's static HTML contains fewer than 50 words of visible text, indicating that content is loaded via client-side JavaScript. Because AI crawlers often do not execute JavaScript fully, such pages are treated as near-blank by the scoring model. The Machine Readability subscore is multiplied by 0.5.
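A minimal sketch of how such a check could work, assuming a simple tag-stripping word count (real scoring models parse visible text more carefully):

```python
import re

def apply_js_penalty(static_html: str, mr_subscore: float) -> float:
    """Halve the Machine Readability subscore when the static HTML
    carries fewer than 50 words of visible text (threshold per the
    definition above; the HTML parsing here is a simplification)."""
    text = re.sub(r"<script.*?</script>", " ", static_html, flags=re.S | re.I)
    text = re.sub(r"<[^>]+>", " ", text)  # strip remaining tags
    word_count = len(text.split())
    return mr_subscore * 0.5 if word_count < 50 else mr_subscore
```

A page whose static HTML is just `<div id="root"></div>` plus script tags would score zero visible words and take the 0.5 multiplier.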

Schema.org / JSON-LD

A shared vocabulary of structured data types (Organization, Product, FAQ, Person, Article, etc.) and a serialisation format (JSON-LD) used to embed machine-readable metadata in web pages. Schema.org markup is the primary channel through which AI search engines build a factual understanding of a page's content, independent of prose text.
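As an illustration (all values are placeholders), a minimal Organization block embedded in a page's `<head>` looks like:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example GmbH",
  "url": "https://www.example.com",
  "logo": "https://www.example.com/logo.png",
  "sameAs": ["https://www.linkedin.com/company/example"]
}
</script>
```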

Structured Data

Machine-readable metadata embedded in a web page, typically as JSON-LD or Microdata following Schema.org types. Structured data allows AI crawlers to extract facts (price, product name, review rating, business address) without parsing unstructured text. Its absence is one of the most common reasons e-commerce sites are ignored by AI search.

E-E-A-T

Google's quality framework: Experience, Expertise, Authoritativeness, and Trustworthiness. Originally from Google's Search Quality Evaluator Guidelines, E-E-A-T signals are increasingly used across AI systems to judge whether a source is safe to cite. Key signals include named authors with credentials, business identity (NAP), external references, and review data.

NAP (Name, Address, Phone)

The triad of business identity signals used by search engines and AI systems to verify that a website represents a real, operating business. Consistent NAP data — especially when marked up with Schema.org/LocalBusiness — is a strong trust signal. Inconsistency between NAP on the site and NAP in external directories (Google Business, Yelp) reduces trust scores.
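A minimal LocalBusiness markup carrying all three NAP signals might look like this (placeholder values):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Store",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Main St",
    "addressLocality": "Berlin",
    "postalCode": "10115",
    "addressCountry": "DE"
  },
  "telephone": "+49-30-1234567"
}
</script>
```

The address and phone number here should match the values published in external directories character for character.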

BLUF (Bottom Line Up Front)

A writing structure that places the key conclusion or answer at the very beginning of a section, before supporting details. BLUF-structured content is highly extractable by AI models because the core answer appears in the first sentence — matching how LLMs prefer to retrieve and quote sources. The inverse (burying the answer after background context) reduces extractability.
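A contrived before/after illustrates the difference:

```text
Buried:  "Many factors affect delivery times, including carrier load and
          customs processing. ... In most cases, orders arrive within
          3-5 business days."
BLUF:    "Orders arrive within 3-5 business days. Delivery can vary with
          carrier load and customs processing."
```

In the BLUF version the quotable answer is the first sentence, which is exactly the span a retrieval system is most likely to extract.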

GPTBot

OpenAI's web crawler, used to collect content for use in ChatGPT's answer features. GPTBot respects robots.txt disallow rules. OpenAI also operates the separate OAI-SearchBot for ChatGPT search, so sites that block both crawlers via robots.txt will not be cited by ChatGPT. Identifiable by the user-agent string "GPTBot".

PerplexityBot

Perplexity AI's web crawler. PerplexityBot has a limited JavaScript execution budget, making it particularly sensitive to client-side rendering. Pages with content loaded via React, Vue, or Angular without server-side rendering are frequently invisible to PerplexityBot.

FAQPage Schema

A Schema.org structured data type that marks up a page's Q&A content as a machine-readable FAQ. FAQPage schema is one of the strongest extractability signals: AI models trained on Q&A data naturally prefer sources that present information in question-and-answer format. Google also uses FAQPage schema for rich results in traditional search.
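A minimal single-question example (placeholder content):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "How long does shipping take?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Orders ship within 2 business days."
    }
  }]
}
</script>
```

Each additional Q&A pair is another `Question` object in the `mainEntity` array, and the visible page content should repeat the same questions and answers in prose.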

Citation Monitoring

The practice of systematically querying AI search engines (Perplexity, ChatGPT, etc.) with queries relevant to a website's topic area, and recording whether the site appears as a cited source. Used to measure citation rate and track the impact of AI search optimisation work over time.

Google AI Overviews

Google's AI-generated answer feature (formerly Search Generative Experience / SGE) that displays a synthesized response at the top of search results, citing 3–5 sources inline. AI Overviews appear for an estimated 30%+ of Google searches as of 2026. Unlike traditional search results, AI Overviews select sources based on answer-readiness and trust signals, not just ranking position.

RAG (Retrieval-Augmented Generation)

An AI architecture pattern where a language model retrieves relevant documents from a knowledge base before generating a response. AI search engines like Perplexity and ChatGPT use RAG to ground their answers in real web content. For website owners, RAG-readiness means structuring content so that retrieval systems can identify, chunk, and extract relevant passages efficiently.

llms.txt

An emerging convention (similar to robots.txt) where a website places a machine-readable file at /llms.txt describing its content, structure, and key offerings for AI crawlers. The file helps LLMs understand a site's purpose without parsing every page. Not yet an official standard, but gaining adoption in the GEO community.
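Because llms.txt is not yet standardised, formats vary; the commonly proposed shape is a short markdown file, along these lines (placeholder content):

```markdown
# Example Store

> Online retailer of trail-running gear, shipping across the EU.

## Key pages

- [Product catalogue](https://www.example.com/products): full range with prices
- [Shipping & returns](https://www.example.com/shipping): policies and timelines
```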

AggregateRating Schema

A Schema.org type that encodes a product or service's average rating and review count in machine-readable format. AggregateRating is a key trust signal for AI citation decisions: our audit data shows that 91% of websites have zero review markup, making it the single most common AI readiness failure.
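AggregateRating is typically nested inside a Product block, for example (placeholder values):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Trail Runner X",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "132"
  }
}
</script>
```

The rating and count should reflect real, verifiable reviews; fabricated values risk manual penalties in traditional search as well.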

GTIN / MPN

Global Trade Item Number (GTIN) and Manufacturer Part Number (MPN) — standardized product identifiers that AI search engines use to verify product identity and match products across sources. Including GTIN or MPN in Schema.org/Product markup enables AI shopping features (ChatGPT Shopping, Google Shopping) to confidently reference your products. 66% of e-commerce sites in our audit data lack these identifiers.
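In Schema.org/Product markup, these identifiers are plain properties (the GTIN below is a documentation example, not a real product):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Trail Runner X",
  "gtin13": "4006381333931",
  "mpn": "TRX-2024",
  "brand": { "@type": "Brand", "name": "ExampleBrand" }
}
</script>
```

Use `gtin13` for EAN-13 barcodes, `gtin12` for UPC, or the generic `gtin` property; supply `mpn` together with `brand` when no GTIN exists.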

Server-Side Rendering (SSR)

A web rendering approach where HTML is generated on the server and sent to the client as complete markup. SSR is critical for AI search readiness because AI crawlers (GPTBot, PerplexityBot) have limited JavaScript execution capability. Content that only renders via client-side JavaScript (React CSR, Vue SPA) is often invisible to AI crawlers, triggering the JavaScript Rendering Penalty.

Content Depth

A measure of the substantive information density on a page, evaluated by word count, topic coverage, and the presence of supporting evidence (data, examples, citations). Pages with fewer than 300 words are typically classified as thin content by AI systems. AI search engines prefer comprehensive, in-depth sources over superficial summaries when selecting citation targets.

Authorship Signals

On-page indicators that identify who created the content: author byline, author bio, professional credentials, links to social profiles (LinkedIn, Twitter), and Schema.org/Person markup. AI engines use authorship signals as a trust proxy — content with a named, verifiable author is more likely to be cited than anonymous content. 60% of sites in our audit lack these signals.

Trust Gap

The disconnect between a website's traditional SEO health and its AI search readiness, identified in our audit of 98 websites. Sites with excellent technical SEO (95%+ pass rates on SSL, mobile, robots.txt) simultaneously show 50–91% failure rates on trust signals (review markup, authorship, product identifiers). The Trust Gap explains why sites that rank well in Google can be completely absent from AI-generated answers.

ChatGPT Shopping

OpenAI's product recommendation and shopping feature within ChatGPT, which surfaces product suggestions with images, prices, and direct links. ChatGPT Shopping relies on Schema.org/Product and Schema.org/Offer markup, AggregateRating data, and crawlable product pages. Products without structured data are invisible to this feature.

robots.txt

A text file at the root of a website (/robots.txt) that instructs web crawlers which pages they may or may not access. For AI search readiness, the critical concern is whether AI-specific crawlers (GPTBot, PerplexityBot, ClaudeBot, OAI-SearchBot) are allowed or blocked. Many CMS platforms and security plugins block AI crawlers by default, making the site invisible to AI search engines.
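A minimal robots.txt that explicitly allows the major AI crawlers named above (user-agent tokens as documented by each vendor) looks like:

```txt
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /
```

Worth checking after any CMS or security-plugin update, since a single `Disallow: /` rule under one of these tokens silently removes the site from that engine's answers.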

Want to see how your site scores on these dimensions?

Run a free audit · Scoring methodology