Scoring Methodology
A transparent breakdown of how the AI Search Readiness Score is calculated: why we chose these four dimensions, how we weighted them, and what informed each check.
Model version: 1.0 — last updated March 2026
The Formula
Each basket score is calculated from its constituent checks and normalised to 0–100. The final score is a weighted average of the four basket scores. Checks are grouped into three tiers:
- Core (9 checks) — visible to all users on the free tier.
- Premium (13 checks) — available on the Starter and Pro plans.
- LLM (4 checks) — GPT-4o evaluations, available on the Pro plan.
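The weighted average can be sketched as follows. This is a minimal illustration, not the production implementation: the per-check normalisation is simplified to an equal-weight pass rate, and the basket weights (25/30/25/20) are the ones stated in the basket descriptions below.

```python
# Basket weights as stated in this document (hypothetical key names).
WEIGHTS = {
    "machine_readability": 0.25,
    "extractability": 0.30,
    "trust_authority": 0.25,
    "offering_readiness": 0.20,
}

def basket_score(check_results: list[bool]) -> float:
    """Normalise a basket's pass/fail checks to 0-100 (simplified: equal weights)."""
    if not check_results:
        return 0.0
    return 100.0 * sum(check_results) / len(check_results)

def final_score(baskets: dict[str, list[bool]]) -> float:
    """Weighted average of the four basket scores."""
    return sum(WEIGHTS[name] * basket_score(checks)
               for name, checks in baskets.items())

score = final_score({
    "machine_readability": [True, True, False, True],  # 75
    "extractability": [True, False],                   # 50
    "trust_authority": [True, True],                   # 100
    "offering_readiness": [False, True],               # 50
})
# 0.25*75 + 0.30*50 + 0.25*100 + 0.20*50 = 68.75
```

In practice, individual checks would carry their own weights (the FAQ signal, for instance, is described below as weighted heavily within Extractability), but the aggregation shape is the same.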
JS Dependency Penalty
If the static HTML of a page contains fewer than 50 words of visible text, the Machine Readability (MR) subscore is multiplied by 0.5. This penalty models the fact that AI crawlers operating on a limited compute budget often do not execute client-side JavaScript. A React app that renders a blank HTML shell is functionally invisible to most AI bots, regardless of how well the rest of the page is optimised.
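The penalty rule above can be sketched like this. The word-counting here is deliberately crude (regex tag stripping); a production check would use a real HTML parser, but the threshold (50 words) and multiplier (0.5) are the ones stated above.

```python
import re

JS_PENALTY = 0.5      # multiplier from the rule above
WORD_THRESHOLD = 50   # minimum visible words in the static HTML

def visible_word_count(static_html: str) -> int:
    """Crude visible-word count: drop <script>/<style> bodies, then all tags."""
    text = re.sub(r"<(script|style)[^>]*>.*?</\1>", " ", static_html,
                  flags=re.DOTALL | re.IGNORECASE)
    text = re.sub(r"<[^>]+>", " ", text)
    return len(text.split())

def apply_js_penalty(mr_subscore: float, static_html: str) -> float:
    if visible_word_count(static_html) < WORD_THRESHOLD:
        return mr_subscore * JS_PENALTY
    return mr_subscore

# A blank React shell: almost no static text, so the penalty applies.
shell = "<html><body><div id='root'></div><script src='app.js'></script></body></html>"
```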
The Four Baskets
Machine Readability
25% weight
A bot must be able to access and parse a page before any other signal matters. This basket checks for technical blockers: JavaScript-only rendering, robots.txt disallow rules, SSL issues, and missing canonical URLs. We weighted it at 25% because it is a binary gate — low MR renders all other signals irrelevant — but it is also the easiest dimension to fix, so it should not dominate the score.
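One of the robots.txt gates can be illustrated like this. The user-agent strings are real AI crawler names; the parsing logic is a simplified sketch (one agent group per `User-agent` line), not the product's actual check.

```python
# AI crawler user agents to look for (real bot names; list is illustrative).
AI_BOTS = ("GPTBot", "PerplexityBot", "Google-Extended", "ClaudeBot")

def blocked_ai_bots(robots_txt: str) -> list[str]:
    """Return AI user agents that this robots.txt fully disallows."""
    current_agents: list[str] = []
    fully_blocked: dict[str, bool] = {}
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key.lower() == "user-agent":
            current_agents = [value]  # simplification: one group per agent line
        elif key.lower() == "disallow" and value == "/":
            for agent in current_agents:
                fully_blocked[agent] = True
    return [bot for bot in AI_BOTS if fully_blocked.get(bot)]

robots = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nDisallow:"
```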
Extractability
30% weight
This is the highest-weighted basket (30%) because it directly models the core action an LLM takes when citing a source: extracting a specific, quotable answer. AI systems do not cite pages — they cite passages. If a page buries its key claims in long prose, mixes topics, or lacks structured headers, an LLM will pass over it in favour of a cleaner source. The FAQ signal is weighted heavily here because Q&A format is the closest natural-language approximation to the LLM's own output format.
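An extractability signal of this kind could be detected with a heuristic like the one below. This is purely illustrative — the document does not specify how the FAQ check is implemented — treating a heading that ends in a question mark as one Q&A pair.

```python
import re

# Hypothetical heuristic: an <h2>-<h4> heading ending in "?" counts as
# one FAQ-style Q&A pair. Not the product's actual check.
QUESTION_HEADING = re.compile(r"<h[2-4][^>]*>[^<]*\?</h[2-4]>", re.IGNORECASE)

def count_faq_pairs(html: str) -> int:
    """Count question-style headings in the page's HTML."""
    return len(QUESTION_HEADING.findall(html))

page = """
<h2>What is the AI Search Readiness Score?</h2><p>A 0-100 score.</p>
<h2>How often is it updated?</h2><p>On every audit run.</p>
"""
```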
Trust & Authority
25% weight
LLMs are trained with RLHF and Constitutional AI methods that penalise citing unreliable sources. In practice, this means models are biased toward sources with verifiable identity signals: a business name, physical address, customer reviews, and named authors. These are the same signals Google uses for E-E-A-T evaluation. We weighted Trust at 25% — equal to Machine Readability — because both are gate-keeping dimensions: a technically perfect but anonymous site will still be under-cited.
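A check for identity signals might look like the following sketch. The field names follow schema.org's Organization type; the pass/fail logic and the choice of required fields are assumptions for illustration.

```python
import json

# Hypothetical required identity fields on an Organization JSON-LD block.
IDENTITY_FIELDS = ("name", "address")

def organization_identity_gaps(jsonld: str) -> list[str]:
    """Return identity fields missing from an Organization JSON-LD string."""
    data = json.loads(jsonld)
    return [field for field in IDENTITY_FIELDS if field not in data]

org = json.dumps({"@type": "Organization", "name": "Acme GmbH"})
```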
Offering Readiness
20% weight
AI commerce (ChatGPT Shopping, Perplexity product cards) requires machine-readable product data: price, currency, images with alt text, breadcrumbs, and GTINs. We weighted this basket at 20% — the lowest — because it is the most domain-specific: it directly applies to e-commerce sites but is partially inapplicable to informational or service-based sites. Future versions of the score will adjust this weight by detected site type.
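A completeness check over Product JSON-LD could be sketched as below. The field names (`price`, `priceCurrency`, `gtin`, `mpn`) are schema.org vocabulary; which fields are treated as required is an assumption for this sketch.

```python
import json

# Assumed required fields on the nested Offer (schema.org names).
REQUIRED_OFFER_FIELDS = ("price", "priceCurrency")

def product_offer_gaps(jsonld: str) -> list[str]:
    """Return machine-readable product fields missing from Product JSON-LD."""
    data = json.loads(jsonld)
    gaps = []
    offer = data.get("offers", {})
    for field in REQUIRED_OFFER_FIELDS:
        if field not in offer:
            gaps.append(f"offers.{field}")
    # Accept any one of the common product identifiers.
    if not any(key in data for key in ("gtin", "gtin13", "mpn")):
        gaps.append("gtin/mpn")
    return gaps

snippet = json.dumps({
    "@type": "Product",
    "name": "Example Widget",
    "offers": {"@type": "Offer", "price": "19.99"},
})
```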
What Informed This Model
The scoring model is a synthesis of publicly available documentation and observed citation behaviour across Perplexity, ChatGPT, Google AI Overviews, and Bing Copilot. Key sources that shaped our thinking:
- Google Search Central — AI Overviews guidance: Google explicitly states that pages with clear, specific answers and strong E-E-A-T signals are preferred for inclusion in AI-generated summaries. This directly informed our Extractability and Trust baskets.
- Perplexity's crawler documentation: PerplexityBot crawls pages with a limited JavaScript execution budget. Our MR7 check (JS Rendering) and the JS Dependency Penalty model this behaviour directly.
- OpenAI's GPTBot documentation: GPTBot respects robots.txt and disallow rules. Sites blocking AI bots explicitly lose all citation potential — this is the basis of our MR1 check.
- Schema.org and Google's Structured Data guidelines: JSON-LD structured data (Product, Organization, FAQ, Article) is used by AI systems to build factual understanding of a page independently of prose content. MR3 and TR6 check for this.
- Observed citation patterns: We analysed citation sources across Perplexity responses in the e-commerce and local business verticals. The pages cited most frequently share common traits: clean NAP data, explicit FAQ sections, server-rendered content, and Product schema with price data.
Known Limitations
E-commerce bias
The Offering Readiness basket (20%) is most relevant to product-focused sites. For informational blogs, service businesses, or personal brands, OR checks like GTIN/MPN and Price are partially inapplicable. We are building site-type detection to adjust weights automatically in a future release.
Weights are hypotheses, not empirical results
The current weights (25/30/25/20) are informed by documentation analysis and observed patterns — not a controlled A/B experiment on thousands of sites. We are collecting correlation data between scores and actual citation rates from our monitoring pipeline. We will publish findings and update weights as evidence accumulates.
AI systems change constantly
Perplexity, ChatGPT, and Google AI Overviews update their retrieval and ranking logic frequently. What drives citations in Q1 2026 may shift by Q3. We treat this model as a living document and will publish changelog entries when significant updates are made.
Have a question about the methodology, want to propose a new check, or found a discrepancy between our score and observed citation behaviour?
Contact us · Run a free audit