Scoring Methodology
A transparent breakdown of how the AI Search Readiness Score is calculated: why we chose these four dimensions, how we weighted them, and what informed each check.
Model version: 1.0 — last updated March 2026
The Formula
Each basket score is calculated from its constituent checks and normalised to 0–100. The final score is a weighted average of the four basket scores. Checks are grouped into three tiers:
- Core (9 checks) — visible to all users on the free tier.
- Premium (13 checks) — available on the Starter and Pro plans.
- LLM (4 checks) — GPT-4o evaluations, available on the Pro plan.
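The weighted average can be sketched as follows. This is a minimal illustration, not the production implementation: the per-check normalisation is simplified to an equal-weight pass rate, and the basket weights (25/30/25/20) are the ones stated in the basket descriptions below.

```python
# Basket weights as stated in this document (hypothetical key names).
WEIGHTS = {
    "machine_readability": 0.25,
    "extractability": 0.30,
    "trust_authority": 0.25,
    "offering_readiness": 0.20,
}

def basket_score(check_results: list[bool]) -> float:
    """Normalise a basket's pass/fail checks to 0-100 (simplified: equal weights)."""
    if not check_results:
        return 0.0
    return 100.0 * sum(check_results) / len(check_results)

def final_score(baskets: dict[str, list[bool]]) -> float:
    """Weighted average of the four basket scores."""
    return sum(WEIGHTS[name] * basket_score(checks)
               for name, checks in baskets.items())

score = final_score({
    "machine_readability": [True, True, False, True],  # 75
    "extractability": [True, False],                   # 50
    "trust_authority": [True, True],                   # 100
    "offering_readiness": [False, True],               # 50
})
# 0.25*75 + 0.30*50 + 0.25*100 + 0.20*50 = 68.75
```

In practice, individual checks would carry their own weights (the FAQ signal, for instance, is described below as weighted heavily within Extractability), but the aggregation shape is the same.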
JS Dependency Penalty
If the static HTML of a page contains fewer than 50 words of visible text, the Machine Readability (MR) subscore is multiplied by 0.5. This penalty models the fact that AI crawlers operating on a limited compute budget often do not execute client-side JavaScript. A React app that renders a blank HTML shell is functionally invisible to most AI bots, regardless of how well the rest of the page is optimised.
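The penalty rule above can be sketched like this. The word-counting here is deliberately crude (regex tag stripping); a production check would use a real HTML parser, but the threshold (50 words) and multiplier (0.5) are the ones stated above.

```python
import re

JS_PENALTY = 0.5      # multiplier from the rule above
WORD_THRESHOLD = 50   # minimum visible words in the static HTML

def visible_word_count(static_html: str) -> int:
    """Crude visible-word count: drop <script>/<style> bodies, then all tags."""
    text = re.sub(r"<(script|style)[^>]*>.*?</\1>", " ", static_html,
                  flags=re.DOTALL | re.IGNORECASE)
    text = re.sub(r"<[^>]+>", " ", text)
    return len(text.split())

def apply_js_penalty(mr_subscore: float, static_html: str) -> float:
    if visible_word_count(static_html) < WORD_THRESHOLD:
        return mr_subscore * JS_PENALTY
    return mr_subscore

# A blank React shell: almost no static text, so the penalty applies.
shell = "<html><body><div id='root'></div><script src='app.js'></script></body></html>"
```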
The Four Baskets
Machine Readability
25% weight
A bot must be able to access and parse a page before any other signal matters. This basket checks for technical blockers: JavaScript-only rendering, robots.txt disallow rules, SSL issues, and missing canonical URLs. We weighted it at 25% because it is a binary gate — low MR renders all other signals irrelevant — but it is also the easiest dimension to fix, so it should not dominate the score.
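One of the robots.txt gates can be illustrated like this. The user-agent strings are real AI crawler names; the parsing logic is a simplified sketch (one agent group per `User-agent` line), not the product's actual check.

```python
# AI crawler user agents to look for (real bot names; list is illustrative).
AI_BOTS = ("GPTBot", "PerplexityBot", "Google-Extended", "ClaudeBot")

def blocked_ai_bots(robots_txt: str) -> list[str]:
    """Return AI user agents that this robots.txt fully disallows."""
    current_agents: list[str] = []
    fully_blocked: dict[str, bool] = {}
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key.lower() == "user-agent":
            current_agents = [value]  # simplification: one group per agent line
        elif key.lower() == "disallow" and value == "/":
            for agent in current_agents:
                fully_blocked[agent] = True
    return [bot for bot in AI_BOTS if fully_blocked.get(bot)]

robots = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nDisallow:"
```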
Extractability
30% weight
This is the highest-weighted basket (30%) because it directly models the core action an LLM takes when citing a source: extracting a specific, quotable answer. AI systems do not cite pages — they cite passages. If a page buries its key claims in long prose, mixes topics, or lacks structured headers, an LLM will pass over it in favour of a cleaner source. The FAQ signal is weighted heavily here because Q&A format is the closest natural-language approximation to the LLM's own output format.
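An extractability signal of this kind could be detected with a heuristic like the one below. This is purely illustrative — the document does not specify how the FAQ check is implemented — treating a heading that ends in a question mark as one Q&A pair.

```python
import re

# Hypothetical heuristic: an <h2>-<h4> heading ending in "?" counts as
# one FAQ-style Q&A pair. Not the product's actual check.
QUESTION_HEADING = re.compile(r"<h[2-4][^>]*>[^<]*\?</h[2-4]>", re.IGNORECASE)

def count_faq_pairs(html: str) -> int:
    """Count question-style headings in the page's HTML."""
    return len(QUESTION_HEADING.findall(html))

page = """
<h2>What is the AI Search Readiness Score?</h2><p>A 0-100 score.</p>
<h2>How often is it updated?</h2><p>On every audit run.</p>
"""
```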
Trust & Authority
25% weight
LLMs are trained with RLHF and Constitutional AI methods that penalise citing unreliable sources. In practice, this means models are biased toward sources with verifiable identity signals: a business name, physical address, customer reviews, and named authors. These are the same signals Google uses for E-E-A-T evaluation. We weighted Trust at 25% — equal to Machine Readability — because both are gate-keeping dimensions: a technically perfect but anonymous site will still be under-cited.
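A check for identity signals might look like the following sketch. The field names follow schema.org's Organization type; the pass/fail logic and the choice of required fields are assumptions for illustration.

```python
import json

# Hypothetical required identity fields on an Organization JSON-LD block.
IDENTITY_FIELDS = ("name", "address")

def organization_identity_gaps(jsonld: str) -> list[str]:
    """Return identity fields missing from an Organization JSON-LD string."""
    data = json.loads(jsonld)
    return [field for field in IDENTITY_FIELDS if field not in data]

org = json.dumps({"@type": "Organization", "name": "Acme GmbH"})
```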
Offering Readiness
20% weight
AI commerce (ChatGPT Shopping, Perplexity product cards) requires machine-readable product data: price, currency, images with alt text, breadcrumbs, and GTINs. We weighted this basket at 20% — the lowest — because it is the most domain-specific: it directly applies to e-commerce sites but is partially inapplicable to informational or service-based sites. Future versions of the score will adjust this weight by detected site type.
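A completeness check over Product JSON-LD could be sketched as below. The field names (`price`, `priceCurrency`, `gtin`, `mpn`) are schema.org vocabulary; which fields are treated as required is an assumption for this sketch.

```python
import json

# Assumed required fields on the nested Offer (schema.org names).
REQUIRED_OFFER_FIELDS = ("price", "priceCurrency")

def product_offer_gaps(jsonld: str) -> list[str]:
    """Return machine-readable product fields missing from Product JSON-LD."""
    data = json.loads(jsonld)
    gaps = []
    offer = data.get("offers", {})
    for field in REQUIRED_OFFER_FIELDS:
        if field not in offer:
            gaps.append(f"offers.{field}")
    # Accept any one of the common product identifiers.
    if not any(key in data for key in ("gtin", "gtin13", "mpn")):
        gaps.append("gtin/mpn")
    return gaps

snippet = json.dumps({
    "@type": "Product",
    "name": "Example Widget",
    "offers": {"@type": "Offer", "price": "19.99"},
})
```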
What Informed This Model
The scoring model is a synthesis of publicly available documentation and observed citation behaviour across Perplexity, ChatGPT, Google AI Overviews, and Bing Copilot. Key sources that shaped our thinking:
- Google Search Central — AI Overviews guidance: Google explicitly states that pages with clear, specific answers and strong E-E-A-T signals are preferred for inclusion in AI-generated summaries. This directly informed our Extractability and Trust baskets.
- Perplexity's crawler documentation: PerplexityBot crawls pages with a limited JavaScript execution budget. Our MR7 check (JS Rendering) and the JS Dependency Penalty model this behaviour directly.
- OpenAI's GPTBot documentation: GPTBot respects robots.txt and disallow rules. Sites blocking AI bots explicitly lose all citation potential — this is the basis of our MR1 check.
- Schema.org and Google's Structured Data guidelines: JSON-LD structured data (Product, Organization, FAQ, Article) is used by AI systems to build factual understanding of a page independently of prose content. MR3 and TR6 check for this.
- Observed citation patterns: We analysed citation sources across Perplexity responses in the e-commerce and local business verticals. The pages cited most frequently share common traits: clean NAP data, explicit FAQ sections, server-rendered content, and Product schema with price data.
Known Limitations
E-commerce bias
The Offering Readiness basket (20%) is most relevant to product-focused sites. For informational blogs, service businesses, or personal brands, OR checks like GTIN/MPN and Price are partially inapplicable. We are building site-type detection to adjust weights automatically in a future release.
Weights are hypotheses, not empirical results
The current weights (25/30/25/20) are informed by documentation analysis and observed patterns — not a controlled A/B experiment on thousands of sites. We are collecting correlation data between scores and actual citation rates from our monitoring pipeline. We will publish findings and update weights as evidence accumulates.
AI systems change constantly
Perplexity, ChatGPT, and Google AI Overviews update their retrieval and ranking logic frequently. What drives citations in Q1 2026 may shift by Q3. We treat this model as a living document and will publish changelog entries when significant updates are made.
Have a question about the methodology, want to propose a new check, or found a discrepancy between our score and observed citation behaviour?
Contact us · Run a free audit