Free Content Relevance Audit - Check Your Site's AI Search Readiness

Updated April 11, 2026 · 8 min read

TL;DR

The free audit at getaisearchscore.com runs the full Content Relevance Score on any website - five components (Query Coverage, Content Depth, Sub-Intent Coverage, Citation Reality on paid scans, Technical Health), per-query breakdown, sub-intent gap analysis, and the 26 legacy technical checks as one subcomponent. No login, no credit card, up to 50 pages crawled, 20 monitoring queries generated and editable. The free tier gives the full diagnostic - we don't gate core value. The only things reserved for the paid Starter consultation (€149 one-time) are Citation Reality via Perplexity and a human expert review.

The free audit at getaisearchscore.com runs the full Content Relevance Score on any website - five components, per-query breakdown, sub-intent gap analysis, and the 26 technical checks from the original product rolled into a single Technical Health subcomponent. No login, no credit card, up to 50 pages crawled, 20 monitoring queries generated and editable. Results take a few minutes.

The audit detects whether AI crawlers (GPTBot, PerplexityBot, ClaudeBot) can access your pages, whether your content actually answers the queries your audience is asking AI engines, and where your biggest content and technical gaps are. The core value is not gated: the free tier gives you the full diagnostic.

Only two things are reserved for the paid tier: Citation Reality (the Perplexity API check) and a human expert consultation. Everything else is free.

Why We Built This

There are plenty of traditional SEO tools. None of them measured whether a site's content actually answered the specific questions a user would type into ChatGPT or Perplexity. We built the first version of this tool around 26 technical checks, ran a study on 441 domains, and discovered that our own technical score did not predict AI citations. The correlation was r=0.009, essentially zero.

The follow-up study showed what did predict citations: content relevance - specifically, the match between user queries and the content on a site, measured via BM25 plus embedding cosine similarity. The classifier reached AUC 0.915. That finding drove the product rebuild. The current scanner measures content-query relevance directly, with query fan-out decomposition into sub-intents, instead of relying purely on structural signals.
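As a rough illustration of the lexical half of that signal, here is a minimal Okapi BM25 sketch in plain Python. The whitespace tokenizer, the `k1` and `b` defaults, and the `hybrid_score` blend with weight `alpha` are assumptions made for this example; the study's actual feature pipeline and classifier are not described here, and the cosine value passed in would come from whatever embedding model is used.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Average document length; fall back to 1.0 so empty corpora do not divide by zero.
    avg_len = (sum(len(tokens) for tokens in tokenized) / n_docs) or 1.0
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / (term_freq[term] + length_norm)
        scores.append(score)
    return scores


def hybrid_score(bm25: float, max_bm25: float, cosine: float, alpha: float = 0.5) -> float:
    """Hypothetical blend: BM25 scaled to 0-1, mixed linearly with cosine similarity."""
    lexical = bm25 / max_bm25 if max_bm25 > 0 else 0.0
    return alpha * lexical + (1 - alpha) * cosine


pages = [
    "how to add product schema to a shopify store",
    "our company history and team",
    "shopify pricing plans compared",
]
scores = bm25_scores("shopify product schema", pages)
blended = hybrid_score(scores[0], max(scores), cosine=0.8)
```

The first page matches all three query terms, the third matches only one, and the second matches none, so the scores rank them in that order.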

The free tier exists because the core diagnostic is not the thing worth charging for. An automated report is easy to produce and easy to replicate. The thing worth paying for is a human expert interpreting the results and building an implementation plan. That is the Starter tier (€149, limited slots). Everything else is free.

What the Score Actually Means (and What It Does Not)

The tool scans up to 50 pages on your site, generates 20 monitoring queries from your niche (which you can edit before analysis runs), embeds each query, retrieves the top three relevant pages per query, and asks GPT-4o to rate each page on a 0-10 relevance scale with sub-intent coverage. You get a Content Relevance Score out of 100. A high score means your site's content closely matches the queries your audience asks. A low score means there are content gaps, technical blockers, or both.

Here is what we need to be honest about: we have not yet published a case where applying our recommendations made a real site's citation rate rise. That experiment is running on our own site right now. Until we have before/after data, what we can honestly claim is that the score measures the signal with the strongest empirical support so far - not that applying it will guarantee citations. Think of it like a medical diagnostic: a good diagnostic tells you what is wrong. Whether treatment works depends on the treatment, not the test.

The Five Components of the Score

The overall Content Relevance Score aggregates five components. The formula for paid scans is 0.25·QC + 0.20·CD + 0.20·SI + 0.20·CR + 0.15·TH. The free tier has no Citation Reality, so its 0.20 weight is redistributed evenly, 0.05 to each of the four remaining components: 0.30·QC + 0.25·CD + 0.25·SI + 0.20·TH.
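The two formulas can be written as a small weighted sum. This is a sketch, not the scanner's code: the weights come from the formulas above, but the function name and the assumption that every component is already expressed on a 0-100 scale are ours.

```python
PAID_WEIGHTS = {"QC": 0.25, "CD": 0.20, "SI": 0.20, "CR": 0.20, "TH": 0.15}
FREE_WEIGHTS = {"QC": 0.30, "CD": 0.25, "SI": 0.25, "TH": 0.20}


def content_relevance_score(components: dict[str, float], paid: bool = False) -> float:
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    weights = PAID_WEIGHTS if paid else FREE_WEIGHTS
    missing = set(weights) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return round(sum(weights[name] * components[name] for name in weights), 2)


# Hypothetical component values for one site.
example = {"QC": 60.0, "CD": 50.0, "SI": 40.0, "TH": 80.0}
free_score = content_relevance_score(example)
paid_score = content_relevance_score({**example, "CR": 10.0}, paid=True)
```

With these made-up inputs the free formula gives 56.5 and the paid formula gives 47.0: a low Citation Reality value pulls the paid score down because it carries a fifth of the weight.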

Component	Weight (paid)	What It Measures
Query Coverage (QC)	25%	Share of queries where at least one page on the site scores 5+ out of 10 on LLM relevance rating.
Content Depth (CD)	20%	Average of the best-page relevance score across all queries. Measures how well your best answer answers each query.
Sub-Intent Coverage (SI)	20%	Fraction of sub-intents (query fan-out) that the site addresses across pages. Rewards broad content footprint per topic.
Citation Reality (CR, paid only)	20%	Share of monitoring queries where Perplexity actually cites the site. Ground truth, not prediction.
Technical Health (TH)	15%	The legacy 26-check technical audit: crawlability, schema, rendering, trust signals, product data. Weighted low because, alone, it did not predict citations.
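To make the three content components in the table concrete, the sketch below derives them from per-query LLM ratings. The input shapes (a dict of 0-10 ratings per query, a dict of covered/not-covered flags per sub-intent) and the rescaling to 0-100 are assumptions for illustration; only the definitions themselves come from the table.

```python
def query_coverage(ratings: dict[str, list[float]], threshold: float = 5.0) -> float:
    """Share of queries whose best-rated page reaches the threshold, as 0-100."""
    if not ratings:
        return 0.0
    covered = sum(1 for scores in ratings.values() if scores and max(scores) >= threshold)
    return 100.0 * covered / len(ratings)


def content_depth(ratings: dict[str, list[float]]) -> float:
    """Average best-page rating across queries, rescaled from 0-10 to 0-100."""
    if not ratings:
        return 0.0
    best = [max(scores) if scores else 0.0 for scores in ratings.values()]
    return 10.0 * sum(best) / len(best)


def sub_intent_coverage(sub_intents: dict[str, dict[str, bool]]) -> float:
    """Fraction of all sub-intents that some page addresses, as 0-100."""
    flags = [hit for per_query in sub_intents.values() for hit in per_query.values()]
    if not flags:
        return 0.0
    return 100.0 * sum(flags) / len(flags)


# Hypothetical ratings: up to three retrieved pages per query, each rated 0-10.
ratings = {"q1": [8, 6, 3], "q2": [4, 2, 1], "q3": [5, 5, 0], "q4": [2]}
# Hypothetical sub-intent flags: True when at least one page covers the sub-intent.
sub_intents = {
    "q1": {"pricing": True, "setup": False},
    "q2": {"alternatives": True, "limits": True},
}

qc = query_coverage(ratings)
cd = content_depth(ratings)
si = sub_intent_coverage(sub_intents)
```

Here two of four queries have a page rated 5 or higher (QC 50.0), the best-page ratings average 4.75 out of 10 (CD 47.5), and three of four sub-intents are covered (SI 75.0).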

How It Works

1. Enter your URL, email, and niche. The tool discovers pages via your sitemap.xml and crawls up to 50 of them automatically, simulating how AI bots see the site (including JavaScript rendering).
2. Receive 20 monitoring queries. GPT-4o generates queries a real user would ask an AI engine about your niche. You get a link to a query review page where you can edit, remove, or add queries before analysis runs. Default limit is 20.
3. Trigger analysis. The system embeds each query, retrieves the top three relevant pages per query from your crawl, and asks GPT-4o to rate relevance and identify sub-intent coverage gaps.
4. Read the results. You get the overall Content Relevance Score, a per-component breakdown (QC, CD, SI, TH), per-query findings, and a prioritized list of content gaps ranked by impact.
5. Fix and rescan. No limit on rescans. Fix the top gaps, scan again, see if the numbers move.
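The retrieval part of the analysis step (embed the query, pick the three most relevant pages) can be sketched as a cosine-similarity ranking. The vectors below are toy three-dimensional stand-ins; a real run would use vectors from an embedding model, which is not named here, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_pages(query_vec: list[float], page_vecs: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the k page URLs whose embeddings are closest to the query embedding."""
    ranked = sorted(
        page_vecs,
        key=lambda url: cosine_similarity(query_vec, page_vecs[url]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (assumed to be precomputed for each crawled page).
query_vec = [1.0, 0.0, 0.0]
page_vecs = {
    "/pricing": [0.9, 0.1, 0.0],
    "/blog/a": [0.5, 0.5, 0.0],
    "/about": [0.0, 1.0, 0.0],
    "/docs": [0.8, 0.0, 0.6],
}
candidates = top_pages(query_vec, page_vecs)
```

Only the pages in `candidates` would then be sent to the LLM for a 0-10 relevance rating, which keeps the number of rating calls at three per query regardless of how many pages were crawled.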

How to Read Your Score

These ranges describe content relevance readiness. A site at 80 covers most of its audience queries with decent depth. A site at 20 has large gaps - either content that does not match what people are asking, or technical issues blocking crawlers from seeing the content at all.

Score Range	Rating	What it means
0-29	Critical	Either technical blockers prevent AI crawlers from reading the site, or the content has almost no overlap with the queries the audience is asking. Fix Technical Health first, then rescan.
30-59	Needs Work	Some queries are covered, major gaps remain. Look at Sub-Intent Coverage - this is where the largest gains usually hide.
60-79	Solid	Content is broadly relevant. Further gains come from Content Depth: making your best answers deeper than anyone else's.
80-100	Clean	Content relevance is not the bottleneck. If citation rate is still low, the issue is likely domain reputation, freshness, or sheer competitive density - not readiness signals.
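If you track scores across rescans in a script or spreadsheet export, the table above maps to a simple lookup. The function name is hypothetical, and treating fractional scores such as 29.5 as belonging to the lower band is an assumption, since the table only lists whole numbers.

```python
def rating_for(score: float) -> str:
    """Map a 0-100 Content Relevance Score to the rating bands in the table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < 30:
        return "Critical"
    if score < 60:
        return "Needs Work"
    if score < 80:
        return "Solid"
    return "Clean"


# Hypothetical scores from three successive scans of the same site.
history = [27, 48, 63]
labels = [rating_for(score) for score in history]
```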

Free vs Starter

The free tier runs the full Content Relevance Score (QC, CD, SI, TH) with per-query breakdowns and actionable gap analysis. That is enough to find the biggest problems and act on them. The €149 Starter consultation adds Citation Reality via Perplexity and a human expert who walks through the results, writes prioritized rewrite tasks, and does a follow-up rescan.

Feature	Free	Starter (€149)
Content Relevance Score (QC / CD / SI / TH)	Yes	Yes
Query review flow (edit monitoring queries)	Yes	Yes
Per-query breakdown + sub-intent gaps	Yes	Yes
Unlimited rescans	Yes	Yes
Citation Reality (Perplexity monitoring)	No	Yes
Human expert review + rewrite task plan	No	Yes
Implementation call + follow-up rescan	No	Yes
PDF executive report	No	Yes

Common Fixes That Move Each Component

Most sites score below 50 on their first scan. These are the fixes we see making the biggest difference in each component.

Query Coverage (QC)

Write down the 10-20 questions your audience would type into ChatGPT about your niche
Check which of your existing pages actually answers each question in depth
For queries with no matching page, create new content (not thin listicles - pages that answer the question substantively)
For queries where the match is weak, rewrite the page to address the question directly in the first paragraph

Content Depth (CD)

Add concrete examples, data points, dated references, screenshots, direct quotes
Add a TL;DR summary block at the top of each key page
Add edge cases and “but here is when this breaks” sections
Cut fluff and filler - depth is not word count

Sub-Intent Coverage (SI)

For each top query, list the 3-5 sub-questions a reader is implicitly asking
Check whether any page covers each sub-question
Build a content hub: break one “big page” into multiple focused pages, each owning one sub-intent
Cross-link hub pages internally so both readers and crawlers understand the cluster

Technical Health (TH) - the hygiene floor

Allow AI crawlers in robots.txt (OAI-SearchBot, PerplexityBot, Google-Extended)
Ensure server-side rendering - test by disabling JavaScript in your browser
Add JSON-LD Organization schema to every page
Add FAQPage schema with 3-5 questions on key pages
Add BreadcrumbList schema across the site
Add aggregateRating to product or service pages
Ensure sitemap.xml is valid and includes all important pages

These 7 fixes can improve your Technical Health subcomponent significantly. Whether that translates into more AI citations is a separate question - our research suggests it depends far more on whether your content relevance (QC, CD, SI) is strong. Technical Health is the floor, not the ceiling.
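The first item on that list is easy to verify yourself. The sketch below uses Python's standard `urllib.robotparser` to check which AI crawler tokens a robots.txt file allows for a given URL. The crawler list is the union of the names mentioned in this article, the robots.txt text is a made-up sample, and this is not how the scanner itself performs the check.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this article.
AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]


def crawler_access(robots_txt: str, url: str, agents: list[str] = AI_CRAWLERS) -> dict[str, bool]:
    """Report, per user-agent token, whether robots.txt allows fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt that blocks GPTBot but allows everyone else.
sample_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

access = crawler_access(sample_robots, "https://example.com/pricing")
blocked = [agent for agent, allowed in access.items() if not allowed]
```

For the sample file, `blocked` contains only `GPTBot`. To check a live site, `RobotFileParser` also offers `set_url()` and `read()` to fetch the file over HTTP. Note that this covers robots.txt rules only: a firewall or CDN rule that blocks a crawler will not show up here.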

What the Tool Detects by Vertical

The scanner uses LLM classification to detect your business type and adapts the Technical Health checks and monitoring queries accordingly. Here is what each vertical focuses on:

E-commerce: Product schema completeness, GTIN/MPN identifiers, pricing and availability data, product image quality, aggregate ratings, category-specific sub-intent queries
SaaS: Feature page structure, pricing table schema, integration documentation, comparison pages, free trial / demo CTA, solution-specific queries
B2B Services: Service area pages, case study schema, testimonials, service descriptions with schema markup, industry-specific queries
Personal Brands: Person schema, portfolio structure, authorship signals, social proof, speaking/media appearances
Local Business: LocalBusiness schema, NAP consistency, opening hours, Google Business Profile linking, location-specific queries
Content / Media: Article schema, author bios with expertise signals, topic authority clustering, date freshness, editorial standards

The Honest Picture

AI search is growing fast. ChatGPT, Perplexity, and Google AI Overviews all cite far fewer sources per answer than a traditional results page lists links. Being one of the 3-5 cited sources per query is the entire game.

What we cannot tell you is that running our scanner and fixing everything will guarantee you get cited. Our own research on the first version of the product (r=0.009) killed that narrative. What we can tell you is that measuring content relevance through query fan-out plus sub-intent coverage is the closest thing to a predictor we know of (AUC 0.915), and that the free tier gives you the full diagnostic without a paywall.

Run a scan, see what comes up, fix what makes sense. If you want to go deeper, the AI Search Readiness Checklist and Citation Rate Improvement Guide cover the details. If you want a human expert to walk through your results and build an implementation plan, book a Starter consultation.

Benefits of Using a Free Content Relevance Audit

Running an AI search readiness audit gives you three things that traditional SEO tools do not:

1. Query-content relevance measurement. Traditional tools check keyword rankings. This tool checks whether the questions your audience actually asks AI engines are answered by any page on your site, with sub-intent depth.
2. A prioritized gap list. Not a dashboard with hundreds of metrics - a ranked list of specific queries where your content is missing or shallow, ordered by impact.
3. Baseline measurement for comparison. You cannot improve what you do not measure. The five-component breakdown gives you a starting point and unlimited rescans let you track progress as you close the gaps.

Examples of Typical Scan Results

Typical outcomes across different site types. These are averages from our early scans, expressed as the Technical Health subcomponent (the part we have the most historical data on). Full Content Relevance distributions will follow once our self-test and pilot cases are published.

Site Type	Typical TH First Score	Common Issues Found	After Top 5 Fixes
E-commerce (Shopify)	15-25	Missing Product schema, no FAQ blocks, GPTBot blocked by app firewall	45-60
SaaS Landing Page	20-35	JS-rendered content invisible to crawlers, no Organization schema, thin meta descriptions	50-65
Content/Blog Site	30-45	No Article schema, missing author attribution, no FAQ markup	55-70
Local Business	10-20	No LocalBusiness schema, missing NAP consistency, no reviews markup	40-55

See a complete sample report for a real e-commerce store audit.

Frequently Asked Questions

Is the Content Relevance audit really free?

Yes. The free tier runs the full four-component audit (Query Coverage, Content Depth, Sub-Intent Coverage, Technical Health) with per-query breakdowns, sub-intent gap analysis, and top recommendations. No credit card, no trial period, unlimited rescans. The Starter consultation (€149 one-time, 4 slots/month) adds Citation Reality via Perplexity and a human expert review with 20-40 prioritized rewrite tasks.

What does the tool check?

Five components: (1) Query Coverage - what fraction of target queries any page on the site answers; (2) Content Depth - how deeply the best page addresses each query; (3) Sub-Intent Coverage - whether the site covers the full fan-out of information needs; (4) Citation Reality - whether Perplexity currently cites the site (paid only); (5) Technical Health - the 26 legacy technical checks (schema, crawl access, content structure, trust signals) as one subcomponent with 15-20% weight.

How long does the scan take?

A few minutes. The scanner discovers pages via sitemap.xml, crawls up to 50 pages with Playwright (simulating AI bots), runs 26 technical checks, generates 20 monitoring queries via GPT-4o, then evaluates content relevance per query. You can edit the queries before analysis runs.

Does it work for non-e-commerce sites?

Yes. The Content Relevance Score measures query-content match, which applies to any business type. The query generation adapts to your niche. Technical Health checks include vertical-specific items for e-commerce, SaaS, B2B services, local businesses, and content/media sites.

How do I interpret my score?

0-29: major content and technical gaps, AI engines unlikely to cite you. 30-59: some query coverage but significant gaps in depth or sub-intent coverage. 60-79: solid content relevance, citations plausible for covered queries. 80+: strong coverage across target queries. Note: a high score means strong content relevance, not guaranteed citations.

Alexey Tolmachev

Senior Systems Analyst · AI Search Readiness Researcher

Senior Systems Analyst with 14 years of experience in data architecture, system integration, and technical specification design. Researches how AI search engines process structured data and select citation sources. Creator of the AI Search Readiness Score methodology.

AI Search Readiness Checklist for E-Commerce (2026)

Two-part e-commerce checklist: content relevance (the strategy - query coverage, content depth, sub-intent coverage) and technical health (the hygiene - schema, crawl access, trust signals). Based on research showing content relevance predicts AI citations at AUC 0.915.

12 min read

How to Improve Your Citation Rate in AI Search Engines

Data-driven guide to improving your citation rate in AI search. 10-step action plan with before/after metrics and citation tracking methods.

10 min read

Why Your Site Isn't Cited in ChatGPT Answers (and How to Fix It)

Why ChatGPT ignores your site: 6 causes and a fix checklist. Covers crawler access, schema, content format, trust signals, freshness, and entity clarity.

9 min read