What Blocks Your Products from Appearing in Perplexity Answers

TL;DR

If Perplexity never mentions your products, one of these 8 blockers is likely the cause: (1) robots.txt blocking PerplexityBot, (2) missing or incomplete Product schema, (3) no crawlable product descriptions (content behind JS rendering), (4) missing review/rating data, (5) no FAQ or answer-ready content on product pages, (6) thin product descriptions under 100 words, (7) duplicate product pages without canonical tags, (8) no sitemap.xml or broken sitemap. This guide explains how to diagnose and fix each issue with specific code examples.

What I Learned Building a Perplexity Citation Monitor

I built a citation monitoring pipeline that uses Perplexity's API (sonar-reasoning-pro model) to track which sites get cited for specific queries. Over 90 API runs, I extracted 658 citations across 485 domains. That gave me a front-row seat to how Perplexity actually discovers and selects sources.

The first thing I noticed: Perplexity's citations are unstable. Run the same query twice and you can get different sources cited. About 29.3% of citation appearances looked like noise — sites showing up once and never again. This matters because if you search for your product once and don't see it, that single test tells you very little.

The second thing: Perplexity does not rank pages by backlinks and keywords the way Google does. It selects sources based on how well a page answers the specific question asked. PerplexityBot crawls independently of Google, maintains its own index, and pulls from pages with structured, extractable data.

The Honest Caveat: Structure vs. Content Relevance

I have to be upfront about something. When I correlated my AI Search Readiness scores (which measure structural factors like schema markup, crawlability, content format) against actual Perplexity citations, the correlation was essentially zero: r=0.009, p=0.849. Across 485 domains and 14,550 domain-query pairs, structural readiness did not predict who gets cited.

What did predict citations? Content relevance. Same-topic pages were cited at 5.17% vs 0.08% for cross-topic — a 62x difference. Perplexity cares first and foremost whether your page actually answers the question being asked.

So why does this article still list structural blockers? Because they are necessary but not sufficient. If PerplexityBot cannot crawl your site, relevance does not matter — you are invisible. If your content is behind a JavaScript wall, it might as well not exist. Fixing these blockers removes barriers to entry. But fixing them alone will not make Perplexity cite you. You also need content that directly answers the queries people ask.

The 8 Most Common Blockers

These are the structural issues I see most often when scanning sites with our AI Search Readiness tool. Blockers #1–3 are hard stops — they prevent Perplexity from seeing your content at all. Blockers #4–8 reduce your chances of being selected as a source.

| # | Blocker | Impact | How to Check |
|---|---------|--------|--------------|
| 1 | robots.txt blocks PerplexityBot | Complete block | Check yoursite.com/robots.txt for PerplexityBot rules |
| 2 | Missing Product schema | No product cards | Google Rich Results Test on product pages |
| 3 | Content behind JS rendering | Partial/no indexing | View page source (not DevTools) — is product data visible? |
| 4 | No review/rating data | Lower citation priority | Check for aggregateRating in page source |
| 5 | No FAQ or answer-ready content | Not cited for queries | Check product pages for FAQ sections, comparison content |
| 6 | Thin product descriptions (<100 words) | Low relevance | Count words in product descriptions |
| 7 | Duplicate pages without canonical tags | Diluted signals | Check for rel=canonical on product variants |
| 8 | No sitemap.xml or broken sitemap | Incomplete discovery | Test yoursite.com/sitemap.xml in browser |
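Several of these checks can be scripted. For example, blocker #8 reduces to comparing your sitemap's `<loc>` entries against the product URLs you expect to be discoverable. A minimal sketch using only the standard library (the yoursite.com URLs and product list are placeholders):

```python
import xml.etree.ElementTree as ET

# Sitemaps use this XML namespace; <loc> elements live inside it
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text: str) -> set:
    """Extract every <loc> URL from a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")}

# In practice you would fetch yoursite.com/sitemap.xml; inlined here for the sketch
sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yoursite.com/products/boot-a</loc></url>
  <url><loc>https://yoursite.com/products/boot-b</loc></url>
</urlset>"""

# The product pages that should be discoverable (placeholder data)
product_pages = {
    "https://yoursite.com/products/boot-a",
    "https://yoursite.com/products/boot-b",
    "https://yoursite.com/products/boot-c",
}

missing = product_pages - sitemap_urls(sitemap)
print("Missing from sitemap:", sorted(missing))
```

Any URL printed as missing is a product PerplexityBot may never discover, no matter how good the page itself is.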

Diagnostic Flowchart

Work through these in order. Each step assumes the previous ones passed.

  1. Can PerplexityBot access your site?
    Check robots.txt → If blocked, add Allow rule → Rescan
  2. Is your product data in the HTML source?
    View page source (Ctrl+U) → If product data only appears after JS execution, switch to server-side rendering
  3. Do you have Product schema with Offers?
    Run Google Rich Results Test → If no Product detected, add JSON-LD schema
  4. Does your schema include price + availability?
    Check Offer fields → If missing, add price, priceCurrency, availability
  5. Do you have review data?
    Check for aggregateRating → If absent, add review collection
  6. Is your content answer-ready?
    Check for FAQ sections, comparison tables → If absent, add answer blocks
  7. Are descriptions substantive (200+ words)?
    Count crawlable text → If thin, expand descriptions with features, use cases, specs
  8. Does your sitemap include all products?
    Compare sitemap URLs to actual product pages → If missing pages, regenerate sitemap
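Steps 3 through 5 come together in a single JSON-LD block on each product page. A minimal sketch of Product schema with a complete Offer and aggregateRating (product name, prices, SKU, and review counts are placeholders; use your real data):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Trail Pro Hiking Boot",
  "description": "Waterproof hiking boot with reinforced toe cap and Vibram sole.",
  "image": "https://yoursite.com/images/trail-pro.jpg",
  "sku": "TP-001",
  "brand": { "@type": "Brand", "name": "YourBrand" },
  "offers": {
    "@type": "Offer",
    "url": "https://yoursite.com/products/trail-pro",
    "price": "129.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "128"
  }
}
</script>
```

Because this block is static HTML, it stays visible to crawlers even on pages that otherwise render client-side. Validate it with Google's Rich Results Test before relying on it.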

Fix Guide: Blocker #1 — Robots.txt

This is the most common hard stop I see. Many CMS platforms and security plugins block AI crawlers by default. Your robots.txt should explicitly allow them:

# Allow AI search engine crawlers
User-agent: PerplexityBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: Google-Extended
Allow: /

# Standard Googlebot rules
User-agent: Googlebot
Allow: /

Sitemap: https://yoursite.com/sitemap.xml
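Before deploying a robots.txt change, you can verify the rules behave as intended with Python's standard-library parser. A small sketch (the URLs are placeholders; the second rule group is illustrative):

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content to verify — paste your own file here
robots_txt = """\
User-agent: PerplexityBot
Allow: /

User-agent: *
Disallow: /admin/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# True: PerplexityBot has an explicit Allow rule for the whole site
print(rp.can_fetch("PerplexityBot", "https://yoursite.com/products/example"))

# False: other bots fall through to the wildcard group, which blocks /admin/
print(rp.can_fetch("SomeOtherBot", "https://yoursite.com/admin/settings"))
```

Running this against your live file (via `rp.set_url(...)` and `rp.read()`) catches the common failure mode where a security plugin appends a blanket Disallow that overrides your Allow rules.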

Fix Guide: Blocker #3 — Content Behind JS Rendering

I built our scanner with Playwright specifically to detect this. We fetch pages twice — once with a simple HTTP request (what AI crawlers see) and once with a full browser. If the word count drops below 50 in the static HTML, we apply a 50% penalty to the Machine Readability score.

From what I have observed, PerplexityBot does not execute JavaScript reliably. If your product name, price, and description only appear after client-side rendering, they are likely invisible to Perplexity's crawler.

How to test: Right-click your product page and select "View Page Source" (not DevTools Elements tab). If the product data is not in the raw HTML, it is invisible to AI crawlers.
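The same view-source check can be automated: strip script and style content from the raw HTML, count the words that remain, and flag pages below a threshold. A minimal standard-library sketch (the 50-word threshold mirrors the scanner's penalty rule described above; the sample HTML is illustrative):

```python
import re
from html.parser import HTMLParser

class VisibleTextExtractor(HTMLParser):
    """Collects text nodes that fall outside <script> and <style> tags."""
    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)

def static_word_count(html: str) -> int:
    """Words visible in raw HTML, i.e. without executing any JavaScript."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    return len(re.findall(r"\w+", " ".join(parser.chunks)))

# Illustrative page: product data in a JS payload is invisible; markup text counts
html = """<html><body>
<script>window.__DATA__ = {"name": "Trail Pro"};</script>
<h1>Trail Pro Hiking Boot</h1>
<p>Waterproof boot with reinforced toe cap.</p>
</body></html>"""

count = static_word_count(html)
print(count, "words visible to a non-JS crawler")
print("LIKELY JS-DEPENDENT" if count < 50 else "OK")
```

Feed it the raw bytes of each product page (e.g. from a plain HTTP GET, not a headless browser) and anything flagged JS-dependent is a candidate for server-side rendering.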

Fixes by framework:

  • Next.js: Use getStaticProps or getServerSideProps for product data
  • Nuxt.js: Use useAsyncData or useFetch in SSR mode
  • Shopify: Product data is server-rendered by default — check custom theme modifications
  • WooCommerce: Product data is server-rendered by default — check custom plugins or AJAX loading
  • Custom SPA: Add server-side rendering or pre-rendering. At minimum, include JSON-LD Product schema in the static HTML

Fix Guide: Blocker #5 — No Answer-Ready Content

This is where the structural advice meets the content relevance finding from my research. Perplexity answers questions by extracting and synthesizing content from multiple sources. If your product pages only have specs with no contextual content, Perplexity has nothing to extract for "which" and "how to choose" queries.

But here is the thing I learned from 658 citations: the content has to actually match what people are asking. A generic FAQ bolted onto a product page is not enough. The questions need to be the real questions your customers ask, and the answers need to be specific enough that Perplexity can extract a useful snippet.

Add to each product category page:

  • FAQ section with 3–5 questions about the product category
  • Comparison table showing key differences between products
  • Buying guide paragraph explaining how to choose the right product

Add to each product page:

  • FAQ section with 2–3 questions specific to this product
  • Product description with use cases and who it is best for
  • "Why choose this product" summary block

What I Actually Observed About Citation Behavior

From monitoring citations across 90 API runs, a few patterns stood out that are not commonly discussed.

Citations are noisy. About 29.3% of citation appearances were one-offs — a domain cited once across all runs and never again. If you test your product in Perplexity and see it cited, do not celebrate yet. Test again tomorrow. If you do not see it, do not panic yet either.

The same query gives different results. Perplexity's sonar-reasoning-pro model does not return deterministic results. I saw the same query cite different sources on different runs. This means a single manual test is not a reliable diagnostic.
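If you script repeated queries through an API, you can quantify this noise yourself: tally how many citation appearances come from domains that show up in only one run. A minimal sketch over hypothetical run data (the domains are placeholders):

```python
from collections import Counter

def one_off_fraction(runs):
    """Share of citation appearances that come from domains cited in exactly one run."""
    # Count each domain once per run, so a run citing a domain twice isn't double-counted
    counts = Counter(domain for run in runs for domain in set(run))
    total_appearances = sum(counts.values())
    one_offs = sum(c for c in counts.values() if c == 1)
    return one_offs / total_appearances

# Hypothetical cited-domain lists from three runs of the same query
runs = [
    ["a.com", "b.com", "c.com"],
    ["a.com", "b.com", "d.com"],
    ["a.com", "b.com", "e.com"],
]

print(f"{one_off_fraction(runs):.1%} of citation appearances are one-offs")
```

A high one-off fraction for your queries means any single manual test, positive or negative, tells you very little; judge visibility only over multiple runs.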

Content relevance dominates everything. Across my dataset, same-topic content was cited 62x more often than cross-topic content. No amount of schema markup or structural optimization overcame topic mismatch. If your product page is about diving gear and someone asks about hiking boots, you will not get cited no matter how perfect your markup is.

Measuring Your Fixes

After implementing fixes, test in two ways:

  1. Structural check: Run our free AI Search Readiness audit to verify crawlability, schema, and content format issues are resolved
  2. Manual citation test: Search for your product names and categories in Perplexity multiple times over several days. A single search is not reliable due to the citation instability I described above

Wait 2–4 weeks after fixes for PerplexityBot to re-crawl your site. Blockers #1–3 should resolve fastest since they are binary — either the crawler can access your content or it cannot.

But I want to be honest: fixing structural blockers is necessary groundwork, not a guarantee of citations. The strongest signal I found in my research was content relevance — whether your page genuinely answers the question someone is asking. Fix the blockers so Perplexity can see you, then focus on creating content that actually helps people make decisions about your products.

Frequently Asked Questions

Does Perplexity have its own crawler?

Yes. Perplexity uses PerplexityBot as its web crawler. Check your robots.txt to ensure it is not blocked. Perplexity also uses its own indexing pipeline separate from Google, so being indexed by Google does not guarantee visibility in Perplexity.

Why does Perplexity cite my competitor but not me?

The most common reasons are: your competitor has better structured data (Product schema with complete offers), more review signals (aggregateRating), and answer-ready content (FAQ sections, comparison tables). Perplexity favors pages that directly answer the user's question with structured, extractable data.

How can I test if Perplexity can see my products?

Search for your exact product name in Perplexity. If it doesn't appear, search for your brand + product category. Then check: (1) Is PerplexityBot allowed in robots.txt? (2) Does Google's Rich Results Test show valid Product schema? (3) Is your product description at least 100 words of crawlable text (not rendered only by JavaScript)?

Alexey Tolmachev

Senior Systems Analyst · AI Search Readiness Researcher

Senior Systems Analyst with 14 years of experience in data architecture, system integration, and technical specification design. Researches how AI search engines process structured data and select citation sources. Creator of the AI Search Readiness methodology.

Check Your AI Search Readiness

Get your free AI Search Readiness Score in under 2 minutes. See exactly what to fix so ChatGPT, Perplexity, and Google AI Overviews can find and cite your content.

