How to make your website appear in ChatGPT and Perplexity sources
AI answer engines don't rank ten blue links — they synthesize an answer and cite a handful of sources. This guide covers the concrete signals that make your pages easier to find, parse, and quote. There are no guaranteed-ranking tricks here: AI citations depend on factors outside your control, but the steps below remove the technical and editorial friction that keeps most sites from being cited at all.
Published June 2026 · 12 min read
The problem: visible to Google, invisible to AI
You can rank well on Google and still be missing from ChatGPT, Perplexity, Claude, and Gemini answers. These engines retrieve and summarize content differently from a classic search crawler: they favor pages that are easy to fetch, clearly structured, and written in a way that's safe to quote. If your site blocks AI crawlers, hides content behind heavy client-side rendering, or buries facts in marketing prose, it is unlikely to surface as a source.
The goal of this guide is not to game any model. It's to make your site retrievable and quotable — the two things every AI answer engine needs before it can cite you.
Why AI engines cite sources at all
Answer engines cite sources to ground their responses and let users verify claims. When a model assembles an answer, it pulls passages from pages it retrieved, then links the ones it leaned on. To be one of those pages you generally need three things:
- Accessible — the crawler can fetch the page without being blocked or stalled.
- Parseable — the meaningful content is in the HTML, not locked behind scripts.
- Quotable — facts, definitions, and steps are stated plainly enough to lift verbatim.
The rest of this guide is a checklist for each of those properties.
Technical checklist: be retrievable
Allow the AI crawlers you want
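Many sites unknowingly block AI user agents in robots.txt. Decide deliberately which engines you allow, then state it explicitly. Common citation-related agents include GPTBot and OAI-SearchBot (OpenAI), ClaudeBot and Claude-SearchBot (Anthropic), PerplexityBot (Perplexity), and Google-Extended (Gemini/Vertex). Note that Google-Extended is a robots.txt control token rather than a separate crawler: Google's existing crawlers do the fetching, and the token only governs whether that content may be used for Gemini and Vertex AI.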
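One way to check what your robots.txt actually says to each agent is to parse it and ask, agent by agent, whether a given URL is fetchable. The sketch below uses Python's standard urllib.robotparser. The robots.txt policy, the example.com URL, and the audit_robots helper are assumptions made up for illustration, not a recommended policy.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy for illustration: allow three search/citation agents,
# opt out of GPTBot, and keep /admin/ private for everyone else.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

# Agent names taken from the list above.
AI_AGENTS = [
    "GPTBot",
    "OAI-SearchBot",
    "ClaudeBot",
    "Claude-SearchBot",
    "PerplexityBot",
    "Google-Extended",
]


def audit_robots(robots_txt: str, url: str, agents=AI_AGENTS) -> dict[str, bool]:
    """Return {agent: allowed} for one URL under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


report = audit_robots(ROBOTS_TXT, "https://example.com/guides/ai-visibility")
for agent, allowed in report.items():
    print(f"{agent:18} {'allowed' if allowed else 'BLOCKED'}")
```

Agents without their own group (ClaudeBot and Google-Extended in this example) fall through to the `User-agent: *` rules, which is often where accidental blocks come from. Python's parser is only an approximation of how any particular crawler interprets the file, so treat the result as a first check.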
Serve content in the initial HTML
If your key content only appears after JavaScript runs, assume a retriever may never see it. Server-render or statically generate the text that matters. View the raw HTML (not the rendered DOM) and confirm your headings, paragraphs, and facts are present.
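The raw-HTML check can be automated. The sketch below extracts the text a non-JavaScript fetcher would see (skipping script and style contents) and reports which required phrases are missing. The sample markup, the phrases, and the helper names are hypothetical; in practice you would pass in the response body of a plain HTTP GET.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside script/style/template."""

    SKIP = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def missing_from_raw_html(raw_html: str, required_phrases: list[str]) -> list[str]:
    """Return the phrases that do not appear in the un-rendered HTML text."""
    extractor = VisibleTextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    text = " ".join(extractor.chunks).lower()
    return [p for p in required_phrases if p.lower() not in text]


# Hypothetical page: the heading is server-rendered, the price is not.
RAW_HTML = """<html><body>
<h1>Pricing</h1>
<div id="app"></div>
<script>render("Plans start at $9 per month");</script>
</body></html>"""

print(missing_from_raw_html(RAW_HTML, ["Pricing", "Plans start at $9 per month"]))
# prints ['Plans start at $9 per month']
```

The match is a simple case-insensitive substring test, so a phrase split across several tags may be reported as missing even when it is present. Choose phrases that sit inside a single element.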
Keep pages fast and stable
Slow responses, redirect chains, and flaky status codes all reduce the chance a crawler finishes fetching your page. Use permanent (301) redirects, return clean 200s, and keep your canonical URLs consistent.
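These rules can be expressed as a small check over the hops a fetch goes through. The sketch below takes a recorded chain of (URL, status) hops and flags long chains, non-permanent redirects, a non-200 final response, and a final URL that differs from the canonical one. The Hop type, the one-redirect limit, and the example URLs are assumptions for illustration. It also accepts 308 alongside 301, since both are permanent redirects.

```python
from dataclasses import dataclass


@dataclass
class Hop:
    url: str
    status: int


PERMANENT_REDIRECTS = (301, 308)


def audit_fetch_chain(hops: list[Hop], canonical_url: str, max_redirects: int = 1) -> list[str]:
    """Return human-readable problems found in one fetch chain."""
    if not hops:
        raise ValueError("hops must contain at least the final response")

    problems = []
    redirects, final = hops[:-1], hops[-1]

    # Assumed threshold: more than one redirect counts as a chain.
    if len(redirects) > max_redirects:
        problems.append(f"redirect chain has {len(redirects)} hops (limit {max_redirects})")

    for hop in redirects:
        if hop.status not in PERMANENT_REDIRECTS:
            problems.append(f"{hop.url} uses a non-permanent redirect ({hop.status})")

    if final.status != 200:
        problems.append(f"final response is {final.status}, expected 200")

    if final.url != canonical_url:
        problems.append(f"final URL {final.url} differs from canonical {canonical_url}")

    return problems


# Hypothetical chain: http -> https -> www, with a temporary first hop.
chain = [
    Hop("http://example.com/guide", 302),
    Hop("https://example.com/guide", 301),
    Hop("https://www.example.com/guide/", 200),
]
for problem in audit_fetch_chain(chain, "https://www.example.com/guide/"):
    print(problem)
```

Keeping the check separate from the fetching code means the same function works on chains recorded by whatever HTTP client or crawler log you already have.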
Content checklist: be quotable
- Lead with a direct answer. State the definition or conclusion in the first sentence, then expand.
- Use clear headings that match real questions people ask.
- Prefer concrete facts, numbers, and dated statements over vague claims — they're easier to cite and verify.
- Break processes into explicit, numbered steps.
- Add a short FAQ for the questions your audience actually types into an assistant.
- Keep one idea per paragraph so a model can lift a clean passage.
llms.txt, structured data, and crawlability
llms.txt
An llms.txt file is an emerging convention: a plain-text map at your domain root that points AI tools to your most important pages and summarizes what your site is about. It's not a ranking guarantee and not yet honored by every engine, but it's low-cost and makes your key URLs explicit. GEO Optimizer can generate one for you.
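To make the idea concrete, here is a minimal sketch that builds an llms.txt body. It assumes the Markdown-style layout commonly used for these files (a title line, a one-line blockquote summary, then sections of annotated links). Because the convention is still evolving, check the current proposal before relying on this exact shape. The site name, URLs, and build_llms_txt helper are hypothetical.

```python
def build_llms_txt(site_name: str, summary: str, sections: dict) -> str:
    """Build an llms.txt body.

    sections maps a heading to a list of (title, url, note) tuples.
    The layout here is an assumed reading of the emerging convention.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site content for illustration.
llms_txt = build_llms_txt(
    "Example Co",
    "Example Co makes invoicing software for freelancers.",
    {
        "Docs": [
            ("Getting started", "https://example.com/docs/start", "Setup in five steps"),
            ("API reference", "https://example.com/docs/api", "Endpoints and auth"),
        ],
        "Company": [
            ("Pricing", "https://example.com/pricing", "Plans and limits"),
        ],
    },
)
print(llms_txt)
```

The output would be served as a static file at /llms.txt on your domain root.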
Structured data (schema.org)
Mark up your pages with relevant schema: Article, FAQPage, Organization, WebSite. Structured data makes entities and relationships explicit, which helps both classic rich results and AI systems that read JSON-LD to disambiguate your content.
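As an illustration of what that markup looks like, the sketch below generates a schema.org FAQPage block as JSON-LD and wraps it in the script tag that goes in your page. The question text and the faq_jsonld helper are hypothetical. The property names (mainEntity, acceptedAnswer) follow the schema.org FAQPage vocabulary.

```python
import json


def faq_jsonld(questions_and_answers: list[tuple[str, str]]) -> str:
    """Return a <script type="application/ld+json"> tag for an FAQPage."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in questions_and_answers
        ],
    }
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so answer text can never close the script element early;
    # "\/" is a valid JSON escape for "/", so the data is unchanged.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical FAQ entry for illustration.
print(faq_jsonld([
    ("What is llms.txt?", "A plain-text map of a site's most important pages."),
]))
```

Generating the block from the same data that renders the visible FAQ keeps the markup and the on-page text from drifting apart.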
Crawlability basics
Maintain an accurate XML sitemap, link your pages internally so they're discoverable, and avoid orphan pages. If a human can't reach a page in a couple of clicks, a crawler probably won't either.
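The "couple of clicks" rule can be checked with a breadth-first walk over your internal links. The sketch below computes each page's click depth from the homepage, then flags sitemap URLs that are unreachable (orphans) or deeper than a chosen limit. The link graph, the sitemap list, and the depth limit of 2 are assumptions for illustration. A real run would build the graph from a crawl of your own site.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Breadth-first search: minimum number of clicks from home to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def crawlability_report(links, sitemap_urls, home="/", max_depth=2):
    """Flag sitemap URLs that are unreachable or deeper than max_depth."""
    depths = click_depths(links, home)
    orphans = sorted(u for u in sitemap_urls if u not in depths)
    too_deep = sorted(u for u in sitemap_urls if depths.get(u, 0) > max_depth)
    return {"orphans": orphans, "too_deep": too_deep}


# Hypothetical internal link graph: page -> pages it links to.
LINKS = {
    "/": ["/guides", "/pricing"],
    "/guides": ["/guides/llms-txt"],
    "/guides/llms-txt": ["/guides/llms-txt/examples"],
}
SITEMAP = [
    "/",
    "/guides",
    "/pricing",
    "/guides/llms-txt",
    "/guides/llms-txt/examples",
    "/old-landing-page",
]

print(crawlability_report(LINKS, SITEMAP))
# prints {'orphans': ['/old-landing-page'], 'too_deep': ['/guides/llms-txt/examples']}
```

Comparing against the sitemap matters here: an orphan page never appears in the link graph, so the sitemap is the only place it shows up.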
Monitor whether it's working
AI visibility isn't a one-time setup. Models, crawlers, and citation behavior change over time, so treat this like any other channel: take a baseline, make changes, and re-measure. A point-in-time audit tells you where you stand today; tracking over weeks tells you whether your changes moved anything.
Mistakes to avoid
- Blocking every AI user agent by default, then wondering why you're never cited.
- Rendering critical content only on the client.
- Chasing one engine's quirks instead of fixing fundamentals that help all of them.
- Stuffing keywords or fabricating facts — answer engines favor verifiable, well-sourced content.
- Treating llms.txt or schema as a magic switch. They help retrieval and parsing; they don't guarantee citations.
Check your site's AI visibility
See how your site scores on crawlability, llms.txt, structured data, and content signals: eight categories in total, each with specific recommendations. No account required.
The free audit is an instant, no-account diagnosis. A GeoReady account lets you save reports, monitor domains over time, and unlock the full report. See pricing.
Further reading
- The signals above are grounded in our research foundation.
- Why we think AI visibility should be auditable and open — the GEO Optimizer manifesto.
- More walkthroughs in the guides hub.