How AI Engines Choose Which Brands to Recommend: A Technical Breakdown
A definitive technical reference on how ChatGPT, Gemini, Claude, and Perplexity decide which brands to mention. Ranking factors, content signals, and what actually moves the needle.

Chris Poka
Founder
When a user asks ChatGPT "What's the best CRM for small businesses?" — how does it decide which brands to name? The answer is more nuanced than most marketers realize. This guide breaks down the technical factors that determine AI brand recommendations across all four major engines.
The Three Layers of AI Brand Selection
AI brand recommendations are determined by three distinct layers, each with different signals and optimization strategies:
| Layer | What It Is | Key Signals | Your Control Level |
|---|---|---|---|
| 1. Training Data | What the model learned during pre-training | Web content volume, authority, consistency | Medium (long-term) |
| 2. Retrieval (RAG) | Real-time web search for fresh data | SEO signals, structured data, recency | High (immediate) |
| 3. Ranking & Filtering | How the model ranks and presents options | Relevance, safety, diversity, user intent | Low (indirect) |
Layer 1: Training Data Signals
All major AI models are trained on large web crawls (Common Crawl, proprietary crawls, licensed data). Brands that appear frequently, consistently, and authoritatively in the training data are more likely to be recommended. The key signals:
- Mention frequency — How often your brand appears across the web. Volume matters, but quality of mentions matters more.
- Source authority — Mentions on high-authority sites (Wikipedia, major publications, industry review sites) carry disproportionate weight.
- Contextual consistency — If your brand is consistently associated with a category ("best CRM," "top project management tool"), models learn that association.
- Sentiment signal — The overall sentiment of mentions. Brands with predominantly positive mentions are more likely to be recommended.
- Recency of training data — Models are periodically retrained. Vendors don't publish exact schedules, but knowledge cutoffs for ChatGPT, Claude, and Gemini typically advance every few months to a year, so new mentions take time to show up in model knowledge.
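Training-data signals like "contextual consistency" aren't exposed by any model vendor, but the idea can be made concrete. Below is a toy Python sketch that scores how often a brand mention appears near a category term in a text corpus; the function name, sample documents, and the brand "Acme" are all invented for illustration and don't reflect how any lab actually measures this:

```python
import re

def cooccurrence_signal(corpus: list[str], brand: str,
                        category_terms: list[str], window: int = 12) -> float:
    """Toy 'contextual consistency' proxy: the fraction of brand mentions
    that fall within `window` tokens of a category term."""
    near, total = 0, 0
    cats = {c.lower() for c in category_terms}
    for doc in corpus:
        tokens = re.findall(r"\w+", doc.lower())
        brand_idx = [i for i, t in enumerate(tokens) if t == brand.lower()]
        cat_idx = [i for i, t in enumerate(tokens) if t in cats]
        for b in brand_idx:
            total += 1
            if any(abs(b - c) <= window for c in cat_idx):
                near += 1
    return near / total if total else 0.0

docs = [
    "Acme is widely cited as the best CRM for small teams.",
    "Reviewers rank Acme among top CRM platforms.",
    "Acme announced a new office opening.",
]
# Two of the three mentions sit near "CRM", so the score is about 0.67
print(cooccurrence_signal(docs, "Acme", ["crm"]))
```

The third document illustrates why volume alone is a weak signal: a mention with no category context contributes nothing to the association the model learns.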
Layer 2: Retrieval-Augmented Generation (RAG)
Modern AI engines don't rely solely on training data. ChatGPT (with browsing), Perplexity, and Google Gemini all perform real-time web searches to supplement their responses. This is where traditional SEO signals become relevant for AI visibility:
- Domain authority — High-DR sites are more likely to appear in RAG results
- Structured data / Schema.org — FAQ, Product, HowTo, and Review schemas help AI engines extract and cite your content
- Content freshness — Recently updated pages rank higher in RAG retrieval
- Direct answers — Content structured as clear Q&A pairs is easier for AI to extract and cite
- Page load speed and crawlability — If AI web crawlers can't access your content, you can't appear in RAG results
Critical insight: Perplexity is almost entirely RAG-based. Optimizing for Perplexity is closer to traditional SEO than optimizing for ChatGPT's training data. This means Perplexity visibility can be improved faster than other engines.
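The crawlability point above is the one item you can verify in a few lines. This sketch uses Python's standard-library `urllib.robotparser` to check whether the AI crawlers named in this guide may fetch a given path; the sample robots.txt is hypothetical:

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "Google-Extended", "PerplexityBot", "ClaudeBot"]

def check_ai_access(robots_txt: str, path: str = "/") -> dict[str, bool]:
    """Report whether each AI crawler may fetch `path` under this robots.txt."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return {agent: rp.can_fetch(agent, path) for agent in AI_CRAWLERS}

# Hypothetical robots.txt that blocks GPTBot but allows everyone else
robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(check_ai_access(robots, "/pricing"))
```

Run this against your own robots.txt (fetched however you like) to catch an accidental blanket `Disallow` before it silently removes you from RAG results.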
Layer 3: Ranking and Presentation
Even when a brand appears in both training data and RAG results, the model applies additional filtering before presenting its recommendation:
- Relevance matching — Does the brand actually match the user's specific query and intent?
- Safety filtering — Models avoid recommending brands with controversy, legal issues, or safety concerns
- Diversity pressure — Models are tuned to present multiple options rather than always naming the market leader
- Position bias — Brands mentioned first in a list get disproportionate user attention (similar to search rank #1 vs. #5)
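These filters live inside the model and are not exposed as explicit code, but their combined effect can be sketched as a toy re-ranker. Everything here — the function, the candidate brands, the blocklist — is invented purely to illustrate how relevance, safety, and position bias interact:

```python
def rank_brands(candidates: list[tuple[str, str]], query_terms: list[str],
                blocked: set[str] = frozenset(), top_k: int = 3) -> list[str]:
    """Toy Layer-3 sketch: drop unsafe brands, score the rest by naive
    keyword overlap with the query, and return the best matches in order.
    Production models do this implicitly, not with an explicit function."""
    query = {t.lower() for t in query_terms}
    scored = []
    for brand, description in candidates:
        if brand in blocked:                      # safety filtering
            continue
        overlap = len(set(description.lower().split()) & query)
        scored.append((overlap, brand))           # relevance matching
    scored.sort(reverse=True)                     # position bias: best first
    return [brand for _, brand in scored[:top_k]]

# Hypothetical brands and a hypothetical safety blocklist
candidates = [
    ("AcmeCRM", "crm for small business teams"),
    ("MegaCorp", "enterprise erp suite"),
    ("BadCo", "crm small business"),
]
print(rank_brands(candidates, ["CRM", "small", "business"], blocked={"BadCo"}))
```

Note that "BadCo" matches the query well but never appears: safety filtering runs before relevance, which is why a brand with unresolved controversies can vanish from recommendations regardless of its content signals.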
How Each Engine Differs
| Factor | ChatGPT | Gemini | Perplexity | Claude |
|---|---|---|---|---|
| Primary source | Training data + browsing | Google Search index | Real-time web search | Training data |
| Citations | Rarely links | Links in AI Overviews | Always cites sources | Rarely links |
| Brand mention style | Conversational lists | Featured snippets | Source-attributed facts | Detailed analysis |
| Update frequency | Periodic retrain + live browsing | Real-time (Google index) | Real-time | Periodic retrain |
| Best optimization lever | Web mentions + authority | Traditional SEO + Schema | Content quality + SEO | Authority + mentions |
The 8 Most Impactful Actions for AI Visibility
Based on our analysis of 500+ brands that improved their AI visibility scores over 6 months, here are the actions ranked by impact:
1. Build a comprehensive, public FAQ / knowledge base — This is the single highest-impact action. AI engines love structured Q&A content.
2. Get mentioned on high-authority review and comparison sites — G2, Capterra, Trustpilot, and industry-specific review sites heavily influence AI recommendations.
3. Implement Schema.org structured data — FAQ, Product, Organization, and HowTo schemas make your content machine-readable.
4. Create "best X" and comparison content — AI engines frequently cite roundup and comparison articles when making recommendations.
5. Maintain a Wikipedia presence — Wikipedia is one of the highest-weighted sources in AI training data.
6. Earn press coverage and thought leadership mentions — Mentions in Forbes, TechCrunch, or industry publications carry outsized weight.
7. Keep your website fast and crawlable — Ensure AI web crawlers (GPTBot, Google-Extended, PerplexityBot, ClaudeBot) can access your content.
8. Monitor and iterate — Track your AI visibility weekly across all engines, identify gaps, and continuously optimize.
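The structured-data step above is the most directly codeable. Here's a minimal sketch that emits Schema.org FAQPage markup as JSON-LD, ready to embed in a `<script type="application/ld+json">` tag; the `FAQPage`, `Question`, and `Answer` types are standard schema.org vocabulary, while the helper function and sample content are hypothetical:

```python
import json

def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Build Schema.org FAQPage JSON-LD from (question, answer) pairs."""
    doc = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(doc, indent=2)

# Hypothetical brand and Q&A content
print(faq_jsonld([
    ("What is AcmeCRM?", "A CRM built for small businesses."),
    ("Does AcmeCRM have a free tier?", "Yes, for up to three users."),
]))
```

Generating the markup from the same source as your visible FAQ keeps the two in sync — a mismatch between on-page answers and JSON-LD can get the structured data ignored.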
The bottom line: AI brand recommendations aren't random. They're driven by measurable signals that brands can influence. The brands winning in AI search are the ones treating AI visibility as a distinct, measurable marketing channel.