Guide · Published Apr 26, 2026 · 20 min read

Track Brand Mentions in ChatGPT, Claude & Perplexity (2026 Tools)

Track your brand in ChatGPT, Claude, Perplexity & Gemini. 45-min manual workflow plus tool comparisons. Step 2 of the 4-step AI visibility audit.

By Joao Da Silva, Co-Founder of friction AI · April 26, 2026 · Last updated May 18, 2026

TL;DR. Tracking brand mentions in one model is a tutorial. Tracking across three is a workflow. This guide walks the 45-minute manual process for ChatGPT, Claude, and Perplexity, the spreadsheet that holds it together, and the threshold (about 2 hours per week) where automation pays for itself. This is Step 2 of the 4-step AI visibility audit — pair it with the free 15-prompt starter set for the full workflow.

📺 Watch the 4-step audit walkthrough on YouTube — full framework walkthrough, prompt setup, and diagnosis.

When a buyer evaluates your product through AI today, they probably do not stop at one model. They ask ChatGPT, copy the answer to Claude for a second opinion, then double-check on Perplexity because they want sourced citations. Your brand has to show up cleanly across all three, and the leaderboard you sit on in one is rarely identical to the leaderboard in the next.

Most marketers track only one model, and they pick the one their CMO uses. That is a fraction of the surface where buyers are evaluating you. This is the workflow we use to get the full picture in under an hour.

Why track brand mentions across multiple AI platforms?

Cross-model variance is the single biggest blind spot in AI brand tracking. ChatGPT and Perplexity agree on the top recommended brand 60% to 80% of the time (friction AI 40-brand audit, April 2026) and disagree on positions 2 through 5 most of the time. A brand can be mentioned first in ChatGPT, fifth in Claude, and missing entirely in Perplexity. Tracking only one model hides two-thirds of the surface.

[Figure: horizontal bar chart, "Prompts where brand was mentioned, by platform." Illustrative composite out of 15 prompts, averaged over 3 runs: ChatGPT 11/15, Claude 8/15, Perplexity 13/15, Gemini 9/15, all four 7/15 (true cross-platform coverage). Real audits vary by category, prompt set, and run window.]
Cross-platform coverage is always lower than any single-platform read. Tracking one model overstates your real visibility. The "all four" row is the only honest number.

The three models source differently:

Each model exposes a different failure mode. If your brand is missing from ChatGPT but present in Perplexity, your training-data and entity-recognition surface is the gap. If you are present in ChatGPT but missing in Perplexity, your live web footprint is thin. The diagnostic value of running all three is what tells you which fix to invest in first. The full layered diagnostic methodology lives in the 4-step AI visibility audit pillar; this guide focuses on the operational side of running Step 2 across multiple platforms.
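The presence-to-diagnosis mapping above can be written down as a small lookup, which helps when you tag rows in the spreadsheet. This is a minimal sketch: the function name `diagnose_gap` is hypothetical, the first two branches restate the rules in the paragraph above, and the last two branches are our own assumptions about the cases the paragraph does not cover.

```python
def diagnose_gap(in_chatgpt: bool, in_perplexity: bool) -> str:
    """Map brand presence on two platforms to the likely gap."""
    if in_perplexity and not in_chatgpt:
        # Rule from the text: missing in ChatGPT, present in Perplexity.
        return "training-data and entity-recognition gap"
    if in_chatgpt and not in_perplexity:
        # Rule from the text: present in ChatGPT, missing in Perplexity.
        return "thin live web footprint"
    if in_chatgpt and in_perplexity:
        # Assumption: no presence gap, so compare positions instead.
        return "no presence gap; compare positions"
    # Assumption: absent everywhere means both surfaces need work.
    return "absent on both; both surfaces need work"


label = diagnose_gap(in_chatgpt=False, in_perplexity=True)
print(label)  # prints: training-data and entity-recognition gap
```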

What do you need before you start tracking?

Five things, none of them paid: free accounts on each AI platform (ChatGPT, Claude, Perplexity), a locked prompt set you'll re-run every quarter, a simple spreadsheet template, about 45 minutes of focused time per round, and ideally a second monitor or split-screen setup to run prompts in parallel browser tabs.

  1. A free account on each platform. ChatGPT, Claude, and Perplexity all have free tiers that work for the audit. Pro Search on Perplexity is not required for a first read.

  2. Your prompt set. Use the 15 starter prompts from the 4-step audit pillar as your starting set. Or build a custom 10 to 15 from sales call transcripts and Reddit thread titles.

  3. A spreadsheet template. Six columns: prompt, model, run number, mentioned (yes/no), position (1-N), notes.

  4. About 45 minutes of focused time. Three platforms × 15 prompts × 3 averaged runs = 135 prompt runs. With a copy-paste workflow and parallel browser tabs, the round takes 35 to 50 minutes.

  5. A second monitor or split-screen setup. Not strictly required. It cuts the time roughly in half because you can run the next prompt while the current one is generating.

That is the full kit. No subscriptions, no API keys, no tools. The whole thing runs in a browser.
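The workflow itself still runs entirely in a browser, but if you would rather not type the blank rows by hand, a short script can pre-fill the template. This is an optional sketch, not part of the audit: the column names follow the spreadsheet template listed above, while `build_template_rows`, `to_csv`, and the placeholder prompt labels are illustrative assumptions.

```python
import csv
import io
from itertools import product

PLATFORMS = ["ChatGPT", "Claude", "Perplexity"]
COLUMNS = ["prompt", "model", "run", "mentioned", "position", "notes"]
RUNS_PER_PROMPT = 3


def build_template_rows(prompts, platforms=PLATFORMS, runs=RUNS_PER_PROMPT):
    """Return one blank row per (prompt, platform, run) combination."""
    return [
        {"prompt": p, "model": m, "run": r, "mentioned": "", "position": "", "notes": ""}
        for p, m, r in product(prompts, platforms, range(1, runs + 1))
    ]


def to_csv(rows):
    """Serialize rows to CSV text that any spreadsheet app can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder labels (assumption); swap in your real 15 prompts.
prompts = [f"prompt {i}" for i in range(1, 16)]
rows = build_template_rows(prompts)
csv_text = to_csv(rows)
print(len(rows))  # 15 prompts x 3 platforms x 3 runs
```

Paste or import `csv_text` into your spreadsheet and fill in the last three columns as you run each prompt.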

[Figure: the 4-stage manual tracking workflow, about 45 minutes per round (three platforms, 15 prompts, 3 runs each). Stage 1, prompt setup: 15 starter prompts or your custom set. Stage 2, run across platforms: ChatGPT, Claude, Perplexity, Gemini. Stage 3, capture responses: spreadsheet with mention, position, notes. Stage 4, score and diagnose: per-layer pass rate and cross-platform overlap.]
The manual workflow is four stages, sequential. Stage 4 (score & diagnose) is where the audit pays for itself — single-run readings without scoring are anecdote, not direction.

How do you build your prompt set?

Stage 1 is picking your 10-15 prompts. The cleanest starting point is the 15-prompt starter set from the 4-step audit pillar: five brand-anchored prompts (Layer 1), five category-only prompts (Layer 2), and five comparison and validation prompts (Layer 3). The prompts are intentionally short and written in natural language, matching how buyers actually phrase queries.

If you want to customize, pull thread titles from your category subreddit and questions from your last 50 sales calls. The criterion for a tracking-worthy prompt comes from Citation Labs. It should have contrastive reasoning ("better," "worth it"), category anchoring, and a constraint clause (Citation Labs, 2025). HubSpot's Answer Engine Optimization guide makes the same point in different language: prompts that work for tracking are the ones a real buyer would type, not the ones a marketer would write. Generic "tell me about X" queries waste your run budget. They produce different answers every time without revealing useful patterns.

Whatever set you pick, freeze it. The point of tracking is to read deltas across runs and over time, and you cannot do that if the prompts move every quarter. Lock the list, and only swap a prompt if your category vocabulary genuinely shifts.

For the rest of this guide, assume you are running the standard 15 starter prompts.

How do you track brand mentions in ChatGPT?

Stage 2 starts with the ChatGPT pass. Open chat.openai.com, start a new chat for every prompt (this matters more than people realize because conversation memory contaminates subsequent answers), and paste the first prompt. Wait for the full response. Then, in your spreadsheet, log whether your brand was mentioned (yes/no), its position in the answer, and any notes on how accurately it was described.

Critical setting: run each prompt twice, once with the web search toggle OFF and once with it ON. The two answers will differ, often substantially. Web-search-off shows you what ChatGPT "knows" from training data; web-search-on shows you what it pulls live. This split is what tells you whether your gap is historical content authority or live retrieval. We covered the platform-specific setup in detail in how to track ChatGPT brand visibility, which is worth reading if ChatGPT is your highest-priority platform.

Run each prompt three times total (averaging for variance). If results vary wildly between the three runs, log the variance itself; high variance is a signal that AI is unsure about your brand, which is a Layer 1 finding.
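To make "varies wildly" concrete, you can summarize the three runs for a prompt and flag the unstable ones. The sketch below is one way to do it under stated assumptions: the function name `summarize_runs` is hypothetical, and the cutoff of a 3-position spread is our own placeholder, not a figure from the audit. Tune it to your category.

```python
from statistics import mean

# Assumption: a spread of 3 or more positions counts as "wild" variance.
SPREAD_THRESHOLD = 3


def summarize_runs(positions_by_run):
    """Summarize one prompt on one platform.

    positions_by_run holds one entry per run: the brand's position
    (1 = first) or None when the brand was not mentioned in that run.
    """
    if not positions_by_run:
        raise ValueError("need at least one run")
    hits = [p for p in positions_by_run if p is not None]
    mention_rate = len(hits) / len(positions_by_run)
    spread = max(hits) - min(hits) if hits else None
    # Unstable if the brand appears in some runs but not others,
    # or if its position swings by SPREAD_THRESHOLD or more.
    unstable = (0 < mention_rate < 1) or (
        spread is not None and spread >= SPREAD_THRESHOLD
    )
    return {
        "mention_rate": mention_rate,
        "avg_position": mean(hits) if hits else None,
        "position_spread": spread,
        "unstable": unstable,
    }


steady = summarize_runs([1, 2, 1])     # mentioned every run, small spread
shaky = summarize_runs([2, None, 6])   # missing once, large spread
print(steady["unstable"], shaky["unstable"])
```

Rows flagged as unstable are the ones worth a note in the spreadsheet, since the variance itself is the finding.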

How do you track brand mentions in Claude?

Stage 2 continues with the Claude pass. Open claude.ai, again start a new conversation for every prompt (Claude's conversation memory contaminates subsequent answers more aggressively than ChatGPT's), and paste the same prompt set you used for ChatGPT. Same scoring rubric, same spreadsheet columns, but watch for Claude-specific hedging patterns.

Two Claude-specific quirks to watch for:

Claude has the fewest public AI search benchmarks, so third-party data on what "good" looks like is thinner here. Track your own deltas quarter over quarter rather than benchmarking against an external number.

How do you track brand mentions in Perplexity?

Stage 2 finishes with the Perplexity pass. Open perplexity.ai and use Quick Search (not Pro Search) for the first round so you have a fair baseline against the other two free-tier platforms. Same prompt set, same scoring rubric. The unique advantage Perplexity offers is visible source citations on every answer, so capture those URLs.
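Once you have pasted the cited URLs into your notes column, a quick tally by domain shows which sources keep appearing. The sketch below is one way to do that with Python's standard library. The function name `count_source_domains` and the example URLs are placeholders we made up for illustration, not real citations.

```python
from collections import Counter
from urllib.parse import urlparse


def count_source_domains(urls):
    """Count how often each domain appears in a list of cited URLs."""
    domains = []
    for url in urls:
        host = urlparse(url).netloc.lower()
        # Treat www.example.com and example.com as the same source.
        if host.startswith("www."):
            host = host[4:]
        if host:
            domains.append(host)
    return Counter(domains)


# Hypothetical citations copied from Perplexity answers.
cited_urls = [
    "https://www.example.com/best-crm-tools",
    "https://example.com/crm-comparison",
    "https://community.example.org/threads/crm-advice",
]
domain_counts = count_source_domains(cited_urls)
print(domain_counts.most_common(5))
```

Run the tally per quarter and compare the top domains between rounds to see which third-party sources are gaining or losing influence.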

Perplexity is structurally different from ChatGPT and Claude in two ways that matter for tracking:

For deeper coverage of Perplexity-specific tactics (filters, source-domain analysis, Pro Search differences), see our standalone guide on tracking brand visibility in Perplexity.

See also: AI visibility tools compared

If you've reached the point where you want to graduate from manual tracking, we've compared the tools landscape in three companion posts — each focused on a different angle of the buying decision (full landscape vs head-to-head vs mid-market alternatives). Skim whichever matches your evaluation question:

We deliberately don't slot a tools matrix into this post because the comparison cluster above covers it in much more depth. Use the manual workflow here to learn the methodology first, then read the comparisons when you're ready to evaluate tooling seriously.

How do you score and aggregate your results?

After three platforms × 15 prompts × 3 runs, you have 135 data points. Reduce them to a per-platform leaderboard score. Count how many of the 15 prompts produced a clean mention (averaged across the 3 runs). Then track your average position when mentioned.

A useful summary table looks like this:

Table 1: Illustrative example of what your output might look like.

| Platform | Prompts mentioned (of 15) | Avg position when mentioned | Layer 1 pass | Layer 2 pass | Layer 3 pass |
| --- | --- | --- | --- | --- | --- |
| ChatGPT | 11 | 2.3 | 5 of 5 | 3 of 5 | 3 of 5 |
| Claude | 8 | 3.1 | 4 of 5 | 2 of 5 | 2 of 5 |
| Perplexity | 13 | 1.9 | 5 of 5 | 4 of 5 | 4 of 5 |

The pattern in the table tells you where to invest. In a hypothetical run like this one, Claude is the weakest surface. The gap concentrates in Layers 2 and 3 (visibility and recommendation). That means the fix is more off-site authority and refreshed third-party content, not on-page entity work. Perplexity is the strongest surface, which suggests live retrieval is healthier than training data.
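If your spreadsheet exports to rows of (prompt, model, run, mentioned, position), the two headline columns of the summary can be computed in a few lines. This is a sketch under one explicit assumption: we read "clean mention, averaged across the 3 runs" as "mentioned in at least 2 of 3 runs", which is our interpretation rather than a rule from the audit. The names `platform_summary` and `make_runs` and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def platform_summary(records, min_runs_mentioned=2):
    """Per platform: prompts with a clean mention and average position."""
    # Group run results by (platform, prompt); None = not mentioned.
    runs = defaultdict(list)
    for rec in records:
        position = rec["position"] if rec["mentioned"] else None
        runs[(rec["model"], rec["prompt"])].append(position)

    prompts_mentioned = defaultdict(int)
    positions = defaultdict(list)
    for (model, _prompt), results in runs.items():
        hits = [p for p in results if p is not None]
        # Assumption: a prompt counts when enough runs mention the brand.
        prompts_mentioned[model] += int(len(hits) >= min_runs_mentioned)
        positions[model].extend(hits)

    return {
        model: {
            "prompts_mentioned": prompts_mentioned[model],
            "avg_position": round(mean(positions[model]), 1)
            if positions[model] else None,
        }
        for model in prompts_mentioned
    }


def make_runs(model, prompt, results):
    """Example helper: build run records from a list of positions."""
    return [
        {"model": model, "prompt": prompt, "run": i,
         "mentioned": p is not None, "position": p}
        for i, p in enumerate(results, start=1)
    ]


# Hypothetical data: two prompts on two platforms, three runs each.
records = (
    make_runs("ChatGPT", "p1", [1, 2, None])
    + make_runs("ChatGPT", "p2", [None, None, 4])
    + make_runs("Perplexity", "p1", [1, 1, 1])
    + make_runs("Perplexity", "p2", [3, None, 3])
)
summary = platform_summary(records)
print(summary)
```

Note that the average position here includes every run where the brand appeared, even on prompts that did not reach the clean-mention bar. Restrict it to counted prompts if you prefer a stricter read.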

Calculate one cross-platform metric: the percentage of prompts where all three models mention your brand. That number is your true coverage. From the brands we have looked at, this number sits consistently below 50%. Most brands have meaningful coverage in one or two models but not all three.
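The cross-platform metric is a set intersection: keep only the prompts that every platform mentioned you on, then divide by the size of the prompt set. A minimal sketch follows. The function name `cross_platform_coverage` is hypothetical, and while the per-platform counts echo the illustrative table (11, 8, 13), the specific prompt numbers and the resulting overlap are made up for the example.

```python
def cross_platform_coverage(mentions_by_platform, total_prompts):
    """Share of prompts where every platform mentioned the brand.

    mentions_by_platform maps platform name to the set of prompt ids
    on which the brand was mentioned.
    """
    if not mentions_by_platform or total_prompts <= 0:
        raise ValueError("need at least one platform and one prompt")
    covered = set.intersection(*mentions_by_platform.values())
    return len(covered) / total_prompts


# Hypothetical prompt ids 1..15.
mentions = {
    "ChatGPT": set(range(1, 12)),             # 11 prompts
    "Claude": {1, 2, 3, 4, 5, 6, 12, 13},     # 8 prompts
    "Perplexity": set(range(1, 14)),          # 13 prompts
}
coverage = cross_platform_coverage(mentions, total_prompts=15)
print(f"{coverage:.0%}")  # prints: 40%
```

In this made-up case the best single platform reads 13 of 15, but only 6 of 15 prompts are covered everywhere, which is why the overlap number is the one to report.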

How do you read the patterns and prioritize fixes?

Three rules govern how to read the per-platform patterns and choose what to fix first. Sequence matters: fix Layer 1 before Layer 2 before Layer 3. Triangulation beats individual data points: use Perplexity's visible source URLs to identify the third-party content shaping your visibility. And prompt-level gaps beat platform-level gaps for fix priority.

We catalogued the specific patterns to look for in our 11 AI visibility failure modes guide. It is a useful companion read for Step 3 (analyze and diagnose) of the 4-step framework.

When should you graduate from manual to automated tracking?

The manual workflow is good. It costs nothing, takes under an hour per round, and produces clean directional data for the brands we've audited. It breaks down predictably at three thresholds: time spent, brand count, and reporting cadence. Hit any one of them and the math flips toward tooling:

  1. Time. If running the audit is taking more than 2 hours per week (across multiple brands or weekly cadence), the manual workflow is costing you more than tooling would.
  2. Brand count. Once you are tracking more than one brand (yours plus competitors, or multiple products), the manual workflow becomes unmanageable. The data points multiply linearly with brand count, but the spreadsheet hygiene work multiplies faster.
  3. Reporting cadence. If your CMO or board wants weekly or monthly numbers, you need automation. Quarterly is the natural manual cadence; weekly is automation territory.
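The thresholds above reduce to simple arithmetic, sketched below. It assumes 45 minutes per round per brand and that time scales linearly with brand count. Since spreadsheet hygiene grows faster than that, treat the result as a lower bound. The names `weekly_hours` and `should_automate` are hypothetical.

```python
MINUTES_PER_ROUND = 45          # one brand, three platforms, 15 prompts x 3 runs
AUTOMATION_THRESHOLD_HOURS = 2  # the per-week threshold from this guide

# Rounds per week for each reporting cadence.
ROUNDS_PER_WEEK = {"quarterly": 4 / 52, "monthly": 12 / 52, "weekly": 1.0}


def weekly_hours(brands, cadence, minutes_per_round=MINUTES_PER_ROUND):
    """Estimated manual hours per week (assumes linear scaling by brand)."""
    return brands * ROUNDS_PER_WEEK[cadence] * minutes_per_round / 60


def should_automate(brands, cadence):
    """True once the estimate exceeds the 2-hour weekly threshold."""
    return weekly_hours(brands, cadence) > AUTOMATION_THRESHOLD_HOURS


print(weekly_hours(1, "weekly"))     # prints: 0.75
print(weekly_hours(3, "weekly"))     # prints: 2.25
print(should_automate(3, "weekly"))  # prints: True
```

Under these assumptions, one brand tracked weekly stays well under the threshold, while three brands tracked weekly cross it.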

Automation also unlocks two things manual cannot. The first is historical run-over-run deltas, with every change tracked automatically and timestamped. The second is multi-prompt aggregation across hundreds of variants, which is where the Omniscient and Citation Labs research on prompt patterns becomes operationally useful at scale (Omniscient Digital, 2025).

This is the problem friction AI was built to solve. We track the 15-prompt starter set (and your custom prompts) across ChatGPT, Claude, Perplexity, and Gemini on a quarterly, monthly, or weekly schedule, and surface the deltas that matter. If you are already at the 2-hour-per-week threshold, the manual-to-automated math is straightforward. If you are not, run the manual workflow above and revisit when the spreadsheet stops fitting.

Frequently Asked Questions

The most common questions teams ask after running their first multi-platform tracking round, with concrete how-to answers grounded in the workflow above. Each answer is scoped to one specific platform or one specific decision a brand team faces during their first audit.

How do I know if ChatGPT mentions my brand?

Open chat.openai.com in a new chat, paste a brand-anchored prompt like "Who is [your brand]?" or a category prompt like "What are the best [your category] tools?", and check whether your brand surfaces in the answer. Run each prompt three times in separate chats (conversation memory contaminates subsequent runs), and log three things: mention yes/no, position in the answer, and accuracy of the description. Run once with web search OFF and once with it ON. The split between training-data recall and live retrieval is where the diagnostic value sits.

How do I track brand mentions in Claude?

Open claude.ai in a fresh conversation (not inside an existing Project, which biases context), and run the same prompt set you'd use for ChatGPT. Claude is structurally more conservative about naming brands, so read the order and confidence of brand mentions rather than just yes/no presence. Claude hedges where ChatGPT commits — "there are several strong tools" is a Claude-specific phrasing pattern, not a brand weakness. Track Claude readings as deltas quarter over quarter, not against external benchmarks.

How do I see my brand in Perplexity?

Open perplexity.ai, use Quick Search (not Pro Search) for free-tier baselines, and run your prompt set. The unique value Perplexity adds: every answer cites the source URLs it pulled from. Capture those URLs for every prompt, whether or not your brand appears. Within a quarter you'll have a map of which publications, Reddit threads, and listicles drive your AI visibility. Perplexity weights freshness most heavily of the three platforms, so it surfaces recent press faster than ChatGPT or Claude.

What's the best AI brand monitoring tool in 2026?

The honest answer depends on your scale and budget. Profound and AthenaHQ lead the enterprise tier ($1,500–5,000+/mo) with deep integrations and white-label dashboards. friction AI, Otterly, and Peec AI compete in the mid-market ($99–800/mo) with stronger multi-platform tracking. For solo founders or sub-$100/mo budgets, the manual workflow in this post genuinely outperforms paid tools at small scale. We compared all 9 in best AI visibility tools compared (2026).

How often should I run brand visibility checks?

Quarterly is the floor; monthly is right if your category is hot or your competitive landscape is shifting fast. AI's answers move every few weeks as content gets indexed and competitors enter. Single-shot audits are data; recurring audits are direction. The pattern of which prompts shift over time tells you whether your Layer 1, 2, or 3 investments are working. If your category is mature and stable, quarterly is plenty. If you're in a fast-moving SaaS category with weekly competitor launches, run monthly.

How long does the full multi-platform tracking workflow take?

About 35 to 50 minutes per round for one brand and 15 prompts run three times across ChatGPT, Claude, and Perplexity. The biggest time drain is opening a new chat for every prompt. Running prompts in parallel browser tabs roughly halves the wall-clock time once you have practiced the workflow.

Can I use ChatGPT Plus, Claude Pro, and Perplexity Pro instead of free tiers?

Yes, but it changes the readings. Pro tiers route to different, typically larger models than the free tiers. Check each platform's settings page for the current lineup if you need to lock that variable for benchmarking. If your buyers are mostly using free tiers, free-tier readings are more representative. If your audience skews technical or pro-tier, run both and track them as separate data sets.

Should I include Gemini in the audit?

Yes, if your buyers are heavy Google ecosystem users (Workspace, Android, AI Overviews surfaces). Skip it if your audience skews ChatGPT-first or Claude-first. Gemini's leaderboards differ enough from the other three that adding it is meaningful. The marginal time cost is another 15 to 20 minutes per round, which only pays off when your audience uses it.

What's the minimum useful prompt count?

Ten prompts is the floor. Below that, you cannot meaningfully separate Layer 1, 2, and 3 signals. Fifteen is the standard set we recommend (5 per layer). Above 30 prompts, the diagnostic value plateaus and the time cost becomes a tax on the workflow.

How do I track if my brand isn't mentioned at all in a prompt?

Log it as a "not mentioned" and capture which brands were mentioned, in what order. The competitive map is just as valuable as your own appearance data. Over time, the brands consistently appearing where you are absent are the ones whose third-party content surface you need to study.

Can I just check once a quarter and skip the run-three-times averaging?

You can, but you will get noisier readings. Single-run audits drift between 60% and 80% on the top brand from one minute to the next; averaging three runs gets you closer to the true signal. If time is the constraint, drop to three platforms × 15 prompts × 1 run (45 prompt runs total) rather than three platforms × 5 prompts × 3 runs, which costs the same 45 runs but leaves too few prompts to separate the Layer 1, 2, and 3 signals.

What's the difference between this audit and just searching Google for my brand?

Google measures click-driven discovery; this audit measures recommendation-driven discovery. The two surfaces share inputs (Reddit, listicles, comparison content) but weight them very differently. A brand can rank #1 on Google for "best CRM" and be missing from ChatGPT, and the reverse. Both audits are useful; they answer different questions about how buyers are finding you.


Run this audit on your own brand

Want this 4-step audit running across ChatGPT, Claude, Perplexity, and Gemini on a continuous schedule — without doing the spreadsheet by hand?

▶ Start your free trial of friction AI →

Or grab the free 15-prompt starter pack → and run the manual workflow tonight.


About the author. Joao da Silva is co-founder of friction AI alongside Camilla Wirth. friction AI tracks brand visibility across ChatGPT, Claude, Perplexity, and Gemini for SaaS and DTC brands. Joao writes about AI search, entity recognition, and the operational side of getting recommended by LLMs. Connect with him on LinkedIn.

Read on frictionai.co