Insights · April 26, 2026 · 12 min read

5 Principles for Choosing AI Visibility Prompts

AI visibility audits live or die on prompt selection. These 5 principles tell you which queries to track, drawn from analysis of 25,755 AI citations.

By Joao Da Silva, Co-Founder of friction AI


TL;DR. The single biggest mistake in AI visibility audits is not picking the wrong platform. It is picking the wrong prompts. These five principles tell you which queries to track, which to skip, and how to mine real buyer language so your audit actually diagnoses something. Companion read to the 15-prompt audit framework and to the YouTube walkthrough video.

A hand selecting one card from a row of cream cards on a linen surface, representing principled prompt selection

A bad prompt set produces clean-looking data that diagnoses nothing. The audit comes back, the dashboard fills in, the team agrees the brand is "doing okay" in AI search, and three quarters later the pipeline gap quietly traces back to a prompt set that was measuring the wrong things from day one. Prompt selection is the single highest-leverage decision in the audit, and it is the one most teams default through.

Why does prompt selection matter more than prompt count?

Prompt count is what most teams optimize. Prompt selection is what determines whether the audit produces direction. Search interest in "AI Visibility" grew 11.5x in the 12 months ending April 2026 (Google Trends, internal pull, Apr 2026), and the tooling space matured alongside it; what did not mature is most teams' instinct for which prompts are worth running in the first place.

Citation Labs studied the question and identified four properties that make a prompt worth tracking: contrastive reasoning ("better," "worth it"), category anchoring, brand anchoring (a specific name in the prompt), and constraint clauses (Citation Labs, 2025). Omniscient Digital ran a complementary 200-prompt analysis across 25,755 AI citations and identified five universal BoFu prompt patterns that hold across e-commerce, SaaS, services, and healthcare (Omniscient Digital, 2025).

Both findings point at the same operational truth: a 50-prompt audit built on bad inputs is worse than a 10-prompt audit built on real buyer queries. The principles below are how you avoid the first and produce the second.

Principle 1: Are you searching like your customer?

The most common audit mistake is testing the version of buyer queries you imagine, sanitized through your own positioning vocabulary. Real prospects do not talk that way. The gap between what you think they ask and what they actually ask is where most audit blind spots live.

Marketers tend to write prompts in their own language because their own language is what they have. They read internal docs all day, sit in pitch rehearsals, and steep in their own brand decks. By the time they sit down to write an audit prompt set, the words they reach for are the words from those internal artifacts, not the words a prospect would type into ChatGPT at 11pm trying to solve a problem.

The fix is to never write prompts from imagination. Pull them from artifacts that capture what real buyers actually say: subreddit thread titles, sales call recordings, support ticket openers, Reddit and Quora questions in your category. The tactical guide to finding buyer questions walks through exactly where to find each of those. Use those words verbatim in your prompts. If a prompt sounds like marketing copy, it is the wrong prompt.

Principle 2: Should you lead with problems or categories?

Buyers know their pain. They do not know your category. A prospect does not think "I need a CRM," they think "I keep losing track of leads." Your prompts should reflect the problem-shaped mental model, not the category-shaped one. Problem-led prompts surface a different (and often more honest) leaderboard than category-led prompts.

This matters because category-led queries reward the brands that own the category-defining content (G2 lists, "best of" listicles, established players). Problem-led queries reward the brands whose content describes the buyer's pain in the buyer's own words. These are very different surfaces, and a brand can dominate one while being invisible in the other.

The diagnostic test: take the same intent and write it two ways. Category-led: "What's the best CRM for a small team?" Problem-led: "I keep losing track of leads across email and spreadsheets. What should I use?"

Both queries are valid. Both probably matter for your audit. But they will surface different brands, and the gap between them is what tells you whether your homepage is written in marketing language or buyer language. Run one of each per buyer segment, and read the delta.
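
To make "read the delta" concrete, here is a minimal sketch in Python. The watchlist, function names, and naive substring matching are illustrative assumptions, not part of the audit framework:

```python
# Hypothetical sketch: given the raw answers to a category-led and a
# problem-led version of the same intent, see which watched brands
# appear in each and where the overlap breaks.
WATCHLIST = ["HubSpot", "Salesforce", "Pipedrive", "Attio"]  # illustrative

def brands_mentioned(answer: str) -> set[str]:
    """Naive case-insensitive substring match; real matching should
    also handle aliases and product-line names."""
    lowered = answer.lower()
    return {b for b in WATCHLIST if b.lower() in lowered}

def read_delta(category_led_answer: str, problem_led_answer: str) -> dict:
    category = brands_mentioned(category_led_answer)
    problem = brands_mentioned(problem_led_answer)
    return {
        "category_only": category - problem,  # brands that own the listicles
        "problem_only": problem - category,   # brands that own the pain language
        "both": category & problem,
    }
```

Brands that show up in "category_only" own the category-defining content; brands in "problem_only" own the buyer's pain language. A brand that wants both surfaces needs to appear in "both".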

Principle 3: Are your prompts conversational or Google-shaped?

Write prompts as full questions, the way ChatGPT was built to be used, not as Google-style fragments. The same intent produces different AI responses depending on form. ChatGPT often gives more advisory, opinionated answers to conversational queries and more list-style answers to fragments. Test in the form your buyers actually type.

A practical example. The query "best CRM SaaS" is a Google-shaped fragment. A buyer might type it into Google but is unlikely to type it into ChatGPT. The conversational version of the same intent is "What's the best CRM for a small SaaS team in 2026?" That second version is what your audit should track because it is what your buyers actually send.

The signal here is whether your prompts read like search queries or like questions. Search queries are short, keyword-dense, and stripped of grammar. Questions are full sentences, often with context ("for a 10-person team," "if I'm switching from HubSpot," "if I need API access"). LLMs respond to context. Your audit should test how they respond to the context your buyers actually provide.
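
One rough way to enforce this mechanically is a heuristic filter. The word-count threshold and starter list below are assumptions for illustration, not rules from the cited research:

```python
QUESTION_STARTERS = ("what", "how", "which", "should", "is", "can",
                     "does", "why", "who", "where", "when", "do", "are")

def is_conversational(prompt: str) -> bool:
    """Heuristic: full questions carry grammar and context;
    fragments are short and keyword-dense."""
    words = prompt.strip().split()
    if not words:
        return False
    first = words[0].lower()
    starts_like_question = any(first.startswith(q) for q in QUESTION_STARTERS)
    return len(words) >= 6 and (prompt.strip().endswith("?") or starts_like_question)

assert not is_conversational("best CRM SaaS")
assert is_conversational("What's the best CRM for a small SaaS team in 2026?")
```

Anything the heuristic flags as a fragment gets rewritten into conversational form before it enters the tracked set.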

Principle 4: Are you using buyer language or marketing copy?

Mine sales call transcripts. Skip your brand decks. The vocabulary prospects use before they know your brand exists is the vocabulary to test. Real prospects do not talk in your category's marketing dialect; they describe their pain in their own words.

This principle compounds with Principle 1 but is worth stating separately because the failure mode is different. Principle 1 is about who you imagine the buyer to be. Principle 4 is about what words you let them use. Even teams that successfully build the audit around real buyers will sometimes still rewrite the buyer's language into "cleaner" marketing-friendly versions. Don't.

The two best sources of buyer language, in order of accessibility:

  1. Sales call transcripts. Ask three of your AEs: "What are the top 10 questions buyers ask in discovery calls?" Their pattern recognition is gold. If you have Gong, Chorus, or Fireflies, mining transcripts directly is even better, but the AE shortcut works for teams without that infrastructure.
  2. First-touch support tickets. Filter Intercom, Zendesk, or HubSpot for the openers your existing customers wrote when they signed up. What they asked then is what your prospects are typing into ChatGPT right now.

A 60-minute exercise: read 10 sales call transcripts, scan 30 thread titles in your category subreddit, pull the first messages from your last 50 support tickets. You will end up with about 60 prompt candidates. De-dupe, group by funnel layer, and the strongest 15 to 20 become your tracked set.
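
A minimal sketch of the de-dupe and grouping step, assuming you have dumped the mined candidates into one text file per source. The filenames and the brand-based layer rule are illustrative assumptions:

```python
from pathlib import Path

BRAND = "acme crm"  # illustrative
SOURCES = {         # hypothetical files, one candidate prompt per line
    "subreddit": "subreddit_titles.txt",
    "sales_calls": "sales_call_questions.txt",
    "support": "support_ticket_openers.txt",
}

def funnel_layer(prompt: str) -> str:
    """Crude grouping rule: a brand mention puts the candidate in
    Layer 1/3 territory; everything else is the non-branded Layer 2."""
    return "branded" if BRAND in prompt.lower() else "non_branded"

def load_candidates() -> list[dict]:
    candidates, seen = [], set()
    for source, filename in SOURCES.items():
        for line in Path(filename).read_text().splitlines():
            prompt = " ".join(line.split())      # normalize whitespace
            key = prompt.lower().rstrip("?.!")   # crude de-dupe key
            if prompt and key not in seen:
                seen.add(key)
                candidates.append({
                    "prompt": prompt,
                    "source": source,
                    "layer": funnel_layer(prompt),
                })
    return candidates
```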

Principle 5: Are you mixing branded and non-branded prompts deliberately?

Branded prompts test how AI represents you (validation, comparison, reviews). Non-branded prompts test whether AI surfaces you at all when the buyer isn't searching by name. Both surfaces matter, and they reveal different failure modes. Most teams test only one and miss the other entirely.

The asymmetry usually runs in one direction: teams over-test branded prompts and under-test non-branded ones. Branded prompts feel more comfortable because the brand name is in the input, which means the brand name will probably be in the output too, and that feels like a win. But the larger-volume buyer queries happen at the non-branded layer (the buyer searching by category, problem, or use case), and that is where most brands have their biggest visibility gaps.

The clean split is the 5-by-3 layer framework from the 15-prompt audit: five brand-anchored prompts at Layer 1 (entity recognition), five category-only prompts at Layer 2 (visibility), and five brand-anchored prompts at Layer 3 (recommendation). Layer 2 is the non-branded layer, and skipping it is the single most common error in DIY audit construction. A balanced audit lands at roughly 60% non-branded prompts; if yours is below that, the data over-weights branded queries and buys false confidence while under-sampling the high-volume top of funnel.
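
Checking the split is a one-liner once prompts are collected; the alias list here is an illustrative stand-in for your own brand names:

```python
BRAND_ALIASES = ("acme", "acme crm")  # illustrative; include misspellings too

def non_branded_share(prompts: list[str]) -> float:
    branded = sum(
        any(alias in p.lower() for alias in BRAND_ALIASES) for p in prompts
    )
    return 1 - branded / len(prompts)

sample = [
    "What are the best CRM tools for small SaaS teams?",
    "How does Acme CRM compare to Salesforce?",
    "I keep losing track of leads, what should I use?",
]
share = non_branded_share(sample)  # 0.67 for this sample
if share < 0.6:
    print(f"Only {share:.0%} non-branded; generate more Layer 2 candidates.")
```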

How do you apply all 5 principles in 90 minutes?

Run the exercise from Principle 4. The 60-minute version produces ~60 prompt candidates pulled from real buyer language across 4 sources (subreddits, sales calls, support tickets, and meta-prompted ChatGPT for buyer-perspective queries). Then apply the 5 principles as filters:

  1. Cut anything written in your marketing voice. If a candidate sounds like a tagline, drop it (Principles 1 and 4).
  2. Convert any category-only candidate to a problem-led variant. Keep both versions and run them as a paired set (Principle 2).
  3. Make sure every candidate is a full question, not a fragment. Rewrite Google-shaped queries into conversational form (Principle 3).
  4. Group by buyer language source. Tag each candidate with where you pulled it from. Source diversity is itself a quality signal (Principle 4).
  5. Force at least 60% non-branded prompts. If you do not have enough non-branded candidates, generate more before stopping (Principle 5).

What survives that filter is your tracked prompt set. Lock it. Run it across ChatGPT, Claude, and Perplexity. Re-read the results quarterly and only swap a prompt if your category vocabulary genuinely shifts.
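
A hedged sketch of the run step, using the official OpenAI and Anthropic Python SDKs plus Perplexity's OpenAI-compatible endpoint. The model names and the Perplexity base URL are assumptions; verify them against current docs before wiring this into a workflow:

```python
from openai import OpenAI
import anthropic

openai_client = OpenAI()                   # reads OPENAI_API_KEY from env
anthropic_client = anthropic.Anthropic()   # reads ANTHROPIC_API_KEY from env
perplexity_client = OpenAI(                # Perplexity serves an OpenAI-compatible API
    base_url="https://api.perplexity.ai",  # assumed endpoint
    api_key="YOUR_PPLX_KEY",
)

def run_prompt(prompt: str) -> dict[str, str]:
    """Send one tracked prompt to all three surfaces, collect raw answers."""
    messages = [{"role": "user", "content": prompt}]
    chatgpt = openai_client.chat.completions.create(
        model="gpt-4o", messages=messages,            # illustrative model name
    ).choices[0].message.content
    claude = anthropic_client.messages.create(
        model="claude-sonnet-4-20250514",             # illustrative model name
        max_tokens=1024, messages=messages,
    ).content[0].text
    perplexity = perplexity_client.chat.completions.create(
        model="sonar", messages=messages,             # illustrative model name
    ).choices[0].message.content
    return {"chatgpt": chatgpt, "claude": claude, "perplexity": perplexity}
```

Store the raw answers per prompt per run; the quarterly re-read in the next paragraph depends on being able to compare them run over run.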

The whole exercise (mining + filtering + locking) runs in about 90 minutes for a single brand. It is the highest-leverage 90 minutes you will spend on AI search this quarter, and the prompts you produce will outlast three or four model updates.

Frequently Asked Questions

How many prompts should an AI visibility audit include?

Ten is the floor; below that, you cannot meaningfully separate Layer 1, 2, and 3 signals. Fifteen is the standard set we recommend (5 per layer). Above 30, the diagnostic value plateaus and the time cost becomes a tax on the workflow. The 15-prompt structure used in the pillar audit is the cleanest minimum.

Should I write prompts in the first or third person?

First-person framing ("Should I use X for my team?") tends to elicit more advisory, opinionated AI answers; third-person framing ("Is X good for small teams?") tends to elicit more list-style answers. Test in the form your actual buyers use. If you don't know which form they use, run both for a few prompts and look at which response shape matches what your sales team hears in calls.

How often should I refresh my tracked prompt set?

Quarterly is the default. The set itself should rarely change, only the inputs (your [brand], [category], [ICP] placeholders) and any new use cases that have become mainstream in your market. If your category vocabulary has not meaningfully shifted, do not swap prompts; you will lose the run-over-run comparison value.
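
If the tracked set is written with those placeholders, the quarterly refresh becomes a variable swap rather than a prompt rewrite. A minimal sketch, with the template wording invented for illustration:

```python
TRACKED_TEMPLATES = [  # locked; do not edit between runs
    "What's the best {category} for a {icp}?",
    "Is {brand} worth it for a {icp}?",
]

VARIABLES = {  # the only part that changes quarter to quarter
    "brand": "Acme CRM",
    "category": "CRM",
    "icp": "10-person SaaS team",
}

tracked_set = [t.format(**VARIABLES) for t in TRACKED_TEMPLATES]
# Keeping TRACKED_TEMPLATES fixed is what preserves the
# run-over-run comparison value mentioned above.
```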

What if my buyers don't use AI search yet?

Audit anyway, but adjust expectations. AI surfaces are a leading indicator. Buyers who are not using ChatGPT for evaluation today typically start within 6 to 12 months as it becomes embedded in workflows they already use (Notion AI, Google Workspace AI, Slack AI). The audit you run today is positioning for next year's buyer behavior, not just this quarter's.

Can I use the same prompt set across multiple products or business units?

No. Run a separate prompt set per product or per ICP. ChatGPT might know HubSpot the company perfectly and be vague on HubSpot Marketing Hub specifically. Same applies to multi-ICP brands. Buyer language and AI's recommendation patterns shift completely between segments, so each segment needs its own audit with its own buyer-language inputs.

What's the difference between branded and non-branded prompts?

Branded prompts include your company name (e.g., "How does HubSpot compare to Salesforce?"). Non-branded prompts do not (e.g., "What are the best CRM tools for small SaaS teams?"). Branded prompts test validation and reputation. Non-branded prompts test discovery, which is where most large-volume buyer queries actually happen. A balanced audit needs both, weighted roughly 60-40 toward non-branded.

What if my sales team doesn't have time to provide buyer questions?

Spend 30 minutes reading the most recent 20 thread titles in your category subreddit. The titles ARE the prompts; people type their problems literally as they think them. This is the single fastest substitute for sales-call mining and produces 70% of the same value in a fraction of the time.


About the author. Joao da Silva is co-founder of friction AI alongside Camilla Wirth. friction AI tracks brand visibility across ChatGPT, Claude, Perplexity, and Gemini for SaaS and DTC brands. Joao writes about AI search, entity recognition, and the operational side of getting recommended by LLMs. Connect with him on LinkedIn.
