If you have ever typed your own category into ChatGPT and waited to see whether your name appears, you have already done the first, rough version of measuring AI search visibility. The problem is that one check on one day in one assistant tells you almost nothing. Answers shift, models update, and what surfaces for you might be different from what a buyer in the next city sees. To know where you actually stand, you need a method you can repeat.
The good news is that AI visibility is measurable. You do not need to wait for a perfect industry-standard metric to emerge. You need a consistent prompt set, a few clear numbers, and the discipline to run the same test on the same schedule. Below is the exact framework we use across the audits we run, written so you can do it yourself or understand what a partner should be doing for you.
What "AI search visibility" actually means
AI search visibility is how often, and how favorably, AI assistants mention and recommend your business when people ask questions you should be the answer to. It is not the same as a keyword ranking. A traditional search result is a position in a list of links; an AI answer is a synthesized recommendation, often with no link at all. You can rank on page one of Google and still be completely absent from the answer ChatGPT gives the same buyer.
That gap is why measuring AI visibility directly matters. If you only watch your rankings, you are watching the wrong scoreboard for a growing share of buyer journeys. For the bigger picture of how this discipline fits together, our guide to answer engine optimization covers the strategy; this page is about the measurement underneath it.
It also helps to separate two related ideas. Visibility is whether and how you appear in answers; performance is what that appearance does for your business. Most people obsess over the first and never connect it to the second, which is how you end up with a flattering screenshot and no new customers. A useful measurement program holds both in view at once, so a rise in mentions can be traced to a rise in qualified inquiries rather than left as a feel-good metric.
The four metrics that matter
You can drown in dashboards. We keep it to four numbers that map to real outcomes.
| Metric | What it tells you | How to read it |
|---|---|---|
| Presence rate | How often you are mentioned at all | Mentions ÷ total prompts tested |
| Share of voice | How often you are the top recommendation | Top-pick appearances ÷ total prompts |
| Citation rate | How often your own pages are cited as sources | Prompts citing your domain ÷ total prompts |
| AI referral traffic | Real visits coming from AI assistants | Sessions from AI sources in analytics |
The first three come from prompt testing. The fourth comes from your own analytics. Together they answer the two questions that matter: are you in the conversation, and is being in the conversation sending you business?
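As a rough illustration of the arithmetic in the table, here is a minimal Python sketch that turns a batch of logged prompt results into the three prompt-testing metrics. The `PromptResult` fields and the sample rows are assumptions made for the example, not real audit data, and your own log may carry more columns.

```python
from dataclasses import dataclass


@dataclass
class PromptResult:
    prompt: str
    mentioned: bool  # you were named anywhere in the answer
    top_pick: bool   # you were the first or recommended option
    cited: bool      # the answer cited a page on your own domain


def visibility_metrics(results):
    """Return presence rate, share of voice, and citation rate as fractions."""
    total = len(results)
    if total == 0:
        raise ValueError("no prompt results to score")
    return {
        "presence_rate": sum(r.mentioned for r in results) / total,
        "share_of_voice": sum(r.top_pick for r in results) / total,
        "citation_rate": sum(r.cited for r in results) / total,
    }


# Illustrative results only (hypothetical, not real audit data).
results = [
    PromptResult("best mortgage broker in Seattle", True, True, True),
    PromptResult("top accounting firms for small business", True, False, True),
    PromptResult("I need help refinancing with bad credit, who should I call?", True, False, False),
    PromptResult("[competitor] vs alternatives", False, False, False),
]

metrics = visibility_metrics(results)
print(metrics)
# {'presence_rate': 0.75, 'share_of_voice': 0.25, 'citation_rate': 0.5}
```

All three rates share the same denominator (total prompts tested), which is what makes them comparable from one month to the next.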
Step 1: Build a prompt set that mirrors real buyers
The single biggest mistake we see is testing with your brand name. Asking "Is [Your Company] good?" tells you nothing about discovery, because the prompt has already handed the assistant your name. Buyers who do not yet know you do not type your name. They describe a need.
Write 15 to 30 prompts the way a customer would phrase them, in plain language:
- Category prompts: "best mortgage broker in Seattle," "top accounting firms for small business."
- Problem prompts: "I need help refinancing with bad credit, who should I call?"
- Comparison prompts: "[competitor] vs alternatives," "who is better than [competitor] for X?"
- Local prompts: your service plus your city, neighborhood, or region.
Lock this list down and reuse it every time. A consistent prompt set is what turns a hunch into a trend you can trust.
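One way to keep the list locked is to store it as data and fingerprint it, so every logged run can record which version of the prompt set it used. The sketch below is only an illustration: the `{competitor}`, `{service}`, and `{city}` placeholders, the function names, and the sample values are all hypothetical, and a real set would hold 15 to 30 prompts.

```python
import hashlib
import json

# A shortened, illustrative prompt set grouped by the four prompt types.
PROMPT_SET = {
    "category": [
        "best mortgage broker in Seattle",
        "top accounting firms for small business",
    ],
    "problem": [
        "I need help refinancing with bad credit, who should I call?",
    ],
    "comparison": [
        "{competitor} vs alternatives",
    ],
    "local": [
        "{service} in {city}",
    ],
}


def render_prompts(prompt_set, **values):
    """Fill in placeholders and return (prompt_type, prompt_text) pairs."""
    return [
        (kind, template.format(**values))
        for kind, templates in prompt_set.items()
        for template in templates
    ]


def fingerprint(prompt_set):
    """Short hash of the prompt set; it changes whenever any prompt changes."""
    canonical = json.dumps(prompt_set, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


prompts = render_prompts(
    PROMPT_SET,
    competitor="Example Lending Co",  # hypothetical competitor name
    service="mortgage broker",
    city="Seattle",
)
print(len(prompts), "prompts, set version", fingerprint(PROMPT_SET))
```

If the fingerprint stored with last month's log differs from this month's, the prompt set changed and the two readings are not a like-for-like comparison.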
Step 2: Test across the engines that matter
Visibility is not one number; it is a spread across platforms that behave differently. Run your prompt set through each of the assistants your buyers actually use:
- ChatGPT: the highest-traffic assistant, with live web browsing for current results.
- Google AI Overviews and AI Mode: what appears above the classic blue links for many searches.
- Perplexity: citation-heavy, so it shows clearly whether your pages are trusted sources.
- Gemini: tied into the Google ecosystem and Workspace.
- Microsoft Copilot: relevant for any audience that lives in Windows and Office.
You will quickly notice that you can be strong in one and invisible in another. That spread is a feature of the measurement, not a flaw. It tells you where the next unit of work should go. If a whole platform is dark, our breakdown of why your business is not showing up in AI search walks through the usual culprits.
Step 3: Score and log every result
For each prompt on each platform, capture four things in a simple spreadsheet: whether you were mentioned, whether you were the first or recommended option, whether you were cited with a link, and the tone of the mention. Tone matters more than people expect. Being named as "an option to consider" is not the same as being named as "the firm clients consistently recommend," and that difference in AI brand sentiment moves real decisions.
Run two or three variations of each prompt and average the results. AI answers vary from run to run, so a single run can mislead you. Date every entry so you can compare month over month against the same baseline.
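The logging and averaging described here can be sketched in a few lines of Python. The column names, dates, and sample rows below are assumptions for illustration only; a spreadsheet export with the same columns would work the same way.

```python
from collections import defaultdict

# Illustrative log rows (hypothetical data). Flags are 1 for yes, 0 for no.
log = [
    {"date": "2025-01-06", "platform": "ChatGPT", "prompt": "best mortgage broker in Seattle",
     "mentioned": 1, "top_pick": 1, "cited": 0, "tone": "recommended"},
    {"date": "2025-01-06", "platform": "ChatGPT", "prompt": "best mortgage broker in Seattle",
     "mentioned": 1, "top_pick": 0, "cited": 0, "tone": "an option to consider"},
    {"date": "2025-01-06", "platform": "ChatGPT", "prompt": "best mortgage broker in Seattle",
     "mentioned": 0, "top_pick": 0, "cited": 0, "tone": "absent"},
    {"date": "2025-01-06", "platform": "Perplexity", "prompt": "best mortgage broker in Seattle",
     "mentioned": 1, "top_pick": 1, "cited": 1, "tone": "recommended"},
]


def average_runs(rows):
    """Average the yes/no flags across repeated runs of the same prompt."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["date"], row["platform"], row["prompt"])].append(row)

    averaged = {}
    for key, group in grouped.items():
        averaged[key] = {
            flag: sum(r[flag] for r in group) / len(group)
            for flag in ("mentioned", "top_pick", "cited")
        }
        # Tone is kept as a list because it is a label, not a number.
        averaged[key]["tones"] = [r["tone"] for r in group]
    return averaged


averaged = average_runs(log)
for key, scores in averaged.items():
    print(key, scores)
```

Averaging gives partial credit: a prompt that mentions you in two of three runs scores about 0.67 for presence instead of flipping between 0 and 1 depending on which run you happened to log.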
Step 4: Connect visibility to traffic and leads
Prompt testing tells you what AI is saying. Your analytics tell you whether it is paying off. In Google Analytics, segment referral traffic by source and look for AI domains such as chatgpt.com, perplexity.ai, gemini.google.com, and copilot.microsoft.com. Watch session count, but watch behavior more closely: AI-referred visitors often arrive further along in their decision and convert at a higher rate, because the assistant has already done the vetting.
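If you export session data from your analytics tool, a short script can do the same segmentation. This is a sketch under assumptions: the `referrer` and `converted` fields and the sample sessions are hypothetical and do not reflect any particular analytics export format. Only the four domains come from the list above.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlparse

# Referrer domains named above, mapped to a readable label.
AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "copilot.microsoft.com": "Microsoft Copilot",
}


def ai_source(referrer: str) -> Optional[str]:
    """Return the assistant label for a referrer URL, or None if it is not an AI source."""
    if "//" not in referrer:
        referrer = "//" + referrer  # lets urlparse read bare hostnames
    host = (urlparse(referrer).hostname or "").lower()
    for domain, label in AI_REFERRERS.items():
        if host == domain or host.endswith("." + domain):
            return label
    return None


def summarize(sessions):
    """Count AI-referred sessions and their conversion rate per assistant."""
    visits, conversions = Counter(), Counter()
    for session in sessions:
        source = ai_source(session["referrer"])
        if source is None:
            continue
        visits[source] += 1
        conversions[source] += int(session["converted"])
    return {
        source: {"sessions": visits[source],
                 "conversion_rate": conversions[source] / visits[source]}
        for source in visits
    }


# Hypothetical sessions for illustration.
sessions = [
    {"referrer": "https://chatgpt.com/", "converted": True},
    {"referrer": "https://chatgpt.com/", "converted": False},
    {"referrer": "https://www.perplexity.ai/search", "converted": True},
    {"referrer": "https://www.google.com/", "converted": False},
    {"referrer": "", "converted": False},
]
summary = summarize(sessions)
print(summary)
```

Matching on the exact domain or a subdomain of it keeps ordinary google.com search traffic out of the Gemini bucket.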
This is also where you connect the dots to outcomes. We saw it clearly with Keith Akada, a Seattle mortgage broker who went from invisible in AI answers to the most-recommended broker in his market in about six weeks. That shift showed up as roughly 30 leads and four closed deals. The number we cared about was not a vanity score; it was the booked business that followed once the recommendations started landing.
Step 5: Turn the numbers into a fix list
Measurement only matters if it tells you what to do next. Each metric points to a different lever:
- Low presence rate often means thin or buried content, missing structured data, or inconsistent business information across the web.
- Low share of voice usually means competitors have stronger reviews, citations, and third-party mentions than you do.
- Low citation rate points to pages that are not written in an extractable, answer-first format models can quote.
- Strong mentions but flat traffic suggests the recommendation is there but the click-through or follow-up path is weak.
Read your scorecard as a to-do list. The framework above tells you which gap to close first, so you are not trying to fix everything at once. To go deeper on what to track over time, our rundown of AI search KPIs worth tracking pairs naturally with this measurement routine.
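The mapping from metrics to levers can be written down as a few simple rules. In the sketch below, the `low` and `strong` cutoffs (0.30 and 0.60) are placeholder assumptions, not benchmarks from our audits, and the function and parameter names are hypothetical. Set thresholds that make sense against your own baseline.

```python
def fix_list(metrics, ai_sessions_change, low=0.30, strong=0.60):
    """Translate a scorecard into the levers described in the list above.

    metrics: dict with presence_rate, share_of_voice, citation_rate (0 to 1).
    ai_sessions_change: change in AI-referred sessions versus last period.
    low / strong: illustrative cutoffs, chosen here only for the example.
    """
    fixes = []
    if metrics["presence_rate"] < low:
        fixes.append("Presence: strengthen content, structured data, "
                     "and consistent business information")
    if metrics["share_of_voice"] < low:
        fixes.append("Share of voice: build reviews, citations, "
                     "and third-party mentions")
    if metrics["citation_rate"] < low:
        fixes.append("Citations: rewrite key pages in an extractable, "
                     "answer-first format")
    if metrics["presence_rate"] >= strong and ai_sessions_change <= 0:
        fixes.append("Traffic: improve the click-through and follow-up path")
    return fixes


# Hypothetical scorecard for illustration.
scorecard = {"presence_rate": 0.20, "share_of_voice": 0.10, "citation_rate": 0.50}
todo = fix_list(scorecard, ai_sessions_change=0)
for item in todo:
    print("-", item)
```

The rules are listed in the same order as the bullets, which is a presentation choice rather than a fixed priority.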
How often to measure
Monthly is the right cadence for most businesses. AI answers shift from week to week, so daily checks mostly capture noise and a one-time check cannot show a trend. Pick a day, run the full prompt set, log it, and compare against last month. Quarterly, step back and look at the trend line rather than any single reading. The point of a fixed schedule is to make movement visible and to tie it to the changes you actually made.
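The monthly comparison itself is a subtraction between two dated snapshots. A minimal sketch, assuming each month's metrics are stored under a `YYYY-MM` key; the snapshot values are made up for illustration.

```python
# Hypothetical monthly snapshots keyed by "YYYY-MM".
snapshots = {
    "2025-01": {"presence_rate": 0.40, "share_of_voice": 0.10, "citation_rate": 0.20},
    "2025-02": {"presence_rate": 0.55, "share_of_voice": 0.15, "citation_rate": 0.20},
}


def month_over_month(snapshots):
    """Return the change in each metric between the two most recent months."""
    months = sorted(snapshots)  # "YYYY-MM" strings sort chronologically
    if len(months) < 2:
        raise ValueError("need at least two monthly snapshots")
    previous, current = snapshots[months[-2]], snapshots[months[-1]]
    return {
        name: round(current[name] - previous[name], 4)
        for name in current
        if name in previous
    }


deltas = month_over_month(snapshots)
print(deltas)
# {'presence_rate': 0.15, 'share_of_voice': 0.05, 'citation_rate': 0.0}
```

Keeping a short note beside each snapshot about what you changed that month makes it easier to tie a movement to a cause.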
The bottom line
Measuring AI search visibility is not mysterious. Build a buyer-realistic prompt set, run it across the major assistants on a fixed schedule, track presence, share of voice, citations, and referral traffic, and read the results as a fix list. Do that consistently and you stop guessing about whether AI is recommending you and start seeing exactly where you stand and what to do next. That baseline is the foundation everything else in AI search builds on.