Anatomy of an AI Visibility Audit

This lesson breaks down the structural framework of a professional AI Visibility Audit, moving from data retrieval to strategic optimisation for generative engines.

12 min read
Foundations

Introduction

Transitioning from traditional SEO auditing to AI Visibility auditing requires a shift in perspective. While traditional audits focus on crawlability, indexing, and blue-link rankings, an AI Visibility Audit (AVA) examines how Large Language Models (LLMs) and Generative Search Engines (GSEs) perceive, synthesise, and reference a brand's data. This lesson provides a step-by-step breakdown of the audit's anatomy, ensuring practitioners can provide clients with actionable, data-driven roadmaps for the age of Answer Engine Optimisation (AEO).

The Three Pillars of an AI Visibility Audit

A robust audit is divided into three distinct segments: The Data Input Audit, The Model Response Audit, and The Attribution Analysis. Each pillar addresses a different stage of the generative process.

1. The Data Input Audit (Sources and Scrutiny)

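Before an AI can recommend a brand, it must have access to high-quality data. In this phase, we audit what the brand contributes to a model's 'training set' (the content it may have learned from) and its 'retrieval set' (the sources it can fetch when answering).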

  • Structured Data Health: Are Schema.org types (specifically Product, Service, Organization, and FAQPage) implemented correctly? AI models use structured data as a shortcut to understanding relationships.
  • Semantic Clarity: Using tools like Google's Natural Language API, we test whether the website content is written in a way that machines can easily parse into entities and attributes.
  • Third-Party Citations: AI models rely heavily on off-site data. We audit the 'Brand Footprint' on Wikidata, Wikipedia, industry-specific directories, and high-authority news sites.
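
The Structured Data Health check can be partly automated. The following is a minimal sketch, assuming the page embeds its markup as JSON-LD in script tags; it only reads top-level @type values and does not unpack @graph containers. The function name declared_types and the sample HTML are illustrative, not part of any audit tool.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the raw contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def declared_types(html):
    """Return the set of top-level @type values found in a page's JSON-LD."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    types = set()
    for raw in parser.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON-LD is itself worth recording as a finding
        items = data if isinstance(data, list) else [data]
        for item in items:
            if isinstance(item, dict):
                value = item.get("@type", [])
                types.update([value] if isinstance(value, str) else value)
    return types


# Illustrative page: only an Organization block is present.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}
</script>
</head><body></body></html>
"""

EXPECTED = {"Product", "Service", "Organization", "FAQPage"}
found = declared_types(SAMPLE_HTML)
missing = EXPECTED - found
print("found:", sorted(found))
print("missing:", sorted(missing))
```

A check like this only confirms that a type is declared. Whether the properties inside each block are complete and accurate still needs a dedicated validator or a manual review.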

2. The Model Response Audit (The "Black Box" Test)

This involves querying various models (ChatGPT, Perplexity, Gemini, Claude) with a standardised set of prompts to see how the brand is currently positioned.

  • Direct Brand Queries: "What is [Brand Name]?" / "Who are the founders of [Brand Name]?"
  • Comparison Queries: "What are the best alternatives to [Competitor]?" / "Compare [Brand A] and [Brand B]."
  • Solution-Based Queries: "What is the best software for [Problem]?"
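
To keep the prompt set standardised across models, the three query types above can be generated from templates. This is a minimal sketch: the category keys, the build_prompt_set helper, and the sample brand, competitor, and problem values are all hypothetical, and it assumes at least one competitor and one problem are supplied.

```python
from itertools import product

# Templates taken from the three query types in this lesson.
TEMPLATES = {
    "direct_brand": [
        "What is {brand}?",
        "Who are the founders of {brand}?",
    ],
    "comparison": [
        "What are the best alternatives to {competitor}?",
        "Compare {brand} and {competitor}.",
    ],
    "solution": [
        "What is the best software for {problem}?",
    ],
}


def build_prompt_set(brand, competitors, problems):
    """Expand every template once per competitor/problem pair, dropping duplicates."""
    seen = set()
    rows = []
    for category, templates in TEMPLATES.items():
        for template in templates:
            for competitor, problem in product(competitors, problems):
                # Unused placeholders are simply ignored by str.format.
                prompt = template.format(
                    brand=brand, competitor=competitor, problem=problem
                )
                if prompt not in seen:
                    seen.add(prompt)
                    rows.append((category, prompt))
    return rows


prompt_rows = build_prompt_set(
    brand="Acme",
    competitors=["Rival A", "Rival B"],
    problems=["invoice tracking"],
)
for category, prompt in prompt_rows:
    print(f"{category}: {prompt}")
```

Running the same generated list against every model keeps the results comparable from one audit to the next.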

3. The Attribution Analysis (The 'Why' behind the Answer)

When an AI provides an answer, we must identify which sources it cited. This reveals our strongest visibility assets and our competitors' leverage points.

  • In-text Citations: Tracking which specific pages from our site or third-party sites are linked.
  • Bibliographic Reference: Analysing the 'Sources' section in tools like Perplexity or Google's AI Overviews (formerly Search Generative Experience, SGE).
  • Sentiment Analysis: Is the model's tone regarding the brand positive, neutral, or critical? If it is critical, is the criticism based on inaccurate information?
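
Once citations have been collected by hand, tallying them by domain shows which sources the models lean on most. The sketch below assumes the citations were recorded as plain URL lists per response; the record layout, the cited_domains helper, and the example.* domains are hypothetical. It requires Python 3.9 or later for str.removeprefix.

```python
from collections import Counter
from urllib.parse import urlparse


def cited_domains(responses):
    """Count how many responses cite each domain at least once."""
    counts = Counter()
    for response in responses:
        # A set ensures one answer citing the same site repeatedly counts once.
        domains = {
            urlparse(url).netloc.lower().removeprefix("www.")
            for url in response["citations"]
        }
        counts.update(domains)
    return counts


# Hypothetical audit log: one record per model response.
responses = [
    {
        "model": "model-a",
        "citations": [
            "https://www.example.com/safaris",
            "https://example.com/about",
            "https://news.example.org/best-operators",
        ],
    },
    {
        "model": "model-b",
        "citations": [
            "https://news.example.org/best-operators",
            "https://directory.example.net/kenya",
        ],
    },
]

domain_counts = cited_domains(responses)
for domain, count in domain_counts.most_common():
    print(f"{domain}: cited in {count} response(s)")
```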

The Worked Example: 'LuxBoutique' Travel Group

Let’s apply this anatomy to a mid-sized luxury travel agency.

Step 1: Scoping the Prompts. We create a spreadsheet with 50 prompts categorised by intent (Informational, Transactional, Navigational). One prompt is: "What are the most ethical luxury safari operators in Kenya?"

Step 2: Execution. We run this prompt through Gemini and Perplexity. Gemini provides a list of three operators, but LuxBoutique is not among them. However, one of the sources Gemini cites is a 2023 Condé Nast article.

Step 3: Gap Analysis. We audit the Condé Nast article. LuxBoutique is mentioned in the text, but its link points to a broken page or an outdated package.

Step 4: Remediation Plan. The audit recommendation isn't "more keywords." It is: "Update the 2023 package page, implement 301 redirects to the 2024 equivalent, and reach out to the author at Condé Nast to update the link. Simultaneously, improve the 'Ethics & Sustainability' section on the LuxBoutique site using AboutPage Schema to make it more likely that LLMs categorise the brand as 'Ethical'."
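
The Schema part of the remediation plan could look something like the sketch below, which builds AboutPage JSON-LD with the organisation as its main entity. Every value here is a placeholder: the luxboutique.example domain, the description, and the ethics URL are invented for illustration. The property names (mainEntity, ethicsPolicy) should be checked against the Schema.org reference before deployment, and markup alone does not guarantee how any model will categorise the brand.

```python
import json


def about_page_jsonld(name, url, description, ethics_url):
    """Build AboutPage JSON-LD whose main entity is the organisation."""
    return {
        "@context": "https://schema.org",
        "@type": "AboutPage",
        "url": url + "/about",
        "mainEntity": {
            "@type": "Organization",
            "name": name,
            "url": url,
            "description": description,
            # Assumed to point at the 'Ethics & Sustainability' page.
            "ethicsPolicy": ethics_url,
        },
    }


# Placeholder values for the worked example; none come from a real site.
markup = about_page_jsonld(
    name="LuxBoutique Travel Group",
    url="https://luxboutique.example",
    description="Luxury travel agency offering safari itineraries in Kenya.",
    ethics_url="https://luxboutique.example/ethics-sustainability",
)
print(json.dumps(markup, indent=2))
```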

Technical Components: The Audit Checklist

A professional-grade AI Visibility Audit must contain the following technical checks:

  1. Robots.txt Analysis: Checking specifically for GPTBot, CCBot, and Google-Extended permissions. Are you accidentally blocking the models you want to be visible in?
  2. Entity Mapping: Identifying the 'Primary Entity' of every key landing page. Do the 'Name', 'Description', and 'Image' match the Knowledge Graph expectation?
  3. LLM Sentiment Baseline: Using a scale of -1 to +1 to score how current models describe the brand's pricing, reliability, and innovation.
  4. Information Density: Auditing the 'Fluff-to-Fact' ratio. LLMs prefer high-density information over marketing prose.
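
The robots.txt check in item 1 can be scripted with Python's standard urllib.robotparser module. This is a minimal sketch that parses robots.txt text already in hand rather than fetching it; the sample file, the ai_crawler_access helper, and the example.com URL are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the checklist above.
AI_USER_AGENTS = ["GPTBot", "CCBot", "Google-Extended"]


def ai_crawler_access(robots_txt, url):
    """Return {user_agent: allowed?} for the given URL under these robots.txt rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_USER_AGENTS}


# Illustrative robots.txt that blocks GPTBot but allows everything else.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

access = ai_crawler_access(SAMPLE_ROBOTS, "https://example.com/packages")
for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

Here only GPTBot is reported as blocked, because the other two tokens fall through to the wildcard rule. In a live audit, run the check against several key URLs, since rules are often path-specific.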

Competitive Benchmarking

An audit is incomplete without knowing where the brand sits relative to its peers. We use a 'Share of Model' (SoM) metric. If a prompt is run 10 times and your brand is mentioned in 3 instances, while a competitor is mentioned in 8, the competitor has a higher 'Generative Authority'. We analyse why: Is it their Reddit presence? Their Wikipedia backlink? Their clear, bulleted product descriptions?
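
The SoM arithmetic above can be expressed as mentions divided by runs. The sketch below reproduces the 3-in-10 versus 8-in-10 example with invented response text. It uses simple case-insensitive substring matching, which is an assumption: real brand names may need alias handling, and the share_of_model helper is hypothetical.

```python
def share_of_model(responses, brands):
    """Fraction of responses mentioning each brand (case-insensitive substring match)."""
    total = len(responses)
    if total == 0:
        raise ValueError("at least one response is required")
    return {
        brand: sum(brand.lower() in text.lower() for text in responses) / total
        for brand in brands
    }


# Ten invented runs of the same prompt:
# 3 mention both brands, 5 mention only the competitor, 2 mention neither.
runs = (
    ["Top picks include YourBrand and Competitor."] * 3
    + ["Most reviewers recommend Competitor."] * 5
    + ["There are several regional operators to consider."] * 2
)

som = share_of_model(runs, ["YourBrand", "Competitor"])
print(som)
```

With these runs, YourBrand scores 0.3 and Competitor scores 0.8, matching the example in the text.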

Formatting the Audit Report

Clients don't want a 100-page PDF of raw data. The report should follow this structure:

  • Executive Summary: The 'AI Visibility Score' (a proprietary or internal metric).
  • Visibility Heatmap: Which models 'know' the brand and which are 'blind' to it.
  • The Attribution Gap: A list of high-authority sites that AI models trust where the brand is currently missing.
  • Content Optimisation Roadmap: Specific pages that need 'Entity-First' rewriting.
  • Technical Fixes: Immediate changes to Schema or robots.txt.

Putting It Into Practice

To begin your audit journey, do not attempt to audit an entire site of 10,000 pages.

  1. Identify the Core 20: Pick the 20 most commercially important queries for the client.
  2. Run the 'Zero Baseline': Manually query these in three different LLMs and record the citations.
  3. Map the Citation Path: For every citation that isn't the client's site, identify if that source is a publisher you can influence via PR or a directory you can update via SEO.
  4. Audit the 'About' Page: Ensure the About page is the 'Source of Truth' for the brand entity, containing clear facts, dates, and leadership names.
  5. Monitor and Iterate: AI models update their weights and retrieval methods often. A visibility audit is a snapshot in time; recommend a quarterly review to the client.
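
Step 3, mapping the citation path, amounts to labelling each cited URL as owned, a publisher, or a directory. The following is a minimal sketch assuming you maintain that lookup by hand; the domains, the SOURCE_TYPES table, and the citation_path helper are hypothetical. It requires Python 3.9 or later for str.removeprefix.

```python
from urllib.parse import urlparse

CLIENT_DOMAIN = "client.example"  # hypothetical client site

# Hand-maintained lookup of known third-party sources (illustrative entries).
SOURCE_TYPES = {
    "magazine.example": "publisher",
    "listings.example": "directory",
}

ACTIONS = {
    "owned": "no outreach needed",
    "publisher": "influence via PR",
    "directory": "update via SEO",
    "unknown": "research manually",
}


def citation_path(url):
    """Return (domain, source type, suggested action) for one cited URL."""
    domain = urlparse(url).netloc.lower().removeprefix("www.")
    if domain == CLIENT_DOMAIN or domain.endswith("." + CLIENT_DOMAIN):
        kind = "owned"
    else:
        kind = SOURCE_TYPES.get(domain, "unknown")
    return domain, kind, ACTIONS[kind]


citations = [
    "https://www.client.example/about",
    "https://magazine.example/best-of-2024",
    "https://listings.example/travel/kenya",
    "https://forum.example/thread/123",
]
paths = [citation_path(url) for url in citations]
for domain, kind, action in paths:
    print(f"{domain}: {kind} -> {action}")
```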

Visual diagram

A flow chart showing data moving from a website and third-party sources into an 'AI Model Processing' box, outputting to a 'Generative Response' with lines looping back to identify 'Citation Sources'.

Exercise

Select a local business and query three different AI models (e.g., ChatGPT, Gemini, Perplexity) with the prompt 'Compare [Business Name] with its top two competitors'. List the sources cited by each model and identify one 'Citation Gap' where a competitor is mentioned on a third-party site but your chosen business is not.

Key takeaways

  • AI Visibility Audits focus on how LLMs synthesise data, not just keyword rankings.
  • The audit consists of three pillars: Data Input, Model Response, and Attribution Analysis.
  • Structured data (Schema.org) acts as a critical shortcut for AI model understanding.
  • Third-party citations (Wikipedia, Wikidata, News) are as important as on-page content.
  • Prompts for the audit should cover Brand, Comparison, and Solution intent.
  • Attribution Analysis identifies the specific sources an LLM trusts for an answer.
  • A 'Share of Model' (SoM) metric helps benchmark against competitors.
  • Technical checks must include robots.txt permissions for specific AI crawlers.
  • LLMs prefer high-density information and clear entity definitions over marketing fluff.
  • Audits should result in a prioritised roadmap of content, technical, and PR updates.

Lesson Quiz

Pass at 70%.

1. What is the primary difference between a traditional SEO audit and an AI Visibility Audit?
2. Which Schema.org types are most critical for establishing a 'source of truth' for an AI model?
3. In the context of an audit, what does 'Share of Model' (SoM) measure?
4. What is 'Attribution Analysis' in an AI Visibility Audit?
5. Which robots.txt user-agent is specifically used by OpenAI to crawl for GPT-4 training data?
6. Why is 'Information Density' important for AI visibility?
7. What is the purpose of a 'Solution-Based Query' during an audit?
8. In the LuxBoutique example, why was the Condé Nast mention problematic?
9. What role does Wikidata play in an AI Visibility Audit?
10. How often should an AI Visibility Audit ideally be performed for a dynamic brand?