Introduction
In the era of Generative Engine Optimisation (GEO), a brand's visibility is no longer measured solely by blue links. Instead, the metric of success is the frequency and quality of 'citations'—the references AI models provide to validate their claims. Running a citation audit is the process of systematically cataloguing where LLMs (Large Language Models) like ChatGPT, Claude, and Gemini are sourcing their information and how often your brand (or your client’s brand) is being featured.
This lesson provides a repeatable, data-led methodology for capturing these citations at scale. Unlike traditional SEO audits that rely on crawler data, citation audits require a mix of prompt engineering, sentiment analysis, and source attribution tracking to understand why a model chooses one source over another.
The Citation Audit Framework
A professional citation audit consists of four distinct phases: Discovery, Extraction, Categorisation, and Gap Analysis. The first three are set out as numbered phases below; Gap Analysis is covered under Identifying Citation 'Leakage'. By following this sequence, practitioners can move from anecdotal 'spot checks' to a comprehensive visibility profile.
1. The Discovery Phase: Defining the Prompt Set
You cannot audit 'everything'. You must define a representative set of prompts across the user journey. We categorise these into three buckets:
- Brand/Navigational: "What is [Brand Name]?" or "Who founded [Brand Name]?"
- Category/Commercial: "What are the best CRM tools for small businesses in the UK?"
- Informational/How-to: "How do I calculate VAT for a remote workforce?"
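For a standard audit, aim for a sample of 50 to 100 prompts per target model, spread across all three buckets. A sample of this size does not guarantee statistical significance, but it is large enough that recurring sources are unlikely to be artefacts of a single prompt.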
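One way to keep the prompt set organised is to store each prompt alongside its bucket and fill in placeholders such as [Brand Name] programmatically. The sketch below is illustrative only: `PromptSpec`, `build_prompt_set`, the bucket labels and the brand "ExampleCRM" are hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical bucket labels mirroring the three categories above.
BUCKETS = ("brand", "category", "informational")


@dataclass(frozen=True)
class PromptSpec:
    bucket: str    # one of BUCKETS
    template: str  # may contain the {brand} placeholder


def build_prompt_set(specs, brand):
    """Expand each template for one brand and return (bucket, prompt) pairs."""
    prompts = []
    for spec in specs:
        if spec.bucket not in BUCKETS:
            raise ValueError(f"Unknown bucket: {spec.bucket}")
        prompts.append((spec.bucket, spec.template.format(brand=brand)))
    return prompts


specs = [
    PromptSpec("brand", "What is {brand}?"),
    PromptSpec("brand", "Who founded {brand}?"),
    PromptSpec("category", "What are the best CRM tools for small businesses in the UK?"),
    PromptSpec("informational", "How do I calculate VAT for a remote workforce?"),
]

# "ExampleCRM" is a made-up brand used only to show the expansion.
prompt_set = build_prompt_set(specs, brand="ExampleCRM")
```

Keeping the bucket next to each prompt makes it straightforward to check later that the sample is balanced across the user journey rather than dominated by one prompt type.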
2. The Extraction Phase: Capturing the Reference
When an AI generates a response, citations appear in various forms: inline footnotes, 'Sources' lists at the bottom, or embedded hyperlinks. For each response, record three things:
- Direct Citations: Explicit links to a URL.
- Implicit References: Mentioning a brand name without a link (this is still valuable for brand salience).
- Attribution Weight: Does the AI quote your data directly, or merely list you in a 'Top 10' list?
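The first two checks can be approximated in code once the response text has been captured. The following is a minimal sketch under stated assumptions: `extract_citations` is a hypothetical helper, the URL pattern is deliberately simple, and a mention counts as implicit whenever no cited URL sits on the brand's own domain. Attribution weight is left to manual review because it depends on how the answer uses the source.

```python
import re
from urllib.parse import urlparse

# Simple pattern for http(s) links; it will not cover every citation format.
URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")


def _domain(url):
    """Return the lowercased host without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def extract_citations(response_text, brand, brand_domain):
    """Classify how a brand shows up in one AI response."""
    urls = [u.rstrip(".,;") for u in URL_PATTERN.findall(response_text)]
    domains = [_domain(u) for u in urls]
    mentioned = brand.lower() in response_text.lower()
    direct = any(d == brand_domain or d.endswith("." + brand_domain) for d in domains)
    return {
        "urls": urls,
        "brand_mentioned": mentioned,
        "direct_citation": direct,
        # Assumption: mentioned, but no link to the brand's own domain.
        "implicit_reference": mentioned and not direct,
    }


# Invented response text, used only to exercise the function.
sample = "FinTrack is a strong option (https://www.techradar.com/reviews/fintrack)."
result = extract_citations(sample, "FinTrack", "fintrack.com")
```

In this invented sample the brand is mentioned but the only link goes to a third-party review, so the sketch records an implicit reference rather than a direct citation.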
3. The Categorisation Phase: Source Mapping
Once citations are collected, map them by source type. This helps identify which channels the AI trusts for your niche:
- Owned Media: Your official website and blog.
- Earned Media: PR pieces, news sites, and guest posts.
- Community/Social: Reddit threads, Quora, and niche forums.
- Aggregators: Review sites like G2, Trustpilot, or Capterra.
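Source mapping can be partly automated with a domain lookup. The sketch below assumes a small hand-maintained table; `SOURCE_TYPES` and `classify_source` are hypothetical names, the table is far from complete, and anything unrecognised is provisionally labelled as earned media so that it can be checked by hand.

```python
from urllib.parse import urlparse

# Illustrative lookup only; extend it with the domains that matter in your niche.
SOURCE_TYPES = {
    "reddit.com": "Community/Social",
    "quora.com": "Community/Social",
    "g2.com": "Aggregator",
    "trustpilot.com": "Aggregator",
    "capterra.com": "Aggregator",
}


def classify_source(url, owned_domains):
    """Map a cited URL to one of the four source types."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    for owned in owned_domains:
        if host == owned or host.endswith("." + owned):
            return "Owned Media"
    for domain, source_type in SOURCE_TYPES.items():
        if host == domain or host.endswith("." + domain):
            return source_type
    # Assumption: unknown domains are treated as earned media pending manual review.
    return "Earned Media (unverified)"
```

For example, `classify_source("https://uk.trustpilot.com/review/x", {"fintrack.com"})` falls into the aggregator bucket because the host ends with a domain in the lookup table.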
Worked Example: Auditing an Enterprise SaaS Brand
Let’s assume we are auditing 'FinTrack', a fictional expense management software.
Step 1: Prompting. On both Perplexity and ChatGPT, we run the prompt: "Compare the top 5 expense management tools for UK mid-market firms."
Step 2: Observation. ChatGPT lists FinTrack as #3 and provides a footnote link. However, the link does not go to FinTrack.com. It goes to a 'TechRadar' review from 2022.
Step 3: Analysis. The citation audit reveals that the 'Visibility Source' is not the brand site but a third-party review. The takeaway: to improve this citation, FinTrack needs to get its TechRadar coverage updated, or provide more authoritative, structured data on its own 'Compare' pages to encourage the AI to source from the official site.
Technical Considerations for Scaling
Performing this manually is time-consuming. To scale, use the following approach:
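- API Integration: Use the OpenAI or Anthropic APIs to run your prompt list through a script. API responses can differ from what users see in the consumer apps, and they generally include source links only when a web search tool is enabled for the request.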
- Web Scraping (Perplexity/Gemini): Since these models are connected to the live web, use tools like Browse.ai or custom Python scripts to scrape the 'Sources' section of the UI.
- Sentiment Tagging: Use a secondary AI layer to tag if the citation is positive, neutral, or negative.
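A batch run can be structured so that the collection loop is independent of any one vendor. The sketch below is a hedged outline, not a definitive implementation: `run_audit` is a hypothetical function, and the `ask` callable is an assumption standing in for whatever API client or scraper you wrap. No real API call is shown; `fake_ask` only demonstrates the data flow.

```python
import time


def run_audit(prompts, models, ask, pause_seconds=0.0):
    """Run every prompt against every model and collect the raw responses.

    `ask(model, prompt)` is supplied by the caller and should return the
    response text. It is where a real API client or scraper would be wrapped.
    """
    results = []
    for model in models:
        for prompt in prompts:
            try:
                text, error = ask(model, prompt), None
            except Exception as exc:  # record the failure and keep the run going
                text, error = "", str(exc)
            results.append(
                {"model": model, "prompt": prompt, "response": text, "error": error}
            )
            if pause_seconds:
                time.sleep(pause_seconds)  # crude rate limiting between calls
    return results


def fake_ask(model, prompt):
    """Stand-in for a real client, used here only to show the data flow."""
    return f"[{model}] canned answer to: {prompt}"


# "model-a" and "model-b" are placeholder labels, not real model names.
rows = run_audit(["What is FinTrack?"], ["model-a", "model-b"], fake_ask)
```

Storing the raw response for every model and prompt pair means sentiment tagging and citation extraction can be re-run later without repeating the calls.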
Identifying Citation 'Leakage'
Citation leakage occurs when an AI discusses your product but attributes the information to a competitor or an outdated third-party source. During your audit, flag any instance where your brand is mentioned but the citation link points elsewhere. This is the 'Citation Gap'. Reducing this gap is the primary goal of an AI Visibility Practitioner.
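Flagging leakage is a simple filter once each record carries a mention flag and its cited URLs. The sketch below assumes a hypothetical record layout (`brand_mentioned`, `citation_urls`) and invented URLs. Expressing the gap as a proportion of audited responses is also an assumption made for illustration, not a standard definition.

```python
from urllib.parse import urlparse


def is_owned(url, brand_domain):
    """True when the URL's host is the brand domain or one of its subdomains."""
    host = urlparse(url).netloc.lower()
    return host == brand_domain or host.endswith("." + brand_domain)


def flag_leakage(records, brand_domain):
    """Return records where the brand is mentioned but no cited URL is on its own domain."""
    leaked = []
    for record in records:
        if not record["brand_mentioned"]:
            continue
        if not any(is_owned(u, brand_domain) for u in record["citation_urls"]):
            leaked.append(record)
    return leaked


# Invented records based on the fictional FinTrack example.
records = [
    {
        "prompt": "Compare the top 5 expense management tools for UK mid-market firms.",
        "brand_mentioned": True,
        "citation_urls": ["https://www.techradar.com/reviews/fintrack"],
    },
    {
        "prompt": "What is FinTrack?",
        "brand_mentioned": True,
        "citation_urls": ["https://www.fintrack.com/about"],
    },
]

leaks = flag_leakage(records, "fintrack.com")
# Assumption: report the gap as the share of audited responses that leak.
citation_gap = len(leaks) / len(records)
```

Here one of the two invented records leaks to a third-party review, so the gap under this assumed definition is 0.5.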
Putting it into Practice
To begin your first citation audit, follow these steps:
- Select your targets: Choose the 3 most relevant AI platforms for your audience (e.g., ChatGPT running GPT-4o, Perplexity, and Gemini).
- Build a spreadsheet: Create columns for 'Prompt', 'Response Headline', 'Brand Mentioned (Y/N)', 'Citation URL', and 'Source Type'.
- Run the 'Niche Authority' test: Use broad informational prompts like "What are the current trends in [Industry]?" and see which domains are cited most frequently. These are your 'Authority Benchmarks'.
- Cross-reference with SEO: Compare your citation list with your top-ranking pages in Google Search. If a page ranks #1 in Search but is never cited by AI, its formatting or 'readability' may be making it hard for the model's training data pipeline or retrieval system to parse and quote.
- Report findings: Present the 'Share of Citations' (similar to Share of Voice) to the client to justify investment in GEO-specific content updates.
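The spreadsheet, the authority benchmarks, the SEO cross-reference and the Share of Citations can all be derived from the same rows. The sketch below is one possible arrangement under stated assumptions: the rows are invented, the helper names are hypothetical, Share of Citations is taken to mean the proportion of all captured citations that point to the brand's own domain, and the SEO cross-reference compares exact URLs.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlparse

COLUMNS = ["Prompt", "Response Headline", "Brand Mentioned (Y/N)", "Citation URL", "Source Type"]

# Invented rows for illustration; a real audit has one row per captured citation.
sample = [
    ("What are the current trends in expense management?", "Key trends", "N",
     "https://www.techradar.com/trends", "Earned Media"),
    ("Compare the top 5 expense management tools for UK mid-market firms.", "Top 5 tools", "Y",
     "https://www.techradar.com/reviews/fintrack", "Earned Media"),
    ("What is FinTrack?", "About FinTrack", "Y",
     "https://www.fintrack.com/about", "Owned Media"),
    ("Which expense tools have the best reviews?", "Best reviewed", "Y",
     "https://www.g2.com/products/fintrack", "Aggregators"),
]
rows = [dict(zip(COLUMNS, values)) for values in sample]


def domain_of(url):
    """Return the lowercased host without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def authority_benchmarks(rows, top_n=5):
    """Most frequently cited domains across the audit."""
    counts = Counter(domain_of(r["Citation URL"]) for r in rows if r["Citation URL"])
    return counts.most_common(top_n)


def share_of_citations(rows, brand_domain):
    """Assumed definition: citations on the brand's domain / all citations."""
    cited = [domain_of(r["Citation URL"]) for r in rows if r["Citation URL"]]
    if not cited:
        return 0.0
    owned = sum(1 for d in cited if d == brand_domain or d.endswith("." + brand_domain))
    return owned / len(cited)


def never_cited(top_ranking_urls, rows):
    """Top-ranking pages (exact URL match) that no response cited."""
    cited = {r["Citation URL"] for r in rows}
    return [u for u in top_ranking_urls if u not in cited]


# Write the audit sheet as CSV; swap io.StringIO for a real file when saving.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)
```

With these invented rows, the most cited domain is the third-party review site rather than the brand's own site, which is the kind of pattern the report should surface for the client.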