
AgentIQ Benchmark

AgentIQ Benchmark is the foundational dashboard within the Agent IQ application.

Updated over 2 weeks ago

Overview

AgentIQ Benchmark is the foundational dashboard within the Agent IQ application. It monitors and grades your store's "agentic capabilities": how well your products can be discovered, understood, and recommended by AI agents (like ChatGPT, Gemini, and Copilot) compared to the global Shopify catalog.

Use this tool to get a high-level health check of your catalog's metadata, verify if your products appear for specific user intents, and see which of your items are performing best in AI search results.


Catalog Health Tracker

The Catalog Health Tracker is the first section you see upon onboarding. It provides a visual breakdown of your entire product catalog's readiness for AI agents.

Interpreting the Health Bar

The tracker categorizes your products into four health statuses based on their metadata quality and taxonomy:

  • Excellent (Green): Products with rich, structured data that agents can easily interpret.

  • Mediocre (Yellow): Products that may lack some optional attributes but are partially visible.

  • Poor (Red): Products with significant data gaps that hinder AI discovery.

  • Uncategorized (Gray): Products where the Taxonomy category is not set.

Critical Action: If you see a high percentage of Uncategorized products (e.g., 45%), prioritize fixing these first. Taxonomy is arguably the most critical attribute for AI product listing. Without it, agents struggle to classify what your product actually is.
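To make the percentages on the health bar concrete, here is a minimal sketch of how such a breakdown could be computed. It is an illustration only: the function name, the status list as plain strings, and the sample catalog are assumptions, not the app's actual data model or API.

```python
from collections import Counter

# The four statuses shown on the health bar.
STATUSES = ("Excellent", "Mediocre", "Poor", "Uncategorized")


def health_breakdown(statuses):
    """Return the percentage of products in each health status."""
    total = len(statuses)
    if total == 0:
        return {status: 0.0 for status in STATUSES}
    counts = Counter(statuses)
    return {status: round(100 * counts[status] / total, 1) for status in STATUSES}


# Hypothetical catalog of 20 products (one status per product).
catalog = (
    ["Excellent"] * 5
    + ["Mediocre"] * 4
    + ["Poor"] * 2
    + ["Uncategorized"] * 9
)

breakdown = health_breakdown(catalog)
print(breakdown)
# {'Excellent': 25.0, 'Mediocre': 20.0, 'Poor': 10.0, 'Uncategorized': 45.0}
```

In this made-up catalog, 9 of 20 products (45%) have no Taxonomy category, so those would be the first to fix.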


Tracked Queries Alignment

This section lets you monitor a specific set of natural language queries (intents) to see whether your store's products appear in the results.

How It Works

  • Default vs. Custom Queries: Upon installation, the app generates a basic set of queries relevant to your store. You can edit these or add your own custom queries to track specific user intents you want to target.

  • Scoring System:

    • Score (e.g., 90): Represents the alignment between your products and the prompt. A higher score means your product data closely matches what the AI is looking for.

    • Product Count (e.g., "1 product"): Shows exactly how many of your products were returned by the agent for that specific query.

Usage Tip

Regularly review the "Last Run" date to ensure your data is current. If a query returns "0 products" or has a low score, inspect the product data for that category to ensure keywords and attributes align with the user's search intent.
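The review routine in this tip can be sketched as a simple check over your tracked queries. Everything here is hypothetical: the field names, the sample queries, and especially the thresholds (a score below 50 counts as "low", a run older than 30 days counts as stale) are assumptions chosen for illustration, not values defined by the app.

```python
from datetime import date


def queries_needing_review(queries, today, min_score=50, max_age_days=30):
    """Return (query, reasons) pairs for tracked queries worth a closer look.

    min_score and max_age_days are illustrative thresholds, not app settings.
    """
    flagged = []
    for q in queries:
        reasons = []
        if q["product_count"] == 0:
            reasons.append("no products returned")
        if q["score"] < min_score:
            reasons.append("low score")
        if (today - q["last_run"]).days > max_age_days:
            reasons.append("stale run")
        if reasons:
            flagged.append((q["query"], reasons))
    return flagged


# Hypothetical tracked queries, copied by hand from the dashboard.
tracked = [
    {"query": "ice cleats for winter hiking", "score": 90,
     "product_count": 1, "last_run": date(2024, 1, 10)},
    {"query": "waterproof trail boots", "score": 35,
     "product_count": 0, "last_run": date(2024, 1, 10)},
    {"query": "merino wool socks", "score": 72,
     "product_count": 2, "last_run": date(2023, 11, 1)},
]

flagged = queries_needing_review(tracked, today=date(2024, 1, 15))
for query, reasons in flagged:
    print(f"{query}: {', '.join(reasons)}")
```

A query can be flagged for more than one reason, which helps you decide whether to re-run it or to fix the underlying product data first.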


Product Leaderboard for Tracked Queries

The Leaderboard identifies your "MVP" products: the items that AI agents select most frequently across all your tracked queries.

Understanding the Rankings

This list ranks products by their appearance frequency.

  • Rank #1 (The Winner): The product that appeared in the highest percentage of query results.

    • Example: If "Ice Cleats" shows 17% of queries, the AI suggested this product in 17 out of every 100 queries you are tracking.

  • Lower Ranks: Products that appear less frequently (e.g., 8% of queries).
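The ranking arithmetic above can be sketched as follows. This is an assumption about how an appearance-frequency ranking could be calculated, not the app's confirmed method; the function name, the input shape (a mapping from each tracked query to the products returned for it), and the "Wool Socks" product are hypothetical.

```python
from collections import Counter


def leaderboard(results_by_query):
    """Rank products by the percentage of tracked queries they appeared in."""
    total = len(results_by_query)
    if total == 0:
        return []
    appearances = Counter()
    for products in results_by_query.values():
        # Count each product at most once per query.
        appearances.update(set(products))
    rows = [(name, round(100 * n / total)) for name, n in appearances.items()]
    # Highest percentage first; ties broken alphabetically.
    return sorted(rows, key=lambda row: (-row[1], row[0]))


# Hypothetical results for 100 tracked queries.
results = {f"query {i}": [] for i in range(100)}
for i in range(17):
    results[f"query {i}"].append("Ice Cleats")  # appears in 17 queries
for i in range(10, 18):
    results[f"query {i}"].append("Wool Socks")  # appears in 8 queries

ranking = leaderboard(results)
print(ranking)
# [('Ice Cleats', 17), ('Wool Socks', 8)]
```

Because the percentage is relative to the number of tracked queries, adding or removing queries changes every product's share, even if its product data stays the same.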

Strategic Insight

Use this leaderboard to understand which products are your "AI Drivers." If your best-selling product isn't on this list, it may have poorly structured data (check Catalog Health), or it may simply not align well with the specific queries you are tracking.
