A shopper opens ChatGPT and types: "waterproof hiking boots for wide feet under 150 dollars, good for someone who pronates." A few seconds later they have a shortlist with prices, availability, and reasons. They never opened a search engine, never visited a store, never saw a homepage.

Somewhere in that answer there are three or four real products from real merchants. This is already happening. The interesting question is why those specific products made the list and thousands of equally good products did not.

The answer has little to do with brand, design, or storytelling, and almost everything to do with data. When the customer is a machine, it does not look at your photography. It reads your structured product information and either includes you or quietly moves on. Becoming the merchant the machine picks is a concrete, technical job, and this post breaks it into parts you can act on.

This is Part 1 of a two-part series, the data and technical foundation: the feed, the markup, the attributes, and crawler access. Part 2 covers generative engine optimization, the tactics that earn recommendations inside the answers themselves. The foundation comes first because no tactic survives a catalog the machine cannot parse.

Where "agent-readable" came from

For most of ecommerce history, the reader of a product page was a person, and people are forgiving readers. A blurry photo, a thin description with a gap where the dimensions should be, a title written for a human eye: all of it still works because a person fills in what is missing.

Structured product data existed in that world, but it was a side channel. Schema.org launched in 2011 as a joint project of Google, Bing, Yahoo, and Yandex, a shared vocabulary for marking up pages so search engines could read them precisely. Merchants who added Product and Offer markup got rich results: the price, star rating, and stock status under a search listing. Product feeds had a parallel life, the structured file behind Google's Merchant Center and the comparison engines before it. Both mattered, and both were rarely urgent. The markup and the feed were polite notes on the side.

Two things collapsed that arrangement. The first was AI Overviews and AI-mode search compressing the results page, so the click you used to earn no longer reliably arrives. The second, and larger, was the arrival of AI assistants that do not just answer questions but shop. OpenAI shipped Instant Checkout inside ChatGPT in September 2025, then pulled the native version back in early 2026 after struggling to sync inventory and handle tax across thousands of merchants. The pullback is instructive, not a reprieve: discovery, the part where the assistant finds and recommends your product, ran far ahead of checkout. Through late 2025 and into 2026, ChatGPT, Gemini, and Perplexity became places where a person describes what they want and a machine returns specific products. Google, for its part, announced its Universal Commerce Protocol at NRF in January 2026.

In that world the side channel became the main channel. The structured data is no longer a note to the crawler; for an AI shopping assistant, it is the product. McKinsey's framing of agentic commerce is blunt: agents optimize for delivered value, meaning price, availability, fulfillment, and the reversibility of a return, and they read that value off your data. "Agent-readable" is the name for a catalog a machine can parse, trust, and act on. It is now a merchant workstream, and most merchants have not started it.

The present: what the work actually is

Agent-readability is not one task but a stack of them, ordered by return for the effort.

The product feed, and its quality

The product feed is the most direct way to be discoverable inside an AI assistant: a structured file, one row per product variant, that you supply to the platform.

Take ChatGPT as the worked example, because OpenAI has published its specification in detail. The OpenAI product feed spec defines a long list of fields and marks each one required, recommended, or optional. The required fields are the irreducible minimum: a unique item_id per variant, a title, a description, a url that returns HTTP 200, the price with a numeric value and an ISO 4217 currency code, availability as a controlled value such as in_stock or out_of_stock, an image_url, and a flag, is_eligible_search, that tells the assistant whether the product may appear at all. A companion flag, is_eligible_checkout, governs whether it can transact through a connected checkout integration.
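The required-field list above translates directly into a pre-flight check you can run before a feed leaves your system. The sketch below is illustrative only: the field names follow the description in this post, but the shape of the price (a dict with value and currency) is an assumption, the availability set contains only the two values named here, and validate_row is a hypothetical helper, not part of any OpenAI tooling. Check the published spec for the exact encoding.

```python
import re

# Required fields as described in the text above.
REQUIRED_FIELDS = [
    "item_id", "title", "description", "url",
    "price", "availability", "image_url", "is_eligible_search",
]
# Only the two availability values named in the text; the spec may allow more.
KNOWN_AVAILABILITY = {"in_stock", "out_of_stock"}
# Shape check only (three capital letters), not a lookup in the ISO 4217 registry.
CURRENCY_SHAPE = re.compile(r"^[A-Z]{3}$")


def validate_row(row: dict) -> list[str]:
    """Return a list of problems for one feed row; an empty list means it passed."""
    problems = [
        f"missing {name}" for name in REQUIRED_FIELDS if row.get(name) in (None, "")
    ]

    price = row.get("price")
    if isinstance(price, dict):  # assumed shape: {"value": 129.0, "currency": "USD"}
        value = price.get("value")
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append("price value is not numeric")
        if not CURRENCY_SHAPE.match(str(price.get("currency", ""))):
            problems.append("currency is not a three-letter code")
    elif price not in (None, ""):
        problems.append("price is not a value/currency pair")

    availability = row.get("availability")
    if availability not in (None, "") and availability not in KNOWN_AVAILABILITY:
        problems.append(f"unrecognized availability: {availability}")
    return problems


# Hypothetical sample rows for illustration.
good_row = {
    "item_id": "boot-wide-10",
    "title": "Waterproof hiking boot, wide fit",
    "description": "Leather hiking boot with a waterproof membrane.",
    "url": "https://example.com/products/boot-wide-10",
    "price": {"value": 129.0, "currency": "USD"},
    "availability": "in_stock",
    "image_url": "https://example.com/images/boot-wide-10.jpg",
    "is_eligible_search": True,
}
bad_row = dict(
    good_row,
    title="",
    price={"value": "129", "currency": "usd"},
    availability="available",
)
```

Running validate_row over every row and refusing to publish when any list is non-empty is one simple way to keep an incomplete snapshot from reaching the platform. The check that each url returns HTTP 200 needs a network request and is left out of this sketch.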

Required gets you in the door. It does not get you recommended. The non-required fields are where the competition happens: brand, gtin and mpn so the product can be matched and verified, a product_category taxonomy path, material, size and size_system, sale_price with dates, shipping detail, accepts_returns and a return_deadline_in_days, star_rating and review_count, extra images, video, even a 3D model. Most carry an optional label in the spec, but optional to OpenAI is not optional to the merchant who wants the slot. Feeds get rejected for predictable reasons: missing required fields, product IDs that are not unique, and feed prices that do not match the page.
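Two of the rejection reasons named above, duplicate product IDs and feed prices that disagree with the page, are easy to catch before submission. The following is a minimal sketch under stated assumptions: rows are dicts with an item_id and a price dict, and page_prices is a hypothetical mapping of item_id to the price you have already collected from the live page by whatever means your stack allows.

```python
from collections import Counter


def duplicate_ids(rows: list[dict]) -> list[str]:
    """Return every item_id that appears on more than one feed row."""
    counts = Counter(row["item_id"] for row in rows)
    return sorted(item_id for item_id, seen in counts.items() if seen > 1)


def price_mismatches(rows: list[dict], page_prices: dict[str, float]) -> list[str]:
    """Return item_ids whose feed price differs from the price seen on the page.

    Prices are compared in whole cents to avoid floating-point noise.
    Items with no known page price are skipped rather than flagged.
    """
    mismatched = []
    for row in rows:
        page_price = page_prices.get(row["item_id"])
        if page_price is None:
            continue
        if round(page_price * 100) != round(row["price"]["value"] * 100):
            mismatched.append(row["item_id"])
    return mismatched


# Hypothetical sample data for illustration.
feed_rows = [
    {"item_id": "a", "price": {"value": 10.0, "currency": "USD"}},
    {"item_id": "b", "price": {"value": 20.0, "currency": "USD"}},
    {"item_id": "a", "price": {"value": 10.0, "currency": "USD"}},
]
page_prices = {"a": 10.0, "b": 19.5}
```

Here duplicate_ids(feed_rows) reports the repeated ID and price_mismatches(feed_rows, page_prices) reports the item whose page shows a different price.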

Two properties of the feed change how you run it. First, freshness. A feed pushed on a fast cycle keeps price and inventory close to live, against the roughly once-a-day cadence of traditional shopping feeds. With a daily feed, a product can show as in stock for almost a full day after it sells out. Accurate data is itself a ranking input. Second, completeness. The feed is usually a full snapshot that replaces the catalog, not a list of changes, so a gap in the feed is a gap in what the assistant knows.

The same pattern repeats across platforms. Perplexity's merchant program accepts a CSV feed following the Google Merchant Center specification. Google's own Merchant Center feed powers the Shopping Graph that grounds product mentions in AI Mode and Gemini. The format differs; the discipline does not.

Structured data and schema.org on the page

The feed is not the whole story. AI shopping assistants also read your actual product pages. OpenAI's shopping research feature reads product pages directly for price, availability, reviews, specs, and images, and it surfaces information only from sites that let its agents in. Feed and page are two doors into the same catalog; keep both open and consistent.

The page is read through structured data, and the standard is schema.org Product markup in JSON-LD, the format Google recommends. A complete Product block carries name, description, sku, brand, and image, an offers object with price, priceCurrency, availability, and itemCondition, the identifier as gtin or mpn, and aggregateRating with review data. Google's merchant listing documentation treats price, availability, and a product identifier as the load-bearing fields.
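To make the shape of a complete Product block concrete, here is a small sketch that builds one from a product record and serializes it as JSON-LD. The schema.org property names and enumeration URLs are standard; the product record, its keys, and the product_jsonld helper are hypothetical, and the item condition is hard-coded to new purely for brevity.

```python
import json


def product_jsonld(p: dict) -> str:
    """Build a schema.org Product block as a JSON-LD string."""
    availability = (
        "https://schema.org/InStock" if p["in_stock"] else "https://schema.org/OutOfStock"
    )
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": p["name"],
        "description": p["description"],
        "sku": p["sku"],
        "gtin": p["gtin"],
        "image": p["images"],
        "brand": {"@type": "Brand", "name": p["brand"]},
        "offers": {
            "@type": "Offer",
            "url": p["url"],
            "price": f'{p["price"]:.2f}',
            "priceCurrency": p["currency"],
            "availability": availability,
            "itemCondition": "https://schema.org/NewCondition",  # assumes new goods
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": p["rating"],
            "reviewCount": p["review_count"],
        },
    }
    return json.dumps(data, indent=2)


# Hypothetical product record for illustration.
sample_product = {
    "name": "Waterproof hiking boot, wide fit",
    "description": "Leather hiking boot with a waterproof membrane.",
    "sku": "BOOT-W-10",
    "gtin": "00012345678905",
    "images": ["https://example.com/images/boot-wide-10.jpg"],
    "brand": "Example Outdoor",
    "url": "https://example.com/products/boot-wide-10",
    "price": 129.0,
    "currency": "USD",
    "in_stock": True,
    "rating": 4.6,
    "review_count": 212,
}
jsonld = product_jsonld(sample_product)
```

The resulting string belongs inside a script tag of type application/ld+json in the HTML the server returns. Generating it from the same record that feeds your product feed is one way to keep the two in agreement.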

Two technical points are easy to get wrong and expensive to miss. The markup must be present in the HTML the server returns; if it is injected by JavaScript after the page loads, some crawlers never see it. And the markup must agree with the page and the feed. If your schema says one price, your feed says another, and your page displays a third, you have given the machine a reason to trust none of them. Google cross-checks structured data against the Merchant Center feed and disapproves listings when they conflict.
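Both points can be tested against the raw HTML your server returns, before any JavaScript runs. The sketch below uses only the Python standard library to pull JSON-LD out of a raw HTML string and compare the price it finds with the feed and the displayed page price. It assumes a single top-level Product object with a single Offer; real pages may use arrays or @graph wrappers, which this illustration does not handle. JsonLdExtractor, server_side_price, and prices_agree are hypothetical names.

```python
import json
from decimal import Decimal
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed markup is skipped in this sketch


def server_side_price(raw_html: str):
    """Return the Offer price from the first Product block in raw HTML, or None."""
    parser = JsonLdExtractor()
    parser.feed(raw_html)
    parser.close()
    for block in parser.blocks:
        if isinstance(block, dict) and block.get("@type") == "Product":
            offers = block.get("offers")
            if isinstance(offers, dict):
                return offers.get("price")
    return None


def prices_agree(*prices) -> bool:
    """True when every supplied price is present and numerically equal."""
    if any(p is None for p in prices):
        return False
    return len({Decimal(str(p)) for p in prices}) == 1


# Hypothetical raw HTML, as a server might return it.
html_with_markup = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "Product", "name": "Boot",'
    ' "offers": {"@type": "Offer", "price": "129.00", "priceCurrency": "USD"}}'
    "</script></head><body>Boot</body></html>"
)
html_without_markup = "<html><body><div id='app'></div></body></html>"
```

If server_side_price returns None for a page that shows structured data in the browser, the markup is probably being injected client-side. If prices_agree returns False for the schema, feed, and page values of the same product, that is the three-way conflict described above.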

Complete and consistent product attributes

Underneath both the feed and the markup sits the real foundation: the attributes themselves. An attribute is a discrete, defined fact about a product. Not a paragraph that mentions the material, but a material field with the value.

This matters because of how an assistant handles a query. The waterproof-boots question above is, to the machine, a set of constraints: a product type, a feature, a fit, a price ceiling. It satisfies them by reading fields. If "wide" lives only in a sentence in your description while a competitor records it in a structured width attribute, the competitor is the safer match.
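A toy version of that matching step shows why the structured field wins. This is an illustration of constraint matching in general, not a description of how any particular assistant ranks products; the field names, the matches helper, and both product records are hypothetical.

```python
def matches(product: dict, constraints: dict) -> bool:
    """True only when every constraint is satisfied by a structured field."""
    for field, wanted in constraints.items():
        if field == "max_price":
            price = product.get("price")
            if price is None or price > wanted:
                return False
        elif product.get(field) != wanted:
            return False
    return True


# The boots query, expressed as constraints on fields.
query = {
    "product_type": "hiking boot",
    "waterproof": True,
    "width": "wide",
    "max_price": 150,
}

# Records the width as a structured attribute.
competitor = {
    "product_type": "hiking boot",
    "waterproof": True,
    "width": "wide",
    "price": 139.0,
    "description": "A sturdy boot for long days on the trail.",
}

# Mentions the width only in free text.
yours = {
    "product_type": "hiking boot",
    "waterproof": True,
    "price": 129.0,
    "description": "A sturdy boot, also great for wide feet.",
}
```

With these records, matches(competitor, query) is True and matches(yours, query) is False, even though the second product is cheaper and says "wide" in its description. A real assistant may well read the description too, but a field it can compare directly is the lower-risk match.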

The recurring failure here is not absence but inconsistency. Centimetres in one product, inches in another. "Stainless steel", "Stainless Steel", and "SS" as three values for one material. Half the catalog with a populated category and half without. A human shrugs at this; a machine treats inconsistent data as unreliable data. This is the case for a product information management system, the governed source of truth that holds attributes as clean, typed values and feeds every channel from one place. The PIM was back-office plumbing for years; agentic discovery made it strategic.
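The two inconsistencies named above, mixed units and several spellings of one material, are the kind a small normalization layer can remove before data reaches any channel. The following is a minimal sketch: the alias table and the helper names are hypothetical, and a real PIM would hold these rules as governed configuration rather than constants in code.

```python
# Hypothetical alias table mapping variants to one canonical value.
MATERIAL_ALIASES = {
    "stainless steel": "stainless steel",
    "ss": "stainless steel",
}
CM_PER_INCH = 2.54
CM_UNITS = {"cm", "centimetre", "centimetres", "centimeter", "centimeters"}
INCH_UNITS = {"in", "inch", "inches"}


def normalize_material(raw: str) -> str:
    """Collapse case and whitespace, then map known aliases to one value."""
    key = " ".join(raw.split()).lower()
    return MATERIAL_ALIASES.get(key, key)


def to_centimetres(value: float, unit: str) -> float:
    """Convert a length to centimetres, rounded to two decimal places."""
    unit = unit.strip().lower()
    if unit in CM_UNITS:
        return round(value, 2)
    if unit in INCH_UNITS:
        return round(value * CM_PER_INCH, 2)
    raise ValueError(f"unknown unit: {unit}")


# The three material spellings from the text collapse to one value.
materials = {normalize_material(m) for m in ["Stainless steel", "Stainless Steel", "SS"]}
```

Raising an error on an unknown unit, rather than passing the value through, is a deliberate choice in this sketch: a loud failure at import time is cheaper than a silently wrong dimension in the feed.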

Machine-readable pages and crawler access

You can build a perfect feed and perfect schema and still be invisible, because a setting in a file most merchants never open is quietly blocking the assistant. That file is robots.txt, and it controls access for the specific, named crawlers that AI assistants use. OpenAI publishes its crawlers in its bot documentation. The one that matters most for shopping discovery is OAI-SearchBot, which surfaces sites in ChatGPT's search features. OpenAI states that a site opted out of OAI-SearchBot will not be shown in those answers. GPTBot, the training crawler, is a separate decision, and blocking it has no effect on search visibility.

So the audit question is concrete. Open your robots.txt and confirm OAI-SearchBot is not disallowed, in a blanket rule or a specific one. Do the same for the crawlers behind Perplexity and Google's AI surfaces. Then confirm the basics that break machine reading: product pages return a clean HTTP 200, important content sits in the server HTML rather than rendered only by client-side JavaScript, and pages load fast enough that a crawler does not give up before it finds your price.
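The robots.txt part of that audit can be scripted with Python's standard urllib.robotparser module. The sketch below checks the two OpenAI crawlers named in this post against a robots.txt supplied as text; the crawler_access helper and the sample files are hypothetical, and you would add the user-agent names published by Perplexity and Google after confirming them in their own documentation.

```python
from urllib.robotparser import RobotFileParser

# Crawler names taken from the text above.
AI_CRAWLERS = ["OAI-SearchBot", "GPTBot"]


def crawler_access(robots_txt: str, url: str, agents=AI_CRAWLERS) -> dict[str, bool]:
    """Report, per crawler, whether robots.txt allows it to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical file that blocks only the training crawler.
selective_robots = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /cart
"""

# Hypothetical file with a blanket rule that blocks everything.
blanket_robots = """\
User-agent: *
Disallow: /
"""

product_url = "https://example.com/products/boot-wide-10"
selective_result = crawler_access(selective_robots, product_url)
blanket_result = crawler_access(blanket_robots, product_url)
```

The first file allows OAI-SearchBot while blocking GPTBot, which is the separate decision described above. The second blocks both through a blanket rule, the case that is easiest to miss. To check a live site instead of a string, RobotFileParser also offers set_url and read.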

The merchant feeds, registered properly

The feed is not something you simply publish. Each assistant has a program you join. For ChatGPT, a merchant applies through OpenAI's merchant program, gets verified, and supplies the feed over an encrypted connection. Perplexity runs its own merchant program, currently requiring that you sell and ship to the US. Google's path runs through Merchant Center, which in 2026 added conversational attributes, extra product fields that help listings match natural queries across AI Mode and Gemini.

One fact is worth stating plainly because merchants keep asking. Placement in these results is organic. OpenAI has described its product results as organic and unsponsored, ranked on relevance rather than payment. You earn the slot with data quality, not budget. That is the optimistic reading of all this work: it rewards the merchant who does the unglamorous job well.

Auditing your own agent-readability

You do not have to guess whether any of this works. Start with the plain test: open ChatGPT, Gemini, and Perplexity, and ask each one the constrained, specific questions a real customer asks in your category. See whether your products appear, whether the details are right, and which competitors keep showing up. Repeat it on a schedule, because answers move.

Then audit the inputs behind the result. Export your full catalog and check field coverage as a percentage: every product should have a title, description, price, availability, brand, identifier, image, and category. Validate your structured data with Google's Rich Results Test. Check that the feed, the schema, and the live page agree on price and availability. Open robots.txt and confirm the AI crawlers are allowed.
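Field coverage as a percentage is a few lines of code once the catalog is exported. A minimal sketch, assuming the export is a list of dicts keyed by the eight fields listed above; the key names, the helper functions, and the sample catalog are hypothetical and will differ by platform.

```python
AUDIT_FIELDS = [
    "title", "description", "price", "availability",
    "brand", "identifier", "image", "category",
]


def field_coverage(catalog: list[dict], fields=AUDIT_FIELDS) -> dict[str, float]:
    """Return the percentage of products with a non-empty value for each field."""
    total = len(catalog)
    if total == 0:
        return {field: 0.0 for field in fields}
    coverage = {}
    for field in fields:
        filled = sum(1 for p in catalog if p.get(field) not in (None, "", []))
        coverage[field] = round(100 * filled / total, 1)
    return coverage


def lowest_first(coverage: dict[str, float]) -> list[tuple[str, float]]:
    """Sort fields so the worst-covered one comes first."""
    return sorted(coverage.items(), key=lambda item: item[1])


# Hypothetical four-product export: one missing brand, two missing category.
complete = {
    "title": "Boot", "description": "A boot.", "price": 129.0,
    "availability": "in_stock", "brand": "Example Outdoor",
    "identifier": "00012345678905", "image": "https://example.com/boot.jpg",
    "category": "Footwear > Hiking",
}
catalog = [
    complete,
    dict(complete, category=""),
    dict(complete, category=None),
    dict(complete, brand=""),
]
coverage = field_coverage(catalog)
worst_field, worst_score = lowest_first(coverage)[0]
```

In this sample the category field comes out lowest at 50 percent, which makes it the first thing to fix under a fix-the-lowest-score routine.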

The habit matters more than any single tool: pick the inputs, score them, fix the lowest, measure again. Stale or incomplete data means the agent skips your listing.

Where this is heading

The direction of travel is set, and the work only gets more central. Expect feed specifications to keep widening, as return windows, rich media, and structured reviews move from optional to competitive necessity. Expect freshness to keep tightening, with real-time price and inventory moving from advantage to baseline. Expect the protocol layer to consolidate around OpenAI's Agentic Commerce Protocol and Google's Universal Commerce Protocol, which is good news: fewer, more standard formats to maintain. McKinsey projects AI agents could mediate three to five trillion dollars of global consumer commerce by 2030, so the surface this work targets is not a niche.

Be honest about the risks. The assistant is a new intermediary, and intermediaries set rules and can charge fees. Ranking inside an answer is far less transparent than ranking in a list of blue links, and measurement is hard when much of the journey happens inside a chat with no trackable click. None of that is a reason to wait. It is a reason to control the one thing you fully own: the quality of the data you expose.

For an enterprise with a large catalog spread across an ERP, a PIM, a platform, and a pile of spreadsheets, becoming agent-readable is a real project: deciding the source of truth, standardizing attributes at scale, building feed pipelines per platform, fixing markup, and putting the audit on a schedule. This is the kind of work Perform Digital does with enterprise clients, turning a catalog a machine cannot parse into one it recommends by default.

That is Part 1. With a clean feed, accurate markup, consistent attributes, and open crawler access, the machine can find you, understand you, and trust you. The merchants who get read get sold. Part 2 turns to what you do once that foundation holds: generative engine optimization, the tactics for earning recommendations inside the answers that ChatGPT, Gemini, and Perplexity write.

Council summary

This post argues that AI shopping discovery is won or lost on structured data, not brand: the feed, schema.org markup, consistent attributes, and crawler access decide whether ChatGPT, Gemini, and Perplexity can find and recommend a product. The council verified the OpenAI feed fields, the OAI-SearchBot and GPTBot crawler behavior, the McKinsey three to five trillion dollar projection for 2030, and Google's Universal Commerce Protocol from NRF 2026. We corrected the record on OpenAI's Instant Checkout, which launched in September 2025 but had its native version pulled in early 2026, removed a specific feed refresh interval that OpenAI's published spec does not state, and cut an unsourced survey statistic. The takeaway for a busy merchant: audit the feed, the markup, and robots.txt now, because the discovery layer is live and rewards clean data over budget.

Becoming Agent-Readable: The Merchant Guide to AI Discovery
