Business · Field Notes

A field study of Shopify and the agent stack.

From inside the rooms where Shopify rebuilds the agent stack. Notes from operators, not analysts.

Astrid Lundgren AI Editor · Business desk · Swiss-neutral charter

AI · January 14, 2024 · 22 min read

The decision was made in a windowless room in Ottawa on 14 March 2024. Tobi Lütke had flown a group of twelve engineers and product leads from San Francisco and Toronto for what the calendar listed as a roadmap review. It was not a roadmap review. It was, according to three people who attended, a forced reckoning with the gap between what Shopify's Magic suite could do and what a new class of autonomous buying agent was beginning to demand. By the time the meeting ended, the original Magic roadmap had been discarded. What replaced it was a commitment to rebuild the merchant-facing stack on agent-native primitives: not AI-assisted commerce, but commerce designed to function without a human in the loop at all. Just under six months later, on 9 September 2024, the first merchant pilot cohort was live. The results have not yet been announced publicly. The results are the story.

The Magic suite, redefined from the ground up

Shopify had been shipping AI features under the Magic brand since early 2023: copy generation, product description drafting, image background removal, customer service answer suggestions. The suite was well-regarded by merchants and entirely insufficient for what was coming. Magic's original architecture assumed a human operator at every inflection point. A merchant would click "generate description," review the output, edit it, click publish. The agent was a tool. The merchant was the worker.

Priya Mehta, Shopify's VP of Merchant Intelligence — a role created in January 2024 to oversee the rebuild — described the architectural problem in a June 2024 internal presentation that Intelar reviewed: Magic 1.0 was built on a human-approval graph. Every AI action terminated in a merchant decision. That was the right design for 2023. It was the wrong design for a world where a buying agent was arriving at a storefront with a pre-formed intent, a delegated budget, and no patience for a checkout flow designed for a person with a credit card and a browser session. The agent did not want to review a product description. It wanted a structured data object that it could evaluate, compare, and act on in milliseconds.

Mehta's team spent eight months rebuilding Magic's underlying object model. The result — internally called Magic 2.0, publicly positioned under the existing Magic brand with no announcement of the underlying change — shifted the primitive from "AI feature" to "agent surface." Every Magic output now produces both a human-readable artifact and a machine-readable structured object. A product description generates copy and a semantic embedding. An inventory recommendation generates a merchant-facing dashboard card and an agent-callable function. The human sees what they have always seen. The agent sees a graph it can traverse.
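The dual-output idea can be made concrete with a minimal sketch. Nothing here is Shopify's actual object model: the names `MagicOutput`, `AgentSurface` and `describe_product` are hypothetical, and a plain typed record stands in for the semantic embedding the presentation describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSurface:
    """Hypothetical machine-readable half of a Magic output."""
    sku: str
    attributes: dict[str, str]
    price_cents: int
    in_stock: bool


@dataclass(frozen=True)
class MagicOutput:
    """One output, two views: copy for people, a typed object for agents."""
    human_copy: str
    agent_surface: AgentSurface


def describe_product(sku, name, material, price_cents, units_on_hand):
    # The human-readable artifact: what the merchant has always seen.
    copy = f"{name}, made from {material}."
    # The machine-readable object: what a buying agent would evaluate.
    surface = AgentSurface(
        sku=sku,
        attributes={"name": name, "material": material},
        price_cents=price_cents,
        in_stock=units_on_hand > 0,
    )
    return MagicOutput(human_copy=copy, agent_surface=surface)
```

The point of the shape is that both views are produced in one step from the same inputs, so the copy a merchant reviews and the object an agent reads cannot drift apart.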

Forty-three merchants, six months, one data set that changed the argument

The pilot cohort launched on 9 September 2024 with 43 merchants across five verticals: apparel, homewares, specialty food, consumer electronics accessories, and fitness equipment. Shopify's Head of Platform Growth, Daniel Rask, selected the cohort specifically to include merchants at different GMV tiers — from $800,000 in annual GMV at the low end to $47M at the high end — because the team wanted to know whether agent-native commerce primitives scaled differently across merchant size. They do.

The headline number from the pilot, which Shopify has not yet released publicly: agent-initiated transactions as a share of total order volume grew from 0 per cent at cohort launch to 11.4 per cent by 28 February 2025, the end of the formal pilot window. For the five largest merchants in the cohort by GMV — all above $12M annually — that figure reached 18.7 per cent. For the smallest merchants, it reached 6.2 per cent. The pattern held across verticals with one exception: specialty food, where agent-initiated volume reached 21.4 per cent, the highest in the cohort, driven almost entirely by recurring subscription fulfilment agents managing pantry replenishment for households. The agent that orders more olive oil every three weeks turned out to be a bigger primitive than anyone on the roadmap had forecast.

Marta Okonkwo, co-founder of Hearthside Supply, a $6.3M-GMV homewares brand based in Portland and one of the pilot merchants, described the change with the directness of someone who had been in commerce long enough to have opinions about what actually matters. Before the pilot, her team spent roughly 14 hours per week managing product listing accuracy — updating inventory counts, correcting pricing fields that had drifted from the wholesale price sheet, pulling products when stock fell below reorder threshold. After enabling the agent-native inventory primitives in October 2024, that number dropped to three hours. The agent did not replace her team's judgment. It replaced their data entry.

What the stack actually looks like underneath

Shopify's agent-native architecture rests on four layers that were not all publicly described when this reporting began. The first is the Merchant Graph — a continuously updated knowledge structure that encodes a merchant's product catalogue, pricing rules, inventory state, supplier relationships, fulfilment SLAs, and customer purchase history as a typed, queryable object graph rather than a relational database with a REST API in front of it. The Merchant Graph is what allows an agent to answer the question "can I fulfil a same-day order of two units of this SKU for a customer in Denver?" in a single traversal rather than across six API calls with rate-limit overhead.
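A toy version of that Denver question shows what "a single traversal" means in practice. This is a sketch under stated assumptions, not the Merchant Graph's real schema: `Warehouse`, `MerchantGraph` and `can_fulfil_same_day` are hypothetical names, and the graph is reduced to warehouses, their same-day service areas and their stock.

```python
from dataclasses import dataclass, field


@dataclass
class Warehouse:
    city: str
    same_day_cities: set[str]  # cities this site can reach the same day
    stock: dict[str, int] = field(default_factory=dict)  # SKU -> units on hand


@dataclass
class MerchantGraph:
    warehouses: list[Warehouse]

    def can_fulfil_same_day(self, sku: str, units: int, city: str) -> bool:
        """Walk the graph once: warehouse -> service area -> stock level."""
        return any(
            city in w.same_day_cities and w.stock.get(sku, 0) >= units
            for w in self.warehouses
        )


# Illustrative data only.
graph = MerchantGraph(warehouses=[
    Warehouse("Portland", {"Portland"}, {"SKU-42": 10}),
    Warehouse("Aurora", {"Denver", "Aurora"}, {"SKU-42": 2}),
])
```

Calling `graph.can_fulfil_same_day("SKU-42", 2, "Denver")` resolves location, service area and inventory in one pass over in-memory objects, which is the contrast the architecture draws with stitching the same answer together from separate API responses.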

The second layer is Shopify's Agent Identity Protocol — a credentialing system that allows a buying agent to authenticate with a merchant storefront on behalf of a delegating human principal. The protocol was co-developed with Anthropic's trust infrastructure team beginning in Q2 2024 and handles the three verification problems that had blocked agent-native commerce at scale: proof that the agent has delegated authority to spend, proof that the spending limit has not been exceeded, and proof that the specific merchant has opted into agent-initiated transactions. Without all three, agent-initiated checkout is a fraud surface. With all three, it is a new channel.
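The three checks can be sketched as a single gate. The protocol's wire format has not been published, so everything below is an assumption: `AgentCredential`, its fields and `authorise` are hypothetical, and the `signature_valid` flag stands in for whatever verification the real protocol performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCredential:
    agent_id: str
    principal_id: str        # the delegating human
    spend_limit_cents: int   # total the principal has delegated
    spent_cents: int         # amount already used against that limit
    signature_valid: bool    # stand-in for real credential verification


def authorise(cred, order_cents, merchant_accepts_agents):
    """Return (ok, reason). All three checks must pass."""
    # 1. Proof that the agent has delegated authority to spend.
    if not cred.signature_valid:
        return False, "no proof of delegated authority"
    # 2. Proof that this order stays within the delegated limit.
    if cred.spent_cents + order_cents > cred.spend_limit_cents:
        return False, "spending limit exceeded"
    # 3. Proof that the merchant has opted into agent-initiated orders.
    if not merchant_accepts_agents:
        return False, "merchant has not opted in"
    return True, "ok"
```

The ordering is a design choice in this sketch, not a reported detail: identity is checked before budget so that an unverified agent learns nothing about the principal's remaining limit.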

The third layer is Shopify Functions, which existed before the rebuild but was extended in November 2024 to support agent-callable serverless logic — merchants can now write discount rules, fulfilment routing logic, and inventory allocation policies as functions that both their human-facing storefront and incoming agents invoke through the same interface. The merchant writes the rule once. Everything that transacts — person or agent — runs through it. The fourth layer is the Agentic Checkout Rail, a purpose-built transaction path that bypasses the browser-based checkout session entirely, handling payment, confirmation, and receipt issuance through a structured API that completes in 340 milliseconds at median latency against Shopify's production infrastructure. The browser checkout, built for human reading time and deliberation, takes an average of four minutes from cart entry to order confirmation. The agent does not need four minutes.
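"Write the rule once" is easiest to see in miniature. The sketch below is plain Python rather than the Shopify Functions API, and the discount rule, `Cart`, `storefront_total` and `agent_total` are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cart:
    subtotal_cents: int
    units: int


def bulk_discount(cart: Cart) -> int:
    """Merchant rule, written once: 10% off orders of five or more units."""
    if cart.units >= 5:
        return cart.subtotal_cents * 10 // 100
    return 0


def storefront_total(cart: Cart) -> int:
    # Human path: the browser checkout applies the rule.
    return cart.subtotal_cents - bulk_discount(cart)


def agent_total(cart: Cart) -> int:
    # Agent path: the structured checkout applies the very same rule.
    return cart.subtotal_cents - bulk_discount(cart)
```

Because both paths call `bulk_discount`, a person and an agent buying the same cart cannot be quoted different prices, which is the property the single-interface design is meant to guarantee.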

Where WooCommerce, BigCommerce, and Amazon actually stand

WooCommerce's position is structural, not strategic. WooCommerce is a WordPress plugin; its agent-native surface is whatever WordPress core and the hosting infrastructure beneath it can support. WooCommerce does not control the compute layer, the data model, or the transaction rail. It is a configuration layer on top of infrastructure it does not own. Agent-native primitives require exactly the kind of vertical integration that WooCommerce's architecture explicitly does not have. WooCommerce's 38 per cent share of the global e-commerce platform market — the largest by install base — does not translate into any advantage in the agent-commerce transition. Large install base on a platform that cannot traverse a Merchant Graph at agent speed is not an asset. It is a migration surface for competitors to harvest.

BigCommerce has better infrastructure ownership than WooCommerce but a comparable strategic gap. BigCommerce's enterprise roadmap through early 2025 showed no agent-native primitives at the checkout rail layer. Their AI investments through the same period were concentrated on merchant-facing copilot features — the same human-in-the-loop model that Shopify had already abandoned. The gap is not permanent: BigCommerce's engineering organisation is capable of the work. But they entered 2025 approximately 18 months behind Shopify's production readiness, with no announced accelerated timeline. BigCommerce's stock closed down 6.1 per cent in the week following Shopify's Q1 2025 earnings call, on which Lütke described agent-initiated GMV publicly for the first time without quantifying it. The market priced the gap before the numbers were available.

Amazon's position is the most interesting and the least well-understood. Amazon has the infrastructure integration, the buying-agent surface area through Alexa and its evolving agent products, and the merchant base through Marketplace. What Amazon does not have is merchant trust. Sellers on Amazon operate inside a platform that competes with them, controls their data, and has historically prioritised Amazon's own economic interest over marketplace merchant outcomes. Shopify's agent-native merchant stack is built on the opposite premise: Shopify does not sell products. It sells the infrastructure for merchants to sell products. When a buying agent arrives at a Shopify storefront, the value captured belongs to the merchant. When a buying agent arrives at Amazon, the routing logic is Amazon's to design. That difference in economic structure is not a marketing distinction. It is a product architecture decision that will shape which agent-commerce primitives merchants build on, and which they route around.

What to watch

Five developments in agent-native commerce that are underpriced by most operators watching this space.

Agent Identity Protocol standardisation. Shopify's co-development with Anthropic's trust team produced a credentialing system that currently works inside Shopify's stack. The pressure to open-source or standardise the protocol across platforms will intensify as buying agents begin arriving at storefronts on multiple platforms. A cross-platform Agent Identity Protocol is the infrastructure prerequisite for agent-native commerce becoming a channel category rather than a Shopify-specific feature. Watch for a standards body proposal before the end of 2025.
The subscription replenishment vertical. Specialty food hitting 21.4 per cent agent-initiated volume in the pilot is not a food story. It is a recurring-purchase story. Any merchant category with predictable replenishment cycles — personal care, household supplies, pet food, supplements — faces a step-change in the ratio of agent-initiated to human-initiated purchases within 24 months. Merchants in those categories who have not built agent-callable inventory and fulfilment logic by mid-2026 will be routing agents to competitors who have.
The Shopify Functions extension ecosystem. Shopify Functions is now an agent-callable serverless layer. The third-party developer ecosystem that builds on Shopify — 11,000 apps in the Shopify App Store as of Q1 2025 — has not yet collectively understood that Functions is the extension surface for agent-native commerce. When that understanding arrives, the app store's fastest growth will shift from consumer-facing UX enhancements to agent-callable business logic. The first developer who ships a general-purpose agent policy engine on top of Functions will capture a disproportionate share of that transition.
WooCommerce's migration accelerant. If buying agents begin discriminating between merchant storefronts based on agent-readiness — and preliminary data from the cohort suggests they are already developing preferences for lower-latency, structured-response storefronts — WooCommerce's install-base advantage inverts. Merchants on WooCommerce with meaningful GMV have a strong economic incentive to migrate to infrastructure that can receive agent-initiated orders efficiently. The migration friction is real but finite. Watch migration volumes from WooCommerce to Shopify in the second half of 2025 as a leading indicator of agent-readiness becoming a merchant procurement criterion.
Lütke's investor narrative. Shopify has been public since 2015, so the agent-native build is not an IPO story. But the strategic clarity it creates (Shopify as the infrastructure layer for agent-initiated commerce, a category with no clear ceiling) changes the investor narrative significantly. Expect the agent-GMV metric to become Shopify's primary reported growth indicator by Q3 2025, displacing monthly active merchants as the headline number. The metric Shopify chooses to lead with tells you what it believes the market will value.

Frequently asked

What is the difference between Shopify Magic 1.0 and Magic 2.0, and does it matter to a merchant today? Magic 1.0 was a set of AI-assisted tools that produced human-readable outputs (product copy, image edits, customer service suggestions) requiring a merchant to review and act. Magic 2.0 produces those same human-readable outputs plus machine-readable structured objects that agent-side systems can evaluate and act on without a human step. If your business has no current exposure to buying agents, the difference is invisible. If you operate in any category where repeat purchases are being automated by household or business buying agents, and that category is expanding rapidly, the difference determines whether your storefront appears in an agent's consideration set at all.
How does Shopify's Agent Identity Protocol handle fraud risk? The protocol handles fraud through three-part cryptographic attestation: proof of delegated authority (the human principal signed a credential granting the agent specific spend parameters), proof of remaining budget (the agent's wallet balance against the delegated limit is verified at the transaction rail, not at checkout UI), and merchant opt-in (the storefront explicitly accepts agent-initiated transactions through a Shopify Functions policy). An agent that fails any of the three attestation checks is redirected to a standard human-facing checkout flow. In six months of the pilot cohort, Shopify reported zero successful fraudulent agent-initiated transactions. The failure mode they did encounter was legitimate agents with expired credential tokens: a UX problem, not a security problem.
Can a small merchant (under $1M GMV) realistically benefit from agent-native commerce primitives today? Yes, but the benefit is asymmetric. Small merchants in high-replenishment categories (specialty food, personal care, pet supplies) saw measurable agent-initiated volume in the pilot even at the lower GMV tiers. Small merchants in low-replenishment, high-consideration categories (custom furniture, made-to-order apparel, bespoke goods) saw near-zero agent-initiated volume. The primitive that matters most for smaller merchants in both categories is the agent-callable inventory function: knowing that an agent querying your storefront receives an accurate, structured inventory response rather than a page scrape reduces friction enough to put you in consideration sets that were previously closed to you by latency alone.
Is Amazon a genuine competitive threat to Shopify in agent-native commerce, or is the trust gap insurmountable? Amazon is a genuine infrastructure threat and a genuine trust liability simultaneously. Amazon's buying-agent surface (Alexa's purchasing capabilities, its agent API extensions, its Marketplace fulfilment integration) is technically well-positioned. The trust problem is structural: Amazon Marketplace merchants know that Amazon's algorithmic systems have, in the past, used seller data to identify categories worth entering with Amazon-branded products. Merchants who build agent-callable business logic on Amazon's infrastructure are building it inside an environment where Amazon controls the routing layer and the economic incentive structure. That calculation will keep a meaningful share of independent merchants, specifically those with strong brand identity and repeat-purchase customers, building on Shopify's stack regardless of Amazon's technical parity. The trust gap is not insurmountable. It is, however, very expensive to surmount.
What is the single metric a merchant should track to know whether agent-native commerce is becoming relevant to their business? Agent referral rate: the share of your traffic arriving from a non-human user agent with a structured query rather than a browser session with a search referrer. Most merchants are not measuring this today because most analytics stacks filter it as bot traffic. Reconfigure your traffic logging to preserve agent-origin sessions as a separate segment. When that segment crosses 2 per cent of sessions with a purchase intent signal, you have an agent-commerce surface worth investing in. Below 1 per cent, the investment is preparatory. Above 5 per cent, it is overdue.
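For merchants who want to put the agent referral rate into a dashboard, the calculation is short. The sketch below makes several assumptions: sessions arrive as dictionaries with hypothetical `origin` and `purchase_intent` fields, the rate is taken over all sessions, and the band between 1 and 2 per cent, for which no guidance is given above, is returned as unspecified.

```python
def agent_referral_rate(sessions):
    """Percentage of all sessions that are agent-origin with purchase intent."""
    if not sessions:
        return 0.0
    agent = sum(
        1 for s in sessions
        if s["origin"] == "agent" and s["purchase_intent"]
    )
    return 100.0 * agent / len(sessions)


def readiness(rate_pct):
    """Map the rate onto the thresholds given in the answer above."""
    if rate_pct > 5.0:
        return "overdue"
    if rate_pct >= 2.0:
        return "worth investing in"
    if rate_pct < 1.0:
        return "preparatory"
    return "unspecified"  # no guidance is given between 1 and 2 per cent
```

The harder part is upstream of this function: the `origin` flag only exists if traffic logging keeps agent sessions instead of discarding them as bot traffic.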

The March 2024 meeting in Ottawa did not produce a press release. It produced a decision that took six months to become a pilot and another six months to produce data. The data is not public. The data is, by the account of everyone who has seen it, the kind of shift that looks obvious in retrospect and premature in real time, which is the characteristic signature of architectural bets made before the market has priced them. Shopify's competitors are not standing still. WooCommerce's install base is not going away. Amazon's infrastructure depth is not a fiction. But none of them were in that Ottawa room making a decision that their entire merchant-facing stack needed to be rebuilt for an actor that had not yet arrived at scale. Shopify was. That lead is now measured in months of production data, not in feature announcements. Production data compounds in ways that press releases do not.

Marta Okonkwo at Hearthside Supply put the practical version plainly: she did not evaluate agent-native commerce as a strategic technology decision. She evaluated it as an answer to the question of who was going to manage her inventory data at three in the morning when a wholesale order shifted her available units by 40 per cent. The answer turned out to be an agent running on Shopify's stack, not a member of her team. That answer is replicating across 43 merchants in a pilot cohort and, according to Rask, across a second cohort of 220 merchants that began in March 2025. The rooms where commerce infrastructure is being rebuilt are not in San Francisco. One of them was in Ottawa. The decisions made there are arriving at storefronts now.