Agentic Commerce Pt. 6: Who Pays When AI Gets It Wrong? (Liability)

When an AI agent buys the wrong thing, who pays? Inside chargebacks, Section 75, the CMA, AB 316, AI insurance, and the new dispute protocols.


TLDR: When an AI agent buys the wrong thing, the recourse system runs on rules built for humans. Chargebacks, Section 75, and Reg Z still apply, but insurers are pulling cover, courts are assigning blame, and the dispute layer is being rebuilt from scratch.

Follow the series: Part 1 • Part 2 • Part 3 • Part 4 • Part 5 • Part 6 (this) • Part 7

Your AI agent reorders household supplies every Tuesday. One week, a feed error lists a 12-pack at the usual price, the agent treats it as the normal reorder, and twelve full cases arrive.

The agent doesn’t catch it, £6 becomes £72, and the money has already left your account.

Who fixes this: you, the merchant, the bank, or the company that built the agent?

That question is at the heart of Layer 6. Authority decided what the agent could do. Liability decides what happens after it acts and something goes wrong.

Four parties can carry the cost, and the rules for splitting it among them have always assumed humans tapping cards, not software making choices.

So far the answer leans on old machinery. In the US, existing card-dispute rules apply, because regulators keep saying there’s no exemption from consumer law for new technology. In the UK, chargeback and Section 75 still apply. 

Regulators have made it clear the business that deploys an agent remains responsible for what it does, even when a third party built it.

But the recovery tools weren’t designed for a buyer that never sleeps. Beneath the legal questions, a financial layer is forming. 

Insurers are pulling AI out of standard cover, a new class of agent insurance is appearing, and protocol designers are building escrow, dispute resolution, and automated arbitration into the payment rails.

The record of what the agent was told, and what it did, undergirds every claim.

We’re still working off the old card rules

When an agent buys the wrong thing, the first question is procedural: how do you get the money back?

The UK offers two routes:

Chargeback lets a cardholder ask the bank to reverse a payment, and covers debit and credit cards at any amount.

But there’s a catch: it’s a voluntary scheme run under Visa, Mastercard, and Amex rules, not a legal right, and the window to claim runs to about 120 days from purchase.

Section 75 of the Consumer Credit Act carries more force. It makes the credit card provider jointly liable with the retailer when something goes wrong, it’s a statutory right rather than a favour, and a claim can run up to six years later.

But it only applies to credit purchases over £100 and up to £30,000, and it can fail when a third-party payment processor sits between buyer and seller, because that breaks the direct link between the card provider and the retailer.

The US runs a parallel system:

Regulation Z governs credit card disputes, and Regulation E covers electronic transfers, giving consumers 60 days to report an unauthorised error and the bank roughly 10 business days to investigate or issue provisional credit.

Reg Z protections apply to consumer credit up to $73,400 in 2026, with mortgages covered regardless of size.

| Recovery route | Covers | Limit | Time to claim |
| --- | --- | --- | --- |
| UK chargeback | Debit and credit, any amount | Scheme rules, not law | ~120 days |
| UK Section 75 | Credit only | £100 to £30,000 | Up to 6 years |
| US Reg Z | Consumer credit | Up to $73,400 (2026) | Billing-error rules |
| US Reg E | Electronic transfers | Unauthorised errors | 60 days to report |
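The table can be read as a lookup. Here is a minimal Python sketch of that lookup, assuming a heavily simplified model: the `Purchase` type and the `available_routes` function are hypothetical names, the thresholds are the ones quoted in the table, and "days since purchase" stands in for the real timing rules, which vary by scheme and statute. It illustrates the shape of the rules and is not a statement of anyone's legal position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Purchase:
    country: str                   # "UK" or "US"
    method: str                    # "credit", "debit", or "transfer"
    amount: float                  # local currency (GBP or USD)
    days_since: int                # simplification: days since the purchase
    direct_to_seller: bool = True  # False if a third-party processor sat in between


def available_routes(p: Purchase) -> list[str]:
    """Return the recovery routes from the table that could plausibly apply."""
    routes = []
    if p.country == "UK":
        # Chargeback: debit or credit, any amount, roughly 120 days.
        if p.method in ("credit", "debit") and p.days_since <= 120:
            routes.append("UK chargeback")
        # Section 75: credit only, over £100 and up to £30,000, up to six years,
        # and only where the card provider and retailer are directly linked.
        if (p.method == "credit" and 100 < p.amount <= 30_000
                and p.days_since <= 6 * 365 and p.direct_to_seller):
            routes.append("UK Section 75")
    elif p.country == "US":
        # Reg Z: consumer credit up to the 2026 threshold quoted in the table.
        if p.method == "credit" and p.amount <= 73_400:
            routes.append("US Reg Z")
        # Reg E: electronic transfers (debit payments treated as transfers here),
        # with 60 days to report.
        if p.method in ("debit", "transfer") and p.days_since <= 60:
            routes.append("US Reg E")
    return routes


# The £72 order from the opening example, paid by credit card three days ago.
print(available_routes(Purchase("UK", "credit", 72.0, 3)))
```

Run against the opening example, the sketch returns only the chargeback route: a £72 order sits below the Section 75 floor, so the stronger statutory right never comes into play.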

Authorised, but not intended

The whole process turns on one question: was the transaction authorised? If you didn’t authorise it, fraud rules cover it.

But a mis-prompted purchase from an agent you did authorise doesn’t look like fraud. It looks like a decision you regret, and that’s newer ground for the recourse machinery.

And the error budget is thin. Recent research estimates enterprises generally need agent failure rates below 5%, yet reliability collapses under repetition: GPT-4-based agents that succeed about 60% of the time on a single attempt drop to roughly 25% across eight consecutive runs.

A sub-5% error rate across millions of automated purchases is still a large pile of wrong orders.
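The arithmetic is simple enough to run. A small Python sketch, assuming a hypothetical volume of two million automated purchases a year (that figure is illustrative and not from any study); `expected_wrong_orders` is a made-up helper name.

```python
def expected_wrong_orders(purchases: int, failure_rate: float) -> float:
    """Expected number of wrong orders: volume multiplied by failure rate."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    return purchases * failure_rate


purchases = 2_000_000  # hypothetical annual volume, for illustration only

for rate in (0.05, 0.01, 0.001):
    wrong = expected_wrong_orders(purchases, rate)
    print(f"{rate:.1%} failure rate -> {wrong:,.0f} wrong orders")
```

At the 5% ceiling, two million purchases imply around 100,000 wrong orders. Even at a tenth of a percent, the same volume leaves about 2,000 disputes for someone to resolve.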

Who pays? The deploying business

For merchants and firms shipping agents, regulators have closed the easy exit.

In March 2026, the CMA told UK businesses that the same consumer law applies whether a customer deals with a human or an AI agent, and that the business is responsible even when a third party supplied or designed the agent. 

That responsibility holds even when the failure starts with an attack like prompt injection.

Under the Digital Markets, Competition and Consumers Act, breaches can bring fines of up to 10% of worldwide turnover.

The US has moved the same way. California chaptered AB 316 in October 2025, and the bill bars defendants who developed, modified, or used AI from asserting that the AI autonomously caused the harm.

It doesn’t create strict liability, since a plaintiff still has to prove causation, but it removes the “blame the machine” escape. 

And legal scholars have mapped how products liability for defective design could reach AI developers directly.

Europe leaves a different kind of hole. The AI Liability Directive was withdrawn in October 2025, while the revised Product Liability Directive brings software and AI into product-liability scope.

But that still doesn’t neatly solve the everyday agentic-commerce problem: a bad purchase that causes financial loss rather than injury, property damage, or data loss.

In the UK, the Law Commission’s July 2025 discussion paper names the situations where no identifiable person ends up liable for an autonomous system’s conduct, then proposes no reform.

There’s a less visible problem beneath this. Law firm Clifford Chance notes that many agents run under legacy technology contracts written for passive software, where vendor disclaimers often push the cost back onto the deploying business.

That leaves firms with a familiar problem: legal responsibility on the customer side, limited recovery from the supplier side, and a loss someone has to cover.

The cost moves into insurance

If the deploying business carries the residual risk, the next question is who insures it. The early answer is: not the usual policies.

A Gallagher Re study maps where traditional cover falls short. Cyber, technology E&O, product liability, and commercial general liability each leave AI-driven losses uncovered or only partly covered.

The same study reported that from 1 January 2026, new standard ISO exclusions may let insurers strip generative-AI liability out of general liability policies altogether. 

Meanwhile US generative-AI lawsuits climbed 978% between 2021 and 2025, and vendor contracts typically cap a supplier’s liability at 12 months of fees with no performance warranties.

So a new line is forming to fill the hole, much as cyber insurance did in the 1990s. Munich Re’s aiSure pays out when an AI system’s error rate breaks an agreed threshold; Armilla launched standalone AI liability cover with Lloyd’s underwriters in 2025; and Testudo opened in January 2026 with claims-made cover for enterprises facing lawsuits over generative-AI outputs.

That pressure points toward collateral. To transact, an agent may need funds held in escrow or posted as a bond against disputes, either per transaction or folded into a premium. The deposit becomes the price of delegation.

The audit trail and the arbitration layer

If liability turns on what the agent was told and what it did, the record of both becomes the asset that settles the claim. Mandates evolve from permission to evidence.

Google’s AP2 protocol already frames its design around a non-repudiable audit trail linking intent to checkout to payment, so a dispute resolves against proof of what the user authorised rather than against a model’s guess.
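One common way to make a trail like this tamper-evident is hash chaining, where each record carries the hash of the one before it. The Python sketch below shows the idea only. It is not AP2's actual message format: the stage names, the payload fields, the `max_total_gbp` mandate, and the `append_record` and `verify_chain` helpers are all hypothetical. A hash chain also only shows that a record was altered after the fact. Non-repudiation would additionally need each party to sign its records.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(stage: str, payload: dict, prev_hash: str) -> str:
    """Hash a record's content together with the previous record's hash."""
    body = json.dumps(
        {"stage": stage, "payload": payload, "prev": prev_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(trail: list[dict], stage: str, payload: dict) -> None:
    """Append a record that is linked to the current tail of the trail."""
    prev_hash = trail[-1]["hash"] if trail else GENESIS
    trail.append({
        "stage": stage,
        "payload": payload,
        "prev": prev_hash,
        "hash": _digest(stage, payload, prev_hash),
    })


def verify_chain(trail: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in trail:
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != _digest(record["stage"], record["payload"], prev_hash):
            return False
        prev_hash = record["hash"]
    return True


trail: list[dict] = []
append_record(trail, "intent", {"item": "household supplies", "max_total_gbp": 10})
append_record(trail, "checkout", {"item": "household supplies", "total_gbp": 72})
append_record(trail, "payment", {"total_gbp": 72})

print(verify_chain(trail))  # the untouched trail verifies
```

In a dispute over the opening example, the useful part is the gap between the first two records: the stated intent capped the spend, and the checkout exceeded it. If anyone later edits the checkout total, `verify_chain` fails.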

But an audit trail only proves what happened. It doesn’t decide whether what happened satisfied the buyer. So a second layer is forming on top.

Researchers have proposed an agent reimbursement system that holds payment in escrow, defines when a claim can be filed, and resolves compensation once a job goes wrong.

A companion IETF draft, the Agent Dispute Resolution Protocol, splits disputes into two classes: a cryptographic class that code can settle from the proof bundle, and a semantic class that needs pre-agreed acceptance criteria and, failing those, escalation to a human arbitrator.

Its output is an escrow directive that releases, refunds, or splits the held funds. 
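The two-class split can be sketched in a few lines of Python. This is an illustration of the idea as described here, not the draft's wire format or decision rules: the `Directive` enum, the `resolve` function, and the order fields are hypothetical, and the rule that partly met criteria produce a split is an assumption made for the example.

```python
from enum import Enum
from typing import Callable, Optional, Sequence


class Directive(Enum):
    RELEASE = "release"  # pay the held funds to the seller
    REFUND = "refund"    # return the held funds to the buyer
    SPLIT = "split"      # divide the held funds between the two


Criterion = Callable[[dict], bool]


def resolve(
    dispute_class: str,
    order: dict,
    *,
    proof_valid: bool = False,
    criteria: Sequence[Criterion] = (),
) -> Optional[Directive]:
    """Return an escrow directive, or None to escalate to a human arbitrator."""
    if dispute_class == "cryptographic":
        # Code can settle this class directly from the proof bundle.
        return Directive.RELEASE if proof_valid else Directive.REFUND
    if dispute_class == "semantic":
        if not criteria:
            return None  # no pre-agreed acceptance criteria: escalate
        passed = sum(1 for check in criteria if check(order))
        if passed == len(criteria):
            return Directive.RELEASE
        if passed == 0:
            return Directive.REFUND
        return Directive.SPLIT  # assumption: partial compliance splits the funds
    raise ValueError(f"unknown dispute class: {dispute_class!r}")


# The opening example: twelve cases at £72 against a one-case, £10 expectation.
order = {"cases": 12, "total_gbp": 72}
criteria = [
    lambda o: o["cases"] <= 1,
    lambda o: o["total_gbp"] <= 10,
]
print(resolve("semantic", order, criteria=criteria))
```

With both acceptance criteria failed, the sketch refunds the buyer. With no criteria agreed in advance, it returns `None`, which stands in for handing the dispute to a human arbitrator.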

Read together, the stack looks like this: AP2 records authority, the reimbursement layer holds the money, and the dispute protocol decides who gets it back. 

Digital arbitration moves from a customer-service queue into the protocol itself.


It all comes down to governance

Every mechanism in this piece is a patch on the same underlying problem.

Mandates, audit trails, insurance, escrow, arbitration, the rule that the deployer answers for the agent: each one handles a consequence after authority has been granted and a purchase has run.

None of them governs the agent itself. That’s Layer 7.

Authority sets what the agent can do. Liability decides who pays when it goes wrong. Governance is the layer above both: who sets the rules, monitors the agent in production, and keeps the power to shut it down.

That’s where our series ends.
