How to Properly Scrutinise Your Vendor’s AI Model Card

Most AI model cards are compliance theatre. Here's a practitioner's guide to reading them critically before you sign off on a vendor's AI system.

A vendor has sent you an AI model card. Maybe procurement flagged it, maybe it arrived attached to a contract, maybe your legal team forwarded it with a question mark in the subject line. 

The document looks thorough. It has clearly labelled sections, tables, performance metrics, and a reassuring paragraph about responsible AI. You nod and file it away.

That is perhaps the wrong call.

An AI model card is the closest thing the artificial intelligence industry has to a drug insert. It’s a structured, standardised document that accompanies a machine learning model, telling you what the model does, what data trained it, where it performs well, and where it fails. 

The concept originated in a 2018 academic paper by Mitchell et al., "Model Cards for Model Reporting". Hugging Face made it mainstream by building model cards into their model hub, and Google, Amazon, and Microsoft followed. The format became the de facto standard for AI documentation and responsible AI disclosure.

It also became, in many hands, a compliance checkbox.

What a model card should tell you

A complete AI model card ideally covers the following sections in detail.

Section | What it’s supposed to tell you
Intended use case | The specific contexts and audiences the AI or ML model was designed for, and where it shouldn’t be deployed
Base model and architecture | The technical foundation of the model and how it was built
Training data | What data the machine learning model learned from, including sources and known limitations
Performance metrics and quantitative analyses | How the model performs across different groups and conditions, not just in aggregate
Bias and fairness | Measured disparities in model performance across specific subgroups
Evaluation methodology | How the developer tested the model and which benchmarks they used
Risk management | Named failure modes and the steps taken to address them
Model version and update history | Which version this card covers and when documentation last changed
Compliance alignment | Stated alignment with frameworks like the EU AI Act and NIST AI RMF
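
If your team keeps vendor model cards in a structured form, the table above can be turned into a quick completeness check. The following is a minimal sketch, not a standard: the section keys and the sample vendor card are hypothetical, and real cards use varying headings that you would need to map onto these keys by hand.

```python
# Hypothetical keys mirroring the nine sections in the table above.
EXPECTED_SECTIONS = [
    "intended_use",
    "base_model_and_architecture",
    "training_data",
    "performance_metrics",
    "bias_and_fairness",
    "evaluation_methodology",
    "risk_management",
    "version_and_update_history",
    "compliance_alignment",
]


def missing_sections(card: dict) -> list[str]:
    """Return the expected sections that are absent or empty in the card."""
    missing = []
    for key in EXPECTED_SECTIONS:
        value = card.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


# Made-up example of a vendor card transcribed into a dict.
vendor_card = {
    "intended_use": "Customer support triage for English-language tickets.",
    "base_model_and_architecture": "Fine-tuned transformer classifier.",
    "training_data": "Publicly available.",
    "performance_metrics": "94% accuracy.",
    "bias_and_fairness": "",
    "version_and_update_history": "v2.1",
}

print(missing_sections(vendor_card))
# ['bias_and_fairness', 'evaluation_methodology', 'risk_management', 'compliance_alignment']
```

A check like this only tells you whether a section exists. Whether the section says anything useful still takes a human read.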

The EU AI Act requires structured technical documentation covering 13 specific elements for any high-risk AI system operating in EU markets, with most obligations applying from August 2026. A properly completed model card goes a long way towards meeting that requirement. An incomplete one won’t survive audit.

You may also encounter two related documents. A system card covers the full AI system rather than just the underlying model. A data card documents the training dataset specifically.

What most AI model cards contain

Research shows that just 14% of model cards include meaningful risk discussion. Put differently, the document designed specifically to surface risk fails to do so in 86% of cases.

The sections most model cards handle well are the easy ones: model architecture, intended use, basic aggregate metrics. Standard model documentation. 

The sections that should concern a business reader are the ones most likely to be vague, absent, or deliberately soft:

  • Training data gets described as “publicly available.” 
  • Bias gets a paragraph acknowledging it exists without quantifying it. 
  • Fairness appears as a stated goal rather than a measured outcome. 
  • Evaluation is reported as a single accuracy figure, which hides how the model performs on the specific subgroups your business actually serves, exposing you to risk.

Model card creation has become a documentation activity, not a transparency activity. The incentive is to publish a card, not to publish a useful one. The EU AI Act is changing this incentive structure, but until August 2026, scrutiny is your job.

How to read an AI or ML model card

You don’t need a data scientist in the room to ask the right questions. These five will tell you whether a model card is substantive or performative.

Does the intended use case match your use case?

Every model card specifies the context its developers designed it for. If your use case differs from theirs, the documented performance metrics don’t apply to your situation. AI developers and vendors often present general-purpose models with general-purpose model cards. That’s not evidence the model works for your specific context.

Are the performance metrics disaggregated?

Aggregate accuracy figures are near-useless for compliance purposes. A model that’s 94% accurate overall can be 70% accurate for a specific demographic group your business serves. Ask whether the quantitative analyses break down by subgroup. If they don’t, that’s a due-diligence finding worth escalating.
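
The arithmetic behind that claim is worth seeing once. The sketch below uses made-up evaluation records (group names and counts are illustrative assumptions, not data from any real model) to show how a small, poorly served subgroup disappears inside a healthy-looking aggregate.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """records: iterable of (group, is_correct) pairs.

    Returns (overall_accuracy, {group: accuracy}).
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, is_correct in records:
        totals[group] += 1
        correct[group] += int(is_correct)
    overall = sum(correct.values()) / sum(totals.values())
    per_group = {g: correct[g] / totals[g] for g in totals}
    return overall, per_group


# Illustrative data: 900 cases from group A (870 correct),
# 100 cases from group B (70 correct).
records = (
    [("A", True)] * 870 + [("A", False)] * 30
    + [("B", True)] * 70 + [("B", False)] * 30
)

overall, per_group = accuracy_by_group(records)
print(f"overall: {overall:.0%}")          # overall: 94%
for group, acc in sorted(per_group.items()):
    print(f"group {group}: {acc:.0%}")    # group A: 97%, group B: 70%
```

Group B makes up only a tenth of the test set, so its 70% barely moves the headline number. This is the breakdown to ask the vendor for.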

What does the bias section say?

There’s a meaningful difference between “we acknowledge that bias may exist in AI models” and “we tested for bias across these specific groups and here are the results.” The first is a disclaimer. The second is evidence. Most model cards give you the first.
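
To make "here are the results" concrete, here is a minimal sketch of one measurement a vendor could report: the false positive rate per subgroup and the gap between the best and worst group. The group names and labels are invented for illustration, and false positive rate is only one of several fairness measures a card might use.

```python
def false_positive_rate(labels, predictions):
    """Share of actual negatives (label 0) that the model flagged as positive (1)."""
    negatives = [p for y, p in zip(labels, predictions) if y == 0]
    if not negatives:
        raise ValueError("no negative examples to compute a rate from")
    return sum(1 for p in negatives if p == 1) / len(negatives)


# Made-up evaluation results per subgroup: (true labels, model predictions).
results = {
    "group_a": ([0, 0, 0, 0, 1, 1], [0, 0, 0, 1, 1, 1]),
    "group_b": ([0, 0, 0, 0, 1, 1], [1, 1, 0, 0, 1, 0]),
}

fpr = {group: false_positive_rate(y, p) for group, (y, p) in results.items()}
gap = max(fpr.values()) - min(fpr.values())

print(fpr)   # {'group_a': 0.25, 'group_b': 0.5}
print(gap)   # 0.25
```

A bias section that contains a table of numbers like these, with the groups named, is evidence. A paragraph without them is a disclaimer.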

Does the model version number match what you’re deploying?

A model card for an outdated model version tells you nothing about the system you’re putting into production today. AI systems update frequently. Governance documentation that doesn’t track model versions is decorative.

Does the risk management section name specific risks?

The relevant information here is concrete: what can this model get wrong, under what conditions, and what did the developer do about it. Philosophical commitments to responsible AI development are not risk management.

If a vendor can’t answer these questions by pointing to their model card, or if no card exists, that’s the finding. No AI documentation means no accountability. In a procurement context, that’s a negotiating point and a potential legal exposure depending on your sector.
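
If you review more than a handful of vendors, it can help to record the answers to the five questions in a consistent shape. The sketch below is one hypothetical way to do that; the field names and finding messages are assumptions made for illustration, not part of any framework.

```python
from dataclasses import dataclass


@dataclass
class CardReview:
    """Answers to the five questions above for one vendor model card."""
    use_case_matches: bool
    metrics_disaggregated: bool
    bias_quantified: bool
    version_matches_deployment: bool
    risks_named: bool


# Finding text to raise when the answer to a question is "no".
FINDINGS = {
    "use_case_matches": "Documented intended use does not match our use case",
    "metrics_disaggregated": "Performance metrics are aggregate only",
    "bias_quantified": "Bias is acknowledged but not measured",
    "version_matches_deployment": "Card version differs from the deployed model",
    "risks_named": "No specific failure modes or mitigations are named",
}


def findings(review: CardReview) -> list[str]:
    """Return one finding per question the card failed, in question order."""
    return [msg for field, msg in FINDINGS.items() if not getattr(review, field)]


review = CardReview(
    use_case_matches=True,
    metrics_disaggregated=False,
    bias_quantified=False,
    version_matches_deployment=True,
    risks_named=False,
)

for item in findings(review):
    print("-", item)
```

The output is a short list you can hand to procurement or legal as it stands.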

Why scrutinising ML models is now your job

Non-technical business leaders have historically delegated model evaluation entirely to technical teams. That made sense when AI systems were narrow tools with contained failure modes.

It makes less sense when an AI system influences hiring, credit decisions, healthcare diagnostics, or customer-facing interactions at scale.

The NIST AI RMF and the EU AI Act both extend accountability for AI systems beyond the development team to the buyers, deployers, and organisations embedding these models into their operations. 

A model card is the document that makes informed deployment possible, when written honestly.

Your technical team can tell you whether a model performs well on a benchmark. An AI model card tells you whether the vendor has thought carefully about where it fails, who it fails, and what they’ve done about it. 

Those are different questions, and the second set belongs to you.

The business leaders who build model card literacy now won’t be building it under a regulatory deadline later.
