June 3, 2026

The Floridi Conjecture: Why Broad AI May Never Be Reliable

Mo Shehu, PhD

The Floridi conjecture holds that no AI system can be both broadly capable (scope) and provably reliable (certainty). Here is why, and what it means for AI governance.

TLDR: The Floridi conjecture proposes a formal trade-off between an AI system’s scope and the certainty of its outputs. If the conjecture holds, any large language model broad enough to be useful carries an irreducible error rate. A later amendment by Alberto Messina makes the conjecture more computable. Both versions point to a useful lens for business strategy: AI reliability depends on scope, and governance should treat it that way.

In 2025, Professor Luciano Floridi of Yale University, a leading voice in the philosophy of information and AI ethics, published a short paper with a large claim.

It proposes a formal limit on artificial intelligence: an AI system can be broad, or it can be reliable, but it can’t fully be both.

The conjecture says that the wider the range of tasks an AI model handles across different contexts, the lower the ceiling on how certain its outputs can be. Researchers now call the idea the Floridi conjecture.

What Floridi’s conjecture proposes

Floridi builds the conjecture around two measures.

Certainty (C(M)) measures how provably error-free a system’s outputs are. A score of 1 means a formal guarantee of zero error, with no exceptions and no edge cases.

Mapping scope (S(M)) measures how broad and complex a system’s input and output domain is. The original paper measures this through Kolmogorov complexity, an idea from math that captures how much information a domain contains—the length of the shortest program needed to reproduce an object, such as a piece of text.
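Kolmogorov complexity can't be computed exactly, but the size of a compressed copy gives a crude upper-bound proxy for it. The sketch below is only an illustration of that intuition, not the paper's method: the function name `compressed_size` and both sample inputs are assumptions made for this example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes after zlib compression at the highest level.

    A short result means a short description exists; this is a rough
    stand-in for Kolmogorov complexity, not the real quantity.
    """
    return len(zlib.compress(data, 9))


# A highly regular "domain": the same fact repeated many times.
repetitive = b"1+1=2;" * 200

# An irregular "domain" of the same length, built from seeded random bytes.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(len(repetitive)))

print(compressed_size(repetitive), compressed_size(noisy))
```

The repetitive input shrinks to a small fraction of its length, while the noisy one barely compresses at all. That gap is the sense in which a wider, less regular domain "contains more information".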

The conjecture describes a fundamental trade-off: C(M) multiplied by S(M) stays at or below a fixed value, k. Certainty and scope can’t both run high at once.

Two consequences follow. Push certainty toward 1 and scope must shrink. Widen scope past a threshold and certainty must fall below 1 and stay there, whatever you spend on training, compute, or engineering.
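To see how the bound behaves, here is a minimal sketch that takes the inequality as stated above, C(M) × S(M) ≤ k, and computes the highest certainty it would allow at a given scope. The value of `K` and the scope figures are made up for illustration; the conjecture doesn't supply numbers.

```python
def certainty_ceiling(scope: float, k: float) -> float:
    """Largest certainty permitted by C * S <= k, capped at 1."""
    if scope <= 0 or k <= 0:
        raise ValueError("scope and k must be positive")
    return min(1.0, k / scope)


K = 10.0  # hypothetical constant, chosen only for this example

for scope in (5.0, 10.0, 20.0, 100.0):
    print(scope, certainty_ceiling(scope, K))
```

With these invented numbers, certainty can stay at 1 while scope is at or below 10. Beyond that point the ceiling drops to 0.5 at a scope of 20 and 0.1 at a scope of 100.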

System	Scope	Certainty
Pocket calculator	Narrow (fixed arithmetic)	Provably perfect
Coq proof assistant	Narrow (formal logic only)	Provably perfect
GPT-4, Claude, Gemini	Very broad (text, images, code)	Statistical only
General medical chatbot	Very broad	No formal guarantee

A pocket calculator never hallucinates—punching in 1+1 will give you 2 each time. A general-purpose AI model sometimes does. The conjecture gives a reason for that difference and argues that scale alone can’t remove it.

In a follow-up post on LinkedIn, Floridi said he’d be content if the conjecture holds as an empirical generalisation, “like one of those laws so common in IT/CS” (think Moore’s Law).

Where the idea of Floridi’s conjecture comes from

The conjecture draws together four established results that researchers and mathematicians have studied for decades, and it connects to older questions in epistemology about the limits of knowledge.

In symbolic AI, a long-known trade-off pits the expressiveness of a logic against the speed of reasoning within it. Richer languages describe more of the world but can’t guarantee fast or correct answers across all of it.

Wolpert and Macready’s 1997 “No Free Lunch” theorems show that, averaged over all possible problems, no optimisation algorithm outperforms any other. Strength on one class of problems is paid for with weakness on another. Put differently, the perfect artificial agent doesn’t exist.
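A toy version of the No Free Lunch idea fits in a few lines. This sketch uses a tiny supervised-learning setting rather than the optimisation setting of the 1997 paper, and the two learners are invented for the example. It averages held-out accuracy over every possible yes/no target on three inputs.

```python
from itertools import product

TRAIN_X = (0, 1)  # inputs the learner sees with labels
TEST_X = 2        # held-out input the learner must predict


def always_zero(train_labels):
    """Ignore the training data and predict 0."""
    return 0


def majority(train_labels):
    """Predict the most common training label; ties go to 1."""
    return 1 if sum(train_labels) * 2 >= len(train_labels) else 0


def average_accuracy(learner):
    """Mean held-out accuracy over all 8 boolean targets on {0, 1, 2}."""
    targets = list(product((0, 1), repeat=3))
    hits = 0
    for target in targets:
        train_labels = tuple(target[x] for x in TRAIN_X)
        hits += int(learner(train_labels) == target[TEST_X])
    return hits / len(targets)


print(average_accuracy(always_zero))  # 0.5
print(average_accuracy(majority))     # 0.5
```

Both learners land on exactly 0.5 once every possible target counts equally. A learner only pulls ahead when the problems it faces have structure that suits it, which is another way of saying breadth has a cost.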

Work on formal verification, going back to Rice’s theorem and the Halting Problem, shows you can’t fully verify arbitrary complex programs. Hard guarantees hold only inside tightly bounded systems, not across open-world systems—especially with AI agents in the mix.

Valiant’s 1984 PAC learning framework shows that as concept classes become more complex, learners usually need more data to reach approximate correctness, while the guarantee remains a matter of probability rather than certainty.
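The textbook PAC bound for a finite hypothesis class makes both points concrete. For a learner that outputs a hypothesis consistent with its training data, m ≥ (ln|H| + ln(1/δ)) / ε samples are enough to reach error at most ε with probability at least 1 − δ. The sketch below just evaluates that formula; the class sizes and tolerances are arbitrary examples.

```python
import math


def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Samples sufficient for a consistent learner over a finite class.

    Evaluates m >= (ln|H| + ln(1/delta)) / epsilon, where epsilon is the
    tolerated error and delta the tolerated chance of failure.
    """
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / epsilon)


small_class = pac_sample_bound(2 ** 10, epsilon=0.05, delta=0.05)
large_class = pac_sample_bound(2 ** 20, epsilon=0.05, delta=0.05)
print(small_class, large_class)
```

A richer class needs more data for the same guarantee, and the guarantee never becomes absolute: δ must stay above zero, and the bound grows without limit as δ shrinks.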

The Messina amendment

Strong conjectures attract scrutiny. Alberto Messina of RAI, the Italian public broadcaster, published a proposed revision that Floridi welcomed in public. Messina kept the general philosophy intact and fixed two technical problems.

First, the original certainty measure was too harsh and formally mixed two ideas: worst-case error over inputs and probabilistic assessment.

Messina replaced it with an expected correctness measure, weighted by input probability and later by user judgment.

This handles the fact that many tasks have no single right answer. A legal summary that captures most of the substance, though not every detail, still has value, and the original measure couldn’t record that.
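The difference between the two kinds of measure is easy to show with numbers. The following is a rough sketch of a probability-weighted expected score, not Messina's exact definition; the function names, probabilities, and scores are all assumptions for illustration.

```python
def expected_correctness(cases):
    """Probability-weighted mean score.

    cases: list of (input_probability, score) pairs, score in [0, 1].
    A score between 0 and 1 stands for a partly acceptable answer.
    """
    total_probability = sum(p for p, _ in cases)
    if total_probability <= 0:
        raise ValueError("probabilities must sum to a positive value")
    return sum(p * score for p, score in cases) / total_probability


def worst_case_correctness(cases):
    """Score on the single worst input, ignoring how rare it is."""
    return min(score for _, score in cases)


# Hypothetical workload: common inputs handled well, a decent partial
# answer on some, and a rare input that fails outright.
workload = [(0.7, 1.0), (0.2, 0.8), (0.1, 0.0)]

print(expected_correctness(workload))
print(worst_case_correctness(workload))
```

The expected measure comes out at about 0.86 and gives credit for the partial answer, while the worst-case view reports 0 because of one rare failure.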

Second, the original scope measure relied on Kolmogorov complexity, which is not practically computable and was underspecified when applied to whole input and output spaces.

Messina substituted Shannon joint entropy, a distribution-based measure that is easier to estimate from data.
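Entropy can be estimated directly from logged data, which is what makes it practical. Here is a minimal sketch of a plug-in estimate of joint entropy over observed (input, output) pairs. The sample data is invented, and a serious estimate would need far more samples plus a bias correction.

```python
import math
from collections import Counter


def joint_entropy_bits(pairs):
    """Plug-in estimate of H(X, Y) in bits from observed (input, output) pairs."""
    counts = Counter(pairs)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("need at least one observation")
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# A narrow system: one input, one output, every time.
narrow = [("1+1", "2")] * 8

# A broader system: four equally common input/output pairs.
broad = [("q1", "a1"), ("q2", "a2"), ("q3", "a3"), ("q4", "a4")] * 2

print(joint_entropy_bits(narrow))  # 0.0
print(joint_entropy_bits(broad))   # 2.0
```

The narrow log carries zero bits and the broader one carries two, so the estimate rises as the observed behaviour spreads over more distinct cases.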

The revised conjecture keeps its shape, with S(M) multiplied by C(M) staying at or below k, and both measures are now more coherent and easier to compute.

Floridi had already flagged that colleagues were reformulating his idea in Shannon terms, and he called that good news.

A separate disproof by Generoso Immediato argues that the universal product form fails under Floridi’s original definitions, and that Messina’s entropy-based revision still can’t restore a universal bound.

Messina’s revision therefore fixes some formal weaknesses in Floridi’s setup, but it doesn’t settle the larger dispute over whether the certainty-scope trade-off can hold as a universal mathematical law.

Floridi’s conjecture continues to draw responses across computer science, information theory, and philosophy.

What it means for AI hallucination

The conjecture gives a theoretical base for a pattern practitioners already report: AI hallucination has no complete fix.

Any useful general-purpose large language model, one that covers law, medicine, finance, code, and writing at once, runs deep in the high-scope range. The conjecture says this would cap its certainty below 1.

Messina’s version reaches a similar practical warning through a gentler user-centric measure: as entropy rises across a domain, perfect correctness becomes harder to defend. But that revised ceiling remains disputed.

So the practical question changes from which AI tool has the lowest error rate to what error rate a given task can absorb, and what controls go around it.

Three things the Floridi conjecture doesn’t claim

It doesn’t cover human intelligence. Floridi confirmed this in discussion: people gain breadth without losing depth in ways the conjecture doesn’t model. The limit applies to sufficiently expressive AI mechanisms, not to intelligence as such.

It doesn’t say broad AI has no value. High-scope systems still deliver value; they just also carry irreducible uncertainty that governance has to plan for.

It doesn’t rule out progress. As one commenter noted in the same discussion, different architectures might reach different certainty at the same scope, so there’s a band of possible curves rather than one fixed line. Better design moves a system to the outer edge of what the limit allows, without breaking it.

What it means in practice

Layered design offers one partial response. Pair high-certainty, low-scope parts, such as formally verified logic, rules engines, or structured databases, with low-certainty, high-scope parts such as generative AI.
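One way to picture the layering is a router that tries a deterministic component first and falls back to a generative one, labelling each answer with where it came from. This is a simplified sketch: `generative_layer` is a hypothetical stand-in for a model call, and the arithmetic-only rules layer is just an example of a narrow, checkable component.

```python
import re
from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    source: str     # "rules" or "generative"
    verified: bool  # True only when a deterministic layer produced it


ARITHMETIC = re.compile(r"\s*(-?\d+)\s*([+\-*])\s*(-?\d+)\s*")


def rules_layer(query: str):
    """Narrow scope: exact integer arithmetic, or None if out of scope."""
    match = ARITHMETIC.fullmatch(query)
    if match is None:
        return None
    a, op, b = int(match.group(1)), match.group(2), int(match.group(3))
    result = {"+": a + b, "-": a - b, "*": a * b}[op]
    return Answer(str(result), "rules", True)


def generative_layer(query: str) -> Answer:
    """Hypothetical placeholder for a broad, unverified generative model."""
    return Answer(f"[draft answer to: {query}]", "generative", False)


def answer(query: str) -> Answer:
    """Prefer the verified layer; fall back to the generative one."""
    return rules_layer(query) or generative_layer(query)


print(answer("1+1"))
print(answer("Summarise this contract"))
```

The `verified` flag is the point of the design. Downstream controls can treat the two kinds of answer differently instead of trusting them equally.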

Distrust single-number accuracy claims. A score that reports both broad coverage and near-perfect accuracy deserves hard questions. Either the real scope is narrower than it looks, or the figure has no worst-case proof.

Messina’s user-centric measure helps here: benchmark accuracy may not match how often a model gives acceptable answers on your users’ actual inputs.

Useful evaluation reports coverage and confidence as pairs, not one headline number. Scrutinise your vendor’s model card closely.
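A per-domain breakdown is one simple way to report those pairs. The sketch below assumes you have evaluation results tagged by domain; the domain names and counts are invented, and "acceptable" stands for whatever judgment your reviewers apply.

```python
from collections import defaultdict


def coverage_accuracy_report(results):
    """Summarise results per domain.

    results: iterable of (domain, is_acceptable) pairs from an evaluation run.
    Returns each domain's share of inputs, accuracy, and sample count.
    """
    totals = defaultdict(lambda: [0, 0])  # domain -> [count, acceptable]
    for domain, is_acceptable in results:
        totals[domain][0] += 1
        totals[domain][1] += int(is_acceptable)
    n_all = sum(count for count, _ in totals.values())
    return {
        domain: {
            "share_of_inputs": count / n_all,
            "accuracy": acceptable / count,
            "n": count,
        }
        for domain, (count, acceptable) in totals.items()
    }


# Hypothetical run: perfect on arithmetic, weaker on legal questions.
results = (
    [("arithmetic", True)] * 50
    + [("legal", True)] * 30
    + [("legal", False)] * 20
)

for domain, row in coverage_accuracy_report(results).items():
    print(domain, row)
```

The headline accuracy for this made-up run is 80%, which hides the split between 100% on arithmetic and 60% on legal questions.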

Match governance to scope. A buyer or regulator who demands zero error from an open-world system assumes the conjecture is false, and the burden of proof falls on them.

Under risk-based AI regulation like the EU AI Act and similar regimes, a defensible standard names an acceptable error rate per use case, applies controls in proportion to the cost of a wrong answer, and audits AI models against that line.
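An audit against such a standard can be sketched as a threshold check per use case. The thresholds in `POLICY` below are hypothetical and don't come from the EU AI Act or any other regulation. The check uses the Wilson score upper bound on the error rate, so a small sample can't pass on luck alone.

```python
import math


def wilson_upper(errors: int, n: int, z: float = 1.96) -> float:
    """Upper end of the Wilson score interval for an error rate (z=1.96 is ~95%)."""
    if n <= 0 or not 0 <= errors <= n:
        raise ValueError("need n > 0 and 0 <= errors <= n")
    p = errors / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre + half_width


# Hypothetical acceptable error rates, set by the cost of a wrong answer.
POLICY = {"marketing_draft": 0.10, "contract_summary": 0.02}


def audit(use_case: str, errors: int, n: int) -> bool:
    """Pass only if the upper bound on the error rate is within policy."""
    return wilson_upper(errors, n) <= POLICY[use_case]


print(audit("marketing_draft", errors=30, n=1000))
print(audit("contract_summary", errors=15, n=1000))
print(wilson_upper(0, 1000))
```

In this example the contract summariser fails even though its observed error rate of 1.5% sits under the 2% line, because the upper bound does not. Zero observed errors in 1,000 cases still leaves an upper bound above zero, which fits the conjecture's point that testing alone can't establish certainty.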

Floridi treats this as a question of ethics as much as engineering, and the paper puts it plainly: “policies predicated on 100% correctness in an open domain” assume the trade-off can be sidestepped.

