Why 90% accuracy fails in business automation

Here's a number that sounds good until you do the maths: 90% accuracy on a document processing task.

Ninety percent feels like a win. In most business contexts, 90% would be a strong result. But in automation it leaves you with a constant stream of exceptions to handle. In many cases, that’s harder to manage than doing the work yourself.

Think about what 90% means in production. If a process handles 1,000 documents a day, 90% accuracy means 100 exceptions land on someone's desk for manual review. Every day. Without fail. You haven't reduced the human workload; you've just changed its shape. Instead of doing the work, people are now auditing the AI's work, which is slower, more frustrating, and harder to scale than just doing it manually.
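
To put the arithmetic in one place, here is a minimal sketch. The function name is made up for illustration, and it assumes every error surfaces as an exception that someone has to review.

```python
def daily_exceptions(volume: int, accuracy: float) -> int:
    """Expected number of documents per day that need manual review."""
    return round(volume * (1 - accuracy))


# 1,000 documents a day is the figure used in the text above.
for accuracy in (0.80, 0.90, 0.95, 0.99):
    count = daily_exceptions(1_000, accuracy)
    print(f"{accuracy:.0%} accuracy -> {count} exceptions per day")
```

At 1,000 documents a day, that works out to 200, 100, 50 and 10 exceptions respectively.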

At 80% accuracy, which many pilot projects celebrate as a milestone, one in five items needs human review. And because nobody knows in advance which one, every item ends up being checked. The AI is no longer an autonomous system, but an input queue for your existing team.

There's a threshold below which automation doesn't reduce cost. It adds aggravation.

The accuracy paradox in automation

The uncomfortable reality of enterprise automation is that partial accuracy is not a stepping stone to full automation. It's a different product category entirely.

A system running at 85% accuracy requires constant human supervision. That supervision becomes a permanent operational cost. The team can never fully disengage from the process, because they don't know which 15% of outputs to trust and which to question. The cognitive overhead of auditing AI output is often higher than the cognitive overhead of doing the work, because at least when you do the work yourself, you know what you've done.
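
A rough way to see why supervision becomes a permanent cost is to compare human minutes per document in two situations: one where nobody knows which outputs are wrong, and one where errors are reliably flagged. The sketch below is illustrative only. The function name and all timing figures are assumptions, not measurements.

```python
def review_minutes_per_doc(accuracy: float, audit_min: float,
                           fix_min: float, errors_flagged: bool) -> float:
    """Human minutes spent per document once the AI is in the loop.

    errors_flagged=False: nobody knows which outputs are wrong, so every
    output is audited and the wrong ones are then fixed.
    errors_flagged=True: only the wrong outputs ever reach a human.
    """
    error_rate = 1.0 - accuracy
    if errors_flagged:
        return error_rate * (audit_min + fix_min)
    return audit_min + error_rate * fix_min


# Invented figures, for illustration only.
MANUAL_MIN = 3.0  # minutes to process one document by hand
AUDIT_MIN = 2.5   # minutes to check one AI output
FIX_MIN = 4.0     # minutes to correct one wrong output

for flagged in (False, True):
    cost = review_minutes_per_doc(0.85, AUDIT_MIN, FIX_MIN, errors_flagged=flagged)
    print(f"errors flagged={flagged}: {cost:.2f} min/doc vs {MANUAL_MIN:.2f} manual")
```

With these invented figures, auditing everything at 85% accuracy costs about 3.1 minutes per document against 3.0 for doing the work by hand. The same accuracy with reliably flagged errors costs about 1 minute. The difference comes from knowing which outputs to trust, not from the accuracy number alone.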

This is why so many AI pilots look impressive in a controlled demo and fail in production. In a demo, you show the successes. In production, the 10-20% failure rate is your operations team's full-time job.

Real automation requires accuracy in the 95-99% range before it achieves genuine operational relief. Below that threshold, you're not automating a process, just creating an expensive review queue.

Why accuracy matters for AI tool selection

The global market for intelligent document automation passed USD 8 billion in 2024, growing at nearly 15% per year. Hundreds of vendors now claim to automate document-heavy workflows. Most of them operate in that dangerous middle ground: accurate enough to impress in a demo, not accurate enough to remove humans from the process.

The reason is architectural. Generic AI systems, like large language models and general-purpose extraction tools, are optimised for breadth, not precision. They perform well across a wide range of inputs. But enterprise document workflows are not wide and varied. They're narrow and specific: your supplier's formats, your product codes, your regulatory requirements, your edge cases.

A generalised model hasn't been trained on your insurance policy templates, your manufacturing order formats, or your broker statement structures. It hasn't learned the business rules your team applies when those documents arrive inconsistently formatted or partially complete. It will extract data reasonably well in the middle of the distribution. It will fail on the edges, and the edges are exactly where human judgment was most needed in the first place.

The cargo cult problem in enterprise AI

There's a well-known story from the Second World War about indigenous communities in the Pacific who observed military operations on their islands. They saw planes land and bring goods. When the military left, they built runways out of bamboo and lit fires to guide planes that never came. They had copied the form of the thing without understanding the substance.

Richard Feynman later used this story to describe cargo cult science: research that has the appearance of rigour without its essence.

It applies directly to how many enterprises are approaching AI. They see compelling demos. They run a pilot. They copy the surface behaviours — an AI tool, a proof of concept, a slide deck with accuracy numbers — without building the underlying precision required to make the system work in production.

The result: months of integration work, a 75% accuracy ceiling, and an operations team now stuck manually correcting AI output instead of doing their actual jobs. The bamboo runway looks right. It just doesn't do what you built it for.

What real automation requires

For a process to be genuinely automatable, three conditions need to be met simultaneously:

Accuracy high enough that the AI can own the outcome, not just assist with it.
Trainability: the model needs to learn from your specific documents, your formats, your exceptions, not just a generic training corpus.
Auditability: in regulated industries, every decision the system makes needs a traceable explanation. "The model said so" is not sufficient for compliance.
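
As a sketch of what a traceable explanation could look like in practice, the record below stores, for each decision, where the value came from and which rule justified it. The class, field names and sample values are hypothetical and do not describe any particular product's schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """One auditable decision made by an automated document workflow."""
    document_id: str
    field_name: str
    extracted_value: str
    source_snippet: str  # the text in the document the value was read from
    rule_applied: str    # the business rule that justified the decision
    model_version: str
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_audit_line(record: DecisionRecord) -> str:
    """Serialise a record as one JSON line for an append-only audit log."""
    return json.dumps(asdict(record), sort_keys=True)


# Hypothetical example values.
record = DecisionRecord(
    document_id="claim-0001",
    field_name="policy_number",
    extracted_value="PN-12345",
    source_snippet="Policy No.: PN-12345",
    rule_applied="policy number must match the PN-NNNNN pattern",
    model_version="example-model-v1",
)
print(to_audit_line(record))
```

The point of the design is that a reviewer can answer "why did the system decide this?" from the record alone, without re-running the model.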

These requirements are incompatible with off-the-shelf, general-purpose AI tools. They require purpose-built systems trained on domain-specific data, with production-grade precision and compliance controls embedded from the ground up.

Where KAPTO operates in automation

KAPTO was built at the other end of this spectrum. Not broad, but precise. Not general, but vertical. Not impressive in demos, but reliable in production.

The platform uses proprietary models trained on customer-specific document types like insurance claims, manufacturing orders, broker statements, and invoice packages. These models are designed specifically for each workflow, with the level of precision those processes require.

The result is accuracy above 98% on complex, high-variability document workflows. KAPTO uses the right model for the task: constrained, auditable, and trained on the documents it will actually encounter in production.

Vertical by design: Insurance, Manufacturing, Healthcare, not one-size-fits-all.
Production-grade from day one: not a pilot tool. Deployed in live operational environments.
Integrated execution: KAPTO doesn't just extract data. It validates, decides, and acts, directly in your ERP, CRM, or core systems.
Governance built in: single-tenant, EU data residency, GDPR and AI Act compliant.

The global intelligent document processing (IDP) market is growing at 16% CAGR through 2029. Most of that spend will produce copilots, not automation. KAPTO is already at the frontier that the rest of the market is still heading toward.

Gabriel De Dominicis

Gabriel is a co-founder of KAPTO and serves as its Managing Director and Head of AI. He leads KAPTO’s vision and technology. A mathematician turned serial entrepreneur with 25+ years in enterprise IT, he focuses on execution-first AI for complex, regulated operations.

Have questions about this topic?
Send me a message on LinkedIn.