Gartner Says 40% of Organizations Will Have AI Observability by 2028

Evaluate platforms closing the gap between visibility and real governance accountability.

Gartner's Prediction

On May 12, 2026, Gartner released a prediction: 40% of organizations deploying AI will implement dedicated AI observability tools by 2028 to monitor model performance, bias, and outputs.

"AI is everywhere, but most organizations are still figuring out how to monitor and trust these systems. That visibility gap makes scaling risky."
Padraig Byrne, VP Analyst at Gartner

Byrne also noted that the shift is being driven by executive concern over risk management in agentic AI, not just infrastructure health, and that "failure to adopt these tools exposes organizations to significant governance risks."

Gartner also published a related prediction in March 2026: by 2028, explainable AI will drive LLM observability investments to 50% of GenAI deployments, up from 15% today. These two forecasts paint a picture of a market waking up fast to the reality that AI systems operating in production without continuous visibility are a liability. That is an accurate picture. The observability market is real, the demand is real, and the urgency Gartner is naming is real.

40% OF AI-DEPLOYING ORGS WILL HAVE OBSERVABILITY TOOLS BY 2028
— GARTNER

15% OF GENAI DEPLOYMENTS HAVE LLM OBSERVABILITY TODAY
— GARTNER

60% OF AI-DEPLOYING ORGS STILL WON'T HAVE IT BY 2028

Observability Is a Monitoring Solution. Governance Is Something Else Entirely.

Here is the problem with celebrating the 40% figure: having an observability tool is not the same thing as having a governed AI program. A dashboard that captures model outputs tells you what the system produced. Governance tells you who was accountable for reviewing it, what they decided, and what evidence exists that they acted. Those are completely different things, and treating the first as a substitute for the second is how organizations end up with every surface feature of a governed program and none of the substance. This is what we like to call WEAK GOVERNANCE.

The Gartner prediction measures tool adoption. It says nothing about whether those tools are connected to named accountability structures, enforced policy, or audit trails that capture human decisions rather than just system events. Organizations that buy observability platforms and treat them as governance are building the most expensive version of false confidence available in the market right now. The 60% who won't have observability by 2028 are dangerous — everyone agrees on that. The 40% who will have it but mistake it for governance are equally dangerous; they're just harder to identify until something goes wrong.

The Truth

Visibility is necessary infrastructure. It's not sufficient governance. The gap between those two things is where most AI incidents actually live — and this week's news proves that point from four completely different directions.

A Research Paper Published in March Said This Was Coming And Laid Out Exactly Why

Agents Don't Need a Better Brain — They Need a World · Danilo Naranjo Emparanza, Ocular Solution

A governance architecture paper proposing the Digital Citizenship Protocol for AI (DCP-AI), arguing that most critical agent failures are infrastructure failures and not model failures.

Danilo Naranjo Emparanza at Ocular Solution published a paper in March 2026 with a deliberately provocative thesis:

“The next major bottleneck in safe AI deployment isn't model intelligence — it's institutional infrastructure.”
Danilo Naranjo Emparanza at Ocular Solution

The paper maps eleven documented categories of autonomous agent failure, including identity spoofing, opaque delegation, uncontrolled resource acquisition, orphaned processes, and inter-agent conflict, and argues that none of them is primarily a model problem.

They're governance problems. They happen because agents operate in environments that have computational power and network access, but no shared protocols for identity, accountability, or policy enforcement.

The paper proposes DCP-AI, a layered protocol architecture spanning cryptographic identity, intent declaration, tamper-evident audit trails, lifecycle governance, and delegated representation. The key insight connecting back to our counterclaim: observability tools give you visibility into what an agent did. The DCP-AI framework gives you the institutional substrate to know who authorized it, under what policy it acted, and what the audit trail shows a human did next.
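To make "tamper-evident audit trail" concrete, here is a minimal sketch of one common way to build one: a hash chain, where each entry's hash covers the previous entry's hash. This is an illustration only, not the DCP-AI design; the paper's actual mechanism is not described here, and the record fields (`actor`, `event`, `detail`) and function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry (an assumption of this sketch)


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash a record together with the previous entry's hash."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


def append(log: list, record: dict) -> None:
    """Append a record, chaining it to the entry before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "hash": entry_hash(prev, record)})


def verify(log: list) -> bool:
    """Recompute every hash; any edited record breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


log = []
# Hypothetical events: what the agent declared, then what a human decided.
append(log, {"actor": "agent-7", "event": "declared_intent", "detail": "export report"})
append(log, {"actor": "j.doe", "event": "human_decision", "detail": "approved"})
intact_before = verify(log)

log[0]["record"]["detail"] = "delete records"  # simulated after-the-fact edit
intact_after = verify(log)
```

The point of the second record is the one the article keeps returning to: the trail holds the human decision alongside the system event, and neither can be quietly rewritten later without `verify` failing.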

Those are not the same capability. Gartner is predicting adoption of the first. The market gap the DCP-AI paper identifies is the absence of the second, and no amount of dashboard adoption closes it.

Credo AI's GAIA Shows Human Governance Is Already Collapsing at Scale

GAIA — Govern AI Assistant · Credo AI

An AI-powered governance assistant designed to speed up intake and review of AI systems — reducing manual review time by up to tenfold for enterprise governance teams.

Credo AI's internal research showed a number that should alarm every governance team: one enterprise added 100 AI use cases for review while having capacity to review only six. GAIA — their Govern AI Assistant — was built specifically because human governance teams can no longer keep up with the volume of AI systems being deployed. It pre-fills governance questionnaires, maps use cases to risk scenarios and regulatory frameworks, and accelerates the review process up to tenfold. This is genuinely useful; governance teams drowning in intake reviews need exactly this kind of tooling.

But look at what this proves about the observability claim. If governance teams can review only six out of every hundred AI systems they're responsible for, adding observability dashboards to each of those systems generates more unreviewed signals, not better governance outcomes. GAIA is solving for the institutional processing gap: the accountability that turns monitoring signals into documented human decisions. That's the right problem to solve.

It also confirms exactly what our counterclaim states: the bottleneck isn't visibility, it's the accountability infrastructure that makes visibility actionable. Credo AI figured that out and built a product for it; the Gartner prediction doesn't measure it at all.
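The capacity arithmetic behind that bottleneck is easy to sketch. The snippet below uses the two numbers Credo AI reported (100 use cases submitted, capacity for six) and, as a loudly labeled assumption, treats the "up to tenfold" acceleration as a simple multiplier on review throughput. That is a best case for illustration, not a measured result, and the function name is hypothetical.

```python
def review_backlog(submitted: int, capacity: int, speedup: float = 1.0) -> dict:
    """Return how many submitted use cases get reviewed and how many do not."""
    reviewed = min(submitted, int(capacity * speedup))
    return {
        "reviewed": reviewed,
        "unreviewed": submitted - reviewed,
        "coverage": reviewed / submitted,
    }


# Figures from the article: 100 use cases, capacity to review six.
baseline = review_backlog(100, 6)
# -> reviewed 6, unreviewed 94, coverage 0.06

# Assumption: the full tenfold speedup applies to throughput (upper bound).
assisted = review_backlog(100, 6, speedup=10.0)
# -> reviewed 60, unreviewed 40, coverage 0.6
```

Under these assumptions, review capacity is the variable that moves coverage. Adding dashboards changes neither `capacity` nor `speedup`; it only adds more signals to the queue.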

LuminosAI Monitors Proves Periodic Reviews Are Already Dead

LuminosAI Monitors · Continuous Legal Risk Testing for GenAI and Agentic Systems

Continuous, automated legal risk testing integrated directly into CI/CD pipelines and production environments — running evaluations in the background without disrupting engineering velocity.

LuminosAI launched Monitors on May 12, 2026 — continuous legal risk testing for generative AI and agentic systems running directly in production. CEO Andrew Burt described the core problem plainly:

“Most legal and compliance issues surface after deployment because model drift, changing prompts, and expanding use cases introduce new risks months after the initial assessment cleared everything.”
Andrew Burt, CEO of LuminosAI

Monitors addresses that by running continuous evaluation against discrimination, privacy, IP, and regulatory risk in real time, generating legally defensible audit trails for every detection.

This is the monitoring layer getting sharper and more continuous, which is exactly what Gartner's prediction is tracking. Connecting it back to the counterclaim: LuminosAI Monitors tells you when a legal risk surfaces in production. The question it doesn't answer, though, is who owns that signal, what their documented response obligation is, and what the audit trail shows they did when the flag fired. LuminosAI generates the evidence. The accountability that ensures someone acts on it, and that their action is captured, has to exist independently. Both are necessary; neither replaces the other.

Organizations that adopt Monitors and skip the accountability structure have better data about their exposure without any more assurance that someone is actually managing it.

The Shai-Hulud Worm Hit AI Guardrail Packages And Observability Dashboards Couldn't Stop It

Mini Shai-Hulud Worm · TanStack, Mistral AI, Guardrails AI, 170+ Packages Compromised

A self-propagating worm compromised over 170 npm and PyPI packages — including Mistral AI's official SDKs and Guardrails AI — by gaining control of legitimate maintainer CI/CD pipelines while generating valid SLSA attestations.

On the same day Gartner published its observability prediction, a supply chain attack called Mini Shai-Hulud was actively compromising AI packages across npm and PyPI. The attack hit Mistral AI's official SDKs, Guardrails AI, TanStack, UiPath, OpenSearch, and over 170 other packages — with combined download counts exceeding 500 million. The threat actor, tracked as TeamPCP, compromised legitimate CI/CD pipelines and published malicious versions with valid SLSA Build Level 3 provenance attestations. Every automated pre-publication check passed. The packages looked fully legitimate according to every standard security signal available.

This is the DCP-AI paper's identity and authentication failure playing out in production, at scale, on the exact same day Gartner said the market needs more observability. Guardrails AI, a package specifically designed to add safety controls to AI systems, was itself compromised. An observability dashboard watching model outputs would not have caught this before it landed in production; the packages carried valid provenance records. What would have caught it is behavioral integrity verification: continuous validation that installed packages behave within expected bounds, connected to a named owner who acts when they don't. That's institutional infrastructure. Gartner's 40% figure doesn't measure it.
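As a rough illustration of a behavioral check tied to a named owner, the sketch below compares the network hosts a package was observed contacting against a declared baseline and routes any deviation to the person responsible. Everything here is hypothetical: the `Baseline` type, the package name, the hosts, and the owner address are invented for the example, and how the observed hosts get collected (sandboxing, egress logs) is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Baseline:
    package: str
    owner: str                # named person or rota accountable for this package
    allowed_hosts: frozenset  # hosts the package is expected to contact


def check_behavior(baseline: Baseline, observed_hosts: Iterable[str]) -> Optional[dict]:
    """Return an alert addressed to the owner if behavior leaves the baseline."""
    unexpected = sorted(set(observed_hosts) - baseline.allowed_hosts)
    if not unexpected:
        return None
    return {
        "package": baseline.package,
        "notify": baseline.owner,
        "unexpected_hosts": unexpected,
    }


# Hypothetical baseline for an illustrative package.
baseline = Baseline(
    package="example-sdk",
    owner="security-oncall@example.com",
    allowed_hosts=frozenset({"api.example.com"}),
)

clean = check_behavior(baseline, ["api.example.com"])
alert = check_behavior(baseline, ["api.example.com", "exfil.invalid"])
```

The check itself is trivial. What makes it governance rather than monitoring is the `notify` field: a deviation is addressed to someone by construction, which is the piece a valid provenance attestation cannot supply.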

Workday Had Audit Outputs.

Workday's AI Hiring Lawsuit Exposed the Governance Failure Nobody Wants to Talk About

Mobley v. Workday — 1.1 billion applications rejected via AI screening, federal class action certified, bias audit that cleared Workday contradicted by plaintiff analysis of the same data.

We covered the Workday lawsuit in detail this week, and it belongs in this piece because it is the best-documented real-world proof of the gap we're describing. Workday had system logs. They had a bias audit. They had outputs documented across 1.1 billion screening decisions. What they didn't have was a governance program: no named person with authority to override automated decisions, no audit trail capturing human responses to what the system produced, no independent validation of the audit methodology they chose. The audit produced a document that cleared them. Plaintiff attorneys ran the same data and found disparities that, by their analysis, had less than a one-in-a-quadrillion chance of being race-neutral.

That is the observability-versus-governance gap in federal court. Workday had visibility into outputs. The absence of institutional accountability — named owners, human response trails, independent methodology — turned that visibility into liability rather than protection. Every organization that adopts an observability platform and stops there is building the same exposure Workday is currently defending in a class action that could include hundreds of millions of people. The Gartner prediction tells you when organizations get the dashboard. It tells you nothing about whether they build what comes after it.

What Actually Closes the Gap Between Visibility and Governance

None of this means Gartner's thesis is wrong. The market is new, the ground is still shifting, and the prediction gets the direction right; it simply leaves some things out. The evidence from this week points to the same answer from four different directions. Observability tools generate signals, and Gartner is right that most organizations don't have enough of them. The institutional layer that makes those signals governable requires three things observability platforms don't provide on their own: named accountability for every signal, audit trails that capture human decisions rather than just system events, and policy enforcement that operates at the agent layer before outputs are produced rather than after.
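Those three requirements can be sketched as a small data model. This is a hedged illustration, not any vendor's schema: the `Signal` and `Decision` types, the `policy_gate` function, and all field names and sample values are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Set


@dataclass
class Decision:
    decided_by: str
    action: str
    rationale: str
    decided_at: datetime


@dataclass
class Signal:
    source: str
    description: str
    owner: str                           # 1. named accountability
    decision: Optional[Decision] = None  # 2. the human decision, not just the event

    def __post_init__(self) -> None:
        if not self.owner.strip():
            raise ValueError("every signal needs a named owner")

    def record_decision(self, decided_by: str, action: str, rationale: str) -> None:
        self.decision = Decision(decided_by, action, rationale, datetime.now(timezone.utc))


def unanswered(signals: List[Signal]) -> List[Signal]:
    """Signals that fired but have no documented human response."""
    return [s for s in signals if s.decision is None]


def policy_gate(action: str, allowed_actions: Set[str]) -> bool:
    """3. Enforcement before the agent acts, rather than review afterwards."""
    return action in allowed_actions


ALLOWED = {"read_report", "draft_summary"}  # hypothetical policy

signals = [
    Signal("bias-monitor", "selection-rate gap above threshold", owner="a.lee"),
    Signal("drift-monitor", "input distribution shift", owner="m.khan"),
]
signals[0].record_decision("a.lee", "paused model", "gap confirmed on re-test")
open_items = unanswered(signals)
```

A dashboard can populate `source` and `description` by itself. The `owner` and `decision` fields, and the gate that runs before an action, are the parts an organization has to supply.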

Credo AI's GAIA is building the accountability processing layer. LuminosAI Monitors is making the legal risk signal continuous and legally defensible. The DCP-AI framework is proposing the protocol substrate that makes agent identity, intent, and behavioral history verifiable across systems. Prescient Security's continuous AI pentester is extending adversarial validation into production rather than treating pre-deployment testing as sufficient. These are not competing with Gartner's observability prediction — they're describing the layers that have to sit underneath it for observability to mean anything from a governance standpoint. The 40% who adopt observability tools by 2028 and also build these layers will have a governed AI program. The ones who stop at the dashboard will have a very expensive false sense of security.

Our Take

AI Governance Take

Gartner's prediction is right about the direction. The 40% figure is the alarming part, not the reassuring one — because it means 60% of organizations deploying AI still won't have basic observability infrastructure two years from now. That's a real problem. The sharper problem is what the 40% do with what they build.

A monitoring dashboard connected to no accountability structure is documentation of exposure, not governance of it. The Workday case proved that in federal court. The Shai-Hulud worm proved it at the supply chain layer. Credo AI's GAIA proved it internally — one enterprise couldn't review 94 out of 100 AI systems it was responsible for, and the answer was building institutional processing capacity, not adding another monitoring tool.

The organizations that close this gap by 2028 are the ones that connect their observability tools to named signal owners, enforced policy, and audit trails that capture what humans decided. Those are the ones that can answer a regulator's questions when asked. The window to build that infrastructure deliberately is right now, before the incident happens. Waiting for the dashboard to tell you something went wrong and then discovering the accountability structure wasn't there is a very expensive way to learn this lesson.

GetAIGovernance
