Arize Built the Signal That Catches Credential Theft in Agent Traces

Eighteen minutes. That is how long a malicious VS Code extension called Nx Console spent in the marketplace before anyone caught it in May 2026. The extension had been compromised through a GitHub token stolen during the Mini Shai-Hulud supply chain campaign that preceded it. In those eighteen minutes it ran on roughly 6,000 developer machines and collected npm tokens, AWS credentials, GitHub tokens, vault tokens, SSH keys, and the configuration file where Claude Code stores its Anthropic API key on disk.

The governance point that makes this more than a security incident bulletin: the extension did not exploit a model vulnerability or escalate privileges through some exotic technique. It ran exactly as the extension host designed it to run. The harness executing your AI agent dispatched what it loaded from disk, because the harness is not a security boundary. Neither is the model. And most AI monitoring programs in production today are watching neither of those surfaces at the level that would surface this pattern before or during an attack.

On June 9, Arize AI's Nancy Chauhan published a monitoring methodology that catches credential-harvesting behavior directly in agent harness traces — specifically by detecting file reads that land outside an agent's defined workspace. This is the first concrete, production-ready operationalization of off-tree read detection for AI agents that GAIG has seen documented with real incident data, implementation methodology, and runnable code.

Time in Marketplace
18 min Window between Nx Console compromise publication and discovery. Enough time to run on 6,000 machines.
Machines Affected
6,000 Developer workstations where the compromised extension executed during the 18-minute window.
Packages — Shai-Hulud Wave 1
172 AI toolchain packages compromised in the Mini Shai-Hulud campaign that supplied the stolen token used in the Nx Console attack.

What Actually Happened

The Nx Console compromise did not arrive in isolation. It was the second wave of a coordinated campaign, and understanding the first wave is necessary to understand why the second one succeeded as quickly as it did.

April 29 — May 12, 2026

Mini Shai-Hulud — Wave 1

Attackers stole OIDC tokens from CI pipelines and used them to republish malware into legitimate packages through the maintainers' own pipelines. Over 400 malicious versions published across 172 packages in npm and PyPI. Each version carried valid SLSA build provenance — cryptographic records confirming the packages were built through legitimate, unmodified development processes. Primary targets included TanStack, UiPath, the OpenSearch JavaScript client, mistralai, and guardrails-ai. Persistence hooks were written into .claude/settings.json and .vscode/tasks.json — files the editor reads on every open, meaning the hooks survived package uninstall.

May 2026 — Direct Consequence

Nx Console Compromise — Wave 2

Attackers used a GitHub token collected during Mini Shai-Hulud to publish a compromised version of Nx Console into the VS Code marketplace. The extension ran on 6,000 developer machines in 18 minutes before discovery. Its credential harvest targeted seven specific files that AI developers are likely to have on their workstations.

The full list of files Nx Console targeted during the attack simulation that Arize documented:

~/.npmrc

~/.aws/credentials

~/.vault-token

~/.claude/anthropic_api_key

~/.claude/mcp_servers.json

~/.ssh/id_rsa

/etc/passwd

Every file on that list sits outside a normal agent workspace. A legitimate agent session working on a software project reads files inside the project directory. It has no business reading ~/.aws/credentials or ~/.ssh/id_rsa. That behavioral distinction — reads inside the workspace versus reads outside it — is the signal the Arize monitoring approach is designed to detect. Arize calls files accessed outside the defined project workspace "off-tree reads."

On June 9, Nancy Chauhan of Arize AI published the detection methodology, complete with a Python implementation of the off-tree read classifier, a simulated Nx Console attack trace showing session.off_tree_read_count = 7 with the full path array attached to the root span, and integration guidance for the Arize AX platform. The implementation is built on OpenInference tracing, which instruments agent harnesses to record every tool call as a structured span — which tool ran, which file it touched, and when.

"Every file system read in the malicious session was outside the project workspace — a clear signal that would not appear in a legitimate coding session."
Nancy Chauhan
Software Engineer –– Arize AI
June 9, 2026

How It Works

Part A: The Architectural Failure

The harness executing your AI agent is a dispatch layer. Its job is to load tool definitions, fire lifecycle hooks on events, and route inputs and outputs between the model and the tools it calls. It verifies none of what it loads for behavioral integrity or origin. Once a malicious dependency or a hooked configuration file exists on the machine, the attacker's code runs the next time the harness acts — because the harness trusts its load path unconditionally.

The Mini Shai-Hulud persistence mechanism illustrates this precisely. The hooks written into .claude/settings.json and .vscode/tasks.json survived package uninstall because the harness reads those configuration files every time the editor opens. Removing the malicious package left the hooks in place. The next editor session executed them. The harness did not distinguish between a hook that had always been there and one that was written by malware the previous afternoon.

Microsoft's documentation of CVE-2026-25592, a Semantic Kernel vulnerability from the same period, makes the same architectural point from a different angle. The vulnerability allowed an attacker to override Semantic Kernel's tool selection mechanism through a malformed tool response — steering what the kernel dispatched without touching the model or the user's instruction. The harness executed the redirect faithfully because it had no mechanism to verify that the tool selection was consistent with the session's intended behavior. A Microsoft security researcher described the category plainly: "The model is not a security boundary and neither is the kernel."

Most AI security programs draw their perimeter around the model — what goes in, what comes out, whether the outputs are appropriate. The Nx Console attack and Mini Shai-Hulud both operated at the harness and dependency layers, which sit underneath the model in the execution stack. The model never saw the credential harvest. The outputs the model produced were completely normal. The attack happened in the plumbing the model runs on, not in the model itself.

Part B: How Off-Tree Read Detection Works

Agent harnesses instrumented with OpenInference tracing already record every tool call as a structured span. The span captures which tool ran, what arguments it received, what files it accessed, and when. A normal agent coding session produces file access spans pointing inside the project workspace. A compromised session produces file access spans pointing to home directory credential files, OS configuration directories, and absolute paths that have no relationship to the project the agent is supposed to be working on.

The off-tree read classifier that Arize built is a function that takes a file path and returns true if it lands outside the defined workspace root — home directories, OS config directories, or any absolute path not under the project root. The session-level monitor aggregates these across all tool calls in a session, attaches the total as session.off_tree_read_count to the root span, and appends the full array of off-tree paths for human investigation. The monitor fires when any session in the evaluation window has an off-tree read count above zero.

The Arize AX simulation of the Nx Console attack produced this trace output on the root span:

Simulated Nx Console Attack — Root Span Output

# Root span attributes after classifier runs
session.off_tree_read_count  = 7
session.off_tree_read_paths  = [
    "~/.npmrc",
    "~/.aws/credentials",
    "~/.vault-token",
    "~/.claude/anthropic_api_key",
    "~/.claude/mcp_servers.json",
    "~/.ssh/id_rsa",
    "/etc/passwd"
]

# Monitor evaluation result
monitor.off_tree_read_alert  = True
monitor.fired_at             = "next evaluation cycle"

A clean session — one where the agent is reading source files, configuration files inside the project directory, and documentation it was pointed at — produces session.off_tree_read_count = 0. The monitor never fires. The detection has no false positive cost for normal agent behavior because normal agent behavior doesn't read home directory credential files.

The implementation uses Arize AX's evaluation framework, but the underlying approach is platform-agnostic. Any agent harness instrumented with OpenInference spans that captures file access paths in its tool call records can run this classifier. The paths are already in the trace data if the instrumentation covers file system tool calls — which it does in any implementation that follows the OpenInference standard for file tool spans.

Four Control Layers With Four Consequences

Control Layer	What the Attack Exposed	What the Detection Addresses
AI Security Security	The harness and dependency layers are attack surfaces that most AI security programs do not instrument. Mini Shai-Hulud spread through 172 packages with valid SLSA provenance. Every standard supply chain check passed. The attack operated below the security perimeter most programs draw.	Off-tree read detection closes a specific gap in the harness layer — file access outside the workspace is a behavioral signal regardless of how the attacker got code running. It complements host controls but doesn't require them to be present to fire.
AI Monitoring Monitoring	The signal was already in the traces. Most monitoring programs were not watching it. The off-tree read pattern maps directly to GAIG's Tool Invocation Drift Pre-Failure Signal — agents making tool calls with file access patterns outside their defined behavioral baseline. The data existed. Nobody built a monitor for it.	The Arize implementation is roughly an afternoon of work on top of existing OpenInference instrumentation. The gap between "this data exists in our traces" and "we have a monitor configured for this pattern" is an accountability question: who owns the monitoring configuration for agent harness behavior? That question needs a named answer.
AI Governance Governance	The supply chain dimension connects directly to GAIG's AI Supply Chain Pre-Failure Signal. SLSA provenance verified build origin, not behavioral integrity. An AI program with named ownership over its dependency stack — where someone is responsible for lockfile review, dependency pinning, and config file version control — would have had the controls that detect the persistence hooks Mini Shai-Hulud wrote into settings files.	Dependency pinning and lockfile diff review sit upstream of any monitoring solution. They are governance controls that require named ownership, not technical tools. The monitoring catches what slips through. The governance controls reduce what needs catching.
AI Compliance Compliance	Trace retention is the compliance implication Arize names explicitly. When a compromise is disclosed weeks or months after it occurred, the ability to query which tool calls ran during the exposure window — and whether any of them touched off-tree paths — determines whether an organization can assess its actual exposure versus estimating it.	Retaining traces long enough to cover the likely gap between infection and disclosure is audit readiness for the AI supply chain layer. EU AI Act Article 72 post-market monitoring obligations will eventually need to address trace retention windows. Most compliance frameworks haven't defined this requirement yet. Setting retention policy now puts programs ahead of the regulatory curve.

The monitoring accountability point deserves more direct treatment than a table cell. The off-tree read signal was present in agent trace data during every session that ran on a machine with Mini Shai-Hulud or Nx Console installed. Whether any program was configured to watch that signal — and whether anyone was named as responsible for acting on an alert if it fired — is the governance question underneath the monitoring question. The Arize implementation closes the technical gap. The accountability gap requires organizational design, not just a deployed classifier.

"Detecting credential theft in agent harness traces requires monitoring the signals your traces are already generating — the gap isn't the data, it's building the monitor."
Nancy Chauhan
Software Engineer — Arize AI
June 9, 2026

What's Still Missing

Three Things the Off-Tree Read Monitor Cannot Catch on Its Own

The extension code path. The Arize monitor catches credential harvesting that runs through an instrumented tool call — a file read that the agent harness dispatches and that OpenInference records as a span. The Nx Console extension's actual credential harvest ran as extension code in the VS Code extension host, not through an agent tool call. Those reads appear in host telemetry, not in agent traces. The monitor as described would not have caught the Nx Console attack directly in the traces — it would catch the pattern if an agent tool executed the same reads, but the extension ran them outside the agent layer. Arize says this plainly in the post: the agent layer monitoring complements host and install-time controls. It was never presented as a standalone solution.
The provenance attestation gap remains open. Mini Shai-Hulud carried valid SLSA build provenance on every malicious package. The supply chain attack succeeded because provenance attestation verifies build origin but says nothing about behavioral integrity — whether what was built does what the package description claims. No monitoring solution closes this gap. The governance controls that address it are dependency pinning, lockfile review, and behavioral baseline monitoring for harness configuration changes. These are organizational practices that require named ownership, not products that can be purchased and deployed.
The major enterprise AI platform announcements from the past two weeks said nothing about this. Microsoft Agent 365, Salesforce Agentforce Operations, and SAP's agentic infrastructure announcements from the same period addressed orchestration, workflow management, and enterprise integration. None of them addressed supply chain dependency governance at the harness layer. Dependency pinning, lockfile review, config file version control, and behavioral baseline monitoring for harness configuration changes remain practices that enterprise AI security programs have not standardized. Both May 2026 attacks proved that gap is exploitable at scale across the AI developer toolchain.

The honeytoken approach Arize recommends in the same post deserves more emphasis than it typically receives. The technique is simple: drop a fake API key in ~/.aws/credentials with a recognizable format but no real access. Configure an alert on any attempt to use it. Nothing in a legitimate workflow ever touches a fake credential. Any use of it — regardless of how the attacker obtained it, whether through an instrumented tool call, through extension code, or through a process that never touched the agent layer — means someone took the credential and attempted to use it. The honeytoken fires across all three attack paths simultaneously, without requiring trace instrumentation on any of them. It is the simplest cross-layer detection mechanism available and it should be deployed alongside the trace-based monitor.

Sources

Arize AI — Nancy Chauhan. "How to Detect Credential Theft in AI Agent Harness Traces." June 9, 2026. Primary source for off-tree read detection methodology, Nx Console simulation trace output, implementation guidance, and honeytoken approach. arize.com/blog/how-to-detect-credential-theft-in-ai-agent-harness-traces
SecurityWeek — "TanStack, Mistral AI, UiPath Hit in Fresh Supply Chain Attack." May 2026. Source for Mini Shai-Hulud wave 1 details: 172 packages, 400+ malicious versions, SLSA provenance, targeted packages including mistralai and guardrails-ai, persistence hook mechanism. securityweek.com
GetAIGovernance.net — "Mini Shai-Hulud Worm Compromises TanStack, Mistral AI, Guardrails AI, and Dozens of Other Packages." May 2026. GAIG's full coverage of the supply chain campaign, CI/CD pipeline compromise mechanism, and governance implications. getaigovernance.net/blog/mini-shai-hulud
Microsoft Security Response Center — CVE-2026-25592. Semantic Kernel vulnerability allowing tool selection override through malformed tool response. Documentation of the "model is not a security boundary, kernel is not a security boundary" principle. msrc.microsoft.com/update-guide/vulnerability/CVE-2026-25592
OpenInference — Tracing specification for AI and LLM applications. Span schema covering tool calls, file access, and agent harness events. github.com/Arize-ai/openinference
SLSA Framework — Supply chain Levels for Software Artifacts. Build provenance specification and limitations in detecting behavioral compromise in legitimately-built packages. slsa.dev/spec/v1.0
GetAIGovernance.net — "Gartner's Four Critical AI Threats Are a Security Problem and a Governance Failure." June 3, 2026. Context on prompt injection and supply chain as Gartner-named structural threat categories relevant to harness-layer attacks. getaigovernance.net/blog/gartner-four-threats

Our Take

AI Monitoring Take

The Arize post documents something that should be uncomfortable for monitoring teams to read: the off-tree read signal was present in agent trace data during every session that ran on a compromised machine. The credential harvest left a record in the instrumentation that was already running. The monitor that would have fired on it is about an afternoon of implementation work. The gap between those two facts is not a technology problem. It is an accountability problem — somebody owns the monitoring configuration for your agent harness behavior, or nobody does, and those two situations produce very different outcomes when an attack runs through your environment.

The implementation priorities that follow from the Arize methodology, in order of effort and coverage:

Deploy honeytokens immediately. A fake AWS credential and a fake npm token in their standard locations, each with an alert on any use attempt. Takes an hour. Catches credential exfiltration regardless of how it occurred. The simplest cross-layer detection available.
Build the off-tree read monitor on top of your existing OpenInference instrumentation. If your agent harness already records file access spans, the classifier is straightforward. The Arize post includes the implementation. If your harness does not record file access spans, that instrumentation gap needs to close before the monitor can run.
Pin your AI dependencies and review lockfile diffs on every update. Dependency pinning is the governance control that sits upstream of everything else. SLSA provenance passed on every Mini Shai-Hulud package. The lockfile diff would have shown new hooks being written to .claude/settings.json. Nobody was reviewing it.
Put your harness configuration files under version control and configure alerts on changes to .claude/settings.json, .vscode/tasks.json, and any other files your agent harness reads on startup. Unexpected modifications to those files are a persistence indicator.
Set a trace retention policy long enough to cover the realistic gap between compromise and disclosure. The Nx Console compromise was discovered in 18 minutes. Most are not. Retaining 90 days of agent traces gives you the query window to assess actual exposure when a supply chain incident is disclosed weeks later.
Name someone as accountable for monitoring configuration for your agent harness behavior. The off-tree read signal was in the traces. The honeytoken detection is trivially deployable. The reason neither exists at most organizations is that nobody was assigned to build them. Assign someone.

GetAIGovernance

Back to All Articles

AI Governance

AI Security

AI Monitoring

AI Compliance

AI ROI

Need help choosing?

AI Governance

AI Monitoring

AI Compliance

AI Security

Research Reports

AI ROI

Market Trend Analysis

Explore All Resources

What the Nx Console Attack Taught Us About AI Agent Monitoring