agent tool risk

The tools your AI agents use are unvetted.

The Model Context Protocol (MCP) became the way AI agents reach the real world: files, databases, payment systems, internal tools. The servers behind those tools are installed by the thousands and run with deep access. Almost none of them have been independently executed and tested before someone trusts them.

the shift

Agents moved from talking to acting.

A year ago an AI assistant answered questions. Now it opens files, edits records, calls APIs, and routes approvals through MCP servers. The model is no longer the risky part. The risky part is the tool it just decided to call.

then

It answered

The worst case was a wrong sentence. Nothing left the chat window.

now

It acts

The agent runs a tool with access to local files, credentials, and shell commands.

soon

Trusted by default

Teams wire dozens of servers into agents faster than anyone can review them.

the gap

Nobody runs the server before trusting it.

Every MCP directory today is a list of self-reported entries. No one executes the server to see if it speaks the protocol, what it does on startup, or what its source reaches for. The artifact an agent will drive ships on trust alone.

directory: name + stars + hope

throne: executed in a microVM

output: two verdicts + evidence

standing: public, dated, disputable
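
To make "speaks the protocol" concrete, here is a minimal sketch of the first check an executor could run: send an MCP `initialize` request and inspect the shape of the reply. It is an illustration, not Throne's actual test. The function names, the `probe` client name, and the chosen protocol version string are assumptions; the field names follow the JSON-RPC 2.0 framing and the `initialize` result fields described in the MCP specification.

```python
import json


def build_initialize_request(request_id=1):
    """Build a JSON-RPC 2.0 'initialize' request as an MCP client would send it."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            # Assumption: one published MCP protocol revision; a real probe
            # would use whichever revision it targets.
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            # Hypothetical client identity used only for this sketch.
            "clientInfo": {"name": "probe", "version": "0.0.1"},
        },
    }


def check_initialize_response(raw_line, request_id=1):
    """Return a list of problems found in the server's reply (empty means OK)."""
    try:
        msg = json.loads(raw_line)
    except json.JSONDecodeError:
        return ["response is not valid JSON"]
    if not isinstance(msg, dict):
        return ["response is not a JSON object"]

    problems = []
    if msg.get("jsonrpc") != "2.0":
        problems.append("jsonrpc field is not '2.0'")
    if msg.get("id") != request_id:
        problems.append("response id does not match request id")
    if "error" in msg:
        problems.append("server returned an error to initialize")

    result = msg.get("result")
    if not isinstance(result, dict):
        problems.append("missing result object")
        return problems

    # Fields an initialize result is expected to carry.
    for key in ("protocolVersion", "capabilities", "serverInfo"):
        if key not in result:
            problems.append(f"result missing '{key}'")
    return problems
```

In practice the request would be written to the server's stdin or HTTP endpoint inside the sandbox, and the first line it returns would be passed to `check_initialize_response`. A server that never answers, or answers with the wrong shape, fails before any tool is ever called.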

the risk

Two failures, both silent until they are not.

breaks

It breaks

A server that fails on a real client fails for every developer who installed it. The agent calls a tool, the call times out or returns garbage, and the first signal anyone gets is an issue thread, not a test.

unsafe

It is unsafe

A path-traversal bug, an unescaped command, or a prompt-injection surface becomes a breach vector the moment an autonomous agent drives it. No human reads each step, and no evidence trail explains what happened.
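
As an illustration of the first bug class, the sketch below shows the containment check a file-serving tool needs before it opens a path supplied by an agent. The function name `resolve_inside` and the directory layout are hypothetical; this is one common way to guard against traversal, not a description of any particular server.

```python
from pathlib import Path


def resolve_inside(root, requested):
    """Resolve 'requested' under 'root', refusing anything that escapes it."""
    root_path = Path(root).resolve()
    # Joining and then resolving collapses '..' segments and follows symlinks.
    # An absolute 'requested' replaces the root entirely, so it is caught
    # by the containment check below as well.
    candidate = (root_path / requested).resolve()
    if candidate != root_path and root_path not in candidate.parents:
        raise ValueError(f"path escapes the allowed root: {requested!r}")
    return candidate
```

A server that skips this check and simply joins the strings will read `../../etc/passwd` as readily as `notes/a.txt`. With a human in the loop that request might be noticed; with an autonomous agent, nobody is looking.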

why now

The evidence question is coming.

Frameworks such as SOX, GxP, and ISO standards, along with internal audit practices, were built around people following approved workflows, not agents moving through systems on their own. As agents take real actions, teams will have to answer three questions: which tools are safe to allow, what proves they behave, and how an agent-driven action gets explained after the fact. Those answers do not exist yet for MCP.

the answer

A trust layer that actually runs the server.

Throne executes every MCP server in a disposable microVM, tests it against client behavior calibrated from real Claude Code and Cursor traffic, and scans its source. Two independent verdicts, never mixed: does it work, and is it safe.
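
One way to picture "two verdicts, never mixed" is a record that keeps the two answers in separate fields and never folds them into a single score. The sketch below is an assumption about how such a record could look, not Throne's schema; the names `Verdict`, `ServerReport`, and `allow` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class ServerReport:
    server: str
    works: Verdict              # does it work against real client behavior?
    safe: Verdict               # is it safe, based on the source scan?
    evidence: Tuple[str, ...]   # pointers to logs or findings behind each verdict


def allow(report):
    """Example consumer policy: allow a server only if both verdicts pass."""
    return report.works is Verdict.PASS and report.safe is Verdict.PASS
```

Keeping the fields separate means a server that works perfectly but reads credentials it should not touch cannot hide behind a good average. The allow rule shown is just one policy a team might apply on top of the report.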