The framework the model can't talk its way out of.
Most agent safety tooling filters what the model says. InnerWarden governs what the agent is allowed to do: every action is screened before it runs, and the enforcement-critical ones are refused in the kernel, where a jailbroken agent cannot argue with an -EPERM. Every row below maps to a real control and a passing test, not a marketing claim.
Ten threats, ten controls, and each one fires.
Agent Goal Hijack
Prompt-injection detection (24 patterns + ATR rules) on every command, argument and tool response; obfuscation flagged.
Tool Misuse
check-command denies dangerous tool calls (download-and-execute, tmp-exec, tampering); the kernel Execution Gate enforces it.
Delegated Trust
Actions beyond the agent's remit route to a human for approval (Telegram / Slack / dashboard) before they run.
Data Exfiltration
Credential-file read detection in the kernel, plus a secret/PII redaction transform that scrubs data before it leaves.
Privilege Escalation
Privilege-provenance detection, and the kernel Execution Gate refuses an unauthorized privileged exec. Non-forgeable.
Inter-Agent / Cross-Boundary
Per-tenant attribution read from the kernel cgroup (unspoofable); paid per-pod containment stops one agent reaching another's data.
Memory Leakage
The redaction transform scrubs the primary leakage vector: secrets and PII crossing into the agent's context.
Operator Control
An independent watchdog and a kernel disarm kill-switch keep the operator in control even if the agent misbehaves.
Cost / Quota Abuse
A per-session circuit breaker halts a runaway tool-call loop instead of billing another iteration.
Rogue Agents
check-command deny plus the kernel Execution Gate block unauthorized exec: reverse shells, miners, destruction.
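The per-session circuit breaker in the Cost / Quota Abuse row can be pictured with a minimal Python sketch. This is an illustration of the general technique, not InnerWarden's implementation: the class name `SessionBreaker`, the exception `CircuitOpen` and the `max_calls` limit are all hypothetical, and a real breaker would likely also consider time windows and repeated identical calls.

```python
from dataclasses import dataclass, field


class CircuitOpen(Exception):
    """Raised when a session has used up its tool-call budget."""


@dataclass
class SessionBreaker:
    # Hypothetical budget; not an InnerWarden default.
    max_calls: int = 50
    # Tool-call count per session id.
    calls: dict = field(default_factory=dict)

    def record(self, session_id: str) -> int:
        """Count one tool call and raise once the session exceeds its budget."""
        count = self.calls.get(session_id, 0) + 1
        self.calls[session_id] = count
        if count > self.max_calls:
            raise CircuitOpen(
                f"session {session_id} exceeded {self.max_calls} tool calls"
            )
        return count
```

The caller would invoke `record()` before dispatching each tool call and stop the loop when `CircuitOpen` is raised, so the runaway iteration is halted before it is billed.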
Free detects. Paid enforces.
The open-source core detects and advises across all ten. An agent that respects the verdict is fully guarded. Active Defence adds kernel enforcement for tool misuse, privilege escalation and rogue agents (ASI02 / 05 / 10), unbypassable even by a jailbroken, non-cooperative agent, plus per-pod containment for cross-boundary attacks (ASI06).
Where we're honest about scope
ASI07 (Memory Leakage) is marked "primary vector": the redaction transform removes obvious secrets and PII from content crossing into the agent's context, which is how agents leak most often. It is not a full persistent-memory scrubber, and we say so rather than claim more than the code does.
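To make "obvious secrets and PII" concrete, here is a rough Python sketch of a pattern-based redaction pass. The three patterns and the `[REDACTED:...]` placeholder format are assumptions chosen for illustration; the actual rule set of the redaction transform is not described on this page.

```python
import re

# Illustrative patterns only (assumed, not InnerWarden's rule set).
PATTERNS = {
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PRIVATE_KEY": re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
}


def redact(text: str) -> str:
    """Replace each match with a labelled placeholder before the text moves on."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```

A pass like this only catches content that matches a known shape as it crosses the boundary, which is why it covers the primary vector rather than everything an agent might have stored.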
Because every verdict maps to its threat class, a deny doesn't just say no. It says which agentic threat it caught. check-command returns the OWASP Agentic IDs (["ASI02","ASI10"]) alongside the decision, so your security team sees each result in the framework they already evaluate against.
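As a sketch of how a security team might consume those verdicts, the Python below tallies denies per threat ID. The JSON shape is assumed for illustration: the field names `decision` and `threats` are hypothetical, and only the ID list format (["ASI02","ASI10"]) comes from the text above.

```python
import json
from collections import Counter

# Hypothetical verdict payloads; "decision" and "threats" are assumed field names.
SAMPLE_VERDICTS = [
    '{"decision": "deny", "threats": ["ASI02", "ASI10"]}',
    '{"decision": "allow", "threats": []}',
    '{"decision": "deny", "threats": ["ASI02"]}',
]


def tally_denies(raw_verdicts):
    """Count how often each threat ID appears on a deny verdict."""
    counts = Counter()
    for raw in raw_verdicts:
        verdict = json.loads(raw)
        if verdict.get("decision") == "deny":
            counts.update(verdict.get("threats", []))
    return counts
```

Grouping by threat ID rather than by raw command keeps reporting aligned with the threat classes in the table above.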