
We Built a Honeypot That Attackers Can't Detect

9 min read

Honeypots are a powerful defensive tool. They let you observe attacker behavior, collect the credentials attackers try, and waste their time on a fake target. But most honeypots have a fatal flaw: experienced attackers can detect them in seconds. Once an attacker detects the honeypot, they disconnect and move on to your real services.

Inner Warden's honeypot is different. It fakes the filesystem, simulates 25+ Linux commands, and answers them from a deterministic shell that costs zero AI tokens for known commands. Attackers interact with what looks and feels like a real compromised server.

How attackers detect honeypots

The first thing a skilled attacker does after landing a shell is check if it is real. These are the most common detection techniques:

Check for Docker/container:
cat /proc/self/cgroup
# If output contains "docker" or "lxc", it's a container

Check init process:
cat /proc/1/cmdline
# Real servers show /sbin/init or systemd
# Honeypots often show nothing or a python process

Check CPU info:
cat /proc/cpuinfo
# Real servers have detailed CPU model, cores, cache
# Fake environments have generic or empty output

Check memory and uptime:
cat /proc/meminfo
uptime
# Honeypots often show unrealistic values or errors

Popular honeypots like Cowrie and Kippo can fail several of these checks, particularly in their default configurations. Their /proc can be empty or return errors, and their command output is static and does not vary between sessions. Many automated attack scripts run these checks as a first step.
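To make the checks above concrete, here is a minimal Python sketch of how a script might score them. It is an illustration only: the function name `honeypot_signals`, the signal labels, and the exact string tests are assumptions for this example (the "model name" test assumes an x86 host), not taken from any real attack tool.

```python
def honeypot_signals(cgroup: str, init_cmdline: str, cpuinfo: str) -> list:
    """Return the names of the checks that look suspicious.

    Each argument is the raw text of the corresponding /proc file:
    /proc/self/cgroup, /proc/1/cmdline and /proc/cpuinfo.
    """
    signals = []

    # Container markers in /proc/self/cgroup (the first check in the text).
    if any(marker in cgroup for marker in ("docker", "lxc")):
        signals.append("container-cgroup")

    # /proc/1/cmdline is NUL-separated; the first field is the program.
    init = init_cmdline.split("\0", 1)[0]
    if not (init.endswith("/init") or init.endswith("systemd")):
        signals.append("unexpected-init")

    # A populated x86 cpuinfo carries a "model name" field.
    if "model name" not in cpuinfo:
        signals.append("sparse-cpuinfo")

    return signals
```

A honeypot passes only if every one of these inputs looks ordinary, which is why a single empty or error-returning file is enough to give it away.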

A filesystem that passes inspection

Inner Warden's honeypot generates realistic responses for every file path attackers commonly probe. The fake filesystem covers:

/proc
  • /proc/cpuinfo - Intel Xeon, 4 cores, cache sizes
  • /proc/meminfo - 8GB RAM, realistic buffers
  • /proc/self/cgroup - clean, no docker markers
  • /proc/1/cmdline - /sbin/init
  • /proc/version - matching kernel version
  • /proc/uptime - plausible uptime (days, not seconds)
/sys and /etc
  • /sys/class/net - eth0 with MAC address
  • /etc/hostname - realistic hostname
  • /etc/os-release - Ubuntu 22.04 LTS
  • /etc/passwd - standard system users
  • /etc/shadow - permission denied (as expected)
  • /etc/ssh/sshd_config - standard config

Every response is generated to match what a real Ubuntu 22.04 server would show. The CPU model, core count, and memory values are consistent across all paths. An attacker who checks /proc/cpuinfo and then runs nproc sees the same core count in both.
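One way to get that consistency is to render every path from a single profile object, so no two outputs can disagree. The sketch below is a hypothetical Python illustration: `SystemProfile`, the render functions, and the field values are assumptions chosen to match the examples in this post, not Inner Warden's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    # Hypothetical values, chosen to match the examples in the text.
    cpu_model: str = "Intel(R) Xeon(R) CPU"
    cores: int = 4
    mem_total_kb: int = 8 * 1024 * 1024  # 8 GB expressed in kB


def render_cpuinfo(profile: SystemProfile) -> str:
    """One block per logical processor, as in a real /proc/cpuinfo."""
    blocks = []
    for index in range(profile.cores):
        blocks.append(
            f"processor\t: {index}\n"
            f"model name\t: {profile.cpu_model}\n"
            f"cpu cores\t: {profile.cores}\n"
        )
    return "\n".join(blocks)


def render_nproc(profile: SystemProfile) -> str:
    """Output of `nproc`, derived from the same core count."""
    return f"{profile.cores}\n"


def render_meminfo(profile: SystemProfile) -> str:
    """First line of /proc/meminfo, derived from the same profile."""
    return f"MemTotal:       {profile.mem_total_kb} kB\n"
```

Because `render_cpuinfo` and `render_nproc` both read `profile.cores`, changing the profile changes both outputs together.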

25+ commands with realistic output

The deterministic shell handles the commands attackers run most often. Each command produces output that matches what a real server would return:

Command            Behavior
whoami, id         Returns session user with realistic UID/GID
uname -a           Linux kernel 5.15, x86_64, matching hostname
ls, ls -la         Directory listings with timestamps and permissions
cat /etc/passwd    Standard system users (root, www-data, nobody...)
ps aux             Realistic process list (sshd, cron, systemd...)
ifconfig, ip addr  Network interfaces with plausible IPs
df -h              Disk usage with realistic partitions
w, last            Login history with fake session data
wget, curl         Simulated download (logged, not executed)
cd, pwd, echo      Working directory tracking with state

Commands the deterministic shell does not recognize are forwarded to the LLM fallback, which generates a plausible response based on the fake system context. The LLM only activates for unknown commands, keeping token usage near zero for typical attack sessions.
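Working-directory tracking is the part of a deterministic shell that needs per-session state. Here is a minimal Python sketch of the idea, assuming a hypothetical `ShellState` class and a small hard-coded set of directories; the real shell is compiled Rust and covers far more commands.

```python
import posixpath


class ShellState:
    """Per-session state for a few stateful commands (illustrative only)."""

    def __init__(self, user="root", home="/root", known_dirs=None):
        self.user = user
        self.home = home
        self.cwd = home
        # Hypothetical directory set; a real fake filesystem would be larger.
        self.known_dirs = known_dirs or {"/", "/root", "/etc", "/tmp", "/var"}

    def run(self, line):
        """Return the command output, or None if the command is unknown."""
        parts = line.split()
        if not parts:
            return ""
        cmd, args = parts[0], parts[1:]
        if cmd == "pwd":
            return self.cwd + "\n"
        if cmd == "whoami":
            return self.user + "\n"
        if cmd == "echo":
            return " ".join(args) + "\n"
        if cmd == "cd":
            target = args[0] if args else self.home
            # Resolve relative paths and ".." against the current directory.
            path = posixpath.normpath(posixpath.join(self.cwd, target))
            if path not in self.known_dirs:
                return f"bash: cd: {target}: No such file or directory\n"
            self.cwd = path
            return ""
        return None  # unknown command: a caller would hand this to a fallback
```

Returning `None` for unrecognized commands keeps the decision about what to do with them outside the state object.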

Deterministic shell: zero tokens, instant response

Most AI-powered honeypots send every command to an LLM. This is expensive and slow. An attacker running ls should not wait 2 seconds for a response. That delay is a detection signal.

Inner Warden's honeypot uses a two-tier architecture:

Tier 1: Deterministic shell (95% of commands)

Known commands are handled by compiled Rust code. Response time is under 1ms. No API calls, no tokens, no latency. The output is pre-computed from the fake system profile.

Tier 2: LLM fallback (5% of commands)

Unknown commands are sent to the configured AI provider with the system context (OS, hostname, users, running processes). The LLM generates a plausible response. This handles custom scripts, obscure flags, and commands the deterministic shell does not cover.

In production testing, a typical attacker session of 30-50 commands uses 0-3 LLM calls. The rest are handled instantly by the deterministic shell.
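The two-tier split can be sketched as a lookup table with a fallback. The Python below is a simplified illustration, not the product's Rust code: `TwoTierShell`, the handler signature, and `stub_llm` (a stand-in for the call to the configured AI provider) are all hypothetical names.

```python
from typing import Callable, Dict, List

Handler = Callable[[List[str]], str]


class TwoTierShell:
    """Tier 1: table lookup. Tier 2: fallback for anything unrecognized."""

    def __init__(self, handlers: Dict[str, Handler], fallback: Callable[[str], str]):
        self.handlers = handlers
        self.fallback = fallback
        self.deterministic_hits = 0
        self.fallback_calls = 0

    def run(self, line: str) -> str:
        parts = line.split()
        if not parts:
            return ""
        handler = self.handlers.get(parts[0])
        if handler is not None:
            # Known command: answered locally, no API call.
            self.deterministic_hits += 1
            return handler(parts[1:])
        # Unknown command: hand the whole line to the fallback.
        self.fallback_calls += 1
        return self.fallback(line)


def stub_llm(line: str) -> str:
    """Placeholder for the AI provider call (assumption for this sketch)."""
    name = line.split()[0]
    return f"bash: {name}: command not found\n"
```

Counting hits per tier makes it easy to check the ratio described above for a recorded session.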

Automatic activation: attack to honeypot in seconds

The honeypot does not run on a separate server. It activates automatically when Inner Warden detects an attack. The flow works like this:

1. detect: Sensor detects SSH brute-force from 203.0.113.42 (8 failed logins in 30 seconds)
2. decide: AI recommends honeypot activation instead of blocking (gather intelligence)
3. redirect: iptables REDIRECT rule sends attacker's SSH traffic to honeypot port
4. interact: Attacker lands in fake shell. Every command is logged. Credentials captured. Session recorded.
5. report: Telegram alert with full session transcript, captured credentials, and attacker source IP

The attacker never reaches your real SSH service. They believe they have compromised the server. You get a complete record of their tools, techniques, and objectives.
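The detect and redirect steps can be sketched in a few lines of Python. This is a hedged illustration: the `BruteForceDetector` class, the `redirect_rule` helper, and the honeypot port 2222 used in the usage note are assumptions. The threshold of 8 failures in 30 seconds comes from the example above, and the rule uses the standard iptables `nat` table with the `REDIRECT` target.

```python
import ipaddress
from collections import defaultdict, deque


class BruteForceDetector:
    """Sliding-window counter of failed logins per source IP."""

    def __init__(self, threshold: int = 8, window_seconds: float = 30.0):
        self.threshold = threshold
        self.window = window_seconds
        self.failures = defaultdict(deque)

    def record_failure(self, ip: str, timestamp: float) -> bool:
        """Record one failed login; return True once the threshold is hit."""
        queue = self.failures[ip]
        queue.append(timestamp)
        # Drop failures that have fallen out of the window.
        while queue and timestamp - queue[0] > self.window:
            queue.popleft()
        return len(queue) >= self.threshold


def redirect_rule(ip: str, honeypot_port: int, ssh_port: int = 22) -> list:
    """Build the iptables argv that redirects one source IP's SSH traffic."""
    # Validate the address so that arbitrary text never reaches the command.
    source = str(ipaddress.IPv4Address(ip))
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-s", source, "-p", "tcp", "--dport", str(ssh_port),
        "-j", "REDIRECT", "--to-ports", str(honeypot_port),
    ]
```

For example, `redirect_rule("203.0.113.42", 2222)` returns an argument list that could be passed to `subprocess.run` without a shell. Applying it requires root, and the matching delete rule (`-D` instead of `-A`) would be needed to undo the redirect.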

What the honeypot captures

  • Credentials - every username/password the attacker tries. Useful for identifying leaked credentials and shared password lists.
  • Commands - full command history with timestamps. Shows what the attacker is looking for (crypto miners, lateral movement, data exfiltration).
  • Download URLs - any wget/curl targets. These reveal the attacker's infrastructure and malware distribution points.
  • Session metadata - source IP, connection duration, SSH client version, key exchange algorithms.
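The four categories above map naturally onto one record per session. The following Python sketch shows one possible shape. The `SessionRecord` class, its field names, and the URL pattern are hypothetical, not Inner Warden's actual log schema.

```python
import json
import re
from dataclasses import asdict, dataclass, field

# Simple pattern for http(s) URLs inside a command line (illustrative).
URL_RE = re.compile(r"https?://[^\s'\"]+")


@dataclass
class SessionRecord:
    source_ip: str
    client_version: str
    credentials: list = field(default_factory=list)
    commands: list = field(default_factory=list)
    download_urls: list = field(default_factory=list)

    def add_login(self, username: str, password: str) -> None:
        """Record one username/password attempt."""
        self.credentials.append({"username": username, "password": password})

    def add_command(self, timestamp: float, line: str) -> None:
        """Record a command; pull URLs out of wget/curl invocations."""
        self.commands.append({"ts": timestamp, "line": line})
        parts = line.split()
        if parts and parts[0] in ("wget", "curl"):
            self.download_urls.extend(URL_RE.findall(line))

    def to_json(self) -> str:
        """Serialize the whole session for storage or alerting."""
        return json.dumps(asdict(self), sort_keys=True)
```

Only the URL is recorded for wget and curl. Nothing is fetched, which matches the "logged, not executed" behavior described earlier.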

Set it up

Install Inner Warden and the honeypot is ready to activate:

Install
curl -fsSL https://www.innerwarden.com/install | sudo bash

The honeypot activates automatically when the AI decides intelligence gathering is more valuable than immediate blocking. No manual configuration required.
