
Collaborative Defense: How Game Theory Protects a Security Mesh Network


Servers defend alone

Every server on the internet fights the same attackers independently. An SSH brute-force hits your production box. You block the IP. Five minutes later, the same IP hits your neighbor's server. They have no idea you already identified this attacker. They start from zero.

This is the default state of infrastructure security. Each server is an island. There is no shared intelligence, no collective memory. The attacker gets unlimited attempts across unlimited targets because nobody talks to each other. Centralized threat feeds like AbuseIPDB help, but they have lag. By the time a report propagates, the attacker has already moved on to the next thousand IPs.

What if the servers could talk directly? What if blocking an attacker on one node automatically protected every other node in the network?

The mesh: attack one, block everywhere

A security mesh network connects Inner Warden instances directly. When one node detects and blocks an attacker, it broadcasts a threat signal to every peer. The peers evaluate the signal and, if trusted, apply a preemptive block before the attacker even reaches them.

The result: an attacker who targets any single server in the mesh is effectively targeting all of them simultaneously. The cost of attack multiplies by the number of nodes, while the cost of defense stays constant. This is the network effect applied to security.

Node A detects SSH brute-force from 198.51.100.23
  -> broadcasts ThreatSignal to peers B, C, D
  -> Node B: confirms (saw 3 failed logins from same IP) -> blocks
  -> Node C: no local evidence, adds to watchlist
  -> Node D: confirms (port scan from same IP) -> blocks

But this model has a dangerous failure mode. If any node in the mesh is compromised or malicious, it could broadcast fake threat signals and cause the entire network to block legitimate IPs. The cure becomes worse than the disease.

Why trust is the hard problem

Imagine a mesh of 50 servers. One gets compromised. The attacker now controls a node inside the trust network. They broadcast signals saying 8.8.8.8 is an attacker. Every node blocks Google DNS. Or they flood the network with thousands of signals, causing every node to choke on firewall rules.

Without a trust model, a mesh network is a weapon waiting to be turned against its operators. The question is not whether to share intelligence. It is how to share it without creating a new attack surface. Game theory gives us the answer.

Tit-for-tat trust scoring

The mesh uses a trust model inspired by the iterated Prisoner's Dilemma. Robert Axelrod's tournaments proved that the tit-for-tat strategy -- cooperate first, then mirror the other player's last move -- dominates over time. We adapt this for threat intelligence sharing.

Every peer starts with a trust score of 0.1. This is deliberately low. A new peer must prove itself before its signals carry weight. Trust changes based on whether the receiving node can locally confirm the signal:

- Confirmed signal: the receiving node has local evidence (failed logins, port scan, web probes) from the same IP. Trust +0.05.
- Contradicted signal: the IP has only legitimate traffic locally, or the signal targets RFC1918 / known-good IPs. Trust -0.15.

The asymmetry is intentional. Gaining trust is slow (+0.05 per confirmation). Losing trust is fast (-0.15 per contradiction). This means a compromised node that starts sending bad signals drops to zero trust after just a few false reports, while a legitimate node that consistently sends accurate signals builds trust over dozens of interactions. The penalty is 3x the reward because a single false block is more damaging than a single good block is helpful.

// Trust progression for an honest peer
Signal  1: 0.10 -> 0.15  (confirmed)
Signal  5: 0.30 -> 0.35  (confirmed)
Signal 10: 0.55 -> 0.60  (confirmed)
Signal 18: 0.95 -> 1.00  (max trust)

// Trust collapse for a compromised peer
Signal  1: 0.50 -> 0.35  (contradicted)
Signal  2: 0.35 -> 0.20  (contradicted)
Signal  3: 0.20 -> 0.05  (contradicted)
Signal  4: 0.05 -> 0.00  (quarantined)
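
The progressions above follow directly from the asymmetric update rule. A minimal sketch (constants taken from the text; `update_trust` is an illustrative name, not from the actual codebase):

```python
CONFIRM_REWARD = 0.05      # slow gain per locally confirmed signal
CONTRADICT_PENALTY = 0.15  # fast loss per contradiction (3x the reward)

def update_trust(trust: float, confirmed: bool) -> float:
    """Tit-for-tat style trust update, clamped to [0.0, 1.0]."""
    delta = CONFIRM_REWARD if confirmed else -CONTRADICT_PENALTY
    return max(0.0, min(1.0, trust + delta))

# Honest peer: 18 confirmations take trust from 0.10 to max.
trust = 0.10
for _ in range(18):
    trust = update_trust(trust, confirmed=True)

# Compromised peer: 4 contradictions collapse 0.50 to zero.
trust = 0.50
for _ in range(4):
    trust = update_trust(trust, confirmed=False)
```

The clamp at zero means a collapsed peer rebuilds at the same +0.05 pace as anyone else; it never goes negative.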

Ed25519 signed signals

TLS encrypts the transport, but it does not prove who created a message. A compromised relay could inject signals that appear to come from a trusted peer. The mesh solves this with message-level cryptographic signatures.

Every node generates an Ed25519 keypair on first run. The public key is exchanged during peer registration. Every threat signal is signed with the sender's private key. Receiving nodes verify the signature before processing. No valid signature, no processing. Period.

ThreatSignal struct
ThreatSignal {
    node_id:        "warden-a1b2c3",
    ip:             "198.51.100.23",
    threat_type:    "ssh_bruteforce",
    confidence:     0.92,
    evidence_hash:  "sha256:3f2a...",  // hash of local evidence
    timestamp:      1742678400,
    signature:      "ed25519:9c4f...", // signs all fields above
}

The evidence_hash is a SHA-256 hash of the raw log lines that triggered detection. The receiving node cannot see the logs (privacy), but if challenged, the sender can reveal the pre-image to prove the evidence existed. This creates accountability without requiring nodes to share raw data.
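
A sketch of how such a hash could be produced (standard `hashlib` only; the exact canonicalization Inner Warden uses is not specified in the text, so treat the newline-joining scheme as an assumption):

```python
import hashlib

def evidence_hash(log_lines: list[str]) -> str:
    """Hash the raw log lines behind a detection. The pre-image stays
    private; the sender reveals it only if challenged."""
    preimage = "\n".join(log_lines).encode("utf-8")
    return "sha256:" + hashlib.sha256(preimage).hexdigest()

logs = [
    "sshd[1042]: Failed password for root from 198.51.100.23 port 52110",
    "sshd[1042]: Failed password for admin from 198.51.100.23 port 52114",
]
h = evidence_hash(logs)  # deterministic: same logs always yield the same hash
```

Anyone holding the revealed log lines can recompute the digest and compare it to the `evidence_hash` field in the signed signal.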

The staging pool: never block immediately

Even with trust scoring and signed signals, no mesh signal ever triggers an immediate permanent block. Every signal enters a staging pool where it is evaluated against the sender's trust score and local evidence before any action is taken.

- Effective score < 0.3: discard. Sender trust too low or confidence too weak.
- Effective score 0.3 - 0.6: watchlist. IP is monitored for local suspicious activity.
- Effective score 0.6 - 0.8: block 1h. Short block, auto-removed if no local confirmation.
- Effective score > 0.8: block 24h. Strong signal from a highly trusted peer.

The effective score is peer_trust * signal_confidence. A fully trusted peer (1.0) sending a high-confidence signal (0.92) yields 0.92, which means a 24-hour block. A new peer (0.1) sending the same signal yields 0.092, which means discard. Trust must be earned.
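
That arithmetic and the threshold table reduce to a few lines (thresholds from above; the handling of exact boundary values is my assumption, since the text gives ranges, not inclusivity):

```python
def mesh_action(peer_trust: float, signal_confidence: float) -> str:
    """Map a signal to a staging-pool action via its effective score."""
    score = peer_trust * signal_confidence
    if score < 0.3:
        return "discard"
    if score < 0.6:
        return "watchlist"
    if score <= 0.8:
        return "block_1h"
    return "block_24h"

mesh_action(1.0, 0.92)  # fully trusted peer -> "block_24h"
mesh_action(0.1, 0.92)  # new peer (0.092)   -> "discard"
```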

Rate limiting and circuit breaker

Trust scoring handles inaccurate peers. Rate limiting handles aggressive ones. Each peer is allowed a maximum of 50 signals per hour. If a peer exceeds this rate, it is automatically quarantined: all its signals are dropped until the rate returns to normal.

This prevents a compromised node from flooding the network. Even if the attacker controls a high-trust peer, they cannot send more than 50 block requests per hour. At that rate, the trust score will collapse long before they can cause widespread damage.
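
One way to enforce this is a rolling one-hour window per peer (a sketch; `PeerRateLimiter` is an illustrative name, and the caller supplies the clock, e.g. `time.time()`):

```python
from collections import deque

MAX_SIGNALS_PER_HOUR = 50

class PeerRateLimiter:
    """Drop a peer's signals while it exceeds 50 in any rolling hour."""

    def __init__(self):
        self.timestamps = deque()  # arrival times of accepted signals

    def allow(self, now: float) -> bool:
        # Evict arrivals older than one hour from the window.
        while self.timestamps and now - self.timestamps[0] >= 3600:
            self.timestamps.popleft()
        if len(self.timestamps) >= MAX_SIGNALS_PER_HOUR:
            return False  # quarantined: signal dropped
        self.timestamps.append(now)
        return True
```

Because rejected signals are not recorded, the quarantine lifts on its own once the accepted arrivals age out of the window, matching "until the rate returns to normal."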

Hard safety rules
- Max 50 signals/hour per peer (quarantine on excess)
- RFC1918 addresses always rejected (10.x, 172.16-31.x, 192.168.x)
- Loopback (127.0.0.0/8) and link-local always rejected
- Signals targeting the node's own IP always rejected
- Duplicate signals for same IP within TTL window are deduplicated

The RFC1918 rejection is a hard rule that cannot be overridden by trust score. No mesh signal can ever block a private IP address. This prevents the most obvious attack vector: a compromised peer trying to block internal infrastructure.
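
The address-based rules map directly onto the standard `ipaddress` module (a sketch; the own-IP and deduplication rules are omitted):

```python
import ipaddress

# Ranges that can never be blocked by a mesh signal, at any trust score.
HARD_REJECT_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),      # RFC1918
    ipaddress.ip_network("172.16.0.0/12"),   # RFC1918
    ipaddress.ip_network("192.168.0.0/16"),  # RFC1918
    ipaddress.ip_network("127.0.0.0/8"),     # loopback
    ipaddress.ip_network("169.254.0.0/16"),  # link-local
]

def hard_reject(ip_str: str) -> bool:
    """True if the signal must be rejected before any trust math runs."""
    try:
        ip = ipaddress.ip_address(ip_str)
    except ValueError:
        return True  # unparseable targets are rejected too
    return any(ip in net for net in HARD_REJECT_NETS)

hard_reject("192.168.1.10")   # True: RFC1918, always rejected
hard_reject("198.51.100.23")  # False: routable, proceeds to the staging pool
```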

TTL auto-reversal

Every block that originates from a mesh signal has a TTL. No exceptions. When the TTL expires, the block is automatically removed unless the node has gathered its own local evidence to justify keeping it.

This is the final safety net. Even if every other safeguard fails -- trust scoring, rate limiting, staging pool -- the damage is time-bounded. A false block lasts at most 24 hours before it auto-reverts. The system is self-healing.

Mesh block lifecycle:
  1. Signal received -> staging pool
  2. Effective score calculated -> action determined
  3. Block applied with TTL (1h or 24h)
  4. During TTL: node monitors for local evidence
  5a. Local evidence found -> block promoted to local block (standard TTL)
  5b. No local evidence -> block auto-removed at TTL expiry
  6. Peer trust updated based on outcome

The distinction between mesh blocks and local blocks is important. A locally detected attacker follows the normal block lifecycle with standard TTLs. A mesh-originated block is always shorter-lived and always reverts unless locally confirmed. The mesh accelerates detection. It never replaces local judgment.
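
The lifecycle can be modeled as a small state object (illustrative names, explicit clock for clarity; not the actual implementation):

```python
class MeshBlock:
    """A mesh-originated block: time-bounded unless locally confirmed."""

    def __init__(self, ip: str, ttl_seconds: int, now: float):
        self.ip = ip
        self.expires_at = now + ttl_seconds
        self.locally_confirmed = False

    def confirm_locally(self) -> None:
        # Step 5a: local evidence found -> promote to a standard local block.
        self.locally_confirmed = True

    def still_active(self, now: float) -> bool:
        # Step 5b: without local evidence, the block lapses at TTL expiry.
        return self.locally_confirmed or now < self.expires_at

block = MeshBlock("198.51.100.23", ttl_seconds=3600, now=0.0)
block.still_active(now=1800.0)  # True: inside the 1h TTL
block.still_active(now=3600.0)  # False: auto-reverted, no local evidence
```

A real implementation would move a promoted block onto the normal local lifecycle rather than keep it in the mesh table.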

Enable the mesh

The mesh is opt-in. You choose which peers to trust and when to join. Two commands to get started:

Enable mesh networking
innerwarden mesh enable

This generates your Ed25519 keypair and starts listening for peer connections. Your node ID and public key are printed to the terminal. Share the public key with peers you want to connect to.

Add a peer
innerwarden mesh add-peer \
  --endpoint "warden-b.example.com:9443" \
  --pubkey "ed25519:abc123..."

The peer starts at trust score 0.1. As it sends signals that your node can locally confirm, trust grows automatically. No manual trust escalation is needed. The game theory handles it.

What to do next