The pitch versus the install
Every security tool's pitch deck has a slide that says "blocks attacks automatically". Every operator who has been around long enough has been locked out of their own server by a tool that blocked attacks automatically. These two facts are related, and a tool either learns from the second one or gets uninstalled.
This is an argument for default dry-run. The detector should ship enabled. The blocker should ship disabled. The operator should turn the blocker on after they have read a week of what the detector flags, not before.
Failure mode 1: blocking your own admin SSH
You travel. Your laptop's IP changes. You SSH in from a coffee shop and authentication fails a few times because your SSH agent offers the wrong key first. The tool sees three failed auths and silently adds your new IP to the deny list. You spend twenty minutes wondering why the host is unreachable, then go to the cloud console to remove the block.
Multiply that by every operator who installs the tool. A meaningful percentage will hit it within the first month, and a meaningful percentage of those will uninstall the tool rather than learn its block thresholds. That is a tool that loses its own users.
Failure mode 2: false-blocking Cloudflare
You sit your origin behind Cloudflare. Now every request to your host comes from a Cloudflare IP. A small subset of those requests will look anomalous to any heuristic detector, especially if the WAF rejected them at the edge and the origin never sees the body. A default-deny tool that blocks based on source IP will eventually drop a Cloudflare range. Now part of your real traffic gets a 502, you have no idea why, and your monitoring will not tell you because the request never reached your app.
You should key detections on the CF-Connecting-IP header instead of the connection's source address, of course. Most tools default to this. Some do not. The ones that do not are the ones that need dry-run more than anyone.
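A minimal sketch of that attribution step, in Python. The function name, the `TRUSTED_PROXY_RANGES` list, and the choice to return `None` are assumptions made for illustration, and the two ranges shown are examples only; a real deployment would load the current list that Cloudflare publishes. The header is only believed when the connection itself comes from a trusted proxy range, because otherwise any client can forge it.

```python
import ipaddress
from typing import Optional

# Example ranges only (assumption): load Cloudflare's current published list in practice.
TRUSTED_PROXY_RANGES = [
    ipaddress.ip_network("173.245.48.0/20"),
    ipaddress.ip_network("103.21.244.0/22"),
]


def effective_client_ip(peer_ip: str, headers: dict[str, str]) -> Optional[str]:
    """Return the address a detector should attribute a request to.

    Returns None when the request came through a trusted proxy but carries
    no usable client address, meaning "do not block on source IP here".
    """
    peer = ipaddress.ip_address(peer_ip)
    from_proxy = any(peer in net for net in TRUSTED_PROXY_RANGES)
    if not from_proxy:
        # Direct connection: the header is client-controlled, so ignore it.
        return peer_ip

    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    claimed = lowered.get("cf-connecting-ip", "").strip()
    try:
        return str(ipaddress.ip_address(claimed))
    except ValueError:
        # Missing or malformed header: never attribute to the proxy itself.
        return None
```

Returning `None` rather than the proxy address is the design choice that matters: it makes "block a Cloudflare range" impossible to reach by accident.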
Failure mode 3: sudo lockouts
You install a tool that watches sudo and blocks the user on "suspicious" patterns. The vendor's idea of suspicious includes "sudo without a TTY" because that is what some malware does. Your CI pipeline runs Ansible, which invokes sudo without a TTY. The tool blocks your CI user. The first deploy after install fails. The second deploy after install gets the tool uninstalled.
The detection was not wrong. Sudo without a TTY genuinely is a high-signal event for malware. The blocking was wrong, because the tool did not know your environment yet.
Failure mode 4: the SOC has no idea what got blocked
An auto-blocker that fires hundreds of low-confidence rules a day produces a deny list nobody understands. After a month no human can answer "why is this IP blocked". The list grows, real traffic starts to disappear, and the tool acquires a reputation for being unreliable rather than helpful. You cannot triage what you cannot read.
Dry-run avoids this entirely. Every action that would have been taken is recorded with the rule that fired, the confidence, and the input. The operator reads the log and decides what to promote.
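To make that concrete, here is a small Python sketch of an enforcer that defaults to dry-run. The `Detection` fields, the `Enforcer` class, and the `block_ip` action name are all hypothetical, chosen for illustration and not taken from any particular product. The point is that the log record is written the same way in both modes, and only the `enforced` flag and the deny list differ.

```python
import json
import time
from dataclasses import dataclass, asdict


@dataclass
class Detection:
    rule: str          # which rule fired
    confidence: float  # 0.0 to 1.0
    source_ip: str     # the input the rule attributed the event to
    evidence: str      # human-readable reason the rule fired


class Enforcer:
    def __init__(self, dry_run: bool = True):
        # Dry-run is the default; blocking needs an explicit opt-in.
        self.dry_run = dry_run
        self.deny_list: set[str] = set()
        self.audit_log: list[str] = []

    def handle(self, det: Detection) -> dict:
        record = {
            "ts": time.time(),
            "action": "block_ip",
            "enforced": not self.dry_run,
            **asdict(det),
        }
        if not self.dry_run:
            self.deny_list.add(det.source_ip)
        # Logged in both modes, so every deny-list entry has an explanation.
        self.audit_log.append(json.dumps(record))
        return record
```

With this shape, "why is this IP blocked" is a search over the audit log instead of a guess.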
The dry-run-first model
The model that earns trust looks like this. Every detector runs. Every rule fires. Every event is logged with what the tool would have done if blocking were on. After a week the operator reviews the would-have list and the false positive rate. They promote the high-confidence rules to actually block, leave the medium ones in alert mode, and tune the noisy ones.
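The weekly review can be sketched as a small function over the would-have list. Everything here is an assumption made for illustration: the event shape (a `rule` name plus a `false_positive` flag set by the operator during review), the threshold values, and the use of observed false positive rate as the promotion criterion.

```python
from collections import defaultdict


def review(
    would_have: list[dict],
    promote_below: float = 0.01,  # assumed threshold: at most 1% false positives
    tune_above: float = 0.20,     # assumed threshold: 20% or more is noisy
    min_events: int = 20,         # assumed minimum sample before judging a rule
) -> dict[str, str]:
    """Map each rule to 'block', 'alert', or 'tune' from a week of dry-run events."""
    totals: dict[str, int] = defaultdict(int)
    false_pos: dict[str, int] = defaultdict(int)
    for event in would_have:
        totals[event["rule"]] += 1
        if event["false_positive"]:
            false_pos[event["rule"]] += 1

    decisions: dict[str, str] = {}
    for rule, count in totals.items():
        fp_rate = false_pos[rule] / count
        if count < min_events:
            decisions[rule] = "alert"   # too few events to justify blocking
        elif fp_rate <= promote_below:
            decisions[rule] = "block"   # promote to enforcement
        elif fp_rate >= tune_above:
            decisions[rule] = "tune"    # noisy: fix the rule before trusting it
        else:
            decisions[rule] = "alert"   # keep a human in the loop
    return decisions
```

A rule that fired only a handful of times stays in alert mode no matter how clean it looks, since a week may not have exercised it against your real traffic.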
An AI gate sits between detection and action for the middle-confidence rules. The gate looks at the host baseline, the prior history of the source, and the explanation of why the rule fired, then makes a per-incident call. That gate is the difference between hard rules that lock you out and a system that decides like a human would.
"But the attacker is faster than the operator"
Yes. That is the strongest argument for default-deny, and it is correct on the high-confidence end. A successful credential dump followed by an attempt to write to /etc/cron.d should be auto-blocked. There is no scenario where a real admin does that, and the block is reversible.
The fight is over the long tail of medium-confidence rules, which is most rules in any real product. Defaulting those to block is what causes the failure modes above. Defaulting them to alert and gating them through an AI or a human is what gets adoption.
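As a sketch of that split, the routing can be written as one function. The tier boundaries (`high`, `low`) are assumed values, and `gate` is a hypothetical callable standing in for whatever makes the per-incident call, whether an AI model or a human reviewer.

```python
from typing import Callable


def route(
    confidence: float,
    gate: Callable[[], bool],
    high: float = 0.9,  # assumed boundary for auto-block
    low: float = 0.5,   # assumed boundary for gated review
) -> str:
    """Decide what to do with one incident based on its rule confidence."""
    if confidence >= high:
        # High-confidence end: speed matters more than review.
        return "block"
    if confidence >= low:
        # Long tail: alert by default, block only if the gate says so.
        return "block" if gate() else "alert"
    # Low confidence: record it and move on.
    return "alert"
```

Note that the gate is never consulted for high-confidence incidents, so a slow or unavailable reviewer cannot delay the blocks that need to be fast.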
How Inner Warden ships
Inner Warden ships in observation mode. The sensor runs full eBPF, the agent runs full triage, every detection produces an incident, every would-have block is recorded with explanation and confidence. Auto-block requires an explicit opt-in flag. The honest reason: a tool you cannot uninstall after a bad week is a tool you cannot trust on day one. We would rather earn that trust than presume it.
Related reading: "False positives are a feature problem" and "Caldera validation, honest results".