Building an Autoencoder That Learns What Normal Looks Like on Your Server
Rule-based detection is excellent at catching known threats. If an attacker runs a reverse shell, fires up a crypto miner, or brute-forces SSH, deterministic rules will catch it every time. But rules have a blind spot: they can only detect what someone has already described. The novel attack, the creative technique, the thing nobody has written a signature for yet, slips through.
We tried solving this with a neural classifier. Version 10 was a supervised model trained on labeled attack data. It worked well on the training set and terribly in production. Too many false positives. Every unusual-but-legitimate workload triggered alerts. The model had learned what attacks look like, but it had no understanding of what normal looks like on your specific server.
The fix was to flip the problem. Instead of teaching a model what attacks look like, teach it what normal looks like. Then anything that does not reconstruct well is anomalous. That is the autoencoder approach, and it is what Inner Warden ships today.
Why an autoencoder, not a classifier
A classifier needs labeled examples of both normal and malicious behavior. This creates two problems. First, you need a comprehensive dataset of attacks, which is always incomplete. Second, the classifier learns to distinguish between the specific attacks in the training set and everything else. A new attack technique that was not in the training data may look more like "everything else" than like the known attacks.
An autoencoder only needs normal data. It learns to compress normal behavior into a small latent space and then reconstruct it. When you feed it something it has never seen before, the reconstruction is poor. The reconstruction error becomes the anomaly score. High error means the input does not look like anything the model learned during training.
Classifier (V10, abandoned):
Input: event features → "attack" or "benign"
Problem: needs labeled attack data
Problem: novel attacks classified as "benign"
Problem: high false positive rate on unusual workloads
Autoencoder (current):
Input: event features → compress → reconstruct → compare
Only needs normal data (your server's own traffic)
Novel attacks = high reconstruction error = flagged
Unusual-but-normal workloads = low error = ignored

Network architecture: 48 to 8 and back
The autoencoder is a bottleneck network with four layers. The input is a 48-dimensional feature vector. The encoder compresses it to 16 dimensions, then to 8. The decoder expands it back to 16, then to 48. The bottleneck of 8 neurons forces the network to learn only the most important patterns in normal behavior.
Input (48) ──→ Encode (16) ──→ Bottleneck (8) ──→ Decode (16) ──→ Output (48)
│ │
└────────────── reconstruction error = ║input - output║² ───────────┘
Activation: ReLU (hidden layers), Sigmoid (output)
Loss: Mean Squared Error between input and output
Weights: ~7.5KB total (48×16 + 16×8 + 8×16 + 16×48 + biases)
Written in pure Rust. No PyTorch. No TensorFlow. No ONNX runtime.
Inference: microseconds per event window.

Why pure Rust? Because Inner Warden runs on production servers where installing Python, CUDA, or a 200MB ML runtime is not acceptable. The entire model, including weights and inference code, compiles into the agent binary. No external dependencies. No GPU required. The ~7.5KB weight file loads in microseconds, and inference is pure matrix multiplication on the CPU.
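To make the forward pass concrete, here is a minimal Rust sketch of the same 48-16-8-16-48 shape. The `Layer` type, the placeholder weight initialization, and the function names are illustrative assumptions, not the agent's actual implementation:

```rust
/// One dense layer; weights are stored row-major as [output][input].
/// Placeholder values stand in for trained weights.
struct Layer {
    weights: Vec<Vec<f32>>,
    biases: Vec<f32>,
}

impl Layer {
    fn new(inputs: usize, outputs: usize) -> Self {
        Layer {
            weights: vec![vec![0.01; inputs]; outputs],
            biases: vec![0.0; outputs],
        }
    }

    /// ReLU on hidden layers, sigmoid on the output layer.
    fn forward(&self, x: &[f32], relu: bool) -> Vec<f32> {
        self.weights
            .iter()
            .zip(&self.biases)
            .map(|(row, b)| {
                let z: f32 = row.iter().zip(x).map(|(w, xi)| w * xi).sum::<f32>() + b;
                if relu { z.max(0.0) } else { 1.0 / (1.0 + (-z).exp()) }
            })
            .collect()
    }
}

/// Anomaly score: mean squared error between input and reconstruction.
fn reconstruction_error(input: &[f32], output: &[f32]) -> f32 {
    input
        .iter()
        .zip(output)
        .map(|(a, b)| (a - b) * (a - b))
        .sum::<f32>()
        / input.len() as f32
}

fn main() {
    // The 48 -> 16 -> 8 -> 16 -> 48 bottleneck.
    let layers = [
        Layer::new(48, 16),
        Layer::new(16, 8),
        Layer::new(8, 16),
        Layer::new(16, 48),
    ];
    let input = vec![0.1f32; 48];
    let mut x = input.clone();
    for (i, layer) in layers.iter().enumerate() {
        x = layer.forward(&x, i < layers.len() - 1); // sigmoid only on the last layer
    }
    println!("reconstruction error: {:.4}", reconstruction_error(&input, &x));
}
```

A trained model would load its learned weights instead of the placeholders, but the shape of the computation is identical: four matrix-vector products and a comparison against the input.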
48 features from a sliding window
The input vector is built from a sliding window of the last 20 events. Each event has a kind (SSH login, process exec, network connect, file write, etc.), and the feature extractor produces two groups of 24 values each.
Features [0..24]: Event kind frequency
Count of each event kind in the 20-event window,
normalized to [0, 1].
Index 0: ssh_login (e.g., 3/20 = 0.15)
Index 1: ssh_failed (e.g., 8/20 = 0.40)
Index 2: process_exec (e.g., 5/20 = 0.25)
Index 3: network_connect (e.g., 2/20 = 0.10)
Index 4: file_write (e.g., 1/20 = 0.05)
...24 event kinds total
Features [24..48]: Bigram transition frequency
Count of attack-indicative two-event sequences.
Index 24: ssh_failed → ssh_success (brute-force success)
Index 25: exec → connect (post-exploit C2)
Index 26: connect → file_write (download payload)
Index 27: ssh_success → exec (lateral movement)
Index 28: file_write → exec (drop and run)
Index 29: exec → exec (rapid tool chain)
...24 bigrams total

The bigram features are the key innovation. A single ssh_failed event is normal. Eight ssh_failed events followed by one ssh_success is a brute-force that succeeded. The bigram ssh_failed → ssh_success captures this pattern as a single feature. Similarly, exec followed by connect is the signature of post-exploitation: run a tool, then phone home.
On a normal server, most bigram features stay near zero. The autoencoder learns this. When a bigram like exec to connect suddenly appears, the reconstruction error spikes because the model has never seen that pattern during training.
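A sketch of what such an extractor might look like in Rust. The event-kind indices, the two-entry bigram table, and the normalization are illustrative assumptions (the real extractor tracks 24 kinds and 24 bigrams):

```rust
// Sketch of the 48-feature extraction. Kind indices and the bigram table
// below are assumptions for illustration, not Inner Warden's actual mapping.
const WINDOW: usize = 20;
const KINDS: usize = 24;

/// Tracked bigrams: (first kind, second kind) -> feature slots 24..48.
/// The real table has 24 entries; two are shown here.
const BIGRAMS: [(usize, usize); 2] = [
    (1, 0), // ssh_failed -> ssh_login        (brute-force success)
    (2, 3), // process_exec -> network_connect (post-exploit C2)
];

/// Build the 48-dimensional feature vector from a 20-event window.
fn extract_features(window: &[usize]) -> [f32; 48] {
    let mut f = [0.0f32; 48];
    // Features [0..24]: per-kind frequency, normalized to [0, 1].
    for &kind in window {
        f[kind] += 1.0 / WINDOW as f32;
    }
    // Features [24..48]: frequency of tracked adjacent-event bigrams.
    for pair in window.windows(2) {
        if let Some(slot) = BIGRAMS.iter().position(|&b| b == (pair[0], pair[1])) {
            f[KINDS + slot] += 1.0 / WINDOW as f32;
        }
    }
    f
}

fn main() {
    // Eight failed SSH logins, one success, then ordinary process activity.
    let mut events = vec![1usize; 8];
    events.push(0);
    events.extend(std::iter::repeat(2usize).take(11));
    let f = extract_features(&events);
    println!("ssh_failed freq = {:.2}, brute-force bigram = {:.2}", f[1], f[24]);
}
```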
Lifecycle: from installation to activation
The autoencoder does not start scoring events on day one. It follows a careful lifecycle that prevents false positives during the learning period while gradually increasing its influence on the final score.
Day 0: Install
└── Sensor starts collecting events
└── Agent stores events in JSONL files
└── No autoencoder model exists yet
└── Rules and kill chain scoring operate normally
Day 1-7: Observation period
└── Events accumulate (typically 50K-500K per week)
└── Rules are the sole scoring mechanism
└── Autoencoder weight in final score: 0
Day 7, 3:00 AM: First training run
└── Read 7 days of events from JSONL
└── Extract feature vectors (sliding windows)
└── Train for 50 epochs
└── Timeout: 30 minutes
└── RAM budget: 500MB
└── Auto-test: verify reconstruction error on training data
└── Save model weights (~7.5KB)
Day 8+: Nightly retraining
└── Every night at 3 AM, retrain on last 7 days
└── Model adapts to evolving server behavior
└── Maturity score increases each cycle

The maturity curve
A freshly trained autoencoder is not trusted as much as one that has been retraining for 30 days. The maturity score controls how much weight the autoencoder gets in the final severity calculation. It follows an exponential curve:
maturity = 1 - e^(-0.1 * training_cycles)
Training cycle 1 (Day 8): maturity = 0.095 (~10%)
Training cycle 7 (Day 14): maturity = 0.503 (~50%)
Training cycle 14 (Day 21): maturity = 0.753 (~75%)
Training cycle 30 (Day 37): maturity = 0.950 (~95%)
// The curve asymptotes at 1.0 but never reaches it.
// After 30 cycles, the model is effectively at full trust.

This design means a brand-new model with only one training cycle contributes less than 10% of its potential weight to the final score. If it produces a false positive, the impact is minimal. By day 37, the model has retrained 30 times on 30 different 7-day windows. It has seen your server through weekday traffic, weekend lulls, monthly cron jobs, and deployment spikes. At 95% maturity, it has earned its influence.
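The curve above is a one-liner in code. A direct transcription of the formula:

```rust
/// Maturity: exponential saturation in the number of completed training cycles.
/// maturity = 1 - e^(-0.1 * training_cycles)
fn maturity(training_cycles: u32) -> f64 {
    1.0 - (-0.1 * training_cycles as f64).exp()
}

fn main() {
    // Reproduces the table above: cycles 1, 7, 14, 30.
    for cycles in [1u32, 7, 14, 30] {
        println!("cycle {:>2}: maturity = {:.3}", cycles, maturity(cycles));
    }
}
```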
How the anomaly score integrates with rules
Inner Warden computes a final severity score from three sources. Each source contributes a weighted fraction, and the autoencoder's contribution is scaled by its maturity.
final_score = rules_score * 0.4
+ killchain_score * 0.3
+ anomaly_score * 0.3 * maturity
// Example: Day 8 (maturity = 0.095)
// anomaly detects something unusual: anomaly_score = 0.9
// effective anomaly contribution: 0.9 * 0.3 * 0.095 = 0.026
// Barely moves the needle. Good.
// Example: Day 37 (maturity = 0.950)
// same anomaly: 0.9 * 0.3 * 0.950 = 0.256
// Significant contribution. The model has earned trust.
// If rules_score = 0 and killchain_score = 0 but anomaly is high,
// this is exactly the scenario the autoencoder was built for:
// novel attack that no rule covers.

The three-source scoring creates defense in depth. A known attack triggers rules. A multi-step attack triggers the kill chain engine. A novel attack that evades both still gets flagged by the autoencoder. All three need to miss for an attack to go undetected.
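The blend is simple enough to sketch as a function. The 0.4 / 0.3 / 0.3 weights come straight from the formula above; the function name is illustrative:

```rust
/// Final severity: rules and kill chain at fixed weight,
/// the anomaly score scaled down by model maturity.
fn final_score(rules: f64, killchain: f64, anomaly: f64, maturity: f64) -> f64 {
    rules * 0.4 + killchain * 0.3 + anomaly * 0.3 * maturity
}

fn main() {
    // Day 8 vs. day 37: the same anomaly carries very different weight.
    println!("day 8:  {:.3}", final_score(0.0, 0.0, 0.9, 0.095));
    println!("day 37: {:.3}", final_score(0.0, 0.0, 0.9, 0.950));
}
```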
Rules as teacher: self-correcting false positives
The autoencoder has a built-in self-correction mechanism. It only trains on events that the rule engine considers benign. If the rules say an event is normal, the autoencoder learns to reconstruct it with low error. If the autoencoder later flags a similar event as anomalous, the next nightly training cycle absorbs it into the model of normal behavior.
Day 8: New deployment tool runs for the first time.
Rules: no match (benign)
Kill chain: no match (benign)
Autoencoder: high reconstruction error (anomaly!)
Result: small bump in score (maturity is low)
Day 9: Nightly training includes yesterday's deployment events.
Autoencoder learns: this pattern is normal.
Day 10: Same deployment tool runs again.
Autoencoder: low reconstruction error (normal)
False positive eliminated automatically.

This is the key advantage of training on the server's own data. A classifier trained on a generic dataset would keep flagging your custom deployment tool forever. The autoencoder adapts because it retrains every night on your server's actual behavior.
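The mechanism itself is a one-line filter over the training stream: only events the rule engine left unflagged become training data. The `Event` shape here is a stand-in for illustration, not the agent's actual type:

```rust
/// Stand-in event type; the real agent's events carry far more metadata.
#[derive(Clone, Debug, PartialEq)]
struct Event {
    kind: String,
    rule_matched: bool, // did any rule flag this event?
}

/// Rules as teacher: only events the rule engine considers benign
/// are allowed into the autoencoder's training set.
fn training_events(events: &[Event]) -> Vec<Event> {
    events.iter().filter(|e| !e.rule_matched).cloned().collect()
}

fn main() {
    let events = vec![
        Event { kind: "process_exec".into(), rule_matched: false },
        Event { kind: "reverse_shell".into(), rule_matched: true },
    ];
    println!("{} of {} events kept for training", training_events(&events).len(), events.len());
}
```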
Training: 50 epochs at 3 AM
Training happens nightly at 3 AM when most servers have minimal load. The process reads 7 days of events from JSONL files, extracts feature vectors using the sliding window, and runs 50 epochs of gradient descent.
// Training configuration
const EPOCHS: usize = 50;
const LEARNING_RATE: f32 = 0.001;
const WINDOW_SIZE: usize = 20; // sliding window of events
const FEATURE_DIM: usize = 48; // 24 kinds + 24 bigrams
const TIMEOUT: Duration = Duration::from_secs(30 * 60); // 30 min
const RAM_BUDGET: usize = 500 * 1024 * 1024; // 500 MB
// Training reads events from the last 7 days
// Events are stored in /var/lib/innerwarden/incidents-*.jsonl
// Each line is a JSON object with kind, timestamp, metadata
// Feature extraction produces one vector per window position
// Auto-test after training:
// Run inference on a sample of training data
// If mean reconstruction error > threshold, discard model
// Keep previous model weights as fallback

The 500MB RAM budget is a hard limit. If the 7-day event window produces more data than fits in memory, the trainer uses reservoir sampling to get a representative subset. On most servers, a week of events fits comfortably within the budget. The 30-minute timeout ensures training never impacts daytime operations, even on slower hardware.
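Reservoir sampling keeps a uniform fixed-size sample while touching each item only once, in O(k) memory regardless of stream length. A sketch (the small LCG is a stand-in for whatever RNG the trainer actually uses):

```rust
/// Keep a uniform random sample of at most `k` items from a stream,
/// using O(k) memory regardless of how long the stream is.
fn reservoir_sample<T>(stream: impl Iterator<Item = T>, k: usize, seed: u64) -> Vec<T> {
    let mut rng = seed;
    // Minimal linear congruential generator; stands in for a real RNG.
    let mut next = move || {
        rng = rng.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        rng
    };
    let mut sample: Vec<T> = Vec::with_capacity(k);
    for (i, item) in stream.enumerate() {
        if i < k {
            sample.push(item); // fill the reservoir first
        } else {
            // Item i survives with probability k / (i + 1).
            let j = (next() % (i as u64 + 1)) as usize;
            if j < k {
                sample[j] = item;
            }
        }
    }
    sample
}

fn main() {
    let sample = reservoir_sample(0..1_000_000u32, 1000, 42);
    println!("kept {} of 1,000,000 feature vectors", sample.len());
}
```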
7.5KB of learned behavior
The total number of weights in the network is small by design. The four weight matrices plus biases add up to roughly 7.5KB when serialized. For comparison, a typical PyTorch model checkpoint for a similar architecture would be 50-100KB due to optimizer state and metadata. Inner Warden stores only the raw float32 weights.
Layer 1 (encode): 48 x 16 = 768 weights + 16 biases
Layer 2 (bottle): 16 x 8 = 128 weights + 8 biases
Layer 3 (decode): 8 x 16 = 128 weights + 16 biases
Layer 4 (output): 16 x 48 = 768 weights + 48 biases
Total: 1,792 weights + 88 biases = 1,880 parameters
Size: 1,880 x 4 bytes (f32) = 7,520 bytes (~7.5KB)
// Stored at /var/lib/innerwarden/anomaly-model.bin
// Loaded once at agent startup, replaced on nightly retrain
// Inference: 4 matrix multiplications + 4 bias additions
// No dynamic allocation during inference

Microsecond inference with zero allocation. The feature vector goes in, the reconstruction error comes out, and the agent continues processing the next event. There is no batching, no GPU transfer, no Python interpreter startup. Just Rust multiplying small matrices on the stack.
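As a sanity check, the parameter arithmetic above can be reproduced from the layer widths alone (a standalone sketch, not the agent's code):

```rust
/// Layer widths of the 48-16-8-16-48 bottleneck.
const DIMS: [usize; 5] = [48, 16, 8, 16, 48];

/// (weight count, bias count) for a dense network with these widths.
fn param_count(dims: &[usize]) -> (usize, usize) {
    let weights: usize = dims.windows(2).map(|p| p[0] * p[1]).sum();
    let biases: usize = dims[1..].iter().sum();
    (weights, biases)
}

fn main() {
    let (w, b) = param_count(&DIMS);
    // 1,792 weights + 88 biases = 1,880 f32 parameters = 7,520 bytes.
    println!("{} weights + {} biases = {} bytes", w, b, (w + b) * 4);
}
```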
What it looks like in practice
Here is a real scenario. A server has been running Inner Warden for 30 days. The autoencoder has been trained 23 times. Maturity is 0.90. An attacker gets in through a zero-day in a web application that no rule covers.
14:22:01 Event: process_exec (nginx worker spawns /bin/sh)
14:22:01 Event: process_exec (/bin/sh spawns curl)
14:22:02 Event: network_connect (curl -> 185.220.101.XX:443)
14:22:03 Event: file_write (/tmp/.cache_update)
14:22:03 Event: process_exec (/tmp/.cache_update)
14:22:04 Event: network_connect (.cache_update -> 45.XX.XX.XX:4444)
Sliding window features:
exec frequency: 0.30 (normally ~0.05)
connect frequency: 0.10 (normally ~0.02)
bigram exec->connect: 0.15 (normally 0.00)
bigram write->exec: 0.05 (normally 0.00)
Rules score: 0.0 (no signature matches this zero-day)
Kill chain: 0.0 (not enough stages yet)
Anomaly score: 0.87 (reconstruction error far above threshold)
Effective: 0.87 * 0.3 * 0.90 = 0.235
Combined score: 0.235 -> Medium severity incident
Agent AI triage: escalates based on exec->connect bigram pattern

Without the autoencoder, this attack would have been invisible until more kill chain stages triggered. The autoencoder caught it during the initial exploitation phase because it recognized the event pattern as fundamentally different from anything the server normally does.
Why not use an off-the-shelf ML solution?
The goal was never to build the most powerful anomaly detection model. It was to build one that actually runs on production servers without creating operational burden. A 7.5KB model that retrains itself every night and requires zero configuration is more useful than a 2GB model that needs a data science team to maintain.
Nothing to configure
The autoencoder is enabled by default. Install Inner Warden and it starts collecting events immediately. After 7 days, the first training run happens at 3 AM. You do not need to label data, tune hyperparameters, or provision a GPU.
curl -fsSL https://www.innerwarden.com/install | sudo bash

After the first training cycle, you will see anomaly scores in the dashboard alongside rule-based and kill chain scores. The maturity indicator shows how much trust the model has earned. Within about a month, it is operating near full maturity and catching things that no signature could.
What to read next
- Baseline learning - the EMA-based behavioral profiling that complements the autoencoder with process lineage and login hour tracking.
- Cross-layer correlation - how anomaly scores feed into the kill chain engine for multi-stage attack detection.
- Behavioral DNA - fingerprinting attackers across sessions using patterns that the autoencoder helps identify.
- eBPF kernel security - the 40 eBPF hooks that generate the events the autoencoder trains on.