Shield: DDoS Protection with XDP, Auto-Escalation, and Cloudflare Failover

April 8, 2026 · 11 min read

The DDoS problem for self-hosted servers

You cannot outbandwidth a botnet. A single VPS has 1 to 10 Gbps of upstream capacity. A decent botnet throws 100+ Gbps at your front door. The math does not work in your favor. No amount of clever software can absorb more traffic than your pipe can carry.

But here is the thing most people miss: the majority of DDoS attacks are not massive. Research consistently shows that most attacks fall in the 1 to 5 Gbps range. These are the opportunistic floods, the script-kiddie botnets, the ransom DDoS attempts that hit thousands of targets hoping someone pays. These attacks can absolutely be handled at the edge of your server, if you are fast enough.

The key phrase is "fast enough." If your mitigation runs in userspace, parses packets through the kernel TCP/IP stack, and makes decisions in application code, you are already too slow. By the time your Go or Python process sees the packet, the kernel has already allocated socket buffers, updated connection tracking tables, and consumed CPU cycles. At 10 million packets per second, that overhead kills you.

XDP: dropping packets before the kernel sees them

eXpress Data Path runs at the NIC driver level. This is before the kernel TCP/IP stack, before iptables, before connection tracking, before socket buffer allocation. A packet that XDP drops never gets a socket buffer (sk_buff) allocated for it. It never touches a firewall rule. It never triggers a context switch to userspace.

Shield's XDP program inspects every incoming packet against a BPF HashMap of blocked IPs. The lookup is O(1). If the source IP is in the map, the packet gets an immediate XDP_DROP verdict. No response, no RST, no ICMP unreachable. Silence. The attacker gets nothing back, and because your server never replies to spoofed source addresses, it cannot be abused as a reflector against someone else.
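
To make the lookup concrete, here is a minimal userspace model of that decision in Python. It is a sketch, not Shield's XDP code: a Python set stands in for the BPF HashMap, and `blocked`, `verdict`, and `make_header` are hypothetical names chosen for illustration. The two verdict constants use the same numeric values as the kernel's `enum xdp_action`.

```python
import ipaddress
import struct

XDP_DROP, XDP_PASS = 1, 2  # numeric values from the kernel's enum xdp_action

# Hypothetical blocklist; a Python set stands in for the BPF HashMap.
blocked = {int(ipaddress.IPv4Address("203.0.113.7"))}

def verdict(ipv4_header: bytes) -> int:
    """Return a verdict for one packet, given its raw IPv4 header."""
    if len(ipv4_header) < 20:
        return XDP_PASS  # too short to parse here; leave it to the kernel
    # The source address is bytes 12..15 of the IPv4 header, big-endian.
    (src,) = struct.unpack_from("!I", ipv4_header, 12)
    return XDP_DROP if src in blocked else XDP_PASS

def make_header(src: str, dst: str) -> bytes:
    """Build a minimal 20-byte IPv4 header for the demo (checksum left at 0)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0, 64, 6, 0,
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )
```

Calling `verdict(make_header("203.0.113.7", "198.51.100.1"))` returns `XDP_DROP`, while any source not in the set passes. A real XDP program does the same bounds check and map lookup in C, under the BPF verifier's rules.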

XDP vs. iptables vs. userspace

XDP (NIC driver level):
  10M+ packets/sec   | 0 socket buffers | 0 conntrack entries

iptables (netfilter):
  ~2M packets/sec    | socket buffers allocated | conntrack table grows

Userspace firewall:
  ~200K packets/sec  | full kernel processing | context switches

At 10 million packets per second on commodity hardware, XDP is the only approach that works. Everything else falls over.

Per-IP adaptive rate limiting

Blocking known-bad IPs is necessary but not sufficient. DDoS attacks come from IPs you have never seen before. Shield implements per-IP rate limiting directly in XDP, drawing from research in "eBPF-Based DDoS Mitigation for IoT" (arXiv 2025) and the FlowSentryX architecture.

Three rate limiting algorithms are available, each with different tradeoffs:

Token bucket (default)

Allows short bursts while enforcing a long-term average rate. Best for web traffic where legitimate users send requests in bursts (page load, then idle, then another page load). Tolerates natural traffic patterns.
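
A minimal Python sketch of a per-IP token bucket, assuming a dict in place of the BPF HashMap and caller-supplied timestamps in seconds. The class and parameter names (`TokenBucketLimiter`, `rate`, `burst`) are illustrative, not Shield's configuration keys.

```python
from dataclasses import dataclass

@dataclass
class Bucket:
    tokens: float  # tokens currently available
    last: float    # timestamp of the last refill, in seconds

class TokenBucketLimiter:
    def __init__(self, rate: float, burst: float):
        self.rate = rate    # tokens added per second (long-term average)
        self.burst = burst  # bucket capacity (largest allowed burst)
        self.buckets = {}   # ip -> Bucket; stands in for the BPF HashMap

    def allow(self, ip: str, now: float) -> bool:
        b = self.buckets.get(ip)
        if b is None:
            # First packet from this IP: start with a full bucket.
            b = self.buckets[ip] = Bucket(tokens=self.burst, last=now)
        # Refill for the elapsed time, capped at the bucket capacity.
        b.tokens = min(self.burst, b.tokens + (now - b.last) * self.rate)
        b.last = now
        if b.tokens >= 1.0:
            b.tokens -= 1.0  # spend one token on this packet
            return True
        return False
```

With `rate=10` and `burst=5`, an IP can send five packets back to back, is then refused, and earns the burst back after half a second of silence. That is the page-load-then-idle pattern the algorithm is meant to tolerate.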

Fixed window

Simplest implementation. Counts packets per time window. Fast and predictable, but has a boundary problem: a burst at the end of one window and the start of the next can temporarily exceed the intended rate.
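
The boundary problem is easy to reproduce. The sketch below is a hypothetical fixed-window counter in Python (a dict replaces the BPF HashMap, and the limit and window values are made up for the demo).

```python
class FixedWindowLimiter:
    def __init__(self, limit: int, window: float):
        self.limit = limit    # packets allowed per window
        self.window = window  # window length in seconds
        self.state = {}       # ip -> (window_index, count)

    def allow(self, ip: str, now: float) -> bool:
        idx = int(now // self.window)
        win, count = self.state.get(ip, (idx, 0))
        if win != idx:
            win, count = idx, 0  # a new window started: reset the counter
        if count >= self.limit:
            return False
        self.state[ip] = (win, count + 1)
        return True

# Demo: limit of 100 packets per 1-second window.
lim = FixedWindowLimiter(limit=100, window=1.0)
late = sum(lim.allow("a", 0.9 + i * 0.001) for i in range(100))   # end of window 0
early = sum(lim.allow("a", 1.0 + i * 0.001) for i in range(100))  # start of window 1
```

Both bursts are admitted in full, so 200 packets get through in about 0.2 seconds even though the configured limit is 100 per second. Only the next packet in window 1 is refused.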

Sliding window

Most accurate. Combines current and previous window counts with a weight based on elapsed time. No boundary problem. Slightly more BPF map operations per packet.
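
One common way to implement this is the sliding window counter: keep the current and previous window counts and weight the previous one by how much of it still overlaps the last full window of time. The Python sketch below assumes that variant; the names and the dict-based state are illustrative.

```python
class SlidingWindowLimiter:
    def __init__(self, limit: int, window: float):
        self.limit = limit    # packets allowed per sliding window
        self.window = window  # window length in seconds
        self.state = {}       # ip -> [window_index, current_count, previous_count]

    def allow(self, ip: str, now: float) -> bool:
        idx = int(now // self.window)
        s = self.state.setdefault(ip, [idx, 0, 0])
        if idx == s[0] + 1:
            s[0], s[1], s[2] = idx, 0, s[1]  # roll over: current becomes previous
        elif idx > s[0] + 1:
            s[0], s[1], s[2] = idx, 0, 0     # idle for a whole window: forget history
        # Fraction of the current window that has already elapsed.
        elapsed = (now - idx * self.window) / self.window
        # Weighted estimate of packets seen in the last `window` seconds.
        estimate = s[2] * (1.0 - elapsed) + s[1]
        if estimate >= self.limit:
            return False
        s[1] += 1
        return True

# Demo: 100 packets at the end of one window, 100 more at the start of the next.
lim = SlidingWindowLimiter(limit=100, window=1.0)
late = sum(lim.allow("a", 0.9 + i * 0.001) for i in range(100))
early = sum(lim.allow("a", 1.0 + i * 0.001) for i in range(100))
```

The first burst is admitted in full, but the second is almost entirely refused because the previous window's count still weighs on the estimate. The cost is one extra counter per IP and a multiplication per packet.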

All three use a BPF HashMap to track per-IP counters entirely in kernel space. No userspace round-trip for rate decisions. The result from real-world testing: 97% mitigation of flood traffic with zero measurable impact on legitimate users.

SYN cookie validation in XDP

SYN floods remain one of the most common DDoS vectors. The attacker sends millions of TCP SYN packets with spoofed source IPs. Your kernel allocates a half-open connection entry for each one, exhausting the SYN backlog queue. Legitimate connections cannot get through.

The kernel has its own SYN cookie mechanism, but it only activates after the backlog is full. By that point, damage is already done. Shield validates SYN cookies directly in XDP, based on research from "Me Love SYN-Cookies" (arXiv) and NetDev kernel conference work. The validation happens before the packet reaches the kernel TCP stack at all.

SYN flood mitigation flow

Incoming SYN packet
  -> XDP program intercepts at NIC driver
  -> Generate SYN cookie from (src_ip, src_port, dst_port, secret)
  -> Send SYN-ACK with cookie as sequence number
  -> Drop original SYN (kernel never sees it)

Incoming ACK (legitimate client completes handshake)
  -> XDP validates cookie from acknowledgment number
  -> Valid: pass to kernel TCP stack (connection established)
  -> Invalid: drop (spoofed source, never completed handshake)

The key insight: spoofed-source SYN floods never complete the handshake because the SYN-ACK goes to the spoofed IP, not the attacker. Only real clients send the final ACK with a valid cookie. Zero kernel resources consumed for flood traffic. Complete SYN flood mitigation at line rate.
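
The cookie logic in the flow above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: it hashes only the tuple named in the flow (source IP, source port, destination port, secret) with HMAC-SHA256 and keeps 32 bits, whereas production SYN cookie schemes also mix in a time counter and encode the MSS. `SECRET`, `make_cookie`, and `ack_is_valid` are hypothetical names.

```python
import hashlib
import hmac
import ipaddress
import struct

SECRET = b"hypothetical-per-host-secret"  # assumption: a random key, rotated periodically

def make_cookie(src_ip: str, src_port: int, dst_port: int, secret: bytes = SECRET) -> int:
    """Derive the 32-bit sequence number to send in the SYN-ACK."""
    msg = ipaddress.IPv4Address(src_ip).packed + struct.pack("!HH", src_port, dst_port)
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return struct.unpack("!I", digest[:4])[0]

def ack_is_valid(src_ip: str, src_port: int, dst_port: int,
                 ack_number: int, secret: bytes = SECRET) -> bool:
    """A genuine client acknowledges our sequence number plus one (mod 2**32)."""
    expected = (make_cookie(src_ip, src_port, dst_port, secret) + 1) & 0xFFFFFFFF
    return hmac.compare_digest(
        struct.pack("!I", expected),
        struct.pack("!I", ack_number & 0xFFFFFFFF),
    )
```

No per-connection state is stored between the SYN and the ACK: the server recomputes the cookie from the ACK's own headers. That is why a flood of spoofed SYNs costs only a hash per packet.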

Auto-escalation state machine

Inspired by Cloudflare's L4Drop architecture, Shield operates a four-state machine that automatically adjusts protection levels based on current traffic conditions. No manual intervention needed. No waking up at 3am to toggle settings.

Normal

Standard rate limits. XDP blocklist active. Baseline monitoring. This is where your server lives 99% of the time.

Elevated

Traffic anomaly detected. Rate limits tightened by 50%. SYN cookie validation activated. TCP fingerprinting begins scoring connections.

UnderAttack

Confirmed DDoS in progress. Aggressive rate limiting. All new connections require SYN cookie validation. Bot-fingerprinted IPs auto-blocked.

Critical

Attack exceeds local capacity. Cloudflare failover triggered automatically. Only allowlisted IPs bypass local filtering. Telegram alert sent.

The state machine is self-healing. When attack traffic subsides, Shield automatically steps back down through the states. Critical drops to UnderAttack, then Elevated, then Normal. Each transition has a cooldown period to prevent flapping. The entire cycle happens without human involvement.
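
A rough Python model of the step-down behavior, with two assumptions that the article does not specify: escalation jumps straight to whatever state the traffic justifies, and a single cooldown value (60 seconds here) applies to every downward step. `Escalator` and `update` are hypothetical names.

```python
from enum import IntEnum

class State(IntEnum):
    NORMAL = 0
    ELEVATED = 1
    UNDER_ATTACK = 2
    CRITICAL = 3

class Escalator:
    def __init__(self, cooldown: float = 60.0):
        self.state = State.NORMAL
        self.cooldown = cooldown            # assumed seconds between downward steps
        self.last_change = float("-inf")    # time of the most recent transition

    def update(self, target: State, now: float) -> State:
        """Move toward the state that current traffic justifies."""
        if target > self.state:
            # Escalate immediately (assumption: no cooldown on the way up).
            self.state, self.last_change = target, now
        elif target < self.state and now - self.last_change >= self.cooldown:
            # De-escalate one step at a time, never faster than the cooldown.
            self.state, self.last_change = State(self.state - 1), now
        return self.state
```

If traffic calms down right after a Critical event, repeated calls with `State.NORMAL` walk the machine through UnderAttack and Elevated, one step per cooldown. The cooldown is what keeps a borderline attack from flapping the state back and forth.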

Exponential Moving Average threshold

How does Shield know when traffic is abnormal? Static thresholds break immediately. If you set "alert above 10,000 packets/sec," you get false positives during legitimate traffic spikes and false negatives from slow-ramp attacks that stay just under the line.

Shield uses an Exponential Moving Average, drawing from "Kernel-level LDoS Detection with EMA" (ScienceDirect 2025). The EMA continuously recalibrates a baseline from recent traffic history. The detection threshold floats above this baseline by a configurable multiplier.

Adaptive baseline

EMA formula:
  baseline = alpha * current_rate + (1 - alpha) * previous_baseline

Detection:
  if current_rate > baseline * threshold_multiplier:
      escalate_state()

Example with alpha=0.1, multiplier=3.0:
  Normal traffic:   2,000 pps  -> baseline: 2,000  -> threshold: 6,000
  Traffic spike:    5,000 pps  -> baseline: 2,300  -> threshold: 6,900
  DDoS attack:    50,000 pps  -> baseline: 7,070  -> threshold: 21,210
  Attack detected: 50,000 >> 21,210 -> escalate

The alpha parameter controls how quickly the baseline adapts. A low alpha (0.05) creates a stable baseline that ignores short bursts. A higher alpha (0.2) tracks traffic changes more closely. Shield defaults to 0.1, which balances responsiveness with stability. No more guessing whether a traffic spike is a DDoS or just your site hitting the front page of a news aggregator.
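
The worked example can be reproduced with a few lines of Python. This sketch follows the formula as written: the baseline is updated with the current sample first, then the sample is compared against the floating threshold. `ema_step` and `detect` are illustrative names.

```python
def ema_step(baseline: float, rate: float, alpha: float = 0.1) -> float:
    """One EMA update: blend the new sample into the running baseline."""
    return alpha * rate + (1 - alpha) * baseline

def detect(rates, baseline: float, alpha: float = 0.1, multiplier: float = 3.0):
    """Return (rate, baseline, threshold, escalate) for each sample."""
    out = []
    for rate in rates:
        baseline = ema_step(baseline, rate, alpha)
        threshold = baseline * multiplier
        out.append((rate, round(baseline), round(threshold), rate > threshold))
    return out

for row in detect([2_000, 5_000, 50_000], baseline=2_000):
    print(row)
# (2000, 2000, 6000, False)
# (5000, 2300, 6900, False)
# (50000, 7070, 21210, True)
```

Note that the attack sample itself pulls the baseline up to 7,070 pps. A sustained flood would keep raising it, so a detector built this way may want to freeze or slow baseline updates while the state machine is escalated. That is a design consideration, not something the article states Shield does.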

TCP fingerprinting for bot detection

Not all DDoS traffic looks the same at the TCP level. Real browsers and operating systems produce distinct TCP SYN signatures: specific window sizes, MSS values, TTL ranges, and TCP option ordering. Shield performs passive OS fingerprinting on incoming SYN packets without any JavaScript challenges or CAPTCHAs.

Here is why this matters for DDoS detection: botnets typically run the same malware on thousands of compromised machines. That malware generates TCP packets with identical fingerprints. When Shield sees 5,000 connections from 5,000 different IPs but all with the exact same TCP signature, that is a botnet. Real traffic from real users on real devices shows diversity in fingerprints because people use different operating systems, browsers, and network configurations.

TCP fingerprint fields

Fields extracted from TCP SYN:
  - Initial window size
  - Maximum Segment Size (MSS)
  - Window scale factor
  - TCP options order (MSS, NOP, WScale, SACK, Timestamp)
  - IP TTL (indicates hop distance)
  - Don't Fragment bit

Botnet signature:
  5,000 IPs  |  1 unique fingerprint  ->  botnet (block)

Legitimate traffic:
  5,000 IPs  |  ~200 unique fingerprints  ->  real users (pass)

This detection is entirely passive. No extra round-trips, no JavaScript execution, no CAPTCHA walls. The fingerprint is extracted from the SYN packet that the client was already sending. Users never know it is happening.
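
The diversity check can be sketched in Python as follows. The thresholds (`min_sources`, `max_share`) are made-up values for illustration, and the sketch assumes one fingerprint per source IP. It also rounds the observed TTL up to a common initial value (64, 128, or 255), since hosts running identical software still arrive with different TTLs depending on hop distance.

```python
from collections import Counter

def initial_ttl(observed: int) -> int:
    """Round an observed TTL up to the nearest common initial value."""
    for guess in (64, 128, 255):
        if observed <= guess:
            return guess
    return 255

def fingerprint(window, mss, wscale, options, ttl, df):
    """Collapse the SYN fields listed above into one hashable signature."""
    return (window, mss, wscale, tuple(options), initial_ttl(ttl), df)

def looks_like_botnet(fingerprints, min_sources=1000, max_share=0.9):
    """Flag a sample where a single signature dominates many distinct sources."""
    if len(fingerprints) < min_sources:
        return False  # too few sources to judge diversity
    _, count = Counter(fingerprints).most_common(1)[0]
    return count / len(fingerprints) >= max_share

OPTS = ["MSS", "NOP", "WScale", "SACK", "Timestamp"]
# 5,000 bots: identical stack, TTLs differ only by hop distance.
bots = [fingerprint(64240, 1460, 7, OPTS, 52 + (i % 10), True) for i in range(5000)]
# 5,000 users spread over 200 distinct signatures.
users = [fingerprint(64240 - (i % 200), 1460, 7, OPTS, 57, True) for i in range(5000)]
```

Here `looks_like_botnet(bots)` is true and `looks_like_botnet(users)` is false. Measuring the dominant signature's share rather than the raw count of unique fingerprints keeps a handful of odd clients from hiding an otherwise uniform swarm.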

BGP hijack detection

Some of the most devastating DDoS attacks do not flood your server at all. They hijack your IP prefix at the BGP routing level, redirecting your traffic to a black hole or an attacker-controlled network. Your server stays up, but nobody can reach it because the internet itself has been lied to about where your IP lives.

Shield implements the DFOH approach to detecting forged-origin BGP hijacks from Holterbach et al., published at USENIX NSDI 2024. It subscribes to RIPE RIS and RouteViews BGP collectors to monitor route announcements affecting your prefixes.

DFOH detection pipeline

27-dimensional feature vector

Each BGP update is transformed into 27 features including AS path length, prefix length, number of origin ASes, peer visibility count, historical consistency, and timing patterns.

Random forest classifier

Trained on labeled hijack datasets. Classifies each update as legitimate or suspicious. 90.9% detection rate within 5 minutes of hijack onset.

Low false alarm rate

DFOH reports only about 17.5 suspicious cases per day across the entire internet routing table. For a single prefix, false alarms are extremely rare.

When a hijack is detected, Shield sends an immediate Telegram alert with the suspicious AS path, the conflicting origin AS, and the number of BGP collectors reporting the anomaly. Early detection gives you time to contact your upstream provider and initiate RPKI-based route origin validation before the hijack fully propagates.

Cloudflare auto-failover

Shield handles everything it can locally. But when the attack volume exceeds your pipe bandwidth, local mitigation is not enough. The packets saturate your upstream link before XDP even sees them. This is where Cloudflare comes in.

When Shield's state machine reaches Critical, it automatically pushes firewall rules to Cloudflare's WAF API and enables Under Attack Mode on your zone. The transition is seamless. Your DNS is already proxied through Cloudflare (orange cloud), so no DNS change is needed and the new rules apply immediately to traffic crossing their 300+ Tbps network. Cloudflare absorbs the volumetric flood while Shield continues handling application-layer filtering locally.

Auto-failover sequence

Shield state: Critical
  1. Push blocked IP ranges to Cloudflare WAF rules
  2. Enable Cloudflare Under Attack Mode (JavaScript challenge)
  3. Send Telegram notification:
     "Shield: Cloudflare failover activated.
      Attack: 12 Gbps UDP flood, 847K pps,
      Sources: 23,491 IPs, Duration: 4m 12s"
  4. Continue local XDP filtering for non-Cloudflare traffic

Attack subsides:
  5. Shield detects traffic returning to baseline (EMA)
  6. State machine steps down: Critical -> UnderAttack -> Elevated
  7. Disable Cloudflare Under Attack Mode
  8. Remove pushed WAF rules
  9. Send Telegram notification:
     "Shield: Cloudflare failover deactivated.
      Attack duration: 18m 34s. Returning to local protection."
  10. State returns to Normal

The entire failover and recovery cycle is automatic. Shield pushes to Cloudflare when needed and pulls back when the attack ends. You get a Telegram message at each transition so you know what happened, but you do not need to do anything. The system handles it.
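
Steps 2 and 7 map onto a single Cloudflare zone setting. The Python sketch below builds (but does not send) the request for Cloudflare's v4 API `security_level` zone setting, where the value `under_attack` turns Under Attack Mode on. The function name is hypothetical, the zone ID and token are placeholders, and `medium` as the restore value is an assumption; check the current Cloudflare API documentation before relying on any of it.

```python
import json
import urllib.request

API = "https://api.cloudflare.com/client/v4"

def security_level_request(zone_id: str, token: str, level: str) -> urllib.request.Request:
    """Build the PATCH request that changes a zone's security level."""
    return urllib.request.Request(
        f"{API}/zones/{zone_id}/settings/security_level",
        data=json.dumps({"value": level}).encode(),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# On entering Critical (placeholders, not real credentials):
enable = security_level_request("ZONE_ID", "API_TOKEN", "under_attack")
# On stepping back down (assumed restore level):
disable = security_level_request("ZONE_ID", "API_TOKEN", "medium")
# urllib.request.urlopen(enable) would actually send it.
```

Keeping the request builder separate from the send makes the failover path easy to test without touching a live zone.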

What Shield can and cannot protect

No tool protects against everything. Here is an honest breakdown of what Shield handles and where it needs help.

Shield can protect against

Attack type	How
SYN flood	XDP SYN cookie validation at line rate
UDP flood	XDP rate limiting, up to pipe capacity
HTTP flood	L7 rate limiting + TCP bot fingerprinting
Slowloris	Connection timeout enforcement in XDP
DNS amplification	UDP source validation + rate limiting
BGP hijack	DFOH random forest detection, 90.9% in 5 min
Botnet DDoS	TCP fingerprinting identifies uniform bot signatures
Unknown zero-day DDoS	EMA anomaly detection triggers escalation regardless of attack type

Shield cannot protect against (alone)

Attack type	Why
Volumetric > pipe bandwidth	Physics. Needs upstream filtering (Cloudflare auto-failover handles this).
Sophisticated BGP manipulation	Can detect but cannot fix. Requires upstream provider and RPKI deployment.
Physical datacenter attack	Software cannot protect against someone cutting a fiber cable.

Telegram integration

Shield sends real-time DDoS alerts to the same Telegram bot you already use for SSH brute-force and privilege escalation alerts. Every state transition generates a notification with actionable context: attack type, current packets per second, duration, source IP count, and which mitigation actions were taken.

Example Telegram alerts

[Shield] State: Elevated
Traffic anomaly detected. Rate limits tightened.
Current: 8,200 pps | Baseline: 2,100 pps
SYN cookie validation: active

[Shield] State: UnderAttack
Confirmed DDoS: SYN flood + UDP amplification
Packets/sec: 847,000 | Sources: 12,491 IPs
Duration: 2m 18s | XDP drops: 99.3%

[Shield] State: Normal
Attack subsided. All systems nominal.
Total duration: 22m 04s
Peak: 1.2M pps | Total dropped: 1.4B packets
Cloudflare failover: not triggered (local capacity sufficient)

You get the full picture on your phone. When the attack is small enough for Shield to handle locally, you see it happen and resolve without touching anything. When it escalates to Cloudflare, you see that too. Either way, you know exactly what happened and what Shield did about it.
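
For reference, delivering such an alert comes down to one call to the Telegram Bot API's `sendMessage` method. The Python sketch below formats a message in the style of the examples above and builds the request without sending it. `format_alert` and `telegram_request` are hypothetical helper names, and the token and chat ID are placeholders.

```python
import json
import urllib.request

def format_alert(state: str, lines: list) -> str:
    """Render a state-transition alert in the style shown above."""
    return "\n".join([f"[Shield] State: {state}", *lines])

def telegram_request(bot_token: str, chat_id: int, text: str) -> urllib.request.Request:
    """Build a Bot API sendMessage request (not sent here)."""
    return urllib.request.Request(
        f"https://api.telegram.org/bot{bot_token}/sendMessage",
        data=json.dumps({"chat_id": chat_id, "text": text}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

text = format_alert("Elevated", [
    "Traffic anomaly detected. Rate limits tightened.",
    "Current: 8,200 pps | Baseline: 2,100 pps",
    "SYN cookie validation: active",
])
req = telegram_request("BOT_TOKEN", 123456789, text)
# urllib.request.urlopen(req) would deliver it to the chat.
```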

What to do next

eBPF for security - learn how Shield's XDP programs fit into the broader eBPF security architecture that monitors processes, connections, and file access.
Telegram security alerts - set up the Telegram bot that receives Shield DDoS notifications alongside all other security events.
Mesh network defense - combine Shield with the mesh network to share DDoS threat intelligence across multiple servers.
