A Production-Grade, ML-Ready Traffic Analytics & Abuse-Detection Engine for Nginx, Apache & LiteSpeed
CFM (Configurable Firewall Manager) started as a modern nftables-first firewall manager designed for high-security hosting and infrastructure operators.
Over time, it evolved into a complete security platform: dynamic firewalling, live log-driven detection, autoblocking, system hardening, notifications, DNS/GeoIP enrichment, and API integration.
Today, CFM takes another major step forward with the introduction of the Unified Web Detector — a near-real-time HTTP analytics and suspicious-behavior engine that works across:
- Nginx
- Apache HTTPD
- LiteSpeed
It ingests access logs (file or journald), computes metrics in one-minute sliding windows, enriches the data (ASN, GeoIP, PTR), and exposes this information via CLI, API, and a future web UI.
This post explains what it is, how it works, how to use it, and how it is ML-ready for the next phase of anomaly detection (HBOS, eHBOS, Isolation Forest).
☑️ In Short — What Is CFM?
CFM is an all-in-one firewall + intrusion detection + autoblock manager for Linux servers.
It combines:
- nftables policy control
- dynamic ALLOW/BLOCK mechanisms
- log-driven detection (SSH, Exim, MySQL, FTP, ModSecurity, cPanel)
- portscan & flood detection
- autoblocking with TTL
- enrichment (ASN, Country, PTR)
- notifications (email, Slack, SMTP)
- MaxMind auto-updates
- API integration with external systems
It is designed for shared hosting providers, WordPress/WHM/cPanel infrastructure, self-hosted web stacks, and bare-metal servers that need strong security automation.
🔥 Introducing the Unified Web Detector
Previously, CFM had separate detectors for Nginx and Apache.
They are now replaced with one unified engine.
It implements a high-performance sliding-window analytics engine that:
- Parses web server access logs in real time
- Maintains per-vhost metrics
- Computes suspiciousness scores
- Tracks IPs, user agents, referrers, and paths
- Provides instant drill-down reports
- Is fully ML-ready (HBOS/eHBOS/iForest)
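One common way to keep a one-minute sliding window is a queue of timestamped events that are evicted as they age out. The sketch below illustrates that idea only; it is not CFM's implementation, and `SlidingWindow` and its methods are hypothetical names. It assumes events arrive in timestamp order.

```python
from collections import deque


class SlidingWindow:
    """Keep events seen in the last `span` seconds (hypothetical helper)."""

    def __init__(self, span: float = 60.0) -> None:
        self.span = span
        self.events = deque()  # (timestamp, payload), oldest first

    def add(self, ts: float, payload: object) -> None:
        self.events.append((ts, payload))
        self._evict(ts)

    def _evict(self, now: float) -> None:
        # Drop every event that is at least `span` seconds old.
        while self.events and self.events[0][0] <= now - self.span:
            self.events.popleft()

    def rps(self, now: float) -> float:
        self._evict(now)
        return len(self.events) / self.span


w = SlidingWindow(span=60.0)
for t in (0.0, 10.0, 30.0, 65.0):
    w.add(t, "GET /")
print(len(w.events), w.rps(65.0))
```

Eviction on every insert keeps memory proportional to the traffic inside the window rather than to the total log size.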
Supported input sources
CFM currently supports:
- file tailing (e.g., /var/log/nginx/access.log)
- journald streams (journalctl -u nginx, docker-logs equivalent)
The detector supports Nginx, Apache, and LiteSpeed via sample configs shipped in the package (/usr/share/cfm).
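To make the ingestion step concrete, here is a minimal sketch of turning one access-log line into a structured record. The log layout is an assumption made for illustration (vhost, client IP, the usual combined-log fields, then a trailing request time in seconds); the format CFM actually expects is defined by its shipped sample configs, and `LINE_RE` and `parse_line` are hypothetical names.

```python
import re
from typing import Optional

# ASSUMED format (not necessarily CFM's): vhost, client IP, combined-log
# fields, and a trailing request time in seconds.
LINE_RE = re.compile(
    r'(?P<vhost>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<ua>[^"]*)" (?P<rt>[\d.]+)'
)


def parse_line(line: str) -> Optional[dict]:
    """Return a record dict, or None when the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    rec["rt"] = float(rec["rt"])
    return rec


sample = (
    'example.com 203.0.113.7 - - [10/Oct/2025:13:55:36 +0000] '
    '"GET /wp-login.php HTTP/1.1" 401 512 "-" "curl/8.0" 0.012'
)
print(parse_line(sample))
```

Returning `None` for unparseable lines lets the caller count and skip malformed entries instead of failing the whole stream.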
📊 What the Web Detector Tracks
For every vhost per minute, CFM computes:
✔ Request metrics
- total: total requests
- 2xx, 3xx, 4xx, 5xx
- 401 and 499 (499 is Nginx's non-standard "client closed request" code; on Apache it appears only if logged)
✔ Rates & ratios
- RPS (requests-per-second)
- err% (error ratio)
- auth401 ratio
- avg response time (avg_rt)
✔ Unique metrics
- number of unique source IPs
- direct traffic % (no referrer)
- bot traffic % (Googlebot, Facebook, etc.)
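The counters, ratios, and unique metrics above can be derived from the hits that fall inside one window. The following is a rough sketch under stated assumptions: `Hit` and `window_metrics` are hypothetical names, err% is taken to mean (4xx + 5xx) / total, and bot classification is assumed to have happened upstream.

```python
from dataclasses import dataclass


@dataclass
class Hit:  # hypothetical record type
    ip: str
    status: int
    rt: float       # response time in seconds
    referrer: str   # "" or "-" is treated as direct traffic
    is_bot: bool    # assumed to be classified earlier from the user agent


def window_metrics(hits: list[Hit], window_s: float = 60.0) -> dict:
    total = len(hits)
    if total == 0:
        return {"total": 0}
    classes = {"2xx": 0, "3xx": 0, "4xx": 0, "5xx": 0}
    for h in hits:
        key = f"{h.status // 100}xx"
        if key in classes:
            classes[key] += 1
    n401 = sum(h.status == 401 for h in hits)
    direct = sum(h.referrer in ("", "-") for h in hits)
    return {
        "total": total,
        **classes,
        "rps": total / window_s,
        # Assumption: error ratio counts both 4xx and 5xx responses.
        "err_pct": 100.0 * (classes["4xx"] + classes["5xx"]) / total,
        "auth401_ratio": n401 / total,
        "avg_rt": sum(h.rt for h in hits) / total,
        "uniq_ip": len({h.ip for h in hits}),
        "direct_pct": 100.0 * direct / total,
        "bot_pct": 100.0 * sum(h.is_bot for h in hits) / total,
    }


hits = [
    Hit("198.51.100.1", 200, 0.10, "-", False),
    Hit("198.51.100.1", 401, 0.20, "-", False),
    Hit("198.51.100.2", 404, 0.30, "https://example.org/", True),
    Hit("198.51.100.3", 500, 0.40, "-", False),
]
print(window_metrics(hits))
```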
✔ Top lists
For each vhost:
- Top Source IPs
- Top User Agents
- Top Referrers
- Top Paths
All of these are exposed via both the CLI and the JSON API.
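Top lists like these are typically built with frequency counters over the records in a window. The sketch below shows one simple way to do it; `top_lists` and the record keys are hypothetical, and exact counting is assumed to be affordable for the window size.

```python
from collections import Counter


def top_lists(records: list[dict], n: int = 3) -> dict:
    """Return the n most frequent values for each tracked field."""
    fields = ("ip", "ua", "referrer", "path")
    counters = {f: Counter() for f in fields}
    for rec in records:
        for f in fields:
            counters[f][rec[f]] += 1
    return {f: c.most_common(n) for f, c in counters.items()}


records = [
    {"ip": "203.0.113.7", "ua": "curl/8.0", "referrer": "-", "path": "/wp-login.php"},
    {"ip": "203.0.113.7", "ua": "curl/8.0", "referrer": "-", "path": "/xmlrpc.php"},
    {"ip": "203.0.113.9", "ua": "Mozilla/5.0", "referrer": "-", "path": "/wp-login.php"},
]
tops = top_lists(records, n=2)
print(tops["ip"])
print(tops["path"])
```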
🔎 Suspicious Vhost Detection
CFM evaluates each vhost once per sliding window and assigns a Suspicious Score.
This is not yet true anomaly detection, but the design is ML-ready and can be upgraded to HBOS/eHBOS later.
The engine detects vhosts that might be attacked, scanned, brute-forced, or otherwise behaving strangely.
Suspicious score inputs include:
- High 4xx/5xx ratio
- High 3xx ratio from a single IP
- Unusual spike in UniqueIPs
- Very high average response time
- Bad-Agent concentration
- High direct traffic %
- Repeated hits from same AS / Country
Each vhost gets:
- score (0.0–1.0)
- reasons (human-readable list)
- Additional metrics (unique IPs, rps, 3xx/4xx/5xx counts, 401 ratio)
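A rule-based score with reasons could be structured roughly as follows. This is only a sketch: every threshold, weight, and metric key is an illustrative assumption rather than CFM's actual rule set, and only the reason names `high_3xx` and `spike_ips` are taken from the sample CLI output.

```python
# All thresholds and weights below are illustrative assumptions, not CFM's.
RULES = [
    # (reason, metric key, threshold, weight)
    ("high_err", "err_pct", 30.0, 0.35),
    ("high_3xx", "ratio_3xx", 0.5, 0.25),
    ("spike_ips", "uniq_ip_growth", 3.0, 0.25),
    ("slow_rt", "avg_rt", 2.0, 0.15),
]


def suspicious_score(metrics: dict) -> tuple[float, list[str]]:
    """Sum the weights of all triggered rules, capped at 1.0."""
    score, reasons = 0.0, []
    for reason, key, threshold, weight in RULES:
        if metrics.get(key, 0.0) >= threshold:
            score += weight
            reasons.append(reason)
    return min(score, 1.0), reasons


score, reasons = suspicious_score(
    {"err_pct": 17.4, "ratio_3xx": 0.61, "uniq_ip_growth": 4.2, "avg_rt": 0.3}
)
print(round(score, 2), ",".join(reasons))
```

Keeping rules as data makes it straightforward to tune them per deployment or to replace the weighted sum with a learned scorer later.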
The CLI (cfm httpd-top, cfm nginx-top) displays:
Suspicious vhosts
--- Suspicious vhosts ---
HOST         SCORE  REASONS             RPS  3xx  4xx  5xx  uniqIP  err%
example.com  0.82   high_3xx,spike_ips  6.2  3.8  1.1  0.2  41      17.4
This immediately shows which vhosts might be under:
- bot probing
- credential stuffing
- brute-force scans
- misconfiguration loops
- plugin/theme issues
🧪 ML-Ready Architecture
The suspicious scoring infrastructure is intentionally built so that ML can be slotted in without rewriting code.
Future algorithm plug-ins:
- HBOS (Histogram-Based Outlier Score)
- eHBOS (Ensemble HBOS)
- Isolation Forest (tree-based anomaly scoring)
- Z-Score / statistical baselines
- PCA pre-processing
- Auto-thresholding per vhost
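For readers unfamiliar with HBOS, the idea is to build one histogram per feature and score each sample by how rare its bins are. The sketch below is a simplified static-bin variant written for illustration; it is not part of CFM, and `hbos_scores` is a hypothetical name.

```python
import math


def hbos_scores(rows: list[list[float]], bins: int = 5) -> list[float]:
    """Simplified HBOS: sum over features of log(1 / normalised bin height)."""
    n_features = len(rows[0])
    scores = [0.0] * len(rows)
    for j in range(n_features):
        col = [r[j] for r in rows]
        lo, hi = min(col), max(col)
        width = (hi - lo) / bins or 1.0  # constant feature: everything in bin 0
        idx = [min(int((v - lo) / width), bins - 1) for v in col]
        counts = [0] * bins
        for i in idx:
            counts[i] += 1
        peak = max(counts)
        for row_i, i in enumerate(idx):
            # Heights are normalised so the fullest bin has height 1.
            scores[row_i] += math.log(peak / counts[i])
    return scores


# Hypothetical per-vhost features: [rps, error ratio]
rows = [[1.0, 0.02], [1.2, 0.03], [0.9, 0.02], [1.1, 0.04], [9.5, 0.60]]
scores = hbos_scores(rows)
print(scores)
```

Samples that sit in sparsely populated bins on several features accumulate the highest scores, which is why the last row stands out here.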
CFM can easily emit feature vectors:
- RPS
- 3xx/4xx/5xx ratios
- unique IPs
- bot percentage
- RT average
- entropy of UAs/referrers
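As an illustration of what such a feature vector might look like, the sketch below computes the Shannon entropy of the user agents seen in a window and packs it together with the other metrics. The metric keys and the function names are hypothetical; only the list of features comes from the post.

```python
import math
from collections import Counter


def shannon_entropy(values: list[str]) -> float:
    """Entropy in bits of the empirical distribution of `values`."""
    if not values:
        return 0.0
    total = len(values)
    return -sum(
        (c / total) * math.log2(c / total) for c in Counter(values).values()
    )


def feature_vector(m: dict, user_agents: list[str]) -> list[float]:
    # Keys in `m` are hypothetical; the order must stay fixed for ML use.
    return [
        m["rps"], m["ratio_3xx"], m["ratio_4xx"], m["ratio_5xx"],
        float(m["uniq_ip"]), m["bot_pct"], m["avg_rt"],
        shannon_entropy(user_agents),
    ]


metrics = {"rps": 6.2, "ratio_3xx": 0.61, "ratio_4xx": 0.18, "ratio_5xx": 0.03,
           "uniq_ip": 41, "bot_pct": 12.0, "avg_rt": 0.3}
vec = feature_vector(metrics, ["curl/8.0", "curl/8.0", "Mozilla/5.0", "Mozilla/5.0"])
print(vec)
```

Low user-agent entropy combined with a high request rate often points to a single automated client, while very high entropy can indicate randomized agents.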
🔧 HTTP Debug API (for dashboards & Grafana)
CFM exposes metrics on 127.0.0.1:6060:
- /httpd/top, /nginx/top: JSON output for dashboards.
- /httpd/host?name=example.com: full drill-down JSON.
- /httpd/suspicious?min=0.6: machine-readable suspicious vhosts.
- /debug/pprof/*: full pprof profiler for tuning performance.
This allows:
- building a Grafana dashboard
- building a Web UI
- piping metrics to ClickHouse
- testing ML algorithms live
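A small client for the suspicious-vhosts endpoint might look like the sketch below. The address and path are the ones listed above; the response schema is not documented in this post, so the parsed JSON is returned as-is, and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://127.0.0.1:6060"  # address as given in the post


def suspicious_url(min_score: float = 0.6) -> str:
    return f"{BASE}/httpd/suspicious?{urlencode({'min': min_score})}"


def fetch_suspicious(min_score: float = 0.6, timeout: float = 2.0):
    # The JSON structure is an unknown here, so no field names are assumed.
    with urlopen(suspicious_url(min_score), timeout=timeout) as resp:
        return json.load(resp)


print(suspicious_url(0.6))
```

On a host where CFM is running, calling `fetch_suspicious()` should return the same data a dashboard or ClickHouse loader would consume.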
🧩 Architecture Flow (High Level)
       +-------------+
       | Access Logs |
       +------+------+
              |
              v
    (Tailing or Journald)
              |
              v
   +---------------------+
   |    Web Detector     |
   | - per-vhost metrics |
   | - sliding windows   |
   +----------+----------+
              |
              v
+---------------------------+
| Suspicious Scoring Engine |
|   (Rules now, ML later)   |
+-------------+-------------+
              |
              v
+---------------+   +----------------+
| CLI (cfm top) |   | Debug JSON API |
+---------------+   +----------------+
🎯 Why This Matters
Modern websites—especially WordPress, WooCommerce, cPanel/DirectAdmin deployments—are hit constantly by:
- automated crawlers
- low-skill scanners
- credential stuffing
- plugin exploit sweeps
- global botnets
- brute-force HTTP auth
- SEO spam attacks
Traditional firewalls don’t see these because:
- they act per-packet, not per-request
- they don’t understand vhosts
- they don’t understand user agents
- they don’t track 3xx/4xx/5xx patterns
- they don’t track referrers
- they don’t track paths
- they don’t enrich IPs
CFM fills that gap.
It gives you hosting-grade, NOC-grade visibility into your web traffic — instantly — with no external dependencies.
🏁 Final Notes
With this new web detector, CFM now offers:
✔ Real-time vhost analytics
✔ Suspicious activity detection
✔ Per-IP HTTP classification
✔ Response-time performance insights
✔ Enriched visibility (ASN, PTR, Country)
✔ ML-ready feature vectors
✔ Unified behavior for Nginx, Apache, LiteSpeed
✔ Dramatically better debugging & monitoring
You now have the foundation for:
- HTTP anomaly detection
- Auto-block policies based on vhost behavior
- Web application threat detection
- Performance & tuning analytics
- Abuse detection with minimal CPU overhead