A Production-Grade, ML-Ready Traffic Analytics & Abuse-Detection Engine for Nginx, Apache & LiteSpeed

CFM (Configurable Firewall Manager) started as a modern nftables-first firewall manager designed for high-security hosting and infrastructure operators.
Over time, it evolved into a complete security platform: dynamic firewalling, live log-driven detection, autoblocking, system hardening, notifications, DNS/GeoIP enrichment, and API integration.

Today, CFM takes another major step forward with the introduction of the Unified Web Detector — a near-real-time HTTP analytics and suspicious-behavior engine that works across:

Nginx
Apache HTTPD
LiteSpeed

It ingests access logs (file or journald), computes metrics in one-minute sliding windows, enriches the data (ASN, GeoIP, PTR), and exposes the results through the CLI and the API, with a web UI planned.

This post explains what it is, how it works, how to use it, and how it is ML-ready for the next phase of anomaly detection (HBOS, eHBOS, Isolation Forest).

☑️ In Short — What Is CFM?

CFM is an all-in-one firewall + intrusion detection + autoblock manager for Linux servers.
It combines:

nftables policy control
dynamic ALLOW/BLOCK mechanisms
log-driven detection (SSH, Exim, MySQL, FTP, ModSecurity, cPanel)
portscan & flood detection
autoblocking with TTL
enrichment (ASN, Country, PTR)
notifications (email, Slack, SMTP)
MaxMind auto-updates
API integration with external systems

It is designed for shared hosting providers, WordPress/WHM/cPanel infrastructure, self-hosted web stacks, and bare-metal servers that need strong security automation.

🔥 Introducing the Unified Web Detector

Previously, CFM had separate detectors for Nginx and Apache.
They are now replaced with one unified engine.

It implements a high-performance sliding-window analytics engine that:

Parses web server access logs in real time
Maintains per-vhost metrics
Computes suspiciousness scores
Tracks IPs, user agents, referrers, and paths
Provides instant drill-down reports
Is fully ML-ready (HBOS/eHBOS/iForest)

Supported input sources

CFM currently supports:

file tailing (e.g., /var/log/nginx/access.log)
journald streams (e.g., journalctl -u nginx, or the equivalent for Docker logs)

The detector supports Nginx, Apache, and LiteSpeed via sample configs shipped in the package (/usr/share/cfm).
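As a rough illustration of the parsing step, here is a minimal Python sketch that turns one access-log line into a structured record. The line layout (vhost first, request time last) is an assumption made for this example and is not necessarily the format used by the sample configs in /usr/share/cfm; `LINE_RE`, `Hit`, and `parse_line` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import Optional

# ASSUMED layout (hypothetical, not CFM's shipped log format):
# vhost ip - user [time] "METHOD path PROTO" status bytes "referer" "ua" request_time
LINE_RE = re.compile(
    r'^(?P<vhost>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<ua>[^"]*)"(?: (?P<rt>\d+(?:\.\d+)?))?$'
)


@dataclass
class Hit:
    vhost: str
    ip: str
    path: str
    status: int
    referer: str
    ua: str
    rt: Optional[float]  # request time in seconds, if the line carries one


def parse_line(line: str) -> Optional[Hit]:
    """Return a Hit for a well-formed line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rt = float(m.group("rt")) if m.group("rt") else None
    return Hit(
        vhost=m.group("vhost"),
        ip=m.group("ip"),
        path=m.group("path"),
        status=int(m.group("status")),
        referer=m.group("referer"),
        ua=m.group("ua"),
        rt=rt,
    )
```

Lines that do not match return None instead of raising, so a tailing loop can skip malformed entries and keep going.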

📊 What the Web Detector Tracks

For every vhost per minute, CFM computes:

✔ Request metrics

total — total requests
2xx, 3xx, 4xx, 5xx
401 and 499 (499 is an Nginx-specific status for a client that closed the connection; it is counted only if it appears in the log)

✔ Rates & ratios

RPS (requests-per-second)
err% (error ratio)
auth401 ratio
avg response time (avg_rt)

✔ Unique metrics

number of unique source IPs
direct traffic % (no referrer)
bot traffic % (Googlebot, Facebook, etc.)

✔ Top lists

For each vhost:

Top Source IPs
Top User Agents
Top Referrers
Top Paths

All of these are exposed via both the CLI and the JSON API.
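The per-vhost, per-minute bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses fixed one-minute buckets rather than a true sliding window, and the names `MinuteBucket` and `VhostWindows` and the summary keys are hypothetical, not CFM's internal structures.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MinuteBucket:
    """Counters for one vhost during one minute (hypothetical structure)."""
    total: int = 0
    classes: Counter = field(default_factory=Counter)  # "2xx", "3xx", ...
    auth401: int = 0
    direct: int = 0  # hits without a referrer
    rt_sum: float = 0.0
    rt_count: int = 0
    ips: Counter = field(default_factory=Counter)
    paths: Counter = field(default_factory=Counter)

    def add(self, ip: str, path: str, status: int, referer: str,
            rt: Optional[float] = None) -> None:
        self.total += 1
        self.classes[f"{status // 100}xx"] += 1
        if status == 401:
            self.auth401 += 1
        if referer in ("", "-"):
            self.direct += 1
        if rt is not None:
            self.rt_sum += rt
            self.rt_count += 1
        self.ips[ip] += 1
        self.paths[path] += 1

    def summary(self, window_seconds: int = 60, top_n: int = 3) -> dict:
        total = self.total or 1  # avoid division by zero on an empty bucket
        errors = self.classes["4xx"] + self.classes["5xx"]
        return {
            "total": self.total,
            "rps": self.total / window_seconds,
            "err_pct": 100.0 * errors / total,
            "auth401_ratio": self.auth401 / total,
            "direct_pct": 100.0 * self.direct / total,
            "avg_rt": self.rt_sum / self.rt_count if self.rt_count else 0.0,
            "uniq_ips": len(self.ips),
            "top_ips": self.ips.most_common(top_n),
            "top_paths": self.paths.most_common(top_n),
        }


class VhostWindows:
    """Maps (vhost, minute) to a bucket; fixed minutes, not a sliding window."""

    def __init__(self) -> None:
        self.buckets = defaultdict(MinuteBucket)

    def observe(self, vhost, epoch_seconds, ip, path, status, referer, rt=None):
        key = (vhost, int(epoch_seconds) // 60)
        self.buckets[key].add(ip, path, status, referer, rt)

    def summary(self, vhost, epoch_seconds) -> dict:
        return self.buckets[(vhost, int(epoch_seconds) // 60)].summary()
```

Top user agents and referrers would follow the same Counter pattern as `ips` and `paths`; they are left out to keep the sketch short.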

🔎 Suspicious Vhost Detection

CFM evaluates each vhost once per window and assigns it a suspicious score.
This is not yet true anomaly detection: the score currently comes from rules, but the design is ML-ready and can be upgraded to HBOS/eHBOS later.

The engine detects vhosts that might be attacked, scanned, brute-forced, or otherwise behaving strangely.

Suspicious score inputs include:

High 4xx/5xx ratio
High 3xx ratio from a single IP
Unusual spike in UniqueIPs
Very high average response time
Bad-Agent concentration
High direct traffic %
Repeated hits from same AS / Country

Each vhost gets:

score (0.0–1.0)
reasons (human-readable list)
Additional metrics (unique IPs, rps, 3xx/4xx/5xx counts, 401 ratio)
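To make the "score plus reasons" idea concrete, here is a minimal rule-based sketch in Python. The thresholds, weights, metric keys, and most reason names are assumptions chosen for illustration (only high_3xx and spike_ips appear in the sample CLI output); CFM's actual rules may differ.

```python
from typing import Dict, List, Tuple

# (reason, metric key, threshold, weight). All values are hypothetical.
RULES = [
    ("high_err", "err_ratio", 0.30, 0.25),
    ("high_3xx", "ratio_3xx", 0.50, 0.25),
    ("spike_ips", "uniq_ip_growth", 3.0, 0.25),
    ("slow_rt", "avg_rt", 2.0, 0.125),
    ("high_direct", "direct_ratio", 0.90, 0.125),
]


def score_vhost(metrics: Dict[str, float]) -> Tuple[float, List[str]]:
    """Sum the weights of every rule that fires; cap the score at 1.0."""
    score = 0.0
    reasons: List[str] = []
    for reason, key, threshold, weight in RULES:
        # A metric missing from the input simply never fires its rule.
        if metrics.get(key, 0.0) > threshold:
            score += weight
            reasons.append(reason)
    return min(score, 1.0), reasons
```

Keeping the rules in a table makes each reason traceable to a single metric and threshold, which is what lets the output list human-readable reasons next to the score.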

The CLI (cfm httpd-top, cfm nginx-top) displays:

Suspicious vhosts

--- Suspicious vhosts ---
HOST         SCORE  REASONS             RPS  3xx  4xx  5xx  uniqIP  err%
example.com  0.82   high_3xx,spike_ips  6.2  3.8  1.1  0.2  41      17.4

This immediately shows which vhosts might be under:

bot probing
credential stuffing
brute-force scans
misconfiguration loops
plugin/theme issues

🧪 ML-Ready Architecture

The suspicious scoring infrastructure is intentionally built so that ML can be slotted in without rewriting code.

Future algorithm plug-ins:

HBOS — Histogram-Based Outlier Score
eHBOS — Ensemble HBOS
Isolation Forest — tree-based anomaly scoring
Z-Score / statistical baselines
PCA pre-processing
Auto-thresholding per vhost
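For readers unfamiliar with HBOS, the following Python sketch shows the basic static-bin variant: one equal-width histogram per feature, bin heights normalised so the tallest bin is 1, and a score equal to the sum of log(1 / height) across features. It is a simplified illustration, not CFM code; the function names, the bin count, and the `floor` used for empty or out-of-range bins are assumptions.

```python
import math
from typing import List, Optional, Sequence, Tuple

Model = Tuple[float, float, float, List[float]]  # (lo, hi, width, heights)


def _bin_index(value: float, lo: float, hi: float, width: float,
               bins: int) -> Optional[int]:
    """Return the bin for value, or None if it lies outside [lo, hi]."""
    if value < lo or value > hi:
        return None
    return min(math.floor((value - lo) / width), bins - 1)


def fit_histograms(rows: Sequence[Sequence[float]], bins: int = 10) -> List[Model]:
    """Build one equal-width histogram per feature column."""
    models: List[Model] = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        width = (hi - lo) / bins
        if width == 0.0:
            width = 1.0  # constant column: everything lands in bin 0
        counts = [0] * bins
        for v in col:
            counts[_bin_index(v, lo, hi, width, bins)] += 1
        peak = max(counts)
        # Normalise so the most populated bin has height 1.
        models.append((lo, hi, width, [c / peak for c in counts]))
    return models


def hbos_score(models: List[Model], row: Sequence[float],
               floor: float = 1e-3) -> float:
    """Higher score means the row falls in sparser bins (more unusual)."""
    score = 0.0
    for (lo, hi, width, heights), v in zip(models, row):
        idx = _bin_index(v, lo, hi, width, len(heights))
        height = heights[idx] if idx is not None else 0.0
        # floor keeps log() finite for empty or out-of-range bins.
        score += math.log(1.0 / max(height, floor))
    return score
```

In this setting each row would be one vhost-minute feature vector, fitted on recent history for that vhost and scored against the current minute.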

CFM can easily emit feature vectors:

RPS
3xx/4xx/5xx ratios
unique IPs
bot percentage
RT average
entropy of UAs/referrers
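A feature vector along these lines could be assembled as in the Python sketch below. The input dictionary keys and the feature order are assumptions for illustration; the entropy is plain Shannon entropy in bits over the observed user-agent or referrer counts.

```python
import math
from collections import Counter
from typing import Dict, List, Mapping


def shannon_entropy(counts: Mapping[str, int]) -> float:
    """Entropy in bits; 0.0 when all hits share one value or there are none."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )


def feature_vector(m: Dict, window_seconds: int = 60) -> List[float]:
    """Flatten one vhost-minute summary (hypothetical keys) into numbers."""
    total = m["total"] or 1  # avoid division by zero
    return [
        m["total"] / window_seconds,       # RPS
        m["3xx"] / total,                  # 3xx ratio
        m["4xx"] / total,                  # 4xx ratio
        m["5xx"] / total,                  # 5xx ratio
        float(m["uniq_ips"]),              # unique source IPs
        m["bot_hits"] / total,             # bot share
        m["avg_rt"],                       # average response time
        shannon_entropy(m["ua_counts"]),   # user-agent entropy
        shannon_entropy(m["ref_counts"]),  # referrer entropy
    ]
```

Low user-agent entropy combined with high RPS is the kind of pattern a single scripted client tends to produce, which is why entropy is a useful companion to the raw counters.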

🔧 HTTP Debug API (for dashboards & Grafana)

CFM exposes metrics on 127.0.0.1:6060:

`/httpd/top`

`/nginx/top`

JSON output for dashboards.

`/httpd/host?name=example.com`

Full drill-down JSON.

`/httpd/suspicious?min=0.6`

Machine-readable suspicious vhosts.

`/debug/pprof/*`

Full pprof profiler for tuning performance.

This allows:

building a Grafana dashboard
building a Web UI
piping metrics to ClickHouse
testing ML algorithms live
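As a starting point for such integrations, here is a small Python client sketch for the suspicious-vhost endpoint. The address and path come from the list above; the JSON shape (a list of objects with "host" and "score" fields) is an assumption, so adjust the field names to whatever the API actually returns.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:6060"  # debug API address from the post


def suspicious_url(min_score: float, base_url: str = BASE_URL) -> str:
    """Build the /httpd/suspicious URL with its min query parameter."""
    return f"{base_url}/httpd/suspicious?{urlencode({'min': min_score})}"


def fetch_json(url: str, timeout: float = 2.0) -> Any:
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def hosts_above(payload: List[Dict[str, Any]],
                min_score: float) -> List[Tuple[str, float]]:
    """ASSUMED schema: each entry has 'host' and 'score'. Highest score first."""
    picked = [
        (entry["host"], entry["score"])
        for entry in payload
        if entry.get("score", 0.0) >= min_score
    ]
    return sorted(picked, key=lambda item: item[1], reverse=True)
```

On a host running CFM, `hosts_above(fetch_json(suspicious_url(0.6)), 0.6)` would then return the flagged vhosts, assuming the response matches the schema above.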

🧩 Architecture Flow (High Level)

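+------------------------------+
|         Access Logs          |
|  nginx / apache / litespeed  |
+--------------+---------------+
               |
               v
     (Tailing or Journald)
               |
               v
+------------------------------+
|         Web Detector         |
|  - per-vhost metrics         |
|  - sliding windows           |
+--------------+---------------+
               |
               v
+------------------------------+
|  Suspicious Scoring Engine   |
|    (Rules now, ML later)     |
+--------------+---------------+
               |
               v
+-----------------+  +------------------+
|  CLI (cfm top)  |  |  Debug JSON API  |
+-----------------+  +------------------+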

🎯 Why This Matters

Modern websites—especially WordPress, WooCommerce, cPanel/DirectAdmin deployments—are hit constantly by:

automated crawlers
low-skill scanners
credential stuffing
plugin exploit sweeps
global botnets
brute-force HTTP auth
SEO spam attacks

Traditional firewalls don’t see these because:

they act per-packet, not per-request
they don’t understand vhosts
they don’t understand user agents
they don’t track 3xx/4xx/5xx patterns
they don’t track referrers
they don’t track paths
they don’t enrich IPs

CFM fills that gap.

It gives you hosting-grade, NOC-grade visibility into your web traffic — instantly — with no external dependencies.

🏁 Final Notes

With this new web detector, CFM now offers:

✔ Real-time vhost analytics

✔ Suspicious activity detection

✔ Per-IP HTTP classification

✔ Response-time performance insights

✔ Enriched visibility (ASN, PTR, Country)

✔ ML-ready feature vectors

✔ Unified behavior for Nginx, Apache, LiteSpeed

✔ Dramatically better debugging & monitoring

You now have the foundation for:

HTTP anomaly detection
Auto-block policies based on vhost behavior
Web application threat detection
Performance & tuning analytics
Abuse detection with minimal CPU overhead
