security-collector-exporter: Monitoring Linux Security Auditing

May 14, 2026 · Observability · Prometheus, Linux, Security Monitoring, Go, Exporter

Why This Was Built

Anyone who manages servers has probably been through this: a compliance audit arrives, and you SSH into machines one by one to check whether the SSH config is correct, SELinux is enabled, the firewall is running, any accounts have expired, and password policies are compliant. A few machines are manageable; with dozens or hundreds it becomes pure manual grunt work.

The more painful part is that none of this is continuously monitored. You verify compliance today, someone changes a config tomorrow, and you’d never know.

The Prometheus ecosystem has node_exporter for basic system metrics (CPU, memory, disk), but security configuration state has always been a gap. security-collector-exporter fills it by turning Linux security-related configuration and state into Prometheus metrics, so they plug into an existing monitoring stack for continuous tracking and automatic alerting.

What It Collects

Coverage spans 15 categories of security metrics, from accounts to kernel parameters. The main ones:

Category	Metrics	Description
System Info	`linux_security_os_version_info`	OS version, package count, last patch time
Account Management	`linux_security_account_info`	passwd info, sudo permission detection
Password Policy	`linux_security_password_*`	6 independent metrics covering shadow file fields
SSH Config	`linux_security_sshd_config_info`	sshd_config key configuration items
Firewall	`linux_security_firewall_enabled`	Supports firewalld/ufw/iptables/nftables
Port Monitoring	`linux_security_ports_use_info`	Includes process name, version, application name
Service Status	`linux_security_services_info`	systemd service start/stop and running status
SELinux	`linux_security_selinux_config`	Configuration and enforcement mode
Kernel Parameters	`linux_security_sysctl_*`	Security-related sysctl parameter validation
Scheduled Tasks	`linux_security_crontab_info`	System/user crontab entries
Audit Service	`linux_security_auditd_info`	auditd status and rule count
Login Policy	`linux_security_login_defs_info`	login.defs configuration items

The entire collection pipeline looks like this:

mermaid
flowchart TD
    fs["📁 Linux Filesystem<br/>accounts/SSH/SELinux<br/>crontab static config"]
    proc["⚡ /proc runtime<br/>net conns/process/containers"]
    svc["⚙️ services & commands<br/>firewall/systemd/auditd"]
    detect["🔍 version detection<br/>HTTP API/JAR/command line"]
    collect["🔧 Collector<br/>metric aggregation"]
    prom["📊 Prometheus<br/>/metrics :9102"]
    fs --> collect
    proc --> detect
    proc --> collect
    svc --> collect
    detect --> collect
    collect --> prom
    classDef src fill:#bbdefb,stroke:#2196F3,color:#1B5E20
    classDef mid fill:#fff3e0,stroke:#FF9800,color:#BF360C
    classDef out fill:#c8e6c9,stroke:#4CAF50,color:#1B5E20
    class fs,proc,svc src
    class detect,collect mid
    class prom out

The diagram has three layers. At the top are the Linux data sources: static configuration on the filesystem, runtime data under /proc, and the output of services and system commands. In the middle sit the exporter’s internal Collector and version detection engine. At the bottom is the endpoint that Prometheus scrapes.

Interesting Design Decisions

Version Detection in Port Metrics

Port metrics don’t just record port numbers and process names. For common services (MySQL, Nginx, Redis, etc.), the exporter attempts to detect the version number. For Java applications (Elasticsearch, Kafka, Tomcat, Jenkins, etc.), it identifies the real application name and version through several methods, falling back from one to the next: HTTP API calls, JAR MANIFEST.MF parsing, command-line argument extraction, and container image label reading.

This feature took the most effort; process_info.go alone is 1347 lines. The reason is that Java applications only show java as the process name, so without extra detection you’d never know whether it’s Elasticsearch or Kafka running.
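
The fallback chain can be pictured as an ordered list of detectors in which the first one to return a result wins. Below is a minimal sketch in Python (the exporter itself is written in Go). `ProcessHints`, `AppInfo`, the detector functions and the jar-name pattern are illustrative assumptions, not the project's actual code; only `Implementation-Title` and `Implementation-Version` are standard JAR manifest attributes.

```python
import os
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ProcessHints:
    """Hypothetical bundle of facts gathered about one Java process."""
    cmdline: List[str]
    manifest_text: Optional[str] = None  # META-INF/MANIFEST.MF contents, if readable


@dataclass
class AppInfo:
    name: str
    version: str


def from_manifest(hints: ProcessHints) -> Optional[AppInfo]:
    """Use the standard Implementation-* attributes of a JAR manifest."""
    if not hints.manifest_text:
        return None
    attrs = {}
    for line in hints.manifest_text.splitlines():
        if line.startswith(" "):
            continue  # continuation of a wrapped value; ignored in this sketch
        key, sep, value = line.partition(":")
        if sep:
            attrs[key.strip()] = value.strip()
    name = attrs.get("Implementation-Title")
    version = attrs.get("Implementation-Version")
    return AppInfo(name, version) if name and version else None


# Assumed naming convention: <name>-<dotted version>.jar
JAR_NAME = re.compile(r"(.+?)-(\d+(?:\.\d+)+)\.jar$")


def from_cmdline(hints: ProcessHints) -> Optional[AppInfo]:
    """Fall back to a versioned jar file name found in the command line."""
    for arg in hints.cmdline:
        match = JAR_NAME.match(os.path.basename(arg))
        if match:
            return AppInfo(match.group(1), match.group(2))
    return None


DETECTORS: List[Callable[[ProcessHints], Optional[AppInfo]]] = [from_manifest, from_cmdline]


def detect(hints: ProcessHints) -> AppInfo:
    """Try each detector in order; the first hit wins."""
    for detector in DETECTORS:
        info = detector(hints)
        if info is not None:
            return info
    return AppInfo("java", "unknown")  # nothing matched: keep the bare process name
```

Keeping each method behind the same small interface means a new detection method is one more entry in the list, and the order of the list is the fallback order.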

Shadow File as Independent Metrics

Rather than being combined into one large metric, each field in /etc/shadow (last change date, max validity, min validity, warning days, inactive days, account expiration) is exposed as its own gauge, six in total. This makes PromQL threshold checks natural:

promql
# Find accounts with password validity exceeding 90 days
linux_security_password_max_days > 90
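
To see how one shadow entry maps onto six separate values, here is a minimal Python sketch (the exporter itself is written in Go). The field positions follow the standard /etc/shadow layout; the key names in `SHADOW_FIELDS` are illustrative and are not the project's metric names.

```python
from typing import Dict, Optional, Tuple

# Positions of the six numeric fields in an /etc/shadow entry:
# name:password:lastchg:min:max:warn:inactive:expire:reserved
SHADOW_FIELDS = {
    "last_change": 2,
    "min_days": 3,
    "max_days": 4,
    "warn_days": 5,
    "inactive_days": 6,
    "expire": 7,
}


def parse_shadow_line(line: str) -> Optional[Tuple[str, Dict[str, int]]]:
    """Return (user, gauges) for one entry, or None if the line is not an entry."""
    parts = line.rstrip("\n").split(":")
    if line.startswith("#") or len(parts) != 9:
        return None
    gauges = {}
    for key, index in SHADOW_FIELDS.items():
        try:
            gauges[key] = int(parts[index])
        except ValueError:
            pass  # empty or malformed field: emit no sample for this gauge
    return parts[0], gauges
```

Because an empty field simply produces no sample, a query such as the threshold above only ever compares accounts that actually have the value set.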

Multi-Layer Firewall State Detection

The exporter doesn’t just run systemctl is-active firewalld and call it done. Each firewall type has its own detection logic: checking the systemd service file status, checking whether the process is running, checking ufw’s special state file (/var/lib/ufw/ufw-not-booted), and checking the paths of iptables rules files. The extra effort is worth it because, in real environments, a firewall that is “configured but not running” is all too common.
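
The point of checking several signals is that "enabled" and "running" are answered independently. A minimal Python sketch of that idea for firewalld follows (the exporter itself is written in Go). The two signals used here, a unit symlink under `multi-user.target.wants` and a `/proc/<pid>/comm` entry named `firewalld`, are assumptions chosen for illustration and may differ from what the project checks.

```python
import os
from typing import Dict


def is_enabled(root: str, unit: str = "firewalld.service") -> bool:
    """Assumed signal: the unit is linked into multi-user.target.wants."""
    link = os.path.join(root, "etc/systemd/system/multi-user.target.wants", unit)
    return os.path.lexists(link)


def is_running(root: str, comm: str = "firewalld") -> bool:
    """Assumed signal: some process reports this name in /proc/<pid>/comm."""
    proc = os.path.join(root, "proc")
    if not os.path.isdir(proc):
        return False
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc, entry, "comm")) as handle:
                if handle.read().strip() == comm:
                    return True
        except OSError:
            continue  # process exited or is not readable
    return False


def firewall_state(root: str = "/") -> Dict[str, bool]:
    """Report both answers so 'configured but not running' stays visible."""
    return {"enabled": is_enabled(root), "running": is_running(root)}
```

Passing the filesystem root as a parameter keeps the checks testable against a temporary directory instead of the live host.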

Deployment and Running

Docker is the simplest way:

bash
docker run -d \
  --name security-exporter \
  --privileged \
  -p 9102:9102 \
  ghcr.io/mickeyzzc/security-collector-exporter:0.1.0

The --privileged flag is needed so the exporter can read system files such as /etc/shadow and /proc.

Some useful startup parameters:

bash
# Only collect LISTEN state ports (default)
./security-exporter --collector.port-states="LISTEN"

# Also collect ESTABLISHED connections
./security-exporter --collector.port-states="LISTEN,ESTABLISHED"

# Only collect enabled services (default behavior)
./security-exporter --collector.services-enabled=true

# Combined filter: collect only services that are both enabled and running
./security-exporter --collector.services-enabled=true --collector.services-running=true

# Enable debug logs for troubleshooting
./security-exporter --log.level=debug
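
A state filter like `--collector.port-states` maps naturally onto the `st` column of /proc/net/tcp, where the kernel encodes each socket's TCP state as a hex code (`0A` is LISTEN, `01` is ESTABLISHED). The following Python sketch shows that filtering step under the assumption that the data comes from /proc/net/tcp; the exporter itself is written in Go, and `ports_in_states` and the sample data are hypothetical.

```python
from typing import List, Tuple

# TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

# Made-up sample in the /proc/net/tcp layout: port 22 listening, port 3306 connected.
SAMPLE = """  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345 1 0000000000000000 100 0 0 10 0
   1: 0100007F:0CEA 0100007F:D2F0 01 00000000:00000000 00:00000000 00000000  1000        0 23456 1 0000000000000000 20 4 30 10 -1
"""


def ports_in_states(proc_net_tcp: str, wanted: str = "LISTEN") -> List[Tuple[int, str]]:
    """Return (local_port, state) for sockets whose state is in the wanted list."""
    allowed = {name.strip().upper() for name in wanted.split(",")}
    result = []
    for line in proc_net_tcp.splitlines()[1:]:  # first line is the column header
        cols = line.split()
        if len(cols) < 4:
            continue
        state = TCP_STATES.get(cols[3].upper())
        if state in allowed:
            port = int(cols[1].rsplit(":", 1)[1], 16)  # port is hex after the colon
            result.append((port, state))
    return result
```

Calling `ports_in_states(SAMPLE)` keeps only the listening socket, while `ports_in_states(SAMPLE, "LISTEN,ESTABLISHED")` also includes the connected one.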

Add a scrape config on the Prometheus side:

yaml
scrape_configs:
  - job_name: 'security-exporter'
    static_configs:
      - targets: ['localhost:9102']

Example Alert Rules

The project includes a complete set of security compliance alert rules covering SSH, SELinux, firewall, password policy, and service management. Here are a few typical examples:

yaml
# Root SSH login not disabled — critical
- alert: RootSSHLoginEnabled
  expr: linux_security_sshd_config_info{info_key="PermitRootLogin", info_value="yes"}
  labels:
    severity: critical

# SELinux not in enforcing mode
- alert: SELinuxNotEnforcing
  expr: linux_security_selinux_config{info_key="SELINUX", info_value=~"permissive|disabled"}
  labels:
    severity: warning

# Firewall configured but not running
- alert: FirewallNotRunning
  expr: linux_security_firewall_enabled{firewall_type!="none", is_running="false"} == 1
  labels:
    severity: warning

# Password validity exceeds 90 days
- alert: PasswordMaxDaysTooLong
  expr: linux_security_login_defs_info{info_key="PASS_MAX_DAYS", info_value="num"} > 90
  labels:
    severity: warning

You can even compute a security compliance score by weighting and summing all checks (the weights below add up to 95):

promql
(
  (linux_security_sshd_config_info{info_key="PermitRootLogin", info_value="no"} or vector(0)) * 20 +
  (linux_security_selinux_config{info_key="SELINUX", info_value="enforcing"} or vector(0)) * 15 +
  (linux_security_firewall_enabled{firewall_type!="none"} == 1) * 10 +
  (linux_security_firewall_enabled{firewall_type!="none", is_running="true"} == 1) * 5 +
  ((linux_security_login_defs_info{info_key="PASS_MIN_LEN", info_value="num"} >= 10) or vector(0)) * 10 +
  ((linux_security_login_defs_info{info_key="PASS_MAX_DAYS", info_value="num"} <= 90) or vector(0)) * 10 +
  (linux_security_services_info{service_name="xwindow", is_running="false"} or vector(0)) * 5 +
  (count(linux_security_services_info{service_name=~"nfs|cups|bluetooth|avahi-daemon|rpcbind|postfix", is_running="true"}) == 0) * 10 +
  (linux_security_hosts_options_info{file="hosts.deny", service="ALL", host="ALL", action="deny"} or vector(0)) * 5 +
  (linux_security_last_patch_time{package_type!="unknown"} or vector(0)) * 5
)

Turn it into a Grafana dashboard panel for a quick view of which machines are non-compliant.
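
The weighting in the query above is plain arithmetic: each check contributes its weight when it passes and nothing when it fails. A small Python sketch makes that explicit. The weights are copied from the query; the check names are illustrative labels for its ten terms, and treating every check as a simple pass/fail is an assumption.

```python
from typing import Dict

# Weights taken from the PromQL expression; names are illustrative.
WEIGHTS = {
    "root_login_disabled": 20,
    "selinux_enforcing": 15,
    "firewall_enabled": 10,
    "firewall_running": 5,
    "pass_min_len_ok": 10,
    "pass_max_days_ok": 10,
    "xwindow_stopped": 5,
    "no_risky_services": 10,
    "hosts_deny_all": 5,
    "patch_time_known": 5,
}


def compliance_score(checks: Dict[str, bool]) -> int:
    """Sum the weights of all passing checks; a missing check counts as failed."""
    return sum(weight for name, weight in WEIGHTS.items() if checks.get(name, False))


# A host that passes everything except "firewall running".
example = {name: True for name in WEIGHTS}
example["firewall_running"] = False
print(compliance_score(example))
```

Counting a missing check as failed mirrors the role of `or vector(0)` in the query: an absent series should lower the score rather than make the whole result disappear.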

Technical Implementation

The exporter is pure Go, with prometheus/client_golang as its only third-party dependency. It avoids stitching shell commands together: wherever possible, security data is read directly from files under /proc and /etc, which keeps external command dependencies to a minimum.
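
Reading a config file directly is usually just a small parser. As an illustration, here is a minimal Python sketch that turns sshd_config text into key/value pairs of the kind the `info_key` and `info_value` labels carry (the exporter itself is written in Go, and `parse_sshd_config` is a hypothetical helper, not the project's code). It relies on two documented sshd_config behaviors: keywords are case-insensitive, and the first value obtained for a keyword is the one used. `Match` blocks are skipped for simplicity.

```python
from typing import Dict


def parse_sshd_config(text: str) -> Dict[str, str]:
    """Collect global keyword/value pairs, keeping the first value per keyword."""
    settings: Dict[str, str] = {}
    seen = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # keyword without a value
        key, value = parts
        if key.lower() == "match":
            break  # conditional blocks are out of scope for this sketch
        if key.lower() in seen:
            continue  # sshd uses the first value it obtains
        seen.add(key.lower())
        settings[key] = value.strip()
    return settings
```

Each resulting pair could then become one labeled sample, which is the shape the alert rules in this post match against.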

The architecture is straightforward:

text
cmd/security-exporter/main.go     # Entry point, HTTP Server
internal/collector/                # Prometheus Collector implementation
internal/system/                   # Security check modules
  ├── account_info.go              # Accounts
  ├── network_info.go              # Network
  ├── service_info.go              # Services
  ├── process_info.go              # Process version detection (largest file)
  ├── selinux_detail.go            # SELinux
  └── ...
pkg/config/                        # Configuration management
pkg/logger/                        # Logging

Each system module is independent; an error in one module doesn’t affect collection in others.
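
That isolation comes down to wrapping every module call in its own error handler. A minimal Python sketch of the pattern follows (the exporter itself is written in Go, where the equivalent would be checking each module's returned error); `collect_all` and the module callables are hypothetical names.

```python
import logging
from typing import Callable, Dict, List, Tuple

Metrics = Dict[str, float]


def collect_all(modules: Dict[str, Callable[[], Metrics]]) -> Tuple[Metrics, List[str]]:
    """Run every module; a failure is logged and recorded, never propagated."""
    merged: Metrics = {}
    failed: List[str] = []
    for name, collect in modules.items():
        try:
            merged.update(collect())
        except Exception as exc:  # isolate the failure to this one module
            logging.warning("module %s failed: %s", name, exc)
            failed.append(name)
    return merged, failed
```

Returning the list of failed modules alongside the metrics makes it possible to expose collection health as a metric of its own.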

Relationship with node_exporter

The two are complementary rather than competing. node_exporter handles basic OS metrics (CPU, memory, disk IO), while security-collector-exporter handles security configuration state. Running both gives you a complete view of system health and security compliance in your monitoring dashboards.

Project Repository

Code here: github.com/mickeyzzc/security-collector-exporter

v0.1.0 is the first stable release. It supports Linux on AMD64 and ARM64, and Docker images are published to GHCR. Future iterations will be driven by usage feedback. Feel free to file issues or submit PRs.