Linux High Availability and Load Balancing in Practice — From Keepalived to Performance Tuning

June 19, 2018 · Linux · Tags: Linux, Keepalived, HAProxy, Load-Balancing, Performance

Introduction

High availability and load balancing are key technologies for keeping enterprise application architectures running reliably. This article covers dual-machine hot standby with Keepalived, load balancing of internal network services with HAProxy, and RPS/RFS tuning for high NIC soft interrupt load, using the configuration from an actual deployment.

Keepalived Dual-Machine Hot Standby Deployment

VRRP Principles

The Virtual Router Redundancy Protocol (VRRP) is a protocol used to achieve router high availability. In a VRRP architecture, multiple routers form a virtual router group, where one serves as the MASTER router handling actual network traffic, and the others serve as BACKUP routers. The MASTER sends periodic advertisements; when the BACKUP routers stop receiving them (after roughly three advertisement intervals), the BACKUP with the highest priority takes over, ensuring continuity of network services.

Core VRRP concepts include:

Virtual Router ID (VRID): Identifies a virtual router group
Virtual IP Address: The IP address provided by the virtual router to the outside
Priority: Determines the router’s role in the group; higher values indicate higher priority
Advertisement Interval: The time interval between heartbeat messages sent by routers
Authentication Mechanism: Ensures only authorized routers can join the virtual router group
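
The advertisement interval and the priority together determine how long a BACKUP waits before declaring the MASTER dead. The sketch below computes that wait using the VRRPv2 formula from RFC 3768 (master down interval = 3 × advertisement interval + skew time, with skew time = (256 − priority) / 256 seconds). The function name master_down_interval is made up for illustration, and the formula assumes VRRP version 2, which is keepalived's default.

```bash
# Print the VRRPv2 master-down interval in seconds.
# Arguments: advertisement interval (seconds), priority of the BACKUP router.
master_down_interval() {
    awk -v advert="$1" -v prio="$2" 'BEGIN {
        skew = (256 - prio) / 256            # higher priority => shorter skew
        printf "%.3f\n", 3 * advert + skew
    }'
}

# With advert_int 2, a BACKUP with priority 50 waits about 6.8 seconds
master_down_interval 2 50
```

Because the skew time shrinks as priority grows, the highest-priority BACKUP times out first and wins the election without any extra negotiation.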

In our architecture, multiple VRRP instances are configured to achieve high availability for different network segments:

mermaid
flowchart TD
    A@{ shape: rounded, label: "Virtual Router Group" } --> B@{ shape: rounded, label: "External Network VRRP Instance" }
    A --> C@{ shape: rounded, label: "Internal Network VRRP Instance" }
    B --> D@{ shape: rounded, label: "Server-1 Master" }
    B --> E@{ shape: rounded, label: "Server-2 Backup" }
    C --> F@{ shape: rounded, label: "Server-1 Master" }
    C --> G@{ shape: rounded, label: "Server-2 Backup" }
    
    D --> H@{ shape: hex, label: "Floating IP1: 192.168.1.66" }
    D --> I@{ shape: hex, label: "Floating IP2: 192.168.1.67" }
    E --> H
    E --> I
    F --> J@{ shape: hex, label: "Floating IP3: 192.168.2.12" }
    F --> K@{ shape: hex, label: "Floating IP4: 192.168.2.16" }
    G --> J
    G --> K
    
    classDef primary fill:#e3f2fd,stroke:#1976d2
    classDef network fill:#fff3e0,stroke:#ff9800
    classDef alert fill:#ffebee,stroke:#f44336
    class A,B,C primary
    class D,E,F,G network
    class H,I,J,K alert

Complete Configuration Details

Server-1 Configuration

bash
! Configuration File for keepalived
global_defs {
    router_id server-1
}

static_ipaddress {
    192.168.1.64/23 dev br0
    192.168.2.13 dev br1
}

static_routes {
    default via 192.168.1.1 dev br0
    192.168.2.14/32 via 192.168.2.13 src 192.168.2.13
    192.168.2.12/32 via 192.168.2.13 src 192.168.2.13
    192.168.2.0/16 via 192.168.2.64 dev br1
}

vrrp_instance EX_1 {
    state BACKUP 
    interface br1 
    mcast_src_ip 192.168.2.64 
    virtual_router_id 100 
    priority 100 
    advert_int 2 
    authentication { 
        auth_type PASS
        auth_pass YOUR_PASSWORD 
    }
    virtual_ipaddress { 
        192.168.1.66 dev br0
        192.168.1.67 dev br0
        192.168.1.68 dev br0
        192.168.1.69 dev br0
    }
}

vrrp_instance EX_2 {
    state BACKUP
    interface br1
    mcast_src_ip 192.168.2.64
    virtual_router_id 101
    priority 50
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass YOUR_PASSWORD
    }
    virtual_ipaddress {
        192.168.1.76 dev br0
        192.168.1.77 dev br0
        192.168.1.78 dev br0
        192.168.1.79 dev br0
    }
}

vrrp_instance INT_1 {
    state BACKUP
    interface br1
    mcast_src_ip 192.168.2.64
    virtual_router_id 102
    priority 100
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass YOUR_PASSWORD
    }
    virtual_ipaddress {
        192.168.2.12 dev br1
        192.168.2.16 dev br1
        192.168.2.17 dev br1
        192.168.2.18 dev br1
        192.168.2.19 dev br1
    }
}

vrrp_instance INT_2 {
    state BACKUP
    interface br1
    mcast_src_ip 192.168.2.64
    virtual_router_id 103
    priority 50
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass YOUR_PASSWORD
    }
    virtual_ipaddress {
        192.168.2.20 dev br1
        192.168.2.21 dev br1
        192.168.2.22 dev br1
        192.168.2.23 dev br1
    }
}

Server-2 Configuration

bash
! Configuration File for keepalived
global_defs {
    router_id server-2
}

static_ipaddress {
    192.168.1.65/23 dev br0
    192.168.2.14 dev br1
}

static_routes {
    default via 192.168.1.1 dev br0
    192.168.2.13/32 via 192.168.2.14 src 192.168.2.14
    192.168.2.12/32 via 192.168.2.14 src 192.168.2.14
    192.168.2.0/16 via 192.168.2.65 dev br1
}

vrrp_instance EX_1 {
    state BACKUP 
    interface br1
    mcast_src_ip 192.168.2.65
    virtual_router_id 100
    priority 50     
    advert_int 2  
    authentication { 
        auth_type PASS
        auth_pass YOUR_PASSWORD
    }
    virtual_ipaddress {  
        192.168.1.66 dev br0
        192.168.1.67 dev br0
        192.168.1.68 dev br0
        192.168.1.69 dev br0
    }
}

vrrp_instance EX_2 {
    state BACKUP
    interface br1
    mcast_src_ip 192.168.2.65
    virtual_router_id 101
    priority 100
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass YOUR_PASSWORD
    }
    virtual_ipaddress {
        192.168.1.76 dev br0
        192.168.1.77 dev br0
        192.168.1.78 dev br0
        192.168.1.79 dev br0
    }
}

vrrp_instance INT_1 {
    state BACKUP
    interface br1
    mcast_src_ip 192.168.2.65
    virtual_router_id 102
    priority 50
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass YOUR_PASSWORD
    }
    virtual_ipaddress {
        192.168.2.12 dev br1
        192.168.2.16 dev br1
        192.168.2.17 dev br1
        192.168.2.18 dev br1
        192.168.2.19 dev br1
    }
}

vrrp_instance INT_2 {
    state BACKUP
    interface br1
    mcast_src_ip 192.168.2.65
    virtual_router_id 103
    priority 100
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass YOUR_PASSWORD
    }
    virtual_ipaddress {
        192.168.2.20 dev br1
        192.168.2.21 dev br1
        192.168.2.22 dev br1
        192.168.2.23 dev br1
    }
}

Configuration Key Points

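Priority Settings: Server-1 has priority 100 (against 50 on Server-2) in EX_1 and INT_1, so it is the master for those instances; Server-2 has priority 100 (against 50 on Server-1) in EX_2 and INT_2, so it is the master for those. In normal operation each server carries half of the virtual IPs, which shares the load, and either server can take over all of them if the other fails.
Non-Preempt Mode: Every instance starts in state BACKUP, which keepalived requires before the nopreempt option can be used. With nopreempt added to an instance, a recovered higher-priority node does not immediately reclaim the virtual IPs, which avoids network flapping; without it, as in the configuration shown, the higher-priority node takes the virtual IPs back as soon as it recovers.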
Dual Network Architecture: Separately handles external network (EX instances) and internal network (INT instances) traffic, achieving network isolation and independent high availability.
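Authentication Mechanism: Uses simple password authentication (auth_type PASS). The password is carried in cleartext in every advertisement and keepalived uses only its first eight characters, so it protects against accidental misconfiguration rather than against an attacker on the same network segment.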
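
The priority layout described above is easy to get wrong when two configuration files are edited by hand. As a minimal sketch, the helper below lists each vrrp_instance with its priority so the two servers can be compared side by side. The function name vrrp_priorities is hypothetical, and the parsing assumes the one-directive-per-line layout used in the configurations shown here.

```bash
# Print "<instance> <priority>" for every vrrp_instance read from stdin.
vrrp_priorities() {
    awk '
        $1 == "vrrp_instance" { name = $2 }        # remember the current instance
        $1 == "priority"      { print name, $2 }   # report its priority
    '
}

# Example (the path is keepalived's usual default and is an assumption):
#   vrrp_priorities < /etc/keepalived/keepalived.conf
```

Running it on both servers and comparing the output should show 100 on one side wherever the other side shows 50.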

mermaid
flowchart TD
    A@{ shape: rounded, label: "Business System Client" } --> B@{ shape: rounded, label: "External Network" }
    B --> C@{ shape: hex, label: "Floating IP 192.168.1.66-69" }
    C --> D@{ shape: hex, label: "Keepalived VIP" }
    
    E@{ shape: rounded, label: "Internal Business System" } --> F@{ shape: rounded, label: "Internal Network" }
    F --> G@{ shape: hex, label: "Floating IP 192.168.2.12-23" }
    G --> H@{ shape: hex, label: "Keepalived VIP" }
    
    D --> I@{ shape: rounded, label: "Server-1/Server-2" }
    H --> I
    
    I --> J@{ shape: rounded, label: "Business System Services" }
    
    classDef primary fill:#e3f2fd,stroke:#1976d2
    classDef network fill:#fff3e0,stroke:#ff9800
    classDef alert fill:#ffebee,stroke:#f44336
    class A,E,B,F,I,J primary
    class C,G,D,H alert

HAProxy Internal Network Service Load Balancing

TCP Mode Load Balancing

Deployment Architecture

For the core services of the business system (the antivirus service and the delivery service, plus the anti-spam engine), we deployed HAProxy on Server-1 and Server-2 for load balancing. The advantages of this architecture include:

Service Availability: The failure of a single gateway does not affect the client business systems behind it
Load Distribution: Multiple servers share the service load
Transparent Failover: Clients connect to one address and do not need to know which backend servers are currently in service

mermaid
flowchart TD
    A@{ shape: rounded, label: "Client System" } --> B@{ shape: hex, label: "HAProxy Load Balancer" }
    B --> C@{ shape: rounded, label: "Server-1" }
    B --> D@{ shape: rounded, label: "Server-2" }
    B --> E@{ shape: rounded, label: "Server-3" }
    B --> F@{ shape: rounded, label: "Server-4" }
    
    C --> G@{ shape: rounded, label: "Antivirus 6600" }
    D --> G
    E --> G
    F --> G
    
    C --> H@{ shape: rounded, label: "Delivery 8025" }
    D --> H
    E --> H
    F --> H
    
    C --> I@{ shape: rounded, label: "Anti-Spam 8070" }
    D --> I
    
    classDef primary fill:#e3f2fd,stroke:#1976d2
    classDef network fill:#fff3e0,stroke:#ff9800
    classDef process fill:#f3e5f5,stroke:#9c27b0
    class A,B network
    class C,D,E,F primary
    class G,H,I process

HAProxy Configuration Details

bash
#---------------------------------------------------------------------
# Antivirus service load balancing configuration
#---------------------------------------------------------------------
listen kill-virus-service
    bind *:6600
    mode tcp
    balance     roundrobin
    server  server-1-antivirus 192.168.1.64:6600 weight 1 maxconn 10000 check inter 10s  
    server  server-2-antivirus 192.168.1.65:6600 weight 1 maxconn 10000 check inter 10s  
    server  server-3-antivirus 192.168.1.66:6600 weight 1 maxconn 10000 check inter 10s  
    server  server-4-antivirus 192.168.1.67:6600 weight 1 maxconn 10000 check inter 10s  
    server  server-5-antivirus 192.168.1.69:6600 weight 1 maxconn 10000 check inter 10s  
    server  server-6-antivirus 192.168.1.71:6600 weight 1 maxconn 10000 check inter 10s  
    server  server-7-antivirus 192.168.1.73:6600 weight 1 maxconn 10000 check inter 10s  

#---------------------------------------------------------------------
# Delivery service load balancing configuration
#---------------------------------------------------------------------
listen delivery-service
    bind *:8025
    mode tcp
    balance     roundrobin
    server  server-1-smtp 192.168.1.64:8025 weight 1 maxconn 10000 check inter 10s  
    server  server-2-smtp 192.168.1.65:8025 weight 1 maxconn 10000 check inter 10s  
    server  server-3-smtp 192.168.1.69:8025 weight 1 maxconn 10000 check inter 10s  
    server  server-4-smtp 192.168.1.73:8025 weight 1 maxconn 10000 check inter 10s  

#---------------------------------------------------------------------
# Anti-spam engine service load balancing configuration
#---------------------------------------------------------------------
listen cac-service
    bind *:8070
    mode tcp
    balance     roundrobin
    server  server-1-cac 192.168.1.68:8070 weight 1 maxconn 20000 check inter 10s

Health Checks and Statistics

Health Check Mechanism

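The server lines above combine health checking with per-server traffic controls:

Connection Check: Enabled via the check parameter; in TCP mode HAProxy simply attempts a TCP connection to the server
Check Interval: inter 10s means checking every 10 seconds; with the default fall 3, a server is marked down after three consecutive failed checks, roughly 30 seconds
Maximum Connections: maxconn limits the number of concurrent connections per server
Weight Settings: The weight parameter sets each server's share of traffic; with equal weights, roundrobin spreads connections evenly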
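
To reproduce by hand what a TCP-mode check does, the following sketch attempts a plain TCP connection and reports success or failure through its exit status. The function name tcp_check and the 2-second limit are assumptions made for illustration; it relies on bash's /dev/tcp pseudo-device and the coreutils timeout command, and it is not how HAProxy implements its checks internally.

```bash
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
tcp_check() {
    host=$1
    port=$2
    # Opening /dev/tcp/<host>/<port> makes bash connect; the descriptor is
    # closed again when the child shell exits.
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example against one of the backends from the configuration above:
#   tcp_check 192.168.1.64 6600 && echo "antivirus backend reachable"
```

A check like this only proves that the port accepts connections, which is also the limit of what check verifies in TCP mode unless a protocol-specific check is configured.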

Statistics Feature

HAProxy has a built-in web statistics interface that can be enabled through configuration:

bash
listen stats
    bind *:8404
    mode http
    stats enable
    stats uri /stats
    stats refresh 30s
    stats auth admin:YOUR_PASSWORD
    stats realm HAProxy\ Statistics
    stats hide-version

Status Monitoring

Monitor HAProxy status with the following commands:

bash
# Check HAProxy process status
systemctl status haproxy

# View HAProxy statistics (the stats page requires the credentials set with stats auth)
curl -u admin:YOUR_PASSWORD http://192.168.1.100:8404/stats

# Check connection status
ss -tulnp | grep haproxy

# View logs
tail -f /var/log/haproxy.log
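
For scripted monitoring, the statistics page can also be fetched as CSV by appending ;csv to the stats URI. The sketch below reads that CSV from stdin and prints every server whose status is not UP. The function name haproxy_not_up is hypothetical; the status column is located by its header name rather than by a fixed position, on the assumption that the first line is the usual "# pxname,svname,..." header.

```bash
# Print "<proxy>/<server> <status>" for servers that are not UP.
haproxy_not_up() {
    awk -F, '
        NR == 1 {                                   # header row: find the status column
            for (i = 1; i <= NF; i++) if ($i == "status") col = i
            next
        }
        # skip the per-proxy FRONTEND/BACKEND summary rows
        col && $2 != "FRONTEND" && $2 != "BACKEND" && $col !~ /^(UP|OPEN|no check)/ {
            print $1 "/" $2, $col
        }
    '
}

# Example, reusing the address and credentials from the commands above:
#   curl -s -u admin:YOUR_PASSWORD "http://192.168.1.100:8404/stats;csv" | haproxy_not_up
```

Empty output means every checked server is up, which makes the function easy to wire into a cron job or an alerting script.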

Network Performance Tuning: NIC Soft Interrupt Optimization

Hard Interrupts and Soft Interrupts Principles

What is an Interrupt?

An interrupt is a signal that makes the CPU suspend its current work and run a handler. It is triggered either by an asynchronous signal from peripheral hardware (anything other than the CPU and memory) or by a synchronous signal from software. Issuing such a signal is called an interrupt request (IRQ).

Difference Between Hard Interrupts and Soft Interrupts

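Hard Interrupts:

Asynchronous signals sent by peripheral hardware to the CPU
Delivered through an interrupt controller
Handled quickly: the handler does the minimum work needed and defers the rest
Can be masked by setting the CPU’s mask bit

Soft Interrupts (softirqs, in the Linux sense used in this article):

Deferred work raised by the kernel itself, most often by a hard interrupt handler
Run with hardware interrupts enabled, by default on the same CPU that handled the hard interrupt
Carry the bulk of network receive processing, which is why a busy NIC shows up as soft interrupt CPU time
Handed to the per-CPU ksoftirqd kernel thread under sustained load
Not to be confused with the software interrupts (traps) that a program raises with a CPU instruction, for example to make a system call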

Interrupt Processing Flow

mermaid
flowchart TD
    A@{ shape: hex, label: "Hardware Event" } --> B@{ shape: rounded, label: "Hard Interrupt Triggered" }
    B --> C@{ shape: rounded, label: "Save CPU Context" }
    C --> D@{ shape: rounded, label: "Execute Hard Interrupt Handler" }
    D --> E@{ shape: rounded, label: "Trigger Soft Interrupt" }
    E --> F@{ shape: rounded, label: "Soft Interrupt Processing" }
    F --> G@{ shape: rounded, label: "Restore CPU Context" }
    G --> H@{ shape: stadium, label: "Return to Original Program" }
    
    classDef primary fill:#e3f2fd,stroke:#1976d2
    classDef process fill:#f3e5f5,stroke:#9c27b0
    classDef network fill:#fff3e0,stroke:#ff9800
    class A network
    class B,C,D,E,F,G primary
    class H process

Problem Diagnosis Process

Symptom Identification

The business gateway experienced network packet loss during peak hours, with soft interrupt CPU time on CPU0 (the %soft column in mpstat) reaching 90%, indicating that network receive processing on a single core had become the system bottleneck.

Monitoring Tools Usage

Viewing Interrupt Distribution:

bash
# View interrupt statistics per CPU
head -20 /proc/interrupts

# View the NIC's interrupt distribution per CPU (header row plus eth0 rows)
grep -E "CPU0|eth0" /proc/interrupts

# Use itop for real-time monitoring
itop

# View system load
uptime

# View network statistics
netstat -s

Detailed Analysis Steps:

Identify the Problem CPU: Use /proc/interrupts to see which CPU handles the most interrupts
Locate the Interrupt Source: Analyze which NIC or device is generating high interrupts
Analyze Network Traffic: Use tools like iftop, nethogs to view traffic patterns
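Check Drivers: Confirm whether the NIC and its driver support multiple receive queues (for example with ethtool -l eth0); a single-queue NIC delivers all of its receive interrupts to one CPU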
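
The first two steps can be sketched as a small helper that sums, per CPU, the interrupt counts of every /proc/interrupts row mentioning a given device. The function name irq_per_cpu is made up for illustration, and the substring match is deliberately simple (a device named eth0 would also match veth0).

```bash
# Sum interrupt counts per CPU for rows that mention the given device.
# Reads /proc/interrupts-formatted text from stdin.
irq_per_cpu() {
    awk -v dev="$1" '
        NR == 1 { ncpu = NF; next }                  # header row: CPU0 CPU1 ...
        index($0, dev) {                             # row belongs to the device
            for (i = 1; i <= ncpu; i++) total[i] += $(i + 1)
        }
        END { for (i = 1; i <= ncpu; i++) printf "CPU%d %d\n", i - 1, total[i] }
    '
}

# Example:
#   irq_per_cpu eth0 < /proc/interrupts
```

If one CPU's total dwarfs the others, the NIC's interrupts are concentrated on that core, which matches the symptom described above.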

RPS/RFS Optimization Solution

RPS (Receive Packet Steering)

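RPS distributes the protocol processing of received packets across multiple CPU cores in software: the kernel hashes each packet's flow and queues the packet to one of the allowed CPUs, so a single CPU no longer carries the whole soft interrupt load of a receive queue.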

Configuration Method:

bash
# View per-interface traffic and drop counters
grep -E "eth[0-9]+" /proc/net/dev

# Set RPS: allow CPUs 0-31 to process packets received on queue rx-0
echo ffffffff > /sys/class/net/eth0/queues/rx-0/rps_cpus

# Verify configuration
cat /sys/class/net/eth0/queues/rx-0/rps_cpus

# View per-CPU softirq packet statistics (one row per CPU, hexadecimal counters)
cat /proc/net/softnet_stat
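
The value written to rps_cpus is a hexadecimal bitmap with one bit per CPU, so ffffffff covers CPUs 0 to 31. On hosts with a different CPU count it can be cleaner to derive the mask from the number of CPUs. The sketch below does that; the function name rps_mask is hypothetical, and it assumes the kernel's usual bitmap notation of comma-separated 32-bit groups with the most significant group first.

```bash
# Print a hex CPU mask with the lowest <n> bits set, e.g. 4 -> f, 40 -> ff,ffffffff
rps_mask() {
    n=$1
    mask=""
    while [ "$n" -gt 0 ]; do
        if [ "$n" -ge 32 ]; then
            group="ffffffff"                         # a full group of 32 CPUs
            n=$((n - 32))
        else
            group=$(printf '%x' $(( (1 << n) - 1 ))) # remaining CPUs
            n=0
        fi
        # later groups cover higher-numbered CPUs, so they go in front
        if [ -z "$mask" ]; then mask="$group"; else mask="$group,$mask"; fi
    done
    printf '%s\n' "$mask"
}

# Example (run as root):
#   rps_mask "$(nproc)" > /sys/class/net/eth0/queues/rx-0/rps_cpus
```

Deriving the mask this way keeps it within the CPUs the host actually has, which avoids relying on how a particular kernel treats bits for CPUs that do not exist.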

RFS (Receive Flow Steering)

RFS extends RPS: instead of choosing a CPU from the flow hash alone, it steers the packets of each network flow to the CPU on which the application consuming that flow is running, improving cache hit rates.

Configuration Method:

bash
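# Set the size of the global RFS socket flow table
echo 32768 > /proc/sys/net/core/rps_sock_flow_entries

# Set the per-queue flow count (global table size divided by the number of receive queues; a single queue is assumed here)
echo 32768 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt

# Verify RFS configuration
cat /proc/sys/net/core/rps_sock_flow_entries
cat /sys/class/net/eth0/queues/rx-0/rps_flow_cnt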
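
On a multi-queue NIC, the kernel's scaling documentation suggests giving each receive queue an equal share of the global flow table. The sketch below counts the rx-* queue directories and prints that share. The function name rfs_per_queue is made up for illustration, and the queue directory is passed in as an argument so the helper can be tried without touching /sys.

```bash
# Print <total flow entries> divided by the number of rx-* queues in <dir>.
rfs_per_queue() {
    queues_dir=$1
    total=$2
    count=0
    for q in "$queues_dir"/rx-*; do
        if [ -d "$q" ]; then
            count=$((count + 1))
        fi
    done
    [ "$count" -gt 0 ] || return 1                   # no receive queues found
    echo $((total / count))
}

# Example:
#   rfs_per_queue /sys/class/net/eth0/queues 32768
```

With four receive queues and a table of 32768 entries, each queue's rps_flow_cnt would be set to 8192.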

Comprehensive Optimization Configuration

Optimization Script Example:

bash
#!/bin/bash
# Apply RPS/RFS and related network tuning to one NIC. Run as root.

# NIC name
interface="eth0"
cpu_mask="ffffffff"        # one bit per CPU allowed to process received packets
flow_entries=32768         # size of the global RFS socket flow table

# All receive queues of the NIC (rx-0, rx-1, ...)
queues=(/sys/class/net/"$interface"/queues/rx-*)

# Enable RPS on every receive queue
for q in "${queues[@]}"; do
    echo "$cpu_mask" > "$q/rps_cpus"
done

# Enable RFS: size the global table, then split it across the receive queues
echo "$flow_entries" > /proc/sys/net/core/rps_sock_flow_entries
for q in "${queues[@]}"; do
    echo $((flow_entries / ${#queues[@]})) > "$q/rps_flow_cnt"
done

# Interrupt coalescing is a driver setting changed with ethtool, not through
# rps_flow_cnt; adaptive-rx support depends on the NIC driver
ethtool -C "$interface" adaptive-rx on || echo "adaptive-rx not supported on $interface"

# Adjust the per-CPU input backlog queue length (1000 is the kernel default;
# raise it if the second column of /proc/net/softnet_stat keeps growing)
echo 1000 > /proc/sys/net/core/netdev_max_backlog

# Adjust the maximum socket buffer sizes
echo 65536 > /proc/sys/net/core/rmem_max
echo 65536 > /proc/sys/net/core/wmem_max

# Enable TCP BBR congestion control (kernel 4.9+): the module is tcp_bbr,
# the algorithm name is bbr
modprobe tcp_bbr
echo bbr > /proc/sys/net/ipv4/tcp_congestion_control

# Verify configuration
echo "RPS Configuration:"
cat "${queues[0]}/rps_cpus"
echo "RFS Configuration:"
cat "${queues[0]}/rps_flow_cnt"

Optimization Results Verification

mermaid
flowchart TD
    A@{ shape: rounded, label: "Before Optimization" } --> B@{ shape: rounded, label: "Soft IRQ 90%" }
    A --> C@{ shape: rounded, label: "Packet Loss" }
    A --> D@{ shape: rounded, label: "Single CPU Overload" }
    
    E@{ shape: rounded, label: "After Optimization" } --> F@{ shape: rounded, label: "Soft IRQ 30%" }
    E --> G@{ shape: rounded, label: "Zero Packet Loss" }
    E --> H@{ shape: rounded, label: "Multi-CPU Load Balance" }
    
    B --> I@{ shape: rounded, label: "RPS/RFS Optimization" }
    C --> I
    D --> I
    
    I --> F
    I --> G
    I --> H
    
    classDef primary fill:#e3f2fd,stroke:#1976d2
    classDef alert fill:#ffebee,stroke:#f44336
    classDef success fill:#e8f5e9,stroke:#4caf50
    classDef process fill:#f3e5f5,stroke:#9c27b0
    class A,B,C,D alert
    class E,F,G,H success
    class I process

Verification Commands:

bash
# Monitor interrupt usage
watch -n 1 "grep eth0 /proc/interrupts"

# Monitor per-CPU usage (watch the %soft column)
mpstat -P ALL 1 5

# Monitor network performance
ping -c 100 192.168.1.1

# Check packet loss
netstat -s | grep -iE "drop|error"
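
Packet drops in the receive path also show up in /proc/net/softnet_stat, whose counters are hexadecimal and hard to read at a glance. The sketch below converts the first three columns (packets processed, packets dropped because the backlog queue was full, and times the softirq budget ran out) to decimal. The function name softnet_summary is hypothetical, and the row number is used as the CPU number, which assumes all CPUs are online.

```bash
# Convert /proc/net/softnet_stat rows (read from stdin) to decimal counters.
softnet_summary() {
    cpu=0
    while read -r processed dropped squeezed _rest; do
        # each column is a hexadecimal number without a 0x prefix
        echo "cpu$cpu processed=$((0x$processed)) dropped=$((0x$dropped)) squeezed=$((0x$squeezed))"
        cpu=$((cpu + 1))
    done
}

# Example:
#   softnet_summary < /proc/net/softnet_stat
```

After the tuning, the processed counts should grow on several CPUs rather than on CPU0 alone, and the dropped column should stop increasing.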

Conclusion

This article has detailed practical solutions for Linux high availability and load balancing, covering the following key points:

1. Keepalived Dual-Machine Hot Standby Architecture

High-availability network architecture achieved through VRRP protocol, with key configuration points including:

Multi-instance design for separate external and internal network traffic handling
Priority settings that make each server the master for half of the instances, sharing load in normal operation
Non-preempt mode (nopreempt) as an option to avoid network flapping
Authentication mechanisms for secure communication

2. HAProxy Load Balancing Implementation

High-availability load balancing for core business system services:

TCP mode load balancing ensures service availability
Health check mechanisms automatically remove failed nodes
Statistics features facilitate operations monitoring
Unified multi-service management simplifies configuration

3. Network Performance Optimization Practices

Solutions for excessive soft interrupt issues:

RPS technology for multi-CPU distribution of received packets
RFS technology for optimized network flow processing
Comprehensive configuration to resolve network bottlenecks
Real-time monitoring to verify optimization results

4. Operations Recommendations

Regular Monitoring: Establish a comprehensive monitoring system to detect system bottlenecks in a timely manner
Capacity Planning: Plan capacity expansion in advance based on business growth forecasts
Failure Drills: Conduct regular failover drills to ensure high-availability mechanisms are effective
Documentation Maintenance: Keep configuration documents up to date for troubleshooting and team collaboration