Implementing a Lightweight File Backup System Based on OpenResty

Introduction

Plenty of tools exist for moving files between machines: Filebeat for log collection, open-source cloud drives such as Seafile, simple distributed file systems such as FastDFS, and others. Even so, one enterprise SaaS platform still backed up its logs remotely the traditional way: compress the files, then scp them to backup storage.

None of the alternatives fit well. Filebeat's log collection completeness is debatable. Open-source cloud drives carry a clutter of features that makes them hard to manage. The internal storage layout of distributed file systems does not suit backup log management on this platform. And scp requires private keys to be deployed on servers, which is a security risk. We therefore used OpenResty to implement a simple file backup API in Lua.

Requirements Analysis: Remote Backup of Compressed Log Files

In enterprise-level SaaS platforms, remote backup of log files is a common requirement. For services deployed overseas in particular, the backup mechanism needs to provide:

  1. Efficient transfer: Able to handle large compressed file transfers
  2. Secure and reliable: Avoid insecure practices such as deploying SSH private keys on servers
  3. Easy to manage: Support categorization by hostname, file type, and other dimensions
  4. Verification mechanism: Ensure the integrity of transferred files
  5. Chunked support: Support chunked transfer of large files

Existing solutions each have their shortcomings:

  • Filebeat: Log collection completeness is debatable
  • Open-source cloud drives: Cluttered features, not conducive to management
  • Distributed file systems: Internal storage logic is not suitable for backup log management
  • SCP transfer: Security issues with private key deployment

Technology Selection: OpenResty + Lua

Based on the above requirements, we chose the OpenResty + Lua technology combination to implement the lightweight file backup system, for the following main reasons:

Advantages of OpenResty

  1. High performance: Based on Nginx’s event-driven architecture with excellent concurrency performance
  2. Lua scripting support: Can embed Lua code directly in Nginx without additional processes
  3. HTTP API: Native support for RESTful API, easy for client integration
  4. Simple configuration: Nginx configuration files can handle all configuration
  5. Easy deployment: A single OpenResty installation, no separate application server or complex dependencies

Technical Architecture

mermaid
flowchart TD
    A@{ shape: rounded, label: "Client" } -->|HTTP POST| B@{ shape: rounded, label: "OpenResty" }
    B --> C@{ shape: rounded, label: "Lua Script Processing" }
    C --> D@{ shape: rounded, label: "File Chunk Processing" }
    C --> E@{ shape: rounded, label: "MD5 Verification" }
    C --> F@{ shape: rounded, label: "Directory Creation" }
    C --> G@{ shape: cyl, label: "File Storage" }
    
    classDef primary fill:#e3f2fd,stroke:#1976d2
    classDef process fill:#f3e5f5,stroke:#9c27b0
    classDef storage fill:#e8f5e9,stroke:#4caf50
    class A,B,C,D,E,F primary
    class G storage

Core Implementation

1. File Upload API

We designed two main API endpoints:

/frontproxy Endpoint

Used for file backup of frontend proxy services:

nginx
location /frontproxy {
    set $fb_root_path '/mnt/data/frontproxy/';  # Root directory for this API's storage
    set $fb_chunk_size '8192';   # Set upload chunk size to 8k, default 4k
    set $fb_upload_time '30000'; # Set upload timeout to 30s, default 1s
    content_by_lua_file conf/lua/fbsimple/fbFromUpload.lua;
}

/mxproxy Endpoint

Used for file backup of mail proxy services:

nginx
location /mxproxy {
    set $fb_root_path '/mnt/data/mxproxy/';
    set $fb_chunk_size '8192';
    set $fb_upload_time '30000';
    content_by_lua_file conf/lua/fbsimple/fbFromUpload.lua;
}

2. Chunked Upload Support

The system processes large uploads in chunks. The fb_chunk_size variable, set per location, controls the chunk size in bytes:

nginx
set $fb_chunk_size '8192';   # Set upload chunk size to 8k, default 4k

3. Verification and Deduplication

The system supports MD5 verification to ensure file transfer integrity:

bash
# Client needs to provide MD5 checksum
curl http://example.com/mxproxy -X POST \
    -F 'data=@bin/bashrc/icmbash.sh' \
    -H 'src-hostname:test.example.com' \
    -H 'src-file-md5:cab8acb6d32e3b01be9e19efec57fc33' \
    -H "src-file-type:mxproxy"
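
To illustrate what the integrity check amounts to, here is a minimal shell sketch of the comparison. It can be used to re-check a stored file on the backup host. The helper name verify_md5 is hypothetical and md5sum from GNU coreutils is assumed; the server-side Lua implementation is not shown in this article and may differ.

```bash
#!/usr/bin/env bash
# verify_md5 FILE EXPECTED_HEX  (hypothetical helper, not part of the backup API)
# Exit status: 0 = digests match, 1 = mismatch, 2 = file not readable.
verify_md5() {
    local file="$1" expected="$2" actual
    [ -r "$file" ] || return 2
    # Read via stdin so md5sum prints "<digest>  -" regardless of the file name.
    actual=$(md5sum < "$file" | cut -d' ' -f1)
    # Hex digests are case-insensitive; normalise the expected value to lowercase.
    expected=$(printf '%s' "$expected" | tr 'A-F' 'a-f')
    [ "$actual" = "$expected" ]
}
```

For example, `verify_md5 "$stored_file" cab8acb6d32e3b01be9e19efec57fc33 && echo "intact"` prints "intact" only when the digest of the stored copy equals the value the client sent in src-file-md5.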

Deployment and Usage

Server-side Configuration

The complete server block of the OpenResty configuration:

nginx
server {
    listen       80;
    server_name  localhost;
    
    set $template_location "/usr/local/openresty/nginx/conf/lua";
    client_max_body_size 2048M;
    
    location / {
        root   html;
        index  index.html index.htm;
    }
    
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   html;
    }
    
    # File backup API - Frontend proxy
    location /frontproxy {
        set $fb_root_path '/mnt/data/frontproxy/';  # Root directory for this API's storage
        set $fb_chunk_size '8192';   # Set upload chunk size to 8k, default 4k
        set $fb_upload_time '30000'; # Set upload timeout to 30s, default 1s
        content_by_lua_file conf/lua/fbsimple/fbFromUpload.lua;
    }
    
    # File backup API - Mail proxy
    location /mxproxy {
        set $fb_root_path '/mnt/data/mxproxy/';
        set $fb_chunk_size '8192';
        set $fb_upload_time '30000';
        content_by_lua_file conf/lua/fbsimple/fbFromUpload.lua;
    }
}

Client Usage

Clients upload files with standard HTTP POST requests. The API creates directories under the endpoint's root directory according to the src-hostname and src-file-type headers and saves the file there. If neither header is provided, the file is saved in the endpoint's root directory.

Basic Upload Example

bash
curl http://backup-api.example.com/mxproxy -X POST \
    -F 'data=@bin/bashrc/icmbash.sh' \
    -H 'src-hostname:test.example.com' \
    -H 'src-file-md5:cab8acb6d32e3b01be9e19efec57fc33' \
    -H "src-file-type:mxproxy"

Batch File Upload Script

bash
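#!/bin/bash
# Upload a log file backup

BACKUP_URL="http://backup-api.example.com"
API_PATH="/mxproxy"
FILE_PATH="/var/log/app/backup.zip"

# Stop early if the archive is missing or unreadable
if [ ! -r "$FILE_PATH" ]; then
    echo "Cannot read file: $FILE_PATH" >&2
    exit 1
fi

# Calculate file MD5
MD5_SUM=$(md5sum "$FILE_PATH" | cut -d' ' -f1)
# HOSTNAME is set by bash itself, so use a separate variable name
SRC_HOSTNAME=$(hostname)
FILE_TYPE="app-backup"

# Upload file; --fail makes curl exit non-zero on an HTTP error status
if curl --fail "$BACKUP_URL$API_PATH" -X POST \
    -F "data=@$FILE_PATH" \
    -H "src-hostname:$SRC_HOSTNAME" \
    -H "src-file-md5:$MD5_SUM" \
    -H "src-file-type:$FILE_TYPE"; then
    echo "File upload completed: $FILE_PATH"
else
    echo "File upload failed: $FILE_PATH" >&2
    exit 1
fi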
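
The script above uploads a single archive. A loop over several archives could look like the following sketch. The function names upload_one and upload_all are hypothetical, the environment-variable defaults simply mirror the values used above, and the headers are the ones the API is described as reading.

```bash
#!/usr/bin/env bash
# Sketch: upload several archives and report which ones failed.
BACKUP_URL="${BACKUP_URL:-http://backup-api.example.com}"
API_PATH="${API_PATH:-/mxproxy}"
FILE_TYPE="${FILE_TYPE:-app-backup}"

# Upload one file; returns non-zero if the file is unreadable or curl fails.
upload_one() {
    local file="$1" md5
    [ -r "$file" ] || return 1
    md5=$(md5sum < "$file" | cut -d' ' -f1)
    # --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
    curl --fail --silent --show-error -X POST "$BACKUP_URL$API_PATH" \
        -F "data=@$file" \
        -H "src-hostname:$(hostname)" \
        -H "src-file-md5:$md5" \
        -H "src-file-type:$FILE_TYPE"
}

# Upload every file given as an argument; succeeds only if all uploads succeed.
upload_all() {
    local failed=0 file
    for file in "$@"; do
        if upload_one "$file"; then
            echo "uploaded: $file"
        else
            echo "FAILED: $file" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

A call such as `upload_all /var/log/app/*.zip` keeps going after a failed file, so one bad archive does not block the rest, and the exit status still tells a cron job or wrapper script that something needs attention.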

Storage Structure

The system automatically creates a storage directory structure based on parameters in the request headers:

/mnt/data/frontproxy/
├── hostname1/
│   ├── type1/
│   └── type2/
└── hostname2/
    ├── type1/
    └── type2/

/mnt/data/mxproxy/
├── hostname1/
│   ├── mxproxy/
│   └── transport/
└── hostname2/
    ├── mxproxy/
    └── transport/
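
The mapping from request headers to the directories shown above can be sketched in shell as follows. This only illustrates the layout and is not the Lua code that runs on the server. The function name storage_dir is hypothetical. Rejecting values that contain "/" or ".." is an assumption about sensible validation, and so is the behaviour when only one of the two headers is present; the article does not describe either.

```bash
#!/usr/bin/env bash
# storage_dir ROOT HOSTNAME TYPE  (hypothetical illustration of the directory layout)
# Prints the directory a file would be stored in; empty HOSTNAME/TYPE are skipped.
storage_dir() {
    local root="$1" host="$2" type="$3" part
    for part in "$host" "$type"; do
        case "$part" in
            # Assumed validation: refuse anything that could escape the root directory.
            */*|*..*) echo "invalid path component: $part" >&2; return 1 ;;
        esac
    done
    # ${var:+...} expands to nothing when the value is empty, so with no
    # headers the result is just the endpoint's root directory.
    printf '%s\n' "${root%/}/${host:+$host/}${type:+$type/}"
}
```

With the values from the curl example, `storage_dir /mnt/data/mxproxy/ test.example.com mxproxy` prints `/mnt/data/mxproxy/test.example.com/mxproxy/`.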

Technical Implementation Details

Lua Script Processing Flow

The core logic of the file backup API is implemented in the fbFromUpload.lua script:

lua
-- Pseudocode: File upload processing flow
function handle_upload(request)
    -- 1. Parse request parameters
    local hostname = request.headers['src-hostname']
    local md5 = request.headers['src-file-md5']
    local filetype = request.headers['src-file-type']
    local file_data = request.files['data']
    
    -- 2. Verify MD5 checksum
    if md5 and calculate_md5(file_data) ~= md5 then
        return 400, "MD5 verification failed"
    end
    
    -- 3. Determine storage path
    local storage_path = determine_path(hostname, filetype)
    
    -- 4. Create storage directory
    create_directory(storage_path)
    
    -- 5. Save file
    save_file(storage_path, file_data)
    
    -- 6. Return success response
    return 200, "File upload successful"
end

Error Handling

The system handles the following error conditions:

  1. MD5 verification failure: Returns HTTP 400 when the computed MD5 does not match the src-file-md5 header
  2. File too large: Requests larger than client_max_body_size are rejected by Nginx with 413 Request Entity Too Large
  3. Timeout handling: Handles upload timeouts based on fb_upload_time configuration
  4. Directory permissions: Checks and ensures storage directories are writable
  5. Disk space: Checks if remaining disk space is sufficient
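
The directory-permission and disk-space checks can also be run from the shell on the backup host, for example before a large transfer or from a cron job. The sketch below is such a pre-flight check, not the server's Lua logic. The function name can_store is hypothetical, and it assumes a df that supports the POSIX -P and -k options.

```bash
#!/usr/bin/env bash
# can_store DIR NEED_KB  (hypothetical pre-flight check)
# Succeeds if DIR is a writable directory with at least NEED_KB KiB available.
can_store() {
    local dir="$1" need_kb="$2" avail_kb
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "not a writable directory: $dir" >&2
        return 1
    fi
    # df -P prints one data line in POSIX format; with -k, column 4 is
    # the available space in KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    if [ "$avail_kb" -lt "$need_kb" ]; then
        echo "insufficient space in $dir: ${avail_kb} KiB available" >&2
        return 1
    fi
}
```

For instance, `can_store /mnt/data/mxproxy 2097152` checks for room for a 2 GiB upload, which matches the client_max_body_size of 2048M in the configuration.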

Performance Optimization

  1. Chunked upload: Large files are processed in chunks to avoid exhausting memory
  2. Streaming processing: Files are handled as a stream, which reduces memory usage
  3. Parallel processing: Supports concurrent uploads from multiple clients
  4. Cache optimization: Lua code caching (lua_code_cache, on by default) avoids recompiling the script on every request

Application Scenarios

1. Log File Backup

The system is primarily used for log file backup in enterprise-level SaaS platforms:

  • Application logs: Runtime logs of various service components
  • Access logs: Web server access logs
  • Error logs: System errors and exception logs
  • Audit logs: Security audit and operation logs

2. Configuration File Backup

  • Configuration version management: Periodic backup and version control of configuration files
  • Configuration synchronization: Synchronized backup of configuration files across multiple servers
  • Configuration recovery: Configuration file recovery during failures

3. Data File Backup

  • Temporary files: Temporary data files generated during processing
  • Export files: Data files exported by the system
  • Backup files: Database backup files, etc.

Monitoring and Maintenance

Log Monitoring

nginx
access_log /var/log/nginx/backup_access.log main;
error_log /var/log/nginx/backup_error.log;

Performance Monitoring

Key monitoring metrics:

  • Upload success rate: Proportion of uploads that succeed
  • Upload failure rate: Proportion of failed uploads
  • Average upload time: Average time taken for file uploads
  • Disk usage: Disk usage of storage directories
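
As a rough sketch, the upload success rate can be derived from the access log with awk. It assumes the stock Nginx "main" log format, in which the request method is the 6th whitespace-separated field (with a leading quote) and the status code is the 9th; a custom log_format would need different field numbers. The function name upload_success_rate is hypothetical.

```bash
#!/usr/bin/env bash
# upload_success_rate LOGFILE  (hypothetical helper)
# Prints the percentage of POST requests that returned a 2xx status,
# or "n/a" if the log contains no POST requests.
upload_success_rate() {
    awk '$6 == "\"POST" { total++; if ($9 ~ /^2/) ok++ }
         END {
             if (total) printf "%.1f\n", 100 * ok / total
             else print "n/a"
         }' "$1"
}
```

Running `upload_success_rate /var/log/nginx/backup_access.log` gives the success rate directly; the failure rate is 100 minus that value.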

Regular Maintenance

  1. Log rotation: Periodically rotate access logs and remove old ones
  2. Disk cleanup: Clean expired backup files
  3. Permission check: Verify file permission settings
  4. Backup verification: Periodically verify backup file integrity
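
Disk cleanup is easy to script with find. The sketch below lists expired backup files by default and removes them only when asked to. The function name purge_expired and the retention period are assumptions, and the -delete action requires GNU or BSD find.

```bash
#!/usr/bin/env bash
# purge_expired ROOT DAYS [delete]  (hypothetical helper)
# Prints regular files under ROOT last modified more than DAYS days ago.
# With "delete" as the third argument, the listed files are also removed.
purge_expired() {
    local root="$1" days="$2" mode="${3:-list}"
    if [ "$mode" = "delete" ]; then
        find "$root" -type f -mtime +"$days" -print -delete
    else
        find "$root" -type f -mtime +"$days" -print
    fi
}
```

A dry run such as `purge_expired /mnt/data/mxproxy 90` shows what would be removed; adding `delete` as a third argument performs the cleanup. Listing by default is a deliberate choice, so that a mistyped path or retention period does not destroy backups.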

Summary

The lightweight file backup system based on OpenResty has the following advantages:

Technical Advantages

  1. High performance: Based on Nginx’s event-driven architecture, supports high concurrency
  2. Lightweight: No dependencies beyond a single OpenResty installation
  3. Simple configuration: Nginx configuration files handle all configuration
  4. Easy integration: Standard HTTP API, convenient for client integration
  5. Cost-effective: Low resource usage, low operating cost

Business Value

  1. Improved efficiency: Automated file backup, reduced manual intervention
  2. Ensured security: Avoids insecure transfer methods
  3. Easy management: Structured storage and categorized management
  4. Scalability: Easy to extend to more service types
  5. Simple maintenance: Familiar technology stack, low maintenance cost

Applicable Scenarios

This system is particularly suitable for the following scenarios:

  • Small and medium enterprise file backup needs
  • SaaS platform log file collection
  • Distributed system configuration file synchronization
  • Centralized management of temporary files
  • Scenarios requiring high-performance file transfer

With this lightweight implementation, we met the file backup requirements of an enterprise-level SaaS platform while keeping the system simple and maintainable. The OpenResty + Lua approach can serve as a reference for similar file transfer and management problems.