MiBeeNvr v0.6.0: Timelapse + Transcoding UI + ONVIF Enhancements + Documentation Restructure

After running continuous recording for a few weeks, storage became the first bottleneck. A single 1080p camera writes tens of GB per day; with a 30-day retention policy, a 1 TB hard drive is mostly consumed. Many community members reported the same issue, and in the discussions timelapse and transcoded storage gained the most traction: most of the time the picture is static, and compressing it with timelapse requires only 5% of the space for the same duration.

v0.6.0 took shape from these discussions. The timelapse pipeline forms a closed loop from configuration through frame extraction to rolling merge; the transcoding system supports conversion between H.264 and H.265 (HEVC) and backfilling of historical data; to accommodate both old and new hardware, the transcoding engine auto-detects V4L2/VAAPI/NVENC encoders at startup. If no hardware encoder is found, the feature is disabled, leaving the Raspberry Pi 3B unaffected while the more powerful Banana Pi M5 (RK3588) can fully utilize H.265 hardware encoding.

This release is also laying groundwork for the next phase. The internal/ai/ directory sets up the basic framework for the ONNX Runtime inference engine — decoupling CGO dependencies through subprocesses, with the YOLOv11n model ready, only a feature flag away from enabling real-time object detection. On the observability front, Prometheus metrics and VictoriaLogs remote logging have been introduced, allowing community issues to be investigated through metrics and structured logs rather than guesswork. The player now implements freeze detection by monitoring frame timestamps for stalls, combined with H.265 SPS patching, LL-HLS configuration, and WebRTC connection tracking, significantly improving playback experience.

Before release, all these new features were thoroughly validated against real camera environments — the related camera test projects were adapted accordingly, details in camera-test-machines. See the full changelog at GitHub Release Notes.

Timelapse Recording Pipeline

Architecturally, the timelapse pipeline is not a simple timed screenshot, but a multi-stage workflow:

mermaid
flowchart TB
    RTSP["RTSP Source<br/>h264 / h265"] --> RC
    MJPEG["MJPEG Source<br/>jpeg frames"] --> RC["Timelapse<br/>Recorder"]
    RC --> DD["Frame Detection<br/>Skip static frames"]
    DD --> SQ["JPEG Sequence<br/>Directory"]

    classDef source fill:#E3F2FD,stroke:#1565C0,color:#1565C0
    classDef rec fill:#FFF3E0,stroke:#E65100,color:#BF360C
    classDef store fill:#E8F5E9,stroke:#2E7D32,color:#1B5E20
    class RTSP,MJPEG source
    class RC,DD rec
    class SQ store

The JPEG sequence feeds into two merge paths and one direct playback path:

mermaid
flowchart TB
    SQ["JPEG Sequence<br/>Directory"] --> PL["JPEG Playlist<br/>Lazy Load"] --> JP["JPEG Player"]
    SQ --> MF["Go / FFmpeg<br/>Merge"] --> MV["Composited Video<br/>Playback"]

    classDef store fill:#E8F5E9,stroke:#2E7D32,color:#1B5E20
    classDef merge fill:#F3E5F5,stroke:#9C27B0,color:#6A1B9A
    classDef view fill:#FFEBEE,stroke:#C62828,color:#C62828
    class SQ,PL store
    class MF merge
    class JP,MV view

Frame Extraction Strategy

The Timelapse Recorder extracts JPEG frames from RTSP or MJPEG sources at configurable intervals. Core logic in internal/recorder/timelapse.go:

Frame extraction interval: config.timelapse.interval  (default 30 seconds)
Extraction window:        interval ± 5% random jitter (avoid simultaneous extraction across cameras)
Skip strategy:            Skip when N consecutive frames are static (save storage)

The 5% random jitter on the interval prevents multiple cameras from requesting frame extraction at the same moment — the DESCRIBE/PLAY session setup for RTSP sources has overhead, and concurrent requests would cause instantaneous CPU spikes.

The static frame skip detection reuses the health module’s freeze detection (internal/health/quality.go), comparing histogram differences between adjacent frames to determine whether the picture has changed. When static frames are detected for 3 consecutive extraction cycles, subsequent extractions are skipped until the picture changes again.
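
A rough sketch of histogram-based skip logic follows. It assumes frames are already decoded to 8-bit luma samples; the names histogramDiff and staticSkipper, the threshold value, and the choice of a normalized L1 distance are all assumptions and may differ from what internal/health/quality.go does.

```go
package main

import "fmt"

// histogramDiff returns the normalized L1 distance between the luma
// histograms of two equally sized frames: 0 = identical, 1 = disjoint.
func histogramDiff(a, b []uint8) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 1 // treat missing or mismatched frames as "changed"
	}
	var ha, hb [256]int
	for i := range a {
		ha[a[i]]++
		hb[b[i]]++
	}
	sum := 0
	for i := 0; i < 256; i++ {
		d := ha[i] - hb[i]
		if d < 0 {
			d = -d
		}
		sum += d
	}
	return float64(sum) / float64(2*len(a))
}

// staticSkipper (hypothetical) counts consecutive static extraction cycles.
type staticSkipper struct {
	threshold   float64 // differences below this count as static
	limit       int     // static cycles tolerated before skipping starts
	staticCount int
}

// ShouldSkip reports whether the current extraction should be dropped.
func (s *staticSkipper) ShouldSkip(prev, cur []uint8) bool {
	if histogramDiff(prev, cur) < s.threshold {
		s.staticCount++
	} else {
		s.staticCount = 0 // the picture changed, resume extraction
	}
	return s.staticCount > s.limit
}

func main() {
	s := &staticSkipper{threshold: 0.01, limit: 3}
	frame := make([]uint8, 64)
	for cycle := 1; cycle <= 5; cycle++ {
		fmt.Println(cycle, s.ShouldSkip(frame, frame))
	}
}
```

Here the first three static cycles are still stored and skipping starts with the fourth, which is one reading of "3 consecutive extraction cycles".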

Rolling Merge Race Condition Handling

Rolling Merge is the most complex part of timelapse. RollingMergeManager in internal/timelapse/rolling.go maintains one active merge goroutine per camera:

go
type RollingMergeManager struct {
    mu      sync.Mutex
    merger  TimelapseMerger
    active  map[string]*activeEntry  // cameraID → merge goroutine
}

type activeEntry struct {
    cancel context.CancelFunc
    id     uint64
}

When a new segment completes, StartSegmentMerge() cancels the old merge goroutine (if any) before starting a new one. But there’s a race condition: the old goroutine might be executing Merge(), and after cancellation it exits and runs defer cleanup, while at the same time the new goroutine attempts to write to the same output file.

The fix is the entry.id == ownID check in runMerge()’s cleanup:

go
func (r *RollingMergeManager) runMerge(ctx context.Context, ownID uint64, cameraID, ...) {
    defer func() {
        r.mu.Lock()
        if entry, ok := r.active[cameraID]; ok && entry.id == ownID {
            delete(r.active, cameraID)
        }
        r.mu.Unlock()
    }()
    // ...
}

Each merge goroutine holds the ownID from its creation time. During cleanup, it only deletes the entry if this goroutine’s ID is still the active entry for that camera. If it has been replaced, the new goroutine has already taken over, and the old goroutine simply exits without removing the new goroutine’s entry.

Go Native Merge vs FFmpeg Merge

| Dimension | Go Native Merge | FFmpeg Merge |
| --- | --- | --- |
| Dependencies | None | Requires FFmpeg |
| Merge Method | JPEG sequence directly muxed to MP4 | Re-encoded to H.264 |
| Output File Size | Larger (raw JPEG size) | Smaller (H.264 compression) |
| CPU Overhead | Low (mux only) | High (encoding) |
| Use Case | Real-time preview, temporary merge | Final archive, long-term storage |

Go’s merge implementation (internal/timelapse/go_merge.go) muxes the JPEG frames directly into the MP4 container, with every frame stored as an independently decodable keyframe. This approach requires no encoding or decoding, so CPU overhead is extremely low (~5% CPU on a Raspberry Pi 3B for real-time merging of 10 1080p JPEGs), but the file size is large.

FFmpeg merge (internal/timelapse/ffmpeg_merge.go) re-encodes the JPEG sequence to H.264, reducing file size by 5-10x, which makes it suitable for long-term archiving. Daily merge (internal/timelapse/daily.go) triggers at a fixed time each day, merging all JPEG sequences from the past 24 hours into a single MP4 file.

Transcoding UI: Three-Wave Delivery

v0.6.0’s transcoding UI was delivered in three waves, each corresponding to an independent frontend page and set of backend APIs:

Wave 1: Task Queue and Auto-Enqueue

Basic architecture: DB-backed task queue (internal/transcoding/queue.go). When recording completes, transcoding tasks are automatically created via the event bus and written to SQLite’s transcoding_jobs table.

Recording complete → event.TranscodingRequired → transcoding.Manager.Enqueue()
    → Write to SQLite → goroutine consumes queue → Execute FFmpeg
    → Update status (pending/running/done/failed)

Wave 2: Polling and Retry

The frontend polls /api/transcoding/jobs every 3 seconds for the task list. Failed jobs show a “Retry” button that calls POST /api/transcoding/jobs/:id/retry, resetting the status to pending for the consumer goroutine to reprocess.

Wave 3: Backfill and History Management

Backfill is the most practical feature: select a range of historical recording dates and batch-create transcoding tasks. A frontend dialog allows selecting the target codec:

json
POST /api/transcoding/backfill
{
  "camera_id": "front-door",
  "date_from": "2026-05-01",
  "date_to": "2026-06-01",
  "target_codec": "h264",        // h264 / hevc / mjpeg
  "replace_original": false
}

The backend scans the recording files within the specified date range and creates transcoding tasks one by one. For ARM platforms (e.g., Raspberry Pi), if no hardware encoder is detected (/dev/dri/renderD128 or Video4Linux encoding node), it automatically degrades to software encoding (libx264).

History management supports paginated cleanup — DELETE /api/transcoding/history?page_size=50&page=1, avoiding SQLite WAL file bloat from cleaning up large numbers of records at once.

ONVIF Enhancements

Raw SOAP Fallback

Some ONVIF cameras respond to GetUsers operations in ways that don’t match the standard library’s parsing expectations — the returned XML namespace prefixes are inconsistent, or the security header format differs. v0.6.0 adds raw SOAP fallback:

mermaid
sequenceDiagram
    participant NVR as MiBeeNvr (Client)
    participant CAM as ONVIF Camera

    NVR->>CAM: SOAP GetUsers (via onvif-go lib)
    alt Standard response
        CAM-->>NVR: Users list
    else Namespace mismatch / Parse error
        NVR->>CAM: Raw SOAP GetUsers (custom XML)
        Note over NVR: WS-Security PasswordDigest<br/>manually computed
        CAM-->>NVR: Raw XML response
        NVR->>NVR: XPath direct parsing
    end

The WS-Security PasswordDigest calculation:

xml
<wsse:Password Type="http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordDigest">
    Base64(SHA1(Nonce + Created + Password))
</wsse:Password>

The onvif-go library computes the digest with its own internal function, but some older cameras require specific Nonce encoding formats (hex vs base64), which the library doesn’t expose. Raw SOAP fallback allows manual control of XML construction details.
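
For reference, the digest formula can be computed with the Go standard library alone. The sketch below follows the WS-Security UsernameToken definition, where the nonce enters the hash as raw bytes rather than in its transmitted encoding; the function name passwordDigest and the sample credentials are hypothetical.

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// passwordDigest computes Base64(SHA1(nonce + created + password)).
// nonce must be the raw nonce bytes, not their base64 or hex text form.
func passwordDigest(nonce []byte, created, password string) string {
	h := sha1.New()
	h.Write(nonce)
	h.Write([]byte(created))
	h.Write([]byte(password))
	return base64.StdEncoding.EncodeToString(h.Sum(nil))
}

func main() {
	// Sample values for illustration only.
	nonce := []byte{0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08}
	created := "2026-05-01T12:00:00Z"

	digest := passwordDigest(nonce, created, "example-password")
	fmt.Println("Nonce (base64):", base64.StdEncoding.EncodeToString(nonce))
	fmt.Println("Created:       ", created)
	fmt.Println("Digest:        ", digest)
}
```

Because the raw bytes go into the hash, the nonce's wire encoding (hex or base64) only changes how it is serialized in the header, which is the detail the raw SOAP path is able to control.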

Camera Capability Flags

v0.6.0 exposes full camera capability flags via the /api/events SSE endpoint:

json
{
  "camera_id": "front-door",
  "capabilities": {
    "ptz": { "absolute": true, "relative": true, "continuous": true, "presets": true, "home": true },
    "imaging": { "brightness": true, "contrast": true, "saturation": true, "sharpness": true },
    "events": { "pullpoint": true, "motion": false, "tampering": false },
    "snapshot": { "uri": "http://192.168.1.100:8080/onvif/snapshot" }
  }
}

The frontend dynamically shows/hides control panels based on capability flags — cameras without PTZ support won’t show the joystick, cameras without Imaging support won’t show adjustment sliders.
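
The show/hide decision is simple enough to express in a few lines. The real frontend is presumably not written in Go; the sketch below only illustrates the logic against the payload shape shown above, and the panel names and the rule "show a panel if any flag in the group is true" are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// capabilityEvent mirrors the parts of the payload this sketch needs.
// Unknown fields such as "events" and "snapshot" are ignored by encoding/json.
type capabilityEvent struct {
	CameraID     string `json:"camera_id"`
	Capabilities struct {
		PTZ     map[string]bool `json:"ptz"`
		Imaging map[string]bool `json:"imaging"`
	} `json:"capabilities"`
}

// anyEnabled reports whether at least one flag in the group is true.
func anyEnabled(flags map[string]bool) bool {
	for _, on := range flags {
		if on {
			return true
		}
	}
	return false
}

// visiblePanels returns the control panels to render for a camera.
// The panel names are hypothetical.
func visiblePanels(payload []byte) ([]string, error) {
	var ev capabilityEvent
	if err := json.Unmarshal(payload, &ev); err != nil {
		return nil, err
	}
	panels := []string{}
	if anyEnabled(ev.Capabilities.PTZ) {
		panels = append(panels, "ptz-joystick")
	}
	if anyEnabled(ev.Capabilities.Imaging) {
		panels = append(panels, "imaging-sliders")
	}
	return panels, nil
}

func main() {
	payload := []byte(`{"camera_id":"front-door","capabilities":{"ptz":{"absolute":true},"imaging":{}}}`)
	panels, err := visiblePanels(payload)
	fmt.Println(panels, err)
}
```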

H.265 SPS Patch

Some budget cameras produce H.265 streams with non-standard SPS (Sequence Parameter Set) fields, for example profile_idc set to 0 (not allowed by the standard) or level_idc outside the range the specification allows. Certain players (especially hls.js) have low tolerance for this, resulting in a black screen.

internal/hls/sps_patch.go intercepts and fixes SPS before HLS segment writes:

go
// PatchSPS fixes non-standard fields in H.265 SPS NAL units.
// It returns the (possibly patched) SPS and whether anything was modified.
func PatchSPS(nalu []byte) (patched []byte, modified bool) {
    const spsNUT = 33     // HEVC nal_unit_type for SPS (VPS is 32, PPS is 34)
    const profileIdx = 21 // offset varies by resolution
    // Skip VPS, PPS and anything too short to hold the profile byte.
    if len(nalu) <= profileIdx || (nalu[0]&0x7E)>>1 != spsNUT {
        return nalu, false
    }
    // Fix non-standard values in profile_tier_level
    if nalu[profileIdx] == 0 {
        nalu[profileIdx] = 1 // Main Profile
        modified = true
    }
    // ...
}

This patch is not universal — different cameras may have different SPS structures. It currently targets two specific camera models found during testing, and will be extended based on feedback.
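
As background for the NAL type check, the sketch below shows how the unit type is read from the two-byte HEVC NAL header (forbidden_zero_bit, six bits of nal_unit_type, then nuh_layer_id and the temporal ID). The helper names are hypothetical; the type values 32/33/34 for VPS/SPS/PPS come from the H.265 specification.

```go
package main

import "fmt"

// HEVC nal_unit_type values for the parameter sets.
const (
	nalVPS = 32
	nalSPS = 33
	nalPPS = 34
)

// hevcNALType extracts nal_unit_type from the first header byte:
// bit 7 is forbidden_zero_bit, bits 6..1 are nal_unit_type.
func hevcNALType(nalu []byte) (uint8, bool) {
	if len(nalu) < 2 {
		return 0, false // the HEVC NAL header is two bytes long
	}
	return (nalu[0] >> 1) & 0x3F, true
}

// isSPS reports whether nalu is a sequence parameter set.
func isSPS(nalu []byte) bool {
	t, ok := hevcNALType(nalu)
	return ok && t == nalSPS
}

func main() {
	for _, hdr := range [][]byte{{0x40, 0x01}, {0x42, 0x01}, {0x44, 0x01}} {
		t, _ := hevcNALType(hdr)
		fmt.Printf("header %#02x -> type %d, SPS=%v\n", hdr[0], t, isSPS(hdr))
	}
}
```

The headers 0x40, 0x42 and 0x44 are the usual first bytes of VPS, SPS and PPS units, which makes them handy for a quick check against a packet capture.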

Security Hardening

v0.6.0 includes several security changes:

COOP/COEP Conditional Enablement: The Cross-Origin-Opener-Policy and Cross-Origin-Embedder-Policy headers are only set when TLS is enabled. In their strict modes these headers break WebSocket connections over plain HTTP: browsers forcibly isolate non-secure contexts, so cross-origin SSE/WebSocket communication fails. This wasn’t caught earlier because development typically uses localhost (considered a secure context), but it would fail in any deployment without HTTPS.

Frame Watchdog: When the recorder starts, if the RTSP DESCRIBE response doesn’t contain SPS/PPS in the SDP, the player cannot initialize decoding. The previous code waited indefinitely for SPS/PPS, causing goroutine leaks. v0.6.0 adds a frame watchdog — timing starts from the first RTP packet arrival; if no SPS/PPS is received within 5 seconds, it actively disconnects and reconnects.

Zero-Duration Recording Fix

An edge case was found in internal/storage/db_recording.go: when a recording segment closes with 0 frames written, the duration field in the database is 0. During playlist rendering, a 0-duration entry causes the frontend’s playlist total duration calculation to produce NaN.

The fix is the PTS timestamp validity check in internal/recorder/pts_check.go — at recording start, it validates whether the first RTP packet’s PTS (Presentation Timestamp) is valid. If PTS is 0 or NaN, it skips that frame and waits for the next valid PTS, preventing invalid timestamps from being written to the MP4’s mvhd/tkhd boxes.

Documentation Restructure

v0.5.0’s documentation still had a single api-reference.md covering all API endpoints — a file exceeding 2000 lines that was impossible to navigate. v0.6.0 splits it into 19 modular documentation files:

docs/en/api/
├── README.md               # API overview
├── authentication.md        # Authentication
├── cameras.md               # Camera management
├── recordings.md            # Recording management
├── streaming.md             # Streaming protocols
├── timelapse-protocols.md   # Timelapse protocols
├── transcoding.md           # Transcoding
├── onvif.md                 # ONVIF interface
├── health-monitoring.md     # Health monitoring
├── events.md                # Event system
├── archives.md              # Archiving
├── merge.md                 # Merging
├── ai-detection.md          # AI detection
├── settings.md              # Settings
├── system.md                # System management
├── backup.md                # Backup
├── camera-details.md        # Camera details
├── xiaomi.md                # Xiaomi cloud integration
├── errors.md                # Error codes
└── ...                      # Chinese mirror directory

Each file exists in both Chinese and English, and the two versions are kept in sync. Every file focuses on one API endpoint or functional module, making it easy for users to find what they need and convenient for CI to check documentation coverage.

Bug Fixes

  • Transcoding queue race condition: Multiple recording tasks completing simultaneously triggered concurrent transcoding enqueues via the event bus, causing SQLite UNIQUE constraint violations. Fix: add mutex lock for transcoding enqueue operations, with idempotency check via SELECT first.
  • ONVIF preset name encoding: Some cameras return preset names that are not UTF-8 (e.g., GBK encoding), causing garbled text in the frontend. Fix: detect non-UTF-8 sequences per RFC 3629 and fall back to ISO-8859-1 decoding.
  • HLS segment boundary black frames: In LL-HLS partial refresh mode, incomplete GOP structures at segment boundaries caused brief black frames. Fix: force waiting for the next IDR frame at segment boundaries.
  • ARM platform transcoding crash: Software encoding libx264 on ARMv7 produced SIGILL due to NEON optimization path check failure. Fix: added -cpuflags none to FFmpeg command.

Performance

Test suite runtime dropped from ~340s to ~88s (about 74% less time). Main optimizations:

  1. Parallel testing: Independent test cases use t.Parallel() for concurrent execution
  2. SQLite WAL mode: Test DB uses PRAGMA journal_mode=WAL, reducing write lock contention
  3. Mock clock: Time-dependent tests (e.g., health score time windows) use clock.Mock instead of time.Sleep

Upgrade

Configuration is backward compatible — just replace the binary:

bash
# Docker
docker pull ghcr.io/mi-bee-studio/mibeenvr:latest

# Or download binary
wget https://github.com/Mi-Bee-Studio/MiBeeNvr/releases/latest/download/mibee-nvr-$(uname -m | sed 's/x86_64/amd64/;s/aarch64/arm64/')

After upgrading, check config.example.yaml for new configuration items (timelapse, transcoding, streaming, etc.) and add them to your own config file as needed.