YOLO Advanced Optimization: Lightweighting, Quantization, and Accuracy

11 min read
Model Lightweighting Strategies

Model Size Selection

| Model | Parameters (M) | mAP | CPU Inference | Use Cases |
|---|---|---|---|---|
| YOLO26n | 2.8 | 38.9 | Fastest | Edge devices, Embedded |
| YOLO26s | 9.4 | 48.2 | Very fast | Mobile, Web |
| YOLO26m | 21.8 | 53.1 | Medium | Server, High performance |
| YOLO11n | 2.6 | 39.6 | Fast | Lightweight deployment |
| YOLOv8n | 3.2 | 37.3 | Baseline | General purpose |

Knowledge Distillation

```python
from ultralytics import YOLO

# Large model as teacher, small model as student
teacher = YOLO("yolo26x.pt")
student = YOLO("yolo26n.yaml")

# Distillation training. The `distill` and `distill_ratio` arguments are
# reproduced from the original article; confirm that your installed
# Ultralytics version accepts them before relying on them.
student.train(
    data="data.yaml",
    distill="yolo26x.pt",  # Teacher model
    distill_ratio=0.5,     # Distillation loss ratio
)
```

Model Pruning

Structured vs Unstructured Pruning

| Type | Method | Sparsity Pattern | Hardware Acceleration | Compression Ratio |
|---|---|---|---|---|
| Unstructured | Weight pruning | Random sparse | Difficult (special HW needed) | High |
| Structured | Channel pruning | Regular sparse | Native acceleration | Medium |

Torch Prune Channel Pruning Example

```python
import torch
import torch.nn.utils.prune as prune
from ultralytics import YOLO

# L1 unstructured pruning: zero the 30% smallest-magnitude weights in each conv layer
model = YOLO("yolo26n.pt")
for name, module in model.model.named_modules():
    if isinstance(module, torch.nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")  # Make pruning permanent

# Channel pruning with the torch-pruning library (API of torch-pruning >= 1.0)
# pip install torch-pruning
import torch_pruning as tp

model = YOLO("yolo26n.pt").model
for p in model.parameters():
    p.requires_grad = True  # torch-pruning traces the autograd graph

DG = tp.DependencyGraph()
DG.build_dependency(model, example_inputs=torch.randn(1, 3, 640, 640))

# Remove every 5th output channel (about 20%) of one conv layer.
# The indices are chosen by position here, not ranked by L1 norm.
conv = next(m for m in model.model[4].modules() if isinstance(m, torch.nn.Conv2d))
group = DG.get_pruning_group(
    conv,
    tp.prune_conv_out_channels,                 # Prune along the output-channel dimension
    idxs=list(range(0, conv.out_channels, 5)),  # Channels to remove
)
if DG.check_pruning_group(group):  # Reject plans that would leave a layer empty
    group.prune()
```

Pruning Ratio Guidelines

| Model | Safe Ratio | Aggressive Ratio | mAP Drop (safe / aggressive) |
|---|---|---|---|
| YOLO26n | ≤20% | 20-40% | <1% / 2-5% |
| YOLO26s | ≤30% | 30-50% | <1% / 3-6% |
| YOLO26m | ≤40% | 40-60% | <1% / 3-8% |
| YOLOv8n | ≤20% | 20-35% | <1% / 2-4% |

Model Pruning and Quantization

Export Time Quantization

```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")

# INT8 quantization (requires calibration data)
model.export(
    format="engine",   # TensorRT
    int8=True,
    data="data.yaml",  # Calibration dataset
    batch=8,
)

# ONNX export with dynamic input shapes.
# Note: `dynamic=True` enables dynamic axes; it does not quantize the model.
model.export(
    format="onnx",
    dynamic=True,
    simplify=True,
)
```

TensorRT INT8 Calibration Step-by-Step

Calibration Dataset Preparation

INT8 quantization requires representative calibration data to determine activation value ranges:
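To make the role of calibration data concrete, here is a minimal sketch of symmetric max-abs calibration in plain Python. It is a simplification for illustration only: TensorRT's own calibrators use more elaborate range selection, and the function names `calibrate_scale`, `quantize`, and `dequantize` are hypothetical, not part of any library.

```python
def calibrate_scale(activations, num_bits=8):
    """Symmetric max-abs calibration: map the largest magnitude seen to the int range."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(x) for x in activations)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(x, scale, num_bits=8):
    """Round to the nearest integer step and clip to the representable range."""
    qmax = 2 ** (num_bits - 1) - 1
    return max(-qmax, min(qmax, round(x / scale)))


def dequantize(q, scale):
    """Map an integer code back to an approximate real value."""
    return q * scale


# Hypothetical activation samples collected from calibration images
calibration_activations = [-2.0, 0.5, 1.0, 1.7]
scale = calibrate_scale(calibration_activations)

in_range = dequantize(quantize(0.5, scale), scale)       # small rounding error
out_of_range = dequantize(quantize(10.0, scale), scale)  # clipped to about 2.0
print(f"scale={scale:.5f}, 0.5 -> {in_range:.4f}, 10.0 -> {out_of_range:.4f}")
```

The second value shows why the calibration set must be representative: an activation far outside the calibrated range is clipped, which is where INT8 accuracy loss typically comes from.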
YOLO Model Optimization Knowledge Distillation Model Quantization Edge Deployment

From Hashmod to Jump Consistent Hash — stream-metrics-route Hash Algorithm Upgrade

12 min read
Introduction

In the previous article, we reviewed the three-year evolution of stream-metrics-route and noted that “dual hashmod scheduling” is the core scheduling mechanism of the entire gateway. During continuous production operation, however, one fatal flaw of hashmod became increasingly obvious: every scaling operation triggers full data redistribution, because changing N in hash % N remaps almost every key to a different backend. This article documents the complete decision process of migrating from hash % N (hashmod) to Jump Consistent Hash: which candidate algorithms were evaluated, why Jump Hash was ultimately chosen, and the specific impact before and after migration.
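The difference between the two schemes can be sketched in a few lines of Python. The function below follows the published Jump Consistent Hash algorithm (Lamping and Veach); the helper `moved_fraction`, the integer keys, and the 10 to 11 bucket scenario are assumptions made for illustration and are not taken from stream-metrics-route, which would first hash its metric keys to 64-bit integers.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit integer key to a bucket in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    key &= MASK64
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, masked to emulate uint64 overflow
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * (float(1 << 31) / float((key >> 33) + 1)))
    return b


def moved_fraction(assign, keys, old_n, new_n):
    """Fraction of keys whose bucket changes when scaling from old_n to new_n."""
    moved = sum(1 for k in keys if assign(k, old_n) != assign(k, new_n))
    return moved / len(keys)


def hashmod(key: int, num_buckets: int) -> int:
    """Plain hash % N, using the integer key itself as the hash."""
    return key % num_buckets


keys = range(10_000)
mod_moved = moved_fraction(hashmod, keys, 10, 11)
jump_moved = moved_fraction(jump_consistent_hash, keys, 10, 11)
print(f"hashmod moved {mod_moved:.1%} of keys, jump hash moved {jump_moved:.1%}")
```

When scaling from 10 to 11 buckets, hashmod relocates roughly 10/11 of the keys, while Jump Consistent Hash relocates roughly 1/11 of them, and every relocated key lands on the newly added bucket.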
VictoriaMetrics Prometheus Consistent Hashing Stream-Metrics-Route Distributed Systems

The Hidden Trap of Headless Browsers: Why Can't Your Automation Tool Catch Early Page Errors?

17 min read
Introduction

You’re debugging a frontend engineering issue: the page is behaving abnormally. You ask an AI to open the page with a browser tool and check the console for errors. The AI opens the page, looks around, and tells you: “The console is clean, no errors whatsoever.” You’re skeptical. You open Chrome DevTools yourself, and three bright red errors are staring you in the face; the page has already crashed into a white screen. The AI visited the exact same page using a headless browser, so why did it catch nothing?
Headless Chrome Puppeteer Playwright Selenium Agent-Browser Frontend Automation Error Capture

YOLO Model Training: Complete Custom Dataset Workflow

12 min read
Complete Custom Dataset Training Process

Ultralytics Unified Training Code

```python
from ultralytics import YOLO

# Load model
# model = YOLO("yolov8n.yaml")  # Train from scratch
# model = YOLO("yolo11n.pt")    # Based on pre-trained weights
model = YOLO("yolo26n.pt")      # 2026 recommended, edge deployment first choice

# Start training
results = model.train(
    # Basic configuration
    data="data.yaml",        # Dataset configuration
    epochs=100,              # Training epochs
    imgsz=640,               # Input size
    batch=16,                # Batch size
    workers=8,               # Data loading workers

    # Optimizer configuration
    optimizer="auto",        # YOLO26 automatically uses MuSGD
    lr0=0.01,                # Initial learning rate
    lrf=0.01,                # Final learning rate factor (final LR = lr0 * lrf)
    momentum=0.937,          # SGD momentum
    weight_decay=0.0005,     # Weight decay

    # Data augmentation
    mosaic=1.0,
    mixup=0.1,
    copy_paste=0.1,

    # Other configuration
    device=0,                # GPU device, "cpu" for CPU
    project="runs/train",    # Save path
    name="yolo26_exp1",      # Experiment name
    exist_ok=False,          # Whether to overwrite an existing experiment
    pretrained=True,         # Use pre-trained weights
    verbose=True,            # Detailed logs
    seed=42,                 # Random seed
)

# Validate model
metrics = model.val()
print(f"mAP50: {metrics.box.map50:.3f}")
print(f"mAP50-95: {metrics.box.map:.3f}")
```

Training Parameter Differences Across Versions

| Parameter | YOLOv8 | YOLO11 | YOLO26 |
|---|---|---|---|
| Default Optimizer | SGD | SGD | MuSGD |
| DFL Loss | ✅ | ✅ | ❌ Removed |
| NMS Post-processing | ✅ | ✅ | ❌ Natively NMS-free |
| Small Object Optimization | Average | Better | Best (STAL) |
| CPU Inference Speed | Baseline | +25% | +43% |

Loss Function Breakdown

YOLO’s loss function consists of three components, each targeting a different learning objective:
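As a rough illustration of how such components are combined, the sketch below assumes the three terms are the box, classification, and DFL losses used by YOLOv8 and YOLO11 in Ultralytics, weighted by the default gains `box=7.5`, `cls=0.5`, and `dfl=1.5`. The excerpt does not name the components, so treat this pairing as an assumption; the function `total_loss` and the sample loss values are hypothetical, and the real trainer computes each term from predictions and targets.

```python
def total_loss(box_loss, cls_loss, dfl_loss, box_gain=7.5, cls_gain=0.5, dfl_gain=1.5):
    """Weighted sum of the three loss terms, returned with per-term contributions."""
    parts = {
        "box": box_gain * box_loss,  # localization quality
        "cls": cls_gain * cls_loss,  # class prediction
        "dfl": dfl_gain * dfl_loss,  # box-edge distribution (dropped in YOLO26 per the table above)
    }
    return sum(parts.values()), parts


# Hypothetical raw loss values from one training step
loss, parts = total_loss(box_loss=1.2, cls_loss=0.8, dfl_loss=1.0)
for name, value in parts.items():
    print(f"{name}: {value:.2f} ({value / loss:.0%} of total)")
```

The gains explain why the box term usually dominates the reported total: with equal raw values, it contributes fifteen times as much as the classification term.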
YOLO Model Training Deep Learning Hyperparameter Tuning

security-collector-exporter: Monitoring Linux Security Auditing

6 min read
Why This Was Built

Anyone who manages servers has probably had this experience: a compliance audit comes up, and you SSH into machines one by one to check whether the SSH config is correct, SELinux is enabled, the firewall is running, any accounts have expired, and password policies are compliant. A few machines are fine; with dozens or hundreds it becomes purely manual grunt work. The more painful part is that none of this is continuously monitored. You check compliance today, someone changes a config tomorrow, and you’d never know.
Prometheus Linux Security Monitoring Go Exporter

One Month with the Zhi Theme: Mermaid v11 Upgrade Experience

7 min read
One Month In

It’s been nearly a month since the previous article, Switched My Blog Theme: From Hugo NexT to Self-Written Zhi. During this month, the theme has run stably without major issues, and its stability has exceeded my expectations. Replacing the NexT theme was the right decision: although there was some initial adjustment, the experience is now significantly better. My initial concern, whether pure Hugo Pipes without build tools could support complex requirements, proved unfounded. Daily maintenance has become very simple; modifying a feature no longer requires digging through deeply nested SCSS files, because one CSS file gets the job done.
Hugo Mermaid Zhi AI Programming

MiBeeNvr v0.2.0 Update: Docker Deployment, HLS Streaming, Recording Merging, and a Complete Installation Guide

11 min read
The previous article introduced MiBeeNvr’s basic features and design philosophy. It’s only been a week since v0.1.0, and v0.2.0 follows right behind. This update is substantial: 15 new features, some that I needed myself and others from community feedback. This article covers three things: what’s new in v0.2.0, how to deploy from scratch, and some practical tips for real-world use.

v0.2.0 New Features Overview

This update has a lot of content. Here’s a breakdown by category:
NVR Go RTSP Camera Smart Home HLS

VictoriaMetrics Stream Aggregation: Three-Year Review and Current Status (2026)

6 min read
Introduction

It’s been exactly three years since the previous article, Applying VictoriaMetrics Stream Aggregation for Metrics, was published in March 2023. In these three years, the VictoriaMetrics ecosystem has changed tremendously. Let’s revisit the issues raised in that blog post, see what the official project has resolved, and where our stream-metrics-route project stands today.

I. Problems We Encountered Three Years Ago

Let’s quickly recap the core issue list from the 2023 blog post:
VictoriaMetrics Prometheus Metric Aggregation Vmagent Stream-Metrics-Route

YOLO Dataset Preparation: Annotation Tools and Format Conversion

15 min read
Data Annotation Tools Usage

LabelImg Installation and Usage

```bash
# Installation
pip install labelImg

# Launch
labelImg
```

Annotation Process:

1. Open Dir → Select image folder
2. Change Save Dir → Select annotation save folder
3. Select YOLO format
4. Create RectBox → Draw bounding box → Enter class name
5. Save

LabelMe Installation and Usage

```bash
pip install labelme
labelme
```

CVAT Self-Hosted Annotation Platform

CVAT (Computer Vision Annotation Tool) is an open-source annotation platform originally developed by Intel. It supports self-hosted Docker deployment and is suited to team collaboration and large-scale annotation projects.
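Choosing the YOLO format in the annotation steps above means each box is saved as one text line: a class index followed by the box center and size, all normalized by the image dimensions. The sketch below shows that conversion from pixel corner coordinates (as used by Pascal VOC) and back. The function names and the sample box are hypothetical and only illustrate the arithmetic.

```python
def voc_to_yolo(box, img_w, img_h):
    """Convert (xmin, ymin, xmax, ymax) in pixels to normalized (cx, cy, w, h)."""
    xmin, ymin, xmax, ymax = box
    cx = (xmin + xmax) / 2 / img_w
    cy = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return cx, cy, w, h


def yolo_to_voc(box, img_w, img_h):
    """Convert normalized (cx, cy, w, h) back to pixel (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = box
    xmin = (cx - w / 2) * img_w
    ymin = (cy - h / 2) * img_h
    xmax = (cx + w / 2) * img_w
    ymax = (cy + h / 2) * img_h
    return xmin, ymin, xmax, ymax


def yolo_label_line(class_id, box):
    """Format one line of a YOLO .txt label file."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)


# Hypothetical box on a 640x480 image
yolo_box = voc_to_yolo((100, 200, 300, 400), img_w=640, img_h=480)
print(yolo_label_line(0, yolo_box))
```

Because the values are normalized, the same label file stays valid when the image is resized, which is one reason the format is convenient for training pipelines that rescale inputs.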
YOLO Dataset Data Annotation Data Augmentation

BBR Congestion Control Algorithm Deep Dive

13 min read
BBR (Bottleneck Bandwidth and Round-trip propagation time), developed by Neal Cardwell, Yuchung Cheng, and others at Google, is one of the most advanced model-based congestion control algorithms available today. Unlike traditional loss-based algorithms (Reno, CUBIC), BBR explicitly models the network path by directly measuring bottleneck bandwidth and propagation delay. It paces data at the estimated bottleneck bandwidth and keeps the amount of data in flight close to the BDP (Bandwidth-Delay Product), the product of those two measurements. BBR’s core insight is that packet loss does not equal congestion. On wireless links, loss can be caused by channel noise rather than genuine link saturation; on deep-buffered links (bufferbloat), loss arrives only after queues have already grown large, so it is a late signal. BBR actively measures bandwidth and latency to control the sending rate, rather than passively waiting for loss signals.
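The two quantities BBR measures, and the BDP derived from them, can be sketched as follows. This is a simplified illustration, not the kernel implementation: it takes the maximum delivery rate and minimum RTT over a fixed list of samples, whereas real BBR uses time-bounded sliding windows, and the sample values and function names are hypothetical.

```python
def estimate_path(delivery_rates_bps, rtts_s):
    """Estimate (bottleneck bandwidth, propagation delay) from recent samples.

    The highest observed delivery rate approximates the bottleneck bandwidth;
    the lowest observed RTT approximates the delay without queuing.
    """
    return max(delivery_rates_bps), min(rtts_s)


def bdp_bytes(btl_bw_bps, rt_prop_s):
    """Bandwidth-delay product: bytes in flight needed to fill the pipe with no queue."""
    return btl_bw_bps * rt_prop_s / 8


# Hypothetical samples: delivery rates in bit/s and round-trip times in seconds
rates = [82e6, 97e6, 100e6, 91e6]
rtts = [0.047, 0.040, 0.052, 0.044]

btl_bw, rt_prop = estimate_path(rates, rtts)
bdp = bdp_bytes(btl_bw, rt_prop)
print(f"BtlBw={btl_bw / 1e6:.0f} Mbit/s, RTprop={rt_prop * 1e3:.0f} ms, BDP={bdp / 1e3:.0f} kB")
```

With a 100 Mbit/s bottleneck and a 40 ms propagation delay, the BDP is 500 kB. Keeping much more than that in flight only builds a queue; keeping less leaves the link underused.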
TCP BBR Congestion Control Google TCP Algorithm