One Month In
It’s been nearly a month since the previous article, Switched My Blog Theme: From Hugo NexT to Self-Written Zhi. Over that month the theme has run stably without major issues. Replacing the NexT theme was the right decision: there was some initial adjustment, but the experience is now significantly better.
After a month of use, the theme’s stability has exceeded expectations. My initial concern, whether pure Hugo Pipes without build tools could support complex requirements, proved unfounded. Daily maintenance has become very simple: modifying a feature no longer requires digging through deeply nested SCSS files, because one CSS file gets the job done.
Data Annotation Tools Usage
LabelImg Installation and Usage
    # Installation
    pip install labelImg
    # Launch
    labelImg
Annotation Process:
1. Open Dir → Select image folder
2. Change Save Dir → Select annotation save folder
3. Select YOLO format
4. Create RectBox → Draw bounding box → Enter class name
5. Save
LabelMe Installation and Usage
    pip install labelme
    labelme
CVAT Self-Hosted Annotation Platform
CVAT (Computer Vision Annotation Tool) is an open-source annotation platform by Intel, supporting Docker self-hosted deployment for team collaboration and large-scale annotation projects.
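When YOLO format is selected in LabelImg, each box is saved as one text line: a class index followed by the box center and size, all normalized to the image dimensions. The sketch below shows how such a line maps back to pixel coordinates. The function name and the sample values are made up for illustration and are not part of LabelImg.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels.

    Expected layout: "<class_id> <x_center> <y_center> <width> <height>",
    with the four box values normalized to the range 0..1.
    """
    parts = line.split()
    class_id = int(parts[0])
    x_c, y_c, w, h = (float(v) for v in parts[1:5])

    # Undo the normalization and move from center/size to corner coordinates.
    x_min = (x_c - w / 2) * img_w
    y_min = (y_c - h / 2) * img_h
    x_max = (x_c + w / 2) * img_w
    y_max = (y_c + h / 2) * img_h
    return class_id, x_min, y_min, x_max, y_max


# Hypothetical label for a 640x480 image: class 0, centered box.
box = yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480)
print(box)
```

A conversion like this is handy for spot-checking a few annotations against the image before training.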
The previous article covered the initial configuration setup. This one documents the adjustments after two weeks of running: expanding from a single vendor to a four-tier model pool, adding fallback chains, and hitting the GLM-4.5-air trap of analyzing without writing code.
This post covers: fallback strategy design, complete free model pool inventory and analysis, concurrency control configuration, and the decision process for GLM-4.5-air replacement.
After the previous article’s initial configuration, I ran it for two weeks — all the issues that needed fixing surfaced.
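The idea behind a fallback chain can be sketched in a few lines: try each model in priority order and move to the next one when a call fails. This is a minimal illustration, not the actual plugin configuration. The tier names and the `fake_call` function are hypothetical stand-ins for a real model client.

```python
from typing import Callable, List, Sequence, Tuple


class AllModelsFailed(Exception):
    """Raised when every model in the chain has failed."""


def call_with_fallback(
    prompt: str,
    chain: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order; return (model_used, reply) from the first success."""
    errors: List[str] = []
    for model in chain:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def fake_call(model: str, prompt: str) -> str:
    """Hypothetical client: the first tier always times out."""
    if model == "tier1-paid":
        raise TimeoutError("simulated timeout")
    return f"[{model}] reply to: {prompt}"


used, reply = call_with_fallback("hello", ["tier1-paid", "tier2-free"], fake_call)
print(used, reply)
```

Collecting the per-model errors before raising makes it easier to see afterwards which tier failed and why.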
Learning Path and Version Selection Guide
Version Selection Guide

| Version | Release Date | Development Team | Use Cases | Recommendation Index |
| --- | --- | --- | --- | --- |
| YOLO26 | 2026.01 | Ultralytics Official | Edge deployment, CPU inference, industrial applications | ⭐⭐⭐⭐⭐ |
| YOLOv8 | 2023.01 | Ultralytics Official | Beginner learning, complete ecosystem, general scenarios | ⭐⭐⭐⭐⭐ |
| YOLO11 | 2024.09 | Ultralytics Official | Efficiency optimization, lightweight deployment | ⭐⭐⭐⭐ |
| YOLOv10 | 2024.05 | Tsinghua University | Research exploration, NMS-free end-to-end | ⭐⭐⭐⭐ |
| YOLOv9 | 2024.01 | Academia Sinica | High precision, small object detection | ⭐⭐⭐⭐ |
| YOLOv12 | 2025.02 | University at Buffalo + University of Chinese Academy of Sciences | Attention mechanism research | ⭐⭐⭐ |

Learning Path Recommendations
- Beginner Stage (1-2 weeks): Start with YOLOv8, master basic concepts and API usage
- Intermediate Stage (2-3 weeks): Learn custom dataset training, parameter tuning and optimization
- Advanced Stage (2-3 weeks): Learn model deployment and engineering implementation
- Research Stage (ongoing): Explore new features in YOLO11, YOLO26, YOLOv9/v10/v12

Complete YOLO Development History
Timeline

| Version | Release Date | Core Innovation | Milestone Significance |
| --- | --- | --- | --- |
| YOLOv1 | 2015.06 | Pioneer of single-stage detection | Foundation for real-time detection |
| YOLOv2 | 2016.12 | Batch Normalization, anchor boxes | Dual improvement in accuracy and speed |
| YOLOv3 | 2018.04 | Multi-scale detection, residual networks | Industry standard |
| YOLOv4 | 2020.04 | CSPDarknet, Mosaic | Peak of engineering implementation |
| YOLOv5 | 2020.06 | PyTorch framework, user-friendly | Highest adoption rate |
| YOLOv7 | 2022.07 | E-ELAN, reparameterization | Balance between speed and accuracy |
| YOLOv8 | 2023.01 | C2f, Anchor-Free, unified framework | Ultralytics unified ecosystem |
| YOLOv9 | 2024.01 | GELAN, PGI (Programmable Gradient Information) | Training efficiency revolution |
| YOLOv10 | 2024.05 | NMS-free, efficiency-precision tradeoff | End-to-end detection |
| YOLO11 | 2024.09 | Architecture optimization, parameter reduction | Efficiency optimized version |
| YOLOv12 | 2025.02 | Area Attention mechanism | Attention architecture |
| YOLO26 | 2026.01 | DFL-free, NMS-free, 43% CPU optimization | Edge computing new standard |

Core Principles and Version Comparison
Ultralytics Official Main Line Versions
YOLOv8 Core Features:
Why Bother
When it comes to writing code with AI, the gap between single-model and multi-model approaches keeps widening. No matter how strong a single model is, it can’t compete with a team of specialized models working in parallel.
Oh My OpenCode (OmO for short) is a multi-model orchestration plugin in the OpenCode ecosystem, with 11 Agents each having distinct responsibilities and 48 Hooks spanning the entire lifecycle. Zhipu’s Coding Plan provides access to the full GLM model series. Combining the two allows you to assign different models by role — strong coders for coding, strong reasoners for reasoning, free models for busywork.
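Assigning models by role boils down to a lookup from role to model with a cheap default. The sketch below only illustrates that idea. It is not the OmO configuration format, and every role and model name in it is a hypothetical placeholder.

```python
# Hypothetical role-to-model table; real names depend on your plan and plugin config.
ROLE_MODELS = {
    "coding": "strong-coder-model",
    "reasoning": "strong-reasoner-model",
    "busywork": "free-model",
}


def model_for(role: str, default: str = "free-model") -> str:
    """Return the model assigned to a role, falling back to the cheap default."""
    return ROLE_MODELS.get(role, default)


print(model_for("coding"))
print(model_for("unknown-role"))
```

Keeping the mapping in one table means that swapping the model behind a role is a one-line change.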
Overview
This article compares the resource requirements and usage costs of three major domestic LLMs, helping developers choose the right solution for their scenarios.

| Model | Vendor | Architecture | Minimum Deployable VRAM | API Available |
| --- | --- | --- | --- | --- |
| GLM-5 | Zhipu AI | Dense (multiple versions) | 24GB (8B) | ✅ |
| Kimi K2.5 | Moonshot AI | MoE (undisclosed) | 24GB (lightweight) | ✅ |
| MiniMax M2.7 | MiniMax | MoE 230B | Not yet open-sourced | ✅ |

GLM-5 (Zhipu AI)
Versions & Hardware Requirements
GLM-5 offers 4 parameter versions, making it the widest-coverage domestic LLM currently available.
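A rough way to sanity-check a VRAM figure like the 24GB listed for an 8B model is to estimate the weight size as parameter count times bytes per parameter. The sketch below does only that arithmetic. It ignores KV cache, activations and framework overhead, so real usage will be higher, and the precision choices shown are assumptions rather than vendor specifications.

```python
def estimate_weight_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    One billion parameters at one byte each is one GB, so the product of
    the two arguments is already in GB.
    """
    return params_billion * bytes_per_param


# Assumed precisions: FP16 uses 2 bytes per parameter, 4-bit quantization about 0.5.
fp16_gb = estimate_weight_gb(8, 2)
int4_gb = estimate_weight_gb(8, 0.5)
print(f"8B @ FP16: ~{fp16_gb:.0f} GB weights, 8B @ 4-bit: ~{int4_gb:.0f} GB weights")
```

Under these assumptions, 8B weights at FP16 take about 16 GB, which leaves some headroom for runtime overhead on a 24GB card.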
What is Harness Engineering?
Definition: Harness Engineering is the discipline of designing constraints, feedback loops, tool systems, and verification mechanisms around AI agents.
This definition sounds very academic, so let’s understand it through a vivid metaphor:
Harnessing a Thousand-Mile Horse: A thousand-mile horse (AI Agent) has powerful running capabilities, but without a rider, it might run randomly, injure passersby, or even rush off a cliff. Harness Engineering equips this horse with reins (constraints), brakes (safety controls), whip (incentive mechanisms), and a rider (monitoring), ensuring it travels safely on the correct path.
Scenario: Information Is Correct, But Execution Goes Wrong
Let’s start with a real-world story:
Background: A company deployed a RAG-based technical documentation Q&A system. This system worked perfectly—when users asked “How to configure Redis cluster?” it could accurately retrieve relevant information from technical documents and provide detailed configuration steps.
Problem: When a user asked “Delete temporary files in the test directory,” the system correctly retrieved the right technical documentation, but during execution it mistakenly deleted the entire project’s core code.
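One kind of constraint that targets this failure is a path guard: before any delete runs, the resolved target must sit inside an explicitly allowed directory. The sketch below is a minimal illustration of that idea and not the system from the story. The function names and directory layout are hypothetical.

```python
import tempfile
from pathlib import Path


def is_inside(target: Path, allowed_root: Path) -> bool:
    """True if target resolves to a path strictly inside allowed_root."""
    resolved = target.resolve()
    root = allowed_root.resolve()
    try:
        resolved.relative_to(root)
    except ValueError:
        return False
    return resolved != root  # never allow deleting the root itself


def guarded_delete(target: Path, allowed_root: Path) -> Path:
    """Delete a single file, but only if it lives under allowed_root."""
    if not is_inside(target, allowed_root):
        raise PermissionError(f"refusing to delete outside {allowed_root}: {target}")
    resolved = target.resolve()
    resolved.unlink()
    return resolved


# Demo in a throwaway directory: test/ holds temp files, src/ holds core code.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    test_dir = project / "test"
    src_dir = project / "src"
    test_dir.mkdir()
    src_dir.mkdir()
    temp_file = test_dir / "cache.tmp"
    core_file = src_dir / "main.py"
    temp_file.write_text("scratch")
    core_file.write_text("print('core')")

    guarded_delete(temp_file, test_dir)
    temp_deleted = not temp_file.exists()

    try:
        # A path that escapes the allowed directory via ".." must be rejected.
        guarded_delete(test_dir / ".." / "src" / "main.py", test_dir)
        blocked = False
    except PermissionError:
        blocked = True
    core_survived = core_file.exists()

print(temp_deleted, blocked, core_survived)
```

Resolving the path before the check is the important step, since it turns `..` segments and symlinks into their real location before the comparison.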
What is Context Engineering?
In June 2025, Andrej Karpathy offered a definition of Context Engineering in a post on X: “the delicate art and science of filling the context window with just the right information for the next step.”
This definition is exceptionally elegant. The core distinction from Prompt Engineering lies in:
- Prompt Engineering: Optimizes “what you say” – focuses on how input instructions are expressed
- Context Engineering: Optimizes “what the model knows” – focuses on what information the model can access
Imagine this: