AI & Tools

One Month with the Zhi Theme: Mermaid v11 Upgrade Experience

One Month In

It’s been nearly a month since the previous article, “Switched My Blog Theme: From Hugo NexT to Self-Written Zhi”. The theme has run stably the whole time, without major issues, and its stability has exceeded my expectations. Replacing the NexT theme was the right decision: there was some initial adjustment, but the experience is now significantly better. My initial concern, whether pure Hugo Pipes without build tools could support complex requirements, proved unfounded. Daily maintenance has become very simple. Modifying a feature no longer requires digging through deeply nested SCSS files; one CSS file gets the job done.

Continue reading →

YOLO Dataset Preparation: Annotation Tools and Format Conversion

Data Annotation Tools Usage

LabelImg Installation and Usage

```bash
# Installation
pip install labelImg

# Launch
labelImg
```

Annotation Process:

1. Open Dir → Select image folder
2. Change Save Dir → Select annotation save folder
3. Select YOLO format
4. Create RectBox → Draw bounding box → Enter class name
5. Save

LabelMe Installation and Usage

```bash
pip install labelme
labelme
```

CVAT Self-Hosted Annotation Platform

CVAT (Computer Vision Annotation Tool) is an open-source annotation platform originally developed by Intel. It supports self-hosted Docker deployment and is suited to team collaboration and large-scale annotation projects.
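To make the "Select YOLO format" step concrete, here is a minimal sketch of the conversion from a pixel-space box (as stored in Pascal VOC style annotations) to a YOLO label line, which holds a class index followed by the normalized center and size of the box. The function names and the sample image size are illustrative assumptions, not part of any annotation tool.

```python
def voc_to_yolo(box, img_w, img_h):
    """Convert a pixel-space (xmin, ymin, xmax, ymax) box to YOLO's
    normalized (x_center, y_center, width, height)."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_label_line(class_id, box, img_w, img_h):
    """Format one object as a single line of a YOLO .txt label file."""
    values = voc_to_yolo(box, img_w, img_h)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# Hypothetical example: one object of class 0 in a 640x480 image.
line = yolo_label_line(0, (100, 200, 300, 400), img_w=640, img_h=480)
print(line)  # 0 0.312500 0.625000 0.312500 0.416667
```

Each image gets one `.txt` file with one such line per object, so a full converter mostly loops this function over the parsed annotations.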

Continue reading →

YOLO Quick Start: Model Loading and Inference

Model Loading and Inference Across Versions

Ultralytics Unified API (Works with v8/11/26)

```python
from ultralytics import YOLO

# ========== YOLOv8 ==========
model_v8 = YOLO("yolov8n.pt")  # nano
model_v8 = YOLO("yolov8s.pt")  # small
model_v8 = YOLO("yolov8m.pt")  # medium
model_v8 = YOLO("yolov8l.pt")  # large
model_v8 = YOLO("yolov8x.pt")  # extra large

# ========== YOLO11 ==========
model_11 = YOLO("yolo11n.pt")  # nano
model_11 = YOLO("yolo11s.pt")  # small
model_11 = YOLO("yolo11m.pt")  # medium
model_11 = YOLO("yolo11l.pt")  # large
model_11 = YOLO("yolo11x.pt")  # extra large

# ========== YOLO26 (2026 latest) ==========
model_26 = YOLO("yolo26n.pt")  # nano, recommended for edge deployment
model_26 = YOLO("yolo26s.pt")  # small
model_26 = YOLO("yolo26m.pt")  # medium
model_26 = YOLO("yolo26l.pt")  # large
model_26 = YOLO("yolo26x.pt")  # extra large
```

Image Detection Hands-on Example

```python
from ultralytics import YOLO

# Load model (YOLO26 example)
model = YOLO("yolo26n.pt")

# Single image detection
results = model("test.jpg", conf=0.25, iou=0.45)

# Process results
for result in results:
    boxes = result.boxes  # Detection boxes
    masks = result.masks  # Segmentation masks
    probs = result.probs  # Classification probabilities

    # Print detection results
    for box in boxes:
        print(f"Class: {result.names[int(box.cls)]}, "
              f"Confidence: {box.conf.item():.3f}, "
              f"Coordinates: {box.xyxy.tolist()[0]}")

    # Save visualization results
    result.save("result.jpg")
```

Video Detection Hands-on Example

```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")

# Video file detection
results = model.predict(
    source="input.mp4",
    save=True,    # Save result video
    conf=0.3,
    show=False,   # Whether to display in real-time
    stream=True   # Stream processing to save memory
)

# Process frame by frame
for result in results:
    # Custom post-processing logic
    pass
```
Real-time Camera Detection

```python
from ultralytics import YOLO
import cv2

model = YOLO("yolo26n.pt")

# Open camera
cap = cv2.VideoCapture(0)  # 0 is default camera

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Inference
    results = model(frame, verbose=False)

    # Draw results
    annotated_frame = results[0].plot()

    # Display
    cv2.imshow("YOLO Real-time", annotated_frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

Version-specific Code Differences

| Feature | YOLOv8 | YOLO11 | YOLO26 | YOLOv9 | YOLOv10 |
| --- | --- | --- | --- | --- | --- |
| Unified API | ✅ | ✅ | ✅ | ❌ Separate repo | ❌ Separate repo |
| No NMS | ❌ | ❌ | ✅ | ❌ | ✅ |
| DFL Module | ✅ | ✅ | ❌ Removed | ✅ | ✅ |
| MuSGD Optimizer | ❌ | ❌ | ✅ | ❌ | ❌ |
| Export Compatibility | Good | Good | Best | Fair | Fair |

Results Object API Deep Dive

The model() or model.predict() call returns a list of Results objects (or a generator of them when stream=True is set). Each Results object encapsulates all inference outputs for a single image. Understanding its internal structure is essential for downstream processing.
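To show what the `conf` and `iou` arguments in the inference calls control, and what the "No NMS" row refers to, here is a minimal pure-Python sketch of greedy non-maximum suppression. It is a simplified illustration under stated assumptions (single class, plain tuples, hypothetical function names), not the Ultralytics implementation.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, conf_thres=0.25, iou_thres=0.45):
    """Greedy NMS over (box, score) pairs belonging to one class."""
    # Drop low-confidence detections, then visit the rest best-first.
    candidates = sorted(
        (d for d in detections if d[1] >= conf_thres),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(box, kept_box) <= iou_thres for kept_box, _ in kept):
            kept.append((box, score))
    return kept


dets = [
    ((10, 10, 110, 110), 0.90),
    ((12, 12, 112, 112), 0.80),    # near-duplicate of the first box
    ((200, 200, 300, 300), 0.70),  # separate object
    ((50, 50, 60, 60), 0.10),      # below the confidence threshold
]
kept = nms(dets)
print([score for _, score in kept])  # [0.9, 0.7]
```

Models marked as NMS-free in the table are designed to emit one box per object directly, so this post-processing step is not needed for them.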

Continue reading →

Evolution: Oh My OpenAgent Configuration Iteration Log

The previous article covered the initial configuration setup. This one documents the adjustments made after two weeks of running it: expanding from a single vendor to a four-tier model pool, adding fallback chains, and hitting the GLM-4.5-air trap, where the model analyzes the task but never writes the code. This post covers fallback strategy design, a complete inventory and analysis of the free model pool, concurrency control configuration, and the decision process for replacing GLM-4.5-air. After two weeks of running the initial configuration, all the issues that needed fixing had surfaced.
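The idea of a fallback chain can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tier names, the `ModelUnavailable` exception, and the `call_model` callback are hypothetical placeholders and do not reflect the plugin's actual configuration format.

```python
class ModelUnavailable(Exception):
    """Hypothetical error for rate limits, timeouts, or outages."""


def call_with_fallback(prompt, chain, call_model):
    """Try each model in order and return (model_name, reply) from the first success."""
    errors = []
    for name in chain:
        try:
            return name, call_model(name, prompt)
        except ModelUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models in the chain failed: " + "; ".join(errors))


def fake_call(name, prompt):
    """Stub backend: the first two tiers are pretending to be rate limited."""
    if name in {"tier1-primary", "tier2-backup"}:
        raise ModelUnavailable("rate limited")
    return f"[{name}] {prompt}"


# Placeholder names for a four-tier pool, ordered from preferred to last resort.
chain = ["tier1-primary", "tier2-backup", "tier3-free", "tier4-last-resort"]
used, reply = call_with_fallback("hello", chain, fake_call)
print(used)  # tier3-free
```

Keeping the chain as an ordered list makes the priority explicit, and collecting the errors shows why each tier was skipped when the whole chain fails.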

Continue reading →

YOLO Getting Started: History, Version Comparison and Environment Setup

Learning Path and Version Selection Guide

Version Selection Guide

| Version | Release Date | Development Team | Use Cases | Recommendation Index |
| --- | --- | --- | --- | --- |
| YOLO26 | 2026.01 | Ultralytics Official | Edge deployment, CPU inference, industrial applications | ⭐⭐⭐⭐⭐ |
| YOLOv8 | 2023.01 | Ultralytics Official | Beginner learning, complete ecosystem, general scenarios | ⭐⭐⭐⭐⭐ |
| YOLO11 | 2024.09 | Ultralytics Official | Efficiency optimization, lightweight deployment | ⭐⭐⭐⭐ |
| YOLOv10 | 2024.05 | Tsinghua University | Research exploration, NMS-free end-to-end | ⭐⭐⭐⭐ |
| YOLOv9 | 2024.02 | Academia Sinica (Taiwan) | High precision, small object detection | ⭐⭐⭐⭐ |
| YOLOv12 | 2025.02 | University at Buffalo + University of Chinese Academy of Sciences | Attention mechanism research | ⭐⭐⭐ |

Learning Path Recommendations

- Beginner Stage (1-2 weeks): Start with YOLOv8; master the basic concepts and API usage
- Intermediate Stage (2-3 weeks): Learn custom dataset training, parameter tuning, and optimization
- Advanced Stage (2-3 weeks): Learn model deployment and engineering implementation
- Research Stage (ongoing): Explore new features in YOLO11, YOLO26, and YOLOv9/v10/v12

Complete YOLO Development History

Timeline

| Version | Release Date | Core Innovation | Milestone Significance |
| --- | --- | --- | --- |
| YOLOv1 | 2015.06 | Pioneering single-stage detection | Foundation for real-time detection |
| YOLOv2 | 2016.12 | Batch Normalization, anchor boxes | Improvement in both accuracy and speed |
| YOLOv3 | 2018.04 | Multi-scale detection, residual networks | Industry standard |
| YOLOv4 | 2020.04 | CSPDarknet, Mosaic | Peak of engineering implementation |
| YOLOv5 | 2020.06 | PyTorch framework, user-friendly | Highest adoption rate |
| YOLOv7 | 2022.07 | E-ELAN, reparameterization | Balance between speed and accuracy |
| YOLOv8 | 2023.01 | C2f, Anchor-Free, unified framework | Ultralytics unified ecosystem |
| YOLOv9 | 2024.02 | GELAN, PGI (Programmable Gradient Information) | Training efficiency revolution |
| YOLOv10 | 2024.05 | NMS-free, efficiency-accuracy tradeoff | End-to-end detection |
| YOLO11 | 2024.09 | Architecture optimization, parameter reduction | Efficiency-optimized version |
| YOLOv12 | 2025.02 | Area Attention mechanism | Attention architecture |
| YOLO26 | 2026.01 | DFL-free, NMS-free, up to 43% faster CPU inference | New standard for edge computing |

Core Principles and Version Comparison

Ultralytics Official Main Line Versions

YOLOv8 Core Features:

Continue reading →

Zhipu Coding Plan × Oh My OpenCode: Multi-Model Orchestration Setup Guide

Why Bother

When it comes to writing code with AI, the gap between single-model and multi-model approaches keeps widening. No matter how strong a single model is, it can’t compete with a team of specialized models working in parallel. Oh My OpenCode (OmO for short) is a multi-model orchestration plugin in the OpenCode ecosystem, with 11 Agents that each have distinct responsibilities and 48 Hooks spanning the entire lifecycle. Zhipu’s Coding Plan provides access to the full GLM model series. Combining the two lets you assign different models by role: strong coders for coding, strong reasoners for reasoning, free models for busywork.
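Assigning models by role boils down to a lookup table with a sensible default. The sketch below is purely illustrative: the role names and model names are hypothetical placeholders, and the plugin's real configuration schema is not shown in this excerpt.

```python
# Hypothetical role-to-model table; every name here is a placeholder.
ROLE_TO_MODEL = {
    "coding": "strong-coder-model",
    "reasoning": "strong-reasoner-model",
    "busywork": "free-model",
}


def pick_model(role: str) -> str:
    """Return the model for a role, defaulting to the free model for unknown roles."""
    return ROLE_TO_MODEL.get(role, ROLE_TO_MODEL["busywork"])


tasks = [
    ("refactor the parser", "coding"),
    ("plan the migration", "reasoning"),
    ("rename files", "busywork"),
]
plan = {task: pick_model(role) for task, role in tasks}
print(plan["refactor the parser"])  # strong-coder-model
```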

Continue reading →

Domestic LLM Resource and Cost Comparison: GLM-5 / Kimi K2.5 / MiniMax M2.7

Overview

This article compares the resource requirements and usage costs of three major domestic LLMs, helping developers choose the right solution for their scenarios.

| Model | Vendor | Architecture | Minimum Deployable VRAM | API Available |
| --- | --- | --- | --- | --- |
| GLM-5 | Zhipu AI | Dense (multiple versions) | 24GB (8B) | ✅ |
| Kimi K2.5 | Moonshot AI | MoE (undisclosed) | 24GB (lightweight) | ✅ |
| MiniMax M2.7 | MiniMax | MoE 230B | Not yet open-sourced | ✅ |

GLM-5 (Zhipu AI)

Versions & Hardware Requirements

GLM-5 offers 4 parameter versions, which gives it the widest coverage of any domestic LLM currently available.
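A rough way to sanity-check a VRAM figure like "24GB (8B)" is to multiply the parameter count by the bytes stored per parameter. The sketch below does that arithmetic; the 1.2 overhead factor for activations and KV cache is an assumed rule of thumb, not a number from this article, and real requirements depend on context length and the serving runtime.

```python
def weight_memory_gb(params_billion, bytes_per_param):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bytes_per_param / 1e9


def fits(params_billion, bytes_per_param, vram_gb, overhead=1.2):
    """Check whether weights plus an assumed overhead factor fit in VRAM."""
    return weight_memory_gb(params_billion, bytes_per_param) * overhead <= vram_gb


# 8B parameters at FP16 (2 bytes each): 16 GB of weights.
print(weight_memory_gb(8, 2))  # 16.0
print(fits(8, 2, vram_gb=24))  # True

# A 230B model, even at 4-bit (0.5 bytes each), needs 115 GB for weights alone.
print(weight_memory_gb(230, 0.5))  # 115.0
print(fits(230, 0.5, vram_gb=24))  # False
```

Note that for MoE models all experts must normally be resident in memory, so the total parameter count, not the active count, drives this estimate.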

Continue reading →

Harness Engineering: Putting Reins and Brakes on AI

What is Harness Engineering?

Definition: Harness Engineering is the discipline of designing constraints, feedback loops, tool systems, and verification mechanisms around AI agents.

This definition sounds very academic, so let’s understand it through a vivid metaphor:

Harnessing a Thousand-Mile Horse: A thousand-mile horse (AI Agent) has powerful running capabilities, but without a rider, it might run randomly, injure passersby, or even rush off a cliff. Harness Engineering equips this horse with reins (constraints), brakes (safety controls), a whip (incentive mechanisms), and a rider (monitoring), ensuring it travels safely on the correct path.
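The metaphor maps fairly directly onto code. The following is a minimal sketch, with all class and parameter names invented for illustration: an allowlist plays the role of the reins, a step budget the brakes, and a verification callback plus an audit log the rider. Incentive mechanisms (the whip) are not modeled here.

```python
class HarnessViolation(Exception):
    """Raised when the agent steps outside the harness."""


class Harness:
    def __init__(self, allowed_tools, max_steps, verify):
        self.allowed_tools = set(allowed_tools)  # reins: tools the agent may call
        self.max_steps = max_steps               # brakes: budget of successful steps
        self.verify = verify                     # verification applied to each result
        self.log = []                            # rider: audit trail for monitoring

    def run_step(self, tool_name, tool_fn, *args):
        if len(self.log) >= self.max_steps:
            raise HarnessViolation("step budget exhausted")
        if tool_name not in self.allowed_tools:
            raise HarnessViolation(f"tool not allowed: {tool_name}")
        result = tool_fn(*args)
        if not self.verify(result):
            raise HarnessViolation(f"verification failed for {tool_name}")
        self.log.append((tool_name, args, result))
        return result


harness = Harness(allowed_tools={"add"}, max_steps=2,
                  verify=lambda r: isinstance(r, int))
first = harness.run_step("add", lambda a, b: a + b, 1, 2)

try:
    harness.run_step("delete_everything", lambda: None)
    blocked = False
except HarnessViolation:
    blocked = True

print(first, blocked)  # 3 True
```

The point of the design is that the checks live outside the agent: the agent can propose any step, but only steps that pass the harness are executed and recorded.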

Continue reading →

From Context to Harness: Info Is Ready, But AI Is Still Unreliable

Scenario: Information Is Correct, But Execution Goes Wrong

Let’s start with a real-world story:

Background: A company deployed a RAG-based technical documentation Q&A system. The system worked perfectly: when users asked “How to configure Redis cluster?” it could accurately retrieve the relevant information from the technical documents and provide detailed configuration steps.

Problem: When a user asked “Delete temporary files in the test directory,” the system correctly retrieved the right technical documentation, but during execution it mistakenly deleted the entire project’s core code.
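One kind of check that could have caught the failure in this story is a path-containment guard that runs before any deletion. The sketch below is an assumption-laden illustration (hypothetical function names and paths, and it only plans deletions rather than performing them); it is not how the system in the story was built.

```python
from pathlib import Path


def is_strictly_inside(target, allowed_root):
    """True if target resolves to a path strictly below allowed_root."""
    resolved_target = Path(target).resolve()
    resolved_root = Path(allowed_root).resolve()
    return resolved_root in resolved_target.parents


def plan_deletions(targets, allowed_root):
    """Split requested deletions into (allowed, rejected) without deleting anything."""
    allowed, rejected = [], []
    for target in targets:
        if is_strictly_inside(target, allowed_root):
            allowed.append(target)
        else:
            rejected.append(target)
    return allowed, rejected


requested = [
    "/project/test/tmp/cache.tmp",
    "/project/test/../src/main.py",  # escapes the test directory via ".."
    "/project/src/core.py",
]
allowed, rejected = plan_deletions(requested, "/project/test")
print(allowed)  # only the file under /project/test
```

Resolving both paths before comparing them matters: a plain string prefix check would accept the `..` path even though it points at the source tree.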

Continue reading →

Context Engineering: Giving AI the Right Knowledge

What is Context Engineering?

In June 2025, Andrej Karpathy offered a definition of Context Engineering in a post on X: “the delicate art and science of filling the context window with just the right information for the next step.”

This definition is exceptionally elegant. The core distinction from Prompt Engineering is:

- Prompt Engineering: Optimizes “what you say”: focuses on how input instructions are expressed
- Context Engineering: Optimizes “what the model knows”: focuses on what information the model can access

Imagine this:
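"Filling the context window with just the right information" can be read as a selection problem under a token budget. Here is a minimal sketch of that idea, assuming relevance scores are already available and using a crude word count in place of a real tokenizer; the function names and sample snippets are illustrative only.

```python
def estimate_tokens(text):
    """Crude token estimate: whitespace-separated words (not a real tokenizer)."""
    return len(text.split())


def build_context(snippets, budget):
    """Greedily pick the highest-scoring (score, text) snippets that fit the budget."""
    chosen, used = [], 0
    for score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost <= budget:
            chosen.append(text)
            used += cost
    return chosen, used


snippets = [
    (0.9, "Redis cluster needs at least three master nodes"),
    (0.2, "Company holiday schedule for the year"),
    (0.7, "Set cluster-enabled yes in redis.conf"),
]
chosen, used = build_context(snippets, budget=14)
print(chosen)  # the two Redis snippets; the holiday schedule does not fit
print(used)    # 13
```

Prompt wording does not appear anywhere in this function: the only thing it decides is which information the model gets to see, which is the distinction drawn above.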

Continue reading →