YOLO Getting Started: History, Version Comparison and Environment Setup
Learning Path and Version Selection Guide
Version Selection Guide
| Version | Release Date | Development Team | Use Cases | Recommendation Index |
|---|---|---|---|---|
| YOLO26 | 2026.01 | Ultralytics Official | Edge deployment, CPU inference, industrial applications | ⭐⭐⭐⭐⭐ |
| YOLOv8 | 2023.01 | Ultralytics Official | Beginner learning, complete ecosystem, general scenarios | ⭐⭐⭐⭐⭐ |
| YOLO11 | 2024.09 | Ultralytics Official | Efficiency optimization, lightweight deployment | ⭐⭐⭐⭐ |
| YOLOv10 | 2024.05 | Tsinghua University | Research exploration, NMS-free end-to-end | ⭐⭐⭐⭐ |
| YOLOv9 | 2024.02 | Academia Sinica (Taiwan) | High precision, small object detection | ⭐⭐⭐⭐ |
| YOLOv12 | 2025.02 | University at Buffalo + University of Chinese Academy of Sciences | Attention mechanism research | ⭐⭐⭐ |
Learning Path Recommendations
- Beginner Stage (1-2 weeks): Start with YOLOv8, master basic concepts and API usage
- Intermediate Stage (2-3 weeks): Learn custom dataset training, parameter tuning and optimization
- Advanced Stage (2-3 weeks): Learn model deployment, engineering implementation
- Research Stage (ongoing): Explore new features in YOLO11, YOLO26, YOLOv9/v10/v12
Complete YOLO Development History Timeline
| Version | Release Date | Core Innovation | Milestone Significance |
|---|---|---|---|
| YOLOv1 | 2015.06 | Pioneer single-stage detection | Foundation for real-time detection |
| YOLOv2 | 2016.12 | Batch Normalization, anchor boxes | Dual improvement in accuracy and speed |
| YOLOv3 | 2018.04 | Multi-scale detection, residual networks | Industry standard |
| YOLOv4 | 2020.04 | CSPDarknet, Mosaic | Peak of engineering implementation |
| YOLOv5 | 2020.06 | PyTorch framework, user-friendly | Highest adoption rate |
| YOLOv7 | 2022.07 | E-ELAN, reparameterization | Balance between speed and accuracy |
| YOLOv8 | 2023.01 | C2f, Anchor-Free, unified framework | Ultralytics unified ecosystem |
| YOLOv9 | 2024.02 | GELAN, PGI (Programmable Gradient Information) | Training efficiency revolution |
| YOLOv10 | 2024.05 | NMS-free, efficiency-precision tradeoff | End-to-end detection |
| YOLO11 | 2024.09 | Architecture optimization, parameter reduction | Efficiency optimized version |
| YOLOv12 | 2025.02 | Area Attention mechanism | Attention architecture |
| YOLO26 | 2026.01 | DFL-free, NMS-free, up to 43% faster CPU inference | Edge computing new standard |
Core Principles and Version Comparison
Ultralytics Official Main Line Versions
YOLOv8 Core Features:
- C2f module replaces C3, enhancing gradient flow
- Anchor-Free detection head, simplifying post-processing
- Unified framework supporting detection, segmentation, classification, pose estimation
- Most complete ecosystem, comprehensive documentation
YOLO11 Core Improvements:
- Backbone/Neck structure lightweight optimization
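- About 22% fewer parameters (YOLO11m compared with YOLOv8m) and a reported 25% speed improvement
- Same Ultralytics API as YOLOv8; switching only requires changing the weights filename (for example `yolov8n.pt` to `yolo11n.pt`)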
- Improved small object detection accuracy
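The API compatibility between YOLOv8 and YOLO11 can be sketched as follows. This is a minimal illustration, assuming the `ultralytics` package is installed; the `NANO_WEIGHTS` table and the `load_detector` helper are hypothetical names introduced here, while `yolov8n.pt` and `yolo11n.pt` are the pretrained nano weight files published by Ultralytics.

```python
# Pretrained nano detection weights for the two Ultralytics families.
# The dictionary and helper below are illustrative, not part of the library.
NANO_WEIGHTS = {
    "YOLOv8": "yolov8n.pt",
    "YOLO11": "yolo11n.pt",
}


def load_detector(family):
    """Load a pretrained detector; only the weights filename depends on the family."""
    from ultralytics import YOLO  # imported lazily so the table is usable without it

    return YOLO(NANO_WEIGHTS[family])


# Usage (requires ultralytics and downloads weights on first use):
#   model = load_detector("YOLO11")
#   results = model("image.jpg")
```

The rest of the calling code (training, validation, prediction) stays the same for both families, which is what makes trying a newer model inexpensive.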
YOLO26 Revolutionary Breakthrough (2026 Latest):
- ✅ Removed DFL module: Simplified bounding box prediction, significantly improved hardware compatibility
- ✅ Native NMS-free: End-to-end inference with no separate NMS post-processing step, which simplifies deployment
- ✅ Up to 43% faster CPU inference: Optimized for edge devices, targeting real-time use without a GPU
- ✅ ProgLoss + STAL: Significant improvement in small object detection accuracy
- ✅ MuSGD optimizer: Faster training convergence, stronger robustness
- ✅ Supports multiple vision tasks in one framework: Detection, instance segmentation, classification, pose (keypoint) estimation, oriented bounding boxes (OBB)
Third-party Research Versions
YOLOv9 (Academia Sinica, Taiwan):
- GELAN (Generalized Efficient Layer Aggregation Network)
- PGI (Programmable Gradient Information)
- Highest-accuracy variant: YOLOv9-E reaches 55.6% mAP@50:95 on COCO
YOLOv10 (Tsinghua University):
- Consistent dual assignment strategy
- Overall efficiency-precision optimization
- NMS-free end-to-end inference
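To make "NMS-free" concrete, here is a rough sketch of the classic greedy non-maximum suppression step that conventional detectors run after the network and that end-to-end models avoid. It is a plain-Python illustration under simple assumptions (boxes as `(x1, y1, x2, y2)` tuples, a single class), not the implementation used by any particular YOLO release.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes kept after greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))
```

An NMS-free model is trained so that each object produces a single confident prediction, which removes this step and its threshold from the deployment pipeline.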
YOLOv12 (University at Buffalo + University of Chinese Academy of Sciences):
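- Area Attention: the feature map is split into areas and attention is computed within each area
- Lower attention cost than global self-attention (still quadratic in the number of tokens, but with a smaller constant)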
- YOLOv12-N: 40.6% mAP at 1.64 ms latency on a T4 GPU
Complete Environment Setup Guide
Basic Environment Preparation
System Requirements:
- Windows 10/11, Ubuntu 20.04+, macOS 12+
- Python: 3.8 ~ 3.11 (recommended 3.10)
- PyTorch: >= 2.0 (recommended 2.3+)
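The requirements above can be checked before installing anything else. The following is a small sketch using only the standard library; the helper names `python_version_ok` and `is_installed` are hypothetical, and the version bounds are simply the ones listed above.

```python
import importlib.util
import sys

# Supported interpreter range taken from the requirements list (3.8 through 3.11).
MIN_PYTHON = (3, 8)
MAX_PYTHON = (3, 11)


def python_version_ok(version_info=sys.version_info):
    """Return True if major.minor falls inside the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def is_installed(package_name):
    """Return True if the package can be located, without importing it."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "supported:", python_version_ok())
    for name in ("torch", "ultralytics"):
        print(name, "installed:", is_installed(name))
```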
Anaconda Environment Creation
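Create and activate a dedicated environment, for example `conda create -n yolo python=3.10 -y` followed by `conda activate yolo` (the environment name `yolo` is arbitrary).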
PyTorch Installation (GPU/CPU Versions)
GPU Version (Recommended, CUDA 12.1):
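Install from the CUDA 12.1 wheel index: `pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121`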
CPU Version:
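Install from the CPU-only wheel index: `pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu`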
Ultralytics Installation (Supports all official versions)
Install or upgrade the package with `pip install -U ultralytics`.
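Separate Installation for Third-party Versions
YOLOv9:
Clone the reference repository with `git clone https://github.com/WongKinYiu/yolov9.git`, then run `pip install -r requirements.txt` inside the cloned directory.
YOLOv10:
Clone the reference repository with `git clone https://github.com/THU-MIG/yolov10.git`, then run `pip install -e .` inside the cloned directory.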
IoU (Intersection over Union) Explained
IoU (Intersection over Union) is one of the most fundamental evaluation metrics in object detection. It measures the overlap between a predicted bounding box and the ground truth.
IoU Formula
Mathematically:
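$$\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}$$
Here $A$ is the predicted box, $B$ is the ground-truth box, and $|\cdot|$ denotes area.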
Visually: the intersection area of two boxes divided by their union area. IoU = 1 means perfect overlap, IoU = 0 means no overlap.
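As a worked example, the definition can be written in a few lines of Python. This sketch assumes axis-aligned boxes in `(x1, y1, x2, y2)` pixel coordinates; the function names are illustrative and not taken from any library.

```python
def box_area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2); degenerate boxes have area 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(pred, truth):
    """Intersection over Union of a predicted box and a ground-truth box."""
    # The intersection is itself a box (possibly empty).
    inter_box = (
        max(pred[0], truth[0]),
        max(pred[1], truth[1]),
        min(pred[2], truth[2]),
        min(pred[3], truth[3]),
    )
    inter = box_area(inter_box)
    union = box_area(pred) + box_area(truth) - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes offset by 5 pixels: intersection 25, union 175.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

Note how a shift of half the box size in each direction already drops the IoU well below 0.5, which is why higher thresholds are demanding.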
IoU Threshold Selection Guide
| Threshold | Strictness | Use Case |
|---|---|---|
| 0.5 | Lenient | General detection, quick evaluation |
| 0.75 | Medium | Precise localization required |
| 0.9 | Strict | High-precision detection, industrial QC |
mAP@50 vs mAP@50:95
- mAP@50: mAP calculated at a fixed IoU threshold of 0.5, measuring the ability to “detect objects”
- mAP@50:95: Average mAP across 10 IoU thresholds from 0.5 to 0.95 (step 0.05), measuring “localization precision”
In practice: focus on mAP@50 for high recall, focus on mAP@50:95 for precise localization. Using both together gives a comprehensive view of model performance.
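The relationship between the two metrics can be illustrated with a short sketch. The AP values below are made up purely for illustration, and the function names are hypothetical; real values come from a validation run.

```python
def coco_iou_thresholds():
    """Return the 10 IoU thresholds 0.50, 0.55, ..., 0.95."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def map_50(ap_per_threshold):
    """mAP@50 is the AP at the first threshold (IoU = 0.5)."""
    return ap_per_threshold[0]


def map_50_95(ap_per_threshold):
    """mAP@50:95 is the mean AP over all 10 thresholds."""
    if len(ap_per_threshold) != len(coco_iou_thresholds()):
        raise ValueError("expected one AP value per IoU threshold")
    return sum(ap_per_threshold) / len(ap_per_threshold)


# Hypothetical AP values: accuracy drops as the IoU requirement tightens.
example_ap = [0.80, 0.78, 0.75, 0.71, 0.66, 0.60, 0.52, 0.41, 0.27, 0.10]
print("mAP@50   :", map_50(example_ap))
print("mAP@50:95:", round(map_50_95(example_ap), 3))
```

Because the stricter thresholds pull the average down, mAP@50:95 is always less than or equal to mAP@50 when AP decreases with the threshold, and a large gap between the two suggests loose localization.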
Quick Verification Script
After installing Ultralytics, run this minimal script to verify your environment:
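A minimal sketch of such a verification script, assuming `numpy`, `torch`, and `ultralytics` are installed. The function names `make_noise_image` and `verify_environment` are hypothetical helpers introduced here; `yolov8n.pt` is the pretrained nano weights file.

```python
import numpy as np


def make_noise_image(height=640, width=640, seed=0):
    """Return a random uint8 image in height x width x 3 layout."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)


def verify_environment(weights="yolov8n.pt"):
    """Load a pretrained model, run one inference, return the number of detections."""
    # Imported lazily so the helper above works even before installation.
    import torch
    from ultralytics import YOLO

    print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
    model = YOLO(weights)  # downloads the weights on first use
    results = model(make_noise_image(), verbose=False)
    return len(results[0].boxes)
```

Calling `print(verify_environment())` after installation runs the full load, preprocess, and inference path once; on noise input the returned count is typically 0.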
Expected output (0 detections on random noise is normal):
| |
Zero detections on random noise is expected — it confirms the model loaded, preprocessing works, and inference runs correctly. Ultralytics automatically uses GPU when CUDA is available.
Docker Environment Setup
If you would rather not modify your local Python environment, or you want a reproducible GPU-enabled setup, Docker is a good option.
Dockerfile Example
| |
Build and Run
| |
docker-compose Configuration
| |
Start commands:
| |
GPU access inside containers requires the NVIDIA Container Toolkit. Installation guide: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
Your First Detection: Complete Hands-On Tutorial
Now let’s run a complete object detection example: download a test image, run inference with a pretrained model, and save the annotated result.
Step 1: Download a Test Image
| |
Step 2: Create the Inference Script
Create detect.py:
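A minimal sketch of what `detect.py` could contain, assuming the test image from Step 1 was saved as `bus.jpg` and the annotated result should go to `result.jpg` (both filenames are hypothetical). The helper `describe_detection` is illustrative; the `YOLO` class, `result.boxes`, `result.names`, and `result.save` are part of the Ultralytics Python API.

```python
from pathlib import Path

IMAGE = "bus.jpg"  # hypothetical filename for the downloaded test image


def describe_detection(class_name, confidence, xyxy):
    """Format one detection as a readable line."""
    x1, y1, x2, y2 = (round(float(v)) for v in xyxy)
    return f"{class_name}: {confidence:.2f} at ({x1}, {y1}, {x2}, {y2})"


def detect(image_path=IMAGE, weights="yolov8n.pt", output_path="result.jpg"):
    """Run a pretrained model on one image and save the annotated result."""
    from ultralytics import YOLO  # imported lazily

    model = YOLO(weights)            # downloads the weights on first use
    result = model(image_path)[0]    # one Results object per input image
    for box in result.boxes:
        name = result.names[int(box.cls[0])]
        print(describe_detection(name, float(box.conf[0]), box.xyxy[0].tolist()))
    result.save(filename=output_path)  # writes the image with boxes drawn
    return Path(output_path)


if __name__ == "__main__":
    if Path(IMAGE).exists():
        detect(IMAGE)
    else:
        print(f"{IMAGE} not found; download a test image first (Step 1).")
```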
Step 3: Run and View Results
Run the script from the directory that contains it: `python detect.py`
Expected output:
| |
The first run auto-downloads the pretrained weights (~6MB); subsequent runs skip the download. YOLOv8n (Nano) is the smallest and fastest variant; switch to `yolov8s.pt` or `yolov8m.pt` for higher accuracy.