Harness Engineering: Putting Reins and Brakes on AI

February 15, 2026 · AI Tools · AI Engineering, Paradigm Evolution, Harness Engineering, Agent · AI Engineering Series


What is Harness Engineering?

Definition: Harness Engineering is the discipline of designing constraints, feedback loops, tool systems, and verification mechanisms around AI agents.

This definition can be understood through an analogy:

Harnessing a Thousand-Mile Horse: A thousand-mile horse (the AI agent) can run fast, but without a rider it may bolt in any direction, injure passersby, or run off a cliff. Harness Engineering equips the horse with reins (constraints), brakes (safety controls), a whip (incentive mechanisms), and a rider (monitoring), so that it travels safely along the intended path.

Core Philosophy: Human Steer, Agent Execute

Human Steer: Humans set objectives, monitor processes, make final decisions
Agent Execute: AI executes autonomously within constraint frameworks

This philosophy solves the fundamental problem that Context Engineering couldn’t address: AI has knowledge, but lacks behavioral constraints and verification.

Origins and Development

Mitchell Hashimoto’s Groundbreaking Contribution

On February 5, 2026, Mitchell Hashimoto, co-founder of HashiCorp, formally proposed the concept of “Harness Engineering.”

Mitchell wrote on his blog:

“Harness Engineering isn’t about limiting AI capabilities; it’s about ensuring they are exercised responsibly. Like providing safety equipment for a race car driver—it doesn’t limit how fast they can go, but ensures they can go fast safely.”

OpenAI’s Technological Push

On February 11, 2026, OpenAI published “Harness engineering: leveraging Codex,” explaining the importance of Harness Engineering from a technical perspective:

python
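# Illustrative harness around a Codex-style model (a sketch, not OpenAI's published code)
from contextlib import nullcontext


class CodexHarness:
    def __init__(self, codex_model, safety_constraints, execution_monitor=None):
        self.model = codex_model
        # Assumption: each constraint is a callable returning True for an acceptable task
        self.constraints = safety_constraints
        # Assumption: the monitor is any context manager; nullcontext() does nothing
        self.execution_monitor = execution_monitor or nullcontext()

    def validate_task(self, task):
        """A task is valid only if every constraint accepts it"""
        return all(constraint(task) for constraint in self.constraints)

    def validate_result(self, result):
        """Placeholder result check: reject missing results"""
        return result is not None

    def safe_execute(self, task):
        # 1. Task safety verification
        if not self.validate_task(task):
            return "Task violates safety constraints"

        # 2. Execution process monitoring
        with self.execution_monitor:
            result = self.model.execute(task)

            # 3. Result verification
            if not self.validate_result(result):
                return "Execution result validation failed"

        return result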

Wang Xin’s Theoretical Contributions

Technical expert Wang Xin published several articles about Harness Engineering on his personal website, wangxin.io, refining the concept from a theoretical perspective. He proposed:

“The core of Harness Engineering is ‘controllability’. No matter how powerful an AI Agent is, if it cannot be controlled and verified by humans, it cannot be used in production environments.”

The Harness Engineering Formula

Mitchell Hashimoto proposed a concise formula:

Agent = LLM + Harness

This formula reveals the essence of modern AI Agents:

LLM: Provides cognitive and execution capabilities
Harness: Provides safety controls and constraint frameworks
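
The formula can be read quite literally as composition. The following is a minimal sketch, not Mitchell Hashimoto's or anyone else's implementation: the names `Harness`, `Agent`, `echo_llm`, and the two example checks are hypothetical and exist only to show an LLM callable wrapped by pre- and post-checks.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration; none of these names come from a real library.
LLM = Callable[[str], str]      # a model reduced to "prompt in, text out"
Check = Callable[[str], bool]   # returns True when the text is acceptable


@dataclass
class Harness:
    pre_checks: List[Check]     # constraints applied to the task before the model runs
    post_checks: List[Check]    # verification applied to the model's output


@dataclass
class Agent:
    llm: LLM
    harness: Harness

    def run(self, task: str) -> str:
        # The harness decides whether the task may reach the model at all
        if not all(check(task) for check in self.harness.pre_checks):
            raise ValueError("task rejected by harness")
        output = self.llm(task)
        # The harness also decides whether the output may leave the agent
        if not all(check(output) for check in self.harness.post_checks):
            raise ValueError("output rejected by harness")
        return output


def echo_llm(prompt: str) -> str:
    """Stand-in for a real model call."""
    return f"done: {prompt}"


agent = Agent(
    llm=echo_llm,
    harness=Harness(
        pre_checks=[lambda task: "rm -rf" not in task],
        post_checks=[lambda out: len(out) < 200],
    ),
)
```

Swapping `echo_llm` for a real model call leaves the harness untouched, which is the point of the formula: capability and control are separate parts.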

Four Core Subsystems

Harness Engineering includes four interrelated core subsystems that together form the safety framework for AI Agents.

1. Tool Injection System

Manages the tools that AI Agents can call, ensuring the safety and controllability of tool invocation.

Core Components:

Function Calling Protocol: Standardized tool calling interface
Tool Registry: Metadata management for callable tools
Permission Control: Role-based tool access control
Sandbox Isolation: Isolated environment for tool execution

Technical Implementation:

python
class ToolInjectionSystem:
    def __init__(self):
        self.tool_registry = ToolRegistry()
        self.permission_manager = PermissionManager()
        self.sandbox = ExecutionSandbox()
    
    def inject_tools(self, agent, required_tools):
        """Safely inject tools into Agent"""
        # 1. Tool permission verification
        for tool in required_tools:
            if not self.permission_manager.can_use(agent, tool):
                raise PermissionError(f"Agent {agent} cannot use tool {tool}")
        
        # 2. Tool registration and validation
        validated_tools = []
        for tool in required_tools:
            tool_info = self.tool_registry.get_tool(tool)
            if self.validate_tool(tool_info):
                validated_tools.append(tool_info)
        
        # 3. Sandbox environment configuration
        sandbox_config = self.sandbox.create_config(validated_tools)
        
        return {
            'tools': validated_tools,
            'sandbox': sandbox_config,
            'permissions': self.permission_manager.get_permissions(agent)
        }

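Claude Code Practice Case: Claude Code (first released as a research preview in February 2025) is a good example of a tool injection system in practice. The configuration below illustrates the idea; it is not Claude Code’s actual settings format: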

python
# Illustrative tool-permission config in the spirit of Claude Code (hypothetical keys)
claude_code_config = {
    'allowed_tools': [
        'file_read',
        'file_write',
        'bash_execute',
        'web_search',
        'git_operations'
    ],
    'restricted_directories': [
        '/system',
        '/private',
        '/config'
    ],
    'execution_timeout': 30,  # seconds
    'max_file_size': '10MB'
}
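
A list of restricted directories only helps if something enforces it before each file operation. Here is a minimal sketch of such a check, assuming POSIX-style paths; `is_path_allowed` is a hypothetical helper, not part of any real tool. It normalizes the path first so that `..` segments cannot be used to slip past the check.

```python
import posixpath
from pathlib import PurePosixPath

RESTRICTED_DIRECTORIES = ["/system", "/private", "/config"]


def is_path_allowed(path: str, restricted=RESTRICTED_DIRECTORIES) -> bool:
    """Return False for relative paths and for anything inside a restricted directory."""
    # Collapse "." and ".." segments before comparing (purely lexical, no filesystem access)
    candidate = PurePosixPath(posixpath.normpath(path))
    if not candidate.is_absolute():
        return False
    for root in restricted:
        root_parts = PurePosixPath(root).parts
        # Compare whole path components so "/systemd" does not match "/system"
        if candidate.parts[:len(root_parts)] == root_parts:
            return False
    return True
```

Because the check is lexical, it does not follow symlinks. A real sandbox would also need to resolve paths on the filesystem or rely on OS-level isolation.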

2. State Management System

Tracks and manages the execution state of AI Agents, ensuring traceability and recoverability of tasks.

Core Components:

Task Progress Tracking: Real-time monitoring of task execution status
State Persistence: Saving intermediate states to database
Interruption Recovery: Continuing execution from interruption points
Concurrency Isolation: Preventing interference between multiple tasks

Technical Implementation:

python
from datetime import datetime


class StateManager:
    def __init__(self, storage_backend):
        self.storage = storage_backend
        self.state_cache = {}
    
    def create_task_state(self, task_id, initial_state):
        """Create task state"""
        state = {
            'task_id': task_id,
            'status': 'initialized',
            'progress': 0.0,
            'subtasks': [],
            'created_at': datetime.now(),
            'last_updated': datetime.now(),
            'data': initial_state
        }
        
        self.storage.save_state(task_id, state)
        return state
    
    def update_progress(self, task_id, progress, subtask=None):
        """Update task progress"""
        state = self.storage.get_state(task_id)
        state['progress'] = progress
        state['last_updated'] = datetime.now()
        
        if subtask:
            state['subtasks'].append({
                'name': subtask,
                'completed_at': datetime.now()
            })
        
        self.storage.save_state(task_id, state)
        return state
    
    def save_checkpoint(self, task_id, checkpoint_data):
        """Save checkpoint"""
        state = self.storage.get_state(task_id)
        state['checkpoint'] = {
            'data': checkpoint_data,
            'saved_at': datetime.now()
        }
        self.storage.save_state(task_id, state)
    
    def restore_from_checkpoint(self, task_id):
        """Restore from checkpoint"""
        state = self.storage.get_state(task_id)
        if 'checkpoint' in state:
            return state['checkpoint']['data']
        return None
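
To make interruption recovery concrete, here is a small self-contained sketch. It assumes a storage backend with the same `save_state` / `get_state` shape used above; `InMemoryStorage` and `process_items` are hypothetical names, and the "work" is just upper-casing strings. The checkpoint is the index of the next item to process.

```python
import copy


class InMemoryStorage:
    """Toy backend; deep copies keep callers from mutating stored state by accident."""

    def __init__(self):
        self._states = {}

    def save_state(self, task_id, state):
        self._states[task_id] = copy.deepcopy(state)

    def get_state(self, task_id):
        return copy.deepcopy(self._states[task_id])


def process_items(storage, task_id, items, fail_at=None):
    """Process items in order, checkpointing after each one so a rerun can resume."""
    try:
        state = storage.get_state(task_id)
    except KeyError:
        state = {'next_index': 0, 'results': []}

    for index in range(state['next_index'], len(items)):
        if fail_at is not None and index == fail_at:
            raise RuntimeError(f"simulated interruption at item {index}")
        state['results'].append(items[index].upper())
        state['next_index'] = index + 1
        storage.save_state(task_id, state)  # checkpoint after every completed item
    return state['results']


storage = InMemoryStorage()
try:
    process_items(storage, 'task-1', ['a', 'b', 'c'], fail_at=2)
except RuntimeError:
    pass  # the first two items are already checkpointed

resumed = process_items(storage, 'task-1', ['a', 'b', 'c'])
```

The second call picks up at item 2 instead of starting over, so `resumed` contains all three results without repeating finished work.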

Concurrency Control Strategy:

python
import asyncio


class ConcurrencyError(RuntimeError):
    """Raised when the concurrent-task limit has been reached"""


class ConcurrencyManager:
    def __init__(self, max_concurrent_tasks=5):
        self.max_concurrent = max_concurrent_tasks
        self.active_tasks = set()  # IDs of tasks currently running
        self.task_locks = {}

    async def execute_task(self, task):
        """Execute task with concurrency handling"""
        # 1. Check concurrency limits
        if len(self.active_tasks) >= self.max_concurrent:
            raise ConcurrencyError(f"Maximum {self.max_concurrent} tasks allowed")

        # 2. Register the task before it runs, so in-flight work counts toward the limit
        self.active_tasks.add(task.id)
        task_lock = asyncio.Lock()
        self.task_locks[task.id] = task_lock

        try:
            # 3. Execute task
            async with task_lock:
                return await task.execute()
        finally:
            # 4. Cleanup resources
            self.task_locks.pop(task.id, None)
            self.active_tasks.discard(task.id)
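
Rejecting work when the limit is reached is one policy; queueing it is another. The sketch below shows the queueing variant with `asyncio.Semaphore` from the standard library. `run_with_limit` and `work` are hypothetical names, and the peak counter is there only to make the limit observable.

```python
import asyncio


async def run_with_limit(factories, max_concurrent=2):
    """Run coroutine factories with at most max_concurrent in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrent)
    running = 0
    peak = 0

    async def guarded(factory):
        nonlocal running, peak
        async with semaphore:  # waits here instead of raising when the limit is hit
            running += 1
            peak = max(peak, running)
            try:
                return await factory()
            finally:
                running -= 1

    # gather preserves the order of its inputs in the returned list
    results = await asyncio.gather(*(guarded(f) for f in factories))
    return results, peak


async def work(n):
    await asyncio.sleep(0.01)  # stand-in for a real agent task
    return n * 2


results, peak = asyncio.run(
    run_with_limit([lambda n=n: work(n) for n in range(5)], max_concurrent=2)
)
```

All five tasks complete, but never more than two run at the same time. Which policy fits depends on whether callers can tolerate waiting.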

3. Verification Loop System

Verifies the execution results of AI Agents, ensuring the quality and safety of outputs.

Core Components:

Output Quality Checking: Verifies the quality of generated content
Error Detection: Identifies anomalies during execution
Auto-correction: Automatically corrects errors
Human Review Trigger: Human intervention for complex situations

Technical Implementation:

python
class VerificationLoop:
    def __init__(self):
        self.checkers = [
            QualityChecker(),
            SafetyChecker(),
            FormatChecker(),
            CompletenessChecker()
        ]
        self.error_handlers = [
            AutoFixHandler(),
            HumanReviewHandler(),
            LogHandler()
        ]
    
    async def verify_execution(self, task, result):
        """Verify execution results"""
        verification_results = {}
        
        # 1. Execute all checks
        for checker in self.checkers:
            try:
                check_result = await checker.check(result)
                verification_results[checker.name] = check_result
            except Exception as e:
                verification_results[checker.name] = {
                    'passed': False,
                    'error': str(e)
                }
        
        # 2. Process check results
        overall_passed = all(r['passed'] for r in verification_results.values())
        
        if not overall_passed:
            # 3. Error handling
            error_summary = self.summarize_errors(verification_results)
            corrected_result = await self.handle_errors(task, result, error_summary)
            return corrected_result
        
        return result
    
    def summarize_errors(self, verification_results):
        """Summarize error information"""
        errors = []
        for checker_name, result in verification_results.items():
            if not result['passed']:
                errors.append({
                    'checker': checker_name,
                    # Checkers that raised store their message under 'error', not 'issue'
                    'issue': result.get('issue') or result.get('error', 'Unknown issue'),
                    'severity': result.get('severity', 'medium')
                })
        return errors
    
    async def handle_errors(self, task, result, errors):
        """Handle errors"""
        for error in errors:
            # Handle by severity
            if error['severity'] == 'critical':
                # Trigger human review
                await self.trigger_human_review(task, result, error)
            elif error['severity'] == 'high':
                # Auto-fix
                await self.auto_fix(task, result, error)
            else:
                # Log error
                await self.log_error(task, result, error)
        
        return result

LangGraph Workflow Orchestration: LangGraph (released January 2024) provides workflow orchestration with composable nodes and conditional branches for verification loops:

python
# Verification loop as a schematic graph (plain dict for illustration, not the LangGraph API)
verification_workflow = {
    'nodes': [
        {'id': 'input', 'type': 'input'},
        {'id': 'quality_check', 'type': 'quality_check'},
        {'id': 'safety_check', 'type': 'safety_check'},
        {'id': 'format_check', 'type': 'format_check'},
        {'id': 'auto_fix', 'type': 'auto_fix'},
        {'id': 'human_review', 'type': 'human_review'},
        {'id': 'output', 'type': 'output'}
    ],
    'edges': [
        {'from': 'input', 'to': 'quality_check'},
        {'from': 'quality_check', 'to': 'safety_check'},
        {'from': 'safety_check', 'to': 'format_check'},
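        {'from': 'format_check', 'to': 'auto_fix', 'condition': 'needs_fix'},
        {'from': 'format_check', 'to': 'human_review', 'condition': 'needs_review'},
        # Without this edge, results that pass every check would never reach the output
        {'from': 'format_check', 'to': 'output', 'condition': 'passed'},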
        {'from': 'auto_fix', 'to': 'output'},
        {'from': 'human_review', 'to': 'output'}
    ]
}
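
A node-and-edge description like this becomes a verification loop once something walks it. The following is a minimal plain-Python sketch, not LangGraph code: `run_workflow`, the handler convention (each handler returns the payload plus a label that selects the outgoing edge), and the tiny example graph are all assumptions made for illustration.

```python
def run_workflow(workflow, handlers, payload, start='input', end='output', max_steps=20):
    """Walk the graph from start to end, following the edge whose condition matches."""
    current = start
    visited = [current]
    for _ in range(max_steps):  # hard cap so a fix/check cycle cannot loop forever
        if current == end:
            return payload, visited
        handler = handlers.get(current, lambda p: (p, None))
        payload, label = handler(payload)
        candidates = [e for e in workflow['edges'] if e['from'] == current]
        # Prefer an edge whose condition equals the label, else an unconditional edge
        edge = next((e for e in candidates if e.get('condition') == label), None)
        if edge is None:
            edge = next((e for e in candidates if 'condition' not in e), None)
        if edge is None:
            raise ValueError(f"no edge from {current!r} for label {label!r}")
        current = edge['to']
        visited.append(current)
    raise RuntimeError("exceeded max_steps")


example_workflow = {
    'edges': [
        {'from': 'input', 'to': 'format_check'},
        {'from': 'format_check', 'to': 'auto_fix', 'condition': 'needs_fix'},
        {'from': 'format_check', 'to': 'output', 'condition': 'passed'},
        {'from': 'auto_fix', 'to': 'format_check'},  # re-verify after every fix
    ]
}

example_handlers = {
    # Toy rule: text must end with a newline
    'format_check': lambda text: (text, 'passed' if text.endswith('\n') else 'needs_fix'),
    'auto_fix': lambda text: (text + '\n', None),
}

final_text, path = run_workflow(example_workflow, example_handlers, 'hello')
```

The edge from `auto_fix` back to `format_check` is what makes this a loop: a fix is never trusted until it has passed the same check that rejected the original.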

4. Constraint Layering System (Guardrails)

Sets multi-level constraints for AI Agents, ensuring behavioral compliance and safety.

Core Components:

Rule Engine: Configurable business rules
Format Enforcement: Output format requirements
Security Policy: Behavioral constraints for safety
Compliance Audit: Compliance checking and recording

Technical Implementation:

python
class GuardrailSystem:
    def __init__(self):
        self.rule_engine = RuleEngine()
        self.format_enforcer = FormatEnforcer()
        self.security_policy = SecurityPolicy()
        self.compliance_auditor = ComplianceAuditor()
    
    def apply_constraints(self, task, input_data):
        """Apply constraint system"""
        # 1. Rule checking
        rule_violations = self.rule_engine.check_rules(task, input_data)
        if rule_violations:
            raise RuleViolationError(rule_violations)
        
        # 2. Format enforcement
        formatted_input = self.format_enforcer.enforce_format(input_data)
        
        # 3. Security policy check
        security_check = self.security_policy.check_security(task, formatted_input)
        if not security_check['passed']:
            raise SecurityError(security_check['issues'])
        
        # 4. Compliance audit
        self.compliance_auditor.log_audit_event(task, formatted_input)
        
        return formatted_input
    
    def create_rule_set(self, domain):
        """Create domain-specific rule sets"""
        rule_sets = {
            'finance': [
                {'rule': 'no_financial_data_leakage', 'severity': 'critical'},
                {'rule': 'require_approval_for_large_transactions', 'severity': 'high'}
            ],
            'healthcare': [
                {'rule': 'hipaa_compliance', 'severity': 'critical'},
                {'rule': 'patient_data_anonymization', 'severity': 'high'}
            ],
            'general': [
                {'rule': 'no_harmful_content', 'severity': 'high'},
                {'rule': 'respect_privacy', 'severity': 'medium'}
            ]
        }
        
        return rule_sets.get(domain, rule_sets['general'])
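
The rule sets above name rules but do not say how a rule is evaluated. One simple option is to pair each name with a predicate. This is a minimal sketch under that assumption; the `Rule` dataclass, the two example rules, and the 16-digit pattern are hypothetical and far cruder than a production rule engine.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    severity: str
    violated: Callable[[str], bool]  # returns True when the text breaks the rule


RULES = [
    # Crude stand-in for a data-leakage rule: any standalone 16-digit number
    Rule('no_card_numbers', 'critical',
         lambda text: re.search(r'\b\d{16}\b', text) is not None),
    Rule('max_length', 'medium', lambda text: len(text) > 500),
]


def check_rules(text, rules=RULES):
    """Return one violation record per broken rule, in rule order."""
    return [
        {'rule': rule.name, 'severity': rule.severity}
        for rule in rules
        if rule.violated(text)
    ]
```

Keeping the severity on the rule lets the caller decide what to do with each violation, for example blocking on `critical` and only logging `medium`.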

Design Principles

Feedforward Control

Set constraints and expectations before execution to prevent problems:

python
class FeedforwardController:
    def __init__(self):
        self.constraints = []
        self.expectations = []
    
    def setup_constraints(self, task):
        """Setup feedforward constraints"""
        # 1. Permission pre-check
        self.validate_permissions(task)
        
        # 2. Resource pre-allocation
        self.allocate_resources(task)
        
        # 3. Risk assessment
        self.assess_risks(task)
        
        # 4. Expectation setting
        self.set_expectations(task)
    
    def validate_permissions(self, task):
        """Permission validation"""
        required_perms = task.get_required_permissions()
        for perm in required_perms:
            if not self.check_permission(perm):
                raise PermissionError(f"Missing permission: {perm}")
    
    def allocate_resources(self, task):
        """Resource allocation"""
        resources = task.get_required_resources()
        for resource, amount in resources.items():
            if not self.allocate(resource, amount):
                raise ResourceError(f"Cannot allocate {resource}: {amount}")

Feedback Correction

Verify and correct after execution for continuous improvement:

python
class FeedbackController:
    def __init__(self):
        self.validators = []
        self.correctors = []
    
    def correct_execution(self, task, result):
        """Execute correction"""
        # 1. Result verification
        validation_result = self.validate_result(result)
        
        if not validation_result['passed']:
            # 2. Auto-correction
            corrected_result = self.auto_correct(result, validation_result['issues'])
            
            # 3. Re-verification
            if not self.validate_result(corrected_result)['passed']:
                # 4. Human intervention
                return self.human_intervention(task, corrected_result)
            
            return corrected_result
        
        return result
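
Here is the same verify, correct, re-verify, escalate sequence as a runnable sketch with a concrete task: the agent is expected to return JSON but sometimes wraps it in chatter. The function names and the brace-trimming fix are assumptions made for illustration; only `json.loads` and `json.JSONDecodeError` come from the standard library.

```python
import json


def validate(text):
    """Pass only if the text parses as JSON."""
    try:
        json.loads(text)
        return {'passed': True, 'issues': []}
    except json.JSONDecodeError as exc:
        return {'passed': False, 'issues': [str(exc)]}


def auto_correct(text):
    """Keep only the span from the first '{' to the last '}', if there is one."""
    start, end = text.find('{'), text.rfind('}')
    if start == -1 or end < start:
        return text
    return text[start:end + 1]


def correct_with_feedback(text, max_attempts=2):
    """Validate, auto-correct up to max_attempts times, then hand off to a human."""
    for attempt in range(max_attempts + 1):
        if validate(text)['passed']:
            return {'status': 'accepted', 'result': text, 'attempts': attempt}
        if attempt < max_attempts:
            text = auto_correct(text)
    return {'status': 'needs_human_review', 'result': text, 'attempts': max_attempts}
```

The bound on attempts matters: without it, a corrector that cannot fix the problem would retry forever instead of escalating.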

Progressive Automation

A gradual process from human supervision to full automation:

python
class ProgressiveAutomation:
    def __init__(self):
        self.automation_levels = {
            'level_1': 'human_supervised',
            'level_2': 'human_in_the_loop',
            'level_3': 'autonomous_with_monitoring',
            'level_4': 'fully_autonomous'
        }
    
    def get_automation_level(self, task_complexity, risk_level):
        """Determine automation level based on task complexity and risk"""
        if risk_level == 'critical':
            return 'level_1'
        elif task_complexity == 'simple' and risk_level == 'low':
            return 'level_4'
        else:
            return 'level_2'  # Default to human-in-the-loop

Observability

Ensure system transparency and monitorability:

python
from datetime import datetime


class ObservabilitySystem:
    def __init__(self):
        self.logger = Logger()
        self.metrics = Metrics()
        self.tracer = Tracer()
    
    def log_execution(self, task, agent, result):
        """Record execution logs"""
        log_entry = {
            'timestamp': datetime.now(),
            'task': task,
            'agent': agent,
            'result': result,
            'duration': result.get('duration', 0),
            'status': result.get('status', 'unknown')
        }
        
        self.logger.info(log_entry)
        # Reuse the defaulted status so a result without 'status' does not raise KeyError
        self.metrics.record_task_completion(task, log_entry['status'])
    
    def trace_execution(self, task_id, steps):
        """Trace execution steps"""
        trace = {
            'task_id': task_id,
            'steps': steps,
            'start_time': datetime.now(),
            'end_time': None,
            'errors': []
        }
        
        return trace
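
A trace is most useful when each step records its own outcome and duration. The sketch below does that with a context manager and the standard `logging`, `time`, and `contextlib` modules; `traced_step` and the trace dictionary layout are assumptions that loosely mirror the structure above.

```python
import json
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("harness")


@contextmanager
def traced_step(trace, name):
    """Record status, error, and duration for one step, then emit a JSON log line."""
    step = {'name': name, 'status': 'running', 'error': None}
    trace['steps'].append(step)
    started = time.perf_counter()
    try:
        yield step
        step['status'] = 'ok'
    except Exception as exc:
        step['status'] = 'failed'
        step['error'] = repr(exc)
        trace['errors'].append({'step': name, 'error': repr(exc)})
        raise  # observability must not swallow the failure
    finally:
        step['duration_s'] = time.perf_counter() - started
        logger.info(json.dumps({'task_id': trace['task_id'], **step}))


trace = {'task_id': 'task-1', 'steps': [], 'errors': []}

with traced_step(trace, 'plan'):
    pass  # stand-in for real work

try:
    with traced_step(trace, 'execute'):
        raise ValueError('boom')
except ValueError:
    pass  # the failure is recorded in the trace and still propagates to the caller
```

Emitting one JSON object per step keeps the logs machine-readable, so the same records can feed dashboards or audits later.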

Limitations and Challenges

Current Limitations

Despite providing multi-layered safety controls, Harness Engineering still has some limitations:

Human dependency for task initiation: Still requires humans to start tasks and set objectives
Lack of autonomy between tasks: Cannot autonomously switch and coordinate between multiple tasks
Fixed loop structures: Predefined loop structures cannot dynamically adapt to new situations
Insufficient adaptability: Limited ability to handle unexpected situations

Motivation and Future

These limitations are precisely what drive the evolution to the next engineering phase:

Autonomous coordination between tasks: AI can autonomously switch and coordinate between multiple tasks
Dynamic loop structures: Loop structures can dynamically adjust as needed
Adaptive decision-making: Dynamically adjust strategies based on execution results
Full autonomy: From “requiring human initiation” to “autonomous task management”

These developments will push AI engineering into the next phase: Loop Engineering.

Summary: The Value of Harness

The value of Harness Engineering lies in turning AI from a demo into a deployable production system.

Feature	Simple Demo	Harness Engineering System
Safety	No safety guarantees	Multi-layer constraint protection
Reliability	Uncontrollable results	Executable and verifiable
Maintainability	Difficult to maintain	State traceable
Scalability	Hard to scale	Modular design

Harness Engineering is a crucial step in AI engineering. It pairs a model’s ability to produce answers with the ability to execute safely and reliably, which is what lets AI move from lab prototypes to production deployment.

Part of series: AI Engineering Series
