Harness Engineering: Putting Reins and Brakes on AI

What is Harness Engineering?

Definition: Harness Engineering is the discipline of designing constraints, feedback loops, tool systems, and verification mechanisms around AI agents.

This definition sounds very academic, so let’s understand it through a vivid metaphor:

Harnessing a Thousand-Mile Horse: A thousand-mile horse (AI Agent) has powerful running capabilities, but without a rider, it might run randomly, injure passersby, or even rush off a cliff. Harness Engineering equips this horse with reins (constraints), brakes (safety controls), whip (incentive mechanisms), and a rider (monitoring), ensuring it travels safely on the correct path.

Core Philosophy: Human Steer, Agent Execute

  • Human Steer: Humans set objectives, monitor processes, make final decisions
  • Agent Execute: AI executes autonomously within constraint frameworks

This philosophy solves the fundamental problem that Context Engineering couldn’t address: AI has knowledge, but lacks behavioral constraints and verification.

Origins and Development

Mitchell Hashimoto’s Groundbreaking Contribution

On February 5, 2026, Mitchell Hashimoto, co-founder of HashiCorp, formally proposed the concept of “Harness Engineering.”

Mitchell wrote on his blog:

“Harness Engineering isn’t about limiting AI capabilities; it’s about ensuring they are exercised responsibly. Like providing safety equipment for a race car driver—it doesn’t limit how fast they can go, but ensures they can go fast safely.”

OpenAI’s Technological Push

On February 11, 2026, OpenAI published “Harness engineering: leveraging Codex,” explaining the importance of Harness Engineering from a technical perspective:

python
# Illustrative sketch of a Codex-style harness (class and method names are hypothetical)
class CodexHarness:
    def __init__(self, codex_model, safety_constraints):
        self.model = codex_model
        self.constraints = safety_constraints
        self.execution_monitor = ExecutionMonitor()
    
    def safe_execute(self, task):
        # 1. Task safety verification
        if not self.validate_task(task):
            return "Task violates safety constraints"
        
        # 2. Execution process monitoring
        with self.execution_monitor:
            result = self.model.execute(task)
            
            # 3. Result verification
            if not self.validate_result(result):
                return "Execution result validation failed"
        
        return result

Wang Xin’s Theoretical Contributions

Technical expert Wang Xin published several articles about Harness Engineering on his personal website wangxin.io, refining the concept from a theoretical perspective. He proposed:

“The core of Harness Engineering is ‘controllability’. No matter how powerful an AI Agent is, if it cannot be controlled and verified by humans, it cannot be used in production environments.”

The Harness Engineering Formula

Mitchell Hashimoto proposed a concise formula:

Agent = LLM + Harness

This formula reveals the essence of modern AI Agents:

  • LLM: Provides cognitive and execution capabilities
  • Harness: Provides safety controls and constraint frameworks
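
One way to read the formula is as plain composition: the model produces output and the harness decides whether a task may start and whether the output may leave. The sketch below is only an illustration under stated assumptions: `Harness`, `Agent`, the banned-term check, and the stand-in `echo_llm` function are all hypothetical names, not part of any cited source.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Harness:
    """Hypothetical harness: one check applied to the task and to the output."""
    banned_terms: tuple = ("rm -rf",)

    def allows(self, text: str) -> bool:
        return not any(term in text for term in self.banned_terms)


@dataclass
class Agent:
    """Agent = LLM + Harness: the model does the work, the harness gates it."""
    llm: Callable[[str], str]
    harness: Harness

    def run(self, task: str) -> str:
        if not self.harness.allows(task):
            raise ValueError("task rejected by harness")
        output = self.llm(task)
        if not self.harness.allows(output):
            raise ValueError("output rejected by harness")
        return output


# Stand-in for a real model call, used only for illustration.
def echo_llm(prompt: str) -> str:
    return f"done: {prompt}"


agent = Agent(llm=echo_llm, harness=Harness())
```

Swapping `echo_llm` for a real model client would leave the harness untouched, which is the point of keeping the two parts separate.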

Four Core Subsystems

Harness Engineering includes four interrelated core subsystems that together form the safety framework for AI Agents.

1. Tool Injection System

Manages the tools that AI Agents can call, ensuring the safety and controllability of tool invocation.

Core Components:

  • Function Calling Protocol: Standardized tool calling interface
  • Tool Registry: Metadata management for callable tools
  • Permission Control: Role-based tool access control
  • Sandbox Isolation: Isolated environment for tool execution

Technical Implementation:

python
class ToolInjectionSystem:
    def __init__(self):
        self.tool_registry = ToolRegistry()
        self.permission_manager = PermissionManager()
        self.sandbox = ExecutionSandbox()
    
    def inject_tools(self, agent, required_tools):
        """Safely inject tools into Agent"""
        # 1. Tool permission verification
        for tool in required_tools:
            if not self.permission_manager.can_use(agent, tool):
                raise PermissionError(f"Agent {agent} cannot use tool {tool}")
        
        # 2. Tool registration and validation
        validated_tools = []
        for tool in required_tools:
            tool_info = self.tool_registry.get_tool(tool)
            if self.validate_tool(tool_info):
                validated_tools.append(tool_info)
        
        # 3. Sandbox environment configuration
        sandbox_config = self.sandbox.create_config(validated_tools)
        
        return {
            'tools': validated_tools,
            'sandbox': sandbox_config,
            'permissions': self.permission_manager.get_permissions(agent)
        }
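
The class above leaves `ToolRegistry` and `PermissionManager` undefined. As a rough illustration of how a registry and role-based access could fit together, the sketch below folds both into one small class. The `register` and `tools_for` methods and the role names are assumptions made for this example.

```python
class ToolRegistry:
    """Hypothetical registry mapping tool names to callables and allowed roles."""

    def __init__(self):
        self._tools = {}

    def register(self, name, func, allowed_roles):
        self._tools[name] = {"func": func, "roles": set(allowed_roles)}

    def tools_for(self, role):
        """Return only the tools this role may call."""
        return {
            name: entry["func"]
            for name, entry in self._tools.items()
            if role in entry["roles"]
        }


registry = ToolRegistry()
registry.register("file_read", lambda path: f"read {path}", ["reader", "editor"])
registry.register("file_write", lambda path: f"wrote {path}", ["editor"])

reader_tools = registry.tools_for("reader")
editor_tools = registry.tools_for("editor")
```

An agent that is only ever handed `reader_tools` has no reference to `file_write` at all, so the restriction does not rely on the model choosing to comply.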

Claude Code Practice Case: Claude Code (first released as a research preview in February 2025) is a good example of the tool injection approach in practice:

python
# Illustrative tool-permission settings in the style of Claude Code (not its actual configuration schema)
claude_code_config = {
    'allowed_tools': [
        'file_read',
        'file_write',
        'bash_execute',
        'web_search',
        'git_operations'
    ],
    'restricted_directories': [
        '/system',
        '/private',
        '/config'
    ],
    'execution_timeout': 30,  # seconds
    'max_file_size': '10MB'
}
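
A list of restricted directories only helps if the harness checks every path against it before a tool runs. The sketch below shows one possible check, assuming absolute POSIX paths. The `is_path_allowed` helper is hypothetical; it normalizes `..` segments but does not resolve symlinks, which a real harness would also need to handle.

```python
import posixpath
from pathlib import PurePosixPath

RESTRICTED = [PurePosixPath(p) for p in ("/system", "/private", "/config")]


def is_path_allowed(path: str) -> bool:
    """Reject a path if it is, or sits inside, a restricted directory."""
    # Collapse ".." and "." first so "/home/../private/key" cannot slip through.
    candidate = PurePosixPath(posixpath.normpath(path))
    return not any(
        candidate == root or root in candidate.parents for root in RESTRICTED
    )
```

Comparing path components rather than string prefixes matters here: `/configs/app.yaml` starts with the text `/config` but is not inside that directory, and this check lets it through.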

2. State Management System

Tracks and manages the execution state of AI Agents, ensuring traceability and recoverability of tasks.

Core Components:

  • Task Progress Tracking: Real-time monitoring of task execution status
  • State Persistence: Saving intermediate states to database
  • Interruption Recovery: Continuing execution from interruption points
  • Concurrency Isolation: Preventing interference between multiple tasks

Technical Implementation:

python
from datetime import datetime


class StateManager:
    def __init__(self, storage_backend):
        self.storage = storage_backend
        self.state_cache = {}
    
    def create_task_state(self, task_id, initial_state):
        """Create task state"""
        state = {
            'task_id': task_id,
            'status': 'initialized',
            'progress': 0.0,
            'subtasks': [],
            'created_at': datetime.now(),
            'last_updated': datetime.now(),
            'data': initial_state
        }
        
        self.storage.save_state(task_id, state)
        return state
    
    def update_progress(self, task_id, progress, subtask=None):
        """Update task progress"""
        state = self.storage.get_state(task_id)
        state['progress'] = progress
        state['last_updated'] = datetime.now()
        
        if subtask:
            state['subtasks'].append({
                'name': subtask,
                'completed_at': datetime.now()
            })
        
        self.storage.save_state(task_id, state)
        return state
    
    def save_checkpoint(self, task_id, checkpoint_data):
        """Save checkpoint"""
        state = self.storage.get_state(task_id)
        state['checkpoint'] = {
            'data': checkpoint_data,
            'saved_at': datetime.now()
        }
        self.storage.save_state(task_id, state)
    
    def restore_from_checkpoint(self, task_id):
        """Restore from checkpoint"""
        state = self.storage.get_state(task_id)
        if 'checkpoint' in state:
            return state['checkpoint']['data']
        return None
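
The `storage_backend` passed to `StateManager` is never shown. A minimal sketch of one that would satisfy the `save_state` and `get_state` calls is given below, assuming one JSON file per task. `JsonFileStorage` is a hypothetical name, and `default=str` means `datetime` values come back as strings after a reload.

```python
import json
import tempfile
from pathlib import Path


class JsonFileStorage:
    """Hypothetical backend: one JSON file per task in a directory."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, task_id):
        return self.directory / f"{task_id}.json"

    def save_state(self, task_id, state):
        # Write to a temp file, then rename, so a crash mid-write
        # cannot leave a half-written state file behind.
        tmp = self._path(task_id).with_suffix(".tmp")
        tmp.write_text(json.dumps(state, default=str))
        tmp.replace(self._path(task_id))

    def get_state(self, task_id):
        return json.loads(self._path(task_id).read_text())


workdir = tempfile.mkdtemp()
storage = JsonFileStorage(workdir)
storage.save_state("task-1", {"progress": 0.4, "checkpoint": {"data": {"step": 3}}})

# After an interruption, a new process builds a fresh backend over the same directory.
recovered = JsonFileStorage(workdir).get_state("task-1")
```

Because the state lives on disk rather than in the process, recovery after a crash reduces to reading the last file that was fully written.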

Concurrency Control Strategy:

python
import asyncio


class ConcurrencyError(Exception):
    """Raised when the concurrent-task limit is reached."""


class ConcurrencyManager:
    def __init__(self, max_concurrent_tasks=5):
        self.max_concurrent = max_concurrent_tasks
        self.active_tasks = set()
        self.task_locks = {}

    async def execute_task(self, task):
        """Execute task with concurrency handling"""
        # 1. Check concurrency limits
        if len(self.active_tasks) >= self.max_concurrent:
            raise ConcurrencyError(f"Maximum {self.max_concurrent} tasks allowed")

        # 2. Register the task before it runs, so the limit check above
        #    counts it, and create its lock
        self.active_tasks.add(task.id)
        task_lock = asyncio.Lock()
        self.task_locks[task.id] = task_lock

        try:
            # 3. Execute task
            async with task_lock:
                return await task.execute()
        finally:
            # 4. Cleanup resources
            self.task_locks.pop(task.id, None)
            self.active_tasks.discard(task.id)
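
The manager above rejects work once the limit is reached. A common alternative is to let extra tasks wait their turn instead, which `asyncio.Semaphore` supports directly. The sketch below shows that variation, not the approach in the class above. `run_with_limit` and `sample_job` are hypothetical names, and the `peak` counter exists only to make the limit observable.

```python
import asyncio


async def run_with_limit(jobs, max_concurrent=2):
    """Run coroutine functions with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:  # waits here while the limit is reached
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def sample_job():
    await asyncio.sleep(0.01)
    return "ok"


results, peak = asyncio.run(run_with_limit([sample_job] * 5, max_concurrent=2))
```

Queueing suits batch workloads. Rejecting, as the class above does, is the better fit when the caller should be told immediately that the system is saturated.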

3. Verification Loop System

Verifies the execution results of AI Agents, ensuring the quality and safety of outputs.

Core Components:

  • Output Quality Checking: Verifies the quality of generated content
  • Error Detection: Identifies anomalies during execution
  • Auto-correction: Automatically corrects errors
  • Human Review Trigger: Human intervention for complex situations

Technical Implementation:

python
class VerificationLoop:
    def __init__(self):
        self.checkers = [
            QualityChecker(),
            SafetyChecker(),
            FormatChecker(),
            CompletenessChecker()
        ]
        self.error_handlers = [
            AutoFixHandler(),
            HumanReviewHandler(),
            LogHandler()
        ]
    
    async def verify_execution(self, task, result):
        """Verify execution results"""
        verification_results = {}
        
        # 1. Execute all checks
        for checker in self.checkers:
            try:
                check_result = await checker.check(result)
                verification_results[checker.name] = check_result
            except Exception as e:
                verification_results[checker.name] = {
                    'passed': False,
                    'error': str(e)
                }
        
        # 2. Process check results
        overall_passed = all(r['passed'] for r in verification_results.values())
        
        if not overall_passed:
            # 3. Error handling
            error_summary = self.summarize_errors(verification_results)
            corrected_result = await self.handle_errors(task, result, error_summary)
            return corrected_result
        
        return result
    
    def summarize_errors(self, verification_results):
        """Summarize error information"""
        errors = []
        for checker_name, result in verification_results.items():
            if not result['passed']:
                errors.append({
                    'checker': checker_name,
                    'issue': result.get('issue', 'Unknown issue'),
                    'severity': result.get('severity', 'medium')
                })
        return errors
    
    async def handle_errors(self, task, result, errors):
        """Handle errors"""
        for error in errors:
            # Handle by severity
            if error['severity'] == 'critical':
                # Trigger human review
                await self.trigger_human_review(task, result, error)
            elif error['severity'] == 'high':
                # Auto-fix: keep the corrected result (assumes auto_fix returns it)
                result = await self.auto_fix(task, result, error)
            else:
                # Log error
                await self.log_error(task, result, error)
        
        return result
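
`VerificationLoop` expects each checker to expose a `name` and an async `check` method that returns a dict with a `passed` key. None of the checker classes are defined above, so the sketch below shows what one might look like. This `FormatChecker` and its JSON-object rule are assumptions made for illustration only.

```python
import asyncio
import json


class FormatChecker:
    """Hypothetical checker: passes only if the result is a JSON object."""
    name = "format"

    async def check(self, result):
        try:
            parsed = json.loads(result)
        except (TypeError, ValueError) as exc:
            return {"passed": False, "issue": f"invalid JSON: {exc}", "severity": "high"}
        if not isinstance(parsed, dict):
            return {"passed": False, "issue": "expected a JSON object", "severity": "medium"}
        return {"passed": True}


async def run_checks(checkers, result):
    """Run every checker and collect its report under the checker's name."""
    return {checker.name: await checker.check(result) for checker in checkers}


good = asyncio.run(run_checks([FormatChecker()], '{"answer": 42}'))
bad = asyncio.run(run_checks([FormatChecker()], "not json"))
```

Returning `issue` and `severity` alongside `passed` gives the error-handling step enough information to route each failure.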

LangGraph Workflow Orchestration: LangGraph (released January 2024) provides workflow orchestration that suits verification loops. The dictionary below is a schematic of such a graph, not LangGraph's actual API:

python
# Verification loop expressed as a graph of nodes and edges (schematic)
verification_workflow = {
    'nodes': [
        {'id': 'input', 'type': 'input'},
        {'id': 'quality_check', 'type': 'quality_check'},
        {'id': 'safety_check', 'type': 'safety_check'},
        {'id': 'format_check', 'type': 'format_check'},
        {'id': 'auto_fix', 'type': 'auto_fix'},
        {'id': 'human_review', 'type': 'human_review'},
        {'id': 'output', 'type': 'output'}
    ],
    'edges': [
        {'from': 'input', 'to': 'quality_check'},
        {'from': 'quality_check', 'to': 'safety_check'},
        {'from': 'safety_check', 'to': 'format_check'},
        {'from': 'format_check', 'to': 'auto_fix', 'condition': 'needs_fix'},
        {'from': 'format_check', 'to': 'human_review', 'condition': 'needs_review'},
        {'from': 'format_check', 'to': 'output', 'condition': 'passed'},
        {'from': 'auto_fix', 'to': 'output'},
        {'from': 'human_review', 'to': 'output'}
    ]
}
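
To make the routing concrete, a node-and-edge description like this can be walked by a few lines of plain Python. The sketch below is not LangGraph code. `run_workflow` is a hypothetical helper, the graph is a shortened version with an explicit `passed` edge added as an assumption, and it assumes the graph has no cycles.

```python
def run_workflow(edges, start, conditions):
    """Walk the graph from `start`, taking the first edge whose condition
    is unset or currently true. Returns the list of visited node ids."""
    path = [start]
    current = start
    while True:
        outgoing = [e for e in edges if e["from"] == current]
        if not outgoing:
            return path  # terminal node reached
        for edge in outgoing:
            cond = edge.get("condition")
            if cond is None or conditions.get(cond, False):
                current = edge["to"]
                path.append(current)
                break
        else:
            raise RuntimeError(f"no edge can be taken from {current!r}")


edges = [
    {"from": "input", "to": "format_check"},
    {"from": "format_check", "to": "auto_fix", "condition": "needs_fix"},
    {"from": "format_check", "to": "human_review", "condition": "needs_review"},
    {"from": "format_check", "to": "output", "condition": "passed"},
    {"from": "auto_fix", "to": "output"},
    {"from": "human_review", "to": "output"},
]

clean_path = run_workflow(edges, "input", {"passed": True})
fix_path = run_workflow(edges, "input", {"needs_fix": True})
```

The `RuntimeError` branch is deliberate: a check node with no edge that can be taken is a design error in the graph, and failing loudly is safer than stalling.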

4. Constraint Layering System (Guardrails)

Sets multi-level constraints for AI Agents, ensuring behavioral compliance and safety.

Core Components:

  • Rule Engine: Configurable business rules
  • Format Enforcement: Output format requirements
  • Security Policy: Behavioral constraints for safety
  • Compliance Audit: Compliance checking and recording

Technical Implementation:

python
class GuardrailSystem:
    def __init__(self):
        self.rule_engine = RuleEngine()
        self.format_enforcer = FormatEnforcer()
        self.security_policy = SecurityPolicy()
        self.compliance_auditor = ComplianceAuditor()
    
    def apply_constraints(self, task, input_data):
        """Apply constraint system"""
        # 1. Rule checking
        rule_violations = self.rule_engine.check_rules(task, input_data)
        if rule_violations:
            raise RuleViolationError(rule_violations)
        
        # 2. Format enforcement
        formatted_input = self.format_enforcer.enforce_format(input_data)
        
        # 3. Security policy check
        security_check = self.security_policy.check_security(task, formatted_input)
        if not security_check['passed']:
            raise SecurityError(security_check['issues'])
        
        # 4. Compliance audit
        self.compliance_auditor.log_audit_event(task, formatted_input)
        
        return formatted_input
    
    def create_rule_set(self, domain):
        """Create domain-specific rule sets"""
        rule_sets = {
            'finance': [
                {'rule': 'no_financial_data_leakage', 'severity': 'critical'},
                {'rule': 'require_approval_for_large_transactions', 'severity': 'high'}
            ],
            'healthcare': [
                {'rule': 'hipaa_compliance', 'severity': 'critical'},
                {'rule': 'patient_data_anonymization', 'severity': 'high'}
            ],
            'general': [
                {'rule': 'no_harmful_content', 'severity': 'high'},
                {'rule': 'respect_privacy', 'severity': 'medium'}
            ]
        }
        
        return rule_sets.get(domain, rule_sets['general'])
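
The rule sets above name rules but do not say how a rule is evaluated. One simple option is to pair each rule name with a predicate over the input, as in the sketch below. The `Rule` dataclass, `check_rules`, and the 10,000 threshold are illustrative assumptions, not values from the original.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    severity: str
    violated_by: Callable[[dict], bool]  # returns True when the input breaks the rule


# Hypothetical finance rule: transfers above a threshold need an approval flag.
LARGE_TRANSACTION = Rule(
    name="require_approval_for_large_transactions",
    severity="high",
    violated_by=lambda data: data.get("amount", 0) > 10_000
    and not data.get("approved", False),
)


def check_rules(rules, data):
    """Return the violated rules as {'rule', 'severity'} dicts."""
    return [
        {"rule": rule.name, "severity": rule.severity}
        for rule in rules
        if rule.violated_by(data)
    ]
```

An empty list means the input is clean, which matches the `if rule_violations:` test in `apply_constraints`.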

Design Principles

Feedforward Control

Set constraints and expectations before execution to prevent problems:

python
class FeedforwardController:
    def __init__(self):
        self.constraints = []
        self.expectations = []
    
    def setup_constraints(self, task):
        """Setup feedforward constraints"""
        # 1. Permission pre-check
        self.validate_permissions(task)
        
        # 2. Resource pre-allocation
        self.allocate_resources(task)
        
        # 3. Risk assessment
        self.assess_risks(task)
        
        # 4. Expectation setting
        self.set_expectations(task)
    
    def validate_permissions(self, task):
        """Permission validation"""
        required_perms = task.get_required_permissions()
        for perm in required_perms:
            if not self.check_permission(perm):
                raise PermissionError(f"Missing permission: {perm}")
    
    def allocate_resources(self, task):
        """Resource allocation"""
        resources = task.get_required_resources()
        for resource, amount in resources.items():
            if not self.allocate(resource, amount):
                raise ResourceError(f"Cannot allocate {resource}: {amount}")
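
Allocating resources one at a time inside a loop can leave a task holding some of them when a later allocation fails. A feedforward check can avoid that by verifying the whole request before deducting anything. The sketch below illustrates the idea with a hypothetical `ResourceBudget` class; the resource names and amounts are made up for the example.

```python
class ResourceBudget:
    """Hypothetical budget: reserve everything a task needs, or nothing."""

    def __init__(self, available):
        self.available = dict(available)

    def reserve(self, required):
        short = {
            name: amount
            for name, amount in required.items()
            if self.available.get(name, 0) < amount
        }
        if short:
            # Fail before execution starts; nothing has been deducted yet.
            raise RuntimeError(f"insufficient resources: {short}")
        for name, amount in required.items():
            self.available[name] -= amount


budget = ResourceBudget({"cpu": 4, "tokens": 1000})
budget.reserve({"cpu": 2, "tokens": 600})
```

A second request for 500 tokens would now raise and leave the budget unchanged, so a rejected task never needs cleanup.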

Feedback Correction

Verify and correct after execution for continuous improvement:

python
class FeedbackController:
    def __init__(self):
        self.validators = []
        self.correctors = []
    
    def correct_execution(self, task, result):
        """Execute correction"""
        # 1. Result verification
        validation_result = self.validate_result(result)
        
        if not validation_result['passed']:
            # 2. Auto-correction
            corrected_result = self.auto_correct(result, validation_result['issues'])
            
            # 3. Re-verification
            if not self.validate_result(corrected_result)['passed']:
                # 4. Human intervention
                return self.human_intervention(task, corrected_result)
            
            return corrected_result
        
        return result
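
The controller above tries one correction before escalating. A small generalization is to allow a bounded number of fix attempts and report whether a human is still needed. The sketch below assumes a validator that returns a list of issues (empty when clean) and a fixer that returns a new result. All names are hypothetical.

```python
def correct_with_retries(result, validate, fix, max_attempts=3):
    """Validate, fix, and re-validate up to `max_attempts` times.

    Returns (result, needs_human): needs_human is True when the result
    still fails validation after the last attempt.
    """
    for _ in range(max_attempts):
        issues = validate(result)
        if not issues:
            return result, False
        result = fix(result, issues)
    return result, bool(validate(result))


# Hypothetical validator and fixer: the output must end with a period.
def validate(text):
    return [] if text.endswith(".") else ["missing final period"]


def fix(text, issues):
    return text.rstrip() + "."


fixed, needs_human = correct_with_retries("Deployment finished ", validate, fix)
```

The bound matters: an automatic fixer that cannot make progress should hand the task over rather than loop indefinitely.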

Progressive Automation

A gradual process from human supervision to full automation:

python
class ProgressiveAutomation:
    def __init__(self):
        self.automation_levels = {
            'level_1': 'human_supervised',
            'level_2': 'human_in_the_loop',
            'level_3': 'autonomous_with_monitoring',
            'level_4': 'fully_autonomous'
        }
    
    def get_automation_level(self, task_complexity, risk_level):
        """Determine automation level based on task complexity and risk"""
        if risk_level == 'critical':
            return 'level_1'
        elif task_complexity == 'simple' and risk_level == 'low':
            return 'level_4'
        else:
            return 'level_2'  # Default to human-in-the-loop
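
As written, the method above can never return `level_3`. A lookup table makes every level reachable and keeps the policy easy to audit. The sketch below is one possible mapping. The specific assignments in `POLICY` are illustrative assumptions, and unknown combinations fall back to the more supervised option.

```python
# Hypothetical policy table: (task_complexity, risk_level) -> automation level.
POLICY = {
    ("simple", "low"): "fully_autonomous",
    ("simple", "medium"): "autonomous_with_monitoring",
    ("complex", "low"): "autonomous_with_monitoring",
    ("complex", "medium"): "human_in_the_loop",
}


def automation_level(task_complexity, risk_level):
    """Pick an automation level; unlisted cases get more oversight, not less."""
    if risk_level == "critical":
        return "human_supervised"
    return POLICY.get((task_complexity, risk_level), "human_in_the_loop")
```

Promoting a task type to a higher level then becomes a one-line change to the table, which can be reviewed like any other configuration change.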

Observability

Ensure system transparency and monitorability:

python
from datetime import datetime


class ObservabilitySystem:
    def __init__(self):
        self.logger = Logger()
        self.metrics = Metrics()
        self.tracer = Tracer()
    
    def log_execution(self, task, agent, result):
        """Record execution logs"""
        log_entry = {
            'timestamp': datetime.now(),
            'task': task,
            'agent': agent,
            'result': result,
            'duration': result.get('duration', 0),
            'status': result.get('status', 'unknown')
        }
        
        self.logger.info(log_entry)
        self.metrics.record_task_completion(task, log_entry['status'])
    
    def trace_execution(self, task_id, steps):
        """Trace execution steps"""
        trace = {
            'task_id': task_id,
            'steps': steps,
            'start_time': datetime.now(),
            'end_time': None,
            'errors': []
        }
        
        return trace
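
`trace_execution` builds a trace record but nothing fills in timings or errors. One lightweight way to do that is a context manager wrapped around each step, sketched below with the standard `logging` and `time` modules. The `traced_step` helper and the `duration_s` field are assumptions introduced for this example.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("harness")


@contextmanager
def traced_step(trace, name):
    """Record a step's duration and any error into `trace`, then re-raise."""
    start = time.perf_counter()
    step = {"name": name, "status": "ok"}
    try:
        yield step
    except Exception as exc:
        step["status"] = "error"
        trace["errors"].append({"step": name, "error": repr(exc)})
        raise
    finally:
        step["duration_s"] = time.perf_counter() - start
        trace["steps"].append(step)
        logger.info("step=%s status=%s", name, step["status"])


trace = {"task_id": "task-1", "steps": [], "errors": []}
with traced_step(trace, "plan"):
    pass
try:
    with traced_step(trace, "execute"):
        raise ValueError("tool failed")
except ValueError:
    pass
```

Re-raising after recording keeps the trace complete without hiding the failure from whatever error handling sits above.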

Limitations and Challenges

Current Limitations

Despite providing powerful safety controls, Harness Engineering still has some limitations:

  1. Human dependency for task initiation: Still requires humans to start tasks and set objectives
  2. Lack of autonomy between tasks: Cannot autonomously switch and coordinate between multiple tasks
  3. Fixed loop structures: Predefined loop structures cannot dynamically adapt to new situations
  4. Insufficient adaptability: Limited ability to handle unexpected situations

Motivation and Future

These limitations are precisely what drive the evolution to the next engineering phase:

  1. Autonomous coordination between tasks: AI can autonomously switch and coordinate between multiple tasks
  2. Dynamic loop structures: Loop structures can dynamically adjust as needed
  3. Adaptive decision-making: Dynamically adjust strategies based on execution results
  4. Full autonomy: From “requiring human initiation” to “autonomous task management”

These developments will push AI engineering into the next phase: Loop Engineering.

Summary: The Value of Harness

The value of Harness Engineering lies in transforming AI from a demo into a deployable production system.

Feature          | Simple Demo            | Harness Engineering System
Safety           | No safety guarantees   | Multi-layer constraint protection
Reliability      | Uncontrollable results | Executable and verifiable
Maintainability  | Difficult to maintain  | State traceable
Scalability      | Hard to scale          | Modular design

Harness Engineering is a crucial step in AI engineering. It gives us not just AI that can answer questions, but AI that can execute tasks safely and reliably. This marks an important milestone for AI technology moving from the laboratory to practical applications.