Ouro Loop
Bounded-Autonomy Framework for AI Coding Agents with Runtime-Enforced Guardrails, Five Verification Gates, Three-Layer Self-Reflection, and Autonomous Remediation
Organization: VictorVVedtion (independent researcher)
Published: March 2026
Type: repo
Report Type: PhD-Level Technical Analysis
Report Date: April 2026
Table of Contents
- Full Title and Attribution
- Authors and Team
- Core Contribution
- Supported Solutions
- LLM Integration
- Key Results
- Reproducibility
- Compute and API Costs
- Architecture Solution
- Component Breakdown
- Core Mechanisms (Detailed)
- Programming Language
- Memory Management
- Continued Learning
- Applications
1 Full Title and Attribution
Full Title: Ouro Loop — Autonomous AI Agent Development Framework with Bounded Autonomy
Repository: github.com/VictorVVedtion/ouro-loop
License: MIT
Status: Experimental, actively developed (March 2026)
Stars: ~10 (early stage, March 2026)
Package: pip install ouro-loop (PyPI)
Inspiration: Directly extends Andrej Karpathy's autoresearch paradigm from ML experiment loops to general software engineering. The name "Ouro" references the Ouroboros — the serpent that eats its own tail — symbolizing the agent's ability to consume its own errors and iterate autonomously.
Tagline:
"To grant an entity absolute autonomy, you must first bind it with absolute constraints."
Test Suite: 507 tests passing (CI via GitHub Actions)
2 Authors and Team
Ouro Loop is developed by VictorVVedtion, an independent researcher/developer. The project is solo-authored and was built primarily with Claude Code as the AI co-development partner. No academic affiliation is listed.
The author positions Ouro Loop as a philosophical framework first and a software tool second. The accompanying MANIFESTO.md — titled "The Ouroboros Contract" — articulates a theory of Precision Autonomy wherein defining what an agent cannot do is the precondition for granting it full creative freedom within the remaining space. This is a notable departure from the instruction-based guardrail patterns common in prompt engineering.
The author's background appears to span blockchain infrastructure, consumer products, and financial systems — all domains from which Ouro Loop draws real-world validation examples.
3 Core Contribution
Key Contribution: Ouro Loop formalizes the concept of Bounded Autonomy for AI coding agents, implementing a six-stage autonomous development loop (BOUND → MAP → PLAN → BUILD → VERIFY → LOOP) with runtime-enforced constraints, multi-layer verification gates, three-layer self-reflective logging, and autonomous remediation — enabling AI agents to work unsupervised for extended periods without human babysitting while maintaining safety guarantees.
The Problem Statement
In the era of "vibe coding," unbound AI agents exhibit four pathological behaviors:
- Hallucination — referencing files, APIs, and modules that do not exist
- Regression — breaking established architectural patterns and constraints
- Symptom-chasing — repeatedly editing the same file without addressing root causes
- Context decay — forgetting critical constraints during long sessions
Existing approaches fall into two extremes:
| Approach | Problem |
|---|---|
| Human-in-the-loop | Constant interruptions negate autonomous value; developer becomes a babysitter |
| Instruction-only guardrails | .cursorrules and CLAUDE.md define static instructions the agent can ignore |
The Ouro Loop Solution
Ouro Loop introduces three key innovations:
1. The Event Horizon (BOUND): Before any code is written, the developer defines absolute constraints — DANGER ZONES (protected files), NEVER DO rules (absolute prohibitions), and IRON LAWS (invariants that must always hold). These constraints define a boundary the agent physically cannot cross.
2. Runtime Enforcement via Hooks: Unlike instruction-based approaches that rely on agent compliance, Ouro Loop enforces BOUND constraints through Claude Code Hooks that operate at the tool level. A `bound-guard.sh` hook intercepts `Edit` and `Write` operations and returns `exit 2` (hard block) for DANGER ZONE files. The agent cannot bypass this — it is a runtime constraint, not a behavioral suggestion.
3. Autonomous Remediation: When verification fails, the agent does not alert the human. It consults its remediation playbook (`modules/remediation.md`), decides on a fix strategy (revert, retry alternative, escalate), executes it, and reports what it did. The agent only escalates to human review when changes touch a DANGER ZONE or when 3+ consecutive retries fail.
Theoretical Framework
The author articulates this as the Ouroboros Contract:
By explicitly defining the 20 things an agent can never do (the BOUND), you implicitly authorize it to autonomously do the 10,000 things required to solve the problem. The constraint space defines the creative space.
This is a meaningful conceptual advance over both:
- Allowlisting (specify everything the agent can do — doesn't scale)
- Instruction-following (hope the agent obeys — brittle)
Instead, it proposes constraint-based creative freedom — a small set of hard constraints enables a large space of autonomous action.
4 Supported Solutions
| Solution Type | Support Level | Description |
|---|---|---|
| Overnight autonomous development | Primary use case | Define BOUND, start agent, sleep. Agent iterates through Build→Verify→Self-Fix cycles. |
| Long-running refactoring | Supported | Phase-based refactoring with verification gates ensuring nothing breaks between phases. |
| Continuous code review | Built-in (Sentinel) | 24/7 unattended code review with partition-based scanning and risk scoring. |
| CI/CD agent integration | Supported | Agents handle build failures, test regressions, and dependency updates autonomously. |
| Production-safe AI coding | Primary design goal | Financial systems, blockchain, medical software — domains where "move fast and break things" is unacceptable. |
| Multi-phase feature development | Supported | Severity-ordered phases: CRITICAL → HIGH → MEDIUM → LOW. Agent processes in order. |
| Root-cause investigation | Demonstrated | Real session logs show agents testing multiple hypotheses and finding architectural root causes. |
What Ouro Loop Is NOT
The project explicitly excludes certain use cases:
- Quick prototypes or hackathon projects (BOUND setup overhead not justified)
- Single-file scripts (methodology overhead exceeds benefit)
- Real-time interactive coding (designed for "set it and let it run")
Agent Compatibility
Ouro Loop is agent-agnostic — it works with any AI coding assistant that can read files and execute terminal commands:
| Agent | Integration Level |
|---|---|
| Claude Code (Anthropic) | Primary target; native program.md skill support, hook enforcement |
| Cursor | Via .cursorrules referencing Ouro Loop modules |
| Aider | Terminal-based; reads markdown instructions and executes Python |
| Codex CLI (OpenAI) | Shell command execution and file reading |
| Windsurf (Codeium) | Standard file/command integration |
| Any agent with file read + shell exec | General compatibility via program.md |
5 LLM Integration
LLM-Agnostic by Design
Ouro Loop does not call LLMs directly. It is a methodology framework and runtime state machine that sits between the developer and the AI agent. The LLM integration happens at the agent level (Claude Code, Cursor, etc.), not at the Ouro Loop level.
This is a critical architectural distinction from systems like autoresearch:
┌─────────────────────────────────────────────────┐
│ Developer │
│ 1. Define BOUND in CLAUDE.md │
│ 2. Point agent at program.md │
│ 3. Sleep │
└──────────────────────┬──────────────────────────┘
│
┌──────────────────────▼──────────────────────────┐
│ AI Agent (any) │
│ ┌──────────────────────────────────────────┐ │
│ │ Reads program.md → follows methodology │ │
│ │ Uses framework.py CLI → state + verify │ │
│ │ Hooks enforce BOUND at tool call level │ │
│ └──────────────────────────────────────────┘ │
│ │
│ LLM calls happen HERE, inside the agent │
└──────────────────────┬──────────────────────────┘
│
┌──────────────────────▼──────────────────────────┐
│ Target Project Codebase │
│ CLAUDE.md (BOUND) + .ouro/ (state) + code │
└─────────────────────────────────────────────────┘
How the Agent Uses Ouro Loop
The agent is instructed (via program.md) to:
- Read CLAUDE.md — extract DANGER ZONES, NEVER DO, IRON LAWS
- Use framework.py CLI — manage state, run verification, log results
- Follow the six-stage loop — BOUND → MAP → PLAN → BUILD → VERIFY → LOOP
- Self-remediate on failure — consult remediation playbook, decide action, execute, report
The LLM's reasoning capabilities are used for:
- MAP stage: understanding the problem space (6 questions)
- PLAN stage: complexity estimation and phase decomposition
- BUILD stage: code generation (RED → GREEN → REFACTOR → COMMIT)
- VERIFY stage: self-assessment against BOUND constraints
- REMEDIATE: analyzing failures and choosing alternative approaches
Sentinel: Claude Code Integration
The Sentinel module explicitly targets Claude Code as the AI agent, using:
- --permission-mode bypassPermissions for unattended operation
- Claude Code's native session management and tool calling
- claude-opus-4-6 as the default model for review sessions
6 Key Results
Blockchain L1 — Consensus Performance Under Load
An AI agent used Ouro Loop to investigate why precommit latency spiked from 4ms to 200ms under transaction load on a 4-validator PBFT blockchain. Full session log available in examples/blockchain-l1/session-log.md.
Key findings:
- 5 hypotheses tested, 4 autonomous remediations — the ROOT_CAUSE gate fired 4 times, correctly identifying symptom-fixing
- After 3 consecutive failed hypotheses, the 3-failure step-back rule activated: "stop fixing symptoms, re-examine the architecture"
- Root cause was architectural, not code-level — a single-node HTTP bottleneck causing consensus-wide delays; fix was a Caddy reverse proxy
- Agent caught its own flawed experiment — identified running 4x full stress instead of 1x distributed before drawing wrong conclusions
| Metric | Before | After | Delta |
|---|---|---|---|
| Precommit (under load) | 100-200ms | 4ms | -98% |
| Block time (under load) | 111-200ms | 52-57ms | -53% |
| TPS Variance | 40.6% | 1.6% | -96% |
| SysErr rate | 0.00% | 0.00% | = (IRON LAW maintained) |
| Blocks/sec (soak load) | ~8.0 | ~18.5 | +131% |
Consumer Product — Lint Remediation in React/Next.js
A simpler session where the ROOT_CAUSE gate demonstrated its value:
- Agent attempted to suppress ESLint errors (symptom-fixing)
- ROOT_CAUSE gate identified the pattern and flagged it
- Agent was pushed toward genuinely better solutions:
  - Replacing `<img>` with Next.js `<Image>`
  - Properly handling `useEffect` state patterns
  - Using framework-appropriate patterns instead of suppression
Quantitative Framework Metrics
| Metric | Value |
|---|---|
| Test suite | 507 tests passing |
| Codebase size | ~47KB framework.py, ~30KB sentinel.py, ~14KB prepare.py, ~9KB program.md |
| Dependencies | Zero (pure Python 3.10+ stdlib) |
| Hook enforcement | Verified exit code 2 hard-block on DANGER ZONE files |
| Reflective log | JSONL format, 30-entry rolling window |
| Results audit trail | TSV format, full phase/verdict/violation logging |
7 Reproducibility
Installation
pip install ouro-loop
Or clone:
git clone https://github.com/VictorVVedtion/ouro-loop.git ~/.ouro-loop
Minimal Reproduction Steps
cd /path/to/your/project
# 1. Scan project structure
python -m prepare scan .
# 2. Initialize state directory (.ouro/)
python -m prepare init .
# 3. Generate CLAUDE.md template
python -m prepare template claude .
# 4. Edit CLAUDE.md with real BOUND constraints
# Define DANGER ZONES, NEVER DO, IRON LAWS
# 5. Point AI agent at program.md
# "Read program.md and CLAUDE.md. Start the Ouro Loop for: [task]"
Reproducibility Assessment
| Factor | Assessment |
|---|---|
| Code availability | Fully open-source, MIT license, pip installable |
| Zero dependencies | Pure Python 3.10+ stdlib — no dependency conflicts possible |
| Agent requirement | Requires a capable AI coding agent (Claude Code recommended) |
| Determinism | Non-deterministic — results depend on LLM reasoning and stochastic code generation |
| Session logs | Two real session logs provided in examples/ with detailed methodology observations |
| Test suite | 507 tests covering runtime logic, verification gates, and state management |
| CI/CD | GitHub Actions CI with test badge |
Limitations on Reproducibility
- LLM dependency: Results depend on the specific LLM used by the underlying agent. Different models may produce different remediation strategies.
- Project-specific BOUND: The methodology's effectiveness depends on the quality of BOUND definition. Poorly specified constraints lead to poor outcomes.
- Qualitative measurement: The core value proposition (overnight autonomous development without breakage) is measured qualitatively through session logs, not through standardized benchmarks.
8 Compute and API Costs
Framework Overhead: Near-Zero
Ouro Loop itself adds negligible compute overhead:
| Component | Resource Impact |
|---|---|
| `framework.py` CLI | ~50ms per command (state read/write, git calls) |
| Verification gates | ~200ms total (5 gates, each running git subprocess with 10s timeout) |
| Reflective log write | ~10ms (JSONL append + trim to 30 entries) |
| Hook evaluation | ~100ms per hook (shell script, CLAUDE.md parsing) |
| State management | ~5ms (JSON read/write with atomic replace) |
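The "atomic replace" pattern behind the ~5ms state writes can be sketched with the stdlib alone. This is an illustrative sketch, not Ouro Loop's actual implementation; the function names are hypothetical:

```python
import json
import os
import tempfile

def save_state(path: str, state: dict) -> None:
    """Write JSON state atomically: write a temp file, then rename.

    os.replace() is atomic on both POSIX and Windows, so a crash
    mid-write can never leave a half-written state.json behind.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f, indent=2)
        os.replace(tmp_path, path)  # atomic swap into place
    except BaseException:
        os.unlink(tmp_path)
        raise

def load_state(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```

The temp file is created in the same directory as the target so the rename never crosses a filesystem boundary, which would break atomicity.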
LLM Costs: Agent-Dependent
Since Ouro Loop doesn't call LLMs directly, API costs come from the underlying agent:
| Scenario | Estimated Tokens | Estimated Cost (Claude Sonnet) |
|---|---|---|
| `program.md` initial read | ~4,000 tokens | ~$0.01 |
| CLAUDE.md parsing | ~500-2,000 tokens | ~$0.005 |
| Per-phase MAP+PLAN+BUILD cycle | ~10,000-50,000 tokens | ~$0.10-$0.50 |
| Remediation cycle (on failure) | ~5,000-15,000 tokens | ~$0.05-$0.15 |
| Full overnight session (10 phases) | ~200,000-500,000 tokens | ~$2-$5 |
Sentinel 24/7 Review Costs
The Sentinel module runs continuous Claude Code sessions:
| Parameter | Default | Impact |
|---|---|---|
| Model | `claude-opus-4-6` | Premium pricing |
| Max turns per session | 200 | Upper bound on single-session cost |
| Session timeout | 120 minutes | Prevents runaway sessions |
| Cooldown between sessions | 30 seconds | Rate limiting |
| Permission mode | `bypassPermissions` | No human approval delays |
Estimated 24/7 Sentinel cost: $50-$200/day depending on project size, partition count, and finding density. This is significant and should be budgeted for production use.
9 Architecture Solution
System Design: Three Files That Matter
Ouro Loop is architecturally minimalist by design — the entire system consists of three files:
┌────────────────────────────────────────────────────────┐
│ program.md │
│ "The Methodology" │
│ │
│ Human-authored instructions the AI agent follows. │
│ Defines the six-stage loop: BOUND → MAP → PLAN → │
│ BUILD → VERIFY → LOOP. │
│ Iterated by the HUMAN. │
│ │
│ Analogous to autoresearch's program.md, but for │
│ general software engineering instead of ML. │
└────────────────────────┬───────────────────────────────┘
│ Agent reads
┌────────────────────────▼───────────────────────────────┐
│ framework.py │
│ "The Runtime" │
│ │
│ Lightweight state machine + CLI for the agent. │
│ State tracking, verification gates, reflective log. │
│ Can be EXTENDED by the agent. │
│ │
│ Analogous to autoresearch's train.py — the file │
│ the agent iterates on. │
└────────────────────────┬───────────────────────────────┘
│ Agent uses
┌────────────────────────▼───────────────────────────────┐
│ prepare.py │
│ "The Initializer" │
│ │
│ Project scanning and .ouro/ directory creation. │
│ NOT MODIFIED by agent or human after init. │
│ │
│ Read-only reference, like autoresearch's prepare.py. │
└────────────────────────────────────────────────────────┘
The Six-Stage Loop
┌──────────────────────────────────────────────────┐
│ │
│ Stage 0: BOUND ◄── Human defines constraints │
│ │ │
│ ▼ │
│ Stage 1: MAP ──── Understand problem space │
│ │ (6 mandatory questions) │
│ ▼ │
│ Stage 2: PLAN ─── Decompose by severity │
│ │ (CRITICAL→HIGH→MEDIUM→LOW) │
│ ▼ │
│ Stage 3: BUILD ── RED→GREEN→REFACTOR→COMMIT │
│ │ │
│ ▼ │
│ Stage 4: VERIFY ─ Three-layer verification │
│ │ │ Layer 1: 5 automated gates │
│ │ │ Layer 2: Self-assessment │
│ │ │ Layer 3: Human review trigger │
│ │ │ │
│ │ ├── FAIL ──► REMEDIATE ──► back to BUILD │
│ │ │ (autonomous, inside BOUND) │
│ │ │ │
│ │ └── PASS ──► Stage 5: LOOP │
│ │ │ │
│ │ ├── Feed discoveries back │
│ │ ├── Update BOUND if needed │
│ │ └── Advance to next phase │
│ │ │
│ └────────────────── Repeat until all phases │
│ complete or DANGER ZONE hit │
└──────────────────────────────────────────────────┘
Enforcement Architecture: The Hook System
The hook system creates a runtime enforcement layer that operates below the agent's reasoning:
Agent decides to edit a file
│
┌─────────▼──────────┐
│ Claude Code Tool │
│ call: Edit/Write │
│ target: file.py │
└─────────┬──────────┘
│ PreToolUse event
┌─────────▼──────────┐
│ bound-guard.sh │
│ │
│ 1. Parse CLAUDE.md │
│ 2. Extract DANGER │
│ ZONES │
│ 3. Match file path │
│ against zones │
│ 4. Path-segment │
│ aware matching │
└─────────┬──────────┘
│
┌─────┴─────┐
│ │
IN ZONE NOT IN ZONE
│ │
exit 2 exit 0
(BLOCKED) (allowed)
│ │
Agent sees Tool executes
denial msg normally
This is architecturally significant because it makes BOUND enforcement non-bypassable from the agent's perspective. No amount of prompt injection, context confusion, or reasoning error can override an exit code 2 from a PreToolUse hook.
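The real hook is a shell script, but its exit-code contract can be sketched in Python (illustrative only; function names and the simplified prefix matching are assumptions, and the repository's matcher is segment-aware rather than prefix-based):

```python
import sys

def is_blocked(target: str, danger_zones: list[str]) -> bool:
    """Return True if target falls inside any DANGER ZONE.

    Simplified: zones ending in "/" are treated as directory
    prefixes; other zones must match the file path exactly.
    """
    for zone in danger_zones:
        if zone.endswith("/"):
            if target.startswith(zone):
                return True
        elif target == zone:
            return True
    return False

def guard(target: str, danger_zones: list[str]) -> int:
    """Mimic the hook's contract: exit 2 = hard block, 0 = allow."""
    if is_blocked(target, danger_zones):
        print(f"BLOCKED: {target} is inside a DANGER ZONE", file=sys.stderr)
        return 2
    return 0
```

The key property is that the decision runs outside the model: the exit code is evaluated by the agent harness before the tool executes, so no reasoning state inside the LLM can change the outcome.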
Data Flow: State and Logging
.ouro/
├── state.json ← Current loop state (stage, phase, history)
├── reflective-log.jsonl ← Three-layer self-awareness log (30 entries max)
└── sentinel/ ← Sentinel-specific state (if initialized)
├── sentinel-config.json
├── partitions.json
├── state.json
├── findings.jsonl
├── iteration-log.jsonl
├── suppressed.json
└── learnings.md
ouro-results.tsv ← Audit trail (phase/verdict/violations)
CLAUDE.md ← BOUND definition (DANGER ZONES, NEVER DO, IRON LAWS)
10 Component Breakdown
Component 1: program.md — The Methodology (9KB)
The central instruction document that turns any AI agent into an Ouro Loop agent. Key structural elements:
| Section | Purpose | Lines |
|---|---|---|
| Setup | Bootstrap sequence (read CLAUDE.md, check .ouro/) | ~15 |
| BOUND | Constraint definition requirements | ~25 |
| The Loop (MAP) | 6 mandatory questions before coding | ~15 |
| The Loop (PLAN) | Complexity routing table + phase decomposition | ~30 |
| The Loop (BUILD) | RED-GREEN-REFACTOR-COMMIT + 3 self-questions | ~25 |
| The Loop (VERIFY) | Three-layer verification specification | ~40 |
| The Loop (REMEDIATE) | Decision tree for autonomous failure handling | ~35 |
| The Loop (LOOP) | Feedback closure and phase advancement | ~20 |
| Context Management | Anti-context-decay strategies | ~20 |
| Rules | CAN DO / CANNOT DO / NEVER STOP | ~20 |
The VERIFY section includes a structured decision tree for remediation that the agent follows on failure:
VERIFY failed
│
Is the failure inside a DANGER ZONE?
│
YES → STOP. Report to human.
│
NO → What type of failure?
│
EXIST (hallucination) → Remove bad reference, find correct one
RELEVANCE (drift) → Stash out-of-scope changes, return to plan
ROOT_CAUSE (stuck) → Revert to last good state, different approach
RECALL (context decay) → Re-read CLAUDE.md BOUND section
MOMENTUM (stuck) → Stop reading, write something, iterate
TEST FAILURE → In scope? Fix. Not in scope? Revert.
Component 2: framework.py — The Runtime (47KB)
The state machine and verification engine. This is the file the agent interacts with via CLI.
| Subcomponent | LOC (est.) | Purpose |
|---|---|---|
| State management | ~80 | Load/save .ouro/state.json with atomic writes |
| CLAUDE.md parser | ~120 | Structured extraction of DANGER ZONES, NEVER DO, IRON LAWS |
| Complexity detection | ~50 | Trivial/simple/complex/architectural routing |
| Verification engine | ~200 | Layer 1 (5 gates) + Layer 2 (self-assessment) + Layer 3 (review triggers) |
| Pattern detection | ~80 | Consecutive failures, stuck loops, velocity trends, hot files, drift |
| Reflective logging | ~150 | Three-layer JSONL log (WHAT/WHY/PATTERN) with 30-entry rolling window |
| Result logging | ~50 | TSV audit trail (phase/verdict/violations) |
| CLI interface | ~100 | argparse-based: status, verify, log, advance, bound-check, reflect |
CLAUDE.md Parser — Dual-Strategy Extraction:
The parser uses a two-phase extraction strategy:
1. Primary extraction (structured): Regex-based parsing of standard section headers (`### DANGER ZONES`, `### NEVER DO`, `### IRON LAWS`). Extracts backtick-wrapped paths from DANGER ZONES, list items from NEVER DO and IRON LAWS.
2. Fallback extraction (prose): If primary extraction finds nothing but BOUND markers are present, the parser switches to heuristic mode — scanning for path-like strings near "DANGER" keywords, lines starting with "Never"/"Do not", and lines containing "must"/"always" near backtick-wrapped code.
This dual-strategy design accommodates both well-structured and free-form CLAUDE.md files.
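The primary (structured) strategy can be approximated in a few lines of `re`. A minimal sketch under the assumption that sections end at the next markdown header; the function name is hypothetical:

```python
import re

def extract_danger_zones(claude_md: str) -> list[str]:
    """Structured extraction: isolate the '### DANGER ZONES' section,
    then pull the backtick-wrapped paths out of its list items.
    """
    # Capture everything between the DANGER ZONES header and the
    # next markdown header (or end of file).
    m = re.search(r"^### DANGER ZONES\s*\n(.*?)(?=^#{1,6} |\Z)",
                  claude_md, re.MULTILINE | re.DOTALL)
    if not m:
        return []
    return re.findall(r"`([^`]+)`", m.group(1))
```

A fallback pass, per the description above, would only run when this returns nothing despite BOUND markers being present.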
Path-Segment-Aware DANGER ZONE Matching:
# Zone "auth/" matches "auth/login.py" but NOT "unauthorized.py"
# Zone "auth/core.py" matches exactly that file
# Zone ending with "/" is treated as a directory prefix
The matcher splits both the file path and zone pattern into segments and checks for contiguous subsequence matching. This prevents false positives from substring matching (e.g., "auth" matching "unauthorized").
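A minimal sketch of that segment-aware check, assuming the semantics described above (contiguous segment runs, trailing "/" as directory marker); this is an illustration, not the repository's exact code:

```python
def path_in_zone(path: str, zone: str) -> bool:
    """Segment-aware DANGER ZONE matching.

    Comparing whole path segments prevents substring false positives:
    zone "auth/" matches "auth/login.py" but not "unauthorized.py".
    """
    path_parts = [p for p in path.split("/") if p]
    zone_parts = [p for p in zone.split("/") if p]
    is_dir_zone = zone.endswith("/")
    n = len(zone_parts)
    # Look for the zone's segments as a contiguous run inside the path.
    for i in range(len(path_parts) - n + 1):
        if path_parts[i:i + n] == zone_parts:
            # A directory zone matches anything beneath it; a file zone
            # must end exactly at the path's final segment.
            if is_dir_zone or i + n == len(path_parts):
                return True
    return False
```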
Component 3: prepare.py — The Initializer (14KB)
Project scanning and .ouro/ directory creation. Key capabilities:
| Feature | Description |
|---|---|
| Language detection | Multi-language scanning via file extension mapping |
| File counting | LOC measurement across project |
| CLAUDE.md detection | Checks for existing BOUND definitions |
| Test detection | Scans for test files/directories |
| CI detection | Checks for CI/CD configuration files |
| Template generation | Creates CLAUDE.md templates with BOUND section scaffolding |
| State initialization | Creates .ouro/state.json with project metadata |
Component 4: sentinel.py — 24/7 Autonomous Code Review (30KB)
A daemon module for continuous, unattended code review:
| Subcomponent | Purpose |
|---|---|
| Command detection | Auto-discovers build/test/lint commands from 10+ project types |
| Partition scanner | Groups project directories into risk-scored partitions |
| Risk scoring | Cross-references partitions against DANGER ZONES for criticality |
| Config management | JSON config with validation for review, runner, and partitioning settings |
| Template rendering | Generates Sentinel-specific CLAUDE.md from templates |
| Runner installation | Generates immortal daemon scripts (nohup + launchd adoption) |
| Dashboard | Live progress monitoring script |
| State tracking | Iteration count, findings, coverage, partition history |
Sentinel Runner Architecture:
make sentinel-start
└→ nohup sentinel-runner.sh & disown
└→ Terminal closes? SIGHUP absorbed by nohup
└→ macOS launchd (PID 1) adopts the orphan
└→ Sleep/wake? launchd children survive
└→ Result: sentinel lives until killed
The runner launches Claude Code sessions in a loop, each reading sentinel state, picking the highest-priority partition, scanning for issues, and recording findings.
Component 5: Hooks (5 shell scripts)
| Hook | Event | Action | Enforcement |
|---|---|---|---|
| `bound-guard.sh` | PreToolUse: Edit/Write | Parse CLAUDE.md DANGER ZONES, match against target file | exit 2 hard-block |
| `root-cause-tracker.sh` | PostToolUse: Edit/Write | Track per-file edit count; warn at 3+, strong warn at 5+ | Warning (no block) |
| `drift-detector.sh` | PreToolUse: Edit/Write | Count distinct directories touched; warn at 5+ | Warning (scope alert) |
| `momentum-gate.sh` | PostToolUse: Edit/Write/Read | Track read/write ratio; warn at 3:1+ (analysis paralysis) | Warning (action prompt) |
| `recall-gate.sh` | PreCompact | Re-inject BOUND section into context before compression | Context preservation |
The recall-gate.sh hook is architecturally notable — it fires on the PreCompact event (before Claude Code compresses its context window) and re-injects the BOUND constraints. This prevents constraint amnesia during long sessions, a failure mode the author specifically targets.
Component 6: Module Documentation (modules/)
Deep-dive reference material for each stage:
| Module | Content |
|---|---|
| `bound.md` | How to identify and define boundaries |
| `map.md` | Problem space mapping techniques |
| `plan.md` | Phase decomposition and complexity routing |
| `build.md` | RED-GREEN-REFACTOR-COMMIT details |
| `verify.md` | Three-layer verification specification |
| `loop.md` | Feedback loop mechanics |
| `remediation.md` | Full remediation playbook with examples |
11 Core Mechanisms (Detailed)
Mechanism 1: Bounded Autonomy (The Event Horizon)
The central theoretical contribution. BOUND is defined in three layers:
Layer A — DANGER ZONES (Spatial Constraints): Files and directories where modifications carry outsized risk. The agent may read these files but cannot edit them without explicit human approval.
### DANGER ZONES
- `src/payments/calculator.py` — financial calculations, penny-level precision
- `migrations/` — database schema, irreversible in production
- `consensus/` — PBFT consensus protocol, correctness-critical
- `auth/middleware.py` — authentication, security boundary
Layer B — NEVER DO (Behavioral Constraints): Absolute prohibitions that the agent must never violate under any circumstances.
### NEVER DO
- Never use float for monetary values — always Decimal
- Never delete or rename migration files
- Never commit without running the test suite
- Never alter the consensus voting logic
Layer C — IRON LAWS (Invariant Constraints): Properties that must always hold, verifiable through automated checks.
### IRON LAWS
- All monetary values use Decimal with 2-digit precision
- All API responses include request_id field
- Test coverage for payment module never drops below 90%
- SysErr rate in consensus is 0.00%
The three layers serve different purposes:
| Layer | Type | Enforcement | Recovery |
|---|---|---|---|
| DANGER ZONES | Spatial | Hard-block via hooks | Human approval required |
| NEVER DO | Behavioral | Self-assessment in VERIFY | Agent self-corrects |
| IRON LAWS | Invariant | Automated verification | Agent must restore invariant |
Mechanism 2: Five Verification Gates (Layer 1)
Each gate maps to a specific pathological behavior:
Gate 1 — EXIST (Anti-Hallucination):
# Checks: CLAUDE.md exists, key files exist
# Status values: PASS / WARN (no CLAUDE.md) / FAIL (expected but missing)
# Addresses: Agents referencing files, APIs, modules that don't exist
The EXIST gate also checks whether the agent is operating with stale state — if BOUND was expected (from init snapshot) but CLAUDE.md is now missing, it fails rather than warns.
Gate 2 — RELEVANCE (Anti-Drift):
# Uses: git status --short to enumerate changed files
# Cross-references: changed files against DANGER ZONES
# Status: PASS (no DZ contact) / WARN (DZ files touched) / SKIP (no git)
# Addresses: Scope drift, unintended DANGER ZONE contact
Provides a structured list of changed files and DANGER ZONE overlaps for downstream analysis.
Gate 3 — ROOT_CAUSE (Anti-Symptom-Chasing):
# Uses: git log --name-only -10 to find frequently edited files
# Threshold: 3+ edits to same file → "hot file" warning
# Status: PASS (no hot files) / WARN (hot files detected)
# Addresses: The agent editing the same file repeatedly without fixing root cause
This is the gate that fired 4 times in the blockchain session, each time correctly identifying that the agent was fixing symptoms rather than the root cause.
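The gate's hot-file count reduces to a frequency tally over recent history. A sketch, assuming the per-commit changed-file lists have already been parsed out of `git log --name-only -10` (the function name is hypothetical):

```python
from collections import Counter

def hot_files(commits: list[list[str]], threshold: int = 3) -> list[str]:
    """Flag files edited in `threshold`+ of the recent commits.

    set(files) counts each file at most once per commit, so a single
    commit touching a file twice does not inflate its heat score.
    """
    counts = Counter(f for files in commits for f in set(files))
    return sorted(f for f, n in counts.items() if n >= threshold)
```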
Gate 4 — RECALL (Anti-Context-Decay):
# Checks: BOUND section parseable, DANGER ZONES present, IRON LAWS present
# Status: PASS (full BOUND) / WARN (incomplete or missing BOUND)
# Addresses: Agent forgetting constraints during long sessions
Complemented by the recall-gate.sh hook that re-injects BOUND before context compression.
Gate 5 — MOMENTUM (Anti-Analysis-Paralysis):
# Tracks: recent commit frequency
# Status: PASS (2+ recent commits) / WARN (0-1 commits)
# Addresses: Agent stuck in read-only analysis without producing output
Mechanism 3: Three-Layer Reflective Logging
Every verification result is logged as a structured JSONL entry with three layers of increasing abstraction:
Layer 1 — WHAT (Facts):
{
"stage": "BUILD",
"phase": "2/5",
"verdict": "FAIL",
"overall": "REVIEW",
"gates": {
"EXIST": {"status": "+", "detail": "..."},
"ROOT_CAUSE": {"status": "!", "detail": "Hot files: src/payments/stripe.py"}
},
"changed_files": ["src/payments/stripe.py", "src/payments/types.py"],
"danger_zone_contact": ["src/payments/stripe.py (zone: src/payments/)"],
"bound_violations": 0
}
Layer 2 — WHY (Decisions):
{
"complexity": "complex",
"complexity_reason": "Touches DANGER ZONE: src/payments/stripe.py",
"review_reasons": ["DANGER ZONE touched: src/payments/stripe.py"],
"bound_state": {"danger_zones": 2, "never_do": 3, "iron_laws": 2},
"notes": "payment validation failed"
}
Layer 3 — PATTERN (Self-Awareness):
{
"consecutive_failures": 2,
"stuck_loop": false,
"velocity_trend": "DECELERATING",
"retry_rate": 0.40,
"hot_files": ["src/payments/stripe.py"],
"drift_signal": true
}
Each entry also generates actionable alerts for quick LLM consumption:
>> DRIFT: working in DANGER ZONE — extra caution required
>> HOT FILES: src/payments/stripe.py — possible symptom-chasing
>> SLOWING: pass rate declining — reassess approach
The reflective log is capped at 30 entries and designed for the agent to read at the start of each iteration, providing ambient self-awareness without requiring full session replay.
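The append-and-trim behavior of the rolling window can be sketched as follows (an illustration of the 30-entry cap, not the framework's exact code; the function name is hypothetical):

```python
import json
import os

MAX_ENTRIES = 30  # rolling window size, per the framework's design

def append_reflective(path: str, entry: dict) -> None:
    """Append one JSONL entry, then trim the log to the newest 30."""
    lines = []
    if os.path.exists(path):
        with open(path) as f:
            lines = [line for line in f.read().splitlines() if line.strip()]
    lines.append(json.dumps(entry))
    with open(path, "w") as f:
        f.write("\n".join(lines[-MAX_ENTRIES:]) + "\n")
```

Rewriting the whole file on each append is cheap at 30 entries and keeps the trim logic trivial, which is consistent with the ~10ms cost cited earlier.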
Mechanism 4: Pattern Detection Engine
The pattern detection system analyzes the agent's behavioral history to identify recurring problems:
| Pattern | Detection Method | Alert Threshold |
|---|---|---|
| Consecutive failures | Count tail FAIL/RETRY verdicts | 2+ triggers alert |
| Stuck loop | Same stage failing 3+ times in a row | 3 consecutive same-stage failures |
| Velocity trend | Compare pass rates between two halves of recent history | >0.3 swing for ACCELERATING/DECELERATING; requires 6+ entries |
| Hot files | ROOT_CAUSE gate's repeated-edit tracking | 3+ edits to same file |
| Drift signal | RELEVANCE gate's DANGER ZONE contact tracking | Any DANGER ZONE file in changed list |
| Retry rate | Percentage of RETRY verdicts in last 5 entries | Continuous metric, no threshold |
Velocity trend detection is particularly nuanced — it requires at least 6 entries for meaningful analysis and uses a >0.3 swing threshold (not 0.2) to reduce false positives from natural variation.
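The trend computation described above can be sketched directly from those parameters (6-entry minimum, >0.3 swing); labels follow the log examples, and the function name is an assumption:

```python
def velocity_trend(verdicts: list[str]) -> str:
    """Compare pass rates between the older and newer half of recent
    verdicts. Requires 6+ entries; a swing above 0.3 is needed to
    call a trend, damping natural run-to-run variation.
    """
    if len(verdicts) < 6:
        return "STEADY"
    mid = len(verdicts) // 2
    older, newer = verdicts[:mid], verdicts[mid:]
    rate = lambda v: sum(1 for x in v if x == "PASS") / len(v)
    delta = rate(newer) - rate(older)
    if delta > 0.3:
        return "ACCELERATING"
    if delta < -0.3:
        return "DECELERATING"
    return "STEADY"
```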
Mechanism 5: Autonomous Remediation Protocol
When verification fails, the agent follows a structured remediation protocol rather than asking for human help:
VERIFY returns FAIL
│
┌─────────▼──────────┐
│ Is failure inside │
│ a DANGER ZONE? │
└─────────┬──────────┘
│
┌────────┴────────┐
│ │
YES NO
│ │
STOP. Consult
Report to remediation.md
human. decision tree
│ │
▼ ▼
[Human ┌─────────────┐
review] │ EXIST fail? │───► Remove bad reference
│ │ Find correct one
│ RELEVANCE? │───► Stash changes
│ │ Return to plan
│ ROOT_CAUSE? │───► Revert to last good
│ │ Try different approach
│ RECALL? │───► Re-read CLAUDE.md
│ │ Summarize constraints
│ MOMENTUM? │───► Stop reading
│ │ Write something
│ TEST FAIL? │───► In scope? Fix.
│ │ Not in scope? Revert.
└─────────────┘
│
▼
Report what was done:
[REMEDIATED] gate=... action=...
was: ...
did: ...
now: ...
bound: confirm no DZ touched
After every remediation, the agent produces a structured report:
```
[REMEDIATED] gate=ROOT_CAUSE action=revert_and_retry
was: editing src/payments/calc.py for the 4th time (same TypeError)
did: reverted to commit a1b2c3d, re-analyzed from scratch
now: trying middleware pattern instead
bound: no DANGER ZONE touched, no IRON LAW affected
```
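The per-gate branch of the decision tree is essentially a dispatch table. The sketch below mirrors the playbook above; the dictionary, function name, and return strings are illustrative, not Ouro Loop's API.

```python
# Per-gate remediation playbook (illustrative reconstruction of the
# decision tree above; not the actual remediation.md contents).
PLAYBOOK = {
    "EXIST":      "remove bad reference, find the correct one",
    "RELEVANCE":  "stash changes, return to plan",
    "ROOT_CAUSE": "revert to last good commit, try a different approach",
    "RECALL":     "re-read CLAUDE.md, summarize constraints",
    "MOMENTUM":   "stop reading, write something",
    "TEST":       "fix if in scope, otherwise revert",
}

def remediate(gate, in_danger_zone):
    # DANGER ZONE failures always escalate to the human; everything
    # else follows the autonomous playbook.
    if in_danger_zone:
        return "STOP: report to human for review"
    return PLAYBOOK.get(gate, "consult remediation.md decision tree")
```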
Mechanism 6: Complexity Routing
The PLAN stage routes tasks through a complexity matrix that determines the appropriate level of formality:
| Signal | Trivial | Simple | Complex | Architectural |
|---|---|---|---|---|
| Max lines | 20 | 100 | 500 | Unlimited |
| Max files | 1 | 3 | 10 | Unlimited |
| Phases | 0 | 2 | 5 | Unlimited |
| DANGER ZONE | Not touched | Adjacent | Inside | Modifies IRON LAW |
| Risk level | None | Low | Medium | High |
| Dependencies | None | Known | Unknown | External |
Trivial and simple tasks execute directly without phase plans. Complex and architectural tasks require decomposition into independently-verifiable phases ordered by severity (CRITICAL → HIGH → MEDIUM → LOW), each changing 100-300 lines maximum.
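The routing matrix above can be read as a cascade of threshold checks, most restrictive class first. The thresholds below come from the table; the function signature and the string encodings of the DANGER ZONE, risk, and dependency signals are assumptions for illustration.

```python
def classify(lines_changed, files_touched, danger_zone, risk, deps):
    """Route a task by the complexity matrix.

    danger_zone: 'none' | 'adjacent' | 'inside' | 'iron_law'
    risk:        'none' | 'low' | 'medium' | 'high'
    deps:        'none' | 'known' | 'unknown' | 'external'
    """
    # Architectural: modifies an IRON LAW, high risk, or external dependencies.
    if danger_zone == "iron_law" or risk == "high" or deps == "external":
        return "architectural"
    # Complex: exceeds the 'simple' budget (100 lines / 3 files) or touches
    # a DANGER ZONE directly.
    if (lines_changed > 100 or files_touched > 3
            or danger_zone == "inside" or risk == "medium" or deps == "unknown"):
        return "complex"
    # Simple: exceeds the 'trivial' budget (20 lines / 1 file) or is merely
    # adjacent to a DANGER ZONE.
    if (lines_changed > 20 or files_touched > 1
            or danger_zone == "adjacent" or risk == "low" or deps == "known"):
        return "simple"
    return "trivial"
```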
Mechanism 7: Layer 3 Review Triggers
Layer 3 verification determines when human review is mandatory (not advisory):
| Trigger | Condition | Rationale |
|---|---|---|
| DANGER ZONE contact | Any changed file matches a DANGER ZONE | Core safety guarantee |
| Consecutive retries | 3+ consecutive RETRY verdicts | Agent likely stuck |
| Gate failure | Any Layer 1 gate returns FAIL | Serious verification issue |
| Architectural complexity | Complexity detection returns "architectural" | Cross-cutting changes need human judgment |
When Layer 3 triggers, the overall verdict becomes REVIEW (not FAIL), and the agent is expected to stop and present findings to the human.
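The escalation rules above amount to a disjunction of triggers that upgrades the verdict to REVIEW. A minimal sketch, assuming glob-style DANGER ZONE prefixes and illustrative parameter names:

```python
import fnmatch

def layer3_verdict(changed_files, danger_zones, consecutive_retries,
                   gate_failures, complexity):
    """Return 'REVIEW' when any mandatory human-review trigger fires."""
    # Trigger 1: any changed file falls under a DANGER ZONE prefix.
    dz_contact = any(
        fnmatch.fnmatch(f, zone + "*")
        for f in changed_files for zone in danger_zones
    )
    if (dz_contact                      # core safety guarantee
            or consecutive_retries >= 3  # agent likely stuck
            or gate_failures             # any Layer 1 FAIL
            or complexity == "architectural"):
        return "REVIEW"  # mandatory human review, deliberately not FAIL
    return "PASS"
```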
12 Programming Language
Implementation Language: Python 3.10+
Zero external dependencies. The entire framework uses only Python standard library modules:
| Stdlib Module | Usage |
|---|---|
| `os`, `sys`, `shutil` | File operations, state management, atomic writes |
| `json` | State serialization, JSONL log format, config parsing |
| `re` | CLAUDE.md parsing (DANGER ZONES, NEVER DO, IRON LAWS extraction) |
| `subprocess` | Git operations (status, log, commit history) |
| `argparse` | CLI interface for all commands |
| `datetime`, `timezone` | Timestamps in UTC for state and logs |
| `collections.Counter` | Hot file frequency analysis in ROOT_CAUSE gate |
| `stat` | File permission management for executable scripts |
Hooks are written in Bash — shell scripts that integrate with Claude Code's hook system. They parse CLAUDE.md using grep and awk, match file paths, and return exit codes.
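As a rough illustration of the `re`-based CLAUDE.md parsing, the sketch below extracts a bulleted section under a heading. The exact layout Ouro Loop expects is not documented here, so the `## DANGER ZONES` heading format and the function name are assumptions.

```python
import re

def extract_danger_zones(text):
    """Pull bullet entries from an assumed '## DANGER ZONES' section."""
    # Match the heading, then capture every consecutive '- ' bullet line.
    match = re.search(r"^## DANGER ZONES\n((?:^- .*\n?)+)", text, re.MULTILINE)
    if not match:
        return []
    return [line[2:].strip() for line in match.group(1).splitlines()]

doc = """# CLAUDE.md
## DANGER ZONES
- payments/
- billing/
## NEVER DO
- No float for money
"""
```

Calling `extract_danger_zones(doc)` on the sample yields `["payments/", "billing/"]`; the Bash hooks perform the equivalent extraction with grep and awk.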
program.md and modules/*.md — Markdown as Code
A distinctive architectural choice: the core methodology is expressed as Markdown instruction documents, not as code. The agent reads program.md like a skill specification and follows its instructions. This makes the methodology:
- Human-readable — non-technical stakeholders can review and modify the constraints
- Agent-agnostic — any agent that can read Markdown can use Ouro Loop
- Versionable — methodology changes are tracked in git
- Iterable — humans refine the methodology over time
Code Quality Metrics
| Metric | Value |
|---|---|
| Total Python source size | ~91KB across 3 files |
| Test count | 507 |
| Test framework | Standard pytest (inferred from CI) |
| Python version requirement | 3.10+ (for match/case, union types) |
| Dependencies | 0 |
| Linting | Not specified (but CI passes) |
13 Memory Management
State Persistence: .ouro/state.json
The primary state file tracks the agent's position in the loop:
```json
{
  "project_name": "my-payment-service",
  "current_stage": "BUILD",
  "current_phase": 2,
  "total_phases": 5,
  "bound_defined": true,
  "history": [
    {
      "stage": "BUILD",
      "phase": 1,
      "verdict": "PASS",
      "timestamp": "2026-03-15T10:00:00+00:00"
    }
  ],
  "updated_at": "2026-03-15T10:30:00+00:00"
}
```
Atomic writes via os.replace() with fallback to shutil.move() for cross-device scenarios (Docker volumes, NFS mounts).
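The atomic-write pattern described above can be sketched as follows: write to a temporary file in the same directory, then `os.replace()`, falling back to `shutil.move()` if the rename fails across filesystems. The helper name is illustrative; the two stdlib calls are the ones the framework is stated to use.

```python
import json
import os
import shutil
import tempfile

def atomic_write_json(path, data):
    """Write JSON so readers never observe a partially-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
        try:
            os.replace(tmp, path)   # atomic on the same filesystem
        except OSError:
            shutil.move(tmp, path)  # cross-device fallback (Docker volumes, NFS)
    finally:
        if os.path.exists(tmp):     # clean up only if the rename never happened
            os.remove(tmp)
```

Writing the temp file into the target directory (rather than the system temp dir) keeps the common case on one filesystem, so the fallback is rarely needed.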
Reflective Log: Three-Layer Self-Awareness Memory
The reflective log (.ouro/reflective-log.jsonl) is the system's primary memory mechanism. It serves as a compressed behavioral history that the agent reads at the start of each iteration.
Key design decisions:
- JSONL format — each line is a self-contained JSON object, enabling simple append
- 30-entry rolling window — prevents unbounded growth while maintaining enough history for pattern detection
- Three-layer structure — WHAT (facts), WHY (decisions), PATTERN (self-awareness) — designed for fast LLM parsing
- Actionable alerts — pre-computed summary alerts that the agent can act on without parsing raw data
This is the closest analog to a "working memory" for the AI agent — it provides context about past iterations without requiring full session replay.
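The JSONL append with a 30-entry rolling window can be sketched as below. The file name matches the text; the function names and the rewrite-on-append strategy are assumptions about how the window might be enforced.

```python
import json
import os

WINDOW = 30  # rolling window size from the design above

def append_entry(log_path, entry):
    """Append one JSONL entry, then truncate to the most recent WINDOW lines."""
    lines = []
    if os.path.exists(log_path):
        with open(log_path) as f:
            lines = f.read().splitlines()
    lines.append(json.dumps(entry))
    with open(log_path, "w") as f:
        f.write("\n".join(lines[-WINDOW:]) + "\n")

def read_entries(log_path):
    """Load the full window; each line is a self-contained JSON object."""
    with open(log_path) as f:
        return [json.loads(line) for line in f if line.strip()]
```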
Results Audit Trail: ouro-results.tsv
A lightweight, human-readable audit log:
```
phase  verdict  bound_violations  notes
1/3    PASS     0                 transactions endpoint + tests
2/3    RETRY    0                 ROOT_CAUSE warning, fixing
2/3    PASS     0                 fixed after retry
3/3    PASS     0                 validation complete
```
Sentinel Memory: learnings.md
The Sentinel module maintains a cross-session knowledge accumulator (learnings.md) that is updated every 10 iterations with patterns and insights discovered during code review. This provides long-term institutional knowledge.
Context Decay Prevention
Ouro Loop implements three strategies against context decay:
- RECALL gate — verifies the agent can still state the task and top 3 constraints
- `recall-gate.sh` hook — re-injects BOUND into context before compression
- Explicit instructions in `program.md` — every 5 phases or ~30 minutes, the agent should run the RECALL gate and re-read CLAUDE.md if needed
Comparison with autoresearch Memory
| Aspect | autoresearch | Ouro Loop |
|---|---|---|
| State tracking | `best_val_bpb` metric file | `state.json` (stage, phase, history) |
| History | Git commit log | Reflective log (JSONL) + results TSV |
| Self-awareness | Implicit (metric comparison) | Explicit (pattern detection, drift signals) |
| Long-term memory | None (stateless between runs) | Sentinel `learnings.md` |
| Context management | Not addressed | RECALL gate + recall hook + `program.md` instructions |
14 Continued Learning
Learning Through Reflective Logging
Ouro Loop's primary learning mechanism is the three-layer reflective log. The agent doesn't "learn" in the ML sense — it doesn't update model weights. Instead, it maintains a behavioral pattern memory that informs future iterations:
- Hot file detection → Agent learns to avoid symptom-chasing on frequently-edited files
- Velocity tracking → Agent detects when its approach is stalling and changes strategy
- Stuck loop detection → Agent recognizes when the same stage fails 3+ times and tries fundamentally different approaches
- Drift monitoring → Agent stays aware of unintended DANGER ZONE contact
Learning Through BOUND Evolution
The LOOP stage explicitly instructs the agent to feed discoveries back into BOUND:
"Did this phase reveal anything that should change the plan?" - New constraint discovered → Add to BOUND in CLAUDE.md - Remaining phases need adjustment → Update the plan - Similar pattern found → Note it for future phases
This creates a self-improving constraint system where the agent actively expands the safety boundary based on runtime experience.
Sentinel Cross-Session Learning
The Sentinel module implements longer-term learning:
- `learnings.md` — updated every 10 iterations with discovered patterns
- `suppressed.json` — deduplication store for confirmed false positives
- Partition history — tracks which areas have been reviewed and when
- Findings journal — JSONL log of all identified issues with severity ratings
Limitations of Learning
- No weight updates — the agent's underlying LLM is not fine-tuned. Learning is purely in-context.
- Session-bounded (main loop) — reflective log is per-session. New sessions start fresh (must read state.json to resume).
- Human-dependent BOUND quality — the quality of learning depends on well-defined constraints. Poorly specified BOUND leads to poor learning signals.
- No generalization across projects — patterns learned in one project don't transfer to another (no shared memory).
15 Applications
Application 1: Overnight Autonomous Development
The primary use case. Workflow:
1. Developer defines BOUND in CLAUDE.md (15-30 minutes)
2. Developer describes the task and points the agent at `program.md`
3. Developer goes to sleep
4. Agent runs MAP → PLAN → BUILD → VERIFY → LOOP for each phase
5. Agent remediates failures autonomously (inside BOUND)
6. Developer wakes up to `ouro-results.tsv` showing completed phases
Real example: Blockchain L1 consensus investigation — agent tested 5 hypotheses, remediated 4 failures, found architectural root cause, all without human intervention.
Application 2: Production-Safe AI Coding
For domains where errors are catastrophic:
| Domain | DANGER ZONES | IRON LAWS | NEVER DO |
|---|---|---|---|
| Financial systems | `payments/`, `billing/` | Decimal precision, audit trail | No float for money |
| Blockchain | `consensus/`, `p2p/` | Zero SysErr rate | No vote logic changes |
| Medical software | `dosage/`, `patient/` | Unit consistency | No unvalidated inputs |
| Authentication | `auth/`, `session/` | Token expiry enforcement | No plaintext passwords |
Application 3: 24/7 Continuous Code Review (Sentinel)
Sentinel enables continuous, unattended code review:
- Partition scanning — project directories scored by risk (DANGER ZONE overlap, git activity, file count)
- Priority-based review — high-criticality partitions reviewed first
- Finding tracking — severity-rated findings logged in JSONL format
- Auto-fix capability — configurable fix attempts with blast radius limits
- Dashboard monitoring — live progress visualization
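Partition scanning can be sketched as a weighted score over the three signals named above. The weights and caps below are assumptions for illustration, not Sentinel's actual formula.

```python
def partition_risk(danger_zone_overlap, recent_commits, file_count):
    """Combine the three risk signals into a single priority score."""
    score = 0.0
    score += 5.0 if danger_zone_overlap else 0.0  # safety signal dominates
    score += min(recent_commits, 20) * 0.2        # git churn, capped
    score += min(file_count, 100) * 0.01          # partition size, capped
    return score

# Hypothetical partitions; highest-risk is reviewed first.
partitions = {
    "src/payments": partition_risk(True, 12, 40),
    "src/ui":       partition_risk(False, 3, 80),
}
ordered = sorted(partitions, key=partitions.get, reverse=True)
```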
Application 4: Autoresearch Extension
Ouro Loop explicitly extends Karpathy's autoresearch paradigm:
| Aspect | autoresearch | Ouro Loop |
|---|---|---|
| Domain | ML experiments | General software engineering |
| Constraint | 5-minute training budget | BOUND (DANGER ZONES, NEVER DO, IRON LAWS) |
| Metric | `val_bpb` (single scalar) | Multi-layer verification (gates + self-assessment) |
| On failure | Auto-revert, next experiment | Auto-remediate, try alternative approach |
| Human programs | `program.md` (experiment strategy) | `program.md` (dev strategy) + `CLAUDE.md` (boundaries) |
| AI modifies | `train.py` (model code) | Target project code + `framework.py` |
| Read-only | `prepare.py` | `prepare.py` + `modules/` |
| Context awareness | None | Three-layer reflective log + pattern detection |
Application 5: Multi-Phase Feature Development
For complex features requiring structured decomposition:
- Complexity routing — automatically classifies task as trivial/simple/complex/architectural
- Severity ordering — CRITICAL phases first, then HIGH, MEDIUM, LOW
- Phase isolation — each phase independently verifiable (100-300 lines max)
- Phase advancement — agent advances without human permission (NEVER STOP instruction)
- Plan adaptation — remaining phases updated based on discoveries
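The severity-first ordering above is a simple rank sort. A minimal sketch, with illustrative phase records:

```python
# CRITICAL phases run first, then HIGH, MEDIUM, LOW (per the list above).
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

def order_phases(phases):
    """Stable sort by severity rank, preserving plan order within a rank."""
    return sorted(phases, key=lambda p: SEVERITY_RANK[p["severity"]])

# Hypothetical phase plan for demonstration.
plan = [
    {"name": "refactor logging", "severity": "LOW"},
    {"name": "fix race in ledger writes", "severity": "CRITICAL"},
    {"name": "add input validation", "severity": "HIGH"},
]
```

Because Python's sort is stable, phases of equal severity keep their original plan order.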
Open Questions and Future Directions
- Multi-agent coordination: Currently single-agent. How would BOUND work with multiple agents operating on different parts of the codebase?
- BOUND learning: Could BOUND constraints be semi-automatically derived from project history (e.g., files that caused production incidents become DANGER ZONES)?
- Cross-project transfer: Could remediation patterns learned on one project transfer to another?
- Quantitative evaluation: The system lacks standardized benchmarks — results are presented through session logs rather than reproducible metrics.
- Hook ecosystem: Currently 5 hooks for Claude Code. Extending to other agents (Cursor, Aider) requires agent-specific enforcement mechanisms.
Comparison with Related Systems
| System | Approach | Enforcement | Self-Repair | Memory |
|---|---|---|---|---|
| Ouro Loop | Bounded autonomy + methodology | Runtime hooks (exit 2 hard-block) | Autonomous remediation with playbook | Reflective log + pattern detection |
| autoresearch | Metric-driven experiment loop | Budget constraint (5 min) | Auto-revert on metric regression | None |
| .cursorrules | Static instruction file | None (hope-based) | None | None |
| CLAUDE.md | Static instruction file | None (agent compliance) | None | None |
| Devin | Full autonomous agent | Proprietary guardrails | Built-in (proprietary) | Session memory (proprietary) |
| SWE-Agent | Issue resolution agent | Test suite pass/fail | Retry with feedback | Episode memory |
Ouro Loop occupies a unique position: it is not an AI agent itself, but a methodology and runtime framework that makes existing agents safer and more autonomous. This separation of concerns — the agent provides intelligence, Ouro Loop provides structure — is its key architectural insight.