Ouro Loop
Bounded-Autonomy Framework for AI Coding Agents with Runtime-Enforced Guardrails, Five Verification Gates, Three-Layer Self-Reflection, and Autonomous Remediation
Organization: VictorVVedtion (independent researcher)
Published: March 2026
Type: repo
Report Type: PhD-Level Technical Analysis
Report Date: April 2026
Table of Contents
- Full Title and Attribution
- Authors and Team
- Core Contribution
- Supported Solutions
- LLM Integration
- Key Results
- Reproducibility
- Compute and API Costs
- Architecture Solution
- Component Breakdown
- Core Mechanisms (Detailed)
- Programming Language
- Memory Management
- Continued Learning
- Applications
1 Full Title and Attribution
Full Title: Ouro Loop — Autonomous AI Agent Development Framework with Bounded Autonomy
Repository: github.com/VictorVVedtion/ouro-loop
License: MIT
Status: Experimental, actively developed (March 2026)
Stars: ~10 (early stage, March 2026)
Package: pip install ouro-loop (PyPI)
Inspiration: Directly extends Andrej Karpathy's autoresearch paradigm from ML experiment loops to general software engineering. The name "Ouro" references the Ouroboros — the serpent that eats its own tail — symbolizing the agent's ability to consume its own errors and iterate autonomously.
Tagline:
"To grant an entity absolute autonomy, you must first bind it with absolute constraints."
Test Suite: 507 tests passing (CI via GitHub Actions)
2 Authors and Team
Ouro Loop is developed by VictorVVedtion, an independent researcher/developer. The project is solo-authored and was built primarily with Claude Code as the AI co-development partner. No academic affiliation is listed.
The author positions Ouro Loop as a philosophical framework first and a software tool second. The accompanying MANIFESTO.md — titled "The Ouroboros Contract" — articulates a theory of Precision Autonomy wherein defining what an agent cannot do is the precondition for granting it full creative freedom within the remaining space. This is a notable departure from the instruction-based guardrail patterns common in prompt engineering.
The author's background appears to span blockchain infrastructure, consumer products, and financial systems — all domains from which Ouro Loop draws real-world validation examples.
3 Core Contribution
Key Contribution: Ouro Loop formalizes the concept of Bounded Autonomy for AI coding agents, implementing a six-stage autonomous development loop (BOUND → MAP → PLAN → BUILD → VERIFY → LOOP) with runtime-enforced constraints, multi-layer verification gates, three-layer self-reflective logging, and autonomous remediation — enabling AI agents to work unsupervised for extended periods without human babysitting while maintaining safety guarantees.
The Problem Statement
In the era of "vibe coding," unbound AI agents exhibit four pathological behaviors:
- Hallucination — referencing files, APIs, and modules that do not exist
- Regression — breaking established architectural patterns and constraints
- Symptom-chasing — repeatedly editing the same file without addressing root causes
- Context decay — forgetting critical constraints during long sessions
Existing approaches fall into two extremes:
| Approach | Problem |
|---|---|
| Human-in-the-loop | Constant interruptions negate autonomous value; developer becomes a babysitter |
| Instruction-only guardrails | .cursorrules and CLAUDE.md define static instructions the agent can ignore |
The Ouro Loop Solution
Ouro Loop introduces three key innovations:
1. The Event Horizon (BOUND): Before any code is written, the developer defines absolute constraints — DANGER ZONES (protected files), NEVER DO rules (absolute prohibitions), and IRON LAWS (invariants that must always hold). These constraints define a boundary the agent physically cannot cross.
2. Runtime Enforcement via Hooks: Unlike instruction-based approaches that rely on agent compliance, Ouro Loop enforces BOUND constraints through Claude Code Hooks that operate at the tool level. A `bound-guard.sh` hook intercepts `Edit` and `Write` operations and returns `exit 2` (hard block) for DANGER ZONE files. The agent cannot bypass this — it is a runtime constraint, not a behavioral suggestion.
3. Autonomous Remediation: When verification fails, the agent does not alert the human. It consults its remediation playbook (`modules/remediation.md`), decides on a fix strategy (revert, retry alternative, escalate), executes it, and reports what it did. The agent only escalates to human review when changes touch a DANGER ZONE or when 3+ consecutive retries fail.
Theoretical Framework
The author articulates this as the Ouroboros Contract:
By explicitly defining the 20 things an agent can never do (the BOUND), you implicitly authorize it to autonomously do the 10,000 things required to solve the problem. The constraint space defines the creative space.
This is a meaningful conceptual advance over both:
- Allowlisting (specify everything the agent can do — doesn't scale)
- Instruction-following (hope the agent obeys — brittle)
Instead, it proposes constraint-based creative freedom — a small set of hard constraints enables a large space of autonomous action.
4 Supported Solutions
| Solution Type | Support Level | Description |
|---|---|---|
| Overnight autonomous development | Primary use case | Define BOUND, start agent, sleep. Agent iterates through Build→Verify→Self-Fix cycles. |
| Long-running refactoring | Supported | Phase-based refactoring with verification gates ensuring nothing breaks between phases. |
| Continuous code review | Built-in (Sentinel) | 24/7 unattended code review with partition-based scanning and risk scoring. |
| CI/CD agent integration | Supported | Agents handle build failures, test regressions, and dependency updates autonomously. |
| Production-safe AI coding | Primary design goal | Financial systems, blockchain, medical software — domains where "move fast and break things" is unacceptable. |
| Multi-phase feature development | Supported | Severity-ordered phases: CRITICAL → HIGH → MEDIUM → LOW. Agent processes in order. |
| Root-cause investigation | Demonstrated | Real session logs show agents testing multiple hypotheses and finding architectural root causes. |
What Ouro Loop Is NOT
The project explicitly excludes certain use cases:
- Quick prototypes or hackathon projects (BOUND setup overhead not justified)
- Single-file scripts (methodology overhead exceeds benefit)
- Real-time interactive coding (designed for "set it and let it run")
Agent Compatibility
Ouro Loop is agent-agnostic — it works with any AI coding assistant that can read files and execute terminal commands:
| Agent | Integration Level |
|---|---|
| Claude Code (Anthropic) | Primary target; native program.md skill support, hook enforcement |
| Cursor | Via .cursorrules referencing Ouro Loop modules |
| Aider | Terminal-based; reads markdown instructions and executes Python |
| Codex CLI (OpenAI) | Shell command execution and file reading |
| Windsurf (Codeium) | Standard file/command integration |
| Any agent with file read + shell exec | General compatibility via program.md |
5 LLM Integration
LLM-Agnostic by Design
Ouro Loop does not call LLMs directly. It is a methodology framework and runtime state machine that sits between the developer and the AI agent. The LLM integration happens at the agent level (Claude Code, Cursor, etc.), not at the Ouro Loop level.
This is a critical architectural distinction from systems like autoresearch:
┌─────────────────────────────────────────────────┐
│ Developer │
│ 1. Define BOUND in CLAUDE.md │
│ 2. Point agent at program.md │
│ 3. Sleep │
└──────────────────────┬──────────────────────────┘
│
┌──────────────────────▼──────────────────────────┐
│ AI Agent (any) │
│ ┌──────────────────────────────────────────┐ │
│ │ Reads program.md → follows methodology │ │
│ │ Uses framework.py CLI → state + verify │ │
│ │ Hooks enforce BOUND at tool call level │ │
│ └──────────────────────────────────────────┘ │
│ │
│ LLM calls happen HERE, inside the agent │
└──────────────────────┬──────────────────────────┘
│
┌──────────────────────▼──────────────────────────┐
│ Target Project Codebase │
│ CLAUDE.md (BOUND) + .ouro/ (state) + code │
└─────────────────────────────────────────────────┘
How the Agent Uses Ouro Loop
The agent is instructed (via program.md) to:
- Read CLAUDE.md — extract DANGER ZONES, NEVER DO, IRON LAWS
- Use framework.py CLI — manage state, run verification, log results
- Follow the six-stage loop — BOUND → MAP → PLAN → BUILD → VERIFY → LOOP
- Self-remediate on failure — consult remediation playbook, decide action, execute, report
The LLM's reasoning capabilities are used for:
- MAP stage: understanding the problem space (6 questions)
- PLAN stage: complexity estimation and phase decomposition
- BUILD stage: code generation (RED → GREEN → REFACTOR → COMMIT)
- VERIFY stage: self-assessment against BOUND constraints
- REMEDIATE: analyzing failures and choosing alternative approaches
Sentinel: Claude Code Integration
The Sentinel module explicitly targets Claude Code as the AI agent, using:
- --permission-mode bypassPermissions for unattended operation
- Claude Code's native session management and tool calling
- claude-opus-4-6 as the default model for review sessions
6 Key Results
Blockchain L1 — Consensus Performance Under Load
An AI agent used Ouro Loop to investigate why precommit latency spiked from 4ms to 200ms under transaction load on a 4-validator PBFT blockchain. Full session log available in examples/blockchain-l1/session-log.md.
Key findings:
- 5 hypotheses tested, 4 autonomous remediations — the ROOT_CAUSE gate fired 4 times, correctly identifying symptom-fixing
- After 3 consecutive failed hypotheses, the 3-failure step-back rule activated: "stop fixing symptoms, re-examine the architecture"
- Root cause was architectural, not code-level — a single-node HTTP bottleneck causing consensus-wide delays; fix was a Caddy reverse proxy
- Agent caught its own flawed experiment — identified running 4x full stress instead of 1x distributed before drawing wrong conclusions
| Metric | Before | After | Delta |
|---|---|---|---|
| Precommit (under load) | 100-200ms | 4ms | -98% |
| Block time (under load) | 111-200ms | 52-57ms | -53% |
| TPS Variance | 40.6% | 1.6% | -96% |
| SysErr rate | 0.00% | 0.00% | = (IRON LAW maintained) |
| Blocks/sec (soak load) | ~8.0 | ~18.5 | +131% |
Consumer Product — Lint Remediation in React/Next.js
A simpler session where the ROOT_CAUSE gate demonstrated its value:
- Agent attempted to suppress ESLint errors (symptom-fixing)
- ROOT_CAUSE gate identified the pattern and flagged it
- Agent was pushed toward genuinely better solutions:
  - Replacing `<img>` with Next.js `<Image>`
  - Properly handling `useEffect` state patterns
  - Using framework-appropriate patterns instead of suppression
Quantitative Framework Metrics
| Metric | Value |
|---|---|
| Test suite | 507 tests passing |
| Codebase size | ~47KB framework.py, ~30KB sentinel.py, ~14KB prepare.py, ~9KB program.md |
| Dependencies | Zero (pure Python 3.10+ stdlib) |
| Hook enforcement | Verified exit code 2 hard-block on DANGER ZONE files |
| Reflective log | JSONL format, 30-entry rolling window |
| Results audit trail | TSV format, full phase/verdict/violation logging |
7 Reproducibility
Installation
pip install ouro-loop
Or clone:
git clone https://github.com/VictorVVedtion/ouro-loop.git ~/.ouro-loop
Minimal Reproduction Steps
cd /path/to/your/project
# 1. Scan project structure
python -m prepare scan .
# 2. Initialize state directory (.ouro/)
python -m prepare init .
# 3. Generate CLAUDE.md template
python -m prepare template claude .
# 4. Edit CLAUDE.md with real BOUND constraints
# Define DANGER ZONES, NEVER DO, IRON LAWS
# 5. Point AI agent at program.md
# "Read program.md and CLAUDE.md. Start the Ouro Loop for: [task]"
Reproducibility Assessment
| Factor | Assessment |
|---|---|
| Code availability | Fully open-source, MIT license, pip installable |
| Zero dependencies | Pure Python 3.10+ stdlib — no dependency conflicts possible |
| Agent requirement | Requires a capable AI coding agent (Claude Code recommended) |
| Determinism | Non-deterministic — results depend on LLM reasoning and stochastic code generation |
| Session logs | Two real session logs provided in examples/ with detailed methodology observations |
| Test suite | 507 tests covering runtime logic, verification gates, and state management |
| CI/CD | GitHub Actions CI with test badge |
Limitations on Reproducibility
- LLM dependency: Results depend on the specific LLM used by the underlying agent. Different models may produce different remediation strategies.
- Project-specific BOUND: The methodology's effectiveness depends on the quality of BOUND definition. Poorly specified constraints lead to poor outcomes.
- Qualitative measurement: The core value proposition (overnight autonomous development without breakage) is measured qualitatively through session logs, not through standardized benchmarks.
8 Compute and API Costs
Framework Overhead: Near-Zero
Ouro Loop itself adds negligible compute overhead:
| Component | Resource Impact |
|---|---|
| `framework.py` CLI | ~50ms per command (state read/write, git calls) |
| Verification gates | ~200ms total (5 gates, each running git subprocess with 10s timeout) |
| Reflective log write | ~10ms (JSONL append + trim to 30 entries) |
| Hook evaluation | ~100ms per hook (shell script, CLAUDE.md parsing) |
| State management | ~5ms (JSON read/write with atomic replace) |
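The "atomic replace" pattern behind the ~5ms state writes can be sketched with the stdlib alone. This is an illustrative sketch, not Ouro Loop's actual implementation; the function names are hypothetical:

```python
import json
import os
import tempfile

def save_state(path: str, state: dict) -> None:
    """Write JSON state atomically: write a temp file, then rename.

    os.replace() is atomic on both POSIX and Windows, so a crash
    mid-write can never leave a half-written state.json behind.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f, indent=2)
        os.replace(tmp_path, path)  # atomic swap into place
    except BaseException:
        os.unlink(tmp_path)
        raise

def load_state(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```

The temp file is created in the same directory as the target so the rename never crosses a filesystem boundary, which would break atomicity.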
LLM Costs: Agent-Dependent
Since Ouro Loop doesn't call LLMs directly, API costs come from the underlying agent:
| Scenario | Estimated Tokens | Estimated Cost (Claude Sonnet) |
|---|---|---|
| `program.md` initial read | ~4,000 tokens | ~$0.01 |
| CLAUDE.md parsing | ~500-2,000 tokens | ~$0.005 |
| Per-phase MAP+PLAN+BUILD cycle | ~10,000-50,000 tokens | ~$0.10-$0.50 |
| Remediation cycle (on failure) | ~5,000-15,000 tokens | ~$0.05-$0.15 |
| Full overnight session (10 phases) | ~200,000-500,000 tokens | ~$2-$5 |
Sentinel 24/7 Review Costs
The Sentinel module runs continuous Claude Code sessions:
| Parameter | Default | Impact |
|---|---|---|
| Model | `claude-opus-4-6` | Premium pricing |
| Max turns per session | 200 | Upper bound on single-session cost |
| Session timeout | 120 minutes | Prevents runaway sessions |
| Cooldown between sessions | 30 seconds | Rate limiting |
| Permission mode | `bypassPermissions` | No human approval delays |
Estimated 24/7 Sentinel cost: $50-$200/day depending on project size, partition count, and finding density. This is significant and should be budgeted for production use.
9 Architecture Solution
System Design: Three Files That Matter
Ouro Loop is architecturally minimalist by design — the entire system consists of three files:
┌────────────────────────────────────────────────────────┐
│ program.md │
│ "The Methodology" │
│ │
│ Human-authored instructions the AI agent follows. │
│ Defines the six-stage loop: BOUND → MAP → PLAN → │
│ BUILD → VERIFY → LOOP. │
│ Iterated by the HUMAN. │
│ │
│ Analogous to autoresearch's program.md, but for │
│ general software engineering instead of ML. │
└────────────────────────┬───────────────────────────────┘
│ Agent reads
┌────────────────────────▼───────────────────────────────┐
│ framework.py │
│ "The Runtime" │
│ │
│ Lightweight state machine + CLI for the agent. │
│ State tracking, verification gates, reflective log. │
│ Can be EXTENDED by the agent. │
│ │
│ Analogous to autoresearch's train.py — the file │
│ the agent iterates on. │
└────────────────────────┬───────────────────────────────┘
│ Agent uses
┌────────────────────────▼───────────────────────────────┐
│ prepare.py │
│ "The Initializer" │
│ │
│ Project scanning and .ouro/ directory creation. │
│ NOT MODIFIED by agent or human after init. │
│ │
│ Read-only reference, like autoresearch's prepare.py. │
└────────────────────────────────────────────────────────┘
The Six-Stage Loop
┌──────────────────────────────────────────────────┐
│ │
│ Stage 0: BOUND ◄── Human defines constraints │
│ │ │
│ ▼ │
│ Stage 1: MAP ──── Understand problem space │
│ │ (6 mandatory questions) │
│ ▼ │
│ Stage 2: PLAN ─── Decompose by severity │
│ │ (CRITICAL→HIGH→MEDIUM→LOW) │
│ ▼ │
│ Stage 3: BUILD ── RED→GREEN→REFACTOR→COMMIT │
│ │ │
│ ▼ │
│ Stage 4: VERIFY ─ Three-layer verification │
│ │ │ Layer 1: 5 automated gates │
│ │ │ Layer 2: Self-assessment │
│ │ │ Layer 3: Human review trigger │
│ │ │ │
│ │ ├── FAIL ──► REMEDIATE ──► back to BUILD │
│ │ │ (autonomous, inside BOUND) │
│ │ │ │
│ │ └── PASS ──► Stage 5: LOOP │
│ │ │ │
│ │ ├── Feed discoveries back │
│ │ ├── Update BOUND if needed │
│ │ └── Advance to next phase │
│ │ │
│ └────────────────── Repeat until all phases │
│ complete or DANGER ZONE hit │
└──────────────────────────────────────────────────┘
Enforcement Architecture: The Hook System
The hook system creates a runtime enforcement layer that operates below the agent's reasoning:
Agent decides to edit a file
│
┌─────────▼──────────┐
│ Claude Code Tool │
│ call: Edit/Write │
│ target: file.py │
└─────────┬──────────┘
│ PreToolUse event
┌─────────▼──────────┐
│ bound-guard.sh │
│ │
│ 1. Parse CLAUDE.md │
│ 2. Extract DANGER │
│ ZONES │
│ 3. Match file path │
│ against zones │
│ 4. Path-segment │
│ aware matching │
└─────────┬──────────┘
│
┌─────┴─────┐
│ │
IN ZONE NOT IN ZONE
│ │
exit 2 exit 0
(BLOCKED) (allowed)
│ │
Agent sees Tool executes
denial msg normally
This is architecturally significant because it makes BOUND enforcement non-bypassable from the agent's perspective. No amount of prompt injection, context confusion, or reasoning error can override an exit code 2 from a PreToolUse hook.
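The real hook is a shell script, but its exit-code contract can be sketched in Python (illustrative only; function names and the simplified prefix matching are assumptions, and the repository's matcher is segment-aware rather than prefix-based):

```python
import sys

def is_blocked(target: str, danger_zones: list[str]) -> bool:
    """Return True if target falls inside any DANGER ZONE.

    Simplified: zones ending in "/" are treated as directory
    prefixes; other zones must match the file path exactly.
    """
    for zone in danger_zones:
        if zone.endswith("/"):
            if target.startswith(zone):
                return True
        elif target == zone:
            return True
    return False

def guard(target: str, danger_zones: list[str]) -> int:
    """Mimic the hook's contract: exit 2 = hard block, 0 = allow."""
    if is_blocked(target, danger_zones):
        print(f"BLOCKED: {target} is inside a DANGER ZONE", file=sys.stderr)
        return 2
    return 0
```

The key property is that the decision runs outside the model: the exit code is evaluated by the agent harness before the tool executes, so no reasoning state inside the LLM can change the outcome.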
Data Flow: State and Logging
.ouro/
├── state.json ← Current loop state (stage, phase, history)
├── reflective-log.jsonl ← Three-layer self-awareness log (30 entries max)
└── sentinel/ ← Sentinel-specific state (if initialized)
├── sentinel-config.json
├── partitions.json
├── state.json
├── findings.jsonl
├── iteration-log.jsonl
├── suppressed.json
└── learnings.md
ouro-results.tsv ← Audit trail (phase/verdict/violations)
CLAUDE.md ← BOUND definition (DANGER ZONES, NEVER DO, IRON LAWS)
10 Component Breakdown
Component 1: program.md — The Methodology (9KB)
The central instruction document that turns any AI agent into an Ouro Loop agent. Key structural elements:
| Section | Purpose | Lines |
|---|---|---|
| Setup | Bootstrap sequence (read CLAUDE.md, check .ouro/) | ~15 |
| BOUND | Constraint definition requirements | ~25 |
| The Loop (MAP) | 6 mandatory questions before coding | ~15 |
| The Loop (PLAN) | Complexity routing table + phase decomposition | ~30 |
| The Loop (BUILD) | RED-GREEN-REFACTOR-COMMIT + 3 self-questions | ~25 |
| The Loop (VERIFY) | Three-layer verification specification | ~40 |
| The Loop (REMEDIATE) | Decision tree for autonomous failure handling | ~35 |
| The Loop (LOOP) | Feedback closure and phase advancement | ~20 |
| Context Management | Anti-context-decay strategies | ~20 |
| Rules | CAN DO / CANNOT DO / NEVER STOP | ~20 |
The VERIFY section includes a structured decision tree for remediation that the agent follows on failure:
VERIFY failed
│
Is the failure inside a DANGER ZONE?
│
YES → STOP. Report to human.
│
NO → What type of failure?
│
EXIST (hallucination) → Remove bad reference, find correct one
RELEVANCE (drift) → Stash out-of-scope changes, return to plan
ROOT_CAUSE (stuck) → Revert to last good state, different approach
RECALL (context decay) → Re-read CLAUDE.md BOUND section
MOMENTUM (stuck) → Stop reading, write something, iterate
TEST FAILURE → In scope? Fix. Not in scope? Revert.
Component 2: framework.py — The Runtime (47KB)
The state machine and verification engine. This is the file the agent interacts with via CLI.
| Subcomponent | LOC (est.) | Purpose |
|---|---|---|
| State management | ~80 | Load/save .ouro/state.json with atomic writes |
| CLAUDE.md parser | ~120 | Structured extraction of DANGER ZONES, NEVER DO, IRON LAWS |
| Complexity detection | ~50 | Trivial/simple/complex/architectural routing |
| Verification engine | ~200 | Layer 1 (5 gates) + Layer 2 (self-assessment) + Layer 3 (review triggers) |
| Pattern detection | ~80 | Consecutive failures, stuck loops, velocity trends, hot files, drift |
| Reflective logging | ~150 | Three-layer JSONL log (WHAT/WHY/PATTERN) with 30-entry rolling window |
| Result logging | ~50 | TSV audit trail (phase/verdict/violations) |
| CLI interface | ~100 | argparse-based: status, verify, log, advance, bound-check, reflect |
CLAUDE.md Parser — Dual-Strategy Extraction:
The parser uses a two-phase extraction strategy:
1. Primary extraction (structured): Regex-based parsing of standard section headers (`### DANGER ZONES`, `### NEVER DO`, `### IRON LAWS`). Extracts backtick-wrapped paths from DANGER ZONES, list items from NEVER DO and IRON LAWS.
2. Fallback extraction (prose): If primary extraction finds nothing but BOUND markers are present, the parser switches to heuristic mode — scanning for path-like strings near "DANGER" keywords, lines starting with "Never"/"Do not", and lines containing "must"/"always" near backtick-wrapped code.
This dual-strategy design accommodates both well-structured and free-form CLAUDE.md files.
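The primary (structured) strategy can be approximated in a few lines of `re`. A minimal sketch under the assumption that sections end at the next markdown header; the function name is hypothetical:

```python
import re

def extract_danger_zones(claude_md: str) -> list[str]:
    """Structured extraction: isolate the '### DANGER ZONES' section,
    then pull the backtick-wrapped paths out of its list items.
    """
    # Capture everything between the DANGER ZONES header and the
    # next markdown header (or end of file).
    m = re.search(r"^### DANGER ZONES\s*\n(.*?)(?=^#{1,6} |\Z)",
                  claude_md, re.MULTILINE | re.DOTALL)
    if not m:
        return []
    return re.findall(r"`([^`]+)`", m.group(1))
```

A fallback pass, per the description above, would only run when this returns nothing despite BOUND markers being present.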
Path-Segment-Aware DANGER ZONE Matching:
# Zone "auth/" matches "auth/login.py" but NOT "unauthorized.py"
# Zone "auth/core.py" matches exactly that file
# Zone ending with "/" is treated as a directory prefix
The matcher splits both the file path and zone pattern into segments and checks for contiguous subsequence matching. This prevents false positives from substring matching (e.g., "auth" matching "unauthorized").
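A minimal sketch of that segment-aware check, assuming the semantics described above (contiguous segment runs, trailing "/" as directory marker); this is an illustration, not the repository's exact code:

```python
def path_in_zone(path: str, zone: str) -> bool:
    """Segment-aware DANGER ZONE matching.

    Comparing whole path segments prevents substring false positives:
    zone "auth/" matches "auth/login.py" but not "unauthorized.py".
    """
    path_parts = [p for p in path.split("/") if p]
    zone_parts = [p for p in zone.split("/") if p]
    is_dir_zone = zone.endswith("/")
    n = len(zone_parts)
    # Look for the zone's segments as a contiguous run inside the path.
    for i in range(len(path_parts) - n + 1):
        if path_parts[i:i + n] == zone_parts:
            # A directory zone matches anything beneath it; a file zone
            # must end exactly at the path's final segment.
            if is_dir_zone or i + n == len(path_parts):
                return True
    return False
```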
Component 3: prepare.py — The Initializer (14KB)
Project scanning and .ouro/ directory creation. Key capabilities:
| Feature | Description |
|---|---|
| Language detection | Multi-language scanning via file extension mapping |
| File counting | LOC measurement across project |
| CLAUDE.md detection | Checks for existing BOUND definitions |
| Test detection | Scans for test files/directories |
| CI detection | Checks for CI/CD configuration files |
| Template generation | Creates CLAUDE.md templates with BOUND section scaffolding |
| State initialization | Creates .ouro/state.json with project metadata |
Component 4: sentinel.py — 24/7 Autonomous Code Review (30KB)
A daemon module for continuous, unattended code review:
| Subcomponent | Purpose |
|---|---|
| Command detection | Auto-discovers build/test/lint commands from 10+ project types |
| Partition scanner | Groups project directories into risk-scored partitions |
| Risk scoring | Cross-references partitions against DANGER ZONES for criticality |
| Config management | JSON config with validation for review, runner, and partitioning settings |
| Template rendering | Generates Sentinel-specific CLAUDE.md from templates |
| Runner installation | Generates immortal daemon scripts (nohup + launchd adoption) |
| Dashboard | Live progress monitoring script |
| State tracking | Iteration count, findings, coverage, partition history |
Sentinel Runner Architecture:
make sentinel-start
└→ nohup sentinel-runner.sh & disown
└→ Terminal closes? SIGHUP absorbed by nohup
└→ macOS launchd (PID 1) adopts the orphan
└→ Sleep/wake? launchd children survive
└→ Result: sentinel lives until killed
The runner launches Claude Code sessions in a loop, each reading sentinel state, picking the highest-priority partition, scanning for issues, and recording findings.
Component 5: Hooks (5 shell scripts)
| Hook | Event | Action | Enforcement |
|---|---|---|---|
| `bound-guard.sh` | PreToolUse: Edit/Write | Parse CLAUDE.md DANGER ZONES, match against target file | exit 2 hard-block |
| `root-cause-tracker.sh` | PostToolUse: Edit/Write | Track per-file edit count; warn at 3+, strong warn at 5+ | Warning (no block) |
| `drift-detector.sh` | PreToolUse: Edit/Write | Count distinct directories touched; warn at 5+ | Warning (scope alert) |
| `momentum-gate.sh` | PostToolUse: Edit/Write/Read | Track read/write ratio; warn at 3:1+ (analysis paralysis) | Warning (action prompt) |
| `recall-gate.sh` | PreCompact | Re-inject BOUND section into context before compression | Context preservation |
The recall-gate.sh hook is architecturally notable — it fires on the PreCompact event (before Claude Code compresses its context window) and re-injects the BOUND constraints. This prevents constraint amnesia during long sessions, a failure mode the author specifically targets.
Component 6: Module Documentation (modules/)
Deep-dive reference material for each stage:
| Module | Content |
|---|---|
| `bound.md` | How to identify and define boundaries |
| `map.md` | Problem space mapping techniques |
| `plan.md` | Phase decomposition and complexity routing |
| `build.md` | RED-GREEN-REFACTOR-COMMIT details |
| `verify.md` | Three-layer verification specification |
| `loop.md` | Feedback loop mechanics |
| `remediation.md` | Full remediation playbook with examples |
11 Core Mechanisms (Detailed)
Mechanism 1: Bounded Autonomy (The Event Horizon)
The central theoretical contribution. BOUND is defined in three layers:
Layer A — DANGER ZONES (Spatial Constraints): Files and directories where modifications carry outsized risk. The agent may read these files but cannot edit them without explicit human approval.
### DANGER ZONES
- `src/payments/calculator.py` — financial calculations, penny-level precision
- `migrations/` — database schema, irreversible in production
- `consensus/` — PBFT consensus protocol, correctness-critical
- `auth/middleware.py` — authentication, security boundary
Layer B — NEVER DO (Behavioral Constraints): Absolute prohibitions that the agent must never violate under any circumstances.
### NEVER DO
- Never use float for monetary values — always Decimal
- Never delete or rename migration files
- Never commit without running the test suite
- Never alter the consensus voting logic
Layer C — IRON LAWS (Invariant Constraints): Properties that must always hold, verifiable through automated checks.
### IRON LAWS
- All monetary values use Decimal with 2-digit precision
- All API responses include request_id field
- Test coverage for payment module never drops below 90%
- SysErr rate in consensus is 0.00%
The three layers serve different purposes:
| Layer | Type | Enforcement | Recovery |
|---|---|---|---|
| DANGER ZONES | Spatial | Hard-block via hooks | Human approval required |
| NEVER DO | Behavioral | Self-assessment in VERIFY | Agent self-corrects |
| IRON LAWS | Invariant | Automated verification | Agent must restore invariant |
Mechanism 2: Five Verification Gates (Layer 1)
Each gate maps to a specific pathological behavior:
Gate 1 — EXIST (Anti-Hallucination):
# Checks: CLAUDE.md exists, key files exist
# Status values: PASS / WARN (no CLAUDE.md) / FAIL (expected but missing)
# Addresses: Agents referencing files, APIs, modules that don't exist
The EXIST gate also checks whether the agent is operating with stale state — if BOUND was expected (from init snapshot) but CLAUDE.md is now missing, it fails rather than warns.
Gate 2 — RELEVANCE (Anti-Drift):
# Uses: git status --short to enumerate changed files
# Cross-references: changed files against DANGER ZONES
# Status: PASS (no DZ contact) / WARN (DZ files touched) / SKIP (no git)
# Addresses: Scope drift, unintended DANGER ZONE contact
Provides a structured list of changed files and DANGER ZONE overlaps for downstream analysis.
Gate 3 — ROOT_CAUSE (Anti-Symptom-Chasing):
# Uses: git log --name-only -10 to find frequently edited files
# Threshold: 3+ edits to same file → "hot file" warning
# Status: PASS (no hot files) / WARN (hot files detected)
# Addresses: The agent editing the same file repeatedly without fixing root cause
This is the gate that fired 4 times in the blockchain session, each time correctly identifying that the agent was fixing symptoms rather than the root cause.
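The gate's hot-file count reduces to a frequency tally over recent history. A sketch, assuming the per-commit changed-file lists have already been parsed out of `git log --name-only -10` (the function name is hypothetical):

```python
from collections import Counter

def hot_files(commits: list[list[str]], threshold: int = 3) -> list[str]:
    """Flag files edited in `threshold`+ of the recent commits.

    set(files) counts each file at most once per commit, so a single
    commit touching a file twice does not inflate its heat score.
    """
    counts = Counter(f for files in commits for f in set(files))
    return sorted(f for f, n in counts.items() if n >= threshold)
```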
Gate 4 — RECALL (Anti-Context-Decay):
# Checks: BOUND section parseable, DANGER ZONES present, IRON LAWS present
# Status: PASS (full BOUND) / WARN (incomplete or missing BOUND)
# Addresses: Agent forgetting constraints during long sessions
Complemented by the recall-gate.sh hook that re-injects BOUND before context compression.
Gate 5 — MOMENTUM (Anti-Analysis-Paralysis):
# Tracks: recent commit frequency
# Status: PASS (2+ recent commits) / WARN (0-1 commits)
# Addresses: Agent stuck in read-only analysis without producing output
Mechanism 3: Three-Layer Reflective Logging
Every verification result is logged as a structured JSONL entry with three layers of increasing abstraction:
Layer 1 — WHAT (Facts):
{
"stage": "BUILD",
"phase": "2/5",
"verdict": "FAIL",
"overall": "REVIEW",
"gates": {
"EXIST": {"status": "+", "detail": "..."},
"ROOT_CAUSE": {"status": "!", "detail": "Hot files: src/payments/stripe.py"}
},
"changed_files": ["src/payments/stripe.py", "src/payments/types.py"],
"danger_zone_contact": ["src/payments/stripe.py (zone: src/payments/)"],
"bound_violations": 0
}
Layer 2 — WHY (Decisions):
{
"complexity": "complex",
"complexity_reason": "Touches DANGER ZONE: src/payments/stripe.py",
"review_reasons": ["DANGER ZONE touched: src/payments/stripe.py"],
"bound_state": {"danger_zones": 2, "never_do": 3, "iron_laws": 2},
"notes": "payment validation failed"
}
Layer 3 — PATTERN (Self-Awareness):
{
"consecutive_failures": 2,
"stuck_loop": false,
"velocity_trend": "DECELERATING",
"retry_rate": 0.40,
"hot_files": ["src/payments/stripe.py"],
"drift_signal": true
}
Each entry also generates actionable alerts for quick LLM consumption:
>> DRIFT: working in DANGER ZONE — extra caution required
>> HOT FILES: src/payments/stripe.py — possible symptom-chasing
>> SLOWING: pass rate declining — reassess approach
The reflective log is capped at 30 entries and designed for the agent to read at the start of each iteration, providing ambient self-awareness without requiring full session replay.
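The append-and-trim behavior of the rolling window can be sketched as follows (an illustration of the 30-entry cap, not the framework's exact code; the function name is hypothetical):

```python
import json
import os

MAX_ENTRIES = 30  # rolling window size, per the framework's design

def append_reflective(path: str, entry: dict) -> None:
    """Append one JSONL entry, then trim the log to the newest 30."""
    lines = []
    if os.path.exists(path):
        with open(path) as f:
            lines = [line for line in f.read().splitlines() if line.strip()]
    lines.append(json.dumps(entry))
    with open(path, "w") as f:
        f.write("\n".join(lines[-MAX_ENTRIES:]) + "\n")
```

Rewriting the whole file on each append is cheap at 30 entries and keeps the trim logic trivial, which is consistent with the ~10ms cost cited earlier.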
Mechanism 4: Pattern Detection Engine
The pattern detection system analyzes the agent's behavioral history to identify recurring problems:
| Pattern | Detection Method | Alert Threshold |
|---|---|---|
| Consecutive failures | Count tail FAIL/RETRY verdicts | 2+ triggers alert |
| Stuck loop | Same stage failing 3+ times in a row | 3 consecutive same-stage failures |
| Velocity trend | Compare pass rates between two halves of recent history | >0.3 swing for ACCELERATING/DECELERATING; requires 6+ entries |
| Hot files | ROOT_CAUSE gate's repeated-edit tracking | 3+ edits to same file |
| Drift signal | RELEVANCE gate's DANGER ZONE contact tracking | Any DANGER ZONE file in changed list |
| Retry rate | Percentage of RETRY verdicts in last 5 entries | Continuous metric, no threshold |
Velocity trend detection is particularly nuanced — it requires at least 6 entries for meaningful analysis and uses a >0.3 swing threshold (not 0.2) to reduce false positives from natural variation.
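The trend computation described above can be sketched directly from those parameters (6-entry minimum, >0.3 swing); labels follow the log examples, and the function name is an assumption:

```python
def velocity_trend(verdicts: list[str]) -> str:
    """Compare pass rates between the older and newer half of recent
    verdicts. Requires 6+ entries; a swing above 0.3 is needed to
    call a trend, damping natural run-to-run variation.
    """
    if len(verdicts) < 6:
        return "STEADY"
    mid = len(verdicts) // 2
    older, newer = verdicts[:mid], verdicts[mid:]
    rate = lambda v: sum(1 for x in v if x == "PASS") / len(v)
    delta = rate(newer) - rate(older)
    if delta > 0.3:
        return "ACCELERATING"
    if delta < -0.3:
        return "DECELERATING"
    return "STEADY"
```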
Mechanism 5: Autonomous Remediation Protocol
When verification fails, the agent follows a structured remediation protocol rather than asking for human help:
VERIFY returns FAIL
│
┌─────────▼──────────┐
│ Is failure inside │
│ a DANGER ZONE? │
└─────────┬──────────┘
│
┌────────┴────────┐
│ │
YES NO
│ │
STOP. Consult
Report to remediation.md
human. decision tree
│ │
▼ ▼
[Human ┌─────────────┐
review] │ EXIST fail? │───► Remove bad reference
│ │ Find correct one
│ RELEVANCE? │───► Stash changes
│ │ Return to plan
│ ROOT_CAUSE? │───► Revert to last good
│ │ Try different approach
│ RECALL? │───► Re-read CLAUDE.md
│ │ Summarize constraints
│ MOMENTUM? │───► Stop reading
│ │ Write something
│ TEST FAIL? │───► In scope? Fix.
│ │ Not in scope? Revert.
└─────────────┘
│
▼
Report what was done:
[REMEDIATED] gate=... action=...
was: ...
did: ...
now: ...
bound: confirm no DZ touched
After every remediation, the agent produces a structured report:
```
[REMEDIATED] gate=ROOT_CAUSE action=revert_and_retry
was: editing src/payments/calc.py for the 4th time (same TypeError)
did: reverted to commit a1b2c3d, re-analyzed from scratch
now: trying middleware pattern instead
bound: no DANGER ZONE touched, no IRON LAW affected
```
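The per-gate branch of the decision tree is essentially a dispatch table. The sketch below mirrors the playbook above; the dictionary, function name, and return strings are illustrative, not Ouro Loop's API.

```python
# Per-gate remediation playbook (illustrative reconstruction of the
# decision tree above; not the actual remediation.md contents).
PLAYBOOK = {
    "EXIST":      "remove bad reference, find the correct one",
    "RELEVANCE":  "stash changes, return to plan",
    "ROOT_CAUSE": "revert to last good commit, try a different approach",
    "RECALL":     "re-read CLAUDE.md, summarize constraints",
    "MOMENTUM":   "stop reading, write something",
    "TEST":       "fix if in scope, otherwise revert",
}

def remediate(gate, in_danger_zone):
    # DANGER ZONE failures always escalate to the human; everything
    # else follows the autonomous playbook.
    if in_danger_zone:
        return "STOP: report to human for review"
    return PLAYBOOK.get(gate, "consult remediation.md decision tree")
```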
Mechanism 6: Complexity Routing
The PLAN stage routes tasks through a complexity matrix that determines the appropriate level of formality:
| Signal | Trivial | Simple | Complex | Architectural |
|---|---|---|---|---|
| Max lines | 20 | 100 | 500 | Unlimited |
| Max files | 1 | 3 | 10 | Unlimited |
| Phases | 0 | 2 | 5 | Unlimited |
| DANGER ZONE | Not touched | Adjacent | Inside | Modifies IRON LAW |
| Risk level | None | Low | Medium | High |
| Dependencies | None | Known | Unknown | External |
Trivial and simple tasks execute directly without phase plans. Complex and architectural tasks require decomposition into independently-verifiable phases ordered by severity (CRITICAL → HIGH → MEDIUM → LOW), each changing 100-300 lines maximum.
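The routing matrix above can be read as a cascade of threshold checks, most restrictive class first. The thresholds below come from the table; the function signature and the string encodings of the DANGER ZONE, risk, and dependency signals are assumptions for illustration.

```python
def classify(lines_changed, files_touched, danger_zone, risk, deps):
    """Route a task by the complexity matrix.

    danger_zone: 'none' | 'adjacent' | 'inside' | 'iron_law'
    risk:        'none' | 'low' | 'medium' | 'high'
    deps:        'none' | 'known' | 'unknown' | 'external'
    """
    # Architectural: modifies an IRON LAW, high risk, or external dependencies.
    if danger_zone == "iron_law" or risk == "high" or deps == "external":
        return "architectural"
    # Complex: exceeds the 'simple' budget (100 lines / 3 files) or touches
    # a DANGER ZONE directly.
    if (lines_changed > 100 or files_touched > 3
            or danger_zone == "inside" or risk == "medium" or deps == "unknown"):
        return "complex"
    # Simple: exceeds the 'trivial' budget (20 lines / 1 file) or is merely
    # adjacent to a DANGER ZONE.
    if (lines_changed > 20 or files_touched > 1
            or danger_zone == "adjacent" or risk == "low" or deps == "known"):
        return "simple"
    return "trivial"
```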
Mechanism 7: Layer 3 Review Triggers
Layer 3 verification determines when human review is mandatory (not advisory):
| Trigger | Condition | Rationale |
|---|---|---|
| DANGER ZONE contact | Any changed file matches a DANGER ZONE | Core safety guarantee |
| Consecutive retries | 3+ consecutive RETRY verdicts | Agent likely stuck |
| Gate failure | Any Layer 1 gate returns FAIL | Serious verification issue |
| Architectural complexity | Complexity detection returns "architectural" | Cross-cutting changes need human judgment |
When Layer 3 triggers, the overall verdict becomes REVIEW (not FAIL), and the agent is expected to stop and present findings to the human.
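The escalation rules above amount to a disjunction of triggers that upgrades the verdict to REVIEW. A minimal sketch, assuming glob-style DANGER ZONE prefixes and illustrative parameter names:

```python
import fnmatch

def layer3_verdict(changed_files, danger_zones, consecutive_retries,
                   gate_failures, complexity):
    """Return 'REVIEW' when any mandatory human-review trigger fires."""
    # Trigger 1: any changed file falls under a DANGER ZONE prefix.
    dz_contact = any(
        fnmatch.fnmatch(f, zone + "*")
        for f in changed_files for zone in danger_zones
    )
    if (dz_contact                      # core safety guarantee
            or consecutive_retries >= 3  # agent likely stuck
            or gate_failures             # any Layer 1 FAIL
            or complexity == "architectural"):
        return "REVIEW"  # mandatory human review, deliberately not FAIL
    return "PASS"
```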
12 Programming Language
Implementation Language: Python 3.10+
Zero external dependencies. The entire framework uses only Python standard library modules:
| Stdlib Module | Usage |
|---|---|
| `os`, `sys`, `shutil` | File operations, state management, atomic writes |
| `json` | State serialization, JSONL log format, config parsing |
| `re` | CLAUDE.md parsing (DANGER ZONES, NEVER DO, IRON LAWS extraction) |
| `subprocess` | Git operations (status, log, commit history) |
| `argparse` | CLI interface for all commands |
| `datetime`, `timezone` | Timestamps in UTC for state and logs |
| `collections.Counter` | Hot file frequency analysis in ROOT_CAUSE gate |
| `stat` | File permission management for executable scripts |
Hooks are written in Bash — shell scripts that integrate with Claude Code's hook system. They parse CLAUDE.md using grep and awk, match file paths, and return exit codes.
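As a rough illustration of the `re`-based CLAUDE.md parsing, the sketch below extracts a bulleted section under a heading. The exact layout Ouro Loop expects is not documented here, so the `## DANGER ZONES` heading format and the function name are assumptions.

```python
import re

def extract_danger_zones(text):
    """Pull bullet entries from an assumed '## DANGER ZONES' section."""
    # Match the heading, then capture every consecutive '- ' bullet line.
    match = re.search(r"^## DANGER ZONES\n((?:^- .*\n?)+)", text, re.MULTILINE)
    if not match:
        return []
    return [line[2:].strip() for line in match.group(1).splitlines()]

doc = """# CLAUDE.md
## DANGER ZONES
- payments/
- billing/
## NEVER DO
- No float for money
"""
```

Calling `extract_danger_zones(doc)` on the sample yields `["payments/", "billing/"]`; the Bash hooks perform the equivalent extraction with grep and awk.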
program.md and modules/*.md — Markdown as Code
A distinctive architectural choice: the core methodology is expressed as Markdown instruction documents, not as code. The agent reads program.md like a skill specification and follows its instructions. This makes the methodology:
- Human-readable — non-technical stakeholders can review and modify the constraints
- Agent-agnostic — any agent that can read Markdown can use Ouro Loop
- Versionable — methodology changes are tracked in git
- Iterable — humans refine the methodology over time
Code Quality Metrics
| Metric | Value |
|---|---|
| Total Python source size | ~91KB across 3 files |
| Test count | 507 |
| Test framework | Standard pytest (inferred from CI) |
| Python version requirement | 3.10+ (for match/case, union types) |
| Dependencies | 0 |
| Linting | Not specified (but CI passes) |
13 Memory Management
State Persistence: .ouro/state.json
The primary state file tracks the agent's position in the loop:
```json
{
  "project_name": "my-payment-service",
  "current_stage": "BUILD",
  "current_phase": 2,
  "total_phases": 5,
  "bound_defined": true,
  "history": [
    {
      "stage": "BUILD",
      "phase": 1,
      "verdict": "PASS",
      "timestamp": "2026-03-15T10:00:00+00:00"
    }
  ],
  "updated_at": "2026-03-15T10:30:00+00:00"
}
```
Atomic writes via os.replace() with fallback to shutil.move() for cross-device scenarios (Docker volumes, NFS mounts).
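The atomic-write pattern described above can be sketched as follows: write to a temporary file in the same directory, then `os.replace()`, falling back to `shutil.move()` if the rename fails across filesystems. The helper name is illustrative; the two stdlib calls are the ones the framework is stated to use.

```python
import json
import os
import shutil
import tempfile

def atomic_write_json(path, data):
    """Write JSON so readers never observe a partially-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f, indent=2)
        try:
            os.replace(tmp, path)   # atomic on the same filesystem
        except OSError:
            shutil.move(tmp, path)  # cross-device fallback (Docker volumes, NFS)
    finally:
        if os.path.exists(tmp):     # clean up only if the rename never happened
            os.remove(tmp)
```

Writing the temp file into the target directory (rather than the system temp dir) keeps the common case on one filesystem, so the fallback is rarely needed.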
Reflective Log: Three-Layer Self-Awareness Memory
The reflective log (.ouro/reflective-log.jsonl) is the system's primary memory mechanism. It serves as a compressed behavioral history that the agent reads at the start of each iteration.
Key design decisions:
- JSONL format — each line is a self-contained JSON object, enabling simple append
- 30-entry rolling window — prevents unbounded growth while maintaining enough history for pattern detection
- Three-layer structure — WHAT (facts), WHY (decisions), PATTERN (self-awareness) — designed for fast LLM parsing
- Actionable alerts — pre-computed summary alerts that the agent can act on without parsing raw data
This is the closest analog to a "working memory" for the AI agent — it provides context about past iterations without requiring full session replay.
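The JSONL append with a 30-entry rolling window can be sketched as below. The file name matches the text; the function names and the rewrite-on-append strategy are assumptions about how the window might be enforced.

```python
import json
import os

WINDOW = 30  # rolling window size from the design above

def append_entry(log_path, entry):
    """Append one JSONL entry, then truncate to the most recent WINDOW lines."""
    lines = []
    if os.path.exists(log_path):
        with open(log_path) as f:
            lines = f.read().splitlines()
    lines.append(json.dumps(entry))
    with open(log_path, "w") as f:
        f.write("\n".join(lines[-WINDOW:]) + "\n")

def read_entries(log_path):
    """Load the full window; each line is a self-contained JSON object."""
    with open(log_path) as f:
        return [json.loads(line) for line in f if line.strip()]
```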
Results Audit Trail: ouro-results.tsv
A lightweight, human-readable audit log:
```
phase  verdict  bound_violations  notes
1/3    PASS     0                 transactions endpoint + tests
2/3    RETRY    0                 ROOT_CAUSE warning, fixing
2/3    PASS     0                 fixed after retry
3/3    PASS     0                 validation complete
```
Sentinel Memory: learnings.md
The Sentinel module maintains a cross-session knowledge accumulator (learnings.md) that is updated every 10 iterations with patterns and insights discovered during code review. This provides long-term institutional knowledge.
Context Decay Prevention
Ouro Loop implements three strategies against context decay:
- RECALL gate — verifies the agent can still state the task and top 3 constraints
- `recall-gate.sh` hook — re-injects BOUND into context before compression
- Explicit instructions in `program.md` — every 5 phases or ~30 minutes, the agent should run the RECALL gate and re-read CLAUDE.md if needed
Comparison with autoresearch Memory
| Aspect | autoresearch | Ouro Loop |
|---|---|---|
| State tracking | `best_val_bpb` metric file | `state.json` (stage, phase, history) |
| History | Git commit log | Reflective log (JSONL) + results TSV |
| Self-awareness | Implicit (metric comparison) | Explicit (pattern detection, drift signals) |
| Long-term memory | None (stateless between runs) | Sentinel `learnings.md` |
| Context management | Not addressed | RECALL gate + recall hook + `program.md` instructions |
14 Continued Learning
Learning Through Reflective Logging
Ouro Loop's primary learning mechanism is the three-layer reflective log. The agent doesn't "learn" in the ML sense — it doesn't update model weights. Instead, it maintains a behavioral pattern memory that informs future iterations:
- Hot file detection → Agent learns to avoid symptom-chasing on frequently-edited files
- Velocity tracking → Agent detects when its approach is stalling and changes strategy
- Stuck loop detection → Agent recognizes when the same stage fails 3+ times and tries fundamentally different approaches
- Drift monitoring → Agent stays aware of unintended DANGER ZONE contact
Learning Through BOUND Evolution
The LOOP stage explicitly instructs the agent to feed discoveries back into BOUND:
"Did this phase reveal anything that should change the plan?" - New constraint discovered → Add to BOUND in CLAUDE.md - Remaining phases need adjustment → Update the plan - Similar pattern found → Note it for future phases
This creates a self-improving constraint system where the agent actively expands the safety boundary based on runtime experience.
Sentinel Cross-Session Learning
The Sentinel module implements longer-term learning:
- `learnings.md` — updated every 10 iterations with discovered patterns
- `suppressed.json` — deduplication store for confirmed false positives
- Partition history — tracks which areas have been reviewed and when
- Findings journal — JSONL log of all identified issues with severity ratings
Limitations of Learning
- No weight updates — the agent's underlying LLM is not fine-tuned. Learning is purely in-context.
- Session-bounded (main loop) — reflective log is per-session. New sessions start fresh (must read state.json to resume).
- Human-dependent BOUND quality — the quality of learning depends on well-defined constraints. Poorly specified BOUND leads to poor learning signals.
- No generalization across projects — patterns learned in one project don't transfer to another (no shared memory).
15 Applications
Application 1: Overnight Autonomous Development
The primary use case. Workflow:
1. Developer defines BOUND in CLAUDE.md (15-30 minutes)
2. Developer describes the task and points the agent at `program.md`
3. Developer goes to sleep
4. Agent runs MAP → PLAN → BUILD → VERIFY → LOOP for each phase
5. Agent remediates failures autonomously (inside BOUND)
6. Developer wakes up to `ouro-results.tsv` showing completed phases
Real example: Blockchain L1 consensus investigation — agent tested 5 hypotheses, remediated 4 failures, found architectural root cause, all without human intervention.
Application 2: Production-Safe AI Coding
For domains where errors are catastrophic:
| Domain | DANGER ZONES | IRON LAWS | NEVER DO |
|---|---|---|---|
| Financial systems | `payments/`, `billing/` | Decimal precision, audit trail | No float for money |
| Blockchain | `consensus/`, `p2p/` | Zero SysErr rate | No vote logic changes |
| Medical software | `dosage/`, `patient/` | Unit consistency | No unvalidated inputs |
| Authentication | `auth/`, `session/` | Token expiry enforcement | No plaintext passwords |
Application 3: 24/7 Continuous Code Review (Sentinel)
Sentinel enables continuous, unattended code review:
- Partition scanning — project directories scored by risk (DANGER ZONE overlap, git activity, file count)
- Priority-based review — high-criticality partitions reviewed first
- Finding tracking — severity-rated findings logged in JSONL format
- Auto-fix capability — configurable fix attempts with blast radius limits
- Dashboard monitoring — live progress visualization
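Partition scanning can be sketched as a weighted score over the three signals named above. The weights and caps below are assumptions for illustration, not Sentinel's actual formula.

```python
def partition_risk(danger_zone_overlap, recent_commits, file_count):
    """Combine the three risk signals into a single priority score."""
    score = 0.0
    score += 5.0 if danger_zone_overlap else 0.0  # safety signal dominates
    score += min(recent_commits, 20) * 0.2        # git churn, capped
    score += min(file_count, 100) * 0.01          # partition size, capped
    return score

# Hypothetical partitions; highest-risk is reviewed first.
partitions = {
    "src/payments": partition_risk(True, 12, 40),
    "src/ui":       partition_risk(False, 3, 80),
}
ordered = sorted(partitions, key=partitions.get, reverse=True)
```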
Application 4: Autoresearch Extension
Ouro Loop explicitly extends Karpathy's autoresearch paradigm:
| Aspect | autoresearch | Ouro Loop |
|---|---|---|
| Domain | ML experiments | General software engineering |
| Constraint | 5-minute training budget | BOUND (DANGER ZONES, NEVER DO, IRON LAWS) |
| Metric | `val_bpb` (single scalar) | Multi-layer verification (gates + self-assessment) |
| On failure | Auto-revert, next experiment | Auto-remediate, try alternative approach |
| Human programs | `program.md` (experiment strategy) | `program.md` (dev strategy) + `CLAUDE.md` (boundaries) |
| AI modifies | `train.py` (model code) | Target project code + `framework.py` |
| Read-only | `prepare.py` | `prepare.py` + `modules/` |
| Context awareness | None | Three-layer reflective log + pattern detection |
Application 5: Multi-Phase Feature Development
For complex features requiring structured decomposition:
- Complexity routing — automatically classifies task as trivial/simple/complex/architectural
- Severity ordering — CRITICAL phases first, then HIGH, MEDIUM, LOW
- Phase isolation — each phase independently verifiable (100-300 lines max)
- Phase advancement — agent advances without human permission (NEVER STOP instruction)
- Plan adaptation — remaining phases updated based on discoveries
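The severity-first ordering above is a simple rank sort. A minimal sketch, with illustrative phase records:

```python
# CRITICAL phases run first, then HIGH, MEDIUM, LOW (per the list above).
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}

def order_phases(phases):
    """Stable sort by severity rank, preserving plan order within a rank."""
    return sorted(phases, key=lambda p: SEVERITY_RANK[p["severity"]])

# Hypothetical phase plan for demonstration.
plan = [
    {"name": "refactor logging", "severity": "LOW"},
    {"name": "fix race in ledger writes", "severity": "CRITICAL"},
    {"name": "add input validation", "severity": "HIGH"},
]
```

Because Python's sort is stable, phases of equal severity keep their original plan order.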
Open Questions and Future Directions
- Multi-agent coordination: Currently single-agent. How would BOUND work with multiple agents operating on different parts of the codebase?
- BOUND learning: Could BOUND constraints be semi-automatically derived from project history (e.g., files that caused production incidents become DANGER ZONES)?
- Cross-project transfer: Could remediation patterns learned on one project transfer to another?
- Quantitative evaluation: The system lacks standardized benchmarks — results are presented through session logs rather than reproducible metrics.
- Hook ecosystem: Currently 5 hooks for Claude Code. Extending to other agents (Cursor, Aider) requires agent-specific enforcement mechanisms.
Comparison with Related Systems
| System | Approach | Enforcement | Self-Repair | Memory |
|---|---|---|---|---|
| Ouro Loop | Bounded autonomy + methodology | Runtime hooks (exit 2 hard-block) | Autonomous remediation with playbook | Reflective log + pattern detection |
| autoresearch | Metric-driven experiment loop | Budget constraint (5 min) | Auto-revert on metric regression | None |
| .cursorrules | Static instruction file | None (hope-based) | None | None |
| CLAUDE.md | Static instruction file | None (agent compliance) | None | None |
| Devin | Full autonomous agent | Proprietary guardrails | Built-in (proprietary) | Session memory (proprietary) |
| SWE-Agent | Issue resolution agent | Test suite pass/fail | Retry with feedback | Episode memory |
Ouro Loop occupies a unique position: it is not an AI agent itself, but a methodology and runtime framework that makes existing agents safer and more autonomous. This separation of concerns — the agent provides intelligence, Ouro Loop provides structure — is its key architectural insight.