Hyperagents
Self-referential agents that unify task-solving and self-improvement into a single editable program, enabling metacognitive self-modification—improving not just task behavior but the mechanism that generates future improvements.

Organization: Meta FAIR / Meta Superintelligence Labs, University of British Columbia, Vector Institute, University of Edinburgh, NYU
Published: March 19, 2026
Type: paper
Report Type: PhD-Level Technical Analysis
Report Date: April 2026
Table of Contents
- Full Title and Attribution
- Authors and Team
- Core Contribution
- Supported Solutions
- LLM Integration
- Key Results
- Reproducibility
- Compute and API Costs
- Architecture Solution
- Component Breakdown
- Core Mechanisms (Detailed)
- Programming Language
- Memory Management
- Continued Learning
- Applications
1 Full Title and Attribution
Full Title: Hyperagents
arXiv ID: 2603.19461
DOI: 10.48550/arXiv.2603.19461
License: CC BY 4.0
Venue: arXiv preprint (cs.AI)
Submission Date: March 19, 2026
Meta Publication Date: March 24, 2026
Code Repository: github.com/facebookresearch/HyperAgents
"DGM-Hyperagents offer a glimpse of open-ended AI systems that do not merely search for better solutions, but continually improve their search for how to improve."
This paper addresses the fundamental limitation of existing self-improving AI systems: they rely on fixed, handcrafted meta-level mechanisms that constrain the rate and nature of improvement. By making the improvement procedure itself editable and evolvable, Hyperagents break through this ceiling and demonstrate genuine metacognitive self-modification across multiple domains.
The work extends Sakana AI's Darwin Gödel Machine (DGM) from a coding-only self-improvement system to a domain-general framework, establishing that meta-level improvement strategies can transfer across unrelated domains and accumulate across independent runs.
2 Authors and Team
| Author | Affiliation | Role / Expertise |
|---|---|---|
| Jenny Zhang | University of British Columbia / Vector Institute | Lead author; open-ended learning, Darwin Gödel Machine (original DGM) |
| Bingchen Zhao | University of Edinburgh | Self-supervised learning, visual representation |
| Wannan Yang | NYU | Agent systems, reinforcement learning |
| Jakob Foerster | University of Oxford / Meta FAIR | Multi-agent RL, game theory, AIRS-Bench |
| Jeff Clune | University of British Columbia / Vector Institute | Open-ended evolution, quality-diversity, AI-generating algorithms |
| Minqi Jiang | FAIR at Meta | Open-ended learning, environment design |
| Sam Devlin | Meta Superintelligence Labs | Agent development, game AI |
| Tatiana Shavrina | Meta Superintelligence Labs | NLP, benchmark design, AIRS-Bench |
Institutional Span: Six institutions across three countries—a cross-cutting collaboration between academic open-ended evolution research (Clune, Zhang) and industry agent systems (Meta FAIR, Meta Superintelligence Labs).
Key Intellectual Lineage:
- Jenny Zhang is the lead author of the original Darwin Gödel Machine (DGM) paper (Zhang et al., 2025b), making her the natural lead for its generalization.
- Jeff Clune is one of the most influential researchers in open-ended evolution and AI-generating algorithms. His research vision—that AI systems should be able to discover increasingly complex and diverse solutions without human intervention—is the philosophical foundation of Hyperagents.
- Jakob Foerster brings multi-agent systems expertise and co-authored AIRS-Bench, a benchmark for AI research science agents, providing evaluation infrastructure context.
- Tatiana Shavrina co-authored AIRS-Bench and contributes NLP/benchmark expertise from Meta Superintelligence Labs.
This team represents a convergence of the open-ended evolution school (Clune/Zhang) with large-scale industrial AI systems (Meta), bridging the gap between theoretical frameworks and practical agent deployments.
3 Core Contribution
The Problem: Fixed Meta-Level Mechanisms
All prior self-improving AI systems share a structural limitation: the mechanism that drives improvement is itself fixed and handcrafted.
TRADITIONAL SELF-IMPROVING SYSTEMS
──────────────────────────────────────────────────────────────
┌──────────────────────────────────────┐
│ FIXED META-LEVEL │ ← Cannot improve
│ (handcrafted improvement procedure) │ itself
│ │
│ "Generate random variation" │
│ "Evaluate on benchmark" │
│ "Keep if better" │
└──────────────┬───────────────────────┘
│ applies to
▼
┌──────────────────────────────────────┐
│ MUTABLE TASK-LEVEL │ ← Can improve
│ (the agent solving the actual task) │
└──────────────────────────────────────┘
Problem: The improvement rate is bounded by the quality
of the fixed meta-level mechanism.
──────────────────────────────────────────────────────────────
This creates the infinite regress problem: if a meta-agent improves a task agent, who improves the meta-agent? Adding more meta-levels just shifts the question upward without solving it.
The Darwin Gödel Machine (DGM): A Partial Solution
The original DGM (Zhang et al., 2025b, by Sakana AI) solved this for a single domain—coding:
Because both evaluation and self-modification are coding tasks, gains in coding ability can translate into gains in self-improvement ability.
In DGM, the coding agent is both the task agent (to be evaluated) and the meta agent (to generate modifications). This creates a virtuous cycle only when the task domain aligns with the self-modification skill. For coding, this alignment is natural. For paper review, robotics, or mathematics, it is not.
The Hyperagent Solution: Collapse the Hierarchy
Hyperagents solve the infinite regress by collapsing the task agent and meta agent into a single editable program:
HYPERAGENT ARCHITECTURE
──────────────────────────────────────────────────────────────
┌──────────────────────────────────────┐
│ SINGLE EDITABLE PROGRAM │
│ │
│ ┌────────────────────────────────┐ │
│ │ TASK AGENT │ │ ← Editable
│ │ (solves the target task) │ │
│ └────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────┐ │
│ │ META AGENT │ │ ← ALSO Editable
│ │ (modifies itself + task agent)│ │
│ │ (improvement procedure) │ │
│ └────────────────────────────────┘ │
│ │
└──────────────────────────────────────┘
Key: The meta agent can modify ITSELF.
This enables metacognitive self-modification.
No infinite regress—one level, fully self-referential.
──────────────────────────────────────────────────────────────
The crucial innovation: The meta-level modification procedure is itself editable. The system can improve not only what it does (task-solving) but how it decides what to try next (self-modification strategy). This enables:
- Metacognitive self-modification — improving the improvement process
- Domain-agnostic improvement — no requirement that task and meta skills align
- Cross-domain transfer — meta-level strategies learned in one domain apply to others
- Accumulation across runs — meta improvements persist and compound
Formal Definition
A hyperagent is defined as a self-referential agent H = (T, M) where:

- T is the task agent component (domain-specific, solves the target task)
- M is the meta agent component (modifies both T and itself)
- Both T and M are parts of a single editable program
- M has read/write access to its own source code

The self-referential property means M can execute:

- M(T) → T' — modify the task agent
- M(M) → M' — modify itself (metacognitive modification)
- M(T, M) → (T', M') — modify both simultaneously
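The decomposition above can be sketched as a tiny Python data structure. This is illustrative only: the names (`Hyperagent`, `modify`, the string-valued sources) are assumptions for exposition, not the repository's actual API, and a real hyperagent is a full program rather than two strings.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of H = (T, M). Source code is modeled as plain
# strings; `diff` stands in for an LLM-generated code transformation.
@dataclass
class Hyperagent:
    task_source: str   # T: domain-specific task-solving code
    meta_source: str   # M: self-modification code (itself editable)

    def modify(self, diff: Callable[[str], str], target: str) -> "Hyperagent":
        # M has read/write access to both components, including itself.
        if target == "task":   # M(T) -> T'
            return Hyperagent(diff(self.task_source), self.meta_source)
        if target == "meta":   # M(M) -> M' (metacognitive modification)
            return Hyperagent(self.task_source, diff(self.meta_source))
        # M(T, M) -> (T', M'): modify both simultaneously
        return Hyperagent(diff(self.task_source), diff(self.meta_source))

h = Hyperagent("solve(task)", "propose_diff(history)")
h2 = h.modify(lambda src: src + "  # improved", "meta")
```

The key property the sketch captures is that the `"meta"` branch edits the very code that decides what to edit next.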
4 Supported Solutions
Task Domains
DGM-Hyperagents are evaluated across four diverse domains:
| Domain | Task Description | Evaluation Metric | Difficulty |
|---|---|---|---|
| Polyglot Coding | Solve coding challenges in multiple languages | Pass rate | Well-aligned with DGM's original domain |
| AI Paper Review | Predict conference accept/reject decisions | Accuracy (balanced) | Requires structured analytical reasoning |
| Robotics Reward Design | Write reward functions for simulated quadruped locomotion | Normalized performance | Requires understanding physics and RL |
| Olympiad Math Grading | Grade mathematical proofs at competition level | Grading accuracy | Requires deep mathematical reasoning |
Solution Representation
Each hyperagent is a complete Python program that encodes:
- Task-solving logic — domain-specific code for performing the task
- Self-modification logic — code that generates variants of itself
- Memory/tracking infrastructure — any persistent state the agent has built for itself
The program is self-contained and can be executed independently. The meta agent component interacts with an LLM to generate code modifications (diffs) that are applied to produce new agent variants.
Agent Evolution Paths
The system generates and evaluates a growing archive of agent variants:
Generation 0: [Initial Agent]
│
├──diff──> [Agent v1.1] ──eval──> score: 0.15
├──diff──> [Agent v1.2] ──eval──> score: 0.22
└──diff──> [Agent v1.3] ──eval──> score: 0.08
│
├──diff──> [Agent v2.1] ──eval──> score: 0.35
└──diff──> [Agent v2.2] ──eval──> score: 0.41
│
├──diff──> [Agent v3.1] ──eval──> score: 0.52
...
After 50 iterations: best agent score → 0.710 (paper review)
Each generation builds on the best previous agents (stepping stones), and the archive preserves diversity to prevent premature convergence.
What Can Be Modified
The key differentiator from all prior systems is the scope of what can change:
| Component | DGM (Original) | DGM-H (Hyperagents) |
|---|---|---|
| Task-solving code | ✅ Editable | ✅ Editable |
| Prompt templates | ✅ Editable | ✅ Editable |
| Self-modification instructions | ❌ Fixed | ✅ Editable |
| Selection strategy | ❌ Fixed | ✅ Editable |
| Memory/tracking infrastructure | ❌ None | ✅ Self-created |
| Performance analysis code | ❌ None | ✅ Self-created |
| Evaluation criteria | ❌ Fixed | ✅ Editable |
5 LLM Integration
Multi-Model Architecture
DGM-H uses a deliberate separation between modification and evaluation models:
| Role | Model | Purpose |
|---|---|---|
| Self-modification engine | Claude 4.5 Sonnet (Anthropic) | Generates code diffs for agent variants |
| Coding evaluation | o3-mini (OpenAI) | Evaluates coding task performance |
| Paper review evaluation | GPT-4o (OpenAI) | Evaluates paper review accuracy |
| Robotics evaluation | Domain-specific simulation | No LLM needed (physics simulation) |
| Math grading evaluation | GPT-4o (OpenAI) | Evaluates math solution grading |
Separating the modification engine from the evaluation engine prevents any single model from "grading its own homework."
This architectural choice is critical for avoiding self-deceptive optimization, where the same model that generates improvements also evaluates them, potentially creating reward hacking loops.
How the Meta Agent Uses LLMs
The meta agent's core operation is generating code diffs. The process:
┌──────────────────────────────────────────────────────┐
│ META AGENT WORKFLOW │
│ │
│ 1. Read current agent source code │
│ 2. Read performance history / analysis │
│ 3. Read self-modification instructions │
│ (which are themselves part of the editable code) │
│ 4. Construct prompt for LLM: │
│ - Current code │
│ - Performance data │
│ - Self-modification strategy │
│ - Request for specific diff │
│ 5. Send to Claude 4.5 Sonnet │
│ 6. Receive code diff │
│ 7. Apply diff to create new agent variant │
│ 8. Evaluate new variant │
│ 9. Update archive if variant improves │
└──────────────────────────────────────────────────────┘
LLM Independence Principle
The LLM is used as a tool by the meta agent, not as a fixed component. The meta agent's instructions for how to use the LLM are themselves part of the editable program. This means the system can evolve:
- How it prompts the LLM
- What context it provides to the LLM
- What it asks the LLM to modify
- How it parses and applies the LLM's output
In practice, the system has been observed to autonomously improve its prompting strategies over successive generations.
No LLM Fine-Tuning
DGM-H does not fine-tune any LLM. All models are used through inference APIs. The "learning" happens entirely in the program space—the agent's source code evolves, not the LLM's weights.
This is a deliberate design choice that:

1. Avoids the cost and complexity of model fine-tuning
2. Preserves the LLM's general capabilities
3. Enables using the latest and best models without retraining
4. Makes the improvement process interpretable (source code diffs)
6 Key Results
Primary Metric: imp@50
The paper introduces imp@50 (improvement at 50 iterations) as the primary evaluation metric:
imp@50 = (best score at iteration 50) - (initial score at iteration 0)
This captures cumulative improvement rather than absolute final score, making it sensitive to systems that consistently improve versus those that make one lucky jump.
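Under that definition, the metric is a one-liner. A minimal sketch, assuming the input is a per-iteration score trajectory (index 0 is the initial agent) and that "best score at iteration 50" means the best score seen up to and including iteration 50:

```python
# imp@k: cumulative improvement over a run, per the formula above.
def imp_at_k(scores, k=50):
    """Best score within iterations 0..k minus the initial score."""
    window = scores[: k + 1]
    return max(window) - scores[0]

# A steadily climbing run vs. a static baseline:
steady = [0.0, 0.1, 0.2, 0.4, 0.71]
static = [0.63] * 51
```

A static baseline scores imp@50 = 0 by construction, no matter how strong its absolute score, which is exactly the sensitivity the paper wants.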
Domain-by-Domain Results
AI Paper Review
| System | Initial Score | Score at 50 Iterations | imp@50 |
|---|---|---|---|
| Static baseline | 0.630 | 0.630 | 0.000 |
| DGM (original) | 0.0 | ~0.4* | ~0.4* |
| DGM-H | 0.0 | 0.710 | 0.710 |
DGM-H improved from zero to 0.710, surpassing the established static baseline of 0.630. The system:

- Detected a 94% accept-rate imbalance in its own predictions and self-corrected
- Evolved from superficial evaluation to multi-stage review pipelines with explicit checklists
- Built its own performance tracking infrastructure
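The 94% accept-rate self-correction is the kind of check the system built for itself. The paper does not publish the exact code; this is a hypothetical sketch of such an imbalance detector, with `detect_imbalance` and its threshold chosen for illustration.

```python
from collections import Counter

# Hypothetical self-diagnosis: inspect the agent's own prediction
# distribution and flag a runaway majority class.
def detect_imbalance(predictions, threshold=0.9):
    """Return (is_imbalanced, majority_rate) over accept/reject predictions."""
    counts = Counter(predictions)
    majority_rate = max(counts.values()) / len(predictions)
    return majority_rate >= threshold, majority_rate

preds = ["accept"] * 94 + ["reject"] * 6
flagged, rate = detect_imbalance(preds)
```

A flagged imbalance can then trigger a meta-level edit, e.g. recalibrating the accept/reject threshold in the review pipeline.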
Polyglot Coding
| System | Initial Score | Score at 50 Iterations | imp@50 |
|---|---|---|---|
| DGM (original) | 0.084 | ~0.267 | ~0.183 |
| DGM-H | 0.084 | 0.267 | 0.183 |
DGM-H matches the original DGM on coding despite not being domain-specialized. This is notable because DGM was specifically designed for coding (where task and meta skills align), while DGM-H is domain-general.
Robotics Reward Design
| System | Initial Score | Score at 50 Iterations | imp@50 |
|---|---|---|---|
| No self-improvement | 0.060 | 0.060 | 0.000 |
| DGM-H | 0.060 | 0.372 | 0.312 |
The system learned to write increasingly effective reward functions for simulated quadruped locomotion, a task requiring understanding of physics and reinforcement learning dynamics.
Cross-Domain Transfer: The Headline Result
The most striking finding is cross-domain transfer of meta-level improvements:
| Configuration | Olympiad Math Score |
|---|---|
| Initial agent (no improvement) | 0.0 |
| DGM transfer agents | 0.0 |
| Standard DGM-H (no transfer) | 0.0 |
| DGM-H with transfer from paper review + robotics | 0.630 |
Transfer agents trained on paper review and robotics achieved 0.630 in Olympiad math evaluation, while all other configurations scored 0.0.
This demonstrates that the meta-level strategies discovered in unrelated domains—structured evaluation checklists, multi-stage analysis pipelines, persistent memory—are domain-general improvement strategies that transfer to new problems.
Ablation Results
The paper shows that all components are necessary:
| Configuration | Performance |
|---|---|
| No self-improvement | Baseline |
| Self-improvement without open-ended exploration | Limited gains |
| Open-ended exploration without metacognition (original DGM) | Domain-limited gains |
| Full DGM-H (metacognition + open-ended exploration) | Best across all domains |
Emergent Capabilities
Beyond raw performance numbers, DGM-H demonstrated emergent behaviors:
- Self-built performance tracking: The system independently created cross-generation performance trackers without being instructed to do so.
- Persistent memory synthesis: Agents evolved to maintain memory that "synthesizes insights rather than just logging numbers."
- Compute-aware planning: The system adjusted its modification strategies based on remaining compute budget—making broad exploratory changes early and targeted refinements later.
- Meta-reasoning in logs: In one documented instance, the system reasoned that generation 55 was "best accuracy but is too harsh" and decided to merge that version's rigorous reasoning with the stronger balance of a later generation.
7 Reproducibility
Code Availability
| Component | Status | Location |
|---|---|---|
| Core framework | Open source | github.com/facebookresearch/HyperAgents |
| Experiment logs | Provided as multi-part ZIP | outputs_os_parts.zip in repository |
| Docker container | Build instructions provided | Dockerfile in repository |
| Initial agents | Setup script provided | setup_initial.sh |
Repository Structure
HyperAgents/
├── agent/ # Foundation model interaction code
├── analysis/ # Plotting and analysis scripts
├── domains/ # Domain-specific code
│ ├── coding/ # Polyglot coding tasks
│ ├── paper_review/ # AI paper review
│ ├── robotics/ # Reward design
│ └── math/ # Olympiad math grading
├── utils/ # Common utilities
├── meta_agent.py # Main meta agent implementation
├── task_agent.py # Main task agent implementation
├── run_meta_agent.py # Script to run meta agent and get diffs
├── generate_loop.py # Entry point for running the algorithm
├── requirements.txt # Python dependencies
├── requirements_dev.txt # Development dependencies
├── Dockerfile # Container build
└── setup_initial.sh # Initial agent setup
Environment Setup
# API keys required
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
GEMINI_API_KEY=...
# System dependencies
sudo dnf install -y python3.12-devel graphviz graphviz-devel \
cmake ninja-build bzip2-devel zlib-devel ncurses-devel libffi-devel
# Python environment
python3.12 -m venv venv_nat
source venv_nat/bin/activate
pip install -r requirements.txt
pip install -r requirements_dev.txt
# Docker (sandboxed execution)
docker build --network=host -t hyperagents .
# Initialize agents
bash ./setup_initial.sh
# Run
python generate_loop.py --domains <domain>
Reproducibility Assessment
| Criterion | Rating | Notes |
|---|---|---|
| Code available | ✅ Strong | Full source on GitHub |
| Data available | ✅ Strong | Experiment logs provided |
| Dependencies specified | ✅ Strong | requirements.txt + Dockerfile |
| API keys required | ⚠️ Moderate | Requires OpenAI, Anthropic, Gemini API access |
| Compute cost | ⚠️ Moderate | Significant API costs for full reproduction |
| Determinism | ⚠️ Low | LLM non-determinism + stochastic exploration |
| Model versioning | ⚠️ Low | Results tied to specific model versions that may change |
Key Reproducibility Challenges
- API Cost: Full reproduction across all four domains with 50+ iterations requires significant API spend across three providers.
- Model Version Drift: Claude 4.5 Sonnet, o3-mini, and GPT-4o may be updated or deprecated, changing results.
- Non-Determinism: Self-modification is inherently stochastic; exact trajectories will differ.
- Safety Sandboxing: Reproducing the safety setup (sandboxed execution with human oversight) requires careful infrastructure.
8 Compute and API Costs
Per-Iteration Cost Structure
Each iteration of the DGM-H loop involves:
┌─────────────────────────────────────────────────┐
│ COST PER ITERATION │
│ │
│ 1. Meta Agent Prompt Construction │
│ - Read current agent source (~500-2000 LOC) │
│ - Read performance history │
│ - Read self-modification instructions │
│ Cost: ~5K-20K input tokens │
│ │
│ 2. LLM Call (Claude 4.5 Sonnet) │
│ - Generate code diff │
│ Cost: ~2K-10K output tokens │
│ │
│ 3. Apply Diff + Execute New Agent │
│ - Sandboxed execution (Docker) │
│ Cost: Compute time (seconds to minutes) │
│ │
│ 4. Evaluation │
│ - Domain-specific (may involve LLM calls) │
│ Cost: Varies by domain │
└─────────────────────────────────────────────────┘
Estimated Costs by Domain
| Domain | Iterations (typ.) | Modification Model | Evaluation Model | Est. Cost per Run |
|---|---|---|---|---|
| Polyglot Coding | 50 | Claude 4.5 Sonnet | o3-mini | $50–150 |
| Paper Review | 50 | Claude 4.5 Sonnet | GPT-4o | $30–100 |
| Robotics Reward | 50 | Claude 4.5 Sonnet | Simulation (free) | $20–60 |
| Olympiad Math | 50 | Claude 4.5 Sonnet | GPT-4o | $30–100 |
Total for full reproduction (all domains, multiple seeds): Estimated $500–2,000 in API costs.
Cost Comparison with Baselines
| System | Cost Model | Improvement Mechanism |
|---|---|---|
| Fine-tuning | Very High (GPU hours) | Weight updates |
| DGM (original) | Moderate (API calls) | Code modification (coding only) |
| DGM-H | Moderate (API calls) | Code modification (any domain) |
| RL from scratch | Very High (GPU hours) | Gradient descent |
Compute Infrastructure
The system runs on standard hardware with Docker for sandboxing:

- CPU: Standard development machine
- GPU: Not required (all compute is API-based)
- Memory: Minimal (programs are small; no large model loading)
- Network: Required for API calls
- Storage: Moderate (experiment logs, agent archives)
The key insight is that DGM-H is compute-light on local hardware because the heavy computation (LLM inference) is offloaded to API providers. This makes it accessible but creates API cost as the primary budget constraint.
9 Architecture Solution
System Architecture
┌────────────────────────────────────────────────────────────────────┐
│ DGM-HYPERAGENT SYSTEM ARCHITECTURE │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ GENERATE LOOP (generate_loop.py) │ │
│ │ │ │
│ │ for iteration in range(max_iterations): │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌──────────────────────┐ │ │
│ │ │ ARCHIVE │ │ SELECTION │ │ │
│ │ │ (stepping │────>│ (choose parent │ │ │
│ │ │ stones) │ │ agent to modify) │ │ │
│ │ └─────────────────┘ └──────────┬───────────┘ │ │
│ │ ↑ │ │ │
│ │ │ ▼ │ │
│ │ ┌─────┴───────────┐ ┌──────────────────────┐ │ │
│ │ │ EVALUATION │ │ META AGENT │ │ │
│ │ │ (domain- │<────│ (meta_agent.py) │ │ │
│ │ │ specific) │ │ │ │ │
│ │ └─────────────────┘ │ ┌─────────────────┐ │ │ │
│ │ │ │ Self-Modification│ │ │ │
│ │ │ │ Instructions │ │ │ │
│ │ ┌─────────────────┐ │ │ (EDITABLE) │ │ │ │
│ │ │ TASK AGENT │ │ └─────────────────┘ │ │ │
│ │ │ (task_agent.py) │<────│ │ │ │
│ │ │ (domain code) │ │ Uses: Claude 4.5 │ │ │
│ │ └─────────────────┘ │ Sonnet for diffs │ │ │
│ │ └──────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ SANDBOX (Docker) │ │
│ │ - Limited resources │ │
│ │ - Restricted internet access │ │
│ │ - Human oversight │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
The Self-Referential Loop
The architecture's defining feature is the self-referential loop. Unlike standard agent systems with fixed optimization procedures, DGM-H's optimization procedure is part of the optimized artifact:
STANDARD OPTIMIZATION HYPERAGENT OPTIMIZATION
────────────────────── ──────────────────────────
Fixed Optimizer ──> Agent ┌──────────────────────┐
↑ │ │ Agent + Optimizer │
│ │ │ (single program) │
└──evaluate───────┘ │ │
│ Optimizer modifies │
│ BOTH the agent │
│ AND itself │
└──────────┬───────────┘
│
└──evaluate──┘
Key Architectural Decisions
- Single Editable Program: Task agent and meta agent are unified in one Python program, not separate services or models. This enables self-reference without external coordination.
- LLM as Tool, Not Component: The LLM (Claude 4.5 Sonnet) is called as an API tool by the meta agent, not embedded in the architecture. The meta agent's instructions for how to use the LLM are editable, so the system can evolve its LLM interaction patterns.
- Archive as Memory: The growing archive of agent variants serves as both a population (for diversity) and a memory (stepping stones for future improvements).
- Separate Evaluation: Evaluation uses different models than modification, preventing self-deceptive optimization.
- Docker Sandboxing: All agent execution happens in sandboxed containers with resource limits and restricted network access. This is essential given that the system executes self-generated code.
Comparison to DGM (Original)
┌─────────────────────────────────────────────────────────────┐
│ │
│ DARWIN GÖDEL MACHINE (DGM) │
│ ───────────────────────── │
│ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Coding Agent │ === │ Meta Agent │ SAME agent │
│ │ (task) │ │ (modifier) │ (works because │
│ └─────────────┘ └─────────────┘ coding = coding) │
│ │
│ Fixed: modification instructions │
│ Fixed: selection strategy │
│ Fixed: evaluation │
│ │
│ Domain: Coding ONLY │
│ │
├─────────────────────────────────────────────────────────────┤
│ │
│ DGM-HYPERAGENTS (DGM-H) │
│ ──────────────────────── │
│ │
│ ┌───────────────────────────────┐ │
│ │ HYPERAGENT PROGRAM │ │
│ │ │ │
│ │ ┌─────────────┐ │ │
│ │ │ Task Agent │ Editable │ │
│ │ └─────────────┘ │ │
│ │ │ │
│ │ ┌─────────────┐ │ │
│ │ │ Meta Agent │ Editable │ │
│ │ │ (including │ │ │
│ │ │ its own │ │ │
│ │ │ instructions│ │ │
│ │ └─────────────┘ │ │
│ └───────────────────────────────┘ │
│ │
│ Editable: modification instructions ✅ │
│ Editable: selection strategy ✅ │
│ Editable: memory/tracking ✅ │
│ │
│ Domain: ANY computable task │
└─────────────────────────────────────────────────────────────┘
10 Component Breakdown
Component 1: Generate Loop (generate_loop.py)
Purpose: The outer loop that orchestrates the self-improvement process.
| Parameter | Specification |
|---|---|
| Entry point | python generate_loop.py --domains <domain> |
| Loop structure | Iterative: select parent → modify → evaluate → archive |
| Max iterations | Configurable (50 used in experiments) |
| Output | Agent variants saved to outputs/ directory |
| Parallelism | Sequential iterations (each depends on archive state) |
The generate loop implements the open-ended exploration framework inherited from DGM: maintain an archive of diverse, high-quality agent variants ("stepping stones") and iteratively produce new variants that build on existing ones.
Component 2: Meta Agent (meta_agent.py)
Purpose: The self-modifying component that generates code diffs for agent variants.
The meta agent reads the current agent's source code and generates a diff using Claude 4.5 Sonnet. Critically, the meta agent's own instructions—how it decides what to modify, what context to provide to the LLM, and how to structure its prompts—are part of the editable program.
Key operations:

- Code reading: Parse current agent source
- Context assembly: Gather performance history, previous modifications, self-modification instructions
- Diff generation: Prompt Claude 4.5 Sonnet for a code modification
- Diff application: Apply the generated diff to create a new agent variant
The helper script run_meta_agent.py wraps the meta agent execution and diff extraction.
Component 3: Task Agent (task_agent.py)
Purpose: The domain-specific component that solves the target task.
The task agent's structure depends on the domain:
| Domain | Task Agent Structure |
|---|---|
| Coding | Code generation + execution |
| Paper Review | Structured analysis pipeline + decision logic |
| Robotics | Reward function definition |
| Math Grading | Proof analysis + score assignment |
The task agent is the primary target of modification, but unlike traditional systems, it is not the only modifiable component.
Component 4: Archive (Stepping Stones)
Purpose: Maintains the population of agent variants for open-ended exploration.
The archive stores:

- Complete agent source code for each variant
- Evaluation scores (training and validation)
- Lineage information (which parent agent it was derived from)
- Modification descriptions

The archive enables:

- Diversity: Prevents convergence on a single lineage
- Backtracking: Can return to earlier promising variants
- Stepping stones: Current agents build on past discoveries
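An archive entry carrying those fields, plus a lineage walk for backtracking, might look like the following. This is an illustrative shape; the repository's on-disk format may differ, and `ArchiveEntry`/`lineage` are names invented here.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative archive record: source, scores, lineage, modification note.
@dataclass
class ArchiveEntry:
    source: str                   # complete agent source code
    train_score: float
    val_score: float
    parent: Optional[str] = None  # which variant this was derived from
    modification: str = ""        # description of the applied diff

def lineage(archive, name):
    """Walk parents back to the root: the stepping-stone path of a variant."""
    path, cur = [], name
    while cur is not None:
        path.append(cur)
        cur = archive[cur].parent
    return list(reversed(path))

archive = {
    "v0": ArchiveEntry("<code>", 0.10, 0.10),
    "v1": ArchiveEntry("<code>", 0.30, 0.25, parent="v0",
                       modification="add review checklist"),
    "v2": ArchiveEntry("<code>", 0.50, 0.45, parent="v1",
                       modification="multi-stage pipeline"),
}
```

Keeping the full lineage is what makes backtracking cheap: any earlier promising variant can be re-selected as a parent.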
Component 5: Domain Evaluators (domains/)
Purpose: Domain-specific evaluation of agent performance.
| Domain | Evaluator | Model Used |
|---|---|---|
| Coding | Code execution + test cases | o3-mini |
| Paper Review | Comparison to human decisions | GPT-4o |
| Robotics | Physics simulation | None (simulator) |
| Math Grading | Comparison to ground truth grades | GPT-4o |
Evaluators are external to the hyperagent—they cannot be modified by the meta agent. This prevents the system from learning to game its own evaluation.
Component 6: Safety Sandbox
Purpose: Contain the execution of self-generated code.
┌─────────────────────────────────────────┐
│ DOCKER CONTAINER │
│ │
│ ┌──────────────────────────────────┐ │
│ │ Agent Execution Environment │ │
│ │ │ │
│ │ - Resource limits (CPU, memory) │ │
│ │ - No/restricted internet │ │
│ │ - Time limits per execution │ │
│ │ - File system isolation │ │
│ └──────────────────────────────────┘ │
│ │
│ Human oversight at checkpoints │
└─────────────────────────────────────────┘
WARNING from README: "This repository involves executing untrusted, model-generated code. We strongly advise users to be aware of the associated safety risks."
11 Core Mechanisms (Detailed)
Mechanism 1: Metacognitive Self-Modification
This is the paper's central mechanism and its most novel contribution. Metacognitive self-modification means the system can modify the process by which it generates modifications.
Concrete example from the paper:
In the paper review domain, the meta agent initially used simple instructions like "improve the review process." Over successive iterations, the meta agent evolved its own instructions to include:
- Structured analysis checklists — multi-point evaluation criteria
- Decision rules — explicit thresholds for accept/reject
- Multi-stage pipelines — separate stages for different aspects of review
- Performance tracking — code to monitor and analyze its own accuracy
- Adaptive strategy — logic to merge strengths of different generations
None of these meta-level improvements were programmed by humans. They emerged from the self-modification process.
Mechanism 2: Open-Ended Exploration (from DGM)
DGM-H inherits the open-ended exploration framework from DGM:
OPEN-ENDED EXPLORATION CYCLE
─────────────────────────────
1. SELECT parent agent from archive
(diverse, high-quality variants)
│
▼
2. MODIFY parent to create child
(meta agent generates diff)
│
▼
3. EVALUATE child on task
(domain-specific evaluation)
│
▼
4. ARCHIVE child if it adds value
(quality + diversity criteria)
│
▼
5. REPEAT (growing archive of stepping stones)
Key property: The archive preserves diversity, not just the single best agent. This prevents premature convergence and enables exploration of qualitatively different strategies.
The open-endedness comes from the lack of a fixed objective—the system can discover novel strategies that weren't anticipated in the initial design, as long as they improve evaluation scores.
Mechanism 3: Cross-Domain Transfer
The most practically significant mechanism. Meta-level improvements transfer because they encode general improvement strategies, not domain-specific knowledge:
| Meta-Level Strategy | Discovered In | Transfers To |
|---|---|---|
| Structured evaluation checklists | Paper review | Math grading, robotics |
| Multi-stage analysis pipelines | Paper review | Math grading |
| Persistent memory synthesis | Emerged in multiple domains | All domains |
| Performance tracking infrastructure | Emerged in multiple domains | All domains |
| Compute-aware planning | Observed across runs | All domains |
| Bias detection and correction | Paper review (94% accept rate) | Any classification task |
Transfer mechanism:
Domain A (Paper Review) Domain B (Math Grading)
──────────────────────── ─────────────────────
[Agent with evolved meta strategies] [Initial agent]
│ ↑
└──transfer meta agent code──────────┘
│
[Agent with transferred
meta strategies, applied
to new domain]
│
Score: 0.0 → 0.630
(vs. 0.0 for all baselines)
Mechanism 4: Emergent Self-Built Infrastructure
The system autonomously develops tools and infrastructure it was never instructed to build:
Performance Tracker: The system built a cross-generation performance tracking system that monitors accuracy trends, identifies regression, and informs modification decisions.
Persistent Memory: Rather than simple logging, the system evolved memory that "synthesizes insights rather than just logging numbers." From the paper's logs:
Generation 55: "best accuracy but is too harsh" — decided to merge this version's rigorous reasoning with the stronger balance of a later generation.
Compute-Aware Planning: The system adjusts its exploration strategy based on remaining compute budget:
- Early iterations (high budget): Broad, exploratory changes (try fundamentally different approaches)
- Late iterations (low budget): Targeted refinements (polish the best approach)
This behavior was not programmed—it emerged from the self-modification process.
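A schedule with this shape can be sketched as a single function. The linear ramp, the breadth bounds, and all parameter names here are illustrative assumptions; the paper reports the behavior as emergent, not as an explicit formula.

```python
def exploration_breadth(remaining_budget: float, total_budget: float,
                        max_breadth: int = 8, min_breadth: int = 1) -> int:
    """Sketch of compute-aware planning: broad exploration while the
    budget is high, narrowing to targeted refinement as it runs out.
    The linear schedule is an assumption, not the paper's mechanism."""
    frac = max(0.0, min(1.0, remaining_budget / total_budget))
    return max(min_breadth, round(min_breadth + frac * (max_breadth - min_breadth)))
```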
Mechanism 5: Diff-Based Code Evolution
Agent modification happens through code diffs, not full program regeneration:
--- agent_v3.py
+++ agent_v4.py
@@ -45,3 +45,15 @@
 def evaluate_paper(paper_text: str) -> dict:
-    score = simple_score(paper_text)
-    return {"decision": "accept" if score > 0.5 else "reject"}
+    # Multi-stage evaluation pipeline (evolved)
+    novelty = assess_novelty(paper_text)
+    methodology = assess_methodology(paper_text)
+    clarity = assess_clarity(paper_text)
+
+    # Decision rule with calibrated thresholds (evolved)
+    weighted_score = 0.4 * novelty + 0.4 * methodology + 0.2 * clarity
+    confidence = compute_confidence(novelty, methodology, clarity)
+
+    return {
+        "decision": "accept" if weighted_score > 0.55 else "reject",
+        "confidence": confidence,
+        "reasoning": format_reasoning(novelty, methodology, clarity),
+    }
Diff-based modification has advantages over full regeneration:
- Incremental: Small changes are less likely to break working code
- Traceable: Each diff documents exactly what changed
- Composable: Multiple diffs can be applied sequentially
- Reversible: Failed diffs can be rolled back
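The traceability property can be demonstrated with Python's standard `difflib`, which produces the same unified-diff format shown above. The two agent snippets here are toy stand-ins, not code from the repository:

```python
import difflib

old = """def evaluate_paper(paper_text: str) -> dict:
    score = simple_score(paper_text)
    return {"decision": "accept" if score > 0.5 else "reject"}
"""
new = """def evaluate_paper(paper_text: str) -> dict:
    novelty = assess_novelty(paper_text)
    methodology = assess_methodology(paper_text)
    weighted = 0.5 * novelty + 0.5 * methodology
    return {"decision": "accept" if weighted > 0.55 else "reject"}
"""

# Generate a unified diff documenting exactly what changed between versions
diff = list(difflib.unified_diff(
    old.splitlines(keepends=True),
    new.splitlines(keepends=True),
    fromfile="agent_v3.py", tofile="agent_v4.py",
))
print("".join(diff))
```

Because every generation's diff is stored alongside the agent in the archive, the full lineage of any evolved behavior can be reconstructed by replaying diffs from the initial agent.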
Mechanism 6: Selection and Archival Strategy
The system uses a selection strategy to choose which parent agent to modify. Initially, selection is based on quality scores, but the selection strategy itself is editable—the meta agent can evolve how it chooses parents.
From the paper:
"Self-discovered selection strategies beat random selection but have not matched hand-designed mechanisms, suggesting the technology remains in early stages."
This is an honest assessment of current limitations: the meta-cognitive self-modification works but hasn't yet surpassed carefully designed baselines in all aspects.
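A hand-designed quality-weighted selector, the kind of baseline the self-discovered strategies are compared against, might look like the sketch below. The score floor of 0.05 and the entry schema are illustrative assumptions:

```python
import random

def quality_weighted_select(archive: list[dict], rng: random.Random) -> dict:
    """Sketch of a hand-designed parent-selection baseline: sample a
    parent with probability proportional to its score, with a small
    floor so zero-score agents remain reachable as stepping stones.
    In DGM-H this function itself would be editable by the meta agent."""
    weights = [entry["score"] + 0.05 for entry in archive]
    return rng.choices(archive, weights=weights, k=1)[0]
```

Making this function part of the editable program is what lets the meta agent evolve its own selection strategy, even if, per the paper, those evolved strategies have not yet matched hand-designed ones.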
12 Programming Language
Primary Language: Python 3.12
The entire system—generate loop, meta agent, task agent, and all evolved agents—is Python.
Why Python:
- LLMs are strongest at Python code generation
- Python's dynamic nature makes self-modification natural (source code as data)
- Rich ecosystem for all four evaluation domains
- Docker containerization for sandboxed execution
Dependencies
From the repository:
# Core dependencies (requirements.txt)
# Foundation model APIs
openai
anthropic
google-generativeai
# Utilities
graphviz # Visualization of agent lineages
numpy # Numerical operations
# ... additional domain-specific dependencies
Code Style
The evolved agent code ranges from simple scripts to multi-file programs with:
- Function definitions for task-specific logic
- Class definitions for structured state management
- String templates for LLM prompts
- Control flow for multi-stage pipelines
The code quality of evolved agents varies—some generations produce clean, well-structured code while others are more ad hoc. The system does not enforce style constraints beyond functional correctness.
Self-Modification as Code Transformation
The self-modification mechanism treats Python source code as a mutable artifact:
# Simplified conceptual model of meta agent operation
def meta_agent_modify(current_agent_source: str,
                      performance_history: dict,
                      modification_instructions: str) -> str:
    """Generate a modified version of the agent."""
    prompt = construct_prompt(
        current_code=current_agent_source,
        history=performance_history,
        instructions=modification_instructions,  # EDITABLE
    )
    diff = call_llm(prompt)  # Claude 4.5 Sonnet
    new_source = apply_diff(current_agent_source, diff)
    return new_source
The modification_instructions parameter is itself part of the editable program—so the system can evolve how it constructs the prompt, what context it includes, and what it asks the LLM to change.
13 Memory Management
Explicit Memory: Agent Archive
The archive is the system's primary long-term memory. Each entry stores:
| Field | Size (approx.) | Purpose |
|---|---|---|
| Agent source code | 1–50 KB | Complete program text |
| Evaluation scores | <1 KB | Training and validation metrics |
| Lineage | <1 KB | Parent agent reference |
| Generation number | <1 KB | Iteration when created |
| Diff from parent | 1–10 KB | What changed |
Archive size grows linearly with iterations. For 50 iterations with ~3 variants per iteration, the archive contains ~150 agents at ~10 KB average = ~1.5 MB total.
Implicit Memory: Evolved Agent State
A distinctive feature of DGM-H is that agents can evolve their own internal memory structures. The paper documents several emergent memory patterns:
- Performance logs: Agents that maintain records of their own accuracy across test cases
- Insight synthesis: Rather than raw logging, agents that distill patterns from performance data
- Strategy annotations: Code comments that document why certain approaches were adopted
- Cross-generation state: Information about what previous generations tried and what worked
This evolved memory is inside the agent's source code—it persists as part of the program text, not as a separate data structure.
Working Memory: LLM Context
During each modification step, the meta agent assembles a context window for the LLM call:
Context Window Contents:
1. Current agent source code (~5-20K tokens)
2. Performance history (~1-5K tokens)
3. Self-modification instructions (~1-3K tokens)
4. Previous modification attempts (~2-5K tokens)
Total: ~10-35K tokens per LLM call
This working memory is ephemeral—it exists only during the LLM call and is not persisted.
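Assembling that context under a token budget can be sketched as follows. The 4-characters-per-token heuristic, the truncation order (drop the oldest modification attempts first), and all names are assumptions for illustration:

```python
def assemble_context(agent_source: str, history: str, instructions: str,
                     attempts: list[str], max_tokens: int = 35_000) -> str:
    """Sketch of the meta agent's working-memory assembly. Uses the
    crude 4-chars-per-token heuristic; the real tokenizer and the
    truncation policy are assumptions, not from the paper."""
    def tokens(text: str) -> int:
        return len(text) // 4

    # Mandatory parts: current code, performance history, instructions
    parts = [agent_source, history, instructions]
    budget = max_tokens - sum(tokens(p) for p in parts)

    # Fill the remaining budget with previous attempts, newest first
    kept: list[str] = []
    for attempt in reversed(attempts):
        if tokens(attempt) > budget:
            break
        kept.append(attempt)
        budget -= tokens(attempt)
    return "\n\n".join(parts + list(reversed(kept)))
```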
Cross-Run Memory (Transfer Agents)
For cross-domain transfer, the meta agent's evolved code (including any self-built infrastructure) is preserved and applied to new domains. This constitutes a form of long-term meta-memory that persists across independent runs:
Run 1 (Paper Review):
Meta agent evolves: structured checklists, bias detection
→ Save meta agent code
Run 2 (Robotics):
Meta agent evolves: reward function analysis, physics reasoning
→ Save meta agent code
Run 3 (Math Grading):
Transfer meta agent from Runs 1+2
→ Immediately applies structured analysis from paper review
→ Score: 0.0 → 0.630 (vs. 0.0 for baselines)
Memory Hierarchy Summary
┌─────────────────────────────────────────┐
│ MEMORY HIERARCHY │
│ │
│ Long-term: Agent Archive │
│ (all variants, scores, │
│ lineages) │
│ │
│ Medium-term: Evolved Agent Memory │
│ (self-built tracking, │
│ insight synthesis) │
│ │
│ Short-term: LLM Context Window │
│ (current code + history │
│ + instructions) │
│ │
│ Cross-run: Transfer Agents │
│ (meta strategies that │
│ persist across domains) │
│ │
└─────────────────────────────────────────┘
14 Continued Learning
Within-Run Learning
Each run of DGM-H demonstrates clear learning curves:
| Domain | Initial Score | Iteration 10 | Iteration 25 | Iteration 50 |
|---|---|---|---|---|
| Paper Review | 0.0 | ~0.3 | ~0.5 | 0.710 |
| Coding | 0.084 | ~0.15 | ~0.22 | 0.267 |
| Robotics | 0.060 | ~0.15 | ~0.25 | 0.372 |
Learning is not monotonic—the archive preserves diversity, so individual agent scores may fluctuate while the archive's best score generally increases.
Across-Run Learning (Cross-Domain Transfer)
The paper's strongest evidence for continued learning is cross-domain transfer:
Transfer Protocol:
1. Train DGM-H on Domain A (e.g., paper review) for 50 iterations
2. Extract the meta agent's evolved code
3. Initialize DGM-H on Domain B (e.g., math grading) with the transferred meta agent
4. Run DGM-H on Domain B
Results: Transfer agents achieve non-zero performance on domains where all other approaches score 0.0 (Olympiad math). This demonstrates that meta-level strategies generalize.
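The four-step protocol can be written down as a thin orchestration function. Every callable here is a hypothetical stand-in for a stage of the real pipeline, not an interface from the repository:

```python
def transfer_run(train, extract_meta, init_with_meta, domain_a, domain_b):
    """Sketch of the cross-domain transfer protocol. `train`,
    `extract_meta`, and `init_with_meta` are hypothetical stand-ins
    for the real training, extraction, and initialization stages."""
    run_a = train(domain_a, meta=None)                 # 1. train on Domain A
    meta = extract_meta(run_a)                         # 2. extract evolved meta agent
    run_b = init_with_meta(domain_b, meta)             # 3. seed Domain B with it
    return train(domain_b, meta=meta, resume=run_b)    # 4. run on Domain B
```

The important property the sketch makes explicit is that only the meta agent's code crosses the domain boundary; no task-level knowledge from Domain A is reused directly.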
Accumulation Across Runs
The paper states that meta-level improvements "accumulate across runs":
Run 1: Meta agent learns structured evaluation
Run 2: Meta agent (seeded from Run 1) learns performance tracking
Run 3: Meta agent (seeded from Run 2) learns compute-aware planning
...
Run N: Meta agent has accumulated N runs worth of meta-strategies
Each run adds to the meta agent's repertoire of improvement strategies. This accumulation is the basis for the paper's claim of potentially self-accelerating progress.
What the System Learns to Do Better
Over successive iterations and runs, DGM-H demonstrably improves at:
- Prompting the LLM — more effective instructions for code modifications
- Selecting what to modify — targeting the most impactful parts of the code
- Evaluating its own progress — building performance tracking tools
- Managing compute budget — adjusting exploration breadth to remaining iterations
- Synthesizing insights — combining strengths of multiple generations
- Detecting biases — identifying and correcting systematic errors (e.g., 94% accept rate)
Limitations of Current Learning
- No formal learning guarantees: The system can stagnate on difficult tasks.
- Selection strategy gap: Self-discovered selection strategies haven't yet matched hand-designed ones.
- Sample efficiency: Significant improvement requires roughly 50 iterations, and each iteration involves multiple LLM API calls.
- Evaluation ceiling: Performance is bounded by the quality of the (fixed) evaluation mechanism.
Relationship to Open-Ended Learning
DGM-H is positioned within the open-ended learning paradigm:
| Property | Traditional RL | Standard DGM | DGM-H |
|---|---|---|---|
| Fixed objective | Yes | Partially | No |
| Fixed improvement mechanism | Yes | Yes | No |
| Domain-specific | Yes | Yes (coding) | No |
| Open-ended discovery | No | Yes (coding) | Yes (any domain) |
| Meta-level learning | No | Implicit | Explicit |
| Cross-domain transfer | No | No | Yes |
15 Applications
Direct Applications
1. Automated AI Research Agent Development
The most immediate application is using DGM-H to evolve better AI research agents. Given a research task (e.g., literature review, experiment design, data analysis), DGM-H can evolve agents that improve at these tasks over time.
Connection to AIRS-Bench: Coauthors Foerster and Shavrina also developed AIRS-Bench (February 2026), a benchmark for AI research science agents. DGM-H could be applied to evolve agents that score well on AIRS-Bench, creating a self-improving cycle where the benchmark and the agents co-evolve.
2. Automated Scientific Discovery
DGM-H's domain-agnostic self-improvement makes it applicable to scientific discovery workflows:
| Scientific Task | How DGM-H Could Help |
|---|---|
| Hypothesis generation | Evolve agents that generate higher-quality hypotheses |
| Experimental design | Evolve agents that design more informative experiments |
| Data analysis | Evolve agents that extract more insight from data |
| Paper writing | Evolve agents that produce better scientific writing |
| Peer review | Demonstrated in paper (paper review domain) |
3. Reward Function Engineering for RL
The robotics reward design domain demonstrates that DGM-H can evolve reward functions for reinforcement learning. This has direct applications in:
- Robotic locomotion (demonstrated)
- Manipulation tasks
- Autonomous driving reward design
- Game AI reward shaping
The connection to Eureka (Ma et al., 2024) is explicit—DGM-H extends the LLM-based reward design paradigm with self-improving meta-level strategies.
4. Automated Evaluation and Grading
The paper review and math grading domains demonstrate DGM-H's ability to evolve evaluation agents:
- Conference paper review (demonstrated)
- Student work grading
- Code review
- Proposal evaluation
- Application screening
5. Agent Infrastructure Development
The emergent self-built infrastructure capability suggests using DGM-H to evolve agent tools and frameworks:
- Performance monitoring dashboards
- Error analysis pipelines
- A/B testing frameworks for agent variants
- Automated debugging tools
Broader Implications
The Self-Improvement Landscape (March 2026)
The paper arrives in a competitive landscape:
| System | Organization | Self-Improvement Type |
|---|---|---|
| DGM-H | Meta FAIR + Labs | Metacognitive, cross-domain |
| DGM (original) | Sakana AI | Coding-only |
| M2.7 | MiniMax | In-training self-evolution |
| Codex 5.3 | OpenAI | Self-assisted development |
| Karpathy Loop | Independent | Autonomous experiment loops |
| AlphaEvolve | Google DeepMind | Evolutionary code optimization |
DGM-H's unique position: cross-domain transfer of improvement strategies. While all other systems improve within their training domain, DGM-H demonstrates that meta-level improvement strategies can transfer to entirely new domains.
The "Improving How to Improve" Hierarchy
DGM-H introduces a hierarchy of improvement levels:
Level 0: Static agent (no improvement)
→ Fixed performance
Level 1: Self-improving agent (standard DGM, RL)
→ Task performance improves
→ Improvement rate is fixed
Level 2: Meta-self-improving agent (DGM-H)
→ Task performance improves
→ Improvement RATE improves
→ Potential for acceleration
Level 3: (Theoretical) Recursively meta-improving
→ Improvement of improvement of improvement...
→ Unbounded acceleration potential
→ Not yet demonstrated
DGM-H achieves Level 2 in practice. Level 3 remains theoretical but is the logical endpoint of the framework.
Safety Implications
The paper is commendably frank about safety concerns:
"We discuss what safety entails in this setting and the broader implications of self-improving systems."
Current safety measures:
- Docker sandboxing with resource limits
- Restricted internet access during agent execution
- Human oversight at checkpoints
- Separation of modification and evaluation models
Unresolved safety questions:
1. Capability acceleration: If meta-level improvement accumulates across runs, how fast could capabilities evolve?
2. Sandbox escape: Could sufficiently capable self-modifying code find ways to bypass containment?
3. Deceptive alignment: Could an agent evolve to appear aligned during evaluation while pursuing different objectives?
4. Scalability of oversight: Can human oversight scale as fast as the system's self-improvement rate?
Connection to Evolutionary Computation
DGM-H has deep structural parallels with evolutionary computation:
| Concept | Evolution | DGM-H |
|---|---|---|
| Individual | Organism | Agent (Python program) |
| Genotype | DNA | Source code |
| Phenotype | Physical traits | Agent behavior |
| Fitness | Reproductive success | Evaluation score |
| Mutation | Random DNA changes | LLM-generated code diffs |
| Selection | Natural selection | Archive-based selection |
| Population | Species | Archive of agent variants |
| Adaptation | Phenotypic change | Improved task performance |
| Evolvability | Ability to evolve | Meta-level self-modification |
The analogy to evolvability is particularly apt: biological evolution has evolved mechanisms that make future evolution more effective (e.g., sexual reproduction, modular body plans, regulatory gene networks). DGM-H's metacognitive self-modification is the computational analogue—evolving mechanisms that make future improvement more effective.
Limitations and Open Questions
- Sample efficiency: 50 iterations is a significant investment in API calls. Can the system learn faster?
- Evaluation dependence: Performance is bounded by the quality of fixed evaluation. What happens when evaluation is imperfect or gameable?
- Selection strategy gap: Self-discovered selection hasn't matched hand-designed selection. This is a ceiling on the meta-improvement's current quality.
- Scale of programs: Current agents are relatively small programs. How does the approach scale to complex, multi-file systems?
- Two-player dynamics: The paper doesn't extensively test adversarial domains where the opponent is also improving.
- Theoretical foundations: The paper is empirical. There is no formal characterization of when or why metacognitive self-modification should work, or what its limits are.
- Safety at scale: All experiments were small-scale with human oversight. The safety properties have not been tested at deployment scale.
Related Conceptual Connections
Connection to the Original Gödel Machine
The name "Gödel Machine" (Schmidhuber, 2007) refers to a theoretical self-referential universal problem solver that can rewrite any part of its own code—including the code that does the rewriting—provided it can prove that the modification is beneficial. The Darwin Gödel Machine relaxes the proof requirement (using empirical evaluation instead) and adds evolutionary diversity (the "Darwin" aspect).
DGM-H goes further by making the rewriting mechanism itself editable, closing the final gap in the self-referential loop: where Schmidhuber's machine could modify its own rewriting code only under a proof of benefit, DGM-H drops the proof requirement entirely and relies on empirical evaluation plus evolutionary selection.
Connection to Quality-Diversity Algorithms
The archive of stepping stones connects to the quality-diversity (QD) literature, particularly MAP-Elites (Mouret & Clune, 2015). QD algorithms maintain an archive of diverse, high-performing solutions indexed by behavioral descriptors. DGM-H's archive serves a similar function—maintaining diversity to enable future stepping-stone discoveries.
Connection to AutoML and Neural Architecture Search
DGM-H can be viewed as a generalization of AutoML/NAS to arbitrary program optimization. While AutoML searches over model architectures and hyperparameters, DGM-H searches over entire agent programs—including the meta-level procedures that guide the search.
| Dimension | AutoML/NAS | DGM-H |
|---|---|---|
| Search space | Architectures/hyperparameters | Complete programs |
| Search method | Fixed (Bayesian opt., RL, evolution) | Evolvable (self-modifying) |
| Objective | Fixed metric | Fixed (but meta-strategies evolve) |
| Transfer | Limited | Cross-domain meta transfer |
Connection to Program Synthesis Literature
DGM-H's use of LLMs for code generation places it at the intersection of program synthesis and self-improving systems. The diff-based modification approach is related to:
- AlphaCode (Li et al., 2022): Large-scale sampling and filtering of programs
- AlphaEvolve (Novikov et al., 2025): Evolutionary algorithm applied to codebases with LLM as mutation
- Reflexion (Shinn et al., 2023): Verbal reinforcement learning for code refinement
DGM-H differentiates itself by making the refinement procedure itself evolvable, a level of self-reference absent from these prior systems.
This analysis is based on the paper as published on arXiv (2603.19461v1, March 19, 2026), the Meta AI research publication (March 24, 2026), the open-source repository at github.com/facebookresearch/HyperAgents, and detailed coverage by WinBuzzer (March 31, 2026). The paper was authored by researchers across University of British Columbia, Vector Institute, University of Edinburgh, NYU, FAIR at Meta, and Meta Superintelligence Labs.