Hyperagents
Self-referential agents that unify task-solving and self-improvement into a single editable program, enabling metacognitive self-modification—improving not just task behavior but the mechanism that generates future improvements.

Organization: Meta FAIR / Meta Superintelligence Labs, University of British Columbia, Vector Institute, University of Edinburgh, NYU
Published: March 19, 2026
Type: paper
Report Type: PhD-Level Technical Analysis
Report Date: April 2026
Table of Contents
- Full Title and Attribution
- Authors and Team
- Core Contribution
- Supported Solutions
- LLM Integration
- Key Results
- Reproducibility
- Compute and API Costs
- Architecture Solution
- Component Breakdown
- Core Mechanisms (Detailed)
- Programming Language
- Memory Management
- Continued Learning
- Applications
1 Full Title and Attribution
Full Title: Hyperagents
arXiv ID: 2603.19461
DOI: 10.48550/arXiv.2603.19461
License: CC BY 4.0
Venue: arXiv preprint (cs.AI)
Submission Date: March 19, 2026
Meta Publication Date: March 24, 2026
Code Repository: github.com/facebookresearch/HyperAgents
"DGM-Hyperagents offer a glimpse of open-ended AI systems that do not merely search for better solutions, but continually improve their search for how to improve."
This paper addresses the fundamental limitation of existing self-improving AI systems: they rely on fixed, handcrafted meta-level mechanisms that constrain the rate and nature of improvement. By making the improvement procedure itself editable and evolvable, Hyperagents break through this ceiling and demonstrate genuine metacognitive self-modification across multiple domains.
The work extends Sakana AI's Darwin Gödel Machine (DGM) from a coding-only self-improvement system to a domain-general framework, establishing that meta-level improvement strategies can transfer across unrelated domains and accumulate across independent runs.
2 Authors and Team
| Author | Affiliation | Role / Expertise |
|---|---|---|
| Jenny Zhang | University of British Columbia / Vector Institute | Lead author; open-ended learning, Darwin Gödel Machine (original DGM) |
| Bingchen Zhao | University of Edinburgh | Self-supervised learning, visual representation |
| Wannan Yang | NYU | Agent systems, reinforcement learning |
| Jakob Foerster | University of Oxford / Meta FAIR | Multi-agent RL, game theory, AIRS-Bench |
| Jeff Clune | University of British Columbia / Vector Institute | Open-ended evolution, quality-diversity, AI-generating algorithms |
| Minqi Jiang | FAIR at Meta | Open-ended learning, environment design |
| Sam Devlin | Meta Superintelligence Labs | Agent development, game AI |
| Tatiana Shavrina | Meta Superintelligence Labs | NLP, benchmark design, AIRS-Bench |
Institutional Span: Six institutions across three countries—a cross-cutting collaboration between academic open-ended evolution research (Clune, Zhang) and industry agent systems (Meta FAIR, Meta Superintelligence Labs).
Key Intellectual Lineage:
- Jenny Zhang is the lead author of the original Darwin Gödel Machine (DGM) paper (Zhang et al., 2025b), making her the natural lead for its generalization.
- Jeff Clune is one of the most influential researchers in open-ended evolution and AI-generating algorithms. His research vision—that AI systems should be able to discover increasingly complex and diverse solutions without human intervention—is the philosophical foundation of Hyperagents.
- Jakob Foerster brings multi-agent systems expertise and co-authored AIRS-Bench, a benchmark for AI research science agents, providing evaluation infrastructure context.
- Tatiana Shavrina co-authored AIRS-Bench and contributes NLP/benchmark expertise from Meta Superintelligence Labs.
This team represents a convergence of the open-ended evolution school (Clune/Zhang) with large-scale industrial AI systems (Meta), bridging the gap between theoretical frameworks and practical agent deployments.
3 Core Contribution
The Problem: Fixed Meta-Level Mechanisms
All prior self-improving AI systems share a structural limitation: the mechanism that drives improvement is itself fixed and handcrafted.
TRADITIONAL SELF-IMPROVING SYSTEMS
──────────────────────────────────────────────────────────────
┌──────────────────────────────────────┐
│ FIXED META-LEVEL │ ← Cannot improve
│ (handcrafted improvement procedure) │ itself
│ │
│ "Generate random variation" │
│ "Evaluate on benchmark" │
│ "Keep if better" │
└──────────────┬───────────────────────┘
│ applies to
▼
┌──────────────────────────────────────┐
│ MUTABLE TASK-LEVEL │ ← Can improve
│ (the agent solving the actual task) │
└──────────────────────────────────────┘
Problem: The improvement rate is bounded by the quality
of the fixed meta-level mechanism.
──────────────────────────────────────────────────────────────
This creates the infinite regress problem: if a meta-agent improves a task agent, who improves the meta-agent? Adding more meta-levels just shifts the question upward without solving it.
The Darwin Gödel Machine (DGM): A Partial Solution
The original DGM (Zhang et al., 2025b, by Sakana AI) solved this for a single domain—coding:
Because both evaluation and self-modification are coding tasks, gains in coding ability can translate into gains in self-improvement ability.
In DGM, the coding agent is both the task agent (to be evaluated) and the meta agent (to generate modifications). This creates a virtuous cycle only when the task domain aligns with the self-modification skill. For coding, this alignment is natural. For paper review, robotics, or mathematics, it is not.
The Hyperagent Solution: Collapse the Hierarchy
Hyperagents solve the infinite regress by collapsing the task agent and meta agent into a single editable program:
HYPERAGENT ARCHITECTURE
──────────────────────────────────────────────────────────────
┌──────────────────────────────────────┐
│ SINGLE EDITABLE PROGRAM │
│ │
│ ┌────────────────────────────────┐ │
│ │ TASK AGENT │ │ ← Editable
│ │ (solves the target task) │ │
│ └────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────┐ │
│ │ META AGENT │ │ ← ALSO Editable
│ │ (modifies itself + task agent)│ │
│ │ (improvement procedure) │ │
│ └────────────────────────────────┘ │
│ │
└──────────────────────────────────────┘
Key: The meta agent can modify ITSELF.
This enables metacognitive self-modification.
No infinite regress—one level, fully self-referential.
──────────────────────────────────────────────────────────────
The crucial innovation: The meta-level modification procedure is itself editable. The system can improve not only what it does (task-solving) but how it decides what to try next (self-modification strategy). This enables:
- Metacognitive self-modification — improving the improvement process
- Domain-agnostic improvement — no requirement that task and meta skills align
- Cross-domain transfer — meta-level strategies learned in one domain apply to others
- Accumulation across runs — meta improvements persist and compound
Formal Definition
A hyperagent is defined as a self-referential agent H = (T, M) where:

- T is the task agent component (domain-specific, solves the target task)
- M is the meta agent component (modifies both T and itself)
- Both T and M are parts of a single editable program
- M has read/write access to its own source code

The self-referential property means M can execute:

- M(T) → T' — modify the task agent
- M(M) → M' — modify itself (metacognitive modification)
- M(T, M) → (T', M') — modify both simultaneously
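The decomposition above can be sketched as a tiny Python data structure. This is illustrative only: the names (`Hyperagent`, `modify`, the string-valued sources) are assumptions for exposition, not the repository's actual API, and a real hyperagent is a full program rather than two strings.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical sketch of H = (T, M). Source code is modeled as plain
# strings; `diff` stands in for an LLM-generated code transformation.
@dataclass
class Hyperagent:
    task_source: str   # T: domain-specific task-solving code
    meta_source: str   # M: self-modification code (itself editable)

    def modify(self, diff: Callable[[str], str], target: str) -> "Hyperagent":
        # M has read/write access to both components, including itself.
        if target == "task":   # M(T) -> T'
            return Hyperagent(diff(self.task_source), self.meta_source)
        if target == "meta":   # M(M) -> M' (metacognitive modification)
            return Hyperagent(self.task_source, diff(self.meta_source))
        # M(T, M) -> (T', M'): modify both simultaneously
        return Hyperagent(diff(self.task_source), diff(self.meta_source))

h = Hyperagent("solve(task)", "propose_diff(history)")
h2 = h.modify(lambda src: src + "  # improved", "meta")
```

The key property the sketch captures is that the `"meta"` branch edits the very code that decides what to edit next.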
4 Supported Solutions
Task Domains
DGM-Hyperagents are evaluated across four diverse domains:
| Domain | Task Description | Evaluation Metric | Difficulty |
|---|---|---|---|
| Polyglot Coding | Solve coding challenges in multiple languages | Pass rate | Well-aligned with DGM's original domain |
| AI Paper Review | Predict conference accept/reject decisions | Accuracy (balanced) | Requires structured analytical reasoning |
| Robotics Reward Design | Write reward functions for simulated quadruped locomotion | Normalized performance | Requires understanding physics and RL |
| Olympiad Math Grading | Grade mathematical proofs at competition level | Grading accuracy | Requires deep mathematical reasoning |
Solution Representation
Each hyperagent is a complete Python program that encodes:
- Task-solving logic — domain-specific code for performing the task
- Self-modification logic — code that generates variants of itself
- Memory/tracking infrastructure — any persistent state the agent has built for itself
The program is self-contained and can be executed independently. The meta agent component interacts with an LLM to generate code modifications (diffs) that are applied to produce new agent variants.
Agent Evolution Paths
The system generates and evaluates a growing archive of agent variants:
Generation 0: [Initial Agent]
│
├──diff──> [Agent v1.1] ──eval──> score: 0.15
├──diff──> [Agent v1.2] ──eval──> score: 0.22
└──diff──> [Agent v1.3] ──eval──> score: 0.08
│
├──diff──> [Agent v2.1] ──eval──> score: 0.35
└──diff──> [Agent v2.2] ──eval──> score: 0.41
│
├──diff──> [Agent v3.1] ──eval──> score: 0.52
...
After 50 iterations: best agent score → 0.710 (paper review)
Each generation builds on the best previous agents (stepping stones), and the archive preserves diversity to prevent premature convergence.
What Can Be Modified
The key differentiator from all prior systems is the scope of what can change:
| Component | DGM (Original) | DGM-H (Hyperagents) |
|---|---|---|
| Task-solving code | ✅ Editable | ✅ Editable |
| Prompt templates | ✅ Editable | ✅ Editable |
| Self-modification instructions | ❌ Fixed | ✅ Editable |
| Selection strategy | ❌ Fixed | ✅ Editable |
| Memory/tracking infrastructure | ❌ None | ✅ Self-created |
| Performance analysis code | ❌ None | ✅ Self-created |
| Evaluation criteria | ❌ Fixed | ✅ Editable |
5 LLM Integration
Multi-Model Architecture
DGM-H uses a deliberate separation between modification and evaluation models:
| Role | Model | Purpose |
|---|---|---|
| Self-modification engine | Claude 4.5 Sonnet (Anthropic) | Generates code diffs for agent variants |
| Coding evaluation | o3-mini (OpenAI) | Evaluates coding task performance |
| Paper review evaluation | GPT-4o (OpenAI) | Evaluates paper review accuracy |
| Robotics evaluation | Domain-specific simulation | No LLM needed (physics simulation) |
| Math grading evaluation | GPT-4o (OpenAI) | Evaluates math solution grading |
Separating the modification engine from the evaluation engine prevents any single model from "grading its own homework."
This architectural choice is critical for avoiding self-deceptive optimization, where the same model that generates improvements also evaluates them, potentially creating reward hacking loops.
How the Meta Agent Uses LLMs
The meta agent's core operation is generating code diffs. The process:
┌──────────────────────────────────────────────────────┐
│ META AGENT WORKFLOW │
│ │
│ 1. Read current agent source code │
│ 2. Read performance history / analysis │
│ 3. Read self-modification instructions │
│ (which are themselves part of the editable code) │
│ 4. Construct prompt for LLM: │
│ - Current code │
│ - Performance data │
│ - Self-modification strategy │
│ - Request for specific diff │
│ 5. Send to Claude 4.5 Sonnet │
│ 6. Receive code diff │
│ 7. Apply diff to create new agent variant │
│ 8. Evaluate new variant │
│ 9. Update archive if variant improves │
└──────────────────────────────────────────────────────┘
LLM Independence Principle
The LLM is used as a tool by the meta agent, not as a fixed component. The meta agent's instructions for how to use the LLM are themselves part of the editable program. This means the system can evolve:
- How it prompts the LLM
- What context it provides to the LLM
- What it asks the LLM to modify
- How it parses and applies the LLM's output
In practice, the system has been observed to autonomously improve its prompting strategies over successive generations.
No LLM Fine-Tuning
DGM-H does not fine-tune any LLM. All models are used through inference APIs. The "learning" happens entirely in the program space—the agent's source code evolves, not the LLM's weights.
This is a deliberate design choice that:

1. Avoids the cost and complexity of model fine-tuning
2. Preserves the LLM's general capabilities
3. Enables using the latest and best models without retraining
4. Makes the improvement process interpretable (source code diffs)
6 Key Results
Primary Metric: imp@50
The paper introduces imp@50 (improvement at 50 iterations) as the primary evaluation metric:
imp@50 = (best score at iteration 50) - (initial score at iteration 0)
This captures cumulative improvement rather than absolute final score, making it sensitive to systems that consistently improve versus those that make one lucky jump.
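Under that definition, the metric is a one-liner. A minimal sketch, assuming the input is a per-iteration score trajectory (index 0 is the initial agent) and that "best score at iteration 50" means the best score seen up to and including iteration 50:

```python
# imp@k: cumulative improvement over a run, per the formula above.
def imp_at_k(scores, k=50):
    """Best score within iterations 0..k minus the initial score."""
    window = scores[: k + 1]
    return max(window) - scores[0]

# A steadily climbing run vs. a static baseline:
steady = [0.0, 0.1, 0.2, 0.4, 0.71]
static = [0.63] * 51
```

A static baseline scores imp@50 = 0 by construction, no matter how strong its absolute score, which is exactly the sensitivity the paper wants.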
Domain-by-Domain Results
AI Paper Review
| System | Initial Score | Score at 50 Iterations | imp@50 |
|---|---|---|---|
| Static baseline | 0.630 | 0.630 | 0.000 |
| DGM (original) | 0.0 | ~0.4* | ~0.4* |
| DGM-H | 0.0 | 0.710 | 0.710 |
DGM-H improved from zero to 0.710, surpassing the established static baseline of 0.630. The system:

- Detected a 94% accept-rate imbalance in its own predictions and self-corrected
- Evolved from superficial evaluation to multi-stage review pipelines with explicit checklists
- Built its own performance tracking infrastructure
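The 94% accept-rate self-correction is the kind of check the system built for itself. The paper does not publish the exact code; this is a hypothetical sketch of such an imbalance detector, with `detect_imbalance` and its threshold chosen for illustration.

```python
from collections import Counter

# Hypothetical self-diagnosis: inspect the agent's own prediction
# distribution and flag a runaway majority class.
def detect_imbalance(predictions, threshold=0.9):
    """Return (is_imbalanced, majority_rate) over accept/reject predictions."""
    counts = Counter(predictions)
    majority_rate = max(counts.values()) / len(predictions)
    return majority_rate >= threshold, majority_rate

preds = ["accept"] * 94 + ["reject"] * 6
flagged, rate = detect_imbalance(preds)
```

A flagged imbalance can then trigger a meta-level edit, e.g. recalibrating the accept/reject threshold in the review pipeline.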
Polyglot Coding
| System | Initial Score | Score at 50 Iterations | imp@50 |
|---|---|---|---|
| DGM (original) | 0.084 | ~0.267 | ~0.183 |
| DGM-H | 0.084 | 0.267 | 0.183 |
DGM-H matches the original DGM on coding despite not being domain-specialized. This is notable because DGM was specifically designed for coding (where task and meta skills align), while DGM-H is domain-general.
Robotics Reward Design
| System | Initial Score | Score at 50 Iterations | imp@50 |
|---|---|---|---|
| No self-improvement | 0.060 | 0.060 | 0.000 |
| DGM-H | 0.060 | 0.372 | 0.312 |
The system learned to write increasingly effective reward functions for simulated quadruped locomotion, a task requiring understanding of physics and reinforcement learning dynamics.
Cross-Domain Transfer: The Headline Result
The most striking finding is cross-domain transfer of meta-level improvements:
| Configuration | Olympiad Math Score |
|---|---|
| Initial agent (no improvement) | 0.0 |
| DGM transfer agents | 0.0 |
| Standard DGM-H (no transfer) | 0.0 |
| DGM-H with transfer from paper review + robotics | 0.630 |
Transfer agents trained on paper review and robotics achieved 0.630 in Olympiad math evaluation, while all other configurations scored 0.0.
This demonstrates that the meta-level strategies discovered in unrelated domains—structured evaluation checklists, multi-stage analysis pipelines, persistent memory—are domain-general improvement strategies that transfer to new problems.
Ablation Results
The paper shows that all components are necessary:
| Configuration | Performance |
|---|---|
| No self-improvement | Baseline |
| Self-improvement without open-ended exploration | Limited gains |
| Open-ended exploration without metacognition (original DGM) | Domain-limited gains |
| Full DGM-H (metacognition + open-ended exploration) | Best across all domains |
Emergent Capabilities
Beyond raw performance numbers, DGM-H demonstrated emergent behaviors:
- Self-built performance tracking: The system independently created cross-generation performance trackers without being instructed to do so.
- Persistent memory synthesis: Agents evolved to maintain memory that "synthesizes insights rather than just logging numbers."
- Compute-aware planning: The system adjusted its modification strategies based on remaining compute budget—making broad exploratory changes early and targeted refinements later.
- Meta-reasoning in logs: In one documented instance, the system reasoned that generation 55 was "best accuracy but is too harsh" and decided to merge that version's rigorous reasoning with the stronger balance of a later generation.
7 Reproducibility
Code Availability
| Component | Status | Location |
|---|---|---|
| Core framework | Open source | github.com/facebookresearch/HyperAgents |
| Experiment logs | Provided as multi-part ZIP | outputs_os_parts.zip in repository |
| Docker container | Build instructions provided | Dockerfile in repository |
| Initial agents | Setup script provided | setup_initial.sh |
Repository Structure
HyperAgents/
├── agent/ # Foundation model interaction code
├── analysis/ # Plotting and analysis scripts
├── domains/ # Domain-specific code
│ ├── coding/ # Polyglot coding tasks
│ ├── paper_review/ # AI paper review
│ ├── robotics/ # Reward design
│ └── math/ # Olympiad math grading
├── utils/ # Common utilities
├── meta_agent.py # Main meta agent implementation
├── task_agent.py # Main task agent implementation
├── run_meta_agent.py # Script to run meta agent and get diffs
├── generate_loop.py # Entry point for running the algorithm
├── requirements.txt # Python dependencies
├── requirements_dev.txt # Development dependencies
├── Dockerfile # Container build
└── setup_initial.sh # Initial agent setup
Environment Setup
# API keys required
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
GEMINI_API_KEY=...
# System dependencies
sudo dnf install -y python3.12-devel graphviz graphviz-devel \
cmake ninja-build bzip2-devel zlib-devel ncurses-devel libffi-devel
# Python environment
python3.12 -m venv venv_nat
source venv_nat/bin/activate
pip install -r requirements.txt
pip install -r requirements_dev.txt
# Docker (sandboxed execution)
docker build --network=host -t hyperagents .
# Initialize agents
bash ./setup_initial.sh
# Run
python generate_loop.py --domains <domain>
Reproducibility Assessment
| Criterion | Rating | Notes |
|---|---|---|
| Code available | ✅ Strong | Full source on GitHub |
| Data available | ✅ Strong | Experiment logs provided |
| Dependencies specified | ✅ Strong | requirements.txt + Dockerfile |
| API keys required | ⚠️ Moderate | Requires OpenAI, Anthropic, Gemini API access |
| Compute cost | ⚠️ Moderate | Significant API costs for full reproduction |
| Determinism | ⚠️ Low | LLM non-determinism + stochastic exploration |
| Model versioning | ⚠️ Low | Results tied to specific model versions that may change |
Key Reproducibility Challenges
- API Cost: Full reproduction across all four domains with 50+ iterations requires significant API spend across three providers.
- Model Version Drift: Claude 4.5 Sonnet, o3-mini, and GPT-4o may be updated or deprecated, changing results.
- Non-Determinism: Self-modification is inherently stochastic; exact trajectories will differ.
- Safety Sandboxing: Reproducing the safety setup (sandboxed execution with human oversight) requires careful infrastructure.
8 Compute and API Costs
Per-Iteration Cost Structure
Each iteration of the DGM-H loop involves:
┌─────────────────────────────────────────────────┐
│ COST PER ITERATION │
│ │
│ 1. Meta Agent Prompt Construction │
│ - Read current agent source (~500-2000 LOC) │
│ - Read performance history │
│ - Read self-modification instructions │
│ Cost: ~5K-20K input tokens │
│ │
│ 2. LLM Call (Claude 4.5 Sonnet) │
│ - Generate code diff │
│ Cost: ~2K-10K output tokens │
│ │
│ 3. Apply Diff + Execute New Agent │
│ - Sandboxed execution (Docker) │
│ Cost: Compute time (seconds to minutes) │
│ │
│ 4. Evaluation │
│ - Domain-specific (may involve LLM calls) │
│ Cost: Varies by domain │
└─────────────────────────────────────────────────┘
Estimated Costs by Domain
| Domain | Iterations (typ.) | Modification Model | Evaluation Model | Est. Cost per Run |
|---|---|---|---|---|
| Polyglot Coding | 50 | Claude 4.5 Sonnet | o3-mini | $50–150 |
| Paper Review | 50 | Claude 4.5 Sonnet | GPT-4o | $30–100 |
| Robotics Reward | 50 | Claude 4.5 Sonnet | Simulation (free) | $20–60 |
| Olympiad Math | 50 | Claude 4.5 Sonnet | GPT-4o | $30–100 |
Total for full reproduction (all domains, multiple seeds): Estimated $500–2,000 in API costs.
Cost Comparison with Baselines
| System | Cost Model | Improvement Mechanism |
|---|---|---|
| Fine-tuning | Very High (GPU hours) | Weight updates |
| DGM (original) | Moderate (API calls) | Code modification (coding only) |
| DGM-H | Moderate (API calls) | Code modification (any domain) |
| RL from scratch | Very High (GPU hours) | Gradient descent |
Compute Infrastructure
The system runs on standard hardware with Docker for sandboxing:

- CPU: Standard development machine
- GPU: Not required (all compute is API-based)
- Memory: Minimal (programs are small; no large model loading)
- Network: Required for API calls
- Storage: Moderate (experiment logs, agent archives)
The key insight is that DGM-H is compute-light on local hardware because the heavy computation (LLM inference) is offloaded to API providers. This makes it accessible but creates API cost as the primary budget constraint.
9 Architecture Solution
System Architecture
┌────────────────────────────────────────────────────────────────────┐
│ DGM-HYPERAGENT SYSTEM ARCHITECTURE │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ GENERATE LOOP (generate_loop.py) │ │
│ │ │ │
│ │ for iteration in range(max_iterations): │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌──────────────────────┐ │ │
│ │ │ ARCHIVE │ │ SELECTION │ │ │
│ │ │ (stepping │────>│ (choose parent │ │ │
│ │ │ stones) │ │ agent to modify) │ │ │
│ │ └─────────────────┘ └──────────┬───────────┘ │ │
│ │ ↑ │ │ │
│ │ │ ▼ │ │
│ │ ┌─────┴───────────┐ ┌──────────────────────┐ │ │
│ │ │ EVALUATION │ │ META AGENT │ │ │
│ │ │ (domain- │<────│ (meta_agent.py) │ │ │
│ │ │ specific) │ │ │ │ │
│ │ └─────────────────┘ │ ┌─────────────────┐ │ │ │
│ │ │ │ Self-Modification│ │ │ │
│ │ │ │ Instructions │ │ │ │
│ │ ┌─────────────────┐ │ │ (EDITABLE) │ │ │ │
│ │ │ TASK AGENT │ │ └─────────────────┘ │ │ │
│ │ │ (task_agent.py) │<────│ │ │ │
│ │ │ (domain code) │ │ Uses: Claude 4.5 │ │ │
│ │ └─────────────────┘ │ Sonnet for diffs │ │ │
│ │ └──────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ SANDBOX (Docker) │ │
│ │ - Limited resources │ │
│ │ - Restricted internet access │ │
│ │ - Human oversight │ │
│ └──────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
The Self-Referential Loop
The architecture's defining feature is the self-referential loop. Unlike standard agent systems with fixed optimization procedures, DGM-H's optimization procedure is part of the optimized artifact:
STANDARD OPTIMIZATION HYPERAGENT OPTIMIZATION
────────────────────── ──────────────────────────
Fixed Optimizer ──> Agent ┌──────────────────────┐
↑ │ │ Agent + Optimizer │
│ │ │ (single program) │
└──evaluate───────┘ │ │
│ Optimizer modifies │
│ BOTH the agent │
│ AND itself │
└──────────┬───────────┘
│
└──evaluate──┘
Key Architectural Decisions
- Single Editable Program: Task agent and meta agent are unified in one Python program, not separate services or models. This enables self-reference without external coordination.
- LLM as Tool, Not Component: The LLM (Claude 4.5 Sonnet) is called as an API tool by the meta agent, not embedded in the architecture. The meta agent's instructions for how to use the LLM are editable, so the system can evolve its LLM interaction patterns.
- Archive as Memory: The growing archive of agent variants serves as both a population (for diversity) and a memory (stepping stones for future improvements).
- Separate Evaluation: Evaluation uses different models than modification, preventing self-deceptive optimization.
- Docker Sandboxing: All agent execution happens in sandboxed containers with resource limits and restricted network access. This is essential given that the system executes self-generated code.
Comparison to DGM (Original)
┌─────────────────────────────────────────────────────────────┐
│ │
│ DARWIN GÖDEL MACHINE (DGM) │
│ ───────────────────────── │
│ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Coding Agent │ === │ Meta Agent │ SAME agent │
│ │ (task) │ │ (modifier) │ (works because │
│ └─────────────┘ └─────────────┘ coding = coding) │
│ │
│ Fixed: modification instructions │
│ Fixed: selection strategy │
│ Fixed: evaluation │
│ │
│ Domain: Coding ONLY │
│ │
├─────────────────────────────────────────────────────────────┤
│ │
│ DGM-HYPERAGENTS (DGM-H) │
│ ──────────────────────── │
│ │
│ ┌───────────────────────────────┐ │
│ │ HYPERAGENT PROGRAM │ │
│ │ │ │
│ │ ┌─────────────┐ │ │
│ │ │ Task Agent │ Editable │ │
│ │ └─────────────┘ │ │
│ │ │ │
│ │ ┌─────────────┐ │ │
│ │ │ Meta Agent │ Editable │ │
│ │ │ (including │ │ │
│ │ │ its own │ │ │
│ │ │ instructions│ │ │
│ │ └─────────────┘ │ │
│ └───────────────────────────────┘ │
│ │
│ Editable: modification instructions ✅ │
│ Editable: selection strategy ✅ │
│ Editable: memory/tracking ✅ │
│ │
│ Domain: ANY computable task │
└─────────────────────────────────────────────────────────────┘
10 Component Breakdown
Component 1: Generate Loop (generate_loop.py)
Purpose: The outer loop that orchestrates the self-improvement process.
| Parameter | Specification |
|---|---|
| Entry point | python generate_loop.py --domains <domain> |
| Loop structure | Iterative: select parent → modify → evaluate → archive |
| Max iterations | Configurable (50 used in experiments) |
| Output | Agent variants saved to outputs/ directory |
| Parallelism | Sequential iterations (each depends on archive state) |
The generate loop implements the open-ended exploration framework inherited from DGM: maintain an archive of diverse, high-quality agent variants ("stepping stones") and iteratively produce new variants that build on existing ones.
Component 2: Meta Agent (meta_agent.py)
Purpose: The self-modifying component that generates code diffs for agent variants.
The meta agent reads the current agent's source code and generates a diff using Claude 4.5 Sonnet. Critically, the meta agent's own instructions—how it decides what to modify, what context to provide to the LLM, and how to structure its prompts—are part of the editable program.
Key operations:

- Code reading: Parse current agent source
- Context assembly: Gather performance history, previous modifications, self-modification instructions
- Diff generation: Prompt Claude 4.5 Sonnet for a code modification
- Diff application: Apply the generated diff to create a new agent variant
The helper script run_meta_agent.py wraps the meta agent execution and diff extraction.
Component 3: Task Agent (task_agent.py)
Purpose: The domain-specific component that solves the target task.
The task agent's structure depends on the domain:
| Domain | Task Agent Structure |
|---|---|
| Coding | Code generation + execution |
| Paper Review | Structured analysis pipeline + decision logic |
| Robotics | Reward function definition |
| Math Grading | Proof analysis + score assignment |
The task agent is the primary target of modification, but unlike traditional systems, it is not the only modifiable component.
Component 4: Archive (Stepping Stones)
Purpose: Maintains the population of agent variants for open-ended exploration.
The archive stores:

- Complete agent source code for each variant
- Evaluation scores (training and validation)
- Lineage information (which parent agent it was derived from)
- Modification descriptions

The archive enables:

- Diversity: Prevents convergence on a single lineage
- Backtracking: Can return to earlier promising variants
- Stepping stones: Current agents build on past discoveries
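An archive entry carrying those fields, plus a lineage walk for backtracking, might look like the following. This is an illustrative shape; the repository's on-disk format may differ, and `ArchiveEntry`/`lineage` are names invented here.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative archive record: source, scores, lineage, modification note.
@dataclass
class ArchiveEntry:
    source: str                   # complete agent source code
    train_score: float
    val_score: float
    parent: Optional[str] = None  # which variant this was derived from
    modification: str = ""        # description of the applied diff

def lineage(archive, name):
    """Walk parents back to the root: the stepping-stone path of a variant."""
    path, cur = [], name
    while cur is not None:
        path.append(cur)
        cur = archive[cur].parent
    return list(reversed(path))

archive = {
    "v0": ArchiveEntry("<code>", 0.10, 0.10),
    "v1": ArchiveEntry("<code>", 0.30, 0.25, parent="v0",
                       modification="add review checklist"),
    "v2": ArchiveEntry("<code>", 0.50, 0.45, parent="v1",
                       modification="multi-stage pipeline"),
}
```

Keeping the full lineage is what makes backtracking cheap: any earlier promising variant can be re-selected as a parent.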
Component 5: Domain Evaluators (domains/)
Purpose: Domain-specific evaluation of agent performance.
| Domain | Evaluator | Model Used |
|---|---|---|
| Coding | Code execution + test cases | o3-mini |
| Paper Review | Comparison to human decisions | GPT-4o |
| Robotics | Physics simulation | None (simulator) |
| Math Grading | Comparison to ground truth grades | GPT-4o |
Evaluators are external to the hyperagent—they cannot be modified by the meta agent. This prevents the system from learning to game its own evaluation.
Component 6: Safety Sandbox
Purpose: Contain the execution of self-generated code.
┌─────────────────────────────────────────┐
│ DOCKER CONTAINER │
│ │
│ ┌──────────────────────────────────┐ │
│ │ Agent Execution Environment │ │
│ │ │ │
│ │ - Resource limits (CPU, memory) │ │
│ │ - No/restricted internet │ │
│ │ - Time limits per execution │ │
│ │ - File system isolation │ │
│ └──────────────────────────────────┘ │
│ │
│ Human oversight at checkpoints │
└─────────────────────────────────────────┘
WARNING from README: "This repository involves executing untrusted, model-generated code. We strongly advise users to be aware of the associated safety risks."
11 Core Mechanisms (Detailed)
Mechanism 1: Metacognitive Self-Modification
This is the paper's central mechanism and its most novel contribution. Metacognitive self-modification means the system can modify the process by which it generates modifications.
Concrete example from the paper:
In the paper review domain, the meta agent initially used simple instructions like "improve the review process." Over successive iterations, the meta agent evolved its own instructions to include:
- Structured analysis checklists — multi-point evaluation criteria
- Decision rules — explicit thresholds for accept/reject
- Multi-stage pipelines — separate stages for different aspects of review
- Performance tracking — code to monitor and analyze its own accuracy
- Adaptive strategy — logic to merge strengths of different generations
None of these meta-level improvements were programmed by humans. They emerged from the self-modification process.
Mechanism 2: Open-Ended Exploration (from DGM)
DGM-H inherits the open-ended exploration framework from DGM:
OPEN-ENDED EXPLORATION CYCLE
─────────────────────────────
1. SELECT parent agent from archive
(diverse, high-quality variants)
│
▼
2. MODIFY parent to create child
(meta agent generates diff)
│
▼
3. EVALUATE child on task
(domain-specific evaluation)
│
▼
4. ARCHIVE child if it adds value
(quality + diversity criteria)
│
▼
5. REPEAT (growing archive of stepping stones)
Key property: The archive preserves diversity, not just the single best agent. This prevents premature convergence and enables exploration of qualitatively different strategies.
The open-endedness comes from the lack of a fixed objective—the system can discover novel strategies that weren't anticipated in the initial design, as long as they improve evaluation scores.
Mechanism 3: Cross-Domain Transfer
The most practically significant mechanism. Meta-level improvements transfer because they encode general improvement strategies, not domain-specific knowledge:
| Meta-Level Strategy | Discovered In | Transfers To |
|---|---|---|
| Structured evaluation checklists | Paper review | Math grading, robotics |
| Multi-stage analysis pipelines | Paper review | Math grading |
| Persistent memory synthesis | Emerged in multiple domains | All domains |
| Performance tracking infrastructure | Emerged in multiple domains | All domains |
| Compute-aware planning | Observed across runs | All domains |
| Bias detection and correction | Paper review (94% accept rate) | Any classification task |
Transfer mechanism:
Domain A (Paper Review) Domain B (Math Grading)
──────────────────────── ─────────────────────
[Agent with evolved meta strategies] [Initial agent]
│ ↑
└──transfer meta agent code──────────┘
│
[Agent with transferred
meta strategies, applied
to new domain]
│
Score: 0.0 → 0.630
(vs. 0.0 for all baselines)
Mechanism 4: Emergent Self-Built Infrastructure
The system autonomously develops tools and infrastructure it was never instructed to build:
Performance Tracker: The system built a cross-generation performance tracking system that monitors accuracy trends, identifies regression, and informs modification decisions.
Persistent Memory: Rather than simple logging, the system evolved memory that "synthesizes insights rather than just logging numbers." From the paper's logs:
Generation 55: "best accuracy but is too harsh" — decided to merge this version's rigorous reasoning with the stronger balance of a later generation.
Compute-Aware Planning: The system adjusts its exploration strategy based on remaining compute budget:
- Early iterations (high budget): Broad, exploratory changes (try fundamentally different approaches)
- Late iterations (low budget): Targeted refinements (polish the best approach)
This behavior was not programmed—it emerged from the self-modification process.
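A schedule with this shape can be sketched as a single function. The linear ramp, the breadth bounds, and all parameter names here are illustrative assumptions; the paper reports the behavior as emergent, not as an explicit formula.

```python
def exploration_breadth(remaining_budget: float, total_budget: float,
                        max_breadth: int = 8, min_breadth: int = 1) -> int:
    """Sketch of compute-aware planning: broad exploration while the
    budget is high, narrowing to targeted refinement as it runs out.
    The linear schedule is an assumption, not the paper's mechanism."""
    frac = max(0.0, min(1.0, remaining_budget / total_budget))
    return max(min_breadth, round(min_breadth + frac * (max_breadth - min_breadth)))
```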
Mechanism 5: Diff-Based Code Evolution
Agent modification happens through code diffs, not full program regeneration:
--- agent_v3.py
+++ agent_v4.py
@@ -45,3 +45,15 @@
 def evaluate_paper(paper_text: str) -> dict:
-    score = simple_score(paper_text)
-    return {"decision": "accept" if score > 0.5 else "reject"}
+    # Multi-stage evaluation pipeline (evolved)
+    novelty = assess_novelty(paper_text)
+    methodology = assess_methodology(paper_text)
+    clarity = assess_clarity(paper_text)
+
+    # Decision rule with calibrated thresholds (evolved)
+    weighted_score = 0.4 * novelty + 0.4 * methodology + 0.2 * clarity
+    confidence = compute_confidence(novelty, methodology, clarity)
+
+    return {
+        "decision": "accept" if weighted_score > 0.55 else "reject",
+        "confidence": confidence,
+        "reasoning": format_reasoning(novelty, methodology, clarity),
+    }
Diff-based modification has advantages over full regeneration:
- Incremental: Small changes are less likely to break working code
- Traceable: Each diff documents exactly what changed
- Composable: Multiple diffs can be applied sequentially
- Reversible: Failed diffs can be rolled back
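The traceability property can be demonstrated with Python's standard `difflib`, which produces the same unified-diff format shown above. The two agent snippets here are toy stand-ins, not code from the repository:

```python
import difflib

old = """def evaluate_paper(paper_text: str) -> dict:
    score = simple_score(paper_text)
    return {"decision": "accept" if score > 0.5 else "reject"}
"""
new = """def evaluate_paper(paper_text: str) -> dict:
    novelty = assess_novelty(paper_text)
    methodology = assess_methodology(paper_text)
    weighted = 0.5 * novelty + 0.5 * methodology
    return {"decision": "accept" if weighted > 0.55 else "reject"}
"""

# Generate a unified diff documenting exactly what changed between versions
diff = list(difflib.unified_diff(
    old.splitlines(keepends=True),
    new.splitlines(keepends=True),
    fromfile="agent_v3.py", tofile="agent_v4.py",
))
print("".join(diff))
```

Because every generation's diff is stored alongside the agent in the archive, the full lineage of any evolved behavior can be reconstructed by replaying diffs from the initial agent.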
Mechanism 6: Selection and Archival Strategy
The system uses a selection strategy to choose which parent agent to modify. Initially, selection is based on quality scores, but the selection strategy itself is editable—the meta agent can evolve how it chooses parents.
From the paper:
"Self-discovered selection strategies beat random selection but have not matched hand-designed mechanisms, suggesting the technology remains in early stages."
This is an honest assessment of current limitations: the meta-cognitive self-modification works but hasn't yet surpassed carefully designed baselines in all aspects.
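A hand-designed quality-weighted selector, the kind of baseline the self-discovered strategies are compared against, might look like the sketch below. The score floor of 0.05 and the entry schema are illustrative assumptions:

```python
import random

def quality_weighted_select(archive: list[dict], rng: random.Random) -> dict:
    """Sketch of a hand-designed parent-selection baseline: sample a
    parent with probability proportional to its score, with a small
    floor so zero-score agents remain reachable as stepping stones.
    In DGM-H this function itself would be editable by the meta agent."""
    weights = [entry["score"] + 0.05 for entry in archive]
    return rng.choices(archive, weights=weights, k=1)[0]
```

Making this function part of the editable program is what lets the meta agent evolve its own selection strategy, even if, per the paper, those evolved strategies have not yet matched hand-designed ones.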
12 Programming Language
Primary Language: Python 3.12
The entire system—generate loop, meta agent, task agent, and all evolved agents—is Python.
Why Python:
- LLMs are strongest at Python code generation
- Python's dynamic nature makes self-modification natural (source code as data)
- Rich ecosystem for all four evaluation domains
- Docker containerization for sandboxed execution
Dependencies
From the repository:
# Core dependencies (requirements.txt)
# Foundation model APIs
openai
anthropic
google-generativeai
# Utilities
graphviz # Visualization of agent lineages
numpy # Numerical operations
# ... additional domain-specific dependencies
Code Style
The evolved agent code ranges from simple scripts to multi-file programs with:
- Function definitions for task-specific logic
- Class definitions for structured state management
- String templates for LLM prompts
- Control flow for multi-stage pipelines
The code quality of evolved agents varies—some generations produce clean, well-structured code while others are more ad hoc. The system does not enforce style constraints beyond functional correctness.
Self-Modification as Code Transformation
The self-modification mechanism treats Python source code as a mutable artifact:
# Simplified conceptual model of meta agent operation
def meta_agent_modify(current_agent_source: str,
                      performance_history: dict,
                      modification_instructions: str) -> str:
    """Generate a modified version of the agent."""
    prompt = construct_prompt(
        current_code=current_agent_source,
        history=performance_history,
        instructions=modification_instructions,  # EDITABLE
    )
    diff = call_llm(prompt)  # Claude 4.5 Sonnet
    new_source = apply_diff(current_agent_source, diff)
    return new_source
The modification_instructions parameter is itself part of the editable program—so the system can evolve how it constructs the prompt, what context it includes, and what it asks the LLM to change.
13 Memory Management
Explicit Memory: Agent Archive
The archive is the system's primary long-term memory. Each entry stores:
| Field | Size (approx.) | Purpose |
|---|---|---|
| Agent source code | 1–50 KB | Complete program text |
| Evaluation scores | <1 KB | Training and validation metrics |
| Lineage | <1 KB | Parent agent reference |
| Generation number | <1 KB | Iteration when created |
| Diff from parent | 1–10 KB | What changed |
Archive size grows linearly with iterations. For 50 iterations with ~3 variants per iteration, the archive contains ~150 agents at ~10 KB average = ~1.5 MB total.
Implicit Memory: Evolved Agent State
A distinctive feature of DGM-H is that agents can evolve their own internal memory structures. The paper documents several emergent memory patterns:
- Performance logs: Agents that maintain records of their own accuracy across test cases
- Insight synthesis: Rather than raw logging, agents that distill patterns from performance data
- Strategy annotations: Code comments that document why certain approaches were adopted
- Cross-generation state: Information about what previous generations tried and what worked
This evolved memory is inside the agent's source code—it persists as part of the program text, not as a separate data structure.
Working Memory: LLM Context
During each modification step, the meta agent assembles a context window for the LLM call:
Context Window Contents:
1. Current agent source code (~5-20K tokens)
2. Performance history (~1-5K tokens)
3. Self-modification instructions (~1-3K tokens)
4. Previous modification attempts (~2-5K tokens)
Total: ~10-35K tokens per LLM call
This working memory is ephemeral—it exists only during the LLM call and is not persisted.
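Assembling that context under a token budget can be sketched as follows. The 4-characters-per-token heuristic, the truncation order (drop the oldest modification attempts first), and all names are assumptions for illustration:

```python
def assemble_context(agent_source: str, history: str, instructions: str,
                     attempts: list[str], max_tokens: int = 35_000) -> str:
    """Sketch of the meta agent's working-memory assembly. Uses the
    crude 4-chars-per-token heuristic; the real tokenizer and the
    truncation policy are assumptions, not from the paper."""
    def tokens(text: str) -> int:
        return len(text) // 4

    # Mandatory parts: current code, performance history, instructions
    parts = [agent_source, history, instructions]
    budget = max_tokens - sum(tokens(p) for p in parts)

    # Fill the remaining budget with previous attempts, newest first
    kept: list[str] = []
    for attempt in reversed(attempts):
        if tokens(attempt) > budget:
            break
        kept.append(attempt)
        budget -= tokens(attempt)
    return "\n\n".join(parts + list(reversed(kept)))
```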
Cross-Run Memory (Transfer Agents)
For cross-domain transfer, the meta agent's evolved code (including any self-built infrastructure) is preserved and applied to new domains. This constitutes a form of long-term meta-memory that persists across independent runs:
Run 1 (Paper Review):
Meta agent evolves: structured checklists, bias detection
→ Save meta agent code
Run 2 (Robotics):
Meta agent evolves: reward function analysis, physics reasoning
→ Save meta agent code
Run 3 (Math Grading):
Transfer meta agent from Runs 1+2
→ Immediately applies structured analysis from paper review
→ Score: 0.0 → 0.630 (vs. 0.0 for baselines)
Memory Hierarchy Summary
┌─────────────────────────────────────────┐
│ MEMORY HIERARCHY │
│ │
│ Long-term: Agent Archive │
│ (all variants, scores, │
│ lineages) │
│ │
│ Medium-term: Evolved Agent Memory │
│ (self-built tracking, │
│ insight synthesis) │
│ │
│ Short-term: LLM Context Window │
│ (current code + history │
│ + instructions) │
│ │
│ Cross-run: Transfer Agents │
│ (meta strategies that │
│ persist across domains) │
│ │
└─────────────────────────────────────────┘
14 Continued Learning
Within-Run Learning
Each run of DGM-H demonstrates clear learning curves:
| Domain | Initial Score | Iteration 10 | Iteration 25 | Iteration 50 |
|---|---|---|---|---|
| Paper Review | 0.0 | ~0.3 | ~0.5 | 0.710 |
| Coding | 0.084 | ~0.15 | ~0.22 | 0.267 |
| Robotics | 0.060 | ~0.15 | ~0.25 | 0.372 |
Learning is not monotonic—the archive preserves diversity, so individual agent scores may fluctuate while the archive's best score generally increases.
Across-Run Learning (Cross-Domain Transfer)
The paper's strongest evidence for continued learning is cross-domain transfer:
Transfer Protocol:
1. Train DGM-H on Domain A (e.g., paper review) for 50 iterations
2. Extract the meta agent's evolved code
3. Initialize DGM-H on Domain B (e.g., math grading) with the transferred meta agent
4. Run DGM-H on Domain B
Results: Transfer agents achieve non-zero performance on domains where all other approaches score 0.0 (Olympiad math). This demonstrates that meta-level strategies generalize.
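The four-step protocol can be written down as a thin orchestration function. Every callable here is a hypothetical stand-in for a stage of the real pipeline, not an interface from the repository:

```python
def transfer_run(train, extract_meta, init_with_meta, domain_a, domain_b):
    """Sketch of the cross-domain transfer protocol. `train`,
    `extract_meta`, and `init_with_meta` are hypothetical stand-ins
    for the real training, extraction, and initialization stages."""
    run_a = train(domain_a, meta=None)                 # 1. train on Domain A
    meta = extract_meta(run_a)                         # 2. extract evolved meta agent
    run_b = init_with_meta(domain_b, meta)             # 3. seed Domain B with it
    return train(domain_b, meta=meta, resume=run_b)    # 4. run on Domain B
```

The important property the sketch makes explicit is that only the meta agent's code crosses the domain boundary; no task-level knowledge from Domain A is reused directly.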
Accumulation Across Runs
The paper states that meta-level improvements "accumulate across runs":
Run 1: Meta agent learns structured evaluation
Run 2: Meta agent (seeded from Run 1) learns performance tracking
Run 3: Meta agent (seeded from Run 2) learns compute-aware planning
...
Run N: Meta agent has accumulated N runs worth of meta-strategies
Each run adds to the meta agent's repertoire of improvement strategies. This accumulation is the basis for the paper's claim of potentially self-accelerating progress.
What the System Learns to Do Better
Over successive iterations and runs, DGM-H demonstrably improves at:
- Prompting the LLM — more effective instructions for code modifications
- Selecting what to modify — targeting the most impactful parts of the code
- Evaluating its own progress — building performance tracking tools
- Managing compute budget — adjusting exploration breadth to remaining iterations
- Synthesizing insights — combining strengths of multiple generations
- Detecting biases — identifying and correcting systematic errors (e.g., 94% accept rate)
Limitations of Current Learning
- No formal learning guarantees: The system can stagnate on difficult tasks.
- Selection strategy gap: Self-discovered selection strategies haven't yet matched hand-designed ones.
- Sample efficiency: Significant improvement requires roughly 50 iterations, and each iteration involves multiple LLM API calls.
- Evaluation ceiling: Performance is bounded by the quality of the (fixed) evaluation mechanism.
Relationship to Open-Ended Learning
DGM-H is positioned within the open-ended learning paradigm:
| Property | Traditional RL | Standard DGM | DGM-H |
|---|---|---|---|
| Fixed objective | Yes | Partially | No |
| Fixed improvement mechanism | Yes | Yes | No |
| Domain-specific | Yes | Yes (coding) | No |
| Open-ended discovery | No | Yes (coding) | Yes (any domain) |
| Meta-level learning | No | Implicit | Explicit |
| Cross-domain transfer | No | No | Yes |
15 Applications
Direct Applications
1. Automated AI Research Agent Development
The most immediate application is using DGM-H to evolve better AI research agents. Given a research task (e.g., literature review, experiment design, data analysis), DGM-H can evolve agents that improve at these tasks over time.
Connection to AIRS-Bench: Coauthors Foerster and Shavrina also developed AIRS-Bench (February 2026), a benchmark for AI research science agents. DGM-H could be applied to evolve agents that score well on AIRS-Bench, creating a self-improving cycle where the benchmark and the agents co-evolve.
2. Automated Scientific Discovery
DGM-H's domain-agnostic self-improvement makes it applicable to scientific discovery workflows:
| Scientific Task | How DGM-H Could Help |
|---|---|
| Hypothesis generation | Evolve agents that generate higher-quality hypotheses |
| Experimental design | Evolve agents that design more informative experiments |
| Data analysis | Evolve agents that extract more insight from data |
| Paper writing | Evolve agents that produce better scientific writing |
| Peer review | Demonstrated in paper (paper review domain) |
3. Reward Function Engineering for RL
The robotics reward design domain demonstrates that DGM-H can evolve reward functions for reinforcement learning. This has direct applications in:
- Robotic locomotion (demonstrated)
- Manipulation tasks
- Autonomous driving reward design
- Game AI reward shaping
The connection to Eureka (Ma et al., 2024) is explicit—DGM-H extends the LLM-based reward design paradigm with self-improving meta-level strategies.
4. Automated Evaluation and Grading
The paper review and math grading domains demonstrate DGM-H's ability to evolve evaluation agents:
- Conference paper review (demonstrated)
- Student work grading
- Code review
- Proposal evaluation
- Application screening
5. Agent Infrastructure Development
The emergent self-built infrastructure capability suggests using DGM-H to evolve agent tools and frameworks:
- Performance monitoring dashboards
- Error analysis pipelines
- A/B testing frameworks for agent variants
- Automated debugging tools
Broader Implications
The Self-Improvement Landscape (March 2026)
The paper arrives in a competitive landscape:
| System | Organization | Self-Improvement Type |
|---|---|---|
| DGM-H | Meta FAIR + Labs | Metacognitive, cross-domain |
| DGM (original) | Sakana AI | Coding-only |
| M2.7 | MiniMax | In-training self-evolution |
| Codex 5.3 | OpenAI | Self-assisted development |
| Karpathy Loop | Independent | Autonomous experiment loops |
| AlphaEvolve | Google DeepMind | Evolutionary code optimization |
DGM-H's unique position: cross-domain transfer of improvement strategies. While all other systems improve within their training domain, DGM-H demonstrates that meta-level improvement strategies can transfer to entirely new domains.
The "Improving How to Improve" Hierarchy
DGM-H introduces a hierarchy of improvement levels:
Level 0: Static agent (no improvement)
→ Fixed performance
Level 1: Self-improving agent (standard DGM, RL)
→ Task performance improves
→ Improvement rate is fixed
Level 2: Meta-self-improving agent (DGM-H)
→ Task performance improves
→ Improvement RATE improves
→ Potential for acceleration
Level 3: (Theoretical) Recursively meta-improving
→ Improvement of improvement of improvement...
→ Unbounded acceleration potential
→ Not yet demonstrated
DGM-H achieves Level 2 in practice. Level 3 remains theoretical but is the logical endpoint of the framework.
Safety Implications
The paper is commendably frank about safety concerns:
"We discuss what safety entails in this setting and the broader implications of self-improving systems."
Current safety measures:
- Docker sandboxing with resource limits
- Restricted internet access during agent execution
- Human oversight at checkpoints
- Separation of modification and evaluation models
Unresolved safety questions:
1. Capability acceleration: If meta-level improvement accumulates across runs, how fast could capabilities evolve?
2. Sandbox escape: Could sufficiently capable self-modifying code find ways to bypass containment?
3. Deceptive alignment: Could an agent evolve to appear aligned during evaluation while pursuing different objectives?
4. Scalability of oversight: Can human oversight scale as fast as the system's self-improvement rate?
Connection to Evolutionary Computation
DGM-H has deep structural parallels with evolutionary computation:
| Concept | Evolution | DGM-H |
|---|---|---|
| Individual | Organism | Agent (Python program) |
| Genotype | DNA | Source code |
| Phenotype | Physical traits | Agent behavior |
| Fitness | Reproductive success | Evaluation score |
| Mutation | Random DNA changes | LLM-generated code diffs |
| Selection | Natural selection | Archive-based selection |
| Population | Species | Archive of agent variants |
| Adaptation | Phenotypic change | Improved task performance |
| Evolvability | Ability to evolve | Meta-level self-modification |
The analogy to evolvability is particularly apt: biological evolution has evolved mechanisms that make future evolution more effective (e.g., sexual reproduction, modular body plans, regulatory gene networks). DGM-H's metacognitive self-modification is the computational analogue—evolving mechanisms that make future improvement more effective.
Limitations and Open Questions
- Sample efficiency: 50 iterations is a significant investment in API calls. Can the system learn faster?
- Evaluation dependence: Performance is bounded by the quality of fixed evaluation. What happens when evaluation is imperfect or gameable?
- Selection strategy gap: Self-discovered selection hasn't matched hand-designed selection. This is a ceiling on the meta-improvement's current quality.
- Scale of programs: Current agents are relatively small programs. How does the approach scale to complex, multi-file systems?
- Two-player dynamics: The paper doesn't extensively test adversarial domains where the opponent is also improving.
- Theoretical foundations: The paper is empirical. There is no formal characterization of when or why metacognitive self-modification should work, or what its limits are.
- Safety at scale: All experiments were small-scale with human oversight. The safety properties have not been tested at deployment scale.
Related Conceptual Connections
Connection to the Original Gödel Machine
The name "Gödel Machine" (Schmidhuber, 2007) refers to a theoretical self-referential universal problem solver that can rewrite any part of its own code—including the code that does the rewriting—provided it can prove that the modification is beneficial. The Darwin Gödel Machine relaxes the proof requirement (using empirical evaluation instead) and adds evolutionary diversity (the "Darwin" aspect).
DGM-H goes further by making the rewriting mechanism itself editable, closing the final gap in the self-referential loop: where Schmidhuber's machine could modify its own rewriting code only under a proof of benefit, DGM-H drops the proof requirement entirely and relies on empirical evaluation plus evolutionary selection.
Connection to Quality-Diversity Algorithms
The archive of stepping stones connects to the quality-diversity (QD) literature, particularly MAP-Elites (Mouret & Clune, 2015). QD algorithms maintain an archive of diverse, high-performing solutions indexed by behavioral descriptors. DGM-H's archive serves a similar function—maintaining diversity to enable future stepping-stone discoveries.
Connection to AutoML and Neural Architecture Search
DGM-H can be viewed as a generalization of AutoML/NAS to arbitrary program optimization. While AutoML searches over model architectures and hyperparameters, DGM-H searches over entire agent programs—including the meta-level procedures that guide the search.
| Dimension | AutoML/NAS | DGM-H |
|---|---|---|
| Search space | Architectures/hyperparameters | Complete programs |
| Search method | Fixed (Bayesian opt., RL, evolution) | Evolvable (self-modifying) |
| Objective | Fixed metric | Fixed (but meta-strategies evolve) |
| Transfer | Limited | Cross-domain meta transfer |
Connection to Program Synthesis Literature
DGM-H's use of LLMs for code generation places it at the intersection of program synthesis and self-improving systems. The diff-based modification approach is related to:
- AlphaCode (Li et al., 2022): Large-scale sampling and filtering of programs
- AlphaEvolve (Novikov et al., 2025): Evolutionary algorithm applied to codebases with LLM as mutation
- Reflexion (Shinn et al., 2023): Verbal reinforcement learning for code refinement
DGM-H differentiates itself by making the refinement procedure itself evolvable, a level of self-reference absent from these prior systems.
This analysis is based on the paper as published on arXiv (2603.19461v1, March 19, 2026), the Meta AI research publication (March 24, 2026), the open-source repository at github.com/facebookresearch/HyperAgents, and detailed coverage by WinBuzzer (March 31, 2026). The paper was authored by researchers across University of British Columbia, Vector Institute, University of Edinburgh, NYU, FAIR at Meta, and Meta Superintelligence Labs.