About This Survey

This is a comprehensive, continuously evolving survey of LLM-powered self-evolving systems — the rapidly growing family of AI systems that use large language models to autonomously discover, optimize, and improve algorithms, code, and research artifacts. The survey covers 62 systems spanning evolutionary code optimization, autonomous research pipelines, self-improving agents, and harness frameworks from 2024 to 2026.

A Self-Evolving Survey

This site is itself an evolving artifact, continuously updated by OmniEvolve — an agentic research platform created by Remek Kinas (launch coming soon). The content improves on three levels:

  1. New papers are discovered and added — as new systems, papers, and frameworks appear in the field, they are added to the database and processed by OmniEvolve to generate new survey chapters.
  2. Existing content self-improves — OmniEvolve continuously re-reviews existing chapters, identifies weaknesses, and autonomously rewrites them to raise quality scores. Reviewer feedback accumulates into recurring patterns that prevent repeated mistakes.
  3. The prompts themselves evolve — using a GEPA-inspired evolutionary strategy, the writing prompt template is treated as an optimizable artifact. A population of prompt variants is evaluated against the reviewer, and the best-performing variant is selected for future chapters. This closes the meta-optimization loop: the system that writes chapters is itself getting better over time.

The full evolution history — every score change, every improvement round, the cumulative total score over time — is tracked on the Page History page.

Generation Pipeline

Each chapter goes through an autonomous write–review–fix loop:

  1. Source material loaded from papers, repositories, and Obsidian notes
  2. Claude generates chapter HTML with equations, code examples, and SVG diagrams
  3. GPT-5.4 reviews the chapter across 8 weighted dimensions (PhD-level rubric)
  4. If the weighted score < 8.5, Claude revises based on reviewer feedback
  5. Up to 4 revision rounds per chapter, with cumulative feedback across all chapters
  6. Accepted chapters are built into this static site and deployed automatically

The pipeline runs autonomously — no human in the loop for content generation, review, or deployment. The entire system is open source.
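The write-review-fix loop above can be sketched in a few lines. This is a minimal illustration, not OmniEvolve's actual code: the `write` and `review` callables, the `Review` and `Outcome` types, and the choice to return the last draft when the revision budget runs out are all assumptions made for the example. Only the 8.5 threshold and the cap of 4 revision rounds come from the text.

```python
from dataclasses import dataclass
from typing import Callable, List

TARGET_SCORE = 8.5   # acceptance threshold stated in the text
MAX_REVISIONS = 4    # revision cap stated in the text


@dataclass
class Review:
    score: float     # weighted score from the reviewer
    feedback: str    # comments passed to the next revision


@dataclass
class Outcome:
    draft: str
    score: float
    revisions: int
    accepted: bool


def write_review_fix(
    source: str,
    write: Callable[[str, List[str]], str],   # hypothetical writer interface
    review: Callable[[str], Review],          # hypothetical reviewer interface
) -> Outcome:
    """Generate a chapter, then revise until accepted or the budget is spent."""
    feedback_log: List[str] = []
    draft = write(source, feedback_log)       # initial generation
    result = review(draft)
    revisions = 0
    while result.score < TARGET_SCORE and revisions < MAX_REVISIONS:
        feedback_log.append(result.feedback)  # feedback accumulates across rounds
        draft = write(source, feedback_log)
        result = review(draft)
        revisions += 1
    # Assumption: an unaccepted draft is returned with accepted=False.
    return Outcome(draft, result.score, revisions, result.score >= TARGET_SCORE)
```

In a real pipeline `write` and `review` would wrap model calls; here they are plain callables so the control flow can be exercised with stubs.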

GEPA Prompt Evolution

The generation pipeline itself is subject to evolutionary optimization. Inspired by GEPA (Chapter 7) and GEPA Skills (Chapter 12), we treat the writer prompt template as an evolvable text artifact and use the reviewer’s 8-dimension scoring as a fitness function with structured Actionable Side Information (ASI).
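One way to picture the structured ASI is as a small record per review. The sketch below is an assumption-laden illustration: the class name `ReviewASI`, its field names, and the 0 to 10 score scale are hypothetical, chosen to mirror the items the reviewer is described as returning (dimension scores, strengths, weaknesses, priority feedback).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReviewASI:
    """Hypothetical container for one review's Actionable Side Information."""
    scores: Dict[str, float]                       # per-dimension scores (assumed 0-10)
    strengths: List[str] = field(default_factory=list)
    weaknesses: List[str] = field(default_factory=list)
    priority_feedback: List[str] = field(default_factory=list)

    def weakest_dimensions(self, k: int = 2) -> List[str]:
        """Return the k lowest-scoring dimensions, lowest first."""
        return sorted(self.scores, key=self.scores.get)[:k]

    def to_prompt_block(self) -> str:
        """Render the ASI as plain text that a reflection model could read."""
        lines = [f"{dim}: {score}" for dim, score in self.scores.items()]
        lines += [f"Strength: {s}" for s in self.strengths]
        lines += [f"Weakness: {w}" for w in self.weaknesses]
        lines += [f"Priority: {p}" for p in self.priority_feedback]
        return "\n".join(lines)
```

Keeping the per-dimension scores alongside the free-text feedback lets a mutation step target the weakest dimensions directly instead of reacting only to a single scalar fitness value.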

The evolutionary loop works as follows:

  1. Seed. The current prompt template (or the best previously evolved variant) becomes the seed artifact.
  2. Initialize population. An LLM proposes 4 prompt variants using different mutation strategies: refine (targeted fixes), combine (merge strengths from multiple variants), simplify (remove contradictory instructions), and specialize (strengthen lowest-scoring dimensions).
  3. Evaluate. Each variant generates a chapter, which is then reviewed by GPT-5.4. The 8-dimension scores, strengths, weaknesses, and priority feedback form the ASI.
  4. Reflect & mutate. A reflection model receives the entire ranked population with ASI and proposes improved variants for the next generation — the same mechanism GEPA uses for skill optimization.
  5. Select. The best-performing variant (elite) survives unchanged; remaining slots are filled by new mutations.
  6. Repeat for up to 3 generations of 5 variants each (the elite plus 4 new mutations), with early stopping: if the elite variant reaches the target score (≥ 8.5), evolution halts immediately to avoid wasting budget. The best prompt is saved and used for all subsequent chapter generation.
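The six steps above can be sketched as a small elitist loop. This is a simplified illustration under stated assumptions, not the actual OmniEvolve or GEPA implementation: `evaluate` and `mutate` are hypothetical callables standing in for the write-and-review step and the reflection model, the `Candidate` type is invented for the example, and the elite is assumed to keep its score without being re-evaluated.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

STRATEGIES = ["refine", "combine", "simplify", "specialize"]  # named in the text
TARGET_SCORE = 8.5     # early-stopping threshold stated in the text
MAX_GENERATIONS = 3    # generation cap stated in the text


@dataclass
class Candidate:
    prompt: str
    score: float   # fitness from the reviewer
    asi: str       # feedback text attached to the score


def evolve_prompt(
    seed: str,
    evaluate: Callable[[str], Tuple[float, str]],    # hypothetical: prompt -> (score, asi)
    mutate: Callable[[List[Candidate], str], str],   # hypothetical: (ranked, strategy) -> prompt
) -> Candidate:
    """Return the best prompt found within the generation budget."""

    def scored(prompt: str) -> Candidate:
        score, asi = evaluate(prompt)
        return Candidate(prompt, score, asi)

    ranked = [scored(seed)]                          # step 1: seed
    for _ in range(MAX_GENERATIONS):
        if ranked[0].score >= TARGET_SCORE:          # early stopping
            break
        # Steps 2 and 4: the mutator sees the whole ranked population.
        children = [scored(mutate(ranked, s)) for s in STRATEGIES]
        # Step 5: the elite survives unchanged next to 4 new variants.
        ranked = sorted([ranked[0]] + children,
                        key=lambda c: c.score, reverse=True)
    return ranked[0]
```

Each generation costs four evaluations, so the early-stopping check at the top of the loop is what keeps the budget from being spent once the elite already meets the target.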

During continuous improvement runs (--all --loop --gepa), the prompt is periodically re-evolved as cumulative reviewer feedback grows, which gives the reflection model richer ASI to work with. As a result, the prompt improves over time, not just the chapters it produces.

[Figure: GEPA prompt evolution loop. Seed prompt (write_chapter.md) → population of 5 prompt variants (P1 to P5) → evaluate (Claude writes a chapter, GPT-5.4 reviews it across 8 dimensions, yielding fitness and ASI feedback) → reflect and mutate (an LLM analyzes the ASI and proposes improvements for the next generation) → best prompt (write_chapter_latest.md), used for all future chapters. Mutation strategies: refine, combine, simplify, specialize, driven by ASI feedback. Cumulative patterns: cross-chapter ASI and recurring reviewer insights.]

Scoring Dimensions

Dimension                  Weight  Description
Technical Accuracy         2.0     Correctness of algorithms, equations, and claims
Depth & Completeness       2.0     Coverage of topic, no major gaps
Mathematical Rigor         1.5     Proper notation, derivations, proofs where needed
Code Examples              1.5     Working, idiomatic code illustrating key concepts
Diagrams & Illustrations   1.5     Clear visual explanations of architectures and flows
Clarity & Accessibility    1.5     Readable by informed practitioners, good structure
Research Contribution      2.0     Novel synthesis, comparative analysis, insights
Writing Quality            1.0     Prose quality, grammar, consistent voice
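As a worked illustration of how the weights might combine into a single score, the sketch below computes a weighted mean. The weights are taken from the table; the normalization by the total weight (13.0) and the 0 to 10 per-dimension scale are assumptions, since the page does not state the exact aggregation formula. The function name `weighted_score` is likewise hypothetical.

```python
from typing import Dict

# Weights copied from the scoring table above.
WEIGHTS: Dict[str, float] = {
    "Technical Accuracy": 2.0,
    "Depth & Completeness": 2.0,
    "Mathematical Rigor": 1.5,
    "Code Examples": 1.5,
    "Diagrams & Illustrations": 1.5,
    "Clarity & Accessibility": 1.5,
    "Research Contribution": 2.0,
    "Writing Quality": 1.0,
}


def weighted_score(scores: Dict[str, float]) -> float:
    """Weighted mean of per-dimension scores (assumed aggregation rule)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())   # 13.0 for the table above
    return sum(WEIGHTS[d] * scores[d] for d in WEIGHTS) / total_weight


# Usage: a chapter scoring 9 everywhere except a 6 for Writing Quality.
example = {dim: 9.0 for dim in WEIGHTS}
example["Writing Quality"] = 6.0
overall = weighted_score(example)          # (9 * 12 + 6 * 1) / 13
```

Under this assumed rule, the three dimensions with weight 2.0 together account for 6 of the 13 weight units, so a weak score in any one of them pulls the total down twice as much as the same shortfall in Writing Quality.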

Author

Remek Kinas

Creator of OmniEvolve — a research platform for evolutionary algorithm discovery, heuristic optimization, and benchmark-driven experimentation with LLMs.

Citation & Attribution

If this survey inspires you, or if you use its knowledge, analysis, or methodology, build on its ideas, reference the systems it covers, or incorporate its insights into your own research, product, blog post, presentation, or project, please share and cite this work. A citation or link helps others discover this resource and supports the continued development of OmniEvolve.

Even if you don’t quote directly — if this survey helped you understand a system, shaped your thinking, or saved you research time — I would greatly appreciate a mention or backlink to https://evo.si5.pl.

BibTeX:

@misc{kinas2026evosurvey,
  author       = {Kinas, Remek},
  title        = {Evolutionary {AI} Research: A Comprehensive Survey of {LLM}-Powered
                  Self-Evolving Systems (2024--2026)},
  year         = {2026},
  howpublished = {\url{https://evo.si5.pl}},
  note         = {Continuously updated by OmniEvolve. Accessed: 2026}
}

Plain text:

Kinas, R. (2026). Evolutionary AI Research: A Comprehensive Survey of LLM-Powered Self-Evolving Systems (2024–2026). https://evo.si5.pl

This survey and all generated content are licensed under CC BY-ND 4.0. You are free to share the material for any purpose, provided you give appropriate credit with a link to https://evo.si5.pl. No derivatives — please link to the original rather than republishing modified versions.

License

Creative Commons Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0)