SkillOpt: Executive Strategy for Self-Evolving Agent Skills
SkillOpt, described in an arXiv preprint (2605.23904) published on May 25, 2026, is a method for optimizing agent skills. Where current methods rely on hand-crafted or one-shot generated skills, SkillOpt treats the skill document as external state that a separate optimizer model improves iteratively. The optimizer applies bounded add/delete/replace edits based on scored rollouts and accepts an edit only when it strictly improves a held-out validation score. For stability, it uses a textual learning-rate budget, a rejected-edit buffer, and epoch-wise slow/meta updates, all without adding inference-time model calls.
The system was evaluated across six benchmarks (including diverse reasoning and coding tasks), seven target models (e.g., GPT-4, Claude, Codex), and three execution harnesses (direct chat, Codex, Claude Code). SkillOpt achieved the best or tied-best performance on all 52 evaluated (model, benchmark, harness) cells, outperforming human-written skills, one-shot LLM-generated skills, and prior self-revision methods such as Trace2Sk.
Why it matters: SkillOpt enables reliable, automated skill improvement without extra inference cost, making agent systems more robust.
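The accept-only-on-strict-improvement loop described above can be illustrated with a minimal Python sketch. This is not the paper's implementation: every name here (`apply_edit`, `optimize_skill`, `toy_propose`, `toy_validate`, `CANDIDATES`) is hypothetical, the "learning-rate budget" is modeled simply as a cap on edits per step (an assumption), and the epoch-wise slow/meta updates are omitted. In a real system the proposer would presumably be the optimizer model reading scored rollouts, and the validator would score rollouts on a held-out set.

```python
from typing import Callable, List, Tuple

# Hypothetical edit format: (op, line_index, text), op in {"add", "delete", "replace"}.
Edit = Tuple[str, int, str]
Skill = List[str]


def apply_edit(skill: Skill, edit: Edit) -> Skill:
    """Return a new skill document with one add/delete/replace edit applied."""
    op, idx, text = edit
    lines = list(skill)
    if op == "add":
        lines.insert(idx, text)
    elif op == "delete":
        del lines[idx]
    elif op == "replace":
        lines[idx] = text
    else:
        raise ValueError(f"unknown op: {op}")
    return lines


def optimize_skill(
    skill: Skill,
    propose: Callable[[Skill, List[List[Edit]]], List[Edit]],
    validate: Callable[[Skill], float],
    steps: int,
    edit_budget: int,
) -> Tuple[Skill, float, List[List[Edit]]]:
    """Keep a candidate only if it strictly improves the validation score."""
    best, best_score = list(skill), validate(skill)
    rejected: List[List[Edit]] = []  # buffer of edit batches that did not help
    for _ in range(steps):
        # Assumption: the budget caps how many edits one step may apply.
        edits = propose(best, rejected)[:edit_budget]
        if not edits:
            break
        candidate = best
        for edit in edits:
            candidate = apply_edit(candidate, edit)
        score = validate(candidate)
        if score > best_score:  # strict improvement required
            best, best_score = candidate, score
        else:
            rejected.append(edits)  # remembered so it is not proposed again
    return best, best_score, rejected


# Toy stand-ins for the optimizer model and the held-out validation set.
CANDIDATES: List[List[Edit]] = [
    [("add", 0, "Be verbose.")],
    [("add", 0, "Run the tests before answering.")],
    [("add", 0, "Always cite the file you changed.")],
]


def toy_propose(skill: Skill, rejected: List[List[Edit]]) -> List[Edit]:
    """Return the first batch that was neither rejected nor already applied."""
    for batch in CANDIDATES:
        applied = all(text in skill for _, _, text in batch)
        if batch not in rejected and not applied:
            return batch
    return []


def toy_validate(skill: Skill) -> float:
    """Toy score: number of target keywords present in the skill text."""
    text = " ".join(skill)
    return float(sum(keyword in text for keyword in ("tests", "cite")))


final_skill, final_score, rejected_edits = optimize_skill(
    ["Answer the user."], toy_propose, toy_validate, steps=10, edit_budget=1
)
```

In this toy run the "Be verbose." edit leaves the score unchanged, so it lands in the rejected buffer, while the two edits that raise the score are kept. Because the loop only rewrites the skill text between runs, using the final skill requires no additional model calls at inference time.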