SKILLC: Learning Autonomous Skill Internalization in LLM Agents via Contrastive Credit Assignment
SkillC, a new framework from arXiv cs.AI, addresses the challenge of skill internalization in LLM agents for long-horizon reinforcement learning. Unlike prior methods that only use skill-helpfulness contrast for curriculum control, SkillC directly converts this contrast into a learning signal via Contrastive Skill Credit Assignment (CSCA). It samples paired skill-injected and skill-free rollouts for tasks from active skill types within the same policy update, and injects their task-level contrast into optimization using a dual-stream advantage estimator. This estimator preserves global ranking while applying a one-sided correction toward skill-free success. A smoothed validation-level signal drives an adaptive curriculum over attribution strength, rollout allocation, and monotonic active-set pruning. Experiments on ALFWorld and WebShop demonstrate improved autonomous performance compared to baselines. The paper was published on May 28, 2026, under arXiv ID 2605.27899.
SkillC enables LLM agents to perform tasks autonomously, without external skill prompts, improving efficiency in long-horizon RL.
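The summary describes the dual-stream advantage estimator only in words, so the following is a minimal sketch of one way such an estimator could look, not the paper's actual formula. It assumes binary task returns (1 for success, 0 for failure), a pooled group-normalised baseline as the "global" stream, and a non-negative bonus for successful skill-free rollouts as the "one-sided correction". The function name `dual_stream_advantages`, the weight `lam`, and the definition of the task-level contrast as a difference of mean returns are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def dual_stream_advantages(skill_returns, free_returns, lam=0.5, eps=1e-8):
    """Illustrative sketch only; not the estimator from the SkillC paper.

    skill_returns: returns of skill-injected rollouts for one task (assumed 0/1).
    free_returns:  returns of skill-free rollouts for the same task (assumed 0/1).
    lam:           hypothetical attribution strength for the contrast stream.
    """
    # Stream 1 (global): normalise every rollout against the pooled group,
    # so successes keep a higher advantage than failures.
    pooled = list(skill_returns) + list(free_returns)
    mu = mean(pooled)
    sigma = pstdev(pooled)
    base_skill = [(r - mu) / (sigma + eps) for r in skill_returns]
    base_free = [(r - mu) / (sigma + eps) for r in free_returns]

    # Task-level contrast (assumed form): how much the skill helps on this task.
    gap = mean(skill_returns) - mean(free_returns)

    # Stream 2 (one-sided): only successful skill-free rollouts get a bonus,
    # and only when the skill currently helps (gap > 0). Nothing is penalised.
    bonus = lam * max(0.0, gap)
    adj_free = [a + bonus if r > 0 else a for a, r in zip(base_free, free_returns)]

    return base_skill, adj_free


if __name__ == "__main__":
    # Paired rollouts for one task: the skill helps (3/4 vs 1/4 success).
    skill_adv, free_adv = dual_stream_advantages([1, 1, 1, 0], [1, 0, 0, 0])
    print("skill-injected advantages:", skill_adv)
    print("skill-free advantages:   ", free_adv)
```

Under these assumptions, a skill-free success on a task where the skill still helps receives a larger advantage than a skill-injected success, which pushes the policy toward solving the task without the skill. Because the bonus is never negative and is applied only to successes, every successful rollout still ranks above every failed one.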