Distilling Game Code World Model Generation into Lightweight Large Language Models
A new paper on arXiv (2605.24375v1) explores distilling the ability to generate Game Code World Models (GameCWMs) from large frontier LLMs into smaller models. A GameCWM translates a game's rules into Python code that implements legal actions, state transitions, observations, and rewards, in a form compatible with solvers such as Monte Carlo Tree Search. Current approaches rely on large models and iterative refinement, which limits scalability. The authors introduce a curated dataset of 30 games spanning perfect- and imperfect-information settings, a verification framework for structural and semantic properties, and a post-training pipeline that includes supervised fine-tuning. The work aims to make automated environment construction more accessible by enabling smaller models to generate game code efficiently.
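To make the GameCWM idea concrete, here is a minimal sketch of what such generated code might look like for a single-pile Nim game, followed by a small rollout-based sanity check. Everything here is an illustrative assumption: the game choice, the function names (`legal_actions`, `apply_action`, `observation`, `rewards`, `random_rollout_check`), and the specific checks are hypothetical and are not taken from the paper's dataset, interface, or verification framework.

```python
import random
from dataclasses import dataclass

# Hypothetical interface: names are illustrative, not the paper's API.
MAX_TAKE = 3  # a player may remove 1 to 3 stones per turn


@dataclass(frozen=True)
class NimState:
    stones: int  # stones left in the pile
    player: int  # player to move: 0 or 1


def initial_state(stones: int = 7) -> NimState:
    return NimState(stones=stones, player=0)


def legal_actions(state: NimState) -> list[int]:
    # Legal actions: how many stones the current player may take.
    return list(range(1, min(MAX_TAKE, state.stones) + 1))


def apply_action(state: NimState, action: int) -> NimState:
    # State transition: remove stones and pass the turn.
    if action not in legal_actions(state):
        raise ValueError(f"illegal action {action} in {state}")
    return NimState(stones=state.stones - action, player=1 - state.player)


def is_terminal(state: NimState) -> bool:
    return state.stones == 0


def observation(state: NimState, player: int) -> dict:
    # Perfect information: every player sees the full state.
    return {"stones": state.stones, "to_move": state.player}


def rewards(state: NimState) -> tuple[float, float]:
    # The player who took the last stone wins. After the final move the
    # turn has already passed, so the winner is the player NOT to move.
    if not is_terminal(state):
        return (0.0, 0.0)
    winner = 1 - state.player
    return (1.0, -1.0) if winner == 0 else (-1.0, 1.0)


def random_rollout_check(n_games: int = 200, seed: int = 0) -> bool:
    # Sanity check via random play (an assumed stand-in for verification):
    # non-terminal states have moves, games end, rewards are zero-sum.
    rng = random.Random(seed)
    for _ in range(n_games):
        state = initial_state()
        max_steps = state.stones  # each move removes at least one stone
        steps = 0
        while not is_terminal(state):
            actions = legal_actions(state)
            assert actions, "non-terminal state must have a legal action"
            state = apply_action(state, rng.choice(actions))
            steps += 1
            assert steps <= max_steps, "game failed to terminate"
        assert sum(rewards(state)) == 0.0, "rewards must be zero-sum"
    return True
```

A tree-search solver only needs these few functions to expand nodes and score terminal states, which is why this shape of interface suits Monte Carlo Tree Search. An imperfect-information game would differ mainly in `observation`, which would return only the part of the state visible to the given player.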
If the approach works as described, smaller models could generate game code world models, reducing cost and enabling broader use in AI agent training.