arXiv cs.AI · Wednesday · May 27, 2026 · FREE

Advancing Creative Physical Intelligence in Large Multimodal Models

multimodal-models · benchmark · creative-reasoning · ai-research

A new paper on arXiv (2605.26396) presents MM-CreativityBench, a benchmark designed to evaluate large multimodal models (LMMs) on affordance-grounded creative tool use. Unlike existing benchmarks that focus on pattern recognition or well-posed questions, MM-CreativityBench requires models to identify non-obvious yet physically feasible ways to repurpose scene elements. Each instance includes a scenario image with structured views of candidate entities and their parts, enabling fine-grained, interactive evaluation. Experiments show that current LMMs often fail, not due to lack of generative capability, but because they do not sustain grounded reasoning throughout the problem-solving process. This work underscores a gap between LMMs' perception and reasoning abilities and their capacity for creative physical intelligence in open-ended environments.
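To make the instance structure described above concrete, here is a minimal Python sketch. It is not the paper's actual schema or scoring method: the class names (`Part`, `Entity`, `Instance`), every field name, and the `ungrounded_steps` helper are hypothetical, chosen only to illustrate "a scenario image with structured views of candidate entities and their parts" and one simple way to flag a solution step that refers to something not present in the scene.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Part:
    name: str
    view: str  # hypothetical: path to an image view of this part


@dataclass
class Entity:
    name: str
    view: str  # hypothetical: path to an image view of the whole entity
    parts: list[Part] = field(default_factory=list)


@dataclass
class Instance:
    scenario_image: str
    goal: str
    entities: list[Entity] = field(default_factory=list)

    def known_refs(self) -> set[tuple[str, str | None]]:
        """All (entity, part) pairs present in the scene; part is None for the whole entity."""
        refs: set[tuple[str, str | None]] = set()
        for e in self.entities:
            refs.add((e.name, None))
            for p in e.parts:
                refs.add((e.name, p.name))
        return refs


def ungrounded_steps(instance: Instance, steps: list[tuple[str, str | None]]) -> list[int]:
    """Return indices of steps whose (entity, part) reference is not in the instance."""
    known = instance.known_refs()
    return [i for i, ref in enumerate(steps) if ref not in known]


# Toy data, invented for illustration only.
example = Instance(
    scenario_image="scene.png",
    goal="prop a door open",
    entities=[Entity("shoe", "shoe.png", [Part("sole", "shoe_sole.png")])],
)
steps = [("shoe", "sole"), ("shoe", "lace"), ("brick", None)]
print(ungrounded_steps(example, steps))  # [1, 2]
```

A name-matching check like this only catches references to entities or parts that are absent from the scene. Judging whether a proposed use is physically feasible, which is what the benchmark targets, would need far more than this toy lookup.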

// why it matters

Highlights a critical gap in LMMs' ability to sustain grounded reasoning for creative problem-solving.

Sources

Primary · arXiv cs.AI
