Partner-Aware Hierarchical Skill Discovery for Robust Human-AI Collaboration
Researchers introduce Partner-Aware Skill Discovery (PASD), a Deep Hierarchical Reinforcement Learning (DHRL) framework that learns skills conditioned on partner behavior. PASD uses a contrastive intrinsic reward to capture patterns in partner interactions: it aligns skill representations across similar partners while keeping them discriminable across diverse strategies. This mitigates shortcut learning, in which skills exploit spurious information instead of adapting to partners' dynamic behaviors, and it promotes behavioral consistency. The method is evaluated on the Overcooked-AI benchmark with a diverse population of partners of varying skill levels. The results show that PASD coordinates robustly and adaptively with novel partners, outperforming conventional DHRL methods that rely solely on agent-centric rewards.
Why it matters: PASD enables AI agents to adapt to diverse human partners, a step toward more reliable real-world human-AI teaming.
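
To make the idea of a contrastive intrinsic reward concrete, here is a minimal sketch in Python. It is not the PASD objective, whose exact form the summary does not give. It assumes an InfoNCE-style formulation as a stand-in, and the function name, the temperature value, and the toy embeddings are all hypothetical. The reward is high when a skill embedding is closer to an embedding from a similar partner than to embeddings from dissimilar partners.

```python
import numpy as np


def _normalize(x: np.ndarray) -> np.ndarray:
    """Scale vectors along the last axis to (approximately) unit L2 norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)


def contrastive_intrinsic_reward(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style reward (an assumed stand-in, not the paper's exact loss).

    anchor:    (d,) skill embedding from the current partner interaction
    positive:  (d,) embedding from an interaction with a similar partner
    negatives: (n, d) embeddings from interactions with dissimilar partners

    Returns the log-probability of selecting the positive among all
    candidates, so the value is always <= 0 and approaches 0 when the
    anchor aligns with the positive and not with the negatives.
    """
    a = _normalize(np.asarray(anchor, dtype=float))
    candidates = _normalize(
        np.vstack([np.asarray(positive, dtype=float),
                   np.asarray(negatives, dtype=float)])
    )
    logits = candidates @ a / temperature   # scaled cosine similarities
    logits = logits - logits.max()          # stabilize the softmax
    log_probs = logits - np.log(np.exp(logits).sum())
    return float(log_probs[0])              # index 0 is the positive


# Toy 2-D embeddings (hypothetical values chosen for illustration).
anchor = np.array([1.0, 0.0])
similar_partner = np.array([1.0, 0.0])
other_partners = np.array([[0.0, 1.0], [-1.0, 0.0]])

# Aligned case: the positive matches the anchor, so the reward is near 0.
r_aligned = contrastive_intrinsic_reward(anchor, similar_partner, other_partners)

# Misaligned case: an orthogonal embedding is treated as the positive,
# so the reward is strongly negative.
r_misaligned = contrastive_intrinsic_reward(
    anchor, other_partners[0], np.vstack([similar_partner, other_partners[1]])
)
```

In a hierarchical setup, a term like this would typically be added to the task reward when training the skill policy. How PASD defines similar and dissimilar partners, and how it weights the term, is not stated in this summary.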