arXiv cs.AI · Tuesday · May 26, 2026

DRIVE: Modeling Skills at the Reasoning and Interaction Levels for Web Agents under Continual Learning

web-agents · continual-learning · ai-research · skill-modeling

The arXiv paper "DRIVE: Modeling Skills at the Reasoning and Interaction Levels for Web Agents under Continual Learning," published on May 26, 2026, introduces a dual-level skill modeling framework for web agents that learn continually. Web agents need both high-level reasoning to decompose tasks and low-level interaction to manipulate page elements. The paper argues that these two kinds of knowledge differ fundamentally: reasoning knowledge, such as "booking a flight requires first searching for routes," is abstract and transferable, while interaction knowledge, such as "clicking the Search button at a specific coordinate on Site A," depends heavily on page-specific context.

Existing methods often store both kinds of experience in one uniform representation, which creates a dilemma: abstract representations lose executability on concrete pages, and concrete representations fail to generalize across domains. This entanglement limits capability accumulation. On a new website, an agent either fails to recognize reusable task logic because of surface-level differences or attempts infeasible actions based on outdated page structures.

DRIVE disentangles historical experience into two levels. Natural language reasoning skills capture transferable task logic, and programmatic interaction skills ground abstract actions in executable steps. The separation is intended to let web agents learn more effectively and generalize their capabilities across diverse web environments under continual learning.
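To make the two-level split concrete, here is a minimal Python sketch. It is not the paper's implementation: the class names (`ReasoningSkill`, `InteractionSkill`, `SkillLibrary`), the `plan` method, the example site names, and the string-based action format with CSS-style selectors are all assumptions made for illustration. It only shows the idea that task logic can be stored once in natural language while page-level programs are stored per site.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Actions = List[str]


@dataclass
class ReasoningSkill:
    """Site-agnostic task logic kept as natural language (hypothetical schema)."""
    task: str
    steps: List[str]


@dataclass
class InteractionSkill:
    """Site-specific program that grounds one abstract step (hypothetical schema)."""
    site: str
    step: str
    run: Callable[[Dict[str, str]], Actions]


@dataclass
class SkillLibrary:
    """Stores the two skill levels separately so each can be reused on its own."""
    reasoning: Dict[str, ReasoningSkill] = field(default_factory=dict)
    interaction: Dict[Tuple[str, str], InteractionSkill] = field(default_factory=dict)

    def add_reasoning(self, skill: ReasoningSkill) -> None:
        self.reasoning[skill.task] = skill

    def add_interaction(self, skill: InteractionSkill) -> None:
        self.interaction[(skill.site, skill.step)] = skill

    def plan(
        self, task: str, site: str, args: Dict[str, str]
    ) -> List[Tuple[str, Optional[Actions]]]:
        """Pair each reasoning step with grounded actions for `site`.

        A step maps to None when no interaction skill exists for that site yet,
        so the caller knows the step still has to be explored on the live page.
        """
        result: List[Tuple[str, Optional[Actions]]] = []
        for step in self.reasoning[task].steps:
            grounded = self.interaction.get((site, step))
            result.append((step, grounded.run(args) if grounded else None))
        return result


def search_routes_on_site_a(args: Dict[str, str]) -> Actions:
    # The selectors below are invented for this example, not taken from a real site.
    return [
        f"type #origin {args['origin']}",
        f"type #destination {args['destination']}",
        "click #search",
    ]


library = SkillLibrary()
library.add_reasoning(
    ReasoningSkill("book a flight", ["search for routes", "select a flight"])
)
library.add_interaction(
    InteractionSkill("site-a.example", "search for routes", search_routes_on_site_a)
)

trip = {"origin": "SFO", "destination": "JFK"}
plan_a = library.plan("book a flight", "site-a.example", trip)
plan_b = library.plan("book a flight", "site-b.example", trip)
```

In this sketch, `plan_a` contains executable actions for the search step on the known site, while `plan_b` reuses the same task logic on an unseen site and marks every step as ungrounded instead of replaying actions tied to another site's page structure.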

// why it matters

A framework like this could let developers build web agents that learn continuously and generalize across diverse websites, reducing manual per-site adaptation.

Sources

Primary · arXiv cs.AI

▸ Read original at arxiv.org

DRIVE: Modeling Skills at the Reasoning and Interaction Levels for Web Agents under Continual Learning
