Beyond Binary Edits: Robust Multimodal Knowledge Editing with Adversarial Subspace Alignment
A new paper on arXiv (2605.23780v1) addresses the limited generality of intrinsic knowledge editing in multimodal large language models (MLLMs). Current methods often fail to propagate an edit across semantically equivalent visual and linguistic variations, because they lack semantic supervision and rely on rigid editing scopes. The authors formalize robustness through knowledge units, which group semantically equivalent inputs, and introduce Latent Adversarial Robustification (LAR) to generate adversarial yet semantically coherent variants in the joint latent space. They also propose Rank-Constrained Subspace Learning (RCSL), which enforces low-rank alignment of the adversarial representations at the edit layer using a singular value-based objective. According to the authors, extensive analysis shows improved generalization while maintaining reliability and locality. The work is a preprint and has not yet been peer-reviewed.
If the results hold up, the approach could enable more reliable and generalizable knowledge updates in multimodal AI systems.
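To make the idea of a singular value-based low-rank objective more concrete, here is a minimal NumPy sketch. It is not the paper's RCSL loss, whose exact form is not given in this summary. It assumes one plausible formulation: stack the edit-layer representations of a knowledge unit's variants into a matrix and penalize the singular values beyond a target rank. The function name `tail_singular_value_penalty`, the target rank, and the toy data are all hypothetical.

```python
import numpy as np


def tail_singular_value_penalty(reps: np.ndarray, rank: int) -> float:
    """Sum of the singular values of `reps` beyond the top `rank`.

    `reps` has shape (num_variants, hidden_dim): one row per variant of a
    knowledge unit. The penalty is near zero when the rows lie in a subspace
    of dimension <= rank. This is an illustrative stand-in, not the objective
    defined in the paper.
    """
    singular_values = np.linalg.svd(reps, compute_uv=False)  # sorted descending
    return float(singular_values[rank:].sum())


rng = np.random.default_rng(0)

# Toy "aligned" variants: 8 representations spanning only 2 directions.
basis = rng.normal(size=(2, 16))
aligned = rng.normal(size=(8, 2)) @ basis

# Toy "scattered" variants: 8 unrelated 16-dimensional representations.
scattered = rng.normal(size=(8, 16))

print("aligned penalty:  ", tail_singular_value_penalty(aligned, rank=2))
print("scattered penalty:", tail_singular_value_penalty(scattered, rank=2))
```

In this toy setup the aligned variants incur essentially no penalty while the scattered ones do, which is the behavior a rank constraint at the edit layer is meant to encourage. A training setup would need a differentiable version of the same quantity in an autodiff framework.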