ECCV 2026

Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification

Jiawen Wen¹, Penglei Sun¹*, Wenjie Zhang¹, Suixuan Qiu², Weisheng Xu¹, Xiaofei Yang³, Xiaowen Chu¹*
¹Hong Kong University of Science and Technology (Guangzhou); ²Beijing Normal University; ³Guangzhou University; *Corresponding authors
Figure: Rule-VLN paradigm overview. Regulatory visual signals turn geometrically reachable streets into semantically constrained navigation problems.

Abstract

As embodied AI transitions to real-world deployment, Vision-and-Language Navigation (VLN) must evolve from mere reachability to social compliance. Current agents often fall into a goal-driven trap: they prioritize physical geometry over semantic rules and overlook subtle regulatory constraints. To address this, we introduce Rule-VLN, the first large-scale urban benchmark for rule-compliant navigation. It spans a 29k-node environment and injects 177 regulatory categories into 8k constrained nodes across four curriculum levels.

We further propose the Semantic Navigation Rectification Module (SNRM), a universal, zero-shot module that equips pre-trained agents with safety awareness. SNRM combines coarse-to-fine visual perception with an epistemic mental map for dynamic detour planning, reducing the constraint violation rate by 19.26% and improving task completion by 5.97% without retraining the backbone.

Benchmark

29k Urban graph nodes
8k Constrained nodes
177 Regulatory categories
4 Curriculum levels

Rule-VLN builds on Touchdown by introducing dynamic semantic constraints into graph traversal. Paths that are geometrically reachable may become invalid when traffic signs or regulatory signals prohibit the intended action, forcing agents to reason about whether they may proceed rather than only whether they can proceed.

Figure: Rule-VLN benchmark construction pipeline, covering semantic rule injection and curriculum-level constrained navigation.
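The distinction between "can proceed" and "may proceed" can be illustrated with a small graph sketch. This is not the benchmark's actual data format: the `RuleGraph` class, its `prohibited` set of directed moves, and the `count_violations` helper are hypothetical names chosen for illustration, assuming that an injected rule can be modelled as a forbidden (source, destination) pair on an otherwise reachable edge.

```python
from dataclasses import dataclass, field


@dataclass
class RuleGraph:
    """Toy navigation graph with injected rule constraints (illustrative only)."""
    # node -> set of geometrically reachable neighbours
    edges: dict[str, set[str]] = field(default_factory=dict)
    # directed moves (src, dst) forbidden by a regulatory sign (assumed model)
    prohibited: set[tuple[str, str]] = field(default_factory=set)

    def add_edge(self, a: str, b: str) -> None:
        """Add an undirected street segment between two nodes."""
        self.edges.setdefault(a, set()).add(b)
        self.edges.setdefault(b, set()).add(a)

    def can_move(self, a: str, b: str) -> bool:
        """Geometric reachability: is there a street from a to b?"""
        return b in self.edges.get(a, set())

    def may_move(self, a: str, b: str) -> bool:
        """Rule compliance: reachable and not prohibited in this direction."""
        return self.can_move(a, b) and (a, b) not in self.prohibited


def count_violations(graph: RuleGraph, path: list[str]) -> int:
    """Count steps in a trajectory that are reachable but prohibited."""
    return sum(
        1
        for a, b in zip(path, path[1:])
        if graph.can_move(a, b) and not graph.may_move(a, b)
    )
```

In this sketch a purely geometry-driven agent would treat every edge in `edges` as valid, while a rule-aware agent would additionally consult `prohibited` before each step.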

Method

SNRM is a plug-and-play rectification module for pre-trained VLN agents. It routes observations through dual-stage coarse-to-fine perception, grounds candidate rule labels, and maintains an epistemic mental map to prune illegal actions and select compliant detours.

Figure: Semantic Navigation Rectification Module overview. SNRM bridges visual rule perception and geometric navigation by turning semantic prohibitions into hard graph constraints.
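The idea of pruning illegal actions and replanning a detour can be sketched as follows. This is a minimal illustration, not the SNRM implementation: the perception stage is replaced by a hypothetical `perceive(node, nxt)` callback that returns `True` when a prohibition is visible for that move, the mental map is reduced to a set of blocked directed edges, and plain breadth-first search stands in for whatever planner the authors use.

```python
from collections import deque


def shortest_legal_path(adjacency, start, goal, blocked):
    """BFS from start to goal, skipping directed edges listed in `blocked`."""
    queue = deque([start])
    parent = {start: None}
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adjacency.get(node, ()):
            if nxt not in parent and (node, nxt) not in blocked:
                parent[nxt] = node
                queue.append(nxt)
    return None  # no compliant route under the currently known rules


def navigate(adjacency, start, goal, perceive):
    """Step toward the goal, replanning whenever a new prohibition is seen."""
    blocked = set()  # stand-in for the mental map: prohibitions seen so far
    node, trajectory = start, [start]
    while node != goal:
        # Record every prohibited move visible from the current node.
        blocked |= {
            (node, nxt) for nxt in adjacency.get(node, ()) if perceive(node, nxt)
        }
        plan = shortest_legal_path(adjacency, node, goal, blocked)
        if plan is None:
            return trajectory, False
        node = plan[1]  # take one step along the current compliant plan
        trajectory.append(node)
    return trajectory, True
```

Because prohibitions are only discovered when the agent reaches the node where they are visible, the agent in this sketch may backtrack: it commits to the shortest route it currently believes is legal and revises that belief as new rules are observed.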

Results

Constraint Violation Rate -19.26%

SNRM substantially reduces illegal rule-crossing behavior under constrained navigation.

Task Completion +5.97%

Rule-aware detour planning restores navigation ability without retraining the backbone.

Figure: Task completion and constraint violation results across curriculum difficulty levels. SNRM improves task completion while reducing constraint violations.
Figure: CLIP score comparison for generated rule insertions. MPSI improves visual-semantic alignment over baseline insertion, measured by CLIP score.
Figure: Qualitative trajectory comparison for rule-compliant navigation. SNRM identifies semantic prohibitions and selects compliant detours.

Citation

@article{wen2026rule,
  title={Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification},
  author={Wen, Jiawen and Sun, Penglei and Zhang, Wenjie and Qiu, Suixuan and Xu, Weisheng and Yang, Xiaofei and Chu, Xiaowen},
  journal={arXiv preprint arXiv:2604.16993},
  year={2026}
}