MVP-Nav: Multi-layer Value Map Planner Navigator


Wenyuan Xie, Shaokai Wu, Yijin Zhou, Yanbiao Ji, Guodong Zhang, Bayram Bayramli, Qiuchang Li, Xunchu Zhou, Yue Ding, Hongtao Lu

Paper ID 63

Session Navigation 1

Poster session details TBA

Abstract: Zero-Shot Object Goal Navigation (ZSON) is an important task for robots. While Multimodal Large Language Models (MLLMs) have given robots significant semantic reasoning capabilities, current RGB-only navigation methods still struggle to align high-level discrete logic with low-level continuous physical execution, often because they lack explicit geometric constraints and spatial memory. In this paper, we propose MVP-Nav (Multi-layer Value Map Planner Navigator), a hierarchical framework designed for robust RGB-only ZSON. Our approach uses a 3D foundation model to recover the physical scale and spatial occupancy of semantic instances from monocular observations, representing them as Oriented Bounding Boxes (OBBs) within a dynamic Spatial Semantic List. At the core of our system is the Multi-layer Value Maps (MVM) mechanism, which serves as a navigation hub: the MLLM acts as a high-level planner that assigns semantic weights and determines navigation modes, while the low-level controller performs precise geometric path planning within a cost space fused with physical constraints. Experimental results demonstrate that MVP-Nav achieves state-of-the-art (SOTA) success rates and exploration efficiency among depth-free methods, even surpassing several depth-based baselines.
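
The division of labor described in the abstract, with a high-level planner supplying weights and a low-level controller working in a fused cost space, can be illustrated with a minimal sketch. This is not the authors' implementation: the layer names (`semantic`, `frontier`), the weighted-sum fusion rule, the boolean occupancy grid, and the functions `fuse_layers` and `best_free_cell` are all hypothetical choices made only for illustration.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]


def fuse_layers(layers: Dict[str, Grid], weights: Dict[str, float]) -> Grid:
    """Weighted sum of same-sized value layers (assumed fusion rule).

    A layer with no entry in `weights` contributes nothing.
    """
    first = next(iter(layers.values()))
    rows, cols = len(first), len(first[0])
    fused = [[0.0] * cols for _ in range(rows)]
    for name, grid in layers.items():
        w = weights.get(name, 0.0)
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += w * grid[r][c]
    return fused


def best_free_cell(fused: Grid, occupied: List[List[bool]]) -> Tuple[int, int]:
    """Return the highest-value cell that is not physically occupied."""
    best, best_val = None, float("-inf")
    for r, row in enumerate(fused):
        for c, value in enumerate(row):
            if not occupied[r][c] and value > best_val:
                best, best_val = (r, c), value
    if best is None:
        raise ValueError("no free cell available")
    return best


# Hypothetical 2x2 map: two value layers plus an occupancy mask.
layers = {
    "semantic": [[0.1, 0.9], [0.2, 0.4]],
    "frontier": [[0.8, 0.1], [0.3, 0.6]],
}
# Stand-in for the weights a high-level planner might assign.
weights = {"semantic": 0.7, "frontier": 0.3}
# Cell (0, 1) is blocked, so it cannot be chosen despite its high value.
occupied = [[False, True], [False, False]]

fused = fuse_layers(layers, weights)
goal = best_free_cell(fused, occupied)
```

In this toy map the cell with the highest fused value is occupied, so the selection falls back to the best free cell. That is the sense in which a physical constraint can override a purely semantic preference; a real planner would go on to compute a path to the chosen cell rather than stop at selection.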