Adapting Execution-Time Objectives for Multi-Robot Policies via Collaborative Flow Policy Guidance

Williard Joshua Jose, Yuhao Li, Zihao Deng, Hao Zhang

Paper ID 33

Session Multi-robot Systems

Posters presented in the poster session following their oral. Locations not assigned.

Abstract: Multi-robot teams are gaining attention because they can scale to larger task workloads and greater task complexity. However, existing approaches face three key limitations: unimodal policies cannot capture multi-modal joint strategies, fixed policies cannot adapt to dynamic execution-time requirements, and conflicting objectives are difficult to resolve during deployment. To address these challenges, we present Collaborative Flow Policy Guidance (CFPG), a novel framework that modifies existing collaborative multi-robot policies by composing objectives at execution time. First, we introduce Multi-Agent Flow Policy Optimization (MAFPO) to learn robust, multi-modal collaborative policies in a fully on-policy manner without offline datasets. Second, we enable training-free adaptation to new execution-time objectives by leveraging flow matching guidance to steer actions toward user-specified goals. Third, during guidance we employ a hierarchical gradient projection mechanism to resolve conflicts between the nominal objective and the execution-time objectives. We theoretically analyze CFPG and demonstrate, across multiple simulation environments and on real robots, that it outperforms state-of-the-art methods in both performance and robustness.
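The abstract does not spell out how the hierarchical gradient projection works, so the following is only a minimal sketch of the general idea of priority-ordered gradient projection, not the authors' mechanism. It assumes each objective supplies a guidance gradient over the joint action, that the nominal objective has the highest priority, and that a lower-priority gradient loses only the component that opposes a higher-priority one. The function names `project_out_conflict` and `hierarchical_guidance` are hypothetical.

```python
import numpy as np


def project_out_conflict(g_low, g_high, eps=1e-12):
    """Remove from g_low the component that opposes g_high.

    If the two gradients do not conflict (non-negative dot product),
    g_low is returned unchanged.
    """
    dot = float(np.dot(g_low, g_high))
    if dot >= 0.0:
        return g_low.copy()
    # Subtract the projection of g_low onto g_high (eps avoids division by zero).
    return g_low - dot / (float(np.dot(g_high, g_high)) + eps) * g_high


def hierarchical_guidance(grads):
    """Combine gradients ordered from highest to lowest priority.

    Assumption for this sketch: each gradient is projected against every
    higher-priority gradient in turn, then all results are summed.
    """
    combined = np.zeros_like(grads[0], dtype=float)
    accepted = []
    for g in grads:
        g = np.asarray(g, dtype=float)
        for h in accepted:
            g = project_out_conflict(g, h)
        accepted.append(g)
        combined += g
    return combined


# Hypothetical example: the execution-time gradient partly opposes the nominal one.
nominal = np.array([1.0, 0.0])
execution = np.array([-1.0, 1.0])
guided = hierarchical_guidance([nominal, execution])
```

With two objectives, the projected execution-time gradient is orthogonal to the nominal gradient, so following it does not reduce the nominal objective to first order. With three or more objectives, the sequential projection used here is a simplification and may not remove every conflict.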