Zhengran Ji¹, Lingyu Zhang¹, Paul Sajda², Boyuan Chen¹
¹ Duke University, ² Columbia University
Learning collaborative behaviors is essential for multi-agent systems. Traditionally, multi-agent reinforcement learning solves this implicitly through a joint reward and centralized observations, assuming collaborative behavior will emerge. Other studies propose to learn from demonstrations of a group of collaborative experts. Instead, we propose an efficient and explicit way of learning collaborative behaviors in multi-agent systems by leveraging expertise from only a single human. Our insight is that humans can naturally take on various roles in a team. We show that agents can effectively learn to collaborate by allowing a human operator to dynamically switch between controlling agents for a short period and incorporating a human-like theory-of-mind model of teammates. Our experiments show that our method improves the success rate of a challenging collaborative hide-and-seek task by up to 58% with only 40 minutes of human guidance. We further demonstrate that our findings transfer to the real world by conducting multi-robot experiments.
- Clone the repository:
git clone https://github.com/generalroboticslab/HUMAC.git
- To run the simulation part of the paper, install CREW. More detailed instructions are in the Simulation folder.
- To run the real-world experiment part of the paper, navigate to the Real-World folder for detailed instructions.
This repository has the following structure:
├── Simulation
│   ├── crew-algorithms
│   ├── environment
│   └── training
├── Real-World
│   ├── environment
│   └── training
├── images
├── .gitignore
├── README.md
└── LICENSE
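After cloning, a quick way to confirm that a checkout matches the layout above is a small path check. The following is a minimal sketch and not part of the repository: the helper name `missing_paths` and the list `EXPECTED_PATHS` are hypothetical, and it assumes the entries listed under Simulation and Real-World are sibling subfolders, as the tree suggests.

```python
from pathlib import Path

# Expected entries, copied from the tree above (relative to the repository root).
# Assumption: the entries under Simulation and Real-World are sibling subfolders.
EXPECTED_PATHS = [
    "Simulation/crew-algorithms",
    "Simulation/environment",
    "Simulation/training",
    "Real-World/environment",
    "Real-World/training",
    "images",
    ".gitignore",
    "README.md",
    "LICENSE",
]


def missing_paths(repo_root):
    """Return the expected entries that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    # "HUMAC" is the directory that `git clone` creates by default for this URL.
    missing = missing_paths("HUMAC")
    if missing:
        print("Missing entries:", ", ".join(missing))
    else:
        print("Checkout matches the documented layout.")
```

Run it from the directory that contains the cloned `HUMAC` folder, or pass a different root to `missing_paths` if the repository was cloned under another name.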
This work is supported by the ARL STRONG program under awards W911NF2320182 and W911NF2220113. We also thank Jiaxun Liu for helping with the hardware setup.
If you find this paper helpful, please consider citing our work:
@misc{ji2024enablingmultirobotcollaborationsinglehuman,
title={Enabling Multi-Robot Collaboration from Single-Human Guidance},
author={Zhengran Ji and Lingyu Zhang and Paul Sajda and Boyuan Chen},
year={2024},
eprint={2409.19831},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2409.19831},
}