Adaptive Team Cooperative Co-Evolution for a Multi-Rover Distribution Problem
Abstract
This paper deals with policy learning for a team of heterogeneous robotic agents when the whole team shares a single reward. We address the problem of accurately estimating each agent's contribution in tasks where coordination requires joint policy updates of two (or more) agents. This is typically the case when two agents must simultaneously modify their behaviors to perform a joint action that yields a performance gain for the whole team. We propose a cooperative co-evolutionary algorithm extended with a multi-armed bandit algorithm that dynamically adjusts the number of agents that should update their policies simultaneously, aiming for both performance and learning speed. We use a realistic robotic multi-rover task where agents must physically distribute themselves over points of interest of different types to complete the task. Results show that the algorithm selects the group size for policy updates that best reflects the task's coordination requirements. Surprisingly, we also reveal that coupling between agents' actions in a realistic setup can emerge from interactions at the phenotypic level, hinting at subtle interactions during learning between the control parameter space and the behavioral space.
CCS Concepts
• Computing methodologies → Evolutionary robotics
• Theory of computation → Multi-agent reinforcement learning
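To make the bandit component described above concrete, the following is a minimal sketch in Python, assuming a UCB1 bandit whose arms are candidate group sizes for simultaneous policy updates. The class name GroupSizeBandit, the reward definition (clipped fitness improvement), and the surrounding evolutionary loop are illustrative assumptions, not the authors' implementation.

import math
import random


class GroupSizeBandit:
    """UCB1 bandit over candidate group sizes for simultaneous policy updates."""

    def __init__(self, group_sizes):
        self.group_sizes = list(group_sizes)         # arms, e.g. [1, 2, 3, 4]
        self.counts = [0] * len(self.group_sizes)    # plays per arm
        self.values = [0.0] * len(self.group_sizes)  # running mean reward per arm
        self.total = 0                               # total plays

    def select(self):
        # Play each arm once before applying the UCB rule.
        for i, c in enumerate(self.counts):
            if c == 0:
                return i
        # UCB1: mean reward plus an exploration bonus that shrinks
        # as an arm accumulates plays.
        ucb = [
            v + math.sqrt(2.0 * math.log(self.total) / c)
            for v, c in zip(self.values, self.counts)
        ]
        return max(range(len(ucb)), key=ucb.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the chosen arm's mean reward.
        self.counts[arm] += 1
        self.total += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


# Hypothetical usage inside a cooperative co-evolutionary loop: at each
# generation the bandit picks how many agents update jointly, and is
# rewarded with the team-fitness improvement that choice produced.
bandit = GroupSizeBandit([1, 2, 3, 4])
best_fitness = 0.0
for generation in range(100):
    arm = bandit.select()
    k = bandit.group_sizes[arm]
    # Stand-in for evaluating the team after k agents update simultaneously.
    fitness = random.random() + 0.1 * k
    bandit.update(arm, max(0.0, fitness - best_fitness))
    best_fitness = max(best_fitness, fitness)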
Keywords
multi-robots, multi-agent systems, evolutionary robotics, multi-agent reinforcement learning, multi-armed bandits, marginal contribution, fitness evaluation, cooperative coevolution EA