Machine Learning Lab PhD Student Forum Series (Session 30): Multi-agent Reinforcement Learning in Markov Games: Theory and Algorithms

Abstract: 
Multi-agent reinforcement learning (MARL) addresses sequential decision-making in multi-agent systems, where each agent aims to maximize its own long-term return by interacting with a shared environment and with the other agents. Although modern MARL systems have achieved great empirical success in challenging artificial intelligence tasks such as Go, real-time strategy games, and autonomous driving, the theoretical understanding of MARL is still very limited. A standard framework for modeling MARL problems is the Markov game, which can be viewed as a multi-agent generalization of the Markov decision process (MDP).

In this talk, we will give a brief review of two lines of work on MARL in two-agent zero-sum Markov games: algorithms based on Nash V-learning for online MARL, and model-based pessimistic-type algorithms for offline MARL. We will emphasize the theory side, especially how these works break “the curse of multi-agents” and establish near-optimal sample complexity bounds.