Machine Learning and Data Science PhD Student Forum Series (Session 37): Constrained Markov Decision Process and Some Extensions

Abstract:

Reinforcement learning (RL) has achieved great success in areas such as game playing, robotics, and recommender systems. However, because of safety concerns or physical limitations, some real-world RL problems involve additional constraints that may influence both the optimal policy and the learning process. A standard framework for handling such cases is the constrained Markov decision process (CMDP). Within the CMDP framework, the agent has to maximize the expected cumulative reward while obeying a finite number of constraints, which usually take the form of bounds on expected cumulative costs.
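
For concreteness, the CMDP problem described above can be sketched as follows. This is a common discounted, infinite-horizon form and not necessarily the exact formulation used in the talk; the discount factor \gamma, the reward r, the cost functions c_i, the thresholds d_i, and the number of constraints m are notation assumed here for illustration.

```latex
% A common discounted CMDP formulation (notation assumed, not taken from the abstract)
\begin{aligned}
\max_{\pi} \quad & J_{r}(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right] \\
\text{s.t.} \quad & J_{c_i}(\pi) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t) \right] \le d_i,
\qquad i = 1, \dots, m .
\end{aligned}
```

Here the objective is the expected cumulative reward, and each of the m constraints bounds one expected cumulative cost.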

In this talk, we will briefly review the formulation of the CMDP model along with some RL algorithms for solving CMDP problems. We will also present an extension of the CMDP model called the semi-infinitely constrained Markov decision process, which allows us to consider RL under infinitely many constraints.
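
One plausible way to write the semi-infinitely constrained extension is sketched below, under the assumption that the constraints are indexed by a parameter y ranging over a possibly infinite set Y. The index set Y, the cost functions c_y, the thresholds d_y, and the discount factor \gamma are illustrative notation, and the talk's own definition may differ.

```latex
% Sketch of a semi-infinitely constrained MDP (index set Y and all symbols are assumed)
\begin{aligned}
\max_{\pi} \quad & \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right] \\
\text{s.t.} \quad & \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c_{y}(s_t, a_t) \right] \le d_{y}
\qquad \text{for all } y \in Y .
\end{aligned}
```

When Y is a finite set, this form reduces to an ordinary CMDP with finitely many constraints.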