Reinforcement Learning 22 - Policy Gradient Methods


#reinforcementlearning #policygradient #ai

In this lecture we take a look at policy gradient methods, which learn the policy directly by following the gradient of a performance measure. Several such performance measures are discussed.

0:00 Intro
5:15 Performance Measures
14:25 Policy Gradient Theorem
52:05 Softmax Action Preferences
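
As a rough companion to the Softmax Action Preferences chapter, here is a minimal sketch of a softmax policy over action preferences and the gradient of its log-probability. It assumes linear preferences h(s, a, theta) = theta . x(s, a), which may differ from the parameterization used in the lecture. The function names (softmax_policy, grad_log_pi) and the feature matrix layout are illustrative choices, not taken from the video.

```python
import numpy as np


def softmax_policy(theta, features):
    """Return pi(. | s, theta) for one state.

    Assumption: linear action preferences h(s, a, theta) = theta . x(s, a).
    features has shape (num_actions, d), one row x(s, a) per action.
    """
    prefs = features @ theta
    prefs = prefs - prefs.max()  # shift for numerical stability; probabilities are unchanged
    exp_prefs = np.exp(prefs)
    return exp_prefs / exp_prefs.sum()


def grad_log_pi(theta, features, action):
    """Gradient of ln pi(action | s, theta) with respect to theta.

    For linear preferences this is x(s, a) - sum_b pi(b | s, theta) x(s, b).
    """
    probs = softmax_policy(theta, features)
    return features[action] - probs @ features


# Hypothetical example: 3 actions, 3 features per state-action pair.
theta = np.array([0.5, -1.0, 2.0])
features = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [1.0, 1.0, 1.0]])
probs = softmax_policy(theta, features)
grad = grad_log_pi(theta, features, action=1)
```

In a REINFORCE-style update, this gradient would be scaled by the observed return and a step size and then added to theta, so that actions followed by high returns become more probable.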