Title: Python Reinforcement Learning
Authors: Sudharsan Ravichandiran, Sean Saito, Rajalingappaa Shanmugamani, Yang Wenzhuo
Chapter word count: 88
Updated: 2021-06-24 15:17:22

Value function

A value function specifies how good it is for an agent to be in a particular state. It depends on the policy and is often denoted by v(s). The value of a state s is the total expected reward the agent receives when it starts in s and follows the policy thereafter. Different policies yield different value functions; the optimal value function is the one that gives the highest value in every state compared with all other value functions. Similarly, an optimal policy is one whose value function is the optimal value function.
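To make the dependence on the policy concrete, here is a minimal sketch that evaluates v(s) for two policies on a toy problem. Everything in it is an assumption made for illustration and does not come from the book: the three-state chain, the action names "go" and "stay", the reward of 1.0 for reaching the terminal state, and the discount factor of 0.9. The transitions are deterministic, so the expectation reduces to a single outcome per step.

```python
# Hypothetical 3-state chain: states 0 and 1 are non-terminal, state 2 is terminal.
# TRANSITIONS[state][action] = (next_state, reward); all names are illustrative.
TRANSITIONS = {
    0: {"stay": (0, 0.0), "go": (1, 0.0)},
    1: {"stay": (1, 0.0), "go": (2, 1.0)},
}
TERMINAL = 2
GAMMA = 0.9  # assumed discount factor


def evaluate_policy(policy, gamma=GAMMA, tol=1e-10):
    """Iteratively estimate v(s) for a deterministic policy {state: action}."""
    v = {0: 0.0, 1: 0.0, TERMINAL: 0.0}  # the terminal state's value stays 0
    while True:
        delta = 0.0
        for s, action in policy.items():
            next_s, reward = TRANSITIONS[s][action]
            # Value of s = immediate reward + discounted value of the next state.
            new_v = reward + gamma * v[next_s]
            delta = max(delta, abs(new_v - v[s]))
            v[s] = new_v
        if delta < tol:  # stop once a full sweep changes nothing noticeable
            return v


always_go = {0: "go", 1: "go"}
always_stay = {0: "stay", 1: "stay"}

v_go = evaluate_policy(always_go)
v_stay = evaluate_policy(always_stay)

print("v under always_go:  ", v_go)
print("v under always_stay:", v_stay)
```

In this toy setting the "always go" policy has a value at least as high as "always stay" in every state, which is the sense in which one value function can dominate another.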
