官术网_书友最值得收藏!

The policy function

We have learned about the policy function in Chapter 1Introduction to Reinforcement Learning, which maps the states to actions. It is denoted by π. 

The policy function can be represented as  , indicating mapping from states to actions. So, basically, a policy function says what action to perform in each state. Our ultimate goal lies in finding the optimal policy which specifies the correct action to perform in each state, which maximizes the reward.

主站蜘蛛池模板: 纳雍县| 讷河市| 铜川市| 肃北| 平安县| 阳曲县| 太白县| 榆树市| 赣榆县| 金川县| 夏邑县| 天峨县| 喀喇| 迁安市| 巨鹿县| 栾城县| 内乡县| 蛟河市| 保德县| 竹山县| 龙口市| 岳阳市| 广东省| 连城县| 古浪县| 砀山县| 安徽省| 都江堰市| 东乡县| 宕昌县| 蛟河市| 寿宁县| 松阳县| 化德县| 定州市| 镇宁| 南皮县| 朝阳县| 涿鹿县| 邯郸市| 志丹县|