Summary

In this chapter, we learned what Markov chains and Markov processes are and how RL problems are represented as MDPs. We also looked at the Bellman equation, and we solved the Bellman equation using dynamic programming (DP) to derive an optimal policy. In Chapter 4, Gaming with Monte Carlo Methods, we will look at Monte Carlo tree search and how to build intelligent games using it.
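To recap the DP idea in code, here is a minimal value iteration sketch for a toy two-state MDP. The states, actions, transition probabilities, and rewards are invented purely for illustration and are not the chapter's gridworld example; the loop simply applies the Bellman optimality backup until the values converge and then reads off a greedy policy.

    import numpy as np

    # Hypothetical toy MDP: P[s][a] is a list of (probability, next_state, reward).
    P = {
        0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
        1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
    }
    gamma = 0.9           # discount factor
    V = np.zeros(len(P))  # state values, initialized to zero

    # Repeatedly apply the Bellman optimality backup until the values converge.
    for _ in range(1000):
        new_V = np.copy(V)
        for s in P:
            q = [sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a]) for a in P[s]]
            new_V[s] = max(q)
        if np.max(np.abs(new_V - V)) < 1e-6:
            V = new_V
            break
        V = new_V

    # The optimal policy is greedy with respect to the converged values.
    policy = {s: max(P[s], key=lambda a: sum(p * (r + gamma * V[s2])
                                             for p, s2, r in P[s][a]))
              for s in P}
    print("Optimal values:", V)
    print("Optimal policy:", policy)

Running this prints the converged state values and the greedy action for each state, which is exactly the "solve the Bellman equation, then act greedily" recipe the chapter used.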