Gaming with Monte Carlo Methods

Monte Carlo methods are among the most widely used algorithms in fields ranging from physics and mechanics to computer science. In reinforcement learning (RL), the Monte Carlo algorithm is used when the model of the environment is not known. In the previous chapter, we used dynamic programming (DP) to find an optimal policy when the model dynamics, that is, the transition and reward probabilities, are known. But how can we determine the optimal policy when we don't know the model dynamics? In that case, we use the Monte Carlo algorithm, which finds optimal policies from sampled experience alone, without any knowledge of the environment's dynamics.

In this chapter, you will learn about the following:

  • Monte Carlo methods
  • Monte Carlo prediction
  • Playing Blackjack with Monte Carlo
  • Monte Carlo control
  • Monte Carlo exploration starts
  • On-policy Monte Carlo control
  • Off-policy Monte Carlo control
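
As a first taste of the model-free idea described above, here is a minimal sketch of first-visit Monte Carlo prediction. The environment is a hypothetical five-state random walk invented for this illustration (it is not one of the chapter's examples), and the function names `sample_episode` and `first_visit_mc_prediction` are assumptions as well. The estimator never reads transition or reward probabilities; it only averages the returns observed in sampled episodes.

```python
import random
from collections import defaultdict


def sample_episode(rng, start=2, n_states=5):
    """Sample one episode from a hypothetical random walk.

    States 0 and n_states - 1 are terminal. Each step moves left or right
    with equal probability; reaching the right end gives reward 1, and
    every other transition gives reward 0.
    Returns a list of (state, reward) pairs in time order.
    """
    state = start
    episode = []
    while 0 < state < n_states - 1:
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == n_states - 1 else 0.0
        episode.append((state, reward))
        state = next_state
    return episode


def first_visit_mc_prediction(num_episodes=20000, gamma=1.0, seed=0):
    """Estimate state values by averaging first-visit returns."""
    rng = random.Random(seed)
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)

    for _ in range(num_episodes):
        episode = sample_episode(rng)

        # Walk the episode backwards, accumulating the discounted return.
        # Overwriting the entry on every visit leaves the return that
        # follows the earliest (first) visit to each state.
        g = 0.0
        first_visit_return = {}
        for state, reward in reversed(episode):
            g = reward + gamma * g
            first_visit_return[state] = g

        for state, g_first in first_visit_return.items():
            returns_sum[state] += g_first
            returns_count[state] += 1

    # The value estimate is the mean of the observed first-visit returns.
    return {s: returns_sum[s] / returns_count[s] for s in sorted(returns_sum)}


values = first_visit_mc_prediction()
print(values)
```

For this toy walk the true values of states 1, 2, and 3 are 0.25, 0.5, and 0.75, so the printed estimates should land close to those numbers and tighten as `num_episodes` grows.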