官术网_书友最值得收藏!

  • Python Reinforcement Learning
  • Sudharsan Ravichandiran Sean Saito Rajalingappaa Shanmugamani Yang Wenzhuo
  • 104字
  • 2021-06-24 15:17:22

Policy function

A policy defines the agent's behavior in an environment. The way in which the agent decides which action to perform depends on the policy. Say you want to reach your office from home; there will be different routes to reach your office, and some routes are shortcuts, while some routes are long. These routes are called policies because they represent the way in which we choose to perform an action to reach our goal. A policy is often denoted by the symbol ??. A policy can be in the form of a lookup table or a complex search process.

主站蜘蛛池模板: 武邑县| 镇原县| 辽阳县| 西林县| 曲阜市| 平顺县| 三亚市| 廊坊市| 临海市| 佛冈县| 察哈| 武宣县| 五常市| 田林县| 会泽县| 敖汉旗| 江源县| 丘北县| 湘西| 砀山县| 无为县| 望江县| 邵东县| 毕节市| 寻甸| 札达县| 德昌县| 马山县| 茌平县| 望江县| 会泽县| 东兴市| 霍邱县| 西藏| 云安县| 池州市| 博湖县| 恩平市| 博罗县| 通河县| 丹寨县|