
  • Python Reinforcement Learning
  • Sudharsan Ravichandiran Sean Saito Rajalingappaa Shanmugamani Yang Wenzhuo
  • 2021-06-24 15:17:22

Agent environment interface

Agents are software programs that perform an action, A_t, at time t to move from one state, S_t, to another state, S_{t+1}. Based on the action taken, the agent receives a numerical reward, R, from the environment. Ultimately, RL is about finding the optimal actions that maximize the numerical reward.
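The interaction described above can be sketched as a loop in which each step yields one (S_t, A_t, R, S_{t+1}) record. This is only a minimal illustration: the `CorridorEnv` class, the `run_episode` helper, the five-position corridor, and the reward of 1.0 at the goal are all hypothetical and do not come from the book.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: positions 0..4 on a line, goal at 4."""

    def __init__(self):
        self.state = 0  # S_0, the starting state

    def step(self, action):
        """Apply action A_t (-1 or +1) and return (S_{t+1}, R)."""
        self.state = min(4, max(0, self.state + action))
        reward = 1.0 if self.state == 4 else 0.0  # assumed reward scheme
        return self.state, reward


def run_episode(env, policy, max_steps=50):
    """Run the agent-environment loop, recording (S_t, A_t, R, S_{t+1})."""
    transitions = []
    state = env.state
    for _ in range(max_steps):
        action = policy(state)                  # the agent chooses A_t
        next_state, reward = env.step(action)   # the environment responds
        transitions.append((state, action, reward, next_state))
        state = next_state
        if state == 4:                          # goal reached, stop early
            break
    return transitions


random.seed(0)
# A random policy stands in for the agent; it ignores the state entirely.
transitions = run_episode(CorridorEnv(), lambda s: random.choice([-1, 1]))
total_reward = sum(r for _, _, r, _ in transitions)
print(len(transitions), total_reward)
```

An RL algorithm would replace the random policy with one that learns from the recorded rewards which action to prefer in each state.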

Let us understand the concept of RL with a maze game:

The objective of a maze is to reach the destination without getting stuck on the obstacles. Here's the workflow:

  • The agent is the one who travels through the maze; it is our software program/RL algorithm
  • The environment is the maze
  • The state is the agent's current position in the maze
  • An agent performs an action by moving from one state to another
  • An agent receives a positive reward when its action does not run into an obstacle, and a negative reward when its action gets it stuck on an obstacle so that it cannot reach the destination
  • The goal is to clear the maze and reach the destination
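
The maze workflow above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the 4x4 layout, the `find` and `step` helper names, the reward values of +1 and -1, and the choice to treat the maze boundary as an obstacle.

```python
# Hypothetical maze: "S" start, "G" destination, "#" obstacle, "." free cell.
MAZE = [
    "S.#.",
    ".##.",
    "....",
    "#.#G",
]
# Each action moves the agent by (row offset, column offset).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def find(symbol):
    """Return the (row, column) position of a symbol in the maze."""
    for r, row in enumerate(MAZE):
        for c, cell in enumerate(row):
            if cell == symbol:
                return (r, c)
    raise ValueError(f"{symbol!r} not found in maze")


def step(state, action):
    """Return (next_state, reward, done) for one move in the maze."""
    dr, dc = MOVES[action]
    r, c = state[0] + dr, state[1] + dc
    in_bounds = 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])
    if not in_bounds or MAZE[r][c] == "#":
        # The action hits an obstacle: stay put, negative reward (assumed -1).
        return state, -1.0, False
    # The action succeeds: positive reward (assumed +1); done at destination.
    return (r, c), 1.0, MAZE[r][c] == "G"


# Follow one obstacle-free route from the start to the destination.
state, total, done = find("S"), 0.0, False
for action in ["down", "down", "right", "right", "right", "down"]:
    state, reward, done = step(state, action)
    total += reward
print(state, total, done)
```

Here the state is a (row, column) pair, the action is one of four moves, and the reward follows the sign convention from the list. The action sequence is hard-coded; an RL algorithm would have to discover such a route from the rewards alone.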