
States, actions, and rewards

What does it mean to be in a state, to take an action, or to receive a reward? These are the most important concepts for us to understand intuitively, so let's dig deeper into them. The following diagram depicts the agent-environment interaction in an MDP:

The agent interacts with the environment through actions, and it receives rewards and state information from the environment. In other words, the states and rewards are feedback from the environment, and the actions are inputs to the environment from the agent. 
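The loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the book: `ToyEnvironment`, `run_episode`, the reward values, and the `always_forward` policy are all hypothetical names and numbers chosen for the example. The environment hands back a state and a reward after every action, and the agent's only input to the environment is the action it picks.

```python
class ToyEnvironment:
    """Hypothetical environment: the agent walks along a line toward a goal."""

    def __init__(self, goal=3):
        self.goal = goal
        self.position = 0

    def reset(self):
        """Start a new episode and return the initial state."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (+1 or -1) and return (next_state, reward, done)."""
        self.position = max(0, self.position + action)
        done = self.position == self.goal
        # Assumed reward scheme: small penalty per step, bonus at the goal.
        reward = 10 if done else -1
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one agent-environment interaction loop and record each transition."""
    state = env.reset()
    total_reward = 0
    history = []
    for _ in range(max_steps):
        action = policy(state)                       # agent -> environment
        next_state, reward, done = env.step(action)  # environment -> agent
        history.append((state, action, reward, next_state))
        total_reward += reward
        state = next_state
        if done:
            break
    return total_reward, history


def always_forward(state):
    """A trivial policy that ignores the state and always moves forward."""
    return 1


total, transitions = run_episode(ToyEnvironment(goal=3), always_forward)
print(total, transitions)
```

Each recorded tuple holds the state the agent was in, the action it took, the reward it received, and the state it ended up in, which is exactly the feedback cycle the diagram describes.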

Going back to our simple driving simulator example, the state describes our agent's current situation: it might be moving or stopped at a red light, and it might be turning left or right or heading straight. There might be other cars in the intersection, or there might not be. Our distance from the destination will be some number of units, X.
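One way to make this description concrete is to bundle these facts into a single state object. The sketch below is an assumption about how such a state could be encoded, not the book's own representation: `Motion`, `Heading`, `DrivingState`, and `enumerate_states` are hypothetical names, and treating the distance as a whole number of units is a simplification.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Motion(Enum):
    MOVING = "moving"
    STOPPED_AT_RED_LIGHT = "stopped_at_red_light"


class Heading(Enum):
    LEFT = "left"
    RIGHT = "right"
    STRAIGHT = "straight"


@dataclass(frozen=True)  # frozen makes instances hashable, so they can be dict keys
class DrivingState:
    motion: Motion
    heading: Heading
    other_cars_in_intersection: bool
    distance_to_destination: int  # the "X units" from the text, assumed integer


def enumerate_states(max_distance):
    """List every state with a distance between 0 and max_distance inclusive."""
    return [
        DrivingState(motion, heading, cars, distance)
        for motion, heading, cars, distance in product(
            Motion, Heading, (False, True), range(max_distance + 1)
        )
    ]


state = DrivingState(Motion.MOVING, Heading.STRAIGHT, False, 4)
all_states = enumerate_states(max_distance=4)
print(state, len(all_states))
```

Even this small description yields 2 × 3 × 2 × 5 = 60 distinct states when the distance ranges from 0 to 4, which gives a feel for how quickly a state space grows as more details of the environment are tracked.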
