- Hands-On Q-Learning with Python
- Nazia Habib
- 2021-06-24 15:13:09
States, actions, and rewards
What does it mean to be in a state, to take an action, or to receive a reward? These are the most important concepts for us to understand intuitively, so let's dig deeper into them. The following diagram depicts the agent-environment interaction in an MDP:

The agent interacts with the environment through actions, and it receives rewards and state information from the environment. In other words, the states and rewards are feedback from the environment, and the actions are inputs to the environment from the agent.
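To make this feedback loop concrete, here is a minimal sketch in Python. The `ToyEnvironment` class, its `reset` and `step` methods, and the reward values are hypothetical and chosen only for illustration; they are not taken from the book or from any particular library. The sketch shows only the direction of information flow: the action goes into the environment, and the next state and reward come back.

```python
class ToyEnvironment:
    """Hypothetical 1-D road: the agent starts at position 0, the goal is at position 3."""

    def __init__(self, goal=3):
        self.goal = goal
        self.position = 0

    def reset(self):
        # Put the agent back at the start and return the initial state.
        self.position = 0
        return self.position

    def step(self, action):
        # Assumed action encoding: 0 = stay put, 1 = move forward one unit.
        if action == 1:
            self.position += 1
        done = self.position >= self.goal
        # Assumed rewards: +10 for reaching the goal, -1 for every other step.
        reward = 10 if done else -1
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the sum of rewards the agent received."""
    state = env.reset()
    total_reward = 0
    for _ in range(max_steps):
        action = policy(state)                   # agent -> environment
        state, reward, done = env.step(action)   # environment -> agent
        total_reward += reward
        if done:
            break
    return total_reward


def always_forward(state):
    # A trivial policy that ignores the state and always moves forward.
    return 1


total = run_episode(ToyEnvironment(), always_forward)
print(total)
```

With the always-forward policy, the agent takes two steps at a reward of -1 each and then reaches the goal for +10, so the episode total is 8.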
Going back to our simple driving simulator example, our agent might be moving or stopped at a red light, turning left or right, or heading straight. There might be other cars in the intersection, or there might not be. Our distance from the destination will be X units.
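One way to picture such a state is as a small record that bundles these observations together. The sketch below is an assumption made for illustration: the class name `DrivingState`, its field names, and the enum values are hypothetical and do not come from the book.

```python
from dataclasses import dataclass
from enum import Enum


class Motion(Enum):
    MOVING = "moving"
    STOPPED = "stopped"


class Heading(Enum):
    LEFT = "left"
    RIGHT = "right"
    STRAIGHT = "straight"


@dataclass(frozen=True)
class DrivingState:
    """Hypothetical snapshot of what the agent observes at one time step."""
    motion: Motion                      # moving, or stopped (for example at a red light)
    heading: Heading                    # turning left or right, or heading straight
    other_cars_in_intersection: bool    # whether other cars are present
    distance_to_destination: int        # the "X units" from the text


# Example: stopped, facing straight ahead, other cars present, 5 units from the destination.
state = DrivingState(Motion.STOPPED, Heading.STRAIGHT, True, 5)

# A frozen dataclass is hashable, so a state can be used as a dictionary key,
# for example to look up a value associated with that state.
values = {state: 0.0}
print(values[state])
```

Making the record immutable is a design choice here: two snapshots with the same field values compare equal and hash the same, so they are treated as the same state.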