- Reinforcement Learning with TensorFlow
- Sayon Dutta
- 81字
- 2021-08-27 18:51:57
The policy model for optimality
Policy is defined as the model that guides the agent with action selection in different states. Policy is denoted as .
is basically the probability of a certain action given a particular state:

Thus, a policy map will provide the set of probabilities of different actions given a particular state. The policy along with the value function create a solution that helps in agent navigation as per the policy and the calculated value of the state.
推薦閱讀
- 大數據導論:思維、技術與應用
- 大數據技術與應用基礎
- Unreal Engine:Game Development from A to Z
- 面向STEM的mBlock智能機器人創新課程
- Java實用組件集
- Visual FoxPro 6.0數據庫與程序設計
- SharePoint 2010開發最佳實踐
- Hadoop Real-World Solutions Cookbook(Second Edition)
- 大型數據庫管理系統技術、應用與實例分析:SQL Server 2005
- 數據挖掘方法及天體光譜挖掘技術
- Arduino &樂高創意機器人制作教程
- CompTIA Network+ Certification Guide
- 大數據時代
- Apache Superset Quick Start Guide
- Implementing Splunk 7(Third Edition)