- Hands-On Reinforcement Learning with Python
- Sudharsan Ravichandiran
- 2021-06-18 19:12:01
Policy function
A policy defines the agent's behavior in an environment: it determines which action the agent performs in a given situation. Say you want to reach your office from home. There are several routes; some are shortcuts, while others are long. These routes can be thought of as policies, because each one represents a way of choosing actions to reach the goal. A policy is often denoted by the symbol π. A policy can take the form of a lookup table or a complex search process.
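To make the lookup-table form concrete, here is a minimal sketch of the home-to-office example. The state names, action names, and the `follow_policy` helper are all hypothetical and chosen only for illustration; they do not come from the book. Each policy is a plain dictionary that maps a state to the action taken there.

```python
# Hypothetical world: (state, action) -> next state.
# All names below are illustrative assumptions, not the book's code.
transitions = {
    ("home", "take_shortcut"): "office",
    ("home", "take_main_road"): "city_center",
    ("city_center", "drive_on"): "office",
}

# Two lookup-table policies: each maps a state to one action.
shortcut_policy = {"home": "take_shortcut"}
long_route_policy = {"home": "take_main_road", "city_center": "drive_on"}


def follow_policy(policy, start="home", goal="office", max_steps=10):
    """Follow a lookup-table policy and return the list of visited states."""
    state = start
    route = [state]
    for _ in range(max_steps):
        if state == goal:
            break
        action = policy[state]                 # the policy picks the action
        state = transitions[(state, action)]   # the environment moves the agent
        route.append(state)
    return route


shortcut_route = follow_policy(shortcut_policy)
long_route = follow_policy(long_route_policy)
print(shortcut_route)
print(long_route)
```

Both policies reach the office, but they prescribe different actions along the way, which is the sense in which different routes correspond to different policies.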