Book title: Reinforcement Learning with TensorFlow
Author: Sayon Dutta
Section length: 81 words
Last updated: 2021-08-27 18:51:57

The policy model for optimality

A policy is the model that guides the agent's action selection in different states. The policy is denoted by $\pi$, and $\pi(a \mid s)$ is the probability of taking a certain action $a$ given a particular state $s$:

$$\pi(a \mid s) = P(A_t = a \mid S_t = s)$$

Thus, a policy maps each state to a set of probabilities over the different actions available in that state. Together, the policy and the value function form the solution that guides the agent's navigation: the policy selects the actions, and the value function supplies the calculated value of each state.
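
As a minimal sketch of the idea above, a policy can be stored as a table that holds, for every state, the probability of each action. The states (`s0`, `s1`), the actions (`left`, `right`), the probabilities, and the helper names below are all hypothetical and chosen only for illustration; they do not come from the book.

```python
import random

# Hypothetical two-state, two-action problem. The names and numbers are
# illustrative assumptions, not values taken from the text.
POLICY = {
    "s0": {"left": 0.8, "right": 0.2},
    "s1": {"left": 0.1, "right": 0.9},
}


def action_probability(policy, state, action):
    """Return the probability of `action` in `state`; missing actions get 0."""
    return policy[state].get(action, 0.0)


def sample_action(policy, state, rng=random):
    """Draw one action from the policy's distribution for `state`."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def is_valid_policy(policy, tol=1e-9):
    """Check that each state's probabilities are non-negative and sum to 1."""
    return all(
        all(p >= 0 for p in dist.values())
        and abs(sum(dist.values()) - 1.0) <= tol
        for dist in policy.values()
    )
```

Calling `sample_action(POLICY, "s0")` repeatedly would return `left` in roughly 80% of draws under these assumed numbers. A deterministic policy is the special case in which one action per state has probability 1.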
