官术网_书友最值得收藏!

Alpha – deterministic versus stochastic environments

Your agent's learning rate alpha ranges from zero to one. Setting the learning rate to zero will cause your agent to learn nothing. All of its exploration of its environment and the rewards it receives will not affect its behavior at all, and it will continue to behave completely randomly.

Setting the learning rate to one will cause your agent to learn policies that are fully specific to a deterministic environment. One important distinction to understand is between deterministic and stochastic environments and policies.

Briefly, in a deterministic environment, the output is totally determined by the initial conditions and there is no randomness involved. We always take the same action from the same state in a deterministic environment.

In a stochastic environment, there is randomness involved and the decisions that we make are given as probability distributions. In other words, we don't always take the same action from the same state. 

主站蜘蛛池模板: 云南省| 丰原市| 深泽县| 泗水县| 东阿县| 开封县| 东平县| 固安县| 克什克腾旗| 怀集县| 兴业县| 新巴尔虎左旗| 靖西县| 汕尾市| 凌海市| 丰镇市| 华容县| 和林格尔县| 德安县| 鱼台县| 元谋县| 峡江县| 且末县| 德江县| 通海县| 静海县| 黑河市| 诸暨市| 朝阳市| 六安市| 应城市| 伊吾县| 当涂县| 比如县| 绥化市| 卢龙县| 宁乡县| 汕尾市| 绵竹市| 密山市| 台中市|