Decaying gamma

Decaying gamma makes the agent prioritize short-term rewards as it learns what those rewards are, and place less emphasis on long-term rewards.

Remember that a gamma value of 0 causes the agent to disregard future values entirely and focus only on current rewards, while a gamma value of 1 causes it to weight future values the same as current ones. Decaying gamma therefore shifts the agent's focus toward current rewards and away from future rewards.
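To make the two extremes concrete, here is a minimal sketch of how gamma weights a sequence of rewards. The function name `discounted_return` and the sample reward list are illustrative assumptions, not code from the original text.

```python
def discounted_return(rewards, gamma):
    """Sum the rewards, weighting the reward k steps ahead by gamma**k."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


# Hypothetical reward sequence: small rewards now, a large one later.
rewards = [1.0, 1.0, 1.0, 10.0]

# gamma = 0: only the current reward counts.
print(discounted_return(rewards, 0.0))  # 1.0

# gamma = 1: future rewards count exactly as much as the current one.
print(discounted_return(rewards, 1.0))  # 13.0

# gamma = 0.5: each step into the future halves a reward's weight.
print(discounted_return(rewards, 0.5))  # 3.0
```

As gamma moves from 1 toward 0, the large reward at the end of the sequence contributes less and less to the total, which is the shift in focus that decaying gamma produces.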

Intuitively, this benefits us because the closer we get to our goal, the more we want to take advantage of short-term rewards instead of holding out for future rewards that won't be available once we complete the task. By changing how we use our available resources as their availability changes, we can reach our goal faster and more efficiently.
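One way this could look in a tabular Q-learning loop is sketched below. The multiplicative schedule, the floor value, and the names `decayed_gamma`, `gamma_start`, `gamma_min`, `decay_rate`, and `q_update` are all assumptions chosen for illustration; the text does not prescribe a particular schedule.

```python
def decayed_gamma(episode, gamma_start=0.99, gamma_min=0.5, decay_rate=0.99):
    """Assumed schedule: shrink gamma each episode, never going below gamma_min."""
    return max(gamma_min, gamma_start * decay_rate ** episode)


def q_update(q_table, state, action, reward, next_state, alpha, gamma):
    """Standard Q-learning update, using whatever gamma the schedule supplies."""
    best_next = max(q_table[next_state])      # value of the best next action
    td_target = reward + gamma * best_next    # discounted target
    q_table[state][action] += alpha * (td_target - q_table[state][action])


# Hypothetical two-state, two-action table, for illustration only.
q_table = [[0.0, 0.0], [1.0, 2.0]]

for episode in (0, 50, 1000):
    # Gamma falls as training progresses, so later updates lean on the
    # immediate reward more than on the estimated future value.
    print(episode, decayed_gamma(episode))

q_update(q_table, state=0, action=1, reward=1.0, next_state=1,
         alpha=0.5, gamma=decayed_gamma(1000))
```

The floor `gamma_min` is a design choice in this sketch: it keeps the agent from ignoring future values completely, which a schedule decaying all the way to 0 would eventually do.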
