
Discount factor

We have seen that an agent's goal is to maximize the return. For an episodic task, we can define the return as $R_t = r_{t+1} + r_{t+2} + \dots + r_T$, where $T$ is the final time step of the episode, and we try to maximize $R_t$.

Since a continuous task has no final state, we can define its return as $R_t = r_{t+1} + r_{t+2} + \dots$, a sum that runs to infinity. But how can we maximize the return if it never stops?

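That's why we introduce the notion of a discount factor. We can redefine our return with a discount factor $\gamma$, as follows:

$$R_t = r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \dots \qquad (1)$$

$$R_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k+1} \qquad (2)$$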
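To make the discounted return concrete, here is a minimal Python sketch. It assumes the usual definition, in which the reward received k steps ahead is weighted by the discount factor raised to the power k. The function name `discounted_return` and the reward values are hypothetical, chosen only for illustration, and the infinite sum is truncated to the rewards actually supplied.

```python
def discounted_return(rewards, gamma):
    """Discounted sum of a finite reward list: sum over k of gamma**k * rewards[k].

    rewards[0] plays the role of r_{t+1}, rewards[1] of r_{t+2}, and so on.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Walk backwards so each step applies: return = reward + gamma * next return.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical rewards r_{t+1}, r_{t+2}, r_{t+3}.
rewards = [1.0, 2.0, 3.0]
print(discounted_return(rewards, 0.5))  # 1 + 0.5*2 + 0.25*3 = 2.75
```

The backward loop is a design choice rather than a requirement: it avoids computing powers of the discount factor explicitly and mirrors the recursive form in which the return at one step equals the next reward plus the discounted return that follows.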

The discount factor decides how much importance we give to future rewards relative to immediate rewards. Its value lies between 0 and 1. A discount factor of 0 means that only the immediate reward counts, while a discount factor of 1 means that future rewards count just as much as immediate rewards. For values in between, a reward received $k$ steps ahead is weighted by the discount factor raised to the power $k$, so rewards matter less the further away they are.

With a discount factor of 0, the agent considers only the immediate reward and never learns to plan ahead; similarly, with a discount factor of 1, the agent weighs every future reward in full, and in a continuous task the return may grow to infinity. So, as a rule of thumb, a good value for the discount factor lies between 0.2 and 0.8, although the best choice depends on the task.
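The two extremes can be checked numerically. The sketch below assumes a deliberately simple setting that is not from the text: the agent receives the same reward of 1 at every step. The helper name `constant_reward_return` is hypothetical. The script compares the discounted sum over 100 and 1000 steps for several discount factors.

```python
def constant_reward_return(gamma, steps, reward=1.0):
    """Discounted sum of the same reward received for `steps` consecutive steps."""
    return sum(reward * gamma ** k for k in range(steps))


for gamma in (0.0, 0.5, 0.9, 1.0):
    short_horizon = constant_reward_return(gamma, 100)
    long_horizon = constant_reward_return(gamma, 1000)
    print(f"gamma={gamma}: 100 steps -> {short_horizon:.3f}, "
          f"1000 steps -> {long_horizon:.3f}")
```

For any discount factor below 1 the sum is a geometric series that settles near reward / (1 - gamma), so lengthening the horizon barely changes it. With a discount factor of exactly 1 the sum simply equals the number of steps and keeps growing as the horizon grows, which is the "may lead to infinity" case.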

How much importance we give to immediate versus future rewards depends on the use case. In some cases, future rewards are more desirable than immediate rewards, and in others the reverse holds. In a chess game, the goal is to defeat the opponent's king. If we give too much importance to the immediate reward, which is acquired by actions such as our pawn capturing an opponent's piece, the agent will learn to pursue this sub-goal instead of learning to reach the actual goal. So, in this case, we give importance to future rewards, whereas in other cases we prefer immediate rewards over future rewards. (Say, would you prefer chocolates today or 13 months from now?)
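The trade-off in the chess example can be sketched with two made-up reward sequences. Everything here is an assumption for illustration: the names `capture_now` and `win_later`, the reward values, and the five-step horizon are invented, and real chess rewards would look quite different. One plan earns a small reward immediately; the other earns a larger reward four steps later.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k] over the given finite reward list."""
    return sum(r * gamma ** k for k, r in enumerate(rewards))


# Hypothetical reward sequences, invented purely for illustration.
capture_now = [1.0, 0.0, 0.0, 0.0, 0.0]   # small reward right away
win_later = [0.0, 0.0, 0.0, 0.0, 10.0]    # large reward four steps later


def preferred_plan(gamma):
    """Name of the plan with the higher discounted return for this gamma."""
    now_value = discounted_return(capture_now, gamma)
    later_value = discounted_return(win_later, gamma)
    return "capture_now" if now_value > later_value else "win_later"


for gamma in (0.1, 0.9):
    print(f"gamma={gamma}: prefers {preferred_plan(gamma)}")
```

With a small discount factor the delayed reward is shrunk so heavily that the immediate capture looks better, while a discount factor close to 1 keeps most of the delayed reward's value and favors the long-term plan.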
