- Hands-On Q-Learning with Python
- Nazia Habib
- 2021-06-24 15:13:17
Decaying gamma
Decaying gamma makes the agent prioritize short-term rewards as it learns what those rewards are, and place less emphasis on long-term rewards.
Remember that a gamma value of 0 causes the agent to disregard future values entirely and focus only on current rewards, while a gamma value of 1 causes it to weight future values the same as current ones. Decaying gamma therefore shifts the agent's focus toward current rewards and away from future rewards.
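To make the effect of gamma concrete, the following is a minimal sketch that computes the discounted return of one reward sequence under several gamma values. The helper name `discounted_return` and the reward sequence are assumptions chosen for illustration; they do not come from the book's code.

```python
def discounted_return(rewards, gamma):
    """Sum the rewards, discounting each by gamma raised to its time step."""
    return sum(r * gamma ** t for t, r in enumerate(rewards))


# Hypothetical episode: a small immediate reward and a large reward three steps later.
rewards = [1.0, 0.0, 0.0, 10.0]

for gamma in (1.0, 0.5, 0.0):
    # Lower gamma shrinks the contribution of the delayed reward.
    print(f"gamma={gamma}: return={discounted_return(rewards, gamma)}")
```

With gamma at 1 the delayed reward counts in full, and with gamma at 0 only the immediate reward contributes, so lowering gamma over time moves the agent between these two extremes.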
Intuitively, this benefits us because the closer we get to our goal, the more we want to take advantage of short-term rewards instead of holding out for future rewards that won't be available once the task is complete. By adjusting how the agent uses its available resources as the availability of those resources changes, we can reach our goal faster and more efficiently.
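One way to put this into practice is to lower gamma on a schedule and pass the current value into the Q-learning update. The sketch below assumes an exponential per-episode decay with a floor. The function names `decayed_gamma` and `q_update`, the parameter values, and the choice of schedule are illustrative assumptions, not a method prescribed by the text.

```python
import numpy as np


def decayed_gamma(episode, gamma_start=0.99, gamma_min=0.5, decay_rate=0.995):
    """Assumed schedule: decay gamma exponentially per episode, never below gamma_min."""
    return max(gamma_min, gamma_start * decay_rate ** episode)


def q_update(Q, state, action, reward, next_state, alpha, gamma):
    """Tabular Q-learning update using the gamma supplied by the caller."""
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (td_target - Q[state, action])
    return Q


# Hypothetical two-state, two-action table where state 1 already has a known value.
Q = np.zeros((2, 2))
Q[1, 1] = 4.0

for episode in (0, 100, 1000):
    gamma = decayed_gamma(episode)
    # Apply the same transition to a fresh copy to compare the effect of gamma alone.
    updated = q_update(Q.copy(), state=0, action=0, reward=1.0,
                       next_state=1, alpha=1.0, gamma=gamma)
    print(f"episode={episode}: gamma={gamma:.3f}, Q[0, 0]={updated[0, 0]:.3f}")
```

As the episode count grows, the same transition yields a smaller updated value because the future estimate from the next state is discounted more heavily. The floor `gamma_min` is a design choice that keeps the agent from ignoring future rewards entirely.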