- PyTorch 1.x Reinforcement Learning Cookbook
- Yuxi (Hayden) Liu
- 2021-06-24 12:34:41
Developing the hill-climbing algorithm
As we saw with the random search policy, each episode is independent of the others. In fact, all episodes in random search can be run in parallel, and the weight that achieves the best performance is selected at the end. The plot of reward versus episode confirms this: it shows no upward trend. In this recipe, we will develop a different algorithm, hill climbing, which transfers the knowledge acquired in one episode to the next.
In the hill-climbing algorithm, we also start with a randomly chosen weight. But here, in every episode, we add some noise to the current weight and run the episode with the perturbed weight. If the total reward improves, we replace the weight with the new one; otherwise, we keep the old weight. With this approach, the weight improves gradually as we progress through the episodes, instead of jumping around from one episode to the next.
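The update rule described above can be sketched in a few lines. This is a minimal illustration, not the recipe's own implementation: it uses only the Python standard library, and `run_episode`, `TARGET`, `noise_scale`, and `hill_climb` are hypothetical names introduced here. In particular, `run_episode` is a toy stand-in that scores a weight by its distance to a made-up optimum, where a real setup would run the policy in an environment and return the episode's total reward.

```python
import random

# Assumption: an arbitrary "ideal" weight, used only so the toy reward has a peak.
TARGET = [0.5, -0.3, 0.8, 0.1]


def run_episode(weight):
    """Hypothetical stand-in for one episode: return a total reward for `weight`.

    The reward is the negative squared distance to TARGET, so higher is better.
    """
    return -sum((w - t) ** 2 for w, t in zip(weight, TARGET))


def hill_climb(n_episode=500, noise_scale=0.1, seed=0):
    """Keep a perturbed weight only when it improves the total reward."""
    rng = random.Random(seed)

    # Start from a randomly chosen weight, as in random search.
    best_weight = [rng.uniform(-1.0, 1.0) for _ in TARGET]
    best_reward = run_episode(best_weight)
    history = [best_reward]  # best reward seen so far, per episode

    for _ in range(n_episode):
        # Add Gaussian noise to the current best weight.
        candidate = [w + noise_scale * rng.gauss(0.0, 1.0) for w in best_weight]
        reward = run_episode(candidate)

        # Accept the new weight only if the total reward improves.
        if reward > best_reward:
            best_weight, best_reward = candidate, reward

        history.append(best_reward)

    return best_weight, best_reward, history


if __name__ == "__main__":
    weight, reward, history = hill_climb()
    print("initial reward:", history[0])
    print("final reward:  ", reward)
```

Because a candidate is accepted only when it beats the current best, the recorded best reward never decreases from one episode to the next, which is the contrast with random search that the text describes. The fixed `noise_scale` is a simplifying assumption; it controls how far each candidate strays from the current weight.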