
Chapter 6. Deep Q-Networks

In the previous chapter, we became familiar with the Bellman equation and a practical method of applying it called Value iteration. This approach allowed us to significantly improve the speed and convergence of our solution in the FrozenLake environment, which is promising, but can we go further?

In this chapter, we'll try to apply the same theory to problems of much greater complexity: arcade games from the Atari 2600 platform, which are the de facto benchmark of the RL research community. To deal with this new and more challenging goal, we'll talk about problems with the Value iteration method and introduce its variation, called Q-learning. In particular, we'll look at the application of Q-learning to so-called "grid world" environments, which is called tabular Q-learning, and then we'll discuss Q-learning in conjunction with neural networks. This combination is known as DQN (deep Q-network). At the end of the chapter, we'll reimplement the DQN algorithm from the famous paper Playing Atari with Deep Reinforcement Learning by V. Mnih and others, published in 2013, which started a new era in RL development.
