
Deep Q-learning 

In Q-learning, we generally work with a finite set of states and actions, so tables suffice to hold the Q-values and rewards. In practical applications, however, the number of states and applicable actions can be enormous or even infinite, and a more powerful Q-function approximator is needed to represent and learn the Q-function. This is where deep neural networks come to the rescue, since they are universal function approximators. We can represent the Q-function with a neural network that takes a state and an action as input and outputs the corresponding Q-value. Alternatively, we can feed the network only the state and have it output the Q-values for all of the actions at once. Both scenarios are illustrated in the following diagram. Since Q-values are real-valued estimates of expected cumulative reward, these networks perform regression:

Figure 1.17: Deep Q-learning function approximator network
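As a concrete illustration, here is a minimal PyTorch sketch of the second architecture from Figure 1.17: a network that takes only the state as input and outputs one Q-value per action. The state dimension, number of actions, and layer sizes are illustrative assumptions, not values from the text:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """State-only Q-network: maps a state to one Q-value per action."""
    def __init__(self, state_dim: int, num_actions: int, hidden_dim: int = 64):
        super().__init__()
        # Hypothetical dimensions chosen for illustration only
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_actions),  # one output unit per action
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)  # shape: (batch, num_actions)

# Usage sketch: evaluate Q-values for a dummy state and act greedily
q_net = QNetwork(state_dim=4, num_actions=2)
state = torch.randn(1, 4)                # a dummy state vector
q_values = q_net(state)                  # Q-values for all actions
greedy_action = q_values.argmax(dim=1)   # pick the action with the highest Q-value
```

Because the outputs are continuous Q-value estimates, the network would be trained with a regression loss (for example, mean squared error against the temporal-difference target) rather than a classification loss.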

In this book, we will use deep Q-learning to train a race car to drive by itself.
