
Decaying epsilon

We've discussed epsilon decay in the context of exploration versus exploitation. The more we get to know our environment, the less random exploration we want to do and the more we want to take actions that we know will give us high rewards. As learning progresses, our goal shifts toward taking advantage of what we already know.

We do this by reducing the agent's epsilon value by a set amount as the game progresses. Remember that epsilon is the probability (a value between 0 and 1) that the agent will take a random action instead of the action with the highest Q-value for the current state.

When we reduce epsilon, the likelihood of a random action becomes smaller, and we take more opportunities to benefit from the high-valued actions that we have already discovered. 
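
As a minimal sketch of the idea, the snippet below pairs an epsilon-greedy action choice with a multiplicative decay applied once per episode. The function names `choose_action` and `decay_epsilon`, the decay rate of 0.99, and the floor of 0.01 are illustrative assumptions rather than values from this book; other schedules (linear, exponential in the episode number) are equally plausible.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over a list of Q-values for the current state.

    With probability epsilon pick a random action (explore);
    otherwise pick the action with the highest Q-value (exploit).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decay_epsilon(epsilon, decay_rate=0.99, min_epsilon=0.01):
    """Shrink epsilon multiplicatively, never dropping below min_epsilon.

    decay_rate and min_epsilon are assumed, illustrative defaults.
    """
    return max(min_epsilon, epsilon * decay_rate)


epsilon = 1.0  # start fully exploratory
for episode in range(500):
    # ... run one episode here, calling choose_action(q_values, epsilon)
    # at every step and updating the Q-table as usual ...
    epsilon = decay_epsilon(epsilon)
```

Keeping a small floor on epsilon means the agent never stops exploring entirely, which can help if early Q-value estimates turn out to be misleading.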

For similar reasons, it can be to our benefit to decay alpha (the learning rate) and gamma (the discount factor) along with epsilon.
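
One way this could look, sketched under assumptions: a single helper applies the same multiplicative decay with a floor to each hyperparameter, using a separate rate and floor per parameter. The helper name `decayed`, the starting values, and every rate and floor below are hypothetical choices for illustration, not settings taken from the text; whether decaying each parameter actually helps depends on the problem.

```python
def decayed(value, rate, floor):
    """Return value * rate, but never less than floor."""
    return max(floor, value * rate)


# Starting values (assumed for illustration).
params = {"epsilon": 1.0, "alpha": 0.5, "gamma": 0.95}

# Per-parameter (decay rate, floor) pairs, also assumed.
schedule = {
    "epsilon": (0.99, 0.01),
    "alpha": (0.995, 0.05),
    "gamma": (0.999, 0.90),
}

for episode in range(100):
    # ... run one episode with the current params ...
    for name, (rate, floor) in schedule.items():
        params[name] = decayed(params[name], rate, floor)
```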
