Summary

In this chapter, we learned about OpenAI Gym: how to install it and how to use its key functions to load and render an environment and to inspect its state and action spaces. We then covered the epsilon-greedy approach as a way to handle the exploration-exploitation dilemma, and implemented a basic Q-learning algorithm and a Q-network to train a reinforcement learning agent to navigate an OpenAI Gym environment.
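As a compact reminder of the two ideas recapped above, here is a minimal sketch of epsilon-greedy action selection and a single tabular Q-learning update. It is not the chapter's own code: the function names `epsilon_greedy` and `q_update`, the list-of-lists Q-table, and the default values for `alpha` and `gamma` are assumptions chosen for illustration, and the sketch deliberately avoids Gym so that it runs with the standard library alone.

```python
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """Return a random action with probability epsilon, else the greedy one.

    q_row is the list of Q-values for the current state (hypothetical layout).
    """
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_row))
    # Exploit: pick the action with the highest Q-value
    # (ties resolve to the lowest action index).
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning step and return the new Q(state, action).

    The entry is moved toward the target
    reward + gamma * max over a' of Q(next_state, a').
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


if __name__ == "__main__":
    # Toy Q-table with 2 states and 2 actions (values are made up).
    q_table = [[0.0, 0.0], [1.0, 2.0]]
    action = epsilon_greedy(q_table[0], epsilon=0.1)
    q_update(q_table, state=0, action=action, reward=1.0, next_state=1)
    print(q_table)
```

In a full training loop, `state`, `reward`, and `next_state` would come from the environment's reset and step calls, and `epsilon` is typically decayed over episodes so that the agent explores less as its estimates improve.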

In the next chapter, we will cover the most fundamental concepts in reinforcement learning, including Markov Decision Processes (MDPs), the Bellman equation, and Markov Chain Monte Carlo.
