Book title: Hands-On Q-Learning with Python
Author: Nazia Habib
Updated: 2021-06-24 15:13:13

Getting Started with the Q-Learning Algorithm

Q-learning is an algorithm designed to solve control problems that are modeled as Markov decision processes (MDPs). We will go over in detail what MDPs are, how they work, and how Q-learning is designed to solve them. We will also explore some classic reinforcement learning (RL) problems and learn how to develop solutions to them using Q-learning.

We will cover the following topics in this chapter:

Understanding what an MDP is and how Q-learning is designed to solve an MDP
Learning how to define the states an agent can be in and the actions it can take from those states, in the context of the OpenAI Gym Taxi-v2 environment that we will be using for our first project
Becoming familiar with the alpha (learning rate), gamma (discount factor), and epsilon (exploration rate) parameters
Diving into a classic RL problem, the multi-armed bandit problem (MABP), and putting it into a Q-learning context
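
As a preview of how the alpha, gamma, and epsilon rates listed above fit together, here is a minimal sketch of a single tabular Q-learning step. It is not the book's implementation: the function names `epsilon_greedy` and `q_update`, the two-state, two-action table, and the reward value are all assumptions made for illustration, and it uses only the Python standard library rather than the Taxi environment.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # explore
    return max(range(len(q_row)), key=lambda a: q_row[a])  # exploit


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply one tabular Q-learning update in place and return the new value."""
    best_next = max(q[next_state])          # value of the best next action
    td_target = reward + gamma * best_next  # discounted target
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical toy problem: 2 states, 2 actions, all Q-values start at zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(0)

action = epsilon_greedy(q_table[0], epsilon=0.1, rng=rng)
new_value = q_update(q_table, state=0, action=action, reward=1.0,
                     next_state=1, alpha=0.5, gamma=0.9)
print(action, new_value)
```

In this sketch, alpha controls how far the stored estimate moves toward the new target, gamma controls how much the best value of the next state counts toward that target, and epsilon controls how often the agent tries a random action instead of the one it currently rates highest.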
