- Hands-On Q-Learning with Python
- Nazia Habib
- 2021-06-24 15:13:13
Getting Started with the Q-Learning Algorithm
Q-learning is an algorithm designed to solve control problems that are modeled as a Markov decision process (MDP). We will go over what MDPs are, how they work, and how Q-learning solves them. We will then explore some classic reinforcement learning (RL) problems and learn how to develop solutions to them using Q-learning.
We will cover the following topics in this chapter:
- Understanding what an MDP is and how Q-learning is designed to solve an MDP
- Learning how to define the states an agent can be in and the actions it can take from those states, in the context of the OpenAI Gym Taxi-v2 environment that we will use for our first project
- Becoming familiar with the alpha (learning rate), gamma (discount factor), and epsilon (exploration rate) parameters
- Diving into a classic RL problem, the multi-armed bandit problem (MABP), and putting it into a Q-learning context
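As a preview of how alpha, gamma, and epsilon fit together, here is a minimal sketch of one tabular Q-learning step. It is not the chapter's own implementation: the function names `epsilon_greedy` and `q_update`, the list-of-lists Q-table, and the toy three-state, two-action setup are all assumptions made for illustration, and no Gym environment is involved.

```python
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """Explore with probability epsilon, otherwise exploit the best-known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # explore: uniformly random action
    # exploit: index of the largest Q-value (first one wins on ties)
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply one Q-learning update in place and return the new Q-value.

    alpha is the learning rate and gamma is the discount factor.
    """
    best_next = max(q[next_state])          # value of the greedy next action
    td_target = reward + gamma * best_next  # discounted estimate of the return
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical toy problem: 3 states, 2 actions, Q-table initialized to zero.
q = [[0.0, 0.0] for _ in range(3)]
new_value = q_update(q, state=0, action=1, reward=1.0, next_state=1,
                     alpha=0.1, gamma=0.9)
greedy_action = epsilon_greedy([0.2, 0.7, 0.1], epsilon=0.0)
```

With epsilon set to 0 the agent always exploits, and with epsilon set to 1 it always explores; values in between trade the two off.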