Title: Hands-On Q-Learning with Python
Author: Nazia Habib
Chapter word count: 61
Updated: 2021-06-24 15:13:14

Demystifying MDPs

The technical purpose of Q-learning is to find solutions to a type of optimization problem called a Markov decision process (MDP).

When we talk about states and the actions that we can take from each state, we are using concepts developed in the context of MDPs (and of the Markov chains and other state-space models from which MDPs are derived).
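To make the vocabulary of states and actions concrete, here is a minimal sketch of how a small MDP could be written down in Python. The two states ("cool" and "warm"), the two actions ("slow" and "fast"), the probabilities, and the rewards are all hypothetical values invented for illustration, as are the helper names available_actions and step; none of them come from the book.

```python
import random

# Hypothetical two-state MDP, invented purely for illustration.
# transitions[state][action] is a list of (probability, next_state, reward).
transitions = {
    "cool": {
        "slow": [(1.0, "cool", 1.0)],
        "fast": [(0.5, "cool", 2.0), (0.5, "warm", 2.0)],
    },
    "warm": {
        "slow": [(0.5, "cool", 1.0), (0.5, "warm", 1.0)],
        "fast": [(1.0, "warm", -10.0)],
    },
}


def available_actions(state):
    """Return the actions that can be taken from the given state."""
    return list(transitions[state])


def step(state, action, rng=random):
    """Sample one transition and return (next_state, reward).

    The outcome depends only on the current state and the chosen action,
    not on how the agent arrived here. That is the Markov property.
    """
    outcomes = transitions[state][action]
    weights = [probability for probability, _, _ in outcomes]
    _, next_state, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward
```

Calling step("cool", "fast") returns either ("cool", 2.0) or ("warm", 2.0) with equal probability. A Q-learning agent would interact with such an environment only through calls like this, without reading the transitions table directly.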
