- Python Reinforcement Learning
- Sudharsan Ravichandiran Sean Saito Rajalingappaa Shanmugamani Yang Wenzhuo
- 123字
- 2021-06-24 15:17:30
The Markov Decision Process and Dynamic Programming
The Markov Decision Process (MDP) provides a mathematical framework for solving the reinforcement learning (RL) problem. Almost all RL problems can be modeled as MDP. MDP is widely used for solving various optimization problems. In this chapter, we will understand what MDP is and how can we use it to solve RL problems. We will also learn about dynamic programming, which is a technique for solving complex problems in an efficient way.
In this chapter, you will learn about the following topics:
- The Markov chain and Markov process
- The Markov Decision Process
- Rewards and returns
- The Bellman equation
- Solving a Bellman equation using dynamic programming
- Solving a frozen lake problem using value and policy iteration
推薦閱讀
- 從零開始學Hadoop大數據分析(視頻教學版)
- Visual Studio 2015 Cookbook(Second Edition)
- 分布式數據庫系統:大數據時代新型數據庫技術(第3版)
- MySQL從入門到精通(第3版)
- 商業分析思維與實踐:用數據分析解決商業問題
- 區塊鏈通俗讀本
- 區塊鏈:看得見的信任
- WS-BPEL 2.0 Beginner's Guide
- Lego Mindstorms EV3 Essentials
- ZeroMQ
- Oracle數據庫管理、開發與實踐
- 數據分析師養成寶典
- R Object-oriented Programming
- 成功之路:ORACLE 11g學習筆記
- 深入理解Flink:實時大數據處理實踐