- PyTorch 1.x Reinforcement Learning Cookbook
- Yuxi (Hayden) Liu
Markov Decision Processes and Dynamic Programming
In this chapter, we will continue our practical reinforcement learning journey with PyTorch by looking at Markov decision processes (MDPs) and dynamic programming. The chapter starts with the creation of a Markov chain and an MDP, which are at the core of most reinforcement learning algorithms. You will also become more familiar with Bellman equations by practicing policy evaluation. We will then move on and apply two approaches to solving an MDP: value iteration and policy iteration, using the FrozenLake environment as an example. At the end of the chapter, we will demonstrate how to solve the interesting coin-flipping gamble problem with dynamic programming, step by step.
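As a taste of the first recipe, a Markov chain can be represented by nothing more than a transition matrix, and multi-step behavior falls out of matrix powers. The sketch below uses a hypothetical two-state chain (the states and probabilities are invented for illustration, not taken from the book):

```python
import torch

# Hypothetical two-state Markov chain.
# T[i, j] = P(next state = j | current state = i); each row sums to 1.
T = torch.tensor([[0.4, 0.6],
                  [0.8, 0.2]])

# The k-step transition matrix is simply T raised to the k-th power.
T_10 = torch.matrix_power(T, 10)

# Start deterministically in state 0 and propagate the distribution forward.
v0 = torch.tensor([[1.0, 0.0]])
dist_10 = v0 @ T_10  # state distribution after 10 steps

print(dist_10)
```

For a chain like this one, the distribution after a few steps is already close to the stationary distribution, regardless of the starting state.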
The following recipes will be covered in this chapter:
- Creating a Markov chain
- Creating an MDP
- Performing policy evaluation
- Simulating the FrozenLake environment
- Solving an MDP with a value iteration algorithm
- Solving an MDP with a policy iteration algorithm
- Solving the coin-flipping gamble problem
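Before diving into the recipes, here is a minimal sketch of the value iteration idea mentioned above, applied to a tiny hypothetical MDP (the transition probabilities and rewards are made up for illustration; the book's recipes use the FrozenLake environment instead). Each sweep applies the Bellman optimality backup until the value estimates stop changing:

```python
import torch

gamma = 0.9  # discount factor

# Hypothetical MDP with 2 states and 2 actions.
# P[a, s, s'] = P(next state = s' | state = s, action = a)
P = torch.tensor([[[0.9, 0.1],
                   [0.1, 0.9]],
                  [[0.5, 0.5],
                   [0.3, 0.7]]])
# R[s, a] = immediate reward for taking action a in state s
R = torch.tensor([[1.0, 0.0],
                  [0.0, 2.0]])

V = torch.zeros(2)
for _ in range(1000):
    # Bellman optimality backup:
    # Q[s, a] = R[s, a] + gamma * sum_{s'} P[a, s, s'] * V[s']
    Q = R + gamma * (P @ V).T
    V_new, policy = Q.max(dim=1)
    if torch.max(torch.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

print(V)       # optimal state values
print(policy)  # greedy (optimal) action per state
```

The converged `V` satisfies the Bellman optimality equation, and the greedy policy extracted from `Q` is an optimal policy; policy iteration reaches the same fixed point by alternating evaluation and improvement steps.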