TensorFlow Reinforcement Learning Quick Start Guide
Advances in reinforcement learning algorithms have made it possible to use them for optimal control in several different industrial applications. With this book, you will apply Reinforcement Learning to a range of problems, from computer games to autonomous driving. The book starts by introducing you to essential Reinforcement Learning concepts such as agents, environments, rewards, and advantage functions. You will also master the distinctions between on-policy and off-policy algorithms, as well as model-free and model-based algorithms. You will also learn about several Reinforcement Learning algorithms, such as SARSA, Deep Q-Networks (DQN), Deep Deterministic Policy Gradients (DDPG), Asynchronous Advantage Actor-Critic (A3C), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO). The book will also show you how to code these algorithms in TensorFlow and Python and apply them to solve computer games from OpenAI Gym. Finally, you will also learn how to train a car to drive autonomously in the TORCS racing car simulator. By the end of the book, you will be able to design, build, train, and evaluate feed-forward neural networks and convolutional neural networks. You will also have mastered coding state-of-the-art algorithms and also training agents for various control problems.
Table of contents (168 chapters)
- Cover page
- Title Page
- Copyright and Credits
- TensorFlow Reinforcement Learning Quick Start Guide
- Dedication
- About Packt
- Why subscribe?
- Packt.com
- Contributors
- About the author
- About the reviewer
- Packt is searching for authors like you
- Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Download the color images
- Conventions used
- Get in touch
- Reviews
- Up and Running with Reinforcement Learning
- Why RL?
- Formulating the RL problem
- The relationship between an agent and its environment
- Defining the states of the agent
- Defining the actions of the agent
- Understanding policy, value, and advantage functions
- Identifying episodes
- Identifying reward functions and the concept of discounted rewards
- Rewards
- Learning the Markov decision process
- Defining the Bellman equation
- On-policy versus off-policy learning
- On-policy method
- Off-policy method
- Model-free and model-based training
- Algorithms covered in this book
- Summary
- Questions
- Further reading
- Temporal Difference, SARSA, and Q-Learning
- Technical requirements
- Understanding TD learning
- Relation between the value functions and state
- Understanding SARSA and Q-Learning
- Learning SARSA
- Understanding Q-learning
- Cliff walking and grid world problems
- Cliff walking with Q-learning
- Grid world with SARSA
- Summary
- Further reading
- Deep Q-Network
- Technical requirements
- Learning the theory behind a DQN
- Understanding target networks
- Learning about replay buffer
- Getting introduced to the Atari environment
- Summary of Atari games
- Pong
- Breakout
- Space Invaders
- LunarLander
- The Arcade Learning Environment
- Coding a DQN in TensorFlow
- Using the model.py file
- Using the funcs.py file
- Using the dqn.py file
- Evaluating the performance of the DQN on Atari Breakout
- Summary
- Questions
- Further reading
- Double DQN, Dueling Architectures, and Rainbow
- Technical requirements
- Understanding Double DQN
- Updating the Bellman equation
- Coding DDQN and training to play Atari Breakout
- Evaluating the performance of DDQN on Atari Breakout
- Understanding dueling network architectures
- Coding dueling network architecture and training it to play Atari Breakout
- Combining V and A to obtain Q
- Evaluating the performance of dueling architectures on Atari Breakout
- Understanding Rainbow networks
- DQN improvements
- Prioritized experience replay
- Multi-step learning
- Distributional RL
- Noisy nets
- Running a Rainbow network on Dopamine
- Rainbow using Dopamine
- Summary
- Questions
- Further reading
- Deep Deterministic Policy Gradient
- Technical requirements
- Actor-Critic algorithms and policy gradients
- Policy gradient
- Deep Deterministic Policy Gradient
- Coding ddpg.py
- Coding AandC.py
- Coding TrainOrTest.py
- Coding replay_buffer.py
- Training and testing the DDPG on Pendulum-v0
- Summary
- Questions
- Further reading
- Asynchronous Methods - A3C and A2C
- Technical requirements
- The A3C algorithm
- Loss functions
- CartPole and LunarLander
- CartPole
- LunarLander
- The A3C algorithm applied to CartPole
- Coding cartpole.py
- Coding a3c.py
- The AC class
- The Worker() class
- Coding utils.py
- Training on CartPole
- The A3C algorithm applied to LunarLander
- Coding lunar.py
- Training on LunarLander
- The A2C algorithm
- Summary
- Questions
- Further reading
- Trust Region Policy Optimization and Proximal Policy Optimization
- Technical requirements
- Learning TRPO
- TRPO equations
- Learning PPO
- PPO loss functions
- Using PPO to solve the MountainCar problem
- Coding the class_ppo.py file
- Coding train_test.py file
- Evaluating the performance
- Full throttle
- Random throttle
- Summary
- Questions
- Further reading
- Deep RL Applied to Autonomous Driving
- Technical requirements
- Car driving simulators
- Learning to use TORCS
- State space
- Support files
- Training a DDPG agent to learn to drive
- Coding ddpg.py
- Coding AandC.py
- Coding TrainOrTest.py
- Training a PPO agent
- Summary
- Questions
- Further reading
- Assessment
- Chapter 1
- Chapter 3
- Chapter 4
- Chapter 5
- Chapter 6
- Chapter 7
- Chapter 8
- Other Books You May Enjoy
- Leave a review - let other readers know what you think

Last updated: 2021-06-24 15:29:32