Deep Reinforcement Learning Hands-On
Latest chapter: Index
Recent developments in reinforcement learning (RL), combined with deep learning (DL), have seen unprecedented progress made towards training agents to solve complex problems in a human-like way. Google's use of algorithms to play and defeat the well-known Atari arcade games has propelled the field to prominence, and researchers are generating new ideas at a rapid pace. Deep Reinforcement Learning Hands-On is a comprehensive guide to the very latest DL tools and their limitations. You will evaluate methods including Cross-entropy and policy gradients, before applying them to real-world environments. Take on both the Atari set of virtual games and family favorites such as Connect4. The book provides an introduction to the basics of RL, giving you the know-how to code intelligent learning agents to take on a formidable array of practical tasks. Discover how to implement Q-learning on 'grid world' environments, teach your agent to buy and trade stocks, and find out how natural language models are driving the boom in chatbots.
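The blurb mentions implementing Q-learning on 'grid world' environments. As a flavour of the kind of material the book covers, here is a minimal, self-contained sketch (not code from the book) of tabular Q-learning on a hypothetical deterministic 4x4 grid, where the agent starts in the top-left corner and earns a reward of 1 for reaching the bottom-right goal:

```python
# Tabular Q-learning on a toy 4x4 deterministic grid world
# (illustrative sketch only; the book uses FrozenLake from OpenAI Gym).
import random

N = 4                                          # grid is N x N; start (0,0), goal (N-1,N-1)
ACTIONS = [(0, 1), (1, 0), (0, -1), (-1, 0)]   # right, down, left, up
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1              # learning rate, discount, exploration rate

def step(state, a):
    """Apply action a; bumping into a wall leaves the state unchanged."""
    r, c = state
    dr, dc = ACTIONS[a]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < N and 0 <= nc < N):
        nr, nc = r, c
    done = (nr, nc) == (N - 1, N - 1)
    return (nr, nc), (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = {((r, c), a): 0.0
         for r in range(N) for c in range(N) for a in range(len(ACTIONS))}
    for _ in range(episodes):
        s, done = (0, 0), False
        while not done:
            if rng.random() < EPS:             # epsilon-greedy exploration
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda x: Q[(s, x)])
            s2, reward, done = step(s, a)
            best_next = max(Q[(s2, x)] for x in range(len(ACTIONS)))
            # Q-learning update: Q(s,a) += alpha * (r + gamma*max_a' Q(s',a') - Q(s,a))
            Q[(s, a)] += ALPHA * (reward + GAMMA * best_next - Q[(s, a)])
            s = s2
    return Q

Q = train()
# Greedy rollout: follow argmax-Q actions from the start state.
s, steps = (0, 0), 0
while s != (N - 1, N - 1) and steps < 20:
    a = max(range(len(ACTIONS)), key=lambda x: Q[(s, x)])
    s, _, _ = step(s, a)
    steps += 1
print(steps)  # the shortest path from (0,0) to (3,3) takes 6 moves
```

After training, the greedy policy walks the shortest (Manhattan-distance) path to the goal; the book's Chapter 5 develops the same update rule from the Bellman equation before scaling it up to deep Q-networks in Chapter 6.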
Table of Contents (159 sections)
- Cover
- Copyright information
- Why subscribe?
- PacktPub.com
- Contributors
- About the author
- About the reviewers
- Packt is Searching for Authors Like You
- Preface
- What this book covers
- To get the most out of this book
- Get in touch
- Chapter 1. What is Reinforcement Learning?
- Learning – supervised, unsupervised, and reinforcement
- RL formalisms and relations
- Markov decision processes
- Summary
- Chapter 2. OpenAI Gym
- The anatomy of the agent
- Hardware and software requirements
- OpenAI Gym API
- The random CartPole agent
- The extra Gym functionality – wrappers and monitors
- Summary
- Chapter 3. Deep Learning with PyTorch
- Tensors
- Gradients
- NN building blocks
- Custom layers
- Final glue – loss functions and optimizers
- Monitoring with TensorBoard
- Example – GAN on Atari images
- Summary
- Chapter 4. The Cross-Entropy Method
- Taxonomy of RL methods
- Practical cross-entropy
- Cross-entropy on CartPole
- Cross-entropy on FrozenLake
- Theoretical background of the cross-entropy method
- Summary
- Chapter 5. Tabular Learning and the Bellman Equation
- Value, state, and optimality
- The Bellman equation of optimality
- Value of action
- The value iteration method
- Value iteration in practice
- Q-learning for FrozenLake
- Summary
- Chapter 6. Deep Q-Networks
- Real-life value iteration
- Tabular Q-learning
- Deep Q-learning
- DQN on Pong
- Summary
- Chapter 7. DQN Extensions
- The PyTorch Agent Net library
- Basic DQN
- N-step DQN
- Double DQN
- Noisy networks
- Prioritized replay buffer
- Dueling DQN
- Categorical DQN
- Combining everything
- Summary
- References
- Chapter 8. Stocks Trading Using RL
- Trading
- Data
- Problem statements and key decisions
- The trading environment
- Models
- Training code
- Results
- Things to try
- Summary
- Chapter 9. Policy Gradients – An Alternative
- Values and policy
- The REINFORCE method
- REINFORCE issues
- PG on CartPole
- PG on Pong
- Summary
- Chapter 10. The Actor-Critic Method
- Variance reduction
- CartPole variance
- Actor-critic
- A2C on Pong
- A2C on Pong results
- Tuning hyperparameters
- Summary
- Chapter 11. Asynchronous Advantage Actor-Critic
- Correlation and sample efficiency
- Adding an extra A to A2C
- Multiprocessing in Python
- A3C – data parallelism
- A3C – gradients parallelism
- Summary
- Chapter 12. Chatbots Training with RL
- Chatbots overview
- Deep NLP basics
- Training of seq2seq
- The chatbot example
- Summary
- Chapter 13. Web Navigation
- Web navigation
- OpenAI Universe
- Simple clicking approach
- Human demonstrations
- Adding text description
- Things to try
- Summary
- Chapter 14. Continuous Action Space
- Why a continuous space?
- Action space
- Environments
- The Actor-Critic (A2C) method
- Deterministic policy gradients
- Distributional policy gradients
- Things to try
- Summary
- Chapter 15. Trust Regions – TRPO, PPO, and ACKTR
- Introduction
- Roboschool
- A2C baseline
- Proximal Policy Optimization
- Trust Region Policy Optimization
- A2C using ACKTR
- Summary
- Chapter 16. Black-Box Optimization in RL
- Black-box methods
- Evolution strategies
- ES on CartPole
- ES on HalfCheetah
- Genetic algorithms
- GA on CartPole
- GA tweaks
- GA on Cheetah
- Summary
- References
- Chapter 17. Beyond Model-Free – Imagination
- Model-based versus model-free
- Model imperfections
- Imagination-augmented agent
- I2A on Atari Breakout
- Experiment results
- Summary
- References
- Chapter 18. AlphaGo Zero
- Board games
- The AlphaGo Zero method
- Connect4 bot
- Connect4 results
- Summary
- References
- Book summary
- Other Books You May Enjoy
- Leave a review - let other readers know what you think
- Index

Updated: 2021-06-25 20:47:21