Python Reinforcement Learning Projects
Reinforcement learning is one of the most exciting and rapidly growing fields in machine learning. This is due to the many novel algorithms developed and incredible results published in recent years. In this book, you will learn about the core concepts of RL including Q-learning, policy gradients, Monte Carlo processes, and several deep reinforcement learning algorithms. As you make your way through the book, you'll work on projects with datasets of various modalities including image, text, and video. You will gain experience in several domains, including gaming, image processing, and physical simulations. You'll explore technologies such as TensorFlow and OpenAI Gym to implement deep learning reinforcement learning algorithms that also predict stock prices, generate natural language, and even build other neural networks. By the end of this book, you will have hands-on experience with eight reinforcement learning projects, each addressing different topics and/or algorithms. We hope these practical exercises will provide you with better intuition and insight about the field of reinforcement learning and how to apply its algorithms to various problems in real life.
Table of Contents (175 chapters)
- Cover
- Title Page
- Copyright and Credits
- Python Reinforcement Learning Projects
- Packt Upsell
- Why subscribe?
- Packt.com
- Contributors
- About the authors
- About the reviewer
- Packt is searching for authors like you
- Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Conventions used
- Get in touch
- Reviews
- Up and Running with Reinforcement Learning
- Introduction to this book
- Expectations
- Hardware and software requirements
- Installing packages
- What is reinforcement learning?
- The agent
- Policy
- Value function
- Model
- Markov decision process (MDP)
- Deep learning
- Neural networks
- Backpropagation
- Convolutional neural networks
- Advantages of neural networks
- Implementing a convolutional neural network in TensorFlow
- TensorFlow
- The Fashion-MNIST dataset
- Building the network
- Methods for building the network
- build method
- fit method
- Summary
- References
- Balancing CartPole
- OpenAI Gym
- Gym
- Installation
- Running an environment
- Atari
- Algorithmic tasks
- MuJoCo
- Robotics
- Markov models
- CartPole
- Summary
- Playing Atari Games
- Introduction to Atari games
- Building an Atari emulator
- Getting started
- Implementation of the Atari emulator
- Atari simulator using gym
- Data preparation
- Deep Q-learning
- Basic elements of reinforcement learning
- Demonstrating basic Q-learning algorithm
- Implementation of DQN
- Experiments
- Summary
- Simulating Control Tasks
- Introduction to control tasks
- Getting started
- The classic control tasks
- Deterministic policy gradient
- The theory behind policy gradient
- DPG algorithm
- Implementation of DDPG
- Experiments
- Trust region policy optimization
- Theory behind TRPO
- TRPO algorithm
- Experiments on MuJoCo tasks
- Summary
- Building Virtual Worlds in Minecraft
- Introduction to the Minecraft environment
- Data preparation
- Asynchronous advantage actor-critic algorithm
- Implementation of A3C
- Experiments
- Summary
- Learning to Play Go
- A brief introduction to Go
- Go and other board games
- Go and AI research
- Monte Carlo tree search
- Selection
- Expansion
- Simulation
- Update
- AlphaGo
- Supervised learning policy networks
- Reinforcement learning policy networks
- Value network
- Combining neural networks and MCTS
- AlphaGo Zero
- Training AlphaGo Zero
- Comparison with AlphaGo
- Implementing AlphaGo Zero
- Policy and value networks
- preprocessing.py
- features.py
- network.py
- Monte Carlo tree search
- mcts.py
- Combining PolicyValueNetwork and MCTS
- alphagozero_agent.py
- Putting everything together
- controller.py
- train.py
- Summary
- References
- Creating a Chatbot
- The background problem
- Dataset
- Step-by-step guide
- Data parser
- Data reader
- Helper methods
- Chatbot model
- Training the data
- Testing and results
- Summary
- Generating a Deep Learning Image Classifier
- Neural Architecture Search
- Generating and training child networks
- Training the Controller
- Training algorithm
- Implementing NAS
- child_network.py
- cifar10_processor.py
- controller.py
- Method for generating the Controller
- Generating a child network using the Controller
- train_controller method
- Testing ChildCNN
- config.py
- train.py
- Additional exercises
- Advantages of NAS
- Summary
- Predicting Future Stock Prices
- Background problem
- Data used
- Step-by-step guide
- Actor script
- Critic script
- Agent script
- Helper script
- Training the data
- Final result
- Summary
- Looking Ahead
- The shortcomings of reinforcement learning
- Resource efficiency
- Reproducibility
- Explainability/accountability
- Susceptibility to attacks
- Upcoming developments in reinforcement learning
- Addressing the limitations
- Transfer learning
- Multi-agent reinforcement learning
- Summary
- References
- Other Books You May Enjoy
- Leave a review - let other readers know what you think

Updated: 2021-07-23 19:05:36