- Reinforcement Learning with TensorFlow
- Sayon Dutta
- 2021-08-27 18:51:57
Reinforcement learning
Reinforcement learning is a branch of artificial intelligence in which an agent perceives its environment as a state, chooses an action from the available action space, and acts on the environment, which results in a new state and a reward as feedback for that action. The received reward is assigned to the new state. Just as we minimized a cost function to train our neural network, here the reinforcement learning agent has to maximize the overall reward to find the optimal policy for a particular task.
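The interaction described above can be sketched as a short loop. This is only an illustrative sketch, not the book's code: the `LineWorld` environment, its reward values, and the two policies are hypothetical names and numbers chosen for the example.

```python
import random


class LineWorld:
    """Hypothetical toy environment: positions 0..size-1 on a line, goal at the right end."""

    def __init__(self, size=5):
        self.size = size
        self.state = 0

    def reset(self):
        # Start every episode at the left end of the line.
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clipped to the line.
        self.state = max(0, min(self.size - 1, self.state + action))
        done = self.state == self.size - 1
        # Assumed reward scheme: +1.0 for reaching the goal, -0.1 for every other step.
        reward = 1.0 if done else -0.1
        return self.state, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode and return the total reward the agent collected."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)                  # the agent acts on the perceived state
        state, reward, done = env.step(action)  # the environment returns a new state and a reward
        total_reward += reward
        if done:
            break
    return total_reward


def random_policy(state):
    # Ignores the state and moves left or right at random.
    return random.choice([-1, 1])


def always_right(state):
    # Always moves toward the goal; the best policy for this toy task.
    return 1
```

Calling `run_episode(LineWorld(), always_right)` collects a higher total reward than a typical run with `random_policy`, which is the sense in which one policy is better than another for this task.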
How is this different from supervised and unsupervised learning?
In supervised learning, the training dataset has input features, X, and their corresponding output labels, Y. A model is trained on this dataset and is then given test cases with input features, X', for which it predicts Y'.
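As a minimal sketch of the supervised setting, the following fits a straight line to labeled pairs (X, Y) and then predicts Y' for unseen X'. The choice of a least-squares line and the data values are assumptions made for illustration; any supervised model follows the same fit-then-predict pattern.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to labeled data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, xs_new):
    """Apply the fitted line to new input features."""
    slope, intercept = model
    return [slope * x + intercept for x in xs_new]


# Training set: input features X with their labels Y (here generated by y = 2x + 1).
X = [1.0, 2.0, 3.0, 4.0]
Y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(X, Y)

# Test inputs X' for which the model predicts Y'.
X_new = [5.0, 6.0]
Y_pred = predict(model, X_new)
```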
In unsupervised learning, only the input features, X, of the training set are given for training. There are no associated Y values. The goal is to create a model that learns to segregate the data into different clusters by understanding the underlying pattern, thereby grouping the data in a way that is useful. This model is then used on new input features, X', to predict which cluster they are most similar to.
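The unsupervised case can be sketched with a small clustering routine. The example below uses one-dimensional k-means, which is an assumption made for illustration (the text does not name a specific algorithm), and the data points and starting centroids are made up.

```python
def nearest_cluster(centroids, x):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))


def kmeans_1d(points, centroids, iterations=10):
    """Plain k-means on scalar data: assign points to centroids, then recompute the means."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_cluster(centroids, p)].append(p)
        # Keep the old centroid if a cluster received no points.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


# Unlabeled training features X: no Y values are provided.
X = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centroids = kmeans_1d(X, centroids=[0.0, 5.0])

# New inputs X' are matched to the cluster they are most similar to.
cluster_of_low = nearest_cluster(centroids, 2.0)
cluster_of_high = nearest_cluster(centroids, 8.7)
```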
Reinforcement learning is different from both supervised and unsupervised learning. Reinforcement learning can guide an agent on how to act in the real world. The interface is broader than the training vectors used in supervised or unsupervised learning: it is the entire environment, which can be the real world or a simulated one. Agents are also trained differently: the objective is to reach a goal state, unlike supervised learning, where the objective is to maximize the likelihood or minimize the cost.
Reinforcement learning agents receive their feedback, that is, rewards, automatically from the environment, unlike in supervised learning, where labeling requires time-consuming human effort. One of the bigger advantages of reinforcement learning is that phrasing a task's objective in the form of a goal helps in solving a wide variety of problems. For example, the goal of a video game agent would be to win the game by achieving the highest score. This also helps in discovering new approaches to achieving the goal. For example, when AlphaGo defeated the world's top Go players, it found new, unique ways of winning.
A reinforcement learning agent is like a human. Humans evolved very slowly; an agent also learns through reinforcement, but it can do so very quickly. As far as sensing the environment is concerned, neither humans nor artificial intelligence agents can sense the entire world at once. The perceived environment creates a state in which the agent performs an action and lands in a new state, that is, a newly perceived environment different from the earlier one. This creates a state space that can be finite or infinite.
The largest sector interested in this technology is defense. Can reinforcement learning agents replace soldiers, not only walking but also fighting and making important decisions?