Reinforcement learning

Reinforcement learning aims to create algorithms that can learn and adapt to changes in their environment. The technique is based on external stimuli received in response to the algorithm's choices: a correct choice earns a reward, while an incorrect choice incurs a penalty. The goal of the system is to achieve the best possible result.

In supervised learning, a teacher tells the system what the correct output is (learning with a teacher). This is not always possible. Often only qualitative information is available, sometimes as coarse as a binary right/wrong or success/failure.

This available information is called the reinforcement signal. However, the signal gives no indication of how to update the agent's behavior (that is, its weights), so you cannot define a cost function or a gradient from it. The goal is to create smart agents equipped with a mechanism for learning from their own experience.
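As a minimal sketch of learning from a reinforcement signal alone, the agent below is only ever told "success" (+1) or "failure" (-1) after each choice, never which choice was correct. It keeps a running average of the signal for each action and mostly picks the action with the best average so far. The function name `run_bandit`, the success probabilities, and the exploration rate `epsilon` are assumptions made for illustration; they do not come from the text.

```python
import random

def run_bandit(success_prob, steps=2000, epsilon=0.1, seed=0):
    """Hypothetical two-or-more-action agent driven only by +1/-1 signals."""
    rng = random.Random(seed)
    n_actions = len(success_prob)
    values = [0.0] * n_actions   # running average of the signal per action
    counts = [0] * n_actions     # how often each action was tried

    for _ in range(steps):
        # Explore a random action occasionally, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: values[a])

        # The environment only answers with success (+1) or failure (-1).
        reward = 1.0 if rng.random() < success_prob[action] else -1.0

        # Incremental mean update: no cost function or gradient is involved.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts

# Assumed environment: action 1 succeeds more often than action 0.
values, counts = run_bandit([0.2, 0.8])
print(values, counts)
```

Under these assumptions, the agent ends up choosing the second action far more often, even though it was never told which action was the right one.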

The following flowchart shows how reinforcement learning works:

Figure 1.8: How reinforcement learning interacts with the environment
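The interaction between agent and environment can be sketched as a loop: the agent picks an action, the environment returns a new state and a reward or penalty, and the agent adjusts its behavior. The example below is only an illustration under assumed names. `LineEnvironment`, `WinStayLoseShiftAgent`, and `run_episode` are hypothetical, and the agent uses a simple "win-stay, lose-shift" rule (repeat a rewarded action, switch after a penalty) rather than any particular algorithm from the text.

```python
class LineEnvironment:
    """Hypothetical environment: move along a line toward a target position."""
    def __init__(self, start=0, target=3):
        self.position = start
        self.target = target

    def step(self, action):
        # action is -1 (step left) or +1 (step right)
        old_distance = abs(self.target - self.position)
        self.position += action
        new_distance = abs(self.target - self.position)
        reward = 1 if new_distance < old_distance else -1  # reward or penalty
        done = self.position == self.target
        return self.position, reward, done


class WinStayLoseShiftAgent:
    """Hypothetical agent: keep a rewarded action, switch after a penalty."""
    def __init__(self):
        self.action = -1  # deliberately start with the wrong direction

    def act(self, state):
        return self.action

    def learn(self, reward):
        if reward < 0:
            self.action = -self.action


def run_episode(env, agent, max_steps=20):
    state = env.position
    history = []
    for _ in range(max_steps):
        action = agent.act(state)               # agent acts on the environment
        state, reward, done = env.step(action)  # environment responds
        agent.learn(reward)                     # agent adapts to the signal
        history.append((action, reward, state))
        if done:
            break
    return history


history = run_episode(LineEnvironment(), WinStayLoseShiftAgent())
for action, reward, state in history:
    print(f"action={action:+d} reward={reward:+d} state={state}")
```

In this sketch the first step moves away from the target and is penalized, so the agent switches direction and is rewarded on every following step until it reaches the target.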