Reinforcement learning algorithms

As we have seen in the previous sections, reinforcement learning is a programming technique that aims to develop algorithms that can learn and adapt to changes in the environment. It is based on the assumption that the agent can receive stimuli from the outside and change its actions according to these stimuli. A correct choice results in a reward, while an incorrect choice leads to a penalty for the system.
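The reward and penalty mechanism described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy environment, the `run_agent` function, the `success_prob` parameter, and the epsilon-greedy selection rule are not taken from the text. The agent keeps a running estimate of the reward each action brings and gradually prefers the action that is rewarded more often.

```python
import random

def run_agent(success_prob, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy environment (hypothetical setup).

    success_prob[a] is the chance that action a is the "correct" choice,
    which pays +1; otherwise the agent is penalized with -1.
    """
    rng = random.Random(seed)
    n_actions = len(success_prob)
    values = [0.0] * n_actions   # running estimate of each action's reward
    counts = [0] * n_actions     # how many times each action was tried

    for _ in range(steps):
        # Explore occasionally, otherwise exploit the best current estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: values[a])

        # Stimulus from the environment: a reward or a penalty.
        reward = 1.0 if rng.random() < success_prob[action] else -1.0

        # Incremental mean update of the chosen action's estimate.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    return values, counts

# Action 1 is correct 80% of the time, action 0 only 20% of the time.
values, counts = run_agent([0.2, 0.8])
print("estimated rewards:", values)
print("times each action was chosen:", counts)
```

In this sketch the estimates are simple sample averages; the algorithms introduced later replace this update rule with more capable ones, but the loop of acting, receiving a reward or penalty, and adjusting stays the same.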

The goal of the system is to achieve the highest possible reward and consequently the best possible result. This result can be obtained through two approaches:

  • The first approach involves evaluating the choices of the algorithm and then rewarding or punishing the algorithm based on the result. These techniques can also adapt to substantial changes in the environment. An example is image recognition programs that improve their performance with use. In this case, we can say that learning takes place continuously.
  • In the second approach, the algorithm is first trained in a preliminary phase; once the system is considered reliable, it is frozen and can no longer be modified. This approach derives from the observation that constantly evaluating the actions of the algorithm can be a process that is impossible to automate or very expensive.

These are only implementation choices, so a single algorithm may well combine both of the approaches just described.
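The difference between the two approaches can be made concrete with a small sketch. The `ValueLearner` class, its `frozen` flag, and the `interact` helper are hypothetical names introduced here for illustration, and the environment uses deterministic rewards for simplicity. Two identical learners are trained in the same way; one keeps learning, the other is frozen, and then the environment changes so that the previously correct action becomes the wrong one.

```python
import random

class ValueLearner:
    """Tiny action-value learner; `frozen` switches updates off (hypothetical)."""

    def __init__(self, n_actions, step_size=0.1):
        self.values = [0.0] * n_actions
        self.step_size = step_size
        self.frozen = False

    def act(self, rng, epsilon=0.1):
        # A frozen learner no longer explores; it only uses what it learned.
        if not self.frozen and rng.random() < epsilon:
            return rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def observe(self, action, reward):
        if self.frozen:
            return  # second approach: no updates after the training phase
        # First approach: every reward or penalty adjusts the estimate.
        self.values[action] += self.step_size * (reward - self.values[action])

def interact(learner, success_prob, steps, rng):
    """Run the learner in the toy environment and return the total reward."""
    total = 0.0
    for _ in range(steps):
        action = learner.act(rng)
        reward = 1.0 if rng.random() < success_prob[action] else -1.0
        learner.observe(action, reward)
        total += reward
    return total

rng = random.Random(0)
online, frozen = ValueLearner(2), ValueLearner(2)

# Training phase: action 1 is always correct, action 0 never is.
interact(online, [0.0, 1.0], 1000, rng)
interact(frozen, [0.0, 1.0], 1000, rng)
frozen.frozen = True
snapshot = list(frozen.values)

# The environment changes: now action 0 is the correct one.
online_return = interact(online, [1.0, 0.0], 1000, rng)
frozen_return = interact(frozen, [1.0, 0.0], 1000, rng)
print("continuous learner:", online_return)
print("frozen learner:", frozen_return)
```

The frozen learner keeps choosing the action that was correct during training, whereas the continuous learner pays for its ongoing evaluation with some exploratory mistakes but recovers after the change. A combined scheme, as mentioned above, could toggle the `frozen` flag depending on how costly evaluation is at a given time.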

So far, we have introduced the basic concepts of reinforcement learning. Now, we can analyze the various ways in which these concepts have been turned into algorithms. In this section, we will list them and provide an overview; we will explore them in more depth in the practical cases addressed in the following chapters.
