
Reinforcement learning algorithms

As we have seen in the previous sections, reinforcement learning is a programming technique that aims to develop algorithms that can learn and adapt to changes in the environment. It is based on the assumption that the agent can receive stimuli from the outside and change its actions according to those stimuli. A correct choice results in a reward, while an incorrect choice results in a penalty for the system.

The goal of the system is to achieve the highest possible reward and consequently the best possible result. This result can be obtained through two approaches:

  • The first approach involves evaluating the choices of the algorithm and then rewarding or punishing the algorithm based on the result. Techniques of this kind can also adapt to substantial changes in the environment. An example is image recognition programs that improve their performance with use. In this case, we can say that learning takes place continuously.
  • In the second approach, the algorithm is first trained in a preliminary phase and, once the system is considered reliable, it is frozen and can no longer be modified. This approach derives from the observation that constantly evaluating the actions of the algorithm can be a process that cannot be automated or that is very expensive.

These are only implementation choices, so a single algorithm may combine both of the approaches just described.
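
To make the two approaches more concrete, the following is a minimal sketch and not a reference implementation. It assumes a toy environment with two actions, each paying a reward of +1 or a penalty of -1 with a fixed probability. The class name BanditAgent, the run function, the frozen flag, and all parameter values are hypothetical names and numbers chosen only for this illustration. While frozen is False, the agent keeps updating its estimates from every reward (the first approach); setting frozen to True after a training phase fixes its behavior (the second approach).

```python
import random


class BanditAgent:
    """Epsilon-greedy agent that keeps a running average reward per action."""

    def __init__(self, n_actions, epsilon=0.1, seed=0):
        self.values = [0.0] * n_actions  # estimated reward of each action
        self.counts = [0] * n_actions    # how many times each action was tried
        self.epsilon = epsilon           # probability of trying a random action
        self.frozen = False              # True once the agent stops learning
        self.rng = random.Random(seed)

    def act(self):
        # Explore only while still learning; a frozen agent always exploits.
        if not self.frozen and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def learn(self, action, reward):
        # A frozen agent ignores rewards and penalties.
        if self.frozen:
            return
        self.counts[action] += 1
        # Incremental update of the average reward observed for this action.
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def run(agent, reward_probs, steps, rng):
    """Let the agent interact with the toy environment; return the total reward."""
    total = 0
    for _ in range(steps):
        action = agent.act()
        # Reward (+1) for a good outcome, penalty (-1) otherwise.
        reward = 1 if rng.random() < reward_probs[action] else -1
        agent.learn(action, reward)
        total += reward
    return total


env_rng = random.Random(1)
agent = BanditAgent(n_actions=2)

# Training phase: action 1 is rewarded far more often than action 0.
run(agent, [0.2, 0.8], 1000, env_rng)

# Second approach: freeze the agent once it is considered reliable.
agent.frozen = True
frozen_values = list(agent.values)

# The environment changes, but the frozen agent can no longer adapt to it.
total_after_change = run(agent, [0.8, 0.2], 1000, env_rng)
print("Estimates after freezing:", agent.values)
print("Total reward after the environment changed:", total_after_change)
```

Leaving frozen set to False for the second call to run would instead let the agent keep revising its estimates, at the cost of having to evaluate every action it takes.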

So far, we have introduced the basic concepts of reinforcement learning. Now, we can analyze the various ways in which these concepts have been turned into algorithms. In this section, we will list them and provide an overview; we will then examine them in more depth in the practical cases addressed in the following chapters.
