
Reinforcement learning algorithms

As we have seen in the previous sections, reinforcement learning aims to develop algorithms that can learn and adapt to changes in the environment. It is based on the assumption that an agent can receive stimuli from the outside and change its actions in response to them. A correct choice results in a reward, while an incorrect choice results in a penalty.
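The reward-and-penalty loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a specific algorithm from this book: the two-action environment, the function name `run_episode`, the reward values of +1 and -1, and the exploration rate `epsilon` are all assumptions made for the example.

```python
import random

def run_episode(n_steps=1000, epsilon=0.1, seed=0):
    """Toy agent that learns which of two actions is rewarded."""
    rng = random.Random(seed)
    # Hypothetical environment: action 1 is "correct", action 0 is "incorrect".
    values = [0.0, 0.0]  # agent's running estimate of each action's reward
    counts = [0, 0]      # how many times each action has been tried
    total = 0.0
    for _ in range(n_steps):
        # Occasionally try a random action; otherwise pick the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: values[a])
        # Stimulus from the environment: reward or penalty.
        reward = 1.0 if action == 1 else -1.0
        # Update the estimate for the chosen action (incremental average).
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
        total += reward
    return values, total

values, total = run_episode()
print(values, total)
```

After a few steps the estimate for the rewarded action exceeds the estimate for the penalized one, so the agent chooses it most of the time.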

The goal of the system is to achieve the highest possible reward and consequently the best possible result. This result can be obtained through two approaches:

  • The first approach evaluates each choice the algorithm makes and then rewards or penalizes the algorithm based on the result. Because evaluation never stops, these techniques can adapt even to substantial changes in the environment. An example is an image recognition program that improves its performance with use. In this case, learning takes place continuously.
  • The second approach begins with a training phase. Once the system is considered reliable, it is frozen and can no longer be modified. This approach stems from the observation that constantly evaluating the algorithm's actions can be very expensive or impossible to automate.

These are only implementation choices, so an algorithm may well combine the two approaches just described.
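As a rough illustration of the difference between learning continuously and training once before freezing the system, the following sketch trains two identical agents and then changes the environment. Everything here is hypothetical and chosen for brevity: the `Agent` class, its `frozen` flag, the `run` helper, and the two-action environment are assumptions, not part of any library.

```python
import random

class Agent:
    """Toy value-based agent; set `frozen` to stop all further learning."""

    def __init__(self, n_actions, step_size=0.1, epsilon=0.1, seed=0):
        self.values = [0.0] * n_actions
        self.step_size = step_size
        self.epsilon = epsilon
        self.frozen = False
        self.rng = random.Random(seed)

    def act(self):
        # A frozen agent no longer explores; it only uses what it learned.
        if not self.frozen and self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def learn(self, action, reward):
        if self.frozen:
            return  # second approach: no updates after training
        self.values[action] += self.step_size * (reward - self.values[action])

def run(agent, best_action, n_steps):
    """Reward +1 for the best action, -1 otherwise; return the total reward."""
    total = 0.0
    for _ in range(n_steps):
        action = agent.act()
        reward = 1.0 if action == best_action else -1.0
        agent.learn(action, reward)
        total += reward
    return total

online = Agent(2, seed=1)
frozen = Agent(2, seed=1)
for agent in (online, frozen):
    run(agent, best_action=0, n_steps=500)  # training phase
frozen.frozen = True

# The environment changes: action 1 is now the rewarded one.
online_score = run(online, best_action=1, n_steps=500)
frozen_score = run(frozen, best_action=1, n_steps=500)
print(online_score, frozen_score)
```

In this toy setting the frozen agent keeps choosing the action that used to be rewarded, while the agent that keeps learning eventually switches. The price of that adaptability is that every action must continue to be evaluated, which is the cost the second approach avoids.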

So far, we have introduced the basic concepts of reinforcement learning. We can now look at the various ways in which these concepts have been turned into algorithms. This section lists them and provides an overview; we will explore them in more depth in the practical cases covered in the following chapters.
