
RL, supervised learning, and unsupervised learning

What is the difference between RL, supervised learning, and unsupervised learning? All three develop rules about an unknown environment from data, which may be labeled or unlabeled. The following is a simple diagram charting the different terms:

Take a look at the following definitions of each term: 

  • Supervised learning feeds labeled training data into an algorithm, trains the algorithm on that data, generates predictions for unlabeled testing data, and then compares the predictions of the model to the actual labels. The goal of supervised learning is to generate class labels for unseen data or to predict unseen numerical values using regression.
  • Unsupervised learning looks for similarities between different observations of unlabeled data. An unsupervised learning algorithm looks for observations that fit together along axes of similarity. The goal of unsupervised learning is to group together similar observations based on relevant criteria. 
  • RL seeks to optimize a variable under a set of constraints. An RL algorithm, called an agent, seeks an optimal path to a goal. The goal of RL is therefore to find a set of actions, mapped to a set of states, that leads to the best possible outcome in a situation about which we have limited information.
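To make the first two definitions concrete, here is a minimal sketch in plain Python. The data, the 1-nearest-neighbor classifier, and the two-group clustering routine are all illustrative assumptions chosen for brevity; they are not taken from any particular library or from the methods used later in the book.

```python
# Supervised learning: train on labeled data, predict for held-out data,
# then compare the predictions to the actual labels.
def nearest_neighbor_predict(train_points, train_labels, query):
    """Return the label of the training point closest to `query`."""
    distances = [abs(p - query) for p in train_points]
    return train_labels[distances.index(min(distances))]

# Hypothetical one-dimensional toy data.
train_points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
train_labels = ["low", "low", "low", "high", "high", "high"]
test_points = [1.2, 2.4, 7.6, 9.3]
test_labels = ["low", "low", "high", "high"]

predictions = [
    nearest_neighbor_predict(train_points, train_labels, x) for x in test_points
]
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)


# Unsupervised learning: no labels at all; group observations by similarity.
def two_means(points, iterations=10):
    """Split 1-D points into two groups around two moving centers."""
    centers = [min(points), max(points)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups

groups = two_means(train_points)

print(predictions, accuracy)
print(groups)
```

The supervised half can report an accuracy because the test labels exist to compare against. The unsupervised half only ever sees the raw points and returns groups, leaving their interpretation to us.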

The primary difference between these three learning methods is in the type of question being asked: 

  • Supervised learning works well for classification and regression problems (for example, whether a customer will buy a product or how much they might spend).
  • Unsupervised learning works well for problems dealing with association (for example, what products customers might buy together) and anomaly detection.
  • RL works best when there is a specific value to optimize and the problem contains a discoverable function for optimizing it (for example, how to maximize the number of times users click on links or download apps in response to the advertisements we show them).
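The advertising example can be sketched as a multi-armed bandit, a simplified RL setting with a single state in which the agent only has to learn which action pays off best. The following is a rough illustration under stated assumptions: the click rates are invented, the function name `run_epsilon_greedy` is hypothetical, and epsilon-greedy is just one common way to balance trying new ads against showing the best one found so far.

```python
import random

def run_epsilon_greedy(click_rates, steps=5000, epsilon=0.1, seed=0):
    """Simulate showing ads and learning which one earns the most clicks.

    click_rates holds the true click probabilities, which are hidden from
    the agent; it only observes whether each shown ad was clicked.
    """
    rng = random.Random(seed)
    counts = [0] * len(click_rates)       # times each ad was shown
    estimates = [0.0] * len(click_rates)  # running click-rate estimate per ad
    total_clicks = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: show a random ad.
            ad = rng.randrange(len(click_rates))
        else:
            # Exploit: show the ad with the highest estimate so far.
            ad = max(range(len(click_rates)), key=lambda i: estimates[i])
        reward = 1 if rng.random() < click_rates[ad] else 0
        counts[ad] += 1
        # Incremental mean update of the estimate for the shown ad.
        estimates[ad] += (reward - estimates[ad]) / counts[ad]
        total_clicks += reward
    return counts, estimates, total_clicks

# Hypothetical true click rates for three ads.
counts, estimates, total_clicks = run_epsilon_greedy([0.05, 0.10, 0.60])
best_ad = max(range(len(estimates)), key=lambda i: estimates[i])
print(counts, [round(e, 3) for e in estimates], total_clicks, best_ad)
```

The value being optimized here is the total number of clicks, and the function being discovered is the mapping from each ad to its expected click rate. A full RL problem adds states, so the best action can differ from one situation to the next.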

Note that this list of uses for each method is not exhaustive; we are only presenting well-known examples of the type of problem each method tends to work well for.

There are many other examples of questions that we might ask and other machine learning algorithms that we might use to solve them, but understanding the broad similarities and differences between these three major types will be useful for us going forward. 
