官术网_书友最值得收藏!

Supervised learning problems

Supervised learning problems aim to infer the best mapping between an input and output dataset based on provided labeled pairs of input/output. The labeled dataset acts as feedback for the algorithm, allowing it to gauge the optimality of its solution. For example, given a list of mean yearly crude oil prices from 2010-2018, you may wish to predict the mean yearly crude oil price of 2019. The error that the algorithm makes on the 2010-2018 years will allow the engineer to estimate its error on the target prediction year of 2019.

A labeled pair consists of an input vector consisting of independent variables and an output vector consisting of dependent variables. For example, a labeled dataset for facial recognition might contain input vectors with facial image data alongside output vectors encoding the photographed persons name. A labeled set (or dataset) is a collection of labeled pairs. 

Given a labeled collection of handwritten digits, you may wish to predict the label of a previously unseen handwritten digit. Similarly, given a dataset of emails that are labeled as being either spam or not spam, a company that wants to create a spam filter would want to predict whether a previously unseen message was spam. All these problems are supervised learning problems.

Supervised ML problems can be further divided into prediction and classification:

  • Classification attempts to label an unknown input sample with a known output value. For example, you could train an algorithm to recognize breeds of cats. The algorithm would classify an unknown cat by labeling it with a known breed.
  • By contrast, prediction algorithms attempt to label an unknown input sample with either a known or unknown output value. This is also known as estimation or regression. A canonical prediction problem is time series forecasting, where the output value of the series is predicted for a time value that was not previously seen.
A classification algorithm will try to associate an input sample with an item from a given list of output categories: for example, deciding whether a photo represents a cat, a dog, or neither is a classification problem. A prediction algorithm will map an input sample to a member of an output domain, which could be continuous: for example, attempting to guess a persons height from their weight and gender would be a prediction problem.

We will cover supervised algorithms in more detail in Chapter 3, Supervised Learning.

主站蜘蛛池模板: 仁怀市| 英超| 河池市| 博爱县| 湘阴县| 泸溪县| 鸡泽县| 郎溪县| 宜丰县| 儋州市| 株洲县| 普陀区| 台中市| 松原市| 得荣县| 墨竹工卡县| 江门市| 玛纳斯县| 乐山市| 宜川县| 安徽省| 靖远县| 新兴县| 潍坊市| 丹东市| 禹州市| 墨玉县| 西和县| 合阳县| 彩票| 类乌齐县| 买车| 福建省| 天峨县| 邹城市| 晋江市| 保康县| 赤城县| 株洲县| 商南县| 离岛区|