
K-Nearest Neighbors

KNN is a simple model for regression and classification tasks. It is so simple that its name describes most of its learning algorithm. The titular neighbors are representations of training instances in a metric space. A metric space is a feature space in which the distances between all members of a set are defined. In the previous chapter's pizza problem, our training instances were represented in a metric space because the distances between all of the pizza diameters were defined. These neighbors are used to estimate the value of the response variable for a test instance. The hyperparameter k specifies how many neighbors are used in the estimation. A hyperparameter is a parameter that controls how the algorithm learns; hyperparameters are not estimated from the training data and are sometimes set manually. Finally, the k neighbors that are selected are those that are nearest to the test instance, as measured by some distance function.
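The idea of selecting the nearest neighbors by a distance function can be sketched in a few lines. The following is a minimal illustration, not a library implementation, using hypothetical pizza diameters as a one-dimensional metric space and Euclidean distance as the distance function:

```python
import numpy as np

# Hypothetical training instances: pizza diameters in inches.
X_train = np.array([[7.0], [9.0], [11.0], [13.0], [16.0]])

def k_nearest(X, x_query, k):
    """Return the indices of the k training instances nearest to x_query,
    as measured by Euclidean distance."""
    distances = np.sqrt(((X - x_query) ** 2).sum(axis=1))
    # A stable sort keeps ties in their original order.
    return np.argsort(distances, kind="stable")[:k]

# The three training instances nearest to a 10-inch query.
print(k_nearest(X_train, np.array([10.0]), k=3))  # → [1 2 0]
```

With k=3 and a 10-inch query, the 9-inch and 11-inch pizzas (distance 1) and the 7-inch pizza (distance 3) are selected; the same routine generalizes to feature vectors with any number of dimensions.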

For classification tasks, the training set comprises tuples of feature vectors and class labels. KNN is capable of binary, multi-class, and multi-label classification; we will define these tasks later, and we will focus on binary classification in this chapter. The simplest KNN classifiers assign a test instance the mode of its k nearest neighbors' labels, but other strategies can be used; k is often set to an odd number to prevent ties. In regression tasks, each feature vector is associated with a response variable that takes a real-valued scalar instead of a label. The prediction is the mean or weighted mean of the k nearest neighbors' response variables.
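Both prediction rules can be sketched with the same neighbor-selection step. The following is a toy illustration under assumed data, with hypothetical two-dimensional feature vectors, binary labels for classification, and arbitrary real-valued responses for regression:

```python
import numpy as np
from collections import Counter

# Hypothetical training set: two-dimensional feature vectors paired with
# binary class labels (classification) and real-valued responses (regression).
X_train = np.array([[158, 64], [170, 86], [183, 84], [191, 80],
                    [155, 49], [163, 59], [180, 67], [158, 54]])
y_class = np.array([0, 1, 1, 1, 0, 0, 0, 0])
y_reg = np.array([64.0, 86.0, 84.0, 80.0, 49.0, 59.0, 67.0, 54.0])

def nearest_indices(X, x_query, k):
    """Indices of the k training instances nearest to x_query (Euclidean)."""
    distances = np.sqrt(((X - x_query) ** 2).sum(axis=1))
    return np.argsort(distances, kind="stable")[:k]

def knn_classify(X, y, x_query, k=3):
    """Classification: predict the mode of the k nearest neighbors' labels."""
    return Counter(y[nearest_indices(X, x_query, k)]).most_common(1)[0][0]

def knn_regress(X, y, x_query, k=3):
    """Regression: predict the mean of the k nearest neighbors' responses."""
    return y[nearest_indices(X, x_query, k)].mean()

query = np.array([170, 70])
print(knn_classify(X_train, y_class, query))  # mode of the 3 nearest labels
print(knn_regress(X_train, y_reg, query))     # mean of the 3 nearest responses
```

Note that the two tasks differ only in how the selected neighbors are aggregated: a vote (mode) for classification versus an average (mean) for regression.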
