官术网_书友最值得收藏!

Label encoding

Humans are able to deal with various types of values. Machine learning algorithms (with some exceptions) need numerical values. If we offer a string such as Ivan, unless we're using specialized software, the program won't know what to do. In this example, we're dealing with a categorical feature—names, probably. We can consider each unique value to be a label. (In this particular example, we also need to decide what to do with the case—is Ivan the same as ivan?). We can then replace each label with an integer—label encoding.

The following example shows how label encoding works:

This approach can be problematic, because the learner may conclude that there's an order. For example, Asia and North America in the preceding case differ by 4 after encoding, which is a bit counter-intuitive.

主站蜘蛛池模板: 同仁县| 涿鹿县| 日照市| 霞浦县| 湘阴县| 毕节市| 宜良县| 斗六市| 元朗区| 阳东县| 怀化市| 长海县| 芦山县| 汾阳市| 巴里| 石河子市| 镇巴县| 吴堡县| 苍溪县| 祁连县| 聂拉木县| 香港| 麻栗坡县| 乌兰察布市| 武汉市| 东阿县| 武隆县| 孝昌县| 黑龙江省| 章丘市| 兰坪| 庄浪县| 潮安县| 新泰市| 牙克石市| 本溪市| 东源县| 商洛市| 海口市| 新宾| 长治市|