Binary and multiclass classification

The first classifier we saw, the threshold classifier, was a simple binary classifier (the result is either one class or the other as a point is either above the threshold or it is not). The second classifier we used, the nearest neighbor classifier, was a naturally multiclass classifier (the output can be one of several classes).

It is often simpler to define a binary method than one that works on multiclass problems. However, we can reduce a multiclass problem to a series of binary decisions. This is what we did earlier with the Iris dataset, in a haphazard way: we observed that it was easy to separate one of the initial classes and then focused on the other two, reducing the problem to two binary decisions:

  • Is it an Iris Setosa (yes or no)?
  • If no, check whether it is an Iris Virginica (yes or no).

Of course, we want to leave this sort of reasoning to the computer. As usual, there are several solutions to this multiclass reduction.

The simplest is to use a series of "one versus the rest" classifiers. For each possible label ℓ, we build a classifier of the type "is this ℓ or something else?". Ideally, when applying the rule, exactly one of the classifiers says "yes" and we have our solution. Unfortunately, this does not always happen, so we have to decide how to deal with either multiple positive answers or no positive answers.
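A minimal sketch of the one-versus-the-rest idea follows. It is not the book's implementation: the helper names (`fit_binary`, `fit_one_vs_rest`, `predict_one_vs_rest`), the toy centroid-based binary learner, and the six made-up data points are all assumptions chosen for illustration. It also assumes one common way of handling ambiguous answers: each binary classifier returns a score rather than a plain yes/no, and we pick the label with the highest score.

```python
# Illustrative sketch only: all function names and the data are hypothetical.

def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_binary(features, is_positive):
    """Toy binary learner: remember the centroid of each side."""
    pos = centroid([f for f, y in zip(features, is_positive) if y])
    neg = centroid([f for f, y in zip(features, is_positive) if not y])

    def score(x):
        # A positive score means "yes": x is closer to the positive centroid.
        return sq_dist(x, neg) - sq_dist(x, pos)

    return score

def fit_one_vs_rest(features, labels):
    """Train one 'is it this label or something else?' model per label."""
    return {label: fit_binary(features, [y == label for y in labels])
            for label in sorted(set(labels))}

def predict_one_vs_rest(models, x):
    """Pick the most confident 'yes'.

    Taking the maximum score (an assumed tie-breaking rule) covers both
    the case of several positive answers and the case of none.
    """
    scores = {label: score(x) for label, score in models.items()}
    return max(scores, key=scores.get)

# Made-up two-dimensional data with three classes.
features = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
labels = ["a", "a", "b", "b", "c", "c"]

models = fit_one_vs_rest(features, labels)
predictions = [predict_one_vs_rest(models, x)
               for x in ([0.2, 0.5], [5, 5], [9, 0])]
print(predictions)
```

Any binary learner that can report a confidence could replace the centroid rule here; the wrapper around it would stay the same.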

Alternatively, we can build a classification tree. Split the possible labels in two and build a classifier that asks "should this example go to the left or the right bin?" We can perform this splitting recursively until we obtain a single label. The preceding diagram depicts the tree of reasoning for the Iris dataset. Each diamond is a single binary classifier. It is easy to imagine we could make this tree larger and encompass more decisions. This means that any classifier that can be used for binary classification can also be adapted to handle any number of classes in a simple way.
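The recursive splitting can be sketched as follows. Again, this is only an illustration under stated assumptions: the names `fit_label_tree` and `predict_label_tree` are hypothetical, the binary learner is a toy nearest-centroid rule, the labels are split down the middle of their sorted order (any other split would work as well), and the data is made up.

```python
# Illustrative sketch only: all function names and the data are hypothetical.

def centroid(points):
    """Mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_binary(features, goes_left):
    """Toy binary learner answering 'left bin or right bin?'."""
    left = centroid([f for f, g in zip(features, goes_left) if g])
    right = centroid([f for f, g in zip(features, goes_left) if not g])
    return lambda x: sq_dist(x, left) <= sq_dist(x, right)

def fit_label_tree(features, labels):
    """Recursively split the label set in two until one label remains.

    A leaf is a bare label; an inner node is a tuple
    (decide, left_subtree, right_subtree). Labels must not be tuples.
    """
    classes = sorted(set(labels))
    if len(classes) == 1:
        return classes[0]
    # Assumption: split the sorted labels down the middle.
    left_classes = set(classes[:len(classes) // 2])
    decide = fit_binary(features, [y in left_classes for y in labels])
    left = [(f, y) for f, y in zip(features, labels) if y in left_classes]
    right = [(f, y) for f, y in zip(features, labels) if y not in left_classes]
    return (decide,
            fit_label_tree([f for f, _ in left], [y for _, y in left]),
            fit_label_tree([f for f, _ in right], [y for _, y in right]))

def predict_label_tree(node, x):
    """Walk down the tree, one binary decision per level."""
    while isinstance(node, tuple):
        decide, left, right = node
        node = left if decide(x) else right
    return node

# Made-up two-dimensional data with three classes.
features = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
labels = ["a", "a", "b", "b", "c", "c"]

tree = fit_label_tree(features, labels)
predictions = [predict_label_tree(tree, x)
               for x in ([0.2, 0.5], [5, 5], [9, 0])]
print(predictions)
```

With three labels this produces two binary classifiers: the root separates "a" from the other two, and the second separates "b" from "c". Unlike the one-versus-the-rest scheme, every example ends at exactly one leaf, so there is no ambiguity to resolve, but an early wrong turn cannot be corrected further down.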

There are many other possible ways of turning a binary method into a multiclass one, and no single method is clearly better in all cases. In practice, though, the choice of method normally does not make much of a difference to the final result.

Most classifiers are binary systems while many real-life problems are naturally multiclass. Several simple protocols reduce a multiclass problem to a series of binary decisions and allow us to apply the binary models to our multiclass problem.