官术网_书友最值得收藏!

Binary and multiclass classification

The first classifier we saw, the threshold classifier, was a simple binary classifier (the result is either one class or the other as a point is either above the threshold or it is not). The second classifier we used, the nearest neighbor classifier, was a naturally multiclass classifier (the output can be one of several classes).

It is often simpler to define a simple binary method than one that works on multiclass problems. However, we can reduce the multiclass problem to a series of binary decisions. This is what we did earlier in the Iris dataset in a haphazard way; we observed that it was easy to separate one of the initial classes and focused on the other two, reducing the problem to two binary decisions:

  • Is it an Iris Setosa (yes or no)?
  • If no, check whether it is an Iris Virginica (yes or no).

Of course, we want to leave this sort of reasoning to the computer. As usual, there are several solutions to this multiclass reduction.

The simplest is to use a series of "one classifier versus the rest of the classifiers". For each possible label ?, we build a classifier of the type "is this ? or something else?". When applying the rule, exactly one of the classifiers would say "yes" and we would have our solution. Unfortunately, this does not always happen, so we have to decide how to deal with either multiple positive answers or no positive answers.

Alternatively, we can build a classification tree. Split the possible labels in two and build a classifier that asks "should this example go to the left or the right bin?" We can perform this splitting recursively until we obtain a single label. The preceding diagram depicts the tree of reasoning for the Iris dataset. Each diamond is a single binary classifier. It is easy to imagine we could make this tree larger and encompass more decisions. This means that any classifier that can be used for binary classification can also be adapted to handle any number of classes in a simple way.

There are many other possible ways of turning a binary method into a multiclass one. There is no single method that is clearly better in all cases. However, which one you use normally does not make much of a difference to the final result.

Most classifiers are binary systems while many real-life problems are naturally multiclass. Several simple protocols reduce a multiclass problem to a series of binary decisions and allow us to apply the binary models to our multiclass problem.

主站蜘蛛池模板: 于田县| 禄丰县| 沭阳县| 鄂尔多斯市| 黎川县| 棋牌| 旺苍县| 神池县| 利川市| 永德县| 体育| 山东省| 海南省| 无极县| 喀喇沁旗| 饶平县| 榆林市| 高雄县| 榆社县| 阿鲁科尔沁旗| 北流市| 阳新县| 绥江县| 祁连县| 新竹县| 武鸣县| 逊克县| 五寨县| 德令哈市| 民勤县| 枣阳市| 报价| 疏勒县| 岗巴县| 明光市| 赤峰市| 延川县| 丹江口市| 沁源县| 临颍县| 广元市|