
Evaluating classification

Is our classifier doing well? Is it better than another one? In classification, we count how many times we classify something correctly and how many times incorrectly. Suppose there are two possible class labels, yes and no; then there are four possible outcomes, as shown in the following table:

                 Predicted: yes     Predicted: no
  Actual: yes    True positive      False negative
  Actual: no     False positive     True negative

The four outcomes are:

  • True positive (hit): This indicates a yes instance correctly predicted as yes
  • True negative (correct rejection): This indicates a no instance correctly predicted as no
  • False positive (false alarm): This indicates a no instance predicted as yes
  • False negative (miss): This indicates a yes instance predicted as no
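
The four outcomes above can be tallied directly from paired lists of actual and predicted labels. The following is a minimal sketch rather than a reference implementation; the function name `confusion_counts`, the `positive` parameter, and the sample labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive="yes"):
    """Count the four outcomes of a two-class problem.

    `positive` names the label treated as the "yes" class (an assumption
    made for this sketch); every other label is treated as "no".
    """
    counts = Counter({"TP": 0, "TN": 0, "FP": 0, "FN": 0})
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted yes: a hit if actually yes, otherwise a false alarm.
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted no: a miss if actually yes, otherwise a correct rejection.
            counts["FN" if a == positive else "TN"] += 1
    return counts

# Made-up labels, for illustration only.
actual    = ["yes", "yes", "no", "no",  "yes", "no"]
predicted = ["yes", "no",  "no", "yes", "yes", "no"]

counts = confusion_counts(actual, predicted)
print(dict(counts))
```

With these sample labels there are two hits, two correct rejections, one false alarm, and one miss.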

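The two basic performance measures of a classifier are, first, the classification error, which is the fraction of instances that are classified incorrectly:

$$\text{error} = \frac{FP + FN}{TP + TN + FP + FN}$$

and, second, the classification accuracy, which is the fraction of instances that are classified correctly:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = 1 - \text{error}$$

Here TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.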
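
As a small worked sketch, both measures can be computed from the four outcome counts. This assumes the standard definitions: error is the share of misclassified instances (false positives plus false negatives over all instances) and accuracy is the share of correctly classified ones. The function names and the counts below are hypothetical.

```python
def classification_error(tp, tn, fp, fn):
    """Fraction of instances classified incorrectly."""
    total = tp + tn + fp + fn
    return (fp + fn) / total

def classification_accuracy(tp, tn, fp, fn):
    """Fraction of instances classified correctly."""
    total = tp + tn + fp + fn
    return (tp + tn) / total

# Made-up counts for 100 instances, for illustration only.
err = classification_error(tp=40, tn=50, fp=4, fn=6)
acc = classification_accuracy(tp=40, tn=50, fp=4, fn=6)

print(f"error    = {err:.2f}")
print(f"accuracy = {acc:.2f}")
```

The two values always sum to one, so reporting either of them conveys the same information.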

The main problem with these two measures is that they cannot handle unbalanced classes. Deciding whether a credit card transaction is an abuse is an example of a problem with unbalanced classes: 99.99% of transactions are normal and only a tiny percentage are abuses. A classifier that labels every transaction as normal is 99.99% accurate, yet it never detects a single abuse, and those rare cases are exactly the ones we are interested in.
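
The effect is easy to reproduce with a toy data set. The sketch below assumes a hypothetical batch of 10,000 transactions containing exactly one abuse, which matches the 99.99% figure, and a trivial classifier that labels everything as normal; the label strings and variable names are illustrative only.

```python
# Hypothetical data: 1 abuse among 10,000 transactions (99.99% normal).
actual = ["abuse"] * 1 + ["normal"] * 9999

# Trivial classifier: every transaction is predicted to be normal.
predicted = ["normal"] * len(actual)

# Accuracy is the share of predictions that match the actual label.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Number of abuses that the classifier actually flagged.
abuses_caught = sum(
    a == "abuse" and p == "abuse" for a, p in zip(actual, predicted)
)

print(f"accuracy      = {accuracy:.4f}")
print(f"abuses caught = {abuses_caught}")
```

The accuracy is very high even though not one abuse is caught, which is why accuracy alone says little about a classifier on data like this.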
