Title: Machine Learning in Java
Authors: AshishSingh Bhatia, Bostjan Kaluza
Updated: 2021-06-10 19:30:00

Evaluating classification

Is our classifier doing well? Is it better than another one? In classification, we count how many times we classify something correctly and incorrectly. Suppose there are two possible classification labels, yes and no; then there are four possible outcomes, as shown in the following table:

                 Predicted: yes         Predicted: no
Actual: yes      True positive (TP)     False negative (FN)
Actual: no       False positive (FP)    True negative (TN)

The four outcomes are:

True positive (hit): This indicates a yes instance correctly predicted as yes
True negative (correct rejection): This indicates a no instance correctly predicted as no
False positive (false alarm): This indicates a no instance predicted as yes
False negative (miss): This indicates a yes instance predicted as no
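
The four outcomes above can be tallied directly from a list of actual labels and a list of predicted labels. The following is a minimal sketch, not code from the book: the class name `ConfusionCounts`, the method `of`, and the encoding of yes as `true` and no as `false` are assumptions made for illustration.

```java
/** Tallies the four outcomes of a binary yes/no classifier (illustrative sketch). */
final class ConfusionCounts {
    final int tp; // yes instances predicted as yes
    final int tn; // no instances predicted as no
    final int fp; // no instances predicted as yes
    final int fn; // yes instances predicted as no

    ConfusionCounts(int tp, int tn, int fp, int fn) {
        this.tp = tp;
        this.tn = tn;
        this.fp = fp;
        this.fn = fn;
    }

    /** Assumed encoding: true stands for "yes", false stands for "no". */
    static ConfusionCounts of(boolean[] actual, boolean[] predicted) {
        if (actual.length != predicted.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int tp = 0, tn = 0, fp = 0, fn = 0;
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] && predicted[i]) {
                tp++;            // hit
            } else if (!actual[i] && !predicted[i]) {
                tn++;            // correct rejection
            } else if (!actual[i]) {
                fp++;            // false alarm: actual no, predicted yes
            } else {
                fn++;            // miss: actual yes, predicted no
            }
        }
        return new ConfusionCounts(tp, tn, fp, fn);
    }

    public static void main(String[] args) {
        // Made-up labels, used only to exercise the four branches.
        boolean[] actual    = {true, true,  false, false, true, false};
        boolean[] predicted = {true, false, false, true,  true, false};
        ConfusionCounts c = of(actual, predicted);
        System.out.printf("TP=%d TN=%d FP=%d FN=%d%n", c.tp, c.tn, c.fp, c.fn);
    }
}
```

Every instance falls into exactly one of the four branches, so the four counts always sum to the number of instances.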

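The two basic performance measures of a classifier are classification error and classification accuracy. Writing TP, TN, FP, and FN for the numbers of true positives, true negatives, false positives, and false negatives, classification error is the fraction of instances classified incorrectly:

$$\text{error} = \frac{FP + FN}{TP + TN + FP + FN}$$

Classification accuracy is the fraction of instances classified correctly:

$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} = 1 - \text{error}$$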
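
To make the two measures concrete, here is a small sketch that assumes the standard definitions: error is the share of misclassified instances, (FP + FN) / total, and accuracy is the share of correctly classified instances, (TP + TN) / total. The class name `BasicMeasures` and the counts in `main` are hypothetical.

```java
/** Classification error and accuracy from the four outcome counts (illustrative sketch). */
final class BasicMeasures {

    /** Fraction of instances classified incorrectly. */
    static double error(long tp, long tn, long fp, long fn) {
        long total = tp + tn + fp + fn;
        if (total == 0) {
            throw new IllegalArgumentException("no instances to evaluate");
        }
        return (double) (fp + fn) / total;
    }

    /** Fraction of instances classified correctly. */
    static double accuracy(long tp, long tn, long fp, long fn) {
        long total = tp + tn + fp + fn;
        if (total == 0) {
            throw new IllegalArgumentException("no instances to evaluate");
        }
        return (double) (tp + tn) / total;
    }

    public static void main(String[] args) {
        // Hypothetical counts for 100 instances.
        long tp = 40, tn = 50, fp = 4, fn = 6;
        System.out.println("error    = " + error(tp, tn, fp, fn));
        System.out.println("accuracy = " + accuracy(tp, tn, fp, fn));
    }
}
```

With these counts, 10 of the 100 instances are misclassified and 90 are classified correctly, so the two measures always sum to 1.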

The main problem with these two measures is that they cannot handle unbalanced classes. Classifying whether a credit card transaction is an abuse or not is an example of a problem with unbalanced classes: 99.99% of transactions are normal and only a tiny percentage are abuses. A classifier that labels every transaction as normal is 99.99% accurate, yet it never detects a single abuse, and those rare cases are exactly the ones we are mainly interested in.
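
The arithmetic behind the credit card example can be checked with a short sketch. The figures here are assumptions chosen to match the 99.99% in the text: 10,000 transactions, exactly one of which is an abuse, with abuse treated as the yes class. The class name `ImbalanceDemo` is hypothetical.

```java
/** Accuracy of a classifier that labels every transaction as normal (illustrative sketch). */
final class ImbalanceDemo {

    /** Accuracy = (TP + TN) / (TP + TN + FP + FN). */
    static double accuracy(long tp, long tn, long fp, long fn) {
        return (double) (tp + tn) / (tp + tn + fp + fn);
    }

    public static void main(String[] args) {
        // Assumed data set: 9,999 normal transactions and 1 abuse.
        long normals = 9_999;
        long abuses = 1;

        // The classifier never predicts "abuse" (yes), so it produces
        // no true positives and no false positives.
        long tp = 0;
        long fp = 0;
        long tn = normals; // every normal transaction is correctly rejected
        long fn = abuses;  // every abuse is missed

        System.out.println("accuracy = " + accuracy(tp, tn, fp, fn));
        System.out.println("abuses detected = " + tp + " of " + abuses);
    }
}
```

The accuracy comes out at 9,999 / 10,000 even though the number of true positives is zero, which is why accuracy alone says little about the rare class.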
