
Evaluating the model

As you saw when running the trainer component of the sample project, there are various elements of model evaluation. For each model type, there are different metrics to look at when analyzing the performance of a model.

In binary classification models like the one found in the example project, the following properties are exposed in CalibratedBinaryClassificationMetrics, the object returned by the Evaluate method. However, first, we need to define the four prediction outcomes in a binary classification:

  • True negative: Properly classified as negative
  • True positive: Properly classified as positive
  • False negative: Improperly classified as negative
  • False positive: Improperly classified as positive
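As a small sketch (not the ML.NET implementation), the four outcomes can be tallied from a list of actual labels and a list of predicted labels:

```python
def confusion_counts(actual, predicted):
    """Tally the four binary-classification outcomes as (TP, TN, FP, FN)."""
    tp = sum(a and p for a, p in zip(actual, predicted))          # true positives
    tn = sum(not a and not p for a, p in zip(actual, predicted))  # true negatives
    fp = sum(not a and p for a, p in zip(actual, predicted))      # false positives
    fn = sum(a and not p for a, p in zip(actual, predicted))      # false negatives
    return tp, tn, fp, fn

actual    = [True, True, False, False, True]
predicted = [True, False, False, True, True]
print(confusion_counts(actual, predicted))  # (2, 1, 1, 1)
```

Every metric discussed below is derived from these four counts.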

The first metric to understand is Accuracy. As the name implies, accuracy is one of the most commonly used metrics when evaluating a model. This metric is calculated simply as the ratio of correctly classified predictions to total classifications.
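The calculation is straightforward; as a sketch, assuming boolean label lists:

```python
def accuracy(actual, predicted):
    """Accuracy = correctly classified predictions / total predictions."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# 3 of 4 predictions match the actual labels:
print(accuracy([True, True, False, False], [True, False, False, False]))  # 0.75
```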

The next metric to understand is Precision. Precision is defined as the proportion of true positives among all the results the model classified as positive. For example, a precision of 1 means there were no false positives, an ideal scenario. A false positive is classifying something as positive when it should be classified as negative, as mentioned previously. A common example of a false positive is misclassifying a file as malicious when it is actually benign.
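In terms of the counts defined earlier, precision is TP / (TP + FP); a sketch with the malicious-file example as illustrative numbers:

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP): of everything flagged positive, how much was real."""
    return tp / (tp + fp) if (tp + fp) else 0.0

# 8 files correctly flagged as malicious, 2 benign files misflagged:
print(precision(tp=8, fp=2))  # 0.8
```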

The next metric to understand is Recall. Recall is the proportion of actual positives that the model correctly identified. For example, a recall of 1 means there were no false negatives, another ideal scenario. A false negative is classifying something as negative when it should have been classified as positive.
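Again in terms of the earlier counts, recall is TP / (TP + FN), sketched here:

```python
def recall(tp, fn):
    """Recall = TP / (TP + FN): of all actual positives, how many were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# 8 malicious files caught, 2 malicious files missed (false negatives):
print(recall(tp=8, fn=2))  # 0.8
```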

The next metric to understand is the F-score (also known as the F1 score), which combines precision and recall into a single number: it is their harmonic mean, so it penalizes both false positives and false negatives. F-scores give another perspective on the performance of the model compared to simply looking at accuracy. The range of values is between 0 and 1, with an ideal value of 1.
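As a sketch of the harmonic-mean formula, using the precision and recall values from the examples above:

```python
def f1_score(p, r):
    """F1 = 2 * P * R / (P + R): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# High precision but mediocre recall drags the F1 score down:
print(f1_score(0.8, 0.5))  # ≈ 0.615
```

Because the harmonic mean is dominated by the smaller of the two inputs, a model cannot achieve a high F-score by excelling at only one of precision or recall.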

Area Under the Curve, also referred to as AUC, is, as the name implies, the area under the receiver operating characteristic (ROC) curve, which plots the true positive rate on the y-axis against the false positive rate on the x-axis. For classifiers such as the model that we trained earlier in this chapter, as you saw, this returned values of between 0 and 1, where 0.5 is equivalent to random guessing and 1 is a perfect classifier.
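An equivalent, and easy-to-sketch, interpretation of AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative example (ties counting as half). This is not how ML.NET computes it internally, but it yields the same value:

```python
def auc(scores_pos, scores_neg):
    """AUC = probability a random positive outscores a random negative
    (ties count as 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# One positive (0.4) scores below one negative (0.5), so AUC = 8/9:
print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))  # ≈ 0.889
```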

Lastly, Average Log Loss and Training Log Loss are both used to further explain the performance of the model. The average log loss effectively expresses the penalty for wrong results in a single number by measuring the difference between the model's predicted probability and the true classification. Training log loss represents the uncertainty of the model, comparing its predicted probabilities against the known values. As you train your model, you should aim for a low number (lower numbers are better).
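As a sketch of the underlying math (not the ML.NET implementation), log loss averages the negative log of the probability the model assigned to the correct class, so confident wrong predictions are penalized heavily:

```python
import math

def log_loss(actual, probs):
    """Average negative log-likelihood of the true labels; lower is better."""
    eps = 1e-15  # clamp probabilities to avoid log(0)
    total = 0.0
    for a, p in zip(actual, probs):
        p = min(max(p, eps), 1 - eps)
        total += -math.log(p) if a else -math.log(1 - p)
    return total / len(actual)

# Confident, mostly correct predictions yield a low loss:
print(round(log_loss([True, False, True], [0.9, 0.1, 0.8]), 4))  # 0.1446
```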

As for the other model types, we will take a deep dive into how to evaluate them in their respective chapters, where we will cover regression and clustering metrics.
