
Evaluating the model

Since this is a binary classification problem, we use the BinaryClassificationEvaluator() evaluator to measure the model's performance on the test set:

val evaluator = new BinaryClassificationEvaluator()
  .setLabelCol("label")

Now that training is complete and we have a trained decision tree model, we can use it to generate predictions on the test set:

val predictionDF = dtModel.transform(testDF)

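Next, we compute the evaluation score. Note that BinaryClassificationEvaluator uses the area under the ROC curve (areaUnderROC) as its default metric, so the value printed below under the label "Accuracy" is an AUC score, not the fraction of correctly classified examples:

val accuracy = evaluator.evaluate(predictionDF)
println("Accuracy = " + accuracy)

You should see a score of about 0.97: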

Accuracy =  0.9675436785432
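BinaryClassificationEvaluator only offers area-based metrics (areaUnderROC and areaUnderPR). As a minimal sketch of how plain classification accuracy, that is, the fraction of correctly classified rows, could be obtained, the following standalone program uses Spark's MulticlassClassificationEvaluator with the "accuracy" metric. The object name AccuracySketch, the method computeAccuracy, and the toy DataFrame with its values are assumptions made purely for illustration; in the example above you would pass predictionDF to evaluate() instead.

```scala
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.sql.SparkSession

object AccuracySketch {
  // Hypothetical helper: builds a toy DataFrame and returns its accuracy.
  def computeAccuracy(): Double = {
    // A local session used only by this sketch.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("AccuracySketch")
      .getOrCreate()
    import spark.implicits._

    // Toy (label, prediction) pairs standing in for predictionDF;
    // four of the five predictions match their labels.
    val toyPredictionDF = Seq(
      (1.0, 1.0), (0.0, 0.0), (1.0, 0.0), (0.0, 0.0), (1.0, 1.0)
    ).toDF("label", "prediction")

    // "accuracy" compares the prediction column with the label column.
    val accuracyEvaluator = new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")

    val accuracy = accuracyEvaluator.evaluate(toyPredictionDF)
    spark.stop()
    accuracy
  }

  def main(args: Array[String]): Unit =
    println("Accuracy = " + computeAccuracy())
}
```

Because this sketch creates and stops its own SparkSession, it is meant to be run separately from the main example rather than pasted into it.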

Finally, we stop the SparkSession by invoking the stop() method:

spark.stop()

We have managed to achieve a score of about 0.97 (area under the ROC curve) with minimal effort. However, there are other performance metrics, such as precision, recall, and the F1 measure, which we will discuss in upcoming chapters. Also, if you're new to ML and haven't understood all the steps in this example, don't worry. We'll revisit all of these steps in other chapters with various other examples.
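To make the metrics mentioned above a little more concrete, here is a minimal plain-Scala sketch (no Spark required) that derives precision, recall, and the F1 measure from (label, prediction) pairs. The names BinaryMetricsSketch, Counts, and toyPairs, as well as the toy values, are hypothetical and not part of the original example; the sketch also assumes that 1.0 marks the positive class and returns 0.0 when a metric is undefined.

```scala
object BinaryMetricsSketch {
  // Confusion-matrix counts, assuming 1.0 is the positive class.
  final case class Counts(tp: Int, fp: Int, tn: Int, fn: Int)

  // Tally true/false positives and negatives from (label, prediction) pairs.
  def count(pairs: Seq[(Double, Double)]): Counts =
    pairs.foldLeft(Counts(0, 0, 0, 0)) { case (c, (label, prediction)) =>
      (label == 1.0, prediction == 1.0) match {
        case (true, true)   => c.copy(tp = c.tp + 1)
        case (false, true)  => c.copy(fp = c.fp + 1)
        case (false, false) => c.copy(tn = c.tn + 1)
        case (true, false)  => c.copy(fn = c.fn + 1)
      }
    }

  // Precision: of all rows predicted positive, how many really are positive.
  def precision(c: Counts): Double =
    if (c.tp + c.fp == 0) 0.0 else c.tp.toDouble / (c.tp + c.fp)

  // Recall: of all truly positive rows, how many were predicted positive.
  def recall(c: Counts): Double =
    if (c.tp + c.fn == 0) 0.0 else c.tp.toDouble / (c.tp + c.fn)

  // F1: harmonic mean of precision and recall.
  def f1(c: Counts): Double = {
    val p = precision(c)
    val r = recall(c)
    if (p + r == 0.0) 0.0 else 2 * p * r / (p + r)
  }

  // Toy (label, prediction) pairs, made up for illustration only.
  val toyPairs: Seq[(Double, Double)] = Seq(
    (1.0, 1.0), (0.0, 0.0), (1.0, 0.0), (0.0, 0.0), (1.0, 1.0), (0.0, 1.0)
  )

  def main(args: Array[String]): Unit = {
    val c = count(toyPairs)
    println(s"Precision = ${precision(c)}")
    println(s"Recall = ${recall(c)}")
    println(s"F1 = ${f1(c)}")
  }
}
```

In a Spark program, the same pairs could be obtained from the "label" and "prediction" columns of a predictions DataFrame.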
