
How it works...

The first step simply loads the libraries we need to manipulate data quickly and easily. In steps 2 and 3, we generate a training set and a testing set of normal observations, both drawn from the same distribution. In step 4, we generate the remainder of the testing set by creating outliers. This anomalous dataset has a different distribution from the training data and from the rest of the testing data. Plotting the data (step 5), we see that some outlier points are indistinguishable from normal points. Because of this overlap, any classifier will misclassify a significant share of the points, and we must keep this in mind when evaluating its performance. In step 6, we fit an instance of Isolation Forest with default parameters to the training data.
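The data generation and fitting described above can be sketched roughly as follows. This is a minimal illustration, not the recipe's exact code: the seed, the cluster centers, the sample sizes, the outlier range, and the helper name `make_normal` are all assumptions chosen for the example, and `random_state=0` is set only so that the run is repeatable.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Assumed seed and sizes, chosen only for illustration.
rng = np.random.RandomState(12)


def make_normal(n):
    """Hypothetical helper: two Gaussian clusters of n points each."""
    base = 0.5 * rng.randn(n, 2)
    return np.r_[base + 3, base]


# Normal observations: training and testing sets share one distribution.
X_train = make_normal(500)
X_test = make_normal(500)

# Outliers: drawn uniformly over a wider square, so some of them
# land inside the normal clusters and cannot be told apart.
X_outliers = rng.uniform(low=-5, high=5, size=(50, 2))

# Fit on normal data only; the model never sees X_outliers.
# random_state is fixed for repeatability, everything else is default.
clf = IsolationForest(random_state=0)
clf.fit(X_train)

print(X_train.shape, X_test.shape, X_outliers.shape)
```

Fitting on normal data alone is what makes this an anomaly detection setup rather than supervised classification.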

Note that the algorithm is given no information about the anomalous data. We use the trained Isolation Forest instance to predict, for the normal testing data and for the anomalous data alike, whether each observation is normal or anomalous. To examine how the algorithm performs, we append the predicted labels to X_outliers (step 7) and then plot the predictions of the Isolation Forest instance on the outliers (step 8). We see that it captured most of the anomalies. Those that it labeled incorrectly were indistinguishable from normal observations. Next, in step 9, we append the predicted labels to X_test in preparation for analysis and then plot the predictions of the Isolation Forest instance on the normal testing data (step 10). We see that it correctly labeled the majority of normal observations. At the same time, it incorrectly classified a significant number of normal observations (shown in red).
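As a numeric counterpart to the plots, the two prediction passes can be summarized as a detection rate on the outliers and a false alarm rate on the normal testing data. The sketch below regenerates synthetic data under the same assumptions as before (illustrative seed, cluster layout, and sizes, none of them taken from the recipe) and relies on the scikit-learn convention that `predict` returns `1` for inliers and `-1` for anomalies.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Assumed synthetic data, for illustration only.
rng = np.random.RandomState(12)
base_train = 0.5 * rng.randn(500, 2)
X_train = np.r_[base_train + 3, base_train]
base_test = 0.5 * rng.randn(500, 2)
X_test = np.r_[base_test + 3, base_test]
X_outliers = rng.uniform(low=-5, high=5, size=(50, 2))

clf = IsolationForest(random_state=0).fit(X_train)

# predict() returns 1 for points judged normal, -1 for anomalies.
pred_test = clf.predict(X_test)
pred_outliers = clf.predict(X_outliers)

# Share of true outliers that were flagged as anomalous.
detection_rate = np.mean(pred_outliers == -1)
# Share of normal test points that were wrongly flagged.
false_alarm_rate = np.mean(pred_test == -1)

print(f"detection rate on outliers:  {detection_rate:.2f}")
print(f"false alarm rate on normals: {false_alarm_rate:.2f}")
```

Reading the two rates side by side makes the trade-off explicit: catching more anomalies usually means flagging more normal points as well.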

Depending on how many false alarms we are willing to tolerate, we may need to fine-tune our classifier to reduce the number of false positives.
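One possible tuning knob is the `contamination` parameter of scikit-learn's `IsolationForest`, which sets the proportion of training points treated as anomalous and thereby moves the decision threshold. The sketch below compares two values on assumed synthetic data. The values 0.1 and 0.01 are arbitrary illustrations, not recommendations, and other parameters may matter as well.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Assumed synthetic data, for illustration only.
rng = np.random.RandomState(12)
base_train = 0.5 * rng.randn(500, 2)
X_train = np.r_[base_train + 3, base_train]
base_test = 0.5 * rng.randn(500, 2)
X_test = np.r_[base_test + 3, base_test]
X_outliers = rng.uniform(low=-5, high=5, size=(50, 2))

false_alarms = {}
detections = {}
for contamination in (0.1, 0.01):
    # Same random_state, so both models grow identical trees and
    # differ only in where the anomaly threshold is placed.
    clf = IsolationForest(contamination=contamination, random_state=0)
    clf.fit(X_train)
    false_alarms[contamination] = np.mean(clf.predict(X_test) == -1)
    detections[contamination] = np.mean(clf.predict(X_outliers) == -1)

for c in (0.1, 0.01):
    print(f"contamination={c}: "
          f"false alarms={false_alarms[c]:.2f}, "
          f"detections={detections[c]:.2f}")
```

With identical trees, lowering `contamination` can only shrink the set of flagged points, so the false alarm rate falls at the cost of possibly missing more true anomalies.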
