
Basic tuning

So you've built a model. Now what? Can you call it a day? Chances are, you'll still have some optimization to do. Optimizing our algorithms and methods is a key part of the machine learning process. In this section, we'll cover the basic concepts of optimization, and we'll continue to build on these tuning methods throughout the following chapters.

When our models do not perform well on new data, the cause can be overfitting or underfitting. Let's cover some methods that we can use to prevent this from happening. First, let's look at the random forest classifier that we trained earlier. In your notebook, call its predict method and pass in the x_test data to receive some predictions:

predicted = rf_classifier.predict(x_test)

From this, we can evaluate the performance of our classifier with a confusion matrix, which tabulates the actual classes against the predicted ones so that misclassifications are easy to spot. Pandas makes this easy with the crosstab function:

pd.crosstab(y_test, predicted, rownames=['Actual'], colnames=['Predicted'])
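To make the layout of a confusion matrix concrete, here is a minimal, self-contained sketch. The labels below are made up for illustration and merely stand in for y_test and predicted; they are not the output of the classifier trained earlier. Rows hold the actual classes, columns hold the predicted ones, and the counts on the diagonal are the correct predictions.

```python
import pandas as pd

# Hypothetical labels standing in for y_test and predicted (illustration only).
actual = pd.Series(["cat", "cat", "cat", "dog", "dog", "bird", "bird", "bird"])
predicted = pd.Series(["cat", "cat", "dog", "dog", "dog", "bird", "cat", "bird"])

# Rows are the actual classes, columns are the predicted classes.
confusion = pd.crosstab(actual, predicted, rownames=["Actual"], colnames=["Predicted"])
print(confusion)

# Diagonal cells count correct predictions; every other cell is a misclassification.
correct = sum(
    confusion.at[label, label]
    for label in confusion.index
    if label in confusion.columns
)
accuracy = correct / len(actual)
print(f"accuracy: {accuracy:.2f}")
```

Reading across a row shows where a given class tends to be mistaken for another, which is often more informative than a single accuracy number.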

You should see the output as follows: 

As you can see, our model performed fairly well on this dataset (it is a simple one, after all!). But what happens if our model doesn't perform well? Let's take a look at what could happen.
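When a model does fall short, one common first check is to compare its accuracy on the training data with its accuracy on the test data. The sketch below assumes scikit-learn's RandomForestClassifier and uses a synthetic dataset from make_classification so that it runs on its own; the data, split size, and parameters are assumptions for illustration, not the dataset or settings used earlier.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, used only so that this sketch is self-contained.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Illustrative settings; not necessarily those of the earlier classifier.
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=0)
rf_classifier.fit(x_train, y_train)

# score() returns the mean accuracy on the data it is given.
train_accuracy = rf_classifier.score(x_train, y_train)
test_accuracy = rf_classifier.score(x_test, y_test)

print(f"train accuracy: {train_accuracy:.3f}")
print(f"test accuracy:  {test_accuracy:.3f}")
print(f"gap:            {train_accuracy - test_accuracy:.3f}")
```

As a rough rule of thumb, a large gap between the two scores suggests overfitting, while low scores on both suggest underfitting.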
