- Learning Data Mining with Python(Second Edition)
- Robert Layton
- 2021-07-02 23:40:06
Moving towards a standard workflow
Estimators in scikit-learn have two main functions: fit() and predict(). We train the algorithm using the fit() method on our training set. We evaluate it using the predict() method on our testing set.
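To make the fit()/predict() contract concrete, here is a minimal sketch of a toy estimator that follows the same two-method shape. The class name `MajorityClassEstimator` and the tiny arrays are hypothetical and exist only for illustration; this is not a scikit-learn class.

```python
import numpy as np


class MajorityClassEstimator:
    """Toy estimator: always predicts the most common training label."""

    def fit(self, X, y):
        # "Training" here just records the most frequent label in y.
        values, counts = np.unique(y, return_counts=True)
        self.majority_ = values[np.argmax(counts)]
        return self  # returning self mirrors the scikit-learn convention

    def predict(self, X):
        # Produce one prediction per row of X.
        return np.full(len(X), self.majority_)


# Hypothetical toy data: four training samples, two testing samples
X_train_demo = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train_demo = np.array([1, 1, 0, 1])
X_test_demo = np.array([[4.0], [5.0]])

toy = MajorityClassEstimator().fit(X_train_demo, y_train_demo)
toy_predictions = toy.predict(X_test_demo)
print(toy_predictions)
```

Any object with this shape is used the same way: fit() sees only training data, and predict() is called on data the estimator has not seen.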
- First, we need to create these training and testing sets. As before, import and run the train_test_split function:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=14)
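As a rough illustration of what the split produces, the sketch below runs train_test_split on random stand-in data (an assumption; `X_demo` and `y_demo` are not the chapter's dataset). With no size arguments, the function holds out 25 percent of the samples for testing, and fixing random_state makes the split repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Random stand-in data (assumption): 100 samples, 4 features, binary labels
rng = np.random.RandomState(14)
X_demo = rng.rand(100, 4)
y_demo = rng.randint(0, 2, size=100)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=14)
print(X_tr.shape, X_te.shape)

# Repeating the call with the same random_state gives the same split
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(X_demo, y_demo, random_state=14)
same_split = np.array_equal(X_te, X_te2) and np.array_equal(y_te, y_te2)
print(same_split)
```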
- Then, we import the nearest neighbor class and create an instance of it. We leave the parameters as defaults for now and will test other values later in this chapter. By default, the algorithm will choose the five nearest neighbors to predict the class of a testing sample:
from sklearn.neighbors import KNeighborsClassifier
estimator = KNeighborsClassifier()
- After creating our estimator, we must then fit it on our training dataset. For the nearest neighbor class, this training step simply records our dataset, allowing us to find the nearest neighbor for a new data point, by comparing that point to the training dataset:
estimator.fit(X_train, y_train)
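The following sketch shows the default neighbor count and the neighbor lookup that a fitted nearest neighbor estimator performs. It uses synthetic data from make_classification as a stand-in (an assumption), and the names `knn_demo`, `X_demo`, and `y_demo` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption; not the chapter's dataset)
X_demo, y_demo = make_classification(n_samples=60, n_features=4, random_state=14)

knn_demo = KNeighborsClassifier()
print(knn_demo.n_neighbors)  # the default number of neighbors

# Fitting stores the training data so neighbors can be looked up later
knn_demo.fit(X_demo, y_demo)

# Find the nearest training points for a single query sample
distances, indices = knn_demo.kneighbors(X_demo[:1])
neighbor_labels = y_demo[indices[0]]

# With the default uniform weighting, the prediction is the majority label
majority_label = np.bincount(neighbor_labels).argmax()
predicted_label = knn_demo.predict(X_demo[:1])[0]
print(neighbor_labels, majority_label, predicted_label)
```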
- We then predict the classes of our testing set and evaluate the predictions against the true labels:
import numpy as np
y_predicted = estimator.predict(X_test)
# Percentage of testing samples whose predicted class matches the true class
accuracy = np.mean(y_test == y_predicted) * 100
print("The accuracy is {0:.1f}%".format(accuracy))
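For comparison, the same accuracy figure can be obtained through scikit-learn's own helpers. The sketch below assumes synthetic stand-in data (so its number will differ from the chapter's result) and checks that the manual mean, accuracy_score, and the estimator's score() method agree.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption; not the chapter's dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=14)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=14)

knn_demo = KNeighborsClassifier().fit(X_tr, y_tr)
y_pred = knn_demo.predict(X_te)

# Three equivalent ways to compute accuracy as a percentage
manual = np.mean(y_te == y_pred) * 100
via_metric = accuracy_score(y_te, y_pred) * 100
via_score = knn_demo.score(X_te, y_te) * 100

print("manual={0:.1f}% metric={1:.1f}% score={2:.1f}%".format(manual, via_metric, via_score))
```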
This model scores 86.4 percent accuracy, which is impressive for a default algorithm and just a few lines of code! Most scikit-learn default parameters are chosen deliberately to work well with a range of datasets. However, you should always aim to choose parameters based on knowledge of the application and on experimentation. We will use strategies for doing this parameter search in later chapters.
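As a rough preview of what choosing a parameter by experiment can look like, the sketch below compares a few n_neighbors values using cross-validation. The candidate values, the five folds, and the synthetic data are all assumptions made for illustration, not the strategy the book goes on to use.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption; not the chapter's dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=14)

# Hypothetical candidate values for the number of neighbors
candidate_k = [1, 3, 5, 7, 9]
mean_accuracy = {}
for k in candidate_k:
    knn_k = KNeighborsClassifier(n_neighbors=k)
    # Mean accuracy over 5 folds, expressed as a percentage
    scores = cross_val_score(knn_k, X_demo, y_demo, cv=5, scoring="accuracy")
    mean_accuracy[k] = np.mean(scores) * 100

best_k = max(mean_accuracy, key=mean_accuracy.get)
for k in candidate_k:
    print("n_neighbors={0}: {1:.1f}%".format(k, mean_accuracy[k]))
print("Best value on this data:", best_k)
```

The value that wins on one dataset may not win on another, which is why the choice should be checked against the data at hand.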