- Machine Learning for OpenCV
- Michael Beyeler
- 2021-07-02 19:47:25
Splitting the data into training and test sets
We learned in the previous chapter that it is essential to keep training and test data separate. We can easily split the data using train_test_split, one of the many helper functions in scikit-learn's model_selection module:
In [11]: X_train, X_test, y_train, y_test = model_selection.train_test_split(
    ...      data, target, test_size=0.1, random_state=42
    ...  )
Here we split the data into 90 percent training data and 10 percent test data, which we specify with test_size=0.1. Setting random_state=42 fixes the seed of the shuffle, so the same split is produced on every run. Inspecting the shapes of the returned arrays, we see that we ended up with exactly 90 training data points and 10 test data points:
In [12]: X_train.shape, y_train.shape
Out[12]: ((90, 4), (90,))
In [13]: X_test.shape, y_test.shape
Out[13]: ((10, 4), (10,))
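To make the bookkeeping behind these shapes concrete, here is a minimal pure-Python sketch of what such a split does conceptually: shuffle the row indices with a fixed seed, then cut off the test portion. The helper name simple_train_test_split and the toy data are assumptions made for illustration only. This is not scikit-learn's implementation, and it will not assign the same rows to each set as train_test_split does.

```python
import math
import random


def simple_train_test_split(data, target, test_size=0.1, seed=42):
    """Hypothetical helper: shuffle row indices, then cut off a test portion."""
    if len(data) != len(target):
        raise ValueError("data and target must have the same length")
    n_samples = len(data)
    # Round the test count up; the training set gets whatever is left
    n_test = math.ceil(test_size * n_samples)
    indices = list(range(n_samples))
    # A seeded generator makes the shuffle, and so the split, repeatable
    random.Random(seed).shuffle(indices)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [data[i] for i in train_idx]
    X_test = [data[i] for i in test_idx]
    y_train = [target[i] for i in train_idx]
    y_test = [target[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Toy stand-in (assumed) for a dataset with 100 samples and 4 features
data = [[i, i + 1, i + 2, i + 3] for i in range(100)]
target = [i % 2 for i in range(100)]

X_train, X_test, y_train, y_test = simple_train_test_split(data, target)
print(len(X_train), len(X_test))  # 90 10
```

Because every index lands in exactly one of the two lists, no data point can appear in both the training and the test set, which is the property the split is meant to guarantee.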