- Machine Learning in Java
- AshishSingh Bhatia Bostjan Kaluza
- 2021-06-10 19:30:05
Train and test sets
To estimate the generalization error, we split our data into two parts: training data and testing data. A general rule of thumb is a training-to-testing ratio of 70:30. We first train the predictor on the training data, then predict the values for the test data, and finally compute the error, that is, the difference between the predicted and the true values. This gives us an estimate of the true generalization error.
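The procedure described above can be sketched in plain Java. This is only an illustrative sketch, not the book's own code: the class name `TrainTestSplitSketch`, the toy data, the fixed seed, and the stand-in predictor (a least-squares line through the origin) are all assumptions made for the example, and mean absolute error is just one possible way to measure the difference between predicted and true values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class TrainTestSplitSketch {

    /** Returns the indices 0..n-1 in a random order that is reproducible for a given seed. */
    public static List<Integer> shuffledIndices(int n, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));
        return indices;
    }

    /** Number of training examples for a given ratio, for example 0.7 for a 70:30 split. */
    public static int trainSize(int n, double trainRatio) {
        return (int) Math.round(n * trainRatio);
    }

    /** Mean absolute difference between predicted and true values. */
    public static double meanAbsoluteError(double[] predicted, double[] actual) {
        double sum = 0.0;
        for (int i = 0; i < actual.length; i++) {
            sum += Math.abs(predicted[i] - actual[i]);
        }
        return sum / actual.length;
    }

    public static void main(String[] args) {
        // Hypothetical toy data in which y is roughly 2 * x.
        double[] x = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        double[] y = {2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 20.2};

        // Shuffle first so that the test set is not just the tail of the file.
        List<Integer> indices = shuffledIndices(x.length, 42L);
        int nTrain = trainSize(x.length, 0.7);
        List<Integer> train = indices.subList(0, nTrain);
        List<Integer> test = indices.subList(nTrain, indices.size());

        // "Train" the stand-in predictor on the training examples only.
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i : train) {
            sxy += x[i] * y[i];
            sxx += x[i] * x[i];
        }
        double slope = sxy / sxx;

        // Predict on the held-out test examples and compute the error.
        double[] predicted = new double[test.size()];
        double[] actual = new double[test.size()];
        for (int k = 0; k < test.size(); k++) {
            int i = test.get(k);
            predicted[k] = slope * x[i];
            actual[k] = y[i];
        }
        System.out.printf("train=%d test=%d MAE=%.3f%n",
                train.size(), test.size(), meanAbsoluteError(predicted, actual));
    }
}
```

The important design point is that the test indices never take part in fitting the slope; the error is computed only on examples the predictor has not seen.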
The estimate rests on two assumptions: first, that the test set is an unbiased sample from our dataset; and second, that the actual new data will resemble the distribution of our training and testing examples. Violations of the first assumption can be mitigated by cross-validation and stratification. Also, if data is scarce, one can't afford to leave out a considerable amount of it for a separate test set, as learning algorithms do not perform well if they don't receive enough data. In such cases, cross-validation is used instead.
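To make the mention of cross-validation and stratification more concrete, here is a minimal sketch of how examples could be assigned to k folds so that each class is spread evenly across them. It is an assumption-laden illustration rather than the method of any particular library: the class `StratifiedFoldsSketch`, the method `assignFolds`, the labels, and the seed are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

public class StratifiedFoldsSketch {

    /**
     * Assigns every example to one of k folds. Examples are dealt out class by class
     * in round-robin order, so the per-class counts in any two folds differ by at most one.
     */
    public static int[] assignFolds(String[] labels, int k, long seed) {
        // Group example indices by class label (TreeMap keeps the class order deterministic).
        Map<String, List<Integer>> byClass = new TreeMap<>();
        for (int i = 0; i < labels.length; i++) {
            byClass.computeIfAbsent(labels[i], key -> new ArrayList<>()).add(i);
        }

        int[] fold = new int[labels.length];
        Random random = new Random(seed);
        int next = 0; // the round-robin counter continues across classes to balance fold sizes
        for (List<Integer> members : byClass.values()) {
            Collections.shuffle(members, random);
            for (int i : members) {
                fold[i] = next % k;
                next++;
            }
        }
        return fold;
    }

    public static void main(String[] args) {
        // Hypothetical imbalanced labels: six examples of class "a", three of class "b".
        String[] labels = {"a", "a", "a", "a", "a", "a", "b", "b", "b"};
        int k = 3;
        int[] fold = assignFolds(labels, k, 7L);

        // In round f, fold f is the test set and the remaining folds form the training set.
        for (int f = 0; f < k; f++) {
            List<Integer> test = new ArrayList<>();
            List<Integer> train = new ArrayList<>();
            for (int i = 0; i < labels.length; i++) {
                if (fold[i] == f) {
                    test.add(i);
                } else {
                    train.add(i);
                }
            }
            System.out.println("fold " + f + ": train=" + train + " test=" + test);
        }
    }
}
```

Every example serves as test data exactly once and as training data in the other k - 1 rounds, which is why this approach is attractive when data is scarce; the per-round errors would then typically be averaged.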