- Statistics for Data Science
- James D. Miller
- 2021-07-02 14:58:55
Cross-validation
Cross-validation is a method for assessing the performance of a data science process. It is used mainly with predictive modeling to estimate how accurately a model might perform in practice: cross-validation checks how well a model will generalize, in other words, how well it can apply what it infers from samples to an entire population (or recordset).
With cross-validation, you train the model on a dataset of known data (your training dataset) and then test it against a dataset of unknown data, that is, data the model has not yet seen (your testing dataset). The objective is to control problems such as overfitting (fitting the model so closely to the training data that it captures noise rather than the underlying pattern) and to gain insight into how the model will generalize to a real problem or a real data file.
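The known/unknown split described above can be sketched in plain Python. This is a minimal hold-out illustration, not a prescribed implementation; the function name `train_test_split` and the 25% test fraction are assumptions for the example.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=42):
    """Shuffle the records and split them into a training set (known data
    the model learns from) and a held-out testing set (unseen data)."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = records[:]              # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]   # (training, testing)

data = list(range(100))
train, test = train_test_split(data)
```

In practice a library routine (for example, scikit-learn's `train_test_split`) would be used instead, but the idea is the same: the testing records never participate in training.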
The cross-validation process consists of partitioning the data into complementary subsets, performing the analysis on one subset (called the training set) and validating the analysis on the other subset (called the validation set or testing set). To reduce variability, multiple iterations (also called folds or rounds) of cross-validation are performed using different partitions, and the validation results are averaged over the rounds. Typically, a data scientist will use the model's stability across rounds to decide how many rounds of cross-validation to perform.
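The rounds described above can be sketched as a simple k-fold index generator. This is an illustrative sketch in plain Python; the function name `k_fold_indices` is an assumption, and the model-fitting step is deliberately omitted.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k rounds.
    Each round holds out a different, non-overlapping validation fold."""
    # Distribute any remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

# Average a per-round score over the folds, as the text describes.
scores = []
for train_idx, val_idx in k_fold_indices(10, 5):
    # In a real workflow: fit the model on train_idx, score it on val_idx.
    scores.append(len(val_idx))       # placeholder "score" for the sketch
average_score = sum(scores) / len(scores)
```

Each sample appears in the validation set exactly once across the k rounds, which is what makes the averaged result less sensitive to any single partition.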