- Statistics for Machine Learning
- Pratap Dangeti
- 320字
- 2021-07-02 19:05:58
Compensating factors in machine learning models
Compensating factors in machine learning models to equate statistical diagnostics is explained with the example of a beam being supported by two supports. If one of the supports doesn't exist, the beam will eventually fall down by moving out of balance. A similar analogy is applied for comparing statistical modeling and machine learning methodologies here.
The two-point validation is performed on the statistical modeling methodology on training data using overall model accuracy and individual parameters significance test. Due to the fact that either linear or logistic regression has less variance by shape of the model itself, hence there would be very little chance of it working worse on unseen data. Hence, during deployment, these models do not incur too many deviated results.
However, in the machine learning space, models have a high degree of flexibility which can change from simple to highly complex. On top, statistical diagnostics on individual variables are not performed in machine learning. Hence, it is important to ensure the robustness to avoid overfitting of the models, which will ensure its usability during the implementation phase to ensure correct usage on unseen data.
As mentioned previously, in machine learning, data will be split into three parts (train data - 50 percent, validation data - 25 percent, testing data - 25 percent) rather than two parts in statistical methodology. Machine learning models should be developed on training data, and its hyperparameters should be tuned based on validation data to ensure the two-point validation equivalence; this way, the robustness of models is ensured without diagnostics performed at an individual variable level:

Before diving deep into comparisons between both streams, we will start understanding the fundamentals of each model individually. Let us start with linear regression! This model might sound trivial; however, knowing the linear regression working principles will create a foundation for more advanced statistical and machine learning models. Below are the assumptions of linear regression.
- DevOps:軟件架構師行動指南
- 解構產品經理:互聯網產品策劃入門寶典
- Python高效開發實戰:Django、Tornado、Flask、Twisted(第3版)
- 用Flutter極速構建原生應用
- C語言程序設計案例精粹
- FLL+WRO樂高機器人競賽教程:機械、巡線與PID
- Jenkins Continuous Integration Cookbook(Second Edition)
- Natural Language Processing with Java and LingPipe Cookbook
- 搞定J2EE:Struts+Spring+Hibernate整合詳解與典型案例
- 一本書講透Java線程:原理與實踐
- 時空數據建模及其應用
- OpenStack Networking Essentials
- 零代碼實戰:企業級應用搭建與案例詳解
- Python物理建模初學者指南(第2版)
- Using Yocto Project with BeagleBone Black