- Statistics for Machine Learning
- Pratap Dangeti
Example of simple linear regression from first principles
The entire chapter uses the popular wine quality dataset, which is openly available from the UCI machine learning repository at https://archive.ics.uci.edu/ml/datasets/Wine+Quality.
Simple linear regression is a straightforward approach for predicting the dependent/response variable Y from a single independent/predictor variable X. It assumes an approximately linear relationship between X and Y:

Y = β0 + β1X + ε

where ε is an error term capturing the variation in Y that the linear model cannot explain.
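To make the roles of the terms concrete, we can simulate data from this model with known parameters (the parameter values and sample size below are hypothetical, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# True (hypothetical) parameters, chosen for illustration only.
beta0_true, beta1_true = 2.0, 0.5

x = rng.uniform(0, 10, size=100)       # predictor X
eps = rng.normal(0, 1.0, size=100)     # irreducible error term epsilon
y = beta0_true + beta1_true * x + eps  # response Y = beta0 + beta1*X + eps
```

Fitting a line to (x, y) should recover estimates close to the true intercept 2.0 and slope 0.5, with the error term preventing an exact fit.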
β0 and β1 are two unknown constants: the intercept and slope parameters respectively. Once we estimate these constants from training data, we can use them to predict the dependent variable:

ŷ = β̂0 + β̂1x

Here ŷ denotes the predicted response at X = x, and the hats indicate estimated values. The ith residual is the difference between the ith observed response value and the ith response value predicted by the model:

e_i = y_i − ŷ_i

The residual sum of squares (RSS) aggregates these errors:

RSS = e_1² + e_2² + … + e_n²

The least squares approach chooses the estimates β̂0 and β̂1 that minimize the RSS:

β̂1 = Σ(x_i − x̄)(y_i − ȳ) / Σ(x_i − x̄)²
β̂0 = ȳ − β̂1x̄

where x̄ and ȳ are the sample means of X and Y.
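The least squares estimates can be computed from first principles in a few lines. This sketch uses a small hypothetical sample (the values below are illustrative, not taken from the UCI wine quality data):

```python
import numpy as np

# Hypothetical toy sample standing in for one predictor and the response;
# the numbers are illustrative, not taken from the UCI dataset.
x = np.array([7.4, 7.8, 11.2, 7.4, 7.9, 6.7, 7.5, 8.1])
y = np.array([5.0, 5.0, 6.0, 5.0, 5.0, 5.0, 6.0, 5.0])

x_bar, y_bar = x.mean(), y.mean()

# Closed-form least squares estimates of slope and intercept
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

y_hat = beta0_hat + beta1_hat * x   # predicted responses
residuals = y - y_hat               # e_i = y_i - yhat_i
rss = np.sum(residuals ** 2)        # residual sum of squares
```

By construction, no other choice of intercept and slope can produce a smaller RSS on this sample.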
To show statistically that the linear regression is significant, we have to perform a hypothesis test. We start with the null hypothesis that there is no relationship between X and Y, against the alternative that there is some relationship:

H0: β1 = 0
Ha: β1 ≠ 0
If β1 = 0, the model reduces to Y = β0 + ε and shows no association between the variables; this is what the null hypothesis asserts. To prove this assumption right or wrong, we need to determine whether our estimate β̂1 is sufficiently far from 0 (statistically significant in distance from 0, to be precise) that we can be confident the true β1 is nonzero and a significant relationship exists between the variables. Now, the question is: how far is far enough from zero? That depends on the accuracy of β̂1, which is measured by its standard error SE(β̂1) (analogous to a standard deviation). If the standard error is small, even a relatively small estimate may provide strong evidence that β1 ≠ 0 and hence that there is a relationship between X and Y. In contrast, if SE(β̂1) is large, then β̂1 must be large in absolute value for us to reject the null hypothesis. To check how many standard errors β̂1 is away from 0, we compute the following t-statistic:
t = (β̂1 − 0) / SE(β̂1)

Under the null hypothesis, this statistic follows a t-distribution with n − 2 degrees of freedom, where n is the number of observations.
Given this t-value, we calculate the probability of observing any value at least as extreme as |t| under the assumption that β1 = 0; this probability is known as the p-value. If the p-value < 0.05, β̂1 is significantly far from 0, so we reject the null hypothesis and conclude that a significant relationship exists, whereas if the p-value > 0.05, we fail to reject the null hypothesis and conclude that there is no significant evidence of a relationship between the variables.
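The standard error, t-statistic, and p-value can all be computed from the residuals. A minimal sketch, again using a hypothetical sample (illustrative values, not the UCI data):

```python
import numpy as np
from scipy import stats

# Hypothetical sample (illustrative values, not the UCI data)
x = np.array([7.4, 7.8, 11.2, 7.4, 7.9, 6.7, 7.5, 8.1, 6.9, 9.3])
y = np.array([5.0, 5.0, 6.0, 5.0, 5.0, 5.0, 6.0, 5.0, 6.0, 6.0])
n = len(x)

# Least squares estimates
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()
residuals = y - (beta0_hat + beta1_hat * x)

# Residual standard error (n - 2 degrees of freedom) and SE of the slope
rse = np.sqrt(np.sum(residuals ** 2) / (n - 2))
se_beta1 = rse / np.sqrt(np.sum((x - x.mean()) ** 2))

# t-statistic and two-sided p-value under H0: beta1 = 0
t_stat = beta1_hat / se_beta1
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
```

The same quantities are returned by `scipy.stats.linregress`, which is a convenient cross-check for the hand-rolled version.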
Once we have the coefficient estimates, we predict the dependent variable and check the R-squared value: if R² >= 0.7, the model is good enough to deploy on unseen data, whereas if it is not such a good value (< 0.6), we can conclude that the model is not good enough to deploy.
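R-squared is the fraction of the total variation in Y that the model explains, R² = 1 − RSS/TSS. A short sketch on a hypothetical sample (illustrative values only); note that in simple linear regression R² also equals the squared correlation between X and Y:

```python
import numpy as np

# Hypothetical toy sample (illustrative values only)
x = np.array([7.4, 7.8, 11.2, 7.4, 7.9, 6.7, 7.5, 8.1, 6.9, 9.3])
y = np.array([5.0, 5.0, 6.0, 5.0, 5.0, 5.0, 6.0, 5.0, 6.0, 6.0])

# Least squares fit
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()
y_hat = beta0_hat + beta1_hat * x

rss = np.sum((y - y_hat) ** 2)      # unexplained variation
tss = np.sum((y - y.mean()) ** 2)   # total variation
r_squared = 1.0 - rss / tss
```

In practice R² would be checked on held-out data as well, since a high training R² alone does not guarantee good performance on unseen data.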