
Evaluating the fitness of the model with a cost function

Regression lines produced by several sets of parameter values are plotted in the following figure. How can we assess which parameters produced the best-fitting regression line?

A cost function, also called a loss function, is used to define and measure the error of a model. The differences between the prices predicted by the model and the observed prices of the pizzas in the training set are called residuals, or training errors. Later, we will evaluate the model on a separate set of test data. The differences between the predicted and observed values in the test data are called prediction errors, or test errors. The residuals for our model are indicated by vertical lines between the points for the training instances and the regression hyperplane in the following plot:

We can produce the best pizza-price predictor by minimizing the sum of the squared residuals. That is, our model fits well if the values it predicts for the response variable are close to the observed values for all of the training examples. This measure of the model's fitness is called the residual sum of squares (RSS) cost function. Formally, this function assesses the fitness of a model by summing the squared residuals for all of our training examples. Squaring keeps positive and negative residuals from canceling each other out. The RSS is calculated with the following equation, where $y_i$ is the observed value, $f(x_i)$ is the predicted value, and $n$ is the number of training examples:

$$\mathrm{RSS} = \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^2$$
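As a minimal sketch of how the RSS separates better-fitting lines from worse ones, the snippet below scores a few candidate regression lines. The training data and the candidate (intercept, slope) pairs are assumptions chosen for illustration, and `residual_sum_of_squares` is a hypothetical helper, not part of any library.

```python
import numpy as np

# Assumed illustrative training data: pizza diameters (inches) and prices (dollars).
diameters = np.array([6.0, 8.0, 10.0, 14.0, 18.0])
prices = np.array([7.0, 9.0, 13.0, 17.5, 18.0])


def residual_sum_of_squares(intercept, slope, x, y):
    """Sum the squared differences between observed y and intercept + slope * x."""
    predicted = intercept + slope * x
    residuals = y - predicted
    return float(np.sum(residuals ** 2))


# Hypothetical candidate parameter sets, one (intercept, slope) pair per line.
candidates = [(0.0, 1.0), (2.0, 1.0), (1.0, 1.2)]
scores = {
    params: residual_sum_of_squares(params[0], params[1], diameters, prices)
    for params in candidates
}

# The candidate with the smallest RSS is the best fit among those tried.
best = min(scores, key=scores.get)

for (intercept, slope), score in scores.items():
    print('intercept=%.1f slope=%.1f RSS=%.2f' % (intercept, slope, score))
print('Best candidate:', best)
```

Comparing a handful of candidates only ranks the lines that were tried; it does not guarantee that the best of them is the line that minimizes the RSS overall.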

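Let's compute the RSS for our model by adding the following lines to the previous script. The mean of the squared residuals, called the mean squared error, is printed alongside it for comparison:

residuals = model.predict(X) - y
print('Residual sum of squares: %.2f' % np.sum(residuals ** 2))
print('Mean squared error: %.2f' % np.mean(residuals ** 2))
Residual sum of squares: 8.75
Mean squared error: 1.75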

Now that we have a cost function, we can find the values of the model's parameters that minimize it.
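One way to see what minimizing the cost function means is to fit a line by least squares and check that moving the parameters away from the fitted values does not reduce the RSS. The sketch below uses NumPy's `np.polyfit` for the fit, which is not necessarily the method used in the rest of this text. The data and the helper `rss` are assumptions for illustration.

```python
import numpy as np

# Assumed illustrative training data: pizza diameters (inches) and prices (dollars).
x = np.array([6.0, 8.0, 10.0, 14.0, 18.0])
y = np.array([7.0, 9.0, 13.0, 17.5, 18.0])

# For degree 1, np.polyfit returns the least-squares coefficients
# ordered from highest power to lowest: [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)


def rss(b0, b1):
    """Residual sum of squares for the line b0 + b1 * x on the data above."""
    return float(np.sum((y - (b0 + b1 * x)) ** 2))


fitted_rss = rss(intercept, slope)

# Nudge each parameter up and down; every nudged line should score worse.
nudges = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.05), (0.0, -0.05)]
nudged_rss = [rss(intercept + d0, slope + d1) for d0, d1 in nudges]

print('intercept=%.4f slope=%.4f RSS=%.4f' % (intercept, slope, fitted_rss))
for (d0, d1), value in zip(nudges, nudged_rss):
    print('shift intercept by %+.2f, slope by %+.2f: RSS=%.4f' % (d0, d1, value))
```

Because the RSS is a convex quadratic function of the intercept and slope, any move away from the least-squares solution increases it, and the nudged values show this.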
