
Evaluating the fitness of the model with a cost function

Regression lines produced by several sets of parameter values are plotted in the following figure. How can we assess which parameters produced the best-fitting regression line?

A cost function, also called a loss function, is used to define and measure the error of a model. The differences between the prices predicted by the model and the observed prices of the pizzas in the training set are called residuals, or training errors. Later, we will evaluate the model on a separate set of test data. The differences between the predicted and observed values in the test data are called prediction errors, or test errors. The residuals for our model are indicated by vertical lines between the points for the training instances and the regression hyperplane in the following plot:
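As a minimal sketch of how residuals are computed, the snippet below uses a small made-up training set and an arbitrary candidate line. The data and the `intercept` and `slope` values are illustrative assumptions, not the values from this chapter's script.

```python
import numpy as np

# Hypothetical training data (assumed for illustration only)
x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([2.0, 4.5, 5.5, 8.0])

# Hypothetical parameters of a candidate line f(x) = intercept + slope * x
intercept, slope = 0.5, 1.8


def residuals(x, y, intercept, slope):
    """Return observed minus predicted values, one per instance."""
    predicted = intercept + slope * x
    return y - predicted


train_residuals = residuals(x_train, y_train, intercept, slope)
print(train_residuals)
```

Applying the same function to held-out data would give the test errors instead of the training errors.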

We can produce the best pizza-price predictor by minimizing the sum of the squared residuals. That is, our model fits well if the values it predicts for the response variable are close to the observed values for all of the training examples. This measure of the model's fitness is called the residual sum of squares (RSS) cost function. Formally, this function assesses the fitness of a model by summing the squared residuals for all of our training examples. The RSS is calculated with the following formula, where $y_i$ is the observed value, $f(x_i)$ is the predicted value, and $n$ is the number of training examples:

$$\mathrm{RSS} = \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2$$
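To make the comparison between candidate regression lines concrete, here is a rough sketch that scores a few parameter sets by their RSS and picks the lowest. The training data, the candidate `(intercept, slope)` pairs, and the helper name `rss` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def rss(y_observed, y_predicted):
    """Residual sum of squares: the sum of the squared differences."""
    return float(np.sum((y_observed - y_predicted) ** 2))


# Hypothetical training data (assumed for illustration only)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.5, 5.5, 8.0])

# Hypothetical candidate (intercept, slope) pairs
candidates = [(0.0, 1.0), (0.5, 1.8), (1.0, 2.5)]

# Score each candidate line f(x) = intercept + slope * x by its RSS
scores = {params: rss(y, params[0] + params[1] * x) for params in candidates}
best = min(scores, key=scores.get)

for params, score in scores.items():
    print(params, round(score, 2))
print('Lowest RSS:', best)
```

The candidate with the smallest RSS is the best-fitting line among those tried. This only ranks the lines supplied and does not search for the optimal parameters.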

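Let's compute the RSS for our model by adding the following lines to the previous script. We use np.sum rather than np.mean, because averaging the squared residuals would give the mean squared error instead of the RSS:

# Sum the squared residuals over all of the training instances
print('Residual sum of squares: %.2f' % np.sum((model.predict(X)
    - y) ** 2))
Residual sum of squares: 8.75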

Now that we have a cost function, we can find the values of the model's parameters that minimize it.
