Evaluating the fitness of the model with a cost function

Regression lines produced by several sets of parameter values are plotted in the following figure. How can we assess which parameters produced the best-fitting regression line?

A cost function, also called a loss function, is used to define and measure the error of a model. The differences between the prices predicted by the model and the observed prices of the pizzas in the training set are called residuals, or training errors. Later, we will evaluate the model on a separate set of test data. The differences between the predicted and observed values in the test data are called prediction errors, or test errors. The residuals for our model are indicated by vertical lines between the points for the training instances and the regression hyperplane in the following plot:

We can produce the best pizza-price predictor by minimizing the sum of the squared residuals. That is, our model fits well if the values it predicts for the response variable are close to the observed values for all of the training examples. This measure of the model's fitness is called the residual sum of squares (RSS) cost function. Formally, this function assesses the fitness of a model by summing the squared residuals for all of our training examples. The RSS is calculated with the formula in the following equation, where $y_i$ is the observed value, $f(x_i)$ is the predicted value, and $n$ is the number of training examples:

$$SS_{res} = \sum_{i=1}^{n} \left(y_i - f(x_i)\right)^2$$
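As a small worked instance of the RSS definition, the following sketch sums the squared residuals for three training examples. The function name `residual_sum_of_squares` and the observed and predicted values are hypothetical, chosen only to make the arithmetic easy to follow; they are not the pizza data.

```python
import numpy as np


def residual_sum_of_squares(y_observed, y_predicted):
    """Sum the squared differences between observed and predicted values."""
    residuals = np.asarray(y_observed, dtype=float) - np.asarray(y_predicted, dtype=float)
    return float(np.sum(residuals ** 2))


# Hypothetical values, for illustration only
y_observed = [7.0, 9.0, 13.0]
y_predicted = [8.0, 9.0, 11.0]

# The residuals are -1, 0 and 2, so the squared residuals are 1, 0 and 4
example_rss = residual_sum_of_squares(y_observed, y_predicted)
print('Residual sum of squares: %.2f' % example_rss)
```

A perfect fit gives an RSS of zero, and because every residual is squared, positive and negative errors cannot cancel each other out.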

Let's compute the RSS for our model by adding the following two lines to the previous script:

print('Residual sum of squares: %.2f'
      % np.sum((model.predict(X) - y) ** 2))
Residual sum of squares: 8.75
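The sum of the squared residuals and their mean are easy to confuse, since they differ only by a factor of the number of training examples. The sketch below computes both for a fitted line. It is a minimal example under stated assumptions: the training values are illustrative and assumed here rather than taken from the earlier script, and `np.polyfit` stands in for the `model` object so that the block runs on its own.

```python
import numpy as np

# Illustrative training data, assumed for this sketch
X = np.array([6.0, 8.0, 10.0, 14.0, 18.0])   # explanatory variable
y = np.array([7.0, 9.0, 13.0, 17.5, 18.0])   # response variable

# Fit a straight line by least squares; polyfit returns the highest power first
slope, intercept = np.polyfit(X, y, deg=1)
predictions = intercept + slope * X

squared_residuals = (y - predictions) ** 2
rss = np.sum(squared_residuals)    # residual sum of squares
mse = np.mean(squared_residuals)   # mean of the squared residuals

print('Residual sum of squares: %.2f' % rss)
print('Mean squared residual: %.2f' % mse)
```

Both quantities are minimized by the same parameter values, because dividing by a constant number of examples does not move the minimum. The mean is the more convenient of the two when comparing data sets of different sizes.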

Now that we have a cost function, we can find the values of the model's parameters that minimize it.
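To make the idea of choosing parameters by their cost concrete, the sketch below scores a few candidate lines by RSS and keeps the one with the lowest value. This brute-force comparison is only an illustration, not the method used to fit the model; the data and the candidate (intercept, slope) pairs are hypothetical.

```python
import numpy as np

# Hypothetical training data that lies exactly on the line y = 1 + 2x
x_train = np.array([1.0, 2.0, 3.0, 4.0])
y_train = np.array([3.0, 5.0, 7.0, 9.0])

# Hypothetical candidate parameter sets: (intercept, slope)
candidates = [(0.0, 2.0), (1.0, 2.0), (1.0, 3.0)]


def rss_for(intercept, slope):
    """Residual sum of squares of the line intercept + slope * x on the training data."""
    predictions = intercept + slope * x_train
    return float(np.sum((y_train - predictions) ** 2))


# Score each candidate and keep the one with the smallest cost
costs = {params: rss_for(*params) for params in candidates}
best_params = min(costs, key=costs.get)

for params, cost in costs.items():
    print('intercept=%.1f slope=%.1f RSS=%.2f' % (params[0], params[1], cost))
print('Best candidate:', best_params)
```

Comparing a handful of candidates only shows how the cost function ranks them; finding the parameters that minimize the cost over all possible values requires solving for them directly.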
