
Evaluating the fitness of the model with a cost function

Regression lines produced by several sets of parameter values are plotted in the following figure. How can we assess which parameters produced the best-fitting regression line?

A cost function, also called a loss function, is used to define and measure the error of a model. The differences between the prices predicted by the model and the observed prices of the pizzas in the training set are called residuals, or training errors. Later, we will evaluate the model on a separate set of test data. The differences between the predicted and observed values in the test data are called prediction errors, or test errors. The residuals for our model are indicated by vertical lines between the points for the training instances and the regression hyperplane in the following plot:

We can produce the best pizza-price predictor by minimizing the sum of the squared residuals. That is, our model fits well if the values it predicts for the response variable are close to the observed values for all of the training examples. This measure of the model's fitness is called the residual sum of squares (RSS) cost function. Formally, this function assesses the fitness of a model by summing the squared residuals for all of our training examples. The RSS is calculated with the following equation, where $y_i$ is the observed value, $f(x_i)$ is the predicted value, and $n$ is the number of training examples:

$$\mathrm{RSS} = \sum_{i=1}^{n} \left(y_i - f(x_i)\right)^2$$
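
To make the definition concrete, here is a minimal sketch of the residual sum of squares in plain Python. The function name `residual_sum_of_squares` and the three observed and predicted values are assumptions made up for illustration; they are not the pizza data or the model from the text.

```python
def residual_sum_of_squares(observed, predicted):
    """Sum the squared differences between observed and predicted values."""
    return sum((y_i - f_i) ** 2 for y_i, f_i in zip(observed, predicted))


# Hypothetical values for illustration only (not the pizza training set).
observed = [2.0, 4.0, 6.0]
predicted = [2.5, 3.5, 6.0]

# One residual per training example: observed minus predicted.
residuals = [y_i - f_i for y_i, f_i in zip(observed, predicted)]
rss = residual_sum_of_squares(observed, predicted)

print('Residuals:', residuals)
print('Sum of residuals: %.2f' % sum(residuals))
print('Residual sum of squares: %.2f' % rss)
```

In this sketch the raw residuals cancel each other out when summed, even though two of the three predictions are wrong. Squaring each residual before summing prevents positive and negative errors from cancelling.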

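Let's compute this cost for our model by adding the following line to the previous script. Note that np.mean averages the squared residuals instead of summing them, so the printed value is the RSS divided by the number of training examples. For a fixed training set, minimizing either quantity selects the same parameters. Replacing np.mean with np.sum would print the RSS itself.

print('Mean of squared residuals: %.2f' % np.mean((model.predict(X) - y) ** 2))
Mean of squared residuals: 1.75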

Now that we have a cost function, we can find the values of the model's parameters that minimize it.
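
As a rough illustration of how a cost function lets us choose between parameter values, the sketch below scores a few candidate lines by their RSS and keeps the one with the lowest cost. The data, the candidate (intercept, slope) pairs, and the helper name `rss_for_line` are all hypothetical. Comparing a handful of hand-picked candidates is only a stand-in for a proper minimization procedure.

```python
def rss_for_line(intercept, slope, xs, ys):
    """RSS of the line f(x) = intercept + slope * x on the given data."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical training data, scattered around the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

# Hypothetical candidate parameter values: (intercept, slope).
candidates = [(0.0, 2.0), (1.0, 2.0), (1.0, 3.0)]

# Score every candidate with the cost function.
costs = {params: rss_for_line(params[0], params[1], xs, ys)
         for params in candidates}

# The best-fitting candidate is the one with the smallest RSS.
best = min(costs, key=costs.get)

for params in candidates:
    print('intercept=%.1f slope=%.1f RSS=%.2f'
          % (params[0], params[1], costs[params]))
print('Best candidate:', best)
```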
