- Machine Learning Algorithms
- Giuseppe Bonaccorso
Underfitting and overfitting
The purpose of a machine learning model is to approximate an unknown function that associates input elements with output ones (for a classifier, we call them classes). A training set is normally a representation of a global distribution, but it cannot contain all possible elements; otherwise, the problem could be solved with a one-to-one association. Likewise, we don't know the analytic expression of the underlying function; therefore, when training, it's necessary to fit the model while keeping it free to generalize when an unknown input is presented. Unfortunately, this ideal condition isn't always easy to achieve, and it's important to consider two different dangers:
- Underfitting: the model isn't able to capture the dynamics shown by the training set itself (probably because its capacity is too limited).
- Overfitting: the model has an excessive capacity and is no longer able to generalize the original dynamics provided by the training set. It can associate almost all the known samples with their corresponding output values, but when an unknown input is presented, the prediction error can be very high.
The following picture shows examples of interpolation with low capacity (underfitting), normal capacity (normal fitting), and excessive capacity (overfitting):
[Figure: three interpolations of the same data, showing underfitting, normal fitting, and overfitting]
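To make the distinction concrete, here is a minimal sketch (using NumPy and scikit-learn; the data and degrees are my assumptions, not an example taken from the book) that fits polynomial regressions of increasing degree to noisy samples of a sine function. The low-degree model underfits, while the very-high-degree one drives the training error down at the cost of a larger test error:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Noisy samples of an unknown underlying function (here, a sine)
rng = np.random.RandomState(1000)
X = np.sort(rng.uniform(-3.0, 3.0, 40)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0.0, 0.2, X.shape[0])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1000)

# Three capacities: too low (1), adequate (4), excessive (15)
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    mse_train = mean_squared_error(y_train, model.predict(X_train))
    mse_test = mean_squared_error(y_test, model.predict(X_test))
    print('Degree {}: train MSE = {:.3f}, test MSE = {:.3f}'.format(
        degree, mse_train, mse_test))
```

The training error keeps decreasing as the degree grows, but the test error follows a U-shaped curve: high at both extremes and lowest at an intermediate capacity.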
It's very important to avoid both underfitting and overfitting. Underfitting is easier to detect by looking at the prediction error, while overfitting may prove more difficult to discover, as it can initially be mistaken for a perfect fit.
Cross-validation and the other techniques we're going to discuss in the next chapters can easily show how our model behaves with test samples that were never seen during the training phase. That way, it's possible to assess the generalization ability in a broader context (remember that we're not working with all possible values, but always with a subset that should reflect the original distribution).
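As a brief illustration of the idea (the dataset, model, and fold count below are arbitrary choices of mine, not prescribed by the text), scikit-learn can score a classifier with 10-fold cross-validation so that every sample is predicted by a model that never saw it during fitting:

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Each of the 10 folds is held out in turn and predicted by a model
# fitted only on the remaining 9 folds
X, y = load_wine(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X, y, cv=10)
print('CV accuracy: {:.3f} +/- {:.3f}'.format(scores.mean(), scores.std()))
```

A large gap between the average cross-validation score and the score obtained on the training set is a typical sign of overfitting.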
However, a general rule of thumb says that a residual error is always necessary to guarantee a good generalization ability, while a model that shows an accuracy of 99.999... percent on the training samples is almost surely overfitted and will likely be unable to predict correctly when never-before-seen input samples are provided.
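The train/test accuracy gap makes this symptom easy to observe. In the following sketch (a synthetic dataset with injected label noise and an unrestricted decision tree; both are my assumptions), the model memorizes the training set almost perfectly but scores noticeably lower on held-out data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic problem with 10% label noise, so a near-perfect training
# score can only come from memorizing the noise
X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=1000)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1000)

# An unconstrained tree grows until every training sample is classified
dt = DecisionTreeClassifier(random_state=1000).fit(X_train, y_train)
print('Training accuracy: {:.3f}'.format(dt.score(X_train, y_train)))
print('Test accuracy: {:.3f}'.format(dt.score(X_test, y_test)))
```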