Knowing how supervised learning works is pretty useless if we can't put it into practice. Thankfully, OpenCV provides a fairly straightforward interface for all its statistical learning models, which include all supervised learning models.
In OpenCV, every machine learning model derives from the cv::ml::StatModel base class. This is fancy talk for saying that if we want to be a machine learning model in OpenCV, we have to provide all the functionality that StatModel tells us to. This includes a method to train the model (called train) and a method to measure the performance of the model (called calcError).
In object-oriented programming (OOP), we deal primarily with objects, which are instances of classes. An object bundles a number of functions, called methods, with variables, called members or attributes. You can learn more about OOP in Python at https://docs.python.org/3/tutorial/classes.html.
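To make the idea of a base class concrete, here is a minimal pure-Python sketch. It is not OpenCV code: the names StatModelSketch, MajorityClassModel, and calc_error are hypothetical and only mimic the kind of contract that a base class such as StatModel imposes on the models derived from it.

```python
from abc import ABC, abstractmethod


class StatModelSketch(ABC):
    """Hypothetical stand-in for a base class; not part of OpenCV."""

    @abstractmethod
    def train(self, samples, labels):
        """Fit the model to some data."""

    @abstractmethod
    def predict(self, samples):
        """Return one predicted label per sample."""

    def calc_error(self, samples, labels):
        # Shared by every subclass: percentage of misclassified samples.
        predictions = self.predict(samples)
        wrong = sum(p != y for p, y in zip(predictions, labels))
        return 100.0 * wrong / len(labels)


class MajorityClassModel(StatModelSketch):
    """Toy model that always predicts the most frequent training label."""

    def __init__(self):
        self.majority_label = None  # an attribute (member) of the object

    def train(self, samples, labels):  # a method of the object
        self.majority_label = max(set(labels), key=labels.count)

    def predict(self, samples):
        return [self.majority_label for _ in samples]


model = MajorityClassModel()             # create an instance of the class
model.train([[0], [1], [2]], [1, 1, 0])  # label 1 is the majority
predictions = model.predict([[3], [4]])
error = model.calc_error([[0], [1], [2]], [1, 1, 0])
```

Because train and predict are declared abstract, Python refuses to instantiate a subclass that forgets to implement them, which is the sense in which a base class tells a model what it has to provide.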
Thanks to this organization of the software, setting up a machine learning model in OpenCV always follows the same logic:
Initialization: We call the model by name to create an empty instance of the model.
Set parameters: If the model needs some parameters, we can set them via setter methods, which can be different for every model. For example, in order for a k-NN algorithm to work, we need to specify its open parameter, k (as we will find out later).
Train the model: Every model must provide a method called train, used to fit the model to some data.
Predict new labels: Every model must provide a method called predict, used to predict the labels of new data.
Score the model: Every model must provide a method called calcError, used to measure performance. This calculation might be different for every model.
Because OpenCV is a vast and community-driven project, not every algorithm follows these rules to the extent that we as users might expect. For example, the k-NN algorithm does most of its work in a findNearest method, although predict still works. We will make sure to point out these discrepancies as we work through different examples.
Because we will occasionally use scikit-learn to implement machine learning algorithms that OpenCV does not provide, it is worth pointing out that learning algorithms in scikit-learn follow an almost identical logic. The most notable difference is that scikit-learn sets all the required model parameters in the initialization step. In addition, it calls the training function fit instead of train, and the scoring function score instead of calcError.
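For comparison, here is a minimal sketch of the same workflow in scikit-learn, using its KNeighborsClassifier on made-up toy data. Note that, for classifiers, score returns the accuracy as a fraction between 0 and 1, so higher is better, whereas calcError reports an error.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data (made up for illustration): two well-separated clusters in 2D.
X_train = np.array(
    [[0, 0], [0, 1], [1, 0], [1, 1],
     [5, 5], [5, 6], [6, 5], [6, 6]], dtype=np.float32)
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X_new = np.array([[0.5, 0.5], [5.5, 5.5]], dtype=np.float32)

# Initialization and parameter setting happen in a single step.
clf = KNeighborsClassifier(n_neighbors=3)

# fit takes the place of train.
clf.fit(X_train, y_train)

# predict works as before, but returns the labels directly.
predicted = clf.predict(X_new)

# score takes the place of calcError and returns the mean accuracy.
accuracy = clf.score(X_train, y_train)

print(predicted, accuracy)
```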