- Machine Learning with Scala Quick Start Guide
- Md. Rezaul Karim
- 2021-06-24 14:32:00
Supervised learning
Supervised learning is the simplest and best-known machine learning task. It relies on a set of labeled examples, in which the category each input belongs to is already known, as shown in the following diagram:

The preceding diagram shows a typical supervised learning workflow. An actor (for example, a data scientist or data engineer) performs Extract, Transform, Load (ETL) and the necessary feature engineering (including feature extraction, selection, and so on) to obtain data with features and labels that can be fed into the model. The actor then splits the data into training, validation (also called development), and test sets. The training set is used to train an ML model, the validation set is used to check the trained model for overfitting and to tune regularization, and the test set (that is, unseen data) is used to evaluate the model's performance.
If the performance is not satisfactory, the actor can tune the model further through hyperparameter optimization to find the best model. Finally, the actor deploys the best model in a production-ready environment. The following diagram summarizes these steps in a nutshell:

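The train, validate, tune, and test loop described above can be sketched in a few lines of plain Scala. This is only an illustration under stated assumptions, not the book's implementation: the model is a one-feature linear fit with an L2 penalty, and the names `TuningSketch`, `fit`, `mse`, and `selectAndEvaluate` are hypothetical. The hyperparameter `lambda` is chosen on the validation set, and the test set is touched only once at the end.

```scala
object TuningSketch {
  // One labeled example: (feature x, label y). Hypothetical type for this sketch.
  type Point = (Double, Double)

  // Fit y ≈ w * x with an L2 penalty lambda (closed form for one feature, no intercept).
  def fit(train: Seq[Point], lambda: Double): Double = {
    val sxy = train.map { case (x, y) => x * y }.sum
    val sxx = train.map { case (x, _) => x * x }.sum
    sxy / (sxx + lambda)
  }

  // Mean squared error of weight w on a dataset.
  def mse(w: Double, data: Seq[Point]): Double =
    data.map { case (x, y) => val e = y - w * x; e * e }.sum / data.size

  // Train one model per candidate lambda, keep the one with the lowest
  // validation error, then report its error on the held-out test set.
  def selectAndEvaluate(
      train: Seq[Point],
      valid: Seq[Point],
      test: Seq[Point],
      lambdas: Seq[Double]): (Double, Double, Double) = {
    val (bestLambda, bestW) =
      lambdas.map(l => (l, fit(train, l))).minBy { case (_, w) => mse(w, valid) }
    (bestLambda, bestW, mse(bestW, test))
  }
}
```

Calling `TuningSketch.selectAndEvaluate(train, valid, test, Seq(0.0, 1.0, 10.0))` returns the chosen `lambda`, the fitted weight, and the test error. Keeping the test set out of the selection step is what makes the final number an estimate of performance on unseen data.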
Across the overall life cycle, many actors (for example, a data engineer, a data scientist, or an ML engineer) might be involved, performing each step independently or collaboratively. Supervised learning covers classification and regression tasks. Classification predicts which class a data point belongs to, that is, a discrete value: the label of the class attribute. Regression, on the other hand, predicts a continuous value, that is, a numeric value for the target attribute.
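To make the difference between a discrete and a continuous prediction concrete, here is a minimal sketch that uses k-nearest neighbours for both tasks. The choice of k-NN is an assumption made purely for illustration (the text does not prescribe an algorithm), and `KnnSketch`, `classify`, and `regress` are hypothetical names. The neighbour search is shared; only the final step differs: a majority vote over labels for classification, and an average over numeric targets for regression.

```scala
object KnnSketch {
  type Features = Vector[Double]

  // Euclidean distance between two feature vectors of equal length.
  private def distance(a: Features, b: Features): Double =
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)

  // Labels (or targets) of the k training examples closest to the query.
  private def nearest[L](train: Seq[(Features, L)], query: Features, k: Int): Seq[L] =
    train.sortBy { case (f, _) => distance(f, query) }.take(k).map(_._2)

  // Classification: the output is a discrete class label (majority vote).
  def classify(train: Seq[(Features, String)], query: Features, k: Int): String =
    nearest(train, query, k).groupBy(identity).maxBy(_._2.size)._1

  // Regression: the output is a continuous value (mean of neighbour targets).
  def regress(train: Seq[(Features, Double)], query: Features, k: Int): Double = {
    val ys = nearest(train, query, k)
    ys.sum / ys.size
  }
}
```

For example, `KnnSketch.classify(labelled, Vector(1.5), 3)` returns one of the class names present in `labelled`, whereas `KnnSketch.regress(numeric, Vector(1.5), 2)` returns a `Double` that need not equal any training target.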
In supervised learning, the input dataset used for the learning process is split randomly into three sets, for example, 60% for the training set, 10% for the validation set, and the remaining 30% for the test set.
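The 60/10/30 split can be sketched with the Scala standard library alone. This is a minimal example under stated assumptions: `SplitSketch` and `split` are hypothetical names, the data fits in memory, and a fixed seed is used so the shuffle is reproducible. On a Spark `Dataset`, `randomSplit(Array(0.6, 0.1, 0.3), seed)` plays a similar role, although the resulting sizes are then only approximately proportional to the weights.

```scala
import scala.util.Random

object SplitSketch {
  // Shuffle with a fixed seed, then cut the data into training, validation,
  // and test sets. The test set receives whatever remains after the first two.
  def split[A](
      data: Seq[A],
      trainFrac: Double,
      validFrac: Double,
      seed: Long): (Seq[A], Seq[A], Seq[A]) = {
    require(trainFrac >= 0 && validFrac >= 0 && trainFrac + validFrac <= 1.0,
      "fractions must be non-negative and sum to at most 1")
    val shuffled: Seq[A] = new Random(seed).shuffle(data)
    val nTrain = math.round(data.size * trainFrac).toInt
    val nValid = math.round(data.size * validFrac).toInt
    val (train, rest) = shuffled.splitAt(nTrain)
    val (valid, test) = rest.splitAt(nValid)
    (train, valid, test)
  }
}
```

With 100 examples, `SplitSketch.split(data, 0.6, 0.1, seed = 42L)` yields 60 training, 10 validation, and 30 test examples, and every example lands in exactly one of the three sets.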