
Supervised learning algorithms

Supervised algorithms rely on human knowledge to complete their tasks. Let's say we have a dataset related to loan repayment that contains several demographic indicators, as well as whether a loan was paid back or not:

The Paid column, which tells us whether a loan was paid back, is called the target: it is what we would like to predict. The data that describes the applicant's background is known as the features of the dataset. In supervised learning, algorithms learn to predict the target from the features; in other words, they learn which indicators give a high probability that an applicant will pay back a loan. Mathematically, this process looks as follows:

$$y = f(x) + \epsilon$$

Here, we are saying that our label $y$ is a function of the input features $x$, plus some amount of error $\epsilon$ that is caused naturally by the dataset. We know that a certain set of features will likely produce a certain outcome. In supervised learning, we set up an algorithm to learn the function that correctly maps a set of features to an outcome.
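To make the idea of a label being a function of the features plus some error more concrete, here is a minimal sketch using only the Python standard library. The function `true_f`, the noise level, and the choice of ordinary least squares as the learning algorithm are all assumptions made for illustration; they are not taken from the loan example.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def true_f(x):
    # Hypothetical "true" relationship, chosen only for illustration
    return 2.0 * x + 1.0

# Generate features and labels: label = true_f(feature) + random error
xs = [i / 10 for i in range(100)]
ys = [true_f(x) + random.gauss(0.0, 0.5) for x in xs]

# Learn an approximation of true_f with closed-form ordinary least squares
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    / sum((x - mean_x) ** 2 for x in xs)
)
intercept = mean_y - slope * mean_x

def learned_f(x):
    # The mapping from feature to outcome that the algorithm has learned
    return slope * x + intercept

print(f"learned slope={slope:.2f}, intercept={intercept:.2f}")
```

The learned slope and intercept should land close to, but not exactly on, the values used in `true_f`. The gap between them comes from the error term added to each label.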

To illustrate how supervised learning works, we are going to use a famous toy dataset in the machine learning field, the Iris dataset. It has four features: Sepal Length, Sepal Width, Petal Length, and Petal Width. In this dataset, our target variable (sometimes called a label) is Name. The dataset is available in the GitHub repository that accompanies this chapter:

import pandas as pd
data = pd.read_csv("iris.csv")
data.head()

The preceding code generates the following output:

Now that we have our data ready to go, let's jump into some supervised learning!
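A common first step before training is to separate the features from the target. The sketch below does this on a few hand-typed Iris-style rows so that it runs without `iris.csv`. The column names used here are assumptions, so check `data.columns` on the real file before adapting it.

```python
import pandas as pd

# A few hand-typed Iris-style rows so the sketch runs without iris.csv.
# The column names are assumptions; the real file may label them differently.
data = pd.DataFrame({
    "SepalLength": [5.1, 4.9, 7.0, 6.3],
    "SepalWidth": [3.5, 3.0, 3.2, 3.3],
    "PetalLength": [1.4, 1.4, 4.7, 6.0],
    "PetalWidth": [0.2, 0.2, 1.4, 2.5],
    "Name": ["Iris-setosa", "Iris-setosa", "Iris-versicolor", "Iris-virginica"],
})

# Features: every column except the target
X = data.drop(columns=["Name"])

# Target: the column we want to predict
y = data["Name"]

print(X.shape)
print(y.unique())
```

Keeping the features and the target in separate objects makes it harder to leak the answer into the model's inputs by accident.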
