Example of multilinear regression - step-by-step methodology of model building

In this section, we show the approach that industry practitioners follow when modeling with linear regression, using sample wine quality data. The statsmodels.api package is used for the multiple linear regression demonstration instead of scikit-learn, because the former provides diagnostics on individual variables, whereas the latter mainly provides final accuracy metrics:

>>> import numpy as np 
>>> import pandas as pd 
>>> import statsmodels.api as sm 
>>> import matplotlib.pyplot as plt 
>>> import seaborn as sns 
>>> from sklearn.model_selection import train_test_split     
>>> from sklearn.metrics import r2_score 
 
>>> wine_quality = pd.read_csv("winequality-red.csv",sep=';')   
>>> # Replace white space in column names with underscores for easier handling
>>> wine_quality.rename(columns=lambda x: x.replace(" ", "_"), inplace=True)
>>> eda_colnms = ['volatile_acidity', 'chlorides', 'sulphates', 'alcohol', 'quality']
>>> # Plots - pair plots
>>> sns.set(style='whitegrid', context='notebook')

Pair plots for the five sample variables are shown as follows; however, we encourage you to try other combinations to inspect visually how the remaining variables relate to one another:

>>> # 'height' replaces the older 'size' argument in seaborn 0.9 and later
>>> sns.pairplot(wine_quality[eda_colnms], height=2.5, x_vars=eda_colnms, y_vars=eda_colnms)
>>> plt.show()

In addition to visual plots, correlation coefficients are calculated to quantify the strength of each relationship numerically; a correlation matrix like this is used to drop variables at the initial stage when there are many of them to start with:

>>> # Correlation coefficients 
>>> corr_mat = np.corrcoef(wine_quality[eda_colnms].values.T) 
>>> sns.set(font_scale=1) 
>>> full_mat = sns.heatmap(corr_mat, cbar=True, annot=True, square=True, fmt='.2f',annot_kws={'size': 15}, yticklabels=eda_colnms, xticklabels=eda_colnms) 
>>> plt.show() 
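The idea of using correlations to drop variables early can be sketched in code. The following is a minimal illustration, not the procedure applied to the wine data in this section: the helper name screen_by_correlation, the 0.8 cutoff, and the synthetic data frame are all assumptions made for the example. It ranks predictors by the absolute value of their correlation with the target and lists predictor pairs that are strongly correlated with each other, which are candidates for keeping only one of the two.

```python
import numpy as np
import pandas as pd


def screen_by_correlation(df, target, threshold=0.8):
    """Hypothetical helper: rank predictors by |correlation| with the target
    and list predictor pairs whose |correlation| meets the threshold."""
    corr = df.corr()
    # Correlation of each predictor with the target, strongest first
    target_corr = corr[target].drop(target)
    order = target_corr.abs().sort_values(ascending=False).index
    target_corr = target_corr.reindex(order)

    # Predictor pairs that are highly correlated with each other
    predictors = [c for c in df.columns if c != target]
    pairs = []
    for i, a in enumerate(predictors):
        for b in predictors[i + 1:]:
            r = float(corr.loc[a, b])
            if abs(r) >= threshold:
                pairs.append((a, b, round(r, 2)))
    return target_corr, pairs


# Synthetic stand-in data (assumption): NOT the wine dataset.
# 'density' is deliberately constructed to be collinear with 'alcohol'.
rng = np.random.default_rng(0)
n = 500
alcohol = rng.normal(10, 1, n)
density = -0.9 * alcohol + rng.normal(0, 0.2, n)
sulphates = rng.normal(0.6, 0.1, n)
quality = 0.5 * alcohol + 2.0 * sulphates + rng.normal(0, 0.5, n)
demo = pd.DataFrame({"alcohol": alcohol, "density": density,
                     "sulphates": sulphates, "quality": quality})

target_corr, pairs = screen_by_correlation(demo, "quality", threshold=0.8)
print(target_corr)
print(pairs)
```

On the real data, the same helper could be called on wine_quality with "quality" as the target. The cutoff is a judgment call, and a flagged pair is only a prompt for closer inspection rather than an automatic reason to drop a variable.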