官术网_书友最值得收藏!

Example of multilinear regression - step-by-step methodology of model building

In this section, we actually show the approach followed by industry experts while modeling using linear regression with sample wine data. The statmodels.api package has been used for multiple linear regression demonstration purposes instead of scikit-learn, due to the fact that the former provides diagnostics on variables, whereas the latter only provides final accuracy, and so on:

>>> import numpy as np 
>>> import pandas as pd 
>>> import statsmodels.api as sm 
>>> import matplotlib.pyplot as plt 
>>> import seaborn as sns 
>>> from sklearn.model_selection import train_test_split     
>>> from sklearn.metrics import r2_score 
 
>>> wine_quality = pd.read_csv("winequality-red.csv",sep=';')   
# Step for converting white space in columns to _ value for better handling  
>>> wine_quality.rename(columns=lambda x: x.replace(" ", "_"), inplace=True) 
>>> eda_colnms = [ 'volatile_acidity',  'chlorides', 'sulphates', 'alcohol','quality'] 
# Plots - pair plots 
>>> sns.set(style='whitegrid',context = 'notebook') 

Pair plots for sample five variables are shown as follows; however, we encourage you to try various combinations to check various relationships visually between the various other variables:

>>> sns.pairplot(wine_quality[eda_colnms],size = 2.5,x_vars= eda_colnms, y_vars= eda_colnms) 
>>> plt.show() 

In addition to visual plots, correlation coefficients are calculated to show the level of correlation in numeric terminology; these charts are used to drop variables in the initial stage, if there are many of them to start with:

>>> # Correlation coefficients 
>>> corr_mat = np.corrcoef(wine_quality[eda_colnms].values.T) 
>>> sns.set(font_scale=1) 
>>> full_mat = sns.heatmap(corr_mat, cbar=True, annot=True, square=True, fmt='.2f',annot_kws={'size': 15}, yticklabels=eda_colnms, xticklabels=eda_colnms) 
>>> plt.show() 
主站蜘蛛池模板: 瑞金市| 西安市| 广西| 华安县| 法库县| 金华市| 祁连县| 台州市| 赫章县| 凌云县| 岳西县| 昌宁县| 综艺| 繁昌县| 杭锦旗| 木里| 宁陕县| 临夏县| 三亚市| 上杭县| 永宁县| 团风县| 渭南市| 碌曲县| 库伦旗| 万载县| 三亚市| 古蔺县| 灵山县| 沁源县| 福安市| 保康县| 双桥区| 金阳县| 宝坻区| 平泉县| 密山市| 沁阳市| 临澧县| 黄石市| 灯塔市|