
Example of multilinear regression - step-by-step methodology of model building

In this section, we walk through the approach that industry practitioners follow when modeling with linear regression, using sample wine quality data. The statsmodels.api package is used for the multiple linear regression demonstration instead of scikit-learn because the former reports diagnostics on individual variables (such as coefficient estimates and p-values), whereas the latter mainly provides fitted predictions and overall accuracy metrics:

>>> import numpy as np 
>>> import pandas as pd 
>>> import statsmodels.api as sm 
>>> import matplotlib.pyplot as plt 
>>> import seaborn as sns 
>>> from sklearn.model_selection import train_test_split     
>>> from sklearn.metrics import r2_score 
 
>>> wine_quality = pd.read_csv("winequality-red.csv", sep=';')
>>> # Replace white space in column names with _ for easier handling
>>> wine_quality.rename(columns=lambda x: x.replace(" ", "_"), inplace=True)
>>> eda_colnms = ['volatile_acidity', 'chlorides', 'sulphates', 'alcohol', 'quality']
>>> # Plots - pair plots
>>> sns.set(style='whitegrid', context='notebook')

Pair plots for a sample of five variables are shown as follows; however, we encourage you to try other combinations to visually check the relationships between the remaining variables:

>>> # height replaces the size argument used by seaborn versions before 0.9
>>> sns.pairplot(wine_quality[eda_colnms], height=2.5, x_vars=eda_colnms, y_vars=eda_colnms)
>>> plt.show() 

In addition to visual plots, correlation coefficients are calculated to quantify the strength of the linear relationships numerically; such a correlation matrix is used to drop variables at the initial stage when there are many of them to start with:

>>> # Correlation coefficients 
>>> corr_mat = np.corrcoef(wine_quality[eda_colnms].values.T) 
>>> sns.set(font_scale=1) 
>>> full_mat = sns.heatmap(corr_mat, cbar=True, annot=True, square=True, fmt='.2f',annot_kws={'size': 15}, yticklabels=eda_colnms, xticklabels=eda_colnms) 
>>> plt.show() 
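One way the correlation matrix could be used to screen variables is sketched below. This is only an illustration under stated assumptions: the data is synthetic, the column names a, b, and c are hypothetical, and the 0.8 cut-off is an arbitrary choice rather than a rule from this section:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; column names "a", "b", "c" are hypothetical.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
df = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.1, size=500),  # nearly a copy of "a"
    "c": rng.normal(size=500),                 # independent of both
})

# Pearson correlation; matches np.corrcoef applied to the transposed values.
corr = df.corr()

# Keep only the upper triangle so each pair of variables is listed once.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna()

threshold = 0.8  # arbitrary cut-off chosen for illustration
high_pairs = pairs[pairs.abs() > threshold]
print(high_pairs)
```

Each pair listed in high_pairs is a candidate for dropping one of its two variables, since both carry largely the same linear information.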