
Data analysis and visualization

To understand the underlying structure of the data and the relationship between the features and the response, we can use different types of visualization. For the advertising data, we are going to use a scatterplot to examine how each feature relates to the response.

To create these visualizations, you can use Matplotlib (https://matplotlib.org/), a Python 2D plotting library. To get Matplotlib, follow the installation instructions at https://matplotlib.org/users/installing.html.
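Once the installation has finished, a quick way to confirm that Matplotlib is importable is to print its version. This is only a minimal sanity check, assuming Matplotlib was installed into the same Python environment that runs your notebook:

```python
import matplotlib

# Print the installed Matplotlib version; an ImportError on the line above
# would mean the library is not available in this environment.
installed_version = matplotlib.__version__
print("Matplotlib version:", installed_version)
```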

Let's import the visualization library Matplotlib:

import matplotlib.pyplot as plt

# The next line lets plots render inline, directly in the notebook,
# instead of popping up in a separate window
%matplotlib inline

Now, let's use a scatterplot to visualize the relationship between the advertising data features and response variable:

# Create a 1x3 grid of axes that share the y-axis (sales)
fig, axs = plt.subplots(1, 3, sharey=True, figsize=(16, 8))

# Adding the scatterplots to the grid, one per advertising channel
advertising_data.plot(kind='scatter', x='TV', y='sales', ax=axs[0])
advertising_data.plot(kind='scatter', x='radio', y='sales', ax=axs[1])
advertising_data.plot(kind='scatter', x='newspaper', y='sales', ax=axs[2])

Output:

Figure 1: Scatter plot for understanding the relationship between the advertising data features and the response variable

Now, we need to see how the ads help increase sales. Worthwhile questions to ask include: is there a relationship between the ads and sales, which kind of ad contributes most to sales, and what is the approximate effect of each type of ad on sales? We will try to answer these questions using a simple linear model.
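Before fitting a model, one rough way to get a first impression of these questions is to look at the Pearson correlation between each ad channel and sales. The sketch below is only illustrative: sample_data is a hypothetical stand-in with made-up values (not the real advertising data), and correlation_with_sales is a helper name introduced here for the example. Keep in mind that correlation only measures the strength of a linear association; it does not estimate the effect of each ad type the way a linear model does.

```python
import pandas as pd

# Hypothetical stand-in for advertising_data: the values are made up
# purely for illustration and are NOT taken from the real dataset.
sample_data = pd.DataFrame({
    'TV': [10.0, 20.0, 30.0, 40.0, 50.0],
    'radio': [5.0, 3.0, 8.0, 2.0, 7.0],
    'newspaper': [9.0, 1.0, 4.0, 6.0, 2.0],
    'sales': [3.0, 5.0, 7.0, 9.0, 11.0],
})


def correlation_with_sales(df, response='sales'):
    """Return the Pearson correlation of each feature with the response,
    sorted from the most positive to the most negative."""
    correlations = df.corr()[response].drop(response)
    return correlations.sort_values(ascending=False)


feature_correlations = correlation_with_sales(sample_data)
print(feature_correlations)
```

With the real data loaded, the same helper could be called as correlation_with_sales(advertising_data), assuming the response column is named sales as in the plotting code above.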
