Data analysis and visualization

To understand the underlying form of the data, the relationship between the features and the response, and other insights, we can use different types of visualization. To examine the relationship between the advertising data features and the response, we are going to use a scatterplot.

To create these visualizations, you can use Matplotlib (https://matplotlib.org/), a Python 2D plotting library. To get Matplotlib, follow the installation instructions at https://matplotlib.org/users/installing.html.

Let's import the visualization library Matplotlib:

import matplotlib.pyplot as plt

# The next line makes plots render inline, directly in the notebook,
# without popping up in a separate window
%matplotlib inline

Now, let's use a scatterplot to visualize the relationship between the advertising data features and response variable:

# Create a 1x3 grid of axes that share the y-axis (sales)
fig, axs = plt.subplots(1, 3, sharey=True, figsize=(16, 8))

# Adding the scatterplots to the grid, one feature per panel
advertising_data.plot(kind='scatter', x='TV', y='sales', ax=axs[0])
advertising_data.plot(kind='scatter', x='radio', y='sales', ax=axs[1])
advertising_data.plot(kind='scatter', x='newspaper', y='sales', ax=axs[2])

Output:

Figure 1: Scatter plot for understanding the relationship between the advertising data features and the response variable
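
For readers running the code outside a notebook, or without the advertising dataset loaded, the same three-panel layout can be sketched as a standalone script. This is only an illustration under stated assumptions: the small DataFrame toy_data is invented and stands in for advertising_data, the helper plot_feature_scatter is a hypothetical name, and the output file name is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no plot window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for advertising_data: these values are invented
# for illustration and are NOT the real advertising dataset.
toy_data = pd.DataFrame({
    "TV": [50, 100, 150, 200, 250],
    "radio": [10, 40, 20, 30, 5],
    "newspaper": [30, 10, 60, 20, 45],
    "sales": [8, 12, 14, 18, 20],
})


def plot_feature_scatter(data, features, response="sales"):
    """Draw one scatterplot per feature against the response, sharing the y-axis.

    Expects at least two features, so that plt.subplots returns an array of axes.
    """
    fig, axs = plt.subplots(1, len(features), sharey=True, figsize=(16, 8))
    for ax, feature in zip(axs, features):
        data.plot(kind="scatter", x=feature, y=response, ax=ax)
    return fig, axs


fig, axs = plot_feature_scatter(toy_data, ["TV", "radio", "newspaper"])

# Outside a notebook there is no inline display, so write the figure to disk.
fig.savefig("advertising_scatter.png")
```

Sharing the y-axis keeps the sales scale identical across the three panels, which makes it easier to compare how tightly each feature tracks the response.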

Now we need to see how the ads help increase sales, so we should ask ourselves a few questions. Worthwhile questions include: What is the relationship between the ads and sales? Which kind of ad contributes most to sales? What is the approximate effect of each type of ad on sales? We will try to answer these questions using a simple linear model.
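
Before fitting any model, one rough way to put numbers on the first two questions is the Pearson correlation between each feature and sales. The sketch below is not the linear model itself: correlation only measures the strength of a linear association, not the per-unit effect a model would estimate. The DataFrame toy_data is invented for illustration (built so that sales rise with TV) and stands in for advertising_data.

```python
import pandas as pd

# Hypothetical stand-in for advertising_data: invented values, not the real dataset.
toy_data = pd.DataFrame({
    "TV": [50, 100, 150, 200, 250],
    "radio": [10, 40, 20, 30, 5],
    "newspaper": [30, 10, 60, 20, 45],
    "sales": [8, 12, 14, 18, 20],
})

# Pearson correlation of every feature with the response column.
correlations = toy_data.corr()["sales"].drop("sales")

# The feature whose correlation with sales is largest in absolute value.
strongest = correlations.abs().idxmax()

print(correlations.sort_values(ascending=False))
print("Most strongly correlated feature:", strongest)
```

On the real data, the same two lines applied to advertising_data would give a first ranking of the ad types, which the linear model can then refine into estimated effects.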
