Book title: Machine Learning for Developers
Author: Rodolfo Bonnin
Updated: 2021-07-02 15:46:46

Mean

This is one of the most intuitive and most frequently used concepts in statistics. Given a set of numbers, the mean of that set is the sum of all the elements divided by the number of elements in the set.
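As a quick illustration of that definition, here is a minimal sketch using only built-in Python functions. The three values are hypothetical and were chosen only so the arithmetic is easy to follow by hand (4 + 8 + 6 = 18, and 18 / 3 = 6):

```python
# Hypothetical sample values, chosen only for illustration
values = [4.0, 8.0, 6.0]

# The mean: sum of all the elements divided by the number of elements
mean_value = sum(values) / len(values)

print(mean_value)  # prints 6.0
```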

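The formula that represents the mean is as follows:

    \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i

where n is the number of elements in the set and x_i is the i-th element.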

Although this is a very simple concept, we will write a Python code sample that creates a sample set, represents it as a line plot, and marks the mean of the whole set as a horizontal line, which should sit at the balance point of the samples. It will serve as an introduction to Python syntax, and also as a way of experimenting with Jupyter notebooks:

    import matplotlib.pyplot as plt  # Import the plotting library

    def mean(sampleset):  # Definition header for the mean function
        total = 0
        for element in sampleset:
            total = total + element
        return total / len(sampleset)

    myset = [2., 10., 3., 6., 4., 6., 10.]  # We create the dataset
    mymean = mean(myset)  # Call the mean function
    plt.plot(myset)  # Plot the dataset
    plt.plot([mymean] * len(myset))  # Plot one point per sample, all located at the mean
    plt.show()  # Display the figure (optional inside a Jupyter notebook)

This program plots the dataset elements in order, like a time series, and then draws a horizontal line at the height of the mean.
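To see the actual number behind that line, the hand-written loop can be cross-checked against Python's standard library. The following is only a sketch: it repeats the sample set from the listing above, and it assumes Python 3.8 or later, where `statistics.fmean` is available. The names `manual_mean` and `library_mean` are introduced here for illustration.

```python
import math
import statistics

# Same sample set as in the listing above
myset = [2., 10., 3., 6., 4., 6., 10.]

# Hand-written mean: accumulate the total, then divide by the element count
total = 0
for element in myset:
    total = total + element
manual_mean = total / len(myset)

# Standard-library mean (floating-point version, Python 3.8+)
library_mean = statistics.fmean(myset)

print(round(library_mean, 3))                   # prints 5.857
print(math.isclose(manual_mean, library_mean))  # prints True
```

The elements add up to 41, so the mean is 41 / 7, or roughly 5.857, which is the height at which the horizontal line is drawn.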

As the following graph shows, the mean is a succinct (one value) way of describing the tendency of a sample set:

In this first example, we worked with a very homogeneous sample set, so the mean is very informative regarding its values. But let's try the same sample with a very dispersed sample set (you are encouraged to play with the values too):
