官术网_书友最值得收藏!

Mode

Finally, we'll talk about mode. This doesn't really come up too often in practice, but you can't talk about mean and median without talking about mode. All mode means, is the most common value in a dataset.

Let's go back to my example of the number of kids in each house.

0, 2, 3, 2, 1, 0, 0, 2, 0

How many of each value are there:

0: 4, 1: 1, 2: 3, 3: 1

The MODE is 0

If I just look at what number occurs most frequently, it turns out to be 0, and the mode therefore of this data is 0. The most common number of children in a given house in this neighborhood is no kids, and that's all that means.

Now this is actually a pretty good example of continuous versus discrete data, because this only really works with discrete data. If I have a continuous range of data then I can't really talk about the most common value that occurs, unless I quantize that somehow into discrete values. So we've already run into one example here where the data type matters.

Mode is usually only relevant to discrete numerical data, and not to continuous data.

A lot of real-world data tends to be continuous, so maybe that's why I don't hear too much about mode, but we see it here for completeness.

There you have it: mean, median, and mode in a nutshell. Kind of the most basic statistics stuff you can possibly do, but I hope you gained a little refresher there in the importance of choosing between median and mean. They can tell very different stories, and yet people tend to equate them in their heads, so make sure you're being a responsible data scientist and representing data in a way that conveys the meaning you're trying to represent. If you're trying to display a typical value, often the median is a better choice than the mean because of outliers, so remember that. Let's move on.

主站蜘蛛池模板: 灵石县| 荥阳市| 灵川县| 农安县| 濮阳县| 仲巴县| 长子县| 山西省| 泰顺县| 吉林市| 泰兴市| 伊金霍洛旗| 连云港市| 临潭县| 伊川县| 宜阳县| 石狮市| 无极县| 中西区| 扎赉特旗| 普兰店市| 江口县| 盘山县| 陈巴尔虎旗| 平定县| 报价| 天台县| 龙泉市| 三台县| 平武县| 新泰市| 安远县| 阿合奇县| 彭山县| 张家港市| 大港区| 常德市| 柳林县| 深州市| 金坛市| 元氏县|