
Measures of dispersion

While measures of central tendency indicate where the data is centered, measures of dispersion describe how widely the data is spread around that center. Variance and standard deviation are the most popular measures of dispersion; the standard deviation is the square root of the variance. Both values are easy to get with R:

sd(big_sample, na.rm = TRUE)
# outputs [1] 5.01836
var(big_sample, na.rm = TRUE)
# outputs [1] 25.18394
Keep in mind that the values we've computed so far are estimates of the (true) population parameters, not the parameters themselves.
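
The relationship between the two measures can be checked directly. The following is a minimal sketch, assuming a simulated vector x (a hypothetical stand-in, not the big_sample object used above) that contains one missing value. It also recomputes the variance by hand with the n - 1 denominator, which is what makes var() a sample estimate rather than the population value.

```r
# Hypothetical sample with one missing value; this is NOT the book's big_sample
set.seed(10)
x <- c(rnorm(1000, mean = 10, sd = 5), NA)

s <- sd(x, na.rm = TRUE)   # estimated standard deviation
v <- var(x, na.rm = TRUE)  # estimated variance

# The standard deviation is the square root of the variance
all.equal(s, sqrt(v))

# var() divides by n - 1 (sample variance), not by n
x_obs <- x[!is.na(x)]      # drop the missing value by hand
manual_var <- sum((x_obs - mean(x_obs))^2) / (length(x_obs) - 1)
all.equal(v, manual_var)
```

Both all.equal() calls should return TRUE. The exact values of s and v depend on the simulated draw, so they are not shown here.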

The sd() function estimates the standard deviation, while var() estimates the variance. In most cases, we find ourselves with a DataFrame full of variables we want to analyze, and calling these functions one variable at a time quickly becomes tedious. One way around this is to use a function that summarizes the whole dataset at once. Such functions usually work equally well with vectors and DataFrame objects. The next section introduces a couple of them.
