
Cluster analysis

Cluster analysis is a multivariate analysis technique that groups statistical units so as to minimize the logical distance within each group and maximize the logical distance between groups. The logical distance is quantified by a measure of similarity or dissimilarity defined between the statistical units.
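To make the idea of a distance between units concrete, here is a minimal sketch that uses the toolbox function pdist. The three sample points are made up for illustration, and Euclidean and city-block are only two of the dissimilarity measures one might choose.

```matlab
% Three made-up observations (rows) with two variables each
X = [0 0; 3 4; 6 8];

% Pairwise dissimilarities, returned in the order (1,2), (1,3), (2,3)
dEuc  = pdist(X);                % Euclidean distance (the default)
dCity = pdist(X, 'cityblock');   % city-block (Manhattan) distance

% squareform turns the vector into a symmetric distance matrix
DEuc = squareform(dEuc);
disp(DEuc);
```

A smaller entry in the matrix means the two units are more similar under the chosen measure, so changing the measure can change which units end up grouped together.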

The Statistics and Machine Learning Toolbox provides several algorithms to carry out cluster analysis. Available algorithms include:

  • k-means
  • k-medoids
  • Hierarchical clustering
  • Gaussian mixture models (GMM)
  • Hidden Markov models (HMM)
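As a rough sketch of how two of these algorithms are called, the following applies kmeans and hierarchical clustering (linkage followed by cluster) to the same data. The synthetic data, the random seed, the Ward linkage, and the choice of two clusters are all assumptions made for illustration.

```matlab
rng(1);                                   % fix the seed so the run is repeatable

% Synthetic data (assumed): two well-separated 2-D groups of 50 points each
X = [randn(50,2); randn(50,2) + 6];

% k-means: partition the rows of X into 2 clusters; C holds the centroids
[idxKm, C] = kmeans(X, 2);

% Hierarchical clustering: build the binary tree, then cut it into 2 clusters
Z = linkage(X, 'ward');
idxHc = cluster(Z, 'MaxClust', 2);

% Plot the k-means assignment together with the centroids
gscatter(X(:,1), X(:,2), idxKm);
hold on
plot(C(:,1), C(:,2), 'kx', 'MarkerSize', 12, 'LineWidth', 2);
hold off
```

The cluster labels themselves are arbitrary, so the two methods may number the same groups differently even when they agree on the partition.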

When the number of clusters is unknown, we can use cluster evaluation techniques to estimate the number of clusters present in the data according to a specified criterion.
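One way to do this in the toolbox is the evalclusters function. The sketch below assumes synthetic data with three groups and uses the Calinski-Harabasz criterion over candidate values 1 to 6; other criteria, such as 'silhouette' or 'gap', could be substituted.

```matlab
rng(1);                                   % fix the seed so the run is repeatable

% Synthetic data (assumed): three 2-D groups of 50 points each
X = [randn(50,2); ...
     randn(50,2) + 6; ...
     [randn(50,1) + 6, randn(50,1) - 6]];

% Evaluate k-means solutions for k = 1..6 with the Calinski-Harabasz criterion
eva = evalclusters(X, 'kmeans', 'CalinskiHarabasz', 'KList', 1:6);

% Suggested number of clusters and the criterion value for each candidate k
disp(eva.OptimalK);
plot(eva);
```

The suggested value depends on both the criterion and the clustering algorithm, so it is worth comparing more than one criterion before settling on a number.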

A typical cluster analysis result is shown in the following figure:

Figure 1.19: A cluster analysis example

In addition, the Statistics and Machine Learning Toolbox lets us view clusters by creating a dendrogram plot that displays a hierarchical binary cluster tree. We can then optimize the leaf order to maximize the sum of the similarities between adjacent leaves. Finally, for grouped data with multiple measurements for each group, we can create a dendrogram plot based on the group means computed by a multivariate analysis of variance.
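These three steps can be sketched with the toolbox functions dendrogram, optimalleaforder, and manovacluster. The random data, the average linkage, and the use of the fisheriris sample dataset that ships with the toolbox are assumptions chosen for illustration.

```matlab
rng(1);                                   % fix the seed so the run is repeatable

% Assumed data: 10 random 2-D observations
X = rand(10, 2);
D = pdist(X);                             % pairwise Euclidean distances
Z = linkage(D, 'average');                % hierarchical binary cluster tree

% Leaf order that maximizes the similarity between adjacent leaves
leafOrder = optimalleaforder(Z, D);

figure
subplot(2,1,1); dendrogram(Z);                       title('Default leaf order');
subplot(2,1,2); dendrogram(Z, 'Reorder', leafOrder); title('Optimal leaf order');

% Grouped data: dendrogram of group means from a one-way MANOVA
load fisheriris                           % provides meas (150x4) and species
[~, ~, stats] = manova1(meas, species);
figure
manovacluster(stats);
```

Reordering the leaves does not change the tree itself, only the left-to-right arrangement in which it is drawn.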
