
Cluster analysis

Cluster analysis is a multivariate analysis technique that groups statistical units so as to minimize the logical distance within each group and maximize the logical distance between groups. The logical distance is quantified by measures of similarity or dissimilarity between the statistical units.
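
To make the idea of a dissimilarity measure concrete, here is a minimal sketch that computes pairwise Euclidean distances with `pdist` and `squareform`. The four observations in `X` are made-up values chosen only for illustration, and Euclidean distance is just one of the available measures.

```matlab
% Four 2-D observations (illustrative values): two near the origin,
% two near the point (5, 5).
X = [0 0; 0 1; 5 5; 5 6];

% Pairwise Euclidean distances, returned as a condensed row vector.
d = pdist(X, 'euclidean');

% Expand the vector into a symmetric 4-by-4 distance matrix.
D = squareform(d);
disp(D)
```

Observations 1 and 2 (and likewise 3 and 4) are separated by a small distance, while the distances across the two pairs are much larger, which is the pattern a clustering algorithm tries to exploit.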

The Statistics and Machine Learning Toolbox provides several algorithms to carry out cluster analysis. Available algorithms include:

  • k-means
  • k-medoids
  • Hierarchical clustering
  • Gaussian mixture models (GMM)
  • Hidden Markov models (HMM)
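
As a rough sketch of how the first four of these algorithms might be called, the following example clusters the same synthetic data with `kmeans`, `kmedoids`, `linkage` plus `cluster`, and `fitgmdist`. The data, the variable names, the Ward linkage, and the choice of two clusters are assumptions made for illustration. Hidden Markov models are left out here because the toolbox's HMM functions work on sequences rather than on a matrix of observations.

```matlab
rng(1);  % fix the random seed so the run is repeatable

% Synthetic data (an assumption for illustration): two 2-D Gaussian blobs.
X = [randn(50, 2); randn(50, 2) + 6];
k = 2;   % number of clusters, assumed known in this sketch

% k-means: each cluster is represented by the mean of its members.
idxKmeans = kmeans(X, k);

% k-medoids: each cluster is represented by one of its own observations.
idxKmedoids = kmedoids(X, k);

% Hierarchical clustering: build the tree, then cut it into k clusters.
Z = linkage(X, 'ward');
idxHier = cluster(Z, 'maxclust', k);

% Gaussian mixture model: fit k components, then assign each point
% to the component with the highest posterior probability.
gm = fitgmdist(X, k);
idxGmm = cluster(gm, X);
```

Each `idx...` vector holds one cluster label per row of `X`. The numeric labels are arbitrary, so two methods can agree on the grouping while using different label values.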

When the number of clusters is unknown, we can use cluster evaluation techniques to determine the number of clusters present in the data based on a specified metric.
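
One way to do this is with `evalclusters`, sketched below under some assumptions: the data are three synthetic blobs, the clustering algorithm is k-means, the metric is the Calinski-Harabasz criterion, and the candidate range of 1 to 6 clusters is arbitrary.

```matlab
rng(1);  % fix the random seed so the run is repeatable

% Synthetic data (an assumption for illustration): three 2-D blobs.
X = [randn(50, 2);
     randn(50, 2) + 6;
     [randn(50, 1) + 6, randn(50, 1)]];

% Evaluate k-means solutions for 1 to 6 clusters using the
% Calinski-Harabasz criterion.
eva = evalclusters(X, 'kmeans', 'CalinskiHarabasz', 'KList', 1:6);

% Number of clusters that scores best under the chosen criterion.
disp(eva.OptimalK)
```

Other criteria, such as the silhouette or gap statistic, can be substituted for `'CalinskiHarabasz'`, and different criteria may suggest different numbers of clusters for the same data.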

A typical cluster analysis result is shown in the following figure:

Figure 1.19: A cluster analysis example

In addition, the Statistics and Machine Learning Toolbox lets us view clusters by creating a dendrogram plot that displays a hierarchical binary cluster tree. We can then optimize the leaf order to maximize the sum of the similarities between adjacent leaves. Finally, for grouped data with multiple measurements for each group, we can create a dendrogram plot based on the group means computed using a multivariate analysis of variance.
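
These three steps can be sketched with `dendrogram`, `optimalleaforder`, and `manova1` followed by `manovacluster`. The data sets, the average linkage, and the group labels below are assumptions chosen only to keep the example small.

```matlab
rng(1);  % fix the random seed so the run is repeatable

% Synthetic data (an assumption for illustration): 20 observations, 2 blobs.
X = [randn(10, 2); randn(10, 2) + 4];

% Build the hierarchical binary cluster tree from pairwise distances.
D = pdist(X);
Z = linkage(D, 'average');

% Reorder the leaves so that adjacent leaves are as similar as possible.
leafOrder = optimalleaforder(Z, D);

% Plot the tree using the optimized leaf order.
figure;
dendrogram(Z, 'Reorder', leafOrder);

% Grouped data: three groups, each with 10 observations of 3 measurements.
Y = [randn(10, 3); randn(10, 3) + 2; randn(10, 3) + 5];
group = [ones(10, 1); 2 * ones(10, 1); 3 * ones(10, 1)];

% One-way multivariate analysis of variance, then a dendrogram
% of the group means.
[~, ~, stats] = manova1(Y, group);
figure;
manovacluster(stats);
```

Note that `dendrogram` displays at most 30 leaf nodes by default, so the 20 observations used here each appear as their own leaf.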
