K-means

K-means is a clustering algorithm that groups the elements of a dataset into k distinct clusters (hence the k in the name). Here is how it works:

  1. Choose k random points, called centroids, from the feature space, which will represent the center of each of the k clusters.
  2. Assign each sample of the dataset (that is, each point in the feature space) to the cluster with the closest centroid.
  3. For each cluster, recompute the centroid by taking the mean of all the points in the cluster.
  4. With the new centroids, repeat steps 2 and 3 until a stopping criterion is met (for example, the cluster assignments no longer change).
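
The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation, and it makes some assumptions that the text does not: the initial centroids are drawn from the samples themselves, the distance is Euclidean, the algorithm stops when the assignments no longer change, and a cluster that receives no points simply keeps its old centroid. The names `kmeans`, `max_iters`, and `seed` are illustrative:

```python
import math
import random


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster a list of coordinate tuples into k groups.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid assigned to points[i].
    """
    rng = random.Random(seed)
    # Step 1: choose k random starting centroids (assumption: we pick
    # them among the samples instead of anywhere in the feature space).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Step 2: assign every sample to the closest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 4: stop once the assignments are stable.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each centroid to the mean of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels
```

For example, `kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)` places the two lower-left points in one cluster and the two upper-right points in the other.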

The preceding method is sensitive to the initial choice of random centroids, so it may be a good idea to repeat it with different initial choices. It's also possible for some centroids to end up far from every point in the dataset. No samples are assigned to such centroids, which reduces the number of clusters below k. Finally, it's worth mentioning that if we use k-means with k=3 on the Iris dataset, we may get a different distribution of the samples than the one produced by the decision tree that we introduced earlier. Once more, this highlights how important it is to carefully choose the correct machine learning method for each problem.
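
To illustrate how a badly placed centroid can reduce the number of clusters, the following sketch runs only the assignment step on a handful of made-up points. The coordinates and the helper name `assign` are hypothetical, chosen so that the third centroid is far from every sample:

```python
import math


def assign(points, centroids):
    """Assignment step only: index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


# Made-up 2D samples forming two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10)]
# The third centroid is deliberately far away from every sample.
centroids = [(0, 0), (9, 9), (100, 100)]

labels = assign(points, centroids)
used = set(labels)
print(labels)  # [0, 0, 0, 1, 1]
print(len(used), "of", len(centroids), "clusters are non-empty")  # 2 of 3 ...
```

Because no point is assigned to the third centroid, its mean is undefined and we effectively end up with two clusters instead of three. Restarting from a different random initialization is one common way to deal with this.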

Now let's discuss a practical example that uses k-means clustering. Let's say a pizza-delivery place wants to open four new franchises in a city, and they need to choose the locations for the sites. We can solve this problem with k-means:

  1. Find the locations from which pizza is ordered most often. These will be our data points.
  2. Choose four random points as the initial site locations.
  3. By using k-means clustering, we can identify four locations that minimize the distance from each delivery point to its nearest site:

In the left image, we can see the distribution of points where pizza is delivered most often. The round points in the right image indicate where the new franchises should be located, along with their corresponding delivery areas.
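
The pizza example can be sketched with synthetic data. Everything in this block is an assumption made for illustration: the order locations are randomly generated around four made-up neighborhoods, distances are straight-line (Euclidean) rather than real travel distances, and the run is repeated from ten random starts, keeping the result with the lowest total squared distance, because a single run depends on the initial centroids:

```python
import math
import random


def kmeans(points, k, rng, max_iters=100):
    """One k-means run from a random start; returns (centroids, labels)."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


def total_squared_distance(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Synthetic order locations (x, y in km) around four hypothetical neighborhoods.
rng = random.Random(42)
hubs = [(2, 2), (2, 8), (8, 2), (8, 8)]
orders = [
    (rng.gauss(hx, 0.7), rng.gauss(hy, 0.7))
    for hx, hy in hubs
    for _ in range(50)
]

# Repeat with different initial centroids and keep the best run.
best = None
for _ in range(10):
    centroids, labels = kmeans(orders, 4, rng)
    cost = total_squared_distance(orders, centroids, labels)
    if best is None or cost < best[0]:
        best = (cost, centroids, labels)

cost, sites, areas = best
for i, (x, y) in enumerate(sites):
    print(f"site {i}: ({x:.2f}, {y:.2f}) serves {areas.count(i)} order locations")
```

Each entry in `sites` is a candidate franchise location, and `areas` gives the delivery area (cluster index) of each order location. In practice, the real order addresses would replace the synthetic `orders` list.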