Unsupervised learning

As the name suggests, unsupervised learning, unlike supervised learning, works on unlabeled data: no training example has a category associated with it.

Unsupervised learning is used to discover how data segments into groups based on a few of its features. For example, a supermarket might want to understand how many different types of customers it has. For that, it can use the following two features:

  • The number of visits per month (number of times the customer shows up)
  • The average bill amount

The initial data that the supermarket had might look like the following in a spreadsheet:

Plotted in these two dimensions and then clustered, the data might look like the following image:

Here you can see that there are four types of customers, with the two extreme cases annotated in the preceding image. Those who are thorough and disciplined know what they want: they visit the store only a few times, buy what they came for, and generally have very high bills. The vast majority fall into the group that makes many trips (darting into the supermarket for a packet of chips, perhaps) but has really low bills. This type of information is crucial for the supermarket because it can optimize its operations based on this data.

This type of segmentation task has a special name in machine learning: "clustering". There are several clustering algorithms, and k-means clustering is quite popular. One drawback of k-means clustering is that the number of clusters has to be specified up front.
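To make the idea concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python, applied to the two supermarket features. This is only an illustration under stated assumptions, not the implementation used elsewhere in this text: the function name `kmeans`, its parameters, and the customer records are all hypothetical and invented for this example. Note that the number of clusters `k` has to be passed in by the caller.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster 2-D points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance is enough for comparison).
        new_labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels


# Hypothetical customers: (visits per month, average bill amount).
customers = [
    (20, 5), (22, 6), (25, 4), (21, 7),          # many trips, small bills
    (2, 150), (3, 170), (1, 160), (2, 180),      # few trips, large bills
]

centroids, labels = kmeans(customers, k=2)
```

In this sketch the two features are used on their raw scales, so the bill amount dominates the distance. With real data it would usually be sensible to rescale the features first so that each contributes comparably.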
