Unsupervised machine learning

Unsupervised learning is a type of machine learning used to group related data objects and find hidden patterns by drawing inferences from unlabeled datasets, that is, training sets consisting of input data without labels.

Let's see a real-life example. Suppose you have a large collection of non-pirated and totally legal MP3 files in a crowded and massive folder on your hard drive. Now, what if you could build a predictive model that helps you automatically group together similar songs and organize them into your favorite categories, such as country, rap, and rock?

Assigning each MP3 to a group, so that it ends up in the right playlist, is a task that can be done in an unsupervised way. Classification, by contrast, assumes that you are given a training dataset of correctly labeled data. Unfortunately, we do not always have that luxury when we collect data in the real world.

For example, suppose we would like to divide a huge collection of music into interesting playlists. How can we possibly group together songs if we do not have direct access to their metadata? One possible approach is a mixture of various ML techniques, but clustering is often at the heart of the solution:

Figure 7: Clustering data samples at a glance

In other words, the main objective of unsupervised learning algorithms is to explore unknown or hidden patterns in unlabeled input data. Unsupervised learning also encompasses other techniques that summarize the key features of the data in an exploratory way. When no labels are available, clustering techniques are widely used to group data points based on a chosen similarity measure.
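To make the idea of grouping by similarity concrete, here is a minimal sketch of k-means clustering in plain Python. It is only an illustration and not necessarily the approach used later in this book. The two song features (a scaled tempo and an energy score), the sample values, and the function names are all made up for this example, and squared Euclidean distance is assumed as the similarity measure.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters; returns (assignments, centroids)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return assignments, centroids


# Hypothetical, hand-made features per song: (tempo / 100, energy in 0..1).
songs = [
    (0.70, 0.20), (0.75, 0.25), (0.72, 0.22),  # slow, quiet tracks
    (1.60, 0.90), (1.65, 0.95), (1.58, 0.88),  # fast, loud tracks
]

labels, centers = kmeans(songs, k=2)
for song, label in zip(songs, labels):
    print(song, "-> cluster", label)
```

No labels such as "country" or "rock" are supplied anywhere: the algorithm only sees the feature values and separates the songs that lie close together. The cluster numbers themselves are arbitrary, so naming each group is still up to you.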
