
SVM

Now we are ready to understand SVMs. An SVM is an algorithm that can be used for both classification and regression. Given a set of examples, it builds a model that assigns one group of observations to one category and the rest to a second category. In its basic form, it is a non-probabilistic linear classifier, so the training data being linearly separable is the key here. All the observations, or training data, are represented as vectors mapped into a space, and the SVM tries to classify them using a margin that is as wide as possible:

Let's say there are two classes A and B as in the preceding screenshot.
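To make this concrete, here is a minimal sketch, assuming scikit-learn and NumPy are available; the two small clusters are made-up stand-ins for classes A and B, and the fitted coef_ and intercept_ correspond to the weight vector and bias term discussed next:

# Minimal sketch: fit a maximum-margin linear classifier on made-up 2-D data
import numpy as np
from sklearn.svm import SVC

X = np.array([[2.0, 2.5], [3.0, 3.0], [2.5, 3.5],   # Class A
              [7.0, 7.5], [8.0, 7.0], [7.5, 8.5]])  # Class B
y = np.array([1, 1, 1, -1, -1, -1])                 # +1 for Class A, -1 for Class B

# kernel='linear' asks for a separating hyperplane; a large C approximates a hard margin
clf = SVC(kernel='linear', C=1e6)
clf.fit(X, y)

print('w =', clf.coef_[0])       # weight vector: orientation of the hyperplane
print('b =', clf.intercept_[0])  # bias term: position of the hyperplane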

And from the preceding section, we have learned the following:

g(x) = w · x + b

Where:

  • w: Weight vector that decides the orientation of the hyperplane
  • b: Bias term that decides the position of the hyperplane in the n-dimensional space

The preceding equation is also called a linear discriminant function. If there is a vector x1 that lies on the positive side of the hyperplane, the equation becomes the following:

g(x1) = w · x1 + b > 0

Similarly, if x1 lies on the negative side of the hyperplane, the equation becomes the following:

g(x1) < 0

What if g(x1) = 0? Can you guess where x1 would be? Well, yes, it would lie on the hyperplane itself. Since our goal is to find out the class of the vector, we can state the rule as follows:

  • If g(x1) > 0 => x1 belongs to Class A
  • If g(x1) < 0 => x1 belongs to Class B
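As a quick illustration, the following is a minimal sketch of this decision rule; w, b, and the test points are made-up values, not learned parameters:

# Minimal sketch of the rule: g(x) > 0 -> Class A, g(x) < 0 -> Class B
import numpy as np

def g(x, w, b):
    # Linear discriminant g(x) = w . x + b
    return np.dot(w, x) + b

def classify(x, w, b):
    value = g(x, w, b)
    if value > 0:
        return 'Class A'
    if value < 0:
        return 'Class B'
    return 'on the hyperplane'

w = np.array([1.0, -1.0])   # made-up weight vector
b = -0.5                    # made-up bias term

print(classify(np.array([3.0, 1.0]), w, b))  # g = 1.5  -> Class A
print(classify(np.array([1.0, 3.0]), w, b))  # g = -2.5 -> Class B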

Here, it's evident that we can find the classification by using the previous rule. But can you see the issue with it? Let's say the boundary line is like the one in the following plot:

Even in the preceding scenario, we are able to classify the feature vectors. But is it desirable? What can be seen here is that the boundary line, or classifier, sits close to Class B. This introduces a large bias in favor of Class A and penalizes Class B: any slight disturbance in the vectors close to the boundary might push them across it and make them part of Class A, which might not be correct. Hence, our goal is to find an optimal classifier that has the widest possible margin, as shown in the following plot:

Through SVM, we are attempting to create a boundary, or hyperplane, such that the distance from each of the feature vectors to the boundary is maximized, so that slight noise or disturbance won't change the classification. So, in this scenario, if we introduce yi, the class label belonging to xi, we get the following:

yi = ±1

yi(w · xi + b) will always be greater than 0, that is, yi(w · xi + b) > 0, because when xi ∈ Class A, w · xi + b > 0 and yi = +1, so the whole term is positive. Likewise, if xi ∈ Class B, w · xi + b < 0 and yi = -1, which again makes the term positive.
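The following minimal sketch, reusing the made-up w and b from the previous example, checks this numerically:

# yi * (w . xi + b) is positive for every correctly classified point
import numpy as np

w = np.array([1.0, -1.0])
b = -0.5
X = np.array([[3.0, 1.0],    # Class A, yi = +1
              [1.0, 3.0]])   # Class B, yi = -1
y = np.array([1, -1])

margins = y * (X @ w + b)    # yi * g(xi) for each point
print(margins)               # [1.5 2.5] -> both positive
print(np.all(margins > 0))   # True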

So, now if we have to redesign this condition, we say the following:

w · xi + b > γ, where γ is a measure of the distance of the hyperplane from xi.

And if there is a hyperplane w · x + b = 0, then the distance of point x from the preceding hyperplane is as follows:

(w · x + b) / ||w||

Hence, as mentioned previously:

(w · x + b) / ||w|| ≥ γ

w · x + b ≥ γ · ||w||

On performing proper scaling, we can say the following:

w · x + b ≥ 1 (since γ · ||w|| = 1)

It implies that if there is a classification to be arrived at based on the previous result, it follows this:

w · x + b ≥ 1 if x ∈ Class A, and

w · x + b ≤ -1 if x ∈ Class B
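A minimal numeric sketch of the distance formula and these scaled constraints is shown below; w and b are made-up values chosen so that ||w|| = 5 is easy to verify:

# Signed distance of x from the hyperplane w . x + b = 0, and the scaled margin checks
import numpy as np

w = np.array([3.0, 4.0])   # ||w|| = 5
b = -10.0

def signed_distance(x, w, b):
    return (np.dot(w, x) + b) / np.linalg.norm(w)

x_a = np.array([3.0, 2.0])   # w . x_a + b = 7, on the Class A side
x_b = np.array([1.0, 1.0])   # w . x_b + b = -3, on the Class B side

print(signed_distance(x_a, w, b))   #  1.4
print(signed_distance(x_b, w, b))   # -0.6
print(np.dot(w, x_a) + b >= 1)      # True: satisfies w . x + b >= 1
print(np.dot(w, x_b) + b <= -1)     # True: satisfies w . x + b <= -1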

And now, again, if we bring the class label yi back in, the equation becomes the following:

yi(w · xi + b) ≥ 1

But if yi(w · xi + b) = 1, xi is a support vector. Next, we will learn what a support vector is.
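Before moving on, as a hedged preview, the following sketch refits the toy linear SVM from the first example and shows that the points scikit-learn reports as support vectors are exactly those whose margin yi(w · xi + b) comes out as (approximately) 1:

# Support vectors are the points with yi * (w . xi + b) = 1
import numpy as np
from sklearn.svm import SVC

X = np.array([[2.0, 2.5], [3.0, 3.0], [2.5, 3.5],
              [7.0, 7.5], [8.0, 7.0], [7.5, 8.5]])
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel='linear', C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

margins = y * (X @ w + b)
print(np.round(margins, 3))   # support vectors sit at (approximately) 1
print(clf.support_)           # indices of the support vectors found by scikit-learn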
