Training the classifier

Creating a logistic regression classifier involves pretty much the same steps as setting up k-NN:

In [14]: lr = cv2.ml.LogisticRegression_create()

We then have to specify the desired training method; here, we can choose between cv2.ml.LogisticRegression_BATCH and cv2.ml.LogisticRegression_MINI_BATCH. For now, all we need to know is that we want to update the model after every data point, which we can achieve by choosing the mini-batch method with a batch size of 1:

In [15]: lr.setTrainMethod(cv2.ml.LogisticRegression_MINI_BATCH)
... lr.setMiniBatchSize(1)
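
For comparison, if we wanted to update the weights only after a full pass over the training set, we could pick the batch method instead. The following is a minimal sketch (the learning-rate value here is an arbitrary example, not taken from the text):

lr_batch = cv2.ml.LogisticRegression_create()
lr_batch.setTrainMethod(cv2.ml.LogisticRegression_BATCH)  # update once per full pass
lr_batch.setLearningRate(0.001)  # step size; 0.001 is an arbitrary example value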

We also want to specify the number of iterations the algorithm should run before it terminates:

In [16]: lr.setIterations(100)

We can then call the train method of the object (in the exact same way as we did earlier), which will return True upon success:

In [17]: lr.train(X_train, cv2.ml.ROW_SAMPLE, y_train)
Out[17]: True
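
A quick note on data types: OpenCV's ml module expects both the samples and the responses to be 32-bit floating-point arrays. If the train call complains about input types, a cast like the following usually fixes it (a sketch, assuming X_train and y_train are NumPy arrays):

import numpy as np
X_train = X_train.astype(np.float32)  # samples must be 32-bit floats
y_train = y_train.astype(np.float32)  # logistic regression also wants float labels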

As we just saw, the goal of the training phase is to find a set of weights that best transform the feature values into an output label. A single data point is given by its four feature values (f0, f1, f2, f3). Since we have four features, we should also get four weights, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3, and ŷ = σ(x). However, as discussed previously, the algorithm adds an extra weight that acts as an offset or bias, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3 + w4. We can retrieve these weights as follows:

In [18]: lr.get_learnt_thetas()
Out[18]: array([[-0.04109113, -0.01968078, -0.16216497, 0.28704911, 0.11945518]], dtype=float32)

This means that the input to the logistic function is x = -0.0411 f0 - 0.0197 f1 - 0.162 f2 + 0.287 f3 + 0.119. Then, when we feed in a new data point (f0, f1, f2, f3) that belongs to class 1, the output ŷ = σ(x) should be close to 1.
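
To make the role of these weights concrete, here is a minimal sketch that applies them by hand with NumPy, assuming (as stated above) that the bias is the last entry returned by get_learnt_thetas; predict_proba is a hypothetical helper name:

import numpy as np

def predict_proba(lr, X):
    # retrieve the learned weights; the last entry is the bias/offset w4
    thetas = lr.get_learnt_thetas().ravel()
    w, bias = thetas[:-1], thetas[-1]
    x = X @ w + bias  # x = w0 f0 + w1 f1 + w2 f2 + w3 f3 + w4
    return 1.0 / (1.0 + np.exp(-x))  # the logistic function σ(x)

Values close to 1 suggest class 1, and values close to 0 suggest class 0. But how well does that actually work?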
