Training the classifier

Creating a logistic regression classifier involves pretty much the same steps as setting up k-NN:

In [14]: lr = cv2.ml.LogisticRegression_create()

We then have to specify the desired training method. Here, we can choose between cv2.ml.LogisticRegression_BATCH and cv2.ml.LogisticRegression_MINI_BATCH. For now, all we need to know is that we want to update the model after every data point; in other words, we use mini-batches of size one, which amounts to stochastic gradient descent. This can be achieved with the following code:

In [15]: lr.setTrainMethod(cv2.ml.LogisticRegression_MINI_BATCH)
...      lr.setMiniBatchSize(1)

We also want to specify the number of iterations the algorithm should run before it terminates:

In [16]: lr.setIterations(100)

We can then call the train method of the object (in the exact same way as we did earlier), which will return True upon success:

In [17]: lr.train(X_train, cv2.ml.ROW_SAMPLE, y_train)
Out[17]: True
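
If the call to train complains about input types, remember that OpenCV's ml module expects 32-bit floating point data throughout. The following is a minimal sketch of how X_train and y_train might have been prepared (assuming, for illustration, the first two classes of scikit-learn's Iris dataset; the exact preprocessing is an assumption, not part of this section):

import numpy as np
from sklearn import datasets, model_selection

iris = datasets.load_iris()
mask = iris.target != 2                    # keep classes 0 and 1 only
X = iris.data[mask].astype(np.float32)     # features must be np.float32
y = iris.target[mask].astype(np.float32)   # labels must be np.float32, too
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, random_state=42)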

As we just saw, the goal of the training phase is to find a set of weights that best transform the feature values into an output label. A single data point is given by its four feature values (f0, f1, f2, f3). Since we have four features, we should also get four weights, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3, and ŷ = σ(x). However, as discussed previously, the algorithm adds an extra weight that acts as an offset or bias, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3 + w4. We can retrieve these weights as follows:

In [18]: lr.get_learnt_thetas()
Out[18]: array([[-0.04109113, -0.01968078, -0.16216497, 0.28704911, 0.11945518]], dtype=float32)

This means that the input to the logistic function is x = -0.0411 f0 - 0.0197 f1 - 0.162 f2 + 0.287 f3 + 0.119. Then, when we feed in a new data point (f0, f1, f2, f3) that belongs to class 1, the output ŷ = σ(x) should be close to 1. But how well does that actually work?
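
One way to convince ourselves is to redo the computation by hand. The following sketch applies the learned weights to a single data point, assuming NumPy is imported as np and following the convention above that the bias w4 is the last entry of the weight vector (the helper predict_proba is our own, not part of OpenCV's API):

import numpy as np

thetas = lr.get_learnt_thetas().ravel()    # [w0, w1, w2, w3, w4]

def predict_proba(f):
    # x = w0 f0 + w1 f1 + w2 f2 + w3 f3 + w4, then y-hat = sigma(x)
    x = np.dot(f, thetas[:4]) + thetas[4]
    return 1.0 / (1.0 + np.exp(-x))

For a training sample, predict_proba(X_train[0]) should come out close to y_train[0]; thresholding the returned probability at 0.5 recovers the predicted class label, which is what the classifier's predict method does for us.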