
Training the MLP classifier

In Spark, an MLP is a classifier that consists of multiple layers. Each layer is fully connected to the next layer in the network. Nodes in the input layer represent the input data, whereas nodes in the other layers map inputs to outputs by applying an activation function to a linear combination of the inputs with the node's weights and biases.
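Concretely, for a node with inputs x1, ..., xn, weights w1, ..., wn, a bias b, and an activation function f, the output is computed as follows (this is the generic perceptron formulation rather than anything specific to Spark's notation):

output = f(w1*x1 + w2*x2 + ... + wn*xn + b)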

Interested readers can take a look at https://spark.apache.org/docs/latest/ml-classification-regression.html#multilayer-perceptron-classifier.

So let's create the layers for the MLP classifier. For this example, let's build a shallow network, considering that our dataset is not very high-dimensional.

Let's assume that 8 neurons in the first hidden layer and 16 neurons in the second hidden layer will be sufficient. Note that the input layer has 10 neurons because our feature vectors have 10 dimensions, and the output layer has 2 neurons since our MLP will predict only 2 classes. One thing is very important: the number of inputs has to be equal to the size of the feature vectors, and the number of outputs has to be equal to the total number of labels:

int[] layers = new int[] {10, 8, 16, 2};

Then we instantiate the trainer and set its parameters:

MultilayerPerceptronClassifier mlp = new MultilayerPerceptronClassifier()
.setLayers(layers)
.setBlockSize(128)
.setSeed(1234L)
.setTol(1E-8)
.setMaxIter(1000);

So, as you can understand, the preceding MultilayerPerceptronClassifier() is the classifier trainer based on the MLP. Each layer has a sigmoid activation function except the output layer, which has the softmax activation function. Note that the Spark-based MLP implementation supports only minibatch gradient descent (GD) and L-BFGS as optimizers.
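If you want to choose the optimizer explicitly, the trainer exposes it as a parameter. The following is a minimal sketch, assuming Spark 2.0 or later, where the solver parameter accepts "l-bfgs" (the default) and "gd":

// Switch the optimizer to minibatch gradient descent instead of the default L-BFGS
mlp.setSolver("gd")
   .setStepSize(0.03); // learning rate used by the gd solver; ignored by l-bfgs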

In short, we cannot use other activation functions such as ReLU or tanh in the hidden layers. Apart from this, other advanced optimizers are not supported either, nor are features such as batch normalization. This is a serious constraint of this implementation. In the next chapter, we will try to overcome this with DL4J.

We have also set the convergence tolerance of iterations to a very small value, which leads to higher accuracy at the cost of more iterations. We also set the block size for stacking input data in matrices to speed up the computation.

If the size of the training set is large, then the data is stacked within partitions. If the block size is more than the remaining data in a partition, then it is adjusted to the size of this data. The recommended size is between 10 and 1,000, but the default block size is 128.

Finally, we set the maximum number of training iterations to 1,000. Now let's start training the model using the training set:

MultilayerPerceptronClassificationModel model = mlp.fit(trainingData);
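Once fit() returns, the trained model can be used to score new data. The following is a minimal sketch of how the model might be evaluated, assuming a testData DataFrame with the same features and label columns as the training set (testData and the column names here are assumptions for illustration, not part of the original snippet):

import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Score the held-out test set with the trained MLP model
Dataset<Row> predictions = model.transform(testData);

// Compute classification accuracy from the label and prediction columns
MulticlassClassificationEvaluator evaluator = new MulticlassClassificationEvaluator()
        .setLabelCol("label")
        .setPredictionCol("prediction")
        .setMetricName("accuracy");

double accuracy = evaluator.evaluate(predictions);
System.out.println("Test set accuracy = " + accuracy);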