
Gradient descent

The SGD implementation of gradient descent uses a simple (distributed) sampling of the data examples. The loss part of the optimization problem is $\frac{1}{n} \sum_{i=1}^{n} L(w; x_i, y_i)$, so its gradient, $\frac{1}{n} \sum_{i=1}^{n} \frac{\partial}{\partial w} L(w; x_i, y_i)$, is a true (sub)gradient. Computing it requires access to the full dataset, which is not optimal.

The parameter miniBatchFraction specifies the fraction of the full data to use. The average of the gradients over this subset,

$$\frac{1}{|S|} \sum_{i \in S} \frac{\partial}{\partial w} L(w; x_i, y_i),$$

is a stochastic gradient, where $S$ is a sampled subset of size $|S| = \text{miniBatchFraction} \cdot n$.
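To make the sampled-subset gradient concrete, here is a minimal sketch of a single mini-batch gradient step on a small in-memory dataset; the object name MiniBatchSketch, the toy data, and the choice of squared loss are assumptions for illustration only, not MLlib internals.

import scala.util.Random

object MiniBatchSketch {
  def main(args: Array[String]): Unit = {
    val rng = new Random(42)
    val n = 100                          // number of examples
    val d = 3                            // number of features
    // Toy dataset: n examples of (label, features)
    val data = List.fill(n)((rng.nextDouble(), Array.fill(d)(rng.nextDouble())))
    val w = Array.fill(d)(0.0)           // initial weights
    val miniBatchFraction = 0.1
    val stepSize = 0.1

    // Sample a subset S of size |S| = miniBatchFraction * n
    val batch = rng.shuffle(data).take((miniBatchFraction * n).toInt)

    // Sum the per-example gradients of the squared loss
    // L(w; x, y) = (w . x - y)^2 / 2, whose gradient is (w . x - y) * x
    val grad = Array.fill(d)(0.0)
    for ((y, x) <- batch) {
      val err = x.zip(w).map { case (xi, wi) => xi * wi }.sum - y
      for (j <- 0 until d) grad(j) += err * x(j)
    }

    // One step: w <- w - stepSize * (average gradient over the subset)
    for (j <- 0 until d) w(j) -= stepSize * grad(j) / batch.size

    println(w.mkString("w: [", ", ", "]"))
  }
}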

In the following code, we show how to use stochastic gradient descent on a mini-batch to compute the weights and the loss. The output of this program is a weight vector and the loss recorded at each iteration.

import scala.util.Random

import org.apache.spark.SparkContext
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{GradientDescent, LogisticGradient, SquaredL2Updater}

object SparkSGD {
  def main(args: Array[String]): Unit = {
    val m = 4        // number of data points
    val n = 200000   // number of features per point
    val sc = new SparkContext("local[2]", "SparkSGD")
    // Generate m labeled points with n random features each,
    // seeding each partition's generator with its partition index
    val points = sc.parallelize(0 until m, 2).mapPartitionsWithIndex { (idx, iter) =>
      val random = new Random(idx)
      iter.map(i => (1.0, Vectors.dense(Array.fill(n)(random.nextDouble()))))
    }.cache()
    // Run mini-batch SGD with logistic loss and L2 regularization,
    // starting from a zero weight vector
    val (weights, loss) = GradientDescent.runMiniBatchSGD(
      points,
      new LogisticGradient,
      new SquaredL2Updater,
      0.1,
      2,
      1.0,
      1.0,
      Vectors.dense(new Array[Double](n)))
    println("w:" + weights(0))
    println("loss:" + loss(0))
    sc.stop()
  }
}
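The positional arguments to runMiniBatchSGD are, in order, the step size (0.1), the number of iterations (2), the regularization parameter (1.0), the mini-batch fraction (1.0, meaning the full dataset is used in each iteration), and the vector of initial weights (all zeros). The method returns the final weights together with the loss recorded at each iteration, which is why the program can print the first weight and the first loss value.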