
A single machine

A single machine is the simplest use case for Spark, and it is also a great way to sanity-check your build. The spark/bin directory contains a shell script called run-example, which launches a Spark job. The script takes the name of a Spark example class and some arguments. Earlier, we used run-example to calculate the value of Pi. A collection of sample Spark jobs is available in examples/src/main/scala/org/apache/spark/examples/.

All of the sample programs take a master parameter (the cluster manager), which can be either the URL of a distributed cluster or local[N], where N is the number of threads.
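As a rough illustration of the local[N] form, the following sketch builds a master string from a thread count. The local_master function is a hypothetical helper, not part of Spark. The sketch also assumes that the run-example script in your Spark version reads the MASTER environment variable, which may not hold for every release, so it only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): build a local master URL.
# A positive integer N yields local[N]; anything else falls back to
# local[*], which asks Spark for one thread per available core.
local_master() {
  case "$1" in
    ''|*[!0-9]*|0) echo 'local[*]' ;;
    *) echo "local[$1]" ;;
  esac
}

# Dry run: print the command instead of executing it.
# Assumption: this Spark version's run-example reads the MASTER variable.
echo "MASTER=$(local_master 4) bin/run-example GroupByTest"
```

If the assumption holds for your installation, running the printed command from the Spark directory should start the example with four local threads.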

The run-example script itself invokes the more general bin/spark-submit script. For now, let's stick with run-example.
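To give a feel for what the more general script expects, here is a minimal sketch that prints a spark-submit command line for one of the bundled examples. The submit_cmd function is a hypothetical helper, and the jar path is a placeholder, because the location of the examples jar depends on your build. The --master and --class options are standard spark-submit options. The package name is inferred from the examples source directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): print a spark-submit command line
# for one of the bundled examples. Nothing is executed.
#   $1 = example class name, for instance GroupByTest
#   $2 = master URL (defaults to local[*])
#   $3 = path to the examples jar (placeholder; depends on your build)
submit_cmd() {
  example="$1"
  master="${2:-local[*]}"
  jar="${3:-/path/to/spark-examples.jar}"
  echo "bin/spark-submit --master $master --class org.apache.spark.examples.$example $jar"
}

# Dry run for GroupByTest with two local threads.
submit_cmd GroupByTest 'local[2]'
```

Printing the command rather than running it keeps the sketch safe to try on a machine where the jar location is not yet known.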

To run GroupByTest locally, try running the following command:

bin/run-example GroupByTest

This command produces output like the following:

14/11/15 06:28:40 INFO SparkContext: Job finished: count at GroupByTest.scala:51, took 0.494519333 s
2000

Note

All the examples in this book can be run on a Spark installation on a local machine, so you can read through the rest of the chapter for additional information after you have had some hands-on exposure to Spark running locally.
