Introduction to Apache Spark

Apache Spark is an open source framework for fast, efficient processing of large datasets stored in heterogeneous data stores. Sophisticated analytical algorithms can be run on these datasets with relatively little effort. For some workloads, particularly those that keep data in memory, Spark can run a distributed program up to 100 times faster than Hadoop MapReduce. Spark is also one of the fastest-growing projects in the open source community, and it offers its users a large number of libraries.

We shall cover the following topics in this chapter:

  • A brief introduction to Spark
  • Spark architecture and the different languages that can be used for coding Spark applications
  • Spark components and how these components can be used together to solve a variety of use cases
  • A comparison between Spark and Hadoop