Chapter 3. ETL with Spark

So far, we have gone through the architecture of Spark and discussed RDDs in some detail. By the end of Chapter 2, Transformations and Actions with Spark RDDs, we had focused on PairRDDs and some of the transformations.

This chapter focuses on doing ETL with Apache Spark. We'll cover the following topics, which should help you take the next step with Apache Spark:

  • Understanding the ETL process
  • Commonly supported file formats
  • Commonly supported filesystems
  • Working with NoSQL databases

Let's get started!
