Chapter 3. ETL with Spark

So far, we have gone through the architecture of Spark and discussed RDDs in some detail. By the end of Chapter 2, Transformations and Actions with Spark RDDs, we had focused on PairRDDs and some of their transformations.

This chapter focuses on doing ETL with Apache Spark. We'll cover the following topics, which should help you take the next step with Apache Spark:

  • Understanding the ETL process
  • Commonly supported file formats
  • Commonly supported filesystems
  • Working with NoSQL databases

Let's get started!
