Getting Your Big Data into the Spark Environment Using RDDs

This chapter provides a brief overview of how to get your big data into the Spark environment using resilient distributed datasets (RDDs). We will use a range of tools to interact with and modify this data so that useful insights can be extracted. We will first load the data into Spark RDDs and then carry out parallelization with Spark RDDs.

In this chapter, we will cover the following topics:

  • Loading data onto Spark RDDs
  • Parallelization with Spark RDDs
  • Basics of RDD operations