
Operations on RDD

An RDD supports only two types of operations: transformations and actions. Each is explained as follows:

  • Transformation: If an operation on an RDD gives you another RDD, it is a transformation. Suppose you have an RDD of strings and want to filter it on the values that start with H, as follows:

A filter operation on an RDD returns another RDD containing all the values that pass the filter condition, so filter is an example of a transformation.

  • Action: If an operation on an RDD gives you a result other than an RDD, it is called an action. Examples include the sum of all the values in an RDD, the count of all the values, or retrieving all the values of an RDD in the form of a list. The following is the logical representation of the sum action on an RDD:

So, the rule is this: if an operation on an RDD gives you an RDD, it is a transformation; otherwise, it is an action. We will discuss the available transformations and actions that can be performed on an RDD, with coding examples, in Chapter 4, Understanding the Spark Programming Model, and Chapter 7, Spark Programming Model - Advanced.
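
To make the rule concrete before those chapters, here is a minimal sketch in plain Python. It does not use Spark: MiniRDD is a hypothetical, single-machine stand-in written only for this illustration, and is_transformation is likewise a made-up helper. The method names filter, sum, count, and collect are borrowed from the operations mentioned above, and the sketch assumes that filtering on H means keeping the values that start with H.

```python
from typing import Any, Callable, Iterable, List


class MiniRDD:
    """Hypothetical in-memory stand-in for an RDD (NOT the Spark API)."""

    def __init__(self, values: Iterable[Any]) -> None:
        self._values: List[Any] = list(values)

    # Transformation: the result is another MiniRDD.
    def filter(self, predicate: Callable[[Any], bool]) -> "MiniRDD":
        return MiniRDD(v for v in self._values if predicate(v))

    # Actions: the result is something other than a MiniRDD.
    def sum(self) -> Any:
        return sum(self._values)

    def count(self) -> int:
        return len(self._values)

    def collect(self) -> List[Any]:
        return list(self._values)


def is_transformation(result: Any) -> bool:
    """The rule from the text: an RDD result means a transformation."""
    return isinstance(result, MiniRDD)


# Assumed sample data, chosen only for this illustration.
words = MiniRDD(["Hello", "Spark", "Hadoop", "RDD"])
numbers = MiniRDD([1, 2, 3, 4])

# filter gives back another MiniRDD, so it is a transformation.
h_words = words.filter(lambda s: s.startswith("H"))
print(is_transformation(h_words))        # True

# sum, count, and collect give back plain values, so they are actions.
print(is_transformation(numbers.sum()))  # False
print(h_words.collect())                 # ['Hello', 'Hadoop']
print(h_words.count())                   # 2
print(numbers.sum())                     # 10
```

The sketch only models the return types that separate the two kinds of operation; it says nothing about how Spark distributes or schedules the work.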
