
RDDs versus DataFrames versus Datasets

To be clear, we discourage you from using RDDs unless you have a strong reason to do so, for the following reasons:

  • In terms of abstraction level, RDDs are the equivalent of assembler or machine code in system programming
  • RDDs express how to do something and not what is to be achieved, leaving no room for optimizers
  • RDDs have proprietary syntax; SQL is more widely known

Whenever possible, use Datasets because their static typing makes them faster. The Dataset API is only available in statically typed languages such as Java or Scala, so if you are using one of those, you are fine. Otherwise (for example, in Python or R), you have to stick with DataFrames.
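
To make the contrast concrete, here is a minimal sketch of the same small task written against each API. It assumes Spark 2.x running in local mode; the `Employee` case class, the sample data, and the `ApiComparison` object are hypothetical names invented for this illustration and are not taken from the book.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}
import org.apache.spark.sql.functions.avg

// Hypothetical record type, used only for this illustration.
case class Employee(name: String, dept: String, age: Int)

object ApiComparison {
  // Hypothetical sample data.
  val sample: Seq[Employee] = Seq(
    Employee("Ann", "eng", 30),
    Employee("Bob", "eng", 40),
    Employee("Cid", "ops", 50)
  )

  // RDD: we spell out HOW to compute the average (sum and count by hand),
  // so Spark has nothing declarative to optimize.
  def avgAgeRdd(spark: SparkSession): Map[String, Double] =
    spark.sparkContext.parallelize(sample)
      .map(e => (e.dept, (e.age, 1)))
      .reduceByKey((a, b) => (a._1 + b._1, a._2 + b._2))
      .mapValues { case (sum, count) => sum.toDouble / count }
      .collect()
      .toMap

  // DataFrame: we state WHAT we want and leave the execution plan to the optimizer.
  def avgAgeDf(spark: SparkSession): DataFrame = {
    import spark.implicits._
    sample.toDF().groupBy("dept").agg(avg("age").as("avg_age"))
  }

  // Dataset: rows are typed as Employee, so a misspelled field name
  // in the filter below would fail at compile time rather than at runtime.
  def engineers(spark: SparkSession): Dataset[Employee] = {
    import spark.implicits._
    sample.toDS().filter(_.dept == "eng")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("api-comparison")
      .master("local[*]") // assumption: local mode, for demonstration only
      .getOrCreate()

    println(avgAgeRdd(spark))
    avgAgeDf(spark).show()
    engineers(spark).show()

    spark.stop()
  }
}
```

The DataFrame version could equally be written as a SQL query against a temporary view, which is the point of the third bullet above: the declarative form is both shorter and more widely understood than the hand-written RDD aggregation.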
