Performance improvements in Spark ML over Spark MLlib

Spark 2.0 uses the Tungsten engine, which builds on ideas from modern compilers and MPP databases. At runtime it emits optimized bytecode that collapses an entire query into a single function. This removes the need for virtual function calls between operators and lets intermediate data stay in CPU registers. The technique is called whole-stage code generation.

Reference: https://databricks.com/blog/2016/05/11/apache-spark-2-0-technical-preview-easier-faster-and-smarter.html
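The idea of collapsing a query into a single function can be illustrated with a small plain-Python sketch. This is only a conceptual toy, not Spark's generated code and not a Spark API: the `Scan`, `Filter`, and `Count` classes and the `count_fused` function are hypothetical names invented for the illustration, and the sample data is made up. The first version chains operators that pull rows from one another, paying an indirect call per row at every operator boundary; the second is the same query written as one loop, which is roughly the shape that whole-stage code generation aims to produce.

```python
from typing import Callable, Iterable, Iterator


class Scan:
    """Leaf operator: hands out rows one at a time."""

    def __init__(self, rows: Iterable[int]) -> None:
        self.rows = rows

    def __iter__(self) -> Iterator[int]:
        return iter(self.rows)


class Filter:
    """Pulls rows from its child and keeps those matching the predicate."""

    def __init__(self, child: Iterable[int], predicate: Callable[[int], bool]) -> None:
        self.child = child
        self.predicate = predicate

    def __iter__(self) -> Iterator[int]:
        for row in self.child:
            if self.predicate(row):  # one indirect call per row
                yield row


class Count:
    """Root operator: counts whatever its child produces."""

    def __init__(self, child: Iterable[int]) -> None:
        self.child = child

    def execute(self) -> int:
        total = 0
        for _ in self.child:
            total += 1
        return total


def count_fused(rows: Iterable[int], key: int) -> int:
    """Scan, filter, and count collapsed by hand into a single loop.

    There are no operator boundaries, and the running total lives in
    one local variable for the whole query.
    """
    total = 0
    for value in rows:
        if value == key:
            total += 1
    return total


# Hypothetical sample data: item ids in a sales table.
item_ids = [1000, 7, 1000, 42, 1000]

volcano_result = Count(Filter(Scan(item_ids), lambda v: v == 1000)).execute()
fused_result = count_fused(item_ids, 1000)

print(volcano_result, fused_result)  # prints "3 3"
```

Both versions return the same answer; the difference is only in how much per-row overhead sits between the operators. In Spark itself the fused form is generated automatically as JVM bytecode rather than written by hand.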

The following chart and table show the performance improvements of single-line functions between Spark 1.6 and Spark 2.0:

Chart comparing performance improvements in single-line functions between Spark 1.6 and Spark 2.0
Table comparing performance improvements in single-line functions between Spark 1.6 and Spark 2.0