Design of Sparkling Water

Sparkling Water is designed to be executed as a regular Spark application. Consequently, it is launched inside a Spark executor created after submitting the application. At this point, H2O starts services, including a distributed key-value (K/V) store and memory manager, and orchestrates them into a cloud. The topology of the created cloud follows the topology of the underlying Spark cluster.

As stated previously, Sparkling Water enables conversion in both directions between the different types of RDDs/DataFrames and H2O frames. When converting from a hex frame to an RDD, a wrapper is created around the hex frame to provide an RDD-like API. In this case, data is not duplicated but served directly from the underlying hex frame. Converting from an RDD/DataFrame to an H2O frame requires data duplication because it transforms data from Spark into H2O-specific storage. However, data stored in an H2O frame is heavily compressed, and once converted it no longer needs to be preserved as an RDD:

Data sharing between Sparkling Water and Spark
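
The asymmetry between the two conversion directions can be illustrated with a small toy model. This is a sketch in plain Python, not the Sparkling Water API: `ToyFrame` and `FrameRowView` are hypothetical names invented for this example, and the model ignores compression and distribution entirely. It only shows that one direction copies values into separate columnar storage, while the other wraps existing storage and reads from it on demand.

```python
from array import array
from typing import Callable, Iterator, List, Sequence, Tuple


class ToyFrame:
    """Hypothetical stand-in for an H2O frame: one typed array per column."""

    def __init__(self, names: List[str], columns: List[array]) -> None:
        self.names = names
        self.columns = columns

    @classmethod
    def from_rows(cls, names: List[str], rows: Sequence[Tuple]) -> "ToyFrame":
        # Row data -> frame: every value is copied into new columnar storage.
        columns = [array("d") for _ in names]
        for row in rows:
            for column, value in zip(columns, row):
                column.append(float(value))
        return cls(names, columns)


class FrameRowView:
    """Hypothetical wrapper giving row-wise access to a ToyFrame without copying it."""

    def __init__(self, frame: ToyFrame) -> None:
        self._frame = frame  # holds a reference only; no values are duplicated

    def __iter__(self) -> Iterator[Tuple[float, ...]]:
        # Rows are assembled lazily from the underlying columns on each pass.
        return zip(*self._frame.columns)

    def map(self, fn: Callable[[Tuple[float, ...]], object]) -> Iterator[object]:
        return (fn(row) for row in self)


# Row data -> frame duplicates the data.
rows = [(1, 10.0), (2, 20.0), (3, 30.0)]
frame = ToyFrame.from_rows(["id", "value"], rows)

# Frame -> row view only wraps the frame.
view = FrameRowView(frame)
first_pass = list(view)

# A change in the frame is visible through the view, because the view
# serves rows directly from the frame's columns.
frame.columns[1][0] = 99.0
second_pass = list(view)

# A change in the original rows does not reach the frame, because the
# frame holds its own copy.
rows[0] = (0, 0.0)
frame_first_id = frame.columns[0][0]
```

In this model the view stays in sync with the frame because it never owns any data, whereas the frame is independent of the rows it was built from. That independence is what allows the source rows to be discarded once the frame exists.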