
MapReduce and Spark

MapReduce is a technique for performing aggregate processing on large amounts of data in parallel, and it is particularly common in data analytics applications. Cassandra does not offer built-in MapReduce capabilities, but it can be integrated with Hadoop to run MapReduce operations across Cassandra data sets, or with Spark for real-time data analysis. The DataStax Enterprise product provides integration with both of these tools out of the box.
Spark is a fast, distributed, and expressive computational engine for large-scale data processing, similar in purpose to MapReduce. It is often much faster than Hadoop MapReduce because it can keep intermediate data in memory, and it runs under resource managers such as Mesos and YARN. It can read data from sources such as Hadoop or Cassandra, or from streams such as Kafka. DataStax provides a Spark-Cassandra connector to load data from Cassandra into Spark and run batch computations on that data.
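To make the MapReduce model described above concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop, Spark, or Cassandra code: the `rows` list is hypothetical sample data standing in for records read from a Cassandra table, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `map_reduce`) are made up for illustration. A real framework would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict

# Hypothetical sample rows, standing in for records read from a Cassandra table.
rows = [
    {"hotel": "AZ123", "nights": 2},
    {"hotel": "NY229", "nights": 1},
    {"hotel": "AZ123", "nights": 3},
    {"hotel": "NY229", "nights": 4},
    {"hotel": "CA555", "nights": 5},
]


def map_phase(row):
    """Map step: emit one (key, value) pair per input row."""
    return row["hotel"], row["nights"]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single aggregate."""
    return key, sum(values)


def map_reduce(input_rows):
    """Run map, shuffle, and reduce in sequence and return the totals by key."""
    pairs = (map_phase(row) for row in input_rows)
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


totals = map_reduce(rows)
print(totals)
```

Because each map call depends only on its own row, and each reduce call only on the values for its own key, both steps can be spread across machines. That independence is what lets engines such as Hadoop MapReduce and Spark scale this pattern to large data sets.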
