
  • Cassandra High Availability
  • Robbie Strickland

Consistent hashing

To solve the problem of locating a key in a distributed hash table, we use a technique called consistent hashing. Introduced as a term in 1997, consistent hashing was originally used as a means of routing requests among large numbers of web servers. It's easy to see how the Web can benefit from a hash mechanism that allows any node in the network to efficiently determine the location of an object, in spite of the constant shifting of nodes in and out of the network. This is the fundamental objective of consistent hashing.

The mechanics of consistent hashing

With consistent hashing, the buckets are arranged in a ring with a predefined range; the exact range depends on the partitioner being used. Keys are then hashed to produce a value that lies somewhere along the ring. Nodes are assigned a range, which is computed as follows:

Note

The following examples assume that the default Murmur3Partitioner is used. For more information on this partitioner, take a look at the documentation at http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePartitionerM3P_c.html.
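The even spacing described above can be sketched in a few lines. This is an illustrative calculation, not Cassandra's actual code: the Murmur3Partitioner's token space runs from -2^63 to 2^63 - 1, and for evenly distributed ranges each node's starting token is offset by one N-th of that space.

```python
# A minimal sketch (not Cassandra's implementation): evenly spaced
# starting tokens for an N-node ring under the Murmur3Partitioner,
# whose token space spans -2**63 .. 2**63 - 1.
def even_tokens(node_count):
    """Return one starting token per node, evenly spaced around the ring."""
    ring_size = 2 ** 64
    return [-(2 ** 63) + i * ring_size // node_count for i in range(node_count)]

for node, token in enumerate(even_tokens(5), start=1):
    print(f"node {node}: token {token}")
```

Each node then owns the arc of the ring from the previous node's token (exclusive) up to its own token (inclusive), which is the ownership rule described next.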

For a five-node cluster, a ring with evenly distributed token ranges would look like the following diagram:

[Diagram: a five-node ring, with token ranges (letters) assigned to nodes (numbers)]

In the preceding diagram, the primary replica for each key is assigned to a node based on its hashed value. Each node is responsible for the region of the ring between itself (inclusive) and its predecessor (exclusive).
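This ownership rule can be sketched as a lookup over the sorted node tokens: the owner of a key is the first node whose token is greater than or equal to the key's token, wrapping around to the lowest token when the key hashes past the highest. The node tokens and key tokens below are made-up values for illustration.

```python
from bisect import bisect_left

def owning_token(key_token, node_tokens):
    """Return the token of the node that owns key_token.

    A node owns the range from its predecessor's token (exclusive)
    up to its own token (inclusive), so we take the first node token
    >= the key's token, wrapping to the lowest token at the top of
    the ring.
    """
    tokens = sorted(node_tokens)
    i = bisect_left(tokens, key_token)
    return tokens[i % len(tokens)]

# Three hypothetical nodes at tokens -100, 0, and 100:
print(owning_token(50, [-100, 0, 100]))   # falls to the node at token 100
print(owning_token(101, [-100, 0, 100]))  # wraps to the node at token -100
```

The wrap-around at the top of the ring is what makes the structure a ring rather than a line: there is no key that falls outside every node's range.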

This diagram shows the data ranges (the letters) and the nodes (the numbers) that own them. It might also be helpful to visualize this in table form, which will be familiar to anyone who has used the nodetool ring command to view Cassandra's topology.

When Cassandra receives a key for either a read or a write, the hash function is applied to the key to determine where it lies in the range. Since all nodes in the cluster are aware of the other nodes' ranges, any node can handle a request for any other node's range. The node receiving the request is called the coordinator, and any node can act in this role. If a key does not belong to the coordinator's range, it forwards the request to replicas in the correct range.

Following the previous example, we can now examine how our names might map to a hash, using the Murmur3 hash algorithm. Once the values are computed, they can be matched to the range of one of the nodes in the cluster, as follows:
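The key-to-token step can be sketched as follows. Since reproducing Cassandra's exact Murmur3 token values is beyond this illustration, the sketch folds an MD5 digest into a signed 64-bit integer as a stand-in hash, and the names used are hypothetical rather than those from the book's example; only the shape of the computation matches what Cassandra does.

```python
import hashlib

def toy_token(key):
    """Illustrative stand-in for Murmur3: fold an MD5 digest into a
    signed 64-bit token. Cassandra's real tokens come from the Murmur3
    algorithm and will differ from these values, but the idea is the
    same: a key deterministically maps to one point on the ring."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)

# Hypothetical keys, not the names from the book's example:
for name in ("alice", "bob", "carol"):
    print(f"{name!r} -> token {toy_token(name)}")
```

Because the hash is deterministic, every node that applies it to the same key computes the same token, which is what lets any coordinator route a request correctly.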

The placement of these keys might be easier to understand by visualizing their position in the ring.

[Diagram: the hash value of the name keys determines their placement in the cluster]

Now that you understand the basics of consistent hashing, let's turn our focus to the mechanism by which Cassandra assigns data ranges.
