
HDFS I/O

An HDFS read operation from a client involves the following:

  1. The client asks the NameNode where the data blocks of a given file are stored.
  2. The NameNode responds with the block IDs and the locations of the hosts (DataNodes) that hold each block.
  3. The client contacts those DataNodes with the respective block IDs and fetches the data, preserving the order of the blocks within the file.
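The read steps above can be sketched with a small in-memory model. This is a toy simulation, not the Hadoop client API: the `NameNode` and `DataNode` classes, the `read_file` helper, and the block IDs are all hypothetical names introduced for illustration. It only shows that the NameNode serves metadata while the block bytes come from the DataNodes, in file order.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy DataNode: stores block bytes keyed by block ID."""
    name: str
    blocks: dict = field(default_factory=dict)

    def read_block(self, block_id):
        return self.blocks[block_id]


@dataclass
class NameNode:
    """Toy NameNode: holds metadata only, never file contents."""
    # path -> ordered list of (block_id, [names of DataNodes with a replica])
    block_map: dict = field(default_factory=dict)

    def get_block_locations(self, path):
        return self.block_map[path]


def read_file(path, namenode, datanodes):
    """Steps 1-3: look up block locations, then fetch the blocks in order."""
    data = b""
    for block_id, hosts in namenode.get_block_locations(path):
        # Simplification: always read from the first listed replica.
        data += datanodes[hosts[0]].read_block(block_id)
    return data


# Hypothetical two-block file spread over two DataNodes.
dn1 = DataNode("dn1", {"blk_1": b"hello ", "blk_2": b"world"})
dn2 = DataNode("dn2", {"blk_2": b"world"})
nn = NameNode({"/demo.txt": [("blk_1", ["dn1"]), ("blk_2", ["dn2", "dn1"])]})

result = read_file("/demo.txt", nn, {"dn1": dn1, "dn2": dn2})
print(result)  # b'hello world'
```

Because the block list returned by the toy NameNode is ordered, concatenating the blocks in that order reconstructs the file even though the blocks live on different hosts.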

An HDFS write operation from a client involves the following:

  1. The client contacts the NameNode to add the filename to the namespace and to verify the necessary permissions.
  2. If the file already exists, the NameNode throws an error; otherwise, the client receives an FSDataOutputStream that feeds the data queue.
  3. As the data queue is consumed, the NameNode is asked to allocate new blocks on suitable DataNodes.
  4. The data is then copied to the first of those DataNodes and, as per the replication strategy, forwarded from that DataNode to the rest of the DataNodes.
  5. It's important to note that the data never passes through the NameNode, as that would cause a performance bottleneck.
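The write steps above can be sketched in the same spirit. This is a toy in-memory simulation under stated assumptions, not Hadoop's implementation: the class names, the `write_file` helper, the 4-byte block size, and the round-robin placement are all made up for illustration (real HDFS block placement is more involved). It shows the namespace check, per-block allocation, and replica forwarding from one DataNode to the next, with no file data touching the NameNode.

```python
class DataNode:
    """Toy DataNode: stores a block, then forwards it down the pipeline."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}

    def write_block(self, block_id, data, pipeline):
        self.blocks[block_id] = data
        if pipeline:
            # Forward to the next DataNode; the NameNode is not involved.
            pipeline[0].write_block(block_id, data, pipeline[1:])


class NameNode:
    """Toy NameNode: tracks the namespace and picks target DataNodes."""

    def __init__(self, datanodes, replication=3):
        self.datanodes = datanodes
        self.replication = replication
        self.namespace = {}  # path -> list of block IDs
        self.next_id = 0

    def create(self, path):
        if path in self.namespace:
            raise FileExistsError(path)
        self.namespace[path] = []

    def allocate_block(self, path):
        block_id = f"blk_{self.next_id}"
        n = len(self.datanodes)
        # Assumption: simple round-robin placement, chosen for brevity.
        targets = [self.datanodes[(self.next_id + i) % n]
                   for i in range(min(self.replication, n))]
        self.next_id += 1
        self.namespace[path].append(block_id)
        return block_id, targets


def write_file(path, data, namenode, block_size=4):
    namenode.create(path)  # steps 1-2: namespace update or error
    for offset in range(0, len(data), block_size):
        block_id, targets = namenode.allocate_block(path)  # step 3
        chunk = data[offset:offset + block_size]
        targets[0].write_block(block_id, chunk, targets[1:])  # step 4


nodes = [DataNode(f"dn{i}") for i in range(4)]
nn = NameNode(nodes, replication=3)
write_file("/demo.txt", b"abcdefghij", nn)

# Count how many DataNodes hold each block of the file.
replica_counts = {b: sum(b in dn.blocks for dn in nodes)
                  for b in nn.namespace["/demo.txt"]}
print(replica_counts)
```

In this sketch the client sends each block only to the first target; the remaining replicas are created by DataNode-to-DataNode forwarding, which mirrors the point in step 5 that the NameNode stays out of the data path.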