
Setting the HDFS block size for all the files in a cluster

In this recipe, we are going to take a look at how to set the block size at the cluster level.

Getting ready

To perform this recipe, you should already have a running Hadoop cluster.

How to do it...

The HDFS block size can be configured for all the files in the cluster or for a single file as well. To change the block size at the cluster level, we need to modify the hdfs-site.xml file.

By default, the HDFS block size is 128MB. If we want to change this, we need to update the dfs.blocksize property (its older name, dfs.block.size, is deprecated in Hadoop 2.X but still honored), as shown in the following code. This property changes the default block size to 64MB, that is, 67108864 bytes:

<property>
    <name>dfs.blocksize</name>
    <value>67108864</value>
    <description>HDFS block size</description>
</property>
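
As a side note, on Hadoop 2.X the dfs.blocksize value also accepts human-readable size suffixes, so the same 64MB setting can be written more legibly. The following is an equivalent sketch of the property above:

<property>
    <name>dfs.blocksize</name>
    <value>64m</value>
    <description>HDFS block size</description>
</property>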

If you have a multi-node Hadoop cluster, you should update this file on all the nodes, that is, on the NameNode and on every DataNode. Make sure you save these changes and restart the HDFS daemons:

/usr/local/hadoop/sbin/stop-dfs.sh
/usr/local/hadoop/sbin/start-dfs.sh
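
Once the daemons are back up, you can confirm that the new value is in effect. The following is a minimal check, where sample.txt and the /tmp destination are hypothetical names used only for illustration:

# Print the effective block size from the configuration (in bytes)
hdfs getconf -confKey dfs.blocksize

# Upload a file, then print the block size it was actually written with
hdfs dfs -put sample.txt /tmp/sample.txt
hdfs dfs -stat %o /tmp/sample.txt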

This sets the block size for files that will now get added to the HDFS cluster. Note that this does not change the block size of the files that are already present in HDFS; there is no way to change the block size of an existing file other than rewriting it.
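
Relatedly, the block size can also be set for a single file at write time by overriding the property on the command line, and the same trick rewrites an existing file with a new block size. A minimal sketch follows, where big.log, old.log, and /data are hypothetical names:

# Write one file with a 64MB block size, overriding the cluster default
hdfs dfs -D dfs.blocksize=67108864 -put big.log /data/big.log

# Existing files cannot be altered in place; copy out and rewrite instead
hdfs dfs -get /data/old.log /tmp/old.log
hdfs dfs -D dfs.blocksize=67108864 -put -f /tmp/old.log /data/old.log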

How it works...

By default, the HDFS block size is 128MB for Hadoop 2.X. Sometimes, we may want to change this default for optimization purposes; for instance, a larger block size means fewer blocks per large file, which reduces the NameNode's metadata load and the number of map tasks spawned for that file. Once this configuration is successfully updated, all new files will be saved in blocks of this size. Note that these changes do not affect the files that are already present in HDFS; their block size was fixed at the time they were written.
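
To verify this behavior, hdfs fsck can list the blocks that make up any file; a quick sketch, again using the hypothetical path /data/big.log:

# Show the files, blocks, and block locations under the given path
hdfs fsck /data/big.log -files -blocks -locations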
