In a previous post I created an HDFS/YARN cluster with only 4 nodes. Now that I've been granted more resources, it's time to extend the capacity.
Here I plan to add another 4 hosts to the cluster, for both HDFS and YARN. Please note that if the host profile (mostly RAM) is not the same, you should adjust yarn.nodemanager.resource.memory-mb in etc/hadoop/yarn-site.xml accordingly.
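For example, the property would look like the fragment below. The value is an assumption for a hypothetical 64 GB host; size it for your own hardware.

```xml
<!-- etc/hadoop/yarn-site.xml (fragment) -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <!-- Memory (in MB) this NodeManager may hand out to containers.
       57344 MB = 56 GB, leaving headroom for the OS and daemons on an
       assumed 64 GB host; adjust to your hardware. -->
  <value>57344</value>
</property>
```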
1. environment setup
First of all, let's set up the environment on each new host.
a. enable passwordless SSH access to the newly added hosts;
b. copy hadoop-2.7.4.tar.gz, jdk-7u71-linux-x64.tar.gz and jdk-8u144-linux-x64.tar.gz to all new hosts, and unpack them;
tar -zxf hadoop-2.7.4.tar.gz ; tar -zxf jdk-7u71-linux-x64.tar.gz ; tar -zxf jdk-8u144-linux-x64.tar.gz;
c. create local directories
mkdir -p /mnt/dfs/namenode /mnt/dfs/data /mnt/yarn/nm-local-dir /mnt/yarn/nm-log /mnt/dfs/journal; chown -R stack:stack /mnt/dfs/ /mnt/yarn
d. add the profile script /etc/profile.d/hadoop.sh
HADOOP_HOME=/home/stack/hadoop-2.7.4
export HADOOP_HOME
HADOOP_PREFIX=/home/stack/hadoop-2.7.4
export HADOOP_PREFIX
export HADOOP_CONF_DIR=/home/stack/hadoop-2.7.4/etc/hadoop
export YARN_CONF_DIR=/home/stack/hadoop-2.7.4/etc/hadoop
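Steps a–c above can be scripted in one loop. A minimal sketch, assuming the new hosts are named node5–node8 and everything belongs to the stack user (both are placeholders); by default it only echoes the remote commands so you can review them before running for real:

```shell
#!/bin/sh
# Dry-run by default: every remote command is printed instead of executed.
# Set RUN= (empty) to actually execute.
RUN=${RUN:-echo}
NEW_HOSTS="node5 node6 node7 node8"   # hypothetical host names

for h in $NEW_HOSTS; do
  # a. passwordless SSH access
  $RUN ssh-copy-id "stack@$h"
  # b. copy and unpack the tarballs
  $RUN scp hadoop-2.7.4.tar.gz jdk-7u71-linux-x64.tar.gz jdk-8u144-linux-x64.tar.gz "stack@$h:"
  $RUN ssh "stack@$h" "tar -zxf hadoop-2.7.4.tar.gz; tar -zxf jdk-7u71-linux-x64.tar.gz; tar -zxf jdk-8u144-linux-x64.tar.gz"
  # c. create the local directories (needs root on the remote side)
  $RUN ssh "stack@$h" "sudo mkdir -p /mnt/dfs/namenode /mnt/dfs/data /mnt/yarn/nm-local-dir /mnt/yarn/nm-log /mnt/dfs/journal && sudo chown -R stack:stack /mnt/dfs /mnt/yarn"
  # d. (the /etc/profile.d/hadoop.sh profile still has to be installed as root)
done
```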
2. DFS/YARN configuration
Since all my new hosts have the same profile as the existing ones, I only need to append the new hostnames to etc/hadoop/slaves
and copy all configuration files to the new hosts.
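With hypothetical hostnames (node1–node4 being the original workers), the worker list would end up looking like this:

```
# etc/hadoop/slaves -- one worker hostname per line
node1
node2
node3
node4
node5
node6
node7
node8
```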
3. enable DataNode/NodeManager services
$HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
$HADOOP_HOME/sbin/yarn-daemon.sh --config $HADOOP_HOME/etc/hadoop start nodemanager
Now you should see the new nodes in both the HDFS and YARN clusters.
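To double-check from the command line (assuming the environment variables set in the profile above): hdfs dfsadmin -report prints the live datanode count, and yarn node -list shows the registered NodeManagers.

```
$HADOOP_PREFIX/bin/hdfs dfsadmin -report
$HADOOP_PREFIX/bin/yarn node -list
```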