
APACHE HADOOP INSTALLATION FOR UBUNTU



1)All nodes are defined in /etc/hosts with their hostnames and IP addresses.


bigdata@hidats:~$ sudo nano /etc/hosts
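Typical entries look like the following (the IP addresses and the second hostname are placeholders; use your own nodes):

```shell
192.168.1.10 hidats
192.168.1.11 node2
```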



2)SSH keys are generated and exchanged so the nodes can reach each other without a password.


bigdata@hidats:~$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa

Generating public/private rsa key pair.

Created directory '/home/bigdata/.ssh'.

Your identification has been saved in /home/bigdata/.ssh/id_rsa

Your public key has been saved in /home/bigdata/.ssh/id_rsa.pub

The key fingerprint is:

SHA256:/j07RH1ABmLdEOj/BO7yTnJcFTjGi4+ZWuNvxtXWmLE bigdata@hidats

The key's randomart image is:

(RSA 3072 randomart image omitted)


bigdata@hidats:~$ ls -lrt .ssh/

total 8

-rw-r--r-- 1 bigdata bigdata 568 Apr 6 16:28 id_rsa.pub

-rw------- 1 bigdata bigdata 2602 Apr 6 16:28 id_rsa


bigdata@hidats:~$ cat .ssh/id_rsa.pub >> ~/.ssh/authorized_keys

bigdata@hidats:~$ cat .ssh/authorized_keys

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC9xALRv5qpfcs1tyVRpFMt
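On a multi-node cluster the public key must also reach the other machines. A sketch, assuming the same `bigdata` user exists everywhere (`<other-node>` is a placeholder for each remaining host):

```shell
bigdata@hidats:~$ ssh-copy-id bigdata@<other-node>
bigdata@hidats:~$ ssh <other-node> hostname
```

The second command should return the remote hostname without prompting for a password.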


3)OpenJDK 8 is installed on all nodes.

bigdata@hidats:~$ sudo apt-get -y install openjdk-8-jdk-headless
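The installation can be verified before moving on:

```shell
bigdata@hidats:~$ java -version
```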


4)The Hadoop archive is downloaded and extracted.


bigdata@hidats:~$ tar -xvzf hadoop-3.3.4.tar.gz


bigdata@hidats:~$ mv hadoop-3.3.4 hadoop
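If the archive is not already on the machine, it can be fetched first (the mirror URL is an assumption; verify it against the Apache download page):

```shell
bigdata@hidats:~$ wget https://archive.apache.org/dist/hadoop/common/hadoop-3.3.4/hadoop-3.3.4.tar.gz
```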


5)Environment variables are set in ~/.bashrc.

bigdata@hidats:~$ nano ~/.bashrc


export HADOOP_HOME=/home/bigdata/hadoop

export PATH=$PATH:$HADOOP_HOME/bin

export PATH=$PATH:$HADOOP_HOME/sbin

export HADOOP_MAPRED_HOME=${HADOOP_HOME}

export HADOOP_COMMON_HOME=${HADOOP_HOME}

export HADOOP_HDFS_HOME=${HADOOP_HOME}

export YARN_HOME=${HADOOP_HOME}

export HDFS_NAMENODE_USER="bigdata"

export HDFS_DATANODE_USER="bigdata"

export HDFS_SECONDARYNAMENODE_USER="bigdata"

export YARN_RESOURCEMANAGER_USER="bigdata"

export YARN_NODEMANAGER_USER="bigdata"

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
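After saving, reload the file and confirm the Hadoop binaries resolve on the PATH:

```shell
bigdata@hidats:~$ source ~/.bashrc
bigdata@hidats:~$ hadoop version
```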


6)JAVA_HOME is exported in Hadoop's hadoop-env.sh.

bigdata@hidats:~$ vi ~/hadoop/etc/hadoop/hadoop-env.sh


export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64


7)Configuring Hadoop core-site.xml

bigdata@hidats:~$ vi ~/hadoop/etc/hadoop/core-site.xml


<configuration>

<property>

<name>fs.defaultFS</name>

<value>hdfs://hidats:9000</value>

</property>

</configuration>
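Once Hadoop is on the PATH, the effective setting can be double-checked (a sketch, assuming the configuration above has been saved):

```shell
bigdata@hidats:~$ hdfs getconf -confKey fs.defaultFS
```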


8)A data directory is created for HDFS.


bigdata@hidats:~$ sudo mkdir -p /data

bigdata@hidats:~$ sudo chown -R bigdata:bigdata /data/

bigdata@hidats:~$ sudo chmod -R 700 /data


9) Configuring the masters file

bigdata@hidats:~$ vi ~/hadoop/etc/hadoop/masters


<master-node-IP>


10)Configuring Hadoop hdfs-site.xml

bigdata@hidats:~$ vi ~/hadoop/etc/hadoop/hdfs-site.xml


<configuration>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>file:///data</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:///data</value>

</property>

</configuration>


11)Configuring Hadoop yarn-site.xml (the memory settings should be tuned to the machine's hardware)

bigdata@hidats:~$ vi ~/hadoop/etc/hadoop/yarn-site.xml


<configuration>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

<property>

<name>yarn.resourcemanager.hostname</name>

<value>hidats</value>

</property>

<property>

<name>yarn.nodemanager.resource.memory-mb</name>

<value>1024</value>

</property>

<property>

<name>yarn.scheduler.maximum-allocation-mb</name>

<value>1024</value>

</property>

<property>

<name>yarn.scheduler.minimum-allocation-mb</name>

<value>5</value>

</property>

<property>

<name>yarn.nodemanager.vmem-check-enabled</name>

<value>false</value>

</property>

</configuration>
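The three memory values are related: the NodeManager total is the ceiling, and the scheduler min/max bound each container request. A minimal sketch of that relationship, mirroring the 1024 MB used in this guide (the totals are assumptions for a small test node, not recommendations):

```shell
# Derive the three YARN memory knobs from the RAM set aside for containers.
TOTAL_MB=1024               # yarn.nodemanager.resource.memory-mb
MAX_ALLOC_MB=$TOTAL_MB      # one container may take everything, as configured above
MIN_ALLOC_MB=5              # smallest request YARN will grant, as configured above

echo "yarn.nodemanager.resource.memory-mb=$TOTAL_MB"
echo "yarn.scheduler.maximum-allocation-mb=$MAX_ALLOC_MB"
echo "yarn.scheduler.minimum-allocation-mb=$MIN_ALLOC_MB"
```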


12)Configuring Hadoop mapred-site.xml

bigdata@hidats:~$ vi ~/hadoop/etc/hadoop/mapred-site.xml


<configuration>

<property>

<name>mapreduce.jobtracker.address</name>

<value>hidats:54311</value>

</property>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

<property>

<name>yarn.app.mapreduce.am.resource.mb</name>

<value>1024</value>

</property>

<property>

<name>mapreduce.map.memory.mb</name>

<value>1024</value>

</property>

<property>

<name>mapreduce.reduce.memory.mb</name>

<value>1024</value>

</property>

<property>

<name>yarn.app.mapreduce.am.env</name>

<value>HADOOP_MAPRED_HOME=$HADOOP_MAPRED_HOME</value>

</property>

<property>

<name>mapreduce.map.env</name>

<value>HADOOP_MAPRED_HOME=$HADOOP_MAPRED_HOME</value>

</property>

<property>

<name>mapreduce.reduce.env</name>

<value>HADOOP_MAPRED_HOME=$HADOOP_MAPRED_HOME</value>

</property>

</configuration>


13)The HDFS NameNode is formatted.

bigdata@hidats:~$ cd ~/hadoop/bin

bigdata@hidats:~/hadoop/bin$ ./hdfs namenode -format


14)The cluster is started.

bigdata@hidats:~$ cd ~/hadoop/sbin/

bigdata@hidats:~/hadoop/sbin$ ./start-all.sh


Starting resourcemanager

Starting nodemanagers


15)Cluster control: jps should list the Hadoop daemons.

bigdata@hidats:~$ jps

NameNode

SecondaryNameNode

Jps

DataNode


16)Web interface check

The NameNode web UI is available at http://<NameNode-IP>:9870 and the YARN ResourceManager UI at http://<ResourceManager-IP>:8088.





GG EASY
