
Hadoop Tutorial

Hadoop Installation & Environment Setup

Hadoop Installation

Apache Hadoop is a Java-based programming framework that supports the storage and processing of very large datasets on a cluster of inexpensive machines. It was the first major open-source project in the big data field and is sponsored by the Apache Software Foundation. In this document, I will install Hadoop 3.1.2 on Ubuntu 18.04 step by step.

Assumptions:

1. You already have an Ubuntu 18.04 machine. If not, please prepare it first.

2. You have some basic knowledge of the Linux operating system.

Environment Setup

Before we install Hadoop, one concept needs to be understood: the Hadoop framework requires a Java environment, because each of its services runs in a separate Java Virtual Machine. In this section, we will download JDK 8, then install it and set up the Java environment.

Step 1: First, run the command below to check whether Java is already available. If it is not installed, the command prints no version information, only some suggestions for installing Java.

java -version
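The same check can be scripted so that it does not depend on reading the terminal message by eye. The following is a minimal sketch; the helper name have_cmd is hypothetical and not part of the original steps.

```shell
# have_cmd is a hypothetical helper: succeeds if the named command is on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
    echo "java found at: $(command -v java)"
else
    echo "java not found, continue with the installation steps"
fi
```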

 

Step 2: Go to the link
https://www.oracle.com/technetwork/java/javase/downloads/index.html and download JDK 8 (Java SE 8u231) for Linux.

 

Fig: Download option for JDK 8

Next, on the JDK download page, accept the license agreement and download the version that matches your system (64-bit or 32-bit). I will use the 64-bit Linux version.

 

Fig: Download option for the 64-bit Linux operating system.

After the download completes, go to the Downloads folder and confirm that the file is there. There should be a JDK tar.gz archive.

Fig: Download location: /home/your_user/Downloads

Step 3: In the terminal, navigate to the location of the tar.gz archive and extract its contents into a directory of your choice. We will extract it in the Downloads folder itself.

$ cd /home/user_name/Downloads

$ sudo tar -xvf jdk-8u231-linux-x64.tar.gz

Once extraction is done, look in the same Downloads folder. There will now be a jdk folder (jdk1.8.0_231) containing several subfolders.
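The name of the extracted folder can be read from the archive itself, which is useful because that name is needed later for JAVA_HOME. The sketch below lists the top-level entries of a tar archive; the function name archive_top_dirs is hypothetical, and the archive name is the one used in this tutorial.

```shell
# archive_top_dirs is a hypothetical helper: prints the unique top-level
# entries of the tar archive given as the first argument.
archive_top_dirs() {
    tar -tf "$1" | cut -d/ -f1 | sort -u
}

# Example with the archive from this tutorial (only runs if the file exists here).
if [ -f jdk-8u231-linux-x64.tar.gz ]; then
    archive_top_dirs jdk-8u231-linux-x64.tar.gz
fi
```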

Step 4: You’ll most likely need to set the JAVA_HOME variable so other applications can find the location of your Java installation. First open the file “/etc/environment” with a text editor (gedit, vim, or nano). I will use the nano editor throughout this tutorial.

sudo nano /etc/environment

This opens the environment file. Add the JAVA_HOME line as below, replacing the path with the location of your own extracted jdk folder.

JAVA_HOME="/home/ehsan/Downloads/jdk1.8.0_231"

Now press Ctrl+X to exit. nano will ask whether to save the file; press Y to save the changes and then Enter to confirm.

Finally, load the file into the current shell using the command below. Because the variable is stored in /etc/environment, it will also be set automatically at every future login.

source /etc/environment

Now, if we run echo $JAVA_HOME, it should return the java installation location.

echo $JAVA_HOME

output: /home/ehsan/Downloads/jdk1.8.0_231
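Printing the variable only shows that it is set, not that it points at a real JDK. A minimal sketch of a stricter check follows; the function name is_jdk_home is hypothetical, and the check assumes that a JDK folder contains an executable bin/java.

```shell
# is_jdk_home is a hypothetical helper: succeeds if the given directory
# is non-empty and contains an executable bin/java.
is_jdk_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if is_jdk_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks valid: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no bin/java: ${JAVA_HOME:-<unset>}"
fi
```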

Step 5: We have finished installing Java and have set JAVA_HOME in /etc/environment. Now run the command below to check the Java version; this is another confirmation that the Java installation completed properly. If the shell reports that java cannot be found, the jdk bin folder ($JAVA_HOME/bin) is probably not on your PATH yet.

java -version
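The steps above set JAVA_HOME but do not add the JDK's bin folder to the PATH, so the java command may still not be found. One way to handle that for the current shell is sketched below. The function name prepend_to_path is hypothetical, and the default path is only the example location used in this tutorial.

```shell
# prepend_to_path is a hypothetical helper: puts a directory at the front
# of PATH unless it is already listed there.
prepend_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

# Fall back to the tutorial's example location if JAVA_HOME is not set.
JAVA_HOME="${JAVA_HOME:-/home/ehsan/Downloads/jdk1.8.0_231}"
export JAVA_HOME
prepend_to_path "$JAVA_HOME/bin"
export PATH

# Only call java if it can now be resolved.
if command -v java >/dev/null 2>&1; then
    java -version
fi
```

To keep this setting across sessions, the same lines could be added to a shell startup file such as ~/.bashrc.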