
Hadoop Tutorial

Execute Apache Pig Script

In this section, we will create a Pig script and execute it.

Step 9: First, we are going to create a text file. We can use any text editor to do this.

 

$ sudo nano employee

We will add a few sample lines of data to it. The sample data file contains four columns, FirstName, LastName, ID, and Dept, separated by the Tab key. Our goal is to read the content of this file from HDFS and display specific columns of these records.
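The tutorial does not show the sample data itself, so one possible file might look like the following; the names, IDs, and departments here are illustrative assumptions:

```shell
# Create a tab-separated sample employee file.
# Columns: FirstName, LastName, ID, Dept (values below are made up for illustration).
printf 'Anna\tSmith\t101\tSales\n'  >  employee
printf 'Brian\tJones\t102\tHR\n'    >> employee
printf 'Carla\tLopez\t103\tIT\n'    >> employee

# Show what we wrote
cat employee
```

Any tab-separated file with these four columns would work equally well.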

Step 10: We have to start all the Hadoop services (start-all.sh), as we will be working on HDFS. First, we will create a pig directory on HDFS and grant the necessary permissions on it.
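The commands for this step might look like the following sketch; the directory name /pig and the permission mode are assumptions, since the tutorial does not show the exact commands:

```shell
# Start all Hadoop daemons (HDFS and YARN)
start-all.sh

# Create a pig directory on HDFS and open up its permissions
# (the path /pig and mode 777 are assumed for illustration)
hdfs dfs -mkdir /pig
hdfs dfs -chmod 777 /pig
```

These commands require a configured, running Hadoop cluster.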

Step 11: In this step, we are going to store the employee text file in the pig directory on HDFS. We will use the command below to do this.
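A sketch of the upload, assuming the HDFS directory from the previous step is /pig:

```shell
# Copy the local employee file into the HDFS pig directory
hdfs dfs -put employee /pig/

# Verify that the file arrived
hdfs dfs -ls /pig
```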

Check the file in the Hadoop web browser (the NameNode file browser):

Step 12: Now, we will create a Pig script using an editor (nano). The following command creates an out.pig file inside the home directory of the hduser user.
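The command might look like this; the exact home path /home/hduser is assumed from the description above:

```shell
# Create/open the script file in hduser's home directory
nano /home/hduser/out.pig
```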

Write the following Pig commands in the out.pig file.
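Based on the description below, the script might look like this sketch; the HDFS path /pig/employee and the field types are assumptions:

```pig
-- Load the tab-separated file from HDFS with a declared schema
A = LOAD '/pig/employee' USING PigStorage('\t')
        AS (FName:chararray, LName:chararray, ID:int, Dept:chararray);

-- Project only the FName, ID, and Dept columns
B = FOREACH A GENERATE FName, ID, Dept;

-- Print the result to the terminal
DUMP B;
```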

The first command loads the employee file into relation A with a declared schema (FName, LName, ID, and Dept). The second command projects the FName, ID, and Dept columns from A into relation B. The final line displays the content of B on the terminal.

Step 13: We will execute the out.pig file using the command below.
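The invocation is simply the pig command followed by the script name; this assumes pig is on the PATH and the cluster from Step 10 is still running:

```shell
# Run the script with the Pig interpreter (MapReduce mode is the default)
pig out.pig
```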

It will display the desired output, the FName, ID, and Dept columns of each record, in the terminal.