Characteristics of Big Data
Application of Big Data Processing
Introduction to BIG DATA
Where to get Big Data?
Types of Big Data
Storage layer - HDFS (Hadoop Distributed File System)
MapReduce
YARN
How Hadoop works?
Hadoop Eco System
Hadoop Architecture
Hadoop Installation & Environment Setup
Setting Up A Single Node Hadoop Cluster
Ubuntu User Configuration
SSH Setup With Key Generation
Disable IPv6
Download and Install Hadoop 3.1.2
Working with Configuration Files
Start The Hadoop instances
Hadoop Distributed File System (HDFS)
HDFS Features and Goals
HDFS Architecture
Read Operations in HDFS
Write Operations In HDFS
HDFS Operations
YARN
YARN Features
YARN Architecture
Resource Manager
Node Manager
Application Master
Container
Application Workflow in Hadoop YARN
Hadoop MapReduce
How MapReduce Works?
MapReduce Examples with Python
Running The MapReduce Program & Storing The Data File To HDFS
Create A Python Script
Hadoop Environment Setup
Execute The Script
Apache Hive Definition
Why Apache Hive?
Features Of Apache Hive
Hive Architecture
Hive Metastore
Hive Query Language
SQL vs Hive
Hive Installation
Apache Pig Definition
MapReduce vs. Apache Pig vs. Hive
Apache Pig Architecture
Installation Process Of Apache Pig
Execute Apache Pig Script
Hadoop Eco Components
NoSQL Data Management
Apache HBase
Apache Cassandra
MongoDB
Introduction To Kafka
The Architecture of Apache Flume
Apache Spark Ecosystem
In this section, I will walk you through installing Apache Pig step by step. We assume that Java, Hadoop, and HDFS are already configured correctly. Please verify those configurations first, and if everything is fine, proceed with the installation of Pig.
Step 1: We start by downloading Apache Pig. Go to the page https://pig.apache.org/releases.html and follow the download links.
I will work with Pig 0.17.0, the version that the configuration lines in Step 5 refer to. Please use the same version.
After downloading the file, go to the download location and confirm that the download completed correctly. Usually, the file will be in the /home/hduser/Downloads directory.
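A quick way to confirm the download is sketched below. The archive name pig-0.17.0.tar.gz is an assumption based on the version used in Step 5; adjust the path if your file is named or located differently.

```shell
# Assumed archive name and location; change these if your download differs.
PIG_TARBALL=/home/hduser/Downloads/pig-0.17.0.tar.gz

if [ -f "$PIG_TARBALL" ]; then
    # Show the size and timestamp so a truncated download is easy to spot.
    ls -lh "$PIG_TARBALL"
else
    echo "Archive not found: $PIG_TARBALL"
fi
```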
Step 2: Move the file to a suitable location. It is recommended to keep all the Hadoop-related applications in the same location. I am going to move it to the /usr/local directory.
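The move might look like the following sketch. The archive name is an assumption, and sudo is used because /usr/local is normally writable only by root.

```shell
# Assumed archive name; the target directory comes from the step above.
PIG_TARBALL=/home/hduser/Downloads/pig-0.17.0.tar.gz
INSTALL_DIR=/usr/local

if [ -f "$PIG_TARBALL" ]; then
    # Move the archive next to the other Hadoop-related applications.
    sudo mv "$PIG_TARBALL" "$INSTALL_DIR/"
else
    echo "Nothing to move: $PIG_TARBALL does not exist"
fi
```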
Step 3: Now, extract the archive in the same location.
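Assuming the download is a gzip-compressed tar archive named pig-0.17.0.tar.gz (an assumption, check your actual file name), the extraction can be sketched as:

```shell
INSTALL_DIR=/usr/local
ARCHIVE="$INSTALL_DIR/pig-0.17.0.tar.gz"   # assumed archive name

if [ -f "$ARCHIVE" ]; then
    # -x extract, -z gunzip, -f archive file, -C unpack into this directory
    sudo tar -xzf "$ARCHIVE" -C "$INSTALL_DIR"
    # The archive is expected to unpack into a pig-0.17.0 directory.
    ls -d "$INSTALL_DIR/pig-0.17.0"
else
    echo "Archive not found: $ARCHIVE"
fi
```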
Step 4: Change the owner of the directory to hduser and set the proper permissions on it.
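One possible way to do this is shown below. The exact permissions are not specified above, so the chmod mode here is only an assumption: it gives the owner read and write access everywhere, plus execute access on directories and on files that are already executable.

```shell
PIG_DIR=/usr/local/pig-0.17.0
PIG_USER=hduser

if [ -d "$PIG_DIR" ] && id "$PIG_USER" >/dev/null 2>&1; then
    # Hand the whole Pig directory tree over to hduser.
    sudo chown -R "$PIG_USER" "$PIG_DIR"
    # Assumed permission set: owner may read, write, and traverse.
    sudo chmod -R u+rwX "$PIG_DIR"
    ls -ld "$PIG_DIR"
else
    echo "Skipping: $PIG_DIR or user $PIG_USER does not exist"
fi
```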
Step 5: In this step, we will add PIG_HOME to the .bashrc file. Add the lines below to the file.
# Set PIG_HOME
export PIG_HOME=/usr/local/pig-0.17.0
export PATH=$PATH:/usr/local/pig-0.17.0/bin
export PIG_CLASSPATH=$HADOOP_CONF_DIR/conf
After saving and closing the .bashrc file, execute the command below to apply the changes in the current terminal.
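Reloading the file is typically done by sourcing it. The sketch below also prints PIG_HOME afterwards as a simple check; that echo line is an addition for illustration.

```shell
# Re-read ~/.bashrc in the current shell ("." is the portable form of "source").
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi

# Confirm that the variable is now visible in this terminal.
PIG_HOME_STATUS="${PIG_HOME:-not set}"
echo "PIG_HOME is: $PIG_HOME_STATUS"
```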
Step 6: Now, check the Pig version using the command below. If every previous step succeeded, it prints the installed version on the screen.
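The version check can be run as follows. The guard around it is only a convenience added here so that a missing PATH entry produces a readable message.

```shell
# Locate the pig launcher; empty if it is not on PATH.
PIG_BIN="$(command -v pig || true)"

if [ -n "$PIG_BIN" ]; then
    pig -version
else
    echo "pig was not found on PATH; re-check the PIG_HOME and PATH lines in ~/.bashrc"
fi
```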
Step 7: We can list the Pig commands and their details using the help option.
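A minimal sketch of calling the help option, guarded so that it reports a clear message if Pig is not yet on the PATH:

```shell
# Locate the pig launcher; empty if it is not on PATH.
PIG_LAUNCHER="$(command -v pig || true)"

if [ -n "$PIG_LAUNCHER" ]; then
    # Prints the supported command-line options.
    pig -help
else
    echo "pig was not found on PATH"
fi
```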
Step 8: Run the pig command to start the Grunt shell. The Grunt shell is used to execute Pig Latin scripts.
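The sketch below shows one way to try this out. Typing pig with no arguments opens the interactive grunt> prompt; the non-interactive variant here, which runs a single Grunt command in local mode and exits, is an assumption chosen for illustration, and the directory it lists is arbitrary.

```shell
# Interactive use: run `pig` (or `pig -x local` for local mode),
# then type `quit` at the grunt> prompt to leave the shell.

GRUNT_CMD='fs -ls /usr/local'   # illustrative Grunt command, not from the original text

if command -v pig >/dev/null 2>&1; then
    # -x local: run against the local file system instead of the cluster
    # -e: execute the quoted Grunt command and exit
    pig -x local -e "$GRUNT_CMD"
else
    echo "pig was not found on PATH"
fi
```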
We are now ready to work on Pig.