Characteristics of Big Data
Application of Big Data Processing
Introduction to BIG DATA
Where to get Big Data?
Types of Big Data
Storage layer - HDFS (Hadoop Distributed File System)
MapReduce
YARN
How Hadoop Works?
Hadoop Ecosystem
Hadoop Architecture
Hadoop Installation & Environment Setup
Setting Up A Single Node Hadoop Cluster
Ubuntu User Configuration
SSH Setup With Key Generation
Disable IPv6
Download and Install Hadoop 3.1.2
Working with Configuration Files
Start The Hadoop instances
Hadoop Distributed File System (HDFS)
HDFS Features and Goals
HDFS Architecture
Read Operations in HDFS
Write Operations In HDFS
HDFS Operations
YARN
YARN Features
YARN Architecture
Resource Manager
Node Manager
Application Master
Container
Application Workflow in Hadoop YARN
Hadoop MapReduce
How MapReduce Works?
MapReduce Examples with Python
Running The MapReduce Program & Storing The Data File To HDFS
Create A Python Script
Hadoop Environment Setup
Execute The Script
Apache Hive Definition
Why Apache Hive?
Features Of Apache Hive
Hive Architecture
Hive Metastore
Hive Query Language
SQL vs Hive
Hive Installation
Apache Pig Definition
MapReduce vs. Apache Pig vs. Hive
Apache Pig Architecture
Installation Process Of Apache Pig
Execute Apache Pig Script
Hadoop Ecosystem Components
NoSQL Data Management
Apache HBase
Apache Cassandra
MongoDB
Introduction To Kafka
The Architecture of Apache Flume
Apache Spark Ecosystem
This is a practical session in which you will learn the basic operations of HDFS. We will jump straight in and perform some operations first, and then take a step back to understand what is happening behind the scenes. We will cover the following operations:
Creating new directories
Listing files and directories
Copying files between the local file system and HDFS
Removing files
1 Starting HDFS
When logging in to the configured HDFS file system for the first time, open the NameNode (HDFS server) and format it by executing the following command:
$ hadoop namenode -format
After formatting HDFS, start the Hadoop services. The commands below start the NameNode as well as the DataNodes as a cluster.
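As a sketch, assuming a standard Hadoop 3.x installation with the sbin directory on the PATH, the usual start-up scripts are:

$ start-dfs.sh    # starts the NameNode, SecondaryNameNode, and DataNodes
$ start-yarn.sh   # optionally, starts the YARN ResourceManager and NodeManagers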
2 Create new directories:
To create new directories, we can use hdfs dfs -mkdir. The hdfs dfs commands are used mainly to interact with HDFS. We are going to create a hduser directory under the user directory. Use the command below to create an empty directory.
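For example (the exact path /user/hduser is an assumption based on the description above):

$ hdfs dfs -mkdir -p /user/hduser   # -p also creates the parent /user if it is missing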
3 Check the directories in the browser:
To list the directories, we can use hdfs dfs -ls. The command below lists the contents of a directory:
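For instance, to list what is under /user (path assumed from the previous step):

$ hdfs dfs -ls /user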
Now we will review the directories from the web browser. Go to the address below to start the session in the browser.
localhost:9870
Then follow the steps below to check your directories:
Go to the Utilities option and click Browse the file system.
In the search area, enter your parent directory and click Go.
All the directories under the user directory will be listed in the section below.
4 Create the local files:
We will create two files, test.txt and abc.txt, on the local file system using the touch command. After that, we will write some text into abc.txt using an editor (nano/vim/gedit). Finally, save and close the file.
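A sketch of these steps (file names from the text; the choice of editor is up to you):

$ touch test.txt abc.txt   # create two empty files in the current directory
$ nano abc.txt             # type some text, then save and exit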
5 Copy Local files to HDFS:
We can use hdfs dfs -copyFromLocal or hdfs dfs -put to copy files from the local file system to HDFS. Use the command below in the terminal to transfer and store the data files from the local system to HDFS.
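For example, assuming the target directory /user/hduser created earlier:

$ hdfs dfs -put test.txt abc.txt /user/hduser   # or: hdfs dfs -copyFromLocal test.txt abc.txt /user/hduser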
6 Check the files in the browser:
First, we will check the files in our specific directory using the ls command.
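For instance (directory assumed from the previous steps):

$ hdfs dfs -ls /user/hduser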
We can also check the files from the browser. Enter the specific directory in the search area and click Go; all the files available in that directory will be displayed.
To check the content inside a file, click the file name, and you will be able to view its contents.
7 Copy the HDFS file to the local system:
For this, create a new directory on your local system. To copy files from HDFS to the local file system, we can use hdfs dfs -copyToLocal or hdfs dfs -get. Use the command below to copy the file abc.txt from HDFS to the local directory.
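A sketch, assuming a local target directory named hdfs_files (the directory name is an assumption):

$ mkdir ~/hdfs_files
$ hdfs dfs -copyToLocal /user/hduser/abc.txt ~/hdfs_files/   # or: hdfs dfs -get /user/hduser/abc.txt ~/hdfs_files/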
We can check its availability using the ls command.
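Using the assumed directory from the previous step:

$ ls ~/hdfs_files   # abc.txt should now appear here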
8 Delete a file from HDFS:
To remove files, we can use hdfs dfs -rm. We will delete the empty file test.txt from HDFS using the command below.
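Assuming the path used in the earlier steps:

$ hdfs dfs -rm /user/hduser/test.txt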
We can confirm this from the browser: the file test.txt is no longer available in our HDFS system.
9 Shutting Down the HDFS
Shut down the HDFS services with the command below.
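Assuming the same sbin scripts used for start-up:

$ stop-dfs.sh    # stops the NameNode, SecondaryNameNode, and DataNodes
$ stop-yarn.sh   # stops the YARN services, if start-yarn.sh was used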