Hadoop Tutorial

How Does Hadoop Work?

A Hadoop skill set requires in-depth knowledge of every layer of the Hadoop stack: understanding the components of the Hadoop architecture, designing a Hadoop cluster, tuning its performance, and setting up the processing toolchain that runs on top of it.

Hadoop follows a master-slave architecture for data storage and distributed data processing, using HDFS and MapReduce respectively. The master node for data storage in HDFS is the NameNode, and the master node for parallel data processing with MapReduce is the JobTracker. The slave nodes are the other machines in the Hadoop cluster; they store the data and perform the computations. Every slave node runs a DataNode daemon, which reports to the NameNode, and a TaskTracker daemon, which reports to the JobTracker. In a Hadoop deployment, the master and slave machines can be set up in the cloud or in a local environment.

Fig: How Hadoop works
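
To make this division of labor concrete, below is a minimal sketch of the canonical word-count MapReduce job in Java, adapted from the standard Hadoop example. The map tasks run on the slave nodes against HDFS blocks served by the local DataNodes, and the reduce tasks aggregate the per-word counts. The input and output paths are placeholders supplied on the command line.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: runs on the slave nodes; emits (word, 1) for each token in its input split.
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts emitted for each distinct word.
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation on the map side
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

A typical invocation, assuming the class is packaged as wordcount.jar (a hypothetical name), is: hadoop jar wordcount.jar WordCount /user/hadoop/input /user/hadoop/output. The client submits the job to the master (the JobTracker in classic Hadoop), which schedules the map and reduce tasks on the TaskTrackers of the slave nodes.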