
Hadoop Tutorial

Types of Big Data

We can classify big data into three main categories.

Structured Data:

Structured data refers to data that resides in a fixed field within a record or file, which includes data contained in relational databases and spreadsheets. It has the advantage of being easily entered, queried, stored, and analyzed. At one time, because of the high cost and performance limitations of storage, memory, and processing, relational databases and spreadsheets using structured data were the only way to manage data effectively. Nowadays, however, issues arise when the size of such data grows to a considerable extent; typical sizes are in the range of multiple zettabytes.

Example of Structured Data:

The medical history of heart patients is a good example of structured data.

Table: Heart Patient Data
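To make the idea of fixed fields concrete, the following is a minimal sketch that uses Python's built-in sqlite3 module. The table name, column names, and rows are hypothetical and invented purely for illustration; they are not the tutorial's heart patient table.

```python
import sqlite3

# Hypothetical columns and rows, invented for illustration only;
# they are not the tutorial's heart patient data.
rows = [
    (1, "Patient A", 54, 130),
    (2, "Patient B", 61, 145),
    (3, "Patient C", 47, 118),
]

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE heart_patient ("
    "id INTEGER PRIMARY KEY, name TEXT, age INTEGER, systolic_bp INTEGER)"
)
conn.executemany("INSERT INTO heart_patient VALUES (?, ?, ?, ?)", rows)

# Because every record has the same fields, a query can filter on any of them.
high_bp = conn.execute(
    "SELECT name, systolic_bp FROM heart_patient "
    "WHERE systolic_bp >= 130 ORDER BY id"
).fetchall()
print(high_bp)
conn.close()
```

Every row shares the same named, typed columns, which is what makes structured data easy to enter, query, store, and analyze.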

Unstructured Data:

Any data whose form or structure is unknown is classified as unstructured data. In addition to its huge size, unstructured data poses many challenges when it comes to processing it.

A typical example of unstructured data is a heterogeneous data source containing a combination of simple text files, images, videos, etc.

Semi-structured Data:

Semi-structured data can contain both structured and unstructured forms of data. It does not follow the fixed schema of relational data, yet it is not as free-form as text or image data.

An excellent example of semi-structured data is data represented in an XML file format.
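As a rough illustration of the XML case, the sketch below parses a small document with Python's standard xml.etree.ElementTree module. The element names and values are hypothetical and invented for this example. Note that the two records do not carry the same fields, which a relational table would not allow without a schema change.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration only.
xml_text = """
<patients>
  <patient id="1">
    <name>Patient A</name>
    <age>54</age>
  </patient>
  <patient id="2">
    <name>Patient B</name>
    <note>Free-text remark with no fixed format.</note>
  </patient>
</patients>
""".strip()

root = ET.fromstring(xml_text)

# Tags label each value, but each record may carry a different set of fields.
records = []
for patient in root.findall("patient"):
    record = {"id": patient.get("id")}
    for child in patient:
        record[child.tag] = child.text
    records.append(record)

for record in records:
    print(record)
```

The tags give each value a name, so the data is partly self-describing, while the free-text note inside the second record remains unstructured.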