Introduction to Big Data :
The term Big Data refers to data of huge volume, high velocity and wide variety. The amount of such data is growing tremendously day by day, and traditional data management systems and existing tools have difficulty processing it.
Big Data is one of the most important technology areas in the modern world, and storing and managing it well is critical. Big data is a collection of large datasets that cannot be processed using traditional computing techniques.
Big Data covers huge volume, high velocity and an extensible variety of data. The data may be structured, semi-structured or unstructured. Big data also involves various tools, techniques and frameworks.
Big Data Characteristics –
(1) Volume :
- A huge amount of data is generated by big data applications.
- Both the amount of data generated and the storage volume it needs are very large.
(2) Velocity :
- For time-critical applications, fast processing is very important.
- Huge amounts of data are generated and stored, which requires a high processing speed.
- The amount of digital data is expected to double roughly every 18 months, and this doubling may take even less time in future.
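The effect of doubling every 18 months can be seen with a little arithmetic. The sketch below is only an illustration: the function name `projected_volume` and the starting figure of 100 TB are assumptions made for the example, and real growth will not follow a perfectly steady curve.

```python
def projected_volume(initial, months, doubling_period_months=18.0):
    """Projected data volume after `months`, assuming steady doubling."""
    return initial * 2 ** (months / doubling_period_months)


# Hypothetical starting point: 100 TB of stored data today.
for months in (18, 36, 54):
    print(months, "months:", projected_volume(100.0, months), "TB")
```

Under these assumptions the volume grows eightfold in four and a half years, which is why processing speed matters as much as storage.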
(3) Variety :
- The type and nature of the data vary greatly.
- It includes both structured and unstructured data.
(4) Veracity :
- Data is not captured in any one fixed format.
- Data capture can vary greatly, so the accuracy of analysis depends on the veracity of the source data.
Additional Characteristics of Big Data –
(1) Programmable :
- With big data it is possible to explore all types of programming logic.
- Programming can be used to perform any kind of exploration because of the scale of the data.
(2) Data Driven :
- A data-driven approach is possible for scientists.
- In a data-driven approach, data is collected in huge amounts.
(3) Multi Attributes :
- It is possible to deal with many gigabytes of data consisting of thousands of attributes.
- All data operations now happen on a large scale.
Types of Big Data –
(1) Structured Data :
Structured data is data that has a definite length and format. For example, an RDBMS table has a fixed number of columns, and the data grows by adding rows.
Example : Structured data includes marks stored as numbers, dates, or fields made of words and numbers. Structured data is simple to deal with and easy to store in a database.
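The idea of a fixed set of columns with data growing by rows can be sketched with Python's built-in `sqlite3` module. The table name `marks`, its columns and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# An in-memory database, so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# The columns are fixed when the table is created.
conn.execute(
    "CREATE TABLE marks (student TEXT, subject TEXT, score INTEGER, exam_date TEXT)"
)

# The data grows by adding rows, never by changing the columns.
rows = [
    ("Asha", "Maths", 82, "2023-03-01"),
    ("Ravi", "Maths", 74, "2023-03-01"),
]
conn.executemany("INSERT INTO marks VALUES (?, ?, ?, ?)", rows)

column_names = [d[0] for d in conn.execute("SELECT * FROM marks").description]
row_count = conn.execute("SELECT COUNT(*) FROM marks").fetchone()[0]
average_score = conn.execute("SELECT AVG(score) FROM marks").fetchone()[0]

print(column_names, row_count, average_score)
```

Because every row has the same columns, a query such as the average score needs no extra parsing or cleaning.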
(2) Unstructured Data :
Unstructured data is data collected in whatever form it is available, without being restricted to any format, such as audio, video and web blog data.
Example : Unstructured data includes video recordings from CCTV surveillance.
(3) Semi-structured Data :
Along with structured and unstructured data, there is also semi-structured data. Semi-structured data is information that does not reside in an RDBMS but still carries some organizing structure. It may be organized in a tree pattern, which is easier to analyze in some cases. Examples of semi-structured data include XML documents and records held in NoSQL databases.
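The tree pattern of an XML document can be sketched with Python's standard `xml.etree.ElementTree` module. The sample document, its element names and the helper `walk` are made up for this illustration. Note that the second book has no `tag` element, which a fixed-column table would not allow so easily.

```python
import xml.etree.ElementTree as ET

# A hypothetical XML document; records need not share the same fields.
xml_text = """
<library>
  <book id="1"><title>Data Basics</title><tag>intro</tag></book>
  <book id="2"><title>Streams</title></book>
</library>
""".strip()

root = ET.fromstring(xml_text)


def walk(element, depth=0):
    """Return the element names as indented lines, one per tree node."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(walk(child, depth + 1))
    return lines


titles = [book.findtext("title") for book in root.findall("book")]
tags = [book.findtext("tag") for book in root.findall("book")]  # None if missing

print("\n".join(walk(root)))
print(titles, tags)
```

The structure travels with the data itself, so each record can be read without a schema defined in advance.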
(4) Hybrid Data :
Some systems make use of both structured and unstructured data to gain a competitive advantage. Structured data offers simplicity, whereas unstructured data provides a large amount of detail about a topic.