What is Big Data
 
* Every day, the world creates 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone.

* Gartner defines Big Data as high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.

* According to IBM, 80% of the data captured today is unstructured: it comes from sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. All of this unstructured data is also Big Data.


* Huge competition in the market drives the use of data in many areas:
                 - Retail – customer analytics
                 - Travel – customers' travel patterns
                 - Websites – understanding users' navigation patterns, interests, conversions, etc.
                 - Sensor, satellite and geospatial data
                 - Military and intelligence
 
1.  Volume
      - Today we live in a world of data, and multiple factors contribute to its growth.
      - Huge volumes of data are generated from various sources:
            - Transaction-based data (stored over the years)
            - Text, images and videos from social media
            - Increasing amounts of data generated by sensors
      - Turn the 12 terabytes of Tweets created each day into improved product sentiment analysis
      - Convert 350 billion annual meter readings to better predict power consumption
      - Analyze billions of customer complaints to find the root cause of customer churn
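
To get a feel for these volumes, the figures quoted above can be converted into per-second rates. This is only back-of-the-envelope arithmetic, assuming decimal terabytes (10^12 bytes) and a 365-day year; the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365   # assumption: leap years are ignored
TERABYTE = 10 ** 12   # assumption: decimal terabyte, not 2**40 bytes


def bytes_per_second(terabytes_per_day: float) -> float:
    """Average byte rate for a given daily volume in terabytes."""
    return terabytes_per_day * TERABYTE / SECONDS_PER_DAY


def events_per_second(events_per_year: float) -> float:
    """Average event rate for a given yearly event count."""
    return events_per_year / (DAYS_PER_YEAR * SECONDS_PER_DAY)


if __name__ == "__main__":
    tweet_rate = bytes_per_second(12)       # 12 TB of Tweets per day
    meter_rate = events_per_second(350e9)   # 350 billion meter readings per year
    print(f"Tweets: about {tweet_rate / 1e6:.0f} MB per second")
    print(f"Meter readings: about {meter_rate:,.0f} per second")
```

Under these assumptions the Tweets average about 139 MB per second and the meters about 11,000 readings per second. These are averages only; real traffic arrives in peaks and troughs.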

2.  Velocity 
       - According to Gartner, velocity "means both how fast data is being produced and how fast the data must be processed to meet demand."
       - Scrutinize the 5 million trade events created each day to identify potential fraud
       - Analyze customers' searching and buying patterns and show them advertisements for attractive offers in real time
       - Take Google as an example of processing speed:
             - As soon as a blog post is published, it appears in the search results.
             - Even the ads shown in mail are highly content-driven.
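
The trade-event example can be made a little more concrete. Five million events a day averages out to roughly 58 events per second, so any check has to be cheap per event. The sketch below keeps a sliding time window per account and flags an account that produces too many events inside that window. The rule itself, the threshold, the window length and all names (`BurstDetector`, `max_events`, `window_seconds`) are assumptions chosen for illustration, not a real fraud model, and the code assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque


class BurstDetector:
    """Flags accounts that produce too many events inside a sliding time window."""

    def __init__(self, max_events: int, window_seconds: float):
        self.max_events = max_events
        self.window_seconds = window_seconds
        # account id -> timestamps of events still inside the window
        self._recent = defaultdict(deque)

    def observe(self, account: str, timestamp: float) -> bool:
        """Record one event; return True if the account now exceeds the limit."""
        window = self._recent[account]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_events


if __name__ == "__main__":
    detector = BurstDetector(max_events=3, window_seconds=1.0)
    # Hypothetical (account, timestamp in seconds) pairs, already in time order.
    events = [("acct-1", 0.0), ("acct-2", 0.1), ("acct-1", 0.2),
              ("acct-1", 0.3), ("acct-1", 0.4), ("acct-1", 5.0)]
    for account, ts in events:
        if detector.observe(account, ts):
            print(f"possible burst from {account} at t={ts}")
```

Each event costs a constant amount of work on average, because every timestamp is appended once and removed once. That keeps the per-event cost low enough for a check that has to keep up with the incoming stream.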
 
3.  Variety
       -  Data today comes in all types of formats – from traditional databases to hierarchical data stores created by end users and OLAP systems, to text documents, email, meter-collected data, video, audio, stock ticker data and financial transactions.