Hadoop vs Apache Spark - PowerPoint PPT Presentation

About This Presentation

Title:

Hadoop vs Apache Spark

Slides: 18

Provided by: valuecodersvc

Transcript and Presenter's Notes

1
Hadoop Vs Apache Spark
2
Hadoop Introduction

Hadoop is a free, open source framework for
storing large data sets and running distributed
analytics over them. Large data sets can be
stored quickly and easily using Hadoop. It is
efficient because processing runs on the nodes
that already hold the data, so large amounts of
data do not need to be transferred.
Hadoop processes work as batch jobs, and it is
often used as a foundation for data warehousing.
The framework ensures that big data applications
continue to run if individual servers fail.
Hadoop is highly preferred for batch processing.
The framework is written in Java. Developers also
use Hive on top of Hadoop to add SQL
compatibility.
Hadoop can also be used with little or no
programming, because numerous integration
services are available.
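
Batch jobs on Hadoop are commonly expressed in the
MapReduce model. The sketch below imitates that
model in plain Python on one machine. It is not
the Hadoop API; the function names (map_phase,
shuffle, reduce_phase, word_count) are
illustrative assumptions.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["Hadoop stores data", "Spark processes data"])
```

On a real cluster the map and reduce steps would
run in parallel on many nodes; here they run one
after another only to show the data flow.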

3
Hadoop Advantages
4
Scalability

One of the key advantages of developing with
Hadoop is scalability. Because large data sets
are split up and distributed across many nodes,
storage and processing capacity can grow by
adding more nodes.
Hadoop clusters can run on a large number of
nodes, which allows large amounts of data to be
stored and distributed. Compared with a
traditional RDBMS, Hadoop is highly scalable.

5
Cost Effective

Today's big data requirements are enormous, and
Hadoop can meet them in a cost effective manner.
The cost of processing the same data is much
higher with traditional database management
systems.
Simplified processing of complex data also helps
make Hadoop a cost effective framework.

6
Flexible Solution

Hadoop can access and operate on different types
of data, which makes it a very flexible
solution. This helps in generating value from all
sorts of gathered data.
A variety of data sources, such as social media
and email, can be used to gather as much useful
data as possible.

7
Speed

Hadoop uses a distributed file system in which
the processing servers and the storage servers
are the same machines. Data is processed where it
is stored, which makes processing fast.
Data processing is highly efficient with
the Hadoop framework.

8
Reliable

Hadoop offers a high level of fault tolerance.
Data is replicated across different nodes, so a
backup copy is available if a node fails.
This minimizes the chances of data loss.
Hadoop is quite a reliable framework and can
tolerate both single and multiple node failures.
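
The effect of replication can be sketched in a few
lines of Python. This is a simplified model, not
how HDFS actually chooses nodes: the round-robin
placement, the node names and the helper functions
are assumptions made for illustration. The default
of three copies matches the usual HDFS default.

```python
def place_replicas(blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = [
            nodes[(i + r) % len(nodes)] for r in range(replication)
        ]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have a copy on a live node."""
    failed = set(failed_nodes)
    return {b for b, holders in placement.items() if set(holders) - failed}


placement = place_replicas(
    ["b0", "b1", "b2", "b3"], ["n1", "n2", "n3", "n4"]
)
# Two nodes fail at once: every block still has one surviving copy.
survivors = readable_blocks(placement, failed_nodes=["n1", "n2"])
```

With three copies per block, any two simultaneous
node failures leave every block readable. Losing
all three nodes that hold a block would make that
block unavailable.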


10
Spark Introduction

11
Spark Advantages
12
Faster

Spark places data into Resilient Distributed
Datasets (RDDs). This data can be kept in memory,
making it easily accessible.
Because the data is read from memory rather than
from disk, MapReduce-style jobs can run very
quickly.
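
The benefit of keeping a dataset in memory can be
shown with a toy model in plain Python. This is
not the Spark API: the LazyDataset class and the
slow_read function are hypothetical stand-ins for
an RDD and a slow disk read.

```python
class LazyDataset:
    """Toy stand-in for an RDD: computes on demand, optionally caches."""

    def __init__(self, compute):
        self._compute = compute      # zero-argument callable returning a list
        self._cache_enabled = False
        self._cached = None

    def map(self, fn):
        return LazyDataset(lambda: [fn(x) for x in self.collect()])

    def filter(self, pred):
        return LazyDataset(lambda: [x for x in self.collect() if pred(x)])

    def cache(self):
        """Ask for the result to be kept in memory after first use."""
        self._cache_enabled = True
        return self

    def collect(self):
        if self._cached is not None:
            return self._cached      # served from memory, no recomputation
        result = self._compute()
        if self._cache_enabled:
            self._cached = result
        return result


reads = {"count": 0}


def slow_read():
    reads["count"] += 1              # counts how often the "disk" is hit
    return list(range(10))


squares = LazyDataset(slow_read).map(lambda x: x * x).cache()
total = sum(squares.collect())
evens = squares.filter(lambda x: x % 2 == 0).collect()
```

Both results are built from the cached squares, so
the slow source is read only once. Without the
cache() call it would be read once per result.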

13
Real Time Processing

Real time data is growing continuously.
Processing large quantities of real time data
can be a big challenge.
Spark can help with this, for example in
processing logs for live streaming sites, fraud
detection and electronic trading data.
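
One common way to handle such data is to cut the
incoming stream into small time windows and
process each window as a tiny batch. The sketch
below shows the idea in plain Python for a simple
fraud check. It is not Spark code; the event
format, the account names and the threshold are
made-up assumptions.

```python
from collections import Counter


def micro_batches(events, batch_seconds):
    """Group (timestamp, account) events into fixed-length time windows."""
    batches = {}
    for timestamp, account in events:
        batches.setdefault(timestamp // batch_seconds, []).append(account)
    return [batches[key] for key in sorted(batches)]


def flag_suspicious(batch, max_per_batch):
    """Return accounts with more transactions than allowed in one window."""
    counts = Counter(batch)
    return sorted(a for a, n in counts.items() if n > max_per_batch)


events = [
    (0, "acct-1"), (1, "acct-2"), (2, "acct-1"),
    (3, "acct-1"), (6, "acct-2"), (7, "acct-3"),
]
alerts = [
    flag_suspicious(batch, max_per_batch=2)
    for batch in micro_batches(events, batch_seconds=5)
]
```

Here acct-1 makes three transactions in the first
five-second window and is flagged, while the
second window raises no alert.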

14
Using Big Data Effectively

Big data needs to be used effectively to reach
the right set of people with the right messaging.
Targeting very specific audiences with big data
helps bring out the best conversion rate for a
retail business. Many retail marketers fail to
get the right results for the business because
they do not understand how to make the data
usable and how to analyse it.
The technology has to be fully prepared for
big data usage and integration.

15
Processing of Graphs

Graph processing helps in capturing the
relationships between entities in the data.
It helps in analysing social as well as
advertising data. Machine learning helps in
carrying out advanced analytics and understanding
consumers.
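
As a small illustration of graph processing on
social data, the sketch below builds a "follows"
graph in plain Python and finds the most followed
user. The user names and helper functions are
invented for the example and do not come from any
Spark library.

```python
def build_graph(edges):
    """Map each user to the set of users they follow."""
    graph = {}
    for follower, followed in edges:
        graph.setdefault(follower, set()).add(followed)
        graph.setdefault(followed, set())   # make sure every user is a node
    return graph


def follower_counts(graph):
    """Count incoming edges (followers) for every user."""
    counts = {user: 0 for user in graph}
    for followed_set in graph.values():
        for user in followed_set:
            counts[user] += 1
    return counts


edges = [
    ("ann", "bob"), ("cat", "bob"), ("bob", "ann"),
    ("dan", "ann"), ("dan", "bob"),
]
graph = build_graph(edges)
counts = follower_counts(graph)
most_followed = max(counts, key=counts.get)
```

A distributed graph engine would run the same kind
of computation over graphs far too large for one
machine.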

16
Power

Most companies need two systems: one for storing
and streaming data and the other for analyzing
the data.
Spark can handle both streaming and analysis,
which simplifies application development,
maintenance and deployment.
