Introduction to Apache Hadoop

About This Presentation
Description:

A short presentation introducing Apache Hadoop: what it is, what it can do, and which other products are associated with it.

Transcript and Presenter's Notes

1
Apache Hadoop
  • What is it?
  • Architecture
  • Related Projects
  • Large users

2
Hadoop - What is it?
  • An open source system developed in Java
  • Supports very large data sets
  • Supports large clusters of servers
  • Designed to run on pre-existing, low-cost
    hardware
  • Allows work to be distributed over the cluster
  • Allows storage to be distributed over the cluster
  • Provides resilience via automatic failure
    handling
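
The storage and failure-handling bullets can be illustrated with a small, self-contained sketch. It is not Hadoop code and uses no Hadoop API: the function names (split_into_blocks, store, read), the tiny block size and the simple round-robin placement are all assumptions made for illustration. It splits data into blocks, copies each block to more than one node, and shows that a read can still succeed when a node is down.

```python
def split_into_blocks(data: bytes, block_size: int) -> list:
    """Cut the data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def store(data: bytes, node_names: list, block_size: int = 4, replication: int = 2):
    """Spread blocks over the nodes, keeping `replication` copies of each.

    Placement is plain round-robin, chosen only to keep the sketch short.
    Returns (nodes, number_of_blocks), where nodes maps
    node name -> {block id -> block bytes}.
    """
    if replication > len(node_names):
        raise ValueError("replication factor exceeds number of nodes")
    nodes = {name: {} for name in node_names}
    blocks = split_into_blocks(data, block_size)
    for block_id, block in enumerate(blocks):
        for r in range(replication):
            name = node_names[(block_id + r) % len(node_names)]
            nodes[name][block_id] = block
    return nodes, len(blocks)


def read(nodes: dict, num_blocks: int, failed=frozenset()) -> bytes:
    """Reassemble the data, skipping any node listed in `failed`."""
    out = bytearray()
    for block_id in range(num_blocks):
        for name, held in nodes.items():
            if name not in failed and block_id in held:
                out += held[block_id]
                break
        else:
            # No live node holds this block: the data cannot be recovered.
            raise RuntimeError(f"block {block_id} has no surviving replica")
    return bytes(out)


if __name__ == "__main__":
    data = b"hello hadoop cluster"
    nodes, n = store(data, ["node1", "node2", "node3"])
    # One node is marked as failed; every block still has a live copy.
    print(read(nodes, n, failed={"node2"}))
```

With two copies of each block on three nodes, losing any single node leaves every block readable; losing two nodes that hold the same block raises an error in this sketch.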

3
Hadoop - Architecture
  • Hadoop consists of
    • Hadoop Common
      • Common utilities that support the other Hadoop modules
    • Hadoop MapReduce
      • Parallel processing of Hadoop data
    • Hadoop YARN
      • Scheduler and resource manager
    • Hadoop Distributed File System (HDFS)
      • A master/slave file system which spreads the
        Hadoop data over a very large cluster of slave
        data nodes controlled by a single name node
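
The MapReduce idea can be sketched in a few lines of plain Python. This is a single-process illustration, not the Hadoop MapReduce API: the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical, and everything that Hadoop would spread across a cluster happens here in one ordinary loop. It shows the three steps of a word count: map each line to (word, 1) pairs, group the pairs by word, then reduce each group to a total.

```python
from collections import defaultdict


def map_phase(line: str) -> list:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs) -> dict:
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list) -> tuple:
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines) -> dict:
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

Because each call to map_phase depends only on its own line, and each call to reduce_phase only on its own key, both steps could in principle be run on different machines at the same time, which is the property the "parallel processing" bullet refers to.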

4
Hadoop Related Projects

5
Hadoop Related Projects
  • Pig - for analysing large data sets
  • Hive - data warehouse system for Hadoop
  • Mahout - machine learning and data mining
  • Avro - a data serialization system
  • ZooKeeper - helps build distributed
    applications
  • Chukwa - data collection and analysis

6
Hadoop Related Projects
  • Hue - Hadoop user interface
  • Oozie - workflow scheduler
  • Hama - bulk synchronous parallel framework
    • For massive scientific computations
  • Nutch - web crawler
  • HBase - non-relational database

7
Hadoop Large Users
  • Yahoo
    • 10,000-core Linux cluster
  • Facebook
    • 100 petabytes, growing at 0.5 petabytes a day
  • Amazon
    • It's possible to run Hadoop on Amazon's EC2 and S3

8
Contact Us
  • Feel free to contact us at
    • www.semtech-solutions.co.nz
    • info_at_semtech-solutions.co.nz
  • We offer IT project consultancy
  • We are happy to hear about your problems
  • You pay only for the hours you need
    to solve your problems