Experiments in Utility Computing: Hadoop and Condor


1
Experiments in Utility Computing: Hadoop and Condor
  • Sameer Paranjpye
  • Y! Web Search

2
Outline
  • Introduction
    • Application environment, motivation, development principles
  • Hadoop and Condor
    • Description, Hadoop-Condor interaction

3
Introduction
4
Web Search Application Environment
  • Data intensive distributed applications
    • Crawling, Document Analysis and Indexing, Web Graphs, Log Processing, ...
    • Highly parallel workloads
    • Bandwidth to data is a significant design driver
  • Very large production deployments
    • Several clusters of 100s-1000s of nodes
    • Lots of data (billions of records, input/output of 10s of TB in a single run)

5
Why Condor and Hadoop?
  • To date, our Utility Computing efforts have been conducted using a command-and-control model
    • Closed, cathedral style development
    • Custom built, proprietary solutions
  • Hadoop and Condor
    • Experimental effort to leverage open source for infrastructure components
    • Current deployment: cluster for supporting research computations
    • Multiple users, running ad-hoc, experimental programs

6
Vision - Layered Platform, Open APIs
Applications (Crawl, Index, ...)
Programming Models (MPI, DAG, MW, MR, ...)
Batch Scheduling (Condor, SGE, SLURM, ...)
Distributed Store (HDFS, Lustre, Ibrix, ...)
7
Development philosophy
  • Adopt, Collaborate, Extend
  • Open source commodity software
  • Open APIs for interoperability
  • Identify and use existing robust platform
    components
  • Engage community and participate in developing
    nascent and emerging solutions

8
Hadoop and Condor
9
Hadoop
  • Open source project developing:
    • Distributed store
    • Implementation of Map/Reduce programming model
  • Led by Doug Cutting
  • Implemented in Java
  • Alpha (0.1) release available for download
    • Apache distribution
  • Genesis
    • Lucene and Nutch (Open source search)
    • Hadoop (factors out distributed compute/storage infrastructure)
  • http://lucene.apache.org/hadoop
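
To make the Map/Reduce programming model named above concrete, here is a minimal single-process sketch in Python. It is not Hadoop's API (Hadoop itself is implemented in Java); the function names `map_words`, `reduce_counts`, and `run_map_reduce` and the sample pages are assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every word in one input record."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step: combine all values emitted for one key."""
    return word, sum(counts)


def run_map_reduce(
    records: Iterable[Tuple[str, str]],
    mapper: Callable[[str, str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Run map, group values by key (the 'shuffle'), then reduce each group."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    return dict(reducer(k, v) for k, v in sorted(groups.items()))


# Hypothetical input records: (document id, document text).
pages = [("page1", "hadoop and condor"), ("page2", "hadoop dfs")]
word_counts = run_map_reduce(pages, map_words, reduce_counts)
print(word_counts)
```

In a distributed implementation the map calls and the reduce calls would run on different nodes, with the grouping step moving data between them; this sketch only shows the shape of the two user-supplied functions.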

10
Hadoop DFS
  • Distributed storage system
    • Files are divided into uniform sized blocks and distributed across cluster nodes
    • Block replication for failover
    • Checksums for corruption detection and recovery
    • DFS exposes details of block placement so that computation can be migrated to data
  • Notable differences from mainstream DFS work
    • Single storage and compute cluster vs. separate clusters
    • Simple I/O centric API vs. attempts at POSIX compliance
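
The block, replication, and checksum ideas on this slide can be sketched in a few lines of Python. This is an illustration only: the tiny block size, the replication factor, the round-robin placement policy, the node names, and the use of CRC32 are all assumptions, since the slides do not say which values or algorithms Hadoop DFS uses.

```python
import zlib
from typing import Dict, List

BLOCK_SIZE = 8    # bytes; deliberately tiny (assumed value for the example)
REPLICATION = 3   # copies kept of each block (assumed value)


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut a file into fixed-size blocks; only the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: List[str],
                   replication: int = REPLICATION) -> Dict[int, List[str]]:
    """Assign each block to `replication` distinct nodes, round-robin (assumed policy)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def checksum(block: bytes) -> int:
    """CRC32 stands in for whatever checksum the real system uses."""
    return zlib.crc32(block)


def is_corrupt(block: bytes, expected: int) -> bool:
    """A block is corrupt when its checksum no longer matches the stored one."""
    return checksum(block) != expected


data = b"billions of web page records"
blocks = split_into_blocks(data)
placement = place_replicas(len(blocks), ["node-a", "node-b", "node-c", "node-d"])
checksums = [checksum(b) for b in blocks]

# Simulate a damaged replica of block 0 by changing its first byte.
damaged = b"X" + blocks[0][1:]
print(placement[0], is_corrupt(damaged, checksums[0]))
```

Because `placement` records which nodes hold each block, a scheduler can read it to run work on a node that already stores the data; on a checksum mismatch, a healthy copy can be fetched from one of the other listed replicas.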

11
Hadoop DFS Architecture
  • Master-Slave architecture
  • DFS Master: Namenode
    • Manages all filesystem metadata
    • Controls read/write access to files
    • Manages block replication
  • DFS Slaves: Datanodes
    • Serve read/write requests from clients
    • Perform replication tasks upon instruction by namenode
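
The division of labour above (the namenode tracks metadata and decides on replication, the datanodes carry it out) can be sketched as a toy model. `ToyNamenode`, its methods, and the datanode names are hypothetical, and for brevity the sketch tracks replicas per file rather than per block.

```python
from typing import Dict, List, Set, Tuple


class ToyNamenode:
    """Toy master: records which datanodes hold each file and its target replica count."""

    def __init__(self, datanodes: List[str]) -> None:
        self.live: Set[str] = set(datanodes)
        self.target: Dict[str, int] = {}           # path -> desired replica count
        self.locations: Dict[str, Set[str]] = {}   # path -> datanodes holding a copy

    def add_file(self, path: str, replicas: int, holders: List[str]) -> None:
        self.target[path] = replicas
        self.locations[path] = set(holders)

    def datanode_failed(self, node: str) -> List[Tuple[str, str, str]]:
        """Drop a failed datanode and return (path, source, destination) copy
        instructions that restore each file's target replica count."""
        self.live.discard(node)
        instructions: List[Tuple[str, str, str]] = []
        for path in sorted(self.locations):
            holders = self.locations[path]
            holders.discard(node)
            candidates = sorted(self.live - holders)
            while len(holders) < self.target[path] and holders and candidates:
                source = sorted(holders)[0]
                destination = candidates.pop(0)
                instructions.append((path, source, destination))
                holders.add(destination)
        return instructions


namenode = ToyNamenode(["dn1", "dn2", "dn3", "dn4", "dn5"])
namenode.add_file("/home/sameerp/foo", 3, ["dn1", "dn2", "dn3"])
namenode.add_file("/home/sameerp/docs", 4, ["dn1", "dn2", "dn3", "dn4"])

plan = namenode.datanode_failed("dn2")
print(plan)
```

The namenode here never moves data itself; it only returns instructions, which mirrors the slide's point that datanodes perform replication tasks when told to.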

12
Hadoop DFS Architecture
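[Diagram: Clients send metadata ops to the Namenode, which holds the metadata (Name, replicas, ...), e.g. /home/sameerp/foo, 3, ... and /home/sameerp/docs, 4, ...; clients perform I/O against the Datanodes, which are spread across Rack 1 and Rack 2.]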
13
Benchmarks
14
Deployment
  • Research cluster of 600 nodes
    • Billion web pages
    • Several months' worth of logs
    • 10s of TB of data
  • Multiple users running ad-hoc research computations
    • Crawl experiments, various kinds of log analysis, ...
  • Commodity platform: Intel/AMD, Linux, locally attached SATA drives
  • Testbed for open source approach
    • Still early days, deployment exposed many bugs
  • Future releases to
    • First stabilize at current size
    • Then scale to 1000 nodes

15
Hadoop-Condor interactions
  • DFS makes data locations available to applications
    • Applications generate job descriptions (ClassAds) to schedule jobs close to data
  • Extensions to enable Hadoop programming models to run in scheduler universe
    • Master/Worker, MPI universe-like meta-scheduling
  • Condor enables sharing among applications
    • Priority, accounting, quota mechanisms to manage resource allocation among users and apps
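
As a rough sketch of how an application might turn DFS data locations into a job description that keeps work close to the data, the Python below builds a Condor submit description whose `requirements` expression restricts the job to the hosts holding the data. The helper names, the executable `process_block.sh`, the choice of the vanilla universe for the worker job, and the host names `d` and `e` are assumptions for illustration; the slides do not show the actual job descriptions used.

```python
from typing import List


def locality_requirements(hosts: List[str]) -> str:
    """Build a ClassAd expression that matches only the given machines."""
    return " || ".join('(Machine == "{}")'.format(host) for host in hosts)


def submit_description(executable: str, arguments: str, hosts: List[str]) -> str:
    """Render a minimal submit description pinned to the hosts that hold the data."""
    lines = [
        "universe = vanilla",                  # assumed universe for the worker job
        "executable = {}".format(executable),  # hypothetical program name
        "arguments = {}".format(arguments),
        "requirements = {}".format(locality_requirements(hosts)),
        "queue",
    ]
    return "\n".join(lines) + "\n"


# Suppose the DFS reported that the input lives on hosts "d" and "e".
description = submit_description("process_block.sh", "/home/sameerp/foo", ["d", "e"])
print(description)
```

A hard `requirements` constraint makes the job wait until one of those hosts is free; expressing the same hosts through the submit file's `rank` expression instead would state a preference while still allowing the job to run elsewhere.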

16
Hadoop-Condor interactions
[Diagram: Scheduler universe apps (1) obtain data locations (d, e) from HDFS and (2) pass ClassAds ("Schedule on d, e") to Condor; steps 3 and 4 are drawn next to Condor's resource allocation.]
17
The end
  • THE END