Evaluating MapReduce for Multi-core and Multiprocessor Systems Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, Christos Kozyrakis – PowerPoint PPT presentation

Transcript and Presenter's Notes

1
Evaluating MapReduce for Multi-core and
Multiprocessor Systems
Colby Ranger, Ramanan Raghuraman, Arun Penmetsa,
Gary Bradski, Christos Kozyrakis
Computer Systems Laboratory, Stanford University
Presented by JP Cafaro
2
Introduction to MapReduce
  • MapReduce is a programming model created by
    Google to help with the automatic parallelization
    and distribution of code over thousands of
    servers.
  • It lets the programmer write simple functional
    code without worrying about the low-level
    parallelization under the hood.
  • It works by taking the input data and mapping it
    to intermediate <key, value> pairs. Disjoint
    portions of the input data can be worked on in
    parallel.
  • The intermediate pairs are then reduced to
    produce the final output. This can also be done
    in parallel.
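
The map and reduce steps described on this slide can be sketched with word count, one of the benchmarks mentioned later in the talk. This is a minimal Python illustration of the programming model only, not Google's or Phoenix's API: the names `map_chunk`, `reduce_key`, and `word_count` are hypothetical, and a thread pool stands in for the runtime that would schedule the work.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_chunk(chunk):
    """Map: emit one (word, 1) pair for every word in a chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_key(item):
    """Reduce: sum all counts collected for a single key."""
    key, values = item
    return key, sum(values)


def word_count(chunks, workers=4):
    """Hypothetical driver: map in parallel, group by key, reduce in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map phase: disjoint chunks are processed independently.
        mapped = list(pool.map(map_chunk, chunks))

        # Group the intermediate <key, value> pairs by key.
        groups = defaultdict(list)
        for pairs in mapped:
            for key, value in pairs:
                groups[key].append(value)

        # Reduce phase: each key is reduced independently of the others.
        return dict(pool.map(reduce_key, groups.items()))


counts = word_count(["the quick brown fox", "the lazy dog", "The end"])
```

The programmer writes only the two small functions; how chunks and keys are distributed across workers is left to the driver, which is the part a MapReduce runtime takes over.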

3
Proposal and Features
  • Google's MapReduce targets clusters of thousands
    of distributed machines and relies on remote file
    accesses. The researchers wanted to create a
    shared-memory implementation of MapReduce
    (Phoenix) for commercial multi-core and
    multiprocessor systems.
  • Phoenix can do a number of useful things at
    runtime, like dynamically spawning threads while
    taking into account the number of cores, hardware
    threads per core, system load, etc.
  • It also handles work stealing/load balancing,
    prefetching, task granularity, and fault
    tolerance.
  • It deals with these low-level details
    automatically, giving a simple programming model
    that greatly improves programmer productivity.
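
As a rough illustration of the kind of decision such a runtime makes, the sketch below picks a worker count from the available CPUs and the input size, then splits the input into disjoint chunks. It is an assumption-laden toy, not Phoenix's actual policy: `choose_workers`, `split_input`, and the `min_items_per_task` parameter are hypothetical, and system load is ignored.

```python
import os


def choose_workers(n_items, min_items_per_task=1, available_cpus=None):
    """Pick a worker count bounded by the CPU count and by the input size."""
    cpus = available_cpus or os.cpu_count() or 1
    # Do not create more tasks than the input can usefully fill.
    max_useful = max(1, n_items // min_items_per_task)
    return max(1, min(cpus, max_useful))


def split_input(items, n_tasks):
    """Split items into n_tasks contiguous, disjoint, near-equal chunks."""
    base, extra = divmod(len(items), n_tasks)
    chunks, start = [], 0
    for i in range(n_tasks):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


workers = choose_workers(n_items=20, min_items_per_task=10, available_cpus=8)
chunks = split_input(list(range(10)), 3)
```

The point is that the programmer never calls anything like this directly; in the model described above, the runtime makes these choices on the program's behalf.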

4
Benchmark and Results
  • The researchers benchmarked a number of
    parallelizable programs, including word count,
    matrix multiply, reverse index, etc.
  • Speedups were measured relative to sequential
    versions of the code.
  • In all cases, the MapReduce implementation
    outperformed the sequential version.
  • In some cases, the overhead introduced by Phoenix
    made it less efficient than a low-level
    implementation in Pthreads.
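
For reference, speedup relative to a sequential baseline is conventionally defined as below. The symbols are assumptions introduced here, not notation from the slides: T_seq is the sequential running time and T_par(p) is the parallel running time on p hardware threads. The numbers in the worked line are purely illustrative and are not results from the paper.

```latex
S(p) = \frac{T_{\mathrm{seq}}}{T_{\mathrm{par}}(p)},
\qquad
E(p) = \frac{S(p)}{p}

% Illustrative only: T_seq = 12 s, T_par(8) = 2 s
S(8) = \frac{12}{2} = 6, \qquad E(8) = \frac{6}{8} = 0.75
```

An efficiency E(p) below 1 reflects overheads such as scheduling and synchronization, which is where a runtime's cost shows up when it is compared against hand-written threaded code.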

5
Questions
  • The main question is the tradeoff between
    programming simplicity and performance.
  • The low-level Pthreads implementation didn't use
    dynamic scheduling because of the programming
    complexity, even though dynamic scheduling would
    probably have made the Pthreads code faster and
    the Phoenix implementation look less attractive
    from a performance standpoint.
  • Are we giving up too much to make programmers'
    lives easier?
  • How many types of applications can we use this
    MapReduce implementation on?
  • Are there other programming models similar to
    MapReduce that we could fit to other problem
    types?

6
Conclusions
  • MapReduce/Phoenix can be really useful for
    algorithms that map nicely onto this programming
    model, as shown by the results.
  • Programs that this model isn't naturally suited
    for see smaller speedups. For these, the overhead
    introduced by Phoenix means that alternatives
    such as a lower-level Pthreads implementation
    perform better.
  • Overall, the model is extremely simple, and
    techniques such as MapReduce that automatically
    parallelize code are important to think about as
    we try to figure out how to write software for
    large numbers of cores.