TECH - PowerPoint PPT Presentation

About This Presentation
Title:

TECH

Description:

Parallelism & Concurrency. Hardware. Programming Models ... Loop parallelism: OpenMP. Message Passing: CCR, MPI, Erlang ... SQL: Implicit data parallelism ...

Slides: 12
Provided by: joel83
Category:
Tags: tech | parallelism

Transcript and Presenter's Notes

1
Scaling out Google Style
Sydney BarCamp 07, Joel Pobar, joelpobar_at_gmail.com
http://callvirt.net/blog/
2
Agenda
  • Parallelism & Concurrency
  • Hardware
  • Programming Models
  • MapReduce
  • Experiments to date

3
Definitions
  • Concurrency
    • Dijkstra: concurrency occurs when two or more
      execution flows are able to run simultaneously
  • Parallelism
    • Simultaneous execution of the same task!

4
Hardware Performance: The Multi-Core Era
  • Processors don't get way faster anymore. You
    just get a whole lot more slow ones!

5
Hardware
  • 90 → 65 → 45 nm lithography advances
  • Twice, twice again as many (faster) transistors
  • Slower wires + maxed-out thermal envelope →
    slower CPU frequency scaling
  • Same freq + more cores + more cache RAM → same
    cache/core and compute/core
  • Architecture advances
  • (Hardware) multithreading
  • Optimizing for throughput, power
  • System-on-a-Chip integration: interconnects,
    shared caches, I/O and DRAM controllers
  • Intel talks 80 cores!?

6
Programming Models: Client side
  • Shared memory, threads and locks
    • Most used, most disastrous
    • Synchronisation is costly: shared-memory
      accesses across multiple CPUs don't scale
      (cache misses etc.)
    • Tough heisenbugs
  • Loop parallelism: OpenMP
  • Message passing: CCR, MPI, Erlang
  • Functional languages: implicit, no shared state
  • Software Transactional Memory
    • IMO most likely to solve the problem
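
To make the contrast between the first and third models on this slide concrete, here is a minimal Python sketch (not from the talk). It counts to the same total twice: once with threads updating a shared counter under a lock, and once with workers that keep private state and send a single message each. The function names `count_with_lock` and `count_with_messages` are hypothetical, chosen only for illustration.

```python
import queue
import threading


def count_with_lock(n_threads, increments):
    """Shared memory: every thread updates one counter, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(increments):
            with lock:  # serialises the read-modify-write on the shared counter
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def count_with_messages(n_threads, increments):
    """Message passing: workers keep private state and send one result each."""
    results = queue.Queue()  # the only channel between workers and the caller

    def worker():
        local = 0
        for _ in range(increments):
            local += 1  # no shared state is touched inside the loop
        results.put(local)  # one message per worker

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Combine the per-worker results after all workers have finished.
    return sum(results.get() for _ in range(n_threads))
```

Both functions return the same total. The lock-based version acquires the lock once per increment, which is the synchronisation cost the slide refers to; the message-based version synchronises once per worker.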

7
Programming Models: Server side
  • Server: per-client work-unit parallelism
  • Web server: implicit request parallelism
  • SQL: implicit data parallelism
  • Scale out possible, but bottlenecks can occur at
    layer boundaries
  • Typically hard to scale out to lots of machines
  • Clusters (Beowulf, Windows HPC)
  • Grid Computing/Cycle Stealing (Sun Grid, G2,
    Alchemi)
  • Map/Reduce (Hadoop: Java, open source; Google
    MapReduce: not available)

8
Agenda
  • Parallelism & Concurrency
  • Hardware
  • Programming Models
  • MapReduce
  • Experiments to date

9
MapReduce
  • Nice functional programming model (similar to
    Google's MapReduce model)
  • Scheduling, latency, file system, resource
    management
  • Things to think about: hyperthreading,
    programming model, code distribution, security,
    resource management for dummies, automatic
    scheduling and tuning
  • What we did
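
The functional programming model mentioned on this slide can be sketched in a few lines of Python. This is a single-process illustration of the general map/shuffle/reduce pattern, not the MapReduce.NET code from the talk and not Google's or Hadoop's API; `map_reduce`, `word_count_mapper` and `word_count_reducer` are hypothetical names. A real system would run the map and reduce calls on many machines and would also handle the scheduling, file system and resource management concerns listed above.

```python
from collections import defaultdict


def map_reduce(inputs, mapper, reducer):
    """Run mapper over every input, group values by key, then reduce each group."""
    grouped = defaultdict(list)
    for item in inputs:
        # Map phase: each input yields zero or more (key, value) pairs.
        for key, value in mapper(item):
            grouped[key].append(value)  # "shuffle": collect values per key
    # Reduce phase: each key is reduced independently of the others,
    # which is what would let a real system spread keys across machines.
    return {key: reducer(key, values) for key, values in grouped.items()}


def word_count_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def word_count_reducer(word, counts):
    """Sum the partial counts collected for one word."""
    return sum(counts)


lines = ["the quick brown fox", "the lazy dog", "The fox"]
counts = map_reduce(lines, word_count_mapper, word_count_reducer)
```

Because the mapper and reducer share no state, the calls can in principle run in any order or in parallel without locks.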

10
MapReduce.NET experiment
11
Contact
  • Joel Pobar
  • Brisbane based
  • 0410 443 469
  • joelpobar_at_gmail.com