OSCAR SCM Architecture for Multigrain Parallel Processing

1
Panel: Software Challenges in Multi-Core Chip Era
Hironori Kasahara
Professor, Department of Computer Science
Director, Advanced Chip-Multiprocessor Research Institute
Waseda University
http://www.kasahara.cs.waseda.ac.jp
2
Prof. Gao's Questions (1/3)
  • Q1: From the software angle, do you expect that
    chip-level multi-core architectures will soon
    converge to 1-2 styles (as single-core
    microprocessors did historically, e.g. VLIW vs.
    superscalar)?
  • If not, why not? If yes, what would the 1-2
    styles be, in your view, from the software angle?
  • Answer
  • Yes. I think multi-core architecture will
    converge to SMP for small non-real-time systems,
    and to an OSCAR-like software and hardware
    collaborative architecture with local,
    distributed shared, and centralized memories and
    a DMA controller for real-time embedded systems.

3
MPCore™
ARM and NEC Collaboration
[Block diagram of MPCore; recoverable labels:]
  • Interrupt Distributor with a configurable number
    of hardware interrupt lines; private FIQ lines
  • Per-CPU aliased peripherals: Timer, Wdog, CPU
    interface, and IRQ for each CPU
  • Configurable between 1 and 4 symmetric CPUs
  • I & D 64-bit bus; coherence control bus
  • Snoop Control Unit (SCU) with duplicated L1 tag
  • Private Peripheral Bus
  • Primary AXI R/W 64-bit bus; optional 2nd AXI R/W
    64-bit bus
  • L2 (L220)
4
Fujitsu FR-1000 Multicore Processor
[Diagram of the FR-V multi-core processor and the FR550 VLIW
processor; recoverable labels: Integer Operation Unit, Media
Operation Unit, GR, FR, instruction slots Inst. 0 to Inst. 7,
Fast I/O Bus, Crossbar (FR1000), Bus]
  • Memory Bus: 64 bit x 2 ch / 266 MHz
  • System Bus: 64 bit / 178 MHz

5
CELL Processor Overview
  • Power Processor Element (PPE)
      • Power core processes OS and control tasks
      • 2-way multi-threaded
  • Synergistic Processor Element (SPE)
      • 8 SPEs offer high performance
      • Dual-issue RISC architecture
      • 128-bit SIMD (16-way)
      • 128 x 128-bit general registers
      • 256KB Local Store
      • Dedicated DMA engines

[Diagram labels: SPE, PPE, 512KB, 32KB + 32KB]
6
OSCAR Multi-Core Architecture
7
Prof. Gao's Questions (2/3)
  • Q2: Automatic compilation for parallel machines
    did not succeed in general, as history has
    shown. What do you expect this time, for the
    multi-core revolution? Will it succeed? If yes,
    why do you think we may succeed this time? If
    not, what other software technology (if any) do
    you predict may have a chance of succeeding?
  • Answer
  • Yes, I think we will succeed this time,
    because we have continued compiler research for
    twenty years and have finally been able to
    develop multigrain parallelization, local memory
    management, data transfer control, and
    frequency, voltage, and power-off control.
  • I believe these long-running efforts and the
    real need for such a compiler will change the
    situation.

8
Prof. Gao's Questions (3/3)
  • Q3: What is your favorite parallel programming
    model (if any)? Why?
  • Do you believe that the so-called general-purpose
    parallel programming models should be the way to
    go for the new multi-core era? Why or why not?
  • Answer
  • My favorite model is OpenMP, because vendors
    have supported it (we use only Parallel, Section,
    Flush, and Critical), and especially because of
    its section directives for coarse-grain task
    parallel processing. We will also add some
    directives for OSCAR-type architectures with
    local, distributed shared, and on-chip and
    off-chip centralized shared memories, DMA
    controllers, and power control functions.
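
To make the answer above concrete, here is a minimal sketch of coarse-grain task parallelism with standard OpenMP `parallel sections` and `critical` directives. It is an illustration only, not code from the OSCAR compiler: the function names `sum_range` and `two_section_sum` and the two-way split are assumptions made for the example, and `flush` and the OSCAR-specific directives mentioned in the answer are not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a[lo..hi). Stands in for one coarse-grain task (hypothetical). */
static long sum_range(const int *a, size_t lo, size_t hi)
{
    long s = 0;
    for (size_t i = lo; i < hi; i++)
        s += a[i];
    return s;
}

/* Run two coarse-grain tasks as OpenMP sections and combine their results. */
long two_section_sum(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long total = 0;

    /* Each section is an independent task that may run on its own thread. */
    #pragma omp parallel sections shared(total)
    {
        #pragma omp section
        {
            long part = sum_range(a, 0, n / 2);
            /* Serialize the update of the shared accumulator. */
            #pragma omp critical
            total += part;
        }
        #pragma omp section
        {
            long part = sum_range(a, n / 2, n);
            #pragma omp critical
            total += part;
        }
    }
    return total;
}
```

With GCC this can be built with `-fopenmp`; without that flag the pragmas are ignored and the two sections simply run one after the other, giving the same result.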

9
API and Parallelizing Compiler in the METI/NEDO Advanced Multicore
for Realtime Consumer Electronics Project
[Flow diagram; recoverable labels:]
  • Input: Sequential Application Program (subset of
    C language). Realtime consumer electronics
    application programs: image, secure audio
    streaming, etc.
  • Waseda OSCAR Compiler: translates into parallel
    codes for each vendor
  • API to specify data assignment, data transfer,
    and power reduction control
  • Backend compiler for each vendor chip (API
    decoder plus sequential compiler): produces
    machine codes, i.e. executable codes for each
    vendor chip
  • Target labels: SH multicore, FR-V, (CELL)
  • Proc0 scheduled tasks: T1, Stop
  • Proc1 scheduled tasks: T2, T4
  • Proc2 scheduled tasks: T3, T6 Slow
  • Data Transfer by DTC (DMAC)