A Beowulf-class architecture proposal for real-time embedded vision

PDIVM'03

1
A Beowulf-class architecture proposal for
real-time embedded vision
  • P.A. Revenga, J. Sérot, J.L. Lázaro, J.P. Dérutin
  • Dpto. de Electrónica,
  • Universidad de Alcalá de Henares
  • Madrid, 28806, SPAIN
  • LASMEA, UMR 6602 CNRS,
  • Université Blaise Pascal, Clermont-Ferrand
  • Aubière, France

2
Outline
  • Context
  • High-performance computing
  • Architecture proposal: hardware and software
  • SKiPPER
  • Experimental results
  • Basic benchmarking results
  • Realistic application: image stabilisation
  • Related Work
  • Conclusion, future work

3
Context
  • Real-time embedded vision
  • Examples: assisted-driving cars, robots,
    space rovers, ...
  • Reactive applications
  • Digital video streams I/O
  • High performance (10-100 fps, 0.1-5 Gflops)
  • Reduced volume and power consumption

4
High-performance computing
  • Possible Solutions
  • High-end PC.
  • Specialized processors (DSP, FPGA, ASIC)
  • Dedicated Multiprocessor Machines
  • Beowulf-class clusters.

5
Architecture proposal
  • Beowulf-class solution for real-time embedded
    vision.
  • Two independent communication networks
  • one for interprocess communication
  • one for digital video broadcasting

6
Architecture proposal: Hardware
  • Apple Cube motherboard
  • PPC 74xx CPU
  • Altivec support
  • Fast-ethernet
  • IEEE 1394a (400 Mb/s)
  • Single 24 V supply
  • No fan
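
As a rough feel for what the 400 Mb/s figure allows on the video side, the sketch below computes an upper bound on frame rate for an uncompressed stream. The 512x512, 8-bit frame format is an assumption chosen for illustration, not a figure from the talk, and protocol overhead is ignored, so real throughput would be lower.

```python
# Upper bound on frame rate over a link, ignoring all protocol overhead.
LINK_MBPS = 400  # nominal IEEE 1394a rate quoted on the slide (megabits/s)

def max_fps(width, height, bits_per_pixel, link_mbps=LINK_MBPS):
    """Frames per second that fit in the nominal link rate."""
    bits_per_frame = width * height * bits_per_pixel
    return link_mbps * 1_000_000 / bits_per_frame

# Hypothetical stream format (assumption): 512x512 pixels, 8-bit greyscale.
fps = max_fps(512, 512, 8)
print(f"{fps:.1f} frames/s at most")  # prints "190.7 frames/s at most"
```

Under these assumptions the nominal rate sits above the 10-100 fps range quoted in the Context slide.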

7
Architecture proposal: Hardware
  • 4 nodes
  • Fast-ethernet switch
  • Low power
  • 4 Gflops / 30 W
  • 20x20x20 cm
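
The figures on this slide can be turned into the two ratios used later in the conclusion. The short calculation below only rearranges the quoted numbers; it assumes the 4 Gflops figure is a peak value for the whole 4-node machine and that the full 20 cm cube counts as its volume.

```python
GFLOPS = 4.0     # performance figure quoted on the slide
POWER_W = 30.0   # power figure quoted on the slide
SIDE_CM = 20.0   # cube edge quoted on the slide

volume_dm3 = (SIDE_CM / 10.0) ** 3    # 1 dm = 10 cm, so a 2 dm edge gives 8 dm^3
gflops_per_watt = GFLOPS / POWER_W    # about 0.13 Gflops per watt
gflops_per_dm3 = GFLOPS / volume_dm3  # 0.5 Gflops per cubic decimetre

print(f"{gflops_per_watt:.3f} Gflops/W, {gflops_per_dm3:.2f} Gflops/dm3")
```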

8
Architecture proposal: Software
  • Layered approach
  • Linux OS
  • PPC 74xx with Altivec support (gcc-altivec).
  • IEEE 1394 driver.
  • Low level communication library.
  • MPI (MPICH)
  • High-level parallel programming environment.
  • SKiPPER

9
Software architecture: SKiPPER
  • Based upon algorithmic skeletons
  • Higher-order program constructs encapsulating
    common and recurrent forms of parallelism
  • Typical skeletons: SCM (data parallelism), FARM, ...
  • Parallel programs built by selecting,
    instantiating and composing skeletons taken from
    a library
  • Low-level implementation (mapping, scheduling,
    ...) transparently handled by the compiler
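
To make the idea of a skeleton as a higher-order construct concrete, here is a minimal Python sketch of a split/compute/merge pattern. It is not SKiPPER's API or implementation: the names `scm`, `split_rows`, `threshold_band` and `concat` are hypothetical, SCM is read here as split, compute, merge, and a thread pool stands in for the mapping onto cluster nodes, so it shows the program structure rather than real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def scm(split, compute, merge, n_workers=4):
    """Build a split/compute/merge function from three sequential functions."""
    def run(data):
        parts = split(data, n_workers)
        with ThreadPoolExecutor(max_workers=n_workers) as pool:
            results = list(pool.map(compute, parts))  # result order is preserved
        return merge(results)
    return run

def split_rows(rows, n):
    """Cut a list of rows into n contiguous bands of near-equal size."""
    size, extra = divmod(len(rows), n)
    bands, start = [], 0
    for i in range(n):
        stop = start + size + (1 if i < extra else 0)
        bands.append(rows[start:stop])
        start = stop
    return bands

def threshold_band(band):
    """Sequential per-band work: binarise each pixel against a fixed level."""
    return [[1 if p > 127 else 0 for p in row] for row in band]

def concat(bands):
    """Reassemble the bands into one image."""
    return [row for band in bands for row in band]

# The user supplies only sequential functions; the skeleton owns the parallelism.
binarise = scm(split_rows, threshold_band, concat)
image = [[(x * y * 9) % 256 for x in range(8)] for y in range(8)]
result = binarise(image)
```

The point of the pattern is that `binarise` is obtained by instantiating a library construct with application-specific functions, which is the composition style the slide describes.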

10
Software architecture: SKiPPER

11
Experimental results
  • Basic benchmarking
  • Applying a convolution mask
  • I/O on disk
  • Several image and mask sizes
  • Four computation times measured
  • Tseq: sequential reference time (1 G4 node, 450 MHz,
    128 MB)
  • Tpar: SPMD execution time (4 nodes, SCM skeleton)
  • Tvec: SIMD execution time (1 G4 node with Altivec)
  • Tpv: SPMD+SIMD execution time (4 nodes with Altivec)
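
For reference, the benchmarked operation can be sketched in plain Python as follows. This is an illustrative sequential version only, not the code that was timed; the function name `convolve`, the zero padding at the borders and the unflipped mask are assumptions made here.

```python
def convolve(image, mask):
    """Apply a square, odd-sized mask to a 2D image given as a list of rows.

    Pixels outside the image are treated as zero. The mask is applied
    without flipping, which equals true convolution for symmetric masks.
    """
    h, w = len(image), len(image[0])
    k = len(mask) // 2  # mask half-width
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for j in range(-k, k + 1):
                for i in range(-k, k + 1):
                    yy, xx = y + j, x + i
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * mask[j + k][i + k]
            out[y][x] = acc
    return out

# 3x3 box mask on a 4x4 image of ones: each output counts its in-image neighbours.
ones = [[1] * 4 for _ in range(4)]
box = [[1] * 3 for _ in range(3)]
out = convolve(ones, box)
```

The cost grows with image area times mask area, which is why the benchmark varies both sizes. If the image is split into row bands across nodes, each band also needs `k` rows from its neighbours to compute its border pixels.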

12
Basic benchmarking results

13
Basic benchmarking (2)
  • Relative speedups
  • P/S = Tseq / Tpar
  • PV/V = Tvec / Tpv

14
Basic benchmarking (3)
  • Relative speedups
  • V/S = Tseq / Tvec
  • PV/S = Tseq / Tpv
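
The relative speedups can be computed from the four measured times as sketched below, taking each ratio as baseline time over improved time. The timings in the example are hypothetical placeholders, not results from the talk.

```python
def speedups(t_seq, t_par, t_vec, t_pv):
    """Relative speedups from the four execution times (same unit for all)."""
    return {
        "P/S": t_seq / t_par,   # 4 nodes vs 1 node, scalar code
        "PV/V": t_vec / t_pv,   # 4 nodes vs 1 node, Altivec code
        "V/S": t_seq / t_vec,   # Altivec vs scalar, 1 node
        "PV/S": t_seq / t_pv,   # 4 nodes with Altivec vs 1 scalar node
    }

# Hypothetical timings in milliseconds (assumption, NOT measured values).
s = speedups(t_seq=800.0, t_par=250.0, t_vec=200.0, t_pv=80.0)
for name, value in s.items():
    print(f"{name}: {value:.2f}")
```

With these definitions PV/S is the product of V/S and PV/V, so the combined gain factors into a vectorisation part and a parallelisation part.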

15
Realistic application
  • Image stabilisation on digital video streams
  • Algorithm: two stages
  • Detection of points of interest (POIs) with Haar
    wavelets
  • Tracking of POIs over neighbouring windows with
    a correlation filter
  • Compute displacement
  • Apply corrections
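
The tracking and displacement steps can be illustrated with the sketch below. It is not the authors' implementation: a sum of squared differences is swapped in for the correlation filter as the matching score, the POI detection stage is skipped, and the names `ssd`, `track`, `global_shift` and `make_frame` as well as the window sizes are hypothetical.

```python
def ssd(frame, ref, top, left):
    """Sum of squared differences between ref and the frame patch at (top, left)."""
    return sum(
        (frame[top + j][left + i] - ref[j][i]) ** 2
        for j in range(len(ref))
        for i in range(len(ref[0]))
    )

def track(prev, curr, y, x, half=1, search=2):
    """Find where the patch centred on (y, x) in prev moved to in curr.

    Assumes (y, x) lies at least `half` pixels inside prev.
    Returns the displacement (dy, dx) with the lowest SSD in the search window.
    """
    size = 2 * half + 1
    ref = [row[x - half:x + half + 1] for row in prev[y - half:y + half + 1]]
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            top, left = y + dy - half, x + dx - half
            if top < 0 or left < 0:
                continue  # candidate window leaves the frame
            if top + size > len(curr) or left + size > len(curr[0]):
                continue
            score = ssd(curr, ref, top, left)
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]

def global_shift(displacements):
    """Per-axis (upper) median of the POI displacements."""
    dys = sorted(d[0] for d in displacements)
    dxs = sorted(d[1] for d in displacements)
    mid = len(displacements) // 2
    return dys[mid], dxs[mid]

def make_frame(cy, cx, size=10):
    """Hypothetical test frame: a small asymmetric blob on a black background."""
    frame = [[0] * size for _ in range(size)]
    frame[cy][cx], frame[cy][cx + 1], frame[cy + 1][cx] = 9, 5, 3
    return frame

prev, curr = make_frame(4, 4), make_frame(5, 6)
shift = track(prev, curr, 4, 4)  # blob moved 1 row down and 2 columns right
```

The correction step would then translate the current frame by the opposite of the global shift. Each POI is tracked independently of the others, which is what makes this stage a natural fit for distribution across nodes.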

16
Image stabilisation
  • Search of POIs
  • Track

17
Image stabilisation: Experimental results
  • Implemented on four platforms
  • Single G4 @ 450 MHz
  • 4 x G4s with Altivec
  • Single P4 @ 1.5 GHz
  • Single G4 @ 1 GHz with Altivec
  • Measured time to process one frame of the input
    stream
  • Np: number of POIs

18
Related Work
  • Most Beowulf machines built today are
    dedicated to number-crunching, off-line
    computations
  • Very few projects have targeted real-time
    embedded vision
  • Yoshimoto et al.
  • FireWire PC cluster, but using IEEE 1394 for both
    data and video broadcast
  • NASA, REE project
  • Four G4 processor nodes interconnected by
    Myrinet

19
Conclusion, future work
  • Originality of the work: application of the
    Beowulf COTS approach to an application domain
    where dedicated solutions (hw+sw) were
    traditionally used
  • Good performance/Watt and performance/dm³ ratios
  • Possible applications: implementation of complex
    vision and navigation algorithms on small
    autonomous vehicles, such as the CyCab