Title: Computing Architectures for Virtual Reality
1. Computing Architectures for Virtual Reality
- Multiprocessor Servers / Graphic Supercomputers vs. PC Cluster Architecture
2. Introduction
- VR requires
- fast graphics and haptics refresh rates → graphics pipelines
- low latencies → interactivity
3. Graphics Rendering Pipeline
- Three functional stages
- Application stage (SW)
- Geometry stage (HW)
- Rasterizer stage (HW)
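
The three stages can be sketched as three functions chained per frame. This is only an illustrative software model: the scene (a single triangle), the `focal` parameter and all function names are assumptions made for the example, and the geometry and rasterizer stages run in hardware on a real system.

```python
def application_stage(t):
    """Application stage (software): decide what to draw this frame."""
    # One triangle in camera space, shifted along x by the animation time t.
    return [[(-1.0 + t, -1.0, 4.0), (1.0 + t, -1.0, 4.0), (0.0 + t, 1.0, 4.0)]]


def geometry_stage(triangles, width, height, focal=1.0):
    """Geometry stage: perspective-project each vertex to pixel coordinates."""
    projected = []
    for tri in triangles:
        screen = []
        for x, y, z in tri:
            sx = (focal * x / z + 1.0) * 0.5 * width
            sy = (1.0 - (focal * y / z + 1.0) * 0.5) * height
            screen.append((sx, sy))
        projected.append(screen)
    return projected


def edge(a, b, p):
    """Signed area test: which side of the edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterizer_stage(triangles, width, height):
    """Rasterizer stage: mark every pixel whose centre lies inside a triangle."""
    framebuffer = [[0] * width for _ in range(height)]
    for a, b, c in triangles:
        for py in range(height):
            for px in range(width):
                p = (px + 0.5, py + 0.5)
                w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
                inside = (w0 >= 0 and w1 >= 0 and w2 >= 0) or \
                         (w0 <= 0 and w1 <= 0 and w2 <= 0)
                if inside:
                    framebuffer[py][px] = 1
    return framebuffer


def render_frame(t, width=16, height=16):
    """Run the three stages in order for one frame."""
    triangles = application_stage(t)
    screen_triangles = geometry_stage(triangles, width, height)
    return rasterizer_stage(screen_triangles, width, height)
```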
4. Computing Architectures (1)
- Single Host Multiprocessor Server
- Massively parallel architecture
- Multiprocessor → interprocessor communication → shared memory pool
- Multipipe graphics → parallel rendering → bus-based fast communication
5. Computing Architectures (2)
- Distributed Cluster
- off-the-shelf hardware
- interconnecting network
- scalable
6. Distributed VR system architecture problems
- No shared memory pool → low-latency network
- Independent graphics accelerator cards → video signal synchronization
- If tiled image rendering → composition
7. Single Host Multiprocessor Multipipe Servers
- SGI InfiniteReality
- massively parallel architecture
- bus-based broadcast communication to distribute primitives
- Graphics subsystem
- Geometry engine
- Raster Manager
- Display generator
8. SGI InfiniteReality (1)
9. SGI InfiniteReality (2)
10. SGI Performer (1)
- High performance 3D rendering toolkit
- Use of multiple CPUs
- Use of multiple graphics pipelines
11. SGI Performer (2) - Multiprocessing
- Each stage of the graphics pipeline can run as a separate process on a separate CPU
- APP
- CULL
- DRAW
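
The APP / CULL / DRAW split can be pictured as a software pipeline in which each stage hands its result to the next through a queue. The sketch below uses Python threads purely for illustration and does not use the Performer API; the scene contents, the visibility range 0..4 and the function name `run_pipeline` are assumptions made for the example.

```python
import queue
import threading


def run_pipeline(num_frames):
    """Run APP, CULL and DRAW as three concurrent stages linked by queues."""
    app_to_cull = queue.Queue(maxsize=1)
    cull_to_draw = queue.Queue(maxsize=1)
    drawn = []

    def app():
        # APP: update the simulation and publish the scene for each frame.
        for frame in range(num_frames):
            objects = [frame + i for i in range(-3, 4)]
            app_to_cull.put({"frame": frame, "objects": objects})
        app_to_cull.put(None)  # end-of-stream marker

    def cull():
        # CULL: keep only objects inside the (hypothetical) visible range 0..4.
        while True:
            scene = app_to_cull.get()
            if scene is None:
                cull_to_draw.put(None)
                return
            visible = [x for x in scene["objects"] if 0 <= x < 5]
            cull_to_draw.put({"frame": scene["frame"], "visible": visible})

    def draw():
        # DRAW: "render" the visible objects; here we only record their count.
        while True:
            item = cull_to_draw.get()
            if item is None:
                return
            drawn.append((item["frame"], len(item["visible"])))

    threads = [threading.Thread(target=stage) for stage in (app, cull, draw)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return drawn
```

While DRAW works on frame n, CULL can already process frame n+1 and APP frame n+2, which raises throughput at the cost of additional latency.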
12. SGI Performer (3) - Multichannels
- Each rendering pipeline can render multiple channels, multiple video outputs
13. SGI Performer (4) - Multipipes
- Multiple displays
- synchronized with genlock
14. SGI Performer (5) - Hyperpipe
- Temporal Decomposition
- Used with a DPLEX ring or chain
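
Temporal decomposition can be illustrated with a small scheduling sketch. It assumes a simple round-robin assignment of successive frames to pipes; the function names are hypothetical and nothing here reflects the actual DPLEX interface.

```python
def assign_frames(num_frames, num_pipes):
    """Assign successive frames to pipes in round-robin order."""
    schedule = {pipe: [] for pipe in range(num_pipes)}
    for frame in range(num_frames):
        schedule[frame % num_pipes].append(frame)
    return schedule


def render_budget_ms(display_rate_hz, num_pipes):
    """Time each pipe may spend on one frame while the output rate is kept."""
    return num_pipes * 1000.0 / display_rate_hz
```

Under this assumption, four pipes feeding a 50 Hz output each get 80 ms per frame instead of 20 ms. The output rate is preserved, but each frame is shown later relative to the input that produced it.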
15. SGI Performer (6) - Frame Synchronization
- pfSync synchronizes the graphics pipeline to the frame rate
- How DRAW time overruns are handled is specified by the phase control → scene management (LOD, culling)
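
The cost of a DRAW overrun can be illustrated with a short calculation. The sketch assumes one possible policy, in which a buffer swap may only happen on a frame boundary; it is not Performer code, and the function name is hypothetical.

```python
import math


def locked_frame_time_ms(draw_ms, frame_rate_hz):
    """Displayed frame time when swaps may only occur on frame boundaries."""
    period_ms = 1000.0 / frame_rate_hz
    # A frame always occupies a whole number of periods, at least one.
    periods = max(1, math.ceil(draw_ms / period_ms))
    return periods * period_ms
```

At a 50 Hz target, a 10 ms draw is displayed for 20 ms, while a 25 ms draw misses the boundary and occupies 40 ms. This is why scene management such as LOD selection and culling is used to keep DRAW time under the frame period.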
16. PC Cluster Architecture (1)
- Each node must have access to the same entire data set
- real-time visualization and interactivity → network latency
- the seamless, synchronized graphic display (image reassembly)
17. PC Cluster Architecture (2)
- 3 levels of synchronization
- video signal synchronization → genlock
- dynamic data synchronization → network
- frame completion synchronization → swapbuffers barrier
18. PC Cluster Architecture (3)
19. PC Cluster Architecture (4)
20. PC Cluster Architecture - Networks (1)
- Hardware Solutions
- Gigabit Ethernet (1 Gigabit/s, half duplex)
- Myrinet (2 + 2 Gigabit/s, full duplex)
- ServerNet II (Compaq), ...
- Software Interfaces
- TCP / IP
- PVM, MPI, ...
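
To get a feeling for these link rates, the following sketch estimates how long one uncompressed frame would take to cross the network. The resolution and colour depth are assumptions chosen for the example, and protocol overhead and latency are ignored, so real transfers would be slower.

```python
def frame_transfer_ms(width, height, bytes_per_pixel, link_gbit_per_s):
    """Idealized time to send one uncompressed frame over a link, in ms."""
    bits = width * height * bytes_per_pixel * 8
    seconds = bits / (link_gbit_per_s * 1e9)
    return seconds * 1000.0


# Assumed example: a 1280 x 1024 frame with 24-bit colour.
gigabit_ethernet_ms = frame_transfer_ms(1280, 1024, 3, 1.0)
myrinet_one_way_ms = frame_transfer_ms(1280, 1024, 3, 2.0)
```

At 1 Gigabit/s the frame needs roughly 31 ms, which already exceeds a 50 Hz frame period of 20 ms. This suggests why clusters prefer to distribute small amounts of data rather than rendered images over the general-purpose network.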
21. PC Cluster Architecture - Networks (2)
- Myrinet
- massively parallel processor (MPP) communication technology → specialized communication channels, cut-through switches, host interfaces → "OS bypass" for low-latency communication
22. PC Cluster Architecture - Synchronization (1)
- The following is required to provide a seamless image
- each channel must render the same data
- pixel rates must be identical
- the displays must start new images at the same time
- swapping of their buffers during the same blanking period
23. PC Cluster Architecture - Synchronization (2)
- Video Signal Synchronization
- Genlock: pixel-level synchronization of all graphics pipelines is ensured via an (external) sync signal → most precise way
- Framelock: synchronizes once per frame at the end of the blanking period
24. PC Cluster Architecture - Synchronization (3)
- Dynamic Data Synchronization
- 2 types of changing data
- control information: direction of view
- changing / dynamic data set information: model movement
- 3 approaches for distribution
- distribute stimuli
- calculate resulting data centrally and distribute
- calculate end graphics data centrally and distribute
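
The first approach, distributing stimuli, can be sketched as follows: every node receives the same small input events and applies them to its own copy of the state. The example assumes a deterministic update function, and the state fields, event kinds and function names are hypothetical.

```python
def apply_event(state, event):
    """Deterministically update one node's local state from an input event."""
    kind, value = event
    new_state = dict(state)
    if kind == "turn":
        new_state["heading_deg"] = (state["heading_deg"] + value) % 360
    elif kind == "move":
        new_state["position"] = state["position"] + value
    return new_state


def replicate(events, num_nodes):
    """Broadcast each event to all nodes; every node updates its own copy."""
    nodes = [{"heading_deg": 0, "position": 0} for _ in range(num_nodes)]
    for event in events:
        nodes = [apply_event(state, event) for state in nodes]
    return nodes
```

Only the events cross the network, so bandwidth needs stay low, but all nodes must process the same events in the same order to stay consistent. The two centralized approaches trade that constraint for more network traffic.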
25. PC Cluster Architecture - Synchronization (4)
- Frame Completion Synchronization: nodes have to wait until all are ready to swap buffers → swap barrier synchronization
- Multiview
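
A swap barrier can be sketched with threads standing in for cluster nodes. This is a single-machine illustration using `threading.Barrier` from the Python standard library; a real cluster would implement the barrier over the network or a dedicated sync line, and the random sleep merely simulates uneven render times.

```python
import random
import threading
import time


def render_cluster(num_nodes, num_frames):
    """Simulate nodes that swap buffers only after all have finished a frame."""
    barrier = threading.Barrier(num_nodes)
    log_lock = threading.Lock()
    swap_log = []

    def node(node_id):
        rng = random.Random(node_id)
        for frame in range(num_frames):
            time.sleep(rng.uniform(0.0, 0.002))  # uneven render time
            barrier.wait()  # block until every node has finished this frame
            with log_lock:
                swap_log.append((frame, node_id))  # stands in for swapbuffers

    threads = [threading.Thread(target=node, args=(i,)) for i in range(num_nodes)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return swap_log
```

Because no node can pass the barrier for frame n+1 before every node has swapped frame n, the log never mixes swaps of different frames. The slowest node therefore sets the frame rate of the whole cluster.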
26. PC Cluster Architecture - Synchronization (5)
- Net Juggler and SoftGenLock
- based on an input event level parallelization → no high-bandwidth network necessary
- Synchronization
- Real-time Linux
- Fast sync network: PAPERS (parallel port, 4 µs)
27. PC Cluster Architecture - Composition (1)
- Display Reassembly in Hardware
- Lightning-2
- connects to graphic accelerators via DVI
- allows any pixel data generated by any node to be dynamically mapped to any location on any display
- Sepia 2
- reads back the framebuffer and distributes it
over a fast network (ServerNet II)
28. PC Cluster Architecture - Composition (2)
- Lightning-2
- Pixel Mapping → strip header
- Frame transfer protocol → RS232 back connect for sync
- Image composition
- Tiled images
- Colour keying
- Depth compositing
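
The three composition modes can be sketched on small images stored as lists of rows. This is a software illustration only and says nothing about how Lightning-2 implements them in hardware; the function names and the pixel representation are assumptions.

```python
def compose_tiles(left, right):
    """Tiled images: place two tiles side by side, row by row."""
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def compose_colour_key(foreground, background, key):
    """Colour keying: show the background wherever the foreground is the key."""
    return [
        [b if f == key else f for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(foreground, background)
    ]


def compose_depth(colour_a, depth_a, colour_b, depth_b):
    """Depth compositing: per pixel, keep the colour of the nearer fragment."""
    result = []
    for ca_row, da_row, cb_row, db_row in zip(colour_a, depth_a, colour_b, depth_b):
        row = [
            ca if da <= db else cb
            for ca, da, cb, db in zip(ca_row, da_row, cb_row, db_row)
        ]
        result.append(row)
    return result
```

Tiling needs only pixel colours, colour keying reserves one colour value as transparent, and depth compositing additionally needs each node's depth buffer.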
29. PC Cluster Architecture - Composition (3)
30. Single Multipipe Graphics Accelerator (1)
- Wildcat III 6210 / Wildcat II 5110
- Graphics pipelines
31. Single Multipipe Graphics Accelerator (2)
32. Single Multipipe Graphics Accelerator (3)
33. Commercial Cluster Solutions
- SGI Graphics Cluster
- ImageSync
- DataSync
34. Commercial Cluster Solutions
- Evans & Sutherland SimFUSION
- Princeton display wall: eight 4-way Pentium Pro SMPs with E&S graphics accelerators, driving 8 Proxima 9200 LCD projectors (1998)
35. Commercial Cluster Solutions
- AEC ArsBox
- nodes synchronized via parallel port
- Red Hat 7.1
- SGI Performer
- 100 Mbit Ethernet
36. Conclusion
- Graphic Supercomputers
- Massively parallel structure
- Expensive
- Established
- PC clusters
- Off-the-shelf hardware
- Distributed → synchronization