Title: Our Experiences on DirectFB in Embedded Application Development
1Our Experiences on DirectFB in Embedded Application Development
- IGEL Co., Ltd / Renesas Solution Corp.
2Today's Topics
- DirectFB porting experiences on embedded platforms
  - SH7751 + SM501
  - SH7770
- Missing pieces in DirectFB
  - Functionality missing to write specific applications
3Porting DirectFB
4DirectFB Architecture
- DirectFB works on a frame buffer device (/dev/fb) and provides a mechanism for using hardware acceleration effectively.
- DirectFB consists of the following:
  - Core API Module
  - Generic GFX Driver
  - GFX Drivers for Specific Hardware
- To get the best performance out of specific graphics hardware, a GFX driver for that hardware should be written.
- The Generic GFX Driver checks whether hardware acceleration by a GFX driver is available.
  - If yes, it hands the operation over to the GFX driver.
  - If not, it uses the software rendering engine.
[Figure: DirectFB software stack, top to bottom. User level: DirectFB Application. DirectFB: DirectFB Core API Module, Generic GFX Driver, GFX drivers (*). Device drivers: Frame Buffer Driver (*). Hardware: Display Unit, 2D Graphics Hardware. (*) Modules that need to be developed.]
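The hand-over between the generic layer and a hardware driver can be sketched in C as follows. This is a minimal illustration under stated assumptions, not DirectFB's actual code: the types `accel_mask_t`, `gfx_driver_t` and `rect_t`, and the functions `generic_fill_rect`, `sw_fill_rect` and `demo_hw_fill_rect`, are all hypothetical names, and the "hardware" hook is simulated on the CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical acceleration flags; DirectFB defines its own mask type. */
typedef enum { ACCEL_NONE = 0, ACCEL_FILLRECT = 1 << 0, ACCEL_BLIT = 1 << 1 } accel_mask_t;

typedef struct { int x, y, w, h; } rect_t;

/* Hypothetical hardware driver: what it can accelerate, plus a fill hook. */
typedef struct {
    accel_mask_t supported;
    bool (*fill_rect)(uint32_t *fb, int pitch, const rect_t *r, uint32_t color);
} gfx_driver_t;

typedef enum { PATH_HARDWARE, PATH_SOFTWARE } render_path_t;

/* Software fallback: plain CPU loop; 'pitch' is the line length in pixels. */
static void sw_fill_rect(uint32_t *fb, int pitch, const rect_t *r, uint32_t color)
{
    for (int y = r->y; y < r->y + r->h; y++)
        for (int x = r->x; x < r->x + r->w; x++)
            fb[y * pitch + x] = color;
}

/* Stand-in for a hardware hook; a real driver would program the 2D engine. */
static bool demo_hw_fill_rect(uint32_t *fb, int pitch, const rect_t *r, uint32_t color)
{
    sw_fill_rect(fb, pitch, r, color);
    return true;
}

/* Generic layer: use the driver if it can accelerate, else render in software. */
static render_path_t generic_fill_rect(const gfx_driver_t *drv, uint32_t *fb, int pitch,
                                       const rect_t *r, uint32_t color)
{
    assert(fb != NULL && r != NULL);
    if (drv && (drv->supported & ACCEL_FILLRECT) && drv->fill_rect &&
        drv->fill_rect(fb, pitch, r, color))
        return PATH_HARDWARE;
    sw_fill_rect(fb, pitch, r, color);
    return PATH_SOFTWARE;
}
```

Calling `generic_fill_rect` with a `NULL` driver, or with a driver that only advertises `ACCEL_BLIT`, takes the software path; a driver advertising `ACCEL_FILLRECT` takes the hardware path.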
5Why do we need GFX drivers?
- Embedded CPUs and buses are slow compared to desktop ones
  - 200-400 MHz CPU
  - 120 MHz 32-bit bus
- Therefore, handing rendering tasks over to specialized hardware is crucial!
6Effects of Hardware Acceleration
Test configurations:
     CPU             Bus     RAM    2D Hardware        Kernel  H/W Accel
  A  SH7751 240MHz   SH-Bus  64MB   SMI SM501          2.4.19  Off
  B  SH7751 240MHz   SH-Bus  64MB   SMI SM501          2.4.20  On
  C  SH7751 240MHz   PCI     64MB   Matrox Millennium  2.4.20  On
  D  Celeron 450MHz  PCI     128MB  Matrox Millennium  2.4.20  On
Results:
  Benchmark                    Unit             A       B       C       D
  Fill Rectangles              MPixel/sec   14.07  217.66   63.63   53.25
  Fill Rectangles (blend)      MPixel/sec    1.64    1.66     1.2    3.26
  Fill Triangles               MPixel/sec   12.25   93.69   62.26   50.51
  Fill Triangles (blend)       MPixel/sec    1.63    1.63    1.17    3.17
  Draw Rectangles              KRects/sec    1.81   15.45   10.67    8.57
  Draw Rectangles (blend)      KRects/sec    0.52    0.56    0.43    0.84
  Draw Lines                   KLines/sec     7.1   67.09   61.33   48.84
  Draw Lines (blend)           KLines/sec    2.33    2.43    1.94     3.7
  Blit                         MPixel/sec    8.12  102.47   38.68   32.56
  Blit with format conversion  MPixel/sec    4.04    4.12    3.59   17.79
Hardware acceleration shows remarkable results: between A and B, rectangle fills are about 15 times faster and blits about 13 times faster, while blended operations and blits with format conversion barely change.
Performance depends on the hardware acceleration engine rather than on the CPU: with the same Matrox Millennium, the 240 MHz SH7751 (C) is at least as fast as the 450 MHz Celeron (D) on plain fills, lines and blits.
7How to write GFX Drivers?
- Callback routines need to be written
  - GFX Graphics Driver Functions
  - GFX Graphics Device Functions
- A good starting point is gfxdrivers/i810/*.[ch]
8GFX Graphics Driver Functions
- From core/gfxcard.h:
     typedef struct {
          int       (*Probe)         (GraphicsDevice      *device);
          void      (*GetDriverInfo) (GraphicsDevice      *device,
                                      GraphicsDriverInfo  *driver_info);
          DFBResult (*InitDriver)    (GraphicsDevice      *device,
                                      GraphicsDeviceFuncs *funcs,
                                      void                *driver_data,
                                      void                *device_data);
          DFBResult (*InitDevice)    (GraphicsDevice      *device,
                                      GraphicsDeviceInfo  *device_info,
                                      void                *driver_data,
                                      void                *device_data);
          void      (*CloseDevice)   (GraphicsDevice      *device,
                                      void                *driver_data,
                                      void                *device_data);
          ...
     } GraphicsDriverFuncs;
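How a driver might answer the probe and fill in its callback table during initialization can be sketched as follows. The types here (`device_t`, `device_funcs_t`, `my_driver_data_t`) are simplified, hypothetical stand-ins for the DirectFB core types, and `MY_ACCEL_ID` is an invented identifier; the real signatures are the ones listed in core/gfxcard.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the DirectFB core types. */
typedef struct { unsigned accelerator_id; } device_t;
typedef struct {
    bool (*fill_rectangle)(void *driver_data, void *device_data,
                           int x, int y, int w, int h);
} device_funcs_t;
typedef struct { int fills_issued; } my_driver_data_t;

#define MY_ACCEL_ID 0x501u   /* hypothetical accelerator ID */

/* Probe: report whether this driver supports the detected hardware. */
static int my_probe(const device_t *device)
{
    return device->accelerator_id == MY_ACCEL_ID;
}

/* One drawing callback; a real driver would program the engine here. */
static bool my_fill_rectangle(void *driver_data, void *device_data,
                              int x, int y, int w, int h)
{
    (void)device_data; (void)x; (void)y;
    if (w <= 0 || h <= 0)
        return false;            /* nothing to draw */
    ((my_driver_data_t *)driver_data)->fills_issued++;
    return true;
}

/* InitDriver: reset private data and register the callbacks. */
static int my_init_driver(const device_t *device, device_funcs_t *funcs,
                          void *driver_data)
{
    (void)device;
    assert(funcs != NULL && driver_data != NULL);
    ((my_driver_data_t *)driver_data)->fills_issued = 0;
    funcs->fill_rectangle = my_fill_rectangle;
    return 0;                    /* 0 stands for success in this sketch */
}
```

The core would call the probe first and, only if it succeeds, call the initialization function and later dispatch drawing requests through the registered function pointers.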
9GFX Graphics Device Functions
- From core/gfxcard.h:
     typedef struct _GraphicsDeviceFuncs {
          /*
           * function that is called after variable screeninfo is changed
           * (used for buggy fbdev drivers, that reinitialize something when
           * calling FBIO_PUT_VSCREENINFO)
           */
          void (*AfterSetVar)( void *driver_data, void *device_data );

          /*
           * Called after driver->InitDevice() and during dfb_gfxcard_unlock( true ).
           * The driver should do the one time initialization of the engine,
           * e.g. writing some registers that are supposed to have a fixed value.
           *
           * This happens after mode switching or after returning from
           * OpenGL state (e.g. DRI driver).
           */
          void (*EngineReset)( void *driver_data, void *device_data );

          /*
           * after the video memory has been written to by the CPU (e.g. modification
           * of a texture) make sure the accelerator won't use cached texture data
           */
          void (*FlushTextureCache)( void *driver_data, void *device_data );

          /*
           * Check if the function 'accel' can be accelerated with the 'state'.
           * If that's true, the function sets the 'accel' bit in 'state->accel'.
           * Otherwise the function just returns, no need to clear the bit.
           */
          void (*CheckState)( void *driver_data, void *device_data,
                              CardState *state, DFBAccelerationMask accel );

          /*
           * Program card for execution of the function 'accel' with the 'state'.
           * 'state->modified' contains information about changed entries.
           * This function has to set at least 'accel' in 'state->set'.
           * The driver should remember 'state->modified' and clear it.
           */
10GFX Graphics Device Interface (contd.)
          /*
           * drawing functions
           */
          bool (*FillRectangle) ( void *driver_data, void *device_data,
                                  DFBRectangle *rect );
          bool (*DrawRectangle) ( void *driver_data, void *device_data,
                                  DFBRectangle *rect );
          bool (*DrawLine)      ( void *driver_data, void *device_data,
                                  DFBRegion *line );
          bool (*FillTriangle)  ( void *driver_data, void *device_data,
                                  DFBTriangle *tri );

          /*
           * blitting functions
           */
          bool (*Blit)          ( void *driver_data, void *device_data,
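The contract that the header comments describe for checking and setting state can be sketched as follows. This is an illustration only: `card_state_t`, `my_device_t` and the flag values are hypothetical simplifications, not DirectFB's real definitions, and the sketch assumes an engine that can fill rectangles but cannot blend.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified flags; not DirectFB's real definitions. */
enum { ACCEL_FILLRECT = 1 << 0, ACCEL_BLIT = 1 << 1 };
enum { MOD_COLOR = 1 << 0, MOD_CLIP = 1 << 1 };
enum { BLEND_NONE = 0, BLEND_ALPHA = 1 };

typedef struct {
    unsigned accel;     /* functions the driver agreed to accelerate */
    unsigned set;       /* functions the hardware is programmed for */
    unsigned modified;  /* state entries changed since the last set */
    uint32_t color;
    int      blend;
} card_state_t;

typedef struct { uint32_t color_reg; int reg_writes; } my_device_t;

/* Set the bit only when the operation can be accelerated; otherwise return. */
static void my_check_state(card_state_t *state, unsigned accel)
{
    if (accel == ACCEL_FILLRECT && state->blend == BLEND_NONE)
        state->accel |= accel;
}

/* Program only the entries flagged in 'modified', then record and clear them. */
static void my_set_state(my_device_t *dev, card_state_t *state, unsigned accel)
{
    assert(state->accel & accel);     /* must have passed the check first */
    if (state->modified & MOD_COLOR) {
        dev->color_reg = state->color;
        dev->reg_writes++;
    }
    state->set |= accel;
    state->modified = 0;
}
```

Clearing `modified` is what lets a second call with unchanged state skip the register writes, which matters when register access over a slow bus is the bottleneck.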
11Porting DirectFB on SH7751 + SM501
- Our first development
- The GFX driver for the SM501 just sets registers to issue rendering commands
- Commands are issued on the fly
  - Callback functions immediately set registers to render
  - Rendering appears on screen instantly
[Figure: GFX Driver for SM501, SM501 Registers, SM501.]
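The immediate style of command issue described above can be sketched as follows. The register indices, bit layouts and command codes are hypothetical and do not reproduce the real SM501 register map; the memory-mapped registers are modeled as a plain array so the sketch can run anywhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout; the real SM501 map is not reproduced here. */
enum { REG_DST_XY, REG_SIZE, REG_COLOR, REG_COMMAND, REG_STATUS, REG_COUNT };
enum { CMD_FILL_RECT = 1 };
enum { STATUS_BUSY = 1 };

typedef struct { volatile uint32_t *mmio; } imm_device_t;

/* Spin until the engine has finished the previous command. */
static void wait_idle(const imm_device_t *dev)
{
    while (dev->mmio[REG_STATUS] & STATUS_BUSY)
        ;
}

/* Callback body: write the parameters, then the command, and return. */
static bool imm_fill_rectangle(imm_device_t *dev, int x, int y, int w, int h,
                               uint32_t color)
{
    assert(dev != NULL && dev->mmio != NULL);
    if (w <= 0 || h <= 0)
        return false;
    wait_idle(dev);
    dev->mmio[REG_DST_XY]  = ((uint32_t)(x & 0xFFFF) << 16) | (uint32_t)(y & 0xFFFF);
    dev->mmio[REG_SIZE]    = ((uint32_t)(w & 0xFFFF) << 16) | (uint32_t)(h & 0xFFFF);
    dev->mmio[REG_COLOR]   = color;
    dev->mmio[REG_COMMAND] = CMD_FILL_RECT;  /* this write starts the engine */
    return true;
}
```

Because the command starts as soon as the last register is written, there is no queue to flush and the result becomes visible right away, at the cost of one register round trip per operation.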
12Porting DirectFB on SH7770
- Our second development
- The GFX driver for the SH7770 creates lists of rendering commands, so-called display lists
- The list is double buffered
  - The driver fills a list until it is full, and then passes it to the 2D engine
  - While the driver is filling one list, the 2D engine reads commands from the other
  - Once the 2D engine is done with a list, it sends an interrupt and gets the next list
- Rendering does not appear on screen instantly
  - A sync mechanism is required to synchronize with software rendering done by the generic GFX driver
[Figure: GFX Driver for SH7770, Display List 1 and Display List 2, SH7770 2D Engine, Registers.]
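The double-buffered display list scheme can be sketched as follows. This is a single-threaded simulation under stated assumptions, not the SH7770 driver: all names (`dl_driver_t`, `dl_emit`, `dl_kick`, `dl_sync`, `dl_engine_done`) are hypothetical, the list capacity is tiny on purpose, and the completion interrupt is replaced by a direct function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LIST_CAPACITY 4   /* tiny on purpose; a real list would be far larger */

typedef struct { uint32_t cmds[LIST_CAPACITY]; int count; } display_list_t;

typedef struct {
    display_list_t lists[2];
    int  fill;          /* index of the list the driver is filling */
    bool engine_busy;   /* true while the engine reads the other list */
    int  executed;      /* commands consumed by the simulated engine */
} dl_driver_t;

/* Stand-in for the completion interrupt: the engine finished its list. */
static void dl_engine_done(dl_driver_t *d)
{
    display_list_t *running = &d->lists[1 - d->fill];
    d->executed += running->count;
    running->count = 0;
    d->engine_busy = false;
}

/* Hand the filled list to the engine and continue with the other one. */
static void dl_kick(dl_driver_t *d)
{
    if (d->lists[d->fill].count == 0)
        return;
    if (d->engine_busy)
        dl_engine_done(d);   /* a real driver would sleep until the interrupt */
    d->fill = 1 - d->fill;
    d->engine_busy = true;
}

/* Append one command, swapping lists when the current one is full. */
static void dl_emit(dl_driver_t *d, uint32_t cmd)
{
    if (d->lists[d->fill].count == LIST_CAPACITY)
        dl_kick(d);
    display_list_t *l = &d->lists[d->fill];
    assert(l->count < LIST_CAPACITY);
    l->cmds[l->count++] = cmd;
}

/* Sync point: required before the CPU (software rendering) touches the surface. */
static void dl_sync(dl_driver_t *d)
{
    dl_kick(d);
    if (d->engine_busy)
        dl_engine_done(d);
}
```

The generic driver would have to call something like `dl_sync` before any software rendering, because commands still sitting in a list have not reached the frame buffer yet.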
13Missing Pieces in DirectFB
14What's Missing?
- Access to a multiple-layered frame buffer from a process
  - Recent graphics hardware has multiple frame buffers.
- Using the scroll function of the hardware
  - New feature not covered by the DirectFB API
- Synchronous rendering and display with VSYNC (QoS, delay handling)
  - Real-time motion graphics (e.g. games), car navigation, etc.
- Synchronizing the 2D engine and the 3D engine
  - Rendering with the 2D and 3D engines in a single layer
  - Synchronous display even when 2D and 3D are on different layers
15Access Multiple Layered Frame Buffer from a Process
- Recent graphics hardware has a multiple-layered frame buffer.
- To coordinate layers efficiently, an application process wants to issue rendering commands to each layer and switch the display of each layer on and off.
[Figure: an application process, Layer 1 (/dev/fb0), Layer 2 (/dev/fb1), Layer 3 (/dev/fb2) and the Display Unit.]
16Using Scroll Function on Hardware
- Use cases: car navigation systems, web browsers
- The scroll function reduces re-rendering cost.
[Figure: simple scroll and wraparound scroll of the display start position over buffer regions A, B and C.]
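The difference between the two scroll modes comes down to how the display start position is computed. The following sketch shows the arithmetic only; the function names and the line-based units are assumptions, and no hardware is touched.

```c
#include <assert.h>

/* Simple scroll: the start line is clamped to the virtual buffer. */
static int scroll_simple(int start, int delta, int virtual_h, int visible_h)
{
    assert(visible_h > 0 && virtual_h >= visible_h);
    int max_start = virtual_h - visible_h;
    int s = start + delta;
    if (s < 0) s = 0;
    if (s > max_start) s = max_start;
    return s;
}

/* Wraparound scroll: the start line wraps at the end of the buffer. */
static int scroll_wrap(int start, int delta, int virtual_h)
{
    assert(virtual_h > 0);
    int s = (start + delta) % virtual_h;
    return s < 0 ? s + virtual_h : s;   /* C's % can yield a negative result */
}

/* Buffer line that holds visible row 'row' when the display wraps. */
static int wrapped_line(int start, int row, int virtual_h)
{
    assert(virtual_h > 0 && start >= 0 && row >= 0);
    return (start + row) % virtual_h;
}
```

With wraparound, an application scrolling a map only has to render the newly exposed strip into the lines that just left the screen, instead of redrawing the whole view. On Linux fbdev the start position corresponds to the `yoffset` field of `struct fb_var_screeninfo`, applied with the `FBIOPAN_DISPLAY` ioctl, and `FB_VMODE_YWRAP` requests wrapping where the driver supports it.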
17Synchronous Rendering and Display with VSYNC (QoS)
- In real-time motion graphics applications, the screen must be updated in sync with the VSYNC signal.
- Under the standard (fair) task scheduling in Linux, rendering might miss the display timing (VSYNC) because of signal interrupts or other heavy tasks.
- Ideal: the rendering task keeps the resources it needs in every VSYNC time slot.
[Figure: timeline of VSYNC slots, about 16.6 ms each (for NTSC), shared between the rendering task and other tasks.]
18Synchronous Rendering and Display with VSYNC (Delay Handling)
- Real-time motion graphics applications could be optimized for screen rendering, especially for delay handling.
- The application needs the VSYNC signal timing to notice a delay.
19Delay Handling by Application
- Example 1: Skipping the next frame
  - When an application notices that the display has been delayed, it can skip the next frame and start rendering the frame after it.
[Figure: timeline of VSYNC slots, about 16.6 ms each (for NTSC). Ideal: Frame 1, Frame 2, Frame 3, Frame 4. Rendering: Frame 1, Frame 2 (delayed), Frame 4. Display: Frame 1, Frame 1 again (miss), Frame 2, Frame 4. Frame 3 is skipped to catch up with the delay.]
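The frame-skipping decision can be sketched as a small calculation on VSYNC slot numbers. This is a minimal sketch under stated assumptions: time is measured in microseconds from the start of slot 0, frame N is meant to be rendered during slot N, and the names `vsync_slot` and `next_frame` are hypothetical. The slot length uses the NTSC field period of roughly 16.68 ms, which the slides round to 16.6 ms.

```c
#include <assert.h>

#define FRAME_US 16683L   /* one NTSC field period in microseconds (approx.) */

/* Index of the VSYNC slot that contains time 'now_us'. */
static long vsync_slot(long now_us)
{
    assert(now_us >= 0);
    return now_us / FRAME_US;
}

/*
 * Next frame to render after 'frame' finished rendering at 'finished_us'.
 * On time: continue with the following frame. Late: skip ahead to the
 * frame that belongs to the next slot, dropping the ones in between.
 */
static long next_frame(long frame, long finished_us)
{
    long slot = vsync_slot(finished_us);
    if (slot <= frame)
        return frame + 1;   /* finished within its own slot */
    return slot + 1;        /* overran: catch up by skipping */
}
```

In the situation from the figure, frame 2 finishes during slot 3, so `next_frame(2, ...)` yields 4 and frame 3 is never rendered. The application can only make this decision if it learns the VSYNC timing from the graphics layer.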
20Delay Handling by Application (contd.)
- Example 2: Updating the screen even if rendering is not finished
  - Additionally, the application could assign priorities to its rendering operations in case an incomplete frame is displayed.
[Figure: timeline of VSYNC slots, about 16.6 ms each (for NTSC). Ideal: Frame 1, Frame 2, Frame 3, Frame 4. Rendering: Frame 1, Frame 2 (terminated), Frame 3, Frame 4. Display: Frame 1, uncompleted Frame 2 (display enforced), Frame 3, Frame 4.]
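One way to combine early termination with prioritized operations is sketched below. It assumes, purely for illustration, that the cost of each operation is known in advance; a real application would more likely check a clock against the VSYNC deadline. The type `render_op_t` and the function `render_until_deadline` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int priority; long cost_us; int done; } render_op_t;

/* qsort comparator: higher priority first. */
static int by_priority_desc(const void *a, const void *b)
{
    const render_op_t *x = a, *y = b;
    return (y->priority > x->priority) - (y->priority < x->priority);
}

/*
 * Render as many operations as fit into 'budget_us', most important
 * first, and return how many completed. The caller displays the frame
 * afterwards whether or not everything was drawn.
 */
static int render_until_deadline(render_op_t *ops, int n, long budget_us)
{
    assert(ops != NULL && n >= 0);
    qsort(ops, (size_t)n, sizeof ops[0], by_priority_desc);
    long used = 0;
    int completed = 0;
    for (int i = 0; i < n; i++) {
        if (used + ops[i].cost_us > budget_us)
            break;              /* deadline reached: terminate rendering */
        used += ops[i].cost_us;
        ops[i].done = 1;        /* stand-in for issuing the real drawing */
        completed++;
    }
    return completed;
}
```

With this ordering, an incomplete frame still contains its most important elements, for example the vehicle position in a navigation display, while decorative operations are the ones that get dropped.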
21Synchronize 2D Engine and 3D Engine
- Many 3D graphics applications combine 3D graphics and 2D graphics.
  - These 2D graphics must be synchronized with the 3D graphics.
- Some 3D acceleration hardware (nVIDIA, ATI, for example) is separate from the 2D hardware.
  - A synchronization mechanism between 2D and 3D graphics (hardware) is needed.
22Synchronization Problem in 2D Engine and 3D Engine
- Situation 1: Rendering 2D graphics and 3D graphics into a single layer simultaneously
  - The 2D engine and the 3D engine try to draw into one frame without any synchronization. They should be serialized.
  - Issued rendering commands might be performed asynchronously.
[Figure: Application, 2D Engine, 3D Engine, a single Layer and the Display.]
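One possible way to serialize two asynchronous engines is a fence: each engine counts the commands it has completed, and a command may name a count on the other engine that must be reached before it runs. The sketch below simulates this in a single thread. It is not an existing DirectFB mechanism; every name in it (`gpu_t`, `engine_t`, `submit`, `step`) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CMDS 8

enum { ENGINE_2D = 0, ENGINE_3D = 1 };

typedef struct {
    int id;           /* label recorded when the command executes */
    int wait_engine;  /* engine whose fence must be reached first, or -1 */
    int wait_value;   /* required fence value on that engine */
} cmd_t;

typedef struct {
    cmd_t queue[MAX_CMDS];
    int head, tail;
    int fence;        /* number of commands this engine has completed */
} engine_t;

typedef struct {
    engine_t eng[2];
    int order[2 * MAX_CMDS];  /* order in which the shared layer was drawn */
    int n_order;
} gpu_t;

/* Queue a command; returns the fence value that signals its completion. */
static int submit(gpu_t *g, int engine, int id, int wait_engine, int wait_value)
{
    engine_t *e = &g->eng[engine];
    assert(e->tail < MAX_CMDS);
    e->queue[e->tail] = (cmd_t){ id, wait_engine, wait_value };
    return ++e->tail;
}

/* Let one engine try its next command; false if it is idle or blocked. */
static bool step(gpu_t *g, int engine)
{
    engine_t *e = &g->eng[engine];
    if (e->head == e->tail)
        return false;
    const cmd_t *c = &e->queue[e->head];
    if (c->wait_engine >= 0 && g->eng[c->wait_engine].fence < c->wait_value)
        return false;                 /* dependency has not finished yet */
    g->order[g->n_order++] = c->id;   /* "draw" into the shared layer */
    e->head++;
    e->fence = e->head;
    return true;
}
```

For example, a 2D overlay submitted with the fence value returned for the 3D scene stays blocked until the 3D engine has finished, regardless of which engine is polled first.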
23Synchronization Problem in 2D Engine and 3D Engine
- Situation 2: Displaying a 2D graphics layer and a 3D graphics layer synchronously
  - The 2D and 3D engines can each draw into their own layer asynchronously if a multiple-layered frame buffer is available.
  - A synchronous display function is still needed.
[Figure: Application, 2D Engine with its 2D Layer, 3D Engine with its 3D Layer, and synchronous display of both layers.]
24Conclusion
- We need to consider whether the DirectFB API should, or could, cover all application requirements.
- We would like to submit proposals and distribute implementations that address these issues.