XC6200 Family FPGAs - PowerPoint PPT Presentation

About This Presentation
Title:

XC6200 Family FPGAs

Description:

XC6200 Family FPGAs. By: Ahmad Alsolaim

Slides: 52
Provided by: Stephe723
Transcript and Presenter's Notes

Title: XC6200 Family FPGAs


1
XC6200 Family FPGAs
By Ahmad Alsolaim
Alsolaim
2
Agenda
  • XC6200 Architecture
  • Design Flows
  • Library Support
  • Applications
  • Reconfigurable Processing

3
Problems Confronting Embedded Control Designers Today
  • Reconfiguration from external memory limited to
    low frequency
  • High frequency access to registers needed
  • Microprocessor interface consumes resources
  • Bus access to large number of internal registers
    requires careful design
  • Insufficient memory capacity for coprocessing
    algorithms
  • Partial Reconfiguration is difficult
[Diagram: CPU, Reconfigurable Coprocessor (FPGA) and I/O blocks]
4
XC6200 System Features Meet Embedded Coprocessing
Requirements
  • 1000x improvement in reconfiguration time from
    external memory
  • FastMAP™ assures high speed access to all
    internal registers
  • Microprocessor interface built-in
  • All registers accessed via built-in low-skew
    FastMAP™ busses
  • High capacity distributed memory permits
    allocation of chip resources to logic or memory
  • Ultrafast Partial Reconfiguration fully supported
  • Up to 100,000 gates
[Diagram: CPU, Memory, Reconfigurable Coprocessor (XC6200) and I/O blocks]
5
XC6200 Architectural Overview
  • Array of fine grain function cells, each with a
    register
  • high gate count for structured logic or regular
    arrays
  • Abundant, hierarchical routing resources
  • Flexible pin configuration
  • programmable as in, out, bidirectional, tristate
  • CMOS or TTL logic levels

6
XC6200 Architecture (cont)
  • High speed CPU interface for configuration and
    register I/O
  • Programmable bus width (8..32-bits)
  • Direct processor read/write access to all user
    registers
  • All user registers and configuration SRAM mapped
    into processor address space

7
XC6200 Architecture
[Figure: array hierarchy of Function Cells, 4x4 Blocks and 16x16 Tiles, surrounded by User I/Os; the FastMAP™ Interface carries Address, Data and Control]
Number of tiles varies between devices in family
8
Logical Organization: Basic Cell
9
Logical Organization: XC6200 Function Unit
  • Function unit allows
  • any function of 2 variables
  • any flavour of 2:1 mux
  • buffers, inverters, or constant 0s and 1s
  • any of the above in addition to a D-type register
  • 3 inputs, each from any of 8 directions; output
    to up to 4 directions
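
The claim that one unit covers any function of 2 variables can be checked with a short sketch. It assumes a simplified model, not the actual XC6200 cell netlist: a single 2:1 mux selected by `a`, with each data input chosen from 0, 1, `b` or inverted `b`. The names `mux2`, `SOURCES` and `realise` are illustrative only.

```python
from itertools import product

def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0

# Candidate data inputs for the mux (an assumption of this sketch):
# the constants, b itself, or inverted b.
SOURCES = {
    "0": lambda b: 0,
    "1": lambda b: 1,
    "b": lambda b: b,
    "~b": lambda b: 1 - b,
}

def realise(truth_table):
    """Find mux data inputs implementing a 2-variable truth table.

    truth_table maps (a, b) -> 0/1. Returns the source names used for
    a=0 and a=1 (Shannon expansion around a), or None if none fits.
    """
    for n0, n1 in product(SOURCES, repeat=2):
        if all(mux2(a, SOURCES[n0](b), SOURCES[n1](b)) == truth_table[(a, b)]
               for a, b in product((0, 1), repeat=2)):
            return n0, n1
    return None

# Enumerate all 16 functions of two variables and solve each one.
inputs = list(product((0, 1), repeat=2))
all_tables = [dict(zip(inputs, outs)) for outs in product((0, 1), repeat=4)]
solutions = [realise(t) for t in all_tables]

xor = {(a, b): a ^ b for a, b in inputs}
print(realise(xor))
```

Every one of the 16 truth tables gets a distinct pair of mux inputs in this model, which is why a mux-based cell can stand in for any two-input gate.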

10
Logical Organization: Function Unit
Figure 6: XC6200 Function Unit
11
Logical Organization: Function Unit (cont)
12
Logical Organization: Function Unit (cont)
13
Physical Organization: Cells, Blocks and Tiles
14
Physical Organization: Cells, Blocks and Tiles (cont)
15
Routing Resources Example
16
Routing Switches
17
North and South Switches
18
East and West Switches
19
Clock Distribution
20
Clear Distribution
21
Input/Output Architecture
22
Connections Between IOBs And Built-In XC6200
Control Logic
23
Array Data Sources In West IOBs
24
XC6200 Device Organization
  • Conceptual view
  • Logic symbol

25
FastMAP CPU Interface
  • The industry's only random access configuration
    interface
  • allows for extremely fast full or partial device
    configuration - you only program the bits you
    need
  • Allows direct CPU (random) access to user
    registers
  • supports coprocessing applications.

26
FastMAP CPU Interface (cont)
  • Easily interfaced to most microprocessors and
    microcontrollers
  • memory mapped architecture makes it just like
    designing with SRAM

27
FastMAP (cont)
  • Map Register allows mapping of user registers on
    to 8, 16, or 32 bit data bus
  • Allows unconstrained register placement
  • Obviates need for complex shift and mask
    operations

[Figure: a Map Register alongside the Cell Array selects the cells of a user-defined register and maps them onto Data Bus bits 7 to 0]
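
What the Map Register saves the CPU from doing can be sketched in software. The model below is an assumption for illustration: one column of cells, a mask with one flag per row, and a set flag meaning "this row belongs to the register" (the real device's polarity and row ordering may differ). `read_mapped` and `write_mapped` are hypothetical names.

```python
def read_mapped(column_bits, map_mask):
    """Pack the cells selected by map_mask into a contiguous data word.

    column_bits: list of 0/1 cell outputs, index 0 = lowest row.
    map_mask:    list of 0/1 flags, 1 = row is part of the register
                 (polarity is an assumption of this sketch).
    """
    word = 0
    bus_bit = 0
    for cell, selected in zip(column_bits, map_mask):
        if selected:
            word |= cell << bus_bit   # next selected row -> next bus bit
            bus_bit += 1
    return word

def write_mapped(column_bits, map_mask, word):
    """Scatter the low bits of word into the selected cells only."""
    result = list(column_bits)
    bus_bit = 0
    for row, selected in enumerate(map_mask):
        if selected:
            result[row] = (word >> bus_bit) & 1
            bus_bit += 1
    return result

# A 4-bit user register placed on rows 0, 2, 3 and 6 of an 8-row column.
mask = [1, 0, 1, 1, 0, 0, 1, 0]
column = write_mapped([0] * 8, mask, 0b1011)
value = read_mapped(column, mask)
print(column, bin(value))
```

The loops here are the shift-and-mask work that the hardware mapping makes unnecessary: the processor simply sees the register as adjacent bits on the data bus.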
28
FastMAP (cont)
  • Wildcard Registers allow don't cares on address
    bits
  • same data can be written to several locations
    (SRAM and user registers) in one cycle
  • fast configuration of bit-slice type designs
  • broadcast of data to registers without tying up
    valuable routing resources.
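
The effect of address don't cares can be illustrated with a small model. This is a sketch under assumptions, not the device's register layout: a set bit in `wildcard` means the matching address bit is ignored, and `wildcard_targets` and `broadcast_write` are made-up names.

```python
def wildcard_targets(address, wildcard, width):
    """Return every address that matches when wildcard bits are don't cares."""
    full = (1 << width) - 1
    targets = []
    for candidate in range(1 << width):
        # Only the bits that are NOT wildcarded have to agree.
        if (candidate ^ address) & ~wildcard & full == 0:
            targets.append(candidate)
    return targets

def broadcast_write(memory, address, wildcard, value, width):
    """Model one write cycle that lands on all matching locations."""
    for target in wildcard_targets(address, wildcard, width):
        memory[target] = value
    return memory

# 4-bit address space; bits 1 and 2 are don't cares.
targets = wildcard_targets(0b0101, 0b0110, width=4)
memory = broadcast_write([0] * 16, 0b0101, 0b0110, 0xAB, width=4)
print(targets)
```

With two wildcarded bits a single write reaches four locations, which is the mechanism behind fast configuration of bit-slice designs where every slice receives the same data.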

29
Partial Run-time Reconfiguration
  • Extend hardware to a larger (virtual) capacity
    through rapid reconfiguration
  • Derive time-varying structures that are smaller
    and faster than the ASIC counterpart
  • Make more transistors participate in a given
    computation

30
Partial Run-time Reconfiguration
[Figure: array holding functions F2, F3, F4, F5 and F6 at Time 0]
31
Partial Run-time Reconfiguration
[Figure: array holding functions F2, F7, F8, F9, F4, F5 and F6 at Time <a short time later>]
32
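Reconfiguration Speed vs Traditional Technologies
[Chart: reconfiguration time on a logarithmic axis (ns, us, ms, s) for Rewiring, Circuit Updates, Block Swapping and Design Swapping, comparing the XC6216 with the XC4013; values shown: 40ns, 200us, 250ms]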
33
XC6200 Family Members
34
Design Flows
35
Library Support
  • Primitive gates and functions (compatible with
    other Xilinx parts)
  • AND, OR, ADD, MULT, etc
  • More complex macros also to be available
  • memory access
  • DSP functions (FIR, FFT, DCT)
  • JTAG, decoders, etc.

36
Applications
  • Can be used as regular FPGA
  • serial interface allows for booting from PROM
  • Intended to act as hardware accelerator for
    microprocessors
  • FastMAP allows for
  • direct microprocessor access to internal logic
  • fast reconfiguration of all or part of device

37
Applications (cont)
  • Context switching and virtual hardware are
    realistic propositions
  • Typical uses might include DSP, image processing,
    datapaths, etc.

38
Reconfigurable Processing
  • Custom computing concept, building on
  • fast configuration
  • virtual hardware
  • PCI based development system to be made available
  • can be used as a custom computer in its own
    right, or
  • as an aid to system development for customers'
    designs

39
XC6000 Software
  • XACT6000: software from Xilinx (will be available
    soon in our lab)
  • Trianus/Hades: design entry software for the
    XC6200 (available in our lab)
  • Velab: free VHDL elaborator for the XC6200
    (available in our lab)
  • XC6200 Inspector (available in our lab)

40
A Multiplier for the XC6200
41
A Multiplier for the XC6200
  • Structure
  • Math
  • Building Lookup Tables
  • Area Optimization
  • Mapping into an XC6200
  • Changing Coefficients
  • Performance
  • Summary

42
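Distributed Arithmetic (Multiplier)
[Block diagram: the 8 bit data word is split into two 4-bit halves, each addressing a 16 x 12 LUT; a 12 bit adder combines the 12-bit output of one LUT with the upper 8 bits of the other, and the remaining 4 bits bypass the adder]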
43
Math Class
[Worked example table; rows: Constant, LUT-B Input, LUT-A Input, LUT-A Output, LUT-B Output, Adder Output]
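
The arithmetic behind those rows can be reproduced with a short sketch. The constant `0x5A` and the data value `0xC7` are arbitrary values chosen for illustration, not taken from the slide, and feeding the low nibble to LUT-A and the high nibble to LUT-B is an assumption read from the block diagram labels.

```python
def build_lut(constant):
    """16-entry table of constant * nibble; an 8-bit constant fits 12 bits."""
    return [constant * nibble for nibble in range(16)]

def da_multiply(data, lut_a, lut_b):
    """Multiply 8-bit data by the constant baked into the two tables."""
    a_out = lut_a[data & 0xF]          # LUT-A: partial product of low nibble
    b_out = lut_b[(data >> 4) & 0xF]   # LUT-B: partial product of high nibble
    return a_out + (b_out << 4)        # high-nibble product is weighted by 16

CONSTANT = 0x5A   # arbitrary illustrative coefficient
DATA = 0xC7       # arbitrary illustrative input
lut = build_lut(CONSTANT)  # both LUTs hold the same table

print(f"Constant     = {CONSTANT:#04x}")
print(f"LUT-B input  = {DATA >> 4:#x}")
print(f"LUT-A input  = {DATA & 0xF:#x}")
print(f"LUT-A output = {lut[DATA & 0xF]:#05x}")
print(f"LUT-B output = {lut[DATA >> 4]:#05x}")
print(f"Adder output = {da_multiply(DATA, lut, lut):#06x}")
```

Splitting the multiplicand into nibbles keeps each table at 16 entries, compared with 256 entries for a single table indexed by the whole byte.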
44
Architecture of the Multiplier
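[Block diagram: the input M[7:0] is split into M[7:4] and M[3:0], which address the Pipelined Lookup Tables LUT-B and LUT-A. LUT-A outputs A[3:0], A[7:4] and A[11:8]; LUT-B outputs B[3:0], B[7:4] and B[11:8]. After a Pipeline Register, a Pipelined Adder built from a 4-bit half adder, a 4-bit full adder and a 4-bit half adder, linked by Carry signals, produces the product P[3:0], P[7:4], P[11:8] and P[15:12]]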
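
One plausible reading of the adder slices is sketched below: the low nibble of LUT-A passes straight to the product, and three 4-bit slices chained by carries form the rest. Which slice handles which bits is an assumption based on the bit-range labels, and the pipeline registers are not modelled (the code is purely combinational).

```python
def add4(x, y, carry_in=0):
    """One 4-bit adder slice: returns (4-bit sum, carry out)."""
    total = x + y + carry_in
    return total & 0xF, total >> 4

def nibble(value, index):
    """Extract 4-bit field number index (0 = least significant)."""
    return (value >> (4 * index)) & 0xF

def sliced_add(a, b):
    """Combine 12-bit LUT outputs a and b into a 16-bit product.

    b carries a weight of 16, so a[3:0] needs no addition at all.
    """
    p0 = nibble(a, 0)                               # P[3:0] = A[3:0]
    p1, c1 = add4(nibble(a, 1), nibble(b, 0))       # two operands, no carry in
    p2, c2 = add4(nibble(a, 2), nibble(b, 1), c1)   # two operands plus carry
    p3, _ = add4(nibble(b, 2), 0, c2)               # one operand plus carry
    return p0 | (p1 << 4) | (p2 << 8) | (p3 << 12)

# Check against ordinary multiplication for one constant.
constant = 0xB3  # arbitrary illustrative coefficient
results = [sliced_add(constant * (m & 0xF), constant * (m >> 4))
           for m in range(256)]
print(hex(results[0xFF]))
```

Only the middle slice needs two operands and a carry input, which matches the half, full, half arrangement named on the slide.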
45
LUTs by Muxing
  • Lookup Table contains all pre-calculated partial
    products.
  • Use a Truth Table to determine Mux inputs.

All possible products for multiplying by 0011 (3)
[Truth table with columns A3, A2, A1, A0 and Px]
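
The truth table for the multiply-by-3 case can be regenerated as below. This is a minimal sketch: the 6-bit product width follows from 3 x 15 = 45, and `column` is a hypothetical helper that extracts the column for one product bit, which is what a one-bit LUT has to store.

```python
CONSTANT = 0b0011  # multiply by 3, as on the slide

# One row per input value A = A3 A2 A1 A0.
rows = []
for a in range(16):
    bits = [(a >> i) & 1 for i in (3, 2, 1, 0)]
    rows.append((bits, CONSTANT * a))

print("A3 A2 A1 A0 | product")
for bits, prod in rows:
    print("  ".join(str(b) for b in bits), "|", format(prod, "06b"), f"({prod})")

def column(bit):
    """Truth-table column for product bit `bit`, indexed by A = 0..15."""
    return [(prod >> bit) & 1 for _, prod in rows]

# Each column is the content of one 16 x 1 lookup table.
p0 = column(0)
p1 = column(1)
```

Reading a column top to bottom gives the 16 constants that sit on the data inputs of the mux tree for that product bit.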
46
Optimizing the Lookup
  • Two mux levels can be collapsed into a single
    gate.
  • The function can be determined with a truth table.

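[Figure: a lookup bit with no optimization compared with the optimized form, in which inputs A1 and A0 drive single gates Func1 to Func4; gates shown include XOR, NAND, OR and BUF]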
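
The collapse can be sketched as follows: for a fixed pair (A3, A2), the four constants reached through A1 and A0 form a two-variable truth table, which is always one of 16 named functions. The constant 3 is reused from the previous slide for illustration, the gate names are this sketch's own labels, and nothing here claims to reproduce the particular gates in the figure.

```python
# Truth tables are listed for (A1, A0) = (0,0), (0,1), (1,0), (1,1).
GATE_NAMES = {
    (0, 0, 0, 0): "0",
    (0, 0, 0, 1): "AND",
    (0, 0, 1, 0): "A1 AND NOT A0",
    (0, 0, 1, 1): "BUF A1",
    (0, 1, 0, 0): "NOT A1 AND A0",
    (0, 1, 0, 1): "BUF A0",
    (0, 1, 1, 0): "XOR",
    (0, 1, 1, 1): "OR",
    (1, 0, 0, 0): "NOR",
    (1, 0, 0, 1): "XNOR",
    (1, 0, 1, 0): "NOT A0",
    (1, 0, 1, 1): "A1 OR NOT A0",
    (1, 1, 0, 0): "NOT A1",
    (1, 1, 0, 1): "NOT A1 OR A0",
    (1, 1, 1, 0): "NAND",
    (1, 1, 1, 1): "1",
}

def collapsed_gates(constant, bit):
    """Name the (A1, A0) function of one product bit for each (A3, A2).

    Returns a dict keyed by the 2-bit value of (A3, A2); the four
    entries play the role of Func1..Func4 feeding a final 4:1 mux.
    """
    gates = {}
    for hi in range(4):
        table = tuple(((constant * (hi * 4 + lo)) >> bit) & 1
                      for lo in range(4))
        gates[hi] = GATE_NAMES[table]
    return gates

for bit in range(6):
    print(f"P{bit}:", collapsed_gates(3, bit))
```

Each named function fits a single function unit, so a one-bit, 4-input lookup needs at most four such cells plus the selection on A3 and A2.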
47
Multiplier Schematic
  • Schematic resembles the block diagram.
  • Two LUTs sourcing adder.
  • The corresponding view in the layout editor.
  • The LUTs are offset to line up bits for adder.
  • Pipeline registers are cheap.
  • XC6216 has 4096 Flip Flops

[Figure: schematic and layout views showing LUT-A, LUT-B and ADDER]
48
A Closer Look at a Lookup
  • Each 12-bit LUT is built from 12 one-bit LUTs.
  • LUTs get stacked vertically.

49
Determining Coefficients
  • Schematic for a single 4-input LUT.
  • Functions can be determined from the Truth Table.

50
Changing Coefficients
  • Functionality of a cell is contained in one byte.
  • 32-bit access can change the function of 4 cells
    per write cycle.
  • 96 cells need writing, or 24 write cycles. (worst
    case)
  • 1.45 us assuming 33 MHz
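
The write count and a rough timing can be checked with a few lines. The number of clock cycles per write is not stated on the slide, so `clocks_per_write` is an assumed parameter; the count of 96 cells is consistent with two 12-bit LUTs that each need four function cells per bit. `reconfig_time` is a hypothetical helper.

```python
import math

def reconfig_time(cells, bus_bits=32, clock_hz=33e6, clocks_per_write=1):
    """Return (write cycles, seconds) to rewrite one function byte per cell.

    clocks_per_write is an assumption; the slide does not state it.
    """
    cells_per_write = bus_bits // 8          # one function byte per cell
    writes = math.ceil(cells / cells_per_write)
    return writes, writes * clocks_per_write / clock_hz

writes, seconds = reconfig_time(96)
writes2, seconds2 = reconfig_time(96, clocks_per_write=2)
print(writes, f"{seconds * 1e6:.2f} us at 1 clock per write")
print(writes2, f"{seconds2 * 1e6:.2f} us at 2 clocks per write")
```

Either way the coefficient swap completes in the order of a microsecond, and a narrower bus scales the write count up in proportion (for example `reconfig_time(96, bus_bits=8)` needs 96 writes).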

51
Summary
  • 8x8 constant coefficient multiplier
  • Pipelined - 75 MHz performance
  • Small grain architecture - High degree of LUT
    optimization
  • Coefficients easily changed - Fast reconfig
    times.
  • High Performance/Dollar