Title: Miodrag Bolic
1. Department of Electrical and Computer Engineering, Stony Brook University
Dissertation Defense
ARCHITECTURES FOR EFFICIENT IMPLEMENTATION OF PARTICLE FILTERS
Miodrag Bolic
Advisor: Prof. Petar M. Djuric
2. Outline
- PART I Introduction
  - Motivation and goals
  - Challenges
- PART II Theory of PFs
  - Dynamic model
  - Monte Carlo sampling
  - Importance sampling
  - Resampling
  - Bearings-only tracking example
  - Steps and complexity
- PART III Implementation of PFs
  - VLSI signal processing architectures
  - Methodology
  - Non-parallel implementation
  - Algorithm characteristics
  - Modifications of the PF
  - New resampling algorithms
  - Architecture
  - Implementation results
  - Parallel implementation
  - Propagation of particles
  - Parallel resampling
  - Architectures for parallel resampling
  - Space exploration
  - Gaussian PFs
- Conclusions and future work
3. Introduction: Motivations and Goals
[Figure: block diagram with a sensor and a particle filter]
- Goal
  - Increase the speed of particle filters
4. Introduction: Challenges
- Reducing computational complexity
- Randomness: difficult to exploit regular structures in VLSI
- Exploiting temporal and spatial concurrency
- First hardware implementation of particle filters (50 times improvement in speed in comparison with DSP)
- New resampling algorithms suitable for hardware implementation
- Fast particle filtering algorithms that do not use memories
- First distributed algorithms and architectures for particle filters
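6. Theory of PFs: Dynamic model
- Example: bearings-only tracking
- States: position and velocity, $\mathbf{x}_k = [x_k, V_{x,k}, y_k, V_{y,k}]^T$
- Observations: angle $z_k$
- Observation equation: $z_k = \operatorname{atan}(y_k / x_k) + v_k$
- State equation: $\mathbf{x}_k = F \mathbf{x}_{k-1} + G u_k$
General state-space model:
$z_k = f_z(\mathbf{x}_k, v_k)$
$\mathbf{x}_k = f_x(\mathbf{x}_{k-1}, u_k)$
$f_x$: state transition function; $u_k$: process noise
$f_z$: measurement function; $v_k$: observation noise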
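The state and observation equations above can be simulated in a few lines. This is only a sketch: the slides do not give F, G, the sampling period, or the noise levels, so the constant-velocity form, the values of `T`, `sigma_u`, and `sigma_v`, and the function names are all assumptions made for illustration.

```python
import math
import random

T = 1.0  # sampling period (assumed value, not given in the slides)


def step_state(state, u):
    """x_k = F x_{k-1} + G u_k for an assumed constant-velocity F and G.

    state = [x, Vx, y, Vy]; u = (ux, uy) is the process noise.
    """
    x, vx, y, vy = state
    ux, uy = u
    return [
        x + T * vx + 0.5 * T * T * ux,
        vx + T * ux,
        y + T * vy + 0.5 * T * T * uy,
        vy + T * uy,
    ]


def observe(state, v):
    """z_k = atan(y_k / x_k) + v_k (atan2 avoids a division by zero)."""
    x, _, y, _ = state
    return math.atan2(y, x) + v


def simulate(initial_state, steps, sigma_u=0.01, sigma_v=0.005, seed=0):
    """Generate a trajectory and its noisy bearings (noise levels assumed)."""
    rng = random.Random(seed)
    state, states, bearings = list(initial_state), [], []
    for _ in range(steps):
        u = (rng.gauss(0.0, sigma_u), rng.gauss(0.0, sigma_u))
        state = step_state(state, u)
        states.append(state)
        bearings.append(observe(state, rng.gauss(0.0, sigma_v)))
    return states, bearings
```

Note that `atan2(y, x)` agrees with `atan(y / x)` only for x > 0, which is assumed in this sketch.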
7. Theory of PFs: Bayesian approach
Objective in the Bayesian approach: estimate the posterior distribution $p(\mathbf{x}_{0:k} \mid z_{1:k})$
State space model (unknown state $\mathbf{x}_k$): estimate posterior
Problem | Solution
Integrals are not tractable | Monte Carlo Sampling
Difficult to draw samples | Importance Sampling
8. Theory of PFs: Monte Carlo Sampling
Roadmap: intractable integrals are handled by Monte Carlo sampling; the difficulty of drawing samples is handled by importance sampling.
Densities can be approximated by discrete random measures: particles and weights
- The discrete random measure approximates the density p(x)
- Integrals simplify to summations
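To illustrate how integrals simplify to summations, the sketch below approximates an expectation, the integral of g(x) p(x) dx, by a weighted sum over particles. The density N(2, 1), the particle count, and the name `expectation` are assumptions chosen only for the example.

```python
import random


def expectation(particles, weights, g):
    """Approximate the integral of g(x) p(x) dx by sum_m w_m * g(x_m)."""
    return sum(w * g(x) for x, w in zip(particles, weights))


rng = random.Random(1)
M = 10000
particles = [rng.gauss(2.0, 1.0) for _ in range(M)]  # draws from p(x) = N(2, 1)
weights = [1.0 / M] * M                              # equal weights
mean_estimate = expectation(particles, weights, lambda x: x)
```

With particles drawn directly from p(x), all weights are equal; unequal weights appear once the particles come from a different density.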
9. Theory of PFs: Importance Sampling
Roadmap: intractable integrals are handled by Monte Carlo sampling; the difficulty of drawing samples is handled by importance sampling.
Objective: approximate a density p(x) by a discrete random measure
1. Generation of particles from the proposal density
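A minimal sketch of standard importance sampling follows: particles are generated from a proposal density, each is weighted by the ratio of target to proposal density, and the weights are normalized. The target N(1, 0.5), the proposal N(0, 2), and all function names are assumptions for illustration, not values from the slides.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a Gaussian with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def importance_sample(target_pdf, proposal_draw, proposal_pdf, M, rng):
    """Return particles drawn from the proposal and their normalized weights."""
    particles = [proposal_draw(rng) for _ in range(M)]
    # Unnormalized weight: target density divided by proposal density.
    raw = [target_pdf(x) / proposal_pdf(x) for x in particles]
    total = sum(raw)
    return particles, [w / total for w in raw]


rng = random.Random(2)
particles, weights = importance_sample(
    target_pdf=lambda x: normal_pdf(x, 1.0, 0.5),    # density p(x) to approximate
    proposal_draw=lambda r: r.gauss(0.0, 2.0),       # easy-to-sample proposal
    proposal_pdf=lambda x: normal_pdf(x, 0.0, 2.0),
    M=5000,
    rng=rng,
)
mean_estimate = sum(w * x for x, w in zip(particles, weights))
```

The pair (`particles`, `weights`) is the discrete random measure; its weighted mean should land close to the target mean of 1.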
10. Theory of PFs: Resampling
- Problems
  - Weight degeneration
  - Wastage of computational resources
- Solution: RESAMPLING
  - Replicate particles in proportion to their weights
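Replication in proportion to the weights can be sketched as below. The slide does not name a resampling scheme, so systematic resampling is used here as one common choice; the function name and the returned replication-count format are assumptions.

```python
import random


def systematic_resample(weights, rng):
    """Return counts, where counts[m] is the number of copies of particle m.

    Assumes the weights are normalized (sum to 1). The counts sum to M.
    """
    M = len(weights)
    counts = [0] * M
    u = rng.random() / M      # single random offset in [0, 1/M)
    cumulative = 0.0          # sum of the weights before particle m
    m = 0
    for i in range(M):
        threshold = u + i / M
        # Advance until the threshold falls inside particle m's weight interval.
        while m < M - 1 and cumulative + weights[m] <= threshold:
            cumulative += weights[m]
            m += 1
        counts[m] += 1
    return counts
```

For example, `systematic_resample([0.1, 0.2, 0.3, 0.4], random.Random(0))` returns four counts that sum to 4, with the heaviest particle replicated at least once; low-weight particles may receive zero copies, which is how degenerate particles are discarded.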
11. Theory of PFs: Bearings-Only Tracking Example
12. Theory of PFs: Bearings-Only Tracking Example (Cont.)
- Blue: true trajectory
- Red: estimates
13. Theory of PFs: Steps and Complexity
Bearings-only tracking problem, number of particles M = 1000
[Flowchart over particles 1, 2, ..., M, with blocks:]
- Initialize particles
- Weight computation
- Normalize weights
- Propagation of the particles
- Resampling
- Exit (yes/no decision)
Complexity:
- 4M random number generations
- M exponential and arctangent functions
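The steps listed above can be combined into one recursion of a sampling-importance-resampling filter for the bearings-only problem. This is a sketch under stated assumptions, not the implementation from the dissertation: the constant-velocity model, the noise levels `SIGMA_U` and `SIGMA_V`, the Gaussian likelihood, multinomial resampling, and all names are assumed for illustration.

```python
import math
import random

T = 1.0          # sampling period (assumed)
SIGMA_U = 0.05   # process noise standard deviation (assumed)
SIGMA_V = 0.02   # observation noise standard deviation (assumed)


def propagate(p, rng):
    """Particle generation: draw the next state from the state equation."""
    x, vx, y, vy = p
    ux, uy = rng.gauss(0.0, SIGMA_U), rng.gauss(0.0, SIGMA_U)
    return [x + T * vx + 0.5 * T * T * ux, vx + T * ux,
            y + T * vy + 0.5 * T * T * uy, vy + T * uy]


def likelihood(p, z):
    """Weight computation: an arctangent and an exponential per particle."""
    err = z - math.atan2(p[2], p[0])  # assumes x > 0, so no angle wrapping
    return math.exp(-0.5 * (err / SIGMA_V) ** 2)


def resample(particles, weights, rng):
    """Multinomial resampling: replicate particles in proportion to weights."""
    chosen = rng.choices(particles, weights=weights, k=len(particles))
    return [list(p) for p in chosen]


def sir_step(particles, z, rng):
    """One recursion: generate, weight, normalize, estimate, resample."""
    particles = [propagate(p, rng) for p in particles]
    weights = [likelihood(p, z) for p in particles]
    total = sum(weights)
    if total == 0.0:  # every weight underflowed; fall back to equal weights
        weights = [1.0 / len(particles)] * len(particles)
    else:
        weights = [w / total for w in weights]
    estimate = [sum(w * p[i] for p, w in zip(particles, weights)) for i in range(4)]
    return resample(particles, weights, rng), estimate


rng = random.Random(0)
M = 1000
particles = [[rng.gauss(1.0, 0.1), rng.gauss(0.0, 0.05),
              rng.gauss(1.0, 0.1), rng.gauss(0.0, 0.05)] for _ in range(M)]
particles, estimate = sir_step(particles, z=math.pi / 4, rng=rng)
```

Every step except resampling is a loop over independent particles, which is what makes the particle generation and weight computation steps natural candidates for pipelining and parallel hardware.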
15. Implementation of PFs: VLSI Signal Processing Architectures
- Programmable digital signal processors
- Application-domain specific processors
- Application specific processors
Application specific processors:
- Speed is the main goal
- Functionality of the system does not change
- Temporal and spatial concurrency
- One-to-one mapping between operations and hardware blocks
- FPGA implementation
16. Implementation of PFs: Methodology
17. Implementation of PFs: Algorithm Characteristics
[Flowchart: Start, New observation, Particle generation, Weight computation, Resampling, Propagation of particles, Exit; particles indexed 1, 2, ..., M]
18. Implementation of PFs: Modifications of the PF
Modifications of the algorithm and architecture:
- Fine-grain pipelining
- Avoiding normalization
- Loop transformations
- Spatial concurrency
- Finite precision arithmetic
- Dedicated hardware
- Addressing schemes
Parameter | Current | Limits
Sample period | $2M T_{clk}$ | $M T_{clk}$
Memories | $(2N+1)M$ | $(N+1)M$
19. Implementation of PFs: New Resampling Algorithms
Parameter | Algorithm 1 | Algorithm 2
Sample period | $2M T_{clk}$ | $M T_{clk}$
Memories | Particle memory $(N+1)M$, index memory $2M$ | Particle memory $(N+1)M$, index memory $4M$
Performances | Same | Worse (deterministic algorithm)
20. Implementation of PFs: Architecture
21. Implementation of PFs: Implementation results
- Hardware platform is Xilinx Virtex-II Pro
- Clock period is 10 ns
- The PF is applied to the bearings-only tracking problem
- 1000 particles are used
- Logic blocks: 4
- Memories: 3
- Percentage of utilization of the PF blocks
Resource | Particle generation | Weight computation | Resampling
Logic blocks (%) | 16 | 75 | 9
Block RAMs (%) | 67 | 11 | 22
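As a worked example, the clock period and particle count above can be plugged into the two sample-period expressions quoted earlier in the deck, 2M Tclk and M Tclk. The sketch only evaluates those expressions; it ignores pipeline latency and any other overhead, and the function name is an assumption.

```python
def sample_period(cycles_per_particle, M, t_clk):
    """Sample period in seconds for cycles_per_particle * M * Tclk."""
    return cycles_per_particle * M * t_clk


M = 1000        # number of particles
T_CLK = 10e-9   # clock period of 10 ns

current = sample_period(2, M, T_CLK)  # 2 * M * Tclk
limit = sample_period(1, M, T_CLK)    # M * Tclk

current_rate_khz = 1.0 / current / 1e3
limit_rate_khz = 1.0 / limit / 1e3
```

Under these assumptions the sample period is about 20 microseconds for 2M Tclk and about 10 microseconds for M Tclk, that is, roughly 50 kHz and 100 kHz observation rates.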
22. Implementation of PFs: Parallelism
- Universal architecture with a central unit
- Processing elements (PE)
  - Particle generation
  - Weight computation
- Central Unit
  - Algorithm for particle propagation
  - Resampling
[Figure: flowchart (Start, New observation, Particle generation, Weight computation, Resampling, Propagation of particles, Exit) mapped onto Processing Elements 1 to 4 and a Central Unit]
23. Implementation of PFs: Propagation of Particles
- Disadvantages of the particle propagation step
  - Random communication pattern
  - Decision about connections is not known before the run time
  - Requires dynamic type of a network
  - Speed-up is significantly affected
[Figure: particles after resampling over time t, exchanged among Processing Elements 1 to 4 through the Central Unit]
24. Implementation of PFs: Parallel Resampling
[Figure: PEs 1 to 4 with labels N13, N0, N0, N3]
- Solution
  - The way in which Monte Carlo sampling is performed is modified
- Advantages
  - Propagation is only local
  - Propagation is controlled in advance by a designer
  - Performance is the same as in the sequential applications
- Result
  - Speed-up is almost equal to the number of PEs (up to 8 PEs)
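One scheme consistent with the bullets above is sketched below: each PE resamples only its own particles, and afterwards a fixed number of particles moves to the neighbouring PE over a ring chosen at design time. This is an illustrative assumption, not necessarily the algorithm used in the dissertation; the ring topology, the exchange size `d`, and the function names are hypothetical.

```python
import random


def local_resample(particles, weights, rng):
    """Resample inside one PE; the PE keeps its particle count."""
    n = len(particles)
    total = sum(weights)
    chosen = rng.choices(particles, weights=weights, k=n)
    # Each surviving particle carries an equal share of the PE's total weight.
    return chosen, [total / n] * n


def ring_exchange(groups, weight_groups, d):
    """Each PE receives the first d particles of the previous PE in a ring."""
    k = len(groups)
    new_particles, new_weights = [], []
    for i in range(k):
        prev = (i - 1) % k
        new_particles.append(groups[prev][:d] + groups[i][d:])
        new_weights.append(weight_groups[prev][:d] + weight_groups[i][d:])
    return new_particles, new_weights


rng = random.Random(0)
# 4 PEs with 4 particles each; a particle is tagged (PE index, local index).
groups = [[(pe, i) for i in range(4)] for pe in range(4)]
weight_groups = [[rng.random() for _ in range(4)] for _ in range(4)]

resampled = [local_resample(p, w, rng) for p, w in zip(groups, weight_groups)]
new_groups, new_weights = ring_exchange(
    [r[0] for r in resampled], [r[1] for r in resampled], d=1
)
```

Because the exchange pattern and size are fixed in advance, the interconnect can be a static network, in contrast to the random communication pattern of centralized resampling.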
25. Implementation of PFs: Architectures for Parallel Resampling
- Controlled particle propagation after resampling
[Figure: PE1, PE2, PE3, PE4]
Architecture that allows adaptive connection among the processing elements
26. Implementation of PFs: Space exploration
- Hardware platform is Xilinx Virtex-II Pro
- Clock period is 10 ns
- PFs are applied to the bearings-only tracking problem
27. Implementation of PFs: Gaussian PFs
- Propagates only first two moments
- Approximates densities by Gaussians
- No need for resampling
[Flowchart: Drawing conditioning particles, Particle generation, Weight computation, Exit (yes/no decision); particles indexed 1, 2, ..., M]
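The three flowchart steps can be sketched for a scalar model as follows. The model (x_k = a x_{k-1} + u_k, z_k = x_k + v_k), its parameters, and the function name are assumptions chosen so that the result can be checked by hand; the slides do not specify them.

```python
import math
import random


def gpf_step(mean, var, z, M, rng, a=0.9, sigma_u=0.5, sigma_v=0.5):
    """One Gaussian particle filter recursion for an assumed scalar model."""
    # 1. Drawing conditioning particles from the previous Gaussian posterior.
    cond = [rng.gauss(mean, math.sqrt(var)) for _ in range(M)]
    # 2. Particle generation through the state equation.
    particles = [a * x + rng.gauss(0.0, sigma_u) for x in cond]
    # 3. Weight computation from the observation likelihood.
    weights = [math.exp(-0.5 * ((z - x) / sigma_v) ** 2) for x in particles]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights underflowed; observation too far away")
    weights = [w / total for w in weights]
    # Only the first two moments are kept, so no resampling is needed.
    new_mean = sum(w * x for x, w in zip(particles, weights))
    new_var = sum(w * (x - new_mean) ** 2 for x, w in zip(particles, weights))
    return new_mean, new_var


rng = random.Random(0)
new_mean, new_var = gpf_step(mean=0.0, var=1.0, z=1.0, M=5000, rng=rng)
```

For this linear Gaussian case the exact posterior has mean of about 0.81 and variance of about 0.20, so the particle estimate can be compared against those values. Since no particles are carried over between recursions, only the mean and variance need to be stored or communicated.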
28. Implementation of PFs: Gaussian PFs (cont.)
[Figure: minimum sampling period versus number of PEs for parallel GPFs and SIRs]
29. Conclusions and Future Work
- Modification of the algorithms to be suitable for hardware implementation
- Development of parallel algorithms and architectures
- Implementation of the particle filter in FPGA
- Analysis of the other types of particle filtering algorithms
- Simplifying floating to fixed-point conversion
- Developing application-domain specific processor for PFs
- Developing reconfigurable architectures for PFs