Title: Optimizing Flocking Controllers using Gradient Descent
Kevin Forbes
Motivation
- Flocking models can animate complex scenes in a cost-effective way
- But they are hard to control: there are many parameters that interact in non-intuitive ways, so animators find good values by trial and error
- Can we use machine learning techniques to optimize the parameters instead of setting them by hand?
Background: Flocking Model
- Reynolds (1987), "Flocks, Herds, and Schools: A Distributed Behavioral Model"
- Reynolds (1999), "Steering Behaviors For Autonomous Characters"
- Each agent can see other agents in its neighbourhood
- Motion derived from a weighted combination of force vectors: Alignment, Cohesion, Separation
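As an illustrative sketch only (not the project's actual force definitions), the three Reynolds steering forces for a single agent might be computed like this; the fixed-radius neighbourhood test and the function name are assumptions:

    import numpy as np

    def flocking_forces(i, positions, velocities, radius=5.0):
        """Classic Reynolds steering forces for agent i (illustrative sketch)."""
        # Neighbours: all other agents within a fixed radius of agent i.
        offsets = positions - positions[i]
        dists = np.linalg.norm(offsets, axis=1)
        mask = (dists > 0.0) & (dists < radius)
        if not mask.any():
            zero = np.zeros_like(positions[i])
            return zero, zero, zero
        # Cohesion: steer toward the centroid of the neighbours.
        cohesion = positions[mask].mean(axis=0) - positions[i]
        # Separation: steer away from each neighbour, weighted by closeness.
        separation = -(offsets[mask] / dists[mask, None] ** 2).sum(axis=0)
        # Alignment: steer toward the neighbours' average velocity.
        alignment = velocities[mask].mean(axis=0) - velocities[i]
        return cohesion, separation, alignment

    pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
    vel = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    print(flocking_forces(0, pos, vel))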
Background: Learning Model
- Lawrence (2003), "Efficient Gradient Estimation for Motor Control Learning"
- Policy search finds optimal settings of a system's control parameter vector, as evaluated by some objective function
- Stochastic elements in the system result in noisy gradient estimates, but there are techniques to limit their effects
- [Figure: simple 2-parameter example. Axes: values of the control parameters; colour: value of the objective function; blue arrows: negative gradient of the objective function; red line: result of gradient descent]
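A minimal, self-contained sketch of the kind of descent shown in the figure, on a made-up two-parameter objective (nothing below comes from the flocking system itself):

    import numpy as np

    def objective(theta):
        """Toy 2-parameter objective with a single minimum (illustrative only)."""
        x, y = theta
        return (x - 1.0) ** 2 + 2.0 * (y + 0.5) ** 2

    def gradient(theta):
        x, y = theta
        return np.array([2.0 * (x - 1.0), 4.0 * (y + 0.5)])

    theta = np.array([3.0, 2.0])          # arbitrary starting point
    step = 0.1
    for _ in range(100):
        theta -= step * gradient(theta)   # step in the negative gradient direction

    print(theta)  # approaches the optimum (1.0, -0.5)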
Project Steps
- Define physical agent model
- Define flocking forces
- Define objective function
- Take derivatives of all system elements w.r.t. all control parameters
- Do policy search
1. Agent Model
Position, Velocity, and Acceleration are defined as in Reynolds (1999).
Recursive definition: the base case is the system's initial condition. If there are no stochastic forces, the system is deterministic (w.r.t. the initial conditions). The flock's policy is defined by the alpha vector.
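One plausible form of this recursion (an assumption based on the Euler-style integration of Reynolds (1999); the slide's exact equations are not reproduced here):

    a_t = \sum_k \alpha_k \, F_k(x_t, v_t)
    v_{t+1} = v_t + a_t \, \Delta t
    x_{t+1} = x_t + v_{t+1} \, \Delta t

with base case (x_0, v_0) given by the initial conditions, and the policy given by the coefficient vector \alpha.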
2. Forces
The simulator includes the following forces:
- Flocking forces: Cohesion, Separation, Alignment
- Single-agent forces: Noise, Drag
- Environmental forces: Obstacle Avoidance, Goal Seeking
Implemented with learnable coefficients (so far)
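A minimal sketch of what "learnable coefficients" could mean structurally: the acceleration is an alpha-weighted sum of force terms. The force stubs and values below are placeholders, not the simulator's code:

    import numpy as np

    # Illustrative placeholder force terms for a single agent (assumed names).
    def cohesion(state):   return np.array([1.0, 0.0])
    def separation(state): return np.array([-0.5, 0.2])
    def alignment(state):  return np.array([0.0, 1.0])
    def drag(state):       return -state["velocity"]

    forces = [cohesion, separation, alignment, drag]
    alpha = np.array([1.0, 2.0, 0.5, 0.1])   # learnable coefficients, one per force

    def acceleration(state):
        """Acceleration is the alpha-weighted sum of the individual force terms."""
        return sum(a * f(state) for a, f in zip(alpha, forces))

    print(acceleration({"velocity": np.array([0.3, 0.0])}))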
3. Objective Function
The exact function used depends upon the goals of the particular animation. I used the following objective function for the flock at time t.
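As a purely hypothetical illustration (not the objective actually used), a per-timestep cost that fits the later experiments could penalize distance to the goal and deviation from a target separation:

    J_t = \sum_i \| x_i(t) - g(t) \|^2
        + \sum_i \sum_{j \in N(i)} \bigl( \| x_i(t) - x_j(t) \| - d^* \bigr)^2

where g(t) is the goal position, N(i) the neighbourhood of agent i, and d^* the target distance; all of these symbols are illustrative assumptions.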
The neighbourhood function implied here (and in the force calculations) will come back to haunt us on the next slide...
4. Derivatives
In order to estimate the gradient of the objective function, it must be differentiable.
We can build an appropriate neighbourhood function (N-function) by multiplying transformed sigmoids together.
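A minimal sketch of such a smoothed neighbourhood function, assuming hypothetical inner/outer radii and steepness values:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def neighbourhood(d, d_min=1.0, d_max=5.0, k=4.0):
        """Smooth window that is ~1 for d_min < d < d_max and ~0 outside.

        The product of two transformed sigmoids is differentiable everywhere,
        unlike the hard distance test it stands in for.
        """
        return sigmoid(k * (d - d_min)) * sigmoid(k * (d_max - d))

    print(neighbourhood(np.array([0.0, 3.0, 8.0])))  # ~[0.0, 1.0, 0.0]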
- Other derivative-related wrinkles:
  - Cannot use max/min truncations
  - Numerical stability issues
  - Increased memory requirements
5. Policy Search
Use Monte Carlo estimation to compute the expected value of the gradient.
This assumes that the only random variables are the initial conditions. A less-noisy estimate can be made if the distribution of the stochastic forces in the model is taken into account using importance sampling.
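A minimal sketch of that Monte Carlo estimate (the helper names below are placeholders, not the simulator's API):

    import numpy as np

    def estimate_gradient(alpha, simulate_gradient, sample_initial_conditions, n=10):
        """Monte Carlo estimate of E[dJ/dalpha] over random initial conditions.

        simulate_gradient(alpha, init) is assumed to run one deterministic
        rollout from the sampled initial conditions and return the gradient
        of the objective w.r.t. the control parameters alpha.
        """
        grads = [simulate_gradient(alpha, sample_initial_conditions())
                 for _ in range(n)]
        return np.mean(grads, axis=0)

    # Toy usage with stand-ins: gradient of ||alpha - init||^2 for one rollout.
    g = estimate_gradient(np.zeros(3),
                          lambda a, init: 2.0 * (a - init),
                          lambda: np.random.normal(size=3))
    print(g)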
The Simulator
- Features:
  - Forward flocking simulation
  - Policy learning and mapping
  - Optional OpenGL visualization
  - Spatial sorting gives good performance
- Limitations:
  - Wraparound
  - Not all forces are learnable yet
  - Buggy neighbourhood function derivative
Experimental Method
- Simple gradient descent:
  1. Initialize the flock, assign a random alpha
  2. Run the simulation (N times)
  3. Step (with annealing) in the negative gradient direction
  4. Reset the flock
- Steps 2-4 are repeated for a certain number of iterations (a sketch of this loop follows below)
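A minimal sketch of that loop, with assumed step sizes and a simple geometric annealing schedule; estimate_gradient stands in for "run the simulation N times and average the gradients":

    import numpy as np

    def policy_search(estimate_gradient, dim, iterations=50, step0=0.5, decay=0.95):
        """Simple annealed gradient descent over the control parameters alpha."""
        alpha = np.random.uniform(-1.0, 1.0, size=dim)   # random starting policy
        step = step0
        for _ in range(iterations):
            alpha -= step * estimate_gradient(alpha)     # negative gradient step
            step *= decay                                # annealing: shrink the step
        return alpha

    # Toy usage with a quadratic stand-in for the simulated objective:
    print(policy_search(lambda a: 2.0 * (a - 1.0), dim=4))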
Results - ia
- Simple system test:
  - 1 agent
  - 2 forces: seek and drag
  - Seek target in front of the agent; the agent initially moves towards the target
  - Simulate 2 minutes
  - No noise
  - Take the best of 10 descents
Results - ib
- Simple system test with noise:
  - Same as before, but with the wander force set to strength 1
  - Used N = 10 for the Monte Carlo estimate
- Effects of noise:
  - Optimal seek force is larger
  - Both the surface and the descent path are noisy
Results - ii
- More complicated system:
  - 2 agents
  - Seek and drag fixed at 10 and 0.1
  - Learn cohesion and separation
  - Seek target orbiting the agents' start position
  - Simulate 2 minutes
  - Target distances of 5 and 10
  - Noise in initial conditions
- Results:
  - Target distance does influence the optimized values
  - Search often gets caught in foothills
Results - iii
- Higher dimensions:
  - 10 agents
  - Learn cohesion, separation, seek, and drag
  - Otherwise the same as the last test
- Results:
  - The objective function is being optimized (albeit slowly)!
  - The target distance is matched!
Conclusion
- Technique shows promise
- Implementation has poor search performance
Future Work
- Implement more learnable parameters
- Fix neighbourhood derivative
- Improve gradient search method
- Use importance sampling!
Demonstrations