Title: Brain Regions Involved in USCBP Reaching Models
1. Brain Regions Involved in USCBP Reaching Models
2. Brain Regions
- Cheol's Models
  - Motor cortex (M1)
  - Spinal cord
  - Basal ganglia (BG)
  - Dorsal premotor cortex (PMd, providing input)
- Jimmy's Models
  - Parieto-occipital area (V6A)
  - Lateral intraparietal area (LIP)
  - BG
  - PMd (specifically F2)
3. Issues in Model Integration
- Unified View of M1
- Interactions between PMd and M1
- Role of the BG
- Involvement of the Cerebellum
4. M1 Modeling
- Cheol
  - Top-down model: directional tuning with supervised and unsupervised learning
  - Bottom-up model: input and output maps, with control of muscle synergies
- Jimmy
  - Robotic control model: trajectory generator, inverse kinematics, PD controllers (probably not all in M1)
5. Cheol's Top-Down M1 Model
- Directional tuning of M1 neurons is learned using supervised and unsupervised learning
- Arm choice is learned with reinforcement learning
  - Jimmy: equivalent to noisy winner-take-all (WTA) based on executability
  - Cheol: connecting to a unified view of motor learning
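As a rough illustration of arm choice learned by reinforcement and selected with a noisy winner-take-all, the sketch below keeps one action value per arm and updates the chosen arm's value from a binary reward. The function names, the success probabilities (a stand-in for executability), the learning rate, and the noise level are all assumptions made for this example; none of them come from the models on the slides.

```python
import random

def noisy_wta(values, noise_sd, rng):
    """Pick the index with the largest value after adding Gaussian noise."""
    noisy = [v + rng.gauss(0.0, noise_sd) for v in values]
    return max(range(len(values)), key=lambda i: noisy[i])

def learn_arm_choice(success_prob, n_trials=2000, alpha=0.1, noise_sd=0.1, seed=0):
    """Learn one action value per arm from binary reward.

    success_prob is a hypothetical probability that a reach with each arm
    succeeds; it is an assumption of this sketch, not a value from the slides.
    """
    rng = random.Random(seed)
    values = [0.5 for _ in success_prob]  # initial action values
    for _ in range(n_trials):
        arm = noisy_wta(values, noise_sd, rng)
        reward = 1.0 if rng.random() < success_prob[arm] else 0.0
        # Reward-prediction-error update, applied to the chosen arm only.
        values[arm] += alpha * (reward - values[arm])
    return values

# Arm 0 succeeds more often than arm 1 in this made-up setting.
values = learn_arm_choice([0.9, 0.4])
```

Lowering the first arm's success probability (for example, to mimic reduced performance of an affected arm) would shift the learned values, and with them the choice, toward the other arm.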
6. Possible motor procedures in the motor cortex
- Inverse dynamics and muscle models are learned using temporal difference learning in an actor-critic architecture
- The actor may correspond to the motor cortex.
Diagram labels: Trajectory Generator; Joint Static Level Planning; Inverse Dynamics; Joint Force Level Planning; Inverse Muscle Model; Muscle Level Planning; Motoneurons (spinal cord); Arm; Actor; Critic; Evaluator of Movement; TD error
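The actor-critic idea on this slide can be sketched with a toy example in which a single TD error teaches both the critic (state values) and the actor (action preferences). This is only a generic tabular actor-critic on a hypothetical five-state chain, not the inverse dynamics or muscle model learning itself; the task, the constants, and all names are assumptions chosen for brevity.

```python
import math
import random

N_STATES = 5          # hypothetical chain of states; the last one is the goal
ACTIONS = (-1, +1)    # step left or step right

def softmax_choice(prefs, rng):
    """Sample an action index from a softmax over the actor's preferences."""
    m = max(prefs)
    weights = [math.exp(p - m) for p in prefs]
    r = rng.random() * sum(weights)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1

def train(episodes=500, alpha=0.1, beta=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    value = [0.0] * N_STATES                        # critic: state values
    prefs = [[0.0, 0.0] for _ in range(N_STATES)]   # actor: action preferences
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            a = softmax_choice(prefs[s], rng)
            s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
            done = s_next == N_STATES - 1
            reward = 1.0 if done else 0.0
            # TD error: the one teaching signal shared by critic and actor.
            target = reward + (0.0 if done else gamma * value[s_next])
            td_error = target - value[s]
            value[s] += alpha * td_error      # critic update
            prefs[s][a] += beta * td_error    # actor update
            s = s_next
            if done:
                break
    return value, prefs

value, prefs = train()
```

When the TD error is zero the actor's preferences stop changing, which is the sense in which the critic acts as the actor's teacher.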
7. Cheol's Bottom-Up M1 Model (based on feedback signal)
- IDM: maps the error direction to muscle synergy (directly related to directional tuning)
- Pesaran et al. (2006) indicated that PMd neurons encode both target location and feedback signal.
Diagram labels: Target location (premotor); Feedback signal (premotor); IDM; ISM; Motor Cortex Model (map); Muscle Synergy; Forward model (with optimal feedback controller)
8. ILGA Motor Controller
- Input: reach target in wrist-centered coordinates
- Dynamic Motor Primitives: generate reach trajectory
- Inverse Kinematics: pseudo-inverse of Jacobian matrix
- PD controllers: one for each DOF
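Two of the stages listed on this slide, pseudo-inverse inverse kinematics and per-joint PD control, can be sketched for a planar two-link arm. Everything here is an assumption made for illustration: the link lengths, the gains, the unit-inertia joint dynamics, and the small damping term added to the pseudo-inverse for numerical safety near singular postures. The Dynamic Motor Primitive stage is not modeled; the sketch solves directly for the final posture.

```python
import math

L1, L2 = 0.3, 0.25  # hypothetical link lengths in meters

def forward(q):
    """Wrist position of a planar two-link arm for joint angles q = (q1, q2)."""
    x = L1 * math.cos(q[0]) + L2 * math.cos(q[0] + q[1])
    y = L1 * math.sin(q[0]) + L2 * math.sin(q[0] + q[1])
    return x, y

def jacobian(q):
    """2x2 Jacobian of wrist position with respect to the joint angles."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-L1 * s1 - L2 * s12, -L2 * s12],
            [L1 * c1 + L2 * c12, L2 * c12]]

def pinv_step(J, dx, damping=1e-3):
    """Return dq = J^T (J J^T + damping^2 I)^-1 dx for a 2x2 Jacobian."""
    a = J[0][0] ** 2 + J[0][1] ** 2 + damping ** 2
    b = J[0][0] * J[1][0] + J[0][1] * J[1][1]
    d = J[1][0] ** 2 + J[1][1] ** 2 + damping ** 2
    det = a * d - b * b
    # Solve (J J^T + damping^2 I) y = dx with the closed-form 2x2 inverse.
    y0 = (d * dx[0] - b * dx[1]) / det
    y1 = (-b * dx[0] + a * dx[1]) / det
    return [J[0][0] * y0 + J[1][0] * y1, J[0][1] * y0 + J[1][1] * y1]

def inverse_kinematics(q, target, iters=200, gain=0.5):
    """Iteratively move the joint angles so the wrist approaches the target."""
    q = list(q)
    for _ in range(iters):
        x, y = forward(q)
        dq = pinv_step(jacobian(q), (target[0] - x, target[1] - y))
        q = [q[i] + gain * dq[i] for i in range(2)]
    return q

def pd_track(q0, q_des, kp=100.0, kd=20.0, dt=0.001, steps=3000):
    """One PD controller per joint, driving unit-inertia joints (an assumption)."""
    q, dq = list(q0), [0.0, 0.0]
    for _ in range(steps):
        for i in range(2):
            torque = kp * (q_des[i] - q[i]) - kd * dq[i]
            dq[i] += torque * dt   # unit inertia: acceleration equals torque
            q[i] += dq[i] * dt
    return q

q_start = [0.3, 1.2]
target = (0.35, 0.25)          # made-up wrist target within reach
q_goal = inverse_kinematics(q_start, target)
q_final = pd_track(q_start, q_goal)
```

In a fuller pipeline the trajectory generator would feed a sequence of intermediate wrist positions to the inverse kinematics step, and the PD controllers would track the resulting joint trajectory rather than a single goal posture.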
9. Interactions Between PMd and M1
- Our views of the role of PMd are very similar
- Jimmy
  - PMd (F2) provides M1 with target location in wrist-centered coordinates
- Cheol
  - Supra-motor-cortex coding in PMd may be feedback error (target location in hand-centered reference frame) and/or target location in fixation-point coordinates.
10. ILGA: F2 Integrates Bottom-Up and Top-Down Reach Target Signals
- Rostral F2 performs target selection based on parietal and prefrontal input
- Caudal F2 encodes the selected target and initiates the reach
- F6 detects the go signal and disinhibits via BG
Tanne et al. (1995)
11. Reconciliation with FARS view of PMd
- FARS implicated F2 in conditional action selection and F4 in reach target selection
- However, many studies show F2 to contain directionally tuned neurons that discharge prior to reaching
- F4 contains bimodal (visual/somatosensory) neurons that respond when objects approach their somatosensory receptive field on the arm or hand
12. F2 vs. F4 Experimental Data
- Neurons in F2 are broadly tuned to multidimensional direction in a reaching task (Caminiti et al., 1991; Fu et al., 1993)
- Pesaran, Nelson & Andersen (2006): PMd neurons encode relative positions of eye, hand, and target
  - PMd contains combined signals.
  - MIP contains more (target-eye) coding, i.e., fixation-point coordinates
- F4 bimodal visual-tactile neurons have very large visual and somatosensory receptive fields, and the visual field is anchored to the somatosensory field
  - But most don't fire for stimuli farther than 25 cm away (Graziano et al., 1997)
  - Not suitable for encoding reach target!
  - May be involved in feedback control of reach-grasp coordination: tactile RFs may contribute to the transition from visual- to haptic-based control
13. Role of the BG
- Cheol
  - Adaptive critic in actor-critic architecture
- Jimmy
  - Adaptive critic gated by internal state
  - Action disinhibition
- Role in previous USCBP models
  - DA / DAJ: action disinhibition
  - ILGM: reward signal
  - Extended TD: adaptive critic
  - Bischoff BG model: next-state prediction
14. BG Disinhibition of Action
- ILGA's use of the basal ganglia to disinhibit actions is largely consistent with its role in the Dominey-Arbib and Dominey-Arbib-Joseph models
- The cortical targets of the context-dependent biases are different
15. BG as an Adaptive Critic
- The basal ganglia's role as an adaptive critic is not very controversial
- However, each of our models uses it to learn different parameters
  - Cheol's top-down model: to modify arm selection
  - Cheol's bottom-up model: to learn inverse models
  - ILGA: kinematic parameters and contextual bias
  - ACQ: executability and internal state-dependent desirability
- Does this imply several actor/critic combinations (1:1, N:1, 1:N, N:N)?
  - Cheol's top-down model: actor / critic
  - Cheol's bottom-up model: actor / critic
  - ILGA: actor / critic
  - ACQ: actor / multiple critics
16. M1 and BG roles in Cheol's unified view
The reinforcement learning framework replaces optimization of a task-related cost function with maximization of a task-related reward function, which also accounts for the actuators' limitations. The critic encodes the current task-related reward function. The reward, or an action value, is defined only when we have an objective. So the critic will try to encode which action might be the best in terms of reward (action value) for achieving a certain objective. It will monitor the performance of the current movement. If the performance changes, the critic will provide information about the next best action and will facilitate changing the actor accordingly. If there are multiple tasks, there should be multiple critics. What, then, is the critic's role? It will encode the objective function and provide the teaching signal to the actor through the TD error: if the TD error is zero, we don't need to change the actor, and so on.
Diagram labels: Visual signal (world representation); Action-oriented perception?; Target-related signal; Critic X (vision-task-related); Critic (motor-task-related); TD error; Representation of the actuator; Any motor actuators
Diagram annotations:
- Send limitation of the actuators via TD error
- This arrow is the actor.
- Send limitation of the actuators via unsupervised learning
It represents the current maximum capability of the motor actuators. So, if the motor actuators are based on muscles, it will be the muscle synergies and the limitations of the muscle-based actuator. If a stroke affects it, the maximum capability changes and the limitation of the world increases. With rehabilitation, the maximum capability changes again and the limitation decreases.
17. M1 and BG roles in Cheol's unified view
Diagram labels: PLoS model; Jool's variability data; Reaching module; Coordination manager; Grasping module; Critic; Representation of the actuators; Actor; Critic
Diagram annotation: Those two modules may be acquired separately (early learning)
In this coordination problem, we may have an objective for the coordination. For example, we can put more weight on faster movement, on accurate movement, or on accurate grasping. So, depending on the objective, we may see variability in coordination. However, this coordination is not free from the actuators' constraints. First, if there is signal-dependent noise, the movement cannot be too fast. (This limitation is already in the Hoff-Arbib model.) Second, too large an initial aperture can ensure more accurate grasping but imposes a limitation on the reaching module (slower reaching).
Because of a stroke in the motor cortex, there is a change in the limitation (performance change) of the corresponding actuator. The action choice module will encode which arm is better in a given direction. So when the performance of the affected arm decreases, it will indicate that the best action is to use the unaffected arm (i.e., behavioral compensation). Can we connect these ideas with the terms executability and desirability? In general, I think the objective function contains both concepts.
Hierarchical Optimal Feedback Controller: Todorov et al. (2005) presented a similar idea of hierarchical optimization of the plant. But the reinforcement learning framework provides a more general framework for motor system learning and may be more applicable.
Motor cortex model: Kambara et al. (2008) showed the possibility, and I would also implement it with map reorganization!
18. Involvement of the Cerebellum
- Schweighofer's modeling: corrects for nonlinearities in arm control
- Cheol: what about learning projections from the cerebellum to M1?
19. References
- Caminiti, R., Johnson, P.B., Galli, C., Ferraina, S., Burnod, Y. (1991) Making Arm Movements within Different Parts of Space: The Premotor and Motor Cortical Representation of a Coordinate System for Reaching to Visual Targets. The Journal of Neuroscience, 11(5): 1182-1197.
- Fu, Q.G., Suarez, J.I., Ebner, T.J. (1993) Neuronal Specification of Direction and Distance During Reaching Movements in the Superior Precentral Premotor Area and Primary Motor Cortex of Monkeys. Journal of Neurophysiology, 70(5): 2097-2116.
- Graziano, M.S.A., Hu, X.T., Gross, C.G. (1997) Visuospatial Properties of Ventral Premotor Cortex. Journal of Neurophysiology, 77: 2268-2292.
- Tanne, J., Boussaoud, D., Boyer-Zeller, N., Rouiller, E.M. (1995) Direct visual pathways for reaching movements in the macaque monkey. NeuroReport, 7: 267-272.
- Pesaran, B., Nelson, M.J., Andersen, R.A. (2006) Dorsal premotor neurons encode the relative position of the hand, eye, and goal during reach planning. Neuron, 51: 125-134.
- Buneo, C.A., Jarvis, M.R., Batista, A.P., Andersen, R.A. (2002) Direct visuo-motor transformation for reaching. Nature, 416: 632-636.
- Todorov, E., Li, W., Pan, X. (2005) From task parameters to motor synergies: A hierarchical framework for approximately optimal control of redundant manipulator. J Robot Syst., 22(11): 691-710.
- Kambara, H., Kim, K., Shin, D., Sato, M., Koike, Y. (2006) Motor control-learning model for reaching movements. IJCNN 2006.