Title: Human Cognitive Emulation
1. Human Cognitive Emulation
2. Abstract
This project attempts to model human responses to
stimuli accurately. Using a survey format, the
research aims to predict a unique response to a
stimulus based on information gathered about the
user. Taken on its own, the project can at best
support broad conclusions about groups of people
and how they respond. Combined with other
techniques for emulating human thought patterns,
however, computer programs can come closer to
representing human responses accurately.
3. Scope of Project / Expected Results
- Many data points
- Decision trees and entropy calculations
- More data should yield better results
4. Similar Projects
- Sandia National Labs is pursuing a much more complex
version of this project, with many different
approaches adding up to a complete psychological
profile.
- Dr. Ann Speed is one of the lead psychologists at
Sandia working on cognitive emulation.
- Similar tree formats have been used in the past.
5. Approach
- Lisp!! (((())))
- Text-based, with an intranet poll for seniors;
results returned in CSV format.
- Evolutionary programming
- Tree branches correspond to the number of possible
survey responses.
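Before tree building, the CSV poll results have to become attribute/value records. A minimal sketch in Python (the project itself is in Lisp, and the column names here are hypothetical):

```python
import csv
import io

def load_survey(csv_text):
    """Parse poll results (the intranet poll returns CSV) into a list of
    question -> answer dicts that a decision-tree builder can consume."""
    return list(csv.DictReader(io.StringIO(csv_text)))

# Hypothetical two-respondent result set; real column names may differ.
sample = "SEX,YEAR,TIMELAB\nMALE,SENIOR,HIGH\nFEMALE,JUNIOR,LOW\n"
rows = load_survey(sample)
```

Each dict then maps one survey question to one respondent's answer, which matches the example format ID3 expects.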
6. Construction
ID3, the heart of decision-tree learning, was first
developed in 1975 by J. Ross Quinlan at the
University of Sydney. Here it is in Lisp:
(defun id3 (examples target.attribute attributes)
  (let (firstvalue a partitions)
    (setq firstvalue (get.value target.attribute (first examples)))
    (cond ((every #'(lambda (e) (eq firstvalue (get.value target.attribute e)))
                  examples)
           firstvalue)
          ((null attributes)
           (most.common.value target.attribute examples))
          (t (setq partitions
                   (loop for a in attributes collect (partition a examples)))
             (setq a (choose.best.partition target.attribute partitions))
             (cons (first a)
                   (loop for branch in (cdr a)
                         collect
                         (list (first branch)
                               (id3 (cdr branch)
                                    target.attribute
                                    (remove (first a) attributes)))))))))
7. Results
[Rendered decision tree omitted: the learned tree branches on survey
attributes such as TIMESPORT, TIMEDRAMA, TIMELAB, TIMEVIDEO, SEX, YEAR,
RACE, PHYSICS, CHEM, and BIO, with a YES/NO prediction at each leaf.]
- After fewer than 100 data points, the decision
tree has already become relatively accurate. The
resulting tree is shown at left, in 4-point font.
- This is an example of overfitting the tree: there
aren't enough data points to justify so many
branches, which makes some of them pointless and
misleading.
8. More Results
- Most decisive factor in determining college major:
focus time in Sys Lab.
- Most decisive factor in determining college
classroom preparedness: whether or not higher
math was taken.
- College life? Sports.
9. Entropy
- Perhaps the most obscure part of decision trees,
entropy is used to find the best classifier by
means of information gain.
- Entropy(S) = -(p+) log2(p+) - (p-) log2(p-),
where p+ is the proportion of positive examples
and p- is the proportion of negative examples.
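The formula above, and the information gain ID3 uses to pick the best split, can be sketched briefly. This is Python rather than the project's Lisp, generalized from the two-class formula to any number of label values:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum over classes of p * log2(p); for two classes this
    reduces to -(p+) log2(p+) - (p-) log2(p-)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attribute, target):
    """Expected reduction in entropy from partitioning the examples on
    `attribute` -- the criterion ID3 uses to choose the best classifier."""
    total = entropy([ex[target] for ex in examples])
    n = len(examples)
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [ex[target] for ex in examples if ex[attribute] == value]
        remainder += len(subset) / n * entropy(subset)
    return total - remainder
```

A 50/50 split of positive and negative labels gives entropy 1.0, a pure set gives 0, and an attribute that perfectly separates the labels gains the full 1.0 bit.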
10. Bad Stuff?
- The program assumes no two people directly
contradict each other: identical survey answers
must never lead to different classifications.
- The odds of a contradiction aren't negligible, so
the larger the data set, the more likely it is to
crash the program.
- Even with 240 data points the tree is still very
specific, and it is easy to see where individuals
stand out.
- More valuable for broad decision factors.
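The crash described above comes from two respondents giving identical answers but different classifications. A sketch (in Python, with a hypothetical function name) that screens the survey data for such pairs before building the tree:

```python
from collections import Counter

def find_contradictions(examples, target):
    """Group survey rows by their answers to everything except the target
    attribute, and report any group whose target labels disagree."""
    groups = {}
    for ex in examples:
        key = tuple(sorted((k, v) for k, v in ex.items() if k != target))
        groups.setdefault(key, []).append(ex[target])
    return {key: Counter(labels) for key, labels in groups.items()
            if len(set(labels)) > 1}
```

Conflicting groups could then be dropped, or reduced to their majority label, instead of crashing the tree builder.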