Title: User Authentication Using Keystroke Dynamics
1. User Authentication Using Keystroke Dynamics
ECE 614, Spring 2005, University of Louisville
2. Three types of authentication
- Something you know.
  - A password
- Something you have.
  - An ID card or badge
- Something you are.
  - Biometrics
3. Biometrics
- Biometrics measure physical or behavioral characteristics of an individual.
- Physical (do not change over time)
  - Fingerprint, iris pattern, hand geometry
- Behavioral (may change over time)
  - Signature, speech pattern, keystroke pattern
4. Keystroke biometrics
- Keystroke dynamics rests on the assumption that each person has a unique keystroke rhythm.
- Keystroke features are
  - Latency between keystrokes.
  - Duration of key presses.
- 4 possible authentication outcomes
  - (i) Genuine individual is accepted.
  - (ii) Genuine individual is rejected.
  - (iii) Imposter is accepted.
  - (iv) Imposter is rejected.
- Biometric classification accuracy measures
  - FRR: false rejection rate, the rate of outcome (ii)
  - FAR: false acceptance rate, the rate of outcome (iii)
  - EER: equal error rate, the error rate at the operating point where FRR = FAR
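The accuracy measures above can be computed directly from trial outcomes. The following is a minimal Python sketch (the project itself used C and Matlab); the function names `error_rates` and `approximate_eer`, the rule "accept when the score exceeds the threshold", and all scores are illustrative assumptions, not project data.

```python
def error_rates(genuine_scores, imposter_scores, threshold):
    """Return (FRR, FAR) for one acceptance threshold.

    Assumption: a trial is accepted when its score is greater than the threshold.
    """
    false_rejects = sum(1 for s in genuine_scores if s <= threshold)
    false_accepts = sum(1 for s in imposter_scores if s > threshold)
    frr = false_rejects / len(genuine_scores)
    far = false_accepts / len(imposter_scores)
    return frr, far


def approximate_eer(genuine_scores, imposter_scores, thresholds):
    """Return (threshold, FRR, FAR) where |FRR - FAR| is smallest."""
    best = None
    for t in thresholds:
        frr, far = error_rates(genuine_scores, imposter_scores, t)
        gap = abs(frr - far)
        if best is None or gap < best[0]:
            best = (gap, t, frr, far)
    _, t, frr, far = best
    return t, frr, far


# Made-up scores for illustration only.
genuine = [0.99, 0.97, 0.96, 0.90, 0.60]
imposter = [0.10, 0.30, 0.50, 0.80, 0.97]

frr, far = error_rates(genuine, imposter, 0.95)
eer_point = approximate_eer(genuine, imposter, [0.5, 0.85, 0.95])
```

Raising the threshold trades false acceptances for false rejections; the EER is read off where the two rates cross.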
5. Methods for classifying keystroke rhythms
- Statistical / probabilistic approaches
- Data mining techniques
- Neural networks
  - EBP (error back propagation) networks
  - CPNN (based on SOM)
  - ART2 networks (unsupervised learning)
  - LVQ networks
  - RBFN
6. Project Description
- Authenticate users based on the keystroke times captured while typing their name.
- Use EBP to train a neural network to generate a user identification that can be compared to a known user identification.
- The result of the system is either "authentication failed" or "authentication successful".
7. Methodology flowchart
8. Implementation
- Capturing keystrokes: GUI in C
  - Requirements
    - Near-microsecond accuracy (HiPerfTimer)
  - Enrollment: times and labels
  - Authentication using captured times.
  - Calls Matlab remotely to process the times.
- Processing data: Matlab
  - Subroutines needed
    - Error back propagation
    - Evaluate a vector of authentication times using the trained network
    - Normalization of training times
    - Normalization of authentication times
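The two normalization subroutines have to agree: authentication times must be scaled with the parameters computed from the training times. The slides do not say which normalization was used, so this Python sketch assumes per-column min-max scaling; `fit_minmax` and `apply_minmax` are hypothetical names and the numbers are made up.

```python
def fit_minmax(rows):
    """Compute per-column (min, max) over the training rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def apply_minmax(row, params):
    """Scale one row using stored (min, max) pairs.

    Columns with no spread (for example zero padding) map to 0.0.
    Values outside the training range fall outside [0, 1].
    """
    scaled = []
    for value, (lo, hi) in zip(row, params):
        scaled.append(0.0 if hi == lo else (value - lo) / (hi - lo))
    return scaled


# Made-up training rows; the last column plays the role of zero padding.
training = [[150, 31, 0], [165, 83, 0], [190, 62, 0]]
params = fit_minmax(training)  # stored at training time
normalized_training = [apply_minmax(r, params) for r in training]

# At authentication time the stored parameters are reused, not recomputed.
attempt = apply_minmax([170, 57, 0], params)
```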
9. Capturing Training Times
- Time the interval between successive key_up and key_down events (keystroke latency).
- A maximum of 50 time intervals can be captured and stored.
  - Unused elements are set to 0.
- The user must type the name correctly or the trial is thrown out.
- Training times are stored in a text file.
  - Additional training times are appended to this file.
- An enrollment comprises 7 successful (correct name typed) captures.
- After enrollment the neural network is retrained.
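The capture step can be sketched in Python as follows. This is one reading of the slide, not the original C code: it assumes key events arrive as (event type, timestamp) pairs and that every gap between consecutive events is stored. The names `intervals_from_events` and `capture_is_valid` and the timestamps are hypothetical.

```python
MAX_INTERVALS = 50  # capacity stated on the slide


def intervals_from_events(events, size=MAX_INTERVALS):
    """Turn (event_type, timestamp) pairs into a fixed-length interval vector.

    Each element is the time between one key event and the next;
    unused elements are set to 0.
    """
    times = [t for _, t in events]
    deltas = [later - earlier for earlier, later in zip(times, times[1:])]
    if len(deltas) > size:
        raise ValueError("too many intervals for the fixed-length vector")
    return deltas + [0] * (size - len(deltas))


def capture_is_valid(typed_text, expected_name):
    """A trial counts only if the name was typed correctly."""
    return typed_text == expected_name


# Made-up timestamps for two key presses.
events = [("key_down", 0), ("key_up", 120), ("key_down", 160), ("key_up", 250)]
vector = intervals_from_events(events)
```

A fixed-length, zero-padded vector lets names of different lengths share one network input size.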
10. Labeling training times
- Each user is represented by a binary string
  - Ex.
    - User Jeff Hieb: 1 0 0
    - User Kunal Pharas: 0 1 0
    - User Suman: 0 0 1
- Training labels are stored in a text file
  - Each line in the file is the user label for the same line in the training file.
  - Additional training labels are appended to this file.
- When a new user enrolls, a 0 is appended to all existing user labels in the file.
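The labeling rule can be illustrated with a short Python sketch. It keeps the labels in memory rather than in a text file, and `enroll_user` is a hypothetical helper, not a function from the project.

```python
def enroll_user(labels, captures=7):
    """Add a new user to a list of existing label rows.

    Every existing row gets a trailing 0; the new user contributes
    `captures` rows with a 1 in the new last position.
    """
    old_width = len(labels[0]) if labels else 0
    extended = [row + [0] for row in labels]
    new_row = [0] * old_width + [1]
    extended.extend(list(new_row) for _ in range(captures))
    return extended


# Two captures per enrollment here, only to keep the example short.
labels = enroll_user([], captures=2)      # first user
labels = enroll_user(labels, captures=2)  # second user
```

After the second enrollment every label has two positions, one per user, which matches the rule that the label width grows with each new user.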
11. Training Data Files
- Sample of training times file
  - . . .
  - 150 31 52 43 125 9 83 14 90 86 69 261 50 213 129 41 166 80 65 253 68 27 67 5 77 10 62 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  - 165 83 31 195 105 6 78 11 155 1 61 220 70 192 140 52 93 129 57 272 70 24 69 7 86 5 67 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  - 190 62 52 115 92 21 73 13 111 32 72 223 77 152 129 52 114 131 56 275 69 39 64 1 82 9 74 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  - 173 62 42 103 105 31 41 38 97 51 63 235 56 187 125 51 125 109 57 269 73 16 67 13 81 1 61 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  - 199 62 21 126 103 10 53 30 93 170 59 175 63 145 135 41 114 130 56 293 70 21 61 14 80 1 63 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  - 208 62 52 117 112 1 82 6 98 208 62 168 81 168 123 53 103 163 66 348 77 33 61 10 83 1 71 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  - 162 73 62 111 97 20 52 36 109 36 78 216 64 155 136 52 125 126 71 308 76 30 63 4 79 10 62 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  - . . .
- Sample of training labels file
  - . . .
  - 0 1 0 0 0 0 0
  - 0 0 1 0 0 0 0
  - 0 0 1 0 0 0 0
  - . . .
  - 0 0 0 0 0 1 0
  - 0 0 0 0 1 0 0
  - 0 0 0 1 0 0 0
  - 0 0 0 1 0 0 0
  - . . .
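Because line k of the labels file belongs to line k of the times file, the two files can be read in parallel. A minimal Python sketch, assuming whitespace-separated integers with one sample per line; `parse_rows` and `pair_training_data` are hypothetical names and the text below is shortened, made-up data.

```python
def parse_rows(text):
    """Parse whitespace-separated integers, one row per non-empty line."""
    return [[int(tok) for tok in line.split()]
            for line in text.splitlines() if line.strip()]


def pair_training_data(times_text, labels_text):
    """Pair each row of times with the label on the same line number."""
    times = parse_rows(times_text)
    labels = parse_rows(labels_text)
    if len(times) != len(labels):
        raise ValueError("times and labels files must have the same number of lines")
    return list(zip(times, labels))


# Shortened, made-up file contents.
times_text = "150 31 0\n165 83 0\n"
labels_text = "1 0\n0 1\n"
pairs = pair_training_data(times_text, labels_text)
```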
12. Training the Neural Network
- GUI calls the Matlab function EBP(filename), where filename denotes the training times and training labels.
- EBP normalizes the data and stores the normalization parameters in a file.
- The number of output neurons is determined by the training labels: 5 users → 5 output neurons.
- Output layer uses a uni-polar activation function.
- Trained weights are stored in a file.
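The training step can be sketched in Python as a small network trained by error back propagation. This is not the project's Matlab EBP routine. It assumes one hidden layer, a sigmoid as the uni-polar activation, per-sample weight updates, and a stopping rule based on the summed squared error. The names `train_ebp` and `forward`, the parameter names `rate` and `e_max`, and the toy data are all hypothetical.

```python
import math
import random


def sigmoid(x):
    """Uni-polar activation: output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(weights, inputs):
    """Return (hidden activations, output activations) for one input vector."""
    w_hidden, w_out = weights
    x = list(inputs) + [1.0]  # constant bias input
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    h = hidden + [1.0]  # bias for the output layer
    outputs = [sigmoid(sum(w * v for w, v in zip(row, h))) for row in w_out]
    return hidden, outputs


def train_ebp(samples, n_hidden, rate=0.2, e_max=0.0005, max_epochs=20000, seed=0):
    """Train on (inputs, targets) pairs; output count follows the label length."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    n_out = len(samples[0][1])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
             for _ in range(n_out)]
    for _ in range(max_epochs):
        total_error = 0.0
        for inputs, targets in samples:
            hidden, outputs = forward((w_hidden, w_out), inputs)
            x = list(inputs) + [1.0]
            h = hidden + [1.0]
            # Output deltas: error times the derivative of the sigmoid.
            d_out = [(t - o) * o * (1.0 - o) for t, o in zip(targets, outputs)]
            # Hidden deltas: output deltas propagated back through w_out.
            d_hid = [hj * (1.0 - hj) * sum(d_out[k] * w_out[k][j] for k in range(n_out))
                     for j, hj in enumerate(hidden)]
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    w_out[k][j] += rate * d_out[k] * h[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hidden[j][i] += rate * d_hid[j] * x[i]
            total_error += 0.5 * sum((t - o) ** 2 for t, o in zip(targets, outputs))
        if total_error < e_max:
            break
    return w_hidden, w_out


# Toy data only: two "users", two already-normalized features each.
samples = [([0.0, 1.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
weights = train_ebp(samples, n_hidden=3, max_epochs=5000)
_, outputs = forward(weights, [0.0, 1.0])
```

The number of rows in the output weight matrix equals the label length, which mirrors the rule that each enrolled user gets one output neuron.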
13. Authentication
- Capture keystrokes using the same procedure as before.
- If the user mistypes the name, authentication fails, but the user is informed why and the trial is discarded.
- GUI calls the Matlab function evaluate(filename), where filename is a file containing the captured times.
  - Evaluate normalizes the data using the parameters stored during training.
  - Evaluate then uses the stored weights to produce the outputs of the network, which are returned.
- The GUI maps the network output to a string of 0s and 1s.
  - If f(net) is greater than alpha (e.g. 0.95) then the value is 1, otherwise the value is 0.
- This string is then compared to the desired user string.
- If there is a match, authentication is successful; otherwise authentication fails.
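The final decision step reduces to a threshold and a comparison. A minimal Python sketch of that rule; `to_bit_string` and `authenticate` are hypothetical names and the network outputs are made-up values.

```python
def to_bit_string(outputs, alpha=0.95):
    """Map each network output to 1 if it exceeds alpha, otherwise 0."""
    return [1 if o > alpha else 0 for o in outputs]


def authenticate(outputs, claimed_label, alpha=0.95):
    """Succeed only if the thresholded outputs equal the claimed user's label."""
    return to_bit_string(outputs, alpha) == list(claimed_label)


# Made-up outputs for a system with three enrolled users.
strong_match = authenticate([0.02, 0.97, 0.10], [0, 1, 0])
weak_match = authenticate([0.02, 0.90, 0.10], [0, 1, 0])
weak_match_relaxed = authenticate([0.02, 0.90, 0.10], [0, 1, 0], alpha=0.75)
```

Lowering alpha accepts weaker matches, which reduces false rejections at the cost of more false acceptances.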
14. Keystroke capture and authentication GUI
15. Testing and Results
- Enrolled 7 users (49 training pairs).
- Each user had at least 3 authentication attempts (total of 45 authentication trials).
- 42 imposter trials.
- The majority of imposter authentication attempts were made by us.
- Many authentication trials are for one user.
16. Plot of Normalized Training Times
17. Effect of hidden layers on accuracy
Alpha = 0.95, C = 0.2, Emax = 0.0005
18. Effect of Training error on accuracy
Alpha = 0.95, C = 0.2, Hidden Neurons = 24
19. Overall Classifier Accuracy
Max error = 0.0005, C = 0.2, Hidden Neurons = 24
Best performance: Alpha = 0.75, FRR 7, FAR 30
20. Conclusions
- For users with short names (fewer than 8 characters) or with long latencies (not proficient typists), circumvention was high.
- Creating an interface that is acceptable and easy to use for a wide variety of users is not trivial.
- Not allowing for typographical errors is irritating to users and may affect acceptance.
- Imposter training samples are not required.
21. Future Research Directions
- Ways of handling typographical errors.
- Ways to scale keystroke biometrics to large numbers of users.
- Explore other methods of evaluation, particularly unsupervised learning.
- Explore extraction of more sophisticated keystroke features.
22. Questions?
23. References
- J. Bechtel, "Passphrase authentication based on typing style through an ART 2 neural network," IJCIA, Vol. 2, No. 2 (2002), pp. 1-22.
- A. Peacock, "Typing Patterns: A Key to User Identification," IEEE Security and Privacy, September/October 2004, pp. 40-47.
- L. Araujo, "User Authentication Through Typing Biometrics Features," IEEE Transactions on Signal Processing, Vol. 53, No. 2, February 2005.
- A. Guven, "Understanding users' keystroke patterns for computer access security," Computers & Security, Vol. 22, No. 8, 2003, pp. 695-706.
- F. Monrose, "Keystroke dynamics as a biometric for authentication," Future Generation Computer Systems, Vol. 16, 2000, pp. 351-359.
- M. Obaidat, "An On-Line Neural Network System for Computer Access Security," IEEE Transactions on Industrial Electronics, Vol. 40, No. 2, April 1993, pp. 235-242.