Title: Security Research at Pace Keystroke Biometric
1. Security Research at Pace: Keystroke Biometric
- Drs. Charles Tappert and Allen Stix
- Seidenberg School of CSIS
2. Introduction: Validate importance of study applications
- Internet authentication application
- Authenticate (verify) student test-takers
- Internet identification application
- Identify perpetrators of inappropriate email
- Internet security for other applications
- Important as more businesses move toward
- e-commerce
3. Introduction: Define Keystroke Biometric
- The keystroke biometric is one of the less-studied behavioral biometrics
- Based on the idea that typing patterns are unique to individuals and difficult to duplicate
4. Introduction: Appeal of Keystroke Biometric
- Not intrusive: data captured as users type
- Users type frequently for business/pleasure
- Inexpensive: keyboards are common
- No special equipment necessary
- Can continue to check ID with keystrokes after initial authentication
- As users continue to type
5. Introduction: Previous Work on Keystroke Biometric
- One early study goes back to typewriter input
- Identification versus authentication
- Most studies were on authentication
- Two commercial products on hardening passwords
- Few on identification (more difficult problem)
- Short versus long text input
- Most studies used short input: passwords, names
- Few used long text input: copy or free text
- Other keystroke problems studied
- One study detected fatigue, stress, etc.
- Another detected ID change via monitoring
6. Introduction: Feature Measurements
- Features derived from raw data
- Key press times and key release times
- Each keystroke provides small amount of data
- Data varies across keyboards, conditions, and entered texts
- Using long text input allows:
- Use of good (statistical) feature measurements
- Generalization over keyboards, conditions, etc.
7. Introduction: Make Case for Using
- Data over the internet
- Required by applications
- Long text input
- More and better features
- Higher accuracy
- Free text input
- Required by applications
- Predefined copy texts unacceptable
8. Introduction: Summary of Scope and Methodology
- Determine distinctiveness of keystroke patterns
- Two application types
- Identification (1-of-n problem)
- Authentication (yes/no problem)
- Two independent variables (4 data quadrants)
- Keyboard type: desktop versus laptop
- Entry mode: copy versus free text
9. Keystroke Biometric System Components
- Raw keystroke data capture
- Feature extraction
- Classification for identification
- Classification for authentication
10. Keystroke Biometric System: Raw Keystroke Data Capture
11. Keystroke Biometric System: Raw Keystroke Data Capture
12. Keystroke Biometric System: Feature Extraction
- Mostly statistical features
- Averages and standard deviations
- Key press times
- Transition times between keystroke pairs
- Individual keys and groups of keys (hierarchy)
- Percentage features
- Percentage use of non-letter keys
- Percentage use of mouse clicks
- Input rates: average time/keystroke
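The statistical, percentage, and rate features listed above can be sketched in a few lines. This is a minimal illustration, not the system's actual code: it assumes each raw event is a hypothetical `(key, press_ms, release_ms)` tuple, reads "key press times" as the duration a key is held down, and treats any key name that is not a single letter as a non-letter key. Mouse clicks are left out for brevity.

```python
from statistics import mean, pstdev

def duration_features(events):
    """Summarize a sample of keystrokes.

    events: list of (key, press_ms, release_ms) tuples in typing order.
    The tuple layout is an assumption made for this sketch.
    """
    # Duration (dwell) of each keystroke: release time minus press time.
    durations = [release - press for _, press, release in events]
    # Count keys that are not a single alphabetic character.
    non_letters = sum(
        1 for key, _, _ in events if not (len(key) == 1 and key.isalpha())
    )
    # Elapsed time from the first press to the last release.
    total_time = events[-1][2] - events[0][1]
    return {
        "mean_duration": mean(durations),
        "std_duration": pstdev(durations),
        "pct_non_letter": non_letters / len(events),
        "avg_time_per_key": total_time / len(events),
    }

sample = [("t", 0, 80), ("h", 120, 190), ("e", 260, 350), ("space", 400, 460)]
features = duration_features(sample)
```

In a fuller version the same averages and standard deviations would be computed per key and per key group rather than over the whole sample.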
13. Keystroke Biometric System: Feature Extraction
A two-key sequence (th) showing the two
transition measures
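The slide's figure is not reproduced here, so the following is only a plausible sketch of two transition measures for a key pair such as "th". It assumes the two measures are (1) release of the first key to press of the second and (2) press of the first key to press of the second; the `(key, press_ms, release_ms)` tuple layout and the function name are hypothetical.

```python
def transition_measures(first, second):
    """Return two transition times for a consecutive key pair.

    first, second: (key, press_ms, release_ms) tuples (assumed layout).
    """
    _, press1, release1 = first
    _, press2, _ = second
    # Release-to-press: negative when the second key is pressed
    # before the first key has been released (overlapping keys).
    release_to_press = press2 - release1
    # Press-to-press: non-negative when the keys are given in typing order.
    press_to_press = press2 - press1
    return release_to_press, press_to_press

# "t" held from 0 to 80 ms, "h" pressed at 60 ms while "t" is still down.
t1, t2 = transition_measures(("t", 0, 80), ("h", 60, 150))
```

The overlapping example shows why the first measure can be negative for fast typists, while the second stays positive.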
14. Keystroke Biometric System: Feature Extraction
Hierarchy tree for the 39 duration categories
15. Keystroke Biometric System: Feature Extraction
Hierarchy tree for the 35 transition categories
16. Keystroke Biometric System: Feature Extraction
- Two preprocessing steps
- Outlier removal
- Remove duration and transition times > threshold
- Feature standardization
- Convert features into the range 0-1
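The two preprocessing steps might look like the sketch below. The slide does not say how the threshold is chosen or which formula maps features into 0-1, so both are assumptions here: a fixed cutoff passed in by the caller, and plain min-max scaling.

```python
def remove_outliers(times, threshold):
    """Drop duration or transition times greater than the threshold."""
    return [t for t in times if t <= threshold]

def min_max_scale(values):
    """Map values linearly into the range 0-1 (assumed standardization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to scale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# A long pause of 2400 ms is discarded with an illustrative 500 ms cutoff.
cleaned = remove_outliers([80, 95, 110, 2400], threshold=500)
# One feature measured across three samples, scaled into 0-1.
scaled = min_max_scale([75.0, 100.0, 125.0])
```

Scaling every feature to a common range keeps features measured in large units from dominating a distance computed over the whole feature vector.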
17. Keystroke Biometric System: Classification for Identification
- Nearest neighbor using Euclidean distance
- Compare a test sample against the training
samples, and the author of the nearest training
sample is identified as the author of the test
sample
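The nearest-neighbor rule described above can be sketched as follows. The feature vectors, author names, and the `identify` function are illustrative only; the vectors are assumed to be already standardized to the range 0-1.

```python
import math

def identify(test_vector, training_samples):
    """Return the author of the training sample nearest to test_vector.

    training_samples: list of (author, feature_vector) pairs.
    Distance is Euclidean, computed with math.dist (Python 3.8+).
    """
    best_author, best_dist = None, math.inf
    for author, vector in training_samples:
        dist = math.dist(test_vector, vector)
        if dist < best_dist:
            best_author, best_dist = author, dist
    return best_author

training = [
    ("alice", [0.10, 0.20, 0.30]),
    ("bob", [0.80, 0.70, 0.90]),
    ("carol", [0.40, 0.90, 0.10]),
]
predicted = identify([0.15, 0.25, 0.35], training)
```

With several enrollment samples per user, each sample is simply listed as its own `(author, vector)` pair.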
18. Experimental Design and Data Collection: Design
- Two independent variables
- Keyboard type
- Desktop: all Dell
- Laptop: 90% Dell, plus IBM, Compaq, Apple, HP, Toshiba
- Input mode
- Copy task: predefined text
- Free text input: e.g., arbitrary email
19. Experimental Design and Data Collection
20. Experimental Results: Identification Experimental Results
Identification performance under ideal
conditions (same keyboard type and input mode,
leave-one-out procedure)
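A leave-one-out evaluation of a nearest-neighbor identifier can be sketched as below: each sample is held out in turn, classified against all remaining samples, and the fraction of correct identifications is reported. The data and names are made up for illustration and are not the study's results.

```python
import math

def leave_one_out_accuracy(samples):
    """Fraction of samples whose nearest other sample has the same author.

    samples: list of (author, feature_vector) pairs.
    """
    correct = 0
    for i, (author, vector) in enumerate(samples):
        # Hold out sample i and search the remaining samples.
        rest = samples[:i] + samples[i + 1:]
        nearest_author, _ = min(rest, key=lambda s: math.dist(vector, s[1]))
        if nearest_author == author:
            correct += 1
    return correct / len(samples)

samples = [
    ("alice", [0.10, 0.20]),
    ("alice", [0.15, 0.25]),
    ("bob", [0.80, 0.90]),
    ("bob", [0.85, 0.80]),
    ("carol", [0.50, 0.10]),  # only one sample, so it cannot match itself
]
accuracy = leave_one_out_accuracy(samples)
```

In this toy data the user with a single sample is always misidentified, because holding that sample out leaves no other sample from the same user.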
21. Experimental Results: System hierarchical model and parameters
Identification accuracy versus enrollment samples
22. Experimental Results: System hierarchical model and parameters
Distributions of u duration times for each
entry mode
23. Conclusions
- Results are important and timely as more people become involved in the applications of interest
- Authenticating online test-takers
- Identifying senders of inappropriate email
- High performance (accuracy) results if
- 2 or more enrollment samples/user
- Users use same keyboard type