Title: Biometric Data Mining
1 Biometric Data Mining
- A Data Mining Study of Mouse Movement,
Stylometry, and Keystroke Biometric Data
Clara Eusebi, Cosmin Gilga, Deepa John, Andre
Maisonave.
2 Presentation Summary
- Project Description
- Experiment Structure
- Algorithms and Techniques
- Results of Experiments
- Future Research
- Conclusions
3 Project Description
- The study extends previous studies at Pace
University on biometric data by running
previously obtained data sets through the data
mining tool Weka, using various algorithms
and techniques.
4 Study Experiments
- Authentication
  - Dichotomy model
- Identification
  - Normalized data
- Additional
  - Normalized data
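The slides name the dichotomy model without spelling it out. As a minimal sketch, assuming the usual formulation (each pair of samples becomes one difference record, labeled within-class when both samples come from the same subject and between-class otherwise), the transformation could look like this. The function name, the toy data, and the choice of element-wise absolute difference are illustrative assumptions, not the study's actual feature pipeline.

```python
from itertools import combinations

def dichotomy_records(samples):
    """Turn (subject, feature_vector) samples into labeled difference records.

    Each unordered pair of samples yields one record: the element-wise
    absolute difference of the two feature vectors, labeled "within" when
    both samples come from the same subject and "between" otherwise.
    """
    records = []
    for (subj_a, vec_a), (subj_b, vec_b) in combinations(samples, 2):
        diff = [abs(a - b) for a, b in zip(vec_a, vec_b)]
        label = "within" if subj_a == subj_b else "between"
        records.append((diff, label))
    return records

# Hypothetical toy data: 2 subjects, 2 samples each, 2 features per sample.
samples = [
    ("s1", [0.10, 0.50]),
    ("s1", [0.12, 0.48]),
    ("s2", [0.40, 0.20]),
    ("s2", [0.43, 0.22]),
]
records = dichotomy_records(samples)
within = sum(1 for _, label in records if label == "within")
between = sum(1 for _, label in records if label == "between")
print(len(records), within, between)
```

The point of the transformation is that a many-subject identification problem becomes a two-class problem (same person or different person), which is what an authentication decision needs.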
5 Algorithms and Techniques
- Authentication
  - IBk with k = 1 on Dichotomy data
- Identification
  - IBk with k = 1 on Normalized data
- Additional
  - PredictiveApriori
  - SimpleKMeans
  - IBk with k = 1 using leave-one-out and percentage
    splits
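IBk is Weka's k-nearest-neighbor classifier, so k = 1 means each test record takes the label of its single closest training record. The sketch below shows that idea together with leave-one-out evaluation in plain Python. It is an illustration only, not Weka's implementation: Euclidean distance is assumed, and the function names and toy data are hypothetical.

```python
import math

def nearest_neighbor(train, query):
    """Return the label of the training sample closest to query (k = 1)."""
    best_label, best_dist = None, math.inf
    for vec, label in train:
        dist = math.dist(vec, query)  # Euclidean distance (assumed metric)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_accuracy(data):
    """Classify each sample against all the others; return accuracy in percent."""
    correct = 0
    for i, (vec, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out sample i
        if nearest_neighbor(rest, vec) == label:
            correct += 1
    return 100.0 * correct / len(data)

# Hypothetical toy data: the last "s2" sample sits close to the "s1" cluster.
data = [
    ([0.10, 0.50], "s1"),
    ([0.12, 0.48], "s1"),
    ([0.40, 0.20], "s2"),
    ([0.43, 0.22], "s2"),
    ([0.15, 0.45], "s2"),
]
accuracy = leave_one_out_accuracy(data)
print(accuracy)
```

A percentage split differs only in how the data is divided: a fixed share of the records is used for training and the remainder for testing, instead of holding out one record at a time.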
6 Results
Train                                  Test                                   Type          Accuracy (%)
1 (5 samples from each of 4 subjects)  2 (5 samples from each of 4 subjects)  Copy Desktop  95.79
1 (5 samples from each of 4 subjects)  2 (5 samples from each of 4 subjects)  Free Desktop  96.32
1 (5 samples from each of 4 subjects)  2 (5 samples from each of 4 subjects)  Copy Laptop   91.58
1 (5 samples from each of 4 subjects)  2 (5 samples from each of 4 subjects)  Free Laptop   92.11
1 (5 samples from each of 4 subjects)  3 (5 samples from each of 4 subjects)  Copy Desktop  88.95
1 (5 samples from each of 4 subjects)  3 (5 samples from each of 4 subjects)  Free Desktop  98.42
1 (5 samples from each of 4 subjects)  3 (5 samples from each of 4 subjects)  Copy Laptop   100.00
1 (5 samples from each of 4 subjects)  3 (5 samples from each of 4 subjects)  Free Laptop   93.68
Results of Longitudinal Authentication
Experiments on the new Keystroke Capture Data.
7 Results
Train                                  Test                                   Type          Accuracy (%)
1 (5 samples from each of 4 subjects)  2 (5 samples from each of 4 subjects)  Copy Desktop  95
1 (5 samples from each of 4 subjects)  2 (5 samples from each of 4 subjects)  Free Desktop  100
1 (5 samples from each of 4 subjects)  2 (5 samples from each of 4 subjects)  Copy Laptop   100
1 (5 samples from each of 4 subjects)  2 (5 samples from each of 4 subjects)  Free Laptop   85
1 (5 samples from each of 4 subjects)  3 (5 samples from each of 4 subjects)  Copy Desktop  80
1 (5 samples from each of 4 subjects)  3 (5 samples from each of 4 subjects)  Free Desktop  100
1 (5 samples from each of 4 subjects)  3 (5 samples from each of 4 subjects)  Copy Laptop   100
1 (5 samples from each of 4 subjects)  3 (5 samples from each of 4 subjects)  Free Laptop   100
Results of Longitudinal Identification
Experiments on the new Keystroke Capture Data.
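The slides do not say how many decisions each accuracy figure rests on. One reading that fits every reported value, offered here as an assumption rather than a statement of the authors' method, is that each identification test classifies the 20 test samples (5 samples from each of 4 subjects), while each authentication test classifies the 190 unordered pairs that 20 samples produce. The short check below searches for a whole number of correct decisions that reproduces each reported percentage under that assumption.

```python
from math import comb

test_samples = 5 * 4            # 5 samples from each of 4 subjects
pairs = comb(test_samples, 2)   # unordered sample pairs (assumed test size)

# Accuracies as reported on the two results slides.
reported_auth = [95.79, 96.32, 91.58, 92.11, 88.95, 98.42, 100.00, 93.68]
reported_ident = [95, 100, 100, 85, 80, 100, 100, 100]

def matching_count(accuracy, total):
    """Return a count of correct decisions out of total that rounds to accuracy."""
    for correct in range(total + 1):
        if round(100 * correct / total, 2) == round(accuracy, 2):
            return correct
    return None  # no whole-number count reproduces the reported value

auth_counts = [matching_count(a, pairs) for a in reported_auth]
ident_counts = [matching_count(a, test_samples) for a in reported_ident]
print(pairs, auth_counts, ident_counts)
```

If this reading is right, a single identification error moves the accuracy by 5 points, so differences between rows in the identification table reflect only a handful of samples.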
8 Opportunities for Research
- Authentication based solely on the subject in
question.
  - Separate sets of data holding only within- and
  between-class records for each subject,
  - Rather than comparing a community of subjects to
  a community of records.
- Higher accuracies could be legitimately obtained
in this manner.
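As a rough illustration of the per-subject idea, the sketch below groups pairwise difference records by the subject they involve, so that each subject gets its own within-class and between-class set. The record format (absolute feature differences), the function name, and the toy data are assumptions made for the example; the slides do not specify them.

```python
from itertools import combinations

def per_subject_records(samples):
    """Group difference records by the subject they involve.

    For each subject, "within" holds differences between two of that
    subject's own samples, and "between" holds differences between one of
    that subject's samples and a sample from any other subject.
    """
    sets = {}
    for (subj_a, vec_a), (subj_b, vec_b) in combinations(samples, 2):
        diff = [abs(a - b) for a, b in zip(vec_a, vec_b)]
        if subj_a == subj_b:
            entry = sets.setdefault(subj_a, {"within": [], "between": []})
            entry["within"].append(diff)
        else:
            # A between-class record belongs to both subjects in the pair.
            for subj in (subj_a, subj_b):
                entry = sets.setdefault(subj, {"within": [], "between": []})
                entry["between"].append(diff)
    return sets

# Hypothetical toy data: 3 subjects, 2 samples each.
samples = [
    ("s1", [0.10, 0.50]), ("s1", [0.12, 0.48]),
    ("s2", [0.40, 0.20]), ("s2", [0.43, 0.22]),
    ("s3", [0.70, 0.90]), ("s3", [0.68, 0.93]),
]
sets = per_subject_records(samples)
for subject, entry in sorted(sets.items()):
    print(subject, len(entry["within"]), len(entry["between"]))
```

A classifier trained on one subject's set would then answer only "is this that subject or not", instead of learning a single same-or-different boundary for the whole community.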
9 Conclusion
- The study has furthered previous studies at Pace
University by running experiments on Mouse
Movement, Stylometry, and Keystroke Biometric
data, new and previously obtained, using the data
mining tool Weka.
- The data mining algorithms with which the
experiments were conducted are widely used and
provide an entry point for future researchers
into the use of data mining with biometric data
sets.