Title: Data Mining for Computer Security
- Data Mining Laboratory, SNU.
- Tae Hoon Ko
- 2008.10.29
Contents
- Introduction
  - Problem Cases I, II, III, IV
- Research Area
  - Research Area I: Intrusion Detection System (IDS)
  - Research Area II: Detection of Malicious Executables
- Conclusion
Introduction
Introduction: Problem Case I
- Huge computer network infrastructure
  - Enormous in both breadth and depth.
  - Many opportunities for attack.
Introduction: Problem Case II
- Importance of possessing information
  - Example: private credit information
- Attackers vs. defenders
  - Attackers try to obtain information illegally.
  - Defenders must keep (customer) information safe.
- An unscrupulous quest for profit
  - Some corporations sell customer information illegally.
  - The aim is to gain monetary value and political benefit.
Introduction: Problem Case III
- Exponential increase in the number of malicious executables
  - Malicious code: viruses, worms, Trojan horses.
  - Detection of malicious code has been researched for a long time.
  - But there are too many malicious executables!
  - 1995: 5,500 viruses
  - 2004: more than 100,000 viruses
Introduction: Problem Case IV
- Lack of awareness of computer security
  - Each user needs to make an effort to protect his or her own private space.
- Study by America Online and the National Cyber Security Alliance (2006)
  - 77% of users think they are safe from online threats.
  - 67% of computers lack antivirus software.
  - 49% of users do not use any firewall protection.
Research Area I: Intrusion Detection System (IDS)
- Goal
  - Stop a hacker's activity before he or she can cause any damage or gain access to any sensitive information.
- Basic mechanism: monitor the data, then filter.
Research Area I: Intrusion Detection System (IDS)
- Types: how does an IDS detect intrusions?
  - Anomaly detection IDS
    - Assumption: intrusions appear as deviations from normal patterns.
  - Misuse detection IDS
    - Based on knowledge of system weak points
    - Based on known attack patterns
    - Finds intruders who aim at known weak points.
- Types: where does an IDS obtain its information?
  - Host-based IDS
    - Obtains information from a single host.
  - Network-based IDS
    - Monitors the traffic in the network.
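The two detection styles above can be contrasted in a minimal sketch. Everything concrete here is an assumption made for illustration: the signature strings, the logins-per-hour profile, and the z-score rule with a threshold of 3 are hypothetical and do not come from any particular IDS.

```python
from statistics import mean, stdev

# Hypothetical attack signatures for misuse detection (illustrative only).
KNOWN_ATTACK_PATTERNS = ("/etc/passwd", "' OR '1'='1")


def misuse_detect(request: str) -> bool:
    """Misuse detection: flag a request that contains a known attack pattern."""
    return any(pattern in request for pattern in KNOWN_ATTACK_PATTERNS)


def anomaly_detect(normal_counts, observed, threshold=3.0) -> bool:
    """Anomaly detection: flag an observation that deviates from the normal profile.

    normal_counts holds measurements taken during normal operation,
    for example logins per hour. The z-score rule is an assumption of this sketch.
    """
    mu = mean(normal_counts)
    sigma = stdev(normal_counts)
    if sigma == 0:
        # A perfectly constant profile: any different value is a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


normal_logins_per_hour = [4, 5, 6, 5, 4, 6, 5, 5]

print(misuse_detect("GET /index.php?file=../../etc/passwd"))
print(anomaly_detect(normal_logins_per_hour, 5))
print(anomaly_detect(normal_logins_per_hour, 40))
```

The misuse detector can only catch what is already in its pattern list, while the anomaly detector can flag unseen behaviour at the cost of depending on how well the normal profile was learned.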
Research Area I: Intrusion Detection System (IDS)
- Data mining issues
  - Data object reduction
    - Too many data objects lead to high time and space complexity.
    - How can useless objects be identified?
  - Feature selection
    - False correlations, redundant features, ...
    - Which subset of features is best for classifying intrusions accurately?
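One common way to approach the feature selection question is to score each feature by its information gain with respect to the class label and keep the highest-scoring ones. The sketch below assumes discrete features. The record fields (`protocol`, `service`, `failed_logins`) and the toy data are hypothetical, and information gain is only one of several possible criteria.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy obtained by splitting on one discrete feature."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [lab for val, lab in zip(values, labels) if val == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def rank_features(records, labels):
    """Return (feature name, information gain) pairs, best feature first."""
    names = list(records[0].keys())
    gains = {
        name: information_gain([r[name] for r in records], labels)
        for name in names
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical connection records (illustrative only).
records = [
    {"protocol": "tcp", "service": "http", "failed_logins": "high"},
    {"protocol": "tcp", "service": "http", "failed_logins": "low"},
    {"protocol": "tcp", "service": "ftp", "failed_logins": "high"},
    {"protocol": "tcp", "service": "ftp", "failed_logins": "low"},
]
labels = ["attack", "normal", "attack", "normal"]

for name, gain in rank_features(records, labels):
    print(f"{name}: {gain:.3f}")
```

In this toy data `protocol` never varies and `service` is unrelated to the label, so both score zero and could be dropped, while `failed_logins` separates the classes completely.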
Research Area II: Detection of Malicious Executables
- Situation
  - Software for virus detection (Norton Anti-Virus, V3)
    - Excellent technology.
    - Based only on known patterns.
    - Detects malicious code only after it has attacked some computers.
  - Exponential increase in the number of malicious executables.
  - Malicious programs are easy to write with a free virus kit.
- Goal
  - Detect unknown malicious executables.
Research Area II: Detection of Malicious Executables
J. Z. Kolter and M. A. Maloof, Learning to Detect Malicious Executables in the Wild, Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004
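The cited paper represents executables by n-grams of their byte codes, which lets a classifier learn from files without relying on known signatures. The following is a minimal sketch of that kind of feature extraction, not the authors' implementation. The function names, the sample bytes, and the two-entry vocabulary are hypothetical.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count overlapping byte n-grams, keyed by their hex representation."""
    return Counter(data[i:i + n].hex() for i in range(len(data) - n + 1))


def to_feature_vector(data: bytes, vocabulary, n: int = 4):
    """Binary vector: 1 if the n-gram occurs in the file, otherwise 0."""
    present = byte_ngrams(data, n)
    return [1 if gram in present else 0 for gram in vocabulary]


# Made-up bytes standing in for the contents of an executable.
sample = bytes.fromhex("4d5a90000300")

# Hypothetical vocabulary of n-grams chosen during feature selection.
vocabulary = ["4d5a9000", "deadbeef"]

print(byte_ngrams(sample))
print(to_feature_vector(sample, vocabulary))
```

Vectors of this form can be fed to any standard classifier. In practice the vocabulary would be a small, selected subset of all n-grams observed in the training files.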
Conclusion
- Review more papers about
  - Computer security
  - Information assurance
  - Intrusion detection
- Try to get some real data.
References
- [1] M. A. Maloof (Ed.), Machine Learning and Data Mining for Computer Security, Springer, 2006.
- [2] J. Z. Kolter and M. A. Maloof, Learning to detect malicious executables in the wild, Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004.
- [3] T. Shon, Y. Kim, C. Lee and J. Moon, A machine learning framework for network anomaly detection using SVM and GA, Proceedings of the 2005 IEEE workshop on Assurance and security, 2006.
- [4] D. Dasgupta and F. A. Gonzalez, An intelligent decision support system for intrusion detection and response, Lecture Notes in Computer Science, 2001.
THANK YOU
Any Questions?