Technology for Pooling Knowledge - PowerPoint PPT Presentation

About This Presentation

Title:

Technology for Pooling Knowledge

Description:

What is Knowledge Engineering? Techniques employed to build intelligent systems ... Process Control Depth of Anaesthesia. Teaching intelligent tutoring systems ...

Slides: 27

Provided by: davep171

Transcript and Presenter's Notes

1
Technology for Pooling Knowledge
2
Overview

What is Knowledge Engineering
Why Pool Knowledge
Knowledge Pooling Process
What is Data Mining
Paradigms for Data Mining
Which Paradigm to Use?
Issues
Conclusions

3
What is Knowledge Engineering?

Techniques employed to build intelligent systems
knowledge acquisition and discovery
representation and integration
reasoning methodologies
explanation
Decision Support: tumour identification
Process Control: depth of anaesthesia
Teaching: intelligent tutoring systems

4
Data, Information and Knowledge

Data: a collection of facts which have no meaning
on their own
Hot, -6
Information: data becomes information when it is
interpreted in context
Engine hot, -6°C
Knowledge: information becomes knowledge when it
is usefully applied
The engine is hot, therefore it must have been
used recently
The temperature is -6°C, so I had better wear gloves

5
Why Pool Knowledge

To develop intelligent systems we must pool
knowledge from data
See the big picture: what does all the data have
to say?
Improve our decision making processes
Improved diagnosis
More effective treatment
Higher quality management
Enhance understanding of disease progression

6
How Do We Pool Knowledge

Too much data for a human to trawl through, so
automated techniques have been developed
Knowledge Discovery
Data Warehousing: consistency, integrity
Data Mining
Expert may not be aware of knowledge found
Either I'm missing something, or nothing's going
on!

7
Knowledge Pooling in Practice
Patient History
Knowledge Pool
Knowledge Discovery Toolset
Data Warehouse
Clinical Data
Background Knowledge
Knowledge Based Decision Support System
Patient Details
Advice
Unstructured data
Expert(s) / Decision Maker(s)
8
Knowledge Pooling Process

Data Warehouse
Integration and Cleaning
Legacy Databases
9
Knowledge Pooling Process

Data Warehouse
Selection
Integration and Cleaning
Legacy Databases
10
Knowledge Pooling Process
Data Mining
Knowledge Pool

Data Warehouse
Selection
Integration and Cleaning
Legacy Databases
11
Knowledge Pooling Process
Expert(s) / Decision Maker(s)
Feedback
Data Mining
Knowledge Pool

Data Warehouse
Selection
Background knowledge
Integration and Cleaning
Legacy Databases
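
Read bottom-up, the labels on the Knowledge Pooling Process slides suggest a pipeline: legacy databases are integrated and cleaned into a data warehouse, a selection is taken from it, and data mining turns that selection into entries for the knowledge pool. The following is only a minimal Python sketch of that reading. The record fields, the cleaning rules and the counting step that stands in for "data mining" are all hypothetical and are not taken from the presentation.

```python
from collections import Counter

# Hypothetical records from two legacy databases with inconsistent conventions.
legacy_a = [
    {"id": 1, "sex": "M", "age": 52, "diagnosis": "psoriasis"},
    {"id": 2, "sex": "F", "age": None, "diagnosis": "lichen planus"},
]
legacy_b = [
    {"id": 3, "sex": "male", "age": 47, "diagnosis": "Psoriasis"},
    {"id": 4, "sex": "female", "age": 33, "diagnosis": "psoriasis"},
]


def integrate_and_clean(*sources):
    """Merge sources into one warehouse: normalise codes, drop incomplete rows."""
    sex_codes = {"m": "M", "male": "M", "f": "F", "female": "F"}
    warehouse = []
    for source in sources:
        for row in source:
            if any(value is None for value in row.values()):
                continue  # integrity: skip records with missing values
            cleaned = dict(row)
            cleaned["sex"] = sex_codes[row["sex"].lower()]  # consistency
            cleaned["diagnosis"] = row["diagnosis"].lower()
            warehouse.append(cleaned)
    return warehouse


def select(warehouse, fields):
    """Keep only the attributes relevant to the mining task."""
    return [{field: row[field] for field in fields} for row in warehouse]


def mine(selection):
    """Stand-in for a real data mining step: count diagnoses per sex."""
    return Counter((row["sex"], row["diagnosis"]) for row in selection)


warehouse = integrate_and_clean(legacy_a, legacy_b)
selection = select(warehouse, ["sex", "diagnosis"])
knowledge_pool = mine(selection)
print(knowledge_pool)
```

The feedback loop from experts and the use of background knowledge shown on the final build are not modelled here.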
12
What is Data Mining

The nontrivial extraction of implicit,
previously unknown, and potentially useful
information from data
Term is a misnomer: knowledge mining
Uses machine learning, statistical and
visualisation techniques to discover and present
knowledge in a form which is easily
comprehensible to humans

13
Approaches For Data Mining

Classification
Prediction
Association Rule Discovery
Sequence Rule Discovery
Clustering/Segmentation

14
Classification Tasks

Process of examining the features of a record and
assigning it to one of a predefined set of
classes
Discovers, from the data, a model that can
classify new records
Example application: classifying skin disease
Technologies used
Decision Trees
Memory Based Reasoning
Rule Induction

15
Classification
[Diagram: Training Data is used to build the Classifier Model; the model is then used on Test Data to produce a Classification, here psoriasis. Classes shown: psoriasis, seborrhoeic dermatitis, lichen planus, pityriasis rosea, chronic dermatitis, pityriasis rubra pilaris]
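
The build-model / use-model cycle can be sketched in a few lines of Python. This sketch reads "memory based reasoning" as a nearest-neighbour vote, which is an assumption about what the slides mean by the term. The three numeric features and all training values are invented for illustration and have no clinical meaning.

```python
from collections import Counter
import math

# Hypothetical training data: (feature vector, class label).
# The features are made-up scores, e.g. (erythema, scaling, itching).
training_data = [
    ((3, 3, 1), "psoriasis"),
    ((2, 3, 0), "psoriasis"),
    ((3, 2, 1), "psoriasis"),
    ((1, 1, 3), "lichen planus"),
    ((2, 0, 3), "lichen planus"),
    ((1, 0, 2), "lichen planus"),
]


def classify(record, training, k=3):
    """Assign `record` to the majority class among its k nearest training records."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], record))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# "Use model": classify a previously unseen record.
test_record = (3, 3, 0)
print(classify(test_record, training_data))
```

Here the "model" is simply the stored training data, so nothing is learned in advance. A decision tree or rule induction system would instead build an explicit model first.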
16
Predictive Tasks

A predictive model is similar in nature to the
classification model except that the value being
predicted is numeric
Example application: predicting the life expectancy
of a cancer patient from characteristics of the tumour
Technologies used
Decision Trees
Memory Based Reasoning
Rule Induction

17
Prediction
[Diagram: Training Data is used to build the Predictive Model; the model is then used on Test Data to produce a Prediction, here 45 Months]
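
The same build-and-use cycle applies when the predicted value is numeric. The Python sketch below again assumes that memory based reasoning means nearest neighbours, this time averaging the known outcomes of the k most similar records. The tumour features (size in mm, grade) and the survival figures are invented for illustration only.

```python
import math

# Hypothetical training data: ((tumour size in mm, grade), survival in months).
training_data = [
    ((10, 1), 60.0),
    ((15, 1), 54.0),
    ((20, 2), 45.0),
    ((25, 2), 40.0),
    ((40, 3), 18.0),
    ((45, 3), 12.0),
]


def predict(record, training, k=2):
    """Predict a numeric value as the mean target of the k nearest training records."""
    # Features are used unscaled here; a real system would normalise them first.
    nearest = sorted(training, key=lambda item: math.dist(item[0], record))[:k]
    return sum(target for _, target in nearest) / k


print(predict((22, 2), training_data), "months")
```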
18
Association Rule Discovery

Rules that define relations between attribute
values
If Headache and Temperature then Sore Throat,
support 35% and confidence 75%
75% of records in which the patient has a
Headache and a Temperature also have a Sore
Throat, and these patients constitute 35% of all
patients presented during the period analysed.
Technology used
Set Oriented Methods
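
Support and confidence, as described for the rule above, can be computed directly from a set of records. The following Python sketch assumes each patient record is a set of observed symptoms. The eight records are invented and do not reproduce the figures quoted on the slide.

```python
def support_and_confidence(records, antecedent, consequent):
    """Return (support, confidence) for the rule: antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Records that satisfy the "if" part of the rule.
    n_antecedent = sum(1 for r in records if antecedent <= r)
    # Records that satisfy both the "if" and the "then" part.
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_both / len(records)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical patient records, one set of symptoms per patient.
records = [
    {"Headache", "Temperature", "Sore Throat"},
    {"Headache", "Temperature", "Sore Throat"},
    {"Headache", "Temperature", "Sore Throat"},
    {"Headache", "Temperature"},
    {"Headache"},
    {"Temperature", "Cough"},
    {"Cough"},
    {"Sore Throat"},
]

support, confidence = support_and_confidence(
    records, {"Headache", "Temperature"}, {"Sore Throat"}
)
print(f"support={support:.3f} confidence={confidence:.2f}")
```

A rule discovery algorithm would search over many candidate rules and keep those whose support and confidence exceed chosen thresholds. This sketch only scores one given rule.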

19
Sequence Rule Discovery

Generalisations of association rules
Discovered rules take into account the temporal
nature of data
If a diabetic patient presents with early stage
retinopathy at an age under 18 then within 5
years renal failure will occur, support 30% and
confidence 65%.
Technology used
Set Oriented Methods
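
A sequence rule can be scored in the same way once each record carries time information. The Python sketch below assumes each patient history is a list of (age in years, event) pairs and scores the rule "if `first` occurs, then `then` occurs within a given number of years". For brevity it leaves out the extra condition on the patient's age at first presentation. All histories are invented.

```python
def sequence_rule_stats(histories, first, then, within_years):
    """Return (support, confidence) for: `first` is followed by `then` within the window."""
    n_first = 0      # patients in whom `first` occurs at all
    n_sequence = 0   # patients in whom `then` follows within the window
    for events in histories:
        first_times = [t for t, name in events if name == first]
        if not first_times:
            continue
        n_first += 1
        start = min(first_times)
        if any(
            name == then and start < t <= start + within_years
            for t, name in events
        ):
            n_sequence += 1
    support = n_sequence / len(histories)
    confidence = n_sequence / n_first if n_first else 0.0
    return support, confidence


# Hypothetical patient histories: (age in years, event).
histories = [
    [(16, "retinopathy"), (20, "renal failure")],
    [(15, "retinopathy"), (19, "renal failure")],
    [(17, "retinopathy"), (30, "renal failure")],  # outside the 5-year window
    [(14, "retinopathy")],
    [(40, "renal failure")],
]

support, confidence = sequence_rule_stats(
    histories, "retinopathy", "renal failure", within_years=5
)
print(f"support={support:.2f} confidence={confidence:.2f}")
```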

20
Clustering (Segmentation)

The aim of cluster detection is to discover
regularities in data based on similarity
These algorithms discover sub-groups (clusters)
of data whose members are more similar to one
another (small intra-cluster distance) than to
data in other clusters (large inter-cluster
distance).
Technologies used
Bayesian Techniques
Statistical Techniques
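
The intra-cluster and inter-cluster distances mentioned above can be made concrete with a short calculation. The Python sketch below does not perform clustering. It takes a given assignment of points to clusters and compares the mean distance within clusters to the mean distance between them, using Euclidean distance as an assumed similarity measure. The points are invented.

```python
import math
from itertools import combinations


def mean_intra_cluster_distance(clusters):
    """Mean distance between pairs of points that share a cluster."""
    distances = [
        math.dist(a, b)
        for points in clusters.values()
        for a, b in combinations(points, 2)
    ]
    return sum(distances) / len(distances)


def mean_inter_cluster_distance(clusters):
    """Mean distance between pairs of points drawn from different clusters."""
    distances = [
        math.dist(a, b)
        for (_, points_1), (_, points_2) in combinations(clusters.items(), 2)
        for a in points_1
        for b in points_2
    ]
    return sum(distances) / len(distances)


# Hypothetical two-dimensional data already assigned to clusters.
clusters = {
    "cluster 1": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    "cluster 2": [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)],
}

intra = mean_intra_cluster_distance(clusters)
inter = mean_inter_cluster_distance(clusters)
print(f"intra={intra:.2f} inter={inter:.2f}")
```

A good clustering, in the sense used on the slide, is one where the first figure is small relative to the second.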

21
Clustering Example
Cluster 1: Males Over 40
Cluster 2: Females Over 45
Cluster 3: Males Under 40
Cluster 4: Females Under 45
Clustering of patients with a particular illness
reveals 4 patient clusters. It is seen that each
group responds best to a different drug.
22
Which Paradigm to Use?

Mining Task: Given the data mining task at hand,
the choice of paradigm is restricted to those
that can solve the task
Transparency: Some paradigms generate knowledge in
more understandable representations than others
Decision trees and memory based reasoning
generate knowledge in more intuitive
representations than paradigms like neural
networks

23
Which Paradigm to Use?

Data: A number of aspects of data can affect the
effectiveness of the paradigm being used
Data Types: Some paradigms, such as set-oriented
methods, can only handle categorical data
Dimensionality: Some paradigms are more adept at
handling a large number of attributes in the data
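
The first two selection criteria (mining task and data type) can be expressed as a simple lookup and filter. The Python sketch below takes its task-to-paradigm table from the "Technologies used" lists on the earlier slides and its one data-type restriction from the statement about set-oriented methods. The function name and the boolean flag are hypothetical, and transparency and dimensionality are not modelled.

```python
# Task -> paradigms, as listed under "Technologies used" on the earlier slides.
PARADIGMS_BY_TASK = {
    "classification": ["decision trees", "memory based reasoning", "rule induction"],
    "prediction": ["decision trees", "memory based reasoning", "rule induction"],
    "association rule discovery": ["set oriented methods"],
    "sequence rule discovery": ["set oriented methods"],
    "clustering": ["bayesian techniques", "statistical techniques"],
}

# The only data-type restriction stated in the slides.
CATEGORICAL_ONLY = {"set oriented methods"}


def candidate_paradigms(task, data_is_categorical):
    """Return the paradigms that can solve `task` and can handle the data type."""
    candidates = PARADIGMS_BY_TASK[task]
    if data_is_categorical:
        return list(candidates)
    return [p for p in candidates if p not in CATEGORICAL_ONLY]


print(candidate_paradigms("classification", data_is_categorical=False))
print(candidate_paradigms("association rule discovery", data_is_categorical=False))
```

An empty result for a task signals that the data would first need to be converted, for example by binning numeric attributes into categories.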

24
Issues