Title: Lessons Learned from Applications of Machine Learning
Lessons Learned from Applications of Machine Learning
- Robert C. Holte
- University of Alberta
Source Material
- Personal involvement in a commercial project to use ML to detect oil spills in satellite images
- Other people's papers on specific applications
- Other people's lessons learned
  - e.g. discussions with Foster Provost
Lesson 1: ML works
- Numerous examples of machine learning being successfully applied in science and industry
  - saving time or money
  - doing something that would not have been possible otherwise
  - sometimes superior to human performance
Corollary: it would be beneficial to have an on-line repository of success stories
Example: D. Michie
- American Express (UK)
- Loan applications automatically categorized by a statistical method as
  - definitely accept
  - definitely reject
  - refer to a human expert
- Human experts: 50% accurate at predicting loan defaults
- Learned rules: 70% accurate
Oil Spill Project: the task
- In a continuous stream of satellite radar images,
  - identify the images that are likely to contain one or more oil slicks,
  - highlight the suspected region(s), and
  - forward the selected, annotated images to a human expert for a final decision and action.
- MacDonald Dettwiler Associates (MDA)
(Figure: oil slick in a satellite radar image)
Oil Spill Project: team
- MDA: satellite image processing experts
- Canada Centre for Remote Sensing: human expert in recognizing oil slicks in radar images
  - attempts to build a classifier by hand failed
- me, Stan Matwin, Miroslav Kubat
- 1995-97
- see Machine Learning, vol. 30, February 1998
Lesson 2: Research Spinoffs
- Many new, general research issues arose during the oil spill project but could not be properly investigated within the scope of the project.
- A great deal of follow-on research is needed.
Corollary: when you write up an application, look for general techniques, issues, and phenomena
Research Issues (1)
- hand-picked data (purchased)
  - not a representative sample
- small data sets (9 images, 937 dark regions)
  - risk of overtuning
- task formulation
  - classifying images, regions, or pixels?
  - subcategories of non-slicks?
Research Issues (2)
- imbalanced data sets (41 oil slicks, 896 non-slicks)
  - accuracy is an inappropriate performance measure
  - standard learners optimize accuracy and tend to classify everything as not an oil slick
- data is in distinct batches
  - leave-one-batch-out (LOBO) testing method (see the sketch below)
  - how to learn from batched data?
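
A minimal sketch of the LOBO idea using scikit-learn's LeaveOneGroupOut; the feature matrix, labels, and per-image group ids below are synthetic stand-ins for illustration, not the project's data:

```python
# Leave-one-batch-out (LOBO) evaluation sketch: hold out all regions from
# one image at a time, train on the rest, and report per-class recall
# rather than accuracy because the classes are heavily imbalanced.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)
X = rng.normal(size=(937, 5))             # one row per dark region (synthetic)
y = rng.binomial(1, 41 / 937, size=937)   # 1 = oil slick (rare class)
groups = rng.integers(0, 9, size=937)     # which of the 9 images each region came from

logo = LeaveOneGroupOut()
for train_idx, test_idx in logo.split(X, y, groups):
    clf = DecisionTreeClassifier().fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    print("held-out image:", groups[test_idx][0],
          "slick recall:", recall_score(y[test_idx], pred, zero_division=0))
```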
Research Issues (3)
- feature engineering
- image processing parameter settings affect learning in two ways:
  - which regions are extracted from the image
  - the features of each region that are calculated and then fed into the learning algorithm
- the best settings for one were not the best for the other
Good Classification, Poor Region
(Figure: oil slick example, correctly classified but with a poorly extracted region)
Lesson 3: Need Version Control
- Over the course of the project we had a vast variety of data sets (see the sketch below):
  - images from three different types of satellite
  - a growing set of images for each type
  - a different data set for every setting of the image processing parameters
- and many variations on the learning algorithms, experimental method, etc.
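
One lightweight way to keep such experiments traceable is to write a small manifest per run recording the data set, image-processing parameters, learner, and results. This is a hypothetical sketch, not something from the project; the function and field names are made up for illustration:

```python
# Per-experiment record keeping: one JSON manifest per run so any result can
# be traced back to the exact data version, parameters, and learner.
import json, hashlib, time
from pathlib import Path

def log_run(dataset_path: str, image_params: dict, learner: str, results: dict,
            out_dir: str = "runs") -> Path:
    data_hash = hashlib.sha256(Path(dataset_path).read_bytes()).hexdigest()[:12]
    manifest = {
        "timestamp": time.strftime("%Y%m%d-%H%M%S"),
        "dataset": dataset_path,
        "dataset_sha256": data_hash,      # identifies the exact data version
        "image_processing": image_params, # parameter settings used to extract regions
        "learner": learner,
        "results": results,
    }
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    path = out / f"{manifest['timestamp']}_{data_hash}.json"
    path.write_text(json.dumps(manifest, indent=2))
    return path
```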
Lesson 4: Understand the Deployment Context
- What is the task? classification, filtering, control, diagnosis
- non-uniform misclassification costs (see the sketch below)
  - costs vary with user and time, and are not known during learning
- some tasks require explanations in addition to classifications, or classifiers that can be understood by domain experts
Corollary: your experiments and performance measure should reflect how the system will be used and judged after deployment
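
A minimal sketch of scoring a classifier by expected misclassification cost instead of accuracy; the cost matrix values below are assumptions for illustration, not the project's actual costs:

```python
# Cost-sensitive evaluation sketch: a missed slick is assumed to be far more
# expensive than a false alarm, so "classify everything as non-slick" scores
# well on accuracy but badly on expected cost.
import numpy as np

# cost[true_class][predicted_class]; class 0 = non-slick, 1 = oil slick.
cost = np.array([[0.0, 1.0],     # false alarm costs 1 (assumed)
                 [20.0, 0.0]])   # missed slick costs 20 (assumed)

def expected_cost(y_true, y_pred):
    return float(np.mean(cost[np.asarray(y_true), np.asarray(y_pred)]))

y_true = [0, 0, 0, 0, 1]
print(expected_cost(y_true, [0, 0, 0, 0, 0]))  # 4.0: one missed slick
print(expected_cost(y_true, [1, 0, 0, 0, 1]))  # 0.2: one false alarm
```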
Example: Evans & Fisher
- Printing press banding problem
- ML built a decision tree to predict whether banding would occur. Some features were exogenous (e.g. humidity), others were controllable (e.g. ink viscosity).
- In practice, the tree was used to set the controllable variables given the values of the exogenous ones.
- But different variables were under the control of different craftsmen, who would not necessarily co-operate with each other.
Lesson 5: Expect Skepticism
- It will be very hard to convince a decision-maker to actually deploy something new.
- It will help if the learned system is in a form that the decision-maker is familiar with or can easily comprehend, and is consistent with all available background knowledge.
Counterexample: Evans & Fisher
- One of the learned rules flatly contradicted the advice of an expert consultant, and the latter was more intuitive.
- Upon further analysis by the local engineers, the learned rule was adopted.
Lesson 6: Exploit Human Experts
- Capture as much expertise as you can
- Involve the expert in the induction process
  - e.g. interactive induction (Evans & Fisher, PROTOS)
  - e.g. Structured Induction (Alen Shapiro)
Lesson 7: Start Simple
- 1R, Naïve Bayes, Perceptron, 1-NN (see the sketch below)
  - often work surprisingly well
  - provide a performance baseline
  - successes and failures inform you about your data
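
A minimal sketch of running such baselines with scikit-learn on synthetic data; 1R has no stock scikit-learn implementation, so only Naive Bayes, a perceptron, and 1-NN appear here:

```python
# Simple-baseline sketch: run cheap, standard learners first to get a
# performance floor before trying anything elaborate.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import Perceptron
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                               # synthetic features
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)  # synthetic labels

baselines = {
    "Naive Bayes": GaussianNB(),
    "Perceptron": Perceptron(),
    "1-NN": KNeighborsClassifier(n_neighbors=1),
}
for name, clf in baselines.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name:12s} mean CV accuracy: {scores.mean():.3f}")
```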
Lesson 8: Visualize
- Visualize your data
  - e.g. project onto 1 or 2 dimensions
- Visualize your classifier's performance (see the sketch below)
  - e.g. with ROC or cost curves
  - e.g. in instance space (which examples are problematic?)
  - e.g. systematic error
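
A minimal sketch of one such visualization, an ROC curve for a classifier's scores, using scikit-learn and matplotlib on synthetic data:

```python
# ROC curve sketch: plot true-positive rate against false-positive rate
# across score thresholds, with the area under the curve as a summary.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + rng.normal(scale=0.8, size=500) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = LogisticRegression().fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
fpr, tpr, _ = roc_curve(y_te, scores)

plt.plot(fpr, tpr, label=f"AUC = {auc(fpr, tpr):.2f}")
plt.plot([0, 1], [0, 1], "--", color="grey")  # chance line
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```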
Lessons
- ML works
- Applications spin off research issues
- Need version control for experiments
- Understand the deployment context
- Expect skepticism from decision-makers
- Exploit human experts
- Start simple
- Visualize your data and your classifier