Application of Metamorphic Testing to Supervised Classifiers

Provided by: GailK9

Transcript and Presenter's Notes

Title: Application of Metamorphic Testing to Supervised Classifiers

1
Application of Metamorphic Testing to
Supervised Classifiers
Xiaoyuan Xie, Tsong Yueh Chen (Swinburne
University of Technology)
Christian Murphy, Gail Kaiser (Columbia University)
Joshua Ho (University of Sydney)
Baowen Xu (Nanjing University)
2
Background

Many applications in the field of scientific
computing depend on machine learning (ML)
algorithms
ML applications often do not have test oracles
that indicate whether the output is correct for
arbitrary input
Applications without test oracles are called
non-testable programs

3
Problem Statement

Oracles may exist for a limited subset of the
input domain, and gross errors (e.g. crashes) can
be detected with certain inputs or techniques
However, it is difficult to detect subtle
(computational) errors for arbitrary inputs

4
Testing ML Applications

There has been much research into applying ML
techniques to software testing, but not the other
way around
Reusable real-world data sets and frameworks are
available for checking that an ML algorithm
predicts well, but not for checking that an
implementation works correctly

5
Observation

If there is no oracle in the general case, we
cannot know the expected relationship between a
particular input and its output
However, it may be possible to know relationships
between a set of inputs and the corresponding set
of outputs
Metamorphic Testing [Chen et al. '98] is such
an approach

6
Metamorphic Testing

An approach for creating follow-on test cases
based on previous test cases
If input x produces output f(x), then the
function's metamorphic properties are used to
guide a transformation function t, which is
applied to produce a new test case input, t(x)
We can then predict the expected value of f(t(x))
based on the value of f(x) obtained from the
actual execution
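
As a minimal sketch of this idea (not taken from the slides), take f to be the sine function. Its property sin(π − x) = sin(x) suggests the transformation t(x) = π − x, so the value of f(t(x)) can be predicted from the observed f(x). The names `f`, `t`, and `relation_holds` are illustrative only.

```python
import math


def f(x: float) -> float:
    """Program under test; here simply the library sine function."""
    return math.sin(x)


def t(x: float) -> float:
    """Transformation guided by the property sin(pi - x) == sin(x)."""
    return math.pi - x


def relation_holds(x: float, tol: float = 1e-9) -> bool:
    """Compare f(t(x)) with the value predicted from the observed f(x)."""
    observed = f(x)        # output of the original test case
    follow_up = f(t(x))    # output of the follow-on test case
    return math.isclose(follow_up, observed, abs_tol=tol)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.2, 2.0):
        print(x, relation_holds(x))
```

A tolerance is used because floating-point results of the two executions need not match bit for bit.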

7
Metamorphic Testing without an Oracle

When a test oracle exists, we can know whether
f(t(x)) is correct, because the oracle confirms
that f(x) is correct
So if f(t(x)) is as expected, then it is correct
When there is no test oracle, f(x) acts as a
pseudo-oracle for f(t(x))
If f(t(x)) is as expected, it is not necessarily
correct
However, if f(t(x)) is not as expected, either
f(x) or f(t(x)) (or both) is wrong
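
The two cases can be illustrated with a small sketch, assuming a hypothetical averaging routine and a transformation that reverses the input order. The three `mean_*` functions and `permutation_relation_holds` are invented for illustration; two of them are deliberately faulty.

```python
def mean_correct(xs):
    """Reference behaviour: the arithmetic mean."""
    return sum(xs) / len(xs)


def mean_drops_last(xs):
    """Faulty: ignores the final value, so the result depends on input order."""
    return sum(xs[:-1]) / (len(xs) - 1)


def mean_always_zero(xs):
    """Faulty: wrong for almost every input, yet unaffected by input order."""
    return 0.0


def permutation_relation_holds(f, xs):
    """Use f(xs) as the pseudo-oracle for f(t(xs)), where t reverses the list."""
    return abs(f(xs) - f(list(reversed(xs)))) < 1e-9


scores = [1, 2, 3, 10]

if __name__ == "__main__":
    for impl in (mean_correct, mean_drops_last, mean_always_zero):
        print(impl.__name__, permutation_relation_holds(impl, scores))
```

Here `mean_drops_last` violates the relation, so a defect is revealed, while `mean_always_zero` satisfies it despite being wrong: a satisfied relation does not establish correctness.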

8
Metamorphic Testing Example

Consider a program that reads a text file of test
scores for students in a class, and computes the
averages and the standard deviation of the
averages
If we permute the values in the text file, the
results should stay the same
If we multiply each score by 10, the final
results should all be multiplied by 10 as well
These metamorphic properties can be used to
create a pseudo-oracle for the application
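
The two properties above can be checked with a short sketch. It assumes the file has already been parsed into one list of scores per student and that the population standard deviation is intended; the slides specify neither, and the function names are hypothetical.

```python
import math
import statistics


def summarize(rows):
    """Return (per-student averages, standard deviation of those averages)."""
    averages = [statistics.mean(scores) for scores in rows]
    return averages, statistics.pstdev(averages)


def check_permutation(rows):
    """Reordering the students should not change the averages or the deviation."""
    avgs, sd = summarize(rows)
    avgs_p, sd_p = summarize(list(reversed(rows)))
    return sorted(avgs) == sorted(avgs_p) and math.isclose(sd, sd_p)


def check_scaling(rows, factor=10):
    """Multiplying every score by `factor` should scale all results by `factor`."""
    avgs, sd = summarize(rows)
    scaled_rows = [[s * factor for s in scores] for scores in rows]
    avgs_s, sd_s = summarize(scaled_rows)
    averages_scaled = all(math.isclose(a * factor, b) for a, b in zip(avgs, avgs_s))
    return averages_scaled and math.isclose(sd * factor, sd_s)


rows = [[70, 80, 90], [55, 65, 60], [88, 92, 96]]

if __name__ == "__main__":
    print(check_permutation(rows), check_scaling(rows))
```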

9
Approach

To apply Metamorphic Testing to such ML
applications, we first enumerate the metamorphic
relations based on the expected behaviors of a
given machine learning algorithm
We then utilize these relations to conduct
metamorphic testing on the implementation

10
Verification & Validation

The scope of which metamorphic properties are
necessary may differ between various problems in
the domain
Properties that are necessary can be used for
verification: Is the implementation of the
algorithm correct?
Other properties can be used for validation: Is
the algorithm appropriate for solving this
problem?

11
Research Questions

What are the metamorphic properties of supervised
ML classification algorithms?
Which can be used for verification?
Which can be used for validation?
Can metamorphic testing detect defects in
real-world ML applications?

12
Machine Learning Fundamentals

Data sets consist of a number of samples, each of
which has attributes and a label
In the first phase (training), a model is
generated that attempts to generalize how
attributes relate to the label
In the second phase, the model is applied to a
previously-unseen data set (testing data) with
unknown labels to produce a classification of
each sample

13
Algorithms Investigated

k-Nearest Neighbors (kNN)
Samples in the testing data are classified by
using Euclidean distance to find the k nearest
samples in the training data
Classification is then done by majority rule
Naïve Bayes Classifier (NBC)
For a given sample in the testing data, computes
the probability of that sample belonging to each
class, assuming conditional independence between
the attributes
Chooses the class that is most likely
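
The kNN description on this slide can be sketched in a few lines. This is an illustrative version, not the Weka implementation: the data layout (a list of `(attributes, label)` pairs), the default `k=3`, and the tie-handling (earliest training sample wins) are all assumptions.

```python
from collections import Counter


def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    train: list of (attribute_tuple, label) pairs
    query: attribute tuple with the same number of attributes
    """
    def sq_dist(a, b):
        # Squared Euclidean distance ranks neighbours the same as the distance itself.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # sorted() is stable, so equally distant samples keep their training order.
    nearest = sorted(train, key=lambda sample: sq_dist(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


train = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((8, 9), "b"), ((9, 8), "b"),
]

if __name__ == "__main__":
    print(knn_classify(train, (2, 2)), knn_classify(train, (7, 7)))
```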

14
Metamorphic Relations

We identified 11 properties that we would expect
all classification algorithms to have
Affine transformation of attributes
Permutation of labels or attributes
Addition of informative or uninformative
attributes
Addition of classes by duplicating or re-labeling
samples
Removal of classes or samples
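
Two of these relations can be sketched against a toy kNN classifier. The slides only name the relations, so the details here are assumptions: the affine transformation is applied uniformly to every attribute of both training and testing data with a non-zero scale, and integer data is used so that distance comparisons are exact. The classifier is a stand-in, not Weka's.

```python
from collections import Counter


def knn_classify(train, query, k=3):
    """Toy kNN: majority vote among the k nearest samples (squared distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda sample: sq_dist(sample[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def affine(values, scale, shift):
    """Apply v -> scale * v + shift to every attribute (scale must be non-zero)."""
    return tuple(scale * v + shift for v in values)


def permute(values, order):
    """Reorder the attributes (columns) according to `order`."""
    return tuple(values[i] for i in order)


def relation_holds(train, query, transform, k=3):
    """The predicted label should not change when the same transformation
    is applied to the training samples and to the query."""
    original = knn_classify(train, query, k)
    t_train = [(transform(attrs), label) for attrs, label in train]
    return knn_classify(t_train, transform(query), k) == original


train = [
    ((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
    ((8, 8), "b"), ((8, 9), "b"), ((9, 8), "b"),
]
queries = [(2, 2), (7, 7), (4, 6)]

if __name__ == "__main__":
    for q in queries:
        print(q,
              relation_holds(train, q, lambda v: affine(v, 3, 7)),
              relation_holds(train, q, lambda v: permute(v, (1, 0))))
```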

15
Experimental Setup

Applied the approach to implementations in the
Weka 3.5.7 toolkit
Initial test cases
Randomly generated values
Four attributes (columns)
20-50 samples (rows)
Metamorphic relations were applied to create
20-300 follow-on test cases
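
A setup of this shape could be generated as follows. This is only a sketch: the integer value range, the two labels "A" and "B", the fixed seed, and the choice of attribute permutation as the relation used to derive follow-on cases are assumptions, and the output is a plain Python structure rather than a Weka input file.

```python
import itertools
import random


def make_initial_case(rng, n_attributes=4, min_rows=20, max_rows=50):
    """Random data set: 20-50 samples, four attributes each, plus a label."""
    n_rows = rng.randint(min_rows, max_rows)
    return [
        (tuple(rng.randint(0, 100) for _ in range(n_attributes)),
         rng.choice(("A", "B")))
        for _ in range(n_rows)
    ]


def attribute_permutation_follow_ups(samples):
    """One follow-on data set per non-identity reordering of the attributes."""
    n_attributes = len(samples[0][0])
    identity = tuple(range(n_attributes))
    follow_ups = []
    for order in itertools.permutations(range(n_attributes)):
        if order == identity:
            continue
        follow_ups.append(
            [(tuple(attrs[i] for i in order), label) for attrs, label in samples]
        )
    return follow_ups


initial = make_initial_case(random.Random(42))
follow_ups = attribute_permutation_follow_ups(initial)

if __name__ == "__main__":
    print(len(initial), "samples;", len(follow_ups), "follow-on data sets")
```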

16
Results
k-Nearest Neighbors
Naïve Bayes Classifier
17
Analysis: kNN

No necessary properties were violated
Issues related to validation
Labels that are non-existent in the training data
have a non-zero chance of being selected in
classification
If two labels are equally likely, the first one
that is listed is chosen

18
Analysis: Naïve Bayes

Four necessary properties were violated,
indicating defects in the implementation
Loss of precision related to use of the double
datatype in Java
Laplace Accuracy is used to determine
probabilities; thus, labels that did not appear
in training data have non-zero probability
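
The effect on labels can be seen in a small sketch, assuming that "Laplace Accuracy" here corresponds to the usual add-one (Laplace) estimate applied to class counts. The function `class_priors` and its parameters are hypothetical and do not reproduce Weka's code.

```python
from collections import Counter
from fractions import Fraction


def class_priors(train_labels, all_labels, smoothing=1):
    """Estimate P(label) with add-`smoothing` counts over the declared label set."""
    counts = Counter(train_labels)
    total = len(train_labels) + smoothing * len(all_labels)
    return {label: Fraction(counts[label] + smoothing, total) for label in all_labels}


train_labels = ["a", "a", "a", "b", "b"]
all_labels = ["a", "b", "c"]   # "c" is declared but never appears in training

if __name__ == "__main__":
    print(class_priors(train_labels, all_labels))               # smoothed
    print(class_priors(train_labels, all_labels, smoothing=0))  # unsmoothed
```

With smoothing, the unseen label "c" receives probability 1/8; without it, the probability is 0.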

19
Suggestions

We suggest using the BigDecimal class instead
of the double datatype
Laplace Accuracy is appropriate for the
attributes but not for the labels
Use of Laplace Accuracy should be set as an
option
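
The precision issue can be illustrated with a sketch in which Python's `decimal.Decimal` stands in for Java's `BigDecimal` and Python's `float` for Java's `double`. It shows the general kind of order-dependence that a permutation relation can expose; it is not a reproduction of the defect found in Weka, and `accumulate` is a hypothetical helper.

```python
from decimal import Decimal


def accumulate(values, convert):
    """Add the values left to right after converting each with `convert`."""
    total = convert("0")
    for v in values:
        total += convert(v)
    return total


values = ["0.1", "0.2", "0.3"]
reordered = list(reversed(values))

# Binary floating point: the result depends on the order of the additions.
float_forward = accumulate(values, float)
float_reverse = accumulate(reordered, float)

# Decimal arithmetic: both orders give exactly the same result.
dec_forward = accumulate(values, Decimal)
dec_reverse = accumulate(reordered, Decimal)

if __name__ == "__main__":
    print(float_forward == float_reverse)
    print(dec_forward == dec_reverse)
```

The trade-off is speed: exact decimal arithmetic is considerably slower than hardware floating point, which is one reason to consider a tolerance-based comparison as an alternative.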

20
Future Work

Apply the testing approach to other domains that
depend on ML, such as scientific computing
Further investigation of testing non-testable
programs
Measure the effectiveness of the approach in
empirical studies

21
Summary

Metamorphic testing is easy to implement and
automate
We were able to devise fault-revealing properties
even with just a basic understanding of the ML
algorithms
Metamorphic testing can be used for both
verification and validation

22
Application of Metamorphic Testing to
Supervised Classifiers
Xiaoyuan Xie, Tsong Yueh Chen (Swinburne
University of Technology)
Christian Murphy, Gail Kaiser (Columbia University)
Joshua Ho (University of Sydney)
Baowen Xu (Nanjing University)
23
Related Work