Random Sets Approach and its Applications

1
Random Sets Approach and its Applications
Vladimir Nikulin, Suncorp, Australia
  • Introduction: input data, objectives and main
    assumptions.
  • Basic iterative feature selection and its
    modifications.
  • Random sets approach.
  • Tests for independence and trimming (similar to
    the HITON algorithm).
  • Experimental results with some comments.
  • Concluding remarks.

2
Introduction
The training data consist of pairs (x, y), where y
is a binary label and x is a vector
of m features.
In practical situations the label y may be
hidden, and the task is to estimate it using the
vector of features x. The area under the receiver
operating characteristic curve (AUC) is used as the
evaluation and optimisation criterion.
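As a minimal sketch of this criterion, AUC can be computed as the fraction of (positive, negative) pairs in which the positive example receives the higher score, counting ties as one half. The function name `auc` and the 0/1 label encoding are assumptions made for illustration; they do not come from the slides.

```python
def auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of AUC; labels are assumed to be 0/1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count as one half
    return wins / (len(pos) * len(neg))
```

The double loop is quadratic in the number of examples, which is fine for illustration; a rank-based formula would be preferable for large data sets.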
3
Causal relations
Manipulations are actions or experiments
performed by an external agent on a system,
whose effect disrupts the natural functioning of
the system. By definition, direct features
cannot be manipulated.
[Figure: diagram of causal relations among the features X1, X2, X3, X4, X6, X7, X8, X9 and the target Y]
Main assumption: direct features have a stronger
influence on the target variable and are therefore
more likely to be selected by
FS-algorithms.
4
Basic iterative FS-algorithm
5
BIFS: behaviour of the target function
[Figure panels: CINA, LUCAP, MARTI, REGED]
6
RS-algorithm
7
RS(10000, 40), MARTI case
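The transcript does not spell out the RS-algorithm, so the following is only a hypothetical reading of the notation RS(10000, 40) as 10000 random subsets of 40 features each. In this sketch every subset is scored with a stand-in base model (the sum of the selected features, evaluated by AUC), and features are ranked by how often they occur in the best-scoring subsets. The names `random_sets_ranking` and `top_fraction`, the stand-in model and the ranking rule are all assumptions, not the author's method.

```python
import random
from collections import Counter

def auc(labels, scores):
    """Pairwise AUC estimate; labels are assumed to be 0/1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def random_sets_ranking(X, y, n_sets, set_size, top_fraction=0.1, seed=0):
    """Rank features by their frequency in the best random subsets (hypothetical)."""
    rng = random.Random(seed)
    m = len(X[0])
    results = []
    for _ in range(n_sets):
        subset = rng.sample(range(m), set_size)
        # Stand-in base model (assumption): score = sum of the selected features.
        scores = [sum(row[j] for j in subset) for row in X]
        results.append((auc(y, scores), subset))
    # Keep the best-scoring subsets and count how often each feature occurs.
    results.sort(key=lambda r: r[0], reverse=True)
    keep = max(1, int(top_fraction * n_sets))
    counts = Counter(j for _, subset in results[:keep] for j in subset)
    return [j for j, _ in counts.most_common()]
```

Under this reading, a call such as `random_sets_ranking(X, y, n_sets=10000, set_size=40)` would correspond to RS(10000, 40); in practice the stand-in scorer would be replaced by a real base model.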
8
Test for independence (or trimming)
9
Base models and software
10
Final results (first 4 lines)
11
Some particular results
12
Behaviour of linear filtering coefficients,
MARTI-set
13
CINA-set: AdaBoost, plot of one solution against
another
14
SIDO, RF(1000, 70, 10)
15
Some comments
In practical applications we deal not with
pure probability distributions but with
mixtures of distributions, which reflect trends
and patterns that change over time. Accordingly, it
appears more natural to form the training set
as an unlabeled mixture of subsets derived from
different (manipulated) distributions, for
example REGED1, REGED2, ..., REGED9. Any pure
distribution can then be selected as the
distribution for the test set. Proper validation is
particularly important when the training
and test sets have different distributions.
In that case it is sensible to apply the
traditional strategy: randomly split the available
test set 50/50, using one part for validation
and the other for testing.
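The 50/50 strategy described here can be sketched as a seeded random split. The function name `split_validation_test` and the `seed` parameter are illustrative assumptions, not part of the original material.

```python
import random

def split_validation_test(items, seed=0):
    """Randomly split items into two halves: (validation, test)."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]
```

With an odd number of items the test half receives the extra element.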
16
Concluding remarks
The random sets approach is heuristic in nature and was
inspired by the growing speed of
computation. It is a general method, and there are
many directions for further development. The performance
of the model depends on the particular data:
we cannot expect one method
to produce good solutions for all
problems. In the Causal Discovery competition,
a more aggressive FS-strategy was probably
needed. Our results
against all unmanipulated and all validation sets
are in line with the top results.