Efficient Computer Experiment-Based Optimization through Variable Selection

Description:

Regression trees (CART) Multiple testing procedure based on FDR (false discovery rate) ... CART, Breiman et al. (1984) Salford Systems (www.salfordsystems.com) ...

Provided by: studen72

1
Efficient Computer Experiment-Based Optimization
through Variable Selection
2
  • Design and Analysis of Computer Experiments
  • Methods: CART, CART/FDR, Inverse FDR
  • Applications
    • Stochastic dynamic programming (SDP) air quality
      problem (Yang 2004)
    • Stochastic programming fleet assignment model
      (SP-FAM) (Pilla 2006)
  • Future work

3
  • Design and Analysis of Computer Experiments
    (DACE) can be used to reduce the computational
    effort.
  • DACE steps:
    • An optimization model is formulated as the
      computer experiment.
    • Design of Experiments (DoE) is used to select
      sample points as input to the optimization
      model.
    • A Multivariate Adaptive Regression Splines
      (MARS) model is fit to the resulting
      input-output data.
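
The first two DACE steps can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say which experimental design is used, so Latin hypercube sampling is swapped in as one common choice, and `toy_model` is a hypothetical stand-in for the expensive optimization model. The MARS fit of the third step is not shown.

```python
import random


def latin_hypercube(n_points, n_vars, seed=0):
    """Return n_points samples in [0, 1)^n_vars, one per stratum in each dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_vars):
        # One random point inside each of the n_points equal-width strata.
        col = [(i + rng.random()) / n_points for i in range(n_points)]
        rng.shuffle(col)  # decouple the strata across dimensions
        columns.append(col)
    # Transpose the columns into rows, one row per design point.
    return [list(row) for row in zip(*columns)]


def toy_model(x):
    """Hypothetical stand-in for one run of the optimization model."""
    return sum((xi - 0.5) ** 2 for xi in x)


# Step 2: choose the sample points; then run the (stand-in) model on each.
design = latin_hypercube(n_points=10, n_vars=3)
responses = [toy_model(x) for x in design]
```

The pairs in `design` and `responses` are the data to which a surrogate such as MARS would then be fit.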

4
  • Large-scale optimization problems
    • Can include thousands of variables.
    • Can be computationally expensive.
    • The number of variables can exceed the number
      of runs, i.e., p > n.
  • Variable selection methods
    • Regression trees (CART)
    • Multiple testing procedure based on the false
      discovery rate (FDR)
    • CART/FDR, 2-sample/FDR
    • Inverse FDR (InvFDR)

5
  • Data mining (Berry and Linoff 2000)
    • A process of exploratory data analysis.
    • To discover meaningful patterns and rules.
  • Variable selection problem
    • Model a response variable of interest.
    • Select important explanatory variables.

6
(No Transcript)
7
(Possible outcomes from m simultaneous hypothesis
tests)
False Discovery Rate (FDR; Benjamini and
Hochberg, 1995): the expected proportion of
false positives among rejected hypotheses,
FDR = E[V/R], with V/R taken as 0 when R = 0.
V = number of false rejections; R = number of
rejections.
8
H1 H2 H3 H4 H5 H6 ... H996 H997 H998 H999 H1000
P1 P2 P3 P4 P5 P6 ... P996 P997 P998 P999 P1000
Procedure controlling FDR:
Among the hypotheses declared significant, the
expected proportion of false rejections is kept
at or below the chosen FDR level.
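
One standard FDR-controlling procedure is the Benjamini-Hochberg step-up rule from the 1995 paper cited on the previous slide. The sketch below is a minimal illustration, not necessarily the exact procedure used in the presentation. The function name, the symbol `q` for the FDR level, and the example p-values are assumptions made here.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the sorted indices of hypotheses rejected at FDR level q."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value is at or
        # below its threshold rank * q / m.
        if p_values[idx] <= rank * q / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    return sorted(order[:k])


# Made-up p-values for ten hypotheses H1..H10 (illustrative only).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p, q=0.05)
print(rejected)  # [0, 1]
```

Because the rule is step-up, a hypothesis can be rejected even if its own p-value misses its threshold, as long as a larger-ranked p-value meets its threshold.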
9
FDR-based Variable Selection with Grouping
  • Assume the response surface is monotonic.
  • Divide the data into C = 2 groups using the median
    (or mean) of y, with the following rule:
    • If y ≥ median (or mean), then group 1.
    • If y < median (or mean), then group 2.
  • Construct a two-sample t statistic for each
    variable.
  • Generate a p-value from each statistic.
  • Conduct an FDR procedure to find the significant
    variables.
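
The grouping steps above can be sketched end to end as follows. Several choices here are assumptions, since the slides do not specify them: the t statistic is the unequal-variance (Welch) form, the p-value uses a normal approximation to the t distribution, the FDR procedure is Benjamini-Hochberg at a level called `q`, and the data are synthetic.

```python
import math
import random
from statistics import NormalDist, mean, median, variance


def welch_t(a, b):
    """Two-sample t statistic with unequal variances (an assumed choice)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def select_variables(X, y, q=0.05):
    """Return the indices of variables found significant at FDR level q."""
    cut = median(y)
    group1 = [row for row, yi in zip(X, y) if yi >= cut]
    group2 = [row for row, yi in zip(X, y) if yi < cut]
    p_values = []
    for j in range(len(X[0])):
        t = welch_t([r[j] for r in group1], [r[j] for r in group2])
        # Two-sided p-value; normal approximation assumed for simplicity.
        p_values.append(2 * (1 - NormalDist().cdf(abs(t))))
    # Benjamini-Hochberg step-up rule on the per-variable p-values.
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k = rank
    return sorted(order[:k])


# Synthetic data: 200 runs, 20 variables, only variables 0 and 5 matter.
rng = random.Random(1)
X = [[rng.random() for _ in range(20)] for _ in range(200)]
y = [3 * row[0] + 2 * row[5] + rng.gauss(0, 0.1) for row in X]
selected = select_variables(X, y)
```

In this toy setting the response is monotonic in variables 0 and 5, which matches the monotonicity assumption on the slide. A variable whose effect is non-monotonic could have similar means in both groups and be missed.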

10
Illustration
Conduct an FDR procedure at the chosen FDR level.
11
FDR-based Variable Selection from Regression
Trees
12
Illustration
Conduct an FDR procedure at the chosen FDR level.
13
Inverse FDR
14
Illustration
15
(No Transcript)
16
  • Two-stage SP problem
  • First-stage expected profit objective function
    approximation
  • Crew compatible allocation (CCA)
  • Decision variables (CCA): p = 1,264.
  • Data set of n = 141 points.
  • Large p and small n.

17
(No Transcript)
18
  • New variable selection methods based on
    principal component analysis (PCA).
  • Information gain (IG) could be an option to
    perform variable selection.
  • Theoretical investigation of Inverse FDR.

19
  • Thank you for your attention!

20
(No Transcript)