An Outlier Detection Methodology with Consideration for an Inefficient Frontier

1
An Outlier Detection Methodology with
Consideration for an Inefficient Frontier
  • By
  • Andy Johnson
  • Leon McGinnis

2
Outline
  • Background and motivation
  • Current two-stage DEA method
  • Proposed changes to two-stage DEA
  • Definitions Relative to an Inefficient Frontier
  • Leave-one-out method / threshold value
  • Iterative Outlier Detection
  • Second Stage Bootstrap
  • Summary

3
Background
  • Outlier detection matters in a non-parametric
    framework because many of these methods do not
    account for measurement error or random
    fluctuations when constructing a frontier
  • Allowing overstated data into the reference set
    can therefore bias not just one efficiency
    estimate but several, because the overstated
    observation is used to construct the frontier

4
Motivation
  • The iDEAs project for warehouse performance
    benchmarking
  • On-line data collection requires more scrutiny
    than data collected and analyzed by a single
    person
  • What could cause a negatively skewed efficiency
    distribution?

5
Motivation
  • The two-stage DEA method was developed to
    investigate the impact of environmental
    characteristics on efficiency
  • A data set of observations using a similar
    technology is identified
  • In the first stage, efficiency estimates are
    calculated
  • In the second stage, the estimates are regressed
    against environmental characteristics
  • In this setting, overstated observations are a
    problem in the first stage, but both overstated
    and understated observations are a problem in the
    second stage

6
Current Two-stage DEA Method when Outlier
Detection is Considered
  • Consider the problem of understanding sources of
    inefficiency
  • Identify outliers relative to the efficient
    frontier and remove them
  • Calculate efficiency estimates
  • Regress efficiency estimates against
    environmental variables
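
The two stages listed above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a single input, a single output, and constant returns to scale, so the first-stage DEA estimate reduces to each observation's output/input ratio divided by the best ratio, and it assumes a single environmental variable fitted by ordinary least squares. The function names and the data are hypothetical.

```python
def first_stage(data):
    """Efficiency of each (input, output) pair relative to the efficient frontier.

    Assumption: one input, one output, constant returns to scale, so the
    frontier is the ray through the observation with the highest ratio.
    """
    best = max(y / x for x, y in data)
    return [(y / x) / best for x, y in data]


def second_stage(efficiency, z):
    """Ordinary least squares of efficiency on one environmental variable z.

    Returns (intercept, slope).
    """
    n = len(z)
    z_bar = sum(z) / n
    e_bar = sum(efficiency) / n
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    sxy = sum((zi - z_bar) * (ei - e_bar) for zi, ei in zip(z, efficiency))
    slope = sxy / sxx
    return e_bar - slope * z_bar, slope


# Made-up data for illustration only: (input, output) pairs and a
# hypothetical environmental variable for each observation.
data = [(2.0, 2.0), (4.0, 3.0), (5.0, 4.0), (10.0, 6.0)]
z = [1.0, 2.0, 3.0, 4.0]

efficiency = first_stage(data)
intercept, slope = second_stage(efficiency, z)
print(efficiency, intercept, slope)
```

An overstated observation would raise `best` and lower every other estimate in the first stage, which is why the outlier step comes before both stages.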

7
An Outlier Detection Methodology with
Consideration for an Inefficient Frontier
[Figure: plot with axes labeled Output and Input]
8
An Outlier Detection Methodology with
Consideration for an Inefficient Frontier
  • A proposed improvement on the current method
    requires an outlier methodology
  • First, identify outliers relative to both the
    efficient and the inefficient frontiers
  • Use a two-stage DEA method where DEA estimates
    are calculated in the first stage and regressed
    against environmental variables in the second
    stage
  • Use bootstrapping in the second stage to deal
    with the problem of correlation among the error
    terms

9
An Outlier Detection Methodology with
Consideration for an Inefficient Frontier
[Figure: plot with axes labeled Output and Input]
10
Definitions Relative to an Inefficient Frontier
  • The production possibility set when the
    inefficient frontier is included
  • Shephard's input inefficient distance function
    can be defined as
  • Shephard's output inefficient distance function
    can be defined as

11
Inefficiency Measures Relative to an Inefficient
Frontier
  • The Single-Output Inefficient Production
    Frontiers and the Measure of Technical Efficiency
  • An input-oriented inefficiency measure relative
    to an inefficient frontier is given by
  • An output-oriented measure of inefficiency
    relative to an inefficient frontier is given by
    the function

12
Definition of an Inefficient Frontier
  • The Multiple-Output Inefficient Production
    Frontiers and the Measure of Technical Efficiency
  • The inefficient frontier with respect to the
    subset X(y) can be found by

13
Linear Program for Calculating Inefficiency
  • The Multiple-Output Inefficient Production
    Frontiers and the Measure of Technical Efficiency
  • The inefficiency estimate calculated from the
    input perspective can be found by solving the
    following linear program

14
Definition of an Inefficient Frontier
  • The Multiple-Output Inefficient Production
    Frontiers and the Measure of Technical Efficiency
  • The inefficient frontier with respect to the
    subset Y(x) can be found by

15
Linear Program for Calculating Inefficiency
  • The Multiple-Output Inefficient Production
    Frontiers and the Measure of Technical Efficiency
  • The inefficiency estimate calculated from the
    output perspective can be found by solving the
    following linear program

16
Outlier Detection
  • As suggested by Simar (2003), an outlier needs to
    be identified by both an input-oriented and an
    output-oriented detection method

17
Leave-One-Out DEA Linear Program
  • The leave-one-out DEA inefficiency estimate is

18
Threshold Value for Identifying Outliers
Input-oriented threshold value
  • The worst observation, or convex combination of
    bad observations, in the reference set (excluding
    the observation under evaluation) can produce the
    same level of output as the given observation
    using half the inputs
Output-oriented threshold value
  • The worst observation, or convex combination of
    bad observations, in the reference set (excluding
    the observation under evaluation) can use the
    same level of input as the given observation and
    produce twice the output
  • This is the reciprocal of the input threshold
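
The leave-one-out comparison and the two threshold values above can be illustrated with a small sketch. It is not the linear program from the slides, which is not reproduced in this transcript. It assumes one input, one output, and constant returns to scale, in which case the inefficient frontier of the remaining observations is simply the ray with the lowest output/input ratio and the leave-one-out program collapses to a ratio comparison. The function names, the inclusive comparison at the threshold, and the data are assumptions.

```python
def leave_one_out_scores(x, y):
    """Return (input_score, output_score) for each observation.

    Assumption: one input, one output, constant returns to scale. Each
    observation is compared with the worst ratio among the *other*
    observations (leave-one-out).
    """
    ratios = [yi / xi for xi, yi in zip(x, y)]
    scores = []
    for i, r in enumerate(ratios):
        worst_other = min(rj for j, rj in enumerate(ratios) if j != i)
        # 0.5 means the worst other observation needs half the inputs
        # to produce the same output.
        input_score = r / worst_other
        # 2.0 means the worst other observation produces twice the
        # output from the same inputs.
        output_score = worst_other / r
        scores.append((input_score, output_score))
    return scores


def flag_outliers(x, y, input_threshold=0.5):
    """Indices flagged by both orientations (threshold treated as inclusive)."""
    output_threshold = 1.0 / input_threshold  # reciprocal of the input threshold
    flagged = []
    for i, (s_in, s_out) in enumerate(leave_one_out_scores(x, y)):
        if s_in <= input_threshold and s_out >= output_threshold:
            flagged.append(i)
    return flagged


# Made-up data for illustration only.
x = [2.0, 4.0, 5.0, 10.0]
y = [2.0, 3.0, 4.0, 1.0]
flagged = flag_outliers(x, y)
print(flagged)
```

Under constant returns to scale with one input and one output the two orientations always agree, so the agreement check only becomes informative in the general multi-input, multi-output case that the slides address with linear programs.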
19
Iterative Outlier Detection
  • Identify outliers based on agreement between both
    the input and output orientation detection method
  • Remove identified outliers
  • Rerun outlier detection method
  • Continue until the number of outliers detected
    falls below some limit or a set number of
    iterations is reached
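
The loop above can be sketched as follows. This is a hedged illustration: the detectors are passed in as functions, and the two toy detectors shown (single input, single output, constant returns to scale, thresholds of 0.5 and 2.0) stand in for the input- and output-oriented leave-one-out programs. All names are hypothetical, and the choice not to remove outliers found on a pass that falls below the limit is an assumption.

```python
def iterative_outlier_detection(data, detect_input, detect_output,
                                min_new_outliers=1, max_iterations=10):
    """Repeatedly remove observations flagged by both detectors.

    Each detector takes a list of observations and returns flagged indices.
    Stops when fewer than min_new_outliers are found or after
    max_iterations passes. Returns (remaining, removed).
    """
    remaining = list(data)
    removed = []
    for _ in range(max_iterations):
        agreed = set(detect_input(remaining)) & set(detect_output(remaining))
        if len(agreed) < min_new_outliers:
            break
        removed.extend(remaining[i] for i in sorted(agreed))
        remaining = [obs for i, obs in enumerate(remaining) if i not in agreed]
    return remaining, removed


def worst_other_ratio(ratios, i):
    """Lowest output/input ratio among all observations except i."""
    return min(r for j, r in enumerate(ratios) if j != i)


def toy_input_detector(obs):
    """Toy stand-in: flag if the worst other needs at most half the inputs."""
    ratios = [y / x for x, y in obs]
    if len(ratios) < 2:
        return []
    return [i for i, r in enumerate(ratios)
            if r / worst_other_ratio(ratios, i) <= 0.5]


def toy_output_detector(obs):
    """Toy stand-in: flag if the worst other produces at least twice the output."""
    ratios = [y / x for x, y in obs]
    if len(ratios) < 2:
        return []
    return [i for i, r in enumerate(ratios)
            if worst_other_ratio(ratios, i) / r >= 2.0]


# Made-up (input, output) pairs for illustration only.
data = [(2.0, 2.0), (4.0, 3.0), (5.0, 4.0), (10.0, 1.0), (10.0, 3.0)]
remaining, removed = iterative_outlier_detection(
    data, toy_input_detector, toy_output_detector)
print(remaining, removed)
```

In this toy data the observation (10, 3) is only flagged on the second pass, after (10, 1) has been removed, which is the masking effect that motivates rerunning the detection.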

20
Bootstrapping method for the second stage of the
two-stage DEA method
  • Necessary because of the correlation among error
    terms in the second stage regression
  • Sample n observations with replacement from the
    set of input/output data; call this set b
  • Calculate efficiency estimates for each of the
    original n observations relative to the set b
  • Repeat these two steps 2000 times
  • Construct confidence intervals and bias estimates
    for each of the original n efficiency estimates
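
The resampling steps above can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' procedure: efficiency is computed for one input and one output under constant returns to scale, the bias estimate is taken as the bootstrap mean minus the original estimate, and the confidence interval is a simple percentile interval. The function names, the seed, and the data are hypothetical.

```python
import random
import statistics


def crs_efficiency(x0, y0, reference):
    """Efficiency of (x0, y0) relative to a reference set.

    Assumption: one input, one output, constant returns to scale.
    Can exceed 1 when (x0, y0) itself is not in the reference set.
    """
    best = max(y / x for x, y in reference)
    return (y0 / x0) / best


def bootstrap_efficiency(data, replications=2000, alpha=0.05, seed=0):
    """Bias estimate, corrected estimate, and percentile interval per observation."""
    rng = random.Random(seed)
    n = len(data)
    original = [crs_efficiency(x, y, data) for x, y in data]
    draws = [[] for _ in data]
    for _ in range(replications):
        # Sample n observations with replacement: the set b.
        b = [data[rng.randrange(n)] for _ in range(n)]
        # Evaluate every original observation relative to b.
        for i, (x, y) in enumerate(data):
            draws[i].append(crs_efficiency(x, y, b))
    results = []
    for estimate, d in zip(original, draws):
        d.sort()
        bias = statistics.fmean(d) - estimate
        lower = d[int((alpha / 2) * replications)]
        upper = d[int((1 - alpha / 2) * replications) - 1]
        results.append({"estimate": estimate, "bias": bias,
                        "corrected": estimate - bias, "ci": (lower, upper)})
    return results


# Made-up (input, output) pairs for illustration only.
data = [(2.0, 2.0), (4.0, 3.0), (5.0, 4.0), (10.0, 6.0)]
results = bootstrap_efficiency(data)
for row in results:
    print(row)
```

Because each resampled set b is drawn from the original data, its best ratio can never exceed the original best ratio, so in this sketch every bootstrap estimate is at least as large as the original one and the estimated bias is non-negative.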

21
Second Stage Bootstrap
  • Bootstrapping method for the second stage of the
    two-stage DEA method
  • Because bias is present, the corrected efficiency
    estimate is
  • Confidence interval estimates

22
Banker and Morey Data
  • When outliers are determined based only on the
    efficient frontier, the finding is that
    population has no impact on efficiency
  • When outliers are determined based on both an
    efficient and an inefficient frontier, the
    finding is that population is negatively
    correlated with efficiency at the 95% confidence
    level

23
Summary
  • When an inefficient frontier is not considered,
    the second-stage results of the two-stage method
    are biased. The bootstrapping method does not
    correct this bias.
  • We developed a more comprehensive outlier
    detection methodology for non-parametric
    efficiency methods

24
Thank You