1
Revisiting Briand et al. Studies
  • Colin Kirsopp and Martin Shepperd
  • Empirical Software Engineering Research Group
  • Bournemouth University
  • email ckirsopp_at_bmth.ac.uk
  • mshepper_at_bmth.ac.uk

2
Independent Replication
  • Niessink and van Vliet (1997)
  • Stensrud and Myrtveit (1998, 99)
  • Jeffery and Walkerden (1999)
  • no search for best subset of features
  • Briand and El Emam (2000, 01)
  • approx. 30 features, so an exhaustive search for the
    best subset is not possible
  • homogeneity, well-defined relationships favour
    regression techniques
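
To make the "exhaustive search not possible" point concrete, the short sketch below counts the non-empty feature subsets that an exhaustive search would have to score. The function name is hypothetical, and the figure of 30 features is the approximate number quoted on the slide.

```python
def count_feature_subsets(n_features):
    """Number of non-empty feature subsets an exhaustive search must score."""
    return 2 ** n_features - 1


# With roughly 30 features (the approximate figure from the slide):
print(count_feature_subsets(30))  # 1073741823
```

Each of those subsets would need its own validation run, which is why a full search is impractical at this size.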

3
Briand et al. studies
  • Two studies were done to compare different
    cost estimation techniques
  • Both studies used N-fold validation
  • One study used the Experience data set from STTF
  • The other study used European Space Agency data
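
As a reminder of what N-fold validation involves, here is a minimal sketch that splits case indices into folds so that every case is held out exactly once. The interleaved fold assignment and the function name are assumptions made for illustration; the slides do not say how the folds in the original studies were formed.

```python
def n_fold_splits(n_cases, n_folds):
    """Yield (train, test) index lists; each case is held out exactly once."""
    indices = list(range(n_cases))
    for k in range(n_folds):
        test = indices[k::n_folds]  # every n_folds-th case, starting at offset k
        train = [i for i in indices if i % n_folds != k]
        yield train, test


for train, test in n_fold_splits(n_cases=6, n_folds=3):
    print("train:", train, "test:", test)
```

A model is built on each training set and its predictions are scored on the matching test set; the errors are then pooled across folds.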

4
Briand et al. studies
  • Both studies showed that CBR performed poorly
    compared to other prediction techniques (such as
    SWR, CART, ANOVA)
  • These results are markedly different from our own
    results and those of other independent
    replications
  • The question we want to answer is WHY DO
    THEY GET SUCH DIFFERENT RESULTS?

5
Possible Hypotheses
  • There were attributes of the data sets themselves
    that caused CBR to perform poorly
  • There was some bias in the results
  • Poor knowledge of CBR configuration
  • Using their own specialised statistical methods
  • They used a filter method of feature selection
    rather than a wrapper method for the CBR results
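
The filter/wrapper distinction in the last hypothesis can be sketched as follows. This is an illustrative toy, not the configuration used in either set of studies: CBR is reduced to a nearest-neighbour estimator with unweighted squared distance, the filter ranks features by absolute Pearson correlation with effort, and the wrapper scores every subset by leave-one-out error of the CBR estimator itself. All function names and the data are made up.

```python
from itertools import combinations
from statistics import mean


def cbr_predict(train, target, features, k=1):
    """Estimate effort as the mean effort of the k nearest training cases."""
    def dist(case):
        return sum((case[0][f] - target[f]) ** 2 for f in features)
    nearest = sorted(train, key=dist)[:k]
    return mean(effort for _, effort in nearest)


def loo_error(cases, features, k=1):
    """Mean absolute error of the CBR estimator under leave-one-out."""
    errors = []
    for i, (x, effort) in enumerate(cases):
        rest = cases[:i] + cases[i + 1:]
        errors.append(abs(cbr_predict(rest, x, features, k) - effort))
    return mean(errors)


def wrapper_select(cases, n_features):
    """Wrapper: score every non-empty subset using the predictor itself."""
    subsets = (s for r in range(1, n_features + 1)
               for s in combinations(range(n_features), r))
    return min(subsets, key=lambda s: loo_error(cases, s))


def abs_correlation(xs, ys):
    """Absolute Pearson correlation; 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def filter_select(cases, n_features, top=1):
    """Filter: rank features by correlation with effort, ignoring the predictor."""
    efforts = [effort for _, effort in cases]
    ranked = sorted(
        range(n_features),
        key=lambda f: abs_correlation([x[f] for x, _ in cases], efforts),
        reverse=True,
    )
    return tuple(sorted(ranked[:top]))


# Made-up cases: (feature vector, effort). Feature 0 tracks effort, feature 1 is noise.
cases = [((1, 5), 10), ((2, 1), 20), ((3, 4), 30), ((4, 2), 40), ((5, 3), 50)]
print("wrapper:", wrapper_select(cases, 2))
print("filter: ", filter_select(cases, 2))
```

On this toy data both approaches pick feature 0. The hypothesis on the slide is that on real data they can disagree, because a filter never checks how the chosen features behave inside the CBR distance measure.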

6
The aim is ...
  • To replicate the Briand work using updated CBR
    configuration
  • The ESA data is not available but we do have the
    version of the Finnish data set that was used
  • We have (with some help from Katrina Maxwell)
    been able to recreate the alternative models

7
Initial Results
  • A quick assessment of the likely outcome of results
    for our updated configuration of CBR
  • This was done with a fixed feature subset for all
    validation sets
  • Results showed CBR outperforming Briand's
    statistical techniques

8
Problem!
  • The single feature set was derived using an
    N-Fold hold-out from all of the data.
  • Strictly, each of the validation sets should have
    had its own feature set derived from its training
    set
  • We didn't feel that this simplification should
    have too much impact
  • IT DID! - Now there isn't a significant
    improvement
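
The difference between the two procedures can be sketched as below. This is a hedged illustration only: it assumes a 1-nearest-neighbour CBR estimator, exhaustive subset search scored by leave-one-out error, and interleaved folds, none of which are confirmed by the slides. All names are hypothetical. The `per_fold_selection` flag switches between the simplified procedure (one subset chosen from all of the data) and the strict one (a subset re-derived from each training set).

```python
from itertools import combinations
from statistics import mean


def nearest_effort(train, x, features):
    """1-nearest-neighbour CBR: effort of the closest training case."""
    closest = min(train, key=lambda c: sum((c[0][f] - x[f]) ** 2 for f in features))
    return closest[1]


def loo_error(cases, features):
    """Leave-one-out mean absolute error on `cases` for a feature subset."""
    return mean(
        abs(nearest_effort(cases[:i] + cases[i + 1:], x, features) - y)
        for i, (x, y) in enumerate(cases)
    )


def best_subset(cases, n_features):
    """Exhaustive wrapper search; only feasible for a handful of features."""
    subsets = [s for r in range(1, n_features + 1)
               for s in combinations(range(n_features), r)]
    return min(subsets, key=lambda s: loo_error(cases, s))


def validate(cases, n_features, n_folds, per_fold_selection):
    """Mean absolute error over all held-out cases.

    per_fold_selection=False: one subset chosen from ALL cases, so the
    held-out cases influence feature selection (the simplification).
    per_fold_selection=True: subset re-derived from each training set.
    """
    global_subset = best_subset(cases, n_features)
    errors = []
    for k in range(n_folds):
        test = cases[k::n_folds]
        train = [c for i, c in enumerate(cases) if i % n_folds != k]
        subset = best_subset(train, n_features) if per_fold_selection else global_subset
        errors += [abs(nearest_effort(train, x, subset) - y) for x, y in test]
    return mean(errors)


# Made-up cases: (feature vector, effort).
cases = [((1, 5), 10), ((2, 1), 20), ((3, 4), 30),
         ((4, 2), 40), ((5, 3), 50), ((6, 6), 60)]
print("single subset:  ", validate(cases, 2, 3, per_fold_selection=False))
print("per-fold subset:", validate(cases, 2, 3, per_fold_selection=True))
```

In the simplified variant the held-out cases have already helped choose the features, so the reported error can be optimistic. Re-deriving the subset inside each fold removes that leak, at the cost of one feature search per fold.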