Memory-Based Reasoning - PowerPoint PPT Presentation

Slides: 15
Provided by: pas59
Transcript and Presenter's Notes

1
Memory-Based Reasoning
  • ???
  • PASTA Lab.
  • POSTECH

2
1. Introduction
  • Memory-Based Reasoning (MBR) is:
  • Identifying similar cases from experience
  • Applying the information from these cases to the
    problem at hand.
  • MBR finds neighbors similar to a new record and
    uses the neighbors for classification and
    prediction.
  • It relies on two operations:
  • Distance function assigns a distance between
    any two records
  • Combination function combines the results from
    the neighbors to arrive at an answer.
  • Applications of MBR span many areas
  • Fraud detection
  • Customer response prediction
  • Medical treatments
  • Classifying responses
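
The two operations described on this slide can be sketched in Python as follows. This is a minimal illustration, not the presenters' implementation: the name `mbr_predict`, the toy `(age, label)` records, and the choice of absolute difference and majority vote are all assumptions made for the example.

```python
from collections import Counter

def mbr_predict(new_record, training_set, distance, combine, k=3):
    """Find the k nearest labeled records and combine their labels."""
    # Rank the labeled records by their distance to the new record
    ranked = sorted(training_set, key=lambda item: distance(new_record, item[0]))
    neighbors = ranked[:k]
    return combine([label for _, label in neighbors])

def abs_distance(a, b):
    # Distance function: absolute difference between two numeric records
    return abs(a - b)

def majority(labels):
    # Combination function: the most common label among the neighbors
    return Counter(labels).most_common(1)[0][0]

# Hypothetical training set of (age, label) pairs
training = [(22, "no"), (25, "no"), (41, "yes"), (47, "yes"), (52, "yes")]
prediction = mbr_predict(45, training, abs_distance, majority, k=3)
print(prediction)
```

Passing the distance and combination functions as arguments keeps the neighbor search independent of how records are compared or how answers are merged.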

3
2. How does MBR work?
  • What is the most likely movie last seen by a
    respondent based on the source of the record and
    the age of the individual?
  • MBR has two distinct phases
  • The learning phase generates the historical
    database
  • The prediction phase applies MBR to new cases
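
The two phases might look like this in Python. The sketch assumes a hypothetical class `MemoryBasedReasoner`, invented records with `source`, `age`, and `movie` fields, and an arbitrary scaling of the age difference; none of these come from the slides.

```python
from collections import Counter

class MemoryBasedReasoner:
    def __init__(self, k=3):
        self.k = k
        self.history = []  # the historical database

    def learn(self, records):
        # Learning phase: store the labeled records; no model is fitted
        self.history.extend(records)

    def predict(self, source, age):
        # Prediction phase: rank stored records by distance to the new case
        def distance(rec):
            source_gap = 0 if rec["source"] == source else 1
            age_gap = abs(rec["age"] - age) / 100  # hypothetical scaling
            return source_gap + age_gap

        nearest = sorted(self.history, key=distance)[: self.k]
        votes = Counter(rec["movie"] for rec in nearest)
        return votes.most_common(1)[0][0]

mbr = MemoryBasedReasoner(k=3)
mbr.learn([
    {"source": "web", "age": 23, "movie": "Movie A"},
    {"source": "web", "age": 27, "movie": "Movie A"},
    {"source": "mail", "age": 54, "movie": "Movie B"},
    {"source": "mail", "age": 61, "movie": "Movie B"},
    {"source": "web", "age": 58, "movie": "Movie B"},
])
guess = mbr.predict(source="web", age=25)
print(guess)
```

Note that all the work happens at prediction time; the learning phase only stores records.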

4
2.1. The three main issues in solving a problem
with MBR
  • Choosing the appropriate set of historical
    records
  • The historical records, also known as the
    training set, are a subset of the available records.
  • The training set needs to provide good coverage
    of the records so that the nearest neighbors to
    an unknown record are useful for predictive
    purposes.
  • Representing the historical records
  • The performance of MBR in making predictions
    depends on how the training set is represented in
    the computer.
  • Determining the distance function, combination
    function, and number of neighbors
  • The distance function, combination function, and
    number of neighbors are the key ingredients in
    determining how good MBR is at producing results.

5
3. Case study: Classifying News Stories
  • What are the codes?
  • News provider assigns codes to news stories in
    order to describe the content of the stories.
    These codes help users search for stories of
    interest.
  • Applying MBR
  • Choosing the training set
  • The training set consisted of 49,652 news stories
  • Choosing the Distance function
  • In this case, a distance function already
    existed, based on a notion called relevance
    feedback, which measures the similarity of two
    documents based on the words they contain.

6
3. Case study: Classifying News Stories
  • Relevance Feedback function
  • Choosing the combination function
  • The combination function used a weighted
    summation technique.
  • Choosing the number of neighbors
  • The investigation varied the number of nearest
    neighbors between 1 and 11 inclusive.
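
The slides do not spell out the weighted summation, so the following Python sketch is only one plausible reading: each neighbor adds its similarity score to every code it carries, and codes whose total reaches a threshold are assigned. The function name `assign_codes`, the threshold, the scores, and the code names are all hypothetical.

```python
from collections import defaultdict

def assign_codes(neighbors, threshold):
    """neighbors: list of (similarity, codes) for the most similar stories."""
    totals = defaultdict(float)
    for similarity, codes in neighbors:
        for code in codes:
            totals[code] += similarity  # each neighbor votes with its weight
    # Keep only the codes with enough accumulated weight
    return {code for code, total in totals.items() if total >= threshold}

# Hypothetical neighbors: (similarity score, codes on that story)
neighbors = [
    (0.9, {"finance", "energy"}),
    (0.7, {"finance"}),
    (0.4, {"sports"}),
]
assigned = assign_codes(neighbors, threshold=1.0)
print(assigned)
```

Because a story can carry several codes, the combination function returns a set rather than a single winner.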

7
3. Case study: Classifying News Stories
  • The result
  • Recall and precision are two measurements that
    are useful when measuring how well a set of codes
    gets assigned.
  • Recall: How many of the correct codes did MBR
    assign to the story?
  • Precision: How many of the codes assigned by
    MBR were correct?
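
Both measurements can be computed from two sets of codes. The sketch below is illustrative; the function name `recall_precision` and the code labels are assumptions for the example.

```python
def recall_precision(assigned, correct):
    assigned, correct = set(assigned), set(correct)
    hits = len(assigned & correct)  # codes that are both assigned and correct
    # Recall: share of the correct codes that were assigned
    recall = hits / len(correct) if correct else 0.0
    # Precision: share of the assigned codes that were correct
    precision = hits / len(assigned) if assigned else 0.0
    return recall, precision

# Hypothetical story: MBR assigned A, B, C; the correct codes are A, B, D, E
recall, precision = recall_precision({"A", "B", "C"}, {"A", "B", "D", "E"})
print(recall, precision)
```

Here two of the four correct codes were found (recall 0.5), and two of the three assigned codes were right (precision about 0.67).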

8
4. Measuring Distance
  • Three most common distance functions
  • Absolute value of the difference: $|A - B|$
  • Square of the difference: $(A - B)^2$
  • Normalized absolute value: $|A - B| / (\text{maximum
    difference})$
  • Example
  • Gender
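  • $d_{\mathrm{gender}}(\mathrm{female}, \mathrm{female}) = 0$,
    $d_{\mathrm{gender}}(\mathrm{male}, \mathrm{female}) = 1$
  • $d_{\mathrm{gender}}(\mathrm{female}, \mathrm{male}) = 1$,
    $d_{\mathrm{gender}}(\mathrm{male}, \mathrm{male}) = 0$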
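
The three numeric distance functions and the gender distance translate directly into Python. The function names and the sample ages are assumptions for illustration, as is the maximum difference of 48.

```python
def d_abs(a, b):
    # Absolute value of the difference
    return abs(a - b)

def d_square(a, b):
    # Square of the difference
    return (a - b) ** 2

def d_normalized(a, b, max_difference):
    # Absolute difference scaled into the range 0..1
    return abs(a - b) / max_difference

def d_gender(a, b):
    # Categorical field: 0 for a match, 1 for a mismatch
    return 0 if a == b else 1

# Hypothetical ages 27 and 51, with an assumed maximum difference of 48
print(d_abs(27, 51), d_square(27, 51), d_normalized(27, 51, 48))
print(d_gender("female", "male"), d_gender("male", "male"))
```

The normalized form is convenient when fields with very different scales, such as age and salary, are later merged into one distance.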

9
4. Measuring Distance
  • Age
  • Merge into a single record distance function.
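  • Summation: $d_{\mathrm{sum}}(A,B) = d_{\mathrm{gender}}(A,B) + d_{\mathrm{age}}(A,B)
    + d_{\mathrm{salary}}(A,B)$
  • Normalized summation: $d_{\mathrm{norm}}(A,B) =
    d_{\mathrm{sum}}(A,B) / \max(d_{\mathrm{sum}})$
  • Euclidean distance: $d_{\mathrm{euclid}}(A,B) =
    \sqrt{d_{\mathrm{gender}}(A,B)^2 + d_{\mathrm{age}}(A,B)^2 + d_{\mathrm{salary}}(A,B)^2}$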
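
A short Python sketch of the three ways to merge field distances into a record distance. It assumes each field distance is first scaled into the range 0 to 1, so that the largest possible sum is 3; the scaling constants and the second customer are hypothetical.

```python
import math

# Hypothetical ranges used to scale each numeric field into 0..1
MAX_AGE_DIFF = 50
MAX_SALARY_DIFF = 100_000

def field_distances(a, b):
    d_gender = 0 if a["gender"] == b["gender"] else 1
    d_age = abs(a["age"] - b["age"]) / MAX_AGE_DIFF
    d_salary = abs(a["salary"] - b["salary"]) / MAX_SALARY_DIFF
    return d_gender, d_age, d_salary

def d_sum(a, b):
    return sum(field_distances(a, b))

def d_norm(a, b):
    # max(d_sum) is 3 here because each field distance is at most 1
    return d_sum(a, b) / 3

def d_euclid(a, b):
    return math.sqrt(sum(d * d for d in field_distances(a, b)))

new = {"gender": "female", "age": 45, "salary": 100_000}
old = {"gender": "male", "age": 20, "salary": 50_000}  # hypothetical record
print(d_sum(new, old), d_norm(new, old), d_euclid(new, old))
```

Summation and normalized summation always rank neighbors identically, since one is a constant multiple of the other. The Euclidean form can rank them differently because squaring penalizes a single large field difference more heavily.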

10
4. Measuring Distance
  • Set of nearest neighbors for three distance
    functions
  • Insert new customer
  • Gender: female, Age: 45, Salary:
    100,000
  • Set of nearest neighbors for the new customer

11
5. The combination function: Asking the
neighbors for the answer
  • The basic approach: democracy
  • The basic combination function used for MBR is to
    have the K nearest neighbors vote on the
    answer: democracy in data mining.
  • Customers with Attrition History

12
5. The combination function: Asking the
neighbors for the answer
  • Using MBR to determine whether the new customer
    will attrite
  • Attrition prediction with confidence
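
A minimal Python sketch of voting with a confidence figure. It assumes that confidence means the share of neighbors that agree with the winning answer, which is one natural reading of the slide; the function name `vote` and the neighbor labels are hypothetical.

```python
from collections import Counter

def vote(neighbor_labels):
    counts = Counter(neighbor_labels)
    answer, votes = counts.most_common(1)[0]
    # Confidence: fraction of neighbors that agree with the winner
    confidence = votes / len(neighbor_labels)
    return answer, confidence

# Hypothetical attrition flags of the 5 nearest neighbors
answer, confidence = vote(["yes", "no", "yes", "no", "yes"])
print(answer, confidence)
```

Using an odd number of neighbors avoids ties when there are only two possible answers.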

13
5. The combination function: Asking the
neighbors for the answer
  • Weighted voting
  • Weighted voting is similar to voting except that
    the neighbors are not all created equal
  • Closer neighbors have stronger votes than
    neighbors farther away do.
  • The size of the vote is inversely proportional to
    the distance from the new record.
  • To prevent problems when the distance might be 0,
    it is common to add 1 to the distance before
    taking the inverse.
  • Attrition prediction with weighted voting
  • Confidence with weighted voting
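
Weighted voting with the 1 / (1 + distance) weight described on this slide can be sketched as follows. The function name `weighted_vote`, the neighbor distances, and the definition of confidence as the winner's share of the total weight are assumptions for illustration.

```python
from collections import defaultdict

def weighted_vote(neighbors):
    """neighbors: list of (distance, label) pairs."""
    weights = defaultdict(float)
    for distance, label in neighbors:
        # Adding 1 keeps the weight finite when the distance is 0
        weights[label] += 1.0 / (1.0 + distance)
    answer = max(weights, key=weights.get)
    confidence = weights[answer] / sum(weights.values())
    return answer, confidence

# Hypothetical neighbors: one exact match says "no", two farther ones say "yes"
neighbors = [(0.0, "no"), (1.0, "yes"), (3.0, "yes")]
answer, confidence = weighted_vote(neighbors)
print(answer, round(confidence, 3))
```

In this example a simple majority would answer "yes" (two votes to one), but the single neighbor at distance 0 carries a weight of 1.0 against a combined 0.75, so the weighted answer is "no".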

14
6. Conclusion
  • Strengths of Memory-Based Reasoning
  • It produces results that are readily
    understandable.
  • It is applicable to arbitrary data types, even
    non-relational data.
  • It works efficiently on almost any number of
    fields.
  • Maintaining the training set requires a minimal
    amount of effort.
  • Weaknesses of Memory-Based Reasoning
  • It is computationally expensive when doing
    classification and prediction.
  • It requires a large amount of storage for the
    training set.
  • Results can depend on the choice of
    distance function, combination function, and
    number of neighbors.