INSYS 300 Dimensionality Reduction - PowerPoint PPT Presentation

About This Presentation

Slides: 4
Provided by: xia52

Transcript and Presenter's Notes

1
INSYS 300 Dimensionality Reduction
2
What Are the Dimensions?
  • In the vector model, we considered each term a
    distinct dimension
  • We would like to reduce the number of dimensions
    to:
      • Make the calculations simpler
      • Get better averages
  • Stemming helps, but we'd really like to find
    semantic similarities
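
To make the "one term, one dimension" idea concrete, here is a minimal Python sketch. The toy corpus and the `crude_stem` helper are assumptions invented for illustration (a real system would use a proper stemmer such as Porter's); the slides do not prescribe any of this.

```python
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
docs = [
    "the ball moves up",
    "the balls moved down",
    "soccer players kick the ball",
]

def crude_stem(term):
    # Naive suffix stripping: a stand-in for a real stemmer, not Porter's algorithm.
    for suffix in ("ed", "es", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term

def vocabulary(documents, normalize=lambda t: t):
    # Each distinct (normalized) term becomes one dimension of the vector space.
    return sorted({normalize(t) for d in documents for t in d.split()})

def to_vector(document, vocab, normalize=lambda t: t):
    # Term-frequency vector of a document over the given vocabulary.
    counts = Counter(normalize(t) for t in document.split())
    return [counts[t] for t in vocab]

raw_vocab = vocabulary(docs)
stemmed_vocab = vocabulary(docs, crude_stem)

print(len(raw_vocab), "dimensions without stemming")
print(len(stemmed_vocab), "dimensions with crude stemming")
print(to_vector(docs[1], stemmed_vocab, crude_stem))
```

In this sketch stemming merges "ball"/"balls" and "moves"/"moved", so the space shrinks slightly. It cannot merge terms that are related only by meaning, which is the gap the next slide addresses.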

3
Co-occurrence and Semantic Similarity
  • Co-occurrence of words (they appear together in
    the same document) is an indication of semantic
    similarity.
  • By this definition, opposites (up/down) are more
    similar than unrelated terms (up/soccer)
  • Calculations based on these similarities can be
    used to create a type of thesaurus.
  • Co-occurrence is also used in an IR technique
    called Latent Semantic Indexing (LSI)
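
The co-occurrence idea above can be sketched in a few lines of Python. This is only one possible measure: it assumes each term is represented by a binary vector recording which documents contain it, and it compares terms by cosine similarity. The corpus and the function names (`term_vector`, `cooccurrence_similarity`, `thesaurus_entry`) are hypothetical.

```python
import math

# Hypothetical toy corpus, invented for illustration only.
docs = [
    "prices went up then down",
    "the elevator goes up and down",
    "the soccer match ended late",
    "she looked up the soccer score",
]

def term_vector(term, documents):
    # 1 if the term occurs in the document, else 0 (one entry per document).
    return [1 if term in d.split() else 0 for d in documents]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def cooccurrence_similarity(t1, t2, documents):
    # Terms that tend to appear in the same documents score close to 1.
    return cosine(term_vector(t1, documents), term_vector(t2, documents))

def thesaurus_entry(term, documents):
    # Rank every other term by its co-occurrence similarity to `term`.
    vocab = {t for d in documents for t in d.split()} - {term}
    scored = [(cooccurrence_similarity(term, t, documents), t) for t in vocab]
    return sorted(scored, reverse=True)

up_down = cooccurrence_similarity("up", "down", docs)
up_soccer = cooccurrence_similarity("up", "soccer", docs)
print(f"up/down: {up_down:.3f}  up/soccer: {up_soccer:.3f}")
```

With this corpus "up" and "down" share two documents while "up" and "soccer" share one, so the opposites score higher, as the slide predicts. Latent Semantic Indexing goes a step further by applying a truncated singular value decomposition to the term-document matrix; that step is not shown here.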