Automatic Keyphrase Extraction

Emma Nguyen. Wing group meeting. December 15th 2006.

1
Automatic Keyphrase Extraction
  • Emma Nguyen
  • Wing group meeting
  • December 15th 2006

2
Outline
  • Related work
  • Method
  • Evaluation
  • Future work

3
Related work
  • Approaches
    • Keyphrase assignment
      • Predefined list of potential keyphrases -> document
      • Disadvantage: requires human expertise
    • Keyphrase extraction
      • Document -> list of candidate keyphrases -> output keyphrases (based on a set of features)

4
Keyphrase extraction
  • Frank et al. (1999): KEA
    • TFxIDF
    • Position of the phrase's first occurrence
  • Turney (2003)
    • Enhanced KEA
    • Pointwise Mutual Information
    • Disadvantage: time- and resource-consuming
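
The two KEA features named on this slide can be sketched as follows. This is a minimal illustration, not KEA's actual code: the helper names, the toy corpus, and the add-one smoothing in the document-frequency term are all assumptions made for the example.

```python
import math


def phrase_positions(phrase, tokens):
    """Return the token indices at which `phrase` (a tuple of words) starts."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1)
            if tuple(tokens[i:i + n]) == phrase]


def tf_idf(phrase, tokens, corpus):
    """Relative frequency in the document times the inverse document frequency."""
    tf = len(phrase_positions(phrase, tokens)) / len(tokens)
    # Number of corpus documents containing the phrase. The add-one smoothing
    # is an assumption of this sketch; it keeps the logarithm finite.
    df = sum(1 for doc in corpus if phrase_positions(phrase, doc))
    return tf * -math.log2((df + 1) / (len(corpus) + 1))


def first_occurrence(phrase, tokens):
    """Fraction of the document that precedes the phrase's first appearance."""
    positions = phrase_positions(phrase, tokens)
    return positions[0] / len(tokens) if positions else 1.0


# Hypothetical toy data, for illustration only.
doc = ("keyphrase extraction selects phrases from a document "
       "keyphrase extraction is supervised").split()
corpus = [doc, "graph theory and ramsey graphs".split(), "web search engines".split()]
phrase = ("keyphrase", "extraction")

print(tf_idf(phrase, doc, corpus), first_occurrence(phrase, doc))
```

A phrase that is frequent in the document but rare in the corpus scores high on TFxIDF, and a phrase that appears early gets a first-occurrence value close to 0.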

5
Method
  • Based on KEA
  • Only noun phrases (POS tagger)
  • Features
    • TFxIDF
    • Number of words in phrase
    • Acronym
    • Section-related
      • Number of sections in which the phrase appears
      • Position in section
      • Frequency in section
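
As a rough sketch of the candidate filter and the two simplest features, the code below assumes the text has already been tagged with Penn Treebank style POS tags. The slides do not say which tagger or which noun-phrase pattern was used, so the pattern here (a run of adjectives and nouns ending in a noun), the acronym heuristic, and all function names are assumptions.

```python
def noun_phrase_candidates(tagged, max_len=3):
    """Collect runs of adjectives/nouns that end in a noun, up to max_len words."""
    candidates = set()
    for start in range(len(tagged)):
        for end in range(start + 1, min(start + max_len, len(tagged)) + 1):
            span = tagged[start:end]
            only_adj_or_noun = all(tag.startswith(("JJ", "NN")) for _, tag in span)
            if only_adj_or_noun and span[-1][1].startswith("NN"):
                candidates.add(tuple(word for word, _ in span))
    return candidates


def phrase_length(phrase):
    """Feature: number of words in the phrase."""
    return len(phrase)


def is_acronym(phrase):
    """Feature (heuristic): a single token of two or more characters, all upper case."""
    return len(phrase) == 1 and len(phrase[0]) >= 2 and phrase[0].isupper()


# Hypothetical pre-tagged sentence; a real system would obtain tags from a POS tagger.
tagged = [("KEA", "NNP"), ("extracts", "VBZ"), ("candidate", "NN"), ("noun", "NN"),
          ("phrases", "NNS"), ("from", "IN"), ("long", "JJ"), ("documents", "NNS")]

for candidate in sorted(noun_phrase_candidates(tagged)):
    print(candidate, phrase_length(candidate), is_acronym(candidate))
```

Restricting candidates to noun phrases shrinks the candidate list before any feature is computed, which is presumably why the filter sits ahead of the feature step.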

6
Section Finder
  • Abstract
  • Introduction
  • Related work
  • Method
  • Evaluation
  • Reference
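
The slides do not describe how the section finder works. One simple possibility, sketched here purely as an assumption, is to match each line against the section names listed above (allowing an optional leading number and a plural "s") and to collect the lines that follow under that heading.

```python
import re

# Section names taken from the slide; the matching rule itself is an assumption.
SECTION_NAMES = ["abstract", "introduction", "related work",
                 "method", "evaluation", "reference"]
HEADING = re.compile(
    r"^\s*(?:\d+(?:\.\d+)*\.?\s+)?(" + "|".join(SECTION_NAMES) + r")s?\s*$",
    re.IGNORECASE,
)


def split_sections(text):
    """Map each recognised section name to the list of non-empty lines under it."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(1).lower()
            sections.setdefault(current, [])
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections


# Hypothetical miniature paper, for illustration only.
paper = """Abstract
We extract keyphrases.
1. Introduction
Keyphrases summarize documents.
2 Related Work
KEA uses TFxIDF.
References
Frank et al. 1999.
"""

sections = split_sections(paper)
print(list(sections))
```

Lines that appear before the first recognised heading are ignored in this sketch; real papers would need a more forgiving heading matcher.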

7
Features
  • Number of sections
    • A: number of sections in which phrase P appears
    • B: total number of sections in the document
  • Position in section
    • C: index of the first sentence in section S that contains phrase P
    • D: total number of sentences in section S
  • Frequency in section
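
The formulas that combine these quantities did not survive in the transcript. A plausible reading, assumed here and not confirmed by the slides, is that the section-count feature is A / B and the position feature is C / D, with the frequency feature being a plain occurrence count inside the section. The function names and sample data below are hypothetical.

```python
import re


def count_in(phrase, sentence):
    """Count whole-word, case-insensitive occurrences of phrase in a sentence."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, sentence, re.IGNORECASE))


def section_features(phrase, sections):
    """Compute the assumed A / B ratio plus per-section position and frequency."""
    containing = [name for name, sentences in sections.items()
                  if any(count_in(phrase, s) for s in sentences)]
    per_section = {}
    for name in containing:
        sentences = sections[name]
        # C: 1-based index of the first sentence that contains the phrase.
        first = next(i for i, s in enumerate(sentences, start=1) if count_in(phrase, s))
        per_section[name] = {
            "position": first / len(sentences),                        # C / D
            "frequency": sum(count_in(phrase, s) for s in sentences),
        }
    return {"section_ratio": len(containing) / len(sections),          # A / B
            "per_section": per_section}


# Hypothetical sections, each given as a list of sentences.
sections = {
    "abstract": ["We study keyphrase extraction.", "Results are reported."],
    "introduction": ["Documents are long.", "Readers skim.",
                     "Keyphrase extraction helps; keyphrase extraction is cheap."],
    "evaluation": ["Precision is measured."],
}

features = section_features("keyphrase extraction", sections)
print(features)
```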

8
Evaluation
  • n: number of matches between the output phrases and the author-assigned phrases
  • N: total number of extracted keyphrases

Training documents: 50
Testing documents: 53
Precision: 16.2
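
The precision formula itself is missing from the transcript; given the definitions of n and N, it is presumably n / N. The sketch below assumes exact matching after lower-casing (the slides do not say whether stemming was applied) and uses the second example from the next slide, with phrase boundaries inferred from capitalization.

```python
def precision(extracted, author_assigned):
    """Assumed metric n / N: matched extracted phrases over all extracted phrases."""
    gold = {phrase.lower() for phrase in author_assigned}
    matches = sum(1 for phrase in extracted if phrase.lower() in gold)  # n
    return matches / len(extracted) if extracted else 0.0               # n / N


author_assigned = ["Implicit feedback", "Personalized search",
                   "User model", "Interactive retrieval"]
extracted = ["Search engine", "Information retrieval", "User model",
             "Web search", "Implicit user"]

score = precision(extracted, author_assigned)
print(score)
```

Under these assumptions only "User model" matches, so this single document scores 1 / 5 = 0.2.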

9
Evaluation (Cont)
Author-assigned keyphrases | Extracted keyphrases
Dispersers; Ramsey graphs; Independent sources; Extractors | disperser; sources; independent sources; extractor; Result
Implicit feedback; Personalized search; User model; Interactive retrieval | Search engine; Information retrieval; User model; Web search; Implicit user

10
Future work
  • Revise the added features
  • Study the collocation of words