A New Annotation Tool

Tom Morton and Jeremy LaCivita. 10/15/09.

1
A New Annotation Tool
  • Tom Morton and Jeremy LaCivita

2
Introduction
  • Subtle differences in corpora can break
    statistical tools.
  • Adapting tools typically means annotating data.
  • Can this be done quickly and easily?
  • The distribution of words makes annotation
    repetitive.
  • Can we annotate only some of the data?

3
Design Goals
  • Support for automatic annotation.
      • It is typically faster to fix existing
        annotations than to create them from scratch.
  • Support for active learning.
      • The examples that will be most beneficial for
        training are selected for annotation.
  • Easily extendable and reusable components.
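
The active learning goal above can be illustrated with least-confidence sampling, one common selection strategy. This is a minimal sketch, not the tool's actual method: the `Candidate` class, the `select_for_annotation` function, and the confidence values are all hypothetical names and data introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    confidence: float  # assumed: the model's probability for its top prediction


def select_for_annotation(candidates, budget):
    """Return the `budget` candidates the model is least confident about."""
    return sorted(candidates, key=lambda c: c.confidence)[:budget]


# Hypothetical pool of automatically annotated items.
pool = [
    Candidate("Washington", 0.51),
    Candidate("the", 0.99),
    Candidate("Apple", 0.62),
    Candidate("of", 0.98),
]

picked = select_for_annotation(pool, budget=2)
print([c.text for c in picked])
```

Frequent, unambiguous words receive high confidence and are skipped, so the annotator's effort goes to the items where a correction is most likely to change the model.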

4
Tool Overview
  • Persistent Features
      • Sorting Annotations
      • Filtering Annotations
  • Annotation Specific Features
      • Schemes
      • Annotators
      • Viewers

5
Sorting Annotations
  • Allows the user to:
      • Sort by order in the text (span).
      • Sort by confidence.
          • Supports active learning.
      • Sort by type.
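
The three sort orders on this slide might look like the following. This is a sketch under assumptions: the `Annotation` record and its fields (`start`, `end`, `type`, `confidence`) are hypothetical and are not taken from the tool itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int         # character offset where the span begins (assumed field)
    end: int           # character offset where the span ends (assumed field)
    type: str          # annotation type, e.g. a chunk label
    confidence: float  # annotator's confidence in this annotation


annotations = [
    Annotation(10, 14, "NP", 0.90),
    Annotation(0, 3, "VP", 0.40),
    Annotation(4, 9, "NP", 0.75),
]

# Order in the text: by span start, then span end.
by_span = sorted(annotations, key=lambda a: (a.start, a.end))

# Least confident first, so uncertain items are reviewed first.
by_confidence = sorted(annotations, key=lambda a: a.confidence)

# Grouped by type, keeping text order within each type.
by_type = sorted(annotations, key=lambda a: (a.type, a.start))
```

Sorting by ascending confidence is what connects this feature to active learning: the items at the top of the list are the ones an automatic annotator was least sure about.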

6
Filtering Annotations
  • Allows the user to:
      • Look for certain words, patterns, or
        morphological stems.
      • Look for specific annotation types.
      • Filter annotations based on who did the
        annotation.
      • Filter annotations based on confidence.
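
The filters on this slide can be sketched as small predicates that are combined with a logical AND. Everything here is an assumption for illustration: the `Annotation` fields, the helper names, and the use of a regular-expression prefix match as a crude stand-in for a real morphological stemmer.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    text: str          # the annotated text (assumed field)
    type: str          # annotation type, e.g. a POS tag
    annotator: str     # who produced the annotation, human or automatic
    confidence: float


def matches_pattern(pattern):
    """Keep annotations whose text matches a regular expression."""
    regex = re.compile(pattern)
    return lambda a: regex.search(a.text) is not None


def has_type(type_name):
    return lambda a: a.type == type_name


def annotated_by(name):
    return lambda a: a.annotator == name


def confidence_below(threshold):
    return lambda a: a.confidence < threshold


def apply_filters(annotations, *predicates):
    """Return the annotations that satisfy every predicate."""
    return [a for a in annotations if all(p(a) for p in predicates)]


annotations = [
    Annotation("running", "VBG", "auto-pos", 0.55),
    Annotation("runs", "VBZ", "auto-pos", 0.95),
    Annotation("runner", "NN", "human", 1.0),
    Annotation("walking", "VBG", "auto-pos", 0.80),
]

# Low-confidence automatic annotations on words starting with "run".
to_review = apply_filters(
    annotations,
    matches_pattern(r"^run"),
    annotated_by("auto-pos"),
    confidence_below(0.9),
)
print([a.text for a in to_review])
```

Keeping each filter as an independent predicate means new filter kinds can be added without touching the existing ones.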

7
Tool Overview
  • Persistent Features
      • Sorting Annotations
      • Filtering Annotations
  • Annotation Specific Features
      • Schemes
      • Annotators
      • Viewers

8
Schemes Implemented
  • Paragraph, Sentence, Segmentation, POS,
    Constituent, Named Entity, Word Sense, and
    Coreference.
  • Most are about 80 lines of code.
  • Word Sense and Coreference have:
      • Custom input.
      • Dynamic outcomes.
      • About 300 lines of code.
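
One way to read the difference between the simpler schemes and those with "dynamic outcomes" is that a fixed scheme offers the same label set for every item, while a dynamic scheme computes the label set from the item being annotated. The sketch below is only an interpretation: the class names, the `outcomes` method, and the sense inventory are hypothetical and do not come from the tool.

```python
class FixedOutcomeScheme:
    """Scheme whose label set is the same for every item (e.g. POS tags)."""

    def __init__(self, name, outcomes):
        self.name = name
        self._outcomes = list(outcomes)

    def outcomes(self, item):
        return self._outcomes


class DynamicOutcomeScheme:
    """Scheme whose label set depends on the item (e.g. senses of a word)."""

    def __init__(self, name, outcome_lookup):
        self.name = name
        self._lookup = outcome_lookup

    def outcomes(self, item):
        return self._lookup(item)


# Hypothetical sense inventory, for illustration only.
SENSES = {
    "bank": ["bank/financial", "bank/river"],
    "bass": ["bass/fish", "bass/music"],
}

pos_scheme = FixedOutcomeScheme("POS", ["NN", "VB", "JJ"])
sense_scheme = DynamicOutcomeScheme("Word Sense", lambda word: SENSES.get(word, []))

print(pos_scheme.outcomes("bank"))    # same labels for any word
print(sense_scheme.outcomes("bank"))  # labels depend on the word
```

Under this reading, the extra code in the Word Sense and Coreference schemes would go toward the lookup logic and the custom input it requires.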

9
Automatic Annotators Implemented
  • Paragraph: English and Chinese.
  • Sentence: English and Chinese.
  • Segmentation: English and Chinese.
  • POS: English and Chinese.
  • NP-Chunking: English.
  • Coreference: English.

10
Viewers Implemented
  • Color Text
  • Tree
  • Concordance
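
A concordance view shows each occurrence of a word together with its surrounding context, which makes repeated annotation decisions easy to compare. The following is a minimal keyword-in-context sketch, not the tool's viewer; the `concordance` function, its `width` parameter, and the sample sentence are assumptions for illustration.

```python
def concordance(tokens, keyword, width=2):
    """Return one keyword-in-context line per occurrence of `keyword`."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


tokens = "The bank raised rates while the river bank flooded".split()
for line in concordance(tokens, "bank"):
    print(line)
```

Seeing both occurrences of "bank" side by side is the kind of view that helps with a task such as word sense annotation, where the same word needs different labels in different contexts.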

11
Conclusions
  • The tool allows automatic annotators to be easily
    integrated.
  • Sorting of annotations allows for an active
    learning approach to annotation.
  • The tool is easily extendable.
      • Previously developed components can be reused
        with new annotation types.

12
Future Work
  • Integrate more automatic annotators:
      • English
          • Full Parser - Bikel
          • Word Sense - Dang
          • Extraction Relationships - Shen
      • Chinese
          • Full Parser - Chiang/Bikel
          • Word Sense - Dang/Chi
          • Coreference - Converse