Utility data annotation via Amazon Mechanical Turk

1
Utility data annotation via Amazon Mechanical Turk
X × 100 000 = $5000
  • Alexander Sorokin
  • David Forsyth
  • University of Illinois at Urbana-Champaign
  • http://visionpc.cs.uiuc.edu/largescale/

2
Motivation
  • Unlabeled data is free
  • Labels are useful
  • We need large volumes of labeled data
  • Different labeling needs
    • Is there X in the image?
    • Outline X.
    • Where is part Y of X?
    • Of these 500 images, which belong to category X?
    • ... and many more ...

3
Amazon Mechanical Turk
[Diagram: a task ("Is this a dog?", o Yes o No, pay $0.01) goes through the broker (www.mturk.com) to workers, who return an answer ("Yes").]
4
Motivation
X × 100 000 = $5000
Custom annotations
Large scale
Low price
5
Annotation protocols
  • Type keywords
  • Select relevant images
  • Click on landmarks
  • Outline something
  • Detect features
  • ... anything else

6
Type keywords
http://austinsmoke.com/turk/
$0.01
7
Select examples
Joint work with Tamara and Alex Berg
http://visionpc.cs.uiuc.edu/largescale/data/simpleevaluation/html/horse.html
8
Select examples
$0.02
Requester: mtlabel
9
Click on landmarks
$0.01
http://vision-app1.cs.uiuc.edu/mt/results/people14-batch11/p7/
10
Outline something
$0.01
http://visionpc.cs.uiuc.edu/largescale/results/production-3-2/results_page_013.html
Data from Ramanan NIPS06
11
Detect features
Measuring molecules. Joint work with Rebecca Schulman (Caltech)
?? $0.1
http://visionpc.cs.uiuc.edu/largescale/all_examples.html
12
Motivation
X × 100 000 = $5000
Custom annotations
Large scale
Low price
13
Issues
  • Quality?
    • How good is it?
    • How to be sure?
  • Price?
    • How to price it?
  • How does MTurk compare with others?
  • How do I sign up?
    • sorokin2_at_uiuc.edu
    • http://visionpc.cs.uiuc.edu/largescale/

14
Annotation quality
  • Annotations agree within 5-10 pixels on a 500x500 screen
  • There are bad ones.

[Figure: example annotations, panels A, C, E, G]
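The agreement figure above can be checked mechanically. The following is a minimal sketch, assuming each annotation is a list of (x, y) landmark clicks in the same order and that "agree" means every matching pair of clicks lies within a pixel tolerance. The function names, the sample coordinates, and the 10-pixel default are illustrative assumptions, not taken from the slides.

```python
import math


def landmark_distances(a, b):
    """Euclidean distance in pixels between matching landmarks of two annotations."""
    if len(a) != len(b):
        raise ValueError("annotations must have the same number of landmarks")
    return [math.hypot(xa - xb, ya - yb) for (xa, ya), (xb, yb) in zip(a, b)]


def agrees(a, b, tolerance_px=10.0):
    """True if every pair of matching landmarks is within tolerance_px."""
    return all(d <= tolerance_px for d in landmark_distances(a, b))


# Hypothetical clicks from three workers on the same 500x500 image.
worker_1 = [(120, 80), (240, 310), (400, 455)]
worker_2 = [(123, 84), (236, 313), (400, 449)]
worker_3 = [(123, 84), (90, 40), (400, 449)]  # second landmark is far off

print(agrees(worker_1, worker_2))
print(agrees(worker_1, worker_3))
```

A pair that fails the check is a candidate "bad one" and could be routed to a grading task rather than rejected outright.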
15
Grading tasks
  • Take 10 submitted results
  • Create new task to verify the result
  • Verification is easy
  • Pay the same or slightly higher price
  • Total overhead: about 10%
  • (work in progress)

http://vision-app1.cs.uiuc.edu/mt/grading/people14-batch11-small/p1/
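One way to read the grading scheme is sketched below, assuming each verification task bundles 10 submitted results and pays the same as one annotation task. Under that reading the grading cost is one tenth of the annotation budget. The function names and the `price_ratio` parameter are hypothetical.

```python
def make_grading_tasks(results, batch_size=10):
    """Split submitted results into batches, one batch per verification task."""
    return [results[i:i + batch_size] for i in range(0, len(results), batch_size)]


def grading_overhead(num_results, batch_size=10, price_ratio=1.0):
    """Grading cost as a fraction of the annotation cost.

    price_ratio is the price of one grading task relative to one
    annotation task (1.0 means "pay the same").
    """
    num_tasks = -(-num_results // batch_size)  # ceiling division
    return num_tasks * price_ratio / num_results


submitted = [f"result-{i}" for i in range(900)]  # placeholder result ids
tasks = make_grading_tasks(submitted)
print(len(tasks), grading_overhead(len(submitted)))
```

Paying slightly more per grading task only changes `price_ratio`; for example, 1.5 raises the overhead to 15% under the same assumptions.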
16
Price
  • $0.01 per image (16 clicks)
  • $1500 / 100 000 images
  • >1000 images per day
  • <4 months
  • Workers suggested $0.03 - $0.05/img
  • $3500 - $5500 / 100 000 images
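The arithmetic behind these figures can be sketched as follows, assuming prices are in US dollars and using integer cents to avoid rounding noise. The worker reward alone comes to 1000, 3000 and 5000 for 100 000 images, so each total quoted on the slide is 500 higher. The slides do not itemize that difference, and the code does not attempt to explain it. Function and variable names are illustrative.

```python
NUM_IMAGES = 100_000
IMAGES_PER_DAY = 1_000  # the slide says "more than 1000", so this is a lower bound

# Totals quoted on the slide, keyed by price per image in cents.
SLIDE_TOTALS = {1: 1500, 3: 3500, 5: 5500}


def reward_cost_dollars(num_images, cents_per_image):
    """Worker rewards only, in dollars."""
    return num_images * cents_per_image / 100


def days_needed(num_images, images_per_day):
    """Upper bound on duration if throughput never drops below images_per_day."""
    return num_images / images_per_day


for cents, quoted in SLIDE_TOTALS.items():
    reward = reward_cost_dollars(NUM_IMAGES, cents)
    print(f"{cents} cent(s)/image: reward {reward:.0f}, slide total {quoted}, "
          f"difference {quoted - reward:.0f}")

print(f"days at {IMAGES_PER_DAY} images/day: {days_needed(NUM_IMAGES, IMAGES_PER_DAY):.0f}")
```

At 1000 images per day the job takes 100 days, which is consistent with the "less than 4 months" estimate.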

17
Is the price right?
  • $0.01 / 40 clicks, 15 hours, 900 labels
  • $0.01 / 14 clicks, 1.6 hours, 900 labels
  • $0.01 / 16 clicks, 4 hours, 900 labels
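The three runs can be compared by throughput. This is a rough sketch that assumes the hours figure is the wall-clock time it took to collect all 900 labels at that task size; the slides do not state this explicitly, and the field names are illustrative.

```python
# Figures copied from the slide; the price per task is the same in all three runs.
runs = [
    {"clicks_per_task": 40, "hours": 15.0, "labels": 900},
    {"clicks_per_task": 14, "hours": 1.6, "labels": 900},
    {"clicks_per_task": 16, "hours": 4.0, "labels": 900},
]


def labels_per_hour(run):
    """Collection rate under the wall-clock assumption stated above."""
    return run["labels"] / run["hours"]


for run in sorted(runs, key=labels_per_hour, reverse=True):
    print(f'{run["clicks_per_task"]} clicks/task: {labels_per_hour(run):.1f} labels/hour')
```

Under that assumption, asking for 40 clicks at the same price slows collection by roughly an order of magnitude compared with 14 clicks, which suggests the larger task is underpriced.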
18
Annotation Method Comparison
Approach Cost Scale Setup effort Centralized Quality Elastic to
MTurk no /
LabelME Yes
ImageParsing.com Yes
Games with purpose (ESP) Yes
In house no
19
How do I sign up?
  • Go to our web page
    • http://visionpc.cs.uiuc.edu/largescale/
  • Send us an e-mail
    • sorokin2_at_uiuc.edu
  • Register at Amazon Mechanical Turk
    • http://www.mturk.com

20
Acknowledgments
  • Special thanks to
  • David Forsyth
  • Tamara Berg
  • Rebecca Schulman
  • David Martin
  • Kobus Barnard
  • Mert Dikmen
  • All workers at Amazon Mechanical Turk
  • This work was funded in part by ONR

21
Thank you
X × 100 000 = $5000