Push Singh - PowerPoint PPT Presentation

About This Presentation
Title:

Push Singh

Slides: 12
Provided by: pushs

Transcript and Presenter's Notes

1
Push Singh, Tim Chklovski
2
AI systems need data, lots of it!
  • Natural language processing
  • - Parsed & sense-tagged corpora, paraphrases,
    translations
  • Commonsense reasoning
  • - Facts, descriptions, scripts, rules, exceptions
  • Computer vision and speech recognition
  • - Segmented images, transcribed speech
  • Robotics
  • - Motion capture data, body configurations

3
Traditional Sources
  • 1. Knowledge engineering, programming
  • pro: can be high quality
  • cons:
  • - often brittle because of lack of coverage
  • - expensive!
  • 2. Learning from raw data
  • pro: there is sometimes a lot of raw data
  • cons:
  • - you have little control over the data
  • - for many tasks the data is not available
  • - hard to learn structured representations

4
Solution: turn to the general public!
  • There are 500,000,000 people online
    (Nielsen/NetRatings)
  • People can participate by
  • - Providing labeled training examples (e.g. for
    OCR)
  • - Tagging corpora (with part-of-speech, word
    senses)
  • - Verifying and cleaning data (validating
    assertions)
  • - Supplying rules and examples (assertions,
    stories)
  • - Evaluating performance of systems (e.g. of a
    face recognizer)
  • - Online supervised learning (e.g. Stork's
    Animals)
  • - Organizing and structuring information (e.g.
    the web)
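
As an illustration of the "verifying and cleaning data" item above, here is a minimal sketch of how yes/no votes from several contributors might be combined. The function name, the thresholds, and the decision labels are assumptions made for this example; the slides do not say how validation was actually done.

```python
from collections import Counter


def aggregate_votes(votes, min_votes=3, min_agreement=0.7):
    """Combine (assertion, is_true) votes into one decision per assertion.

    min_votes and min_agreement are illustrative thresholds, not values
    taken from the presentation.
    """
    tallies = {}
    for assertion, is_true in votes:
        tallies.setdefault(assertion, Counter())[bool(is_true)] += 1

    decisions = {}
    for assertion, tally in tallies.items():
        total = tally[True] + tally[False]
        if total < min_votes:
            # Too few contributors have looked at this assertion yet.
            decisions[assertion] = "needs more votes"
            continue
        share_true = tally[True] / total
        if share_true >= min_agreement:
            decisions[assertion] = "accepted"
        elif share_true <= 1 - min_agreement:
            decisions[assertion] = "rejected"
        else:
            decisions[assertion] = "disputed"
    return decisions


example_votes = [
    ("birds can fly", True), ("birds can fly", True),
    ("birds can fly", True), ("birds can fly", False),
    ("rocks are soft", False), ("rocks are soft", False),
    ("rocks are soft", False),
    ("cats like water", True), ("cats like water", False),
]
decisions = aggregate_votes(example_votes)
```

Requiring a minimum number of votes before deciding is one simple way to limit the effect of a single careless or malevolent contributor.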

5
Successful Distributed Human Projects
  • The Open Directory Project (www.dmoz.org)
  • - indexes 3,248,314 sites
  • - 46,846 editors
  • FreeDB (www.freedb.org)
  • - 543,786 CDs catalogued
  • Others
  • - The Internet Movie Database (www.imdb.com)
  • - American Psychological Society (psych.hanover.edu)
  • - Distributed Proofreaders
    (charlz.dns2go.com/gutenberg)
  • - NASA Crater marking project
    (clickworkers.arc.nasa.gov/top)

6
Open Mind Common Sense
  • Second-largest commonsense database after Cyc
  • - 410,000 assertions, stories, descriptions,
    rules, etc.
  • Built by 8,600 users over 1½ years
  • Can extract relations and rules via shallow
    parsing
  • Basis for several applications and experiments
  • - ARIA photo annotation and retrieval agent
  • - GOOSE commonsense search engine
  • - MAKEBELIEVE story generator
  • - intelligent camera, analogical reasoner
  • - word sense disambiguator
  • (Henry Lieberman, Hugo Liu, Barbara Barry, Thomas
    Lin)
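
To make the "extract relations and rules via shallow parsing" item concrete, here is a rough sketch that uses regular-expression templates as a crude stand-in for a shallow parser. The templates and the relation names (IsA, UsedFor, PartOf) are assumptions chosen for illustration, not the project's actual patterns.

```python
import re

# Optional leading article, matched only when followed by whitespace so
# that words such as "animal" are not split into "an" + "imal".
ARTICLE = r"(?:(?:a|an|the)\s+)?"

# Hypothetical sentence templates mapped to illustrative relation names.
TEMPLATES = [
    ("IsA", re.compile(
        rf"^{ARTICLE}(.+?) is a (?:kind|type) of {ARTICLE}(.+?)\.?$", re.I)),
    ("UsedFor", re.compile(
        rf"^{ARTICLE}(.+?) is used for (.+?)\.?$", re.I)),
    ("PartOf", re.compile(
        rf"^{ARTICLE}(.+?) is part of {ARTICLE}(.+?)\.?$", re.I)),
]


def extract_relation(sentence):
    """Return (relation, left, right) for the first matching template."""
    text = sentence.strip()
    for name, pattern in TEMPLATES:
        match = pattern.match(text)
        if match:
            return (name, match.group(1).lower(), match.group(2).lower())
    return None  # Sentence does not fit any known template.


examples = [
    "A dog is a kind of animal.",
    "A hammer is used for hitting nails",
    "The wheel is part of a car.",
    "Hello world",
]
relations = [extract_relation(s) for s in examples]
```

Free-text sentences that fit no template are simply skipped in this sketch; a real shallow parser would cover far more phrasings.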

7
The Snowball Effect
  • Two systems that leverage what they already know
    and give feedback to the contributors
  • Word Sense Disambiguation
  • - Lets users select the sense in which a word is
    used in a given sentence
  • - Uses collected information to decide where more
    learning is needed
  • - Provides feedback on how much an automatic
    tagger has improved because of your contribution
  • (learner.media.mit.edu/cgi/wsd-collect-tagging.cgi)
  • Learner
  • - Gathers commonsense knowledge by asking users
    about assertions that the system thinks may be true
  • - Questions are formed by making analogies, based
    on existing knowledge
  • (forthcoming, see www.media.mit.edu/timc/learner)
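
The Learner idea of forming questions by analogy can be sketched roughly as follows: properties known for similar concepts, but not yet for the target concept, become candidate questions. The data layout, the overlap-based scoring, and the question wording are all assumptions for this example; the slides do not describe Learner's actual algorithm.

```python
from collections import Counter


def propose_questions(knowledge, target, top_n=3):
    """Suggest yes/no questions about `target` by analogy.

    `knowledge` maps each concept to a set of property phrases. Concepts
    sharing more properties with the target count as more similar, and
    their other properties are proposed first.
    """
    known = knowledge[target]
    scores = Counter()
    for other, props in knowledge.items():
        if other == target:
            continue
        overlap = len(known & props)
        if overlap == 0:
            continue  # No shared properties, so no basis for analogy.
        for prop in props - known:
            scores[prop] += overlap
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [f"Is it true that a {target} {prop}?" for prop, _ in ranked[:top_n]]


toy_knowledge = {
    "newspaper": {"has pages", "contains text"},
    "book": {"has pages", "contains text", "can be read"},
    "magazine": {"has pages", "can be read", "has pictures"},
    "rock": {"is hard"},
}
questions = propose_questions(toy_knowledge, "newspaper")
```

Contributors' answers to such questions could then be added back to the knowledge base, which is the feedback loop the slide calls the snowball effect.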

8
The Pyramid of Tasks
  • Core
  • write plug-ins
  • contribute inference rules
  • contribute and verify simple assertions
  • This is because the prior experience required is
    an inverse pyramid
  • some Lisp / Scheme experience, knowledge
    representation / AI background
  • knowledge representation / AI background or
    interest, some programming
  • analytical skills, familiarity with reasoning
  • possess common sense

9
Future Open Mind Projects
  • - Using webcams, thousands of people help teach
    their computers to recognize the appearance and
    behavior of various kinds of objects.
  • - A system that reads text on the web, but has
    people help it comprehend passages.
  • - A dialogue system that people teach how to have
    a conversation.
  • - Using future cell phones and wearable
    computers, we could all start to teach computers
    the patterns of our everyday lives by letting
    them see and hear us as we actually do things in
    the world.

10
Open Questions
  • - How do we get users hooked?
  • - How do we acquire more sophisticated knowledge?
  • - Can we acquire hard-to-articulate knowledge?
  • - How do we use knowledge that is easy to
    acquire?

11
Fear, Rejoice,
  • Fear
  • - You have to learn how to build a community of
    contributors.
  • - You have to make your research accessible.
  • Rejoice
  • - Unlimited number of UROPs!
  • Mom and Dad
  • - Come to our web site! You can help too.
  • Discoveries
  • - If you build it they will come.
  • - Malevolent users not a big problem.
  • Recent
  • - Built second-largest commonsense database.
  • - Useful in prototype applications.