Title: 16-721: Advanced Machine Perception
1. 16-721: Advanced Machine Perception
- Staff
- Instructor: Alexei (Alyosha) Efros (efros_at_cs), 4207 NSH
- TA: David Bradley (dbradley_at_cs), 2216 NSH
- Web Page
- http://www.cs.cmu.edu/efros/courses/AP06/
2. Today
- Introduction
- Why Perception?
- Administrative stuff
- Overview of the course
- Image Datasets
3. A bit about me
- Alexei (Alyosha) Efros
- Relatively new faculty (RI/CSD)
- Ph.D. 2003, from UC Berkeley (signed by Arnie!)
- Research Fellow, University of Oxford, 03-04
- Teaching
- I am still learning
- The plan is to have fun and learn cool things, both you and me!
- Social warning: I don't see well
- Research
- Vision, Graphics, Data-driven stuff
4. PhD Thesis on Texture and Action Synthesis
Smart Erase button in Microsoft Digital Image Pro
Antonio Criminisi's son cannot walk but he can fly?
5. The story begins
- "All happy families are alike; each unhappy family is unhappy in its own way."
- -- Lev Tolstoy, Anna Karenina
- "What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking."
- -- David Marr, Vision (1982)
6. Vision: a split personality
- "What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking. In other words, vision is the process of discovering from images what is present in the world, and where it is."
- Answer 1: pixel of brightness 243 at position (124,54)
- and depth 0.7 meters
- Answer 2: looks like bottom edge of whiteboard showing at the top of the image
- Is the difference just a matter of scale?
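As a minimal sketch of what Answer 1 amounts to, assuming a toy image stored as nested Python lists: the grid size and the names `brightness`, `depth_m`, and `measure` are hypothetical, and reading (124,54) as (x, y) is an assumption, not something the slide states.

```python
# Hypothetical toy "image": a brightness grid plus a matching depth map.
WIDTH, HEIGHT = 160, 120  # assumed size, chosen only so (124, 54) fits

brightness = [[0] * WIDTH for _ in range(HEIGHT)]   # rows indexed by y, columns by x
depth_m = [[0.0] * WIDTH for _ in range(HEIGHT)]    # depth in meters

# Plant the values from the slide; treating (124, 54) as (x, y) is an assumption.
x, y = 124, 54
brightness[y][x] = 243
depth_m[y][x] = 0.7


def measure(px, py):
    """Answer 1: report the raw numbers at one pixel, with no interpretation."""
    return {
        "position": (px, py),
        "brightness": brightness[py][px],
        "depth_m": depth_m[py][px],
    }


answer_1 = measure(x, y)
print(answer_1)
```

Nothing in this lookup says anything about a whiteboard, which is exactly what Answer 2 reports.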
7. Measurement vs. Perception
8. Brightness: Measurement vs. Perception
9. Brightness: Measurement vs. Perception
Proof!
10. Lengths: Measurement vs. Perception
Müller-Lyer Illusion
http://www.michaelbach.de/ot/sze_muelue/index.html
11. Vision as Measurement Device
Real-time stereo on Mars
Physics-based Vision
Virtualized Reality
Structure from Motion
12. But why?
- Reason 1
- Semester too short, can't cover everything
- Other great classes offered at CMU, e.g.
- Appearance Modeling (Srinivas Narasimhan, every fall)
- Medical Vision (Yanxi Liu)
- Structure from Motion (Martial Hebert, sometime?)
- But what if I don't care about this wishy-washy human perception stuff? I just want to make my robot go!
- Reason 2
- For measurement, other sensors are often better (in DARPA Grand Challenge, vision was barely used!)
- Reason 3
- The goals of computer vision (what, where) are in terms of what humans care about.
13. So what do humans care about?
slide by Fei-Fei, Fergus and Torralba
14. Verification: is that a bus?
slide by Fei-Fei, Fergus and Torralba
15. Detection: are there cars?
slide by Fei-Fei, Fergus and Torralba
16. Identification: is that a picture of Mao?
slide by Fei-Fei, Fergus and Torralba
17. Object categorization
sky
building
flag
face
banner
wall
street lamp
bus
bus
cars
slide by Fei-Fei, Fergus and Torralba
18. Scene and context categorization
slide by Fei-Fei, Fergus and Torralba
19. Rough 3D layout, depth ordering
20. Challenges 1: viewpoint variation
Michelangelo 1475-1564
21. Challenges 2: illumination
slide credit: S. Ullman
22. Challenges 3: occlusion
Magritte, 1957
23. Challenges 4: scale
slide by Fei-Fei, Fergus and Torralba
24. Challenges 5: deformation
Xu, Beihong 1943
25. Challenges 6: background clutter
Klimt, 1913
26. Challenges 7: object intra-class variation
slide by Fei-Fei, Fergus and Torralba
27. Challenges 8: local ambiguity
slide by Fei-Fei, Fergus and Torralba
28. Challenges 9: the world behind the image
29. In this course, we will
Take a few baby steps
30. Course Organization
- Requirements
- Paper Presentations (50%)
- Paper Advocate
- Paper Demo Presenter
- Paper Opponent
- Class Participation (20%)
- Keep annotated bibliography
- Post questions / comments on QuickTopic
- Ask questions / debate / fight / be involved!
- Final Project (30%)
- Do something with lots of data (at least 500 images)
- Groups of 1, 2, or 3
31. Paper Advocate
- Pick a paper from list
- That you like and are willing to defend
- Sometimes I will make you do two papers, or background
- Meet with me before starting to talk about how to present the paper(s)
- Prepare a good, conference-quality presentation (20-45 min, depending on difficulty of material)
- Meet with me again 2 days before class to go over the presentation
- Office hours at end of each class
- Present and defend the paper in front of class
32. Paper Demo Presenter
- For some papers, we will have separate demo presentations
- Sign up for a paper you find interesting
- Get the code online (or implement if easy)
- Run it on a toy problem, play with parameters
- Run it on a new dataset
- Prepare short 5-10 min presentation detailing results
- Can cooperate with Paper Advocate
33. Paper Opponent
- Sign up for a paper you don't like / are suspicious about
- Prepare an argument (with or without slides) against the paper
- Paper weaknesses
- Relevance to real problems
- Existence of better alternative approaches
- Etc.
- Present in front of class (5-10 min)
34. Class Participation
- Keep annotated bibliography of papers you read (always a good idea!). The format is up to you. At least, it needs to have
- Summary of key points
- A few interesting insights, "aha" moments, keen observations, etc.
- Weaknesses of approach. Unanswered questions. Areas of further investigation, improvement.
- Submit your thoughts for current paper(s) before each class (printout)
35. Class Participation
- In addition, submit interesting observations or questions to QuickTopic before class for public discussion.
- Be active in class. Voice your ideas, concerns.
- You need to participate either in class or in QuickTopic every week!
- Dave will be watching and keeping track!
36. Final Project
- Can grow out of paper presentation, or your own research
- But it needs to use large amounts of data!
- 1-3 people per project.
- Project proposals in a few weeks.
- Project presentations at the end of semester.
- Results presented as a CVPR-format paper.
- Hopefully, a few papers may be submitted to
conferences.
37. End of Semester Awards
- We will vote for
- Best Paper Presenter
- Best Paper Opponent
- Best Demo
- Best Project
- Prize dinner in a nice restaurant
38. Course Outline
- Physiology of Vision (1 lecture)
- Overview of Human Visual Perception (1 lecture)
- Need presenter for Monday!
- Part I: Low-level vision (images as texture)
- Texture segmentation, image retrieval, scene models, Bag of words representations
- Part II: Mid-level vision (segmentation)
- Principles of grouping, Normalized Cuts, Mean-shift, DD-MCMC, Graph-cut, super-pixels
- Part III: 2D Recognition
- Window scanning (Schneiderman-Kanade, Viola-Jones)
- Correspondence Matching (chamfer matching, Hausdorff distance, shape contexts, invariant features, active appearance models)
- Recognition with Segmentation (top-down / bottom-up)
- Words and Pictures
39. Course Outline (cont.)
- Part IV: Intrinsic Images
- Shading vs. reflectance
- Recovering surface orientations and depth
- Style vs. content
- Part V: Dealing with Data
- Isomap, LLE, Non-negative Matrix Factorization
- Part VI: Tracking and Motion Segmentation
- Particle filtering, exemplar-based, layers
- Sign up to present one paper on Wed on QuickTopic
40. Datasets