Title: Camera Culture
1Camera Culture
Ramesh Raskar, Associate Professor, Media Lab, MIT
Course web page: http://raskar.info/course.html
2Today's Plan
- Summary: Camera for image search
- Visual Social Computing and Citizen Journalism
- Next class: big question
- Opportunities in Pervasive Public Recording
- Big concept
- (Last week) Understanding camera constraints
- (This week) What matters in photography: pixels (low-level cues) or low-dimensional features (mid-level cues)?
- Decomposing pixels into meaningful values
3Camera for image search
- How can we augment the camera to best support 'image search'?
- 'Search': segment/identify/recognize/transform/compare/archive
- Or more precisely, object matching across images.
- (For example, to find a specific face image, we need a procedure to segment and identify (detect) the pixels likely to belong to a face, then recognize the candidate face by transforming it into a representation where we can match it against that specific face image. Currently, this is all performed in software using traditional cameras. Typically, the algorithms try to reduce the image to lower-dimensional 'features' and do the matching in this feature space. Unlike text search, where the search pipeline is simple thanks to an easy matching process, object matching in images is quite difficult. What additional data can we capture while recording pixels, and what new algorithms can exploit this augmented photo?)
- How can we make the scene ingredients machine readable so that we can easily perform the 'search'? Is this the key problem? 3D reconstruction (so that it is view independent)? Hardware and software solutions? Crowdsourcing (let people do marking/sorting/indexing for others)? Metadata tagging (tag high-level text labels rather than pixel-level tagging)?
- Do we need to capture: Material index (where is all the wood in this image)? Segmentation boundaries (shape versus reflectance edges)? Repeatable view and illumination invariance (be able to recreate an image from a given view so it can be compared with another image, or create images that look the same independent of time of day)?
- Some ideas: (i) to locate all 'images' with faces, record the iris biometric, which validates whether a photo includes a human eye, and then we can search all images across an album with that face/eye/iris; (ii) embed an RFID tag (electronic bar code) in every object and record the binary index with an RFID reader.
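The idea of reducing an image to lower-dimensional features and matching in feature space can be sketched as follows. This is a minimal illustration only: the intensity-histogram feature, the Euclidean distance, and the names `histogram_feature`, `best_match`, and the toy `album` are assumptions chosen for brevity, not the features a real face or object matcher would use.

```python
from math import sqrt

def histogram_feature(pixels, bins=4, max_value=256):
    """Reduce a flat list of grayscale pixels to a normalized intensity histogram."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // max_value, bins - 1)] += 1
    return [c / len(pixels) for c in counts]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(query_pixels, album):
    """Return the name of the album image whose feature is closest to the query."""
    q = histogram_feature(query_pixels)
    return min(album, key=lambda name: distance(q, histogram_feature(album[name])))

# Hypothetical toy data: two tiny "images" given as flat pixel lists.
album = {
    "dark_scene": [10, 20, 30, 40, 15, 25, 35, 45],
    "bright_scene": [200, 210, 220, 230, 240, 250, 205, 215],
}
query = [12, 22, 28, 38, 18, 30, 33, 50]
match = best_match(query, album)
```

The matching step itself is a simple nearest-neighbor lookup; the difficulty the slide points to lies in choosing features that survive changes in view and illumination, which a global histogram does not.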
4Next Class
- Homework
- What are the opportunities in pervasive recording of public spaces?
- Pervasive public recording: surveillance/GoogleEarthLive/subscription cameras
- Technology
- See through fog, time-lapse processing, day-night/season/multi-modal fusion, how to consume these images, how to merge with static/dynamic content, merge with static/dynamic cameras, support object recognition, refine GPS coordinates, crowdsourcing, metadata (video frame) tagging
- Society
- Commerce (real estate, reviews, remote maintenance), Environment (earthquake-prediction-like opportunities), Politics (protests)
- Volunteer
- Class notes: Lav (today), next ..
- Select/read/present/paper
- Visual Social Computing: Tom
- Mobile Photography: Eugene
- Beyond Visible Spectrum: Brandon
- Emerging sensors: Matt
- Developing Countries: Lav/Tilke
- Solutions for Visually Challenged: James
5Today 3pm
- Less is More: Coded Computational Photography
- Speaker: Ramesh Raskar, MIT Media Lab
- Date: Wednesday, February 20, 2008
- Time: 3:00 PM to 4:00 PM
- Refreshments: 2:45 PM
- Location: Star Seminar Room (32-D463)
6Topics
- Imaging Devices, Modern Optics and Lenses
- Emerging Sensor Technologies
- Mobile Photography
- Visual Social Computing and Citizen Journalism
- Imaging Beyond Visible Spectrum
- Computational Imaging in Sciences
- Trust in Visual Media
- Solutions for Visually Challenged
- Cameras in Developing Countries
- Future Products and Business Models
7Feedback
- What are your questions about camera/technology/society?
- Your expectations from the course?
8Topics
- Other courses
- Art and Photography
- CSAIL: Computational Photography
- MechE: Optics
- Fall 2008
- Intro to Computational Camera and Photography
- I will teach this course in the fall
- Current course
- More emphasis on future cameras
- Faster review of technology, then a look at impact/applications/opportunities
- Big ideas/technologies/applications
- Understand rules of thumb and trade-offs
- Ideal for thesis/projects/research papers/business models
- Learn fun stuff before the nitty-gritty
9Photography Full of Tradeoffs...
- Available light vs. exposure time vs. scene
movement vs. field of view vs. focus depth vs.
sensitivity vs. noise vs. color rendition vs.
color gamut vs. contrast vs. visible detail vs. ...
Flash
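One of these trade-offs, available light versus exposure time versus aperture, can be made concrete with the standard exposure value relation EV = log2(N^2 / t), where N is the f-number and t is the exposure time in seconds. The sketch below is a minimal illustration; the function names and the sample camera settings are assumptions, not taken from the slides.

```python
from math import log2

def exposure_value(f_number, exposure_time_s):
    """Exposure value EV = log2(N^2 / t) for f-number N and exposure time t (seconds)."""
    return log2(f_number ** 2 / exposure_time_s)

def equivalent_time(f_number_old, time_old_s, f_number_new):
    """Exposure time that keeps the same EV after the aperture changes."""
    return time_old_s * (f_number_new / f_number_old) ** 2

# Hypothetical settings: stopping down from f/4 to f/8 admits a quarter of the
# light, so the exposure must last four times as long to compensate.
t_new = equivalent_time(4.0, 1 / 100, 8.0)
ev_before = exposure_value(4.0, 1 / 100)
ev_after = exposure_value(8.0, t_new)
```

The smaller aperture gives more depth of field, but the longer exposure makes scene movement more likely to blur, which is exactly the kind of coupling the list above describes.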
10Available Light vs Parameter/Specs box
Exposure
Dynamic Range
Focus distance
Resolution/Frame rate
Focal Length (zoom)
Field of view
Depth of field
Aperture
Limited Parameters
Limited Abilities
11Dynamic Range
Short Exposure
Goal: High Dynamic Range
Long Exposure
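A common way to reach this goal is to merge a short and a long exposure of the same scene into one radiance estimate. The sketch below is a minimal illustration that assumes a linear sensor response, 8-bit pixel values, aligned images, and known exposure times; the hat-shaped weighting and all names are illustrative choices, not the method presented in the lecture.

```python
def weight(pixel, max_value=255):
    """Hat weight: trust mid-range pixels, distrust near-black and saturated ones."""
    return min(pixel, max_value - pixel) + 1  # +1 keeps every weight positive

def merge_exposures(images, times, max_value=255):
    """Merge aligned exposures (flat pixel lists) into relative scene radiance.

    Assumes a linear sensor, so radiance is proportional to pixel / exposure_time.
    """
    merged = []
    for samples in zip(*images):
        total_w = sum(weight(p, max_value) for p in samples)
        radiance = sum(weight(p, max_value) * p / t for p, t in zip(samples, times))
        merged.append(radiance / total_w)
    return merged

# Hypothetical three-pixel scene captured twice.
short = [2, 20, 100]    # 1/100 s: dark regions are nearly black
long_ = [20, 200, 255]  # 1/10 s: the brightest region saturates at 255
hdr = merge_exposures([short, long_], times=[0.01, 0.1])
```

In the saturated pixel the long exposure receives almost no weight, so the estimate is dominated by the short exposure, while the dark pixels benefit from the less noisy long exposure.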
12Phase 1 of Better Photography
- Epsilon Photography
- Low-level vision
- Best pixel and pixel-features
- Vary focus, exposure, polarization, illumination
- Vary time, view
- Better than any one photo (resolution/frame rate, field of view, dynamic range, etc.)
- Achieve effects via multi-photo fusion
- Create a Super-camera
- Mimic human eye
13Phase 1.1 of Better Photography
- Create a Super-camera
- Mimic human eye
- Which aspects of the human eye are critical/useless?
- Eye: feedback w.r.t. the brain, after-images/illusions
- Camera: geometry/stereo pair, multispectral, uniform resolution, memory
- What other parameters/designs/features can be improved?
- Very small camera/thin camera ..
- Tight loop with illumination
- ..
14The Eye's Lens
15Varioptic Liquid Lens: Electrowetting
Varioptic, Inc., 2007
16Varioptic Liquid Lens
(Courtesy Varioptic Inc.)
17Captured Video
(Courtesy Varioptic Inc.)
18Conventional Compound Lens
19Origami Lens: Thin Folded Optics (2007)
"Ultrathin Cameras Using Annular Folded Optics," E. J. Tremblay, R. A. Stack, R. L. Morrison, J. E. Ford, Applied Optics, 2007 (OSA)
20Origami Lens
Conventional Lens
Origami Lens
21Optical Performance
Conventional Origami
Scene
22Compound Lens of Dragonfly
23TOMBO: Thin Camera (2001)
"Thin observation module by bound optics (TOMBO)," J. Tanida, T. Kumagai, K. Yamada, S. Miyatake, Applied Optics, 2001
24TOMBO Thin Camera
25Captured Image
TOMBO
Scene
Captured Image (Multiple low-resolution copies
of the scene)
26Reconstructed Image
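The step from multiple low-resolution copies to one reconstructed image can be illustrated with a toy 1D interleaving example. This is a deliberately simplified sketch, not the reconstruction used in the TOMBO paper: it assumes ideal point sampling, exactly known sub-pixel offsets, no blur or noise, and a scene length divisible by the number of sub-cameras. The function names are hypothetical.

```python
def capture_low_res(scene, factor, offset):
    """Simulate one sub-camera: keep every `factor`-th sample starting at `offset`."""
    return scene[offset::factor]

def interleave(copies, factor):
    """Rebuild a higher-resolution signal by placing each copy back on its own grid."""
    length = sum(len(c) for c in copies)
    result = [None] * length
    for offset, copy in enumerate(copies):
        result[offset::factor] = copy
    return result

# Hypothetical 1D "scene" seen by two sub-cameras shifted by one sample.
scene = [3, 7, 1, 9, 4, 6, 2, 8]
copies = [capture_low_res(scene, 2, k) for k in range(2)]
reconstructed = interleave(copies, 2)
```

Each copy alone has half the resolution of the scene; only the combination recovers every sample. A real system would additionally need calibration of the offsets and deblurring.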
29Phase 2 of Better Photography
- Coded Photography
- Mid-level cues
- Regions, shapes (depth), edges, motion, material index
- Cartoons via multi-flash camera (depth edges), wavelength profile
- Visual interface issue (human eye expects pixels)
- Decompose pixel values
- Single or few photos
- Create a functionally super-camera
- Don't mimic human eye
30Multiperspective Camera?
31Phase 3 of Better Photography
- Essence Photography
- High-level cues
- Inference, perception, cognition
- Intent based (like biovision systems)
- Not a single solution that fits all
- Single or few photos?
- Beats photography
- Don't just mimic human eye, or record pixels/mid-level cues
- Create a meaningful representation of visual experience
- New art form, new commerce models
32Visual Social Computing and Citizen Journalism
- What is VSC?
- Social Computing is well known; I made up VSC
- My definition of SC: online computation of the people, by the people, for the people (old world: government, economy, epidemiology)
- Subsets
- Crowdsourcing (CAPTCHA) (by the people, but maybe for just one person)
- Participatory sensing (of the people, but no active part by individuals, not for the people)
- Recommendation systems (by the people and for the people)
- Tagging (Digg) (all three)
- Blogs, social networks, auctions, Wikipedia, tags
- 90% of all data will be about people
- Example problem: Can we reduce distrust among Kenya's groups?
- Easy to predict certain trends ..
- Just add dimensions
- Text, audio/music, images, video, (what's next?)
- LP -> Cassette -> VHS player -> CD player -> DVD player (OK, Blu-ray DVD player) -> (what's next?)
- Radio -> TV -> ..
- Gopher -> Newsgroups -> Wikipedia -> (what's next?)
- Take anything text/audio based -> image/video
- Take anything image based -> video (Flickr -> YouTube)