Title: Case Study: Cameraphones
1 Case Study: Cameraphones
IS146: Foundations of New Media
- Prof. Marc Davis, Prof. Peter Lyman, and danah
boyd
- UC Berkeley SIMS
- Tuesday and Thursday, 2:00 pm to 3:30 pm
- Spring 2005
- http://www.sims.berkeley.edu/academics/courses/is146/s05/
2 Lecture Overview
- Review of Last Time
  - Understanding Visual Media
- Today
  - Case Study: Cameraphone
- Preview of Next Time
  - Databases
4 What Are Comics?
- "Juxtaposed pictorial and other images in
deliberate sequence, intended to convey
information and/or to produce an aesthetic
response in the viewer." (p. 9)
- How do comics differ from
  - Photographs?
  - Movies?
  - Hieroglyphics?
  - Emoticons?
5 Old Comics: Mixtec Codex Nuttall
6 Scott McCloud's Big Triangle
[Figure: triangle with vertices Picture Plane, Reality, Language]
McCloud found that "The Big Triangle," as it came
to be known, was an interesting tool for thinking
about comics art...
7 Cartoons and Viewer Identification
8 Closure: From Parts To The Whole
9 Closure: Bridging Time and Space
10 Closure in Comics
11 Types of Closure
- Scene-To-Scene
- Aspect-To-Aspect
- Non-Sequitur
- Moment-To-Moment
- Action-To-Action
- Subject-To-Subject
12 Questions for Today
- How do we interpret images and sequences of
images?
- How do we read different visual representations
of the world (especially different levels of
realism and abstraction) differently?
- How does what is left out affect how we
understand images and sequences of images?
13 Questions for Today
- What are some of the differences between how text
and images function in comics?
- What would be lost/gained in moving between
images and text?
14 Questions for Today
- How could we represent images and sequences of
images in order to make them programmable?
- What could computation do to affect how we
produce, manipulate, reuse, and understand images
and sequences of images?
15 Lecture Overview
- Review of Last Time
  - Understanding Visual Media
- Today
  - Case Study: Cameraphone
- Preview of Next Time
  - Databases
16 What is the Problem?
- Today people cannot easily find, edit, share, and
reuse digital visual media
- Computers don't understand visual media content
- Digital visual media are opaque and data rich
- We lack structured representations
- Without metadata, manipulating digital visual
media will remain like word-processing with
bitmaps
17 Signal-to-Symbol Problems
- Semantic Gap
  - Gap between low-level signal analysis and
    high-level semantic descriptions
  - "Vertical off-white rectangular blob on blue
    background" does not equal "Campanile at UC
    Berkeley"
18 Signal-to-Symbol Problems
- Sensory Gap
  - Gap between how an object appears and what it is
  - Different images of same object can appear
    dissimilar
  - Images of different objects can appear similar
19 Computer Vision and Context
- You go out drinking with your friends
- You get drunk
- Really drunk
- You get hit over the head and pass out
- You are flown to a city in a country you've never
been to with a language you don't understand and
an alphabet you can't read
- You wake up face down in a gutter with a terrible
hangover
- You have no idea where you are or how you got
there
- This is what it's like to be most computer vision
systems: they have no context
- Context is what enables us to understand what we
see
20 How We Got Here: Disabling Assumptions
- Contextual (spatial, temporal, social, etc.)
metadata about the capture and use of media are
not available
  - Therefore all analysis of media content must be
    focused on the media signal alone
- Media capture and media analysis are separated in
time and space
  - Therefore removed from their context of creation
    and the users who created them
- Multimedia content analysis must not involve
humans
  - Therefore missing out on the possibility of
    human-in-the-loop approaches to algorithm
    design and network effects of the activities of
    groups of users
21 Where To Go: Enabling Assumptions
- Leverage contextual, sensory-rich metadata
(spatial, temporal, social, etc.) about the
capture and use of media content
- Integrate media capture and analysis at the point
of capture and throughout the media lifecycle
- Design systems that incorporate human beings as
interactive functional components and aggregate
and analyze user behavior
22 Traditional Media Production Chain
[Diagram: stages PRE-PRODUCTION, PRODUCTION, POST-PRODUCTION, DISTRIBUTION; second panel titled "Metadata-Centric Production Chain" with METADATA]
23 Moore's Law for Cameras
[Chart labels: 2000, 2002, 400, 40; cameras shown: Kodak DC40, Kodak DX4900, Nintendo GameBoy Camera, SiPix StyleCam Blink]
24 Capture, Processing, Interaction, Network
25 Camera Phones as Platform
- Media capture (images, video, audio)
- Programmable processing using open standard
operating systems, programming languages, and
APIs
- Wireless networking
- Personal information management functions
- Rich user interaction modalities
- Time, location, and user contextual metadata
26 Camera Phones as Platform
- In the first half of 2003, more camera phones
were sold worldwide than digital cameras
- By 2008, the average camera phone is predicted to
have 5 megapixel resolution
- Last month Samsung introduced 7 megapixel camera
phones with optical zoom and photo flash
- There are more cell phone users in China than
people in the United States (300 million)
- For 90% of the world, their computer is their
cell phone
27 Campanile Inspiration
28 Mobile Media Metadata Idea
- Leverage the spatio-temporal context and social
community of media capture in mobile devices
- Gather all automatically available information at
the point of capture (time, spatial location,
phone user, etc.)
- Use metadata similarity and media analysis
algorithms to find similar media that has been
annotated before
- Take advantage of this previously annotated media
to make educated guesses about the content of the
newly captured media
- Interact in a simple and intuitive way with the
phone user to confirm and augment system-supplied
metadata for captured media
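The gather, match, and guess steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Photo` record, the `similarity` weights, and the one-hour time window are hypothetical and are not taken from the MMM system, and media analysis is left out entirely.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Photo:
    cell_id: str          # where: CellID at capture
    user: str             # who: phone user
    taken_at: datetime    # when: capture time
    annotations: list = field(default_factory=list)


def similarity(new, old):
    """Score how closely two photos match on capture context (assumed weights)."""
    score = 0.0
    if new.cell_id == old.cell_id:
        score += 1.0
    if new.user == old.user:
        score += 0.5
    hours_apart = abs((new.taken_at - old.taken_at).total_seconds()) / 3600
    if hours_apart <= 1:
        score += 0.5
    return score


def guess_annotations(new, annotated, top_n=3):
    """Rank labels from previously annotated photos by summed similarity."""
    votes = Counter()
    for old in annotated:
        weight = similarity(new, old)
        if weight > 0:
            for label in old.annotations:
                votes[label] += weight
    return [label for label, _ in votes.most_common(top_n)]


# Illustrative data only, not from the study.
library = [
    Photo("cell-1", "alice", datetime(2005, 3, 1, 12, 0), ["Campanile"]),
    Photo("cell-1", "bob", datetime(2005, 3, 2, 15, 0), ["Campanile", "UC Berkeley"]),
    Photo("cell-9", "alice", datetime(2005, 2, 1, 9, 0), ["Golden Gate Bridge"]),
]
new_photo = Photo("cell-1", "alice", datetime(2005, 3, 1, 12, 30))
guesses = guess_annotations(new_photo, library)
print(guesses)
```

The ranked guesses would then be shown to the phone user, who confirms or corrects them; that confirmation becomes annotation for the next capture.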
29 Campanile Scenario
30 From Context to Content
- Context
  - When
    - Date and time
  - Where
    - CellID refined to semantic place
  - Who
    - Cellphone user
  - What
    - Activity as product of when, where, and who
- Content
  - When was the photo taken?
  - Where is the subject of the photo?
  - Who is in the photo?
  - What are the people doing?
  - What objects are in the photo?
31 Space, Time, Social Space
[Diagram labels: SPATIAL, TEMPORAL, SOCIAL]
32 What is Location?
33 Camera Location vs. Subject Location
- Camera Location: Golden Gate Bridge
- Subject Location: Golden Gate Bridge
- Camera Location: Albany Marina
- Subject Location: Golden Gate Bridge
34 Kodak Picture Spot
35 Location Guesser
- Weighted sum of features
  - Most recently visited location
  - Most visited location by me in this CellID
    around this time
  - Most visited location by me in this CellID
  - Most visited location by others in this
    CellID around this time
  - Most visited location by others in this CellID
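A weighted sum over these five features might look like the sketch below. The slide names the features but not their weights, so the `WEIGHTS` values, the visit-record layout, and the one-hour "around this time" window are all assumptions made for illustration, not the MMM implementation.

```python
from collections import Counter

# Hypothetical weights: the slide lists the features, not their values.
WEIGHTS = {
    "most_recent": 2.0,
    "mine_cell_time": 3.0,
    "mine_cell": 2.0,
    "others_cell_time": 1.5,
    "others_cell": 1.0,
}


def most_visited(visits):
    """Return the most frequent location among visit records, or None."""
    counts = Counter(v["location"] for v in visits)
    return counts.most_common(1)[0][0] if counts else None


def near(hour_a, hour_b, window=1):
    """True if two hours of day are within `window` hours, wrapping at midnight."""
    diff = abs(hour_a - hour_b) % 24
    return min(diff, 24 - diff) <= window


def guess_locations(visits, user, cell_id, hour):
    """Rank candidate locations; `visits` is a chronological list of dicts."""
    in_cell = [v for v in visits if v["cell_id"] == cell_id]
    mine = [v for v in in_cell if v["user"] == user]
    others = [v for v in in_cell if v["user"] != user]
    my_all = [v for v in visits if v["user"] == user]
    # Each feature nominates one location.
    nominees = {
        "most_recent": my_all[-1]["location"] if my_all else None,
        "mine_cell_time": most_visited([v for v in mine if near(v["hour"], hour)]),
        "mine_cell": most_visited(mine),
        "others_cell_time": most_visited([v for v in others if near(v["hour"], hour)]),
        "others_cell": most_visited(others),
    }
    # A location's score is the sum of the weights of the features naming it.
    scores = Counter()
    for feature, location in nominees.items():
        if location is not None:
            scores[location] += WEIGHTS[feature]
    return [location for location, _ in scores.most_common()]


# Illustrative visit history only.
visits = [
    {"user": "ann", "cell_id": "c1", "hour": 12, "location": "Campanile"},
    {"user": "ann", "cell_id": "c1", "hour": 13, "location": "Campanile"},
    {"user": "bob", "cell_id": "c1", "hour": 12, "location": "Doe Library"},
    {"user": "bob", "cell_id": "c1", "hour": 20, "location": "Doe Library"},
    {"user": "ann", "cell_id": "c2", "hour": 18, "location": "Albany Marina"},
]
ranking = guess_locations(visits, user="ann", cell_id="c1", hour=12)
print(ranking)
```

In this toy history, "Campanile" is named by both of the user's own CellID features, so it ranks first even though the most recent visit was elsewhere.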
36 Location Guesser Performance
- Excluding the occasions on which a user first
enters a new location into the system, MMM
guessed the correct location of the subject of
the photo (out of an average of 36.8 possible
locations):
  - 100% of the time within the first four guesses
  - 96% of the time within the first three guesses
  - 88% of the time within the first two guesses
  - 69% of the time as the first guess
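Figures of this "within the first k guesses" kind are cumulative top-k accuracies. The sketch below shows how such a figure could be computed from ranked guesses; the function name and the four trials are illustrative assumptions and are not the study's data.

```python
def top_k_accuracy(ranked_guesses, truths, k):
    """Fraction of trials whose true location appears in the first k guesses."""
    hits = sum(
        1 for guesses, truth in zip(ranked_guesses, truths) if truth in guesses[:k]
    )
    return hits / len(truths)


# Illustrative trials only: each row is one photo's ranked location guesses.
ranked_guesses = [
    ["Campanile", "Doe Library", "Sather Gate"],
    ["Doe Library", "Campanile", "Sather Gate"],
    ["Sather Gate", "Doe Library", "Campanile"],
    ["Campanile", "Sather Gate", "Doe Library"],
]
truths = ["Campanile", "Campanile", "Campanile", "Campanile"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked_guesses, truths, k))
```

Because each larger k includes every hit counted at smaller k, the accuracy can only rise as k grows, which matches the shape of the slide's numbers.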
37 MMM1: Context to Content
- When
  - Network Time Server
- Where
  - CellID
- Who
  - Cellphone ID
- What
  - Faceted Annotation
[Diagram labels: Context, Content]
38 From MMM-1 To MMM-2
- MMM-1 asked
  - What did I just take a picture of?
- MMM-2 adds
  - Whom do I want to share this picture with?
[Diagram labels: Context, Content, Community]
39 Sharing ↔ Metadata
- From contextual metadata to sharing
  - A parent takes a photo of his child on the
    child's birthday
  - Whom does he share it with?
- From sharing to content metadata
  - A birdwatcher takes a photo in a bird sanctuary
    and sends it to her birdwatching group
  - What is the photo of?
40 MMM2: Context to Sharing
- When
  - Network Time Server
- Where
  - CellID
  - GPS
  - Bluetooth
- Who
  - Cellphone ID
  - Bluetooth
  - Sharing History
- What
  - Faceted Annotation
  - Captions
[Diagram labels: Context, Community]
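One way to picture how the "Who" sources could feed sharing suggestions is the sketch below. The slides do not describe the MMM2 share guesser's algorithm, so everything here is an assumption: the function name, the scoring values, and the idea of treating Bluetooth-detected people as likely recipients are hypothetical.

```python
from collections import Counter


def suggest_recipients(sharing_history, place, nearby_people, top_n=3):
    """Rank likely recipients from past shares and detected co-presence.

    sharing_history: list of (place, recipient) pairs from earlier shares.
    nearby_people: people detected near the phone (e.g. via Bluetooth).
    """
    scores = Counter()
    for past_place, recipient in sharing_history:
        # Shares made from the same place count double (assumed weighting).
        scores[recipient] += 2 if past_place == place else 1
    for person in nearby_people:
        scores[person] += 3  # assumed bonus for co-presence
    return [recipient for recipient, _ in scores.most_common(top_n)]


# Illustrative history only.
history = [
    ("home", "grandma"),
    ("home", "grandma"),
    ("campus", "study group"),
    ("home", "spouse"),
]
suggestions = suggest_recipients(history, place="home", nearby_people=["spouse"])
print(suggestions)
```

As with location guessing, the suggestions are only a ranked starting point that the user confirms or changes on the phone.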
41 MMM2: Context to Sharing
42 MMM2 Interfaces: Phone
43 MMM2 Interfaces: Web
44 MMM2 Image Map
45 More Captures and Uploads
STATS                       MMM1     MMM2     DIFF (% change)
Users                       38       40       +5%
Days                        63       39       -38%
Raw totals
Personal photos uploaded    155      1478     +854%
Total photos uploaded       535      1678     +214%
Photos not uploaded         108      52       -52%
Average per user per day
Personal photos uploaded    0.06     0.95     +1363%
Total photos uploaded       0.22     1.08     +381%
Photos not uploaded         0.05     0.03     -26%
Upload failure rate         16.8%    3.0%     -82%
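The per-user-per-day and DIFF figures follow from the raw totals by simple arithmetic, sketched here in Python. The function names are illustrative, and the failure-rate formula (photos not uploaded divided by all photos taken) is an inference that reproduces the table's values rather than a definition given on the slide.

```python
def per_user_per_day(total, users, days):
    """Average count per user per day of the study."""
    return total / (users * days)


def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


def upload_failure_rate(uploaded, not_uploaded):
    """Assumed definition: share of all photos taken that were not uploaded."""
    return not_uploaded / (uploaded + not_uploaded) * 100


# Personal photos uploaded: MMM1 (38 users, 63 days) vs. MMM2 (40 users, 39 days)
mmm1_rate = per_user_per_day(155, users=38, days=63)
mmm2_rate = per_user_per_day(1478, users=40, days=39)
print(round(mmm1_rate, 2), round(mmm2_rate, 2))        # 0.06 0.95
print(round(percent_change(mmm1_rate, mmm2_rate)))     # 1363

# Upload failure rate from total photos uploaded and photos not uploaded
print(round(upload_failure_rate(535, 108), 1))         # 16.8
print(round(upload_failure_rate(1678, 52), 1))         # 3.0
```

A +1363% change in personal photos per user per day is the same thing as a 13.6-fold increase.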
46 Reasons For 13.6 Times Increase
- Better image quality
  - VGA vs. 1 megapixel image resolution
  - Night mode for low light
  - Digital zoom
- Familiarity of the user population with
cameraphones
  - 12 prior cameraphone users this year vs. 1 last
    year
- The availability of only 1 rather than 2 camera
applications in MMM2 vs. MMM1
- Automatic background upload of photos to the web
photo management application
- Automatic support for sharing on the cameraphone
and on the web
47 More Sharing With Suggestions
48 More Sharing With Suggestions
MMM2 USER BEHAVIOR                      BEFORE SHARE GUESSER   AFTER SHARE GUESSER   DIFF (after as % of before)
TOTAL PHOTOS UPLOADED                   688                    990                   144%
TOTAL PERSONAL PHOTOS UPLOADED          688                    790                   115%
TOTAL PHOTOS SHARED                     249                    791                   318%
TOTAL PERSONAL PHOTOS SHARED            249                    591                   237%
PERCENTAGE OF PHOTOS SHARED             36%                    80%                   221%
PERCENTAGE OF PERSONAL PHOTOS SHARED    36%                    75%                   207%
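In this table the DIFF column appears to express the "after" value as a percentage of the "before" value, not a percent change. The short check below, with illustrative function names, reproduces the figures on that reading.

```python
def share_percentage(shared, uploaded):
    """Percentage of uploaded photos that were shared."""
    return shared / uploaded * 100


def after_as_percent_of_before(before, after):
    """The 'after' value expressed as a percentage of the 'before' value."""
    return after / before * 100


shared_before = share_percentage(249, 688)
shared_after = share_percentage(791, 990)
print(round(shared_before), round(shared_after))                        # 36 80
print(round(after_as_percent_of_before(shared_before, shared_after)))  # 221
print(round(after_as_percent_of_before(249, 791)))                      # 318
```

On this reading, total photos shared roughly tripled once the share guesser was introduced, while uploads grew by less than half.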
49 Sharing Graph
50 Scaling Up Photo Sharing
[Figure labels: 100K, 100M]
51 MMM3: Context, Content, Sharing
- When
  - Network Time Server
  - Calendar Events
- Where
  - CellID
  - GPS
  - Bluetooth
- Who
  - Cellphone ID
  - Bluetooth
  - Sharing History
- What
  - Faceted Annotations
  - Captions
  - Weather Service
  - Image Analysis
[Diagram labels: Content, Context, Community]
52 MMM3 Research Questions
[Diagram labels: Content, Community, Context]
- MMM1
  - Context → Content
- MMM2
  - Context → Community
- MMM3
  - Community → Context
  - Community → Content
  - Content → Context
  - Content → Community
53 Social Uses of Personal Photos
- Looking not just at what people do with digital
imaging technology, but why they do it
- Goals
  - Identify social uses of photography to predict
    resistances and affordances of next generation
    mobile media devices and applications
- Methods
  - Situated video interviews
  - Review of online photo sites
  - Sociotechnological prototyping (magic thing,
    technology probes)
54 From What to Why to What
55 Preliminary Findings
- Social uses of personal photos
  - Creating and maintaining social relationships
  - Constructing personal and group memory
  - Self-presentation
  - Self-expression
  - Functional: self and others
- Media and resistance
  - Materiality
  - Orality
  - Storytelling
56 Photo Examples of Social Uses
57 Summary
- Cameraphones are a paradigm-changing device for
multimedia computing
- Context-aware mobile media metadata will solve
many problems in media asset management
- MMM1
  - Content can be inferred from context
- MMM2
  - Sharing can be inferred from context
58 Alex Jaffe on Cameraphone Uses
- Many of the users of cell phone cameras in this
paper felt compelled to chronicle very "normal"
aspects of their daily life, either to share with
others or for personal memories. Do you think the
ability to constantly record one's life satisfies
an existing desire, or is the technology
fulfilling a need it itself inspires in people?
Regardless, can you think of examples where
technology is used to do something not because
there is a need, but simply because it becomes
possible?
59 Alex Jaffe on Cameraphone Uses
- Respondents indicated that one of their favorite
features unique to MMM(2) was their ability to
send pictures to people immediately after they
were taken. This created a sense of immediacy and
"being there" in the viewer. How is communicating
in this way reminiscent of orality, albeit in
visual form? Might this be an important part of
secondary orality in times to come?
60 Magen Farrar on Context-To-Content
- "Context-to-content inferencing promises to
solve the problems of the sensory and semantic
gaps in multimedia information systems... By using
the spatio-temporal-social context of image
capture, we are able to infer that different
images taken in the vicinity of the Campanile are
very likely of the Campanile at UC Berkeley and
know that they are not of, for example, the
Washington Monument..." So, how is the system of
context-to-content inferencing changing to
allow it to distinguish between similar
content within the same context?
61 Magen Farrar on Context-To-Content
- Sharing metadata is exceptionally useful in
inferring media content from context, but can
potentially violate one's privacy. Other than
the opt-in/opt-out mechanisms in the system, what
other steps are being thought of to assure the
preservation of privacy while sharing information
in the Mobile Media Metadata system?
62 Lecture Overview
- Review of Last Time
  - Understanding Visual Media
- Today
  - Case Study: Cameraphone
- Preview of Next Time
  - Databases
63 Readings for Next Week
- Tuesday (Guest Lecture by Dr. Frank Nack)
  - Lev Manovich. Database as a Symbolic Form. 1999,
    p. 1-16. http://www.manovich.net/DOCS/database.rtf
  - Discussion Questions
    - Dorian Peters
    - Joshia Chang
- Thursday (Guest Lecture by Prof. Yehuda Kalay)
  - Steve Harrison and Paul Dourish. Re-Place-ing
    Space: The Roles of Place and Space in
    Collaborative Systems. In Proceedings of ACM
    Conference on CSCW. New York: ACM Press, 1996, p.
    67-76.
  - Discussion Questions
    - Vlad Kaplun
    - Annie Chiu