Title: MobileASL: Making Cell Phones Accessible to the Deaf Community
1MobileASL: Making Cell Phones Accessible to the Deaf Community
- Anna Cavender, Richard Ladner, Eve Riskin
- University of Washington
2Two Themes
- MobileASL
- Cyber-Community for Advancing Deaf and Hard of Hearing in STEM (Science, Technology, Engineering, and Math)
3Our goal
- ASL communication using video cell phones over the current U.S. cell phone network
Challenges
- Limited network bandwidth
- Limited processing power on cell phones
4Cell Phone Network Constraints
- Low bit rate goal
- GPRS (General Packet Radio Service)
- Ranges from 30 kbps to 80 kbps (download)
- Perhaps half that for upload
- Unpredictable variation and packet loss
- 3G (3rd Generation)
- Special service
- Not yet widespread
- Will still have congestion
- Service providers more likely to offer services
if throughput can be minimized.
5MobileASL Network Goals
- Sign language presents a unique challenge
- Not just appearance of video, intelligibility too!
- If it works for sign language, other video applications benefit too.
- MobileASL is about fair access to the current network
- As soon as possible, no special accommodations
- Not geographically limited
- Lower bitrate, power savings, more accessible
6Architecture
(Diagram: a sender and a receiver cell phone connected through the cell phone network. Labeled components: Camera, Encoder, Transmitter, Receiver, Decoder, Player.)
7Codec Used: x264
- Open source implementation of the H.264 standard
- Doubles compression ratio over MPEG-2
- Replacing MPEG-2 as industry standard
- x264 offers faster encoding
- Off-the-shelf H.264 decoder can be used
- (speculation about H.264 on the iPhone)
8Outline
- Motivation
- Introduction
- MobileASL Consumers
- Eyetracking Motivation
- Video Phone Study
- Compression Challenges
- Current Work
- Conclusions
9Discussions with Consumers
- Open ended questions
- Physical Setup
- Camera, distance, etc.
- Features
- Compatibility, text, etc.
- Scenarios
- Lighting, driving, relay services, etc.
10Consumer Response
- "I don't foresee any limitations. I would use the phone anywhere: the grocery store, the bus, the car, a restaurant, anywhere!"
- There is a need within the Deaf Community for mobile ASL conversations
- Existing video phone technology (with minor modifications) would be usable
11Video Encoding for ASL
- Constraints of cell phone network create video compression challenges
- How do we compress ASL video to maximize intelligibility?
13Eyetracking Studies
- Participants watched ASL videos while eye movements were tracked
- Important regions of the video could be encoded differently
Muir et al. (2005) and Agrafiotis et al. (2003)
14Eyetracking Results
- 95% of eye movements within 2 degrees visual angle of the signer's face (demo)
- Implications: Face region of video is most visually important
- Detailed grammar in face requires foveal vision
- Hands and arms can be viewed in peripheral vision
Muir et al. (2005) and Agrafiotis et al. (2003)
16Mobile Video Phone Study
- 3 Region-of-Interest (ROI) values
- 2 Frame rates, frames per second (FPS)
- 3 different Bit rates
- 15 kbps, 20 kbps, 25 kbps
- 18 participants (7 women)
- 10 Deaf, 5 hearing, 3 CODA
- All fluent in ASL
CODA: (Hearing) Child of a Deaf Adult
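The three factors above can be read as a full factorial design. The sketch below simply enumerates the combinations. The bit rates and frame rates come from the slides; the ROI labels are an assumption (the slides show 2x and 4x face quality, and a no-ROI baseline is assumed as the third value). Whether every participant saw every combination is not stated here.

```python
from itertools import product

# Factor levels. Bit rates and frame rates are from the slides;
# the ROI labels are assumed (a baseline plus the 2x and 4x examples).
ROI_LEVELS = ["none (assumed baseline)", "2x face quality", "4x face quality"]
FRAME_RATES_FPS = [10, 15]
BIT_RATES_KBPS = [15, 20, 25]

# Every combination of ROI level, frame rate, and bit rate.
conditions = list(product(ROI_LEVELS, FRAME_RATES_FPS, BIT_RATES_KBPS))

for roi, fps, kbps in conditions:
    print(f"ROI={roi:<24} {fps:>2} fps  {kbps} kbps")

print(len(conditions), "conditions")
```

With 3 ROI values, 2 frame rates, and 3 bit rates, this yields 3 x 2 x 3 = 18 encoding conditions.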
17Example of ROI
- Varied quality in fixed-sized region around the face
- (demo)
2x quality in face
4x quality in face
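One common way to vary quality by region in an H.264 encoder is a per-macroblock quantizer offset. The sketch below builds such an offset grid for one frame. It relies on the fact that the H.264 quantizer step size doubles for every increase of 6 in the quantization parameter (QP); reading "2x" and "4x" as QP offsets of -6 and -12 is an assumption, and the function names and the face rectangle are hypothetical.

```python
import math

MB_SIZE = 16  # H.264 macroblocks are 16x16 pixels


def qp_offset_for_quality(factor):
    """QP offset that divides the quantizer step size by `factor`.

    The H.264 step size doubles for every +6 in QP, so a step that is
    finer by `factor` corresponds to an offset of -6 * log2(factor).
    """
    return -6.0 * math.log2(factor)


def roi_offset_map(width, height, face_box, factor):
    """Return a per-macroblock grid of QP offsets for one frame.

    face_box is (left, top, right, bottom) in pixels (hypothetical input,
    for example from a face detector). Macroblocks that overlap the box
    get the ROI offset; all others get 0.
    """
    cols = math.ceil(width / MB_SIZE)
    rows = math.ceil(height / MB_SIZE)
    left, top, right, bottom = face_box
    offset = qp_offset_for_quality(factor)
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            x0, y0 = c * MB_SIZE, r * MB_SIZE
            overlaps = (x0 < right and x0 + MB_SIZE > left
                        and y0 < bottom and y0 + MB_SIZE > top)
            row.append(offset if overlaps else 0.0)
        grid.append(row)
    return grid


# 320x240 frame with an assumed face rectangle near the top center.
grid = roi_offset_map(320, 240, face_box=(112, 32, 208, 128), factor=2)
```

At a fixed bit rate, the extra bits spent in the face region have to come from coarser quantization elsewhere in the frame, so a larger factor sharpens the face at the cost of the rest of the picture.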
18Examples of FPS
- Varied frame rate: 10 fps and 15 fps
- For a given bit rate
- Fewer frames = more bits per frame
- (demo)
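The frame rate tradeoff is simple arithmetic: at a constant bit rate, the average bit budget per frame is the bit rate divided by the frame rate. A minimal sketch, assuming 1 kbps = 1000 bits per second and ignoring packet and container overhead:

```python
def bits_per_frame(bit_rate_kbps, fps):
    """Average bits available per frame at a constant bit rate."""
    return bit_rate_kbps * 1000 / fps


# Bit rates and frame rates used in the study.
for kbps in (15, 20, 25):
    for fps in (10, 15):
        print(f"{kbps} kbps at {fps} fps: {bits_per_frame(kbps, fps):.0f} bits per frame")
```

At any of these bit rates, dropping from 15 fps to 10 fps gives each frame 1.5 times as many bits.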
19Questionnaire
20User Preferences Results
(Charts: user preferences by bit rate, frame rate, and region of interest)
21Implications of results
- A mid-range ROI was preferred
- Optimal tradeoff between clarity in face and distortion in rest of sign-box
- Lower frame rate preferred
- Optimal tradeoff between clarity of frames and number of frames per second
- Results independent of bit rate
23Rate, distortion and complexity optimization
(Diagram: raw video and input parameters go into the H.264 encoder, which outputs compressed video)
- H.264 standard provides 50% bit savings over MPEG-2, but with higher complexity.
- Objective: Achieve best possible quality for least encoding time at a given bitrate
24Time Complexity Tradeoff
(Plot: MSE versus encoding time)
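One way to read a plot of MSE against encoding time is to keep only the encoder settings that no other setting beats on both axes (a Pareto front). The sketch below shows that selection. It is an illustration of the general idea, not necessarily the project's optimization method; the setting names and all numbers are placeholders, not measurements.

```python
def pareto_front(settings):
    """Keep settings not dominated in (encoding time, MSE); lower is better.

    A setting is dominated if another one is no worse on both measures
    and strictly better on at least one.
    """
    front = []
    for name, t, mse in settings:
        dominated = any(
            (t2 <= t and m2 <= mse) and (t2 < t or m2 < mse)
            for _, t2, m2 in settings
        )
        if not dominated:
            front.append((name, t, mse))
    # Sort by encoding time so the front reads left to right on the plot.
    return sorted(front, key=lambda s: s[1])


# Placeholder numbers for illustration only, not measurements.
# (hypothetical setting name, encoding time per frame in ms, MSE)
candidates = [
    ("A", 20.0, 90.0),
    ("B", 35.0, 70.0),
    ("C", 40.0, 75.0),  # slower and worse than B
    ("D", 60.0, 55.0),
    ("E", 65.0, 60.0),  # slower and worse than D
]

for name, t, mse in pareto_front(candidates):
    print(name, t, mse)
```

Given a time budget per frame, the setting to use would then be the lowest-MSE point on the front that fits within the budget.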
25Encoding/Decoding on the Cell Phone
- Implemented a command-line version of x264 on a
cell phone using Windows Mobile Edition 5.0.
26QVGA: 320x240
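To give a sense of scale, the sketch below estimates how much compression QVGA video would need to reach the bit rates and frame rates from the video phone study. It assumes 8-bit YUV 4:2:0 input (12 bits per pixel) and assumes those study settings apply at QVGA resolution, which the slides do not state.

```python
WIDTH, HEIGHT = 320, 240  # QVGA, from the slide
BITS_PER_PIXEL = 12       # assumed: 8-bit YUV 4:2:0 input


def raw_bit_rate_bps(fps):
    """Uncompressed bit rate of QVGA video at the given frame rate."""
    return WIDTH * HEIGHT * BITS_PER_PIXEL * fps


def required_compression_ratio(fps, target_kbps):
    """How many times smaller the encoded stream must be than the raw one."""
    return raw_bit_rate_bps(fps) / (target_kbps * 1000)


for fps in (10, 15):
    for kbps in (15, 20, 25):
        ratio = required_compression_ratio(fps, kbps)
        print(f"{fps} fps -> {kbps} kbps: about {ratio:.0f}:1")
```

Under these assumptions, raw QVGA at 10 fps is about 9.2 Mbps, so a 25 kbps target needs a ratio of roughly 369:1.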
28Dynamic Region-of-Interest
- Skin detection algorithms
- Region-based metric for bit allocation
- Automatically determine priority for face and
hands based on currently available bitrate.
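As an illustration of the skin detection step only, the sketch below classifies pixels with a simple chroma threshold. This is a generic rule-based approach, not necessarily the algorithm used in the project. The color conversion is the full-range BT.601 form used by JPEG; the threshold ranges are assumed values of the kind found in the skin-detection literature, and the function names are hypothetical.

```python
def rgb_to_cbcr(r, g, b):
    """Chroma components of an 8-bit RGB pixel (full-range BT.601, as in JPEG)."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


# Assumed chroma ranges for skin; these should be tuned for the
# camera and lighting conditions.
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)


def is_skin(r, g, b):
    """True if the pixel's chroma falls inside both assumed skin ranges."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return CB_RANGE[0] <= cb <= CB_RANGE[1] and CR_RANGE[0] <= cr <= CR_RANGE[1]


def skin_mask(frame):
    """frame: rows of (r, g, b) tuples. Returns rows of booleans."""
    return [[is_skin(*px) for px in row] for row in frame]


# Tiny 2x2 test frame: two skin-like tones, pure blue, and white.
frame = [
    [(224, 172, 138), (0, 0, 255)],
    [(255, 255, 255), (198, 134, 66)],
]
mask = skin_mask(frame)
```

A mask like this could mark candidate face and hand macroblocks; deciding how much priority each region gets at the currently available bitrate would be a separate step.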
29Activity Recognition
- Can save data and power by detecting:
- Fingerspelling
- Increase frame rate for better intelligibility
- Signing
- Sign language-specific encoding
- Just listening
- Less processing and transmission needed
- (demo)
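The idea of adapting the encoder to the current activity can be sketched as a lookup from activity to target frame rate, driven by a motion measure. The version below is deliberately crude: it uses the mean absolute difference between consecutive luma frames to separate "signing" from "listening" and does not attempt to detect fingerspelling, which would need a richer classifier. The threshold and the frame rates are hypothetical values chosen only for illustration.

```python
def mean_abs_diff(prev, curr):
    """Mean absolute luma difference between two equally sized frames."""
    total = sum(abs(a - b)
                for row_p, row_c in zip(prev, curr)
                for a, b in zip(row_p, row_c))
    return total / (len(curr) * len(curr[0]))


# Hypothetical threshold and frame rates, chosen only to illustrate the idea.
MOTION_THRESHOLD = 4.0
FPS_BY_ACTIVITY = {"fingerspelling": 15, "signing": 10, "listening": 1}


def classify(prev, curr):
    """Crude two-way split: little motion is treated as 'listening'."""
    if mean_abs_diff(prev, curr) >= MOTION_THRESHOLD:
        return "signing"
    return "listening"


def target_fps(activity):
    """Frame rate to request from the encoder for the given activity."""
    return FPS_BY_ACTIVITY[activity]


# Tiny 2x2 luma frames: one unchanged, one with visible motion.
still = [[100, 100], [100, 100]]
moved = [[100, 140], [90, 100]]

print(classify(still, moved), target_fps(classify(still, moved)))
print(classify(still, still), target_fps(classify(still, still)))
```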
30User Interface
- Leverages users' prior experience with video conferencing interfaces (such as Sorenson, HOVRS, etc.)
- Optimized for small screen space
- Initial state user interface
- Incoming Video Stream
- Outgoing Video Stream
- Control Toolbar
- Toggle Privacy Mode
- Toggle Chat View
- Video Screen Layout
- Toggle Status Bar
31Building the System
32MobileASL Team
- Principal Investigators
- Richard Ladner, Eve Riskin, and Sheila Hemami (Cornell)
- Graduate Students
- Anna Cavender, Rahul Vanam, Neva Cherniavsky, Jaehong Chon, Dane Barney, Frank Ciaramello (Cornell)
- Undergraduate Students
- Omari Dennis, Jessica DeWitt, Loren Merritt
- National Science Foundation
33Cyberinfrastructure for Advancing Deaf and Hard of Hearing in STEM
Richard Ladner, Jorge Díaz-Herrera, James J. DeCaro, E. William Clymer, Anna Cavender
University of Washington
Rochester Institute of Technology / National Technical Institute for the Deaf
34Our Goal
- Advancing Deaf and Hard of Hearing people in STEM
fields through better access to education.
35Problems
- Deaf students pursuing STEM fields need skilled interpreters and captioners with specific domain knowledge.
- The best interpreter may not be at the student's locale.
- Deaf students face challenging classroom environments: multiple sources of information are all visual
- Deaf Whiplash
- Sign language is growing to include STEM vocabulary
- Community consensus is required.
36Enabling Access to STEM Education
37Enabling ASL to Grow in STEM
38Summit to Create a Cyber-Community to Advance
Deaf and Hard-of-Hearing Individuals in STEM
(DHH Cyber-Community)
- Scheduled for June 2008 at RIT/NTID
- Discussion among the many stakeholders
- Deaf and hard of hearing students in STEM fields.
- Faculty and administrators in colleges and universities with a commitment to deaf and hard of hearing students in STEM fields.
- Interpreters and captioners.
- Researchers who study sign vocabulary for STEM fields and interpreting and captioning for education.
- Educational technology researchers.
- Experts in multimedia and network services that use the national cyberinfrastructure (e.g., AccessGrid).
- Companies already in the business of providing video remote interpreting (VRI) and real time captioning (RTC).
- Leaders in organizations who have an interest in advancing deaf and hard of hearing students in STEM fields.
39Questions?
- Thanks!
- MobileASL Webpage
- www.cs.washington.edu/research/MobileASL
- Richard Ladner
- ladner_at_cs.washington.edu
- Anna Cavender
- cavender_at_cs.washington.edu