Informedia and Health Care video archives

Provided by: alexhauptm

Transcript and Presenter's Notes

1
Research in Creating Video Archives and the
Potential for Health Care
Alex Hauptmann August 1, 2005
Carnegie Mellon University Pittsburgh, USA
2
Outline

Overview of the Informedia project
Metadata exchange
CareMedia: video archives for medical observations

3
Informedia Project Mission

Enable Search and Discovery in the Video Medium
Automated information and metadata extraction
from video
Full-content search and retrieval of spoken
language and visual documents
Integration of speech, image and natural language
understanding for library creation and
exploration
Validation through user testbeds

4
Application of Diverse, Imperfect Technologies

Speech understanding for automatically derived
transcripts
Image understanding for video paragraphing; face, text, object and scene recognition
Natural language for segmentation, query
understanding and content summarization
Human computer interaction for video display,
navigation and reuse

5
Video Search Demonstration
6
(No Transcript)
7
Informedia Metadata Extraction
Metadata Extractor
User Interface
Visualization Templates
(final representation)
Summarizer
8
Informedia DVL Overview
Modularize metadata extraction process.
Specify metadata exchange interface and processing synchronization
XML/XSL data representation for user customized
interfaces
9
Metadata Creation Paradigm

Goal: Provide a logical view of metadata creation modules and their logical relationships

Metadata Creation
Video-based Analysis
Segment-based Analysis

Face detection
VOCR
Title generation
Topic Assignment
Capitalization
Phrase Extraction
Geocoding
(Still object detection)
(Moving object detection)

Scene break detection
Black frame detection
Speech recognition
Signal-to-noise ratio

Segmentation Transcript Processing
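
Two of the video-based analysis steps listed above, black frame detection and scene break detection, can be illustrated with a minimal sketch. It assumes frames are small grayscale images given as rows of pixel values from 0 to 255. The function names and thresholds are hypothetical, and the histogram comparison is a common textbook approach, not necessarily the method Informedia uses.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # grayscale rows, pixel values 0..255


def is_black_frame(frame: Frame, max_mean: float = 16.0) -> bool:
    """Treat a frame as black when its mean brightness is very low."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels) <= max_mean


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Normalized brightness histogram with `bins` equal-width buckets."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for p in row:
            counts[min(p * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def scene_breaks(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return each index i where frame i differs strongly from frame i-1."""
    breaks = []
    for i in range(1, len(frames)):
        h_prev, h_cur = histogram(frames[i - 1]), histogram(frames[i])
        # L1 distance between two normalized histograms lies in [0, 2].
        distance = sum(abs(a - b) for a, b in zip(h_prev, h_cur))
        if distance > threshold:
            breaks.append(i)
    return breaks


# Hypothetical 4x4 frames: two dark frames followed by two bright ones.
dark = [[5] * 4 for _ in range(4)]
bright = [[200] * 4 for _ in range(4)]
clip = [dark, dark, bright, bright]
```

On this toy clip, the only large histogram change is between the second and third frames, so that is where a break would be reported.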
10
Informedia System Structure
11
Text and Face Detection
12
Camera and Motion Detection
Pan
Right object motion (not pan left)
13
Video OCR
Final VOCR Results: GERRY ADAMS SINN FEIN PRESIDENT
14
Annotations and Data Export

Annotation fields contain metadata automatically derived from the content (e.g., topics, chyron)
Annotations are included in the index (searchable separately or combined with the transcript)
Personal annotations are typed or spoken comments that are established on a per-user basis
Used for bookmarking or commentary
Fully indexed and searchable with other data
Shot bookmarking implemented and tested with both novice and expert users
XML and segment metadata import/export capability
Conversion to MPEG-7
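
The point that annotations are indexed and can be searched separately or combined with the transcript can be sketched as a small fielded inverted index. The class, field names, and sample segments below are hypothetical illustrations and do not reflect the actual Informedia index format.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple


class FieldedIndex:
    """Toy inverted index: (field, term) -> set of segment ids."""

    def __init__(self) -> None:
        self._postings: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, segment_id: str, field: str, text: str) -> None:
        """Index every whitespace-separated term of `text` under `field`."""
        for term in text.lower().split():
            self._postings[(field, term)].add(segment_id)

    def search(self, term: str, fields: Optional[Iterable[str]] = None) -> Set[str]:
        """Search the given fields; None means all fields combined."""
        wanted = None if fields is None else set(fields)
        term = term.lower()
        hits: Set[str] = set()
        for (field, indexed_term), ids in self._postings.items():
            if indexed_term == term and (wanted is None or field in wanted):
                hits |= ids
        return hits


# Hypothetical segments with transcript, VOCR, and personal annotation fields.
index = FieldedIndex()
index.add("seg1", "transcript", "the president of sinn fein spoke today")
index.add("seg1", "vocr", "GERRY ADAMS SINN FEIN PRESIDENT")
index.add("seg2", "personal", "bookmark adams interview")

combined = index.search("adams")            # all fields together
vocr_only = index.search("adams", ["vocr"])  # one annotation field only
```

Keeping the field as part of the posting key is one simple way to support both query modes from a single index.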

15
Informedia XML Presentation Architecture
16
Efficient navigation is especially important
with video

Multiple levels of abstraction and summarization
Visual icons with relevance measure
One-line headlines
Static film strip views
Active video skims
Transcript following (even when errorful)
Let the eyes do the searching

17
Interfaces: Let the eyes do the searching
18
The Challenge of Extensibility (current work)
19
Informedia Current Capabilities

Information retrieval in both spoken language and
video/image domains
Fully automated transcriptions generated entirely
through speech recognition or with closed
captions
Information summaries at varying detail, both
visually and textually
Full content georeferencing of every event for
geographic display and query
Extraction and reuse of video documents for
Web-based access and presentation
All integrated into a user-tested and validated interface

20
Informedia Focus

Allow complete access to information within
multimedia sources
Generate metadata descriptions
Segment audio and video into meaningful segments
Provide abstractions for reviewing those segments
Improve query and browsing interfaces to this
data
Iterate based on user studies

21
Digital Human Memory

Technology for creating a continuously recorded, digital, high-fidelity record of one's whole life in video form
Personal, wearable units which record audio, video, GPS and electronic communications, capturing all that is heard, seen and experienced
Transforming this personal history into a meaningful, accessible information resource with auto-search and auto-summarization
Feasible: 100 MB/h or 1 GB/day or 0.33 TB/year or 30 TB/lifetime
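
The storage estimate can be checked with a few lines of arithmetic. The sketch below takes the 100 MB/h rate from the slide and assumes, hypothetically, about 10 recorded hours per day, decimal units, and a 90-year lifetime; none of these assumptions are stated in the presentation.

```python
MB_PER_HOUR = 100          # rate stated on the slide
HOURS_PER_DAY = 10         # assumption: recording during waking hours only
DAYS_PER_YEAR = 365
YEARS_PER_LIFETIME = 90    # assumption, not stated on the slide

# Decimal units: 1000 MB = 1 GB, 1000 GB = 1 TB.
gb_per_day = MB_PER_HOUR * HOURS_PER_DAY / 1000
tb_per_year = gb_per_day * DAYS_PER_YEAR / 1000
tb_per_lifetime = tb_per_year * YEARS_PER_LIFETIME

print(f"{gb_per_day:.1f} GB/day")
print(f"{tb_per_year:.3f} TB/year")
print(f"{tb_per_lifetime:.1f} TB/lifetime")
```

Under these assumptions the results come out close to the slide's rounded figures of 1 GB/day, 0.33 TB/year and 30 TB/lifetime.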

22
Data Collection: The Vest
23
(No Transcript)
24
LSCOM: A Large Scale Concept Ontology for Multimedia

Collaborative activity of three critical communities (users; library scientists and knowledge experts; and technical researchers, algorithm, system and solution designers) to create a user-driven concept ontology for analysis of video broadcast news

Lexicon and Ontology: 1000 or more concepts
25
Large Scale Concept Ontology for Multimedia Understanding (LSCOM): Scope
(Diagram labels: Analyst User Interactions; Pre-Analyst Annotation; Ontology (lexicon); Analyst Tools; Raw audio/video (possibly plus some metadata); Extractable feature descriptors (e.g., cut rate, motion); Annotation Engine; Feature Extraction; Search filter results; Search Engines; Terms; LSCOM Workshop Scope; Higher-level subjective interpretation; Inference Engines; Features; Annotation metadata)
Maximizes constraints such as computability, utility, reusability, compatibility (e.g., Cyc, OWL, etc.)
Inference engines may be rule-based, statistically based, hard-wired, etc.
26
The Power of an Ontology

An explicit formal specification of how to represent the objects, concepts and other entities that are assumed to exist in some area of interest and the relationships that hold among them.
Descriptive power can be achieved if a small number of primitives can be combined using a few composition operators and a limited number of relations to form multiple threads that generate a large number of complex concepts
This compositional structure leads to a divide-and-conquer strategy that makes it possible to make progress on several fronts simultaneously
Different research groups can focus on different concepts
Primitive concept recognition methods can be shared and reused
Composite concepts can be used as parts of other concepts
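
The compositional idea above (a few primitives, a few composition operators, composites reused inside other composites) can be sketched as follows. Concepts are modeled as tests over the set of primitive labels detected in a shot. All concept names and the two operators are hypothetical illustrations, not part of the LSCOM lexicon.

```python
from typing import Callable, FrozenSet

Detections = FrozenSet[str]              # primitive labels detected in a shot
Concept = Callable[[Detections], bool]   # a concept is a test on detections


def primitive(label: str) -> Concept:
    """A primitive concept holds when its detector label is present."""
    return lambda found: label in found


def all_of(*parts: Concept) -> Concept:
    """Composition operator: every part must hold."""
    return lambda found: all(part(found) for part in parts)


def any_of(*parts: Concept) -> Concept:
    """Composition operator: at least one part must hold."""
    return lambda found: any(part(found) for part in parts)


# Hypothetical primitives; each could come from a different research group.
person = primitive("person")
microphone = primitive("microphone")
studio = primitive("studio")
outdoors = primitive("outdoors")

# Composites built from primitives, then reused as parts of other concepts.
speaker = all_of(person, microphone)
news_anchor = all_of(speaker, studio)
field_reporter = all_of(speaker, outdoors)
reporting = any_of(news_anchor, field_reporter)

shot = frozenset({"person", "microphone", "outdoors"})
```

Because `speaker` is defined once and reused, an improved person or microphone detector benefits every composite built on top of it.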

27
What an Ontology with Background Knowledge and
Inference can give us

Query: Someone smiling

(∃x) (feelsEmotion x Happiness Positive)

Caption: A man watching his daughter take her first step

(∃x,y) (and (father x y) (gender x Female) (sees x y) (walking
28
Broadcast News Video Content Description Ontology

Why the Focus on Broadcast News Domain?
Critical mass of users, content providers,
applications
Good content availability (TRECVID, LDC, FBIS)
Shares large set of core concepts with other
domains
Ontology Formalism
Entity-Relationship (E-R) Graphs
RDF, DAML / DAML+OIL, W3C OWL, CycL
MPEG-7, MediaNet, VEML
Seed Representations
TRECVID-2003 News Lexicon (Annotation Forum)
Library of Congress TGM
CYC knowledge representation (ontology)
CNN, BBC Classification Scheme, TVAnytime,
Comstock,

MPEG-7 Video Annotation Tool
29
MPEG-7 for Metadata Exchange
Multimedia Content Description Interface
Standard

Standardize a framework for describing
audio-visual content
Describe different aspects of multimedia
documents at different abstraction levels
Create descriptions to form the basis of
applications like search, filtering and browsing
multimedia content
Does NOT specify video compression or
transmission
MPEG-7 descriptions live in separate files from
the video
Extensible for new description schemas

30
What is MPEG-7
Four Types of Normative Elements

Descriptors (Ds)
Primarily to describe low-level audio or visual
features
Description Schemes (DSs)
Describe higher-level AV features such as
regions, segments, objects, events and other
immutable metadata related to creation and
production, usage, and so forth
Description Definition Language (DDL)
Allow specifying new description schemes and
descriptors
Coding Schemes
Specify how to code the needed descriptions to
satisfy the compression and the transmission
requirements

31
MPEG-7 Application Chain
32
Example
33
Simple Example

<?xml version="1.0" encoding="iso-8859-1"?>
<Mpeg7 xmlns="urn:mpeg:mpeg7:schema:2001"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:mpeg7="urn:mpeg:mpeg7:schema:2001"
       xsi:schemaLocation="urn:mpeg:mpeg7:schema:2001 .\Mpeg7-2001.xsd">
  <Description xsi:type="ContentEntityType">
    <DescriptionMetadata>
      <LastUpdate>2001-04-06T00:00:00+00:00</LastUpdate>
    </DescriptionMetadata>
    <CreationInformation id="179650">
      <Creation>
        <Title xml:lang="en">CNN World Today - 09-28-2000</Title>
        <Abstract>
          <FreeTextAnnotation>
            Today, .......
          </FreeTextAnnotation>
        </Abstract>
        <CreationCoordinates>
          <CreationLocation>
            <Name xml:lang="es">CNN</Name>
            <Country>us</Country>
            <AdministrativeUnit>New York</AdministrativeUnit>
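
Because an MPEG-7 description like the example above is ordinary namespaced XML, a standard XML parser can read it. The sketch below uses Python's `xml.etree.ElementTree` on an abridged copy of the fragment. The closing tags are an assumption added to make it well-formed, since the example on the slide is cut off, and the document is not validated against the MPEG-7 schema.

```python
import xml.etree.ElementTree as ET

# Prefix "m" is a local alias for the MPEG-7 namespace used on the slide.
MPEG7_NS = {"m": "urn:mpeg:mpeg7:schema:2001"}

# Abridged fragment from the slide; the closing tags are assumed, because
# the original example ends after <AdministrativeUnit>.
DOC = b"""<?xml version="1.0" encoding="iso-8859-1"?>
<Mpeg7 xmlns="urn:mpeg:mpeg7:schema:2001"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Description xsi:type="ContentEntityType">
    <CreationInformation id="179650">
      <Creation>
        <Title xml:lang="en">CNN World Today - 09-28-2000</Title>
        <CreationCoordinates>
          <CreationLocation>
            <Name xml:lang="es">CNN</Name>
            <Country>us</Country>
            <AdministrativeUnit>New York</AdministrativeUnit>
          </CreationLocation>
        </CreationCoordinates>
      </Creation>
    </CreationInformation>
  </Description>
</Mpeg7>
"""

root = ET.fromstring(DOC)

# Element lookups must be namespace-qualified; attributes such as id are not.
title = root.find(".//m:Title", MPEG7_NS)
place = root.find(".//m:CreationLocation/m:AdministrativeUnit", MPEG7_NS)
record_id = root.find(".//m:CreationInformation", MPEG7_NS).get("id")

print(record_id, "|", title.text, "|", place.text)
```

Since the description lives in a file separate from the video, a search application can index fields such as the title and location without touching the media itself.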

34
More Information about MPEG-7

MPEG: http://www.cselt.it/mpeg
MPEG-7 Industry Forum: http://www.mpeg7.org
Overview: www.telecomitalialab.com/mpeg/standards/mpeg-7/mpeg-7.htm
Scheme: http://pmedia.i2.ibm.com:8000/mpeg7/schema/
MDS: http://www.mpeg7.ee.columbia.edu/
DDL: http://archive.dstc.edu.au/mpeg7-ddl/
XM: http://www.lis.ei.tum.de/research/bv/topics/mmdb/e_mpeg7.html

35
CareMedia: Behavior Observations in a Nursing Home
36
CareMedia: Automated Behavior Analysis in the Nursing Home

Primary objective is improved quality of assisted care
Example: automating detection of behavioral and psychological symptoms of dementia (BPSD)
Apply more broadly to monitoring and maintaining the quality of life
Ultimately, make automated, quantitative measurements to:
Explore relation of symptoms to environments in which they occur
Evaluate symptoms longitudinally
Determine the frequency of symptoms
Develop patient profiles of responses to treatment interventions
>>>> Enable earlier intervention to sustain quality of life

37
Applications in the Nursing Home

Clinical/Research
Tracking patient behavior and incidents in long-term care facilities
e.g., disruptive vocalizations, falls
Recording patient mobility and activity levels
Correlating with time of day, location and environmental factors
Observing effects of drugs on individuals and
groups
Patient
Cognitive assist - reminding, alerting and
summoning help
Staff training
Analysis of video records of incidents used for
training
Management
Monitoring and documenting compliance

38
What is Presently Measured by Humans

The Pittsburgh Agitation Scale
Aberrant Vocalizations (repetitious requests or complaints, non-verbal vocalizations, e.g., moaning)
Motor Agitation (pacing, wandering, rocking in chair)
Aggressiveness (vocal threats, threatening gestures)
Resisting Care (pushing away to avoid tasks)

39
What are we trying to detect and measure?

Person Tracking
Person Identification
Gross Motor Behavior (broad area, e.g., walking/gait, falling, wheelchair motion)
Small Motion Behavior (task-specific, e.g., eating, washing hands, combing hair)
Fine Motion Behaviors (close-in, e.g., twitches, tremors, eye-blinking, frowns)
Social Interactions
Gradual Trends over Time
Rare Events
Required in processing: Obfuscation (for privacy protection)

40
CareMedia: What are the observables?

Who?
Identify people across cameras, days.
What are they doing?
Wandering around
Working on tasks
Looking for things
Eating, sleeping in public
How well did they do it?
Quantify normal performance
Detect/report anomalies

41
What could the system reporting look like?
42
Coarse Motion Measurement
Informedia Digital Libraries

Applying mean-shift analysis

target detection
red indicates target
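
Mean-shift analysis, as named on this slide, can be illustrated with a minimal mode-seeking sketch: a window is repeatedly moved to the mean of the points that fall inside it until it stops moving. The flat circular kernel, the bandwidth, and the sample points are hypothetical choices for illustration and are not taken from the CareMedia system.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def mean_shift(points: List[Point], start: Point, bandwidth: float,
               max_iter: int = 50, tol: float = 1e-3) -> Point:
    """Move `start` to the mean of nearby points until it settles."""
    cx, cy = start
    for _ in range(max_iter):
        near = [(x, y) for x, y in points
                if (x - cx) ** 2 + (y - cy) ** 2 <= bandwidth ** 2]
        if not near:
            break  # no points inside the window; stay where we are
        nx = sum(x for x, _ in near) / len(near)
        ny = sum(y for _, y in near) / len(near)
        shift = ((nx - cx) ** 2 + (ny - cy) ** 2) ** 0.5
        cx, cy = nx, ny
        if shift < tol:
            break
    return cx, cy


# Hypothetical "target" pixels clustered around (10, 10), plus stray noise.
target = [(9, 9), (10, 10), (11, 11), (10, 11), (10, 9)]
noise = [(30, 2), (1, 25)]
mode = mean_shift(target + noise, start=(7.0, 8.0), bandwidth=5.0)
```

The window starts off-center and converges on the densest region, which is why the technique is robust to a few stray detections.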
43
Fine Motion with Directions
Applying optical flow analysis
44
Capturing Key Events
Unobserved Aggression
Observed Elopement
45
Measure Normal Activity, Detect What's Not
46
Problem: Privacy Protection in Public Places

Block out persons who are reluctant to be captured in the video
¼ of nursing home residents deny disclosure of their images
Real-time automatic people tracking framework
Detect foreground information; adapt the background model in real time
Multi-target, multi-assignment blob matching
Apply mean shift algorithm to separate merged persons
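
The foreground detection and background adaptation steps above can be sketched with a simple running-average background model, followed by blocking out the detected pixels. This is a common baseline technique chosen for illustration; the threshold, learning rate, and function names are hypothetical, and blob matching and mean-shift separation are not shown.

```python
from typing import List

Image = List[List[float]]
Mask = List[List[bool]]


def foreground_mask(frame: Image, background: Image,
                    threshold: float = 30.0) -> Mask:
    """Mark pixels that differ from the background by more than `threshold`."""
    return [[abs(p - b) > threshold for p, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame, background)]


def update_background(frame: Image, background: Image, mask: Mask,
                      rate: float = 0.05) -> Image:
    """Blend background toward the frame; foreground pixels stay unchanged."""
    return [[b if m else (1.0 - rate) * b + rate * p
             for p, b, m in zip(f_row, b_row, m_row)]
            for f_row, b_row, m_row in zip(frame, background, mask)]


def obscure(frame: Image, mask: Mask, fill: float = 0.0) -> Image:
    """Replace foreground pixels with a flat value so the person is hidden."""
    return [[fill if m else p for p, m in zip(f_row, m_row)]
            for f_row, m_row in zip(frame, mask)]


# Hypothetical 3x3 scene: slight lighting drift plus one "person" pixel.
background = [[100.0] * 3 for _ in range(3)]
frame = [[102.0] * 3 for _ in range(3)]
frame[1][1] = 200.0

mask = foreground_mask(frame, background)
background = update_background(frame, background, mask)
blocked = obscure(frame, mask)
```

Updating the background only where no foreground was found lets the model follow gradual lighting changes without absorbing a person who stands still.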

47
Edge Motion Imaging
48
Obscured Faces
49
Problem: Monitoring in Private Spaces

Observe and monitor activity without storing video
Maintain only feature vectors; classify in real time
Record event type, time of day, duration
Detect changes in daily pattern of activity
Example: Monitor bathroom/mirror activities
What: brushing teeth, combing hair, washing hands, washing face
How: small camera behind center of mirror, mono microphone, embedded computing
Create summary: how long, how often, chart by day
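
The summary step (how long, how often, by day) needs only the recorded event type, time of day, and duration, not the video. A minimal sketch follows, assuming a hypothetical `Event` record and made-up sample events. It is not the CareMedia data format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    kind: str            # e.g. "brushing teeth"
    start: datetime      # time of day the activity began
    duration_s: float    # how long it lasted, in seconds


def daily_summary(events: List[Event]) -> Dict[Tuple[str, str], Tuple[int, float]]:
    """Map (date, kind) -> (how often, total seconds)."""
    totals = defaultdict(lambda: [0, 0.0])
    for e in events:
        key = (e.start.date().isoformat(), e.kind)
        totals[key][0] += 1
        totals[key][1] += e.duration_s
    return {key: (count, secs) for key, (count, secs) in totals.items()}


# Hypothetical classified events; no image data is kept.
events = [
    Event("washing hands", datetime(2005, 8, 1, 7, 30), 20.0),
    Event("washing hands", datetime(2005, 8, 1, 12, 5), 25.0),
    Event("brushing teeth", datetime(2005, 8, 1, 7, 35), 90.0),
    Event("brushing teeth", datetime(2005, 8, 2, 7, 40), 60.0),
]
summary = daily_summary(events)

for (day, kind), (count, secs) in sorted(summary.items()):
    print(day, kind, count, secs)
```

Comparing these per-day rows over time is one way to detect a change in the daily pattern of activity.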

50
(No Transcript)
51
Conclusions (Designed for Controversy)

The video data firehose has not yet started
We need automatic metadata extraction to handle
the volume
Manual indexing/archiving does not scale
Automatic metadata extraction will improve over
time
Provide for iterations of similar metadata with
different quality
Consider confidence for automatic metadata
Data doesn't have to be perfect to be useful and used
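
Two of the points above, keeping several iterations of the same metadata at different quality and attaching a confidence to automatic metadata, can be sketched as a record type plus a selection rule. The record fields, extractor names, and confidence values are hypothetical illustrations.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Metadata:
    segment: str       # which video segment the value describes
    field: str         # e.g. "transcript" or "vocr"
    value: str
    extractor: str     # which tool or version produced the value
    confidence: float  # 0.0 .. 1.0, as reported by the extractor


def best_per_field(items: Iterable[Metadata],
                   min_confidence: float = 0.0) -> Dict[Tuple[str, str], Metadata]:
    """Keep the most confident value per (segment, field) above a floor."""
    best: Dict[Tuple[str, str], Metadata] = {}
    for item in items:
        if item.confidence < min_confidence:
            continue
        key = (item.segment, item.field)
        if key not in best or item.confidence > best[key].confidence:
            best[key] = item
    return best


# Hypothetical: two transcript iterations of differing quality, one VOCR result.
items = [
    Metadata("seg1", "transcript", "gerry adams sin fane", "recognizer-v1", 0.55),
    Metadata("seg1", "transcript", "gerry adams sinn fein", "recognizer-v2", 0.80),
    Metadata("seg1", "vocr", "GERRY ADAMS SINN FEIN PRESIDENT", "vocr-v1", 0.90),
]
best = best_per_field(items, min_confidence=0.6)
```

Older, lower-confidence values are not deleted here, only passed over, so imperfect data stays available when nothing better exists.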

52
Future Opportunities

Instrument facilities with distributed sensors
for precision
Force sensors in chairs, beds, carpeting
RFID in clothing, utensils
Upgrade to high-resolution cameras for fine motor detection
Measure facial expressions, tremors
Conduct large-scale testbeds for validation
Comprehensive instrumentation in multiple homes
Move through lesser levels of care to expand
market
From constrained skilled care environments to
less structured assisted and independent living
>>>> Enable earlier detection and intervention
Delaying nursing home entry by 1 month saves $1.2B/year

53
People Identification
Silhouette Extraction
Classification
Person 1
Person 2
Person 3
54
Challenges of people identification

Limited training data
Imperfect feature representation (Color, Gait,
Face)
