Informedia and Health Care video archives

1
Research in Creating Video Archives and the
Potential for Health Care
Alex Hauptmann August 1, 2005
Carnegie Mellon University Pittsburgh, USA
2
Outline
  • Overview of the Informedia project
  • Metadata exchange
  • CareMedia video archives for medical
    observations

3
Informedia Project Mission
  • Enable Search and Discovery in the Video Medium
  • Automated information and metadata extraction
    from video
  • Full-content search and retrieval of spoken
    language and visual documents
  • Integration of speech, image and natural language
    understanding for library creation and
    exploration
  • Validation through user testbeds

4
Application of Diverse, Imperfect Technologies
  • Speech understanding for automatically derived
    transcripts
  • Image understanding for video paragraphing;
    face, text, object and scene recognition
  • Natural language for segmentation, query
    understanding and content summarization
  • Human computer interaction for video display,
    navigation and reuse

5
Video Search Demonstration
6
(No Transcript)
7
Informedia Metadata Extraction
Metadata Extractor
User Interface
Visualization Templates
(final representation)
Summarizer
Carnegie
8
Informedia DVL Overview
Modularize metadata extraction process.
Specify metadata exchange
interface processing synchronization
XML/XSL data representation for user customized
interfaces
9
Metadata Creation Paradigm
  • Goal: Provide a logical view of metadata creation
    modules and their logical relationships

Metadata Creation
Video-based Analysis
Segment-based Analysis
  • Face detection
  • VOCR
  • Title generation
  • Topic Assignment
  • Capitalization
  • Phrase Extraction
  • Geocoding
  • (Still object detection)
  • (Moving object detection)
  • Scene break detection
  • Black frame detection
  • Speech recognition
  • Signal-to-noise ratio

Segmentation Transcript Processing
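
Two of the video-based modules listed above, black frame detection and scene break detection, can be illustrated with a minimal sketch. This is not the Informedia implementation: the frame representation (a flat list of grayscale pixel values), the brightness threshold, and the histogram-difference test are all assumptions chosen to keep the example small.

```python
from typing import List, Sequence

# Assumed representation: a frame is a flat sequence of grayscale pixels (0..255).
Frame = Sequence[int]


def is_black_frame(frame: Frame, max_mean: float = 16.0) -> bool:
    """Treat a frame as black when its mean brightness is below max_mean."""
    return sum(frame) / len(frame) < max_mean


def histogram(frame: Frame, bins: int = 8) -> List[float]:
    """Normalized brightness histogram with equal-width buckets."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    return [c / len(frame) for c in counts]


def scene_breaks(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return indices i where frame i differs strongly from frame i - 1."""
    breaks = []
    for i in range(1, len(frames)):
        h_prev, h_cur = histogram(frames[i - 1]), histogram(frames[i])
        # The L1 distance between two normalized histograms lies in [0, 2];
        # halving it gives a score in [0, 1].
        distance = sum(abs(a - b) for a, b in zip(h_prev, h_cur)) / 2
        if distance > threshold:
            breaks.append(i)
    return breaks


frames = [[10] * 16, [12] * 16, [200] * 16, [205] * 16]
print(scene_breaks(frames), [is_black_frame(f) for f in frames])
```

Both thresholds are hypothetical and would need tuning on real footage.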
10
Informedia System Structure
11
Text and Face Detection
12
Camera and Motion Detection
Pan
Right object motion (not pan left)
13
Video OCR
Final VOCR Results: GERRY ADAMS, SINN FEIN PRESIDENT
14
Annotations and Data Export
  • Annotation fields contain metadata automatically
    derived from the content (e.g. topics, chyron)
  • Annotations are included in the index (searchable
    separately or combined with transcript)
  • Personal annotations are typed or spoken comments
    that are established on a per user basis
  • bookmarking or commentary
  • fully indexed and searchable with other data
  • Shot bookmarking implemented and tested with both
    novice and expert users
  • XML and segment metadata import/export capability
  • Conversion to MPEG-7
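
The import/export capability listed above can be pictured as a round trip through XML. The sketch below uses Python's standard `xml.etree.ElementTree`; the element and attribute names (`segment`, `transcript`, `annotation`, `field`, `source`) are hypothetical and are not the actual Informedia schema.

```python
import xml.etree.ElementTree as ET


def export_segment(segment_id, transcript, annotations):
    """Serialize one segment; all element names here are hypothetical."""
    seg = ET.Element("segment", id=str(segment_id))
    ET.SubElement(seg, "transcript").text = transcript
    for field, value, source in annotations:
        note = ET.SubElement(seg, "annotation", field=field, source=source)
        note.text = value
    return ET.tostring(seg, encoding="unicode")


def import_segment(xml_text):
    """Parse the XML produced by export_segment back into a dictionary."""
    seg = ET.fromstring(xml_text)
    return {
        "id": seg.get("id"),
        "transcript": seg.findtext("transcript"),
        "annotations": [
            (a.get("field"), a.text, a.get("source"))
            for a in seg.findall("annotation")
        ],
    }


xml_text = export_segment(
    42,
    "errorful transcript text",
    [("topic", "politics", "automatic"), ("comment", "check this shot", "user")],
)
print(import_segment(xml_text))
```

Keeping a `source` attribute on each annotation is one way to separate automatically derived fields from per-user comments while indexing both.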

15
Informedia XML Presentation Architecture
16
Efficient navigation is especially important
with video
  • Multiple levels of abstraction and summarization
  • Visual icons with relevance measure
  • One-line headlines
  • Static film strip views
  • Active video skims
  • Transcript following (even when errorful)
  • Let the eyes do the searching

17
Interfaces: Let the eyes do the searching
18
The Challenge of Extensibility (current work)
19
Informedia Current Capabilities
  • Information retrieval in both spoken language and
    video/image domains
  • Fully automated transcriptions generated entirely
    through speech recognition or with closed
    captions
  • Information summaries at varying detail, both
    visually and textually
  • Full content georeferencing of every event for
    geographic display and query
  • Extraction and reuse of video documents for
    Web-based access and presentation
  • All integrated into a user tested and validated
    interface

20
Informedia Focus
  • Allow complete access to information within
    multimedia sources
  • Generate metadata descriptions
  • Segment audio and video into meaningful segments
  • Provide abstractions for reviewing those segments
  • Improve query and browsing interfaces to this
    data
  • Iterate based on user studies

21
Digital Human Memory
  • Technology for creating a continuously recorded,
    digital, high-fidelity record of one's whole life
    in video form
  • Personal, wearable units which record audio,
    video, GPS and electronic communications,
    capturing all that is heard, seen and experienced
  • Transforming this personal history into a
    meaningful, accessible information resource with
    auto-search and auto-summarization
  • Feasible: 100 MB/h, or 1 GB/day, or 0.33 TB/year, or
    30 TB/lifetime

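The feasibility figures above can be checked with a few lines of arithmetic. The slide gives only the 100 MB/h rate and the rounded totals; the 10 recorded hours per day, the 90-year lifetime, and decimal units (1 GB = 1000 MB) are assumptions made here so the numbers line up. Under them the estimate comes to about 0.37 TB/year and about 33 TB per lifetime, the same order as the slide's rounded 0.33 TB and 30 TB.

```python
def storage_estimate(mb_per_hour=100.0, hours_per_day=10.0, years=90.0):
    """Return (GB per day, TB per year, TB per lifetime) in decimal units.

    hours_per_day and years are assumptions, not values from the slide.
    """
    gb_per_day = mb_per_hour * hours_per_day / 1000
    tb_per_year = gb_per_day * 365 / 1000
    tb_per_lifetime = tb_per_year * years
    return gb_per_day, tb_per_year, tb_per_lifetime


gb_day, tb_year, tb_life = storage_estimate()
print(f"{gb_day:.1f} GB/day, {tb_year:.3f} TB/year, {tb_life:.1f} TB/lifetime")
```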
22
Data Collection: The Vest
23
(No Transcript)
24
LSCOM: A Large Scale Concept Ontology for
Multimedia
  • Collaborative activity of three critical
    communities (Users; Library Scientists and
    Knowledge Experts; and Technical Researchers,
    Algorithm, System and Solution Designers) to
    create a user-driven concept ontology for
    analysis of video broadcast news

Lexicon and Ontology: 1000 or more concepts
25
Large Scale Concept Ontology for Multimedia
Understanding (LSCOM) Scope
Analyst User Interactions
Pre-Analyst Annotation
  • Ontology (lexicon)

Analyst Tools
  • Raw audio video (possibly plus some metadata)
  • Extractable feature descriptors (e.g., cut-rate,
    motion)

Annotation Engine
Feature Extraction
  • Search filter results

Search Engines
Terms
LSCOM Workshop SCOPE
  • Higher-level subjective interpretation

Inference Engines
Inference Engines
Features
Annotation metadata
Maximizes constraints such as computability,
utility, reusability, compatibility (e.g., Cyc,
OWL, etc.). Inference engines may be
rule-based, statistically based, hard-wired, etc.
26
The Power of an Ontology
  • An explicit formal specification of how to
    represent the objects, concepts and other
    entities that are assumed to exist in some area
    of interest and the relationships that hold among
    them.
  • Descriptive power can be achieved if a small
    number of primitives can be combined using a few
    composition operators and a limited number of
    relations to form multiple threads that generate
    a large number of complex concepts
  • This compositional structure leads to a divide
    and conquer strategy that makes it possible to
    make progress on several fronts simultaneously
  • Different research groups can focus on different
    concepts
  • Primitive concept recognition methods can be
    shared and reused
  • Composite concepts can be used as parts of other
    concepts
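
The compositional idea above (a few primitives, a few composition operators, composites reused inside other composites) can be sketched in a few lines. The concept labels and the two operators `all_of` and `any_of` are hypothetical illustrations, not part of LSCOM.

```python
from typing import Callable, Set

# A detector maps the set of labels observed in a shot to True or False.
Detector = Callable[[Set[str]], bool]


def primitive(label: str) -> Detector:
    """A primitive concept fires when its label was observed."""
    return lambda observed: label in observed


def all_of(*parts: Detector) -> Detector:
    """Composition operator: every part must fire."""
    return lambda observed: all(part(observed) for part in parts)


def any_of(*parts: Detector) -> Detector:
    """Composition operator: at least one part must fire."""
    return lambda observed: any(part(observed) for part in parts)


# Hypothetical primitives; separate groups could build and share each one.
person = primitive("person")
smiling = primitive("smiling_face")
walking = primitive("walking")
outdoors = primitive("outdoors")

# A composite concept, then a larger composite that reuses it as a part.
happy_person = all_of(person, smiling)
happy_person_on_the_move = all_of(happy_person, any_of(walking, outdoors))

print(happy_person_on_the_move({"person", "smiling_face", "walking"}))
```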

27
What an Ontology with Background Knowledge and
Inference can give us
  • Query: Someone smiling

(?x) (feelsEmotion x Happiness Positive)
  • Caption: A man watching his daughter take
    her first step

(?x,y) (and (father x y) (gender x Female) (sees
x y) (walking
28
Broadcast News Video Content Description Ontology
  • Why the Focus on Broadcast News Domain?
  • Critical mass of users, content providers,
    applications
  • Good content availability (TRECVID, LDC, FBIS)
  • Shares large set of core concepts with other
    domains
  • Ontology Formalism
  • Entity-Relationship (E-R) Graphs
  • RDF, DAML / DAML+OIL, W3C OWL, CycL
  • MPEG-7, MediaNet, VEML
  • Seed Representations
  • TRECVID-2003 News Lexicon (Annotation Forum)
  • Library of Congress TGM
  • CYC knowledge representation (ontology)
  • CNN, BBC Classification Scheme, TVAnytime,
    Comstock,

MPEG-7 Video Annotation Tool
29
MPEG-7 for Metadata Exchange
Multimedia Content Description Interface
Standard
  • Standardize a framework for describing
    audio-visual content
  • Describe different aspects of multimedia
    documents at different abstraction levels
  • Create descriptions to form the basis of
    applications like search, filtering and browsing
    multimedia content
  • Does NOT specify video compression or
    transmission
  • MPEG-7 descriptions live in separate files from
    the video
  • Extensible for new description schemas

30
What is MPEG-7
Four Types of Normative Elements
  • Descriptors (Ds)
  • Primarily to describe low-level audio or visual
    features
  • Description Schemes (DSs)
  • Describe higher-level AV features such as
    regions, segments, objects, events and other
    immutable metadata related to creation and
    production, usage, and so forth
  • Description Definition Language (DDL)
  • Allow specifying new description schemes and
    descriptors
  • Coding Schemes
  • Specify how to code the needed descriptions to
    satisfy the compression and the transmission
    requirements

31
MPEG-7 Application Chain
32
Example
33
Simple Example
<?xml version="1.0" encoding="iso-8859-1"?>
<Mpeg7 xmlns="urn:mpeg:mpeg7:schema:2001"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:mpeg7="urn:mpeg:mpeg7:schema:2001"
       xsi:schemaLocation="urn:mpeg:mpeg7:schema:2001 .\Mpeg7-2001.xsd">
  <Description xsi:type="ContentEntityType">
    <DescriptionMetadata>
      <LastUpdate>2001-04-06T00:00:00+00:00</LastUpdate>
    </DescriptionMetadata>
    <CreationInformation id="179650">
      <Creation>
        <Title xml:lang="en">CNN World Today - 09-28-2000</Title>
        <Abstract>
          <FreeTextAnnotation>
            Today, .......
          </FreeTextAnnotation>
        </Abstract>
        <CreationCoordinates>
          <CreationLocation>
            <Name xml:lang="es">CNN</Name>
            <Country>us</Country>
            <AdministrativeUnit>New York</AdministrativeUnit>

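Because MPEG-7 descriptions are plain XML, a generic XML parser is enough to read fields such as the title back out. The sketch below uses Python's standard `xml.etree.ElementTree` on a trimmed copy of the example above. The closing tags were added here so the fragment parses (the slide's example is cut off), and `title_and_location` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the example; "m" is an arbitrary local prefix.
MPEG7_NS = {"m": "urn:mpeg:mpeg7:schema:2001"}

# Trimmed copy of the slide's example, closed so that it is well-formed.
SAMPLE = """\
<Mpeg7 xmlns="urn:mpeg:mpeg7:schema:2001"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Description xsi:type="ContentEntityType">
    <CreationInformation id="179650">
      <Creation>
        <Title xml:lang="en">CNN World Today - 09-28-2000</Title>
        <CreationCoordinates>
          <CreationLocation>
            <Name xml:lang="es">CNN</Name>
            <Country>us</Country>
          </CreationLocation>
        </CreationCoordinates>
      </Creation>
    </CreationInformation>
  </Description>
</Mpeg7>
"""


def title_and_location(xml_text):
    """Return (title, creation location name) from an MPEG-7 description."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//m:Title", namespaces=MPEG7_NS)
    place = root.findtext(".//m:CreationLocation/m:Name", namespaces=MPEG7_NS)
    return title, place


print(title_and_location(SAMPLE))
```

The default namespace applies to every element, so each path step needs the prefix; an unprefixed `.//Title` would find nothing.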
34
More Information about MPEG-7
  • MPEG: http://www.cselt.it/mpeg
  • MPEG-7 Industry Forum: http://www.mpeg7.org
  • www
  • Overview: www.telecomitalialab.com/mpeg/standards/mpeg-7/mpeg-7.htm
  • Scheme: http://pmedia.i2.ibm.com:8000/mpeg7/schema/
  • MDS: http://www.mpeg7.ee.columbia.edu/
  • DDL: http://archive.dstc.edu.au/mpeg7-ddl/
  • XM: http://www.lis.ei.tum.de/research/bv/topics/mmdb/e_mpeg7.html

35
CareMedia: Behavior Observations in a Nursing
Home
36
CareMedia: Automated Behavior Analysis in the
Nursing Home
  • Primary objective is improved quality of assisted
    care
  • Example: automating detection of behavioral and
    psychological symptoms of dementia (BPSD)
  • Apply more broadly to monitoring and maintaining
    the quality of life
  • Ultimately, make automated, quantitative
    measurements to:
  • Explore relation of symptoms to environments in
    which they occur
  • Evaluate symptoms longitudinally
  • Determine the frequency of symptoms
  • Develop patient profiles of responses to
    treatment interventions
  • >>>> Enable earlier intervention to sustain
    quality of life

37
Applications in the Nursing Home
  • Clinical/Research
  • Tracking patient behavior and incidents in
    long-term care facilities
  • e.g., disruptive vocalizations, falls
  • recording patient mobility and activity levels
  • Correlating with time of day, location and
    environmental factors
  • Observing effects of drugs on individuals and
    groups
  • Patient
  • Cognitive assist - reminding, alerting and
    summoning help
  • Staff training
  • Analysis of video records of incidents used for
    training
  • Management
  • Monitoring and documenting compliance

38
What is Presently Measured by Humans
  • The Pittsburgh Agitation Scale
  • Aberrant Vocalizations
    (repetitious requests or
    complaints, non-verbal vocalizations, e.g.,
    moaning)
  • Motor Agitation
    (pacing, wandering,
    rocking in chair)
  • Aggressiveness
    (vocal threats,
    threatening gestures)
  • Resisting Care
    (pushing away to
    avoid tasks)

39
What are we trying to detect and measure?
  • Person Tracking
  • Person Identification
  • Gross Motor Behavior
    (broad area, e.g.,
    walking/gait, falling, wheelchair motion)
  • Small Motion Behavior
    (task-specific,
    e.g., eating, washing hands, combing hair)
  • Fine Motion Behaviors
    (close-in, e.g.,
    twitches, tremors, eye-blinking, frowns)
  • Social Interactions
  • Gradual Trends over Time
  • Rare Events
  • Required in processing: Obfuscation (for privacy
    protection)

40
CareMedia: What are the observables?
  • Who?
  • Identify people across cameras, days.
  • What are they doing?
  • Wandering around
  • Working on tasks
  • Looking for things
  • Eating, sleeping in public
  • How well did they do it?
  • Quantify normal performance
  • Detect/report anomalies

41
What could the system reporting look like?
42
Coarse Motion Measurement
Informedia Digital Libraries
  • Applying mean-shift analysis

target detection
red indicates target
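
Mean-shift analysis, named on this slide, repeatedly moves an estimate to the mean of the samples that fall inside a window around it until it settles on a local density peak. The sketch below is a minimal flat-kernel version over 2D points; the point data, the bandwidth, and the function name are assumptions, and real tracking would run on image features rather than raw coordinates.

```python
import math


def mean_shift_mode(points, start, bandwidth, max_iter=100, tol=1e-6):
    """Climb from `start` to a nearby density peak using a flat kernel."""
    x, y = start
    for _ in range(max_iter):
        # Samples inside the circular window around the current estimate.
        near = [(px, py) for px, py in points
                if math.hypot(px - x, py - y) <= bandwidth]
        if not near:
            break  # empty window: nothing to move toward
        new_x = sum(p[0] for p in near) / len(near)
        new_y = sum(p[1] for p in near) / len(near)
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break  # converged
    return x, y


# Two hypothetical clusters of target pixels.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 10), (11, 10), (10, 11), (11, 11)]
print(mean_shift_mode(points, start=(2, 2), bandwidth=3))
print(mean_shift_mode(points, start=(9, 9), bandwidth=3))
```

Starting points near different clusters converge to different peaks, which is the property that lets the method separate nearby targets.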
43
Fine Motion with Directions
Applying optical flow analysis
44
Capturing Key Events
Unobserved Aggression
Observed Elopement
45
Measure Normal Activity, Detect What's Not
46
Problem: Privacy Protection in Public Places
  • Block out people who are reluctant to be
    captured on video
  • ¼ of nursing home residents deny disclosure of
    their images
  • Real-time automatic people tracking framework
  • Detect foreground information; adapt for
    real-time background
  • Multi-target, multi-assignment blob matching
  • Apply mean shift algorithm to separate merged
    persons
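
The first tracking step above, detecting foreground against an adapting background, can be sketched with a running-average background model. This is a generic illustration and not the CareMedia tracker: frames are assumed to be flat lists of grayscale values, and `alpha`, `threshold`, and the function names are hypothetical. Blob matching and mean-shift separation are not covered here.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background so slow changes are absorbed."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(background, frame)]


def foreground_mask(background, frame, threshold=30):
    """Mark pixels that differ from the background by more than threshold."""
    return [abs(f - b) > threshold for b, f in zip(background, frame)]


def obscure(frame, mask, fill=0):
    """Blank out masked pixels, e.g. for a person who declined to be recorded."""
    return [fill if m else f for f, m in zip(frame, mask)]


background = [100, 100, 100, 100]
frame = [100, 180, 102, 20]
mask = foreground_mask(background, frame)
print(mask, obscure(frame, mask))
background = update_background(background, frame)
```

A small `alpha` lets the model follow lighting changes over minutes while a person walking through the scene still stands out as foreground.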

47
Edge Motion Imaging
48
Obscured Faces
49
Problem: Monitoring in Private Spaces
  • Observe and monitor activity without storing
    video
  • Maintain only feature vectors; classify in
    real time
  • Record event type, time of day, duration
  • Detect changes in daily pattern of activity
  • Example: Monitor bathroom/mirror activities
  • What: brushing teeth, combing hair, washing
    hands, washing face
  • How: small camera behind center of mirror, mono
    microphone, embedded computing
  • Create summary
  • how long, how often, chart by day
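
The record-keeping described above (event type, time of day, duration, then a per-day summary) needs no stored video at all. A minimal sketch, assuming each classified event arrives as a small record; the `Event` fields and the `daily_summary` name are hypothetical:

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    kind: str          # e.g. "brushing teeth" (output of the classifier)
    start: datetime    # time of day the activity began
    duration_s: float  # how long it lasted, in seconds


def daily_summary(events):
    """Return {(ISO date, kind): {"count": n, "total_s": seconds}}."""
    summary = defaultdict(lambda: {"count": 0, "total_s": 0.0})
    for event in events:
        entry = summary[(event.start.date().isoformat(), event.kind)]
        entry["count"] += 1
        entry["total_s"] += event.duration_s
    return dict(summary)


events = [
    Event("brushing teeth", datetime(2005, 8, 1, 7, 30), 95.0),
    Event("washing hands", datetime(2005, 8, 1, 7, 32), 20.0),
    Event("washing hands", datetime(2005, 8, 1, 12, 5), 25.0),
    Event("brushing teeth", datetime(2005, 8, 2, 7, 41), 60.0),
]
for key, value in sorted(daily_summary(events).items()):
    print(key, value)
```

Comparing these per-day counts and totals over weeks is one simple way to notice a change in the daily pattern of activity.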

50
(No Transcript)
51
Conclusions (Designed for Controversy)
  • The video data firehose has not yet started
  • We need automatic metadata extraction to handle
    the volume
  • Manual indexing/archiving does not scale
  • Automatic metadata extraction will improve over
    time
  • Provide for iterations of similar metadata with
    different quality
  • Consider confidence for automatic metadata
  • Data doesn't have to be perfect to be useful and
    used

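Two of the conclusions above, keeping several iterations of the same metadata and attaching a confidence to automatic values, suggest a simple record layout. The sketch below is one possible shape; the field names, the confidence scale of 0 to 1, and the `best` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class MetadataItem:
    field: str         # what is described, e.g. "transcript"
    value: str         # the extracted value
    source: str        # which extractor (or person) produced it
    confidence: float  # assumed scale: 0.0 (guess) to 1.0 (verified)


def best(items: Iterable[MetadataItem], field: str,
         min_confidence: float = 0.0) -> Optional[MetadataItem]:
    """Pick the most confident version of a field, or None if none qualifies."""
    candidates = [item for item in items
                  if item.field == field and item.confidence >= min_confidence]
    return max(candidates, key=lambda item: item.confidence, default=None)


# Two iterations of the same field from extractors of different quality.
items = [
    MetadataItem("transcript", "jerry adams sin fane", "asr-v1", 0.55),
    MetadataItem("transcript", "gerry adams sinn fein", "asr-v2", 0.80),
]
print(best(items, "transcript"))
```

Older, lower-confidence versions stay in the store, so imperfect data remains usable until a better extractor replaces it.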
52
Future Opportunities
  • Instrument facilities with distributed sensors
    for precision
  • Force sensors in chairs, beds, carpeting
  • RFID in clothing, utensils
  • Upgrade to hi-resolution cameras for fine motor
    detection
  • Measure facial expressions, tremors
  • Conduct large-scale testbeds for validation
  • Comprehensive instrumentation in multiple homes
  • Move through lesser levels of care to expand
    market
  • From constrained skilled care environments to
    less structured assisted and independent living
  • >>>> Enable earlier detection and intervention
  • Delaying nursing home entry by 1 month saves
    $1.2B/year

53
People Identification
Silhouette Extraction
Classification
Person 1
Person 2
Person 3
54
Challenges of people identification
  • Limited training data
  • Imperfect feature representation (Color, Gait,
    Face)