Automatic Caption Localization in Compressed Video

1
Automatic Caption Localization in Compressed Video
  • By Yu Zhong, Hongjiang Zhang, and Anil K. Jain,
    Fellow, IEEE
  • IEEE Transactions on Pattern Analysis and Machine
    Intelligence
  • Vol 22, No. 4, April 2000

2
Introduction
  • Caption text on video
  • General methods for caption extraction
  • Proposed Method
  • How it works
  • Evaluation

3
Caption Text on Video
  • Parsing, indexing, and abstracting video
  • Caption text
  • Carries information about the video
  • Describes the content
  • Captures highlights

4
General Extraction Methods
  • Component-based
  • Geometrical arrangement
  • Homogeneous color
  • Texture-based
  • Contrast with the background
  • Horizontal intensity variation

5
  • Most published methods
  • Applied to uncompressed images
  • Digital video and images
  • Compressed (MPEG, JPEG)
  • DCT (Discrete Cosine Transform) coding
  • Reducing interframe redundancy (for MPEG)

6
Proposed Method
  • Steps 1 and 2
  • Detecting Blocks
  • Step 3
  • Refinement
  • Step 4
  • Segmentation of rows

Diagram labels: Step 1, Step 2, Step 3, Step 4
7
Proposed Method
Source frame
8
Steps 1 and 2: Detecting Blocks of High Horizontal Spatial Intensity Variation
  • Operates in DCT domain
  • Not necessary to decompress
  • Unit: 8x8 blocks in I-frames (intra-coded)
  • Quantized DCT coefficients
  • Readily extracted
  • Fast

DCT blocks with high horizontal intensity
variation
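
The block detection above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each 8x8 block is available as a nested list `block[v][u]` of quantized DCT coefficients (`v` is vertical frequency, `u` is horizontal frequency), and the names `make_block`, `horizontal_energy`, `detect_candidate_blocks`, the coefficient range, and the threshold are all hypothetical choices made for the example.

```python
def make_block(coeffs):
    """Build an 8x8 DCT block from {(v, u): value}.

    Assumed layout: v is the vertical frequency, u the horizontal frequency.
    """
    block = [[0] * 8 for _ in range(8)]
    for (v, u), value in coeffs.items():
        block[v][u] = value
    return block


def horizontal_energy(block, u_lo=1, u_hi=7):
    """Sum of |coefficient| over the horizontal harmonics (v = 0, u_lo <= u <= u_hi).

    The range u_lo..u_hi is an assumption for this sketch, not a value from the slides.
    """
    return sum(abs(block[0][u]) for u in range(u_lo, u_hi + 1))


def detect_candidate_blocks(dct_frame, threshold):
    """Mark every block whose horizontal energy reaches the (assumed) threshold.

    dct_frame is a grid (list of rows) of 8x8 coefficient blocks.
    """
    return [[horizontal_energy(b) >= threshold for b in row] for row in dct_frame]


# A flat block (DC only) versus a block with strong horizontal variation.
flat = make_block({(0, 0): 100})
texty = make_block({(0, 0): 100, (0, 1): 40, (0, 3): -25})
candidates = detect_candidate_blocks([[flat, texty]], threshold=50)
# candidates == [[False, True]]
```

Because only coefficients already present in the compressed stream are summed, no inverse DCT is needed, which is where the speed advantage noted on the slide comes from.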
9
Step 3: Remove Noise by Applying Morphological Operations
  • Steps 1 and 2
  • May pick up high-contrast non-text blocks
  • May leave text blocks disconnected
  • Causes: wide spacing, low contrast, large fonts
  • Step 3
  • Removes most isolated blocks
  • Merges nearby blocks

Applying Morphological Operations
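
One way to picture this clean-up is a closing followed by an opening on each row of the block mask. The sketch below assumes a 1x3 horizontal structuring element and this particular order of operations; both are illustrative assumptions, and the function names are hypothetical.

```python
def dilate(row):
    """1x3 dilation: a cell is set if it or a horizontal neighbour is set."""
    n = len(row)
    return [any(row[j] for j in (i - 1, i, i + 1) if 0 <= j < n) for i in range(n)]


def erode(row):
    """1x3 erosion: a cell survives only if it and both neighbours are set.

    Cells outside the row count as unset.
    """
    n = len(row)
    return [all(0 <= j < n and row[j] for j in (i - 1, i, i + 1)) for i in range(n)]


def clean_row(row):
    """Closing (merge nearby blocks), then opening (drop isolated blocks)."""
    padded = [False] + list(row) + [False]  # padding keeps border behaviour consistent
    closed = erode(dilate(padded))
    opened = dilate(erode(closed))
    return opened[1:-1]


def clean_mask(mask):
    """Apply the row-wise clean-up to every row of a block mask."""
    return [clean_row(row) for row in mask]


# One isolated block (index 1) and two text fragments separated by a one-block gap.
noisy = [bool(x) for x in [0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0]]
cleaned = clean_row(noisy)
# cleaned marks indices 5..9 only: the gap is filled and the isolated block is gone.
```

With a 1x3 element, any run shorter than three blocks that is not merged by the closing is removed by the opening, so the element size directly sets the smallest caption width that survives.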
10
Step 4: Segmentation Based on Vertical Intensity Variation
  • Detected text regions
  • Large vertical intensity variation
  • Local vertical harmonics
  • Corresponding row of text
  • High vertical spectrum energy

After horizontal/vertical text energy test
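
The vertical energy test can be sketched in the same style as the horizontal one. The version below is only an assumption about how such a test might look: it averages the vertical-harmonic energy over the candidate blocks in each block row and keeps the row if the average reaches a threshold. The coefficient layout `block[v][u]`, the range of harmonics, the averaging, the threshold, and all function names are hypothetical.

```python
def make_block(coeffs):
    """Build an 8x8 DCT block from {(v, u): value}; v = vertical, u = horizontal frequency."""
    block = [[0] * 8 for _ in range(8)]
    for (v, u), value in coeffs.items():
        block[v][u] = value
    return block


def vertical_energy(block, v_lo=1, v_hi=7):
    """Sum of |coefficient| over the vertical harmonics (u = 0, v_lo <= v <= v_hi)."""
    return sum(abs(block[v][0]) for v in range(v_lo, v_hi + 1))


def filter_rows_by_vertical_energy(dct_frame, mask, threshold):
    """Keep the candidate blocks of a block row only if their mean vertical energy is high."""
    result = []
    for blocks, flags in zip(dct_frame, mask):
        energies = [vertical_energy(b) for b, flagged in zip(blocks, flags) if flagged]
        keep = bool(energies) and sum(energies) / len(energies) >= threshold
        result.append([flagged and keep for flagged in flags])
    return result


strong = make_block({(0, 1): 40, (1, 0): 30})  # vertical energy 30
weak = make_block({(0, 1): 40, (2, 0): 5})     # vertical energy 5
frame = [[strong, strong], [weak, weak]]
mask = [[True, True], [True, True]]
refined = filter_rows_by_vertical_energy(frame, mask, threshold=20)
# refined == [[True, True], [False, False]]
```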
11
Dilating the previous result by one block
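
A one-block dilation of the mask might look like the sketch below. Whether diagonal neighbours are included is not stated on the slide; this example assumes a 3x3 neighbourhood, and `dilate_one_block` is a hypothetical name.

```python
def dilate_one_block(mask):
    """Grow every marked block by one block in all directions (assumed 3x3 neighbourhood)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                # Mark the block and its in-bounds neighbours.
                for rr in range(max(0, r - 1), min(rows, r + 2)):
                    for cc in range(max(0, c - 1), min(cols, c + 2)):
                        out[rr][cc] = True
    return out


single = [
    [False, False, False, False],
    [False, True, False, False],
    [False, False, False, False],
]
grown = dilate_one_block(single)
# Columns 0..2 are now marked in every row; column 3 stays unmarked.
```

Growing the region this way gives the detected caption a one-block margin, so character edges that fall in neighbouring blocks are not cut off.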
12
Evaluation
  • Does not work properly with
  • Very big characters
  • Too widely spaced text
  • Image texture

13
Caption Text on Video
  • Parsing, indexing, and abstracting video
  • Caption text
  • Carries information about the video
  • Describes the content
  • Captures highlights

14
Evaluation
  • Commonly used captions
  • NOT very big characters
  • NOT too widely spaced text
  • NOT image texture
  • Therefore, the important information is retrieved!

15
Evaluation
  • Future work
  • Extend the proposed method to other transform-based compression schemes
  • Also use color information to improve accuracy
  • Combine DCT blocks to support larger fonts
  • Find a solution for P- and B-frames

16
Summary
  • Proposed caption localization method
  • For compressed video
  • Fast
  • Further development is needed to improve
  • Accuracy
  • Support other compression methods