Multi-Media Computing - PowerPoint PPT Presentation

Slides: 31

Provided by: kenk99

Transcript and Presenter's Notes

1
Multi-Media Computing
2
Expected outcome after this lecture

After this lecture, students are expected to be
able to
Explain how audio/video are stored digitally
Perform calculations of the storage requirement
for uncompressed audio/video, as well as
bitrate and compression ratio
Explain the differences and respective advantages
of lossy and lossless compression
Name commonly used media formats for audio, image
& video
Suggest commonly used audio/video configurations
(resolution, color depth, sampling frequency, etc.)

3
Multimedia components

Typical computer systems represent information by
means of text and graphics.
Multimedia: apart from text and graphics, it also
consists of
Audio
Still image
Video sequence
Computer Animation (e.g. 3D)

4
Audio - How can audio be stored in a computer?

Audio is originally an analog waveform signal
Analog data cannot be stored in a computer
directly.
Two steps change an analog signal to digital:
Sampling
Quantization

5
Sampling

Measure the magnitude of the audio signal multiple
times per second, at a fixed interval
The audio signal now has discrete timing and is
called Pulse-Amplitude-Modulation (PAM)
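
As a rough illustration of sampling, the sketch below measures a signal at a fixed interval. The function names and the 1000 Hz test tone are made up for this example; a real system samples with hardware (an analog-to-digital converter), not with a Python function.

```python
import math

def sample_signal(signal, sample_rate_hz, duration_s):
    """Measure signal(t) once every 1/sample_rate_hz seconds."""
    count = round(sample_rate_hz * duration_s)
    return [signal(n / sample_rate_hz) for n in range(count)]

def tone_1khz(t):
    # Hypothetical "analog" input: a 1000 Hz sine wave with amplitude 1.0
    return math.sin(2 * math.pi * 1000 * t)

# 1 ms of signal sampled at 8000 Hz gives 8 discrete-time samples
samples = sample_signal(tone_1khz, 8000, 0.001)
print(len(samples))
print([round(s, 3) for s in samples])
```

The sample values are still real numbers at this point; only the timing has become discrete.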

6
Sampling: How many times per second?

The optimal sampling rate depends on the
characteristics of the audio signal.
Too small: Bad psychoacoustic perception
Too large: Unnecessary waste of storage space

7
Sampling: How many times per second?

In order to store all audible sound, the sampling
frequency should be at least twice the highest
audible frequency (about 20000 Hz). Audio CD uses
a sampling rate of 44100 Hz
Some very high quality (studio quality) audio
uses a sampling rate of 48000 Hz to 96000 Hz
For voice/speech signals, a smaller sampling rate
is good enough (e.g. telephone uses 8000 Hz)

8
Second step: Quantization

Sampling is the 1st step for analog -> digital
conversion
The second step is quantization, which maps the
magnitude of the PAM signal to the nearest valid
number
(e.g. 36.45123 to 36, -124.5678 to -125)
Finally, the digital signal is represented in binary
form, which is called Pulse-Code-Modulation
(PCM)
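
The two examples above can be reproduced with a minimal sketch. It assumes that the "valid numbers" are integers and that the binary form is 16-bit two's complement; the slides do not specify either, and the function names are invented for illustration. Note that Python's round() sends exact .5 ties to the nearest even integer.

```python
def quantize(pam_samples):
    """Map each sample magnitude to the nearest integer level."""
    return [round(x) for x in pam_samples]

def to_pcm_bits(level, bits=16):
    """Binary code of one quantized level (assumed two's complement)."""
    if not -(1 << (bits - 1)) <= level < (1 << (bits - 1)):
        raise ValueError("level does not fit in the given number of bits")
    return format(level & ((1 << bits) - 1), "0{}b".format(bits))

levels = quantize([36.45123, -124.5678])
print(levels)                             # [36, -125]
print([to_pcm_bits(v) for v in levels])   # ['0000000000100100', '1111111110000011']
```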

9
Audio samples

How many bits are used to store a single sample?
CD uses 16 bits (65536 steps, high quality)
Lower quality music uses 8 bits (256 steps) per
sample
Speech samples may use as few as 6 bits
Number of channels
Audio CD is stereo (2 channels: Left, Right)
Telephone is mono (1 channel)
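
The step counts quoted for each bit depth follow from the fact that n bits can represent 2^n distinct values. A small sketch (the function name is made up here):

```python
def quantization_steps(bits_per_sample):
    """Number of distinct values representable with the given bit depth."""
    return 2 ** bits_per_sample

for bits in (16, 8, 6):
    print(bits, "bits ->", quantization_steps(bits), "steps")
```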

10
Audio samples

Some high definition audio uses 5.1 channels:
Left, Right, Center, Left Surround, Right
Surround
and a bass enhancement channel (LFE)
Newer sound cards even support 7.1 channels:
5.1 + Center Rear + Top

11
How much storage space is required?

Consider a 3-minute stereo music signal with 16
bits per sample and a sampling rate of 44100 Hz.
16 bits = 2 bytes per sample
1 second uses 44100 x 2 channels x 2 bytes
= 176,400 bytes
3 minutes = 176,400 x 60 x 3 = 31,752,000 bytes
≈ 31.75 MB
(Note: MB is megabyte, Mb is megabit)
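
The same calculation can be written as a small helper. This is only a sketch: the function name is invented, and 1 MB is taken as 1,000,000 bytes to match the figure above.

```python
def pcm_size_bytes(seconds, sample_rate_hz, channels, bits_per_sample):
    """Size of uncompressed PCM audio in bytes."""
    total_bits = seconds * sample_rate_hz * channels * bits_per_sample
    return total_bits // 8

size = pcm_size_bytes(3 * 60, 44100, 2, 16)
print(size)               # 31752000
print(size / 1_000_000)   # 31.752
```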

12
Audio Compression

Uncompressed audio data takes up a large amount of
space (3-minute music ≈ 32 MB!)
Compression is used to reduce the amount of space
required (especially for hand-held devices)
Compression ratio
Ratio between the original size and the compressed
size
e.g. 32 MB PCM signal compressed to 8 MB
Compression ratio = 32 : 8 = 4 : 1
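
A minimal sketch of the ratio, together with the corresponding saving in file size. Both function names are made up for this example; the sizes may be in any unit as long as both use the same one.

```python
def compression_ratio(original_size, compressed_size):
    """How many times smaller the compressed data is (4.0 means 4:1)."""
    return original_size / compressed_size

def space_saving(original_size, compressed_size):
    """Fraction of the original size that was saved."""
    return 1 - compressed_size / original_size

print(compression_ratio(32, 8))   # 4.0
print(space_saving(32, 8))        # 0.75
```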

13
Lossless audio compression

Audio compression can be lossless or lossy
Lossless: The decompressed audio signal is 100%
identical to the original one
Advantage
Best audio quality
Disadvantage
The saving (percentage) of file size is not
large.
Typical compression ratio is about 2:1 to 3:1

14
Lossy compression

The decompressed audio signal is not identical to
the original one
But the differences are almost undetectable by the
human ear (psychoacoustic perception is almost
the same)
Higher compression ratio (usually 10:1 to 36:1)
Allows specifying the final compressed file size
at the cost of audio quality.

15
Bitrate

Compressed audio / video is usually
described by bitrate rather than compression
ratio
Bitrate: The number of bits per 1 second of
audio/video
e.g. For stereo PCM data (44100 Hz, 16 bit)
Bitrate = 44100 x 2 channels x 16 bits
= 1,411,200 bps (bits per second)
≈ 1411 kbps or 1.4 Mbps
If the audio signal is compressed to mp3 at
128 kbps: Compression ratio = 1411 : 128 ≈ 11.0 : 1
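
The bitrate arithmetic can be sketched as follows. The helper names are hypothetical, and 128 kbps is taken to mean 128,000 bits per second. The second helper shows how a chosen bitrate fixes the compressed file size.

```python
def pcm_bitrate_bps(sample_rate_hz, channels, bits_per_sample):
    """Bits needed for one second of uncompressed PCM audio."""
    return sample_rate_hz * channels * bits_per_sample

def size_at_bitrate_bytes(bitrate_bps, seconds):
    """File size implied by a constant bitrate."""
    return bitrate_bps * seconds // 8

cd_bitrate = pcm_bitrate_bps(44100, 2, 16)
print(cd_bitrate)                            # 1411200
print(round(cd_bitrate / 128_000, 1))        # 11.0
print(size_at_bitrate_bytes(128_000, 180))   # 2880000 bytes for 3 minutes
```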

16
Basics for lossy audio compression

Audio data is basically a waveform.
The audio data is said to be in the 'Time Domain'
The basic idea of audio compression is to change
audio from time domain to frequency domain
through mathematical manipulation

[Figure: the audio waveform on a time axis (0 ms, 1 ms, 2 ms) and the same signal on a frequency axis (1000 Hz)]
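
To make the change of domain concrete, the sketch below takes 2 ms of a 1000 Hz tone and locates its peak in the frequency domain. A plain discrete Fourier transform is used purely for illustration; it is not claimed to be the transform used by any particular codec, and the 8000 Hz sampling rate is an assumption.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Naive discrete Fourier transform; returns |X[k]| for each bin k."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(samples)))
        for k in range(n)
    ]

rate = 8000   # assumed sampling rate in Hz
n = 16        # 16 samples at 8000 Hz = 2 ms of audio
tone = [math.sin(2 * math.pi * 1000 * i / rate) for i in range(n)]

mags = dft_magnitudes(tone)
half = mags[: n // 2 + 1]   # bins from 0 Hz up to rate / 2
peak_bin = max(range(len(half)), key=lambda k: half[k])
print(peak_bin * rate / n)  # 1000.0
```

Sixteen time-domain numbers collapse to a single significant frequency component (plus its mirror image), which is the compactness that compression relies on.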
17
Mathematical transform

The mathematical operation which changes a signal
from time domain to frequency domain is called a
Transform
A transform guarantees one-to-one mapping and is
usually reversible
After the transform, the audio signal can be
represented in a more compact manner.
Two most common transforms:
DCT (Discrete Cosine Transform)
DWT (Discrete Wavelet Transform)
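
The reversibility and compactness claims can be checked with a minimal, unnormalised DCT-II and its inverse. This is a sketch only: real codecs use scaled and much faster variants, and the eight sample values are made up.

```python
import math

def dct(x):
    """Unnormalised DCT-II of a list of samples."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
            for k in range(n)]

def idct(coeffs):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coeffs)
    return [coeffs[0] / n
            + 2 / n * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
                          for k in range(1, n))
            for i in range(n)]

block = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]  # smooth, made-up samples
coeffs = dct(block)
restored = idct(coeffs)

print([round(c, 2) for c in coeffs])    # most of the energy sits in the first two values
print([round(v, 6) for v in restored])  # matches the original block
```

Because the input is smooth, only a few coefficients are large; the rest are small or zero and are cheap to store.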

18
Commonly used audio formats

MPEG I/II (for high fidelity audio)
Layer I .mp1 (for satellite broadcast)
Layer II .mp2 (audio in VCD)
Layer III .mp3 (audio transfer / sharing on the
internet)
Real Audio (.ra) (for high fidelity audio &
speech)
For streaming audio on the internet
Windows Media Audio (.wma)
Ogg Vorbis (.ogg): Open Source Format
Monkey's Audio (.ape): Lossless audio compression

19
Still Images

An image is represented as a collection of color
dots (pixel = picture element)
Pixel depth: Number of bits to represent a pixel
From 1 bit (monochrome) up to 24 bits (true color)
For 24-bit images, a pixel is divided into 3
channels
8 bits each for Red / Green / Blue
But for image compression, it is a common
practice to convert the image from RGB to YUV or YCbCr
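
One common form of this conversion is the full-range BT.601 variant used in JPEG/JFIF files, sketched below. Other standards use slightly different coefficients and ranges, so treat the constants as one example rather than the only definition; the function name is made up.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 RGB -> YCbCr (JFIF style); inputs and outputs are 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round to integers and clamp into the 8-bit range
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))

print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128)
print(rgb_to_ycbcr(0, 0, 0))        # (0, 128, 128)
print(rgb_to_ycbcr(255, 0, 0))      # (76, 85, 255)
```

Y carries brightness while Cb and Cr carry color, which lets a codec treat the two kinds of information differently.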

20
Lossless Image compression

Similar to audio compression, image compression
can be lossy or lossless.
Lossless formats
.BMP (Windows Bitmap)
.PCX (ZSoft PC Paintbrush)
.GIF (Supports only up to 256 colors)
Supports transparency & animation
Extensively used on the internet
.PNG (Portable Network Graphics)
As a replacement for GIF
The format itself is lossless; loss occurs only if
the image is reduced (e.g. to fewer colors) before
saving

21
Lossy Image compression

Similar psychovisual perception at a very high
compression ratio.
Common lossy formats
JPEG (Joint Photographic Experts Group)
JPEG2000 (The new, better JPEG standard)
Similar to audio compression, these also make use of
a mathematical transform to change to the frequency
domain
Audio: Audio Sample (Time Domain) -> Frequency
Domain
Image: Pixel (Spatial Domain) -> Frequency Domain
The difference is that the transform is
two-dimensional

22
Image compression: Mathematical Transform
(2D) (for reference)

Similar to audio compression, spatial domain
pixels are converted to frequency domain by
either
DCT (Discrete Cosine Transform) (e.g. JPEG)
DWT (Discrete Wavelet Transform) (e.g. JPEG2000)
The reason for transform is that information can
be coded in a more compact manner after
transform
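
A two-dimensional DCT can be built by applying a one-dimensional DCT to every row and then to every column. The sketch below uses an unnormalised DCT-II and a made-up 8x8 block of a single grey level; JPEG itself uses a scaled version of the transform and adds further steps afterwards.

```python
import math

def dct_1d(x):
    """Unnormalised DCT-II of one row or column."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
            for k in range(n)]

def dct_2d(block):
    """Separable 2D DCT: rows first, then columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row order

flat = [[100] * 8 for _ in range(8)]  # hypothetical 8x8 block, one grey level
coeffs = dct_2d(flat)
nonzero = [(u, v) for u in range(8) for v in range(8) if abs(coeffs[u][v]) > 1e-9]
print(nonzero)  # [(0, 0)]
```

Sixty-four identical pixels reduce to one non-zero coefficient, an extreme case of the compactness described above.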

23
Video Compression

Video is just a sequence of images (frames)
shown one after another very quickly.
Video compression is similar to repeating image
compression multiple times
One major difference:
There are similarities between frames
For example, if the video shows a car
running along a road in the countryside
The background (Sky, Mountain, Sun) is unchanged
The car in later frames is the same as the car in
previous frames; the only difference is the
position.

24
Video Compression

Better compression efficiency can be achieved if
we copy the car from one frame to another with
its position adjusted.
The process of searching for the new position (e.g.
of the car) by comparing old & new frames is
called Motion Estimation
Motion Estimation is the major difference between
video and still image compression.
Instead of coding the original image, one only
needs to code the prediction error (called the
residue)
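
One simple way to do this search is exhaustive block matching with the sum of absolute differences (SAD) as the cost. The sketch below is an illustration under that assumption, not the method of any specific video standard; the tiny 8x8 frames, the 2x2 "car" and all names are made up.

```python
def get_block(frame, x, y, size):
    """Cut a size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def estimate_motion(prev, curr, x, y, size, search):
    """Offset (dx, dy) into prev that best predicts curr's block at (x, y)."""
    target = get_block(curr, x, y, size)
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = x + dx, y + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(target, get_block(prev, px, py, size))
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2]

def make_frame(car_x):
    # Hypothetical frame: flat background of 10 with a 2x2 "car" of 200
    frame = [[10] * 8 for _ in range(8)]
    for j in (3, 4):
        for i in (car_x, car_x + 1):
            frame[j][i] = 200
    return frame

prev, curr = make_frame(1), make_frame(4)  # the car moves 3 pixels to the right
dx, dy = estimate_motion(prev, curr, 4, 3, 2, 3)
predicted = get_block(prev, 4 + dx, 3 + dy, 2)
actual = get_block(curr, 4, 3, 2)
residue = [[a - p for a, p in zip(ra, rp)] for ra, rp in zip(actual, predicted)]
print((dx, dy), residue)  # (-3, 0) [[0, 0], [0, 0]]
```

Here the prediction is perfect, so the residue is all zeros and only the offset (-3, 0) needs to be coded for this block.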

25
Typical video specifications

The two most common specs
PAL (Phase Alternating Line)
Frame resolution: VCD (352x288), DVD (704x576)
pixels
25 frames per second
NTSC (National Television System Committee)
Frame resolution: VCD (352x240), DVD (704x480)
pixels
30 frames per second
For VCD/DVD, video typically uses 24-bit color
depth
HDTV provides even higher resolution
720p (1280x720)
1080p (1920x1080) (so-called Full
High-Definition)
Ultra-HDTV (7680x4320) (experimental, to begin in
2015)

26
How much storage space is required for 30 seconds of
uncompressed video (no sound)?

Assume PAL format (352 x 288 pixels, 25 frames per second)
1 pixel = 24 bits = 24/8 = 3 bytes
1 frame = 352 x 288 x 3 bytes = 304,128 bytes
1 second = 304,128 x 25 = 7,603,200 bytes
30 seconds = 7,603,200 x 30 = 228,096,000
bytes ≈ 228.1 MB
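
The same steps as a small helper, offered as a sketch with an invented function name:

```python
def raw_video_bytes(width, height, bits_per_pixel, fps, seconds):
    """Size of uncompressed video (no audio) in bytes."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * fps * seconds

print(raw_video_bytes(352, 288, 24, 25, 1))    # 7603200
print(raw_video_bytes(352, 288, 24, 25, 30))   # 228096000
```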

27
Bitrate & Compression ratio

Similar to audio, video is usually described by its
bitrate
For example, the bitrate of the VCD standard is 1.5 Mbps
Continuing from the previous page,
given that the audio of the 30-sec clip uses
5,760,000 bytes, what is the compression ratio if
the clip is compressed in VCD format?

28
Compression ratio?
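
One possible way to work the question is sketched below; it is not necessarily the lecturer's intended answer. It assumes that 1.5 Mbps means 1,500,000 bits per second and that this bitrate covers both audio and video.

```python
video_bytes = 352 * 288 * 3 * 25 * 30   # 228,096,000 bytes of uncompressed video
audio_bytes = 5_760_000                 # given in the question
original = video_bytes + audio_bytes    # 233,856,000 bytes in total

vcd_bitrate_bps = 1_500_000             # assumption: 1.5 Mbps, audio + video together
compressed = vcd_bitrate_bps * 30 // 8  # 5,625,000 bytes for 30 seconds

ratio = original / compressed
print(round(ratio, 1))  # 41.6, i.e. roughly 41.6 : 1
```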