Video Compression and transport considerations - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
Video Compression and transport considerations
  • 20 Sept 2007
  • Stephen Simonin

2
What is digital video?
DVB-ASI and SD video serial digital interface (SDI) output rates are 270, 360, or 540 Mbps.
Straight digital sampling of analog video is typically 135 Mbps with dithering. This recreates the analog signal very close to the original at the far end.
10001010101
135 Mbps of video plus 4 Mbps of audio
What is digital MPEG video?
Baseband NTSC / PAL occupies 4.5 / 6 MHz. NTSC compressed to MPEG-2 at a 1:1 rate yields a data rate of 40 to 50 Mbps with all I frames.
What is digital cable?
What and how is this transported?
3
(No Transcript)
4
Typical video link systems (1,500 Mbps HDTV)
NTSC: SDI 270 Mbps; IF 4.5 MHz (6 MHz channel); MPEG from 50 Mbps down to 4 Mbps; 12 NTSC streams per IF channel.
HDTV: 1.5 Gbps camera feed; broadcast 39 Mbps; over-the-air 19.5 Mbps; IF 4.5 MHz; 256 QAM carries 2 HDTV streams per slot, with about 3 Mbps spare.
[Block diagram: a 270 Mbps SDI feed and a 50 Mbps recorder feed enter the cable headend (HE); 1 Gbps fiber runs to the node; coax carries 74 Mbps through line extenders (LE) down to 1.5 Mbps at the subscriber tap.]
At the headend: MPEG to baseband for ad insertion, baseband back to MPEG, compressed to 12 streams per 4.5 MHz channel.
IF, MPEG, QAM, data, and IP voice share the plant; data is 16 Mbps shared across roughly 500 users.
5
FCC Regs
PART 76 -- MULTICHANNEL VIDEO AND CABLE TELEVISION SERVICE
Title 47 -- Telecommunication, Chapter I -- Federal Communications Commission, Part 76 -- Multichannel Video and Cable Television Service, § 76.605 Technical standards.
Cable systems are required to use this channel allocation plan for signals transmitted in the frequency range 54 MHz to 1002 MHz.
(2) The aural center frequency of the aural carrier must be 4.5 MHz ± 5 kHz above the frequency of the visual carrier at the output of the modulating or processing equipment of a cable television system, and at the subscriber terminal.
(3) The visual signal level shall not be less than 1 millivolt across an internal impedance of 75 ohms (0 dBmV). Additionally, as measured at the end of a 30 meter (100 foot) cable drop that is connected to the subscriber tap, it shall not be less than 1.41 millivolts across an internal impedance of 75 ohms (3 dBmV).
(6) The amplitude characteristic shall be within a range of ±2 decibels from 0.75 MHz to 5.0 MHz above the lower boundary frequency of the cable television channel, referenced to the average of the highest and lowest amplitudes within these frequency boundaries.
(iii) As of June 30, 1995, [the signal-to-noise ratio] shall not be less than 43 decibels.
Typical SNR figures:
Broadcaster output: above 69 dB
Short haul: 68 dB
Long haul: 58 dB
DVD: 55 dB
VCR: 45 dB
Cable design to hit the home: 45 dB
FCC minimum: 43 dB
TV noise floor: 42 dB
Cable set-top box floor: 32 dB
There are no equivalent requirements for MPEG / HDTV.
6
MPEG-1 MPEG-1 was the first format to be
released, in 1993, by the Moving Picture Experts
Group, from which the format gets its name. The
standard was developed as a way of highly
compressing video. MPEG-1 is an extremely popular
standard and there are players available for
almost every popular computing platform available
today (Windows comes with an MPEG-1 decoder as
standard). MPEG-1 is used by the Video CD
standard for storing movies on standard CDs and
is used in many multimedia and games applications.
MPEG-2 Although popular, MPEG-1 did not address
all of the needs of the developing digital video
world, so the MPEG-2 standard was released.
MPEG-2 addresses many of the needs for high
quality video, and is now used in digital
television broadcasting, DVDs and many other
applications. Whereas MPEG-1 encoders and
decoders can be developed royalty free, MPEG-2
demands that licenses be paid for developers of
both MPEG-2 encoders and decoders. This has meant
that there is generally less support (or the
support is more expensive) for MPEG-2 than
MPEG-1. Windows does not come with an MPEG-2
decoder by default, but some software DVD
players will install a suitable decoder.
MPEG compression is based on a very simple idea:
the concept that in video, one frame often
differs very little from the previous frame.
This means that if you take two sequential frames
in a video sequence, there is not much
difference between them.
JPEG2000 is based on state-of-the-art wavelet
compression. The JPEG2000 architecture serves a
number of different applications in the digital
imaging market, from digital cameras and
pre-press to medical imaging and other key sectors.
7
Frame number: 1 2 3 4 5 6 7 8 9 10 11 12 13
Frame type:   I B B P B B P B B P  B  B  I
8
MPEG video supports 3 types of frames, called I,
P and B. I-frames contain information to
describe the entire frame. They are like
stand-alone bitmap images that can be used to
recreate the entire picture. P-frames are
Predictive frames, and rely on information from a
previous I or P frame in order to recreate
themselves as a full picture. B-frames are
Bi-directional, needing information from both
previous frames and following frames in the
sequence in order to draw themselves fully. These
other frames can be I or P frames only.
Group Of Pictures (GOP). A typical GOP sequence might
look like:
Frame type:   I B B P B B P B B P  B  B  I ...etc
Frame number: 1 2 3 4 5 6 7 8 9 10 11 12 13
The actual order these frames are kept in the
MPEG file is:
Frame type:   I P B B P B B P  B B  I  B  B ...etc
Frame number: 1 4 2 3 7 5 6 10 8 9 13 11 12
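The display-to-stream reordering above can be sketched in a few lines. `stream_order` is a hypothetical helper, not a real library call; it simply moves each I or P frame ahead of the B-frames that depend on it:

```python
def stream_order(display):
    """Reorder a display-order GOP string (e.g. 'IBBPBBP...') into the
    order frames are stored in the MPEG file: each I or P reference
    frame is emitted before the B-frames that reference it."""
    out = []          # list of (frame_number, frame_type) in stream order
    pending_b = []    # B-frames waiting for their forward reference
    for num, ftype in enumerate(display, start=1):
        if ftype in "IP":
            out.append((num, ftype))   # reference frame goes first
            out.extend(pending_b)      # then the B-frames it closes
            pending_b = []
        else:
            pending_b.append((num, ftype))
    out.extend(pending_b)
    return out

gop = "IBBPBBPBBPBBI"
print([num for num, ftype in stream_order(gop)])
# prints [1, 4, 2, 3, 7, 5, 6, 10, 8, 9, 13, 11, 12]
```

This reproduces the stream order shown in the table above: frame 4 (the first P) precedes frames 2 and 3 (the B-frames that need it).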
9
I-frame: all pixels. B-frame: only the pixels that
changed. P-frame: predicted changes. Similar to
the compression of video, MPEG audio compression
also capitalizes on the characteristics of a
human sensory organ; this time, it's the human
ear. The compression takes into account both
auditory masking, where a louder sound will hide
a softer sound, and temporal masking, where a
sound in the past or future will interfere
with the ear's ability to hear a current sound.
One example of auditory masking occurs when you
try to carry on a quiet conversation in a train
station. Passing trains drown out your
conversation each time they speed by. In the
presence of the sound generated by the train, the
quiet voices in the conversation become
imperceptible.
10
Video stream packets include one frame each,
while audio stream packets usually include
approximately 24 milliseconds of sound each. Each
packet contains a header and a payload. The
header gives timing information so the decoder
will know at what time to decode and present the
specified frame.
11
Audio Compression
The MP3 audio format is widely used on computers
at the moment; it is also the audio part of an
MPEG file. As well as MP3, there are also "MP1"
and "MP2" formats that are more commonly used
when building an MPEG file. MP3 can give good
compression ratios when compared to a standard
WAV file.
[Figure: waveform of sample drum audio; the mid band
(the speech range) is the area most cut by compression.]
The graph shows that the waveform contains a
large amount of low frequencies (the bass drum)
and a small amount of high frequencies (probably
the cymbals or hi-hat). The audio has lots of
low frequency, little in the middle, and a small
amount of high frequencies, and this can be stored
in a much smaller amount of information than the
original waveform.
The human ear is not very good at picking up
quiet sounds after loud ones, so all quiet sounds
in this situation can be removed entirely. When
encoding stereo, most of the content of the left
and right channels are very similar so there are
different techniques for compressing only the
difference between the two channels, rather than
encoding each separately.
Of the MP1, MP2 and MP3 formats, the higher the
number, the better the audio compression. It
should be noted that although MP3 is the more
popular audio format for stand-alone tracks, MP2
is actually more widely accepted in MPEG video
files. MP3 incurs additional licensing costs that
aren't applicable to MP1 or MP2, so most
standards only use MP2.
Once the audio has been compressed, it must be
stored in a file or transmitted across a digital
transmission channel. The MPEG audio encoder has
to fit the audio to a specific bit rate. 128
kilobits per second (Kbps) and 224 Kbps are
popular rates; the higher the bit rate, the
better the audio quality, but the more space the
audio requires.
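As a quick sanity check on these rates, here is a minimal sketch (the function name `audio_size_bytes` is hypothetical) comparing one minute of uncompressed CD-quality audio against one minute at the 224 Kbps MPEG rate:

```python
def audio_size_bytes(bitrate_kbps, seconds):
    """Size in bytes of a constant-bit-rate audio stream."""
    return bitrate_kbps * 1000 // 8 * seconds

# Uncompressed CD-quality audio: 44.1 kHz * 16 bits * 2 channels ~ 1411 kbps
wav = audio_size_bytes(1411, 60)   # one minute uncompressed
mp2 = audio_size_bytes(224, 60)    # one minute at 224 Kbps
print(wav // 1_000_000, mp2 // 1_000_000)  # prints 10 1  (megabytes)
```

Roughly a 6:1 reduction at the higher of the two popular bit rates, before any of the perceptual techniques above are even considered.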
12
Transportation considerations
Packetizing MPEG-2 video in IP typically encapsulates
1316 bytes (10,528 video bits) in an IP frame. Each IP
packet is protected by a payload checksum. If a packet
is received with an invalid checksum anywhere in the
network, the entire packet (all 10,528 video bits) is
discarded. A loss of 10,528 bits results in an
observable video loss of up to 0.5 seconds.
From the stream rate and the packet loss rate, the
packet losses per unit time can be computed. Translating
bit error rate to packet error rate for an IP video
stream, a bit error rate of 1E-10 on 10,000-bit packets
is a packet error rate of 1E-6. A typical VOD stream at
3.75 Mbps is a packet rate of about 360 packets per second.
Lost packets per second = 360 x 1E-6 = 3.6E-4
Time between lost packets = 1 / 3.6E-4 = about 2,770 seconds, or 46 minutes
So a 1E-10 BER results in a picture break-up roughly
every 45 minutes. Each factor-of-10 increase in BER makes
break-ups 10 times more frequent:
1E-9 BER: a break-up every 4.5 minutes
1E-8 BER: a break-up every 27 seconds
1E-7 BER: a break-up every 2.7 seconds
1E-6 BER: a break-up every 0.27 seconds
Somewhere between 1E-7 and 1E-6 the picture is not
recoverable. For typical video transport, even the
1E-10 case of one break-up per 45 minutes is not
acceptable. Compare this to TDM transport: TDM operates
flawlessly at 1E-9 BER, and at 1E-6 BER there is only
popping in the audio and a sparkle here and there in
the video.
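The arithmetic above reduces to a short calculation. `breakup_interval_s` is a hypothetical helper, and the slide's simplifying assumptions (one bit error discards the whole packet; errors are independent) are baked in:

```python
def breakup_interval_s(ber, stream_bps=3.75e6, packet_bits=10_528):
    """Mean time between lost packets for a constant-bit-rate IP
    video stream, assuming any single bit error kills the packet.
    Packet error rate ~ BER * packet_bits, so expected losses per
    second = (stream_bps / packet_bits) * BER * packet_bits
           = stream_bps * BER."""
    lost_per_s = stream_bps * ber
    return 1.0 / lost_per_s

for ber in (1e-10, 1e-9, 1e-8, 1e-7, 1e-6):
    print(f"BER {ber:.0e}: one break-up every {breakup_interval_s(ber):.2f} s")
```

With the exact 3.75 Mbps rate this gives about 2,667 seconds (44 minutes) at 1E-10; the slide's 46 minutes comes from rounding the packet rate up to 360 packets per second. Either way, each decade of BER divides the interval by ten, matching the list above.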
13
(No Transcript)
14
The multiburst signal and pattern are shown
below.

15
To display this multiburst signal, the waveform
monitor is set for one-line sweep and flat
response. This signal shows a flat response for
high frequencies (500 kHz to 4.2 MHz).
16
The shot on the left is the signal from the test
generator and the shot on the right is the signal
after MPEG compression.

Signal from test generator (left); signal from MPEG
decoder (right). The picture effect, which results
from the loss of high-frequency information, is loss
of detail: high-contrast edges become blurred.
17
The typical TV viewing screen without the
blanking interval is 640 x 480 pixels. The MPEG
system takes 8 x 8 pixel blocks and uses 8 bits to
represent the transition, 8 bits for the
intensity and 2 for the color. The color overlays
a grid that averages 4 pixels at a time: 4
pixels use 5 bytes of information, 4 bytes
for intensity and 1 byte for color. This process
has a considerable effect on color and on any edges
defined by color. This type of compression
results in a video data stream of about 6 Mbps.
The typical data stream for VOD (video on
demand) is 3.75 Mbps. The reduction is done
by using predicted frames and also by expanding
the block size from 8 x 8 to 16 x 16 pixels. If
this is done, the color is averaged over 16 pixels,
causing blurring in the image. (4:2:0 coding)
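As a rough check on the slide's numbers, this sketch (the helper name `bytes_per_frame` is hypothetical) applies the 5-bytes-per-4-pixels packing described above to a full 640 x 480 frame at 30 frames per second. Note this is the raw, pre-compression budget; DCT coding and quantization are what bring it down to the roughly 6 Mbps stream the slide quotes:

```python
def bytes_per_frame(width, height):
    """Byte budget under the slide's packing: every 4 pixels share
    5 bytes (4 bytes of intensity + 1 byte of averaged color)."""
    return width * height * 5 // 4

frame = bytes_per_frame(640, 480)   # bytes per raw frame
raw_mbps = frame * 8 * 30 / 1e6     # uncompressed rate at 30 fps
print(frame, round(raw_mbps, 1))    # prints 384000 92.2
```

So the sample packing alone gets from 4 bytes per pixel of naive RGB down to 1.25 bytes per pixel; the remaining ~15:1 reduction to 6 Mbps comes from the transform coding and predicted frames.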
18
  • Sampling systems and ratios
  • The subsampling scheme is commonly expressed as a
    three-part ratio
  • (e.g. 4:2:2), although it is sometimes expressed as
    four parts (e.g. 4:2:2:4). The parts are (in
    their respective order):
  • Luma horizontal sampling reference (originally,
    as a multiple of 3.579 MHz in the NTSC television
    system).
  • Cb and Cr (chroma) horizontal factor (relative to
    the first digit).
  • Same as the second digit, except when zero. Zero
    indicates Cb and Cr are subsampled 2:1 vertically.
  • If present, same as the luma digit; indicates an alpha
    (key) component.
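The ratio rules in the list above can be sketched as a small parser. `explain_subsampling` is a hypothetical helper for illustration, not part of any standard API:

```python
def explain_subsampling(ratio):
    """Interpret a J:a:b or J:a:b:alpha chroma subsampling string
    per the rules above."""
    parts = [int(p) for p in ratio.split(":")]
    j, a, b = parts[:3]
    info = {
        "luma_ref": j,              # luma horizontal sampling reference
        "chroma_h_factor": a,       # Cb/Cr horizontal samples per J luma
        "chroma_v_halved": b == 0,  # third digit 0 -> 2:1 vertical subsampling
    }
    if len(parts) == 4:
        info["alpha"] = parts[3]    # alpha (key) component factor
    return info

print(explain_subsampling("4:2:0"))
print(explain_subsampling("4:2:2:4"))
```

For example, 4:2:0 (the MPEG scheme discussed on the previous slide) parses to half-rate chroma horizontally and 2:1 subsampling vertically, i.e. one Cb and one Cr sample per 2 x 2 block of luma pixels.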

19
Things to note: The example close-ups below are
magnified 200% to help show the MPEG encoding
artifacts. The presence of obvious "blocking",
where the video is broken into squares, is a sign
that the encoder is having problems resolving
detail and motion. This is most noticeable in the
ground close-ups.
20
(No Transcript)
21
(No Transcript)
22
Some basic facts: NTSC is based on a 525-line,
60 fields / 30 frames-per-second, 60 Hz system for
transmission and display of video images. This
is an interlaced system in which each frame is
scanned in two fields of 262.5 lines, which are
then combined to display a frame of video with 525
scan lines. NTSC is the official analog video
standard in the U.S., Canada, Mexico, some parts
of Central and South America, Japan, Taiwan, and
Korea. PAL is the dominant format in the world
for analog television broadcasting and video
display and is based on a 625-line, 50 field / 25
frames-per-second, 50 Hz system. The signal is
interlaced, like NTSC, into two fields, composed
of 312.5 lines each. Several distinguishing
features: one, a better overall picture than
NTSC because of the increased number of scan
lines. Two, since color was part of the standard
from the beginning, color consistency between
stations and TVs is much better. In addition,
PAL has a frame rate closer to that of film: PAL
runs at 25 frames per second, while film has a
frame rate of 24 frames per second. Countries
on the PAL system include the U.K., Germany,
Spain, Portugal, Italy, China, India, most of
Africa, and the Middle East. Standard-definition
DTV is broadcast in 480p (the same
characteristics as progressive-scan DVD: 480
lines or pixel rows progressively scanned).
HDTV is broadcast at either 720p (720 lines or
pixel rows progressively scanned) or 1080i
(1,080 lines or pixel rows in alternately
scanned fields made up of 540 lines each). 1080p
is 1,080 lines or pixel rows progressively scanned.
23
(No Transcript)
24
While 720p and 1080i are both high definition TV
formats, there are some differences. A 720p
picture is made up of 720 lines drawn
progressively, one right after the other, on your
screen. A 1080i picture is made up by first
drawing 540 odd numbered lines...lines 1, 3, 5,
and so on, and then slightly later, the 540 even
numbered lines...2, 4, 6, and the rest. Your
brain makes this interlaced picture look like one
picture.
Progressive signals send the entire frame at once
whereas the interlaced signal is sent in two
intervals. The first half of an image is sent,
then, the next half is sent in the next scan
completing the single frame.
25
In the PAL system, which has a frame rate of 25
frames per second (1 frame every 0.04 seconds),
50 fields are displayed every second, as shown
below.
The first field is drawn in red, followed by the
second field in green. After 1/50th of a second
(0.02 s), the first field has been drawn,
followed by the second field after 1/25th of a
second (0.04 s); this equates to one full frame.
For the NTSC system, 29.97 frames are shown
every second and the timings change accordingly.
The advantage with interlacing is that this
allows a field rate that is double the frame
rate, leading to smoother motion than with a
simple frame based system. Additionally, when
used with a television, this leads to a less
flickery picture as the screen is refreshed
(scanned) at twice the frame rate.
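The field and frame timing described above reduces to simple arithmetic. `field_times` is a hypothetical helper for illustration:

```python
def field_times(frame_rate):
    """Field and frame intervals for an interlaced system:
    two fields per frame, so the field rate is double the
    frame rate."""
    frame_t = 1.0 / frame_rate
    return frame_t / 2, frame_t   # (field interval, frame interval)

pal_field, pal_frame = field_times(25)       # 0.02 s, 0.04 s
ntsc_field, ntsc_frame = field_times(29.97)  # ~0.0167 s, ~0.0334 s
print(pal_field, pal_frame)
```

This is why the screen refreshes at 50 Hz (PAL) or ~60 Hz (NTSC) even though full frames arrive at only 25 or 29.97 per second.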
Displaying On A Non-Interlaced Display: If you
intend to display interlaced video on a
non-interlaced display (e.g. a computer monitor),
then artifacts can be seen. A non-interlaced
display will show both fields simultaneously,
rather than sequentially.
26
(No Transcript)
27
Quadrature Amplitude Modulation
Quadrature amplitude modulation, or QAM, is a
big name for a relatively simple technique. It
is simply a combination of amplitude modulation
and phase-shift keying.
001010100011101000011110 broken up into 3-bit
triads: 001-010-100-011-101-000-011-110
If NTSC is converted to QAM: 256 QAM carries 78 Mbps,
which is 12 MPEG 6 Mbps streams or 2 HDTV 39 Mbps
streams. There is also overhead that cuts into this.
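The triad grouping above can be sketched in a couple of lines. `to_symbols` is a hypothetical helper; 3 bits per symbol corresponds to an 8-point constellation, while 256 QAM would group 8 bits per symbol:

```python
def to_symbols(bits, bits_per_symbol=3):
    """Group a bitstream string into modulation symbols, each of
    which selects one point of the constellation (amplitude +
    phase combination)."""
    return [bits[i:i + bits_per_symbol]
            for i in range(0, len(bits), bits_per_symbol)]

print("-".join(to_symbols("001010100011101000011110")))
# prints 001-010-100-011-101-000-011-110
```

Each symbol is then transmitted as one amplitude/phase state, so the symbol rate on the wire is the bit rate divided by the bits per symbol.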