Chapter 12 Sound

About This Presentation

Slides: 49

Provided by: hol79

Transcript and Presenter's Notes

1
Chapter 12 Sound

Multimedia Systems

2
Key Points

Sound is a complex mixture of physical and
psychological factors, which is difficult to
model accurately.
Sounds can be characterized by their waveforms,
which plot amplitude against time.
CD quality sound is sampled at 44.1 kHz, using a
sample size of 16 bits. Multimedia productions
may have to use lower sampling rates and smaller
sample sizes.

3
Key Points

The quality of digitized sound can be improved by
dithering: adding a small quantity of noise to
randomize the quantization error.
Software can provide the functions of a recording
studio, including multi-track recording, mixing
and effects, on a desktop computer.
The most vexatious aspect of recording is getting
the levels right.
Audio filters are used to remove noise and
unwanted frequency components.

4
Key Points

Digital versions of established effects, such as
reverb and envelope shaping, are used to alter the
quality of sounds. Digital technology permits new
kinds of alteration, including time stretching
and pitch alteration.
Speech data can be compressed using established
technology, including µ-law and A-law companding
and ADPCM.
MPEG-1 Layer 3 audio (MP3) is a lossy method of
audio compression that uses a psycho-acoustical
model to determine which information to discard.

5
Key Points

Each of the three major platforms has its own
sound file format: AIFF for MacOS, WAV for
Windows, and AU for Unix. RealAudio is used for
streaming audio.
MIDI (the Musical Instrument Digital Interface)
provides a standard for controlling digital
instruments and communicating between them and
computers running sequencer programs.
When sound is combined with video,
synchronization must be established and
maintained.

6
The Nature of Sound

All sounds are produced by the conversion of
energy into vibrations in the air or some other
elastic medium
e.g. tuning forks and guitars
The tines of a good tuning fork produce a clean tone
at a single frequency; most other sound sources
vibrate in more complicated ways.
A single note is composed of several components
at frequencies that are multiples of the fundamental
pitch of the note.

7
Harmonic

The spectrum of a single note from a musical
instrument usually has a set of peaks at
(approximately) harmonic ratios.
That is, if the fundamental frequency is f, there
are peaks at f, and also at (about) 2f, 3f, 4f,
etc.
The pitch of a note refers to the fundamental
frequency with which the source of the tone
resonates.
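
The harmonic structure described on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the sample rate, and the relative amplitudes of the harmonics are assumptions chosen for the example, not values taken from the slides.

```python
import math

def harmonic_frequencies(fundamental, count):
    """Frequencies f, 2f, 3f, ... of the first `count` harmonics."""
    return [fundamental * k for k in range(1, count + 1)]

def synthesize_note(fundamental, amplitudes, sample_rate, num_samples):
    """Sum sine waves at f, 2f, 3f, ... using one amplitude per harmonic."""
    samples = []
    for i in range(num_samples):
        t = i / sample_rate  # time of this sample in seconds
        value = 0.0
        for k, amplitude in enumerate(amplitudes, start=1):
            value += amplitude * math.sin(2 * math.pi * k * fundamental * t)
        samples.append(value)
    return samples

# Illustrative values only: a 440 Hz note with three harmonics.
note = synthesize_note(440.0, [1.0, 0.5, 0.25], 44100, 441)
```

A real instrument's harmonics are only approximately at whole-number ratios, so exact multiples are a simplification.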

8
Frequency Spectrum

Percussive sounds and most natural sounds do not
even have a single identifiable fundamental
frequency, but can still be decomposed into a
collection of frequency components.
Frequency spectrum: the relative amplitudes of a
sound's frequency components
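
As a rough sketch of what decomposing a sound into frequency components means, the following uses a naive discrete Fourier transform. The slides do not say which method is used, so the DFT, the function name, and the test signal are assumptions made for illustration. Production tools would use a fast Fourier transform instead.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Amplitude of each frequency bin from 0 up to half the sample count."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        magnitudes.append(abs(total) / n)
    return magnitudes

# Test signal: two sine components, at bin 3 (amplitude 1) and bin 5 (amplitude 0.5).
n = 32
signal = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 5 * t / n)
          for t in range(n)]
spectrum = dft_magnitudes(signal)
# A real sine of amplitude A shows up as A/2 here, because the other half
# of its energy sits at the mirrored negative frequency.
```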

9
The Nature of Sound

The human ear is able to detect frequencies in
the range between 20 Hz and 20 kHz
Upper limit decreases with increasing age
We can display the waveform of any sound by
plotting its amplitude against time
Figs. 12.1-7: some waveforms for a range of types
of sound

10
Speech

The speaker repeats "Feisty teenager" twice, then a
more distant voice responds.
The second time is faster and with more emphasis.
Recorded in the open air, so there is background noise.
Speech can be compressed by removing the silences.

Feisty teenager
11
Instruments

Figs. 12.2-5

Didgeridoo
Boogie-woogie
12
Violin, cello and piano
Men grow cold...
13
Water sounds
A trickling stream
The sea
14
Stereophony

One of the most useful illusions in sound
perception is stereophony.
The brain identifies the source of a sound on the
basis of the differences in intensity and phase
between the signals received from the left and
right ears.

15
Digitizing Sound

Sampling
The selection of the sampling rate
If the limit of hearing is 20 kHz, a minimum rate
of 40 kHz is required by the Sampling Theorem.
The sampling rate of audio CDs is 44.1 kHz
22.05 kHz is commonly used for the Internet
11.025 kHz for speech
DAT (digital audio tape): 48 kHz

16
Sampling

How does sampling work in a computer system?
Sound card
Digital audio inputs are uncommon
The analog line output of a DAT or CD player is
re-digitized by the sound card
Incompatible rates: re-sampling
Drift in the intervals between samples is called
jitter

17
Sampling

If the sampling rate is 40 kHz, any inaudible
components above 20 kHz will manifest as aliasing
when the signal is reconstructed.
A filter is used to remove any frequencies higher
than half the sampling rate before the signal is
sampled.
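
The aliasing described above can be checked numerically. The sketch below is illustrative only: the function names are hypothetical and the 25 kHz test tone is an assumption. It folds a frequency back into the range below half the sampling rate, then confirms that the aliased and original tones produce the same samples.

```python
import math

def alias_frequency(f, sample_rate):
    """Frequency (Hz) at which a component at f Hz appears after sampling."""
    folded = f % sample_rate
    return min(folded, sample_rate - folded)

def sample_cosine(f, sample_rate, count):
    """Take `count` samples of a cosine at f Hz."""
    return [math.cos(2 * math.pi * f * n / sample_rate) for n in range(count)]

# A 25 kHz component sampled at 40 kHz is indistinguishable from 15 kHz.
high = sample_cosine(25000, 40000, 16)
low = sample_cosine(15000, 40000, 16)
```

Because the two sample sequences match, nothing after the sampling stage can tell the tones apart, which is why the filter has to be applied before sampling.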

18
Digitizing Sound

Quantization
CD audio usually uses 65536 quantization levels
(16 bits)
Undersampling a pure sine wave
An analogue signal will be coarsely approximated
by samples that jump between just a few quantized
values
Dithering
A small amount of random noise is added to
the analogue signal before sampling
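
A small numerical experiment may make the effect of dithering clearer. This is a minimal sketch under stated assumptions: the quantization step is 1, the noise is uniform over plus or minus half a step, and the function names are invented for the example. A constant level of 0.3 steps is lost entirely by plain quantization, but with dither the average of many quantized samples recovers it.

```python
import random

LEVELS_16_BIT = 2 ** 16  # number of quantization levels with 16-bit samples

def quantize(x, step=1.0):
    """Round x to the nearest multiple of the quantization step."""
    return round(x / step) * step

def average_quantized(value, count, dither, rng):
    """Quantize the same input `count` times and return the mean result."""
    total = 0.0
    for _ in range(count):
        # Assumed dither: uniform noise spanning one quantization step.
        noise = rng.uniform(-0.5, 0.5) if dither else 0.0
        total += quantize(value + noise)
    return total / count

plain = average_quantized(0.3, 10000, False, random.Random(1))
dithered = average_quantized(0.3, 10000, True, random.Random(1))
```

Here `plain` is exactly zero while `dithered` lands close to 0.3. The quantization error has been turned into noise rather than a systematic distortion.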

19
Quantization
Undersampling a pure sine wave
20
Dithering
Dithering
21
Dithering

Sampling and dithering on frequency spectrum

22
Processing Sound

Modern multi-track recording studio
There is presently no single sound application
that has the status of a de facto standard.
MIDI sequencing
Multi-track recording
Video editing packages include some integrated
sound editing and processing facilities.

23
Recording and Importing Sound

Sampling rate and sampling size
If level of signal is too low, then resulting
recording will be quiet.
If level is too high, clipping will occur.
Fig. 12.10
Gain control can be used to alter level.
Automatic gain control
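
The relationship between gain and clipping can be sketched as follows, assuming signed 16-bit samples. The function name and the example values are illustrative, not taken from the slides.

```python
def apply_gain(samples, gain, full_scale=32767):
    """Scale samples by `gain`, clipping to the signed 16-bit range.

    Returns the new samples and the number of samples that were clipped.
    """
    out = []
    clipped = 0
    for s in samples:
        v = int(round(s * gain))
        if v > full_scale:
            v = full_scale
            clipped += 1
        elif v < -full_scale - 1:
            v = -full_scale - 1
            clipped += 1
        out.append(v)
    return out, clipped

louder, clip_count = apply_gain([1000, -20000, 30000], 2.0)
```

Any sample that hits the limit has lost its true value, so a non-zero clip count means the level was set too high.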

24
Sound Editing and Effects

Interface: timeline
Tracks
Creation of loops
Very short loops are needed to create voices for
the electronic musical instruments known as
samplers.
Longer loops are used in certain styles of dance
music
Post-production
Correct defects, enhance quality, modify their
character.
Premiere's effects plug-in format is widely used.
Professional level: Cubase VST, DigiDesign
ProTools

Removal of unwanted noise
Noise gate
Eliminates all samples whose value falls below a
specified threshold
Specify a minimum time that must elapse before a
sequence of low amplitude samples counts as a
silence and a similar limit before a sequence
whose values exceed the threshold counts as
sound.
This prevents the gate being turned on or off by
transient glitches.
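
One way to sketch the noise gate described above is to look at maximal runs of samples that are all above or all below the threshold, and let only runs of at least a minimum length change the state of the gate. This is an assumed, simplified design with hypothetical names. It uses a single minimum length for both directions and starts with the gate closed.

```python
def noise_gate(samples, threshold, min_run):
    """Zero out samples while the gate is closed.

    A run of loud or quiet samples only switches the gate if it is at
    least `min_run` samples long, so short glitches are ignored.
    """
    out = list(samples)
    gate_open = False  # assumption: treat the start as silence
    i = 0
    n = len(samples)
    while i < n:
        loud = abs(samples[i]) >= threshold
        j = i
        while j < n and (abs(samples[j]) >= threshold) == loud:
            j += 1
        # samples[i:j] is a maximal run that is all loud or all quiet.
        if j - i >= min_run:
            gate_open = loud
        if not gate_open:
            for k in range(i, j):
                out[k] = 0
        i = j
    return out

gated = noise_gate([0, 9, 0, 0, 5, 6, 7, 1, 8, 9, 1, 1, 1], threshold=4, min_run=2)
```

In this example the isolated spike of 9 is removed as a glitch, while the single quiet sample inside the loud passage is kept because it is too short to close the gate.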

26
Noise Gate

Since a noise gate has no effect on the speaker's
words, the background noise will cut in and out
as he speaks.
Noise combined with signal
Noise gate: all-or-nothing filtering
Low-pass, high-pass, notch filters
Specialized filters
De-esser: removes the sibilance that results
from speaking or singing into a microphone placed
too close to the performer
Click repairer
Removes clicks from recordings taken from damaged
or dirty vinyl records.

A single effect may be used in different ways
depending on the values of its parameters
Reverb effect
Small delay and low reflectivity: inside a small
room
Longer reverb times: concert hall or stadium

28
Graphic Equalization

Transforms spectrum of a sound using a bank of
filters, each controlled by its own slider and
each affecting a fairly narrow band of
frequencies.

29
Envelope Shaping

Changing outline of a waveform
Allow user to draw a new envelope around the
waveform, altering its attack and decay and
introducing arbitrary fluctuations of amplitude.
Fader: a specialized version of envelope shaping
Volume to be gradually increased and decreased
Tremolo
Cause the amplitude to oscillate periodically
from zero to its maximum value
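
Both faders and tremolo amount to multiplying each sample by a time-varying gain. The sketch below illustrates this with a linear fade-in and a cosine-shaped tremolo whose gain swings between zero and one. The function names and the shape of the gain curves are assumptions chosen for simplicity.

```python
import math

def fade_in(samples):
    """Linear fade from silence at the first sample to full level at the last."""
    n = len(samples)
    if n < 2:
        return list(samples)
    return [s * i / (n - 1) for i, s in enumerate(samples)]

def tremolo(samples, rate_hz, sample_rate):
    """Make the amplitude oscillate periodically between zero and its maximum."""
    out = []
    for i, s in enumerate(samples):
        gain = 0.5 * (1 - math.cos(2 * math.pi * rate_hz * i / sample_rate))
        out.append(s * gain)
    return out

faded = fade_in([1.0] * 5)
wobble = tremolo([1.0] * 4, rate_hz=1, sample_rate=4)
```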

Time stretching and pitch alteration are two
closely related effects
With analogue recordings, time stretching can only
be achieved by altering the speed at which the
recording is played back, and this alters the pitch.
With digital sound, the duration can be changed
without altering the pitch by inserting or
removing samples.
The pitch can be altered without affecting
duration
Time stretching is required when sound is being
synchronized to video or another sound.

31
Compression

3 minutes, stereo: over 25 MBytes
Huffman coding
Run-length coding of silence
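
The storage figure and the idea of run-length coding silence can both be illustrated briefly. This is a sketch under assumptions: CD-quality parameters (44.1 kHz, 16 bits, two channels) are used for the size, and the encoded representation with `("silence", length)` pairs is invented for the example, not a real file format.

```python
def pcm_bytes(seconds, sample_rate=44100, bits=16, channels=2):
    """Size in bytes of uncompressed sampled audio."""
    return seconds * sample_rate * (bits // 8) * channels

def run_length_encode_silence(samples):
    """Replace each run of zero samples by a ('silence', length) pair."""
    encoded = []
    for s in samples:
        if s == 0:
            if encoded and isinstance(encoded[-1], tuple):
                encoded[-1] = ("silence", encoded[-1][1] + 1)
            else:
                encoded.append(("silence", 1))
        else:
            encoded.append(s)
    return encoded

three_minutes = pcm_bytes(180)  # 31752000 bytes at CD quality
packed = run_length_encode_silence([3, 0, 0, 0, 5, 0])
```

Run-length coding only helps when samples are exactly zero, which in practice usually means the recording has already been through something like a noise gate.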

32
Speech Compression

Telephone companies, 1960s
Companding: compressing/expanding
Non-linear quantization, Fig. 12.11
G.711 µ-law: North America and Japan, SUN
A-law
ADPCM: adaptive differential pulse code
modulation
Differential pulse code modulation
Linear Predictive Coding
Mathematical model of state of vocal tract as its
representation of speech
2.4 kbps, machine-like quality
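
The companding idea on this slide can be illustrated with the continuous form of the µ-law curve, using µ = 255. This is a sketch, not the G.711 codec itself: the standard uses a piecewise-linear 8-bit approximation of this curve, and the function names here are hypothetical. Inputs are assumed to be normalized to the range -1 to 1.

```python
import math

MU = 255  # value of mu used by mu-law telephony

def mu_law_compress(x, mu=MU):
    """Map x in [-1, 1] onto [-1, 1], expanding small amplitudes."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)

quiet = mu_law_compress(0.01)      # a quiet sample gets a large share of the range
restored = mu_law_expand(quiet)    # expanding recovers the original value
```

Quantizing the compressed value uniformly then gives quiet sounds finer steps than loud ones, which is the non-linear quantization the slide refers to.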

33
Perceptually Based Compression

Threshold of hearing: minimum level at which a
sound can be heard
Fig. 12.12, the threshold of hearing
Very low or high frequency sound must be much
louder than a mid-range tone to be heard.
Psycho-acoustical model
Mathematical description of aspects of the way
the ear and brain perceive sounds
Loud tones can obscure softer tones that occur at
the same time
Depends on the relative frequencies of the two
tones

35
Masking

A modification of threshold of hearing curve in
region of a loud tone
Fig. 12.13, the threshold is raised in the
neighborhood of the masking tone
The raised portion, or masking curve, is
non-linear and asymmetrical, rising faster than
it falls
Any sound that lies within the masking curve will
be inaudible, even though it rises above the
unmodified threshold of hearing.
Because masking hides noise as well as some
components of the signal, quantization noise can
be masked.
Where a masking sound is present, the signal can
be quantized relatively coarsely, using fewer
bits than would otherwise be needed, because the
resulting quantization noise can be hidden under
the masking curve.

36
Compression

Use a bank of filters to split signal into bands
of frequencies; 32 bands are commonly used.
The average signal level in each band is
calculated, and using these values and a
psycho-acoustical model, a masking level for each
band is computed.

37
MPEG Audio

3 layers
Layer 1: 192 kbps for each channel
Layer 2: 128 kbps for each channel
Layer 3: 64 kbps for each channel
MP3: MPEG-1 Layer 3, compression ratio 10:1
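
The ratio of roughly 10:1 for Layer 3 can be checked with simple arithmetic, assuming the uncompressed source is CD-quality audio (44.1 kHz, 16 bits per sample, per channel). The function name is hypothetical and the bit rates are those listed on the slide.

```python
def compression_ratio(bit_rate_kbps, sample_rate=44100, bits=16):
    """Ratio of the uncompressed per-channel bit rate to the compressed one."""
    uncompressed_kbps = sample_rate * bits / 1000  # 705.6 kbps per channel
    return uncompressed_kbps / bit_rate_kbps

layer_ratios = {layer: compression_ratio(rate)
                for layer, rate in {1: 192, 2: 128, 3: 64}.items()}
```

Under these assumptions Layer 3 comes out at about 11:1, Layer 2 at about 5.5:1 and Layer 1 at about 3.7:1.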

38
Formats

AIFF for MacOS
WAV for Windows
AU for Unix
Each can store audio data at a variety of
commonly used sampling rates and sample sizes.
Each supports uncompressed or compressed data
with a range of compressors.

39
Streaming Audio

Sound is delivered over a network and played as
it arrives without having to be stored on the user's
machine first.
Because of the lower bandwidth required by audio,
streaming is more successful for sound than it is
for video.
Real Networks' RealAudio
Streaming QuickTime
Play on demand

40
MIDI

The Musical Instrument Digital Interface
Standard protocol for communicating between
electronic instruments, such as synthesizers,
samplers, and drum machines.
MIDI allowed instruments to be controlled
automatically by devices that could be programmed
to send out sequences of MIDI instructions.

41
MIDI Messages

An instruction that controls some aspect of the
performance of an instrument
Status byte: type of message, followed by one or
two data bytes giving the values of parameters
Note On, Note Off, Key Pressure
Running status

MIDI data is transmitted using a 10-bit packet
that includes a start and stop bit
The MIDI message Note On is followed by two data
bytes, as is the Note Off message.
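
The structure of these messages can be sketched by building the three bytes directly. The status values 0x90 (Note On) and 0x80 (Note Off) and the 31250 bits per second serial rate come from the MIDI 1.0 specification rather than from the slides. The function names are hypothetical.

```python
NOTE_ON = 0x90        # status nibble for Note On; low nibble is the channel
NOTE_OFF = 0x80       # status nibble for Note Off
MIDI_BAUD = 31250     # bits per second on a standard MIDI cable
BITS_PER_BYTE_ON_WIRE = 10  # 8 data bits plus a start and a stop bit

def _message(status, channel, note, velocity):
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15, note and velocity 0-127")
    return bytes([status | channel, note, velocity])

def note_on(channel, note, velocity):
    """Status byte followed by two data bytes: note number and velocity."""
    return _message(NOTE_ON, channel, note, velocity)

def note_off(channel, note, velocity=0):
    return _message(NOTE_OFF, channel, note, velocity)

def transmission_time_ms(message):
    """Time to send the message over the serial link, in milliseconds."""
    return len(message) * BITS_PER_BYTE_ON_WIRE / MIDI_BAUD * 1000

middle_c = note_on(0, 60, 100)
```

At this rate a three-byte Note On takes about 0.96 ms to transmit. Running status is worthwhile because omitting a repeated status byte saves a third of that.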
42
General MIDI and QuickTime

General MIDI specifies 128 standard voices, Table
12.1
Drum machine and percussion samplers
Drum kits, Table 12.2
There is no guarantee that identical sounds will
be generated for each name by different
instruments.
A good sampler may use high quality samples of
the corresponding real instruments.
QuickTime: MIDI-like functionality

43
MIDI Software

MIDI sequencing programs
Capture and editing functions equivalent to those
of video editing software.
Multiple tracks
Composition
Music can be captured as it is played from MIDI
controllers attached to a computer via a MIDI
interface.
Punch in
The start and end point of a defective passage
are marked, the sequencer starts playing before
the beginning, and then switches to record mode,
allowing a new version of the passage to be
recorded to replace the original.

44
Sequencers

Quantize tempo during recording, fitting the
length of notes to exact sixteenth notes, or
eighth note triplets, or whatever duration is
specified.
Most programs allow music to be entered using
classical music notation.
Some allow printed sheet music to be scanned and
will perform optical character recognition to
transform the music into MIDI.
The opposite transformation, from MIDI to a
printed score, is also often provided, enabling
transcriptions of performed music to be made
automatically.
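
The quantization of note timing mentioned on this slide amounts to snapping each time to the nearest point on a grid. The sketch below assumes times are measured in sequencer ticks with 480 ticks per quarter note. That resolution is a common choice but an assumption here, and the names are hypothetical.

```python
TICKS_PER_QUARTER = 480                        # assumed sequencer resolution
SIXTEENTH = TICKS_PER_QUARTER // 4             # 120 ticks
EIGHTH_TRIPLET = TICKS_PER_QUARTER // 3        # 160 ticks

def quantize_time(ticks, grid):
    """Snap a time in ticks to the nearest multiple of `grid`."""
    return ((ticks + grid // 2) // grid) * grid

def quantize_notes(notes, grid):
    """Quantize (start, length) pairs, keeping every note at least one grid step long."""
    return [(quantize_time(start, grid), max(grid, quantize_time(length, grid)))
            for start, length in notes]

tidy = quantize_notes([(5, 110), (130, 95), (250, 30)], SIXTEENTH)
```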

Piano-roll interface, Fig. 12.14
Major limitations of MIDI
Impossibility of representing vocals
MIDI can be transformed into audio.
Reverse transformation is sometimes supported,
although it is more difficult to implement.

46
Computer Sequencing Software
47
Music Notation Software
48
Combining Sound and Picture

Voice-overs should match the picture they
describe, music will often be related to edits,
and natural sounds will be associated with events
on screen.
Synchronization, timecode
If sound and video are physically independent,
synchronization will sometimes be lost.
Audio and video data streams must carry the
equivalent of timecode, so that their
synchronization can be checked.
Audio and video play from local hard disk
For short clips, it is possible to load the
entire sound track into memory before playback
begins.
This is impractical for movies. For these, it is
normal to interleave the audio and video.
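
Interleaving can be pictured as alternating each video frame with the audio samples that belong to it. The sketch below is purely illustrative: the stream layout, the names, and the choice of 25 frames per second with 44.1 kHz audio are assumptions, not a description of any particular movie file format.

```python
SAMPLES_PER_FRAME = 44100 // 25  # 1764 audio samples per video frame at 25 fps

def interleave(video_frames, audio_samples, samples_per_frame):
    """Alternate each video frame with the chunk of audio that accompanies it."""
    stream = []
    for i, frame in enumerate(video_frames):
        chunk = audio_samples[i * samples_per_frame:(i + 1) * samples_per_frame]
        stream.append(("video", frame))
        stream.append(("audio", chunk))
    return stream

movie = interleave(["f0", "f1"], [1, 2, 3, 4], 2)
```

Keeping matching audio and video next to each other means a player reading the stream in order always has the sound for the frame it is about to show.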