Title: CSCI130
1Lecture 3 (1.1-2.3)
- CSCI130
- Instructor Dr. Imad Rahal
2Non-numeric Data Representation
- Text data
- The most common type of data people use on computers
- How can we transform it to binary? Not as intuitive as numbers!
- Numbers were (relatively) easy to map to binary
- Decimal to binary change
- Words can be divided into characters
- Each character can then be encoded by some binary code
- Every language has its set of letters → we will limit ourselves to the Latin alphabet
- What about letters and other symbols?
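The idea of splitting a word into characters and giving each character a binary code can be sketched as follows. This is only an illustration: the helper name `text_to_binary` is made up here, and it relies on Python's built-in `ord`, whose values match ASCII for the Latin letters considered in these slides.

```python
def text_to_binary(text, bits=8):
    """Return one binary code string per character of the text."""
    # ord() gives the numeric code of a character; format() writes it in binary,
    # padded with leading zeros to the requested number of bits.
    return [format(ord(ch), f"0{bits}b") for ch in text]


codes = text_to_binary("Hi")
print(codes)  # ['01001000', '01101001']
```

Each character becomes its own fixed-width bit pattern, so a word is simply the sequence of its characters' codes.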
3Non-numeric Data Representation
- Many transformations exist and all are arbitrary
- Two popular ones
- EBCDIC (Extended Binary Coded Decimal Interchange Code) by IBM
- ASCII (American Standard Code for Information Interchange) by American National Standards Institute (ANSI)
- Most widely used
- Every letter/symbol is represented by 7 bits
- How many letters/symbols do we have in total?
- A-Z (26), a-z (26), 0-9 (10), symbols (33), control characters (33) → 128 = 2^7 in total
- If using 1 byte/character → we have one extra bit
- Extended ASCII-8 (more mathematical and graphical symbols or phonetic marks)
- Why are passwords case sensitive?
- How to sort alphabetically
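A short sketch of why case matters and how sorting follows the character codes. The word list is made up for illustration; `ord`, `sorted`, and `str.lower` are standard Python.

```python
# Upper- and lower-case letters have different codes, so they are different data.
print(ord("A"), ord("a"))         # 65 97
print("Password" == "password")   # False

# Sorting compares character codes, so every capital letter comes before
# every lower-case letter.
words = ["banana", "Zebra", "apple"]
by_code = sorted(words)
print(by_code)                    # ['Zebra', 'apple', 'banana']

# To sort the way a dictionary does, compare lower-cased copies instead.
by_letter = sorted(words, key=str.lower)
print(by_letter)                  # ['apple', 'banana', 'Zebra']
```

Because "P" and "p" are stored as different bit patterns, a password check that compares codes will treat them as different characters.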
5Non-numeric Data Representation
- Given a memory register with the value 00110100
- 2's complement → 52
- Floating point → 0.625
- ASCII → character 4 (check ASCII table in book)
- There are some code blocks preceding such values informing the computer of the type
- Sometimes called meta-data
- E.g. Font style
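The same bit pattern read in two different ways can be checked with a small sketch. The function name `twos_complement` is an assumption for illustration, and the floating-point reading is left out because it depends on the format defined in the course book.

```python
def twos_complement(bits):
    """Interpret a bit string as a signed 2's complement integer."""
    value = int(bits, 2)
    # A leading 1 marks a negative number: subtract 2^(number of bits).
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


register = "00110100"
as_integer = twos_complement(register)
as_character = chr(int(register, 2))

print(as_integer)     # 52
print(as_character)   # 4
```

The bits never change; only the interpretation does, which is why the computer needs extra information about the type.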
6Non-numeric Data Representation
- Picture/Image Data
- 512x256 image → grid has 512 columns and 256 rows
- Divide the screen into a grid of cells each referred to as a pixel
- Pixel values and sizes depend on the type of the image
- Black & white images: 1 bit for every pixel such that 1 is black and 0 is white
- Grayscale images: 1 byte/pixel where 255 is black and 0 is white and anything in between is gray (higher/lower values are closer to black/white)
- Color images: three values per pixel
- Red/Green/Blue
- 1 byte per color → 3 bytes per pixel
- For a 512x256 image: 512 x 256 x 3 = 384K bytes
- For any image, we only store the pixel values
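The storage needed for each image type follows directly from the bits per pixel. The sketch below assumes uncompressed storage of pixel values only, with K meaning 1024; the function name `image_size_bytes` is made up for illustration.

```python
def image_size_bytes(width, height, bits_per_pixel):
    """Bytes needed to store only the pixel values of an uncompressed image."""
    return width * height * bits_per_pixel // 8


black_white = image_size_bytes(512, 256, 1)    # 1 bit per pixel
grayscale = image_size_bytes(512, 256, 8)      # 1 byte per pixel
color = image_size_bytes(512, 256, 24)         # 3 bytes per pixel (R, G, B)

print(black_white)     # 16384
print(grayscale)       # 131072
print(color)           # 393216
print(color // 1024)   # 384 (K bytes)
```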
8An Image File
- //B/W IMAGE
- P1
- 15 11
- 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1
- 0 1 0 0 0 0 1 0 0 0 0 1 0 0 1
- 0 1 0 0 0 0 1 1 1 1 0 1 1 1 1
- 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1
- 0 1 1 1 1 0 1 1 1 1 0 1 1 1 1
- 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
- 0 1 1 1 1 0 0 1 1 1 0 1 0 0 1
- 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1
- 0 1 1 1 1 0 0 0 0 1 0 1 0 0 1
- 0 0 0 0 1 0 0 1 0 1 0 1 0 0 1
- 0 1 1 1 1 0 0 1 1 1 0 1 1 1 1
- //COLORED IMAGE
- P3
- 2 7
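The black-and-white listing above has the shape of a plain-text bitmap: a type code (P1), the width and height, then one value per pixel with 1 for black. A minimal sketch of reading such data, assuming that layout and leaving out the comment line; `parse_p1` and `render` are hypothetical helper names:

```python
PBM_TEXT = """P1
15 11
0 1 1 1 1 0 1 1 1 1 0 1 1 1 1
0 1 0 0 0 0 1 0 0 0 0 1 0 0 1
0 1 0 0 0 0 1 1 1 1 0 1 1 1 1
0 1 0 0 0 0 0 0 0 1 0 1 0 0 1
0 1 1 1 1 0 1 1 1 1 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 1 1 1 0 0 1 1 1 0 1 0 0 1
0 1 0 0 0 0 0 0 0 1 0 1 0 0 1
0 1 1 1 1 0 0 0 0 1 0 1 0 0 1
0 0 0 0 1 0 0 1 0 1 0 1 0 0 1
0 1 1 1 1 0 0 1 1 1 0 1 1 1 1
"""


def parse_p1(text):
    """Split the text into header and pixels; return a list of pixel rows."""
    tokens = text.split()
    if tokens[0] != "P1":
        raise ValueError("not a P1 image")
    width, height = int(tokens[1]), int(tokens[2])
    bits = [int(t) for t in tokens[3:]]
    if len(bits) != width * height:
        raise ValueError("pixel count does not match the header")
    # Cut the flat list of pixels into rows of `width` values each.
    return [bits[r * width:(r + 1) * width] for r in range(height)]


def render(rows):
    """Draw black pixels (1) as '#' and white pixels (0) as spaces."""
    return ["".join("#" if p else " " for p in row) for row in rows]


rows = parse_p1(PBM_TEXT)
for line in render(rows):
    print(line)
```

Printing the rendered rows shows the letters that the 0s and 1s draw, which makes the point that the file stores nothing except the grid size and the pixel values.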
9Non-numeric Data Representation
- Movies are built from a number of images (or frames) that are displayed in a certain sequence at high speeds
- 30 images per second
- Assume every image has a size of 500 KB
- 500 KB x 30 = 15 MB per second
- A 2-hr movie needs (assume same sample image size used)
- 15 MB x 60 x 120 = 108 GB (billion bytes)!
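The movie-size arithmetic can be checked in a few lines. The sketch assumes K, M, and G are powers of 1000 and that every frame is the same size, as on the slide; the constant names are made up for illustration.

```python
FRAME_BYTES = 500_000          # 500 KB per image
FRAMES_PER_SECOND = 30
MOVIE_SECONDS = 2 * 60 * 60    # 2 hours = 120 minutes of 60 seconds

bytes_per_second = FRAME_BYTES * FRAMES_PER_SECOND
movie_bytes = bytes_per_second * MOVIE_SECONDS

print(bytes_per_second)   # 15000000 (15 MB)
print(movie_bytes)        # 108000000000 (108 GB)
```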
10Non-numeric Data Representation
- People use compression to reduce large movie sizes
- (Temporal compression) Usually the change between two consecutive images is small → store only difference between frames
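Storing only the difference between frames can be sketched with tiny one-row frames. This is a toy illustration of the idea, not how any real video format works; `frame_diff` and `apply_diff` are hypothetical names.

```python
def frame_diff(previous, current):
    """List the (position, new value) pairs where the frames differ."""
    return [(i, new)
            for i, (old, new) in enumerate(zip(previous, current))
            if old != new]


def apply_diff(previous, diff):
    """Rebuild the next frame from the previous frame and the stored changes."""
    frame = list(previous)
    for position, value in diff:
        frame[position] = value
    return frame


frame1 = [0, 0, 1, 1, 0, 0, 0, 0]
frame2 = [0, 0, 0, 1, 1, 0, 0, 0]

diff = frame_diff(frame1, frame2)
print(diff)                                 # [(2, 0), (4, 1)]
print(apply_diff(frame1, diff) == frame2)   # True
```

Only two changes are stored instead of eight pixel values; the saving grows when consecutive frames are large and nearly identical.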
11Non-numeric Data Representation
- Sound/Audio Data
- Produced when objects vibrate in matter (e.g. air)
- Sends a wave of pressure fluctuations
- Sounds differ because of variations in the sound wave frequency (pitch or speed)
- Frequency = cycles/second
- Higher wave frequency → pressure fluctuation switches back and forth more quickly during a period of time
- We hear this as a higher pitch
- Level of air pressure in each fluctuation, the wave's amplitude or height, determines how loud the sound is
- Heard sound depends on the amplitude and frequency
- [Figure: sound wave, vertical axis in Volts]
12Non-numeric Data Representation
- Not all frequencies are audible
- Your ears are particularly sensitive to sounds in the middle range, from about 500 Hz to 2 kHz
- The hi-fi range is defined as from 20 Hz to 20 kHz
- As you get older, you will find it more and more difficult to hear higher frequencies
- By the time you are able to afford a decent hi-fi system, you will probably be unable to fully appreciate its performance.
13Non-numeric Data Representation
- Numbers are used to represent the amplitude of the sound wave
- Analog is continuous and we need digital
- Digitize the sound signal
- Measure the voltage of the signal at certain intervals (e.g. 44,100 per sec for CDs)
- This is the process of sampling
- Reconstruct wave
- Sound will slightly vary
- A sound file is nothing but a sequence of numbers measured at equal intervals
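Sampling can be sketched by measuring a pure sine wave at equal time intervals. The sine wave stands in for a real voltage signal, the very low rate of 8 samples per second is chosen only so the output is short, and `sample_wave` is a hypothetical name.

```python
import math


def sample_wave(frequency_hz, amplitude, sample_rate, duration_s):
    """Measure a sine wave at equal intervals; return the list of samples."""
    count = int(sample_rate * duration_s)
    return [amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(count)]


# One cycle per second, measured 8 times per second, for 1 second.
samples = sample_wave(frequency_hz=1, amplitude=1.0, sample_rate=8, duration_s=1)
print([round(s, 2) for s in samples])
# [0.0, 0.71, 1.0, 0.71, 0.0, -0.71, -1.0, -0.71]
```

The resulting list of numbers is all that a simple sound file needs to hold; playing it back means rebuilding a wave through these points, which is why the reconstructed sound differs slightly from the original.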
14Non-numeric Data Representation
- Higher frequency/pitch
- more cycles during same time interval
- less time between any two cycles
- fewer measurements between any two peaks (given that measurements are taken at equal intervals of time)
- Compression can also be used for audio files
- MP3 reduces size to about 1/10th
- → faster transfer over the Internet
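The link between pitch and the number of measurements per cycle is a single division. The sketch assumes a round, illustrative rate of 40,000 samples per second and two example frequencies; `samples_per_cycle` is a hypothetical name.

```python
def samples_per_cycle(sample_rate, frequency_hz):
    """How many equally spaced measurements fall within one cycle of the wave."""
    return sample_rate / frequency_hz


low_pitch = samples_per_cycle(40_000, 500)      # 500 Hz tone
high_pitch = samples_per_cycle(40_000, 2_000)   # 2 kHz tone

print(low_pitch)    # 80.0
print(high_pitch)   # 20.0
```

At a fixed sampling rate, the higher-pitched tone gets a quarter of the measurements per cycle, so its shape is captured in less detail.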
15Non-numeric Data Representation
- Digital image and audio have a lot of advantages over non-digital ones
- Can easily be modified by changing the bit pattern
- Image enhancement, noise/distortion removal, etc.
- Superimposing one sound on another, or one image on another, results in new ones
- Not admissible as evidence in courts of law
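Modifying digital data by changing its numbers can be sketched with two toy operations: inverting a black-and-white image and superimposing two sounds by adding their samples. The data and the function names `invert` and `mix` are made up for illustration.

```python
def invert(image):
    """Swap black (1) and white (0) in every pixel of a B/W image."""
    return [[1 - pixel for pixel in row] for row in image]


def mix(sound_a, sound_b):
    """Superimpose two sounds by adding their samples one by one."""
    return [a + b for a, b in zip(sound_a, sound_b)]


image = [[0, 1, 1],
         [1, 0, 0]]
print(invert(image))             # [[1, 0, 0], [0, 1, 1]]

print(mix([1, 2, 3], [3, 2, 1]))   # [4, 4, 4]
```

Because such edits are this easy and leave no trace in the numbers themselves, it is hard to prove that a digital image or recording is unaltered.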
16Non-numeric Data Representation
- adapted from a course offered at BU