Chapter 2: How are data represented?

Slides: 29

Provided by: YB

1
Chapter 2 How are data represented?
2
Why do we care?

The accuracy of our results
The speed of processing
The range of alphabets available to us
The size of the files we must store
The quality of graphics on screen and on paper
The time it takes for Internet download

3
Why do computers work in binary?

Cheapest and simplest in design and engineering
Switch: on → 1, off → 0
Circuit voltages
1.7 volts or higher → 1
0.0 volts to 1.3 volts → 0
Voltages between 1.3 and 1.7 volts are avoided in design
Mathematics: binary numbers
Using digits 0 and 1 only.

4
Decimal vs. Binary

Decimal system
10 symbols: 1, 2, 3, ..., 9, 0
Base 10 (we have 10 fingers)
Decimal number 2324 reads: 2 thousands, 3 hundreds, twenty-four.
Binary system
2 symbols: 0 and 1
Base 2
Binary number 1101 = ?

5
Decimal vs. Binary
Decimal System

Digit                    2         3         2         4
Position values (base)   10^3      10^2      10^1      10^0
Position values          1000      100       10        1
Each digit represents    2 × 1000  3 × 100   2 × 10    4 × 1

Value in Decimal: 2 × 1000 + 3 × 100 + 2 × 10 + 4 × 1 = 2324D

Binary System

Digit                    1         1         0         1
Position values (base)   2^3       2^2       2^1       2^0
Position values          8         4         2         1
Each digit represents    1 × 8     1 × 4     0 × 2     1 × 1

Value in Decimal: 1 × 8 + 1 × 4 + 0 × 2 + 1 × 1 = 13D
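
The position-value sum on this slide can be sketched in C. This is a minimal illustration, not part of the original deck; the function name `binary_to_decimal` is a hypothetical helper. It scans the digits left to right and doubles the running total at each step, which gives the same result as adding digit × position value.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a string of '0'/'1' characters to its numeric value.
   Scanning stops at the first character that is not a binary digit. */
unsigned long binary_to_decimal(const char *bits)
{
    assert(bits != NULL);
    unsigned long value = 0;
    for (size_t i = 0; bits[i] == '0' || bits[i] == '1'; i++) {
        /* Doubling the running total moves every earlier digit one
           position to the left, then the new digit is added as the 2^0 place. */
        value = value * 2 + (unsigned long)(bits[i] - '0');
    }
    return value;
}
```

Calling `binary_to_decimal("1101")` returns 13, matching the 13D worked out on the slide.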
6
Why do computers work in binary?

Binary digits = bits
8 bits = 1 byte
2^10 bytes = 1024 bytes = 1 kilobyte = 1 KB
2^20 bytes = 2^10 KB = 1 megabyte = 1 MB
2^30 bytes = 2^10 MB = 1 gigabyte = 1 GB
2^40 bytes = 2^10 GB = 1 terabyte = 1 TB

7
Types of data

Instructions
Computer instructions are coded in sequences of
0s and 1s
Numbers
2324, -34.35, 34567890123.12345
Characters and symbols
A, B, C, ..., Z, a, b, c, ..., z,
0, 1, 2, 3, ..., 9, +, -, ), (, etc.
Images
Photos, charts, drawings
Audio
Sound, music, etc
Video
Video clips and movies

8
Representation of Numbers

Fixed-size-storage approach
Computers allocate a specified amount of space
for a number
Integers
1 bit: 0 to 1
2 bits: 00, 01, 10, 11 → 0 to 3
4 bits: 0000, 0001, 0010, ..., 1111 → 0 to 15
1 byte: 0 to 255
2 bytes (signed): -32,768 to 32,767
4 bytes (signed): -2,147,483,648 to 2,147,483,647
Note: with 4 bytes for integers, any number smaller than -2,147,483,648 or larger than 2,147,483,647 would be incorrectly represented.
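
The ranges listed on this slide follow directly from the number of bits. The sketch below is an illustration only: the function names are hypothetical, and the signed range assumes two's complement storage, which the slide does not state but which matches its 2-byte and 4-byte figures.

```c
#include <assert.h>
#include <stdint.h>

/* Largest unsigned value that fits in `bits` bits: 2^bits - 1.
   This sketch supports 1 to 63 bits. */
uint64_t unsigned_max(unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    return (UINT64_C(1) << bits) - 1;
}

/* Largest signed value (two's complement assumed): 2^(bits-1) - 1. */
int64_t signed_max(unsigned bits)
{
    assert(bits >= 1 && bits <= 63);
    return (int64_t)((UINT64_C(1) << (bits - 1)) - 1);
}

/* Smallest signed value (two's complement assumed): -2^(bits-1). */
int64_t signed_min(unsigned bits)
{
    return -signed_max(bits) - 1;
}
```

For 32 bits, `signed_min(32)` and `signed_max(32)` give -2,147,483,648 and 2,147,483,647, the 4-byte limits.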

9
Representation of Numbers
Binary representation of real numbers
Binary System

Digit                    1      0      .   1        1         1
Position values (base)   2^1    2^0        2^-1     2^-2      2^-3
Position values          2      1          1/2      1/4       1/8
Each digit represents    1 × 2  0 × 1      1 × 0.5  1 × 0.25  1 × 0.125

Value in Decimal: 2 + ½ + ¼ + 1/8 = 2.875D
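
The same position-value idea extends to digits after the binary point. The following C sketch is an illustration under stated assumptions: `binary_real_to_decimal` is a hypothetical name, and the input is taken to be a string of 0s and 1s with at most one point.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a binary string such as "10.111" to its decimal value. */
double binary_real_to_decimal(const char *s)
{
    assert(s != NULL);
    double value = 0.0;
    size_t i = 0;

    /* Integer part: each new digit doubles the running total. */
    for (; s[i] == '0' || s[i] == '1'; i++)
        value = value * 2.0 + (s[i] - '0');

    if (s[i] == '.') {
        double place = 0.5;              /* position value 2^-1 */
        for (i++; s[i] == '0' || s[i] == '1'; i++) {
            value += (s[i] - '0') * place;
            place /= 2.0;                /* next position: 2^-2, 2^-3, ... */
        }
    }
    return value;
}
```

`binary_real_to_decimal("10.111")` returns 2.875, the value computed on the slide.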
10
Representation of Numbers

Floating-point numbers for real numbers
Three parts of representation
Sign (always 1 bit: 0 for + and 1 for -)
Significant digits (e.g., six bits)
The power of 2 for the leftmost digit (e.g., 3 bits)
Example for binary -1111.01B
Sign: 1 (negative)
Significant digits: 111101B
Power of 2: 011B
Example for binary 100.1101B
Sign: 0 (positive)
Significant digits: 100110B
Note: the last digit is lost, which is 1/16 in decimal
Power of 2: 010B
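
To check the two examples, the three parts can be decoded back into a value. This is a sketch of the slide's toy format only (1 sign bit, six significant bits, 3-bit power of 2), not of any real floating-point standard; the function name `toy_float_value` is hypothetical.

```c
#include <assert.h>

/* Decode the toy format. `sign` is 0 or 1, `digits` holds the six
   significant bits as a number (0..63), and `power` (0..7) is the power
   of 2 that belongs to the leftmost of those six bits. */
double toy_float_value(unsigned sign, unsigned digits, unsigned power)
{
    assert(sign <= 1 && digits <= 63 && power <= 7);

    /* The leftmost bit is worth 2^power, so the rightmost of the six
       bits is worth 2^(power - 5). Scale the digits accordingly. */
    double value = (double)digits;
    int shift = (int)power - 5;
    for (; shift > 0; shift--)
        value *= 2.0;
    for (; shift < 0; shift++)
        value /= 2.0;

    return sign ? -value : value;
}
```

With sign 1, digits 111101B (0x3D), and power 011B (3), the function returns -15.25, which is -1111.01B. With sign 0, the first six digits of 100.1101B (100110B, 0x26), and power 010B (2), it returns 4.75 rather than 4.8125: the difference is the lost 1/16.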

11
Representation of Numbers

Single-precision floating-point numbers
Sign (always 1 bit: 0 for + and 1 for -)
Significant digits: 23 bits
Exponent: 8 bits
Double-precision floating-point numbers
Sign (always 1 bit: 0 for + and 1 for -)
Significant digits: 52 bits
Exponent: 11 bits
What should you know?
Computers can represent numbers only with limited accuracy.
E.g., when you enter a 20-digit decimal into a program that uses single precision, only about 7 digits are actually stored; the rest are lost.
Real examples
Designing aircraft on p.35
The Vancouver Stock Exchange Index on pp. 38-39
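
The loss of digits in single precision can be observed directly. The sketch below assumes that C's `float` and `double` are the single- and double-precision formats described on this slide, which holds on most current machines but is not guaranteed by the language; `single_precision_error` is a hypothetical name.

```c
#include <assert.h>
#include <float.h>

/* Store x in single precision and report how far the stored value is
   from x. The result is 0 only when x fits in single precision exactly. */
double single_precision_error(double x)
{
    /* Values outside the float range cannot be converted meaningfully. */
    assert(x >= -FLT_MAX && x <= FLT_MAX);

    float stored = (float)x;             /* keeps roughly 7 decimal digits */
    double diff = (double)stored - x;
    return diff < 0 ? -diff : diff;
}
```

A value such as 0.5 is stored exactly, so its error is 0. A value such as 0.1, or a 20-digit number like 12345678901234567890.0, is not, so the function returns a nonzero error.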

12
Representation of Numbers

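// file public_html/2005f-html/cil102/accuracy.c
#include <stdio.h>

int main()
{
    // x, y, and result all use 32 bits to represent integers
    // (-2,147,483,648 to 2,147,483,647)
    int x, y, result;
    char op;
    int i;

    for (i = 0; i < 100; i++) {
        printf("please enter an expression\n");
        // stop when the input is not "number operator number"
        if (scanf("%d %c %d", &x, &op, &y) != 3)
            break;
        if (op == '+')
            result = x + y;   // a sum outside the int range overflows and is not stored correctly
        else if (op == '-')
            result = x - y;
        else {
            printf("Invalid operator!!\n");
            break;
        }
        // the transcript ends at the break above; printing the result is assumed
        printf("%d %c %d = %d\n", x, op, y, result);
    }
    return 0;
}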

13
Representation of Numbers

Variable-size-storage approach
Allow a wide-range of numbers to be stored
accurately
Needs significantly more time to process
The fixed-size approach is more commonly used than the variable-size approach.

14
Representation of characters

There are no visual letters A, B, C, etc. stored in computers the way we picture them.
Letters and symbols are encoded in 8 bits (one byte) of 0s and 1s.
The keyboard converts keys A, B, C, etc. to their corresponding codes, and the monitor converts the codes into visual letters A, B, C, etc. on screen.
Two commonly used coding schemes
ASCII: American Standard Code for Information Interchange
EBCDIC: Extended Binary Coded Decimal Interchange Code
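
The one-byte code behind a character can be shown as its eight bits. This is a small sketch, assuming a machine that uses ASCII (as nearly all current ones do); `char_to_bits` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Write the 8-bit pattern of character c into out as eight '0'/'1'
   characters followed by a terminating '\0'. */
void char_to_bits(unsigned char c, char out[9])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        /* Test bit 7 (leftmost) first and bit 0 last. */
        out[i] = (c & (0x80u >> i)) ? '1' : '0';
    }
    out[8] = '\0';
}
```

Under ASCII, the letter A has code 65, so `char_to_bits('A', buf)` fills `buf` with 01000001.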

15
Representation of characters
16
Representation of characters

Foreign characters: two approaches
Use one byte per character
E.g.,
ISO-8859-1 for Western (Roman)
ISO-8859-7 for Greek
ISO-2022-CN for simplified Chinese
Web pages use the META charset setting to specify which encoding is used.
Use two bytes per character/symbol
16 bits have 65,536 combinations (characters)
Unicode coding system

17
Representation of Images

A picture is treated as a matrix of dots, called
pixels.

18
Representation of Images

The pixels are so small and close together we
cannot really see them as separate dots.
Resolution: dots per inch (dpi)
72 dpi for Web images
600 or 1200 dpi for professional printers or home photo printers

19
Representation of Images

The color of each pixel is represented using
bits.
Black/White: one bit per pixel
1 = white and 0 = black
Gray scale: one byte per pixel
256 different degrees of gray (00000000 to 11111111)
00000000 black, 01111111 intermediate gray, 11111111 white
Color: three bytes per pixel
Red, green, and blue
One byte for the intensity of each of the three colors
256 possible red, 256 green, 256 blue
Pure red: 11111111 for the red byte, 00000000 for green and blue
White: 11111111 for all three bytes
Black: 00000000 for all three bytes

20
Representation of Images

Image storage: size
Gray scale: one byte per pixel
E.g., a 3 × 5 inch picture with 300 dpi resolution
3 × 300 = 900 pixels per column
5 × 300 = 1500 pixels per row
900 × 1500 = 1,350,000 pixels/picture
Needed storage: 1,350,000 bytes/picture ≈ 1 MB/picture
Color: three bytes per pixel
E.g., a 3 × 5 inch picture with 300 dpi resolution
3 × 300 = 900 pixels per column
5 × 300 = 1500 pixels per row
900 × 1500 = 1,350,000 pixels/picture
Needed storage: 3 (bytes per pixel) × 1,350,000 = 4,050,000 bytes/picture ≈ 4 MB/picture --- TOO BIG
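
The storage arithmetic on this slide can be written as one small function. This is a sketch; the name `image_bytes` and its parameters are hypothetical, with sizes in inches and resolution in dots per inch as in the example.

```c
#include <assert.h>

/* Bytes needed for an uncompressed picture.
   width_in, height_in: picture size in inches
   dpi: pixels per inch
   bytes_per_pixel: 1 for gray scale, 3 for color */
unsigned long image_bytes(unsigned width_in, unsigned height_in,
                          unsigned dpi, unsigned bytes_per_pixel)
{
    assert(dpi > 0 && bytes_per_pixel > 0);
    unsigned long pixels_per_row = (unsigned long)width_in * dpi;
    unsigned long pixels_per_column = (unsigned long)height_in * dpi;
    return pixels_per_row * pixels_per_column * bytes_per_pixel;
}
```

For the picture above, `image_bytes(5, 3, 300, 1)` gives 1,350,000 bytes in gray scale and `image_bytes(5, 3, 300, 3)` gives 4,050,000 bytes in color.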

21
Representation of Images

Image compression
Color table
Most pictures contain a small number of different colors
Use a table to define colors that are actually
used in the picture
Each pixel has an index to the color table.
Each image contains a color table and table
indices
Example
For a picture with 100 different colors, the color table would contain 100 entries of three bytes each, one per color. One byte can be used as the index into the table for each pixel.
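
A color table and its one-byte indices might look like this in C. The sketch is illustrative only: the type `Rgb` and both function names are hypothetical, and the size formula counts just the table and the indices, ignoring any file header.

```c
#include <assert.h>
#include <stddef.h>

/* One color table entry: three bytes, one per color intensity. */
typedef struct {
    unsigned char r, g, b;
} Rgb;

/* Look up the full color of a pixel from its one-byte table index. */
Rgb lookup_color(const Rgb *table, unsigned char index)
{
    assert(table != NULL);
    return table[index];
}

/* Bytes for an indexed picture: 3 bytes per table entry plus
   1 index byte per pixel. One byte can index at most 256 colors. */
unsigned long indexed_image_bytes(unsigned long pixels, unsigned colors)
{
    assert(colors >= 1 && colors <= 256);
    return 3UL * colors + pixels;
}
```

For a picture of 1,350,000 pixels with 100 colors, `indexed_image_bytes(1350000UL, 100)` gives 1,350,300 bytes, about a third of the 4,050,000 bytes needed at three bytes per pixel.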

22
Representation of Images

Drawing commands
Draw picture using basic commands
Just as an artist draws using a pencil or a brush and other basic movements
Example:
A house is drawn by sketching various elements
(doors, windows, walls), adding color to them,
and moving to the desired position.

23
Representation of Images

Data averaging or sampling
Condense the size by selecting a smaller
collection of information to store.
Many different ways of sampling and data
averaging
An example: choose to store only every other pixel in an image (sampling), reducing the size to half. To display the full picture, the computer needs to fill in the missing data with, for example, the average of neighboring pixels (data averaging).
The resulting picture cannot be as sharp as the
original
Lossy data compression
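
The every-other-pixel example can be sketched for a single row of gray-scale pixels. This is one possible reading of the slide, not a real image format: both function names are hypothetical, and a missing pixel at the end of a row, which has only one stored neighbor, is simply copied from it.

```c
#include <assert.h>
#include <stddef.h>

/* Sampling: keep every other pixel of a row (positions 0, 2, 4, ...).
   Returns the number of pixels written to out. */
size_t sample_row(const unsigned char *row, size_t n, unsigned char *out)
{
    assert(row != NULL && out != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; i += 2)
        out[kept++] = row[i];
    return kept;
}

/* Data averaging: rebuild a row of n pixels from the stored samples.
   Even positions are copied back. Each odd position gets the average
   of its two stored neighbors, or a copy of the left neighbor when
   it is the last pixel in the row. */
void rebuild_row(const unsigned char *samples, size_t n, unsigned char *out)
{
    assert(samples != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t left = i / 2;
        if (i % 2 == 0)
            out[i] = samples[left];
        else if (i + 1 < n)
            out[i] = (unsigned char)((samples[left] + samples[left + 1]) / 2);
        else
            out[i] = samples[left];
    }
}
```

For the row 10, 20, 30, 90, 50, 60 the stored samples are 10, 30, 50 and the rebuilt row is 10, 20, 30, 40, 50, 50. The bright pixel 90 comes back as 40, which is why the result cannot be as sharp as the original.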

24
What are .gif, .ps, .jpg, .bmp formats?

Commonly used image file formats -1
Bitmap (.bmp)
Pixel-by-pixel storage of all color information
for each pixel.
Lossless representation
Files are huge.
Graphics Interchange Format (.gif)
Uses one or more color tables (the color table technique)
Each table contains 256 colors.
Suitable for pictures with a small number (< 256) of different colors (e.g., organization charts)
Not suitable for pictures with shading (e.g., photos)

25
What are .gif, .ps, .jpg, .bmp formats?

Commonly used image file formats - 2
PostScript (.ps)
Employs the drawing commands technique
E.g., moveto sets the current position, lineto draws a line from the current position to a new one, and arc draws an arc given its center, radius, etc.
General shapes can be used in multiple places
Fonts can be reused.
Useful when the picture can be rendered as a drawing or it contains many of the same elements (e.g., text in the same fonts)
Joint Photographic Experts Group (JPEG) (.jpg)
Uses data averaging and sampling on 8 × 8 pixel blocks
The user determines the level of detail and clarity
High-quality image: 8 × 8 blocks maintain their contents
Low-quality image: info in 8 × 8 blocks is discarded → smaller files

26
Comparison between .jpg, .gif, and .ps

Pictures in the textbook
http://www.cs.grinnell.edu/walker/fluency-book/figures/chapter2/fig-2-overview.html
Comparison of .jpg and .gif
http://www.siriusweb.com/tutorials/gifvsjpg/
More on .jpg and .gif
http://www.wfu.edu/matthews/misc/jpg_vs_gif/JpgVsGif.html

27
Summary: chapter 2

Computers work in binary
Integers may be constrained in size
Real numbers may have limited accuracy
Computations may produce roundoff errors,
affecting accuracy
Characters and languages are encoded in binary
Pictures are displayed pixel by pixel
Color table, drawing commands, and data averaging and sampling compression techniques
.bmp, .jpg, .gif, .ps formats

28
Terminology