Title: Data Representation, Data Structures, and Multifile compilation
1Data Representation, Data Structures, and
Multi-file compilation
2Data Representation
- Binary representation
- Octal, Hexadecimal
- Data types
3Memory concepts
- Every piece of information stored on a computer is
encoded as a combination of ones and zeros.
- These ones and zeros are called bits.
- One byte is a sequence of eight consecutive bits.
- A word is some number (typically 4) of
consecutive bytes.
4Binary representation
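A single (unsigned) byte of memory, shown from bit 0 (least significant) to bit 7 (most significant):
1 0 0 1 0 1 1 1
In decimal representation, this number is
$1\cdot2^0 + 0\cdot2^1 + 0\cdot2^2 + 1\cdot2^3 + 0\cdot2^4 + 1\cdot2^5 + 1\cdot2^6 + 1\cdot2^7 = 233$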
5Binary representation
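One bit must be used to store the sign of the number.
A single (signed) byte of memory, shown from bit 0 (least significant) to the sign bit:
1 0 0 1 0 1 1 +/-
In decimal representation, this number is
$\pm(1\cdot2^0 + 0\cdot2^1 + 0\cdot2^2 + 1\cdot2^3 + 0\cdot2^4 + 1\cdot2^5 + 1\cdot2^6) = \pm 105$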
6Binary representation, cont.
- What is the range of numbers that can be stored
in a single signed/unsigned byte?
- How would you write a program to convert an
arbitrary base 10 number to binary?
- How would you write a program to convert an
arbitrary binary number to base 10?
- What is the effect of right/left shifting bits
(assuming the lost bit is set to zero)?
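One possible approach to the two conversion questions above is sketched here. The function names to_binary and from_binary are made up for illustration, and the sketch assumes an unsigned int of at most 32 bits.

```c
/* Write the binary digits of value into out, most significant digit first.
   Assumption: unsigned int has at most 32 bits, so out needs 33 chars. */
void to_binary(unsigned int value, char out[33])
{
    char tmp[32];
    int n = 0;
    int i;

    /* Repeated division by 2 yields the bits from least to most significant. */
    do {
        tmp[n++] = (char)('0' + value % 2);
        value /= 2;
    } while (value != 0);

    /* Reverse them so the string reads in the usual order. */
    for (i = 0; i < n; i++) {
        out[i] = tmp[n - 1 - i];
    }
    out[n] = '\0';
}

/* Convert a string of '0' and '1' characters to its value.
   Stops at the first character that is not a binary digit. */
unsigned int from_binary(const char *bits)
{
    unsigned int value = 0;

    while (*bits == '0' || *bits == '1') {
        value = value * 2 + (unsigned int)(*bits - '0');
        bits++;
    }
    return value;
}
```

As for shifting: with the lost bit discarded and a zero shifted in, a left shift by one place doubles an unsigned value (modulo the width of the type), and a right shift by one place halves it, rounding down.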
7Octal representation
- Octal representation: base 8
- Just a simple extension of binary and decimal, but
using only the digits 0-7.
- Best seen with an example
- What is the value of the octal number 711?
- $1\cdot8^0 + 1\cdot8^1 + 7\cdot8^2 = 457$
- What is the octal representation of the number 64?
- 100 (since $0\cdot8^0 + 0\cdot8^1 + 1\cdot8^2 = 64$)
- Try this in C using the "%o" format expression
with printf: printf("%o\n", 457);
8Hexadecimal representation
- Hexadecimal representation: base 16
- Just a simple extension of binary, octal, and
decimal, but using 16 "digits": 0-9,a,b,c,d,e,f
- Example: What is the value of the hexadecimal number 10ef?
- $15\cdot16^0 + 14\cdot16^1 + 0\cdot16^2 + 1\cdot16^3 = 4335$
- Try this in C using the "%x" format expression
with printf: printf("%x\n", 4335);
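The positional expansion used in the octal and hexadecimal examples works the same way for any base. Here is a minimal sketch, assuming an ASCII-style character set in which the letters a-f are contiguous; digit_value and from_base are hypothetical helper names.

```c
/* Value of one digit character, or -1 if it is not a digit in bases up to 16.
   Assumption: '0'..'9', 'a'..'f' and 'A'..'F' are contiguous, as in ASCII. */
static int digit_value(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

/* Interpret digits as a number written in the given base (2..16).
   Stops at the first character that is not a valid digit for that base. */
unsigned int from_base(const char *digits, unsigned int base)
{
    unsigned int value = 0;
    int d;

    while ((d = digit_value(*digits)) >= 0 && (unsigned int)d < base) {
        /* Each new digit multiplies what came before by the base. */
        value = value * base + (unsigned int)d;
        digits++;
    }
    return value;
}
```

For example, printf("%o %x\n", from_base("711", 8), from_base("10ef", 16)); prints 711 10ef, since printf converts each value back to the base it was read in.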
9Understanding datatypes at a more fundamental
level
10char revisited
- Before doing some example bitwise operations, we
first revisit our simple C datatypes to
understand them at a deeper level.
- Recall that we have just a few basic types
- char, int, float, double
- Recall also that char represents a single byte of
storage, while int is typically 4 bytes
- Important: Do not be misled by the name "char".
The char datatype is really no different from int
(other than its storage capacity)
- What do I mean by "no different from int"? We
explore this with some examples on the next slide
11Char vs. int
Consider the following declarations:
int j = 4;
char k = 4;
In memory, these appear as
j (4 bytes, for a typical int)
k (1 byte)
They are both perfectly valid ways to represent
the number 4. In one case (int), there is much
more "wasted" memory. In the other case (char),
there is a much stricter limit on how large the
number can be if you choose to change it.
12Char, cont.
- Why would you not always use char to represent a
small number, such as 4?
- Consider what happens in this case:
- char j = 4;
- j = j + 300; /* bad! Can't store 304 in a char! */
- So, it is safer to use a larger type, such as
int, unless you are 100% sure that the char limit
will never be exceeded in the program!
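What actually happens on overflow can be seen in a small sketch. It uses unsigned char rather than plain char, because plain char may be signed and an out-of-range conversion to a signed type is implementation-defined, while conversion to an unsigned type is well defined (reduction modulo 256 when a byte has 8 bits). The function names are hypothetical.

```c
#include <limits.h>

/* Add delta to a one-byte value. The sum is computed as an int and then
   converted back to unsigned char, which keeps only the low byte. */
unsigned char add_to_byte(unsigned char start, int delta)
{
    return (unsigned char)(start + delta);
}

/* Return 1 if value can be stored in an unsigned char without loss, else 0. */
int fits_in_unsigned_char(int value)
{
    return value >= 0 && value <= UCHAR_MAX;
}
```

With 8-bit bytes, add_to_byte(4, 300) gives 48 rather than 304, because 304 - 256 = 48. A check such as fits_in_unsigned_char(304) lets a program notice the problem before the value is silently truncated.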
13Char as "character" storage
- So, if char is just an abbreviated int, what does
it have to do with characters?
- The answer is twofold
- First, char can do nothing special with
characters that int can't do.
- Both store the equivalent ASCII integer code when
single quotes are placed around a single
character in an assignment
- Example
- char c = 'e'; /* store the integer (ASCII) code
for the character e in the byte c */
- int c = 'e'; /* same as above, but store the integer
in a 4-byte (i.e. int) sequence */
14Char example
The best way to understand this is with a simple
example.
/* char_int1.c */
#include <stdio.h>

int main(void)
{
    char c;
    int j;

    j = 100;
    c = 100;                     /* random choice < 255 */
    printf("%d %d\n", j, c);     /* print j and c as decimal ints */
    printf("%c %c\n", j, c);     /* print j and c as characters */
    j = 'h';
    c = 'h';                     /* change assignment */
    printf("%c %c\n", j, c);     /* what is printed here? */
    printf("%d %d\n", j, c);     /* print ASCII code for 'h' */
    return 0;
}
15
#include <stdio.h>
#include <stdlib.h>   /* atoi, exit */

int main(int argc, char **argv)
{
    int input;

    if (argc != 2) {
        printf("%s\n", "Must enter a single argument");
        exit(1);
    }
    input = atoi(argv[1]);       /* grab input as integer */
    if (input > 255 || input < 0) {
        printf("%s\n", "Must enter a number >= 0 and < 256");
        exit(1);
    }
    printf("%s %c\n", "The corresponding character is", input);
    return 0;
}
16
#include <stdio.h>
#include <stdlib.h>   /* exit */

int main(int argc, char **argv)
{
    char input;

    if (argc != 2) {
        printf("%s\n", "Must enter a single argument");
        exit(1);
    }
    input = *argv[1];            /* grab single character from keyboard */
    printf("%s %c %d\n", "The ascii code for", input, input);
    return 0;
}
Note: We will not understand why the * needs to
be here until we study pointers. However, you
should be able to write equivalent code using
scanf.
17Very low-level stuff
18Bitwise operations
- C contains six operators for performing bitwise
operations on integers
- & Bitwise AND: if both bits are 1, the result is 1
- | Bitwise OR: if either bit is 1, the result is 1
- ^ Bitwise XOR (exclusive OR): if one and only
one bit equals 1, the result is 1
- ~ Bitwise invert: if the bit is 1, the result
is 0; if the bit is 0, the result is 1
- << n Left shift n places
- >> n Right shift n places
19Bitwise operations
- Bitwise operations are considered "low-level"
programming by today's standards. For many
programs, manipulating individual bits is never
necessary.
- Sometimes, this level of control is needed for
memory or performance optimization
- In any case, it is very important for a
conceptual understanding of programming
20Bitwise examples: AND
- Bitwise AND
- char j = 11; char k = 14;
- j     = 0 0 0 0 1 0 1 1
- k     = 0 0 0 0 1 1 1 0
- ---------------------
- j & k = 0 0 0 0 1 0 1 0 = 10
21OR
- Bitwise OR
- char j = 11; char k = 14;
- j     = 0 0 0 0 1 0 1 1
- k     = 0 0 0 0 1 1 1 0
- ---------------------
- j | k = 0 0 0 0 1 1 1 1 = 15
22XOR
- Bitwise XOR
- char j = 11; char k = 14;
- j     = 0 0 0 0 1 0 1 1
- k     = 0 0 0 0 1 1 1 0
- ---------------------
- j ^ k = 0 0 0 0 0 1 0 1 = 5
23Shifting
- Logical invert
- char j = 11;
- j  = 0 0 0 0 1 0 1 1
- ~j = 1 1 1 1 0 1 0 0 = 244 (read as an unsigned byte)
- Shifting
- char j = 11;
- j << 1 = 0 0 0 1 0 1 1 0 = 22
- j >> 1 = 0 0 0 0 0 1 0 1 = 5
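The six operators can be tried together in a short sketch. The struct and function names are made up for illustration. The casts back to unsigned char matter: C promotes char operands to int before applying these operators, so without the cast the inverted value would be a negative int rather than a one-byte pattern.

```c
/* Results of applying each bitwise operator to a pair of bytes. */
typedef struct {
    unsigned char and_bits;       /* j & k  */
    unsigned char or_bits;        /* j | k  */
    unsigned char xor_bits;       /* j ^ k  */
    unsigned char inverted;       /* ~j     */
    unsigned char shifted_left;   /* j << 1 */
    unsigned char shifted_right;  /* j >> 1 */
} bit_results;

/* Apply all six operators, keeping only the low byte of each result. */
bit_results bit_demo(unsigned char j, unsigned char k)
{
    bit_results r;

    r.and_bits      = (unsigned char)(j & k);
    r.or_bits       = (unsigned char)(j | k);
    r.xor_bits      = (unsigned char)(j ^ k);
    r.inverted      = (unsigned char)(~j);
    r.shifted_left  = (unsigned char)(j << 1);
    r.shifted_right = (unsigned char)(j >> 1);
    return r;
}
```

Calling bit_demo(11, 14) reproduces the values worked out by hand for 11 and 14.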
24Data Structures and Algorithms
25Sorting
- Comes up all the time
- Demonstrates important techniques
- Can be done many ways
- Different algorithms.
26Bubble Sort
- Very simple
- Terrible
- Go through list, swapping out-of-order neighbors
- Continue until no more swaps
27Bubble Sort
- N = number of items
- If the first number is initially at the bottom of the list,
have to go through the list N times
- Each time, looking/maybe swapping N times
- Total of $N^2$ operations
- S..L..O..W.. for long lists
- But if list is very nearly sorted, can be quick.
- No one would really use this algorithm.
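A minimal sketch of bubble sort as described above: keep passing through the list, swapping out-of-order neighbors, until a pass makes no swaps. Returning the number of passes is an addition for illustration only, to make the best and worst cases visible.

```c
/* Sort a[0..n-1] into ascending order.
   Returns the number of passes made through the list. */
int bubble_sort(int a[], int n)
{
    int passes = 0;
    int swapped;
    int i;
    int tmp;

    do {
        swapped = 0;
        for (i = 0; i + 1 < n; i++) {
            if (a[i] > a[i + 1]) {      /* out-of-order neighbors */
                tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
                swapped = 1;
            }
        }
        passes++;
    } while (swapped);                  /* stop after a pass with no swaps */

    return passes;
}
```

An already sorted list takes a single pass. A list of N items with the smallest value at the end takes N passes, because that value moves up only one place per pass.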
28Insertion sort
- About as simple, but better
- Way most people sort cards
- Keep inserting in order
- Still $N^2$, but faster on average
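Insertion sort can be sketched in a few lines. Everything to the left of position i is already in order, and the next item is slid leftward into its place, the way a card is inserted into a sorted hand.

```c
/* Sort a[0..n-1] into ascending order by repeated insertion. */
void insertion_sort(int a[], int n)
{
    int i;
    int j;
    int key;

    for (i = 1; i < n; i++) {
        key = a[i];                     /* next item to insert */
        j = i - 1;
        /* Shift larger items one place right to open a gap for key. */
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
}
```

The inner loop stops as soon as it finds an item that is not larger than key, which is why this tends to do less work than bubble sort on the same data even though the worst case is the same order.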
29Data Structures
- Both of these methods are very array-based
- Have to look through half/most/all of list each
iteration
- Definitely need N iterations
- Doomed to be fairly slow
- For faster techniques, need different ways of
looking at data.
30Binary Trees
- A binary tree is either empty, or consists of a
node with a left and a right child.
- Left and right children are binary trees
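The recursive definition above maps directly onto a C struct. This is one possible representation, with hypothetical names, in which a NULL pointer stands for the empty tree.

```c
#include <stddef.h>

/* A binary tree is either empty (NULL) or a node with two subtrees. */
struct tree_node {
    int key;
    struct tree_node *left;
    struct tree_node *right;
};

/* Number of nodes in the tree rooted at t. */
int count_nodes(const struct tree_node *t)
{
    if (t == NULL) {
        return 0;                       /* the empty tree has no nodes */
    }
    return 1 + count_nodes(t->left) + count_nodes(t->right);
}

/* Number of levels in the tree rooted at t (0 for the empty tree). */
int tree_levels(const struct tree_node *t)
{
    int l;
    int r;

    if (t == NULL) {
        return 0;
    }
    l = tree_levels(t->left);
    r = tree_levels(t->right);
    return 1 + (l > r ? l : r);
}
```

Because each child is itself a binary tree, functions on trees are naturally recursive: handle the empty case, then combine the results for the two children.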
31Complete Binary Trees
- In a complete binary tree, every node has either
2 or 0 children, and all nodes with 0 children ('leaf
nodes') are on the bottom level.
- A complete binary tree with L levels has $2^L - 1$ nodes
- One with N nodes has $\log_2(N+1)$ levels
32Heaps
- A binary tree with values ('keys') stored at each
node.
- Almost complete binary tree
- Partial ordering: the root's key is less than that of either
child, and both children are roots of heaps
33Storing a heap in an array
- Can easily store a heap in an array
- Parent node i has left child $(2i+1)$ and right
child $(2i+2)$.
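The index arithmetic can be written down as a few helpers, assuming the root is stored at index 0. The names are hypothetical, and the ordering check allows equal keys.

```c
/* Index arithmetic for a heap stored in an array with the root at index 0. */
int heap_left(int i)   { return 2 * i + 1; }
int heap_right(int i)  { return 2 * i + 2; }
int heap_parent(int i) { return (i - 1) / 2; }   /* valid for i > 0 */

/* Return 1 if a[0..n-1] obeys the heap ordering (no key is smaller
   than its parent's key), else 0. */
int is_min_heap(const int a[], int n)
{
    int i;

    for (i = 1; i < n; i++) {
        if (a[heap_parent(i)] > a[i]) {
            return 0;
        }
    }
    return 1;
}
```

For example, the array {1, 3, 2, 7, 4} is a heap: 1 has children 3 and 2, and 3 has children 7 and 4. No pointers are needed, since the tree shape is implied by the positions.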
34Why bother?
- Putting things in this partial order is easier than
sorting
- Very easy to find lowest value in data once data
is in heap
- This is useful
- Priority queue
- Sorting!
35Heap Sort teaser
- Get data into heap
- Top value is lowest value.
- Delete top value; re-heap
- Repeat until no more data
- Results are sorted list!
36Heap Operations: insert
- Put into existing heap
- Put number in first available leaf node.
- If parent tree no longer a heap, swap.
- Then repeat this process until you hit the root.
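The insert steps above might look like this for a heap stored in an array with the root at index 0. This is a sketch with a hypothetical function name, and the caller is assumed to have left room in the array for one more item.

```c
/* Insert key into the heap held in a[0..*n-1] and update *n.
   Assumption: the array has room for at least *n + 1 items. */
void heap_insert(int a[], int *n, int key)
{
    int i = *n;
    int parent;
    int tmp;

    a[i] = key;                         /* first available leaf */
    (*n)++;

    /* Swap upward while the parent is larger; stop at the root. */
    while (i > 0) {
        parent = (i - 1) / 2;
        if (a[parent] <= a[i]) {
            break;                      /* parent tree is still a heap */
        }
        tmp = a[parent];
        a[parent] = a[i];
        a[i] = tmp;
        i = parent;
    }
}
```

The loop can exit early: once a parent is no larger than the new key, everything above it is already in order.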
37Heap Operations delete root
- Take bottom-most value from the tree, put it
where root used to be
- Remove that node.
- Go down heap, swapping if node larger than
children.
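A sketch of deleting the root, again for an array-stored heap with the root at index 0. The function name is hypothetical and the caller is assumed to pass a non-empty heap. When swapping downward, the smaller of the two children is chosen so that the new parent is no larger than either child.

```c
/* Remove and return the smallest key from the heap in a[0..*n-1].
   Assumption: *n > 0 on entry. */
int heap_delete_root(int a[], int *n)
{
    int root = a[0];
    int i = 0;
    int child;
    int tmp;

    (*n)--;
    a[0] = a[*n];                       /* bottom-most value replaces the root */

    for (;;) {
        child = 2 * i + 1;              /* left child */
        if (child >= *n) {
            break;                      /* no children: done */
        }
        if (child + 1 < *n && a[child + 1] < a[child]) {
            child++;                    /* right child is the smaller one */
        }
        if (a[i] <= a[child]) {
            break;                      /* heap ordering restored */
        }
        tmp = a[i];
        a[i] = a[child];
        a[child] = tmp;
        i = child;
    }
    return root;
}
```

Calling this repeatedly returns the keys in ascending order, which is the idea behind heap sort.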
38Heap Ops build heap from data
- It's much easier to insert into an existing heap
than build one at once.
- Single nodes are always heaps!
- Start from bottom, working up, inserting parents
into heaps.
- Repeat until no more data
39Notice
- Heap insert/delete operations take lg(N)
operations (one per level of the tree).
- To build the heap, each piece of data needs to be put
in: N lg N operations
- To pull out the sorted list, need to do N delete operations,
each of which takes lg N steps: another N lg N operations.
- N lg N is much less than $N^2$ for large N!!
40Heapsort Algorithm
- Build heap from scratch
- For each piece of data,
- Get root value
- Delete from heap
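Putting the pieces together, a heapsort along these lines could be sketched as follows. The names are hypothetical. The heap is built bottom-up (leaves are already heaps, so work upward from the last parent), and the sorted values are written to a separate output array rather than sorted in place.

```c
/* Restore heap ordering below index i in a[0..n-1] (smallest key on top). */
static void sift_down(int a[], int n, int i)
{
    int child;
    int tmp;

    for (;;) {
        child = 2 * i + 1;
        if (child >= n) {
            break;
        }
        if (child + 1 < n && a[child + 1] < a[child]) {
            child++;                    /* pick the smaller child */
        }
        if (a[i] <= a[child]) {
            break;
        }
        tmp = a[i];
        a[i] = a[child];
        a[child] = tmp;
        i = child;
    }
}

/* Sort the n values in heap[] into ascending order in out[].
   Note: heap[] is used as working space and is modified. */
void heap_sort(int heap[], int n, int out[])
{
    int i;
    int k;

    /* Build heap from scratch, starting at the last node with a child. */
    for (i = n / 2 - 1; i >= 0; i--) {
        sift_down(heap, n, i);
    }

    /* For each piece of data: get the root value, then delete it. */
    for (k = 0; k < n; k++) {
        out[k] = heap[0];
        heap[0] = heap[n - 1 - k];      /* bottom-most value to the root */
        sift_down(heap, n - 1 - k, 0);  /* heap is now one item smaller */
    }
}
```

Each delete costs at most one swap per level of the tree, so pulling out all N values takes on the order of N lg N steps.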
41Multiple-File compilation
42Why more than one file?
- As a program gets bigger, having the whole program in
one file quickly gets awkward.
- File hard to read
- Takes forever to edit a 1M line file!
- Hard to re-use code
- Have to re-compile entire program even if just
small change in one routine
43Compilation vs. Linking
- Compilation: compile source code into machine
language.
- Generates object file (.o)
- Linking: bring in code from other libraries that
we might need
- Link in code for printf() from std. C library,
link in code for sin() from math library, etc.
- Generates an executable
44Compilation vs. Linking
- If all of the program is in one file, the distinction
isn't important, and gcc will do the compile/link
in one step.
- Otherwise, do it separately
- Running Average Example
- Sort Example
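As a rough idea of what a running-average example split across files could look like, here is a sketch. The file names avg.h, avg.c and main.c and the function names are hypothetical, not taken from the course examples. The pieces are shown in one listing, with comments marking where each file would begin.

```c
/* ---- avg.h (hypothetical): declarations shared by both source files ---- */
#ifndef AVG_H
#define AVG_H
void avg_add(double x);      /* feed one more value into the average */
double avg_value(void);      /* average of all values fed in so far  */
#endif

/* ---- avg.c: the definitions. In a real project this file would begin
   with #include "avg.h". Compile only:  gcc -c avg.c   (produces avg.o) ---- */
static double total = 0.0;   /* static: visible only inside this file */
static int count = 0;

void avg_add(double x)
{
    total += x;
    count++;
}

double avg_value(void)
{
    return count > 0 ? total / count : 0.0;
}

/* ---- main.c would #include "avg.h", call avg_add() and avg_value(),
   and be compiled separately:          gcc -c main.c  (produces main.o)
   The link step then combines them:    gcc -o avg main.o avg.o        ---- */
```

After a change to avg.c alone, only gcc -c avg.c and the link step need to be repeated; main.o can be reused as it is.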