Homework 1 review - PowerPoint PPT Presentation

Provided by: randa84

1
Lecture 2
  • Homework 1 review
  • Review of Caching (chapter 6) for Lab 1
  • Linking (chapter 7)
  • All source code is posted at
    http://reed.cs.depaul.edu/lperkovic/csc374/lectures/lecture2/

2
Review of caching for lab 1
  • The following slides are a review of caching.
  • You will need these ideas for lab 1.

3
Intel Pentium Cache Hierarchy
Processor Chip
  • Regs.
  • L1 Data: 1 cycle latency, 16 KB, 4-way assoc, write-through, 32B lines
  • L1 Instruction: 16 KB, 4-way, 32B lines
L2 Unified: 128 KB--4 MB, 4-way assoc, write-back, write allocate, 32B lines
Main Memory: up to 4 GB

4
Cache Performance Metrics
  • Miss Rate
  • Fraction of memory references not found in cache
    (misses/references)
  • Typical numbers
  • 3-10% for L1
  • can be quite small (e.g., < 1%) for L2, depending
    on size, etc.
  • Hit Time
  • Time to deliver a line in the cache to the
    processor (includes time to determine whether the
    line is in the cache)
  • Typical numbers
  • 1 clock cycle for L1
  • 3-8 clock cycles for L2
  • Miss Penalty
  • Additional time required because of a miss
  • Typically 25-100 cycles for main memory

5
Writing Cache Friendly Code
  • Repeated references to variables are good
    (temporal locality)
  • Stride-1 reference patterns are good (spatial
    locality)
  • Examples
  • cold cache, 4-byte words, 4-word cache blocks

int sumarrayrows(int a[M][N])
{
    int i, j, sum = 0;

    for (i = 0; i < M; i++)
        for (j = 0; j < N; j++)
            sum += a[i][j];
    return sum;
}
Miss rate = 1/4 = 25%

int sumarraycols(int a[M][N])
{
    int i, j, sum = 0;

    for (j = 0; j < N; j++)
        for (i = 0; i < M; i++)
            sum += a[i][j];
    return sum;
}
Miss rate = 100%

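The two miss rates can be checked with a small simulation. This is only a sketch under stated assumptions: a direct-mapped cache, an array starting at address 0, and 4-byte words. The function name count_misses and its parameters are hypothetical and not part of the course code.

```c
#include <stdlib.h>

#define WORD_BYTES 4   /* assumed word size, matching "4-byte words" above */

/* Count misses for one pass over an int a[rows][cols] array laid out in
 * row-major order starting at address 0. The cache is direct-mapped with
 * num_lines lines of block_bytes bytes each and starts cold.
 * by_rows != 0 walks a[i][j] with j innermost; otherwise i is innermost.
 * Returns -1 if the tag array cannot be allocated. */
long count_misses(int rows, int cols, int by_rows, int num_lines, int block_bytes)
{
    long *tags = malloc(sizeof(long) * (size_t)num_lines);
    long misses = 0;
    int outer_n = by_rows ? rows : cols;
    int inner_n = by_rows ? cols : rows;

    if (tags == NULL)
        return -1;
    for (int l = 0; l < num_lines; l++)
        tags[l] = -1;                         /* cold cache: nothing cached */

    for (int o = 0; o < outer_n; o++) {
        for (int n = 0; n < inner_n; n++) {
            int i = by_rows ? o : n;
            int j = by_rows ? n : o;
            long addr  = ((long)i * cols + j) * WORD_BYTES;
            long block = addr / block_bytes;  /* memory block holding a[i][j] */
            int  line  = (int)(block % num_lines);
            if (tags[line] != block) {        /* miss: load the block */
                misses++;
                tags[line] = block;
            }
        }
    }
    free(tags);
    return misses;
}
```

With a 64 x 64 array, 16-byte (4-word) blocks and 16 cache lines, the row-wise walk misses on one access in four and the column-wise walk misses on every access.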
6
Matrix Multiplication Example
  • Major Cache Effects to Consider
  • Total cache size
  • Exploit temporal locality and keep the working
    set small (e.g., by using blocking)
  • Block size
  • Exploit spatial locality
  • Description
  • Multiply N x N matrices
  • O(N^3) total operations
  • Accesses
  • N reads per source element
  • N values summed per destination
  • but may be able to hold in register

/* ijk */
for (i=0; i<n; i++) {
  for (j=0; j<n; j++) {
    sum = 0.0;
    for (k=0; k<n; k++)
      sum += a[i][k] * b[k][j];
    c[i][j] = sum;
  }
}

Variable sum held in register
7
Miss Rate Analysis for Matrix Multiply
  • Assume
  • Line size = 32B (big enough for 4 64-bit words)
  • Matrix dimension (N) is very large
  • Approximate 1/N as 0.0
  • Cache is not even big enough to hold multiple
    rows
  • Analysis Method
  • Look at access pattern of inner loop

8
Layout of C Arrays in Memory (review)
  • C arrays allocated in row-major order
  • each row in contiguous memory locations
  • Stepping through columns in one row
  • for (i = 0; i < N; i++) sum += a[0][i];
  • accesses successive elements
  • if block size (B) > 4 bytes, exploit spatial
    locality
  • compulsory miss rate = 4 bytes / B
  • Stepping through rows in one column
  • for (i = 0; i < n; i++) sum += a[i][0];
  • accesses distant elements
  • no spatial locality!
  • compulsory miss rate = 1 (i.e. 100%)

9
Matrix Multiplication (ijk)
/* ijk */
for (i=0; i<n; i++) {
  for (j=0; j<n; j++) {
    sum = 0.0;
    for (k=0; k<n; k++)
      sum += a[i][k] * b[k][j];
    c[i][j] = sum;
  }
}

Inner loop
  • A (i,*): row-wise
  • B (*,j): column-wise
  • C (i,j): fixed
Misses per Inner Loop Iteration
  • A = 0.25, B = 1.0, C = 0.0
10
Matrix Multiplication (jik)
/* jik */
for (j=0; j<n; j++) {
  for (i=0; i<n; i++) {
    sum = 0.0;
    for (k=0; k<n; k++)
      sum += a[i][k] * b[k][j];
    c[i][j] = sum;
  }
}

Inner loop
  • A (i,*): row-wise
  • B (*,j): column-wise
  • C (i,j): fixed
Misses per Inner Loop Iteration
  • A = 0.25, B = 1.0, C = 0.0
11
Matrix Multiplication (kij)
/* kij */
for (k=0; k<n; k++) {
  for (i=0; i<n; i++) {
    r = a[i][k];
    for (j=0; j<n; j++)
      c[i][j] += r * b[k][j];
  }
}
Inner loop
  • A (i,k): fixed
  • B (k,*): row-wise
  • C (i,*): row-wise
Misses per Inner Loop Iteration
  • A = 0.0, B = 0.25, C = 0.25
12
Matrix Multiplication (ikj)
/* ikj */
for (i=0; i<n; i++) {
  for (k=0; k<n; k++) {
    r = a[i][k];
    for (j=0; j<n; j++)
      c[i][j] += r * b[k][j];
  }
}
Inner loop
  • A (i,k): fixed
  • B (k,*): row-wise
  • C (i,*): row-wise
Misses per Inner Loop Iteration
  • A = 0.0, B = 0.25, C = 0.25
13
Matrix Multiplication (jki)
/* jki */
for (j=0; j<n; j++) {
  for (k=0; k<n; k++) {
    r = b[k][j];
    for (i=0; i<n; i++)
      c[i][j] += a[i][k] * r;
  }
}
Inner loop
  • A (*,k): column-wise
  • B (k,j): fixed
  • C (*,j): column-wise
Misses per Inner Loop Iteration
  • A = 1.0, B = 0.0, C = 1.0
14
Matrix Multiplication (kji)
/* kji */
for (k=0; k<n; k++) {
  for (j=0; j<n; j++) {
    r = b[k][j];
    for (i=0; i<n; i++)
      c[i][j] += a[i][k] * r;
  }
}
Inner loop
  • A (*,k): column-wise
  • B (k,j): fixed
  • C (*,j): column-wise
Misses per Inner Loop Iteration
  • A = 1.0, B = 0.0, C = 1.0
15
Summary of Matrix Multiplication
  • ijk (& jik)
  • 2 loads, 0 stores
  • misses/iter = 1.25
  • kij (& ikj)
  • 2 loads, 1 store
  • misses/iter = 0.5
  • jki (& kji)
  • 2 loads, 1 store
  • misses/iter = 2.0

/* ijk */
for (i=0; i<n; i++) {
  for (j=0; j<n; j++) {
    sum = 0.0;
    for (k=0; k<n; k++)
      sum += a[i][k] * b[k][j];
    c[i][j] = sum;
  }
}

/* kij */
for (k=0; k<n; k++) {
  for (i=0; i<n; i++) {
    r = a[i][k];
    for (j=0; j<n; j++)
      c[i][j] += r * b[k][j];
  }
}

/* jki */
for (j=0; j<n; j++) {
  for (k=0; k<n; k++) {
    r = b[k][j];
    for (i=0; i<n; i++)
      c[i][j] += a[i][k] * r;
  }
}
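To make the comparison concrete, here is a sketch of two of the orderings as complete functions. It assumes n x n matrices of doubles stored row-major in flat arrays; the names matmul_ijk and matmul_kij are hypothetical. Both compute the same product, and only the order of memory accesses differs.

```c
/* c = a * b, ijk order: the inner loop walks a row of a and a column of b. */
void matmul_ijk(int n, const double *a, const double *b, double *c)
{
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;               /* can be held in a register */
            for (int k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* c = a * b, kij order: the inner loop walks rows of b and c (stride 1). */
void matmul_kij(int n, const double *a, const double *b, double *c)
{
    for (int i = 0; i < n * n; i++)
        c[i] = 0.0;                         /* kij accumulates into c */
    for (int k = 0; k < n; k++) {
        for (int i = 0; i < n; i++) {
            double r = a[i * n + k];
            for (int j = 0; j < n; j++)
                c[i * n + j] += r * b[k * n + j];
        }
    }
}
```

Timing the two functions on a large n is one way to observe the effect of the miss rates, although, as the next slide notes, miss rates are not perfect predictors.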
16
Pentium Matrix Multiply Performance
  • Miss rates are helpful but not perfect
    predictors.
  • Code scheduling matters, too.

17
Improving Temporal Locality by Blocking
  • Example: Blocked matrix multiplication
  • block (in this context) does not mean cache
    block.
  • Instead, it means a sub-block within the matrix.
  • Example: N = 8; sub-block size = 4

| A11 A12 |     | B11 B12 |     | C11 C12 |
| A21 A22 |  X  | B21 B22 |  =  | C21 C22 |

Key idea: Sub-blocks (i.e., Axy) can be treated
just like scalars.
C11 = A11B11 + A12B21    C12 = A11B12 + A12B22
C21 = A21B11 + A22B21    C22 = A21B12 + A22B22
18
Blocked Matrix Multiply (bijk)
for (jj=0; jj<n; jj+=bsize) {
  for (i=0; i<n; i++)
    for (j=jj; j < min(jj+bsize,n); j++)
      c[i][j] = 0.0;
  for (kk=0; kk<n; kk+=bsize) {
    for (i=0; i<n; i++) {
      for (j=jj; j < min(jj+bsize,n); j++) {
        sum = 0.0;
        for (k=kk; k < min(kk+bsize,n); k++) {
          sum += a[i][k] * b[k][j];
        }
        c[i][j] += sum;
      }
    }
  }
}
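A self-contained version of the bijk loop nest might look like the sketch below. It assumes row-major flat arrays of doubles; the names matmul_bijk and min_int are hypothetical. The block size is a parameter so it can be tuned to the cache.

```c
static int min_int(int x, int y) { return x < y ? x : y; }

/* Blocked (bijk) multiply: c = a * b for n x n row-major matrices.
 * Each bsize x bsize block of b is reused for every row sliver of a. */
void matmul_bijk(int n, int bsize, const double *a, const double *b, double *c)
{
    for (int jj = 0; jj < n; jj += bsize) {
        /* clear the column strip of c that this jj iteration produces */
        for (int i = 0; i < n; i++)
            for (int j = jj; j < min_int(jj + bsize, n); j++)
                c[i * n + j] = 0.0;

        for (int kk = 0; kk < n; kk += bsize) {
            for (int i = 0; i < n; i++) {
                for (int j = jj; j < min_int(jj + bsize, n); j++) {
                    double sum = 0.0;
                    for (int k = kk; k < min_int(kk + bsize, n); k++)
                        sum += a[i * n + k] * b[k * n + j];
                    c[i * n + j] += sum;   /* accumulate partial product */
                }
            }
        }
    }
}
```

The min_int bounds let n be any size, not only a multiple of bsize.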
19
Blocked Matrix Multiply Analysis
  • Innermost loop pair multiplies a 1 X bsize sliver
    of A by a bsize X bsize block of B and
    accumulates into 1 X bsize sliver of C
  • Loop over i steps through n row slivers of A & C,
    using same B

Innermost Loop Pair
  • A: row sliver accessed bsize times
  • B: block reused n times in succession
  • C: update successive elements of sliver
20
Pentium Blocked Matrix Multiply Performance
  • Blocking (bijk and bikj) improves performance by
    a factor of two over unblocked versions (ijk and
    jik)
  • relatively insensitive to array size.

21
Concluding Observations
  • Programmer can optimize for cache performance
  • How data structures are organized
  • How data are accessed
  • Nested loop structure
  • Blocking is a general technique
  • All systems favor cache friendly code
  • Getting absolute optimum performance is very
    platform specific
  • Cache sizes, line sizes, associativities, etc.
  • Can get most of the advantage with generic code
  • Keep working set reasonably small (temporal
    locality)
  • Use small strides (spatial locality)

22
Linking
  • Linking
  • Static linking
  • Object files
  • Static libraries
  • Loading
  • Dynamic linking of shared libraries

23
Linker Puzzles
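  • File 1: int x; p1() {}                File 2: p1() {}
  • File 1: int x; p1() {}                File 2: int x; p2() {}
  • File 1: int x; int y; p1() {}         File 2: double x; p2() {}
  • File 1: int x=7; int y=5; p1() {}     File 2: double x; p2() {}
  • File 1: int x=7; p1() {}              File 2: int x; p2() {}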
24
A Simplistic Program Translation Scheme
m.c (ASCII source file) -> Translator -> p (binary executable object file; memory image on disk)
  • Problems
  • Efficiency: small change requires complete
    recompilation
  • Modularity: hard to share common functions (e.g.
    printf)
  • Solution
  • Static linker (or linker)

25
A Better Scheme Using a Linker
m.c -> Translators -> m.o
a.c -> Translators -> a.o
(m.o and a.o are separately compiled relocatable object files)
m.o, a.o -> Linker (ld) -> p
(p is an executable object file; it contains code and data
for all functions defined in m.c and a.c)
26
Translating the Example Program
  • Compiler driver coordinates all steps in the
    translation and linking process.
  • Typically included with each compilation system
    (e.g., gcc)
  • Invokes preprocessor (cpp), compiler (cc1),
    assembler (as), and linker (ld).
  • Passes command line arguments to appropriate
    phases
  • Example: create executable p from m.c and a.c

gcc -O2 -v -o p m.c a.c
cpp [args] m.c /tmp/cca07630.i
cc1 /tmp/cca07630.i m.c -O2 [args] -o /tmp/cca07630.s
as [args] -o /tmp/cca076301.o /tmp/cca07630.s
<similar process for a.c>
ld -o p [system obj files] /tmp/cca076301.o /tmp/cca076302.o
27
What Does a Linker Do?
  • Merges object files
  • Merges multiple relocatable (.o) object files
    into a single executable object file that can be
    loaded and executed by the loader.
  • Resolves external references
  • As part of the merging process, resolves external
    references.
  • External reference: reference to a symbol
    defined in another object file.
  • Relocates symbols
  • Relocates symbols from their relative locations
    in the .o files to new absolute positions in the
    executable.
  • Updates all references to these symbols to
    reflect their new positions.
  • References can be in either code or data
  • code: a(); /* reference to symbol a */
  • data: int *xp = &x; /* reference to symbol x */

28
Why Linkers?
  • Modularity
  • Program can be written as a collection of smaller
    source files, rather than one monolithic mass.
  • Can build libraries of common functions
  • e.g., Math library, standard C library
  • Efficiency
  • Time
  • Change one source file, compile, and then relink.
  • No need to recompile other source files.
  • Space
  • Libraries of common functions can be aggregated
    into a single file...
  • Yet executable files and running memory images
    contain only code for the functions they actually
    use.

29
Executable and Linkable Format (ELF)
  • Standard binary format for object files
  • Derives from AT&T System V Unix
  • Later adopted by BSD Unix variants and Linux
  • One unified format for
  • Relocatable object files (.o),
  • Executable object files
  • Shared object files (.so)
  • Generic name: ELF binaries

30
ELF Object File Format
  • Elf header
  • Type (.o, exec, .so), machine, byte ordering,
    etc.
  • Program header table
  • Page size, virtual addresses of memory segments
    (sections), segment sizes.
  • .text section
  • Code
  • .data section
  • Initialized (static) data
  • .bss section
  • Uninitialized (static) data
  • Block Started by Symbol
  • Better Save Space
  • Has section header but occupies no space

0
ELF header
Program header table (required for executables)
.text section
.data section
.bss section
.symtab
.rel.text
.rel.data
.debug
Section header table (required for relocatables)
31
ELF Object File Format (cont)
  • .symtab section
  • Symbol table
  • Procedure and static variable names
  • Section names and locations
  • .rel.text section
  • Relocation info for .text section
  • Addresses of instructions that will need to be
    modified in the executable
  • Instructions for modifying.
  • .rel.data section
  • Relocation info for .data section
  • Addresses of pointer data that will need to be
    modified in the merged executable
  • .debug section
  • Info for symbolic debugging (gcc -g)

0
ELF header
Program header table (required for executables)
.text section
.data section
.bss section
.symtab
.rel.text
.rel.data
.debug
Section header table (required for relocatables)
32
Example C Program
m.c
int e=7;

int main() {
  int r = a();
  exit(0);
}

a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}
33
Merging Relocatable Object Files into an
Executable Object File
Relocatable Object Files
  • system code (.text), system data (.data)
  • m.o: main() (.text); int e = 7 (.data)
  • a.o: a() (.text); int *ep = &e, int x = 15 (.data); int y (.bss)
Executable Object File (starting at offset 0)
  • headers
  • .text: system code, main(), a(), more system code
  • .data: system data, int e = 7, int *ep = &e, int x = 15
  • .bss: uninitialized data (int y)
  • .symtab, .debug
34
Relocating Symbols and Resolving External
References
  • Symbols are lexical entities that name functions
    and variables.
  • Each symbol has a value (typically a memory
    address).
  • Code consists of symbol definitions and
    references.
  • References can be either local or external.

m.c
int e=7;

int main() {
  int r = a();
  exit(0);
}

a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}
35
m.o Relocation Info
m.c
int e=7;

int main() {
  int r = a();
  exit(0);
}

Disassembly of section .text:

00000000 <main>:
   0:  55               pushl  %ebp
   1:  89 e5            movl   %esp,%ebp
   3:  e8 fc ff ff ff   call   4 <main+0x4>
                        4: R_386_PC32  a
   8:  6a 00            pushl  $0x0
   a:  e8 fc ff ff ff   call   b <main+0xb>
                        b: R_386_PC32  exit
   f:  90               nop

Disassembly of section .data:

00000000 <e>:
   0:  07 00 00 00

source: objdump
36
a.o Relocation Info (.text)
a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}

Disassembly of section .text:

00000000 <a>:
   0:  55               pushl  %ebp
   1:  8b 15 00 00 00   movl   0x0,%edx
   6:  00
                        3: R_386_32  ep
   7:  a1 00 00 00 00   movl   0x0,%eax
                        8: R_386_32  x
   c:  89 e5            movl   %esp,%ebp
   e:  03 02            addl   (%edx),%eax
  10:  89 ec            movl   %ebp,%esp
  12:  03 05 00 00 00   addl   0x0,%eax
  17:  00
                        14: R_386_32  y
  18:  5d               popl   %ebp
  19:  c3               ret
37
a.o Relocation Info (.data)
a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}

Disassembly of section .data:

00000000 <ep>:
   0:  00 00 00 00
                        0: R_386_32  e
00000004 <x>:
   4:  0f 00 00 00
38
Executable After Relocation and External
Reference Resolution (.text)
08048530 <main>:
 8048530:  55               pushl  %ebp
 8048531:  89 e5            movl   %esp,%ebp
 8048533:  e8 08 00 00 00   call   8048540 <a>
 8048538:  6a 00            pushl  $0x0
 804853a:  e8 35 ff ff ff   call   8048474 <_init+0x94>
 804853f:  90               nop

08048540 <a>:
 8048540:  55               pushl  %ebp
 8048541:  8b 15 1c a0 04   movl   0x804a01c,%edx
 8048546:  08
 8048547:  a1 20 a0 04 08   movl   0x804a020,%eax
 804854c:  89 e5            movl   %esp,%ebp
 804854e:  03 02            addl   (%edx),%eax
 8048550:  89 ec            movl   %ebp,%esp
 8048552:  03 05 d0 a3 04   addl   0x804a3d0,%eax
 8048557:  08
 8048558:  5d               popl   %ebp
 8048559:  c3               ret
39
Executable After Relocation and External
Reference Resolution(.data)
m.c
int e=7;

int main() {
  int r = a();
  exit(0);
}

a.c
extern int e;

int *ep=&e;
int x=15;
int y;

int a() {
  return *ep+x+y;
}

Disassembly of section .data:

0804a018 <e>:
 804a018:  07 00 00 00

0804a01c <ep>:
 804a01c:  18 a0 04 08

0804a020 <x>:
 804a020:  0f 00 00 00
40
Strong and Weak Symbols
  • Program symbols are either strong or weak
  • strong: procedures and initialized globals
  • weak: uninitialized globals

p1.c
int foo=5;   (strong)
p1() {}      (strong)

p2.c
int foo;     (weak)
p2() {}      (strong)
41
Linker's Symbol Rules
  • Rule 1. A strong symbol can only appear once.
  • Rule 2. A weak symbol can be overridden by a
    strong symbol of the same name.
  • references to the weak symbol resolve to the
    strong symbol.
  • Rule 3. If there are multiple weak symbols, the
    linker can pick an arbitrary one.
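
The three rules can be modeled in a few lines of C. This is a toy model, not how ld is implemented: the struct symdef type, the resolve_symbol function and the two error codes are hypothetical names, and for Rule 3 the sketch simply picks the first weak definition it sees.

```c
#include <string.h>

struct symdef {
    const char *name;   /* symbol name */
    int strong;         /* 1 = strong, 0 = weak */
};

#define SYM_UNDEFINED (-1)   /* no definition found */
#define SYM_MULTIPLE  (-2)   /* Rule 1 violated: two strong definitions */

/* Return the index of the definition that references to `name` resolve to. */
int resolve_symbol(const struct symdef *defs, int n, const char *name)
{
    int chosen = SYM_UNDEFINED;
    int have_strong = 0;

    for (int i = 0; i < n; i++) {
        if (strcmp(defs[i].name, name) != 0)
            continue;
        if (defs[i].strong) {
            if (have_strong)
                return SYM_MULTIPLE;   /* Rule 1: strong symbol appears once */
            have_strong = 1;
            chosen = i;                /* Rule 2: strong overrides weak */
        } else if (chosen == SYM_UNDEFINED) {
            chosen = i;                /* Rule 3: any weak one will do */
        }
    }
    return chosen;
}
```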

42
Linker Puzzles
  • File 1: int x; p1() {}                File 2: p1() {}
    Link time error: two strong symbols (p1)
  • File 1: int x; p1() {}                File 2: int x; p2() {}
    References to x will refer to the same
    uninitialized int. Is this what you really want?
  • File 1: int x; int y; p1() {}         File 2: double x; p2() {}
    Writes to x in p2 might overwrite y! Evil!
  • File 1: int x=7; int y=5; p1() {}     File 2: double x; p2() {}
    Writes to x in p2 will overwrite y! Nasty!
  • File 1: int x=7; p1() {}              File 2: int x; p2() {}
43
Packaging Commonly Used Functions
  • How to package functions commonly used by
    programmers?
  • Math, I/O, memory management, string
    manipulation, etc.
  • Awkward, given the linker framework so far
  • Option 1: Put all functions in a single source
    file
  • Programmers link big object file into their
    programs
  • Space and time inefficient
  • Option 2: Put each function in a separate source
    file
  • Programmers explicitly link appropriate binaries
    into their programs
  • More efficient, but burdensome on the programmer
  • Solution: static libraries (.a archive files)
  • Concatenate related relocatable object files into
    a single file with an index (called an archive).
  • Enhance linker so that it tries to resolve
    unresolved external references by looking for the
    symbols in one or more archives.
  • If an archive member file resolves reference,
    link into executable.

44
Static Libraries (archives)
p1.c -> Translator -> p1.o
p2.c -> Translator -> p2.o
libc.a: static library (archive) of relocatable object
files concatenated into one file.
p1.o, p2.o, libc.a -> Linker (ld) -> p
p: executable object file (only contains code and
data for libc functions that are called from p1.c
and p2.c)
Further improves modularity and efficiency by
packaging commonly used functions, e.g., C
standard library (libc), math library (libm)
Linker selects only the .o files in the
archive that are actually needed by the program.
45
Creating Static Libraries
atoi.c   -> Translator -> atoi.o
printf.c -> Translator -> printf.o
random.c -> Translator -> random.o
...
atoi.o, printf.o, random.o, ... -> Archiver (ar) -> libc.a (C standard library)

ar rs libc.a \
   atoi.o printf.o random.o
  • Archiver allows incremental updates
  • Recompile function that changes and replace .o
    file in archive.

46
Commonly Used Libraries
  • libc.a (the C standard library)
  • 8 MB archive of 900 object files.
  • I/O, memory allocation, signal handling, string
    handling, date and time, random numbers, integer
    math
  • libm.a (the C math library)
  • 1 MB archive of 226 object files.
  • floating point math (sin, cos, tan, log, exp,
    sqrt, ...)

ar -t /usr/lib/libc.a | sort
fork.o
fprintf.o
fpu_control.o
fputc.o
freopen.o
fscanf.o
fseek.o
fstab.o

ar -t /usr/lib/libm.a | sort
e_acos.o
e_acosf.o
e_acosh.o
e_acoshf.o
e_acoshl.o
e_acosl.o
e_asin.o
e_asinf.o
e_asinl.o
47
Using Static Libraries
  • Linker's algorithm for resolving external
    references
  • Scan .o files and .a files in the command line
    order.
  • During the scan, keep a list of the current
    unresolved references.
  • As each new .o or .a file obj is encountered, try
    to resolve each unresolved reference in the list
    against the symbols in obj.
  • If any entries in the unresolved list at end of
    scan, then error.
  • Problem
  • Command line order matters!
  • Moral put libraries at the end of the command
    line.

bass> gcc -L. libtest.o -lmine
bass> gcc -L. -lmine libtest.o
libtest.o: In function `main':
libtest.o(.text+0x4): undefined reference to `libfun'
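
The effect of command line order can be reproduced with a toy model of the scan. This is a sketch only: the struct input type and scan_inputs are hypothetical, each input is assumed to define at most one symbol and need at most one, and an archive is assumed to contribute its member only when that member resolves a reference that is unresolved at that point.

```c
#include <string.h>

struct input {
    int is_archive;        /* 1 for a .a file, 0 for a .o file */
    const char *defines;   /* symbol the file defines, or NULL */
    const char *needs;     /* symbol it references but does not define, or NULL */
};

/* Scan up to 32 inputs in command line order; return the number of
 * references still unresolved at the end (nonzero means a link error). */
int scan_inputs(const struct input *in, int n)
{
    const char *defined[32];
    const char *unresolved[32];
    int ndef = 0, nun = 0;

    for (int f = 0; f < n && f < 32; f++) {
        int wanted = 0;
        for (int u = 0; u < nun; u++)
            if (in[f].defines && strcmp(unresolved[u], in[f].defines) == 0)
                wanted = 1;
        if (in[f].is_archive && !wanted)
            continue;                       /* archive member not pulled in */

        if (in[f].defines) {
            int kept = 0;
            defined[ndef++] = in[f].defines;
            for (int u = 0; u < nun; u++)   /* drop references now resolved */
                if (strcmp(unresolved[u], in[f].defines) != 0)
                    unresolved[kept++] = unresolved[u];
            nun = kept;
        }
        if (in[f].needs) {
            int found = 0;
            for (int d = 0; d < ndef; d++)
                if (strcmp(defined[d], in[f].needs) == 0)
                    found = 1;
            if (!found)
                unresolved[nun++] = in[f].needs;
        }
    }
    return nun;
}
```

Modeling libtest.o as needing libfun and the library as defining it, the order "libtest.o, library" leaves nothing unresolved, while "library, libtest.o" leaves libfun unresolved, as in the transcript above.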
48
Loading Executable Binaries
Executable object file for example program p (starting at offset 0)
  • ELF header
  • Program header table (required for executables)
  • .text section
  • .data section
  • .bss section
  • .symtab
  • .rel.text
  • .rel.data
  • .debug
  • Section header table (required for relocatables)
Process image (virtual addr)
  • 0x080483e0: init and shared lib segments
  • 0x08048494: .text segment (r/o)
  • 0x0804a010: .data segment (initialized r/w)
  • 0x0804a3b0: .bss segment (uninitialized r/w)
49
Shared Libraries
  • Static libraries have the following
    disadvantages
  • Potential for duplicating lots of common code in
    the executable files on a filesystem.
  • e.g., every C program needs the standard C
    library
  • Potential for duplicating lots of code in the
    virtual memory space of many processes.
  • Minor bug fixes of system libraries require each
    application to explicitly relink
  • Solution
  • Shared libraries (dynamic link libraries, DLLs)
    whose members are dynamically loaded into memory
    and linked into an application at run-time.
  • Dynamic linking can occur when executable is
    first loaded and run.
  • Common case for Linux, handled automatically by
    ld-linux.so.
  • Dynamic linking can also occur after program has
    begun.
  • In Linux, this is done explicitly by user with
    dlopen().
  • Basis for High-Performance Web Servers.
  • Shared library routines can be shared by multiple
    processes.

50
Dynamically Linked Shared Libraries
m.c -> Translators (cc1, as) -> m.o
a.c -> Translators (cc1, as) -> a.o
m.o, a.o, libc.so -> Linker (ld) -> p
  • libc.so: shared library of dynamically relocatable object
    files
  • p: partially linked executable (on disk)
p, libc.so -> Loader/Dynamic Linker (ld-linux.so) -> P
  • libc.so functions called by m.c and a.c are
    loaded, linked, and (potentially) shared among
    processes.
  • P: fully linked executable p (in memory)
51
The Complete Picture
m.c -> Translator -> m.o
a.c -> Translator -> a.o
m.o, a.o, libwhatever.a -> Static Linker (ld) -> p
p, libc.so, libm.so -> Loader/Dynamic Linker (ld-linux.so) -> p (in memory)
52
Exceptional Control Flow
  • Exceptional Control Flow
  • Exceptions
  • Process context switches
  • Creating and destroying processes

53
Control Flow
  • Computers do Only One Thing
  • From startup to shutdown, a CPU simply reads and
    executes (interprets) a sequence of instructions,
    one at a time.
  • This sequence is the system's physical control
    flow (or flow of control).

Physical control flow
<startup> inst1 inst2 inst3 ... instn <shutdown>
Time
54
Altering the Control Flow
  • Up to now: two mechanisms for changing control
    flow
  • Jumps and branches
  • Call and return using the stack discipline.
  • Both react to changes in program state.
  • Insufficient for a useful system
  • Difficult for the CPU to react to changes in
    system state.
  • data arrives from a disk or a network adapter.
  • Instruction divides by zero
  • User hits ctl-c at the keyboard
  • System needs mechanisms for exceptional control
    flow

55
Exceptional Control Flow
  • Mechanisms for exceptional control flow exist at
    all levels of a computer system.
  • Low level Mechanism
  • exceptions
  • change in control flow in response to a system
    event (i.e., change in system state)
  • Combination of hardware and OS software
  • Higher Level Mechanisms
  • Process context switch
  • Signals
  • Nonlocal jumps (setjmp/longjmp)
  • Implemented by either
  • OS software (context switch and signals).
  • C language runtime library: nonlocal jumps.

56
Exceptions
  • An exception is a transfer of control to the OS
    in response to some event (i.e., change in
    processor state)

User Process -> OS
  • event occurs while the current instruction executes: exception
  • OS: exception processing by exception handler
  • exception return (optional) to the current or next instruction
57
Interrupt Vectors
Exception numbers
  • Each type of event has a unique exception number
    k
  • Index into jump table (a.k.a., interrupt vector)
  • Jump table entry k points to a function
    (exception handler).
  • Handler k is called each time exception k occurs.

interrupt vector
  • entry 0 -> code for exception handler 0
  • entry 1 -> code for exception handler 1
  • entry 2 -> code for exception handler 2
  • ...
  • entry n-1 -> code for exception handler n-1
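The jump table idea can be mimicked in user-level C with an array of function pointers. This is an analogy only: a real interrupt vector is set up by the OS and indexed by hardware. The handler names, the numbering and the return values below are hypothetical.

```c
#define NUM_EXCEPTIONS 3

typedef int (*handler_t)(void);

/* Hypothetical handlers; each returns a code so a caller can see which ran. */
static int handle_exception_0(void) { return 100; }
static int handle_exception_1(void) { return 101; }
static int handle_exception_2(void) { return 102; }

/* Jump table: entry k points to the handler for exception number k. */
static handler_t jump_table[NUM_EXCEPTIONS] = {
    handle_exception_0,
    handle_exception_1,
    handle_exception_2,
};

/* Call handler k, as happens each time exception k occurs.
 * Returns -1 for an exception number outside the table. */
int dispatch(int k)
{
    if (k < 0 || k >= NUM_EXCEPTIONS)
        return -1;
    return jump_table[k]();
}
```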
58
Asynchronous Exceptions (Interrupts)
  • Caused by events external to the processor
  • Indicated by setting the processor's interrupt
    pin
  • handler returns to next instruction.
  • Examples
  • I/O interrupts
  • hitting ctl-c at the keyboard
  • arrival of a packet from a network
  • arrival of a data sector from a disk
  • Hard reset interrupt
  • hitting the reset button
  • Soft reset interrupt
  • hitting ctl-alt-delete on a PC

59
Synchronous Exceptions
  • Caused by events that occur as a result of
    executing an instruction
  • Traps
  • Intentional
  • Examples: system calls, breakpoint traps, special
    instructions
  • Returns control to next instruction
  • Faults
  • Unintentional but possibly recoverable
  • Examples: page faults (recoverable), protection
    faults (unrecoverable).
  • Either re-executes faulting (current)
    instruction or aborts.
  • Aborts
  • unintentional and unrecoverable
  • Examples: parity error, machine check.
  • Aborts current program

60
Trap Example
  • Opening a File
  • User calls open(filename, options)
  • Function open executes system call instruction
    int
  • OS must find or create file, get it ready for
    reading or writing
  • Returns integer file descriptor

0804d070 <__libc_open>:
  . . .
  804d082:  cd 80    int   $0x80
  804d084:  5b       pop   %ebx
  . . .

User Process -> OS
  • int: exception transfers control to the OS
  • OS: Open file
  • return: user process continues at pop
61
Fault Example 1
int a[1000];
main ()
{
    a[500] = 13;
}
  • Memory Reference
  • User writes to memory location
  • That portion (page) of user's memory is currently
    on disk
  • Page handler must load page into physical memory
  • Returns to faulting instruction
  • Successful on second try

80483b7:  c7 05 10 9d 04 08 0d    movl   $0xd,0x8049d10
62
Fault Example 2
int a[1000];
main ()
{
    a[5000] = 13;
}
  • Memory Reference
  • User writes to memory location
  • Address is not valid
  • Page handler detects invalid address
  • Sends SIGSEGV signal to user process
  • User process exits with segmentation fault

80483b7:  c7 05 60 e3 04 08 0d    movl   $0xd,0x804e360

User Process -> OS
  • movl: event causes a page fault
  • OS: Detect invalid address
  • OS: Signal process
63
Processes
  • Def: A process is an instance of a running
    program.
  • One of the most profound ideas in computer
    science.
  • Not the same as program or processor
  • Process provides each program with two key
    abstractions
  • Logical control flow
  • Each program seems to have exclusive use of the
    CPU.
  • Private address space
  • Each program seems to have exclusive use of main
    memory.
  • How are these illusions maintained?
  • Process executions interleaved (multitasking)
  • Address spaces managed by virtual memory system

64
Logical Control Flows
Each process has its own logical control flow
Process A
Process B
Process C
Time
65
Concurrent Processes
  • Two processes run concurrently (are concurrent)
    if their flows overlap in time.
  • Otherwise, they are sequential.
  • Examples
  • Concurrent: A & B, A & C
  • Sequential: B & C

66
User View of Concurrent Processes
  • Control flows for concurrent processes are
    physically disjoint in time.
  • However, we can think of concurrent processes as
    running in parallel with each other.

Process A
Process B
Process C
Time
67
Context Switching
  • Processes are managed by a shared chunk of OS
    code called the kernel
  • Important: the kernel is not a separate process,
    but rather runs as part of some user process
  • Control flow passes from one process to another
    via a context switch.

Timeline (Process A code, then Process B code, then Process A code again)
  • Process A: user code
  • kernel code: context switch
  • Process B: user code
  • kernel code: context switch
  • Process A: user code
68
Private Address Spaces
  • Each process has its own private address space.

Address space layout, from high addresses to low:
  • 0xffffffff
  • kernel virtual memory (code, data, heap, stack);
    memory invisible to user code
  • 0xc0000000
  • user stack (created at runtime); %esp (stack pointer)
  • memory mapped region for shared libraries
  • 0x40000000
  • brk
  • run-time heap (managed by malloc)
  • read/write segment (.data, .bss), loaded from the
    executable file
  • read-only segment (.init, .text, .rodata), loaded
    from the executable file
  • 0x08048000
  • unused
  • 0
69
System calls
  • Unix systems provide many different types of
    system calls to be used by application programs
    when they need a service from the kernel
  • Reading a file
  • Creating a new process
  • To get the complete list, type
  • man syscalls

70
fork: Creating new processes
  • int fork(void)
  • creates a new process (child process) that is
    identical to the calling process (parent process)
  • returns 0 to the child process
  • returns child's pid to the parent process

if (fork() == 0) {
    printf("hello from child\n");
} else {
    printf("hello from parent\n");
}

Fork is interesting (and often confusing) because
it is called once but returns twice
71
Fork Example 1
  • Key Points
  • Parent and child both run same code
  • Distinguish parent from child by return value
    from fork
  • Start with same state, but each has private copy
  • Including shared output file descriptor
  • Relative ordering of their print statements
    undefined

void fork1()
{
    int x = 1;
    pid_t pid = fork();
    if (pid == 0) {
        printf("Child has x = %d\n", ++x);
    } else {
        printf("Parent has x = %d\n", --x);
    }
    printf("Bye from process %d with x = %d\n", getpid(), x);
}
72
Fork Example 2
  • Key Points
  • Both parent and child can continue forking

void fork2()
{
    printf("L0\n");
    fork();
    printf("L1\n");
    fork();
    printf("Bye\n");
}
73
Fork Example 3
  • Key Points
  • Both parent and child can continue forking

void fork3()
{
    printf("L0\n");
    fork();
    printf("L1\n");
    fork();
    printf("L2\n");
    fork();
    printf("Bye\n");
}
74
Fork Example 4
  • Key Points
  • Both parent and child can continue forking

void fork4()
{
    printf("L0\n");
    if (fork() != 0) {
        printf("L1\n");
        if (fork() != 0) {
            printf("L2\n");
            fork();
        }
    }
    printf("Bye\n");
}
75
Fork Example 5
  • Key Points
  • Both parent and child can continue forking

void fork5()
{
    printf("L0\n");
    if (fork() == 0) {
        printf("L1\n");
        if (fork() == 0) {
            printf("L2\n");
            fork();
        }
    }
    printf("Bye\n");
}
76
exit: Destroying Process
  • void exit(int status)
  • exits a process
  • Normally return with status 0
  • atexit() registers functions to be executed upon
    exit

void cleanup(void)
{
    printf("cleaning up\n");
}

void fork6()
{
    atexit(cleanup);
    fork();
    exit(0);
}
77
Zombies
  • Idea
  • When process terminates, it still consumes system
    resources
  • Various tables maintained by OS
  • Called a zombie
  • Living corpse, half alive and half dead
  • Reaping
  • Performed by parent on terminated child
  • Parent is given exit status information
  • Kernel discards process
  • What if Parent Doesn't Reap?
  • If any parent terminates without reaping a child,
    then child will be reaped by init process
  • Only need explicit reaping for long-running
    processes
  • E.g., shells and servers

78
Zombie Example
void fork7()
{
    if (fork() == 0) {
        /* Child */
        printf("Terminating Child, PID = %d\n", getpid());
        exit(0);
    } else {
        printf("Running Parent, PID = %d\n", getpid());
        while (1)
            ; /* Infinite loop */
    }
}

linux> ./forks 7 &
[1] 6639
Running Parent, PID = 6639
Terminating Child, PID = 6640
linux> ps
  PID TTY          TIME CMD
 6585 ttyp9    00:00:00 tcsh
 6639 ttyp9    00:00:03 forks
 6640 ttyp9    00:00:00 forks <defunct>
 6641 ttyp9    00:00:00 ps
linux> kill 6639
[1]    Terminated
linux> ps
  PID TTY          TIME CMD
 6585 ttyp9    00:00:00 tcsh
 6642 ttyp9    00:00:00 ps
  • ps shows child process as defunct
  • Killing parent allows child to be reaped

79
Nonterminating Child Example
void fork8()
{
    if (fork() == 0) {
        /* Child */
        printf("Running Child, PID = %d\n", getpid());
        while (1)
            ; /* Infinite loop */
    } else {
        printf("Terminating Parent, PID = %d\n", getpid());
        exit(0);
    }
}

linux> ./forks 8
Terminating Parent, PID = 6675
Running Child, PID = 6676
linux> ps
  PID TTY          TIME CMD
 6585 ttyp9    00:00:00 tcsh
 6676 ttyp9    00:00:06 forks
 6677 ttyp9    00:00:00 ps
linux> kill 6676
linux> ps
  PID TTY          TIME CMD
 6585 ttyp9    00:00:00 tcsh
 6678 ttyp9    00:00:00 ps
  • Child process still active even though parent has
    terminated
  • Must kill explicitly, or else will keep running
    indefinitely

80
wait: Synchronizing with children
  • int wait(int *child_status)
  • suspends current process until one of its
    children terminates
  • return value is the pid of the child process that
    terminated
  • if child_status != NULL, then the object it
    points to will be set to a status indicating why
    the child process terminated

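A minimal sketch of how the status set by wait can be decoded is shown below. The helper name child_exit_status is hypothetical, and the sketch assumes the caller has no other children, so that wait reaps the child it just created.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Fork a child that exits with `code`. Return the exit status the parent
 * reaps, or -1 if fork/wait failed or the child did not exit normally. */
int child_exit_status(int code)
{
    int child_status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                    /* fork failed */
    if (pid == 0)
        _exit(code);                  /* child: terminate immediately */

    if (wait(&child_status) != pid)   /* parent: suspend until child ends */
        return -1;
    if (WIFEXITED(child_status))
        return WEXITSTATUS(child_status);
    return -1;                        /* e.g. killed by a signal */
}
```

The child uses _exit rather than exit so that it does not flush stdio buffers inherited from the parent.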
81
wait: Synchronizing with children
void fork9()
{
    int child_status;

    if (fork() == 0) {
        printf("HC: hello from child\n");
    } else {
        printf("HP: hello from parent\n");
        wait(&child_status);
        printf("CT: child has terminated\n");
    }
    printf("Bye\n");
    exit(0);
}
82
Wait Example
  • If multiple children have completed, wait will
    reap them in arbitrary order
  • Can use macros WIFEXITED and WEXITSTATUS to get
    information about exit status

void fork10()
{
    pid_t pid[N];
    int i;
    int child_status;

    for (i = 0; i < N; i++)
        if ((pid[i] = fork()) == 0)
            exit(100+i); /* Child */
    for (i = 0; i < N; i++) {
        pid_t wpid = wait(&child_status);
        if (WIFEXITED(child_status))
            printf("Child %d terminated with exit status %d\n",
                   wpid, WEXITSTATUS(child_status));
        else
            printf("Child %d terminated abnormally\n", wpid);
    }
}

83
Waitpid
  • waitpid(pid, status, options)
  • Can wait for specific process
  • Various options

void fork11()
{
    pid_t pid[N];
    int i;
    int child_status;

    for (i = 0; i < N; i++)
        if ((pid[i] = fork()) == 0)
            exit(100+i); /* Child */
    for (i = 0; i < N; i++) {
        pid_t wpid = waitpid(pid[i], &child_status, 0);
        if (WIFEXITED(child_status))
            printf("Child %d terminated with exit status %d\n",
                   wpid, WEXITSTATUS(child_status));
        else
            printf("Child %d terminated abnormally\n", wpid);
    }
}
84
Wait/Waitpid Example Outputs
Using wait (fork10)
Child 3565 terminated with exit status 103
Child 3564 terminated with exit status 102
Child 3563 terminated with exit status 101
Child 3562 terminated with exit status 100
Child 3566 terminated with exit status 104
Using waitpid (fork11)
Child 3568 terminated with exit status 100
Child 3569 terminated with exit status 101
Child 3570 terminated with exit status 102
Child 3571 terminated with exit status 103
Child 3572 terminated with exit status 104
85
Command line arguments in C/C++
  • Command line arguments to C/C++ programs are
    passed through the argv array
  • int main (int argc, char *argv[])
  • argc is the number of command line arguments,
    including the name of the program or command
    being executed (passed as argv[0])
  • Example 1: printArgs.c
  • Example 2: printN.c

86
exec: Running new programs
  • int execl(char *path, char *arg0, char *arg1, ...,
    0)
  • loads and runs executable at path with args arg0,
    arg1, ...
  • path is the complete path of an executable
  • arg0 becomes the name of the process
  • typically arg0 is either identical to path, or
    else it contains only the executable filename
    from path
  • real arguments to the executable start with
    arg1, etc.
  • list of args is terminated by a (char *)0
    argument
  • returns -1 if error, otherwise doesn't return!

main()
{
    if (fork() == 0) {
        execl("/usr/bin/cp", "cp", "foo", "bar", 0);
    }
    wait(NULL);
    printf("copy completed\n");
    exit(0);
}
87
Running printArgs from a program
  • Instead of running the printArgs program from
    the Unix shell, we can run it from a program,
    using execlp.
  • Example 1: runls.c
  • Example 2: prog.c
  • Example 3: prog2.c

88
Creating new processes in UNIX
  • A process creates a new process that executes a
    given program or command as follows
  • Call fork() to create a new process
  • Call exec() within the new process to execute the
    program or command
  • Example: progExec.c
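
The fork-then-exec pattern might be sketched as follows. This is not the course's progExec.c, which is not reproduced in these notes. The helper name run_sh is hypothetical, and the sketch assumes a POSIX system where /bin/sh exists.

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Run `command` through /bin/sh in a child process and return its exit
 * status, or -1 if fork/waitpid failed or the child ended abnormally. */
int run_sh(const char *command)
{
    int status;
    pid_t pid = fork();                 /* step 1: create a new process */

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* step 2: replace the child's program; arg0 is the process name */
        execl("/bin/sh", "sh", "-c", command, (char *)0);
        _exit(127);                     /* reached only if execl failed */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_sh("exit 3") runs a shell that exits immediately with status 3, which the parent then reads back through WEXITSTATUS.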

89
Writing a Unix Shell
  • Pseudo Code for a shell
  • print a prompt.
  • while( EOF is not signaled and an input line is
    read )
  • create a child process (using fork)
  • have the child process replace its program (this
    shell program) with the program specified in the
    input line. (using execlp)
  • wait for the child to finish executing its
    program. (using wait)
  • print a prompt
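
The pseudo code could be turned into C roughly as below. This is a sketch only: the function name tiny_shell is hypothetical, each input line is treated as a single program name with no arguments, and the input stream is passed in as a parameter so the loop can be driven from a file as well as from stdin.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Read one program name per line from `in` and run each in a child
 * process. Returns the number of child processes launched. */
int tiny_shell(FILE *in)
{
    char line[256];
    int launched = 0;

    printf("> ");                              /* print a prompt */
    fflush(stdout);
    while (fgets(line, sizeof line, in) != NULL) {
        line[strcspn(line, "\n")] = '\0';      /* strip trailing newline */
        if (line[0] != '\0') {
            pid_t pid = fork();                /* create a child process */
            if (pid == 0) {
                execlp(line, line, (char *)0); /* replace child's program */
                _exit(127);                    /* only if execlp failed */
            } else if (pid > 0) {
                wait(NULL);                    /* wait for child to finish */
                launched++;
            }
        }
        printf("> ");                          /* print a prompt */
        fflush(stdout);
    }
    return launched;
}
```

Calling tiny_shell(stdin) from main gives an interactive loop that ends at EOF (ctl-d).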