Title: Advanced Operating Systems
1 Advanced Operating Systems
Lecture 2: Design principles
- University of Tehran
- Dept. of EE and Computer Engineering
- By
- Dr. Nasser Yazdani
2 How to design a system
- Some general guides.
- References
- Hints for Computer System Design, by Butler W. Lampson
- Read: The Interaction of Architecture and Operating System Design, Thomas E. Anderson, et al.
- Lisp: Good News, Bad News, How to Win Big, Richard P. Gabriel
3 Outline
- Introduction
- Lampson's general rules
- Worse is better
- Interaction between OS and architecture
4 Why this discussion
- The designer faces a sea of possibilities, and it is unclear how one choice will limit the freedom to make other choices, or affect the size and performance of the entire system.
- In system design, do not try to find the best way (is there one?); avoid terrible choices.
- We, as designers, forget principles and obvious problems. We often say one thing and do another.
- The previously discussed methods (top-down design, abstraction, modularity, etc.) still apply.
5 Main and basic questions
- Why does a hint help in making a good system?
- with functionality (does it work?),
- speed (is it fast enough?), or
- fault-tolerance (does it keep working?).
- Where in the system design does it help?
- in ensuring completeness,
- in choosing interfaces,
- or in devising implementations.
6 Functionality
- The most important hints, and the vaguest, concern obtaining the right functionality, that is, getting the system to do the things you want it to do.
- An interface separates the implementation of some abstraction from the clients who use the abstraction.
- Three conflicting requirements on an interface:
- it should be simple,
- complete, and
- admit a sufficiently small and fast implementation.
7 Functionality (Simplicity 1)
- Perfection is reached not when there is no longer anything to add, but when there is no longer anything to take away. (A. Saint-Exupery)
- Do one thing at a time, and do it well. An interface should capture the minimum essentials of an abstraction. Don't generalize; generalizations are generally wrong. If the interface undertakes too much, its implementation will probably be large, slow and complicated.
- KISS: Keep It Simple, Stupid. (Anonymous)
- If in doubt, leave it out. (Anonymous)
- Exterminate features. (C. Thacker)
- Everything should be made as simple as possible, but no simpler. (A. Einstein)
8 Functionality (Simplicity 2)
- Services must have a fairly predictable cost, and the interface must not promise more than the implementer knows how to deliver.
- PL/1 got into serious trouble by attempting to provide consistent meanings for a large number of generic operations across a wide variety of data types.
9 Functionality (Simplicity 3)
- The Tenex system reports a reference to an unassigned virtual page by a trap to the user program. One of the arguments of CONNECT, the call that obtains access to another directory, is a string containing the password. If the password is wrong, the call fails after a three second delay, to prevent guessing passwords at high speed.
- CONNECT is implemented by a loop of the form:
- for i := 0 to Length(directoryPassword) do
-   if directoryPassword[i] ≠ passwordArgument[i] then
-     Wait three seconds; return BadPassword
- end loop
- connect to directory; return Success
- The following trick finds a password of length n in 64n tries on the average, rather than 128^n/2 (Tenex strings use 7-bit characters). Arrange the passwordArgument so that its first character is the last character of a page and the next page is unassigned, and try each possible character as the first. If CONNECT reports BadPassword, the guess was wrong; if the system reports a reference to an unassigned page, it was correct. Now arrange the passwordArgument so that its second character is the last character of the page, and proceed in the obvious way.
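The page-boundary trick can be illustrated with a small simulation. This is only a sketch under stated assumptions: `TenexLikeSystem`, `UnassignedPage` and `recover_password` are hypothetical names, the three second delay is left out, and the unassigned page is modeled as an index limit rather than as real virtual memory.

```python
ALPHABET = [chr(code) for code in range(128)]  # 7-bit characters


class UnassignedPage(Exception):
    """Models the trap reported when CONNECT touches an unassigned page."""


class TenexLikeSystem:
    """Hypothetical model of CONNECT; not the real Tenex interface."""

    def __init__(self, directory_password):
        self._password = directory_password  # assumed to be non-empty
        self.calls = 0

    def connect(self, argument, readable_length):
        # Characters of `argument` at index >= readable_length are treated as
        # lying on the unassigned page that follows the argument in memory.
        self.calls += 1
        for i in range(len(self._password)):
            if i >= readable_length:
                raise UnassignedPage(i)
            if self._password[i] != argument[i]:
                return "BadPassword"  # the real call also waits three seconds
        return "Success"


def recover_password(system):
    """Recover the password one character at a time."""
    known = ""
    while True:
        for ch in ALPHABET:
            guess = known + ch
            try:
                if system.connect(guess, readable_length=len(guess)) == "Success":
                    return guess
            except UnassignedPage:
                # Every readable character matched, so `ch` is correct.
                known = guess
                break
        else:
            raise RuntimeError("no character matched")
```

Each position costs at most 128 calls in this model, so the total grows linearly with the password length instead of exponentially.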
10 Functionality (Simplicity 4)
- An engineer is a man who can do for a dime what any fool can do for a dollar. (Anonymous)
- Get it right. Neither abstraction nor simplicity is a substitute for getting it right. In fact, abstraction can be a source of severe difficulties.
- for i := 0 to numberOfFields do
-   FindIthField; if its name is name then exit
- end loop
- With fields stored in the text as {name: contents}, each FindIthField call scans from the start, so it takes O(n^2) time to find a named field.
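The cost of layering a named lookup on top of an indexed lookup can be sketched as follows. The `{name: contents}` encoding, the regular expression and all function names here are assumptions made for illustration, not the original system's code.

```python
import re

# Assumed field syntax: {name: contents}
FIELD = re.compile(r"\{(\w+): ([^}]*)\}")


def find_ith_field(text, i):
    """Return (name, contents) of the i-th field, rescanning from the start: O(n)."""
    for index, match in enumerate(FIELD.finditer(text)):
        if index == i:
            return match.group(1), match.group(2)
    return None


def find_named_field_quadratic(text, name):
    """The loop from the slide: one O(n) scan per field, O(n^2) overall."""
    i = 0
    while True:
        field = find_ith_field(text, i)
        if field is None:
            return None
        if field[0] == name:
            return field[1]
        i += 1


def find_named_field_linear(text, name):
    """A single pass over the text gives the same answer in O(n)."""
    for match in FIELD.finditer(text):
        if match.group(1) == name:
            return match.group(2)
    return None
```

Both functions return the same result; only the quadratic one pays for the abstraction by restarting the scan on every iteration.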
11 Functionality (Other considerations 1)
- Make it fast, rather than general or powerful.
- RISC machines, which execute simple instructions like load and store quickly, can run programs faster than machines like the VAX, whose more general and powerful instructions take longer in the simple cases. It is easy to lose a factor of two in the running time of a program.
- To find the places where time is being spent in a large system, you need measurement tools that will pinpoint the time-consuming code.
12 Functionality (Other considerations 1)
- Make it fast: other people's opinions
- Make it simple
- Beware premature optimization
- Ousterhout: the biggest speedup is when a system goes from not working to working. That's infinite speedup.
- It is necessary to have measurement tools ...
- Design a system with a structure that facilitates incremental tuning
13 Functionality (Other considerations 2)
- Don't hide power.
- When a low level of abstraction allows something to be done quickly, higher levels should not bury this power inside something more general.
- The Alto disk hardware can transfer a full cylinder at disk speed; use it wherever possible.
- Use procedure arguments to provide flexibility in an interface. This technique can greatly simplify an interface, eliminating unnecessary parameters.
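As a minimal sketch of the procedure-argument hint: instead of growing an enumeration interface with pattern, size and date parameters, the client passes a filter procedure. The function name `enumerate_files` and the sample file names are hypothetical.

```python
from typing import Callable, Iterable, Iterator


def enumerate_files(names: Iterable[str],
                    accept: Callable[[str], bool]) -> Iterator[str]:
    """Yield the names the client's filter accepts.

    The interface needs no pattern language or option flags: any
    selection rule the client can write as a procedure is supported.
    """
    for name in names:
        if accept(name):
            yield name


# Usage: the client supplies the selection rule as a procedure.
c_sources = list(enumerate_files(["a.c", "b.h", "main.c"],
                                 lambda name: name.endswith(".c")))
```

The interface stays at two parameters however elaborate the client's selection rule becomes.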
14 Functionality (Other considerations 3)
- Leave it to the client.
- An interface can combine simplicity, flexibility and high performance by solving only one problem and leaving the rest to the client.
- The success of monitors as a synchronization device is partly due to the fact that the locking and signaling mechanisms do very little, leaving all the real work to the client programs.
- Unix encourages building small programs that take one or more character streams as input, produce one or more streams as output, and do one operation. Each program has a simple interface and does one thing well, leaving the client to combine a set of such programs with its own code.
15 Functionality (Continuity 1)
- Keep basic interfaces stable.
- An interface embodies assumptions that are shared by more than one part of a system, or by many systems. It is very desirable not to change the interface.
- Keep a place to stand if you have to change the interface.
- IBM 360/370 systems provided emulation of the instruction sets of older machines.
- The success of Microsoft is due to compatibility with older versions of its products, e.g. DOS on Windows.
16 Functionality (Making it work)
- Plan to throw one away; you will anyhow.
- It costs a lot less if you plan to have a prototype. Rethink old and new features.
- Keep secrets of the implementation. Secrets are assumptions about an implementation that client programs are not allowed to make. They are things that can change; the interface defines the things that cannot change.
- Divide and conquer.
- Use a good idea again instead of generalizing it.
- A specialized implementation of the idea may be much more effective than a general one.
17 Functionality (Handling cases)
- Handle normal and worst cases separately. The normal case must be fast. The worst case must make some progress. In scheduling we might face deadlock, but checking the safety of every state is extremely costly. Treat deadlock differently.
18 Speed (1)
- Split resources in a fixed way if in doubt, rather than sharing them.
- It is usually faster to allocate dedicated resources and to access them, and the behavior of the allocator is more predictable.
- The disadvantage is that more total resources are needed, ignoring multiplexing overheads, than if all come from a common pool.
- Ex.: use registers for local variables, since they are fast.
- Use dynamic translation from a convenient (compact, easily modified or easily displayed) representation to one that can be quickly interpreted.
- Smalltalk uses byte codes and translates them when a procedure is invoked.
19 Speed (2)
- Cache answers to expensive computations, rather than doing them over or precomputing. Prefetching instructions.
- Use hints to speed up normal execution, e.g. links or references for allocated/free space. Try to avoid lengthy searches.
- When in doubt, use brute force. As the cost of hardware declines, a straightforward, easily analyzed solution that requires a lot of computing cycles is better than a complex, poorly characterized one that may work well under certain assumptions. Ex.: Ken Thompson's chess machine; personal computers vs. time-sharing systems.
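The caching hint can be sketched as a small wrapper that remembers (argument, result) pairs and lets the client invalidate an entry when the underlying answer changes. `CachedFunction` and its `misses` counter are hypothetical names chosen for this illustration.

```python
class CachedFunction:
    """Remember results of an expensive function of one hashable argument."""

    def __init__(self, function):
        self._function = function
        self._cache = {}
        self.misses = 0  # how many times the expensive function actually ran

    def __call__(self, argument):
        if argument not in self._cache:
            self.misses += 1
            self._cache[argument] = self._function(argument)
        return self._cache[argument]

    def invalidate(self, argument):
        """Drop a stale entry so the next call recomputes it."""
        self._cache.pop(argument, None)


def expensive_square(x):
    # Stand-in for a costly computation.
    return x * x


square = CachedFunction(expensive_square)
first = square(12)   # computed
second = square(12)  # served from the cache
```

A cache must always return a correct answer, which is why invalidation is part of the interface; a hint, by contrast, may be wrong and has to be checked before use.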
20 Speed (3)
- Compute in background when possible. Ex.: garbage collection, backups in the middle of the night.
- Use batch processing if possible.
- Safety first. In allocating resources, strive to avoid disaster rather than to attain an optimum. A general-purpose system cannot optimize the use of resources. It is easy enough to overload a system and drastically degrade the service if demand exceeds two-thirds of the capacity. Ex.: page prediction algorithms are of little use when memory is cheap.
- The nicest thing about the Alto is that it doesn't run faster at night. (J. Morris)
21 Speed (4)
- Shed load to control demand, rather than allowing the system to become overloaded.
- An interactive system can refuse new users, or even deny service to existing ones. A memory manager can limit the jobs being served so that all their working sets fit in the available memory.
- Bob Morris suggested that a shared interactive system should have a large red button on each terminal, for the user to push when dissatisfied with the service.
- Ethernet and IP provide best-effort delivery.
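The memory-manager example of load shedding can be sketched as a simple admission test. This is a minimal illustration, assuming working-set sizes are known in pages; `admit_jobs` and the job names are hypothetical.

```python
def admit_jobs(working_sets, memory_pages):
    """Admit jobs in order while their working sets fit in memory.

    working_sets: list of (job_name, pages_needed) pairs.
    Returns (admitted, deferred); deferred jobs are shed rather than
    allowed to overload the system.
    """
    admitted, deferred = [], []
    used = 0
    for job, pages in working_sets:
        if used + pages <= memory_pages:
            admitted.append(job)
            used += pages
        else:
            deferred.append(job)
    return admitted, deferred


admitted, deferred = admit_jobs(
    [("editor", 40), ("compiler", 50), ("simulator", 20)], memory_pages=100)
```

Deferring the third job keeps the admitted ones running from memory instead of letting all three thrash.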
22 Fault-tolerance (1)
- The unavoidable price of reliability is simplicity. (C. Hoare)
- End-to-end. Ex.: error recovery in reliable data transmission.
- It requires a cheap test for success.
- It can lead to working systems with severe performance defects that may not appear until the system becomes operational and is placed under heavy load.
- Log updates to record the truth about the state of an object.
- Make actions atomic or restartable.
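The last two hints fit together: if each log entry is an idempotent assignment, replaying the log is restartable, so recovery can simply start over after a crash. The sketch below assumes a log of (key, value) assignments; `apply_log` is a hypothetical name.

```python
def apply_log(log, state=None):
    """Replay a log of (key, value) assignments onto a copy of `state`.

    Each entry sets a value outright (like "x := 3", not "x := x + 1"),
    so applying an entry twice has the same effect as applying it once.
    """
    result = dict(state or {})
    for key, value in log:
        result[key] = value
    return result


log = [("x", 1), ("y", 2), ("x", 3)]
full = apply_log(log)

# Simulate a crash after two entries, then restart recovery from the top.
partial = apply_log(log[:2])
recovered = apply_log(log, partial)
```

Because the entries are idempotent, `recovered` equals `full` even though the first two entries were applied twice.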
23 Final word of Lampson
- Most humbly do I take my leave, my lord.
- Still, we are left with building big, working systems.
24 Worse is Better
- Two styles of design:
- MIT/Stanford style: get it right. Ex.: Common Lisp.
- AT&T style, or New Jersey approach: get it to work. Ex.: early Unix and C.
25 MIT/Stanford style
- Simplicity: the design must be simple, both in implementation and interface. It is more important for the interface to be simple than the implementation.
- Correctness: the design must be correct in all observable aspects.
- Consistency: the design must not be inconsistent. A design is allowed to be slightly less simple and less complete to avoid inconsistency. Consistency is as important as correctness.
- Completeness: the design must cover as many important situations as is practical. All reasonably expected cases must be covered. Simplicity is not allowed to overly reduce completeness.
26 New Jersey style
- Simplicity: the design must be simple, both in implementation and interface. It is more important for the implementation to be simple than the interface. Simplicity is the most important consideration in a design.
- Correctness: the design must be correct. It is slightly better to be simple than correct.
- Consistency: the design must not be overly inconsistent. Consistency can be sacrificed for simplicity in some cases, but it is better to drop those parts of the design that deal with less common circumstances than to introduce either implementational complexity or inconsistency.
- Completeness: the design must cover as many important situations as is practical. Completeness can be sacrificed in favor of any other quality. In fact, completeness must be sacrificed whenever implementation simplicity is jeopardized. Consistency can be sacrificed to achieve completeness if simplicity is retained; especially worthless is consistency of interface.
27 Which one is better?
- The New Jersey style has better survival characteristics than the-right-thing.
- The "PC loser-ing" problem: an interrupt arrives while a user program is inside a lengthy system routine.
- MIT philosophy: back out and restore the user program PC to the instruction that invoked the system routine.
- New Jersey style: the system routine always finishes, but sometimes an error code is returned to signal that the system routine failed to complete its action. A correct user program then has to check the error code to determine whether to simply try the system routine again. The MIT guy did not like this solution because it was not the right thing.
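The New Jersey solution pushes a retry loop into every correct client. A minimal sketch of that client-side loop follows; the status constants, `call_with_retry` and `make_flaky_read` are hypothetical stand-ins, not a real system call interface.

```python
OK = "OK"
INTERRUPTED = "INTERRUPTED"  # hypothetical "did not complete, try again" code


def call_with_retry(system_routine, max_attempts=5):
    """Call `system_routine` until it completes or attempts run out.

    `system_routine` returns (status, value). Returns (status, value, attempts).
    """
    for attempt in range(1, max_attempts + 1):
        status, value = system_routine()
        if status != INTERRUPTED:
            return status, value, attempt
    return INTERRUPTED, None, max_attempts


def make_flaky_read(interruptions):
    """Build a fake routine that is interrupted a given number of times."""
    remaining = [interruptions]

    def read():
        if remaining[0] > 0:
            remaining[0] -= 1
            return INTERRUPTED, None
        return OK, b"data"

    return read


result = call_with_retry(make_flaky_read(2))
```

The system routine stays simple; the price is that every caller has to carry a loop like this one.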
28 Which one is better?
- Unix and C are the ultimate computer viruses. Why?
- It is easy to write a decent C compiler.
- C requires the programmer to write programs that are easy for the compiler to interpret.
- Early Unix and C compilers had simple structures, are easy to port, and require few machine resources to run.
- They provide about 50%-80% of the functionality wanted from an operating system and programming language.
- Implementation simplicity has the highest priority!
- The programmer is conditioned to sacrifice some safety, convenience, and hassle to get good performance and modest resource use.
- The initial virus has to be basically good. Then the viral spread is assured as long as it is portable. Then there will be pressure to improve it, possibly by increasing its functionality closer to 90%.
- It will gain acceptance,
- it will condition its users to expect less, and
- it will be improved to a point that is almost the right thing.
- Compare Lisp with C.
29 Which one is better?
- How does the right thing stack up? There are two scenarios:
- the "big complex system" scenario
- the "diamond-like jewel" scenario
- The "big complex system" scenario: first, the right thing is designed. Then its implementation is designed. Finally, it is implemented.
- It has nearly 100% of the desired functionality; implementation simplicity is not a concern.
- It takes a long time to implement.
- It is large and complex.
- It requires complex tools to use properly.
- The last 20% takes 80% of the effort,
- so the right thing takes a long time to get out, and it only runs satisfactorily on the most sophisticated hardware.
30 Which one is better?
- The "diamond-like jewel" scenario:
- The right thing takes forever to design,
- but it is quite small at every point along the way.
- Implementing it to run fast is either impossible or beyond the capabilities of most implementers.
- The first scenario is classic artificial intelligence software.
- The two scenarios correspond to Common Lisp and Scheme.
- Lesson: do not go for the right thing first. Get half of the right thing available so that it spreads like a virus. Once people are hooked on it, take the time to improve it to 90% of the right thing.
- Wrong lesson: taking the parable literally and concluding that C is the right vehicle for AI software.
31 History of OS: Change!
                           1980            2000            Factor
Speed     CPU              1 MIPS          1000 MIPS       1000
          Memory           500 ns          2 ns            250
          Disk             18 ms           2 ms            9
          Modem            300 bits/sec    56 Kbits/sec    200
Capacity  Memory           64 Kbytes       128 Mbytes      2000
          Disk             1 Mbytes        6 Gbytes        6000
Cost      Per MIP          100K            < 1             100000
Other     Address bits     8               64              8
          Users/machine    10s             < 1             .01
32 Interaction of OS and Arch.
- New trends in computer architecture:
- Simple, directly-executed instruction sets.
- Open architecture: make the implementation visible to higher levels.
- Migration of functions from HW to SW.
- Architectural features are removed unless they can be justified by cost/performance.
- New OS requirements:
- Fast local communication,
- Distributed programming,
- Parallel programming,
- Virtual memory,
33 Interaction of OS and Arch.
- improved extensibility,
- maintainability,
- fault tolerance.
- New systems like Mach are moving toward a more modular organization.
- Unfortunately, modern OSs and architectures have evolved somewhat independently.
- Performance studies overlook the OS. On the VAX, 50% of references are for the OS. Application results do not apply to the OS.
- Modern architecture studies consider traditional OSs, not the new requirements.
- OS design focuses on performance until it reaches HW limits; then the HW is the bottleneck. But it still takes old architectures into account.
- The OS might be optimized, but not the software on top of it.
34 Some tests
- Compared the performance of several modern microprocessors based on:
- Null system call: the time to enter a null C procedure in the kernel, with interrupts (re-)enabled, and then return.
- Trap: the time to take a data access fault (e.g., on a protection violation), vector to a null C procedure in the kernel, and return. This requires saving and restoring any registers that are not preserved across procedure calls.
- Page table entry change: the time, once in the kernel, to convert a virtual address into its corresponding page table entry, update that entry to change protection information, and then update any hardware (e.g., the translation buffer) that caches this information.
- Context switch
35 Some tests (Cont)
- Results for one CISC machine, the VAXstation 3200 (11.1 MHz CVAX), and four RISC machines: the Tektronix XD88/01 (20 MHz Motorola 88000), the DECstation 3100 (16.6 MHz MIPS R2000 [Kane 87]), the DECstation 5000/200 (25 MHz MIPS R3000) and the SPARCstation 1 (25 MHz Sun SPARC).
36 Some tests (Cont)
- Objective: isolate the architectural impact on operating system functions. The drivers were restructured to remove operating system dependencies and measure only the operating system independent parts. This often reduced the execution times by a factor of two. They were further optimized (e.g., by removing extraneous procedure calls). Handlers were almost entirely written in assembler.
- Result: according to the table, OS performance is not as good as application performance. Why?
- More microcode for OS Primitives in RISC.
37 Some tests (Cont)
38 Major OS components (IPC)
- IPC is central to modern operating system structure and performance.
- Systems are moving from monolithic, centralized kernels to a more decentralized structure: independent address spaces communicating through messages. Using messages instead of shared memory simplifies the move to a physically distributed topology, or distributed computing.
- Modularity, fault tolerance, extensibility of design.
- Small kernel. Faster?
- Resource sharing
39 IPC: Cross-Machine Communication
- RPC has become the preferred method to communicate between address spaces, both on the same machine and across a network.
- It encapsulates message-oriented communication in procedure call semantics.
- OS overhead dominates network latency.
40 IPC: Cross-Machine Communication
- Because cross-machine RPC involves communication between two remote address spaces, the operating system must be involved to transfer control and data between the client and server.
- Tripling CPU speed would reduce SRC RPC latency by about 50%, on the expectation that the 83% of the time not spent on the wire will decrease by a factor of 3.
- System calls and traps do not scale well.
- Large register sets and pipelines, present on most modern RISCs, are not likely to benefit interrupt processing and thread management because of the additional state to examine, save, and restore. Checksum processing is memory intensive and not compute intensive: each checksum addition is paired with a load (which on some
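The RPC latency estimate on this slide can be checked with a few lines of arithmetic. This sketch assumes, as the slide states, that 83% of the latency is CPU-bound and scales with CPU speed, while the remaining 17% (time on the wire) does not; the function name is hypothetical.

```python
def remaining_latency_fraction(cpu_bound_fraction, cpu_speedup):
    """Fraction of the original latency left after speeding up the CPU.

    Only the CPU-bound part shrinks; time on the wire stays the same.
    """
    wire_fraction = 1.0 - cpu_bound_fraction
    return wire_fraction + cpu_bound_fraction / cpu_speedup


remaining = remaining_latency_fraction(0.83, 3)  # 0.17 + 0.83 / 3
reduction = 1.0 - remaining                      # roughly 0.55
```

Under these assumptions the latency falls to about 45% of its original value, which is consistent with the slide's "about 50%" and shows why a 3x faster CPU does not give a 3x faster RPC.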
41 IPC: Local Communication
- Local communication performance determines how effectively the operating system can be decomposed, as well as how rapidly clients can communicate with other servers.
- Lightweight remote procedure call (LRPC).
- For the simplest local calls, LRPC achieves a 3-fold performance improvement.
- Each LRPC enters the kernel twice, so the kernel is the bottleneck.
42 IPC: System call and interrupt handling
- The VAX performs functions in hardware as part of the system call and return-from-exception instructions; the time to enter and exit the kernel is longer, but the cost once in the kernel is much less.
- SPARC uses register windows to improve application performance, but 30% of the null system call time is spent on register window processing.
43 IPC: System call and interrupt handling
- The Motorola 88000 loses much of its performance advantage because of the complexity of managing its pipelines in software when a trap occurs. The 88000 includes a large number of registers containing pipeline state, and these must be examined and manipulated on a trap to check for and service any outstanding faults.
- Low-level trap handling code on the R2000 makes relatively poor use of load and branch delay slots. Nearly 50% of the delay slots in this code path are unfilled, accounting for approximately 13% of the null system call time.
44 IPC: Data Copying
- Data copying is another area in which modern processors have not scaled proportionally to their integer performance, yet it is an important aspect of local and remote communication.
- Arguments may have to be copied as many as 4 times: from client to kernel, from kernel to server, from server to kernel, and from kernel to client. Various optimizations can be applied, but even in LRPC, which uses a shared client/server buffer, two copies are necessary: one to copy arguments from the invocation stack on the call, and one to copy results on the return.
45 IPC: Data Copying
- Problem: the mismatch between memory speed and processor speed.
- On-chip cache? One level, two levels? But it is relatively small.
- Data copying for message passing and at the cache interface is still there too.
46 Virtual Memory
- The most basic use of virtual memory is to support address spaces larger than the physical memory.
- It can enhance system performance. Ex.: Mach uses copy-on-write to speed program startup and cross-address space communication.
- It can transparently support parallel programming across networks, which is important for the illusion of a shared memory multiprocessor.
- Other uses of virtual memory include transaction locking and garbage collection.
47 Virtual Memory
- The performance of virtual memory depends on:
- The ratio of physical to virtual memory size.
- The size and organization of the TLB.
- The cost of servicing a fault.
- The page replacement algorithm.
- For the OS, more important factors are:
- The flexibility of the addressing mechanism,
- The information provided by that mechanism, and
- The ease of handling faults and changing hardware VM addressing state.
48 V. M. (Fault Handling)
- New architectures are not good at exception handling. Moreover:
- The Motorola 88000 has 5 internal pipelines, including an instruction fetch pipeline, each of which must be restarted after a fault. Associated with these pipelined execution units are nearly 30 internal registers. During an exception, many of these registers must be read, saved, and restored, adding to the complexity and latency of fault handling. When a page fault occurs on the 88000, several instructions may be in the process of execution, and instructions following the faulting instruction may have already completed.
- Less information is provided to handle faults. The Intel i860 provides no information on the faulting address; in fact, it provides little information about why the fault occurred. The trap handler knows only that a data access fault of some type occurred, and the address of the faulting instruction. The fault handler must interpret the faulting instruction to determine the type of fault and the offending address. This requirement adds 26 instructions to the trap handler.
49 V. M. (TLB and page tables)
- Formerly, TLB miss handling was hidden from the operating system, but some new RISC architectures have moved TLB management into software. Advantage: flexibility in page table structure.
- The MIPS virtual address space is divided into two halves: user and system spaces. User space addresses are always translated through the TLB. System space, however, is again subdivided into four regions: unmapped and cached, unmapped and uncached, mapped and cached, and mapped and uncached. Unfortunately, the TLB is used poorly by the OS: on the VAX 11/780, the OS accounts for two thirds of all TLB misses.
- An unmapped region is accessed directly through a physical base register; there is no indirection and therefore no ability to specify page-level protection or access control, except to the entire region.
50 Threads and Multiprocessing
- Threads, or lightweight processes, have become a common and necessary component of new languages and operating systems.
- System-level and user-level threads.
- User-level thread systems can provide high-performance parallel programming primitives that approach minimal hardware costs, due to the low cost of communication within a single address space.
- The importance of threads will continue to increase in the future.
51 Arch. and User-level Threads
- While fine-grained user-level threads require no special architectural support, the architecture can have a large impact on thread performance.
- Most crucial is the amount of state that must be managed to context switch from one thread to another thread in the same address space.
- Most of the newer RISC processors, such as the Sun SPARC, the MIPS R2000 and the IBM RS6000, have more than 64 registers, compared to 16 in most earlier 32-bit CISC architectures.
52Arch User-level Threads
On SPARC of registers to be saved depends on
the number of register windows in use at the
time of the context switch. It shown that with 8
windows, on average three need to be
saved/restored on each context switch. It spend
70 of its time To save and restore windows.
Window pointer is privileged Register, then,
user level context is impossible.
53 Arch. and User-level Threads
- Larger register sets can reduce memory accesses. Register windows greatly reduce the cost of parameter passing and register saving on procedure call. The assumption was that procedure calls are much more frequent than context switches. Since context switching was expensive, it was avoided if possible. But with user-level thread parallelism, context switches are frequent and cannot simply be avoided.
- In Synapse, an object-oriented system, the ratio of procedure calls to context switches is 21:1. But it spends more time on context switches, since each one is 50 times as costly as a procedure call.
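The Synapse figures can be worked through numerically. The sketch below assumes 21 procedure calls per context switch and a context switch that costs 50 times as much as one procedure call, and it measures time in units of one procedure call; the function name is hypothetical.

```python
def context_switch_time_share(calls_per_switch, switch_cost_in_calls):
    """Share of combined call-plus-switch time spent on context switches.

    Time is measured in units of one procedure call.
    """
    call_time = calls_per_switch * 1.0
    switch_time = float(switch_cost_in_calls)
    return switch_time / (call_time + switch_time)


share = context_switch_time_share(21, 50)  # 50 / (21 + 50)
```

Under these assumptions context switches take roughly 70% of the combined time, even though procedure calls are 21 times as frequent.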
54 Arch. and Kernel-level Threads
- Kernel-level threads can be problematic due to reduced TLB effectiveness (thread context switches between separate address spaces), especially in architectures with small numbers of TLB entries and when threads are scheduled independently of the address space with which they are associated.
- The MIPS R2000/R3000 has no atomic semaphore instruction. This hurts performance, since threads are often used for program structure as well as for parallel programming. The system then has to trap into the kernel or use locking. Both are expensive.
55 OS and Applications
- The performance of primitive OS functions has not scaled with overall processor performance.
- Low-level primitives such as trap and context switch are frequently used.
- These primitives are becoming more frequently used as operating system structure evolves.
- Applications: spellcheck-1, latex-150, andrew-local (a script of file system intensive programs such as copy, compile and search, run using an entirely local file system), andrew-remote (the same script run using a remote file system), link-vmunix (the final link phase of a Mach kernel build) and parthenon (a resolution-based theorem prover that uses multiple threads).
56 OS and Applications
57 OS and Applications
- All applications were run on a MIPS R3000-based DECstation 5000/200 with 24 megabytes of memory. Each program was run three times to smooth out the effects of paging and file caching.
- Applications were running on the Mach OS.
58 OS and Appl. (Results)
- Most important, operating system primitives occur frequently, particularly in a small-kernel operating system such as Mach 3.0, and their performance has a clear effect on application performance.
- A decomposed system will execute more low-level system functions than a monolithic system.
- The number of kernel-level TLB misses is significantly larger for all applications running under Mach 3.0 than it is under Mach 2.5.
- The emulated instructions have a significant performance cost, due to the lack of an atomic Test-and-Set instruction. Critical sections execute in kernel mode in Mach 2.5 and can simply disable interrupts. But in Mach 3.0 the operating system's critical sections execute at user level, so a trap to the kernel is needed to provide mutual exclusion.
59 Next Lecture
- Kernel
- References
- "Exokernel: An Operating System Architecture for Application-Level Resource Management"
- "The x-Kernel: An Architecture for Implementing Network Protocols"
- "TinyOS: An Operating System for Sensor Networks"