Title: CS 4410 Review II
1CS 4410 Review II
2Announcements
- Today is the last day of class!!!
- You are (almost) done!
- Final, Thursday, December 18th, 2-430pm (2½ hour
exam) - Room 131 Warren Hall
- Exam will be comprehensive, covers entire
semester - Closed book, no calculators/PDAs/
- Bring ID
- I will be out of town next week (Monday, Dec 8th
- 10th) - Fill out evaluations online
- Consider being a TA next semester!
3What does the disk look like?
4Disk overheads
- To read from disk, we must specify
- cylinder , surface , sector , transfer size,
memory address - Transfer time includes
- Seek time to get to the track
- Latency time to get to the sector and
- Transfer time get bits off the disk
Track
Sector
Rotation Delay
Seek Time
5Disk Scheduling
- FCFS
- SSTF
- SCAN
- C-SCAN
- LOOK
- C-LOOK
6RAID Levels
- 0 Striping
- 1 Mirroring
- 2 Hamming Codes
- 3 Parity Bit
- 4 Block Striping
- 5 Spread parity blocks across all disks
- 01 and 10
7Stable Storage Algo
- Use 2 identical disks
- corresponding blocks on both drives are the same
- 3 operations
- Stable write retry on 1st until successful, then
try 2nd disk - Stable read read from 1st. If ECC error, then
try 2nd - Crash recovery scan corresponding blocks on both
disks - If one block is bad, replace with good one
- If both are good, replace block in 2nd with the
one in 1st
8File System Layout
- File System is stored on disks
- Disk is divided into 1 or more partitions
- Sector 0 of disk called Master Boot Record
- End of MBR has partition table (start end
address of partitions) - First block of each partition has boot block
- Loaded by MBR and executed on boot
9Implementing Files
- Contiguous Allocation allocate files
contiguously on disk
10Linked List Allocation
- Each file is stored as linked list of blocks
- First word of each block points to next block
- Rest of disk block is file data
11Using an in-memory table
- Implement a linked list allocation using a table
- Called File Allocation Table (FAT)
- Take pointer away from blocks, store in this
table
12I-nodes
- Index-node (I-node) is a per-file data structure
- Lists attributes and disk addresses of files
blocks - Pros Space (max open files size per I-node)
- Cons what if file expands beyond I-node address
space?
13Implementing Directories
- When a file is opened, OS uses path name to find
dir - Directory has information about the files disk
blocks - Whole file (contiguous), first block
(linked-list) or I-node - Directory also has attributes of each file
- Directory map ASCII file name to file attributes
location - 2 options entries have all attributes, or point
to file I-node
14Implementing Directories
- What if files have large, variable-length names?
- Solution
- Limit file name length, say 255 chars, and use
previous scheme - Pros Simple Cons wastes space
- Directory entry comprises fixed and variable
portion - Fixed part starts with entry size, followed by
attributes - Variable part has the file name
- Pros saves space
- Cons holes on removal, page fault on file read,
word boundaries - Directory entries are fixed in length, pointer to
file name in heap - Pros easy removal, no space wasted for word
boundaries - Cons manage heap, page faults on file names
15Managing Free Disk Space
- 2 approaches to keep track of free disk blocks
- Linked list and bitmap approach
16Backup Strategies
- Physical Dump
- Start from block 0 of disk, write all blocks in
order, stop after last - Pros Simple to implement, speed
- Cons skip directories, incremental dumps,
restore some file - No point dumping unused blocks, avoiding it is a
big overhead - How to dump bad blocks?
- Logical Dump
- Start at a directory
- dump all directories and files changed since base
date - Base date could be of last incremental dump, last
full dump, etc. - Also dump all dirs (even unmodified) in path to a
modified file
17File System Consistency
- System crash before modified files written back
- Leads to inconsistency in FS
- fsck (UNIX) scandisk (Windows) check FS
consistency - Algorithm
- Build 2 tables, each containing counter for all
blocks (init to 0) - 1st table checks how many times a block is in a
file - 2nd table records how often block is present in
the free list - gt1 not possible if using a bitmap
- Read all i-nodes, and modify table 1
- Read free-list and modify table 2
- Consistent state if block is either in table 1 or
2, but not both
18FS Performance
- Access to disk is much slower than access to
memory - Optimizations needed to get best performance
- 3 possible approaches caching, prefetching, disk
layout - Block or buffer cache
- Read/write from and to the cache.
19Block Cache Replacement
- Which cache block to replace?
- Could use any page replacement algorithm
- Possible to implement perfect LRU
- Since much lesser frequency of cache access
- Move block to front of queue
- Perfect LRU is undesirable. We should also
answer - Is the block essential to consistency of system?
- Will this block be needed again soon?
- When to write back other blocks?
- Update daemon in UNIX calls sync system call
every 30 s - MS-DOS uses write-through caches
20LFS Basic Idea
- Structure the disk a log
- Periodically, all pending writes buffered in
memory are collected in a single segment - The entire segment is written contiguously at end
of the log - Segment may contain i-nodes, directory entries,
data - Start of each segment has a summary
- If segment around 1 MB, then full disk bandwidth
can be utilized - Note, i-nodes are now scattered on disk
- Maintain i-node map (entry i points to i-node i
on disk) - Part of it is cached, reducing the delay in
accessing i-node - This description works great for disks of
infinite size
21LFS Cleaning
- Finite disk space implies that the disk is
eventually full - Fortunately, some segments have stale information
- A file overwrite causes i-node to point to new
blocks - Old ones still occupy space
- Solution LFS Cleaner thread compacts the log
- Read segment summary, and see if contents are
current - File blocks, i-nodes, etc.
- If not, the segment is marked free, and cleaner
moves forward - Else, cleaner writes content into new segment at
end of the log - The segment is marked as free!
- Disk is a circular buffer, writer adds contents
to the front, cleaner cleans content from the back
22FS Examples
- DOS
- Win98
- WinXP
- UNIX FS
- Linux ext2FS
- NFS, AFS, LFS
- P2P FSes
23Network Stack Layering
Node A
Application
Node B
Application
Presentation
Presentation
Session
Session
Transport
Transport
Network
Network
Data Link
Data Link
Physical
Physical
Network
24OSI Levels
- Application
- high-level protocols (mail, ftp, etc.)
- Presentation Layer
- data representation in the message
- Session Layer
- really part of transport, typ. Not impl.
- Transport Layer
- reliable transmission of messages,
disassembly/assembly, ordering, retransmission of
lost packets - Network Layer
- routing packets to the destination
- Data Link Layer
- sending frames of bits and error detection
- Physical Layer
- electrical details of bits on the wire
25End-to-End Argument
- What function to implement in each layer?
- Saltzer, Reed, Clarke 1984
- A function can be correctly and completely
implemented only with the knowledge and help of
applications standing at the communication
endpoints - Argues for moving function upward in a layered
architecture - Should the network guarantee packet delivery ?
- Think about a file transfer program
- Read file from disk, send it, the receiver reads
packets and writes them to the disk
26Packet vs. Circuit Switching
- Reliability no congestion, in-order data in
circuit-switch - Packet switching better bandwidth use
- State, resources packet switching has less state
- Good less control plane processing resources
along the way - More data plane (address lookup) processing
- Failure modes (routers/links down)
- Packet switch reconfigures sub-second timescale
- Circuit switching more complicated
- Involves all switches in the path
27Link level Issues
- Encoding map bits to analog signals
- Framing Group bits into frames (packets)
- Arbitration multiple senders, one resource
- Addressing multiple receivers, one wire
28Repeaters and Bridges
- Both connect LAN segments
- Usually do not originate data
- Repeaters (Hubs) physical layer devices
- forward packets on all LAN segments
- Useful for increasing range
- Increases contention
- Bridges link layer devices
- Forward packets only if meant on that segment
- Isolates congestion
- More expensive
29Network Layer
- Two important functions
- routing determine path from source to dest.
- forwarding move packets from routers input to
output
T3
T1 T3
Sts-1
T1
30Two connection models
- Connectionless (or datagram)
- each packet contains enough information that
routers can decide how to get it to its final
destination - Connection-oriented (or virtual circuit)
- first set up a connection between two nodes
- label it (called a virtual circuit identifier
(VCI)) - all packets carry label
1
A
31DNS name servers
- Name server process running on a host that
processes DNS requests - local name servers
- each ISP, company has local (default) name server
- host DNS query first goes to local name server
- authoritative name server
- can perform name/address translation for a
specific domain or zone
- How could we provide this service? Why not
centralize DNS? - single point of failure
- traffic volume
- distant centralized database
- maintenance
- doesnt scale!
- no server has all name-to-IP address mappings
32Purpose of Transport layer
- Interface end-to-end applications and protocols
- Turn best-effort IP into a usable interface
- Data transfer b/w processes
- Compared to end-to-end IP
- We will look at 2
- TCP
- UDP
33UDP
- Unreliable Datagram Protocol
- Best effort data delivery between processes
- No frills, bare bones transport protocol
- Packet may be lost, out of order
- Connectionless protocol
- No handshaking between sender and receiver
- Each UDP datagram handled independently
34UDP Functionality
- Multiplexing/Demultiplexing
- Using ports
- Checksums (optional)
- Check for corruption
P3
P4
application-layer data
segment header
P1
P2
segment
H
t
M
segment
receiver
35TCP
- Transmission Control Protocol
- Reliable, in-order, process-to-process, two-way
byte stream - Different from UDP
- Connection-oriented
- Error recovery Packet loss, duplication,
corruption, reordering - A number of applications require this guarantee
- Web browsers use TCP
36TCP Summary
- Reliable ordered message delivery
- Connection oriented, 3-way handshake
- Transmission window for better throughput
- Timeouts based on link parameters
- Congestion control
- Linear increase, exponential backoff
- Fast adaptation
- Exponential increase in the initial phase
37Security in Computer Systems
- In computer systems, this translates to
- Authorization
- Authentication
- Audit
- This is the Gold Standard for Security (Lampson)
- Some security goals
- Data confidentiality secret data remains secret
- Data integrity no tampering of data
- System availability unable to make system
unusable - Privacy protecting from misuse of users
information
38Cryptography Overview
- Encrypt data so it only makes sense to authorized
users - Input data is a message or file called plaintext
- Encrypted data is called ciphertext
- Encryption and decryption functions should be
public - Security by obscurity is not a good idea!
39Secret-Key Cryptography
- Also called symmetric cryptography
- Encryption algorithm is publicly known
- E(message, key) ciphertext D(ciphertext,
key) message - Naïve scheme monoalphabetic substitution
- Plaintext ABCDEFGHIJKLMNOPQRSTUVWXYZ
- Ciphertext QWERTYUIOPASDFGHJKLZXCVBNM
- So, attack is encrypted to qzzqea
- 26! possible keys 4x1026 possibilities
- 1 µs per permutation ? 10 trillion years to break
- easy to break this scheme! How?
- e occurs 14, t 9.85, q 0.26
40Public Key Cryptography
- Diffie and Hellman, 1976
- All users get a public key and a private key
- Public key is published
- Private key is not known to anyone else
- If Alice has a packet to send to Bob,
- She encrypts the packet with Bobs public key
- Bob uses his private key to decrypt Alices
packet - Private key linked mathematically to public key
- Difficult to derive by making it computationally
infeasible (RSA) - Pros more security, convenient, digital
signatures - Cons slower
41Digital Signatures
- Hashing function hard to invert, e.g. MD5, SHA
- Apply private key to hash (decrypt hash)
- Called signature block
- Receiver uses senders public key on signature
block - E(D(x)) x should work (works for RSA)
42Authentication
- Establish the identity of user/machine by
- Something you know (password, secret)
- Something you have (credit card, smart card)
- Something you are (retinal scan, fingerprint)
- In the case of an OS this is done during login
- OS wants to know who the user is
- Passwords secret known only to the subject
- Simplest OS implementation keeps (login,
password) pair - Authenticates user on login by checking the
password - Try to make this scheme as secure as possible!
- Display the password when being typed? (Windows,
UNIX)
43Salting Example
- If the hacker guesses Dog, he has to try Dog0001,
- UNIX adds 12-bit of salt
- Passwords should be made secure
- Length, case, digits, not from dictionary
- Can be imposed by the OS! This has its own
tradeoffs
44One time passwords
- Password lasts only once
- User gets book with passwords
- Each login uses next password in list
- Much easier approach (Lamport 1981)
- Uses one-way hash functions
- User stores Server stores
- uid, passwd uid, n, m, H hm(passwd)
-
- nn-1
- S hn(password)
- if(hm-n(S) H)
- then mnHSaccept
- else reject
uid
n
S
45Security Attacks Defenses
- Attacks
- Trojan Horses
- Login spoofing
- Logic bombs
- Trapdoors
- Buffer overflows
- Viruses, worms
- Denial of Service
- Defenses
- Virus Scanners
- Lures
- Intrusion Detection
46Mobile Code Protection
- Can we place extension code in the same address
space as the base system, yet remain secure ? - Many techniques have been proposed
- SFI
- Safe interpreters
- Language-based protection
- PCC
47Encoding Security
- Depends on how a system represents the Matrix
- Not much sense in storing entire matrix!
- ACL column for each object stored as a list for
the object - Capabilities row for each subject stored as list
for the subject
Cs414 grades Cs415 grades Emacs
Ranveer r/w r/w Kill/resume
Tom r r/w None
Mohamed r r None
48Protecting Capabilities
- Prevent users from tampering with capabilities
- Tagged Architecture
- Each memory word has extra bit indicating that it
is a capability - These bits can only be modified in kernel mode
- Cannot be used for arithmetic, etc.
- Sparse name space implementation
- Kernel stores capability as objectrightsrandom
number - Give copy of capability to the user user can
transfer rights - Relies on inability of user to guess the random
number - Need a good random number generator
49Protecting Capabilities
- Kernel capabilities per-process capability
information - Store the C-list in kernel memory
- Process access capabilities by offset into the
C-list - Indirection used to make capabilities unforgeable
- Meta instructions to add/delete/modify
capabilities
50Protecting Capabilities
- Cryptographically protected capabilities
- Store capabilities in user space useful for
distributed systems - Store ltserver, object, rights, f(object, rights,
check)gt tuple - The check is a nonce,
- unique number generated when capability is
created - kept with object on the server never sent on the
network - Language-protected capabilities
- SPIN operating system (Mesa, Java, etc.)
51Capability Revocation
- Kernel based implementation
- Kernel keeps track of all capabilities
invalidates on revocation - Object keeps track of revocation list
- Difficult to implement
- Timeout the capabilities
- How long should the expiration timer be?
- Revocation by indirection
- Grant access to object by creating alias give
capability to alias - Difficult to review all capabilities
- Revocation with conditional capabilities
- Object has state called big bag
- Access only if capabilitys little bag has sth.
in objects big bag
52Distributed SystemsEvent Ordering
- Fundamental problem distributed systems do not
share a clock - Many coordination problems would be simplified if
they did (first one wins) - Distributed systems do have some sense of time
- Events in a single process happen in order
- Messages between processes must be sent before
they can be received - How helpful is this?
53Happens-before
- Define a happens-before relation (denoted by ?).
- 1) If A and B are events in the same process, and
A was executed before B, then A ? B. - 2) If A is the event of sending a message by one
process and B is the event of receiving that
message by another process, then A ? B. - 3) If A ? B and B ? C then A ? C.
- Logical Clocks using happens-before relation
- Clocks tick once per event (including message
send) - When send a message, send your clock value
- When receive a message, set your clock to MAX(
your clock, timestamp of message 1) - Thus sending comes before receiving
- Only visibility into actions at other nodes
happens during communication, communicate
synchronizes the clocks
54Total ordering?
- Happens-before gives a partial ordering of events
- We still do not have a total ordering of events
- We are not able to order events that happen
concurrently - Concurrent if (not A?B) and (not B?A)
55Partial Ordering
Pi -gtPi1 Qi -gt Qi1 Ri -gt Ri1
R0-gtQ4 Q3-gtR4 Q1-gtP4 P1-gtQ2
56Total Ordering?
P0, P1, Q0, Q1, Q2, P2, P3, P4, Q3, R0, Q4, R1,
R2, R3, R4
P0, Q0, Q1, P1, Q2, P2, P3, P4, Q3, R0, Q4, R1,
R2, R3, R4
P0, Q0, P1, Q1, Q2, P2, P3, P4, Q3, R0, Q4, R1,
R2, R3, R4
57 Logical Timestamps w/ tie breaker
- Assume each process has a local logical clock
that ticks once per event and that the processes
are numbered - Clocks tick once per event (including message
send) - When send a message, send your clock value
- When receive a message, set your clock to MAX(
your clock, timestamp of message 1) - Thus sending comes before receiving
- Only visibility into actions at other nodes
happens during communication, communicate
synchronizes the clocks - If the timestamps of two events A and B are the
same, then use the process identity numbers to
break ties. - This gives a total ordering!
58Categories of failures
- Crash failures
- process simply stops, and does nothing wrong that
would be externally visible before it stops - These faults cant be directly detected
- These are common in real systems
- Fail-stop failures
- Process fails by crashing and notifies everyone
beforehand - Easy to work with but rarely supported
- Non-malicious Byzantine failures
- This is the best way to understand many kinds of
bugs - Process can do arbitrary behavior, including
sending corrupted messages - But it doesnt do so with the intention of
screwing up our protocols - Unfortunately, a pretty common mode of failure
- Malicious, true Byzantine, failures
- Model of an attacker who has studied the system
and wants to break it - can corrupt, intercept, replay messages,
compromise programs, etc - Does not happen often in practice
- These failures are costly to defend against
59Distributed Decision Making
- Two-phase commit distributed decision making
- First, make sure everyone guarantees that they
will commit if asked (prepare) - Next, ask everyone to commit
- Assumes crash or fail-stop faults
- Byzantine Generals Problem distributed decision
making with malicious failures - n general some number of them may be malicious
(upto f of them) - All non-malicious generals must come to same
decision - Only solvable if n ? 3f1, but costs (f1)n2
,messages
60Distributed File Systems
- RPC
- Call procedure on remote machine
- Provides same interface as procedure
- Automatic packing and unpacking of arguments
without user programming (in stub) - NFS
- Simple distributed file system protocol. No
open/close - Stateless server
- Has problems with cache consistency, locking
protocol - AFS
- More complicated distributed file system protocol
- Stateful server
- session semantics consistency on close
61Thats it!
- Good luck and have a good break!