CS 4410 Review II

About This Presentation

Title:

CS 4410 Review II

Description:

Crash recovery: scan corresponding blocks on both disks ... Lists attributes and disk addresses of file's blocks ... from block 0 of disk, write all blocks in ... – PowerPoint PPT presentation

Number of Views:96

Avg rating:3.0/5.0

Slides: 62

Provided by: ranveer7

Learn more at: http://www.cs.cornell.edu

Category:

more less

Transcript and Presenter's Notes

Title: CS 4410 Review II

1
CS 4410 Review II
2
Announcements

Today is the last day of class!!!
You are (almost) done!
Final, Thursday, December 18th, 2-430pm (2½ hour
exam)
Room 131 Warren Hall
Exam will be comprehensive, covers entire
semester
Closed book, no calculators/PDAs/
Bring ID
I will be out of town next week (Monday, Dec 8th
- 10th)
Fill out evaluations online
Consider being a TA next semester!

3
What does the disk look like?
4
Disk overheads

To read from disk, we must specify
cylinder , surface , sector , transfer size,
memory address
Transfer time includes
Seek time to get to the track
Latency time to get to the sector and
Transfer time get bits off the disk

Track
Sector
Rotation Delay
Seek Time
5
Disk Scheduling

FCFS
SSTF
SCAN
C-SCAN
LOOK
C-LOOK

6
RAID Levels

0 Striping
1 Mirroring
2 Hamming Codes
3 Parity Bit
4 Block Striping
5 Spread parity blocks across all disks
01 and 10

7
Stable Storage Algo

Use 2 identical disks
corresponding blocks on both drives are the same
3 operations
Stable write retry on 1st until successful, then
try 2nd disk
Stable read read from 1st. If ECC error, then
try 2nd
Crash recovery scan corresponding blocks on both
disks
If one block is bad, replace with good one
If both are good, replace block in 2nd with the
one in 1st

8
File System Layout

File System is stored on disks
Disk is divided into 1 or more partitions
Sector 0 of disk called Master Boot Record
End of MBR has partition table (start end
address of partitions)
First block of each partition has boot block
Loaded by MBR and executed on boot

9
Implementing Files

Contiguous Allocation allocate files
contiguously on disk

10
Linked List Allocation

Each file is stored as linked list of blocks
First word of each block points to next block
Rest of disk block is file data

11
Using an in-memory table

Implement a linked list allocation using a table
Called File Allocation Table (FAT)
Take pointer away from blocks, store in this
table

12
I-nodes

Index-node (I-node) is a per-file data structure
Lists attributes and disk addresses of files
blocks
Pros Space (max open files size per I-node)
Cons what if file expands beyond I-node address
space?

13
Implementing Directories

When a file is opened, OS uses path name to find
dir
Directory has information about the files disk
blocks
Whole file (contiguous), first block
(linked-list) or I-node
Directory also has attributes of each file
Directory map ASCII file name to file attributes
location
2 options entries have all attributes, or point
to file I-node

14
Implementing Directories

What if files have large, variable-length names?
Solution
Limit file name length, say 255 chars, and use
previous scheme
Pros Simple Cons wastes space
Directory entry comprises fixed and variable
portion
Fixed part starts with entry size, followed by
attributes
Variable part has the file name
Pros saves space
Cons holes on removal, page fault on file read,
word boundaries
Directory entries are fixed in length, pointer to
file name in heap
Pros easy removal, no space wasted for word
boundaries
Cons manage heap, page faults on file names

15
Managing Free Disk Space

2 approaches to keep track of free disk blocks
Linked list and bitmap approach

16
Backup Strategies

Physical Dump
Start from block 0 of disk, write all blocks in
order, stop after last
Pros Simple to implement, speed
Cons skip directories, incremental dumps,
restore some file
No point dumping unused blocks, avoiding it is a
big overhead
How to dump bad blocks?
Logical Dump
Start at a directory
dump all directories and files changed since base
date
Base date could be of last incremental dump, last
full dump, etc.
Also dump all dirs (even unmodified) in path to a
modified file

17
File System Consistency

System crash before modified files written back
Leads to inconsistency in FS
fsck (UNIX) scandisk (Windows) check FS
consistency
Algorithm
Build 2 tables, each containing counter for all
blocks (init to 0)
1st table checks how many times a block is in a
file
2nd table records how often block is present in
the free list
gt1 not possible if using a bitmap
Read all i-nodes, and modify table 1
Read free-list and modify table 2
Consistent state if block is either in table 1 or
2, but not both

18
FS Performance

Access to disk is much slower than access to
memory
Optimizations needed to get best performance
3 possible approaches caching, prefetching, disk
layout
Block or buffer cache
Read/write from and to the cache.

19
Block Cache Replacement

Which cache block to replace?
Could use any page replacement algorithm
Possible to implement perfect LRU
Since much lesser frequency of cache access
Move block to front of queue
Perfect LRU is undesirable. We should also
answer
Is the block essential to consistency of system?
Will this block be needed again soon?
When to write back other blocks?
Update daemon in UNIX calls sync system call
every 30 s
MS-DOS uses write-through caches

20
LFS Basic Idea

Structure the disk a log
Periodically, all pending writes buffered in
memory are collected in a single segment
The entire segment is written contiguously at end
of the log
Segment may contain i-nodes, directory entries,
data
Start of each segment has a summary
If segment around 1 MB, then full disk bandwidth
can be utilized
Note, i-nodes are now scattered on disk
Maintain i-node map (entry i points to i-node i
on disk)
Part of it is cached, reducing the delay in
accessing i-node
This description works great for disks of
infinite size

21
LFS Cleaning

Finite disk space implies that the disk is
eventually full
Fortunately, some segments have stale information
A file overwrite causes i-node to point to new
blocks
Old ones still occupy space
Solution LFS Cleaner thread compacts the log
Read segment summary, and see if contents are
current
File blocks, i-nodes, etc.
If not, the segment is marked free, and cleaner
moves forward
Else, cleaner writes content into new segment at
end of the log
The segment is marked as free!
Disk is a circular buffer, writer adds contents
to the front, cleaner cleans content from the back

22
FS Examples

DOS
Win98
WinXP
UNIX FS
Linux ext2FS
NFS, AFS, LFS
P2P FSes

23
Network Stack Layering
Node A
Application
Node B
Application
Presentation
Presentation
Session
Session
Transport
Transport
Network
Network
Data Link
Data Link
Physical
Physical
Network
24
OSI Levels

Application
high-level protocols (mail, ftp, etc.)
Presentation Layer
data representation in the message
Session Layer
really part of transport, typ. Not impl.
Transport Layer
reliable transmission of messages,
disassembly/assembly, ordering, retransmission of
lost packets
Network Layer
routing packets to the destination
Data Link Layer
sending frames of bits and error detection
Physical Layer
electrical details of bits on the wire

25
End-to-End Argument

What function to implement in each layer?
Saltzer, Reed, Clarke 1984
A function can be correctly and completely
implemented only with the knowledge and help of
applications standing at the communication
endpoints
Argues for moving function upward in a layered
architecture
Should the network guarantee packet delivery ?
Think about a file transfer program
Read file from disk, send it, the receiver reads
packets and writes them to the disk

26
Packet vs. Circuit Switching

Reliability no congestion, in-order data in
circuit-switch
Packet switching better bandwidth use
State, resources packet switching has less state
Good less control plane processing resources
along the way
More data plane (address lookup) processing
Failure modes (routers/links down)
Packet switch reconfigures sub-second timescale
Circuit switching more complicated
Involves all switches in the path

27
Link level Issues

Encoding map bits to analog signals
Framing Group bits into frames (packets)
Arbitration multiple senders, one resource
Addressing multiple receivers, one wire

28
Repeaters and Bridges

Both connect LAN segments
Usually do not originate data
Repeaters (Hubs) physical layer devices
forward packets on all LAN segments
Useful for increasing range
Increases contention
Bridges link layer devices
Forward packets only if meant on that segment
Isolates congestion
More expensive

29
Network Layer

Two important functions
routing determine path from source to dest.
forwarding move packets from routers input to
output

T3
T1 T3
Sts-1
T1
30
Two connection models

Connectionless (or datagram)
each packet contains enough information that
routers can decide how to get it to its final
destination
Connection-oriented (or virtual circuit)
first set up a connection between two nodes
label it (called a virtual circuit identifier
(VCI))
all packets carry label

1
A
31
DNS name servers

Name server process running on a host that
processes DNS requests
local name servers
each ISP, company has local (default) name server
host DNS query first goes to local name server
authoritative name server
can perform name/address translation for a
specific domain or zone

How could we provide this service? Why not
centralize DNS?
single point of failure
traffic volume
distant centralized database
maintenance
doesnt scale!
no server has all name-to-IP address mappings

32
Purpose of Transport layer

Interface end-to-end applications and protocols
Turn best-effort IP into a usable interface
Data transfer b/w processes
Compared to end-to-end IP
We will look at 2
TCP
UDP

33
UDP

Unreliable Datagram Protocol
Best effort data delivery between processes
No frills, bare bones transport protocol
Packet may be lost, out of order
Connectionless protocol
No handshaking between sender and receiver
Each UDP datagram handled independently

34
UDP Functionality

Multiplexing/Demultiplexing
Using ports
Checksums (optional)
Check for corruption

P3
P4
application-layer data
segment header
P1
P2
segment
H
t
M
segment
receiver
35
TCP

Transmission Control Protocol
Reliable, in-order, process-to-process, two-way
byte stream
Different from UDP
Connection-oriented
Error recovery Packet loss, duplication,
corruption, reordering
A number of applications require this guarantee
Web browsers use TCP

36
TCP Summary

Reliable ordered message delivery
Connection oriented, 3-way handshake
Transmission window for better throughput
Timeouts based on link parameters
Congestion control
Linear increase, exponential backoff
Fast adaptation
Exponential increase in the initial phase

37
Security in Computer Systems

In computer systems, this translates to
Authorization
Authentication
Audit
This is the Gold Standard for Security (Lampson)
Some security goals
Data confidentiality secret data remains secret
Data integrity no tampering of data
System availability unable to make system
unusable
Privacy protecting from misuse of users
information

38
Cryptography Overview

Encrypt data so it only makes sense to authorized
users
Input data is a message or file called plaintext
Encrypted data is called ciphertext
Encryption and decryption functions should be
public
Security by obscurity is not a good idea!

39
Secret-Key Cryptography

Also called symmetric cryptography
Encryption algorithm is publicly known
E(message, key) ciphertext D(ciphertext,
key) message
Naïve scheme monoalphabetic substitution
Plaintext ABCDEFGHIJKLMNOPQRSTUVWXYZ
Ciphertext QWERTYUIOPASDFGHJKLZXCVBNM
So, attack is encrypted to qzzqea
26! possible keys 4x1026 possibilities
1 µs per permutation ? 10 trillion years to break
easy to break this scheme! How?
e occurs 14, t 9.85, q 0.26

40
Public Key Cryptography

Diffie and Hellman, 1976
All users get a public key and a private key
Public key is published
Private key is not known to anyone else
If Alice has a packet to send to Bob,
She encrypts the packet with Bobs public key
Bob uses his private key to decrypt Alices
packet
Private key linked mathematically to public key
Difficult to derive by making it computationally
infeasible (RSA)
Pros more security, convenient, digital
signatures
Cons slower

41
Digital Signatures

Hashing function hard to invert, e.g. MD5, SHA
Apply private key to hash (decrypt hash)
Called signature block
Receiver uses senders public key on signature
block
E(D(x)) x should work (works for RSA)

42
Authentication

Establish the identity of user/machine by
Something you know (password, secret)
Something you have (credit card, smart card)
Something you are (retinal scan, fingerprint)
In the case of an OS this is done during login
OS wants to know who the user is
Passwords secret known only to the subject
Simplest OS implementation keeps (login,
password) pair
Authenticates user on login by checking the
password
Try to make this scheme as secure as possible!
Display the password when being typed? (Windows,
UNIX)

43
Salting Example

If the hacker guesses Dog, he has to try Dog0001,
UNIX adds 12-bit of salt
Passwords should be made secure
Length, case, digits, not from dictionary
Can be imposed by the OS! This has its own
tradeoffs

44
One time passwords

Password lasts only once
User gets book with passwords
Each login uses next password in list
Much easier approach (Lamport 1981)
Uses one-way hash functions
User stores Server stores
uid, passwd uid, n, m, H hm(passwd)
nn-1
S hn(password)
if(hm-n(S) H)
then mnHSaccept
else reject

uid
n
S
45
Security Attacks Defenses

Attacks
Trojan Horses
Login spoofing
Logic bombs
Trapdoors
Buffer overflows
Viruses, worms
Denial of Service
Defenses
Virus Scanners
Lures
Intrusion Detection

46
Mobile Code Protection

Can we place extension code in the same address
space as the base system, yet remain secure ?
Many techniques have been proposed
SFI
Safe interpreters
Language-based protection
PCC

47
Encoding Security

Depends on how a system represents the Matrix
Not much sense in storing entire matrix!
ACL column for each object stored as a list for
the object
Capabilities row for each subject stored as list
for the subject

Cs414 grades Cs415 grades Emacs
Ranveer r/w r/w Kill/resume
Tom r r/w None
Mohamed r r None
48
Protecting Capabilities

Prevent users from tampering with capabilities
Tagged Architecture
Each memory word has extra bit indicating that it
is a capability
These bits can only be modified in kernel mode
Cannot be used for arithmetic, etc.
Sparse name space implementation
Kernel stores capability as objectrightsrandom
number
Give copy of capability to the user user can
transfer rights
Relies on inability of user to guess the random
number
Need a good random number generator

49
Protecting Capabilities

Kernel capabilities per-process capability
information
Store the C-list in kernel memory
Process access capabilities by offset into the
C-list
Indirection used to make capabilities unforgeable
Meta instructions to add/delete/modify
capabilities

50
Protecting Capabilities

Cryptographically protected capabilities
Store capabilities in user space useful for
distributed systems
Store ltserver, object, rights, f(object, rights,
check)gt tuple
The check is a nonce,
unique number generated when capability is
created
kept with object on the server never sent on the
network
Language-protected capabilities
SPIN operating system (Mesa, Java, etc.)

51
Capability Revocation

Kernel based implementation
Kernel keeps track of all capabilities
invalidates on revocation
Object keeps track of revocation list
Difficult to implement
Timeout the capabilities
How long should the expiration timer be?
Revocation by indirection
Grant access to object by creating alias give
capability to alias
Difficult to review all capabilities
Revocation with conditional capabilities
Object has state called big bag
Access only if capabilitys little bag has sth.
in objects big bag

52
Distributed SystemsEvent Ordering

Fundamental problem distributed systems do not
share a clock
Many coordination problems would be simplified if
they did (first one wins)
Distributed systems do have some sense of time
Events in a single process happen in order
Messages between processes must be sent before
they can be received
How helpful is this?

53
Happens-before

Define a happens-before relation (denoted by ?).
1) If A and B are events in the same process, and
A was executed before B, then A ? B.
2) If A is the event of sending a message by one
process and B is the event of receiving that
message by another process, then A ? B.
3) If A ? B and B ? C then A ? C.
Logical Clocks using happens-before relation
Clocks tick once per event (including message
send)
When send a message, send your clock value
When receive a message, set your clock to MAX(
your clock, timestamp of message 1)
Thus sending comes before receiving
Only visibility into actions at other nodes
happens during communication, communicate
synchronizes the clocks

54
Total ordering?

Happens-before gives a partial ordering of events
We still do not have a total ordering of events
We are not able to order events that happen
concurrently
Concurrent if (not A?B) and (not B?A)

55
Partial Ordering
Pi -gtPi1 Qi -gt Qi1 Ri -gt Ri1
R0-gtQ4 Q3-gtR4 Q1-gtP4 P1-gtQ2
56
Total Ordering?
P0, P1, Q0, Q1, Q2, P2, P3, P4, Q3, R0, Q4, R1,
R2, R3, R4
P0, Q0, Q1, P1, Q2, P2, P3, P4, Q3, R0, Q4, R1,
R2, R3, R4
P0, Q0, P1, Q1, Q2, P2, P3, P4, Q3, R0, Q4, R1,
R2, R3, R4
57
Logical Timestamps w/ tie breaker

Assume each process has a local logical clock
that ticks once per event and that the processes
are numbered
Clocks tick once per event (including message
send)
When send a message, send your clock value
When receive a message, set your clock to MAX(
your clock, timestamp of message 1)
Thus sending comes before receiving
Only visibility into actions at other nodes
happens during communication, communicate
synchronizes the clocks
If the timestamps of two events A and B are the
same, then use the process identity numbers to
break ties.
This gives a total ordering!

58
Categories of failures

Crash failures
process simply stops, and does nothing wrong that
would be externally visible before it stops
These faults cant be directly detected
These are common in real systems
Fail-stop failures
Process fails by crashing and notifies everyone
beforehand
Easy to work with but rarely supported
Non-malicious Byzantine failures
This is the best way to understand many kinds of
bugs
Process can do arbitrary behavior, including
sending corrupted messages
But it doesnt do so with the intention of
screwing up our protocols
Unfortunately, a pretty common mode of failure
Malicious, true Byzantine, failures
Model of an attacker who has studied the system
and wants to break it
can corrupt, intercept, replay messages,
compromise programs, etc
Does not happen often in practice
These failures are costly to defend against