Final Review

About This Presentation

Title:

Final Review

Description:

Eric and Guppy can now execute SELECT queries on the view YoungSailors. Note, however, that Eric and Guppy don't have the right to execute SELECT ... – PowerPoint PPT presentation

Number of Views:40

Avg rating:3.0/5.0

Slides: 63

Provided by: scie232

Category:

more less

Transcript and Presenter's Notes

Title: Final Review

1
Final Review
2
SQL

Will be in the exam.
So, refresh midterm review.
Constraints as well.

3
Security and Authorization
4
GRANT Command

GRANT privileges ON object TO users WITH GRANT
OPTION
The following privileges can be specified
SELECT Can read all columns
including those added later via ALTER TABLE
command
INSERT(column-name) Can insert tuples with
non-null or nondefault values in this column.
INSERT means same right with respect to all
columns.
DELETE Can delete tuples.
REFERENCES (column-name) Can define foreign keys
(in other tables) that refer to this column.

5
Grant Examples I

Joe has created the tables
Sailors(sid, sname, rating, age)
Boats(bid, bname, color)
Reserves(sid, bid, day)
Joe now executes the following
GRANT INSERT, DELETE ON Reserves TO Yuppy WITH
GRANT OPTION
Yuppy can now insert or delete Reserves rows and
authorize someone else to do the same.

6
Grant Examples II

Joe further executes
GRANT SELECT ON Reserves TO Michael
GRANT SELECT ON Sailors TO Michael WITH GRANT
OPTION
Michael can now execute SELECT queries on Sailors
and Reserves, and he can pass this privilege to
others for Sailors but not for Reserves.
With the SELECT privilege, Michael can create a
view that accesses the Sailors and Reserves
tables, for example, the ActiveSailors view
CREATE VIEW ActiveSailors (name, age, day) AS
SELECT S.sname, S.age, R.day
FROM Sailors S, Reserves R
WHERE S.sid R.sid AND S.rating gt 6
However, Michael cannot grant SELECT on
ActiveSailors to others. Why?

7
Grant Examples III

On the other hand, suppose that Michael creates
the following view
CREATE VIEW YoungSailors (sid, age, rating)AS
SELECT S.sid, S.age, S.rating
FROM Sailors S
WHERE S.age lt 18
The only underlying table is Sailors, for which
Michael has SELECT with grant option. Therefore
he can pass this on to Eric and Guppy
GRANT SELECT ON YoungSailors TO Eric, Guppy
Eric and Guppy can now execute SELECT queries on
the view YoungSailors.
Note, however, that Eric and Guppy dont have the
right to execute SELECT queries directly on the
underlying Sailor table.

8
Grant Examples IV

Suppose now Joe executes
GRANT UPDATE (rating) ON Sailors TO Leah
Leah can update only the rating column of
Sailors. E.g.
UPDATE Sailors S
SET S.rating 8
However, she cannot execute
UPDATE Sailors S
SET S.age 25
She cannot execute either
UPDATE Sailors S
SET S.rating S.rating-l
Why?

9
Grant Examples V

Suppose now Joe executes
GRANT SELECT, REFERENCES(bid) ON Boats TO Bill
Bill can refer to the bid column of Boats as a
foreign key in another table. E.g.
CREATE TABLE Reserves (
sid INTEGER,
bid INTEGER,
day DATE,
PRIMARY KEY (bid, day),
FOREIGN KEY (sid) REFERENCES Sailors,
FOREIGN KEY (bid) REFERENCES Boats
)

10
Revoke Examples I

REVOKE GRANT OPTION FOR privileges
ON object FROM users RESTRICT CASCADE
Suppose Joe is the creator of Sailors.
GRANT SELECT ON Sailors TO Art WITH GRANT OPTION
(executed by Joe)
GRANT SELECT ON Sailors TO Bob WITH GRANT OPTION
(executed by Art)
REVOKE SELECT ON Sailors FROM Art CASCADE
(executed by Joe)

11
Revoke Examples II

Art loses the SELECT privilege on Sailors.
Then Bob, who received this privilege from Art,
and only Art, also loses this privilege.
Bobs privilege is said to be abandoned
When CASCADE is specified, all abandoned
privileges are also revoked
Possibly causing privileges held by other users
to become abandoned and thereby revoked
recursively.
If the RESTRICT keyword is specified, the command
is rejected if revoking privileges causes other
privileges becoming abandoned.

12
Grant and Revoke on Views

Suppose that Joe created Sailors and gave Michael
the SELECT privilege on it with the grant option.
Michael then created the view YoungSailors and
gave Eric the SELECT privilege on YoungSailors.
Eric now defines a view called FineYoungSailors
CREATE VIEW FineYoungSailors (name, age, rating)
AS
SELECT S.sname, S.age, S.rating
FROM YoungSailors S
WHERE S.ratinggt 6
What happens if Joe revokes the SELECT privilege
on Sailors from Michael?
Michael no longer has the authority to execute
the query used to define YoungSailors because the
definition refers to Sailors.
Therefore, the view YoungSailors is dropped
(I.e., destroyed).
In turn, FineYoungSailors is dropped as well.

13
Revoking REFERENCES privilege

Suppose Joe had executed
GRANT REFERENCES(bid) ON Boats TO Bill
Bill can refer to the bid column of Boats as a
foreign key in another table. E.g.
CREATE TABLE Reserves (
sid INTEGER,
bid INTEGER,
day DATE,
PRIMARY KEY (bid, day),
FOREIGN KEY (sid) REFERENCEs Sailors,
FOREIGN KEY (bid) REFERENCES Boats
)
If Joe revokes the REFERENCES privilege from
Bill,
then the Foreign Key constraint referencing the
Boat table will be dropped from the Bills
Reserves table.

14
Storage
15
The Memory Hierarchy
Desired data carried to read/write port, access
times in seconds. Most common racks of tapes
newer devices CDROM juke boxes, tape
silo's. Capacities in terabytes.

Typically magnetic disks, magnetooptical
(erasable), CDROM.
Access times in milliseconds, great
variability.
Unit of read/write block or page,
typically 16Kb.
Capacities in gigabytes.

under a microsecond, random access, perhaps 512Mb
fastest, perhaps 1Mb
16
Disks
To motivate many of the ideas used in DBMSes, we
must examine the operation of disks in detail.

Platters with top and bottom surfaces rotate
around a spindle.
Diameters 1 inch to 4 feet.
2--30 surfaces.
Rotation speed 3600--7200 rpm.
One head per surface.
All heads move in and out in unison.

17
Tracks and sectors

Surfaces are covered with concentric tracks.
Tracks at a common radius cylinder.
Important because all data of a cylinder can be
read quickly, without moving the heads.
Typical magnetic disk 16,000 cylinders
Tracks are divided into sectors by unmagnetized
gaps (which are 10 of track).
Typical track 512 sectors.
Typical sector 4096 bytes.
Sectors are grouped into blocks.
Typical one 16K block 4 4096byte sectors.

18
I/O model of computation

Disk I/O read or write of a block is very
expensive compared with what is likely to be done
with the block once it arrives in main memory.
Perhaps 1,000,000 machine instructions in the
time to do one random disk I/O.
Random block accesses is the norm if there are
several processes accessing disks, and the disk
controller does not schedule accesses carefully.
Reasonable model of computation that requires
secondary storage count only the disk I/O's.

19
TwoPhase, Multiway Merge Sort

Merge Sort still not very good in disk I/O model.
log2n passes, so each record is read/written from
disk log2n times.
The secondary memory algorithms operate in a
small number of passes
in one pass every record is read into main memory
once and written out to disk once.
2PMMS 2 reads 2 writes per block.
Phase 1
1. Fill main memory with records.
2. Sort using favorite mainmemory sort.
3. Write sorted sublist to disk.
4. Repeat until all records have been put into
one of the sorted lists.

20
Phase 2

Use one buffer for each of the sorted sublists
and one buffer for an output block.

Initially load input buffers with the first
blocks of their respective sorted lists.
Repeatedly run a competition among the first
unchosen records of each of the buffered blocks.
Move the record with the least key to the output
block it is now chosen.
Manage the buffers as needed
If an input block is exhausted, get the next
block from the same file.
If the output block is full, write it to disk.

21
Reasons to limit the block size

First, we cannot use blocks that cover several
tracks effectively.
Second, small relations would occupy only a
fraction of a block, so large blocks would waste
space on the disk.
The larger the blocks are, the fewer records we
can sort by 2PMMS (see next slide).
Nevertheless, as machines get faster and disks
more capacious, there is a tendency for block
sizes to grow.

22
How many records can we sort?

The block size is B bytes.
The main memory available for buffering blocks is
M bytes.
Records take R bytes.
Number of main memory buffers M/B blocks
We need one output buffer, so we can actually use
(M/B)-1 input buffers.
How many sorted sublists makes sense to produce?
(M/B)-1.
Whats the total number of records we can sort?
Each time we fill in the memory we sort M/R
records.
Hence, we are able to sort (M/R)(M/B)-1 or
approximately M2/RB.
If we use the parameters in the example about
TPMMS we have
M100MB 100,000,000 Bytes 108 Bytes
B 16,384 Bytes
R 160 Bytes
So, M2/RB (108)2 / (160 16,384) 4.2 billion
records, or 2/3 of a TeraByte.

23
Sorting larger relations

If our relation is bigger, then, we can use 2PMMS
to create sorted sublists of M2/RB records.
Then, in a third pass we can merge (M/B)-1 of
these sorted sublists.
The third phase lets us sort
(M/B)-1M2/RB ? M3/RB2 records
For our example, the third phase lets us sort 75
trillion records occupying 7500 Petabytes!!

24
Primary Indexes

Dense Indexes
Pointer to every record of a sequential file,
(ordered by search key).
Can make sense because records may be much bigger
than keypointer pairs.
Fit index in memory, even if data file does not?
Faster search through index than data file?
Test existence of record without going to data
file.
Sparse Indexes
Keypointer pairs for only a subset of records,
typically first in each block.
Saves index space.

25
Dense Index
26
Secondary Indexes

A primary index is an index on a sorted file.
Such an index controls the placement of records
to be primary,
Secondary index index that does not control
placement, surely not on a file sorted by its
search key.
Sparse, secondary index makes no sense.
Usually, search key is not a key.

27
Indirect Buckets

To avoid repeating keys in index, use a level of
indirection, called buckets.
Additional advantage allows intersection of sets
of records without looking at records themselves.
Example
Movies(title, year, length, studioName)
secondary indexes on studioName and year.
SELECT title
FROM Movies
WHERE
studioName 'Disney' AND
year 1995

28
(No Transcript)
29
BTrees

Generalizes multilevel index.
Number of levels varies with size of data file,
but is often 3.
B tree form we'll discuss.
All nodes have same format n keys, n 1
pointers.
Useful for primary, secondary indexes, primary
keys, nonkeys.
Leaf has at least
key-pointer pairs
Interior nodes use at least
pointers.

30
A typical leaf and interior node (unclusttered
index)
31
Lookup
13

Try to find a record with search key 40.
7
23 31 43

2 3 5
7 11
13 17 19
23 29
31 37 41
43 47

Recursive procedure
If we are at a leaf, look among the keys there.
If the i-th key is K, the the i-th pointer will
take us to the desired record.
If we are at an internal node with keys
K1,K2,,Kn, then if KltK1we follow the first
pointer, if K1?KltK2 we follow the second pointer,
and so on.

32
Insertion
Try to insert a search key 40. First, lookup
for it, in order to find where to insert.
13

7
23 31 43

2 3 5
7 11
13 17 19
23 29
31 37 41
43 47

It has to go here, but the node is full!
33
Beginning of the insertion of key 40
13

7
23 31 43

2 3 5
7 11
13 17 19
23 29
43 47

31 37
40 41
Whats the problem? No parent yet for the new
node!

Observe the new node and the redistribution of
keys and pointers
34
Continuing of the Insertion of key 40
We must now insert a pointer to the new leaf into
this node. We must also associate with this
pointer the key 40, which is the least key
reachable through the new leaf. But the node is
full. Thus it too must split!
13

7
23 31 43

2 3 5
7 11
13 17 19
23 29
43 47

31 37
40 41

35
Completing of the Insertion of key 40
13

This is a new node.
7
23 31
43

2 3 5
7 11
13 17 19
23 29
43 47

We have to redistribute 3 keys and 4 pointers.
We leave three pointers in the existing node and
give two pointers to the new node. 43 goes in the
new node.
But where the key 40 goes?
40 is the least key reachable via the new node.

31 37
40 41

36
Completing of the Insertion of key 40
It goes here! 40 is the least key reachable via
the new node.
13 40

7
23 31
43

2 3 5
7 11
13 17 19
23 29
43 47

31 37
40 41

37
Structure of B-trees

Degree n means that all nodes have space for n
search keys and n1 pointers
Node block
Let
block size be 4096 Bytes,
key 4 Bytes,
pointer 8 Bytes.
Lets solve for n
4n 8(n1) ? 4096
? n ? 340
n degree order fanout

38
Example

n 340, however a typical node has 255 keys
At level 3 we have
2552 nodes, which means
2553 ? 16 ? 220 records can be indexed.
Suppose record 1024 Bytes ? we can index a file
of size
16 ? 220 ? 210 ? 16 GB
If the root is kept in main memory accessing a
record requires 3 disk I/O

39
Transactions
40
Transactions Correctness Principle

A transaction is atomic -- all or none property.
If it executes partly, an invalid state is likely
to result.
A transaction, may change the DB from a
consistent state to another consistent state.
Otherwise it is rejected (aborted).
Concurrent execution of transactions may lead to
inconsistency each transaction must appear to
be executed in isolation
The effect of a committed transaction is durable
i.e. the effect on DB of a transaction must never
be lost, once the transaction has completed.
ACID Properties of a transaction
Atomicity, Consistency, Isolation, and
Durability

41
Concurrent Transactions

Even when there is no failure, several
transactions can interact to turn a
consistent state
into an
inconsistent state.

42
Transactions and Schedules

A transaction (model) is a sequence of r and w
actions on database elements.
A schedule is a sequence of reads/writes actions
performed by a collection of transactions.
Serial Schedule All actions for each
transaction are consecutive.
r1(A) w1(A) r1(B) w1(B) r2(A) w2(A)
r2(B) w2(B)
Serializable Schedule A schedule whose effect
is equivalent to that of some serial schedule.
We will introduce a sufficient condition for
serializability.

43
Conflicts

Suppose for DB elements X and Y,
ri(X) rj(Y) is part of a schedule, and we
flip the order of these operations.
ri(X) rj(Y) rj(Y) ri(X)
This holds always (even when XY)
We can flip ri(X) wj(Y), as long as X?Y
That is, ri(X) wj (X) ? wj(X) ri (X)
In the RHS, Ti reads the value of X written
by Tj, whereas it is not so in the LHS.

44
Conflicts (Contd)

We can flip wi(X) wj(Y) provided X?Y
However, wi(X) wj(X) ? wj(X) wi(X)
The final value of X may be different depending
on which write occurs last.
There is a conflict if 2 conditions hold.
A read and a write of the same X, or
Two writes of X conflict in general and may not
be swapped in order.
All other events (reads/writes) may be swapped
without changing the effect of the schedule (on
the DB).

45
Schedulers

A scheduler takes requests from transactions for
reads and writes, and decides if it is OK to
allow them to operate on DB or defer them until
it is safe to do so.
Ideal a scheduler forwards a request iff it
cannot lead to inconsistency of DB
Too hard to decide this in real time.
Real it forwards a request if it cannot result
in a violation of conflictserializability.
We thus need to develop schedulers which ensure
conflict-serializability.

46
Lock Actions

Before reading or writing an element X, a
transaction Ti requests a lock on X from the
scheduler.
The scheduler can either grant the lock to Ti or
make Ti wait for the lock.
If granted, Ti should eventually unlock (release)
the lock on X.
Shorthands
li(X) transaction Ti requests a lock on X
ui(X) Ti unlocks/releases the lock on X

47
Two Phase Locking
There is a simple condition, which guarantees
confict-serializability In every transaction,
all lock requests (phase 1) precede all unlock
requests (phase 2).
48
Undo Logging (Contd)

Two rules of Undo Logging
U1 Log records for a DB element X must be on
disk before any database modification to X
appears on disk.
U2 If a transaction T commits, then the log
record ltCOMMIT Tgt must be written to disk only
after all database elements changed by T are
written to disk.
In order to force log records to disk, the log
manager needs a FLUSH LOG command that tells the
buffer manager to copy to disk any log blocks
that havent previously been copied to disk or
that have been changed since they were last
copied.

49
Example
Action t Buff A Buff B A in HD B in
HD Log Read(A,t) 8 8 8 8 ltStart
Tgt tt2 16 8 8 8 Write(A,t) 16 16 8 8 ltT,A
,8gt Read(B,t) 8 16 8 8 8 tt2 16 16 8 8 8 Writ
e(B,t) 16 16 16 8 8 ltT,B,8gt Flush
Log Output(A) 16 16 16 16 8 Output(B) 16 16 16 16
16 ltCommit Tgt Flush Log
50
Recovery With Undo Logging

1. Examine the log to identify all transactions T
such that ltSTART Tgt appears in the log, but
neither ltCOMMIT Tgt nor ltABORT Tgt does.
Call such transactions incomplete.
2. Examine each log entry ltT, X, vgt
a) If T isnt an incomplete transaction, do
nothing.
b) If T is incomplete, restore the old value of
X
In what order?
From most recent to earliest.
3. For each incomplete transaction T add ltABORT
Tgt to the log, and flush the log.
What about the transactions that had already
ltABORT Tgt in the log?
We do nothing about them. If T aborted, then the
effect on the DB should have been restored anyway.

51
Checkpointing

Problem in principle, recovery requires looking
at the entire log.
Simple solution occasional checkpoint operation
during which we
Stop accepting new transactions.
Wait until all current transactions commit or
abort and have written a Commit or Abort log
record
Flush the log to disk
Enter a ltCKPTgt record in the log and flush the
log again
Resume accepting transactions
If recovery is necessary, we know that all
transactions prior to a ltCKPTgt record have
committed or aborted and ? need not be undone

52
Example of an Undo log

ltSTART T1gt
ltT1,A,5gt
ltSTART T2gt
ltT2,B,10gt ? decide to do a checkpoint
ltT2,C,15gt
ltT1,D,20gt
ltCOMMIT T1gt
ltCOMMIT T2gt
ltCKPTgt ? we may now write the CKPT
record
ltSTART T3gt
ltT3,E,25gt
ltT3,F,30gt ? If a crash occurs at this
point?

53
Nonquiescent Checkpoint (NQ CKPT)

Problem we may not want to stop transactions
from entering the system.
Solution
1. Write a record ltSTART CKPT(T1,...,Tk)gt
to log and flush to disk, where Tis are
all current active transactions.
2. Wait until all Tis commit or abort,
but do not prohibit new transactions.
3. When all T1Tk are done, write the
record ltEND CKPTgt to log and flush.

54
Recovery with NQ CKPT

First case
If the crash follows ltEND CKPTgt,
Then we can restrict recovery to transactions
that started after the ltSTART CKPTgt.
Second case
If the crash occurs between ltSTART CKPTgt and
ltEND CKPTgt, we need to undo
1. All transactions T on the list associated
with ltSTART CKPTgt with no ltCOMMIT Tgt.
2. All transactions T with ltSTART Tgt after the
ltSTART CKPTgt but with no ltCOMMIT Tgt.
i.e. 12 ? undo any incomplete transaction that
is on the CKPT list or started after ltSTART
CKPTgt.

55
Example of NQ Undo Log

ltSTART T1gt
ltT1,A,5gt
ltSTART T2gt
ltT2,B,10gt
ltSTART CKPT (T1,T2)gt
ltT2,C,15gt
ltSTART T3gt
ltT1,D,20gt
ltCOMMIT T1gt
ltT3,E,25gt
ltCOMMIT T2gt
ltEND CKPTgt
ltT3,F,30gt ? A crash occurs at this point

What if we have a crash right after ltT3,E,25gt?
56
Redo Logging

Idea Commit (log record appears on disk) before
writing data to disk.
Redolog entries contain the new values
ltT,X,NewXgt transaction T modified X and the
new value is NewX
Redo logging rule
R1. Before modifying DB element X on disk, all
log entries (including ltCOMMIT Tgt) must be
written to log (in disk).

57
Example
Action t Buff A Buff B A in HD B in
HD Log Read(A,t) 8 8 8 8 ltStart
Tgt tt2 16 8 8 8 Write(A,t) 16 16 8 8 ltT,A
,16gt Read(B,t) 8 16 8 8 8 tt2 16 16 8 8 8 Wri
te(B,t) 16 16 16 8 8 ltT,B,16gt ltCommit
Tgt Flush Log Output(A) 16 16 16 16 8 Output(B) 16
16 16 16 16
58
Recovery for Redo Logging

Identify committed transactions.
Examine the log forward, from earliest to latest.
Consider only the committed transactions, T.
For each ltT, X, vgt in the log do
WRITE(X,v) OUTPUT(X)
Note 1 Uncommitted transactions will have no
effect on the DB (unlike in undo logging)
This because none of the changes of an
uncommitted T have reached the disk
Note 2 Redoing starts from the head of the
log
In effect, each data item X will have the value
written by the last transaction in the log that
changed X.

59
Checkpointing for Redo Logging

The key action that we must take between the
start and end of checkpoint is to write to disk
all the dirty buffers.
Dirty buffers are those that have been changed by
committed transactions but not written to disk.
Unlike in the undo case, we dont need to wait
for active transactions to finish (in order to
write ltEND CKPTgt).
However, we wait for copying dirty buffers of the
committed transactions.

60
Checkpointing for Redo (Contd)

1. Write a ltSTART CKPT(T1,...,Tk )gt record to the
log, where Tis are all active transactions.
2. Write to disk all the dirty buffers of
transactions that had already committed when the
START CKPT was written to log.
3. Write an ltEND CKPTgt record to log.

61
Checkpointing for Redo (Contd)
ltSTART T1gt ltT1,A,5gt ltSTART T2gt ltCOMMIT
T1gt ltT2,B,10gt ltSTART CKPT(T2)gt ltT2,C,15gt ltSTART
T3gt ltT3,D,20gt ltEND CKPTgt ltCOMMIT T2gt ltCOMMIT T3gt

The buffer containing value A might be dirty. If
so, copy it to disk. Then write ltEND CKPTgt.
?
During this period three other actions took
place.
?

62
Recovery with Ckpt. Redo

Two cases
If the crash follows ltEND CKPTgt,
we can restrict ourselves to transactions that
began after the ltSTART CKPTgt and those in the
START list.
This is because we know that, in this case, every
value written by committed transactions, before
?START CKPT()?, is now in disk.
2. If the crash occurs between ltSTART CKPTgt and
ltEND CKPTgt,
then go and find the previous ltEND CKPTgt and do
the same as in the first case.
This is because we are not sure that committed
transactions before ?START CKPT() ? have their
changes in disk.