Title: Dynamic Array and Amortized Analysis
1. Dynamic Array and Amortized Analysis
Discussion Section, 09/04/2008
22C21 Computer Science II: Data Structures
2. Today's topics
- Dynamically resizing arrays and the cost of doing this: amortized analysis
- Generating random inputs
- Timing Java programs
- (A quick discussion of this from last class)
3. Static Array in RecordDB.java
    public void insert(Record rec) {
        // Check for overflow; if overflow, then don't insert.
        if (numRecords == maxRecords - 1)
            return;
        ...
    }
- maxRecords: the size of the Record array
- numRecords: the current number of employees in the list
4. Dynamic Array in DynamicRecordDB.java
    // Check for overflow.
    // Double the size if overflow occurs.
    if (numRecords == maxRecords - 1) {
        // Create a new array of double the size.
        Record[] tempRecord = new Record[2 * maxRecords];
        // Copy contents of recordList into tempRecord.
        for (i = 0; i < maxRecords; i++)
            tempRecord[i] = recordList[i];
        // Make recordList point to tempRecord.
        recordList = tempRecord;
        // Change maxRecords to the new value.
        maxRecords = 2 * maxRecords;
    }
- Slide callouts: initial array size (maxRecords); new array size (2 * maxRecords, changed dynamically)
5. Amortized Analysis
- Motivation behind amortized analysis
- See the run time comparison of the two programs DynamicRecordDB.java and DynamicRecordDB2.java
6. What are we calculating?
- The comparison is for 10 different values of n, and these values are large (10,000, 10,500, etc.).
- This is important because
  - (i) for small n the difference may not show up, and
  - (ii) unless you do it for various n, you will not see the trend.
    // Test n = 10000, 10500, ..., 20000
    int insertNum = 10000;
    . . .
    insertNum += 500;
7. What are we calculating?
- Even for a particular n, you don't want to time the function call just once; you want to do it a bunch of times and take the average.
    for (int rpt = 0; rpt < 10; rpt++) {
        // Run time for this repetition of the experiment
        runTime = stopTime - startTime;
        // Accumulated run time over all repetitions of this experiment
        totalRunTime += runTime;
    }
    System.out.println("Per operation run time " + totalRunTime * 1.0 / (10 * insertNum));
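The repeat-and-average idea above can be sketched as a small self-contained helper. This is only an illustration: the class name `RepeatTimer`, the method `perOperationMillis`, and the stand-in workload in `main` are assumptions and do not come from the course programs.

```java
// Hypothetical timing helper; not part of DynamicRecordDB.java.
class RepeatTimer {
    /**
     * Runs the workload `repetitions` times and returns the average
     * wall-clock time, in milliseconds, per operation.
     */
    static double perOperationMillis(Runnable workload, int repetitions, int opsPerRun) {
        long totalRunTime = 0;
        for (int rpt = 0; rpt < repetitions; rpt++) {
            long startTime = System.currentTimeMillis();
            workload.run();
            long stopTime = System.currentTimeMillis();
            // Accumulate the run time of this repetition.
            totalRunTime += stopTime - startTime;
        }
        return totalRunTime * 1.0 / ((long) repetitions * opsPerRun);
    }

    public static void main(String[] args) {
        final int insertNum = 10000;
        // Stand-in workload (an assumption): fill a plain int array.
        Runnable workload = () -> {
            int[] data = new int[insertNum];
            for (int i = 0; i < insertNum; i++) {
                data[i] = i;
            }
        };
        System.out.println("Per operation run time "
                + perOperationMillis(workload, 10, insertNum));
    }
}
```

In the discussion's experiment, the workload would be the loop that inserts insertNum records into a fresh database.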
8. What are we calculating?
- Here we are actually doing aggregate analysis.
- It determines an upper bound T(n) on the total cost of a sequence of n operations, then calculates the average cost per operation as T(n) / n.
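As a worked instance of T(n) / n, consider n >= 2 insertions under a simplified cost model that is an assumption here, not the slides' exact model: the array starts with one slot and grows only when full, each insertion costs 1 unit, and each element copied during a resize costs 1 unit.

```latex
\begin{align*}
\text{Doubling:}\quad
  T(n) &\le n + \left(1 + 2 + 4 + \dots + 2^{\lceil \log_2 n \rceil - 1}\right)
        < n + 2n = 3n,
  & \frac{T(n)}{n} &< 3 = O(1) \\
\text{One extra slot:}\quad
  T(n) &= n + \left(1 + 2 + \dots + (n-1)\right) = \frac{n(n+1)}{2},
  & \frac{T(n)}{n} &= \frac{n+1}{2} = O(n)
\end{align*}
```

Under this model the average cost per insertion stays constant with doubling but grows linearly with n when the array gains one slot at a time.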
9. Doubling the array vs. increasing it by one slot
- To compare the two programs, we output the following data from each:
    System.out.print("numRecords" + recDB.numRecords);
    System.out.println("maxRecords" + recDB.maxRecords);
    System.out.println("Per operation run time " + totalRunTime * 1.0 / (10 * insertNum));
- Each program does 10 runs for a fixed number of input records. The total run time is the sum of the individual run times of the 10 runs.
10. Overflow in DynamicRecordDB.java
    // Double the size if overflow occurs.
    if (numRecords == maxRecords - 1) {
        // Create a new array of double the size.
        Record[] tempRecord = new Record[2 * maxRecords];
        // Copy contents of recordList into tempRecord.
        for (i = 0; i < maxRecords; i++)
            tempRecord[i] = recordList[i];
        // Make recordList point to tempRecord.
        recordList = tempRecord;
        // Change maxRecords to the new value.
        maxRecords = 2 * maxRecords;
    }
11. Overflow in DynamicRecordDB2.java
    // Increment the size by one if overflow occurs.
    if (numRecords == maxRecords - 1) {
        // Create a new array with one more slot.
        Record[] tempRecord = new Record[maxRecords + 1];
        // Copy contents of recordList into tempRecord.
        for (i = 0; i < maxRecords; i++)
            tempRecord[i] = recordList[i];
        // Make recordList point to tempRecord.
        recordList = tempRecord;
        // Change maxRecords to the new value.
        maxRecords = maxRecords + 1;
    }
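To see the difference between the two overflow policies without timing noise, one can count element copies instead of milliseconds. The sketch below is a hypothetical helper, not part of either course program: `GrowthCost` and `countCopies` are invented names, and the initial capacity of 2 is an assumption. It reuses the overflow test `numRecords == maxRecords - 1` and charges `maxRecords` copies per resize, as the copy loops do.

```java
// Hypothetical helper; not part of DynamicRecordDB.java or DynamicRecordDB2.java.
class GrowthCost {
    /**
     * Counts the element copies made while inserting n items into an array
     * that starts with initialCapacity slots and grows whenever
     * numRecords == maxRecords - 1.
     */
    static long countCopies(int n, int initialCapacity, boolean doubling) {
        int maxRecords = initialCapacity;
        int numRecords = 0;
        long copies = 0;
        for (int k = 0; k < n; k++) {
            if (numRecords == maxRecords - 1) {
                // The copy loop runs over all maxRecords slots.
                copies += maxRecords;
                maxRecords = doubling ? 2 * maxRecords : maxRecords + 1;
            }
            numRecords++;
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 10000;
        System.out.println("doubling: " + countCopies(n, 2, true) + " copies");
        System.out.println("plus one: " + countCopies(n, 2, false) + " copies");
    }
}
```

With these assumptions, 10,000 insertions copy 16,382 elements under doubling and 50,004,999 elements when the array grows by one slot.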
12. Calculate runtime
    startTime = System.currentTimeMillis();

    // other statements

    stopTime = System.currentTimeMillis();
    // Run time for this repetition of the experiment
    runTime = stopTime - startTime;
13. Generating Random Inputs
    // Create an empty record database.
    recDB = new DynamicRecordDB();
    // Insert the records.
    for (int i = 0; i < insertNum; i++)
        recDB.insert(new Record(Integer.toString(i), Integer.toString(i), i));
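The loop above builds each record from the loop counter, so the inputs are sequential. If genuinely random records are wanted, one possible approach is `java.util.Random` with a fixed seed so that runs are repeatable. In this sketch the nested `Record` class is only a stand-in with the same (String, String, int) constructor shape; its field names, the `RandomRecords` class, and the key range are all assumptions.

```java
import java.util.Random;

// Hypothetical generator; the real Record class from the course is not shown in the slides.
class RandomRecords {
    /** Minimal stand-in for Record with a (String, String, int) constructor. */
    static class Record {
        final String first;
        final String second;
        final int value;

        Record(String first, String second, int value) {
            this.first = first;
            this.second = second;
            this.value = value;
        }
    }

    /** Builds insertNum records whose keys are uniform in [0, 1000000). */
    static Record[] generate(int insertNum, long seed) {
        Random rng = new Random(seed); // fixed seed gives a repeatable sequence
        Record[] records = new Record[insertNum];
        for (int i = 0; i < insertNum; i++) {
            int key = rng.nextInt(1_000_000);
            records[i] = new Record(Integer.toString(key), Integer.toString(key), key);
        }
        return records;
    }

    public static void main(String[] args) {
        Record[] records = generate(5, 42L);
        for (Record r : records) {
            System.out.println(r.first + " " + r.second + " " + r.value);
        }
    }
}
```

Generating the records before starting the timer keeps the cost of random number generation out of the measured insertion time.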
14. Doubling the array vs. increasing it by one slot
Output for the program with the array size increased by one slot:
    numRecords10000maxRecords10001
    Per operation run time 3.95151
    numRecords10500maxRecords10501
    Per operation run time 4.281171428571429
    numRecords11000maxRecords11001
    Per operation run time 5.1755545454545455
    numRecords11500maxRecords11501
    Per operation run time 9.181573913043477
- See, the array size changes with every run!
- Observe how the run time increases. Once overflow occurs, every insertion pays a high price to grow the array by one slot. In effect, every step then becomes the worst-case scenario.
15. Doubling the array vs. increasing it by one slot
Output for the program with the doubling array:
    numRecords10000maxRecords16384
    Per operation run time 1.4829
    numRecords10500maxRecords16384
    Per operation run time 1.1938190476190476
    numRecords11000maxRecords16384
    Per operation run time 1.8180636363636364
    numRecords11500maxRecords16384
    Per operation run time 1.2178869565217392
- See, once the array size has been doubled here, it won't be doubled again for a long time.
- A worst-case operation can alter the state in such a way that the worst case doesn't occur again for a long time.
16. Doubling the array vs. increasing it by one slot
17. Amortized Analysis
- Average running time per operation over a worst-case sequence of operations.
18. Basic idea
- Knowledge of which sequences of operations are possible.
- Data structures whose state persists between operations.
- A worst-case operation can alter the state in such a way that the worst case doesn't occur again for a long time, thus amortizing the cost!
- Amortizing: the gradual reduction
19. Basic Idea
[Figure: an initial array and the doubled array; each new slot is labeled 2.]
- The price of each new slot is 2 time units.
- The total cost of doubling the array is thus 2 x (initial array size).
- We will continue to use this array for a while without resizing it (amortizing the cost!).
20. Basic Idea
- We double the size of the array each time it fills up.
- Because of this, array reallocation may be required, and in the worst case an insertion may require O(n) time.
- However, a sequence of n insertions can always be done in O(n) time, so the amortized time per operation is O(n) / n = O(1).
21. Formal analysis
- Consider DynamicRecordDB with N slots in the array and n records.
- An INSERT operation doubles the size before adding another item if n = N.
- Any operation that doesn't involve doubling takes O(1) time: say, at most 1 second.
- Resizing takes 2n seconds.
22. Analysis (contd.)
- We start from an empty list and perform i INSERT operations. So n = i, and N is the smallest power of 2 ≥ i.
- Total seconds for all the resizing operations: 2 + 4 + 8 + ... + N/4 + N/2 + N = 2N - 2.
- In reference to the code: n = numRecords and N = maxRecords. We start with N = 2. Then N becomes 4 and finally 8.
23. Analysis (almost done!)
- Total seconds for i INSERTs: i + 2N - 2.
- Now, N ≤ 2n = 2i. So the i INSERTs take O(5i - 2), or O(i), time. This is the worst case!
- So, on average, each INSERT takes O(i) / i = O(1) time. This is the amortized running time of insertion.
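The bound can be checked numerically by simulating the cost model from the formal analysis: 1 second per insertion and 2n seconds per resize, with a resize whenever n = N. This is a sketch under one extra assumption, namely that the array starts with a single slot, so that the resize costs are exactly 2, 4, 8, ..., N. The class and method names are hypothetical.

```java
// Hypothetical simulation of the cost model; not part of the course code.
class InsertCostModel {
    /** Returns {total seconds, final N} after i INSERTs into an empty array. */
    static long[] simulate(int i) {
        long n = 0;       // records stored so far
        long N = 1;       // slots in the array (assumed to start at 1)
        long seconds = 0;
        for (int k = 0; k < i; k++) {
            if (n == N) {
                // The array is full: resizing takes 2n seconds.
                seconds += 2 * n;
                N *= 2;
            }
            seconds += 1; // the insertion itself
            n++;
        }
        return new long[] { seconds, N };
    }

    public static void main(String[] args) {
        for (int i : new int[] { 1, 8, 9, 1000 }) {
            long[] result = simulate(i);
            System.out.println("i=" + i
                    + " total=" + result[0]
                    + " i+2N-2=" + (i + 2 * result[1] - 2)
                    + " 5i-2=" + (5L * i - 2));
        }
    }
}
```

For example, 8 INSERTs cost 22 seconds in total (8 for the insertions plus 2 + 4 + 8 for the resizes), which equals i + 2N - 2 with N = 8 and stays below 5i - 2 = 38.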
24. Bottom line(s)
- Amortized analysis is a way of proving that even
if an operation is occasionally expensive, its
cost is made up for by other, cheaper occurrences
of the same operation.