Quicksort - PowerPoint PPT Presentation

Provided by: mike437

1
Quicksort

Lecture 15
Quicksort
Background
Worst Case Analysis
Best Case Analysis
Average Case Analysis
Empirical Comparison
Randomized Quicksort

Lectures 13, 14
Learning Activity
Pair off with someone you don't know
Take out a sheet of paper
Solve problems
Discover your questions
Discuss in pairs
Discuss in class
Leave knowing how to solve linear constant
coefficient difference equations with geometric
forcing terms

2
Quicksort

Mergesort is $\Theta(n \log n)$, but inconvenient for implementation with arrays since we need space to merge
Quicksort sorts in place, using partitioning
Example: Pivot about first element (3)
3 1 4 1 5 9 2 6 5 3 5 8 9 --- before
2 1 3 1 3 9 5 6 5 4 5 8 9 --- after
At most n swaps
Pivot element ends up in its final position
No element left or right of pivot flips sides
Sort each side independently
Recursive Divide and Conquer approach
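
The slide does not show the partitioning routine itself. As a minimal sketch, the two-pointer scheme below is one way to pivot about the first element that reproduces the before/after example above. The name `pivot` and the inclusive bounds `i`, `j` follow the notation of the next slide; everything else is an assumption made for illustration.

```python
def pivot(T, i, j):
    """Partition T[i..j] (inclusive bounds) about p = T[i].

    Returns l, the final index of the pivot: afterwards every element of
    T[i..l-1] is <= p and every element of T[l+1..j] is > p.
    """
    p = T[i]
    k, l = i + 1, j
    # Move k right to the first element larger than p (never past j).
    while k < j and T[k] <= p:
        k += 1
    # Move l left to the last element not larger than p (reaches i at the latest).
    while T[l] > p:
        l -= 1
    while k < l:
        T[k], T[l] = T[l], T[k]   # both elements were on the wrong side
        k += 1
        while T[k] <= p:
            k += 1
        l -= 1
        while T[l] > p:
            l -= 1
    T[i], T[l] = T[l], T[i]       # drop the pivot into its final position
    return l


T = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9]
l = pivot(T, 0, len(T) - 1)
print(T, l)
```

Each swap inside the loop moves two misplaced elements to their correct sides, which is why the number of swaps stays within the bound given on the slide.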

3
Quicksort
procedure quicksort(T[i..j])
  if j - i is small enough then insert(T[i..j])
  else
    pivot(T[i..j], l)
    quicksort(T[i..l-1])
    quicksort(T[l+1..j])
Dividing the problem into subproblems based on
where the pivot ends up
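
A runnable sketch of the procedure above might look as follows. The cutoff value `CUTOFF = 8` is an arbitrary assumption for "small enough", `insert` is taken to mean insertion sort, and a simple single-scan partition about the first element stands in for the pivot routine, which the slide does not spell out.

```python
CUTOFF = 8  # assumed threshold for "small enough"; tune by experiment


def insert(T, i, j):
    """Insertion sort on T[i..j], inclusive bounds."""
    for a in range(i + 1, j + 1):
        x = T[a]
        b = a - 1
        while b >= i and T[b] > x:
            T[b + 1] = T[b]
            b -= 1
        T[b + 1] = x


def pivot(T, i, j):
    """Partition T[i..j] about T[i]; return the pivot's final index l."""
    p = T[i]
    l = i
    for k in range(i + 1, j + 1):
        if T[k] < p:
            l += 1
            T[l], T[k] = T[k], T[l]
    T[i], T[l] = T[l], T[i]
    return l


def quicksort(T, i=0, j=None):
    """Sort T[i..j] in place."""
    if j is None:
        j = len(T) - 1
    if j - i < CUTOFF:
        insert(T, i, j)
    else:
        l = pivot(T, i, j)
        quicksort(T, i, l - 1)   # everything left of the pivot
        quicksort(T, l + 1, j)   # everything right of the pivot
```

The pivot itself is excluded from both recursive calls because it is already in its final position.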
4
Choosing a Good Pivot

This is the crux of an implementation
What would the worst case be?

5
Choosing a Good Pivot

This is the crux of an implementation
What would the best case be?

Splitting the input as evenly as possible results in subproblems of size $n/2$. The total number of levels then becomes $\log_2 n$, and the effort to partition each level is $\Theta(n)$, for a total complexity $\Theta(n \log n)$
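
The level-counting argument can also be written as a recurrence. As a sketch, write $B(n)$ for the best-case runtime and $c$ for a constant bounding the partitioning cost per element (both symbols are assumptions introduced here, not taken from the slide), and take $n$ to be a power of 2:

```latex
B(n) = 2\,B(n/2) + cn
     = 4\,B(n/4) + 2cn
     = \dots
     = n\,B(1) + cn \log_2 n
     \in \Theta(n \log n)
```

Each unrolling step adds one more $cn$ term, and there are $\log_2 n$ steps before the subproblems reach size 1.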
6
Choosing a Good Pivot

The instances that become worst or best cases
depend on the method used for choosing a pivot
The book's pivot routine chooses the first element in the list
Worst case is an already sorted list (!)
This can be very bad for many applications that expect the list to already be partially sorted
Could instead choose:
The middle element
Median of first, last, and middle
A random element
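
To make the effect of these choices concrete, the sketch below counts element-versus-pivot comparisons for each rule on an already sorted list. It is a counting model only (it builds sublists rather than sorting in place), and the function names are hypothetical. The comparisons spent inside `median_of_three` to pick the pivot are not counted.

```python
import random


def first(T, i, j):
    return i


def middle(T, i, j):
    return (i + j) // 2


def median_of_three(T, i, j):
    m = (i + j) // 2
    # Index holding the median of T[i], T[m], T[j].
    return sorted((i, m, j), key=lambda idx: T[idx])[1]


def random_element(T, i, j):
    return random.randint(i, j)   # randint includes both endpoints


def count_comparisons(T, choose):
    """Comparisons quicksort makes on list T when choose picks the pivot index."""
    if len(T) <= 1:
        return 0
    c = choose(T, 0, len(T) - 1)
    p = T[c]
    rest = T[:c] + T[c + 1:]
    left = [x for x in rest if x < p]
    right = [x for x in rest if x >= p]
    # Every remaining element is compared with the pivot once.
    return len(rest) + count_comparisons(left, choose) + count_comparisons(right, choose)


already_sorted = list(range(200))
for rule in (first, middle, median_of_three, random_element):
    print(rule.__name__, count_comparisons(already_sorted, rule))
```

With the first-element rule the sorted list of size $n$ costs $n(n-1)/2$ comparisons, while the other rules stay far below that on this particular input.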

7
Choosing a Good Pivot

Quicksort takes $\Theta(n^2)$ in the worst case, and $\Theta(n \log n)$ in the best case. How does it perform on average?

We want to consider the runtime as a
Random Variable and compute its expected
value given a particular distribution, or
probability measure, on possible inputs.

8
Recall Elementary Probability

A random variable is a function that assigns an
arbitrary number to each sample point.

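$\Pr[X = x]$ means the probability of the event $X = x$, e.g.
$\Pr[X = -1] = .1$
$\Pr[X = 5] = .2$
$\Pr[X = 0] = .1$

(Diagram: each sample point in the set S carries a probability and is mapped by the random variable X to a number.)
Random Variable, X: -1 0 4 -1.5 2 -3 5 5 9 3 2
Probability Measure: .1 .1 .2 .1 .05 .05 .1 .1 .05 .1 .05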
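
As a small illustration of how $\Pr[X = x]$ is obtained from the sample points, the sketch below adds up the probabilities of all points that X maps to the same value. Pairing the values and probabilities in the order they are listed is an assumption about the diagram, and the function name `pmf` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Values of X and probabilities of the sample points, paired in listing
# order (the pairing is an assumption about the original diagram).
values = ["-1", "0", "4", "-1.5", "2", "-3", "5", "5", "9", "3", "2"]
probs = ["0.1", "0.1", "0.2", "0.1", "0.05", "0.05", "0.1", "0.1", "0.05", "0.1", "0.05"]


def pmf(values, probs):
    """Pr[X = x]: total probability of the sample points that X maps to x."""
    p = defaultdict(Fraction)          # missing keys start at 0
    for x, w in zip(values, probs):
        p[Fraction(x)] += Fraction(w)  # exact arithmetic, no rounding noise
    return dict(p)


p = pmf(values, probs)
print(p[Fraction(5)], p[Fraction(-1)], p[Fraction(0)])
```

Two sample points map to 5, so their probabilities add, which is how the slide's $\Pr[X = 5] = .2$ arises.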
9
Recall Elementary Probability

The expectation of X (or expected value, or mean, or average) is given by

$E(X) = \sum_x x \cdot \Pr[X = x]$

Like the center of mass.
In this example $E(X) = 2.45$

The probability mass function of random variable X is just
$p(x) = \Pr[X = x]$
10
Quicksort Average Case Analysis

What is the Sample Space?

All possible inputs of size n (the set S in the diagram)
11
Quicksort Average Case Analysis

What is the Probability Measure?

Assume Equally Likely?
(Diagram: every input in the set S is assigned probability $1/n!$.)
12
Quicksort Average Case Analysis

What is the Random Variable?

Runtime on that input
(Diagram: every input in the set S has probability $1/n!$, and the random variable maps it to t(left of pivot) + t(right of pivot) + linear stuff.)
13
Quicksort Average Case Analysis

What is the Random Variable?

Runtime on that input
Pivot Location
(Diagram: every input in the set S has probability $1/n!$, and the random variable X maps it to t(left of pivot) + t(right of pivot) + linear stuff.)
14
Quicksort Average Case Analysis

What is the Random Variable?

Runtime on that input
Pivot Location
(Diagram: every input in the set S has probability $1/n!$; the pivot ends up 1st, 2nd, or 3rd, two inputs each, and X maps every input to t(left of pivot) + t(right of pivot) + linear stuff.)
15
Quicksort Average Case Analysis

What is the Random Variable?

Runtime on that input
Pivot Location
Where $t(m)$ is the average time to sort an array of size $m$
(Diagram, $n = 3$: each of the six inputs has probability $1/n!$; the two inputs whose pivot ends up 1st map to $t(0) + t(2) + g(n)$, the two whose pivot ends up 2nd map to $t(1) + t(1) + g(n)$, and the two whose pivot ends up 3rd map to $t(2) + t(0) + g(n)$.)
16
Quicksort Average Case Analysis

What is the Probability Mass Function for X?

The probability mass function of random variable X is just
$p(x) = \Pr[X = x]$

(Diagram, $n = 3$: each input in the set S has probability $1/n!$; X takes the values $t(0) + t(2) + g(n)$, $t(1) + t(1) + g(n)$, and $t(2) + t(0) + g(n)$, and the plot of $p(x)$ shows each at height $1/n$.)
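
The $1/n$ heights can be checked by brute force. The sketch below enumerates all $n!$ orderings of $n$ distinct keys, gives each probability $1/n!$, and records where a first-element pivot ends up. Distinct keys and the first-element rule are assumptions, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def pivot_rank(perm):
    """1-based final position of the first element after partitioning about it."""
    return 1 + sum(1 for x in perm[1:] if x < perm[0])


def pivot_position_pmf(n):
    """Pr[pivot ends up in position l] when all n! inputs are equally likely."""
    inputs = list(permutations(range(1, n + 1)))
    weight = Fraction(1, len(inputs))   # 1/n! for every input
    pmf = Counter()
    for perm in inputs:
        pmf[pivot_rank(perm)] += weight
    return dict(pmf)


print(pivot_position_pmf(3))
```

For $n = 3$ each of the three positions collects two of the six inputs, matching the "1st 1st 2nd 2nd 3rd 3rd" labelling in the diagram.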
17
Quicksort Average Case Analysis

What is the Expected, or Average, value of X?

(Diagram, $n = 3$: X takes the values $t(0) + t(2) + g(n)$, $t(1) + t(1) + g(n)$, and $t(2) + t(0) + g(n)$, each with probability $p(x) = 1/n$.)
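
The formula these slides build toward is not preserved in the transcript. As a sketch under the assumptions stated so far (all $n!$ inputs equally likely, distinct keys, so that the pivot lands in each of the $n$ positions with probability $1/n$, and each side of the pivot again treated as a uniformly random input), averaging the value of $X$ over the pivot position $l$ would give:

```latex
t(n) = E(X)
     = \sum_{l=1}^{n} \frac{1}{n} \bigl[\, t(l-1) + t(n-l) + g(n) \,\bigr]
     = g(n) + \frac{2}{n} \sum_{k=0}^{n-1} t(k)
```

For $n = 3$ the three terms of the first sum are exactly $t(0) + t(2) + g(n)$, $t(1) + t(1) + g(n)$, and $t(2) + t(0) + g(n)$. The second form follows because each $t(k)$ with $0 \le k \le n-1$ appears once as a left side and once as a right side.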
20
Quicksort Average Case Analysis
Is this difference equation Linear? Is it
Constant Coefficient? Do our solution methods
apply?
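
One way to get a feel for the solution before deciding which methods apply is to iterate a recurrence of this shape numerically. The sketch below assumes the form $t(n) = g(n) + \frac{2}{n} \sum_{k=0}^{n-1} t(k)$ with $t(0) = 0$, and takes $g(n) = n - 1$ (one comparison per non-pivot element) as a stand-in for the linear term. Both choices are assumptions, not taken from the slides.

```python
from fractions import Fraction


def average_cost(n_max, g=lambda n: n - 1):
    """Return [t(0), ..., t(n_max)] for t(n) = g(n) + (2/n) * sum of t(k), k < n."""
    t = [Fraction(0)]
    running_sum = Fraction(0)            # holds t(0) + ... + t(n-1)
    for n in range(1, n_max + 1):
        t.append(g(n) + Fraction(2, n) * running_sum)
        running_sum += t[n]
    return t


t = average_cost(1000)
print(float(t[10]), float(t[100]), float(t[1000]))
```

With $g(n) = n - 1$ the computed values agree with the closed form $2(n+1)H_n - 4n$, where $H_n$ is the $n$-th harmonic number, which grows like $2 n \ln n$ and is therefore in $\Theta(n \log n)$.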
22
Quicksort Average Case Analysis
The book claims the hidden constants are smaller for Quicksort than for Heapsort or Mergesort
23
Empirical Comparison
(Plot: milliseconds vs. size of instance; averages computed from 50 random samples)
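
The transcript does not say which implementations or machine produced the plots. As a rough sketch of the measurement itself, the hypothetical helper below averages wall-clock time over 50 random instances of a given size; Python's built-in `sorted` is used only as a placeholder for whichever sorting routine is being measured.

```python
import random
import time


def average_ms(sort_fn, size, samples=50, rng=random):
    """Mean wall-clock time, in milliseconds, of sort_fn over random instances."""
    total = 0.0
    for _ in range(samples):
        data = [rng.random() for _ in range(size)]   # fresh random instance
        start = time.perf_counter()
        sort_fn(data)
        total += time.perf_counter() - start
    return 1000.0 * total / samples


for size in (1000, 2000, 4000):
    print(size, average_ms(sorted, size))
```

Generating the instance outside the timed region keeps the cost of building the input out of the measurement.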
25
Empirical Comparison
(Plot: milliseconds, log scale, vs. size of instance; averages computed from 50 random samples)
27
Worst Case in 50
(Plot: milliseconds vs. size of instance; averages computed from 50 random samples)
Which has the better theoretical worst case? Practical worst case?
28
Worst Case
(Plot: milliseconds vs. size of instance; averages computed from 50 random samples)
29
Worst Case Asymptotic Bound
(Plot: milliseconds vs. size of instance; averages computed from 50 random samples)
30
Why Randomize Quicksort?

Avoiding pivoting around the first (or last) element yields a worst case instance that is not an already sorted list
No matter how a pivot is chosen, though, there is an instance that generates the worst-case runtime
An enemy could target the worst case instance!
The average runtime depends strongly on how we assume the probability measure is distributed over the algorithm's domain: if the worst case instance is made more likely, the expected runtime approaches the worst case
If you scramble every input (or randomly choose a pivot), there is no instance that always yields the worst case runtime; the expected runtime becomes independent of the (unknown) probability measure on the algorithm's domain
The worst case is still possible (now, from every input), but it is also very unlikely from any input
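
A minimal sketch of the random-pivot variant is shown below. For brevity it returns a sorted copy instead of sorting in place, and it groups keys equal to the pivot together. Both are simplifications chosen here, not something the slides prescribe.

```python
import random


def randomized_quicksort(items, rng=random):
    """Return a sorted copy of items, pivoting about a uniformly random element."""
    if len(items) <= 1:
        return list(items)
    p = items[rng.randrange(len(items))]      # every element equally likely
    left = [x for x in items if x < p]
    same = [x for x in items if x == p]
    right = [x for x in items if x > p]
    return randomized_quicksort(left, rng) + same + randomized_quicksort(right, rng)


print(randomized_quicksort(list(range(20))))  # an already sorted input is no longer special
```

Because the pivot no longer depends on where elements sit in the input, an adversary who knows the code but not the random choices cannot construct an input that is always slow.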