Selection: Medians and Order Statistics (Chap. 9)

Provided by: Owne1286
Learn more at: http://cs.iupui.edu

1
Selection --Medians and Order Statistics (Chap. 9)
  • The ith order statistic of n elements
    S = {a1, a2, ..., an}: the ith smallest element
  • Also called selection problem
  • Minimum and maximum
  • Median, lower median, upper median
  • Selection in expected/average linear time
  • Selection in worst-case linear time

2
O(n lg n) Algorithm
  • Suppose the n elements are sorted by an O(n lg n)
    algorithm, e.g., MERGE-SORT
  • Minimum: the first element
  • Maximum: the last element
  • The ith order statistic: the ith element
  • Median:
  • If n is odd, the ((n+1)/2)th element.
  • If n is even,
  • the ⌊(n+1)/2⌋th element is the lower median
  • the ⌈(n+1)/2⌉th element is the upper median
  • After sorting, each selection takes O(1), so the
    total is O(n lg n).
  • Can we do better?
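
The sort-then-index approach above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and Python's built-in `sorted` stands in for MERGE-SORT (both are O(n lg n)). Positions are 1-indexed, as on the slide.

```python
def select_by_sorting(values, i):
    """Return the ith smallest element (1-indexed) by sorting first."""
    if not 1 <= i <= len(values):
        raise ValueError("i must be between 1 and len(values)")
    return sorted(values)[i - 1]  # sorting is O(n lg n); the lookup is O(1)


def medians_by_sorting(values):
    """Return (lower median, upper median) of a non-empty list.

    The two coincide when n is odd.
    """
    n = len(values)
    ordered = sorted(values)
    lower = ordered[(n + 1) // 2 - 1]      # floor((n+1)/2)th element
    upper = ordered[-(-(n + 1) // 2) - 1]  # ceil((n+1)/2)th element
    return lower, upper
```

For example, `medians_by_sorting([4, 1, 3, 2])` gives the pair `(2, 3)`, while a list of odd length gives the same value twice.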

3
Selection in Expected Linear Time O(n)
  • Select the ith smallest element
  • A divide-and-conquer algorithm: RANDOMIZED-SELECT
  • Similar to quicksort, it partitions the input
    array recursively
  • Unlike quicksort, which works on both sides of
    the partition, it works on only one side of the
    partition.
  • This is called prune-and-search (prune one side,
    search only the other side).
  • (Please review or read quicksort in chapter 7.)

4
RANDOMIZED-SELECT(A, p, r, i)
  1. if p = r then return A[p]
  2. q ← RANDOMIZED-PARTITION(A, p, r)
  3. // after partitioning: A[p..q-1] ≤ A[q] ≤ A[q+1..r]
  4. k ← q - p + 1
  5. if i = k then return A[q]
  6. else if i < k
  7. then return RANDOMIZED-SELECT(A, p, q-1, i)
  8. else return RANDOMIZED-SELECT(A, q+1, r, i-k)
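
A possible Python rendering of the pseudocode above, as a sketch rather than the textbook's own code. It assumes 0-based inclusive indices p and r, a 1-indexed rank i, and a Lomuto-style partition for RANDOMIZED-PARTITION (the slide does not say which partition scheme is used).

```python
import random


def randomized_partition(a, p, r):
    """Partition a[p..r] around a random pivot; return the pivot's final index."""
    s = random.randint(p, r)           # random pivot position, bounds inclusive
    a[s], a[r] = a[r], a[s]            # move the pivot to the end
    pivot = a[r]
    q = p
    for j in range(p, r):
        if a[j] <= pivot:
            a[q], a[j] = a[j], a[q]
            q += 1
    a[q], a[r] = a[r], a[q]            # now a[p..q-1] <= a[q] <= a[q+1..r]
    return q


def randomized_select(a, p, r, i):
    """Return the ith smallest element (1-indexed) of a[p..r]; reorders a."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1                      # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    if i < k:
        return randomized_select(a, p, q - 1, i)    # search the low side only
    return randomized_select(a, q + 1, r, i - k)    # search the high side only
```

A call such as `randomized_select(data, 0, len(data) - 1, i)` reorders `data` in place, so pass a copy if the original order matters.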

5
Analysis of RANDOMIZED-SELECT
  • Worst-case running time: Θ(n^2). Why?

It may be unlucky and always partition into A[q],
an empty side, and a side with the remaining
elements. Every partitioning of m elements
takes Θ(m) time, for m = n, n-1, ..., 2. Thus the
total is Θ(n) + Θ(n-1) + ... + Θ(2) = Θ(n(n+1)/2 - 1)
= Θ(n^2). Moreover, no particular input elicits the
worst-case behavior, because of the randomness.
On average, however, it performs well.
Using probabilistic analysis with random variables,
it can be proven that the expected running time
is O(n) (see page 187).
Can we do better and achieve O(n) in the worst case?

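
To make the worst-case sum concrete, the short sketch below adds up the subproblem sizes n, n-1, ..., 2 that arise when every partition leaves one side empty. The function name is hypothetical, and the cost model (one unit per element partitioned) is a simplifying assumption.

```python
def unlucky_partition_cost(n):
    """Total elements partitioned when every split leaves one side empty."""
    total = 0
    m = n
    while m >= 2:       # subproblem sizes n, n-1, ..., 2
        total += m      # partitioning m elements costs Theta(m)
        m -= 1
    return total
```

The result matches the closed form n(n+1)/2 - 1 from the analysis, which grows quadratically in n.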
6
Selection in Worst-Case Linear Time O(n)
  • Select the ith smallest element of
    S = {a1, a2, ..., an}
  • Use the so-called prune-and-search technique
  • Let x ∈ S, and partition S into three subsets:
  • S1 = {aj | aj < x}, S2 = {aj | aj = x},
    S3 = {aj | aj > x}
  • If |S1| ≥ i, search for the ith smallest element
    in S1 recursively (prune S2 and S3 away)
  • Else if |S1| + |S2| ≥ i, then return x (the ith
    smallest element)
  • Else search for the (i - (|S1| + |S2|))th
    smallest element in S3 recursively (prune S1 and
    S2 away)
  • The question is how to select x such that S1 and
    S3 are nearly equal in size.
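
The three-way prune-and-search scheme can be sketched as below. The pivot rule is deliberately left as a parameter: `choose_pivot` is a hypothetical hook, and its default (take the first element) is only a placeholder with no worst-case guarantee.

```python
def prune_and_search(s, i, choose_pivot=lambda s: s[0]):
    """Return the ith smallest element (1-indexed) of the non-empty list s."""
    if not 1 <= i <= len(s):
        raise ValueError("i must be between 1 and len(s)")
    x = choose_pivot(s)                  # x must be an element of s
    s1 = [a for a in s if a < x]
    s2 = [a for a in s if a == x]
    s3 = [a for a in s if a > x]
    if len(s1) >= i:
        return prune_and_search(s1, i, choose_pivot)   # prune S2 and S3
    if len(s1) + len(s2) >= i:
        return x                                       # x is the answer
    # prune S1 and S2, and shift the rank accordingly
    return prune_and_search(s3, i - len(s1) - len(s2), choose_pivot)
```

Because x belongs to S2, both S1 and S3 are strictly smaller than S, so the recursion always terminates. How fast it terminates depends entirely on the pivot rule.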

7
The Way to Select x
  • Divide the elements into ⌈n/5⌉ groups of 5
    elements each.
  • Find the median of each group.
  • Find the median x of the medians.
  • At least (3n/10) - 6 elements are < x.
  • At least (3n/10) - 6 elements are > x, because
    each of at least ⌈(1/2)⌈n/5⌉⌉ - 2 groups
    contributes 3 elements which are ≥ x.

8
SELECT (ith Element in n Elements)
  1. Divide the n elements into ⌈n/5⌉ groups of 5
    elements.
  2. Find the median of each group.
  3. Use SELECT recursively to find the median x of
    the above ⌈n/5⌉ medians.
  4. Partition the n elements around x into S1, S2,
    and S3.
  5. If |S1| ≥ i, search for the ith smallest element
    in S1 recursively;
    else if |S1| + |S2| ≥ i, return x (the ith
    smallest element);
    else search for the (i - (|S1| + |S2|))th
    smallest element in S3 recursively.
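
Putting the steps together, a minimal Python sketch of SELECT might look as follows. Several details are assumptions not fixed by the slides: lists of at most 5 elements are handled by sorting directly, the last group may have fewer than 5 elements, and the lower median is taken in each group.

```python
def select(s, i):
    """Return the ith smallest element (1-indexed) of the non-empty list s."""
    if not 1 <= i <= len(s):
        raise ValueError("i must be between 1 and len(s)")
    if len(s) <= 5:                       # small base case: sort directly
        return sorted(s)[i - 1]
    # Divide into ceil(n/5) groups of 5 (the last one may be smaller).
    groups = [s[j:j + 5] for j in range(0, len(s), 5)]
    # Find the (lower) median of each group.
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Recursively find the median x of the medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Partition around x.
    s1 = [a for a in s if a < x]
    s2 = [a for a in s if a == x]
    s3 = [a for a in s if a > x]
    # Recurse on one side only.
    if len(s1) >= i:
        return select(s1, i)
    if len(s1) + len(s2) >= i:
        return x
    return select(s3, i - len(s1) - len(s2))
```

This version builds new lists instead of partitioning in place, which keeps the code short at the cost of extra memory.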

9
Analysis of SELECT (cont.)
  • Steps 1, 2, and 4 take O(n).
  • Step 3 takes T(⌈n/5⌉).
  • Consider step 5:
  • At least half of the medians in step 2 are ≥ x,
    thus at least ⌈(1/2)⌈n/5⌉⌉ - 2 groups contribute
    3 elements each which are ≥ x, i.e.,
    3(⌈(1/2)⌈n/5⌉⌉ - 2) ≥ (3n/10) - 6.
  • Similarly, the number of elements ≤ x is also at
    least (3n/10) - 6.
  • Thus |S1| is at most (7n/10) + 6, and similarly
    for |S3|.
  • Thus SELECT in step 5 is called recursively on at
    most (7n/10) + 6 elements.
  • The recurrence is:
  • T(n) = O(1)
    if n < some value (e.g., 140)
  • T(n) ≤ T(⌈n/5⌉) + T(7n/10 + 6) + O(n)
    if n ≥ that value (e.g., 140)
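
The counting bound used in this analysis can be checked numerically. The sketch below uses integer arithmetic only, and the function name is hypothetical. It computes 3(⌈(1/2)⌈n/5⌉⌉ - 2) so that it can be compared with (3n/10) - 6.

```python
def guaranteed_count(n):
    """Lower bound 3 * (ceil(ceil(n/5) / 2) - 2) on the elements on one side of x."""
    groups = -(-n // 5)              # ceil(n / 5) without floating point
    half = -(-groups // 2)           # ceil(groups / 2)
    return 3 * (half - 2)


def bound_holds(n):
    """Check guaranteed_count(n) >= 3n/10 - 6, scaled by 10 to stay in integers."""
    return 10 * guaranteed_count(n) >= 3 * n - 60
```

Since at least `guaranteed_count(n)` elements fall on each side, the recursive call in step 5 sees at most `n - guaranteed_count(n)` elements, which is at most (7n/10) + 6.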

10
Solve recurrence by substitution
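  • Suppose T(n) ≤ cn for some constant c, and let
    an bound the O(n) term.
  • T(n) ≤ c⌈n/5⌉ + c(7n/10 + 6) + an
  • ≤ cn/5 + c + 7cn/10 + 6c + an
  • = 9cn/10 + an + 7c
  • = cn + (-cn/10 + an + 7c)
  • which is at most cn if -cn/10 + an + 7c ≤ 0,
  • i.e., c ≥ 10a(n/(n - 70)) when n > 70.
  • So choose n ≥ 140; then n/(n - 70) ≤ 2, and
    c ≥ 20a suffices.
  • Note that the threshold need not be 140; any
    integer > 70 works (with a different c).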
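
As a sanity check on the substitution argument, the inductive step can be tested with exact rational arithmetic. The function name and the default values a = 1 and c = 20a are illustrative assumptions.

```python
from fractions import Fraction


def inductive_step_holds(n, a=1, c=20):
    """Check c*ceil(n/5) + c*(7n/10 + 6) + a*n <= c*n exactly."""
    ceil_fifth = -(-n // 5)                       # ceil(n / 5)
    lhs = c * ceil_fifth + c * (Fraction(7 * n, 10) + 6) + a * n
    return lhs <= c * n
```

With c = 20a the inequality holds for every n ≥ 140, as the derivation predicts, and it can fail for smaller n (for example n = 100).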

11
Summary
  • Bucket sort, counting sort, radix sort
  • Their running times,
  • Modifications
  • The ith order statistic of n elements
    S = {a1, a2, ..., an}: the ith smallest element
  • Minimum and maximum.
  • Median, lower median, upper median
  • Selection in expected/average linear time
  • Worst case running time
  • Prune-and-search
  • Selection in worst-case linear time
  • Why group size 5?