Estimating the longest increasing sequence in polylogarithmic time

1
Estimating the longest increasing sequence in
polylogarithmic time

C. Seshadhri (Sandia National Labs)
Joint work with Michael Saks
(Rutgers University)

Sandia National Laboratories is a multi-program
laboratory managed and operated by Sandia
Corporation, a wholly owned subsidiary of
Lockheed Martin Corporation, for the U.S.
Department of Energy's National Nuclear Security
Administration under contract DE-AC04-94AL85000
2
The problem
[Figure: example array 4, 24, 10, 9, 15, 17, 20, 18, 4, 19, 3, with the increasing subsequence 4, 10, 15, 17, 18, 19 highlighted]

Given array f: [n] → N, find (the length of) the
Longest Increasing Subsequence (LIS)
Rather self-explanatory
By now, a textbook dynamic programming problem
[CLRS 01] Chapter 15.4 (Longest Common
Subsequence), Starred Problem 15.4-6
[Schensted 61, Fredman 75]: O(n log n) algorithm
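
The O(n log n) bound cited above is commonly obtained by a left-to-right scan with binary search (often called patience sorting). A minimal sketch, assuming a strictly increasing subsequence is wanted; the name `lis_length` is illustrative and does not come from the talk:

```python
from bisect import bisect_left


def lis_length(f):
    """Length of the longest strictly increasing subsequence of f."""
    # tails[k] is the smallest value that can end an increasing
    # subsequence of length k + 1 seen so far.
    tails = []
    for x in f:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)   # x extends the longest subsequence so far
        else:
            tails[i] = x      # x is a smaller tail for length i + 1
    return len(tails)
```

On the example array of this slide, `lis_length([4, 24, 10, 9, 15, 17, 20, 18, 4, 19, 3])` returns 6. This baseline reads every entry, which is exactly what the rest of the talk tries to avoid.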

3
Too much to read
[Figure: the algorithm reads a few entries of the array (5, 7, 4, 8, 9, 2) and reports "LIS is in range [0.4n, 0.6n]"]

Array f is extremely large, so we can't read all of
it
What can we say about the LIS length, if we see very
little?
|LIS| = LIS length
Read only poly(log n) positions
Obviously randomized

4
Uniform sample says nothing
[Figure: array 2, 1, 4, 3, 6, 5, 8, 7, 10, 9; the two sampled entries, 4 and 9, are increasing]

Choose a uniform random sample of poly(log n) size
LIS = n/2, but the random sample is almost always increasing
So it is not really that easy to learn about the LIS
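
The point of this slide can be checked on the swapped-pairs array 2, 1, 4, 3, ... A small sketch, with illustrative names (`swapped_pairs`, `sample_looks_sorted`) that are not from the talk: the sample fails to look sorted only if it happens to contain both entries of one swapped pair, which is unlikely when the sample is tiny compared to n.

```python
from bisect import bisect_left


def lis_length(f):
    """Length of the longest strictly increasing subsequence of f."""
    tails = []
    for x in f:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


def swapped_pairs(n):
    """The array 2, 1, 4, 3, ..., n, n - 1 (n even); its LIS is n / 2."""
    return [i + 2 if i % 2 == 0 else i for i in range(n)]


def sample_looks_sorted(f, positions):
    """True if f restricted to the sampled positions is strictly increasing."""
    values = [f[p] for p in sorted(set(positions))]
    return all(a < b for a, b in zip(values, values[1:]))
```

For instance, with `f = swapped_pairs(10)`, the positions `[2, 9]` give the values 4 and 9, which look sorted, while `[0, 1]` gives 2 and 1, which do not. Drawing positions with `random.Random(seed).sample(range(len(f)), k)` for small k and large n will nearly always land in the first case.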

5
Our result
[Figure: on a scale from 1 to n, the algorithm outputs a range containing the LIS]

We want range to be small

6
Our result
[Figure: on a scale from 1 to n, the algorithm outputs a range of width δn containing the LIS]

We want range to be small
This work: for any (constant) δ > 0
Algorithm gives an additive δn approximation to the
LIS
Running time is (1/δ)^{1/δ} (log n)^c

7
Our result
[Figure: a range of width δn on the scale from 1 to n, with n/2 marked and the callout "Ad alert!"]

[Ailon, Chazelle, Liu, S 03], [Parnas, Ron, Rubinfeld 03]:
previous best δ = 1/2

8
Our result
We get a (1 + δ)-approx to distance to monotonicity
Previously, the best was factor 2
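
The link between the two quantities can be sketched as follows, assuming "monotone" means non-decreasing and that distance counts the entries that must be changed; under that assumption the distance is n minus the longest non-decreasing subsequence. The function names are illustrative. This exact computation only shows the relationship; it does not reproduce the sublinear (1 + δ) guarantee claimed on the slide.

```python
from bisect import bisect_right


def longest_nondecreasing(f):
    """Length of the longest non-decreasing subsequence of f."""
    tails = []
    for x in f:
        # bisect_right lets equal values extend a subsequence.
        i = bisect_right(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


def distance_to_monotonicity(f):
    """Minimum number of entries to change so that f is non-decreasing."""
    return len(f) - longest_nondecreasing(f)
```

An additive error of δn in the LIS estimate carries over directly to an additive error of δn in this distance.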

9
Prelims: the array in space
[Figure: the array values plotted as points in the plane]
10
Prelims: the array in space
[Figure: the same plot, with a violation and an increasing sequence marked]

Input is points in the plane, given as an array
(LIS is the longest chain in a partial order)
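
The geometric view can be sketched directly: entry i of the array becomes the point (i, f(i)), and one point precedes another when it lies strictly to the left and strictly below. The helper names and the convention that ties count as violations are assumptions made for this illustration.

```python
def precedes(p, q):
    """Partial order on points: p is strictly left of and strictly below q."""
    return p[0] < q[0] and p[1] < q[1]


def is_violation(p, q):
    """Two distinct points that cannot both lie on one increasing sequence."""
    return p != q and not precedes(p, q) and not precedes(q, p)


def longest_chain(f):
    """Longest chain in the partial order, via a simple quadratic DP."""
    points = list(enumerate(f))
    best = [1] * len(points)  # best[j]: longest chain ending at point j
    for j in range(len(points)):
        for i in range(j):
            if precedes(points[i], points[j]):
                best[j] = max(best[j], best[i] + 1)
    return max(best, default=0)
```

On the array 4, 24, 10, 9, ... from the problem slide, the points (1, 24) and (2, 10) form a violation, and `longest_chain` agrees with the LIS length.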

11
A hard example
[Figure: two inputs built from blocks of size k and 3k, with 10 points in each group; one has LIS = 4k, the other LIS = 2k]

The decision for a point depends on small-scale
properties of far-away portions

12
A hard example
[Figure: the same construction, with blocks of size k and 3k]

Random samples in neighborhoods of points are
identical!
Can we really estimate LIS in polylog time?
Is it time for some heavy work?
I mean, time for lbs (lower bounds).

13
Outline (or lack thereof)

Will I show proofs?
No
Will I show the algorithm?
Maybe
I will try to demonstrate the main insight
By a series of thought experiments

14
The dynamic program
[Figure: position n/2 marked; the closest LIS point to its left is labeled "Splitter"]

Closest LIS point to the left gives the splitter
Find the LIS in each blue region. Piece together!
So we break up the original problem into subproblems

15
The dynamic program
[Figure: a candidate splitter S at position n/2]

But we don't know the right splitter.
So try all possible! Only n different choices
Choose the one that gives the largest sum of
LISs
max_S (LIS-below-S + LIS-above-S)
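
The recurrence on this slide can be written out as a memoized recursion. This is a sketch under assumptions not stated in the talk: a box is an index range [lo, hi) together with a value range (vlo, vhi], the split is always at the middle index, and the candidate splitters S are the values present in the box. The name `lis_by_splitters` is illustrative.

```python
from functools import lru_cache


def lis_by_splitters(f):
    """LIS length via max over S of (LIS below S on the left + LIS above S on the right)."""
    f = tuple(f)

    @lru_cache(maxsize=None)
    def best(lo, hi, vlo, vhi):
        # LIS among positions lo..hi-1 using only values v with vlo < v <= vhi.
        if hi - lo == 0:
            return 0
        if hi - lo == 1:
            return 1 if vlo < f[lo] <= vhi else 0
        mid = (lo + hi) // 2
        # Candidate splitters: vlo itself (nothing taken on the left)
        # or any value that occurs inside the box.
        candidates = {vlo} | {f[i] for i in range(lo, hi) if vlo < f[i] <= vhi}
        return max(
            best(lo, mid, vlo, s) + best(mid, hi, s, vhi) for s in candidates
        )

    return best(0, len(f), float("-inf"), float("inf"))
```

For the correct S (the largest value the LIS uses in the left half) the two subproblems are independent, which is why the maximum over all S recovers the exact LIS length.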

16
The dynamic program
If you know the LIS in all small boxes, you can build the LIS
for bigger boxes
Not the most efficient DP
So our sublinear algo will mimic this process

17
The IP
Is this point on the LIS?
[Figure: the splitter at position n/2; the LIS is in the blue region]
Where is the splitter?
It is there.
18
The IP
This point is NOT on the LIS
[Figure: position n/2 marked; the LIS is in the blue region]
Where is the splitter?
It is there.
19
The IP
[Figure: the region between positions n/2 and 3n/4]
I wish we knew the splitter in that region
It is there.
20
The IP
[Figure: positions n/2, 5n/8 and 3n/4 marked]
I think I know what will happen next
You're lucky I'm here
23
The interactive protocol

If point stays in blue region till very end, then
it is good (on LIS). Otherwise, bad.
This takes (log n) steps, with the help of the
wizard

24
The interactive protocol

If we could simulate the wizard

25
The interactive protocol

What?? If you could simulate the wizard, you know
the LIS!
26
Find a splitter
If very few LIS points are outside blue, this is not
a bad splitter

Finding splitter may be hard, so try for
approximate versions?
But how do we determine the number of LIS points?

27
Find a splitter
Total no. of points outside blue < µn
Conservative splitter

If µ < 1/(100 log n), being conservative is good enough

28
Easy to check
Count the fraction of the sample outside blue
poly(log n) samples check this accurately
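
A sketch of this sampling check, under an assumed reading of the picture: the splitter is the point at index `split_index`, and a point is "outside blue" when it is left of the splitter but not below it, or right of it but not above it. All names here are illustrative rather than taken from the talk.

```python
import random


def outside_blue(f, split_index, i):
    """True if point i lies outside both blue regions of the splitter."""
    if i == split_index:
        return False
    if i < split_index:
        return f[i] >= f[split_index]   # left of the splitter but not below it
    return f[i] <= f[split_index]       # right of the splitter but not above it


def exact_outside_fraction(f, split_index):
    """Reads the whole array; only for comparison with the estimate."""
    return sum(outside_blue(f, split_index, i) for i in range(len(f))) / len(f)


def estimated_outside_fraction(f, split_index, num_samples, rng):
    """Estimate the same fraction from num_samples uniform probes."""
    hits = sum(
        outside_blue(f, split_index, rng.randrange(len(f)))
        for _ in range(num_samples)
    )
    return hits / num_samples
```

A candidate would be accepted as conservative when the estimate is comfortably below µ; how many samples that takes depends on µ and the desired failure probability, which this sketch leaves open.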

29
Getting a conservative splitter
We can sample (log n) different candidates and
check which of them is conservative
What if no conservative splitter exists?

30
A liberal paradise
Choose any line: the no. of points outside is at least µn

So we know that LIS < (1-µ)n
Leads to the next idea. Boosting approximations!
Given a δ-approx to LIS, can we improve it to δ'?

31
Boosting approximations
[Figure: the real splitter near position n/2 defines two boxes with n1 and n2 points; run the δn-approx on the points in each box; the no. of points outside is at least µn]

Take the sum of outputs as the total LIS estimate
LIS = LIS1 + LIS2, Est = Est1 + Est2
|Est1 - LIS1| < δ n1, |Est2 - LIS2| < δ n2
So |Est - LIS| < δ(n1 + n2)
n1 + n2 < (1-µ)n, so |Est - LIS| < δ(1-µ)n !

32
Putting it together
[Figure: candidate splitters at position n/2; is any of them conservative?]

Check if each is a conservative splitter
If it is, we've found the right subproblems
Otherwise...

33
Putting it together
[Figure: a candidate splitter S at position n/2; run the δn-approx on the points in each of its two boxes]

One of these is close enough to the real splitter
Est(S) = Left-Est(S) + Right-Est(S)

34
Putting it together
Final Estimate = max_S Est(S)
Looks like a great idea!
We go from δn to δ(1-µ)n. Recur to keep
improving the approximation
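
One level of this combination step can be sketched with the approximation routine treated as a black box. The assumptions here are loud ones: the split is at the middle index, a splitter S is a value threshold, every value in the array is tried as a candidate (the talk uses far fewer), and the exact routine `exact_lis` merely stands in for the δn-approx so the sketch is runnable.

```python
from bisect import bisect_left


def exact_lis(values):
    """Stand-in for the δn-approx: here simply the exact LIS length."""
    tails = []
    for x in values:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)


def estimate_for_splitter(f, s, approx):
    """Est(S) = Left-Est(S) + Right-Est(S) for the value threshold s."""
    mid = len(f) // 2
    left_box = [x for x in f[:mid] if x <= s]    # left half, below S
    right_box = [x for x in f[mid:] if x > s]    # right half, above S
    return approx(left_box) + approx(right_box)


def final_estimate(f, approx):
    """Final Estimate = max over candidate splitters S of Est(S)."""
    candidates = set(f) | {float("-inf")}
    return max(estimate_for_splitter(f, s, approx) for s in candidates)
```

With the exact stand-in, `final_estimate(f, exact_lis)` equals the true LIS length; with a noisy `approx`, the error of the sum is bounded by the errors on the two boxes, as in the arithmetic on the boosting slide.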

35
It fails, miserably
[Figure: recursion tree of calls to Alg with depth 1/µ; the levels are labeled δ0 = δ1(1-µ), δ1, δ2, ..., down to 1/2]

As we go up each level, the approx gets better by
(1-µ).
So to get δ0 = 1/4, how many levels are needed?
1/4 = 1/2 (1-µ)^t, so t ≈ 1/µ
We have running time at least 2^{1/µ}.
So, µ needs to be > 1/log log n.
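
The arithmetic on this slide can be checked numerically. A small sketch, assuming each level multiplies the approximation parameter by exactly (1-µ) and that every call to Alg makes two recursive calls, as the tree picture suggests; the function names are illustrative.

```python
import math


def levels_needed(mu, start=0.5, target=0.25):
    """Smallest t with start * (1 - mu)**t <= target."""
    return math.ceil(math.log(target / start) / math.log(1.0 - mu))


def calls_at_bottom(mu, start=0.5, target=0.25):
    """Number of leaf calls in a binary recursion tree of that depth."""
    return 2 ** levels_needed(mu, start, target)
```

For µ = 0.1 this gives 7 levels and 128 leaf calls. Since ln(1-µ) is about -µ for small µ, the depth is roughly (ln 2)/µ, which is the slide's t ≈ 1/µ up to a constant.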

36
Find a splitter
Total no. of points outside blue < µn
Conservative splitter

If µ < 1/(100 log n), being conservative is good enough

37
The basic dichotomy
[Diagram: problem P. We find splitter: continue IP (the Interactive Protocol phase). Cannot find splitter: the Dynamic Programming phase.]

For IP, we need µ < 1/log n
µn is the error in each level of IP
For DP, we need µ > 1/log log n
(1-µ) is the decrease in approximation

38
The basic dichotomy
[Diagram: the same two branches from problem P (we find splitter: continue IP; cannot find splitter: the Dynamic Programming phase), now annotated "Strengthen" and "Weaken"]


39
Reducing to smaller DP!
[Figure: a grid of boxes of side n/(log n); run the δ-approx to get an LIS estimate inside each box]

Run the δ-approx on all poly(log n) such boxes

40
Reducing to smaller DP!
Run the δ-approx on all poly(log n) such boxes
Use a dynamic program to find the chain with the largest
sum of estimates
Longest path in a DAG
Can solve in poly(log n) time
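
The small dynamic program over boxes is a maximum-weight path computation in a DAG. A sketch under assumed conventions that are not from the talk: each box is a tuple `(x_lo, x_hi, y_lo, y_hi, estimate)` with `x_lo < x_hi`, and box a may precede box b when a lies entirely to the left of and below b.

```python
def best_chain_of_boxes(boxes):
    """Largest total estimate over chains of boxes (longest path in a DAG)."""

    def precedes(a, b):
        # a ends, in both coordinates, no later than b starts.
        return a[1] <= b[0] and a[3] <= b[2]

    # Sorting by left edge is a topological order: a predecessor has
    # x_lo < x_hi <= successor's x_lo, so it always comes first.
    ordered = sorted(boxes, key=lambda b: b[0])
    best = []  # best[j]: largest chain weight ending at ordered[j]
    for j, b in enumerate(ordered):
        best_before = max(
            (best[i] for i in range(j) if precedes(ordered[i], b)), default=0
        )
        best.append(best_before + b[4])
    return max(best, default=0)
```

With poly(log n) boxes, this quadratic loop runs in poly(log n) time, which matches the claim on the slide.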

41
Dichotomy theorem
Either it is easy to find the right subproblems
OR
One can go from a δ-approx to a (δ - δ^2)-approx by a
(log n)-sized DP
42
The algorithm, in one slide
[Diagram: problem P. We find splitter: continue IP. Cannot find splitter: make poly(log n) calls to the δ-approx and solve a DP of poly(log n) size.]

Overall running time becomes (log n)^{1/δ}
(a miracle that the math works out)

43
The even better version

Don't exactly solve this dynamic program!
Use our sublinear algo to approximately solve it in
(log log n) time. Then do it recursively
It's painful
It's all Greek: α β ? δ ε ? ? µ ?
We had ?, but got rid of it

44
What next?