Title: On Optimal Worst-Case Matching
1On Optimal Worst-Case Matching
- Cheng Long (Hong Kong University of Science and Technology)
- Raymond Chi-Wing Wong (Hong Kong University of Science and Technology)
- Philip S. Yu (University of Illinois at Chicago)
- Minhao Jiang (Hong Kong University of Science and Technology)
Presented by Raymond Chi-Wing Wong
Prepared by Raymond Chi-Wing Wong
2Outline
- Introduction
- Problem Definition
- Related Work
- Algorithm Swap-Chain
- Empirical Study
- Conclusion
31. Introduction
Worst-case Optimized Assignment
- P = {p1, p2, p3}: hospitals
- O = {o1, o2, o3}: residential estates
- Return an assignment between P and O such that the condition of the assignment is satisfied.
- Different applications have different conditions.
- Some existing studies consider the capacities of hospitals and the demands of customers.
- Worst-case Optimized Condition: in the assignment, the maximum matching distance (mmd) between a residential estate o and a hospital p is minimized.
[Figure: worst-case optimized assignment with matching distances 5, 6 and 4; mmd = 6]
41. Introduction
Worst-case Optimized Assignment
- Worst-case Optimized Condition: in the assignment, the maximum matching distance (mmd) between a residential estate o and a hospital p is minimized.
- Many applications need the worst-case optimized assignment.
1. Emergency applications (e.g., hospital allocation, fire stations and police stations). In the Hong Kong ambulance service, the minimized maximum distance is 12 minutes (driving distance).
2. Logistics, data warehouse allocation
3. Mail delivery
4. Profile matching: the worst-case dissatisfaction rate among customers is minimized.
[Figure: worst-case optimized assignment between hospitals P = {p1, p2, p3} and residential estates O = {o1, o2, o3} with matching distances 5, 6 and 4; mmd = 6]
51. Introduction
- Worst-case Optimized Assignment: mmd = 6
- Fair Assignment: mmd = 10
- Fair Condition [Wong et al., VLDB 2009]: in the assignment, each o ∈ O is allocated to the p ∈ P that (i) is as near to o as possible and (ii) has not exhausted its servicing capacity in serving other, closer estates.
- Different applications have different conditions.
[Figure: fair assignment between hospitals P = {p1, p2, p3} and residential estates O = {o1, o2, o3} with matching distances 10, 3 and 2; mmd = 10]
61. Introduction
- Worst-case Optimized Assignment: mmd = 6
- Fair Assignment: mmd = 10
- Globally Optimized Assignment: mmd = 7
- Globally Optimized Condition [U et al., VLDBJ 2010]: the total cost of the assignment (i.e., the sum of the matching distances) is minimized.
- Different applications have different conditions.
[Figure: globally optimized assignment between hospitals P = {p1, p2, p3} and residential estates O = {o1, o2, o3} with matching distances 5, 7 and 2; mmd = 7]
71. Introduction
- Worst-case Optimized Assignment: mmd = 6
- Fair Assignment: mmd = 10
- Globally Optimized Assignment: mmd = 7
- Globally Optimized Condition [U et al., VLDBJ 2010]: the total cost of the assignment (i.e., the sum of the matching distances) is minimized. An assignment that satisfies this condition is said to be a globally optimized assignment.
- Existing spatial assignments do not solve the problem of finding the worst-case optimized assignment well: in the example, the fair and globally optimized assignments have mmd values of 10 and 7, while the worst-case optimized assignment has an mmd of 6.
[Figure: globally optimized assignment with matching distances 5, 7 and 2; mmd = 7]
8Outline
- Introduction
- Problem Definition
- Related Work
- Algorithm Swap-Chain
- Empirical Study
- Conclusion
92. Problem Definition
- P = {p1, p2, p3}: hospitals. Each p ∈ P is associated with its capacity p.w (which is a positive integer).
- O = {o1, o2, o3}: residential estates. Each o ∈ O is associated with its demand o.w (which is a positive integer).
- Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized.
[Figure: worst-case optimized assignment with matching distances 5, 6 and 4; mmd = 6]
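As an illustration of the objective, the following Python sketch computes the mmd of an assignment and checks it against demands and capacities. The coordinates, the weights and the helper names `mmd` and `respects_weights` are assumptions made for this example; they do not come from the slides.

```python
from collections import Counter
from math import dist

def mmd(matches, P, O):
    """Maximum matching distance over a list of (o, p) matches."""
    return max(dist(O[o], P[p]) for o, p in matches)

def respects_weights(matches, demand, capacity):
    """True if each o is matched o.w times and no p is used more than p.w times."""
    served = Counter(o for o, _ in matches)
    load = Counter(p for _, p in matches)
    return (all(served[o] == w for o, w in demand.items())
            and all(load[p] <= w for p, w in capacity.items()))

# Hypothetical data: two hospitals and two estates, all weights equal to 1.
P = {"p1": (0.0, 0.0), "p2": (4.0, 0.0)}
O = {"o1": (0.0, 3.0), "o2": (4.0, 5.0)}
demand = {"o1": 1, "o2": 1}
capacity = {"p1": 1, "p2": 1}

a1 = [("o1", "p1"), ("o2", "p2")]   # matching distances 3 and 5
a2 = [("o1", "p2"), ("o2", "p1")]   # matching distances 5 and sqrt(41)

# The worst-case optimized assignment is the valid one with the smallest mmd.
best = min((a1, a2), key=lambda a: mmd(a, P, O))
print(best, mmd(best, P, O))
```

Enumerating all assignments like this is only workable for toy inputs; it is meant to make the objective concrete, not to suggest a solution method.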
10Outline
- Introduction
- Problem Definition
- Related Work
- Algorithm Swap-Chain
- Empirical Study
- Conclusion
113. Related Work
Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized.
- Bottleneck Matching Problem (BMP)
- Given two sets of objects, namely A and B, and a matching distance (cost) between each object in A and each object in B,
- BMP is to find a perfect matching (or assignment) between A and B which minimizes the maximum matching distance.
- BMP assumes that the demand of each object in A and the capacity of each object in B are both equal to 1.
- Our problem allows the demand of each object in O and the capacity of each object in P to be any positive integer.
123. Related Work
Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized.
- Bottleneck Matching Problem (BMP)
- Threshold: the fastest algorithm
- The algorithm needs to materialize all pairwise distances. Thus, it does not scale well.
133. Related Work
Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized.
- Spatial Assignment Problem
- Fair Assignment
- Globally Optimized Assignment
As described earlier, they do not address our problem well.
143. Related Work
Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized.
- Major Contribution
- Propose an efficient and scalable algorithm (called Swap-Chain) for this problem
- More efficient and scalable than the adapted algorithm for the bottleneck problem
15Outline
- Introduction
- Problem Definition
- Related Work
- Algorithm Swap-Chain
- Empirical Study
- Conclusion
164. Algorithm Swap-Chain
- Swap-Chain involves the following 3 steps.
- Step 1 (Initialization)
- Step 2 (Assignment Adjustment)
- Step 3 (Iterative Step)
174. Algorithm Swap-Chain
Step 1 (Initialization): find a full assignment A using a given condition (e.g., fair assignment, globally optimized assignment or random assignment).
[Figure: initial fair assignment between p1, p2, p3 and o1, o2, o3; mmd = 10]
184. Algorithm Swap-Chain
Step 2 (Assignment Adjustment): re-assign some matches in A to form another full assignment A' such that the mmd value of A' is smaller than that of A.
[Figure: fair assignment before the adjustment; mmd = 10]
194. Algorithm Swap-Chain
Step 2 (Assignment Adjustment): re-assign some matches in A to form another full assignment A' such that the mmd value of A' is smaller than that of A.
[Figure: adjusted assignment with matching distances 5, 7 and 2; the mmd drops from 10 to 7]
204. Algorithm Swap-Chain
Step 3 (Iterative Step): repeat Step 2 until it is not possible to perform the assignment adjustment step (Step 2).
[Figure: assignment after the first adjustment with matching distances 5, 7 and 2; the mmd dropped from 10 to 7]
214. Algorithm Swap-Chain
Step 3 (Iterative Step): repeat Step 2 until it is not possible to perform the assignment adjustment step (Step 2).
[Figure: assignment after the second adjustment with matching distances 5, 6 and 4; the mmd dropped from 10 to 7 and then to 6]
After this assignment adjustment step, we cannot re-adjust the assignment again so that the mmd value of the adjusted assignment is smaller. This is the final solution for our problem.
Step 1 is easy. How can we perform Step 2 (i.e., how can we re-assign some matches in A such that the mmd value of the assignment decreases)?
224. Algorithm Swap-Chain
- Algorithm Swap-Chain makes use of extreme matches for re-adjusting the assignment in Step 2.
- Given an assignment A, a match in A is called an extreme match if the matching distance of this match is equal to the mmd value of A.
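The definition of an extreme match can be made concrete with a short Python sketch. The function name `extreme_matches` is hypothetical, and the three matching distances are an assumption chosen to resemble the fair assignment of the running example (mmd = 10).

```python
def extreme_matches(distance):
    """Return (mmd, matches whose matching distance equals the mmd).

    distance: dict mapping each match (o, p) of an assignment to its
    matching distance.
    """
    top = max(distance.values())
    return top, [m for m, d in distance.items() if d == top]

# Assumed matching distances of an initial assignment.
fair = {("o1", "p2"): 3, ("o2", "p1"): 10, ("o3", "p3"): 2}
print(extreme_matches(fair))
```

An assignment can have several extreme matches when distances tie, which is why the function returns a list.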
234. Algorithm Swap-Chain
Consider the assignment obtained just after Step 1 (mmd = 10).
Step (a): break the extreme match (o, p).
[Figure: the assignment after Step 1 with its extreme match highlighted]
244. Algorithm Swap-Chain
Consider the assignment obtained just after Step 1 (mmd = 10).
Step (a): break the extreme match (o, p).
Step (b): find a set of objects in O and P to be involved in the assignment adjustment, using range queries.
List (a chain from o2): o2, p2, o1, p1
[Figure: range queries around the broken match; the edges shown have distances 5, 3, 7 and 2]
254. Algorithm Swap-Chain
We continue these sub-steps to reduce the mmd value of the assignment further.
Step (a): break the extreme match (o, p).
[Figure: the assignment with matching distances 5, 7 and 2 (mmd reduced from 10 to 7), with its extreme match highlighted]
264. Algorithm Swap-Chain
We continue these sub-steps to reduce the mmd value of the assignment further.
Step (a): break the extreme match (o, p).
Step (b): find a set of objects in O and P to be involved in the assignment adjustment, using range queries.
List (a chain from o2): o2, p3, o3, p2
[Figure: range queries around the broken match; the edges shown have distances 5, 6, 2 and 4]
274. Algorithm Swap-Chain
We cannot re-adjust the assignment anymore to reduce its mmd value. This is the final solution.
[Figure: final assignment with matching distances 5, 6 and 4; the mmd went from 10 to 7 and then to 6]
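The walkthrough above can be imitated with a simplified Python sketch. It is not the authors' implementation: it assumes that every demand and capacity equals 1, it scans all of P exhaustively where Swap-Chain issues range queries on an index, and it breaks all extreme matches of a round at once. The function name `swap_until_stuck` is hypothetical. The distance matrix is an assumption reconstructed to agree with the example's figures; the o1-p3 and o3-p1 entries in particular are guesses.

```python
def swap_until_stuck(match, P, dist):
    """Lower the mmd of a one-to-one assignment until no adjustment helps.

    match: dict o -> p (a full assignment, all demands and capacities 1)
    P:     list of all p names
    dist:  dict (o, p) -> matching distance
    """
    while True:
        d = max(dist[(o, p)] for o, p in match.items())   # current mmd
        # Step (a): break every extreme match (matching distance == d).
        kept = {o: p for o, p in match.items() if dist[(o, p)] < d}
        owner = {p: o for o, p in kept.items()}           # p -> o using it

        def rematch(o, seen):
            # Step (b): follow a chain o -> p -> o' -> p' ... below distance d.
            for p in P:
                if dist[(o, p)] < d and p not in seen:
                    seen.add(p)
                    if p not in owner or rematch(owner[p], seen):
                        owner[p] = o
                        return True
            return False

        freed = [o for o in match if o not in kept]
        if not all(rematch(o, set()) for o in freed):
            return match, d                # Step 3: no further adjustment
        match = {o: p for p, o in owner.items()}

# Assumed distances (rows: o1, o2, o3; columns: p1, p2, p3).
rows = {"o1": (5, 3, 11), "o2": (10, 7, 4), "o3": (9, 6, 2)}
P = ["p1", "p2", "p3"]
dist = {(o, p): v for o, vals in rows.items() for p, v in zip(P, vals)}

fair = {"o1": "p2", "o2": "p1", "o3": "p3"}   # initial assignment, mmd 10
final, best = swap_until_stuck(fair, P, dist)
print(final, best)
```

With these assumed distances the mmd goes from 10 to 7 to 6, and the two chains found are o2, p2, o1, p1 and o2, p3, o3, p2, as in the slides. The exhaustive scan costs O(n) per step, which is exactly the cost the index on P is meant to avoid.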
284. Algorithm Swap-Chain
- In the algorithm, we have to perform range queries on P.
- We build an index on P.
- The time complexity of building the index on P is O(n log n).
294. Algorithm Swap-Chain
- The time complexity of Swap-Chain is O(R · n · (log n + k)), where k << n.
- R is the number of extreme matches found in Swap-Chain.
- R is typically a small number. In our experiments, R is equal to 500 on average when the dataset size is 1M.
304. Algorithm Swap-Chain
- The space occupied by Swap-Chain mainly comes
from the index on P (which is O(n log n)).
314. Algorithm Swap-Chain
- Swap-Chain can be extended to handle the non-spatial problem.
Due to the time limit, we do not discuss the details here.
32Outline
- Introduction
- Problem Definition
- Related Work
- Algorithm Swap-Chain
- Empirical Study
- Conclusion
335. Empirical Study
- Synthetic Dataset
- P and O: uniform distribution
- Real Dataset
- 4 datasets in Canada
- AB (Alberta)
- BC (British Columbia)
- ON (Ontario)
- QC (Quebec)
- For each dataset,
- O: a set of populated areas
- P: a set of fire stations
345. Empirical Study
- Measurements
- Execution Time
- Memory
- Our proposed algorithm
- Swap-Chain
- Two Sets of Experiments
- Comparison with Existing Spatial Assignment
- Comparison with an adapted algorithm of the
bottleneck problem (Threshold-Adapt (TA))
355. Empirical Study
- First Set: Comparison with Existing Spatial Assignment
[Figures: results on the synthetic dataset and on the real dataset]
365. Empirical Study
- Second Set: Comparison with the Adapted Algorithm Threshold-Adapt (TA)
[Figure: results on the real dataset]
375. Empirical Study
- Second Set: Comparison with the Adapted Algorithm Threshold-Adapt (TA)
[Figure: results on the synthetic dataset]
38Outline
- Introduction
- Problem Definition
- Related Work
- Algorithm Swap-Chain
- Empirical Study
- Conclusion
396. Conclusion
- Problem: find the worst-case optimized assignment
- Algorithm
- Swap-Chain
- Efficient and scalable
- Experiments
40QA
411. Introduction
- Bichromatic Reverse Nearest Neighbor (BRNN or RNN)
- Given: P and O are two sets of objects in the same data space.
- Problem: given an object p ∈ P, a BRNN query finds all the objects o ∈ O whose nearest neighbor (NN) in P is p.
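A brute-force Python sketch of the BRNN query may help fix the definition. The coordinates and the helper names `nearest` and `brnn` are assumptions for illustration only; a practical system would use a spatial index instead of scanning all of P for every object.

```python
from math import dist

def nearest(o_xy, P):
    """Name of the object in P closest to the point o_xy."""
    return min(P, key=lambda name: dist(o_xy, P[name]))

def brnn(p_name, P, O):
    """All o in O whose nearest neighbor in P is p_name."""
    return [o for o, xy in O.items() if nearest(xy, P) == p_name]

# Hypothetical coordinates for three hospitals and three estates.
P = {"p1": (0.0, 0.0), "p2": (10.0, 0.0), "p3": (10.0, 10.0)}
O = {"o1": (9.0, 1.0), "o2": (9.0, 9.0), "o3": (11.0, 11.0)}

for p in P:
    print(p, brnn(p, P, O))
```

In this toy layout p3 attracts two estates and p1 attracts none, which illustrates why an assignment based on BRNN alone ignores capacities and demands.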
421. Introduction
NN: nearest neighbor. RNN: reverse nearest neighbor.
- P = {p1, p2, p3}: hospitals
- O = {o1, o2, o3}: residential estates
- NN in P: p2 for o1, p3 for o2, p3 for o3
- RNN: none for p1, o1 for p2, {o2, o3} for p3
- Capacities of hospitals are not considered, although p3 has a serving capacity.
- Demands of customers are not considered.
433. Related Work
Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized.
- Bottleneck Matching Problem (BMP)
- Given two sets of objects, namely A and B, and a matching distance (cost) between each object in A and each object in B,
- BMP is to find a perfect matching (or assignment) between A and B which minimizes the maximum matching distance.
- BMP assumes that the demand of each object in A and the capacity of each object in B are both equal to 1.
- Our problem allows the demand of each object in O and the capacity of each object in P to be any positive integer.
One may come up with a straightforward solution to our problem as follows.
- For each object p in P (in our problem), we duplicate p p.w times.
- For each object o in O (in our problem), we duplicate o o.w times.
- Then, we use an existing algorithm for BMP to solve our problem.
However, this approach is cumbersome and undesirable, especially when the capacities/demands are very large.
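The duplication step of this straightforward reduction can be sketched in a few lines of Python. The function name `expand` and the weights are assumptions for illustration.

```python
def expand(weights):
    """Duplicate each object w times, tagging every copy with an index."""
    return [(name, i) for name, w in weights.items() for i in range(w)]

# Hypothetical capacities p.w and demands o.w.
capacity = {"p1": 2, "p2": 1}
demand = {"o1": 1, "o2": 2}

B = expand(capacity)   # unit-capacity copies of the objects in P
A = expand(demand)     # unit-demand copies of the objects in O
print(A, B)
```

The BMP instance has as many objects as the sum of the weights and one distance per pair of copies, which is why the approach becomes impractical when capacities or demands are large.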
44Outline
- Introduction
- Problem Definition
- Related Work
- Algorithm Threshold-Adapt
- Algorithm Swap-Chain
- Empirical Study
- Conclusion
454. Algorithm Threshold-Adapt
- Threshold-Adapt (for demands/capacities equal to any positive integer) shares the same skeleton as Threshold (for demands/capacities equal to 1).
- Threshold-Adapt is an algorithm which searches for the best solution in the solution space.
- However, this algorithm is not scalable.
464. Algorithm Threshold-Adapt
- Before we introduce this algorithm, we give two concepts.
- Concept 1: Full assignment
- Concept 2: Feasibility
474. Algorithm Threshold-Adapt
Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized.
Concept 1: Full Assignment
- Suppose that the total demand from O is at most the total capacity of P.
- An assignment A between O and P is said to be full if each object in O is matched with an object in P.
[Figure: a full assignment]
484. Algorithm Threshold-Adapt
Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized.
Concept 1: Full Assignment
- Suppose that the total demand from O is at most the total capacity of P.
- An assignment A between O and P is said to be full if each object in O is matched with an object in P.
[Figure: a non-full assignment]
494. Algorithm Threshold-Adapt
Problem to find an assignment between P and O
such that the maximum matching distance (mmd) is
minimized.
Concept 1 Full Assignment
- We want to find the full assignment with the
smallest mmd value.
mmd 10
A full assignment
504. Algorithm Threshold-Adapt
Problem to find an assignment between P and O
such that the maximum matching distance (mmd) is
minimized.
Concept 1 Full Assignment
We want to find the full assignment with the
smallest mmd value.
mmd 6
A full assignment
We choose this full assignment since it has the
smallest mmd value
514. Algorithm Threshold-Adapt
Concept 2 Feasibility
- A value is said to be feasible for our problem if
there exists a full assignment such that its mmd
value is at most this value.
Consider a value 6
There exists a full assignment such that its mmd
value is at most 6
6 is feasible.
524. Algorithm Threshold-Adapt
Concept 2 Feasibility
- A value is said to be feasible for our problem if
there exists a full assignment such that its mmd
value is at most this value.
Consider a value 10
There exists a full assignment such that its mmd
value is at most 10
10 is feasible.
534. Algorithm Threshold-Adapt
Concept 2 Feasibility
- A value is said to be feasible for our problem if
there exists a full assignment such that its mmd
value is at most this value.
Consider a value 10
There exists a full assignment such that its mmd
value is at most 10
10 is feasible.
There can be more than one assignment such that
its mmd value is at most 10.
544. Algorithm Threshold-Adapt
Concept 2 Feasibility
- A value is said to be feasible for our problem if
there exists a full assignment such that its mmd
value is at most this value.
Consider a value 1
There does not exist a full assignment such that its
mmd value is at most 1
1 is not feasible.
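The feasibility test can be sketched in Python. The deck states that feasibility can be decided with a maximum-flow algorithm; the sketch below swaps in a simple augmenting-path bipartite matching over unit copies, which is adequate only for small weights. The function name `is_feasible` is hypothetical, and the distance matrix is an assumption reconstructed to agree with the running example (the o1-p3 and o3-p1 entries are guesses).

```python
def is_feasible(v, dist, demand, capacity):
    """True if a full assignment with mmd <= v exists.

    dist: dict (o, p) -> matching distance
    demand / capacity: dicts of positive integers (o.w and p.w)
    """
    # Expand every o and p into unit copies (fine for small weights only).
    units = [o for o, w in demand.items() for _ in range(w)]
    slots = [p for p, w in capacity.items() for _ in range(w)]
    owner = [None] * len(slots)        # slot index -> unit index using it

    def augment(u, seen):
        # Try to place unit u, re-placing earlier units along a path if needed.
        for s, p in enumerate(slots):
            if dist[(units[u], p)] <= v and s not in seen:
                seen.add(s)
                if owner[s] is None or augment(owner[s], seen):
                    owner[s] = u
                    return True
        return False

    return all(augment(u, set()) for u in range(len(units)))

# Assumed distances (rows: o1, o2, o3; columns: p1, p2, p3).
rows = {"o1": (5, 3, 11), "o2": (10, 7, 4), "o3": (9, 6, 2)}
names = ["p1", "p2", "p3"]
dist = {(o, p): d for o, vals in rows.items() for p, d in zip(names, vals)}
ones_o = {o: 1 for o in rows}
ones_p = {p: 1 for p in names}

for v in (1, 5, 6, 10):
    print(v, is_feasible(v, dist, ones_o, ones_p))
```

Under these assumed distances the values 6 and 10 are feasible while 1 and 5 are not, so feasibility is monotone in v, which is what makes a threshold search possible.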
554. Algorithm Threshold-Adapt
- We have described the two concepts.
Concept 1 Full Assignment
Concept 2 Feasibility
- We present Threshold-Adapt next.
564. Algorithm Threshold-Adapt
Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized.
- Let S be the set of all pairwise distances between O and P.
- In the example, S = {5, 11, 6, 2, 3, 10, 7, 4, 9}.
Observation: the optimal mmd value (i.e., 6) is in S.
Step 1: for each value v in S, we determine whether v is feasible for our problem.
Step 2: find the smallest value v which is feasible.
Step 3: return the full assignment with its mmd value equal to v.
574. Algorithm Threshold-Adapt
Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized.
- Let S be the set of all pairwise distances between O and P.
- In the example, S = {5, 11, 6, 2, 3, 10, 7, 4, 9}.
Observation: the optimal mmd value is in S.
Step 1: for each value v in S, we determine whether v is feasible for our problem.
Step 2: find the smallest value v which is feasible.
Step 3: return the full assignment with its mmd value equal to v.
- There are two remaining issues.
- Issue 1: how to determine whether a value v is feasible. This can be done by a maximum-flow algorithm.
- Issue 2: how to improve the efficiency of this algorithm. This can be sped up by binary search.
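The binary-search speed-up can be sketched as follows, assuming a feasibility oracle is supplied by the caller (for instance one backed by a maximum-flow computation). The function name `smallest_feasible` is hypothetical, and the stand-in oracle below simply encodes the example's optimal mmd of 6.

```python
def smallest_feasible(S, is_feasible):
    """Smallest value in S accepted by is_feasible, or None if there is none.

    Relies on monotonicity: if v is feasible, every larger value is too.
    """
    values = sorted(set(S))
    if not values or not is_feasible(values[-1]):
        return None
    lo, hi = 0, len(values) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_feasible(values[mid]):
            hi = mid           # values[mid] works; look for a smaller one
        else:
            lo = mid + 1       # values[mid] fails; the answer is larger
    return values[lo]

# Pairwise distances from the example; the oracle is a stand-in.
S = [5, 11, 6, 2, 3, 10, 7, 4, 9]
calls = []

def oracle(v):
    calls.append(v)
    return v >= 6

print(smallest_feasible(S, oracle), len(calls))
```

The search needs O(log |S|) oracle calls in place of one call per value in S, but S itself still has to be materialized and sorted.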
584. Algorithm Threshold-Adapt
- The time complexity of Threshold-Adapt is O(n2
? . log n) - where
- ? is the complexity analysis of the maximum-flow
algorithm.
594. Algorithm Threshold-Adapt
- The space complexity of Threshold-Adapt is O(n^2).
This algorithm is not scalable.
604. Algorithm Swap-Chain
- The time complexity of Swap-Chain is equal to
O(?? ? R . I) - where
- ? is the time complexity of Step 1
- R is the number of extreme matches found in
Swap-Chain - I is the time complexity of performing the
re-matching operation for a given extreme match
? O(n log n) if the fair assignment is used.
R is typically a small number.
In our experiments, R is equal to 500 on average when the dataset size is 1M.
I = O(n (log n + k)), where k << n.
616. Empirical Study
Real dataset