Title: Facility Location in Dynamic Geometric Data Streams
1Facility Location in Dynamic Geometric Data Streams
Christiane Lammersen Christian Sohler
2Dynamic Geometric Data Streams
- Streams of geometric data arise in
- Mobile networks
- Sensor networks
- Continuously changing data
- Mobile networks: position of nodes
- Sensor networks: measured data
- Communication in form of update operations
- Update consists of ID of node, old value, new value
Christiane Lammersen, IITK Workshop on Algorithms for Processing Massive Data Sets
3Hierarchical Communication Systems
- upper layer offers lower layer a certain service
- each node can be a server
- cost for server vs. access time
5Dynamic Geometric Data Streams
- m insert and delete operations
- points in low-dimensional, discrete space $\{1, \dots, D\}^d$
- polylog(D, m) memory space, one pass
[Indyk 04]
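The stream model above can be illustrated with a small reference sketch. It only replays insert and delete operations into a multiset, so it uses memory linear in the number of points and is not a streaming algorithm. The names `update_to_ops` and `replay` are hypothetical, and treating a node update (old value, new value) as a delete followed by an insert is an assumption about how updates map to this model.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Point = Tuple[int, ...]   # a point in {1, ..., D}^d
Op = Tuple[str, Point]    # ("insert", p) or ("delete", p)


def update_to_ops(old: Point, new: Point) -> List[Op]:
    """Hypothetical helper: express one position update as delete + insert."""
    return [("delete", old), ("insert", new)]


def replay(stream: Iterable[Op]) -> Counter:
    """Reference semantics: the multiset P that remains after all operations."""
    points: Counter = Counter()
    for kind, p in stream:
        if kind == "insert":
            points[p] += 1
        elif kind == "delete":
            if points[p] == 0:
                raise ValueError(f"delete of absent point {p}")
            points[p] -= 1
            if points[p] == 0:
                del points[p]
        else:
            raise ValueError(f"unknown operation {kind!r}")
    return points


stream = [("insert", (3, 5)), ("insert", (9, 2))] + update_to_ops((3, 5), (4, 5))
current = replay(stream)
```

A streaming method has to answer questions about `current` without ever storing it, which is what the polylog(D, m) space bound demands.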
6Dynamic Uniform FLP
- point set P
- facilities have uniform opening cost f
- clients have uniform demand b
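- goal: maintain $F \subseteq P$ so as to minimize the total cost of opening facilities and serving clients
- FLP is related to k-Median, but
- $|F|$ can be $\Theta(|P|)$
- problem in streaming: the solution itself may be too large to store
- hence: approximation of the cost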
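To make the objective concrete, here is a minimal sketch of the cost of a given facility set. The exact formula is not preserved on the slide, so the form used here (opening cost f per facility plus demand b times the distance of each client to its nearest facility) is the standard uniform facility location objective and is an assumption, as is the choice of Euclidean distance. The function name `flp_cost` is hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[int, ...]


def flp_cost(points: Sequence[Point], facilities: Sequence[Point],
             f: float, b: float = 1.0) -> float:
    """Opening cost f per facility plus b times each client's distance
    to its nearest open facility (Euclidean distance is an assumption)."""
    if not facilities:
        raise ValueError("at least one facility must be open")
    connection = sum(min(math.dist(p, q) for q in facilities) for p in points)
    return f * len(facilities) + b * connection


P = [(1, 1), (1, 2), (9, 9)]
one = flp_cost(P, [(1, 1)], f=8)            # a single facility
two = flp_cost(P, [(1, 1), (9, 9)], f=8)    # a second facility near the far client
```

In this toy instance the second facility pays for itself because the far client's connection cost exceeds f.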
7Related Work
- P. Indyk, "Algorithms for Dynamic Geometric Problems over Data Streams", STOC 04
  - $O(\log^2 D)$-approximation for cost of FLP
  - Idea: nested square grids, open a facility in all "heavy" cells
- G. Frahling and C. Sohler, "Coresets in Dynamic Geometric Data Streams", STOC 05
  - space partition based on heavy cells
8Construction of Our Streaming Method
deterministic method: $E_{det}(P) = \Theta(OPT(P))$
randomized method: $E_{rand}(P) = \Theta(E_{det}(P))$
streaming method: $E_{stream}(P) = \Theta(E_{rand}(P))$
9Deterministic Method
- Impose $\log(D) + 1$ nested square grids
- In each grid, identify the heavy cells
- Partition the input space based on the heavy cells
- For each cell size, count the number of points within cells of that size
  → estimator for cost
[Indyk 04, Frahling and Sohler 05]
Idea: open one facility in each heavy cell of the space partition.
12Nested Grids
- Impose $\log(D) + 1$ nested square grids
[Figure: the nested grids for D = 16, shown one level at a time from level 4 down to level 0]
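A minimal sketch of the nested grids, assuming that D is a power of two, that level i uses cells of side length 2^i, and that all grids are aligned at the origin (the slides do not say whether the grids are shifted). The names `num_levels` and `cell_of` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, ...]   # coordinates in {1, ..., D}
Cell = Tuple[int, ...]    # index of a grid cell within one level


def num_levels(D: int) -> int:
    """log2(D) + 1 levels (levels 0 .. log2 D), for D a power of two."""
    if D < 1 or D & (D - 1):
        raise ValueError("D must be a power of two")
    return D.bit_length()  # equals log2(D) + 1 for powers of two


def cell_of(p: Point, level: int) -> Cell:
    """Cell of side length 2**level that contains p (assumed convention)."""
    return tuple((x - 1) >> level for x in p)


D = 16
levels = num_levels(D)
# The chain of cells containing one point, from the finest to the coarsest grid.
path: List[Cell] = [cell_of((7, 12), i) for i in range(levels)]
```

Because the grids are nested, the cell at level i + 1 is obtained from the cell at level i by halving each index, so a point's whole chain of cells follows from its coordinates alone.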
18Space Partition
- In each grid, identify the heavy cells
- Partition the input space based on the heavy cells
A cell in level i is heavy if it contains at least $f / 2^i$ points.
[Figure: example with f = 8 and D = 16; the partition is built step by step from level 4 down to level 0]
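The heaviness test can be sketched offline as follows. This only identifies heavy cells per level; it does not build the space partition, whose exact construction is shown in the figures rather than stated in the text. The sketch assumes cells of side length 2^i at level i, aligned at the origin, and the name `heavy_cells` is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set, Tuple

Point = Tuple[int, ...]
Cell = Tuple[int, ...]


def heavy_cells(points: Iterable[Point], f: float,
                levels: int) -> Dict[int, Set[Cell]]:
    """For each level i, the cells that hold at least f / 2**i points."""
    pts = list(points)
    heavy: Dict[int, Set[Cell]] = {}
    for i in range(levels):
        # Count the points that fall into each level-i cell.
        counts = Counter(tuple((x - 1) >> i for x in p) for p in pts)
        heavy[i] = {c for c, n in counts.items() if n >= f / 2 ** i}
    return heavy


# Same parameters as the example in the slides: f = 8, D = 16 (5 levels).
P = [(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (15, 15)]
H = heavy_cells(P, f=8, levels=5)
```

The threshold halves with every level, so a sparse region becomes heavy only in coarse grids while a dense cluster is already heavy in fine ones.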
33Cost Estimator
- For each cell size, count the number of points within cells of that size
  → estimator for cost
[Figure: example partition; the surviving figure labels are "9 points" and "7 points"]
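One possible reading of the estimator, sketched under assumptions: each point in a partition cell of level i is charged 2^i, which matches the per-cell estimate n(C) · 2^i that appears in the analysis slides. Whether the talk's estimator contains further terms or constants cannot be recovered from the slides, so this is an illustration only. The name `estimate_cost` and the counts in the example are hypothetical.

```python
from typing import Mapping


def estimate_cost(points_per_level: Mapping[int, int]) -> int:
    """Sum over levels of (points in partition cells of that level) * 2**level."""
    return sum(n * 2 ** level for level, n in points_per_level.items())


# Illustrative counts only; they are not taken from the talk's figure.
example = {0: 3, 1: 9, 2: 7}
value = estimate_cost(example)
```

The point of this form is that it needs only one counter per cell size, not the positions of the points.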
38Value of Cost Estimator is $\Omega(OPT(P))$
- Contribution of a heavy cell C in level i is at most [bound missing in source]
- Contribution of a light cell C in level i is at most [bound missing in source]
- A heavy cell in level i contains $\Theta(f / 2^i)$ points.
- The space partition is balanced.
- The distance of a cell in level i to a heavy cell is $O(2^i)$.
39Value of Cost Estimator is $O(OPT(P))$
- Contribution of a distant cell C in level i is at least $n(C) \cdot 2^{i-1}$, where $n(C)$ is the number of points in C
- $OPT(P) \ge f \cdot |F_{OPT}|$
- Estimated cost for a near cell C in level i is $n(C) \cdot 2^i = O(f)$
- There is a constant number of near cells.
- Estimated cost for near cells is $O(f \cdot |F_{OPT}|)$
41Randomized Method
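- Idea
  - A heavy cell in level i contains at least $f / 2^i$ points
  - Sample a point in level i with probability $2^i / f$
- Problem: coin flips vs. delete operations (a fresh coin flip at deletion time need not agree with the one made at insertion time)
- Solution
  - Hash function $h_i : \{1, \dots, D\}^d \to \{1, \dots, \lceil f / 2^i \rceil\}$
  - Sample set $S_i = \{p \in P : h_i(p) = 1\}$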
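The hash-based sampling can be sketched as follows. SHA-256 with a seed is used here only as a convenient stand-in; the hash family used in the talk is not specified on the slide. Rounding the range size up with a ceiling is likewise an assumption, and the names `h`, `is_sampled` and `apply_op` are hypothetical.

```python
import hashlib
import math
from typing import Set, Tuple

Point = Tuple[int, ...]


def h(level: int, p: Point, f: float, seed: int = 0) -> int:
    """Stand-in for h_i: maps p to {1, ..., ceil(f / 2**level)}."""
    buckets = max(1, math.ceil(f / 2 ** level))
    digest = hashlib.sha256(f"{seed}|{level}|{p}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % buckets + 1


def is_sampled(level: int, p: Point, f: float, seed: int = 0) -> bool:
    """p belongs to S_i exactly when h_i(p) = 1."""
    return h(level, p, f, seed) == 1


def apply_op(sample: Set[Point], op: str, p: Point, level: int, f: float) -> None:
    """Insert and delete consult the same hash value, so they always agree."""
    if is_sampled(level, p, f):
        if op == "insert":
            sample.add(p)
        else:
            sample.discard(p)


S: Set[Point] = set()
pts = [(x, y) for x in range(1, 17) for y in range(1, 17)]
for p in pts:
    apply_op(S, "insert", p, level=1, f=8)
kept = len(S)               # size of the sample after all insertions
for p in pts:
    apply_op(S, "delete", p, level=1, f=8)
```

Since the sampling decision is a function of the point alone, deleting a point removes it from the sample exactly when its insertion had added it, which independent coin flips cannot guarantee.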
42Randomized Method
- for each level i do
  - $F(i) \leftarrow$ set of all marked cells C in level i such that
    - a) no subcell of C is marked
    - b) no smaller cell within a distance of less than $2^{i-1}$ is marked
- return [expression missing in source]
$E_{rand}(P) = \Theta(E_{det}(P))$
43Streaming Method
- Idea: reduction to counting distinct elements
- Implementation
  - For each level i, count distinct elements in
    - $DE_1(i) = \{C : C \text{ is in level } i \text{ and marked}\} \cup \{C : C \text{ is in level } i \text{ and a) or b) fails}\}$
    - and $DE_2(i) = \{C : C \text{ is in level } i \text{ and a) or b) fails}\}$
  - Output the difference as cost for level i
[Figure: distinct-element counters DE1(i), DE2(i), DE1(i+1), DE2(i+1)]
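The reduction can be checked on exact sets. In the sketch below, Python sets stand in for the distinct-element counters, so it illustrates only why the difference of the two counts equals the number of marked cells for which both conditions hold. A streaming implementation would replace `len(...)` by an approximate distinct-elements sketch. The name `level_count` and the example cells are hypothetical.

```python
from typing import Set, Tuple

Cell = Tuple[int, ...]


def level_count(marked: Set[Cell], failing: Set[Cell]) -> int:
    """|DE1| - |DE2| with DE1 = marked ∪ failing and DE2 = failing."""
    de1 = len(marked | failing)   # a streaming version would use a sketch here
    de2 = len(failing)
    return de1 - de2


marked = {(0, 0), (1, 1), (3, 2)}   # marked cells of one level
failing = {(1, 1), (5, 5)}          # cells of that level where a) or b) fails
diff = level_count(marked, failing)
```

Because DE2 is a subset of DE1, the difference counts exactly the marked cells that are not in the failing set.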
44Conclusion and Future Work
- Streaming algorithm for dynamic FLP
  - constant-factor approximation of the cost
  - update time $O(\log(1/\delta) \cdot \mathrm{polylog}(D))$
  - space $O(\log(1/\delta) \cdot \mathrm{polylog}(D))$
  - failure probability $\delta$
- Future work
  - approximation factor not exponential in the dimension d
  - $(1+\varepsilon)$-approximation algorithm
45Thank you for your attention!
Department of Computer Science
Technische Universität Dortmund
Otto-Hahn-Str. 14, 44221 Dortmund, Germany
Phone: +49 231 755-4762
Fax: +49 231 755-2047
Email: christiane.lammersen_at_tu-dortmund.de
http://ls2-www.cs.uni-dortmund.de/lammersen/