Geometric Problems in High Dimensions: Sketching

1
Geometric Problems in High Dimensions: Sketching
  • Piotr Indyk

2
Dimensionality Reduction in Hamming Metric
  • Theorem: For any r and eps > 0 (small enough),
    there is a distribution of mappings G:
    {0,1}^d → {0,1}^t such that, for any two points p,
    q, the probability that both
  • If D(p,q) < r then D(G(p), G(q)) < (c + eps/20) t
  • If D(p,q) > (1+eps) r then D(G(p), G(q)) > (c + eps/10) t
  • hold is at least 1-P, as long as
    t = C log(2/P)/eps^2, C a large constant
  • Given n points, we can reduce the dimension to
    O(log n) and still approximately preserve the
    distances between them
  • The mapping works (with high probability) even if
    you don't know the points in advance

3
Proof
  • Mapping: G(p) = (g_1(p), g_2(p), ..., g_t(p)), where
  • g_j(p) = f_j(p|I_j)
  • I_j: a multiset of s indices chosen independently and
    uniformly at random from {1, ..., d}
  • p|I: the projection of p onto the coordinates in I
  • f_j: a random function into {0,1}
  • Example: p = 01101, s = 3, I = {2,2,4} → p|I = 110
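The mapping above can be sketched in code. A minimal Python illustration (not from the slides; the helper names make_g and make_G are mine), with each random function f_j realized lazily as a memoized random bit per projected pattern:

```python
import random

def make_g(d, s, rng):
    """One sketch coordinate: g(p) = f(p|I) for a random multiset I
    of s indices and a random function f into {0, 1}."""
    I = [rng.randrange(d) for _ in range(s)]   # multiset of s indices
    f = {}                                     # memoized random bits
    def g(p):
        proj = tuple(p[i] for i in I)          # the projection p|I
        if proj not in f:
            f[proj] = rng.randint(0, 1)        # f: random function into {0,1}
        return f[proj]
    return g

def make_G(d, s, t, seed=0):
    """G(p) = (g_1(p), ..., g_t(p)) with t independent coordinates."""
    rng = random.Random(seed)
    gs = [make_g(d, s, rng) for _ in range(t)]
    return lambda p: [g(p) for g in gs]

G = make_G(d=5, s=3, t=8, seed=1)
sketch = G("01101")   # a list of 8 bits
```

Because each f_j is memoized, the same point always maps to the same sketch, so D(G(p), G(q)) is well defined.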

4
Analysis
  • What is Pr[p|I = q|I]?
  • It is equal to (1 - D(p,q)/d)^s
  • We set s = d/r. Then Pr[p|I = q|I] ≈ e^{-D(p,q)/r},
    a curve decreasing exponentially in D(p,q)
  • Thus:
  • If D(p,q) < r then Pr[p|I = q|I] > 1/e
  • If D(p,q) > (1+eps) r then Pr[p|I = q|I] < 1/e - eps/3
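Plugging in numbers shows the two regimes; a quick numeric check (the values of d, r, eps here are mine, chosen only for illustration):

```python
import math

d, r, eps = 1000, 100, 0.5   # illustrative parameters
s = d // r                   # s = d/r = 10

def p_equal(D):
    """Pr[p|I = q|I] = (1 - D/d)^s for Hamming distance D."""
    return (1 - D / d) ** s

near = p_equal(50)    # D < r: above 1/e
far = p_equal(160)    # D > (1+eps) r = 150: below 1/e - eps/3
```

The exact value (1 - D/d)^s tracks the approximation e^{-D/r} closely for these parameters.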

5
Analysis II
  • What is Pr[g(p) ≠ g(q)]?
  • Sketches can differ only when the projections differ,
    and then the random f makes them differ with probability 1/2:
  • Pr[g(p) ≠ g(q)] = Pr[p|I ≠ q|I] · 1/2 = (1 - Pr[p|I = q|I])/2
  • Thus:
  • If D(p,q) < r then Pr[g(p) ≠ g(q)] < (1 - 1/e)/2 = c
  • If D(p,q) > (1+eps) r then Pr[g(p) ≠ g(q)] > c + eps/6

6
Analysis III
  • What is D(G(p), G(q))? Since G(p) = (g_1(p),
    g_2(p), ..., g_t(p)), we have:
  • D(G(p), G(q)) = Σ_j [g_j(p) ≠ g_j(q)]
  • By linearity of expectation:
  • E[D(G(p), G(q))] = Σ_j Pr[g_j(p) ≠ g_j(q)] = t ·
    Pr[g_j(p) ≠ g_j(q)]
  • To get the high-probability bound, use the Chernoff
    inequality

7
Chernoff bound
  • Let X_1, X_2, ..., X_t be independent 0-1 random
    variables such that Pr[X_i = 1] = r. Let X = Σ_j X_j.
    Then for any 0 < b < 1:
  • Pr[|X - tr| > b·tr] < 2e^{-b^2 tr/3}
  • Proof I: Cormen, Leiserson, Rivest, Stein,
    Appendix C
  • Proof II: attend one of David Karger's classes
  • Proof III: do it yourself
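The bound can also be sanity-checked by simulation; a small Python experiment (parameter values are mine) comparing the empirical tail probability against 2e^{-b^2 tr/3}:

```python
import math
import random

def tail_prob(t, r, b, trials, rng):
    """Empirical Pr[|X - t*r| > b*t*r] for X a sum of t
    independent Bernoulli(r) variables."""
    hits = 0
    for _ in range(trials):
        X = sum(rng.random() < r for _ in range(t))
        if abs(X - t * r) > b * t * r:
            hits += 1
    return hits / trials

t, r, b = 400, 0.3, 0.25
bound = 2 * math.exp(-b * b * t * r / 3)   # the Chernoff bound
est = tail_prob(t, r, b, trials=5000, rng=random.Random(0))
```

For these parameters the empirical tail sits well below the bound, as expected (Chernoff is not tight).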

8
Analysis IV
  • In our case X_j = [g_j(p) ≠ g_j(q)] and
    X = D(G(p), G(q)). Therefore:
  • For r = c (the case D(p,q) < r):
  • Pr[X > (c + eps/20)t] ≤ Pr[|X - tc| > (eps/20)·tc]
    ≤ 2e^{-(eps/20)^2 tc/3}
  • For r = c + eps/6 (the case D(p,q) > (1+eps)r):
  • Pr[X < (c + eps/10)t] ≤ Pr[|X - (c + eps/6)t| > (eps/20)·(c + eps/6)t]
    ≤ 2e^{-(eps/20)^2 t(c + eps/6)/3}
  • In both cases, the probability of failure is at
    most 2e^{-(eps/20)^2 tc/3}

9
Finally
  • 2e^{-(eps/20)^2 tc/3} = 2e^{-(eps/20)^2 (c/3) · C log(2/P)/eps^2}
    = 2e^{-log(2/P) · cC/1200}
  • Take C so that cC/1200 ≥ 1. We get:
  • 2e^{-log(2/P) · cC/1200} ≤ 2e^{-log(2/P)} = P
  • Thus, the probability of failure is at most P.

10
Algorithmic Implications
  • Approximate Near Neighbor:
  • Given: a set A of n points in {0,1}^d, eps > 0, r > 0
  • Goal: a data structure that, for any query q:
  • if there is a point p within distance r from q,
    reports a point p' within distance (1+eps)r from q
  • Can solve Approximate Nearest Neighbor by taking
    r = 1, (1+eps), (1+eps)^2, ...

11
Algorithm I - Practical
  • Set the probability of error to 1/poly(n) →
    t = O(log n/eps^2)
  • Map all points p to G(p)
  • To answer a query q:
  • Compute G(q)
  • Find the nearest neighbor G(p) of G(q)
  • If D(p,q) < r(1+eps), report p
  • Query time: O(n log n/eps^2)
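A toy end-to-end version of Algorithm I, reusing the bit-sampling sketch from the earlier slides (the helper names and parameter values are mine, for illustration only):

```python
import random

def hamming(u, v):
    """Hamming distance between two equal-length strings."""
    return sum(a != b for a, b in zip(u, v))

def make_G(d, s, t, seed=0):
    """Bit-sampling sketch: g_j(p) = f_j(p|I_j), with each f_j
    realized as memoized random bits."""
    rng = random.Random(seed)
    Is = [[rng.randrange(d) for _ in range(s)] for _ in range(t)]
    fs = [{} for _ in range(t)]
    def G(p):
        return [f.setdefault(tuple(p[i] for i in I), rng.randint(0, 1))
                for I, f in zip(Is, fs)]
    return G

def query(points, q, G, r, eps):
    """Algorithm I: find the point whose sketch is nearest to G(q),
    then verify its true distance to q."""
    Gq = G(q)
    p = min(points, key=lambda x: hamming(G(x), Gq))
    return p if hamming(p, q) < r * (1 + eps) else None
```

In a real implementation the sketches G(p) of the data points would of course be precomputed once at build time, not recomputed per query.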

12
Algorithm II - Theoretical
  • The exact nearest neighbor problem in {0,1}^t can
    be solved with:
  • 2^t space
  • O(t) query time
  • (just store precomputed answers to all queries)
  • By applying the mapping G(.), we solve approximate
    near neighbor with:
  • n^{O(1/eps^2)} space
  • O(d log n/eps^2) query time

13
Another Sketching Method
  • In many applications, the points tend to be quite
    sparse:
  • Large dimension
  • Very few 1s
  • It is easier to think of them as sets, e.g., the
    set of words in a document
  • The previous method would require a very large s
  • For two sets A, B, define Sim(A,B) = |A ∩ B| / |A ∪ B|
  • If A = B, Sim(A,B) = 1
  • If A and B are disjoint, Sim(A,B) = 0
  • How can we compute short sketches of sets that
    preserve Sim(.)?

14
Min Approach
  • Mapping: g(A) = min_{a ∈ A} h(a), where h is a random
    permutation of the elements of the universe
  • Fact: Pr[g(A) = g(B)] = Sim(A,B)
  • Proof: Where is min( h(A) ∪ h(B) )? The minimum of h
    over A ∪ B is equally likely to be attained by any
    element of A ∪ B, and g(A) = g(B) exactly when that
    element lies in A ∩ B
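The fact can be checked empirically; a small Python experiment (the example sets are mine) averaging over many random permutations:

```python
import random

def make_g(universe, rng):
    """g(A) = min_{a in A} h(a) for a random permutation h of the universe."""
    order = list(universe)
    rng.shuffle(order)
    h = {x: i for i, x in enumerate(order)}   # h(x) = rank in the shuffle
    return lambda A: min(h[a] for a in A)

def sim(A, B):
    """Sim(A,B) = |A ∩ B| / |A ∪ B|."""
    return len(A & B) / len(A | B)

rng = random.Random(0)
U = ["cat", "dog", "fish", "bird", "ant"]
A = {"cat", "dog", "fish"}
B = {"dog", "fish", "bird"}
trials = 20000
hits = 0
for _ in range(trials):
    g = make_g(U, rng)        # one fresh permutation per trial
    hits += g(A) == g(B)
est = hits / trials           # should approach Sim(A,B) = 2/4 = 0.5
```

Note that the same permutation h must be applied to both A and B within a trial; only the permutation is resampled across trials.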

15
Min Sketching
  • Define G(A) = (g_1(A), g_2(A), ..., g_t(A))
  • By the Chernoff bound, we can conclude that if
    t = C log(1/P)/eps^2, then for any A, B, the number of
    j's such that g_j(A) = g_j(B) is equal to
  • t · Sim(A,B) ± eps·t
  • with probability at least 1-P
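The sketch on this slide, as a short Python illustration (function names are mine; one independent random permutation per coordinate):

```python
import random

def make_sketch(universe, t, seed=0):
    """G(A) = (g_1(A), ..., g_t(A)): each coordinate is a min over
    an independent random permutation of the universe."""
    rng = random.Random(seed)
    perms = []
    for _ in range(t):
        order = list(universe)
        rng.shuffle(order)
        perms.append({x: i for i, x in enumerate(order)})
    return lambda A: [min(h[a] for a in A) for h in perms]

def est_sim(GA, GB):
    """Estimate Sim(A,B) as the fraction of agreeing coordinates."""
    return sum(a == b for a, b in zip(GA, GB)) / len(GA)

G = make_sketch(range(100), t=2000, seed=7)
A, B = set(range(0, 60)), set(range(30, 90))
estimate = est_sim(G(A), G(B))   # true Sim(A,B) = 30/90 = 1/3
```

With t = 2000 coordinates the estimate should land within a few percent of the true similarity, matching the t·Sim(A,B) ± eps·t guarantee.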