Title: Fast Nearest Neighbor Search with Keywords
1. Project seminar on Fast Nearest Neighbor Search with Keywords
By:
- Jadhav Pradip
- Narkhede Rahul
- Mandalik Sagar
- Mamdyal Ankit
Guided by: Prof. Mrs. M. N. Kale, Department of
Information Technology
2. Outline
- Introduction.
- Literature survey.
- Project statement.
- System requirement specification.
- Planning and scheduling.
- Conclusion.
- References.
3. Introduction: Need
- Conventional spatial queries, such as range
search and nearest neighbor retrieval, involve
only conditions on objects' geometric properties.
- The importance of spatial databases is reflected
by the convenience of modeling entities of
reality in a geometric manner.
- For example, locations of restaurants, hotels,
hospitals and so on are often represented as
points in a map.
4. Static Network, Variable Queries: Find Gas
Stations, Hotels, Markets, etc.
5. Basic concept
- Today, many modern applications call for novel
forms of queries that aim to find objects
satisfying both a spatial predicate and a
predicate on their associated texts.
- Currently, the best solution to such queries is
based on the IR2-tree, which has drawbacks that
hurt its efficiency.
6. Application
- Many applications call for a search engine that
can efficiently support novel forms of spatial
queries integrated with keyword search.
- This covers anything a user wants to search for,
such as hospitals, hotels, restaurants, and so on.
7. Literature survey: Related work
- Existing methods include:
- spatial index
- inverted index
- nearest neighbor search.
- The objective is to enable a querying interface
that is similar to that of search engines, and
can be easily used by naive users without
knowledge
8. Spatial index
- A spatial index is used because a huge amount of
data, stored in the form of XML documents, has to
be searched.
- If the data is stored in the form of indices,
less space is required and less time is needed to
search for a keyword.
9. Inverted index
- The inverted index data structure is a central
component of a typical search engine's indexing
algorithm.
- A goal of a search engine implementation is to:
- optimize the speed of the query.
- find the documents where a word occurs.
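The idea of an inverted index can be illustrated with a minimal Python sketch. The document ids and texts below are hypothetical sample data, and the helper names `build_inverted_index` and `lookup` are illustrative assumptions, not part of the project code.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    # Sorted lists make later merging of several lists straightforward.
    return {word: sorted(ids) for word, ids in index.items()}


def lookup(index, word):
    """Return the ids of the documents where the word occurs."""
    return index.get(word.lower(), [])


# Hypothetical sample documents (id -> text).
documents = {
    1: "hotel with pool and restaurant",
    2: "restaurant serving pizza",
    3: "hospital with parking",
}
index = build_inverted_index(documents)
restaurant_docs = lookup(index, "restaurant")
```

A query for one word then costs a single dictionary lookup instead of a scan over every document.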
10. Nearest neighbor search (NNS)
- It is also known as proximity search or
similarity search.
- It is an optimization problem of finding the
closest points in a metric space.
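As a baseline, nearest neighbor search can be done by a linear scan over all points. The sketch below assumes 2D points and Euclidean distance; the function name `k_nearest` and the sample coordinates are hypothetical. Index structures such as the R-tree exist to avoid this full scan.

```python
import heapq
import math


def k_nearest(points, q, k=1):
    """Return the k points closest to q by Euclidean distance (linear scan)."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, q))


# Hypothetical sample points on a map.
points = [(1.0, 1.0), (4.0, 5.0), (2.0, 2.0), (9.0, 9.0)]
nearest = k_nearest(points, q=(0.0, 0.0), k=2)
```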
11. Existing system
- The existing approach, proposed by Tao and
Sheng, builds a novel access method called the
spatial inverted index.
- This index differs from a conventional inverted
index.
- It can deal with multidimensional data and is
compatible with NN queries.
- This method achieves a shorter query response
time than the IR2-tree.
12. Drawbacks of Existing System
- The existing system does not apply the proposed
indexing to a search-engine style application.
- It does not provide a prototype application to
demonstrate the proof of concept in practice.
13. Advantages of Proposed System
- Faster NN search.
- Prototype application to demonstrate the proof
of concept.
- Search engine to demonstrate the faster searches.
14. What is to be developed?
- The goal of the proposed system is to combine
keyword search with the existing location-finding
services.
- Let P be a set of multidimensional points.
- Each point p in P is associated with a set of
words, which is denoted as Wp and termed the
document of p.
15. Problem Statement
- Input
- Spatial network S, node q from S
- Output
- k-nearest neighbors of q
- Objective
- Facilitate fast shortest path queries based on
different search criteria
- Constraints/Assumptions
- Static spatial network
- Contiguous (connected) regions
16. Example
- If p stands for a restaurant, Wp can be its menu.
- If p is a hotel, Wp can be the description of its
services and facilities.
- A nearest neighbor (NN) query specifies a point q
and a set Wq of keywords.
- It returns the point in Pq that is nearest to q,
where Pq is the set of points in P whose documents
contain all the keywords in Wq.
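The query semantics can be sketched as a brute-force filter followed by a distance comparison. This is only an illustration of what the query returns, not the indexing method of the project; the function name `nn_with_keywords`, the tuple layout, and the sample restaurants are assumptions.

```python
import math


def nn_with_keywords(points, q, wq):
    """Return the point nearest to q whose document contains every word in wq.

    points: list of (name, (x, y), set_of_words) tuples.
    Returns None when no point qualifies.
    """
    wq = set(wq)
    # Pq: the points whose document Wp contains all query keywords.
    pq = [p for p in points if wq <= p[2]]
    if not pq:
        return None
    return min(pq, key=lambda p: math.dist(p[1], q))


# Hypothetical restaurants with their menus as documents.
points = [
    ("p1", (1.0, 1.0), {"pizza", "pasta"}),
    ("p2", (2.0, 2.0), {"pizza", "steak", "wine"}),
    ("p3", (6.0, 6.0), {"steak", "wine"}),
]
result = nn_with_keywords(points, q=(0.0, 0.0), wq={"steak", "wine"})
```

Here p1 is the closest point to q overall, but it is skipped because its menu lacks the query keywords, so the answer is p2.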
17. Sample dataset of hotel object
18. Technology used
- The IR2-Tree
- DBXplorer
- Nearest Neighbor Search
- Spatial inverted list
- Merging distance browsing
19. The IR2-Tree
- The IR2-tree combines the R-tree with signature
files.
- A signature file, in general, refers to a
hashing-based framework.
- The IR2-tree is an R-tree where each (leaf or
nonleaf) entry E is augmented with a signature.
- On conventional R-trees, the best-first algorithm
of Hjaltason and Samet is a well-known solution to
NN search.
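A signature can be sketched with superimposed coding: each word sets a few bits chosen by hashing, and a document signature is the bitwise OR of its word signatures. The signature length, the number of bits per word, and the use of SHA-256 below are assumptions chosen for illustration, not the parameters used in the IR2-tree paper.

```python
import hashlib

SIGNATURE_BITS = 16  # assumed signature length
BITS_PER_WORD = 2    # assumed number of bits set per word


def word_signature(word):
    """Hash a word to a bit string with at most BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word}".encode()).digest()
        position = int.from_bytes(digest[:4], "big") % SIGNATURE_BITS
        sig |= 1 << position
    return sig


def document_signature(words):
    """OR together the signatures of all words in a document."""
    sig = 0
    for word in words:
        sig |= word_signature(word)
    return sig


def may_contain(doc_sig, word):
    """False means the word is definitely absent; True may be a false hit."""
    ws = word_signature(word)
    return (doc_sig & ws) == ws


menu_sig = document_signature({"pizza", "pasta"})
```

The test never gives a false negative, but a word that is not in the document can still pass it when its bits happen to be covered by other words. That is the source of false hits.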
20. Depth-first search
Example of bit string computation.
Example of an IR2-tree: (a) shows the MBRs of the
underlying R-tree and (b) gives the signatures of
the entries.
21. Drawbacks of the IR2-Tree
- The number of false hits can be really large.
- The signature should have Ω(1) bits for every
distinct word of W.
22. DBXplorer
- A system for keyword-based search over relational
databases.
- A significant amount of the world's enterprise
data resides in relational databases.
- Enabling keyword search in databases without
requiring knowledge of the schema is a
challenging task.
23. Architecture of DBXplorer
24. Algorithm used
25. Space and Time Requirements
26. Spatial inverted index
- Query processing with an SI-index can be done
either by merging, or together with R-trees in a
distance browsing manner.
- The spatial inverted index (SI-index) is
essentially a compressed version of an I-index
(inverted index).
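Query processing by merging can be sketched on plain, uncompressed inverted lists: the id-sorted lists of the query keywords are intersected in one pass, and the nearest of the surviving points is returned. This is a simplified illustration under stated assumptions; it omits the compression and the R-trees of the SI-index, and the names `merge_intersect`, `nn_by_merging`, and the sample data are hypothetical.

```python
import math


def merge_intersect(list_a, list_b):
    """Intersect two id-sorted lists in a single merging pass."""
    i = j = 0
    common = []
    while i < len(list_a) and j < len(list_b):
        if list_a[i] == list_b[j]:
            common.append(list_a[i])
            i += 1
            j += 1
        elif list_a[i] < list_b[j]:
            i += 1
        else:
            j += 1
    return common


def nn_by_merging(inverted_lists, coords, q, keywords):
    """Return the id of the nearest point that appears in every keyword's list."""
    lists = [inverted_lists.get(word, []) for word in keywords]
    if not lists:
        return None
    common = lists[0]
    for other in lists[1:]:
        common = merge_intersect(common, other)
    if not common:
        return None
    return min(common, key=lambda pid: math.dist(coords[pid], q))


# Hypothetical data: point id -> coordinates, word -> sorted point ids.
coords = {1: (1.0, 1.0), 2: (2.0, 2.0), 3: (6.0, 6.0)}
inverted_lists = {"pizza": [1, 2], "steak": [2, 3], "wine": [2, 3]}
answer = nn_by_merging(inverted_lists, coords, (0.0, 0.0), ["steak", "wine"])
```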
27. Theoretical analysis
28. System requirement specification: H/W system
configuration
- Processor: Pentium III.
- Speed: 1.1 GHz.
- RAM: 256 MB (min).
- Hard drive: 20 GB.
- Floppy drive: 1.44 MB.
- Keyboard: standard Windows keyboard.
- Mouse: two-button mouse.
- Monitor: SVGA.
29. S/W system configuration
- Operating system: Windows XP/7/8.
- Application server: Tomcat 6.x.
- Front end: Java.
- Script: JavaScript.
- Server-side script: JSP.
- Database: Oracle 10g.
- Data connectivity: JDBC.
30. Conclusion
- Thus, the proposed system remedies the drawbacks
of the IR2-tree by developing an access method
called the spatial inverted index (SI-index).
31. References
- S. Agrawal, S. Chaudhuri, and G. Das, "DBXplorer:
A System for Keyword-Based Search over Relational
Databases," Proc. Int'l Conf. Data Eng. (ICDE),
pp. 5-16, 2002.
- N. Beckmann, H. Kriegel, R. Schneider, and B.
Seeger, "The R*-tree: An Efficient and Robust
Access Method for Points and Rectangles," Proc.
ACM SIGMOD Int'l Conf. Management of Data, pp.
322-331, 1990.
- X. Cao, L. Chen, G. Cong, C.S. Jensen, Q. Qu, A.
Skovsgaard, D. Wu, and M.L. Yiu, "Spatial Keyword
Querying," Proc. 31st Int'l Conf. Conceptual
Modeling (ER), pp. 16-29, 2012.
- Y.-Y. Chen, T. Suel, and A. Markowetz, "Efficient
Query Processing in Geographic Web Search
Engines," Proc. ACM SIGMOD Int'l Conf. Management
of Data, pp. 277-288, 2006.
- G.R. Hjaltason and H. Samet, "Distance Browsing
in Spatial Databases," ACM Trans. Database
Systems, vol. 24, no. 2, pp. 265-318, 1999.
32. Thank you!