Title: Query Paradigms: Model and Semantics
1. Query Paradigms: Model and Semantics
2. What are the scenarios?
- Information retrieval
- Multimedia
- Relational database
3. TR vs. Database Retrieval
- Information
  - Unstructured/free text vs. structured data
  - Ambiguous vs. well-defined semantics
- Query
  - Ambiguous vs. well-defined semantics
  - Incomplete vs. complete specification
- Answers
  - Relevant documents vs. matched records
- TR is an empirically defined problem!
4. Document Selection vs. Ranking
[Figure: document selection uses a binary f(d,q) in {0,1} to carve out an answer set R(q), compared against the true R(q); document ranking uses a real-valued f(d,q) to order documents by score, e.g. 0.98 d1, 0.95 d2, 0.83 d3, 0.80 d4, 0.76 d5, 0.56 d6, 0.34 d7, 0.21 d8, 0.21 d9, ...]
5. TR System Architecture
[Figure: docs go through INDEXING to a Doc Rep; the User's query is turned into a Query Rep; a Ranking component matches the two and returns results to the User]
6. Relevance Feedback
7. Pseudo/Blind/Automatic Feedback
[Figure: a Query is run by the Retrieval Engine over the Document collection; the top-scored results (d1 3.5, d2 2.4, ..., dk 0.5) are treated as if judged relevant and fed back to produce an Updated query]
8. VS Model Illustration
9. Relevance Feedback in VS
- Basic setting: learn from examples
  - Positive examples: docs known to be relevant
  - Negative examples: docs known to be non-relevant
  - How do you learn from this to improve performance?
- General method: query modification
  - Adding new (weighted) terms
  - Adjusting weights of old terms
  - Doing both
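Query modification of this kind is commonly done with a Rocchio-style update: scale the original query vector, add the centroid of the positive examples, and subtract the centroid of the negative ones. A minimal sketch, assuming term vectors are plain dicts; the alpha/beta/gamma defaults are illustrative, not values from the slides:

```python
def rocchio(query, pos_docs, neg_docs, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector (dict: term -> weight).

    Adds new weighted terms from feedback docs and adjusts weights
    of old terms, covering both options on the slide.
    """
    new_q = {t: alpha * w for t, w in query.items()}
    for docs, sign, coef in ((pos_docs, +1, beta), (neg_docs, -1, gamma)):
        if not docs:
            continue
        for doc in docs:
            for t, w in doc.items():
                new_q[t] = new_q.get(t, 0.0) + sign * coef * w / len(docs)
    # Drop terms whose weight fell to zero or below (common practice).
    return {t: w for t, w in new_q.items() if w > 0}
```

For example, positive feedback containing "sunset" adds that term to a query that originally only had "ocean", while boosting the weight of "ocean" itself.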
10. Probability Model
- What is the probability that THIS document is relevant to THIS query?
11. Feedback as Model Interpolation
[Figure: a generative model scores Document D against Query Q to produce Results; the feedback docs F = {d1, d2, ..., dn} are interpolated into the query model]
12. Examples of Information Filtering
- News filtering
- Email filtering
- Movie/book recommenders
- Literature recommenders
- And many others
13. Content-based Filtering vs. Collaborative Filtering
- Basic filtering question: will user U like item X?
- Two different ways of answering it
  - Look at what U likes => characterize X => content-based filtering
  - Look at who likes X => characterize U => collaborative filtering
- Can be combined
14. Adaptive Information Filtering
- Stable, long-term interest; dynamic info source
- System must make a delivery decision immediately as a document arrives
[Figure: arriving documents flow through the Filtering System]
15. Score Distribution Approaches (Arampatzis & van Hameren 01; Zhang & Callan 01)
- Assume a generative model of scores p(s|R), p(s|N)
- Estimate the model with training data
- Find the threshold by optimizing the expected utility under the estimated model
- Specific methods differ in the way of defining and estimating the score distributions
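The threshold search can be sketched as follows. This is a minimal illustration that assumes a normal model for both p(s|R) and p(s|N); the cited papers also use other families (e.g. an exponential model for non-relevant scores), and the credit/penalty utility weights here are made-up defaults:

```python
from statistics import NormalDist, mean, stdev

def choose_threshold(rel_scores, nonrel_scores, credit=3.0, penalty=2.0):
    """Fit score models from training data, then pick the delivery
    threshold that maximizes expected utility under those models."""
    model_r = NormalDist(mean(rel_scores), stdev(rel_scores))      # p(s|R)
    model_n = NormalDist(mean(nonrel_scores), stdev(nonrel_scores))  # p(s|N)
    n_r, n_n = len(rel_scores), len(nonrel_scores)

    def expected_utility(theta):
        # Deliver everything scoring above theta: expected credit for
        # relevant deliveries minus expected penalty for non-relevant ones.
        return (credit * n_r * (1.0 - model_r.cdf(theta))
                - penalty * n_n * (1.0 - model_n.cdf(theta)))

    # Scan the training scores as candidate thresholds.
    return max(sorted(rel_scores + nonrel_scores), key=expected_utility)
```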
16. What is Collaborative Filtering (CF)?
- Making filtering decisions for an individual user based on the judgments of other users
- Inferring an individual's interests/preferences from those of other, similar users
- General idea
  - Given a user u, find similar users u1, ..., um
  - Predict u's preferences based on the preferences of u1, ..., um
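The general idea above (find similar users, then predict from their preferences) can be sketched as user-based CF with cosine similarity over co-rated items. The similarity measure and neighborhood size are illustrative choices, not mandated by the slides:

```python
import math

def predict_rating(ratings, user, item, k=2):
    """ratings: dict user -> dict item -> float.
    Predict `user`'s rating for `item` from the k most similar users."""
    def cosine(a, b):
        common = set(a) & set(b)
        if not common:
            return 0.0
        num = sum(a[i] * b[i] for i in common)
        den = (math.sqrt(sum(a[i] ** 2 for i in common))
               * math.sqrt(sum(b[i] ** 2 for i in common)))
        return num / den if den else 0.0

    # Step 1: find users similar to `user` who have rated `item`.
    neighbors = [(cosine(ratings[user], r), r)
                 for u, r in ratings.items() if u != user and item in r]
    neighbors.sort(key=lambda p: p[0], reverse=True)
    top = [p for p in neighbors[:k] if p[0] > 0]
    if not top:
        return None
    # Step 2: predict as the similarity-weighted average of their ratings.
    return sum(sim * r[item] for sim, r in top) / sum(sim for sim, _ in top)
```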
17. CF Assumptions
- Users with a common interest will have similar preferences
- Users with similar preferences probably share the same interest
- Examples
  - interest is IR => favor SIGIR papers
  - favor SIGIR papers => interest is IR
- A sufficiently large number of user preferences are available
18. Rating-based vs. Preference-based
- Rating-based: users' preferences are encoded using numerical ratings on items
  - Complete ordering
  - Absolute values can be meaningful
  - But values must be normalized to combine
- Preference-based: users' preferences are represented by a partial ordering of items
  - Partial ordering
  - Easier to exploit implicit preferences
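The normalization point for rating-based preferences is usually handled by removing each user's rating bias before combining. A minimal sketch using per-user mean-centering (z-scoring is another common option):

```python
def normalize_user_ratings(ratings):
    """ratings: dict user -> dict item -> float.
    Mean-center each user's ratings so that a habitual 5-star rater and
    a habitual 3-star rater become comparable before combining."""
    out = {}
    for user, items in ratings.items():
        user_mean = sum(items.values()) / len(items)
        out[user] = {item: r - user_mean for item, r in items.items()}
    return out
```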
19. A Formal Framework for Rating
[Figure: a ratings matrix with users U = {u1, u2, ..., ui, ..., um} as rows and objects O = {o1, o2, ..., oj, ..., on} as columns; some cells hold known ratings (e.g. 3, 1.5, 2, 1), others are unknown: X_ij = f(u_i, o_j)?]
The task:
- Assume known f values for some (u,o) pairs
- Predict f values for other (u,o) pairs
- Essentially approximation of the unknown function f: U x O -> R, like other learning problems
20. Optimizing Queries Over Multimedia Repositories
- Surajit Chaudhuri, Hewlett-Packard Laboratories
- Luis Gravano, Stanford University
21. Our objects have multiple complex attributes
- Object: e.g. an image captioned "Sunset on the Atlantic Ocean"
- Attributes: caption, color histogram of image, texture of image [Fagin96]
22. Attribute handling differs from traditional systems
- Attribute values v1, v2 have a grade of match Grade(v1,v2) (between 0 and 1)
- Attributes can only be accessed through indexes
23. Queries have two main components
- SELECT oid
- FROM Repository
- WHERE Filter_Condition
- ORDER k BY Ranking_Expression
24. The filter condition selects the objects that are good enough
- Example: Grade(color_histogram, reddish_ch) > 0.7
- Only objects with images that are red enough are in the answer.
25. The ranking expression orders the objects that are good enough
- Example: Grade(caption, "ocean sunset")
- Objects are sorted based on how well their caption matches the keywords "ocean" and "sunset".
26. Users request the top k objects that are good enough
- SELECT oid
- FROM Repository
- WHERE Grade(color_histogram, reddish_ch) > 0.7
- ORDER 10 BY Grade(caption, "ocean sunset")
27. Filter conditions and ranking expressions can be complex
- A filter condition is built from atomic conditions using Ands and Ors
- A ranking expression is built from atomic expressions using Mins and Maxes
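Under the fuzzy semantics standard in this line of work, And over grades in [0,1] is Min and Or is Max, so a complex expression composes directly. A minimal sketch; the attribute names and grades below are illustrative stand-ins for index lookups:

```python
# Capitalized names avoid clashing with Python's `and`/`or` keywords.
def And(*grades):
    """Fuzzy conjunction over grades in [0, 1]: the minimum."""
    return min(grades)

def Or(*grades):
    """Fuzzy disjunction over grades in [0, 1]: the maximum."""
    return max(grades)

def top_k(objects, grade_fn, k):
    """Rank objects by a composed grade expression; return the top k."""
    return sorted(objects, key=grade_fn, reverse=True)[:k]
```

For example, `lambda o: And(o["color"], Or(o["caption"], o["texture"]))` ranks objects by a conjunction of a color grade with the better of two other grades.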
28. We gain expressivity by having both query components
- Filter conditions plus ranking expressions are more expressive than ranking expressions alone
29. Attributes can only be accessed through indexes
- Indexes support:
  - Search: gets all objects with a minimum grade of match for an attribute value
  - Probe: gets the grade of an object for an attribute value
30. We optimize the processing of the filter condition
- Example: a And b And c
- An execution:
  - Get all objects that satisfy a (through a search)
  - Check if objects satisfy b And c (through probes)
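The search/probe interface and the search-then-probe plan above can be sketched as follows. The in-memory dict backing is a hypothetical stand-in for a real attribute index:

```python
class AttributeIndex:
    """Hypothetical index for one attribute value, exposing the two
    access methods from slide 29: search and probe."""
    def __init__(self, grades):
        self._grades = grades  # oid -> grade of match in [0, 1]

    def search(self, min_grade):
        """Search: all objects with at least the given grade of match."""
        return {oid for oid, g in self._grades.items() if g >= min_grade}

    def probe(self, oid):
        """Probe: the grade of one object for this attribute value."""
        return self._grades.get(oid, 0.0)

def evaluate_and(search_index, probe_indexes, min_grade):
    """Slide 30's plan for `a And b And c`: search on one atomic
    condition, then probe the rest, combining grades with Min."""
    result = {}
    for oid in search_index.search(min_grade):
        grade = min([search_index.probe(oid)]
                    + [ix.probe(oid) for ix in probe_indexes])
        if grade >= min_grade:
            result[oid] = grade
    return result
```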
31. We process the ranking expression as a modified filter condition
- We use selectivity estimates to map a ranking expression into a filter condition
- We process the new filter condition
- We output the top objects for the original ranking expression
32. We propagate k down to the atomic expressions
- Top object for the ranking expression (k = 1): Min(Grade(a1,v1), Grade(a2,v2))
- Objects in repository: 100
- Then, objects needed from each subexpression [Fagin96]: 10
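The slide's numbers match Fagin's expected-cost bound for a Min of m independent atomic expressions: roughly N^((m-1)/m) * k^(1/m) objects are needed from each sorted subexpression stream to find the top k overall. A one-line sketch of that arithmetic:

```python
def objects_needed(n_objects, k, m):
    """Fagin's [Fagin96] expected number of objects needed from each of
    m independent sorted streams to answer a Min-combined top-k query
    over n_objects: about N^((m-1)/m) * k^(1/m)."""
    return n_objects ** ((m - 1) / m) * k ** (1 / m)
```

With k = 1, 100 objects, and m = 2 subexpressions this gives sqrt(100) = 10, as on the slide.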
33. Evaluating Top-K Selection Queries
- Surajit Chaudhuri, Microsoft Research
- Luis Gravano, Columbia University
34. Top-K Queries over Precise Relational Data
- Support approximate matches with minimal changes to the relational engine
- Initial focus: selection queries with equality conditions
35. Specifying Top-K Queries using SQL
- SELECT *
- FROM R
- ORDER k BY Scoring_Function
36. On Saying "Enough Already!" in SQL (Michael J. Carey and Donald Kossmann, SIGMOD '97)
- January 16, 1997
- Jang Ho Park
37. Extending SQL
- SELECT ... FROM ... WHERE ... GROUP BY ... HAVING ... ORDER BY <sort specification list> STOP AFTER <value expression>
- <value expression> evaluates to an integer that specifies the maximum number of result tuples desired.
38. Semantics of STOP AFTER
- With an ORDER BY clause: only the first N tuples in this ordering are returned.
- Without an ORDER BY clause: any N tuples that satisfy the rest of the query are returned.
- If fewer than N tuples are in the result: the STOP AFTER clause has no effect.
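The three cases above can be captured in a few lines. A minimal sketch that emulates STOP AFTER over an in-memory result set (not how a DBMS would implement it, which is the point of the next slide):

```python
def stop_after(rows, n, order_key=None, descending=False):
    """Emulate STOP AFTER n.

    - With an ordering: the first n rows in that order.
    - Without an ordering: any n rows (here, the first n seen).
    - Fewer than n rows: no effect.
    """
    if order_key is not None:
        rows = sorted(rows, key=order_key, reverse=descending)
    return rows[:n]
```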
39. Implications
- It is not hard to extend SQL to allow users to specify a limit on the result size of a query.
- The advantage of extending SQL is that it provides information that the DBMS can exploit during query optimization and execution.
40. Example 1: Query
- SELECT e.name, e.salary, d.name
  FROM Emp e, Dept d
  WHERE e.works_in = d.dno
  ORDER BY e.salary DESC
  STOP AFTER 2
41. Supporting Top-k Join Queries in Relational Databases
- Ihab F. Ilyas, Walid G. Aref, Ahmed K. Elmagarmid
- Department of Computer Sciences, Purdue University
42. Query Example
- SELECT *
- FROM R1, R2, ..., Rm
- WHERE join_condition(R1, R2, ..., Rm)
- ORDER BY f(R1.score, R2.score, ..., Rm.score)
- STOP AFTER k
43. Query 1
- SELECT A.1, B.2
- FROM A, B, C
- WHERE A.1 = B.1 and B.2 = C.2
- ORDER BY (0.3*A.1 + 0.7*B.2)
- STOP AFTER 5