1
Query Paradigms Model and Semantics
  • Sumin Song

2
What are the scenarios?
  • Information retrieval
  • Multimedia
  • Relational database

3
TR vs. Database Retrieval
  • Information
  • Unstructured/free text vs. structured data
  • Ambiguous vs. well-defined semantics
  • Query
  • Ambiguous vs. well-defined semantics
  • Incomplete vs. complete specification
  • Answers
  • Relevant documents vs. matched records
  • TR is an empirically defined problem!

4
Document Selection vs. Ranking
[Figure: document selection vs. document ranking. Selection models f(d, q) as a binary decision in {0, 1}, approximating the true relevant set R(q); ranking models f(d, q) as a real-valued score and returns an ordered list, e.g. 0.98 d1, 0.95 d2, 0.83 d3, 0.80 d4, 0.76 d5, 0.56 d6, 0.34 d7, 0.21 d8, 0.21 d9.]
5
TR System Architecture
[Diagram: TR system architecture. Docs go through INDEXING into a Doc Rep; the user's query becomes a Query Rep; the Ranking component matches the two and returns results to the User.]
6
Relevance Feedback
7
Pseudo/Blind/Automatic Feedback
[Diagram: pseudo/blind/automatic feedback loop. The Retrieval Engine runs the Query over the Document collection and returns scored Results (d1 3.5, d2 2.4, ..., dk 0.5); the top-ranked results are treated as judgments without asking the user, and the Feedback component produces an Updated query.]
8
VS Model illustration
9
Relevance Feedback in VS
  • Basic setting: learn from examples
  • Positive examples: docs known to be relevant
  • Negative examples: docs known to be non-relevant
  • How do you learn from this to improve performance?
  • General method: query modification (sketched below)
  • Adding new (weighted) terms
  • Adjusting weights of old terms
  • Doing both
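A minimal sketch of such query modification in the vector-space model, in the spirit of Rocchio feedback (my illustration; the weighting constants alpha, beta, gamma and the toy vectors are assumptions, not taken from the slides):

```python
from collections import defaultdict

def modify_query(query_vec, pos_docs, neg_docs, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style update: keep old terms, move toward relevant docs, away from non-relevant."""
    new_q = defaultdict(float)
    for term, w in query_vec.items():
        new_q[term] += alpha * w                        # adjust weights of old terms
    for doc in pos_docs:                                # add/boost terms from positive examples
        for term, w in doc.items():
            new_q[term] += beta * w / max(len(pos_docs), 1)
    for doc in neg_docs:                                # push away from negative examples
        for term, w in doc.items():
            new_q[term] -= gamma * w / max(len(neg_docs), 1)
    return {t: w for t, w in new_q.items() if w > 0}    # drop terms with non-positive weight

# Toy example: "tide" is added as a new weighted term, "ocean" gets re-weighted.
print(modify_query({"ocean": 1.0, "sunset": 1.0},
                   pos_docs=[{"ocean": 0.8, "tide": 0.6}],
                   neg_docs=[{"desert": 0.9}]))
```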

10
Probability model
  • What is the probability that THIS document is
    relevant to THIS query?
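In symbols (a standard formulation, not spelled out on the slide), this means ranking documents by the estimated probability of relevance p(R = 1 | Q, D) for the given query Q and document D.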

11
Feedback as Model Interpolation
[Diagram: feedback as model interpolation. A generative model scores each Document D against the Query Q to produce the Results; the Feedback Docs F = {d1, d2, ..., dn} are used to estimate a feedback model that is interpolated with the query model.]
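One common instantiation of this interpolation (my addition; the slide only shows the diagram) updates the query model as θ_Q' = (1 - α)·θ_Q + α·θ_F, where θ_F is a model estimated from the feedback documents F = {d1, d2, ..., dn} and α controls how strongly the feedback influences the updated query.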
12
Examples of Information Filtering
  • News filtering
  • Email filtering
  • Movie/book recommenders
  • Literature recommenders
  • And many others

13
Content-based Filtering vs. Collaborative
Filtering
  • Basic filtering question: will user U like item X?
  • Two different ways of answering it:
  • Look at what U likes => characterize X => content-based filtering
  • Look at who likes X => characterize U => collaborative filtering
  • Can be combined
14
Adaptive Information Filtering
  • Stable, long-term interest; dynamic information source
  • System must make a delivery decision immediately
    as a document arrives

[Diagram: Filtering System]

15
Score Distribution Approaches (Arampatzis & van Hameren '01; Zhang & Callan '01)
  • Assume a generative model of scores: p(s|R), p(s|N)
  • Estimate the model with training data
  • Find the threshold by optimizing the expected utility under the estimated model
  • Specific methods differ in how they define and estimate the score distributions
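A rough sketch of this approach (my illustration; the Gaussian/exponential forms, the utility weights, and the stream-size estimates are assumptions rather than details taken from the cited papers):

```python
import math
import statistics

def fit_models(rel_scores, nonrel_scores):
    """Estimate p(s|R) as a Gaussian and p(s|N) as an exponential (assumed forms)."""
    mu = statistics.mean(rel_scores)
    sigma = statistics.pstdev(rel_scores) or 1e-6
    lam = 1.0 / max(statistics.mean(nonrel_scores), 1e-6)
    surv_rel = lambda t: 0.5 * (1.0 - math.erf((t - mu) / (sigma * math.sqrt(2))))  # P(s >= t | R)
    surv_non = lambda t: math.exp(-lam * max(t, 0.0))                               # P(s >= t | N)
    return surv_rel, surv_non

def best_threshold(rel_scores, nonrel_scores, n_rel=10, n_non=1000,
                   credit=3.0, penalty=1.0):
    """Pick the delivery threshold maximizing expected utility under the fitted model.

    n_rel / n_non are the expected numbers of relevant and non-relevant
    documents in the incoming stream (assumed values for illustration).
    """
    surv_rel, surv_non = fit_models(rel_scores, nonrel_scores)
    candidates = sorted(set(rel_scores + nonrel_scores))
    utility = lambda t: credit * n_rel * surv_rel(t) - penalty * n_non * surv_non(t)
    return max(candidates, key=utility)

# Toy training data: scores of judged relevant vs. non-relevant documents.
print(best_threshold([0.9, 0.8, 0.7], [0.6, 0.4, 0.3, 0.2]))
```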

16
What is Collaborative Filtering (CF)?
  • Making filtering decisions for an individual user
    based on the judgments of other users
  • Inferring an individual's interests/preferences from those of other, similar users
  • General idea:
  • Given a user u, find similar users u1, ..., um
  • Predict u's preferences based on the preferences of u1, ..., um

17
CF Assumptions
  • Users with a common interest will have similar
    preferences
  • Users with similar preferences probably share the
    same interest
  • Examples:
  • interest in IR => favors SIGIR papers
  • favors SIGIR papers => interest in IR
  • A sufficiently large number of user preferences is available

18
Rating-based vs. Preference-based
  • Rating-based: users' preferences are encoded as numerical ratings of items
  • Complete ordering
  • Absolute values can be meaningful
  • But, values must be normalized to combine
  • Preference-based: users' preferences are represented by a partial ordering of items
  • Partial ordering
  • Easier to exploit implicit preferences

19
A Formal Framework for Rating
[Diagram: the user-object rating matrix. Users U = {u1, u2, ..., ui, ..., um} index the rows and Objects O = {o1, o2, ..., oj, ..., on} index the columns; each known entry is a rating X_ij = f(u_i, o_j) (e.g. 3, 1.5, 2, 1), and most entries are unknown. The unknown function is f: U × O → R.]
The task
  • Assume f values are known for some (u, o) pairs
  • Predict f values for the other (u, o) pairs
  • Essentially function approximation, like other learning problems
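A neighborhood-style sketch of this prediction task (my illustration; the similarity measure, toy ratings, and weighting are assumptions, not a method from the slides): find users similar to u and predict the missing entry as a similarity-weighted average of their ratings.

```python
import math

def cosine(r1, r2):
    """Cosine similarity between two users' ratings over their co-rated objects."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[o] * r2[o] for o in common)
    n1 = math.sqrt(sum(r1[o] ** 2 for o in common))
    n2 = math.sqrt(sum(r2[o] ** 2 for o in common))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def predict(ratings, user, obj):
    """Predict f(user, obj) from similar users who have rated obj."""
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or obj not in r:
            continue
        sim = cosine(ratings[user], r)
        num += sim * r[obj]
        den += abs(sim)
    return num / den if den else None  # None: no similar user has rated obj

# Toy rating matrix X_ij = f(u_i, o_j); most entries are unknown.
ratings = {
    "u1": {"o1": 3.0, "o2": 1.5, "o3": 2.0},
    "u2": {"o1": 3.0, "o3": 2.0},
    "u3": {"o2": 1.0, "o3": 3.0},
}
print(predict(ratings, "u2", "o2"))  # predict the unknown rating of u2 for o2
```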
20
Optimizing Queries Over Multimedia Repositories
  • Surajit Chaudhuri
  • Hewlett-Packard Laboratories
  • Luis Gravano
  • Hewlett-Packard Laboratories
  • Stanford University

21
Our objects have multiple complex attributes
  • Object
  • Attributes
  • caption, color histogram of image, texture of image [Fagin96]

Example object: an image with the caption "Sunset on the Atlantic Ocean"
22
Attribute handling differs from traditional
systems
  • Attribute values v1,v2 have a grade of match
    Grade(v1,v2) (between 0 and 1)
  • Attributes can only be accessed through indexes

23
Queries have two main components
  • SELECT oid
  • FROM Repository
  • WHERE Filter_Condition
  • ORDER k BY Ranking_Expression

24
The filter condition selects the objects that are "good enough"
  • Example:
  • Grade(color_histogram, reddish_ch) > 0.7
  • Only objects with images that are "red enough" are in the answer.

25
The ranking expression orders the objects that are "good enough"
  • Example:
  • Grade(caption, "ocean sunset")
  • Objects are sorted based on how well their caption matches the keywords "ocean" and "sunset".

26
Users request the top k objects that are "good enough"
  • SELECT oid
  • FROM Repository
  • WHERE Grade(color_histogram, reddish_ch) > 0.7
  • ORDER 10 BY Grade(caption, "ocean sunset")

27
Filter conditions and ranking expressions can be
complex
  • A filter condition is built from atomic conditions using ANDs and ORs
  • A ranking expression is built from atomic expressions using Mins and Maxes (see the sketch below)
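A small sketch of how such composite expressions could be evaluated once all grades of match are in hand (my illustration; the attribute names, thresholds, and grade values are invented):

```python
# Per-object grades of match for two atomic expressions (made-up values).
grades = {
    "o1": {"color": 0.9, "caption": 0.4},
    "o2": {"color": 0.8, "caption": 0.7},
    "o3": {"color": 0.5, "caption": 0.95},
}

def filter_condition(g):
    # Two atomic conditions combined with AND: grade thresholds on color and caption.
    return g["color"] > 0.7 and g["caption"] > 0.3

def ranking_expression(g):
    # Ranking expression: Min of the two atomic grades (Max could be used instead).
    return min(g["color"], g["caption"])

k = 2
qualifying = [(oid, ranking_expression(g)) for oid, g in grades.items() if filter_condition(g)]
top_k = sorted(qualifying, key=lambda pair: pair[1], reverse=True)[:k]
print(top_k)  # [('o2', 0.7), ('o1', 0.4)]
```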

28
We gain expressivity by having both query
components
  • Filter conditions + ranking expressions are more expressive than ranking expressions alone

29
Attributes can only be accessed through indexes
  • Indexes support:
  • Search: gets all objects with a minimum grade of match for an attribute value
  • Probe: gets the grade of an object for an attribute value
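A toy, in-memory stand-in for this index interface (an assumption for illustration; a real system would use text or multidimensional indexes):

```python
class GradeIndex:
    """Toy index over one attribute: maps object id -> grade of match for a fixed query value."""

    def __init__(self, grades):
        self._grades = dict(grades)          # oid -> grade in [0, 1]

    def search(self, min_grade):
        """Search: all objects whose grade of match is at least min_grade."""
        return {oid: g for oid, g in self._grades.items() if g >= min_grade}

    def probe(self, oid):
        """Probe: the grade of match of one specific object."""
        return self._grades.get(oid, 0.0)

color_index = GradeIndex({"o1": 0.9, "o2": 0.8, "o3": 0.5})
print(color_index.search(0.7))   # {'o1': 0.9, 'o2': 0.8}
print(color_index.probe("o3"))   # 0.5
```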

30
We optimize the processing of the filter condition
  • Example: a AND b AND c
  • An execution:
  • Get all objects that satisfy a (through a search)
  • Check if those objects satisfy b AND c (through probes)
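A sketch of that execution over toy grade tables (my illustration; the thresholds and grades are invented): one search on the first conjunct yields candidates, and probes check the remaining conjuncts.

```python
# Toy grade tables for atomic conditions a, b, c (oid -> grade of match).
grades_a = {"o1": 0.9, "o2": 0.8, "o3": 0.4}
grades_b = {"o1": 0.6, "o2": 0.9, "o3": 0.9}
grades_c = {"o1": 0.8, "o2": 0.2, "o3": 0.7}

T_A, T_B, T_C = 0.7, 0.5, 0.5     # grade thresholds of the atomic conditions

# Step 1: search on a -- fetch every object that satisfies a.
candidates = [oid for oid, g in grades_a.items() if g >= T_A]

# Step 2: probe b and c for each candidate and keep those satisfying b AND c.
answer = [oid for oid in candidates
          if grades_b.get(oid, 0.0) >= T_B and grades_c.get(oid, 0.0) >= T_C]
print(answer)  # ['o1']
```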

31
We process the ranking expression as a modified
filter condition
  • We use selectivity estimates to map a ranking
    expression into a filter condition
  • We process the new filter condition
  • We output the top objects for the original
    ranking expression
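Concretely (my paraphrase of the strategy, not wording from the slides): a ranking expression such as Grade(caption, "ocean sunset") is rewritten as a condition Grade(caption, "ocean sunset") > t, where the threshold t is chosen from selectivity estimates so that roughly the top k objects are expected to satisfy it; if the estimate turns out too aggressive and fewer than k objects qualify, the threshold can be lowered and the condition re-evaluated.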

32
We propagate k down to the atomic expressions
  • Top object for the ranking expression (k = 1): Min(Grade(a1,v1), Grade(a2,v2))
  • Objects in repository: 100
  • Then, objects needed from each subexpression [Fagin96]: 10
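This count follows the cost result in [Fagin96] (quoted here from memory, so treat it as a sketch): for a Min over m independent atomic expressions on N objects, each subexpression needs to contribute on the order of N^((m-1)/m) · k^(1/m) objects; with N = 100, m = 2, and k = 1 this is sqrt(100 · 1) = 10.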

33
Evaluating Top-K Selection Queries
  • Surajit Chaudhuri, Microsoft Research
  • Luis Gravano, Columbia University

34
Top-K Queries over Precise Relational Data
  • Support approximate matches with minimal changes
    to the relational engine
  • Initial focus: selection queries with equality conditions

35
Specifying Top-K Queries using SQL
  • SELECT *
  • FROM R
  • ORDER k BY Scoring_Function
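For reference, the naive semantics of such a query can be sketched as a full scan that keeps the k best-scoring tuples (my illustration; the relation, target tuple, and scoring function are invented, and the paper's point is precisely to avoid this full scan):

```python
import heapq

def top_k(rows, scoring_function, k):
    """Return the k rows with the highest score (naive full scan)."""
    return heapq.nlargest(k, rows, key=scoring_function)

# Toy relation R and an assumed scoring function rewarding closeness to a target tuple.
R = [{"price": 10, "rating": 4.5}, {"price": 25, "rating": 4.9}, {"price": 12, "rating": 3.0}]
target = {"price": 12, "rating": 4.5}
score = lambda r: -(abs(r["price"] - target["price"]) + abs(r["rating"] - target["rating"]))
print(top_k(R, score, k=2))
```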

36
On Saying "Enough Already!" in SQL
Michael J. Carey and Donald Kossmann, SIGMOD '97
  • January 16, 1997
  • Jang Ho Park

37
Extending SQL
  • SELECT ... FROM ... WHERE ... GROUP BY ... HAVING ... ORDER BY <sort specification list> STOP AFTER <value expression>
  • <value expression> evaluates to an integer that specifies the maximum number of result tuples desired.

38
Semantics of a STOP AFTER
  • With an ORDER BY clause:
  • Only the first N tuples in this ordering are returned.
  • Without an ORDER BY clause:
  • Any N tuples that satisfy the rest of the query are returned.
  • If there are fewer than N tuples in the result:
  • The STOP AFTER clause has no effect.
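These semantics can be phrased operationally in a short sketch (mine, not the paper's): sort when an ORDER BY is present, then cut the result to N tuples.

```python
def stop_after(tuples, n, order_key=None, descending=False):
    """Return at most n result tuples, honoring an optional ORDER BY."""
    if order_key is not None:
        tuples = sorted(tuples, key=order_key, reverse=descending)  # ORDER BY first
    return tuples[:n]  # if fewer than n tuples exist, STOP AFTER has no effect

rows = [("Alice", 90), ("Bob", 120), ("Carol", 110)]
print(stop_after(rows, 2, order_key=lambda r: r[1], descending=True))  # top-2 by salary
print(stop_after(rows, 5))  # fewer than 5 tuples: all of them come back
```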

39
Implications
  • Merely extending SQL to let users specify a limit on the result size of a query is not hard.
  • The advantage of extending SQL is that it
    provides information that the DBMS can exploit
    during query optimization and execution.

40
Example 1 Query
  • SELECT e.name, e.salary, d.name
    FROM Emp e, Dept d
    WHERE e.works_in = d.dno
    ORDER BY e.salary DESC
    STOP AFTER 2

41
Supporting Top-k Join Queries in Relational Databases
Ihab F. Ilyas, Walid G. Aref, Ahmed K. Elmagarmid
Department of Computer Sciences, Purdue University
42
Query Example
  • SELECT *
  • FROM R1, R2, ..., Rm
  • WHERE join_condition(R1, R2, ..., Rm)
  • ORDER BY f(R1.score, R2.score, ..., Rm.score)
  • STOP AFTER k

43
Query 1
  • SELECT A.1, B.2
  • FROM A, B, C
  • WHERE A.1 = B.1 AND B.2 = C.2
  • ORDER BY (0.3*A.1 + 0.7*B.2)
  • STOP AFTER 5
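For intuition, a naive evaluation of Query 1 (my sketch with made-up data; the point of the paper is to produce the top tuples without materializing and sorting the full join):

```python
import heapq
from itertools import product

# Toy relations; each tuple carries only the attributes used in the query.
A = [{"1": a} for a in (0.9, 0.5, 0.7)]
B = [{"1": 0.9, "2": 0.8}, {"1": 0.5, "2": 0.6}, {"1": 0.7, "2": 0.4}]
C = [{"2": c} for c in (0.8, 0.6, 0.4)]

joined = [(a, b, c) for a, b, c in product(A, B, C)
          if a["1"] == b["1"] and b["2"] == c["2"]]        # WHERE A.1 = B.1 AND B.2 = C.2
score = lambda t: 0.3 * t[0]["1"] + 0.7 * t[1]["2"]        # ORDER BY 0.3*A.1 + 0.7*B.2
top5 = heapq.nlargest(5, joined, key=score)                # STOP AFTER 5
for a, b, c in top5:
    print(a["1"], b["2"], round(score((a, b, c)), 2))      # SELECT A.1, B.2
```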