Title: CoBase: Scalable and Extensible Cooperative Information System
1CoBase: Scalable and Extensible Cooperative Information System
Wesley W. Chu, Computer Science Department, University of California, Los Angeles
http://www.cobase.cs.ucla.edu
2Conventional Query Answering
- Need to know the detailed database schema
- Cannot get approximate answers
- Cannot answer conceptual queries
Cooperative Query Answering
- Derive approximate answers
- Answer conceptual queries
3Cooperative Queries
4Generalization and Specialization
5Type Abstraction Hierarchy (TAH)
Provide multi-level knowledge representations
Chemical-Suit Size TAH (a non-numerical TAH)
- Root: All_Sizes
- Level 1: Small_Size, Large_Size
- Level 2: Very_Small, Small_to_Medium, Large_to_Extra_Large, Very_Large
- Leaves: XXXS, XXS, S, M, L, XL, XXL
6Type Abstraction Hierarchy (TAH)
(Location Example)
7Relaxation Agent
Uses a knowledge-based approach (generalization
and specialization via the Type Abstraction
Hierarchy) to relax the following for matching:
- query conditions
- constraints
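A minimal sketch of generalization and specialization over a TAH, using the chemical-suit size concepts from the earlier slide. The parent of each node is an assumption made for illustration (the slide text does not preserve the edges), and the function names are hypothetical rather than CoBase APIs.

```python
# Assumed child -> parent edges for the chemical-suit size TAH.
# Which sizes fall under which concept is an illustrative guess.
PARENT = {
    "Small_Size": "All_Sizes", "Large_Size": "All_Sizes",
    "Very_Small": "Small_Size", "Small_to_Medium": "Small_Size",
    "Large_to_Extra_Large": "Large_Size", "Very_Large": "Large_Size",
    "XXXS": "Very_Small", "XXS": "Very_Small",
    "S": "Small_to_Medium", "M": "Small_to_Medium",
    "L": "Large_to_Extra_Large", "XL": "Large_to_Extra_Large",
    "XXL": "Very_Large",
}


def generalize(node, levels=1):
    """Move up the hierarchy by `levels` steps, stopping at the root."""
    for _ in range(levels):
        node = PARENT.get(node, node)
    return node


def specialize(node):
    """Return all leaf values underneath `node` (or the node itself if it is a leaf)."""
    children = sorted(c for c, p in PARENT.items() if p == node)
    if not children:
        return [node]
    leaves = []
    for child in children:
        leaves.extend(specialize(child))
    return leaves


def relax(value, levels=1):
    """Relax an exact value: generalize it, then specialize the concept reached."""
    return specialize(generalize(value, levels))
```

Under these assumed edges, `relax("M")` yields the sizes under Small_to_Medium, and each additional level widens the set until it covers all sizes.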
8Query Relaxation
10Visualization of Relaxation Process
Query: Find seaports in the given region.
(Figure labels: given region, relaxed region)
12Relaxation Control Primitives
- not-relaxable
- runway-length
- relaxation-order
- (runway length, location)
- preference-list
- unacceptable-list
- answer-size
- relaxation-level
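The way such control primitives could steer relaxation can be sketched as a loop over range conditions. This is an illustration under stated assumptions, not the CoBase implementation: conditions are modeled as numeric (low, high) ranges, each relaxation widens a range by a fixed step, and the function, parameter names, and airport rows are all hypothetical.

```python
def matches(row, conditions):
    """True if every attribute of the row falls inside its (low, high) range."""
    return all(lo <= row[attr] <= hi for attr, (lo, hi) in conditions.items())


def relax_with_controls(rows, conditions, relaxation_order, not_relaxable,
                        answer_size, step, relaxation_level):
    """Widen conditions in the given order until enough answers are found.

    relaxation_order: attributes in the order they may be relaxed
    not_relaxable:    attributes that must keep their original range
    answer_size:      stop as soon as this many answers are available
    relaxation_level: maximum number of passes over relaxation_order
    """
    conditions = dict(conditions)  # do not modify the caller's query
    answers = [r for r in rows if matches(r, conditions)]
    for _ in range(relaxation_level):
        for attr in relaxation_order:
            if len(answers) >= answer_size:
                return answers, conditions
            if attr in not_relaxable:
                continue
            lo, hi = conditions[attr]
            conditions[attr] = (lo - step[attr], hi + step[attr])
            answers = [r for r in rows if matches(r, conditions)]
    return answers, conditions


# Made-up rows for illustration only.
AIRPORTS = [
    {"name": "A", "runway_length": 8000, "distance": 10},
    {"name": "B", "runway_length": 7500, "distance": 12},
    {"name": "C", "runway_length": 8000, "distance": 40},
    {"name": "D", "runway_length": 6000, "distance": 5},
]
QUERY = {"runway_length": (8000, 9000), "distance": (0, 20)}
STEP = {"runway_length": 500, "distance": 10}

fixed_runway, _ = relax_with_controls(
    AIRPORTS, QUERY, ["distance", "runway_length"], {"runway_length"},
    answer_size=2, step=STEP, relaxation_level=3)
free_runway, _ = relax_with_controls(
    AIRPORTS, QUERY, ["distance", "runway_length"], set(),
    answer_size=2, step=STEP, relaxation_level=3)
```

With runway length marked not-relaxable, only the distance range widens; without that control, the runway range is relaxed as soon as one distance step fails to produce enough answers.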
13Relaxation Primitives
- (approximate)
- 9 am
- between
- near-to (context-sensitive)
- Airport near-to LAX
- Restaurant near-to UCLA
- similar-to
- Airport similar-to LAX based-on (traffic, runway)
- within
14Similar-to
Find all airports in Tunisia similar to the
Bizerte airport based on runway length and (more
importantly) runway width.
select aport_name, runway_length, runway_width
from runways, countries
where aport_name similar-to 'Bizerte'
based-on ((runway_length 1.0) (runway_width 2.0))
and country_state_name = 'Tunisia'
and countries.glc_cd = runways.glc_cd
15Similar-to Result
Similar-to module ranks the returned answers
according to mean-squared error.
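A ranking by weighted mean-squared error could look like the sketch below. The slide only states that answers are ranked by mean-squared error; dividing each difference by the attribute's spread (so that feet of length and feet of width are comparable) is an assumption added here, and the airport values are invented for illustration.

```python
def weighted_mse(target, candidate, weights, spread):
    """Weighted mean of squared, spread-normalized attribute differences."""
    total = sum(
        w * ((candidate[attr] - target[attr]) / spread[attr]) ** 2
        for attr, w in weights.items()
    )
    return total / sum(weights.values())


def rank_similar(target, candidates, weights):
    """Return candidates ordered from most to least similar to the target."""
    spread = {}
    for attr in weights:
        values = [c[attr] for c in candidates] + [target[attr]]
        # Fall back to 1.0 when all values are equal, to avoid dividing by zero.
        spread[attr] = (max(values) - min(values)) or 1.0
    return sorted(candidates,
                  key=lambda c: weighted_mse(target, c, weights, spread))


# Weights follow the slide: runway width counts twice as much as length.
WEIGHTS = {"runway_length": 1.0, "runway_width": 2.0}
TARGET = {"runway_length": 8000, "runway_width": 150}
CANDIDATES = [  # invented values
    {"name": "P", "runway_length": 8000, "runway_width": 100},
    {"name": "Q", "runway_length": 7000, "runway_width": 150},
    {"name": "R", "runway_length": 8000, "runway_width": 150},
]
ranked = [c["name"] for c in rank_similar(TARGET, CANDIDATES, WEIGHTS)]
```

Because width carries the larger weight, candidate P (same length, different width) ranks below candidate Q (different length, same width).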
16Unacceptable List Operator
Figure labels: Constraint; CoBase Relaxation Manager
Type Abstraction Hierarchy: Tunisia; Central Tunisia, SW Tunisia, NE Tunisia, NW Tunisia, ...; Gafsa, El Borma, Bizerte
Trimmed TAH: Tunisia; Central Tunisia, SW Tunisia; El Borma, Gafsa
17TAH Generation for Numerical Attribute Values
- Relaxation Error
- Difference between the exact value and the
returned approximate value
- The expected error is weighted by the probability
of occurrence of each value
- DISC (Distribution Sensitive Clustering) is based
on the attribute values and frequency
distribution of the data
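The slide does not give a formula, so the following is a sketch under an assumed formulation: the relaxation error of a cluster is taken as the expected absolute difference between two values drawn from it, weighted by their frequencies, and a binary cut is chosen to minimize the size-weighted error of the two sub-clusters. Function names are hypothetical, and this is not claimed to be the exact DISC algorithm.

```python
from collections import Counter


def relaxation_error(values):
    """Expected |x_i - x_j| when both values are drawn by frequency."""
    counts = Counter(values)
    n = len(values)
    return sum(
        (ci / n) * (cj / n) * abs(xi - xj)
        for xi, ci in counts.items()
        for xj, cj in counts.items()
    )


def best_binary_cut(values):
    """Return (error, cut) for the split `v < cut` / `v >= cut` with least error.

    Returns None when the values cannot be split (fewer than two distinct values).
    """
    distinct = sorted(set(values))
    n = len(values)
    best = None
    for cut in distinct[1:]:
        left = [v for v in values if v < cut]
        right = [v for v in values if v >= cut]
        error = (len(left) / n) * relaxation_error(left) \
            + (len(right) / n) * relaxation_error(right)
        if best is None or error < best[0]:
            best = (error, cut)
    return best


# Invented sample: two natural groups, {1, 1, 2} and {10, 11}.
SAMPLE = [1, 1, 2, 10, 11]
cut_error, cut_value = best_binary_cut(SAMPLE)
```

Applying the cut recursively to each sub-cluster would give a multi-level hierarchy; frequent values pull the cuts toward dense regions, which is the sense in which the clustering is sensitive to the distribution.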
18TAH Generation for Non-numerical Attribute Values
- Pattern Based Knowledge Induction (PBKI)
- Rule-based approach
- Clusters attribute values into TAH based on other
attributes in the relation (i.e.,
Inter-Attributes Relationships)
- Provides an attribute correlation value (a measure of how
well the rules apply to the database)
19Type Abstraction Hierarchy (TAH)
Provide multi-level knowledge representations
20Associative Query Answering
Provide relevant information not explicitly asked
by the user.
User query: List all airports with runway length between 8,500 and approximately 10,000 feet.
21CoBase and GLAD Integration
22CoBase Functionality
- Provide approximate matching
- Find HETs with a capacity of approximately 5 tons
- Provide conceptual query answering
- Find Earth Moving Equipment
- Provide content-sensitive spatial queries
- Find storage sites near selected location
- (Integration with MATT map server)
- Provide relaxation control
- Relaxation order
- Not-relaxable
- At-least (answer set, quantity on hand)
23Cooperative Operations Added to GLAD
- Implicit Query Relaxation
- Explicit Query Relaxation
- Approximate operator
- Similar-to/based-on
- Spatial relaxation
- Relaxation Control
- Relaxation-order
- Not-relaxable
- At-least (answer-set size, quantity on hand)
24CoBase Features Added to GLAD
- Enhance GLAD queries with cooperative operators
(similar-to, relaxation-order, etc.)
- Display the query relaxation process
- modified query conditions (value, spatial)
- type abstraction hierarchies
- Rank returned answers with similarity measures
- e.g., spatial relaxation ranks answers according
to their distance from the selected location
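Ranking answers by distance from a selected location can be sketched with the haversine great-circle formula. The choice of haversine, the site records, and the function names are assumptions for illustration; the slides do not say how CoBase or the map server computes distance.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def rank_by_distance(sites, selected):
    """Order sites from nearest to farthest from the selected (lat, lon)."""
    lat, lon = selected
    return sorted(sites, key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))


# Invented storage sites around a selected location at (0, 0).
SITES = [
    {"name": "far", "lat": 0.0, "lon": 5.0},
    {"name": "near", "lat": 0.0, "lon": 1.0},
    {"name": "mid", "lat": 2.0, "lon": 0.0},
]
ranked_sites = [s["name"] for s in rank_by_distance(SITES, (0.0, 0.0))]
```

A spatial relaxation step would enlarge the search region and then present the additional sites in this order.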
25CoBase and GLAD TIE
Report Collection
Spatial Area Selection
Filter Editor
Display Generator
Query Collection
NSNs
Object Cache
Report Query Constructor
CoBase Query Editor
CoBase Relaxation Manager
GLAD
Data Cache
CoBase Data Source Manager
Databases
26GLAD Query
Find NSNs of aircraft with passenger capacity
10, combat type 'I', capacity weight and price
capacity_qty, capacity_wt_ston
from nsn_description
where (upper(class) = '7'
and upper(cbs_category_nomen) = 'AIRCRAFT'
and price 10
and upper(combat_type) = 'I'
and capacity_wt_ston
27CoGLAD Query with Relaxation Control Operators
Find NSNs of aircraft with passenger capacity
10, combat type 'I', capacity weight and price capacity is not relaxable. Relax price first and
then capacity weight.
select nsn, price, pax_capacity_qty, capacity_wt_ston
from nsn_description
where (upper(class) = '7'
and upper(cbs_category_nomen) = 'AIRCRAFT'
and price pax_capacity_qty 10
and upper(combat_type) = 'I'
and capacity_wt_ston ax_capacity_qty
relaxation-order price capacity_wt_ston
28CoGLAD Query with Similar-to Operator
Find aircraft similar to NSN '0000IB0000961'
based on the attributes price, passenger capacity,
and air mileage. Passenger capacity has a weight
of 8; price and air mileage each have a
weight of 1.
select nsn
from nsn_description
where upper(nsn) similar-to '0000IB0000961'
based-on ((price 1.0) (pax_capacity_qty 8.0) (air_mileage 1.0))
at-least 4
('0000IB0000961' is an answer from the previous query.)
29CoGLAD Query with Approximate Operator
Find DLA stock reports with NSN like 8340 (FSC
for tents and tarpaulins) and an on-hand quantity of
approximately 150.
select nsn, ric
from dla_stock_report
where nsn like 8340
and on_hand_quantity 150
30Adding Constraints to a Query
GLAD query:
select nsn, ric
from dla_stock_report
where nsn like 8340
and nomenclature like TARP
Query with added constraints:
select nsn, ric
from dla_stock_report
where nsn like 8340
and nomenclature like TARP
and on_hand_quantity 150
and size_in_square_feet 350
31Example of Spatial Relaxation
32Spatial Relaxation with Relaxation Control
- relaxation-order: size, (latitude, longitude)
- not-relaxable: price
- at-least
- value: size of the tarpaulin
- quantity on hand: relax until enough quantity on
hand (specified by the user) is obtained
33Scalable and Extensible CoBase Architecture
34Mediator Inter-Communications via KQML
CoBase Ontology
Module Objects APIs
Content Language Data Actions
36Query Answers Without CoBase
Query: find chemical suits
43Electronic Warfare
- Identify and locate sources of radiated
electromagnetic energy
- Determine emitter type based on the operating
parameters of observed signals
- Radio Frequency (RF)
- Pulse Repetition Frequency (PRF)
- Pulse Duration (PD)
- Scan Period (SP)
- other operating parameters
- Determine platform sites near the line of the
bearing of an emitter
This research is a joint effort between CoBase
and Lockheed Martin Communication Systems (Russ
Frew, et al.), Camden, NJ
44Performance Improvement by Using CoBase in EW
Conventional DB: parameter ranges from emitter specifications
CoBase DB: peak parameters (RF, PRF) and parameter ranges (PD, SP)
KB: TAHs based on RF and PRF peak parameters;
TAHs based on PD and SP parameter ranges
Case 1: emitter signals without noise
Case 2: add noise - PD, SP (10), PRF (5), RF (2.5)
Sample size: 1000 signals
Emitter types: 75
45Current CoBase Users and Applications
47XML Query Relaxation
48Outline
- Motivations
- Relaxation Systems
- XML Overview
- XML Query Languages
- Relaxation Types
- CoXML
- X-TAH
49Motivations
- XML (eXtensible Markup Language) is becoming the
standard for structured documents and for
sharing data on the web
- The use of XML as a data source has the following
properties that make query relaxation essential
- The schema of an XML model can be very large.
Users of such a system cannot be required to know
the entire schema.
- On the web there are many heterogeneous data
sources. Users cannot be expected to know the
structure of each source separately.
50Relaxation Systems
- CoSQL
- Transfer XML to relational tables
- Cooperatively answer queries on the relational
system using generated TAHs.
- CoXML
- Answer queries directly on the XML data using
X-TAHs.
51XML Overview
- XML (eXtensible Markup Language) is a format for
specifying structured documents and data.
- XML is extensible since it allows users to define
their own schema (unlike HTML which is a
pre-defined markup language).
52XML (cont.)
- XML is a hierarchical data model.
- An XML document consists of two parts
- Schema
- Data
- The schema describes the structure of the data.
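- Example
Schema:
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Data:
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>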
53XML Query Languages
- XML can be represented as an ordered tree with
- Nodes representing elements and attributes
- Edges representing inclusion relationships
- An XML query can similarly be represented as a
tree with edges of two types
- / for parent-child relationships
- // for ancestor-descendant relationships
54XML Query Language Example
- The following XML
<a>
  <d>
    <b/>
  </d>
  <c/>
</a>
- Yields the following tree (node id, tag; indentation shows nesting)
1,a
  2,d
    3,b
  4,c
- A possible query is
1,a
  2,b
  3,c
55Query Relaxation
- XML Query Relaxation can be categorized into two
main types
- Value relaxation: values are relaxed to expand
the range they are allowed to take
- Structure relaxation: the structure of the query
tree is relaxed to allow for more answers
56Value Relaxation
- In value relaxation, the scope of a value is
expanded to allow additional answers to be
returned by a query.
- For example,
- find a person with salary of 50K to 55K
- Can be relaxed to find a person with salary of
45K to 60K
57Structure Relaxation
- In structure relaxation nodes and/or edges of the
query tree can be relaxed to allow for more
answers.
- There are three types of structural relaxation
- Edge Relaxation
- Node Relaxation
- Order Relaxation
58Edge Relaxation
- A parent-child edge can be relaxed to an
ancestor-descendant edge.
- For example (node id, tag; indentation shows nesting)
1,a
  2,b
    3,d
  4,b
    5,d
      6,c
  7,b
    8,c
  9,d
    10,b
      11,c
  12,d
    13,b
      14,d
        15,c
- Original query a/b/c → 1,7,8
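Edge relaxation on the example tree can be tried with Python's standard `xml.etree.ElementTree`, whose limited XPath support distinguishes child steps (`/`) from descendant steps (`//`). This is a sketch: the document below encodes the slide's 15-node tree with node ids read in document order, the `id` attributes on the `c` nodes are added only to report which nodes matched, and the relaxed queries shown are illustrative choices.

```python
import xml.etree.ElementTree as ET

# The slide's example tree; ids follow document order (root a is node 1).
DOC = """
<a>
  <b><d/></b>
  <b><d><c id="6"/></d></b>
  <b><c id="8"/></b>
  <d><b><c id="11"/></b></d>
  <d><b><d><c id="15"/></d></b></d>
</a>
"""
root = ET.fromstring(DOC)


def matched_ids(path):
    """Ids of the c nodes selected by an ElementTree path evaluated at the root a."""
    return sorted(int(e.get("id")) for e in root.findall(path))


exact = matched_ids("./b/c")            # a/b/c: both edges parent-child
relaxed_last = matched_ids("./b//c")    # a/b//c: last edge relaxed
relaxed_both = matched_ids(".//b//c")   # a//b//c: both edges relaxed
```

The exact query reaches only the `c` with id 8, matching the 1,7,8 answer on the slide; each relaxed edge can only add answers, never remove them.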
59Node Relaxation
- Nodes can be relaxed in several ways
- A node can be relabeled with a similar tag name
based on the domain knowledge.
- For example article/sec → article/section
- A node can be replaced with a "don't care" such
that it will match any non-null answer.
- For example /a/b/c → /a/_/c
- A node can be removed while ensuring the
superset property.
- For example a/b/c → a/b
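The three node relaxations can be sketched as rewrites of a path held as a list of tag names, evaluated with `xml.etree.ElementTree`. The synonym table stands in for domain knowledge and, like the helper names and sample documents, is hypothetical; ElementTree's `*` plays the role of the "don't care" node.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag synonyms standing in for a domain ontology.
SYNONYMS = {"sec": ["section"]}


def relabel(steps, i):
    """One relaxed path per known synonym of the tag at position i."""
    return [steps[:i] + [alt] + steps[i + 1:] for alt in SYNONYMS.get(steps[i], [])]


def wildcard(steps, i):
    """Replace the tag at position i with a don't-care step."""
    return steps[:i] + ["*"] + steps[i + 1:]


def drop_leaf(steps):
    """Remove the last step, so the shorter path matches a superset of parents."""
    return steps[:-1]


def run(root, steps):
    """Evaluate a parent-child path whose first step names the root element."""
    if root.tag != steps[0]:
        return []
    if len(steps) == 1:
        return [root]
    return root.findall("./" + "/".join(steps[1:]))


article = ET.fromstring(
    "<article><sec>A</sec><section>B</section><abs>C</abs></article>")
tree = ET.fromstring("<a><b><c>1</c></b><x><c>2</c></x></a>")

original = [e.text for e in run(article, ["article", "sec"])]
relabeled = [e.text for p in relabel(["article", "sec"], 1) for e in run(article, p)]
dont_care = [e.text for e in run(tree, wildcard(["a", "b", "c"], 1))]
shortened = [e.tag for e in run(tree, drop_leaf(["a", "b", "c"]))]
```

Relabeling follows the synonym to reach content tagged `section`, the wildcard accepts any intermediate tag, and dropping the leaf returns every `b` under `a` whether or not it contains a `c`.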
60Order Relaxation
- The order in an XML query can be relaxed to allow
any ordering of search conditions.
- For example, given two documents D1 and D2
- Original query → matches D1 only
- Relaxed query → matches D1 and D2
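Order relaxation can be sketched as the difference between matching child tags as an ordered subsequence and matching them as a set. The two documents below are stand-ins chosen for illustration, not the slide's originals, and the helper assumes the query lists distinct child tags.

```python
import xml.etree.ElementTree as ET


def matches(doc, root_tag, child_tags, ordered):
    """Check whether the root has the requested children, optionally in order."""
    root = ET.fromstring(doc)
    if root.tag != root_tag:
        return False
    tags = [child.tag for child in root]
    if not ordered:
        return all(tag in tags for tag in child_tags)
    # Ordered: child_tags must occur as a subsequence of the actual children.
    # Membership tests on the iterator consume it, which enforces the order.
    remaining = iter(tags)
    return all(tag in remaining for tag in child_tags)


D1 = "<a><b/><c/></a>"  # b before c
D2 = "<a><c/><b/></a>"  # c before b

original_hits = [name for name, doc in (("D1", D1), ("D2", D2))
                 if matches(doc, "a", ["b", "c"], ordered=True)]
relaxed_hits = [name for name, doc in (("D1", D1), ("D2", D2))
                if matches(doc, "a", ["b", "c"], ordered=False)]
```

The ordered query for `b` followed by `c` matches only D1, while the order-relaxed query matches both documents.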
61CoXML
- Cooperative XML Query Answering
62Knowledge Base
- The knowledge base of an XML query answering
system facilitates query relaxation by providing
the query processor with the following
information
- Domain Ontology: the semantic relationships among
the tag names. This aids with node relabeling.
- X-TAH: Knowledge-based XML Relaxation Index
63X-TAH
- Two types of X-TAHs can be generated from the
data
- Value X-TAH for value relaxations
- Structure X-TAH for structural relaxations
- Internal nodes summarize the characteristics of
the objects in the cluster
- Leaf nodes are objects in a cluster
- Value for a value X-TAH
- Structure fragment for a structure X-TAH
64X-TAH Examples
- Value X-TAH
1-6
  3-6
    5-6
      5
      6
    3-4
      3
      4
  1-2
    1
    2
- Structure X-TAH
R3
  R1
    O1, O2, O3
  R2
    O4, O5, O6
- O1: //article/bm/sec
- O2: //article/bm/sec/ss1
- O3: //article/bm/sec/ss1/ss2
- O4: //article/bm/app/sec
- O5: //article/bm/app/sec/ss1
- O6: //article/bm/app/sec/ss1/ss2
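Value relaxation with the value X-TAH above can be sketched by walking from a leaf toward the root and returning the range of the node reached. The encoding of nodes as (low, high) pairs and the function name are assumptions for illustration.

```python
# Child -> parent edges of the value X-TAH, with each node as a (low, high) range.
PARENT = {
    (5, 5): (5, 6), (6, 6): (5, 6),
    (3, 3): (3, 4), (4, 4): (3, 4),
    (5, 6): (3, 6), (3, 4): (3, 6),
    (1, 1): (1, 2), (2, 2): (1, 2),
    (3, 6): (1, 6), (1, 2): (1, 6),
}


def relax_value(value, steps):
    """Range covering `value` after moving `steps` levels up the X-TAH."""
    node = (value, value)
    for _ in range(steps):
        node = PARENT.get(node, node)  # the root has no parent; stay there
    return node
```

For example, a condition on the value 5 relaxes first to the range 5-6, then to 3-6, and finally to the full range 1-6.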
65Query Relaxation
- CoXML's Query Relaxation Manager relaxes a query
in the following three steps
- A set of relaxable conditions and their order are
generated
- For each condition, an X-TAH is selected to help
guide the relaxation process
- The relaxed queries are sent to the Query
Processor to derive approximate answers
66Conclusions
- Provide user- and context-sensitive query
relaxations (structured, semi-structured, and
unstructured data)
- Provide additional information (associative query
answering) based on past cases
- CoSQL (Cooperative SQL)
- similar-to, near-to, approximate
- relaxation control operators
- CoXML (Cooperative XML)
- Value relaxation
- Structure relaxation (edge, node, order)
67References
- [1] W. W. Chu, H. Yang, K. Chiang, M. Minock, G. Chow, and C. Larson, "CoBase: A Scalable and Extensible Cooperative Information System", Journal of Intelligent Information Systems, 6, 1996
- [2] Shaorong Liu and Wesley W. Chu, "Cooperative XML (CoXML) Query Answering at INEX 2003", INEX Workshop 2003
- [3] Dongwon Lee, "Query Relaxation for XML Model", Ph.D. Dissertation, University of California, Los Angeles, June 2002
- [4] Dongwon Lee, Murali Mani, and Wesley W. Chu, "Effective Schema Conversions between XML and Relational Models", in European Conf. on Artificial Intelligence (ECAI), Knowledge Transformation Workshop (ECAI-OT), Lyon, France, July 2002 (Invited)
- http://www.cobase.cs.ucla.edu