CoBase: Scalable and Extensible Cooperative Information System

1
CoBase: Scalable and Extensible Cooperative
Information System
Wesley W. Chu
Computer Science Department
University of California, Los Angeles
http://www.cobase.cs.ucla.edu
2
Conventional Query Answering
  • Need to know the detailed database schema
  • Cannot get approximate answers
  • Cannot answer conceptual queries
Cooperative Query Answering
  • Derive approximate answers
  • Answer conceptual queries

3
Cooperative Queries
4
Generalization and Specialization
5
Type Abstraction Hierarchy (TAH)
Provide multi-level knowledge representations
Chemical-Suit Size TAH (a non-numerical TAH)
All_Sizes
  Small_Size
    Very_Small: XXXS, XXS
    Small_to_Medium: S, M
  Large_Size
    Large_to_Extra_Large: L, XL
    Very_Large: XXL
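A TAH such as the chemical-suit size hierarchy can be held as a child-to-parent map, with relaxation done by generalizing a value and then specializing the resulting node. The sketch below is only an illustration: the grouping of leaf sizes under the intermediate nodes is inferred from the node names, and the function names `generalize`, `specialize`, and `relax` are hypothetical, not CoBase APIs.

```python
# Hypothetical encoding of the Chemical-Suit Size TAH: child -> parent.
# The placement of leaves under intermediate nodes is an assumption.
PARENT = {
    "Small_Size": "All_Sizes",
    "Large_Size": "All_Sizes",
    "Very_Small": "Small_Size",
    "Small_to_Medium": "Small_Size",
    "Large_to_Extra_Large": "Large_Size",
    "Very_Large": "Large_Size",
    "XXXS": "Very_Small",
    "XXS": "Very_Small",
    "S": "Small_to_Medium",
    "M": "Small_to_Medium",
    "L": "Large_to_Extra_Large",
    "XL": "Large_to_Extra_Large",
    "XXL": "Very_Large",
}


def generalize(node):
    """Move one level up the hierarchy (None at the root)."""
    return PARENT.get(node)


def specialize(node):
    """Return all leaf values covered by a node, sorted alphabetically."""
    children = [c for c, p in PARENT.items() if p == node]
    if not children:
        return [node]
    leaves = []
    for child in children:
        leaves.extend(specialize(child))
    return sorted(leaves)


def relax(value, levels=1):
    """Generalize `value` by up to `levels` steps, then specialize."""
    node = value
    for _ in range(levels):
        parent = generalize(node)
        if parent is None:
            break
        node = parent
    return specialize(node)


print(relax("M"))     # sizes sharing M's parent node
print(relax("M", 2))  # sizes sharing M's grandparent node
```

Each extra level of generalization widens the set of acceptable sizes, which is the multi-level behavior the TAH is meant to provide.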
6
Type Abstraction Hierarchy (TAH)
(Location Example)
7
Relaxation Agent
Uses a knowledge-based approach (generalization and
specialization via the Type Abstraction Hierarchy) to
relax the following for matching:
  • query conditions
  • constraints

8
Query Relaxation
10
Visualization of Relaxation Process
Query: Find seaports in the given region.
(Figure: map showing the given region and the relaxed region)
12
Relaxation Control Primitives
  • not-relaxable
    • runway-length
  • relaxation-order
    • (runway length, location)
  • preference-list
  • unacceptable-list
  • answer-size
  • relaxation-level

13
Relaxation Primitives
  • approximate
    • e.g., 9 am
  • between
  • near-to (context-sensitive)
    • Airport near-to LAX
    • Restaurant near-to UCLA
  • similar-to
    • Airport similar-to LAX based-on (traffic, runway)
  • within

14
Similar-to
Find all airports in Tunisia similar to the
Bizerte airport based on runway length and (more
importantly) runway width.
select aport_name, runway_length, runway_width
from runways, countries
where aport_name similar-to 'Bizerte'
      based-on ((runway_length 1.0)
                (runway_width 2.0))
  and country_state_name = 'Tunisia'
  and countries.glc_cd = runways.glc_cd
15
Similar-to Result
Similar-to module ranks the returned
answers according to mean-squared error.
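Ranking by weighted mean-squared error can be sketched as follows. This is a minimal illustration, not the CoBase module: the range-based normalization of each attribute is an assumption made so that different units are comparable, the function name `similar_to` is hypothetical, and the runway figures are invented sample data.

```python
def similar_to(target, candidates, weights):
    """Rank candidates by weighted mean-squared error against target.

    Each attribute difference is divided by that attribute's value
    range (an assumption, so that units are comparable).
    """
    spans = {}
    for attr in weights:
        values = [c[attr] for c in candidates] + [target[attr]]
        spans[attr] = (max(values) - min(values)) or 1.0

    def error(candidate):
        total = sum(
            w * ((candidate[a] - target[a]) / spans[a]) ** 2
            for a, w in weights.items()
        )
        return total / sum(weights.values())

    return sorted(candidates, key=error)


# Hypothetical runway data (feet); not taken from the presentation.
bizerte = {"name": "Bizerte", "runway_length": 8000, "runway_width": 150}
airports = [
    {"name": "A", "runway_length": 8100, "runway_width": 100},
    {"name": "B", "runway_length": 7000, "runway_width": 148},
    {"name": "C", "runway_length": 8000, "runway_width": 145},
]
ranked = similar_to(
    bizerte, airports, {"runway_length": 1.0, "runway_width": 2.0}
)
print([a["name"] for a in ranked])
```

Because runway width carries twice the weight of runway length, airport A, which is close in length but far off in width, ranks last.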
16
Unacceptable List Operator
(Figure: given a constraint, the CoBase Relaxation Manager trims the
Type Abstraction Hierarchy for Tunisia. Node labels in the figure:
Tunisia; Central Tunisia, SW Tunisia, NE Tunisia, NW Tunisia, ...;
Gafsa, El Borma, Bizerte. The result is labeled Trimmed TAH.)
17
TAH Generation for Numerical Attribute Values
  • Relaxation Error
  • Difference between the exact value and the
    returned approximate value
  • The expected error is weighted by the probability
    of occurrence of each value
  • DISC (Distribution Sensitive Clustering) is based
    on the attribute values and frequency
    distribution of the data
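The relaxation error of a cluster can be sketched as the expected absolute difference between the value asked for and the value returned, with each value weighted by its frequency. The code below is an illustrative reading of the bullets above, not the DISC algorithm itself: `best_binary_cut` is a hypothetical helper that tries every cut point and keeps the one with the lowest size-weighted error.

```python
from collections import Counter


def relaxation_error(values):
    """Expected |x_i - x_j| when one value in the cluster is requested
    and another is returned, weighted by frequency of occurrence."""
    counts = Counter(values)
    n = len(values)
    return sum(
        (ci / n) * (cj / n) * abs(xi - xj)
        for xi, ci in counts.items()
        for xj, cj in counts.items()
    )


def best_binary_cut(values):
    """Try every cut between distinct sorted values and keep the cut
    whose two sub-clusters have the lowest size-weighted error.
    Returns (cost, left, right), or None if no cut is possible."""
    data = sorted(values)
    n = len(data)
    best = None
    for i in range(1, n):
        if data[i] == data[i - 1]:
            continue  # never split equal values across clusters
        left, right = data[:i], data[i:]
        cost = (
            (len(left) / n) * relaxation_error(left)
            + (len(right) / n) * relaxation_error(right)
        )
        if best is None or cost < best[0]:
            best = (cost, left, right)
    return best


# Hypothetical runway lengths in feet.
cost, left, right = best_binary_cut([5000, 5100, 5200, 9000, 9100])
print(left, right)
```

Applying such a cut recursively to each sub-cluster would yield a multi-level hierarchy whose shape follows both the values and their frequency distribution.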

18
TAH Generation for Non-numerical Attribute Values
  • Pattern Based Knowledge Induction (PBKI)
  • Rule-based approach
  • Clusters attribute values into TAH based on other
    attributes in the relation (i.e.,
    Inter-Attributes Relationships)
  • Provides an attribute correlation value (a measure of how
    well the rules apply to the database)

19
Type Abstraction Hierarchy (TAH)
Provide multi-level knowledge representations
20
Associative Query Answering
Provide relevant information not explicitly asked
for by the user.
User Query: List all airports with runway length
between 8500 and approximately 10000 feet.
21
CoBase and GLAD Integration
  • Wesley W. Chu

22
CoBase Functionality
  • Provide approximate matching
    • Find HETs with a capacity of approximately 5 tons
  • Provide conceptual query answering
    • Find Earth Moving Equipment
  • Provide content-sensitive spatial queries
    • Find storage sites near selected location
    • (Integration with MATT map server)
  • Provide relaxation control
    • Relaxation order
    • Not-relaxable
    • At-least (answer set, quantity on hand)

23
Cooperative Operations Added to GLAD
  • Implicit Query Relaxation
  • Explicit Query Relaxation
    • Approximate operator
    • Similar-to/based-on
    • Spatial relaxation
  • Relaxation Control
    • Relaxation-order
    • Not-relaxable
    • At-least (answer-set size, quantity on hand)

24
CoBase Features Added to GLAD
  • Enhance GLAD queries with cooperative operators
    (similar-to, relaxation-order, etc.)
  • Display the query relaxation process
    • modified query conditions (value, spatial)
    • type abstraction hierarchies
  • Rank returned answers with similarity measures
    • e.g., spatial relaxation ranks answers according
      to their distance from the selected location

25
CoBase and GLAD TIE
(Architecture diagram. Component labels: Report Collection,
Spatial Area Selection, Filter Editor, Display Generator,
Query Collection, NSNs, Object Cache, Report Query Constructor,
CoBase Query Editor, CoBase Relaxation Manager, GLAD, Data Cache,
CoBase Data Source Manager, Databases.)
26
GLAD Query
Find NSNs of aircraft with passenger capacity > 10,
combat type 'I', capacity weight < 2 tons,
and price < 700,000.
select nsn, price, pax_capacity_qty, capacity_wt_ston
from nsn_description
where (upper(class) = '7'
  and upper(cbs_category_nomen) = 'AIRCRAFT'
  and price < 700000
  and pax_capacity_qty > 10
  and upper(combat_type) = 'I'
  and capacity_wt_ston < 2)
27
CoGLAD Query with Relaxation Control Operators
Find NSNs of aircraft with passenger capacity > 10,
combat type 'I', capacity weight < 2 tons,
and price < 700,000. The attribute passenger
capacity is not relaxable. Relax price first and
then capacity weight.
select nsn, price, pax_capacity_qty, capacity_wt_ston
from nsn_description
where (upper(class) = '7'
  and upper(cbs_category_nomen) = 'AIRCRAFT'
  and price < 700000
  and pax_capacity_qty > 10
  and upper(combat_type) = 'I'
  and capacity_wt_ston < 2)
not-relaxable pax_capacity_qty
relaxation-order price capacity_wt_ston
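How not-relaxable and relaxation-order steer relaxation can be sketched with a small loop. Everything here is an illustrative assumption: CoBase relaxes conditions through TAHs, whereas this sketch simply widens a numeric bound by a fixed fraction; the function names, the `step` and `max_rounds` parameters, and the sample rows are all hypothetical.

```python
def run(rows, bounds):
    """Return rows satisfying every (operator, limit) bound."""
    def ok(row):
        return all(
            row[attr] < limit if op == "<" else row[attr] > limit
            for attr, (op, limit) in bounds.items()
        )
    return [r for r in rows if ok(r)]


def relax_query(rows, bounds, relaxation_order, not_relaxable=(),
                at_least=1, step=0.25, max_rounds=4):
    """Widen bounds one attribute at a time, in relaxation order,
    until at least `at_least` rows match. Attributes listed in
    `not_relaxable` are never changed."""
    bounds = dict(bounds)
    answers = run(rows, bounds)
    for _ in range(max_rounds):
        for attr in relaxation_order:
            if len(answers) >= at_least:
                return answers, bounds
            if attr in not_relaxable:
                continue
            op, limit = bounds[attr]
            factor = 1 + step if op == "<" else 1 - step
            bounds[attr] = (op, limit * factor)
            answers = run(rows, bounds)
    return answers, bounds


# Hypothetical rows; not data from the presentation.
aircraft = [
    {"nsn": "X1", "price": 800000, "pax_capacity_qty": 12,
     "capacity_wt_ston": 1.5},
    {"nsn": "X2", "price": 650000, "pax_capacity_qty": 8,
     "capacity_wt_ston": 1.0},
    {"nsn": "X3", "price": 720000, "pax_capacity_qty": 14,
     "capacity_wt_ston": 2.2},
]
query = {
    "price": ("<", 700000),
    "pax_capacity_qty": (">", 10),
    "capacity_wt_ston": ("<", 2),
}
answers, relaxed = relax_query(
    aircraft, query,
    relaxation_order=["price", "capacity_wt_ston"],
    not_relaxable=["pax_capacity_qty"],
)
print([r["nsn"] for r in answers], relaxed["price"])
```

In this run the exact query has no answers, so the price bound is widened first; one row then matches, and the capacity-weight and passenger-capacity conditions are left untouched.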
28
CoGLAD Query with Similar-to Operator
Find aircraft similar to NSN '0000IB0000961'
based on the attributes price, passenger capacity,
and air mileage. Passenger capacity has a weight
of 8, and price and air mileage each have a weight
of 1. ('0000IB0000961' is an answer from the
previous query.)
select nsn
from nsn_description
where upper(nsn) similar-to '0000IB0000961'
      based-on ((price 1.0) (pax_capacity_qty 8.0)
                (air_mileage 1.0))
at-least 4
29
CoGLAD Query with Approximate Operator
Find DLA stock reports with NSN like 8340 (FSC
for tents and tarpaulins) and on-hand quantity
approximately 150.
select nsn, ric
from dla_stock_report
where nsn like 8340
  and on_hand_quantity 150
30
Adding Constraints to a Query
GLAD query:
select nsn, ric
from dla_stock_report
where nsn like 8340
  and nomenclature like TARP
Query with added constraints:
select nsn, ric
from dla_stock_report
where nsn like 8340
  and nomenclature like TARP
  and on_hand_quantity 150
  and size_in_square_feet 350
31
Example of Spatial Relaxation
32
Spatial Relaxation with Relaxation Control
  • relaxation-order: size, (latitude, longitude)
  • not-relaxable: price
  • at-least
    • value: size of the tarpaulin
    • quantity on hand: relax until enough quantity on
      hand (specified by the user) is obtained

33
Scalable and Extensible CoBase Architecture
34
Mediator Inter-Communications via KQML
CoBase Ontology
Module Objects APIs
Content Language Data Actions
36
Query Answers Without CoBase
Query: find chemical suits
43
Electronic Warfare
  • Identify and locate sources of radiated
    electromagnetic energy
  • Determine emitter type based on the operating
    parameters of observed signals
    • Radio Frequency (RF)
    • Pulse Repetition Frequency (PRF)
    • Pulse Duration (PD)
    • Scan Period (SP)
    • other operating parameters
  • Determine platform sites near the line of
    bearing of an emitter
This research is a joint effort between CoBase
and Lockheed Martin Communication Systems (Russ
Frew, et al.), Camden, NJ
44
Performance Improvement by Using CoBase in EW
Conventional DB: parameter ranges from emitter specifications
CoBase DB: peak parameters (RF, PRF) and parameter ranges (PD, SP)
KB: TAHs based on RF and PRF peak parameters;
    TAHs based on PD and SP parameter ranges
Case 1: emitter signals without noise
Case 2: add noise - PD, SP (10), PRF (5), RF (2.5)
Sample Size: 1000 signals
Emitter Types: 75
45
Current CoBase Users and Applications
46
Conclusions
  • Provide user and context sensitive query
    relaxations (structured and unstructured data)
  • Provide additional information (associative query
    answering) based on past cases
  • CoSQL (Cooperative SQL)
  • similar-to, near-to, approximate
  • relaxation control operators
  • GUI
  • map server, high-level query formation

48
CoSent: An Active Database Technology
  • Natural language-like rules support conceptual
    and approximate terms
  • Decomposes natural language-like rules into low-level
    rules via the knowledge base (TAH)
  • Mimics the human cognitive process, thus easing
    rule specification
  • Eases rule maintenance

49
CoSent: An Active Database Technology
CoSent monitors temporal composite events and
executes rules with conceptual and approximate
terms.
  • Trigger with high-level rules containing
  • conceptual term (e.g., bad, heavy) and
  • approximate operators (e.g., similar-to, near-to,
    approximate)
  • Allow trigger conditions to be specified with
    fuzzy and conceptual terms
  • Mimic human cognitive expression

50
Key Features of CoSent
  • User defined rules transformed into low-level
    range values via knowledge base--Type Abstraction
    Hierarchies (TAHs)
  • TAHs are typically generated from data sources
    automatically
  • Leverages the triggering systems of conventional DBMSs
    (e.g., Oracle, Sybase, Teradata)
  • Rule definitions are either specified by a domain
    expert or derived by data mining technologies

51
Example of Rule Definitions with Data Mining
Technology
  • Find attributes that frequently appear together
    for a given target attribute.
  • If bad road condition and also bad weather, then
    cause traffic congestion.
  • If a person wrote many bad checks and also has
    past eviction, then this person is a poor credit
    risk.
  • Based on the frequency of occurrence, the derived
    rules can be ranked according to certain
    information measure.

52
Conventional vs. Natural Language-Like Rules
Conventional Rule: If wind_speed > MAX_WIND_SPEED
and wave_height > MAX_WAVE_HEIGHT, then notify
affected units in regions.
Natural Language-Like Rule: If the weather turns bad,
then notify all affected units in that region and
all those that are near to that region.

53
Natural Language-Like Rule Specifications
Example 1: If the number of departures of large
cargo carriers (e.g., C-5, C-141) becomes
significantly low in the past seven days, notify
the Air Mobility Command.
Example 2: If the aircraft has a fuel
contamination problem and the aircraft type is
similar-to C-5 based on the fuel type and
fueling method, then notify the authority.
54
Example
DoD Transportation Planning: Weather Report Table
Wind speed is the hourly average over an
eight-minute period for buoys and a two-minute
period for land stations.
Wave height is sampled in a 20-minute period.
55
TAH Example
Wave Height [0.6, 7.2]
  VERY LOW  [0.6, 1.25]
  LOW       [1.25, 1.75]
  HIGH      [1.75, 2.45]
  VERY HIGH [2.45, 7.2]
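Translating a conceptual term into a low-level range condition can be sketched with a lookup over the wave height TAH. The ranges are the ones on the slide; the dictionary layout, the function names `translate` and `matches`, and the choice to treat both range ends as inclusive are assumptions for illustration.

```python
# Ranges copied from the Wave Height TAH; units are not given
# in the presentation.
WAVE_HEIGHT_TAH = {
    "very-low": (0.6, 1.25),
    "low": (1.25, 1.75),
    "high": (1.75, 2.45),
    "very-high": (2.45, 7.2),
}


def translate(attribute, term, tah):
    """Turn a conceptual term into a low-level range condition."""
    low, high = tah[term]
    return f"{attribute} >= {low} and {attribute} <= {high}"


def matches(value, term, tah):
    """Check a measured value against a conceptual term
    (both ends of the range treated as inclusive, by assumption)."""
    low, high = tah[term]
    return low <= value <= high


print(translate("wave_height", "very-high", WAVE_HEIGHT_TAH))
print(matches(3.0, "very-high", WAVE_HEIGHT_TAH))
```

A rule author can then write "wave height very-high" once, and the range condition follows whatever values the TAH currently holds.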
56
A Portion of Wave Height TAH
57
Triggering Based on Temporal Composite Events
Notify the commander if within the past seven
days, the total departure of C-5 is significantly
low and the filter problem on C-5 is extremely
high.
C-5 Departure
  Low 9-134.5
    Signt. Low 9-53
    Very Low 53-134.5
  High 134.5-208
    Very High 134.5-162
    Signt. High 162-208
C-5 Filter Problem
  Low 0-53
    Extra. Low 0-36
    Very Low 36-53
  High 53-79
    Very High 53-60
    Ex. High 60-79
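A trigger on a temporal composite event can be sketched as a check over a seven-day window. This is an illustration under stated assumptions: the ranges 9-53 ("significantly low" departures) and 60-79 ("extremely high" filter problems) are read off the slide's flattened figure, and the function names, the daily-count data layout, and the sample dates are hypothetical.

```python
from datetime import date, timedelta

# Ranges read off the slide's two TAHs; pairing each label with its
# attribute is an interpretation of the flattened figure.
SIGNIFICANTLY_LOW_DEPARTURES = (9, 53)
EXTREMELY_HIGH_FILTER_PROBLEMS = (60, 79)


def window_total(events, today, days=7):
    """Sum daily counts that fall within the past `days` days."""
    start = today - timedelta(days=days)
    return sum(n for day, n in events.items() if start < day <= today)


def should_notify(departures, filter_problems, today):
    """True when both conceptual conditions hold over the window."""
    dep = window_total(departures, today)
    flt = window_total(filter_problems, today)
    lo_d, hi_d = SIGNIFICANTLY_LOW_DEPARTURES
    lo_f, hi_f = EXTREMELY_HIGH_FILTER_PROBLEMS
    return lo_d <= dep <= hi_d and lo_f <= flt <= hi_f


# Hypothetical daily counts for one week.
today = date(2000, 1, 10)
week = [today - timedelta(days=i) for i in range(7)]
departures = {day: 5 for day in week}       # 35 in total
filter_problems = {day: 9 for day in week}  # 63 in total
print(should_notify(departures, filter_problems, today))
```

Both totals fall inside their ranges here, so the rule fires; raising daily departures to 20 would push the total out of the "significantly low" range and suppress the notification.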
58
Natural Language-Like Rule Translations
Rule Translation/Relaxation
59
CoSent Architecture
60
CoSent Demo
  • Natural language-like rule with conceptual terms:
    very high wave height and very strong wind
    speed
  • Natural language-like rule with approximate term
    nearby and conceptual term bad weather
  • Install trigger by drag-and-drop on the desired
    location on the map

61
Natural Language-Like Rule
  • Natural language-like rules containing conceptual
    terms, such as wave_height very-high and
    wind_speed very-strong, can be translated to
    range values using domain knowledge, for instance
    a type abstraction hierarchy.
  • Natural language-like rules reduce the number of
    rules, thus easing rule maintenance

67
Rules With Approximate Terms
  • Rules can contain approximate terms, such as
    near-by and approximate, thus easing rule
    specification
  • The trigger can be installed on the desired
    location on a map by drag-and-drop
  • The near-by region affected by the bad weather
    condition is specified by the trigger condition
    and shown by a red circle

75
Map Server Architecture
76
Current Capabilities of Map Server
  • Visualization of Query Answers
  • Icons
  • Paths
  • Enter Query Constraints Graphically
  • Visualization of Query Relaxation Process

77
Visualization of Relaxation Process
Query: Find seaports in the given region.
(Figure: map showing the given region and the relaxed region)
78
Explanation Agent
  • Based on process traces and invocation rules,
    generate English-like explanation of
  • Relaxation process
  • Quality of approximate matching
  • Further explanation on definitions and terms in
    explanation

79
Explanation of Relaxation Process
80
Relaxation Primitive within
81
Extend near-to Primitive Points to Regions
82
Dynamic Nearness
  • Uses transaction history to identify nearness
    between tuples and values
  • If two tuples (or attribute values) appear
    together in a query answer, then that is a piece
    of evidence that they should be clustered
    together.
  • Gather evidence over time
  • Evolve the hierarchy
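Gathering nearness evidence from transaction history can be sketched by counting how often two tuples appear in the same query answer. This is a minimal illustration: the function names, the threshold parameter, and the book identifiers are hypothetical, and how the counts feed back into the hierarchy is not shown.

```python
from collections import Counter
from itertools import combinations


def update_links(links, answer):
    """Add one piece of evidence for every pair of tuples that
    appear together in a query answer."""
    for a, b in combinations(sorted(set(answer)), 2):
        links[(a, b)] += 1


def near_pairs(links, threshold):
    """Pairs whose co-occurrence count reaches the threshold."""
    return sorted(pair for pair, n in links.items() if n >= threshold)


links = Counter()
# Hypothetical query answers over book identifiers.
for answer in (["b1", "b2", "b3"], ["b2", "b3"], ["b3", "b4"]):
    update_links(links, answer)

print(near_pairs(links, threshold=2))
```

Raising the threshold keeps only pairs with repeated evidence, so the set of links shrinks as the threshold grows.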

83
The BOOKS Relation
84
Schematic of a Browsing System
85
Schematic of a Query Modification System
86
The Links Between Tuples in BOOKS
87
Dynamic Links After Two Queries
88
Links with Counts
89
Number of Links with Threshold Value q
90
Number of Links is determined by Maximum Answer
Set Size a
91
Query Formation From High-Level Concepts for
Relational Databases
  • Guogen Zhang
  • Wesley Chu
  • Frank Meng
  • Gladys Kong

92
Outline
  • Overview
  • Semantic Graph Model
  • High-Level Query Formation for SPJ queries
  • Incremental Query Formation for Complex Queries
  • Conclusions

93
Overview Query Formation
  • Based on semantic graph model, including
    user-defined relationships
  • User specifies requests and constraints
  • Formulate simple query by graph search technique
  • Candidates ranked by information measure
  • English-like query description
  • A complex query can be formulated by a series of
    simple queries

94
Related Work
  • Query formulation as Steiner tree problem (Wald
    and Sorenson, 1984)
    • limited to partial 2-tree graphs
  • Formulate simple Select-Project-Join (SPJ)
    queries via Universal Relation Model: no need to
    specify natural joins (Ullman, 1988; Vardi, 1988)
  • Object-oriented query path expression completion:
    partial order relationship between different paths
    for ranking (Ioannidis and Lashkari, 1994)
  • Query-by-Icon (QBI) (Massari and Chrysanthis,
    1995)
  • Natural language interfaces (text/voice): logical
    form to query

95
Semantic Graph Model
  • Weighted graph G(V, E)
  • Nodes: entities -- strong, weak, user-defined
  • Links: relationships -- ISA, HAS, simple,
    complex, user-defined
  • For relational databases
    • nodes: relations
    • links: natural and user-defined joins
  • Weight: information measure of a node or link

96
Query Feature
  • Query expression in a semantic graph
  • Query Topic, T: a set of joins represented by
    links
  • Query Constraints, C: query conditions
  • Query Aspect, A: attribute list

97
A query topic for aircraft can land on airports
at geographical locations of countries
98
Semi-Automatic Generation of Semantic Model
  • Find natural joins through key and foreign key
    between nodes.
  • User-defined links can be added into the graph
    model.
  • Designers need to specify link types and assign
    names to all the elements in the graph.

99
Example of Semantic Model Generation
  • AIRPORT: APORT_NM, GEOLOC_TYPE, GLC_CD, ELEV_FT;
    key: APORT_NM.
  • RUNWAY: APORT_NM, RUNWAY_NM, GLC_CD,
    RUNWAY_LENGTH_FT, RUNWAY_WIDTH_FT;
    key: RUNWAY_NM.
  • GEOLOC: GLC_CD, GLC_NM, CY_CD, LATITUDE,
    LONGITUDE;
    key: GLC_CD.
  • COUNTRY: CY_CD, CY_NM; key: CY_CD.
  • Links
    • AIRPORT--RUNWAY: APORT_NM
    • AIRPORT--GEOLOC: GLC_CD
    • RUNWAY--GEOLOC: GLC_CD
    • GEOLOC--COUNTRY: CY_CD
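Finding natural joins through keys and foreign keys can be sketched as follows: two relations are linked when the key of one appears among the attributes of the other. The schema is the one listed above; the dictionary layout and the function name `natural_join_links` are assumptions for illustration.

```python
from itertools import combinations

# Schema as listed on the slide.
SCHEMA = {
    "AIRPORT": {
        "attrs": ["APORT_NM", "GEOLOC_TYPE", "GLC_CD", "ELEV_FT"],
        "key": "APORT_NM",
    },
    "RUNWAY": {
        "attrs": ["APORT_NM", "RUNWAY_NM", "GLC_CD",
                  "RUNWAY_LENGTH_FT", "RUNWAY_WIDTH_FT"],
        "key": "RUNWAY_NM",
    },
    "GEOLOC": {
        "attrs": ["GLC_CD", "GLC_NM", "CY_CD", "LATITUDE", "LONGITUDE"],
        "key": "GLC_CD",
    },
    "COUNTRY": {"attrs": ["CY_CD", "CY_NM"], "key": "CY_CD"},
}


def natural_join_links(schema):
    """Link two relations when the key of one appears as an
    attribute of the other (a key / foreign-key match)."""
    links = []
    for a, b in combinations(schema, 2):
        for owner, other in ((a, b), (b, a)):
            key = schema[owner]["key"]
            if key in schema[other]["attrs"]:
                links.append((a, b, key))
    return links


for left, right, attr in natural_join_links(SCHEMA):
    print(f"{left}--{right}: {attr}")
```

On this schema the sketch yields the same four links as the slide; user-defined links and link names would still have to be added by the designer.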

100
Information Measure
  • Information measure of a node or link a:
    I(a) = -\log P(a)
    where P(a) is the probability of a being used
    in queries.
  • Assuming nodes and links are independent, for a
    subgraph with a set of elements
    A = \{a_i \mid i = 1, \ldots, n\},
    the information measure is additive:
    I(A) = \sum_{i=1}^{n} I(a_i)

101
Information Measure (cont.)
  • Initial information measure
    • all the nodes set to 1, or
    • different nodes given different values
  • Information measure is normalized and converted
    into counts
  • Probability of a node or a link: P(a_i) = c_i / c
  • Update information measure
  • Ranking is based on the information measure and
    thus adapts to user feedback
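The count-based information measure can be sketched directly from the definitions I(a) = -log P(a) and P(a_i) = c_i / c. The function names and the sample counts below are hypothetical; the feedback step shown (adding one to the count of each used element) is an assumption about how usage might be recorded.

```python
import math


def information(counts, element):
    """I(a) = -log P(a), with P(a) = c_a / c estimated from counts."""
    total = sum(counts.values())
    return -math.log(counts[element] / total)


def subgraph_information(counts, elements):
    """Additive measure I(A) = sum of I(a_i), assuming independence."""
    return sum(information(counts, e) for e in elements)


def record_usage(counts, elements):
    """User feedback: bump the count of every element in a chosen query."""
    for e in elements:
        counts[e] += 1


# Hypothetical usage counts for three nodes.
counts = {"AIRPORT": 4, "RUNWAY": 2, "GEOLOC": 2}
before = information(counts, "RUNWAY")
pair = subgraph_information(counts, ["AIRPORT", "RUNWAY"])
record_usage(counts, ["RUNWAY"])
after = information(counts, "RUNWAY")
print(before, pair, after)
```

An element that is used more often gets a lower information measure, so candidates built from frequently used nodes and links move up in a minimum-information ranking.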

102
Query Formulation
  • To formulate (simple) queries without knowledge
    of query language or database schema
  • Example:
    Find airports in Tunisia that can land a C-5
    cargo plane
  • User input
    • Query aspect: AIRPORTS.APORT_NM
    • Constraints:
      AIRCRAFT_AIRFIELD_CHARS.AC_TYPE_NAME = C-5
      COUNTRY_STATE.CY_NM = Tunisia
    • Links: CAN LAND

103
Formulated Query
SELECT R3.APORT_NM
FROM AIRCRAFT_AIRFIELD_CHARS R0, AIRPORTS R3,
     COUNTRY_STATE R11, GEOLOC R12, RUNWAYS R16
WHERE R0.AC_TYPE_NM = 'C-5'
  AND R11.CY_NM = 'Tunisia'
  AND R0.WT_MIN_AVG_LAND_DIST_FT < R16.RUNWAY_LENGTH_FT
  AND R0.WT_MIN_RUNWAY_WIDTH_FT < R16.RUNWAY_WIDTH_FT
  AND R11.GLC_CD = R3.GLC_CD
  AND R3.APORT_NM = R16.APORT_NM
  AND R11.CY_CD = R12.CY_CD

104
Query Completion as Graph Search Problem
  • Given: an incomplete input query topic Ti
  • Find: a set of links to complete the topic (to
    make Ti connected)
  • Minimum Missing Information principle:
    the query completion candidate Tc (the missing
    links and nodes) for an incomplete input topic Ti
    contains the minimum information

105
Query Formulation Algorithm
  • Input: subgraph T of the semantic graph G
  • Find candidates with the minimum information
    measure
  • Two methods are used to limit the search scope
    • L-step-bound paths: paths that connect two
      components with at most L links, to limit search
      to the neighborhood of the input subgraph
    • k-minimum completion candidates: at most k
      candidates with minimum information measure are
      kept (alpha-beta pruning)
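The two search limits can be sketched with a small path enumeration over a weighted graph. This is an illustration only: it enumerates every simple path within the L-step bound and then keeps the k cheapest, rather than pruning during the search; it sums link weights only; and the graph weights and the function name `completions` are hypothetical.

```python
import heapq


def completions(graph, source, target, max_links, k):
    """Enumerate simple paths from source to target that use at most
    `max_links` links (the L-step bound) and keep the k candidates
    with the smallest total information measure."""
    found = []

    def walk(node, path, cost):
        if node == target:
            found.append((cost, path))
            return
        if len(path) - 1 == max_links:
            return  # bound reached without connecting
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in path:
                walk(nxt, path + [nxt], cost + weight)

    walk(source, [source], 0.0)
    return heapq.nsmallest(k, found)


# Hypothetical link weights (information measures).
GRAPH = {
    "AIRPORT": {"RUNWAY": 1.0, "GEOLOC": 1.5},
    "RUNWAY": {"AIRPORT": 1.0, "GEOLOC": 2.0},
    "GEOLOC": {"AIRPORT": 1.5, "RUNWAY": 2.0, "COUNTRY": 0.5},
    "COUNTRY": {"GEOLOC": 0.5},
}

best = completions(GRAPH, "RUNWAY", "COUNTRY", max_links=2, k=2)
wider = completions(GRAPH, "RUNWAY", "COUNTRY", max_links=3, k=2)
print(best)
print(wider)
```

With a bound of two links only the path through GEOLOC qualifies; raising the bound to three also admits the longer path through AIRPORT, which ranks second because its total information measure is higher.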

106
Initial Components and 2-Step-Bound Paths for the
CAN LAND Query
107
The Semantic Graph for the Transportation Domain
108
Incremental Query Formulation
  • Goal
    • Help the user reach a complex query goal with a
      series of simple queries
    • Subsequent queries may depend on the results of
      preceding queries (derived relations)
  • Issues
    • Incorporate derived relations into the semantic
      graph
    • Suggest missing attributes to link isolated
      derived nodes to the graph

109
Incremental Query Examples
  • Find airports in Tunisia.
  • Which of these airports can land a C-5?
  • What is the weather at these airports?

110
Incorporating Derived Relations
  • Source relation: contributes attributes to the
    derived relations
  • Derived relation: inherits properties of
    attributes from its source relations
  • Deriving link: links to the source relations
    through inherited keys
  • Inherited link: inherits links from the source
    relations

111
Extended semantic graph showing derived nodes,
derived links and inherited links
112
Suggesting Key Attributes for a Query
  • Find source relations for the isolated derived
    relation.
  • Suggest key of the source relations as attributes
    to include.

113
Concept and Attribute Specification Interface
114
Query Constraint Specification
115
Action Specification
116
English-Like Query Description and the Formulated
Query
117
Conclusions
  • Semantic graph model provides a basis for query
    formulation search
  • Ranking of query candidates by information
    measure in formulation provides adaptive behavior
  • Incremental query formulation is effective for
    complex queries
  • GUI and voice interface can be built for query
    formulation from high-level concepts
