Title: Semantic in Information Systems: Current Issues and New Trends
1 Semantic in Information Systems: Current Issues and New Trends
- Kokou Yétongnon
- Professor
2 Summary of ideas
- Semantics and integration of information systems is a recurring problem (new variations on an old problem)
- The global semantics of multiple systems is hard to determine (large number of information sources with varying levels of semantics)
3 Summary of ideas
- Dynamic environments require new ways of dealing with semantics and interoperability
- Semantics must emerge locally from interactions among information systems (negotiations, agreements, etc.)
4 Outline
- Introduction and Motivation
- Data Integration
- Modeling Semantics
- Information System Interoperability
- Semantics in Dynamic Environments
- Conclusion
5 The Traditional View of Information Systems
- Where is the semantics of the system?
- Schema: the semantics is partially represented in the schema
- Instances / Occurrences: occurrences conform to, or agree with, the schema
The traditional, safer and formal picture of information systems: the simple life
6 The Traditional View of Information Systems
- When do semantic issues arise?
- The schema is created by several designers
- Each handles a part of the system
- Semantic reconciliation is easy
- Make all designers agree on differences in terms or concepts
[Diagram: Schema 1, ..., Schema n combined into an Integrated Schema]
7 The Traditional View of Information Systems
- The main goal of the centralized system is to process company information
- Typical applications
- Payroll
- Managing enterprise projects or production systems
[Diagram: Schema over Instances / Occurrences]
8 Information Systems in Networked Environment
- Distributed systems and widespread resource sharing evolve from two major developments
- Powerful, efficient and smaller computers and processors
- Efficient communication concepts
- Internet
- Wireless communication
- Ad hoc, mobile networks (sensors, PDAs, other devices)
9 Information Systems in Networked Environment
- Data have also evolved in
- Complexity (image, multimedia, hypermedia)
- Volume (and storage requirements)
- Type
- So the related semantics is also growing in complexity
10 Information Systems in Networked Environment
- The main IT goal is
- Not only to process enterprise information
- But also to share information
- Global need: a system must handle information from a variety of sources (proprietary data, public information in web pages, information in web services)
11 Information Systems in Networked Environment
- What are the new challenges?
- Discovering and extracting relevant information
12 Information Systems in Networked Environment
- How does the traditional view evolve when we have multiple information systems?
- Schema
- May not be present
- Multiple schemas in distributed environments
- Schemas may be expressed using different models
- Description levels, precision and power may be different
- Formal descriptions may co-exist with natural language descriptions
13 Information Systems in Networked Environment
- How does the traditional view evolve when we have multiple information systems?
- Instances / Occurrences
- Occurrences are distributed over multiple sites
- Replicated or duplicated over sites
- Fragmented over multiple sites
14 Information Systems in Networked Environment
[Diagram: three information systems, each with its own schema and instances, connected at two levels]
- High level: cooperation or interoperation of IS, information sharing
- Low level: communication network
15 Information Systems in Networked Environment
- If we could build a Global Semantics
[Diagram: a Query is posed against a Virtual Integrated Information System whose Global Semantics sits on top of the individual schemas and their instances]
16 Information Systems in Networked Environment
What are the main issues?
- Semantics: what is it?
- How to best represent the semantics of data?
- How to reconcile differences in semantics?
- What about missing semantics?
[Diagram: several schemas with their instances, surrounding Semantics]
17 Data Integration: A Higher-Level Virtual View
Independence of
- source location
- data model, syntax
- semantic variations
[Diagram: a Query is posed against the Mediated Schema, which is linked by Semantic Mappings to sources S1, S2, S3]

<cd> <title>The best of </title>
<artist>Carreras</artist>
<artist>Pavarotti</artist>
<artist>Domingo</artist>
<price, US>19.95</price>

SSN          Name     Category
123-45-6789  Charles  undergrad
234-56-7890  Dan      grad

SSN          CID
123-45-6789  CSE444
123-45-6789  CSE444
234-56-7890  CSE142

CID     Name               Quarter
CSE444  Databases          fall
CSE541  Operating systems  winter
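The mediated-view idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are the student, enrollment and course tables shown on the slide, while the mediated relation `enrolled`, the `query` helper and the variable names are hypothetical and not part of the talk.

```python
# Rows copied from the slide (the repeated enrollment row is listed once).
STUDENTS = [
    {"SSN": "123-45-6789", "Name": "Charles", "Category": "undergrad"},
    {"SSN": "234-56-7890", "Name": "Dan", "Category": "grad"},
]
ENROLLMENTS = [
    {"SSN": "123-45-6789", "CID": "CSE444"},
    {"SSN": "234-56-7890", "CID": "CSE142"},
]
COURSES = [
    {"CID": "CSE444", "Name": "Databases", "Quarter": "fall"},
    {"CID": "CSE541", "Name": "Operating systems", "Quarter": "winter"},
]


def enrolled():
    """Hypothetical mediated relation enrolled(student, course, quarter).

    The 'semantic mapping' here is simply a join of the three sources
    on SSN and CID.
    """
    students = {s["SSN"]: s for s in STUDENTS}
    courses = {c["CID"]: c for c in COURSES}
    rows = []
    for e in ENROLLMENTS:
        student = students.get(e["SSN"])
        course = courses.get(e["CID"])
        if student is None or course is None:
            # A source lacks the matching row, so the mediated row
            # cannot be built (CSE142 has no course description).
            continue
        rows.append({
            "student": student["Name"],
            "course": course["Name"],
            "quarter": course["Quarter"],
        })
    return rows


def query(student=None):
    """Answer a query against the mediated relation, not the sources."""
    return [r for r in enrolled() if student is None or r["student"] == student]
```

The caller of `query` never names a source or a join key, which is the independence the slide lists. The query for Dan returns nothing because CSE142 is missing from the course source, a small instance of the "missing semantics" issue raised earlier.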
18 Application Areas
[Diagram. Labels: Enterprise Databases; Legacy Databases; Services and Applications; Single Mediated View; Enterprise Integration Applications; Business analysis; Portals]
19 Application Areas
[Diagram. Concepts: Phenotype, Gene, Structured Vocabulary, Sequenceable Entity, Experiment, Protein, Nucleotide Sequence, Microarray Experiment. Data sources: OMIM, Swiss-Prot, HUGO, GO, GeneClinics, LocusLink, Entrez, GEO]
Hundreds of biomedical data sources available, growing rapidly!
20 Application Areas
- Scientific Data Grid (Physics)
- CERN's EDMS
- PDM (Product Data Management)
- Engineering Data Management System
- MDAS: Massive Data Analysis System
- San Diego Supercomputer Center, 95-97, DARPA financed
- Manage resources in a heterogeneous distributed system
- Metadata and data description
- Detect available resources, storage spaces
21 Application Areas
- Web
- E-commerce (Amazon.com, Barnes & Noble)
- E-ticket reservation
- Online hotel reservation
22 The Semantic Web (Berners-Lee)
- To allow knowledge sharing at the web scale (interaction between machines or users)
- Web resources must be described by ontologies (precise explicit semantics)
- Need a rich domain model
- Powerful standards (RDF/OWL)
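To make "described by ontologies" a little more concrete: RDF describes resources as subject-predicate-object triples. The sketch below mimics that model with plain Python tuples instead of a real RDF library. The `ex:` identifiers and the helpers `match` and `instances_of` are made up for illustration; only `rdf:type` and `rdfs:subClassOf` are actual RDF/RDFS vocabulary terms.

```python
# Each statement is a (subject, predicate, object) triple, as in RDF.
TRIPLES = {
    ("ex:cd1", "rdf:type", "ex:CD"),
    ("ex:cd1", "ex:artist", "Pavarotti"),
    ("ex:cd1", "ex:price", "19.95"),
    ("ex:CD", "rdfs:subClassOf", "ex:Product"),
}


def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def instances_of(cls):
    """Instances of cls or of its direct subclasses.

    Only one level of rdfs:subClassOf is followed; a real reasoner
    would compute the full transitive closure.
    """
    classes = {cls} | {sub for sub, _, _ in match(p="rdfs:subClassOf", o=cls)}
    return sorted(s for s, _, c in match(p="rdf:type") if c in classes)
```

A machine that only knows the concept `ex:Product` can still find `ex:cd1` through the subclass statement, which is the kind of explicit semantics the slide asks for.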
23 The Semantic Web (Berners-Lee)
- Challenges
- Complex semantic integration issues at the web level (this may be too complicated for non-technical end users, unless fully automated)
- Lack of convincing applications at the semantic web level
24 The Semantic Web (Berners-Lee)
- Where are the real obstacles to semantic integration?
- Systems
- Managing different platforms
- Query processing across multiple platforms
- Social
- Locating and capturing relevant information in the enterprise
- Convincing people to share data (privacy and performance reasons)
- Logic
- Schema and heterogeneity
25 Virtual Enterprises: Workflow model
- Challenges
- Decentralized organizational structures
- Variety of information sources and services
26 Virtual Enterprises: Workflow model
- Issues: operational aspects of the business process
- Interoperability
- Autonomy
- Openness and sharing of information
- Dynamic participation, mergers and acquisitions
- On-the-fly integration
27 Workflow Management Systems
- WfMS languages
- WSDL: basis for many inter-organizational workflow specification languages
- Other languages: ebXML, WSCL, XLANG, BPML
- Problems with languages
- Advanced, but they lack a common taxonomy
- Do not support different views of the workflow
- Solution
- Agreement-based inter-organizational workflow
28 Agreement-based Workflow Model
- Local modeling view
- Each organization creates a local/personal view of its own workflow
- Based on loose interaction
[Diagram. Labels: local views View-b and View-c; workflows WFLa, WFAb, WFAc; Compatibility Analysis; Agent Negotiation; Global View]
29 Semantic Modeling Perspectives: Ontologies
- An ontology provides a shared representation and understanding of data and services in a common domain
- Use ontologies for interaction between people and application systems
30 Semantic interoperability
- Ability of a system component to provide information sharing and inter-application cooperative control
- Ontologies are used as a comparison reference
- Ontologies are forms of a priori agreement on concepts; their use is therefore insufficient in ad hoc dynamic environments, where not all possible interpretations can be anticipated
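The limitation just stated can be illustrated with a small sketch: a predefined ontology serves as a comparison reference only for the terms its designers anticipated. The ontology contents and the names `ONTOLOGY`, `concept_of` and `same_concept` are hypothetical.

```python
# A priori agreement: each shared concept lists the local terms that
# were known when the ontology was designed (made-up example data).
ONTOLOGY = {
    "Customer": {"customer", "client", "buyer"},
    "Product": {"product", "item", "article"},
}


def concept_of(term):
    """Return the shared concept for a local term, or None if the term
    was never anticipated by the ontology."""
    t = term.lower()
    for concept, terms in ONTOLOGY.items():
        if t in terms:
            return concept
    return None


def same_concept(term_x, term_y):
    """Compare two local terms through the ontology.

    Returns True or False when both terms are covered, and None when
    the ontology cannot decide because a term is unknown to it.
    """
    cx, cy = concept_of(term_x), concept_of(term_y)
    if cx is None or cy is None:
        return None
    return cx == cy
```

For example, `same_concept("client", "buyer")` is decided by the ontology, while `same_concept("client", "subscriber")` returns `None`: a newly arrived system using the term "subscriber" falls outside the prior agreement, so the meaning would have to be negotiated at run time.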
31 Semantic interoperability
- Interoperation of systems X and Y
[Diagram: Information system X sends Request R to Information system Y, which returns Response Y; both rely on a mutual understanding of R and Y. Ontology: Partial? Global? Predefined?]
32 Semantic interoperability: Mining semantics from the instances
- Web pages containing instances
- Extract semantics based on the similarity of instances
- Extraction based on clusters: similar instances?
- Missing values? Requires an ontology? Maintenance?
33 Semantic interoperability: Mining semantics from the instances
- How to define similar instances in this case?
- Bookstore example (Amazon.com), consisting of
- Items (books, etc.)
- Customers (millions)
- Each book is represented by a vector V such that V[i] is 1 if customer i bought a copy of the book (and 0 otherwise)
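The book representation described on this slide can be sketched as follows. The purchase log and the function name `book_vectors` are invented for illustration; the only thing taken from the slide is the rule that component i is 1 when customer i bought the book.

```python
# Hypothetical purchase log: (customer, book) pairs.
PURCHASES = [
    ("alice", "Databases"),
    ("bob", "Databases"),
    ("alice", "Operating Systems"),
    ("carol", "Cooking"),
]


def book_vectors(purchases):
    """Build one binary vector per book, with one dimension per customer.

    Returns the ordered customer list (which fixes the meaning of each
    dimension) and a dict mapping each book to its vector.
    """
    customers = sorted({customer for customer, _ in purchases})
    index = {customer: i for i, customer in enumerate(customers)}
    vectors = {}
    for customer, book in purchases:
        vec = vectors.setdefault(book, [0] * len(customers))
        vec[index[customer]] = 1  # customer i bought this book
    return customers, vectors


customers, vectors = book_vectors(PURCHASES)
```

With millions of customers these vectors are very sparse, so a real system would more likely store, for each book, the set of customers who bought it; the dense list is used here only because it mirrors the slide.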
34 Semantic interoperability: Mining semantics from the instances
- How to define similar instances in this case?
- Books bought by the same groups of customers?
- Use other clustering methods
- We plot the vectors in an n-dimensional space in which each dimension represents a customer; each point stands for one book and identifies the customers who bought it
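One way to read "books bought by the same groups of customers" is to compare the binary customer vectors directly. The sketch below is an assumption, not the method of the talk: it uses Jaccard similarity and a simple threshold-based grouping, and the data, the threshold of 0.5 and the function names are all illustrative choices.

```python
def jaccard(u, v):
    """Jaccard similarity of two binary vectors: shared buyers divided
    by the number of customers who bought at least one of the books."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0


def group_similar(vectors, threshold=0.5):
    """Group books so that each book is linked to at least one other
    book of its group with similarity >= threshold (single-link)."""
    groups = []
    for book, vec in vectors.items():
        # Existing groups containing a sufficiently similar book.
        linked = [
            g for g in groups
            if any(jaccard(vec, vectors[other]) >= threshold for other in g)
        ]
        merged = [book]
        for g in linked:
            merged.extend(g)
            groups.remove(g)
        groups.append(sorted(merged))
    return sorted(groups)


# Hypothetical vectors over four customers.
VECTORS = {
    "Databases": [1, 1, 0, 0],
    "Data Mining": [1, 1, 1, 0],
    "Cooking": [0, 0, 0, 1],
}
clusters = group_similar(VECTORS)
```

Here the two technical books share two of their three buyers and end up in the same group, while the cookbook stays alone. Any other clustering method, as the slide notes, could replace this grouping step.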
35 Semantics in Dynamic Environments: P2P, Self-Organizing Systems
- Example: Semantic Overlay Networks (SONs) in P2P systems
- Peers with similar content are clustered together, based on content hierarchies (similar to ontologies)
36 Semantic Overlay Networks for a music application
[Diagram: peers A through H grouped into overlay networks labeled Rock and Rap]
37 Classification Hierarchies of the semantic overlay network
- Music by style: Rock, Jazz
- Rock sub-styles: Soft, Dance, Pop
- Jazz sub-styles: New Orleans, Bop, Fusion
- Music by decade: 10s, 90s, Now
- Music by tone: Warm, Sweet, Exciting
38 Generating Semantic Overlay Networks
[Diagram of the SON generation process. Components: Concept hierarchy, Data Distribution, SON Definition, Document Classifier, Node Classifier, New Nodes, Query, Query Classifier, SONs, Query Result]
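The roles of the classifiers named on this slide can be sketched as follows. This is a toy illustration under loud assumptions: the keyword lists, the keyword-matching classifier and the functions `classify`, `build_sons` and `route_query` are all hypothetical stand-ins, and a real system would use proper document and query classifiers over the concept hierarchy.

```python
from collections import defaultdict

# Hypothetical leaf concepts of a music hierarchy with made-up keywords.
CONCEPT_KEYWORDS = {
    "Rock": {"rock", "guitar"},
    "Jazz": {"jazz", "bop"},
}


def classify(text):
    """Stand-in for the document and query classifiers: return the
    concepts whose keywords occur in the text."""
    words = set(text.lower().split())
    return {c for c, keywords in CONCEPT_KEYWORDS.items() if words & keywords}


def build_sons(node_documents):
    """Node classifier: a node joins the SON of every concept that at
    least one of its documents is classified under."""
    sons = defaultdict(set)
    for node, documents in node_documents.items():
        for document in documents:
            for concept in classify(document):
                sons[concept].add(node)
    return dict(sons)


def route_query(query, sons):
    """Query classifier: forward the query only to the nodes of the
    SONs matching the query's concepts."""
    targets = set()
    for concept in classify(query):
        targets |= sons.get(concept, set())
    return sorted(targets)


# Hypothetical peers and their document titles.
NODES = {
    "A": ["classic rock anthems"],
    "B": ["rock guitar solos", "bop standards"],
    "C": ["new orleans jazz"],
}
sons = build_sons(NODES)
```

With this data a jazz query reaches only peers B and C instead of being flooded to every peer, which is the benefit of clustering peers by content.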
39 Conclusion
- We have shown that
- Semantics is pervasive in information systems
- Semantic integration of information systems is a hard problem
- Because the schema, when present, never fully captures the intended meaning of information
- Because automatic resolution of differences is not yet satisfactory
- Because technological, social and logical obstacles still exist
40 Conclusion
- We have shown that
- Ontologies are increasingly used in integration solutions
- Ontology-based solutions have limitations when used in ad hoc dynamic environments
- One modest objective of integration solutions should be to limit the required human effort
41 That's All