Title: Umbrella 0'5 Project Report
1Umbrella 0.5Project Report 1
- Heterogeneous Data Source Access Using Standard
SQL Queries - Eric Matson MSE
- Kansas State University
2Agenda
- Overview
- Methods Tools
- Requirements
- Project Management and Control
- GUI Walkthrough/Prototyping
- Summary
3Overview
- Purpose
- Goals
- Description
- Architecture
- Key Elements
- Process
- Test Lab
- Scope
- Future Directions
- Literature Search Results
4Purpose
- Query heterogeneous data sources using standard
SQL statements - Achieve database integration and access at the
query level - Allow the data to be black box whereas users
dont need to have understanding to source,
architecture, or technical details
5Goals
- Complete the Requirements of CIS895
- Successfully Navigate Software Engineering
Process - Prove Umbrella Concept
- Prove Ability to Process Umbrella Queries
- Split the query into to process heterogeneous
sources - Manage the connection processing with the data
sources - Return and create new relation(s)
- Create easy to use Graphical User Interface (GUI)
6Description
- Enter queries using standard SQL statements
- Query heterogeneous data sources
- Text Files
- Formatted
- Delimited
- Relational Databases
- PostgreSQL
- MySQL
- Informix
- Knowledge Base Formats
- Prolog Rules
- CLIPS Deftemplate facts
- Non-relational Databases
- ODBC Data Sources
- Unify Data together as single relation
7Description Architecture
Relational Database
Parser
D a t a C o n n e c t o r s
Object Database
API
Splitter
TCP/IP
Text File
Router
Umbrella GUI
Unifier
KB File
Profile
Data Server
Data Sources
Engine
8Description Key Elements
- GUI User Interface
- Umbrella Engine Agent
- Parser Checks the query for correctness and
validity - Splitter Splits query to appropriate data
sources - Router Manages routing and connection to data
source - Unifier Unites data upon return
- Profile Database to manage data source
information - Data Connector Interface to data sources
- Data Server
- Text Wraps data and serves the Umbrella Engine
- KB Wraps fact and serves the Umbrella Engine
9Description - Process
SELECT FROM Job, Name, Addr WHERE Job.SSN
Name.SSN AND Job.SSN Addr.SSN
SSN Title Salary
Job
Parse
SELECT FROM Job
SSN Title Salary First Middle Last Street State ZI
P
Split
SSN First Middle Last
Unify
SELECT FROM Name
DC
Name
DC
Route
DC
SSN Street State ZIP
SELECT FROM Addr
Addr
10Description - Test Lab
Server
Client
Network
PostgreSQL
Linksys 10 Mb Hub
Windows 98
MySQL
Red Hat 7.1
192.168.1.2
192.168.1.1
Text
11Scope
- Functionality
- Umbrella 0.5 Query Engine
- Graphical User Interface (GUI)
- Data Servers for Text Data Sources
- Data Connectors
- SQL Select Statement in Limited Format
- Data Sources
- PostgreSQL ANSI Compliant ORDBMS
- MySQL non-ANSI Compliant RDBMS
- Formatted Text
- Delimited Text
12Future Directions
- Version 0.6 SQL release
- Implement full grammar support for SQL
- Enhance dataset join algorithm
- Enhance GUI
- Add ODBC, Berkeley DB, Interbase and ADABAS data
connectors - Version 0.7 KD/AI release
- Add Lisp, Prolog, CLIPS data connectors
- Enhance Query Efficiency Algorithms
- Version 0.8 Integration release
- Add API layer for access by C/C, Java, VB
- Add tools for advanced searches and Data Mining
- Add Security Module
- Version 0.9 Data Connector release
- Add Oracle, Informix, UDB, Sybase data connectors
- Add Object Database data connectors
- Add XML data connector
- Version 1.0 Mainframe release
- Add IMS, DB2, VSAM data connectors
13Literature Search
- Looked at US/Canadian Research Units
- Companies
- Universities
- Similar Research/Product Development
- WWW.Jibe.com - Only JDBC/ODBC sources
- Queens University(Canada)/IBM Older
pre-Internet Explosion - Stanford Mariposa Project
- Summary
- Some Similar Projects
- Most are Tangential in Scope and Function
14Methods Tools (MT)
- Development Process
- Object Modeling
- Formalism/Constraints
- Development Tools
- Infrastructure
-
15MT Development ProcessCOMET(Concurrent
Object Modeling and architectural design mEThod
Hassan Gomaa)
Planning
Delivery
Non-COMET Process
COMET Process
Requirements Modeling
Analysis Modeling
Design Modeling
Incremental Software Construction
Throwaway Prototyping
Incremental Software Integration
Systems Testing
Incremental Prototyping
Phase 1
Phase 2
Phase 3
16MT Object Modeling
- Unified Modeling Language (UML) used
- UML is standard OO modeling language
- Well understood
- UML Tool
- Poseidon
- ArgoUML
- Both are very similar, Poseidon is an extension
of ArgoUML
17MT Formalism/Constraints
- Object Constraint Language (OCL)
- Easier to understand
- Less mathematical
- Less developed
- USE
- Graphical tool to support OCL
18MT Development Tools
- Sun Java 1.3 JDK/JRE
- Newest release of development kit
- Write Once, Run Anywhere (Portable)
- Forte (Netbeans) Integrated Development
Environment - Community Edition (CE)
- Easy to use
- Free!
19MT - Infrastructure
Server OS Red Hat Linux 7.1
Object/Relational Database Server PostgreSQL
Relational Database Server MySQL
20Requirements
- Object Diagram
- Class Descriptions
- SQL Grammar
- SQL Example
21Requirements Object Diagram
DataConn
Engine
Parser
RelDataConn
TextDataConn
Query
Splitter
FormDataConn
JDBCConn
DelDataConn
SubQuery
Interagator
PostgresConn
MySQLConn
Profile
Router
JDBCProfile
TextProfile
Unifier
TextDesc
DelTextProf
FormTextProf
ResultSet
DataServer
22Requirements Class Descriptions
- Engine Control agent of Umbrella architecture
- Parser Checks query for grammar/valid data
objects - Splitter Splits queries into valid sub-queries
- Router Routes queries to data source/manages
connection - Unifier Unifies sub-query result sets into
single relation - Query SQL statement
- SubQuery Created by Splitter for specific data
source - ResultSet Data returned from sub-query of data
source - TextDesc Defines characteristics of text data
source - Interrogator Queries a new data source for
information on field definitions - DataServer Server that supports access to text
data sources
23Requirements Class Descriptions
- Profile Defines characteristics of a data
source - JDBCProfile
- FormattedTextProfile
- DelimitedTextProfile
- DataConnector Defines specifics to query a data
source - RelationalDataConnector
- JDBCDataConnector
- MySQLDataConnector
- PostgreSQLDataConnector
- TextDataConnector
- FormattedTextDataConnector
- DelimitedTextDataConnector
24Requirements SQL Grammar
- ltstatementgt SELECT ltselectgt FROM lttablegt.
SELECT ltselectgt FROM lttablegt WHERE
ltconditiongt. - ltselectgt ltfieldgt ltselectgt,ltfieldgt
- ltfieldgt ltfieldnamegt lttablenamegt.ltfieldnamegt
- ltfieldnamegt ltalphagt ltalphagtltdigitgt
- lttablegt lttablenamegt lttablenamegt
,lttablenamegt - lttablenamegt ltalphagt ltalphagtltdigitgt
- ltconditionalsgt ltrelationgt AND ltrelationgtOR
ltrelationgt - ltrelationgt ltfieldgt ltoperandgt ltrightsidegt
- ltoperandgt lt gt
- ltrightsidegt ltfieldgt ltvartypegt
- ltvartypegt ltstringgt ltintegergt ltrealgt
- ltstringgt ltalphagt ltalphagtltdigitgt
- ltintegergt ltdigitgt ltdigitgt
- ltrealgt ltintegergt . ltintegergt
- ltalphagt abcxyzABCXYZ
- ltdigitgt 0123456789
25Requirements SQL Example
- SELECT FROM address
- SELECT ssn, age FROM employee
- SELECT lastname, firstname FROM employee WHERE
age gt 40 - SELECT lastname, firstname, salary FROM employee,
hr WHERE employee.ssn hr.ssn AND employee.age gt
25
26Project Planning and Control
- Project Planning
- Work Breakdown Structure (WBS)
- Gantt Chart
- Description of Estimation Techniques
- Project Control
- Effort/Budget
- Project Log and Time Tracking
- Change Management (Feedback Loops)
27Project Planning - WBS
1 Umbrella
Goals
1.1 Planning
1.1.1 Define Purpose
1.2 Requirements Modeling
1.1.2 Define Scope
Outcome
1.3 Throwaway Prototyping
1.1.3 Define Purpose
1.4 Analysis Modeling
1.1.4 Define Goals
1.5 Design Modeling
1.6 Incr. SW Construction
1.7 Incr. SW Integration
1.8 Incr. Prototyping
1.9 Systems Testing
1.10 Delivery
28Project Planning Gantt Chart
September
October
November
December
January
Planning
Reqs Modeling
Throwaway Prototyping
Analysis Mod.
Design Mod.
Incr. SW Const
Incr. SW Integ.
Incr. Prototyping
Systems Testing
Delivery
Review 1
Review 2
Review 3
29Project Planning Estimation Techniques
- Estimate Tasks on Defined WBS
- All tasks broken into chunks of work between
.25 and 8 hours - Keep project on schedule
- WBS Estimation Used
- Small Project Team
- Well Understood Capabilities of Project Member(s)
- Change Budget Allocated at 20 of Initial
Estimate - Works with Iterative Schemes
- Change is Planned and Accepted
30Project Control Effort/Budget
- Effort
- Effort Data Hours Man/Mo (168 hr/mo)
- Predicted 565 3.36
- Current 183.25 1.09
- Budget
- Time (50/hr) 28,250
- Current 9,162.50
31Project Control Time Log/Issues
- Time Log
- Track time spent each day on project tasks
- WBS Number
- Hours
- Viewed on Engineering Notebook Site
- Issues
- As design issues are realized
- Issues Documented
- Solutions Arrived at and Documented
- Folded in to Change Management Process
- If Impact and In Scope
32Project Control Change Mgmt.
- Development Process (COMET) Oriented for Change
- Feedback changes will be tracked
- Change Control Number
- Task Name
- Added Hours to Project Schedule
- Description of Change
- WBS Number of Artifacts Altered
- Dependencies
33Graphical User Interface
File Connection Query Help
Connections --- 192.168.1.2 ----
/opt/data ----formatted.txt ----
PostgreSQL Server ----Employee
----Salary --- 192.168.1.1 ---- MySQL
Server ----Items
----Invoice ----Customers ----
/var/lib/data ----delimited.txt Queri
es --- Employees and Customers ---
Employee ---- ssn, lastname ---
Customers ---- ssn
Define Query
Name
Employees and Customers
Query
SELECT ssn, lastname FRO
111-22-3333 Elvis Presley 222-33-4444 Richard
Nixon 333-44-5555 Mikhail Gorbechev
Results
Status Subquery1 OK Subquery2 Failed
34Summary
- Umbrella accesses heterogeneous data sources
- Defined set of tools and methodologies
- High Level Requirements Defined
- Project Management Well Defined
- Easy to Use GUI
- Next Steps
- Questions
- Feedback
- Signoff
- Proceed to Phase 2