Title: Managing GridDatabases with GRelC
1Managing Grid-Databases with GRelC
- Ph.D. Sandro Fiore
- SPACI Consortium and University of Salento (Lecce), Italy
- ISSGC2007
- July 12th
2Outline
- Motivations
- GRelC Project
- GRelC DAS
- Architecture
- SDK
- GUI
- Queries
- Experimental Results
- Porting on gLite
- Deployment
- On Line User Tutorial (GILDA)
- Conclusions
3Motivations
- Data Grids should provide a low level framework also for grid-database management (fine grained approach)
- No new DBMS or new query language
- Legacy systems/databases and standard SQL
- Need for more complex and efficient queries in grid
- Integration with production grid environments (based on gLite, Globus, ...)
- Main requirements: security, transparency, interoperability, efficiency, robustness, etc.
4Introducing the GRelC Project
- Grid Relational Catalog is a project which aims at designing and developing a set of efficient, secure and transparent Data Grid Services (starting date: Jan 2001).
- GRelC Data Access Service aims at providing a large set of functionalities to access both relational and non-relational databases in a grid environment.
5GRelC Project: a bit of history
6GRelC DAS Architecture
GRelC DAS
7GRelC DAS Main Features
- Entirely based on the C programming language
- Multithreaded web service
- It exposes a GSI-enabled, WS-I compliant web service interface
- Mutual authentication based on GSI (X.509v3 digital certificates)
- GRelC DAS authorization based on ACL, for local management
- VOMS support, for global management
- Information System support (BDII compliant)
- Wide set of data access control policies
- Full GSI support: data encryption, data integrity, protection against replay attacks and detection of out-of-sequence packets
8GRelC DAS Main Features
- XML data validation for recordset
- SingleQuery, MultiQuery and MultiSingleQuery support
- Support for synchronous and asynchronous queries
- Dynamic binding to heterogeneous DBMSs
- Two levels of logging (users, connections, queries, etc.)
- GSI-enabled remote administration tools and remote log
- Compression, chunking, prefetching and streaming to enhance performance on a WAN
- Wide SDK for developers (both for C and C++)
- No dependencies on other middleware (only GSI)
9GRelC DAS Architecture
10GRelC DAS SDAI
11Standard Database Access Interface
- Features
- Standard access to data sources
- Types uniformity
- Error uniformity
- Plug-in architecture based on dynamic libraries
- Dynamic binding to: PostgreSQL, MySQL, SQLite, IBM/DB2, Oracle 9i, MS-SQL Server, UnixODBC, Textual DBs, etc.
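The plug-in idea above can be illustrated with a minimal C sketch. It is not the SDAI implementation: the `demo_driver` structure, the stub drivers and `demo_find_driver` are hypothetical names introduced here, and a static table stands in for plug-ins that a real library would load from dynamic libraries at run time. The point is only that every driver exposes the same table of function pointers, so the caller selects a DBMS by name and never touches DBMS-specific code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical driver interface (not the SDAI one): every DBMS plug-in
   fills in the same table of function pointers. */
typedef struct {
    const char *dbms_name;              /* e.g. "PostgreSQL" */
    int (*connect)(const char *dbname); /* returns 0 on success */
    int (*query)(const char *sql);      /* returns 0 on success */
} demo_driver;

/* Stub drivers standing in for real plug-ins; they only check arguments. */
static int pg_connect(const char *dbname) { return dbname != NULL ? 0 : -1; }
static int pg_query(const char *sql)      { return sql != NULL ? 0 : -1; }
static int my_connect(const char *dbname) { return dbname != NULL ? 0 : -1; }
static int my_query(const char *sql)      { return sql != NULL ? 0 : -1; }

/* Static registry; a real library would fill this from shared objects. */
static const demo_driver demo_drivers[] = {
    { "PostgreSQL", pg_connect, pg_query },
    { "MySQL",      my_connect, my_query },
};

/* Select a driver by DBMS name at run time; NULL if none is registered. */
const demo_driver *demo_find_driver(const char *dbms_name)
{
    size_t i;
    assert(dbms_name != NULL);
    for (i = 0; i < sizeof demo_drivers / sizeof demo_drivers[0]; i++) {
        if (strcmp(demo_drivers[i].dbms_name, dbms_name) == 0)
            return &demo_drivers[i];
    }
    return NULL;
}
```

A caller would look up the driver once, for example `demo_find_driver("MySQL")`, and then use only the `connect` and `query` pointers, which is what makes types and errors uniform across back ends.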
12New drivers: IBM/DB2, Oracle, MS-SQL
[Figure: architecture diagram. Labels: Authorized client (built on top of the set of services); SOAP, XML, GSS-API, GSI; SQ Access Policy, MQ Access Policy, Configuration Policy; AuthUser, AuthDB User; Production Drivers and Pre-Production Drivers; data resources: Unix ODBC Data Source, MySQL, PostgreSQL, Oracle, IBM/DB2, MS SQL Server]
13GRelC SDAI Library APIs (I)
- int grelc_sdai_handle_set_grelc_dbname(grelc_sdai_handle handle, const char value)
- int grelc_sdai_bind(grelc_sdai_handle handle)
- int grelc_sdai_unbind(grelc_sdai_handle handle)
- int grelc_sdai_init()
- int grelc_sdai_exit()
- int grelc_sdai_query_submission(grelc_sdai_handle handle, char query)
- int grelc_sdai_ntuples(grelc_sdai_handle handle)
- int grelc_sdai_nfields(grelc_sdai_handle handle)
- int grelc_sdai_field_name(char field, grelc_sdai_handle handle, int i)
- int grelc_sdai_field_type(grelc_sdai_handle handle, int i)
14GRelC SDAI Library APIs (II)
- int grelc_sdai_get_value(char value, grelc_sdai_handle handle, int i, int j)
- int grelc_sdai_clear_result(grelc_sdai_handle handle)
- int grelc_sdai_lock(grelc_sdai_handle handle, int mode, char table)
- int grelc_sdai_unlock(grelc_sdai_handle handle)
- int grelc_sdai_begin_transaction(grelc_sdai_handle handle)
- int grelc_sdai_commit_transaction(grelc_sdai_handle handle)
- int grelc_sdai_rollback_transaction(grelc_sdai_handle handle)
- int grelc_sdai_get_tables(grelc_sdai_handle handle)
- int grelc_sdai_get_fields(grelc_sdai_handle handle, char table)
15A simple SDAI Client
SDAI Client
  if ((res = grelc_sdai_bind (handle)))
  {
    fprintf (stderr, "ERROR! Database bind failed Code %d!\n", res);
    return -1;
  }

  if (grelc_sdai_query_submission (handle, query))
  {
    fprintf (stderr, "ERROR! Query submission failed!\n");
    return -2;
  }
  if (strcasestr (query, "SELECT"))
  {
    for (outer = 0; outer < grelc_sdai_ntuples (handle); outer++)
    {
      ...
16GRelC DAS Internal Components
17The GRelC Library APIs Classification
- Database access and query services
- bind
- unbind
- query submission
- Remote manipulation services
- get_value
- get_current_tuple
- Resultset store and retrieving services
- store_result_disk
- fetch_stored_recordset
- User management services
- add_user
- remove_user
- set_user_policy
- Enterprise Grid management services
- add_host
- add_dbms
- Virtual space management services
- create_virtual_database
Wide SDK for both C and C++ developers
18SDK (I)
- Database access and query services
- grelc__data_access__bind
- grelc__data_access__unbind
- grelc__data_access__query_submission
- grelc__data_access__multi_query_submission
- On-line Approach
- grelc__data_access__ntuples
- grelc__data_access__nfields
- grelc__data_access__field_name
- grelc__data_access__field_type
- grelc__data_access__get_value
- grelc__data_access__clear_result
- grelc__data_access__get_current_tuple
- Memory Approach
- grelc__data_access__store_result_memory
- File Approach
- grelc__data_access__store_result_disk
- grelc__data_access__fetch_stored_recordset.
19SDK (II)
- User management services
- grelc__data_access__add_user
- grelc__data_access__delete_user
- grelc__data_access__get_users
- grelc__data_access__set_user_policy
- grelc__data_access__get_user_policy
- grelc__data_access__delete_stored_procedure
- grelc__data_access__alter_stored_procedure_alias
- grelc__data_access__alter_stored_procedure
- Enterprise Grid management services
- grelc__data_access__add_host
- grelc__data_access__delete_host
- grelc__data_access__get_hosts
- grelc__data_access__add_dbms
- grelc__data_access__delete_dbms
- grelc__data_access__set_dbms_port
- grelc__data_access__set_dbms_login.
20SDK (III)
- Virtual space management services
- grelc__data_access__get_databases
- grelc__data_access__create_virtual_database
- grelc__data_access__drop_virtual_database
- grelc__data_access__register_database
- grelc__data_access__create_database
- grelc__data_access__drop_database
- grelc__data_access__create_phy_db_and_register
- grelc__data_access__clear_out_database
- grelc__data_access__make_dump
- grelc__data_access__get_login
-
- QoS service
- grelc__data_access__relocate_database
21SDK (IV)
- GRelCRecordset
- grelc_service_open_recordset
- grelc_service_movefirst
- grelc_service_movenext
- grelc_service_move
- grelc_service_eof
- grelc_service_eof_group_records
- grelc_service_get_value
- grelc_service_get_attribute_name
- grelc_service_get_attribute_type
- grelc_service_get_num_attribute
- grelc_service_get_num_records
- findfirst
- findnext
- grelc_service_free_recordset
- remove_file_from_disk.
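The movefirst / movenext / eof names in the list above suggest a cursor-style traversal of a recordset. The following C sketch shows that pattern under stated assumptions: `demo_recordset` and every `demo_*` function are hypothetical stand-ins written for this illustration, not the `grelc_service_*` calls themselves, whose signatures are not given in the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory recordset: rows x cols string values plus a cursor. */
typedef struct {
    const char *const *values; /* row-major array of rows * cols strings */
    size_t rows;
    size_t cols;
    size_t current;            /* index of the current record */
} demo_recordset;

/* Put the cursor on the first record. */
void demo_movefirst(demo_recordset *rs)
{
    assert(rs != NULL);
    rs->current = 0;
}

/* Advance the cursor by one record, stopping just past the last one. */
void demo_movenext(demo_recordset *rs)
{
    assert(rs != NULL);
    if (rs->current < rs->rows)
        rs->current++;
}

/* Non-zero once the cursor has moved past the last record. */
int demo_eof(const demo_recordset *rs)
{
    assert(rs != NULL);
    return rs->current >= rs->rows;
}

/* Value of column col in the current record, or NULL if unavailable. */
const char *demo_get_value(const demo_recordset *rs, size_t col)
{
    assert(rs != NULL);
    if (demo_eof(rs) || col >= rs->cols)
        return NULL;
    return rs->values[rs->current * rs->cols + col];
}

/* Typical traversal loop: visit every record exactly once. */
size_t demo_count_records(demo_recordset *rs)
{
    size_t n = 0;
    for (demo_movefirst(rs); !demo_eof(rs); demo_movenext(rs))
        n++;
    return n;
}
```

The loop in `demo_count_records` is the shape a client would typically use, reading values with a get-value call inside the body instead of counting.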
22GRelC-CppProxy: a C++ Module
- A C++ module was created to make it easy to develop new web service clients in this language.
- This module hides the communication layer with Web Services
23CppProxy Class
class CppProxy {
public:
  int bind (string grelc_db_name);
  int unbind ();
  int query_submission (string query);
  ....
  int create_database (string grelc_db_name, string identity, string dbms, string host, string istance, int log_type);
  int create_physical_database_and_register (string grelc_db_name, string dbms, string host, string istance, int log_type);
  int drop_database (string grelc_db_name);
  int get_log (int num_lines, string log);
  int get_log_database (string grelc_dbname, int num_lines, string log);
  int get_host_position_info (HostInfoResponse response);
  int get_value (int row, int column, string value);
private:
  struct soap soap;
  struct gsi_plugin_data data;
  char connection;
  bool connected;
  bool enable_credential;
  string dn;
};
24XGRelC: A console for Grid-DB Management
- Functionalities
- User management
- Web Service registration
- Host Management
- Logging
- DBMS configuration
- Database creation
- Import Database
- Database configuration
- Query submission
- Map deployment
25XGRelC GUI Snapshots
26GRelC Queries
- The latest GRelC release supports the following query types
- Single Query Online
- Single Query Memory (chunk management)
- Single Query File (chunk management)
- Single Query File ZIP (chunk management)
- Single Query Prefetch (parallel chunk download/processing)
- Single Query Stream (resultset streaming)
- Web Single Query XHTML (chunk management / paging)
- CSS v2.0, XHTML v1.0 Strict
- Results displayed in the following formats
- Tabular
- XML
- HTML
- RAW
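Several of the query types above rely on chunk management, that is, delivering a resultset in fixed-size pieces. The slides do not say how GRelC sizes or numbers its chunks, so the C sketch below only shows the generic arithmetic, with hypothetical names (`demo_chunk`, `demo_num_chunks`, `demo_chunk_at`) and the assumption that chunks are measured in rows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one chunk of a resultset. */
typedef struct {
    size_t first; /* index of the first row in the chunk */
    size_t count; /* number of rows in the chunk */
} demo_chunk;

/* Number of chunks needed for total_rows rows (ceiling division). */
size_t demo_num_chunks(size_t total_rows, size_t chunk_rows)
{
    assert(chunk_rows > 0);
    return (total_rows + chunk_rows - 1) / chunk_rows;
}

/* Row range covered by chunk number index; the last chunk may be shorter. */
demo_chunk demo_chunk_at(size_t total_rows, size_t chunk_rows, size_t index)
{
    demo_chunk c;
    size_t remaining;
    assert(chunk_rows > 0);
    assert(index < demo_num_chunks(total_rows, chunk_rows));
    c.first = index * chunk_rows;
    remaining = total_rows - c.first;
    c.count = remaining < chunk_rows ? remaining : chunk_rows;
    return c;
}
```

With 10 rows and chunks of 4 rows this yields three chunks, the last one holding only rows 8 and 9. A prefetching client could request chunk `index + 1` while it is still processing chunk `index`.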
27Single Query On-Line
[Figure: Single Query On-Line flow. Components: Client, GRelC-Data-Access, DBMS. Labels: SQS submission, SQL, Recordset, Result submission, Get Value/Get Tuple]
This kind of query is suitable for DML statements or to retrieve small resultsets
28Single Query File Approach (Zip)
This kind of query is suitable to retrieve medium/large resultsets
[Figure: Single Query File (Zip) flow. Components: Client, GRelC Data Access, DBMS. Labels: SQS query, SQL, Recordset, Data Delivery, GrelCRecordset in XML format, GrelCLoad Recordset, GrelCRecordset APIs]
29Single Query File chunk (Zip)
This kind of query is suitable to retrieve medium/large resultsets
[Figure: Single Query File chunk (Zip) flow. Components: Client, GRelC Data Access, DBMS. Labels: SQS query, SQL, Recordset, Data Delivery, GrelCRecordset in XML format, GrelCLoad Recordset, GrelCRecordset APIs]
30GRelC Data Access Clients
31Single Query HTML
[Figure: Single Query HTML flow. Components: Client, GRelC-Data-Access, DBMS. Labels: HTTP connection, SQS submission, SQL, Recordset, URI Result]
32Single Query HTML (Pre-production)
[Figure: Single Query HTML (pre-production) flow. Components: Client, GRelC-Data-Access, DBMS. Labels: HTTPS connection using X.509 certificates, SQS submission, SQL, Recordset, URI Result]
33Asynchronous Query
- Asynchronous queries
- Batch mode
- Users can define a lifetime for results availability on the GRelC DAS
- Decoupling client/server (e.g. WN gLite)
- New clients (submission, status, abort)
- Additional thread to manage requests
- Preliminary internal tests were ok
- Added within the current release v2.2.0
34Asynchronous Query
[Figure: asynchronous query flow in four steps: 1 Asynchronous Query Submission, 2 Request Dispatching, 3 Data Delivery, 4 Data Manipulation. Components: Client, GRelC Data Access, DBMS. Labels: SQS query, ID Query, SQL, Recordset, Get File, Data Delivery, GrelCRecordset in XML format, GrelCLoad Recordset, GrelCRecordset APIs]
35Async Query State diagram
[Figure: state diagram. States: QUEUED, RUNNING, DONE, FAILED, ABORTED, PURGE. Transition labels: query submission, execution, completion, failure, abort, timeout, purge or timeout (three edges)]
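The asynchronous query life cycle can be sketched as a small state machine in C. The state names come from the slide, but the diagram's edges do not survive in this transcript, so the transition table below is an assumption: a query starts in QUEUED on submission, moves to RUNNING on execution, ends in DONE, FAILED or ABORTED, and each of those three is purged on an explicit purge or on timeout. The source of the abort edge and the placement of the standalone "timeout" edge are guesses or omitted. All identifiers (`demo_state`, `demo_event`, `demo_next_state`) are hypothetical.

```c
#include <assert.h>

/* State names taken from the slide. */
typedef enum {
    Q_QUEUED, Q_RUNNING, Q_DONE, Q_FAILED, Q_ABORTED, Q_PURGE
} demo_state;

/* Event names taken from the slide's transition labels. */
typedef enum {
    EV_EXECUTION, EV_COMPLETION, EV_FAILURE, EV_ABORT, EV_PURGE, EV_TIMEOUT
} demo_event;

/* Assumed transition function; unknown (state, event) pairs are ignored. */
demo_state demo_next_state(demo_state s, demo_event e)
{
    assert(s <= Q_PURGE);
    switch (s) {
    case Q_QUEUED:
        if (e == EV_EXECUTION) return Q_RUNNING;
        if (e == EV_ABORT)     return Q_ABORTED; /* assumption */
        break;
    case Q_RUNNING:
        if (e == EV_COMPLETION) return Q_DONE;
        if (e == EV_FAILURE)    return Q_FAILED;
        if (e == EV_ABORT)      return Q_ABORTED; /* assumption */
        break;
    case Q_DONE:
    case Q_FAILED:
    case Q_ABORTED:
        /* "purge or timeout": results are dropped on request or on expiry */
        if (e == EV_PURGE || e == EV_TIMEOUT) return Q_PURGE;
        break;
    case Q_PURGE:
        break; /* final state */
    }
    return s; /* events with no assumed edge leave the state unchanged */
}
```

Keeping the transitions in one function makes it easy to adjust the table once the real diagram is at hand, without touching the code that submits or polls queries.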
36Async Query Functions list
- Insert Query in the Catalog
grelc_service_insert_async_query -s <server_IP> -p <port> -d <db_name> -q <query>
grelc_service_check_status_async_query -s <server_IP> -p <port> -i <id_query>
grelc_service_abort_async_query -s <server_IP> -p <port> -i <id_query>
grelc_service_purge_async_query -s <server_IP> -p <port> -i <id_query>
grelc_service_purge_async_query -s <server_IP> -p <port> -i <id_query> -f <destination_file_name>
grelc_service_purge_async_query -s <server_IP> -p <port> -d <dn> -S <status>
37Single Query Stream
[Figure: Single Query Stream flow. Components: Client, GRelC-Data-Access, DBMS. Labels: SQS submission, SQL, Recordset, Result submission]
This kind of query is suitable to retrieve VERY LARGE resultsets
38Testbed
- SQs Comparison
- Test DB: bioinformatics relational database
- Sequential tests
- SELECT statements
39Test Performance (III)
40Test Performance (IV)
41GRelC gLite
42GRelC on gLite Porting
- Porting of GRelC on gLite was straightforward
- Porting on gLite is ok both for client and server side
- The middleware works fine both on LCG-2-7-0 and current gLite 3.x middleware
- GRelC DAS also runs on several platforms
- Linux
- Mac OS X
- FreeBSD
- Both IA64 and IA32 platforms are supported (the GRelC DAS is currently installed on SPACI-LECCE-IA64, an EGEE SA1 partner)
43GRelC on gLite New Service
- Straightforward integration within the EGEE farm model
- GRelC DAS provides a fine grained data management service
- This service can be used both as farm service and as VO service depending on the context, the database policies/constraints, etc.
[Figure: Extended EGEE Farm Model. Labels: BDII, BDII query, ComputingElement, StorageElement, Files, Data transfer (files), three WN boxes]
44GRelC on gLite VOMS
- We provide global authorization by means of VOMS extensions
- High level of scalability concerning DAPs related to VOs
- Double level authorization framework: both local and global policies management can be provided (mixed mode)
Coarse Grained
Fine Grained
45Two-level authorization
- Global authorization (through VOMS extensions)
- Local authorization (by means of the local GRelC DAS authorization framework)
- The two masks obtained from global and local authorization are combined to infer the final User Privileges Mask (UPM)
- 3 scenarios
- global mode, coarse grained approach
- local mode, fine grained approach
- combined mode
46Global Mode
- User credentials must be obtained through voms-proxy-init
- The UPM is inferred from the available VOMS extensions
- No additional authorization setting is required on the GRelC DAS
- Easy and fast setup procedure
- It scales well
- Feasible for a real production grid environment
47Global Mode
48Local Mode
- User credentials must be obtained through grid-proxy-init
- The UPM is read from the GRelC DAS metadata catalogue
- No VOMS extensions are added to the user proxy
- The setup procedure must be carried out on each GRelC DAS
- Scalability is worse
49Local Mode
50Combined Mode
- User credentials must be obtained through voms-proxy-init
- The UPM is inferred by joining information on access policies coming from VOMS extensions and the GRelC DAS metadata catalogue
- VOMS level (grant or revoke)
- GRelC DAS level (setting, undefining, unsetting)
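The slides do not give the exact rule used to join the two levels, so the following C sketch shows only one plausible reading, stated as an assumption: VOMS contributes a mask of granted privileges, while the GRelC DAS holds a three-valued setting per privilege (set, unset, or undefined). A locally set privilege is added, a locally unset one is removed, and an undefined one falls through to the VOMS decision. The privilege bits and the function `demo_combine_upm` are hypothetical.

```c
#include <assert.h>

/* Hypothetical privilege bits; the real UPM layout is not given. */
enum {
    PRIV_SELECT = 1 << 0,
    PRIV_INSERT = 1 << 1,
    PRIV_UPDATE = 1 << 2,
    PRIV_DELETE = 1 << 3
};

/* Combine a global (VOMS) mask with a local three-valued policy.
   local_set   : privileges explicitly set on the DAS
   local_unset : privileges explicitly unset on the DAS
   Bits in neither local mask are "undefined" and follow VOMS. */
unsigned demo_combine_upm(unsigned voms_granted,
                          unsigned local_set,
                          unsigned local_unset)
{
    /* A privilege cannot be both set and unset locally. */
    assert((local_set & local_unset) == 0u);
    return (voms_granted | local_set) & ~local_unset;
}
```

For example, a user granted SELECT and INSERT through VOMS, with UPDATE set and INSERT unset on the DAS, ends up with SELECT and UPDATE under this assumed rule.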
51Combined Mode - An Example
52Roles and Groups on VOMS (I)
Case A (fine grained)
/gilda/grelc/das/host1/grid-db1/Role=grelc-db-insert
53Roles and Groups on VOMS (II)
Case B (intermediate level)
/gilda/grelc/das/host1/Role=grelc-db-insert
54Roles and Groups on VOMS (III)
Case C (coarse grained)
/gilda/grelc/das/Role=grelc-db-insert
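VOMS attributes normally take the form /vo/group/.../Role=role, so the three cases differ only in how many group components follow /gilda/grelc/das. The C sketch below classifies an attribute string accordingly and extracts the role. It assumes the prefix shown in the three cases and uses hypothetical names (`demo_scope`, `demo_fqan_scope`); it is an illustration of the naming scheme, not code from GRelC or VOMS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    SCOPE_INVALID,
    SCOPE_COARSE,       /* Case C: applies to every DAS */
    SCOPE_INTERMEDIATE, /* Case B: applies to one host */
    SCOPE_FINE          /* Case A: applies to one grid-db on one host */
} demo_scope;

/* Group prefix taken from the three cases in the slides. */
static const char demo_prefix[] = "/gilda/grelc/das";

/* Classify fqan and point *role at the role name inside it (or NULL). */
demo_scope demo_fqan_scope(const char *fqan, const char **role)
{
    const char *p, *role_part;
    size_t plen = strlen(demo_prefix);
    int components = 0;

    assert(fqan != NULL && role != NULL);
    *role = NULL;
    if (strncmp(fqan, demo_prefix, plen) != 0 || fqan[plen] != '/')
        return SCOPE_INVALID;
    role_part = strstr(fqan + plen, "/Role=");
    if (role_part == NULL)
        return SCOPE_INVALID;
    /* Count the group components between the prefix and "/Role=". */
    for (p = fqan + plen; p < role_part; p++)
        if (*p == '/')
            components++;
    if (components > 2)
        return SCOPE_INVALID;
    *role = role_part + strlen("/Role=");
    if (components == 0)
        return SCOPE_COARSE;
    return components == 1 ? SCOPE_INTERMEDIATE : SCOPE_FINE;
}
```

The returned scope tells a service how widely the role applies, and the role string (here grelc-db-insert) names the privilege being granted.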
55GRelC on gLite BDII
- GLUE schema extension providing information about VOs and Databases (we plan to interact with OGF GLUE-WG)
- Local admin can set up the Information Provider Level parameter
- Min: 0, to publish just basic info (only the contact string)
- Max: 7, for all info (contact string, VOs, DBs, tables, fields, etc.)
Information System Extensions
Database specific Information
56GRelC on gLite Porting on SLC4.x
- Porting on SLC4.x is an ongoing activity
- Preliminary results are very good
- Porting will be completed before the EGEE Conference in Budapest
- A release based on SLC4.x will be available on the GRelC website in September
- Current tests cover both IA32 and IA64 (Itanium2 processors) platforms
- This activity is part of the SPACI-LECCE-IA64 SA1 activity within the EGEE Project
57INFN GRID Deployment
- Involved Sites
- INAF Trieste (IA32)
- INFN Bari (IA32)
- INFN Catania (IA32)
- INFN Padova (IA32)
- SPACI LECCE (IA32, IA64)
- Testing Activities
- Sequential tests
- Concurrent tests
- Bugs report
- Bug Fixing
- Optimization
[Figure: deployment map with DAS Server and DAS Client sites: INAF Trieste, INFN Padova, INFN Bari, SPACI Lecce, INFN Catania]
58SEPAC Grid Deployment
59Important numbers and technologies
- Some important numbers about the GRelC project
- 1 Patent
- About 18 International works
- More than 50,000 lines of code
- 91 C++ classes
- 28 QT GUI Windows
- 103 services
- Wide documentation
- Technologies
- GSI
- gSOAP
- GSI-plugin
- QT Library
- SDAI Library
60GRelC WebSite
- Main sections
- Download
- (rpms available)
- News
- Publications
- Events
- Deployment
- Documentation
- Components
- ..
GRelC Website URL: http://grelc.unile.it/
Mailing list: grelc-user_at_sara.unile.it
61User tutorial: GILDA t-Infrastructure
- GRelC DAS User Tutorial on GILDA Grid CT Wiki Website
- Info about
- Log in to the grid
- Query Submission
- For any information about GILDA t-Infrastructure please contact roberto.barbera_at_ct.infn.it, grid-prod_at_ct.infn.it
- GRelC DAS Tutorial link: https://grid.ct.infn.it/twiki/bin/view/GILDA/GRelCDataAccessService
Special thanks to the GILDA Staff for their support
62Conclusions
- GRelC DAS provides support in Grid for a wide range of DBMSs.
- It is currently tested on several grid environments (SPACI, SEPAC, GILDA, INFNGRID)
- A wide SDK is available for developers
- CLI and XGRelC Graphical Interface to ease Grid-DB management
- gLite compliant (porting on gLite 3.x and integration with VOMS framework, BDII, etc.)
- Support for several platforms (IA32 and IA64)
- The software is currently a candidate for the EGEE RESPECT Program
63For any information
- Supervisor: Prof. Giovanni Aloisio (giovanni.aloisio_at_unile.it)
- Project P. I.: Ph. D. Sandro Fiore (sandro.fiore_at_unile.it)
- Team Members
- Ph. D. Massimo Cafaro
- MSc Alessandro Negro
- MSc Salvatore Vadacca
- GRelC WebSite: http://grelc.unile.it
- Mailing list: grelc-user_at_sara.unile.it