Title: System Evolvability Features of the SenseLab Project
1 System Evolvability Features of the SenseLab Project
- Luis Marenco, MD
- Center for Medical Informatics
- Yale University School of Medicine
2 Outline
- The nature of some bioscience applications (e.g. neuroscience), where domain knowledge is under constant revision, requires an application infrastructure capable of evolving over time.
- The reasons for this problem and one possible solution will be reviewed in the following topics:
- Motivation: the SenseLab project
- Background: issues of standard applications
- Evolvable application goals
- Some possible solution scenarios
- EAV/CR: features for evolvable applications
- EAV/CR-derived methodologies for evolvable applications
- EAV/CR application demo (SenseLab)
- EAV/CR solution framework
3 Motivation: The SenseLab Project
- The SenseLab project is an ongoing effort to integrate multidisciplinary sensory data using the olfactory system as a model domain.
- The process involves the development of neuroinformatics databases and tools in support of neuroscience research.
- The SenseLab web portal contains the following web databases:
- Neuronal research: NeuronDB, ModelDB, and CellPropDB
- Olfactory research: ORDB, OdorDB, and OdorMapDB
- The fundamental problem is the maintenance burden due to:
- Constant domain evolution
- Research on poorly understood processes such as olfaction, which involves constantly factoring in new variables or disciplines
4 Background: Issues of Standard Applications
- In standard database applications, code is entwined with metadata descriptors from the back-end database. The limitations of this approach are:
- Increased coding as database complexity grows
- Limited code reusability
- Lack of robust data interoperability (messages mirror the schema)
- Complexity arising from the use of multiple tools to maintain schema, data editing, and security
- To advance the knowledge represented as metadata, the necessary schema changes lead to:
- Downtime and application breakdown
- Interface redesign (GUI and inter-application recoding)
- Increased code complexity
- Increased probability of coding errors
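The coupling described on this slide can be illustrated with a minimal sketch. It assumes a hypothetical `neuron` table and made-up values (neither comes from SenseLab) and uses Python's built-in `sqlite3` module only for convenience. One function hard-codes column names, the other reads them from the database catalog at run time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, used only for illustration.
conn.execute("CREATE TABLE neuron (id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
conn.execute("INSERT INTO neuron (name, region) VALUES ('cell A', 'olfactory bulb')")


def list_neurons_hardcoded(conn):
    # Column names are baked into the code, so the code mirrors the schema.
    return conn.execute("SELECT name, region FROM neuron").fetchall()


def list_neurons_from_catalog(conn):
    # Column names are read from the database catalog at run time.
    columns = [row[1] for row in conn.execute("PRAGMA table_info(neuron)")]
    rows = conn.execute("SELECT * FROM neuron").fetchall()
    return [dict(zip(columns, row)) for row in rows]


# Domain evolution: researchers start recording a new variable.
conn.execute("ALTER TABLE neuron ADD COLUMN transmitter TEXT")
conn.execute("UPDATE neuron SET transmitter = 'glutamate' WHERE name = 'cell A'")

hardcoded = list_neurons_hardcoded(conn)        # new data stays invisible
from_catalog = list_neurons_from_catalog(conn)  # new column appears unchanged
```

The hard-coded function keeps working but silently ignores the new column until someone edits it, which is the kind of recoding burden the slide refers to.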
5 Background: Issues of Standard Applications (2)
- Traditional web-database applications
- Data entry and security: cumbersome, expensive, and not portable to other applications
- Searching mechanisms: limited, difficult to standardize, and expensive to create. The hidden web remains an issue
- Site-wide architectures are cumbersome to adapt to new web formats (e.g. Semantic Web types)
- Metadata maintenance
- Data dictionary: incomplete
- Complex, non-centralized, and requires more than one tool
- Requires specific database expertise; the knowledge is not portable
- Tools and software libraries are specific to each database vendor
6 Evolvable Application Goals
- PRIMARY
- Create a programmatic approach that allows structural changes to the database without disrupting existing data and code
- Minimize code/metadata dependency, focusing on automated interface generation (GUI and inter-application)
- Simplify code as the project matures (Extreme Programming principles)
- SECONDARY
- Facilitate system integration into a Web platform
- Accessibility from common web browsers
- Incorporate role-based security with public and private data
- Create generic interfaces and formats for data exchange
- Improve code reusability by leveraging the previous approach
- Provide for robust interoperability with standardized protocols
7 Possible Solution Scenarios (some)
- Use object-oriented or object-relational databases: immature and unsupported
- Leverage other application approaches (e.g. Protégé), specifically the part related to flexible data structures: lacks features (e.g. not distributed or web-based, no security implied). Future versions may cover these features.
- Build a new ground-up solution to provide the needed features: the EAV/CR Application Framework (a combination of a data storage approach and software practices)
8 EAV/CR Storage Approach
- The EAV/CR (Entity-Attribute-Value with Classes and Relationships) data storage system is derived from the EAV row-based data modeling approach widely used in electronic patient record systems and the MS Windows Registry, among others.
9 Relational (left) to EAV/CR (right) Comparison
EAV/CR uses a limited number of tables to represent any number of tables from a relational DB. EAV/CR treats data (VALUES) and metadata (CLASSES, ENTITIES, and ATTRIBUTES) as relational data, allowing flexible domain representation.
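The comparison can be made concrete with a minimal sketch in Python using the built-in `sqlite3` module. The four table names, their columns, and the sample classes and values are assumptions for illustration; the slides do not show the actual SenseLab schema. The point is only that two conceptual "tables" (Neuron and Odor) are stored in the same fixed set of physical tables, with classes and attributes held as ordinary rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fixed layout: metadata (classes, attributes) and data
# (entities, eav_values) are all ordinary relational rows.
conn.executescript("""
CREATE TABLE classes    (class_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE attributes (attr_id INTEGER PRIMARY KEY, class_id INTEGER, name TEXT);
CREATE TABLE entities   (entity_id INTEGER PRIMARY KEY, class_id INTEGER);
CREATE TABLE eav_values (entity_id INTEGER, attr_id INTEGER, value TEXT);
""")


def define_class(conn, name, attribute_names):
    """Register a class and its attributes as rows, not as a new table."""
    class_id = conn.execute("INSERT INTO classes (name) VALUES (?)", (name,)).lastrowid
    conn.executemany(
        "INSERT INTO attributes (class_id, name) VALUES (?, ?)",
        [(class_id, attr) for attr in attribute_names],
    )
    return class_id


def add_entity(conn, class_name, values):
    """Store one entity as one row per attribute value."""
    row = conn.execute("SELECT class_id FROM classes WHERE name = ?", (class_name,)).fetchone()
    if row is None:
        raise KeyError(f"unknown class {class_name!r}")
    class_id = row[0]
    entity_id = conn.execute(
        "INSERT INTO entities (class_id) VALUES (?)", (class_id,)
    ).lastrowid
    for attr_name, value in values.items():
        attr = conn.execute(
            "SELECT attr_id FROM attributes WHERE class_id = ? AND name = ?",
            (class_id, attr_name),
        ).fetchone()
        if attr is None:
            raise KeyError(f"unknown attribute {attr_name!r}")
        conn.execute(
            "INSERT INTO eav_values (entity_id, attr_id, value) VALUES (?, ?, ?)",
            (entity_id, attr[0], value),
        )
    return entity_id


def get_entity(conn, entity_id):
    """Reassemble an entity's attribute/value pairs into a dict."""
    rows = conn.execute(
        "SELECT a.name, v.value FROM eav_values v "
        "JOIN attributes a ON a.attr_id = v.attr_id WHERE v.entity_id = ?",
        (entity_id,),
    ).fetchall()
    return dict(rows)


define_class(conn, "Neuron", ["name", "region"])
define_class(conn, "Odor", ["name", "formula"])
neuron_id = add_entity(conn, "Neuron", {"name": "cell A", "region": "olfactory bulb"})
odor_id = add_entity(conn, "Odor", {"name": "odor X", "formula": "C2H6O"})
table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
).fetchone()[0]
```

Adding a third class would again insert rows only, so the number of physical tables stays constant.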
10 EAV/CR Storage Approach (2)
- EAV/CR augments standard EAV by:
- Grouping entities in Classes (C)
- Using strong data typing for value storage
- Allowing computed attributes (functions)
- Allowing entity Relationships (R) (related and hierarchical attributes)
- Including implicit data and metadata versioning and timestamps
- Including Web-oriented features: metadata have been enriched with web parameters to automate web-interface generation (Web forms, XML, ...)
- Assisting ontological representation: mapping standardized vocabulary and semantic relationship identifiers to data and metadata elements
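A few of these augmentations (classes, strong typing, entity relationships, and timestamped writes) can be sketched in plain Python. This is an in-memory illustration under assumed names (`Attribute`, `Entity`, `Store` and the sample classes are all hypothetical), not the EAV/CR implementation; computed attributes, web parameters, and ontology mapping are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

PYTHON_TYPES = {"string": str, "integer": int, "real": float}


@dataclass(frozen=True)
class Attribute:
    name: str
    datatype: str           # "string", "integer", "real", or "entity"
    target_class: str = ""  # class of the referenced entity when datatype == "entity"


@dataclass
class Entity:
    entity_id: int
    class_name: str
    values: dict = field(default_factory=dict)


class Store:
    def __init__(self):
        self.classes = {}   # class name -> {attribute name: Attribute}
        self.entities = {}  # entity id -> Entity
        self.history = []   # (timestamp, entity id, attribute name, value)

    def define_class(self, name, attributes):
        self.classes[name] = {a.name: a for a in attributes}

    def create(self, class_name):
        if class_name not in self.classes:
            raise KeyError(f"unknown class {class_name!r}")
        entity = Entity(len(self.entities) + 1, class_name)
        self.entities[entity.entity_id] = entity
        return entity

    def set_value(self, entity, attr_name, value):
        attr = self.classes[entity.class_name][attr_name]
        if attr.datatype == "entity":
            # Relationship: the value must be an entity of the declared class.
            if not isinstance(value, Entity) or value.class_name != attr.target_class:
                raise TypeError(f"{attr_name} must reference a {attr.target_class}")
            stored = value.entity_id
        else:
            # Strong typing: reject values that do not match the declared type.
            if not isinstance(value, PYTHON_TYPES[attr.datatype]):
                raise TypeError(f"{attr_name} must be of type {attr.datatype}")
            stored = value
        entity.values[attr_name] = stored
        # Every write is timestamped so earlier values remain traceable.
        self.history.append(
            (datetime.now(timezone.utc), entity.entity_id, attr_name, stored)
        )


store = Store()
store.define_class("Neuron", [Attribute("name", "string")])
store.define_class("Model", [
    Attribute("name", "string"),
    Attribute("compartments", "integer"),
    Attribute("neuron", "entity", target_class="Neuron"),
])
neuron = store.create("Neuron")
store.set_value(neuron, "name", "cell A")
model = store.create("Model")
store.set_value(model, "name", "model 1")
store.set_value(model, "compartments", 12)
store.set_value(model, "neuron", neuron)
```

Attempting `store.set_value(model, "compartments", "twelve")` raises `TypeError`, which is the behavior strong typing is meant to give over a plain text value column.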
11 EAV/CR Features for Evolvable Applications
- Automatic system adaptability to DB structural changes
- Generic metadata-driven database navigation
- Robust generation of data-entry and schema-maintenance web forms
- Ability to create database portals that present different subsets of the data to users with a particular research focus
- Centralized role-based security, using a compartmentalized, distributed administration model to minimize dedicated administration costs
- Monitoring tools
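Metadata-driven form generation, one of the features listed above, might look roughly like the following sketch. The metadata fields (`name`, `label`, `datatype`, `required`), the `render_form` function, and the `/save` action are assumptions for illustration; the original system was built with ASP and VBScript, and its actual metadata layout is not shown in the slides.

```python
from html import escape

# Hypothetical mapping from attribute datatype to HTML input type.
INPUT_TYPES = {"string": "text", "integer": "number", "real": "number", "date": "date"}


def render_form(class_name, attributes, action="/save"):
    """Build an HTML data-entry form from attribute metadata rows."""
    lines = [
        f'<form method="post" action="{escape(action)}">',
        f'<input type="hidden" name="class" value="{escape(class_name)}">',
    ]
    for attr in attributes:
        name = escape(attr["name"])
        required = " required" if attr.get("required") else ""
        lines.append(f'<label for="{name}">{escape(attr["label"])}</label>')
        lines.append(
            f'<input id="{name}" name="{name}" '
            f'type="{INPUT_TYPES[attr["datatype"]]}"{required}>'
        )
    lines.append('<button type="submit">Save</button>')
    lines.append("</form>")
    return "\n".join(lines)


model_attributes = [
    {"name": "name", "label": "Model name", "datatype": "string", "required": True},
    {"name": "compartments", "label": "Compartments", "datatype": "integer"},
]
form_before = render_form("Model", model_attributes)

# A schema change is one more metadata row; render_form itself is untouched.
model_attributes.append({"name": "created", "label": "Created on", "datatype": "date"})
form_after = render_form("Model", model_attributes)
```

Because the form is derived from metadata on every request, a new attribute shows up in the interface without any change to the rendering code.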
12 EAV/CR Features for Evolvable Applications (2)
- Expandable system architecture: allows parallel processing by scaling out. Parallel web servers can connect to the same EAV/CR database while preserving security and data and metadata concurrency
- Delegated user profile management: users are responsible for their own profiles; administrators grant users access to specific database resources (Web portal model)
13 EAV/CR-Derived Methodologies for Evolvability
- Data services: creation of the EDSP InfoSet protocol to allow description of database ontology, metadata, and data in a simple XML format (it brings the EAV/CR approach to the XML world).
- The following processes depend on EDSP:
- Data transfer
- Middle-tier components
- Automated ad hoc query interface generation
- Using EDSP as the source for these processes improves the stability and reusability of software components
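The idea of a message that carries both metadata and data can be sketched with Python's standard `xml.etree.ElementTree` module. The element and attribute names below (`infoset`, `metadata`, `class`, `attribute`, `data`, `entity`, `value`) are invented for illustration and are not the actual EDSP vocabulary, which the slides do not spell out; the sample values are made up as well.

```python
import xml.etree.ElementTree as ET

CASTS = {"string": str, "integer": int, "real": float}


def build_infoset(class_name, attributes, entities):
    """attributes: list of (name, datatype); entities: list of (id, {name: value})."""
    root = ET.Element("infoset")
    cls = ET.SubElement(ET.SubElement(root, "metadata"), "class", {"name": class_name})
    for name, datatype in attributes:
        ET.SubElement(cls, "attribute", {"name": name, "datatype": datatype})
    data = ET.SubElement(root, "data")
    for entity_id, values in entities:
        ent = ET.SubElement(data, "entity", {"id": str(entity_id), "class": class_name})
        for name, value in values.items():
            ET.SubElement(ent, "value", {"attribute": name}).text = str(value)
    return root


def read_infoset(root):
    """Decode entities using only the metadata carried in the same message."""
    datatypes = {a.get("name"): a.get("datatype") for a in root.iter("attribute")}
    decoded = []
    for ent in root.iter("entity"):
        record = {"id": int(ent.get("id"))}
        for value in ent.iter("value"):
            name = value.get("attribute")
            record[name] = CASTS[datatypes[name]](value.text)
        decoded.append(record)
    return decoded


message = ET.tostring(
    build_infoset(
        "Receptor",
        [("name", "string"), ("length", "integer")],
        [(1, {"name": "receptor A", "length": 300})],
    ),
    encoding="unicode",
)
records = read_infoset(ET.fromstring(message))
```

Since the consumer learns attribute names and types from the message itself, adding an attribute on the producer side does not require changing `read_infoset`, which is one way to read the stability claim on this slide.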
14 EAV/CR Application Framework
- Programming model
- Component programmer
- Domain programmer
- EAV/CR Framework Toolkit (version 1. Codename?)
- Database component: encapsulates EAV/CR logic, presenting interfaces for domain programmers. Created in MS C#.NET
- Plumbing code: generic Web portal scripts. IIS-ASP-VBScript
15 Summary
- EAV/CR and evolvability
- High data integration
- Flexibility in database schema evolution / maintenance
- Code reuse and increased reliability
- Extensible application architecture
- Disadvantages
- Querying complexity
- Performance penalty for multi-parameter queries
- Complex programming of EAV/CR components
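The querying disadvantages can be seen in a small sketch. It assumes a simplified single-table EAV layout named `eav` with made-up rows (not the SenseLab schema) and uses Python's `sqlite3` module. Each additional search criterion adds another join against the value table, which is one plausible reading of the multi-parameter penalty noted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified EAV table with illustrative rows.
conn.execute("CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT)")
conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", [
    (1, "name", "cell A"), (1, "region", "olfactory bulb"), (1, "transmitter", "glutamate"),
    (2, "name", "cell B"), (2, "region", "olfactory bulb"), (2, "transmitter", "GABA"),
    (3, "name", "cell C"), (3, "region", "cortex"), (3, "transmitter", "glutamate"),
])


def find_entities(conn, criteria):
    """Return ids of entities matching ALL attribute=value pairs.

    In a conventional table this is one WHERE clause; here every
    criterion needs its own self-join on the value table.
    """
    if not criteria:
        raise ValueError("at least one criterion is required")
    joins, params = [], []
    for i, (attribute, value) in enumerate(criteria.items()):
        joins.append(
            f"JOIN eav v{i} ON v{i}.entity_id = e.entity_id "
            f"AND v{i}.attribute = ? AND v{i}.value = ?"
        )
        params += [attribute, value]
    sql = (
        "SELECT DISTINCT e.entity_id FROM eav e "
        + " ".join(joins)
        + " ORDER BY e.entity_id"
    )
    return [row[0] for row in conn.execute(sql, params)]


one_criterion = find_entities(conn, {"region": "olfactory bulb"})
two_criteria = find_entities(conn, {"region": "olfactory bulb", "transmitter": "glutamate"})
```

Generating such SQL from metadata keeps the complexity out of domain code, but the join count still grows with the number of parameters.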
16 Demo: Metadata-driven Ad hoc Interface Generation
- Boolean expressions can be added for complex associations. Results can be retrieved in HTML, XML, text, and other formats.
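As a rough illustration of Boolean criteria with selectable output formats, the sketch below evaluates a nested expression over attribute/value rows and renders the matching ids as text or XML. The tuple-based expression syntax, the function names, the XML element names, and the sample rows are all assumptions; the slide does not describe how the SenseLab interface represents its queries.

```python
# Hypothetical (entity id, attribute, value) rows for illustration.
ROWS = [
    (1, "region", "olfactory bulb"), (1, "transmitter", "glutamate"),
    (2, "region", "olfactory bulb"), (2, "transmitter", "GABA"),
    (3, "region", "cortex"), (3, "transmitter", "glutamate"),
]


def evaluate(expr, rows):
    """Return the set of entity ids matching a nested Boolean expression.

    expr is ("eq", attribute, value), ("and", e1, e2, ...),
    ("or", e1, e2, ...) or ("not", e); "and"/"or" need at least one operand.
    """
    op = expr[0]
    if op == "eq":
        _, attribute, value = expr
        return {eid for eid, attr, val in rows if attr == attribute and val == value}
    if op == "and":
        return set.intersection(*(evaluate(e, rows) for e in expr[1:]))
    if op == "or":
        return set.union(*(evaluate(e, rows) for e in expr[1:]))
    if op == "not":
        return {eid for eid, _, _ in rows} - evaluate(expr[1], rows)
    raise ValueError(f"unknown operator {op!r}")


def as_text(ids):
    """Render matching ids one per line."""
    return "\n".join(str(i) for i in sorted(ids))


def as_xml(ids):
    """Render matching ids as a small XML fragment."""
    return "<results>" + "".join(f'<entity id="{i}"/>' for i in sorted(ids)) + "</results>"


bulb_not_glutamate = evaluate(
    ("and", ("eq", "region", "olfactory bulb"), ("not", ("eq", "transmitter", "glutamate"))),
    ROWS,
)
cortex_or_gaba = evaluate(
    ("or", ("eq", "region", "cortex"), ("eq", "transmitter", "GABA")),
    ROWS,
)
```

Keeping evaluation separate from rendering means one result set can be served in whichever format the client asks for.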
17 Demo: Metadata-driven Ad hoc Interface Generation (2)
- The same generic code behind this interface is reused in other databases, increasing the value added by this robust, evolvable design.
18 Demo: EAV/CR Centralized Schema Management
- The Schema Manager tool displays the database structure and allows it to be edited. This figure shows the database inventory of the SenseLab EAV/CR data store, with links to specific elements.
Next >> After selecting CellPropDB
19 Demo: EAV/CR Centralized Schema Management (2)
- Selecting a database (e.g. ModelDB) displays the web database information, which can be changed at any time.
20 Demo: EAV/CR Centralized Schema Management (3)
- On the left, selecting Classes displays the list of classes for ModelDB; on the right, the class Models is being edited.
21 Demo: EAV/CR Centralized Schema Management (4)
- While in the class Models, selecting the
Attributes tab shows all its attributes (left).
On the right, the attribute neurons shows its
relation to neuron objects from NeuronDB.
22 Demo: EAV/CR Centralized Schema Management (5)
- As in the previous slides, the Schema Manager allows new users to be entered and granted rights to specific databases. Lastly, clicking the diagram link shows the ER representation of the ModelDB database.
23 Demo: InfoSets and Evolvable Interoperability
- The creation of EDSP (EAV/CR dataset protocol) allows the transfer of database schema and data in a simple, consistent format based on the universally accepted XML format. This picture shows a partial rendering of some olfactory receptor molecules from ORDB.
24 Demo: InfoSets and Evolvable Interoperability (2)
- Meanwhile, data exchange with other standard protocols is achieved through XML transformations. Below is the previous EDSP message transformed into Microsoft XDR, the format used by the MS Office suite to import into MS Access and MS SQL Server.
25 Demo: InfoSets and Evolvable Interoperability (4)
- A practical use of XDR is demonstrated here by importing data directly from a SenseLab URL into an Access or SQL Server database.
26 Demo: InfoSets and Evolvable Interoperability (5)
- This example points to a particular olfactory receptor at (http://senselab.med.yale.edu/senselab/site/dbGate/Xtract.asp?o1798xsledsp-officedata)
27 Demo: InfoSets and Evolvable Interoperability (6)
- Access shows the tables to be generated,
28 Demo: InfoSets and Evolvable Interoperability (7)
29 Demo: InfoSets and Evolvable Interoperability (8)
- relationships, and the data (preserving strong data typing)
- All in one deEAVfication process.
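The "deEAVfication" step, turning row-per-value data back into a conventional typed table, can be sketched as follows. This is an illustration with Python's `sqlite3` module under assumed names (`de_eav`, the `receptor` table, the datatype labels, and the sample rows are all hypothetical). The demo itself relies on an XML transformation to XDR and an Office import, which this sketch does not reproduce, and relationships between tables are left out.

```python
import sqlite3

SQL_TYPES = {"string": "TEXT", "integer": "INTEGER", "real": "REAL"}
CASTS = {"string": str, "integer": int, "real": float}


def de_eav(conn, table_name, attributes, rows):
    """Pivot (entity_id, attribute, value) rows into one conventional table.

    attributes is a list of (name, datatype) pairs taken from trusted metadata;
    names are interpolated into SQL, so they must not come from user input.
    """
    names = [name for name, _ in attributes]
    datatypes = dict(attributes)
    column_defs = ", ".join(f'"{n}" {SQL_TYPES[datatypes[n]]}' for n in names)
    conn.execute(
        f'CREATE TABLE "{table_name}" (entity_id INTEGER PRIMARY KEY, {column_defs})'
    )

    # Group the one-row-per-value data by entity, restoring each value's type.
    by_entity = {}
    for entity_id, attribute, value in rows:
        by_entity.setdefault(entity_id, {})[attribute] = CASTS[datatypes[attribute]](value)

    column_list = ", ".join(f'"{n}"' for n in names)
    placeholders = ", ".join("?" for _ in range(len(names) + 1))
    insert = (
        f'INSERT INTO "{table_name}" (entity_id, {column_list}) '
        f"VALUES ({placeholders})"
    )
    for entity_id, values in sorted(by_entity.items()):
        # Attributes an entity never recorded become NULL columns.
        conn.execute(insert, [entity_id] + [values.get(n) for n in names])


conn = sqlite3.connect(":memory:")
de_eav(
    conn,
    "receptor",
    [("name", "string"), ("length", "integer")],
    [(1, "name", "receptor A"), (1, "length", "300"), (2, "name", "receptor B")],
)
table = conn.execute(
    "SELECT entity_id, name, length FROM receptor ORDER BY entity_id"
).fetchall()
```

The resulting table has one typed column per attribute, so downstream tools that expect an ordinary relational layout can query it directly.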