Title: Agenda
1Agenda
- Project Status Summary (Scott Hawker)
- NASA Technical Standards Program Status Summary (Paul Gill)
- NTSP architecture upgrade
- Lessons Learned/Training development
- Other development plans
- Architecture overview (Scott Hawker)
- Architecture components
- XML technologies and tools (Randy Smith)
- Repositories, web services, stylesheets, object layer, etc.
- SA_MetaMatch update (Christina Yau)
- Interactive Process (7120.5B) status from August (Hong Ma)
- Capturing and searching Lessons Learned using XML schemas, repositories, and stylesheets (Swapna Gupta)
- Dynamic lessons learned concepts and early prototypes (Hong Ma/Swapna Gupta)
- Next steps and directions (All)
2Progress Summary
- Significant progress in Summer 2003
- SA_MetaMatch release and use
- Interactive 7120.5B prototype
- Including capture and search of Lessons Learned
- Lessons Learned Software System requirements and architecture
- Some progress in Fall 2003
- Delay in funding and focus on thesis reports
- Thesis on SA_MetaMatch, Interactive Process, Native XML-based repository
- Progress on XML-based lessons learned, search engines, web services, etc.
- Interaction with Dave Rine (George Mason)
- Review of NASA knowledge management and taxonomy development
(continued)
3Summary
4SA_MetaMatch Update
5SA_MetaMatch
- Purpose
- Discover relevant documents through document metadata and indexing
- Focus on relating standards and lessons learned
- Summer Status
- Fall Status
- Tasks Done
- Tasks In Process
- Tasks Being Studied
- Next Step
Open discussion at any time!
6Summer Status
- First version of SA_MetaMatch released and tested
- NASA Standards and Lessons Learned: looking for candidate document relationships
- A sample of 10 Standards and 70 LLIS
- User interface screen developed
- Main Menu
- Generate Metadata and Index
- Run MetaMatch
- Match against all sample documents
- Apply weighting scheme
- 3 different matching algorithms
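The "match, then apply weighting scheme" step can be illustrated with a minimal sketch. It assumes each document's metadata maps an element name to a set of index words and that the weighting scheme assigns one weight per element. The element names, the weights, and the overlap score are hypothetical; they stand in for the three SA_MetaMatch algorithms, which these slides do not describe.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical element weights; the actual SA_MetaMatch weighting scheme is not given here.
WEIGHTS: Dict[str, float] = {"title": 3.0, "keywords": 2.0, "abstract": 1.0}

Metadata = Dict[str, Set[str]]


def match_score(source: Metadata, candidate: Metadata,
                weights: Dict[str, float] = WEIGHTS) -> float:
    """Sum over metadata elements of weight * number of shared index words."""
    score = 0.0
    for element, weight in weights.items():
        shared = source.get(element, set()) & candidate.get(element, set())
        score += weight * len(shared)
    return score


def rank_matches(source: Metadata,
                 candidates: Dict[str, Metadata]) -> List[Tuple[str, float]]:
    """Return (document id, score) pairs, highest score first, dropping zero scores."""
    scored = [(doc_id, match_score(source, meta)) for doc_id, meta in candidates.items()]
    scored = [pair for pair in scored if pair[1] > 0]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Illustrative data only: one standard matched against two lessons.
standard = {"title": {"software", "safety"}, "keywords": {"hazard", "software"}}
lessons = {
    "LLIS-A": {"title": {"software", "testing"}, "keywords": {"hazard"}},
    "LLIS-B": {"title": {"battery"}, "keywords": {"thermal"}},
}
ranking = rank_matches(standard, lessons)
```

Here LLIS-A shares one title word and one keyword with the standard, so it is the only candidate returned; changing the weights changes the ranking without touching the matching code.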
7Summer Status
- User interface screen developed
- Display Result
- Match documents result
- Match words summary
- Popup menu
8Fall Status
- Tasks Done
- Automatic Metadata and Index Generation for LLIS
- Testing XML repository interface for SA_MetaMatch
- Analysis of preliminary results for thesis
- Tasks In Process
- Use of XML repository in SA_MetaMatch
- Interactive word filtering
- Group document by type
- Match against Training Material
- Backup, Import / Export function
- Tasks Being Studied
- Match phrase of words
- Apply ontology
9Fall Task Done
- Auto-Generate LLIS Metadata and Index
- NASA used Auto-Gen to generate metadata for 1360
LLIS documents
10Fall Task Done
- Proved concept of XML repository integration with SA_MetaMatch
- Results and analysis in thesis
- Number of relevant documents found
- Distributions of relevant documents rank
- Selective removal of irrelevant words
- Use of special pattern words
11Fall Task Done
- Number of relevant documents found
12Fall Task Done
- Distributions of relevant documents rank
13Fall Task Done
- Problem: high-score words which are not useful in matching
- Approach: selective removal of irrelevant words
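Selective word removal can be sketched as a filter over per-word scores. The sketch below assumes a word's score is simply its frequency in the index and that the user supplies the set of words to drop; both are simplifications for illustration, not the scoring SA_MetaMatch actually uses.

```python
from collections import Counter
from typing import Iterable, Set


def word_scores(index_words: Iterable[str]) -> Counter:
    """Score each word by its frequency (a stand-in for the real scoring)."""
    return Counter(word.lower() for word in index_words)


def remove_irrelevant(scores: Counter, irrelevant: Set[str]) -> Counter:
    """Drop words that were flagged as not useful for matching."""
    return Counter({word: n for word, n in scores.items() if word not in irrelevant})


# Illustrative index words: "nasa" and "shall" score high but say little about the topic.
words = ["NASA", "shall", "weld", "shall", "nasa", "inspection", "weld", "shall"]
scores = word_scores(words)
filtered = remove_irrelevant(scores, {"nasa", "shall"})
```

After filtering, the highest-scoring remaining word is a topical one ("weld"), which is the effect the approach aims for.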
14Completed Tasks
- Use of special pattern words
15Work In Process
- Enhance SA_MetaMatch to use XML repository
- Word filtering
- User interface for interactive word removal
- Document type grouping
- E.g. All, Standards, LLIS, Training
- Enhance to match Training materials
- New document type
- ppt conversion for indexing
- Backup, Import / Export function
16Being Studied
- Match word phrase
- Apply Ontology Concept
- thesaurus, taxonomy, abbreviations, etc
17Next Steps And Further Study
- Continue the Tasks In Process
- To be enhanced
- Interface
- MetaGen
- Multiple instances
- Dynamic model
- Create MetaEdit
- Create / Edit configuration
- Match groups
- Match parameters
- Filtering
- MetaMatch
18Next Steps And Further Study
- Interface to be enhanced
- Results Display
- Interactive word freq display
- Add relevant links
- Web-based
- Search ability
- E.g. use Verity, Google, etc
- Further study
- Model Document Content Elements
- Ontology concept application
- Stemming
- Learn from query
19Comments and Suggestions from Thesis Committee
- Use data mining techniques to find
- Best ratio of element weights
- Cut off score for relevant documents
- Consult ISO standards document for metadata
elements used
20- Need to discuss transition of code, maintenance,
user manuals, etc. to NASA
21Capturing and searching Lessons Learned using
XML schemas, repositories, and stylesheets
22The Architecture
23Status Capturing New Lessons
- Lessons collected through an HTML form
- Any number of files (std. MIME types) can be attached with the form
- Lessons stored as XML documents
- One version stores lessons in the file system
- Another stores them in a Xindice repository
- Attached files are stored in the file system; the lessons have pointers to the associated files
24Status Capturing New Lessons (contd.)
- HTML form conforms to New Lesson format, i.e., a lesson has
- Author information
- Full name
- Email address
- Phone number
- Essential Lesson information
- Lesson title
- Lesson date (date lesson was learnt)
- Lesson summary
- Lesson description
- Other information associated with the lesson
- Organization
- Facility
- Project
- Phase
- Keywords (any number)
- Attached files (optional, any number)
- This format is specified in an XML schema; all new lessons (XML documents) conform to this schema
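As an illustration of the New Lesson format listed above, the following sketch builds one lesson as an XML document. The tag names (`author`, `fullName`, `keyword`, `attachedFile`, and so on) and the sample values are hypothetical; the project's schema defines the real names, and this sketch does not validate against it.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable


def build_new_lesson(author: Dict[str, str], essentials: Dict[str, str],
                     keywords: Iterable[str] = (),
                     attached_files: Iterable[str] = ()) -> ET.Element:
    """Build a lesson element; tag names are illustrative, not the project's schema."""
    lesson = ET.Element("lesson")
    author_el = ET.SubElement(lesson, "author")
    for tag in ("fullName", "email", "phone"):
        ET.SubElement(author_el, tag).text = author[tag]
    for tag in ("title", "date", "summary", "description"):
        ET.SubElement(lesson, tag).text = essentials[tag]
    for word in keywords:              # any number of keywords
        ET.SubElement(lesson, "keyword").text = word
    for path in attached_files:        # pointers to files kept in the file system
        ET.SubElement(lesson, "attachedFile", {"href": path})
    return lesson


lesson = build_new_lesson(
    {"fullName": "A. Author", "email": "author@example.org", "phone": "555-0100"},
    {"title": "Check connector torque", "date": "2003-10-01",
     "summary": "Short summary", "description": "Longer description"},
    keywords=["connector", "torque"],
    attached_files=["attachments/lesson1/photo.jpg"],
)
xml_text = ET.tostring(lesson, encoding="unicode")
```

Storing only a pointer (`href`) to each attachment mirrors the design described earlier, where attached files stay in the file system and the lesson document references them.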
25Capturing New Lessons
- Link: Capturing New Lessons
- http://swel.cs.ua.edu:8080/swapna/servlet/PostLessonsNewMultiFiles
26Capturing Lessons
- The LLIS format can be specified to contain
- Author information (same as previous format)
- Essential Lesson information (same as previous format)
- Other associated information
- Lesson Number (system generated)
- Organization
- Lesson learned (any number)
- Recommendations (any number)
- Evidence
- Enterprises (any number)
- Processes (any number)
- Keywords (any number)
- Approval information
- Date (of approval)
- Name (of approver(s))
- Approving organization
- Phone Number (of approving organization??)
27Capturing Lessons
- The LLIS and New lesson formats share common features like author information and the essential lesson information
- So, we made a hierarchy of schemas
- A base schema containing the common elements
- New lesson schema and LLIS schema extend the base schema
- Diagram on next slide
- As new lesson formats are added, the hierarchy would grow horizontally / vertically
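The extension relationship between the base schema and the two lesson formats can be mirrored in code as class inheritance. The sketch below uses Python dataclasses as a stand-in for XML schema types; the field names are paraphrased from the format lists on the earlier slides and are assumptions, not the schemas' element names.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class BaseLesson:
    """Common elements: author information and essential lesson information."""
    author_name: str
    author_email: str
    title: str
    date: str
    summary: str
    description: str


@dataclass
class NewLesson(BaseLesson):
    """New lesson format: extends the base with its own elements."""
    organization: str = ""
    facility: str = ""
    project: str = ""
    phase: str = ""
    keywords: List[str] = field(default_factory=list)


@dataclass
class LLISLesson(BaseLesson):
    """LLIS format: extends the same base with a different set of elements."""
    lesson_number: int = 0
    organization: str = ""
    lessons_learned: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)
    evidence: str = ""
    keywords: List[str] = field(default_factory=list)


def field_names(cls) -> List[str]:
    return [f.name for f in fields(cls)]


# Elements both formats have in common (base elements plus any shared extras).
common = [name for name in field_names(NewLesson) if name in field_names(LLISLesson)]
```

Adding a third lesson format would mean adding one more subclass of `BaseLesson`, which corresponds to the hierarchy growing horizontally.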
28Schema Hierarchy
29Capturing Lessons
- Requirements still to be met
- System should allow a lesson to include a URI (web link)
- Referenced information should be stored in the new system, to ensure its availability
- System should allow authorized users to define additional lesson elements (elaborated further in next few slides)
- System shall notify administrator of a lesson submission via email
30Defining New Lesson Elements
- Fixed lesson formats, like the new lesson format and the LLIS format, require as many lesson collection forms as there are formats
- The formats are restrictive and may not provide the right fields for users to divide their lesson information into
- But we have several formats, each with several types of elements (defined in their schemas), so we have a rich library of lesson elements
- Why not allow users to create their own form (lesson template), choosing from these elements?
- Why not let users define their own elements if need be? These may or may not be approved later. If approved, they would enrich the existing library of lesson elements
31Defining New Lesson Elements
- A base schema would define essential elements, like author information, which would have to be part of any lesson irrespective of its format
- The user would then have the option to extend this schema with elements chosen from the existing library of lesson elements or with newly defined elements
- The schema hierarchy wouldn't be very deep, just one level deep (diagram on next slide)
- The user would still have the option of using one of the pre-defined formats, so the earlier hierarchy (shown in one of the previous slides) would still exist
32Lessons Learned Integration and Capture: The Problem
Inserted by Scott: Integrate or delete?
- Diversity of lessons learned formats and styles across NASA (and outside of NASA)
- How to provide a logically integrated view for search, integration, and use?
- How to capture new lessons without introducing yet another legacy?
- Typical approach: consensus on universal data model that comprehends all available lesson elements
- Elusive consensus
- Rigid field/tag names and allowed structures
- Creates the next legacy
33Lessons Learned Integration and Capture: Our Approach
Inserted by Scott: Integrate or delete?
- Extensible library of lesson elements (fields/tags)
- Elements have description of purpose, role
- Ontology-guided selection and creation of elements
- Extraction templates for legacy lessons
- New lesson author can select appropriate elements or add new ones
- Lesson elements have associated capture and display stylesheet elements
- Capture and exploit structure of a given lesson, but do not force structure on lessons
- Ontology-guided selection of terms in element content
- NASA Taxonomy, NASA Thesaurus, WordNet, Working Group ontologies, etc.
- Ontology-guided search for relevant lessons exploits structure and content ontologies
34Schema Hierarchy for the Conceived System
The library of elements to extend the base schema
with
The original schema hierarchy
Extends Base Schema
Extends Base Schema
Note: The New lesson form/schema can be constructed using elements in the first row, and the LLIS form/schema using those in the second row. Elements can be compound; for example, Recommendations consists of zero or more Recommendation elements, and Approval Info consists of Name, Phone, Date, Org. These constituent elements could also be made available as individual elements.
35Possible User Interface for Posting Lessons
Post a lesson
Choose a std. lesson form
New lesson form
Choose a standard form
LLIS lesson form
Create your own form
On next slide
36Possible User Interface for Posting Lessons
(contd.)
A description of the type and meaning of element
can be provided when the mouse hovers on the
element or as part of tree menu
Form with essential elements already added
A tree menu
37Categories of Users Submitting Lessons
- In the system described above, the users submitting lessons could be categorized into the following three types
- Users who would use one of the standard lesson formats for submitting a lesson
- Users who would use the existing library of lesson elements to build their own form
- Users who would create their own lesson elements for use
- We believe most users would belong to the first category, i.e., they'll use a fixed standard form for lesson submission
38Administration of new element addition
- The administrator's interface should allow him/her to approve a newly created element and add it to the library of elements available for use
- Alternatively, the element may be found not useful or redundant, and the administrator can move the corresponding information in the submitted lesson into another appropriate element
- Allow the administrator to specify presentation style for newly approved elements (example, order). This information would be used by the stylesheet
39Status Retrieving / Searching Lessons
- A crude search mechanism is prototyped
- Allows searching for lessons by value of a certain element
- In the prototype, the search is by value of the element Phase (part of new lesson format)
- Lessons are displayed using an XML stylesheet
- URLs for attached files are provided
- Link: http://swel.cs.ua.edu:8080/swapna/servlet/PostLessons2
- http://swel.cs.ua.edu:8080/swapna/servlet/GetLessons3
- The above prototype is based on the version that stores lessons in the file system; it can be extended to provide search using other lesson elements
40Retrieving / Searching Lessons
- Requirements to be met
- Retrieving and displaying lessons from XML repository
- Xindice makes the task of performing the crude search shown previously very easy, by use of XPath
- Use indexing, like SA_MetaMatch, to provide better results
- Keyword-based search (not just the keywords stored as lesson elements in the lesson)
- Advanced search (AND, OR, NOT combinations, co-located words, phrases, word stemming, etc.)
- Allowing lessons to be organized in a taxonomic structure (categories)
- Categories corresponding to technical disciplines defined by NASA Technical Standards Program
- Support classification of lessons as technical or programmatic
- System should be extendable to support additional category structures
- System should allow categorization of lessons into zero or more categories
- Support search using taxonomic structure
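The search by value of the Phase element maps naturally onto an XPath predicate. The sketch below uses the limited XPath subset in Python's standard `xml.etree.ElementTree` rather than Xindice, and the `lessons`/`lesson`/`title`/`phase` tag names and sample lessons are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

# Illustrative collection; the real lessons follow the project's schemas.
LESSONS_XML = """
<lessons>
  <lesson><title>Review interface documents early</title><phase>Design</phase></lesson>
  <lesson><title>Re-run regression tests after patching</title><phase>Test</phase></lesson>
  <lesson><title>Freeze requirements before coding</title><phase>Design</phase></lesson>
</lessons>
"""


def titles_by_phase(root: ET.Element, phase: str) -> List[str]:
    """Titles of lessons whose <phase> child equals the given value."""
    # ElementTree supports [tag='text'] predicates; this simple form assumes
    # the phase value contains no single quote.
    matches = root.findall(f"./lesson[phase='{phase}']")
    return [lesson.findtext("title") for lesson in matches]


root = ET.fromstring(LESSONS_XML)
design_titles = titles_by_phase(root, "Design")
```

The same predicate shape works for any other lesson element, which is how the prototype's single-field search could be extended.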
41Other Requirements To Be Met
- Providing an XML wrapper for legacy systems, like relational databases
- Lessons stored in the legacy systems should be available from the same user interface as those in the current system
- Lessons stored in legacy databases should be displayed with the same fields and content they were captured in
42Other Requirements To Be Met (contd.)
- Administrative Functions
- Allow authorized user to delete a lesson
- Allow authorized user to edit a lesson
- Allow authorized user to approve a lesson
- Administration of new lesson elements (described
earlier)
43Other Requirements To Be Met (contd.)
- User Profiles and Lesson Push
- System should allow users to register interest in lessons
- Support specification using advanced search features, like Boolean operators, phrases, etc.
- System should allow users to register interest using lesson categories
- System should allow users to register interest naming their current program/project management or engineering task
- System shall notify users when a new approved lesson meets their profile
- The notification shall be via email
- The notification may be via alternate means, to be defined
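The profile and lesson push requirements can be sketched as a matching function that runs when a lesson is approved. The profile fields, the AND/NOT word matching, and all sample data below are assumptions made for illustration; actual email delivery is left out.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Profile:
    email: str
    categories: Set[str] = field(default_factory=set)       # interest by lesson category
    required_words: Set[str] = field(default_factory=set)   # all must appear (AND)
    excluded_words: Set[str] = field(default_factory=set)   # none may appear (NOT)


@dataclass
class Lesson:
    title: str
    text: str
    categories: Set[str]
    approved: bool = False


def matches(profile: Profile, lesson: Lesson) -> bool:
    """True when the lesson satisfies every condition the profile states."""
    words = set(lesson.text.lower().split())
    if profile.categories and not (profile.categories & lesson.categories):
        return False
    if not profile.required_words <= words:
        return False
    return not (profile.excluded_words & words)


def recipients(lesson: Lesson, profiles: List[Profile]) -> List[str]:
    """Email addresses to notify; only approved lessons trigger notification."""
    if not lesson.approved:
        return []
    return [p.email for p in profiles if matches(p, lesson)]


profiles = [
    Profile("a@example.org", categories={"technical"}, required_words={"software"}),
    Profile("b@example.org", required_words={"battery"}),
    Profile("c@example.org", excluded_words={"flight"}),
]
lesson = Lesson("Timing defect", "Flight software review found timing defect",
                {"technical"}, approved=True)
to_notify = recipients(lesson, profiles)
```

Keeping the match test separate from delivery leaves room for the alternate notification means that are still to be defined.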
44Other Requirements To Be Met (contd.)
- Workflow Support
- Provide workflow utility to facilitate approval of entered lessons
- Provide workflow utility to facilitate capture and application of lessons as part of performing Program/Project Management Process (example, NPG7120.5B)
- System should be extensible, allow authorized users to insert a workflow utility
- Metrics
- Provide metrics, to be defined
45Immediate Next Steps ??
46NTSP's Lessons Learned/Training development
47Other NASA development needs
- - Search engine
- - Knowledge Management System
- - Learning Management System
- - ???
48Interactive Process
Scott Update this with slides from recent
presentations
49Abstract
- Aerospace systems demand high-quality software engineering processes to deliver high-quality products
- NASA Marshall Space Flight Center's Avionics Flight Software development group (group code ED14) seeks to improve their ability to deliver space systems that meet all their requirements, incorporate prior engineering experience, and are delivered on time and on budget.
- We developed a web-based process portal that
- Provides guidance on software development techniques
- Provides associated standards and lessons learned at appropriate points in their development process
- Captures developers' experience with the process and incorporates this new experience, as new lessons learned, into the web of NASA knowledge for subsequent use by other developers.
- The web portal is based on an underlying formal model of the software engineering process activities and artifacts
- A semantic basis for context-based search and for reasoning about the engineering process
- Result: information web portal to search for and deliver process information to support flight software development
50Approach
The Semantics of the shapes
Document
Multiple documents
Model
Website
Specification
Process
51Topics
- Initial Web Portal and Assessment
- Initial Activity Model and Assessment
- Refined Activity Model
- Model-Driven Web Portal
- Initial Mapping to PSL
- Conclusion
- Possible Future Work
52Structure of an Activity Description
- Regular pattern to the structure of an activity in ED14 OWI
- Activity Name
- Activity Description
- Sub-Activities
- Products/Documents Used
- Products/Documents Developed
- Quality-Related Activities
- Quality Records
- So, an activity has
- a description
- sub-activities (described the same way)
- a list of products and documents used
- a list of products and documents developed
53Initial ED14 Web Portal Prototype
- Focusing on ED14 OWI software design activities, we developed a web portal prototype to deliver information to activities
- The development of the initial ED14 Web Portal prototype used HTML and JavaScript as hard-coded representations of the engineering activities and their use of standards documents
- The web portal prototype is available at http://cs.ua.edu/graduate/hma/Advisor/MainMenu.htm
- Snapshots from the web site follow
54Initial Prototype Activity Navigation
55Assessment for Initial Prototype
- Hard-coding the web pages would make it difficult to change the web portal as the OWI evolves
- There is no support for capturing comments, issues, and annotations for the information in the web portal, or for capturing lessons learned
- The implementation of the information is not in a form to support query and reasoning about that information.
56UML Model of Activities and Documents(First
Iteration)
57Activity Models Standards and Formality
- Issues
- Clarify role of document
- - Input, output, guidance (Standard, Guideline, Lesson, etc.)
- Document granularity: some of the documents are large, and the portion that is used or produced by an activity is a small portion of the larger document.
- Project-specific enactment of generic process
- - Specific documents and activities for a specific project
58Activity Model Standards
- The IDEF family of standards (focus on IDEFØ) is used to model manufacturing production systems and supporting information systems.
- The OMG Software Process Engineering Metamodel (SPEM) standard is used to model processes as a configurable assembly of activities and work products.
- Business process modeling standards, such as the Workflow Management Coalition's reference models and the associated OMG Workflow standard, focus on the information to support business activities (including engineering activities) and the flow of control between activities (workflow).
- These standards guide the modeling effort.
59Activity Definition
- A process is a collection of activities executed and coordinated to achieve goals
- Activities produce a result (output) by acting upon one or more inputs
- Inputs and outputs are data and/or physical objects (materials, documents, etc.) involved in an activity
- Inputs serve different roles in the activities
- Some inputs are consumed or transformed to produce outputs
- Some inputs are guidance or control on how an activity is performed
- Associated with an input is a resource or mechanism that performs the activity
Guidance or Controls
Activity
Outputs
Inputs
Resources or Mechanisms
IDEFØ and SPEM
60ED14 Preliminary Software Design Activity
- NASA-STD-2201-93
- NASA-STD-8719.13A
- NASA-GB-1740.13-96
- IEEE-1016
- IEEE-1016.1
- IEEE-12207.1
- Software Requirements Spec.
- Hardware Design Concepts
- Interface Control Doc.
- Trade Studies
Prelim. Software Design
- Preliminary Software Design Description
- Trade Studies
- Software Design Team
- Project Development Team
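The ED14 activity above can be written down as an instance of the inputs/controls/outputs/mechanisms pattern from the activity definition. The sketch below is a plain data structure, not the project's UML model, and the grouping of the listed items into the four roles is inferred from the slide layout, so treat it as an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Activity:
    name: str
    inputs: List[str] = field(default_factory=list)      # consumed or transformed
    controls: List[str] = field(default_factory=list)    # guidance on how to perform
    outputs: List[str] = field(default_factory=list)     # results produced
    mechanisms: List[str] = field(default_factory=list)  # who or what performs it
    sub_activities: List["Activity"] = field(default_factory=list)


# Role assignment below is inferred from the diagram layout (assumption).
prelim_design = Activity(
    name="Prelim. Software Design",
    inputs=["Software Requirements Spec.", "Hardware Design Concepts",
            "Interface Control Doc.", "Trade Studies"],
    controls=["NASA-STD-2201-93", "NASA-STD-8719.13A", "NASA-GB-1740.13-96",
              "IEEE-1016", "IEEE-1016.1", "IEEE-12207.1"],
    outputs=["Preliminary Software Design Description", "Trade Studies"],
    mechanisms=["Software Design Team", "Project Development Team"],
)


def updated_documents(activity: Activity) -> Set[str]:
    """Documents that are both used and produced, i.e. revised by the activity."""
    return set(activity.inputs) & set(activity.outputs)


revised = updated_documents(prelim_design)
```

Holding the roles as data rather than as page text is what lets a portal answer questions such as which standards guide a given activity.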
61Subactivities of ED 14 Software Design Activity
62Process Definition, Enactment, and Execution
Process Models
Process Design and Definition
Activity Instantiation and Control
Activity Enactment
Engineering Tools and Information
Activity Execution
63Refined Activity Model
- Toward a formal ontology
- Packages in the ontology
- Based on
- IDEF
- OMG Software Process Engineering Metamodel
- OMG Workflow Management
- Other Software Engineering models
64Basic Elements Package
- All model elements can have
- Guidance
- External description
[Class diagram: Basic Elements Package, with classes Element, ModelElement, PresentationElement, ExternalDescription, Guidance, and GuidanceKind]
- ModelElement (name: Name) is the subject of its PresentationElements (roles: subject, presentation)
- PresentationElement is a human-readable textual and graphical notation for the corresponding model elements.
- ExternalDescription (name: String, location: String, content: String, medium: String, language: String) specializes the "presentation" role of ModelElement (role: description)
- Guidance is attached to a ModelElement (roles: annotatedElement, guidance) and has one GuidanceKind (role: kind)
- GuidanceKind examples: technique, directive, checklist, tool mentor, guideline, template, estimate, etc.
- location: URI (file path, URL, etc.)
- medium: format (MIME type?)
- language: English, Japanese, etc.
65EnactmentPackage
- Mechanisms to
- Start and complete a specific instance of an activity
- Record history about that activity instance.
66Activities Package
Models the three key elements of activities: Person (Role) performing an Activity to produce a Product
67Model-Driven ED14 Web Portal
- Examples of instantiating the Activity Model for the ED14 OWI
- Activity aggregation hierarchy
- Activity descriptions: text, web page, and flow diagram
- Input and output WorkProducts for Preliminary Software Design
- Guidance elements for the ED14 Preliminary Software Design activity
- Model-Driven ED14 web pages
- Drive the web portal using the model class instances.
- Write scripts to generate web pages with a data-driven, dynamic approach.
- The web pages are a visualization of the underlying model of roles, activities, and work products.
- The underlying semantic model drives the user-interactive web pages
68Activities for ED14 OWI
69Activity Descriptions
[Object diagram: three ExternalDescription instances, each linked to its subject activity through the description role]
ExternalDescription (plain text)
name: Text Description
location: file://cs.ua.edu/.
content: The Preliminary Software Design phase will begin
format: text/plain
language: eng
ExternalDescription (web page)
name: Web Page Description
location: http://cs.ua.edu/.
content: <html> <head> <meta> </head>
format: text/html
language: eng
ExternalDescription (flow diagram)
name: Flow Diagram Web Page
location: http://cs.ua.edu/.
content: <html>prelimDesign.jpeg
format: text/html
language: eng
- Graphical elements for user interfaces or generated documents
- For search and reasoning, knowing the names, types, and relationships among the specific elements will be valuable
70Input and Output WorkProducts to an Activity
71Guidance Elements for an Activity
[Diagram: Guidance elements linked through the guidance role to the activity as annotatedElement]
72Implement Activity Model with XML Data
The web portal is driven by an XML document (tree.xml) that implements the activity model instances.
tree.xml:
<tree>
  <branch id="design">
    <branchText>Software Design</branchText>
    <leaf> <leafText>Description</leafText> <link>\design.html</link> </leaf>
    <branch id="designsub">
      <branchText>Sub Activities</branchText>
      <branch id="Predesign">
        <branchText>Preliminary Design</branchText>
        <leaf> <leafText>Description</leafText> <link>\Predesign.html</link> </leaf>
        <leaf> <leafText>Product/Document Used</leafText> <link>\inputProduct.htm</link> </leaf>
        <leaf> <leafText>Product/Document Developed</leafText> <link>\outputProduct.htm</link> </leaf>
      </branch>
    </branch>
  </branch>
</tree>
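How an XML document like tree.xml can drive the navigation menu is sketched below: a recursive walk turns branches into menu headings and leaves into links. The portal's own generator is not shown in these slides, so the function name, the indentation scheme, and the text output format are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List

# Excerpt of tree.xml, closed so that it is well-formed.
TREE_XML = r"""
<tree>
  <branch id="design">
    <branchText>Software Design</branchText>
    <leaf><leafText>Description</leafText><link>\design.html</link></leaf>
    <branch id="designsub">
      <branchText>Sub Activities</branchText>
      <branch id="Predesign">
        <branchText>Preliminary Design</branchText>
        <leaf><leafText>Description</leafText><link>\Predesign.html</link></leaf>
        <leaf><leafText>Product/Document Used</leafText><link>\inputProduct.htm</link></leaf>
        <leaf><leafText>Product/Document Developed</leafText><link>\outputProduct.htm</link></leaf>
      </branch>
    </branch>
  </branch>
</tree>
"""


def render_menu(node: ET.Element, depth: int = 0) -> List[str]:
    """Return one indented text line per branch heading or leaf link."""
    lines: List[str] = []
    indent = "  " * depth
    for child in node:
        if child.tag == "branch":
            lines.append(indent + child.findtext("branchText", default=""))
            lines.extend(render_menu(child, depth + 1))
        elif child.tag == "leaf":
            lines.append(f"{indent}{child.findtext('leafText')} -> {child.findtext('link')}")
    return lines


menu = render_menu(ET.fromstring(TREE_XML))
```

Because the menu is generated from the data, adding an activity means editing tree.xml rather than any page code, which addresses the hard-coding concern raised for the initial prototype.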
73Process Web Site Snapshot 2
Activity name Search
- XML document drives
- Navigation menu structure
- Page content
Process Info in Consistent Structure
74Implement Activity Model with Relational Databases
The content of the navigation menu is stored in the Microsoft Access database. Activity/subactivity instances are implemented as a (Parent)folder/subfolder, and the Activity ExternalDescriptions are implemented as subfolders under a parent Description subfolder of the parent activity folder. This implements the three types of ExternalDescription instances for an Activity as follows:
75Capture Lessons Learned to be XML Document
76Capture Lessons Learned as an XML Document
- The structure of a lesson is defined by this XML schema
- A Java servlet saves the submitted lesson as an XML document conforming to the schema
- XML stylesheets define the web page layout of a lesson capture form and a lesson display page
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xsd:element name="lesson">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string" />
        <xsd:element name="phone" type="xsd:string" />
        <xsd:element name="email" type="xsd:string" />
        <xsd:element name="org" type="xsd:string" />
        <xsd:element name="facility" type="xsd:string" />
        <xsd:element name="project" type="xsd:string" />
        <xsd:element name="activity" type="xsd:string" />
        <xsd:element name="date" type="xsd:string" />
        <xsd:element name="topic" type="xsd:string" />
        <xsd:element name="keywords" type="xsd:string" />
        <xsd:element name="summary" type="xsd:string" />
        <xsd:element name="description" type="xsd:string" />
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
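What conformance to the schema above means for a lesson document can be illustrated with a lightweight check of the child element sequence. This is not full XSD validation (Python's standard library has none) and is not the servlet's code; the helper names are hypothetical, and only the element order is taken from the schema.

```python
import xml.etree.ElementTree as ET
from typing import List

# Child element order taken from the xsd:sequence in the schema above.
LESSON_SEQUENCE = ["name", "phone", "email", "org", "facility", "project",
                   "activity", "date", "topic", "keywords", "summary", "description"]


def sequence_errors(lesson: ET.Element) -> List[str]:
    """Report deviations from the expected child sequence (not a full XSD validation)."""
    if lesson.tag != "lesson":
        return [f"root element is <{lesson.tag}>, expected <lesson>"]
    actual = [child.tag for child in lesson]
    if actual == LESSON_SEQUENCE:
        return []
    errors = [f"missing <{tag}>" for tag in LESSON_SEQUENCE if tag not in actual]
    errors += [f"unexpected <{tag}>" for tag in actual if tag not in LESSON_SEQUENCE]
    if not errors:
        errors.append("elements are out of order or repeated")
    return errors


def make_lesson(tags: List[str]) -> ET.Element:
    """Build a sample lesson with placeholder text in each listed element."""
    lesson = ET.Element("lesson")
    for tag in tags:
        ET.SubElement(lesson, tag).text = "sample"
    return lesson


good = make_lesson(LESSON_SEQUENCE)
no_email = make_lesson([tag for tag in LESSON_SEQUENCE if tag != "email"])
reversed_order = make_lesson(list(reversed(LESSON_SEQUENCE)))
```

Because every element in the schema is a required string in a fixed sequence, checking names and order covers the structural part of the schema; a real deployment would use a schema-aware validator instead.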
77Capture New Lessons
Capture New Lessons for Later Use
78Search Lessons Learned
- Besides the lesson instance in the form of XML, a separate XML document (lessonslist.xml) is maintained for search
- Contains the list of submitted lessons
- Each element in this list represents a submitted lesson
- Contains the name of the lesson, lesson number, its location and the activity it is associated with.
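A search over lessonslist.xml by activity name could look like the sketch below. The slides name the four pieces of information kept per lesson but not the tag names, so `lessonslist`, `lesson`, `name`, `number`, `location`, and `activity` are hypothetical, as are the sample entries.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Illustrative index document; the real lessonslist.xml layout may differ.
LESSONS_LIST_XML = """
<lessonslist>
  <lesson>
    <name>Review interface documents early</name>
    <number>1</number>
    <location>lessons/lesson1.xml</location>
    <activity>Preliminary Design</activity>
  </lesson>
  <lesson>
    <name>Trace tests to requirements</name>
    <number>2</number>
    <location>lessons/lesson2.xml</location>
    <activity>Software Test</activity>
  </lesson>
</lessonslist>
"""


def search_by_activity(root: ET.Element, activity_name: str) -> List[Dict[str, str]]:
    """Return index entries whose activity matches, ignoring case and outer spaces."""
    wanted = activity_name.strip().lower()
    hits: List[Dict[str, str]] = []
    for entry in root.findall("lesson"):
        activity = (entry.findtext("activity") or "").strip().lower()
        if activity == wanted:
            hits.append({
                "number": entry.findtext("number") or "",
                "name": entry.findtext("name") or "",
                "location": entry.findtext("location") or "",
            })
    return hits


hits = search_by_activity(ET.fromstring(LESSONS_LIST_XML), "preliminary design")
```

Keeping a small index document separate from the lesson files means a search reads one file and then follows `location` only for the lessons that match.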
79Activity Name Search
Search Lessons Learned
The results of searching by activity name: Preliminary Design
80Improvement to Model-Driven Web Portal
- Dynamic model-driven web pages would make it easy to change the web portal content as the software engineering process evolves
- Web content can also be changed through user interaction
- We have implemented the support for capturing lessons learned in the web portal
- Users can search through the lessons learned in the web portal by activity name or other keywords.
81Initial Conceptual Mapping to PSL
82Conclusion
- We have prototyped an interactive process web portal
- Software engineering process information for a flight software development organization
- Link to standards, lessons learned, etc.
- We have prototyped a formal underlying model (ontology) of the software engineering process
- Model-driven process web portal
- Enables semantic-based search, process automation, process assembly, etc.
- We performed an iterative procedure for modeling
- We demonstrated the use of XML, relational databases, Java, and web technologies to implement a model-driven web portal.
83Possible Future Work
- Further implement the model-driven web site
- Implement project-specific tailoring
- Implement the project-specific enactment package in the activity model
- Incorporate PSL concepts
- Represent PSL concepts and relationships in UML
- Transform PSL axioms from KIF to UML OCL
- Incorporate PSL concepts to current model
- Move closer to a formal ontology enabling automated reasoning, including adoption of process modeling standards and semantic web standards
- Integrate with Standards Advisor MetaMatch tools to search for standards, lessons learned, and other information relevant to the process activities
- Integrate with the Standards Advisor XML repository for storage, retrieval, update and search of XML-based documents.
84XML Technologies
85Technologies
- Baseline
- Layered Approach
- Technologies
- What has been done
- What we are doing
- What lies ahead
- Crosscutting
86Technology Adoption
- Training Material
- Lessons Learned
- NASA Standards
- Government Standards
- Industry Standards
Discover
87Technology Adoption
Delivery
Technology Independent
Location Independent
Format Independent
88Technology Adoption
89Technology Adoption
Multiple technologies for each layer
Each layer is sublayered
.ppt
.html
.pdf
.doc
Relational DB
Other Repository
Knowledge Exchange via EXtensible Markup Language (XML)
90Why XML?
- Extensible
- allows groups of people or organizations to define vocabularies, data types, and relationships to enable knowledge exchange in their domain
- flexible development of user-defined document types
- define standards, extend standards, map between standards
- Robust, non-proprietary, persistent, and verifiable format
- Widely accepted, tool support
91XML Terminology
- XSL - a language for expressing stylesheets. Two parts
- a language for transforming XML documents, and
- an XML vocabulary for specifying formatting semantics.
- XML Schema - describes the valid format of an XML data set, including what elements are (and are not) allowed at any point
- XSLT - a language for transforming XML documents into other XML documents.
- XPath - a language for addressing parts of an XML document
- XLink - XML Linking Language allows elements to be inserted into XML documents in order to create and describe links between resources.
- XQuery - a query language designed to be broadly applicable across many types of XML data sources
- See www.w3c.org for details
92Technology Adoption
XML
Groundwork/test-bed for development across the
layers
.ppt
.html
.pdf
.doc
Knowledge Location
Knowledge Presentation Strategies
Relational DB
- Current Proof Of Concept
- Simple Search on Keyword
- Single Format (Fixed Fields)
- Stylesheets
Other Repository
93Knowledge Location
- More than search Less than search
- Capture content and semantics
- Reduce information overload
- Exploit metadata to find related information
- Thus finding related documents more accurately
- SA_MetaMatch
- Christina Yau, Master's Thesis
- Provide concept from ontology
- Related terms and relationships from ontology to help users in searching and in understanding whether a found document is relevant to their task.
- To use existing search technology to crawl, index, access, and display a diversity of data
- including HTML and XML pages, application-specific document types (Word, Acrobat, etc.), and ODBC-compliant databases.
- Investigating SWISH-e, Verity, Cuadra Star
- Integration of search engines with XML repositories
- Adopted Xindice, open source XML repository
- Bindu Arla, Master's Thesis
94Knowledge Location
- Current Work
- Search requirements, see previous slides
- XQuery and XPath
- SWISH-e, an open source indexer
- Niveditha Thagarapu, MS Thesis
- Shashi Gireddy, PhD
- Research Areas
- Semantic Web Technologies
- Ontology Development
- Search Strategies
- Knowledge representation
- Process representation
95Knowledge Presentation Strategies
- Schemas and Stylesheets to provide a common consistent view.
- Developing a family of stylesheets
- Vijaya Uppala
- XML administrative functions
- Change management, configuration management, workflow
- Format and location independence
- Native available
- Platform independence
- Current Work
- Sangram Vajre, Shashi Gireddy
- Web services
- Swapna Gupta, Ma Hong
- Lead Development, XML Repository integration
96Crosscutting Concerns
- Fences or Hurdles
- Security
- Politics
- Third party intellectual property.
97Crosscutting Concerns
- Web Services
- Motivation
- Decentralized
- Provide services and request service through secured firewalled installations
- SOAP
- Simple Object Access Protocol (Historical)
- WSDL
- Web Services Description Language
- UDDI
- Universal Description, Discovery and Integration
98Summary
- Lots of organizational knowledge
- LLIS
- Training Material
- Standards
- Concentrate on bringing that knowledge to bear in solving SPECIFIC problems.
- Use Context
- Limit the noise
- Relevant
- Timely
- Layered approach
- Decouple knowledge sources from search from presentation
- Use XML based technologies to facilitate communications
- Move toward distributed services
99Consider content from the NSF proposal
100Object Layer Above XML Repository
101Object Layer above XML Repository
- Goals
- To ease use of XML technologies and repositories.
- Portability of client programs independent of repository.
- Similar kind of API for all kinds of packages, e.g. LLIS, ED14, training materials
- XML document validation against schema
102Object layer architectural design
Client Packages
metamatch
ED14 lessons Group
Java doc API
edu.ua.cs.swel. llis
Document Manager
Object Layer
edu.ua.cs.swel. ED14lessonslist
Class generation
Xml schema
edu.ua.cs.swel. lessonslearned
Connect xindice
Marshalling-unmarshalling objects
XML Database
Object Database
Relational Database
Database Layer
103Current developed packages from schemas
- Package edu.ua.cs.swel
- Package edu.ua.cs.swel.llis
- Package edu.ua.cs.swel.lessonslearned
- Package edu.ua.cs.swel.ED14lessonslist
Link to javadoc api
104Other features
- Highly extensible: any number of packages can be plugged into this layer with minimal impact
- A change in the schema changes the structure of the class.
- The API uses the best of Java features: inheritance, exceptions, polymorphism, collections
- Client programs and packages need not be changed when the repository changes.
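The claim that client programs need not change when the repository changes can be illustrated with a small sketch of the layering. It is written in Python rather than the project's Java, and every class and method name below is hypothetical; it is not the edu.ua.cs.swel API.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod
from typing import Dict


class DocumentRepository(ABC):
    """Storage interface that the object layer depends on (hypothetical)."""

    @abstractmethod
    def store(self, doc_id: str, xml_text: str) -> None: ...

    @abstractmethod
    def retrieve(self, doc_id: str) -> str: ...


class InMemoryRepository(DocumentRepository):
    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def store(self, doc_id: str, xml_text: str) -> None:
        self._docs[doc_id] = xml_text

    def retrieve(self, doc_id: str) -> str:
        return self._docs[doc_id]


class FileSystemRepository(DocumentRepository):
    def __init__(self, directory: str) -> None:
        self._directory = directory

    def _path(self, doc_id: str) -> str:
        return os.path.join(self._directory, doc_id + ".xml")

    def store(self, doc_id: str, xml_text: str) -> None:
        with open(self._path(doc_id), "w", encoding="utf-8") as handle:
            handle.write(xml_text)

    def retrieve(self, doc_id: str) -> str:
        with open(self._path(doc_id), encoding="utf-8") as handle:
            return handle.read()


class DocumentManager:
    """Object layer: clients work with lessons, never with storage details."""

    def __init__(self, repository: DocumentRepository) -> None:
        self._repository = repository

    def save_lesson(self, doc_id: str, title: str) -> None:
        lesson = ET.Element("lesson")
        ET.SubElement(lesson, "title").text = title
        self._repository.store(doc_id, ET.tostring(lesson, encoding="unicode"))

    def lesson_title(self, doc_id: str) -> str:
        return ET.fromstring(self._repository.retrieve(doc_id)).findtext("title")


def round_trip(repository: DocumentRepository) -> str:
    """Client code: identical regardless of which repository is plugged in."""
    manager = DocumentManager(repository)
    manager.save_lesson("lesson-1", "Label test harness connectors")
    return manager.lesson_title("lesson-1")


memory_title = round_trip(InMemoryRepository())
with tempfile.TemporaryDirectory() as directory:
    file_title = round_trip(FileSystemRepository(directory))
```

The `round_trip` client is the same for both back ends; an XML database adapter would be one more subclass of `DocumentRepository`.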
105Relation to Dave Rine's Work
106Next Steps
- Show flattened UML representation
- Show current LLIS and new LLIS form
- Identify elements/fields/vocabulary for an LL XML Schema
- DocBook as a guide for identifying pieces