Title: Internet XML Packaging Concepts
1Internet XML Packaging Concepts
- Lou Reich
- CCSDS Workshop
- 3-April-2002
2Rationale
- The current CCSDS Standards for Data Packaging have not undergone a major revision in 15 years
- The computing environment and the understanding of metadata have changed radically
- Physical media -> Electronic transfer
- No standard language for metadata -> XML
- Homogeneous Remote Procedure Call -> CORBA, SOAP
- Little understanding of long-term preservation -> OAIS RM
3Goals
- Assume both media and network exchange
- Maintain current SFDU functionality
- Direct mapping to OAIS Information Models
- Linkage of data and software
- Support for multiple encodings/compressions on an individual data item or on the entire package
- Use of XML-based technologies
- Reuse of existing standards efforts and tools
4Areas of Investigation
- Existing XML Packaging Mechanisms
- IMS Global Learning Consortium, iXF, XPACK (ESA)
- Web Services Frameworks
- SOAP, WSDL, DAML/S
- Digital Library Efforts
- METS, FEDORA
- GRID Computing
- GLOBUS Toolkit
5Packaging Methodologies
6Packaging Mechanisms
- Single file
- XML restricts contents
- Consolidate the entire directory structure, including all files, into a single file. The current implementation is ZIP, but others such as CAB, JAR, and TAR can also be used.
- An alternative method, increasingly being adopted by other specifications (e.g. ebXML, BizTalk Framework 2.0, SOAP with Attachments), is to use MIME, in particular multipart/related, as a packaging format. This is more general as a mechanism for messages that transfer multiple files and does not exclude using a compression technique for contained files.
- Use of a manifest or table of contents object makes the single file more complex but enables both directory and message models
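The single-file approach with a manifest can be sketched as follows. This is a minimal illustration only: the manifest file name `manifest.xml`, its `manifest`/`file` elements, and the `href`/`size`/`sha256` attributes are assumptions made for this example and are not taken from any of the specifications named above.

```python
import hashlib
import io
import zipfile
import xml.etree.ElementTree as ET


def build_package(files):
    """Pack {relative_path: bytes} into one in-memory ZIP with a manifest.

    The manifest name and its element/attribute names are illustrative
    assumptions, not part of any packaging specification.
    """
    manifest = ET.Element("manifest")
    for path, data in sorted(files.items()):
        ET.SubElement(manifest, "file", {
            "href": path,
            "size": str(len(data)),
            "sha256": hashlib.sha256(data).hexdigest(),
        })
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # The manifest acts as the table of contents for the package.
        zf.writestr("manifest.xml", ET.tostring(manifest, encoding="utf-8"))
        for path, data in files.items():
            zf.writestr(path, data)
    return buf.getvalue()


def list_package(package_bytes):
    """Read the manifest back and return the listed paths in manifest order."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        manifest = ET.fromstring(zf.read("manifest.xml"))
        return [entry.get("href") for entry in manifest.findall("file")]
```

Because the manifest is an ordinary member of the archive, a receiver can read it first and then decide which files to extract, which is what lets one file serve both the directory and the message model.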
7Web Services
8mustUnderstand Attribute
- The "mustUnderstand" attribute is the same as "mandatory" in the HTTP Extension Framework
- Requires that the receiving SOAP processor accept, understand, and obey the semantics of the header, or fail
- This allows old applications to fail gracefully on services that they do not understand
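The accept-or-fail rule can be sketched as a header check on the receiving side. This is a simplified illustration assuming SOAP 1.1 envelopes; the `t:Transaction` header, its `urn:example:transaction` namespace, and the `MustUnderstandFault` exception are hypothetical names introduced for the example, and a real processor would return a SOAP fault rather than raise a Python exception.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 namespace


class MustUnderstandFault(Exception):
    """Raised when a mandatory header block is not understood (hypothetical)."""


def check_headers(envelope_xml, understood):
    """Return the header block tags, failing on unknown mandatory blocks.

    `understood` is a set of fully qualified tags ("{namespace}local")
    that this hypothetical receiver knows how to process.
    """
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP_ENV}}}Header")
    seen = []
    if header is None:
        return seen
    for block in header:
        mandatory = block.get(f"{{{SOAP_ENV}}}mustUnderstand") == "1"
        if mandatory and block.tag not in understood:
            raise MustUnderstandFault(block.tag)
        seen.append(block.tag)
    return seen


# Example message with one mandatory header block (made-up namespace).
MESSAGE = """<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:t="urn:example:transaction">
  <SOAP-ENV:Header>
    <t:Transaction SOAP-ENV:mustUnderstand="1">5</t:Transaction>
  </SOAP-ENV:Header>
  <SOAP-ENV:Body/>
</SOAP-ENV:Envelope>"""
```

A receiver that lists `{urn:example:transaction}Transaction` in `understood` processes the message; one that does not raises the fault instead of silently ignoring the header.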
9WSDL
- Think "TypeLib for SOAP"
- WSDL: Web Services Description Language
- Uniform representation for services
- Transport Protocol neutral
- Access Protocol neutral (not only SOAP)
- Describes
- Schema for Data Types
- Call Signatures (Message)
- Interfaces (Port Types)
- Endpoint Mappings (Bindings)
- Endpoints (Services)
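The parts a WSDL document describes can be seen by walking a small example. The sketch below assumes WSDL 1.1; the `PackageService` description, its names, and its `urn:example:package` namespace are made up for illustration, and the binding is left without protocol details.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# A deliberately tiny, hypothetical service description.
WSDL_DOC = """<definitions name="PackageService"
    targetNamespace="urn:example:package"
    xmlns:tns="urn:example:package"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <message name="GetPackageRequest">
    <part name="packageId" type="xsd:string"/>
  </message>
  <message name="GetPackageResponse">
    <part name="package" type="xsd:base64Binary"/>
  </message>
  <portType name="PackagePortType">
    <operation name="GetPackage">
      <input message="tns:GetPackageRequest"/>
      <output message="tns:GetPackageResponse"/>
    </operation>
  </portType>
  <binding name="PackageBinding" type="tns:PackagePortType"/>
  <service name="PackageService">
    <port name="PackagePort" binding="tns:PackageBinding"/>
  </service>
</definitions>"""


def summarize(wsdl_xml):
    """Collect the names of each kind of WSDL 1.1 component."""
    root = ET.fromstring(wsdl_xml)

    def names(tag):
        return [e.get("name") for e in root.iter(f"{{{WSDL_NS}}}{tag}")]

    return {
        "messages": names("message"),     # call signatures
        "port_types": names("portType"),  # interfaces
        "operations": names("operation"),
        "bindings": names("binding"),     # endpoint mappings
        "services": names("service"),     # endpoints
        "ports": names("port"),
    }
```

Calling `summarize(WSDL_DOC)` returns one list of names per component kind, mirroring the "Describes" list above.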
10ebXML Message Structure
Communication Protocol Envelope (HTTP, SMTP, etc.)
  Message Package: SOAP Messages with Attachments MIME Envelope
    Header Container: MIME Part
      SOAP-ENV:Envelope
        SOAP-ENV:Header (ebXML Header Information)
          eb:MessageHeader
          eb:TraceHeaderList
          Other: etc.
        SOAP-ENV:Body (ebXML Message Service Handler control data)
          eb:Manifest
          eb: etc.
          Other: etc.
    Payload Container(s): MIME Part
      Payload
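The envelope-within-envelope layout above can be approximated with Python's standard `email` package. This is a rough sketch, not an ebXML implementation: the Content-ID values, the function names, and the bare SOAP string passed in are assumptions for illustration, and the ebXML header elements themselves are not generated.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_message(soap_xml, payload, payload_cid="payload-1"):
    """Wrap a SOAP envelope and one binary payload in multipart/related."""
    package = MIMEMultipart("related", type="text/xml", start="<header>")

    # Header container: the first MIME part carries the SOAP envelope.
    header = MIMEText(soap_xml, "xml", "utf-8")
    header.add_header("Content-ID", "<header>")
    package.attach(header)

    # Payload container: an opaque attachment referenced by Content-ID.
    part = MIMEApplication(payload)  # defaults to application/octet-stream
    part.add_header("Content-ID", f"<{payload_cid}>")
    package.attach(part)
    return package.as_bytes()


def split_message(raw):
    """Return (soap_xml, {content_id: payload_bytes}) from a raw message."""
    message = message_from_bytes(raw)
    parts = message.get_payload()
    header, payloads = parts[0], parts[1:]
    soap_xml = header.get_payload(decode=True).decode("utf-8")
    return soap_xml, {
        p["Content-ID"]: p.get_payload(decode=True) for p in payloads
    }
```

The resulting bytes would then travel inside whatever communication protocol envelope (HTTP, SMTP, etc.) is in use.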
11DIME: Direct Internet Message Encapsulation
- DIME is a lightweight encapsulation format
- Publicly available on gotdotnet
- Native support for "multipart"
- A DIME message is a collection of records
- Support for chunking of records
- Records can be chunked for streamed data
- Efficient parsing
- Size, type, and message id available up front
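The record and chunking ideas can be illustrated with a simplified, DIME-like encoding. Note that this is not the DIME wire format: the 9-byte header layout, the flag values, and the function names below are assumptions chosen to keep the example short. It only mirrors the concepts listed above: a message is a sequence of records, sizes and type/id come before the data, and a "chunk follows" flag lets a record be split.

```python
import struct

# Simplified header (NOT the real DIME layout): flags, id length,
# type length, data length, all big-endian.
HEADER = struct.Struct(">BHHI")
MB, ME, CF = 0x4, 0x2, 0x1  # message begin, message end, chunk follows


def encode(records, chunk_size=None):
    """Encode [(id, type, data)] into one message, chunking large data."""
    pieces = []
    for rec_id, rec_type, data in records:
        step = chunk_size or max(len(data), 1)
        chunks = [data[i:i + step] for i in range(0, len(data), step)] or [b""]
        for n, chunk in enumerate(chunks):
            first, last = n == 0, n == len(chunks) - 1
            pieces.append((
                0 if last else CF,
                rec_id.encode() if first else b"",    # id only on first chunk
                rec_type.encode() if first else b"",  # type only on first chunk
                chunk,
            ))
    out = bytearray()
    for i, (flags, rid, rtype, chunk) in enumerate(pieces):
        if i == 0:
            flags |= MB
        if i == len(pieces) - 1:
            flags |= ME
        out += HEADER.pack(flags, len(rid), len(rtype), len(chunk))
        out += rid + rtype + chunk
    return bytes(out)


def decode(message):
    """Reassemble chunked records into [(id, type, data)]."""
    records, pos, current = [], 0, None
    while pos < len(message):
        flags, id_len, type_len, data_len = HEADER.unpack_from(message, pos)
        pos += HEADER.size
        rid = message[pos:pos + id_len].decode()
        pos += id_len
        rtype = message[pos:pos + type_len].decode()
        pos += type_len
        data = message[pos:pos + data_len]
        pos += data_len
        if current is None:
            current = [rid, rtype, bytearray()]
        current[2] += data
        if not flags & CF:  # last chunk of this record
            records.append((current[0], current[1], bytes(current[2])))
            current = None
        if flags & ME:
            break
    return records
```

Because every length is known before the data is read, a parser can skip or stream a record without scanning for boundary strings, which is the efficiency argument made for DIME over MIME multipart.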
12Digital Libraries
14OAIS vs. METS
[Figure: mapping of OAIS information model entities to METS elements. Updated DMS 02-03-31]
OAIS entities: Archival Information Package, Packaging Information, Descriptive Information, Content Information, Data Object, Representation Information, Structure, Semantics, Preservation Description Information, Reference Information, Context Information, Provenance Information, Fixity Information
METS primary schema elements: <METS>, <dmdSec>, <amdSec>, <mdRef>, <fileGrp>, <file>, <structMap>, <techMD>, <rightsMD>, <sourceMD>, digiprovMD, OBJID, <altRecordID>
Extension schemas: GDM, RightsMD, SourceMD, AllFilesMD, TextMD, ImageMD, AudioMD, VideoMD
Underlying formats: XML, XML Schema, METS XML schema
Relationship labels: described by, delimited by, identifies, derived from, further described by
Legend: Black Arial = OAIS; Red Times New Roman = METS Primary Schema; Green Times New Roman Italics = Extension Schema
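To make the METS element names on this slide concrete, the sketch below builds a bare METS skeleton with Python's ElementTree. It is an illustration under stated assumptions: the identifiers (`DMD1`, `TECH1`, `FILE1`), the `build_mets` function, and the example URLs passed to it are hypothetical, the output has not been validated against the METS schema, and no particular OAIS-to-METS pairing is implied by the code.

```python
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)


def q(tag):
    """Qualify a local element name with the METS namespace."""
    return f"{{{METS}}}{tag}"


def build_mets(objid, file_href, dmd_href, tech_href):
    """Build a minimal METS tree that points at external metadata and data."""
    href = f"{{{XLINK}}}href"
    root = ET.Element(q("mets"), {"OBJID": objid})

    # Descriptive metadata section, referencing an external record.
    dmd = ET.SubElement(root, q("dmdSec"), {"ID": "DMD1"})
    ET.SubElement(dmd, q("mdRef"),
                  {"LOCTYPE": "URL", "MDTYPE": "OTHER", href: dmd_href})

    # Administrative metadata section with one technical metadata entry.
    amd = ET.SubElement(root, q("amdSec"))
    tech = ET.SubElement(amd, q("techMD"), {"ID": "TECH1"})
    ET.SubElement(tech, q("mdRef"),
                  {"LOCTYPE": "URL", "MDTYPE": "OTHER", href: tech_href})

    # File inventory: one file group holding one file.
    file_sec = ET.SubElement(root, q("fileSec"))
    group = ET.SubElement(file_sec, q("fileGrp"))
    entry = ET.SubElement(group, q("file"), {"ID": "FILE1", "ADMID": "TECH1"})
    ET.SubElement(entry, q("FLocat"), {"LOCTYPE": "URL", href: file_href})

    # Structural map tying the file and its description together.
    struct_map = ET.SubElement(root, q("structMap"))
    div = ET.SubElement(struct_map, q("div"), {"DMDID": "DMD1"})
    ET.SubElement(div, q("fptr"), {"FILEID": "FILE1"})
    return root
```

Serializing the result with `ET.tostring(root, encoding="unicode")` gives a small document in which the file, its technical metadata, and its descriptive metadata are linked by ID references.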
15FEDORA Architectural Model
16Shared Image Behavior Definitions
17GRID
18Globus Toolkit
- Different package types
- pgm: Dynamically-linked binary
- pgm_static: Statically-linked binary
- rtl: Run-time library
- data: Data files
- dev: Headers, static libs, libtool library files
- doc: Documentation
19Issues
20Major Issues
- METS vs XPACK
- General Relationships are still a research issue
- Attributes/elements to replace
- class in label
- Replacement of ADID
- MIME type (Self Describing)
- Specific Data Description (ADID)
- Package Types
- EDU
- ADU
- DDU
- PDU (Process Description Unit)
- RDF
- DAML/S
- BDU (Behavior Description Unit)
- FEDORA/METS
- Web Services
- Phased Standards development plan
21Detailed Issues
- Namespace Issues
- Unique identifier issues
- Use of XLink vs URL on reference packages
- JAR/ZIP type formats vs SOAP with Attachments
- Attributes/elements to replace
- class in label
- Replacement of ADID
- MIME type (Self Describing)
- Specific Data Description (ADID)
- Use of multipart MIME and binary MIME types
- Any attribute vs element issues
- Performance
- Depth of Nesting of Packages
- Compression/Decompression