Title: A Distributed Science-Data-Description Registry/Repository: Implementation and Operational Experience
1 A Distributed Science-Data-Description Registry/Repository: Implementation and Operational Experience
- Donald Sawyer/NASA/GSFC
- John Garrett/SPSystems
- November 14, 2007
- GDFR Governance Workshop
2 Overview of Presentation
- Control Authority (CA) Office and context
- CA Concept and Standards
- Current CA Organization and Status
- CA Registration Issues
- Closing Thoughts
- References for CA standards
3 Control Authority Offices
- NASA has operated one or more Control Authority Offices (CAOs) since 1994
  - Register descriptions of scientific data, INCLUDING their formats
  - Assign globally unique identifiers
  - Support long-term data understanding and the reuse of registered data descriptions
  - Operate in conformance with international standards
    - SFDU - Control Authority Procedures, June 1993, CCSDS 630.0-B-1/ISO 13764
    - SFDU - Control Authority Data Structures, November 1994, CCSDS 632.0-B-1/ISO 15395
- However, original expectations for the scope and operation of distributed CAOs have not been fully met
- Let's look at the context
4 What is CCSDS?
- Consultative Committee for Space Data Systems
  - International collaboration of space agencies, starting 1982
  - Develops a variety of science-discipline-independent standards
  - Became the working body for ISO TC 20 / SC 13 about 1990
    - TC 20: Aircraft and Space Vehicles
    - SC 13: Space Data and Information Transfer Systems
- Information Interchange Panel (1982-2003)
  - Facilitate and promote the interchange, preservation, and use of space-related information
  - Develop Recommendations (standards) in areas of location, access, transfer, understanding, usage, and archiving of information, e.g.
    - SFDU - Control Authority Procedures, 1993
    - Reference Model for an Open Archival Information System (OAIS), 2002
  - Reorganized (2003) into Data Archive Ingest Working Group, Information Packaging/Registry Working Group, and recent Repository Audit and Certification BOF working group
5 Information Interchange Problem
- Much space-science observational data may be characterized as data products
  - Long sequences of complex, repetitive data structures
  - Need description of the data structures' format and associated meanings
- Space agency participants in CCSDS recognized the utility of registering descriptions of science data products
  - Data descriptions could be better preserved and improved, and would have unique identifiers
  - Unique identifiers could be more easily associated with the complex data structures, thereby improving data/documentation linkage
- Resulting program of work led to
  - Standard data packaging with associated unique identifiers
  - Operational procedures and data structures for a distributed Control Authority Organization that registers and disseminates data descriptions
6 Control Authority (CA)
7 Registration Requirements
- Distributed, federated Control Authority Offices (CAOs) register and disseminate descriptions
- Data descriptions available separately from data product instances
- Provide access over time to data descriptions
- Ensure knowledge of data description existence
- Standard way to identify location of data descriptions
- Internationally unique identifier (ADID)
- Capability to accept standard service requests
- Capability to modify data descriptions
- Provide notification of updated data descriptions
8 Control Authority Service Concept
- Data Producer sends a data description to a Member Agency Control Authority Office (MACAO)
  - Description sent as a standardized Registration Package (RP) or Revision Registration Package (RRP)
- MACAO converts the Registration Package to a registered Data Description Package (DDP)
- Unique identifier, called an Authority and Description ID (ADID), is returned to the Data Producer
- Data Producer incorporates the ADID within the data product
- Data product is exchanged/disseminated to data users
- Data User contacts the MACAO
  - Provides the ADID
  - Receives the registered Data Description Package
  - Able to understand the data product
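The register-then-look-up flow on this slide can be sketched in a few lines of Python. This is only an illustration, not the CCSDS procedures themselves: the class name `MACAO`, its methods, the sample description text, and the ADID layout (a four-character CAID followed by a four-digit sequence number) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MACAO:
    """Toy Member Agency Control Authority Office (illustrative only)."""
    caid: str                                            # e.g. "NSSD"
    _registry: Dict[str, str] = field(default_factory=dict)
    _counter: int = 0

    def register(self, registration_package: str) -> str:
        """Store a data description and return its newly assigned ADID."""
        self._counter += 1
        # Assumed layout: 4-character CAID + 4-digit sequence number.
        adid = f"{self.caid}{self._counter:04d}"
        self._registry[adid] = registration_package
        return adid

    def disseminate(self, adid: str) -> str:
        """Return the registered description for a given ADID."""
        return self._registry[adid]


# Data Producer registers a description and embeds the ADID in the product.
macao = MACAO("NSSD")
adid = macao.register("record layout: 3 x float32 per sample")
data_product = {"adid": adid, "payload": b"\x00" * 12}

# Data User later presents the ADID and receives the description.
description = macao.disseminate(data_product["adid"])
```

The point of the design is that the data product carries only the short identifier, while the description lives with the registering office and can be retrieved long after the product was created.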
9 Control Authority Organizational Structure (WDC-SI)
10 Control Authority Procedures
http://public.ccsds.org/publications/archive/630x0b1.pdf
2 CONTROL AUTHORITY ORGANIZATION AND RESPONSIBILITIES
3 PROCEDURES FOR USER SERVICES
  3.1 Request Processing
  3.2 User Services Provided by the CA Agent
    3.2.1 CCSDS Data Description Dissemination
    3.2.2 Control Authority Annual Report Dissemination
  3.3 User Services Provided by the MACAO
    3.3.1 Data Description Registration
    3.3.2 Data Description Dissemination
    3.3.3 Data Description Revision
4 PROCEDURES FOR INTERNAL ADMINISTRATION
  4.1 CCSDS Secretariat Internal Administration
    4.1.1 Establishing a MACAO
    4.1.2 Dissolution of a Primary MACAO
    4.1.3 CA Annual Report Publication
  4.2 MACAO Internal Administration
    4.2.1 Establishing a Descendant MACAO
    4.2.2 Dissolution of a Descendant MACAO
    4.2.3 MACAO Annual Report Production
11 CA Agent Primary Responsibilities
- Disseminate CA-associated CCSDS standards
- Ensure uniqueness of MACAO Control Authority IDs (CAIDs)
- Maintain organizational information on established MACAOs
- Publish an annual report that summarizes CA organization and activities
- When a Primary MACAO can no longer fulfill its duties, aid in negotiation to identify one that will assume them
- Assist agencies in establishing Primary MACAOs
- Provide operational guidelines and instructions to Primary MACAOs
12 Example CA Agent Information (World Data Center - Satellite Information)
- National Aeronautics and Space Administration (NASA) - USA
- Preferred Agency ID: N
- Primary MACAO CAID: NSSD
- Descendant MACAO CAID: NJPL
- Descendant MACAO CAID: NURS
- MACAO CAID: NSSD
- MACAO Type: Agency Primary
- Web Page: http://ssdoo.gsfc.nasa.gov/nost/cao-nssd/cao-nssd.html
- MACAO Contact: NASA Primary Control Authority Office at the National Space Science Data Center (NSSDC), NASA/Goddard Space Flight Center, NASA/Science Office of Standards and Technology (NOST), SFDU Support Office - Code 633, Greenbelt, MD 20771 USA
- E-mail: John.G.Garrett_at_nasa.gov
- Telephone: 1 301 286 3575
- FAX: 1 301 286 1771
13 MACAO Primary Responsibilities
- Register new and revised data descriptions
- Disseminate data descriptions upon request
- Ensure availability of data descriptions
- As necessary, establish Descendant MACAOs (the Ascendant MACAO maintains overall responsibility)
- Provide operational guidelines and instructions to Descendant MACAOs
- Inform Ascendant MACAO of intention to cease operation
- Assume the preservation, revision, and dissemination duties of any Descendant MACAOs ceasing operation
- Maintain statistics of requests for the annual report
- (Primary MACAO only) Request a CAID for each Descendant MACAO to be established within its organization
- (Primary MACAO only) Maintain a log of CAIDs assigned to its Descendant MACAOs
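The Ascendant/Descendant relationship, and the rule that an Ascendant MACAO takes over the duties of a Descendant that ceases operation, can be illustrated with a small tree structure. The `Office` class, its attribute names, and the sample ADID below are hypothetical; the actual procedures are defined in the CCSDS Control Authority Procedures document.

```python
class Office:
    """Toy MACAO node in an Ascendant/Descendant hierarchy (illustrative only)."""

    def __init__(self, caid, ascendant=None):
        self.caid = caid
        self.ascendant = ascendant
        self.descendants = []
        self.descriptions = {}        # maps ADID -> data description
        if ascendant is not None:
            ascendant.descendants.append(self)

    def cease_operation(self):
        """Hand every registered description to the Ascendant MACAO."""
        if self.ascendant is None:
            # For a Primary MACAO the CA Agent helps negotiate a successor.
            raise ValueError("a Primary MACAO has no Ascendant to inherit its duties")
        self.ascendant.descriptions.update(self.descriptions)
        self.ascendant.descendants.remove(self)
        self.descriptions = {}


primary = Office("NSSD")
mission_office = Office("NURS", ascendant=primary)
mission_office.descriptions["NURS0001"] = "hypothetical instrument record layout"

mission_office.cease_operation()
```

Note that the transferred description keeps its original ADID, so identifiers already embedded in data products stay valid after the hand-over.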
14 MACAO Data Description Registration Administrative Attributes
- Identifier (Authority and Description ID (ADID))
  - includes Control Authority ID (CAID)
- Revision Number
- Type of Registration Package
  - Registration, Revision, or Dissemination
- Title of Registered Description
- Short Description of Registered Description
- Revision Comment
- Submission Date
- Registration Date
- Revisable
- Releasable
- Originator Identification and Contact Information
- Reviser Identification and Contact Information
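As a rough sketch, the administrative attributes listed on this slide map naturally onto a record type. The field names, types, and sample values below are assumptions chosen for illustration, as is the rule that the CAID is the first four characters of the ADID; the normative encoding is the one defined in the Control Authority Data Structures standard.

```python
from dataclasses import dataclass
from datetime import date

PACKAGE_TYPES = {"Registration", "Revision", "Dissemination"}


@dataclass
class RegistrationAttributes:
    """Administrative attributes of a registered description (illustrative)."""
    adid: str
    revision_number: int
    package_type: str             # one of PACKAGE_TYPES
    title: str
    short_description: str
    submission_date: date
    registration_date: date
    revisable: bool
    releasable: bool
    originator: str               # identification and contact information
    revision_comment: str = ""
    reviser: str = ""             # identification and contact information

    def __post_init__(self):
        if self.package_type not in PACKAGE_TYPES:
            raise ValueError(f"unknown package type: {self.package_type}")

    @property
    def caid(self) -> str:
        # Assumption: the CAID is the first four characters of the ADID.
        return self.adid[:4]


# Hypothetical record; none of these values come from a real registration.
record = RegistrationAttributes(
    adid="NSSD0001",
    revision_number=0,
    package_type="Registration",
    title="Example magnetometer record format",
    short_description="Fixed-length binary records, one per sample",
    submission_date=date(2007, 1, 2),
    registration_date=date(2007, 1, 9),
    revisable=True,
    releasable=True,
    originator="Example Originator, originator@example.org",
)
```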
15 Control Authority Data Structures
http://public.ccsds.org/publications/archive/632x0b1.pdf
- Establishes a standard package for submitting registration information
  - Uses SFDU data packaging standard (CCSDS 620.0-B-2) that encapsulates data/metadata objects, each with a specific ADID and classification
  - Identifies small set of classifications for the contained metadata objects
    - C: data administration service
    - K: catalog information (registration attributes)
    - D: format information (structure)
    - E: data entity dictionary information (semantics)
    - S: supplementary information
  - C class object allows incorporation of other Descriptions by reference to their ADIDs, thus supporting reuse
    - Representation Network is supported through ADIDs
- Establishes a standard package for the dissemination of registered data descriptions
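Because a description can incorporate other descriptions by reference to their ADIDs, fully understanding one data product may require following a chain of references. A minimal sketch of walking such a Representation Network is shown below; the function name, the dictionary representation of references, and the sample ADIDs are all hypothetical.

```python
from typing import Dict, List


def representation_network(root: str, refs: Dict[str, List[str]]) -> List[str]:
    """Return every ADID reachable from `root`, depth-first, each listed once."""
    ordered: List[str] = []
    seen = set()
    stack = [root]
    while stack:
        adid = stack.pop()
        if adid in seen:
            continue                  # shared or cyclic reference: visit once
        seen.add(adid)
        ordered.append(adid)
        # Push in reverse so references are visited in their listed order.
        stack.extend(reversed(refs.get(adid, [])))
    return ordered


# Hypothetical network: NSSD0003 reuses two descriptions, one of which
# itself refers to the other.
refs = {
    "NSSD0003": ["NSSD0001", "NSSD0002"],
    "NSSD0002": ["NSSD0001"],
}
needed = representation_network("NSSD0003", refs)
```

Tracking visited ADIDs keeps the walk finite even if descriptions refer to each other, and it makes the reuse visible: `NSSD0001` is needed by two descriptions but retrieved only once.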
16 Current CAO Organizational Structure and Status
17 Control Authority Offices
- CNES - French Space Agency
  - FCST - CNES/Centre Spatial de Toulouse
- ESA - European Space Agency
  - EESA - European Space Agency (ESA) Primary Control Authority Office
  - ECLU - ESA CLUSTER Mission Control Authority Office
  - EEUR - ESA EURECA Mission Control Authority Office
  - EHUY - ESA Huygens Mission Control Authority Office
  - EMEX - ESA Mars Express Mission Control Authority Office
  - EXMM - ESA XMM Mission Control Authority Office
- NASA
  - NSSD - NASA Primary CAO at the National Space Science Data Center (NSSDC)
  - NJPL - Jet Propulsion Laboratory Control Authority Office
  - NURS - Upper Atmosphere Research Satellite Control Authority Office
18 CAO Activity (1/3)
- No new Control Authority Offices visible to the CA Agent have been established since 2000
- Although these are ISO standards, no organization outside CCSDS has asked to participate
- CAO activities have been largely local
  - No CA Agent report since 1996
- Activity appears confined to ESA and NASA
  - ESA believed to still be using ADID-registered descriptions in their dissemination of spacecraft telemetry data
19 CAO Activity (2/3)
- NASA's descendant MACAOs are no longer active
  - NJPL, at Jet Propulsion Laboratory
    - Planetary Data System created its own media volume directory structure for holding data products and data descriptions
    - Most descriptions were submitted on hardcopy
  - NURS, at Goddard Space Flight Center
    - Active only during the life of the mission
  - NJPL and NURS descriptions have not been sent to NASA's primary MACAO at the National Space Science Data Center (NSSDC)
20 CAO Activity (3/3)
- NSSD, NASA's primary MACAO, continues to register data descriptions
  - NSSDC has adopted OAIS Archival Information Package (AIP) concepts
  - AIP packaging implementation incorporates ADIDs
- NSSD has assigned about 560 ADIDs, starting in 1989
  - About 300 have little description and need conversion from their current forms
  - 99 registered descriptions are visible from the Web
  - 10 registered descriptions have been revised
- Descriptions are submitted and registered for all newly arriving data
- Descriptions are registered for older data undergoing migration into AIPs
21 NSSD Description Characterization
- All expressed in ASCII
  - However, some PDF and TIFF descriptions are planned
- Data structures are both ASCII and binary
- Data structure types include
  - FITS
  - CDF
  - CDF with ISTP conventions
  - PDS Labels
  - Flat File (2-file pair)
  - IDFS (multi-file)
  - GIF
  - Numerous unique data structures
- Wrapping encodings include
  - GZIP
  - ZIP
  - TAR
22 NSSD CAO Registration Issues
- Multiple encodings of a data object are not conveniently supported
  - e.g., a data object's format is described, then the data object is compressed and stored
  - Workaround is to assign a new ADID to each combination of, for example, format and compression, resulting in ADID proliferation
- ADIDs are often assigned with minimal description so as to support operational needs
  - Resources for description submission are not always keeping pace
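The proliferation effect is simple arithmetic: if every combination of format and wrapping encoding needs its own ADID, the count grows as a product rather than a sum. The sketch below uses a few of the format and encoding names listed earlier. The "layered" alternative, in which a format and an encoding are each registered once and then combined by reference, is a hypothetical comparison and not a capability the slides attribute to the current system.

```python
from itertools import product

formats = ["FITS", "CDF", "PDS Labels"]
encodings = ["none", "GZIP", "ZIP", "TAR"]

# Workaround described above: one ADID per (format, encoding) combination.
combined = [f"{fmt}+{enc}" for fmt, enc in product(formats, encodings)]

# Hypothetical alternative: register each format and each real encoding once.
layered = formats + [enc for enc in encodings if enc != "none"]

print(len(combined))  # 12
print(len(layered))   # 6
```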
23 Closing Thoughts
- Long-term commitments are hard to maintain unless they are integral to the organization's business model and technical approach
- NSSDC science data-description-registration needs
  - Ability to associate different semantics with an underlying data structure description
  - Ability to create a Representation Network
- NSSDC, as an archive, would probably want to download externally registered format descriptions to ensure its data objects remain understandable
  - May not be necessary if external registry becomes sufficiently stable and reliable (trusted repository)
24 Sources for CCSDS CA Recommendations
- CCSDS 630.0-B-1, Standard Formatted Data Units -- Control Authority Procedures, http://public.ccsds.org/publications/archive/630x0b1.pdf, June 1993.
  - Also known as ISO 13764:1996
- CCSDS 632.0-B-1, Standard Formatted Data Units -- Control Authority Data Structures, http://public.ccsds.org/publications/archive/632x0b1.pdf, November 1994.
  - Also known as ISO 15395:1998
- CCSDS Report - CCSDS 631.0-G-2, Standard Formatted Data Units -- Control Authority Procedures Tutorial, http://public.ccsds.org/publications/archive/631x0g2.pdf, November 1994.