Title: Comprehensive Large Array-data Stewardship System (CLASS)
1Comprehensive Large Array-data Stewardship
System (CLASS)
CLASS Background, January 2005
2Topics
- CLASS Vision
- CLASS Overview
- Current Status
- Recent Accomplishments
- Technical Overview
- System Functions
- System Overview
3CLASS Vision
4CLASS Goals
- Provide one-stop shopping and access capability for NOAA and NESDIS
  environmental data and products
- Provide a common look and feel for accessing NOAA and NESDIS
  environmental data and products
- Provide an efficient architecture for archiving and distribution of
  NOAA and NESDIS environmental data and products
- Reduce implementation costs by using reengineering, evolutionary effort
- Allow NOAA to fulfill its requirements regarding archive, access, and
  distribution of the seven large array data sets
5CLASS Overview (1)
- CLASS is a web-based data archive and distribution system for NOAA's
  environmental data
  - Archive: ingest, storage, metadata management, and data quality
    assurance
  - Distribution: access, visualization, and data delivery
- CLASS is synonymous with
  - LTA: Long-Term Archive
  - ADS: Archive and Distribution System
  - AAS: Archive and Access System
- CLASS is an extension of a 1995 operational system, SAA (Satellite
  Active Archive)
- Transition to the CLASS architecture began in 2001
- Dual-site operational CLASS began 02 April 2004
6CLASS Overview (2)
- As implementation continues for the next 10 years, CLASS will evolve
  to support
  - additional data streams (campaigns)
  - broader user base
  - new functionality
- CLASS concurrently supports both ongoing
operations and new requirements implementation
7CLASS Overview (3)
8NESDIS Consolidation of CLASS-aligned Projects
- September 2003 -- NESDIS placed CLASS in the Office of Systems
  Development (OSD)
- February 2004 -- NESDIS consolidated CLASS-aligned projects and
  budgets under the CLASS Project Manager
  - CLASS
  - Satellite Active Archive (SAA)
  - GOES Active Archive (GAA)
  - Earth Observing System (EOS) Archive
- NOAA Virtual Data System (NVDS) transferred to NESDIS/NCDC as an
  operational system
9CLASS Project Organization
Users
NESDIS ITAT
NOAA Data Stewardship Committee
CLASS Project: Richard G. Reynolds, Charles S. Bryant
Archive Requirements Working Group (ARWG)
System Engineering Team (SET)
CLASS Project Management Team (CPMT)
CLASS Operations Team (COT)
NGDC Development Teams (Boulder, CO)
OSD/TMC Development Team (Fairmont, WV)
OSD/CSC Development Team (Suitland, MD)
System Integration Test Team (Suitland, MD)
OSDPD-CSC Operations (Suitland, MD)
NCDC-TMC Operations (Asheville, NC)
10CLASS Current Status
- Operational system
- Dual-site in Suitland and Asheville
- Coordinated by the CLASS Operations Team (COT)
- Supports the following campaigns and derived data products (for
  purposes of this presentation, a campaign refers to a data stream from
  a remote sensing or in situ system)
  - POES
  - GOES
  - DMSP
  - Radarsat
- Fully documented and implemented CMM Level 3 software life cycle
  process
- Online library of key system documentation
- Integration and test environment in Suitland
- Development environments in Fairmont and Suitland
- Systems engineering support in Suitland, Fairmont, and Boulder
11Technical Overview
Figure Here
12System Overview
[Figure: system overview diagram. Function boxes: Ingest and Store Data; Interface with Users; Visualize Data; Access Data; Process Orders; Maintain, Monitor, Control. Other labels: Visualization Data; Data Products and Metadata; Data Set Inventory; Data Caches; Archive; Orders; CLASS Internet/Intranet. External parties: Customers; Data Providers; CLASS Operators.]
13System Functions (Ingest and Store Data)
- Ingest satellite data and derived product data
- Create browse image data files for selected data sets; store on-line
- Create netCDF files from selected product data
- Store some files in permanent cache, others in temporary cache
- Archive all files in robotic system
14System Functions (Interface with Users)
- Log in, set up user profile
- Initiate catalog search
- View search results: catalog data, browse images, dataset coverage maps
- Order data
- View order status
- Visualize and download product data netCDF files
15System Functions (Visualize Data)
- Invoked by User Interface
- Interfaces with tools for creating visualization images
  - Ferret for data analysis
  - Browse Image Generator
  - CoastWatch Image Generator for CoastWatch images
- Returns URL of file containing image to User Interface
16System Functions (Access Data)
- Invoked by User Interface and Order Processing modules
- Search catalog for data sets that meet user-specified criteria
  - Data type
  - Time range
  - Geographic coverage
  - Other criteria appropriate for each data type
- Locate files on-line or retrieve files from robotic storage
- After ingest or retrieval, keep files on-line for several days for
  quick access
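
The catalog search criteria listed above (data type, time range, geographic coverage) can be illustrated with a minimal Python sketch. This is not the CLASS implementation: the `CatalogEntry` and `BoundingBox` types, the field names, and the sample entries are all hypothetical, and the geographic test assumes simple latitude/longitude boxes that do not cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BoundingBox:
    """Geographic box in degrees (hypothetical; no antimeridian crossing)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "BoundingBox") -> bool:
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record; field names are illustrative only."""
    name: str
    data_type: str
    start: datetime
    end: datetime
    coverage: BoundingBox


def search_catalog(entries, data_type, start, end, region):
    """Return entries of the given type that overlap the time range and region."""
    return [
        e for e in entries
        if e.data_type == data_type
        and e.start <= end and start <= e.end   # time ranges overlap
        and e.coverage.intersects(region)       # footprints overlap
    ]


# Made-up sample entries for demonstration.
catalog = [
    CatalogEntry("poes_pass_a", "POES",
                 datetime(2005, 1, 1, 0), datetime(2005, 1, 1, 2),
                 BoundingBox(-100, 20, -60, 50)),
    CatalogEntry("poes_pass_b", "POES",
                 datetime(2005, 1, 3, 0), datetime(2005, 1, 3, 2),
                 BoundingBox(-100, 20, -60, 50)),
    CatalogEntry("goes_scan_a", "GOES",
                 datetime(2005, 1, 1, 0), datetime(2005, 1, 1, 1),
                 BoundingBox(-120, 0, -30, 60)),
]

hits = search_catalog(catalog, "POES",
                      datetime(2005, 1, 1), datetime(2005, 1, 2),
                      BoundingBox(-80, 30, -70, 40))
print([e.name for e in hits])
```

Only the first entry matches here: the second falls outside the time range and the third has a different data type. The "other criteria appropriate for each data type" would be further predicates added to the same filter.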
17System Functions (Process Orders)
- Fill orders submitted through User Interface
  - Put ordered files in FTP area for users to pull or put on physical
    media
  - Notify users when files are available
- Provide subscription order service
  - Place orders automatically for newly ingested data sets that meet
    subscription criteria
  - Push files to subscribers or make files available for pull
- Provide bulk order service
  - Create orders for large amounts of data that cannot be ordered
    conveniently through User Interface
  - Retrieve files in time-ordered blocks and place in FTP area for user
    to pull or put on physical media
  - Notify user when each block is available
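
The bulk order steps above (retrieve files in time-ordered blocks, then notify the user as each block becomes available) might look roughly like the following Python sketch. It is illustrative only: the `block_size` parameter, the function names, the file names, and the message wording are assumptions, and the notification is a returned string that stands in for the real e-mail.

```python
from datetime import datetime


def time_ordered_blocks(files, block_size):
    """Sort (timestamp, filename) pairs by time and split them into blocks.

    block_size is a hypothetical tuning parameter: the number of files
    staged to the FTP area (or written to media) at one time.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    ordered = sorted(files)  # tuples sort by timestamp first
    return [ordered[i:i + block_size]
            for i in range(0, len(ordered), block_size)]


def notify(user, block_number, block):
    """Stand-in for the notification step; returns the message text."""
    names = ", ".join(name for _, name in block)
    return f"To {user}: block {block_number} is available ({names})"


# Made-up request, deliberately out of time order.
requested = [
    (datetime(2004, 6, 3), "file_c"),
    (datetime(2004, 6, 1), "file_a"),
    (datetime(2004, 6, 2), "file_b"),
]

blocks = time_ordered_blocks(requested, block_size=2)
messages = [notify("user@example.org", n, b)
            for n, b in enumerate(blocks, start=1)]
for m in messages:
    print(m)
```

Splitting by a fixed file count is only one possible policy; blocks could equally be bounded by total bytes or by a time window.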
18System Functions (Maintain, Monitor, Control)
- Maintenance
  - Automatically create new log files each day
  - Automatically create new FTP area directories as needed
  - Automatically clean up temporary caches, work directories, FTP area,
    database tables
- Monitoring and Control
  - Record all activity and error messages in standard log files
  - Send error and warning messages from log files to operators via
    e-mail
  - Provide tools and operator interface for monitoring and controlling
    functions
    - View order status; cancel, restart, prioritize orders
    - View ingest statistics; initiate re-ingest
    - View ingest and order activities; stop and restart activities
    - Modify allocation of processes to platforms
    - Modify processing parameters
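
As an illustration of the automatic cleanup of temporary caches listed under Maintenance, here is a minimal Python sketch of an age-based sweep. It is an assumption-laden example, not the CLASS Cache Cleanup tool: the three-day retention period, the function name, and the file names are hypothetical, and the demonstration runs in a throwaway directory.

```python
import os
import tempfile
import time


def clean_cache(cache_dir, max_age_seconds, now=None):
    """Delete regular files in cache_dir modified more than max_age_seconds ago.

    Returns the sorted names of the files removed. Subdirectories are left
    alone. The retention period is a hypothetical parameter.
    """
    now = time.time() if now is None else now
    removed = []
    # Snapshot the listing first so deletion does not disturb iteration.
    for entry in list(os.scandir(cache_dir)):
        if entry.is_file() and now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)


# Demonstration in a temporary directory, so nothing real is touched.
with tempfile.TemporaryDirectory() as cache:
    old_path = os.path.join(cache, "old_granule.dat")
    new_path = os.path.join(cache, "new_granule.dat")
    for path in (old_path, new_path):
        with open(path, "w") as f:
            f.write("placeholder")
    ten_days_ago = time.time() - 10 * 86400
    os.utime(old_path, (ten_days_ago, ten_days_ago))  # backdate one file

    removed = clean_cache(cache, max_age_seconds=3 * 86400)
    remaining = sorted(os.listdir(cache))

print(removed, remaining)
```

A sweep like this could be run periodically (for example from cron) or on demand by an operator; a production tool would also need to skip files that are part of an open order.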
19System Overview (Subsystem Design - 1)
- Functions are performed by several subsystems that fall into 4 design
  categories
- Data Storage and Distribution Subsystems: Ingest, Subscription and
  Bulk Order Generation, Data Recall, Delivery
  - Each subsystem consists of several independent processes
  - Data are transferred between processes via the database
  - A workflow engine, the Activity Controller, starts processes as
    needed, routes work items (data sets, orders, recall requests) to
    processes in predefined sequences, and monitors processes and work
    item progress
  - Object-oriented design implemented in C
- Servers: Inventory, Visualization, Order
  - Each server is a resident process designed for quick response
  - Requests and responses are transmitted via socket messages in XML
    format
  - Object-oriented design implemented in C
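
To make the "socket messages in XML format" exchange between a client and a resident server more concrete, here is a small Python sketch. The actual CLASS message schema is not given in these slides, so the element names (`request`, `dataType`, `start`, `end`, `response`), the newline message terminator, and the use of a local socket pair in place of a network connection are all assumptions made for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def build_request(data_type, start, end):
    """Serialize a search request; the element names are hypothetical."""
    root = ET.Element("request", {"service": "inventory"})
    ET.SubElement(root, "dataType").text = data_type
    ET.SubElement(root, "start").text = start
    ET.SubElement(root, "end").text = end
    return ET.tostring(root) + b"\n"  # newline marks the end of a message


def handle_request(raw):
    """Parse a request and build a response echoing the requested data type."""
    req = ET.fromstring(raw)
    resp = ET.Element("response", {"status": "ok"})
    ET.SubElement(resp, "dataType").text = req.findtext("dataType")
    return ET.tostring(resp) + b"\n"


def read_message(sock):
    """Read bytes from the socket until the newline terminator arrives."""
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:
            break
        buf += chunk
    return buf


# A connected socket pair stands in for the client/server connection.
client, server = socket.socketpair()
try:
    client.sendall(build_request("POES", "2005-01-01", "2005-01-02"))
    server.sendall(handle_request(read_message(server)))
    reply = ET.fromstring(read_message(client))
finally:
    client.close()
    server.close()

print(reply.get("status"), reply.findtext("dataType"))
```

A resident server would keep such a connection handler running continuously, which avoids process start-up cost on each request and fits the "designed for quick response" goal.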
20System Overview (Subsystem Design - 2)
- Functions are performed by several subsystems that fall into 4 design
  categories (continued)
- User Interface
  - Java/XML-based web interface uses off-the-shelf components
    - Apache Web Server
    - Cocoon: publishing framework
    - Avalon/Excalibur: database connection pooling
    - Tomcat: servlet engine
    - LogKit: message logging
    - Informix Java Database Connectivity (JDBC) Driver
  - Java servlets perform special functions
  - Communication with Servers via socket messages in XML format
- Monitoring and Maintenance Tools: Log Monitor, Cache Cleanup, Work
  Space Cleanup, Independent Monitoring, Operator Interface
  - Stand-alone tools
  - Various invocation modes: resident (Log Monitor); run periodically
    by cron (Independent Monitoring); run as needed by operators or
    Background Subsystems (Cache Cleanup, Work Space Cleanup)
  - Various implementations: C (Cache Cleanup), Perl (Log Monitor, Work
    Space Cleanup, Independent Monitoring, Operator Interface)
21System Overview (Distributed Redundant Archive)
[Figure: distributed redundant archive diagram. Suppliers feed two sites, Asheville and Suitland; each site is shown with an ingest process, archive interchange, operational inventory, archiver, operational data store, and robotic storage.]
22System Overview (Ingest and Store Data)
23System Overview (Interface with Users, Visualize
Data, Access Data, Order Process)
24System Overview (Maintenance, Monitoring, Control)