Title: Inside the Data Extractor
1Inside the Data Extractor
- Ag Stephens, 27 July 2004
2Motivation
- The BADC needs a generic tool for allowing users
to extract, convert, subset and plot datasets.
- Live Access Server let us down.
- Many scripts in place built on CDAT, Convsh/Xconv
and existing Fortran, but NO interface.
- LAS did provide inspiration and a prototype
interface to steal from, i.e. let the user step
through selecting dataset(s), parameter(s),
spatiotemporal subset and plot or write to a data
file.
3Design Principles
- Could develop into NDG tool and CDAT CGI tool.
Therefore needs to be modular and object oriented
to provide as much code portability as possible.
- Written all in Python with (initially) a CDAT
core to interact with data and plots. Note: a
bit of HTML and Javascript was also inevitable,
and a bit of stolen Java... is this starting to
sound like LAS mark 2?
4An Overview of Version 0.1
- The first version of the Data Extractor to go
live at the BADC is version 0.1, which consists
of:
- dxui.py (CGI script: Data Extractor User
Interface).
- dx (the Data Extractor package: a group of
modules).
- splatui.py (CGI script: Spatial Plotting and
Animation Tool User Interface).
- geosplat (GeoSpatial Plotting and Animation Tool
package: a group of modules).
- some HTML headers, footers, style sheets,
Javascript functions within HTML and an external
Java Map Applet (optional).
5The Process Chain
dxui.py imports the dx package and interacts with
user allowing them access to authorised datasets.
A NetCDF file is produced as output from user
selections.
User downloads NetCDF.
splatui.py is called with the location of the
NetCDF file as its argument. It imports the
geosplat package.
User selects plot or animation options.
User downloads graphics file.
6Modules and Classes 1 dxui.py
- dxui.py
- contains only the main() function that controls
the flow of dx by calling relevant class
instances and functions from the dx package.
- takes in a number of arguments (currently via CGI
but could be command-line or Web Service) and
reacts accordingly.
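
The idea of one main() fed by different front ends can be sketched as below. This is only an illustration, not the dxui.py code: the function names, the argument keys (dataset, variable, action) and the step names are all assumptions made for the example. Both front ends reduce their input to the same plain dictionary, so main() never needs to know where the arguments came from.

```python
from urllib.parse import parse_qs


def args_from_query_string(query):
    """CGI-style front end: flatten a query string into a dict (last value wins)."""
    parsed = parse_qs(query, keep_blank_values=True)
    return {key: values[-1] for key, values in parsed.items()}


def args_from_argv(argv):
    """Command-line front end: parse 'key=value' tokens into the same dict shape."""
    args = {}
    for token in argv:
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that are not key=value pairs
            args[key] = value
    return args


def main(args):
    """Hypothetical controller: pick the next step from what has been chosen so far."""
    if "dataset" not in args:
        return "choose_dataset"
    if "variable" not in args:
        return "choose_variable"
    if "action" not in args:
        return "choose_domain"
    return args["action"]
```

A command-line wrapper would call main(args_from_argv(sys.argv[1:])), while a CGI wrapper would pass in the parsed query string; a Web Service front end would only need a third adapter of the same shape.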
7Modules and Classes 2 Security
- dx and geosplat use a Python version of the
standard BADC (cookie-based) security system,
which was originally written in Perl. The
Security class is a straight wrapper for the
Weblogon BADC class. Security checks are made in
the following places:
- main(): each time it is called, the status is
checked to see that the user is logged in and
which groups they are allowed to view. This only
deals with datasets at the XML level.
- products.py (Product and DataFile classes): when
a user grabs a load of actual data files, another
check is done to ensure the user is allowed to
read them.
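
The two checkpoints can be sketched roughly as follows. Nothing here is the real Security or Weblogon interface: the class body, the method names, the helper functions and the rule that a dataset with no protection group is open to any logged-in user are assumptions made purely for illustration.

```python
class Security:
    """Hypothetical stand-in for the cookie-based login wrapper."""

    def __init__(self, username, groups):
        self.username = username      # None means "not logged in"
        self.groups = set(groups)

    def is_logged_in(self):
        return self.username is not None

    def may_view(self, group):
        # Assumption: items with no protection group need no membership.
        return group is None or group in self.groups


def visible_datasets(security, datasets):
    """First check (in main): filter the dataset list offered to the user."""
    if not security.is_logged_in():
        return []
    return [name for name, group in datasets.items()
            if security.may_view(group)]


def check_files(security, file_groups):
    """Second check (at extraction time): every data file must be readable."""
    denied = [path for path, group in file_groups.items()
              if not security.may_view(group)]
    if denied:
        raise PermissionError("not authorised to read: " + ", ".join(sorted(denied)))
    return sorted(file_groups)
```

The point of repeating the check at file level is that the first filter only governs what is listed; the second guards the actual read even if a request names files directly.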
8Modules and Classes 3 Request
9Modules and Classes 4 Datasetdb
- Datasetdb class to interrogate XML (A Metadata)
to populate options presented by dx. Main
methods:
- getSubsetURIList
- getDatasetFromURI
- getProtectID
- getDatasets
- getSubsets
- getVariables
- getDomain
- getVerticalSpatialDomain
- getTemporalDomain
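
A minimal sketch of how such a class might answer these queries is given below. The method names come from the list above, but their signatures, the return types and above all the XML layout are invented for the example; the real metadata schema is not shown in these slides. Only the standard library ElementTree parser is used.

```python
import xml.etree.ElementTree as ET

# Made-up metadata, purely for illustration (not the real schema).
SAMPLE_XML = """
<datasets>
  <dataset id="demo_a" protectID="group_a">
    <subset uri="demo_a/surface">
      <variable name="temp"/>
      <variable name="pressure"/>
      <temporal start="2000-01-01" end="2000-12-31"/>
    </subset>
  </dataset>
  <dataset id="demo_b">
    <subset uri="demo_b/all">
      <variable name="sst"/>
      <temporal start="2001-01-01" end="2001-06-30"/>
    </subset>
  </dataset>
</datasets>
"""


class Datasetdb:
    """Answers 'what can the user choose next?' from the XML metadata."""

    def __init__(self, xml_text):
        self.root = ET.fromstring(xml_text)

    def getDatasets(self):
        return [d.get("id") for d in self.root.findall("dataset")]

    def getProtectID(self, dataset_id):
        # Returns None when the dataset carries no protection group.
        return self._dataset(dataset_id).get("protectID")

    def getSubsets(self, dataset_id):
        return [s.get("uri") for s in self._dataset(dataset_id).findall("subset")]

    def getVariables(self, subset_uri):
        return [v.get("name") for v in self._subset(subset_uri).findall("variable")]

    def getTemporalDomain(self, subset_uri):
        t = self._subset(subset_uri).find("temporal")
        return (t.get("start"), t.get("end"))

    def _dataset(self, dataset_id):
        for d in self.root.findall("dataset"):
            if d.get("id") == dataset_id:
                return d
        raise KeyError(dataset_id)

    def _subset(self, subset_uri):
        for s in self.root.iter("subset"):
            if s.get("uri") == subset_uri:
                return s
        raise KeyError(subset_uri)
```

Each method maps onto one step of the user interface: list datasets, then subsets, then variables, then the domain limits that bound the selection form.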
10Modules and Classes 5 UserInterface
11Modules and Classes 6 DataFile
- Product: class that basically does all the
talking to cdms objects in order to extract a
data object in Python.
- DataFile: sub-class of Product, writes output to
a NetCDF file.
- GribFile: in the making, but not cdms.
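
The Product/DataFile relationship can be illustrated with a small stand-in. To keep the example runnable with the standard library alone, a plain dictionary takes the place of an opened cdms file and JSON takes the place of NetCDF output; the constructor arguments and the extract() and write() method names are likewise assumptions, not the real products.py interface.

```python
import json


class Product:
    """Extracts a data object for one variable over a latitude range."""

    def __init__(self, source, variable, lat_range):
        self.source = source          # stand-in for an opened cdms file
        self.variable = variable
        self.lat_range = lat_range    # (south, north), inclusive

    def extract(self):
        south, north = self.lat_range
        lats = self.source["latitude"]
        values = self.source[self.variable]
        keep = [i for i, lat in enumerate(lats) if south <= lat <= north]
        return {"latitude": [lats[i] for i in keep],
                self.variable: [values[i] for i in keep]}


class DataFile(Product):
    """A Product that also writes what it extracted to an output file."""

    def write(self, path):
        with open(path, "w") as out:
            json.dump(self.extract(), out)   # stand-in for the NetCDF write
        return path
```

Keeping extraction in the base class means another output format only needs a new subclass with its own write().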
12Modules and Classes 7 others
13Modules and Classes 8 splatui.py
- splatui.py
- contains only the main() function that controls
the flow of geosplat by calling relevant class
instances and functions from the geosplat
package.
- takes in a number of arguments (currently via CGI
but could be command-line or Web Service) and
reacts accordingly.
14Modules and Classes 9 in GeoSPlAT
16Other issues