Title: NERC DataGrid Status: ESP June 2004
1. NERC DataGrid Status: ESP June 2004
Bryan Lawrence on behalf of the NDG, BADC and
BODC. Ray Cramer, Marta Gutierrez, Kerstin
Kleese, Siva Kondapalli, Sue Latham, Roy Lowry,
Kevin O'Neill, Ag Stephens, Andrew Woolf
British Atmospheric Data Centre, http://badc.nerc.ac.uk
2. Outline
- NDG Aims and Metadata Taxonomy
- (Review?)
- Demonstration of NDG in action
- (no grid services yet, but shape of things to come should be clear)
- Stub-B
- New Tool: DataExtractor
- Status
- Issues with metadata
- Chemistry data at BADC
- Numerical Simulation Discovery
- back to Status
3. Complexity + Volume + Remote Access = Grid Challenge
British Atmospheric Data Centre
British Oceanographic Data Centre
http://ndg.nerc.ac.uk
4. NDG Metadata Taxonomy
6. NDG Metadata Architecture
- Service based model
- clear separation between discovery and use
- discovery service standards compliant and interoperable
7. (D) - Discovery
- Open Archives Initiative (OAI): digital library protocol for harvesting metadata.
- NDG supports multiple discovery services: build your own.
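The harvesting idea can be illustrated with a minimal sketch of an OAI-PMH `ListRecords` request and a Dublin Core title extraction. The verb and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself; the endpoint `http://example.org/oai` and the sample response are placeholders, not an NDG service.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def harvested_titles(response_xml):
    """Return every dc:title found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [t.text for t in root.iter("{%s}title" % DC_NS)]


# Hand-written sample response (assumption: not real harvested data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai"))
    print(harvested_titles(SAMPLE_RESPONSE))
```

A discovery service built this way only needs the harvested discovery records, which is consistent with the separation between discovery and use described in the architecture slide.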
8. NDG Structure
20. Role of B metadata: domain ontology
- B metadata is a store of metadata intended to:
  - Allow the production of the various industry standard discovery formats (DIF, DC, FGDC/GEO, ISO 19115)
  - Provide a more complete metadata store than that demanded by the usual discovery formats, leveraging the metadata holdings of the data centres
  - Allow a smooth link across to the data browse and use elements of the NDG
- Expected to expand in importance as we can add more semantic detail to the schema
21. B metadata: a simplified view
22. How is the B metadata implemented?
- Core linking concept is the deployment
  - of a Data Production Tool
  - at an Observation Station
  - on behalf of an Activity
  - that produces a Data Entity
- The Deployment links the Activity, DataProductionTool, ObservationStation and Data Entity records into a structure that can be turned into navigable XML using XQuery or XSLT, with any of the record types as the root element.
- Each of the main metadata objects has security data attached to it, so security constraints can be applied to queries on the metadata.
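The deployment concept can be sketched as a small data model. This is an illustration only, not the NDG schema: the class and field names below are assumptions chosen to mirror the record types named on the slide, and security data is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    id: str
    name: str


@dataclass(frozen=True)
class DataProductionTool:
    id: str
    name: str


@dataclass(frozen=True)
class ObservationStation:
    id: str
    name: str


@dataclass(frozen=True)
class DataEntity:
    id: str
    name: str


@dataclass(frozen=True)
class Deployment:
    """Links one record of each type: a tool, at a station, for an activity, producing data."""
    activity: Activity
    tool: DataProductionTool
    station: ObservationStation
    data_entity: DataEntity


def deployments_involving(deployments, record):
    """Return every deployment that references the given record."""
    return [d for d in deployments
            if record in (d.activity, d.tool, d.station, d.data_entity)]


def view_from(record, deployments):
    """Build a nested view with any record type as the root."""
    return {
        "root": record.name,
        "deployments": [
            {"activity": d.activity.name, "tool": d.tool.name,
             "station": d.station.name, "data_entity": d.data_entity.name}
            for d in deployments_involving(deployments, record)
        ],
    }


# Hypothetical records, for illustration only.
campaign = Activity("act1", "Example campaign")
sonde = DataProductionTool("dpt1", "Example radiosonde")
lidar = DataProductionTool("dpt2", "Example lidar")
site = ObservationStation("obs1", "Example station")
profiles = DataEntity("de1", "Example sonde profiles")
backscatter = DataEntity("de2", "Example lidar backscatter")
DEPLOYMENTS = [Deployment(campaign, sonde, site, profiles),
               Deployment(campaign, lidar, site, backscatter)]
```

Because each deployment references all four record types, the same list can be navigated from a station, an activity, a tool or a data entity, which is the property the slide attributes to the XQuery/XSLT views.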
23. Stub B: what is it?
- B metadata works well in databases, but what about
  - presentation
  - standalone generation of D
  - storing metadata locally as files
- Given that a raw B record for a Data Entity contains just
  - the basic data entity details
  - a series of references to related records
  - no details such as
    - activity name,
    - instrument name,
    - station
- stub B is the base entity expanded through its own related deployments and internal references
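The expansion from a raw B record to a stub B record can be sketched as reference resolution against a record store. The dictionary layout and key names (`deployment_refs`, `activity_ref`, `tool_ref`, `station_ref`) are hypothetical and stand in for whatever the real B schema uses.

```python
# Hypothetical in-memory record store keyed by record id.
STORE = {
    "de1": {"name": "Example data entity", "deployment_refs": ["dep1"]},
    "dep1": {"activity_ref": "act1", "tool_ref": "dpt1", "station_ref": "obs1"},
    "act1": {"name": "Example campaign"},
    "dpt1": {"name": "Example instrument"},
    "obs1": {"name": "Example station"},
}


def expand_stub(entity_id, store):
    """Expand a raw data entity record by resolving its deployment references to names."""
    entity = store[entity_id]
    stub = {"id": entity_id, "name": entity["name"], "deployments": []}
    for dep_id in entity["deployment_refs"]:
        dep = store[dep_id]
        stub["deployments"].append({
            "activity": store[dep["activity_ref"]]["name"],
            "instrument": store[dep["tool_ref"]]["name"],
            "station": store[dep["station_ref"]]["name"],
        })
    return stub


if __name__ == "__main__":
    print(expand_stub("de1", STORE))
```

The result is self-contained: it carries the activity, instrument and station names that the raw record only points to, so it can be presented or stored as a file without access to the database.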
24. Role of Stub B
- Makes application developers' lives easier, especially in the presentation of search results
- Allows off-line storage of metadata by users
- Basis of D production via XSLT
- Hook into main B repositories
- Potential discovery format (while there are lots around already, this could allow more discipline-dependent discovery)
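As a rough illustration of producing a D record from stub B, the sketch below writes a few Dublin Core elements from an expanded stub. It uses Python's ElementTree in place of the XSLT named on the slide, and the mapping (entity name to `dc:title`, activity name to `dc:subject`) and the stub layout are assumptions, not the NDG transform.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical expanded stub B record.
EXAMPLE_STUB = {
    "id": "de1",
    "name": "Example data entity",
    "deployments": [
        {"activity": "Example campaign", "instrument": "Example instrument",
         "station": "Example station"},
    ],
}


def stub_to_dc(stub):
    """Serialise a stub record as a minimal Dublin Core fragment (assumed mapping)."""
    root = ET.Element("metadata")
    ET.SubElement(root, "{%s}title" % DC_NS).text = stub["name"]
    ET.SubElement(root, "{%s}identifier" % DC_NS).text = stub["id"]
    for dep in stub["deployments"]:
        ET.SubElement(root, "{%s}subject" % DC_NS).text = dep["activity"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(stub_to_dc(EXAMPLE_STUB))
```

The same stub could feed other discovery formats by swapping the mapping, which is why a single expanded record is a convenient basis for D production.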
25. Discovery Metadata Usage
32. Where are we?
- Major effort on defining feature types for observation types so we can build an OGC/ISO compatible data extractor for observations and numerical data.
  - Main thrust for Andrew Woolf and 0.5 New FTE
  - Ag Stephens contributing when time available
- Security Infrastructure Development
  - Collaboration with CCLRC e-science, ECOGrid and 0.5 FTE
- Ongoing work on metadata definition and population
  - Oceanographic data
    - Siva Kondapalli
  - Chemistry data
    - Main thrust for Sue Latham
  - Numerical Modelling data
    - DIF numerical definition (moving to ISO), BADC and UK Community
    - Katherine Bouton's work at NCAS/CGAM
  - Remote Sensing Data
    - Collaboration with NEODC and PML
- Ongoing work on databases and interfaces, DIF to ISO and B
  - Kevin O'Neill and Marta Gutierrez
33. Authorisation
- Role-based access

    <dataset>
      <host> badc.nerc.ac.uk </host>
      <name> ukmo-obs </name>
      <access-requires> researcher </access-requires>
      <access-requires> ukmo-obs </access-requires>
      <processing-requires> nerc </processing-requires>
    </dataset>

  (Signed conditions of use form exists for this dataset)
- Key concept: only hosts that trust each other share data, even within a larger virtual organisation, e.g. at BADC:

    <trusted>
      <bodc>
        <host>ndg.bodc.nerc.ac.uk</host>
        <attribute remotename="nerc"> nerc </attribute>
        <attribute remotename="ashoe"> ashoe </attribute>
        <attribute remotename="staff"> nerc </attribute>
        <other> bodc </other>
      </bodc>
    </trusted>
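The two configuration fragments suggest a two-step check: map the roles asserted by a trusted remote host onto local roles, then compare them with what the dataset requires. The sketch below assumes that every `access-requires` role must be held and that an unlisted host contributes no roles; neither rule is stated on the slide, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

DATASET_XML = """<dataset>
  <host>badc.nerc.ac.uk</host>
  <name>ukmo-obs</name>
  <access-requires>researcher</access-requires>
  <access-requires>ukmo-obs</access-requires>
  <processing-requires>nerc</processing-requires>
</dataset>"""

TRUSTED_XML = """<trusted>
  <bodc>
    <host>ndg.bodc.nerc.ac.uk</host>
    <attribute remotename="nerc">nerc</attribute>
    <attribute remotename="ashoe">ashoe</attribute>
    <attribute remotename="staff">nerc</attribute>
    <other>bodc</other>
  </bodc>
</trusted>"""


def map_remote_roles(trusted_xml, remote_host, remote_roles):
    """Translate roles asserted by a remote host into local roles.

    Assumption: a host that is not listed as trusted yields no roles.
    """
    root = ET.fromstring(trusted_xml)
    for partner in root:
        if partner.findtext("host", "").strip() == remote_host:
            mapping = {a.get("remotename"): a.text.strip()
                       for a in partner.findall("attribute")}
            return {mapping[r] for r in remote_roles if r in mapping}
    return set()


def may_access(dataset_xml, local_roles):
    """Assumption: every <access-requires> role must be present."""
    root = ET.fromstring(dataset_xml)
    required = {e.text.strip() for e in root.findall("access-requires")}
    return required <= set(local_roles)


if __name__ == "__main__":
    roles = map_remote_roles(TRUSTED_XML, "ndg.bodc.nerc.ac.uk", ["staff"])
    print(roles, may_access(DATASET_XML, roles))
```

Under these assumptions a BODC user with the remote role `staff` is mapped to the local role `nerc`, which on its own does not satisfy the `researcher` and `ukmo-obs` requirements of the dataset.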
34. NDG Security
- Certificate based: pass encrypted credentials between user and gatekeeper.
35. Extending the CF convention for chemistry

grep -i sulphate vars2.csv
"Allen, Andrew and Grenfell, Lee ", Sulphate / coarse (ug/m3)
"Allen, Andrew and Grenfell, Lee ", Sulphate / fine (ug/m3)
"Bradbury, Carl ", SULPHATE LOADING (ug/m3)
"James, Jonathan And Allen, Andrew ", Sulphate / coarse (ug/m3)
"James, Jonathan And Allen, Andrew ", Sulphate / fine (ug/m3)
"James, Jonathan And Allen, Andrew ", Sulphate / fine+coarse (ug/m3)
"James, Jonathan And Allen, Andrew ", Sulphate / fine+coarse (ug/m3)
"McArdle, Nicola and Thompson, Adrian ", sulphate (nmol m-3)
"McArdle, Nicola and Thompson, Adrian ", sulphate (µM)
"McArdle, Nicola and Thompson, Adrian ", sulphate <1.1 µm diameter (nmol m-3)
"McArdle, Nicola and Thompson, Adrian ", sulphate <1.2 µm diameter (nmol m-3)
"McArdle, Nicola and Thompson, Adrian ", sulphate <1µm diameter (nmol m-3)
"McArdle, Nicola and Thompson, Adrian ", sulphate > 1µm diameter (nmol-3)
"McArdle, Nicola and Thompson, Adrian ", sulphate >1.1 µm diameter (nmol m-3)
"McArdle, Nicola and Thompson, Adrian ", sulphate >1.2 µm diameter (nmol m-3)
"McArdle, Nicola and Thompson, Adrian ", sulphate bulk (nmol m-3)
"McArdle, Nicola and Thompson, Adrian ", sulphate bulk (nmol m-3)
"McFadyen, Gordon ", Sulphate
"Robertson, Leonie and Davison, Brian ", Coarse sulphate concentration (ug m-3)
"Robertson, Leonie and Davison, Brian ", Fine sulphate concentration (ug m-3)

- Currently 35,000 Ames format files, mostly Atmospheric Chemistry
- Real problems with vocabulary, and units
- Spinning up a new project
- need community help!

grep -i butane vars2.csv
i-Butane (ppt)
iso-Butane (ppt)
n-Butane (ppt)
iso-Butane pptv
ISO-BUTANE (pptv)
i-,n-butane

ACSOE: just one of many datasets with this problem
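One way to see the scale of the vocabulary problem is to group the free-text labels under a controlled name. The sketch below uses the butane labels from the grep output; the synonym table and the canonical names are hypothetical and are not a proposed CF extension. Units are only separated from the name, not converted.

```python
import re

LABELS = ["i-Butane (ppt)", "iso-Butane (ppt)", "n-Butane (ppt)",
          "iso-Butane pptv", "ISO-BUTANE (pptv)", "i-,n-butane"]

# Hypothetical synonym table mapping lower-cased names to a canonical term.
SYNONYMS = {"i-butane": "isobutane", "iso-butane": "isobutane",
            "n-butane": "n-butane"}
KNOWN_UNITS = {"ppt", "pptv"}


def split_label(label):
    """Split a label into (name, unit); unit is None when none is recognised."""
    m = re.match(r"^(.*?)\s*\(([^)]*)\)\s*$", label)
    if m:
        return m.group(1), m.group(2)
    parts = label.rsplit(None, 1)
    if len(parts) == 2 and parts[1].lower() in KNOWN_UNITS:
        return parts[0], parts[1]
    return label, None


def group_by_canonical(labels):
    """Group labels by canonical name; unknown names go under 'UNMAPPED'."""
    groups = {}
    for label in labels:
        name, unit = split_label(label)
        canon = SYNONYMS.get(name.lower(), "UNMAPPED")
        groups.setdefault(canon, []).append((label, unit))
    return groups


if __name__ == "__main__":
    for canon, members in group_by_canonical(LABELS).items():
        print(canon, members)
```

Labels such as `i-,n-butane` fall through to `UNMAPPED`, which is the kind of case that needs a community decision rather than a string rule.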
36. DRAFT DIF Component (1)
Key New Groups:
- Numerical Model
  - ID Information
  - Numerical Model Components (from): Atmosphere, Ocean-Dynamic, Ocean-Thermodynamic, Cryosphere, Land-Surface, with possible appends: Chemistry, 4D-VAR, 3D-VAR, QG
  - details for each
- Numerical Simulation
  - ID Information
  - Initial Condition Information details
  - Forcing Information details
37. DRAFT DIF Component (2)
required, repeatable
Group: Numerical_Model
  Model_Name:
  Model_Version:
  Model_Calendar: model calendar valid - eg CF calendar or ISO
  Group: Model_Component
    Model_Component_type: Model Component Valid
    Model_Component_Resolution:
    Group: Model_Component_VerticalDomain
      VerticalDomain_Top:
      VerticalDomain_Bottom:
    End_Group
    Model_Component_Timestep:
    Group: Model_Component_Summary
      Multiple text lines allowed
    End_Group
  End_Group
  URL:
End_Group
38. Draft DIF Components (3)
required, repeatable
Group: Numerical_Simulation
  Numerical_Simulation_Name:
  Numerical_Simulation_ID: recorded using PURI
  Group: run_period
    start_date: yyyy-mm-dd-hh
    end_date: yyyy-mm-dd-hh
    real_date: yes,no
  End_Group
  Group: Initial_Condition
    Ensemble: Numeric Value
    Summary:
  End_Group
  Group: Forcing
    Ensemble: Numeric Value
    Ensemble Parent: uri or 0
    Summary:
  End_Group
End_Group
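The nested Group/End_Group layout of the draft components can be read with a small stack-based parser. This is a sketch only: it assumes the `Group: name` / `End_Group` / `Field: value` line syntax, treats every group as repeatable, and the field values in the sample are invented for illustration.

```python
def parse_dif_groups(text):
    """Parse nested Group/End_Group blocks into dictionaries.

    Each group name maps to a list of dicts, since groups may repeat.
    """
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("Group:"):
            name = line[len("Group:"):].strip()
            child = {}
            stack[-1].setdefault(name, []).append(child)
            stack.append(child)
        elif line == "End_Group":
            if len(stack) == 1:
                raise ValueError("End_Group without matching Group")
            stack.pop()
        else:
            key, _, value = line.partition(":")
            stack[-1][key.strip()] = value.strip()
    if len(stack) != 1:
        raise ValueError("unclosed Group")
    return root


# Illustrative record; all values are made up.
SAMPLE = """
Group: Numerical_Simulation
  Numerical_Simulation_Name: example-run
  Group: run_period
    start_date: 1990-01-01-00
    end_date: 1999-12-31-00
    real_date: yes
  End_Group
  Group: Forcing
    Ensemble: 1
    Ensemble Parent: 0
  End_Group
End_Group
"""

if __name__ == "__main__":
    print(parse_dif_groups(SAMPLE))
```

Checking that every `Group:` has a matching `End_Group` is a cheap validation step before any attempt to map the record to ISO.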
39. Where are we?
40. (B) Metadata Model
41. (B) Metadata Model Overview
GIS/ISO Feature Types
42. (A) NDG Semantic Data Model