Title: HDF-EOS 2/5 to netCDF Converter
1HDF-EOS 2/5 to netCDF Converter
- Bob Bane, Richard Ullman, Jingli Yang
- Data Usability Group
- NASA/Goddard Space Flight Center
2Introduction
- Status report
- Properties of netCDF and HDF-EOS
- Conversion strategy
3Status Report
- Last year - hdfeos52netcdf
- HDF-EOS 5 -> netCDF
- COARDS compatible
- Current
- Uses he25 interoperability library, so does both HDF-EOS 2 and 5
- CF compatible
4Data Formats and Conventions
- Generic data containers
- HDF, netCDF
- Conventions for domain-specific metadata
- HDF-EOS, COARDS/CF
- HDF -> HDF-EOS
- netCDF -> COARDS/CF
5netCDF
- netCDF files contain
- Variables
- multi-dimensional arrays of basic data types (character/integer/float)
- Dimensions
- named sizes for dimensions of variables
- Attributes
- named one-dimensional arrays
- properties of variables
6netCDF Conventions
- Metadata is stored in attributes
- Conventions for names and units
- Coordinate vector
- Variable with the same name as a dimension
- Value is a vector of same size as the dimension
- Maps (0,1,2,...) dimension indices to physical quantities along that dimension
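The coordinate-vector convention can be illustrated with a minimal sketch. The dictionaries below are a hypothetical in-memory stand-in for a netCDF file, not a real netCDF API, and the names `is_coordinate_variable` and `physical_value` are assumptions made for illustration only.

```python
# Hypothetical in-memory stand-in for a netCDF file (not a real netCDF API).
dimensions = {"lat": 4, "time": 3}
variables = {
    # "lat" shares its name with a dimension, so it can act as a coordinate vector.
    "lat": {"dims": ("lat",), "data": [-45.0, -15.0, 15.0, 45.0],
            "attrs": {"units": "degrees_north"}},
    "temp": {"dims": ("time", "lat"), "data": [[0.0] * 4 for _ in range(3)],
             "attrs": {"units": "K"}},
}


def is_coordinate_variable(name, dimensions, variables):
    """True if `name` is a 1-D variable named after its own dimension
    and holding exactly one value per index of that dimension."""
    var = variables.get(name)
    return (
        var is not None
        and name in dimensions
        and var["dims"] == (name,)
        and len(var["data"]) == dimensions[name]
    )


def physical_value(dim_name, index, variables):
    """Map a 0-based index along `dim_name` to its physical value."""
    return variables[dim_name]["data"][index]
```

Here `lat` qualifies as a coordinate vector, while `time` is a dimension without one, so indices along `time` carry no physical meaning in this example.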
7COARDS Conventions
- Cooperative Ocean/Atmospheric Research Data Service
- Conventions for use of netCDF
- Order of dimensions for variables
- Names of attributes (units, _FillValue)
- Coordinate variables
- http://ferret.wrc.noaa.gov/noaa_coop/coop_cdf_profile.html
8CF Conventions
- Climate and Forecast
- Follow-on to COARDS
- Tighter
- Many attributes optional in COARDS are required in CF
- More capable
- Multi-dimensional geolocation support
- http://www.cgd.ucar.edu/cms/eaton/cf-metadata/
9HDF
- Hierarchical Data Format
- HDF files contain
- Datasets
- multi-dimensional arrays of basic data types
- Dimensions
- Named sizes of dataset dimensions
- Groups
- Named groups of datasets (and groups)
- Attributes
- Named properties of datasets and groups, similar
to netCDF
10HDF-EOS
- Conventions and API for HDF
- HDF-EOS files contain
- Fields (datasets)
- Points
- Individually geolocated measurements
- Swaths
- Groups of data and geolocation fields, and mappings between them
- Grids
- Groups of data fields with rectilinear geolocation
11HDF-EOS (cont)
- HDF-EOS 2 over HDF4
- HDF-EOS 5 over HDF5
- HDF5 very different from HDF4
- HDF-EOS 2/5 near identical API
- Our he25 library allows uniform access to HDF-EOS 2/5, so converter works for both
- Looks/works like HDF-EOS 5
- On HDF-EOS 2 (HDF4) files, translates in/out
12Observations
- HDF-EOS is bigger than netCDF
- Additional structured metadata (ODL)
- HDF-EOS API calls for geolocation
- netCDF file ≈ HDF-EOS Swath/Grid
- Both are groups of related datasets
13Conversion Strategies
- One HDF-EOS file -> one netCDF file
- Alternative is one Swath/Grid -> one file
- COARDS/CF - if original HDF-EOS followed conventions, converted netCDF will also
- Most HDF-EOS data producers are aware of COARDS/CF
- Skip HDF-EOS Point datasets
- Reconsider this if real-world Point data emerges
14Conversion Strategies (cont)
- Convert data to enable future processing
- Geolocation data, attributes (units)
- Other metadata less important
- Could transfer ODL metadata as a string, but why?
- Can always go back to the original file and use
good HDF-EOS tools
15Conversion in General
HDF-EOS
Swath s1
  Dimensions(lat,lon,time)
  Datafield f1(lat,lon,time)
  Geofield f2(lat,lon,time)
Swath s2
  Dimensions(lat,lon,time)
  Datafield f3(lat,lon,time)
  Geofield f4(lat,lon,time)
netCDF
Dimensions(lat,lon,time,s2_time)
Variable s1_f1(lat,lon,time)
Variable s1_GEO_f2(lat,lon,time)
Variable s2_f3(lat,lon,s2_time)
Variable s2_GEO_f4(lat,lon,s2_time)
- Flatten HDF-EOS hierarchy
- Encode names, types in variable names
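The flattening of swaths into a single netCDF namespace can be sketched as follows. This is not the converter's code: the input layout and the function name `flatten_swaths` are hypothetical, and the rule for renaming a dimension (prefix it with the swath name only when an earlier swath already used that name with a different size) is an assumption inferred from the `s2_time` dimension in the example.

```python
def flatten_swaths(swaths):
    """Flatten {swath: {"dims", "data", "geo"}} into one flat namespace.

    Returns (dimensions, variables), where dimensions maps name -> size
    and variables maps name -> tuple of dimension names.
    """
    dimensions = {}
    variables = {}
    for swath_name, swath in swaths.items():
        rename = {}
        for dim, size in swath["dims"].items():
            if dim not in dimensions:
                dimensions[dim] = size          # first use: keep the name
                rename[dim] = dim
            elif dimensions[dim] == size:
                rename[dim] = dim               # same size: share the dimension
            else:
                new_name = f"{swath_name}_{dim}"  # conflict: prefix with swath name
                dimensions[new_name] = size
                rename[dim] = new_name
        # Data fields become <swath>_<field>; geofields become <swath>_GEO_<field>.
        for field, dims in swath["data"].items():
            variables[f"{swath_name}_{field}"] = tuple(rename[d] for d in dims)
        for field, dims in swath["geo"].items():
            variables[f"{swath_name}_GEO_{field}"] = tuple(rename[d] for d in dims)
    return dimensions, variables


# Two swaths whose "time" dimensions differ in size (sizes are made up).
swaths = {
    "s1": {"dims": {"lat": 10, "lon": 20, "time": 5},
           "data": {"f1": ("lat", "lon", "time")},
           "geo": {"f2": ("lat", "lon", "time")}},
    "s2": {"dims": {"lat": 10, "lon": 20, "time": 8},
           "data": {"f3": ("lat", "lon", "time")},
           "geo": {"f4": ("lat", "lon", "time")}},
}
dimensions, variables = flatten_swaths(swaths)
```

With these inputs the dimensions come out as lat, lon, time and s2_time, and the variables as s1_f1, s1_GEO_f2, s2_f3 and s2_GEO_f4, matching the netCDF side of the slide.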
16Swaths
netCDF
Dimensions(lat,glat,lon,glon,time,s2_time)
Attributes
  s2_DimensionMap lat/glat, lon/glon
  s2_DMOffsets (0,0)
  s2_DMIncrements (1,1)
Variable s2_f3(lat,lon,s2_time)
  Attributes coordinates s2_GEO_f4
Variable s2_GEO_f4(glat,glon,s2_time)
HDF-EOS
Swath s2
  Dimensions(lat,glat,lon,glon,time)
  DimensionMap(lat, glat, 0, 1)
  DimensionMap(lon, glon, 0, 1)
  Datafield f3(lat,lon,time)
  Geofield f4(glat,glon,time)
- Swath name, geofield type encoded in variable names
- Record dimension map in global attributes
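A minimal sketch of recording dimension maps as global attributes follows. The function names are hypothetical, the tuple order (data dimension, geolocation dimension, offset, increment) follows the slide rather than any library signature, and `data_index` assumes the usual HDF-EOS reading of a map with a positive increment: geolocation element g corresponds to data element offset + g * increment.

```python
def data_index(geo_index, offset, increment):
    """Data-dimension index matching a geolocation index.

    Assumes a positive increment: each geolocation element advances
    `increment` data elements, starting at `offset`.
    """
    return offset + geo_index * increment


def dimension_map_attributes(swath_name, dim_maps):
    """Build global attributes from (data_dim, geo_dim, offset, increment) tuples."""
    return {
        f"{swath_name}_DimensionMap": ", ".join(f"{d}/{g}" for d, g, _, _ in dim_maps),
        f"{swath_name}_DMOffsets": tuple(o for _, _, o, _ in dim_maps),
        f"{swath_name}_DMIncrements": tuple(i for _, _, _, i in dim_maps),
    }


attrs = dimension_map_attributes(
    "s2", [("lat", "glat", 0, 1), ("lon", "glon", 0, 1)]
)
```

Keeping offsets and increments in separate attributes, in the same order as the map names, lets a reader of the netCDF file rebuild each mapping without the original HDF-EOS file.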
17Grids
HDF-EOS
Grid g1
  Dimensions(lat,lon,time)
  Corners(upleft, upright, lowleft, lowright)
  Datafield f1(lat,lon,time)
netCDF
Dimensions(lat,lon,time)
Variable lat(lat) (lowright,upright)
Variable lon(lon) (lowleft, upleft)
Variable g1_f1(lat,lon,time)
- Grid geolocation becomes coordinate variables
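One way to turn grid corners into a coordinate variable is sketched below. The slides do not say how the converter computes the values, so this is an assumption throughout: it supposes an evenly spaced geographic grid and places each coordinate at the center of its cell. The function name `coordinate_vector` is hypothetical.

```python
def coordinate_vector(start, end, size):
    """Evenly spaced cell-center coordinates between two corner values.

    Assumes `size` equal-width cells span the interval [start, end].
    """
    if size <= 0:
        raise ValueError("size must be positive")
    step = (end - start) / size
    return [start + (i + 0.5) * step for i in range(size)]


# Four latitude cells spanning -90 to 90 degrees.
lat = coordinate_vector(-90.0, 90.0, 4)
```

The resulting list has one entry per index of the dimension, which is what a coordinate variable such as `lat(lat)` requires.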
18Converter
- C command-line application
- hdfeos2netcdf HDF_file netCDF_file
- Should be portable to all HDF-EOS5/netCDF platforms
- Naturally uses all libraries
19Where is the Software?
- http://hdfeos.gsfc.nasa.gov
- Tools category
- System hdfeos2netcdf
20Big Picture
HDF-EOS
File Attributes fa1 = fa value
Swath s1
  Attributes sa1 = sa value
  Dimensions(lat,lon,time)
  Datafield f1(lat,lon,time)
  Geofield f2(lat,lon,time)
Swath s2
  Dimensions(lat,lon,time)
  Datafield f3(lat,lon,time)
  Geofield f4(lat,lon,time)
netCDF
File Attributes fa1 = fa value, s1_sa1 = sa value
Dimensions(lat,lon,time,s2_time)
Variable s1_f1(lat,lon,time)
Variable s1_GEO_f2(lat,lon,time)
Variable s2_f3(lat,lon,s2_time)
Variable s2_GEO_f4(lat,lon,s2_time)
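The attribute handling in this picture can be sketched in a few lines. The function name `flatten_attributes` and the input layout are hypothetical. The sketch assumes that file attributes keep their names and that each swath attribute is prefixed with its swath name, as the `s1_sa1` example suggests.

```python
def flatten_attributes(file_attrs, swath_attrs):
    """Merge file and per-swath attributes into one set of global attributes."""
    merged = dict(file_attrs)  # file-level attributes keep their names
    for swath_name, attrs in swath_attrs.items():
        for name, value in attrs.items():
            # Prefix with the swath name so attributes from different swaths cannot collide.
            merged[f"{swath_name}_{name}"] = value
    return merged


global_attrs = flatten_attributes(
    {"fa1": "fa value"},
    {"s1": {"sa1": "sa value"}, "s2": {}},
)
```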