NetCDF-4 Interoperability with HDF4 and HDF5 Ed Hartnett - PowerPoint PPT Presentation

Slides: 32
Provided by: ed9117
Transcript and Presenter's Notes


1
NetCDF-4 Interoperability with HDF4 and HDF5
Ed Hartnett, Unidata, 8/4/9
2
Purpose of Interoperability Features: World Conquest
  • The purpose of the interoperability features is
    to allow users to use netCDF programs on
    non-netCDF data archives.
  • NetCDF-Java can read many data formats; the idea
    is to bring some of this functionality to the
    C/Fortran/C++ libraries.

3
Warning and Request
  • HDF4 and HDF5 interoperability features are still
    being tested. They are not ready for operational
    use yet.
  • The interoperability features are available in
    the netCDF daily snapshot release.
  • Please use them and send feedback to
    support-netcdf@unidata.ucar.edu

4
Overview
  • HDF4 Interoperability
      • What is HDF4 and why bother with it?
      • Reading HDF4 files with netCDF.
      • Limitations and request for help.
  • HDF5 Interoperability
      • What is HDF5 and why bother with it?
      • Reading HDF5 files with netCDF.
      • Limitations.

5
What is HDF4?
  • The original HDF format, superseded by HDF5.
  • HDF4 has built-in 32-bit limits that make it
    unattractive for new data sets. It is still
    actively supported by The HDF Group, but no new
    features are added.
  • Get more info about HDF4 at
    http://www.hdfgroup.org/products/hdf4
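
To make the 32-bit limit concrete, here is a minimal sketch that checks whether an array's total byte count fits in a signed 32-bit integer. It assumes the limit in question is a signed 32-bit byte count (about 2 GiB); the slide does not say exactly which quantities are limited, and the function name `fits_in_int32` is made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if an array with the given dimension lengths and element size
 * fits in a signed 32-bit byte count, 0 if it does not.
 * Assumption: the limit of interest is INT32_MAX bytes. */
int fits_in_int32(const uint64_t *dims, size_t ndims, uint64_t elem_size)
{
    uint64_t total = elem_size;
    size_t i;

    for (i = 0; i < ndims; i++) {
        /* Guard the multiplication so the 64-bit product cannot wrap. */
        if (dims[i] != 0 && total > UINT64_MAX / dims[i])
            return 0;
        total *= dims[i];
    }
    return total <= (uint64_t)INT32_MAX;
}
```

For example, a 43200 x 21600 grid of 4-byte floats needs 3,732,480,000 bytes, so `fits_in_int32` returns 0 for it, while a 20000 x 20000 grid (1,600,000,000 bytes) still fits.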

6
Why Read HDF4?
  • Some important data sets are distributed in HDF4,
    for example the Aqua/Terra satellite data.

7
HDF4 Background
  • HDF4 has several different APIs. The one of
    greatest interest to netCDF users is the SD
    (Scientific Data) API.
  • The SD API is (intentionally) very similar to the
    netCDF classic data model.

8
Confusing: HDF4 Includes NetCDF v2 API
  • A netCDF V2 API is provided with HDF4 which
    writes SD data files.
  • This must be turned off at HDF4 install-time if
    netCDF and HDF4 are to be linked in the same
    application.
  • There is no easy way to use both HDF4 with netCDF
    API and netCDF with HDF4 read capability in the
    same program.

9
Reading HDF4 SD Files
  • Starting with version 4.1, netCDF will be able to
    read HDF4 files created with the Scientific
    Dataset (SD) API.
  • This is read-only: netCDF can't write HDF4!
  • The intention is to make netCDF software work
    automatically with important HDF4 scientific data
    collections.

10
Building NetCDF to Read HDF4
  • This is only available for those who also build
    netCDF with HDF5.
  • HDF4, HDF5, zlib, and other compression libraries
    must exist before netCDF is built.
  • Build like this:
  • ./configure --with-hdf5=/home/ed --enable-hdf4

11
Compiling with HDF4
  • Include netcdf header file as usual.
  • Include locations of netCDF, HDF5, and HDF4
    include directories:
  • -I/loc/of/netcdf/include -I/loc/of/hdf5/include
    -I/loc/of/hdf4/include

12
Linking with HDF4
  • The HDF4 and HDF5 libraries (and associated
    libraries) are needed and must be linked into all
    netCDF applications. The locations of the lib
    directories must also be provided:
  • -L/loc/of/netcdf/lib -L/loc/of/hdf5/lib
    -L/loc/of/hdf4/lib
  • -lmfhdf -ldf -ljpeg -lhdf5_hl -lhdf5 -lz

13
Use nc-config to Help with Compile Flags
  • The nc-config utility is provided to help with
    compiler flags:

./nc-config --cflags
-I/usr/local/include
./nc-config --libs
-L/usr/local/lib -lnetcdf -L/machine/local/lib -lhdf5_hl -lhdf5 -lz -lm -lhdf4
./nc-config --flibs
-M/usr/local/lib -lnetcdf -L/machine/local/lib -lhdf5_hl -lhdf5 -lz -lm -lhdf4
14
Implementation Notes
  • You don't need to identify the file as HDF4 when
    opening it with netCDF, but you do have to open
    it read-only.
  • The HDF4 SD API provides a named, shared
    dimension, which fits easily into the netCDF
    model.
  • The HDF4 SD API uses other HDF4 APIs (like
    vgroups) to store metadata. This can be confusing
    when using the HDF4 data dumping tool hdp.
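
One plausible reason a file does not need to be identified as HDF4 is that each format begins with a distinctive signature. The sketch below is not the netCDF library's own detection code; it only illustrates the idea with the documented signatures of netCDF classic, HDF4, and HDF5 files. The names `file_format` and `sniff_format` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Formats this sketch can tell apart. */
typedef enum {
    FMT_UNKNOWN,
    FMT_NETCDF_CLASSIC,   /* "CDF" followed by version byte 1 or 2 */
    FMT_HDF4,             /* bytes 0x0E 0x03 0x13 0x01 */
    FMT_HDF5              /* bytes 0x89 'H' 'D' 'F' '\r' '\n' 0x1A '\n' */
} file_format;

/* Guess the format from the first `len` bytes of a file.
 * Assumption: the signature sits at offset 0. HDF5 also allows it at
 * offsets 512, 1024, ... (after a user block); that case is ignored. */
file_format sniff_format(const unsigned char *buf, size_t len)
{
    static const unsigned char hdf5_sig[8] =
        { 0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n' };
    static const unsigned char hdf4_sig[4] = { 0x0e, 0x03, 0x13, 0x01 };

    if (len >= 8 && memcmp(buf, hdf5_sig, 8) == 0)
        return FMT_HDF5;
    if (len >= 4 && memcmp(buf, hdf4_sig, 4) == 0)
        return FMT_HDF4;
    if (len >= 4 && memcmp(buf, "CDF", 3) == 0 &&
        (buf[3] == 1 || buf[3] == 2))
        return FMT_NETCDF_CLASSIC;
    return FMT_UNKNOWN;
}
```

A caller would read the first eight bytes of the file with `fread` and pass them to `sniff_format`; anything it reports as `FMT_HDF4` would then have to be opened read-only.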

15
C Code to Read HDF4 SD File
/* Create a file with one SDS, containing our phony data. */
sd_id = SDstart(FILE_NAME, DFACC_CREATE);
sds_id = SDcreate(sd_id, PRES_NAME, DFNT_INT32, DIMS_2, dim_size);
SDwritedata(sds_id, start, NULL, edge, (void *)data_out);
if (SDendaccess(sds_id)) ERR;
if (SDend(sd_id)) ERR;

/* Now open with netCDF and check the contents. */
if (nc_open(FILE_NAME, NC_NOWRITE, &ncid)) ERR;
if (nc_inq(ncid, &ndims_in, &nvars_in, &natts_in, &unlimdim_in)) ERR;
...
16
ncdump and HDF4 SD Files
  • With HDF4 reading enabled, ncdump works on HDF4
    files.
  • Sample MODIS file:

../ncdump/ncdump -h MOD29.A2000055.0005.005.2006267200024.hdf
netcdf MOD29.A2000055.0005.005.2006267200024 {
dimensions:
    Coarse_swath_lines_5km\:MOD_Swath_Sea_Ice = 406 ;
    Coarse_swath_pixels_5km\:MOD_Swath_Sea_Ice = 271 ;
    Along_swath_lines_1km\:MOD_Swath_Sea_Ice = 2030 ;
    Cross_swath_pixels_1km\:MOD_Swath_Sea_Ice = 1354 ;
variables:
    float Latitude(Coarse_swath_lines_5km\:MOD_Swath_Sea_Ice, Coarse_swath_pixels_5km\:MOD_Swath_Sea_Ice) ;
        Latitude:long_name = "Coarse 5 km resolution latitude" ;
        Latitude:units = "degrees" ;
...
17
HDF-EOS Not Understood
  • Many HDF4 data sets of interest follow the
    HDF-EOS metadata standard.
  • Stored as a long text string in global
    attributes, the HDF-EOS metadata looks messy.

// global attributes:
    :HDFEOSVersion = "HDFEOS_V2.9" ;
    :StructMetadata.0 = "GROUP=SwathStructure\n\tGROUP=SWATH_1\n\t\tSwathName=\"MOD_Swath_Sea_Ice\"\n\t\tGROUP=Dimension\n\t\t\tOBJECT=Dimension_1\n\t\t\t\tDimensionName=\"Coarse_swath_lines_5km\"\n\t\t\t\tSize=406\n\t\t\tEND_OBJECT=Dimension_1\n\t\t\tOBJECT=Dimension_2\n\t\t\t\tDimensionName=\"Coarse_swath_pixels_5km\"\n\t\t\t\tSize=271\n\t\t\t...
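
Although netCDF hands back StructMetadata.0 as one opaque string, an application can still pull simple facts out of it. The sketch below is not part of netCDF or the HDF-EOS toolkit; it assumes the text contains `DimensionName="..."` entries each followed by a `Size=N` entry, as in the sample above, and the names `eos_dim` and `parse_eos_dims` are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAME 64

/* One dimension recovered from the metadata text. */
typedef struct {
    char name[MAX_NAME];
    long size;
} eos_dim;

/* Scan HDF-EOS StructMetadata text for DimensionName="..." entries, each
 * followed by a Size=N entry, and store up to max_dims of them.
 * Returns the number of dimensions stored. */
size_t parse_eos_dims(const char *meta, eos_dim *dims, size_t max_dims)
{
    static const char name_key[] = "DimensionName=\"";
    static const char size_key[] = "Size=";
    const char *p = meta;
    size_t n = 0;

    while (n < max_dims && (p = strstr(p, name_key)) != NULL) {
        const char *start = p + strlen(name_key);
        const char *end = strchr(start, '"');
        const char *sz;
        size_t len;

        if (end == NULL)
            break;                      /* unterminated name: stop */
        len = (size_t)(end - start);
        if (len >= MAX_NAME)
            len = MAX_NAME - 1;         /* truncate overlong names */
        memcpy(dims[n].name, start, len);
        dims[n].name[len] = '\0';

        sz = strstr(end, size_key);
        if (sz == NULL)
            break;                      /* name without a size: stop */
        dims[n].size = strtol(sz + strlen(size_key), NULL, 10);
        n++;
        p = sz;                         /* continue after this entry */
    }
    return n;
}
```

The attribute text itself would first be fetched with the usual netCDF attribute calls; this sketch only covers the string handling, and a real reader would need a proper parser for the nested GROUP/OBJECT structure.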
18
HDF4 Read Testing
  • Tested in libsrc4/tst_interops2.c, which creates
    some HDF4 files with the SD API, and then reads
    them with netCDF.
  • If --enable-hdf4-file-tests is used with netCDF
    configure, some Aqua/Terra satellite data files
    are downloaded from the Unidata FTP site, then
    read by libsrc4/tst_interops3.c.

19
HDF4 Interoperability Limitations
  • File must be opened read-only.
  • Only HDF4 SD data files are currently understood.
  • This feature cannot be used at the same time as
    HDF4's netCDF v2 API, because HDF4 steals the
    netCDF v2 API function names. So you must use
    --disable-netcdf when building HDF4. (It might
    also work to use --disable-v2 for the netCDF
    build.)

20
Future HDF4 Work
  • More tests.
  • Support for HDF4 image types.
  • Test support for compressed data.
  • Add some support for HDF-EOS metadata in the
    libcf library, using the HDF-EOS toolkit.

21
Request for User Help: What Data to Read?
  • Please send me pointers to scientifically
    important HDF4 datasets.
  • The intention is not to read every HDF4 data set,
    just those of wide scientific interest.

22
Contribute Code to Write HDF4?
  • Some programmers use the netCDF v2 API to write
    HDF4 files.
  • It would not be too hard to write the glue code
    to allow v2 API -> HDF4 output from the
    netCDF library.
  • The next step would be to allow netCDF v3/v4 API
    code to write HDF4 files.
  • Writing HDF4 seems like a low priority to our
    users. I would be happy to help any user who
    would like to undertake this task.

23
What is HDF5?
  • HDF5 is an extremely general data storage format
    with many advanced features: on-the-fly
    compression, parallel I/O, a rich data model,
    etc.
  • Starting with netCDF-4.0, netCDF has been able to
    use HDF5 as a storage layer, exposing some of the
    advanced features.
  • But, until version 4.1, only HDF5 files created
    with netCDF-4 could be understood by netCDF-4.

24
Why Read HDF5 Files?
  • Many important datasets are available in HDF5
    format, including data from the Aqua satellite.

25
Rules for Reading HDF5 Files
  • NetCDF-4.1 provides read-only access to existing
    HDF5 files if they do not violate some rules:
      • Must not use circular group structure.
      • HDF5 reference type (and some other obscure
        types) are not understood.
  • Write access is still only possible with
    netCDF-4/HDF5 files.

26
HDF5 Version 1.8 Background
  • In version 1.8, HDF5 introduced dimension
    scales as a way of supporting shared dimensions.
  • Also in version 1.8, HDF5 introduced ordering by
    creation, rather than ordering alphabetically.
  • However, most data providers don't use these
    features and instead use HDF5 1.6.
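
To see why ordering matters to a reader, consider what happens to a list of names when creation order is not recorded. The sketch below is only an illustration in plain C, not HDF5 or netCDF code: it assumes that "alphabetical" means byte-wise `strcmp` order, and the variable names in the usage note are invented.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings. */
static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Reorder `names` in place from creation order to alphabetical order,
 * the order a reader sees when creation order is not tracked. */
void alphabetical_order(const char **names, size_t n)
{
    qsort(names, n, sizeof names[0], cmp_names);
}
```

Objects created in the order time, lat, lon, pres come back as lat, lon, pres, time, so any numbering based on position no longer matches the order in which the writer defined them.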

27
NetCDF-4.1 Relaxes Some Restrictions for HDF5 Files
  • Before netCDF-4.1, HDF5 files had to use creation
    ordering and dimension scales in order to be
    understood by netCDF-4.
  • Starting with netCDF-4.1, read-only access is
    possible to HDF5 files with alphabetical ordering
    and no dimension scales. (Created by HDF5 1.6
    perhaps.)
  • An HDF5 file may have dimension scales for all
    dimensions, or for no dimensions (not for just
    some of them).
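
The all-or-none rule for dimension scales can be expressed as a small check. This is a sketch under assumptions, not netCDF's implementation: the `has_scale` flags are hypothetical inputs that a real program would have to fill in by querying each dimension through the HDF5 dimension scale API.

```c
#include <stddef.h>

/* How dimension scales are used across a file's dimensions. */
typedef enum { SCALES_NONE, SCALES_ALL, SCALES_MIXED } scale_usage;

/* Classify a file from per-dimension flags (non-zero means the
 * dimension has a scale). Only SCALES_NONE and SCALES_ALL satisfy
 * the rule stated above; SCALES_MIXED violates it. */
scale_usage classify_scales(const int *has_scale, size_t ndims)
{
    size_t i, with_scale = 0;

    for (i = 0; i < ndims; i++)
        if (has_scale[i])
            with_scale++;

    if (with_scale == 0)
        return SCALES_NONE;
    if (with_scale == ndims)
        return SCALES_ALL;
    return SCALES_MIXED;
}
```

A file whose flags are {1, 0} would be classified as `SCALES_MIXED` and, by the rule above, would not be readable.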

28
HDF5 C Code to Write HDF5 File
/* Create file. */
if ((fileid = H5Fcreate(FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT,
                        H5P_DEFAULT)) < 0) ERR;

/* Create the space for the dataset. */
dims[0] = LAT_LEN;
dims[1] = LON_LEN;
if ((pres_spaceid = H5Screate_simple(DIMS_2, dims, dims)) < 0) ERR;

/* Create a variable. It will not have dimension scales. */
if ((pres_datasetid = H5Dcreate(fileid, PRES_NAME, H5T_NATIVE_FLOAT,
                                pres_spaceid, H5P_DEFAULT)) < 0) ERR;

if (H5Dclose(pres_datasetid) < 0 ||
    H5Sclose(pres_spaceid) < 0 ||
    H5Fclose(fileid) < 0) ERR;
29
NetCDF C Code to Read HDF5 File
/* Read the data with netCDF. */
if (nc_open(FILE_NAME, NC_NOWRITE, &ncid)) ERR;
if (nc_inq(ncid, &ndims_in, &nvars_in, &natts_in, &unlimdim_in)) ERR;
if (ndims_in != 2 || nvars_in != 1 || natts_in != 0 ||
    unlimdim_in != -1) ERR;
if (nc_close(ncid)) ERR;
30
Future Plans for HDF5 Interoperability
  • More testing.
  • Proper handling of reference types. This will
    probably require an extension of the netCDF
    APIs.
  • Better handling of strange group structures, if
    this proves necessary to read important data.

31
Summary
  • With the 4.1 release, the netCDF C/Fortran/C++
    libraries allow read-only access to some existing
    HDF4 and HDF5 data archives.
  • The intention is not to develop a completely
    general translation, but instead to focus on
    datasets of significance to the Earth science
    community.
  • Write capability is quite possible, but we don't
    plan on providing it because the demand for this
    is low.