What's New with HDF

1
What's New with HDF?
  • Mike Folk
  • mfolk@ncsa.uiuc.edu
  • http://hdf.ncsa.uiuc.edu

2
Outline
  • HDF project overview
  • HDF5 Data Model
  • HDF5 Library and tools
  • HDF Info Center

3
What is HDF?
4
Why HDF?
  • Big data
      • Need to manage large, complex collections of data
      • Need a variety of data types and structures
      • Large data structures and objects
      • Metadata in a variety of forms
  • Availability of data
      • Need to move from place to place
      • Need to share data
      • Open standard encourages wide use

5
Why HDF?
  • Ease of access
      • Software has to work on many machines
      • I/O library, as well as tools
  • Efficiency
      • Fast I/O
      • Efficient storage

HDF was created to address these concerns so that
others don't have to.
6
What is HDF?
  • Flexible, self-describing file format
  • Datatypes and data objects for scientific data
  • Software libraries and tools

7
Two HDFs
  • HDF4: original version of HDF
  • HDF5: new format and library
      • http://hdf.ncsa.uiuc.edu/HDF5
  • Why?
      • bigger, faster machines and storage systems
      • larger datasets
      • new I/O paradigms
      • parallel computing and I/O
      • complex data structures
      • complex subsetting
      • thread safety

8
Example HDF5 file
[Figure: a file tree with the root group "/" and a group "/foo", holding a table, two raster images, and a 2-D array. The table is shown below.]
lat  lon  temp
---  ---  ----
 12   23   3.1
 15   24   4.2
 17   21   3.6
9
HDF Software
10
HDF Applications Software
  • Free software
      • NCSA HDF library and utilities
      • Other software
  • Commercial/other software that understands
      • most of HDF (Noesys, IDL, HDF Explorer)
      • certain HDF objects (MATLAB, WebWinds)
  • HDF applications
      • http://hdf.ncsa.uiuc.edu/tools.html

11
Major Project 1: EOSDIS
Earth Observing System Data and Information System
  • Open standard for exchange of remote-sensed data
  • Scores of instruments and datasets
  • 1 terabyte per day per platform
  • HDF requirements
      • Earth science data types
      • Swath, grid, point data
      • Efficient storage and access
      • Support for scientists, data producers, archiving, etc.
      • HDF tools, utilities, access software

12
EOS Constellation
13
HDF-EOS Swath profile
[Figure: swath layout with geolocation fields (Time, Latitude, Longitude) and data fields (Brightness Temperature); dimension name Geotrack, size 21.]
14
Major Project 2: ASCI
  • ASCI Data Models and Formats (DMF) Group
  • Open standard exchange format and I/O library for ASCI
  • DOE tri-lab ASCI applications
  • HDF requirements
      • large datasets (> a terabyte)
      • ASCI data types, especially meshes
      • good performance in massively parallel environments

16
ASCI DMF Data Abstraction
  • Objectives
      • Sound data model with robust data abstractions
      • Computational mechanics data: meshes, fields
      • Based on mathematical field of fiber bundles
      • Common format allows common tools and sharing
      • Common API shields apps from model complexities

[Figure: software layers, top to bottom]
APPLICATION
Mesh APIs (SNL/LANL)
Fiber Bundle Kernel (LLNL)
Data Structure Layer (LLNL)
HDF5 (NCSA)
MPI IO (ANL)
17
HDF5
18
New HDF5 Features
  • More scalable
      • Larger arrays and files
      • More objects
  • Improved data model
      • New datatypes
      • Single comprehensive dataset object
  • Improved software
      • More flexible, robust library
      • More flexible API
      • More I/O options

19
HDF5 file structure
File header info (version, etc.)
User block
Root group "/"
Other objects (datasets, groups, etc.)
20
HDF5 data model
  • Two primary objects
  • Dataset
      • multidimensional array of elements
      • rich variety of datatypes
  • Group
      • directory-like structure
      • contains datasets, groups, other objects

21
Dataset components
  • a multidimensional array of data elements
  • header with metadata
      • datatype
      • dataspace
      • attributes
      • storage info
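
As a rough illustration of how the array and its header fit together, here is a minimal Python sketch. It is a toy stand-in, not the HDF5 library API: the `Dataset` class, its field names, and the `units` attribute are all hypothetical and chosen only to mirror the components listed above.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Dataset:
    """Toy stand-in for a dataset: raw elements plus a descriptive header."""
    data: list                      # flattened array elements, row-major
    datatype: str                   # element type label, e.g. "float32"
    dataspace: tuple                # dimension sizes, e.g. (5, 3)
    attributes: dict = field(default_factory=dict)  # small user metadata
    storage: str = "contiguous"     # layout hint: "contiguous", "chunked", ...

    def __post_init__(self):
        # The dataspace must account for every stored element.
        if len(self.data) != prod(self.dataspace):
            raise ValueError("data length does not match dataspace")

    def element(self, *index):
        """Return the element at a multidimensional index (row-major order)."""
        flat = 0
        for i, n in zip(index, self.dataspace):
            if not 0 <= i < n:
                raise IndexError("index out of range")
            flat = flat * n + i
        return self.data[flat]


# A 5 x 3 dataset of floats with one hypothetical attribute.
temps = Dataset(
    data=[float(i) for i in range(15)],
    datatype="float32",
    dataspace=(5, 3),
    attributes={"units": "kelvin"},
)
```

Because the header travels with the array, a reader can interpret `temps.data` from `temps.dataspace` and `temps.datatype` alone, which is the sense in which the format is self-describing.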

22
Simple datatypes
  • The usual scalars: integer, float
  • user-defined scalars (e.g. 13-bit integers)
  • variable length (e.g. strings)
  • pointers to objects or regions of datasets
  • enumeration
  • opaque

23
Compound datatypes
  • User-defined
  • Comparable to C structs
  • Members can be simple or compound types
  • Members can be multidimensional
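
To make the comparison with C structs concrete, the following sketch packs fixed-layout records using Python's standard `struct` module. It does not use the HDF5 library; the field names (`lat`, `lon`, `temp`) and the chosen layout (two 32-bit integers and one 64-bit float, little-endian) are assumptions made for illustration.

```python
import struct

# Assumed record layout: two 32-bit ints and one 64-bit float,
# little-endian, no padding between members.
RECORD = struct.Struct("<iid")
FIELDS = ("lat", "lon", "temp")

rows = [(12, 23, 3.1), (15, 24, 4.2), (17, 21, 3.6)]

# Pack every row into one contiguous byte buffer, the way an array of
# C structs would sit in memory.
buffer = b"".join(RECORD.pack(*row) for row in rows)


def read_record(buf, index):
    """Decode the record at position `index` into a dict keyed by field name."""
    values = RECORD.unpack_from(buf, index * RECORD.size)
    return dict(zip(FIELDS, values))


second = read_record(buffer, 1)
```

Each record occupies `RECORD.size` bytes, so any element can be located by its index without scanning the records before it.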

24
HDF5 dataset: array of elements
[Figure: a dataset with dimensionality 5 x 3. The datatype of each element is a compound type with members int8, int4, int16, and a 2x3 array of float32.]
25
Groups
  • A mechanism for collections of related objects
  • Every file starts with a root group
  • Similar to UNIX directories
  • Can have attributes
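
The directory analogy can be sketched with a small tree of nested containers and a path lookup. This is a conceptual model only, not the HDF5 API: the `Group` class, `create_group`, and `resolve` are hypothetical names, and the stored object is a plain Python list.

```python
class Group:
    """Toy group: a named collection of members, with its own attributes."""

    def __init__(self):
        self.members = {}   # name -> Group or any other object
        self.attrs = {}     # groups can carry attributes too

    def create_group(self, name):
        """Add and return an empty subgroup called `name`."""
        child = Group()
        self.members[name] = child
        return child


def resolve(root, path):
    """Walk a UNIX-style absolute path from the root group to an object."""
    node = root
    for part in path.strip("/").split("/"):
        if part == "":
            continue  # the path "/" names the root group itself
        if not isinstance(node, Group):
            raise KeyError(f"{part!r} is below a non-group object")
        node = node.members[part]
    return node


# Every file starts with a root group.
root = Group()
foo = root.create_group("foo")
foo.attrs["comment"] = "example group"
foo.members["x"] = [[1, 2], [3, 4]]   # stand-in for a 2-D array dataset

found = resolve(root, "/foo/x")
```

A lookup such as `resolve(root, "/foo/x")` descends one group per path component, which is the same traversal a directory path implies.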

26
Example HDF5 file
[Figure: a file tree labelled with the paths /, /a, /b, /c, /foo, /foo/b, /foo/x, /foo/y, and /foo/z, holding a table, two raster images, and a 2-D array. The table is shown below.]
lat  lon  temp
---  ---  ----
 12   23   3.1
 15   24   4.2
 17   21   3.6
27
Special Storage Options
  • chunked: better subsetting access time; extendable
  • compressed: improves storage efficiency, transmission speed
  • extendable: arrays can be extended in any direction
  • split file: metadata in one file, raw data in another
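
One way to see why chunking helps subsetting is to count how many chunks a rectangular selection touches. The sketch below is plain Python arithmetic under assumed chunk shapes, not a description of how the HDF5 library lays out chunks internally; the function names are hypothetical.

```python
from itertools import product


def chunk_of(index, chunk_shape):
    """Return (chunk coordinates, offset inside that chunk) for one element."""
    chunk = tuple(i // c for i, c in zip(index, chunk_shape))
    offset = tuple(i % c for i, c in zip(index, chunk_shape))
    return chunk, offset


def chunks_touched(start, count, chunk_shape):
    """Return the set of chunk coordinates overlapped by a rectangular selection.

    `start` is the first selected index per dimension and `count` the number
    of selected elements per dimension.
    """
    ranges = [
        range(s // c, (s + n - 1) // c + 1)
        for s, n, c in zip(start, count, chunk_shape)
    ]
    return set(product(*ranges))


# Element (5, 7) of an array stored in 4 x 4 chunks.
location = chunk_of((5, 7), (4, 4))

# A 10 x 10 block aligned to 10 x 10 chunks needs a single chunk ...
aligned = chunks_touched(start=(20, 30), count=(10, 10), chunk_shape=(10, 10))

# ... while a 3 x 3 block straddling chunk borders needs four.
straddling = chunks_touched(start=(2, 2), count=(3, 3), chunk_shape=(4, 4))
```

Only the chunks in the returned set would need to be read, instead of every row that the selection crosses in a contiguous layout.
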
28
The HDF5 Library
29
Features
  • Support for high performance applications
      • Ability to create complex data structures
      • Complex subsetting
      • Flexible, efficient I/O (parallel, remote, etc.)
  • Support for key language models
      • OO compatible
      • C and Fortran primarily
      • Also Java, C++

30
Subsetting and subsampling
Mappings between file arrays/selections and memory arrays/selections.
(a) A hyperslab from a 2D array to the corner of a smaller 2D array
(b) A regular series of blocks from a 2D array to a contiguous sequence at a certain offset in a 1D array
(c) A sequence of points from a 2D array to a sequence of points in a 3D array
(d) Union of hyperslabs in file to union of hyperslabs in memory. Number of elements must be equal.
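
Case (b) can be sketched in plain Python. The selection is described by `start`, `stride`, `count`, and `block` per dimension, which is a common way to describe a regular hyperslab; those parameter names, the array sizes, and the helper function are assumptions for this illustration, and no HDF5 library call is made.

```python
from itertools import product


def hyperslab_indices(start, stride, count, block):
    """List the coordinates of a regular hyperslab in row-major order.

    Per dimension: begin at `start`, take `count` blocks of `block`
    consecutive elements, with block origins `stride` elements apart.
    """
    per_dim = []
    for s, st, c, b in zip(start, stride, count, block):
        per_dim.append([s + i * st + j for i in range(c) for j in range(b)])
    return list(product(*per_dim))


# A 6 x 8 "file" array whose values encode their own position (row*10 + col).
file_array = [[r * 10 + c for c in range(8)] for r in range(6)]

# Select a 2 x 2 grid of 2 x 2 blocks from the file array.
coords = hyperslab_indices(start=(0, 1), stride=(3, 4), count=(2, 2), block=(2, 2))

# Copy the selection into a contiguous run of a 1D memory array at an offset.
memory = [None] * 20
offset = 3
for k, (r, c) in enumerate(coords):
    memory[offset + k] = file_array[r][c]
```

The file selection and the memory selection have different shapes but the same number of elements, which is the only constraint the mapping needs.
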
31
Files needn't be files: Virtual File Layer
VFL: a public API for writing I/O drivers
[Figure: a file handle (hid_t) sits on top of the VFL (virtual file I/O layer), which dispatches to I/O drivers (memory, mpio, stdio, network) backed by storage (memory, network, files).]
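
The idea of swapping storage behind one interface can be sketched as follows. This is a conceptual Python model, not the actual VFL driver API: the `Driver` interface, the two driver classes, and `store_signature` are hypothetical names, and only memory and ordinary-file backends are shown.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class Driver(ABC):
    """Minimal driver interface: byte-addressed reads and writes."""

    @abstractmethod
    def write(self, offset, data):
        ...

    @abstractmethod
    def read(self, offset, size):
        ...


class MemoryDriver(Driver):
    """Keeps the whole 'file' in a growable in-memory buffer."""

    def __init__(self):
        self.buf = bytearray()

    def write(self, offset, data):
        end = offset + len(data)
        if end > len(self.buf):
            self.buf.extend(bytes(end - len(self.buf)))  # zero-fill the gap
        self.buf[offset:end] = data

    def read(self, offset, size):
        return bytes(self.buf[offset:offset + size])


class StdioDriver(Driver):
    """Stores the same bytes in an ordinary file on disk."""

    def __init__(self, path):
        self.f = open(path, "w+b")

    def write(self, offset, data):
        self.f.seek(offset)
        self.f.write(data)

    def read(self, offset, size):
        self.f.seek(offset)
        return self.f.read(size)

    def close(self):
        self.f.close()


def store_signature(driver):
    """Library-side code: it only sees the Driver interface, not the storage."""
    driver.write(4, b"HDF5")
    return driver.read(4, 4)


mem_result = store_signature(MemoryDriver())

with tempfile.TemporaryDirectory() as tmp:
    disk = StdioDriver(os.path.join(tmp, "demo.bin"))
    disk_result = store_signature(disk)
    disk.close()
```

`store_signature` runs unchanged against either backend, which is the point of putting a driver layer between the library and the storage.
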
32
HDF5 tools
  • Current
      • hdf5ls - lists contents of HDF5 file
      • h5dumper - higher level view
      • hdf5-to-hdf4 converter
      • VisAD data adapter
  • Future
      • Convert HDF5 to ASCII, binary, GIF, etc.
      • Convert HDF4 to HDF5
      • Java tools
      • XML-based tools

33
HDF5 Information
  • HDF website
      • http://hdf.ncsa.uiuc.edu/
  • HDF5 Information Center
      • http://hdf.ncsa.uiuc.edu/HDF5/
  • HDF Help email address
      • hdfhelp@ncsa.uiuc.edu
  • HDF users mailing list
      • hdfnews@ncsa.uiuc.edu