Title: HDF Update


1
HDF Update
  • Mike Folk
  • The HDF Group
  • HDF and HDF-EOS Workshop X
  • November 29, 2006

2
Outline
  • Organizational info
  • HDF Software Update
  • Other Activities of Interest

3
Organizational info
4
The HDF Group (THG)
Founded Dec. 2006
Went solo July 15, 2006
Non-profit
5
THG mission
To support the vast community of HDF
users and to ensure the sustainable development
of HDF technologies and the ongoing accessibility
of HDF-stored data.
6
The HDF Team
Frank Baker, Christian Chilan, Peter Cao,
Vailin Choi, Mike Folk, Anne Jennings,
Barbara Jones, Quincey Koziol, James Laird,
Raymond Lu, John Mainzer, Matthew Needham,
Pedro Nunes, Tammi O'Neill, Elena Pourmal,
Binh-Minh Ribler, Randy Ribler, Rishi Sinha,
Kent Yang
And all those wonderful folks out there who
contribute ideas, requests, bug reports, code,
and support.
7
Who is supporting HDF?
  • Organizations providing broad support
  • NASA, DOE, Boeing
  • Agencies supporting R&D (2006)
  • NASA, NARA, DOE, NCSA, Agilent, Aberdeen Test
    Center, DD(X)
  • Collaborators who make in-kind contributions
  • Cactus, PyTables, NeXUS, CGNS, many others

8
HDF Software Update
9
HDF4 update
10
Platforms to be dropped
  • Operating systems
  • HPUX 11.00
  • Crays SV1 and TS IEEE
  • AIX 5.1 and 5.2
  • SGI IRIX64-6.5
  • Linux 2.4
  • Solaris 2.7, 2.8, 2.9
  • Windows 2000
  • MAC OSX 10.3
  • Compilers
  • GNU C compilers older than 3.4 (Linux)
  • Intel 8.
  • PGI V. 5., 6.0

11
Platforms to be added
  • Systems
  • MAC OSX 10.4 (Intel)
  • Solaris 2. on Intel
  • Cray XT3
  • Windows 64-bit (?)
  • Linux 2.6
  • HPUX 11.23
  • IBM Power 5
  • Compilers
  • g95
  • PGI V. 6.1
  • Intel 9.

12
New features
  • Configuration
  • Switched to use F77_FUNC macro for better Fortran
    support (no hard-coded compilers anymore!)
  • Support for shared libraries
  • Library
  • No hard-coded limit on number of opened files
  • New APIs to control number of files opened by
    application
  • Fortran support for SZIP compression

13
Bug fixes
  • Tools
  • Many improvements to the hdp, hrepack, hdiff
    and hdfimport utilities based on user feedback
  • Library
  • Data corruption bug when several unlimited
    dimension SDSs are open
  • Better handling of SDSs with duplicated names in
    SDgetdimscale and more

14
HDF5 update
15
No new releases!
  • Focus on HDF5 release 1.8
  • HDF5-1.8.0 Alpha 5 release is available from
    hdf.ncsa.uiuc.edu/HDF5/release/alpha/obtain518.html

16
Platforms to be dropped
  • Operating systems
  • HPUX 11.00
  • MAC OS 10.3
  • AIX 5.1 and 5.2
  • SGI IRIX64-6.5
  • Linux 2.4
  • Solaris 2.8 and 2.9
  • Compilers
  • GNU C compilers older than 3.4 (Linux)
  • Intel 8.
  • PGI V. 5., 6.0
  • MPICH 1.2.5

http://www.hdfgroup.org/HDF5/release/alpha/obtain518.html
17
Platforms to be added
  • Systems
  • Alpha Open VMS
  • MAC OSX 10.4 (Intel)
  • Solaris 2. on Intel (?)
  • Cray XT3
  • Windows 64-bit (32-bit binaries)
  • Linux 2.6
  • BG/L
  • Compilers
  • g95
  • PGI V. 6.1
  • Intel 9.
  • MPICH 1.2.7
  • MPICH2

18
New Features in HDF5 1.8
19
HDF5 1.8 new library features
  • Datatype and dataspace features
  • Serialized dataspaces and datatypes
  • Ability to create data type from text description
  • Integer to float conversions during I/O
  • Revised exception handling during type conversion
  • Compact storage for N-bit data types
  • Offset+size storage filter, saving space
  • Null dataspace: datasets with no elements
  • Data transformation filter
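
The space saving behind an offset-plus-size style of storage can be illustrated with a minimal sketch. This is not the HDF5 filter itself; the function names are hypothetical and the only idea shown is storing one offset plus small differences that need fewer bits.

```python
def offset_size_encode(values):
    """Return (offset, bits_per_value, deltas) for a list of integers."""
    offset = min(values)
    deltas = [v - offset for v in values]
    # Bits needed for the largest delta; at least 1 so the width is never 0.
    bits = max(1, max(deltas).bit_length())
    return offset, bits, deltas


def offset_size_decode(offset, deltas):
    """Rebuild the original integers from the offset and the deltas."""
    return [offset + d for d in deltas]


readings = [100003, 100000, 100007, 100001]
offset, bits, deltas = offset_size_encode(readings)
packed_bits = bits * len(readings)   # payload size after encoding
raw_bits = 32 * len(readings)        # the same values as 32-bit integers
```

Here the four readings differ by at most 7, so each delta fits in 3 bits instead of 32. The cost is that the offset and bit width must be stored alongside the data.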

20
HDF5 1.8 new library features
  • Group revisions
  • Creation order access
  • Compact groups: small groups take less space
  • Large group storage improvements
  • Intermediate group creation

21
HDF5 1.8 new library features
  • Link improvements
  • External links -- can refer to objects in another
    file
  • User-defined links: apps create their own kinds
    of links
  • Attribute improvements
  • Storage improvements for large numbers of
    attributes
  • Iterate or look up by creation order

22
HDF5 1.8 new library features
  • Support for Unicode UTF-8 character set
  • Shared header info: duplicate header info is
    shared, possibly saving space
  • Metadata cache improvements: faster I/O on files
    with many objects
  • Data transformation filter
  • Stackable Virtual File Drivers
  • Better UNIX/Linux portability

23
HDF5 1.8 new APIs
  • New extensible error-handling API
  • New APIs for fast copying of objects between files
  • Dimension scale model and API
  • HDFpacket API to read/write packets
    efficiently

24
HDF5 1.8 backward and forward compatibility
25
HDF5 1.8 vs. 1.6.5
  • Differences between 1.8 and 1.6.5
  • Some file format changes
  • Several new routines added
  • Old APIs deprecated, to be removed in a later
    release
  • Consequences
  • Applications requiring 1.8 format changes will
    write objects that the 1.6.5 library cannot read
  • To exploit 1.8 changes, apps need to be rewritten

26
Principle of Maximum file format compatibility
  • Unless instructed otherwise, the HDF5 library
    will write objects using the earliest version of
    the format possible for describing the
    information. This assures forward compatibility
    with older versions whenever possible: objects in
    new files can be read with old libraries if those
    objects are known to the old libraries.

27
Example: Datatype header message
  • Compound datatype encoding
  • Version 1, used by 1.6.5 and earlier, encodes
    compound datatypes with explicit array fields
  • Version 2, used for 1.8.0, has a new encoding,
    reducing storage overhead for compound data
  • By default 1.8.0 writes compound data in a format
    compatible with 1.4.0 - 1.6.X libraries
  • But if feature is requested, compound data
    created by 1.8.0 will not be readable by earlier
    versions
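
The "earliest version possible" rule can be sketched as a simple selection. This is an illustrative model only, not library code: the feature names and the version table are hypothetical, chosen to mirror the version 1 and version 2 datatype encodings described above.

```python
# Hypothetical table: the lowest message version able to express each feature.
MIN_VERSION_FOR_FEATURE = {
    "compound_with_array_fields": 1,   # version 1 encoding (1.6.5 and earlier)
    "compact_compound_encoding": 2,    # version 2 encoding (new in 1.8.0)
}


def choose_message_version(requested_features):
    """Pick the earliest version that can describe everything requested."""
    needed = [MIN_VERSION_FOR_FEATURE[f] for f in requested_features]
    return max(needed, default=1)


default_version = choose_message_version(["compound_with_array_fields"])
opted_in_version = choose_message_version(
    ["compound_with_array_fields", "compact_compound_encoding"]
)
```

Under these assumptions the default write stays at version 1, which older libraries understand, and the version only rises when a newer feature is explicitly requested.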

28
HDF5 Forward Compatibility
  • Format
  • Can old libraries access files made by new
    library?
  • Old library versions will read all objects in a
    file created by a newer library if objects are
    known to the old library
  • API
  • Can old applications link with the new library?
  • Applications written to work with an older
    version of library will compile, link and run as
    expected with a newer version

29
HDF5 Backward Compatibility
  • File Format
  • Can new library access files made by old library?
  • Newer version of the library will always read
    files created with an older version
  • Library APIs
  • Can new applications link with the older
    libraries?
  • Application written for the newer version will
    compile and link with the older library unless
    new features are used
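
The "objects known to the old library" condition can be modeled as a per-object version check. This is a sketch under stated assumptions: the version tables, object names and the `readable_objects` helper are all hypothetical and do not reflect the library's internals.

```python
def readable_objects(file_objects, library_max_versions):
    """Return names of objects whose every message version is known to the library."""
    readable = []
    for name, messages in file_objects.items():
        if all(version <= library_max_versions.get(kind, 0)
               for kind, version in messages.items()):
            readable.append(name)
    return readable


# Hypothetical version tables for an older and a newer library.
OLD_LIBRARY = {"datatype": 1, "dataspace": 1}
NEW_LIBRARY = {"datatype": 2, "dataspace": 1}

# A file written by the newer library: one default object and one
# that opted in to the newer datatype encoding.
new_file = {
    "/temperature": {"datatype": 1, "dataspace": 1},
    "/records": {"datatype": 2, "dataspace": 1},
}

seen_by_old = readable_objects(new_file, OLD_LIBRARY)
seen_by_new = readable_objects(new_file, NEW_LIBRARY)
```

In this model the older library still reads `/temperature` from the new file and only misses `/records`, while the newer library reads both.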

30
HDF5 Compatibility information
  • Backward and forward compatibility issues
  • http://hdfgroup.org/HDF5/faq/bkfwd-compat.html
  • API changes from release to release
  • http://hdfgroup.org/HDF5/doc_1.8pre/doc/ADGuide/Changes.html
  • File Format changes
  • http://hdfgroup.org/HDF5/doc/H5.format.html

31
Command line tools
32
New features for old tools
  • h5dump
  • Dump data in binary format
  • h5diff
  • Compare dataset regions
  • Parallel h5diff (ph5diff)
  • Compare two files in MPI parallel environment
  • h5repack
  • Efficient data copy using H5Gcopy()
  • Able to handle big datasets

33
New HDF5 Tools
  • h5copy
  • Copies a group, dataset or named datatype from
    one location to another location
  • Copies within a file or across files
  • h5check
  • Verifies an HDF5 file against the defined HDF5
    File Format Specification
  • h5stat
  • Reports statistics about a file and objects in a
    file

34
HDF Java Products
35
HDFView changes
  • Quality improvements for HDF-java package
  • Full documentation of hdf-java object package
  • Test suite for hdf-java object package
  • Support 64-bit Java on Linux and Solaris
  • Many new features, including
  • Change font size easily
  • Grab and move image
  • Create new table (compound dataset) from template
  • Filter out fill value for image creation
  • -geometry option for very high resolution displays

36
Future work for Java
  • Update HDF5 JNI APIs for HDF5 1.8 release
  • Release HDFView 2.4 with bug fixes/new features
    with HDF5 1.8 release
  • New GUI features dealing with table, image and
    animation
  • Writing capability for HDF5-SRB model

37
Website Development for HDF-EOS Tools and Information Center
38
Website for HDF-EOS Tools
  • THG now manages HDF-EOS web site
  • Registered domain names: hdfeos.net/.org/.com
  • Re-implemented major topic areas
  • Re-designed interface
  • Registered google search
  • Will continue maintenance
  • Phase two
  • Host mailing list
  • Support simple forum features

39
Website for HDF-EOS Tools
40
Other Activities of Interest
41
Performance R&D
42
HDF5 - PnetCDF performance comparison
uP Power 5
I/O performance of PnetCDF is comparable with
parallel HDF5 when the libraries are used in
similar manners.
43
PnetCDF4 - PnetCDF comparison
I/O performance of parallel NetCDF4 is comparable
with PnetCDF, about 15% slower on average
for the output of a ROMS history file.
44
Collective I/O improvements
  • HDF5 supports collective IO for non-regular
    selections
  • Collective IO for chunked storage is not trivial.
  • Non-regular selection performance optimizations
  • Added IO options to achieve good collective IO
    performance
  • Added APIs for applications to participate in the
    optimization process
  • See the poster
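
One possible reason collective I/O on chunked storage is not trivial is that the selections of different processes need not line up with chunk boundaries. The sketch below only computes which chunks a rectangular selection overlaps. It is an assumption-laden illustration with a hypothetical helper and made-up sizes, not the optimization added to HDF5.

```python
def chunks_touched(start, count, chunk_shape):
    """List chunk indices a rectangular selection overlaps in a 2-D dataset."""
    (r0, c0), (nr, nc), (ch_r, ch_c) = start, count, chunk_shape
    first_row, last_row = r0 // ch_r, (r0 + nr - 1) // ch_r
    first_col, last_col = c0 // ch_c, (c0 + nc - 1) // ch_c
    return [(i, j)
            for i in range(first_row, last_row + 1)
            for j in range(first_col, last_col + 1)]


# Two hypothetical MPI ranks, each selecting a block of rows
# from a dataset stored in 4 x 4 chunks.
rank0 = chunks_touched(start=(0, 0), count=(6, 8), chunk_shape=(4, 4))
rank1 = chunks_touched(start=(6, 0), count=(6, 8), chunk_shape=(4, 4))
shared = sorted(set(rank0) & set(rank1))
```

Both ranks touch the middle row of chunks, so those chunks would have to be coordinated between processes rather than written independently.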

45
DOE Labs
Lawrence Livermore National Laboratory
Sandia National Laboratory
46
DOE ASC and Others
  • Support HDF5 on major systems at Sandia and
    Lawrence Livermore National Laboratories
  • R&D efforts underway
  • File recovery after a crash
  • Very fast write speed: goal is 300 MB/sec
  • Read-while-writing capability
  • Java library and HDFView improvements

Advanced Scientific Computing project
47
Flight test
48
Flight test: collect, then process
49
Boeing: HDF5 for flight test data
  • Boeing 787 active archive
  • 10 TB per flight-test day
  • Must handle raw, real-time data
  • High speed ingest, by packet
  • Post-processing, by time-history
  • Boeing High Level APIs
  • HDFpacket: released with HDF5 1.8
  • HDFtime_history: new, open version likely
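
The two access patterns, ingest by packet and post-processing by time-history, can be contrasted in a small sketch. This uses only Python's `struct` module and a hypothetical packet layout. It is not the HDFpacket or HDFtime_history API.

```python
import struct

# Hypothetical fixed-size packet: a timestamp followed by two channel samples.
PACKET = struct.Struct("<dff")   # float64 time, float32 channel A, float32 channel B


def ingest(packets):
    """Append packets in arrival order to one growing byte buffer."""
    buffer = bytearray()
    for packet in packets:
        buffer += PACKET.pack(*packet)
    return bytes(buffer)


def time_history(buffer, channel):
    """Extract (time, value) pairs for one channel (0 or 1) from the buffer."""
    return [(fields[0], fields[1 + channel])
            for fields in PACKET.iter_unpack(buffer)]


stream = ingest([(0.0, 1.5, 20.0), (0.5, 2.5, 21.0), (1.0, 3.5, 22.0)])
channel_b = time_history(stream, channel=1)
```

Writing is a pure append of whole packets, which suits high-speed ingest. Reading one channel over time has to pick one field out of every packet, which is why the two uses may call for different interfaces.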

50
Product data
STEP
51
Bioinformatics
Managing genomic data
52
C# HDF5 API for Agilent
53
Agilent C# project
  • Why?
  • Heavy use of C# at Agilent
  • Compatibility with Matlab
  • Other interest in HDF5 at Agilent
  • What?
  • Prototype API in C# for Windows XP
  • Basic functions to create, open, close, read,
    write
  • Limited datatypes, no partial I/O
  • When?
  • March 2007

54
HDF5 Software
Fortran
C
Java
C
C API
HDF I/O Library
HDF File
55
NetCDF 4
56
NetCDF 4 project
  • Enhanced NetCDF-4 Interface to HDF5
  • Combine features of netCDF and HDF5
  • Take advantage of their separate strengths
  • Collaboration between NCSA, THG, Unidata
  • Currently in Alpha Release
  • Waiting for beta release


57
NetCDF-4 Architecture
netCDF-3 applications
netCDF-4 applications
HDF5 applications
netCDF files
netCDF-4 Library
netCDF-4 HDF5 files
HDF5 files
HDF5 Library
  • Supports access to netCDF files and HDF5 files
    created through netCDF-4 interface

58
Archival formats
  • Proposal to NOAA Scientific Data Stewardship
    program
  • Will investigate use of OAIS Archive Information
    Package standard with HDF5
  • PIs: Ruth Duerr (NSIDC) and Kent Yang

OAIS: Open Archival Information System
59
Asymmetries between collecting and accessing data
60
  • Huge streams of data collected
  • To be accessed in little bits

61
Challenge: efficient remote access
  • How do we efficiently find and access data from
    distributed repositories, when the data are big
    and complex?
  • Storage Resource Broker (SRB)
  • Efficient access to HDF5 objects in repository
  • OPeNDAP
  • Powerful protocol for remote querying and
    subsetting of scientific data
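
The benefit of subsetting over whole-file transfer can be shown with a local stand-in. The sketch below reads a few elements from a large binary file by seeking to their byte offset. It is not the SRB or OPeNDAP protocol; the file layout and helper names are assumptions for illustration.

```python
import os
import struct
import tempfile

VALUE = struct.Struct("<d")   # one little-endian float64 per element


def write_values(path, values):
    """Store the values back to back so element i starts at byte i * 8."""
    with open(path, "wb") as f:
        for v in values:
            f.write(VALUE.pack(v))


def read_slice(path, first, count):
    """Read `count` elements starting at index `first` without loading the rest."""
    with open(path, "rb") as f:
        f.seek(first * VALUE.size)
        data = f.read(count * VALUE.size)
    return [v[0] for v in VALUE.iter_unpack(data)]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "big.bin")
    write_values(path, [float(i) for i in range(1000)])
    subset = read_slice(path, first=500, count=3)
    bytes_read = 3 * VALUE.size
    file_size = os.path.getsize(path)
```

Only 24 of the file's 8000 bytes are read to answer the request. A remote service that supports subsetting aims for the same effect across a network.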

62
Example: Storage Resource Broker
  • Storage Resource Broker: repository for
    heterogeneous data collections
  • Simplifies storage, query and access to massive
    amounts of scientific data
  • Has data in HDF5, netCDF, other formats

63
Normal SRB configuration
client
HDF5 File (whole file or a sequence of bytes)
SRB Server
MCAT
64
OPeNDAP-HDF5 project
  • OPeNDAP
  • Powerful protocol for remote querying and
    subsetting of scientific data
  • Replaces direct file access with remote query and
    access
  • Widely used in Earth Sciences

65
OPeNDAP HDF5 Project
  • A NASA ROSES NRA project
  • Tasks
  • HDF5-DAP2 server (now a prototype)
  • HDF5-DAP4 server
  • DAP4 to HDF5 conversion utility
  • Investigate integrated DAP-aware HDF5 library

66
SQL Server and HDF5 with Microsoft
67
SQL Server and HDF5
  • Microsoft dream environment for scientists
  • Combine data management, computing
  • SQL Server 2005 solution
  • Combine RDBMS with scientific analysis tools,
    together in one integrated system.
  • HDF5 and other formats manage scientific objects

68
HDF5 in SQL server
OLAP and Data Mining
Libraries (MATLAB,)
Web Services (XML, REST, RSS)
Visualization
Reporting
.NET Languages with Language Integrated Query
Entity Framework (EDM, eSQL, O-R mapping)
HDF5 EDM model
SQL Server
69
Thank you all, and thank you NASA!
70
Acknowledgement
  • This report is based upon work supported in part
    by a Cooperative Agreement with NASA under NASA
    NNG05GC60A. Any opinions, findings, and
    conclusions or recommendations expressed in this
    material are those of the author(s) and do not
    necessarily reflect the views of the National
    Aeronautics and Space Administration.

71
Questions/comments?
72
Information Sources
  • HDF website
  • http://hdfgroup.org/
  • HDF5 Information Center
  • http://hdfgroup.org/HDF5/
  • HDF Helpdesk
  • hdfhelp@hdfgroup.org
  • HDF users mailing list
  • hdfnews@ncsa.uiuc.edu (coming soon:
    news@hdfgroup.org)