Best Practices Writing (Read-only archives of) netCDF (version 3)

1
Best Practices Writing (Read-only archives of) netCDF (version 3)
  • John Caron
  • Unidata
  • June 28, 2007

2
Overview
  • NetCDF solves file syntax; writers and readers
    need to agree on semantics
  • Goal 1: intelligible by humans
  • Goal 2: readable by standard tools
  • Write a Conventions document
  • Types of metadata:
  • Structural metadata: ncdump -h
  • Use metadata: units, coordinates
  • Search metadata: bounding boxes, time ranges,
    standard variable names, keywords

3
It's on the web
  • http://www.unidata.ucar.edu/software/netcdf/docs/BestPractices.html
  • http://www.unidata.ucar.edu/software/netcdf/docs/workshop/bestpractices/index.html

4
NetCDF-3 Data Model

5
Attributes
  • Use standard attribute names if possible
  • netCDF Users Guide, CF-1.0
  • Use numeric types when appropriate
  • calibration = "23.7" // string
  • calibration = 23.7f // float
  • Can be multivalued
  • special = 23.7f, 10.6f
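
The difference between a string attribute and a numeric one can be sketched in plain Python. This is only an illustration: dicts stand in for netCDF attributes, no netCDF library is involved, and the helper name calibration_factor is hypothetical.

```python
# Plain dicts stand in for netCDF attributes; no netCDF library is used.
string_attrs = {"calibration": "23.7"}     # value stored as text
float_attrs = {"calibration": 23.7}        # value stored as a number
multi_attrs = {"special": [23.7, 10.6]}    # multivalued numeric attribute


def calibration_factor(attrs):
    """Return the calibration attribute as a float (hypothetical helper)."""
    value = attrs["calibration"]
    if isinstance(value, str):
        # A reader must guess that the text holds a number and parse it.
        return float(value)
    return value


factor_from_text = calibration_factor(string_attrs)
factor_from_number = calibration_factor(float_attrs)
total_special = sum(multi_attrs["special"])
```

With a numeric attribute the reader gets a usable value directly; with a string attribute every reader has to repeat the parsing step and agree on the format.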

6
Global Attributes
  • Conventions = "NCAR-RAF/nimbus"
  • Put document on the web, send us a link
  • Search metadata:
  • bounding boxes, time ranges, keywords
  • NetCDF Attribute Convention for Dataset Discovery
  • Many other sources:
  • CF-1.0, FGDC, ISO, Dublin Core

7
Variable Attributes
  • long_name: human-readable plot title
  • units: udunits compatible
  • sps vs s-1
  • display_units: NO3 ppm
  • Missing values:
  • _FillValue: never written
  • missing_value
  • valid_min, valid_max, valid_range
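
A reader might combine these attributes roughly as follows. This is a minimal sketch in plain Python, assuming attributes arrive as a dict; the function name is_missing and the sample values are illustrative, and real tools may apply the attributes in a different order.

```python
def is_missing(value, attrs):
    """Return True if value should be treated as missing under attrs."""
    if "_FillValue" in attrs and value == attrs["_FillValue"]:
        return True
    if "missing_value" in attrs and value == attrs["missing_value"]:
        return True
    if "valid_range" in attrs:
        low, high = attrs["valid_range"]
        return not (low <= value <= high)
    if "valid_min" in attrs and value < attrs["valid_min"]:
        return True
    if "valid_max" in attrs and value > attrs["valid_max"]:
        return True
    return False


# Attribute values modelled on a longitude variable; the data are made up.
lon_attrs = {"_FillValue": -32767.0, "valid_range": (-180.0, 180.0)}
values = [-105.2, -32767.0, 181.5, 42.0]
clean = [v for v in values if not is_missing(v, lon_attrs)]
```

Here the fill value and the out-of-range 181.5 are both dropped, so only the two plausible longitudes remain in clean.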

8
Dimensions
  • Name: make it meaningful
  • vector16 vs bins16, wind_vector
  • char date(vector16) vs date(date_strlen)
  • Shared dimensions imply shared coordinates
  • char date(dim16)
  • float P(time,dim16) // BAD DOG!
  • versus
  • char date(date_strlen)
  • float P(time,bins16)
  • float T(time,bins16) // GOOD BOY!!

9
  • Example Conventions:
  • http://www.unidata.ucar.edu/software/netcdf/conventions.html
  • Debugging Tool:
  • http://www.unidata.ucar.edu/software/netcdf-java/v2.2/webstart-dev/index.html

10
Nimbus Options (1)
  • Data Types:
  • Allow any datatype
  • Use scale/offset to save space
  • Units:
  • Change units to be udunits compatible
  • Add display_units (?) attributes
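
The scale/offset idea can be sketched as follows, using the standard scale_factor and add_offset attribute names, where unpacked = packed * scale_factor + add_offset. The function names are hypothetical, the sketch assumes data_max > data_min, and it reserves the most negative integer code for a fill value as one possible design choice.

```python
def compute_packing(data_min, data_max, nbits=16):
    """Choose scale_factor and add_offset for a signed nbits integer."""
    # Use codes -(2**(nbits-1) - 1) .. +(2**(nbits-1) - 1); the most
    # negative code is left free so it can serve as a fill value.
    n_steps = 2 ** nbits - 2
    scale_factor = (data_max - data_min) / n_steps
    add_offset = (data_max + data_min) / 2.0
    return scale_factor, add_offset


def pack(value, scale_factor, add_offset):
    """Convert a float to its packed integer code."""
    return int(round((value - add_offset) / scale_factor))


def unpack(packed, scale_factor, add_offset):
    """Recover an approximate float from a packed integer code."""
    return packed * scale_factor + add_offset


scale, offset = compute_packing(-50.0, 50.0)
code = pack(12.34, scale, offset)
restored = unpack(code, scale, offset)
```

Storing a float as a 16-bit short halves the space, at the cost of a rounding error of at most half of scale_factor per value.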

11
Coordinate Variables
  • dimensions:
  • time = 1761
  • lat = 180
  • lon = 360
  • z = 42
  • variables:
  • int time(time)
  • units = "seconds since 1970-1-1 0:00:00 0:00"
  • double lat(lat)
  • units = "degrees_north"
  • double lon(lon)
  • units = "degrees_east"
  • double z(z)
  • units = "m"
  • positive = "up"
  • float data(time,z,lat,lon)
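
An integer time coordinate with units of seconds since 1970-1-1 can be decoded with the Python standard library alone. This is a sketch that assumes the reference time is in UTC; decode_time is a hypothetical helper, not part of any netCDF API.

```python
from datetime import datetime, timedelta, timezone

# Reference time taken from the units string; UTC is assumed here.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_time(seconds_since_epoch):
    """Convert one value of the time coordinate to a datetime."""
    return EPOCH + timedelta(seconds=seconds_since_epoch)


first = decode_time(0)
later = decode_time(1182988800)
```

Because the units string carries the reference date, any reader can do this conversion without knowing anything else about the file.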

12
Coordinate Variables
  • Variable name same as dimension name
  • Strictly monotonic values
  • No missing values
  • Simple case:
  • All coordinates are 1D coordinate variables
  • Data variables have one dimension for each
    coordinate
  • data(time,z,lat,lon)
  • These rules are correct only for gridded (model) data
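
The three rules for a coordinate variable can be checked mechanically. The following is a minimal sketch in plain Python; the function names and the way a variable is passed in (name, dimension names, values, optional fill value) are assumptions made for illustration.

```python
def is_strictly_monotonic(values):
    """True if values are strictly increasing or strictly decreasing."""
    pairs = list(zip(values, values[1:]))
    increasing = all(a < b for a, b in pairs)
    decreasing = all(a > b for a, b in pairs)
    return increasing or decreasing


def check_coordinate_variable(var_name, dim_names, values, fill_value=None):
    """Return a list of problems; an empty list means the rules are met."""
    problems = []
    if list(dim_names) != [var_name]:
        problems.append("name must match its single dimension")
    if fill_value is not None and any(v == fill_value for v in values):
        problems.append("contains missing values")
    if not is_strictly_monotonic(values):
        problems.append("values are not strictly monotonic")
    return problems


ok = check_coordinate_variable("lat", ["lat"], [-89.5, -88.5, -87.5])
bad = check_coordinate_variable("lat", ["lat"], [0.0, 1.0, 1.0])
```

A repeated value, as in the second call, is enough to break strict monotonicity, which is why the check uses < and > rather than <= and >=.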

13
Stationary Buoy (first attempt)
  • dimensions:
  • time = unlimited
  • lat = 1
  • lon = 1
  • float data(time, lat, lon)
  • int time(time)
  • double lat(lat)
  • double lon(lon)
  • Only works when lat=1, lon=1 (single buoy per
    file)

14
Multiple Buoys per file
  • float data(buoy,time)
  • int time(time)
  • int buoy(buoy)
  • long_name = "buoy id"
  • double lat(buoy)
  • double lon(buoy)

15
Aircraft (Trajectory) Coordinates
  • float data(pt)
  • int time(pt)
  • double altitude(pt)
  • double lat(pt)
  • double lon(pt)

16
2D Coordinates
  • float data(time,z,y,x)
  • int time(time)
  • double z(z)
  • double y(y)
  • double x(x)
  • double lat(y,x)
  • double lon(y,x)

17
Generalize Coordinate Variable to Coordinate Axis
  • Can be multidimensional
  • Name can be different from the dimension
  • A set of axes for a variable is called a
    Coordinate System
  • How to associate a Coordinate System with a
    variable?
  • float data(pt)
  • data:coordinates = "lat lon altitude time"
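
One way a reader could resolve a coordinates attribute into a coordinate system is sketched below. Nested dicts stand in for the file's variables, and coordinate_system is a hypothetical helper rather than a call from any netCDF library.

```python
def coordinate_system(var_name, variables):
    """Map each axis named in the coordinates attribute to its dimensions."""
    attrs = variables[var_name]["attrs"]
    axis_names = attrs.get("coordinates", "").split()
    unknown = [name for name in axis_names if name not in variables]
    if unknown:
        raise KeyError(f"coordinates names unknown variables: {unknown}")
    return {name: variables[name]["dims"] for name in axis_names}


# Trajectory layout: every variable shares the single dimension pt.
variables = {
    "data": {"dims": ("pt",), "attrs": {"coordinates": "lat lon altitude time"}},
    "time": {"dims": ("pt",), "attrs": {}},
    "altitude": {"dims": ("pt",), "attrs": {}},
    "lat": {"dims": ("pt",), "attrs": {}},
    "lon": {"dims": ("pt",), "attrs": {}},
}
axes = coordinate_system("data", variables)
```

The attribute is just a space-separated list of variable names, so the lookup works even though none of the axes is named after the dimension pt.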

18
Nimbus Coordinates
  • coordinates = "LONC LATC GGALT Time"
  • float LONC(Time=7741)
  • _FillValue = -32767.0f // float
  • units = "degree_E"
  • long_name = "GPS-Corrected Inertial Longitude"
  • valid_range = -180.0f, 180.0f // float
  • Category = "Position"
  • standard_name = "longitude"

19
Nimbus Recommendations (2)
  • Document Coordinates:
  • All variables have the same coordinate system,
    described by the coordinates global attribute
  • Coordinate variables have a standard_name attribute
    describing the coordinate type: latitude, longitude,
    altitude, time
  • Are missing values possible?
  • CF-1.0 units for lat/lon: degrees_east,
    degrees_north (decimal degrees)

20
Bin Coordinates
  • float AS200_RWO(Time=7741, sps1=1, Vector31=31)
  • _FillValue = -32767.0f
  • units = "count"
  • long_name = "SPP-200 (PCASP) Raw Accumulation (per cell) - DMT"
  • Category = "PMS Probe"
  • missing_value = -32767.0f
  • SampledRate = 10
  • DataQuality = "Preliminary"
  • SerialNumber = "PCAS108"
  • FirstBin = 6 // int
  • LastBin = 30 // int
  • CellSizes = 0.05f, 0.065f, 0.08f, 0.095f,
    0.11f, 0.125f, 0.14f, 0.155f, 0.17f, 0.185f,
    0.2f, 0.215f, 0.23f, 0.3f, 0.43f, 0.56f, 0.69f,
    0.82f, 0.95f, 1.1f, 1.25f, 1.4f, 1.55f, 1.7f,
    1.85f, 2.0f, 2.15f, 2.3f, 2.45f, 2.6f, 2.75f
  • CellSizeUnits = "micrometers"

21
Bin Coordinates (alt)
  • float AS200_RWO(Time=7741, sps1=1,
    AS200_RWO_BINS=31)
  • long_name = "SPP-200 (PCASP) Raw Accumulation"
  • _FillValue = -32767.0f
  • units = ""
  • display_units = "count"
  • coordinates = "LONC LATC GGALT Time
    AS200_RWO_BINS"
  • float AS200_RWO_BINS(AS200_RWO_BINS=31)
  • FirstBin = 6
  • LastBin = 30
  • data:
  • AS200_RWO_BINS = 0.05f, 0.065f, 0.08f, 0.095f,
    0.11f, 0.125f, 0.14f, 0.155f, 0.17f, 0.185f,
    0.2f, 0.215f, 0.23f, 0.3f, 0.43f, 0.56f, 0.69f,
    0.82f, 0.95f, 1.1f, 1.25f, 1.4f, 1.55f, 1.7f,
    1.85f, 2.0f, 2.15f, 2.3f, 2.45f, 2.6f, 2.75f

22
Bin Coordinates (alt)
  • Advantages:
  • Can be written outside of define mode
  • More likely to be interpreted by standard tools

23
Station data (same number of pts at each station)
  • float data(station, time)
  • int time(time)
  • double altitude(station)
  • double lat(station)
  • double lon(station)

24
Station data (different number of pts at each station)
  • dimensions:
  • record = unlimited
  • char station_name(station, strlen)
  • double altitude(station)
  • double lat(station)
  • double lon(station)
  • int firstChild(station) // record index
  • int numChildren(station)
  • float data1(record)
  • float data2(record)
  • float data3(record)
  • float time(record)
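
The firstChild and numChildren index variables can be used as in the sketch below. It assumes that all records for one station are stored contiguously along the record dimension; the function names and sample data are hypothetical, and plain lists stand in for netCDF variables.

```python
def build_index(counts):
    """Build firstChild and numChildren lists from per-station counts."""
    first_child = []
    start = 0
    for n in counts:
        first_child.append(start)
        start += n
    return first_child, list(counts)


def station_records(station, first_child, num_children, record_var):
    """Return the slice of a record variable that belongs to one station."""
    start = first_child[station]
    return record_var[start:start + num_children[station]]


# Three stations with 3, 0 and 2 observations respectively.
first_child, num_children = build_index([3, 0, 2])
data1 = [10.0, 10.5, 11.0, 20.0, 20.5]
third_station = station_records(2, first_child, num_children, data1)
```

Each station costs two integers of index, and reading its observations is a single contiguous slice of every record variable.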

25
NetCDF-3 file layout
  • (Figure labels: non-record variables, record
    variables, obs for one station)

26
Unidata Obs Data Conventions
  • Different number of groups of observations
  • Nested groups
  • Linked list, contiguous list
  • Additional complexity
  • Performance implications
  • http://www.unidata.ucar.edu/software/netcdf-java/formats/UnidataObsConvention.html
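
When observations for a station are not contiguous, a linked list can tie them together. The sketch below is an illustration only: the array names first_child and next_child and the use of -1 as an end marker are assumptions for this example, not a statement of what the convention document specifies.

```python
def linked_records(station, first_child, next_child):
    """Follow the chain of record indices that belong to one station."""
    indices = []
    rec = first_child[station]
    while rec >= 0:          # -1 marks the end of the chain (assumed)
        indices.append(rec)
        rec = next_child[rec]
    return indices


# Two stations whose records are interleaved in arrival order.
first_child = [0, 1]
next_child = [2, 3, 4, -1, -1]
station0 = linked_records(0, first_child, next_child)
station1 = linked_records(1, first_child, next_child)
```

The linked form lets records be appended in arrival order, but reading one station touches scattered records, which is where the extra complexity and the performance cost come from.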

27
Conclusions
  • NCAR-RAF/nimbus Conventions are quite good
  • Unidata is interested in helping out with future
    revisions, new formats
  • NetCDF-4 will offer new options
  • Standards are evolving: please help!
  • CF could be the standards umbrella