A View of the Ameriflux Data June ORNL Download

Provided by: catharine6
Transcript and Presenter's Notes

1
A View of the Ameriflux Data June ORNL Download
  • 19 July 2006
  • Catharine van Ingen
  • Microsoft Research E-Science Group

2
Outline
  • Intentions and goals
  • Brief overview of the data downloaded
  • Various views of the data
  • Available data (by count) viewed by site, type,
    etc.
  • Why data curation and data analysis are
    intertwined

3
Intentions and Goals
  • Give the computing team an overview of the data
  • We're trying to understand performance, schema,
    and web page design
  • Give some examples of what the tools can do
  • There are many more possibilities
  • Have a little fun with numbers

4
Ameriflux Overview
  • 149 Sites across the Americas
  • Each site reports a minimum of 22 common
    measurements.
  • Communal science: each principal investigator
    acts independently to prepare and publish data.
  • Data published to and archived at Oak Ridge.
  • Total data reported to date on the order of 110M
    half-hourly measurements.
  • http://public.ornl.gov/ameriflux/

5
June ORNL Download Overview
  • Data automatically downloaded on June 16, 2006
    from http://cdiac.esd.ornl.gov/programs/ameriflux/data_system/aamer.html
  • 61 sites reporting data
  • 627 unique measurement column headings
  • 110M valid measurements
  • Early data checking performed
  • Data are always valid single precision floating
    point numbers
  • Data gaps indicated by -9999., -99999., 9999.
  • One and only one measurement from the same site
    at the same time unless identified as a repeat
    measurement
  • Data-type-specific range and other sanity checks
    pending

6
June ORNL Measurement Classification
  • The discovered column headings are represented
    as
  • Datumtype, repeat, _offset_offset, extended
    datumtype, units
  • Datumtype: the short (<16 characters) name for
    the data.
  • Examples: TA, PREC, or LE.
  • Repeat: an optional number indicating that
    multiple measurements were taken at the same site
    and offset.
  • Examples include TA2.
  • _offset_offset: major and minor part of the z
    offset.
  • Examples: SWC_10 (SWC at 10 cm) or TA_10_7 (TA at
    10.7 m).
  • Extended datumtype: any remaining column text.
  • Examples: fir, E, sfc.
  • Units: measurement units (should only be one!)
  • Examples: w/m2, or deg C.
7
What the Classification Means
  • Both the datumtype and extended datumtype are
    sometimes necessary to uniquely name some
    measurement types
  • Only 37 datumtypes are currently used
  • These account for 94% of all data
  • "Other" is a catch-all for all others; the
    extended datumtype is used to differentiate them
  • Examples are Albedo and NEEP
  • The extended datumtype also modifies the
    datumtype
  • Examples are LE_actual and LE_potential or U_x,
    U_y, U_z.
  • Each of datumtype, extended datumtype, repeat and
    offset may be necessary to uniquely specify a
    specific measurement
  • Examples are SWC4_10, Rn2_sfc, FC_WPL_LE_4_65

8
And now for some plots
9
Claimers and Disclaimers
  • Y-axis is always the count of available (non-gap)
    data measurements
  • X-axis and color used to differentiate two other
    attributes
  • Often too many attributes to use color to
    distinguish reliably
  • Goal is to look at the entire data set rather
    than to show specific details
  • Some blurry rendering due to conversions and/or
    too many axis legends or plotted values
  • Detailed drilldown with fewer contributing
    attributes or sites or ? is possible
  • Some cut-and-paste errors may occur
  • This isn't science or computer science, but a
    step to information science
  • Comments in blue are what a non-scientist sees in
    the figures
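
Each plot that follows is a count of non-gap measurements grouped by two attributes, one on the X-axis and one shown as color. A minimal sketch of that grouping, assuming each measurement is a dictionary with a `value` field plus attribute fields; the record layout and the name `count_available` are hypothetical:

```python
from collections import Counter

# Gap markers from the download overview slide.
GAP_SENTINELS = {-9999.0, -99999.0, 9999.0}


def count_available(records, x_key, color_key):
    """Count non-gap measurements per (x attribute, color attribute) pair."""
    counts = Counter()
    for rec in records:
        if rec["value"] not in GAP_SENTINELS:
            counts[(rec[x_key], rec[color_key])] += 1
    return counts


# Made-up records for illustration only; not real Ameriflux data.
records = [
    {"site": "SiteA", "year": 2004, "value": 12.5},
    {"site": "SiteA", "year": 2004, "value": -9999.0},  # gap, not counted
    {"site": "SiteA", "year": 2004, "value": 13.1},
    {"site": "SiteB", "year": 2005, "value": 7.0},
]
by_year_and_site = count_available(records, "year", "site")
```

Swapping `x_key` and `color_key` gives the paired views used below, such as by year colored by site versus by site colored by year.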

10
Total Data Available by Year, Colored by Site
Overall number of sites and data taken is growing
11
Total Data Availability by Month, Colored by Site
While there is some tendency away from taking
data in the winter, most sites report throughout
the year
12
Total Data Availability by Site, Colored by Year
Many sites come and go after 4-5 years
13
Total Data Availability by Site, Colored by Type
Sites report more data either because of
longevity or ?
14
Total Data Availability by Type, Colored by Site
Data type reporting is far from uniform across
types
15
Other Data Types, Colored by Site
There is a long tail in the "other" extended
types. How generally useful are these?
16
Other Data Types, Colored by Extended Type
A very few sites account for the vast bulk of
"other" extended data types
17
Non-Zero Repeat Counts by Site, Colored by
Repeat Count
A very few sites account for the vast bulk of
repeated measurements
18
Non-Zero Repeat Counts by Type, Colored by Repeat
Count
Are repeats just another way of reporting a
different offset?
19
Non-Zero Offsets by Type, Colored by Offset
Magnitude
Measuring soil properties (SWC, TA) at different
offsets must be important for science. Measuring
others (TA, CO2, RH) may be important or just
convenient?
20
Non-Zero Offsets by Offset, Colored by Type
Soil property measurements tend to be reported at
common offsets.
21
Extended Data Type, Colored by Major Data Type
The most common extended data type (_cum) is just
a derived value (cumulative) and affects only
PREC (rainfall). Are PAR_OUT, Rg_OUT and Rgl_OUT
similar? Some extended data types are just unit
conversion issues.
22
Why data curation and data analysis are
intertwined, or now for something fun. Thanks to
Gretchen Miller (Gmiller_at_berkeley.edu) for the
idea.
23
Average Reported Temperature by Latitude
What's going on at higher latitudes? (It should
be getting colder)
24
Data Availability by Month at Higher Latitudes
Colder month data is missing at northern sites!
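
The effect on the last two slides can be illustrated with a small worked example: averaging only the available months overstates the annual temperature when the cold months are gaps. The monthly values below are made up for illustration and are not Ameriflux data; `summarize` is a hypothetical helper.

```python
def summarize(monthly_temps):
    """Return (mean of available months, list of missing month numbers).

    monthly_temps maps month number (1-12) to a mean temperature in deg C,
    or to None where the month is a data gap.
    """
    available = {m: t for m, t in monthly_temps.items() if t is not None}
    mean = sum(available.values()) / len(available)
    missing = sorted(set(range(1, 13)) - set(available))
    return mean, missing


# Made-up monthly means for a hypothetical northern site.
full_year = {1: -20, 2: -18, 3: -10, 4: 0, 5: 8, 6: 14,
             7: 16, 8: 14, 9: 8, 10: 0, 11: -10, 12: -18}

# The same site with the colder months reported as gaps.
cold_months = {1, 2, 3, 11, 12}
with_gaps = {m: (None if m in cold_months else t) for m, t in full_year.items()}

full_mean, _ = summarize(full_year)
gappy_mean, missing = summarize(with_gaps)
```

Here `gappy_mean` is well above `full_mean`, and `missing` lists the cold months, which is why a month-by-month availability check belongs next to any average over incomplete data.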
25
Questions, Comments, Suggestions to
bwc-tci_at_lists.berkeley.edu