Research Data, May 5-7 2003

1
Research Data, May 5-7 2003
  • Archive Content
  • New Archive Developments
  • Archive Access and Provision

2
Archive Content
  • Systematic Dataset Updates
  • Significant and important effort

3
(No Transcript)
4
Systematic Dataset Updates, Strategies
  • Absolutely, must keep doing this
  • DSS bread and butter
  • Trend will be toward more network transfers
  • Some more frequently, per Forum suggestions
  • More behind the scenes work
  • Tighter data integrity checks
  • Identify data gaps and unit changes
  • Maintain media transfer capability for both input
    and output
  • Tapes and CD-ROMs are necessary
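
The "identify data gaps" check above could be sketched roughly as follows. This is a minimal illustration, assuming observations arrive at a fixed reporting interval; `find_gaps` and the sample report times are hypothetical and are not taken from any DSS tool.

```python
from datetime import datetime, timedelta


def find_gaps(times, step):
    """Return (last_seen, next_seen) pairs where consecutive reports are more than `step` apart."""
    ordered = sorted(times)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > step]


# Hypothetical 6-hourly reports with the 12:00 and 18:00 reports missing.
reports = [
    datetime(2003, 5, 5, 0),
    datetime(2003, 5, 5, 6),
    datetime(2003, 5, 6, 0),
    datetime(2003, 5, 6, 6),
]
gaps = find_gaps(reports, timedelta(hours=6))
for start, end in gaps:
    print(f"gap between {start} and {end}")
```

A unit-change check would need dataset-specific rules (for example, a sudden jump in the typical magnitude of a variable), so it is not sketched here.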

5
Archive Content
  • Harvest, apply, and make more metadata available
    for the users
  • Use NNR metadata to fix problems in the databases
  • Applies to forthcoming Reanalyses also
  • Provide users with relevant metadata
  • Systematically apply metadata
  • e.g. station history libraries

6
(No Transcript)
7
New Archive Developments
  • Acquisition of new datasets
  • ECMWF's ERA-40
  • LTO tape transfer, 15 TB, production finished
    (reruns?)
  • NCEP's Regional Reanalysis
  • Network transfer, 12 TB, production started
  • More ocean datasets for climate analysis and
    modeling
  • Response to NSF Panel recommendation
  • Will look for collaboration opportunities

8
New Archive Developments
  • Acquisition of new datasets
  • New near real-time collections from the UNIDATA
    server
  • Currently backing up 2.7 GB/day, beginning Nov.
    2001
  • Includes global station observations, low
    resolution model products, radar data, and
    profiler data
  • Collect a few more products
  • Build DSS datasets (metadata), and online access
  • Do more to help other Divisions
  • They are a barometer of the University
    community
  • e.g. with little effort we can help MMM, per Wei
    Wang's comments at the Forum
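
As a rough sense of scale for the near real-time backup quoted above, the accumulated volume can be estimated from the daily rate. This is only a back-of-the-envelope sketch: the exact start day in November 2001, the "as of" date, and a constant daily rate are all assumptions.

```python
from datetime import date

DAILY_BACKUP_GB = 2.7        # rate quoted on the slide
START = date(2001, 11, 1)    # assumption: the slide says only "Nov. 2001"
AS_OF = date(2003, 5, 5)     # assumption: first day of the meeting

days = (AS_OF - START).days
total_gb = days * DAILY_BACKUP_GB
print(f"{days} days -> about {total_gb:.0f} GB ({total_gb / 1000:.1f} TB)")
```

Under these assumptions the collection would amount to roughly 1.5 TB by the time of the presentation.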

9
Archive Access and Provision
  • Filling one-off data requests for users TBC
  • Data Discovery
  • Improved DSS guide documents to datasets,
    separated by research need, e.g. precipitation
  • Improved UCAR-wide search success through
    structured metadata catalogs, e.g. THREDDS
    catalogs
  • Better linkages between DSS and USS primers
  • Integrated view for the users!
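
To illustrate why structured catalogs help discovery, the sketch below lists the datasets in a small THREDDS-style catalog. The catalog content (dataset names and paths) is invented for illustration, the namespace is the one used by THREDDS InvCatalog 1.0 documents, and `list_datasets` is a hypothetical helper, not part of any THREDDS library.

```python
import xml.etree.ElementTree as ET

# Namespace used by THREDDS InvCatalog 1.0 documents.
NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Hypothetical catalog: names and paths are invented for illustration.
CATALOG_XML = """
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="Example archive catalog">
  <dataset name="Precipitation">
    <dataset name="Monthly station precipitation" urlPath="example/precip/monthly.nc"/>
  </dataset>
  <dataset name="Reanalysis">
    <dataset name="Example reanalysis, 6-hourly" urlPath="example/reanalysis/6hr.grb"/>
  </dataset>
</catalog>
"""


def list_datasets(xml_text):
    """Return (name, urlPath) for every dataset element that points at a file."""
    root = ET.fromstring(xml_text.strip())
    return [
        (ds.get("name"), ds.get("urlPath"))
        for ds in root.iter(f"{{{NS}}}dataset")
        if ds.get("urlPath")
    ]


for name, path in list_datasets(CATALOG_XML):
    print(f"{name}: {path}")
```

Because every entry carries the same fields, a search tool can filter on them directly instead of scraping free-form guide pages.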

10
Archive Access and Provision
  • Access from the MSS
  • New datasets will NOT be in COS blocked format
  • Provide more and better tools with the datasets
  • Simplify access programs, provide in other
    languages
  • Provide COS unblocking scripts for many computer
    platforms
  • We have this, but it is not well advertised
  • Provide helpful format conversion scripts
  • e.g. NCL or command-line tools to convert GRIB to
    netCDF.
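
A format conversion helper of the kind mentioned above might look like the following sketch. It wraps NCL's `ncl_convert2nc` command-line tool, which is a later NCL utility and not necessarily what the presenters had in mind; the `-o` output-directory option and the `.nc` output naming should be checked against the tool's own documentation, and the function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def grib_to_netcdf_command(grib_path, out_dir="."):
    """Build the ncl_convert2nc command line for one GRIB file."""
    return ["ncl_convert2nc", str(grib_path), "-o", str(out_dir)]


def convert(grib_path, out_dir="."):
    """Run the conversion if the tool is installed; return the expected output path."""
    if shutil.which("ncl_convert2nc") is None:
        raise RuntimeError("ncl_convert2nc not found on PATH")
    subprocess.run(grib_to_netcdf_command(grib_path, out_dir), check=True)
    # Assumption: the output keeps the input's base name with a .nc suffix.
    return Path(out_dir) / (Path(grib_path).stem + ".nc")
```

Keeping the command construction separate from execution makes the wrapper easy to test on machines where NCL is not installed.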

11
New Archive Developments
  • Stay involved with research projects that require
    data, e.g. reanalyses and CLIVAR
  • Why?
  • Focus on acquiring new data
  • Focus on improving extant observational archives
  • Leading recipient of new research data output
  • Benefit to our users

12
Archive Access and Provision
  • Access from the DSS server
  • Summary of status
  • Bottom line: it is working well!

13
Access from the DSS server
  • Statistics period: Jan - Dec 2002 (web only)
  • Server Activity
  • Total volume transferred by the server: 840 GB
  • User Activity
  • Number of unique users downloading data
    files: 9136
  • Number of repeat users downloading data files: 946
  • Top Ten for 2002 (not transcribed). Total: 275 different datasets

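
The 2002 figures quoted on this slide can be put in proportion with a little arithmetic. This sketch assumes repeat users are a subset of the unique users and that 1 GB = 1000 MB; the slide states neither.

```python
# Figures quoted on the slide (Jan - Dec 2002, web only).
TOTAL_VOLUME_GB = 840
UNIQUE_USERS = 9136
REPEAT_USERS = 946

repeat_share = REPEAT_USERS / UNIQUE_USERS
# Assumption: 1 GB = 1000 MB.
mean_mb_per_user = TOTAL_VOLUME_GB * 1000 / UNIQUE_USERS

print(f"repeat users: {repeat_share:.1%} of unique users")
print(f"mean download volume: {mean_mb_per_user:.0f} MB per unique user")
```

Under these assumptions, roughly one user in ten came back for more data, and the average download volume was on the order of 90 MB per user.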
14
Access from the DSS server
  • Much more data freely online for Web and FTP
    download
  • More online data request forms
  • large-scale data extractions, delayed mode
    processing
  • More real-time processing of data requests
  • small-scale data extractions
  • Note: these will be complementary access
    functions with the CDP
  • Currently developing a server upgrade plan (with
    DSG) to make this possible.

15
Access from the CDP
  • Make Reanalysis products available
  • Push to full scale for the first time
  • 40 years of gridded atmospheric data
  • About 2 TB for two products
  • Service concerns that require significant effort
  • Authorization and authentication of users
  • Integrated discovery and access interfaces
  • Coordinated service with DSS server
  • Ensure prompt response to user requests
  • Build on this experience; other datasets will
    follow

16
Access from the CDP
  • Should the DSS server and CDP be staged from the
    same machine?
  • Pros
  • Co-located CPU and disk
  • Could mean fast multiple services
  • Remove one server from the long list of servers
    that need administration, maintenance, and
    upgrade.

17
Access from the CDP
  • Cons
  • The DSS server must be available 24x7 and provide
    a stable level of service.
  • In contrast, the CDP is a developing facility and
    may not always be stable.
  • Access testing, software testing, and initial
    debugging
  • Shared disk cannot be used to reduce data storage
    requirements.
  • SANs may offer some options in the future.
  • To protect the current successful DSS service
  • Keep the DSS server and CDP separate for now
  • Reconsider merging in the future