CDF-UK MINI-GRID - PowerPoint PPT Presentation

About This Presentation
Title:

CDF-UK MINI-GRID

Description:

CDF collaborators in the UK applied for JIF grant for IT equipment in 1998. ... Data copier. Globus toolkit. Job Submission. 3rd Nov 2000. HEPiX/HEPNT 2000. 12 ...

Slides: 16
Provided by: ianmca
Tags: cdf | grid | mini | copier

Transcript and Presenter's Notes

Title: CDF-UK MINI-GRID


1
CDF-UK MINI-GRID
  • Ian McArthur
  • Oxford University, Physics Department
  • Ian.McArthur_at_physics.ox.ac.uk

2
Background
  • CDF collaborators in the UK applied for JIF grant
    for IT equipment in 1998. Awarded 1.67M in
    summer 2000.
  • First half of grant will buy:
  • Multiprocessor systems plus 1TB of disk for 4
    Universities
  • 2 multiprocessors plus 2.5 TB of disk for RAL
  • A 32 CPU farm for RAL
  • 5 TB of disk and 8 high end workstations for
    FNAL
  • Emphasis on high IO throughput
    super-workstations.
  • A dedicated network link from London to FNAL

3
CDF-UK Equipment Bid
4
Hardware and Network
  • Tender document is written and the schedule is
    on target for equipment delivery in May 2001.
    Second phase starts June 2002.
  • Developed a scheme for transparent access to CDF
    systems via the US link.
  • Each system that CDF-UK needs to reach over the
    link has an alternative IP name and address, so
    that data can be sent down the dedicated link.
  • A Network Address Translation scheme ensures that
    return traffic takes the same path (symmetric
    routing)
  • Demonstrated the scheme working with 2 Cisco
    routers on a local network.
  • Starting to talk to network providers to
    implement physical link.
  • Must try to make Kerberos work across this link
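
The alias-plus-NAT idea above can be illustrated with a toy model. This is only a sketch of one way such a scheme might behave, not the actual router configuration: the addresses (taken from documentation ranges), the table names and the choice to translate both source and destination at the link router are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


# Hypothetical addresses, not the real CDF-UK or FNAL ones.
FNAL_ALIAS_TO_REAL = {"192.0.2.10": "198.51.100.10"}  # alias -> real FNAL host
UK_REAL_TO_ALIAS = {"203.0.113.5": "192.0.2.105"}     # UK host -> link-side alias

FNAL_REAL_TO_ALIAS = {real: alias for alias, real in FNAL_ALIAS_TO_REAL.items()}
UK_ALIAS_TO_REAL = {alias: real for real, alias in UK_REAL_TO_ALIAS.items()}


def path_for(dst: str) -> str:
    """Only traffic addressed to a link-side alias uses the dedicated link."""
    if dst in FNAL_ALIAS_TO_REAL or dst in UK_ALIAS_TO_REAL:
        return "dedicated-link"
    return "default-route"


def translate_forward(p: Packet) -> Packet:
    """Link router, UK to FNAL: rewrite both ends so replies target the router."""
    return Packet(UK_REAL_TO_ALIAS.get(p.src, p.src),
                  FNAL_ALIAS_TO_REAL.get(p.dst, p.dst))


def translate_reply(p: Packet) -> Packet:
    """Link router, FNAL to UK: undo the forward translation."""
    return Packet(FNAL_REAL_TO_ALIAS.get(p.src, p.src),
                  UK_ALIAS_TO_REAL.get(p.dst, p.dst))


request = Packet(src="203.0.113.5", dst="192.0.2.10")
delivered = translate_forward(request)
reply = Packet(src=delivered.dst, dst=delivered.src)
returned = translate_reply(reply)
```

Because the reply is addressed to the link-side alias of the UK host, it is routed back to the same router and so returns over the dedicated link, which is the symmetric routing property the slide describes.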

5
(No Transcript)
6
Software Project
  • JIF proposal only covered hardware but in the
    meantime GRID has arrived!
  • Aim to provide a scheme to allow efficient use of
    the new equipment and other distributed
    resources.
  • Concentrate on solving real-user issues.
  • Develop an architecture for locating data, data
    transfer and job submission within a distributed
    environment
  • Based on the GRID architecture, initially on top
    of the Globus toolkit. This gives us experience
    in this rapidly developing field.

7
Some Requirements
  • Want an efficient environment, so automate
    routine tasks as much as possible
  • With few resources available must make best use
    of the existing packages and require few or no
    modifications to existing software.
  • To make best use of the systems available:
  • data may need to be moved to where there is
    available CPU,
  • or a job may need to be submitted to a remote
    site to avoid moving the data.
  • Produce a simple but useful system ASAP.

8
Design principles
  • All sites are equal
  • All sites hold meta-data describing only local
    data
  • Use LDAP to publish meta-data kept in:
  • Oracle - at FNAL
  • mSQL - at most other places
  • may go to MySQL
  • Can introduce caching but keep it simple at first
  • Using local intelligence at each end of a data
    transfer allows us to take account of local
    idiosyncrasies, e.g. use of near-line storage,
    disk space management
  • Use existing Disk Inventory Manager
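
A minimal sketch of what "all sites are equal" and "local meta-data only" could mean for a data locator: it asks every site for its own holdings and merges the answers, with no master catalogue. The class and function names, the dataset name and the fileset identifiers are hypothetical, and the in-memory dictionary stands in for each site's LDAP-published metadata.

```python
from typing import Dict, List


class SiteCatalogue:
    """Stand-in for one site's published metadata about its local data only."""

    def __init__(self, name: str, filesets_by_dataset: Dict[str, List[str]]):
        self.name = name
        self._local = filesets_by_dataset

    def search(self, dataset: str) -> List[str]:
        # A real site would answer this from its own database via LDAP.
        return list(self._local.get(dataset, []))


def locate(dataset: str, sites: List[SiteCatalogue]) -> Dict[str, List[str]]:
    """Ask each site in turn; no site is treated as a master catalogue."""
    found: Dict[str, List[str]] = {}
    for site in sites:
        filesets = site.search(dataset)
        if filesets:
            found[site.name] = filesets
    return found


# Hypothetical holdings, for illustration only.
sites = [
    SiteCatalogue("Oxford", {"example-dataset": ["FS0001"]}),
    SiteCatalogue("RAL", {"example-dataset": ["FS0001", "FS0002"]}),
    SiteCatalogue("FNAL", {"other-dataset": ["FS0100"]}),
]
where = locate("example-dataset", sites)
```

Querying every site on each lookup is the simple option; the caching mentioned above would be a later optimisation on top of this.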

9
CDF Data
  • Dataset: a primary dataset contains all the
    processed data from a specific physics channel.
  • Secondary datasets are made by event selection.
  • Datasets will grow over time as more data is
    taken and data continues to be processed.
  • Fileset: the smallest collection of data which
    can be requested from the data handling system.
    At Fermilab, a fileset is mapped to a single
    partition on a tape and contains a few files.
  • File: a member of a fileset. The smallest unit
    of data known to a filesystem, typically 1GB.
  • Metadata: stores relationships between files,
    filesets and datasets, run conditions,
    luminosity etc.
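
The hierarchy above (dataset, fileset, file) could be modelled roughly as follows. This is an illustrative sketch, not the CDF data handling schema: the field names, the optional `parent` link for secondary datasets and the example identifiers are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class File:
    name: str
    size_gb: float = 1.0  # the slides give 1GB as a typical file size


@dataclass
class Fileset:
    """Smallest unit that can be requested from the data handling system."""
    fileset_id: str
    files: List[File] = field(default_factory=list)

    def size_gb(self) -> float:
        return sum(f.size_gb for f in self.files)


@dataclass
class Dataset:
    name: str
    filesets: List[Fileset] = field(default_factory=list)
    parent: Optional[str] = None  # assumed: primary dataset a secondary one was selected from

    def add_fileset(self, fileset: Fileset) -> None:
        # Datasets grow over time as more data is taken and processed.
        self.filesets.append(fileset)

    def size_gb(self) -> float:
        return sum(fs.size_gb() for fs in self.filesets)


# Hypothetical names, for illustration only.
fs1 = Fileset("FS0001", [File("f1"), File("f2"), File("f3")])
fs2 = Fileset("FS0002", [File("f4"), File("f5")])
primary = Dataset("example-primary", [fs1])
primary.add_fileset(fs2)
```

A size lookup of this kind is what a data copier or user interface would need before deciding whether a transfer is worthwhile.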

10
Data Location/Copy
11
Layers
User Interface
Dataset maintainer
...
Data locator
Data copier
Job Submission
Globus toolkit
12
Functionality at a site
  • A mechanism to allow jobs from participating
    sites to be run.
  • Publication of the local metadata
  • Publication of information about other system
    resources (CPU, Disk, Batch queues etc).
  • Transmission of data via network.
  • This may involve staging of data from tape to
    disk before transmission.
  • Receive data from the network or from tapes.
  • Copy or construct metadata
  • Some sites may have reduced functionality

13
Scope
  • Plan to install at
  • 4 UK universities (Glasgow, Liverpool, Oxford,
    UCL)
  • RAL
  • FNAL (although this would be reduced
    functionality, data and metadata exporter)
  • More non-UK sites could be included
  • Intend to have basic utilities in place at time
    of equipment installation (May 2001)

14
Work so far
  • Project plan under development; once finished,
    additional resources will be requested.
  • Globus installed at a number of sites. Remote
    execution of shell commands checked.
  • Some bits demonstrated:
  • LDAP to Oracle via Python script
  • Python is a convenient scripting language for
    the job
  • May use a daemon to hold the connection to Oracle
  • LDAP: only search is implemented, and even this
    is quite tricky because the script should
    support filter, base and scope.
  • LDAP schema will not reflect full SQL schema but
    just what is needed.
  • Java to LDAP (via JNDI)
  • JNDI (Java Naming and Directory Interface) gives
    a very elegant interface to LDAP

15
Longer Term Goals
  • User Interface to be implemented as Java
    application to give platform independence.
  • UI to automate or suggest strategies for moving
    data/submitting jobs
  • Need to include cost/elapsed time estimates for
    task completion
  • Need to look up dataset sizes, network health,
    time to copy from tape or disk, CPU load etc.
  • Look for more generic solutions
  • Evaluate any new GRID tools which might
    standardize any parts we've implemented
    ourselves.
  • Consolidation with other GRID projects
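
The choice between moving data and submitting a remote job could be driven by a simple elapsed-time estimate. The sketch below is an assumption about how such a comparison might look, not a planned design: the function names, the inputs and the model (tape staging time plus transfer time plus CPU time, against remote queue time plus CPU time) are illustrative, and all figures in the example are made up.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    strategy: str
    hours: float


def copy_hours(size_gb: float, link_mbit_per_s: float,
               tape_stage_hours: float = 0.0) -> float:
    """Time to stage from tape (if needed) and copy over the network."""
    seconds = size_gb * 8 * 1000 / link_mbit_per_s  # GB -> megabits (decimal units)
    return tape_stage_hours + seconds / 3600


def choose_strategy(size_gb: float, link_mbit_per_s: float,
                    local_cpu_hours: float, remote_cpu_hours: float,
                    remote_queue_hours: float,
                    tape_stage_hours: float = 0.0) -> Estimate:
    """Pick whichever option has the smaller estimated elapsed time."""
    move = Estimate(
        "copy data, run locally",
        copy_hours(size_gb, link_mbit_per_s, tape_stage_hours) + local_cpu_hours,
    )
    remote = Estimate(
        "submit job to remote site",
        remote_queue_hours + remote_cpu_hours,
    )
    return min((move, remote), key=lambda e: e.hours)


# Made-up figures: 100 GB over a 100 Mbit/s link, 5 CPU hours either way.
quiet_remote = choose_strategy(100, 100, 5, 5, remote_queue_hours=1)
busy_remote = choose_strategy(100, 100, 5, 5, remote_queue_hours=10)
```

With these figures the copy takes a little over two hours, so the remote job wins when its queue is short and the local copy wins when the remote queue is long. A real estimate would also need measured network health and CPU load.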