Don Quijote - PowerPoint PPT Presentation

1
Don Quijote
  • Data Management for the ATLAS Automatic
    Production System

Miguel Branco, CERN ATC, miguel.branco_at_cern.ch
2
Overview
  • Don Quijote
  • New Focus
  • Functionalities
  • POOL
  • Architecture
  • Current Status
  • NorduGrid
  • US Grid 3
  • LCG-2
  • Integration with ATLAS prodsys
  • Future plans

3
Don Quijote
  • Data Management for the ATLAS Automatic
    Production System
  • Allow transparent registration and movement of
    replicas between all grid flavors used by ATLAS
  • US Grid
  • NorduGrid
  • LCG
  • (support for legacy systems might be introduced
    soon)
  • Avoid creating yet another catalog
  • which grid middleware wouldn't recognize (e.g.
    Resource Brokers)
  • use existing catalogs and data management tools
  • find common features between tools and catalogs
  • bridge them and provide a unified interface
  • Accessible as a service
  • lightweight clients
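
A minimal sketch of the "bridge them and provide a unified interface" idea, assuming one backend object per grid flavor. The class names, the `search` signature, and the example URL are hypothetical; the real backends would talk to the existing grid catalogs rather than to a dict.

```python
from abc import ABC, abstractmethod


class ReplicaCatalog(ABC):
    """Hypothetical common interface; one subclass per grid flavor."""

    @abstractmethod
    def search(self, lpn):
        """Return the replica URLs registered for a logical name."""


class InMemoryCatalog(ReplicaCatalog):
    """Stand-in for a real grid catalog; the data lives in a plain dict."""

    def __init__(self, replicas):
        self._replicas = replicas

    def search(self, lpn):
        return list(self._replicas.get(lpn, []))


class UnifiedCatalog:
    """Sends the same query to every configured grid flavor."""

    def __init__(self, backends):
        self._backends = backends  # flavor name -> ReplicaCatalog

    def search(self, lpn):
        return {name: cat.search(lpn) for name, cat in self._backends.items()}


# Example data is invented for illustration.
dq = UnifiedCatalog({
    "LCG": InMemoryCatalog({"dc2/evgen.0001": ["srm://se.example.org/evgen.0001"]}),
    "NorduGrid": InMemoryCatalog({}),
})
found = dq.search("dc2/evgen.0001")
```

The caller sees one `search` call regardless of how many catalogs sit behind it, which is the point of avoiding yet another catalog.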

4
Don Quijote: new focus
  • Provide a single tool to end-users to manage data
    files
  • Integrates all the tools that users would otherwise
    have to know about into a single one, e.g.
  • FCpublish, FCregister, (POOL File Catalogs)
  • edg-rm, edg-rmc, edg-lrc, (EDG)
  • globus-rls-cli, globus-url-copy, (Globus)
  • ldapsearch, (querying information system)
  • rfdir, rfcp, (common use of Castor)
  • Acts as a POOL-aware Replica Manager
  • Eases security requirements for end-users
  • Temporarily!

5
Functionalities
  • search, fullSearch, searchHosts ( lpn )
  • addRestricted ( lpn, url, guid, fsize, md5sum )
  • addTemporaryRestricted ( lpn, url, nrhours, guid, fsize, md5sum )
  • keepUntil ( url, nrhours )
  • makePermanent ( url )
  • removeReplica ( url )
  • remove ( lpn )
  • rename ( old lpn, new lpn )
  • stageOut ( url )
  • getToDestination ( src SE, lpn, dest )
  • putToSE ( src turl, lpn, dest SE, guid, md5sum )

Two groups of operations: replica catalog manipulation and file movement.
LPN: Logical Collection Name plus Logical File Name (unique).
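
To make the catalog operations concrete, here is a toy in-memory model. The semantics are guesses from the operation names, not taken from the DQ code: `nrhours` is treated as a lifetime counted from now, an expired replica simply stops appearing in searches, and an LPN is written as `collection/file`. The "restricted" aspect and the guid, fsize and md5sum arguments are left out.

```python
import time


class ToyCatalog:
    """In-memory toy; semantics are assumptions based on the names only."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._replicas = {}  # url -> {"lpn": str, "expires": float or None}

    def add(self, lpn, url, nrhours=None):
        # nrhours=None models a permanent replica, otherwise a temporary one.
        expires = None if nrhours is None else self._clock() + nrhours * 3600
        self._replicas[url] = {"lpn": lpn, "expires": expires}

    def keep_until(self, url, nrhours):
        self._replicas[url]["expires"] = self._clock() + nrhours * 3600

    def make_permanent(self, url):
        self._replicas[url]["expires"] = None

    def remove_replica(self, url):
        del self._replicas[url]

    def rename(self, old_lpn, new_lpn):
        for entry in self._replicas.values():
            if entry["lpn"] == old_lpn:
                entry["lpn"] = new_lpn

    def search(self, lpn):
        now = self._clock()
        return sorted(
            url for url, e in self._replicas.items()
            if e["lpn"] == lpn and (e["expires"] is None or e["expires"] > now)
        )


# A fake clock keeps the example deterministic.
now = [0.0]
cat = ToyCatalog(clock=lambda: now[0])
cat.add("coll/file1", "gsiftp://se1.example.org/file1")
cat.add("coll/file1", "gsiftp://se2.example.org/file1", nrhours=2)
now[0] = 3 * 3600  # three hours later the temporary replica has expired
visible_after_3h = cat.search("coll/file1")
```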
6
Functionalities - POOL
  • Integrates file movement with POOL XML File
    Catalogs
  • Uses DQ POOL FC command line tools
  • Python scripts
  • Use-cases
  • Get local copy of file and generate or update
    corresponding PoolFileCatalog.xml
  • (to provide input data and input POOL XML catalog
    for a job)
  • Copy and register a local copy of a file to a
    grid flavor given UUID in the local
    PoolFileCatalog.xml
  • (to register output data from a job)
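
The first use-case (generate or update PoolFileCatalog.xml after fetching a file) could look roughly like the sketch below. It is not the DQ script: the function name is hypothetical, and the element and attribute names follow the layout commonly seen in POOL XML catalogs, which should be checked against the POOL release in use. The sketch does not preserve the XML declaration or DOCTYPE line.

```python
import xml.etree.ElementTree as ET


def upsert_file(catalog_xml, guid, pfn, lfn, filetype="ROOT_All"):
    """Return catalog XML text with one <File> entry added or replaced."""
    if catalog_xml:
        root = ET.fromstring(catalog_xml)
    else:
        root = ET.Element("POOLFILECATALOG")
    # Drop any existing entry for the same GUID so the update is idempotent.
    for old in root.findall("File"):
        if old.get("ID") == guid:
            root.remove(old)
    entry = ET.SubElement(root, "File", ID=guid)
    physical = ET.SubElement(entry, "physical")
    ET.SubElement(physical, "pfn", filetype=filetype, name=pfn)
    logical = ET.SubElement(entry, "logical")
    ET.SubElement(logical, "lfn", name=lfn)
    return ET.tostring(root, encoding="unicode")


# Generate a new catalog, then update the same GUID with a new local path.
xml_text = upsert_file("", "GUID-0001", "/tmp/evgen.0001.root", "evgen.0001.root")
xml_text = upsert_file(xml_text, "GUID-0001", "/data/evgen.0001.root", "evgen.0001.root")
```

The second use-case runs the other way: look up the entry by UUID in the local catalog, then copy and register the file it points to.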

7
Architecture
  • Python Client
  • C client library
  • Configuration file indicating endpoint of each
    server
  • Servers
  • Per grid-flavor
  • GSI and insecure
  • Configuration file

User interface tool written in Python. Servers and
client library written in C.
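
As an illustration of a client configuration file listing the endpoint of each server, here is a sketch using an INI layout. The file format, section names, keys, and URLs are all assumptions; the slides do not show the real configuration syntax.

```python
import configparser

# Invented example: one section per grid flavor.
EXAMPLE = """
[lcg]
endpoint = https://dq-lcg.example.org:8443
secure = yes

[nordugrid]
endpoint = http://dq-ng.example.org:8080
secure = no
"""


def load_endpoints(text):
    """Map each grid flavor to (endpoint URL, use GSI-secured server?)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        flavor: (parser[flavor]["endpoint"], parser[flavor].getboolean("secure"))
        for flavor in parser.sections()
    }


endpoints = load_endpoints(EXAMPLE)
```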
8
Changes on Server-side
  • Why was server-side code rewritten?
  • Partly because of CMS experience
  • Persistent connections were necessary
  • Connection pooling mechanism
  • Each request could not instantiate its own connection
    to the grid catalog: too slow!
  • Partly from our initial experience
  • Flexible security mechanism
  • Either provide a single certificate for all, or
    delegate credentials
  • Initial version
  • A command line tool for each grid flavor with the
    same syntax and same output
  • Clarens server was forking out a process that
    executed the request by calling the command line
    tool
  • This proved to be inefficient and too restrictive,
    e.g. it could not maintain persistent connections
    across multiple requests!
  • Therefore,
  • Server code was built by extending the command
    line tools: each tool is now a daemon
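
The connection pooling point can be sketched in a few lines. This is a generic pool, not the C server code: `ConnectionPool` and `fake_connect` are hypothetical names, and a real pool would also need to detect and replace broken connections.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Opens a fixed number of connections once and reuses them."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free instead of opening a new one.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)


opened = []


def fake_connect():
    """Stand-in for the slow step of connecting to a grid catalog."""
    opened.append(object())
    return opened[-1]


pool = ConnectionPool(fake_connect, size=2)
for _ in range(10):  # ten requests served by the same two connections
    with pool.connection() as conn:
        pass
```

A forked command line tool cannot do this, because each process exits with its connection; a long-running daemon can.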

9
Current Status
  • Current structure

10
NorduGrid
  • Globus RLS 2.x
  • Only Classic Storage Elements (GridFTP servers)
  • Information System
  • Connects to LDAP
  • Special attributes in the RLS

11
LCG-2
  • EDG/LCG RLS (v2.2)
  • GFAL support
  • SRM/Castor support
  • SRM/dCache support
  • Classic Storage Element support
  • Information System
  • LDAP-based (MDS)
  • Native POOL Support
  • Using POOL-1.6.5

12
US Grid 3
  • Globus RLS 2.x
  • DQ currently supports only Classic Storage
    Elements (GridFTP servers)
  • No information system interface
  • DQ creates a dummy information system which
    consists of a local configuration file

13
Integration with ATLAS prodsys
  • Executors use their native grid tools for
    file registration
  • But they add the extra metadata attributes required
    by DQ
  • This allows integration with DQ
  • Windmill is using DQ
  • To locate replicas of files
  • To rename logical files to their final names
    (after validation)
  • This week: move files across grids so that each
    executor finds at least one replica of every file
    required by its jobs
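
The cross-grid goal (every executor finds at least one replica of each file its jobs need) reduces to a small planning step. The sketch below is illustrative only: the function name and data shapes are hypothetical, and picking the alphabetically first holder as the source is an arbitrary choice, not a DQ policy.

```python
def plan_transfers(required, replicas):
    """List (lpn, source grid, destination grid) copies still needed.

    required: grid -> set of LPNs needed by jobs on that grid
    replicas: LPN -> set of grids that already hold a replica
    """
    plan = []
    for grid in sorted(required):
        for lpn in sorted(required[grid]):
            holders = replicas.get(lpn, set())
            if grid in holders:
                continue  # already has a local replica
            # Arbitrary source choice; None marks a file with no replica at all.
            source = min(holders) if holders else None
            plan.append((lpn, source, grid))
    return plan


required = {"LCG": {"a", "b"}, "NorduGrid": {"a"}}
replicas = {"a": {"LCG"}, "b": {"Grid3"}}
plan = plan_transfers(required, replicas)
```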

14
Future plans
  • Better integration with POOL
  • Must come from end-user experience
  • Better end-user documentation and support
  • For now, focus has been only on the Automatic
    Production System
  • Get best replica (not high priority)
  • within a grid
  • between grids
  • Monitoring
  • Still being discussed
  • Reliable transfer service
  • Using a MySQL database to manage transfers and
    automatic retries
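
One possible shape for a database-backed transfer queue with automatic retries is sketched below. It swaps in SQLite (from the Python standard library) for MySQL so the example is self-contained; the table layout, the state names, and the retry limit of three are assumptions, since this service was still a plan at the time.

```python
import sqlite3

MAX_ATTEMPTS = 3  # assumed retry limit


def open_queue():
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE transfers (id INTEGER PRIMARY KEY, src TEXT, dest TEXT,"
        " attempts INTEGER DEFAULT 0, state TEXT DEFAULT 'queued')"
    )
    return db


def submit(db, src, dest):
    db.execute("INSERT INTO transfers (src, dest) VALUES (?, ?)", (src, dest))


def run_pending(db, copy):
    """Try every queued transfer once; copy(src, dest) raises on failure."""
    rows = db.execute(
        "SELECT id, src, dest, attempts FROM transfers WHERE state = 'queued'"
    ).fetchall()
    for tid, src, dest, attempts in rows:
        try:
            copy(src, dest)
            state = "done"
        except Exception:
            # Stay queued for a later retry until the limit is reached.
            state = "failed" if attempts + 1 >= MAX_ATTEMPTS else "queued"
        db.execute(
            "UPDATE transfers SET attempts = attempts + 1, state = ? WHERE id = ?",
            (state, tid),
        )


calls = {"n": 0}


def flaky_copy(src, dest):
    """Simulated transfer that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise IOError("simulated transfer failure")


db = open_queue()
submit(db, "gsiftp://a.example.org/f", "gsiftp://b.example.org/f")
for _ in range(3):
    run_pending(db, flaky_copy)
final = db.execute("SELECT attempts, state FROM transfers").fetchone()
```

Keeping the state in a database rather than in memory means pending transfers survive a server restart.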

15
Future plans
  • Release command line tools appropriate for
    end-users
  • Request has been made to provide such tools for
    the Combined Test Beam effort
  • Provide servers as Pacman-caches
  • Much to improve
  • Reliability
  • Easy installation of the client tool for users
    outside the grid
  • Get local copies of files to a non-grid machine
  • -> wrap the minimal Globus GridFTP libraries
    in Pacman
  • As true interoperability comes, Don Quijote goes
  • Common information schema, similar catalogs
  • Common interface to storage resource managers