Title: Running Analysis on the GRID using AliEn, ROOT
1 Running Analysis on the GRID using AliEn, ROOT and PROOF
P. Buncic, F. Rademakers
2 Timeline
[Figure: timeline with paired labels: Functionality / Simulation; Interoperability / Reconstruction; Performance, Scalability, Standards / Analysis]
3 AliEn Architecture
[Figure: architecture diagram. Column headings: AliEn Core Components and services; Interfaces; External software. Layers: Low level; High level. Boxes: Database Proxy, ADBI, File and Metadata Catalogue, Authentication, RB, CE, SE, FS, Config Mgr, Package Mgr, Logger, SOAP/XML, LDAP, Perl Core, Perl Modules, External Libraries, V.O. Packages and Commands, User Application, API (C/C++/perl), User Interface, CLI, GUI, Web Portal]
4 AliEn Components
AliEn Web of Collaborating Services
Modules and libraries
5 Command Interface
6 GUI: AliEn Xfiles
7 Portal
- http://alien.cern.ch
- Generic Web portal
- User can
  - interact with AliEn
  - submit jobs
  - check job status
- Administrator can
  - configure system
  - monitor status
  - check syslog
  - update distribution
8 File catalogue
If you're a programmer, one of the great things
about Linux and Unix is that everything is a file
-- or at least acts like one. From devices to
sockets, the "everything is a file" paradigm has
served Unix well for a long, long time.
9 Resource Broker
Pull instead of traditional Push architecture
[Figure: services shown: Authen, Broker, TransferBroker, TransferOptimiser, Logger, IS]
10 Job Execution and Scheduling
Pull model; Condor JDL
Components: Computing Element, Task Queue, Process Monitor, Cluster Monitor, Broker, Manager, Optimizer
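The pull model named on this slide can be sketched in a few lines of C++: instead of a broker pushing jobs to sites, each Computing Element asks the central Task Queue for work that matches what it can offer. This is an illustration only. The Job, Offer and TaskQueue types and the matching rule (required package plus minimum free disk) are assumptions made for the example, not AliEn's actual JDL matching or API.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical job description: what a queued job needs in order to run.
struct Job {
    int id;
    std::string package;   // software package the job requires
    long minDiskMB;        // scratch space the job requires
};

// Hypothetical resource offer sent by a Computing Element asking for work.
struct Offer {
    std::string package;   // package installed at the site
    long freeDiskMB;       // scratch space currently available
};

// Central task queue: jobs wait here until some site pulls them.
class TaskQueue {
public:
    void submit(const Job& job) {
        assert(job.minDiskMB >= 0);
        jobs_.push_back(job);
    }

    // Called by a Computing Element. Hands out the oldest job the offer can
    // satisfy and removes it from the queue. Returns false if nothing matches.
    bool pull(const Offer& offer, Job& out) {
        for (std::size_t i = 0; i < jobs_.size(); ++i) {
            if (jobs_[i].package == offer.package &&
                jobs_[i].minDiskMB <= offer.freeDiskMB) {
                out = jobs_[i];
                jobs_.erase(jobs_.begin() + static_cast<std::ptrdiff_t>(i));
                return true;
            }
        }
        return false;
    }

    std::size_t pending() const { return jobs_.size(); }

private:
    std::deque<Job> jobs_;
};
```

A site that has nothing suitable on offer simply gets no job and can ask again later, so the queue never needs an up-to-date picture of every site.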
11 Production Status
PPR production (2002-03): http://alien.cern.ch/Alien/main?taskproduction
12 Near term plans
- Grid federations
  - AliEn <-> EDG
  - AliEn <-> AliEn (just installed in Moscow, to be installed soon at OSC, Ohio)
  - AliEn <-> LCG-1 (collaboration with India)
- OGSA
  - Currently working on AliEn Web Service API
  - Next step: AliEn Web Service -> OGSA Grid Service
- P2P
  - Exploring the possibility of using P2P technology (Jabber) as an alternative SOAP transport and resource discovery mechanism
- Performance
  - Ongoing work on internal DB API
  - Investigating semantic query caching on the client side
13 Near term plans (continued)
- Monitoring
  - Deploying MonaLisa (collaboration with CMS)
  - 3D visualization and Grid control from mobile devices (collaboration with Ericsson)
- Deployment
  - Virtual server
  - Remote Software Management (collaboration with Ericsson)
- Analysis
  - SuperPROOF (collaboration with ROOT team and with help from HP)
  - SC2003
- Simulation
  - Simulation of a Web services based distributed computing environment (collaboration with UH and CMS/Monarc/MonaLisa)
14 Near term goal
- SC2003
  - Demo
- ALICE Physics Data Challenge
  - 1Q2004
  - 10% of resources
  - 300 TB of data to be generated in 3 months using
    - LCG-1 resources
    - All other resources we can possibly get hold of
- Our users (ALICE physicists) expect us to provide them with seamless and transparent access to the entire dataset (or large chunks of it) directly from the ROOT prompt
15 Two Analysis Scenarios
- Asynchronous
  - Interactive batch
  - Job splitting and batch processing (transparent to end user)
  - Can be done using existing tools
    - AliEn + ROOT
    - Scheduled file transfers
- True Interactive
  - Instantaneous analysis results
  - Needs
    - New functionality (AliEn + PROOF)
    - High system availability
    - Reliable and fast file transport mechanism including transparent file caching
16
17 AliEn Catalogue
[Figure: catalogue tree. Directories: alien/, alice/, atlas/, prod/, data/, mc/, a/, b/; file: file01.root; storage locations: soap://, root://, castor://; copies labelled "original file", "mirror <A>" and "mirror <B>"]
- We already have a File Catalogue based on federated (MySQL) databases
- We need a reliable file transport mechanism to support interactive work in a distributed environment
18 File Access
open(alien/alice/data/file01.root)
- where is the file?
[Figure: catalogue tree (alien/, alice/, atlas/, prod/, data/, mc/, a/, b/, file01.root) in which the original file and the mirrors <A> and <B> each carry a PFN at a soap://, root:// or castor:// location]
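The "where is the file?" step amounts to mapping a logical file name (LFN) to one of its physical file names (PFN). A minimal C++ sketch of such a lookup follows. The Catalogue class, the "prefer a replica at the client's site" rule, and the host names in the usage note are assumptions for illustration; the real catalogue is backed by federated databases and is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One physical copy of a file: the site that holds it and its PFN (a URL).
struct Replica {
    std::string site;
    std::string pfn;
};

// Hypothetical in-memory catalogue: logical file name -> known replicas.
// The first replica registered for an LFN is treated as the original file.
class Catalogue {
public:
    void addReplica(const std::string& lfn, const Replica& r) {
        assert(!lfn.empty());
        entries_[lfn].push_back(r);
    }

    // Returns the PFN of a replica at the client's site if one exists,
    // otherwise the first registered replica. Returns an empty string
    // when the LFN is unknown.
    std::string resolve(const std::string& lfn,
                        const std::string& clientSite) const {
        std::map<std::string, std::vector<Replica> >::const_iterator it =
            entries_.find(lfn);
        if (it == entries_.end() || it->second.empty()) return std::string();
        for (std::size_t i = 0; i < it->second.size(); ++i) {
            if (it->second[i].site == clientSite) return it->second[i].pfn;
        }
        return it->second.front().pfn;
    }

private:
    std::map<std::string, std::vector<Replica> > entries_;
};
```

For example, if /alice/data/file01.root is registered with an original at site A (root://hostA/...) and a mirror at site B (castor://hostB/...), a client at site B is given the castor:// PFN, while a client at any other site falls back to the original.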
19 Grid File Transfer Layer
[Figure: labels: User WS; SE1, SE2, SE3; MSS; Local Transfer Layer; Global Transfer Layer]
- Local File Access
  - on-site access from SE to Mass Storage System
  - many existing solutions (rootd, rfio, posix, dcache api, etc.)
- Global File Access
  - access/transfer between SEs and user workstations
  - gridFTP, bbFTP, xrootd, ...?
20 Scheduled File Transfers
Follows the model of job scheduling and execution
Components: Storage Element Transfer Daemon, Transfer Queue, Broker, Manager, Optimizer
21 Analysis Requirements
- Certificate-based authentication with ACLs as in the File Catalogue
- High/low-speed transfers (encrypted/unencrypted)
- External/dynamic regulation of transfer speed per user/connection ('Network Weather Service')
- Efficient and reliable enough to handle chaotic load from analysis
- New AliEn I/O Service
  - Transfer re-routing through caches ('NWS')
  - distributed caches (>1 entry point)
  - distributed I/O servers (>1 entry point, redirection)
  - only 2 operations allowed
    - read according to catalogue permissions
    - write once/according to catalogue permissions => libAliEn
  - no directory manipulation/creation/deletion
    - done by SE service
22 Crosslink-Cache
[Figure: labels: Client A; Main Cache Cluster; Regional Cache]
Cache levels: main, regional, local
23 Cache-And-Forward Server
[Figure: labels: Client A; Client B; host1port9999, host2port9999, host3port9999; API; I/O d; Off-site Cache; On-site Cache; Local Disk]
Supports load balancing, multithreading, I/O buffering
24 AliEn I/O Server
- Secure: ACLs as set in the File Catalogue
- Interfaces to local MSS via AliEn MSI
- Uses Network Weather Service and Cache discovery
25 The Global Grid File System
Transfer Layer
Data Catalogue
Storage Elements
DB
AliEn can make all storage resources distributed
worldwide appear as a single (albeit big) hard
disk
26 Case A
- Interactive Batch Analysis
27 AliEnFS
- AliEnFS is written as a module for LUFS, the Linux Userland File System (http://lufs.sourceforge.net/)
- The kernel module delegates VFS calls to various FS daemons, which run in user space, allowing easy use of existing cryptographic libraries
28 LUFS GridFS Extension
User Space
Offered as a free gift to the LCG Project
29 AliEn + ROOT (A)
- TGrid
- TAlien : TGrid
  - Authentication, Catalogue Browsing
- TAlienFile : TFile
  - ROOT file access via AliEn
- TAlienAnalysis
  - Parallel Grid Analysis Object
- TAlienJob
  - AliEn Job (belonging to a TAlienAnalysis Object)
- TAlienJobIO
  - Manages file I/O for a specific analysis job
The Analysis Object (TAlienAnalysis)
- each Analysis Object is stored under a unique name in the user directory
- can be reinstantiated at any time from a ROOT session
30 AliEn + ROOT (A)
[Figure: analysis workflow, in order]
- USER provides: Analysis Macro, Input Files (Query for Input Data)
- new TAlienAnalysis Object produces: List of Input Data Locations
- Job Splitting: IO Object 1 for Site A, IO Object 2 for Site A, IO Object 1 for Site B, IO Object 1 for Site C
- Job Submission: Job Object 1 for Site A, Job Object 2 for Site A, Job Object 1 for Site B, Job Object 1 for Site C
- Execution
- Histogram Merging, Tree Chaining
- Results
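The job-splitting step in this workflow can be illustrated with a short C++ sketch that groups input files by the site holding them and cuts each group into subjobs. The InputLocation and SubJob types, the splitBySite function and the per-job file limit are assumptions for the example; how TAlienAnalysis actually splits a task is not specified on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Where one input file lives, as a catalogue query might report it
// (assumed shape).
struct InputLocation {
    std::string lfn;
    std::string site;
};

// One subjob: runs at a single site and reads only files stored there.
struct SubJob {
    std::string site;
    std::vector<std::string> files;
};

// Groups the inputs by site, then cuts each group into subjobs of at most
// maxFilesPerJob files. Sites are visited in alphabetical order.
std::vector<SubJob> splitBySite(const std::vector<InputLocation>& inputs,
                                std::size_t maxFilesPerJob) {
    assert(maxFilesPerJob > 0);
    std::map<std::string, std::vector<std::string> > bySite;
    for (std::size_t i = 0; i < inputs.size(); ++i) {
        bySite[inputs[i].site].push_back(inputs[i].lfn);
    }

    std::vector<SubJob> jobs;
    std::map<std::string, std::vector<std::string> >::const_iterator it;
    for (it = bySite.begin(); it != bySite.end(); ++it) {
        const std::vector<std::string>& files = it->second;
        for (std::size_t start = 0; start < files.size();
             start += maxFilesPerJob) {
            SubJob job;
            job.site = it->first;
            for (std::size_t k = start;
                 k < files.size() && k < start + maxFilesPerJob; ++k) {
                job.files.push_back(files[k]);
            }
            jobs.push_back(job);
        }
    }
    return jobs;
}
```

With five input files, three of them at site A and one each at sites B and C, and a limit of two files per subjob, this produces two subjobs for site A and one each for B and C, the same shape as the objects listed in the workflow.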
31 C++ equivalent (A)
// connect and authenticate to the GRID Service alien as user
TGrid *alien = TGrid::Connect("alien", user, "", "");
// create a new analysis Object (<unique ID>, <title>, subjobs)
TAlienAnalysis *analysis = new TAlienAnalysis("pass001", "MyAnalysis", 10);
// set the program which executes the Analysis Macro/Script
analysis->Exec("AliRoot.sh", "file:/home/peters/test.C");   // script to execute
analysis->Query("2002-10/V3.08.Rev.04/00110/galice.root?pt>0.2");
analysis->OutputFileAutoMerge(true);   // merge all produced .root files
analysis->Split();                     // split the task in subjobs
analysis->Run();                       // submit all subjobs to the AliEn queue
analysis->GetResults();                // download partial/final results and merge them
analysis->Info();                      // display job information
32 Interactive batch
[Figure: labels: AliEn; ROOT]
33 AliEn + ROOT
- Uses AliEn API to split jobs and send them for execution
- Interactive batch
- Merges output files (histograms, trees)
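Merging the histograms produced by the subjobs is, at its core, a bin-by-bin sum. The C++ sketch below shows that idea with a deliberately simplified Histogram struct standing in for a ROOT histogram. The struct, the function names and the policy of skipping inputs with different binning are assumptions for illustration, not the merging code used by AliEn or ROOT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified fixed-binning 1D histogram (stand-in for a ROOT histogram).
struct Histogram {
    double low;                   // lower edge of the first bin
    double high;                  // upper edge of the last bin
    std::vector<double> counts;   // one entry per bin
};

// True when two histograms use the same range and number of bins.
bool sameBinning(const Histogram& a, const Histogram& b) {
    return a.low == b.low && a.high == b.high &&
           a.counts.size() == b.counts.size();
}

// Adds the partial histograms bin by bin. Partial results whose binning
// differs from the first one are skipped rather than merged incorrectly;
// the number of skipped inputs is reported through 'skipped'.
Histogram mergeHistograms(const std::vector<Histogram>& parts,
                          std::size_t& skipped) {
    assert(!parts.empty());
    Histogram total = parts[0];
    skipped = 0;
    for (std::size_t i = 1; i < parts.size(); ++i) {
        if (!sameBinning(total, parts[i])) {
            ++skipped;
            continue;
        }
        for (std::size_t bin = 0; bin < total.counts.size(); ++bin) {
            total.counts[bin] += parts[i].counts[bin];
        }
    }
    return total;
}
```

Because addition is commutative, partial results can be merged in whatever order the subjobs finish, which is what makes it possible to look at partial results before all subjobs are done.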
34 Case B
True Interactive Analysis
35 True Interactive Analysis (B)
Super PROOF
PROOF Classic
36 AliEn + PROOF (B)
- AliEn
  - data splitting
  - data access/replication
  - access control
- SuperPROOF
  - Uses AliEn API (C++ API) to carry out job decomposition and collects output from slave PROOF servers
- PROOF
  - Just like usual PROOF running on a local site
- process control
  - static model: the population of PROOF daemons is maintained on dedicated sites/nodes
  - dynamic model: PROOF daemons are started on demand by AliEn using dedicated queues
  - mixed model: minimum population for fast response plus dynamic start-up of PROOF daemons
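The difference between the three process-control models can be made concrete with a small C++ sketch that decides where the daemons for a session come from. The ProcessModel enum, the Allocation struct and the allocate function are hypothetical names introduced for this example; the slide does not describe how the decision is implemented.

```cpp
#include <algorithm>
#include <cassert>

// The three process-control models named above.
enum ProcessModel { STATIC_MODEL, DYNAMIC_MODEL, MIXED_MODEL };

// How a request for PROOF daemons is satisfied.
struct Allocation {
    int fromPool;         // pre-started daemons handed to the session
    int startedOnDemand;  // daemons to be launched through a queue
};

// Decides where 'requested' daemons come from, given 'idlePrestarted'
// daemons already running. Under the static model a request may be only
// partly satisfied, because nothing is started on demand.
Allocation allocate(ProcessModel model, int idlePrestarted, int requested) {
    assert(idlePrestarted >= 0 && requested >= 0);
    Allocation a = {0, 0};
    switch (model) {
    case STATIC_MODEL:
        a.fromPool = std::min(idlePrestarted, requested);
        break;
    case DYNAMIC_MODEL:
        a.startedOnDemand = requested;
        break;
    case MIXED_MODEL:
        a.fromPool = std::min(idlePrestarted, requested);
        a.startedOnDemand = requested - a.fromPool;
        break;
    }
    return a;
}
```

In this sketch the mixed model answers immediately with whatever is already running and tops up through the queue, which is the trade-off between fast response and resource use that the slide points at.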
37 Pre-started analysis services
[Figure: labels: MetadataCatalog Service; ReplicaCatalog Service; Match-making Service; Information Service; ROOT; SuperPROOF; PROOF; Worker nodes]
38 Analysis Requirements
- Very preliminary personal estimate of satisfied requirements
- Reflects the current status (no future plans for development)
- Best and worst case scenarios
- <Overall>
  - All requirements taken together (equal weight)
- <Service>
  - Average of all services
39 Torres Excel chart
- In principle, the architecture is similar
- Deviant workflow in case of 2 scenarios
- Difference
  - AliEn uses pull architecture
  - ROOT as analysis prompt
- AliEn is self-contained and provides all required components and services
  - But they have unusual and different names
40 Conclusions
- In its present state, AliEn + ROOT can satisfy 75-85% of user requirements
- Adding SuperPROOF into the picture should further improve the situation
  - Possibility to do fast tests on a subset of a large dataset in true interactive fashion
- We are open to suggestions and cooperation on all fronts