Title: NextGRID Monitoring and Fabric Management Requirements
1 NextGRID Monitoring and Fabric Management Requirements
- SLA Management Example: SweGrid Accounting System and Test-bed
Thomas Sandholm, KTH, sandholm_at_pdc.kth.se
2 NextGRID
- How do we make the Grid sustainable?
3 Outline
- NextGRID WP4: Grid Foundations - Advanced Deployment, Service Management and Migration
- SLA Management Lifecycle: Construction, Negotiation, Attainment, Charging
- Towards Adaptive Systems: SLA Manager Bag of Services
- Example Test-bed: SweGrid SGAS
- Example SLA Usage: SLA Management in SGAS
- Requirements Checklist
4 NextGRID WP4: Grid Foundations - Advanced Deployment, Service Management and Migration
- Work Package: Grid Foundations
- Address basic properties, protocols, and core services of individual OGSA services, e.g., QoS, manageability; engineer reference solution
- Task: advanced deployment, service management and migration
- Requirement: decentralized automatic control needed over the hardware comprising the Grid fabric as well as the applications and services running on that fabric
- Requirement: incremental evolution to avoid loss of service
- Focus on autonomous service management and SLA management
- Phase 1: analyse available monitoring and supervision solutions. Requirements from existing Grid projects, e.g., Framework 5 projects, GRASP, Android, SweGrid
- Phase 2: develop management framework, SLA negotiation
- Phase 3: integrate monitoring and management solution and introduce intelligent decision-making process
- NG Partners: British Telecom (UK), HLRS (Germany), KTH (Sweden)
5 SLA Management Lifecycle
- Construction Phase: offers prepared by service providers (or their agents) with fixed and negotiable terms; service requests with QoS requirements prepared by customers (or their agents)
- Negotiation Phase: negotiation protocol needed to settle on negotiable terms and sign the SLA. SLA-SLS mapping.
- Attainment Phase: monitoring, policing, re-negotiation, re-configuration, obligation fulfillment.
- Charging Phase: accounting, usage recording, auditing, archiving, price rating, billing.
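The four phases above can be read as a small state machine. The following is a minimal sketch under stated assumptions: the names Phase, TRANSITIONS and SlaLifecycle are hypothetical, and the allowed transitions (in particular the loop from attainment back to negotiation, modelling re-negotiation) are one possible reading of the slide, not a definition taken from NextGRID.

```python
from enum import Enum


class Phase(Enum):
    CONSTRUCTION = "construction"
    NEGOTIATION = "negotiation"
    ATTAINMENT = "attainment"
    CHARGING = "charging"


# Allowed transitions (an assumption of this sketch, not from the slides).
# Attainment may loop back to negotiation to model re-negotiation.
TRANSITIONS = {
    Phase.CONSTRUCTION: {Phase.NEGOTIATION},
    Phase.NEGOTIATION: {Phase.ATTAINMENT},
    Phase.ATTAINMENT: {Phase.NEGOTIATION, Phase.CHARGING},
    Phase.CHARGING: set(),
}


class SlaLifecycle:
    """Tracks which lifecycle phase a single SLA is in."""

    def __init__(self):
        self.phase = Phase.CONSTRUCTION
        self.history = [self.phase]

    def advance(self, target):
        """Move to the target phase, rejecting transitions not listed above."""
        if target not in TRANSITIONS[self.phase]:
            raise ValueError(
                f"illegal transition {self.phase.name} -> {target.name}"
            )
        self.phase = target
        self.history.append(target)
        return self.phase
```

A caller would create one SlaLifecycle per agreement and call advance() as offers are negotiated, monitored and finally charged; an attempt to skip a phase raises ValueError.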
6 Towards Adaptive Systems: SLA Manager Bag of Services
SLA Manager
Access Flow Policing/Shaping (DiffServ Packet Dropping)
Pricing Manager (GridBank Trader Service)
P2P Event Manager
SLA Provider (WS-Agreement, WSLA)
Usage Tracker/Analyzer (GGF-UR, Network Traffic Analyzers)
Policy Manager (PAP, PIP, PDP, PEP)
Service Registration/Discovery (WS-RF, UDDI)
Service Monitor/Controller (GGF-CMM, WSDM, WS-RF)
Negotiation Agent (Contract Net, WS-AgreementNegotiation)
Policy Rule Base (XACML, FuzzyLogic)
Meta-Data Repository (Ontologies, WSDL)
Knowledge Repository
Usage Repository
7 Example Test-bed: SweGrid SGAS
- Swedish nation-wide computational resource comprising 600 Intel P4 at 6 HPC centers interconnected with the 10 Gb/s GigaSunet network
- Resources allocated to promising research projects with demanding computational and storage needs by the national allocations committee (SNAC)
- SweGrid Accounting System (SGAS) provides soft real-time allocation enforcement across all centers in the Grid based on SNAC quota
- 3-party policy-driven resource access (user resource specification, local resource policy, allocation authority policy)
- Java Web services, OGSA, WS-Security, GSI, GGF-UR, XACML: standards-based infrastructure
- Integration platform for workload managers and local accounting systems/schedulers
- Currently built with GT3 (OGSI), transition to GT4 (WS-RF) next year
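The 3-party policy-driven access described above can be illustrated with a minimal sketch. It is not SGAS code: the Request and Decision types, the authorize function and its parameters are hypothetical, and the checks are reduced to plain CPU-hour comparisons. It only shows how a user's resource specification, a local resource policy and an allocation authority quota could be combined into one decision.

```python
from dataclasses import dataclass


@dataclass
class Request:
    user: str
    project: str
    cpu_hours: float  # taken from the user's resource specification


@dataclass
class Decision:
    granted: bool
    reason: str


def authorize(request, local_max_cpu_hours, project_quota, project_used):
    """Combine the three parties' policies into one grant/deny decision.

    local_max_cpu_hours: limit set by the resource owner (local policy).
    project_quota / project_used: CPU hours per project, as held by the
    allocation authority (hypothetical in-memory stand-in for a bank).
    """
    # Party 2: local resource policy.
    if request.cpu_hours > local_max_cpu_hours:
        return Decision(False, "local resource policy: job too large")

    # Party 3: allocation authority policy (quota minus recorded usage).
    remaining = (project_quota.get(request.project, 0.0)
                 - project_used.get(request.project, 0.0))
    if request.cpu_hours > remaining:
        return Decision(False, "allocation authority policy: quota exceeded")

    return Decision(True, "granted")
```

For example, with a quota of 100 CPU hours of which 90 are used, a 5-hour request is granted while a 20-hour request is denied by the allocation authority check.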
8 Example NextGRID Deliverable Use: SLA Management in SGAS
Resource Specification
3rd Party (ARC/Globus)
Service Registration/Discovery
Service Monitor
Resource
Remote Execution Service
Usage Tracker
Bank
Reservation Manager
Allocation Authority
Policy Manager
Policy Manager
9 Requirements Checklist (incomplete, in random order)
- Decentralized automatic control needed over the hardware comprising the Grid fabric as well as the applications and services running on that fabric (WP4)
- Incremental evolution to avoid loss of service (WP4)
- Common information models for service level agreements and for the management information that is required to deliver end-to-end application quality (WP3)
- Techniques for adapting the representation of information according to its context (WP3)
- Standardized QoS ontologies to allow monitoring on predefined SLA parameters with well-defined metrics
- Sensors and Controllers on various levels (e.g. Resource, Workflow) wrapping instrumented code accessed using standard protocols defined in WSDL
- Registration/Discovery of Sensors and Controllers using standard protocols defined in WSDL
- Both Push and Pull Event Handling of messages of various criticality (filterable)
- Virtualization of Resources, Abstract Runtime (Hosting) Environments
- Back-end SLS Control: CPU, Bandwidth, Storage, Memory
- Front-end SLA Request: availability, run time, jitter, cost
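The push and pull event handling requirement in the checklist above can be sketched as follows. This is an illustrative in-process model only; the Criticality levels and the EventChannel class are hypothetical names, and a real deployment would expose the same operations through service interfaces rather than Python callbacks.

```python
from collections import deque
from enum import IntEnum


class Criticality(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


class EventChannel:
    """Delivers events by push (callbacks) and by pull (polling a buffer)."""

    def __init__(self):
        self._subscribers = []  # (min_level, callback) pairs for push delivery
        self._buffer = deque()  # retained (level, message) events for pull

    def subscribe(self, callback, min_level=Criticality.INFO):
        """Push: register a callback, filtered by minimum criticality."""
        self._subscribers.append((min_level, callback))

    def publish(self, level, message):
        """Buffer the event for pullers and push it to matching subscribers."""
        self._buffer.append((level, message))
        for min_level, callback in self._subscribers:
            if level >= min_level:
                callback(level, message)

    def poll(self, min_level=Criticality.INFO):
        """Pull: return and remove buffered events at or above min_level."""
        matched = [e for e in self._buffer if e[0] >= min_level]
        self._buffer = deque(e for e in self._buffer if e[0] < min_level)
        return matched
```

A monitor interested only in failures would subscribe with min_level=Criticality.CRITICAL, while an archiving component could call poll() periodically to collect everything that remains in the buffer.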
10 Example Test-bed Experience: SGAS Resource Administrator Interaction and Policy Introduction
- Involve RAs early in the process with surveys
- Feedback from running system crucial to move from prototype to production
- Use a phased, low-risk, low-intrusion deployment approach
- Allow all stakeholders (e.g. RAs, users, resource owners) to customize local policies easily through XML-document-centric configurations and transformations, e.g. RSL, XACML, GGF-UR style sheets. Provide sensible defaults.
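The last point, XML-document-centric configuration with sensible defaults, can be sketched as follows. The element and attribute names (policy, setting, name) and the two settings are hypothetical; they do not follow the RSL, XACML or GGF-UR schemas and only illustrate overriding built-in defaults from a local XML document.

```python
import xml.etree.ElementTree as ET

# Built-in defaults (hypothetical settings, chosen for illustration only).
DEFAULTS = {"max_cpu_hours": "24", "on_quota_exceeded": "reject"}


def load_policy(xml_text):
    """Return the defaults, overridden by any known settings in the XML."""
    policy = dict(DEFAULTS)
    root = ET.fromstring(xml_text)
    for setting in root.findall("setting"):
        name = setting.get("name")
        # Unknown setting names are ignored so a typo cannot add new keys.
        if name in policy:
            policy[name] = (setting.text or "").strip()
    return policy
```

A resource administrator who only wants a larger job limit would supply a single setting element for max_cpu_hours; every other value keeps its default.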