Condor at Cardiff - PowerPoint PPT Presentation


Transcript and Presenter's Notes

Title: Condor at Cardiff


1
Condor at Cardiff
  • Dr James Osborne

2
Contents
  • What is Condor
  • Condor at Cardiff
  • Condor Users at Cardiff
  • Green Computing at Cardiff
  • Advanced Research Computing at Cardiff
  • Virtualization
  • Patterns

3
What is Condor
  • Condor is the name for two species of New World
    vultures, each in a monotypic genus
  • They are the largest flying land birds in the
    Western Hemisphere

4
What is Condor
  • A specialised workload management system for
    compute-intensive jobs
  • Users submit their jobs to Condor
  • Condor places them into a queue
  • Condor chooses where and when to run them
  • Condor carefully monitors their progress
  • Condor informs the user upon completion
  • http://www.cs.wisc.edu/condor/
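
The submit step above can be sketched as follows. This is a minimal illustration, not the configuration used at Cardiff: the helper name render_submit_file, the executable name, the arguments and the file-name tag are all hypothetical. The submit-description commands themselves (universe, executable, arguments, output, error, log, notification, queue) and the $(Process) macro are standard Condor syntax.

```python
def render_submit_file(executable, arguments, n_jobs, tag="job"):
    """Return the text of a minimal Condor submit description file.

    All argument values are illustrative; only the command names
    on the left-hand side are standard Condor submit commands.
    """
    lines = [
        "universe     = vanilla",
        f"executable   = {executable}",
        f"arguments    = {arguments}",
        # $(Process) expands to 0..n_jobs-1, giving each job its own files.
        f"output       = {tag}.$(Process).out",
        f"error        = {tag}.$(Process).err",
        # The log file is where Condor records each job's progress.
        f"log          = {tag}.log",
        # Ask Condor to notify the user when a job completes.
        "notification = Complete",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example: 100 independent runs of one program.
submit_text = render_submit_file("simulate.exe", "--seed $(Process)", 100)
print(submit_text)
```

The resulting text would be saved to a file and passed to condor_submit, after which condor_q shows the jobs waiting in the queue.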

5
Condor at Cardiff - Pilot
  • The Condor pool began as a pilot service in
    April 2004, led by Dr Hugh Beedie, CTO of
    Information Services, in conjunction with staff at
    the Welsh e-Science Centre
  • First user from the School of Business
  • A solution looking for problems

6
Condor at Cardiff - Production
  • The Condor pool transitioned to a production
    service in January 2006 with the
    appointment of Dr James Osborne as project
    manager
  • Latest user from the School of Psychology
  • Doubled the size of the pool, tripled the number of users
  • Distributed using Novell ZENworks
  • Common condor_config files: EA, EI, S, SEA
  • Injected condor_config_local variables:
    IS_OWNED_BY, IS_EXECUTE_ALWAYS, RANK

7
  • Central Manager: master, collector, negotiator
  • Execute Nodes (1600 Workstations): master, startd, starter
  • Submit Nodes (30 Workstations): master, schedd, shadow

8
Condor Users at Cardiff
  • User in a computing context refers to one who
    uses a computer system
  • Users may need to identify themselves for the
    purposes of accounting, security, logging and
    resource management
  • Users are also widely characterized as the class
    of people who use a system without the complete
    technical expertise required to fully
    understand it

9
Growth of User Base
10
Diversity of User Base
  • Architecture: 1
  • Biosciences: 9
  • Business: 1
  • Computer Sci: 6
  • Engineering: 3
  • Epidemiology: 2
  • History Arch: 2
  • Mathematics: 2
  • Optometry: 2
  • Physics: 2
  • Psychology: 1
  • Social Sci: 1
  • Total: 32

11
Diversity of Applications
  • Blast, Damfilt
  • Dammin, Energyplus
  • Gasbor, Grinder
  • Lea, Leadmix
  • Matlab, Msvar
  • Oxcal, Perl
  • Pest, R
  • Sienna, Structure
  • Econometric Modelling
  • Fluid Dynamics
  • Fourier Analysis
  • Geological Modelling
  • Image Processing
  • Radiation Transport
  • Travelling Salesman
  • WIFI Roaming

12
Donna Lammie
Structural Biophysics Group
  • OPTOM
  • X-Ray Diffraction
  • Determine shape of molecules
  • Time on a single workstation: 2-3 days
  • Time on the Condor pool: 2-3 hours
  • Speed-up factor of 2000%

13
Donna Lammie
C. Baldock et al. Nanostructure of Fibrillin-1
Reveals Compact Conformation of EGF Arrays and
Mechanism for Extensibility. Proceedings of the
National Academy of Sciences of the United States
of America, 103(32):11922-11927, August 2006.

14
Patrick Downes
Research Assistant
  • Velindre Cancer Centre
  • Monte Carlo simulation
  • Radiotherapy dose calculation
  • Time on a single workstation: 3 months
  • Time on the Condor pool: 36 hours
  • Speed-up of 6000%
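
The speed-up quoted above can be checked with a few lines of arithmetic. This is a sketch under two assumptions that are not stated on the slide: a month is taken as 30 days, and the figure of 6000 is read as a percentage (that is, a factor of 60). The function name speedup is illustrative.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: the slide does not define a month


def speedup(single_hours, pool_hours):
    """Ratio of single-workstation run time to Condor pool run time."""
    return single_hours / pool_hours


# 3 months on one workstation versus 36 hours on the pool.
single_hours = 3 * DAYS_PER_MONTH * HOURS_PER_DAY
factor = speedup(single_hours, 36)
percent = factor * 100

print(f"{single_hours} h / 36 h = {factor:.0f}x = {percent:.0f}%")
```

The same calculation applied to the X-ray diffraction case (2-3 days against 2-3 hours) gives a factor of about 24, in line with the rounded figure of 2000 quoted for it when that is also read as a percentage.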

15
Patrick Downes
16
Green Computing at Cardiff
  • Green Computing is the study and practice of
    using computing resources efficiently
  • Typically, technological systems or computing
    products that incorporate green computing
    principles take into account the so-called triple
    bottom line of economic viability, social
    responsibility, and environmental impact

17
Power Consumption
Based on a P4 3GHz PC with 512MB RAM
18
Watts Up Pro
  • Measures
  • Watts, Volts, Amps, WattHrs, Cost, Avg Kwh, Mo
    Cost, Max Wts, Max Vlt, Max Amp, Min Wts, Min
    Vlt, Min Amp, Pwr Fct, Dty Cyc, Pwr Cyc
  • Freq 1 second
  • Duration 15 minutes

19
Economic Viability
Based on a P4 3GHz PC with 512MB RAM
  • Makes sound financial sense
  • Hibernate saves £60 per year
  • Condor: £30 per year (max)
  • Dedicated: £150 per year
  • Condor is 5 times cheaper

Saving of Hibernate = Cost of 100W Electricity (Idle State) for 16 Hours out of 24
Cost of Condor = Cost of 150W Electricity (Condor State) - Cost of 100W Electricity (Idle State)
Cost of Dedicated = Cost of 150W Electricity (Condor State) + Cost of 100W Electricity (Air Con)

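
The figures on this slide can be reproduced approximately with the sketch below. It rests on assumptions that the slide does not state: the footnote is read as three formulas (hibernate saving = 100 W idle; Condor cost = 150 W minus 100 W; dedicated cost = 150 W plus 100 W of air conditioning), all three are taken over the same 16 hours a day, and electricity is priced at £0.10 per kWh, a value chosen only because it brings the results close to the slide. The names annual_cost and PRICE_PER_KWH are illustrative.

```python
HOURS_PER_DAY = 16      # hours per day a PC would otherwise sit idle
DAYS_PER_YEAR = 365
PRICE_PER_KWH = 0.10    # assumption: GBP per kWh, not given on the slide


def annual_cost(watts, pcs=1, price=PRICE_PER_KWH):
    """Yearly electricity cost in GBP of drawing `watts` for 16 h/day."""
    kwh_per_year = watts / 1000 * HOURS_PER_DAY * DAYS_PER_YEAR
    return kwh_per_year * price * pcs


hibernate_saving = annual_cost(100)        # idle power avoided
condor_cost = annual_cost(150 - 100)       # extra power over idle
dedicated_cost = annual_cost(150 + 100)    # machine plus air conditioning
ratio = dedicated_cost / condor_cost       # independent of the price

print(f"Hibernate saves about £{hibernate_saving:.0f} per year")
print(f"Condor costs about £{condor_cost:.0f} per year")
print(f"Dedicated costs about £{dedicated_cost:.0f} per year")
print(f"Dedicated / Condor = {ratio:.1f}")
print(f"10,000 PCs hibernating save about £{annual_cost(100, pcs=10_000):,.0f}")
```

The ratio of 5 depends only on the wattages (250 W against 50 W), so it holds whatever electricity price is assumed.
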
20
Environmental Impact
Based on a P4 3GHz PC with 512MB RAM
  • Makes sound environmental sense
  • Hibernate saves 650 kg CO2 per year
  • Condor: 325 kg CO2 per year (max)
  • Dedicated: 1,625 kg CO2 per year
  • Condor is 5 times greener

21
Across Campus
Based on 10,000 P4 3GHz PCs with 512MB RAM
  • Makes sound financial sense
  • Hibernate would save £600,000 per year
  • Hibernate 16 out of 24 hours
  • Makes sound environmental sense
  • Hibernate would save 6,500 tonnes of CO2 per year
  • Rainforest required: 52 km²
  • Rainforest required: 40% of the area of Cardiff

22
Cardiff's Condor Pool
  • ...is the equivalent of a £500,000 supercomputer
  • costs £50,000 in equipment, power, and staff
  • improves return on investment
  • ...is one of the largest pools in the UK
  • and we plan to expand the pool
  • ...is probably the most utilised pool in the UK
  • by a factor of 10
  • ...has more users than any other pool in the UK
  • and we are working hard to keep it that way

Nobody corrected me at the 1st Campus Grids SIG in Oxford
Nobody corrected me at the 21st Open Grid Forum in Manchester

23
The ARC Spectrum
  • HPC: Tightly Coupled; Supercomputers, NUMA Machines; Million
  • HTC: Loosely Coupled; Small Clusters, Campus Grids; Thousand, H Thousand
  • Also on the spectrum: Large Clusters, SMP; H Thousand, Million

24
The ARC Division
  • ARCCA will provide, co-ordinate, support and
    develop advanced research computing services for
    researchers at Cardiff University
  • ARCCA will also work with clients and partners
    outside the University through a range of
    outreach activities
  • ARCCA is staffed with experts in the field who
    are already available to help and support your
    research needs through a range of services
  • ARCCA is procuring a range of dedicated high-end
    computing equipment which is planned to be fully
    operational by early 2008

25
The ARC Organisation
  • Prof Martyn Guest: Director of ARC
  • Dr Christine Kitchen: Manager of ARC
  • Dr James Osborne: Applications
  • Mr Huw Lynes: Infrastructure
  • Ms Liz Fitzgerald: Admin Officer
  • Another Programmer

26
Prof Martyn Guest
  • 2007: Director of Advanced Research Computing at
    Cardiff
  • 1995: Associate Director of Computational Science and
    Engineering at Daresbury
  • 1971: PhD Theoretical Chemistry
  • 1967: BSc Chemistry

27
The ARC Cluster
  • 256 x Compute Nodes (Cluster)
  • Dual Socket Quad Core Intel Xeon E5472 3.0GHz
  • 16 GB of Memory
  • ConnectX InfiniBand, Dual GigE
  • 4 x Compute Nodes (SMP)
  • Quad Socket Quad Core Intel Xeon X7350 2.93GHz
  • 32 GB of Memory, 1 TB of Local Disk (RAID5)
  • ConnectX InfiniBand, Dual GigE, Resilient PSU

28
The ARC Cluster
  • 4 x Login Nodes
  • Dual Socket Quad Core Intel Xeon E5472 3.0GHz
  • 32 GB of Memory, 0.5 TB of Local Disk (RAID1)
  • ConnectX InfiniBand, Dual GigE, Resilient PSU
  • 2 x Storage Nodes
  • Dual Socket Quad Core Intel Xeon E5472 3.0GHz
  • 32 GB of Memory, Resilient PSU
  • ConnectX InfiniBand, Dual GigE, Fibre Channel

29
Virtualization
  • Virtualization is a broad term that refers to the
    abstraction of computer resources
  • This includes making a single physical resource
    appear to function as multiple logical resources
  • Or it can include making multiple physical
    resources appear as a single logical resource

30
Central Manager Utilisation
Based on 6 months of monitoring
  • CPU (Percentage): 16.60 (average), 65.99 (max)
  • (Single Socket Single Core Intel Xeon 2.4GHz)
  • RAM (GB): 1.09 (average), 1.60 (max)
  • 55.00% and 80.00% of current capacity (2 GB)
  • Disk (GB): 1.25 (average), 1.50 (max)
  • 1.71% and 2.05% of current capacity (73 GB)

31
Central Manager Utilisation
Based on 6 months of monitoring
  • Net In (Kbps): 29.66 (average), 45.39 (max)
  • 0.02% and 0.04% of current capacity (Gigabit)
  • Net Out (Kbps): 39.13 (average), 86.17 (max)
  • 0.03% and 0.07% of current capacity (Gigabit)
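
The capacity figures on the two utilisation slides are simple ratios, sketched below. The helper name percent_of is illustrative. One assumption is worth flagging: the network percentages only reproduce if the quoted rates are treated as kilobytes per second and multiplied by 8 before comparing with a gigabit link; read literally as kilobits per second they would come out roughly eight times smaller.

```python
def percent_of(value, capacity):
    """Share of capacity used, as a percentage rounded to 2 places."""
    return round(100 * value / capacity, 2)


# RAM and disk, in GB, against the current server (2 GB RAM, 73 GB disk).
ram = (percent_of(1.09, 2), percent_of(1.60, 2))
disk = (percent_of(1.25, 73), percent_of(1.50, 73))

# Network: a gigabit link carries 1,000,000 kilobits per second.
GIGABIT_KBITS = 1_000_000
BITS_PER_BYTE = 8  # assumption: the quoted rates are kilobytes per second

net_in = tuple(percent_of(r * BITS_PER_BYTE, GIGABIT_KBITS) for r in (29.66, 45.39))
net_out = tuple(percent_of(r * BITS_PER_BYTE, GIGABIT_KBITS) for r in (39.13, 86.17))

print("RAM  %:", ram)
print("Disk %:", disk)
print("In   %:", net_in)
print("Out  %:", net_out)
```

The RAM average works out at 54.5%, which the slide rounds to 55.00%; the remaining values match the slide to two decimal places.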

32
Central Manager Virtualization
Based on 6 months of monitoring
  • 1 x Condor Server
  • Dual Socket Quad Core Intel Xeon E5472 3.0GHz
  • 32 GB of Memory, 0.5 TB of Local Disk (RAID1)
  • Dual GigE, Resilient PSU
  • 4 x Virtual Central Managers?
  • 2 x Virtual Submit Nodes?

33
Design Patterns
  • A Design Pattern is a general repeatable solution
    to a commonly occurring problem in software
    design
  • A design pattern is not a finished design that
    can be transformed directly into code
  • It is a description or template for how to solve
    a problem that can be used in many different
    situations

35
Questions
  • condor@cardiff.ac.uk
  • http://www.cardiff.ac.uk/arcca/
  • http://www.cs.wisc.edu/condor/