Title: Rocks
Rocks
Primary Goal
- Make clusters easy
- Target audience: scientists who want a capable computational resource in their own lab
Philosophy
- It's not fun to care for and feed a system
- All compute nodes are 100% automatically installed
  - Critical for scaling
- Essential to track software updates
  - RHEL 3.0 has 558 source RPM updates (released on Oct 21, 2004)
  - RHEL 4.0 has 55 source RPM updates (released on Feb 14, 2005)
- Run on heterogeneous, standard, high-volume components
  - Use the components that offer the best price/performance!
More Philosophy
- Use installation as the common mechanism to manage a cluster
  - Everyone installs a system:
    - On initial bring-up
    - When replacing a dead node
    - When adding new nodes
- Rocks also uses installation to keep software consistent
  - If you catch yourself wondering whether a node's software is up-to-date, reinstall! (See the sketch below.)
  - In 10 minutes, all doubt is erased
- Rocks doesn't attempt to incrementally update software
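- A minimal sketch of that "when in doubt, reinstall" step, assuming the shoot-node utility Rocks supplies for forcing a node to reinstall (the node name compute-0-0 is only an example):

    # Force a reinstall of one compute node from the frontend.
    # The node reboots into installation mode; roughly 10 minutes
    # later it is back with a known-good software stack.
    shoot-node compute-0-0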
Rocks Cluster Distribution
- Fully-automated, cluster-aware distribution
  - Cluster on a CD set
- Software packages
  - Full Red Hat Linux distribution
    - Current release (v3.3.0) is based on Red Hat Enterprise Linux 3.0 rebuilt from source
  - De-facto standard cluster packages
  - Rocks packages
  - Rocks community packages
- System configuration
  - Configure the services in the packages
Rocks Hardware Architecture
Minimum Components
- Local hard drive
- Power
- Ethernet
- OS on all nodes (not SSI)
- i386 (Pentium/Athlon), x86_64 (Opteron/EM64T), or ia64 (Itanium) server
Optional Components
- High-performance network
  - Myrinet
  - Infiniband (Infinicon or Voltaire)
- Network-addressable power distribution unit
- Keyboard/video/mouse network not required
  - Non-commodity
  - How do you manage your management network?
Storage
- NFS
  - The frontend exports all home directories
- Parallel Virtual File System version 1
  - System nodes can be targeted as Compute + PVFS or strictly PVFS nodes
- Lustre Roll is in development
Minimum Hardware Requirements
- Frontend
  - 2 ethernet connections
  - 18 GB disk drive
  - 512 MB memory
- Compute
  - 1 ethernet connection
  - 18 GB disk drive
  - 512 MB memory
- Power
- Ethernet
Cluster Software Stack
Rocks Rolls
- Rolls are containers for software packages and the configuration scripts for the packages
- Rolls dissect a monolithic distribution
Rolls
- Think of a roll as a package for a car
Rolls: User-Customizable Frontends
- Rolls are added by the Red Hat installer
- Software within a roll is added and configured at initial installation time
Red Hat Installer Modified to Accept Rolls
Approach
1. Install a frontend
   1. Insert the Rocks Base CD
   2. Insert Roll CDs (optional components)
   3. Answer 7 screens of configuration data
   4. Drink coffee (the install takes about 30 minutes)
2. Install compute nodes (see the sketch after these steps)
   1. Log in to the frontend
   2. Execute insert-ethers
   3. Boot a compute node with the Rocks Base CD (or PXE)
   4. Insert-ethers discovers the node
   5. Go to step 3
3. Add user accounts
4. Start computing
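- A minimal sketch of the compute-node portion of this procedure, run from the frontend (node names such as compute-0-0 follow the default Rocks naming and are only illustrative):

    # Listen for DHCP requests from new compute nodes
    insert-ethers
    # Then power on a compute node set to network (PXE) boot, or boot it
    # from the Rocks Base CD. insert-ethers detects the node, assigns it
    # a name such as compute-0-0, and kicks off its installation.
    # Repeat for each remaining node, then add user accounts.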
- Optional Rolls:
  - Condor
  - Grid (based on NMI R4)
  - Intel (compilers)
  - Java
  - SCE (developed in Thailand)
  - Sun Grid Engine
  - PBS (developed in Norway)
  - Area51 (security monitoring tools)
Login to the Frontend
- Create an ssh public/private key pair
  - You are asked for a passphrase
  - These keys are used to securely log into compute nodes without entering a password each time (see the sketch below)
- Execute insert-ethers
  - This utility listens for new compute nodes
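- Roughly what that first login does, shown here as the equivalent manual commands (the frontend prompts for the key automatically; compute-0-0 is an example node name):

    # Generate a personal ssh key pair; you are asked for a passphrase
    ssh-keygen -t rsa
    # The key pair is what later lets you log into compute nodes
    # without retyping a password, e.g.:
    ssh compute-0-0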
Insert-ethers
- Used to integrate appliances into the cluster
Boot a Compute Node in Installation Mode
- Instruct the node to network boot
  - Network boot forces the compute node to run the PXE protocol (Preboot eXecution Environment)
- Can also use the Rocks Base CD
- If there is no CD and no PXE-enabled NIC, you can use a boot floppy built from Etherboot (http://www.rom-o-matic.net)
Insert-ethers Discovers the Node
Insert-ethers Status
eKV: Ethernet Keyboard and Video
- Monitor your compute node installation over the ethernet network
  - No KVM required!
- Execute ssh compute-0-0
Node Info Stored in a MySQL Database
- If you know SQL, you can execute powerful commands
- Rocks-supplied command-line utilities are tied into the database
- E.g., get the hostname for the bottom 8 nodes of each cabinet:
    cluster-fork --query="select name from nodes where rank<9" hostname
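- Because node info lives in an ordinary MySQL database, it can also be queried directly. A hedged sketch, assuming the default Rocks database name "cluster" and an account with read access to it:

    # List every node and its position in its cabinet
    mysql cluster -e "select name, rank from nodes"
    # Hand the same kind of query to cluster-fork to run a command
    # on just the matching nodes
    cluster-fork --query="select name from nodes where rank<9" uptime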
Cluster Database
Kickstart
- Red Hat's Kickstart
  - Monolithic flat ASCII file
  - No macro language
  - Requires forking based on site information and node type
- Rocks XML Kickstart
  - Decomposes a kickstart file into nodes and a graph
  - The graph specifies an OO framework
  - Each node specifies a service and its configuration
  - Macros and SQL for site configuration
  - Driven from a web CGI script
Sample Node File
<?xml version="1.0" standalone="no"?>
<!DOCTYPE kickstart SYSTEM "@KICKSTART_DTD@" [<!ENTITY ssh "openssh">]>
<kickstart>
<description>
Enable SSH
</description>
<package>&ssh;</package>
<package>&ssh;-clients</package>
<package>&ssh;-server</package>
<package>&ssh;-askpass</package>
<post>
<file name="/etc/ssh/ssh_config">
Host *
        CheckHostIP             no
        ForwardX11              yes
        ForwardAgent            yes
        StrictHostKeyChecking   no
        UsePrivilegedPort       no
        FallBackToRsh           no
        Protocol                1,2
</file>
chmod o+rx /root
mkdir /root/.ssh
chmod o+rx /root/.ssh
</post>
</kickstart>
Sample Graph File
<?xml version="1.0" standalone="no"?>
<graph>
<description>
Default Graph for Rocks.
</description>
<edge from="base" to="scripting"/>
<edge from="base" to="ssh"/>
<edge from="base" to="ssl"/>
<edge from="base" to="grub" arch="i386,x86_64"/>
<edge from="base" to="elilo" arch="ia64"/>
<edge from="node" to="base"/>
<edge from="node" to="accounting"/>
<edge from="slave-node" to="node"/>
<edge from="slave-node" to="autofs-client"/>
<edge from="slave-node" to="dhcp-client"/>
<edge from="slave-node" to="snmp-server"/>
<edge from="slave-node" to="node-certs"/>
<edge from="compute" to="slave-node"/>
<edge from="master-node" to="node"/>
<edge from="master-node" to="x11"/>
</graph>
Kickstart Framework
Appliances
- Laptop / Desktop
  - Appliances
  - Final classes
  - Node types
- Desktop IsA
  - standalone
- Laptop IsA
  - standalone
  - pcmcia
- Code re-use is good
Architecture Differences
- Conditional inheritance
  - Annotate edges with target architectures
  - If i386: Base IsA grub
  - If ia64: Base IsA elilo
- One graph, many CPUs
  - Heterogeneity is easy
Compute Node Installation Timeline
Available Rolls
- Area51: Tripwire and rootkit detection
- Condor: high-throughput computing grid package
- IB: Infiniband drivers and MPI from Infinicon
- Intel: compiler and libraries for Intel-based clusters (Scalable Systems)
- Grid: NMI packaging of Globus
- PBS/Maui: job scheduling
- SCE: Scalable Cluster Environment (Thailand)
- SGE: job scheduling
- Viz: easily set up nVidia-based viz clusters
- Java: Java environment
- RxC: graphical cluster management tool (Scalable Systems)
- Lava: workload management (Platform Computing)
- IB-Voltaire: Infiniband drivers and MPI from Voltaire
Futures
Rocks 4.0.0
- Currently in beta
- Based on RHEL 4.0
  - Kernel v2.6
- Using CentOS as the base operating environment
  - CentOS is a RHEL rebuild
  - When asked for a roll, input the stock CentOS CDs
- Implication
  - Opens the door for using any RHEL-based media
    - Official RHEL bits
    - Other RHEL clones (e.g., Scientific Linux)
More Rolls
- Application-specific rolls
  - Oil and Gas
  - Computational Chemistry
  - Rendering
  - Bioinformatics