SLAC Site Report - PowerPoint PPT Presentation

About This Presentation
Title:

SLAC Site Report

Provided by: chuckb6
Learn more at: https://www.racf.bnl.gov

1
SLAC Site Report
  • Len Moss
  • SLAC Computing Services
  • Stanford Linear Accelerator Center

2
Experiment Status
  • BaBar
  • New run startup delayed due to serious electrical
    accident in the Linac.
  • Ramping down Objectivity for data in favor of new
    xrootd format
  • GLAST
  • Starting telescope assembly and cosmic ray
    testing in December
  • Planning 75-node cluster with 25 nodes and 25 TB
    of disk in 2005

3
Storage Expansion

4
BaBar data in HPSS
5
Processor Updates
  • Purchased 300 Sun Fire 20Z systems, due this
    week
  • Dual 2.2 GHz Opterons, 2 GB memory, 1 MB L2 cache
  • 36 GB SCSI drive, extra drive slot available
  • Power management via separate LAN and service
    processor
  • 256 systems for batch farm, replacing 256 Sun
    Netra T1s
  • Remainder for miscellaneous servers
  • Will run 32-bit RHEL 3 kernel
  • Replaced all power supplies on our Rackable Xeon
    systems
  • 384 systems taken off-site by vendor over long
    weekend power outage

6
(No Transcript)
7
Managed systems, including desktops
(Chart not transcribed; legend: Solaris 9, Solaris 8, Solaris 7, RH 9, RH 7.3, RHEL3, RH 7.2, RH 6.2)
8
Linux Status
  • Red Hat Enterprise Linux 3 (RHEL3) rolled out to nearly
    all servers and about 75% of desktops
  • Trying to upgrade all remaining pre-RHEL3
    systems, but
  • BaBar needs RH 7.2 build capacity for some time
  • Lack of support from Fedora Legacy is serious
    concern
  • Will try to restrict RH 7.2 to a few servers

9
Linux Status, contd.
  • Weekly Red Hat phone meetings very useful
  • Have opened about 50 issues, currently about 16
    active
  • Updates
  • Cron job to pull all updates from Red Hat Network
  • Use yum to update onsite systems
  • Provide RHN entitlements to update mobile and
    offsite systems
  • Starting to look at Scientific Linux, so far only
    for a few build and interactive servers

10
Solaris Status
  • Solaris
  • Solaris 9 on most Sun systems
  • 5 remaining Solaris 7 systems
  • Starting to downsize our Solaris batch farm
  • Will look at Solaris 10 soon

11
AFS Issues
  • Support
  • Agreement with Sine Nomine for AFS support on
    Linux and Solaris
  • Windows client support also available
  • Contributed support to Jeff Altman for work on
    Windows client
  • Future projects may include AFS device driver
  • Encryption
  • Recently turned on encryption for all clients
    except batch workers

12
Kerberos 5
  • Recently switched to Heimdal Kerberos 5 KDCs on
    AFS DB servers
  • Fully K4 and AFS (kaserver) compatible
  • No problems at cutover; users never noticed
  • Clients still mostly AFS (klog, et al.) but
  • Gradually rolling out PAM module
  • Most admin Perl scripts using Heimdal::Kadm5
  • SLAC's K5 realm info in DNS

13
Request Tracker (RT)
  • Replaced pure-email system with RT3
  • Open source, support available from Best
    Practical
  • Web interface for admins, users still use email
    (for now)
  • MySQL backend, DB growing at 0.5 MB/day
  • Average of 25 new tickets per weekday
  • Interface and work flow can be extensively
    customized
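
The two RT figures above can be turned into a rough yearly estimate. This is only a back-of-the-envelope sketch: it assumes the 0.5 MB/day growth applies to every calendar day, that there are 52 five-day weeks in a year (holidays ignored), and that all database growth comes from new tickets. The function names are illustrative, not part of RT.

```python
# Figures taken from the slide.
DB_GROWTH_MB_PER_DAY = 0.5
TICKETS_PER_WEEKDAY = 25

# Assumptions, not from the slide.
DAYS_PER_YEAR = 365          # growth assumed steady on every calendar day
WEEKDAYS_PER_YEAR = 52 * 5   # holidays ignored


def yearly_db_growth_mb():
    """Estimated database growth over one year, in MB."""
    return DB_GROWTH_MB_PER_DAY * DAYS_PER_YEAR


def yearly_tickets():
    """Estimated number of new tickets over one year."""
    return TICKETS_PER_WEEKDAY * WEEKDAYS_PER_YEAR


def mb_per_ticket():
    """One calendar week of growth spread over one working week of tickets."""
    return (DB_GROWTH_MB_PER_DAY * 7) / (TICKETS_PER_WEEKDAY * 5)


if __name__ == "__main__":
    print(f"DB growth per year: {yearly_db_growth_mb():.1f} MB")
    print(f"New tickets per year: {yearly_tickets()}")
    print(f"Approx. MB per ticket: {mb_per_ticket():.3f}")
```

Under these assumptions the database grows by well under 200 MB a year, which suggests the MySQL backend is not a capacity concern at this ticket rate.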

14
Platform LSF
  • Now using LSF to manage a 64-node MPI cluster
  • Investigating LSF to manage Windows and Mac OSX
    clusters
  • Platform now has world-wide HEP terms available
  • Unit buy-in cost depends on total HEP licenses
  • Annual support depends on cluster size, but is
    tiered and capped

15
New Projects
  • KIPAC
  • Mac OSX cluster
  • 10 workers, managed by LSF, and 4 servers
  • (see Chuck's talk later this week)
  • Plan to buy a large SMP
  • Probably SGI Altix (Itanium, SUSE Linux)
  • Research project: Huge Memory Machine
  • Pilot phase: 64-node AMD Opteron cluster with 0.5
    to 1.0 TB of memory
  • (see Chuck's talk later this week)

16
Windows Storage
  • Windows storage at 8 TB and doubling every year
    (faster than Moore's Law)
  • Quotas implemented using Veritas StorageCentral
  • User space allocated in 500 MB chunks up to 2 GB
  • Initial group space set to 10 GB or 10% above
    current use
  • Group space will grow equally over time
  • Groups with larger needs can purchase additional
    space in TB chunks
  • Veritas CommandCentral procured to manage
    storage
  • EMC CX600, Hitachi 9980, Sun 6120, EMC AX100,
    Emulex 9002 and 9802 HBAs
  • Brocade 3800 and 3900 switches
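
The quota rules and growth rate on this slide can be sketched as simple arithmetic. The sketch below is illustrative only and is not how Veritas StorageCentral is configured. It assumes that "2 GB" means four 500 MB chunks (2000 MB), that the initial group quota is the larger of 10 GB and current use plus 10 percent, and that total storage doubles exactly once per year from 8 TB. All function names are hypothetical.

```python
import math

USER_CHUNK_MB = 500   # from the slide
USER_CAP_MB = 2000    # assumption: "2 GB" taken as four 500 MB chunks


def user_quota_mb(requested_mb):
    """Round a request up to whole 500 MB chunks, capped at the 2 GB limit."""
    chunks = max(1, math.ceil(requested_mb / USER_CHUNK_MB))
    return min(chunks * USER_CHUNK_MB, USER_CAP_MB)


def initial_group_quota_gb(current_use_gb):
    """Assumed rule: the larger of 10 GB and 10 percent above current use."""
    return max(10.0, current_use_gb * 1.10)


def projected_storage_tb(years, start_tb=8.0):
    """Total storage after a number of years if it doubles every year."""
    return start_tb * 2 ** years


if __name__ == "__main__":
    print(user_quota_mb(501))            # a 501 MB request needs two chunks
    print(initial_group_quota_gb(4.0))   # small groups get the 10 GB floor
    print(projected_storage_tb(3))       # three doublings from 8 TB
```

At a doubling per year, three years of growth would take the 8 TB figure to 64 TB, which is the scale the quota scheme and the move to selling space in TB chunks have to accommodate.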

17
Windows Backup
  • Snapshot LUNs, Exchange 2003 and MS SQL storage
    to EMC AX100 SATA disk using Veritas Storage
    Foundation and FlashSnap and MS VSS
  • Veritas NetBackup then archives to STK L180 LTO
    library
  • Expect to be able to recover by mounting a
    snapshot volume on the EMC AX100s, full recovery
    within 4 hours
  • Should be online by Q1 2005

18
Other Windows Projects
  • Migrated AD from Windows 2000 to Windows 2003
  • Using Thursby's ADmitMac
  • Windows Dfs access for Mac OSX users
  • Uses Windows Kerberos authentication
  • Plan to implement SpySweeper EE anti-spyware for
    Windows desktops

19
Other Windows Projects, contd.
  • Investigating Windows XP SP2
  • Currently blocked via GPO
  • Will probably treat more like a new OS rather
    than a Service Pack
  • Investigating Firewall Authenticated Bypass to
    manage systems that have the firewall applied
  • Windows questions?
  • Send mail to Brian Scott, btscott_at_slac.stanford.edu