Title: Working Effectively at NERSC


1
  • Working Effectively at NERSC
  • In-Depth Advice for New Users
  • October 6, 2003
  • Thomas M. DeBoni
  • NERSC User Services Group
  • TMDeBoni@LBL.GOV
  • 510-486-8617

2
Topics
  • IBM SP system
  • About seaborg
  • Hardware
  • Resources
  • Security
  • Using seaborg
  • Interactively
  • Batch jobs
  • Programming, optimization, and performance
  • HPSS Mass Storage Systems
  • About HPSS
  • Hardware and file systems
  • Using HPSS
  • Security and access utilities

3
IBM SP System - seaborg
  • A capability system
  • Intended for large jobs, rather than mass
    throughput
  • Goals are high availability, high utilization,
    and effective shared use by a large user community
  • This means the system is mostly batch-oriented
  • Some resources are dedicated to software
    development, pre- and post-processing,
    interaction with servers and mass storage, remote
    logins, etc.
  • So some interactive and fast-turnaround usage is
    possible, too

4
SP Hardware, 1
  • IBM SP seaborg.nersc.gov

5
SP Hardware, 2
  • Each Nighthawk II node contains
  • 16 Power3 processors
  • 375 MHz, 1.5 GF/s peak performance per processor
    (4 flops/cpu/clock, via the FMA floating-point
    multiply-add instruction)
  • L1 cache: 64 KB data, 32 KB instructions
  • L1 line size: 128 bytes
  • L2 cache: 8192 KB
  • Colony switch: two adapters/node
  • Most nodes contain 16 GB memory
  • 64 nodes have 32 GB
  • 4 nodes have 64 GB memory

6
Useful SP References
  • IBM Power 3 Documentation
  • http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg245155.html?Open
  • NERSC Guide for New Users
  • http://hpcf.nersc.gov/help/new_user/
  • NERSC Document on Running on seaborg
  • http://hpcf.nersc.gov/computers/SP/running_jobs/

7
SP User Resources, 1
  • GPFS (General Parallel File System) is used more
    extensively on seaborg than on any other SP
  • HOME
  • Quota: 10 GB, 15000 inodes
  • Not backed up
  • SCRATCH
  • Quota: 250 GB, 50000 inodes
  • Persistent, but not permanent (may be purged)
  • The best target for parallel I/O by large-scale,
    high-performance programs
  • Special-request, temporary expansions of scratch
    space are possible, but are reviewed by management
  • Not backed up

8
SP User Resources, 2
  • Use the myquota command to see where you stand with
    regard to your limits
  • myquota
                 ------- Block (MB) -------    --------- Inode ---------
    FileSystem    Usage    Quota  InDoubt      Usage    Quota  InDoubt
    ----------  -------  -------  -------    -------  -------  -------
    /u4            4525    10240       75       6038    15000      175
    /scratch        388   256000        0        143    50000        0
  • Please use only the HOME and SCRATCH file systems;
    others exist, but they are required to keep the
    system running, and their overuse may crash nodes

9
SP Security, 1
  • Newly assigned passwords are single-use only and
    must be changed on first use
  • Password and shell changes are made on the special
    seaborg node sadmin.nersc.gov
  • Connect via ssh, using your initial password
  • Use the passwd or chsh command (example below)
  • You should be disconnected when the password change
    is done
  • The change must propagate to all nodes; this usually
    takes no more than an hour (propagation usually
    happens at ten minutes after the hour)
  • Support initializes new accounts and passwords;
    afterwards, consultants can reset passwords
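  • For example, a first password change might look like
    this (u123 is a hypothetical username)
  • ssh u123@sadmin.nersc.gov
  • passwd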

10
SP Security, 2
  • Use SSH to connect to login nodes (examples below)
  • NERSC recommends OpenSSH
  • Keep it up to date
  • Terminal session connections should be made to
    seaborg.nersc.gov
  • Do not connect to specific login nodes
  • Do not connect directly to any compute nodes
  • No incoming ftp is allowed
  • Use scp or sftp
  • Outgoing ftp is allowed
  • Don't expose cleartext passwords
  • Don't share accounts
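  • For example (u123 is a hypothetical username and
    /scratch/u123 a hypothetical scratch path)
  • ssh u123@seaborg.nersc.gov
  • scp results.dat u123@seaborg.nersc.gov:/scratch/u123/
  • sftp u123@seaborg.nersc.gov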

11
Interactive Use of seaborg, 1
  • Everything on seaborg is done in terms of nodes
    - units of compute resources
  • A node is a shared-memory computer
  • 16 cpus
  • 16, 32, or 64 GB RAM
  • Access to GPFS (parallel file system), switch,
    and networks
  • You are charged for all 16 cpus in each node you
    use, except for login sessions
  • A one-cpu batch job costs as much as a 16-cpu
    parallel batch job
  • 17 cpus cost as much as 32, etc.

12
Interactive Use of seaborg, 2
  • Two node pools available to users
  • Login nodes (6)
  • Terminal sessions, interactive serial jobs
  • Compute nodes (380)
  • Everything else
  • There are also GPFS nodes, network nodes, and
    spare nodes
  • Good general reference
  • http://hpcf.nersc.gov/computers/SP/

13
Interactive Use of seaborg, 3
  • Login sessions are the same as on any other Unix
    system; no need to specify or know which node
    you're on
  • Serial code execution is easy
  • ./a.out
  • Limits: 128 MB, 3600 CPU seconds
  • Parallel execution is slightly harder
  • poe ./a.out -nodes 2 -procs 32
  • Limits: 8 nodes (128 processors), 30 minutes
  • These limits are intentional; large/long runs
    should be run as batch jobs
  • Note: Use of poe is optional if the job was compiled
    for parallel execution

14
Interactive Use of seaborg, 4
  • When you use POE (explicitly or implicitly), you
    get resources from LoadLeveler, which manages the
    compute nodes
  • POE gets these resources from the INTERACTIVE
    class
  • You may not succeed if resources are not
    immediately available:
  • llsubmit: Processed command file through Submit
    Filter "/usr/common/nsg/etc/subfilter".
  • ERROR: 0031-365 LoadLeveler unable to run job,
    reason:
  • LoadL_negotiator: 2544-870 Step
    s00509.nersc.gov.199123.0 was not considered to
    be run in this scheduling cycle due to its
    relatively low priority or because there are not
    enough free resources.

15
Batch Use of seaborg
  • You need to submit your run as a batch job if
  • You want to make sure your job runs
  • You need more resources (cpu, memory) than the
    interactive class allows
  • You need longer run times than the interactive
    class allows
  • You don't want to share your CPU with other users
  • You can monitor a batch job while it runs
  • Use llqs, and watch the files in SCRATCH (see the
    example below)
  • There is no straightforward way to "steer" a
    batch job
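  • For example, to check queue status and watch a
    growing output file (hypothetical username and file
    name)
  • llqs -u u123
  • tail -f /scratch/u123/myjob.out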

16
Batch Execution, 1
  • LoadLeveler is your friend
  • llsubmit, llqs, llcancel

17
Batch Execution, 2
  • When you think ...        It really means ...
  •   Class                   Queue
  •   Node                    16 processors
  •   Process                 Processor
  •   Task                    Processor
  •   Execution time          Wall-clock time
  •   Thread                  Loop slice (usually)
  •   Queue limits            TODAY's queue limits
  • These definitions can be deviated from, but not
    usually to any useful effect
  • E.g., running more than 16 threads or tasks per
    node

18
Batch Execution, 3
  • Batch jobs up to 8 hours long are safe from
    system downtime
  • 8-hour warning of scheduled downtimes
  • Jobs will not be allowed to begin execution after
    T(downtime) - 8:00:00
  • Exception: the backfill class
  • Batch jobs longer than 8 hours, and backfill-class
    jobs, will be killed at a scheduled system
    downtime
  • You will be charged
  • Protect yourself with frequent checkpoints
  • There are no refunds given on seaborg

19
Batch Execution, 4
  • Policies exist governing how jobs may be
    submitted, how classes may be used, etc.
  • Premium class may be useful in urgent situations
  • Just before a publication deadline
  • Just before a major meeting
  • Charges accrue rapidly
  • New users and students tend to overuse this class
  • There are no refunds given on seaborg
  • NERSC Batch Policy Guide
  • http://hpcf.nersc.gov/computers/SP/running_jobs/batch.html#policy

20
Batch Execution, 5
  • Misbehaving jobs will be killed
  • Misbehavior means anything that inhibits other
    jobs from using their fair share of the machine,
    such as
  • Generating LOTS of files (e.g., filling system
    logs)
  • Making LOTS of calls to system()
  • Doing LOTS of small-block I/O
  • Doing LOTS of small-message communication
  • Chained or self-submission to short-runtime
    classes
  • Using compute nodes for interactive work
  • There are no refunds given on seaborg

21
Batch Execution, 6
  • Misbehavior does not (necessarily) mean
  • Idling your processors during serial work
  • Doing inefficient calculation
  • Doing inefficient I/O
  • Doing inefficient intertask communication
  • There are no refunds given on seaborg

22
Batch Scripting, 1
  • A batch job consists of a script file and its
    computational requirements
  • Codes to execute
  • Files to manipulate
  • Shell commands to execute
  • A batch script is a shell script with some
    initial comment lines that are significant to
    LoadLeveler
  • The LoadLeveler lines characterize the job and
    its resource needs
  • Submit a job by naming its script file in a
    submission command
  • llsubmit myjob

23
Batch Scripting, 2
  • cat myjob
  • # @ job_name = myjob                     (optional)
  • # @ account_no = repo_name               (optional)
  • # @ requirements = (Memory >= 65536)     (optional)
  • # @ output = myjob.out                   (advised)
  • # @ error = myjob.err                    (advised)
  • # @ environment = COPY_ALL               (advised)
  • # @ notification = complete              (advised)
  • # @ network.MPI = csss,not_shared,us     (default)
  • # @ node_usage = not_shared              (default)
  • # @ job_type = parallel                  (necessary)
  • # @ class = regular                      (necessary)
  • # @ tasks_per_node = 16                  (necessary)
  • # @ node = 4                             (necessary)
  • # @ wall_clock_limit = 01:00:00          (necessary)
  • # @ queue                                (necessary)
  • ./a.out < input_file > output_file

24
Batch Scripting, 3
  • Batch (shell) scripts can be, essentially,
    complete programs, containing
  • Sequential commands
  • Conditional operations
  • Loops
  • They can require debugging
  • They can be executed interactively
  • Large parallel executions will not occur if
    resource requirements exceed interactive limits
  • Good advice: keep it simple (a minimal sketch of
    such script logic follows)
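  • A minimal sketch of such logic, placed after the
    last LoadLeveler keyword line (the input file names
    are hypothetical)
  • for f in run1.in run2.in ; do
  •     poe ./a.out < $f > $f.out
  •     if [ $? -ne 0 ] ; then echo "failed on $f" ; exit 1 ; fi
  • done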

25
Batch Jobs, 1
  • Potentially useful things to do in a batch job
    include (a sketch follows this list)
  • Moving files to/from storage
  • Moving files to/from SCRATCH
  • Creating and listing directories
  • Renaming files to identify their origins
    (appending dates, times, etc.)
  • Echoing messages for audit-trail purposes
  • Writing restart files
  • Checking command completion status
  • Moving files to/from other systems
  • Multiple code executions
  • But watch out for expensive serial operations
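  • A hedged sketch of such a staging pattern, assuming
    hypothetical file names and the hsi utility described
    later in this talk
  • hsi "get big_input.dat"
  • ./big_job.x < big_input.dat > out.dat
  • mv out.dat out.dat.`date +%Y%m%d`
  • hsi "put out.dat.`date +%Y%m%d`"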

26
Batch Jobs, 2
  • Potentially troublesome things to do in a batch
    job include
  • Fetching files from storage (slow tape mounts)
  • Moving files to/from other systems (slow
    network transfers)
  • Compilation
  • Batch file editing
  • Post-processing output data
  • I/O to HOME (might exceed quotas)
  • Steering - anything requiring interaction with
    the job
  • Anything significant that uses less than your
    full set of parallel resources

27
Monitoring Batch Jobs, 1
  • llq
  • llqs output is formatted more nicely
  • llqs -u username
  • Status (ST) field values:
  • R  - Running
  • I  - Idle
  • NQ - Not Queued
  • ST - Starting
  • RP - Remove Pending
  • HU - User Hold
  • HS - System Hold

28
Monitoring Batch Jobs, 2
  • s00513 239> llsubmit moldy.scr
  • --------------------------------------------------------------
  •  User: deboni                       Repo: mpccc
  •  Job Name: moldyjob.216e.26         Group: mpccc
  •  Class Of Service: debug            Job Class: debug
  •  Job Accepted: Mon Aug 25 16:21:51 2003
  • --------------------------------------------------------------
  • llsubmit: Processed command file through Submit
    Filter "/usr/common/nsg/etc/subfilter".
  • llsubmit: The job "s00613.nersc.gov.69612" has
    been submitted.
  • s00513 240> llqs -u deboni
  • Step Id         JobName     UserName  Class  ST  NDS  TK  WallClck  Submit Time
  • --------------  ----------  --------  -----  --  ---  --  --------  -----------
  • s00613.69612.0  moldyjob.2  deboni    debug  I     4  16  00:29:00  8/25 16:21
  • s00513 241> llqs -u deboni
  • Step Id         JobName     UserName  Class  ST  NDS  TK  WallClck  Submit Time

29
Programming seaborg, 1
  • Languages
  • IBM's compilers are organized into compiler sets,
    with separate front ends (names)
  • Fortran 77, 90, 95, HPF: xlf, pghpf
  • C: xlc, gcc
  • C++: xlC, KCC
  • Some exist in multiple versions (xlf 7, xlf 8)
  • Some have dubious futures (KCC)
  • Some compile different language versions (C++)
  • Special versions for shared memory: xlf_r, xlc_r,
    etc.
  • Special versions for MPI: mpxlf90, mpCC, etc.
  • Note: Some of us recommend routine use of the
    _r versions, since they are compatible with
    64-bit addressing

30
Programming seaborg, 2
  • 32-bit addressing is the default
  • Gives your code access to a 2 GB address space
  • You must specify your heap and stack sizes during
    compilation
  • 64-bit addressing is available
  • Gives your code access to all the RAM on the
    large-memory nodes
  • No heap or stack size specs needed
  • Code must be fully recompiled, using the _r
    compilers, and relinked with 64-bit libraries
    (a compile sketch follows)
  • NERSC Guide to SP Memory Management
  • http://hpcf.nersc.gov/software/ibm/sp_memory.html
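  • A minimal sketch of the two modes (the heap/stack
    sizes shown are assumed example values; see the guide
    above for current recommendations)
  • 32-bit: xlf90_r -o a32.x code.f -bmaxdata:0x70000000 -bmaxstack:0x10000000
  • 64-bit: xlf90_r -q64 -o a64.x code.f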

31
Programming seaborg, 3
  • Parallelism
  • Shared memory with Pthreads, OpenMP, and IBM SMP
    directives
  • Single-node only
  • Distributed memory with MPI, LAPI
  • Single and multiple nodes
  • Max 4096 tasks (cpus)
  • Hybrid, with both OpenMP and MPI (a build/launch
    sketch follows)
  • Max 6080 CPUs (entire compute pool)
  • http://hpcf.nersc.gov/computers/SP/programming.html
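  • A hedged sketch of a hybrid build and launch,
    assuming a split of 4 MPI tasks x 4 OpenMP threads on
    one 16-cpu node
  • mpxlf90_r -qsmp=omp -O3 -o hybrid.x code.f
  • export OMP_NUM_THREADS=4
  • poe ./hybrid.x -nodes 1 -tasks_per_node 4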

32
Programming seaborg, 4
  • Libraries
  • Math
  • ESSL, PESSL, MASS, WSMP, IMSL, NAG, NAG-SMP,
    NAG-MPI, LAPACK, PARPACK, SuperLU, ccSHT, FFTW,
    ACTS (Aztec, PETSc, ScaLAPACK), Sparse Solvers,
    Random Numbers
  • Graphics
  • NCAR
  • I/O
  • netCDF, HDF, HDF5, MPI I/O
  • Physics
  • CERNLIB
  • A number of canned applications are also
    available (Gaussian 98, NWChem, etc.)
  • http://hpcf.nersc.gov/software/ibm/

33
Debugging on seaborg
  • Debugging - don't optimize an incorrect program
  • You may need to compile with the -g or -G options
  • TotalView - a visual debugger for serial and
    parallel programs
  • http://hpcf.nersc.gov/software/ibm/totalview.php
  • pdbx - an IBM text-based parallel debugger
  • http://hpcf.nersc.gov/vendor_docs/ibm/pe/am103mst12.html#HDRUPDBX
  • gdb - the GNU debugger
  • http://hpcf.nersc.gov/software/tools/GNU.html
  • Assure - a source tool for checking OpenMP usage
  • http://hpcf.nersc.gov/software/tools/kap.html
  • ZeroFault - a tool for analyzing memory usage in
    running code
  • http://hpcf.nersc.gov/software/tools/zerofault.html

34
Optimizing on seaborg
  • Optimization is essential
  • The compilers can do a lot for you, but you may
    have to experiment to find the best set of options
    (an example compile line follows this list)
  • -O<n>: start at -O3, try -O4, -O5
  • -qstrict: strict arithmetic (highly advised)
  • -qarch=pwr3: specific to seaborg (highly advised)
  • -qtune=pwr3: specific to seaborg (highly advised)
  • -qhot: high-order transforms (try it)
  • -qipa: interprocedural analysis (try it)
  • -qessl and -lessl, -lmass: high-performance
    libraries for intrinsics and ordinary arithmetic
    (try them)
  • Note: By default, arithmetic on seaborg is
    slightly better than the IEEE standard; this can be
    overridden, if desired, for compatibility with
    other machines: -qfloat=nomaf
  • http://hpcf.nersc.gov/computers/SP/options.html
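  • For example, a reasonable starting compile line
    combining the advice above (code.f is a hypothetical
    source file)
  • xlf90_r -O3 -qstrict -qarch=pwr3 -qtune=pwr3 -o a.out code.f -lessl -lmass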

35
Tuning Code on seaborg, 1
  • Measure your code's performance
  • Processor performance counters are built into the
    cpu chips, and are accessible in sets
  • Access these counters through three interfaces
  • hpmcount - a preamble command
  • hpmcount a.out
  • poe hpmcount a.out -nodes x -procs y
  • poe+ - a special utility for aggregating the
    counts for parallel jobs
  • poe+ a.out -nodes x -procs y
  • hpmlib - a library for instrumenting regions of
    code
  • http://hpcf.nersc.gov/software/ibm/hpmcount/

36
Tuning Code on seaborg, 2 - poe+ Output from a 4-Node Run

  • hpmcount (V 2.4.2) summary (aggregate of 64 POE tasks)
  • Average execution time (wall clock time):    976.545 seconds
  • Average amount of time in user mode:         964.780313 seconds
  • Average amount of time in system mode:       1.516094 seconds
  • Total maximum resident set size:             0.525768 Gbytes
  • Total shared memory use in text segment:     1476910504 KBytes*sec
  • Total unshared memory use in data segment:   51641993652 KBytes*sec
  • PM_CYC (Cycles):                             23086152670268
  • PM_INST_CMPL (Instructions completed):       26765551984318
  • PM_TLB_MISS (TLB misses):                    7839521349
  • PM_ST_CMPL (Stores completed):               4334617829585
  • PM_LD_CMPL (Loads completed):                10148999201076
  • PM_FPU0_CMPL (FPU 0 instructions):           3387695961706
  • PM_FPU1_CMPL (FPU 1 instructions):           2085096714987
  • PM_EXEC_FMA (FMAs executed):                 2695124470750
  • Utilization rate:                            98.48678125 %
  • Avg number of loads per TLB miss:            3198.709015625
  • Load and store operations:                   14483617.031 M

37
Tuning Code on seaborg, 3
  • How good is good performance?
  • Peak floating-point rate is 1.5 Gflips/processor,
    or 24 Gflips/node
  • This is not achievable, as memory cannot keep up
    with the operand demand it would generate
  • You should be able to get
  • 100 Mflips/processor with compilation options
  • 200 - 300 Mflips/processor with the right library
    choice
  • 400 - 500 Mflips/processor with careful
    engineering
  • 500+ Mflips/processor with heroic effort
  • Targets for optimization include
  • Memory - strides and cache use
  • Virtual memory - TLB use
  • Communications organization
  • I/O organization
  • The result may be a code highly customized for
    our SP system

38
Tuning Code on seaborg, 4
  • Measurement in depth
  • Profile your code with xprofiler
  • http://hpcf.nersc.gov/software/ibm/xprofiler/
  • Measure your code's MPI performance with vampir
  • http://hpcf.nersc.gov/software/tools/vampir.html
  • Analyze and tune code performance with tau
  • http://acts.nersc.gov/tau/at-nersc.html
  • Analyze execution traces with paraver, dimemas
  • http://hpcf.nersc.gov/software/tools/cepba.html

39
A Word About Modules
  • module - a Unix utility for managing libraries,
    search paths, and environment variables used in
    compilation and loading
  • Allows easy management of software packages in
    the face of evolving file systems, etc.
  • Most seaborg modules and the module utility are
    maintained by User Services Group
  • Modules are in use on all NERSC computers
  • Software packages, utilities, libraries, etc. are
    installed into modules which can be made
    available with a single command
  • Loading a module makes it available, and you
    need not know where the software components are
    stored
  • Obviates hard-coded paths to software in your
    makefiles
  • Useful module commands (example below)
  • module avail - shows all installed software
    packages
  • module load module_name - loads the named package
  • module list - shows all loaded packages
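  • For example (netcdf is used here only as a
    hypothetical package name; module avail shows what is
    actually installed)
  • module avail
  • module load netcdf
  • module list
  • After the load, the package's paths and environment
    settings are in effect for the current session, with
    no hard-coded paths needed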

40
HPSS Mass Storage Systems
  • HPSS: High Performance Storage System
  • Designed/developed by government/industry
    consortium (including IBM)
  • Used at NERSC for archival storage
  • 35 TB disk cache, 8.5 PB tape storage
  • Connects to NERSC systems at up to 150 MB/sec.
  • 4 TB/day transferred in or out
  • Two systems: archive.nersc.gov, hpss.nersc.gov
  • Access
  • On-site or off-site
  • Use hsi, ftp, or pftp
  • No clear-text passwords allowed
  • Don't use full domain names for on-site access
  • Accounts
  • Allocations by Storage Resource Units (SRUs)
  • http://hpcf.nersc.gov/storage/hpss

41
About HPSS, 1
  • archive, a.k.a. the user system, is primarily
    intended for NERSC users
  • hpss, the other system, is primarily intended for
    disaster recovery, full file system backups, etc.
  • You have space on both, but archive has more
    capacity and will likely be less busy and more
    responsive
  • HPSS is not a normal Unix file system; it is a
    separate ensemble of systems accessed by special
    utilities:
  • a hierarchy of disk and tape hardware
  • a database subsystem to keep track of files,
    tapes, disk caches, etc.
  • Access times are unpredictable, due to tape mount
    latency, but are typically short; be careful in
    batch jobs

42
About HPSS, 2
  • HPSS is not a normal Unix file system
  • It should not be used as an I/O target of running
    codes (difficult to do, in any case)
  • It can be used as an I/O target of a batch job
  • Prefetch files (potentially dangerous, due to
    tape mount latency)
  • Post-store files (good idea, but allow a bit of
    extra time in a batch job to complete it)
  • It has very fast network connections to all NERSC
    computers, but is not connected as an I/O device
  • It can do third-party transfers
  • It can do simultaneous connections to, and
    transfers between, multiple HPSS systems or sites

43
About HPSS, 3
  • There are no backups of HPSS; it is the backup
  • Via normal permissions
  • Via project directories
  • Project directories are available on request
  • A number of users need to share files
  • A special group is created, and they are made
    members of it
  • A special directory, owned by that group, is
    created in HPSS
  • The group members can move files into it, and
    share them there
  • Somebody pays for this usage, usually the
    requester

44
About HPSS, 4
  • HPSS is undergoing more or less constant upgrades
    and improvements
  • System software
  • Access utilities
  • Tape drives
  • Tapes
  • This tends to keep the NERSC systems ahead of
    demand
  • There is a regular debug/maintenance period every
    Tuesday from 10:00 AM to 12:00 noon
  • Access to HPSS will stall during this period
  • Watch out for this in batch jobs!
  • The hpss_avail inquiry utility tests for
    availability (a sketch follows)
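  • A hedged sketch of guarding an HPSS step in a batch
    script; whether hpss_avail takes a system name as an
    argument is an assumption, so check its usage on
    seaborg first
  • if hpss_avail archive ; then
  •     hsi "put out.dat"
  • else
  •     echo "HPSS unavailable - skipping archive step"
  • fi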

45
Using HPSS, 1
  • HPSS system software does not allow shells
  • No incoming ssh is possible (so, no sftp or scp)
  • Incoming ftp is allowed
  • Cleartext login names and passwords are refused
  • In-house access utilities (hsi, pftp)
    auto-authenticate after credentials are initialized
  • NERSC uses DCE authentication on HPSS
  • Support assigns initial passwords; afterwards,
    consultants can reset them
  • All authentication services are provided by a
    special authentication server, auth.nersc.gov
  • Change passwords by connecting to the special
    authentication server (see next slide)
  • There is no delay in usability of new authentication
    info, once it is set up or changed
  • http://hpcf.nersc.gov/storage/hpss/passwords

46
Using HPSS, 2
  • Access the authentication server with
  • ssh -l auth auth.nersc.gov
  • Use the password exposed by "module load www" on any
    NERSC computer
  • Change your DCE password using the chpass command
  • Get a combo string using the ftppass command
  • Store combo strings in a .netrc file to allow
    auto-authentication; set file permissions to 600
  • http://hpcf.nersc.gov/storage/hpss/ftp_nopass.html

47
Using HPSS, 3
  • Example .netrc file
  • ls -l .netrc
  • -rw-------  1 fubar  ccc  518 Apr 17  2002 .netrc
  • cat .netrc
  • machine hpss.nersc.gov
  •   login 0X2g19BrJ2tGSDF72j9NMHs5jS5Cwn2Bdlxn4zKisrk
  •   password 0X2g19BrJ2tGSDF72j9NMHs5jS5Cwn2Bdlxn4zKisrk
  • machine archive.nersc.gov
  •   login 0R0gH3BrJ2tGSDF72j9NMHs5jS5Cwn2Bdlxn4zKisrk
  •   password 0R0gH3BrJ2tGSDF72j9NMHs5jS5Cwn2Bdlxn4zKisrk

48
Using HPSS, 4
  • Access utilities
  • From outside NERSC, use ftp
  • Special authentication: encrypted combo strings
    are used for login name and password
  • Each combo string is good for use on one remote
    machine; as many combo strings can be generated
    as are needed
  • From inside NERSC, ftp may be an alias for pftp
  • A locally produced parallel version of ftp
  • Familiar command structure and syntax
  • Auto-authenticates, once credentials are
    initialized
  • Generate credentials by connecting and logging in
    normally
  • ftp -l archive
  • http://hpcf.nersc.gov/storage/hpss/ftpaccess

49
Using HPSS, 5
  • Access utilities
  • hsi
  • Uses DCE authentication (can use other forms)
  • Connects to archive by default
  • Rich, powerful, convenient command set
  • Usable as an interactive session or a one-line
    command (example below)
  • Efficient at recursive and metadata operations
  • get -R, put -R, cp -R
  • chmod, chown, mv, mdel
  • Allows simultaneous multi-site connections and
    transfers
  • Generate credentials for auto-authentication
  • hsi -l
  • http://hpcf.nersc.gov/storage/hpss/hsiaccess
  • http://hpcf.nersc.gov/storage/hpss/hsi/
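  • For example, several operations in a single one-line
    session (hypothetical file and directory names)
  • hsi "mkdir run42 ; put out.dat : run42/out.dat ; ls -l run42"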

50
Using HPSS, 6
  • Dos and Don'ts of HPSS
  • Don't open and close an access session within a
    loop; this beats up on the servers
  • Do multiple operations in a single session; e.g.,
    build a list of files to access, and get/put them
    all in one session
  • Don't store lots of small files; HPSS is
    optimized for large files
  • Do aggregate small files into larger ones for
    storage; e.g., tar can be used with hsi in a Unix
    pipe for reads or writes (see man hsi for
    details, and the sketch after this list)
  • Don't use ftp to move files around within HPSS
  • Do use hsi to rename, move, or change permissions
  • Don't let others access your HPSS space directly
  • Do tell us if you have special sharing needs
  • Broad hierarchies are (arguably) more efficient
    than deep ones
  • Watch out for name collisions from truncation
    (HPSS allows longer names than some Unix systems)
  • Watch out for your gets and puts - avoid
    accidental overwrites
  • Large recursive operations can fail from resource
    exhaustion
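  • A hedged sketch of the tar-through-hsi aggregation
    mentioned above (run_dir is a hypothetical directory;
    see man hsi for the exact pipe form, and reads work
    similarly)
  • tar cf - run_dir | hsi "put - : run_dir.tar"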
