1
Storage and Distributed Work
  • Rafael Marco, Celso Martinez, David Rodriguez,
    Oscar Ponce
  • Presented by Jesus Marco
  • IFCA Santander (Spain)
  • 23 April 2002
  • Meeting in Lyon

2
Issues?
  • Network storage vs local storage
  • Distributed file systems
  • Databases
  • Distributed work and distributed storage
  • A practical point of view...

3
Local on-line Storage
  • Traditional disks
  • Low cost (150 GB - 300 euros)
  • EIDE disk (Ultra ATA 100 MB/s)
  • Now also FireWire or USB2
  • Medium cost (150 GB - 1500 euros)
  • SCSI disk (Ultra3, 160 MB/s)
  • SCSI Fibre channel disk
  • RAID configurations for data servers
  • EIDE RAID
  • SCSI RAID

4
Network Storage
  • NAS storage systems
  • Simple to sophisticated servers
  • File access through Ethernet (Fast or Gigabit)
  • SAN storage systems
  • Fibre channel based
  • Record access through SCSI
  • iSCSI
  • Ethernet based, but using the SCSI command set
  • RAID for data servers
  • Gigabit cable/fibre ethernet vs Gigabit optical

5
Distributed File Systems
  • NFS
  • AFS, Coda (and /grid)
  • CXFS
  • GPFS
  • Shared-Disk File System for Large Computing
    Clusters
  • Recently released for IBM x-series based clusters
    (previously for i-series)
  • Data striping
  • Distributed locking and recovery technology
  • Supports fully parallel access both to file data
    and metadata

6
Files and Databases
  • Hardware (disk) provides Sequential vs Direct
    access
  • Files
  • Direct for header, then sequential stream of bits
    (but jump if structure known)
  • Distributed access
  • Distributed file system
  • Replica catalog + FTP
  • Databases
  • Provide record information (rows/object)
  • Directly from disk (raw data for allocated disk)
  • Or from chunk files
  • User sees direct access to each record
  • Distributed databases
  • Proprietary solutions (partitioning, replication)
  • Ad-hoc catalog

7
Distributed Work
  • Example 1: local (long) processing, local
    storage, pre/post-execution transfer of data
    (HTC)
  • Typical HEP simulation and reconstruction jobs
  • Local Storage
  • Not critical (I/O moderate compared to processing
    time)
  • Long Term/ Catalog Storage
  • Low cost if possible, virtual staging to tapes
  • Good bandwidth for replication/distribution

8
Distributed Work
  • Example 2: distributed processing on distributed
    storage
  • Typical HEP analysis
  • Local Storage
  • Critical (I/O rate is important: process 1 TB in
    5 minutes)
  • Possibilities
  • Distributed file system on NAS or SAN with
    Gigabit
  • Expensive (? Not for EIDE RAID NAS? 1 TB ~ 5000
    euros?)
  • Not possible (?) across WAN
  • Local storage + catalog
  • Numbers
  • A 1 TB transfer over Gigabit takes more than an
    hour, so files must already be close to the
    processing nodes
  • Storage nodes (or elements)
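
The numbers on this slide can be checked with a short calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes), a link running at full line rate, and no latency or protocol overhead; the function names are illustrative and not part of the original slides.

```python
# Back-of-the-envelope I/O figures (decimal units assumed).
TB = 1e12  # bytes in 1 TB (assumption: decimal terabyte)


def transfer_seconds(n_bytes, link_bits_per_s, efficiency=1.0):
    """Time to move n_bytes over a link, ignoring latency and protocol details."""
    return n_bytes * 8 / (link_bits_per_s * efficiency)


def required_rate_mb_s(n_bytes, seconds):
    """Aggregate read rate (MB/s) needed to process n_bytes in the given time."""
    return n_bytes / seconds / 1e6


gigabit = 1e9  # bits per second
hours = transfer_seconds(TB, gigabit) / 3600
rate = required_rate_mb_s(TB, 5 * 60)
print(f"1 TB over Gigabit Ethernet: {hours:.1f} h at full line rate")
print(f"1 TB in 5 minutes needs about {rate:.0f} MB/s aggregate")
```

Under these assumptions the transfer takes about 2.2 hours, while processing 1 TB in 5 minutes needs roughly 3.3 GB/s in aggregate, which is why the data has to sit next to the processing nodes beforehand.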

9
Distributed Work

10
Distributed storage
  • Key point: stripe data to balance load
  • Example (from LEP analysis)
  • Assume you have 1 TB of data from:
  • Data: 100 GB
  • QCD background: 200 GB
  • WW background: 400 GB
  • Higgs signal: 300 GB
  • Your farm has 100 nodes: stripe to each node
  • 1 GB of Data, 2 GB of QCD, 4 GB of WW, 3 GB of
    Higgs
  • 10 GB at 100 MB/s: order of minutes for a
    complete read
  • Process in parallel any query/data mining
  • But... How much can be done in parallel???
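
The striping arithmetic in the LEP example can be reproduced as follows. This is only a sketch: it assumes an equal share of every sample on every node and 1 GB = 1000 MB, and the names `stripe`, `samples_gb` and `disk_mb_s` are made up for the illustration.

```python
# Sample sizes in GB, taken from the slide.
samples_gb = {"Data": 100, "QCD": 200, "WW": 400, "Higgs": 300}
n_nodes = 100
disk_mb_s = 100  # local disk read rate quoted on the slide


def stripe(samples, nodes):
    """Equal share of every sample on every node, in GB."""
    return {name: size / nodes for name, size in samples.items()}


per_node = stripe(samples_gb, n_nodes)
node_total_gb = sum(per_node.values())
# Time for one node to read its full local share (1 GB = 1000 MB assumed).
read_seconds = node_total_gb * 1000 / disk_mb_s

print(per_node)
print(f"{node_total_gb:.0f} GB per node, full read in {read_seconds:.0f} s")
```

Because every node holds the same mix of samples, any query touches all nodes equally and each one finishes its local read in about the same time.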

11
Distributed processing
  • Examples
  • Queries/histogramming
  • Trivial parallelization through load partitioning
  • Needs a master and slaves
  • Communication: MPI (collective)
  • Low network traffic
  • Each node accesses one storage element (can be
    the same node)
  • Processing can take place on the DBMS side
    (stored procedure)
  • Data Mining
  • Some good examples
  • Supervised learning: Neural Networks (BFGS
    algorithm)
  • Unsupervised learning: Self Organizing Maps
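
The queries/histogramming case above can be sketched in plain Python. Each partition stands in for the data held on one node, and `reduce_sum` plays the role that a collective reduction would play in an MPI run; no MPI is actually used here, and the event values, binning and function names are invented for the illustration.

```python
import random


def fill_histogram(values, n_bins, lo, hi):
    """Worker side: histogram of the values stored on one node."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            # min() guards against rounding pushing a value into bin n_bins.
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts


def reduce_sum(partials):
    """Master side: element-wise sum of the per-node histograms."""
    return [sum(column) for column in zip(*partials)]


random.seed(1)
events = [random.gauss(91.0, 2.5) for _ in range(10_000)]  # toy data
n_nodes = 4
partitions = [events[i::n_nodes] for i in range(n_nodes)]  # load partitioning

partials = [fill_histogram(p, 20, 80.0, 100.0) for p in partitions]
total = reduce_sum(partials)
print(total)
```

Only the bin counts travel over the network, not the events, which is why the traffic stays low.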

12
Distributed processing
  • Supervised learning: Neural Networks
  • N input variables, L + M hidden nodes, O outputs
  • Training: find the weights connecting all nodes
    via activation functions (sigmoids)
    (N×L + L×M + M×O weights) so that
  • Answer to background is 0
  • Answer to signal is 1
  • Process
  • Start with random weights, apply NN, compute
    errors and best direction for minimization (from
    gradient), try new weights
  • Repeat for a number of epochs (usually 100-1000)
    until finding the minimum
  • Key: errors are additive, so training can be
    distributed and the global error is simply the
    sum of all errors
  • More subtle: also distribute the matrix
    computations
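
The additivity argument can be illustrated with a deliberately reduced model. The sketch below uses a single sigmoid unit instead of the full network, and plain gradients instead of BFGS; both are simplifications chosen for brevity, and the toy events and function names are assumptions. The point it shows is that per-node errors and gradients sum to the global ones.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def error_and_gradient(weights, events):
    """Summed squared error and its gradient over one partition of events.

    Each event is (inputs, target), with target 0 (background) or 1 (signal).
    """
    err = 0.0
    grad = [0.0] * len(weights)
    for x, target in events:
        out = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        diff = out - target
        err += diff * diff
        for j, xj in enumerate(x):
            grad[j] += 2.0 * diff * out * (1.0 - out) * xj
    return err, grad


def combine(partials):
    """Master side: add up the per-node errors and gradients."""
    err = sum(e for e, _ in partials)
    grad = [sum(column) for column in zip(*(g for _, g in partials))]
    return err, grad


random.seed(2)
events = []
for _ in range(1000):
    target = random.randint(0, 1)
    # Two toy input variables plus a constant bias input.
    x = [random.gauss(target, 1.0), random.gauss(-target, 1.0), 1.0]
    events.append((x, target))

weights = [0.1, -0.2, 0.05]
n_nodes = 5
partials = [error_and_gradient(weights, events[i::n_nodes]) for i in range(n_nodes)]
global_err, global_grad = combine(partials)
print(global_err, global_grad)
```

In each epoch the master would broadcast the current weights, collect these partial sums, and choose the next weights from the combined gradient.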

13
Distributed Processing
  • First checks show good scalability (processing
    time following 1/Nnodes for N = 1-60) with the
    usual NN (10-20-20-1)
  • Scalability issues with 50-100-100-1
  • Message passing is much heavier (100 Kbytes)
  • Matrix computations in the master take part of
    the time (3/5 in nodes, 2/5 in master): need to
    parallelize
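
The message size quoted above is consistent with a simple weight count. The sketch below counts one weight per connection between consecutive layers, ignores bias terms, and assumes 8-byte floating-point values; these are assumptions, since the slides do not state the exact layout.

```python
def weight_count(layers):
    """Connections between consecutive layers (bias terms not counted)."""
    return sum(a * b for a, b in zip(layers, layers[1:]))


def message_kbytes(layers, bytes_per_weight=8):
    """Size of one full weight (or gradient) vector, assuming 8-byte floats."""
    return weight_count(layers) * bytes_per_weight / 1000


small = [10, 20, 20, 1]
large = [50, 100, 100, 1]
for layers in (small, large):
    print(layers, weight_count(layers), f"{message_kbytes(layers):.1f} kB")
```

Under these assumptions the 10-20-20-1 network has 620 weights (about 5 kB per message), while the 50-100-100-1 network has 15100 weights (about 121 kB), in line with the order of 100 Kbytes on the slide.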

14
Distributed Processing
  • SOM: unsupervised learning
  • Many N-dimensional tuples, classified in a 2D
    grid where each node represents a class-model
  • Processes more data, but with very simple
    algorithms
  • Partitioning is simple: compute partial updates
    of the class-models on distributed data
  • First checks on Meteo DB (300 GB)
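
One way to picture the partial updates is the batch formulation of the SOM, in which every class-model becomes a neighbourhood-weighted mean of the data and the sums behind that mean are additive across partitions. The slides do not say which SOM variant was used, so the batch form, the Gaussian neighbourhood, the toy data and all function names below are assumptions.

```python
import math
import random


def nearest_unit(x, models):
    """Index of the class-model closest to tuple x (squared Euclidean distance)."""
    return min(range(len(models)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(x, models[k])))


def neighbourhood(k, best, grid_w, sigma):
    """Gaussian weight between grid units k and best on a 2D grid of width grid_w."""
    dr = k // grid_w - best // grid_w
    dc = k % grid_w - best % grid_w
    return math.exp(-(dr * dr + dc * dc) / (2.0 * sigma * sigma))


def partial_update(data, models, grid_w, sigma):
    """Worker side: weighted sums (numerator, denominator) for every class-model."""
    dim = len(models[0])
    num = [[0.0] * dim for _ in models]
    den = [0.0] * len(models)
    for x in data:
        best = nearest_unit(x, models)
        for k in range(len(models)):
            h = neighbourhood(k, best, grid_w, sigma)
            den[k] += h
            for j in range(dim):
                num[k][j] += h * x[j]
    return num, den


def merge(partials):
    """Master side: add the partial sums and form the new class-models."""
    n_units = len(partials[0][1])
    dim = len(partials[0][0][0])
    den = [sum(p[1][k] for p in partials) for k in range(n_units)]
    return [[sum(p[0][k][j] for p in partials) / den[k] for j in range(dim)]
            for k in range(n_units)]


random.seed(3)
grid_w, grid_h, dim = 3, 3, 2
models = [[random.random() for _ in range(dim)] for _ in range(grid_w * grid_h)]
data = [[random.random() for _ in range(dim)] for _ in range(600)]  # toy tuples
parts = [data[i::4] for i in range(4)]  # data held on 4 nodes

new_models = merge([partial_update(p, models, grid_w, 1.0) for p in parts])
print(new_models[0])
```

Each node only sends one numerator and one denominator per grid unit, so the traffic depends on the grid size and not on the amount of data processed.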

15
Next...
  • GRID issue: does this scheme work over a WAN?
    Latency issues...
  • We would like to test the best setup
  • Local disks (like we have now)
  • Separated Storage Elements
  • SAN (1 TB will be available in the coming months)
  • EIDE TB boxes (will be bought for CDF)
  • Our main issue is final user analysis