Storage and Distributed Work - PowerPoint PPT Presentation


Provided by: jesusmarc


1
Storage and Distributed Work

Rafael Marco, Celso Martinez, David Rodriguez, Oscar Ponce
Presented by Jesus Marco
IFCA Santander (Spain)
23 April 2002
Meeting in Lyon

2
Issues?

Network storage vs local storage
Distributed file systems
Databases
Distributed work and distributed storage
A practical point of view...

3
Local on-line Storage

Traditional disks
Low cost (150 GB for about 300 euros)
EIDE disk (Ultra ATA, 100 MB/s)
Now also FireWire or USB2
Medium cost (150 GB for about 1500 euros)
SCSI disk (Ultra3, 160 MB/s)
SCSI Fibre channel disk
RAID configurations for data servers
EIDE RAID
SCSI RAID

4
Network Storage

NAS storage systems
Simple to sophisticated servers
File access through Ethernet (Fast or Gigabit)
SAN storage systems
Fibre channel based
Record access through SCSI
iSCSI
Ethernet based, but using the SCSI command set
RAID for data servers
Gigabit cable/fibre Ethernet vs Gigabit optical

5
Distributed File Systems

NFS
AFS, Coda (and /grid)
CXFS
GPFS
Shared-disk file system for large computing clusters
Recently released for IBM x-series based clusters (previously for i-series)
Data striping
Distributed locking and recovery technology
Supports fully parallel access to both file data and metadata

6
Files and Databases

Hardware (disk) provides sequential vs direct access
Files
Direct access for the header, then a sequential stream of bits (but can jump if the structure is known)
Distributed access
Distributed file system
Replica catalog and ftp
Databases
Provide record information (rows/objects)
Directly from disk (raw data on an allocated disk)
Or from chunk files
User sees direct access to each record
Distributed databases
Proprietary solutions (partitioning, replication)
Ad-hoc catalog

7
Distributed Work

Example 1: local (long) processing, local storage, pre/post-execution transfer of data (HTC)
Typical HEP simulation and reconstruction jobs
Local storage
Not critical (I/O is moderate compared to processing time)
Long-term/catalog storage
Low cost if possible, virtual staging to tapes
Good bandwidth for replication/distribution

8
Distributed Work

Example 2: distributed processing on distributed storage
Typical HEP analysis
Local storage
Critical (I/O rate is important: process 1 TB in 5 minutes)
Possibilities
Distributed file system on NAS or SAN with Gigabit
Expensive (? Not for EIDE RAID NAS? 1 TB for about 5000 euros?)
Not possible (?) across WAN
Local storage and catalog
Numbers
A 1 TB transfer over Gigabit takes more than an hour, so files must already be close to the processing nodes
Storage nodes (or elements)
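
The numbers on this slide can be checked with a short calculation. The sketch below is an illustration only: it assumes decimal units (1 TB = 10^12 bytes), an ideal 1 Gbit/s link with no protocol overhead, and the helper names are hypothetical.

```python
# Back-of-the-envelope I/O numbers for the slide above.
# Assumptions: decimal units (1 TB = 1e12 bytes), ideal link, no overhead.
TB = 1e12            # bytes
GIGABIT = 1e9        # bits per second


def transfer_seconds(size_bytes, link_bits_per_s):
    """Ideal time to move size_bytes over a link of the given speed."""
    return size_bytes * 8 / link_bits_per_s


def required_rate(size_bytes, seconds):
    """Aggregate read rate (bytes/s) needed to scan size_bytes in `seconds`."""
    return size_bytes / seconds


t = transfer_seconds(TB, GIGABIT)
rate = required_rate(TB, 5 * 60)

print(f"1 TB over Gigabit: {t / 3600:.1f} h")                    # 2.2 h
print(f"1 TB in 5 minutes: {rate / 1e9:.2f} GB/s aggregate")     # 3.33 GB/s
```

Even in the ideal case the transfer takes over two hours, while the 5-minute target needs several GB/s in aggregate, which is why the data has to sit next to the processing nodes.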

9
Distributed Work

10
Distributed storage

Key point: stripe data to balance load
Example (from LEP analysis)
Assume you have 1 TB of data, made up of:
Data: 100 GB
QCD background: 200 GB
WW background: 400 GB
Higgs signal: 300 GB
Your farm has 100 nodes; stripe to each node 1 GB of Data, 2 GB of QCD, 4 GB of WW, 3 GB of Higgs
10 GB at 100 MB/s: of the order of minutes for a complete read
Process in parallel any query/data mining
But... how much can be done in parallel?
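
The striping arithmetic of the LEP example can be written out as a small sketch. The sample sizes, node count and disk rate are the ones quoted on the slide; the function name and the use of decimal units (1 GB = 1000 MB) are assumptions made for illustration.

```python
# Sample sizes in GB, node count and disk rate as quoted on the slide.
samples_gb = {
    "Data": 100,
    "QCD background": 200,
    "WW background": 400,
    "Higgs signal": 300,
}
n_nodes = 100
disk_mb_per_s = 100


def stripe(samples, nodes):
    """Give every node an equal slice of every sample."""
    return {name: size / nodes for name, size in samples.items()}


per_node = stripe(samples_gb, n_nodes)
per_node_total_gb = sum(per_node.values())
# Assumption: 1 GB = 1000 MB.
read_seconds = per_node_total_gb * 1000 / disk_mb_per_s

for name, size in per_node.items():
    print(f"{name}: {size:g} GB per node")
print(f"Total per node: {per_node_total_gb:g} GB, full read in {read_seconds:g} s")
```

Because each node holds the same mix of samples, a query over any one sample keeps all 100 disks busy at once instead of loading a few nodes.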

11
Distributed processing

Examples
Queries/histogramming
Trivial parallelization through load partitioning
Needs a master and slaves
Communication: MPI (collective)
Low network traffic
Each node accesses one storage element (can be the same node)
Processing can take place on the DBMS side
(stored procedure)
Data Mining
Some good examples
Supervised learning: neural networks (BFGS algorithm)
Unsupervised learning: Self-Organizing Maps
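
The query/histogramming case can be sketched as follows. This is a single-process illustration, not the MPI setup mentioned on the slide: each "node" fills a histogram from its own partition, and the master adds the partial histograms bin by bin, which is what a collective sum reduction would do. The random data, bin range and function names are all hypothetical.

```python
import random


def partial_histogram(values, n_bins, lo, hi):
    """Histogram filled by one node from its local partition only."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            # Clamp guards against floating-point rounding at the upper edge.
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts


def merge(histograms):
    """Master step: bin-by-bin sum of the partial histograms."""
    return [sum(column) for column in zip(*histograms)]


N_NODES, N_BINS = 4, 10
rng = random.Random(0)
values = [rng.random() for _ in range(1000)]          # stand-in for event data

partitions = [values[i::N_NODES] for i in range(N_NODES)]
partials = [partial_histogram(p, N_BINS, 0.0, 1.0) for p in partitions]
merged = merge(partials)
print(merged)
```

Only `N_BINS` integers per node travel over the network, whatever the size of the partition, which is why the traffic stays low.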

12
Distributed processing

Supervised learning: neural networks
N input variables, L+M hidden nodes, O outputs
Training: find the weights connecting all nodes via activation functions (sigmoids) (N×L + L×M + M×O weights) so that
Answer to background is 0
Answer to signal is 1
Process
Start with random weights, apply the NN, compute the errors and the best direction for minimization (from the gradient), try new weights
Repeat for a number of EPOCHS (usually 100-1000) until finding a minimum
Key: errors are additive, so training can be distributed and the global error is simply the sum of all errors
More subtle: distribute also the matrix computations
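
The additivity argument can be illustrated with a minimal sketch. As a loud simplification, a single sigmoid unit stands in for the full N-L-M-O network, and plain error and gradient sums are shown rather than the BFGS update; the data and all names are hypothetical. The point is only that per-node error and gradient sums add up to the global ones.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def error_and_gradient(weights, events):
    """Sum-of-squares error and its gradient over a list of (inputs, target)."""
    err = 0.0
    grad = [0.0] * len(weights)
    for inputs, target in events:
        out = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
        diff = out - target
        err += 0.5 * diff * diff
        for i, x in enumerate(inputs):
            grad[i] += diff * out * (1.0 - out) * x
    return err, grad


rng = random.Random(0)
weights = [rng.uniform(-1, 1) for _ in range(3)]
# Target 0 = background, 1 = signal (random toy events).
events = [([rng.gauss(0, 1) for _ in range(3)], rng.randint(0, 1))
          for _ in range(200)]

N_NODES = 4
partials = [error_and_gradient(weights, events[i::N_NODES])
            for i in range(N_NODES)]

# Master step: add the partial errors and gradients from all nodes.
total_err = sum(e for e, _ in partials)
total_grad = [sum(g[i] for _, g in partials) for i in range(len(weights))]

global_err, global_grad = error_and_gradient(weights, events)
print(total_err, global_err)
```

Each epoch therefore needs one broadcast of the weights and one reduction of the error and gradient, independent of how many events each node holds.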

13
Distributed Processing

First checks show good scalability (processing time follows 1/N_nodes for N = 1-60) with the usual NN (10-20-20-1)
Scalability issues with 50-100-100-1
Message passing is much heavier (100 Kbytes)
Matrix computations in the master take part of the time (3/5 in nodes, 2/5 in master): need to parallelize
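
The message size quoted above is consistent with a simple weight count. The sketch below assumes that each message carries one full set of weights, that weights are 8-byte doubles, and that bias terms are ignored; none of this is stated on the slide.

```python
def n_weights(layers):
    """Connections between consecutive layers, e.g. N*L + L*M + M*O."""
    return sum(a * b for a, b in zip(layers, layers[1:]))


BYTES_PER_WEIGHT = 8  # assumption: double precision

sizes = {}
for layers in [(10, 20, 20, 1), (50, 100, 100, 1)]:
    w = n_weights(layers)
    sizes[layers] = (w, w * BYTES_PER_WEIGHT)
    name = "-".join(str(n) for n in layers)
    print(f"{name}: {w} weights, {w * BYTES_PER_WEIGHT / 1000:.1f} kB")
```

Under these assumptions the 10-20-20-1 network has 620 weights (about 5 kB), while 50-100-100-1 has 15100 weights (about 121 kB), in line with the roughly 100 Kbytes per message seen for the larger network.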

14
Distributed Processing

SOM: unsupervised learning
Many N-dimensional tuples, classified in a 2D grid where each node represents a class-model
Processes more data, but with very simple algorithms
Partitioning is simple: compute partial updates of the class-models on distributed data
First checks on Meteo DB (300 GB)
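
One way to read "partial updates" is the batch form of the SOM, sketched below under stated assumptions: a 2x2 grid, a Gaussian neighbourhood of fixed width, random toy data, and hypothetical function names. Each node accumulates weighted sums over its own partition, and the master divides the combined sums to get the new class-models.

```python
import math
import random


def partial_update(models, grid, data, sigma):
    """Batch-SOM numerators and denominators from one data partition."""
    dim = len(models[0])
    num = [[0.0] * dim for _ in models]
    den = [0.0] * len(models)
    for x in data:
        # Winner: the class-model closest to this tuple.
        c = min(range(len(models)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(models[j], x)))
        for j in range(len(models)):
            d2 = (grid[c][0] - grid[j][0]) ** 2 + (grid[c][1] - grid[j][1]) ** 2
            h = math.exp(-d2 / (2 * sigma ** 2))   # neighbourhood weight
            den[j] += h
            for k in range(dim):
                num[j][k] += h * x[k]
    return num, den


def combine(partials):
    """Master step: add the partial sums, then divide to get new models."""
    num = [[sum(vals) for vals in zip(*rows)]
           for rows in zip(*(p[0] for p in partials))]
    den = [sum(vals) for vals in zip(*(p[1] for p in partials))]
    return [[n / d for n in row] for row, d in zip(num, den)]


rng = random.Random(1)
grid = [(r, c) for r in range(2) for c in range(2)]
models = [[rng.random() for _ in range(3)] for _ in grid]
data = [[rng.random() for _ in range(3)] for _ in range(100)]

parts = [data[i::4] for i in range(4)]
new_models = combine([partial_update(models, grid, p, 1.0) for p in parts])
reference = combine([partial_update(models, grid, data, 1.0)])
print(new_models[0])
```

The partitioned result matches the single-node result up to rounding, and only the per-class sums cross the network, not the tuples themselves.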

15
Next...

GRID issue: does this scheme work across a WAN? Latency issues...
We would like to test the best setup
Local disks (like we have now)
Separate Storage Elements
SAN (1 TB will be available in the coming months)
EIDE TB boxes (will be bought for CDF)
Our main issue is final user analysis
