les robertson cernit0899 1 - PowerPoint PPT Presentation

1
The Data Storage Challenge for LHC
  • CERN School of Computing
  • Stare Jablonki - September 1999
  • Les Robertson
  • CERN - IT Division
  • les.robertson_at_cern.ch

2
Part I - The technology
  • today's workhorses
  • magnetic hard disk
  • magneto-optics
  • magnetic tape systems
  • optical disks
  • exotic storage technologies
  • holography
  • atomic force microscopy
  • robotics for handling mass storage

3
disk storage
  • state of the art
  • technology limits - the super-paramagnetic
    problem
  • heads
  • access performance and caches
  • magneto optics OAW, Terastor

units
very small sizes are expressed in micrometres, denoted µ
almost everything else is in
- inches - in
- square inches - in²
- feet - 1 foot = 12 inches
Gigabit = 10^9 bits - Gb
Gigabit per square inch - Gb/in²
Gigabyte = 10^9 bytes - GB
4
disk storage - state of the art
  • platters
  • sputtered magnetic and protective layers
  • protective layer has textured landing area for
    the head - to avoid stiction on take-off
  • head flies at around 50 nanometres
  • current product - 3-4 Gb/in²
  • lab demonstrations - >20 Gb/in²

5
super-paramagnetic limit
  • bit size - decreases in proportion as the areal
    density increases
  • areal density - bit width X bit length
  • 1 Gb/in² - 3.5µ X 0.18µ
  • 10 Gb/in² - 1µ X 0.06µ
  • 40 Gb/in² - 0.5µ X 0.03µ
  • 80 Gb/in² - 0.4µ X 0.02µ
  • fewer particles in a bit
  • smaller separation between bits
  • increased tendency for domains spontaneously
    to change polarisation
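
The bit dimensions above can be cross-checked against the areal density. The sketch below is only an illustration (the function name is made up for this example): it converts a density in Gb/in² into the area available per bit in square micrometres and compares it with the width X length quoted on the slide.

```python
# Area available to one bit at a given areal density.
UM_PER_INCH = 25_400.0  # micrometres per inch (exact by definition)

def bit_area_um2(gbit_per_in2: float) -> float:
    """Square micrometres available per bit at the given Gb/in2."""
    bits_per_in2 = gbit_per_in2 * 1e9
    return UM_PER_INCH ** 2 / bits_per_in2

# (density in Gb/in2, bit width in µ, bit length in µ) as quoted on the slide
SLIDE_ROWS = [(1, 3.5, 0.18), (10, 1.0, 0.06), (40, 0.5, 0.03), (80, 0.4, 0.02)]

for density, width, length in SLIDE_ROWS:
    print(f"{density:>3} Gb/in2: {bit_area_um2(density):.4f} um2 available, "
          f"{width * length:.4f} um2 quoted")
```

The quoted rectangles come out within about ten percent of the available area, which is consistent with rounded slide figures.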

6
super-paramagnetic limit
  • super-paramagnetic limit
  • point where the fluctuations in thermodynamic
    energy at operating temperatures have a moderate
    probability of causing magnetic state changes
  • in current disks, the magnetic energy barrier is
    about 40 times the thermodynamic range
  • it is expected that new materials and recording
    techniques will push the limit to at least
    100 Gb/in²
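
Why a ratio of about 40 matters can be illustrated with the usual Arrhenius-type model, in which the mean time before a thermally driven flip grows exponentially with the ratio of energy barrier to thermal energy. This is a rough sketch, not something taken from the slides: the attempt time of one nanosecond is an assumed order of magnitude and the function name is hypothetical.

```python
import math

ATTEMPT_TIME_S = 1e-9  # ASSUMPTION: typical order of magnitude, not from the slides
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def mean_flip_time_s(barrier_over_kt: float,
                     attempt_time_s: float = ATTEMPT_TIME_S) -> float:
    """Arrhenius-type estimate: attempt time multiplied by exp(barrier / kT)."""
    return attempt_time_s * math.exp(barrier_over_kt)

for ratio in (20, 30, 40, 60):
    years = mean_flip_time_s(ratio) / SECONDS_PER_YEAR
    print(f"barrier = {ratio} kT -> mean flip time of about {years:.3g} years")
```

The point of the sketch is the steepness of the dependence: a modest drop in the ratio, as grains shrink, shortens the stability time by many orders of magnitude.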

7
heads
  • inductive read heads
  • signal current varies as rate of flux change
  • MR read heads
  • NiFe conductor -- resistance changes with flux
    strength
  • independent of velocity
  • signal strength proportional to sense current
  • increased sensitivity in high density, high
    bandwidth recording
  • a transverse bias field is applied to
    discriminate between positive and negative
    recording polarisations

inductive read head
magneto-resistive read head
ΔR ∝ H, ΔV ∝ I ΔR (I = Isense, the sense current)
8
inductive write head, MR read head
picture IBM Research - Almaden
9
Giant Magneto-Resistive Effect - the Spin Valve
  • Giant Magneto-Resistive
  • Multi-layer head
  • magneto-resistive layer (NiFe)
  • conducting layer (e.g. Ag, Cu)
  • pinned layer (e.g. Co) - fixed magnetic
    orientation
  • exchange layer ferro-magnetic material which
    maintains the pinned layer orientation
  • GMR exploits the different behaviour of
    conduction electrons with spin parallel to or
    opposed to the magnetic orientation of the MR and
    pinned layers - hence the term Spin Valve

[Figure: GMR layers - exchange layer (magnetised), pinned layer (Co), conducting layer (Cu), MR layer (NiFe); sense current]
10
Spin Valve
picture IBM Research - Almaden
11
merged head
12
Seagate Cheetah 36
  • 36 GB capacity
  • half height 3.5"
  • 12 platters, 24 heads
  • 5.7 ms average seek
  • 10,000 rpm - 2.99 ms latency
  • 1 MB cache
  • 18-28 MB/sec internal transfer rate
photo - Seagate Technology, Inc.
13
Data transfer speed
  • Data transfer speed increases with
  • the linear density (√ of the areal density - i.e.
    about 26% per year)
  • the rotation speed - which has only increased by
    about 50% in the past 5-6 years
  • The actual data transfer speed is faster on outer
    tracks than on inner tracks - so be careful when
    reading specifications to discriminate between
    average and maximum transfer speed.

[Chart: projected data transfer speed (1999 = 1). Assumes recent evolution maintained: 60% per year increase in areal density, rotational speed increasing 50% in 5 years]
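
The growth assumptions on this slide can be turned into a small projection. The sketch below is illustrative only: it takes linear density to grow as the square root of areal density, spreads the 50% rotation-speed increase evenly over 5 years, and normalises to 1999 = 1. The variable and function names are made up for this example.

```python
AREAL_GROWTH_PER_YEAR = 1.60   # 60% per year increase in areal density
ROTATION_GROWTH_5_YEARS = 1.50 # rotational speed +50% over 5 years

# Linear density grows as the square root of areal density.
linear_growth = AREAL_GROWTH_PER_YEAR ** 0.5
# Spread the 5-year rotation speed increase evenly over the years.
rotation_growth = ROTATION_GROWTH_5_YEARS ** (1 / 5)
transfer_growth = linear_growth * rotation_growth

def relative_transfer_speed(year: int, base_year: int = 1999) -> float:
    """Projected transfer speed relative to the base year (base year = 1)."""
    return transfer_growth ** (year - base_year)

print(f"linear density growth per year:  {linear_growth:.3f}")
print(f"rotation speed growth per year:  {rotation_growth:.3f}")
for year in (1999, 2002, 2005):
    print(year, f"{relative_transfer_speed(year):.2f}")
```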
14
The importance of the cache
  • Access time depends on
  • the seek time - which has hardly improved by 50%
    in ten years
  • the latency - half a turn of the platter
  • Without a cache, this would lead to
    very unimpressive performance for small transfer
    sizes
  • The cache helps to get back to the nominal data
    transfer rate - no more than that!
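
The effect of seek time and latency on small transfers can be estimated with a short calculation. This is a sketch under stated assumptions: it uses the Cheetah 36 figures quoted earlier (5.7 ms average seek, 2.99 ms latency), takes the low end of the internal transfer rate (18 MB/sec), and assumes every transfer pays one full seek and one latency with no cache hit. The function name is illustrative.

```python
SEEK_MS = 5.7      # average seek, Cheetah 36 figure from the slides
LATENCY_MS = 2.99  # half a rotation at 10,000 rpm
RATE_MB_S = 18.0   # low end of the quoted 18-28 MB/sec internal rate

def effective_rate_mb_s(transfer_kb: float,
                        seek_ms: float = SEEK_MS,
                        latency_ms: float = LATENCY_MS,
                        rate_mb_s: float = RATE_MB_S) -> float:
    """Throughput seen by the user when each transfer pays a seek and a latency."""
    size_mb = transfer_kb / 1000.0
    transfer_ms = size_mb / rate_mb_s * 1000.0
    total_ms = seek_ms + latency_ms + transfer_ms
    return size_mb / (total_ms / 1000.0)

for kb in (4, 64, 1000, 100_000):
    print(f"{kb:>7} KB per access: {effective_rate_mb_s(kb):6.2f} MB/sec")
```

Small random transfers see only a small fraction of the nominal rate, which is the gap a cache (or larger transfer sizes) has to close.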

15
Future possibilities
  • continuing developments of GMR - with the
    formidable research capability of IBM
  • current interest in the use of rare-earth/transition
    metal composites, evolved for MO recording
  • low Curie point
  • stable magnetisation at normal operating
    temperatures
  • stable magnetic domains demonstrated at a density
    of 250 Gb/in²
  • Longer term --
  • holography
  • atomic force microscopy
  • .

16
Optically Assisted Winchester (OAW)
  • Developed by a Seagate subsidiary - Quinta
  • Magnetic layer uses a composition of rare earth
    transition metals
  • Write
  • laser heats material beyond Curie point
  • induction coil changes magnetic orientation
  • magnetisation stable at normal temperatures
  • Read
  • rotation of polarisation of reflected light
    (Kerr effect)
  • Technology
  • laser delivery fibres
  • micro mirror (head of a pin)
  • micro-optics
  • Potential 100 Gb/in2 ?
  • limited by the resolution of the optics

17
The Solid Immersion Lens Near Field Recording
Terastor Corporation
  • Solid Immersion Lens
  • laser is focussed internally in a material with a
    very high refractive index
  • with a red laser can get the spot diameter down
    to 0.2µ (the bit width for 160 Gb/in²)

[Equation: spot diameter, where λ is the wavelength, n the refractive index and NA the numerical aperture]
18
SIL NFR
  • Near field recording
  • principle of the scanning near-field optical
    microscope
  • the oscillating dipoles of the radiating surface
    produce an evanescent field which decays in
    about one wavelength
  • .. but activate other dipoles within this range

19
Developments in Magneto Optics
  • The recorded area of the disk cannot be narrower
    than the spot (or at least the high temperature
    area of the spot)
  • But when recording, spots can be overlapped to
    increase linear density
  • This is not possible on conventional MO disk,
    which has a thick transparent substrate over the
    recording layer; this requires a high field
    coil, with a high inductance and so a low
    modulation frequency
  • Surface recording reduces the separation of the
    head and recording layer, making crescent
    recording possible, and also enabling the use of
    high numerical aperture lenses - producing
    smaller spots
  • But it is a challenge for the designer of
    removable media

disk rotation
20
Magnetic Super Resolution - MSR
  • Easy to see how the crescents are recorded, but
    how are they read back?
  • Three layers
  • 1) recording layer
  • 2) intermediate masking layer - temperature-sensitive
    magnetic orientation
  • low temperature - parallel to plane
  • intermediate temperature - perpendicular
  • high temperature - loses orientation
  • couples the recording layer to the read-out
    layer only at intermediate temperatures
  • 3) read-out layer magnetised (erased) during
    read-out

21
magnetic tape
  • why use magnetic tapes?
  • basics
  • linear
  • helical scan
  • state of the art drive - the StorageTek 9840
  • current trends

22
Why use magnetic tape?
  • Why use a sequential access medium with a history
    of relatively poor reliability?
  • historically the answer has been --
  • cost - 10-100 times cheaper per Byte than disk
  • volumetric storage density
  • removable, transportable medium
  • backup
  • archive
  • data exchange
  • robotic storage - automated access to enormous
    amounts of data
  • but there is considerable competition from
  • hard disks - cost, storage density
  • optical storage - archive longevity, data exchange

23
Volumetric Storage Density
Assumes shelf storage of -- raw tape, DVD cartridge -- disk without enclosure, power supply, fan -- no compression on tape
[Chart: Storage Capacity and Density - native cartridge capacity (GB) and volumetric density (TB/cubic-metre) by device type: IBM 3590, STK 9840, STK Redwood, DVD-RAM (2-side), Quantum DLT 8000, LTO Ultrium (future), Seagate Cheetah (3.5" disk)]
24
basic characteristics
  • medium
  • flexible substrate - 10µ thick polyethylene
    PET/PEN
  • recording layer - 0.1-0.2µ
  • Metal Particle
  • Metal Evaporated
  • stored in cartridge (1 reel) or cassette (2 reel)
  • tape extracted and loaded on drive
  • recording technology - spin-off from magnetic
    disk developments
  • MR, GMR heads
  • track following servo systems
  • media

25
sequential access
  • basically a sequential medium
  • no delete/update
  • new data written at end
  • open - and read from start of file
  • usually a directory at the beginning of the tape
  • so open(file) can use servo information for
    a fast skip to the start of the data

26
logical data format
  • The tape is organised logically as a set of
    files, separated by labels and tape marks.
  • In early drives, the drive could seek rapidly to
    the next tape mark, which was recorded with a
    very special pattern
  • Modern drives use a directory and information on
    servo tracks to seek to the logical tape mark

[Figure: logical tape layout - volume labels, then repeated groups of file labels, file data and tape marks, ending with end of volume and tape marks]
27
physical data format
  • The data is recorded in blocks, each with a
    cyclic redundancy check (CRC) to detect errors
  • The logical block is recorded in a series of
    physical blocks, spread across the parallel
    recording channels
  • each channel corresponds to a set of physical
    head elements
  • Substantial recording capacity is reserved for
    error correction data
  • The 4-channel DLT format is shown - newer tape
    systems have even more complex patterns to
    support recovery from more severe tape damage
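
The per-block error detection described above can be sketched in a few lines. This is only an illustration of the principle: it uses the CRC-32 from Python's zlib module as a stand-in, since the slides do not say which polynomial or block layout the drives use, and the helper names are hypothetical.

```python
import zlib

def make_block(payload: bytes) -> bytes:
    """Append a 4-byte CRC to the payload (illustrative block layout)."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")

def check_block(block: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the stored value."""
    payload, stored = block[:-4], int.from_bytes(block[-4:], "big")
    return zlib.crc32(payload) == stored

block = make_block(b"event data " * 10)

damaged = bytearray(block)
damaged[3] ^= 0x01  # flip a single bit, as a media defect might

print("intact block ok: ", check_block(block))
print("damaged block ok:", check_block(bytes(damaged)))
```

A CRC only detects the damage; repairing it is the job of the separate error correction data mentioned on the slide.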

28
linear recording
  • linear recording
  • tape passes over fixed head
  • multiple track read write
  • serpentine dual-directional recording
  • head unit
  • dual-directional
  • low head-medium contact pressure
  • multi-channel head array

29
linear recording
  • media issues
  • tape roughness, head contact, surface wear, dust
  • tape path complexity, tension -> distortion
  • lateral expansion/contraction with environmental
    changes
  • reel sag in long term storage

[Figure: head array over a tape that has expanded laterally since it was recorded]
30
helical scan
  • developed for entertainment business
  • high end market in broadcasting
  • mass market in domestic VCR
  • tape moves slowly past rapidly spinning head on
    scanner

31
helical scan
  • head wear problems due to tape contact pressure
  • helical path controlled using tape edge -
    requires very accurate slitting in manufacture
  • edge damage, tape warp cause track curving
  • linear tapes reserve a guard band at the edges
  • historically helical scan has had a higher track
    density than linear
  • 2800 tracks per inch helical
  • 700-800 tracks per inch linear
  • but linear tape is improving track density with
    MR heads, track following technology

32
data compression
  • an advantage of sequential access over random
    access disks is that the device can implement
    data compression
  • digital Lempel-Ziv 1 algorithm
  • replaces variable length phrases with code words
  • enhanced LZ 1 algorithm (e.g. StorageTek 9840)
    can give up to four times compression on
    commercial data, 2 times on pre-compressed
    physics data

33
the recording channel
write channel
read channel
channel complexity can increase with improved
ASIC technology
34
9840 Mechanism
[Figure labels: Head, Coupling, Reel Motor, Operator Panel]
Head: 23 patents pending, 1 patent issued
Mechanism: 10 patents pending, 1 patent issued
35
9840
  • 1/2" tape in IBM 3480 form factor
  • MP on PEN medium
  • 288 tracks
  • 16 parallel heads (× 18 stripes)
  • 2 metres/second past head - 10 MB/sec data rate
  • cassette (2 reel) with tape unloaded at mid point
  • tape path entirely in cassette
  • 4 sec load
  • 900 feet of tape ( 274 metres )
  • 8 sec average search
  • 16 sec max rewind
  • 20 Gbytes user data (uncompressed)
  • LZ-1 enhanced compression
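
A few of the 9840 figures above can be related to each other with simple arithmetic. The sketch below only recombines numbers from the slide; the variable names are illustrative.

```python
TRACKS = 288
PARALLEL_HEADS = 16
TAPE_LENGTH_M = 274.0
TAPE_SPEED_M_S = 2.0
DATA_RATE_MB_S = 10.0
CAPACITY_GB = 20.0

# Number of end-to-end passes needed to cover all tracks.
passes = TRACKS // PARALLEL_HEADS
# Time for the full tape length to pass the head once.
seconds_per_pass = TAPE_LENGTH_M / TAPE_SPEED_M_S
# Time to stream the whole cartridge at the quoted data rate.
stream_time_s = CAPACITY_GB * 1000 / DATA_RATE_MB_S

print(f"passes over the tape:        {passes}")
print(f"seconds per pass:            {seconds_per_pass:.0f}")
print(f"all passes at tape speed:    {passes * seconds_per_pass:.0f} s")
print(f"20 GB at 10 MB/sec:          {stream_time_s:.0f} s")
```

The two time estimates agree only roughly; the slides do not say how the difference arises, so the figures are best read as rounded.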

36
Cartridge
Cartridge: 6 patents pending, 1 patent issued
38
current trends
  • Many new drives
  • Several aggressive road maps
  • Major application is backup
  • Expect strong competition at the low end from
    optical
scheduled for 2000
39
Optical Recording
  • The historical advantage of optical over magnetic
    technology was the potential recording density
  • Red laser -- spot size 0.4µ diameter -- 5
    Gbits/inch²
  • Many high end products - but never gave real
    competition to magnetic products
  • performance, cost
  • niche market for write-once applications
  • magnetic disk has now reached or exceeded optical
    recording densities
  • BUT for the first time we see real competition
    from low-end mass market products CD-R, DVD-R
    and DVD-RAM

40
Write Once - CD-R, DVD-R
  • preformed polycarbonate substrate
  • wobbled groove to guide and clock laser
  • photo/heat sensitive dye layer
  • cyanine
  • reflection layer
  • gold
  • laser spot heats dye, changes its structure which
    in turn deforms the substrate
  • read-out laser is absorbed/scattered by the
    deformation

41
DVD-R
  • laser system
  • λ 640 nm, numerical aperture 0.6, refractive
    index 0.8
  • spot diameter 0.4µ
  • capacity of side 4.7 GB
  • 1.3 MB/sec record/read speed
  • Prices (Panasonic)
  • 5.4K for the drive
  • 35 double sided media ( 3.90 / GB)
  • (a CD-R 640 MB disk costs about 1 in quantity)

42
Erasable DVD-RAM
  • phase change recording layer - TeGeSb
  • heated by laser spot
  • high power write - fast melt-cool cycle leaves
    amorphous spot with low reflectivity
  • lower power erase - slower melt-cool cycle leaves
    crystalline spot with high reflectivity
  • read-out - low power laser
  • land groove recording

43
DVD-RAM
  • capacity 2.6 GB per side
  • single layer only, unlike DVD-ROM
  • 4.7 GB per side in version 2 due in 2000
  • record and read-back performance - 1.3 MB/sec
  • access time 210 ms
  • 1999 prices
  • drive 640
  • double sided disk (5.2 GB) 35 (6.70 per GB)
  • With high volume
  • could we expect media costs to come down to 1-2
    per disk (like CD-R today)?
  • giving 0.2 per GB
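
The per-GB figures on this slide follow directly from the media price and capacity. A minimal check, with prices left in the slide's own unspecified currency units and a function name chosen for this example:

```python
def cost_per_gb(media_price: float, capacity_gb: float) -> float:
    """Media cost per GB, in whatever currency the price is quoted in."""
    return media_price / capacity_gb

DOUBLE_SIDED_GB = 5.2  # 2 x 2.6 GB per side

print(f"1999 price of 35 per disk: {cost_per_gb(35, DOUBLE_SIDED_GB):.2f} per GB")
for price in (1, 2):
    print(f"at {price} per disk:          {cost_per_gb(price, DOUBLE_SIDED_GB):.2f} per GB")
```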

44
exotic storage technologies
  • holography
  • atomic force microscopy
  • Keele Ultra High Density Memory

45
holographic storage
graphic Byte Magazine
46
atomic force microscopy
  • atomic force microscopy applied to data storage
    by IBM
  • sharp tip mounted on a micro-mechanical
    cantilever made from silicon nitride
  • heat and pressure applied as it is passed over
    plastic substrate
  • read-out - the cantilever and tip are scanned over
    the surface
  • 45 GB/in2 demonstrated
  • 300 GB/in2 theoretically possible

pictures - IBM Research Almaden
47
Keele Ultra High Density Memory
?
  • Basic research done at Keele University, by
    emeritus professor Ted Williams (inventor of an
    NMR scanner in late 70s/early 80s)
  • The Keele Ultra High Density Memory uses magneto
    optical alloys to store 2.3 TeraBytes of user
    memory on a device the size of a credit card, but
    8.5 cm thick, for less than 50!
  • Uses optical techniques to store and retrieve data
    in 3D storage
  • Multi-layer (3) recording
  • Could put 100 Gbytes in a wristwatch
  • All information on the technology controlled by a
    venture capital company - which says that
    licensing negotiations are under way with a large
    company - products can be expected in under 2
    years

?
48
Robotics - no problem, but prices are best at the top!
NSM jukebox (620 DVDs): 65 per 9.4 GB slot = 7/GB
20 per 50 GB slot = 0.4/GB
49
Part II - LHC requirements and solutions
  • summary of the requirements of the LHC
    experiments
  • strawman LHC computing farm
  • cost factors - an attempt to estimate the costs of
    storage in 2005
  • conclusions

50
LHC storage requirements
  • summary of the storage requirements of the LHC
    experiments
  • but this is just part of the computing fabric
  • which also includes processing and networking

51
Data Recording and Offline Computing Facilities
at CERN - for LHC experiments
  • For each LHC experiment capacity at CERN is
    needed for
  • Data Recording
  • First-pass reconstruction
  • Some re-processing
  • Basic Analysis (pass-1, pass-2) - ESD → AOD, TAG
  • Support for a few analysis groups
    (ATLAS, CMS: 4 groups, 100/1600 physicists)
  • Good external networking
  • Current assumption is that this would be
    complemented with a few large regional centres
    together providing about as much computing
    capacity as at CERN

raw data → ESD
52
Capacity Estimates
  • Estimate uses figures from CMS in mid-98 - ATLAS
    would be similar
  • Raw data is recorded at 100 MB/sec

53
PetaByte
  • 10^15 Bytes
  • 1,000 TeraBytes
  • 20,000 Redwood tapes
  • 30,000 Cheetah 36 disks
  • 100,000 dual-sided DVD-RAM disks
  • 1,500,000 sets of the Encyclopaedia Britannica
    (w/o photos)
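
The media counts above can be reproduced approximately from per-unit capacities. In this sketch the 36 GB Cheetah capacity comes from the slides; the 50 GB Redwood cartridge and the 9.4 GB dual-sided DVD-RAM disk are assumptions chosen because they are consistent with the slide's rounded counts.

```python
import math

PETABYTE = 10**15
GB = 10**9

# Capacity per unit in GB. Only the Cheetah figure is stated directly on the slides.
CAPACITY_GB = {
    "Cheetah 36 disk": 36,
    "Redwood tape (assumed 50 GB)": 50,
    "dual-sided DVD-RAM (assumed 9.4 GB)": 9.4,
}

def units_per_petabyte(capacity_gb: float) -> int:
    """Number of media units needed to hold one PetaByte."""
    return math.ceil(PETABYTE / (capacity_gb * GB))

for name, capacity in CAPACITY_GB.items():
    print(f"{name}: {units_per_petabyte(capacity):,}")
```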

54
disk capacity v. data rate
CERN physics 1999 12 MB/sec-per-TB
CMS 2006 74 MB/sec-per-TB
55
ALICE
  • ALICE requires a much higher data recording rate
    than ATLAS or CMS
  • 1 GB/sec - during the 1-2 month ions run
  • Total raw data 1 PByte per year
  • Tape data rates may remain modestly in the
    15-20 MB/sec range
  • Requiring a nominal 50-70 drives - in practice
    100-150 drives, and some good storage management
    software
  • This problem will be addressed by Fabrizio in his
    talks
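
The nominal drive count follows from dividing the recording rate by the per-drive rate. A minimal sketch, taking 1 GB/sec as 1000 MB/sec and using an illustrative function name:

```python
import math

def drives_needed(recording_rate_mb_s: float, drive_rate_mb_s: float) -> int:
    """Drives required to absorb the recording rate, all streaming at full speed."""
    return math.ceil(recording_rate_mb_s / drive_rate_mb_s)

ALICE_RATE_MB_S = 1000.0  # 1 GB/sec during the ions run

for drive_rate in (15.0, 20.0):
    print(f"{drive_rate:.0f} MB/sec per drive -> {drives_needed(ALICE_RATE_MB_S, drive_rate)} drives")
```

This nominal count assumes every drive streams continuously; mounts, positioning and failures are why the slide allows for roughly twice as many in practice.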

56
[Diagram: CMS Offline Farm at CERN circa 2006 (lmr for Monarc study - april 1999). Components: processors (5600 processors, 1400 boxes, 160 clusters, 40 sub-farms; 0.5 M SPECint95), disks (0.5 PByte; 5400 disks, 340 arrays ...), tapes (100 drives), farm network, storage network, LAN-WAN routers. Link bandwidths shown: 0.5, 0.8, 0.8 (daq), 1.5, 5, 6, 8, 12, 24, 250 and 960 Gbps]
57
Is there a problem?
  • Because HEP computing has the property of event
    independence, we can process any number of events
    in parallel, and so we can use real commodity
    components (well, maybe not for tertiary
    storage) - nothing special - just lots of them
  • The technology is looking good
  • but there are two small problems which come from
    the scale
  • -- Cost
  • -- Management
  • Fabrizio will talk about the storage management
    issues,
  • but note that the management problem applies
    across the board -
  • processors, network, storage, workflow, WAN

58
Cost evolution
  • cost factors
  • development costs
  • production costs:
  • technology
  • market volume
  • marketing costs
  • distribution costs
  • price factors
  • production costs
  • profit
  • competition
  • the best technology often does not win

59
Share of Hard Disk Market - Units shipped in 1998
1998: 145M disks sold - total revenue 30 Bn
110M in PCs (IDE)
30M SCSI/FCAL - mostly storage systems, which generated 13 Bn revenues
60
prices paid by CERN compared with 35% evolution
since 1990 - simple disk arrays (JBOD)
61
How much should we budget for hard disk?
  • So we are reasonably happy that LHC can use
    inexpensive disk, and that the prices will
    continue to decrease steadily
  • To minimise data loss and other operational
    problems associated with failing disks, we will
    use RAID. Today RAID systems come with a
    substantial price penalty, but we can expect that
    in 2005-06 we shall only have to pay for the
    redundant disk capacity.
  • Bottom line: at an estimated 4-8 per GByte, the
    500 TB needed by CMS will cost 2-4M
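
The bottom line is a straightforward multiplication, shown here as a small check. Prices are in the slide's own unspecified currency units and the function name is illustrative.

```python
def disk_budget(capacity_tb: float, price_per_gb: float) -> float:
    """Total cost of the given capacity at a price per GByte (1 TB = 1000 GB)."""
    return capacity_tb * 1000 * price_per_gb

CMS_DISK_TB = 500

low = disk_budget(CMS_DISK_TB, 4)
high = disk_budget(CMS_DISK_TB, 8)
print(f"budget range: {low / 1e6:.0f}M to {high / 1e6:.0f}M")
```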

62
tape price evolution
  • Estimating the cost of magnetic tape is not
    nearly so easy.

63
Total revenues 5 Bn
0.5" linear devices: DLT, 3590, 9840, 3570
0.5" helical: Redwood
19mm helical: AMPEX, Sony D1
8mm helical: EXABYTE, Sony AIT
4mm helical: DAT
64
DVD?
  • As we saw earlier, DVD-R and DVD-RAM have the
    potential to provide a very convenient way of
    archiving modest amounts of data - 5-10 GBytes -
    at a modest data rate (1.4 MB/sec).
  • DVD is a random access device - offering a
    significantly different functionality from
    sequential access tape.
  • The cost today for a DVD-RAM disk is a few per
    GB, rather similar to the cost of 8mm, 4mm tape.
  • With a little improvement in the cost of the
    DVD-RAM drive, DVD-RAM could destroy the
    market for low-end tape (home, small office
    backup, archive)

65
Data Centre Tapes
  • But we are concerned with data centre tapes -
    0.5" linear - where performance, capacity,
    robotics, ... are important factors
  • But so is overall cost - which today, for ATLAS
    or CMS, would be dominated by the media cost!
  • ALICE is a bit different

66
Can we estimate how tape costs will evolve?
  • NO - we cannot estimate - only guess for media
    - which dominates the overall cost
  • Cost of high quality drives will not change much
  • Cost of a robot slot will not change (but we may
    see competitive pricing for DLT format robots)

[Chart (log scale!): CHF per GB of data and CHF per foot of tape for Fe, Cr and MP media; single supplier v. multiple suppliers; future evolution marked "?"]
67
guesstimate for Magnetic Tape
  • Maybe the recording density increases by a factor
    of 4
  • So the cost of the media will fall to CHF 0.5 per
    GB
  • And a cartridge will hold 100 GB
  • The 5-year cost then works out at CHF 1/GB
  • Two problems for tapes
  • raw disk may be only 3 times more expensive
  • as we guessed earlier, DVD-RAM might be
    substantially cheaper (if there are suitably
    priced robotics!)

68
time to change the balance?
  • the classic model
  • use the disk as a cache of the active data, which
    is kept on tape
  • may not be the right one for LHC
  • we should consider using much more disk for
    all of the really active data
  • and using tape or something cheaper to archive
    the rest

69
conclusion (i)
  • disks OK
  • merging of magnetic and magneto-optical
    techniques will ensure that the technology can
    evolve smoothly well into the LHC time-frame
  • unlikely to be displaced as the standard for
    secondary storage
  • DVD - too slow, too small
  • holography - waiting for a material breakthrough
  • the rest are not on the LHC time-scale
  • robots OK

70
conclusion (ii)
  • ---- BUT tertiary storage is a problem
  • tape - reliability, cost, market - all
    questionable
  • DVD - may be a solution if a healthy market
    develops
  • very likely to eliminate tape for low-end PC
    applications
  • could well compete on price and reliability for
    data centre applications
  • but likely to remain low capacity, low
    performance
  • removable magnetic or magneto-optic disk may
    compete strongly with tape - but are not likely
    to be cheaper

71
conclusion (iii)
  • which may just give us the opportunity we need to
    change the analysis model
  • active data on disk
  • exchange data using random access DVDs
  • and use tape as the last resort - like the
    rest of the industry!
  • but how do you select the active raw data?