Title: Storage Embedded Networks (SEN)
1. Storage Embedded Networks (SEN)
Ismail Ari, ari_at_cse.ucsc.edu
Storage Systems Research Group, University of California, Santa Cruz
"With twenty-five years of Internet experience,
we've learned one way to deal with exponential
growth: caching." - Van Jacobson
2. Roadmap
- Problems with the current Internet infrastructure
- Existing solutions and why they are inefficient
- Our solution: SEN architecture
- Clusters of SEN devices
- Why the SEN solution is better
- Detailed SEN analysis and comparison
- Simulation results
- Future work
3. Problems with the Internet infrastructure
- Wide Area Network (WAN) bandwidth is a precious resource
  - today it is used inefficiently by doing redundant packet/file transmissions that could have been avoided
- Client response times are unbearable
  - even for small HTML files
- Content providers want to reach many clients
  - without being forced to beef up their servers forever
  - i.e., looking for opportunities to reduce server-side load in order to scale to many clients
4. Who should care about this research?
- ISPs, NSPs, content providers and distributors
- Clients and sys-admins concerned about their data access performance
- SEN architecture reduces
  - Mean Client Response Time (CRT) by 50-70%
  - Server load by typically 10-20%
  - WAN bandwidth usage by 10-20%
5. Problems with current proxies
- Goal: improve both hit and miss response times, rather than hit and miss rates!
- Hierarchical proxies yield poor response times
  - they degrade hit response times
    - hop-by-hop, connection established, store-and-forward
    - parents are usually far away and highly loaded
  - they degrade miss response times
    - while trying to prove that the object does not exist in the hierarchy
- Basic rules of thumb are being violated
  - do not slow down misses (just let me go!!)
  - minimize the number of hops to locate and access data
  - cache data close to clients, even push the data to reduce compulsory misses
- Proxies also require provisioning and monitoring
6. Our Solution: SEN device
- A network device with embedded volatile and non-volatile storage to be used for object caching
  - via object snooping in trusted routers
- Lightweight, for-once messaging for exclusive caching: SEN clusters
7. Motivations and Goals
- performance: SEN reduces client response time, network bandwidth, server load
- scalability: globally scalable networked caching
  - network devices provide best distribution
  - automatic proxying
  - no extra communication overhead between SEN caches
- monetary savings: pay once for cheap storage instead of paying forever for expensive network channels
- flexible policy: emulate push and pull at the same time
8. Salient Design Features
- Globally Unique Object Identification (GUOID)
  - payload type independent -> scales to all (old, new) traffic types
  - content-derived -> benefits data integrity
  - extremely low probability of name clashes
- Ad-hoc multicast support
  - Does not have the strict timing restrictions that IP multicast has
- Backwards compatible
  - Intermediary legacy routers are SEN devices with Prob(hit) = 0
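A content-derived identifier can be sketched as a cryptographic hash of the payload bytes. The slides do not name the function used for GUOIDs, so SHA-256 and the helper names `guoid` and `verify` below are assumptions for illustration only.

```python
import hashlib


def guoid(payload: bytes) -> str:
    """Derive an identifier from the payload alone (SHA-256 is an assumed choice)."""
    return hashlib.sha256(payload).hexdigest()


def verify(payload: bytes, object_id: str) -> bool:
    """Integrity check: recompute the identifier and compare it to the claimed one."""
    return guoid(payload) == object_id


if __name__ == "__main__":
    page = b"<html>hello</html>"
    oid = guoid(page)
    print(oid, verify(page, oid), verify(page + b"!", oid))
```

Because the name depends only on the bytes, the same object gets the same identifier regardless of traffic type, and a corrupted copy fails the check.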
9. SEN Operation
- 1- Standard routing protocols (OSPF, RIP) will negotiate caching policy in addition to route information
- 2- Clients request objects
- 3- SEN nodes cache all data objects passing through them
  - Replacement is done based on the selected policy
  - SEN node sends a local copy if it exists
  - Forward otherwise
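The request handling in step 3 can be sketched as follows. This is a minimal illustration, not the SEN implementation: the `SenNode` class, its `fetch` method, and the choice of LRU as the selected replacement policy are assumptions, and policy negotiation over OSPF/RIP is not modeled.

```python
from collections import OrderedDict


class SenNode:
    """Hypothetical SEN node: serve a local copy if present, otherwise forward."""

    def __init__(self, capacity, upstream):
        self.capacity = capacity      # number of objects the node can hold (>= 1)
        self.upstream = upstream      # callable: object_id -> bytes (next hop or origin)
        self.store = OrderedDict()    # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def fetch(self, object_id):
        if object_id in self.store:
            self.hits += 1
            self.store.move_to_end(object_id)   # refresh recency on a hit
            return self.store[object_id]
        self.misses += 1
        data = self.upstream(object_id)         # forward the request
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)      # LRU replacement (assumed policy)
        self.store[object_id] = data            # cache the object as it passes back
        return data


if __name__ == "__main__":
    origin = lambda oid: f"payload-{oid}".encode()
    core = SenNode(capacity=2, upstream=origin)
    edge = SenNode(capacity=1, upstream=core.fetch)
    for oid in ("a", "b", "a"):
        edge.fetch(oid)
    print("edge hits:", edge.hits, "core hits:", core.hits)
```

Chaining nodes through `upstream` mirrors a request travelling hop by hop toward the server, with each node on the path getting a chance to answer.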
10. 2-Level SEN Caching Hit-Rate Analysis
[Figure: hit percentage under RANDOM and ZIPF workloads]
Legend: L = LRU, D = Demotions, M = MRU, N = No demotions
3-, 4-, ..., n-level simulations show the superposition property. Policy switching is
very effective for exclusive caching.
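A toy simulation along the following lines can be used to compare two-level configurations such as LLN, LLD, and MLN. It is a sketch under assumptions, not the simulator behind these results: the `Cache` class, the `simulate` function, and the demotion mechanics (a level-1 victim is pushed to level 2, and a level-2 hit removes the object from level 2) are hypothetical. The first letter is taken to be the level-1 (client-side) policy and the second the level-2 policy.

```python
import random
from collections import OrderedDict


class Cache:
    """Fixed-size cache; policy "L" evicts least recently used, "M" most recently used."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy
        self.items = OrderedDict()  # ordered from least to most recently used

    def lookup(self, key):
        """Return True on a hit and refresh the key's recency."""
        if key in self.items:
            self.items.move_to_end(key)
            return True
        return False

    def insert(self, key):
        """Insert key as most recent; return the evicted key, or None."""
        if key in self.items:
            self.items.move_to_end(key)
            return None
        victim = None
        if len(self.items) >= self.capacity:
            victim, _ = self.items.popitem(last=(self.policy == "M"))
        self.items[key] = True
        return victim

    def remove(self, key):
        self.items.pop(key, None)


def simulate(config, requests, c1, c2):
    """Return the combined hit rate for a config string such as "LLD" or "MLN"."""
    level1, level2 = Cache(c1, config[0]), Cache(c2, config[1])
    demote = config[2] == "D"
    hits = 0
    for key in requests:
        if level1.lookup(key):
            hits += 1
            continue
        if level2.lookup(key):
            hits += 1
            if demote:
                level2.remove(key)      # keep the two levels exclusive
        elif not demote:
            level2.insert(key)          # object passes through level 2 on its way down
        victim = level1.insert(key)
        if demote and victim is not None:
            level2.insert(victim)       # demotion: an extra transfer to the upper level
    return hits / len(requests)


if __name__ == "__main__":
    rng = random.Random(0)
    n_objects = 1000
    weights = [1.0 / (rank + 1) for rank in range(n_objects)]  # Zipf-like popularity
    requests = rng.choices(range(n_objects), weights=weights, k=20000)
    for config in ("LLN", "LLD", "MLN"):
        print(config, round(simulate(config, requests, c1=50, c2=50), 3))
```

The cache sizes, object count, and Zipf exponent in the usage section are arbitrary; varying them is the point of such an experiment.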
11. 2-Level SEN Mean Client Response Time (CRT) Analysis
[Figure: mean CRT under RANDOM and ZIPF workloads]
Legend: L = LRU, D = Demotions, M = MRU, N = No demotions
Inverse relation between hit-rates and response
times. MLN or LMN should be selected, since
demotions use WAN bandwidth.
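As a rough illustration of the inverse relation between hit rates and response times, mean CRT for two cache levels can be written as below. This is a sketch under assumptions not stated on the slide: $h_1$ is the level-1 hit rate, $h_2$ is the level-2 hit rate among level-1 misses, and $t_1 < t_2 < t_s$ are hypothetical response times for a level-1 hit, a level-2 hit, and a fetch from the origin server.

```latex
\[
  \overline{\mathrm{CRT}}
    = h_1\, t_1
    + (1 - h_1)\, h_2\, t_2
    + (1 - h_1)(1 - h_2)\, t_s
\]
```

Since $t_1 < t_2 < t_s$, raising either hit rate lowers the mean. Demotion traffic is not captured by this expression.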
12. Comparison of SEN with Hierarchical Proxies
- UCSC network topology
- Parameters changed
  - workload (rand, zipf)
  - total amount of memory
  - link speeds
  - departmental correlations
  - replacement policies
- Metrics measured
  - hit rates
  - mean response times
  - server load reductions
13. UCSC-SEN vs. UCSC-proxy Results
- Mean CRT improvements over UCSC-proxy
  - 70% (rand), 50% (zipf) CRT reduction with unlimited memory
  - 50% (rand), 2% (zipf) with extremely limited memory
- Similar hit-rates with the same amount of memory
- Also similar cluster miss-rates and server load (reduction)
- Typically each SEN node acts as a proxy on its own
- 85% (conjoint-rand), 50% (disjoint-rand)
- 62% (conjoint-rand), 20% (disjoint-rand)
- Server load and WAN bandwidth usage reductions
- An increase in link delays is reflected proportionally in CRT, without any effect on hit-rates
14. Backup 1: Scalable distributed data access
- Clients access objects of varying popularity stored at geographically distributed sites.
- Popular objects create hot spots of network and server load,
  - limiting scalability and significantly increasing latency.
- Distributed caches have proven useful in fighting these problems by enabling data sharing and bringing data closer to the clients
- The more caches you have, the better.
- But how do we know which cache has which object?
  - hierarchical proxy caching (mostly Harvest-Squid based)
  - cluster of caches with summarized cache info, location hints, etc.
15. Backup 2: Solutions Exploiting File/Packet Reuse
- Web caching: proxy caching, browser caching
- Proxy caching solutions
  - Harvest-Squid proxy caches: Internet Cache Protocol (ICP)
  - Sharing data among caches: summary cache, location hints, etc.
- L4, L5 switching
  - Inktomi, ArrowPoint, Cisco WCCP switches
- Caching in operating system context
  - cooperative caching, global memory system (GMS)
- Packet loss and pathologies on wireless and wired links
  - Wireless: SACK, I-TCP, Snoop Protocol, mobile IP, ad-hoc
  - Wired: C-TCP
- Problems
  - solutions for specific data types and loss events
  - cannot exploit correlations between clients: session dependent
16. Backup 3: Trace Workload Miss Rate Characteristics (2-3 GB cache size)
1- Do not increase latency for compulsory misses
2- Capacity misses become critical if we use memory instead of disks
17. Backup 4: Can't use demotions in WAN
Misses cause retrievals from upstream caches,
further resulting in demotions. Demotions at
each layer contribute to traffic increase and
quickly add up. Traffic increase due to
demotions is 50% in 2-level, 100% in 3-level
caching.
18. Backup 5: Analytical model
- Backwards compatibility
  - Intermediary legacy routers are SEN devices with Prob(hit) = 0
  - logically collapse into SEN links
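The collapse of legacy routers into links can be checked with a small expected-CRT calculation. This is a sketch under assumptions rather than the slide's analytical model: the function `expected_crt`, the per-link delays, and the convention that response time is the sum of link delays travelled are all hypothetical, and any per-hop processing delay at a legacy router is assumed to be folded into its link delay.

```python
def expected_crt(hit_probs, link_delays):
    """Expected response time along a path of caching nodes.

    hit_probs[i]   : probability that node i holds the object, given that all
                     nodes before it missed.
    link_delays[i] : delay of the link leading to node i; the final entry is
                     the link from the last node to the origin server, so
                     len(link_delays) == len(hit_probs) + 1.
    """
    total = 0.0
    reach = 1.0       # probability that the request gets this far
    travelled = 0.0   # accumulated link delay up to the current node
    for p, delay in zip(hit_probs, link_delays):
        travelled += delay
        total += reach * p * travelled
        reach *= 1.0 - p
    travelled += link_delays[-1]
    total += reach * travelled   # every node missed, served by the origin
    return total


if __name__ == "__main__":
    # SEN node, legacy router with Prob(hit) = 0, SEN node, then the server.
    with_legacy = expected_crt([0.3, 0.0, 0.4], [1.0, 2.0, 3.0, 10.0])
    # Same path with the legacy router merged into one link of delay 2 + 3.
    collapsed = expected_crt([0.3, 0.4], [1.0, 5.0, 10.0])
    print(with_legacy, collapsed)
```

A node that never hits contributes nothing except the delay of its adjacent links, so both calls describe the same path.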
19. Backup 6: Validity of workload assumptions
20. Backup 7: Superposition
21. Backup 8: UCSC-SEN vs. UCSC-proxy
22. Backup 9: UCSC-SEN vs. UCSC-proxy, fully-conjoint (100%) and fully-disjoint (0%) requests
23. Backup 10: LLD vs. MLN
[Figure: two sets of cache boxes, each holding the values 0, 9, 4, 5, 8]
I want to do some animations to show how
demotions work and how MLN caches more
exclusively without any demotions.
But I don't know how to overwrite the values in
the boxes.