Web Caching File System - PowerPoint PPT Presentation

1
Web Caching File System
  • Jonathan Ledlie
  • Matt McCormick

2
Outline
  • Motivation - why design a new file system?
  • Current state of affairs
  • Design of web caching file system
  • Performance comparison - WCFS to Unix
  • Future work
  • Conclusions

3
Two points
  • Optimizing on invariants
  • Impending I/O bottleneck

4
Motivation
  • Disks are slow
  • Communication rates increasing rapidly
  • Web cache anomalies
    • files are written only when they are created
    • file permissions stay constant
    • all files have copies at the original server

5
Web Caching File System vs. Unix File System
  • 50,000 of each operation
  • WCFS is using one thread
  • Unix file I/O is synchronous
  • 500 MHz PIII, 8.5 GB disk

6
Current State of Affairs: Internet Topology
[Diagram: clients, client-side caches, the network, a server-side cache, and the server]

7
Current State of Affairs: Unix File System
  • Life of file in web cache
    • create, write, close
    • open, read, close (multiple times)
    • delete
  • Using i-nodes
    • lots of flexibility that is not needed
    • extra access to disk for each file reference
  • Directory structure and name lookup

8
Design of WCFS: Specializations
  • Life of file
    • create
    • read (multiple times)
    • delete
  • No i-nodes or permanent file status data
    • faster create and file access
  • In-memory hash table stores file locations
    • faster file lookup and delete
  • All file data written to consecutive blocks
    • faster reads and writes
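
The specializations above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name ExtentStore, the 512-byte block size, and the bump allocator are assumptions made for illustration, and a bytearray stands in for the disk. It shows a write-once file whose only metadata is an in-memory (start block, length) entry, so a read is one hash lookup followed by one contiguous copy.

```python
class ExtentStore:
    """Toy write-once store: each file is one contiguous run of blocks."""

    BLOCK_SIZE = 512  # assumed block size, not taken from the slides

    def __init__(self, num_blocks):
        self.disk = bytearray(num_blocks * self.BLOCK_SIZE)  # stands in for the disk
        self.table = {}     # name -> (start_block, length_in_bytes), memory only
        self.next_free = 0  # naive bump allocator; freed space is not reused here

    def create(self, name, data):
        if name in self.table:
            raise FileExistsError(name)  # invariant: a file is written only once
        nblocks = -(-len(data) // self.BLOCK_SIZE)  # ceiling division
        start = self.next_free
        if (start + nblocks) * self.BLOCK_SIZE > len(self.disk):
            raise OSError("toy disk is full")
        offset = start * self.BLOCK_SIZE
        self.disk[offset:offset + len(data)] = data
        self.table[name] = (start, len(data))
        self.next_free += nblocks

    def read(self, name):
        start, length = self.table[name]  # one hash lookup, no i-node fetch
        offset = start * self.BLOCK_SIZE
        return bytes(self.disk[offset:offset + length])  # one contiguous copy

    def delete(self, name):
        del self.table[name]  # dropping the table entry is the whole delete
```

Because nothing is persisted besides the data blocks, losing the table loses the cache contents, which is acceptable only because every file also exists at its original server.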

9
Design of WCFS: Object Diagram
[Diagram: CacheDisk (getNewCacheObject) with its Cache objects and its Disk; the Disk holds a RequestQueue of Request objects, a FileTable, and a BitMap]

10
Design of WCFS: Disk Initialization
[Diagram: CacheDisk and Disk]
  • First create cache disk object
    • creates disk object to represent physical disk
    • starts a disk thread running
  • Disk object and physical disk
    • utilize an SGI raw I/O patch for Linux
    • bypass kernel and kernel buffers

11
Design of WCFS: Disk Object
[Diagram: Disk with RequestQueue, FileTable, and BitMap]
  • FileTable
    • stores names and locations of files on disk
    • MD5 conversion of URL
  • RequestQueue
    • stores read and write requests from process threads
    • whenever anything is in the queue, the disk thread runs
  • BitMap
    • keeps status of each block on disk
    • locates and marks spot on disk for files to be placed
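
A rough sketch of how a FileTable and BitMap along these lines might look follows. The slides do not give the real data layout, so the method names, the first-fit search, and the (start block, block count) entry format are assumptions; only the use of an MD5 digest of the URL as the key and the one-status-per-block bitmap come from the slide.

```python
import hashlib


class BitMap:
    """One flag per disk block; True means the block is in use."""

    def __init__(self, num_blocks):
        self.used = [False] * num_blocks

    def allocate(self, nblocks):
        """First-fit search for nblocks consecutive free blocks; returns the start."""
        if nblocks <= 0:
            raise ValueError("nblocks must be positive")
        run = 0
        for i, in_use in enumerate(self.used):
            run = 0 if in_use else run + 1
            if run == nblocks:
                start = i - nblocks + 1
                self.used[start:i + 1] = [True] * nblocks  # mark the spot
                return start
        raise OSError("no run of %d free blocks" % nblocks)

    def free(self, start, nblocks):
        self.used[start:start + nblocks] = [False] * nblocks


class FileTable:
    """In-memory map from the MD5 digest of a URL to (start_block, nblocks)."""

    def __init__(self):
        self.entries = {}

    @staticmethod
    def key(url):
        return hashlib.md5(url.encode("utf-8")).digest()  # fixed 16-byte key

    def add(self, url, start, nblocks):
        self.entries[self.key(url)] = (start, nblocks)

    def lookup(self, url):
        return self.entries[self.key(url)]

    def remove(self, url):
        return self.entries.pop(self.key(url))
```

Hashing the URL gives every entry a fixed-size key regardless of URL length, at the cost of one digest computation per lookup.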

12
Design of WCFS: Request Objects
[Diagram: RequestQueue holding Request objects]
  • Request
    • write: starting block, length, buffer to write from
    • read: starting block, length, buffer to write to
    • (implies files must be smaller than virtual memory)
  • Currently queued by FIFO (soon to be one-way elevator)
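
To make the difference between the two queueing policies concrete, here is a small sketch. The Request fields follow the slide (the buffer is left out for brevity); the function names and the simple seek-distance measure are assumptions, and the head is treated as resting on each request's starting block after serving it.

```python
from collections import namedtuple

# op is "read" or "write"; the buffer is omitted to keep the sketch short
Request = namedtuple("Request", ["op", "start_block", "length"])


def fifo_order(pending):
    """Serve requests in arrival order."""
    return list(pending)


def one_way_elevator_order(pending, head):
    """Sweep upward from the head, then wrap around to the lowest pending block."""
    ahead = sorted((r for r in pending if r.start_block >= head),
                   key=lambda r: r.start_block)
    behind = sorted((r for r in pending if r.start_block < head),
                    key=lambda r: r.start_block)
    return ahead + behind


def seek_distance(order, head):
    """Total blocks the head travels, counting the wrap-around as a normal seek."""
    total = 0
    for r in order:
        total += abs(r.start_block - head)
        head = r.start_block
    return total


pending = [Request("read", 90, 4), Request("write", 10, 2),
           Request("read", 60, 1), Request("read", 20, 8)]
fifo_cost = seek_distance(fifo_order(pending), head=50)
elevator_cost = seek_distance(one_way_elevator_order(pending, head=50), head=50)
```

For this particular set of requests the elevator ordering travels fewer blocks than FIFO; how much it helps in practice depends on the request mix.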

13
Design of WCFS: Cache Objects for Threading
[Diagram: CacheDisk with several Cache objects]
  • Multiple threads for handling clients
    • Each thread gets a single Cache object
  • Cache object
    • create, read, remove, length, sync
  • Thread create and read are asynchronous
    • turned into request objects
    • placed in request queue for disk
  • Thread calls sync to guarantee its operations are done
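
The asynchronous pattern described above can be sketched with a queue and a single disk thread. This is an illustration under assumptions, not the WCFS code: a dict stands in for the physical disk, requests are plain tuples, and sync here waits for every queued request rather than only the calling thread's own.

```python
import queue
import threading


class Disk:
    """Owns the request queue and the single thread that services it."""

    def __init__(self):
        self.requests = queue.Queue()  # FIFO, as on the slides
        self.blocks = {}               # stands in for the physical disk
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            op, key, payload = self.requests.get()  # sleeps until work arrives
            try:
                if op == "write":
                    self.blocks[key] = payload
                else:  # "read": fill the caller's buffer in place
                    payload.extend(self.blocks.get(key, b""))
            finally:
                self.requests.task_done()


class Cache:
    """Per-thread handle; create and read return before the disk has acted."""

    def __init__(self, disk):
        self.disk = disk

    def create(self, url, data):
        self.disk.requests.put(("write", url, bytes(data)))

    def read(self, url, buffer):
        self.disk.requests.put(("read", url, buffer))

    def sync(self):
        self.disk.requests.join()  # returns once all queued requests are done


disk = Disk()
cache = Cache(disk)
cache.create("http://example.com/", b"page body")
buf = bytearray()
cache.read("http://example.com/", buf)  # buf may still be empty here
cache.sync()                            # after this, buf holds the data
```

The caller must not inspect the read buffer until sync returns, which is the price of letting client threads run ahead of the disk.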

14
Design of WCFS: Code Snippet
  • Common web caching operations
    • create(url, buffer, size)
    • read(url, buffer)
    • remove(url)
    • sync()
  • Equivalent operations in Unix
    • fd = creat(url, permissions)
    • write(fd, buffer, size)
    • close(fd)
    • fd = open(url, mode)
    • read(fd, buffer, size)
    • close(fd)
    • unlink(url)
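
For comparison, the Unix side of the list can be written out using Python's thin wrappers over the same system calls. The helper names and the temporary file path are assumptions for illustration; creat(path, mode) is expressed as open with O_WRONLY | O_CREAT | O_TRUNC, which is its defined equivalent.

```python
import os
import tempfile


def unix_create(path, data):
    # creat(path, 0644) is open() with O_WRONLY | O_CREAT | O_TRUNC
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)  # a single write is enough for this small payload
    finally:
        os.close(fd)


def unix_read(path, size):
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, size)
    finally:
        os.close(fd)


def unix_remove(path):
    os.unlink(path)


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "cached-object")  # hypothetical file name
unix_create(path, b"hello")
data = unix_read(path, 5)
unix_remove(path)
os.rmdir(workdir)
```

Each cached object costs three system calls to create and three more for every read, where the WCFS interface uses one call for each.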

15
Design of WCFS: Basic File System Layout
[Diagram: CacheDisk (getNewCacheObject) with its Cache objects and its Disk; the Disk holds a RequestQueue of Request objects, a FileTable, and a BitMap]

16
Design of WCFS: Feature Recap
  • Raw I/O
  • Multi-threading
  • Asynchronous I/O
  • Quick name lookup
  • File data on consecutive blocks

17
Performance Comparisons: Trace

18
Performance Comparisons: Create

19
Performance Comparisons: Read

20
Performance Comparisons: Delete

21
Two points, revisited
  • Optimizing on invariants
  • Impending I/O bottleneck

22
What's coming...
  • Real raw I/O and proper memory alignment
  • Testing with more threads
  • Trace testing
  • Determining optimal fragmentation and cleaning
  • Is MD5 a bottleneck?
  • Elevator algorithm
  • Adding save on clean shutdown
  • Examine memory requirements for FileTable

23
Conclusions
  • Unix file system induces unnecessary overhead
  • Possible to take advantage of application-specific
    traits
  • Specialization works