Title: FAWN: Fast Array of Wimpy Nodes
1. FAWN: Fast Array of Wimpy Nodes
- David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan
- Presented by Sivaguru Kannan
2. Motivation
- Growing importance of high-performance, data-intensive applications
- Key-value storage systems (Amazon, LinkedIn, Facebook): small, random accesses over large data sets
- Power costs dominate the total cost of operations
- Up to 50% of total cost; data centers are being constructed near electrical stations
- Lack of a power-efficient solution for data-intensive computing
- Disk-based solutions: slow random access, high power consumption
- DRAM-based solutions: expensive, high power consumption
3. Contributions
- A cost-effective, power-efficient cluster for data-intensive workloads
- Wimpy nodes: low-power embedded CPUs paired with flash storage
- FAWN architecture
- FAWN-KV (key-value) design and implementation
- Log-structured datastore (FAWN-DS)
- Evaluation: FAWN versus traditional clusters
4. Why FAWN?
- The growing CPU-I/O gap makes fast processors ineffective for I/O-bound applications
- FAWN uses low-power CPUs to reduce I/O-induced idle cycles
- High-frequency processors consume more power, and much of that power goes to bridging the CPU-I/O gap (branch prediction, on-chip caching)
- FAWN CPUs execute more instructions per joule than high-frequency processors
- Dynamic power scaling is inefficient: a DVFS system at 20% capacity still consumes 50% of peak power
- The FAWN approach reduces peak power consumption
6. Target Workload Characteristics
- Key-value lookup system
- I/O intensive
- Random accesses over large data sets
- Small request sizes (hundreds of bytes)
- Massively parallel, independent requests
- Service-level agreements on response time
7. FAWN Architecture
- FAWN front end
- Routes requests and responses
- Manages back-end nodes
8. FAWN Architecture (Contd.)
- FAWN back-end node (wimpy node)
- Back ends are organized as a ring
- Each physical node hosts V virtual nodes
- Each virtual node is responsible for a key range
- Data is stored in flash using FAWN-DS
- Keys are mapped to nodes by consistent hashing
9. FAWN Architecture
- FAWN datastore (FAWN-DS)
- One datastore file for each virtual node
- Log-structured: updates are appended to each file (semi-random writes across files)
10. FAWN Datastore
- 6-byte in-memory hash entry per object: key fragment (15 bits), valid flag (1 bit), offset into the data log (4 bytes)
- Only a fragment of the key is stored in the hash table (DRAM is limited)
- A lookup may require more than one flash read after finding a matching hash entry, since key fragments can collide (see the sketch below)
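To make the lookup path concrete, here is a minimal sketch assuming a simple bucketed index and a log object exposing read_full_key/read_value accessors; the bucket count, hash function, and these names are illustrative assumptions, not the authors' code.

    import hashlib

    FRAG_BITS = 15                 # key-fragment bits kept in DRAM
    NUM_BUCKETS = 1 << 16          # assumed index size

    def frag_hash(key: bytes) -> int:
        return int.from_bytes(hashlib.sha1(key).digest()[:8], "big")

    class HashEntry:
        __slots__ = ("fragment", "valid", "offset")
        def __init__(self, fragment: int, offset: int):
            self.fragment = fragment   # 15-bit key fragment
            self.valid = True          # 1-bit valid flag
            self.offset = offset       # 4-byte offset into the data log

    class FawnDSIndex:
        def __init__(self, log):
            self.buckets = [[] for _ in range(NUM_BUCKETS)]
            self.log = log             # provides read_full_key(offset), read_value(offset)

        def insert(self, key: bytes, offset: int) -> None:
            h = frag_hash(key)
            self.buckets[h % NUM_BUCKETS].append(
                HashEntry(h & ((1 << FRAG_BITS) - 1), offset))

        def lookup(self, key: bytes):
            h = frag_hash(key)
            fragment = h & ((1 << FRAG_BITS) - 1)
            for entry in self.buckets[h % NUM_BUCKETS]:
                if entry.valid and entry.fragment == fragment:
                    # Fragment match: confirm against the full key stored on flash.
                    if self.log.read_full_key(entry.offset) == key:
                        return self.log.read_value(entry.offset)
                    # Otherwise a fragment collision: keep probing (extra flash read).
            return None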
11. FAWN-DS Operations
- Store
- Append the new data to the log
- Update the in-memory hash table pointer to the new location
- The old data is now orphaned
- Delete
- Append a delete entry to the data log
- Invalidate the hash table entry
- The in-memory hash table is volatile, so the delete entry in the log is necessary for the deletion to survive a crash (a sketch follows)
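Both operations reduce to a log append plus an index update. A minimal sketch, assuming a binary append-only log file and simplifying the index to a plain dictionary; the record layout and helper names are illustrative, not the authors' on-flash format.

    import struct

    def append_record(log_file, key: bytes, value: bytes, is_delete: bool) -> int:
        """Append one record and return its offset in the log."""
        offset = log_file.tell()
        header = struct.pack("<?II", is_delete, len(key), len(value))
        log_file.write(header + key + value)
        return offset

    def store(index: dict, log_file, key: bytes, value: bytes) -> None:
        offset = append_record(log_file, key, value, is_delete=False)
        index[key] = offset            # any old offset becomes an orphaned log entry

    def delete(index: dict, log_file, key: bytes) -> None:
        append_record(log_file, key, b"", is_delete=True)   # durable record of the delete
        index.pop(key, None)           # invalidate the in-memory entry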
12. FAWN-DS Maintenance Operations
- Merge
- Occurs when a node leaves the ring
- The data log of the departing node is merged into the data log of its successor
- Data entries are appended to the successor's log
- Delete entries are propagated to the successor as well
13. FAWN-DS Maintenance Operations
- Split
- Occurs when a node joins the ring
- The data log of the successor node is split
- Entries falling in the key range of the new node are written to the new node's data log
- Delete entries are propagated to the new node
14. FAWN-DS Maintenance Operations
- Compact (garbage collection)
- Skips entries outside the node's key range (left behind by Split) and orphaned entries (left behind by Store)
- Writes the remaining valid entries to a new output data log (see the sketch below)
- Maintenance operations can be performed concurrently with basic operations
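A minimal compaction sketch, reusing the append_record helper from the store/delete sketch above; the record iterator and the key_range object (anything supporting the in operator) are assumptions for illustration.

    def compact(old_log_records, index: dict, key_range, new_log_file) -> dict:
        """old_log_records yields (offset, key, value, is_delete) in log order."""
        new_index = {}
        for offset, key, value, is_delete in old_log_records:
            if key not in key_range:
                continue                      # out of this node's range after a Split
            if is_delete or index.get(key) != offset:
                continue                      # deleted, or orphaned by a later Store
            new_index[key] = append_record(new_log_file, key, value, is_delete=False)
        return new_index                      # swapped in for the old index when done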
15. FAWN Key-Value System: Front End
- Clients send requests to front-end servers through the FAWN-KV API
- Each front end manages a group of back-end nodes (a chunk of the key space)
- If the target key falls in the front end's key space, the request is sent to the appropriate back-end node
- Otherwise, the request is forwarded to the front end that owns the key
16. FAWN Key-Value System: Front End
- The front end caches responses
- Caching reduces latency under skewed loads (see the routing sketch below)
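A minimal sketch of this routing and caching logic. The backend_ring that maps a key to its back-end node, the peer_for_key function that returns the owning front end, and the key-range objects supporting the in operator are all assumed names, not the authors' interfaces.

    class FrontEnd:
        def __init__(self, my_key_ranges, backend_ring, peer_for_key):
            self.my_key_ranges = my_key_ranges    # key ranges this front end manages
            self.backend_ring = backend_ring      # maps a key to a back-end virtual node
            self.peer_for_key = peer_for_key      # maps a key to the owning front end
            self.cache = {}

        def get(self, key):
            if key in self.cache:
                return self.cache[key]            # popular keys served from the cache
            if any(key in r for r in self.my_key_ranges):
                value = self.backend_ring.lookup(key).get(key)   # local back end
            else:
                value = self.peer_for_key(key).get(key)          # other front end
            self.cache[key] = value
            return value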
17. FAWN Key-Value System: Mapping Keys to Virtual Nodes
- Consistent hashing
- Virtual node IDs (VIDs) and keys share a 160-bit identifier space
- A key is owned by the virtual node with the smallest VID greater than or equal to the key, wrapping around the ring
- Advantage: adding a node only requires a change at its successor virtual node (a single Split); a sketch follows
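A minimal consistent-hashing sketch, assuming SHA-1 is used to derive the 160-bit identifiers; the hash choice and class names are illustrative.

    import bisect
    import hashlib

    def ring_id(name: bytes) -> int:
        return int.from_bytes(hashlib.sha1(name).digest(), "big")   # 160-bit ID

    class Ring:
        def __init__(self, virtual_node_names):
            self.owner = {ring_id(n): n for n in virtual_node_names}
            self.vids = sorted(self.owner)

        def owner_of(self, key: bytes):
            k = ring_id(key)
            i = bisect.bisect_left(self.vids, k)     # smallest VID >= key
            vid = self.vids[i % len(self.vids)]      # wrap around the ring
            return self.owner[vid]

    # Adding a virtual node only takes over part of its successor's key range.
    ring = Ring([b"A1", b"B1", b"D1"])
    print(ring.owner_of(b"some-key"))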
18. FAWN Key-Value System: Mapping Keys to Virtual Nodes
19. FAWN Key-Value System: Replication
- Chain: the list of virtual nodes that hold replicas of a key range's data log
- Each virtual node is a member of R chains: head for one, tail for one, and middle node for the other R-2
- A separate datastore file is kept for each replica
20. FAWN Key-Value System: Chain Replication
- Puts go to the head and propagate down the chain
- Each node buffers a put until it receives the acknowledgement from downstream
- Gets go to the tail, so every node in the chain has already seen the returned update (see the sketch below)
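A minimal chain-replication sketch with synchronous forwarding, using a plain dictionary in place of each replica's FAWN-DS file; this is an illustrative simplification, not the authors' code.

    class ChainNode:
        def __init__(self, successor=None):
            self.store = {}               # stands in for this replica's FAWN-DS file
            self.successor = successor
            self.pending = {}             # puts buffered until acked by downstream

        def put(self, key, value) -> bool:
            self.store[key] = value
            if self.successor is None:    # tail: the write is now visible to gets
                return True               # ack travels back up the chain
            self.pending[key] = value
            acked = self.successor.put(key, value)
            if acked:
                self.pending.pop(key, None)
            return acked

        def get(self, key):
            assert self.successor is None, "gets are served only by the tail"
            return self.store.get(key)

    # Chain head -> mid -> tail; puts enter at the head, gets are answered by the tail.
    tail = ChainNode()
    head = ChainNode(successor=ChainNode(successor=tail))
    head.put("k", "v")
    print(tail.get("k"))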
21. FAWN Key-Value System: Join (with replication)
- The key range of the successor node is split and part of it is assigned to the new node
- The new node receives copies of R ranges of data (one as the primary range, R-1 as replicas)
- The front end then treats the new virtual node as head or tail for the appropriate ranges
- Virtual nodes may free space for key ranges for which they are no longer responsible
22. FAWN Key-Value System: Joining a Node in a Chain
- Example: E1 is the tail of chain (B1-D1-E1); C1 is a new node joining between B1 and D1
- Pre-copy phase
- The data log is copied from E1 to C1
23. Joining a Node in a Chain: Chain-Insertion and Log-Flush Phases
- The front end sends a chain-membership message to all nodes; B1, C1, and D1 update their neighbor lists
- Updates at B1 now stream to C1 and are logged in a temporary buffer
- C1's datastore is still inconsistent: updates that arrived between the end of pre-copy and chain insertion were not logged at C1
- D1 becomes the new tail of the chain
24. Joining a Node in a Chain: Chain-Insertion and Log-Flush Phases
- Put requests issued after C1's insertion are replicated at B1, C1, and D1
- D1 forwards each message on to E1
- E1 pushes the remaining log entries (those written after the pre-copy phase) to C1
- C1 merges the pushed data log entries with its new information and updates its in-memory hash table; a condensed sketch of the three phases follows
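A condensed sketch of the three join phases for C1 joining between B1 and D1. Every helper call here (copy_log, receive_log, send_chain_membership, merge_logs, flush_remaining, rebuild_index) is an assumed placeholder for illustration, not the authors' API.

    def join_chain(front_end, B1, C1, D1, E1, key_range):
        # Phase 1: pre-copy.  E1 (the old tail) streams its data log for the range to C1.
        C1.receive_log(E1.copy_log(key_range))

        # Phase 2: chain insertion.  The front end announces the new chain B1 -> C1 -> D1;
        # B1 starts streaming updates to C1, which logs them in a temporary buffer.
        front_end.send_chain_membership([B1, C1, D1], new_tail=D1)

        # Phase 3: log flush.  E1 pushes the entries written after pre-copy ended;
        # C1 merges them with the buffered updates and rebuilds its in-memory index.
        C1.merge_logs(E1.flush_remaining(key_range))
        C1.rebuild_index()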
25. FAWN Key-Value System: Leave (with replication)
- Merge the departing node's key range into its successor virtual node
- Add a new replica to each of the R chains of which the old node was a member (similar to a Join)
26. FAWN Key-Value System: Failure Detection
- The front end exchanges heartbeat messages with its back ends
- If a back end stops responding, the front end initiates the leave protocol for it
- Link failures (as opposed to node failures) are not handled
- Front-end failure: the affected back ends attach to a new front end (see the sketch below)
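A minimal heartbeat loop in the spirit of this scheme; the ping RPC, the interval, and the miss threshold are all assumptions for illustration, not values from the paper.

    import time

    HEARTBEAT_INTERVAL = 1.0     # seconds (assumed)
    MISSES_BEFORE_LEAVE = 3      # consecutive misses tolerated (assumed)

    def monitor_backends(front_end, backends):
        missed = {b: 0 for b in backends}
        while backends:
            for b in list(backends):
                try:
                    b.ping(timeout=HEARTBEAT_INTERVAL)   # heartbeat RPC to the back end
                    missed[b] = 0
                except TimeoutError:
                    missed[b] += 1
                    if missed[b] >= MISSES_BEFORE_LEAVE:
                        front_end.initiate_leave(b)      # treat the node as having left
                        backends.remove(b)
            time.sleep(HEARTBEAT_INTERVAL)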
27. Evaluation: Hardware
- 21-node cluster
- Single-core 500 MHz processor per node
- 256 MB DRAM per node
- 100 Mbps Ethernet
- Power consumption per node: 3 W idle, 6 W with CPU, network, and flash at 100% load
- Nodes connected to the front end through two 16-port Ethernet switches
28. Evaluation: Workload
- Read-intensive
- Object sizes from 256 bytes to 1 KB
29. Wimpy Node Performance
- Measured with iozone; flash formatted with the ext2 filesystem
- 3.5 GB file size
- 1 KB entries
30. FAWN-DS Performance
- When the datastore is smaller than 256 MB, queries are serviced by the back end's buffer cache
- FAWN-DS adds some overhead over raw wimpy-node flash performance because of hash lookups and data copies
31. FAWN-DS Performance
- Bulk store speed
- Log-structured, so bulk writes are sequential
- 96% of the raw wimpy-node write speed (23.2 MB/s for 2 million 1 KB entries)
- Put speed
- One FAWN-KV node with V virtual nodes, each in R chains, maintains R x V data files
- Writes are sequential within each file, so the overall pattern is semi-random
- Semi-random write performance varies across flash devices
32. FAWN-DS vs. BerkeleyDB
- BerkeleyDB is a disk-based database, not optimized for flash
- It achieves 0.07 MB/s when storing 7 million 200-byte entries on flash
- BerkeleyDB over NILFS2 (a log-structured file system) does not help: too much metadata is generated for file-system checkpointing and rollback
- A log-structured datastore like FAWN-DS is necessary
33. FAWN-DS: Read-Intensive vs. Write-Intensive Workloads
- Writes are sequential, so write-heavy workloads achieve higher throughput
- A pure-write workload triggers frequent cleaning
- Throughput drops to 700-1000 queries/sec during compaction
34. FAWN-KV Cluster Performance
- A single node accessed over the network serves requests at about 70% of the rate of a stand-alone FAWN-DS
- The gap comes from network overhead and request marshaling/unmarshaling
- Load imbalance across the network also reduces throughput
35. FAWN-KV Cluster Performance
- Get performance: 36,000 256-byte queries/sec on a 20 GB dataset
- Power consumption: about 99 W for the cluster, giving 36,000 / 99 = 364 queries/joule
36. Impact of Background Operations (Join)
41. Conclusion
- The FAWN architecture reduces the energy consumption of cluster computing
- FAWN-KV addresses the challenges of using wimpy nodes for key-value storage
- Log-structured, memory-efficient datastore
- Efficient replication and failover
- Meets both energy-efficiency and performance goals