Title: FAWN: Fast Array of Wimpy Nodes
1. FAWN: Fast Array of Wimpy Nodes
- David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan
- Presented by Sivaguru Kannan
2. Motivation
- Growing importance of high-performance, data-intensive applications
- Key-value storage systems (Amazon, LinkedIn, Facebook): small, random accesses over large data sets
- Power costs dominate the total cost of operations
- Up to 50% of total cost; data centers are being constructed near electrical stations
- Lack of a power-efficient solution for data-intensive computing
- Disk-based solutions: slow random access, high power consumption
- DRAM-based solutions: expensive, high power consumption
3. Contributions
- A cost-effective, power-efficient cluster for data-intensive workloads
- Wimpy nodes: low-power embedded CPUs paired with flash storage
- FAWN architecture
- FAWN-KV (key-value) design and implementation
- Log-structured datastore (FAWN-DS)
- Evaluation: FAWN versus traditional clusters
4. Why FAWN?
- The growing CPU-I/O gap makes fast processors ineffective for I/O-bound applications
- FAWN uses low-power CPUs to reduce I/O-induced idle cycles
- High-frequency processors consume more power, and much of that power goes to bridging the CPU-I/O gap (branch prediction, on-chip caching)
- FAWN CPUs execute more instructions per joule than high-frequency processors
- Dynamic power scaling is inefficient: a DVFS system at 20% capacity still consumes 50% of peak power
- The FAWN approach reduces peak power consumption
6. Target Workload Characteristics
- Key-value lookup system
- I/O intensive
- Random accesses over large data sets
- Small request sizes (hundreds of bytes)
- Massively parallel, independent requests
- Service-level agreements on response time
7. FAWN Architecture
- FAWN front end
- Routes requests and responses
- Manages back-end nodes
8. FAWN Architecture (Contd.)
- FAWN back-end node (wimpy node)
- Back ends are organized as a ring
- Each physical node hosts V virtual nodes
- Each virtual node is responsible for a key range
- Data is stored in flash using FAWN-DS
- Keys are mapped to nodes by consistent hashing
9. FAWN Architecture
- FAWN datastore (FAWN-DS)
- One datastore file for each virtual node
- Log-structured: updates are appended to each file (semi-random writes across files)
10. FAWN Datastore
- 6-byte in-memory hash entry per object: key fragment (15 bits), valid flag (1 bit), offset into the data log (4 bytes)
- Only a fragment of the key is stored in the hash table (DRAM is limited)
- A lookup may require more than one flash read after finding a matching hash entry, since key fragments can collide (see the sketch below)
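To make the lookup path concrete, here is a minimal sketch assuming a simple bucketed index and a log object exposing read_full_key/read_value accessors; the bucket count, hash function, and these names are illustrative assumptions, not the authors' code.

    import hashlib

    FRAG_BITS = 15                 # key-fragment bits kept in DRAM
    NUM_BUCKETS = 1 << 16          # assumed index size

    def frag_hash(key: bytes) -> int:
        return int.from_bytes(hashlib.sha1(key).digest()[:8], "big")

    class HashEntry:
        __slots__ = ("fragment", "valid", "offset")
        def __init__(self, fragment: int, offset: int):
            self.fragment = fragment   # 15-bit key fragment
            self.valid = True          # 1-bit valid flag
            self.offset = offset       # 4-byte offset into the data log

    class FawnDSIndex:
        def __init__(self, log):
            self.buckets = [[] for _ in range(NUM_BUCKETS)]
            self.log = log             # provides read_full_key(offset), read_value(offset)

        def insert(self, key: bytes, offset: int) -> None:
            h = frag_hash(key)
            self.buckets[h % NUM_BUCKETS].append(
                HashEntry(h & ((1 << FRAG_BITS) - 1), offset))

        def lookup(self, key: bytes):
            h = frag_hash(key)
            fragment = h & ((1 << FRAG_BITS) - 1)
            for entry in self.buckets[h % NUM_BUCKETS]:
                if entry.valid and entry.fragment == fragment:
                    # Fragment match: confirm against the full key stored on flash.
                    if self.log.read_full_key(entry.offset) == key:
                        return self.log.read_value(entry.offset)
                    # Otherwise a fragment collision: keep probing (extra flash read).
            return None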
11. FAWN-DS Operations
- Store
- Append the new data to the log
- Update the in-memory hash table pointer to the new location
- The old data is now orphaned
- Delete
- Append a delete entry to the data log
- Invalidate the hash table entry
- The in-memory hash table is volatile, so the delete entry in the log is necessary for the deletion to survive a crash (a sketch follows)
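Both operations reduce to a log append plus an index update. A minimal sketch, assuming a binary append-only log file and simplifying the index to a plain dictionary; the record layout and helper names are illustrative, not the authors' on-flash format.

    import struct

    def append_record(log_file, key: bytes, value: bytes, is_delete: bool) -> int:
        """Append one record and return its offset in the log."""
        offset = log_file.tell()
        header = struct.pack("<?II", is_delete, len(key), len(value))
        log_file.write(header + key + value)
        return offset

    def store(index: dict, log_file, key: bytes, value: bytes) -> None:
        offset = append_record(log_file, key, value, is_delete=False)
        index[key] = offset            # any old offset becomes an orphaned log entry

    def delete(index: dict, log_file, key: bytes) -> None:
        append_record(log_file, key, b"", is_delete=True)   # durable record of the delete
        index.pop(key, None)           # invalidate the in-memory entry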
12. FAWN-DS Maintenance Operations
- Merge
- Occurs when a node leaves the ring
- The data log of the departing node is merged into the data log of its successor
- Data entries are appended to the successor's log
- Delete entries are propagated to the successor as well
13. FAWN-DS Maintenance Operations
- Split
- Occurs when a node joins the ring
- The data log of the successor node is split
- Entries falling in the key range of the new node are written to the new node's data log
- Delete entries are propagated to the new node
14. FAWN-DS Maintenance Operations
- Compact (garbage collection)
- Skips entries outside the node's key range (left behind by Split) and orphaned entries (left behind by Store)
- Writes the remaining valid entries to a new output data log (see the sketch below)
- Maintenance operations can be performed concurrently with basic operations
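A minimal compaction sketch, reusing the append_record helper from the store/delete sketch above; the record iterator and the key_range object (anything supporting the in operator) are assumptions for illustration.

    def compact(old_log_records, index: dict, key_range, new_log_file) -> dict:
        """old_log_records yields (offset, key, value, is_delete) in log order."""
        new_index = {}
        for offset, key, value, is_delete in old_log_records:
            if key not in key_range:
                continue                      # out of this node's range after a Split
            if is_delete or index.get(key) != offset:
                continue                      # deleted, or orphaned by a later Store
            new_index[key] = append_record(new_log_file, key, value, is_delete=False)
        return new_index                      # swapped in for the old index when done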
15. FAWN Key-Value System: Front End
- Clients send requests to front-end servers through the FAWN-KV API
- Each front end manages a group of back-end nodes (a chunk of the key space)
- If the target key falls in the front end's key space, the request is sent to the appropriate back-end node
- Otherwise, the request is forwarded to the front end that owns the key
16. FAWN Key-Value System: Front End
- The front end caches responses
- Caching reduces latency under skewed loads (see the routing sketch below)
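A minimal sketch of this routing and caching logic. The backend_ring that maps a key to its back-end node, the peer_for_key function that returns the owning front end, and the key-range objects supporting the in operator are all assumed names, not the authors' interfaces.

    class FrontEnd:
        def __init__(self, my_key_ranges, backend_ring, peer_for_key):
            self.my_key_ranges = my_key_ranges    # key ranges this front end manages
            self.backend_ring = backend_ring      # maps a key to a back-end virtual node
            self.peer_for_key = peer_for_key      # maps a key to the owning front end
            self.cache = {}

        def get(self, key):
            if key in self.cache:
                return self.cache[key]            # popular keys served from the cache
            if any(key in r for r in self.my_key_ranges):
                value = self.backend_ring.lookup(key).get(key)   # local back end
            else:
                value = self.peer_for_key(key).get(key)          # other front end
            self.cache[key] = value
            return value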
17. FAWN Key-Value System: Mapping Keys to Virtual Nodes
- Consistent hashing
- Virtual node IDs (VIDs) and keys share a 160-bit identifier space
- A key is owned by the virtual node with the smallest VID greater than or equal to the key, wrapping around the ring
- Advantage: adding a node only requires a change at its successor virtual node (a single Split); a sketch follows
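A minimal consistent-hashing sketch, assuming SHA-1 is used to derive the 160-bit identifiers; the hash choice and class names are illustrative.

    import bisect
    import hashlib

    def ring_id(name: bytes) -> int:
        return int.from_bytes(hashlib.sha1(name).digest(), "big")   # 160-bit ID

    class Ring:
        def __init__(self, virtual_node_names):
            self.owner = {ring_id(n): n for n in virtual_node_names}
            self.vids = sorted(self.owner)

        def owner_of(self, key: bytes):
            k = ring_id(key)
            i = bisect.bisect_left(self.vids, k)     # smallest VID >= key
            vid = self.vids[i % len(self.vids)]      # wrap around the ring
            return self.owner[vid]

    # Adding a virtual node only takes over part of its successor's key range.
    ring = Ring([b"A1", b"B1", b"D1"])
    print(ring.owner_of(b"some-key"))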
18. FAWN Key-Value System: Mapping Keys to Virtual Nodes
19. FAWN Key-Value System: Replication
- Chain: the list of virtual nodes that hold replicas of a key range's data log
- Each virtual node is a member of R chains: head for one, tail for one, and middle node for the other R-2
- A separate datastore file is kept for each replica
20. FAWN Key-Value System: Chain Replication
- Puts go to the head and propagate down the chain
- Each node buffers a put until it receives the acknowledgement from downstream
- Gets go to the tail, so every node in the chain has already seen the returned update (see the sketch below)
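A minimal chain-replication sketch with synchronous forwarding, using a plain dictionary in place of each replica's FAWN-DS file; this is an illustrative simplification, not the authors' code.

    class ChainNode:
        def __init__(self, successor=None):
            self.store = {}               # stands in for this replica's FAWN-DS file
            self.successor = successor
            self.pending = {}             # puts buffered until acked by downstream

        def put(self, key, value) -> bool:
            self.store[key] = value
            if self.successor is None:    # tail: the write is now visible to gets
                return True               # ack travels back up the chain
            self.pending[key] = value
            acked = self.successor.put(key, value)
            if acked:
                self.pending.pop(key, None)
            return acked

        def get(self, key):
            assert self.successor is None, "gets are served only by the tail"
            return self.store.get(key)

    # Chain head -> mid -> tail; puts enter at the head, gets are answered by the tail.
    tail = ChainNode()
    head = ChainNode(successor=ChainNode(successor=tail))
    head.put("k", "v")
    print(tail.get("k"))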
21. FAWN Key-Value System: Join (with replication)
- The key range of the successor node is split and part of it is assigned to the new node
- The new node receives copies of R ranges of data (one as the primary range, R-1 as replicas)
- The front end then treats the new virtual node as head or tail for the appropriate ranges
- Virtual nodes may free space for key ranges for which they are no longer responsible
22. FAWN Key-Value System: Joining a Node in a Chain
- Example: E1 is the tail of chain (B1-D1-E1); C1 is a new node joining between B1 and D1
- Pre-copy phase
- The data log is copied from E1 to C1
23. Joining a Node in a Chain: Chain-Insertion and Log-Flush Phases
- The front end sends a chain-membership message to all nodes; B1, C1, and D1 update their neighbor lists
- Updates at B1 now stream to C1 and are logged in a temporary buffer
- C1's datastore is still inconsistent: updates that arrived between the end of pre-copy and chain insertion were not logged at C1
- D1 becomes the new tail of the chain
24. Joining a Node in a Chain: Chain-Insertion and Log-Flush Phases
- Put requests issued after C1's insertion are replicated at B1, C1, and D1
- D1 forwards each message on to E1
- E1 pushes the remaining log entries (those written after the pre-copy phase) to C1
- C1 merges the pushed data log entries with its new information and updates its in-memory hash table; a condensed sketch of the three phases follows
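A condensed sketch of the three join phases for C1 joining between B1 and D1. Every helper call here (copy_log, receive_log, send_chain_membership, merge_logs, flush_remaining, rebuild_index) is an assumed placeholder for illustration, not the authors' API.

    def join_chain(front_end, B1, C1, D1, E1, key_range):
        # Phase 1: pre-copy.  E1 (the old tail) streams its data log for the range to C1.
        C1.receive_log(E1.copy_log(key_range))

        # Phase 2: chain insertion.  The front end announces the new chain B1 -> C1 -> D1;
        # B1 starts streaming updates to C1, which logs them in a temporary buffer.
        front_end.send_chain_membership([B1, C1, D1], new_tail=D1)

        # Phase 3: log flush.  E1 pushes the entries written after pre-copy ended;
        # C1 merges them with the buffered updates and rebuilds its in-memory index.
        C1.merge_logs(E1.flush_remaining(key_range))
        C1.rebuild_index()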
25. FAWN Key-Value System: Leave (with replication)
- Merge the departing node's key range into its successor virtual node
- Add a new replica to each of the R chains of which the old node was a member (similar to a Join)
26. FAWN Key-Value System: Failure Detection
- The front end exchanges heartbeat messages with its back ends
- If a back end stops responding, the front end initiates the leave protocol for it
- Link failures (as opposed to node failures) are not handled
- Front-end failure: the affected back ends attach to a new front end (see the sketch below)
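A minimal heartbeat loop in the spirit of this scheme; the ping RPC, the interval, and the miss threshold are all assumptions for illustration, not values from the paper.

    import time

    HEARTBEAT_INTERVAL = 1.0     # seconds (assumed)
    MISSES_BEFORE_LEAVE = 3      # consecutive misses tolerated (assumed)

    def monitor_backends(front_end, backends):
        missed = {b: 0 for b in backends}
        while backends:
            for b in list(backends):
                try:
                    b.ping(timeout=HEARTBEAT_INTERVAL)   # heartbeat RPC to the back end
                    missed[b] = 0
                except TimeoutError:
                    missed[b] += 1
                    if missed[b] >= MISSES_BEFORE_LEAVE:
                        front_end.initiate_leave(b)      # treat the node as having left
                        backends.remove(b)
            time.sleep(HEARTBEAT_INTERVAL)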
27. Evaluation: Hardware
- 21-node cluster
- Single-core 500 MHz processor per node
- 256 MB DRAM per node
- 100 Mbps Ethernet
- Power consumption per node: 3 W idle, 6 W with CPU, network, and flash at 100% load
- Nodes connected to the front end through two 16-port Ethernet switches
28. Evaluation: Workload
- Read-intensive
- Object sizes from 256 bytes to 1 KB
29. Wimpy Node Performance
- Measured with iozone; flash formatted with the ext2 filesystem
- 3.5 GB file size
- 1 KB entries
30. FAWN-DS Performance
- When the datastore is smaller than 256 MB, queries are serviced by the back end's buffer cache
- FAWN-DS adds some overhead over raw wimpy-node flash performance because of hash lookups and data copies
31. FAWN-DS Performance
- Bulk store speed
- Log-structured, so bulk writes are sequential
- 96% of the raw wimpy-node write speed (23.2 MB/s for 2 million 1 KB entries)
- Put speed
- One FAWN-KV node with V virtual nodes, each in R chains, maintains R x V data files
- Writes are sequential within each file, so the overall pattern is semi-random
- Semi-random write performance varies across flash devices
32. FAWN-DS vs. BerkeleyDB
- BerkeleyDB is a disk-based database, not optimized for flash
- It achieves 0.07 MB/s when storing 7 million 200-byte entries on flash
- BerkeleyDB over NILFS2 (a log-structured file system) does not help: too much metadata is generated for file-system checkpointing and rollback
- A log-structured datastore like FAWN-DS is necessary
33. FAWN-DS: Read-Intensive vs. Write-Intensive Workloads
- Writes are sequential, so write-heavy workloads achieve higher throughput
- A pure-write workload triggers frequent cleaning
- Throughput drops to 700-1000 queries/sec during compaction
34. FAWN-KV Cluster Performance
- A single node accessed over the network serves requests at about 70% of the rate of a stand-alone FAWN-DS
- The gap comes from network overhead and request marshaling/unmarshaling
- Load imbalance across the network also reduces throughput
35. FAWN-KV Cluster Performance
- Get performance: 36,000 256-byte queries/sec on a 20 GB dataset
- Power consumption: about 99 W for the cluster, giving 36,000 / 99 = 364 queries/joule
36. Impact of Background Operations (Join)
41. Conclusion
- The FAWN architecture reduces the energy consumption of cluster computing
- FAWN-KV addresses the challenges of using wimpy nodes for key-value storage
- Log-structured, memory-efficient datastore
- Efficient replication and failover
- Meets both energy-efficiency and performance goals