Dynamo: Amazon - PowerPoint PPT Presentation

About This Presentation
Title:

Dynamo: Amazon

Description:

Dynamo: Amazon's Highly Available Key-value Store. DeCandia, Hastorun, Jampani, Kakulapati, Lakshman, Pilchin, Sivasubramanian, Vosshall, Vogels

Transcript and Presenter's Notes


1
Dynamo: Amazon's Highly Available Key-value Store
  • DeCandia, Hastorun, Jampani, Kakulapati,
    Lakshman, Pilchin, Sivasubramanian, Vosshall,
    Vogels

PRESENTED BY KIMIISA OSHIKOJI
2
OUTLINE
  • Amazon
  • Dynamo
  • Architecture
  • Performance

3
AMAZON
  • Huge infrastructure
  • Customer-oriented business
  • Reliability is key

4
DYNAMO
  • Data storage system
  • Flexible
  • Automated addition and removal of storage nodes

5
DYNAMO-REQUIREMENTS
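Requirement | Effect
Query Model | Read and write operations address a data item identified by a key
ACID Properties | Guarantees for database transactions; Dynamo accepts weaker consistency in exchange for high availability
Efficiency | The system must meet its latency and throughput requirements
Other Assumptions | What Dynamo assumes about its operating environment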
6
DYNAMO-QUERY MODEL
  • Reads and writes address a data item uniquely identified by a key
  • Operations do not span multiple data items
  • Data to be stored is relatively small (usually less than 1 MB)

7
DYNAMO-ACID PROPERTIES
Property | Effect
Atomicity | A transaction either happens completely or not at all
Consistency | A transaction takes the data from one valid state to another
Isolation | Data cannot be accessed by other operations while it is in an intermediate state
Durability | Once a transaction has concluded, it will never be undone
8
DYNAMO-EFFICIENCY
9
DYNAMO-ASSUMPTIONS
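  • Only used by internal Amazon services
  • No security requirements such as authentication and authorization, because the environment is assumed to be non-hostile
  • Limited scale: each service runs its own Dynamo instance of up to hundreds of storage hosts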

10
DYNAMO-SLA
  • Service Level Agreement (SLA): a contract in which client
    and service agree on expected request rates and latency
  • At Amazon a typical page request involves over
    150 services, which often have dependencies of their own
  • SLAs are expressed and measured at the 99.9th percentile
    of the latency distribution
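
A small sketch can show why an SLA stated at the 99.9th percentile is stricter than one stated as an average. The nearest-rank method, the function name percentile, and the sample latencies below are assumptions chosen for illustration; none of them come from the presentation.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds: 9,980 fast requests and 20 slow ones.
latencies_ms = [10] * 9980 + [500] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)   # dominated by the fast requests
p999_ms = percentile(latencies_ms, 99.9)          # exposes the slow tail
```

With this made-up data the mean stays close to the fast-request latency, while the 99.9th percentile lands on the slow requests, which is the part of the distribution the SLA is meant to bound.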

11
DYNAMO-DESIGN
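  • Favors availability over strict consistency: traditional stores
    make data unavailable until the answer is certain to be correct,
    whereas Dynamo stays available
  • Eventually consistent data store
  • Always writeable: writes are never rejected, and conflicts are resolved at read time
  • Latency targets measured at the 99.9th percentile
  • Zero-hop DHT: each node holds enough routing information to send a request directly to the right node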

12
DYNAMO-PRINCIPLES
Principle | Effect
Incremental scalability | The system can scale out one storage host at a time without undue impact on the system
Symmetry | Every node has the same responsibilities as its peers
Decentralization | Favor decentralized peer-to-peer techniques over centralized control
Heterogeneity | Work must be distributed according to the capabilities of the nodes
13
ARCHITECTURE-STORAGE
  • Objects are stored with a key through two operations, Get and Put
  • Get(key) locates the object replicas for the key and returns
    an object or a list of conflicting objects, with a context
  • Put(key, context, object) places the object at its replicas
    along with the key and context
  • Context: opaque metadata about the object, such as its version
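
The get/put contract above can be sketched with a single-process toy store. Everything here is an assumption for illustration: the class name ToyStore, the use of a set of version ids as the context, and the in-memory dictionary. It does no replication and is not Dynamo's implementation; it only shows how a context lets a write replace exactly the versions the caller has seen.

```python
class ToyStore:
    """Single-process stand-in for the get/put interface (hypothetical, not distributed)."""

    def __init__(self):
        self._data = {}      # key -> list of (version_id, object)
        self._counter = 0    # source of fresh version ids

    def get(self, key):
        """Return (list of objects, context); the context names the versions that were read."""
        versions = self._data.get(key, [])
        objects = [obj for _, obj in versions]
        context = frozenset(version for version, _ in versions)
        return objects, context

    def put(self, key, context, obj):
        """Store obj, replacing only the versions listed in the caller's context."""
        self._counter += 1
        survivors = [(v, o) for v, o in self._data.get(key, []) if v not in context]
        survivors.append((self._counter, obj))
        self._data[key] = survivors

store = ToyStore()
store.put("cart", frozenset(), ["book"])
_, seen = store.get("cart")

# Two clients write using the same, now stale, context: both versions are kept.
store.put("cart", seen, ["book", "pen"])
store.put("cart", seen, ["book", "mug"])
conflicting, seen_both = store.get("cart")

# A client that read both versions reconciles them with a single write.
store.put("cart", seen_both, ["book", "pen", "mug"])
reconciled, _ = store.get("cart")
```

After the two stale writes, get returns a list of two objects, matching the "list of objects with a context" case; the reconciling write collapses them back to one.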

14
ARCHITECTURE-HASHING
15
ARCHITECTURE-REPLICATION
  • Data is replicated on N hosts (N is a parameter configured
    per instance)
  • Each key has a coordinator node, which replicates the keys in
    its range to its N-1 clockwise successor nodes on the ring
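
Placement on a hash ring and replication on N hosts can be sketched together. This is a simplified illustration under stated assumptions: one ring position per node (no virtual nodes), MD5 as the hash, and the hypothetical names ring_position, Ring, and preference_list.

```python
import bisect
import hashlib

def ring_position(name):
    """Map a string to a position in a 128-bit hash space using MD5."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)

class Ring:
    """Toy consistent-hashing ring with one position per node (an assumption)."""

    def __init__(self, nodes, n_replicas):
        self.n_replicas = n_replicas
        self._points = sorted((ring_position(node), node) for node in nodes)
        self._positions = [position for position, _ in self._points]

    def preference_list(self, key):
        """Coordinator first, then its clockwise successors, n_replicas nodes in total."""
        size = len(self._points)
        # First node whose position is larger than the key's, wrapping around the ring.
        start = bisect.bisect_right(self._positions, ring_position(key)) % size
        count = min(self.n_replicas, size)
        return [self._points[(start + i) % size][1] for i in range(count)]

ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"], n_replicas=3)
replicas = ring.preference_list("cart:42")
```

The first entry of replicas plays the coordinator role, and the remaining entries are the next nodes met when walking the ring clockwise.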

16
ARCHITECTURE-VERSIONING
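  • Multiple versions of an object can exist at the same time
  • Vector clocks capture causality between versions
  • Vector clocks can grow large, so the oldest (node, counter) pair is dropped once a size threshold is reached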
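
A minimal vector clock sketch shows how two versions are classified as ordered or concurrent. Clocks are modeled as plain dictionaries from node name to counter; the function names and the node names Sx, Sy, Sz are illustrative assumptions, and size-based truncation is left out.

```python
def increment(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def descends(a, b):
    """True if clock a has seen everything clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a, b):
    """Classify the relationship between two clocks."""
    if a == b:
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "concurrent"

def merge(a, b):
    """Element-wise maximum, used when a client reconciles concurrent versions."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}

d1 = increment({}, "Sx")       # first write, handled by Sx
d2 = increment(d1, "Sx")       # second write, also handled by Sx
d3 = increment(d2, "Sy")       # one client's update, handled by Sy
d4 = increment(d2, "Sz")       # another client's update, handled by Sz
d5 = increment(merge(d3, d4), "Sx")  # reconciled write, handled by Sx
```

Here d3 and d4 both descend from d2 but not from each other, so they are concurrent and must both be kept until a client reconciles them into d5.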

17
ARCHITECTURE-FAILURE
Failure Type | Description
Temporary failure of node | Hinted handoff: the replica that would have been stored on the failed node is sent to another node with a hint naming the original destination
Permanent failure of node | Replica synchronization (anti-entropy with Merkle trees) to ensure no information is lost
Failures are not detected by a central node; each node
detects failures of its peers locally
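
The temporary-failure row can be illustrated with a toy model of writing with a hint and delivering it later. The Node class, the write and hand_off functions, and the explicit up flag are all assumptions of this sketch; real failure detection, quorums, and networking are omitted.

```python
class Node:
    """Toy storage node (hypothetical); 'up' is toggled by hand in this sketch."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.store = {}    # key -> value for replicas this node owns
        self.hinted = []   # (intended_owner_name, key, value) held for other nodes

def write(key, value, preference_list, fallbacks):
    """Write to each intended replica, diverting to a healthy fallback with a hint if it is down."""
    spare = (node for node in fallbacks if node.up)
    for owner in preference_list:
        if owner.up:
            owner.store[key] = value
        else:
            stand_in = next(spare, None)
            if stand_in is not None:
                stand_in.hinted.append((owner.name, key, value))

def hand_off(stand_in, nodes_by_name):
    """Deliver hinted replicas to owners that have recovered; keep the rest for later."""
    remaining = []
    for owner_name, key, value in stand_in.hinted:
        owner = nodes_by_name[owner_name]
        if owner.up:
            owner.store[key] = value
        else:
            remaining.append((owner_name, key, value))
    stand_in.hinted = remaining

a, b, c, d = Node("A"), Node("B"), Node("C"), Node("D")
nodes = {node.name: node for node in (a, b, c, d)}

b.up = False                                   # B is temporarily unreachable
write("cart", ["book"], [a, b, c], fallbacks=[d])
hint_while_down = list(d.hinted)               # D holds B's replica with a hint

b.up = True                                    # B recovers
hand_off(d, nodes)                             # D returns the replica to B
```

The write still lands on three nodes while B is down, and once B recovers the stand-in delivers the replica and drops its local hint.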
18
ARCHITECTURE-ADDING
Discovery Type | Description
Internal | Gossip-based protocol that leads to an eventually consistent membership list
External | Seed nodes, known to all nodes in the system
19
PERFORMANCE-BUFFER
  • The system can be tuned without sacrificing latency at the
    99.9th percentile
  • An in-memory write buffer can lower 99.9th percentile latency by a
    factor of 5 during peak traffic, at some cost in durability

20
PERFORMANCE-LOAD DISTRIBUTION
Partitioning scheme | Description
T random tokens per node, partition by token value | Partition ranges vary in size because tokens are chosen at random
T random tokens per node, equal-sized partitions | Tokens are used only to map values in the hash space to nodes, not to decide the partitioning
Q/S tokens per node, equal-sized partitions | Each node always holds Q/S tokens, where Q is the number of partitions and S the number of nodes
The third strategy gives the best load balancing.
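
The third scheme can be sketched as follows: the hash space is cut into Q equal partitions and every one of the S nodes receives exactly Q/S of them. The function names and the round-robin assignment are assumptions made for this illustration; the slides do not say how partitions are handed to nodes.

```python
from collections import Counter

def assign_partitions(q, nodes):
    """Give each node exactly q / len(nodes) of the q equal-sized partitions."""
    s = len(nodes)
    if s == 0 or q <= 0 or q % s != 0:
        raise ValueError("q must be a positive multiple of the number of nodes")
    # Round-robin placement is an assumption of this sketch.
    return {partition: nodes[partition % s] for partition in range(q)}

def partition_of(key_hash, q, hash_space):
    """Index of the equal-sized partition that contains key_hash."""
    return key_hash * q // hash_space

HASH_SPACE = 2 ** 128                       # size of a 128-bit hash space
owners = assign_partitions(12, ["A", "B", "C"])
load = Counter(owners.values())             # partitions held by each node
owner_of_key = owners[partition_of(HASH_SPACE // 2, 12, HASH_SPACE)]
```

Because partition boundaries are fixed and each node holds the same number of partitions, the per-node load in this sketch is even by construction, which is the balancing property the table attributes to the third scheme.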
21
QUESTIONS?