Title: Scaling Java-based Dynamic Web Services
1Scaling Java-based Dynamic Web Services
Sara Sprenkle
Committee: Jeff Chase, Carla Ellis, Amin Vahdat
May 9, 2001
2Scaling Java-based Dynamic Web Services
- Wide-area Internet applications
- dynamic web services
- written in Java
- want good performance → scaling
- Scaling
- replicate service
- What? When? Where?
- How? How Much?
3What is Ivory?
- Infrastructure for replicating and caching distributed data
- General for wide range of applications
- simplifies construction of scalable, dynamic Web services
- Challenges
- fast, efficient
- automatic
- application-appropriate
- general
4Conventional Web Caches
- Web caches store recently-requested static content, e.g., HTML documents, GIFs, and JPEGs
- decreased client latency
- better fault tolerance
- better load balancing
- incremental scalability
[Figure: Clients, Proxy Cache, and Primary Server, with arrows labeled request, cache miss, content, and response]
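The request path on this slide can be sketched in a few lines of Java. This is a minimal illustration, not code from the talk: the class name `ProxyCacheSketch`, the `Function` that stands in for the primary server, and the miss counter are all assumptions, and the sketch never evicts entries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a proxy cache for static content; all names are illustrative.
class ProxyCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> primaryServer; // stands in for the primary server
    private int misses = 0;

    ProxyCacheSketch(Function<String, String> primaryServer) {
        this.primaryServer = primaryServer;
    }

    // Serve from the cache when possible; on a miss, fetch from the primary and keep a copy.
    String request(String url) {
        String content = cache.get(url);
        if (content == null) {
            misses++;
            content = primaryServer.apply(url);
            cache.put(url, content);
        }
        return content;
    }

    int misses() { return misses; }

    public static void main(String[] args) {
        ProxyCacheSketch proxy = new ProxyCacheSketch(url -> "<html>" + url + "</html>");
        proxy.request("/index.html"); // miss: forwarded to the primary
        proxy.request("/index.html"); // hit: served from the cache
        System.out.println("misses = " + proxy.misses());
    }
}
```

This works only because the content is static: the cached copy stays valid without running any code at the proxy.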
5Dynamic Content
- Web application servers provide personalized services
- create dynamic content by executing code (CGI, Java servlets) in response to client requests
[Figure: Clients, with labels request, generate response, and response]
6Service Caches
- Replicate the primary server's Java service code and data at a service cache
[Figure: Primary Server with code and data; arrows labeled code and updates]
7Service Caches
- Can act as a cache for more than one service
- Partial replicas
[Figure: two Primary Servers and one Service Cache, each holding code and data; arrows labeled updates]
8Ivory Architecture
- layer for managing distributed data structures
[Figure: one Primary and two Secondaries; label: data partition]
9Achieving Ivory's Goals
- General support
- provide mechanisms, not restrict policies
- Bytecode transformers
- insert code into compiled Java classes
- automatic application adaptation
- Conits
- groups of related objects
- space efficiency
- reduce communication overhead
- application-appropriate
10Data Model
- Java objects linked by references
- Bytecode transformers (JOIE)
- transform application
- monitor changes to objects
- make calls into Ivory
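To make the effect of a bytecode transformer concrete, the sketch below shows a class before transformation and a hand-written equivalent of what a transformed class could look like. It assumes a hypothetical runtime hook, `IvoryHooks.markDirty`. The slides do not show Ivory's actual interface or JOIE's API, so every name here is illustrative.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical stand-in for the Ivory runtime; the real interface is not shown in the slides.
class IvoryHooks {
    static final Set<Object> dirty = Collections.newSetFromMap(new IdentityHashMap<>());
    static void markDirty(Object o) { dirty.add(o); }
}

// Application class as the programmer wrote it.
class Counter {
    int value;
    void set(int v) { value = v; }
}

// Hand-written equivalent of a transformed class: the field write is
// followed by an inserted call that reports the change to the runtime.
class TransformedCounter {
    int value;
    void set(int v) {
        value = v;
        IvoryHooks.markDirty(this); // inserted call into the runtime
    }
}

class TransformDemo {
    public static void main(String[] args) {
        TransformedCounter c = new TransformedCounter();
        c.set(5);
        System.out.println("dirty objects: " + IvoryHooks.dirty.size());
    }
}
```

Because a transformer works on compiled classes, the application source stays as in `Counter`; only the loaded class behaves like `TransformedCounter`.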
11Granularity
- Efficient data replication, synchronization
- partial replication (caching)
- tradeoffs in replication granularity
- too large → false sharing, too many faults
- too small → state management overhead
- need control over granularity
12Conits: Object Clusters
- Application-dependent object groupings
- Granularity for synchronization and update
propagation
13Conits: performance enhancers
- Caching
- track residency of conit
- prefetching → reduce cache misses
- amortize cost of faulting objects
- Synchronization
- shared lock for object group
- reduce false sharing
- Consistency
- amortize cost of propagating updates
- number of versions grows with the number of conits, not the number of replicas
14Conits: a closer look
- conitid = nodeid + node_unique_id → globally unique id
- Conit Root: ingress point into cluster
- Conit Object: objects that can belong to a conit, identified by a unique id within the conit
[Figure: cluster of objects labeled 0, 1, 2, 4, 5, 6, 7, 8, with a reference to other clusters]
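One way to read the conit id rule is that each node combines its own node id with a counter that is unique on that node, so ids can be allocated without contacting other nodes. The sketch below assumes decimal concatenation with three digits for the per-node part. That assumption is suggested by the later example slides (node id 100, conit ids 100001 and 100002) but is not stated in the talk, and the class and method names are hypothetical.

```java
// Hypothetical sketch: builds a globally unique conit id from a node id and a per-node counter.
class ConitIdSketch {
    private static final int LOCAL_RANGE = 1000; // assumption: three decimal digits per node
    private final int nodeId;
    private int nextLocalId = 1;

    ConitIdSketch(int nodeId) { this.nodeId = nodeId; }

    // Each node hands out its own counter values, so no coordination between nodes is needed.
    int newConitId() {
        if (nextLocalId >= LOCAL_RANGE) {
            throw new IllegalStateException("per-node id space exhausted");
        }
        return nodeId * LOCAL_RANGE + nextLocalId++;
    }

    // Recovers the id of the node that created the conit.
    static int nodeOf(int conitId) { return conitId / LOCAL_RANGE; }

    public static void main(String[] args) {
        ConitIdSketch groucho = new ConitIdSketch(100);
        System.out.println(groucho.newConitId());
        System.out.println(groucho.newConitId());
    }
}
```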
15Conit Management
- Membership
- assign conit membership for subset of objects
(Conit Roots) - automatically assign membership of other objects
- lazy addition (only when propagating updates)
- Propagating updates
- transitive closure to conit boundary
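The membership and propagation rules above can be sketched as a graph traversal: start at a conit root, follow references, assign membership lazily to unassigned objects, and stop at any object that already belongs to a different conit. The `Obj` model and the convention that conit id 0 means "not yet assigned" are assumptions made for this illustration, not Ivory's data model.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical object model: each object knows its conit (0 = unassigned) and its references.
class Obj {
    final String name;
    int conitId;
    final List<Obj> refs = new ArrayList<>();
    Obj(String name, int conitId) { this.name = name; this.conitId = conitId; }
}

class ClosureSketch {
    // Collects everything reachable from the root without crossing into another conit.
    // Unassigned objects met on the way join the root's conit (lazy addition).
    static List<Obj> closure(Obj root) {
        Set<Obj> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        List<Obj> result = new ArrayList<>();
        Deque<Obj> work = new ArrayDeque<>();
        work.push(root);
        seen.add(root);
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (o.conitId == 0) o.conitId = root.conitId; // lazy membership
            result.add(o);
            for (Obj next : o.refs) {
                boolean foreign = next.conitId != 0 && next.conitId != root.conitId;
                if (!foreign && seen.add(next)) work.push(next); // stop at the conit boundary
            }
        }
        return result;
    }
}
```

The `seen` set uses identity comparison so that cyclic structures terminate and each object is sent once.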
16Conits: a closer look
17Conit Management - Versioning
- List of dirty lists
- labeled with a logical timestamp
- objects added to dirty list, removed from previous dirty list
- Nodes send their current timestamp t when requesting updates
- receive all objects that have been modified since t
[Figure: dirty lists labeled Version 7, Version 8, Version 9]
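The versioning scheme can be sketched as a map from logical timestamps to dirty lists. This is a minimal illustration under stated assumptions: objects are identified by strings, each call to `commit` starts one new version, and the class and method names are hypothetical rather than Ivory's.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of per-conit versioning with a list of dirty lists.
class DirtyListsSketch {
    // logical timestamp -> ids of objects last modified in that version
    private final TreeMap<Integer, Set<String>> dirtyLists = new TreeMap<>();
    private int clock = 0; // logical timestamp of the newest version

    // Starts a new version and records the modified objects in its dirty list.
    int commit(List<String> modified) {
        clock++;
        for (Set<String> older : dirtyLists.values()) {
            older.removeAll(modified); // an object lives only in its latest dirty list
        }
        dirtyLists.put(clock, new LinkedHashSet<>(modified));
        return clock;
    }

    // A node at timestamp t receives every object modified since t.
    List<String> updatesSince(int t) {
        List<String> out = new ArrayList<>();
        for (Set<String> list : dirtyLists.tailMap(t, false).values()) {
            out.addAll(list);
        }
        return out;
    }
}
```

Because an object is removed from its previous dirty list when it is modified again, a node that is several versions behind receives each object at most once.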
18Ivory State Management
- NameCache: table of objects and their symbolic names
- NodeManager: table of node locations and ids
- StateManager: table of clusters/conits and ids
19Service Cache Built On Ivory
- Service Cache: application threads running in the context of a service
- transformed to call into Ivory to maintain data consistency
[Figure: Service Cache layered on top of Ivory]
20Experiment
- SimClient
- generates a workload for Web servers
- Workload
- two servlets: one reading, one writing
- write workload on server constant
- read workload on server and replicas as much as possible
- Data structures
- four named linked lists
- one conit, 16 objects
- Server, Replicas: Sparc Ultra 1s
- Java Web Server
- Solaris 2.8
21Read Throughput
22Total Throughput
23Read Latency
24Write Latency
25Discussion
- Scales!
- Limited overhead
- Higher throughput with lower write rate
- 45 writes/sec → each tree is modified about 11 times per second
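The per-tree figure follows from simple division, assuming the writes are spread evenly over the four named lists described on the Experiment slide (the even spread is an assumption, not something the slides state):

```latex
\[
\frac{45\ \text{writes/sec}}{4\ \text{lists}} = 11.25\ \text{writes/sec per list} \approx 11
\]
```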
26Future Work
- Further evaluation
- Tradeoffs in conit size
- Alternate approaches to versioning
- More wide area applications using Ivory
- Explore relation to conits in TACT
27Ivory Overview
- data-sharing layer used to replicate and cache data structures over wide area
- design features
- scales with more replicas
- automated
- application-appropriate
- general framework for use by wide-area,
distributed applications
28Acknowledgements
- Dejan Kostic: design
- Syam Gadde: SimClient
- Darrell Anderson: evaluation feedback
- Marty Gilbert: Java servlets
- Patrick Reynolds: any and all questions
- Drew Gallatin: machines
- Andy Danner: proofreading
29Java-based Web Services
- Several applications (servlets) together provide a service
[Figure: bank account service with its data]
30Example: setting up a service cache
On harpo:
Service s = ServiceManager.registerService( "example" );
s.registerPrimary( "groucho.cs.duke.edu" );
[Figure: primary groucho (node id 100) as Primary Server and secondary harpo (node id 170) as Service Cache, each with code and data; arrow labeled updates]
31API
- Touch
- get updates for this object if the object could be stale
- Fault
- fetch a non-resident conit from the primary server
- Commit
- push updates to the primary
- Evict
- throw a conit out of the cache
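The four operations can be illustrated with a toy cache in front of a toy primary. The slides name the operations but not their signatures, so the interface below, the string-valued conits, and the version-number staleness check are all assumptions made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical signatures; the slides name the four operations but not their parameters.
interface ConitCache {
    String touch(int conitId);                 // refresh if stale, then return the local copy
    String fault(int conitId);                 // fetch a non-resident conit from the primary
    void commit(int conitId, String newValue); // push an update to the primary
    void evict(int conitId);                   // drop the conit from the cache
}

// Toy primary: one string value and one version number per conit.
class ToyPrimary {
    final Map<Integer, String> data = new HashMap<>();
    final Map<Integer, Integer> version = new HashMap<>();
    void write(int id, String v) {
        data.put(id, v);
        version.merge(id, 1, Integer::sum);
    }
}

class ToyServiceCache implements ConitCache {
    private final ToyPrimary primary;
    private final Map<Integer, String> local = new HashMap<>();
    private final Map<Integer, Integer> localVersion = new HashMap<>();

    ToyServiceCache(ToyPrimary primary) { this.primary = primary; }

    public String fault(int id) {
        local.put(id, primary.data.get(id));
        localVersion.put(id, primary.version.getOrDefault(id, 0));
        return local.get(id);
    }

    public String touch(int id) {
        boolean resident = local.containsKey(id);
        boolean stale = resident
                && !localVersion.get(id).equals(primary.version.getOrDefault(id, 0));
        return (!resident || stale) ? fault(id) : local.get(id);
    }

    public void commit(int id, String newValue) {
        primary.write(id, newValue);
        fault(id); // keep the local copy in step with the primary
    }

    public void evict(int id) {
        local.remove(id);
        localVersion.remove(id);
    }

    boolean isResident(int id) { return local.containsKey(id); }
}
```

In this toy version the staleness check reads the primary's version map directly. A real replica would have to ask the primary over the network or rely on a consistency policy.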
32Example Test Data Structure
- ConitTree: reference to first ListItem and to next ConitTree
- ListItem: integer value and reference to next ListItem
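The two classes described here might look like the following in Java. Only the class names and the shape of the references come from the slide; the field names and the `append` and `length` helpers are assumptions added to make the sketch usable.

```java
// Sketch of the described test structure; field and method names are assumptions.
class ListItem {
    int value;
    ListItem next;
    ListItem(int value) { this.value = value; }
}

class ConitTree {
    ListItem first;  // first item of this list
    ConitTree next;  // next ConitTree in the chain

    // Appends a value at the tail of this list.
    void append(int value) {
        ListItem item = new ListItem(value);
        if (first == null) {
            first = item;
            return;
        }
        ListItem cur = first;
        while (cur.next != null) cur = cur.next;
        cur.next = item;
    }

    // Number of items in this list only (does not follow next).
    int length() {
        int n = 0;
        for (ListItem cur = first; cur != null; cur = cur.next) n++;
        return n;
    }
}
```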
33Example 1: cluster creation
List list = new List( 2 );
list.startNewConit( );
Naming.bind( list, "spartacus" );
[Figure: node id 100 (groucho); list named spartacus with labels A,1, 0, and B,2]
34Example 2: cluster update
List newList = new List( );
newList.attach( list );
list.append( newList );
- Add objects to conits without worrying about naming conflicts
[Figure: node id 100 (groucho); spartacus with objects labeled 0, 1, 2, 3]
35Example 3: cluster management
[Figure: node id 100 (groucho); spartacus with objects labeled 0, 1, 2 and A, B, C, D]
36Example 3: cluster management
- Fault on spartacus
- Lazy add: objects are added to the conit only when requested for replication
[Figure: node id 100 (groucho); conit id 100001 named spartacus with objects labeled 0, 1, 2 and A,3, B,4, C,5, D,6]
37Example 4: cross-conit reference
[Figure: node id 100 (groucho); conit id 100001 (spartacus) and conit id 100002, with objects labeled 0, 0, 1]
38Example 4: cross-conit reference
- Fault on spartacus
- Primary sends the transitive closure of the desired object(s)
- Stop propagation when reaching a cross-conit reference
[Figure: node id 100 (groucho); conit id 100001 (spartacus) and conit id 100002, with objects labeled 0, 0, 1]
39Example 4: cross-conit reference
- List list = Naming.lookup( "spartacus" );
- list.printList();
[Figure: node id 170 (harpo); conit id 100001 (spartacus) with objects labeled 0 and 1 and an unresolved reference labeled 100002 0]
40Example 4: resolving cross-conit reference
- Automatically adds objects to cluster
[Figure: node id 100 (groucho); label Conit id 100002 (shown twice), with objects labeled 1, 0, 0, 2]
41Example 4: complete replica
- List list = Naming.lookup( "spartacus" );
- list.printList();
[Figure: node id 170 (harpo); conit id 100001 (spartacus) and conit id 100002, with objects labeled 1, 0, 0, 2, 1]
42No Server Reader