Title: Team 2: The House Party Blackjack
1. Team 2: The House Party Blackjack
- Mohammad Ahmad
- Jun Han
- Joohoon Lee
- Paul Cheong
- Suk Chan Kang
2. Team Members
Hwi Cheong (Paul) hcheong_at_andrew.cmu.edu
Mohammad Ahmad mohman_at_cmu.edu
Joohoon Lee jool_at_ece.cmu.edu
Jun Han junhan_at_andrew.cmu.edu
SukChan Kang sckang_at_andrew.cmu.edu
3. Baseline Application
- Blackjack game application
  - User can create tables and play Blackjack.
  - User can create/retrieve profiles.
- Configuration
  - Operating system: Linux
  - Middleware: Enterprise JavaBeans (EJB)
  - Application development language: Java
  - Database: MySQL
  - Server: JBoss
  - J2EE 1.4
4. Baseline Architecture
- Three-tier system
- Server completely stateless
- Server name hard-coded into clients
- Every client talks to HostBean (a session bean)
7. Fault-Tolerant Design
- Passive replication
- Completely stateless servers
  - No need to transfer state from primary to backup
  - All state stored in database
  - Only one instance of HostBean (session bean) needed to handle multiple client invocations, which is efficient on the server side
- Degree of replication depends on number of available machines
- Sacred machines
  - Replication Manager (chess)
  - MySQL database (mahjongg)
  - Clients
8. Replication Manager
- Responsible for server availability notification and recovery
- Server availability notification
  - Server notifies Replication Manager during boot.
  - Replication Manager pings each available server periodically.
- Server recovery
  - Process fault: pinging fails; Replication Manager reboots the server by sending a script to the machine.
  - Machine fault (crash fault): pinging fails and sending the script does nothing; the machine has to be booted and the server launched manually.
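The notification and recovery duties above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the team's code: `ServerPinger`, `RecoveryScript`, and `ReplicationManagerSketch` are hypothetical names, and the actual ping and script mechanisms are not described on the slides.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical hook: how the Replication Manager pings a server.
interface ServerPinger {
    boolean isAlive(String serverName);
}

// Hypothetical hook: sends the restart script to the server's machine.
interface RecoveryScript {
    void sendTo(String serverName);
}

class ReplicationManagerSketch {
    private final Set<String> available = new LinkedHashSet<>();
    private final ServerPinger pinger;
    private final RecoveryScript script;

    ReplicationManagerSketch(ServerPinger pinger, RecoveryScript script) {
        this.pinger = pinger;
        this.script = script;
    }

    // A server calls this when it boots.
    synchronized void register(String serverName) {
        available.add(serverName);
    }

    // One ping round; a deployment would run this periodically.
    synchronized void pingAll() {
        for (String server : new ArrayList<>(available)) {
            if (!pinger.isAlive(server)) {
                // Drop the failed server and attempt recovery.
                available.remove(server);
                script.sendTo(server);
            }
        }
    }

    // Snapshot of the servers currently believed to be up.
    synchronized List<String> availableServers() {
        return new ArrayList<>(available);
    }
}
```

In this sketch a server recovered from a process fault would call `register` again when it boots; after a machine fault the script has no effect, so the server stays out of the list until it is launched manually.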
9. Replication Manager (cont'd)
- Client-RM communication
  - Client contacts Replication Manager each time it fails over.
  - Client quits when Replication Manager returns no server or Replication Manager can't be reached.
10. Evaluation of Performance (without failover)
11. Observable Trend
12. Failover Mechanism
- Server process is killed.
- Client receives a RemoteException.
- Client contacts Replication Manager and asks for a new server.
- Replication Manager gives the client a new server.
- Client remakes the invocation to the new server.
- Replication Manager sends a script to recover the crashed server.
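The client side of these steps can be sketched as a retry loop around the remote call. This is only an illustration: `RemoteHost`, `ReplicationManagerClient`, and `FailoverClientSketch` are hypothetical stand-ins for the HostBean remote interface and the Replication Manager, and the `getPlayerName` parameter is assumed.

```java
import java.rmi.RemoteException;

// Hypothetical stand-in for the remote HostBean interface.
interface RemoteHost {
    String getPlayerName(int playerId) throws RemoteException;
}

// Hypothetical stand-in for the client's view of the Replication Manager.
interface ReplicationManagerClient {
    // Returns a new server, or null when no server is available.
    RemoteHost requestNewServer() throws RemoteException;
}

class FailoverClientSketch {
    private final ReplicationManagerClient rm;
    private RemoteHost current;

    FailoverClientSketch(ReplicationManagerClient rm, RemoteHost initial) {
        this.rm = rm;
        this.current = initial;
    }

    String getPlayerName(int playerId) {
        while (true) {
            try {
                return current.getPlayerName(playerId);
            } catch (RemoteException serverFailure) {
                // The server failed: ask the Replication Manager for another one.
                try {
                    current = rm.requestNewServer();
                } catch (RemoteException rmFailure) {
                    throw new IllegalStateException("Replication Manager unreachable", rmFailure);
                }
                if (current == null) {
                    throw new IllegalStateException("No server available");
                }
                // Loop again to remake the invocation on the new server.
            }
        }
    }
}
```

The two `IllegalStateException` cases correspond to the conditions under which the client quits; a real client might report them to the user before exiting.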
13. Failover Experiment Setup
- 3 servers initially available
- Replication Manager on chess
- 30 fault injections
- Client keeps making invocations until 30 failovers are complete.
- 4 probes on server, 3 probes on client to calculate latency
14. Failover Experiment Result
(Plot: latency in ms versus invocation)
15. Failover Experiment Results
- Maximum jitter: 700 ms
- Minimum jitter: 300 ms
- Average failover time: 404 ms
16. Failover Pie Chart
Most of the latency comes from getting an exception from the server and connecting to the new server.
17. Real-time Fault-Tolerant Baseline Architecture Improvements
- Failover time improvements
  - Saving list of servers in client
    - Reduces time spent communicating with Replication Manager
  - Pre-creating host beans
    - Client creates host beans on all servers as soon as it receives the list from Replication Manager
- Runtime improvements
  - Caching on the server side
18. Client-RM and Client-Server Improvements
- Client-RM and Client-Server communication
  - Client contacts Replication Manager each time it runs out of servers, to receive a list of available servers.
  - Client connects to all servers in the list and creates a host bean on each, then starts the application with one server.
  - During each failover, client connects to the next server in the list.
  - No looping inside the list
  - Client quits when Replication Manager returns an empty list of servers or Replication Manager can't be reached.
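A minimal sketch of this client-side behavior follows, assuming the Replication Manager call and the host bean creation can be modeled as plain functions. `ServerListFailoverSketch`, `serverListSource`, and `hostFactory` are hypothetical names introduced for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

class ServerListFailoverSketch<H> {
    // Hypothetical: asks the Replication Manager for the available server names.
    private final Supplier<List<String>> serverListSource;
    // Hypothetical: connects to a server and creates a host bean on it.
    // Must not return null, because ArrayDeque rejects null elements.
    private final Function<String, H> hostFactory;
    // Host beans created ahead of time, in list order.
    private final Deque<H> preCreated = new ArrayDeque<>();

    ServerListFailoverSketch(Supplier<List<String>> serverListSource,
                             Function<String, H> hostFactory) {
        this.serverListSource = serverListSource;
        this.hostFactory = hostFactory;
    }

    // Returns the host to use at startup or after a failover.
    H nextHost() {
        if (preCreated.isEmpty()) {
            // Only contact the Replication Manager when the local list is exhausted.
            List<String> servers = serverListSource.get();
            if (servers == null || servers.isEmpty()) {
                throw new IllegalStateException("No servers available");
            }
            for (String server : servers) {
                preCreated.add(hostFactory.apply(server));
            }
        }
        // Each entry is used once, so the client never loops inside the list.
        return preCreated.poll();
    }
}
```

Because the host beans already exist when a failover happens, `nextHost` returns without any network round trip in the common case, which is where the reduction in failover time would come from.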
19. Real-time Server
- Caching in server
  - Saves commonly accessed database data in server
  - Uses a HashMap to map each query to previously retrieved data
  - O(1) average-case lookup for cached data
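The caching idea can be sketched with `java.util.HashMap` as below. The class name `QueryCacheSketch` and the `database` function are assumptions made for illustration; the slides do not show how queries or results are represented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class QueryCacheSketch {
    // Maps a query string to the data previously retrieved for it.
    private final Map<String, String> cache = new HashMap<>();
    // Hypothetical database access, modeled as a function from query to result.
    private final Function<String, String> database;

    QueryCacheSketch(Function<String, String> database) {
        this.database = database;
    }

    // synchronized because one stateless bean instance may serve many clients.
    synchronized String lookup(String query) {
        String cached = cache.get(query);
        if (cached != null) {
            return cached; // cache hit: no database round trip
        }
        String result = database.apply(query);
        if (result != null) {
            cache.put(query, result);
        }
        return result;
    }
}
```

This sketch never invalidates entries, so it only suits data that does not change while the server runs; data that is updated would need to be removed from or refreshed in the map.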
20. Real-time Failover Experiment Setup
- 3 servers initially available
- Replication Manager on chess
- 30 fault injections
- Client keeps making invocations until 30 failovers are complete.
- 4 probes on server, 5 probes on client to calculate latency and naming service time
- Client probes
  - Probes around getPlayerName() and getTableName()
  - Probes around getHost() for failover
- Server probes
  - Record source of invocation and name of method
  - Record invocation arrival and result return times
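A probe placed "around" a call can be sketched as a wrapper that timestamps the invocation before and after it runs. `LatencyProbeSketch` and `Sample` are hypothetical names; the team's actual probe format is not shown on the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

class LatencyProbeSketch {
    // One recorded invocation: method name plus start and end timestamps.
    static final class Sample {
        final String method;
        final long startNanos;
        final long endNanos;

        Sample(String method, long startNanos, long endNanos) {
            this.method = method;
            this.startNanos = startNanos;
            this.endNanos = endNanos;
        }

        long latencyMillis() {
            return (endNanos - startNanos) / 1_000_000;
        }
    }

    private final List<Sample> samples = new ArrayList<>();

    // Runs the invocation and records how long it took, even if it throws.
    synchronized <T> T measure(String method, Callable<T> invocation) throws Exception {
        long start = System.nanoTime();
        try {
            return invocation.call();
        } finally {
            samples.add(new Sample(method, start, System.nanoTime()));
        }
    }

    synchronized List<Sample> samples() {
        return new ArrayList<>(samples);
    }
}
```

On the client, a call such as `probe.measure("getPlayerName", () -> host.getPlayerName(id))` would record one sample per invocation (here `host` and `id` are placeholders); samples recorded around getHost() would isolate the failover portion of the latency.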
21. Real-time Failover Experiment Results
(Plot: latency in ms versus invocation)
22. Real-time Failover Experiment Results
- Average failover time: 217 ms
  - Roughly half (about 54%) of the latency without improvements (404 ms)
- Non-failover RTT is visibly lower
(Graphs: before real-time implementation; after real-time implementation)
23. Real-time Failover Experiment Results
24. Open Issues
- Blackjack game GUI
- Load-balancing using Replication Manager
- Multiple clients per table (JMS)
- Profiling on JBoss to help improve performance
- Generating a more realistic workload
- TimeoutException
25. Conclusions
- What we have accomplished
  - Fault-tolerant system with automatic server detection and recovery
  - Our real-time implementations proved successful in improving failover time as well as general performance
- What we have learned
  - Merging code can be a pain.
  - A stateless bean is accessed by multiple clients.
  - State can exist even in stateless beans and is useful if accessed by all clients, as in a cache.
- What we would do differently
  - Start evaluation earlier
  - Put more effort and time into implementing timeouts to enable bounded detection of server failure
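One way to bound detection time, sketched here as an assumption rather than as something the team implemented, is to run each invocation through an executor and give up after a fixed deadline. `BoundedInvocationSketch` and `callWithTimeout` are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedInvocationSketch {
    // Daemon threads so a hung invocation cannot keep the JVM alive.
    private final ExecutorService executor = Executors.newCachedThreadPool(runnable -> {
        Thread thread = new Thread(runnable);
        thread.setDaemon(true);
        return thread;
    });

    // Returns the result, or throws TimeoutException once the deadline passes.
    <T> T callWithTimeout(Callable<T> invocation, long timeoutMillis) throws Exception {
        Future<T> future = executor.submit(invocation);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Treat the server as failed; interrupt the stuck call if possible.
            future.cancel(true);
            throw e;
        }
    }
}
```

A client could treat the `TimeoutException` the same way it treats a RemoteException and fail over, so a server that hangs without crashing is detected within `timeoutMillis`.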