Title: Mobile Databases
2Mobile computing
- Portable computing devices and wireless communications
- Can access data from anywhere, anytime
- Example
- Brokerage services
- News reporting
- Traffic/Vehicle services
3Mobile DB
- Mobile database: data management technology enabling use of databases in a mobile computing environment
- Data available anywhere, independent of availability of fixed network
- Can access public data using internet browser
- Can access private data through distributed DB
- Data on mobile and fixed hosts sharable in a seamless way
- More complex techniques needed to support this: distributed transaction processing, commit
4Identifying Mobile characteristics
- Origins in distributed systems
- Problems more challenging
- Asymmetric communication bandwidth
- Limited and intermittent connectivity
- Limited life of power supply of mobile units
- Changing topology of network
- Mobile database assumes a traditional database
requiring ACID properties
5Mobile databases
- How to guarantee ACID properties
- Environment requires new strategies for
- Processing transactions
- Concurrency control - caching
- Data dissemination
- Querying location dependency
6Mobile Computing Architecture
- Mobile units (MU) or mobile hosts (MH)
- Fixed hosts (FH) on fixed network
- Base station (BS) serves as gateway to fixed and wireless network
- Geographic mobility domain divided into cells
- Mobile host has wireless connection to BS of its cell
- Movement of mobile units unrestricted
- Must maintain info for access contiguity
7Mobile DB
- Mobile DB is a mix of fixed and wireless network
- DBS distributed among wired and wireless components
- Data management shared among base stations, fixed hosts and mobile units
- MU can be data client and/or data server
- If a server, with DBMS functionality
- minimally need R, W, C, A
9Mobile DB
- When Mobile DB is a mix of fixed and wireless network:
- Fixed (FH): fixed location, high capacity, high reliability, low connection cost
- Wireless: supports dynamic network topology, low capacity, low reliability, high connection cost
10Transaction
- What is a transaction?
- A Transaction is not always just an SQL query
- A transaction is also
- From the time you login to SQL Plus until you
exit
11Mobile strategies
- Provide data cache on mobile host
- Cache replicas of frequently accessed data
- Work offline
- Reduce power consumption
- Client may be unreachable
- Dozing - energy conserving state
- Out of reach
- Proxies used for unreachable (e.g. update info)
- What if data cached updated during disconnection?
12Mobile strategies
- Resources of MU can be limited
- Mobile hosts personalized
- Bring in fraction of data need to access
- MU has low security
- Mobile DBs high degree of unavailability
- Broadcasting accepted way to disseminate data
13Mobile DB - Conservative
- Can assume entire DB distributed among wired components
- Full or partial replication
- Base station or fixed host has DBMS functionality
- Must be able to locate mobile units
- Need query and transaction management features for mobile environment
14Mobile DB - Conservative
- How is this different from distributed non-mobile?
- Difficult to maintain sustained connection to server
- Database server typically is stateless, especially under broadcast systems
- Mobile clients often cannot maintain a sustained network connection
15MANET Extreme DB
- Mobile ad hoc networks (MANETs)
- MUs do not need to communicate via a fixed network
- In MANET, MU responsible for routing own data, acting as BS
- Must be able to handle changes in network topology
16MANET Extreme DB
- Peer-to-peer
- No central control
- Difficult for transaction processing and data consistency
- Example applications
- Multi-user games
- Shared white-boards
- Battle information sharing
- Distributed calendars
17Mobile DBs Best of both
- Alternatively assume DB distributed among wired and wireless components
- What if MU has DBMS functionality?
- MU can be laptop
- Data management shared among base stations, fixed hosts and mobile units
- More interesting problems!! But solvable
18
- Assume the best-of-both Mobile DB for the next few topics to consider problems/solutions
19Data Management Issues
- Environment requires new strategies for
- Querying location dependency
- Concurrency control
- Processing transactions
- Security
- Data dissemination
- Recovery/fault tolerance
20Query processing
- Must know location of data
- Query optimization more difficult because of mobility and resource changes of MU
- MU may be in transit or may cross cell boundaries
21Location-based services
- Location dependent cache information may become stale
- Frequently updated location dependent queries
- Apply spatial queries to the cache refresh problem
22Transaction models
- Mobile transaction may execute on several BSs
- Central coordination lacking if data distributed among wireless components
- Long lived transactions
- ACID properties difficult to guarantee
- Can add proxies for unreachable components
- Proxies keep track of updates to cache
23Data distribution and replication
- Data unevenly distributed among BS and MU
- To compensate for high latency and unreliable connectivity:
- Frequently accessed data is cached
- Can work offline if necessary
- Consistency constraints and cache management
24Recovery and Fault tolerance
- Site, media, transaction and communication failures
- Voluntary shutdown not a site failure
- Transaction failures can occur during handoff
25Security
- Mobile data less secure than data at fixed location
- Data is more volatile
- Must manage and authorize access to critical data
26Data Dissemination -Broadcasting
- Assumptions
- Requests are read-only (Most are)
- Because of latency, server can handle fewer clients in same amount of time
- Broadcasting acceptable solution
- Scalable: single broadcast of data item can satisfy all outstanding requests for data item
28Broadcasting
- Broadcast-based data dissemination approaches
- Push-based data broadcasting
- Pull-based data broadcasting
- Hybrid data broadcasting
29Push-based broadcasting
- Data contents within a file or database are repeatedly broadcast through the broadcast channel
- channel becomes a disk
- clients can retrieve data as it goes by
- with a flat broadcast, expected wait time is the same for every data item
30Flat broadcasting
31Broadcast Disks
- broadcast data at different frequencies according to their relative importance
- multi-level memory hierarchy
- hot data are broadcast more frequently than cold data
- Data with similar access frequency are grouped into disks
32Server Broadcast Program
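The grouping of hot and cold data into disks can be turned into a broadcast program by splitting each disk into chunks and interleaving them. The following is a minimal sketch of one commonly described chunk-interleaving construction, not necessarily the exact program shown on this slide. The function name `broadcast_program`, the page labels, and the relative frequencies 4, 2, 1 are assumptions made for illustration.

```python
from functools import reduce
from math import gcd


def broadcast_program(disks, rel_freqs):
    """Interleave disk chunks into one major broadcast cycle.

    disks: list of page lists, hottest disk first (hypothetical layout).
    rel_freqs: integer relative broadcast frequency of each disk.
    """
    # Number of minor cycles is the least common multiple of the frequencies.
    max_chunks = reduce(lambda a, b: a * b // gcd(a, b), rel_freqs)
    chunked = []
    for pages, freq in zip(disks, rel_freqs):
        num_chunks = max_chunks // freq
        size = -(-len(pages) // num_chunks)  # ceiling division
        chunked.append([pages[k * size:(k + 1) * size] for k in range(num_chunks)])
    program = []
    for minor in range(max_chunks):
        # Each minor cycle carries the next chunk of every disk.
        for chunks in chunked:
            program.extend(chunks[minor % len(chunks)])
    return program


hot, warm, cold = ["A"], ["B", "C"], ["D", "E", "F", "G"]
program = broadcast_program([hot, warm, cold], [4, 2, 1])
print(program)
```

In this sketch the hot page appears four times per major cycle, each warm page twice, and each cold page once, which is the frequency skew the broadcast-disk idea relies on.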
33Pull-based broadcast
- also called adaptive approaches
- data items are broadcast on-demand
- only requested data will appear as data on air
34Pull-based
- Data broadcasting is prioritized according to some metrics
- Most common algorithms are
- First Come First Served (FCFS) broadcasts the pages in the order they are requested.
- Most Requests First (MRF) broadcasts the page with the maximum number of pending requests.
- Longest Wait First (LWF) selects the page that has the largest total waiting time, i.e., the sum of the time that all pending requests for the item have been waiting. (R×W, Requests times Wait, is an approximation)
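The four selection rules can be compared side by side with a small scoring function. This is a sketch under assumptions: the `pending` structure (page id mapped to a list of request arrival times) and the name `pick_page` are hypothetical, and R×W is taken here as the number of pending requests times the wait of the oldest request.

```python
def pick_page(pending, now, policy):
    """Return the page to broadcast next.

    pending: dict mapping page id -> list of arrival times of pending requests.
    now: current time, in the same units as the arrival times.
    """
    scores = {
        "FCFS": lambda times: now - min(times),                 # oldest request first
        "MRF": lambda times: len(times),                        # most pending requests
        "LWF": lambda times: sum(now - t for t in times),       # largest total wait
        "RxW": lambda times: len(times) * (now - min(times)),   # requests times wait
    }
    score = scores[policy]
    return max(pending, key=lambda page: score(pending[page]))


pending = {"a": [1], "b": [5, 6, 7], "c": [2, 3]}
for policy in ("FCFS", "MRF", "LWF", "RxW"):
    print(policy, pick_page(pending, now=10, policy=policy))
```

The same pending queue yields different choices per policy, which is why the policies behave differently under uniform and skewed request patterns.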
35Pull-based
- MRF: best response time at high system loads and page requests uniformly distributed
- LWF: best response time when page requests are distributed by Zipf
36Hybrid data Broadcasting
- mixes both push and pull
- clients send pull requests for misses on the backchannel
- server supports a Broadcast Disk plus interleaved responses to the pulls on the front channel
- alleviates the problem of excessively long waiting time for some data
37Indexing
- Clients can save battery power by switching to active mode only when the data of interest are broadcast
- (1, m) index method (Imielinski, et al.)
- Index is broadcast m times during the broadcast of one version of the file
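The (1, m) layout can be sketched as a broadcast cycle in which a copy of the index precedes each 1/m fraction of the file. The names `one_m_cycle` and `buckets_until_index`, and the use of a single "INDEX" marker for the whole index segment, are simplifying assumptions for illustration.

```python
def one_m_cycle(items, m):
    """Build one broadcast cycle with the index repeated m times."""
    size = -(-len(items) // m)  # ceiling division: data items per fraction
    cycle = []
    for k in range(m):
        cycle.append("INDEX")
        cycle.extend(items[k * size:(k + 1) * size])
    return cycle


def buckets_until_index(cycle, pos):
    """How many buckets a client tuning in at pos can doze before an index."""
    n = len(cycle)
    return next(d for d in range(n) if cycle[(pos + d) % n] == "INDEX")


cycle = one_m_cycle(list("abcdef"), m=3)
print(cycle)
print(buckets_until_index(cycle, pos=1))
```

A larger m shortens the doze time before the next index but lengthens the overall cycle, so m is a trade-off between tuning time and access time.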
38Related Research Indexing (cont.)
39Data Consistency
- Assumption: Read and Write transactions
- Challenges in mobile environments
- Difficult to maintain sustained connection to server
- Database server typically is stateless, especially under broadcast systems
- Mobile clients often cannot maintain a sustained network connection
- How to ensure conflict serializability?
40Research in Data Consistency
- Assumptions
- Read and Write transactions
- MU has DBMS functionality
- Mobile unit may often experience voluntary/involuntary disconnections
- Then, it can only read and update data copied onto its local cache
- What if data cached updated during disconnection?
41Concurrency Control
- Two-tier replication algorithm (Gray et al. 1996)
- Tentative and Base transactions
- Tentative transactions are transactions executed over local copies if disconnected
- Tentative transaction will be submitted to the server and reprocessed before final installation
- Can be aborted by the server due to conflicts with other transactions
- Base transactions (transactions work only on master data)
- Transaction becomes durable when the base transaction completes
- Drawback: deadlocks, system unscalable
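The tentative/base split can be illustrated with a toy key-value store. This is a sketch under assumptions, not the algorithm as published: the function names, the dictionary-based store, and the caller-supplied acceptance test used to decide whether the reprocessed transaction may be installed are all hypothetical.

```python
def run_tentative(local_copy, txn):
    """Execute against the disconnected client's local replica only."""
    txn(local_copy)  # result is visible locally but not durable


def reprocess_as_base(master, txn, acceptable):
    """Re-run the transaction on master data; install only if acceptable."""
    trial = dict(master)
    txn(trial)
    if not acceptable(trial):
        return False          # server aborts the tentative transaction
    master.update(trial)      # base transaction completes: now durable
    return True


def withdraw(amount):
    def txn(store):
        store["balance"] -= amount
    return txn


master = {"balance": 100}
local_copy = dict(master)
run_tentative(local_copy, withdraw(80))   # local balance is now 20
master["balance"] = 50                    # another base transaction ran meanwhile


def non_negative(store):
    return store["balance"] >= 0


first = reprocess_as_base(master, withdraw(80), non_negative)
second = reprocess_as_base(master, withdraw(30), non_negative)
print(first, second, master)
```

The first reprocessing fails because the master data changed during disconnection, which is the case the slide raises with "can be aborted by the server".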
42Concurrency Control
- Certification reports - CR (Barbara, 97)
- Consists of the read/write sets of recently committed transactions
- Broadcast periodically by the server
- Clients execute part of validation work locally
- Must submit to server for final validation
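The client-side part of the validation can be pictured as a set-overlap check against the report. The slides do not state the exact test, so this sketch assumes a simple rule: a running transaction fails the local check if anything it has read was written by a recently committed transaction. The name `passes_local_check` and the report layout are hypothetical.

```python
def passes_local_check(read_set, certification_report):
    """Local pre-validation against one certification report.

    certification_report: list of (read_set, write_set) pairs, one per
    recently committed transaction (assumed layout).
    """
    return all(not (read_set & writes) for _reads, writes in certification_report)


report = [({"a"}, {"x"}), ({"x"}, {"y"})]
print(passes_local_check({"z", "a"}, report))
print(passes_local_check({"y"}, report))
```

A transaction that passes this check is still submitted to the server, since only the server sees every committed transaction.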
43Concurrency Control
- Optimistic Concurrency Control with Update Timestamp (OCC-UTS)
- Server broadcasts invalidation report (IR), which contains new timestamps of newly updated data items
- If any accessed data item in a locally executing transaction has an older timestamp, the local transaction is aborted
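The abort rule can be sketched as a timestamp comparison between what the transaction has read and what the invalidation report announces. The function name and the two dictionary layouts are assumptions for illustration.

```python
def survives_invalidation(read_timestamps, invalidation_report):
    """Return False if the transaction read a version that is now stale.

    read_timestamps: item id -> timestamp of the version the transaction read.
    invalidation_report: item id -> new timestamp of a newly updated item.
    """
    return all(
        invalidation_report.get(item, ts) <= ts
        for item, ts in read_timestamps.items()
    )


reads = {"x": 5, "y": 7}
print(survives_invalidation(reads, {"y": 9}))   # y was updated after it was read
print(survives_invalidation(reads, {"z": 9}))   # z was never read
```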
44Mobile Databases
- Research Issues 2007 MDM
- Atomic commit protocol
- Data dissemination
- Spatio-temporal range queries
- Adhoc networks - data integrity, data replication
45
- Ph.D. student Weigang Ni
- Data Management in Adaptive Broadcast Environments
46Lazy Data Request (LDR)
- Pull-based data broadcasting: data are broadcast on demand (Stathatos, et al.)
- Scheduling algorithms
- First Come First Serve (FCFS)
- Most Requests First (MRF)
- Longest Wait First (LWF)
- Requests times Wait (R×W)
- Other algorithms based on broadcast histories, estimation of the probabilities of access for each data item.
47LDR
- Existing algorithms mainly concern data access time.
- Whenever a client has a data request, it sends the request to the server: Eager Data Request (EDR).
- Sending a message consumes more battery power than receiving a message.
48LDR
- Motivation: wanted data may have already been requested by other clients. Why not wait instead? Two possible results.
- Issues that need to be addressed
- Mobile clients do not communicate with each other. Therefore, they cannot decide whether to wait or go ahead and send the request
- The system load changes dynamically. A predefined waiting time will not work well.
49LDR
- Features of Lazy Data Request
- Clients do not need to contact the server to get the system load information and waiting time.
- The waiting time changes dynamically according to system load.
- LDR approach can apply to nearly all the existing on-demand broadcast algorithms
50Server-side algorithm of LDR
- Step 1. Let n be the total number of requested data items
- Step 2. Choose (selection factor × n) data items to be broadcast next based on some scheduling algorithm (selection factor between 0 and 1)
- Step 3. Clear all existing requests for these chosen data items.
- Step 4. Broadcast the index section consisting of these chosen data items.
- Step 5. Broadcast the data items.
- E.g., will broadcast (selection factor × 100)% of data items
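One round of the server-side steps can be sketched as follows. The function name `ldr_server_round`, the `pending` layout (item id mapped to its list of pending requests), the pluggable `score` function, and the rounding rule (round to nearest, at least one item) are assumptions; the slides do not specify them.

```python
def ldr_server_round(pending, selection_factor, score):
    """Run steps 1 to 5 once; return (index_section, data_items).

    pending: dict item id -> list of pending requests (mutated in place).
    score: scheduling metric, e.g. len for Most Requests First.
    """
    n = len(pending)                                   # step 1
    if n == 0:
        return [], []
    k = max(1, round(selection_factor * n))            # step 2 (rounding assumed)
    chosen = sorted(pending, key=lambda item: score(pending[item]), reverse=True)[:k]
    for item in chosen:                                # step 3
        del pending[item]
    index_section = list(chosen)                       # step 4: ids only
    return index_section, chosen                       # step 5: data follows index


pending = {"a": [1], "b": [1, 2, 3], "c": [1, 2], "d": [4]}
index_section, data_items = ldr_server_round(pending, 0.5, len)
print(index_section, sorted(pending))
```

Passing a different `score` function swaps the scheduling algorithm without touching the rest of the round, which matches the claim that LDR applies to most on-demand algorithms.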
51Client-side algorithm of LDR
- Wait until wanted data or index section is broadcast
- If wanted data items come
  - download the data
  - drop the local pending request
- else
  - check the index section
  - if wanted data ids in index section
    - wait until data are broadcast
  - else
    - send the pending request(s) to the server
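The client-side decision can be sketched as a pure function that reacts to one broadcast section at a time. The function name, the "data"/"index" tags, and the set-based bookkeeping are assumptions; tracking which requests have already been sent is left out to keep the sketch short.

```python
def ldr_client_on_broadcast(wanted, kind, ids):
    """React to one broadcast section.

    wanted: set of item ids the client still needs.
    kind: "data" or "index" (hypothetical tags for the two section types).
    ids: set of item ids carried or announced by the section.
    Returns (remaining wanted ids, ids to request from the server).
    """
    if kind == "data":
        # Wanted items on the air are downloaded and their requests dropped.
        return wanted - ids, set()
    # Index section: announced items will arrive soon, so just wait for them.
    # Anything not announced has to be requested explicitly.
    return wanted, wanted - ids


wanted = {"a", "b"}
wanted, to_send = ldr_client_on_broadcast(wanted, "data", {"a"})
print(wanted, to_send)
print(ldr_client_on_broadcast(wanted, "index", {"b", "c"}))
print(ldr_client_on_broadcast(wanted, "index", {"c"}))
```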
52Discussion
- Algorithm still works without using index. However, index makes the data broadcast more predictable and further saves data request messages.
- Adjust the value of the selection factor
- If the selection factor is 1, LDR becomes the first come first served (FCFS) algorithm
- If the selection factor is very small, LDR virtually becomes EDR as every time only a couple of data items are chosen
- Client waiting time is bounded.
53System Parameters
- Number of items in DB (Dbsize): 1000
- Mean request arrival rate (exponential): 10 to 100
- Skewness of access pattern (Zipf): 0.1 to 0.9
- Selection factor: 0.1
54Experimental results - requests saved on average
55 requests saved on hot data
56 requests saved on cold data
57Impact on hot data
- Requests saved for hot data
- More likely hot data broadcast before requested
- MRF has 50% reduction in messages
- FCFS,
58Average Data Access time
59Average Data Access time- hot data
60Average Data Access time cold data
61Why is it faster?
- Saving the requests for hot data essentially
changes the access skewness of data requests sent
to the server, i.e., the access pattern appears
to be more evenly distributed than it actually is
62LDR Conclusions
- LDR decreases number of messages sent
- Decreases average data access time
- By up to 50% for both
- Data access time does not increase
- Data access time actually decreased by over 50%
- Due mainly to cold data
- Works with a variety of scheduling algorithms
63Overall Conclusions
- Data management in mobile environments
- Concurrency control
- Algorithms proposed produce serializable histories
- Outperform existing algorithms
- Adaptive data broadcasting
- Algorithm proposed shows number of data request messages can be reduced
- Data access time does not increase
64Virtual Locks W. Ni
65Virtual Locks
- Lock-based concurrency control approach
- Using server authorization to eliminate transaction restarts
- Authorization information is broadcast in the broadcast cycle header
- Treat read-only and update transactions differently
66Virtual Locks (cont.)
- Server schedules the data broadcast in the following way
- Read-only transactions' data requests will be satisfied unconditionally
- Update transactions' data requests are satisfied only if they pass the conflict resolution.
67Virtual Locks (cont.)
- Read-only transaction
- Does not need explicit server authorization
- May not send data requests to the server as long as the required data is on the air within one broadcast cycle.
- Commit locally
68Virtual Locks (cont.)
- Update transaction
- Client always sends the data requests to the server
- Can only proceed when it is authorized to begin, i.e., its transaction id appears on the air
- Transaction will be submitted to the server for final installation
69Study
- Proved conflict serializable
- Compare performance to existing strategies
- CR: certification report
- Receive RW sets from committed transactions
- Must always request validation from server, even if only read
70Read-only transactions
- tune into the first index segment in current broadcast cycle
- if (read_set ⊆ BC_SET and no data in read_set has been broadcast)
  - download data in read_set from current broadcast cycle
- else
  - send a data request message to the server
  - while (true)
    - wait till the beginning of next broadcast cycle
    - if (read_set ⊆ BC_SET)
      - download the data in read_set
      - break
- process the transaction and commit locally
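The read-only path can be simulated over a sequence of broadcast cycles. This is a sketch under assumptions: each cycle is reduced to the set of item ids it carries (BC_SET), and the function name, argument names, and return value are hypothetical.

```python
def run_read_only(read_set, first_bc_set, already_broadcast, later_bc_sets):
    """Simulate the read-only client logic.

    first_bc_set: items scheduled in the cycle the client tunes into.
    already_broadcast: items of that cycle that have already gone by.
    later_bc_sets: BC_SET of each following cycle, in order.
    Returns (cycle number used for download, whether a request was sent).
    """
    if read_set <= first_bc_set and not (read_set & already_broadcast):
        return 0, False                      # everything still on the air
    # Otherwise one request goes to the server, then the client waits.
    for number, bc_set in enumerate(later_bc_sets, start=1):
        if read_set <= bc_set:
            return number, True
    return None, True                        # not yet satisfied in this trace


print(run_read_only({"x"}, {"x", "y"}, set(), []))
print(run_read_only({"x"}, {"x", "y"}, {"x"}, [{"z"}, {"x"}]))
```

In both outcomes the transaction is processed and committed locally once its data is downloaded; no commit message goes to the server.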
71Write Transactions
- send a data request message to the server
- while (true)
  - wait till the next broadcast cycle
  - if (tran_id ∈ auth_list)
    - download the data in read_set and write_set
    - process the transaction
    - send the old and updated value to the server for commit
    - break
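The update path can be sketched in the same style. The per-cycle authorization lists, the `database` dictionary standing in for downloaded data, and the shape of the commit message are assumptions for illustration; the server-side conflict resolution is not modeled.

```python
def run_update(tran_id, auth_lists, database, updates):
    """Wait for authorization, then build the commit message.

    auth_lists: authorization list broadcast in each cycle header, in order.
    database: item id -> current value, as downloaded from the broadcast.
    updates: item id -> new value the transaction wants to write.
    Returns (cycle number of authorization, commit message), or (None, None).
    """
    for cycle, auth_list in enumerate(auth_lists, start=1):
        if tran_id in auth_list:
            old_values = {item: database[item] for item in updates}
            message = {"tran_id": tran_id, "old": old_values, "new": dict(updates)}
            return cycle, message            # sent to the server for commit
    return None, None                        # still waiting for authorization


cycle, message = run_update("T7", [{"T1"}, {"T7", "T9"}], {"x": 10}, {"x": 11})
print(cycle, message)
```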
73Experimental results (uniform access)
74Experimental Results (Zipf access)
75VL Conclusions
- Locking strategy VL better than optimistic CR
- Optimistic causes large amount of aborted transactions
- VL better for both uniform and skewed data access pattern
- VL better scalability
77Adjusting serialization order W. Ni
78Optimistic Concurrency Control with Dynamic Adjusting Timestamp Ordering (OCC-DTO)
- Problems with existing algorithms
- Each transaction must be submitted to the server for validation regardless of whether it is a read-only or update transaction.
- During the transaction execution, if it misses a CR or IR, transaction must be aborted.
- Many read-only transactions can finish even if they find that accessed data have been updated.
79Existing algorithms
- optimistic concurrency control with update timestamp (OCC-UTS)
- uses timestamps and caching to improve the system performance.
- Different from the CR approach, only updated data items are broadcast in the invalidation report (IR).
80Problems with existing algorithms
- CR and OCC-UTS algorithms can cause many transactions to be restarted due to conflicts with updated data items.
- We propose these transactions be completed without restarting by dynamically adjusting the serialization order among transactions.
82Wasted Abort
- Server commits T1
- T3 and T4 are aborted due to the conflict on data item y.
- Only T2 proceeds to finish
- If we adjust the serialization order between T1 and T3 so that T3 → T1, we find that transaction T3 also can proceed without aborting.
83Proposed strategy - OCC-DTO
- Features of OCC-DTO
- Clients are allowed to disconnect during the execution of a transaction.
- Read-only and update transactions are handled differently.
- Dynamically set transaction threshold timestamp if conflict between read-set and IR
- Read-only transactions are allowed to commit locally.
- Read-only transactions do not have to restart if there is a conflict.
- Proved global serialization is still maintained.
84OCC-DTO server-side algorithm
- Validate submitted update transactions
- Commit the transactions passing the validation test
- Broadcast data and invalidation report (IR)
- Committed transactions
- Aborted transactions
- Updated data items (with new timestamps)
85OCC-DTO client-side algorithm
- Each transaction has two variables: thresh_ts (initially ∞) and adjusted (initially false)
- Upon reading data item x
- if adjusted is true and ts(x) > thresh_ts, abort; else continue to execute
- Upon writing data item x
- If adjusted is true, abort transaction
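The client-side rules can be sketched as a small class. Several details are assumptions because the slide text does not give them: the initial threshold is taken as infinity, the comparison on read is taken as ts(x) > thresh_ts, and on a conflict with an IR the threshold is set to the newest timestamp the transaction has read so far (mirroring the reconnection rule described later). The class and method names are hypothetical.

```python
import math


class ClientTxn:
    def __init__(self):
        self.thresh_ts = math.inf   # assumed initial value
        self.adjusted = False
        self.read_ts = {}           # item id -> timestamp of version read
        self.aborted = False

    def on_invalidation_report(self, updated):
        """updated: item id -> new timestamp, as carried in an IR."""
        if not self.adjusted and any(item in self.read_ts for item in updated):
            # Assumption: serialize before the updater by freezing the
            # threshold at the newest version read so far.
            self.thresh_ts = max(self.read_ts.values())
            self.adjusted = True

    def read(self, item, ts):
        if self.adjusted and ts > self.thresh_ts:   # comparison assumed
            self.aborted = True
        else:
            self.read_ts[item] = ts
        return not self.aborted

    def write(self, item):
        if self.adjusted:
            self.aborted = True
        return not self.aborted


t = ClientTxn()
t.read("y", 5)
t.read("x", 3)
t.on_invalidation_report({"y": 9})   # conflict: y was read and is now updated
print(t.adjusted, t.thresh_ts, t.read("z", 4), t.read("w", 7))
```

In this trace the transaction keeps running after the conflict and only aborts when it meets a version newer than its threshold, which is the restart saving the approach aims for.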
86An Example
88Extended to include reconnection
- Handle mobile clients' disconnection
- A client disconnects from the network for various reasons.
- It may miss IRs during disconnection.
- In earlier algorithms, transactions will be aborted if any IR is missed.
- Upon reconnection, a client will reset its timestamp equal to the maximum ts of its accessed data items if adjusted is false.
89Simulation parameters
90Performance
- Baseline Restart Ratio Performance of OCC_DTO
91Fig. 3 Avg. turnaround time comparison under baseline parameter setting
92Fig. 4 Avg. turnaround time comparison under R_W_RATIO = 5
93OCC-DTO Conclusions
- Dynamically adjust serialization order using timestamps to allow transactions to complete without restarts
- Read-only transactions commit locally, same uplink bandwidth
- Better performance than CR: fewer restarts