Title: Query Processing in Mobile P2P Databases
1Query Processing in Mobile P2P Databases
IGERT Seminar Presentation
Bo Xu, joint work with Ouri Wolfson
2Talk outline
- Introduction
- System Model
- The MARKET Algorithm
- Evaluation
- Extension to CTS
- Conclusion and Future Work
3Query Processing Environments
Motivation: a general-purpose query processing
strategy for mobile, disconnected wireless ad-hoc
networks
4Store-and-forward to deal with sparseness
[Figure: store-and-forward example; labels: q, Q, A, QA, qA, r]
5Issues with Store-and-forward
- How to manage limited memory, power, and bandwidth?
- Which reports to save/transmit?
6Difficulty of Store-and-forward
Case: each mobile node is interested in every
data-item.
Assume that the trajectories of all nodes are
known a priori at a central server. If memory,
energy, and bandwidth are bounded at mobile
nodes, then the problem of determining whether a
set of data-items can be disseminated to all the
mobile nodes is NP-complete.
Mobile P2P: trajectories are unknown a priori,
so heuristics are needed.
7Talk outline
- Introduction
- System Model
- The MARKET Algorithm
- Evaluation
- Extension to CTS
- Conclusion and Future Work
8Mobile P2P Database
PDAs, cell phones, sensors, hotspots, and vehicles
with short-range wireless capabilities
[Figure: peers A, B, and C]
- Applications coexist
- Variable report sizes
- A peer can be a producer, consumer, and broker
9Queries
- A query Q maps each report R to a match degree
- Examples
- Top parking slots given my current location
- Profile with expertise children-periodontics
- Similarity between two images
$\text{match}(R,Q) = e^{-\alpha t - \beta d}$
10Query/report Dissemination
- Two peers within transmission range exchange queries and reports
- Least relevant reports that do not fit in the local broker database are purged
- Exchange is not necessarily synchronous (periodic broadcast)
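The purge step above can be sketched as follows. This is a minimal illustration, assuming each report already carries a relevance score; the names `Report`, `purge_to_fit`, and `merge_after_exchange` are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    report_id: str
    size: int      # bytes
    rank: float    # hypothetical precomputed relevance score; higher is better


def purge_to_fit(reports, capacity_bytes):
    """Keep the highest-ranked reports that fit; return (kept, purged).

    Reports are taken in descending rank order. The first report that
    would exceed the capacity, and every report ranked below it, is purged.
    """
    ordered = sorted(reports, key=lambda r: r.rank, reverse=True)
    used = 0
    for i, report in enumerate(ordered):
        if used + report.size > capacity_bytes:
            return ordered[:i], ordered[i:]
        used += report.size
    return ordered, []


def merge_after_exchange(local, received, capacity_bytes):
    """Merge received reports into the local database, then purge."""
    by_id = {r.report_id: r for r in local}
    for r in received:
        by_id.setdefault(r.report_id, r)   # ignore reports already held
    return purge_to_fit(list(by_id.values()), capacity_bytes)
```

With an 800-byte database, a peer holding R1 (400 bytes, rank 0.9) and R2 (500 bytes, rank 0.2) that receives R3 (300 bytes, rank 0.6) keeps R1 and R3 and purges R2.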
11Talk outline
- Introduction
- System Model
- The MARKET Algorithm
- Evaluation
- Extension to CTS
- Conclusion and Future Work
12Ranking Factors
- Rank of a report R is determined by
- Demand: what fraction of peers are querying R
- Probability that a peer is interested in R
- Supply: what fraction of peers already have R
- Probability that a peer has R
- Size of R
13Rank of a report
$\text{expected benefit} = \text{demand}(R) \times (1 - \text{supply}(R))$
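A minimal sketch of the ranking, reading the expression above as expected benefit = demand(R) × (1 − supply(R)). How the size of R enters the rank is not spelled out on the slide; dividing the expected benefit by the size (benefit per byte) is an assumption made here for illustration only.

```python
def expected_benefit(demand, supply):
    """demand and supply are probabilities in [0, 1]."""
    if not (0.0 <= demand <= 1.0 and 0.0 <= supply <= 1.0):
        raise ValueError("demand and supply must lie in [0, 1]")
    return demand * (1.0 - supply)


def rank(demand, supply, size_bytes):
    """ASSUMPTION: rank is expected benefit per byte of report size."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return expected_benefit(demand, supply) / size_bytes
```

Under this assumption, a report that half the peers want and that few peers already hold outranks an equally sized report that most peers already have.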
14Report Ranking sample demand
The queries relation is maintained in FIFO order
15Rank of Reports
- Demand for R
- The Q_i's are the members of the queries relation
- The size of the queries relation is determined based on
Hoeffding's inequality
E.g., if n = 108, then with 95% chance the demand
estimation error is smaller than 0.08
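A minimal sketch of sampling demand from the queries relation, assuming demand(R) is estimated as the average match degree of R over the stored queries (an assumption; the slides only state that the relation is FIFO-maintained and of bounded size). The class name `QueriesRelation` and the `match` callback are hypothetical.

```python
from collections import deque


class QueriesRelation:
    """Bounded FIFO sample of queries received from other peers."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently drops the oldest entry when full (FIFO)
        self._queries = deque(maxlen=capacity)

    def add(self, query):
        self._queries.append(query)

    def demand(self, report, match):
        """ASSUMPTION: demand is the mean of match(report, Q_i) over the sample."""
        if not self._queries:
            return 0.0
        total = sum(match(report, q) for q in self._queries)
        return total / len(self._queries)


def topic_match(report, query):
    """Toy 0/1 match degree used only for this example."""
    return 1.0 if report["topic"] == query else 0.0
```

With capacity 3 and queries "parking", "traffic", "parking", "parking" arriving in that order, the oldest query is dropped and the estimated demand for a parking report is 2/3.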
16How does peer O determine supply(R)?
- A parametric formula giving the supply is beyond the state of the art
- O machine-learns supply(R) based on meta-data of R
- Age of R
- Number of times O sighted R from other peers
- etc.
17Computing Supply by Machine-learning
MAchine LEarning based Novelty rAnking (MALENA)
Reports database of O:

report-id | description | aro | fin
R1        |             | 1   | 1
R4        |             | 2   | 4
R2        |             | 3   | 2
R7        |             | 4   | 2

aro: the age rank order within O's reports database
fin: the number of times O has sighted the report from other peers
18MALENA
[Figure: positive and negative examples created when R2 is requested]
19MALENA Implementation Considerations
- Minimize overhead
- No need to actually store examples
- Model incrementally built
- Bayesian learning: a simple but effective method
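The considerations above (no stored examples, incrementally built model, Bayesian learning) can be illustrated with a count-based naive Bayes classifier. This is a sketch under assumptions, not the MALENA implementation: the class name, the discretized feature values, and the Laplace smoothing are all hypothetical choices, and what a positive label means is left to the caller.

```python
class IncrementalNaiveBayes:
    """Naive Bayes over discrete features; only counts are kept, never examples."""

    def __init__(self, n_values_per_feature, smoothing=1.0):
        # e.g. {"aro": 2, "fin": 2} if each feature is bucketed into 2 values
        self.n_values = dict(n_values_per_feature)
        self.smoothing = smoothing
        self.class_counts = {True: 0, False: 0}
        self.feature_counts = {}   # (label, feature, value) -> count

    def update(self, features, label):
        """Fold one labeled example into the counts, then discard it."""
        self.class_counts[label] += 1
        for name, value in features.items():
            key = (label, name, value)
            self.feature_counts[key] = self.feature_counts.get(key, 0) + 1

    def prob_positive(self, features):
        """Posterior probability of the positive class, Laplace-smoothed."""
        total = sum(self.class_counts.values())
        scores = {}
        for label in (True, False):
            p = (self.class_counts[label] + self.smoothing) / (total + 2 * self.smoothing)
            for name, value in features.items():
                num = self.feature_counts.get((label, name, value), 0) + self.smoothing
                den = self.class_counts[label] + self.smoothing * self.n_values[name]
                p *= num / den
            scores[label] = p
        return scores[True] / (scores[True] + scores[False])
```

Each call to `update` costs a handful of counter increments, which is what keeps both memory and computation overhead low.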
20Talk outline
- Introduction
- System Model
- The MARKET Algorithm
- Evaluation
- Extension to CTS
- Conclusion and Future Work
21Comparison with RANDI (MDM07)
Simulation parameters:
- mobility model = random way point, average motion speed = 1 mile/hour
- transmission range = 100 meters
- mean of reports database size = 100 Kbytes
- queries database size = 100 queries
- report size uniformly distributed between 1K and 2K bytes
- 0.1 report produced per second
[Plots: 1 peer within transmission range; 20 peers within transmission range]
MARKET is half as good as the ideal benchmark.
MARKET is twice as good as RANDI.
RANDI = MARKET - supply
22Comparison with LRU and LFU
Simulation parameters:
- mobility model = iMotes traces
- mean of reports database size = 150 Kbytes
- queries database size = 10 queries
- report size uniformly distributed between 2K and 20K bytes
- 0.1 report produced per second
- transmission size = 100 Kbytes
[Plot axes: throughput (matches/peer); response-time bound (seconds)]
(results obtained by Fatemeh Vafaee)
23Evaluation of MALENA (TAAS09)
- turn-over: peers enter/exit the system
- injection: number of peers that have a report initially
- mobility model = iMotes traces
- reports database size = 100 reports
- 2 reports produced per second
- transmission size = 10 reports
MALENA always follows the best indicator
24Application K-nearest-neighbors
[Figure labels: query-point, sink]
- Query: K-nearest-neighbors of a fixed location (query-point)
- Reports: current locations of mobile sensors
- match(Q,R): inversely proportional to the distance from the query point
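A minimal sketch of a distance-based match degree and the resulting K-nearest-neighbors answer. The exact match function is not given on the slide; `1 / (1 + distance)` is an assumed form that decreases with distance and avoids division by zero, and the function names are hypothetical.

```python
import heapq
import math


def knn_match(query_point, sensor_location):
    """ASSUMPTION: match degree 1 / (1 + Euclidean distance)."""
    return 1.0 / (1.0 + math.dist(query_point, sensor_location))


def top_k_sensors(query_point, reports, k):
    """reports maps sensor id -> (x, y); return the k best-matching sensor ids."""
    return heapq.nlargest(
        k, reports, key=lambda sid: knn_match(query_point, reports[sid])
    )
```

Because the match degree is monotonically decreasing in distance, ranking reports by match degree and keeping the top K is the same as keeping the K sensors closest to the query point.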
25Itinerary based KNN processing
Phase I: Query delivered to the sensor closest to
the query point
Phase II: Query traverses an itinerary to collect
answers
Phase III: Answers returned to sink
26Simulation Results
Simulation parameters:
- mobility model = random way point, average motion speed = 1 mile/hour
- transmission range = 100 meters
- report size = 24 bytes, query size = 16 bytes
- mean of reports database size = 100 reports
- one location report produced at each sensor per second
MARKET is especially suitable for sparse
environments
27Talk outline
- Introduction
- System Model
- The MARKET Algorithm
- Evaluation
- Extension to CTS
- Conclusion and Future Work
28TrafficInfo: Disseminating Traffic Information in
VANETs
29What does relevance mean in TrafficInfo?
[Figure: map with points A and B]
A report is relevant if it changes the route
30Which factors indicate relevance of report?
- Distance to the reported road segment
- Type of road segment
- Speed variance
31Conceptual Learning Procedure
- An example is created for a received report
- The example is labeled positive if the report changes the route and negative otherwise
- Individual vs. group
- How to deal with aggregation?
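The labeling rule above can be sketched as follows, assuming the route is a shortest path by travel time and a report updates the travel time of one road segment. The graph encoding and the names `shortest_route` and `label_report` are hypothetical; the slides do not specify the routing method.

```python
import heapq


def shortest_route(graph, source, target):
    """Dijkstra. graph maps node -> {neighbor: travel_time}. Returns a node list or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None
    route = [target]
    while route[-1] != source:
        route.append(prev[route[-1]])
    return route[::-1]


def label_report(graph, source, target, report):
    """report = (u, v, new_travel_time). Positive (True) if it changes the route."""
    before = shortest_route(graph, source, target)
    u, v, travel_time = report
    updated = {node: dict(nbrs) for node, nbrs in graph.items()}  # copy, do not mutate
    updated.setdefault(u, {})[v] = travel_time
    after = shortest_route(updated, source, target)
    return before != after
```

A report of congestion on a segment of the current route typically yields a positive example; a report about a segment the vehicle would not have used yields a negative one.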
32Conclusion
[Figure: sensor-rich environment, short-range wireless, Mobile P2P]
- Store-and-forward enables in-network processing in mobile disconnected networks
- Ranking is important for dealing with memory, bandwidth, and energy constraints
33Future Work
- Multimedia reports
- Utilization of metadata
- Integration of stateless and stateful approaches
- Starvation/fairness
34Thanks! Questions?
35802.11 Basics
- 3 modes: transmitting, receiving, listening (order of power consumption)
- When listening, if a message destined to the host is detected → receive mode
- Time is divided into slots of 20 microseconds each
- Transmission
- Listen for 1 time slot
- If the channel is free, start the broadcast (note that a collision is possible)
- A broadcast may last for many time slots
36Energy Efficiency of a Broadcast
[Figure labels: neighbors that successfully receive the broadcast from x; collisions occur at neighbor]
Throughput (Th) = (expected number of neighbors
that successfully receive the broadcast) × (broadcast
size)
Power efficiency (PE)
37Computation of Throughput
[Figure: sender X and receiver Y, with green and purple nodes]
Conditions for successful reception at an
arbitrary node Y:
- No green node inside starts to broadcast in the same time slot as X
- No transmission from any purple node overlaps with that from X
38Energy Constraints
- Energy consumed by an 802.11 network interface for transmitting a message of size M bytes:
- $En = f \times M + g$
- For 802.11 broadcast, $g = 266 \times 10^{-6}$ Joule, $f = 5.27 \times 10^{-6}$ Joule/byte
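A small worked sketch of the energy model, reading the expression above as En = f × M + g with g = 266 × 10^-6 Joule and f = 5.27 × 10^-6 Joule/byte. The function names and the budget inversion are illustrative additions, not part of the slides.

```python
F_JOULE_PER_BYTE = 5.27e-6   # per-byte cost f for 802.11 broadcast (from the slide)
G_JOULE = 266e-6             # fixed per-message cost g (from the slide)


def broadcast_energy(message_bytes):
    """Energy in Joules to broadcast a message of the given size."""
    if message_bytes < 0:
        raise ValueError("message_bytes must be non-negative")
    return F_JOULE_PER_BYTE * message_bytes + G_JOULE


def max_broadcast_bytes(energy_budget_joule):
    """Largest message size that fits the budget; 0 if even the fixed cost does not fit."""
    if energy_budget_joule < G_JOULE:
        return 0
    return int((energy_budget_joule - G_JOULE) // F_JOULE_PER_BYTE)
```

For a 1000-byte message the model gives 5.27 × 10^-3 + 0.266 × 10^-3 = 5.536 × 10^-3 Joule, so the per-byte term dominates the fixed cost once messages exceed a few dozen bytes.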
39Experimental MP2P Projects (Pedestrians)
- 7DS: Columbia University (web pages)
- iClouds: Darmstadt Univ. (incentives)
- MoGATU: UMBC (specialized query processing, e.g., collaborative joins)
- PeopleNet: NUS, IIS-Bangalore (mobile commerce, information type → location bazaar)
- MoB: Wisconsin, Cambridge (incentives, information resources, e.g., bandwidth)
- Mobi-Dik: Univ. of Illinois, Chicago (brokering, physical resources, bandwidth/memory/power management)
40Vehicular Projects
- Inter-vehicle Communication and Intelligent Transportation
- CarTALK 2000 is a European project
- VICS (The Vehicle Information and Control System) is a government-sponsored system in Japan with an 11-year track record
- FleetNet, an inter-vehicle communications system, is being developed by a consortium of private companies and universities in Germany
- IVI (Intelligent Vehicle Initiative) and VII (Vehicle Infrastructure Integration), by the US DOT
- MP2P provides data management capabilities on top of these communication systems
- Grassroots, TrafficView, SOTIS, V3: P2P dissemination of traffic info to reduce travel times
41RANk-based DIssemination (RANDI)
- Ranking of reports
- Bandwidth/energy aware
- Exchange enhances
- Consumer functionality
- Broker functionality
- Consumer: answer local query (pull)
- Broker: transmit reports most likely requested by future-encountered peers (push)
- Transmission trigger
- Encounter
- New reports
42RANDI
When two peers meet, they conduct a two-phase
exchange:
- Phase 1: exchange queries and receive answers (pull); the peer is satisfied as a consumer
- Phase 2: exchange more reports using available energy/bandwidth (push); the peer is enhanced as a broker
[Figure: local query and answers in Phase 1, more reports in Phase 2]
Combination of unicast (thin line) and
broadcast (thick lines) to enable overhearing.
43RANDI (Contd)
To solve the problem with static peers: two
interaction modes, which combine pull and push
- Query-response: triggered by discovery of new neighbors
- Relay: triggered by receipt of new reports
- Disseminate to existing neighbors
447DS
P2P mode: each node periodically broadcasts its
query and receives reports from neighboring
peers. No strategy to determine query frequency
and transmission size. Cache management is based on
web-page expiration time.
45PeopleNet
Reports are randomly selected for exchanging and
saving upon encountering.
[Figure: Peer A and Peer B, before exchange and after exchange]
48Mobile Local Search Applications
- transportation
- Announce sudden stop, malfunctioning brake light, patch of ice
- Floating car data
- Dissemination of multi-media traffic information (picture, video, voice)
- Search: close-by taxi customer, parking slot, ride-share
- social networking (wearable website)
- Personal profile of interest at a convention
- Singles matchmaking
- Floating BBS
- mobile electronic commerce
- Sale on an item of interest at mall
- Music-file exchange
- emergency response
- Search for victims in rubble
- asset management and tracking
- Sensors on containers exchange security information → remote checkpoints
- tourist and location-based-services
- Closest ATM
49Applications Common features
- Mobile/stationary peers
- Resources of interest
- in a limited geographic area
- Short time duration
- Can be solved by fixed servers, but
- Unlikely solution
- The proposed MP2P paradigm can enhance the fixed solution
(reliability, performance, coverage)
50MARKET
When two peers meet, they conduct a two-phase
exchange:
- Phase 1: exchange subscriptions and receive answers (pull); the peer is satisfied as a consumer
- Phase 2: exchange more publications using available energy/bandwidth (push); the peer is enhanced as a broker
[Figure: local query and answers in Phase 1, more reports in Phase 2]
Combination of unicast (thin line) and
broadcast (thick lines) to enable overhearing.
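A minimal sketch of the two-phase exchange from the sending peer's side. The report layout, the `match` and `rank` callbacks, and the single byte budget shared by both phases are assumptions for illustration; the slides only say that Phase 2 uses the available energy/bandwidth.

```python
def two_phase_exchange(reports, query, match, rank, budget_bytes):
    """Return (answers, pushed) for one encounter.

    reports: list of dicts with at least the keys "id" and "size".
    Phase 1 (pull): every report matching the other peer's query is an answer.
    Phase 2 (push): remaining reports are sent in descending rank order,
    as long as they fit in what is left of the budget (ASSUMPTION: answers
    are charged against the same budget first).
    """
    answers = [r for r in reports if match(r, query) > 0]
    answered_ids = {r["id"] for r in answers}
    remaining = budget_bytes - sum(r["size"] for r in answers)

    pushed = []
    candidates = [r for r in reports if r["id"] not in answered_ids]
    for r in sorted(candidates, key=rank, reverse=True):
        if r["size"] <= remaining:
            pushed.append(r)
            remaining -= r["size"]
    return answers, pushed
```

Phase 1 serves the other peer as a consumer; Phase 2 spends whatever budget is left on the reports it is most likely to need later as a broker.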
51MARKET (Contd)
To solve the problem with static peers: two
interaction modes, which combine pull and push
- Query-response: triggered by discovery of new neighbors
- Relay: triggered by receipt of new publications
- Disseminate to existing neighbors
52Query in static disconnected network
[Figure: static disconnected network; labels: q, Q, A, r]
In-network query processing may not be possible
53Query in static connected sensor network
[Figure: static connected sensor network; labels: q, Q, A, QA, qA, r]
Data transmission delay is 0.
The answer can be obtained instantaneously
55Query in mobile disconnected network
Query processing is enabled by mobility and
store-and-forward
[Figure: one-hop case; labels: q, A, QA, qA, r]
56Query in mobile disconnected network
[Figure: multi-hop case; labels: q, Q, A, QA, qA, r]
The answer is disseminated only after an answer
node receives the query
Multi-hop case
The query can be processed in-network, but it is
delayed
The query processing algorithm doesn't control
motion.
First stage: query disseminated during encounter