Title: Querying Smartphone Networks with SmartTrace
1Querying Smartphone Networks with SmartTrace
Demetris Zeinalipour, Department of Computer Science, University of Cyprus
Colloquium, Department of Computer Science, University of Pittsburgh, Sennott Square, Seminar Room 5317, 14:00-15:00, Friday, April 29th, 2011.
http://www.cs.ucy.ac.cy/dzeina/
2Acknowledgments
- "Disclosure-free GPS Trace Search in Smartphone Networks", D. Zeinalipour-Yazti, C. Laoudias, M. I. Andreou, D. Gunopulos, 12th Intl. Conf. on Mobile Data Management (MDM'11), IEEE Computer Society, Lulea, Sweden, June 6-9, 2011.
- "SmartTrace: Finding Similar Trajectories in Smartphone Networks without Disclosing the Traces", C. Costa, C. Laoudias, D. Zeinalipour-Yazti, D. Gunopulos, Demo at the 27th IEEE Intl. Conf. on Data Engineering (ICDE'11), Hannover, Germany, 2011.
- Other Related Work
  - "Distributed Spatio-Temporal Similarity Search", D. Zeinalipour-Yazti, et al., In 15th ACM Conference on Information and Knowledge Management (CIKM'06), Arlington, VA, USA, 2006.
  - "Finding the K Highest-Ranked Answers in a Distributed Network", D. Zeinalipour-Yazti, Z. Vagena, D. Gunopulos, V. Kalogeraki, V. Tsotras, M. Vlachos, N. Koudas, D. Srivastava, Computer Networks (ComNet), vol. 53, issue 9, pp. 1431-1449, Elsevier Press, 2009.
3Smartphones
- Smartphone: a mobile device (phone, tablet, slate) that offers more computing ability than a basic feature phone (e.g., one running JavaME) or a dumb phone.
- Computing Ability: CPU, Memory & Storage, Networking, Sensing.
- Example (Motorola Atrix 4G)
  - Processing: 1 GHz dual core
  - RAM & Flash Storage: 1GB & 48GB, respectively
  - Networking: WiFi, 3G (Mbps) / 4G (100Mbps-1Gbps)
  - Sensing: Proximity, Ambient Light, Accelerometer, Microphone, Geographic Coordinates based on AGPS (fine), WiFi or Cellular Towers (coarse).
4Applications of Smartphone Sensors
Camera: Find the right coupons at the right moment!
Microphone: Medical Stethoscope.
Compass / Accelerometer: Augmented Reality
GPS/WiFi/Cell: Smartphone Social Networks
5Road Traffic Mapping (RTM): Past
Mapping Road Traffic is traditionally carried out with fixed cameras & sensors mounted on roadsides
http://www.rta.nsw.gov.au/
6RTM with Smartphone Networks: Future
Opportunistic (w/out user interaction) and Participatory Sensing (w/ user interaction)
Mapping the road traffic by collecting WiFi signals.
Received Signal Strength (RSS): power present in a WiFi radio signal
Graphics courtesy of A. Thiagarajan et al., "VTrack: Accurate, Energy-Aware Road Traffic Delay Estimation using Mobile Phones", In SenSys'09, pages 85-98, ACM (Best Paper), MIT's CarTel Group
7Collecting Trace Data on Smartphones
- Popular Smartphones are already collecting positional information (i.e., user-agnostic sensing)
- Example A (iPhone logs User Positional Data)
  - iPhone collects Longitude / Latitude (or triangulated Cell Tower position) info locally on your smartphone (and iTunes backup).
  - The unencrypted log file is even migrated between devices!
  - Displaying your location history on a Map: http://petewarden.github.com/iPhoneTracker/
- Example B (Android logs/uploads Access Point data)
  - There are rumors that Google uses its Android OS for collecting (wardriving) positional info about WiFi Access Points (APs).
  - When the phone detects a WiFi AP, it sends the BSSID (MAC address) of the router along with signal strength and GPS coordinates over to the Geolocation database at Google
  - This enables a variety of interesting queries (e.g., find the location of your WiFi AP): http://samy.pl/androidmap/
8Collecting Trace Data on Smartphones
Mapping your iPhone locations with the popular
software (points are constrained to a grid, so
the exact location is not revealed in the
visualization)
Circle Size/Color indicates the frequency of visits to a particular spatial location
The availability of such data on a device enables
applications like SmartTrace, presented next.
9Presentation Outline
- Introduction
- System Model and Problem Formulation
- Background on Trajectory Similarity
- The SmartTrace Algorithm
- Experimental Evaluation
- Future Work
- Other Related Research Works
10System Model and Problem Formulation
Find the K most similar trajectories to Q without pulling together all traces at the query node (QN)
11Constraints and Objectives
- Don't Disclose the User's Trajectory to QN
- Social sites are already undergoing significant privacy restructuring (e.g., Google Buzz, Facebook)
- Trajectories are large (270MB/year with 2s samples)
- Minimize Net Traffic and Local Processing
- 3G/4G and WiFi traffic i) depletes smartphone battery and ii) degrades network health
- In 2009, AT&T's customers were affected by the iPhone release.
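As a back-of-the-envelope check of the "270MB/year with 2s samples" figure, the sketch below derives the implied storage per sample. It assumes continuous, year-round logging and 1MB = 10^6 bytes; neither assumption is stated on the slide, so the result is only indicative.

```python
# Rough sizing of a one-year GPS trace (assumptions are flagged inline).
SECONDS_PER_YEAR = 365 * 24 * 3600    # assumes the phone logs continuously all year
SAMPLE_PERIOD_S = 2                   # one sample every 2 seconds (from the slide)
TRACE_BYTES_PER_YEAR = 270 * 10**6    # 270MB, taking 1MB = 10^6 bytes (assumption)

samples_per_year = SECONDS_PER_YEAR // SAMPLE_PERIOD_S
bytes_per_sample = TRACE_BYTES_PER_YEAR / samples_per_year

print(samples_per_year)               # 15768000 samples
print(round(bytes_per_sample, 1))     # 17.1 bytes per sample
```

Roughly 17 bytes per sample is in line with a timestamp plus a coordinate pair, which is why shipping whole traces to QN is costly.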
13Trajectory Similarity Search
- Problem: Compare the query with all the distributed sequences and return the k most similar sequences to the query.
- Similarity between two objects A, B is associated with a distance function (see next)
14System Model and Problem Definition
- Lp-norms are the simplest way to compare trajectories (e.g., Euclidean, Manhattan, etc.)
- Lp-norms are fast (i.e., O(n)), but inaccurate.
  - No flexible matching in time (misses out-of-phase matches).
  - No flexible matching in space (cannot skip outliers).
p=1: Manhattan, p=2: Euclidean
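The weakness of Lp-norms for out-of-phase trajectories can be seen in a minimal sketch. The function name `lp_distance` and the three toy sequences are illustrative assumptions, not part of the talk.

```python
def lp_distance(a, b, p):
    """L_p distance between two equal-length 1D trajectories (O(n) time)."""
    if len(a) != len(b):
        raise ValueError("L_p norms need sequences of equal length")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

# Hypothetical toy data: `shifted` has the same shape as `base`, one step earlier;
# `flat` has a different shape but is aligned in time with `base`.
base = [0, 0, 1, 2, 1, 0, 0]
shifted = [0, 1, 2, 1, 0, 0, 0]
flat = [0, 0, 1, 1, 1, 0, 0]

print(lp_distance(base, shifted, 1))  # Manhattan distance: 4.0
print(lp_distance(base, shifted, 2))  # Euclidean distance: 2.0
print(lp_distance(base, flat, 1))     # Manhattan distance: 1.0
```

The shifted copy of the same shape ends up farther from `base` than a differently shaped sequence, which is the out-of-phase problem that motivates a more flexible measure.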
15Longest Common Subsequence
- Longest Common Subsequence (LCSS): Given strings A and B, LCSS is the longest string that is a subsequence of both A and B
- Extensively utilized in text similarity, e.g.,
  - String: CGATAATTGAGA
  - Substring (contiguous): CGA
  - Subsequence (not necessarily contiguous): AAGAA
- Find the LCSS of the following 1D-trajectories
  - A: 3, 2, 5, 7, 4, 8, 10, 7
  - B: 2, 5, 4, 7, 3, 10, 8, 6
  - LCSS: (2, 5, 4, 7) or (2, 5, 7, 10) or ...
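The 1D example can be checked with the textbook dynamic-programming recurrence for the longest common subsequence. This is a minimal sketch with exact matching; the helper names `lcss_length` and `is_subsequence` are illustrative.

```python
def lcss_length(a, b):
    """Length of the longest common subsequence of a and b (exact matching)."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCSS length of the prefixes a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]

def is_subsequence(s, seq):
    """True if s can be obtained from seq by deleting elements."""
    it = iter(seq)
    return all(x in it for x in s)

A = [3, 2, 5, 7, 4, 8, 10, 7]
B = [2, 5, 4, 7, 3, 10, 8, 6]

print(lcss_length(A, B))                    # 4
print(is_subsequence([2, 5, 4, 7], A))      # True
print(is_subsequence([2, 5, 7, 10], B))     # True
print(is_subsequence("AAGAA", "CGATAATTGAGA"))  # True
```

Both answers listed on the slide have length 4, which matches the optimum computed by the table.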
16Longest Common Subsequence
- A Dynamic Programming algorithm for this problem requires O(|A|*|B|) time.
- However, we can compute it in O(δ(|A|+|B|)) if we limit the matching within a time window of δ.
Processing a trajectory of size |Ai| = 1.8MB requires 111 seconds on a smartphone
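A minimal sketch of the time-window restriction follows: only cells with |i - j| <= δ are evaluated, so the work is proportional to δ times the trajectory length rather than |A|*|B|. The spatial threshold `eps` (two values match when they differ by less than `eps`) and the dictionary-based band are assumptions made for this illustration, not the prototype's implementation.

```python
def lcss_windowed(a, b, delta, eps):
    """LCSS length where a[i] may match b[j] only if |i - j| <= delta
    and |a[i] - b[j]| < eps. Only the cells inside the band are computed."""
    m, n = len(a), len(b)
    table = {}  # sparse DP table: (i, j) -> LCSS length of a[:i], b[:j]

    def cell(i, j):
        # Cells outside the band (or on the border row/column) read as 0.
        return table.get((i, j), 0)

    for i in range(1, m + 1):
        for j in range(max(1, i - delta), min(n, i + delta) + 1):
            if abs(a[i - 1] - b[j - 1]) < eps:
                table[(i, j)] = cell(i - 1, j - 1) + 1
            else:
                # The diagonal term keeps the band edges correct, because the
                # out-of-band neighbour is never larger than it in the full table.
                table[(i, j)] = max(cell(i - 1, j), cell(i, j - 1),
                                    cell(i - 1, j - 1))
    # The last in-band cell carries the final answer even if |m - n| > delta.
    return cell(min(m, n + delta), min(n, m + delta))

A = [3, 2, 5, 7, 4, 8, 10, 7]
B = [2, 5, 4, 7, 3, 10, 8, 6]
print(lcss_windowed(A, B, delta=8, eps=0.5))  # 4 (window covers everything)
print(lcss_windowed(A, B, delta=1, eps=0.5))  # 4
print(lcss_windowed(A, B, delta=0, eps=0.5))  # 1 (only same-position matches)
```

Shrinking δ trades matching flexibility for speed: with δ=0 only the pair of 7s at the same position still matches.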
17LCSS Definition
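For reference, the LCSS measure with a temporal window δ and a spatial threshold ε is commonly written as the recursion below, following the formulation of Vlachos et al. The notation is an assumption of this sketch: Head(A) denotes A without its last element, a_{|A|} and b_{|B|} are the last points of A and B, and the trajectories are taken as 1D for simplicity.

```latex
LCSS_{\delta,\varepsilon}(A,B) =
\begin{cases}
0 & \text{if } A \text{ or } B \text{ is empty},\\[4pt]
1 + LCSS_{\delta,\varepsilon}\big(Head(A),\,Head(B)\big)
  & \text{if } |a_{|A|} - b_{|B|}| < \varepsilon \ \text{and}\ \big||A| - |B|\big| \le \delta,\\[4pt]
\max\big\{ LCSS_{\delta,\varepsilon}(Head(A),\,B),\ LCSS_{\delta,\varepsilon}(A,\,Head(B)) \big\}
  & \text{otherwise.}
\end{cases}
```

Here δ bounds how far apart in time two matched points may be, and ε bounds how far apart they may be in space.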
18LCSS(MBE_Q, Ai): Bounding Above LCSS
"Indexing multi-dimensional time-series with support for multiple distance measures", M. Vlachos, M. Hadjieleftheriou, D. Gunopulos, E. Keogh, In KDD 2003.
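The idea behind the bound can be sketched as follows: build an envelope around Q that covers every value Q takes within δ time steps, widen it by ε, and count how many points of Ai fall inside. Every pair that the exact LCSS could match must fall inside the envelope, so the count can only over-estimate the true score. The function name and the 1D setting are assumptions; for brevity the envelope is recomputed per point here, whereas a sliding-window minimum/maximum would make the whole pass linear.

```python
def mbe_upper_bound(q, a, delta, eps):
    """Upper bound on the windowed LCSS of q and a: the number of points of
    `a` that fall inside the envelope built around `q` (1D sketch)."""
    inside = 0
    for i, value in enumerate(a):
        lo = max(0, i - delta)
        hi = min(len(q) - 1, i + delta)
        if lo > hi:
            continue  # no point of q lies within delta time steps of position i
        window = q[lo:hi + 1]
        # Envelope at time i: [min(window) - eps, max(window) + eps]
        if min(window) - eps < value < max(window) + eps:
            inside += 1
    return inside

Q = [3, 2, 5, 7, 4, 8, 10, 7]
A1 = [2, 5, 4, 7, 3, 10, 8, 6]
print(mbe_upper_bound(Q, A1, delta=1, eps=0.5))  # 6, while the exact windowed LCSS is 4
print(mbe_upper_bound(Q, A1, delta=0, eps=0.5))  # 1
```

The bound is loose (6 versus an exact score of 4 in this toy case) but cheap, which is what lets the phones answer the first round quickly.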
20SmartTrace Algorithm Outline
- An intelligent top-K processing algorithm for identifying the K most similar trajectories to Q in a distributed environment.
- Step A: Conduct the cheap linear-time LCSS(MBE_Q,Ai) computation on the smartphones to approximate the answer.
- Step B: Exploit the approximation to identify the correct answer by iteratively asking specific nodes to conduct LCSS(Q,Ai).
21SmartTrace Algorithm (1/2)
- Input: Query Trajectory Q, m Target Trajectories, Result Preference K (K << m), Iteration Step Increment λ.
- Output: K trajectories most similar to Q.
- At the query node QN:
1. Upper Bound (UB) Computation: Instruct each of the m smartphones to invoke a computation of the linear-time LCSS(MBE_Q,Ai) (i ≤ m).
2. Collection of UB: Receive the UBs of all m trajectories participating in the query and add those scores to the METADATA vector stored at QN. Let METADATA be sorted in descending order based on the UB scores.
22SmartTrace Algorithm (2/2)
3. Full Computation: Ask the next λ highest-UB nodes to compute LCSS(Q,Ai) and then send back their λ full scores.
4. Termination Condition: If the next highest UB is smaller than the K-th largest full match then stop; else go to step 3 in order to identify the next λ candidates.
5. (Tentative) Ship Matching: If the termination condition has been met, tentatively ship the respective matches to QN, based on some local trace disclosure policy.
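The query-node loop described above can be sketched as a small simulation. This is an illustration under stated assumptions rather than the prototype's code: `upper_bounds` stands for the METADATA vector of collected UB scores, `full_score` is a hypothetical callback standing in for asking a node to run LCSS(Q,Ai), the scores are made up, and the final shipping of matches is omitted.

```python
def smarttrace_topk(upper_bounds, full_score, k, lam):
    """Return (top-k [(node, full score)], number of nodes asked for a full score).

    upper_bounds: dict node_id -> UB score collected from the phones
    full_score:   callable node_id -> exact LCSS(Q, Ai), computed on that node
    lam:          iteration step increment (nodes asked per round)
    """
    # METADATA sorted by descending UB
    metadata = sorted(upper_bounds.items(), key=lambda item: item[1], reverse=True)
    full = {}
    pos = 0
    while pos < len(metadata):
        # Full computation: ask the next `lam` highest-UB nodes
        for node_id, _ in metadata[pos:pos + lam]:
            full[node_id] = full_score(node_id)
        pos += lam
        # Termination: next highest UB is smaller than the K-th largest full match
        ranked = sorted(full.values(), reverse=True)
        if len(ranked) >= k and (pos >= len(metadata)
                                 or metadata[pos][1] < ranked[k - 1]):
            break
    top = sorted(full.items(), key=lambda item: item[1], reverse=True)[:k]
    return top, len(full)

# Hypothetical scores: each UB is at least as large as the node's full score.
ub = {"A4": 30, "A2": 25, "A0": 20, "A3": 18, "A9": 12}
exact = {"A4": 23, "A2": 22, "A0": 16, "A3": 15, "A9": 10}
print(smarttrace_topk(ub, exact.get, k=2, lam=2))
# ([('A4', 23), ('A2', 22)], 2): only two of the five nodes ran the full LCSS
```

Because a node's UB never underestimates its full score, any node left unasked when the loop stops cannot beat the K-th full match, so the returned set is exact even though most nodes never run the expensive computation.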
23SmartTrace Execution
Query: Find the K=2 most similar trajectories to Q
Ask A4 and A2 for the computation of LCSS
Stop if Kth LCSS > Last UB
24SmartTrace Protocol
[Figure: message exchange among the Querying Node, the Server (QN) and a Participating Node; numbered steps 1-3 cover the LCSS(MBE_Q,Ai) and LCSS(Q,Ai) computations.]
26Experimental Methodology
- Datasets & Queries
- Oldenburg (Realistic), IAPG Institute, Germany
  - Dataset
    - 2,000 Car Trajectories moving in the city of Oldenburg.
    - Trajectory Length: 11,731 ± 7,193 points
  - Queryset
    - Randomly sampled out of the original dataset with interpolated noise
    - Trajectory Length: 100 points.
- GeoLife (Real), Microsoft Research Asia
  - Dataset
    - 1,100 Human Trajectories over the city of Beijing in the time frame 2007-2009 (1 sample / 5 seconds or 1 sample / 10 meters)
    - Trajectory Length: 190,110 ± 126,590 points
  - Queryset
    - Randomly sampled out of the original dataset with interpolated noise
    - Trajectory Length: 500 points
http://iapg.jade-hs.de/personen/brinkhoff/generator/
http://research.microsoft.com/en-us/projects/geolife/
27Experimental Methodology
- Algorithms
  - Centralized (C): 1) Ship Trajectories to QN; 2) Conduct centralized LCSS(Q,Ai) computation
  - Decentralized (D): 1) Ship Q to all nodes; 2) Conduct the LCSS(Q,Ai) computation locally
  - SmartTrace (ST): 1) Ship Q to all nodes; 2) Conduct the linear-time LCSS(MBE_Q,Ai) computation; 3) Iteratively ask specific nodes to calculate LCSS(Q,Ai)
- Metrics
  - Execution Time (T): The total time to answer the query.
  - Amortized Energy (E) per Device: average energy consumed by a smartphone for answering the query (based on PowerTutor profile, Univ. of Michigan)
- The δ and ε (temporal and spatial matching) parameters are kept constant for all experiments. The values affect the matching granularity, which is similar for all algorithms.
28Experimental Results (Execution Time)
Result I: ST and D are 1 order of magnitude (10x) faster than C.
Expl: ST and D rely mainly on processing while C relies on data transfer, which is slow!
Result II: ST is faster than D (i.e., by 17% and 8%, respectively, for the two datasets).
Expl: Attributed to the variable length of trajectories (i.e., D always compares against the longest trajectory while ST compares against it only if it belongs to the candidate S-set)
29Experimental Results (Energy Consumption)
- Result III
  - C is network-intensive while ST and D are CPU-intensive
  - Expl: ST and D have very little network activity (which accounts for 2.59mJ and 2.29mJ, respectively)
- Result IV
  - ST is 67% more energy efficient than D
  - ST is 81% more energy efficient than C
  - Expl: ST doesn't execute LCSS(Q,Ai) on all nodes.
30Experimental Results (Varying K Parameter)
Result V: Performance results are the same when the preference K is constrained within 1% of the answer set (typical for top-K query processing algorithms).
31Experimental Results (Varying the λ Parameter)
- The λ parameter defines how aggressively ST explores the top-K result set (Higher λ => Faster Convergence)
- Theorem: ST requires O(m/λ) iterations in the worst case, where λ denotes the step increment and m the number of trajectories
Result VI (λ-convergence): Our algorithm converges in 7.6 and 9.3 iterations, on average, for the Oldenburg and GeoLife datasets, respectively.
32Prototype System (GPS)
- SmartTrace: Implemented as a Client-Server text-based protocol
  - Server implemented in Java (4,500 LOC)
  - Client implemented in Java on Android (2,500 LOC + XML files)
[Figure: screenshots of the Query device and participating Devices B and C]
"SmartTrace: Finding Similar Trajectories in Smartphone Networks without Disclosing the Traces", C. Costa, C. Laoudias, D. Zeinalipour-Yazti, D. Gunopulos, Demo at the 27th IEEE Intl. Conf. on Data Engineering (ICDE'11), Hannover, Germany, 2011.
33Prototype System (GPS)
Answer With Trace
Privacy Setting
Answer
34Prototype System (RSS)
The SmartTrace algorithm works equally well for
indoor environments (using RSS)
36Future Work
- Evaluate the SmartTrace prototype system over the SmartNet testbed we are developing.
- Develop extensions that do not require the iterative execution of LCSS(Q,Ai) but can postpone it to a final post-processing step.
- Develop new Similarity Measures for (Highly Dimensional) RSS Trajectories.
- Develop a killer application for our algorithm and deploy the executable APK on Google Market to gain further experience with it. Possibly also develop a client for iPhone devices.
37SmartTrace Applications
- Our framework finds applications in a wide range of domains
- Intelligent Transportation Systems: Find whether a new bus route is similar to the trajectories of K other users.
- Social Networks: Find whether there is a cycling route from MoMA to the Juilliard
  - GeoLife, GPS-Waypoints, Sharemyroutes, etc. offer centralized counterparts.
- Habitat Monitoring: Find the zebras that moved most similarly to zebra X before it got injured.
39SmartNet Programming Cloud
- Currently, there are no testbeds (like MoteLab, PlanetLab) for realistically emulating and prototyping Smartphone Network applications and protocols at a large scale.
- Currently applications are tested in emulators.
- Drawbacks
  - Sensors are not emulated.
  - It is difficult to concurrently
    - re-program several devices
    - between the devices.
- The MobNet project (at UCY, 2010-2012) will develop an innovative cloud testbed of mobile sensor devices using 50 Android devices.
40SmartNet Programming Cloud
SmartNet
Install APK, Upload File, Reboot,
Programming cloud for the development of smartphone network applications & protocols as well as experimentation with real smartphone devices.
41SmartNet Programming Cloud
42SmartP2P: Peer-to-Peer Search in Smartphone Networks
Finding objects (e.g., images, videos, etc.) in a social neighborhood, without the necessity of having the objects disclosed to the social network provider.
"Multi-Objective Query Optimization in Smartphone Networks", A. Konstantinidis, D. Zeinalipour-Yazti, P. Andreou, G. Samaras, 12th International Conference on Mobile Data Management (MDM'11) (Short Paper), IEEE Computer Society, Lulea, Sweden, June 6-9, 2011.
43PROXIMITY: Finding Close-by Smartphones
- Problem: Identifying geographically close-by devices continuously for all smartphones.
- Constraints
  - Privacy: Users do not want to expose their precise location (we utilize location obfuscation techniques)
  - Complexity: Computing the above answers for millions of devices takes time, while the answer needs to be ready every few seconds.
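One simple way to combine the two constraints is to let each device report only a coarse grid cell and to look for neighbours among devices in the same or adjacent cells. The sketch below is a generic illustration of grid-based obfuscation, not the technique used in PROXIMITY; the cell size, function names and coordinates are all hypothetical.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # hypothetical cell size in degrees (about 1.1 km of latitude)

def to_cell(lat, lon, cell_deg=CELL_DEG):
    """Obfuscate a precise position into the index of its grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def nearby_candidates(reports):
    """reports: dict device_id -> cell. For each device, return the devices
    reported in the same cell or in one of its 8 neighbouring cells."""
    by_cell = defaultdict(set)
    for device, cell in reports.items():
        by_cell[cell].add(device)
    result = {}
    for device, (cx, cy) in reports.items():
        close = set()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                close |= by_cell.get((cx + dx, cy + dy), set())
        close.discard(device)
        result[device] = close
    return result

# Made-up positions; only the cell indices would leave each phone.
reports = {
    "a": to_cell(35.1456, 33.4123),
    "b": to_cell(35.1478, 33.4181),
    "c": to_cell(35.1562, 33.4267),
    "d": to_cell(35.3012, 33.9034),
}
print(nearby_candidates(reports))
```

Grouping by cell keeps the work per device proportional to the occupancy of nine cells instead of the total number of devices, at the cost of returning candidates rather than exact distances.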
44PROXIMITY: Finding Neighboring Devices
Application: Proximity Chat
45Querying Smartphone Networks with SmartTrace
Thanks! Questions?
Demetris Zeinalipour, Department of Computer Science, University of Cyprus
http://www.cs.ucy.ac.cy/dzeina/