Title: Dimensionality Reduction in Sensor Networks
1 Dimensionality Reduction in Sensor Networks
- Alfred O. Hero
- Dept. EECS, Dept BME, Dept. Statistics
- University of Michigan - Ann Arbor
hero_at_eecs.umich.edu - http://www.eecs.umich.edu/hero
Boston University, May 2006
- Sensor net applications
- The importance of dimensionality reduction
- Sensor localization via DR
- Distributed Weighted Multi-dimensional scaling
- Laplacian Eigenmaps with Adaptive Neighbors (LEAN)
- Anomaly detection via DR
- Conclusions
2 Acknowledgements
- Sensor Net collaborators
- (PG) Randy Moses, Rob Nowak, Raviv Raich, Neal Patwari, Jose Costa, Doron Blatt
- (G) Kevin Carter, Clyde Shih, Derek Justice
- (UG) Adam Pacholski, Jionglin Wu
- (K12) Panna Felsen, Abiola Adatero
- Sensor net sponsors
- NSF ITR program (J. Cozzens)
- DARPA ISP program (D. Cochran, C. Schwartz)
- AFOSR MURI program (J. Tagney)
- ARL (B. Sadler)
- Motorola (J. Correal)
- Raytheon (H. Schmitt)
3 Sensor Network Applications
- Environmental monitoring and localization
- Internet monitoring and anomaly detection
- Internally sensed tomography and endpoint estimation
- Intruder detection and surveillance
- Multiple source tracking with sensor swarms
4 Dimensionality Bottlenecks
- Data dimension
- Sensor response variables Y
- 1,000,000 samples of an EM/acoustic field on each of N sensors
- 1024^2 pixels of a projected image on an IR camera sensor
- N^2 expansion factor to account for all pairwise correlations
- Latent variables S
- 250 targets with 6-dimensional states, each with 10 possible labels
- 1024^3 image volume
- 1000 behavior patterns
- Information dimension
- Number of free parameters describing probability densities f(Y) or f(S|Y)
- For known statistical model: info dim = model dim
- For unknown model: info dim = dim of density approximation
- Parametric-model driven dimension reduction
- DR by sufficiency, DR by maximum likelihood, DR by ancillarity
- Data-driven dimension reduction
- Manifold learning, structure discovery
5 Two Geometries to Consider
Manifold Embedding
- (Non-metric) information geometry
Domain
are i.i.d. samples from
6 Data-driven DR
- Data-driven projection to lower dimensional subspace
- Extract low-dim structure from high-dim data
- Data may lie on curved (but locally linear) subspace
[1] Josh B. Tenenbaum, Vin de Silva, and John C. Langford, "A Global Geometric Framework for Nonlinear Dimensionality Reduction", Science, 22 Dec. 2000.
[2] Jose Costa, Neal Patwari, and Alfred O. Hero, "Distributed Weighted Multidimensional Scaling for Node Localization in Sensor Networks", IEEE/ACM Trans. Sensor Networks, to appear 2005.
[3] Misha Belkin and Partha Niyogi, "Laplacian eigenmaps for dimensionality reduction and data representation", Neural Computation, 2003.
7 Application: Cooperative Localization
- Use measurements made between pairs of unknown-location devices to self-localize
- Time-of-Arrival (TOA)
- Received Signal Strength (RSS)
- Connectivity (Proximity)
- Quantized RSS (QRSS)
- Angle-of-Arrival (AOA)
8 Manifold Learning for Localization
Figure: localization examples citing [6], [4], [5]
[4] Y. Shang, W. Ruml, Y. Zhang, M.P.J. Fromherz, "Localization from mere connectivity", in MobiHoc '03, June 2003, pp. 201-212.
[5] N. Patwari, A.O. Hero III, "Adaptive neighborhoods for manifold learning-based sensor localization", IEEE SPAWC 2005, June 2005.
[6] J. Costa, N. Patwari, A.O. Hero III, "Distributed Weighted Multidimensional Scaling for Node Localization in Sensor Networks", IEEE/ACM Trans. Sensor Networks, (submitted) June 2004.
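Sketch: localization from pairwise ranges can be illustrated with classical (centralized, unweighted) MDS. This is not the dwMDS algorithm of [6], which is distributed and weighted; it is only a minimal illustration under the assumption that a full, noiseless distance matrix is available. The function name classical_mds and the synthetic 8-node layout are hypothetical.

```python
import numpy as np

def classical_mds(dist, dim=2):
    """Embed points in `dim` dimensions from a full matrix of pairwise distances."""
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # Keep the `dim` largest eigenpairs; clip tiny negative eigenvalues to zero.
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

# Synthetic example (assumed data): 8 nodes in a 10 x 10 area.
rng = np.random.default_rng(0)
true_xy = rng.uniform(0.0, 10.0, size=(8, 2))
dist = np.linalg.norm(true_xy[:, None, :] - true_xy[None, :, :], axis=-1)

est_xy = classical_mds(dist, dim=2)
est_dist = np.linalg.norm(est_xy[:, None, :] - est_xy[None, :, :], axis=-1)
```

The recovered coordinates match the true ones only up to rotation, reflection, and translation, so the check is made on pairwise distances rather than on positions.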
9 Iterative self-localization algorithm
10 dwMDS: RSS measurements
When initialized with a nearest-neighbor (NN) oracle, dwMDS is unbiased and comes close to the Cramér-Rao bound (CRB). Without the oracle, NNs are estimated by in-range neighbors. First-stage dwMDS location estimates have high bias. Two-stage dwMDS attains performance similar to single-stage dwMDS with the NN oracle.
11 LEAN: Connectivity
Figure: connectivity-based localization results, citing [7]
[7] Y. Shang, W. Ruml, Y. Zhang, M.P.J. Fromherz, "Localization from mere connectivity", in MobiHoc '03, June 2003, pp. 201-212.
12 Application: Internet anomaly detection
- Measurements: Distribution of traffic (5 min)
- From each sensor (router) in space and time
Figure: Abilene Network, 11 routers, backbone of US .edu / research network
Figure labels: Source IP s, Destination IP d, Port p
- Related Work
- Subspace-based decomposition [8]
[8] A. Lakhina, M. Crovella, C. Diot, "Mining Anomalies Using Traffic Feature Distributions", ACM SIGCOMM 2005, Aug. 2005.
13 Internet anomaly detection: background
- Anomalies: Worm outbreaks, DoS attacks, intrusion activity (scans)
- Monitor: Collect data from sensors (routers) in space and time
- Hypothesis: Anomalies will change distribution of traffic across sensors
- Distribution: traffic by src/dst port, IP addresses, packet sizes, etc.
- Problem: How to find anomalous relationships across space and time?
[9] N. Patwari, A. O. Hero, A. Pacholski, "Manifold Learning Visualization of Network Traffic Data", ACM Wksp on Mining Net. Data (MineNet05), Aug 2005.
14 Spatial degrees of freedom
- Spatio-temporal measurement vector
15 Intrinsic dimension estimation
- Scree plots
- Plot residual fitting errors of SVD, Isomap, LE, LLE
- Kolmogorov/entropy/correlation dimension
- Box counting, sphere packing (Liebovitch and Toth 1989)
- Maximum likelihood
- Poisson approximation to binomial (Levina and Bickel 2004)
- Entropic graphs
- Spanner-graph length approximation to entropy functional (Costa and Hero 2003)
Figure: ISOMAP residual curve, with a candidate knee marked ("Knee?")
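Sketch: the maximum likelihood approach listed above (Levina and Bickel) estimates dimension at each point from ratios of k-nearest-neighbor distances and averages over points. The version below is a minimal sketch using the 1/(k-1) normalization; the function name mle_intrinsic_dim, the choice k = 10, and the synthetic data (a 2-D patch embedded isometrically in 5-D) are assumptions made for illustration.

```python
import numpy as np

def mle_intrinsic_dim(points, k=10):
    """Average per-point MLE of intrinsic dimension from k-NN distance ratios."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Sort each row; column 0 is the zero self-distance, so skip it.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    # log(T_k / T_j) for j = 1..k-1, where T_j is the j-th neighbor distance.
    log_ratio = np.log(knn[:, -1:] / knn[:, :-1])
    per_point = (k - 1) / log_ratio.sum(axis=1)
    return float(per_point.mean())

# Synthetic example (assumed data): uniform samples on a 2-D patch in 5-D.
rng = np.random.default_rng(0)
latent = rng.uniform(size=(500, 2))
basis, _ = np.linalg.qr(rng.normal(size=(5, 2)))  # orthonormal 5 x 2 basis
points = latent @ basis.T

dim_hat = mle_intrinsic_dim(points, k=10)
```

For small k this normalization is known to be biased slightly upward, so the estimate on this example should land a little above 2 rather than exactly on it.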
16 Intrinsic Dimension Estimation
- Lakhina, Crovella, Diot: Subspace-based detection of traffic anomalies [8]
- Intrinsic dim. estimation via kNN entropic graphs [10]
Figure: Data set of 7, total packets by link, has dimension between 4 and 5
[10] J.A. Costa, A.O. Hero, "Geodesic Entropic Graphs for Dimension and Entropy Estimation in Manifold Learning", IEEE Trans. on Signal Processing, vol. 52, no. 8, pp. 2210-2221, August 2004.
17 Dimension-based Anomaly detection
- The k-NN algorithm is more sensitive to small complexity changes than the Maximum Likelihood algorithm [11]
[11] E. Levina and P. Bickel, "Maximum likelihood estimation of intrinsic dimension", Neural Information Processing Systems (NIPS), Vancouver, CA, Dec. 2004.
18 Clustering router flows: spatial
- Sensors at routers measure flows per source IP address
- 07-Jan-2005 during 15:45-15:50 UTD
- Packets are sampled 1/100
- Last 11 bits zeroed for privacy -> data are 2^21-length (sparse) vectors
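Sketch: zeroing the last 11 bits of a 32-bit IPv4 source address leaves 21 significant bits, which is where the 2^21-length vector comes from. The snippet below is a minimal illustration of building such a sparse flow histogram; the function name flow_histogram and the example addresses (documentation ranges) are hypothetical, and the actual Abilene preprocessing is not described on the slide.

```python
import ipaddress
from collections import Counter

PREFIX_BITS = 21  # 32 address bits minus the 11 zeroed low-order bits

def flow_histogram(source_ips):
    """Count flows per anonymized source prefix.

    Keys are indices into a sparse vector of length 2**21.
    """
    counts = Counter()
    for ip in source_ips:
        addr = int(ipaddress.IPv4Address(ip))
        # Dropping the low 11 bits maps every address in a block to one bin.
        counts[addr >> (32 - PREFIX_BITS)] += 1
    return counts

# Assumed example flows: the first three fall in the same 2048-address block.
flows = ["192.0.2.1", "192.0.2.200", "192.0.7.9", "198.51.100.7"]
hist = flow_histogram(flows)
```

Storing only the nonzero bins, as the Counter does here, is one simple way to handle the sparsity noted on the slide.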
19 Dynamic Sensor Maps
- Typical router map, 18-Jan 17:00 UTD
- Sensors (routers) as positioned by dwMDS
- Coordinates are normalized (flows) so are unitless
- Lines show physical Abilene links
- Small dots (- - -) show distance from 4-week mean coord
20 Maps Respond to Anomalous Traffic
- Wed. 19-Jan 2005, 0:00-1:00 UTD
- At 0:30, 0:35: large network scan
- 22,000 anomalous flows observed at STTL, DNVR, KSCY, IPLS, ATLA
- 60-byte, TCP
- From a few Miss. State U. IPs, Src Port < 1024
- To range of Microsoft IPs, Dest Port 113
21 Pure Time Series: Small Change
- Abilene Backbone Total Flows, by router
- 18-19 Jan
Network Scan
22 Anomaly Detection Algorithm
- Multivariate t-test comparing the current coords to a history of coordinates
- Declare alarm when t-value exceeds threshold
- E.g., 18-19 Jan-05
Figure annotations: Network Scan; (2) 45k-flow port scan from .tw to .dk; (3) 46k-flow port scan from .tw to .pl
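Sketch: one common form of a multivariate t-test is a Hotelling-style T^2 statistic, the squared Mahalanobis distance of the current coordinates from the historical mean. The slide does not specify the exact statistic, so this form, the function name t2_statistic, the synthetic history, and the threshold of 20 are all assumptions made for illustration.

```python
import numpy as np

def t2_statistic(history, current):
    """Squared Mahalanobis distance of `current` from the history of coordinates."""
    mean = history.mean(axis=0)
    cov = np.cov(history, rowvar=False)  # sample covariance, rows are time steps
    diff = current - mean
    return float(diff @ np.linalg.solve(cov, diff))

# Assumed data: 200 past 2-D map coordinates for one router.
rng = np.random.default_rng(1)
history = rng.normal(size=(200, 2))

typical = np.array([0.1, -0.2])   # close to the historical mean
shifted = np.array([6.0, 6.0])    # far from the historical mean
threshold = 20.0                  # hypothetical alarm threshold

alarm_typical = t2_statistic(history, typical) > threshold
alarm_shifted = t2_statistic(history, shifted) > threshold
```

In practice the threshold would be set from the desired false-alarm rate rather than fixed by hand as it is here.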
23 Clustering router flows: temporal
- Before: Sensor on all routers, for one 5-min interval
- Now: Sensor on one router, for each 5-min interval, over 24 hours
- How does traffic distribution change over time?
- Flows by source IP
- During 2-Jan-05
- Using Isomap
- Credit: Jionglin Wu
24 Application: Wireless Motion Detection Sensor Net
- Hypothesis: Movement changes RF propagation channel
- Issues: Low SNR, missing data, antennas move in wind
- Normal variations: Battery powers, frequency hopping
- Method: Use N sensors to measure O(N^2) channels
- Experiment: gridded SN on unmowed grass, N = 15
- H0: No motion in deployment area
- H1: Person walks through/around deployment area
Figure: Deployment test in motion condition
Picture: Crossbow mica2 sensors
25 Geometric Entropy Minimization
- Minimum-entropy-sets [14] and anomaly detection
- Equivalent to level-set and minimum-volume-set tests [15] for Lebesgue densities
- UMP for testing composite hypotheses
[14] A. Hero and O. Michel, "Asymp. theory of minimal k-point random graphs", IEEE Trans. IT, 1999.
[15] C. Scott and R. Nowak, "Learning minimum volume sets", JMLR 2006.
26 GEM vs. UMP test
27 Geometric Entropy Minimization (GEM)
- GEM learns minimum volume set of given probability [16]
- Sliding window draws latest 100 samples from H0
- Level of significance: 0.001
Figure: GEM score (1-p) versus sample number (2 sps) and time (secs), with anomalies from GEM marked against the ground truth of motion (In Motion / No Motion)
[16] A. Hero, N. Patwari, J. Costa, "GEM for non-parametric anomaly detection", NIPS-06.
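Sketch: a rough way to picture a minimum-volume-set test on a sliding window is to pool the test sample with the window, score every point by the total length of its k-nearest-neighbor edges, and flag the test sample if it is among the points with the largest scores. This greedy kNN pruning is an assumed simplification for illustration, not the GEM algorithm of [16]; the names knn_edge_sums and is_anomaly, the values k = 5 and n_reject = 1, and the synthetic window are hypothetical.

```python
import numpy as np

def knn_edge_sums(points, k):
    """Sum of distances from each point to its k nearest neighbors."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Column 0 after sorting is the zero self-distance, so skip it.
    return np.sort(dist, axis=1)[:, 1:k + 1].sum(axis=1)

def is_anomaly(window, test_point, k=5, n_reject=1):
    """Flag `test_point` if it is among the `n_reject` most isolated pooled points."""
    pooled = np.vstack([window, test_point])
    sums = knn_edge_sums(pooled, k)
    rejected = np.argsort(sums)[-n_reject:]   # indices with the largest edge sums
    return bool((len(pooled) - 1) in rejected)  # test point is the last row

# Assumed data: a window of the latest 100 samples under H0.
rng = np.random.default_rng(2)
window = rng.normal(size=(100, 2))

flag_far = is_anomaly(window, np.array([10.0, 10.0]))
flag_center = is_anomaly(window, np.array([0.0, 0.0]))
```

Here n_reject stands in for the significance level: rejecting fewer points corresponds to a smaller false-alarm rate.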
28 Conclusions
- Any modeling of sensor data produces DR
- Data-driven DR can be useful
- Estimator of required dimension is essential
- Distributed DR is feasible
- Other directions
- Blind calibration: calibration-while-track
- Folding in semi-parametric likelihood models
- Accounting for energy/bandwidth/throughput constraints
- Resource allocation and sensor management: POMDP, RL, CROPS