Title: Querying the Physical World
1Querying the Physical World
- Joe Hellerstein
- UC Berkeley Intel Research Berkeley
- Wei Hong
- Intel Research Berkeley
- Sam Madden
- M.I.T.
2Motivation
- Sensor networks (aka sensor webs, emnets) are here
- Several widely deployed HW/SW platforms
- Low power radio, small processor, RAM/Flash
- Variety of (novel) applications: scientific, industrial, commercial
- Great experimental platform for ubicomp, mobility
- Cross-cutting technical challenges
- Networking, systems, languages, databases, AI, stats, signal proc.
- We will summarize
- The state of the art
- Our experiences building TinyDB
- Current and future research directions
Berkeley Mote
3Some Sensornet Apps
smart cooling in data centers (hp/intel)
redwood forest microclimate monitoring
http://www.hpl.hp.com/research/dca/smart_cooling/
And More
condition-based maintenance (intel/BP)
- Homeland security
- Container monitoring
- Mobile environmental apps
- Bird tracking
- Zebranet
- Home automation
- Etc!
structural integrity (ucb/ggbd)
4Context The Berkeley Stack
5Declarative Queries
- Programming TinyOS Apps is Hard
- Embedded x Distributed Programming!
- Limited power budget
- Lossy, low bandwidth communication
- Require long-lived, zero admin deployments
- Distributed Algorithms
- Limited tools, debugging interfaces
- Queries abstract away much of the complexity
- Burden on the query engine developer
- Users get
- Safe, optimizable programs
- Freedom to think about apps instead of details
6TinyDB Query Engine
- Continuous variant of SQL: TinySQL
- Power and data-acquisition based in-network optimization framework
- Extensible interface for aggregates, new types of sensors
7Agenda
- Sensornet Background
- TinyDB
- Data Model and Query Language
- Software Architecture
- TASK
- Quick overview
- A Flavor of Research Challenges
8Sensor Networks a hot topic
- New university courses
- New conferences
- ACM SenSys, IEEE IPSN, etc.
- New industrial research lab projects
- Intel, PARC, MSR, HP, Accenture, etc.
- Startup companies
- Crossbow, Dust, Ember, Sensicast, Moteiv, etc.
- Media Buzz
- Over 30 news articles since July 2002 covering Intel-Berkeley/UC Berkeley sensor network activities
- One of 10 emerging technologies that will change the world (MIT Technology Review)
9A Brief History of Sensornets
- People have used sensors for a long time
- Recent CS History
- (1998) Pottie & Kaiser: Radio based networks of sensors
- (1998) Pister et al.: Smart Dust
- Initial focus on optical communication
- By 1999, radio based networks, COTS Dust, Motes
- (1999) Estrin & Govindan
- Ad-hoc networks of sensors
- (2000) Culler/Hill et al.: TinyOS Motes
- (2000) Bonnet/Seshadri: Device Database Systems
- (2002) Madden/Franklin/Hellerstein/Hong: TinyDB
- (2002) Hill / Dust: SPEC, mm^3 scale computing
- UCLA / USC / Berkeley Continue to Lead Research
- Many other players now
- TinyOS/Motes as most common platform
- Emerging commercial space
- Crossbow, Ember, Dust, Sensicast, Moteiv, Intel
10Why Now?
- Commoditization of radio hardware
- Cellular and cordless phones, wireless communication
- Low cost -> many/tiny -> new applications!
- Real application for ad-hoc network research from the late 90s
- Coming together of EE and CS communities
11Motes
12History of Motes
- Initial research goal wasn't hardware
- Has since become more of a priority with emerging hardware needs, e.g.
- Power consumption
- (Ultrasonic) ranging and localization
- MIT Cricket, NEST Project
- Connectivity with diverse sensors
- UCLA sensor board
- Even so, now on the 5th generation of devices
- Costs down to $50/node (Moteiv, Dust)
- Greatly improved radio quality
- Multitude of interfaces USB, Ethernet, CF, etc.
- Variety of form factors, packages
13Motes vs. Traditional Computing
- Embedded OS
- Lossy, Adhoc Radio Communication
- Sensing Hardware
- Severe Power Constraints
14NesC/TinyOS
- NesC a C dialect for embedded programming
- Components, wired together
- Quick commands and asynch events
- TinyOS a set of NesC components
- hardware components
- ad-hoc network formation and maintenance
- time synchronization
Think of the pair as a programming environment
15Radio Communication
- Low Bandwidth Shared Radio Channel
- 40 kbits/sec on motes
- Much less in practice
- Encoding, Contention for Media Access (MAC)
- Very lossy: 30% base loss rate
- Argues against TCP-like end-to-end retransmission
- And for link-layer retries
- Generally, not well behaved
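A rough calculation shows why a high loss rate argues for link-layer retries. This is a sketch under stated assumptions: losses are independent at 30% per hop, acknowledgements are never lost, and the 5-hop path is an illustrative choice rather than a figure from the talk. The function names are hypothetical.

```python
def end_to_end_success(p_loss: float, hops: int) -> float:
    """Probability that one packet crosses every hop with no retries."""
    return (1.0 - p_loss) ** hops


def expected_tx_link_layer(p_loss: float, hops: int) -> float:
    """Expected transmissions when each hop retries until it succeeds."""
    return hops / (1.0 - p_loss)


def expected_tx_end_to_end(p_loss: float, hops: int) -> float:
    """Expected transmissions when any loss restarts from the source.

    Assumes 0 < p_loss < 1.
    """
    q = 1.0 - p_loss
    per_attempt = (1.0 - q ** hops) / (1.0 - q)  # hops tried per attempt
    attempts = 1.0 / q ** hops                   # geometric number of attempts
    return per_attempt * attempts


if __name__ == "__main__":
    p, h = 0.30, 5  # 30% loss per hop; the hop count is an assumption
    print(f"no-retry delivery probability: {end_to_end_success(p, h):.3f}")
    print(f"link-layer retries: {expected_tx_link_layer(p, h):.1f} transmissions")
    print(f"end-to-end retries: {expected_tx_end_to_end(p, h):.1f} transmissions")
```

Under these assumptions the end-to-end scheme needs more than twice as many transmissions as per-hop retries, and the gap widens with path length.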
16Types of Sensors
- Sensors attach via daughtercard
- Weather
- Temperature
- Light x 2 (high intensity PAR, low intensity, full spectrum)
- Air Pressure
- Humidity
- Vibration
- 2 or 3 axis accelerometers
- Tracking
- Microphone (for ranging and acoustic signatures)
- Magnetometer
- GPS
- RFID Reader
17Non-Volatile Storage
- EEPROM
- 512K off chip, 32K on chip
- Writes at disk speeds, reads at RAM speeds
- Interface: random access, read/write 256 byte pages
- Maximum throughput: 10 Kbytes / second
- MatchBox Filing System
- Provides a Unix-like file I/O interface
- Single, flat directory
- Only one file being read/written at a time
18Power Consumption and Lifetime
- Power typically supplied by a small battery
- 1000-2000 mAH
- 1 mAH = 1 milliamp current for 1 hour
- Typically at optimum voltage, current drain rates
- Power = Watts (W) = Amps (A) * Volts (V)
- Energy = Joules (J) = W * time
- Lifetime, power consumption varies by application
- Processor: 5 mA active, 1 mA idle, 5 uA sleeping
- Radio: 5 mA listen, 10 mA xmit/receive, 20 ms / packet
- Sensors: 1 uA -> 100s mA, 1 us -> 1 s / sample
19Energy Usage in A Typical Data Collection Scenario
- Each mote collects 1 sample of (light, humidity) data every 10 seconds, forwards it
- Each mote can hear 10 other motes
- Process
- Wake up, collect samples (1 second)
- Listen to radio for messages to forward (1 second)
- Forward data
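The scenario can be turned into a back-of-the-envelope lifetime estimate. The sketch below uses the processor and radio currents quoted on the power slide. It makes several assumptions that are not from the talk: sensor current is ignored, the radio is off while sampling and sleeping, each mote sends one 20 ms packet per epoch, and the battery holds 2000 mAh. All names are hypothetical.

```python
CPU_ACTIVE_MA = 5.0      # processor, active
CPU_SLEEP_MA = 0.005     # processor, sleeping (5 uA)
RADIO_LISTEN_MA = 5.0    # radio, listening
RADIO_XMIT_MA = 10.0     # radio, transmitting
PACKET_S = 0.020         # 20 ms per packet


def average_current_ma(epoch_s=10.0, sample_s=1.0, listen_s=1.0, packets=1):
    """Average current over one epoch in mA (sensor draw ignored)."""
    xmit_s = packets * PACKET_S
    sleep_s = epoch_s - sample_s - listen_s - xmit_s
    charge_ma_s = (
        sample_s * CPU_ACTIVE_MA                        # wake up and sample
        + listen_s * (CPU_ACTIVE_MA + RADIO_LISTEN_MA)  # listen for traffic
        + xmit_s * (CPU_ACTIVE_MA + RADIO_XMIT_MA)      # forward data
        + sleep_s * CPU_SLEEP_MA                        # sleep for the rest
    )
    return charge_ma_s / epoch_s


def lifetime_days(battery_mah, avg_ma):
    """Battery capacity divided by average draw, converted to days."""
    return battery_mah / avg_ma / 24.0


if __name__ == "__main__":
    avg = average_current_ma()
    print(f"average current: {avg:.2f} mA")
    print(f"lifetime on 2000 mAh: {lifetime_days(2000.0, avg):.0f} days")
```

The two awake seconds dominate the budget in this model, which is why shortening the waking period matters far more than trimming the transmission itself.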
20Sensors Slow, Power Hungry, Noisy
21Agenda
- Sensornet Background
- TinyDB
- TinyDB Overview
- Data Model and Query Language
- Demo
- TinyDB Java API and Scripting
- Demo with TinyDB GUI
- TinyDB Internals
- Extending TinyDB
- TinyDB Status and Roadmap
- TASK
- A Flavor of Research Challenges
22TinyDB Revisited
SELECT MAX(mag) FROM sensors WHERE mag > thresh SAMPLE PERIOD 64ms
- High level abstraction
- Data centric programming
- Interact with sensor network as a whole
- Extensible framework
- Under the hood
- Intelligent query processing: query optimization, power efficient execution
- Fault mitigation: automatically introduce redundancy, avoid problem areas
App
Query, Trigger
Data
TinyDB
23Feature Overview
- Declarative SQL-like query interface
- Metadata catalog management
- Multiple concurrent queries
- Network monitoring (via queries)
- In-network, distributed query processing
- Extensible framework for attributes, commands and aggregates
- In-network, persistent storage
24Architecture
[Figure: PC side (TinyDB GUI, JDBC, TinyDB Client API, DBMS) and mote side (TinyDB query processor on sensor network motes 0-8)]
25Query Language (TinySQL)
- SELECT <aggregates>, <attributes>
- FROM sensors | <buffer>
- WHERE <predicates>
- GROUP BY <exprs>
- SAMPLE PERIOD <const> | ONCE
- INTO <buffer>
- TRIGGER ACTION <command>
26Demo Time
27TinySQL Examples
Find the sensors in bright nests.
Sensors
- SELECT nodeid, nestNo, light
- FROM sensors
- WHERE light > 400
- EPOCH DURATION 1s
28TinySQL Examples (cont.)
Count the number of occupied nests in each loud region of the island.
29Event-based Queries
- ON event SELECT
- Run query only when interesting events happen
- Event examples
- Button pushed
- Message arrival
- Bird enters nest
- Analogous to triggers but events are user-defined
30Query over Stored Data
- Named buffers in Flash memory
- Store query results in buffers
- Query over named buffers
- Analogous to materialized views
- Example
- CREATE BUFFER name SIZE x (field1 type1, field2 type2, ...)
- SELECT a1, a2 FROM sensors SAMPLE PERIOD d INTO name
- SELECT field1, field2, ... FROM name SAMPLE PERIOD d
31Using the Java API
- SensorQueryer
- translateQuery() converts TinySQL string into TinyDBQuery object
- Static query optimization
- TinyDBNetwork
- sendQuery() injects query into network
- abortQuery() stops a running query
- addResultListener() adds a ResultListener that is invoked for every QueryResult received
- removeResultListener()
- QueryResult
- A complete result tuple, or
- A partial aggregate result, call mergeQueryResult() to combine partial results
- Key difference from JDBC: push vs. pull
32Writing Scripts with TinyDB
- TinyDB's text interface
- java net.tinyos.tinydb.TinyDBMain run select
- Query results printed out to the console
- All motes get reset each time new query is posed
- Handy for writing scripts with shell, perl, etc.
33Inside TinyDB
Multihop Network
Query Processor
10,000 Lines Embedded C Code
5,000 Lines (PC-Side) Java
3200 Bytes RAM (w/ 768 byte heap)
58 kB compiled code (largest TinyOS program)
Filter: light > 400
Schema
TinyOS
TinyDB
34Tree-based Routing
- Tree-based routing
- Used in
- Query delivery
- Data collection
- In-network aggregation
- Parent Selection
- Goals
- Ensure good connectivity
- Maintain loop-free
- Techniques
- First person you hear from
- Snoop on others, keep track of promising
contenders, switch as appropriate
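The "snoop, track contenders, switch as appropriate" idea can be sketched as a small selection rule. This is an illustration, not TinyDB's routing code: the Neighbor record, the link-quality estimate, the depth-based loop-avoidance rule and the switching margin are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Neighbor:
    node_id: int
    depth: int       # hops to the root, as advertised by the neighbor
    quality: float   # snooped link-quality estimate in [0, 1]


def choose_parent(my_depth: float,
                  current: Optional[Neighbor],
                  neighbors: List[Neighbor],
                  switch_margin: float = 0.1) -> Optional[Neighbor]:
    """Pick a parent among neighbors that are strictly closer to the root."""
    # Only accepting shallower neighbors keeps the tree loop-free,
    # provided the advertised depths are consistent.
    candidates = [n for n in neighbors if n.depth < my_depth]
    if not candidates:
        return current
    best = max(candidates, key=lambda n: n.quality)
    if current is None or current.depth >= my_depth:
        return best
    # Switch only for a clear improvement, to avoid flapping between parents.
    if best.quality > current.quality + switch_margin:
        return best
    return current
```

A node that has just joined can pass `float("inf")` as its depth, which reduces the rule to taking the best link it has heard so far.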
35Power Management Approach
- Coarse-grained app-controlled communication scheduling
[Figure: timeline for motes 1-5; each epoch (10s-100s of seconds) contains a 2-4s waking period, and motes sleep (zzz) otherwise]
36Time Synchronization
- All messages include a 5 byte time stamp indicating system time in ms
- Synchronize (e.g. set system time to timestamp) with
- Any message from parent
- Any new query message (even if not from parent)
- Punt on multiple queries
- Timestamps written just after preamble is xmitted
- All nodes agree that the waking period begins when (system time % epoch dur = 0)
- And lasts for WAKING_PERIOD ms
- Adjustment of clock happens by changing duration of sleep cycle, not wake cycle.
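The shared wake schedule reduces to modular arithmetic on the synchronized clock. The sketch below assumes the waking period starts whenever system time modulo the epoch duration is zero and lasts WAKING_PERIOD ms. The function names and the 10 s / 3 s figures in the example are illustrative.

```python
def in_waking_period(system_time_ms: int, epoch_dur_ms: int,
                     waking_period_ms: int) -> bool:
    """True while the node should be awake in the current epoch."""
    return system_time_ms % epoch_dur_ms < waking_period_ms


def sleep_duration_ms(system_time_ms: int, epoch_dur_ms: int) -> int:
    """Time to sleep so that the node wakes on the next epoch boundary.

    A clock correction received from the parent changes system_time_ms,
    so it shows up here as a longer or shorter sleep, never as a change
    to the waking period itself.
    """
    offset = system_time_ms % epoch_dur_ms
    return 0 if offset == 0 else epoch_dur_ms - offset


if __name__ == "__main__":
    epoch, waking = 10_000, 3_000  # illustrative: 10 s epoch, 3 s awake
    for t in (20_000, 22_999, 23_000, 29_500):
        print(t, in_waking_period(t, epoch, waking), sleep_duration_ms(t, epoch))
```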
37Extending TinyDB
- Why extend TinyDB?
- New sensors -> attributes
- New control/actuation -> commands
- New data processing logic -> aggregates
- New events
- Analogous to concepts in object-relational
databases
38Adding Attributes
- Types of attributes
- Sensor attributes: raw or cooked sensor readings
- Introspective attributes: parent, voltage, ram usage, etc.
- Constant attributes: constant values that can be statically or dynamically assigned to a mote, e.g., nodeid, location, etc.
39Adding Attributes (cont)
- Interfaces provided by Attr component
- StdControl init, start, stop
- AttrRegister
- command registerAttr(name, type, len)
- event getAttr(name, resultBuf, errorPtr)
- event setAttr(name, val)
- command getAttrDone(name, resultBuf, error)
- AttrUse
- command startAttr(attr)
- event startAttrDone(attr)
- command getAttrValue(name, resultBuf, errorPtr)
- event getAttrDone(name, resultBuf, error)
- command setAttrValue(name, val)
40Adding Attributes (cont)
- Steps to adding attributes to TinyDB
- Create attribute nesC components
- Wire new attribute components to TinyDBAttr configuration
- Reprogram TinyDB motes
- Add new attribute entries to catalog.xml
- Constant attributes can be added on the fly through TinyDB GUI
41Adding Aggregates
- Step 1: wire new nesC components
42Adding Aggregates (cont)
- Step 2: add entry to catalog.xml
- <aggregate>
- <name>AVG</name>
- <id>5</id>
- <temporal>false</temporal>
- <readerClass>net.tinyos.tinydb.AverageClass</readerClass>
- </aggregate>
- Step 3 (optional): implement reader class in Java
- a reader class interprets and finalizes aggregate
state received from the mote network, returns
final result as a string for display.
43TinyDB Status
- Latest released with TinyOS 1.1.6 (5/04)
- Install the task-tinydb package in TinyOS 1.1 distribution
- First release in TinyOS 1.0 (9/02)
- Widely used by research groups as well as industry pilot projects
- Ongoing deployments in Intel Berkeley Lab and redwood trees at UC's Sonoma Grove
- Largest deployment: 80 weather station nodes
- Network longevity: 4-5 months
44The Redwood Tree Deployment
- Redwood Grove in UC Botanical Garden, Berkeley
- Collect dense sensor readings to monitor climatic variations across
- altitudes,
- angles,
- time,
- forest locations, etc.
- Versus sporadic monitoring points with 30lb loggers!
- Current focus: study how dense sensor data affect predictions of conventional tree-growth models
45Data from Redwoods
36m
33m 111
32m 110
30m 109,108,107
20m 106,105,104
10m 103, 102, 101
46TinyDB Roadmap (near term)
- Support for high frequency sampling
- For a variety of deployments
- Intel Fab, BP shipboard, Golden Gate Bridge
- Store and forward
- Bulk reliable data transfer
- Scheduling of communications
- Research agenda continues
- Discussion later
47For more information
- http://berkeley.intel-research.net/tinydb or http://triplerock.cs.berkeley.edu/tinydb
48Agenda
- Sensornet Background
- TinyDB
- TASK
- SW/HW architecture
- Features
- A Flavor of Research Challenges
49SensorNet Dilemma
- Sensors still packaged like HeathKits
- Pretty hard to cope with out of the box
- Bare metal encourages one-off applications
- Inhibits reuse
- Deployment not intuitive
- No configuration/monitoring tools
- SensorNet PhD Factor
- Today 2.5 PhDs needed to deploy a SensorNet
- Needs to be Zero
50TASK Design Requirements
- Ease of S/W Installation
- Deployment tools
- Reconfigurability
- Health/Mgmt Monitoring
- Network Reliability Guarantee
- Interpretable Sensor Results
- Tool Integration
- Audit Trails
- Lifetime estimates
For Developers
- Familiar API
- Extensibility of S/W
- Modular services
51Tiny Application Sensor Kit
TASK Client Tools
External Tools
TaskView
Internet
TASK Field Tools
SensorNet Appliance
TASK Server
- Simplicity vs. Functionality
- Modularity
- Remote control
- Fault Tolerant
TinyDB Sensor Network
52SensorNet Appliance
SNA
- Intelligent Gateway
- Proxy for the sensornet
- Distributes query
- Stages results
- Manages configuration
- Components
- TASK Server
- TinyDB Client (Java)
- DBMS (PostgreSQL)
- WebServer (Apache)
http, other
TASKServer
DBMS
ODBC
TinyDB Client
SensorNet
53Tools
- Field Tool
- In-situ diagnostics
- TaskView
- Integrated tool for management and monitoring
54Quick TASK Demo
55Agenda
- Sensornet Background
- TinyDB
- TASK
- A Flavor of Research Challenges
56Sensor Network Research
- Very active research area
- Can't summarize it all
- Focus: database-relevant research topics
- Some outside of Berkeley/MIT
- Other topics that are itching to be scratched
- But, some bias towards work that we find
compelling
57Topics
- In-network aggregation
- Acquisitional Query Processing
- Heterogeneity
- Intermittent Connectivity
- In-network Storage
- Statistical modeling and summarization
- In-network Joins
- Adaptivity and Sensor Networks
- Multiple Queries
59Tiny Aggregation (TAG)
- In-network processing of aggregates
- Common data analysis operation
- Aka gather operation or reduction in programming
- Communication reducing
- Operator dependent benefit
- Exploit query semantics to improve efficiency!
Madden, Franklin, Hellerstein, Hong. Tiny
AGgregation (TAG), OSDI 2002.
60Basic Aggregation
- In each epoch
- Each node samples local sensors once
- Generates partial state record (PSR)
- local readings
- readings from children
- Outputs PSR during assigned comm. interval
- Interval assigned based on depth in tree
- At end of epoch, PSR for whole network output at root
- New result on each successive epoch
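The per-epoch procedure above can be simulated in a few lines. This is a minimal sketch, not TinyDB code: the routing tree is given as a hypothetical child-to-parent dictionary, the radio is assumed lossless, and "assigned comm. interval" is modeled simply by letting deeper nodes report before shallower ones.

```python
def depth_of(node, parent):
    """Number of hops from node up to the root."""
    d = 0
    while parent[node] is not None:
        node = parent[node]
        d += 1
    return d


def aggregate_epoch(parent, readings, init, merge):
    """Run one epoch and return the root's partial state record (PSR)."""
    # Each node samples once and starts a PSR from its local reading.
    psr = {n: init(readings[n]) for n in parent}
    # Deeper nodes transmit in earlier intervals, so a parent has heard
    # from all of its children before its own interval arrives.
    order = sorted(parent, key=lambda n: depth_of(n, parent), reverse=True)
    for n in order:
        p = parent[n]
        if p is not None:
            psr[p] = merge(psr[p], psr[n])
    root = next(n for n in parent if parent[n] is None)
    return psr[root]


if __name__ == "__main__":
    # Hypothetical 5-node tree: node 1 is the root.
    parent = {1: None, 2: 1, 3: 1, 4: 2, 5: 2}
    readings = {1: 410, 2: 380, 3: 455, 4: 120, 5: 300}
    count = aggregate_epoch(parent, readings, lambda r: 1, lambda a, b: a + b)
    print("COUNT(*) =", count)
```

Each node sends exactly one message per epoch regardless of how many descendants it has, which is where the communication savings come from.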
61Illustration In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 4
[Figure: sensor vs. time grid for one sample period, divided into intervals; value shown: 1]
62Illustration In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 3
[Figure: sensor vs. interval grid; value shown: 2]
63Illustration In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 2
[Figure: sensor vs. interval grid; values shown: 1, 3]
64Illustration In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 1
[Figure: sensor vs. interval grid; value shown: 5]
65Illustration In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 4
[Figure: sensor vs. interval grid; value shown: 1]
66Illustration In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 4
[Figure: sensor vs. interval grid; value shown: 1]
67Aggregation Framework
- As in extensible databases, TinyDB supports any aggregation function conforming to
Agg_n = {f_init, f_merge, f_evaluate}
f_init{a0} -> <a0>
f_merge{<a1>, <a2>} -> <a12>
f_evaluate{<a1>} -> aggregate value
Partial State Record (PSR)
Example: Average
AVG_init{v} -> <v, 1>
AVG_merge{<S1, C1>, <S2, C2>} -> <S1 + S2, C1 + C2>
AVG_evaluate{<S, C>} -> S/C
Restriction: Merge associative, commutative
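The AVG example maps directly onto three small functions. The sketch below is plain Python rather than the nesC interface TinyDB uses on motes; the function names are chosen for illustration, and the partial state record is modeled as a (sum, count) tuple.

```python
from functools import reduce


def avg_init(v):
    """Turn one reading into a partial state record <sum, count>."""
    return (v, 1)


def avg_merge(a, b):
    """Combine two partial state records."""
    return (a[0] + b[0], a[1] + b[1])


def avg_evaluate(psr):
    """Produce the final aggregate value from a partial state record."""
    s, c = psr
    return s / c


if __name__ == "__main__":
    readings = [10, 20, 30, 40]
    psrs = [avg_init(v) for v in readings]
    # Because merge is associative and commutative, any merge order
    # (that is, any routing tree shape) gives the same answer.
    left_to_right = reduce(avg_merge, psrs)
    right_to_left = reduce(avg_merge, reversed(psrs))
    print(avg_evaluate(left_to_right), avg_evaluate(right_to_left))
```

Note that averaging the children's averages directly would not satisfy the restriction, which is why the record carries the sum and the count separately.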
68Taxonomy of Aggregates
- TAG insight: classify aggregates according to various functional properties
- Yields a general set of optimizations that can automatically be applied
Drives an API!
69Use Multiple Parents
- Use graph structure
- Increase delivery probability with no communication overhead
- For duplicate insensitive aggregates, or
- Aggs expressible as sum of parts
- Send (part of) aggregate to all parents
- In just one message, via multicast
- Assuming independence, decreases variance
- Recent research on converting to robust dup-insensitivity
- E.g. count can be approximated robustly in a dup-insensitive fashion [Considine, et al., 2003]
SELECT COUNT(*)
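The variance claim can be checked with a short calculation. This sketch assumes a partial count is split evenly across the parents and that each parent link drops its message independently with the same probability. The even split, the function name and the numbers in the example are illustrative assumptions.

```python
def delivered_stats(value: float, p_loss: float, parents: int):
    """Mean and variance of the amount of `value` that reaches the next level.

    The value is split into equal shares, one per parent; each share
    arrives with probability (1 - p_loss), independently of the others.
    """
    share = value / parents
    p_ok = 1.0 - p_loss
    mean = parents * share * p_ok
    variance = parents * share ** 2 * p_ok * p_loss
    return mean, variance


if __name__ == "__main__":
    for k in (1, 2, 4):
        mean, var = delivered_stats(10.0, 0.30, k)
        print(f"{k} parent(s): mean={mean:.2f} variance={var:.2f}")
```

The expected delivered count is unchanged, while the variance falls in proportion to the number of parents. The next slide's caveat applies: real losses are not independent, so this is an idealized model.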
70Multiple Parents Results
- Better than previous analysis expected!
- Losses aren't independent!
- Insight spreads data over many links
71Statistical Techniques
- Approximations, summaries, and sampling based on statistics and statistical models
- Applications
- Limited bandwidth and large number of nodes -> data reduction
- Lossiness -> predictive modeling
- Uncertainty -> tracking correlations and changes over time
- Physical models -> improved query answering
72Correlated Attributes
- Data in sensor networks is correlated, e.g.,
- Temperature and voltage
- Temperature and light
- Temperature and humidity
- Temperature and time of day
- etc.
73BBQ Model-based Probabilistic Querying over
Sensor Networks
Joint work with Amol Deshpande and Carlos Guestrin
BarbieQ: A Tiny Model Driven Query System
[Figure: Query Processor and Model, with sensor nodes 1-9]
74BBQ Model-based Probabilistic Querying over
Sensor Networks
[Figure: Query Processor, Model (Consult Model), sensor nodes 1-9]
75BBQ Model-based Probabilistic Querying over
Sensor Networks
[Figure: Query Processor, Model (Consult Model), sensor nodes 1-9]
76BBQ Model-based Probabilistic Querying over
Sensor Networks
[Figure: Query Results, Query Processor, Model (Update Model), sensor nodes 1-9]
77Challenges
- What kind of models to use?
- Routing
- How to tour the selected set of nodes
- Optimization
- Given a model and a query, find the best set of attributes to observe
- Cost not easy to measure
- Non-uniform network communication costs
- Changing network topologies
- Depends on touring technique
- Large plan space
- Might be cheaper to observe attributes not in query
- e.g. Voltage instead of Temperature
- Conditional Plans
- Change the observation plan based on observed values
78BBQ Current Prototype
- Multi-variate Gaussian Models
- Kalman Filters to capture correlations across time
- Handles
- Range predicate queries
- sensor value within [x,y], w/ confidence
- Value queries
- sensor value = x, w/in epsilon, w/ confidence
- Simple aggregate queries
- AVG(sensor value) -> n, w/in epsilon, w/ confidence
- Uses a greedy algorithm to choose the observation plan
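A range predicate query with a confidence bound can be sketched for a single attribute. This is a deliberately simplified, univariate stand-in for the multivariate Gaussian model named above: the model's belief is a mean and standard deviation, the query is answered from the model when it is confident enough, and a sensor reading is acquired otherwise. The function names and the `observe` callback are hypothetical.

```python
import math


def gaussian_cdf(x: float, mean: float, std: float) -> float:
    """P(X <= x) for X ~ Normal(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def range_probability(lo: float, hi: float, mean: float, std: float) -> float:
    """P(lo <= X <= hi) under the model's current belief."""
    return gaussian_cdf(hi, mean, std) - gaussian_cdf(lo, mean, std)


def answer_range_query(lo, hi, mean, std, confidence, observe):
    """Return (answer, observed): is the value in [lo, hi]?

    The sensor is read (via observe()) only when the model alone cannot
    say yes or no with the requested confidence.
    """
    p_in = range_probability(lo, hi, mean, std)
    if p_in >= confidence:
        return True, False
    if 1.0 - p_in >= confidence:
        return False, False
    reading = observe()
    return lo <= reading <= hi, True


if __name__ == "__main__":
    # Tight belief well inside the range: answered from the model.
    print(answer_range_query(15, 25, 20.0, 1.0, 0.95, lambda: 20.3))
    # Belief straddles the upper bound: falls back to a real reading.
    print(answer_range_query(15, 25, 24.0, 2.0, 0.95, lambda: 26.0))
```

The energy saving comes from the first two branches, where no sensor is sampled and no radio message is sent.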
79A complex aggregate Wavelets
- just like AVG or COUNT
- but fancier :-)
- re-codes a signal into a sum of square waves
- biggest coefficients -> approx. reconstruction
- lossy compression
- multi-resolution
- Examples
- histograms
- 2-d or 3-d compression and reconstruction
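To make "sum of square waves" and "biggest coefficients" concrete, here is a small sketch that assumes an unnormalized Haar wavelet (pairwise averages and half-differences) on a signal whose length is a power of two. The talk does not say which wavelet or normalization is used, and the example signal is made up.

```python
def haar_transform(signal):
    """Return [overall average, coarsest detail, ..., finest details]."""
    current = list(signal)
    coeffs = []
    while len(current) > 1:
        pairs = list(zip(current[0::2], current[1::2]))
        averages = [(a + b) / 2 for a, b in pairs]
        details = [(a - b) / 2 for a, b in pairs]
        coeffs = details + coeffs   # finer details end up further right
        current = averages
    return current + coeffs


def haar_inverse(coeffs):
    """Rebuild the signal from the output of haar_transform."""
    current = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(current)]
        rebuilt = []
        for avg, d in zip(current, details):
            rebuilt.extend([avg + d, avg - d])
        pos += len(current)
        current = rebuilt
    return current


def keep_largest(coeffs, k):
    """Lossy compression: zero all but the k largest-magnitude coefficients."""
    ranked = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(ranked[:k])
    return [c if i in keep else 0 for i, c in enumerate(coeffs)]


if __name__ == "__main__":
    signal = [2, 2, 0, 2, 3, 5, 4, 4]
    coeffs = haar_transform(signal)
    print("coefficients:", coeffs)
    print("exact:       ", haar_inverse(coeffs))
    print("from 3 coefs:", haar_inverse(keep_largest(coeffs, 3)))
```

Each level merges neighbors pairwise, which is the property that gives the computation its tree-shaped communication pattern.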
80computing a wavelet
35, -1, 3, 8, -4,
3, 3, 3
81computing a wavelet
And a corresponding Communication Graph
82computing a wavelet
And a corresponding Communication Graph
83computing a wavelet
And a corresponding Communication Graph
84resulting comm graph
a binomial tree!
85I never promised you a binomial
- TinyDB agg trees are not any special shape
- Parents chosen for reliability, not tree shape
- Options
- Cope with non-binomial trees
- How to merge two subtrees of different sizes?
- Can muck with the Wavelet, but it biases the answer
- Can forward when merge is impossible, but comm cost
- Develop a scheme to ensure binomial agg trees
- When is it possible?
- Distributed algorithm to do it?
86Concluding Remarks
- Sensor nets becoming a reality
- For sensornet system hackers, TinyOS an emerging standard
- For sensornet app writers, TinyDB an emerging standard
- TASK a developing example of a turnkey solution
- Lots of room for innovation
- Applications and vertical solutions
- Systems work in OS, Networks, DBs
- Opportunities in Statistics, AI, Signal Processing
- This all meshes in new and interesting ways!
87Sensornets Getting Started
- TinyDB home page
- http://telegraph.cs.berkeley.edu/tinydb
- The TinyOS home page
- http://webs.cs.berkeley.edu/tinyos
- Crossbow (motes, stargates)
- http://www.xbow.com
- Moteiv (motes)
- http://www.moteiv.com
- Intel Imote
- www.intel.com/research/exploratory/motes.htm
88Backup Slides
89computing a wavelet
35, 0, 3, 8, -4,
0, 0, 0
90computing a wavelet
35, 0, 3, 8, -4,
0, 0, 0