Engine Design: Stream Operators Everywhere

1
Engine Design: Stream Operators Everywhere
  • Theodore Johnson
  • AT&T Labs Research
  • johnsont_at_research.att.com

Contributors: Chuck Cranor, Vladislav Shkapenyuk, Oliver Spatscheck
2
Early Data Reduction
  • Goal: Query high-speed links using inexpensive
    off-the-shelf servers.
  • OC48: 2 x 2.4 Gb/sec, 7 million packets/sec.
  • OC192: 2 x 7.2 Gb/sec, 21 million packets/sec.
  • Goal: Evaluate queries over every bit of every
    packet.
  • Problem: Not enough cycles in a second.
  • 3 GHz / 21 Mpackets/sec = about 142 cycles/packet
  • Solution: Push data reduction operators as far
    down the protocol stack as possible.
  • Into the hardware if possible.
  • View hardware bit twiddling as stream operators.
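
The cycle budget above is plain division, and a short sketch makes it easy to redo for other link speeds. The function name is hypothetical, and applying the same 3 GHz clock to the OC48 packet rate is an extrapolation rather than a figure from the slide.

```python
def cycles_per_packet(clock_hz: float, packets_per_sec: float) -> float:
    """Average CPU cycles available per packet on a single core."""
    return clock_hz / packets_per_sec


# Figures from the slide: a 3 GHz CPU and the quoted packet rates.
oc192_budget = cycles_per_packet(3e9, 21e6)
oc48_budget = cycles_per_packet(3e9, 7e6)  # same CPU assumed for OC48

print(f"OC192: {oc192_budget:.1f} cycles/packet")  # OC192: 142.9 cycles/packet
print(f"OC48:  {oc48_budget:.1f} cycles/packet")   # OC48:  428.6 cycles/packet
```

Every packet dropped or truncated lower in the stack frees up this budget for the packets that remain.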

3
Early Data Reduction in Gigascope
  • Gigascope was designed to monitor very high speed
    (optical) links using complex query sets.
  • Multiple levels of data reduction
  • Data reduction in the NIC depends on NIC
    capabilities
  • Snap length (projection)
  • BPF filters
  • Approximate filtering (bitmasks)
  • Data reduction queries (replace the NIC run time
    system)
  • Low level queries
  • Run queries on kernel input buffers
  • Preliminary filter for the query set
  • Other possibilities ...
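
The two simplest NIC-level reductions listed above can be sketched as pure functions over raw bytes. This is an illustration only: the function names are hypothetical, the packet is assumed to start at the IPv4 header (no link-layer header), and "approximate" is read here as "a fixed-offset mask test that may pass some unwanted packets, which exact predicates higher up then reject".

```python
def snap(packet: bytes, snap_len: int) -> bytes:
    """Projection: keep only the first snap_len bytes of the packet."""
    return packet[:snap_len]


def mask_match(packet: bytes, offset: int, mask: int, value: int) -> bool:
    """Approximate selection: test one masked byte at a fixed offset."""
    return len(packet) > offset and (packet[offset] & mask) == value


def ipv4_header(protocol: int) -> bytes:
    """Build a minimal 20-byte IPv4 header; byte 9 is the protocol field."""
    return bytes([0x45, 0, 0, 28, 0, 0, 0, 0, 64, protocol,
                  0, 0, 10, 0, 0, 1, 10, 0, 0, 2])


udp_packet = ipv4_header(17) + b"x" * 100
tcp_packet = ipv4_header(6) + b"x" * 100

# Keep only packets whose protocol byte is 17, then truncate to 28 bytes.
kept = [snap(p, 28) for p in (udp_packet, tcp_packet)
        if mask_match(p, offset=9, mask=0xFF, value=17)]
```

Here `kept` holds a single 28-byte record, so both the packet count and the bytes per packet shrink before any query code runs.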

4
Example: Router Monitoring
[Figure: layered monitoring stack. Diagram labels: High Level Queries, Low Level Queries, Kernel, Libpcap / BPF filters, Circular Buffer, Router, Select Stream, Network Tap.]
  • Selection/projection/aggregation
  • Pre-filter
  • Snap length (projection)
  • Approximate filter (selection)
  • Selection/projection/aggregation queries
    (replace run time system)
5
Stream Operators
  • Problem: Great heterogeneity in the specifics of
    manipulating the hardware mechanism
  • Stream selection vs. NIC filters vs. kernel
    filters, etc.
  • Programmable NIC vs. bit-twiddling NIC vs.
    non-programmable NIC, etc.
  • Solution:
  • Define a set of stream operators to evaluate the
    stream query.
  • Selection, projection, (partial) aggregation
  • Merge, join, reorder?
  • Define hardware capabilities as the types of
    queries they can execute
  • Multiple query optimization over the query set
  • Low level query nodes feed multiple user queries
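
One way to read "define hardware capabilities as the types of queries they can execute" is as a placement problem: each level advertises the operator kinds it supports, and each operator is pushed to the lowest level that can run it. The sketch below is a minimal illustration under that reading. The `Level` class, the level names, and the capability sets are all hypothetical, not taken from Gigascope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    name: str
    supports: frozenset  # operator kinds this level can execute


# Ordered from lowest (closest to the wire) to highest. Illustrative only.
LEVELS = [
    Level("nic", frozenset({"projection"})),
    Level("kernel", frozenset({"projection", "selection"})),
    Level("user", frozenset({"projection", "selection", "aggregation", "join"})),
]


def place(plan, levels=LEVELS):
    """Assign each operator, in pipeline order, to the lowest capable level.

    An operator is never placed below an earlier operator in the pipeline,
    because data only flows upward through the stack.
    """
    placement = []
    floor = 0
    for op in plan:
        for i in range(floor, len(levels)):
            if op in levels[i].supports:
                placement.append((op, levels[i].name))
                floor = i
                break
        else:
            raise ValueError(f"no level can execute {op!r}")
    return placement


placement = place(["projection", "selection", "aggregation"])
print(placement)
# [('projection', 'nic'), ('selection', 'kernel'), ('aggregation', 'user')]
```

Swapping in a programmable NIC would then only mean changing its `supports` set, not the query.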

6
Example (network monitoring)
select timestamp, sourceIP, destIP, source_port,
dest_port, len, total_length, gp_header from
GAMEPROTOCOL where sample_hash50, sourceIP,
destIP and protocol=17 and offset=0
  • NIC: snap_len 134 (projection)
  • Pre-filter: protocol=17 and offset=0
  • Low-level query:

select timestamp, sourceIP, destIP, source_port,
dest_port, len, total_length, gp_header from
GAMEPROTOCOL where sample_hash50, sourceIP,
destIP and protocol=17 and offset=0
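
The decomposition above can be sketched over already-parsed packet records. Several things here are assumptions: packets are plain dicts, the NIC snap length is taken as already applied, and `sample_hash` is given guessed semantics (keep a packet when a hash of the listed fields falls below the stated percentage), since the transcript does not define it.

```python
import zlib

# Output columns of the query on the slide.
FIELDS = ["timestamp", "sourceIP", "destIP", "source_port",
          "dest_port", "len", "total_length", "gp_header"]


def prefilter(pkt: dict) -> bool:
    """Cheap predicate pushed below the query: protocol=17 and offset=0."""
    return pkt["protocol"] == 17 and pkt["offset"] == 0


def sample_hash(percent: int, *keys) -> bool:
    """ASSUMED semantics: deterministic hash-based sampling on the keys."""
    digest = zlib.crc32("|".join(map(str, keys)).encode())
    return digest % 100 < percent


def low_level_query(packets):
    """Selection (pre-filter, then sampling) followed by projection."""
    for pkt in packets:
        if not prefilter(pkt):
            continue
        if not sample_hash(50, pkt["sourceIP"], pkt["destIP"]):
            continue
        yield {name: pkt[name] for name in FIELDS}


def make_packet(protocol: int, offset: int) -> dict:
    """Build a hypothetical parsed packet for the example."""
    return {"timestamp": 1.0, "sourceIP": "10.0.0.1", "destIP": "10.0.0.2",
            "source_port": 4000, "dest_port": 5000, "len": 120,
            "total_length": 120, "gp_header": b"\x01\x02",
            "protocol": protocol, "offset": offset, "ttl": 64}


packets = [make_packet(17, 0), make_packet(6, 0), make_packet(17, 8)]
rows = list(low_level_query(packets))
```

Only the first packet can reach the sampling step. Whether it survives depends on the hash, but the same address pair always gets the same decision, so sampled flows stay complete.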
7
Other Operators?
  • Merge: Some NICs deliver packets out of order
  • Optical links are not duplex

[Figure: Stream Merge diagram. Labels: NIC (x2), In Buffer (x2), Out Buffer (x2), timestamp (x2), almost ordered stream, ordered stream.]
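
A merge operator of this kind can be sketched in two steps: first repair each almost ordered stream with a bounded reorder buffer, then merge the ordered streams by timestamp. This is not Gigascope's implementation. The `(timestamp, payload)` record layout and the `slack` bound (the maximum amount by which any packet may arrive late) are assumptions made for the example.

```python
import heapq


def reorder(stream, slack):
    """Sort an almost ordered stream of (timestamp, payload) records.

    Assumes no record arrives more than `slack` time units behind the
    newest timestamp already seen on the same stream.
    """
    heap = []
    newest = float("-inf")
    for record in stream:
        heapq.heappush(heap, record)
        newest = max(newest, record[0])
        # Records at or below (newest - slack) can no longer be overtaken.
        while heap and heap[0][0] <= newest - slack:
            yield heapq.heappop(heap)
    while heap:  # end of stream: flush whatever is still buffered
        yield heapq.heappop(heap)


def stream_merge(*ordered_streams):
    """Merge timestamp-ordered streams into one ordered stream."""
    return heapq.merge(*ordered_streams)


# One NIC per link direction; nic_a delivers one packet out of order.
nic_a = [(1.0, "a1"), (3.0, "a3"), (2.0, "a2"), (5.0, "a5")]
nic_b = [(1.5, "b1"), (2.5, "b2"), (4.0, "b4")]

merged = list(stream_merge(reorder(nic_a, slack=1.0), reorder(nic_b, slack=0.0)))
print([ts for ts, _ in merged])  # [1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0]
```

The buffer holds only records within `slack` of the newest timestamp, so memory stays bounded as long as the lateness bound holds.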
8
Summary
  • Early data reduction is critical for monitoring
    very high-speed streams
  • Selection, projection, aggregation.
  • Use stream operators to mask the complexity and
    heterogeneity of hardware / kernel data reduction.
  • Issues
  • Multiple query optimization
  • Push more complex operators down the stack?
  • Join? Stratified sampling? Sketches?
  • Optimization at low level / hardware level
  • Approximate filters
  • Avoid duplicate filters. Where to place them?
  • Re-organization when the query set changes.
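
The "avoid duplicate filters" issue can be illustrated with a small sketch: predicates common to every query in the set are evaluated once in a shared low-level filter, and each query applies only its residual predicates. The predicate names, the two queries, and the dict-based packet records are hypothetical, and real multiple query optimization would also need to weigh where each filter is cheapest to run.

```python
PREDICATES = {
    "udp": lambda p: p["protocol"] == 17,
    "first_fragment": lambda p: p["offset"] == 0,
    "port_53": lambda p: p["dest_port"] == 53,
    "large": lambda p: p["len"] > 1000,
}

# Each query is a conjunction of named predicates (illustrative query set).
QUERIES = {
    "dns": {"udp", "first_fragment", "port_53"},
    "big_udp": {"udp", "first_fragment", "large"},
}


def split_filters(queries):
    """Return (shared predicates, per-query residual predicates)."""
    shared = set.intersection(*queries.values())
    residual = {name: preds - shared for name, preds in queries.items()}
    return shared, residual


def run(packets, queries=QUERIES):
    """Evaluate the shared filter once per packet, then the residuals."""
    shared, residual = split_filters(queries)
    results = {name: [] for name in queries}
    for pkt in packets:
        if not all(PREDICATES[p](pkt) for p in shared):
            continue  # rejected once for every query in the set
        for name, preds in residual.items():
            if all(PREDICATES[p](pkt) for p in preds):
                results[name].append(pkt)
    return results


packets = [
    {"protocol": 17, "offset": 0, "dest_port": 53, "len": 80},
    {"protocol": 17, "offset": 0, "dest_port": 9000, "len": 1400},
    {"protocol": 6, "offset": 0, "dest_port": 53, "len": 1400},
]
shared, residual = split_filters(QUERIES)
results = run(packets)
```

When the query set changes, `split_filters` has to be recomputed, which is the re-organization issue listed above.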