HiFi Systems: Network-Centric Query Processing for the Physical World

Korea Advanced Institute of Science and Technology

1
HiFi Systems: Network-Centric Query Processing
for the Physical World
  • Michael J. Franklin, Shawn R. Jeffrey, et al.
  • UC Berkeley TelegraphCQ Team
  • 2nd CIDR Conf. 2005

2
Table of Contents
  • One-Line Comment
  • Motivating Scenario
  • HiFi System with CSAVA Processing Stages
  • Internal Architecture of a HiFi Node
  • Critiques
  • New Ideas 1 and 2

3
One-Line Comment
  • This is preliminary work describing the group's
    vision of distributing their TelegraphCQ system
    across a hierarchical network.

4
Motivating Scenario: Supply Chain Management
  • Smart shelves continuously monitor item
    addition and removal.
  • This information is sent back through the
    supply chain.

5
High Fan-In (HiFi) System
[Figure: high fan-in system hierarchy. Recoverable
labels: Ursa-Major (TelegraphCQ w/ archiving);
Ursa-Minor (TinyDB-based); mid-tier Stargate
processing node.]

6
Characteristics of HiFi Systems
  • High Fan-In, globally-distributed architecture
  • Large data volumes generated at edges
  • Filtering and cleaning must be done there
  • Successive aggregation as you move inwards
  • Summaries/anomalies continually, details later
  • Strong temporal focus
  • Strong spatial/geographic focus
  • Streaming data and stored data
  • Integration within and across enterprises
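
The edge-filtering and successive-aggregation characteristics above can be illustrated with a minimal sketch. The function names (`edge_summary`, `merge_summaries`), the `(tag, strength)` reading layout, and the threshold value are hypothetical and are not taken from the HiFi system itself.

```python
from collections import Counter

def edge_summary(readings, min_strength):
    """Filter weak readings at an edge node and count the rest per tag."""
    return Counter(tag for tag, strength in readings if strength > min_strength)

def merge_summaries(summaries):
    """Combine child summaries into one parent summary (the fan-in step)."""
    merged = Counter()
    for summary in summaries:
        merged.update(summary)
    return merged

# Two hypothetical shelves feeding one warehouse-level node.
shelf_a = edge_summary([("tag1", 0.9), ("tag2", 0.2), ("tag1", 0.8)], 0.5)
shelf_b = edge_summary([("tag1", 0.7), ("tag3", 0.6)], 0.5)
warehouse = merge_summaries([shelf_a, shelf_b])
```

Only the per-tag counts travel inward here, so the raw readings stay at the edge, which is one way to read "summaries continually, details later".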

7
A View on This Example
  • Archiving (provenance and schema evolution)
  • Filtering, cleaning, alerts
  • Monitoring, time-series
  • Data mining (recent history)

8
High Fan-In System Levels with Associated CSAVA
(Clean, Smooth, Arbitrate, Validate, Analyze)
Processing Stages
  • Headquarters
  • Regional Centers
  • Warehouse
  • Warehouse Doors
  • Receptor

9
Internal Architecture of a HiFi Node
[Figure: internal architecture of a HiFi node.]
  • Components: Query Placement Service, Query
    Listener, Control Manager, Data Disseminator,
    Query Planner, Logical Query Planner, Physical
    Query Planner, Metadata Repository, Data Stream
    Processor, DSP Manager, Local View Manager,
    Archive Manager, Cache Manager, Resource
    Manager, Query Dispatcher, Data Listener,
    HiFi Glue
  • Legend: Data Flow, Query Flow, Control Flow

10
Critiques
  • Strong points
      • They classify and formulate five distinct
        data processing stages
      • They developed a prototype system
        (VLDB 2005)
  • Weak points
      • Designing the MDR (Metadata Repository) is
        critical, but no initial effort has been
        made
      • No new system requirements
      • Solutions are not technically deep

11
New Idea 1
[Figure: SPAccel data paths. Recoverable labels:
By-passing, SP Accel, Buffering, Filtered out,
Data Source, CQ engine, Web Server, Clients.]

12
New Idea Related to SPAccel
  • Designing a front-end component (a cache?)
      • Filtering out unwanted input data
      • By-passing data that matches query
        predicates
      • Buffering data for windowed queries (views)
        or distributed queries
      • Buffering query results
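
The front-end roles listed above (filter, by-pass, buffer) could be sketched roughly as follows. This is an illustration under stated assumptions, not the SPAccel design: the `FrontEnd` class, its `offer` method, and the dictionary-shaped readings are all hypothetical.

```python
from collections import deque

class FrontEnd:
    """Hypothetical front-end stage placed before a CQ engine."""

    def __init__(self, predicate, window_size):
        self.predicate = predicate                # stands in for a query predicate
        self.window = deque(maxlen=window_size)   # bounded buffer for windowed queries
        self.filtered_out = 0                     # count of dropped inputs

    def offer(self, item):
        """Return the item if it should pass on to the engine, else None."""
        if not self.predicate(item):
            self.filtered_out += 1                # unwanted input never reaches the engine
            return None
        self.window.append(item)                  # keep recent matches for windowed queries
        return item                               # matching data by-passes to the engine

fe = FrontEnd(predicate=lambda r: r["strength"] > 0.5, window_size=2)
passed = [fe.offer({"tag": t, "strength": s})
          for t, s in [("a", 0.9), ("b", 0.1), ("c", 0.8), ("d", 0.7)]]
```

The bounded `deque` evicts the oldest element when full, which is only one possible replacement policy; the slide on expected issues leaves that choice open.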

13
Expected Issues
  • Cache replacement mechanism
  • How to index cached elements
  • What to cache?
  • How much?

14
New Idea 2: Processing Stream Data for OLAP
Queries
  Attribute         OLTP                         OLAP
  Users             clerk, IT professional       knowledge worker
  Function          day-to-day operations        decision support
  DB design         application-oriented         subject-oriented
  Data              current, up-to-date;         historical, summarized;
                    detailed, flat relational;   multidimensional;
                    isolated                     integrated, consolidated
  Usage             repetitive                   ad hoc
  Access            read/write;                  lots of scans
                    index/hash on prim. key
  Unit of work      short, simple transaction    complex query
  Records accessed  tens                         millions
  Number of users   thousands                    hundreds
  DB size           100MB-GB                     100GB-TB
  Metric            transaction throughput       query throughput, response

15
A Sample Data Cube
[Figure: data cube with three dimensions: Date
(1Q, 2Q, 3Q, 4Q), Product (camera, video, CD),
and Country (USA, Canada, Mexico).]

16
New Idea 2
  • Stream data in terms of the OLAP domain
  • OLAP queries
      • Are inherently multidimensional
      • Span a long time
      • Need data from multiple sources
  • Processing OLAP queries is
      • Memory intensive
      • Computation intensive

17
Naïve Solution
  • Pre-compute popular computation paths
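
One way to read this naive solution is to keep running aggregates for a few popular group-by combinations and update them as stream tuples arrive. The sketch below assumes that reading. The `PrecomputedCube` class, the dimension names, and the sum measure are hypothetical choices for illustration only.

```python
from collections import defaultdict

class PrecomputedCube:
    """Running sums for a fixed set of popular group-by combinations."""

    def __init__(self, popular_groupings):
        # Each grouping is a tuple of dimension names, e.g. ("product", "quarter").
        self.tables = {g: defaultdict(int) for g in popular_groupings}

    def insert(self, record, measure):
        """Fold one stream tuple into every pre-computed grouping."""
        for grouping, table in self.tables.items():
            key = tuple(record[dim] for dim in grouping)
            table[key] += measure

    def query(self, grouping, key):
        """Look up a pre-computed sum; raises KeyError for other groupings."""
        return self.tables[grouping].get(key, 0)

cube = PrecomputedCube([("country",), ("product", "quarter")])
cube.insert({"country": "USA", "product": "camera", "quarter": "1Q"}, 10)
cube.insert({"country": "USA", "product": "video", "quarter": "1Q"}, 5)
cube.insert({"country": "Canada", "product": "camera", "quarter": "1Q"}, 3)
```

Queries on a pre-computed grouping become dictionary lookups, at the cost of memory per grouping and no support for groupings that were not chosen in advance.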

18
Supplementary Slide
  • Cleaning
      CREATE VIEW cleaned_rfid_stream AS
      (SELECT receptor_id, tag_id
       FROM rfid_stream rs
       WHERE read_strength > strength_T)
  • Smoothing
      CREATE VIEW smoothed_rfid_stream AS
      (SELECT receptor_id, tag_id
       FROM cleaned_rfid_stream
       GROUP BY receptor_id, tag_id
       HAVING count(*) > count_T)
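
For readers without a stream engine at hand, the two views can be approximated over a single batch of readings. This is a rough sketch, assuming each reading is a `(receptor_id, tag_id, read_strength)` tuple and that the batch stands in for one query window. The function names `clean` and `smooth` are hypothetical.

```python
from collections import Counter

def clean(readings, strength_t):
    """Keep (receptor_id, tag_id) pairs whose read strength exceeds strength_t."""
    return [(receptor, tag) for receptor, tag, strength in readings
            if strength > strength_t]

def smooth(cleaned, count_t):
    """Keep pairs seen more than count_t times in the batch."""
    counts = Counter(cleaned)
    return [pair for pair, n in counts.items() if n > count_t]

readings = [
    ("r1", "t1", 0.9),
    ("r1", "t1", 0.8),
    ("r1", "t2", 0.9),
    ("r1", "t1", 0.1),   # weak read, dropped by cleaning
]
cleaned = clean(readings, strength_t=0.5)
smoothed = smooth(cleaned, count_t=1)
```

Here `t2` survives cleaning but is seen only once, so smoothing discards it as a likely spurious read.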