Title: Zhang Yelei
1 Handling RFID Data
- Zhang Yelei
- 20 November, 2009
2 Presentation Outline
- RFID Introduction
- Data Processing
- Data Integration
- Questions
3 Presentation Outline
- RFID Introduction
  - What is RFID
  - Standards
  - System architecture
  - Applications
  - Challenges
- Data Processing
- Data Integration
- Questions
4 What is RFID (1)
- RFID: Radio Frequency Identification
  - A new method/technology for remotely storing and retrieving data
  - Requires RFID readers and tags
- A little historical information
  - First applied in World War II by the UK
  - First paper exploring RFID: "Communication by Means of Reflected Power" by Harry Stockman, published in 1948
  - Not widely deployed until the 1990s
5 What is RFID (2)
Source: Roy Want, "RFID: A Key to Automating Everything"
6 What is RFID (3)
- RFID Tags (transponders)
  - Low cost, not application-specific
  - Operate on frequencies ranging from 100 kHz to beyond 2.5 GHz
  - Types: passive, semi-passive, active
  - The majority are write-once/read-only; others offer read/write capability
  - Readability is influenced by factors such as frequency, environment, tag position, and antenna direction
  - Christian Floerkemeier and Matthias Lampe's playing-card experiment illustrates some sources of errors in tag reading.
7 What is RFID (4)
- RFID Readers
  - Portable or fixed
  - Use a serial port (RS-232) or a network interface/protocol (wired or wireless connection) to communicate with computers
  - The radio can be software-defined
8 Related Standards
- Standards about frequencies and communication
  - Identification cards and related areas: ISO/IEC 10536, ISO/IEC 14443
  - Automatic identification and data capture technologies: ISO/IEC 15961, ISO/IEC 15962
  - Conformance: ISO/IEC 18046, ISO/IEC 18047
  - ETSI, ERO
- Standards about the data format on tags
  - EPCGlobal focuses on the standardization of the data format
  - EPC, the Electronic Product Code (64, 96, or 256 bits long), is now the internationally accepted item-level code.
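To make the idea of a fixed-length EPC concrete, here is a minimal Python sketch that splits a 96-bit code into fields. The slide does not give a field layout; the layout assumed here is the General Identifier (GID-96) scheme of the EPC Tag Data Standard (8-bit header, 28-bit manager number, 24-bit object class, 36-bit serial number). The names `Gid96` and `decode_gid96` are illustrative only.

```python
from typing import NamedTuple

GID96_HEADER = 0x35  # 8-bit header value that identifies the GID-96 scheme


class Gid96(NamedTuple):
    manager: int       # 28-bit General Manager Number
    object_class: int  # 24-bit Object Class
    serial: int        # 36-bit Serial Number


def decode_gid96(hex_epc: str) -> Gid96:
    """Split a 96-bit EPC, given as 24 hex digits, into its GID-96 fields."""
    if len(hex_epc) != 24:
        raise ValueError("a 96-bit EPC needs exactly 24 hex digits")
    value = int(hex_epc, 16)
    if value >> 88 != GID96_HEADER:
        raise ValueError("not a GID-96 code")
    return Gid96(
        manager=(value >> 60) & ((1 << 28) - 1),
        object_class=(value >> 36) & ((1 << 24) - 1),
        serial=value & ((1 << 36) - 1),
    )
```

For a made-up code such as `350000001000002000000003`, the sketch yields manager 1, object class 2, and serial 3. Other 96-bit schemes use different headers and field widths, so a real decoder would dispatch on the header first.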
9 System Architecture
- Savant: maps the low-level data stream from readers to a more manageable form, cleans the data, and supports simple queries and installed standard queries
- Central IS: provides high-level services that are easier for applications to use.
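One cleaning step a middleware layer of this kind might perform is suppressing repeated reads of the same tag. The sketch below is an assumption about what such a step could look like, not a description of Savant itself; `Read`, `suppress_duplicates`, and the 5-second default window are hypothetical.

```python
from typing import Dict, Iterable, Iterator, NamedTuple, Tuple


class Read(NamedTuple):
    reader_id: str
    tag_id: str
    timestamp: float  # seconds


def suppress_duplicates(reads: Iterable[Read], window: float = 5.0) -> Iterator[Read]:
    """Pass on at most one read per (reader, tag) pair every `window` seconds.

    Assumes reads arrive in non-decreasing timestamp order.
    """
    last_reported: Dict[Tuple[str, str], float] = {}
    for read in reads:
        key = (read.reader_id, read.tag_id)
        previous = last_reported.get(key)
        if previous is None or read.timestamp - previous >= window:
            last_reported[key] = read.timestamp
            yield read
```

Because the function is a generator, it can sit directly on a reader's output stream and forward cleaned reads without buffering the whole input.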
10 Applications (1)
- Business applications
  - Transport and logistics
  - Supply chain management
  - Agriculture
- Government applications
  - Defense and security
  - Library systems
- Consumer applications
  - Personal welfare and safety
  - Sports and leisure
  - Shopping and dining out
  - Smart homes
11 Applications (2)
- EPCGlobal Network
  - A method of using RFID to share information in the global supply chain
  - 5 components: EPC (Electronic Product Code), ID system, EPC middleware, Discovery services, EPC information services (EPC IS)
12 Applications (3)
Source: Sun and RFID
13 Challenges
- Reducing tag costs
- Global standards
  - Frequency of tags and readers
  - Other specifications
- IT infrastructure
  - Data processing: handling large amounts of stream data online, efficient use of storage and network bandwidth, and so on.
  - Integration with databases, data warehouses and enterprise applications
- Security issues
14 Presentation Outline
- RFID Introduction
- Data Processing
  - Challenges of Handling Data Stream
  - DSMS vs DBMS
  - Projects
  - Problems
  - Solutions
- Data Integration
- Questions
15 Challenges of Handling Data Stream
- What is a data stream?
  - A potentially unbounded sequence of tuples (transactional data streams and measurement data streams)
- Data is continuous and infinite
  - Most operations should be done online without interrupting the data stream.
  - Data recovery could also be a serious problem
- Computational resources are limited
  - A real-time data stream requires efficient data handling
  - Complex queries need to be performed in near real time
Source: AT&T Labs-Research, "Data Stream Query Processing"
16 DSMS vs. DBMS
- DSMS
  - DAHP (DBMS-active, human-passive) model
  - Deals with tuple sequences
  - Complex queries executed online and in real time
  - Database updated frequently
  - Query persistent, plan adaptive, answer approximate
- DBMS
  - HADP (human-active, DBMS-passive) model
  - Deals with tuple sets
  - Complex queries usually executed offline
  - Database relatively stable
  - Query transient, plan fixed, answer exact
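The difference between a transient query over a stored set and a persistent query over a sequence can be illustrated with a small Python sketch. It is only an illustration under simple assumptions (a single numeric attribute, an average as the query); `transient_average` and `ContinuousAverage` are hypothetical names.

```python
from typing import Sequence


def transient_average(table: Sequence[float]) -> float:
    """DBMS style: the user asks once and the answer is computed over the stored set."""
    return sum(table) / len(table)


class ContinuousAverage:
    """DSMS style: the query stays registered and its answer is updated per arriving tuple."""

    def __init__(self) -> None:
        self._count = 0
        self._total = 0.0

    def push(self, value: float) -> float:
        # Only a constant-size summary is kept; the stream itself is never stored.
        self._count += 1
        self._total += value
        return self._total / self._count
```

Pushing 10, 20, and 30 into a `ContinuousAverage` returns 10.0, 15.0, and 20.0 in turn, whereas `transient_average([10, 20, 30])` produces the single answer 20.0 after all data is available.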
17 Research Projects
- Aurora (supports continuous queries, ad-hoc queries, and materialized views)
  - Aims to better support monitoring applications
- Borealis (distributed SPE, QoS-based techniques)
  - A distributed stream processing engine based on Aurora and Medusa
- TelegraphCQ (focuses on hybrid, ad-hoc queries)
  - Intends to handle large streams of continuous queries over high-volume, highly variable data streams
- PSoup (focuses on both ad-hoc and continuous queries)
  - A query processor that supports both streaming data and streaming queries
- STREAM
  - A general-purpose DSMS prototype
- GigaScope, Hancock, Nile, TinyDB, COUGAR
18 Problems
- Data models
- Window operations
- Query languages
- Query processing
- System optimization
19 Solutions (1) Data Models
- Relation-based Models
  - Aurora: stream type (TS, A1, ..., An)
  - PSoup
  - STREAM: a stream S is an unbounded bag of pairs <s, τ> (tuple s, timestamp τ); a relation R is a time-varying bag of tuples
- Object-based Models
  - COUGAR and Tribeca: data types are associated with methods
20 Solutions (2) Window Operations
- Why?
  - Time/ordering is a very important aspect of streaming data
  - Data processing is still based on a finite data set.
- How to define?
  - A window can be time-based, tuple-based, or a partitioned sliding window.
- Types
  - Fixed, snapshot
  - Landmark
  - Sliding
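The window types listed above can be sketched in a few lines of Python. This is a minimal illustration, assuming tuples arrive in non-decreasing timestamp order; the class names `TupleWindow`, `TimeWindow`, and `LandmarkWindow` are hypothetical and not taken from any of the systems discussed.

```python
from collections import deque
from typing import Any, Deque, List, Tuple


class TupleWindow:
    """Tuple-based sliding window: the most recent `size` tuples."""

    def __init__(self, size: int) -> None:
        self._items: Deque[Any] = deque(maxlen=size)

    def insert(self, item: Any) -> None:
        self._items.append(item)  # a full deque drops its oldest item

    def contents(self) -> List[Any]:
        return list(self._items)


class TimeWindow:
    """Time-based sliding window: tuples newer than `span` time units."""

    def __init__(self, span: float) -> None:
        self._span = span
        self._items: Deque[Tuple[float, Any]] = deque()

    def insert(self, timestamp: float, item: Any) -> None:
        self._items.append((timestamp, item))
        # Expire everything at or before the lower edge of the window.
        while self._items and self._items[0][0] <= timestamp - self._span:
            self._items.popleft()

    def contents(self) -> List[Any]:
        return [item for _, item in self._items]


class LandmarkWindow:
    """Landmark window: everything since a fixed starting point."""

    def __init__(self, landmark: float) -> None:
        self._landmark = landmark
        self._items: List[Any] = []

    def insert(self, timestamp: float, item: Any) -> None:
        if timestamp >= self._landmark:
            self._items.append(item)

    def contents(self) -> List[Any]:
        return list(self._items)
```

Both sliding windows move both edges as data arrives, while the landmark window only grows, which is why landmark windows eventually need some form of summarization on unbounded streams.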
21 Solutions (3) Query Languages
- Relation-based Languages
  - CQL (used by STREAM): select * from S1 [Rows 1000], S2 [Range 2 Minutes] where S1.A = S2.A and S1.A > 10
- Object-based Languages
  - COUGAR: select R.s.getTemperature() from R where R.floor = 3 and every(60)
- Others (Procedural Languages ??)
  - Aurora: 7 new operators, such as map and resample, are defined
Source: Golab and Ozsu, "Data Stream Management Issues - A Survey"
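To show what the CQL example on this slide computes, the following Python sketch joins the last 1000 rows of S1 with the last two minutes of S2 on attribute A, keeping only rows with A greater than 10. It is a rough reading of the query's semantics, not how STREAM evaluates it; `WindowJoin` and its method names are hypothetical, and timestamps are assumed to be in seconds and non-decreasing.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

Row = Dict[str, int]


class WindowJoin:
    def __init__(self, rows: int = 1000, range_seconds: float = 120.0) -> None:
        self._s1: Deque[Row] = deque(maxlen=rows)    # S1 [Rows 1000]
        self._s2: Deque[Tuple[float, Row]] = deque()  # S2 [Range 2 Minutes]
        self._range = range_seconds

    def insert_s1(self, row: Row) -> None:
        self._s1.append(row)

    def insert_s2(self, timestamp: float, row: Row) -> None:
        self._s2.append((timestamp, row))

    def result(self, now: float) -> List[Tuple[Row, Row]]:
        # Expire S2 tuples that fell out of the time range.
        while self._s2 and self._s2[0][0] <= now - self._range:
            self._s2.popleft()
        # where S1.A = S2.A and S1.A > 10
        return [
            (r1, r2)
            for r1 in self._s1
            if r1["A"] > 10
            for _, r2 in self._s2
            if r2["A"] == r1["A"]
        ]
```

The nested loop keeps the sketch short; a real engine would more likely maintain a hash index on A for each window and update the result incrementally.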
22 Solutions (4) Query Processing
- Use connection points to cache streaming data (Aurora)
- TelegraphCQ uses OSCAR for the trade-off between quality and size of the data (from the disk)
- Attach data queues to operators
  - In Aurora, queues are managed by successors' pointers.
- Shared modules among different queries
  - In STREAM, synopses are replaced by a stub and a store to reduce redundancy.
23 Solutions (5) System Optimization
- Data gathering
  - Run-time statistics are gathered
- Inserting, combining, reordering operators
- Train scheduling, superbox scheduling (like batch operation)
- Load shedding
  - Static analysis and delay-based dynamic analysis for overload detection
  - By dropping tuples, or by value-based tuple filtering
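The two load-shedding strategies named on this slide, together with a delay-based overload check, might look roughly like the Python sketch below. It is a simplified illustration rather than Aurora's actual algorithm; the function names and the idea of testing only the oldest queued tuple are assumptions.

```python
import random
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def is_overloaded(oldest_arrival: float, now: float, max_delay: float) -> bool:
    """Delay-based check: the oldest queued tuple has waited longer than allowed."""
    return now - oldest_arrival > max_delay


def shed_randomly(tuples: Iterable[T], drop_rate: float, rng: random.Random) -> List[T]:
    """Drop each tuple independently with probability `drop_rate`."""
    return [t for t in tuples if rng.random() >= drop_rate]


def shed_by_value(tuples: Iterable[T], keep: Callable[[T], bool]) -> List[T]:
    """Value-based filtering: keep only tuples the predicate marks as important."""
    return [t for t in tuples if keep(t)]
```

Random dropping is cheap and unbiased but degrades every query equally; value-based filtering needs application knowledge (the `keep` predicate) and in return preserves the tuples that matter most to the answer.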
24 Presentation Outline
- RFID Introduction
- Data Processing
- Data Integration
  - Research gap
  - Design considerations
- Questions
25 Research Gap
- Academic research
  - Focuses on issues like processing ability, efficient deployment, antenna design, and so on.
  - Lacks emphasis on effective interaction with data warehouses and high-level applications.
- Enterprise IS and data warehouses
  - Emerged in the 1980s; intended to deal with discrete, aggregated data, not continuous, real-time, single-item data.
26 Design considerations
- Manage data storage
  - Which data should be saved and where
  - Eliminate redundant data
  - Handle historical data
- Query data
  - Study business scenarios
  - Identify typical on-site queries and data warehouse queries
- Real-time processing
  - Design of triggers
  - Link real-time events with business processes (for example, BPEL processes and web services).
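One way to picture the trigger design mentioned above is an event-condition-action registry: each trigger pairs a condition on an incoming RFID event with an action that would start a business process. The Python sketch below is only an assumption about such a design; `TriggerRegistry` is hypothetical, and the action is a plain callback standing in for a BPEL process or web-service call.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, str]
Condition = Callable[[Event], bool]
Action = Callable[[Event], None]


class TriggerRegistry:
    def __init__(self) -> None:
        self._triggers: List[Tuple[Condition, Action]] = []

    def register(self, condition: Condition, action: Action) -> None:
        self._triggers.append((condition, action))

    def dispatch(self, event: Event) -> int:
        """Run every action whose condition matches; return how many fired."""
        fired = 0
        for condition, action in self._triggers:
            if condition(event):
                action(event)  # placeholder for invoking a business process
                fired += 1
        return fired
```

A trigger such as "a tag is seen at a given dock door" could then be registered with an action that notifies the receiving process, keeping the real-time event handling separate from the business logic it starts.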
27 Presentation Outline
- RFID Introduction
- Data Processing
- Data Integration
- Questions