Querying the Internet with PIER

Learn more at: https://people.eecs.berkeley.edu

1
Querying the Internet with PIER

CS294-4
Paul Burstein
11/10/2003

2
Outline

Motivation
Architecture
Join Algorithms
Evaluation
Discussion

3
Motivation

Inject a degree of distribution into databases
Internet scale systems vs. hundred node systems
Large-scale applications requiring database
functionality

4
Applications

P2P Databases
Highly distributed and available data
Network Monitoring
Intrusion detection
Fingerprint queries

5
Design Principles

Relaxed Consistency
Sacrifice consistency in favor of availability and
partition tolerance
Organic Scaling
Growth with deployment
Natural Habitats for Data
Data remains in original format with a DB
interface
Standard Schemas
Achieved through common software

6
Outline

Motivation
Architecture
Join Algorithms
Evaluation
Discussion

7
PIER Architecture

8
DHT Design

Implemented with CAN and Chord
Routing Layer
Mapping for keys
Storage Manager
Node data storage

Provider
Storage access interface for higher levels
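
The routing layer's job of mapping keys to nodes can be pictured with a small sketch. This is not PIER's CAN or Chord code: it is a toy, Chord-style successor lookup with global knowledge of all nodes, and the names `ring_id`, `ToyRing` and `lookup` are hypothetical.

```python
import hashlib
from bisect import bisect_left


def ring_id(name: str) -> int:
    """Hash a string onto a 160-bit identifier ring using SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class ToyRing:
    """Toy key-to-node mapping; assumes every node is known locally."""

    def __init__(self, node_names):
        # Sort nodes by their position on the ring.
        self.nodes = sorted((ring_id(n), n) for n in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def lookup(self, key: str) -> str:
        """Return the first node at or after the key's position (its successor)."""
        idx = bisect_left(self.ids, ring_id(key)) % len(self.ids)  # wrap around
        return self.nodes[idx][1]


ring = ToyRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("R|42")  # the node responsible for this key
```

A real DHT resolves the same mapping in multiple hops without any node holding the full membership list.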

9
Routing & Storage

Routing Layer
DHT-based API
locationMapChange: callback on local key set change
Storage Manager
Simple API that is easy to implement
Performance only needs to be efficient relative to the network
A main-memory storage manager is used

10
Provider

Couples the routing and storage layers
namespace: relation
resourceId: primary key

namespace + resourceId → key
instanceId: distinguishes objects with the same
namespace and resourceId
lifetime: item storage duration
multicast: contacts a namespace's nodes
lscan: iterates over a node's local data
newData: application callback on data arrival
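
The Provider terms listed on this slide can be tied together in a single-process sketch. It is only an illustration under stated assumptions: the class `ToyProvider`, its method signatures and the use of SHA-1 for the key are hypothetical, and nothing here is distributed.

```python
import hashlib
import time
from collections import defaultdict


class ToyProvider:
    """Stand-in for one node's provider; all names are hypothetical."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # namespace -> {(resource_id, instance_id): (item, expiry_time)}
        self._store = defaultdict(dict)
        self._callbacks = defaultdict(list)

    @staticmethod
    def dht_key(namespace, resource_id):
        """namespace + resourceId -> key (assumed here to be a SHA-1 hash)."""
        raw = f"{namespace}|{resource_id}".encode("utf-8")
        return hashlib.sha1(raw).hexdigest()

    def on_new_data(self, namespace, callback):
        """Register a newData-style callback for a namespace."""
        self._callbacks[namespace].append(callback)

    def put(self, namespace, resource_id, instance_id, item, lifetime):
        """Store an item for `lifetime` seconds and notify callbacks."""
        expiry = self._clock() + lifetime
        self._store[namespace][(resource_id, instance_id)] = (item, expiry)
        for callback in self._callbacks[namespace]:
            callback(item)

    def get(self, namespace, resource_id):
        """Return all unexpired items sharing a namespace and resourceId."""
        now = self._clock()
        return [item
                for (rid, _), (item, expiry) in self._store[namespace].items()
                if rid == resource_id and expiry > now]

    def lscan(self, namespace):
        """Iterate over this node's unexpired local data in a namespace."""
        now = self._clock()
        for item, expiry in list(self._store[namespace].values()):
            if expiry > now:
                yield item
```

The instanceId is what lets two items with the same namespace and resourceId coexist: they share a key but occupy separate slots.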

11
PIER Query Processor

Query dataflow engine
Operators
Selection, projection, joins, grouping,
aggregation
Operators push and pull data
Currently, data modification is through the DHT
interface

Relaxed consistency and reachable snapshot
Working only with nodes reachable at the time a
query is issued

12
Outline

Motivation
Architecture
Join Algorithms
Evaluation
Discussion

13
Join Algorithms

Symmetric Hash Join
Rehashes the relations
Scan and copy
Fetch Matches
One relation already hashed on join attribute
R, S: relations
Nr, Ns: relation namespaces
Nq: DHT-based temporary table
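
As an illustration of the symmetric hash join named on this slide, the sketch below shows the local logic a single node might run on tuples that have been rehashed to it: each arriving tuple is added to its own relation's hash table and immediately probes the other. The function name and the `("R" | "S", tuple)` arrival format are assumptions made for the example, not PIER's interfaces.

```python
from collections import defaultdict


def symmetric_hash_join(arrivals, key_r, key_s):
    """Join tuples as they arrive.

    arrivals: iterable of (side, tuple) pairs, where side is "R" or "S".
    key_r, key_s: functions extracting the join attribute from each side.
    Yields (r_tuple, s_tuple) pairs as soon as both halves have been seen.
    """
    table_r = defaultdict(list)
    table_s = defaultdict(list)
    for side, tup in arrivals:
        if side == "R":
            k = key_r(tup)
            table_r[k].append(tup)           # build on R's table
            for s in table_s.get(k, ()):     # probe S's table
                yield (tup, s)
        else:
            k = key_s(tup)
            table_s[k].append(tup)           # build on S's table
            for r in table_r.get(k, ()):     # probe R's table
                yield (r, tup)


arrivals = [("R", (1, "r1")), ("S", (1, "s1")), ("S", (2, "s2")),
            ("R", (2, "r2")), ("R", (1, "r3"))]
pairs = list(symmetric_hash_join(arrivals, lambda t: t[0], lambda t: t[0]))
```

Because neither input has to be complete before results appear, the operator fits a setting where tuples trickle in from many nodes.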

14
Join Rewriting

Aimed at lowering the bandwidth utilization
Symmetric semi-join
Local projections to join keys
Global fetch matches join
Bloom joins
Local bloom filters are published into temporary
namespaces
Filters are multicast to the opposite relation's nodes
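
The Bloom join rewrite can be sketched as follows. This is a minimal illustration, not PIER's implementation: the `BloomFilter` class, its sizing defaults, the `union` step for combining per-node filters and the helper `bloom_prefilter` are all assumptions. The point it shows is that a node only needs to rehash tuples whose join key might appear in the opposite relation.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter backed by a Python int used as a bit array."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def __contains__(self, value):
        # True for every added value; may also be True for others
        # (false positives), never False for an added value.
        return all((self.bits >> pos) & 1 for pos in self._positions(value))

    def union(self, other):
        """Combine two filters built with the same parameters."""
        merged = BloomFilter(self.num_bits, self.num_hashes)
        merged.bits = self.bits | other.bits
        return merged


def bloom_prefilter(tuples, key, opposite_filter):
    """Keep only tuples whose join key may exist in the opposite relation."""
    return [t for t in tuples if key(t) in opposite_filter]
```

False positives only cost extra bandwidth; the join itself still discards non-matching tuples, so the result is unchanged.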

How does this scale?

16
Outline

Motivation
Architecture
Join Algorithms
Evaluation
Discussion

17
Workload Parameters

CAN configuration: d = 4
R is 10 times larger than S
Constants provide 50% selectivity
f(x,y) evaluated after the join
90% of R tuples match a tuple in S
Result tuples are 1KB each
Symmetric hash join used

18
Simulation Setup

Up to 10,000 nodes
Network cross-traffic, CPU and memory
utilizations ignored
1. 100ms and 10Mbps fully connected links
2. GT-ITM transit-stub topology

19
Scalability

1MB data per node
Fully-connected topology
Variable number of computation nodes
Network congestion is an issue with few
computation nodes
How is the computation workload distributed?

20
Join Algorithms (1/2)

Infinite Bandwidth
1024 data and computation nodes
Core join Algorithms
Perform faster
Rewrites
Bloom filter: two multicasts
Semi-join: two CAN lookups

21
Join Algorithms (2/2)

Limited Bandwidth
10Mbps inbound capacity
25GB relations, 1024 nodes
Symmetric Hash Join
Rehashes both tables
Semi-join
Transfers only matching tuples
At 40% selectivity, bottleneck switches from
computation nodes to query sites

22
Soft State

Failure detection and recovery
15 second failure detection
4096 nodes
Refresh period
Time to reinsert lost tuples
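
How the refresh period and the failure-detection delay might combine into a time to reinsert lost tuples can be shown with a toy calculation. The model is an assumption made for illustration, not the one used in the evaluation: publishers refresh at multiples of `refresh_period` starting from time 0, and a refresh only restores a tuple once the failure has been detected. The function name is hypothetical.

```python
import math


def reinsertion_delay(fail_time, refresh_period, detection_delay=0.0):
    """Seconds from a node failure until the first refresh that arrives
    after the failure has been detected (toy model, see assumptions above)."""
    ready = fail_time + detection_delay
    next_refresh = math.ceil(ready / refresh_period) * refresh_period
    return next_refresh - fail_time


# Failure at t=50s, refresh every 60s, 15s failure detection:
delay = reinsertion_delay(50, 60, detection_delay=15)
```

Under this model a shorter refresh period reduces the time tuples stay missing, at the cost of more refresh traffic.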

23
Transit Stub Topology

GT-ITM
4 Domains, 10 nodes per domain, 3 stubs per node
50ms, 10ms, 2ms latency
10Mbps inbound links
Similar trends as fully connected topology
A bit longer end-to-end delays

24
Experimental Results

64 PCs on 1Gbps network
All nodes are computation nodes

25
Outline

Motivation
Architecture
Join Algorithms
Evaluation
Discussion

26
Discussion

PIER presents a distributed query engine
What remains to be done?
DB issues
Networking issues
