CS290F, Winter 2005 Large-scale Networked Systems

1
CS290F, Winter 2005: Large-scale Networked Systems
  • Instructor
  • Ben Y. Zhao (ravenben at cs.ucsb.edu, 1151
    Engineering I)
  • Office hour: Thur, 3-4pm
  • Lecture time: TuTh, 1:00-3:00 pm
  • Place: 1401 Phelps
  • Teaching Assistant
  • Ted Huffmire (huffmire at cs.ucsb.edu)
  • Office hours: TBA

2
Overview
  • Administrivia
  • What this class is, and is not
  • Enrollment, etc
  • What you have to do for this class
  • Grading
  • Break
  • Course projects
  • organization
  • some initial ideas

3
Administrivia
  • Course webpage
  • http://www.cs.ucsb.edu/ravenben/classes/290F
  • check it periodically for announcements
  • Class mailing list
  • cs290f at lists.cs.ucsb.edu
  • go to http://lists.cs.ucsb.edu/mailman/listinfo/cs290f
  • sign up today!
  • Deadlines
  • Unless otherwise specified, a deadline means 10
    minutes before the lecture
  • Special circumstances should be brought to my
    attention before deadlines

4
Class Registration
  • Please send me (ravenben at cs.ucsb.edu) an e-mail
    with the subject "cs290f registration" and the
    following:
  • Last and first name
  • Student ID
  • Your department
  • Preferred email address
  • URL of your home page
  • Please indicate explicitly if I can add you to
    the on-line web page that lists each student
    enrolled in the class (only your name and URL
    will be made public).

5
Class Enrollment
  • Class size capped at 40
  • enrollment is currently full
  • for good discussion, we should be at 25
  • For those unable to enroll
  • email me with your student ID (perm #)
  • I expect enrollment to open up in first 1-2 weeks

6
Goals of this Course
  • Understand
  • How do peer-to-peer systems work?
  • What are the design issues in building large
    networked systems?
  • Where is networking research heading?
  • Get familiar with current network research
  • Understand solutions in context
  • Goals
  • Assumptions

7
Goals of this Course (cont'd)
  • Appreciate what is good research
  • Problem selection
  • Solution and research methodology
  • Presentation
  • Apply what you learned in a class project

8
Reasons to Not Take This Class
  • It's not...
  • a survey class on p2p systems
  • a reading seminar
  • I might do a p2p reading group / seminar later
  • will be focused more on reading papers
  • Unreasonably high expectations
  • this is my first grad class
  • it's in my research area
  • And
  • smaller class will benefit everyone
  • also, this class will likely be offered next year

9
What do you need to do?
  • A research-oriented class project
  • Paper reading / reviews
  • Lead class discussion on one research paper
  • Participate in discussions in class
  • this is meant to be an interactive class

10
Research Project
  • Investigate new ideas and solutions in class
    project
  • Define the problem
  • Execute the research
  • Work with your partners
  • Write up and present your research
  • Ideally, best projects will become conference or
    workshop papers
  • e.g., SIGCOMM, MOBICOM, SOSP/OSDI, NSDI
  • IPTPS, WCW, HotOS, WMCSA, NOSSDAV

11
Research Project Steps
  • I'll distribute a list of projects
  • You can either choose one of these projects or
    come up with your own
  • Pick your project, partner, and submit a one-page
    proposal describing
  • The problem you are solving
  • Your plan of attack with milestones and dates
  • Any special resources you may need
  • A midterm presentation of your progress (5
    minutes)
  • Poster session
  • Submit project papers

12
Paper Reviews
  • Goal: synthesize main ideas and concepts in
    papers
  • Number: 2-3 per class
  • Length: no more than ½ page per paper
  • Content
  • Main points intended by the author
  • Points you particularly liked/disliked
  • Other comments (writing, conclusions)
  • Submission
  • Submit each review via email before class on
    lecture day
  • See class web page for details

13
What is a Presentation?
  • Presentations are in PowerPoint
  • or PDF if you hate MSFT
  • Target a 30 minute presentation
  • that's about 15 slides
  • idea is to engage the class in discussion on your
    points
  • leave the audience with thoughtful questions
  • You'll be graded on several things
  • did you present the high-level points of the
    paper?
  • did you get a good discussion going?
  • did you relate the paper to others in the course
    (context!)

14
Choose Your Paper
  • Pick your paper from the lecture schedule online
  • the readings and dates will be finalized by
    Thursday
  • Email me to claim the paper
  • I'll let you know if it's already been taken
  • presenters' names will be updated online asap

15
About Grading
Paper reviews: 10%
Class presentations: 20%
Class Project: 70%
  • This is a graduate networking class
  • key is what you learn, not the grade
  • Evaluation of projects
  • 2 components: execution and impact
  • execution
  • proposal, design, implementation, evaluation,
    presentation
  • impact
  • how novel is your work, how does it advance the
    state of the art?
  • good execution → B/A-, impact → A/A+

16
Class Topics
  • Study existing p2p systems
  • Gnutella, Freenet, Tapestry, OceanStore, PAST,
    RON, I3, Vivaldi
  • Study fundamental system issues
  • security, fault-tolerance, system measurement and
    deployment
  • Study issues in alternative network environments
  • sensor networks
  • ad-hoc and mobile networks

17
Break
  • Take a break, get up, walk around, stretch
  • come back in 10 minutes

18
Course Projects
  • Small project teams
  • ideally teams of 3
  • 2 (likely too small to tackle significant
    project)
  • 4 (likely too large to coordinate well)
  • Choose your team members
  • use the class enrollment webpage (I'll put that
    up shortly)
  • use the class mailing list
  • Send me a one-page project proposal by Jan 15
  • what is the problem you're solving (w/ related
    works)
  • motivation/challenges (why is this important and
    new)
  • plan of attack (w/ milestones and dates)
  • any resources you might need

19
Background on Structured Overlays
  • Each data item and machine (node) in the system
    has an associated unique ID in a large ID space
  • ID space is partitioned among nodes
  • DHT: hash-table-like interface
  • put(id, data)
  • data = get(id)
  • each data item is stored at the node responsible
    for its ID
  • DOLR: directory service interface
  • publish (id)
  • routeToId (id, message)
  • data stored anywhere and located via directory
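
The DHT interface above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ToyDHT`, the helper `make_id`, the 32-bit ID space, and the rule "a key belongs to the first node ID at or after it on the ring" are all choices made for this example, not something the slides specify. It also keeps every node's storage in one process, whereas a real DHT would route each put and get across the network.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 32  # assumed size of the ID space for this sketch


def make_id(name: str) -> int:
    """Hash an arbitrary name into the ID space."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << ID_BITS)


class ToyDHT:
    """Single-process stand-in for a DHT: the ID space is partitioned among nodes."""

    def __init__(self, node_names):
        self.node_ids = sorted(make_id(n) for n in node_names)
        # One private key/value store per node.
        self.store = {nid: {} for nid in self.node_ids}

    def responsible_node(self, key_id: int) -> int:
        """Return the first node ID at or after key_id, wrapping around the ring."""
        i = bisect_left(self.node_ids, key_id)
        return self.node_ids[i % len(self.node_ids)]

    def put(self, key_id: int, data) -> None:
        self.store[self.responsible_node(key_id)][key_id] = data

    def get(self, key_id: int):
        return self.store[self.responsible_node(key_id)].get(key_id)
```

A caller would hash an object name with `make_id`, then call `put(key, data)` and later `get(key)`; both calls land on the same node because the partitioning rule is deterministic.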

20
Structured Peer-to-Peer Overlays
  • Assign random nodeIDs and keys from secure hash
  • incrementally route towards destination ID
  • each node has small set of outgoing routes, e.g.
    prefix routing

[Figure: prefix routing of a message addressed to ABCD; node labels A930, AB5F, ABC0, and ID ABCE]
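
A rough sketch of the two ideas on this slide: IDs drawn from a secure hash, and a hop that moves toward the destination ID. The function names, the 4-digit hex ID length, and the policy of preferring the neighbor that resolves the fewest extra digits are assumptions made for illustration; the slide does not say how a real protocol picks among candidates.

```python
import hashlib

DIGITS = 4  # hex digits per ID, matching the 4-digit labels in the figure


def node_id(name: str) -> str:
    """Derive a fixed-length hex ID from a secure hash (SHA-1) of a name."""
    return hashlib.sha1(name.encode()).hexdigest()[:DIGITS].upper()


def shared_prefix_len(a: str, b: str) -> int:
    """Count how many leading digits two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current: str, neighbors, dest: str):
    """Pick a neighbor matching dest in more leading digits than current.

    Returns None when no neighbor makes progress.
    """
    have = shared_prefix_len(current, dest)
    better = [n for n in neighbors if shared_prefix_len(n, dest) > have]
    if not better:
        return None
    # Assumed policy: take the smallest improvement, one digit at a time.
    return min(better, key=lambda n: shared_prefix_len(n, dest))
```

With the labels from the figure, A930, AB5F, and ABC0 share 1, 2, and 3 leading digits with the destination ABCD, so each of them is a step closer than the one before.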
21
What's in a Protocol?
  • Definition of name-proximity
  • each hop gets you closer to destination ID
  • prefix routing, numerical closeness, hamming
    distance
  • Size of routing table
  • amount of state kept by each node as f(N), N =
    network size
  • # of overlay routing hops
  • worst case routing performance (in overlay hops,
    not IP)
  • Network locality
  • does choice of neighbor consider network distance?
  • impact on actual performance of P2P routing
  • Application Interface
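
The three notions of name-proximity listed above can each be written as a distance between two integer IDs, where a smaller value means closer. This is a sketch only: the function names are invented here, and the prefix measure is expressed in bits rather than in base-b digits to keep it short.

```python
def ring_distance(a: int, b: int, bits: int) -> int:
    """Numerical closeness: clockwise distance from a to b on a ring of 2**bits IDs."""
    return (b - a) % (1 << bits)


def hamming_distance(a: int, b: int) -> int:
    """Hamming distance: number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def prefix_distance(a: int, b: int) -> int:
    """Prefix closeness: bits remaining after the shared leading bits.

    The XOR of two IDs is zero in every shared leading position, so its
    bit length is the number of trailing bits still to be matched.
    """
    return (a ^ b).bit_length()
```

A protocol satisfies "each hop gets you closer" if its chosen distance to the destination shrinks at every hop.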

22
Chord
  • NodeIDs are numbers on ring
  • Closeness defined by numerical proximity
  • Finger table
  • keep routes for next node 2^i away in namespace
  • routing table size: log_2 n
  • n = total # of nodes
  • Routing
  • iterative hops from source
  • at most log_2 n hops

[Figure: Chord ring over a 1024-ID namespace (node 0/1024), with nodes at 0, 128, 256, 384, 512, 640, 768, 896]
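
A minimal sketch of Chord-style finger tables and lookup, using the node positions shown on the ring (0, 128, ..., 896 in a 1024-ID namespace). It assumes a static ring and gives every node global knowledge of membership when building fingers; real Chord builds and repairs fingers through its join and stabilization protocol, which is not modeled. The hop count here includes the final delivery to the owner, so it can come out one higher than a count of finger hops alone.

```python
from bisect import bisect_left

M = 10                                   # ID bits: 2**10 = 1024 IDs
SIZE = 1 << M
NODES = [0, 128, 256, 384, 512, 640, 768, 896]


def successor(ident: int) -> int:
    """First node at or clockwise after ident."""
    i = bisect_left(NODES, ident % SIZE)
    return NODES[i % len(NODES)]


def finger_table(n: int):
    """Finger i points at the first node at least 2**i past n."""
    return [successor(n + (1 << i)) for i in range(M)]


def lookup(start: int, key: int):
    """Route from node `start` toward `key`; return (owner, hops)."""
    node, hops = start, 0
    while True:
        dist = (key - node) % SIZE
        if dist == 0:                    # key equals this node's ID
            return node, hops
        succ = successor(node + 1)
        span = (succ - node) % SIZE or SIZE
        if dist <= span:                 # key in (node, succ]: succ owns it
            return succ, hops + 1
        # Otherwise jump to the farthest finger that does not pass the key.
        for f in reversed(finger_table(node)):
            if 0 < (f - node) % SIZE < dist:
                node = f
                break
        hops += 1
```

For example, `lookup(0, 900)` visits 512, 768, and 896 before delivering to node 0, which owns key 900 because it is the first node at or after 900 on the ring.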
23
Chord II
  • Pros
  • simplicity
  • Cons
  • limited flexibility in routing
  • neighbor choices unrelated to network proximity
    but can be optimized over time
  • Application Interface
  • distributed hash table (DHash)

24
Tapestry / Pastry
  • incremental prefix routing
  • 1111 → 0XXX → 00XX → 000X → 0000
  • routing table
  • keep nodes matching at least i digits to
    destination
  • table size: b · log_b n
  • routing
  • recursive routing from source
  • at most log_b n hops

[Figure: ring over a 1024-ID namespace (node 0/1024), with nodes at 0, 128, 256, 384, 512, 640, 768, 896]
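
The incremental prefix routing above can be sketched as follows. Several things here are assumptions made for the example: the node set, the function names, and building each routing table from global membership. In particular, which matching node fills a table entry is arbitrary in this sketch (the last one in sorted order), whereas Tapestry and Pastry choose entries by network proximity.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Count how many leading digits two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def routing_table(node: str, nodes):
    """Row i maps a digit d to some node that shares node's first i digits
    and has digit d in position i. At most (base x digits) entries."""
    table = []
    for i in range(len(node)):
        row = {}
        for other in sorted(nodes):
            if other[:i] == node[:i]:
                row[other[i]] = other    # arbitrary pick: last match wins
        table.append(row)
    return table


def route(src: str, dst: str, nodes):
    """Forward hop by hop, matching at least one more digit of dst each time."""
    path, cur = [src], src
    while cur != dst:
        i = shared_prefix_len(cur, dst)
        nxt = routing_table(cur, nodes)[i].get(dst[i])
        if nxt is None:
            break   # no such node; real protocols fall back to other rules
        cur = nxt
        path.append(cur)
    return path


NODES = ["0000", "0001", "0011", "0111", "1111"]   # assumed example nodes
```

Here `route("1111", "0000", NODES)` yields 1111, 0111, 0011, 0001, 0000, matching the 1111 → 0XXX → 00XX → 000X → 0000 pattern, and the hop count never exceeds the number of digits in an ID.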
25
Routing in Detail
Example: octal digits, 2^12 namespace, 2175 → 0157
[Figure: routing example; node labels 5712, 0880, 3210, 4510, 7510]
26
Project Ideas
  • Discuss 10 project suggestions
  • some related to peer-to-peer protocols and
    systems
  • Legend: based on how well-defined projects are,
    not necessarily how difficult they are
  • Well-defined projects (5)
  • Less-defined project (3)
  • You need to define project goals (2)
  • You can also work on variants of these, or make
    up your own topics
  • Need to send me a one page proposal by Jan 15
  • I'll provide feedback/meetings to iterate on topic

27
Project 1: Implement Chimera
  • Implement a structured peer-to-peer protocol that
    is a hybrid of Tapestry, Pastry, and Bamboo
  • implement in C/C++
  • focus on usability and performance
  • Key features
  • simplicity (leafsets like Pastry)
  • stability (p2p exchange of routing tables like
    Bamboo)
  • performance (locality like Tapestry)
  • support standard p2p API and be easy to build
    upon
  • might support multiple teams

28
Project 2: the DIY P2P protocol
  • Study the related literature on existing p2p
    protocols, and design something different
  • must have something significantly novel/new
  • can be new contribution in performance, security,
    theoretical interest, specialized for a useful
    application
  • The challenge
  • finding an area yet unexplored in the p2p space
  • The result
  • well-specified protocols on routing, node
    insertion/deletion, failure handling
  • Warning: this is not easy

29
Projects 3-5: P2P Applications/Systems
  • All applications built on common API
  • should work w/ existing protocols and new ones
    developed in class
  • Project 3: a lightweight p2p CDN/web-cache
  • use a DHT/DOLR to quickly search for desired
    object
  • use request tracking to determine optimal data
    placement
  • implement a client-side http proxy
  • The challenge
  • maintaining data stability as nodes come and go
  • providing fairness / load balancing across nodes

30
Project 4: P2P eBay
  • Design and implement a scalable and secure
    auctioning / e-commerce system on a p2p
    infrastructure
  • support large # of transactions/time
  • support large # of users and items
  • be stable and resilient against attacks
  • The challenge
  • understanding the types of attacks
  • selecting a reasonable subset to resist
  • maintaining scalability despite security
    enhancements

31
Project 5: Lightweight Data Synchronization
  • Design and implement Quartz
  • lightweight p2p data sharing system
  • store your most critical files (<100MB) online
  • use simple application-specific handlers to
    provide fast data synchronization (a la CVS)
  • synchronize your HTML bookmarks across machines
  • synchronize your papers, homework files,
    financial records
  • end to end encryption
  • The challenge
  • keeping data highly-persistent on a dynamic p2p
    network
  • optimizing per-node operational overhead
  • a simple user interface that people will
    actually use

32
Project 6: P2P Security
  • Security is one of the biggest challenges in p2p
  • large population spread across networks and
    domains
  • high probability of node compromise
  • need to function in presence of malicious nodes
  • Existing work on security is discouraging
  • Sybil attacks hard to prevent
  • bad users can create lots of identities online
    and generate collusion attacks
  • hard to avoid malicious nodes
  • you're 1, they are many

33
Project 6: a P2P reputation system
  • Design and evaluate a highly-adaptive p2p
    reputation system
  • quickly recognize malicious (or compromised)
    nodes
  • use p2p collaboration to share reputation
    information
  • form trusted circles
  • use third-party anonymous verification to build
    reputations
  • The challenge
  • building reliable reputations that are highly
    adaptable
  • doing so without generating massive network probe
    traffic
  • minimizing impact if system is circumvented

34
Project 7: Distributed Workload Logging
  • ModelNet and Emulab provide emulation
    environments for up to 1000s of virtual nodes
  • good for reproducible experiments
  • bad: runs in a dedicated cluster, unlike real-world
    conditions
  • PlanetLab provides real-world platform for
    distributed applications (200 nodes across
    Inet)
  • good for deploying / hardening your
    application/system
  • bad for reproducible results
  • What we need
  • a way to gather interesting (and unexpected)
    environmental conditions from distributed
    networks like PlanetLab, and use it as a trace to
    drive ModelNet/Emulab

35
Project 7 (cont'd)
  • The goal
  • design, implement, and deploy a distributed
    sensor and logging interface
  • needs to be scalable, lightweight (minimize
    impact on environment), and accurate
  • gather a large data set, and build an
    environmental trace

36
Project 8: P2P Benchmarking
  • Design and test a benchmark for P2P routing
  • needs to account for a wide range of
    application-level behaviors and working
    conditions
  • throughput
  • latency
  • key lookup versus data read/store
  • rate of node churn
  • The challenge
  • being comprehensive
  • being fair to goals of different protocols

37
Project 9: Quantifying P2P Performance
  • Test a set of protocols against a number of
    network topologies
  • mostly simulations (might need to implement some
    protocols)
  • test using different types of network models,
    GT-ITM, BRITE, Waxman, etc
  • vary protocol parameters to measure impact on
    basic performance
  • The challenge
  • making sense of large quantities of test data
  • getting a better understanding of inherent
    properties of network models and how they impact
    overlay networks

38
Project 10: P2P Ad-hoc
  • Take p2p routing algorithms, and apply to the
    ad-hoc or sensor net space
  • consider new constraints: power, processing, no
    underlying IP
  • design new routing protocol
  • validate via simulation
  • The challenge
  • there's fairly little previous work, and it's not
    clear the idea is feasible

39
Support Infrastructure
  • CSIL machines
  • you should all have accounts
  • A new 32 node networking cluster
  • just assembled, being installed now
  • Dell Xeon servers, 2.4 GHz, 2 GB RAM
  • cluster building in progress
  • will have ModelNet installed, and support
    large-scale network emulation tests (300 nodes)

40
Summary of TODOs
  • What you need to do
  • send me registration email with subject "cs290f
    registration"
  • sign up online for class mailing list:
    http://lists.cs.ucsb.edu/mailman/listinfo/cs290f
  • pick a paper to present and email me to claim
    it (after Thur, first come first served)
  • find other team members and pick a project
    topic (first draft of proposal due Jan 15)