Topics in Distributed Systems - PowerPoint PPT Presentation

Provided by: adrianai

Transcript and Presenter's Notes

Title: Topics in Distributed Systems


1
  • Topics in Distributed Systems
  • OR
  • Having Fun with Systems
  • (Session 1)
  • COP4600, October 13, 2008

2
Examples of Distributed Systems
AT&T web
Gnutella network
The Internet
A Sensor Network
3
Definition (a version)
  • A distributed system is a collection of
    autonomous, programmable, failure-prone entities
    that are able to communicate through a
    communication medium that is unreliable.
  • Entity = a process on a device (PC, PDA, mote)
  • Communication Medium = Wired or wireless network
  • Internet-Scale
  • Spanning multiple institutional or network
    (DNS) domains
  • (Much) Larger than cluster
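The definition above says the communication medium is unreliable. A minimal sketch of what that means for a sender, assuming a simulated channel that randomly drops messages and a sender that simply retries. The names `LossyChannel` and `send_with_retries` are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import random


class LossyChannel:
    """Hypothetical stand-in for an unreliable medium.

    Each message is dropped with probability `loss_probability`.
    """

    def __init__(self, loss_probability, seed=0):
        self.loss_probability = loss_probability
        self.rng = random.Random(seed)

    def send(self, message):
        """Return the message if it was delivered, or None if it was lost."""
        if self.rng.random() < self.loss_probability:
            return None
        return message


def send_with_retries(channel, message, max_attempts=10):
    """Resend until one copy gets through.

    Returns (delivered, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        if channel.send(message) is not None:
            return True, attempt
    return False, max_attempts


if __name__ == "__main__":
    channel = LossyChannel(loss_probability=0.3, seed=42)
    print(send_with_retries(channel, "hello"))
```

Retrying hides message loss only up to a point: after `max_attempts` the sender still cannot tell a dead receiver from a very lossy link, which is one reason failure-prone entities and unreliable media appear together in the definition.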

4
Real-life Distributed Systems Technologies
  • Volunteer computing: SETI@home
  • P2P systems: Napster, Gnutella (history),
    BitTorrent, etc.
  • Grid computing
  • Utility/Cloud computing
  • Amazon's Elastic Compute Cloud (EC2), Simple
    Storage Service (S3), etc.
  • Human distributed computing
  • Amazon's Mechanical Turk
  • CAPTCHA/reCAPTCHA
  • Mobile/ubiquitous computing

6
SETI@home: http://setiathome.berkeley.edu/
8
SETI@home Operations
data recorder
9
How does it work?
SETI@home
Master-worker architecture
  • Fixed-rate data processing task
  • Low bandwidth/computation ratio
  • Independent parallelism
  • Error tolerance
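The master-worker pattern with independent parallelism can be sketched as follows. This is a toy illustration, not the SETI@home code: the master splits recorded samples into fixed-size work units, workers analyze each unit on their own, and the master collects the results. Local threads stand in for volunteer machines, and `analyze_unit` is a hypothetical placeholder for the real signal analysis.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(samples, unit_size):
    """Master side: cut the recorded data into fixed-size work units."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]


def analyze_unit(unit):
    """Worker side: hypothetical analysis, here just the peak value in the unit."""
    return max(unit)


def run_master(samples, unit_size=4, workers=3):
    """Hand every unit to a worker and gather the results in unit order."""
    units = split_into_work_units(samples, unit_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Units share no state, so they can be processed in any order.
        return list(pool.map(analyze_unit, units))


if __name__ == "__main__":
    print(run_master(list(range(10)), unit_size=4))  # [3, 7, 9]
```

Because no unit depends on another, a lost or late result can simply be reissued to a different worker, and each worker only exchanges a small unit and a small result with the master.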

10
History and Statistics
  • Conceived 1995, launched April 1999
  • A scientific experiment that uses
    Internet-connected computers in the Search for
    Extraterrestrial Intelligence (SETI). You can
    participate by running a free program that
    downloads and analyzes radio telescope data.
  • No ET signals yet, but other results

11
Volunteer computing
  • Also called public-resource computing
  • Utilizes idle computing cycles over the Internet
  • Other systems
  • Original: GIMPS, distributed.net
  • Commercial: United Devices, Entropia, Porivo,
    Popular Power
  • Academic, open-source: Cosm, Folding@home

12
None has the popularity of SETI!
  • ET
  • How to get and retain users (from David Anderson,
    the leader of the SETI@home project)
  • Graphics are important (but monitors do burn in)
  • Teams: users recruit other users
  • Keep users informed
  • Science news
  • System management news
  • Periodic project emails
  • Reward users
  • PDF certificates
  • Milestone pages and emails
  • Leader boards (overall, country, ...)

13
Millions and millions of computers! (Problems)
  • Server scalability
  • Dealing with excess CPU time
  • Cheating
  • Bad behavior
  • Team recruitment by spam
  • Sale of accounts on eBay
  • Malfunctions
  • Network bandwidth costs money
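One widely used defense against cheating and malfunctions in volunteer computing is redundancy: the same work unit goes to several volunteers, and a result is accepted only when enough of them agree. The slides do not say which defenses were used, so the sketch below is an assumption-laden illustration; `validate_work_unit` and the `quorum` parameter are hypothetical names.

```python
from collections import Counter


def validate_work_unit(reported_results, quorum=2):
    """Accept the most common result if at least `quorum` volunteers reported it.

    Returns the accepted result, or None if no result reached the quorum
    (in which case the master would reissue the work unit).
    """
    if not reported_results:
        return None
    value, votes = Counter(reported_results).most_common(1)[0]
    return value if votes >= quorum else None


if __name__ == "__main__":
    print(validate_work_unit([7, 7, 9]))  # 7: two volunteers agree
    print(validate_work_unit([7, 9]))     # None: no agreement yet
```

The price is that every work unit is computed more than once, which is one practical use for the excess CPU time listed above.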

14
SETI@home Summary
  • Master-worker design
  • Centralized solution
  • Master = central point of control
  • Single point of failure
  • Performance bottleneck
  • Incentives for participation
  • Sometimes mean incentives for cheating
  • Massive (embarrassing) parallelism
  • Low bandwidth/computation ratio
  • Users do donate real resources: $1.5M / year in
    consumed power
  • More information: http://setiathome.ssl.berkeley.edu

15
Human Distributed Computing
16
CAPTCHA
  • CAPTCHA: a program that tells whether a user is a
    human or a computer.
  • Given distorted, colored text
  • Protects from bots
  • 60 million CAPTCHAs are solved by humans around
    the world every day
  • Ten seconds of human time per puzzle
  • Why trust the answers to the puzzles?
  • Fixed number of puzzles: bots can learn them
  • Dynamic: how to verify?
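The two figures above (60 million CAPTCHAs per day, ten seconds each) imply a large daily pool of human effort. A quick worked calculation, using only those two numbers:

```python
captchas_per_day = 60_000_000   # from the slide
seconds_per_captcha = 10        # from the slide

total_seconds = captchas_per_day * seconds_per_captcha
total_hours = total_seconds / 3600

print(f"{total_hours:,.0f} hours of human time per day")  # 166,667 hours of human time per day
```

That is well over 150,000 hours of human attention every day.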

17
reCAPTCHA
  • reCAPTCHA is a free CAPTCHA service that helps to
    digitize books.
  • Books from the Internet Archive
  • Old editions of the New York Times.
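A rough sketch of how a service of this kind can trust answers to words nobody has transcribed yet: each challenge pairs a control word whose answer is known with an unknown scanned word. Only users who get the control word right have their reading of the unknown word counted, and a transcription is accepted once enough users agree. The function names and the `min_votes` threshold are hypothetical; this illustrates the idea and is not the service's actual implementation.

```python
from collections import Counter


def normalize(text):
    """Compare answers case-insensitively and ignore surrounding whitespace."""
    return text.strip().lower()


def grade_response(control_answer, typed_control, typed_unknown, votes):
    """Pass the user only if the control word is right.

    On a pass, the user's reading of the unknown word is recorded as one vote.
    """
    if normalize(typed_control) != normalize(control_answer):
        return False
    votes[normalize(typed_unknown)] += 1
    return True


def accepted_transcription(votes, min_votes=3):
    """Return the leading reading once it has at least `min_votes` votes."""
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


if __name__ == "__main__":
    votes = Counter()
    for control, unknown in [("morning", "upon"), ("Morning", "upon"),
                             ("mourning", "spam"), ("morning ", "Upon")]:
        grade_response("morning", control, unknown, votes)
    print(accepted_transcription(votes))  # upon
```

The control word does the bot filtering, while agreement among humans does the digitizing.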

18
Amazon's Mechanical Turk
  • Artificial Artificial Intelligence
  • Named after a famous hoax:
  • A chess-playing machine in the late 18th century
  • Shown to be a hoax (a human inside the machine
    was operating it)
  • http://aws.amazon.com/mturk/

19
Practical Matters
  • REU program with funding available
  • Good for
  • Learning about what you want to do
  • Learning about research
  • Building experience working with people
  • http://www.csee.usf.edu/
  • RESEARCH -- REU program
  • Projects available for summer and during the year