Introduction to Distributed Programming - PowerPoint PPT Presentation

Provided by: ozo

1
Introduction to Distributed Programming
  • Per Brand

2
Introduction
  • Global distributed computing needs an
    infrastructure.
  • The Internet provides the first steps towards
    global distributed applications:
  • a global namespace (URLs)
  • a global communications protocol (TCP/IP).
  • Platforms such as Java and CORBA that take
    advantage of this infrastructure have become
    widely-used.
  • Distributed programming is still difficult.
  • Writing efficient, open, and robust distributed
    applications remains much harder than writing
    centralized applications.
  • Making them secure increases the difficulty by
    another quantum leap.

3
What are the properties of global distributed
systems?
  • A distributed system is a set of processes, linked
    by a network
  • No global information, no global time
  • Unpredictable communication delays
  • Concurrency and nondeterminism
  • Large probability of localized faults
  • Easy access by unauthorized users
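
One common way to live without global time is a logical clock. The sketch below is an illustration only and is not taken from the slides: the class name LamportClock and its methods are hypothetical, and it orders events by counters instead of wall-clock time.

```python
class LamportClock:
    """Logical clock: orders events without any shared wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event: advance the local counter."""
        self.time += 1
        return self.time

    def send(self):
        """Stamp an outgoing message with the current logical time."""
        return self.tick()

    def receive(self, message_time):
        """Merge the sender's timestamp so the receive is ordered after the send."""
        self.time = max(self.time, message_time) + 1
        return self.time


# Two processes with independent clocks exchange one message.
a, b = LamportClock(), LamportClock()
a.tick()                    # a local event on process a
stamp = a.send()            # a sends a message carrying its logical time
b_time = b.receive(stamp)   # b orders the receive after the send
```

The only guarantee is that a receive gets a larger timestamp than the matching send; nothing here measures real time or delay.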

4
Additional Properties of the Internet
  • A global network that is partitioned into several
    protection domains (Firewalls)
  • Private sub networks with multiple reassignment
    of IP addresses across networks
  • Dynamic reassignment of IP addresses -- ISPs
    reuse a pool of IP addresses among customers

5
The issues in distributed programming
Classical problems of software engineering (code
reuse, maintainability, etc.) are all here.
[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability; labels: "Interaction", "Part of problem"]
6
Distributed Programming
  • Centralized programming
  • difficult enough
  • research and development for 50 years
  • still ongoing
  • Distributed programming
  • in general much more difficult
  • why??

7
Adding/changing distribution
[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability; annotated with the examples below]
  • E.g. new security considerations
  • E.g. RMI semantics
  • E.g. new kinds of failure
8
Adding/changing distribution -2
[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability; annotated with the example below]
  • E.g. recovery changes
9
Adding/changing distribution -3
[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability; annotated with the examples below]
  • E.g. security in recovery
  • E.g. functional operations on entities mixed
    with error recovery
  • E.g. persistence/error recovery consume resources
10
Adding/changing distribution -4
[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability; annotated with the notes below]
  • E.g. further subdivision of tasks
  • Largest problem: we keep needing to come back here
11
Adding/incrementing openness
Example: allow users to share with their buddies
(programs, games, virtual community)
[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability; annotated with the examples below]
  • E.g. more potential security problems
  • E.g. resource use more unpredictable
  • E.g. more kinds of failure
12
Adding/incrementing openness - 2
[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability; annotated with the examples below]
  • E.g. resource control code mixed with
    functional code
  • E.g. resource overuse: new kinds of faults
  • E.g. resource control consumes resources
13
Levels of Difficulty-1
  • Client-Server Applications
  • Most Internet applications are still of this type
  • Client/server interface very limited and
    controlled
  • HTTP
  • forms
  • Little fault-tolerance beyond classical database
    transactions on server-side
  • In the controlled server environment, issues of
    openness, security, and resource-control hardly
    apply
  • Fixed and simple distribution
  • Scalability is an issue: if you can't buy a bigger
    server then ...

[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability]
14
Levels of Difficulty - 2
  • Client side
  • Security (mobile code)
  • Resource control
  • memory/cpu
  • Aspects orthogonal to those on the server side

[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability]
15
Levels of Difficulty-3
  • Server Clusters
  • Distribution and Fault-tolerance within the
    cluster
  • Fault-tolerance simplified by the fact that there
    is no network partitioning within the cluster
  • Distribution simplified by uniformity of cluster
    - latencies can almost be ignored.
  • In the controlled server environment, issues of
    openness, security, and resource-control hardly
    apply.

[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability]
16
Levels of Difficulty-4
  • Multi-tier server architectures
  • Fault-tolerance between tiers/clusters, i.e.
    distributed transactions
  • Latencies important, alternative service
    providers
  • In the controlled server environment, issues of
    openness and resource-control hardly apply.
  • Security considerations are lesser because of the
    lack of openness

[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability]
17
Levels of Difficulty-5
  • Virtual Community
  • End-users add services to a shared environment
  • Openness with security is essential
  • Resource control important - mobile code

[Diagram: Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, Scalability]
18
Distributed Programming Platform - DPP
  • DPPs
  • language/tools/implementation aimed at giving
    the developer of distributed applications what
    they need
  • general-purpose programming system
  • more than just a centralized programming system
  • subsumes a centralized programming system

19
Groping for DPPs
  • RPC
  • Java and offshoots
  • Original and Pure Java - sharing code across the
    net
  • RMI (based on RPC)
  • Enterprise JavaBeans (within a cluster)
  • Object Voyager
  • Continually evolving
  • often because of shortcomings in previous version
    (e.g. security manager in Java 1.1 vs 1.2)
  • CORBA (for interoperability too)
  • Erlang
  • E-language (system)
  • Mozart
  • What is the common element ??
  • What is missing??

20
How to answer these Questions
  • Present a vision of what DPP should be
  • DPP provides 3 basic properties
  • The 3 basic properties are not new, only the
    context - analogies with programming languages
    used
  • Examining current tools
  • See how they partly fulfill these goals
  • Show they fall short.
  • Our view - we are at the beginning of DPP
    development

21
DPP for distributed global applications
  • The DPP abstracts the complexity of the
    underlying system of connected computers
  • Provides transparency/hiding (network and
    location) as much as possible or as much as
    desirable.
  • Provides awareness - i.e. models the aspects of
    distribution that affect
  • performance (e.g. latency)
  • reliability (e.g. partial failure)
  • Provides control for tuning application with
    respect to fundamental tradeoffs in distributed
    systems
  • e.g. consistency protocol for state

[Diagram layers: The applications; The DPP runtime; Connected Computers; The Network]
22
Transparent View
The network and individual computers are
abstracted away. The programmer sees a global
computation space.
[Diagram layers: Application; DPP; Machine, Machine, Machine; Communication Medium]
23
Awareness View
  • Fundamental aspects of distribution presented to
    the programmer as
  • abstractly and
  • simply as possible
  • without losing necessary information

[Diagram layers: Application; DPP, DPP, DPP; Middleware; Machine, Machine, Machine; Communication Medium]
24
Control View
  • The necessary control to tune performance is
    available.
  • Litmus test: it should not be possible to improve
    performance by much by removing the middleware
    and implementing at a lower level.
  • Compare high-level languages and assembler.
[Diagram layers: Application; DPP, DPP, DPP; Middleware; Machine, Machine, Machine; Communication Medium]
25
The Three Principles in Programming Languages
  • Transparency/hiding
  • Program constructs hide or make transparent
  • memory locations
  • actual machine instructions
  • hardware architecture
  • E.g. iteration and recursion in C
  • Awareness
  • Programmers have a mental model of performance
    for logically-equivalent program constructs
  • E.g. Iteration gives better performance by orders
    of magnitude
  • Control
  • So basic that we forget this.
  • Consider a C, compiled as it is today, that only
    provided recursion.
  • Slower by many orders of magnitude (memory
    consumption increases)
  • Litmus test fails - the programmer would program
    in assembler instead
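
To make the awareness point concrete, here is a small sketch of two logically equivalent constructs that a programmer would expect to cost different amounts. It is written in Python only for brevity (the slide speaks about C), and the function names are made up for illustration.

```python
def sum_iterative(n):
    """Sum 1..n with a loop: constant stack depth."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_recursive(n):
    """Sum 1..n by recursion: one stack frame per step."""
    if n == 0:
        return 0
    return n + sum_recursive(n - 1)


# Both constructs compute the same value; they differ only in cost.
same_result = sum_iterative(100) == sum_recursive(100)
```

The two functions are interchangeable as far as results go, but the recursive one uses stack space proportional to n, and in CPython it fails with RecursionError once n exceeds the interpreter's recursion limit. That difference in the mental cost model is what the slide calls awareness.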

26
DPP in the broadest sense
  • Across the entire network, i.e. not just for
    server cluster architecture
  • Clients, between clusters, between clusters that
    cross administrative boundaries, even devices.
  • General-purpose
  • For all types of applications
  • Compare general-purpose programming languages
    with domain-specific ones

27
DPPs and programming languages
  • What is the relationship between DPP and
    programming languages?
  • DPP is not another word for programming language
  • A DPP subsumes, extends, and adds a new dimension
    to programming languages
  • Traditionally programming languages are an
    abstraction of a single machine.
  • A DPP abstracts over a set of connected machines
  • still includes a set of one (a single machine)
  • still includes basic computation - for
    functionality
  • it is natural to base DPPs on an existing
    programming language (no reinventing the wheel)

28
Extension
  • DPPs introduce many more abstractions that are
    not needed in centralized programming languages,
    e.g.
  • Failure - a shared object may fail due to network
    partitioning, crash of another site, etc.
  • At the very least new exceptions
  • For sophisticated fault-tolerance need to couple
    error recovery to object.
  • Resource control - imported code
  • Execute procedure with specified resource limits
  • Scalability - moving computations
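
As a rough sketch of the first two extensions, the code below shows a new exception type for site failure, recovery code coupled to the shared object, and a procedure run under a limit. Everything here is hypothetical and not from the slides: SiteFailure, SharedObjectProxy, read_with_recovery and run_with_time_limit are illustrative names, the "remote" object is simulated locally, and the limit covers only how long the caller waits (it does not bound CPU or memory, and the abandoned procedure keeps running in its thread).

```python
import concurrent.futures
import time


class SiteFailure(Exception):
    """Hypothetical new exception: the remote site or the network failed."""


class SharedObjectProxy:
    """Hypothetical stand-in for a reference to an object on another site."""

    def __init__(self, value, reachable=True):
        self._value = value
        self.reachable = reachable

    def read(self):
        # A real platform would detect a crash or partition here.
        if not self.reachable:
            raise SiteFailure("remote site unreachable")
        return self._value


def read_with_recovery(proxy, fallback):
    """Couple error recovery to the object: fall back when the site fails."""
    try:
        return proxy.read()
    except SiteFailure:
        return fallback


def run_with_time_limit(procedure, seconds):
    """Run a procedure, but stop waiting for it after `seconds`."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(procedure)
    try:
        return future.result(timeout=seconds)
    finally:
        pool.shutdown(wait=False)


# Usage: a procedure that overruns its limit is abandoned by the caller.
try:
    run_with_time_limit(lambda: time.sleep(0.2), 0.05)
    timed_out = False
except concurrent.futures.TimeoutError:
    timed_out = True
```

Real resource control for imported code would need enforcement by the platform itself; a caller-side timeout like this one only illustrates the shape of the interface.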

29
New Dimension
  • For awareness and control, DPPs may need to make
    distinctions between program constructs. The
    programmer may find these distinctions
  • new
  • artificial
  • unnatural and burdensome
  • Example - object (shared object)
  • Choice of consistency protocol - the best choice
    for performance is application dependent.
  • Three fundamental types as developed in
    distributed systems
  • stationary
  • mobile - with token protocol
  • mobile - with invalidation protocol
  • To fulfill control goals need all 3 kinds.
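
The sketch below is a toy cost model of the three kinds of shared object, meant only to show why the best protocol depends on the application. It is an assumption-laden simplification and not any platform's real protocol: the class names, the cost of two messages per remote interaction, and the workloads are all invented for illustration.

```python
class StationaryCell:
    """State stays at its home site; every remote access is a round trip."""

    def __init__(self, home, value):
        self.home, self.value, self.messages = home, value, 0

    def read(self, site):
        if site != self.home:
            self.messages += 2  # request + reply
        return self.value

    def write(self, site, value):
        if site != self.home:
            self.messages += 2  # request + acknowledgement
        self.value = value


class TokenCell:
    """State migrates to whichever site holds the token; then access is local."""

    def __init__(self, home, value):
        self.holder, self.value, self.messages = home, value, 0

    def _acquire(self, site):
        if site != self.holder:
            self.messages += 2  # token request + state transfer
            self.holder = site

    def read(self, site):
        self._acquire(site)
        return self.value

    def write(self, site, value):
        self._acquire(site)
        self.value = value


class InvalidationCell:
    """Reads are served from cached copies; a write invalidates other copies."""

    def __init__(self, home, value):
        self.copies, self.value, self.messages = {home}, value, 0

    def read(self, site):
        if site not in self.copies:
            self.messages += 2  # fetch a copy from a current holder
            self.copies.add(site)
        return self.value

    def write(self, site, value):
        others = self.copies - {site}
        self.messages += 2 * len(others)  # invalidate + ack per other copy
        self.copies = {site}
        self.value = value


def cost(cell, workload):
    """Replay (operation, site) pairs and return the message count."""
    for op, site in workload:
        if op == "r":
            cell.read(site)
        else:
            cell.write(site, 0)
    return cell.messages


write_heavy = [("w", "B")] * 5               # one remote site updates repeatedly
read_heavy = [("r", "B"), ("r", "C")] * 5    # two remote sites read alternately
```

Under this model, calling cost(TokenCell("A", 0), write_heavy) is far cheaper than the stationary variant, while on read_heavy the token bounces between sites and the invalidation variant is cheapest. No single protocol wins on both workloads, which is the reason for offering all three.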

30
New Dimension -2
  • The burden of the new distinctions depends on the
    programming language that the middleware is based
    upon.
  • Example - object (shared object)
  • Stateful vs. stateless (in pure object-oriented
    languages) - for efficiency across the network
    the platform needs to know that information is
    stateless.
  • Stateless information can be replicated across
    the net
  • No consistency protocol
  • No infrastructure for consistency protocol.
  • Synchronous vs. asynchronous
  • New dimension: latency.
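
The synchronous vs. asynchronous distinction can be sketched as follows. This is a simulation under stated assumptions, not a real remote invocation: remote_call is a hypothetical operation, LATENCY is an invented delay, and a sleep stands in for the network.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.05  # assumed round-trip network delay, in seconds


def remote_call(x):
    """Hypothetical remote operation: the sleep stands in for network latency."""
    time.sleep(LATENCY)
    return x * 2


def synchronous(values):
    """Each call blocks until its reply arrives, so latencies add up."""
    return [remote_call(v) for v in values]


def asynchronous(values):
    """All requests are issued first; replies are collected afterwards."""
    with ThreadPoolExecutor(max_workers=len(values)) as pool:
        futures = [pool.submit(remote_call, v) for v in values]
        return [f.result() for f in futures]


def timed(fn, values):
    """Return the function's result together with its elapsed wall-clock time."""
    start = time.perf_counter()
    result = fn(values)
    return result, time.perf_counter() - start


sync_result, sync_time = timed(synchronous, [1, 2, 3, 4])
async_result, async_time = timed(asynchronous, [1, 2, 3, 4])
```

Both versions return the same values; only the elapsed time differs, because the asynchronous version overlaps the waiting. In a centralized program the two forms would be nearly indistinguishable, which is why latency counts as a new dimension.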

31
Minimality
  • A distributed programming language should also be
    as similar to a programming language as possible
  • without losing awareness and control!
  • Minimal extensions and minimal new dimensions.

32
The goal of a DPP: separation of aspects
[Diagram: the aspects Distribution, Security, Openness, Functionality, Resource Control, Fault tolerance, and Scalability, each shown twice]