Title: Distributed Systems
1Distributed Systems
2Introduction to Naming Services
- Names are used to refer to a wide variety of
resources: computers, services, ports, objects and users
- Fundamental in distributed system design
- Needed to
- Communicate
- Share resources in a distributed system
- Users cannot communicate with one another via a
distributed system unless they can name one
another
3..Intro to Naming Services
- Name services in a distributed system
- Provide clients with data about named objects.
- Describes approaches to be taken in the design
and implementation of name services
- Names used in a distributed system are specific
to some particular service
- Names are needed to refer to entities that are
beyond the scope of any single service.
4Names
- Names are used in computer systems to allow a
process to access resources such as entities,
locations on a network, other processes, etc.
- A naming system is required to manage these
names. Names can be critically important.
Names that have meaning to humans can help to
make a system self-documenting and easier to
maintain.
- Distributed systems need a naming system that
works across multiple systems. Many named
entities may have aliases, making unique
identification difficult.
5Naming Entities
- A name is usually a string of characters that is
used to refer to an entity (which can be many
things): an attribute, a host, a file, a printer, a
process or function, an object.
- The identifier can also be a binary number.
- In order to use any entity, we need an access
point, usually an address.
- An entity can change access points over time,
such as a process that moves to a different host,
or a mobile computer that logs on from a
different location.
6Global and Local Names
- Names are always organized in a name space.
- Thus, all names are defined relative to a
directory node.
- An absolute name can be a confusing term, as it
is defined within the name space.
- A global name is a name that always refers to the
same entity anywhere in the system.
- A local name is a name whose interpretation
depends on where it is used.
- There might be another entity with the same name
elsewhere in the system.
7Addresses
- It may seem logical to refer to an entity by an
address, but it is seldom done.
- If the address changes, you lose contact with the
entity.
- If an entity has more than one access point,
which address do you use?
- Therefore, addresses are treated as a special
type of name, usually with references to connect
them to identifiers.
- A name that is separate from its addresses is
said to be location independent.
8Identifiers
- An identifier is a special form of name such that
- it refers to at most one entity
- each entity has at most one identifier
- an identifier is never reused, thus always refers
to the same entity.
- Identifiers make it much easier to refer to an
entity without ambiguity.
9Human Friendly Names
- Names that are intended to be used by people are
called human friendly names, in contrast to
identifiers intended for machine use.
- Sometimes there are different identifiers for
the same entity.
- A network host might have a human friendly
hostname like gwcregmore.gmit.ie but an IP
address of 127.168.7.3, which the machine handles
as a binary number.
- Human friendly names can be important for users
who need to use and maintain the system.
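To make the "binary number" point concrete, here is a minimal Python sketch using the standard `ipaddress` module. It takes the dotted address quoted above and shows the 32-bit number behind it; the variable names are only illustrative.

```python
import ipaddress

# The dotted-quad address quoted in the slide above.
addr = ipaddress.IPv4Address("127.168.7.3")

as_int = int(addr)                 # the 32-bit number behind the dotted form
as_bits = format(as_int, "032b")   # the same number written out in binary

print(addr, "->", as_int)
print(as_bits)
```

The dotted form and the binary form name the same address; only the first is meant for people.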
10Name Spaces
- Names in a distributed system are usually
assigned to a name space.
- This can be represented as a labeled, directed
graph with two types of nodes.
- A leaf node represents an entity and has no
outgoing edges.
- A directory node has a number of outgoing edges
each labeled with a name.
- Usually a name space has a single root node from
which all of the names can be found.
11Path Names
- If you follow the graph of a name space from a
directory node to the leaf node that represents
the entity you wish to access, the series of
names along the graph form a path name.
- If the first node in the path is the root, it is
called an absolute path name.
- Otherwise it is a relative path name.
12Name Resolution
- Given a path name, you should be able to locate
information stored in the node referred to by
that name.
- The process of looking up a name is called name
resolution.
- Resolution follows the labels of the path name
through the name space, one edge at a time, and
returns the node that the name refers to.
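The naming graph and path name resolution described in the last three slides can be sketched in a few lines of Python. This is an illustration under assumed names (`Node`, `resolve`, and the `home` / `notes.txt` labels are all hypothetical), not a description of any particular name service.

```python
class Node:
    """A node in the naming graph; a leaf simply has no outgoing edges."""

    def __init__(self, data=None):
        self.edges = {}    # label -> child Node (stays empty for a leaf)
        self.data = data   # information stored at the node


def resolve(start, path):
    """Follow the labels in `path`, one edge at a time, starting at `start`."""
    node = start
    for label in path:
        if label not in node.edges:
            raise KeyError(f"no edge labelled {label!r}")
        node = node.edges[label]
    return node


# Build a tiny name space: root -> home -> notes.txt
root = Node()
home = Node()
notes = Node(data="contents of notes.txt")
root.edges["home"] = home
home.edges["notes.txt"] = notes

# Absolute path name: resolution starts at the root.
print(resolve(root, ["home", "notes.txt"]).data)
# Relative path name: resolution starts at some other directory node.
print(resolve(home, ["notes.txt"]).data)
```

Both calls end at the same leaf; the only difference is the node at which resolution starts.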
13Closure Mechanism
- We need to know how and where to start when
seeking a node by name.
- Finding the initial node in a name space from
which to begin name resolution is called a
closure mechanism.
- For example, in a Unix file system, the inode of
the root node is the first inode in the logical
disk representing the file system.
14Aliases
- An alias is another name for the same entity. An
environment variable is an example of an alias.
- Hard links allow multiple absolute path names to
the same entity.
- Symbolic links use a resolution process to find
an absolute path name for an entity referred to
by an alias.
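The difference between the two kinds of link can be observed directly on a POSIX file system. The sketch below assumes a platform where `os.link` and `os.symlink` are permitted; the file names are hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "report.txt")
    hard = os.path.join(d, "hard_alias.txt")
    soft = os.path.join(d, "soft_alias.txt")
    with open(target, "w") as f:
        f.write("hello")

    os.link(target, hard)      # hard link: a second name for the same inode
    os.symlink(target, soft)   # symbolic link: a node that stores a path name

    same_inode = os.stat(target).st_ino == os.stat(hard).st_ino
    stored_path = os.readlink(soft)      # the path name kept inside the link
    resolved = os.path.realpath(soft)    # where resolving the alias ends up
    real_target = os.path.realpath(target)

    print("hard link shares inode:", same_inode)
    print("symlink stores path:", stored_path == target)
    print("symlink resolves to target:", resolved == real_target)
```

The hard link needs no extra resolution step because it is just another edge to the same node, while the symbolic link stores a path name that must itself be resolved.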
15Garbage Collection
- Removing unreferenced entities from a name space
is an important task that can be critical to
system performance.
- If a system has to search through dead links to
find live ones, a great deal of CPU time can be
wasted.
- Also, especially in a dynamic system where
entities are being added and removed often, dead
links can require a great deal of memory and disk
space.
- Many system lockups in Microsoft Windows have
occurred as a result of failure to release
resources no longer in use.
16Unreachable Nodes
- In the graph on the next slide, derived from
figure 4-28 in Tanenbaum, nodes in red cannot
be reached from the set of root nodes. Since the
closure mechanism may require searches to begin
from the root set, these nodes may consume
computer resources such as memory and disk space
without being able to be used or even located so
that the resources can be freed
17Unreferenced Nodes
[Figure: naming graph. Key: root node, reachable node, unreachable node]
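Finding the unreachable nodes amounts to a graph traversal from the root set: anything the traversal never visits can be reclaimed. The sketch below is one simple way to do it, on a made-up graph; the node names and the `reachable` helper are hypothetical and are not taken from the Tanenbaum figure.

```python
def reachable(graph, roots):
    """Return every node that can be reached from the root set."""
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return seen


# Hypothetical naming graph: node -> nodes it refers to.
graph = {
    "root": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": [],
    "x": ["y"],   # x and y refer to each other,
    "y": ["x"],   # but nothing reachable refers to them
}

live = reachable(graph, ["root"])
garbage = set(graph) - live

print(sorted(live))
print(sorted(garbage))
```

Note that `x` and `y` still reference each other, so counting references alone would not free them; tracing from the roots does.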
18Internet Address Protocols
- Network addressing schemes assign machine
readable numeric addresses to network cards in
nodes on a network.
- The most common forms are the Internet protocol
addresses (IPv4 and IPv6) and Ethernet addresses.
- The pathname system for file spaces has been
adapted to a similar form to provide human
readable identifiers for locations on a network,
including the Web.
- These are Uniform Resource Identifiers (URI),
which include Uniform Resource Locators (URL) and
Uniform Resource Names (URN).
19Uniform Resource Identifiers
- A uniform resource identifier is a standard
notation for identifying a resource. It is
normally in a form similar to this
- protocol:subprotocol:resource
- There are variations on URIs for different
purposes, such as identifying a database driver,
a file on a network, or another type of resource.
- Extending the model above, we might find forms
like these
- protocol://host:port/path/subpath/resource
- jdbc:oracle9i:thin:accounts-payable
20Uniform Resource Locators
- A URL is a URI that is used to locate a
particular resource. A URL has three parts
- Protocol
- ftp://
- http://
- file:/
- Host
- java.sun.com
- java.sun.com:80
- Path
- /products/JDK/1.0.2
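The three parts can be pulled out of a URL with Python's standard `urllib.parse` module. The URL below is assembled from the host and path examples on this slide, with `http` assumed as the protocol; it is used only as a string to split, not fetched.

```python
from urllib.parse import urlsplit

# Assembled from the slide's examples; nothing is fetched over the network.
parts = urlsplit("http://java.sun.com:80/products/JDK/1.0.2")

print("protocol:", parts.scheme)
print("host:", parts.hostname, "port:", parts.port)
print("path:", parts.path)
```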
21Uniform Resource Names
- A URN is a URI that is used as a pure resource
name instead of as a locator.
- It might identify many different resources, such
as an email message ESMTP id
0J3N00GVA78VIH70@mstr5.srv.hcvlny.cv.net
- There is a special category of URI beginning with
urn: reserved for URNs, although, as the example
above shows, not all URNs begin with urn:
- An example using the ISBN number for the
Coulouris text book is urn:ISBN:0-321-26354-5.
22Name Services
- A name service, such as the Domain Name System,
manages identifiers
- May be required to translate between naming
schemes, such as Ethernet, IPv4 and URL protocols,
shown on the next slide.
23Naming domains to access a resource from a URL
24What is the Domain Name System?
- Converts human readable names to IP addresses
(similar to a phonebook)
- DNS is client/server oriented
- We have a name server containing the IP addresses
and names which serves information to the
clients.
- Hierarchical system
- Primarily used to resolve names to IP addresses
(Forward Lookup)
- Can also be used to map IP addresses to names
(Reverse Lookup)
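Forward and reverse lookup can be pictured as two tables, one keyed by name and one keyed by address. The sketch below is an offline illustration with made-up records (the name `www.example.org` and the address from the 192.0.2.0/24 documentation range are assumptions). It relies on the fact that DNS stores reverse mappings under names in the in-addr.arpa domain, which `ipaddress` can generate.

```python
import ipaddress

# Hypothetical host record: name -> IPv4 address.
forward = {"www.example.org": "192.0.2.10"}

# A reverse zone keys each address by its in-addr.arpa name.
reverse = {
    ipaddress.ip_address(ip).reverse_pointer: name
    for name, ip in forward.items()
}


def forward_lookup(name):
    """Name -> address."""
    return forward[name]


def reverse_lookup(ip):
    """Address -> name, via the in-addr.arpa form of the address."""
    return reverse[ipaddress.ip_address(ip).reverse_pointer]


print(forward_lookup("www.example.org"))
print(ipaddress.ip_address("192.0.2.10").reverse_pointer)
print(reverse_lookup("192.0.2.10"))
```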
25Why do we need DNS?
- Introduced when the number of systems on the
internet became high enough that it became
difficult to track every system's address manually
- Without DNS, a TCP/IP host translates between
names and IP addresses using the /etc/hosts
file.
- This has certain disadvantages
- Every time a machine is added or removed from the
network you have to change the /etc/hosts file.
- The /etc/hosts file has to be maintained on all
the machines.
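To see why this does not scale, it helps to look at what a hosts file is: a flat table of addresses and names that every machine must hold in full. The sketch below parses a small sample in the usual /etc/hosts layout; the host names and addresses are invented for illustration, and `parse_hosts` is a hypothetical helper.

```python
# Sample text in /etc/hosts layout; entries are hypothetical.
HOSTS_TEXT = """\
# address      canonical name       aliases
127.0.0.1      localhost
192.0.2.10     alpha.example.org    alpha
192.0.2.11     beta.example.org     beta
"""


def parse_hosts(text):
    """Build a name -> address table from hosts-file text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name] = address
    return table


hosts = parse_hosts(HOSTS_TEXT)
print(hosts["alpha"])
print(hosts["beta.example.org"])
```

Adding one machine means editing this table on every host that should know about it, which is the maintenance burden listed above.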
26Why do we need DNS?
- DNS is an automated system to do that job
- Instead of updating every machine in the network,
the DNS server maintains a database and provides
the client machines with information about both
addresses and names.
27World Wide Web
- Every time you access a website by name, such as
www.gmit.ie, DNS references a host record to
resolve that name to an IP address.
- Once the name is resolved, the web browser
retrieves the content from the web server using
the address.
28Email
- Email uses DNS for mail routing.
- Mail Exchanger (MX) records tell a mail server
where each message should be routed based on the
domain.
- Everything before the @ symbol in an email
address is called the user portion and everything
after the @ symbol is called the host portion
(user@domain)
- The sending server uses DNS to resolve the Mail
Exchanger record for this domain to an IP address.
- It then uses SMTP to send the mail to the
receiving server's IP address resolved by the DNS.
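The routing steps above can be sketched with two small in-memory tables standing in for DNS. Everything here is hypothetical: the domain, the mail hosts, the addresses and the `route` helper. The sketch also assumes the standard MX convention that the record with the lowest preference value is tried first, which the slide does not spell out.

```python
# Hypothetical MX data: domain -> list of (preference, mail host).
MX = {
    "example.org": [(20, "backup-mx.example.org"), (10, "mx.example.org")],
}
# Hypothetical host (A) records for the mail hosts.
A = {
    "mx.example.org": "192.0.2.25",
    "backup-mx.example.org": "192.0.2.26",
}


def route(address):
    """Return (user, mail host, IP address) for an email address."""
    user, domain = address.rsplit("@", 1)      # user portion, host portion
    preference, mail_host = min(MX[domain])    # lowest preference value first
    return user, mail_host, A[mail_host]


print(route("alice@example.org"))
```

Only after these lookups would SMTP be used to deliver the message to the resulting address.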
29Microsoft Active Directory
- DNS is an integral part of Active Directory
- Active Directory uses service (SRV) records to
advertise all network services to clients.
- Clients can use resources without knowing
anything about the network layout
30Other uses of DNS
- DNS is used any time you reference a host by its
DNS name or connect to a domain name, regardless
of the service you are using.
- Examples of services that use DNS are telnet and
ssh for system access to UNIX servers, groupware
clients and backup utilities.
- Host (A) records are typically used.
31DNS Namespace
- Hierarchical layout of DNS names
- At the top of the namespace is the Root, defined
by a null (empty) label.
- Domain names read from right to left, i.e. the
highest level of the namespace is on the right.
- Root is not explicitly specified in user
applications.
- It is specified in the DNS configuration file and
is denoted by a trailing period.
- A zone is a grouping of machines that may or may
not be in the same domain.
- This is a set of machines over which a particular
name server has authority and maintains data.
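The right-to-left reading of a domain name is easy to show in code. The helper below is a hypothetical illustration: it splits a name into labels, drops the trailing period that stands for the root, and lists the labels from the highest level down.

```python
def labels_from_root(name):
    """Split a domain name into labels, highest level first."""
    labels = name.rstrip(".").split(".")   # trailing period denotes the root
    return list(reversed(labels))


def is_fully_qualified(name):
    """A trailing period means the root is written explicitly."""
    return name.endswith(".")


print(labels_from_root("www.gmit.ie."))
print(is_fully_qualified("www.gmit.ie."))
print(is_fully_qualified("www.gmit.ie"))
```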
32Top level Domains
- Below the root are the top level domains (TLDs)
such as .com .org .edu
- The TLDs are maintained by the Internet
Corporation for Assigned Names and Numbers
(ICANN)
- You can register domains beneath the TLDs
34DNS Request Process
- Client sends a request for a DNS name resolution
to a local DNS server
- The local DNS server first checks to see if it is
authoritative for the domain and checks for a
cached copy of the requested information.
- If found it will return the response to the
client and the process ends.
35DNS name servers
36Recursion
- If the request cannot be fulfilled locally, the
DNS server retrieves information for the client
from other DNS servers.
- This is called recursion.
- Recursion is looking for each portion of the name
starting from the top of the hierarchy (e.g.
www.gmit.ie).
- The first step is to contact one of the Internet's
root name servers with a request for the first
part of the DNS name (.ie). DNS servers maintain
a list of these servers.
- The root server returns a list of DNS servers
authoritative for that top level domain.
37Iterative navigation
38Non-recursive and recursive navigation
39Recursion contd...
- The DNS server now contacts one of the DNS servers
returned by the root server
- It requests the next portion of the DNS name
being queried, in this case gmit.ie
- The queried server replies with the authoritative
server for gmit.ie
- The DNS server doing the recursion finally has
the authoritative DNS server for the domain
being requested. It can now query that server for
www.gmit.ie
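The walk from the root down to the authoritative server can be simulated with nested tables. This is a toy model under stated assumptions: the server names, the address and the `recursive_lookup` helper are all invented, and the sketch only handles three-label names like the one in the example.

```python
# Hypothetical server data; names and addresses are illustrative only.
ROOT = {"ie": "tld-ie-server"}
SERVERS = {
    "tld-ie-server": {"gmit.ie": "gmit-server"},
    "gmit-server": {"www.gmit.ie": "192.0.2.80"},
}


def recursive_lookup(name):
    """Resolve a three-label name, recording which servers were asked."""
    name = name.rstrip(".")
    labels = name.split(".")
    trace = ["root"]

    # Step 1: the root refers us to a server for the top level domain.
    server = ROOT[labels[-1]]
    trace.append(server)

    # Step 2: the TLD server refers us to the domain's authoritative server.
    server = SERVERS[server][".".join(labels[-2:])]
    trace.append(server)

    # Step 3: the authoritative server holds the actual record.
    return SERVERS[server][name], trace


address, trace = recursive_lookup("www.gmit.ie")
print(address)
print(" -> ".join(trace))
```

Each server only knows the next level down, which is the property the hierarchical layout depends on.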
40Hierarchical Layout of DNS
- The hierarchical layout ensures that no DNS
server needs to hold the entire DNS database
- Instead the root DNS servers hold a list of
servers authoritative for each TLD
- The servers for the TLD hold a list of servers
for domains under that TLD and those servers hold
the actual record
- This reduces the load on the servers and ensures
that the DNS namespace is maintainable across the
root servers and the TLD servers.
41Caching
- Caching is another feature of DNS which helps
alleviate the load on the servers.
- Once a server has cached the answer to a query it
does not need to perform the entire recursion
process again for the same name.
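A minimal sketch of the idea: wrap the expensive lookup in a cache and count how often the full lookup actually runs. The `CachingResolver` class, `slow_lookup` function and the address are hypothetical. Real DNS caches also expire entries after a time-to-live, which this sketch leaves out.

```python
class CachingResolver:
    """Remember answers so repeated queries skip the full lookup."""

    def __init__(self, lookup):
        self.lookup = lookup   # the expensive full resolution
        self.cache = {}
        self.misses = 0        # how many times the full lookup ran

    def resolve(self, name):
        if name not in self.cache:
            self.misses += 1
            self.cache[name] = self.lookup(name)
        return self.cache[name]


def slow_lookup(name):
    # Stand-in for the full recursion; the record is made up.
    return {"www.gmit.ie": "192.0.2.80"}[name]


resolver = CachingResolver(slow_lookup)
first = resolver.resolve("www.gmit.ie")
second = resolver.resolve("www.gmit.ie")   # served from the cache

print(first, second, "full lookups:", resolver.misses)
```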
42Configuring DNS clients in UNIX
- Login as root
- vi /etc/resolv.conf (configuration file)
- Three entries
- Domain - This will be appended to any DNS queries
that fail.
- Search - You can specify a list of up to 6
domains to append in case of failed queries (use
either domain or search but not both)
- Nameserver - IP address of the DNS server
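As an illustration of the three entries, the sketch below parses a sample file in resolv.conf layout. The domains and name server addresses are placeholders (192.0.2.0/24 is a documentation range), and `parse_resolv_conf` is a hypothetical helper rather than part of any resolver library.

```python
# Sample text in /etc/resolv.conf layout; values are placeholders.
RESOLV_CONF = """\
# use either "domain" or "search", not both
search example.org lab.example.org
nameserver 192.0.2.53
nameserver 192.0.2.54
"""


def parse_resolv_conf(text):
    """Collect the domain, search and nameserver entries."""
    config = {"domain": None, "search": [], "nameserver": []}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()   # ignore comments
        if not fields:
            continue
        keyword, values = fields[0], fields[1:]
        if keyword == "nameserver" and values:
            config["nameserver"].append(values[0])
        elif keyword == "search":
            config["search"] = values
        elif keyword == "domain" and values:
            config["domain"] = values[0]
    return config


config = parse_resolv_conf(RESOLV_CONF)
print(config["search"])
print(config["nameserver"])
```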
43Nslookup Help
44Some DNS servers in Unix
- BIND
- Berkeley Internet Name Domain
- Maintained by the Internet Systems Consortium
(ISC)
- Most common DNS server in use today
- DNS servers are included with many Unix operating
systems such as Sun Solaris, HP-UX, IBM AIX and
others.
45The X.500 Directory Service
- The X.500 Directory Service is a naming service
adopted by ITU and ISO as a standard for
providing a network directory service
- It is similar to the yellow pages provided by
telephone companies
- It allows individuals and organizations to provide
information about themselves to others.
- X.500 is the basis for the Lightweight Directory
Access Protocol (LDAP) and is also used in the
DCE directory service.
46X.500 Architecture
- X.500 servers organise data in a tree structure
with named nodes, like other naming services.
- X.500 allows a wide range of attributes to store
information at each node.
- Nodes can not only be searched by name, but the
attributes can be searched as well.
- The X.500 name tree is called the Directory
Information Tree (DIT).
- The entire structure with all the attribute
information is called the Directory Information
Base (DIB).
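The two access styles, lookup by name and search by attribute, can be sketched with a dictionary keyed by each entry's path in the tree. This is a toy model, not the X.500 protocol: the department, the person and every attribute value below are invented, and `read` and `search` are hypothetical helpers.

```python
# Hypothetical entries keyed by their path in the DIT.
DIB = {
    ("GB",): {"objectClass": "country"},
    ("GB", "University of Gormenghast"): {"objectClass": "organization"},
    ("GB", "University of Gormenghast", "Computer Science"): {
        "objectClass": "organizationalUnit",
    },
    ("GB", "University of Gormenghast", "Computer Science", "A. Person"): {
        "objectClass": "person",
        "roomNumber": "101",
    },
}


def read(path):
    """Look an entry up by its name (its path in the tree)."""
    return DIB[tuple(path)]


def search(base, **wanted):
    """Find entries at or below `base` whose attributes match `wanted`."""
    base = tuple(base)
    return [
        path
        for path, attrs in DIB.items()
        if path[: len(base)] == base
        and all(attrs.get(key) == value for key, value in wanted.items())
    ]


print(read(["GB", "University of Gormenghast"]))
print(search(["GB"], objectClass="person"))
```

The tree of keys plays the role of the DIT, and the keys together with their attribute dictionaries play the role of the DIB.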
47X.500 Architecture
- X.500 servers are Directory Service Agents (DSA)
and their clients are Directory User Agents
(DUA). Figure 9.10 shows the service
architecture and one of the possible navigation
models. Each DUA client interacts with a single
DSA, which uses other DSAs as necessary to
satisfy requests for information in the DIB
48X.500 service architecture
49X.500 Example
- The following two slides illustrate an X.500
system.
- The first slide shows part of the DIT that
includes the University of Gormenghast in Great
Britain.
- The second shows one of the associated DIB
entries with information on a member of staff.
50Part of an X.500 Directory Information Tree
51An X.500 DIB Entry
52References
- http://linux-tutorial.info
- www.rpmfind.net
- George Coulouris, Jean Dollimore and Tim Kindberg,
Distributed Systems: Concepts and Design, Addison
Wesley, Fourth Edition, 2005
- Andrew Tanenbaum and Maarten van Steen,
Distributed Systems: Principles and Paradigms,
Prentice Hall, 2002, ISBN 0-13-088893-1