Title: CTIS 490 DISTRIBUTED SYSTEMS
1 CTIS 490 DISTRIBUTED SYSTEMS
- WEEK 13
- DISTRIBUTED WEB-BASED SYSTEMS
2 INTRODUCTION
- The World Wide Web (WWW) can be viewed as a huge distributed system with millions of clients and servers for accessing linked documents.
- Servers maintain collections of documents, while clients provide users with an easy-to-use interface for presenting and accessing those documents.
- A document is fetched from a server, transferred to a client, and presented on the screen. To a user there is conceptually no difference between a document stored locally and one stored in another part of the world.
3 INTRODUCTION
- The Web has now become more than just a simple document-based system.
- Since 1994, the World Wide Web Consortium (W3C) has been responsible for standardizing protocols, improving interoperability, and enhancing capabilities.
- With the emergence of Web services, the Web is becoming a system of distributed services, rather than just documents, offered to any user or machine.
4 TRADITIONAL WEB-BASED SYSTEMS
- Many Web-based systems are still organized as simple client-server architectures.
- The core of a Web site is formed by a process that has access to a local file system storing documents.
- A reference called a Uniform Resource Locator (URL) is used to refer to a document.
- The URL specifies the DNS name of the associated server along with a file name.
5 TRADITIONAL WEB-BASED SYSTEMS
- The URL also specifies the protocol for transferring the document across the network.
- A client interacts with Web servers through a special application known as a browser.
- A browser is responsible for properly displaying a document.
6 TRADITIONAL WEB-BASED SYSTEMS
- The overall organization of a traditional Web site.
7 WEB DOCUMENTS
- A Web document does not only contain text; it can also include all kinds of dynamic features such as audio, video, and animations.
- In many cases special helper applications (interpreters) are needed, and they are integrated into the browser.
- The main part of a Web document is written in a markup language.
- The most widely used markup language is the HyperText Markup Language (HTML). Another is the eXtensible Markup Language (XML).
8 WEB DOCUMENTS
- Both HTML and XML derive from the Standard Generalized Markup Language (SGML): HTML was defined as an SGML application, and XML is a simplified subset of SGML.
- The major difference between the two is that XML is a meta-markup language, i.e., an unlimited set of tags can be defined.
- HTML and XML can include tags that refer to embedded documents, that is, references to other files.
- An embedded document can be a complete program executed on the fly as part of displaying information.
9 WEB DOCUMENTS
- Multipurpose Internet Mail Extensions (MIME) is used to specify the type of an embedded document.
- MIME was originally developed to provide information on the content of e-mail messages.
- MIME distinguishes various types of message content.
- A content type consists of a type identifier and a subtype identifier, for example text/html.
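A minimal sketch of how a type/subtype pair can be pulled out of a content-type value. The function name parse_content_type is a hypothetical helper introduced for illustration, and the parsing is deliberately simplified (it does not handle every quoting rule of the MIME specifications).

```python
def parse_content_type(value: str) -> tuple[str, str, dict[str, str]]:
    """Split a content-type value into (type, subtype, parameters)."""
    # Parameters such as charset follow the media type, separated by ';'.
    media_type, *raw_params = [part.strip() for part in value.split(";")]
    # The media type itself is "type/subtype".
    top_level, _, subtype = media_type.partition("/")
    params = {}
    for item in raw_params:
        if not item:
            continue
        name, _, param_value = item.partition("=")
        params[name.strip().lower()] = param_value.strip().strip('"')
    return top_level.lower(), subtype.lower(), params


print(parse_content_type("text/html; charset=UTF-8"))
print(parse_content_type("application/pdf"))
```

A browser can use the type to pick a general handling strategy and the subtype to select the specific renderer or plug-in.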
10 WEB DOCUMENTS
- Six top-level MIME types and some common subtypes.
11 WEB DOCUMENTS
- The application type is for data that does not fit any of the other categories. It must be processed by an application program to make it viewable or usable. For example, application/pdf requires a separate application to process the document.
- The multipart type is used for composite documents that consist of several parts, where each part has its own associated top-level type.
- The variety of document types forces browsers to be extensible. As a result, plug-ins are required to follow a standard interface so that they can be easily integrated with browsers.
12 MULTITIERED ARCHITECTURES
- Web documents can be built in two ways:
- Static: the server locates and returns the object identified in the request. Static objects include predefined HTML pages and JPEG or GIF files. The Web server does not need to communicate with any server-side application.
- Dynamic: the request is forwarded to an application system where the reply is generated dynamically, i.e., the data is generated by executing a server-side program.
- Although the Web started as a simple two-tiered client-server architecture for static Web documents, this architecture has been extended to support more advanced types of documents.
13 MULTITIERED ARCHITECTURES
- One of the first enhancements was the Common Gateway Interface (CGI).
- A CGI program resides on the server machine. The Web server interfaces with this program to process user requests.
- Usually, user data comes from an HTML form; the form specifies the program that is to be executed at the server side, along with parameters that are filled in by the user.
14 MULTITIERED ARCHITECTURES
- After the program finishes its work, the Web server sends the results back to the user's browser to be displayed.
- With CGI programs, fetching a document can be delegated in such a way that the server remains unaware of whether a document was generated on the fly or read from a file.
15 MULTITIERED ARCHITECTURES
- The principle of using server-side CGI programs.
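The CGI principle can be sketched as a small program that receives form parameters and produces a document. This is an illustrative sketch only: the function run_cgi_program and the form field "name" are hypothetical, and the environment is passed in as a dictionary instead of being read from the real process environment. QUERY_STRING is the standard CGI variable that carries URL-encoded form data.

```python
from html import escape
from urllib.parse import parse_qs


def run_cgi_program(environ: dict[str, str]) -> str:
    """Return the text a CGI program would write to its standard output."""
    # The Web server passes the form parameters in QUERY_STRING.
    fields = parse_qs(environ.get("QUERY_STRING", ""))
    name = fields.get("name", ["stranger"])[0]  # "name" is a hypothetical field
    # Escape user input before embedding it in the generated HTML.
    body = f"<html><body><p>Hello, {escape(name)}!</p></body></html>"
    # CGI output starts with headers, then a blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


print(run_cgi_program({"QUERY_STRING": "name=Alice"}))
```

The Web server relays this output to the browser, which cannot tell whether the page came from a file or was generated on the fly.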
16 MULTITIERED ARCHITECTURES
- Because of server-side processing, many Web sites are now organized as three-tiered architectures consisting of a Web server, an application server, and a database server.
- In many cases the server runs Java programs called servlets that maintain things like shopping carts, implement recommendations, keep lists of favorite items, etc.
- Server-side scripting technologies run a script directly on the Web server to generate dynamic content.
- Server-side scripting technologies include:
- Microsoft Active Server Pages (ASP.NET)
- Sun JavaServer Pages (JSP)
- Netscape server-side JavaScript
- PHP (open source, maintained by The PHP Group)
17 CLIENTS
- Web client software, the browser, provides an interface in which hyperlinks are displayed so that a user can select them with a mouse click.
- Web browsers are designed to be platform independent. This goal is achieved by using standard graphical and networking libraries.
- The core of a browser is formed by the browser engine and the rendering engine.
- The browser engine provides mechanisms for the user to go over a document, select parts of it, activate hyperlinks, etc.
- The rendering engine displays documents, which requires parsing HTML or XML, interpreting scripts (e.g., JavaScript), and network communication.
18 CLIENTS
- The logical components of a Web browser.
19 CLIENTS
- Web browsers should be easily extensible to support any type of document returned by a server.
- A plug-in is a small program that can be dynamically loaded into a browser to handle a specific document type. As a result, it is an extension of the rendering engine.
- Plug-ins and browsers conform to a standard interface.
- Another client-side process is a Web proxy, which allows a browser to handle protocols other than HTTP, such as FTP.
- For example, to transfer a file from an FTP server, the browser can issue an HTTP request to a local FTP proxy, which will fetch the file and return it.
20 CLIENTS
- Using a Web proxy when the browser does not speak FTP.
21 APACHE WEB SERVER
- By far the most popular Web server is Apache. As of March 2007, 58% of all Web sites were using it.
- Apache is platform independent. This independence is realized by providing its own runtime environment, which is implemented for different operating systems.
- This runtime is known as the Apache Portable Runtime (APR), a library that provides an interface for file handling, networking, locking, threads, etc.
- Apache uses the concept of a hook, which is a placeholder for a specific group of functions.
- Hooks are used to translate a URL to a local file name, to write information to a log, to check a client's identification, to check access rights, etc.
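The hook idea can be illustrated with a small registry in which functions are attached to named placeholders and run in order. This is a conceptual sketch, not Apache's actual C API: the class HookRegistry, the hook names, and the document root /var/www are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable

Request = dict  # a request is modeled as a plain dictionary in this sketch


class HookRegistry:
    """Maps a hook name to the group of functions registered for it."""

    def __init__(self) -> None:
        self._hooks: dict[str, list[Callable[[Request], Request]]] = defaultdict(list)

    def register(self, hook_name: str, func: Callable[[Request], Request]) -> None:
        self._hooks[hook_name].append(func)

    def run(self, hook_name: str, request: Request) -> Request:
        # Call every function attached to the hook, in registration order.
        for func in self._hooks[hook_name]:
            request = func(request)
        return request


def translate_url(request: Request) -> Request:
    # Hypothetical document root; maps the URL path to a local file name.
    request["filename"] = "/var/www" + request["url"]
    return request


access_log: list[str] = []


def log_request(request: Request) -> Request:
    access_log.append(request["url"])
    return request


hooks = HookRegistry()
hooks.register("translate_name", translate_url)  # hook names are illustrative
hooks.register("log_transaction", log_request)

request = hooks.run("translate_name", {"url": "/index.html"})
request = hooks.run("log_transaction", request)
print(request["filename"], access_log)
```

Keeping the processing steps behind named hooks lets modules add behavior without changing the core request-handling code.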
22 WEB SERVER CLUSTERS
- Web servers are replicated and combined with a front end to improve performance.
23 WEB SERVER CLUSTERS
- The front end can be designed in two ways:
- Transport-layer switch: simply passes data sent along the TCP connection to one of the servers, depending on some measurement of the server's load.
- Content-aware request distribution: first inspects the HTTP request and then decides which server it should forward that request to. For example, if the front end always forwards requests for the same document to the same server, that server can cache the document, resulting in lower response times.
- The two can also be combined, as illustrated in the next figure.
24 WEB SERVER CLUSTERS
- A scalable content-aware cluster of Web servers.
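One simple way to realize content-aware distribution is to hash the requested path, so that requests for the same document always reach the same server. The sketch below assumes this hashing policy, which is only one of several possible policies; the server names and the function pick_server are hypothetical.

```python
import hashlib

SERVERS = ["server-a", "server-b", "server-c"]  # hypothetical back-end servers


def pick_server(request_line: str, servers: list[str] = SERVERS) -> str:
    """Choose a back-end server based on the document named in the request."""
    # A request line looks like "GET /index.html HTTP/1.1".
    _method, path, _version = request_line.split()
    # A stable hash of the path keeps each document on one server,
    # so that server's cache is likely to already hold it.
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


first = pick_server("GET /index.html HTTP/1.1")
second = pick_server("GET /index.html HTTP/1.1")
print(first == second)  # the same document maps to the same server
```

A transport-layer switch could not apply this policy, because it never looks inside the HTTP request.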
25 WEB SERVER CLUSTERS
- An alternative way to set up a Web server cluster is to use round-robin DNS.
- With round-robin DNS a single domain name is associated with multiple IP addresses.
- When resolving a host name, a browser receives a list of multiple addresses, each address corresponding to a server.
- Normally, browsers choose the first address on the list, but most DNS servers rotate the entries between responses.
- As a result, a simple distribution of requests over the servers in the cluster is achieved.
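The rotation performed by the DNS server can be sketched as follows. The class RoundRobinResolver is a toy stand-in for a DNS server, and the host name and addresses are placeholder values from the documentation ranges.

```python
from collections import deque


class RoundRobinResolver:
    """Toy resolver that rotates the address list after every query."""

    def __init__(self, records: dict[str, list[str]]) -> None:
        self._records = {name: deque(addrs) for name, addrs in records.items()}

    def resolve(self, name: str) -> list[str]:
        addresses = self._records[name]
        answer = list(addresses)
        # Move the first address to the end for the next query.
        addresses.rotate(-1)
        return answer


resolver = RoundRobinResolver(
    {"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]}
)
for _ in range(4):
    # A browser typically connects to the first address in the answer.
    print(resolver.resolve("www.example.com")[0])
```

Successive clients are therefore directed to different servers, although the scheme takes no account of actual server load.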
26 HTTP
- All communication between clients and servers is based on the HyperText Transfer Protocol (HTTP). By default, servers listen on port 80.
- HTTP is a simple protocol: a client sends a request to a server and waits for a response.
- HTTP is stateless: it does not have any concept of an open connection and does not require a server to maintain information on its clients.
- HTTP is based on TCP: whenever a client issues a request to a server, it first sets up a TCP connection and sends the message on that connection. The same connection is used for receiving the response.
- One of the problems with the first versions of HTTP was their inefficient use of TCP connections.
27 HTTP CONNECTIONS
- (a) Using non-persistent connections. (b) Using persistent connections.
28 HTTP CONNECTIONS
- A Web document is often constructed from a collection of different files from the same server.
- In HTTP version 1.0 and older, each request to a server required setting up a separate connection. When the server had responded, the connection was torn down. These connections are referred to as non-persistent.
- In HTTP version 1.1, several requests and their responses can be exchanged over the same connection without the need to set up a separate one. These connections are referred to as persistent.
- Furthermore, a client can issue several requests in a row without waiting for the response to the first request, which is referred to as pipelining.
29 HTTP METHODS
- Operations supported by HTTP.
30 HTTP METHODS
- HTTP assumes that each document may have a separate header that is sent along with a request or response.
- For example, the HEAD operation returns only the header of a document, such as the time the document was last modified, which a client can use to verify the validity of its cached copy.
- The most important operation is GET, which actually fetches a document. It is also possible to specify that a document should be returned only if it has been modified after a specific time.
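The conditional form of GET relies on the If-Modified-Since request header and the 304 (Not Modified) status code. The sketch below shows the server-side decision under simplifying assumptions: handle_get is a hypothetical function, and only the status code and the Last-Modified header are modeled.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def handle_get(last_modified: datetime, headers: dict[str, str]) -> tuple[int, dict[str, str]]:
    """Return (status code, response headers) for a possibly conditional GET."""
    response_headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    since = headers.get("If-Modified-Since")
    # If the client's copy is at least as new as the document, send no body.
    if since is not None and last_modified <= parsedate_to_datetime(since):
        return 304, response_headers  # 304 Not Modified
    return 200, response_headers  # 200 OK, the document follows


modified = datetime(2007, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
print(handle_get(modified, {}))
print(handle_get(modified, {"If-Modified-Since": "Thu, 01 Mar 2007 12:00:00 GMT"}))
```

A client that cached the document can thus revalidate it at the cost of a short header-only exchange.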
31 HTTP METHODS
- The PUT operation is the opposite of the GET operation: it stores a document.
- The POST operation is similar to PUT, except that a client requests data to be added to a document or a collection of documents. An example is posting an article to a newsgroup.
- The DELETE operation removes a document.
32 HTTP MESSAGES
- HTTP only recognizes request and response messages, and each contains three parts:
- Request/Status line
- Request/Response message headers
- Message body
- The request line specifies the operation and a reference to the document associated with the request. It also identifies the HTTP version the client is expecting.
- The status line contains a version number and a three-digit status code. The code is explained with a textual phrase. For example, code 200 is associated with the phrase OK.
- Other frequently used codes are 400 (Bad Request), 403 (Forbidden), and 404 (Not Found).
33 HTTP MESSAGES
- (a) HTTP request message.
34 HTTP MESSAGES
- (a) HTTP response message.
35 HTTP MESSAGES
- There are also various message headers that the client can send to the server explaining what it is able to accept as a response.
- For example, a client may accept responses that have been compressed with the gzip compression program.
- In this case, the client sends an Accept-Encoding message header: Accept-Encoding: gzip
- As another example, the Accept message header can be used to specify that only HTML Web pages may be returned.
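A server can honor the Accept-Encoding header roughly as sketched below. The function encode_body is hypothetical, and the negotiation is simplified: quality values such as "gzip;q=0" are ignored, which a complete implementation would need to handle.

```python
import gzip


def encode_body(body: bytes, request_headers: dict[str, str]) -> tuple[dict[str, str], bytes]:
    """Compress the body with gzip only if the client announced support for it."""
    accepted = [
        token.split(";")[0].strip().lower()  # drop any ";q=..." parameter
        for token in request_headers.get("Accept-Encoding", "").split(",")
    ]
    if "gzip" in accepted:
        # Content-Encoding tells the client how to decode the body.
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    return {}, body


page = b"<html><body>" + b"hello " * 200 + b"</body></html>"
headers, payload = encode_body(page, {"Accept-Encoding": "gzip"})
print(headers, len(page), len(payload))
```

The client recovers the original document by reversing the encoding named in Content-Encoding.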
36 HTTP MESSAGES
- Some HTTP message headers.
37 HTTP MESSAGES
- Some HTTP message headers (continued).
38 SOAP
- Simple Object Access Protocol (SOAP) is a protocol for exchanging XML-based messages, generally over HTTP.
- It is used to invoke methods over the Web regardless of the operating system, object model, or programming language of a particular application.
- Originally developed by Microsoft, it is now maintained by the World Wide Web Consortium (W3C).
- http://www.w3.org/TR/soap12-part0/
- Parsing SOAP messages can be a serious performance bottleneck.
39 SOAP
- Request
    <SOAP:Body>
      <AddTwoNumbers>
        <FirstNumber>5</FirstNumber>
        <SecondNumber>3</SecondNumber>
      </AddTwoNumbers>
    </SOAP:Body>
- Response
    <SOAP:Body>
      <AddTwoNumbersResponse>
        <Value>8</Value>
      </AddTwoNumbersResponse>
    </SOAP:Body>
- Source: http://searchwebservices.techtarget.com/searchWebServices/downloads/what_is_soap.swf
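The AddTwoNumbers exchange can be reproduced with Python's standard XML library. This is a sketch under assumptions: the helper functions are hypothetical, the messages are handled in-process instead of being sent over HTTP, the SOAP 1.2 envelope namespace is used, and the application elements are left without a namespace for brevity (a real service would normally qualify them).

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace


def _new_envelope() -> tuple[ET.Element, ET.Element]:
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    return envelope, body


def build_soap_request(first: int, second: int) -> bytes:
    envelope, body = _new_envelope()
    call = ET.SubElement(body, "AddTwoNumbers")
    ET.SubElement(call, "FirstNumber").text = str(first)
    ET.SubElement(call, "SecondNumber").text = str(second)
    return ET.tostring(envelope, encoding="utf-8")


def handle_soap_request(message: bytes) -> bytes:
    """Stand-in for the remote service: parse the call and build the reply."""
    call = ET.fromstring(message).find(f"{{{SOAP_NS}}}Body/AddTwoNumbers")
    total = int(call.findtext("FirstNumber")) + int(call.findtext("SecondNumber"))
    envelope, body = _new_envelope()
    response = ET.SubElement(body, "AddTwoNumbersResponse")
    ET.SubElement(response, "Value").text = str(total)
    return ET.tostring(envelope, encoding="utf-8")


def parse_soap_response(message: bytes) -> int:
    path = f"{{{SOAP_NS}}}Body/AddTwoNumbersResponse/Value"
    return int(ET.fromstring(message).findtext(path))


reply = handle_soap_request(build_soap_request(5, 3))
print(parse_soap_response(reply))  # prints 8
```

Every call requires building and parsing XML on both sides, which is where the parsing overhead of SOAP comes from.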
40 SERVICE-ORIENTED ARCHITECTURES (SOA)
- A service-oriented architecture is a means of developing distributed systems where the components are stand-alone services.
- Services may execute on different computers from different service providers.
- Standard protocols have been developed to support service communication and information exchange.
41 SERVICE-ORIENTED ARCHITECTURES (SOA)
- There are several service models, but mainly in a
SOA - A service provider offers a service by defining
its interface and implementing the service
functionality. - A service requestor binds that service into its
application. This means that the requestors
application includes code to call that service
and process results of the service call. - To ensure that the service can be accessed, the
service provider makes an entry in a service
registry that includes information about the
service and what it does.
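The publish, find, and bind roles can be sketched with an in-memory registry. Everything here is hypothetical and in-process: ServiceRegistry stands in for a real registry, the two provider functions stand in for remote services, and the prices are made-up constants.

```python
from typing import Callable


class ServiceRegistry:
    """Toy registry: providers publish entries, requestors look them up."""

    def __init__(self) -> None:
        self._entries: dict[str, dict] = {}

    def publish(self, name: str, description: str, endpoint: Callable) -> None:
        self._entries[name] = {"description": description, "endpoint": endpoint}

    def find(self, name: str) -> dict:
        return self._entries[name]


# Two hypothetical providers that implement the same interface.
def provider_a_price(symbol: str) -> float:
    return 100.0  # made-up value


def provider_b_price(symbol: str) -> float:
    return 101.5  # made-up value


registry = ServiceRegistry()
registry.publish("stock-price", "Returns the price for a ticker symbol", provider_a_price)

# The requestor binds to whatever the registry currently lists.
get_price = registry.find("stock-price")["endpoint"]
print(get_price("ACME"))

# Re-publishing switches providers without changing the requestor's code.
registry.publish("stock-price", "Returns the price for a ticker symbol", provider_b_price)
get_price = registry.find("stock-price")["endpoint"]
print(get_price("ACME"))
```

Because the requestor depends only on the published interface, the binding can be changed at run time.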
43 SERVICE-ORIENTED ARCHITECTURES (SOA)
- The differences between this service model and the distributed object approach to distributed systems architectures are:
- Services can be offered by any service provider inside or outside of an organization. For example, a manufacturing company can link directly to services provided by its suppliers.
- The service provider and service user do not need to negotiate about what the service does before it can be incorporated in an application program, because the provider publishes information about the service.
44 SERVICE-ORIENTED ARCHITECTURES (SOA)
- Applications can delay the binding of services until they are deployed or until execution. Therefore, an application using a stock price service can dynamically change service providers while the system is executing.
- Service users can create innovative applications by combining various services.
- Service users can pay according to their use instead of buying an expensive component that is rarely used.
45 WEB SERVICES
- The development of the Web meant that client computers had access to remote servers outside of their organizations.
- However, access was only through a Web browser, and direct access to the servers by other programs was not practical. For example, a program could not query a number of catalogs directly.
- The Web services architecture is an instance of the service-oriented architecture.
- The services are provided over the Internet.
46 WEB SERVICES - EXAMPLES
- Microsoft MapPoint: http://www.microsoft.com/mappoint/products/webservice/default.mspx
- One of the first commercial Web services.
- Developers can integrate mapping functionality, including street maps, driving directions, and proximity searches, into their applications.
- It also provides business listings and points of interest that can be used with mapping.
47 WEB SERVICES - EXAMPLES
- Source: http://www.ajlopez.net/ArticuloVe.php?Id=340
48 WEB SERVICES - EXAMPLES
- TerraServer USA
- http://terraserver-usa.com/webservices.aspx
- In 1998, Microsoft researchers set up a database of aerial and satellite images of the earth and made it public.
- TerraService provides Web services to geography-oriented applications.
49 WEB SERVICES
- Web services are based on the following standards (http://www.w3.org/TR/2003/WD-ws-arch-20030808/):
- Simple Object Access Protocol (SOAP): this XML-based protocol defines data exchange between Web services.
- Web Services Description Language (WSDL): this XML-based language defines how the interfaces of Web services can be represented. A WSDL file describes what a service does and how to invoke its operations.
- Universal Description, Discovery, and Integration (UDDI): this standard specifies how information about the service provider, the services provided, and the location of the service description is published and discovered.
50 WEB SERVICES
51 WEB SERVICES
- For example, a Web service such as a mapping service takes two addresses and tells you how to travel between them.
- To create and use such a service, the following steps must be carried out:
- The service developer must write a service application that conforms to the standards.
- The service developer must deploy the application so that other applications can use it.
- The service developer must describe the interfaces and endpoint needed for a client to make calls to the service, using an XML-based WSDL file.
52 WEB SERVICES
- The service developer must register (publish) the service description in the UDDI registry:
- Post information about the company.
- Categorize the service, such as e-commerce, weather service, map service, etc.
- Specify the URL of the service description (WSDL file).
- The client developer must go to the UDDI registry and look for the required functionality.
- Once the functionality is found, he or she should obtain the specific WSDL file.
53 WEB SERVICES
- The client developer must use the WSDL file to generate the client stub code to make calls to the remote service.
- The client developer must incorporate the stub code into the client application and make calls to it. The stub handles the required protocols.
- Source: http://java.sun.com/j2me/reference/whitepapers/Web_Svcs_wp072904.pdf
54 NAMING
- The Web uses a single naming system to refer to documents. The names used are called Uniform Resource Identifiers (URIs).
- A Uniform Resource Locator (URL) is a URI that identifies a document by including information on how and where to access it.
- How to access a document is given by the scheme that is part of the URL, such as http, ftp, or telnet.
55 NAMING
- Where a document is located is embedded in a URL by means of the DNS name of the server to which an access request can be sent, although an IP address can also be used.
- The number of the port on which the server listens for such requests is also part of the URL; when it is left out, a default port is used.
- A URL also contains the name of the document to be looked up by the server.
56 NAMING
- Often-used structures for URLs. (a) Using only a DNS name.
- (b) Combining a DNS name with a port number.
- (c) Combining an IP address with a port number.
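The three URL structures can be taken apart with Python's urllib.parse module. The example URLs are placeholders, describe_url is a hypothetical helper, and the DEFAULT_PORTS table lists only a few well-known scheme defaults.

```python
from urllib.parse import urlsplit

# Well-known default ports, used when the URL leaves the port out.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "telnet": 23}


def describe_url(url: str) -> dict:
    """Break a URL into scheme (how), host and port (where), and document name."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "document": parts.path,
    }


# (a) DNS name only, (b) DNS name with port, (c) IP address with port.
for url in (
    "http://www.example.com/docs/index.html",
    "http://www.example.com:8080/docs/index.html",
    "http://192.0.2.7:8080/docs/index.html",
):
    print(describe_url(url))
```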
57 NAMING
- URIs are also used for purposes other than referring to a document.
- For example, a telnet URI is used for setting up a telnet session to a server.
- The modem URI can be used to set up a modem-based connection with another computer.
58 NAMING
59 SECURITY
- Most of the security issues in the Web deal with setting up a secure channel between a client and a server.
- The predominant approach for setting up a secure channel has been the Secure Sockets Layer (SSL), originally developed by Netscape. Although SSL has never been formally standardized, most Web clients and servers support it.
- An update of SSL has been formally specified, and it is referred to as the Transport Layer Security (TLS) protocol.
- TLS can support a variety of higher-level protocols such as HTTP, FTP, and Telnet.
- TLS implementations are usually based on TCP.
60 SECURITY
- The position of TLS in the Internet protocol stack.
61 SECURITY
- Setting up a secure channel proceeds in two phases.
- First, the client informs the server of the cryptographic algorithms it can handle. The server reports its choices back to the client.
- Second, authentication takes place.
- The server is always required to authenticate itself, so it passes a certificate containing its public key, signed by a certification authority (CA).
- If the server requires that the client be authenticated, the client sends its certificate as well.
- The client generates a random number, encrypts it with the server's public key, and sends it to the server. Both sides then use this number to construct a session key.
62 SECURITY
- TLS with mutual authentication