Internet Engineering Course - PowerPoint PPT Presentation

Provided by: ZhiLi7
1
Internet Engineering Course
  • Web Servers

2
Introduction
  • Company needs to provide various web services
  • Hosting intranet applications
  • Company web site
  • Various internet applications
  • Therefore there is a need to provide an HTTP server
  • First we look at what the HTTP protocol is
  • Then we talk about Web servers and Apache as the
    leading Web server application

3
The World Wide Web (WWW)
  • Global hypertext system
  • Initially developed in 1989
  • By Tim Berners-Lee at the European Laboratory for
    Particle Physics, CERN, in Switzerland.
  • To facilitate an easy way of sharing and editing
    research documents among geographically
    dispersed groups of scientists.
  • In 1993, started to grow rapidly
  • Mainly due to the NCSA developing a Web browser
    called Mosaic (an X Window-based application)
  • First widely used graphical interface to the Web,
    making browsing more convenient
  • Flexible way for people to navigate through
    worldwide resources on the Internet and retrieve
    them

4
Web Browsers
  • Provides access to a Web server
  • Basic components
  • HTML interpreter
  • HTTP client used to retrieve HTML pages
  • Some also support
  • FTP, NNTP, POP, SMTP

5
Web Servers
  • Definitions
  • A computer, responsible for accepting HTTP
    requests from clients, and serving them Web
    pages.
  • A computer program that provides the above
    mentioned functionality.
  • Common features
  • Accepting HTTP requests from the network
  • Providing an HTTP response to the requester
  • Typically consists of an HTML document
  • Usually capable of logging
  • Client requests/Server responses

6
Web Servers cont.
  • Returned content
  • Static
  • Comes from an existing file
  • Dynamic
  • Dynamically generated by some other
    program/script called by the Web server.
  • Path translation
  • Translate the path component of a URL into a
    local file system resource
  • Path specified by the client is relative to the
    server's root directory
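
A minimal sketch of path translation, assuming a server root of /var/www/html (the root directory and the function name are illustrative, not taken from the slides). It keeps only the path component of the URL and resolves ".." segments so that a request cannot reach files above the root.

```python
import posixpath
from urllib.parse import unquote, urlsplit

DOCUMENT_ROOT = "/var/www/html"  # assumed server root, not taken from the slides


def translate_path(url, root=DOCUMENT_ROOT):
    """Map the path component of a URL to a file name under the server root."""
    # Keep only the path component and undo %xx escapes.
    path = unquote(urlsplit(url).path)
    # Resolve "." and ".." segments so the result cannot climb above the root.
    normalized = posixpath.normpath("/" + path.lstrip("/"))
    relative = normalized.lstrip("/")
    return posixpath.join(root, relative) if relative else root
```

For example, translate_path("http://www.somehost.com/path/file.html") yields /var/www/html/path/file.html under the assumed root.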

7
Basic Client/Server Architecture in WWW
  • Overall organization of the Web.
  • Basic operation is to fetch documents
  • Client issues requests; browser displays the document
  • Server is responsible for retrieving the document from
    the local file system
  • Client/server communication is based on the HTTP
    protocol

8
Dynamic Content
  • Parts of documents may be specified via
    scripts/programs
  • Client-side (executed on client machine, e.g.,
    within the browser)
  • Client-side script - Script embedded in html
    document
  • Applet - pre-compiled program passed to client
  • Server-side (executed on server machine)
  • Server-side script embedded in document
  • Servlet - precompiled program executed within
    the server's address space
  • CGI scripts

9
Common Gateway Interface (CGI)
  • The principle of using server-side CGI programs.
  • Allows documents to be generated dynamically,
    on the fly
  • Provides a standard way for a web server to execute
    a program using user-provided data as input
  • To the server, a CGI program appears as the program
    responsible for fetching the requested document
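
A small sketch of what a CGI program might look like, assuming the web server passes the query string in the QUERY_STRING environment variable and reads the program's standard output. The "name" parameter and the run_cgi helper are hypothetical and only serve the illustration.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def run_cgi(environ, out):
    """Write a CGI response (header, blank line, body) built from user input."""
    # The server hands user-provided data to the program via the environment.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]  # "name" is a made-up parameter
    body = "<html><body><h1>Hello, %s</h1></body></html>\n" % html.escape(name)
    out.write("Content-Type: text/html\r\n")
    out.write("\r\n")  # blank line separates headers from the generated document
    out.write(body)


if __name__ == "__main__":
    run_cgi(os.environ, sys.stdout)
```

The server relays whatever the program prints as the response, which is why the program looks to it like the thing that fetched the document.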

10
Architectural Overview
  • Architectural details of a client and server in
    the Web.
  • Document fetch (and possibly server-side script)
    2b-3b
  • Execute CGI Script (separate process) 2c-3c-4c
  • Execute servlet program (run within server)
    2a-3a-4a

11
http protocol
  • Defines the communication between a web server
    and a client
  • Used to deliver virtually all files and other
    data (collectively called resources) on the World
    Wide Web
  • A browser is an HTTP client because it sends
    requests to an HTTP server (Web server)
  • The standard (and default) port for HTTP servers
    to listen on is 80, though they can use any port.

12
Structure of http transactions
  • Request/response, text-based protocol
  • Format of an HTTP message
  • <initial line, different for request vs.
    response>
  • Header1: value1
  • Header2: value2
  • Header3: value3
  • <blank line>
  • <optional message body goes here, like file
    contents or query data; it can be many lines
    long, or even binary data>
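
The message layout above can be sketched as a small parser. This is an illustrative helper (parse_message is not a library function), assuming lines end in CRLF and a blank line separates the headers from the body.

```python
def parse_message(raw):
    """Split a raw HTTP message into (initial line, headers, body)."""
    # Everything before the first blank line is the head; the rest is the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    initial_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    return initial_line, headers, body
```

The body is returned as bytes because it may be binary data.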

13
The Format of a Request
method SP URL SP version CRLF
header: value CRLF
header: value CRLF
CRLF
Entity Body
14
Request Example
  • GET /index.html HTTP/1.1 CRLF
  • Accept: image/gif, image/jpeg CRLF
  • User-Agent: Mozilla/4.0 CRLF
  • Host: www.ui.ac.ir:80 CRLF
  • Connection: Keep-Alive CRLF
  • CRLF

15
Request Example
  • GET /index.html HTTP/1.1
  • Accept: image/gif, image/jpeg
  • User-Agent: Mozilla/4.0
  • Host: www.ui.ac.ir:80
  • Connection: Keep-Alive
  • (blank line here)
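
As a rough sketch, the request in this example could be assembled as bytes like so. build_request is a hypothetical helper; the point is that every line ends in CRLF and an empty line terminates the header section.

```python
def build_request(method, path, headers, version="HTTP/1.1"):
    """Assemble a body-less HTTP request: request line, headers, blank line."""
    lines = ["%s %s %s" % (method, path, version)]
    lines += ["%s: %s" % (name, value) for name, value in headers]
    # CRLF after every line, plus one extra CRLF for the terminating blank line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request("GET", "/index.html", [
    ("Accept", "image/gif, image/jpeg"),
    ("User-Agent", "Mozilla/4.0"),
    ("Host", "www.ui.ac.ir:80"),
    ("Connection", "Keep-Alive"),
])
```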

16
The Format of a Response
version SP status code SP phrase CRLF (status line)
header: value CRLF
header: value CRLF
CRLF
Entity Body
17
Response Example
  • HTTP/1.0 200 OK
  • Date: Fri, 31 Dec 1999 23:59:59 GMT
  • Content-Type: text/html
  • Content-Length: 1354
  • (blank line here)
  • <html>
  • <body>
  • <h1>Hello World</h1>
  • (more file contents) . . .
  • </body>
  • </html>

19
Initial line
  • A typical initial request line
  • GET /path/to/file/index.html HTTP/1.0
  • Initial response line
  • HTTP/1.0 200 OK
  • HTTP/1.0 404 Not Found
  • Status code
  • 1xx indicates an informational message only
  • 2xx indicates success of some kind
  • 3xx redirects the client to another URL
  • 4xx indicates an error on the client's part
  • 5xx indicates an error on the server's part
  • Common status codes
  • 200 OK
  • 404 Not Found
  • 301 Moved Permanently
  • 302 Moved Temporarily
  • 303 See Other (HTTP 1.1 only)
  • 500 Internal Server Error
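
A short sketch of how a client might read the initial response line and sort the status code into the five classes listed above. The function names and the wording of the class labels are illustrative.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line):
    """Split 'HTTP/1.0 404 Not Found' into (version, code, phrase)."""
    # The phrase may contain spaces, so split at most twice.
    version, code, phrase = line.split(" ", 2)
    return version, int(code), phrase


def status_class(code):
    """Return the class of a status code from its first digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")
```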

20
Header lines
  • Typical request headers
  • From: email address of the requester
  • User-Agent: for example, User-Agent: Mozilla/3.0Gold
  • Typical response headers
  • Server: for example, Server: Apache/1.2b3-dev
  • Last-Modified: for example, Last-Modified: Sun, 19
    Feb 2006 23:59:59 GMT

21
Message body
  • In a response, this is where the requested
    resource is returned to the client (the most
    common use of the message body), or perhaps
    explanatory text if there's an error.
  • In a request, this is where user-entered data or
    uploaded files are sent to the server.
  • If an HTTP message includes a body, there are
    usually header lines in the message that describe
    the body. In particular,
  • The Content-Type header gives the MIME-type of
    the data in the body, such as text/html or
    image/gif.
  • The Content-Length header gives the number of
    bytes in the body.

22
MIME Media types
  • Multipurpose Internet Mail Extensions
  • HTTP sends the media type of the file using the
    Content-Type header
  • Some important media types are
  • text/plain, text/html
  • image/gif, image/jpeg
  • audio/basic, audio/wav
  • model/vrml
  • video/mpeg, video/quicktime
  • application/*: application-specific data that
    does not fall under any other MIME category, e.g.
    application/octet-stream
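
As an illustration, a server could derive the two body-describing headers from a file name and its contents. This sketch uses Python's standard mimetypes module; the body_headers helper and the fallback choice are assumptions for the example.

```python
import mimetypes


def body_headers(filename, body):
    """Return Content-Type and Content-Length headers for a response body."""
    # Guess the media type from the file extension; None if it is unknown.
    media_type, _encoding = mimetypes.guess_type(filename)
    return {
        # Fall back to the generic binary type when nothing better is known.
        "Content-Type": media_type or "application/octet-stream",
        # Content-Length counts bytes, not characters.
        "Content-Length": str(len(body)),
    }
```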

23
Sample HTTP exchange
  • To retrieve the file at the URL
    http://www.somehost.com/path/file.html
  • Request
  • GET /path/file.html HTTP/1.0
  • From: someuser@jmarshall.com
  • User-Agent: HTTPTool/1.0
  • (blank line here)
  • Response
  • HTTP/1.0 200 OK
  • Date: Fri, 31 Dec 1999 23:59:59 GMT
  • Content-Type: text/html
  • Content-Length: 1354
  • (blank line here)
  • <html> <body> <h1>Happy New Millennium!</h1>
    (more file contents) . . . </body> </html>

24
HTTP methods
  • GET: requests a resource by URL
  • HEAD
  • is just like a GET request, except it asks the
    server to return the response headers only, and
    not the actual resource (i.e. no message body).
  • This is useful to check characteristics of a
    resource without actually downloading it, thus
    saving bandwidth.
  • POST
  • A POST request is used to send data to the server
    to be processed in some way, like by a CGI
    script.
  • There's a block of data sent with the request, in
    the message body. There are usually extra headers
    to describe this message body, like Content-Type
    and Content-Length.
  • The request URI is not a resource to retrieve;
    it's usually a program to handle the data you're
    sending.
  • The HTTP response is normally program output, not
    a static file.
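
A hedged sketch of a POST request carrying form data in the message body. The script path and field names used in the example call are made up; the Content-Type shown is the usual one for URL-encoded form fields.

```python
from urllib.parse import urlencode


def build_post(path, fields):
    """Assemble a POST request whose body carries URL-encoded form fields."""
    body = urlencode(fields).encode("ascii")
    head = "\r\n".join([
        "POST %s HTTP/1.0" % path,
        # Extra headers describe the message body that follows.
        "Content-Type: application/x-www-form-urlencoded",
        "Content-Length: %d" % len(body),
    ])
    return head.encode("ascii") + b"\r\n\r\n" + body


# Hypothetical script path and form fields, for illustration only.
post = build_post("/path/script.cgi", {"name": "Ada Lovelace", "lang": "en"})
```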

25
HTTP 1.1
  • It is a superset of HTTP 1.0. Improvements
    include
  • Faster response, by allowing multiple
    transactions to take place over a single
    persistent connection.
  • Faster response and great bandwidth savings, by
    adding cache support.
  • Faster response for dynamically-generated pages,
    by supporting chunked encoding, which allows a
    response to be sent before its total length is
    known.
  • Efficient use of IP addresses, by allowing
    multiple domains to be served from a single IP
    address.
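
To make chunked encoding concrete, here is a minimal sketch of an encoder and decoder. Each chunk is sent as its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk marks the end. Trailers and other details of the full specification are left out, and the function names are illustrative.

```python
def encode_chunked(pieces):
    """Encode an iterable of byte strings as a chunked message body."""
    out = b""
    for piece in pieces:
        if piece:  # an empty chunk would be read as the end marker
            out += b"%X\r\n" % len(piece) + piece + b"\r\n"
    return out + b"0\r\n\r\n"  # zero-size chunk terminates the body


def decode_chunked(data):
    """Reassemble the body from chunked data (trailers are ignored)."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line may carry extensions after ';', which are skipped here.
        size = int(data[pos:line_end].split(b";")[0].decode("ascii"), 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
```

Because each chunk announces only its own size, the server can start sending before it knows the total length of a dynamically generated page.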

26
Manually Experimenting with HTTP
  • > telnet eng.ui.ac.ir 80
  • Trying 192.168.50.84...
  • Connected to eng.ui.ac.ir.
  • Escape character is '^]'.

27
Sending a Request
  • > GET /ladani/index.htm HTTP/1.0
  • (blank line)

28
The Response
  • HTTP/1.1 200 OK
  • Date: Fri, 29 Feb 2008 08:23:33 GMT
  • Server: Apache/2.0.52 (CentOS)
  • Last-Modified: Wed, 07 Nov 2007 12:27:44 GMT
  • ETag: "6ccb6-741c-43e55e05a5000"
  • Accept-Ranges: bytes
  • Content-Length: 29724
  • Connection: close
  • Content-Type: text/html; charset=WINDOWS-1256
  • (blank line)
  • <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0
    Transitional//EN">
  • <html>
  • <head>
  • <meta http-equiv="Content-Type" content="text/html;
    charset=windows-1252">
  • <meta name="GENERATOR" content="Microsoft FrontPage 5.0">
  • . . .

29
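  • Request: GET /ladani/index.htm HTTP/1.0
  • Response: HTTP/1.1 200 OK, followed by the HTML code

30
  • Request: GET /ladani/no-such-page.htm HTTP/1.0
  • Response: HTTP/1.1 404 Not Found, followed by the HTML code

31
  • Request: GET /index.html HTTP/1.1
  • Response: HTTP/1.1 400 Bad Request, followed by the HTML code
  • Why is it a Bad Request?
  • An HTTP/1.1 request must include a Host header, and this one has none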
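
The Host rule can be sketched as a check a server might run on the request head. This illustrative function looks only at the protocol version and the Host header; it ignores everything else a real server would verify, such as whether the resource exists.

```python
def check_host_rule(raw):
    """Return the status line implied by the HTTP/1.1 Host header rule alone."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    # The protocol version is the last token of the request line.
    version = lines[0].rsplit(" ", 1)[-1]
    header_names = {line.split(":", 1)[0].strip().lower() for line in lines[1:]}
    if version == "HTTP/1.1" and "host" not in header_names:
        return "HTTP/1.1 400 Bad Request"
    return "HTTP/1.1 200 OK"
```

An HTTP/1.0 request passes this check without a Host header, which matches the earlier examples.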
32
Session-persistent State
  • What does session-persistent state mean?
  • State information that is preserved between
    browsing sessions.
  • Information that is stored semi-permanently
    (i.e., on disk) for later access.
  • Why was calculator example not session-persistent?
  • Sum, current display, etc. not preserved if we
    went to a different website and back to
    calculator.

33
Why session-persistence?
  • User-based customizations.
  • MyYahoo, ETrade, etc.
  • Long transactions.
  • Electronic shopping carts.
  • Order preparation
  • Server-side state maintenance.
  • Large amounts of state info that you don't want
    to pass back and forth.

34
Cookie Overview
  • HTTP cookies are a mechanism for creating and
    using session-persistent state.
  • Cookies are simple string values that are
    associated with a set of URLs.
  • Servers set cookies using an HTTP header.
  • Client transmits the cookie as part of HTTP
    request whenever an associated URL is visited in
    the future.

35
Anatomy of a cookie.
  • Cookie has 6 parts
  • Name
  • Value
  • Domain
  • Path
  • Expiration
  • Security flag
  • Name and Value are required, others have default
    value.

36
Setting a cookie.
  • A cookie is set using the Set-Cookie header in
    an HTTP response.
  • String value of the Set-Cookie header is parsed
    into semicolon-separated fields that define the
    different parts of the cookie.
  • Cookie is stored by the client.

37
Sending cookies
  • Every time a client makes an HTTP request, it
    tests every cookie for a match.
  • Cookies match if
  • Cookie domain is a suffix of the URL's server name.
  • Cookie expiration has not passed.
  • Cookie path is a prefix of the URL path.
  • If the cookie security flag is on, the connection
    is secure.
  • If a match is made, then name/value pair of
    cookie is sent as Cookie header in request.
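
The matching rules can be sketched as follows. This follows the simplified rules on this slide rather than the exact behaviour of any particular browser, and the Cookie class and function names are illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Cookie:
    name: str
    value: str
    domain: str
    path: str = "/"
    expires: Optional[datetime] = None  # None: no expiration date was given
    secure: bool = False


def cookie_matches(cookie, host, path, is_secure, now):
    """Apply the four matching rules to one stored cookie."""
    domain = cookie.domain.lstrip(".")
    domain_ok = host == domain or host.endswith("." + domain)
    not_expired = cookie.expires is None or now < cookie.expires
    path_ok = path.startswith(cookie.path)
    # A cookie with the security flag is only sent over a secure connection.
    secure_ok = is_secure or not cookie.secure
    return domain_ok and not_expired and path_ok and secure_ok


def cookie_header(cookies, host, path, is_secure, now):
    """Build the Cookie request header from all matching cookies, if any."""
    pairs = ["%s=%s" % (c.name, c.value)
             for c in cookies if cookie_matches(c, host, path, is_secure, now)]
    return "Cookie: " + "; ".join(pairs) if pairs else None
```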

38
Setting a Cookie
  • Full cookie
  • Set-Cookie: my_cookie=This is my cookie value;
    domain=.eng.ui.ac.ir; path=/ladani; expires=Thu,
    06-Mar-08 12:00:00 GMT
  • Can have more than one Set-Cookie header, or can
    combine more than one cookie in one header by
    separating them with a comma (,)
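
A minimal sketch of how a client might split the value of a Set-Cookie header into its parts. It assumes a single cookie per header and the semicolon-separated name=value layout; parse_set_cookie is an illustrative helper, not a library call.

```python
def parse_set_cookie(value):
    """Split a Set-Cookie header value into the parts of one cookie."""
    first, *attributes = [part.strip() for part in value.split(";")]
    # The first field is always the cookie's own name and value.
    name, _, cookie_value = first.partition("=")
    cookie = {"name": name, "value": cookie_value, "secure": False}
    for attribute in attributes:
        key, _, attr_value = attribute.partition("=")
        if key.lower() == "secure":
            cookie["secure"] = True  # the security flag has no value
        elif key:
            cookie[key.lower()] = attr_value
    return cookie


parsed = parse_set_cookie(
    "my_cookie=This is my cookie value; domain=.eng.ui.ac.ir; "
    "path=/ladani; expires=Thu, 06-Mar-08 12:00:00 GMT"
)
```

Splitting on semicolons first keeps the comma inside the expiration date intact.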

39
Cookie Matching
  • Biggest misunderstanding
  • Servers do not RETRIEVE cookies!!!!
  • Servers RECEIVE cookies previously planted.
  • Step 1
  • Some response by the server installs a cookie with
    the Set-Cookie header.
  • Client saves cookie to disk.

40
Cookie Matching
  • Step 2
  • Browser goes to some page which matches
    previously received cookie.
  • Cookie name and value sent in request as Cookie
    HTTP header.
  • Step 3
  • CGI program detects presence of cookie and uses
    it.
  • Where is the cookie info?
  • Environment variable HTTP_COOKIE
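
For illustration, a CGI program written in Python could read the cookies from HTTP_COOKIE like this, using the standard http.cookies module. The helper name and the cookie names in the example are made up.

```python
from http.cookies import SimpleCookie


def cookies_from_environ(environ):
    """Return the cookies the client sent, as a plain name -> value dict."""
    jar = SimpleCookie()
    # The server copies the Cookie request header into HTTP_COOKIE.
    jar.load(environ.get("HTTP_COOKIE", ""))
    return {name: morsel.value for name, morsel in jar.items()}


# Hypothetical cookie names and values, for illustration only.
example = cookies_from_environ({"HTTP_COOKIE": "session=abc123; theme=dark"})
```

In a real CGI program the argument would be os.environ.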

41
Where are cookies stored on client?
  • Client-specific locations.
  • No standard.
  • Latest IE stores in a folder called Temporary
    Internet Files
  • Each cookie stored in a separate file.
  • Netscape stores in cookies.txt

42
Typical Cookie Usages
  • Cookies as Database Index
  • Most common use of cookies.
  • State information is kept in some sort of
    database and the cookie acts as an index.
  • Cookies as State Variables
  • Name of cookie is like variable name.
  • Value of cookie is state information.
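
A rough sketch of the "cookie as database index" pattern: the cookie carries only an opaque identifier, and the state itself stays on the server. The in-memory dict stands in for a real database, and the cookie name session_id is an assumption.

```python
import secrets

SESSIONS = {}  # stands in for a real database table keyed by session id


def start_session(state):
    """Store state on the server and return the id to place in a cookie."""
    session_id = secrets.token_hex(16)  # random, hard-to-guess index
    SESSIONS[session_id] = state
    return session_id


def load_session(cookies):
    """Look up the stored state using the id the client sent back."""
    return SESSIONS.get(cookies.get("session_id"))
```

The server would send the returned id in a Set-Cookie header, and later requests would bring it back in the Cookie header.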

43
Cookie Security
  • Security flag restricts when browser will send a
    cookie back to server.
  • Requires secure connection.
  • For example https in effect.
  • What does this mean about when the cookie was
    set?

44
First Web Server
  • Berners-Lee wrote two programs
  • A browser called WorldWideWeb
  • The world's first Web server, which ran on
    NeXTSTEP
  • The machine is on exhibition at CERN's public
    museum

45
Most Famous Web Servers
  • Apache HTTP Server from Apache Software
    Foundation
  • Internet Information Services (IIS) from
    Microsoft
  • Google Web Server (GWS)
  • Started from May 2007
  • Lighttpd
  • Powers several popular Web 2.0 sites like
    YouTube, Wikipedia and Meebo

46
Web Servers Usage Statistics
  • The most popular Web servers, used for public Web
    sites, are tracked by Netcraft Web Server Survey
  • Details given by Netcraft Web Server Reports
  • Apache is the most popular since April 1996
  • Currently (February 2008) about
  • 50.93%: Apache
  • 35.56%: Microsoft (IIS, PWS, etc.)
  • 5.16%: Google
  • 0.99%: Lighttpd

47
Web Servers Usage Statistics cont.
Chart: Total Sites Across All Domains, August 1995 - February 2008
48
Web Servers Usage Statistics cont.
Chart: Market Share for Top Servers Across All Domains, August 1995 - February 2008
49
Web Servers Usage Statistics cont.
Chart: Totals for Active Servers Across All Domains, June 2000 - February 2008
50
Apache (A PAtCHy) Web Server
  • Origins: NCSA (Univ. of Illinois, Urbana-Champaign)
  • Now: Apache Software Foundation (www.apache.org),
    developers world-wide
  • Most widely used web server today (Netcraft web
    survey, 2/2008)
  • Open source software
  • Geographically distributed developers
  • A modular, extensible design was needed so that
    third-party developers could override or extend
    basic characteristics

51
Web Server Processing Steps
52
Apache HTTP Server
  • Apache Core
  • Receives client request
  • Typically allocates a new process for each incoming
    request
  • Allocates request record
  • Invokes handlers on individual modules in
    sequence
  • Modules register handlers during configuration
  • Handler
  • Request record passed as single parameter
  • Each handler reads/modifies the request record

53
Web Server Phases
  • Apache core invokes a handler for each phase
  • Resolve document reference (URI) to a local file
    name (or CGI program and parameters)
  • Client authentication (verify client identity)
  • Client access control (determine access rights)
  • Request access control (check if access allowed)
  • MIME type determination of the response
  • General phase for handling leftovers (e.g., check
    syntax of returned response, build up user
    profile)
  • Transmission of the response to client
  • Logging data on the processing of the request
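
The core-and-handlers idea can be sketched in a few lines. This is a toy model, not Apache's actual API: the phase names, the dict used as the request record, and the three handlers are all made up for illustration.

```python
# Phases run in this fixed order, regardless of registration order.
PHASES = ("resolve", "authenticate", "access", "mime_type", "respond", "log")
HANDLERS = {phase: [] for phase in PHASES}


def register(phase, handler):
    """Called by a module at configuration time to hook into one phase."""
    HANDLERS[phase].append(handler)


def process(record):
    """The core: pass the request record to each phase's handlers in turn."""
    for phase in PHASES:
        for handler in HANDLERS[phase]:
            handler(record)  # each handler reads/modifies the record
    return record


def resolve_uri(record):
    record["filename"] = "/var/www/html" + record["uri"]  # assumed server root


def guess_mime(record):
    is_html = record["filename"].endswith(".html")
    record["content_type"] = "text/html" if is_html else "application/octet-stream"


def log_request(record):
    record["log"] = "%s -> %s" % (record["uri"], record["filename"])


# Registration order does not matter; the phase order above decides.
register("log", log_request)
register("mime_type", guess_mime)
register("resolve", resolve_uri)

result = process({"uri": "/index.html"})
```

Phases with no registered handler are simply skipped in this sketch, which is how a module can extend one step without touching the others.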

54
References
  • http://www.jmarshall.com/easy/http/
  • TCP/IP Tutorial and Technical Overview,
    Rodriguez, Gatrell, Karas, Peschke, IBM Redbooks,
    August 2001
  • Wikipedia, the free encyclopedia
  • Apache: The Definitive Guide, 2nd edition, Ben
    Laurie, Peter Laurie, O'Reilly, February 1999
  • Webmaster in a Nutshell, 1st edition, Stephen
    Spainhour, Valerie Quercia, O'Reilly, October
    1996
  • Netcraft February 2006 Web Server Survey