Web Servers, Data Transmission and Exchange

1
Web Servers, Data Transmission and Exchange
  • Zachary G. Ives
  • University of Pennsylvania
  • CIS 455 / 555 Internet and Web Systems
  • January 28, 2009

2
Today
  • Finish discussion of thread pools and Web servers
  • Communications: sending data
  • Physical vs. logical representation
  • Encoding and management of heterogeneity

3
HTTP Overview
  • Requests
  • A small number of request types (GET, POST, PUT,
    DELETE)
  • Request may contain additional information, e.g.
    client info, parameters for forms, etc.
  • Responses
  • Response codes: 200 (OK), 404 (Not Found), etc.
  • Metadata: content's MIME type, length, etc.
  • The payload or data

4
A Simple HTTP Request
  • GET /cis455/index.html HTTP/1.1
    If-Modified-Since: Sun, 25 Jan 2009 11:12:23 GMT
    Referer: http://www.cis.upenn.edu/index.html
  • Requests data at a path using the HTTP 1.1 protocol
  • Example response:
  • HTTP/1.1 200 OK
    Date: Tue, 28 Jan 2009 09:56:00 GMT
    Last-Modified: Wed, 25 Jan 2009 08:30:00 GMT
    Content-Type: text/html
    Content-Length: 3931
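To make the message layout concrete, here is a minimal sketch of splitting the request above into its request line and headers. The function name `parse_request` is hypothetical, and the sketch assumes CRLF line endings and ignores any message body.

```python
def parse_request(raw: str):
    """Split a raw HTTP request into (method, path, version, headers)."""
    head, _, _body = raw.partition("\r\n\r\n")   # headers end at a blank line
    lines = head.split("\r\n")
    method, path, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")      # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers


raw = (
    "GET /cis455/index.html HTTP/1.1\r\n"
    "If-Modified-Since: Sun, 25 Jan 2009 11:12:23 GMT\r\n"
    "Referer: http://www.cis.upenn.edu/index.html\r\n"
    "\r\n"
)
method, path, version, headers = parse_request(raw)
```

Header names are lower-cased here because HTTP header names are case-insensitive.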

5
Request Types
  • GET: retrieve the resource at a URL
  • PUT: publish the specified data at a URL
  • DELETE: (self-explanatory)
  • POST: submit form content

6
The Thread Pool Request Handler Queue
  • (on board)
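The slide leaves the details to the board. As a rough sketch only, one common arrangement is a fixed set of worker threads pulling requests from a shared queue. Everything here (`handle`, `worker`, `serve`, the `None` shutdown sentinel) is an assumption for illustration, not the design presented in lecture.

```python
import queue
import threading


def handle(request: str) -> str:
    # Hypothetical handler; a real server would parse HTTP and build a response.
    return f"handled {request}"


def worker(requests: queue.Queue, results: list, lock: threading.Lock) -> None:
    while True:
        request = requests.get()       # blocks until a request is queued
        if request is None:            # sentinel: shut this worker down
            break
        response = handle(request)
        with lock:                     # results list is shared between threads
            results.append(response)


def serve(all_requests, num_workers: int = 4) -> list:
    requests: queue.Queue = queue.Queue()
    results: list = []
    lock = threading.Lock()
    pool = [threading.Thread(target=worker, args=(requests, results, lock))
            for _ in range(num_workers)]
    for t in pool:
        t.start()
    for r in all_requests:             # stands in for the accept loop
        requests.put(r)
    for _ in pool:                     # one sentinel per worker
        requests.put(None)
    for t in pool:
        t.join()
    return results


responses = serve(["/a", "/b", "/c"], num_workers=2)
```

Responses may complete in any order, since the workers run concurrently.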

7
Forms Returning Data to the Server
  • HTML forms allow assignments of values to
    variables
  • Two means of submitting forms to apps
  • GET-style: within the URL
  • GET /home/my.cgi?param=val&param2=val2
  • POST-style: as the data
  • POST /home/second.cgi
  • Content-Length: 35
  • searchKey=Penn&where=www.google.com
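The two submission styles carry the same encoded name=value pairs; only their position in the request differs. A small sketch using Python's standard `urllib.parse` follows. The `Content-Type` header is an addition not shown on the slide, though it is the usual type for form posts.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

fields = {"searchKey": "Penn", "where": "www.google.com"}
body = urlencode(fields)               # name=value pairs joined by "&"

# GET-style: the encoded pairs ride in the URL after "?".
get_line = f"GET /home/my.cgi?{body} HTTP/1.1"

# POST-style: the same pairs are sent as the request body.
post_request = (
    "POST /home/second.cgi HTTP/1.1\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body.encode('ascii'))}\r\n"
    "\r\n" + body
)

# The server side reverses the encoding.
decoded = parse_qs(urlsplit("/home/my.cgi?" + body).query)
```

`parse_qs` returns a list per name because a form may submit the same name more than once.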

8
Authentication and Authorization
  • Authentication
  • At minimum, user ID and password authenticate the
    requestor
  • Client may wish to authenticate the server, too!
  • SSL (we'll discuss this more later)
  • Part of SSL: certificate from trusted server,
    validating machine
  • Also public key for encrypting client's
    transmissions
  • Authorization
  • Determine what user can access
  • For files and applications: typically, an access
    control list
  • If data comes from a database, may also have
    view-based security

9
Programming Support in Web Servers
  • CGI (Common Gateway Interface): the oldest
  • A CGI is a separate program, often in Perl,
    invoked by the server
  • Certain info is passed from server to CGI via
    Unix-style environment variables
  • QUERY_STRING, REMOTE_HOST, CONTENT_TYPE, ...
  • HTTP POST data is read from stdin
  • Interface to persistent process
  • In essence, how communication with a database is
    done: Oracle or MySQL is running on the side
  • Communicate via pipes, APIs like ODBC/JDBC, etc.
  • Server module running in the same process
  • Might be custom code (e.g., Apache extension) or
    an interpreter/runtime system
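The CGI contract described above can be sketched as a function that takes the environment and standard input explicitly, so it can be exercised without a web server. `run_cgi` is a hypothetical name; `REQUEST_METHOD` and `CONTENT_LENGTH` are standard CGI variables that the slide does not list. A real script would pass `os.environ` and `sys.stdin` and print the result.

```python
import html
import io
from urllib.parse import parse_qs


def run_cgi(environ: dict, stdin) -> str:
    """Hypothetical CGI body: inputs arrive via env vars and stdin."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    if environ.get("REQUEST_METHOD") == "POST":
        length = int(environ.get("CONTENT_LENGTH", "0"))
        params.update(parse_qs(stdin.read(length)))   # POST data comes from stdin
    items = [f"<li>{html.escape(k)} = {html.escape(v[0])}</li>"
             for k, v in sorted(params.items())]
    # A CGI program writes headers, a blank line, then the payload to stdout.
    return "Content-Type: text/html\r\n\r\n<ul>" + "".join(items) + "</ul>"


output = run_cgi(
    {"REQUEST_METHOD": "POST", "QUERY_STRING": "param=val",
     "CONTENT_LENGTH": "14"},
    io.StringIO("searchKey=Penn"),
)
```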

10
Server Modules
  • Interpreters
  • JavaScript/JScript, PHP, ASP, ...
  • Often a full-fledged programming language
  • Code is generally embedded within HTML, not
    stand-alone
  • Custom runtimes/virtual machines
  • Most modern Perl runtimes, Java servlets, ASP.NET
  • A virtual machine runs within the web server
    process
  • Functions are invoked within that JVM to handle
    each request
  • Code is generally written as usual, but may need
    to use HTML to create UI rather than standard GUI
    APIs
  • Most of these provide (at least limited)
    protection mechanisms

11
Servlets
  • An interesting model for programming applications
    in Java
  • A servlet is a subclass of HttpServlet
  • It overrides methods doGet() or doPost()
  • It's given a number of objects:
    HttpServletRequest (includes info about
    parameters, browser, etc.), HttpServletResponse
    (a means for sending info back to the browser,
    including data, forwarding requests, etc.)
  • There's a notion of a session that can be used to
    share state across doGet()/doPost() invocations;
    it's generally connected with a cookie
  • Those of you who took CSE 330/CIS 550 should be
    generally familiar with servlets
  • Those who didn't should be able to catch up by
    looking at, e.g.,
    http://www.apl.jhu.edu/hall/java/Servlet-Tutorial/
  • http://www.novocode.com/doc/servlet-essentials/
  • Your homework assignment will be to build a
    simple servlet engine a la Tomcat
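The dispatch pattern behind servlets can be sketched in a few lines. This is a Python analogy, not the Java Servlet API: `Request`, `Response`, `Engine`, and the `do_get` naming are all hypothetical stand-ins for HttpServletRequest, HttpServletResponse, the container, and doGet().

```python
class Request:
    def __init__(self, method, path, params=None):
        self.method, self.path, self.params = method, path, params or {}


class Response:
    def __init__(self):
        self.status, self.body = 200, []

    def write(self, text):
        self.body.append(text)


class HttpServlet:
    """Base class: the engine calls service(), which dispatches by method."""

    def service(self, request, response):
        handler = getattr(self, "do_" + request.method.lower(), None)
        if handler is None:
            response.status = 405          # method not overridden by subclass
        else:
            handler(request, response)


class HelloServlet(HttpServlet):
    def do_get(self, request, response):
        response.write("Hello, " + request.params.get("name", "world"))


class Engine:
    """Maps URL paths to servlet instances and routes each request."""

    def __init__(self):
        self.routes = {}

    def register(self, path, servlet):
        self.routes[path] = servlet

    def dispatch(self, request):
        response = Response()
        servlet = self.routes.get(request.path)
        if servlet is None:
            response.status = 404
        else:
            servlet.service(request, response)
        return response


engine = Engine()
engine.register("/hello", HelloServlet())
ok = engine.dispatch(Request("GET", "/hello", {"name": "Penn"}))
missing = engine.dispatch(Request("GET", "/nope"))
unsupported = engine.dispatch(Request("POST", "/hello"))
```

A real engine would also parse the HTTP request off a socket, manage sessions, and run each dispatch on a pooled thread.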

12
(Cross-)Session State: Cookies
  • Major problem with the sessionless nature of HTTP:
    how do we keep info between connections?
  • Cookie: an opaque string associated with a web
    site, stored at the browser
  • Created in HTTP response with Set-Cookie: xxx
  • Passed in HTTP header as Cookie: xxx
  • Interpretation is up to the application
  • Usually, object-value pairs passed in HTTP
    header
  • Cookie: user=Joe; pwd=blob
  • Often have an expiration
  • Very common: session cookies
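Both directions of the cookie exchange can be sketched with Python's standard `http.cookies` module. The session ID value and the one-hour lifetime are made-up examples.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header for a response.
outgoing = SimpleCookie()
outgoing["session"] = "abc123"            # hypothetical opaque session ID
outgoing["session"]["max-age"] = 3600     # expiration, in seconds
set_cookie_header = outgoing.output()     # a "Set-Cookie: ..." header line

# Server side, next request: parse the Cookie header the browser sent back.
incoming = SimpleCookie()
incoming.load("user=Joe; pwd=blob")
user = incoming["user"].value
pwd = incoming["pwd"].value
```

The server sees only the string it issued earlier; what the values mean is up to the application.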

13
Persistent State: Interfacing with a Database
  • A very common operation
  • Read some data from a database, output in a web
    form
  • e.g., postings on Slashdot, items for a product
    catalog, etc.
  • Three problems, abstracted away by ODBC/ADO/JDBC:
  • Impedance mismatch from relational DBs to objects
    in Java (etc.)
  • Standard API for different databases
  • Physical implementation for each DB
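As an illustration of the read-then-render pattern, here is a sketch using Python's DB-API, which plays a role comparable to JDBC. The in-memory SQLite database, the `items` table, and its rows are invented for the example.

```python
import html
import sqlite3

conn = sqlite3.connect(":memory:")        # stand-in for a real database server
conn.execute("CREATE TABLE items (name TEXT, price REAL)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [("Widget", 9.5), ("Gadget & Co", 20.0)])


def render_catalog(connection) -> str:
    """Turn relational rows into an HTML table for the response payload."""
    rows = connection.execute("SELECT name, price FROM items ORDER BY name")
    cells = [f"<tr><td>{html.escape(name)}</td><td>{price:.2f}</td></tr>"
             for name, price in rows]
    return "<table>" + "".join(cells) + "</table>"


page = render_catalog(conn)
```

Swapping in another database would change only the `connect` call, which is the "standard API" point made above.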

14
Going One Step Further
  • Today, data doesn't just come from databases
  • Web services, e.g., Amazon or corporate intranet
    services
  • External entities like credit card companies,
    shippers
  • Web pages
  • Etc.

15
Sending Data
  • How do we send data within a program?
  • What is the implicit model?
  • How does this change when we need to make the
    data persistent?
  • What happens when we are coupling systems?
  • How do we send data between programs on the same
    machine?
  • Between different machines?

16
Marshalling
  • Converting from an in-memory data structure to
    something that can be sent elsewhere
  • Pointers -> something else
  • Specific byte orderings
  • Metadata
  • Note that the same logical data gets a different
    physical encoding
  • A specific case of Codd's idea of
    logical-physical separation
  • Data model vs. data

17
Communication and Streams
  • When storing data to disk, we have a combination
    of sequential and random access
  • When sending data on the wire, data is only
    sequential
  • Stream-based communication based on packets
  • What are the implications here?
  • Pipelining, incremental evaluation, ...
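One implication is that a receiver can process data as it arrives instead of waiting for the whole message. A minimal sketch, assuming newline-delimited records and packets that may split a record anywhere; `lines_from_packets` is a hypothetical helper.

```python
def lines_from_packets(packets):
    """Yield complete lines as soon as enough bytes have arrived."""
    buffer = b""
    for packet in packets:
        buffer += packet
        while b"\n" in buffer:
            line, _, buffer = buffer.partition(b"\n")
            yield line                  # emitted before later packets arrive
    if buffer:
        yield buffer                    # trailing data without a newline


packets = [b"GET /a", b"\nGET /b\nGE", b"T /c\n"]
lines = list(lines_from_packets(packets))
```

Because this is a generator, a consumer can start work on the first line while later packets are still in flight.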

18
Why Data Interchange Is Hard
  • Need to be able to understand
  • Data encoding (physical data model)
  • May have syntactic heterogeneity
  • Endian-ness, marshalling issues
  • Impedance mismatches
  • Data representation (logical data model)
  • May have semantic heterogeneity
  • Imprecise and ambiguous values/descriptions

19
Examples
  • MP3 ID3 format: record at end of file
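Assuming the slide refers to ID3v1, the tag is a fixed 128-byte record at the end of the file: the marker "TAG", then title, artist, and album (30 bytes each), year (4), comment (30), and a one-byte genre code. A sketch of reading it, with made-up sample bytes in place of a real MP3:

```python
import struct

ID3V1 = struct.Struct(">3s30s30s30s4s30sB")   # 128 bytes in total


def read_id3v1(data: bytes):
    """Return ID3v1 fields from the end of an MP3's bytes, or None."""
    if len(data) < ID3V1.size or data[-128:-125] != b"TAG":
        return None
    _tag, title, artist, album, year, _comment, genre = ID3V1.unpack(data[-128:])

    def clean(raw: bytes) -> str:
        # Fields are padded with zero bytes (or spaces) to their fixed width.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {"title": clean(title), "artist": clean(artist),
            "album": clean(album), "year": clean(year), "genre": genre}


audio = b"\xff\xfb" + b"\x00" * 100            # fake audio frames
tag = (b"TAG" + b"Song".ljust(30, b"\x00") + b"Artist".ljust(30, b"\x00")
       + b"Album".ljust(30, b"\x00") + b"2009" + b"\x00" * 30 + bytes([17]))
info = read_id3v1(audio + tag)
untagged = read_id3v1(audio)
```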

20
Examples
  • JPEG JFIF header
  • Start of Image (SOI) marker -- two bytes (FFD8)
  • JFIF marker (FFE0)
  • length -- two bytes
  • identifier -- five bytes: 4A, 46, 49, 46, 00
    (the ASCII code equivalent of a zero-terminated
    "JFIF" string)
  • version -- two bytes: often 01, 02
    • the most significant byte is used for major
      revisions
    • the least significant byte for minor revisions
  • units -- one byte: units for the X and Y
    densities
    • 0 => no units, X and Y specify the pixel aspect
      ratio
    • 1 => X and Y are dots per inch
    • 2 => X and Y are dots per cm
  • Xdensity -- two bytes
  • Ydensity -- two bytes
  • Xthumbnail -- one byte: 0 = no thumbnail
  • Ythumbnail -- one byte: 0 = no thumbnail
  • (RGB)n -- 3n bytes: packed (24-bit) RGB values
    for the thumbnail pixels, n = Xthumbnail *
    Ythumbnail
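The field list above maps directly onto a fixed binary layout. A sketch of decoding it with `struct`, using the big-endian byte order JPEG uses for multi-byte fields; the sample bytes are hand-built for illustration, not taken from a real image.

```python
import struct

# SOI, APP0 marker, length, identifier, version (major, minor), units,
# Xdensity, Ydensity, Xthumbnail, Ythumbnail: 20 bytes, big-endian.
JFIF_HEADER = struct.Struct(">2s2sH5sBBBHHBB")


def parse_jfif(data: bytes) -> dict:
    (soi, app0, length, ident, major, minor, units,
     xdensity, ydensity, xthumb, ythumb) = JFIF_HEADER.unpack_from(data, 0)
    if soi != b"\xff\xd8" or app0 != b"\xff\xe0" or ident != b"JFIF\x00":
        raise ValueError("not a JFIF file")
    thumb_bytes = 3 * xthumb * ythumb              # packed 24-bit RGB pixels
    start = JFIF_HEADER.size
    return {"length": length, "version": (major, minor), "units": units,
            "xdensity": xdensity, "ydensity": ydensity,
            "thumbnail": data[start:start + thumb_bytes]}


# Version 1.02, 72 x 72 dots per inch, no thumbnail.
sample = bytes.fromhex("ffd8 ffe0 0010 4a46494600 0102 01 0048 0048 00 00")
header = parse_jfif(sample)
```

The point of the slide holds here: nothing in the bytes says what they mean, so the reader must already know the layout.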

21
Finding File Formats
  • http://www.wikipedia.org/
  • http://www.wotsit.org/
  • etc.

22
The Problem
  • You need to look into a manual to find file
    formats
  • (At best, e.g., MS .DOC file format)
  • The Web is about making data exchange easier.
    Maybe we can do better!
  • The mother of all file formats

23
Desiderata for Data Interchange
  • Ability to represent many kinds of information
  • Different data structures
  • Hardware-independent encoding
  • Endian-ness, UTF vs. ASCII vs. EBCDIC
  • Standard tools and interfaces
  • Ability to define shape of expected data
  • With forwards- and backwards-compatibility!
  • That's XML

24
Consumers of XML
  • A myriad of tools and interfaces, including
  • DOM: document object model
  • Standard OO representation of an XML tree
  • SAX: simple API for XML
  • An event-driven parser interface for XML
  • startElement, endElement, etc.
  • Ant: Java-based make tool with XML makefile
  • XPath, XQuery, XSL, XSLT
  • Web service standards
  • Anything AJAX (mash-ups)
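The difference between the two parser interfaces shows up clearly on a tiny document. This sketch uses Python's standard `xml.sax` and `xml.dom.minidom`; the `EventLog` class and the abbreviated document are illustrative only.

```python
import xml.sax
from xml.dom import minidom

DOC = (b"<dblp><article><title>The 1995 SQL Reunion</title>"
       b"<year>1997</year></article></dblp>")


class EventLog(xml.sax.ContentHandler):
    """SAX: the parser calls back as it streams through the document."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))


log = EventLog()
xml.sax.parseString(DOC, log)

# DOM: the whole tree is built in memory, then navigated as objects.
dom = minidom.parseString(DOC)
title = dom.getElementsByTagName("title")[0].firstChild.data
```

SAX suits large or streaming inputs since nothing is retained; DOM is more convenient when the application needs random access to the tree.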

25
XML as a Data Model
  • XML information set includes 7 types of nodes
  • Document (root)
  • Element
  • Attribute
  • Processing instruction
  • Text (content)
  • Namespace
  • Comment
  • XML data model includes this, plus typing info,
    plus order info and a few other things

26
Example XML Document
  • (Slide callouts label the processing instruction,
    open-tags, elements, attributes, and close-tags)

    <?xml version="1.0" encoding="ISO-8859-1" ?>
    <dblp>
      <mastersthesis mdate="2002-01-03" key="ms/Brown92">
        <author>Kurt P. Brown</author>
        <title>PRPL: A Database Workload Specification Language</title>
        <year>1992</year>
        <school>Univ. of Wisconsin-Madison</school>
      </mastersthesis>
      <article mdate="2002-01-03" key="tr/dec/SRC1997-018">
        <editor>Paul R. McJones</editor>
        <title>The 1995 SQL Reunion</title>
        <journal>Digital System Research Center Report</journal>
        <volume>SRC1997-018</volume>
        <year>1997</year>
        <ee>db/labs/dec/SRC1997-018.html</ee>
        <ee>http://www.mcjones.org/System_R/SQL_Reunion_95/</ee>
      </article>
    </dblp>

27
XML Data Model Visualized (Document Object Model)
  • [Figure: tree for the example document. Legend:
    root, p-i (processing instruction), element,
    attribute, text. The Root node has children ?xml
    and dblp; dblp contains mastersthesis (attributes
    mdate, key; elements author, title, year, school)
    and article (attributes mdate, key; elements
    editor, title, journal, volume, year, ee, ee);
    the leaves are text values.]

28
A Few Common Uses of XML
  • Serves as an extensible HTML
  • Allows custom tags (e.g., used by MS Word,
    OpenOffice)
  • Supplement it with stylesheets (XSL) to define
    formatting
  • Provides an exchange format for data (still need
    to agree on terminology)
  • Tables, objects, etc.
  • Format for marshalling and unmarshalling data in
    Web Services

29
XML as a Super-HTML (MS Word)

    <h1 class="Section1"><a name="_top" />CIS 550: Database and Information Systems</h1>
    <h2 class="Section1">Fall 2003</h2>
    <p class="MsoNormal">
      <place>311 Towne</place>, Tuesday/Thursday
      <time Hour="13" Minute="30">1:30PM - 3:00PM</time>
    </p>

30
XML Easily Encodes Relations
Student-course-grade
    <student-course-grade>
      <tuple><sid>1</sid><course>330-f03</course><grade>B</grade></tuple>
      <tuple><sid>23</sid><course>455-s04</course><grade>A</grade></tuple>
    </student-course-grade>
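The encoding above is mechanical: one element per tuple, one child element per column. A sketch of converting rows to this XML and back with Python's `xml.etree.ElementTree`; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

COLUMNS = ("sid", "course", "grade")
rows = [("1", "330-f03", "B"), ("23", "455-s04", "A")]


def relation_to_xml(name, columns, tuples) -> str:
    root = ET.Element(name)
    for values in tuples:
        t = ET.SubElement(root, "tuple")
        for column, value in zip(columns, values):
            ET.SubElement(t, column).text = value    # one element per column
    return ET.tostring(root, encoding="unicode")


def xml_to_relation(text, columns):
    root = ET.fromstring(text)
    return [tuple(t.findtext(c) for c in columns)
            for t in root.findall("tuple")]


encoded = relation_to_xml("student-course-grade", COLUMNS, rows)
decoded = xml_to_relation(encoded, COLUMNS)
```

All values come back as strings; recovering types such as integers requires extra information, for example a schema.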

31
It Also Encodes Objects (with Pointers
Represented as IDs)
    <projects>
      <project class="cse455">
        <type>Programming</type>
        <memberList>
          <teamMember>Joan</teamMember>
          <teamMember>Jill</teamMember>
        </memberList>
        <codeURL>www.</codeURL>
        <incorporatesProjectFrom class="cse330" />
      </project>
    </projects>
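Following an ID-style reference means looking the target up by its identifying attribute, since XML itself has no pointers. In this sketch the second `project` element (class cse330) and its contents are invented so that the reference has something to resolve to.

```python
import xml.etree.ElementTree as ET

DOC = """
<projects>
  <project class="cse455">
    <type>Programming</type>
    <incorporatesProjectFrom class="cse330" />
  </project>
  <project class="cse330">
    <type>Database</type>
  </project>
</projects>
"""

root = ET.fromstring(DOC)

# Index every project by its identifying attribute (the "pointer target").
by_id = {p.get("class"): p for p in root.findall("project")}


def incorporated(project):
    """Dereference each incorporatesProjectFrom element via the index."""
    return [by_id[ref.get("class")]
            for ref in project.findall("incorporatesProjectFrom")]


targets = incorporated(by_id["cse455"])
target_types = [t.findtext("type") for t in targets]
```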

32
XML and Code
  • Web Services (.NET, Java web service toolkits)
    are using XML to pass parameters and make
    function calls: marshalling as part of remote
    procedure calls
  • SOAP and WSDL
  • Why?
  • Easy to be forwards-compatible
  • Easy to read over and validate (?)
  • Generally firewall-compatible
  • Drawbacks? XML is a verbose and inefficient
    encoding!
  • But if the calls are only sending a few 100s of
    bytes, who cares?
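To see both the marshalling and the verbosity, here is a sketch using XML-RPC from Python's standard library. XML-RPC is a simpler relative of SOAP, swapped in because it needs no third-party toolkit; the method name `add` and its arguments are made up.

```python
import struct
import xmlrpc.client

# Marshal a call to a hypothetical remote function add(2, 3) as XML.
call = xmlrpc.client.dumps((2, 3), methodname="add")

# The receiving side unmarshals it back into parameters and a method name.
params, method = xmlrpc.client.loads(call)

# The same two integers in a compact binary encoding, for comparison.
binary = struct.pack(">ii", 2, 3)
overhead = len(call.encode("utf-8")) / len(binary)
```

The XML form is many times larger than the 8-byte binary form, but it is self-describing and travels over plain HTTP.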

33
XML: When Tags Are Used by Different Sources
  • Namespaces allow us to specify a context for
    different tags
  • Two parts
  • Binding of namespace to URI
  • Qualified names

    <tag xmlns:myns="http://www.fictitious.com/mypath"
         xmlns="http://www.default/mypath">
      <thistag>is in default namespace</thistag>
      <myns:thistag>this is a different tag</myns:thistag>
    </tag>
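A parser makes the two parts visible: each tag name is resolved to a (namespace URI, local name) pair, so the two `thistag` elements are distinct. A sketch with Python's `xml.etree.ElementTree`, which writes qualified names as `{uri}local`; the prefix `m` used in the lookup is arbitrary.

```python
import xml.etree.ElementTree as ET

DOC = (
    '<tag xmlns:myns="http://www.fictitious.com/mypath" '
    'xmlns="http://www.default/mypath">'
    "<thistag>is in default namespace</thistag>"
    "<myns:thistag>this is a different tag</myns:thistag>"
    "</tag>"
)

root = ET.fromstring(DOC)
default_tag = root[0].tag      # bound to the default namespace
myns_tag = root[1].tag         # bound to the myns namespace

# Lookups use the URI; the prefix chosen by the query need not match the doc.
ns = {"m": "http://www.fictitious.com/mypath"}
found = root.find("m:thistag", ns).text
```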

34
XML Isn't Enough on Its Own
  • It's too unconstrained for many cases!
  • How will we know when we're getting garbage?
  • How will we query?
  • How will we understand what we got?