LIS650 lecture 5 information architecture javascript, http and apache

Provided by: open6
Learn more at: http://openlib.org


1
LIS650 lecture 5: information architecture, javascript, http and apache
  • Thomas Krichel
  • 2004-12-18

2
Sensitivity exercise
  • What do you hate about a web site?
  • What do you like about a web site?
  • All issues to do with that fall into three
    categories
  • Technical
  • Look and Feel
  • Architecture

3
Reasons to hate a web site
  • Can't find it.
  • Page crowded
  • Loud colours
  • Gratuitous use of technology
  • Inappropriate tone
  • Designer centered
  • Lack of attention to detail

4
Reasons to like a web site
  • useful
  • attractive to look at
  • thought provoking
  • findability
  • personalisation

5
Why is it so difficult
  • technical expertise
  • graphical design expertise
  • overall structure

6
IA determines
  • organization
  • content
  • functionality
  • navigation
  • labeling
  • searching

7
Good IA is important for the producer
  • web site an important point of first contact
  • needs to determine overall design before the site
    is built
  • reorganizing a site is
  • costly
  • difficult

8
Topics covered
  • classification
  • navigation
  • labelling
  • making a site searchable

9
The challenge of classification
  • ambiguity
  • "a tomato is a red or yellowish fruit with a
    juicy pulp, used as a vegetable, botanically it
    is a berry."
  • heterogeneity
  • in a library
  • on a web site
  • granularity
  • format
  • difference in perspective
  • internal politics

10
Organizational schemes
  • Exact schemes
  • alphabetical
  • chronological
  • geographical
  • ambiguous schemes
  • topical: should be there, but not the only scheme
  • task-oriented
  • audience-specific: open or closed
  • metaphor-driven: not as overall organization
  • Hybrid schemes are not good

11
The mixed-up library
  • adult
  • arts and humanities
  • community center
  • get a library card
  • learn about our library
  • science
  • teen
  • youth

12
Organizational form: hierarchies
  • keep balance between breadth and depth
  • obey the 7 ± 2 rule horizontally,
  • no more than 5 levels vertically
  • cross-link ambiguous items if really necessary
  • keep new sites shallow

13
organizational forms: hypertext
  • great flexibility
  • great potential for confusion
  • not good as a prime organizational structure

14
organizational forms: database
  • powerful for searching
  • useful if there is controlled vocabulary
  • easy reorganization
  • on the fly or static generation of pages
  • but ensure robot indexing
  • not good for heterogeneous data

15
Navigation aids
  • provide context
  • allow for flexibility of movement
  • support associative learning
  • danger of overwhelming the user

16
browser navigation aids
  • They include
  • open
  • back
  • forward
  • history
  • bookmarks
  • prospective view
  • visited url color
  • sites should not corrupt the browser.

17
navigation
  • the "you are here" mark
  • pages should indicate site name
  • navigation should be consistent
  • navigation not to refer to current pages
  • highlight current page in a different way
  • allow for lateral navigation

18
Types of navigational systems
  • global hierarchical navigation systems
  • text
  • icon
  • local navigation systems: integration with the
    global system can be challenging
  • ad hoc navigation: clear labels are required

19
Frames are problematic
  • potential waste of page real estate
  • speed of display
  • disrupt the page model
  • complex design

20
remote navigation system I
  • table of contents
  • good in a hierarchical web site
  • reinforce the hierarchy
  • facilitate known-item access
  • resist temptation to overwhelm user
  • indexes
  • present key terms without hierarchy
  • key terms found from search behavior
  • links terms to final destination pages
  • use term rotation

21
remote navigation systems II
  • site maps
  • is a graphical representation of the site's
    contents
  • new because no equivalent in print
  • there are automated tools to generate site maps
  • seldom well done
  • to be kept simple
  • guided tours
  • important for sites with restricted access
  • should feature linear navigation

22
labelling
  • a label is a short expression that represents a
    larger set of information.
  • example: "contact us"
  • labelling is an outgrowth of site organization,
    that we have discussed previously.
  • labelling communicates the organization of the
    site

23
Why bother
  • we need to guess at how users respond to a label
  • users will not spend much time interpreting the
    label
  • appropriate tone, no "hot", "cool", "stuff"
  • should reflect thinking of the user, not of the
    owner
  • it is easy to have unplanned labelling

24
Good labelling
  • Sticking with the familiar
  • main, main page, home, home page
  • search, find
  • browse
  • contact, contact us, feedback
  • Help, FAQ, Frequently Asked Questions
  • About, About Us
  • Labels may be augmented with scope notes

25
Grammatical consistency
  • contact us, search our site, browse our content
  • contact, search, browse
  • contact information, search page, table of
    contents
  • (also good in student essays)

26
Labels as indexing terms
  • use in <meta> tags, or in the <title> tag
  • use as controlled vocabulary in the database
  • but some search engines, in fact almost all, do
    not use metadata

27
Textual labels
  • born in Völklingen, (Saarland) in 1965, I studied
    Economics and Social Sciences at the universities
    of Toulouse, Paris, Exeter and Leicester. Between
    February 1993 and April 2001 I lectured in the
    Department of Economics at the University of
    Surrey. In 1993 I founded NetEc, a consortium of
    Internet projects for academic economists. In
    1997, I founded the RePEc dataset to document
    Economics. Between October and December 2000, I
    held a visiting professorship at Hitotsubashi
    University.

28
labels as headings
  • good practice
  • consistency in terminology: wording on labels is
    uniform and cohesive
  • consistency in granularity
  • chunks covered by labels at the same level are
    roughly equal
  • chunks covered do not vary by their depth

29
Iconic labels
  • There is only a limited "vocabulary" of
    commonly understood labels
  • it is fine for some key concepts
  • labels need to be very consistently placed
  • they can communicate a graphic identity for the
    page
  • they are easy to find on a page, provided that
    page is not long

30
Designing labelling systems I
  • start from existing one
  • put in table or tree (on paper)
  • make small changes towards consistency
  • "benevolent plagiarism" from competitors and
    academic sites
  • use controlled vocabularies, example yellow pages

31
Designing labeling systems II
  • use a thesaurus, example legislative indexing
    vocabulary
  • "see" links
  • "see also" links
  • broader terms
  • narrower terms
  • labels from contents best judged by an outsider
  • labels from query logs
  • labels from user interviews
  • labels from modeling user needs

32
fine tuning a labelling system
  • remove duplicates
  • sort alphabetically
  • homogenize case and punctuation and grammar
  • remove synonyms according to audience
  • make labels as different from one another as
    possible
  • search for gaps
  • look into the future
  • keep scope focussed
  • consider granularity

33
why not make a site searchable
  • not a tool to satisfy all users' needs
  • not good on poor contents
  • not a cure for bad browsing!
  • needs good planning

34
why make a site searchable
  • cope with bad organization (Foyle's)
  • dynamic contents
  • large contents

35
user needs
  • some want overview, others want detail
  • some need accuracy, others don't care much
  • some can wait, others need it now
  • some need some info, others need a comprehensive
    answer

36
user's searching expectation
  • known-item searching
  • existence searching
  • exploratory searching
  • comprehensive searching

37
integrated searching and browsing
  • literature deals with separate browsing and
    searching systems
  • browsing and searching in a single system
  • with multiple iteration
  • and associative learning takes place

38
designing search interfaces I
  • level of expertise
  • boolean?
  • concept search?
  • amount returned
  • comprehensive?
  • verbose?
  • how much to make searchable

39
designing search interfaces II
  • search target
  • navigation pages?
  • HTML only?
  • are there specific types of data that users will
    want multi-lingual?
  • audience difference

40
features of sophisticated search engines
  • fielded searches
  • sophisticated query languages
  • reusable results set
  • customizable relevance

41
Deal with problems
  • getting too much: suggest boolean AND
  • getting nothing: suggest boolean OR or
    truncation
  • bad answers: suggest to contact an expert, maybe
    not...

42
today
  • information architecture
  • javascript
  • http
  • apache introduction

43
the <script>
  • is an element that calls a script.
  • It requires a "type" attribute that gives the
    type of the script language, e.g.
    type="text/javascript".
  • It takes a "src" attribute that gives a URI
    where the script can be found. Such a script is
    called an external script.
  • It takes a "defer" attribute. If set, as in
    defer="defer", you tell the user agent that the
    script will generate no output. This helps the
    user agent, because it can carry on parsing and
    rendering the page.

44
example
  • <script type="text/javascript">document.write("hello, world")</script>
  • Interestingly enough, you can place this script
    in the head or the body.
  • This is an example of an automated script. The
    user has to do nothing to get the script to run.
  • You can also trigger a script. To do that, we
    have to study some more HTML attributes. We will
    do that later.

45
external script
  • You can also create an external file, say
    hello.js, with the line
  • document.write("hello, world")
  • Then you can call it up in the html file
  • <script type="text/javascript" src="hello.js"></script>

46
default script language
  • You should set the default scripting language
    used in the document using the <meta> element in
    the <head>
  • <meta http-equiv="content-script-type"
    content="text/javascript"/>
  • If you don't, the validator does not complain, but
    I don't see other ways to specify the language.

47
Javascript history
  • A programming language that was developed by
    Netscape for their browser in 1995.
  • To counter, Mickeysoft developed JScript.
  • It has been standardized by the European Computer
    Manufacturers Association as ECMA 262.

48
principal features
  • It contains instructions for a user agent to
    execute. Javascript is not run by the server.
  • It resembles Java, but it is not the same language.
  • It is an object-oriented language.

49
object
  • In an object-oriented language, an object is the
    prime focus of attention.
  • An object has properties and methods.
  • Example from real life. Let a bus be an object.
  • color of the bus is a property
  • move to next station is a method

50
objects in javascript
  • Properties are accessed by
  • object_name.property_name
  • Methods are accessed by
  • object_name.method_name()
  • where object_name is the name of an object,
    property_name is the name of a property and
    method_name() is the name of a method. Note the
    use of the dot and the parentheses.

51
Example
  • Syntax rules
  • Comments are started with // and go to the end of
    the line.
  • Instructions are terminated with a semicolon.
  • Example
  • // create a new object called bus
  • bus = new Object();
  • // paint it white --- set a property
  • bus.color = "white";
  • // move to next stop --- apply a method
  • bus.movetonextstop();
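
The bus example above only runs if the object really has a movetonextstop method. Here is a minimal runnable sketch of the same idea; the list of stops, the position counter and the body of the method are all invented for illustration and are not part of the lecture.

```javascript
// create a new object called bus
var bus = new Object();
// paint it white --- set a property
bus.color = "white";
// hypothetical route data, invented for this sketch
bus.stops = ["Main Street", "Library", "Campus"];
bus.position = 0;
// a method is a function stored as a property of the object
bus.movetonextstop = function () {
  // wrap around to the first stop at the end of the route
  this.position = (this.position + 1) % this.stops.length;
  return this.stops[this.position];
};
// move to next stop --- apply a method
var next = bus.movetonextstop();
```

Note the dot for both kinds of access, and the parentheses that mark the method call.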

52
event attributes
  • Event attributes can be given to elements (like
    any attribute, really)
  • The name of the attributes gives a certain event
    that could happen to the element.
  • The value of the event attribute is the script to
    be executed when the event occurs on the element
    that has the event attribute.
  • Example
  • <p onmouseover="stink()">Cow shit is ...</p>
  • as the user moves the mouse over the
    paragraph, the browser fires up an imaginary
    script called stink that makes it start to stink.
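
What the imaginary stink script does is left open in the slide. Purely as a sketch, and assuming the attribute is written as onmouseover="stink(this)" so that the paragraph passes itself to the script, a handler might look like this. The colour and the title text are invented.

```javascript
// Hypothetical body for the imaginary "stink" script.
// "element" is whatever the event attribute hands over, e.g. the
// paragraph in <p onmouseover="stink(this)">Cow shit is ...</p>
function stink(element) {
  element.style.color = "brown";  // change how the paragraph looks
  element.title = "it stinks";    // text shown as a tooltip
  return element;
}
```

The function only touches properties of the object it is given, so it can be tried out with any object that has a style property.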

53
core event attributes I
  • "onclick" occurs when the pointing device button
    is clicked over an element.
  • "ondblclick" occurs when the pointing device
    button is double clicked over an element.
  • "onmousedown" occurs when the pointing device
    button is pressed over an element.
  • "onmouseup" occurs when the pointing device
    button is released over an element.
  • "onmouseover" occurs when the pointing device is
    moved onto an element.

54
core events attributes II
  • "onmousemove" occurs when the pointing device is
    moved while it is over an element.
  • "onmouseout" occurs when the pointing device is
    moved away from an element.
  • "onkeypress" occurs when a key is pressed and
    released over an element.
  • "onkeydown" occurs when a key is pressed down
    over an element.
  • "onkeyup" occurs when a key is released over an
    element.

55
special event attributes
  • "onfocus" occurs when an element receives focus
    either by the pointing device or by tabbing
    navigation. This attribute may only be used with
    the <a> element, and some form elements that we
    have not covered.
  • "onblur" occurs when an element loses focus
    either by the pointing device or by tabbing
    navigation. It may be used with the same elements
    as onfocus.

56
more special event attributes
  • "onsubmit" occurs when a form is submitted. It
    only applies to the <form> element.
  • "onreset" occurs when a form is reset. It only
    applies to the <form> element.
  • some more are only used with other form
    elements...
  • Let us look at some examples

57
two stupid examples
  • <title>javascript test</title>
  • <meta http-equiv="content-script-type"
    content="text/javascript"/>
  • time
  • "document.write('not funny')"joke

58
An even more silly example
  • Bush in the bush
  • "text/javascript"/
  • prbu = new Image(); prbu.src = "bush.jpg";
  • natb = new Image(); natb.src = "natgeo.jpg";
  • Bush in the bush
  • <img name="bush" src="bush.jpg"
    onmouseover="document.bush.src=natb.src"
    onmouseout="document.bush.src=prbu.src"
    alt="bush in the bush"/>

59
http
  • Stands for the hypertext transfer protocol. This
    is the most important application layer protocol
    on the Internet today, because it provides the
    foundation for the world wide web.
  • defined in Fielding, Roy T., James Gettys,
    Jeffrey C. Mogul, Henrik Frystyk Nielsen, Larry
    Masinter, Paul J. Leach and Tim Berners-Lee,
    "Hypertext Transfer Protocol -- HTTP/1.1"
    (1999), RFC 2616

60
history
  • 1990: version 0.9 allows for transfer of raw
    data.
  • 1996: RFC 1945 defines version 1.0 by adding
    "attribute: value" headers.
  • 1999: RFC 2616 defines version 1.1, which
  • adds support for
  • hierarchical proxies,
  • caching,
  • virtual hosts and some
  • support for persistent connections
  • is more stringent.

61
http resource identification
  • identification of resources is assumed through
    Uniform Resource Identifiers (URI).
  • As far as http is concerned, URIs are strings.
  • http can use "absolute" and "relative" URIs.
  • A URL is a special case of a URI.

62
rfc about http
  • An application-level protocol for distributed,
    collaborative, hypermedia information systems.
  • HTTP is also used as a generic protocol for
    communication between user agents and
    proxies/gateways to other Internet systems,
    including those supported by the SMTP, NNTP, FTP,
    Gopher, and WAIS protocols. In this way, HTTP
    allows basic hypermedia access to resources
    available from diverse applications.

63
http assumes transport
  • http assumes that there is a reliable way to
    transport data from one host on the Internet to
    another one.
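  • All http requests and responses travel over TCP
    connections. In version 1.0 each request used a
    separate connection; in version 1.1 connections
    are persistent by default. The default is TCP
    port 80, but other ports can be used.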

64
use of other standards
  • http uses the same media type registry as the
    MIME multimedia email extensions. It is based at
    the IANA, at
  • http://www.isi.edu/in-notes/iana/assignments/media-types/media-types
  • The default character set is ISO-8859-1.

65
Absolute http URL
  • the absolute http URL is
  • http://host[:port][abs_path[?query]]
  • If abs_path is empty, it is /.
  • The scheme name "http" and the host name are
    case-insensitive.
  • Characters other than those in the "reserved"
    and "unsafe" sets of RFC 2396 are equivalent to
    their "% HEX HEX" encoding.
  • optional components are in [ ]
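
The rules above can be tried out with the URL class that browsers and Node.js provide. This is only a sketch: example.org, the function name httpParts and the sample query are placeholders that do not come from the lecture, and the URL class follows a newer URL standard than RFC 2396, so it is used here for illustration rather than as a reference implementation.

```javascript
// Split an absolute http URL into the components named above.
function httpParts(text) {
  const url = new URL(text);
  return {
    scheme: url.protocol.replace(":", ""),  // lower-cased by the parser
    host: url.hostname,                     // lower-cased by the parser
    port: url.port === "" ? 80 : Number(url.port), // 80 is the http default
    abs_path: url.pathname,                 // "/" when the path is empty
    query: url.search.replace(/^\?/, ""),
  };
}

const bare = httpParts("HTTP://Example.ORG");
const full = httpParts("http://example.org:8080/%7Ekrichel/?lang=en");
// %7E is the "% HEX HEX" encoding of the character "~"
const decodedPath = decodeURIComponent(full.abs_path);
```

Here bare shows the defaults (port 80, path /) and the case-insensitive scheme and host, while decodedPath shows that /%7Ekrichel/ and /~krichel/ name the same path.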

66
http messages
  • There are two types of messages.
  • Requests are sent from the client to the server.
  • Responses are sent from the server to the client.
  • The generic format is the same as for email
    messages
  • start line
  • message headers
  • empty line
  • body
  • Empty lines before the start line are ignored.
  • The request's start line is called the
    request-line.
  • The response start line is called the
    status-line.
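
The generic format can be made concrete with a small parser. This is a sketch under simplifying assumptions: lines end in CR LF, header lines are not folded, and a repeated header name simply overwrites the earlier one. The function name parseMessage and the two sample messages are made up for illustration.

```javascript
// Split a raw http message into start line, headers and body.
function parseMessage(raw) {
  // empty lines before the start line are ignored
  const text = raw.replace(/^(\r\n)+/, "");
  // the first empty line separates the headers from the body
  const split = text.indexOf("\r\n\r\n");
  const head = split === -1 ? text : text.slice(0, split);
  const body = split === -1 ? "" : text.slice(split + 4);
  const lines = head.split("\r\n");
  const startLine = lines[0];
  const headers = {};
  for (const line of lines.slice(1)) {
    const colon = line.indexOf(":");
    if (colon === -1) continue; // skip a malformed header line
    // header names are case-insensitive, so store them lower-cased
    headers[line.slice(0, colon).toLowerCase()] = line.slice(colon + 1).trim();
  }
  return { startLine, headers, body };
}

const sampleRequest = parseMessage(
  "\r\nGET /home/krichel HTTP/1.1\r\nHost: openlib.org\r\n\r\n"
);
const sampleResponse = parseMessage(
  "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>hello, world</p>"
);
```

For the request the start line is the request-line, for the response it is the status-line; everything else is parsed in the same way.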

67
overall operation server side
  • Server sends response, required items are
  • status line
  • protocol version
  • success or error code
  • optional items are
  • server information
  • body

68
overall operation client side
  • Client sends request, required items are
  • method
  • request URI
  • protocol version
  • optional items are
  • request modifiers
  • client information
  • Let us now look at different methods
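
To see the required and optional items side by side, here is a sketch that assembles a request as text. The function name buildRequest is invented, and the headers passed in stand for the optional request modifiers and client information. Note that version 1.1 also expects a Host header.

```javascript
// Assemble an http/1.1 request without a body.
function buildRequest(method, uri, headers) {
  // required: method, request URI and protocol version on the first line
  const lines = [method + " " + uri + " HTTP/1.1"];
  // optional: request modifiers and client information, one per line
  for (const name of Object.keys(headers)) {
    lines.push(name + ": " + headers[name]);
  }
  // an empty line ends the headers
  return lines.join("\r\n") + "\r\n\r\n";
}

const headRequest = buildRequest("HEAD", "/home/krichel", {
  Host: "openlib.org",
  "User-Agent": "lis650-sketch", // made-up client name
});
```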

69
GET and HEAD method
  • The GET method means retrieve whatever
    information (in the form of an entity) is
    identified by the Request-URI. If the Request-URI
    refers to a data-producing process, it is the
    produced data which shall be returned as the
    entity in the response and not the source text of
    the process.
  • The HEAD method is identical to GET except that
    the server MUST NOT return a message-body in the
    response.

70
Conditional partial GET
  • The semantics of the GET method change to a
    "conditional GET" if the request message
    includes an
  • If-Modified-Since
  • If-Unmodified-Since
  • If-Match
  • If-None-Match
  • If-Range header
  • The semantics of the GET method change to a
    "partial GET" if the request message includes a
    Range header field. A partial GET requests that
    only part of the entity be transferred

71
The POST method
  • The POST method is used to request that the
    origin server accept the entity enclosed in the
    request as a new subordinate of the resource
    identified by the Request-URI in the
    Request-Line. POST is designed to allow a uniform
    method to cover the following functions
  • Annotation of existing resources
  • Posting a message to a bulletin board, newsgroup,
    mailing list, or similar group of articles
  • Providing a block of data, such as the result of
    submitting a form, to a data-handling process
  • Extending a database through an append operation.

72
PUT and DELETE methods
  • The PUT method requests that the enclosed entity
    be stored under the supplied Request-URI. If the
    Request-URI refers to an already existing
    resource, the enclosed entity should be
    considered as a modified version of the one
    residing on the origin server.
  • The DELETE method requests that the origin server
    delete the resource identified by the Request-URI.

73
The request headers
  • Accept Accept-Charset
  • Accept-Encoding Accept-Language
  • Authorization Expect
  • From Host
  • If-Match If-Modified-Since
  • If-None-Match If-Range
  • If-Unmodified-Since Max-Forwards
  • Proxy-Authorization Range
  • Referer TE
  • User-Agent

74
The status line
  • The status line is a single line of the form
  • HTTP-Version Status-Code Reason-Phrase
  • The status code is a 3-digit number used by the
    computer.
  • The reason phrase is a friendly note for a human
    to read.

75
Status code classes
  • 1xx Informational: Request received, continuing
    process
  • 2xx Success: The action was successfully received,
    understood, and accepted
  • 3xx Redirection: Further action must be taken in
    order to complete the request
  • 4xx Client Error: The request contains bad syntax
    or cannot be understood
  • 5xx Server Error: The request is valid but cannot
    be executed by the server
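
Since the class is given by the first digit, a client can sort responses without knowing every code. A small sketch, with invented function names, that splits a status line into its three parts and looks up the class:

```javascript
// The five classes, keyed by the first digit of the status code.
const STATUS_CLASSES = {
  1: "Informational",
  2: "Success",
  3: "Redirection",
  4: "Client Error",
  5: "Server Error",
};

// Map a 3-digit status code to the name of its class.
function statusClass(code) {
  return STATUS_CLASSES[Math.floor(code / 100)] || "unknown";
}

// Split "HTTP-Version Status-Code Reason-Phrase" into its parts.
function parseStatusLine(line) {
  const match = /^(HTTP\/\d+\.\d+) (\d{3}) (.*)$/.exec(line);
  if (!match) return null; // not a status line
  return { version: match[1], code: Number(match[2]), reason: match[3] };
}

const moved = parseStatusLine("HTTP/1.1 301 Moved Permanently");
```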

76
Error codes
  • 100 Continue
  • 101 Switching Protocols
  • 200 OK
  • 201 Created
  • 202 Accepted
  • 203 Non-Authoritative Information
  • 204 No Content
  • 205 Reset Content
  • 206 Partial Content

77
Error codes II
  • 300 Multiple Choices
  • 301 Moved Permanently
  • 302 Found
  • 303 See Other
  • 304 Not Modified
  • 305 Use Proxy
  • 307 Temporary Redirect

78
Error codes III
  • 400 Bad Request
  • 401 Unauthorized
  • 402 Payment Required
  • 403 Forbidden
  • 404 Not Found
  • 405 Method Not Allowed
  • 406 Not Acceptable
  • 407 Proxy Authentication Required
  • 408 Request Time-out

79
Error codes IV
  • 409 Conflict
  • 410 Gone
  • 411 Length Required
  • 412 Precondition Failed
  • 413 Request Entity Too Large
  • 414 Request-URI Too Large
  • 415 Unsupported Media Type
  • 416 Requested range not satisfiable
  • 417 Expectation failed

80
Error codes V
  • 500 Internal Server Error
  • 501 Not Implemented
  • 502 Bad Gateway
  • 503 Service Unavailable
  • 504 Gateway Time-out
  • 505 HTTP Version not supported

81
Response headers
  • Accept-Ranges
  • Age
  • ETag
  • Location
  • Proxy-Authenticate
  • Retry-After
  • Server
  • Vary
  • WWW-Authenticate

82
Entity headers, common to response and request
  • Allow
  • Content-Encoding
  • Content-Language
  • Content-Length
  • Content-Location
  • Content-MD5
  • Content-Range
  • Content-Type
  • Expires
  • Last-Modified

83
The body
  • The entity-body (if any) sent with an HTTP
    request or response is in a format and encoding
    defined by the entity-header fields.
  • When an entity-body is included with a message,
    the data type of that body is determined via the
    header fields Content-Type and Content-Encoding

84
example status redirect
  • If you use Apache, you can create a file
    .htaccess (note the dot!) with a line
  • redirect 301 old_url new_url
  • old_url must be a relative path from the top of
    your site
  • new_url can be any URL, even outside your site
  • This works on wotan by virtue of the configuration
    set for Apache for your home directory. Examples
  • redirect 301 /krichel http://openlib.org/home/krichel
  • redirect 301 Cantcook.jpg http://www.foodtv.com
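
To show what the client gets back from such a rule, here is a sketch that imitates the visible result: a 301 status line and a Location header. Apache does the real matching itself. The function name and the rule objects are invented, and the sketch assumes that old_url starts with a slash.

```javascript
// Imitate the response to a request that hits a "redirect 301" rule.
function redirect(rules, path) {
  for (const rule of rules) {
    const exact = path === rule.oldUrl;
    const below = path.startsWith(rule.oldUrl.replace(/\/$/, "") + "/");
    if (exact || below) {
      // anything after the matched prefix is appended to the new URL
      const rest = path.slice(rule.oldUrl.length);
      return {
        statusLine: "HTTP/1.1 301 Moved Permanently",
        location: rule.newUrl + rest,
      };
    }
  }
  return null; // no rule applies, the request is served normally
}

const redirectRules = [
  { oldUrl: "/krichel", newUrl: "http://openlib.org/home/krichel" },
];
const answer = redirect(redirectRules, "/krichel/cv.html");
```

A browser that receives this answer requests the URL in the Location header instead.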

85
Apache
  • Is a free, open-source web server that is
    produced by the Apache Software Foundation, see
    http://www.apache.org
  • It has over 50% of the market share.
  • It runs best on UNIX systems but can run on a
    Mickeysoft OS as well.
  • I will cover it here because it is freely
    available.
  • Wotan runs version 2.

86
Apache in debian
  • /etc/apache2/apache2.conf is the main
    configuration file.
  • /etc/init.d/apache2 action, where action is one
    of
  • start
  • stop
  • restart
  • is used to fire the daemon up or down.
  • The daemon runs as user www-data

87
Virtual host
  • On a single installation of Apache several web
    servers can be supported.
  • That means the server can behave in a different
    way according to how it is being addressed.
  • The easiest way to address a server in
    different ways is through DNS host names.
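
The idea can be sketched as a lookup on the Host request header. The connections2003.liu.edu entry follows the wotan example at the end of this lecture; the default root /var/www and the function name are assumptions made for the sketch.

```javascript
// Map each virtual host name to its document root.
const virtualHosts = {
  "connections2003.liu.edu": "/home/connect/public_html",
};
const defaultRoot = "/var/www"; // assumed root of the default server

// Pick a document root from the Host request header.
function documentRoot(hostHeader) {
  // host names are case-insensitive; drop an optional ":port"
  const name = hostHeader.toLowerCase().replace(/:\d+$/, "");
  return Object.prototype.hasOwnProperty.call(virtualHosts, name)
    ? virtualHosts[name]
    : defaultRoot;
}
```

The same machine, with the same IP address, thus serves different pages depending on the name by which it is addressed.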

88
Directives in apache2.conf
  • This file contains directives that control the
    operation of the Apache server process as a whole
    (the 'global environment').
  • Some of them are
  • the server root, where it finds its configuration
  • the time out for requests
  • which port to listen on
  • another part of apache2.conf has extensive
    settings to deal with content
  • different languages
  • different character sets
  • different MIME types

89
Modules
  • To extend Apache, modules have been written. The
    modules are kept in a directory mods-available
  • Modules that are enabled are listed in the
    directory mods-enabled.
  • Looking at this gives you vital information about
    what the server can do.

90
Server directives
  • User
  • Gives the user name apache runs under
  • Group
  • Gives the group name the server runs under
  • ServerAdmin
  • Email of a human who runs the default server
  • ServerName
  • The name of the default server
  • DocumentRoot
  • The top level directory of the default server

91
Directory options
  • Many options for a directory can be set with
  • <Directory name> instructions </Directory>
  • name is the name of a directory.
  • instructions can be a whole lot of stuff

92
Directory instructions
  • Options sets global options for the directory, it
    can be
  • None
  • All
  • Or any of
  • Indexes (form directory indexes?)
  • Includes (allow server side includes?)
  • FollowSymLinks (allow to follow symbolic
    links?)
  • ExecCGI (allow cgi-scripts?)
  • MultiViews

93
Access control
  • Can be part of <Directory> to set directory level
    access control
  • Example
  • Allow from friendly.com
  • Deny from evil.com
  • Sometimes you have to set the order, example
  • Order allow,deny
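
Why the order matters can be seen in a sketch of how the three directives combine. It covers only host-name suffixes and the keyword "all", not IP addresses or networks, and the function names are invented. With "allow,deny" a host must match an Allow line and no Deny line, and unmatched hosts are denied. With "deny,allow" an Allow match overrides a Deny match, and unmatched hosts are allowed.

```javascript
// Does the host fall under any entry of an Allow or Deny list?
function matches(list, host) {
  return list.some(function (entry) {
    const e = entry.toLowerCase();
    const h = host.toLowerCase();
    // "friendly.com" covers friendly.com and www.friendly.com,
    // but not notfriendly.com
    return e === "all" || h === e || h.endsWith("." + e);
  });
}

// Combine the lists according to the Order setting.
function isAllowed(order, allow, deny, host) {
  if (order === "allow,deny") {
    return matches(allow, host) && !matches(deny, host);
  }
  // otherwise treat the order as "deny,allow"
  return !matches(deny, host) || matches(allow, host);
}

const allowList = ["friendly.com"];
const denyList = ["evil.com"];
```

A host that appears in neither list is refused under "allow,deny" but let in under "deny,allow".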

94
Authentication
  • This is used to enable password access. In that
    case the authentication is handled by a file
    .htaccess in the directory.
  • The AllowOverride instruction is used to state
    what the user can do within the .htaccess file.
    Depending on its values, you can password protect
    a web site.
  • We will not discuss this further here.

95
Userdir
  • This sets the directory that is created by the
    user in her home directory to be accessed by
    requests to /~user.
  • On wotan, we have
  • UserDir public_html
  • That is the default, actually.
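
As a sketch of the mapping, assume that requests for a user's pages start with /~ followed by the user name, which is the usual convention with UserDir, and that home directories live under /home, which the slide does not state. The function name is invented.

```javascript
// Translate a request path such as /~krichel/cv.html into a file path.
function userDirPath(requestPath, userDir) {
  const match = /^\/~([^\/]+)(\/.*)?$/.exec(requestPath);
  if (!match) return null; // not a per-user request
  // user name, then the UserDir setting, then the rest of the path
  return "/home/" + match[1] + "/" + userDir + (match[2] || "/");
}

const cvFile = userDirPath("/~krichel/cv.html", "public_html");
```

Files outside the public_html directory are never reached by this mapping, so the rest of the home directory stays private.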

96
Set up permission for user home directories
  • AllowOverride FileInfo AuthConfig Limit
  • Options Includes
  • Options MultiViews Indexes SymLinksIfOwnerMatch
    IncludesNoExec
  • Order allow,deny
  • Allow from all
  • MOVE LOCK UNLOCK
  • Order deny,allow
  • Deny from all

97
Logs
  • The web server logs every transaction.
  • There are several types of logs that used to be
    kept separately, in early days.
  • 209.73.164.50 - - [26/Jan/2003:09:19:51 -0500]
    "GET /ramon/videos/ntsc175.html HTTP/1.1" 206 808
  • Additional information may be kept in the referer
    and user agent log.
  • The referer log may have some interesting
    information on who links to your pages.
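
A log entry like the one above can be taken apart by a script. The sketch below assumes the entry is in Apache's common log format: client host, two identity fields, the time in square brackets, the request line in quotes, the status code and the number of bytes sent. The function name is invented.

```javascript
// One line in the common log format.
const CLF = /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)$/;

function parseLogLine(line) {
  const m = CLF.exec(line);
  if (!m) return null; // not in the expected format
  const request = m[5].split(" ");
  return {
    host: m[1],
    user: m[3],      // "-" when the user is not authenticated
    time: m[4],
    method: request[0],
    uri: request[1],
    protocol: request[2],
    status: Number(m[6]),
    bytes: m[7] === "-" ? 0 : Number(m[7]), // "-" means no body was sent
  };
}

const entry = parseLogLine(
  '209.73.164.50 - - [26/Jan/2003:09:19:51 -0500] ' +
  '"GET /ramon/videos/ntsc175.html HTTP/1.1" 206 808'
);
```

The status 206 in this entry shows that the client made a partial GET.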

98
Virtual hosts
  • Most Apache directives can be wrapped in a
    <VirtualHost> grouping.
  • This implies that they only hold for the virtual
    host. Example, from wotan
  • ServerAdmin krichel_at_openlib.org
  • DocumentRoot /home/connect/public_html
  • ServerName connections2003.liu.edu
  • ErrorLog /var/log/apache/connections2003-error.log
  • CustomLog /var/log/apache/connectios2003-access.log common

99
http://openlib.org/home/krichel
  • Thank you for your attention!