Title: Introduction to the Web
1 Introduction to the Web
2 Prerequisites
- Basic computer skills
- Experience using the World Wide Web
- Experience developing object-oriented software
3 Learning Objectives
- Overview of Web and Internet technologies
- Review of existing Web programming technologies
4 Agenda
- Internet Technologies
- Programming Languages and Paradigms
- Programming the Web
5 Internet Technologies: The World Wide Web
- A way to access and share information
- Technical papers, marketing materials, recipes, ...
- A huge network of computers: the Internet
- Graphical, not just textual
- Information is linked to other information
- Application development platform
- Shop from home
- Provide self-help applications for customers and
partners
6 Internet Technologies: WWW Architecture
[Figure: a client (a browser on a PC, Mac, or Unix machine) sends the request http://www.msn.com/default.asp across a TCP/IP network to a Web server; the server sends back the response <html>...</html>]
7 Internet Technologies: WWW Architecture
- Client/Server, Request/Response architecture
- You request a Web page
- e.g. http://www.msn.com/default.asp
- HTTP request
- The Web server responds with data in the form of a Web page
- HTTP response
- Web page is expressed as HTML
- Pages are identified by a Uniform Resource Locator (URL)
- Protocol: http
- Web server: www.msn.com
- Web page: default.asp
- Can also provide parameters: ?name=Leon
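The URL parts listed on this slide can be separated with Python's standard `urllib.parse` module. This is a minimal sketch, assuming the slide's parameter example reads `?name=Leon` (a name=value pair appended to the page address).

```python
from urllib.parse import urlsplit, parse_qs

# The slide's example URL, with the parameter written as name=value (assumed).
url = "http://www.msn.com/default.asp?name=Leon"
parts = urlsplit(url)

protocol = parts.scheme            # the protocol part
web_server = parts.netloc          # the Web server part
web_page = parts.path              # the Web page part
parameters = parse_qs(parts.query) # parameters, as a dict of lists

print(protocol, web_server, web_page, parameters)
```

Each parameter maps to a list because the same name may appear more than once in a query string.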
8 Internet Technologies: Web Standards
- Internet Engineering Task Force (IETF)
- http://www.ietf.org/
- Founded 1986
- Request For Comments (RFC) at http://www.ietf.org/rfc.html
- World Wide Web Consortium (W3C)
- http://www.w3.org
- Founded 1994 by Tim Berners-Lee
- Publishes technical reports and recommendations
9 Internet Technologies: Web Design Principles
- Interoperability: Web languages and protocols must be compatible with one another, independent of hardware and software.
- Evolution: The Web must be able to accommodate future technologies. Encourages simplicity, modularity and extensibility.
- Decentralization: Facilitates scalability and robustness.
10 Internet Technologies: Hypertext Markup Language (HTML)
- The markup language used to represent Web pages for viewing by people
- Designed to display data, not store/transfer data
- Rendered and viewed in a Web browser
- Can contain links to images, documents, and other pages
- Not extensible
- Derived from Standard Generalized Markup Language (SGML)
- Versions: HTML 3.2, 4.01, XHTML 1.0
11 Internet Technologies: HTML Forms
- Enables you to create interactive user interface elements
- Buttons
- Text boxes
- Drop down lists
- Check boxes
- User fills out the form and submits it
- Form data is sent to the Web server via HTTP when
the form is submitted
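As a rough illustration of what "form data is sent to the Web server" means, the sketch below encodes a set of form fields the way a browser commonly does for a simple form (name=value pairs joined by `&`) and decodes them again as server-side code would. The field names and values are hypothetical.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical form fields: a text box, a drop-down list, a check box.
form_fields = {"name": "Leon", "topic": "Web basics", "subscribe": "on"}

# Browser side: encode the fields into one string for the HTTP request.
encoded = urlencode(form_fields)

# Server side: recover the field values from that string.
received = parse_qs(encoded)

print(encoded)
print(received)
```

Spaces and other special characters are escaped during encoding so that the string survives transport unchanged.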
12 Internet Technologies: Hypertext Transfer Protocol (HTTP)
- The top-level protocol used to request and return data
- E.g. HTML pages, GIFs, JPEGs, Microsoft Word documents, Adobe PDF documents, etc.
- Request/Response protocol
- Methods: GET, POST, HEAD, ...
- HTTP 1.0: simple
- HTTP 1.1: more complex
13 Internet Technologies: HTTP Request
Request line (method, file, HTTP version):
    GET /default.asp HTTP/1.0
Headers:
    Accept: image/gif, image/x-bitmap, image/jpeg, */*
    Accept-Language: en
    User-Agent: Mozilla/1.22 (compatible; MSIE 2.0; Windows 95)
    Connection: Keep-Alive
    If-Modified-Since: Sunday, 17-Apr-96 04:32:58 GMT
Blank line
Data: none for GET
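The layout of a request (request line, headers, blank line, optional data) can be reproduced in a few lines. This is only a sketch of the text format; `build_request` is a hypothetical helper, not part of any library.

```python
def build_request(method, path, headers, version="HTTP/1.0"):
    """Assemble the text of an HTTP request with no body."""
    lines = [f"{method} {path} {version}"]                        # request line
    lines += [f"{name}: {value}" for name, value in headers.items()]  # headers
    # Lines end with CRLF; an empty line marks the end of the headers.
    return "\r\n".join(lines) + "\r\n\r\n"

request_text = build_request(
    "GET",
    "/default.asp",
    {"Accept-Language": "en", "Connection": "Keep-Alive"},
)
print(request_text)
```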
14 Internet Technologies: HTTP Response
Status line (HTTP version, status code, reason phrase):
    HTTP/1.0 200 OK
Headers:
    Date: Sun, 21 Apr 1996 02:20:42 GMT
    Server: Microsoft-Internet-Information-Server/5.0
    Connection: keep-alive
    Content-Type: text/html
    Last-Modified: Thu, 18 Apr 1996 17:39:05 GMT
    Content-Length: 2543
Blank line
Data:
    <HTML> Some data... blah, blah, blah </HTML>
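To make the response layout concrete, the sketch below splits a raw response into its status line, headers, and data. The sample response is a shortened, made-up message in the same shape as the one on the slide, and `parse_response` is a hypothetical helper.

```python
# A shortened, illustrative response (not the slide's exact message).
raw_response = (
    "HTTP/1.0 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<HTML></HTML>"
)

def parse_response(raw):
    """Split a raw HTTP response into version, code, reason, headers, body."""
    head, _, body = raw.partition("\r\n\r\n")      # blank line ends the headers
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body

version, code, reason, headers, body = parse_response(raw_response)
print(version, code, reason, headers, body)
```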
15 Internet Technologies: HTTP Server Status Codes
Code  Description
200   OK
201   Created
301   Moved Permanently
302   Moved Temporarily
400   Bad Request (not understood)
401   Unauthorized
403   Forbidden (not authorized)
404   Not Found
500   Internal Server Error
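The first digit of a status code gives its general class, which is often all a client needs to decide what to do next. A small sketch using Python's standard `http.HTTPStatus`; the `status_class` helper is hypothetical.

```python
from http import HTTPStatus

def status_class(code):
    """Classify a status code by its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")

for code in (200, 201, 301, 400, 401, 403, 404, 500):
    print(code, HTTPStatus(code).phrase, "-", status_class(code))
```

Reason phrases are advisory text; current standards word some of them differently from older lists, so programs should branch on the numeric code.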
16 Internet Technologies: HTTP
- HTTP is a stateless protocol
- Each HTTP request is independent of previous and subsequent requests
- HTTP 1.1 made persistent (keep-alive) connections the default, for efficiency
- Statelessness has a big impact on how scalable applications are designed
17 Internet Technologies: Cookies
- A mechanism to store a small amount of information (up to 4KB) on the client
- A cookie is associated with a specific Web site
- Cookie is sent in HTTP header
- Cookie is sent with each HTTP request
- Can last for only one session (until browser is closed) or can persist across sessions
- Can expire some time in the future
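The two directions of cookie traffic can be sketched with Python's standard `http.cookies` module: the server builds a `Set-Cookie` header, and later reads the `Cookie` header that the browser sends back. The cookie names and values here are hypothetical.

```python
from http.cookies import SimpleCookie

# Server side: ask the browser to store a cookie for one hour.
outgoing = SimpleCookie()
outgoing["visitor"] = "Leon"
outgoing["visitor"]["max-age"] = 3600   # omit this for a session-only cookie
outgoing["visitor"]["path"] = "/"
set_cookie_header = outgoing.output(header="Set-Cookie:")

# Server side, on a later request: parse the Cookie header the browser sent.
incoming = SimpleCookie()
incoming.load("visitor=Leon; theme=dark")
visitor = incoming["visitor"].value

print(set_cookie_header)
print(visitor)
```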
18 Internet Technologies: HTTPS
- A secure version of HTTP
- Allows client and server to exchange data with confidence that the data was neither modified nor intercepted
- Uses Secure Sockets Layer (SSL)/Transport Layer Security (TLS)
19 Internet Technologies: URIs, URLs and URNs
- Uniform Resource Identifier (URI): a URL or a URN
- Generic term for all textual names/addresses
- Uniform Resource Locator (URL)
- The set of URI schemes that have explicit instructions on how to access the resource over the Internet, e.g. http, ftp, gopher
- Uniform Resource Name (URN)
- 1) A URI that has an institutional commitment to availability, etc.
- 2) A particular scheme intended to identify resources
- e.g. urn:schemas:httpmail:subject
20 Internet Technologies: Multipurpose Internet Mail Extensions (MIME)
- Defines types of data/documents
- text/plain
- text/html
- image/gif
- image/jpeg
- audio/x-pn-realaudio
- audio/x-ms-wma
- video/x-ms-asf
- application/octet-stream
21 Internet Technologies: MIME
- Specifies character sets, e.g. ASCII
- Supports multi-part messages
- Originally designed for email, but also used in
other places, such as HTTP
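A short sketch of MIME types, character sets, and multi-part messages using Python's standard `email` package. The message content and the attachment bytes are made up for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Report"

# A text/plain part with an explicit character set.
msg.set_content("Plain text body\n", charset="us-ascii")

# Adding an attachment turns the message into a multi-part message.
# The bytes are a placeholder, not a real image.
msg.add_attachment(b"GIF89a", maintype="image", subtype="gif",
                   filename="dot.gif")

content_types = [part.get_content_type() for part in msg.walk()]
print(content_types)
```

The same type/subtype labels appear in the HTTP `Content-Type` header, which is how a browser knows whether a response is a page, an image, or a file to download.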
22 Internet Technologies: Browsers
- Client-side application
- Requests HTML from Web server and renders it
- Popular browsers
- Netscape
- Internet Explorer
- Opera
- others
- Also known as a User Agent
23 Internet Technologies: Clients & Servers
- Client and Server computers both have
- CPU
- Memory
- I/O
- Disks
- Network
- Bus
- Multi-tasking operating system
- Applications
24 Internet Technologies: Clients & Servers
- Clients
- Generally support a single user
- Optimized for responsiveness to user
- User interface, graphics
- Servers
- Support multiple users
- Optimized for throughput
- More CPUs (SMP), memory, disks (SANs), I/O
- Provide services (e.g. Web, file, print,
database, e-mail, fax, transaction, telnet,
directory)
25 Internet Technologies: Proxy Servers & Firewalls
- Proxy Server
- A server that sits between a client (running a browser) and the Internet
- Improves performance by caching commonly used Web pages
- Can filter requests to prevent users from accessing certain Web sites
- Firewall
- A server that sits between a network and the Internet to prevent unauthorized access to the network from the Internet
26 Internet Technologies: Networks
- Network: an interconnected collection of independent computers
- Why have networks?
- Resource sharing
- Reliability
- Cost savings
- Communication
- Web technologies add
- New business models: e-commerce, advertising
- Entertainment
- Applications without a client-side install
27 Internet Technologies: Networks
- Network scope
- internet: a collection of connected networks
- Internet: a specific world-wide network based on TCP/IP, used to connect companies, universities, governments, organizations and individuals. Originated as ARPANET, funded by the US DoD.
- intranet: a network based on Internet technologies that is internal to a company or organization
- extranet: a network based on Internet technologies that connects one company or organization to another
28 Internet Technologies: Networks
- Network technology is largely determined by scale
- Local Area Network (LAN): spans up to a few kilometers. Bus vs. ring topologies
- Wide Area Network (WAN): can span a country or continent. WANs use routers as intermediate nodes to connect transmission lines
29 Internet Technologies: Networks
- Network technology
- Broadcasting
- Packets of data are sent from one machine and received by all computers on the network
- Multicast: packets are received by a subset of the machines on a network
- Point-to-point
- Packets have to be routed from one machine to another; there may be many paths
- In general, geographically localized networks use broadcasting, while dispersed networks use point-to-point
30 Internet Technologies: Networks
OSI Model Layers | TCP/IP Protocol Architecture Layers | TCP/IP Protocol Suite
Application, Presentation, Session | Application Layer | Telnet, FTP, SMTP, DNS, RIP, SNMP, HTTP
Transport | Host-to-Host Transport Layer | TCP, UDP
Network | Internet Layer | IP, ICMP, IGMP, ARP
Data Link, Physical | Network Interface Layer | Ethernet, Token Ring, ATM, Frame Relay
31 Internet Technologies: Network Protocol Stack
[Figure: the same protocol stack shown on each of two hosts: HTTP over TCP over IP over Ethernet]
32 Internet Technologies: Networks - Internet Layer
- Internet Protocol (IP)
- Responsible for getting packets from source to destination across multiple hops
- Not reliable
- IP address: a 32-bit value usually written in dotted decimal notation as four 8-bit numbers (0 to 255), e.g. 130.50.12.4
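The relationship between dotted decimal notation and the underlying 32-bit value can be worked through in code. A minimal sketch; `dotted_to_int` is a hypothetical helper, and the standard `ipaddress` module is used only to cross-check it.

```python
import ipaddress

def dotted_to_int(address):
    """Pack four 8-bit numbers (0 to 255) into one 32-bit value."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-decimal IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet   # shift left 8 bits, append next octet
    return value

value = dotted_to_int("130.50.12.4")
print(value, format(value, "032b"))

# Cross-check against the standard library's own conversion.
reference = int(ipaddress.IPv4Address("130.50.12.4"))
```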
33 Internet Technologies: Networks - Transport Layer
- Provides efficient, reliable and cost-effective service
- Uses the Sockets programming model
- Ports identify application
- Well-known ports identify standard services (e.g. HTTP uses port 80, SMTP uses port 25)
- Transmission Control Protocol (TCP)
- Provides reliable, connection-oriented byte stream
- User Datagram Protocol (UDP)
- Connectionless, unreliable
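The Sockets programming model can be tried on a single machine. The sketch below, under the assumption that the loopback address 127.0.0.1 is usable, starts a one-shot TCP server on a free port, connects a client to it, and exchanges a few bytes. The upper-casing reply is just a visible stand-in for real server work.

```python
import socket
import threading

def serve_once(server_sock):
    """Accept one connection, read a message, reply in upper case."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

# Server: a TCP (stream) socket bound to the loopback address.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))        # port 0 asks the OS for any free port
server.listen(1)
port = server.getsockname()[1]       # the port that identifies this service

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

# Client: connect to the server's address and port, send, then receive.
with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
    client.sendall(b"hello over tcp")
    reply = client.recv(1024)

worker.join()
server.close()
print(reply)
```

A real Web server does the same thing on well-known port 80, with HTTP text as the bytes exchanged.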
34 Internet Technologies: Networks - Application Layer
- Telnet: remote sessions
- File Transfer Protocol (FTP)
- Network News Transfer Protocol (NNTP)
- Simple Network Management Protocol (SNMP)
- Simple Mail Transfer Protocol (SMTP)
- Post Office Protocol (POP3)
- Internet Message Access Protocol (IMAP)
35 Internet Technologies: Networks - Domain Name System (DNS)
- Provides user-friendly domain names, e.g. www.msn.com
- Hierarchical name space with limited root names
- DNS servers map domain names to IP addresses
36 Internet Technologies: Extensible Markup Language (XML)
- Represents hierarchical data
- A meta-language: a language for defining other languages
- Extensible
- Useful for data exchange and transformation
- Simplified version of SGML
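A brief sketch of XML as hierarchical data, parsed with Python's standard `xml.etree.ElementTree`. The element and attribute names (`order`, `customer`, `item`, `sku`, `qty`) are invented for this example, which is the point of an extensible language: the vocabulary is defined by whoever exchanges the data.

```python
import xml.etree.ElementTree as ET

# A made-up document: an order containing a customer and a list of items.
document = """
<order id="42">
  <customer>Leon</customer>
  <items>
    <item sku="A1" qty="2">Widget</item>
    <item sku="B7" qty="1">Gadget</item>
  </items>
</order>
"""

root = ET.fromstring(document.strip())

order_id = root.get("id")                      # attribute on the root element
customer = root.findtext("customer")           # text of a child element
quantities = {item.get("sku"): int(item.get("qty"))
              for item in root.iter("item")}   # walk nested elements

print(order_id, customer, quantities)
```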
37 Agenda
- Internet Technologies
- Programming Languages and Paradigms
- Programming the Web
- .NET Overview
38 Programming Languages
- Machine code
- Assembly language
- High-level languages
- Fortran, LISP, Cobol
- C, Pascal, Basic, Smalltalk
- C++, Eiffel
- Java, C#
- Scripting languages
- Shell scripts, Perl, TCL, Python, JavaScript, VBScript
39 Programming Paradigms
- Unstructured programming
- Structured programming
- Object-oriented programming
- Component-based programming
- Event-based programming
40 Programming Paradigms: Unstructured Programming
- See "Go To Statement Considered Harmful" at http://www.acm.org/classics/oct95/
41 Programming Paradigms: Structured Programming
- Sequence
- Conditional
- if then else
- switch
- Looping
- for i from 1 to n
- do while
- do until
- Functions
- Exceptions
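The building blocks on this slide fit into one small function. A sketch only; the `average` function and its inputs are made up for illustration.

```python
def average(values):
    """Function: a named, reusable block of statements."""
    if not values:                       # conditional
        raise ValueError("no values")    # exception: signal an error to the caller
    total = 0                            # sequence: statements run in order
    for v in values:                     # looping
        total += v
    return total / len(values)

try:
    result = average([1, 2, 3, 4])
    average([])                          # raises, so control jumps to except
except ValueError as err:
    message = str(err)

print(result, message)
```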
42 Programming Paradigms: Object-Oriented Programming
- Objects have data and behavior
- Data: members, fields, variables, slots, properties
- Behavior: methods, functions, procedures
- Using objects is easy
- First instantiate the type of object desired
- Then call its methods and get/set its properties
- Designing new types of objects can be hard
- Design goals often conflict: simplicity, functionality, reuse, performance
43 Programming Paradigms: Object-Oriented Programming
- Key object-oriented concepts
- Identity
- Encapsulation
- Data + behavior
- Information hiding (abstraction)
- Classes vs. instances
- Polymorphism
- Interfaces
- Delegation, aggregation
- Inheritance
- Patterns
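Several of these concepts can be seen together in a few lines. The classes below are hypothetical examples: `Shape` acts as the interface, the subclasses inherit from it, the underscore-prefixed fields are hidden details, and calling `area()` on a mixed list shows polymorphism.

```python
import math

class Shape:
    """Base class: the interface every shape offers."""
    def __init__(self, name):
        self._name = name              # leading underscore: internal detail

    @property
    def name(self):                    # read-only property
        return self._name

    def area(self):
        raise NotImplementedError("subclasses supply area()")

class Rectangle(Shape):                # inheritance
    def __init__(self, width, height):
        super().__init__("rectangle")
        self._width, self._height = width, height

    def area(self):
        return self._width * self._height

class Circle(Shape):
    def __init__(self, radius):
        super().__init__("circle")
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2

# Instances of different classes, used through the same interface.
shapes = [Rectangle(2, 3), Circle(1)]
areas = {shape.name: shape.area() for shape in shapes}
print(areas)
```

Two rectangles with equal sides are still distinct objects, which is what identity means here.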
44 Programming Paradigms: Component-Based Programming
- Components
- Independent modules of reuse and deployment
- Coarser-grained than objects (objects are language-level constructs)
- Include multiple classes
- Often language-independent
- In the general case, the component writer and the component user don't know each other, don't work for the same company, and don't use the same language
45 Programming Paradigms: Component-Based Programming
- Component Object Model (COM)
- Initial Microsoft standard for components
- Specifies a protocol for instantiating and using components in-process, across processes or across machine boundaries
- Basis for ActiveX, OLE, and many other technologies
- Can be created in Visual Basic, C++, .NET, ...
- Java Beans
- Java standard for components
- Not language-independent
46 Programming Paradigms: Event-Based Programming
- When something of interest occurs, an event is raised and application-specific code is executed
- Events provide a way for you to hook your own code into the operation of another system
- Event = callback
- User interfaces are all about events
- onClick, onMouseOver, onMouseMove
- Events can also be based upon time or interactions with the network, operating system, other applications, etc.
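The callback idea can be sketched without any user-interface library. The `Button` class and the event names below are hypothetical; they only mimic how a system lets you register your own code to run when something happens.

```python
class Button:
    """A stand-in for a UI element that can raise named events."""
    def __init__(self, label):
        self.label = label
        self._handlers = {}            # event name -> list of callbacks

    def on(self, event_name, handler):
        """Register application code to run when the event is raised."""
        self._handlers.setdefault(event_name, []).append(handler)

    def raise_event(self, event_name, **details):
        """Called by the system when the event occurs."""
        for handler in self._handlers.get(event_name, []):
            handler(self, **details)

log = []

def handle_click(source, x, y):
    log.append(f"{source.label} clicked at {x},{y}")

button = Button("Save")
button.on("click", handle_click)

button.raise_event("click", x=10, y=20)   # runs handle_click
button.raise_event("mouseover")           # no handler registered: nothing runs
print(log)
```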
47 Agenda
- Internet Technologies
- Programming Languages and Paradigms
- Programming the Web
- .NET Overview
48 Programming the Web: Client-Side Code
- What is client-side code?
- Software that is downloaded from Web server to browser and then executes on the client
- Why client-side code?
- Better scalability: less work done on server
- Better performance/user experience
- Create UI constructs not inherent in HTML
- Drop-down and pull-out menus
- Tabbed dialogs
- Cool effects, e.g. animation
- Data validation
49 Programming the Web: Client-Side Technologies
- DHTML/JavaScript
- COM
- ActiveX controls
- COM components
- Remote Data Services (RDS)
- Java
- Plug-ins
- Helpers
- Remote Scripting
50 Programming the Web: Dynamic HTML (DHTML)
- Script that is embedded within an HTML page
- Usually written in JavaScript (ECMAScript, JScript) for portability
- Internet Explorer also supports VBScript and other scripting languages
- Each HTML element becomes an object that has associated events (e.g. onClick)
- Script provides code to respond to browser events
51 Programming the Web: DHTML
- The DHTML Document Object Model (DOM)
[Figure: DOM object hierarchy]
- window: history, document, location, screen, navigator, frames, event
- document: all, location, children, selection, forms, body, links
- forms: text, button, radio, textarea, select, password, checkbox, submit, option, file, reset
52 Programming the Web: Server-Side Code
- What is server-side code?
- Software that runs on the server, not the client
- Receives input from
- URL parameters
- HTML form data
- Cookies
- HTTP headers
- Can access server-side databases, e-mail servers, files, mainframes, etc.
- Dynamically builds a custom HTML response for a client
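As a language-neutral illustration of server-side code, here is a small sketch using Python's WSGI convention rather than any technology named in these slides. The `application` function reads a URL parameter and an HTTP header and builds a custom HTML response. The request is simulated with a hand-built environment, so no real server is started; the parameter name and browser name are hypothetical.

```python
from html import escape
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults

def application(environ, start_response):
    """Server-side code: build an HTML page from the request's inputs."""
    params = parse_qs(environ.get("QUERY_STRING", ""))        # URL parameters
    name = params.get("name", ["stranger"])[0]
    agent = environ.get("HTTP_USER_AGENT", "unknown browser")  # HTTP header
    page = (
        "<html><body>"
        f"<p>Hello, {escape(name)}!</p>"
        f"<p>You are using {escape(agent)}.</p>"
        "</body></html>"
    )
    payload = page.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]

# Simulate one request without starting a server.
environ = {"QUERY_STRING": "name=Leon", "HTTP_USER_AGENT": "ExampleBrowser/1.0"}
setup_testing_defaults(environ)       # fills in the remaining required keys

captured = {}
def start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)

response_body = b"".join(application(environ, start_response)).decode("utf-8")
print(captured["status"])
print(response_body)
```

Escaping user-supplied values before placing them in the page keeps input from being interpreted as markup.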
53 Programming the Web: Server-Side Code
- Why server-side code?
- Accessibility
- You can reach the Internet from any browser, any device, any time, anywhere
- Manageability
- Does not require distribution of application code
- Easy to change code
- Security
- Source code is not exposed
- Once user is authenticated, can allow only certain actions
- Scalability
- Web-based 3-tier architecture can scale out
54 Programming the Web: Server-Side Technologies
- Common Gateway Interface (CGI)
- Internet Server API (ISAPI)
- Netscape Server API (NSAPI)
- Active Server Pages (ASP)
- Java Server Pages (JSP)
- Personal Home Page (PHP)
- Cold Fusion (CFM)
- ASP.NET
55 Programming the Web: Active Server Pages (ASP)
- Technology to easily create server-side applications
- ASP pages are written in a scripting language, usually VBScript or JScript
- An ASP page contains a sequence of static HTML interspersed with server-side code
- ASP script commonly accesses and updates data in a database
56 Programming the Web: ASP
[Figure: an ASP page combining static HTML and server-side logic]
57 Architectures
- N Tier
- Application Servers
- Middleware
- Intranets
58 What to Know About Architectures
- Evolution of Architectures
- Evolution of COM
- Definitions
- ASP
- Middleware
- Database transparency
- ADO
59 Client/Server
- Client/Server evolves along the line of PC computing and Microsoft
- PC moved from complement to substitute for Host systems
- 80s: PCs were personal
- 90s: PCs were departmental
- 00s: PCs are enterprise-wide platforms
60 Key Problems of Client/Server
- Scalability
- Manageability
- Complexity
- Ease of Use
- Application Development
61 Network Architectures
- Host-based networks: the host computer performs virtually all of the work
- Client-based networks: the client computer performs virtually all of the work
- Client-server networks: the work is shared between the hosts and clients
62 Functions
- The work done by any application program can be divided into four general functions
- data storage
- data access logic
- application logic
- presentation logic
63 Host-Based Architectures
64 Host-Based Architectures
- Scalability
- Pros: provide large-scale access
- Cons: expensive, lumpy, slow
- Centralized Management
- Complexity
- Poor user interface
- Application backlog: low-level tools, specialized knowledge
65 Client-Based Architectures
- Evolution of standalone PC to fill gaps in departmental computing needs
66 Client-Based Architectures
- Server acts as a remote disk drive.
- Client does all the processing.
- Searches slow down the network: for each query, the server passes the entire database over the network.
67 Client-Based Architectures
- Scalability
- Pros: cheap; easy to add a station, up to limits
- Cons: severely limited by bandwidth/architecture
- Cons: changes to application must be made on server and all clients
- Decentralized Management
- Complexity
- Graphical user interface
- Off-the-shelf applications: high-level tools
- Custom apps difficult to develop
68 Two Tier Client/Server
- Moves data access logic to server: decreases congestion
- Eases application development: object-oriented programming
69 Two Tier Client/Server
- Improved scalability: less data traffic
- An SQL request is generated in the client and transmitted to the server. The DBMS searches locally and returns only matching records, greatly reducing data traffic.
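The difference in data traffic between the client-based and two tier designs can be shown with a toy table. This sketch uses an in-memory SQLite database as a stand-in for a database server (SQLite runs in-process, so nothing actually crosses a network here); the table and its rows are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
db.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ann", "Oslo"), ("Bob", "Lima"), ("Cy", "Oslo"), ("Di", "Rome")],
)

# Client-based style: fetch every record, then search on the client.
all_rows = db.execute("SELECT name, city FROM customers ORDER BY id").fetchall()
client_filtered = [name for name, city in all_rows if city == "Oslo"]

# Two tier style: send an SQL request; only matching records come back.
matching_rows = db.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY id", ("Oslo",)
).fetchall()
server_filtered = [name for (name,) in matching_rows]

print(len(all_rows), "rows transferred vs", len(matching_rows))
```

Both approaches give the same answer; what changes is how many records have to travel to the client to get it.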
70 COM - Component Object Model
- Rapid development, lower costs by using ready-made building blocks
- Components called like subroutines
- Two tier system: components are downloaded to the client to execute, which uses lots of bandwidth and bogs down WANs and enterprise networks
71 Three Tier Client/Server
- Application server stores and executes components for clients
- Easier management: centralizing application logic
- Reduces network traffic
- Optimizes network processing: lower load on client. Can use thin clients, NCs, older PCs, ...
72 Three Tier Object Models
- CORBA: Common Object Request Broker Architecture
- DCOM (MS): Distributed Component Object Model
- Allows components to be executed remotely with results sent back to client
73 Three Tier Client/Server
- API (Application Programming Interface) and three tier architecture bring the power of layering logic
- Layering logic: change a layer without changing the others
- Application logic (e.g. tax software)
74 Web Mail as 3 Tier System
75 Systems Management Performance Web Sites MS Windows DNA
- Front End: stateless, load balancing, cloned systems
- Back End: stateful, partitioning
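The front end/back end distinction can be sketched in code. Because front-end servers are stateless clones, any of them can take any request, so simple round-robin balancing works. Back-end data is stateful, so a given key has to reach the partition that holds it. Server names and the choice of CRC32 as the hash are assumptions made for this example.

```python
import itertools
import zlib

# Front end: identical, stateless clones. Any clone can serve any request.
front_ends = ["web1", "web2", "web3"]
_rotation = itertools.cycle(front_ends)

def pick_front_end():
    """Round-robin load balancing across cloned servers."""
    return next(_rotation)

# Back end: stateful partitions. The same key must always map to the
# same partition, because that is where its data lives.
back_ends = ["db0", "db1"]

def pick_back_end(customer_id):
    """Choose a partition from a stable hash of the key."""
    return back_ends[zlib.crc32(customer_id.encode("utf-8")) % len(back_ends)]

assignments = [pick_front_end() for _ in range(4)]
partition_first = pick_back_end("leon")
partition_again = pick_back_end("leon")
print(assignments, partition_first, partition_again)
```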
76 N-Tier Architecture
77 What is a Web Application?
[Figure labels: HTTP Clients, Services, IIS, Active Server Pages, Transaction server, Script Logic, Message Queuing, COM, Commerce, Databases and Directories]
78 Architecture Tiers: Business Logic
- COM - Component Object Model
- Service accessibility
- Visual Basic, C++, Java, Cobol, Pascal
- Used from Active Server Pages
- COM components provide
- Encapsulation of internal details
- Modularity for code reuse
- Intellectual Property Security
79 N Tier Intranet - Scalability
80 N Tier Architecture
[Figure labels, arranged in three tiers (Presentation, Business logic, Data services): Web Browser, HTTP, Internet Information Server, MTS, Components, DCOM, MSMQ, SQL Server, LDAP Server, LDAP, Directory, Microsoft Exchange, Others]
81 Architecture Of A Commerce Site
[Figure labels: Commerce Server Supplied Components, HTML, .ASP, Data, IIS, Commerce Server Components, SQL, Oracle (ODBC compliant)]
82 Commerce: Accessing The Data
- Data is accessed using ADO, and any ODBC-compliant database can be easily used
83 Middleware
- Conversion layer, sits between client and database
- Database transparency: middleware translates DB using OLE DB (API)
- Data access logic (ADO, OLE DB)
- Application written independent of source of database
- Integrate data from incompatible sources
- Allows multi-vendor solutions
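The idea of database transparency can be sketched without ADO or OLE DB. In this hypothetical example, two data sources with incompatible storage formats (a SQL table and a CSV file) expose the same small interface, so the application logic never needs to know where the data came from. All class and function names are invented for illustration.

```python
import csv
import io
import sqlite3

class SqlCustomers:
    """Source 1: a relational table (in-memory SQLite as a stand-in)."""
    def __init__(self, names):
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE customers (name TEXT)")
        self._db.executemany("INSERT INTO customers VALUES (?)",
                             [(n,) for n in names])

    def names(self):
        return [row[0] for row in self._db.execute("SELECT name FROM customers")]

class CsvCustomers:
    """Source 2: a flat file in CSV form."""
    def __init__(self, text):
        self._text = text

    def names(self):
        return [row["name"] for row in csv.DictReader(io.StringIO(self._text))]

def all_customer_names(sources):
    """Application logic: sees only names(), never the storage format."""
    merged = []
    for source in sources:
        merged.extend(source.names())
    return sorted(merged)

combined = all_customer_names([
    SqlCustomers(["Leon", "Ann"]),
    CsvCustomers("name\nBob\nDi\n"),
])
print(combined)
```

Swapping one source for another vendor's product would mean writing a new adapter class, with no change to `all_customer_names`.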
84 Benefits of Client-Server Architectures
- Scalability
- Manageability
- Cost effective: PC hardware is more than 1000 times cheaper than mainframe hardware for the same computing power.
- Complexity: easy on user, tough on IT
85 Intranets
- In-house Web site not accessible by public
- Gives benefits of three tier model
- Low bandwidth requirements
- Thin client
- Lower management cost/deployment cost
- Scalability
- Reduces complexity: cross-platform interoperability, standard interface
- Emerging tools and standards: HTML not suited to applications; DHTML, XML, BizTalk
86 Elements of the Intranet
- Client: Browser
- Navigation, interpretation of HTML from server, light processing
- Dynamic HTML: does processing on client. Web pages function more like regular software, better control of interface
- Server
- Serves up HTML to client via HTTP; lower bandwidth needed between server and client
- Active Server Pages (.asp): script language embedded in HTML pages that generates HTML output from remote procedure call on server