Title: Web Technology and DBMSs
1 Chapter 28
- Web Technology and DBMSs
- Transparencies
2 Chapter 28 - Objectives
- Basics of Internet, Web, HTTP, HTML, URLs.
- Difference between two-tier and three-tier client-server architecture.
- Advantages and disadvantages of Web as a database platform.
- Approaches for integrating databases into Web:
- Scripting Languages
- Common Gateway Interface (CGI)
- HTTP Cookies
3 Chapter 28 - Objectives
- Extending the Web Server
- Java and JDBC, SQLJ, Servlets, and JSP
- Microsoft Web Solution Platform: ASP and ADO
- Oracle Internet Platform.
4 Introduction
- Web most popular and powerful networked information system to date.
- As architecture of Web was designed to be platform-independent, can significantly lower deployment and training costs.
- Organizations using Web as strategic platform for innovative business solutions, in effect becoming Web-centric.
5 Introduction
- Many Web sites today are file-based where each Web document is stored in separate file.
- For large sites, this can lead to significant management problems.
- Also many Web sites now contain more dynamic information, such as product and pricing data.
- Maintaining such data in both a database and in separate HTML files is problematic.
- Accessing database directly from Web would be a better approach.
6 Internet
- Worldwide collection of interconnected networks.
- Began in late '60s in ARPANET, a US DOD project, investigating how to build networks that could withstand partial outages.
- Starting with a few nodes, Internet estimated to have over 100 million users in 1997, and over 390 million users in over 100 countries in 2001.
- May be 640 million users of Web by year 2003.
- About 2.5 billion documents on Internet (550 billion if intranets/extranets included).
7 Intranet and Extranet
- Intranet - Web site or group of sites belonging to an organization, accessible only by members of that organization.
- Extranet - An intranet that is partially accessible to authorized outsiders.
- Whereas intranet resides behind firewall and is accessible only to people who are members of same organization, extranet provides various levels of accessibility to outsiders.
8 eCommerce and eBusiness
- eCommerce - Customers can place and pay for orders via the business's Web site.
- eBusiness - Complete integration of Internet technology into economic infrastructure of the business.
- Business-to-business transactions may reach $1.3 trillion by 2003.
- eCommerce may account for $3.2 trillion in worldwide corporate revenue by 2003 and could represent 5% of sales in the global economy.
9 The Web
- Hypermedia-based system that provides a simple point-and-click means of browsing information on the Internet using hyperlinks.
- Information presented on Web pages, which can contain text, graphics, pictures, sound, and video.
- Can also contain hyperlinks to other Web pages, which allow users to navigate in a non-sequential way through information.
- Web documents written using HTML.
10 The Web
- Web consists of network of computers that can act in two roles:
  - as servers, providing information
  - as clients (browsers), requesting information.
- Protocol that governs exchange of information between Web server and browser is HTTP, and locations within documents are identified by a URL.
- Much of Web's success is due to its simplicity and platform-independence.
11 Basic Components of Web Environment
12 HyperText Transfer Protocol (HTTP)
- Protocol used to transfer Web pages through Internet.
- Based on request-response paradigm:
  - Connection - Client establishes connection with Web server.
  - Request - Client sends request to Web server.
  - Response - Web server sends response (HTML document) to client.
  - Close - Connection closed by Web server.
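
The four stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `socket.socketpair()` stands in for a real TCP connection to a Web server, and `serve_once` and `fetch` are hypothetical names for a toy server and client, not part of any real Web server.

```python
import socket
import threading


def serve_once(conn: socket.socket) -> None:
    """Toy server: read one request, send one HTML response, then close."""
    request = b""
    while b"\r\n\r\n" not in request:  # a blank line ends the request headers
        chunk = conn.recv(1024)
        if not chunk:
            break
        request += chunk
    body = b"<html><body>Hello</body></html>"
    head = (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body)}\r\n\r\n"
    ).encode("ascii")
    conn.sendall(head + body)  # Response
    conn.close()               # Close: the server ends the connection


def fetch(path: str) -> tuple[str, bytes]:
    """Toy client: returns the status line and the response body."""
    # Connection: a connected socket pair replaces a real TCP connect here.
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()
    # Request
    client.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
    # Response: keep reading until the server closes its end.
    raw = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:
            break
        raw += chunk
    worker.join()
    client.close()
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0].decode("ascii")
    return status_line, body
```

Calling `fetch("/index.html")` runs one complete connect, request, response, close cycle; nothing is remembered afterwards, so a second call starts from scratch.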
13 HyperText Transfer Protocol (HTTP)
- HTTP/1.0 is stateless protocol - each connection is closed once server provides response.
- This makes it difficult to support concept of a session that is essential to basic DBMS transactions.
14 HyperText Markup Language (HTML)
- Document formatting language used to design most Web pages.
- A simple, yet powerful, platform-independent document language.
- HTML is an application of Standard Generalized Markup Language (SGML), a system for defining structured document types and markup languages to represent instances of those document types.
15 HyperText Markup Language (HTML)
16 HyperText Markup Language (HTML)
17 Uniform Resource Locators (URLs)
- String of alphanumeric characters that represents location or address of a resource on Internet and how that resource should be accessed.
- Defines uniquely where documents (resources) can be found.
- Uniform Resource Identifiers (URIs) - generic set of all Internet resource names/addresses.
- Uniform Resource Names (URNs) - persistent, location-independent name. Relies on name lookup services.
18 Uniform Resource Locators (URLs)
- URL consists of three basic parts:
  - protocol used for the connection,
  - host name,
  - path name on host where resource stored.
- Can optionally specify:
  - port through which connection to host should be made,
  - query string.
- Example: http://www.w3.org/Markup/MarkUp.html
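
As a rough illustration of these parts, the sketch below splits a URL with Python's standard `urllib.parse` module. The helper name `url_parts` and the second URL (with port and query string, on `www.example.com`) are assumptions added for illustration only.

```python
from urllib.parse import parse_qs, urlsplit


def url_parts(url: str) -> dict:
    """Split a URL into the three basic parts plus the two optional ones."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,        # e.g. "http"
        "host": parts.hostname,          # host name
        "path": parts.path,              # path name on the host
        "port": parts.port,              # None when no explicit port is given
        "query": parse_qs(parts.query),  # empty dict when no query string
    }


# The URL from the slide: only the three basic parts are present.
basic = url_parts("http://www.w3.org/Markup/MarkUp.html")

# A hypothetical URL that also carries a port and a query string.
full = url_parts("http://www.example.com:8080/catalog/search?item=chair&max=10")
```

For the slide's URL the port comes back as `None` and the query as an empty dictionary, which reflects that both are optional.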
19 Static and Dynamic Web Pages
- HTML document stored in file is static Web page.
- Content of dynamic Web page is generated each time it is accessed.
- Thus, dynamic Web page can:
  - respond to user input from browser
  - be customized by and for each user.
- Requires hypertext to be generated by servers.
- Need scripts that perform conversions from different data formats into HTML on-the-fly.
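
A minimal sketch of such an on-the-fly conversion, assuming an in-memory SQLite database with a hypothetical `product` table (neither the table nor the function name `render_price_list` comes from the slides): the page is rebuilt from the data on every call, so a price change shows up without editing any HTML file.

```python
import sqlite3
from html import escape


def render_price_list(conn: sqlite3.Connection) -> str:
    """Generate the HTML page from the current contents of the database."""
    rows = conn.execute(
        "SELECT name, price FROM product ORDER BY name"
    ).fetchall()
    cells = "".join(
        f"<tr><td>{escape(name)}</td><td>{price:.2f}</td></tr>"
        for name, price in rows
    )
    return f"<html><body><table>{cells}</table></body></html>"


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("Desk", 120.0), ("Chair & stool", 45.5)],
)

page_before = render_price_list(conn)
conn.execute("UPDATE product SET price = 99.0 WHERE name = 'Desk'")
page_after = render_price_list(conn)  # reflects the new price
```

A static page would have needed a manual edit to show the new price; here the data lives in one place only.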
20 Requirements for Web-DBMS Integration
- Ability to access valuable corporate data in a secure manner.
- Data- and vendor-independent connectivity to allow freedom of choice in DBMS selection.
- Ability to interface to database independent of any proprietary browser or Web server.
- Connectivity solution that takes advantage of all the features of an organization's DBMS.
21 Requirements for Web-DBMS Integration
- Open architecture to allow interoperability with a variety of systems and technologies. For example:
  - different Web servers
  - Microsoft's (Distributed) Component Object Model (DCOM/COM)
  - CORBA/IIOP (Internet Inter-ORB Protocol)
  - Java/Remote Method Invocation (RMI).
- Cost-effective solution that allows for scalability, growth, and changes in strategic directions, and helps reduce applications development costs.
22 Requirements for Web-DBMS Integration
- Support for transactions that span multiple HTTP requests.
- Support for session- and application-based authentication.
- Acceptable performance.
- Minimal administration overhead.
- Set of high-level productivity tools to allow applications to be developed, maintained, and deployed with relative ease and speed.
23 Two-Tier Client-Server Architecture
24 Three-Tier Client-Server Architecture
- In two-tier architecture, client side presented two problems preventing true scalability:
  - Fat client, requiring considerable resources on client's computer to run effectively.
  - Significant client side administration overhead.
- By 1995, three layers proposed, each potentially running on a different platform.
25 Three-Tier Client-Server Architecture
- Advantages:
  - Thin client, requiring less expensive hardware.
  - Application maintenance centralized.
  - Easier to modify or replace one tier without affecting others.
  - Separating business logic from database functions makes it easier to implement load balancing.
  - Maps quite naturally to Web environment.
26 Three-Tier Client-Server Architecture
27 Advantages of Web-DBMS Approach
- DBMS advantages
- Simplicity
- Platform independence
- Graphical User Interface
- Standardization
- Cross-platform support
- Transparent network access
- Scalable deployment
- Innovation
28 Disadvantages of Web-DBMS Approach
- Reliability
- Security
- Cost
- Scalability
- Limited functionality of HTML
- Statelessness
- Bandwidth
- Performance
- Immaturity of development tools
29 Approaches to Integrating Web and DBMSs
- Scripting Languages.
- Common Gateway Interface (CGI).
- HTTP Cookies.
- Extending the Web Server.
- Java, JDBC, SQLJ, Servlets, and JSP.
- Microsoft Web Solution Platform: ASP and ADO.
- Oracle Internet Platform.
30 Scripting Languages (JavaScript and VBScript)
- Scripting languages can be used to extend browser and Web server with database functionality.
- As script code is embedded in HTML, it is downloaded every time page is accessed.
- Updating browser is simply a matter of changing Web document on server.
- Some popular scripting languages are JavaScript, VBScript, Perl, and PHP.
- They are interpreted languages, not compiled, making it easy to create small applications.
31 Common Gateway Interface (CGI)
- Specification for transferring information between a Web server and a CGI program.
- Server only intelligent enough to send documents and to tell browser what kind of document it is.
- But server also knows how to launch other programs.
- When server sees that URL points to a program (script), it executes script and sends back script's output to browser as if it were a file.
32 CGI - Environment
33 CGI
- CGI defines how scripts communicate with Web servers.
- A CGI script is any script designed to accept and return data that conforms to the CGI specification.
- Before server launches script, it prepares a number of environment variables representing current state of the server, who is requesting the information, and so on.
- Script picks these up and reads STDIN.
34 CGI
- Then performs necessary processing and writes its output to STDOUT.
- Script responsible for sending MIME header, which allows browser to differentiate between components.
- CGI scripts can be written in almost any language, provided it supports reading and writing of an operating system's environment variables.
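
The flow described on the last two slides (environment variables in, STDIN in, MIME header plus output on STDOUT) might look roughly like the sketch below. `REQUEST_METHOD`, `QUERY_STRING`, and `CONTENT_LENGTH` are standard CGI variables; the function name `run_cgi` and the `name` form field are assumptions made for illustration. The environment and streams are passed in as parameters so the logic can be exercised without a Web server.

```python
import io
from html import escape
from urllib.parse import parse_qs


def run_cgi(environ: dict, stdin, stdout) -> None:
    """Handle one request the way a simple CGI script would."""
    method = environ.get("REQUEST_METHOD", "GET")
    if method == "POST":
        # POST data arrives on standard input; the server says how much.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET data arrives in an environment variable.
        raw = environ.get("QUERY_STRING", "")
    fields = parse_qs(raw)
    name = fields.get("name", ["world"])[0]  # hypothetical form field

    # MIME header first, then a blank line, then the document itself.
    stdout.write("Content-Type: text/html\r\n\r\n")
    stdout.write(f"<html><body>Hello, {escape(name)}</body></html>")


# Under a real server the call would be:
#   run_cgi(os.environ, sys.stdin, sys.stdout)
# Here a request is simulated with in-memory streams instead.
demo_out = io.StringIO()
run_cgi(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "name=Ann"},
    io.StringIO(""),
    demo_out,
)
```

After the simulated request, `demo_out.getvalue()` holds exactly what the server would relay to the browser: the header, a blank line, and the generated HTML.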
35 CGI
- Four primary methods for passing information from browser to a CGI script:
  - Passing parameters on the command line.
  - Passing environment variables to CGI programs.
  - Passing data to CGI programs via standard input.
  - Using extra path information.
36 CGI - Passing Parameters on Command Line
37 CGI - Advantages
- CGI is the de facto standard for interfacing Web servers with external applications.
- Possibly most commonly used method for interfacing Web applications to data sources.
- Advantages:
  - simplicity,
  - language independence,
  - Web server independence,
  - wide acceptance.
38 CGI - Disadvantages
- Communication between client and database server must always go through Web server.
- Lack of efficiency and transaction support, and difficulty validating user input inherited from statelessness of HTTP protocol.
- HTTP never intended for long exchanges or interactivity.
- Server has to generate a new process or thread for each CGI script.
- Security.
39 HTTP Cookies
- Cookies can make CGI scripts more interactive.
- Cookies are small text files stored on Web client.
- CGI script creates cookie and has Web server send it to client's browser to store on hard disk.
- Later, when client revisits Web site and uses a CGI script that requests this cookie, client's browser sends information stored in the cookie.
- Cookies can be used to store registration information or preferences (e.g. for virtual shopping cart).
- However, not all browsers support cookies.
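
The round trip described above can be sketched with Python's standard `http.cookies` module. The cookie name `cart`, the one-hour lifetime, and both function names are assumptions chosen for illustration; under CGI, the browser's Cookie header would typically reach the script through the `HTTP_COOKIE` environment variable.

```python
from http.cookies import SimpleCookie


def make_set_cookie(cart_id: str) -> str:
    """Build the header line a script emits to store a cookie on the client."""
    cookie = SimpleCookie()
    cookie["cart"] = cart_id           # hypothetical shopping-cart identifier
    cookie["cart"]["path"] = "/"
    cookie["cart"]["max-age"] = 3600   # assumed lifetime of one hour
    return cookie.output()             # a "Set-Cookie: ..." header line


def read_cart_id(cookie_header: str):
    """Recover the cart identifier from the Cookie header sent back later."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("cart")
    return morsel.value if morsel else None


header_line = make_set_cookie("abc123")
returning_cart = read_cart_id("cart=abc123; theme=dark")
first_visit_cart = read_cart_id("theme=dark")  # no cart cookie yet
```

Because `read_cart_id` returns `None` when the cookie is absent, the same script can handle first-time visitors and browsers that do not support cookies.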
40 Extending the Web Server
- To overcome limitations of CGI, many servers provide an API that adds functionality to server.
- Two of main APIs are Netscape's NSAPI and Microsoft's ISAPI.
- Scripts are loaded in as part of the server, giving back-end applications full access to all the I/O functions of server.
- One copy of application is loaded and shared between multiple requests to server.
41 Extending the Web Server
- Approach more complex than CGI, possibly requiring specialized programmers.
- Can provide very flexible and powerful solution.
- API extensions can provide same functionality as a CGI program, but as API runs as part of the server, API approach can perform significantly better than CGI.
- Extending Web server is potentially dangerous, since server executable is being changed.
42 Server-Side JavaScript for Database Access
43 Comparison of CGI and API
- CGI and API both extend capabilities of server.
- CGI scripts run in environment created by Web server program.
- Scripts only execute once Web server interprets request from browser, then returns results back to the server.
- API approach not nearly so limited in its ability to communicate.
- API-based extensions are loaded into same address space as Web server.
44 Java
- Proprietary language developed by Sun.
- Originally intended to support environment of networked machines and embedded systems.
- Now, Java is rapidly becoming de facto language for Web computing.
- Interesting because of its potential for building Web applications (applets) and server applications (servlets).
45 Java
- A simple, object-oriented, distributed, interpreted, robust, secure, architecture-neutral, portable, high-performance, multi-threaded and dynamic language.
- Has a machine-independent target architecture, the Java Virtual Machine (JVM).
- Since almost every Web browser vendor has already licensed Java and implemented an embedded JVM, Java applications can currently be deployed on most end-user platforms.
46 Java
47 Java
- Before Java application can be executed, it must first be loaded into memory.
- Done by Class Loader, which takes .class file(s) containing bytecodes and transfers them into memory.
- Class file can be loaded from local hard drive or downloaded from network.
- Finally, bytecodes must be verified to ensure that they are valid and do not violate Java's security restrictions.
48 Java
- Loosely speaking, Java is a 'safe' C++.
- Safety features include strong static type checking, automatic garbage collection, and absence of machine pointers at language level.
- Safety is central design goal: ability to safely transmit Java code across Internet.
- Security is also integral part of Java's design - sandbox ensures untrusted application cannot gain access to system resources.
49 Java 2 Platform
- In mid-1999, Sun announced it would pursue a distinct and integrated Java enterprise platform:
  - J2ME, aimed at embedded and consumer-electronics platforms.
  - J2SE, aimed at typical desktop and workstation environments. Essentially equivalent to JDK 1.2.
  - J2EE, aimed at robust, scalable, multiuser, and secure enterprise applications.
- J2EE was designed to simplify complex problems with development, deployment, and management of multitier enterprise applications.
50 Java 2 Platform
- Cornerstone of J2EE is Enterprise JavaBeans (EJB), a standard for building server-side components in Java.
- Two types of EJB components:
  - EJB Session Beans, components implementing business logic, business rules, and workflow.
  - EJB Entity Beans, components encapsulating some data contained by the enterprise. Entity Beans are persistent.
- From database perspective, interested in two J2EE components: JDBC and JSP.
51 Java 2 Platform
52 JDBC
- Modeled after ODBC, JDBC API supports basic SQL functionality.
- With JDBC, Java can be used as host language for writing database applications.
- On top of JDBC, higher-level APIs can be built.
- Currently, two types of higher-level APIs:
  - An embedded SQL for Java (e.g. SQLJ).
  - A direct mapping of relational database tables to Java classes (e.g. Java Blend from Sun).
53 JDBC
- JDBC API consists of two main interfaces: an API for application writers, and a lower-level driver API for driver writers.
- Applications and applets can access databases using:
  - ODBC drivers and existing database client libraries
  - JDBC API with pure Java JDBC drivers.
54 JDBC
55 JDBC - Advantages/Disadvantages
- Advantage of using ODBC drivers is that they are a de facto standard for PC database access, and are available for many DBMSs, for very low price.
- Disadvantages with this approach:
  - Non-pure JDBC driver will not necessarily work with a Web browser.
  - Currently downloaded applet can connect only to database located on host machine.
  - Deployment costs increase.
56 SQLJ
- Another JDBC-based approach uses Java with static embedded SQL.
- SQLJ comprises a set of clauses that extend Java to include SQL constructs as statements and expressions.
- SQLJ translator transforms SQLJ clauses into standard Java code that accesses database through a CLI.
57 Comparison of JDBC and SQLJ
- SQLJ is based on static embedded SQL while JDBC is based on dynamic SQL.
- Thus, SQLJ facilitates static analysis for syntax checking, type checking, and schema checking, which may help produce more reliable programs at loss of some functionality.
- It also potentially allows DBMS to generate an execution strategy for the query, thereby improving performance of the query.
58 Comparison of JDBC and SQLJ
- JDBC is low-level middleware tool with features to interface Java application with RDBMS.
- Developers need to design a relational schema to which they will map Java objects, and must write code to map Java objects to corresponding rows of relations.
- Problems:
  - need to be aware of two different paradigms (object and relational)
  - need to design a relational schema to map onto an object design
  - need to write mapping code.
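
To give a feel for what hand-written mapping code involves, the sketch below maps one class to one table. It is not JDBC: it uses Python's `sqlite3` module as a stand-in for the same low-level pattern, and the `Staff` class, the `staff` table, and the `save`/`load` helpers are all hypothetical names introduced for illustration.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Staff:
    """Object-side design: one class per kind of business entity."""
    staff_no: str
    name: str
    salary: float


def save(conn: sqlite3.Connection, s: Staff) -> None:
    """Mapping code, object to row: one column per attribute."""
    conn.execute(
        "INSERT INTO staff (staff_no, name, salary) VALUES (?, ?, ?)",
        (s.staff_no, s.name, s.salary),
    )


def load(conn: sqlite3.Connection, staff_no: str):
    """Mapping code, row to object: rebuild the object from its columns."""
    row = conn.execute(
        "SELECT staff_no, name, salary FROM staff WHERE staff_no = ?",
        (staff_no,),
    ).fetchone()
    return Staff(*row) if row else None


# Relational-side design: the schema must be kept in step with the class.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary REAL)"
)
save(conn, Staff("S1", "Ann Beech", 12000.0))
loaded = load(conn, "S1")
```

Even in this tiny case the attribute list appears four times (class, schema, insert, select), which is the duplication the problems above point to.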
59 Java Servlets
- Servlets are programs that run on Java-enabled Web server and build Web pages, analogous to CGI.
- Have a number of advantages over CGI:
  - improved performance
  - portability
  - extensibility
  - simpler session management
  - improved security and reliability.
60 Java Server Pages (JSP)
- JSP is a Java-based server-side scripting language that allows static HTML to be mixed with dynamically-generated HTML.
- Behind scenes, JSP is compiled into Java servlet and processed by a Java-enabled Web server (JSP works with most Web servers).
- Since servlet is compiled, performance is improved.
61 Microsoft Web Solution Platform
- Microsoft Web Solution Platform, a precursor to .NET, has been created for building and deploying interoperable Web solutions.
- Contains various tools, services, and technologies, such as:
  - Windows 2000,
  - Exchange Server,
  - Visual Studio,
  - HTML/XML,
  - scripting languages,
  - components (Java, ActiveX).
62 Object Linking and Embedding for DataBases (OLE DB)
- Microsoft has defined set of data objects, collectively known as OLE DB.
- Allows OLE-oriented applications to share and manipulate sets of data as objects.
- OLE DB is an object-oriented specification based on C++ API.
- Components can be treated as data consumers and data providers. Consumers take data from OLE DB interfaces and providers expose OLE DB interfaces.
63 OLE DB Architecture
64 Active Server Pages (ASP)
- ASP is programming model that allows dynamic, interactive Web pages to be created on server.
- ASP provides flexibility of CGI, without performance overhead discussed previously.
- ASP runs in-process with the server, and is optimized to handle large volume of users.
- When an .asp file is requested, Web server calls ASP, which reads requested file, executes any commands, and sends generated HTML page back to browser.
65 Active Server Pages (ASP)
66 ActiveX Data Objects (ADO)
- Programming extension of ASP supported by Microsoft IIS for database connectivity.
- Supports following key features:
  - Independently-created objects.
  - Support for stored procedures.
  - Support for different cursor types.
  - Batch updating.
  - Support for limits on number of returned rows.
  - Support for multiple recordsets.
- Designed as an easy-to-use interface to OLE DB.
67 Comparison of ASP and JSP
- Both designed to enable developers to separate page design from programming logic through use of callable components.
- Differences:
  - JSP is essentially platform and server independent whereas ASP primarily restricted to MS Windows-based platforms.
  - JSP perhaps more extensible as JSP developers can extend the JSP tags available.
  - JSP components are reusable across platforms.
  - JSP benefits from in-built Java security model.
68 Microsoft Access and Web Page Generation
- Microsoft Access provides 3 wizards for automatically generating HTML pages:
  - Static pages: user can export data to HTML format.
  - Dynamic pages using ASP: user can export data to an .asp file on Web server.
  - Dynamic pages using data access pages: data access pages are dynamic HTML Web pages bound directly to data in the database. Can be used like Access forms, except pages are stored as external files.
69 Oracle Internet Platform
- Comprises Oracle Internet Application Server (iAS) and Oracle DBMS.
- It is n-tier architecture based on industry standards such as:
  - HTTP and HTML/XML for Web enablement.
  - OMG's CORBA technology.
  - IIOP for object interoperability and RMI.
  - Java, EJB, JDBC, and SQLJ for database connectivity, Java servlets, and JSP.
70 Oracle Internet Platform
71 Oracle Internet Application Server (iAS)
- A reliable, scalable, secure, middle-tier application server designed to support eBusiness.
- Currently available in three versions:
  - Standard Edition: lightweight Web server with minimal application support
  - Enterprise Edition: for Web sites that handle high volume of transactions
  - Wireless Edition: same as Enterprise Edition but also includes Portal-to-Go for delivering content to wireless devices.
72 iAS Communication Services
- Handles all incoming requests received by iAS, of which some are processed by the Oracle HTTP Server and some requests are routed to other areas of iAS.
- Oracle HTTP Server is extended version of Apache Server.
- Although default Apache HTTP Server supports only stateless transactions, it can be configured to support stateful transactions using Apache JServ.
73 Oracle HTTP Server Modules (mods)
- Oracle has enhanced several of the compiled Apache mods provided with Apache server, and has added Oracle-specific ones, e.g.:
  - mod_ssl, provides standard S-HTTP
  - mod_plsql, routes PL/SQL requests to Oracle PL/SQL service, which then delegates servicing to PL/SQL programs
  - mod_perl, forwards Perl application requests to the embedded Perl Interpreter
  - mod_jserv, routes all servlet requests to the embedded Apache JServ servlet engine.
74 Business Logic Services
- These services support the application logic, e.g.:
  - Oracle JVM, a server-side Java platform supporting EJBs, CORBA, and database stored procedures.
  - Oracle PL/SQL, a scalable engine for running business logic against data in the Oracle Cache and the Oracle database.
  - Oracle Forms, enables users to run applications based on Oracle Forms technology over Internet or intranet to query or modify data in the database.
75 Presentation Services
- These services deliver dynamic content to client browsers, supporting servlets, JSP, Perl/CGI scripts, PL/SQL pages, forms, and business intelligence, e.g.:
  - Apache JServ, a Java servlet engine
  - OracleJSP, an implementation of Sun's JSP
  - Oracle PSP (PL/SQL Server Pages), analogous to JSP, but uses PL/SQL rather than Java for the server-side scripting. In the simplest case, a PSP is nothing more than an HTML file or an XML file.
76 Oracle Portal Services
- A portal is a Web-based application that provides a common, integrated entry point for accessing dissimilar data types on a single Web page.
- Oracle Portal provides portal services for users connecting from a traditional desktop.
- Oracle Portal-to-Go is a portal service for delivering information and applications to mobile devices. Portal-to-Go allows portal sites to be created that use Web pages, Java applications, and XML-based applications.