Title: Towards the Engineering of Web-based Applications
1Towards the Engineering of Web-based Applications
- Cornelia Boldyreff
- Distributed Systems Engineering Group,
- RISE, University of Durham, UK
- www.dur.ac.uk/cornelia.boldyreff
2Outline
- Web Site Development and Maintenance
- Web Metrics
- Web Site Evolution
- Web Management and Design Processes
- Web Site Engineering
3Web Site Development and Maintenance
- Recognition of potential Web maintenance problem
and very interesting case of software evolution
- Key paper - Measuring Readability and
Maintainability of Hyperdocuments, Hatzimanikatis
et al, JSM, 1995
- Possibility of working with Richard West and
colleagues managing UK government Web sites
4Initial Findings
- Uneven, largely poor, quality of authorship
- Badly structured web documents
- Difficulty in navigating web documents
- Hypertext structures rapidly become complex as
the linking of nodes increases
- Lack of tools for managing Web development and
maintenance
5Further Findings
- Applications on the WWW are often distributed and
therefore maintained by several
authors/maintainers
- Maintenance of WWW applications relies on the
error logs of each server, complaints from users,
and periodic checks by owners
- No general consensus of opinion, or standards, on
what constitutes a good web application;
therefore assessment is difficult
6Analyzing and Assessing Hyperdocuments
- Readability Factors
- Maintainability Factors
- Effectiveness Factors - Usability
- Evaluation process in practice needed
- High level factors needed to be related to low
level measurable factors
- Lessons to be learnt from classical software
engineering, especially software maintenance
7Key Results
- Analysis and Assessment of Web documents: GQM
and ISO Product Quality Standard
- Requirements Studies for Web Site Development and
Management Support
- Workbench developed - see www.dur.ac.uk/cornelia.boldyreff/workbench
8Solving the Broken Link Problem
- Major effort needed to maintain links and ensure
their currency
- Existing solutions involved non-standard
implementations, e.g. Hyper-G (Hyperwave)
- Guardian Agent solution implemented using
standard HTTP and CORBA - employing both eager
and lazy link update - patented by British Telecom
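The slides do not describe how the Guardian Agents perform eager and lazy link update, so the following is only a minimal sketch of the general distinction, not the patented BT design. The class name `LinkRegistry` and all of its methods are hypothetical. An eager update rewrites every referring page when a document moves; a lazy update records the move and repairs a link only when it is next followed.

```python
class LinkRegistry:
    """Toy registry of pages and their outgoing links (illustrative only)."""

    def __init__(self):
        self.moved = {}   # old URL -> new URL
        self.pages = {}   # page URL -> list of outgoing link URLs

    def add_page(self, url, links):
        self.pages[url] = list(links)

    def _relocate(self, old, new):
        # Record the move and carry the page's own links to the new location.
        self.moved[old] = new
        if old in self.pages:
            self.pages[new] = self.pages.pop(old)

    def resolve(self, url):
        # Follow the chain of recorded moves to the current location.
        seen = set()
        while url in self.moved and url not in seen:
            seen.add(url)
            url = self.moved[url]
        return url

    def move_eager(self, old, new):
        """Eager update: rewrite all referring pages at the time of the move."""
        self._relocate(old, new)
        for page in list(self.pages):
            self.pages[page] = [new if link == old else link
                                for link in self.pages[page]]

    def move_lazy(self, old, new):
        """Lazy update: only record the move; links are repaired on access."""
        self._relocate(old, new)

    def follow(self, page, index):
        """Follow a link, repairing it in place if its target has moved."""
        target = self.pages[page][index]
        current = self.resolve(target)
        if current != target:
            self.pages[page][index] = current
        return current
```

Eager update pays the cost at relocation time and keeps every page consistent; lazy update defers the cost until a stale link is actually used.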
9BT Requirements for WWW Maintenance
- Link management - if documents are relocated,
links should be redirected automatically to the
new document location
- Version control - support re-configuration of
existing sites with new updates
- Support of team working - several authors should
be able to work on the same documents without
interfering with each other
10Guardian Agents
- Compatible with WWW
- Provide on-line maintenance
- Work in background - transparent to the user
- Flexible
- Scalable
- Configurable
11Guardian Agents
- Guardian Agents analyze, monitor and record the
following information:
- Outgoing and incoming links
- Who accesses the web application
- Who changes the web application
12Determining Success - Web Quality Metrics
- Apply Goal-Question-Metric
- Metrics derived from Software Metrics, e.g.
complexity measures
- Tailored Hypertext Metrics, e.g. tree impurity
- New Web Metrics, e.g. Bray's HTML sincerity
- Both static and dynamic measures required
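As an illustration of a tailored hypertext metric, the sketch below computes tree impurity in the form usually attributed to Fenton for a connected undirected graph: m(G) = 2(e - n + 1) / ((n - 1)(n - 2)), which is 0 for a pure tree and 1 for a complete graph. The slides do not state which definition was used in the studies, so this choice, the function name `tree_impurity`, and the decision to treat links as undirected are assumptions.

```python
def tree_impurity(links):
    """Tree impurity of a site graph given (source, target) link pairs.

    Assumes the graph is connected; links are treated as undirected and
    self-links and duplicate links are ignored.
    """
    undirected = {frozenset(pair) for pair in links if pair[0] != pair[1]}
    nodes = set()
    for edge in undirected:
        nodes |= edge
    n, e = len(nodes), len(undirected)
    if n < 3:
        return 0.0  # the measure is undefined for fewer than three nodes
    return 2.0 * (e - n + 1) / ((n - 1) * (n - 2))
```

A strictly hierarchical site (every page reached from exactly one parent) scores 0; each extra cross-link pushes the value towards 1.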
13Software Evolution - Web Evolution
- Inspired by Lehman's classic studies in Software
Evolution
- Metrics here used to study changes over time
- Aim to understand and predict the web evolution
process
- Support better, more controlled, web maintenance
and management
14Early Metrics
- Key paper - Measuring Readability and
Maintainability of Hyperdocuments, Hatzimanikatis
et al, JSM, 1995
- Also earlier work by Brown on maintaining
long-life hypertext, Botafogo et al, and Garzotto
et al on hypertext quality
- Bray's paper, Measuring the Web, WWW5, 1996
15Content Metrics
- Measures to effect editorial control, e.g. spell
checking, grammar checking, content checking
(refereeing)
- Measures to gauge readability, e.g. sentence
lengths, vocabulary analysis, font analysis
- Measures to gauge usability, e.g. accessibility
checking (Bobby - www.cast.org/bobby),
color-blindness checking
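Two of the readability measures named above, sentence length and vocabulary analysis, can be approximated with very little code. The sketch below is an assumption about how such measures might be computed, not the tooling used in the original work: it splits sentences on `.`, `!` and `?`, and uses the type-token ratio (distinct words divided by total words) as a crude vocabulary measure. The function name `readability_measures` is hypothetical.

```python
import re


def readability_measures(text):
    """Return simple readability indicators for a block of plain text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        return {"sentences": 0, "words": 0,
                "mean_sentence_length": 0.0, "type_token_ratio": 0.0}
    return {
        "sentences": len(sentences),
        "words": len(words),
        # Average number of words per sentence.
        "mean_sentence_length": len(words) / len(sentences),
        # Distinct words over total words; lower values mean more repetition.
        "type_token_ratio": len(set(words)) / len(words),
    }
```

For a web page, the text would first have to be separated from its markup; that step is omitted here.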
16Structural Metrics
- Metrics derived from Software Metrics
- Per-document metrics: LOC - lines of code, COM -
lines of comments, NOM - number of modules
(nodes), number of links (edges), counts of
various HTML tags, MVG - McCabe's cyclomatic
complexity within the documents (for local
links), fan-in, fan-out, Henry-Kafura/Shepperd
information flow measure - (fan-in × fan-out)^2
- Per-site metrics: number of documents, totals and
means of the above measures
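The fan-in, fan-out and information flow measures can be computed directly from a list of links between documents. The following is a minimal sketch under stated assumptions: fan-in and fan-out count distinct linked documents, self-links are ignored, and the function names `structural_metrics` and `site_summary` are hypothetical.

```python
from collections import defaultdict


def structural_metrics(links):
    """Per-document fan-in, fan-out and (fan-in * fan-out)^2.

    links: iterable of (source_document, target_document) pairs.
    """
    fan_out = defaultdict(set)
    fan_in = defaultdict(set)
    docs = set()
    for src, dst in links:
        docs.update((src, dst))
        if src != dst:              # ignore links from a document to itself
            fan_out[src].add(dst)
            fan_in[dst].add(src)
    per_doc = {}
    for doc in sorted(docs):
        fi, fo = len(fan_in[doc]), len(fan_out[doc])
        per_doc[doc] = {"fan_in": fi, "fan_out": fo,
                        "info_flow": (fi * fo) ** 2}
    return per_doc


def site_summary(per_doc):
    """Per-site totals and means of the per-document information flow."""
    n = len(per_doc)
    total = sum(m["info_flow"] for m in per_doc.values())
    return {"documents": n, "total_info_flow": total,
            "mean_info_flow": total / n if n else 0.0}
```

Documents with both high fan-in and high fan-out dominate the information flow total, which is one way heavily coupled pages such as indices stand out.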
17Web Evolution Studies
- Structural metrics used
- Method
- develop and apply measurements over time
- identify patterns of change and develop models of
change
- test theories against more case studies
- Fixed intervals of time used instead of release
dates as in classical software evolution studies
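Measuring at fixed intervals amounts to taking a snapshot of the site graph at each interval and comparing consecutive snapshots. The sketch below assumes each snapshot is a dictionary mapping a document to the set of its outgoing links; this representation and the names `compare_snapshots` and `change_series` are assumptions made for illustration.

```python
def compare_snapshots(earlier, later):
    """Summarise the change between two snapshots of a site.

    Each snapshot maps a document name to the set of its outgoing links.
    """
    before, after = set(earlier), set(later)
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
        # Documents present in both snapshots whose links differ.
        "changed": sorted(d for d in before & after
                          if earlier[d] != later[d]),
        "growth": len(after) - len(before),
    }


def change_series(snapshots):
    """Compare each snapshot with the next one, in time order."""
    return [compare_snapshots(a, b)
            for a, b in zip(snapshots, snapshots[1:])]
```

The resulting series of change summaries is the raw material from which patterns and models of change could be derived.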
18WWW as Software-in-the-large
- Web software evolves and degrades like
conventional software - all forms of change
occur. Perfective, adaptive, corrective, and
preventative maintenance is needed
- Speculative or pre-emptive maintenance is also
needed (e.g. link checking)
- Maintenance may introduce errors - regression
testing
- A large manpower commitment is required
19WWW as Software-in-the-small
- Hypertext is represented as a graph - compare the
call graph of traditional programs
- Links are edges; modules (files) are nodes.
- Web pages may contain embedded software, e.g.
CGI, Java, VRML, etc.
- Users execute links
- Different kinds of links: images, frames, etc.
- Referential/organizational links
20Web Sites Chosen for Study
- S.T.A. - 20 files (very small)
- Cartercopters - 60 files (small)
- Durham University (part) - 641 files (medium)
- Sunderland University (part) - 438 files (medium)
- BBC - thousands
- BT - thousands
21SiteSeer and parsley
- Web spider
- parsley built using UNIX lex and yacc
- problems of robot exclusion and perturbation
- other tools used to view and manipulate graphs
produced
[Diagram labels: SiteSeer (1 per site), parsley (1 per document)]
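The per-document step of such a pipeline is link extraction. parsley itself was built with lex and yacc; the sketch below is a stand-in written with Python's standard `html.parser` module, not a reconstruction of parsley. The class name `LinkExtractor`, the function `extract_links`, and the choice of tags to record are all assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect (tag, absolute URL) pairs for anchors, images and frames."""

    # Which attribute carries the link for each tag of interest.
    LINK_ATTRS = {"a": "href", "img": "src", "frame": "src"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        wanted = self.LINK_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative links against the document's own URL.
                self.links.append((tag, urljoin(self.base_url, value)))


def extract_links(html, base_url):
    """Return all links found in one HTML document."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

A spider built on top of this would also need to honour robot exclusion (the standard library offers `urllib.robotparser` for reading robots.txt) and to limit its request rate so that the measurement does not perturb the site being studied.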
22Early results from studies
- Simple metrics collected easily
- Large quantities of data to analyze
- Index documents easily identifiable -
characterized by high fan-out (and fan-in) -
exhibit very high coupling
- Observed that the overall rate of change of a
site increased with the size of the site -
smallest site exhibited no changes during first
study
- Growth but never shrinkage!
23Further Results
- Recurrent structures emerged from studies of site
graphs - indices, tours, picture galleries can
all be identified. Possible basis for further
research on web design patterns.
- High link density (links per document) is
strongly related to the probability of additional
links being added to the document per unit time.
24Future Web Metrics Research
- Research on hypertext and web metrics is in its
infancy. Existing metrics require better
interpretation (e.g. mapping from low level
measures to high level quality factors).
- A unified measurement programme combining
structural metrics with content metrics,
especially usability, is needed.
25Web Site Classification
- Web site evolution studies used a 2-D
classification: site size and nature of site
ownership
- Others have used size and development focus
(long-lived, short-lived, one-off applications),
site type: educational, commercial and
institutional
- Some metrics may be more relevant to certain
classes of sites, so determining a web site
classification is a useful step before further
metrics research.
26Proposed Dimensions
- Size
- Domain
- Purpose
- Functionality
- Technology
- Age
- Rate of Change
- Evolution Strategy
27Observations
- It seems likely that the maintenance process
evolves with time and as the site grows, e.g. no
changes, minor corrections, managed re-design,
multi-developer site, multiple database-driven
site.
- Evolution is closely related to usability as,
following Lehman's first law, it can be
anticipated that a large web application must
undergo continuous change or it will become
progressively less useful.
28Key Points
- Measuring and modeling the WWW allows us to study
its structure and contents, and determine quality
factors operationally.
- Classification of Web Sites can give us a better
insight into appropriate measures to study their
evolution.
- Studying Web sites and how they change over time
gives us insights into web design and maintenance
processes and their possible improvement.
29Web Management and Design Processes
- Early process models developed based on UK
government web site management practice
- Hypermedia/Web Development methods and models -
surveyed and classified
- Metrics used here to guide research on process
models for Web Site Engineering
30Web Process Models
- Durham Workbench model for Web Maintenance - 1995
- UK government - CCTA Management Model 1996
- Survey of Hypermedia Design Methods - 1998
- Lowe and Hall's Hypermedia Development Process
Model - 1999
31Early Maintenance Process Model
32CCTA Management Model
[Diagram. Recoverable labels: Web page Development; Customer Awareness; Train in Web page development; External Organization; Database of HTML files; Transfer checked Web pages; Get customer for CCTA; CCTA Customer; CGIS Team; Mirror database of HTML files; Develop Web Pages; Check the page; Guidelines for Web page presentation]
33Hypermedia/Web Development models and methods
Survey
- Design and Development stages from different
methods
- abstracted Hypermedia/Web Development stages
- Relationship hierarchy of various methods
34Method Relationship Hierarchy
Hypermedia Design Methods
EORM Lange
HDM Garzotto
OOHM Hendrix
STDT Bichler
HM-Data Maurer
RMM Isakowitz
Semi-formal methods
OOHDM Schwabe
Web Architect Takahashi
Extended RMM
Database Design Techniques
Object Oriented Techniques
35Stages from Methods
36Hypermedia Development stages
- Feasibility and Requirements Analysis - Requirements document
- Conceptual Design - Conceptual design diagrams, e.g. ER diagrams or classes, etc.
- Navigational Design - Nodes and links, navigational diagrams, e.g. Slices
- User Interface Design - Interface and screen diagrams
- Conversion and Implementation - Finished Product
- Testing and Maintenance
37Lowe and Hall's Development Process Models
- found in their book - Hypermedia and the Web: An
Engineering Approach
- Based on Traditional Software Engineering process
models: the Waterfall, Prototyping, Spiral Model.
- Fullest model includes project planning, risk
management and project management with overall
system architecture, system design and
application partitioning.
38Web Process Improvement and CMM
- Process models, methods, plus metrics provide the
basis for Web Process Improvement
- Companies need guidance on how best to improve
their existing Web development and maintenance
practices
- Web Usability Engineering is a good starting
point as it is key when a company is trying to
attract and keep its Web site customers
39Web Site Engineering projects
- Web-SEM project - Establishing Effective Web Site
Evaluation metrics - building on earlier web
metrics and evolution studies combined with web
usability and quality metrics
- Small projects on web site re-engineering, clone
detection, web site re-use (web-in-a-box), web
site life cycle and automating recurrent
maintenance activities.
40Improving Basic Web Site and Web Product Design
- Popularising the concept of web engineering -
taking a systematic and disciplined approach to
engineering web applications among small and
medium enterprises - Business Informatics project
- Working with the CACDP to migrate all their
products and services to the WWW, helping them to
develop a well-defined engineering approach along
classic SPI/CMM lines - CASTLE project
41Web Site Engineering
- Applying and adapting classical Software
Engineering models, methods and tools to the
engineering of web-based applications, e.g. web
pages, web sites, web applications in general.
- Special case of Distributed System Engineering
- Closely related to Software Engineering but also
recognizing important differences, e.g. periodic
nature of web maintenance, pre-emptive nature of
maintenance, faster rates of change, larger
scales of deployment/usage than in classical
software.
42Towards Web Site Engineering
- Well developed models of Web development and
maintenance processes
- Web Software Quality Determination
- Models and Laws of Web-based Software Evolution
- Support for distributed developers - Computer
Supported Co-operative Working applied to Web
Site Engineering
43Key Points
- Web developers and maintainers can learn from
Software Engineering
- Web metrics can help to evaluate, to describe,
and to develop new approaches to web engineering
processes and products
- Software Engineering can provide a foundation for
Web Engineering
- Web Evolution studies provide guidance
44Software Engineering
Distributed System Engineering
Software Metrics and SPI
Human-Computer Interaction
Hypermedia (Graph) Theory
Usability Metrics
Hypertext Metrics
Open Hypermedia Design
Web-based Software Engineering
45References
- Warren et al, The Evolution of Websites, IWPC99.
- Warren et al, Characterising Evolution in Web
Sites: Some Case Studies, WSE99.
- Boldyreff et al, Web-SEM Project: Establishing
Web Site Evaluation Metrics, WSE2000
- other links from www.dur.ac.uk/cornelia.boldyreff
- Boldyreff et al, Establishing a measurement
Programme for the World Wide Web, SAINT01.
- Kyaw's technical report - Survey of Hypermedia
Design Methods, CS-3-98
- Lavery, Designing Web Site Usability, on-line at
www.dur.ac.uk/janet.lavery
- and other links on Janet's pages