Title: Sustainability: Web Site Statistics
1Sustainability Web Site Statistics
- Marieke Napier
- UKOLN
- University of Bath
- Bath, BA2 7AY
Email: m.napier_at_ukoln.ac.uk URL: http://www.ukoln.ac.uk/
UKOLN is supported by
2Web Site Statistics
- This presentation will
- Give a (very) brief overview of what Web statistics are
- Consider why we need them
- Focus on the analysis of usage data created by your Web site
- Look at what other criteria, besides Web server statistics, can be used to provide performance indication
3What are Web Statistics?
- Web statistics are produced by the Web server software
- Information (such as IP address, name of resource) is recorded in log files
- It is also possible to configure your server to record more information (such as referrer details)
- The log files produced are mainly accurate
- However, interpretation of the statistics can be misleading
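As an illustration of what a log entry contains, here is a minimal sketch that parses one line of a server log. It assumes the server writes the widely used Common Log Format, optionally followed by quoted referrer and user-agent fields; the sample line, the IP address and the name `parse_log_line` are invented for this example and are not taken from any real log.

```python
import re
from datetime import datetime

# Assumed layout (Common Log Format):
#   host ident authuser [date] "request" status bytes
# optionally followed by "referrer" "user-agent" if the server
# has been configured to record them.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_log_line(line):
    """Return a dict describing one log entry, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A size of "-" means no body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    # The request field looks like: GET /path HTTP/1.0
    parts = entry["request"].split()
    entry["resource"] = parts[1] if len(parts) >= 2 else None
    return entry


# Hypothetical sample line (documentation IP address, invented path).
sample = '192.0.2.10 - - [10/Oct/2001:13:55:36 +0000] "GET /nof/index.html HTTP/1.0" 200 2326'
entry = parse_log_line(sample)
print(entry["host"], entry["resource"], entry["status"], entry["size"])
```

The referrer and user-agent fields are `None` unless the server has been configured to log them.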
4Why do we Need Them?
- They indicate how popular your site is
- They show how successful your marketing strategy has been
- They can be used in management reports
- They can identify gaps in service provision
- They help you predict and plan for future load patterns
- They allow you to monitor performance levels
- They can be used in consideration of deployment of new technologies
- They can inform and motivate contributors
- They can show who your users are
- NOF have asked for them
5The HTTP Process
- A user clicks a link or enters a URL
- The browser requests the HTML page from the remote Web server and downloads it
- The HTML page is interpreted and any inline objects are also downloaded
- Each image (occurrence of <IMG SRC="image1">)
- Background image or sound
- External JavaScript or stylesheet files etc.
- The user follows a path through the site making
new requests till they leave your site
Summary: Each individual user's request for a page can produce multiple requests at the remote server and generate multiple hits.
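To make the summary concrete, the sketch below estimates how many requests a single page view would generate by counting the inline objects in a page. It is an approximation under stated assumptions: the sample page and the name `count_requests` are invented, only a few common tags are checked, and each distinct URL is assumed to be fetched once.

```python
from html.parser import HTMLParser


class InlineObjectCollector(HTMLParser):
    """Collect URLs of inline objects a browser would fetch with the page."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag in ("img", "script", "bgsound") and attrs.get("src"):
            self.resources.append(attrs["src"])
        elif tag == "body" and attrs.get("background"):
            self.resources.append(attrs["background"])
        elif (tag == "link" and attrs.get("href")
              and (attrs.get("rel") or "").lower() == "stylesheet"):
            self.resources.append(attrs["href"])


def count_requests(html):
    """Return 1 (the page itself) plus the number of distinct inline objects."""
    collector = InlineObjectCollector()
    collector.feed(html)
    return 1 + len(set(collector.resources))


# Hypothetical page: one stylesheet, one script, one background, two distinct images.
sample_page = """
<html><head>
<link rel="stylesheet" href="style.css">
<script src="menu.js"></script>
</head>
<body background="bg.gif">
<IMG SRC="image1.gif"><IMG SRC="image2.gif"><IMG SRC="image1.gif">
</body></html>
"""
print(count_requests(sample_page))
```

A redesign that adds or removes images therefore changes the hit count even if the number of visitors stays the same.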
6Viewing Web Statistics
- Server log files are available to view but may not make a lot of sense on first look
- The Analog program (Cambridge University) was one of the first packages to provide a graphical summary of Web log files
http://www.statslab.cam.ac.uk/sret1/stats/stats.html
7Web Statistics Terms Used
- Hit
- Any information requested from a site - this includes HTML pages, pictures, forms, scripts and files downloaded
- Can be affected by redesign, robots, caching etc.
- Page Views (or requests/impressions)
- The number of pages viewed
- Usually identified by extensions such as .htm, .html, .asp etc.
- User Sessions
- Series of requests from a unique IP address within a period of time (more accurate if users are registered)
- Issues with firewalls, institutional caches etc.
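The three terms can be compared on the same data. The following is a minimal sketch, assuming requests have already been reduced to (IP address, time, resource) tuples; the 30-minute session timeout, the sample requests and the name `summarise` are assumptions made for illustration, and analysis tools may apply different rules.

```python
from datetime import datetime, timedelta

PAGE_EXTENSIONS = (".htm", ".html", ".asp")
SESSION_TIMEOUT = timedelta(minutes=30)  # assumed value; tools differ


def is_page(resource):
    """Treat directory URLs and known page extensions as page views."""
    path = resource.split("?", 1)[0].lower()
    return path.endswith("/") or path.endswith(PAGE_EXTENSIONS)


def summarise(requests):
    """Count hits, page views and user sessions.

    requests: iterable of (ip, time, resource) tuples.
    A new session starts when an IP address has been silent for
    longer than SESSION_TIMEOUT.
    """
    hits = page_views = sessions = 0
    last_seen = {}
    for ip, when, resource in sorted(requests, key=lambda r: r[1]):
        hits += 1
        if is_page(resource):
            page_views += 1
        previous = last_seen.get(ip)
        if previous is None or when - previous > SESSION_TIMEOUT:
            sessions += 1
        last_seen[ip] = when
    return {"hits": hits, "page_views": page_views, "user_sessions": sessions}


# Hypothetical requests from two documentation IP addresses.
t0 = datetime(2001, 10, 10, 13, 0)
requests = [
    ("192.0.2.10", t0, "/index.html"),
    ("192.0.2.10", t0 + timedelta(seconds=1), "/logo.gif"),
    ("192.0.2.10", t0 + timedelta(minutes=5), "/about.htm"),
    ("192.0.2.20", t0 + timedelta(minutes=10), "/index.html"),
    ("192.0.2.10", t0 + timedelta(hours=2), "/index.html"),
]
summary = summarise(requests)
print(summary)
```

Because sessions are keyed on IP address, many users behind one firewall or institutional cache would be counted as a single session here.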
8Interpretation Issues
- Profiling users - can we track users easily?
- You can't tell the exact identity of your users
- Using IP addresses, domain names of visitors
- Following paths entering and exiting the site
- Registration
- Caching
- Browser caching and institutional/ISP caching
- Robots
- Necessary to enable your resources to be found
- Robots generate hits
- Quality??
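One way to see how much robots inflate the figures is to separate their requests before counting. This is only a rough sketch: it assumes the log records the user-agent, treats any client that requests /robots.txt as a robot, and uses an illustrative (far from complete) list of substrings. The names `ROBOT_MARKERS` and `split_robot_traffic` and the sample entries are invented for this example.

```python
# Illustrative substrings only; real robot lists are much longer.
ROBOT_MARKERS = ("bot", "crawler", "spider", "slurp")


def is_robot(user_agent):
    """Guess from the user-agent string whether the client is a robot."""
    agent = (user_agent or "").lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def split_robot_traffic(entries):
    """Split (ip, resource, user_agent) tuples into (human, robot) lists.

    An IP address is treated as a robot if it ever requests /robots.txt
    or sends a user-agent containing one of ROBOT_MARKERS.
    """
    entries = list(entries)
    robot_ips = {
        ip for ip, resource, agent in entries
        if resource == "/robots.txt" or is_robot(agent)
    }
    human = [e for e in entries if e[0] not in robot_ips]
    robot = [e for e in entries if e[0] in robot_ips]
    return human, robot


# Hypothetical log entries.
entries = [
    ("192.0.2.10", "/index.html", "Mozilla/4.0 (compatible; MSIE 5.5)"),
    ("192.0.2.50", "/robots.txt", "ExampleCrawler/1.0"),
    ("192.0.2.50", "/index.html", "ExampleCrawler/1.0"),
    ("192.0.2.60", "/index.html", "Googlebot/2.1"),
    ("192.0.2.70", "/index.html", None),
]
human, robot = split_robot_traffic(entries)
print(len(human), "human hits,", len(robot), "robot hits")
```

Robots that disguise their user-agent and never request /robots.txt would still be counted as human traffic.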
9Log Analysis Tools
- There are many tools available
- Analog: free, easily automated. However, it has few data-mining capabilities and limited management graphs.
- WebTrends: popular desktop package. Several versions. May be expensive for reporting on multiple Web sites.
- Webalizer, WebVisit, HitList, Reportmagic etc.
- A list is available at http://uk.dir.yahoo.com/Computers_and_Internet/Software/Internet/World_Wide_Web/Servers/Log_Analysis_Tools/
10Externally-Hosted Services
- Two services have been used extensively by UKOLN: SiteMeter (http://www.sitemeter.com/) and NedStat (http://www.nedstats.com/)
- Advantages
- No software to buy, install, configure and run, and no powerful PC needed to run software on
- No log files to manage
- Uses "cache-busting" images
- Can monitor extra features
- Disadvantages
- Limited data-mining
- Loss of ownership of data
- Dependency on external service
- Fails to monitor text browsers
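The idea behind a "cache-busting" image can be sketched as follows: the page embeds a small image whose URL differs on every view, so caches cannot serve a stored copy and the request always reaches the counting service. This is a generic illustration, not a description of how SiteMeter or NedStat work; the counter URL, the parameter names and the function `counter_image_tag` are all hypothetical.

```python
import html
import random
from urllib.parse import urlencode


def counter_image_tag(counter_url, site_id, rng=random):
    """Build an <img> tag whose URL carries a random value.

    counter_url and site_id are hypothetical; a real service would
    supply its own URL and parameters.
    """
    query = urlencode({"site": site_id, "r": rng.randrange(10**9)})
    # Escape the URL so the "&" is valid inside an HTML attribute.
    src = html.escape(f"{counter_url}?{query}", quote=True)
    return f'<img src="{src}" width="1" height="1" alt="">'


print(counter_image_tag("http://counter.example/hit.gif", "ukoln"))
```

Since the image is only requested by browsers that load images, this approach also shows why such services fail to monitor text browsers.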
11Other Performance Indicators
- Links to Your Site
- Indicators that people are interested in your service (and can deliver traffic)
- Search Engine Coverage
- Indicators that users can find resources on your Web site
- User Feedback
- Comments, voting, etc.
- Technical Indicators
- Browser support, broken links, server uptime, etc.
12Links To Your Site
- Links are an indication of potential use of your Web site
- Search engines can be used to report on the number of links to a Web site
- LinkPopularity.com provides an interface to 3 search engines
- Monthly reports can be obtained
http://www.linkpopularity.com
13Coverage By Search Engines
- Have you promoted your Web site?
- Can your Web site be accessed by search engines?
- Are you near the top of the search results?
- Search engines can report on their coverage of your Web site
- Coverage is an indication of potential use of your Web site
For information on how to ensure that your web
site has been indexed see the section on
Promotion of your Project Web
14Technical Indicators
- Broken Links
- How many links are there on your Web site (internal and external)?
- How many broken links are there?
- Use services like linkalarm.com
- Server Availability
- Recording down time
- Email alerting
- Use services like InternetSeer.com
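The broken-link questions above can also be answered with a small script. The sketch below counts internal and external links on one page and lists those that fail a reachability check. It is not how linkalarm.com works; the names `link_report` and `is_reachable` are invented, and the check is passed in as a function so the example can run without network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def link_report(html, page_url, is_reachable):
    """Count internal/external links on a page and list the broken ones.

    is_reachable is a caller-supplied function taking a URL and
    returning True or False (hypothetical; supply your own check).
    """
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    report = {"internal": 0, "external": 0, "broken": []}
    for href in collector.hrefs:
        url = urljoin(page_url, href)
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto: and similar links
        kind = "internal" if parsed.netloc == site else "external"
        report[kind] += 1
        if not is_reachable(url):
            report["broken"].append(url)
    return report


# Hypothetical page and a stand-in check that treats one URL as missing.
page = (
    '<a href="about.html">About</a> '
    '<a href="/missing.html">Old page</a> '
    '<a href="http://www.example.org/">Partner</a> '
    '<a href="mailto:someone@example.org">Mail</a>'
)
report = link_report(
    page,
    "http://www.ukoln.ac.uk/nof/index.html",
    lambda url: not url.endswith("missing.html"),
)
print(report)
```

In practice `is_reachable` could be implemented with an HTTP request and a check of the response status, run at regular intervals so that the same check doubles as a simple record of server availability.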
15Conclusions
- Web statistics can be difficult to interpret
- Analysis of Web statistics is needed for lots of reasons
- Think about the tools you will need (and the resource implications in using them)
- Besides analysis of log files there are other performance indicators which may be of use
- Analysis will also help in monitoring the performance of your Web site and planning future developments
16Any Questions?
- This presentation is loosely based on the Information Paper on Web Site Performance Monitoring available at http://www.ukoln.ac.uk/nof/support/help/papers/performance/