Web Usage Mining for Website Design Improvement

Transcript and Presenter's Notes

Title: Web Usage Mining for Website Design Improvement


1
Web Usage Mining for Website Design Improvement
  • I-Hsien (Derrick) Ting
  • Department of Computer Science
  • The University of York
  • 3/November/2006

2
Outline
  • 1. Introduction
  • The KDD Process for Website Design Improvement
  • 2. The Step 1 of the KDD Process
  • 3. The Step 2 of the KDD Process
  • 4. The Step 3 and Step 4 of the KDD Process
  • 5. Closing the KDD Loop
  • 6. An Empirical Study
  • 7. Conclusion

3
1. Introduction (1)
  • The concept behind the research project

4
1. Introduction Cont.
  • Website design is an important success factor for
    a website
  • Understanding the browsing behaviour of users is
    a good way to improve a website's design
  • Web Usage Mining is a technique that can help to
    understand users' browsing behaviour (Kohavi 2001)

Figure: An example of website design improvement
(Kohavi 2003)
5
1. Introduction Cont.
  • The KDD (Knowledge Discovery in Databases) Process
    for Website Design Improvement

Figure: The KDD Process for Website Design
Improvement (Lee et al. 2001)
6
1. Introduction Cont.
  • The aims of the research
  • Developing and using web usage mining techniques
  • To understand the browsing behaviour of users
  • To improve website design
  • Closing the KDD loop
  • Few researchers focus on how to close the KDD
    loop, and the loop has not yet been closed (Kohavi
    2001, Ansari 2001)

7
2. The Step 1 of the KDD Process-Data Collection
  • The Clickstream data format
  • 83.151.206.241 - - [04/Dec/2004:18:10:35 +0000]
    "GET /storedetail-2-product_id-305054 HTTP/1.1"
    200 31898 "http://www.google.com/search"
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"

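A minimal sketch of how a record in this format could be split into fields, assuming the log follows the common Apache "combined" layout (host, identity, user, timestamp, request, status, size, referrer, user agent). The names LOG_PATTERN and parse_line are illustrative and do not come from the presentation.

```python
import re
from datetime import datetime

# Regular expression for one line of an Apache-style "combined" access log.
# Assumption: the clickstream data on the slide uses this layout.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    # Convert the timestamp and status code into typed values.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    return record


sample = (
    '83.151.206.241 - - [04/Dec/2004:18:10:35 +0000] '
    '"GET /storedetail-2-product_id-305054 HTTP/1.1" '
    '200 31898 "http://www.google.com/search" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"'
)
record = parse_line(sample)
```

The referrer and user-agent fields are the ones the later pre-processing steps (bot cleaning, path restoration) rely on.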
8
2. The Step 1 of the KDD Process Cont.-Data
Pre-processing
  • Raw clickstream data is usually full of noise and
    of incomplete and unnecessary records
  • Common Data Pre-processing Process

Figure: A Common Data Pre-Processing Process
(Cooley et al., 1999; Eirinaki et al., 2003)
9
2. The Step 1 of the KDD Process Cont.-Data
Pre-processing (Novel Contribution)
  • Problems with the common data pre-processing
    process
  • Many accesses are made by bots
  • Data is lost due to caching
  • Backward browsing
  • The modified data pre-processing process

Figure: The modified data pre-processing process
10
2. The Step 1 of the KDD Process Cont.-Data
Pre-processing
  • Bot detection and cleaning
  • A list of known bots
  • Typical bot behaviour
  • The user requests the robots.txt file
  • Different web pages are requested at the same time
  • A low percentage of the clickstream records carry
    a referrer
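The checks listed above could be combined into a simple session filter. This is only a sketch: the bot list, the thresholds and the record layout (dicts with "time", "path", "referrer" and "agent" keys) are all assumptions made for illustration, not values from the presentation.

```python
from collections import Counter

# Hypothetical list of user-agent fragments that identify known bots.
KNOWN_BOT_AGENTS = ("googlebot", "slurp", "msnbot")


def looks_like_bot(requests, max_same_second=5,
                   min_referrer_ratio=0.1, min_requests=10):
    """Flag a session (a list of request dicts) that behaves like a bot.

    All thresholds are illustrative assumptions.
    """
    if not requests:
        return False

    # Check 1: the user agent appears on the bot list.
    for request in requests:
        agent = request["agent"].lower()
        if any(name in agent for name in KNOWN_BOT_AGENTS):
            return True

    # Check 2: the session asked for robots.txt.
    if any(r["path"].lower().endswith("robots.txt") for r in requests):
        return True

    # Check 3: many different pages requested at the same timestamp.
    distinct = {(r["time"], r["path"]) for r in requests}
    pages_per_second = Counter(time for time, _ in distinct)
    if max(pages_per_second.values()) > max_same_second:
        return True

    # Check 4: few requests carry a referrer (only for longer sessions,
    # so that a single direct visit is not flagged).
    if len(requests) >= min_requests:
        with_referrer = sum(1 for r in requests
                            if r["referrer"] not in ("", "-"))
        if with_referrer / len(requests) < min_referrer_ratio:
            return True

    return False
```

Sessions for which looks_like_bot returns True would be dropped before path restoration and pattern discovery.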

11
2. The Step 1 of the KDD Process Cont.-Data
Pre-processing
  • The PRM Algorithm
  • It can help to reconstruct lost clickstream
    data and incomplete user browsing paths
  • The Algorithm
  • Checking the referrer information
  • Checking the website's structure

Server-side Clickstream data: Page1.htm → Page2.htm →
Page3.htm
Client-side Clickstream data: Page1.htm → Page2.htm →
Page1.htm → Page3.htm
(Cooley, R. et al. 1999)
12
2. The Step 1 of the KDD Process Cont.-Data
Pre-processing
  • An example of the PRM algorithm

(*) 61.59.121.221, 16:02:49, http://www-users.cs.york.ac.uk/kimble/, -, Restored
(1) 61.59.121.221, 16:02:54, http://www-users.cs.york.ac.uk/kimble/research/research.html, http://www-users.cs.york.ac.uk/kimble/
(2) 61.59.121.221, 16:03:00, http://www-users.cs.york.ac.uk/kimble/teaching/teach.html, http://www-users.cs.york.ac.uk/kimble/

(*) 61.59.121.221, 16:02:49, http://www-users.cs.york.ac.uk/kimble/, -, Restored
(1) 61.59.121.221, 16:02:54, http://www-users.cs.york.ac.uk/kimble/research/research.html, http://www-users.cs.york.ac.uk/kimble/, Original
(*) 61.59.121.221, 16:02:57, http://www-users.cs.york.ac.uk/kimble, -, Restored
(2) 61.59.121.221, 16:03:00, http://www-users.cs.york.ac.uk/kimble/teaching/teach.html, http://www-users.cs.york.ac.uk/kimble/, Original
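The referrer check in the example can be sketched as follows. This is not the PRM algorithm itself, only an illustration of its first step under stated assumptions: a session is a time-ordered list of (page, referrer) pairs, a page is restored whenever a record's referrer belongs to the site but differs from the previously visited page, and the website-structure check and the timestamps of restored records are left out. The names restore_path and site_prefix are hypothetical.

```python
def restore_path(records, site_prefix):
    """Insert 'Restored' page views implied by the referrer field.

    records: time-ordered list of (page, referrer) pairs for one session.
    site_prefix: only referrers starting with this prefix are restored,
    so external referrers such as search engines are ignored.
    """
    path = []
    previous_page = None
    for page, referrer in records:
        referrer_on_site = bool(referrer) and referrer.startswith(site_prefix)
        # The user must have been on the referrer page just before this
        # request; if the log does not show that, the view was lost
        # (for example to the browser cache) and is put back.
        if referrer_on_site and referrer != previous_page:
            path.append((referrer, "Restored"))
        path.append((page, "Original"))
        previous_page = page
    return path


home = "http://www-users.cs.york.ac.uk/kimble/"
session = [
    (home + "research/research.html", home),
    (home + "teaching/teach.html", home),
]
restored = restore_path(session, "http://www-users.cs.york.ac.uk/")
```

For the two original records of the example this yields the home page, the research page, the home page again (the backward step) and the teaching page, in that order.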
13
3. The Step 2 of the KDD Process-Pattern
Discovery and Analysis
  • Web usage mining techniques
  • Basic statistical methods (Srivastava et al.,
    2000)
  • The web page index.htm has been viewed on average
    20 times per week
  • Clustering and Classification
  • Grouping users who have similar browsing
    behaviour
  • Association Rule Mining
  • Users who view index.htm also view product.htm
    (support = 0.5, confidence = 0.6)
  • Sequential Mining
  • The browsing behaviour of 30 users follows the
    sequential pattern: web page A, web page B, then
    web page C

14
3. The Step 2 of the KDD Process Cont.- Pattern
Discovery and Analysis
  • Novel Web Usage Mining Techniques
  • Footstep Graph
  • A Clickstream data visualisation tool
  • APD (Automatic Pattern Discovery) Method
  • Discovering some pre-identified patterns
    automatically
  • Distance-based Association Rule Mining
  • Distance: the third measure in association
    rule mining

15
3. The Step 2 of the KDD Process Cont.- Pattern
Discovery and Analysis
  • A visualisation tool: Footstep Graph

Figure: Footstep graph (axes: Route, Time) for the route
A.htm → B.htm → C.htm → D.htm → C.htm → E.htm
16
3. The Step 2 of the KDD Process Cont.-APD
Method: Some Interesting Patterns
Mountain Pattern
Upstairs Pattern
Index → product1 → product2 → shopping cart → checkout
Index → product_index → product1 → product1_price → product1_price → product1 → product_index
Valley Pattern
Fingers Pattern
Index → product1 → index → product2 → index → product3
Index → product1 → product2 → product3 → index → product4
17
3. The Step 2 of the KDD Process Cont. The APD
Method
  • An automatic way to discover pre-identified
    patterns
  • Transformation of users' browsing routes
  • Transforming a user's browsing route into a
    number-based sequence

User's browsing route: 0, 10, 0, 20, 0, 30, 0, 40, 0
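One way such a transformation could work is sketched below. The presentation does not state how the numbers are assigned, so the numbering rule here is an assumption: each page receives the next unused integer the first time it is visited, which fits the example sequence 0 1 2 0 1 3 4 0 5 0 6 0 7 used later in the deck. The function name route_to_sequence is illustrative.

```python
def route_to_sequence(route):
    """Map a list of visited pages to a number-based sequence.

    Assumption: a page gets the next unused integer on its first visit
    and keeps that number on every later visit.
    """
    page_numbers = {}
    sequence = []
    for page in route:
        if page not in page_numbers:
            page_numbers[page] = len(page_numbers)
        sequence.append(page_numbers[page])
    return sequence


fingers_route = ["Index", "product1", "Index", "product2", "Index", "product3"]
fingers_sequence = route_to_sequence(fingers_route)
```

A route that keeps returning to the index page, as in the Fingers pattern, then shows up as a sequence that keeps dropping back to 0.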
18
3. The Step 2 of the KDD Process Cont.-The APD
Method
  • Level-1 and Level-2 user browsing elements
  • Level-1 elements
  • Browsing Trend
  • Same: (0, 0), (1, 1)
  • Up: (1, 2)
  • Down: (2, 1), (7, 0)
  • Level-2 elements
  • Turning Point
  • Peak: (Up, Down)
  • Trough: (Down, Up)
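The two levels of elements above can be derived from a number-based sequence as in the following sketch. It assumes that Level-1 trends are read from consecutive pairs of numbers and that Level-2 turning points are read from consecutive pairs of trends; the function names are illustrative.

```python
def level1_trends(sequence):
    """Label each consecutive pair of numbers as Same, Up or Down."""
    trends = []
    for current, following in zip(sequence, sequence[1:]):
        if current == following:
            trends.append("Same")
        elif current < following:
            trends.append("Up")
        else:
            trends.append("Down")
    return trends


def level2_turning_points(trends):
    """Label each consecutive pair of trends as Peak or Trough, if any."""
    points = []
    for first, second in zip(trends, trends[1:]):
        if first == "Up" and second == "Down":
            points.append("Peak")
        elif first == "Down" and second == "Up":
            points.append("Trough")
    return points


trends = level1_trends([0, 1, 2, 1, 7, 0])
points = level2_turning_points(trends)
```

Counting and ordering the peaks and troughs is what would let pre-identified shapes such as Mountain or Fingers be matched automatically.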

19
3. The Step 2 of the KDD Process Cont.-The APD
Method
  • An example

0 1 2 0 1 3 4 0 5 0 6 0 7
0 1 0 7
20
3. The Step 2 of the KDD Process Cont.-The APD
Method
21
3. The Step 2 of the KDD Process
Cont.-Distance-based Association Rule Mining
Short Stairs Pattern
Long Stairs Pattern
Short Fingers Pattern
Long Fingers Pattern
22
3. The Step 2 of the KDD Process
Cont.-Distance-based Association Rule Mining
  • To discover the association between web pages
  • E.g. people who view the University home page
    then view the Computer Science Department page
    (Rule A)
  • Support: Rule A / All sessions
  • Confidence: Rule A / All rules from the University
    home page
  • Distance: from the University home page to the CS
    Department page
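The three measures could be computed per rule roughly as follows. This is a sketch under assumptions that the slides do not spell out: a session matches Rule A when the second page is visited at some point after the first, confidence is taken over the sessions that contain the first page, and distance is the number of clicks between the two pages, averaged over matching sessions. The name rule_measures and the page labels are hypothetical.

```python
def rule_measures(sessions, page_a, page_b):
    """Return (support, confidence, mean distance) for the rule A -> B.

    sessions: list of sessions, each a list of pages in visit order.
    """
    sessions_with_a = 0
    distances = []
    for session in sessions:
        if page_a not in session:
            continue
        sessions_with_a += 1
        start = session.index(page_a)
        if page_b in session[start + 1:]:
            end = session.index(page_b, start + 1)
            # Distance: number of clicks needed to get from A to B.
            distances.append(end - start)

    matches = len(distances)
    support = matches / len(sessions) if sessions else 0.0
    confidence = matches / sessions_with_a if sessions_with_a else 0.0
    mean_distance = sum(distances) / matches if matches else None
    return support, confidence, mean_distance


example_sessions = [
    ["home", "cs"],
    ["home", "library", "cs"],
    ["home", "library"],
    ["news"],
]
support, confidence, mean_distance = rule_measures(example_sessions, "home", "cs")
```

A rule with high support and confidence but a large distance points at two pages that users often want together yet have to click a long way between, which is the case a design change such as a direct link would address.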

Distance = 10
Figure: The concept of distance measurement in
association rule mining
23
3. The Step 2 of the KDD Process-Distance-based
Association Rule Mining
Top: pages that people who view the University's home page
also view (Frequency > 10 and Distance > 5)
24
4. The Step 3 and Step 4 of the KDD
Process-Recommendation and Action
  • Recommendation
  • The analysis results must be reviewed from
    different aspects
  • Three ways to generate recommendations
  • Automatically
  • Semi automatically
  • Manually

Figure: The process of generating actionable
recommendations (adapted and modified from
Perkowitz and Etzioni 2000)
25
4. The Step 3 and Step 4 of the KDD
Process-Action
  • Action
  • Actionable recommendation
  • Cost
  • Appropriate techniques
  • Valuable or interesting enough for the website
  • Improving the design of website
  • Modifying the content of a web page
  • Adding or removing Links
  • Changing the layout of the web page
  • Changing the structure of the web site
  • Completely redesigning the website

26
4. The Step 3 and Step 4 of the KDD
Process-Recommendation and Action
  • A heuristic for website design improvement
  • Based on APD and Distance-based Association Rule
    Mining

Figure: A sample heuristic for website design
improvement
27
5. Closing the KDD Loop
  • Closing the KDD Loop
  • Sequentially
  • The four steps of the KDD process must be done
    step by step
  • Completely
  • All of the four steps must be done completely
  • Smoothly
  • No gaps between any two steps of the KDD
    process

Figure: The KDD Process for Website Design
Improvement
28
6. Empirical Study- Channel 6 Website
  • We collaborated with an e-commerce website design
    company
  • Channel 6 Multimedia
  • Analysis target: the Channel 6 website
    (http://www.ch-6.co.uk)
  • The company provided us with clickstream data
    covering a period of 6 months
  • A website designer was available for discussion

29
6. Empirical Study-Potential Problem of the
website
30
(No Transcript)
31
Service page-Solutions
Service index page
Service page-Microstyle
32
6. Empirical Study-Recommendation and Taking
Action
  • Recommendation
  • Is it possible to provide a cross-linking table at
    the top or bottom of each service-related page?
  • Action

Channel 6 Website: http://www.ch-6.co.uk/services.asp
33
6. Empirical Study- Performance Evaluation
  • Evaluation Criteria
  • Distance
  • The number of Fingers or Downstairs patterns
  • Results

34
6. Empirical Study-Performance Evaluation
35
Distance
T-test: Average change = 1.82, S² = 2.773247059,
T = 4.636746577, P = 1.740 (significance level = 0.05).
The distance after the website was changed is
shorter than before, and the difference is
significant at this level.
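The reported T value can be checked with the usual paired t statistic, the mean change divided by the square root of the variance over the sample size. The sample size is not given on the slide; n = 18 is an assumption, chosen because 1.740 is the one-tailed 5% critical value of Student's t distribution with 17 degrees of freedom. The function name paired_t_statistic is illustrative.

```python
import math


def paired_t_statistic(mean_change, variance, n):
    """t = mean of the differences / sqrt(sample variance / n)."""
    return mean_change / math.sqrt(variance / n)


# Values from the slide; n = 18 is an assumption (see the note above).
t_value = paired_t_statistic(mean_change=1.82, variance=2.773247059, n=18)
critical_value = 1.740
is_significant = t_value > critical_value
```

Under that assumption the statistic comes out at roughly 4.64, in line with the T reported on the slide and well above 1.740.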
36
Fingers and Mountain Pattern
T-test: Average change = 32.67778, T = 4.597186,
P = 1.740 (significance level = 0.05). The
percentage of Fingers and Mountain patterns after
the website was changed is lower than before,
and the difference is significant at this level.
37
7. Conclusion
  • Web usage mining helps in understanding the
    browsing behaviour of users
  • A website's design can be improved through the
    KDD process for website design improvement
  • The techniques developed in this research
    can be treated as a toolkit that helps a
    website to improve its design.
  • This research provides one way to close the KDD
    loop

38
Thanks for Your Attention. Any Questions?