E-Commerce - PowerPoint PPT Presentation

Provided by: cjen2

Learn more at: http://ibook.ics.uci.edu

1
E-Commerce
2
Outline

Introduction
Customer Data on the Web
Automated Recommender Systems
Networks and Recommendations
Web Path Analysis for Purchase Prediction

3
Introduction

Some Motivating Questions
Can we design algorithms to help recommend new
products to visitors based on their browsing
behavior?
Can we better understand factors influencing how
customers make purchases on a website?
Can we predict in real time who will make
purchases based on their observed navigation
patterns?

4
Customer Data on the Web

Data is collected on the client side, the server side, and
anywhere in between
Goal: determine who is purchasing what products
Tracking customer data:
Web logs, e-commerce logs, cookies, explicit
login
Data is then used to provide personalized content to
site users, in order to:
Assist customers in locating their target
selections
Encourage customers to make certain selections

5
Automated Recommender Systems

Problem framed in two ways:
Users vote for pages/items (binary)
Users rank pages/items (multivalued)
Results are captured in a generally sparse matrix
(users x items)
Complication: votes can be missing because users do
not vote on items they do not like (Breese et al.,
1998)
This is ignored by most recommender systems
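
A minimal sketch of how such a sparse users x items matrix might be stored, keeping only the votes that were actually cast. The user names, item names, and vote values are hypothetical. An absent entry here means "unknown", not "zero", which is the distinction the complication above is about.

```python
from collections import defaultdict

# Hypothetical (user, item, vote) observations; most user-item pairs are absent.
observations = [
    ("ann", "book", 5), ("ann", "cd", 3),
    ("bob", "book", 4), ("bob", "dvd", 1),
    ("cat", "cd", 2),
]

def build_vote_matrix(obs):
    """Store only observed votes: votes[user][item] = vote."""
    votes = defaultdict(dict)
    for user, item, vote in obs:
        votes[user][item] = vote
    return votes

def density(votes, n_items):
    """Fraction of the users x items matrix that is actually filled in."""
    filled = sum(len(row) for row in votes.values())
    return filled / (len(votes) * n_items)

votes = build_vote_matrix(observations)
items = {item for row in votes.values() for item in row}
print(density(votes, len(items)))
```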

6
Automated Recommender Systems
7
Evaluating Recommender Systems

Cautions in data interpretation
Users may purchase items regardless of
recommendations
Users may also avoid purchases they might have
made based on recommendations
Approaches to recommender algorithms
Nearest-neighbor
Model-based collaborative filtering
Others?

8
Nearest-Neighbor Collaborative Filtering

Basic principle: use users' vote history to
predict future votes/recommendations
Find the users most similar to the target user in the
training matrix and fill in the target user's
missing vote values based on these
nearest neighbors
A typical normalized prediction scheme:
Goal: predict the vote for item j from the votes of
other users, weighted towards those whose
past votes are similar to those of target user a
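
One common form of such a scheme, sketched under assumptions: the prediction is user a's own mean vote plus a normalized, weighted average of how far each neighbor's vote on item j lies from that neighbor's mean. Pearson correlation is used as the weight here, which is an assumption (the slide does not name the weight), and the data and function names are hypothetical.

```python
from math import sqrt

def mean_vote(row):
    """Average of all votes a user has cast."""
    return sum(row.values()) / len(row)

def pearson_weight(row_a, row_i):
    """Correlation of two users over the items both have voted on."""
    common = set(row_a) & set(row_i)
    if len(common) < 2:
        return 0.0
    mean_a, mean_i = mean_vote(row_a), mean_vote(row_i)
    num = sum((row_a[j] - mean_a) * (row_i[j] - mean_i) for j in common)
    den_a = sum((row_a[j] - mean_a) ** 2 for j in common)
    den_i = sum((row_i[j] - mean_i) ** 2 for j in common)
    if den_a == 0 or den_i == 0:
        return 0.0
    return num / sqrt(den_a * den_i)

def predict_vote(votes, a, j):
    """Predict user a's vote on item j from other users who voted on j."""
    mean_a = mean_vote(votes[a])
    weighted, norm = 0.0, 0.0
    for i, row in votes.items():
        if i == a or j not in row:
            continue
        w = pearson_weight(votes[a], row)
        weighted += w * (row[j] - mean_vote(row))
        norm += abs(w)
    if norm == 0:
        return mean_a  # no usable neighbors: fall back to a's own mean
    return mean_a + weighted / norm

# Hypothetical votes on a 1-5 scale; user "a" has not voted on item "w".
example_votes = {
    "a": {"x": 5, "y": 1},
    "b": {"x": 4, "y": 2, "w": 5},
    "c": {"x": 1, "y": 5, "w": 2},
}
print(predict_vote(example_votes, "a", "w"))
```

In this toy data user b agrees with a and liked w, while user c disagrees with a and disliked w, so both neighbors push the prediction above a's mean.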

9
Nearest-Neighbor Collaborative Filtering

Another challenge: defining weights
What is the best weight calculation to
use?
Requires fine-tuning of the weighting algorithm for
the particular data set
What do we do when the target user has not voted
enough to provide a reliable set of
nearest neighbors?
One approach: use default votes (popular items)
to populate the matrix on items that neither the target
user nor the nearest neighbor has voted on
A different approach: model-based prediction
using Dirichlet priors to smooth the votes (see
chapter 7)
Other factors include relative vote counts for
all items between users, thresholding, clustering
(see Sarwar, 2000)

10
Nearest-Neighbor Collaborative Filtering

Structure-based recommendations
Recommendations based on similarities between
items with positive votes (as opposed to votes of
other users)
Structure of item dependencies modeled through
dimensionality reduction via singular value
decomposition (SVD), also known as latent semantic indexing
(see chapter 4)
Approximate the set of row-vector votes as a
linear combination of basis column-vectors,
i.e., find the set of columns that minimizes, in the
least-squares sense, the difference between the row
estimates and their true values
Perform nearest-neighbor calculations to project
predictions for all items
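
A rough sketch of the low-rank idea, simplified to a single basis vector: it finds the leading right singular vector of a small vote matrix by power iteration and rebuilds every row as a multiple of that vector, which is the least-squares best rank-1 approximation. Treating missing votes as 0 and stopping at rank 1 are simplifying assumptions made here; a full SVD would keep several basis vectors, and the nearest-neighbor step from the slide is not shown. The matrix and function names are hypothetical.

```python
from math import sqrt

def top_right_singular_vector(A, iters=200):
    """Power iteration on A^T A; returns a unit-length vector v."""
    n = len(A[0])
    v = [1.0 / sqrt(n)] * n
    for _ in range(iters):
        Av = [sum(a * x for a, x in zip(row, v)) for row in A]
        w = [sum(A[r][c] * Av[r] for r in range(len(A))) for c in range(n)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def rank1_scores(A):
    """Rebuild each row as (row . v) * v, the best rank-1 fit in least squares."""
    v = top_right_singular_vector(A)
    scores = []
    for row in A:
        coeff = sum(a * x for a, x in zip(row, v))
        scores.append([coeff * vc for vc in v])
    return scores

# Hypothetical binary votes: rows are users, columns are items; 0 = no vote.
VOTES = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]
scores = rank1_scores(VOTES)
# For user 0, rank the two unvoted items (columns 2 and 3) by their scores.
print(scores[0][2], scores[0][3])
```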

11
Model Based Collaborative Filtering

Recommendations based on a model of
relationships between items based on historical
voting patterns in the training set
Better performance than nearest-neighbor analysis
Joint distribution modeling
Uses one model as basis for predictions
Conditional distribution modeling
A model for each item predicting future vote
based on votes for each of the other items

12
Model Based Collaborative Filtering

Joint distribution modeling: a practical approach
Model the joint distribution as a finite mixture of
simpler distributions
Additional simplification is achieved by assuming
that votes are independent of each other within a
component
Limitation: assumes that each user can be described
by a single one of the K mixture components
Hofmann and Puzicha (1999) propose a workaround,
asserting that each row of votes represents up to
K mixture components, rather than a single
component
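
A minimal sketch of prediction with the basic (single-component-per-user) mixture, assuming binary votes and already-known parameters: because votes are independent within a component, the likelihood of a user's observed votes under component k is a simple product, which gives a posterior over components and then a predicted probability for an unseen item. The parameter values and item names are hypothetical, and fitting the parameters (for example with EM) is not shown.

```python
def component_posterior(observed, weights, theta):
    """P(component k | observed votes), using independence within a component."""
    scores = []
    for pi_k, theta_k in zip(weights, theta):
        likelihood = pi_k
        for item, vote in observed.items():
            p = theta_k[item]
            likelihood *= p if vote == 1 else (1.0 - p)
        scores.append(likelihood)
    total = sum(scores)
    return [s / total for s in scores]

def predict_positive_vote(observed, item, weights, theta):
    """P(vote on `item` is positive | observed votes)."""
    post = component_posterior(observed, weights, theta)
    return sum(p_k * theta_k[item] for p_k, theta_k in zip(post, theta))

# Hypothetical model with K = 2 components.
weights = [0.5, 0.5]  # mixing weights
theta = [             # theta[k][item] = P(positive vote | component k)
    {"scifi": 0.9, "fantasy": 0.8, "cooking": 0.1},
    {"scifi": 0.1, "fantasy": 0.2, "cooking": 0.9},
]
observed = {"scifi": 1}  # the user voted positively on one item
print(predict_positive_vote(observed, "fantasy", weights, theta))
print(predict_positive_vote(observed, "cooking", weights, theta))
```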

13
Model Based Collaborative Filtering

Another limitation: all predictions are based on
the (static) training set
Conditional distribution modeling
Better results by creating a model for each item
conditioned on the others rather than using a
single joint density model
Decision trees: Heckerman et al. (2000)
Greedy approach to approximate tree structure
Predictions are made for each item not purchased
or visited
Performance
Accuracy nearly equal to Bayesian networks
Offline memory usage significantly less than
Bayesian networks
Offline computation time complexity better than
Bayesian networks

14
Model-Based Combining of Votes and Content

Combine content-specific information with other
information (e.g. structure, votes)
Useful for determining item similarity (Mooney
and Roy 2000) and creating user models
Useful when there is no vote history
Implementation (Popescul et al 2000)
Extension of (Hofmann and Puzicha 1999)
Joint density is determined by assuming a hidden
(latent) variable that makes users, documents, and
words conditionally independent
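
Written out, conditional independence given a hidden variable implies a factorization of the following form. This is a reconstruction from the statement above rather than a copy of the slide's formula, and the symbols u (user), d (document), w (word), and z (hidden variable) are assumed notation.

```latex
P(u, d, w) \;=\; \sum_{z} P(z)\, P(u \mid z)\, P(d \mid z)\, P(w \mid z)
```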

15
Model-Based Combining of Votes and Content

The hidden variable represents multiple (hidden)
topics of a document
Conditional probabilities of the hidden parameter
are calculated using EM
Sparsity still remains a problem for
content-based modeling

16
Challenges

Noisy Data
The same user may use multiple IP
addresses/logins
Different users may use the same IP address/login
Privacy
No cookies!
Changing user habits
Previous history may not accurately predict
present purchase selection
Continuous updating of user activities

17
Networks and Recommendations

Word-of-Mouth
Needs little explicit advertising
Products are recommended to friends, family,
co-workers, etc.
This is the primary form of advertising behind
the growth of Google

18
Email Product Recommendation

Hotmail
Very little direct advertising in the beginning
Launched in July 1996
20,000 subscribers after a month
100,000 subscribers after 3 months
1,000,000 subscribers after 6 months
12,000,000 subscribers after 18 months
By April 2002 Hotmail had 110 million subscribers

19
Email Product Recommendation

What was Hotmail's primary form of advertising?
Small link to the sign up page at the bottom of
every email sent by a subscriber
Spreading Activation
Implicit recommendation

20
Spreading Activation

Network effects
Even if only a small fraction of the people who receive the
message subscribe (0.1%), the service will
spread rapidly
This can be contrasted with the current practice
of SPAM
SPAM is not sent by friends, family, co-workers
No implicit recommendation
SPAM is often viewed as not providing a good
service

21
Modeling Spreading Activation

Diffusion Model
Montgomery (2002)
Applied models from the marketing literature, such as Bass
(1969), to the Hotmail phenomenon
Similar word-of-mouth networks used in selling
consumer electronics such as refrigerators and
televisions
We want to predict at time t how many individuals
k(t) will adopt the product out of a population
of N possible adopters

22
Modeling Spreading Activation

Diffusion Model
Two ways individuals will subscribe:
Direct Advertising
At time t, N - k(t) individuals have not
subscribed
α ≥ 0 percent of these individuals will subscribe
due to direct advertising
Word-of-Mouth
At time t, there are k(t)(N - k(t)) possible
connections between subscribers and
non-subscribers
β ≥ 0 percent of these connections will cause a
non-subscriber to subscribe

23
Modeling Spreading Activation

Combine these and we get the following
expression
Solve this and we get
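
A reconstruction of the two expressions, derived from the direct-advertising and word-of-mouth terms defined on the previous slide: the rate of new subscriptions is the sum of the two terms, and the closed form is the solution of that equation. The initial condition k(0) = 0 is an assumption, and α and β are treated as rates per unit time.

```latex
\frac{dk(t)}{dt} \;=\; \alpha\,\bigl(N - k(t)\bigr) \;+\; \beta\,k(t)\,\bigl(N - k(t)\bigr)

k(t) \;=\; N\,\frac{1 - e^{-(\alpha + \beta N)\,t}}{1 + \dfrac{\beta N}{\alpha}\,e^{-(\alpha + \beta N)\,t}},
\qquad k(0) = 0
```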

24
Modeling Spreading Activation
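
A minimal numerical sketch of the diffusion model, assuming the adoption rate is the sum of the two mechanisms described earlier, dk/dt = alpha (N - k) + beta k (N - k), with k(0) = 0. The parameter values are hypothetical and are not fitted to the Hotmail figures; alpha and beta are treated as fractions per unit time rather than percentages.

```python
def adoption_rate(k, n, alpha, beta):
    """dk/dt: direct-advertising term plus word-of-mouth term."""
    return alpha * (n - k) + beta * k * (n - k)

def simulate(n, alpha, beta, t_end, dt=0.001):
    """Forward-Euler integration of k(t), starting from k(0) = 0."""
    k = 0.0
    for _ in range(round(t_end / dt)):
        k += dt * adoption_rate(k, n, alpha, beta)
    return k

# Hypothetical population size and rates.
N, ALPHA, BETA = 1000, 0.01, 0.001
for t in (1, 5, 10, 20):
    print(t, round(simulate(N, ALPHA, BETA, t)))
```

With a small alpha the curve starts slowly, accelerates once word of mouth dominates, and then levels off as the number of remaining non-subscribers shrinks.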
25
Modeling Spreading Activation
26
Modeling Spreading Activation

Diffusion Model
This does not completely model what actually
occurred
However, it is simple and provides a lot of
interesting (useful) information
Other work
Domingos and Richardson (2001): Markov random field
model
Daley and Gani (1999): various deterministic and
stochastic models

27
Purchase Prediction

We want to predict whether or not a shopper will
make a purchase
We know demographics
We know page view patterns
Can we accurately predict if the user will make a
purchase or not?

28
Purchase Prediction

Li et al. (2002)
Studied 1,160 shoppers at www.barnesandnoble.com
between April 1 and April 30, 2002
The data was collected client side so they knew
exactly what pages were displayed to the user
They also knew the demographics (predominantly
well-educated and affluent)

29
Purchase Prediction

Li et al. (2002)
There were 14,512 page views, which they divided
into 1,659 sessions
Page views per session:
Mean 8.75
Median 5
Standard deviation 16.4
Min 1
Max 570
7% of sessions contained a purchase

30
Purchase Prediction

Li et al. (2002)
Divided the pages into 8 classes
Home (H), main page
Account (A), account information pages
List (L), pages with lists of items
Product (P), page with a single item
Information (I), informational pages (shipping
etc.)
Shopping cart (S)
Order (O), indicates a completed order
Entry or Exit (E), entering or leaving the site

31
Purchase Prediction

Li et al. (2002)
Each session was represented by a string of the
form I H H I I L I I E
A session containing an O is considered to include
a purchase
The average length of a session with a purchase
was 34.5 page views; without a purchase it was only 6.8
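
A small sketch of how sessions in this string form could be labeled and measured. Only the first session string comes from the slide; the other two, and the function names, are hypothetical.

```python
def has_purchase(session):
    """A session counts as a purchase if it contains an Order (O) page."""
    return "O" in session.split()

def mean_length(sessions):
    """Average number of page views per session."""
    return sum(len(s.split()) for s in sessions) / len(sessions)

sessions = [
    "I H H I I L I I E",  # example string from the slide
    "H L P P S A O E",    # hypothetical session ending in an order
    "H L E",              # hypothetical short browsing session
]
buyers = [s for s in sessions if has_purchase(s)]
browsers = [s for s in sessions if not has_purchase(s)]
print(mean_length(buyers), mean_length(browsers))
```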

32
Purchase Prediction

Markov transition matrix
For sessions with no purchase
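
A sketch of how a first-order Markov transition matrix over the eight page classes could be estimated from session strings, by counting consecutive page pairs and normalizing each row. The second session and the helper names are hypothetical, and the resulting numbers are not the ones from the study.

```python
from collections import Counter, defaultdict

PAGE_CLASSES = "HALPISOE"  # Home, Account, List, Product, Info, Shopping cart, Order, Entry/Exit

def transition_matrix(sessions):
    """matrix[a][b] = estimated P(next page class is b | current class is a)."""
    counts = defaultdict(Counter)
    for s in sessions:
        pages = s.split()
        for cur, nxt in zip(pages, pages[1:]):
            counts[cur][nxt] += 1
    matrix = {}
    for cur in PAGE_CLASSES:
        total = sum(counts[cur].values())
        matrix[cur] = {
            nxt: (counts[cur][nxt] / total if total else 0.0)
            for nxt in PAGE_CLASSES
        }
    return matrix

def most_likely_next(matrix, cur):
    """Predict the next page class as the most probable transition from cur."""
    return max(matrix[cur], key=matrix[cur].get)

example_sessions = [
    "I H H I I L I I E",  # example string from the slides
    "H L P E",            # hypothetical
]
P = transition_matrix(example_sessions)
print(P["I"])
print(most_likely_next(P, "P"))
```

Rows for page classes that never appear as a starting point in the data are left as all zeros here; a real model would need some smoothing for those.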

33
Purchase Prediction

Li et al. (2002)
They built several models based on this data
Tested on predicting the next page and predicting a
purchase
Best models were 64% accurate at predicting the next page
After 2 page views the best models predicted 12%
true positives and 5.3% false positives
After 6 page views: 13.1% true positives and 2.9%
false positives
