Inferring Demographic Attributes of Anonymous Internet Users

1
Topic
Inferring Demographic Attributes of Anonymous
Internet Users
Web Mining Seminar Qian, Jun
2
Structure
  • 1 Abstract
  • 2 Introduction
  • 3 Approach
  • 4 Conclusion

3
Abstract
  • Anonymous Internet users
  • Demographic attributes for advertising
  • Usage information
  • Latent Semantic Analysis
  • Neural Model

5
The Problem
  • Web advertisers want to target customers with
    certain demographic attributes
  • Most Internet users are anonymous

6
The Solution
  • 1 Post the ad on relevant websites
  • 2 Wait for users' search terms
  • 3 Conduct a survey
  • Anyway...

7
Target Of This Research
  • Build a high-quality database to establish the
    possibility of inferring up to 6 demographic
    factors for users whose demographic information
    is not otherwise available

8
Methodology
  • Collect usage information
  • Prepare usage information using LSA
  • Create a neural model

9
LSA Overview
  • An information retrieval technique that represents
    text as vectors
  • Here: create a single vector representing an
    Internet user of interest
  • The user vector combines document vectors with a
    pseudo-document vector

10
Vector-space Information Retrieval
  • Documents are vectors of terms: $d = (t_1, t_2, \dots, t_n)$
  • A query is a vector of terms as well:
    $q = (t_1, t_2, \dots, t_n)$
  • Term-by-document matrix: rows are terms, columns
    are documents
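
The vector-space idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three toy documents, the raw term counts used as weights, and the helper names (`build_term_document_matrix`, `query_vector`, `cosine`) are made up here and do not come from the presentation.

```python
import math
from collections import Counter

def build_term_document_matrix(documents):
    """Return (terms, matrix) where matrix[i][j] counts term i in document j."""
    counts = [Counter(doc.lower().split()) for doc in documents]
    terms = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in terms]
    return terms, matrix

def query_vector(terms, query):
    """Represent a query in the same term space as the documents."""
    counts = Counter(query.lower().split())
    return [counts[term] for term in terms]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy documents, invented for illustration only.
documents = [
    "cheap flights and hotel deals",
    "football scores and football news",
    "hotel reviews and travel tips",
]
terms, matrix = build_term_document_matrix(documents)

# Each document is one column of the term-by-document matrix.
columns = [list(col) for col in zip(*matrix)]
q = query_vector(terms, "hotel deals")
scores = [cosine(q, col) for col in columns]
best = max(range(len(documents)), key=lambda j: scores[j])
```

Rows correspond to terms and columns to documents, so a query is compared against each column. Real systems usually apply a term weighting scheme rather than raw counts.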

11
The Singular Value Decomposition (SVD)
  • Decompose the $t \times d$ term-by-document matrix $A$,
    $A \approx T S D^{T}$, into
  • a $t \times k$ matrix $T$ of term vectors
  • the transpose of a $d \times k$ matrix $D$ of document vectors
  • a $k \times k$ diagonal matrix $S$ of singular values,
    with $100 < k < 300$
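
One consequence of the decomposition is worth writing out, because it explains how a new vector of term counts can be placed in the same $k$-dimensional space as the existing documents. The derivation assumes the usual SVD property that the columns of $T$ are orthonormal; the symbols $q$ and $\hat{q}$ are introduced here for illustration.

```latex
A \approx T S D^{\mathsf T}, \qquad
T \in \mathbb{R}^{t \times k},\quad
S \in \mathbb{R}^{k \times k},\quad
D \in \mathbb{R}^{d \times k}

T^{\mathsf T} T = I_k
\;\Longrightarrow\;
A^{\mathsf T} T = D S
\;\Longrightarrow\;
D = A^{\mathsf T} T S^{-1}

\hat{q} = q^{\mathsf T} T S^{-1}
```

Each row of $D$ is a document vector obtained from a column of $A$. Applying the same map to a new term-count vector $q$ gives its pseudo-document vector $\hat{q}$, which is the "scale by the inverse of S" step used when building a user vector.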

13
Collect Background Information
  • Target: a collection of documents consisting
    of popular web pages accessed by Internet users
  • Procedure: a web crawler was used; web pages
    less than 4k bytes in size were accessed

14
Create A Term By Document Matrix
  • Target: create a term-by-document matrix, with
    the document collection as input
  • Procedure: SMART software from Cornell University

15
Perform An SVD On The Term-Document Matrix
  • Target: an LSA vector representing all the
    usage data associated with each Internet user of
    interest
  • Procedure: (1) compute the sum of the term vectors in
    the matrix T; (2) scale the resulting vector by the
    inverse of the matrix S, giving a pseudo-document
    vector; (3) add the document vectors representing
    the web pages accessed by the Internet user to
    that pseudo-document vector
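
The procedure above can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes the term vectors being summed are the rows of T for terms found in the user's usage data, and the matrices `T`, `singular_values` and `D` are hypothetical toy values with k = 2 rather than the output of a real SVD. The function names are also made up for this sketch.

```python
def fold_in_terms(term_ids, T, singular_values):
    """Sum the rows of T for the given terms, then scale by the inverse of S."""
    k = len(singular_values)
    summed = [sum(T[i][j] for i in term_ids) for j in range(k)]
    # S is diagonal, so multiplying by its inverse divides each component.
    return [summed[j] / singular_values[j] for j in range(k)]

def user_vector(term_ids, page_ids, T, singular_values, D):
    """Pseudo-document vector for the terms plus the vectors of accessed pages."""
    vec = fold_in_terms(term_ids, T, singular_values)
    for p in page_ids:
        vec = [v + d for v, d in zip(vec, D[p])]
    return vec

# Hypothetical toy factors (invented for illustration, not from a real SVD).
T = [[0.5, 0.5], [0.5, -0.5], [0.5, 0.5], [0.5, -0.5]]  # 4 terms x k
singular_values = [2.0, 1.0]                            # diagonal of S
D = [[0.6, 0.8], [0.8, -0.6], [0.0, 1.0]]               # 3 pages x k

# A user associated with terms 0 and 1 who accessed pages 0 and 2.
pseudo = fold_in_terms([0, 1], T, singular_values)
profile = user_vector([0, 1], [0, 2], T, singular_values, D)
```

The resulting `profile` has k components regardless of how many pages the user visited, which is what makes it usable as a fixed-size input to a model.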

16
Create A Neural Model To Test The Hypothesis
  • Model: 3-layer neural model
  • Training: independent and dependent variables
  • Number: 40,000 observations for training, 20,000
    observations for validation
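
The slides only say "3-layer neural model", so the following is a sketch under assumptions rather than the model that was used: an input layer, one tanh hidden layer and a single sigmoid output, trained by stochastic gradient descent on cross-entropy loss. The hidden size, activations, learning rate, class name `ThreeLayerNet` and the toy two-dimensional data are all invented here; in the study the inputs would be the LSA user vectors and the output one dependent variable.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class ThreeLayerNet:
    """Input layer -> one hidden layer (tanh) -> single sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, y, lr):
        """One stochastic gradient step on the cross-entropy loss."""
        hidden, out = self.forward(x)
        delta_out = out - y  # gradient w.r.t. the output pre-activation
        for j, h in enumerate(hidden):
            # Back-propagate through tanh before w2[j] is updated.
            delta_h = delta_out * self.w2[j] * (1.0 - h * h)
            self.w2[j] -= lr * delta_out * h
            self.b1[j] -= lr * delta_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_h * xi
        self.b2 -= lr * delta_out

def mean_loss(net, data):
    """Average cross-entropy of the net over (x, y) pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = net.predict(x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

# Toy data: label is 1 when the first component exceeds the second.
rng = random.Random(1)
data = []
for _ in range(40):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    data.append((x, 1 if x[0] > x[1] else 0))

net = ThreeLayerNet(n_in=2, n_hidden=4)
loss_before = mean_loss(net, data)
for _ in range(200):
    for x, y in data:
        net.train_step(x, y, lr=0.1)
loss_after = mean_loss(net, data)
```

One such model would be trained per dependent variable, with the output read as the estimated probability of the "true" (or first) value.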

17
Variables
Variable                  Possible Values
Gender                    male, female
Age Under 18              true, false
Age 55                    true, false
Income Under 50000        true, false
Marital Status            single, married
Some College Education    true, false
Children in the Home      true, false
18
Training
  • Training data contain equal proportions of the
    values of the dependent variable under
    consideration
  • Validation data contain the true proportions of the
    values of the dependent variable under
    consideration
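
The difference between the two data sets can be sketched as below. This is an illustration only: the population of 1,000 records, the 20/80 split, the field name `age_under_18` and the helper `balanced_sample` are hypothetical, and the slides do not say how the sampling was actually done.

```python
import random
from collections import Counter, defaultdict

def balanced_sample(observations, label_of, per_class, seed=0):
    """Draw the same number of observations for every value of the label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for obs in observations:
        by_label[label_of(obs)].append(obs)
    sample = []
    for label, group in by_label.items():
        if len(group) < per_class:
            raise ValueError(f"only {len(group)} observations with label {label!r}")
        sample.extend(rng.sample(group, per_class))
    rng.shuffle(sample)
    return sample

# Hypothetical population: 20% "true", 80% "false" for one dependent variable.
population = [{"id": i, "age_under_18": i % 5 == 0} for i in range(1000)]

rng = random.Random(1)
shuffled = population[:]
rng.shuffle(shuffled)

# Validation: a plain random draw, so it keeps roughly the true proportions.
validation = shuffled[:200]

# Training: drawn from the remaining records with equal counts per value.
training = balanced_sample(shuffled[200:], lambda o: o["age_under_18"], per_class=100)
training_counts = Counter(o["age_under_18"] for o in training)
```

Balancing the training data keeps a rare value from being ignored during training, while validating on the true proportions shows how the model would behave on the real population.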

20
Conclusion
  • It is possible to make demographic
    inferences about Internet users for whom
    information is not otherwise available
  • Privacy concerns