An Analysis of P3P Deployment

Slides: 34
Provided by: temp334
Learn more at: https://zoo.cs.yale.edu

1
An Analysis of P3P Deployment
  • Hyun Jin Kim
  • Sensitive Information
  • in a Wired World
  • November 11, 2003

2
Introduction
  • Privacy Policies
  • US self-regulatory approach to online privacy
    protection
  • Description of a company's data practices
  • What information they collect from individuals
    and what they do with it

3
P3P Specifications
  • Developed by World Wide Web Consortium (W3C) over
    5 years of work
  • Became an official W3C Recommendation just over
    a year ago on April 16, 2002

4
P3P Specifications (Cont.)
5
P3P Evaluation System Design
  • Automated process to measure P3P adoption and
    gather data from P3P-enabled web sites
  • By Lorrie Faith Cranor, Simon Byers, and David
    Kormann (AT&T Labs-Research)
  • Five major components
  • URL Collection Mechanism
  • P3P Policy Retriever
  • Scripted Interface to the W3C P3P Validator
  • P3P Policy Evaluator
  • Generic Data Analysis Tools

6
URL Collector
  • To identify sets of sites of interest
  • Existing lists of URLs
  • Newly constructed lists that focus on particular
    web sites
  • Web spidering technique
  • Gather information from web directories and other
    sources

7
P3P Policy Retriever
  • Perl script to retrieve P3P information
  • All policies, policy reference files, compact
    header policies
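
As an illustration of where a retriever looks for P3P information, here is a minimal Python sketch (not the Perl script the slide refers to). It relies on two facts from the P3P 1.0 specification: the well-known policy reference location /w3c/p3p.xml, and the P3P HTTP response header with its policyref and CP fields. The function names and the sample header value are assumptions made for this example.

```python
import re
from urllib.parse import urljoin

# Well-known location of the policy reference file in P3P 1.0.
WELL_KNOWN_PATH = "/w3c/p3p.xml"


def well_known_policy_ref(site_url):
    """Return the URL where a site's policy reference file is expected."""
    return urljoin(site_url, WELL_KNOWN_PATH)


def parse_p3p_header(value):
    """Split a P3P header value into its policyref URI and compact policy tokens."""
    pairs = re.findall(r'(\w+)\s*=\s*"([^"]*)"', value)
    fields = {name.lower(): content for name, content in pairs}
    return {
        "policyref": fields.get("policyref"),
        "compact_policy": fields.get("cp", "").split(),
    }


# Hypothetical header value, as a server might send it.
header = 'policyref="/w3c/p3p.xml", CP="NOI DSP COR NID CUR ADM DEV OUR"'
info = parse_p3p_header(header)
```

A full retriever would also fetch the policy reference file and each policy it lists; that network step is left out so the sketch stays self-contained.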

8
P3P Validator
  • W3C P3P Validator
  • Fetches P3P policy reference files, policy files
    and compact policies
  • Checks them for compliance with the P3P 1.0
    Specification
  • Stops validation upon encountering an error
  • Scripted interface to the W3C P3P Validator
  • Retrieve P3P policies from sites with errors in
    their policy reference files

9
P3P Policy Evaluator
  • Compares a web site's policy with a user's
    privacy preferences
  • Finds a mismatch between the P3P policy and the
    privacy preferences
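
The comparison step can be sketched in Python as follows. This is a deliberately simplified stand-in: the evaluator described here works on full P3P policies with APPEL rule sets, whereas the sketch only checks P3P compact policy tokens against hypothetical rules. The rule names, the RULES_LOW table, and the evaluate function are assumptions for illustration. Because compact policies do not tie data categories to recipients per statement, the health-data rule is only an approximation.

```python
# Hypothetical rules: a rule fires when ALL of its tokens appear in the policy.
# HEA = health data category, UNR = unrelated third-party recipients,
# TEL = telemarketing purpose with no opt-in/opt-out suffix (always required).
RULES_LOW = {
    "health data shared with other companies": {"HEA", "UNR"},
    "telemarketing without opt-out": {"TEL"},
}


def evaluate(policy_tokens, rules):
    """Return ('red' or 'green', list of the rule names that fired)."""
    tokens = set(policy_tokens)
    fired = [name for name, required in rules.items() if required <= tokens]
    return ("red" if fired else "green"), fired


# TELo marks telemarketing with opt-out, so the second rule does not fire.
bird, reasons = evaluate(["CUR", "ADM", "TELo", "OUR", "PHY"], RULES_LOW)
```

Stricter settings could be modeled by adding more rules to the table, mirroring how each Privacy Bird setting extends the one below it.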

10
Data Analysis
  • Outputs of policy evaluations gathered in a
    rectangular matrix
  • Row: policy from a web site
  • Column: APPEL rule set file
  • Run a Perl script over the matrix
  • Produce various tabulations
  • e.g., number of sites that returned a mismatch
    between privacy preferences and P3P policies
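
The tabulation step might look like the following Python sketch (the original analysis used a Perl script). The site names, the rule set labels, and the outcome values are made up for the example; only the row and column layout comes from the slide.

```python
from collections import Counter

# Columns of the matrix: one APPEL rule set per preference setting (labels assumed).
RULE_SETS = ["low", "medium", "high"]

# Rows of the matrix: one policy per site, with one outcome per rule set.
# All values below are hypothetical.
matrix = {
    "site-a.example": ["match", "match", "mismatch"],
    "site-b.example": ["match", "mismatch", "mismatch"],
    "site-c.example": ["mismatch", "mismatch", "mismatch"],
}


def mismatches_per_rule_set(matrix, rule_sets):
    """Count, for each rule set, how many sites returned a mismatch."""
    counts = Counter()
    for outcomes in matrix.values():
        for rule_set, outcome in zip(rule_sets, outcomes):
            if outcome == "mismatch":
                counts[rule_set] += 1
    return {rule_set: counts[rule_set] for rule_set in rule_sets}


summary = mismatches_per_rule_set(matrix, RULE_SETS)
# summary == {"low": 1, "medium": 2, "high": 3}
```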

11
Web Site Selection
  • Focus on the sites frequently visited by users
  • PFF Most Popular
  • 85 of the 100 busiest sites determined by the
    October 2001 Nielsen/NetRatings ranking of sites
    with the most unique visitors per month
  • Excludes adult sites, children's sites,
    business-to-business sites, and sites not in the
    .com top level domain
  • PFF Random
  • Random sample of 302 of the 7,821 domains with at
    least 39,000 unique monthly visitors in October
    2001 by Nielsen/NetRatings
  • PFF Refined Random
  • 209 domains from the PFF Random list that were in
    the top 5,625 domains in October 2001 by
    Nielsen/NetRatings
  • Excludes adult sites, children's sites,
    business-to-business sites, and non-dot-coms
  • Netscore Top 500
  • 500 domains with the most unique visitors during
    July 2002 by comScore Media Metrix netScore
    Standard Traffic Measurement report
  • Key Measures
  • Top 500 domains with the most unique visitors
    during July 2002 by comScore Media Metrix Key
    Measures report
  • Includes third-party sites

12
Web Site Selection (Cont.)
  • Alexa
  • Top 500 domains by Alexa Traffic Ranking on
    Feb. 4, 2003
  • Includes non-US domains and adult sites
  • Froogle
  • 1,017 sites obtained by crawling the
    www.froogle.com web site in April 2003
  • Sites offer products for sale
  • Yahooligans
  • 900 sites obtained by crawling www.yahooligans.com
    in April 2003
  • Sites for children ages 7-12
  • Firstgov
  • 344 government sites indexed at www.firstgov.gov
    in April 2003
  • Includes US federal and state government sites
    and sites for some quasi-government organizations
  • News
  • 2,429 sites by news.google.com in April 2003
  • Includes a variety of news-reporting
    organizations from the US and other countries

13
P3P Adoption in May 2003
14
P3P Adoption (Cont.)
  • P3P adoption increasing over time
  • Highest for the most popular web sites
  • Key Measures site lists higher than Netscore
  • Presence of third-party sites
  • To avoid having their cookies blocked by IE6
  • Alexa top 500 list lowest
  • International nature
  • Large number of adult sites
  • One third of the P3P-enabled sites had errors
    flagged by W3C P3P Validator
  • 7% had errors that prevented their evaluation by
    Privacy Bird evaluation engine
  • Omitting required components of a P3P policy
  • Improperly referencing data elements

15
Privacy Bird Evaluation
  • Definition of not sharing data
  • Sites share data only with agents that use it
    only to complete the transaction for which it was
    provided or with delivery companies
  • Data sharing occurs only under an opt-in policy
  • 3 standard settings
  • Low
  • Triggers a red bird (policy does not match the
    preferences) if the site
  • Collects health/medical info
  • Shares it with other companies
  • Uses it for analysis, marketing, or to make
    decisions about what content or ads the user sees
  • Engages in marketing but does not provide a way
    to opt out

16
Privacy Bird Evaluation (Cont.)
  • Medium
  • Same as low
  • Sites sharing PII (physical contact info, online
    contact info, government-issued identifier),
    financial info, or purchase info with other
    companies
  • Sites collecting PII but providing no access
    provisions
  • High
  • Same as medium
  • Sites sharing any personal info (including
    non-identified info) with other companies
  • Use it to determine the user's habits, interests,
    or other characteristics
  • Sites contacting users for marketing
  • Sites using financial or purchase info for
    analysis, marketing, or to make decisions that
    may affect what content or ads the user sees

17
Privacy Bird Evaluation (Cont.)
18
Privacy Bird Evaluation (Cont.)
  • Red bird on 24 of the evaluated sites
  • No opt-out of marketing and/or telemarketing
    ability offered
  • Most popular sites receive both a green bird on
    the low setting and a red bird on the high setting
  • Green bird - Greater awareness of the importance
    of the choice principle
  • Red bird - Most offer rich ecommerce environments
    that rely heavily on targeted marketing and
    profiling visitors
  • Red birds most likely on Froogle and Yahooligans
    sites
  • Collect health and medical info

19
Types of Data Collected
20
Types of Data Collected (Cont.)
  • Most collected data
  • Computer info and click stream info
  • HTTP protocol used for retrieving content from
    website
  • Demographic data
  • Less by Froogle and govt web sites
  • Online contact info, physical contact info,
    interactive data, unique ids
  • Mostly by news web sites
  • Preference info, purchase info, and state
    management info (cookies)
  • Fewer collected financial info (excludes purchase
    process)
  • Least collected data
  • Content (email msgs, bulletin board postings,
    etc.)
  • Government-issued identifiers
  • Health information
  • Political information
  • Location information (e.g., GPS positioning data)
  • Information not falling into any other
    pre-defined categories
  • No government websites collect government-issued
    identifiers

21
Data Usage
22
Data Usage (Cont.)
  • Almost all websites used data for
  • Completion and support of the activity for which
    data was provided
  • Web site and system administration
  • Research and development
  • Majority of sites used data for
  • Email and postal mail marketing
  • One-time tailoring of the site content
  • Two forms of pseudonymous profiling
  • Fewer sites used data for
  • Telemarketing
  • Profiling in which individuals are identified by
    name or other PII
  • Very few sites used data for
  • Historical preservation (Not by government sites)
  • Other purposes that do not fall into these
    categories
  • News web sites use data for almost every purpose.

23
Data Recipients and Sharing
24
Data Recipients and Sharing (Cont.)
  • Half the websites share PII with parties other
    than agents who use data for the purpose for
    which it was provided
  • Most likely by
  • News web sites
  • Froogle list sites with delivery company
  • Least likely by
  • Government web sites

25
Choice Options
26
Choice Options (Cont.)
  • Top sites more likely to engage in marketing than
    less popular sites
  • Top sites most likely to offer choices
    (opt-in/out)
  • Internal choices (telemarketing and other
    marketing) offered more opt-out than opt-in
  • Third-party choices offered more opt-in than
    opt-out

27
Access Provisions
28
Access Provisions (Cont.)
  • 92% of sites collecting identified data provide
    some access provisions
  • Most provide access to both contact info and
    other data
  • Smaller number provide access to only contact
    info or to all identified data
  • Very few provide no access
  • None provide access only to non-contact info

29
Dispute Resolution Options and Remedies
30
Dispute Resolution Options and Remedies (Cont.)
  • Individuals can contact customer service to
    resolve their disputes on most sites
  • About one-third offered resolution via an
    independent organization (e.g., a privacy seal
    provider)
  • By the most popular sites
  • Very few indicated resolution of dispute under an
    applicable law
  • Almost none indicated resolution in court

31
Data Retention Policies
32
Data Retention Policies (Cont.)
  • Majority did not have a data retention policy for
    all of the data they collected
  • Government web sites more likely to have a policy
    of not retaining info or to have a retention
    policy based on a legal requirement

33
Conclusion
  • P3P adoption is increasing over time, especially
    for the most popular web sites
  • Yahooligans (sites for children) most likely to
    offer opt-in policies
  • Large number of websites with technical errors in
    their P3P policies
  • Debates continue about the need for further
    privacy legislation and the effectiveness of
    industry self-regulation in the privacy area.
  • Essential to have good statistics and privacy
    policies
  • US government web sites began posting P3P
    policies to comply with the privacy requirements
    of section 208 of the E-Government Act of 2002
  • Continue web sweeps of govt web sites to monitor
    compliance with these requirements