Title: Conceptual and Operational Issues in the Measurement of Internet Use
1Conceptual and Operational Issues in the
Measurement of Internet Use
- Jonathan Zhu
- City University of Hong Kong
- enjhzhu_at_cityu.edu.hk
- Funded by the UGC of HKSAR (CityU1152/00H)
2Background: the Diffusion of the Internet in Hong
Kong, Beijing and Guangzhou
Source: J. H. Zhu (2003)
3Internet Penetration Rate in East Asia
4Wired Internet Use vs. Wireless Internet Use
5Diffusion of Cable TV, the Internet, and Mobile
Phone in Hong Kong
6Internet vs. Mobile Phone in Beijing and Guangzhou
7Issues in Measurement of Internet Use and Users
- The number of Internet users in a society is a function of:
  - Definition of study population (SP)
  - Method of sample weighting (SW)
  - Requirement of minimal usage (MU)
- The amount of online time by Internet users is a function of:
  - Definition of study population (SP)
  - Method of sample weighting (SW)
  - Method of data collection (DC)
  - Treatment of extreme values (EV)
8Criteria for Evaluation of Measurement
- Validity: how accurate or correct is the measure as compared with the truth?
- Reliability: how precise or stable is the measure over time and/or across space?
- Practicality: how efficient or economic is the measure in data collection and analysis?
9Data
- Hong Kong Survey 2002: telephone interviews of 1,800 residents aged 6 and above in Dec. 2002 by Jonathan Zhu and his team
- AC Nielsen/Netratings 2002-03: online tracking of 1,500 Internet users from 811 households in Hong Kong in Oct. 2002 and Jan. 2003
10Definitions of Study Population
- WIP-Hong Kong: 18-74
- CNNIC: 6+
- Another popular definition: 18+
- HK Census 2002 (age distribution):
  - 6-17: 16.4%
  - 18-74: 80.0%
  - 75+: 3.6%
11Impact of Population Definitions on Internet User
Size
Data: Hong Kong 2002
12Requirements of Minimal Usage
13Impact of Minimal Requirements on Internet User
Size
Data: Hong Kong 2002
14Age Distribution of the Sample before and after
Weighting
Data: Hong Kong 2002
15Impact of Weighting Methods on Internet User Size
Data: Hong Kong 2002
16Summary: Internet Users by Population, Usage
Requirement, and Weighting Method
Data: Hong Kong 2002
17A Mathematical Model of True Internet Users
(TIU)
- TIU = 55.3 - 1.4 SP_{18-74} - 3.7 SP_{18+} - 4.5 MU - 5.4 SW
- (Adjusted R^2 = 99.6%, Standard Error = 0.3)
- Where TIU is the estimated percentage of Internet users in HK in 2002. The unadjusted estimate of 55.3% should be 1.4 points lower for a study population of 18-74, or 3.7 points lower for a study population of 18+, or 4.5 points lower if those who use the Internet less than 1 hour per week are excluded, or 5.4 points lower if the sample is weighted based on the population census.
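The model reads as a set of additive adjustments to the unadjusted estimate. A minimal Python sketch, using only the coefficients on this slide and assuming, as the description states, that every adjustment is subtracted; the function and argument names are illustrative, and treating "6+" as the reference population is an assumption based on the survey's age range:

```python
def estimate_tiu(population="6+", minimal_usage=False, weighted=False):
    """Estimated share of Internet users (%) in HK, 2002, per the slide's model."""
    # Adjustments in percentage points; "6+" is assumed to be the reference group.
    population_adjustment = {"6+": 0.0, "18-74": -1.4, "18+": -3.7}
    tiu = 55.3 + population_adjustment[population]
    if minimal_usage:  # exclude those online less than 1 hour per week
        tiu -= 4.5
    if weighted:       # weight the sample to the population census
        tiu -= 5.4
    return round(tiu, 1)


print(estimate_tiu())                                            # unadjusted
print(estimate_tiu("18-74", minimal_usage=True, weighted=True))  # all adjustments
```

The two population terms are treated as mutually exclusive, since a study can adopt only one definition at a time.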
18Impact of Population Definitions on Online Time
(at Home)
Data: Hong Kong 2002
19Impact of Weighting Methods on Online Time (at
Home)
Data: Hong Kong 2002
20Impact of Extreme Values on Online Time (at Home)
Data: Hong Kong 2002
21Impact of Data Collection (DC) Methods on Online
Time
Data: HKS 2002 and Netratings 2002-03
22Summary: Online Time by SP, SW, DC, and EV
Data: Hong Kong 2002
23A Mathematical Model of True Online Time (TOT)
- TOT = 532 + 16 SP_{18-74} - 22 SW - 49 EV - 249 DC
- (Adjusted R^2 = 93.5%, Standard Error = 34.3)
- Where TOT is the estimated online time (min.) of HK users in 2002. The unadjusted estimate of 532 min. should be 16 min. more for a study population of 18-74, 22 min. less if the user sample is weighted, 49 min. less if extreme values are removed, or 249 min. less if data are collected through the online tracking method.
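As with the user-size model, the online-time model can be sketched as additive adjustments to the unadjusted figure. The Python below uses only the coefficients given on this slide, with signs taken from the description (more for the 18-74 population, less for the other three factors); the function and argument names are illustrative:

```python
def estimate_tot(adults_18_74=False, weighted=False,
                 extremes_removed=False, online_tracking=False):
    """Estimated online time (min.) of HK users, 2002, per the slide's model."""
    tot = 532                # unadjusted interview-based estimate
    if adults_18_74:         # study population restricted to ages 18-74
        tot += 16
    if weighted:             # user sample weighted
        tot -= 22
    if extremes_removed:     # extreme values dropped
        tot -= 49
    if online_tracking:      # tracking data instead of telephone interviews
        tot -= 249
    return tot


print(estimate_tot())                      # unadjusted
print(estimate_tot(online_tracking=True))  # data collection method alone
```

The data collection term is by far the largest, which is why the choice between interviews and tracking dominates the comparison.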
24Caution: Different Definitions of Online
Activities
- Telephone interview data include:
  - Online time at both home (68%) and elsewhere (32%)
  - Non-HTTP-based activities such as using POP3 Email (136 min./week) and other protocols
- Online tracking data include:
  - Online time only at home
  - Only HTTP-based activities (no other protocols)
- It is estimated that tracking data may measure only 51% of the total online time.
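One way to arrive at a figure of roughly this size is sketched below in Python. It is not necessarily how the estimate was derived. It assumes that the 136 min./week stands for all non-HTTP use, that it is measured against the 532 min. unadjusted total from the online-time model (both taken as weekly figures), and that the protocol mix is the same at home and elsewhere:

```python
HOME_SHARE = 0.68    # share of interview-reported online time spent at home
NON_HTTP_MIN = 136   # slide's figure for POP3 email, min./week (assumed to cover all non-HTTP use)
TOTAL_MIN = 532      # unadjusted interview-based online time (assumed weekly)


def tracked_share(home_share, non_http_min, total_min):
    """Share of total online time that is both at home and HTTP-based."""
    http_share = 1 - non_http_min / total_min
    # Assumes location and protocol are independent of each other.
    return home_share * http_share


share = tracked_share(HOME_SHARE, NON_HTTP_MIN, TOTAL_MIN)
print(f"{share:.1%}")
```

Under these assumptions the covered share comes out at about half of total online time, in line with the estimate on the slide.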
25Estimated Distribution of Online Time by Location
and Protocol of Usage
26Conclusion: How Many Internet Users Are There?
- The number of Internet users is significantly affected by the definition of study population (SP), the requirement of minimal usage (MU) and the method of sample weighting (SW).
  - SP (e.g., general population vs. adults) may produce a difference of 1-4 percentage points and MU (e.g., no requirement vs. 1 hour per week) up to 5 percentage points. While there is no single correct definition of SP or MU, it is important to report the definition used and to adopt, whenever possible, multiple definitions.
  - SW (weighted vs. unweighted) may contribute another 5 percentage points of difference. Since Internet use is highly correlated with age and sex, it seems both necessary and effective to weight the sample to ensure the accuracy of the measurement.
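To illustrate why weighting matters, here is a minimal Python sketch of post-stratification by age group. The census shares are the HK Census 2002 figures given earlier in the deck; the sample counts are hypothetical, invented only to show the mechanics, and the sketch weights by age alone even though sex matters as well:

```python
# Census age shares for HK 2002, from the slide on study populations.
CENSUS_SHARE = {"6-17": 0.164, "18-74": 0.800, "75+": 0.036}

# HYPOTHETICAL sample: (respondents, Internet users) per age group.
SAMPLE = {"6-17": (400, 340), "18-74": (1350, 700), "75+": (50, 2)}


def unweighted_rate(sample):
    """Share of users among all respondents, ignoring age structure."""
    respondents = sum(n for n, _ in sample.values())
    users = sum(u for _, u in sample.values())
    return users / respondents


def poststratified_rate(sample, census_share):
    """Each group's usage rate counts in proportion to its census share."""
    return sum(census_share[g] * users / n for g, (n, users) in sample.items())


raw = unweighted_rate(SAMPLE)
weighted = poststratified_rate(SAMPLE, CENSUS_SHARE)
print(f"unweighted {raw:.1%}, weighted {weighted:.1%}")
```

In this made-up sample the youngest group is over-represented and uses the Internet most, so the unweighted figure overstates the user share; weighting pulls it down.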
27Conclusion: How Much Time Do They Spend Online?
- The amount of online time is only marginally affected by SP (p = 0.3) and SW (p = 0.2), probably because the analysis is already restricted to users.
- Online time is significantly affected by the treatment of extreme values (EV), which may inflate online time by up to 10%. It is thus necessary to control for them (i.e., by removing EVs).
- Online time is most significantly affected by the method of data collection (DC, e.g., interviews vs. online tracking), which may result in a 2-fold difference. Although online tracking is generally more accurate, it is far more expensive and impractical in many societies. It is thus important to keep in mind the magnitude of inflation in self-reported data.
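The effect of extreme values on mean online time can be sketched as follows. The slides do not state which rule was used to identify extreme values, so the fixed ceiling below is an assumption, and the reported minutes are hypothetical:

```python
from statistics import mean


def mean_online_time(minutes, ceiling=None):
    """Mean of reported minutes, optionally dropping values above a ceiling."""
    kept = [m for m in minutes if ceiling is None or m <= ceiling]
    return mean(kept)


# HYPOTHETICAL self-reported weekly minutes; the last value is an outlier.
reported = [60, 120, 180, 240, 300, 420, 600, 5000]

raw = mean_online_time(reported)
trimmed = mean_online_time(reported, ceiling=3000)  # ceiling is an assumed cutoff
print(raw, round(trimmed, 1))
```

A single implausible report is enough to pull the mean far above the typical value, which is why the treatment of extreme values should be reported alongside the estimate.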
28Ultimate Criteria for Evaluation
- Validity: how accurate or correct is the measure as compared with the truth?
- Reliability: how precise or stable is the measure over time and/or across space?
- Practicality: how efficient or economic is the measure in data collection and analysis?
29Consistency in Measurement of Internet Users over
Time and across Space
Based on WIP definition.
30Stability in Measurement of Sex Ratio among
Internet Users in Hong Kong
31Stability in Measurement of Online Locations in
Hong Kong
32Consistency in Difference between Methods across
Age Cohorts
33Final Verdicts
- Measurement of Internet users and online time based on interview data is largely reliable over time and across space.
- The interview-based measurement is generally more practical than the online tracking method.
- The interview-based measurement is generally weaker in validity than the online tracking method. However, it could be adjusted if the departure from the truth is known (e.g., based on comparison with online tracking data).