Title: Quality in
Quality in the Italian consumer price survey: optimal allocation of resources and indicators to monitor the data collection process
Federico Polidoro, Rosabel Ricci, Anna Maria Sgamba (Istat - Italy)
introduction
quality in the consumer price survey
two research topics
1. the optimal allocation of the available resources (minimizing sampling error, burden and cost)
2. the definition of a system of indicators to monitor the data collection process (minimizing non-sampling error)
introduction
the optimal allocation of the available resources
the issue
the calculation of a consumer price index (CPI) requires a large amount of resources
the aim
allocating these resources in the most efficient way (quality dimensions: burden and cost)
introduction
indicators to monitor the data collection process
the issue
the definition of a system of indicators to monitor the data collection process
the aim
improving data quality (quality dimension: accuracy)
1. the optimal allocation of the available resources
- Approach description
- Italian background
- Approach to variance estimation
- Cost function
- Case study and results
the objective of this research
identifying the optimal sample sizes, in terms of either outlets or elementary items observed, in order to minimize the sampling error as measured by the sampling variance
the optimal allocation approach
2 pillars
- a variance function
- a cost function
in order to derive the optimal sample sizes that minimize the variance of the estimates for a given cost
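The interplay of the two pillars can be illustrated with a small brute-force search. This is a minimal sketch under simplifying assumptions that are not in the slides: a single product stratum and a single outlet stratum, a variance function of the form s_pro/m + s_out/n + s_int/(m*n), and a cost that grows with the number of outlets n and the number of products m priced per outlet. All function names and numerical values are hypothetical.

```python
def variance(m, n, s_pro, s_out, s_int):
    """Assumed variance function: product, outlet and interaction terms."""
    return s_pro / m + s_out / n + s_int / (m * n)


def cost(m, n, c0, a, b, r):
    """Assumed cost function: fixed cost plus a per-outlet cost.

    a is the fixed cost per outlet, b the cost of one price quotation,
    r the share of the m products actually found in an outlet.
    """
    return c0 + n * (a + b * m * r)


def best_allocation(budget, s_pro, s_out, s_int, c0, a, b, r,
                    max_m=200, max_n=200):
    """Search integer (m, n) pairs and keep the feasible pair with minimum variance."""
    best = None
    for m in range(1, max_m + 1):
        for n in range(1, max_n + 1):
            if cost(m, n, c0, a, b, r) > budget:
                break  # cost only grows with n, so larger n is infeasible too
            v = variance(m, n, s_pro, s_out, s_int)
            if best is None or v < best[0]:
                best = (v, m, n)
    return best


# Hypothetical inputs, for illustration only.
params = dict(s_pro=4.0, s_out=1.0, s_int=2.0)
costs = dict(c0=0.0, a=1.0, b=0.1, r=1.0)
result = best_allocation(budget=100.0, **params, **costs)
print(result)
```

A grid search is used only because it is easy to verify; with many strata the same problem becomes a nonlinear optimization that would need a proper solver.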
Italian background
consumer price index sampling structure
- Sampling of geographical areas
- Sampling of outlets
- Sampling of products
- Sampling of elementary items in each outlet
Italian background
consumer price index sampling design
non-probability sampling
consumer price index sampling methods
Italian background
consumer price index sampling methods
- Sampling of geographical areas
the selection of geographical areas is
established by Italian laws (No 222/1927 and
621/1975)
in 2007 prices were collected in 85 county chief
towns (Municipal Offices of Statistics, MOS) all
over the national territory
Italian background
consumer price index sampling methods
- Sampling of outlets
within each county chief town, the selection of outlets is carried out by the MOS
the sample is drawn from the outlet lists of the Chamber of Commerce, the statistical business register (ASIA), census data and other local sources
the outlets with the highest total sales are chosen (a mix of cut-off and quota sampling)
in 2007 prices were collected in about 40,000 outlets all over the national territory
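The combination of cut-off and quota sampling described above can be sketched as follows: within each outlet type a quota is fixed, and the outlets with the highest sales fill it. This is a minimal illustration; the outlet types, quotas, sales figures and the function name `select_outlets` are hypothetical and do not describe the actual MOS procedure.

```python
def select_outlets(outlets, quotas):
    """Pick, for each outlet type, the `quota` outlets with the highest sales."""
    selected = []
    for outlet_type, quota in quotas.items():
        candidates = [o for o in outlets if o["type"] == outlet_type]
        # Cut-off step: rank by total sales, largest first.
        candidates.sort(key=lambda o: o["sales"], reverse=True)
        # Quota step: keep only as many outlets as the quota allows.
        selected.extend(o["name"] for o in candidates[:quota])
    return selected


# Hypothetical outlet list for one chief town.
outlets = [
    {"name": "A", "type": "supermarket", "sales": 900},
    {"name": "B", "type": "supermarket", "sales": 1500},
    {"name": "C", "type": "supermarket", "sales": 300},
    {"name": "D", "type": "small shop", "sales": 80},
    {"name": "E", "type": "small shop", "sales": 120},
]
quotas = {"supermarket": 2, "small shop": 1}
chosen = select_outlets(outlets, quotas)
print(chosen)
```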
Italian background
consumer price index sampling methods
- Sampling of products
the selection of products is carried out by the National Institute of Statistics (Istat)
the selection of the products - a list (basket) of product types with product type specifications - is based on sales data (cut-off sampling)
in 2007, 540 products were included in the CPIs
Italian background
consumer price index sampling methods
- Sampling of elementary items in each outlet
within each outlet, the selection of elementary items is carried out by the MOS price collector
the most sold elementary item is chosen (the representative item method)
in 2007 about 400,000 price quotations were collected all over the national territory
Italian background
consumer price index sampling methods
sample update: yearly base revision
optimum sample allocation: current sizes of samples for elementary items are not optimal
the approach to variance estimation
the sample is considered as drawn from a two-dimensional population of products and outlets: a cross-classified sample (CCS)
the Swedish approach has been used to estimate the variance of the CPI (Dalén, Ohlsson, 1995)
the approach to variance estimation
- representative products as rows (i)
- outlets as columns (j)
- stratification into categories of products: stratum (g)
- stratification into outlet groups: stratum (h)
- the crossing of strata: cell (g,h)
- the parameter (index): I
- parameter estimator (index): Î
the approach to variance estimation
the general index (target parameter)
$V_{gh}$ = weight for cell (g,h): turnover for the category of products g traded in the outlets of group h
where the cell index is
$I_{gh}$ = index for cell (g,h)
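With the cell weights and cell indices defined above, the general index can be written as a turnover-weighted mean of the cell indices. The form below is a sketch consistent with those definitions; the normalisation by the sum of the weights is an assumption, and it drops out if the weights are already scaled to sum to one.

```latex
I = \frac{\sum_{g} \sum_{h} V_{gh} \, I_{gh}}{\sum_{g} \sum_{h} V_{gh}}
```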
the approach to variance estimation
the cell index
where
$w_i$ = weight for representative product i
$w_j$ = weight for outlet j
$l_{ij}$ = 1 if representative product i is traded in outlet j, $l_{ij}$ = 0 otherwise
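The three ingredients listed above (product weights, outlet weights and the traded indicator) suggest a cell index in which each product-outlet pair contributes with weight w_i * w_j * l_ij. A minimal sketch, assuming the cell index is a weighted mean of price relatives (current price divided by base price); this aggregation form, the function name `cell_index` and all numbers are illustrative assumptions rather than the formula used in the study.

```python
def cell_index(w_prod, w_out, traded, relatives):
    """Weighted mean of price relatives over the traded product-outlet pairs.

    w_prod[i]       weight of representative product i
    w_out[j]        weight of outlet j
    traded[i][j]    1 if product i is traded in outlet j, else 0
    relatives[i][j] price relative for the pair (ignored where traded is 0)
    """
    numerator = 0.0
    denominator = 0.0
    for i, w_i in enumerate(w_prod):
        for j, w_j in enumerate(w_out):
            if traded[i][j]:
                weight = w_i * w_j
                numerator += weight * relatives[i][j]
                denominator += weight
    if denominator == 0:
        raise ValueError("no traded product-outlet pair in this cell")
    return 100.0 * numerator / denominator


# Hypothetical cell with two products and two outlets.
w_prod = [0.6, 0.4]
w_out = [0.5, 0.5]
traded = [[1, 1], [1, 0]]
relatives = [[1.02, 1.04], [1.00, None]]
index_gh = cell_index(w_prod, w_out, traded, relatives)
print(round(index_gh, 4))
```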
the approach to variance estimation
the estimated cell index
the estimated general index
the approach to variance estimation
under the CCS assumption the variance estimator can be decomposed into
$V(\hat{I})_{tot} = V_{PRO} + V_{OUT} + V_{INT}$
where
$V_{PRO}$ = variance between representative products
$V_{OUT}$ = variance between outlets
$V_{INT}$ = outlet and representative product interaction variance
the approach to variance estimation
formulas for variance estimation
the approach to variance estimation
with the following formula for variance estimation
where
Case study
one geographical area: Udine county chief town (resident population 96,750)
one COICOP division (two-digit level): Food and non-alcoholic beverages
reference period: December 2007
the approach to variance estimation
Case study
outlets are divided into 12 strata according to commercial distribution type (reduced to 5 types for Food and non-alcoholic beverages)
representative products are divided into 52 strata according to the national nomenclature (categories of products)
currently purposive sampling is used for both outlets and products, but probability sampling has been postulated for both
Case study
the approach to variance estimation
inclusion probabilities for representative products ($p_{gi}$): imputed from brand information in each stratum
inclusion probabilities for outlets ($p_{hj}$): imputed from the number of representative products collected in each outlet
Case study
the approach to variance estimation
main numerical results
Food and non-alcoholic beverages division
Sample size: 2,373
Î (index): 103.979569
$V_{PRO}$: 0.009466
$V_{OUT}$: 0.000904
$V_{INT}$: 0.000719
$V_{TOT}$: 0.011090
95% confidence interval
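From the variance components reported above, an approximate 95% interval can be derived as Î ± 1.96 times the square root of the total variance. The sketch below assumes a normal approximation for the index estimator, which the slides do not state; the function names are illustrative. The sum of the three rounded components is 0.011089, which differs from the reported total of 0.011090 only by rounding.

```python
import math


def total_variance(v_pro, v_out, v_int):
    """Total variance as the sum of the three components."""
    return v_pro + v_out + v_int


def confidence_interval(index, variance, z=1.96):
    """Normal-approximation interval: index +/- z * standard error."""
    half_width = z * math.sqrt(variance)
    return index - half_width, index + half_width


# Figures reported for the Food and non-alcoholic beverages division.
v_tot = total_variance(0.009466, 0.000904, 0.000719)
low, high = confidence_interval(103.979569, v_tot)
print(f"V_TOT = {v_tot:.6f}, 95% CI = ({low:.3f}, {high:.3f})")
```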
the cost function
one data collection method
interviewers collect prices each month by visiting each outlet
thus the following cost function is used
the approach to cost function estimation
where
$C_0$ = fixed cost (i.e. for administration and other)
$n_h$ = the number of outlets in stratum h
$m_g$ = the number of products in stratum g
$a_h$ = fixed cost per outlet in stratum h (i.e. for travel time)
$b_h$ = cost of measuring one product in the outlets of stratum h
$r_{gh}$ = average relative frequency of products in stratum g sold in outlets of stratum h
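The quantities defined above combine naturally into a total cost of the form C = C_0 + sum over h of n_h * (a_h + b_h * sum over g of m_g * r_gh): each visited outlet costs its fixed amount plus the measuring cost for the products expected to be found there. The sketch below implements this form as an assumption consistent with the definitions; the function name `total_cost` and all input values are hypothetical.

```python
def total_cost(c0, n, m, a, b, r):
    """Assumed cost function built from the definitions above.

    n[h]    number of outlets in stratum h
    m[g]    number of products in stratum g
    a[h]    fixed cost per outlet in stratum h
    b[h]    cost of measuring one product in outlets of stratum h
    r[g][h] average relative frequency of products of stratum g
            sold in outlets of stratum h
    """
    cost = c0
    for h, n_h in enumerate(n):
        # Expected number of products actually priced in one outlet of stratum h.
        expected_products = sum(m_g * r[g][h] for g, m_g in enumerate(m))
        cost += n_h * (a[h] + b[h] * expected_products)
    return cost


# Hypothetical example: two outlet strata, two product strata, costs in hours.
c_tot = total_cost(
    c0=0.0,
    n=[10, 5],
    m=[4, 6],
    a=[0.5, 0.25],
    b=[0.05, 0.04],
    r=[[0.5, 1.0], [0.5, 0.5]],
)
print(round(c_tot, 2))
```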
the allocation problem
Case study
County chief town: Udine
Resident population: 96,750
Reference time: December 2007
Food and non-alcoholic beverages price quotes: 2,373
Food and non-alcoholic beverages outlets: 43
$C_0$: not considered
$a_h$: we consider the average travel time ? h
$b_h$: we consider the average collecting time ? h
Estimated $C_{TOT}$: 182 h
Conclusion
- Developing the contents of the paper: solving the nonlinear optimization problem deriving from the cost and variance formulas
- Important news: a preliminary attempt to estimate the Italian CPI variance
- Enhancing the effort to move towards a probability approach to CPI sampling
2. indicators to monitor data collection process
Data collection: the net design
[Figure: network diagram with the following components: data collectors, PSTN or UMTS links, firewall, FTP server, web server, e-mail server, data server (Oracle DB), intranet, Istat CPI Office]
Different steps of data check and data quality indicators
- Data collection software
- UMTS data transmission for each outlet or data collection tour: first check and first set of data quality indicators on the web server (possibly real-time data in the outlet)
Different steps of data check and data quality indicators
- Second check on the total amount of monthly elementary data and second set of data quality indicators (MOS)
- Final check (the third one) on the total amount of elementary data coming from all the chief towns (Istat) and third set of data quality indicators
- Quarterly check concerning sampling
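The slides do not list the individual indicators, so the following is only a hypothetical illustration of what one check in such a system could compute on a batch of price quotations: the share of missing prices and the share of month-on-month changes above a threshold. The indicator definitions, the 50% threshold, the field names and the function name are all assumptions.

```python
def quality_indicators(quotes, max_change=0.5):
    """Compute two illustrative indicators for a batch of price quotations.

    Each quote is a dict with 'price' (None if not collected) and
    'previous_price' (price collected the month before).
    """
    total = len(quotes)
    missing = sum(1 for q in quotes if q["price"] is None)
    large_changes = 0
    for q in quotes:
        if q["price"] is None or not q["previous_price"]:
            continue  # no comparison possible for this quotation
        relative_change = abs(q["price"] / q["previous_price"] - 1.0)
        if relative_change > max_change:
            large_changes += 1
    return {
        "missing_rate": missing / total if total else 0.0,
        "large_change_rate": large_changes / total if total else 0.0,
    }


# Hypothetical batch transmitted after one data collection tour.
batch = [
    {"price": 1.20, "previous_price": 1.15},
    {"price": None, "previous_price": 2.50},
    {"price": 4.00, "previous_price": 2.00},
    {"price": 0.99, "previous_price": 0.99},
]
indicators = quality_indicators(batch)
print(indicators)
```

Values above an agreed tolerance would then be flagged for the price collector, the MOS or Istat, depending on the step at which the check runs.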
Different steps of data check and data quality indicators
A completely integrated data production process, where each event flagged by the system of indicators triggers actions to remove mistakes or their possible causes
Thank you for your attention
Federico Polidoro (Istat - Italy, polidoro_at_Istat.it)
Rosabel Ricci (Istat - Italy, roricci_at_Istat.it)
Anna Maria Sgamba (Istat - Italy, sgamba_at_Istat.it)