1

Estimation of Standard Errors of Indices in the
Sampling Business Surveys
  • Rudi Seljak
  • Statistical Office of the Republic of Slovenia

2
Overview of the presentation
  • Introduction of the problem
  • Description of the simulation study
  • Results of the study
  • Conclusions

3
Concept of index numbers
  • The concept of index numbers is widely used in
    economic statistics, especially in the area of
    short-term statistics.
  • Generally, an index number is a ratio of two or
    more quantities measured in the same unit.
  • For the purposes of this study, we will consider
    the index as a ratio of two or more quantities
    measured at two different time points.

4
Different types of indices
  • Aggregate approach
  • Weighted average approach
  • Chained index

5
Value index
  • The value index is derived from the general form
    of the aggregate index, $I = V_t / V_0$.
  • If the values $V_t$, $V_0$ are estimated on the
    basis of a random sample, we can write the value
    index as a statistical estimator,
    $\hat{I} = \hat{V}_t / \hat{V}_0$.

6
Value index cont.
  • The two totals are generally estimated from two
    different samples, which were selected from two
    different populations.
  • This departs from the classical ratio estimation
    problem, in which both totals are estimated from
    the same sample.

7
Target population of the study
  • The simulation study was based on monthly data
    from the tax authorities for enterprises in
    sections G-K of the NACE classification for
    2004-2005.
  • From these data we were able to obtain a fairly
    good approximation of turnover, which is often
    the target variable in short-term surveys.
  • The target population consisted of approximately
    18 000 units in each month.

8
Sampling design
  • For the first month, a stratified simple random
    sample was selected.
  • The second sample was selected by rotating part
    of the units out of the first sample and
    replacing them with new units.
  • For the rotation procedure the system of
    permanent random numbers was used.
  • The stratification was done according to the
    2-digit NACE group and size class, determined on
    the basis of estimated turnover.
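
The rotation step can be illustrated with a minimal sketch of one common permanent-random-number (PRN) scheme. The slides do not describe the exact scheme used, so the function names, the circular ordering and the shifting of the starting point are all assumptions made for illustration. In a stratified design the same procedure would be applied within each stratum.

```python
import random

def prn_sample(prns, n, start=0.0):
    """Select the n units whose PRNs follow `start` on the unit circle.

    `prns` maps unit id -> permanent random number in [0, 1).
    """
    ordered = sorted(prns, key=lambda u: (prns[u] - start) % 1.0)
    return ordered[:n]

def rotate(prns, n, start, rate):
    """Rotate out round(rate * n) units by shifting the starting point.

    Returns (new_start, new_sample). Hypothetical helper, not the
    author's implementation; full replacement (k >= n) is not handled.
    """
    current = prn_sample(prns, n, start)
    k = round(rate * n)  # number of units rotated out
    if not 0 < k < n:
        return start, current
    # Start the new selection at the PRN of the first unit that stays.
    new_start = prns[current[k]]
    return new_start, prn_sample(prns, n, new_start)

# Toy stratum of 100 units, sample of 20, rotation rate 0.2.
rng = random.Random(1)
prns = {unit: rng.random() for unit in range(100)}
old_sample = prn_sample(prns, 20, 0.0)
new_start, new_sample = rotate(prns, 20, 0.0, 0.2)
overlap = set(old_sample) & set(new_sample)
```

Because every unit keeps its random number, shifting the starting point drops the first units of the old sample and brings in the next units in PRN order, which gives a controlled overlap between the two samples.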

9
Sampling design cont.
  • All the large enterprises were selected with
    certainty.
  • The target statistic was the estimator of the
    turnover index, which is built from
  • the sampling weights for the current and the
    base month
  • the turnover of the selected units in the
    current and the base month
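
As a minimal sketch, the estimator can be computed as a ratio of two weighted totals. The argument names below are hypothetical, since the symbols used on the original slide are not preserved in this transcript.

```python
def weighted_total(weights, values):
    """Estimated total: sum of weight * value over the sampled units."""
    return sum(w * y for w, y in zip(weights, values))

def index_estimate(w_current, y_current, w_base, y_base):
    """Estimated turnover index: current-month total over base-month total.

    The two samples may contain different units, so each month has its
    own weights and its own turnover values.
    """
    return weighted_total(w_current, y_current) / weighted_total(w_base, y_base)

# Toy data: two sampled units in each month.
example = index_estimate([2, 2], [10, 20], [3, 1], [10, 20])
```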

10
Simulation
  • The object of the simulation study was to explore
    the sampling distribution of the estimator and to
    compare three different methods for variance
    estimation.
  • The parameters in the simulation study were
  • size of the samples
  • time gap between the two selected months
  • the rotation rate of the second sample

11
Methods for variance estimation
  • Three methods for variance estimation were tested
    and compared. Two of them are based on the
    analytical approach. The third one uses the
    re-sampling approach.
  • The basis for the analytical approaches was the
    Taylor linearization formula for the variance
    of the ratio estimator.
  • Using the SAS procedure SURVEYMEANS we were able
    to estimate the variances of the two estimated
    totals, but the procedure doesn't provide an
    estimate of the sampling covariance.
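
For reference, the standard first-order Taylor linearization for the variance of a ratio of two estimated totals is usually written as follows. The notation is assumed here for illustration and may differ from the formula on the original slide.

```latex
\widehat{\mathrm{Var}}(\hat{I}) \approx \frac{1}{\hat{V}_0^{2}}
\left[ \widehat{\mathrm{Var}}(\hat{V}_t)
     + \hat{I}^{2}\,\widehat{\mathrm{Var}}(\hat{V}_0)
     - 2\,\hat{I}\,\widehat{\mathrm{Cov}}(\hat{V}_t, \hat{V}_0) \right],
\qquad \hat{I} = \frac{\hat{V}_t}{\hat{V}_0}
```

The two variance terms are the quantities a standard survey procedure can deliver. The covariance term has to be obtained separately.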

12
Methods for variance estimation cont.
  • With the first approach we estimated the sampling
    covariance by using the formula for the variance
    of the sum of two random variables.
  • With this approach we also had to construct a
    common weight for the sum, using an equation
    given on the original slide.
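
The identity behind the first approach is Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y), which can be solved for the covariance. The sketch below only checks this identity on toy numbers. The function names are hypothetical, and it does not reproduce the common weight the author constructed.

```python
def covariance_from_sum(var_sum, var_x, var_y):
    """Solve Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y) for Cov(X, Y)."""
    return (var_sum - var_x - var_y) / 2.0

def sample_var(values):
    """Unbiased sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

def sample_cov(xs, ys):
    """Unbiased sample covariance, used only to check the identity."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Toy data: the indirect and the direct computation should agree.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 1.0, 4.0, 3.0]
sums = [x + y for x, y in zip(xs, ys)]
indirect = covariance_from_sum(sample_var(sums), sample_var(xs), sample_var(ys))
direct = sample_cov(xs, ys)
```

This is why a procedure that reports only variances is enough for this approach: it is run once on each total and once on their sum.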

13
Methods for variance estimation cont.
  • With the second approach the sampling covariance
    was calculated directly, using the formula for
    the sampling covariance of estimates from two
    partly overlapping samples.
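
The slide's own formula is not preserved in this transcript. As an illustration only, a textbook result for the simplest setting is given below: two simple random samples without replacement, of sizes $n_1$ and $n_2$, drawn from the same population of size $N$ and sharing $n_c$ units. It should not be read as the formula the author used, which would also have to account for stratification and for changes in the population.

```latex
\mathrm{Cov}(\bar{y}_{1}, \bar{y}_{2})
  = \left( \frac{n_c}{n_1 n_2} - \frac{1}{N} \right) S_{y_1 y_2},
\qquad
S_{y_1 y_2} = \frac{1}{N-1} \sum_{i=1}^{N}
  (y_{1i} - \bar{Y}_1)(y_{2i} - \bar{Y}_2)
```

With complete overlap ($n_c = n_1 = n_2 = n$) this reduces to the usual $(1/n - 1/N)\,S_{y_1 y_2}$. With no overlap the covariance is $-S_{y_1 y_2}/N$.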

14
Methods for variance estimation cont.
  • With the third approach we used the well-known
    jackknife replication method.
  • To use the existing software, we had to merge
    the two samples into one data set and then
    adjust the data accordingly.
  • The weights from both samples had to be
    combined into a common weight.
  • Zero values were inserted for the units which
    were not in the sample at a given time point.
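
A minimal sketch of the idea follows, using a plain delete-one jackknife on a merged data set. It ignores stratification and the finite population correction, and the record layout (common weight, current turnover, base turnover) is an assumption rather than the author's actual setup.

```python
def ratio(records):
    """Index estimate from merged records (weight, y_current, y_base)."""
    numerator = sum(w * y_cur for w, y_cur, _ in records)
    denominator = sum(w * y_base for w, _, y_base in records)
    return numerator / denominator

def jackknife_variance(records):
    """Delete-one jackknife variance of the ratio (unstratified sketch).

    Reweighting the remaining units by n / (n - 1) cancels in the ratio,
    so each replicate simply drops one record.
    """
    n = len(records)
    replicates = [ratio(records[:i] + records[i + 1:]) for i in range(n)]
    mean_rep = sum(replicates) / n
    return (n - 1) / n * sum((r - mean_rep) ** 2 for r in replicates)

# Merged toy data: a zero marks a unit absent from that month's sample.
merged = [
    (2.0, 10.0, 0.0),   # only in the current-month sample
    (2.0, 22.0, 20.0),  # in both samples
    (2.0, 33.0, 30.0),  # in both samples
    (2.0, 0.0, 40.0),   # only in the base-month sample
]
variance = jackknife_variance(merged)
```

The inserted zeros let a single-sample routine handle both months at once. Units present in only one sample contribute to only one of the two totals.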

15
Sampling distribution
  • To explore the sampling distribution of the
    estimated index, we selected 10 000 pairs of
    partly overlapping samples, using the same
    sampling design.
  • The parameters varied were the size of the
    samples, the time lag between the two time
    points and the rate of rotation in the second
    sample.
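
The structure of such a simulation can be sketched as follows. It is a simplified illustration that assumes simple random sampling from one fixed population with synthetic data. It does not use the stratified design or the tax data of the study, and all names are hypothetical.

```python
import random
import statistics

def simulate_index(y_base, y_current, n, rate, reps, seed=0):
    """Draw `reps` pairs of partly overlapping samples and return the
    estimated index for each pair.

    Both samples have size n and round(rate * n) units are rotated out.
    With equal weights the weights cancel in the ratio.
    """
    rng = random.Random(seed)
    units = list(range(len(y_base)))
    k = round(rate * n)  # number of rotated units
    estimates = []
    for _ in range(reps):
        drawn = rng.sample(units, n + k)  # random order, no repeats
        first = drawn[:n]    # base-month sample
        second = drawn[k:]   # current-month sample, shares n - k units
        total_current = sum(y_current[i] for i in second)
        total_base = sum(y_base[i] for i in first)
        estimates.append(total_current / total_base)
    return estimates

# Synthetic population of 500 units with a true index close to 1.1.
rng = random.Random(42)
y_base = [rng.uniform(50, 150) for _ in range(500)]
y_current = [1.1 * v + rng.uniform(-5, 5) for v in y_base]
estimates = simulate_index(y_base, y_current, n=100, rate=0.2, reps=200, seed=1)
true_index = sum(y_current) / sum(y_base)
empirical_variance = statistics.variance(estimates)
```

A histogram of `estimates` approximates the sampling distribution, and `empirical_variance` gives a benchmark against which variance estimates from single samples can be compared.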

16
Sampling distribution cont.
  • We expected to get a distribution that would be
    at least approximately normal.
  • In most cases this was indeed so, as the
    histogram of the estimates showed.

18
Sampling distribution cont.
  • In some cases, however, the normal shape
    disappeared and we were faced with a clearly
    bimodal distribution.

20
The problem of bimodality
  • The source of the bimodality is the distribution
    of the estimator of the total, either in the
    denominator or in the numerator.
  • In all the cases where bimodality appeared, at
    least one of the estimators of the totals was
    bimodally distributed.
  • So far we have not been able to find the exact
    reason for the bimodality. It should be the
    subject of further investigation.

21
Comparison of the methods: the case of normal
distribution
  • When the sampling distribution was
    (approximately) normal, all the methods worked
    quite well.
  • The pictures show the case where the months and
    the sample size were fixed and only the rotation
    rate was varied.

27
Comparison of the methods: the case of bimodal
distribution
  • In the case of a bimodal sampling distribution
    the sampling variance was significantly higher.
  • The pictures show the case where the sample size
    was fixed at 4000 and the rotation rate at 0.2,
    and the time lag between the months was varied.

33
Conclusions
  • The second analytical method performs slightly
    better, but the advantage of the first method is
    that it requires less tailor-made programming.
  • The jackknife method slightly overestimates the
    variance, but we believe this bias stems from
    the technical adjustments needed to apply the
    method and could be reduced.
  • The variability of the estimates is the lowest in
    the case of the jackknife method.

34
Conclusions cont.
  • The problem of bimodality should be investigated
    further.
  • A bimodal sampling distribution can cause serious
    instability in the procedure of variance
    estimation.
  • Bimodality of the distribution may require a
    different interpretation of the sampling
    variation.