Sampling designs using the National Pupil Database

Slides: 7

Provided by: Gold92

1
Sampling designs using the National Pupil Database

2
Size of data set

The data set already contains some 3000k
(3 million) longitudinal records and grows by
about 600k a year.
Reasonably complex analyses, e.g. value-added
multilevel models, are already time-consuming
to carry out.
It is worth investigating the efficiency of
sampling the database, either as a whole or for
specific subpopulations such as LEAs.
Traditional sampling theory can be used for
simple statistics such as means or regression
coefficients, and there is a literature on
power calculations for multilevel models (see
the ESRC research project by Browne at Nottingham).
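
As an illustration of the "traditional sampling theory" case, the sketch below computes the simple random sample size needed to estimate a mean to within a given confidence-interval half-width, with a finite population correction. The function name and all input values (standard deviation, half-width, a population of 600,000) are assumptions chosen for illustration, not figures from the NPD.

```python
import math

def sample_size_for_mean(sd, half_width, population_size, z=1.96):
    """Simple random sample size for a CI of +/- half_width around a mean."""
    # Sample size ignoring the finite size of the population.
    n0 = (z * sd / half_width) ** 2
    # Finite population correction: fewer units are needed when the
    # sample is a non-negligible fraction of the population.
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)

# Hypothetical inputs: score SD of 10 points, target half-width of
# 0.5 points, population of 600,000 pupils.
n_needed = sample_size_for_mean(sd=10.0, half_width=0.5, population_size=600_000)
print(n_needed)
```

With a population this large the correction changes the answer very little; it matters more when sampling within a small subpopulation such as a single LEA.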

3
Special features of the NPD

The population characteristics are known and
can be used for drawing efficient samples.
The possibility of an adaptive design exists,
e.g.
Select a random subsample to determine
relationships of interest (equivalent of a pilot
study)
Fit a suitable model to estimate parameter values
Choose parameters of interest together with their
confidence intervals
Increase sample size to establish relationship
between CI and sample size and extrapolate to
sample size needed to achieve required interval
size.
Any statistic of interest (in addition to the CI)
can be chosen.
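
The adaptive steps above can be sketched as follows. This is a minimal illustration on synthetic data, not the presenter's procedure: the population, the regression of an outcome on prior attainment, the sample sizes and the target half-width are all assumptions. The CI half-width for the slope is recorded at increasing sample sizes, a straight line is fitted on the log-log scale, and the fit is extrapolated to the size that should give the required interval.

```python
import math
import random

random.seed(1)

# Synthetic stand-in for the database: prior attainment x, outcome y.
N = 50_000
population = []
for _ in range(N):
    x = random.gauss(0, 1)
    population.append((x, 0.7 * x + random.gauss(0, 1)))

def slope_ci_half_width(sample, z=1.96):
    """Half-width of an approximate 95% CI for the OLS slope of y on x."""
    n = len(sample)
    mx = sum(x for x, _ in sample) / n
    my = sum(y for _, y in sample) / n
    sxx = sum((x - mx) ** 2 for x, _ in sample)
    sxy = sum((x - mx) * (y - my) for x, y in sample)
    b = sxy / sxx
    a = my - b * mx
    rss = sum((y - a - b * x) ** 2 for x, y in sample)
    se = math.sqrt(rss / (n - 2) / sxx)
    return z * se

# Grow the sample in stages and record the CI half-width at each size.
sizes = [250, 500, 1000, 2000]
drawn = random.sample(population, max(sizes))  # prefixes are nested random samples
widths = [slope_ci_half_width(drawn[:n]) for n in sizes]

# Fit log(width) = c + p * log(n) by least squares, then extrapolate.
lx = [math.log(n) for n in sizes]
ly = [math.log(w) for w in widths]
mlx, mly = sum(lx) / len(lx), sum(ly) / len(ly)
p = sum((u - mlx) * (v - mly) for u, v in zip(lx, ly)) / sum((u - mlx) ** 2 for u in lx)
c = mly - p * mlx

target = 0.02  # required CI half-width (hypothetical)
n_required = math.ceil(math.exp((math.log(target) - c) / p))
print(sizes, [round(w, 4) for w in widths], n_required)
```

For a simple statistic the fitted exponent `p` should be close to -0.5; for more complex models the same empirical approach applies, with `slope_ci_half_width` replaced by whatever statistic is of interest.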

4
Complex designs and replication

For multilevel models, and for designs where
interest focuses on special groups (e.g. low
achievers), we need to choose well the number of
higher-level units (schools) and the numbers in
the groups.
A similar adaptive approach can be used,
evaluating CIs or significance levels as design
parameters are altered.
We also have the opportunity of replicating an
analysis by selecting an independent sample from
the database.
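
A two-stage design with a replication sample might be drawn as in the sketch below. The database layout, the helper `two_stage_sample` and the design parameters (40 schools, 25 pupils per school) are hypothetical; the replication sample is taken from schools not used in the original sample so that the two do not overlap.

```python
import random

rng = random.Random(42)

# Hypothetical database: 200 schools, each a list of pupil IDs.
schools = {s: [f"s{s}-p{i}" for i in range(rng.randint(50, 150))] for s in range(200)}

def two_stage_sample(schools, school_ids, n_schools, n_pupils, rng):
    """Sample n_schools from school_ids, then up to n_pupils pupils per school."""
    chosen = rng.sample(sorted(school_ids), n_schools)
    return {s: rng.sample(schools[s], min(n_pupils, len(schools[s]))) for s in chosen}

# Original sample, then a replication sample from the schools not yet used.
original = two_stage_sample(schools, schools.keys(), 40, 25, rng)
remaining = set(schools) - set(original)
replicate = two_stage_sample(schools, remaining, 40, 25, rng)

print(len(original), len(replicate), set(original).isdisjoint(replicate))
```

Varying `n_schools` and `n_pupils` and refitting the model on each draw gives the design-parameter comparison described above.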

5
Using all the data

When analysing a given sample we will also
generally have available data related to the
sample members, e.g.
School level averages for each pupil in a study
School level data for previous schools attended
School level data for previous years
LEA data for previous years
School data for neighbouring schools
All such data can be incorporated into a model,
increasing the number of variables but not the
sample size.
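
As a small illustration of using data beyond the sample, the sketch below computes school-level averages from every pupil in a (hypothetical) database and attaches them as an extra variable to the sampled pupils only. The record layout and field names are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical full-database records: (pupil_id, school_id, prior_score).
database = [
    ("p1", "A", 24.0), ("p2", "A", 30.0), ("p3", "A", 27.0),
    ("p4", "B", 18.0), ("p5", "B", 22.0),
]

# School means are computed from ALL pupils, not just the sampled ones.
by_school = defaultdict(list)
for _, school, score in database:
    by_school[school].append(score)
school_mean = {s: mean(v) for s, v in by_school.items()}

# The analysis sample is a subset of pupils; each row gains a
# school-level variable without the sample size changing.
sample_ids = {"p2", "p4"}
sample_rows = [
    {"pupil": pid, "school": s, "prior": score, "school_mean_prior": school_mean[s]}
    for pid, s, score in database
    if pid in sample_ids
]
print(sample_rows)
```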

6
Other possibilities

Poststratification using population
distributions to re-weight statistics or to
incorporate weights in model estimation.
Setting up an archive of results that may be
useful for designing samples.
Using PLASC to select a research sample subject
to appropriate permissions.
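
The poststratification idea in the first point above can be illustrated with a re-weighted mean. The strata, population shares and scores below are made-up values; the sketch assumes every stratum appears in the sample at least once.

```python
# Known population shares by stratum (hypothetical values).
population_share = {"boys": 0.51, "girls": 0.49}

# Sample of (stratum, score); boys are under-represented here.
sample = [("boys", 50.0), ("girls", 60.0), ("girls", 62.0), ("girls", 58.0)]

def poststratified_mean(sample, population_share):
    """Weight each stratum's sample mean by its known population share."""
    totals, counts = {}, {}
    for stratum, value in sample:
        totals[stratum] = totals.get(stratum, 0.0) + value
        counts[stratum] = counts.get(stratum, 0) + 1
    return sum(population_share[s] * totals[s] / counts[s] for s in counts)

unweighted = sum(v for _, v in sample) / len(sample)
weighted = poststratified_mean(sample, population_share)
print(unweighted, weighted)
```

Because the population distribution is known exactly, the weights carry no sampling error of their own; the same weights could also be supplied to a model-fitting routine.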
