Sampling designs using the National Pupil Database

1
Sampling designs using the National Pupil Database
  • Some issues for discussion
  • by
  • Harvey Goldstein (University of Bristol)
  • Tony Fielding (University of Birmingham)

2
Size of data set
  • The data set already contains some 3000k
    longitudinal records and increases by 600k a
    year.
  • Reasonably complex analyses, e.g. value-added
    multilevel models, are already time-consuming.
  • Worth investigating the efficiency of sampling
    the database either as a whole or for specific
    subpopulations such as LEAs.
  • Traditional sampling theory can be used for
    simple statistics such as means or regression
    coefficients, and there is a literature on
    power calculations for multilevel models (see
    ESRC research project by Browne at Nottingham).

3
Special features of the NPD
  • The population characteristics are known and
    can be used for drawing efficient samples.
  • The possibility of an adaptive design exists,
    e.g.
      • Select a random subsample to determine
        relationships of interest (equivalent of a pilot
        study).
      • Fit a suitable model to estimate parameter values.
      • Choose parameters of interest together with their
        confidence intervals (CIs).
      • Increase sample size to establish the relationship
        between CI and sample size, and extrapolate to
        the sample size needed to achieve the required
        interval size.
  • Any statistic of interest (in addition to the CI) can
    be chosen.
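
The adaptive steps above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a simple linear regression as the "suitable model", a synthetic population standing in for the database, and a CI width that shrinks as c / sqrt(n). The names `slope_ci_width` and `required_sample_size`, the pilot sizes and the target width are all hypothetical.

```python
import math
import random
import statistics

def slope_ci_width(xs, ys, z=1.96):
    """Width of an approximate 95% CI for the slope of a simple linear regression."""
    n = len(xs)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(rss / (n - 2) / sxx)
    return 2 * z * se

def required_sample_size(population, pilot_sizes, target_width, rng):
    """Extrapolate from pilot subsamples, assuming CI width = c / sqrt(n)."""
    cs = []
    for n in pilot_sizes:
        xs, ys = zip(*rng.sample(population, n))
        # Each pilot gives one estimate of the constant c.
        cs.append(slope_ci_width(xs, ys) * math.sqrt(n))
    c = statistics.fmean(cs)
    return math.ceil((c / target_width) ** 2)

# Synthetic stand-in for the database: (prior attainment, outcome) pairs.
rng = random.Random(0)
population = []
for _ in range(50_000):
    x = rng.gauss(0, 1)
    population.append((x, 0.7 * x + rng.gauss(0, 0.7)))

# Pilot samples of increasing size, then extrapolate to a CI width of 0.02.
n_needed = required_sample_size(population, [500, 1000, 2000, 4000], 0.02, rng)
print(n_needed)
```

Any other statistic could replace the CI width, provided its dependence on sample size is estimated from the pilot samples rather than assumed.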

4
Complex designs and replication
  • For multilevel models and designs where interest
    focuses on special groups (e.g. low achievers) we
    need good choices of numbers of higher level
    units (schools) and numbers in the groups.
  • A similar adaptive approach can be used,
    evaluating CIs or significance levels as design
    parameters are altered.
  • We also have the opportunity of replicating an
    analysis by selecting an independent sample from
    the database.
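
As a rough illustration of evaluating a design while its parameters are altered, the sketch below compares two ways of allocating the same total number of pupils across schools. Everything in it is an assumption for illustration: the synthetic two-level population, the variance values, the statistic (a simple mean rather than a multilevel model) and the function names. Repeated independent draws stand in for the replication that the database makes possible.

```python
import random
import statistics

def make_population(n_schools, pupils_per_school, school_sd, pupil_sd, rng):
    """Synthetic two-level population: a list of schools, each a list of pupil scores."""
    population = []
    for _ in range(n_schools):
        effect = rng.gauss(0, school_sd)  # shared school-level effect
        population.append(
            [effect + rng.gauss(0, pupil_sd) for _ in range(pupils_per_school)]
        )
    return population

def two_stage_mean(population, n_schools, n_pupils, rng):
    """Mean score from a sample of n_schools schools with n_pupils pupils in each."""
    schools = rng.sample(population, n_schools)
    scores = [s for school in schools for s in rng.sample(school, n_pupils)]
    return statistics.fmean(scores)

def empirical_se(population, n_schools, n_pupils, rng, replications=200):
    """Standard deviation of the sample mean over repeated independent draws."""
    means = [
        two_stage_mean(population, n_schools, n_pupils, rng)
        for _ in range(replications)
    ]
    return statistics.stdev(means)

rng = random.Random(1)
population = make_population(500, 200, school_sd=0.5, pupil_sd=1.0, rng=rng)

# Same total of 2000 pupils, allocated differently across schools.
se_many_schools = empirical_se(population, 100, 20, rng)
se_few_schools = empirical_se(population, 20, 100, rng)
print(se_many_schools, se_few_schools)
```

With between-school variation present, spreading the sample over more schools gives the smaller standard error here; the same loop could track CIs or significance levels from a fitted model instead.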

5
Using all the data
  • When analysing a given sample we will also
    generally have available data related to the
    sample members, e.g.
      • School level averages for each pupil in a study
      • School level data for previous schools attended
      • School level data for previous years
      • LEA data for previous years
      • School data for neighbouring schools
  • All such data can be incorporated into a model,
    increasing the number of variables but not the
    sample size.
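
A small sketch of the first example, school level averages attached to sampled pupils, is given below. The record layout `(pupil_id, school_id, score)` and the synthetic data are assumptions for illustration, not the actual NPD structure.

```python
import random
import statistics
from collections import defaultdict

# Hypothetical pupil records: (pupil_id, school_id, score).
rng = random.Random(2)
database = [(i, i % 50, rng.gauss(50, 10)) for i in range(5000)]

# School level averages computed from the full database, not from the sample.
by_school = defaultdict(list)
for _, school_id, score in database:
    by_school[school_id].append(score)
school_mean = {s: statistics.fmean(v) for s, v in by_school.items()}

# Draw a sample of pupils and attach the school average as an extra variable.
sample = rng.sample(database, 200)
enriched = [
    {"pupil_id": pid, "school_id": sid, "score": score, "school_mean": school_mean[sid]}
    for pid, sid, score in sample
]
print(len(enriched), sorted(enriched[0]))
```

The sample still has 200 rows; only the number of columns available to the model has grown.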

6
Other possibilities
  • Poststratification using population
    distributions to re-weight statistics or to
    incorporate weights in model estimation.
  • Setting up an archive of results that may be
    useful for designing samples.
  • Using PLASC to select a research sample subject
    to appropriate permissions.
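
The poststratification idea in the first bullet can be sketched as follows for a simple mean. The strata labels, values and population shares are made up for illustration, and the function name `poststratified_mean` is hypothetical. The sketch assumes every stratum with a known population share appears at least once in the sample.

```python
def poststratified_mean(sample, population_shares):
    """Re-weight stratum means by known population proportions.

    sample: list of (stratum, value) pairs.
    population_shares: dict mapping stratum to its known population proportion.
    """
    totals, counts = {}, {}
    for stratum, value in sample:
        totals[stratum] = totals.get(stratum, 0.0) + value
        counts[stratum] = counts.get(stratum, 0) + 1
    return sum(
        share * totals[stratum] / counts[stratum]
        for stratum, share in population_shares.items()
    )

# Stratum B is over-represented in the sample relative to the population.
sample = [("A", 10), ("A", 12), ("B", 20), ("B", 22), ("B", 24), ("B", 26)]
shares = {"A": 0.5, "B": 0.5}

unweighted = sum(v for _, v in sample) / len(sample)
weighted = poststratified_mean(sample, shares)
print(unweighted, weighted)  # 19.0 17.0
```

The same stratum weights (population share divided by sample share) could instead be passed to a model-fitting routine that accepts case weights.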