Title: Microdata access in practice
1 Microdata access in practice
2 Overview
- Concerns
- Conceptual and practical concerns
- International practice
- UK experience
- Key lessons
3 Conceptual concerns
- Flexibility
- Convenience
- Confidentiality
- Practicality
- Scalability
- Cost
4 Practical considerations
- Location
- on-site laboratories
- distributed centres
- local access
- Data management
- distributed vs centralised
- Processing facility
- fat vs thin clients
- Remote job submission
5 International practice: social data
- Characteristics
- easy to anonymise usefully
- unlinkable
- dominate microdata research
- Accessed through
- anonymised files with almost unrestricted release
- scientific use, CURF, etc
- identifiable data with limited release (e.g. special licence, on-site lab, remote access, remote job submission)
- released with identifying variables for NSI work only
- easily identified observations typically not useful statistically
6 International practice: business data
- almost always restricted/zero access
- identifying characteristics often useful ones
- data typically identifiable, even in scientific use files
- no access is the international norm
- where access is provided
- on-site labs and special licences dominate
- moves towards centralised thin-client systems (UK, Denmark, Sweden, Netherlands, Slovenia, US)
- local access in Scandinavia
- Four main areas of development
- making useful anonymous files (Canada, Germany)
- synthetic data (US)
- remote job submission (Australia, NZ, US)
- remote access (non-NSI sites) through thin-client systems
7 International practice: health and Census data
- share characteristics with business and social data
- identifying characteristics often useful ones
- Census presents special problems because of inclusion probability
- large variations on confidentiality within and across countries
- often not collected by NSI
- in general treated like business data
8 UK experience: the strategy
[Diagram: access routes arranged on a scale from "More confidential, more secure" to "Less confidential, easier access".]
- Access routes labelled: No release; Virtual microdata laboratory; Remote job submission; Special licence; UKDA; Web
- Data labelled: Business data, Census data; Census, health data, OGD access to business data; Not anonymised; GHS, LFS; Aggregate data
9 UK experience: the VML
- Limited lab experience
- Thin clients used to simulate on-site laboratory
- cost
- security
- flexibility
- ease of management
- Strict technical regime to ensure confidentiality
- Practicality of servicing researchers through
- training
- shifting of responsibility
- limited support
10 Lessons learned
- Use the law intelligently
- challenge unhelpful interpretations
- use laws actively to support procedures
- Demonstrate benefits soon, clearly, continuously
- Running a lab
- Practising researchers design and manage lab
- Sort out rules in advance
- especially confidentiality
- actively involve users
- Continual development in operations and principles
11 - Felix Ritchie
- felix.ritchie_at_ons.gov.uk
- bdl_at_ons.gov.uk