Collaborative Data Management for Longitudinal Studies

1
Collaborative Data Management for Longitudinal Studies
  • Stephen Brehm
  • Coauthors: L. Philip Schumm, Ronald A. Thisted
  • University of Chicago
  • (Supported by National Institute on Aging Grant P01 AG18911-01A1)

2
Agenda
1. Background on Study
2. Problem: Data Management Deficiencies
3. Solution: Collaborative Data Management
4. STATA Programs: maketest, makedata

3
Background on Study
  • NIH-funded Longitudinal Study
      • Loneliness and Health
  • Thousands of Measures
      • Loneliness
      • Depression
  • 230 subjects
  • Repeated Yearly

4
Problem: Data Management Deficiencies
  • Code Not Modular
      • Difficult to manage the data cleaning code
      • Limited code reuse from year to year
      • Difficult to collaborate among interns
  • No Established Set of Data Cleaning Steps
      • Difficult for research assistants (turnover)
      • Inconsistent data cleaning techniques
      • Data cleaning code difficult to read

5
Problem: Data Management Deficiencies
[Diagram: a single Core File Set surrounded by five Research Assistants]

6
Solution: Collaborative Data Management
  • Process
      • Established Steps
      • File System Layout
      • Automated Tests
      • Collaboration
  • Concepts
      • Module
      • Batch
      • Data Certification
  • STATA Programs
      • maketest
      • makedata

7
Solution: Collaborative Data Management
  • Process
      • Established Steps
      • File System Layout
      • Automated Tests
      • Collaboration
  • Concepts
      • Module (Ex: loneliness)
      • Batch
      • Data Certification
  • STATA Programs
      • maketest
      • makedata

8
Solution: Collaborative Data Management
  • Process
      • Established Steps
      • File System Layout
      • Automated Tests
      • Collaboration
  • Concepts
      • Module (Ex: loneliness)
      • Batch (Ex: yr1, yr2, yr3)
      • Data Certification
  • STATA Programs
      • maketest
      • makedata

9
Solution: Collaborative Data Management
  • Set of Files for Each Module
      • acquire-module.do
      • fix-module.do
      • test-module.do
      • derive-module.do
      • label-module.do
[Diagram: the five steps (Acquire, Fix, Derive, Test, Label) grouped into "Year-Specific" files and "Files Shared Between Years", captioned "60% Code Reuse"]

10
STATA Program: maketest
  • Purpose
      • Auto-generation of Data Certifying Tests
  • Functionality
      • Tests Variable Type
      • Checks Consistency of Value Labels
      • Verifies Existence of Variable
11
STATA Program: maketest
  • Syntax
      • maketest [varlist] using filename [, REQuire(varlist) append replace]
  • Example
      • maketest using filename.do, replace
  • Options
      • using: specifies file to write
      • REQuire: requires presence of variables in list
      • append: add to existing test .do file
      • replace: overwrite existing .do file
12
STATA Program: makedata
  • Bringing it all together

13
STATA Program: makedata
  • Syntax
      • makedata namelist [, Pattern(string) replace clear Noisily Batch(namelist) TESTonly]
  • Example
      • makedata ats, p("acquire-.do") b(yr1) clear replace
  • Options
      • p: pattern (file naming convention)
      • replace: overwrite existing data file
      • clear: clear current data in memory
      • Noisily: full output (default: summary)
      • b: batch (year, wave, center)
      • TESTonly: only run tests step
14
Other Applications
  • Beyond Longitudinal Data
  • Teaching Data Cleaning with STATA
Contact Information
  • Stephen Brehm
      • sbrehm_at_uchicago.edu
  • L. Philip Schumm
      • pschumm_at_uchicago.edu
  • Ronald A. Thisted
      • thisted_at_health.bsd.uchicago.edu
  • Supported by National Institute on Aging Grant P01 AG18911-01A1