Title: Using Health Related Data Sets for Research
1Using Health Related Data Sets for Research
Jon Mark Hirshon, MD, MPH Committee Chair
2Presentation Outline
- Introduction
- Emergency Medicine and Public Health
- Definitions
- What are Health Related Data Sets (HRDSs)?
- Examples of HRDSs
- HRDSs Research Potential
- NHAMCS by Dr. Margaret Warner
- Future Directions and Challenges for Research Use
of HRDSs
3Working Group
- Daniel Andersen, PhD
- Jon Mark Hirshon, MD, MPH
- Charlene Irvin, MD
- Linda McCaig, MPH
- Richard Niska, MD, MPH
- Gordon S. Smith, MBChB, MPH
- Margaret Warner, PhD
4Why Emergency Medicine and Public Health?
- Institute of Medicine has stated:
- Strengthening the relationship between public
health and medicine is critical to address the
health needs of the public
- Emergency departments (EDs) play a vital role in
our health care system.
- In part due to their key location at the interface
between the populace and healthcare
5EDs Data Collection
- Emergency Departments are well positioned to
collect data concerning major public health,
medical and social problems
- Improved technology and governmental programs
have dramatically increased information
availability
- Can be viewed as a window on the health and
social needs of the community because EDs are a
key healthcare access entry point
6Research Based on ED Data
- Emergency physicians, researchers and other
health care professionals must understand and use
the data collected from EDs and related systems
to
- Better understand patterns of health resource
utilization
- Recognize worsening public health problems
- Help in the development of
- More focused research activities
- More thoughtful and efficacious interventions
7What are Health Related Data Sets (HRDSs)?
- No standard definition found within the
literature
- Working Definition
- Any data that are created as a result of patient
interactions with the health care system or
that relate to interactions with the health care
system
- Includes routine physician visits, hospital
admissions, prescription purchases, but also
vital statistics and law enforcement records
8Health Services Data Sets
- A specific subset of HRDSs is Health Services Data
Sets
- Large amount of literature related to Health
Services Data Sets
- Does not include all HRDSs
- E.g. law enforcement records, longitudinal
surveys
- Defined as
- Extant data that can serve as a proxy measure of
the health of populations
9Broad Definition of HRDSs
- Any systematic collection of any information
related to health care, including
- related costs
- services (pharmacy, physician, ambulance or
transport, hospital, clinic, school, prison,
other institution)
- health related surveys or studies
- information from health insurance providers.
10Importance of HRDSs
- Opportunity to conduct epidemiological studies on
population health
- Evaluate the determinants and distribution of
diseases
- Potential to study issues related to health
disparities and health care for the underserved
- Conduct public health surveillance
- Potential to link data from multiple sources
11Limitations of HRDSs
- Frequently based upon existing data
- Usually appropriate for retrospective studies
- Can be costly to create and maintain
- Many HRDSs already exist and are readily
available, though some may require a nominal fee
- Important to maintain patient confidentiality
- Only a limited data set may be publicly
available
12Barriers to Use of HRDSs
- Concerns related to
- Completeness, accuracy, and timeliness of data
- Generalizability
- May require specific analytical skills to use
- IRB approval
- Generally not an issue since most data sets use
de-identified data
- Each HRDS may have unique limitations or barriers
to use
13Types of Health Related Data Sets
- Complete Enumeration
- Population-based Sample Surveys
- Non-population Based Registries
- Longitudinal Surveys
- Linked Data
14Complete Enumeration
15Population Based Sample Surveys
16Non-Population Based Registries
17Longitudinal Surveys
18Linked Data
19What is the research potential of HRDSs?
- Explore a specific example
- National Hospital Ambulatory Medical Care Survey
(aka NHAMCS)
- Margaret Warner, PhD
- National Center for Health Statistics
- Centers for Disease Control and Prevention
20Overview of the NHAMCS
- Margaret Warner PhD
- National Center for Health Statistics
- Consensus Conference
- May 13, 2009
21Overview
NHAMCS is a national probability sample survey of
visits to emergency departments (EDs) of
non-Federal, short-stay, and general hospitals in
the United States.
- How are the NHAMCS data used?
- NHAMCS Survey Methodology
- Data user considerations
- Accessing the data
22How are NHAMCS data used?
- Changes in utilization and practice
- diagnoses, tests/procedures, prescribing
- Quality of care
- Impact of performance measures and educational
campaigns
- Healthy People 2010 objectives
- Health disparities
- Adoption/Diffusion of new technologies
24Data users
- Medical associations
- Government agencies
- Institute of Medicine
- Health services researchers
- University and medical schools
- Broadcast and print media
26The IOM report
27Trends in emergency department visits, number of
hospitals, and number of emergency departments in
the United States, 1994-2004
- Kellermann A. N Engl J Med 2006;355:1300-1303
28 Percentage of ED visits at which an opioid was
prescribed by pain severity and race
[Figure: panels for 1997-2000 and 2003-2005]
NOTE: Pain severity was not collected in
2001-2002.
SOURCE: Wilper AP et al. Health Affairs. 2008 Jan-Feb:w84-w93.
29Age-adjusted injury visit rates using alternative
definitions of injury for state benchmarking
[Figure: y-axis, visits per 10,000 population (900.0 to 1,500.0); x-axis, year (1995 to 2005)]
SOURCES: CDC/NCHS, NHAMCS-ED, 1995-2005 data
files and NEISS-AIP data from WISQARS.
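As a rough illustration of what "age-adjusted" means for a visit rate, the sketch below uses direct standardization: each age-specific rate is weighted by that age group's share of a standard population. The slide does not say which adjustment method or standard population was used, so the method, the age groups, the counts, and the weights here are all hypothetical.

```python
def age_adjusted_rate(visits, population, standard_weights, per=10_000):
    """Directly standardized rate: weighted sum of age-specific rates.

    All three arguments are dicts keyed by age group. The weights are the
    standard population's share in each group and must sum to 1.
    """
    if abs(sum(standard_weights.values()) - 1.0) > 1e-9:
        raise ValueError("standard weights must sum to 1")
    return sum(
        standard_weights[group] * visits[group] / population[group] * per
        for group in standard_weights
    )


def crude_rate(visits, population, per=10_000):
    """Unadjusted rate: all visits divided by the whole population."""
    return sum(visits.values()) / sum(population.values()) * per


# Hypothetical counts and standard weights, for illustration only.
visits = {"0-17": 150, "18-64": 200, "65+": 120}
population = {"0-17": 1_000, "18-64": 2_000, "65+": 1_000}
standard_weights = {"0-17": 0.25, "18-64": 0.60, "65+": 0.15}

adjusted = age_adjusted_rate(visits, population, standard_weights)
crude = crude_rate(visits, population)
print(f"age-adjusted: {adjusted:.1f}, crude: {crude:.1f} (per 10,000)")
```

The two rates differ because the hypothetical population's age mix differs from the standard one; adjusting removes that difference when comparing years or states.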
30NHAMCS Methodology
31NHAMCS Sample design
- 112 geographic PSUs
- 500 hospitals
- 400 EDs and 250 OPDs
- 37,000 ED and 35,000 OPD visits
- 4-week reporting period
32NHAMCS Scope
- General medicine, surgery, pediatrics, ob/gyn,
substance abuse, and other clinics are in-scope
- Ancillary services are out of scope
33 Data collection
- Gaining cooperation
- Advance letters
- Endorsement letters
- Public relations materials
- Data collection procedures
- Induction visit by Census field representative
(FR)
- FR training of office/hospital staff
34 2009 NHAMCS Patient Record Form
35Items available on the public use file
- Patient characteristics
- age, race, sex, ethnicity
- Visit characteristics
- reason for visit, diagnosis, medication
- Provider characteristics
- physician specialty, hospital ownership
- Contextual variables based on patient zip code
- percent in poverty, median household income, percent of adults with a
bachelor's degree or higher, urban/rural
36Data processing
- Data are coded and keyed by SRA International
- Quality control procedures
- Edit checks by NCHS
37Coding systems used
- A Reason for Visit Classification (NCHS)
- ICD-9-CM
- diagnoses
- external causes of injury
- procedures
- Drug coding system (NCHS)
38Data User Considerations
39Encounter vs. person data
- NHAMCS is a record-based survey
- Estimates are in terms of visits and not persons
- Not a population-based survey (unlike NHIS)
- Cannot calculate incidence or prevalence rates
from NHAMCS estimates
40Sample weight
- Sample data MUST be weighted to produce national
estimates
- Estimation process
- Adjusts for survey and item nonresponse
- Makes several ratio adjustments within and across
physician specialties and hospitals
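A minimal sketch of why the weights matter: each sampled visit stands for many visits nationally, so totals and percentages come from summing weights rather than counting records. The record layout, field names, and numbers below are hypothetical and are not the actual public-use file layout.

```python
from dataclasses import dataclass


@dataclass
class VisitRecord:
    """One sampled ED visit (hypothetical, simplified layout)."""
    weight: float         # number of national visits this record represents
    injury_related: bool  # illustrative analysis variable


def weighted_total(records, predicate=lambda r: True):
    """National estimate: sum of weights over records matching the predicate."""
    return sum(r.weight for r in records if predicate(r))


def weighted_percent(records, predicate):
    """Weighted share of visits matching the predicate, in percent."""
    total = weighted_total(records)
    if total == 0:
        raise ValueError("no weighted visits")
    return 100.0 * weighted_total(records, predicate) / total


# Hypothetical sample of three records with unequal weights.
sample = [
    VisitRecord(weight=3000.0, injury_related=True),
    VisitRecord(weight=2500.0, injury_related=False),
    VisitRecord(weight=4500.0, injury_related=False),
]

national_visits = weighted_total(sample)
injury_pct = weighted_percent(sample, lambda r: r.injury_related)
unweighted_pct = 100.0 * sum(r.injury_related for r in sample) / len(sample)
print(national_visits, injury_pct, round(unweighted_pct, 1))
```

In this toy sample the unweighted share (one record in three) overstates the weighted share, because the injury record carries a smaller-than-average weight.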
41Sampling error
- NHAMCS is not a simple random sample
- Clustering effects
- Providers within PSUs
- Visits within physician practice or hospital
- Must use a generalized variance curve or special
software (e.g., SUDAAN) to calculate standard errors (SEs) for all
estimates, percents, and rates
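To show why clustering changes the standard error, here is a simplified sketch of a between-PSU ("ultimate cluster") variance estimator for a weighted total. It is a stand-in for what survey software does, not the NCHS procedure; the tuple layout and the stratum and PSU labels are hypothetical, and a real analysis should use the design variables on the file with survey software.

```python
from collections import defaultdict
from math import sqrt


def clustered_se_of_total(records):
    """Estimate a weighted total and its standard error.

    records: iterable of (stratum, psu, weight, y) tuples.
    Variance is computed from the spread of weighted PSU totals within
    each stratum (with-replacement approximation), so visits from the
    same PSU are not treated as independent observations.
    """
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, weight, y in records:
        psu_totals[stratum][psu] += weight * y

    estimate = 0.0
    variance = 0.0
    for stratum, totals in psu_totals.items():
        values = list(totals.values())
        n = len(values)
        if n < 2:
            raise ValueError(f"stratum {stratum!r} needs at least 2 PSUs")
        mean = sum(values) / n
        estimate += sum(values)
        variance += n / (n - 1) * sum((t - mean) ** 2 for t in values)
    return estimate, sqrt(variance)


# Hypothetical data: y = 1 if the visit has the characteristic of interest.
records = [
    ("A", "psu1", 100.0, 1), ("A", "psu1", 100.0, 0),
    ("A", "psu2", 100.0, 1), ("A", "psu2", 100.0, 1),
    ("B", "psu3", 50.0, 1),
    ("B", "psu4", 50.0, 0), ("B", "psu4", 50.0, 1),
]
estimate, se = clustered_se_of_total(records)
print(estimate, se)
```

All of the variance in this toy example comes from stratum A, where the two PSU totals differ; stratum B contributes none because its PSU totals are equal.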
42Reliability criteria
- Estimate considered reliable if
- Estimate is based on 30 raw cases or more and
- Relative standard error (RSE) of the estimate is
less than 30 percent
- Combine multiple years of data to increase
reliability
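The two criteria above can be checked mechanically. The sketch below assumes the RSE is the standard error divided by the estimate, expressed as a percent; the function name and the example numbers are hypothetical.

```python
def is_reliable(raw_cases, estimate, standard_error,
                min_cases=30, max_rse_pct=30.0):
    """Apply both criteria: enough raw (unweighted) cases and a small RSE."""
    if estimate == 0:
        return False  # RSE is undefined for a zero estimate
    rse_pct = 100.0 * standard_error / abs(estimate)
    return raw_cases >= min_cases and rse_pct < max_rse_pct


# Hypothetical estimates: (raw cases, weighted estimate, standard error).
enough_cases_low_rse = is_reliable(45, 120_000, 24_000)   # RSE = 20 percent
too_few_cases = is_reliable(25, 120_000, 12_000)          # only 25 raw cases
rse_at_threshold = is_reliable(45, 120_000, 36_000)       # RSE = 30 percent
print(enough_cases_low_rse, too_few_cases, rse_at_threshold)
```

An RSE of exactly 30 percent fails here because the slide's criterion is "less than 30 percent". Pooling several years of data raises the raw case count and usually lowers the RSE, which is why combining years is suggested.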
43Nonsampling error
- Frame coverage
- Reporting and processing errors
- Biases due to survey and item nonresponse
- Incomplete responses
44Minimizing nonsampling error
- Improve sample frame for better coverage
- Encourage uniform reporting and eliminate
ambiguities
- Pretest survey items and procedures
- Perform quality control procedures, including consistency
and edit checks
- Train Census field representatives
45NHAMCS Response rates
ED
46Attempts to improve response rate
- Publicity
- Eliminating questions that have a high item
non-response
- Methodological studies
- PR material
47HIPAA
- No directly identifiable information collected
- PHS Act 308(d) / Title 15
- Data Use Agreement w/ Limited Dataset
- IRB approval w/ waiver of patient authorization
- Accounting Document
48Accessing the data
49Microdata files
- Downloadable files
- NHAMCS, 1992-2006
- CD-ROMs
- NHAMCS, 1992-2005
- Tapes/cartridges (NTIS)
- NHAMCS, 1992-1997
50Tools to access public-use files
- SAS input statements, variable labels, value
labels, and format assignments for 1993-2006
- SPSS syntax files
- STATA .do and .dct files for 2002-2006
51Other tools to access data
52Accessing non-public use data
- Research Data Center
- Access to information not available on public use
files
- Patient: zip code linked income, education, or
urbanicity status
- Provider: physician gender and age, board
certification, teaching hospital, medical school
affiliation, ED size
- Supplement data: CCSS
- Geographic: state and county FIPS codes
53Data Center rules
- Submit a proposal
- Cannot use data to identify patients or providers
or geographic location of providers
- Cannot remove data files
- Fee: onsite / remote / file construction
- 2 NCHS RDCs: Hyattsville, MD; Atlanta, GA
- 9 Census Bureau RDCs are located in Boston, MA;
Berkeley, CA; Los Angeles, CA; Washington, DC;
Chicago, IL; Ann Arbor, MI; New York, NY; Ithaca,
NY; and Durham, NC.
54I need more information on NHAMCS data
- Call Ambulatory Hospital Care Statistics Branch
at (301) 458-4600
- Public Use Documentation
- or
55 http://www.cdc.gov/nchs/about/major/ahcd/ahcd1.htm
56Thank You
- Margaret Warner
- mwarner_at_cdc.gov
57Future Directions and Challenges
- Future: Potential for significant growth with
electronic medical records
- Increased data sets
- More timely data
- Increased access
- Challenge: need to increase usefulness,
especially for evaluating prevention and
screening programs
58Research Recommendation 1
- For applicable datasets, electronically link ED
health care visits longitudinally to future
health outcomes, including costs and other
financial implications, while maintaining
de-identification of the data.
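One possible way to link records without carrying identifiers into the research file is to replace each identifier with a keyed hash before linkage. The sketch below is an assumption about how that could look, not a method described in the presentation; the identifiers, the secret, and the record fields are hypothetical, and keyed hashing alone may not satisfy de-identification requirements.

```python
import hashlib
import hmac
from collections import defaultdict


def linkage_key(patient_identifier, secret):
    """Return a pseudonymous key: HMAC-SHA256 of the normalized identifier."""
    normalized = patient_identifier.strip().upper()
    return hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(ed_visits, outcomes, secret):
    """Group ED visits and later outcomes under the same pseudonymous key.

    Each input is a list of (patient_identifier, payload) pairs. The
    returned dict contains only hashed keys and payloads, not identifiers.
    """
    linked = defaultdict(lambda: {"ed_visits": [], "outcomes": []})
    for identifier, visit in ed_visits:
        linked[linkage_key(identifier, secret)]["ed_visits"].append(visit)
    for identifier, outcome in outcomes:
        linked[linkage_key(identifier, secret)]["outcomes"].append(outcome)
    return dict(linked)


# Hypothetical secret held by the data custodian, and hypothetical records.
secret = b"hypothetical-secret-held-by-data-custodian"
ed_visits = [
    ("mrn-001", {"year": 2005, "reason": "chest pain"}),
    ("mrn-002", {"year": 2005, "reason": "fall"}),
]
outcomes = [("MRN-001 ", {"year": 2006, "event": "hospital admission"})]

linked = link_records(ed_visits, outcomes, secret)
print(len(linked), "linked patients")
```

The later outcome attaches to the first patient even though its identifier was formatted differently, because identifiers are normalized before hashing.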
59Research Recommendation 2
- For data collected electronically, provide timely
access to the data. For data not collected
electronically, provide access to the data no
later than 1.5 years after data collection.
60Research Recommendation 3
- Improve completeness of data collection for
clinically relevant and/or historical data
elements, such as external causes of injury
codes.
61Research Recommendation 4
- Provide easy access to data that can be parsed
into smaller jurisdictions (such as states) for
policy and/or research purposes, while
maintaining confidentiality.