Title: Introduction to DDI 3.0
1Introduction to DDI 3.0
- Sanda Ionescu
- ICPSR
- CESSDA Expert Seminar, September 2007
2DDI Version 3.0
- Radically different.
- More complex
- (but certainly doable!)
- Brings important benefits.
3Workshop Schedule
- 14:30-15:10 Overview (40 min)
- 15:10-15:35 Structure and Technical Mechanisms (25 min)
- 15:35-15:45 Break (10 min)
- 15:45-16:10 Study Unit Modules Content (25 min)
- 16:10-16:30 Variable Markup Example (20 min)
- 16:30-16:40 Break (10 min)
- 16:40-17:10 Grouping Modules Content and Examples (30 min)
- 17:10-17:30 Getting Started (20 min)
4DDI 3.0
5DDI Background: Development History
- 1995: A grant-funded project initiated and organized by ICPSR proposes to create a new standard for documenting social science data, to replace OSIRIS tagged codebooks.
- First drafts used SGML, then converted to Web-friendly XML.
- 2000: DDI Version 1.0 published as a mainly document- and codebook-centric standard.
6DDI Background: Development History
- 2003: DDI Version 2.0 published with extended scope
- Aggregate data coverage (based on matrix structure)
- Additional geographic representation to assist geographic search systems and GIS users
- Versions 1.0 through 2.1 (latest published) are backwards compatible, and based on the same structure.
7DDI Background: Development History
- February 2003: Formation of the DDI Alliance, a self-sustaining membership organization whose members have a voice in the development of the DDI specification.
- http://www.ddialliance.org/
8DDI Background: Development History
- Version 3.0
- 2004-2006: Planning and Development
- November 2006: Internal Review
- February 2007: Public Review
- July 2007: Candidate Draft Release
- http://www.ddialliance.org/ddi3/index.html
9Benefits of using DDI as an XML-based standard
- Interoperability
- Enables seamless exchange and reuse by other systems.
- Repurposing
- Provides a core document from which different types of outputs can be generated.
- Value-added documentation
- Tagging carries intelligence in the document by describing content.
- Enhanced Data Discovery
- Increases precision and granularity of searches.
- Support for Data Analysis
- Variable descriptions are accepted as input by online analysis systems.
- Multiple presentation formats
- ASCII text, PDF, HTML, RTF.
- Preservation-friendly
- Non-proprietary format.
10Why DDI 3.0?
- DDI 3.0 presents new features in response to:
- Perceived needs of
- - Data users
- - Data producers
- - Data archivists/librarians
- Developments in documenting and archiving data
- Advances in XML technology
11DDI 3.0 and the Data Life Cycle Model
- DDI Versions 1/2 were codebook-centric
- Closely followed the structure of traditional print codebooks.
- Captured data documentation at a single, frozen point in time: archiving.
12DDI 3.0 and the Data Life Cycle Model
- Version 3.0 is Life Cycle oriented
- - Designed to cover all stages in the life cycle of a data collection: pre-production, production, post-production, secondary use
13Life Cycle Coverage in DDI 3.0
- Planning for the Study: Proposal / Design
- Study Purpose / Outline
- Concepts
- Study Population
- Author(s)
- Funding Sources
- Version 3.1: Survey / Sample Design, Pre-testing
14Life Cycle Coverage in DDI 3.0
- Data Collection: methodology, sampling, time, etc.
- Instrument characteristics
- Questionnaire
- Data cleaning, weighting, coding, etc.
15Life Cycle Coverage in DDI 3.0
- Physical representation: Data format, Record structure, Statistics.
- Intellectual content: Variables, Categories, Codes.
16Life Cycle Coverage in DDI 3.0
- Archiving / (Re)Distributing the data collection
- Processing checks
- Holdings, availability and access conditions
17Life Cycle Coverage in DDI 3.0
- DDI becomes visible to the outside world
- DDI Instance
- Pulls together all life cycle stages
- Acquires its own identity as an object
- Becomes a tool for data discovery and analysis
18Life Cycle Coverage in DDI 3.0
- Secondary use of data: new conceptual framework
- New DDI Instance
- New Purpose
- New Logical Product
- New Physical Description of Data
19DDI 3.0 and the Data Life Cycle Model
- Advantages of Life Cycle orientation
- Allows capture and preservation of metadata generated by different agents at different points in time.
- Facilitates tracking changes and updates in both data and documentation.
20DDI 3.0 and the Data Life Cycle Model
- Advantages of Life Cycle orientation
- Enables investigators, data collectors and producers to document their work directly in DDI, thus increasing the metadata's visibility and usability.
- Benefits data users, who need information from the full data life cycle for optimal discovery, evaluation, interpretation, and re-use of data resources.
21New / Extended Functionalities in DDI 3.0: Questionnaire
- Versions 1/2
- No instrument coverage.
- Question text only as part of variable description.
- No documentation for question flow / conditions.
- Version 3.0
- Full description of instrument as a separate entity.
- Documents specific use of questions: flow, conditions, loops.
- Compatible with Computer Assisted Interviewing software.
22New / Extended Functionalities in DDI 3.0: Complex Data
- Versions 1/2
- Inadequate representation of complex / hierarchical data
- Version 3.0
- Detailed documentation for complex / hierarchical data
- Logical structure of records
- Record Types and Relationships
- Relevant variables: key-link, case identification, record type locator
- Physical layout of records
- Single hierarchical file for all records, multiple rectangular files, relational database, etc.
23New / Extended Functionalities in DDI 3.0: Aggregate Data
- Versions 1/2
- Initially designed for microdata only
- Aggregate data section added in V 2.1 to support limited representation (Census-type data, delimited files)
- Version 3.0
- Adds support for tabular, spreadsheet-type representation of aggregate data
- Aggregate data transport option: cell content may be included inline with the data item description
24New / Extended Functionalities in DDI 3.0: Data Transport
- Versions 1/2
- - None
- Version 3.0
- - In-line inclusion enabled for both aggregate data and microdata
25New / Extended Functionalities in DDI 3.0: Longitudinal / Time Series / Cross-national Data: Comparability
- Versions 1/2
- - None
- Version 3.0
- - Grouping structure documents studies related on one or several dimensions (time, geography, language, etc.) as well as their comparability
26New / Extended Functionalities in DDI 3.0: Increased Multilingual Support
- Versions 1/2
- Limited
- <anytag xml:lang="">
- Version 3.0
- Support for multiple language use and translations
- <InternationalStringType xml:lang="" translated="" translatable="">
- <Variable>
- <Label xml:lang="ger" translated="false" translatable="true">Geburtsjahr</Label>
- <Label xml:lang="eng" translated="true">Year of Birth</Label>
- </Variable>
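A consumer of multilingual labels might pick the string for a requested language and fall back to the untranslated original. The following is only a sketch of that idea: the dictionary layout and the `label_for` helper are illustrative assumptions, not part of DDI or of any DDI tool.

```python
from typing import Optional

# Each entry mirrors one Label from the slide; the dict layout is an assumption.
labels = [
    {"lang": "ger", "translated": False, "text": "Geburtsjahr"},
    {"lang": "eng", "translated": True, "text": "Year of Birth"},
]


def label_for(lang: str, entries: list[dict]) -> Optional[str]:
    """Return the label in the requested language, else the untranslated original."""
    for entry in entries:
        if entry["lang"] == lang:
            return entry["text"]
    for entry in entries:
        if not entry["translated"]:
            return entry["text"]
    return None


print(label_for("eng", labels))
print(label_for("fre", labels))  # no French label, so the original is returned
```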
27DDI 3.0 Specification Schema-based
- Versions 1/2
- DTD-based
- Version 3.0
- Schema-based
- Data typing supports machine actionability
- Use of namespaces supports
- Modularity
- Extensibility and reuse
- Alignment with / use of other standards
28DDI 3.0 Specification: Machine-actionable
- Versions 1/2
- Machine-readable
- Version 3.0
- Machine-actionable
- 1. Data typing: increased use of controlled vocabularies and standard codes
- 2. Larger set of required elements
- Predictable content: a more consistent base for programming
29DDI 3.0 Modular Structure
- Version 1/2
- Single file, hierarchical design
- Version 3.0
- Modular design
- Facilitates reuse
- Facilitates versioning and maintenance
- Supports life cycle model
- Allows flexibility in organizing the DDI Instance
- Supports grouping and comparing studies
- Supports creation of metadata registries
30DDI 3.0 Alignment with other metadata standards
- Versions 1/2
- MARC, Dublin Core (bibliographic standards)
- Version 3.0
- MARC, DC, but also
- SDMX (Statistical Data and Metadata Exchange)
- ISO 11179 (Metadata Registries)
- FGDC (Digital Geospatial Metadata)
- ISO 19115 (Geographic Information Metadata)
31DDI 1/2 or DDI 3.0?
- DDI 3.0 will not supersede DDI 2.1.
- Both versions will
- coexist
- continue to be maintained
- be used according to specific needs.
- Not all DDI 1/2 markup will have to be migrated to Version 3.0.
32DDI 3.0
33DDI 3.0 Modular Structure
- Building blocks of DDI 3.0
- Modules
- Schemes
34DDI 3.0 Modular Structure
- Modules
- Document different aspects of a study, or group of studies, following the data through their life cycle (Conceptual Components, Data Collection, Logical Product, Physical Instance, etc.)
- Schemes
- Include collections of sibling objects that are traditionally components of a variable description: Concepts, Universes, Questions, Variable Labels and Names, Categories, Codes.
35DDI 3.0 Modular Structure
- Modules
- Can live independently (have their own schemas) or connected to one another within a hierarchical structure.
- Schemes
- Can live semi-independently (need a higher-level wrapper as they do not have their own schemas) or in-line within a Study Unit or Group module.
36DDI 3.0 Modular Structure
- DDI 3.0 model: a multi-branched hierarchy
- Module level
DDI Instance
Resource Package
Group
Study Unit
Subgroup
Study Unit
Conceptual Components
Data Collection
Archive
(Sub)group
Study Unit
Organizations
Study Unit
Subgroup
37DDI 3.0 Modular Structure
- DDI 3.0 model: a multi-branched hierarchy
- Within modules
Data Collection
Question Scheme
Processing
Methodology
Sampling
Time Method
Question Item
Question Item
Weighting
Coding
38DDI 3.0 Modular Structure
- Relationships are established through
- In-line inclusion
- (Relational order is explicit)
- Referencing: Internal or External
- (Relational order is implicit)
39DDI 3.0 Structural mechanisms
- Enable modular design and help actualize its benefits.
- Inheritance
- Referencing
- Identification
40DDI 3.0 Inheritance
- Inheritance is based on the hierarchical structure of the model.
- In DDI 3.0 a number of elements are reused at different levels of the hierarchy.
- When the same element is present at multiple levels, lower levels inherit content from the upper levels, and only need to specify differences (local overrides).
41DDI 3.0 Inheritance: Example
- Instance: Coverage, Spatial: 50 US states
- - Study Unit A: no Spatial Coverage defined; will be inherited from Instance
- - Study Unit B: Coverage, Spatial: 48 coterminous states; supersedes definition in Instance
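The inheritance rule in this example can be sketched in a few lines of Python. This is an illustrative model only: the `Node` class and its `resolve` method are hypothetical names, not DDI schema types or an existing library.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One level of the hierarchy (hypothetical model, not a DDI schema type)."""
    name: str
    parent: Optional["Node"] = None
    content: dict = field(default_factory=dict)  # elements specified locally

    def resolve(self, element: str) -> Optional[str]:
        """Return the local value if present, else walk up the parent chain."""
        node: Optional[Node] = self
        while node is not None:
            if element in node.content:
                return node.content[element]
            node = node.parent
        return None


instance = Node("Instance", content={"SpatialCoverage": "50 US states"})
study_a = Node("Study Unit A", parent=instance)  # nothing defined locally
study_b = Node("Study Unit B", parent=instance,
               content={"SpatialCoverage": "48 coterminous states"})  # local override

print(study_a.resolve("SpatialCoverage"))  # inherited from the Instance
print(study_b.resolve("SpatialCoverage"))  # local definition supersedes it
```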
42DDI 3.0 Referencing
- DDI 3.0 modular structure is dependent upon creating relationships by reference.
- Referencing implies bringing up the content of a DDI object within, or in association with, another object, by specifying its Unique Identifier.
- Identifiers are the key links between DDI objects.
43DDI 3.0 Referencing: Example
- Data Collection Module
- Question Scheme, Question ID: Q1
- Text: How many days in the past week did you watch the national network news on TV?
- Conceptual Components Module
- Concept Scheme, Concept ID: C1
- Description: Exposure to national TV news
- Logical Product Module
- Variable Scheme, Variable ID: V1
- Name: V043014
- Label: Days past week watch natl news on TV
- Question Reference ID: Q1
- Concept Reference ID: C1
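To make the mechanics concrete, here is a minimal sketch of resolving those references by ID. The three dictionaries are hypothetical stand-ins for the modules on the slide, and `describe` is an invented helper; real DDI references also carry agency and version information, which is left out here.

```python
# Hypothetical in-memory stand-ins for three schemes; keys are the slide's IDs.
question_scheme = {
    "Q1": "How many days in the past week did you watch "
          "the national network news on TV?",
}
concept_scheme = {"C1": "Exposure to national TV news"}
variable_scheme = {
    "V1": {
        "name": "V043014",
        "label": "Days past week watch natl news on TV",
        "question_ref": "Q1",  # a reference by ID, not a copy of the text
        "concept_ref": "C1",
    },
}


def describe(variable_id: str) -> dict:
    """Follow the variable's references and return a fully resolved description."""
    var = variable_scheme[variable_id]
    return {
        "name": var["name"],
        "label": var["label"],
        "question": question_scheme[var["question_ref"]],
        "concept": concept_scheme[var["concept_ref"]],
    }


resolved = describe("V1")
print(resolved["concept"])
```

Because the variable stores only IDs, the question text and concept description live in one place and can be shared by other variables.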
44DDI 3.0 Referencing: Example
45DDI 3.0 Identification
- Consistency in building and using identifiers is needed for
- Proper functioning of reference systems, enabling a smooth exchange and reuse of existing metadata.
- Machine-actionability of DDI instances, allowing them to serve as a basis for running programs and processes.
46DDI 3.0 Identification
- Element types used in the Identification system
47DDI 3.0 Identification: Element Types
- Non-identified elements
- Require context, which is provided by containing parents.
- Example: codes within code schemes
- Are not reusable.
- Example: variable and category statistics
48DDI 3.0 Identification: Element Types
- Identifiables
- Carry their own ID
- May be referenced / reused
- Cannot be versioned or maintained, except as part of a complex parent element
- (Example: Variable; a change implies a new version of the entire scheme).
49DDI 3.0 Identification: Element Types
- Versionables
- Carry their own ID
- Carry their own Version: content changes are important to note
- (Example: Concept; may be independently versioned within a scheme).
50DDI 3.0 Identification: Element Types
- Maintainables
- Are higher level DDI objects
- Are both identifiable and versionable
- Can also be published and maintained as separate entities
- (Example: all modules, schemes, comparison maps)
51DDI 3.0 Identification Structure
- Maintainable elements
- URN and / or ID, Identifying Agency
- Versioning Information
- - Version
- - Version Date
- - Version Responsibility
- - Version Rationale
- Versionable elements
- URN and / or ID, Versioning Information
- Identifiable elements
- URN and / or ID
52DDI 3.0 Identification Structure
Non-specified Identification information is inherited from the levels above.
- Example 1
- Inheritance is assumed.
- Maintainable: Variable Scheme
- ID: VarScheme_A
- Identifying Agency: ICPSR
- Version: 1.0
- Identifiable: Variable
- ID: Var_1
- Identifying Agency: (not specified)
- Version: (not specified)
53DDI 3.0 Identification Structure
Non-specified Identification information is inherited from the levels above.
- Example 2
- Inheritance is applied by default
- Maintainable: Logical Product
- ID: LogicalProd_Y
- Identifying Agency: ICPSR
- Version: 1.0
- Maintainable: Variable Scheme
- ID: VarScheme_A
- Identifying Agency: (not specified)
- Version: (not specified)
- Example 1
- Inheritance is assumed
- Maintainable: Variable Scheme
- ID: VarScheme_A
- Identifying Agency: ICPSR
- Version: 1.0
- Identifiable: Variable
- ID: V1
- Identifying Agency: (not specified)
- Version: (not specified)
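A processor could fill in the unspecified fields from the level above, roughly as follows. This is a sketch under assumptions: the plain dictionaries and the `effective_identification` helper are hypothetical, and the chaining of Example 2 into Example 1 is done here only for illustration.

```python
from typing import Optional


def effective_identification(local: dict, parent: Optional[dict]) -> dict:
    """Fill in Identifying Agency and Version from the parent when not given locally."""
    result = dict(local)
    for key in ("agency", "version"):
        if result.get(key) is None and parent is not None:
            result[key] = parent.get(key)
    return result


logical_product = {"id": "LogicalProd_Y", "agency": "ICPSR", "version": "1.0"}
variable_scheme = effective_identification(
    {"id": "VarScheme_A", "agency": None, "version": None}, logical_product)
variable = effective_identification(
    {"id": "V1", "agency": None, "version": None}, variable_scheme)

print(variable)
```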
54DDI 3.0 Identification Structure: IDs
- Uniqueness of Identifiers is necessary for both internal and external referencing
- 1) All IDs MUST be unique within a maintainable
- 2) All maintainables MUST have unique IDs across an Agency
55DDI 3.0 Identification Structure: Creating unique Identifiers
- A DDI Instance may include multiple maintainables at different hierarchical levels
- Instance (maintainable): unique ID within Identifying Agency
- Study Unit (maintainable): unique ID within Identifying Agency
- Logical Product (maintainable): unique ID within Identifying Agency
- Variable Scheme (maintainable): unique ID within Identifying Agency
56DDI 3.0 Identification Structure: Creating Unique Identifiers
Markup
- Instance_A (unique at ICPSR)
- StudyUnit_1
- LogicalProduct_1
- VariableScheme_1
- Variable_1
- Instance_B (unique at ICPSR)
- StudyUnit_1
- LogicalProduct_1
- VariableScheme_1
- Variable_1
Post-markup Variable ID
- Instance_AStudyUnit_1LogicalProduct_1VariableScheme_1Variable_1
- Instance_BStudyUnit_1LogicalProduct_1VariableScheme_1Variable_1
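The post-markup step amounts to concatenating the IDs along the containment path, so that identical local IDs in different instances end up distinct. A minimal sketch follows; the `qualified_id` helper is hypothetical, and the "." separator is an assumption added for readability, since the slide text shows none.

```python
def qualified_id(path: list[str], sep: str = ".") -> str:
    """Join the IDs of the containing maintainables and the object itself.

    The separator is an assumption; the slide does not show one.
    """
    return sep.join(path)


# The same local IDs are reused inside both instances.
chain = ["StudyUnit_1", "LogicalProduct_1", "VariableScheme_1", "Variable_1"]
id_a = qualified_id(["Instance_A", *chain])
id_b = qualified_id(["Instance_B", *chain])

print(id_a)
print(id_b)
```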
57DDI 3.0 Identification Structure: URNs
- Have a fixed structure and MUST include object ID, Identifying Agency, and Version.
- For versionable and identifiable elements, the containing maintainable is specified.
- Take precedence when both a URN and the Identification sequence are used for the same object.
- May be constructed post-markup from the Identification sequence.
58DDI 3.0 Identification: URN Structure
Components labeled on the slide: Object name, Identifying Agency, Object ID, Object Version
- Examples
- Maintainables
- urnddi3.0StudyUnitddialliance.org StudyUnit_ID1.0
- Versionables
- urnddi3.0ConceptSchemeddialliance.orgConceptScheme_ID1.0 ConceptConcept_ID 2.1
- Identifiables
- urnddi3.0VariableSchemeddialliance.orgVariableScheme_ID1.0 VariableVariable_ID
59DDI 3.0 Referencing
- Reference structure
- URN, and/or
- Referenced object's ID, Identifying Agency, Version
- - Containing Module ID
- - Containing Scheme ID
60DDI 3.0 Reuse of Information
- Mechanisms for REUSE
- Referencing
- Inheritance
- Reuse of Information
- Facilitates development of documentation throughout the study life cycle
- Promotes interoperability and standardization across organizations
- Saves markup time and effort
- Reduces the risk of human entry error
- Provides a basic level of implicit comparability
61DDI 3.0 Modules
62DDI Version 3.0 Modules-- Structural Overview --
DDI Instance
Study Unit
Group
Resource Package
Concepts
Study Unit
Subgroup
Study Unit
Sub(Group)
Data Coll.
Logical Pr.
etc
63Other specialized DDI 3.0 modules
- Aggregate Data
- NCube Logical Product
- Inline NCube Record Layout
- NCube Record Layout
- Tabular NCube Record Layout
- Inline Microdata
- Dataset
- User-specific Markup Templates
- DDI Profile
64DDI Version 3.0 Modules-- Structural Overview --
DDI Instance
Study Unit
Group
Conceptual Component
Conceptual Component
Data Collection
Data Collection
Logical Product
Logical Product
Physical Data Product
Archive
Physical Instance
Comparative
Archive
Study Unit
Group
Organizations
65DDI 3.0
- Modules used to mark up a simple study
66DDI 3.0 modules for documenting a single,
survey-type study
- Instance
- Study Unit
- Conceptual Component
- Data Collection
- Logical product
- Physical Data Product
- Physical Instance
- Archive
- Organizations
69DDI Instance -- wrapper for all modules --
- Identification
- URN
- Identification Sequence
- Name
- Citation (optional DC Elements)
- Coverage
- Topical
- Spatial
- Temporal
- Group (module) - repeatable
- Resource Package (module) - repeatable
- Study Unit (module) - repeatable
- Other Material(s)
- Note(s)
- Translation Information
70Coverage in DDI 3.0
- Study: American National Election Study (ANES), 2004
- Topical Coverage
- Subject
- Historical and Contemporary Electoral Processes
- Keyword
- Electoral campaigns
- Political attitudes
- Political participation
- Spatial Coverage
- Description: United States
- Top level: nation
- Lowest level: congressional district
- Temporal Coverage
- Date: 2004
72Study Unit -- documents a single study --
- Identification, Other Material(s), Note(s)
- Citation
- Abstract
- Universe Reference
- Funding Information
- Purpose
- Coverage
- Analysis Unit
- Embargo
- Conceptual Component (module)
- Data Collection (module)
- Logical Product (module)
- Physical Data Product (module)
- Physical Instance (module)
- Archive (module)
- Organizations (module)
74Conceptual Component -- lists concepts and universes --
- Identification, Other Material(s), Notes
- Coverage
- Concept Scheme or Reference to External Scheme
- Vocabulary: describes vocabulary used
- Concept
- Label
- Description
- Similar Concept
- Difference
- Concept Group
- Concept Reference (nestable)
- Universe Scheme or Reference to External Scheme
- Universe
- Human Readable
- Machine Readable
- Subuniverse
- Subuniverse
76Data Collection
- Identification, Other Material(s), Note(s)
- Coverage
- Methodology
- Time Method
- Sampling
- Collection Event
- Data Collector
- Data Source
- Collection Date(s)
- Mode of data collection
- Question Scheme: lists actual questions
- Instrument: documents question flow, conditions
- Processing Event
- Control and cleaning operations
- Weighting
- Data Appraisal Information
- Coding
78Logical Product -- documents intellectual content of data --
- Identification, Other Material(s), Note(s)
- Coverage
- Category Scheme or Reference to external category scheme
- Category
- Label
- Derivation (if applicable)
- Definition
- Code Scheme or Reference to external code scheme
- Category Scheme Reference
- Hierarchy Type
- Level (in the hierarchy)
- Code
- Category Reference
- Value
- Code (nestable)
- Variable Scheme or Reference to external variable scheme
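The split between categories (labels) and codes (values pointing at categories) can be illustrated with a small lookup. This is a sketch under assumptions: the IDs `CAT_DK` and `CAT_RF`, the expanded labels, the `missing` flag layout, and the `describe_value` helper are all invented for the example.

```python
# Hypothetical Category Scheme: labels and a missing-data flag.
categories = {
    "CAT_DK": {"label": "Don't know", "missing": True},
    "CAT_RF": {"label": "Refused", "missing": True},
}

# Hypothetical Code Scheme: each value carries only a Category Reference.
codes = {
    "8": "CAT_DK",
    "9": "CAT_RF",
}


def describe_value(value: str) -> tuple[str, bool]:
    """Return (label, is_missing) for a raw data value."""
    category_id = codes.get(value)
    if category_id is None:
        return value, False  # uncoded value, e.g. a count of days
    category = categories[category_id]
    return category["label"], category["missing"]


print(describe_value("8"))
print(describe_value("3"))
```

Keeping labels in the category scheme means several code schemes can reuse the same categories with different values.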
79Logical Product: Variable Scheme: Variable
- Variable or Reference to an externally documented variable
- Identification
- Name
- Label
- Definition
- Universe Reference
- Concept Reference
- Question Reference
- Embargo Reference
- Response Unit
- Analysis Unit
- Representation
- Imputation
- Derivation
- Coding Instructions
- Value Representation
- Text
80Logical Product: Variable Scheme: Variable Group
- Variable Group
- Type
- Label
- Definition
- Universe Reference
- Concept Reference
- Variable Reference (lists variables in the group)
- Variable Group Reference (allows nesting of groups)
- Variable Group Reference (use for externally documented Variable Group)
82Physical Data Product -- Describes Physical Layout of Data --
- Identification, Other Material(s), Note(s)
- Logical Product Reference
- Gross Record Structure
- Records Per Case
- Variable Quantity
- Logical Record Reference
- Physical Record Reference
- Related Logical Records
- Record Layout
- Data Item
- Variable Reference
- Physical Location
- Value Location
- StartPosition
- Width
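A record layout expressed as start position and width is enough to cut a fixed-width record into values. The sketch below assumes 1-based start positions (the slide does not state the convention); `DataItem`, `read_record`, and the sample record are hypothetical.

```python
from typing import NamedTuple


class DataItem(NamedTuple):
    variable_ref: str
    start_position: int  # assumed to be 1-based
    width: int


def read_record(line: str, layout: list[DataItem]) -> dict[str, str]:
    """Cut one fixed-width record into values keyed by variable reference."""
    record = {}
    for item in layout:
        begin = item.start_position - 1  # convert to 0-based slicing
        record[item.variable_ref] = line[begin:begin + item.width]
    return record


layout = [DataItem("V1", 1, 4), DataItem("V2", 5, 1), DataItem("V3", 6, 2)]
row = read_record("0042307", layout)
print(row)
```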
84Physical Instance -- Documents a specific data file --
- Identification, Other Material(s), Note(s)
- Citation
- Coverage
- Physical Data Product Reference
- Data File Identification
- Location
- URI
- Gross File Structure
- Creation Software
- Case Quantity
- Overall Record Count
- Statistics
- Logical Product Reference
- Variable Statistics
- Variable Reference
- Total Responses
- Summary Statistics
- Category Statistics
- Value
86Archive
- Identification, Other Material(s), Note(s)
- Archive Specific
- Item
- Location
- Call Number
- URI
- Format
- Media
- Availability Status
- Access
- Confidentiality Statement
- Access Permission
- Restrictions
- Citation Requirement
- Deposit Requirement
- Access Conditions
- Disclaimer
- Contact
- Funding Information
88Organizations
- Identification
- Organization
- URL
- Individual
- Individual
- Organization
- Title
- Language
- Role
- Entity Reference
- Organization Reference
- Individual Reference
- Description
- Period
- Relation
- Organization Reference
- Individual Reference
- Description
- Period
- Name
- Description
- Location
- Telephone
- E-mail
- Relation
89DDI 3.0 Markup Example
90Version 2.1 vs. Version 3.0 Example: A survey variable
91Version 2.1 vs. Version 3.0 Example: A survey variable in Version 2.1
Data Description: Variable
92Version 2.1 vs. Version 3.0 Example: A survey variable in Version 2.1
name="V043015"
93Version 2.1 vs. Version 3.0 Example: A survey variable in Version 3.0
- Logical Product: Variable Scheme
- Conceptual Component: Concept Scheme, Universe Scheme
- Data Collection: Question Scheme
- Logical Product: Code Scheme
- Logical Product: Category Scheme
- Physical Instance: Statistics
94Version 2.1 vs. Version 3.0 Example: A survey variable in Version 3.0
- Logical Product: Category Scheme ID, Category ID
- Conceptual Component: Concept Scheme, Concept ID; Universe Scheme, (Sub)Universe ID
- Logical Product: Variable Scheme ID, Variable ID
- Logical Product: Code Scheme ID, Code
- Data Collection: Question Scheme ID, Question ID
- Physical Instance: Statistics, Variable Statistic, Category Statistics
95DDI 3.0 Markup: A Survey Variable: Concept
- Concept: Attention to Presidential Campaign on National TV
Conceptual Component: Concept Scheme: Concept
96DDI 3.0 Markup: A Survey Variable: Concept
97DDI 3.0 Markup: A Survey Variable: Universe
Conceptual Component: Universe Scheme: (Sub)Universe
(A7: How many days in the PAST WEEK did you watch the NATIONAL network news on TV? 0-7, 8 DK, 9 RF)
98DDI 3.0 Markup: A Survey Variable: Universe
99DDI 3.0 Markup: A Survey Variable: Question ID, Question Text
Data Collection: Question Scheme: Question Item
100DDI 3.0 Markup: A Survey Variable: Question ID, Question Text
101DDI 3.0 Markup: A Survey Variable: Variable name, label, type of physical representation
Logical Product: Variable Scheme: Variable
102DDI 3.0 Markup: A Survey Variable: Variable name, label, type of physical representation
- Other types of Representation
103DDI 3.0 Markup: A Survey Variable: Category labels, missing data information
Logical Product: Category Scheme: Category
104DDI 3.0 Markup: A Survey Variable: Category labels, missing data information
missing="true"
105DDI 3.0 Markup: A Survey Variable: Category Values
Logical Product: Code Scheme: Code
106DDI 3.0 Markup: A Survey Variable: Category Values
107DDI 3.0 Markup: A Survey Variable: Statistics
Physical Instance: Statistics: Variable Statistics: Category Statistic
108DDI 3.0 Markup: A Survey Variable: Statistics
109DDI 3.0 Markup: A Survey Variable: Logical Product Module
110DDI 3.0 Markup: Modules used in a full variable description
- Concept
- Universe
- Question
- Values
- Value Labels
- Variable name
- Variable label
- Statistics
- Location
- Physical Data Product
111DDI 3.0 Modular Approach: Advantages
- Modules and schemes can be independently maintained.
- Pieces of information can be reused without being repeated.
112DDI 3.0 Modular Approach: Reusing information
113Variable Markup in Version 2 -- carries redundant information --
114Variable Markup in Version 3.0: Modular Approach, Reusing Information
115 DDI 3.0
116DDI 3.0 Groups
- Entirely new feature in DDI 3.0.
- Designed to document and compare related studies.
119Group-- documents families of studies --
- Identification, Other Material(s), Note(s)
- Citation
- Abstract
- Universe
- Funding Information
- Purpose
- Coverage
- Universe Reference
- Conceptual Component (module)
- Data Collection (module)
- Logical Product (module)
- Archive (module)
- Organizations (module)
- Study Unit (module)
- Group (module)
- Comparative (module)
120DDI 3.0 Grouping Attributes
- A set of mandatory attributes indicates the nature of the relationships among group members
- Group parameters
- Time
- Instrument
- Panel (population of respondents)
- Geography
- Datasets
- Language
121DDI 3.0 Grouping Attributes Example
122DDI 3.0 Types of Groups
- Groups of studies may be
- Formal (by design)
- Designed to be compared (longitudinal, time-series, or cross-national studies)
- Documented and compared through use of Inheritance
- Informal (ad-hoc)
- Decision to group and compare is taken post-production, or after the fact.
- Comparability documented in the Comparative module
123Formal Groups: Inheritance
- Example 1: Time-series. Same questions repeated over time, same resulting variables.
- Group (Studies A-C): Temporal Coverage (G_1) 1991-1993; Data Collection: Question Scheme; Logical Product: Variable Scheme
- Study A: Temporal Coverage 1991 (Replace Ref G_1); Physical Data Product; Physical Instance: Statistics
- Study B: Temporal Coverage 1992 (Replace Ref G_1); Physical Data Product; Physical Instance: Statistics
- Study C: Temporal Coverage 1993 (Replace Ref G_1); Physical Data Product; Physical Instance
124Formal Groups: Inheritance. Attributes: Add, Replace, Delete.
- In a complex grouping structure inheritance paths may become quite intricate.
- ID attributes ADD, REPLACE and DELETE are introduced to resolve potential inheritance ambiguities
- ADD (empty) -> flags element as a new addition.
- REPLACE (ReferenceType) -> referenced element is being replaced at the lower level (local override).
- DELETE (ReferenceType) -> referenced element is being deleted at the lower level.
125Formal Groups: Inheritance
- Example 2: Time-series. Same core questions repeated over time, different topical modules added to each iteration.
- Group (Studies A-C): Data Collection: Core Questions (Q1-Q50); Logical Product: Core Variables (V1-V50)
- Study A: Topical Module: Health Status; Data Collection: ADD Questions (Q51A-Q80A); Logical Product: ADD Variables (V51A-V80A)
- Study B: Topical Module: Gun Control; Data Collection: ADD Questions (Q51B-Q80B); Logical Product: ADD Variables (V51B-V80B)
- etc.
126Formal Groups: Inheritance
- Example 3: Any group by design; some questions are not asked in some iterations.
- Group (Studies A-E): Data Collection: All Questions (Q1-Q100); Logical Product: All Variables (V1-V100)
- Group (Studies C-E): Data Collection: DELETE Questions Q60-Q69; Logical Product: DELETE Variables V60-V69
- Study B: Data Collection: DELETE Question Q55; Logical Product: DELETE Variable V55
- Study A
- Study C
- Study D
- Study E
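The effect of ADD and DELETE down a group hierarchy can be sketched with set operations. This is an illustration only: it assumes Studies A and B sit directly under the top group and Studies C-E under the subgroup, and the dictionary layout and helper names are hypothetical rather than DDI structures.

```python
def q_range(first: int, last: int) -> set[str]:
    """Helper: question IDs Q<first>..Q<last>, inclusive."""
    return {f"Q{n}" for n in range(first, last + 1)}


def effective_questions(levels: list[dict]) -> set[str]:
    """Apply ADD and DELETE sets from the top of the hierarchy downward."""
    current: set[str] = set()
    for level in levels:
        current |= level.get("add", set())
        current -= level.get("delete", set())
    return current


top_group = {"add": q_range(1, 100)}        # Group (Studies A-E)
subgroup_ce = {"delete": q_range(60, 69)}   # Group (Studies C-E)
study_b = {"delete": {"Q55"}}               # Study B, under the top group

questions_a = effective_questions([top_group])
questions_b = effective_questions([top_group, study_b])
questions_c = effective_questions([top_group, subgroup_ce])

print(len(questions_a), len(questions_b), len(questions_c))
```

Each study lists only its differences; the full question set is recovered by walking the path from the top group to the study.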
127Formal Groups: Inheritance
- Example 4 (SOEP, Germany): Longitudinal. Same variables, with different name each year.
(No name)
ADD Name only
128Formal Groups: Inheritance
- Example 5 (SOEP, Germany): Longitudinal. In 2002 variable Income changes currency from DM to Euro: change in question wording.
(No question)
ADD question only
129Formal Groups: Inheritance
- Example 5 (SOEP, Germany), continued: These variables also change names every year
130Formal Groups: Inheritance
- Example 5 (SOEP, Germany), the final picture: information is inherited down the hierarchy.
131Inheritance in Formal Groups
- Simplification of DDI Instances: common metadata is only entered once.
- More efficient means of documentation: for new additions, only differences need to be specified.
- Relational information embedded in the inheritance structure: comparison becomes machine-actionable.
133Comparative -- documents comparability in ad-hoc groups --
- Identification, Note(s)
- Comparison Description (human-readable)
- Concept Map
- Source Scheme Reference
- Target Scheme Reference
- Item Map
- Source Item
- Target Item
- Map Type
- Difference
- Variable Map
- Question Map
- Category Map
- Code Map
- Universe Map
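A comparison map pairs items from a source scheme with items from a target scheme and records how they differ. The sketch below models one such map in Python; the `ItemMap` class, the item identifiers, the map type strings, and the difference text are all placeholders invented for illustration, not values taken from the DDI schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemMap:
    source_item: str
    target_item: str
    map_type: str          # placeholder vocabulary, not from the DDI schema
    difference: str = ""


# Hypothetical Variable Map between two ad-hoc grouped studies.
variable_map = [
    ItemMap("StudyA:V10", "StudyB:AGE", "identical"),
    ItemMap("StudyA:V11", "StudyB:INCOME", "similar",
            difference="Currency differs between the two studies"),
]


def targets_for(source_item: str, maps: list[ItemMap]) -> list[ItemMap]:
    """Return every mapping recorded for one source item."""
    return [m for m in maps if m.source_item == source_item]


match = targets_for("StudyA:V11", variable_map)
print(match[0].target_item, "-", match[0].difference)
```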
134DDI 3.0: Using the Comparative Module
- Instructions on how to use the Comparative Module and build comparison maps
- DDI 3.0 User Guide, pp. 45-49.
- http://www.ddialliance.org/DDI/ddi3
135Producing DDI 3.0 markup
136DDI 3.0 Tools projects
- DDI Toolkit
- Core library for developing open source tools
- Version 1/2 <-> Version 3.0 converters
- DDI 3.0 URN resolution tool
- DDI 3.0 validation tool
- Version 3.0 stylesheets with display and editing layers
- Grouping tool
- Concept management tool
- Registry applications
137Producing DDI 3.0 markup-- Getting started --
- Software to assist in document creation
- DeXtris
- XML browser
- Converts DDI 1/2 to DDI 3.0
- http://www.opendatafoundation.org/tools/dextris
138DDI 3.0 Tools Using Dextris
147Producing DDI 3.0 markup-- Getting started --
- Software to assist in document creation
- SPSS system to DDI 3.0 converter
- (See description and link on DDI 3.0 Proof of Concept page)
- http://www.ddialliance.org/DDI/ddi3/proof.html
148Producing DDI 3.0 markup-- Getting started --
- XML editors
- oXygen
- Create new DDI instance
- Edit/update DDI instance
- Validate DDI instance
- View schemas
149DDI 3.0 Viewing Schemas in oXygen
151Producing DDI 3.0 markup-- Getting started --
- Other tools to assist in producing DDI 3.0 markup
- DDI core template
- Version 3.0 documentation
- Module descriptions
- Field level documentation
- DDI Help Center
- http://www.ddialliance.org/ddi3/index.html
152Producing DDI 3.0 markup -- Using multiple
modules --
- Resource
- Getting Started with DDI 3.0
- http://www.ddialliance.org/DDI/ddi3/getting-started.html
153DDI Version 3.0: Displaying Markup
- Stylesheets
- Basic
- Web presentation in XHTML
- Enhanced
- Adds graphics for presenting frequencies
- Automated calculation of valid percentages
- http://www.ddialliance.org/DDI/ddi3/proof.html
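As a rough idea of what "valid percentages" involves, the calculation excludes missing-data categories from the denominator. The sketch below is not the stylesheet's implementation: the `valid_percentages` function and the frequency counts are invented for illustration.

```python
def valid_percentages(frequencies: dict[str, int],
                      missing: set[str]) -> dict[str, float]:
    """Percentage of each non-missing value, relative to all non-missing cases."""
    valid_total = sum(n for value, n in frequencies.items() if value not in missing)
    if valid_total == 0:
        return {}
    return {value: round(100 * n / valid_total, 1)
            for value, n in frequencies.items() if value not in missing}


# Illustrative counts only; 8 and 9 stand for missing-data codes.
freqs = {"0": 50, "7": 150, "8": 10, "9": 5}
pct = valid_percentages(freqs, missing={"8", "9"})
print(pct)
```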
154DDI Version 3.0: Questions? Comments?
- Sanda Ionescu: sandai_at_umich.edu
- DDI Users Listserv
- ddi-users_at_icpsr.umich.edu
- http://www.ddialliance.org/codebook/listserv.html
155The End