Title: NAICS YIKES
Slide 1: An Introduction to <ODESI>
OLA Conference, February 2008, Session 1022
Jeff Moon, Head, Maps, Data, and Government Information Centre (MADGIC), Queen's University
Slide 2: Flowchart: Do I want to use statistics?
Slide 3: What we'll cover
- What is survey data, and what's the big deal?
- What's happening in Ontario on the data front?
- Show me the goods
- Why is this important at my library?
Slide 4: What is Survey Data, and what's the big deal?
Data continuum
Survey Data
Tables, Charts, Graphs
A number
(Microdata)
(in Books, CD-ROM, the WWW)
(machine-readable)
Slide 5: What is Survey Data, and what's the big deal?
Descriptive Statistics
Inferential Statistics
Slide 6: What is Survey Data, and what's the big deal?
Statistics
Survey Data
Tables, Charts, Graphs
A number
(Microdata)
(in Books, CD-ROM, the WWW)
(machine-readable)
Slide 7: What is Survey Data, and what's the big deal?
Survey Data
Aggregate Data
Postcard
Camera
Fixed
Slide 8: We'll look at the flexibility of survey data a
bit later on. In the meantime, let's look at
the situation in Ontario right now.
Slide 9: 1990s: Home-grown survey data systems
- Guelph, Western, Queen's
- No cataloguing standard
- Varying features/capabilities
- Served a purpose at the time
2000s: Emerging data cataloguing standards
Data Documentation Initiative (DDI): an
international standard for describing survey
data. Like MARC, only for data.
Mature commercial software solutions
Software such as Nesstar, SDA, and others
Slide 10: In 2005, the Data IN Ontario (DINO) working group
of OCUL (Ontario Council of University Libraries)
started thinking about moving beyond home-grown
data solutions, adopting the DDI standard, and
building a province-wide data solution. A
discussion paper followed.
In 2007, with funding from OCUL and Ontario
Buys, a Project Director was hired, and
hardware and software were purchased through Scholars
Portal.
Ontario Data Documentation, Extraction Service
and Infrastructure Initiative
Slide 11: Lead institutions in <ODESI> are Carleton and
Guelph, with in-kind assistance from Queen's
University. The first step was developing a Canadian
best-practices document for cataloguing data
files using DDI, analogous to AACR2 for MARC.
Next, survey files were marked up
(catalogued) and loaded onto a test server at
Guelph. The team at Scholars Portal is working
with <ODESI> to establish a data server and load
data files.
Slide 12: Use of the Data Documentation Initiative standard
facilitates:
- Interoperability. XML-compliant DDI codebooks can be exchanged and transported seamlessly, and applications can be written to work with these homogeneous documents.
- Richer content. The DDI encourages better description of social science datasets, providing researchers with a better window into what is available.
- Single document, multiple purposes. A DDI codebook contains all of the information necessary to produce several different types of output, including a traditional social science codebook, a bibliographic record, and SAS/SPSS/Stata data definition statements. Thus, the document may be repurposed for different needs and applications.
- On-line subsetting and analysis. Because the DDI markup extends down to the variable level and provides a standard uniform structure and content for variables, DDI documents are easily imported into on-line analysis systems, rendering datasets more readily usable for a wider audience.
- Precision in searching. Since each of the elements in a DDI-compliant codebook is tagged, searches across documents and studies are possible.
www.ddialliance.org
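To make the "single document, multiple purposes" and "precision in searching" points concrete, here is a minimal Python sketch. It assumes a hand-written, heavily simplified codebook fragment that borrows DDI Codebook element names (codeBook, stdyDscr, dataDscr, var, labl, qstnLit, catgry, catValu); the study title, variable names, and helper functions are hypothetical, and a real DDI file carries namespaces and far more detail.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment using DDI Codebook-style element names.
DDI_FRAGMENT = """
<codeBook>
  <stdyDscr>
    <citation><titlStmt><titl>Example Health Survey (hypothetical)</titl></titlStmt></citation>
  </stdyDscr>
  <dataDscr>
    <var name="SEX">
      <labl>Sex of respondent</labl>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
    <var name="OWNWT">
      <labl>Opinion of own weight</labl>
      <qstn><qstnLit>Do you consider yourself overweight, underweight, or just about right?</qstnLit></qstn>
      <catgry><catValu>1</catValu><labl>Overweight</labl></catgry>
      <catgry><catValu>2</catValu><labl>Underweight</labl></catgry>
      <catgry><catValu>3</catValu><labl>Just about right</labl></catgry>
    </var>
  </dataDscr>
</codeBook>
"""


def study_title(root):
    """Output 1: the title, as needed for a bibliographic record."""
    return root.findtext("stdyDscr/citation/titlStmt/titl")


def variable_labels(root):
    """Output 2: variable name -> label, as in a traditional codebook."""
    return {var.get("name"): var.findtext("labl") for var in root.iter("var")}


def spss_value_labels(root):
    """Output 3: SPSS-style VALUE LABELS statements built from the categories."""
    lines = []
    for var in root.iter("var"):
        cats = [(c.findtext("catValu"), c.findtext("labl")) for c in var.findall("catgry")]
        if cats:
            pairs = " ".join(f"{value} '{label}'" for value, label in cats)
            lines.append(f"VALUE LABELS {var.get('name')} {pairs}.")
    return lines


def search_questions(root, keyword):
    """Variable-level search: match the keyword in labels and question text."""
    keyword = keyword.lower()
    hits = []
    for var in root.iter("var"):
        parts = [var.findtext("labl"), var.findtext("qstn/qstnLit")]
        text = " ".join(p for p in parts if p)
        if keyword in text.lower():
            hits.append(var.get("name"))
    return hits


root = ET.fromstring(DDI_FRAGMENT.strip())
print(study_title(root))
print(variable_labels(root))
print("\n".join(spss_value_labels(root)))
print(search_questions(root, "weight"))
```

Because every element is tagged, the same file feeds all four functions without any custom parsing per survey; that is the property the slide attributes to DDI.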
Slide 13: SOFTWARE CHOSEN: NESSTAR
- Networked Social Science Tools and Resources, developed by the Norwegian Social Science Data Services
- In use internationally (Europe, UK, US, Canada)
- In Ontario: Queen's, Guelph, Carleton, Windsor, Ottawa, U. of T., and Statistics Canada use Nesstar
- DDI compliant
- Search by keyword for surveys and survey questions
- Do basic data exploration and analysis on the web
- Download full datasets or subsets in popular formats
- Export tables and charts
Slide 14: http://nesstar.esds.ac.uk/webview/
http://www.nsd.uib.no/cessda/extcessda.jsp
Slide 15: Nesstar Publisher produces DDI-compliant metadata
using a set of structured tags, grouped into
tabs in Publisher.
Slide 16: Document Description Tab
Slide 17: Study Description Tab
Slide 18: Other Study Materials Tab
Slide 19: File Description Tab
Slide 20: Variables Tab
Slide 21: Variable Groups Tab
Slide 22: Data Entry Tab
Slide 23: Other Materials Tab
Slide 24: Once ready, a marked-up survey file is
published to the Nesstar Server, where it
becomes available through Nesstar Webview.
Slide 25: Let's take a look at how <ODESI> can be used to
answer a research question.
How do men and women differ in perceptions of
their health (using weight as an example)?
Concepts? Health, Body Mass Index
(BMI), Weight, Males/Females
Slide 26: Starting point: A simple search on the Statistics
Canada web site
Slide 27: Fixed
Flexible
Slide 35: Basic frequencies or marginals for
categorical variables
Slide 36: Descriptive statistics for continuous variables
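As a rough illustration of the two kinds of summary named above, the following Python sketch computes frequencies (marginals) for a categorical variable and descriptive statistics for a continuous one. The data values and the function names are invented for illustration; this is not how Nesstar computes its output, only a sketch of the same idea.

```python
from collections import Counter
from statistics import mean, median, stdev


def frequencies(values):
    """Marginals for a categorical variable: category -> (count, percent)."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}


def describe(values):
    """Descriptive statistics for a continuous variable."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
        "stdev": stdev(values),  # sample standard deviation
    }


# Invented example responses (not real survey data).
opinions = [
    "Overweight", "Just about right", "Overweight",
    "Underweight", "Just about right", "Just about right",
]
bmi = [22.5, 27.0, 31.5, 19.0, 24.0]

for category, (count, percent) in frequencies(opinions).items():
    print(f"{category}: {count} ({percent:.1f}%)")
print(describe(bmi))
```

Counting categories makes sense for a variable like "opinion of own weight"; a mean or median only makes sense for a measured quantity such as BMI.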
Slide 37: But what if we want to look at more than one
variable at a time? Say, for instance, the issue
of weight and gender?
Slide 39: OK, now we want to add gender as a variable.
Slide 41: Opinion of own weight, by sex
Proportionally, more women than men had the
opinion that they were Overweight.
Slide 42: OK, but how does this change if we add an
objective measure of weight, such as Body Mass
Index (BMI)?
Slide 43: Start where we left off: opinion of
own weight, by sex.
But add another variable as a layer.
Slide 44: Add BMI class as a layer
Slide 45: Layer: those with a BMI indicating underweight
Of respondents who were objectively
underweight, proportionally more women than men
had the subjective opinion that they were Just
About Right.
Slide 46: Layer: those with a BMI indicating normal
weight
Of respondents who were objectively normal
weight, proportionally more women than men had
the subjective opinion that they were
Overweight.
Slide 47: Layer: those with a BMI indicating overweight
Of respondents who were objectively overweight,
proportionally more MEN than women had the
subjective opinion that they were Just About
Right.
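The layered table walked through on the last few slides can be sketched in a few lines of Python: a crosstab of opinion by sex, computed separately within each BMI class, with percentages taken down each sex column. The records below are invented and unweighted, and the function and field names (layered_crosstab, sex, opinion, bmi_class) are assumptions for illustration only.

```python
from collections import defaultdict


def layered_crosstab(records, row, col, layer):
    """Return {layer value: {column value: {row value: column percent}}}."""
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for rec in records:
        counts[rec[layer]][rec[col]][rec[row]] += 1

    table = {}
    for layer_value, columns in counts.items():
        table[layer_value] = {}
        for col_value, rows in columns.items():
            column_total = sum(rows.values())
            table[layer_value][col_value] = {
                row_value: 100.0 * n / column_total for row_value, n in rows.items()
            }
    return table


def make(sex, bmi_class, opinions):
    """Build invented respondent records for one sex within one BMI class."""
    return [{"sex": sex, "bmi_class": bmi_class, "opinion": o} for o in opinions]


# Invented records (not real survey data).
records = (
    make("Female", "Normal weight",
         ["Overweight", "Overweight", "Just about right", "Just about right"])
    + make("Male", "Normal weight",
           ["Overweight", "Just about right", "Just about right", "Just about right"])
    + make("Female", "Overweight", ["Overweight", "Overweight"])
    + make("Male", "Overweight", ["Overweight", "Just about right"])
)

table = layered_crosstab(records, row="opinion", col="sex", layer="bmi_class")
for bmi_class, by_sex in table.items():
    print(bmi_class)
    for sex, percents in by_sex.items():
        print("  ", sex, percents)
```

Using column percentages rather than raw counts is what allows the "proportionally more women than men" comparisons even when the two groups differ in size.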
Slide 48: OK, I have a confession to make
Slide 49: Statistical Weight. All the previous slides
ignored an important concept: that of weight. Not
weight in kilograms, but rather statistical
weight. We don't want to describe the sample;
we want to describe the population at large (in
this case, Canadians 18+). Statistical weights
are assigned by statisticians, not surprisingly,
to each individual in a sample, based on a
variety of demographic and sampling
considerations. These weights reflect how many
people a given respondent represents in the
population being studied.
Slide 51: In general, you must apply the Statistical Weight
in order to get valid results.
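A small Python sketch shows why the weight matters. It assumes each respondent record carries a weight equal to the number of people that respondent represents; the four records, their weights, and the function name are invented for illustration.

```python
def proportion(records, predicate, weight_key=None):
    """Share of records matching predicate; each record counts as 1 when unweighted."""
    total = 0.0
    matching = 0.0
    for rec in records:
        w = rec[weight_key] if weight_key else 1.0
        total += w
        if predicate(rec):
            matching += w
    return matching / total


# Invented sample: each respondent "stands for" a different number of people.
sample = [
    {"opinion": "Overweight", "weight": 100.0},
    {"opinion": "Overweight", "weight": 100.0},
    {"opinion": "Just about right", "weight": 400.0},
    {"opinion": "Just about right", "weight": 400.0},
]


def says_overweight(rec):
    return rec["opinion"] == "Overweight"


unweighted = proportion(sample, says_overweight)
weighted = proportion(sample, says_overweight, weight_key="weight")
print(f"Unweighted (describes the sample): {unweighted:.0%}")
print(f"Weighted (estimates the population): {weighted:.0%}")
```

In this invented sample, half of the respondents say "Overweight", but they represent only 200 of the 1,000 people in the population, so the weighted estimate is 20 percent rather than 50 percent.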
Slide 52: They say a picture is worth a thousand words.
If this is true, then a good chart has to be
worth at least a couple of hundred. Let's
revisit our data visually using the bar chart
feature of Nesstar.
Slide 53: Barcharts showing weighted results
Proportionally, of those who are objectively
underweight, more women than men think they are
just about right
Weight is on
Slide 54: Barcharts showing weighted results
Proportionally, of those who are objectively
normal weight, more women than men think they are
overweight
Weight is on
Slide 55: Barcharts showing weighted results
Proportionally, of those who are objectively
overweight, more men than women think they are
just about right
Weight is on
Slide 56: Searching for questions in Nesstar: Simple
Search
Slide 57: Search results: Simple search
You get all the surveys that have the keyword
you searched for, but specific questions
(variables) are NOT highlighted.
Slide 58: Searching for questions in Nesstar: Advanced
Search
Slide 59: Advanced Search Screen
Slide 60: Search results: Advanced search
Here, specific variables that meet the search
criteria are shown, with the option of opening
them in context.
Slide 61: Menu options
Table
Download
Barchart
Export to spreadsheet
Time series graph
Export PDF
Map
Print
Clear
Create bookmark
Weight
Help
Subset
Slide 62: OK, so what kind of data can I expect to find using ODESI?
- Statistics Canada survey files released through the Data Liberation Initiative (Census PUMFs, Special Surveys, General Social Surveys, and more)
- Public opinion polls (e.g. Gallup)
- Survey files from other sources (academics)
- These surveys and polls include questions on all manner of topics (politics, health, work, leisure, education, drug use, aging, spending, internet use, and many more)
Slide 63: Let's take a look at some Gallup questions.
Dataset: Canadian Gallup Poll, August 1951, #212
In some cities in Canada, horsemeat
is now being sold, because of the high price of
other meats. If horsemeat were available here,
would you be willing to try it?
35.9% of respondents said Yes, they'd be willing.
Slide 64: Dataset: Canadian Gallup Poll, September 1956, #251
WOULD YOU FAVOR REQUIRING EVERY ABLE-BODIED
YOUNG MAN IN THIS COUNTRY, WHEN HE REACHES THE
AGE OF 18, TO SPEND ONE YEAR IN MILITARY TRAINING
AND THEN JOIN THE RESERVES OR MILITIA?
65.7% favoured this.
Slide 66: Dataset: Canadian Gallup Poll, August 1953, #231
HOW MUCH DO YOU THINK A YOUNG MAN SHOULD BE
EARNING PER WEEK BEFORE HE GETS MARRIED?
$41 - $50 per week equals roughly $2100 - $2600
annually.
Slide 68: Dataset: Canadian Gallup Poll, August 1953, #231
THERE'S AN ATTEMPT BEING MADE BY SOME FASHION
LEADERS TO SHORTEN WOMEN'S SKIRTS. DO YOU THINK
THAT WOMEN SHOULD FOLLOW THIS LEAD - AND WEAR
SKIRTS SHORTER THAN THEY ARE NOW?
Slide 69: Tracking opinions over time
DO YOU APPROVE OF THE USE OF BIRTH CONTROL?
Slide 71: Key points about survey data in <ODESI>
- Researchers can search across all surveys in a collection.
- Researchers have the ability to explore surveys in more detail (e.g. looking at questions by gender, province, age group, income, etc.).
- Tables can be saved in Excel or Adobe format.
- Researchers can download data for use in more powerful statistical packages (SPSS, SAS, etc.).
Slide 72: In conclusion, ODESI will:
- Provide a more level data playing field for Ontario universities.
- Provide students and researchers with access to a substantial and growing body of survey and polling data, both current and historical.
- Provide an easy, yet powerful, search and exploration tool (Nesstar) that will serve both beginners and power users.
- Encourage cooperation and sharing of data and metadata in Ontario.
- Serve as a potential model for other jurisdictions.
<odesi.ca>