1. What Can We Learn from Quantitative Data in Statistics Education Research?
- Sterling Hilton, Brigham Young University
- Andy Zieffler, University of Minnesota
- John Holcomb, Cleveland State University
- Marsha Lovett, Carnegie Mellon University
2. Introduction
- Components of a research program
- Generate ideas (pre-clinical)
- Develop a conceptual framework
- Frame question (pre-clinical, Phase I)
- Constructs and Measurement
- Design and Methods
- Pilot study
- Examine question (Phase I, Phase II)
- Establish efficacy (small)
- Generalize findings (Phase III)
- Larger studies in varied settings
- Extend findings (Phase IV)
- Longitudinal studies
- Different populations
3. Introduction
- Quantitative methods in research program
- Framing measurement development
- Validity and reliability
- Framing pilot study
- Examine
- Generalize
- Extend
- Statistics education research is primarily in the
generate and frame phases
4. Introduction
- Purpose: Introduce two instruments that are in different stages of development and discuss how they have been and might be used in statistics education research
- Comprehensive Assessment of Outcomes in a First Statistics course (CAOS)
- Survey of Attitudes Toward Statistics (SATS)
5. Assessment Resource Tools for Improving Statistical Thinking (ARTIST)
- Several online assessments
- ARTIST Topic Scales
- Comprehensive Assessment of Outcomes in a First Statistics course (CAOS)
- Statistics Thinking and Reasoning Test (START)
6. ARTIST Topic Scales
- 7-15 MC items
- Many topics
- Data Collection
- Data Representation
- Measures of Center
- Measures of Spread
- Normal Distribution
- Probability
- Bivariate Quantitative Data
- Bivariate Categorical Data
- Sampling Distributions
- Confidence Intervals
- Significance Tests
7. CAOS Test
- 40 MC items
- Designed to assess students' statistical reasoning after any first course in statistics.
- The CAOS test focuses on statistical literacy and conceptual understanding, with particular emphasis on reasoning about variability.
- Developed through a three-year process of acquiring and writing items, testing and revising items, and gathering evidence of reliability and validity.
8. CAOS Test
- Reliability Analysis
- Sample of 10,287 students
- Cronbach's alpha coefficient of .77 (a computation sketch follows this slide)
- Content Validity Evidence
- 18 expert raters
- Unanimous agreement that CAOS measures important basic learning outcomes
- All raters agreed with the statement "CAOS measures outcomes for which I would be disappointed if they were not achieved by students who succeed in my statistics courses."
- Some raters indicated topics they felt were missing from the scale; there was no agreement among these raters about which topics were missing.
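The reliability figure above is a Cronbach's alpha coefficient. As an illustration only (the CAOS data themselves are not reproduced here), a minimal Python sketch of the computation on a hypothetical 0/1-scored item matrix:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_students x n_items) matrix of item scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of students' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 0/1 responses to the 40 CAOS items (200 simulated students).
rng = np.random.default_rng(0)
responses = (rng.random((200, 40)) < 0.6).astype(int)
print(round(cronbach_alpha(responses), 2))
```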
9. START Test
- 14 MC items
- Identified through a principal components analysis performed on CAOS data gathered in Fall 2005 and Spring 2006 (n = 1470).
- The alpha coefficient computed from that data set was 0.74.
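The slide does not spell out the exact item-selection procedure, but a principal components analysis of an item-response matrix can be sketched as follows; the data, component count, and selection rule here are illustrative assumptions, not the authors' actual analysis.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical 0/1 CAOS item matrix (1470 students x 40 items).
rng = np.random.default_rng(1)
items = (rng.random((1470, 40)) < 0.55).astype(float)

pca = PCA(n_components=5)
pca.fit(items)
loadings = pca.components_.T                      # item loadings on each component
print(pca.explained_variance_ratio_)              # share of variance per component
print(np.argsort(-np.abs(loadings[:, 0]))[:14])   # 14 items loading most on the first component
```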
10. Use of Quantitative Measures in a Phase I Study
- Exploratory Studies
- What can we find out about students' understanding?
- Where are students having difficulties?
- Are there inconsistencies in students' reasoning?
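For exploratory questions like these, a first step is often simply tabulating how students respond to each option of an item. A minimal sketch with made-up responses (the real CAOS response data are not shown here):

```python
from collections import Counter

# Hypothetical responses to one multiple-choice item with options a-d.
responses = list("abcdbbabccbdbbabbbabccab")

counts = Counter(responses)
n = len(responses)
for option in sorted(counts):
    print(f"{option}: {counts[option] / n:.0%}")   # percent choosing each option
```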
11. Example Item 1
- Measured Learning Outcome
- Understanding the interpretation of a median in
the context of boxplots.
12. Example Item 1
- The two boxplots below display final exam scores for all students in two different sections of the same course.
13. Example Item 1
- Which section has a greater percentage of students with scores at or above 80?
- Section A
- Section B
- Both sections are about equal.
14. Example Item 1
- Which section has a greater percentage of students with scores at or above 80?
- Section A
- Section B
- Both sections are about equal.
15. Example Item 1
- How did students answer this item?
16. Example Item 1
17. Example Item 1
- Is this surprising?
- What can we learn from students' responses to this item?
- Implications/directions for research? For teaching?
18. Example Item 2
- Measured Learning Outcome
- Understanding that correlation does not imply
causation.
19. Example Item 2
- Researchers surveyed 1,000 randomly selected
adults in the U.S. A statistically significant,
strong positive correlation was found between
income level and the number of containers of
recycling they typically collect in a week.
Please select the best interpretation of this
result.
20. Example Item 2
- We cannot conclude whether earning more money causes more recycling among U.S. adults because this type of design does not allow us to infer causation.
- This sample is too small to draw any conclusions about the relationship between income level and amount of recycling for adults in the U.S.
- This result indicates that earning more money influences people to recycle more than people who earn less money.
21. Example Item 2
- We cannot conclude whether earning more money causes more recycling among U.S. adults because this type of design does not allow us to infer causation.
- This sample is too small to draw any conclusions about the relationship between income level and amount of recycling for adults in the U.S.
- This result indicates that earning more money influences people to recycle more than people who earn less money.
22. Example Item 2
- How did students answer this item?
23. Example Item 2
24. Example Item 2
- Is this surprising?
- What can we learn from students' responses to this item?
- Implications/directions for research? For teaching?
25. Example Item 3
- Measured Learning Outcome
- Ability to match a scatterplot to a verbal
description of a bivariate relationship.
26. Example Item 3
- Bone density is typically measured as a
standardized score with a mean of 0 and a
standard deviation of 1. Lower scores correspond
to lower bone density. Which of the following
graphs shows that as women grow older they tend
to have lower bone density?
27. Example Item 3
28. Example Item 3
- How did students answer this item?
29. Example Item 3
30. Example Item 3
- Is this surprising?
- What can we learn from students' responses to this item?
- Implications/directions for research? For teaching?
31. Example Item 4
- Measured Learning Outcome
- Understanding of the purpose of randomization in
an experiment.
32. Example Item 4
- A recent research study randomly divided
participants into groups who were given different
levels of Vitamin E to take daily. One group
received only a placebo pill. The research study
followed the participants for eight years to see
how many developed a particular type of cancer
during that time period. Which of the following
responses gives the best explanation as to the
purpose of randomization in this study?
33. Example Item 4
- To increase the accuracy of the research results.
- To ensure that all potential cancer patients had an equal chance of being selected for the study.
- To reduce the amount of sampling error.
- To produce treatment groups with similar characteristics.
- To prevent skewness in the results.
34. Example Item 4
- To increase the accuracy of the research results.
- To ensure that all potential cancer patients had an equal chance of being selected for the study.
- To reduce the amount of sampling error.
- To produce treatment groups with similar characteristics.
- To prevent skewness in the results.
35. Example Item 4
- How did students answer this item?
36. Example Item 4
37. Example Item 4
- Is this surprising?
- What can we learn from students' responses to this item?
- Implications/directions for research? For teaching?
38. How Can We Use the Results?
- Begin to look for underlying reasons students are having difficulties
- Examine the research literature
- Interview students to gain a more in-depth understanding of their reasoning
- Compare results with data from other classes (other teachers, schools)
39. How Can We Use the Results?
- They can inform our instruction
- Reconsider how difficult or easy some concepts are for students
- Rethink how we currently teach these ideas
- Add new activities or tools
- Re-allocate classroom time
- Change the way we assess students
- Assessment items better aligned with learning outcomes
- Assessment items that probe students' reasoning
40. SATS
- Survey of Attitudes Toward Statistics
- Candace Schau and Tom Dauphinee
- (http://www.unm.edu/cschau/satshomepage.htm)
- Twenty-eight-item survey
- Seven-point Likert-scale response: 1 = Strongly Disagree, 4 = Neither agree nor disagree, 7 = Strongly Agree
41. SATS
- Original four subscales
- Value (9 items; α range .80 - .90)
- Sample item: "Statistics is worthless."
- Affect (6 items; α range .80 - .85)
- Sample item: "I like statistics."
- Cognitive Competence (6 items; α range .77 - .85)
- Sample item: "I have no idea of what's going on in statistics."
- Difficulty (7 items; α range .64 - .79)
- Sample item: "Statistics is a complicated subject."
42. SATS
- Two additional subscales
- Interest (4 items)
- Sample item: "I am interested in using statistics."
- Effort (4 items)
- Sample item: "I plan to complete all of my statistics assignments."
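SATS subscale scores are typically formed by averaging the relevant 7-point Likert items after reverse-coding negatively worded ones. A minimal scoring sketch follows; the item positions and reverse-coded set are placeholders, not the actual SATS scoring key.

```python
import numpy as np

def score_subscale(responses, items, reverse_items=(), scale_max=7):
    """Mean of the listed Likert items; negatively worded items are reverse-coded
    so that higher scores always indicate a more positive attitude."""
    responses = np.asarray(responses, dtype=float)
    cols = []
    for i in items:
        col = responses[:, i]
        if i in reverse_items:
            col = (scale_max + 1) - col        # maps 1 <-> 7, 2 <-> 6, ...
        cols.append(col)
    return np.vstack(cols).mean(axis=0)        # one score per respondent

# Hypothetical item positions for the Value subscale; consult the SATS key for the real ones.
value_items = [2, 5, 7, 9, 12, 16, 19, 22, 25]
value_reversed = {2, 12, 16, 19, 25}

rng = np.random.default_rng(2)
data = rng.integers(1, 8, size=(30, 28))       # 30 students x 28 items, responses 1-7
print(score_subscale(data, value_items, value_reversed).mean())
```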
43. SATS
- Attitude is a multi-faceted outcome
- Issues to consider
- Pre-existing attitudes
- Direction and magnitude of changes over a semester
- Relevance of items to the study
44. Using the SATS: A Case Study
- Assessment of a project-rich introductory statistics course
- Fall 2004, at Cleveland State University
- Class 1: 30 students, Pre/Post
- Class 2: 16 students, Pre/Post
- SATS administered on the first day and on the final exam day
45. Class 1: Project-Rich
- 4 team projects that used/required
- Real data
- Computer software
- Collaboration
- Writing
- Individualized mid-term and take-home data analysis exams
- http://academic.csuohio.edu/holcombj/eku/index.html
- Login: holcomb, pwd: projects22
46. Class 2
- TI-83
- In-class demos
- Homework and exams
47. Comparison of Pre Data
- No significant difference between Class 1 and Class 2
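The slide does not state which test was used; one reasonable way to compare the two classes' pre-course subscale scores is a two-sample (Welch) t-test per subscale, sketched here with simulated scores rather than the actual class data:

```python
import numpy as np
from scipy import stats

# Hypothetical pre-course Value subscale scores for the two classes.
rng = np.random.default_rng(3)
class1 = rng.normal(4.6, 1.1, size=30)   # Class 1: 30 students
class2 = rng.normal(4.5, 1.0, size=16)   # Class 2: 16 students

t_stat, p_value = stats.ttest_ind(class1, class2, equal_var=False)  # Welch two-sample t-test
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```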
48. (No transcript)
49. (No transcript)
50. (No transcript)
51. Class 1: Change from Pre to Post (2-sided tests)
- Significant Differences for
- Cognitive Competence
- Value
- Difficulty
- Interest
- Insignificant Differences for
- Affect
- Effort
- (Not Significant with Nonparametric Test)
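The pre-to-post comparisons above are two-sided tests on paired data, with a nonparametric check. A sketch of both (paired t-test and Wilcoxon signed-rank) on simulated pre/post scores for one subscale, not the actual class data:

```python
import numpy as np
from scipy import stats

# Hypothetical pre/post subscale scores for one class, paired by student.
rng = np.random.default_rng(4)
pre = rng.normal(4.5, 1.0, size=30)
post = pre + rng.normal(0.3, 0.8, size=30)

t_stat, t_p = stats.ttest_rel(post, pre)   # two-sided paired t-test
w_stat, w_p = stats.wilcoxon(post, pre)    # nonparametric signed-rank check
print(f"paired t: p = {t_p:.3f}; Wilcoxon: p = {w_p:.3f}")
```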
52. (No transcript)
53. Class 2: Change from Pre to Post (2-sided tests)
- Significant Differences
- Affect (wrong direction)
- Insignificant Differences
- Cognitive Competence
- Value
- Difficulty
- Interest
- Effort
54. (No transcript)
55. Multivariate Analysis of Post Data by Class: Significant vs. Insignificant Differences
- Significant Differences
- Affect
- Value
- Interest
- Insignificant Differences
- Cognitive Competence
- Difficulty
- Effort
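The slide says only "multivariate analysis"; one common choice is a MANOVA of the post-course subscale scores with class as the grouping factor. The sketch below uses simulated scores and three of the subscales, purely as an assumption about the kind of analysis involved:

```python
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Hypothetical post-course subscale scores for the two classes.
rng = np.random.default_rng(5)
n1, n2 = 30, 16
df = pd.DataFrame({
    "affect":   np.r_[rng.normal(4.8, 1.0, n1), rng.normal(4.2, 1.0, n2)],
    "value":    np.r_[rng.normal(5.0, 0.9, n1), rng.normal(4.5, 0.9, n2)],
    "interest": np.r_[rng.normal(4.7, 1.1, n1), rng.normal(4.1, 1.1, n2)],
    "group":    ["class1"] * n1 + ["class2"] * n2,
})

manova = MANOVA.from_formula("affect + value + interest ~ group", data=df)
print(manova.mv_test())   # Wilks' lambda, Pillai's trace, etc. for the class effect
```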
56. Does SATS Ask the Right Questions?
- Value Component Questions
- Statistics is worthless.
- Statistics should be a required part of my professional training.
- Statistical skills will make me more employable.
- Statistics is not useful to the typical professional.
- Statistical thinking is not applicable in my life outside my job.
- I use statistics in my everyday life.
- Statistics conclusions are rarely presented in everyday life.
- I will have no application for statistics in my profession.
- Statistics is irrelevant in my life.
57. What Are the Questions You Want to Ask?
58. Instructors: Do try this at home!
- But first, set your expectations
- Results may not be as high as you desire by the end of your course (e.g., CAOS)
- Results may not change from the beginning to the end of your course, or may not change in the direction you anticipate (e.g., SATS)
- The same is true for other instruments, too
59. How might you use such data?
60. How might you use such data?
- To better understand students' learning of particular concepts and skills
- To identify different patterns of student performance
- To establish a starting point for further inquiry
- To make your teaching and students' learning more effective
- To assess where students start and to reveal areas of difficulty during the course
61. Some Practical Considerations
- Motivating students to take these instruments seriously
- Grading?
- Feedback
- Instrument integrity
- Time to administer
- Others?
62. INQUERI Project
- INQUERI: Initiative for Quantitative Education Research Infrastructure
- To build a research infrastructure by focusing on the development, deployment, user training, and archiving of high-quality research methods, instruments, and data
- To disseminate these methods and results
- To catalyze research collaborations
- See www.inqueri.org
63. Back to the Big Picture
- Focus on the question/goal you want to address and relate that to past research
- Start small
- Using existing instruments is one way
- Working within your own course to start
- Share with colleagues, connect with the literature, and then extend
64. References
- delMas, R., Garfield, J., Ooms, A., & Chance, B. (2006). Assessing students' conceptual understanding after a first course in statistics. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.
- Garfield, J., delMas, R., & Chance, B. (n.d.). Assessment Resource Tools for Improving Statistical Thinking. Retrieved May 8, 2007, from https://app.gen.umn.edu/artist/index.html
65. References
- http://www.unm.edu/cschau/satshomepage.htm
- Dauphinee, T. L., Schau, C., & Stevens, J. J. (1997). Survey of Attitudes Toward Statistics: Factor structure and factorial invariance for females and males. Structural Equation Modeling, 4, 129-141.
- Schau, C., Stevens, J., Dauphinee, T. L., & Del Vecchio, A. (1995). The development and validation of the Survey of Attitudes Toward Statistics. Educational and Psychological Measurement, 55, 868-875.
- Hilton, S. C., Schau, C., & Olsen, J. A. (2003). Survey of Attitudes Toward Statistics: Factor structure invariance by gender and by administration time. Structural Equation Modeling, 11, 92-109.
66. Contact Information
- Sterling Hilton
- hiltons@byu.edu
- Andy Zieffler
- zief0002@umn.edu
- John Holcomb
- j.p.holcomb@csuohio.edu
- Marsha Lovett
- lovett@csuohio.edu