Title: Minium, Clarke
1Introduction
- Minium, Clarke & Coladarci, Chapter 1
2Summary
- Introduction
- Goals of Scientific Research
- Measurement
- Descriptive and Inferential Statistics
- Populations and Samples
- Role of Statistics in Research
3Goals of Scientific Research
- Describe
- Predict
- Explain
- Effect Change
- Consider
- Cod Stocks
- Incidence of Depression
4Measurement
- Variables
- Characteristics or attributes that can take on a number of different values
- e.g., height, weight, reaction time, eye colour
- Values
- Possible numbers or categories that a variable can have
- e.g., height in inches, weight in pounds, time in seconds, blue/brown
- Scores/Measurements
- A particular person's value on a variable
- e.g., 68 inches, 167 pounds, .505 seconds, blue
5Kinds of Variables
- Numeric variables
- Values are numbers that tell you something about how much there is of the thing being measured
- Equal-interval variables
- Values are numbers that stand for equal amounts of what is being measured
- Interval scales (no absolute zero), e.g., temperature in Fahrenheit or Celsius
- Ratio scales (absolute zero), e.g., height, weight, time, temperature in Kelvin
- Rank-order variables
- Values are ranks
- e.g., 2nd in the Olympic ski jump, 6th in your graduating class
6Kinds of Variables
- Nominal/Categorical variables
- Values are categories
- e.g., Eye colour of brown, Canadian citizenship,
sex, race.
7Kinds of Variables
- Most psychological research involves equal-interval numeric variables
- amount of food eaten by a rat
- physiological measures or brain activity
- reaction time
- test scores
- measures of anxiety on questionnaires
- "My heart feels like it's racing"
- never 1 2 3 4 5 6 7 always
- "I worry constantly"
- never 1 2 3 4 5 6 7 always
8Note!
- A collection of scores may be referred to as data
- Datum (singular, this datum)
- Data (plural, these data)
- Data set (singular, this data set)
9Rounding
- You will do a lot of calculations in this course
- You may be asked to give answers that are rounded to the nearest
- whole number, e.g., 10.6 -> 11
- tenth (first decimal place), e.g., 10.62 -> 10.6
- hundredth (second decimal place), e.g., 10.636 -> 10.64
- etc.
- In the case of a 5 followed by 0s, round to the nearest even digit
- whole number, e.g., 10.5 -> 10, 11.5 -> 12
- tenth (first decimal place), e.g., 10.65 -> 10.6, 10.75 -> 10.8
- hundredth (second decimal place), e.g., 10.635 -> 10.64
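The rounding rule on this slide can be checked with a short sketch. It assumes Python's standard `decimal` module; the helper name `round_half_even` is made up for illustration and is not from the course. Values are passed as strings so that a trailing 5 is stored exactly, which ordinary binary floats do not guarantee.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_half_even(value: str, places: int) -> Decimal:
    """Round a decimal string to `places` decimals; an exact 5 goes to the even digit."""
    # Quantization target: 1 for whole numbers, 0.1 for tenths, 0.01 for hundredths, ...
    step = Decimal(10) ** -places
    return Decimal(value).quantize(step, rounding=ROUND_HALF_EVEN)

# The examples from the slide
for value, places in [("10.6", 0), ("10.62", 1), ("10.636", 2),
                      ("10.5", 0), ("11.5", 0),
                      ("10.65", 1), ("10.75", 1), ("10.635", 2)]:
    print(value, "->", round_half_even(value, places))
```

Only the case of a 5 followed by nothing but 0s is affected by the even-digit rule; every other value rounds to the nearest digit as usual.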
10Two Branches of Statistical Methods
- Descriptive statistics
- The purpose of descriptive statistics is to organize and to summarize data so that the data are more readily comprehended.
- Inferential statistics
- Inferential statistics permit conclusions about a population, based on the characteristics of a sample of the population.
11Statistics populations and samples
- A population represents all the data of interest; a sample is anything less than that
- example populations
- all Canadian voters
- all Concordia students
- all maple trees in Quebec
- all houses built in Canada in 1953
- all incomes in Ontario in 2003
- all stray dogs
- etc.
12Statistics populations and samples
- A population represents all the data of interest; a sample is anything less than that
- example samples
- a subset of Canadian voters
- a subset of Concordia students
- a subset of maple trees in Quebec
- a subset of houses built in Canada in 1953
- a subset of incomes in Ontario in 2003
- a subset of stray dogs
- etc.
13Statistics populations and samples
- NOTE: "population" in a statistical sense does not mean all humans.
- However, in many of our examples we will discuss populations of people.
- e.g.,
- all university students
- all 9 month old babies
- all dyslexics
- all men/women
- all statistics students
14Statistics populations and samples
- A parameter is a characteristic of a population
- e.g., the average heart-rate of all Canadians.
- A statistic is a characteristic of a sample
- e.g., the average heart-rate of a sample of Canadians.
- When we infer population parameters from sample statistics, we are doing inferential statistics.
- e.g., we use the average heart-rate of the sample to estimate the average heart-rate of the population
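The difference between a parameter and a statistic can be illustrated with a small simulation. Everything in it is an assumption made for the sketch: the population of 10,000 heart rates is generated artificially (mean 70, standard deviation 10 beats per minute), and the sample size of 50 is arbitrary. None of these numbers come from the course.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch gives the same result each run

# Hypothetical population: resting heart rates (beats per minute) of 10,000 people
population = [random.gauss(70, 10) for _ in range(10_000)]
parameter = statistics.mean(population)   # characteristic of the whole population

# A sample is anything less than the population: here, 50 people drawn at random
sample = random.sample(population, 50)
statistic = statistics.mean(sample)       # characteristic of the sample

print(f"population mean (parameter): {parameter:.1f}")
print(f"sample mean (statistic):     {statistic:.1f}")
```

In real research the parameter is usually unknown, which is why the sample statistic is used to estimate it.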
15The Role of Statistics in Research
- Substantive Question (or Research Question)
- e.g., Is alcohol consumption related to [1] depression?
- Statistical Question
- e.g., Are the depression scores for heavy drinkers significantly higher than those for teetotalers?
- Substantive Question -> Statistical Question -> Statistical Conclusion -> Substantive Conclusion
- [1] "related to" does not mean "causes"
16Descriptive Statistics: Frequency Distributions
- Minium, Clarke & Coladarci, Chapter 2
17Frequency Tables
- Why Frequency Tables?
- Practical Reason 1: See the order in a set of data
- Practical Reason 2: It is important to examine your data for klinkers and outliers
- Pedagogical Reason 1: Introduction to distributions
- Frequency distribution
- Shows the scores and their frequency of occurrence in an ordered list
- Grouped frequency distribution
- Number of scores falling in each of several equally sized ranges of scores (class intervals)
- Relative frequency distribution
- Percentage of scores in each of several equally sized intervals
- Cumulative percentage frequency distribution
- The percentage of cases lying below the upper exact limit of each class interval
18Frequency Tables
- Final grades from last term
- F B B B C B F C
- C B A B C A C C
- A B A C D A C C
- B C B C B D A C
- B B D B A B B B
- F B B A A B F D
19Frequency Tables
Grade   Number   Percent   Cumulative Percent
A          9      18.75    100.00
B         19      39.58     81.25
C         12      25.00     41.67
D          4       8.33     16.67
F          4       8.33      8.33

Total     48     100.00
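The table above can be reproduced from the 48 final grades with a short sketch. It uses `collections.Counter` from the Python standard library; the function name `frequency_table` is an illustrative choice, not something defined in the course.

```python
from collections import Counter

# The 48 final grades from the earlier slide, in the order given
grades = ("F B B B C B F C C B A B C A C C A B A C D A C C "
          "B C B C B D A C B B D B A B B B F B B A A B F D").split()

counts = Counter(grades)
total = len(grades)

def frequency_table(order="FDCBA"):
    """Return (grade, number, percent, cumulative percent) rows, highest grade first."""
    rows, running = [], 0
    for grade in order:                 # accumulate from the lowest grade upward
        running += counts[grade]
        rows.append((grade, counts[grade],
                     100 * counts[grade] / total,
                     100 * running / total))
    return rows[::-1]                   # list the highest grade at the top

for grade, n, pct, cum in frequency_table():
    print(f"{grade}  {n:3d}  {pct:6.2f}  {cum:6.2f}")
```

The cumulative percent is computed from the running count rather than by adding rounded percentages, so rounding error does not build up.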
20Grouped Scores
- Sometimes we have a large number of values, so it makes more sense to group scores
- class intervals
- interval width
- score limits
- Consider a stress questionnaire (next slide) administered to 100 students.
21Grouped Scores
22Grouped Scores
60 41 35 57 57 65 42 35 66 50
41 15 47 60 50 53 30 54 46 41
58 32 73 49 46 54 66 31 27 82
34 66 49 63 69 68 39 45 50 46
51 48 68 85 78 46 50 36 51 29
20 56 40 58 42 48 50 32 55 77
43 64 50 50 52 31 46 35 57 55
57 18 38 64 38 25 56 44 69 33
45 40 59 51 60 39 46 53 42 59
69 39 46 33 15 54 25 48 54 69
- ... it makes more sense to group scores
- class intervals
- interval width
- score limits
23Guidelines for Forming Class Intervals
- All intervals should be the same width
- Intervals should be continuous throughout the distribution
- i.e., don't delete an interval just because there are no scores in the interval
- The interval containing the highest value should be at the top
- There should be (roughly) 10 to 20 intervals
- Select an odd (not even) value for the interval width
- The lower score limits should be multiples of the interval width
24Constructing a Grouped Frequency Distribution
- Find the highest (max) and lowest (min) scores
- Find the range: range = max - min + 1
- Divide the range by 10 and by 20 to determine possible interval widths
- Determine the lowest class interval
- List all class intervals with the highest interval at the top
- Tally the number of scores in each class interval
- Convert each tally to a frequency (i.e., count them)
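The steps above can be sketched in Python using the 100 stress scores from this deck. The variable names are illustrative, and the interval width of 5 is picked by hand as an odd value between range/20 and range/10, following the guidelines; it is not computed automatically.

```python
# The 100 stress scores, in the order they appear on the slide
scores = [int(s) for s in """
60 41 35 57 57 65 42 35 66 50 41 15 47 60 50 53 30 54 46 41
58 32 73 49 46 54 66 31 27 82 34 66 49 63 69 68 39 45 50 46
51 48 68 85 78 46 50 36 51 29 20 56 40 58 42 48 50 32 55 77
43 64 50 50 52 31 46 35 57 55 57 18 38 64 38 25 56 44 69 33
45 40 59 51 60 39 46 53 42 59 69 39 46 33 15 54 25 48 54 69
""".split()]

low, high = min(scores), max(scores)        # steps 1: lowest and highest scores
score_range = high - low + 1                # step 2: range = max - min + 1
possible_widths = (score_range / 20, score_range / 10)   # step 3

width = 5                                   # odd value inside possible_widths (chosen by hand)
lowest_limit = (low // width) * width       # step 4: lower limit is a multiple of the width

# Step 5: create every interval, including any that end up empty
frequencies = {}
lower = lowest_limit
while lower <= high:
    frequencies[(lower, lower + width - 1)] = 0
    lower += width

# Steps 6 and 7: tally each score into its interval and keep the counts
for s in scores:
    lower = (s // width) * width
    frequencies[(lower, lower + width - 1)] += 1

# Print with the highest interval at the top
for (lo, hi), f in sorted(frequencies.items(), reverse=True):
    print(f"{lo}-{hi}  {f}")
```

Creating the intervals before tallying keeps an interval in the table even when no score falls in it, which matches the guideline that intervals should be continuous.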
25Constructing a Grouped Frequency Distribution
100 Stress Scores
60 41 35 57 57 65 42 35 66 50
41 15 47 60 50 53 30 54 46 41
58 32 73 49 46 54 66 31 27 82
34 66 49 63 69 68 39 45 50 46
51 48 68 85 78 46 50 36 51 29
20 56 40 58 42 48 50 32 55 77
43 64 50 50 52 31 46 35 57 55
57 18 38 64 38 25 56 44 69 33
45 40 59 51 60 39 46 53 42 59
69 39 46 33 15 54 25 48 54 69
26Constructing a Grouped Frequency Distribution
100 Sorted Stress Scores
15 31 38 42 46 50 52 56 60 68
15 32 38 42 46 50 53 57 60 69
18 32 39 42 46 50 53 57 63 69
20 33 39 43 46 50 54 57 64 69
25 33 39 44 47 50 54 57 64 69
25 34 40 45 48 50 54 58 65 73
27 35 40 45 48 50 54 58 66 77
29 35 41 46 48 51 55 59 66 78
30 35 41 46 49 51 55 59 66 82
31 36 41 46 49 51 56 60 68 85
(sorted down each column, left to right)
27Constructing a Grouped Frequency Distribution
Interval   Number
85-89        1
80-84        1
75-79        2
70-74        1
65-69       10
60-64        6
55-59       12
50-54       17
45-49       15
40-44       10
35-39        9
30-34        8
25-29        4
20-24        1
15-19        3
28Relative Frequency Distribution
Interval   Number   Proportion   Percent
85-89        1        0.01          1
80-84        1        0.01          1
75-79        2        0.02          2
70-74        1        0.01          1
65-69       10        0.10         10
60-64        6        0.06          6
55-59       12        0.12         12
50-54       17        0.17         17
45-49       15        0.15         15
40-44       10        0.10         10
35-39        9        0.09          9
30-34        8        0.08          8
25-29        4        0.04          4
20-24        1        0.01          1
15-19        3        0.03          3
29Exact Limits
- So far we've considered whole numbers
- Often our scores will be real numbers
- We distinguish between
- score limits and
- exact limits (also called real limits)
- The exact limits of a class interval extend from one half the smallest unit of measurement below its lower score limit to one half the smallest unit of measurement above its upper score limit
- e.g., for the score limits of 15-19, the exact limits are from 14.5 to 19.5
- We can create a grouped frequency distribution for real number scores in exactly the same way as for whole number scores by making use of the exact limits
30Exact Limits
100 real number Scores
67.127 35.706 81.411 47.079 42.338 48.710 60.843 45.265 48.658 71.628
73.279 61.673 51.202 51.132 53.739 54.879 50.593 64.668 34.651 60.084
70.755 49.905 35.941 42.101 55.538 44.973 73.119 50.274 64.063 52.080
38.629 57.867 59.536 39.718 52.688 45.163 24.484 62.269 33.024 37.107
56.640 70.464 75.230 45.974 49.441 44.264 34.494 60.535 39.340 38.716
63.667 57.231 58.905 32.175 25.950 35.699 38.544 46.531 32.458 68.444
33.889 38.194 61.852 53.729 55.091 53.504 82.646 48.295 65.982 67.261
53.026 61.280 51.579 51.537 48.033 68.529 56.474 51.919 39.794 40.880
61.443 47.497 47.621 49.385 57.278 41.322 43.344 38.009 24.113 62.092
30.677 37.757 63.064 16.286 58.981 42.477 50.450 46.421 62.198 53.257
- Min = 16.2863, Max = 82.6464
- Range = 82.6464 - 16.2863 + 1 = 67.3601
- 67.3601/10 ≈ 7, 67.3601/20 ≈ 3
- Set interval width to be 5
- Lowest score limits are 15-19
- Lowest real limits are 14.5 to 19.5
- Proceed as before
- List all class intervals with the highest interval at the top
- Tally the number of scores in each class interval
- Convert each tally to a frequency (i.e., count them)
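Placing a real-number score into a class interval by its real limits can be sketched as follows. The function name `class_interval` is hypothetical, and the sketch assumes whole-number score limits (smallest unit of 1). It also assumes that a score falling exactly on a real limit, such as 19.5, goes into the higher interval; the slides do not say how such ties are handled.

```python
import math

def class_interval(score: float, width: int = 5):
    """Return ((score limits), (real limits)) of the interval containing `score`."""
    # Adding 0.5 shifts the real limits (14.5, 19.5, ...) onto multiples of the width
    lower = math.floor((score + 0.5) / width) * width
    upper = lower + width - 1
    return (lower, upper), (lower - 0.5, upper + 0.5)

# Lowest and highest of the 100 real-number scores
print(class_interval(16.286))   # falls between the real limits 14.5 and 19.5
print(class_interval(82.646))   # falls between the real limits 79.5 and 84.5
```

Tallying then proceeds exactly as for whole-number scores: count how many scores map to each pair of score limits.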
31Relative Frequency Distribution
Interval   Number   Proportion   Percent   Cumulative Percent
85-89        1        0.01          1        100
80-84        1        0.01          1         99
75-79        2        0.02          2         98
70-74        1        0.01          1         96
65-69       10        0.10         10         95
60-64        6        0.06          6         85
55-59       12        0.12         12         79
50-54       17        0.17         17         67
45-49       15        0.15         15         50
40-44       10        0.10         10         35
35-39        9        0.09          9         25
30-34        8        0.08          8         16
25-29        4        0.04          4          8
20-24        1        0.01          1          4
15-19        3        0.03          3          3
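The proportion, percent and cumulative columns follow directly from the interval counts. A minimal sketch, with the counts typed in from the table and illustrative variable names:

```python
# Interval labels and frequencies from the table, lowest interval first
intervals = ["15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49", "50-54",
             "55-59", "60-64", "65-69", "70-74", "75-79", "80-84", "85-89"]
counts = [3, 1, 4, 8, 9, 10, 15, 17, 12, 6, 10, 1, 2, 1, 1]

n = sum(counts)            # total number of scores
rows, running = [], 0
for label, f in zip(intervals, counts):
    running += f           # number of scores in this interval or any lower one
    rows.append((label, f, f / n, 100 * f / n, 100 * running / n))

# Print with the highest interval at the top, as in the table
for label, f, proportion, percent, cumulative in reversed(rows):
    print(f"{label:>5}  {f:3d}  {proportion:5.2f}  {percent:4.0f}  {cumulative:4.0f}")
```

Because there are exactly 100 scores here, each percent equals its count; with any other sample size the two columns would differ.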
32Percentile Scores and Percentile Ranks
- Percentile rank is the percentage of scores in a distribution falling below a particular score
- Consider grades on a test that range from 0 to 50
- If 80% of grades fall below a grade of 23, then its percentile rank is 80
- So, the 80th percentile score is 23
- 23 might not sound like a good grade, but it is in this class because 80% of the scores were lower than 23.
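The definition on this slide (percentage of scores falling below a given score) can be written as a one-line calculation. The ten grades below are made up for illustration, and the function name `percentile_rank` is hypothetical. Some textbooks refine this definition, for example by interpolating within class intervals, so treat this as a sketch of the simple version only.

```python
def percentile_rank(scores, score):
    """Percentage of scores in `scores` that fall below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical grades on a test scored from 0 to 50
grades = [5, 9, 12, 15, 17, 19, 20, 22, 23, 31]

print(percentile_rank(grades, 23))   # 8 of the 10 grades are below 23
```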
33Frequency Distributions for Qualitative Variables
- List the categories
- Record the frequencies
Hair Colour   Frequency (f)   Percent
Black             250           38.5
Blonde            150           23.1
Brown             200           30.8
Red                50            7.7
                 ----
              n = 650