Chapter 2. Sadistics - PowerPoint PPT Presentation

Slides: 31
Provided by: McEw

Transcript and Presenter's Notes


1
Chapter 2. Sadistics
  • Juggling
  • Basic
  • Statistical
  • Notions
  • Important For
  • Teachers
  • While
  • Maintaining
  • Sanity

2
Remember . . .
  • There are three kinds of people in the world.
  • Those who are good at math, and
  • Those who aren't.

3
Raw Scores: without organization or context,
scores by themselves are meaningless
  • George Carlin's famous line: "Here's a partial
    football score just in: Green Bay 6."
  • Ohio State 35 Michigan 32
  • Annika Sorenstam 35 Lorena Ochoa 32
  • John and Charley are taking a trip together.
    Charley has lived for 7 years John has lived for
    27. Who is the older of the two?
  • Charley
  • John
  • Charley is a dog. In human years he is 46, e.g.,
    (7 - 1) x 5 + 16 = 46. John is a man. In human
    years he is 27.

4
Topic List
  • Descriptive vs. Inferential Statistics
  • Concepts of Data and Variables
  • Frequency Tables
  • Bar Graphs and Histograms
  • Measures of Central Tendency
  • Measures of Variability
  • Shapes of Distributions
  • z-scores
  • Correlation

5
Two Main Areas of Statistics: Descriptive vs.
Inferential
  • Descriptive Statistics is used to organize,
    consolidate or summarize data we have in front of
    us. Typically in descriptive statistics we
    describe:
  • a set of data elements by graphically displaying
    the information or
  • its central tendencies and how it is distributed
    in relation to this center or
  • the relationship between two data elements.
  • Inferential Statistics is a leap into the
    unknown. We use samples (a selected portion of
    the data set) to draw inferences about
    populations (the complete set of data elements).
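The sample-to-population idea can be sketched in a few lines of Python. This is only an illustration: the population of 100 scores and the every-tenth-student sampling scheme are made up for the example, and in real practice the population mean is exactly the thing we do not get to see.

```python
from statistics import mean

# Hypothetical population: one score for each of 100 students (made-up data).
population = list(range(1, 101))

# Assumed sampling scheme: take every tenth student (a simple systematic sample).
sample = population[::10]

sample_mean = mean(sample)          # the statistic we can actually compute
population_mean = mean(population)  # normally unknown; shown here for comparison

print(f"sample of {len(sample)}: mean = {sample_mean}")
print(f"population of {len(population)}: mean = {population_mean}")
```

The sample mean lands near, but not exactly on, the population mean. That gap is the "leap" that inferential statistics tries to quantify.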

6
Variables
  • A good place to begin is with the concept of
    variables. Our students vary with regard to
    many characteristics related to aptitude and
    achievement. We can think of these variable
    characteristics using three levels of generality.

7
English Lesson: When Speaking of Data
  • As a teacher, you will see lots of data.
  • The dictionary defines "data" as "facts or
    figures." Notice the word "data" is plural, and
    when speaking of data you need to use plural verb
    forms. It is an unfortunate giveaway that you
    might NOT know what you are talking about if you
    use the phrase "this data is" instead of "these
    data are" in professional conversation.
  • "Datum" is the singular form of "data." No one
    ever uses this term.

8
Making and Reading Frequency Tables, Part 1:
Frequency Distributions - with special focus
on bins (also known as intervals, categories and
class intervals)
  • Purpose of Creating these Tables: To organize
    data in ways to make our inspection of those data
    much more manageable.
  • Frequency Distribution
  • We construct or read a table of counts per score.
  • BUT, when we have many scores, we create
    intervals (I like the term "bins") and place the
    individual scores in the bins. When making bins:
  • Determine your score range.
  • Determine an appropriate number of bins. Rule of
    thumb: no fewer than 5 and no more than 20 class
    intervals work best for a frequency table.
  • Make sure no overlap exists so that no data fall
    into more than one bin.
  • Count each score in its one and only appropriate
    bin.
  • Notice that in the resulting table, individual
    scores are lost.
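The binning steps above can be sketched in Python. This is a minimal illustration, not a prescribed method: the 18 scores, the bin width of 10, and the helper name `frequency_table` are all made up for the example, and it assumes whole-number scores.

```python
from collections import Counter

def frequency_table(scores, bin_width):
    """Count whole-number scores in equal-width, non-overlapping bins."""
    low = min(scores)
    # Each score maps to exactly one bin index, so no score is counted twice.
    counts = Counter((s - low) // bin_width for s in scores)
    n_bins = (max(scores) - low) // bin_width + 1
    table = []
    for i in range(n_bins):
        start = low + i * bin_width
        end = start + bin_width - 1  # inclusive upper limit of the bin
        table.append((start, end, counts.get(i, 0)))
    return table

# Hypothetical quiz scores for 18 students.
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 75, 78, 80, 82, 85, 88, 91, 94]
table = frequency_table(scores, bin_width=10)

for start, end, count in table:
    print(f"{start}-{end}: {count}")
```

With a width of 10 the score range of 52 to 94 yields 5 bins, which sits at the low end of the rule of thumb. The printed table shows only counts per bin, so the individual scores are no longer visible.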

9
Making and Reading Frequency Tables, Part 2:
Cumulative Distributions
  • Cumulative frequency distribution: A distribution
    that indicates cumulative frequency counts (cum
    f) in each bin, and/or the percentage of the total
    number of cases at and below the upper limit of
    the associated bin. Sometimes this is referred
    to simply as a cumulative distribution or
    cumulative frequency.
  • Note: Educators are using the descriptive
    statistics of cumulative distributions when
    speaking of students' relative standing.
  • Percentile: The point on the original measurement
    scale at and below which a specified percentage
    of scores falls. Also called a percentile point.
  • Percentile rank: The percentile rank of a score
    is the point on the percentile scale that gives
    the percentage of scores falling at and below a
    student's specified score.
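A short Python sketch of these definitions follows. The bin counts and the 18 scores are made-up example data, and `percentile_rank` is a hypothetical helper that applies the "at and below" definition given above (other textbooks use slightly different conventions).

```python
from itertools import accumulate

# Hypothetical frequency counts for five bins, lowest bin first.
bin_counts = [3, 5, 5, 4, 1]

# Cumulative frequency (cum f): running total of counts up to each bin.
cum_f = list(accumulate(bin_counts))
total = cum_f[-1]
# Cumulative percentage: share of all cases at and below each bin's upper limit.
cum_pct = [100 * f / total for f in cum_f]

def percentile_rank(scores, score):
    """Percentage of scores falling at and below the given score."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical quiz scores for 18 students.
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 75, 78, 80, 82, 85, 88, 91, 94]
rank_of_75 = percentile_rank(scores, 75)

print("cum f:", cum_f)
print("cum %:", [round(p, 1) for p in cum_pct])
print(f"percentile rank of a score of 75: {rank_of_75:.1f}")
```

Here 11 of the 18 scores fall at or below 75, so a student who scored 75 has a percentile rank of about 61.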

10
Frequency Distribution Table
11
Tables are Nice, but Pictures are Nicer
  • Frequency distributions are often converted into
    graphic form.
  • Bar Graph: Individual counts. The count bins
    are separated on the horizontal line.
  • Histogram: Grouped counts. The bins touch each
    other on the horizontal line.
  • Pie Graph: Either individual or grouped counts.
    The media likes to display data using these
    graphs.
  • Explore the CSERD (Computational Science
    Education Reference Desk) Interactive Website.
    This is a Pathways project of the National
    Science Digital Library and funded by the
    National Science Foundation.

12
Ideas of Data Centers: How Do Data Cluster?
. . . starting with a concept from Garrison
Keillor.
  • Keillor's hometown is Lake Wobegon, located near
    the geographic center of Minnesota.
  • Keillor reports that in Lake Wobegon "all the
    women are strong, all the men are good looking,
    and all the children are above average."

13
Central Tendency
  • While graphs and charts are useful to visually
    represent data, they are inconvenient: they are
    difficult to display and cannot be easily
    remembered apart from the visual. It is
    frequently useful to reduce data to a number
    (sometimes called an index number) that is easy
    to remember, is easy to communicate, yet captures
    the essence of the complete data set it
    represents.
  • One such set of indexes is called Measures of
    Central Tendency (i.e., how do the raw data tend
    to cluster?)
  • Mean: the arithmetic average
  • Median: the middle score
  • Mode: the most frequently occurring score
  • So, these are measures of center regarding the
    data, but we are also concerned about how the raw
    data are spread out around the center.
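The three measures of center can be computed with Python's standard `statistics` module. The seven scores below are made up for illustration; one high score is included on purpose so that the mean and the median differ.

```python
from statistics import mean, median, mode

# Hypothetical quiz scores; the 15 is an unusually high score.
scores = [4, 6, 7, 7, 8, 9, 15]

m = mean(scores)      # arithmetic average: sum of scores / number of scores
md = median(scores)   # middle score once the scores are sorted
mo = mode(scores)     # most frequently occurring score

print(f"mean = {m}, median = {md}, mode = {mo}")
```

The single high score pulls the mean up to 8 while the median and mode stay at 7, which is one reason to report more than one measure of center.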

14
Consider the two graphs below. These graphs
represent the scores on two quizzes. The mean
score for each quiz is 7.0. Despite the equality
of means, you can see that the distributions are
quite different. Specifically, the scores on Quiz
1 (top graph) are more densely packed while those
on Quiz 2 (bottom graph) are more spread out. The
differences among students were much greater on
Quiz 2 than on Quiz 1.
15
Variability
  • Our second set of indexes is called Measures of
    Variability (i.e., how do the raw data tend to
    spread out or scatter?)
  • Range: list the lowest and highest scores, then
    take the difference (aka subtract) between them
  • Standard Deviation (S, SD, s): this is an
    interesting concept; it is akin to finding the
    average distance that scores are from the center
  • Variance (SD²): mathematically, the standard
    deviation squared; we more often use the standard
    deviation in educational assessment.
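A small Python sketch shows how two sets of scores can share a mean of 7.0 and still differ in spread. The two score lists are made up for the example, and the code assumes the population formulas (`pvariance`, `pstdev`, which divide by N); many textbooks and software packages default to the sample formulas, which divide by N - 1.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical scores for two quizzes, both with a mean of 7.
quiz1 = [6, 6, 7, 7, 7, 8, 8]      # tightly packed
quiz2 = [2, 4, 6, 7, 8, 10, 12]    # spread out

def score_range(scores):
    """Difference between the highest and lowest scores."""
    return max(scores) - min(scores)

for name, scores in (("Quiz 1", quiz1), ("Quiz 2", quiz2)):
    print(
        f"{name}: mean = {mean(scores)}, "
        f"range = {score_range(scores)}, "
        f"variance = {pvariance(scores):.2f}, "
        f"SD = {pstdev(scores):.2f}"
    )
```

Both quizzes report the same mean, but the range and standard deviation of the second are several times larger, which the mean alone would never reveal.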

16
(No Transcript)
17
Shape of Normal Distributions
  • The frequency histograms for test score data
    often approximate what is called the normal
    distribution (aka bell curve, normal curve).
  • The normal curve has three characteristics:
  • unimodal: one hump
  • asymptotic: tails never touch the base
  • symmetrical: mirror image about the center axis

18
Normal Curve
19
Shape of Other Distributions
  • Kurtosis
  • platykurtic: looks flatter
  • leptokurtic: looks more peaked
  • Skewness
  • positive skew means that the tail is to the right
  • negative skew means that the tail is to the left
    ------------------------------------------------
  • Back to the normal distribution, let's look at
    transforming a raw score to a score that will
    tell us where that score is in relationship to
    the mean. This score is called a z-score.

20
z-scores
  • Formula: z = (X - M) / SD, where X is the raw
    score, M is the mean, and SD is the standard
    deviation
  • Definition: A measure of how many standard
    deviations a raw score is from the mean.
  • If the z-score is negative, we say the score is
    below the mean
  • If the z-score is positive, we say the score is
    above the mean
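The z-score calculation is a one-liner in Python. The test mean of 70 and standard deviation of 10 below are made-up values chosen to keep the arithmetic easy to check by hand.

```python
def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

# Hypothetical test with mean 70 and standard deviation 10.
z_high = z_score(85, mean=70, sd=10)
z_low = z_score(60, mean=70, sd=10)
z_avg = z_score(70, mean=70, sd=10)

print(f"score 85 -> z = {z_high:+.1f}")  # above the mean
print(f"score 60 -> z = {z_low:+.1f}")   # below the mean
print(f"score 70 -> z = {z_avg:+.1f}")   # exactly at the mean
```

A score of 85 sits one and a half standard deviations above the mean, a score of 60 sits one standard deviation below it, and a score equal to the mean has a z-score of zero.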

21
z-scores in normal curve
  • This Graph Leads In To Percentile Rank

22
Comparing Two Variables
  • So far we have only dealt with one variable (aka
    univariate statistics). Sometimes (I would say
    many times) we are curious as to the relationship
    between two variables (aka bivariate statistics).
    We call this curiosity an interest in
    co-relationships or correlation.

23
Some History . . . Francis Galton
(1822-1911) and Co-relations
  • Cousin of Charles Darwin
  • Interested in the mathematical treatment of
    heredity
  • Used statistical analysis to study human
    variation
  • noted that arranging measures of a physical trait
    in a population (height, e.g.) displays a
    bell-shaped distribution
  • Coined the term "eugenics": the science of
    improving the stock
  • variations (deviations) viewed as flaws as well
    as assets
  • artificial and natural selection will shift
    median of distribution

24
The Eugenics Movement
  • Scientific evidence was used to argue that
    social ills like feeble-mindedness, alcoholism,
    pauperism and criminal behavior are hereditary
    traits.
  • Aim - "to give the more suitable races or strains
    of blood a better chance of prevailing speedily
    over the less suitable"
  • Can no longer rely on natural selection:
  • unfit survive to childbearing years due to:
  • advances in medicine
  • comforts of civilization
  • social welfare
  • unfit reproduce at higher rate than fit
  • Must design society by controlling human
    reproduction:
  • encourage fit to have children
  • prohibit unfit from having children

25
Scattergram: Can you eyeball the one line you
could draw through the data points that best
describes the graphic display?

26
Correlation Coefficient: the calculated number
that best describes the relationship between two
variables
  • The correlation coefficient symbol is r; it
    describes linear relationships
  • Range: -1.00 through .00 to +1.00
  • Sign indicates direction
  • + indicates that as one variable increases, the
    other variable increases
  • - indicates that as one variable increases,
    the other variable decreases
  • Number indicates strength
  • Although the following table is somewhat
    arbitrary, it might be useful in interpretation:
  • -1.0 to -0.7: strong inverse association.
  • -0.7 to -0.3: weak inverse association.
  • -0.3 to 0.3: little or no association.
  • 0.3 to 0.7: weak direct association.
  • 0.7 to 1.0: strong direct association.
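The calculation and the rough interpretation table can be sketched in Python. The function below uses the standard Pearson formula; the five data pairs are made up, and `describe` is a hypothetical helper. Because the table's ranges share their endpoints, the sketch assumes that a value of exactly 0.3 or 0.7 belongs to the stronger category.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r):
    """Label r using the rough table (boundary handling is an assumption)."""
    size = abs(r)
    if size < 0.3:
        return "little or no association"
    strength = "strong" if size >= 0.7 else "weak"
    direction = "direct" if r > 0 else "inverse"
    return f"{strength} {direction} association"

# Hypothetical paired data: hours of study and quiz score.
hours = [1, 2, 3, 4, 5]
quiz = [2, 4, 5, 4, 5]

r = pearson_r(hours, quiz)
print(f"r = {r:.2f}: {describe(r)}")
```

For these made-up pairs r comes out near 0.77, which the table labels a strong direct association. Reversing the direction of one variable would flip the sign but not the strength.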

27
Important Notes about r
  • Not a percentage (decimal makes it look like one)
  • Linear assumption: the relationship is assumed
    to be linear, not curvilinear
  • Equal scatter assumption: no bunching
  • Variability affects r
  • The greater the variability, the greater the r
  • The less the variability, the lower the r
  • r does not imply causation
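The note that variability affects r can be illustrated with a small Python sketch. The data are made up: y follows x with a fixed wobble of plus or minus 1.5, and r is computed first on the full range of x and then on only the middle four values. This is one constructed example, not a proof of the general claim.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: y tracks x, alternating 1.5 above and 1.5 below it.
xs = list(range(1, 13))
ys = [x + (1.5 if x % 2 else -1.5) for x in xs]

r_full = pearson_r(xs, ys)

# Restrict the range: keep only the pairs with x between 5 and 8.
middle = [(x, y) for x, y in zip(xs, ys) if 5 <= x <= 8]
r_restricted = pearson_r([x for x, _ in middle], [y for _, y in middle])

print(f"full range of x (1-12): r = {r_full:.2f}")
print(f"restricted range (5-8): r = {r_restricted:.2f}")
```

The underlying relationship is identical in both cases, yet r drops from about 0.91 to about 0.32 once the spread of x is cut down. A low r computed on a narrow group of students should be read with that in mind.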

28
Depth Chart
  • During your YSU field work, you will be asked to
    organize data through the creation of frequency
    tables and histograms. Thus, we discussed
    constructing them as well as understanding them.
  • Throughout your professional practice, you will
    be asked to utilize measures of central tendency
    and variability. Thus, we emphasized
    understanding them, basic computations, and their
    relationship to z-scores. These concepts are key
    to understanding standard scores.
  • In professional publications you will see
    correlation coefficients. We discussed (and you
    were asked to compute) correlation. Correlation
    is a key tool in exploring our next topic:
    reliability (and later, validity).
  • Hopefully you will see value in computing
    measures based on your own classroom data. It is
    actually fun to learn to do these basic
    descriptive stats with a software package.
    Commonly used packages include SPSS, SAS,
    Minitab, and SYSTAT. Any system would be OK.
    Start simple.

29
Terms/Concepts to Review and Study on Your Own (1)
  • bivariate distribution
  • measures of central tendency (mean, median, mode)
  • construct
  • correlation coefficient
  • descriptive statistics
  • frequency distribution
  • histogram
  • inferential statistics

30
Terms/Concepts to Review and Study on Your Own (2)
  • kurtosis
  • normal curve
  • range
  • scattergram
  • skewness
  • standard deviation
  • variability
  • variable
  • z-score