Title: Sections 1.1 and 1.2
1 Sections 1.1 and 1.2
- Displaying and Describing Data
2 - Categorical variables (qualitative)
- Quantitative variables (numerical)
- See book for working with categorical data. We will focus on quantitative data in lecture.
3 Picturing Quantitative Variables
- Stemplot of Heights
- Split Stems
- Line up leaves
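The stemplot bullets above can be sketched in a few lines of Python. This is only an illustration: the `heights` values are made up, the helper name `stemplot` is hypothetical, and the code assumes non-negative whole numbers with the tens digit as the stem and the ones digit as the leaf.

```python
def stemplot(values, split=False):
    """Return stemplot rows; stem = tens digit(s), leaf = ones digit."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        # With split stems, leaves 0-4 go on the first row, 5-9 on the second.
        half = 1 if (split and leaf >= 5) else 0
        rows.setdefault((stem, half), []).append(leaf)

    halves = (0, 1) if split else (0,)
    first, last = min(rows), max(rows)
    lines = []
    for stem in range(first[0], last[0] + 1):
        for half in halves:
            # Keep empty rows between the first and last occupied rows,
            # so gaps in the data stay visible.
            if first <= (stem, half) <= last:
                leaves = " ".join(str(leaf) for leaf in rows.get((stem, half), []))
                lines.append(f"{stem:>2} | {leaves}")
    return lines

# Hypothetical heights in inches (not the class data).
heights = [59, 61, 62, 63, 63, 64, 65, 65, 66, 67, 68, 68, 69, 70, 71, 72, 74]
for line in stemplot(heights, split=True):
    print(line)
```

Because every leaf is a single digit separated by one space, the leaves line up in columns, so the length of each row shows how many observations fall on that stem.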
4 Picturing Quantitative Variables
- Histogram of Heights
- All classes should have same width
- Choosing class widths
- Data on a boundary.
- Height can be count or percentages
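A minimal sketch of the histogram bookkeeping, assuming made-up heights and a hypothetical helper `class_counts`. The slides do not say which class a value on a boundary belongs to; the sketch assumes one common convention, that a boundary value goes in the upper class.

```python
def class_counts(values, start, width, n_classes):
    """Count values in equal-width classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        # Floor division puts a value sitting on a boundary into the upper class.
        k = int((v - start) // width)
        if 0 <= k < n_classes:  # values outside the covered range are ignored
            counts[k] += 1
    return counts

# Hypothetical heights in inches (not the class data).
heights = [59, 61, 62, 63, 63, 64, 65, 65, 66, 67, 68, 68, 69, 70, 71, 72, 74]
start, width = 58, 4
counts = class_counts(heights, start=start, width=width, n_classes=5)
percents = [100 * c / len(heights) for c in counts]

# Bar height can be the count or the percentage; the shape is the same.
for k, (c, p) in enumerate(zip(counts, percents)):
    lo = start + width * k
    print(f"[{lo}, {lo + width}): {'*' * c}  count={c}  percent={p:.1f}")
```

Changing `width` to 2 or 8 and rerunning shows how the choice of class width changes the picture without changing the data.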
5 Describing shape.
- Symmetric.
- Skewed to left or right
- Think: skewed toward the tail (the skew is named for the side with the longer tail).
6 Describing center.
- What is the typical height?
- What is the typical amount of pocket money?
- Two notions of center.
- Median
- Mean
7 Comparing Notions of Center
- For symmetric distributions (like height)
- For skewed distributions (like pocket money)
- For distributions with outliers.
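The three cases above can be checked numerically. The data sets below are invented for illustration; only Python's standard `statistics` module is used.

```python
from statistics import mean, median

symmetric = [64, 65, 66, 67, 68, 69, 70]   # hypothetical heights (inches)
right_skewed = [0, 1, 2, 3, 5, 8, 40]      # hypothetical pocket money (dollars)
with_outlier = symmetric[:-1] + [700]      # same heights, one value mistyped as 700

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("with outlier", with_outlier)]:
    print(f"{name:>13}: mean = {mean(data):7.2f}, median = {median(data):6.2f}")
```

In the symmetric set the mean and median agree. In the right-skewed set the mean is pulled toward the long tail and lands above the median. The single outlier moves the mean a long way while the median does not move at all.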
8 Which of the following is likely to have a mean that is smaller than the median?
a) The salaries of all National Football League players.
b) The scores of students (out of 100 points) on a very easy exam in which most get nearly perfect scores but a few do very poorly.
c) The prices of homes in a large city.
d) The scores of students (out of 100 points) on a very difficult exam in which most get poor scores but a few do very well.
9 Describing spread.
- What is the spread of hours slept?
- What is the spread of pocket money?
- Two notions of spread.
- Five number summary / Boxplot
- Single number summarizing spread: standard deviation (s).
10 Five number summary/Boxplot
- Example (hrs slept)
- Side by side boxplots useful for comparing.
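A sketch of the five number summary (minimum, first quartile, median, third quartile, maximum), which is what a boxplot draws. The hours-slept values are made up, and `five_number_summary` is a hypothetical helper. Quartile conventions differ between textbooks and software; this sketch assumes the quartiles are the medians of the lower and upper halves, leaving out the overall median when the number of observations is odd.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two values."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]              # observations below the median position
    upper = data[half + (n % 2):]    # observations above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical hours slept (not the class data).
hours_slept = [4, 5, 6, 6, 7, 7, 7, 8, 8, 9, 10]
print(five_number_summary(hours_slept))
```

Computing the summary separately for two groups gives the numbers needed for side by side boxplots.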
11 Standard deviation (s)
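The formula on this slide did not survive in the text, so the sketch below assumes the usual sample definition: s is the square root of the sum of squared deviations from the mean, divided by n - 1. The data are made up, and `sample_sd` is a hypothetical helper checked against `statistics.stdev` from the standard library.

```python
from math import sqrt
from statistics import stdev

def sample_sd(values):
    """Sample standard deviation with the n - 1 divisor (assumed definition of s)."""
    n = len(values)
    xbar = sum(values) / n                               # the mean
    squared_devs = [(x - xbar) ** 2 for x in values]     # squared distances from the mean
    return sqrt(sum(squared_devs) / (n - 1))

# Hypothetical hours slept (not the class data).
hours_slept = [4, 5, 6, 6, 7, 7, 7, 8, 8, 9, 10]
print(round(sample_sd(hours_slept), 3))
print(round(stdev(hours_slept), 3))   # library version, should agree
```

Roughly speaking, s measures a typical distance of the observations from their mean, so it is zero only when every value is the same.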
12 - Suppose I gave $20 to every student as they walked into class. How would each of the following numbers change?
- The median of pocket money.
- The mean of pocket money.
- The standard deviation of pocket money.
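One way to explore the question above is to add 20 to every value in a small made-up data set and recompute the three numbers. The pocket money values are hypothetical.

```python
from statistics import mean, median, stdev

pocket_money = [0, 1, 2, 3, 5, 8, 40]           # hypothetical amounts
after_gift = [x + 20 for x in pocket_money]     # everyone receives 20 more

for label, data in [("before", pocket_money), ("after", after_gift)]:
    print(f"{label:>6}: median = {median(data):6.2f}, "
          f"mean = {mean(data):6.2f}, sd = {stdev(data):6.2f}")
```

Adding the same amount to every observation slides the whole distribution over: measures of center move by that amount, while the distances between observations, and so the spread, stay the same.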
13 Summary
- Categorical Variables
- Quantitative Variables
- Histograms and/or Stemplots
- Describe Shape, Center, and Spread
- Shape is often symmetric or skewed.
- Use mean or median for center.
- Use standard deviation or 5 number summary for
spread.