Title: Statistics for Modern Business Decisions
1. Statistics for Modern Business Decisions
- Professor Monnie McGee
- Department of Statistical Science
- Lecture 2: Displaying Data
2. Lecture 2 Highlights
- Topics (PBS Section 1.1)
- Numerical and Categorical Data (review pages 4-5)
- Bar Charts and Pie Charts (pp. 7-9)
- Histograms (pp. 10-18)
- Assignments
- Due ASAP
- Print out, read, and sign the Course Information Document from the course web site (mandatory assignment)
- Try to log in to WebCT before Friday's lab.
- Homework 1 due Friday, September 5, at beginning of lab
- 1.14, 1.18 (part c on following page), 1.28
- Problems begin on page 24
3. Charts for Categorical Variables
- Bar Charts
- Each bar represents a category
- Height of bar corresponds to percentage or frequency of data in that category
- Can display 2 variables simultaneously (called a clustered bar chart)
- Percents don't have to sum to 100
- Best for comparing categories
- Pareto charts are bar charts where the bars are arranged in descending order of frequency from left to right.
- Pie Charts
- Each slice of the pie represents a category
- Size of each slice corresponds to percentage
- Percents must sum to 100
- Best for showing percentage of the whole
5. Distribution of Accidents Involving Firestone Tires in 2000
7. Pareto Chart of Firestone Data
8. Apply Your Knowledge
- In 1999 there were 6023 job-related deaths in the United States. Among these were 807 deaths in agriculture-related jobs, 121 in mining, 1190 in construction, 719 in manufacturing, 1006 in transportation and public utilities, 237 in wholesale trade, 507 in retail trade, 105 in finance-related jobs, 732 in service-related jobs, and 562 in government jobs.
- Find the percent of occupational deaths for each of these job categories, rounded to the nearest whole percent. What percent of job-related deaths were in categories not listed here?
- Make a well-labeled bar graph of the distribution of occupational deaths. Be sure to include the "other occupations" bar.
- Make a well-labeled Pareto chart of these data. What percent of all occupational deaths are accounted for by the first three categories in your Pareto chart?
- Would it also be correct to use a pie chart to display these data? Explain.
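One way to check the arithmetic in this exercise is a short Python sketch. The counts come from the problem statement; the variable names and the "other" label are our own choices, and this is an illustration rather than the textbook's solution.

```python
# Job-related deaths in the United States, 1999 (counts from the exercise).
TOTAL_DEATHS = 6023
deaths = {
    "agriculture": 807,
    "mining": 121,
    "construction": 1190,
    "manufacturing": 719,
    "transportation and public utilities": 1006,
    "wholesale trade": 237,
    "retail trade": 507,
    "finance": 105,
    "services": 732,
    "government": 562,
}

# Deaths in categories not listed: whatever is left of the total.
deaths["other"] = TOTAL_DEATHS - sum(deaths.values())

# Percent of all deaths in each category, rounded to the nearest whole percent.
percents = {job: round(100 * n / TOTAL_DEATHS) for job, n in deaths.items()}

# Pareto order: categories sorted by count, largest first.
pareto = sorted(deaths.items(), key=lambda item: item[1], reverse=True)

# Share of all deaths covered by the first three Pareto bars.
top_three_share = 100 * sum(n for _, n in pareto[:3]) / TOTAL_DEATHS

for job, n in pareto:
    print(f"{job:38s}{n:6d}{percents[job]:5d}%")
print(f"Top three categories: {top_three_share:.1f}% of all deaths")
```

The sorted list `pareto` gives the left-to-right bar order for a Pareto chart; plotting it is left to whatever charting tool is used in lab.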
9. Graphs for Quantitative Variables
- Quantitative variables often take many values
- Makes more sense to group nearby values into intervals
- Histogram
- Displays shape, variability, and outliers
- The height of each bar represents the count or percent of values in an interval.
- Bars must cover the entire range of values of a variable
- No space between bars unless an interval is empty
- No right way to choose number of intervals; use best judgment to display shape of the data
10. Frequency Distributions for Numerical Data
- Class Intervals
- Select the Number of Classes (approx. 5-15)
- Use a Common Width for Each Class (when possible)
- Make Classes Mutually Exclusive and Exhaustive (Each Data Value Can be Placed in One and Only One Class)
Width of Interval = (Largest - Smallest) / Number of Classes
Always round up, don't just round!!
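The class-width rule on this slide can be sketched in Python as follows. The function names and the sample values are hypothetical, chosen only for illustration (they are not Table 1.2 or any other data from the lecture), and starting the first class at the smallest value is one assumed convention among several.

```python
import math

def class_width(smallest, largest, num_classes):
    """Width of interval = (largest - smallest) / number of classes, rounded up."""
    return math.ceil((largest - smallest) / num_classes)

def frequency_distribution(values, num_classes):
    """Count how many values fall in each of num_classes equal-width classes.

    Assumes the values are not all identical, so the width is positive.
    """
    smallest, largest = min(values), max(values)
    width = class_width(smallest, largest, num_classes)
    counts = [0] * num_classes
    for x in values:
        # Each value lands in exactly one class; the largest value is kept
        # in the last class if it sits exactly on the upper boundary.
        index = min(int((x - smallest) // width), num_classes - 1)
        counts[index] += 1
    # Report each class as (lower limit, upper limit, count).
    return [(smallest + i * width, smallest + (i + 1) * width, counts[i])
            for i in range(num_classes)]

# Hypothetical values, made up for this illustration only.
sample = [21, 23, 24, 24, 26, 27, 27, 28, 30, 33]
for lower, upper, count in frequency_distribution(sample, num_classes=5):
    print(f"{lower} to under {upper}: {count}")
```

Rounding the width up rather than to the nearest value guarantees that the classes together reach at least as far as the largest observation, so the classes stay exhaustive.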
14. Examining a Distribution (p. 14)
- Look for the overall pattern
- Look for striking deviations from that pattern
- Overall pattern of histogram is described by its shape, center, and spread
- Shape: symmetric, right-skewed, or left-skewed
- Center: value which divides the data values approximately in half (or where the histogram will balance)
- Spread: largest and smallest values
- Outliers are individual values that deviate from the overall pattern
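As a rough sketch of the center and spread descriptions above, the median can serve as the value that divides the data approximately in half, and the smallest and largest values give the spread. The helper name `summarize` and the sample values are hypothetical, made up for illustration.

```python
from statistics import median

def summarize(values):
    """Return the center (median) and spread (smallest, largest) of the data."""
    return {
        "center": median(values),
        "smallest": min(values),
        "largest": max(values),
    }

# Hypothetical values, made up for this illustration only.
sample = [21, 23, 24, 24, 26, 27, 27, 28, 30, 33]
print(summarize(sample))
```

Shape and outliers are still best judged by eye from the histogram; this summary only covers center and spread.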
15. Creating a Histogram
- EPA regulations require automakers to give the city and highway gas mileages for each model of car. Table 1.2 gives the highway mileages for 32 midsize 2001 model year cars.
- Make a histogram of the highway mileages of these cars
- Describe the main features of the distribution of highway mileage
- The government imposes a "gas guzzler" tax on cars with low gas mileage. Do you think that any of these cars are subject to the tax?
17. Average Salaries for 30 Major League Baseball Teams on Opening Day of the 2000 Season
18. Distribution of Individual Salaries of Cincinnati Reds Players on Opening Day of the 2000 Season
19. Distribution of Monthly Returns for all U.S. Common Stocks from 1951-2000
20. Variables Measured Over Time
- Cross-sectional data: information concerning a group of individuals at one time
- Time series: measurements of a variable taken at regular intervals over time
- Time plot: plot each observation against the time at which it was measured
- Time is on horizontal axis, variable on vertical
- Connect data points to emphasize changes over time
- Overall Patterns
- Seasonal variation: a pattern that repeats itself at known regular intervals of time
- Trend: persistent, long-term rise or fall
- Cycles: irregular but clear up and down movements
21. Annual Orange Price Index, 1991-2001
22. Index Numbers
- Note on vertical axis of previous slide (1982-84 = 100)
- Index number gives current value as a percentage of a base value (CPI)
- Based on survey of prices of goods and services collected by the government
- Nationwide average price that is less variable than prices at individual stores
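The index-number definition above amounts to a single division. The sketch below uses a hypothetical function name and made-up prices; they are not CPI figures.

```python
def index_number(current_value, base_value):
    """Current value expressed as a percentage of the base value."""
    return 100 * current_value / base_value

# Hypothetical prices, made up for this illustration only.
base_price = 2.00     # average price in the base period (index = 100)
current_price = 2.50  # price in the period of interest
print(index_number(current_price, base_price))  # 125.0
```

An index of 125 means the current value is 25 percent above the base value. The base period itself always has an index of 100, which is what an axis note such as "1982-84 = 100" expresses.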
23. Seasonally Adjusted Monthly Unemployment Rate, January 1990-August 2001