Stat 501 - PowerPoint PPT Presentation

Slides: 36
Provided by: statP

Transcript and Presenter's Notes


1
Stat 501
  • Experimental Statistics I

2
Handouts
  • Print off the following
  • Syllabus
  • Schedule
  • As we go along
  • Posted lectures
  • Homeworks
  • Other handouts and review topics

3
About SAS
  • Read the Introduction to SAS handout if desired.
  • intro.sas is the file with the code.
  • Copies of SAS to install on your home computer are
    available in STEW G65; their phone number is 494-5100.
  • SAS is available online at
    www.itap.purdue.edu/learning/
  • → Software Remote
  • E-mail me with SAS coding questions:
    colvertn_at_stat.purdue.edu
  • Include relevant SAS code in the body of the
    email; attachments don't always work.
  • Usually the Stat department provides SAS help
    sessions
  • Wednesdays 7-9pm in SC289.
  • Subject to change from semester to semester

4
Data, Data, Data, all around us !
  • We use data to answer research questions
  • What evidence does data provide?
  • Example 1
  • How do I make sense of these numbers without some
    meaningful summary?

5
Example 2
  • Study to assess the effect of exercise on
    cholesterol levels. One group exercises and the
    other does not. Is cholesterol reduced in the
    exercise group?
  • people have naturally different levels
  • respond differently to same amount of exercise
    (e.g. genetics)
  • may vary in adherence to exercise regimen
  • diet may have an effect
  • exercise may affect other factors (e.g. appetite,
    energy, schedule)

6
What is statistics?
  • Recognize the randomness, the variability in
    data.
  • The science of understanding data and making
    decisions in the face of variability
  • Design the study
  • Analyze the collected data
  • Discover what the data are telling you

7
Section 1.1
  • Displaying Distributions with Graphs

8
Individuals and Variables
  • Individuals: objects described by a set of data
  • People, animals, things
  • Variable: a characteristic of an individual; takes
    different values for different individuals.
  • The three questions to ask
  • Why: what is the purpose of the study?
  • Who: which individuals are in the sample, and how many?
  • What: what did we measure (the variables) and in
    what units?

9
Types of Variables
  • Variables are either quantitative or categorical
  • Quantitative: continuous or discrete
  • Categorical: ordinal or not ordinal

10
Types of variables
  • Categorical: outcomes fall into categories
  • Ordinal (ordered)
  • How often do you exercise?
  • Never, rarely, occasionally, often, every day
  • Not ordinal (nominal, unordered)
  • Gender, race, political party affiliation

11
Types of variables
  • Quantitative: outcome is a number
  • Continuous: can take any value within an interval
  • Examples: height, weight, distance
  • Discrete: possible values are separated from each
    other
  • Integers are the most common example
  • Examples: number of phone calls made every week,
    number of accidents on SR 26, number of students
    getting an A in Stat 501 this Fall
  • Arithmetic operations (like taking the mean) are
    meaningful

12
Some Student Data
13
Exploratory data analysis - graphs
  • We begin by examining each variable by itself.
  • Definition
  • Distribution: the possible values of a variable and
    how often they occur

14
Categorical Example
15
Graphical tools for categorical data
  • Bar graphs
  • Heights of the columns represent the counts in
    each category
  • Pie chart
  • Shows what part (or percent) of the whole falls in
    each category.
  • Bar charts are more flexible
  • For a pie chart you have to know all the categories
    forming the whole.
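
The counts behind a bar graph and the shares behind a pie chart can be tabulated in a few lines. The following is only a minimal sketch: the survey answers are hypothetical (they are not from the slides), and `text_bar_chart` is an illustrative helper name, not part of any course material.

```python
from collections import Counter

# Hypothetical answers to "How often do you exercise?" (made up for illustration).
responses = ["never", "often", "rarely", "often", "every day",
             "occasionally", "often", "rarely", "never", "often"]

# An ordinal variable keeps its natural category order on the axis.
order = ["never", "rarely", "occasionally", "often", "every day"]

counts = Counter(responses)


def text_bar_chart(counts, order):
    """Return one text row per category: a bar of '#' marks, the count, and the percent."""
    total = sum(counts.values())
    rows = []
    for category in order:
        n = counts.get(category, 0)
        share = 100 * n / total  # the pie-chart share of this category
        rows.append(f"{category:<12} {'#' * n:<5} {n} ({share:.0f}%)")
    return rows


for row in text_bar_chart(counts, order):
    print(row)
```

The percentages only make sense because every category forming the whole is listed, which is the pie-chart requirement noted above; the raw counts work for a bar graph either way.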

16
Bar Graph
17
Pie Chart
18
Quantitative Example
  • Breaking strength of connections for electronic
    components
  • Need to discuss variation
  • How to group these items with so many different
    values?

19
Tool 1: Stemplot
  • Stem: all but the final digit (could be
    multiple digits)
  • Leaf: the final digit (always a single digit)
  • Example
  • Numbers of home runs that Hank Aaron hit in each
    of his 23 years in the Major Leagues
  • 13 27 26 44 30 39 40 34 45 44 24 32 44 39 29 44 38
    47 34 40 20 12 10

20
  • Step 1: Identify all the stems
  • 1 2 3 4
  • Step 2: Write the stems in increasing order
    (usually from top to bottom)
  • 1
  • 2
  • 3
  • 4

21
  • Step 3: Draw a line next to the stems and write
    the leaves against each stem
  • 1 | 3 2 0
  • 2 | 7 6 4 9 0
  • 3 | 0 9 4 2 9 8 4
  • 4 | 4 0 5 4 4 4 7 0

22
  • Step 4: Rewrite the stemplot, rearranging the
    leaves in ascending order (this can be done
    simultaneously with step 3)
  • 1 | 0 2 3
  • 2 | 0 4 6 7 9
  • 3 | 0 2 4 4 8 9 9
  • 4 | 0 0 4 4 4 4 5 7
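
The four steps can be sketched in Python as a check on the hand-built plot. This is a minimal sketch assuming non-negative whole numbers; the function name `stemplot` is illustrative, and the data are the Hank Aaron home-run counts from the example.

```python
from collections import defaultdict

# Hank Aaron's home runs in each of his 23 seasons (from the example).
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32,
         44, 39, 29, 44, 38, 47, 34, 40, 20, 12, 10]


def stemplot(values):
    """Return stemplot rows: stem = all but the final digit, leaf = final digit."""
    leaves = defaultdict(list)
    for v in sorted(values):          # sorting first puts leaves in ascending order
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    rows = []
    # List every stem between the smallest and largest so gaps stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem} | " + " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(row.rstrip())
    return rows


for row in stemplot(aaron):
    print(row)
```

Sorting before splitting into stems and leaves combines steps 3 and 4, as the slide suggests can be done by hand.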

23
Back-to-Back stemplot
  • Compare Hank Aaron's numbers with Barry Bonds's
  • Bonds: 5 16 19 24 25 25 26 28 33 33 34 34 37 37 40 42
    45 45 46 46 49 73

          Aaron |   | Bonds
                | 0 | 5
          3 2 0 | 1 | 6 9
      9 7 6 4 0 | 2 | 4 5 5 6 8
  9 9 8 4 4 2 0 | 3 | 3 3 4 4 7 7
7 5 4 4 4 4 0 0 | 4 | 0 2 5 5 6 6 9
                | 5 |
                | 6 |
                | 7 | 3
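
A back-to-back stemplot shares one column of stems between two data sets, with the left-hand leaves written outward from the stem. The sketch below is one way to build it, assuming non-negative whole numbers; `leaves_by_stem` and `back_to_back` are illustrative names.

```python
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32,
         44, 39, 29, 44, 38, 47, 34, 40, 20, 12, 10]
bonds = [5, 16, 19, 24, 25, 25, 26, 28, 33, 33, 34, 34, 37, 37,
         40, 42, 45, 45, 46, 46, 49, 73]


def leaves_by_stem(values):
    """Map each stem (all but the final digit) to its sorted list of leaves."""
    table = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        table.setdefault(stem, []).append(leaf)
    return table


def back_to_back(left, right):
    """Return rows of the form 'left leaves | stem | right leaves'."""
    left_t, right_t = leaves_by_stem(left), leaves_by_stem(right)
    lo = min(min(left_t), min(right_t))
    hi = max(max(left_t), max(right_t))
    stems = range(lo, hi + 1)
    # Left leaves are reversed so they increase moving away from the stem.
    left_strs = {s: " ".join(str(d) for d in reversed(left_t.get(s, [])))
                 for s in stems}
    width = max(len(text) for text in left_strs.values())
    rows = []
    for s in stems:
        right_str = " ".join(str(d) for d in right_t.get(s, []))
        rows.append(f"{left_strs[s]:>{width}} | {s} | {right_str}".rstrip())
    return rows


for row in back_to_back(aaron, bonds):
    print(row)
```

Keeping the empty stems 5 and 6 in the output is what makes Bonds's 73-home-run season stand apart from the rest of his values.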
24
Examining distributions
  • Describe the pattern
  • Shape
  • How many modes (peaks)?
  • Symmetric or skewed in one direction?
  • Center: where is the midpoint?
  • Mean (average), median
  • Spread
  • Range between the smallest and the largest
    values, standard deviation, 5-number summary,
    quartiles
  • Look for outliers: individual values that do not
    match the overall pattern.
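
The measures of center and spread listed above can be computed with Python's standard `statistics` module. This is a sketch using the Hank Aaron data from the stemplot example; note that quartile conventions differ between textbooks and software, so the quartiles here (Python's default "exclusive" method) may not match a hand calculation exactly.

```python
import statistics

# Hank Aaron's home runs per season (from the stemplot example).
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32,
         44, 39, 29, 44, 38, 47, 34, 40, 20, 12, 10]

# Center
mean = statistics.mean(aaron)
median = statistics.median(aaron)

# Spread
data_range = max(aaron) - min(aaron)
stdev = statistics.stdev(aaron)                 # sample standard deviation
q1, _, q3 = statistics.quantiles(aaron, n=4)    # quartile cut points

# 5-number summary: minimum, Q1, median, Q3, maximum
five_number = (min(aaron), q1, median, q3, max(aaron))

print(f"mean = {mean:.2f}, median = {median}")
print(f"range = {data_range}, standard deviation = {stdev:.2f}")
print("5-number summary:", five_number)
```

Here the mean falls below the median, which is consistent with the few low seasons (10, 12, 13) visible in the stemplot.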

25
Histograms
26
Frequency Table
28
What do you see?
  • Shape: somewhat symmetric, unimodal
  • Center: about 110 or 115
  • Spread: values between 80 and 150
  • Remember!
  • Histograms are only meaningful for quantitative data
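
A histogram is drawn from a frequency table: split the range into equal-width classes and count the values in each. The data behind the histogram on these slides are not in the transcript, so this sketch reuses the Hank Aaron home-run data with an assumed class width of 10; `frequency_table` is an illustrative name.

```python
from collections import Counter

# Hank Aaron's home runs per season (reused from the stemplot example).
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32,
         44, 39, 29, 44, 38, 47, 34, 40, 20, 12, 10]


def frequency_table(values, width):
    """Count values per class [lower, lower + width), including empty classes."""
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    return [(lower, lower + width, counts.get(lower, 0))
            for lower in range(lowest, highest + width, width)]


# Print the table with a row of stars per class as a sideways histogram.
for lower, upper, n in frequency_table(aaron, 10):
    print(f"{lower:>3} to <{upper:<3} {'*' * n} ({n})")
```

With a class width of 10 the counts match the stemplot rows; a different width regroups the same values and can change the apparent shape.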

29
Dealing with outliers
30
Outliers
  • Check for recording errors
  • Violation of experimental conditions
  • Discard it only if there is a valid practical or
    statistical reason, not blindly!

31
Time plots
32
Time plots
33
Time Series or Time plots
  • We care about two important parts
  • Trend: a persistent, long-term rise or fall
  • Seasonal variation: a pattern that repeats
    itself at known regular intervals of time.
  • Mississippi data
  • Increasing trend
  • Large seasonal variations: there is usually a
    large spike every few years
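
One simple way to see a trend underneath seasonal variation is to average over one full seasonal cycle. This technique is not on the slides; the sketch below uses a made-up quarterly series (not the Mississippi data), and `moving_average` is an illustrative name.

```python
# Hypothetical quarterly measurements: the same up-and-down pattern repeats
# every 4 quarters while the overall level slowly rises.
series = [10, 14, 8, 12,
          12, 16, 10, 14,
          14, 18, 12, 16]


def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# A window equal to the seasonal period (4 quarters) averages out the
# seasonal pattern, leaving the long-term rise.
trend = moving_average(series, 4)
print(trend)
```

In this made-up series the averaged values rise steadily, which is the trend, while the raw values keep swinging within each year, which is the seasonal variation.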

34
Example: Gasoline Price Data
35
Summary
  • Categorical and Quantitative variables
  • Graphical tools for categorical variables
  • Bar Chart
  • Pie Chart
  • Graphical tools for quantitative variables
  • Stem and leaf plot
  • Histogram
  • Maybe a time plot, if appropriate
  • Distributions
  • Describe: shape, center, spread
  • Watch for patterns and/or deviations from
    patterns.