Statistical Analysis

Slides: 77
Provided by: david529
1
Session VI: Statistical Analysis
2
Manufacturing Scenario
  • Aluminum castings
  • Important factor: hardness
  • Measured in Brinell units (HB)
  • Possibly affected by:
  • Machine and/or operator
  • Chemistry (iron, zinc, manganese)
  • Physics (pressure, temperature)
  • Minimum acceptable hardness is 70 HB

3
Design of Experiments
  • Many possible types of designs (random, blocked,
    Latin square, etc.)
  • Should be driven by a theory or hypothesis
  • Make sure that, if the hypothesized effect is in
    fact present:
  • the design used has a good chance of detecting
    it (small chance of Type II error; more on that
    later), and
  • there will be no other reasonable explanation (a
    key driver of DOE)

4
Design of Experiments
  • Include measurement of factors you would like to
    test for
  • No need for independent variables that do not
    vary
  • Matched pairs > independent samples (prefer
    matched pairs where possible)

5
Design of Experiments
  • One possible approach for the aluminum problem

6
  • Process is highly variable
  • Some castings do not meet the 70 HB target

7
Inferential Statistics
  • Estimation
  • Confidence Intervals
  • Sample Size Determination
  • Design of Experiments
  • Hypothesis Testing
  • Classical Method
  • p-Values
  • Analysis of Variance
  • Regression Analysis
  • Analysis of Variance in Regression
  • High Level Measures
  • Hypothesis Tests for Independent Variables

8
Estimation
  • Fundamental difference between Probability and
    Statistics
  • Probability is making an inference about an
    unknown sample from a known population (useful
    for developing theory)
  • Statistics is making an inference about an
    unknown population from a known sample (useful in
    the real world)
  • Estimation is a statistical tool that uses sample
    data to make a probabilistic statement about an
    unknown population parameter, such as:
  • Mean
  • Variance
  • Proportion
  • Differences between Parameters

9
Estimation: Confidence Intervals
General form of a confidence interval:
(Measure of Central Tendency) ± (Number of Standard
Errors) × (Measure of Dispersion)
  • Measure of central tendency: sample mean, sample
    proportion, etc.
  • Number of standard errors: usually z or t
  • Measure of dispersion: standard error of the
    mean, etc.
10
Estimation: Confidence Intervals
11
Example: Confidence Interval for Population Mean
We are 95% confident that the true population
mean is between 69.75 and 84.27 HB.
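The interval above can be reproduced from summary statistics. The following is a minimal sketch, not the presenter's own calculation. It assumes n = 9 castings and a sample standard deviation of 9.441 HB (both quoted elsewhere in the deck). The sample mean of 77.01 HB is not given in the transcript and is inferred here as the midpoint of the reported interval. The critical value 2.306 is the standard t-table entry for 8 degrees of freedom.

```python
import math

# Summary statistics. n and the standard deviation come from the slides;
# the sample mean is an ASSUMPTION, taken as the midpoint of the reported
# interval (69.75, 84.27), because the transcript does not state it.
n = 9
sample_mean = 77.01
sample_sd = 9.441

# Two-sided 95% critical value of t with n - 1 = 8 degrees of freedom
# (standard t-table value).
t_crit = 2.306

# Central tendency +/- (number of standard errors) * (standard error)
standard_error = sample_sd / math.sqrt(n)
margin = t_crit * standard_error
lower = sample_mean - margin
upper = sample_mean + margin

print(f"standard error = {standard_error:.3f} HB")
print(f"95% CI = ({lower:.2f}, {upper:.2f}) HB")
```

With these inputs the half-width is about 7.26 HB, which matches the roughly 15 HB wide interval discussed on the next slide.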
12
Estimation: Sample Size
Our sample of 9 castings has a confidence
interval 15 HB wide, maybe too wide for
managerial decision making. How many observations
would we need for a 95% confidence interval within
1 HB?
13
Estimation: Sample Size
14
Estimation: Sample Size
  • We would need 343 observations (assuming the
    standard deviation is no more than 9.441 HB).
  • A slightly different formula applies to
    proportions
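The figure of 343 observations follows from the usual sample-size formula for a mean, n = (z·s / E)², rounded up. This sketch reads "within 1 HB" as a margin of error E of ±1 HB and uses the normal (z) critical value, which is an assumption on my part. The transcript does not show the formula the presenter used.

```python
import math
from statistics import NormalDist

planning_sd = 9.441   # HB, standard deviation quoted on the slide
margin = 1.0          # HB, desired half-width (assumed meaning of "within 1 HB")
confidence = 0.95

# Two-sided z critical value, about 1.96 for 95% confidence.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Round up: a fractional observation cannot be collected.
n_required = math.ceil((z * planning_sd / margin) ** 2)

print(f"z = {z:.3f}")
print(f"required sample size = {n_required}")
```

Because the margin enters squared, halving it quadruples the required sample size.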

15
Hypothesis Testing: Classical Method
16
Hypothesis Testing: Classical Method
17
Example: Step 1
18
Example: Step 2
19
Example: Step 3
20
Example: Step 3
t distribution centered on 80 HB
5% probability in lower tail
Critical value: 1.86 standard errors
below hypothesized mean
21
Example: Step 4
22
Example: Step 4
Observed value: 0.95 standard errors
below hypothesized mean
Critical value: 1.86 standard errors
below hypothesized mean
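The comparison in Step 4 can be sketched as a lower-tailed one-sample t test. The hypothesized mean (80 HB), the critical value (1.86 standard errors) and the standard deviation (9.441 HB) appear in the deck. The sample mean of 77.01 HB is an assumption, inferred as the midpoint of the 69.75 to 84.27 HB interval reported earlier.

```python
import math

n = 9
sample_mean = 77.01        # ASSUMPTION: midpoint of the reported 95% interval
sample_sd = 9.441          # HB, from the slides
hypothesized_mean = 80.0   # HB, centre of the null distribution

# Lower-tail 5% critical value of t with 8 degrees of freedom (from the slide).
t_crit = -1.86

# Test statistic: distance from the hypothesized mean in standard errors.
standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - hypothesized_mean) / standard_error

# Reject only if the observed statistic falls below the critical value.
reject_null = t_stat < t_crit

print(f"t statistic = {t_stat:.2f}")
print("reject H0" if reject_null else "do not reject H0")
```

The observed value lies well inside the critical value, so the classical method does not reject the hypothesis that the mean is 80 HB.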
23
(No Transcript)
24
(No Transcript)
25
Hypothesis Testing: p-values
  • Note that the classical method only yields a
    "reject" or "do not reject" decision
  • Not helpful in situations where different people
    have different tolerances for Type I Error risk
  • We would like to know:
  • How far from the hypothesized value was it, in
    standardized terms? (provided by the test
    statistic)
  • How unlikely would this result be, if the null
    hypothesis were true? (provided by the p-value)

26
If the null hypothesis is true, we would see a
sample mean this low or lower 18.5% of the time.
27
Hypothesis Testing: p-values
  • English translation: If the true mean were really
    80 HB, we would see a sample mean this far below
    80 or farther 18.5% of the time.
  • Since our alpha is 5%, we don't consider this to
    be strong evidence against the null hypothesis
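The 18.5% figure is the lower-tail area of a t distribution with 8 degrees of freedom below the observed statistic of -0.95. The sketch below computes it with the standard library only. The helper names `t_pdf` and `t_cdf` are illustrative, and the numerical integration (Simpson's rule) is my choice, not necessarily how the presenter obtained the value.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

t_stat = -0.95     # observed value from the slides, in standard errors
df = 8             # n - 1 for a sample of 9 castings
alpha = 0.05

p_value = t_cdf(t_stat, df)            # lower-tailed test
print(f"p-value = {p_value:.3f}")
print("strong evidence against H0" if p_value < alpha else "not strong evidence against H0")
```

A reader with a different tolerance for Type I error can compare the same p-value with their own alpha, which the reject / do-not-reject decision alone does not allow.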

28
Hypothesis Testing: Type II Errors
  • Any time we fail to reject, we might be
    committing a Type II Error.
  • In this case, maybe the true mean is less than 80
    and our sample didn't provide enough information
    for us to realize this.
  • What if the true mean had drifted down to 75 HB?
    Would our test be able to detect this shift?

29
Hypothesis Testing: Type II Errors
Hypothesized distribution centered on 80 HB
True distribution centered on 75 HB
Critical value: 1.86 standard errors below
hypothesized mean, 0.27 standard errors below
true mean of 75 HB
30
Hypothesis Testing: Type II Errors
60.3% chance of not rejecting a false hypothesis!
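The 60.3% figure can be reproduced by following the picture on the previous slide: find the critical value in HB, restate it in standard errors from the true mean of 75 HB, and take the area above it. This sketch treats the true distribution as an ordinary t distribution shifted to 75 HB, as the slides do. A stricter treatment would use the noncentral t distribution. The helper names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

n, sample_sd = 9, 9.441            # from the slides
hypothesized_mean, true_mean = 80.0, 75.0
t_crit = 1.86                      # standard errors below the hypothesized mean

standard_error = sample_sd / math.sqrt(n)
critical_hb = hypothesized_mean - t_crit * standard_error   # rejection cutoff in HB

# The same cutoff, measured in standard errors from the true mean.
shift = (critical_hb - true_mean) / standard_error

# Type II error: the sample mean lands above the cutoff, so we fail to reject.
beta = 1 - t_cdf(shift, n - 1)

print(f"cutoff = {critical_hb:.2f} HB ({shift:.2f} standard errors from 75 HB)")
print(f"beta = {beta:.3f}, power = {1 - beta:.3f}")
```

With only nine castings, the test has under a 40% chance of detecting a 5 HB drop in mean hardness.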
31
Hypothesis Testing: Analysis of Variance
  • Useful for testing for differences between more
    than two means
  • The F test, named for Fisher

32
The F Distribution
33
The F Distribution
  • Has only one tail; can't be negative
  • Central to ANOVA and regression analysis
  • Based on the ratio of explained to unexplained
    variability
  • Two degrees-of-freedom numbers (numerator and
    denominator)

34
The F Test
  • Null Hypothesis: Three types of machines produce
    aluminum castings with equal mean hardness.
  • Alternative Hypothesis: At least one of the
    machines produces aluminum castings with mean
    hardness not equal to the others.
  • Test Statistic: F
  • Decision Rule: Critical value depends on
    numerator and denominator degrees of freedom, and
    our acceptable risk of Type I Error.
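The F statistic for this test is the ratio of between-machine variability to within-machine variability, each divided by its degrees of freedom. The sketch below works through that arithmetic. The hardness readings and machine names are invented for illustration and are not the data behind the slides. The critical value 5.14 is the standard table entry for alpha = 0.05 with 2 and 6 degrees of freedom.

```python
# HYPOTHETICAL hardness readings (HB) for three machines, invented for
# illustration only.
groups = {
    "machine_a": [78, 82, 80],
    "machine_b": [70, 72, 74],
    "machine_c": [84, 86, 82],
}

def mean(values):
    return sum(values) / len(values)

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Explained variability: group means around the grand mean.
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
# Unexplained variability: observations around their own group mean.
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)

df_between = len(groups) - 1                 # numerator degrees of freedom
df_within = len(all_values) - len(groups)    # denominator degrees of freedom

f_stat = (ss_between / df_between) / (ss_within / df_within)
f_crit = 5.14   # F table value for alpha = 0.05, df = (2, 6)

print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
print("reject H0" if f_stat > f_crit else "do not reject H0")
```

For these made-up readings the group means differ far more than the scatter within each machine, so the equal-means hypothesis is rejected.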

35
One-way ANOVA
36
One-way ANOVA
37
One-way ANOVA
38
One-way ANOVA
39
One-way ANOVA
40
One-way ANOVA
41
One-way ANOVA
42
One-way ANOVA
43
One-way ANOVA
44
One-way ANOVA
45
One-way ANOVA
46
One-way ANOVA
47
One-way ANOVA
48
One-way ANOVA
49
One-way ANOVA
Tools > Data Analysis (need to have the Analysis
ToolPak installed: Tools > Add-Ins)
50
One-way ANOVA
51
One-way ANOVA
52
Two-way ANOVA
53
Two-way ANOVA
54
ANOVA
  • Advantages:
  • Good for qualitative (categorical) data
  • Can easily handle multiple categories
  • Flexible in terms of sample sizes
  • Disadvantages:
  • Not especially useful for continuous data (or
    discrete data with many possible values)
  • Most ANOVA procedures can be done equivalently
    using regression, but the reverse is not true

55
Regression Analysis
56
Regression Analysis
57
Correlation Analysis
58
Correlation Analysis
59
Regression Analysis
60
Regression Analysis
61
Regression Analysis
62
Regression Analysis
63
Regression Analysis
64
Regression Analysis: ANOVA
65
Regression Analysis: High-level Measures
66
Regression Analysis: Tests for Independent Variables
67
Model Building
  • Enter:
  • Start with one independent variable (logically,
    the one with the strongest correlation with the
    dependent variable)
  • Add new independent variables one by one, in
    order of correlation strength with the dependent
    variable
  • Try to maximize adjusted R-squared
  • Remove:
  • Start with all possible independent variables
  • Remove independent variables one by one
    (logically, on the basis of highest p-value)
  • Try to maximize adjusted R-squared
  • In both procedures:
  • Watch out for multicollinearity
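Both procedures use adjusted R-squared as the target because plain R-squared never decreases when a variable is added. Adjusted R-squared charges a penalty for each extra predictor: 1 - (1 - R²)(n - 1)/(n - k - 1). The sketch below shows that penalty at work. The sample size and the R-squared values are hypothetical and are not taken from the models in the slides.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-squared for a model with an intercept and n_predictors variables."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# HYPOTHETICAL comparison: adding a third predictor raises R-squared only
# slightly, from 0.80 to 0.81, on a small sample.
n_obs = 9
two_var = adjusted_r_squared(0.80, n_obs, n_predictors=2)
three_var = adjusted_r_squared(0.81, n_obs, n_predictors=3)

print(f"2 variables: adjusted R-squared = {two_var:.3f}")
print(f"3 variables: adjusted R-squared = {three_var:.3f}")
print("keep the extra variable" if three_var > two_var else "drop the extra variable")
```

With few observations the penalty is steep, so a variable has to explain a meaningful share of the remaining variability to earn its place.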

68
7 Variables
69
6 Variables
70
5 Variables
71
4 Variables (a)
72
4 Variables (b)
73
4 Variables (c)
74
Summary of Six Models
75
Regression Analysis
  • Problems with these data:
  • Multicollinearity between zinc and machine types
  • Too few degrees of freedom (because of sample
    size)
  • Possible Type I and Type II errors

76
Conclusions?
  • Preliminary Findings:
  • Machines matter (Burugula machine really sucks)
  • Possible interactions between operators and
    machines
  • Pressure also seems to matter
  • Zinc?
  • Next Steps:
  • Look for root causes of low pressure
  • Study best practices of operators
  • Collect more data
  • What's the deal with Zinc?
  • DOE to avoid problems with multicollinearity
  • Test theories about pressure
  • Test theories about operator/machine interactions