Design of Engineering Experiments Chapter 2 - PowerPoint PPT Presentation

Slides: 34
Provided by: Preferr433

Transcript and Presenter's Notes

1
Design of Engineering Experiments, Chapter 2
Some Basic Statistical Concepts
  • Describing sample data
  • Random samples
  • Sample mean, variance, standard deviation
  • Populations versus samples
  • Population mean, variance, standard deviation
  • Estimating parameters
  • Simple comparative experiments
  • The hypothesis testing framework
  • The two-sample t-test
  • Checking assumptions, validity

2
Portland Cement Formulation (page 24)
3
Graphical View of the Data: Dot Diagram, Fig. 2.1,
p. 24
4
If you have a large sample, a histogram may be
useful
5
Box Plots, Fig. 2.3, p. 26
6
The Hypothesis Testing Framework
  • Statistical hypothesis testing is a useful
    framework for many experimental situations
  • Origins of the methodology date from the early
    1900s
  • We will use a procedure known as the two-sample
    t-test

7
The Hypothesis Testing Framework
  • Sampling from a normal distribution
  • Statistical hypotheses

8
Estimation of Parameters
9
Summary Statistics (pg. 36)
Formulation 2 Original recipe
Formulation 1 New recipe
10
How the Two-Sample t-Test Works
11
How the Two-Sample t-Test Works
12
How the Two-Sample t-Test Works
  • Values of t0 that are near zero are consistent
    with the null hypothesis
  • Values of t0 that are very different from zero
    are consistent with the alternative hypothesis
  • t0 is a distance measure: how far apart the
    averages are, expressed in standard deviation
    units
  • Notice the interpretation of t0 as a
    signal-to-noise ratio
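
The signal-to-noise reading of t0 can be sketched in R as below. The data vectors are an assumption: they are not part of this transcript and are taken to be the textbook's Portland cement observations, chosen because they reproduce the sample means shown on the later slides. The names m and u mirror the columns used in the R slides; signal, noise and sp2 are illustrative names.

```r
# Assumed data (not in this transcript): bond strength for the two formulations.
m <- c(16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57)  # Formulation 1 (new recipe)
u <- c(16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27)  # Formulation 2 (original recipe)

n1 <- length(m)
n2 <- length(u)

# Signal: difference between the two sample averages
signal <- mean(m) - mean(u)

# Noise: pooled standard deviation scaled to the standard error of the difference
sp2   <- ((n1 - 1) * var(m) + (n2 - 1) * var(u)) / (n1 + n2 - 2)
noise <- sqrt(sp2) * sqrt(1 / n1 + 1 / n2)

# Test statistic: signal divided by noise
t0 <- signal / noise
print(round(c(signal = signal, noise = noise, t0 = t0), 4))
```

A value of t0 far from zero means the difference in averages is large relative to its standard error.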

13
The Two-Sample (Pooled) t-Test
14
William Sealy Gosset (1876-1937)
Gosset's interest in barley cultivation led him
to speculate that design of experiments should
aim not only at improving the average yield, but
also at breeding varieties whose yield was
insensitive (robust) to variation in soil and
climate. He developed the t-test (1908). Gosset was
a friend of both Karl Pearson and R.A. Fisher, an
achievement, for each had a monumental ego and a
loathing for the other. Gosset was a modest man
who cut short an admirer with the comment that
Fisher would have discovered it all anyway.
15
The Two-Sample (Pooled) t-Test
  • So far, we haven't really done any statistics
  • We need an objective basis for deciding how large
    the test statistic t0 really is
  • In 1908, W. S. Gosset derived the reference
    distribution for t0, called the t distribution
  • Tables of the t distribution: see textbook
    appendix

t0 = -2.20
16
The Two-Sample (Pooled) t-Test
  • A value of t0 between -2.101 and 2.101 is
    consistent with equality of means
  • It is possible for the means to be equal and t0
    to exceed 2.101 or fall below -2.101, but it
    would be a rare event, which leads to the
    conclusion that the means are different
  • Could also use the P-value approach

t0 = -2.20
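
The cutoff 2.101 is the upper 2.5% point of the t distribution with 18 degrees of freedom. A minimal sketch of the decision rule, assuming a two-sided test at the 5% level and n1 = n2 = 10 (consistent with the degrees of freedom reported in the R output on the later slides); the variable names are illustrative:

```r
alpha <- 0.05    # two-sided significance level (assumed)
df    <- 18      # n1 + n2 - 2 = 10 + 10 - 2
t0    <- -2.20   # test statistic quoted on the slide

# Upper alpha/2 percentage point of the t distribution
t_crit <- qt(1 - alpha / 2, df)

# Reject equality of means when t0 falls outside (-t_crit, t_crit)
reject <- abs(t0) > t_crit
cat("critical value:", round(t_crit, 3), " reject H0:", reject, "\n")
```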
17
The Two-Sample (Pooled) t-Test
t0 = -2.20
  • The P-value is the area (probability) in the
    tails of the t-distribution beyond -2.20 plus the
    probability beyond 2.20 (it's a two-sided test)
  • The P-value is a measure of how unusual the value
    of the test statistic is given that the null
    hypothesis is true
  • The P-value is the risk of wrongly rejecting the
    null hypothesis of equal means (it measures
    rareness of the event)
  • The P-value in our problem is P = 0.042
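
The two-tail area can be computed directly from the t distribution function. This sketch uses the unrounded statistic -2.1869 reported in the R output later in the deck (the slides round it to -2.20) and 18 degrees of freedom; it should agree with the P-value quoted above to the precision shown.

```r
t0 <- -2.1869   # unrounded test statistic from the computer output
df <- 18        # n1 + n2 - 2

# Area in both tails beyond |t0| (two-sided test)
p_value <- 2 * pt(-abs(t0), df)
print(round(p_value, 4))
```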

18
Computer Two-Sample t-Test Results
19
Checking Assumptions: The Normal Probability
Plot
20
Importance of the t-Test
  • Provides an objective framework for simple
    comparative experiments
  • Could be used to test all relevant hypotheses in
    a two-level factorial design, because all of
    these hypotheses involve the mean response at one
    side of the cube versus the mean response at
    the opposite side of the cube

21
Confidence Intervals (See pg. 44)
  • Hypothesis testing gives an objective statement
    concerning the difference in means, but it
    doesn't specify how different they are
  • General form of a confidence interval
  • The 100(1 - α)% confidence interval on the
    difference in two means
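
As a sketch of that interval for the pooled case: the estimate plus or minus a t percentage point times the standard error of the difference. The data vectors are an assumption (they are not in this transcript and are taken to be the textbook's cement observations); d_hat, sp and se are illustrative names.

```r
# Assumed data (not in this transcript)
m <- c(16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57)
u <- c(16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27)

alpha <- 0.05
n1 <- length(m)
n2 <- length(u)

d_hat <- mean(m) - mean(u)                                                 # point estimate
sp    <- sqrt(((n1 - 1) * var(m) + (n2 - 1) * var(u)) / (n1 + n2 - 2))   # pooled standard deviation
se    <- sp * sqrt(1 / n1 + 1 / n2)                                      # standard error of the difference

# estimate -/+ t(alpha/2, n1 + n2 - 2) * standard error
ci <- d_hat + c(-1, 1) * qt(1 - alpha / 2, n1 + n2 - 2) * se
print(round(ci, 4))
```

An interval that excludes zero corresponds to rejecting equality of means at the same level.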

22
(No Transcript)
23
The t.test function in R
t.test(stats): Student's t-Test
Description: Performs one and two sample t-tests on vectors of data.
Usage:
t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95, ...)
24
Arguments of the t.test function
x - a (non-empty) numeric vector of data values.
y - an optional (non-empty) numeric vector of data values.
alternative - a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.
mu - a number indicating the true value of the mean (or difference in means if you are performing a two sample test).
paired - a logical indicating whether you want a paired t-test.
var.equal - a logical variable indicating whether to treat the two variances as being equal. If TRUE, the pooled variance is used to estimate the variance; otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.
25
Arguments of the t.test function
conf.level - confidence level of the interval.
formula - a formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding groups.
data - an optional matrix or data frame containing the variables in the formula.
subset - an optional vector specifying a subset of observations to be used.
na.action - a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action").
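
A short sketch of how these arguments fit together, comparing the two-vector call with the formula interface (lhs ~ rhs) and showing the effect of var.equal. The data vectors are an assumption (not in this transcript), and the data frame long with columns strength and formulation is hypothetical, introduced only for illustration.

```r
# Assumed data (not in this transcript)
m <- c(16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57)
u <- c(16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27)

# Long format: one column of responses, one two-level factor identifying the group
long <- data.frame(
  strength    = c(m, u),
  formulation = factor(rep(c("m", "u"), each = 10))
)

# Same pooled test, expressed two ways
res_vectors <- t.test(m, u, var.equal = TRUE)
res_formula <- t.test(strength ~ formulation, data = long, var.equal = TRUE)
print(res_vectors$statistic)
print(res_formula$statistic)

# Default var.equal = FALSE: Welch approximation to the degrees of freedom
welch <- t.test(m, u)
print(welch$parameter)
```

With var.equal = FALSE the degrees of freedom are generally fractional and no larger than n1 + n2 - 2.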
26
Example with the cement data
  • The data are in the file cimento.txt, which includes the variable names.
  • Read the file and run the t-test in R.

27
Using R
  • dados <- read.table("m://aulas//flavia//cimento.txt",
    header = TRUE)
  • stripchart(dados, at = c(1, 1.1))
  • boxplot(dados)

28
t.test(dados$m, dados$u, alternative = "two.sided", var.equal = TRUE, paired = FALSE, conf.level = 0.95)

        Two Sample t-test

data:  dados$m and dados$u
t = -2.1869, df = 18, p-value = 0.0422
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.54507339 -0.01092661
sample estimates:
mean of x mean of y
   16.764    17.042
29
Comparing the variances
  • Given two independent samples from two normal
    distributions, before running the t-test to
    compare the means we need to check whether it is
    reasonable to assume equal variances. This
    decides whether we use the pooled t-test or an
    approximation to the number of degrees of freedom
    of the sampling distribution of the test
    statistic, in which case the reference
    distribution is approximate rather than exact.

30
  • If the samples really come from normal
    populations, the sample variance, up to a
    constant, has a chi-square distribution with
    n-1 degrees of freedom, where n is the sample
    size.
  • Since the samples are independent, it follows
    that, up to the constant, the two sample variances
    are independently distributed according to
    chi-square distributions.

31
Summarizing...
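
The formulas for this slide did not survive in the transcript. As a sketch, the standard results that the preceding slide describes can be written as follows, for independent samples of sizes n1 and n2 from normal populations. The notation (S for sample variances, sigma for population variances) is assumed here, not taken from the slide.

```latex
\frac{(n_1-1)\,S_1^2}{\sigma_1^2} \sim \chi^2_{n_1-1},
\qquad
\frac{(n_2-1)\,S_2^2}{\sigma_2^2} \sim \chi^2_{n_2-1}
\quad\Longrightarrow\quad
\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{n_1-1,\;n_2-1}
```

When the two population variances are equal they cancel, and the ratio of sample variances alone follows the F distribution.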
32
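Test of equality of variances
  • Under the hypothesis that the variances are equal,
    the test statistic is the ratio of the sample
    variances and, in a two-sided test at significance
    level α, we reject the null hypothesis if the
    ratio exceeds the upper α/2 percentage point of
    the F distribution with n1-1 and n2-1 degrees of
    freedom, or falls below its lower α/2 percentage
    point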
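
A minimal sketch of this two-sided F test done by hand, assuming α = 0.05. The data vectors are an assumption (not in this transcript; taken to be the textbook's cement observations), and f0, lower and upper are illustrative names.

```r
# Assumed data (not in this transcript)
m <- c(16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57)
u <- c(16.62, 16.75, 17.37, 17.12, 16.98, 16.87, 17.34, 17.02, 17.08, 17.27)

alpha <- 0.05
df1 <- length(m) - 1   # numerator degrees of freedom
df2 <- length(u) - 1   # denominator degrees of freedom

# Test statistic: ratio of the sample variances
f0 <- var(m) / var(u)

# Lower and upper alpha/2 percentage points of F(df1, df2)
lower <- qf(alpha / 2, df1, df2)
upper <- qf(1 - alpha / 2, df1, df2)

# Reject equal variances when f0 falls outside [lower, upper]
reject <- f0 < lower || f0 > upper

# Two-sided P-value: twice the smaller tail area
p_value <- 2 * min(pf(f0, df1, df2), 1 - pf(f0, df1, df2))

print(round(c(f0 = f0, lower = lower, upper = upper, p_value = p_value), 4))
print(reject)
```

If the test does not reject, the equal-variance assumption behind the pooled t-test is not contradicted by the data.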

33
  • The var.test function is available in R

var.test(dados$m, dados$u, ratio = 1, alternative = "two.sided", conf.level = 0.95)

        F test to compare two variances

data:  dados$m and dados$u
F = 1.6293, num df = 9, denom df = 9, p-value = 0.4785
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.4046845 6.5593806
sample estimates:
ratio of variances
          1.629257