A Review of Basic Concepts - PowerPoint PPT Presentation

Provided by: stat57

1
A Review of Basic Concepts
  • Chapter 1

2
Definition 1.1
  • Statistics is the science of data. This involves
    collecting, classifying, summarizing, organizing,
    analyzing, and interpreting data.

3
Definition 1.2
  • An experimental unit is an object (person or
    thing) upon which we collect data.

4
Definition 1.3
  • A variable is a characteristic (property) of the
    experimental unit with outcomes (data) that vary
    from one observation to the next.

5
Definition 1.4
  • Quantitative data are observations measured on a
    naturally occurring numerical scale.

6
Definition 1.5
  • Nonnumerical data that can only be classified
    into one of a group of categories are said to be
    qualitative data.

7
Definition 1.6
  • A population data set is a collection (or set) of
    data measured on all experimental units of
    interest to you.

8
Definition 1.7
  • A sample is a subset of data selected from a
    population.

9
Definition 1.8
  • A statistical inference is an estimate,
    prediction, or some other generalization about a
    population based on information contained in a
    sample.

10
Definition 1.9
  • A measure of reliability is a statement (usually
    quantified with a probability value) about the
    degree of uncertainty associated with a
    statistical inference.

11
Definition 1.10
  • A representative sample exhibits characteristics
    typical of those possessed by the population.

12
Definition 1.11
  • A random sample of n experimental units is one
    selected from the population in such a way that
    every different sample of size n has an equal
    probability (chance) of selection.
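
A minimal sketch of drawing such a sample in Python, assuming the population can be listed in full. The function name `draw_random_sample` and the population of 20 unit labels are made up for illustration; `random.sample` selects without replacement, so each subset of size n is equally likely.

```python
import random

def draw_random_sample(population, n, seed=None):
    """Draw n distinct units so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical population: 20 experimental units labelled 1..20.
population = list(range(1, 21))
sample = draw_random_sample(population, 5, seed=1)
print(sample)
```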

13
Describing Quantitative Data Numerically
14
Definition 1.15
  • The mean of a sample of n measurements
    $y_1, y_2, \ldots, y_n$ is
    $\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$

15
Notation
  • Sample mean: $\bar{y}$
  • Population mean: $\mu$

16
Definition 1.16
  • The range of a sample of n measurements is
    the difference between the largest and smallest
    measurements in the sample.

17
Definition 1.17
  • The variance of a sample of n measurements
    $y_1, y_2, \ldots, y_n$ is defined to be
    $s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$

18
Definition 1.18
  • The standard deviation of a set of measurements
    is equal to the square root of their variance.
    Thus, the standard deviations of a sample and a
    population are
  • Sample standard deviation: $s$
  • Population standard deviation: $\sigma$
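
The numerical measures in Definitions 1.15 to 1.18 can be sketched in a few lines of Python. The function name `describe` and the five measurements are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
import math

def describe(sample):
    """Return the mean, range, sample variance, and sample standard deviation."""
    n = len(sample)
    mean = sum(sample) / n
    data_range = max(sample) - min(sample)
    # Sample variance uses the divisor n - 1, not n.
    variance = sum((y - mean) ** 2 for y in sample) / (n - 1)
    return mean, data_range, variance, math.sqrt(variance)

# Hypothetical measurements, made up for illustration.
mean, data_range, variance, std_dev = describe([5, 3, 8, 5, 4])
print(mean, data_range, variance, round(std_dev, 4))
```

For these values the squared deviations from the mean of 5 are 0, 4, 9, 0, and 1, so the variance is 14 / 4 = 3.5.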

19
Guidelines for Interpreting a Standard Deviation
  • For any data set (population or sample), at least
    three-fourths of the measurements will lie within
    2 standard deviations of their mean.
  • For most data sets of moderate size (say, 25 or
    more measurements) with a mound-shaped distribution,
    approximately 95% of the measurements will lie
    within 2 standard deviations of their mean.
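
As a rough check of the first guideline, the sketch below counts the share of measurements within k standard deviations of the mean. The helper `fraction_within` and the skewed data set are assumptions made for illustration, not part of the slides.

```python
from statistics import mean, stdev

def fraction_within(data, k=2):
    """Fraction of measurements within k standard deviations of the mean."""
    center = mean(data)
    spread = stdev(data)  # sample standard deviation (divisor n - 1)
    inside = [y for y in data if abs(y - center) <= k * spread]
    return len(inside) / len(data)

# Hypothetical, deliberately skewed measurements.
data = [2, 4, 4, 5, 6, 7, 9, 30]
share = fraction_within(data, k=2)
print(share)
```

Only the value 30 falls outside the 2-standard-deviation band here, so the share is 7/8, which is consistent with the "at least three-fourths" guarantee.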

20
Definition 1.19
  • Numerical descriptive measures of a population
    are called parameters.

21
Definition 1.20
  • A sample statistic is a quantity calculated from
    the observations in a sample.

22
Standard normal distribution
23
Definition 1.21
  • The sampling distribution of a sample statistic
    calculated from a sample of n measurements is the
    probability distribution of the statistic.

24
Theorem 1.1
  • If $y_1, y_2, \ldots, y_n$ represent a random sample of n
    measurements from a large (or infinite) population
    with mean $\mu$ and standard deviation $\sigma$, then,
    regardless of the form of the population relative
    frequency distribution, the mean and standard
    error of estimate of the sampling distribution of
    $\bar{y}$ will be
  • Mean: $\mu_{\bar{y}} = \mu$
  • Standard error of estimate: $\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}$

25
The Central Limit Theorem
  • For large sample sizes, the mean $\bar{y}$ of a sample
    from a population with mean $\mu$ and standard
    deviation $\sigma$ has a sampling distribution that is
    approximately normal, regardless of the
    probability distribution of the sampled
    population. The larger the sample size, the better
    will be the normal approximation to the sampling
    distribution of $\bar{y}$.
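
A small simulation can illustrate the behavior of the sample mean described by Theorem 1.1 and the Central Limit Theorem. This is only a sketch: the exponential population (mean 1, standard deviation 1), the sample size of 36, and the function name `simulate_sample_means` are all assumptions chosen for illustration.

```python
import math
import random
from statistics import mean, stdev

def simulate_sample_means(n, replicates, seed=0):
    """Means of repeated samples of size n from a skewed (exponential) population."""
    rng = random.Random(seed)
    return [mean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(replicates)]

means = simulate_sample_means(n=36, replicates=5000)
center = mean(means)                   # should be close to the population mean, 1
spread = stdev(means)                  # should be close to sigma / sqrt(n)
theoretical = 1.0 / math.sqrt(36)      # sigma / sqrt(n) = 1/6
print(round(center, 3), round(spread, 3), round(theoretical, 3))
```

Even though the population is strongly skewed, the simulated sample means cluster around 1 with a spread near 1/6.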

26
Estimating a Population Mean
  • If the mean of the sampling distribution of a
    statistic equals the parameter we are
    estimating, we say that the statistic is an
    unbiased estimator of the parameter. If not, we
    say that it is biased.

27
Fig. 1.17 Sampling distribution of $\bar{y}$
  • See Applet
  • (http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html)

28
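Large-Sample $100(1-\alpha)\%$ Confidence Interval for $\mu$
  • $\bar{y} \pm z_{\alpha/2} \sigma_{\bar{y}} \approx \bar{y} \pm z_{\alpha/2} \frac{s}{\sqrt{n}}$
  • where $z_{\alpha/2}$ is the z value with an area $\alpha/2$ to
    its right (see Figure 1.18) and $\sigma_{\bar{y}} = \sigma/\sqrt{n}$.
    The parameter $\sigma$ is the standard deviation of the
    sampled population and n is the sample size. If
    $\sigma$ is unknown, its value may be approximated by
    the sample standard deviation s. The approximation
    is valid for large samples (e.g., $n \ge 30$) only.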
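
The large-sample interval can be sketched from summary statistics as follows. The function name `large_sample_ci` and the numbers (mean 50, s = 10, n = 100) are hypothetical; the z value comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def large_sample_ci(y_bar, s, n, confidence=0.95):
    """Large-sample confidence interval for mu, using s in place of sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z value with area alpha/2 to its right
    half_width = z * s / math.sqrt(n)
    return y_bar - half_width, y_bar + half_width

# Hypothetical summary statistics from a sample of n = 100.
lower, upper = large_sample_ci(y_bar=50.0, s=10.0, n=100)
print(round(lower, 2), round(upper, 2))
```

With these inputs the standard error is 1, so the 95% interval is roughly 50 plus or minus 1.96.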

29
Small-Sample Confidence Interval for $\mu$
  • $\bar{y} \pm t_{\alpha/2} s_{\bar{y}}$
  • where $s_{\bar{y}} = s/\sqrt{n}$ and $t_{\alpha/2}$ is a t value based on
    $(n - 1)$ degrees of freedom, such that the probability
    that $t > t_{\alpha/2}$ is $\alpha/2$.
  • Assumptions: The relative frequency distribution
    of the sampled population is approximately normal.
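
A sketch of the small-sample interval, assuming the t value is looked up in a t table and passed in by hand, since the Python standard library has no t distribution. The function name `small_sample_ci` and the summary statistics are hypothetical; 2.262 is the tabled value of $t_{0.025}$ for 9 degrees of freedom.

```python
import math

def small_sample_ci(y_bar, s, n, t_crit):
    """Small-sample confidence interval for mu, given a tabled t value."""
    half_width = t_crit * s / math.sqrt(n)
    return y_bar - half_width, y_bar + half_width

# Hypothetical sample of n = 10, so the t value is based on 9 df.
lower, upper = small_sample_ci(y_bar=20.0, s=3.0, n=10, t_crit=2.262)
print(round(lower, 3), round(upper, 3))
```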

30
Testing a Hypothesis About a Population Mean
  • A null hypothesis, denoted by the symbol $H_0$, which
    is the hypothesis that we postulate is true.
  • An alternative (or research) hypothesis, denoted
    by the symbol $H_a$, which is counter to the null
    hypothesis and is what we want to support.

31
  • A test statistic, calculated from the sample
    data, that functions as a decision maker.
  • A rejection region, values of a test statistic
    for which we reject the null hypothesis and
    accept the alternative hypothesis.

32
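Large-Sample ($n \ge 30$) Test of Hypothesis About $\mu$
  • TWO-TAILED TEST
    $H_0: \mu = \mu_0$ versus $H_a: \mu \ne \mu_0$
  • Test statistic
    $z = \frac{\bar{y} - \mu_0}{\sigma_{\bar{y}}} \approx \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$
  • Rejection region
    $|z| > z_{\alpha/2}$
  • where $z_{\alpha/2}$ is chosen so that $P(z > z_{\alpha/2}) = \alpha/2$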
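
The two-tailed large-sample test can be sketched as below. The function name `large_sample_z_test` and the inputs are hypothetical; the test rejects the null hypothesis when the absolute z statistic exceeds the z value with area alpha/2 to its right.

```python
import math
from statistics import NormalDist

def large_sample_z_test(y_bar, mu_0, s, n, alpha=0.05):
    """Two-tailed large-sample test of H0: mu = mu_0, using s in place of sigma."""
    z = (y_bar - mu_0) / (s / math.sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, abs(z) > z_crit

# Hypothetical summary statistics from a sample of n = 100.
z, reject = large_sample_z_test(y_bar=52.0, mu_0=50.0, s=10.0, n=100)
print(z, reject)
```

Here z = 2.0 exceeds the critical value of about 1.96, so the null hypothesis is rejected at the 0.05 level.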

33
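Type I and Type II error
Small-Sample Test of Hypothesis About $\mu$
  • TWO-TAILED TEST
    $H_0: \mu = \mu_0$ versus $H_a: \mu \ne \mu_0$
  • Test statistic
    $t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$
  • Rejection region
    $|t| > t_{\alpha/2}$
  • where $t_{\alpha/2}$ is based on $(n - 1)$ df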

34
Reporting Test Results as p-Values: How to Decide
Whether to Reject $H_0$
  • Choose the maximum value of $\alpha$ that you are
    willing to tolerate.
  • If the observed significance level (p-value) of
    the test is less than the maximum value of $\alpha$,
    then reject the null hypothesis.
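
The decision rule above can be sketched for a two-tailed z test. The helper names `two_tailed_p_value` and `decide` are made up for illustration, and the observed statistic z = 2.0 is a hypothetical value.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Observed significance level for a two-tailed z test."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha):
    """Reject H0 only when the p-value is below the chosen alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

p = two_tailed_p_value(2.0)       # hypothetical observed test statistic
decision = decide(p, alpha=0.05)
print(round(p, 4), decision)
```

The same p-value of about 0.0455 leads to rejection at alpha = 0.05 but not at alpha = 0.01, which is why alpha should be chosen before looking at the result.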

35
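Large-Sample Confidence Interval for $(\mu_1 - \mu_2)$:
Independent Samples
  • $(\bar{y}_1 - \bar{y}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx (\bar{y}_1 - \bar{y}_2) \pm z_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
  • Assumptions: The two samples are randomly and
    independently selected from the two populations.
    The sample sizes, $n_1$ and $n_2$, are large enough so
    that $\bar{y}_1$ and $\bar{y}_2$ each have approximately normal
    sampling distributions and so that $s_1^2$ and $s_2^2$
    provide good approximations to $\sigma_1^2$ and $\sigma_2^2$. This
    will be true if $n_1 \ge 30$ and $n_2 \ge 30$.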
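
A sketch of the large-sample interval for the difference between two means, computed from summary statistics. The function name `two_sample_large_ci` and all input values are hypothetical, and the sample variances stand in for the unknown population variances.

```python
import math
from statistics import NormalDist

def two_sample_large_ci(y_bar1, s1, n1, y_bar2, s2, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2, independent samples."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    std_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = y_bar1 - y_bar2
    return diff - z * std_error, diff + z * std_error

# Hypothetical summary statistics for two independent samples.
lower, upper = two_sample_large_ci(80.0, 10.0, 50, 75.0, 12.0, 60)
print(round(lower, 3), round(upper, 3))
```

Because this interval lies entirely above 0, it would suggest that the first population mean is larger, under the stated assumptions.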

36
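Large-Sample Test of Hypothesis About $(\mu_1 - \mu_2)$:
Independent Samples
  • TWO-TAILED TEST
    $H_0: (\mu_1 - \mu_2) = D_0$ versus $H_a: (\mu_1 - \mu_2) \ne D_0$
  • where $D_0$ = hypothesized difference between the
    means (this is often 0)
  • Test statistic
    $z = \frac{(\bar{y}_1 - \bar{y}_2) - D_0}{\sigma_{(\bar{y}_1 - \bar{y}_2)}}$
  • where $\sigma_{(\bar{y}_1 - \bar{y}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
  • Rejection region
    $|z| > z_{\alpha/2}$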

37
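Small-Sample Confidence Interval for $(\mu_1 - \mu_2)$:
Independent Samples
  • $(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$
  • where
    $s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
  • is a pooled estimate of the common population
    variance and $t_{\alpha/2}$ is based on $(n_1 + n_2 - 2)$ df.
  • Assumptions
  • Both sampled populations have relative frequency
    distributions that are approximately normal.
  • The population variances are equal.
  • The samples are randomly and independently
    selected from the populations.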
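
The pooled-variance interval can be sketched as follows, again assuming the t value is read from a table and supplied by hand. The function names and summary statistics are hypothetical; 2.086 is the tabled value of $t_{0.025}$ for 20 degrees of freedom.

```python
import math

def pooled_variance(s1, n1, s2, n2):
    """Pooled estimate of the common population variance."""
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)

def pooled_ci(y_bar1, s1, n1, y_bar2, s2, n2, t_crit):
    """Small-sample confidence interval for mu1 - mu2 with equal variances."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    half_width = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = y_bar1 - y_bar2
    return diff - half_width, diff + half_width

# Hypothetical samples with n1 = 10 and n2 = 12, so df = 20.
sp2 = pooled_variance(2.0, 10, 3.0, 12)
lower, upper = pooled_ci(15.0, 2.0, 10, 12.0, 3.0, 12, t_crit=2.086)
print(sp2, round(lower, 3), round(upper, 3))
```

The pooled variance weights each sample variance by its degrees of freedom: (9 × 4 + 11 × 9) / 20 = 6.75.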

38
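Small-Sample Test of Hypothesis About $(\mu_1 - \mu_2)$:
Independent Samples
  • TWO-TAILED TEST
    $H_0: (\mu_1 - \mu_2) = D_0$ versus $H_a: (\mu_1 - \mu_2) \ne D_0$
  • Test statistic
    $t = \frac{(\bar{y}_1 - \bar{y}_2) - D_0}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
  • Rejection region
    $|t| > t_{\alpha/2}$
  • where $t_{\alpha/2}$ is based on $(n_1 + n_2 - 2)$ df
  • Assumptions: Same as for the small-sample
    confidence interval for $(\mu_1 - \mu_2)$ in the
    previous box.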

39
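Paired Difference Confidence Interval for $\mu_d = \mu_1 - \mu_2$
  • LARGE SAMPLE
    $\bar{y}_d \pm z_{\alpha/2} \frac{\sigma_d}{\sqrt{n_d}} \approx \bar{y}_d \pm z_{\alpha/2} \frac{s_d}{\sqrt{n_d}}$
  • Assumption: The sample differences are randomly
    selected from the population of differences.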

40
Continued
  • SMALL SAMPLE
    $\bar{y}_d \pm t_{\alpha/2} \frac{s_d}{\sqrt{n_d}}$
  • where $t_{\alpha/2}$ is based on $(n_d - 1)$ degrees of freedom
  • Assumptions
  • The relative frequency distribution of the
    population of differences is normal.
  • The sample differences are randomly selected from
    the population of differences.

41
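Paired Difference Test of Hypothesis for $\mu_d = \mu_1 - \mu_2$
  • TWO-TAILED TEST
    $H_0: \mu_d = D_0$ versus $H_a: \mu_d \ne D_0$
  • LARGE SAMPLE
  • Test statistic
    $z = \frac{\bar{y}_d - D_0}{\sigma_d/\sqrt{n_d}} \approx \frac{\bar{y}_d - D_0}{s_d/\sqrt{n_d}}$
  • Rejection region
    $|z| > z_{\alpha/2}$
  • Assumption: The differences are randomly selected
    from the population of differences.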

42
Continued
  • SMALL SAMPLE
  • Test statistic
    $t = \frac{\bar{y}_d - D_0}{s_d/\sqrt{n_d}}$
  • Rejection region
    $|t| > t_{\alpha/2}$
  • where $t_{\alpha/2}$ is based on $(n_d - 1)$ degrees of freedom
  • Assumptions
  • The relative frequency distribution of the
    population of differences is normal.
  • The differences are randomly selected from the
    population of differences.
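
A sketch of the small-sample paired difference test, assuming the tabled t value is supplied by hand. The function name `paired_t_statistic` and the before/after measurements are made up for illustration; 2.776 is the tabled value of $t_{0.025}$ for 4 degrees of freedom.

```python
import math
from statistics import mean, stdev

def paired_t_statistic(first, second, d_0=0.0):
    """t statistic for H0: mu_d = d_0, computed from paired measurements."""
    differences = [a - b for a, b in zip(first, second)]
    n_d = len(differences)
    return (mean(differences) - d_0) / (stdev(differences) / math.sqrt(n_d))

# Hypothetical paired measurements on 5 experimental units.
after = [12, 11, 15, 13, 14]
before = [10, 10, 12, 11, 12]
t = paired_t_statistic(after, before)
reject = abs(t) > 2.776  # tabled t value for alpha = 0.05 (two-tailed), 4 df
print(round(t, 4), reject)
```

The differences are 2, 1, 3, 2, 2, with mean 2 and sample variance 0.5, which gives a t statistic of about 6.32.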