A Review of Basic Concepts

Provided by: stat57

Transcript and Presenter's Notes

1
A Review of Basic Concepts

Chapter 1

2
Definition 1.1

Statistics is the science of data. This involves
collecting, classifying, summarizing, organizing,
analyzing, and interpreting data.

3
Definition 1.2

An experimental unit is an object (person or
thing) upon which we collect data.

4
Definition 1.3

A variable is a characteristic (property) of the
experimental unit with outcomes (data) that vary
from one observation to the next.

5
Definition 1.4

Quantitative data are observations measured on a
naturally occurring numerical scale.

6
Definition 1.5

Nonnumerical data that can only be classified
into one of a group of categories are said to be
qualitative data.

7
Definition 1.6

A population data set is a collection (or set) of
data measured on all experimental units of
interest to you.

8
Definition 1.7

A sample is a subset of data selected from a
population.

9
Definition 1.8

A statistical inference is an estimate,
prediction, or some other generalization about a
population based on information contained in a
sample.

10
Definition 1.9

A measure of reliability is a statement (usually
quantified with a probability value) about the
degree of uncertainty associated with a
statistical inference.

11
Definition 1.10

A representative sample exhibits characteristics
typical of those possessed by the population.

12
Definition 1.11

A random sample of n experimental units is one
selected from the population in such a way that
every different sample of size n has an equal
probability (chance) of selection.
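
A minimal sketch of drawing a random sample in the sense of Definition 1.11, assuming a hypothetical population of 100 labeled experimental units. The population, the seed, and the sample size are illustrative choices, not part of the original slides.

```python
import random

# Hypothetical population: 100 experimental units labeled 1..100.
population = list(range(1, 101))

# random.sample draws n distinct units without replacement, so every
# different subset of size n has the same chance of being selected.
rng = random.Random(42)  # seeded only so the sketch is reproducible
n = 10
sample = rng.sample(population, n)
print(sample)
```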

13
Describing Quantitative Data Numerically

14
Definition 1.15

The mean of a sample of n measurements
$y_1, y_2, \ldots, y_n$ is
$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$

15
Notation

Sample mean: $\bar{y}$
Population mean: $\mu$

16
Definition 1.16

The range of a sample of n measurements is
the difference between the largest and smallest
measurements in the sample.

17
Definition 1.17

The variance of a sample of n measurements
$y_1, y_2, \ldots, y_n$ is defined to be
$s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}$

18
Definition 1.18

The standard deviation of a set of measurements
is equal to the square root of their variance.
Thus, the standard deviations of a sample and a
population are
Sample standard deviation: $s$
Population standard deviation: $\sigma$
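
The numerical measures in Definitions 1.15 through 1.18 can be sketched as follows. The five measurements are made up for illustration, and the hand-written variance is compared with the standard library's `statistics.variance`, which also divides by n - 1.

```python
import math
import statistics

y = [5, 3, 8, 5, 6]  # hypothetical sample of n = 5 measurements

n = len(y)
mean = sum(y) / n                       # sample mean: 27 / 5 = 5.4
data_range = max(y) - min(y)            # range: 8 - 3 = 5
# Sample variance: squared deviations from the mean, divided by n - 1.
variance = sum((yi - mean) ** 2 for yi in y) / (n - 1)
std_dev = math.sqrt(variance)           # sample standard deviation s

print(mean, data_range, round(variance, 4), round(std_dev, 4))
```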

19
Guidelines for Interpreting a Standard Deviation

For any data set (population or sample), at least
three-fourths of the measurements will lie within
2 standard deviations of their mean.
For most data sets of moderate size (say, 25 or
more measurements) with a mound-shaped
distribution, approximately 95% of the
measurements will lie within 2 standard
deviations of their mean.
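
As a rough check of these guidelines, the sketch below counts how many measurements fall within 2 standard deviations of the mean. The data are simulated from a normal distribution with an assumed mean of 50 and standard deviation of 10, purely to stand in for a mound-shaped data set.

```python
import random
import statistics

rng = random.Random(0)  # seeded only so the sketch is reproducible
# Hypothetical mound-shaped data: 1000 simulated normal measurements.
data = [rng.gauss(50, 10) for _ in range(1000)]

mean = statistics.mean(data)
s = statistics.stdev(data)

# Fraction of measurements inside the interval mean +/- 2s.
within = sum(1 for y in data if abs(y - mean) <= 2 * s)
fraction = within / len(data)
print(f"fraction within 2 standard deviations: {fraction:.3f}")
```

The first guideline guarantees a fraction of at least 0.75 for any data set; for mound-shaped data like this, the fraction should come out near 0.95.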

20
Definition 1.19

Numerical descriptive measures of a population
are called parameters.

21
Definition 1.20

A sample statistic is a quantity calculated from
the observations in a sample.

22
Standard normal distribution

23
Definition 1.21

The sampling distribution of a sample statistic
calculated from a sample of n measurements is the
probability distribution of the statistic.

24
Theorem 1.1

If $y_1, y_2, \ldots, y_n$ represent a random
sample of n measurements from a large (or
infinite) population with mean $\mu$ and standard
deviation $\sigma$, then, regardless of the form
of the population relative frequency
distribution, the mean and standard error of
estimate of the sampling distribution of
$\bar{y}$ will be
Mean: $\mu_{\bar{y}} = \mu$
Standard error of estimate: $\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}$

25
The Central Limit Theorem

For large sample sizes, the mean $\bar{y}$ of a
sample from a population with mean $\mu$ and
standard deviation $\sigma$ has a sampling
distribution that is approximately normal,
regardless of the probability distribution of the
sampled population. The larger the sample size,
the better will be the normal approximation to
the sampling distribution of $\bar{y}$.
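
Theorem 1.1 and the Central Limit Theorem can be illustrated with a small simulation. This is a sketch under assumed settings: the sampled population is exponential with mean 1 and standard deviation 1 (clearly non-normal), the sample size is n = 50, and 2000 samples are drawn; none of these values come from the slides.

```python
import math
import random
import statistics

rng = random.Random(1)  # seeded only so the sketch is reproducible
n = 50        # size of each sample
reps = 2000   # number of samples drawn

# Each entry is the mean of one sample of n exponential measurements.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = 1.0 / math.sqrt(n)   # sigma / sqrt(n) with sigma = 1

print(f"mean of sample means: {mean_of_means:.3f} (theory: 1.000)")
print(f"sd of sample means:   {sd_of_means:.3f} (theory: {theoretical_se:.3f})")
```

The simulated mean and standard deviation of the sample means should land close to $\mu$ and $\sigma/\sqrt{n}$, even though the underlying population is strongly skewed.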

26
Estimating a Population Mean

If the mean of the sampling distribution of a
statistic equals the parameter we are
estimating, we say that the statistic is an
unbiased estimator of the parameter. If not, we
say that it is biased.

27
Fig. 1.17 Sampling distribution of $\bar{y}$

See Applet
(http://www.ruf.rice.edu/~lane/stat_sim/sampling_dist/index.html)

28
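Large-Sample 100(1-$\alpha$)% Confidence Interval for $\mu$

$\bar{y} \pm z_{\alpha/2}\,\sigma_{\bar{y}} \approx \bar{y} \pm z_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
where $z_{\alpha/2}$ is the z value with an area
$\alpha/2$ to its right (see Figure 1.18) and
$\sigma_{\bar{y}} = \sigma/\sqrt{n}$. The
parameter $\sigma$ is the standard deviation of
the sampled population and n is the sample size.
If $\sigma$ is unknown, its value may be
approximated by the sample standard deviation s.
The approximation is valid for large samples
(e.g., $n \geq 30$) only.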
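
A minimal sketch of the large-sample interval, assuming hypothetical summary statistics (sample mean 50, s = 10, n = 100). The function name `large_sample_ci` is illustrative; the critical value $z_{\alpha/2}$ is obtained from the standard library's `NormalDist.inv_cdf`.

```python
import math
from statistics import NormalDist

def large_sample_ci(y_bar, s, n, alpha=0.05):
    """Return (lower, upper) for y_bar +/- z_{alpha/2} * s / sqrt(n)."""
    # z value with area alpha/2 to its right under the standard normal.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * s / math.sqrt(n)
    return y_bar - half_width, y_bar + half_width

# Hypothetical summary statistics from a sample of n = 100.
lower, upper = large_sample_ci(y_bar=50.0, s=10.0, n=100)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```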

29
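Small-Sample Confidence Interval for $\mu$

$\bar{y} \pm t_{\alpha/2}\,s_{\bar{y}} = \bar{y} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$
where $s_{\bar{y}} = s/\sqrt{n}$ and
$t_{\alpha/2}$ is a t value based on (n - 1)
degrees of freedom, such that the probability
that $t > t_{\alpha/2}$ is $\alpha/2$.
Assumptions: The relative frequency distribution
of the sampled population is approximately normal.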
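
The small-sample interval can be sketched the same way. The ten measurements are hypothetical, and because the Python standard library has no t distribution, the critical value $t_{.025} = 2.262$ for 9 degrees of freedom is taken from a t table and hard-coded as an assumption.

```python
import math
import statistics

# Hypothetical sample of n = 10 measurements.
y = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3, 9.7, 10.1]
n = len(y)

# Assumed table value: t_{.025} with n - 1 = 9 degrees of freedom.
t_crit = 2.262

y_bar = statistics.mean(y)
s = statistics.stdev(y)              # sample standard deviation (n - 1)
half_width = t_crit * s / math.sqrt(n)
lower, upper = y_bar - half_width, y_bar + half_width

print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```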

30
Testing a Hypothesis About a Population Mean

A null hypothesis, denoted by the symbol $H_0$,
which is the hypothesis that we postulate is true.
An alternative (or research) hypothesis, denoted
by the symbol $H_a$, which is counter to the null
hypothesis and is what we want to support.

A test statistic, calculated from the sample
data, that functions as a decision maker.
A rejection region, values of a test statistic
for which we reject the null hypothesis and
accept the alternative hypothesis.

32
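Large-Sample ($n \geq 30$) Test of Hypothesis About $\mu$

TWO-TAILED TEST
$H_0: \mu = \mu_0$
$H_a: \mu \neq \mu_0$
Test statistic: $z = \frac{\bar{y} - \mu_0}{\sigma_{\bar{y}}} \approx \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$
Rejection region: $|z| > z_{\alpha/2}$
where $z_{\alpha/2}$ is chosen so that $P(z > z_{\alpha/2}) = \alpha/2$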
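
A minimal sketch of the two-tailed large-sample test, assuming hypothetical summary statistics and a hypothesized mean $\mu_0 = 50$. The function name `z_test_two_tailed` and all input values are illustrative.

```python
import math
from statistics import NormalDist

def z_test_two_tailed(y_bar, s, n, mu0, alpha=0.05):
    """Return (z, z_crit, reject) for H0: mu = mu0 vs Ha: mu != mu0."""
    # Test statistic, with s standing in for sigma (large n assumed).
    z = (y_bar - mu0) / (s / math.sqrt(n))
    # Critical value with area alpha/2 in the upper tail.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit

# Hypothetical data summary: y_bar = 52.5, s = 10, n = 100.
z, z_crit, reject = z_test_two_tailed(y_bar=52.5, s=10.0, n=100, mu0=50.0)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```

Here z = 2.5 falls in the rejection region $|z| > 1.96$, so the null hypothesis would be rejected at $\alpha = 0.05$.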

33
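Type I and Type II error
Small-Sample Test of Hypothesis About $\mu$

TWO-TAILED TEST
$H_0: \mu = \mu_0$
$H_a: \mu \neq \mu_0$
Test statistic: $t = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$
Rejection region: $|t| > t_{\alpha/2}$
where $t_{\alpha/2}$ is based on (n - 1) df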

34
Reporting Test Results as p-Values: How to Decide
Whether to Reject $H_0$

Choose the maximum value of $\alpha$ that you are
willing to tolerate.
If the observed significance level (p-value) of
the test is less than the maximum value of
$\alpha$, then reject the null hypothesis.
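
The decision rule can be sketched for a two-tailed z test. The observed test statistic z = 2.5 and the tolerance $\alpha = 0.05$ are assumed values chosen for illustration.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Observed significance level for a two-tailed z test."""
    # Area in both tails beyond |z| under the standard normal curve.
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05                       # maximum tolerated alpha (assumed)
p_value = two_tailed_p_value(2.5)  # hypothetical observed z
reject = p_value < alpha

print(f"p-value = {p_value:.4f}, reject H0: {reject}")
```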

35
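Large-Sample Confidence Interval for ($\mu_1 - \mu_2$):
Independent Samples

$(\bar{y}_1 - \bar{y}_2) \pm z_{\alpha/2}\,\sigma_{(\bar{y}_1 - \bar{y}_2)} = (\bar{y}_1 - \bar{y}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Assumptions: The two samples are randomly and
independently selected from the two populations.
The sample sizes, $n_1$ and $n_2$, are large
enough so that $\bar{y}_1$ and $\bar{y}_2$ each
have approximately normal sampling distributions
and so that $s_1^2$ and $s_2^2$ provide good
approximations to $\sigma_1^2$ and $\sigma_2^2$.
This will be true if $n_1 \geq 30$ and
$n_2 \geq 30$.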

36
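Large-Sample Test of Hypothesis About ($\mu_1 - \mu_2$):
Independent Samples

TWO-TAILED TEST
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) \neq D_0$
where $D_0$ = Hypothesized difference between the
means (this is often 0)
Test statistic: $z = \frac{(\bar{y}_1 - \bar{y}_2) - D_0}{\sigma_{(\bar{y}_1 - \bar{y}_2)}}$
where $\sigma_{(\bar{y}_1 - \bar{y}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Rejection region: $|z| > z_{\alpha/2}$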
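
A sketch of the large-sample comparison of two means, covering both the confidence interval and the two-tailed test. The summary statistics are hypothetical, the function name `two_sample_z` is illustrative, and the sample variances stand in for the unknown population variances as the large-sample assumptions allow.

```python
import math
from statistics import NormalDist

def two_sample_z(y1_bar, s1, n1, y2_bar, s2, n2, d0=0.0, alpha=0.05):
    """Return (z, (lower, upper), reject) for independent large samples."""
    diff = y1_bar - y2_bar
    # Standard error of the difference, using s1 and s2 for sigma1, sigma2.
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    z = (diff - d0) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return z, ci, abs(z) > z_crit

# Hypothetical summaries: two independent samples of 36 units each.
z, ci, reject = two_sample_z(105.0, 12.0, 36, 100.0, 9.0, 36)
print(f"z = {z:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), reject H0: {reject}")
```

With these inputs the standard error is 2.5 and z = 2.0, so the interval excludes 0 and the test rejects $H_0$ at $\alpha = 0.05$, which shows how the interval and the test agree.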

37
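Small-Sample Confidence Interval for ($\mu_1 - \mu_2$):
Independent Samples

$(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
where
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
is a pooled estimate of the common population
variance and $t_{\alpha/2}$ is based on
($n_1 + n_2 - 2$) df.
Assumptions:
Both sampled populations have relative frequency
distributions that are approximately normal.
The population variances are equal.
The samples are randomly and independently
selected from the populations.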

38
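Small-Sample Test of Hypothesis About ($\mu_1 - \mu_2$):
Independent Samples

TWO-TAILED TEST
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) \neq D_0$
Test statistic: $t = \frac{(\bar{y}_1 - \bar{y}_2) - D_0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
Rejection region: $|t| > t_{\alpha/2}$
where $t_{\alpha/2}$ is based on ($n_1 + n_2 - 2$) df
Assumptions: Same as for the small-sample
confidence interval for ($\mu_1 - \mu_2$) in the
previous box.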
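
The pooled-variance interval and test can be sketched as follows. Both samples are hypothetical, and the critical value $t_{.025} = 2.306$ for $n_1 + n_2 - 2 = 8$ degrees of freedom is an assumed table value, since the Python standard library has no t distribution.

```python
import math
import statistics

# Hypothetical independent samples of 5 measurements each.
sample1 = [12, 15, 11, 14, 13]
sample2 = [10, 9, 12, 11, 8]
n1, n2 = len(sample1), len(sample2)

# Assumed table value: t_{.025} with n1 + n2 - 2 = 8 degrees of freedom.
t_crit = 2.306

y1_bar, y2_bar = statistics.mean(sample1), statistics.mean(sample2)
s1_sq, s2_sq = statistics.variance(sample1), statistics.variance(sample2)

# Pooled estimate of the common population variance.
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))

diff = y1_bar - y2_bar
lower, upper = diff - t_crit * se, diff + t_crit * se

# Two-tailed test of H0: mu1 - mu2 = 0.
t_stat = (diff - 0.0) / se
reject = abs(t_stat) > t_crit

print(f"pooled variance = {sp_sq}, t = {t_stat:.2f}, reject H0: {reject}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```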

39
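Paired Difference Confidence Interval for $\mu_d = \mu_1 - \mu_2$

LARGE SAMPLE
$\bar{y}_d \pm z_{\alpha/2}\frac{\sigma_d}{\sqrt{n_d}} \approx \bar{y}_d \pm z_{\alpha/2}\frac{s_d}{\sqrt{n_d}}$
Assumption: The sample differences are randomly
selected from the population of differences.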

40
Continued

SMALL SAMPLE
$\bar{y}_d \pm t_{\alpha/2}\frac{s_d}{\sqrt{n_d}}$
where $t_{\alpha/2}$ is based on ($n_d - 1$)
degrees of freedom
Assumptions:
The relative frequency distribution of the
population of differences is normal.
The sample differences are randomly selected from
the population of differences.
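
A minimal sketch of the small-sample paired difference interval, assuming six hypothetical matched pairs. The critical value $t_{.025} = 2.571$ for $n_d - 1 = 5$ degrees of freedom is taken from a t table and hard-coded as an assumption.

```python
import math
import statistics

# Hypothetical paired measurements on the same 6 experimental units.
first = [72, 75, 70, 80, 68, 74]
second = [70, 71, 67, 75, 67, 71]

# Work with the differences, one per pair.
d = [a - b for a, b in zip(first, second)]
n_d = len(d)

# Assumed table value: t_{.025} with n_d - 1 = 5 degrees of freedom.
t_crit = 2.571

d_bar = statistics.mean(d)       # mean of the sample differences
s_d = statistics.stdev(d)        # standard deviation of the differences
half_width = t_crit * s_d / math.sqrt(n_d)
lower, upper = d_bar - half_width, d_bar + half_width

print(f"mean difference = {d_bar}, 95% CI: ({lower:.3f}, {upper:.3f})")
```

Analyzing the differences directly turns the two-sample problem into a one-sample problem, which is why the formula mirrors the small-sample interval for a single mean.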

41
Paired Difference Test of Hypothesis for $\mu_d = \mu_1 - \mu_2$